An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing-data mechanism still depends on the unobserved outcome. Strategies for Bayesian Modeling and Sensitivity Analysis, M. In many cases, missing data in an analysis are treated in a casual and ad hoc manner, leading to invalid inferences and erroneous conclusions. Semiparametric nonlinear mixed-effects (NLME) models are flexible for modelling complex longitudinal data. Semiparametric Theory and Missing Data, Anastasios A. Tsiatis. For a more thorough discussion of semiparametric efficiency theory and precise definitions of pathwise derivatives and the tangent space we refer to ... Dec 28, 2012: The semiparametric models allow for estimating functions that are nonsmooth with respect to the parameter. Semiparametric Efficiency in GMM Models of Nonclassical Measurement Errors, Missing Data and Treatment Effects, by Xiaohong Chen, Han Hong, and Alessandro Tarozzi, March 2008, Cowles Foundation Discussion Paper No.
When the proportion of item-level missing values is nontrivial and the data are not missing completely at random (MCAR), typical solutions for missing data, such as complete-case analysis, often lead to increased bias and reduced statistical power. Analysis of Semiparametric Regression Models for Repeated Outcomes in the Presence of Missing Data, James M. Missing data occur frequently in empirical studies in the health and social sciences, often compromising our ability to make accurate inferences. Tsiatis takes a new approach with semiparametric models. While many of the other missing data books do mention clinical trials, some quite extensively, this book focuses exclusively on missing data in trials. Semiparametric estimating equations inference with ... A statistical model is a parameterized family of distributions.
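The three missingness mechanisms named above (MCAR, MAR, MNAR) can be stated compactly. The notation below is an assumption introduced for illustration and does not come from the quoted sources: R is an indicator that the outcome Y is observed, and X collects the fully observed variables. It is a simplified single-outcome sketch, not a general definition for arbitrary missingness patterns.

```latex
% R = 1 if the outcome Y is observed, 0 otherwise; X denotes fully observed variables.
\begin{align*}
\text{MCAR:}\quad & P(R = 1 \mid Y, X) = P(R = 1), \\
\text{MAR:}\quad  & P(R = 1 \mid Y, X) = P(R = 1 \mid X), \\
\text{MNAR:}\quad & P(R = 1 \mid Y, X) \text{ still depends on } Y \text{ after conditioning on } X.
\end{align*}
```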
This sensitivity is exacerbated when inverse probability weighting methods are used, which may overweight contaminated observations. A class of estimators is defined that includes as special cases a semiparametric regression imputation estimator, a marginal average estimator, and a marginal propensity score weighted estimator. In this paper, we consider a semiparametric regression model in the presence of missing covariates for nonparametric components under a Bayesian framework. Statistics in the Pharmaceutical Industry, 3rd edition. We propose a nonparametric imputation method for the missing values, which then leads to imputed estimating equations for the finite-dimensional parameter of interest.
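To make the contrast between complete-case analysis, inverse probability weighting and regression imputation concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the data-generating model, the logistic response probability (taken as known rather than estimated), the linear imputation model, and all function names. It uses simple parametric stand-ins and is not the semiparametric estimator class described in the quoted abstracts.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate an outcome y that is missing at random given a covariate x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)      # true mean of y is 1.0
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 + x)))    # response probability depends on x only (MAR)
    r = rng.uniform(size=n) < p_obs             # r[i] is True when y[i] is observed
    return x, y, r, p_obs


def complete_case_mean(y, r):
    """Average of the observed outcomes only; biased here because r depends on x."""
    return y[r].mean()


def ipw_mean(y, r, p_obs):
    """Normalized (Hajek-type) inverse probability weighted mean."""
    w = r / p_obs                               # weight is zero for missing outcomes
    return np.sum(w * np.where(r, y, 0.0)) / np.sum(w)


def regression_imputation_mean(x, y, r):
    """Fit a line on the observed cases, impute the missing outcomes, then average."""
    slope, intercept = np.polyfit(x[r], y[r], deg=1)
    y_hat = intercept + slope * x
    return np.where(r, y, y_hat).mean()


if __name__ == "__main__":
    x, y, r, p_obs = simulate()
    print("complete case:        ", complete_case_mean(y, r))
    print("inverse prob. weight: ", ipw_mean(y, r, p_obs))
    print("regression imputation:", regression_imputation_mean(x, y, r))
```

In this setup the complete-case mean overshoots the true value of 1.0 because cases with larger x are more likely to be observed, while the weighted and imputed means stay close to it. The weights 1 / p_obs grow large when the response probability is small, which is one way to see why a single contaminated observation can be overweighted.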
Semiparametric theory for causal mediation analysis. While estimation of the marginal total causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years investigators have also become increasingly interested in mediation analysis. This paper considers the problem of parameter estimation in a general class of semiparametric models when observations are subject to missingness at random. Covariates are usually introduced in the models to partially explain interindividual variations. We consider a semiparametric model that parameterizes the conditional density of the response, given covariates, but allows the marginal distribution ...
The theory and methods for measurement errors and missing ... A semiparametric estimation of mean functionals with ... We also refer to Chen, Hong and Tarozzi (2008), who study semiparametric e... Semiparametric models allow at least part of the data-generating process to be ... The asymptotic properties of these estimators are developed using the theory of counting processes and semiparametric theory for missing data problems. Semiparametric regression models with missing data. Semiparametric Regression Analysis with Missing Response at Random, Qihua Wang, Oliver Linton and Wolfgang H. Kriging regression imputation method to semiparametric model. Semiparametric Theory and Missing Data (ebook, 2006). Based on semiparametric theory and taking into account the symmetric nature of the population distribution, we propose both consistent estimators, i.e. ...
... from the imputed estimating function g_n. The description of the theory of estimation for semiparametric models is both rigorous and intuitive, relying on geometric ideas to reinforce the intuition and understanding of the theory. Semiparametric Methods for Missing Data and Causal Inference. Abstract: In this dissertation, we propose methodology to account for missing data as well as a strategy to account for outcome heterogeneity. Missing data is a big issue in the world of clinical trials. Abstract: We develop inference tools in a semiparametric partially linear regression model with missing response data. In this paper, based on the exponential tilting model, we propose a semiparametric estimation method of mean functionals with nonignorable missing data. We consider a class of doubly weighted rank-based estimating methods for the transformation or accelerated failure time model with missing data as arise, for ... Semiparametric location estimation under nonrandom sampling. These methods are also illustrated using data from a stage II breast cancer clinical trial. Asymptotic Theory for the Semiparametric Accelerated Failure Time Model with Missing Data, Kalbfleisch and Menggang Yu (University of Michigan, University of Michigan and Indiana University).
Rod Little and Don Rubin have contributed massively to the development of theory and methods for handling missing data (Rubin being the originator of multiple imputation). Missing data arise in almost all scientific disciplines. Missing data is a very important issue in clinical trials, and many models have been devised to handle it, including multiple imputation (Rubin), pattern-mixture models (Little), and mixed-effects linear and nonlinear models (Pinheiro and Bates; Molenberghs and Verbeke; Davidian). In such settings, identification is generally not possible without imposing additional ... If time permits, it will also cover other advanced topics on handling incomplete data, such as in high-dimensional data.
It starts with the study of semiparametric methods when there are no missing data. A well-known example of a semiparametric model is the Cox proportional hazards model. Semiparametric modelling is, as its name suggests, a hybrid of the parametric and nonparametric approaches to construction, fitting, and validation of statistical models. Estimation in semiparametric models with missing data. Semiparametric model-based inference in the presence of ...
The theory of missing data applied to semiparametric models is scattered throughout the literature, with no thorough, comprehensive treatment of the subject. In statistics, a semiparametric model is a statistical model that has parametric and nonparametric components. Semiparametric Theory and Missing Data, Anastasios A. Tsiatis. This book combines much of what is known in regard to the theory of estimation for semiparametric models with missing data in an organized and comprehensive manner. The theory is applied to a semiparametric missing data model where it is shown that the two-step GEL-weighted estimator possesses good efficiency and robustness properties when nuisance models are ... Semiparametric estimators for the regression coefficients in ... Classical semiparametric inference with missing outcome data is not robust to contamination of the observed data, and a single observation can have arbitrarily large influence on estimation of a parameter of interest.
Semiparametric single-index panel data models with interactive fixed effects. Semiparametric estimation with data missing not at ... Simulation studies demonstrate the relevance of the theory in finite samples. Zhiqiang Tan, Bounded, efficient and doubly robust estimation with inverse weighting, Biometrika, Volume 97, Issue 3. It has just been published, and I've not looked at it yet, but my guess is that it will be of use to many statisticians and trialists. By adopting nonparametric components for the model, the estimation method can be made robust.
A semiparametric logistic regression model is assumed for the response probability, and a nonparametric regression approach for missing data discussed in Cheng (1994) is used in the estimator. Our theoretical results provide new insight for the semiparametric efficiency bounds literature and open the door to new applications. In particular, we investigate a class of regression-like (mean regression, quantile regression) models with missing data, an example of a supply and demand simultaneous equations model and a ... Zhao (1994) and Robins and Rotnitzky (1992) are revisited for semiparametric regression models with missing data using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993). Semiparametric Theory and Missing Data, by Tsiatis, A. To overcome the lack of robustness of the traditional complete-data estimation method and the regression imputation estimation technique, we propose a modified imputation estimation approach called kriging-regression imputation. Asymptotic Theory for the Semiparametric Accelerated Failure Time Model with Missing Data, by Bin Nan, John D.
This paper investigates a class of estimation problems of the semiparametric model with missing data. Theory and practice, Guohua Feng, Bin Peng, Liangjun Su. Semiparametric inverse propensity weighting for nonignorable ... The generalized semiparametric regression models for the cumulative incidence functions with missing covariates are investigated. Tsiatis, Andrea Rotnitzky. Missing Data in Longitudinal Studies. Missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. This book summarizes knowledge regarding the theory of estimation for semiparametric models with missing data. The cumulative incidence function quantifies the probability of failure over time due to a specific cause for competing risks data.
Statistical Analysis with Missing Data, by Little and Rubin, 2002, 408 pages. Missing data is a pervasive problem in data analyses, resulting in datasets that contain censored realizations of a target distribution.
International Conference on Robust Statistics 2016. Abstract: We consider the efficiency bound for the estimation of the parameters of semiparametric models defined solely by restrictions on the means of a vector of correlated outcomes, Y, when the data on Y are missing at random. Handling data with the missing not at random (MNAR) mechanism is still a challenging problem in statistics. We study a class of semiparametric skewed distributions arising when the sample selection process produces nonrandomly sampled observations. We show that any of our class of estimators is asymptotically normal. Multiple imputation (MI; Rubin, 1987) is a principled method for addressing item-level missing data.
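As a small illustration of how multiple imputation results are usually combined, the sketch below applies Rubin's combining rules to a set of per-imputation estimates. The function name `pool_rubin` and its inputs are hypothetical, introduced only for this example. It assumes each of the m completed datasets has already been analysed to give a point estimate and its estimated variance, and it does not perform the imputation step itself.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Combine m per-imputation results with Rubin's rules.

    estimates: point estimates, one per completed dataset
    variances: estimated variances of those point estimates
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = q.mean()                     # pooled point estimate
    within = u.mean()                    # average within-imputation variance
    between = q.var(ddof=1)              # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


if __name__ == "__main__":
    pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(pooled, total_var)
```

The between-imputation term is what separates this from simply averaging the variances: it carries the extra uncertainty due to the missing values into the pooled standard error.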
If we are interested in studying the time to an event such as death due to cancer or failure of a light bulb, the Cox model specifies the following distribution function for ... Efficiency bounds, multiple robustness and sensitivity analysis, Eric J. Missing data often appear as a practical problem when applying classical models in statistical analysis. In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation.
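For the Cox proportional hazards model mentioned above, the distribution function of the event time is commonly written as follows. This is the standard textbook form rather than a quotation from any of the sources collected here, and the symbols (T for the event time, x for the covariate vector, β for the finite-dimensional parameter, λ₀ for the baseline hazard) are notation assumed for this sketch.

```latex
% T: event time; x: covariate vector; \beta: finite-dimensional parameter;
% \lambda_0: unspecified baseline hazard (the nonparametric component).
\[
F(t \mid x) = P(T \le t \mid x)
            = 1 - \exp\!\left( - \int_0^t \lambda_0(u)\, e^{\beta^\top x} \, du \right)
\]
```

The model is semiparametric because β is a finite-dimensional parameter of interest, while the baseline hazard λ₀ is left as an unknown non-negative function.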
Some covariates, however, may be measured with substantial errors. In this article, we propose a nonparametric imputation method based on the propensity score in a general class of semiparametric models for nonignorable missing data. Semiparametric regression analysis with missing response. Tchetgen Tchetgen and Ilya Shpitser. Moreover, the responses may be missing and the missingness may be nonignorable. The main results are given in a more relevant format for the calculations of efficient score functions. Analysis of generalized semiparametric regression models for ... Semiparametric Theory and Missing Data, pp. 53-99.