Semiparametric Theory and Missing Data, by Anastasios A. Tsiatis. Missing data occur frequently in empirical studies in the health and social sciences, often compromising our ability to make accurate inferences. Missing data are a major issue in clinical trials. The theory is applied to a semiparametric missing data model, where it is shown that the two-step GEL-weighted estimator possesses good efficiency and robustness properties when nuisance models are ... Statistical Analysis with Missing Data, by Little and Rubin, 2002, 408 pages. Missing data are a very important issue in clinical trials, and many models have been devised to handle them, including multiple imputation (Rubin), pattern mixture models (Little), and mixed-effects linear and nonlinear models (Pinheiro and Bates, Molenberghs and Verbeke, and Davidian). A well-known example of a semiparametric model is the Cox proportional hazards model.
Semiparametric efficiency in multivariate regression models. We consider the efficiency bound for the estimation of the parameters of semiparametric models defined solely by restrictions on the means of a vector of correlated outcomes, Y, when the data on Y are missing at random. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. Statistics in the Pharmaceutical Industry, 3rd edition. When the proportion of item-level missing values is nontrivial and the data are not missing completely at random (MCAR), typical solutions for missing data, such as complete case analysis, often lead to increased bias and reduced statistical power.
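The bias of complete case analysis when data are not MCAR can be illustrated with a small simulation. This is only a sketch under assumed settings that do not come from any of the works quoted here: a binary covariate `x`, an outcome with true mean 2.0, and an outcome that is observed with probability 0.9 when `x = 0` and 0.3 when `x = 1` (missing at random given `x`). Under these assumptions the complete case mean converges to 1.5 rather than 2.0, while weighting each observed outcome by the inverse of its estimated observation probability recovers the target.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate (x, y, r) rows; y is treated as observed only when r == 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true E[Y] = 2.0
        p_obs = 0.3 if x == 1 else 0.9           # depends on x only: MAR, not MCAR
        r = 1 if rng.random() < p_obs else 0
        rows.append((x, y, r))
    return rows


def complete_case_mean(rows):
    """Average the outcome over rows where it was observed."""
    observed = [y for _, y, r in rows if r == 1]
    return sum(observed) / len(observed)


def ipw_mean(rows):
    """Inverse probability weighted mean with stratum-wise estimated propensities."""
    propensity = {}
    for level in (0, 1):
        indicators = [r for x, _, r in rows if x == level]
        propensity[level] = sum(indicators) / len(indicators)
    weighted = sum(y / propensity[x] for x, y, r in rows if r == 1)
    return weighted / len(rows)


rows = simulate()
cc = complete_case_mean(rows)
ipw = ipw_mean(rows)
print(f"complete case mean: {cc:.3f}")
print(f"IPW mean:           {ipw:.3f}")
```

With a binary covariate the propensity can be estimated by simple stratum proportions; with continuous covariates a fitted model such as logistic regression would typically take its place.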
We also refer to Chen, Hong and Tarozzi (2008), who study semiparametric efficiency. This paper considers the problem of parameter estimation in a general class of semiparametric models when observations are subject to missingness at random. Asymptotic theory for the semiparametric accelerated failure time model with missing data. This sensitivity is exacerbated when inverse probability weighting methods are used, which may overweight contaminated observations. Semiparametric nonlinear mixed-effects (NLME) models are flexible for modelling complex longitudinal data. It has just been published, and I've not looked at it yet, but my guess is that it will be of use to many statisticians and trialists. Missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. Semiparametric modelling is, as its name suggests, a hybrid of the parametric and nonparametric approaches to construction, fitting, and validation of statistical models. The theory of missing data applied to semiparametric models is scattered throughout the literature, with no thorough, comprehensive treatment of the subject. We study a class of semiparametric skewed distributions arising when the sample selection process produces nonrandomly sampled observations. Estimation in semiparametric models with missing data.
In such settings, identification is generally not possible without imposing additional assumptions. Semiparametric methods for missing data and causal inference. Abstract: In this dissertation, we propose methodology to account for missing data as well as a strategy to account for outcome heterogeneity. The semiparametric models allow for estimating functions that are nonsmooth with respect to the parameter. Semiparametric theory for causal mediation analysis. We consider a semiparametric model that parameterizes the conditional density of the response, given covariates, but allows the marginal distribution ... In this article, we propose a nonparametric imputation method based on the propensity score in a general class of semiparametric models for nonignorable missing data. We propose a nonparametric imputation method for the missing values, which then leads to imputed estimating equations for the finite-dimensional parameter of interest. Semiparametric single-index panel data models with interactive fixed effects: theory and practice, by Guohua Feng, Bin Peng, Liangjun Su.
Covariates are usually introduced in the models to partially explain inter-individual variations. In this paper, we consider a semiparametric regression model in the presence of missing covariates for nonparametric components under a Bayesian framework.
Simulation studies demonstrate the relevance of the theory in finite samples. In many cases, missing data in an analysis are treated in a casual and ad hoc manner, leading to invalid inferences and erroneous conclusions. Semiparametric efficiency in GMM models of nonclassical measurement errors, missing data and treatment effects, by Xiaohong Chen, Han Hong, and Alessandro Tarozzi, March 2008, Cowles Foundation discussion paper. Some covariates, however, may be measured with substantial errors. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. For a more thorough discussion of semiparametric efficiency theory and precise definitions of pathwise derivatives and the tangent space we refer to ... Our theoretical results provide new insight for the literature on semiparametric efficiency bounds and open the door to new applications.
Efficiency bounds, multiple robustness and sensitivity analysis, by Eric J. Tchetgen Tchetgen and Ilya Shpitser. Semiparametric models allow at least part of the data-generating process to be modeled nonparametrically. Multiple imputation (MI; Rubin, 1987) is a principled method for addressing item-level missing data. Zhao (1994) and Robins and Rotnitzky (1992) are revisited for semiparametric regression models with missing data using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993).
The cumulative incidence function quantifies the probability of failure over time due to a specific cause for competing risks data. We develop inference tools in a semiparametric partially linear regression model with missing response data. Semiparametric regression analysis with missing response at random, by Qihua Wang, Oliver Linton and Wolfgang H. Zhiqiang Tan, Bounded, efficient and doubly robust estimation with inverse weighting, Biometrika, volume 97, issue 3. If time permits, it will also cover other advanced topics on handling incomplete data, such as in high-dimensional data. Strategies for Bayesian modeling and sensitivity analysis.
Tsiatis takes a new approach with semiparametric models. In particular, we investigate a class of regression-like (mean regression, quantile regression, ...) models with missing data, an example of a supply and demand simultaneous equations model and a ... We consider a class of doubly weighted rank-based estimating methods for the transformation or accelerated failure time model with missing data. Rod Little and Don Rubin have contributed massively to the development of theory and methods for handling missing data (Rubin being the originator of multiple imputation). This book combines much of what is known in regard to the theory of estimation for semiparametric models with missing data in an organized and comprehensive manner. While many of the other missing data books do mention clinical trials (some quite extensively), this book focuses exclusively on missing data in trials. To overcome the lack of robustness of the traditional complete-data estimation method and the regression imputation technique, we propose a modified imputation approach called kriging-regression imputation.
Missing data are a pervasive problem in data analyses, resulting in datasets that contain censored realizations of a target distribution. Classical semiparametric inference with missing outcome data is not robust to contamination of the observed data, and a single observation can have arbitrarily large influence on estimation of a parameter of interest. Semiparametric regression models with missing data. In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation. A class of estimators is defined that includes as special cases a semiparametric regression imputation estimator, a marginal average estimator, and a marginal propensity score weighted estimator. A statistical model is a parameterized family of distributions. The asymptotic properties of these estimators are developed using the theory of counting processes and semiparametric theory for missing data problems. In many cases, the treatment of missing data in an analysis is carried out in a casual and ad hoc manner, leading to invalid inference and erroneous conclusions.
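The three special cases named above (regression imputation, marginal average, and propensity score weighting) can be sketched for the mean of a response that is missing at random. The code below is a simplified illustration, not the estimators of any particular paper quoted here: it assumes a plain linear outcome model in place of a partially linear one, a logistic model for the response probability, and simulated data with true mean 1.0. The helper names `fit_line` and `fit_logistic` are hypothetical and written only for this example.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_logistic(xs, rs, steps=25):
    """Newton-Raphson fit of P(R = 1 | x) = expit(a + b * x); returns (a, b)."""
    a = b = 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, r in zip(xs, rs):
            p = expit(a + b * x)
            w = p * (1.0 - p)
            g0 += r - p            # score for the intercept
            g1 += (r - p) * x      # score for the slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Assumed simulation design: Y = 1 + 2X + noise, so the target E[Y] is 1.0.
rng = random.Random(1)
n = 5000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
rs = [1 if rng.random() < expit(0.5 + x) else 0 for x in xs]  # MAR given x

# Outcome model fitted on observed responses only; propensity model on all rows.
obs_x = [x for x, r in zip(xs, rs) if r]
obs_y = [y for y, r in zip(ys, rs) if r]
a, b = fit_line(obs_x, obs_y)
m_hat = [a + b * x for x in xs]
c, d = fit_logistic(xs, rs)
p_hat = [expit(c + d * x) for x in xs]

# Values of y with r == 0 are multiplied by zero below, so they are never used.
complete_case = sum(obs_y) / len(obs_y)
imputation = sum(r * y + (1 - r) * m for y, r, m in zip(ys, rs, m_hat)) / n
marginal_average = sum(m_hat) / n
propensity_weighted = sum(
    r * y / p + (1 - r / p) * m for y, r, m, p in zip(ys, rs, m_hat, p_hat)
) / n

for name, value in [
    ("complete case", complete_case),
    ("regression imputation", imputation),
    ("marginal average", marginal_average),
    ("propensity score weighted", propensity_weighted),
]:
    print(f"{name:>26}: {value:.3f}")
```

In this design the complete case mean is pulled upward because responses are more likely to be observed for large `x`, whereas the three model-based estimators all target the true mean of 1.0.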
Missing data often appear as a practical problem when applying classical models in statistical analysis. If we are interested in studying the time to an event such as death due to cancer or failure of a light bulb, the Cox model specifies a distribution function for that event time. By adopting nonparametric components for the model, the estimation method can be made robust. The generalized semiparametric regression models for the cumulative incidence functions with missing covariates are investigated. It starts with the study of semiparametric methods when there are no missing data. Handling data with the missing not at random (MNAR) mechanism is still a challenging problem in statistics. Missing data in longitudinal studies. Missing data arise in almost all scientific disciplines. Semiparametric Theory and Missing Data, 2006. Semiparametric location estimation under nonrandom sampling.
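For reference, the Cox model mentioned above is usually stated through the hazard of the event time. The notation below (event time T, covariate vector x, coefficient vector β, baseline hazard λ₀) is assumed for illustration and does not come from the text quoted here; the distribution function follows from the hazard by integration.

```latex
\lambda(t \mid x) = \lambda_0(t)\, e^{\beta^\top x},
\qquad
F(t \mid x) = P(T \le t \mid x)
            = 1 - \exp\!\left( - e^{\beta^\top x} \int_0^t \lambda_0(u)\, du \right).
```

Here β is the finite-dimensional parametric component, and the baseline hazard λ₀ is left unspecified as the nonparametric component, which is what makes the model semiparametric.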
The main results are given in a more relevant format for the calculations of efficient score functions. In this paper, based on the exponential tilting model, we propose a semiparametric estimation method of mean functionals with nonignorable missing data. While estimation of the marginal total causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years, investigators have also become increasingly interested in mediation analysis. International Conference on Robust Statistics 2016. The description of the theory of estimation for semiparametric models is both rigorous and intuitive, relying on geometric ideas to reinforce the intuition and understanding of the theory.
This book summarizes knowledge regarding the theory of estimation for semiparametric models with missing data. Asymptotic theory for the semiparametric accelerated failure time model with missing data, by Bin Nan, John D. Kalbfleisch and Menggang Yu (University of Michigan, University of Michigan and Indiana University). We show that every estimator in our class is asymptotically normal. A semiparametric logistic regression model is assumed for the response probability, and a nonparametric regression approach for missing data discussed in Cheng (1994) is used in the estimator. These methods are also illustrated using data from a stage II breast cancer clinical trial. This paper investigates a class of estimation problems of the semiparametric model with missing data. Moreover, the responses may be missing and the missingness may be nonignorable. In statistics, a semiparametric model is a statistical model that has parametric and nonparametric components.