Description
The generalized varying-coefficient mixed model (GVMM), a nonparametric model, has been well studied in the statistics literature and has shown great potential in a wide variety of data-intensive applications. A classical GVMM contains both fixed effects and random effects and allows the coefficients to change with time. While such models can be readily estimated from completely observed data, robustly estimating a GVMM from partially observed data remains challenging, for example in biomedical studies. Data may be missing in the explanatory variables, in the response variables, or in both. The situation becomes more complicated with practical data, which may be noisy or even erroneous. The objective of this dissertation is to study a theoretical framework for estimating a GVMM with missing data, and to explore how surrogate information can help resolve ambiguities in model estimation with the aid of local quasi-likelihood techniques. The proposed framework comprises three major components. First, we rigorously study estimation with response variables missing at random in the generalized linear varying-coefficient model (GLVM), which is a special case of the GVMM. We propose three nonparametric methods based on local quasi-likelihood estimation: a local quasi-likelihood estimator using only complete-case data, a locally weighted quasi-likelihood estimator, and a local quasi-likelihood estimator with imputed values. We also develop a local quasi-likelihood imputation method for estimating the mean function of the response variable. Analyses of asymptotic properties show that, among the three working estimators, the proposed imputation estimator performs better than the other two, which perform similarly to each other. Simulation results, with comparisons to alternative methods, also support this conclusion.
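As a rough illustration of the three strategies named above, the following sketch applies them to a much simpler problem than the one studied in the dissertation: estimating the mean of a response that is missing at random given a scalar covariate. Ordinary Nadaraya-Watson kernel smoothing stands in for local quasi-likelihood, and every name here (`kernel_smooth`, `mean_estimators`, the bandwidth `h`) is hypothetical. It is a sketch under these assumptions, not the dissertation's estimator.

```python
import numpy as np

def kernel_smooth(x_eval, x, values, weights, h):
    """Nadaraya-Watson smoother with a Gaussian kernel.

    Assumption: this simple smoother stands in for local quasi-likelihood.
    """
    d = (x_eval[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * d**2) * weights[None, :]
    return (k @ values) / k.sum(axis=1)

def mean_estimators(x, y, observed, h=0.2):
    """Return (complete-case, weighted, imputation) estimates of E[Y]."""
    observed = np.asarray(observed, dtype=bool)
    delta = observed.astype(float)
    y0 = np.where(observed, y, 0.0)  # never read the unobserved responses

    # 1. Complete-case: average over observed responses only.
    cc = y0.sum() / delta.sum()

    # 2. Weighted: estimate pi(x) = P(observed | x) by smoothing the
    #    missingness indicator, then weight each observed response by 1/pi.
    pi_hat = kernel_smooth(x, x, delta, np.ones_like(delta), h)
    ipw = np.mean(delta * y0 / pi_hat)

    # 3. Imputation: smooth the observed responses to get m_hat(x), then
    #    fill in m_hat(x) wherever the response is missing.
    m_hat = kernel_smooth(x, x, y0, delta, h)
    imp = np.mean(delta * y0 + (1.0 - delta) * m_hat)
    return cc, ipw, imp
```

When nothing is missing, all three estimates reduce to the ordinary sample mean. They differ only in how they compensate for the unobserved responses.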
Second, we propose a two-step weighted estimator for the GVMM using fully observed data. The first step is to partition all data samples into multiple independent centers based on random effects, and to estimate a GVMM for each center using the quasi-likelihood method. The second step is to construct an across-center estimator by taking a weighted average of the center-specific estimators. This two-step estimator simplifies model estimation. We analyze the major mathematical properties of the proposed estimator and prove that it is consistent. We empirically validate the proposed estimator and its properties on simulated data, showing that the estimation works properly. We will also apply the proposed method to a Vicon physical action data set to demonstrate its practical effectiveness. Third, we study how to deal with missing measurements in the explanatory variables of a GVMM, and extend the proposed two-step estimator to take advantage of surrogate data when available. In the first step, we estimate the parameters within a given center as follows: when an explanatory variable is observed, a regular local quasi-likelihood method is used; when the variable is missing, a semiparametric method based on the mean score method is employed to exploit the related surrogate data. In the second step, we derive an expression for the weights used in the across-center estimator to facilitate the optimization of model parameters. Empirical studies on simulated data, with comparisons to alternative methods, show promising results for the proposed estimator.
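The two-step structure described above (fit each center separately, then take a weighted average across centers) can be sketched as follows. Several simplifications are assumed: coefficients are constant rather than varying, ordinary least squares stands in for the quasi-likelihood fit, centers are given by a label array, and the weights are inverse estimated covariances. The abstract does not state the form of the weights, so that choice and the function names (`fit_center`, `combine_centers`, `two_step_estimate`) are hypothetical.

```python
import numpy as np

def fit_center(X, y):
    """Step 1: fit a single center.

    Assumption: least squares stands in for the quasi-likelihood fit.
    Returns the coefficient estimate and its estimated precision matrix.
    """
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ y)
    resid = y - X @ beta
    dof = max(len(y) - X.shape[1], 1)
    sigma2 = max(resid @ resid / dof, 1e-12)  # guard against a zero variance
    return beta, XtX / sigma2

def combine_centers(fits):
    """Step 2: weighted average of the center-specific estimates.

    Assumption: matrix weights W_k = (sum_j P_j)^{-1} P_k, where P_k is the
    precision from center k. The weights sum to the identity matrix.
    """
    total = sum(p for _, p in fits)
    weighted = sum(p @ b for b, p in fits)
    return np.linalg.solve(total, weighted)

def two_step_estimate(X, y, center_ids):
    """Partition rows by center label, fit each center, then combine."""
    fits = [fit_center(X[center_ids == c], y[center_ids == c])
            for c in np.unique(center_ids)]
    return combine_centers(fits)
```

With these weights, centers whose estimates are more precise contribute more to the across-center estimate. If every center produces the same estimate, the combined estimate equals it.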