We propose a statistical method based on graphical Gaussian models for estimating large gene networks from DNA microarray data. Because the number of genes exceeds the number of samples when estimating large gene networks, some restrictions are needed for model building. We propose weighted lasso estimation for graphical Gaussian models as a model of large gene networks. In the proposed method, structural learning for gene networks is equivalent to selecting the regularization parameters of the weighted lasso estimation. We investigate this problem from a Bayesian perspective and derive an empirical Bayesian information criterion for choosing them. Unlike the Bayesian network approach, our method can find the optimal network structure and does not require a heuristic structural learning algorithm. We conduct Monte Carlo simulations to show the effectiveness of the proposed method. We also analyze Arabidopsis thaliana microarray data and estimate gene networks.
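The abstract does not state the objective function. As a hedged illustration only, one common way to write a weighted-lasso penalized likelihood for a graphical Gaussian model is shown below. The symbols are assumptions introduced here, not the authors' notation: $S$ is the sample covariance matrix, $\Omega = (\omega_{ij})$ is the precision matrix, and $\lambda_{ij} \ge 0$ are edge-specific regularization parameters.

```latex
\hat{\Omega} = \arg\max_{\Omega \succ 0}
  \left\{ \log\det\Omega - \operatorname{tr}(S\Omega)
  - \sum_{i \neq j} \lambda_{ij}\,\lvert \omega_{ij} \rvert \right\}
```

Under this formulation, genes $i$ and $j$ are joined by an edge exactly when $\hat{\omega}_{ij} \neq 0$, which is why choosing the $\lambda_{ij}$ amounts to choosing the network structure.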
Xia, Tong, Li and Zhu (2002) proposed a general estimation method termed minimum average variance estimation (MAVE) for semiparametric models. The method has been found very useful in estimating complicated semiparametric models (Xia, Zhang and Tong, 2004; Xia and Härdle, 2006) and in general dimension reduction (Xia, 2008; Wang and Xia, 2008). The method is also easy to combine with other methods in order to incorporate additional statistical requirements (Wang and Yin, 2007). In this paper, we give a general review of the method and discuss some issues that arise in estimating semiparametric models and in dimension reduction (Li, 1991; Cook, 1998) when complicated statistical requirements are imposed, including quantile regression, sparsity of variables and censored data.
Recent interest in the application of microarray technology focuses on relating gene expression profiles to censored survival outcomes such as patients' overall survival time or time to cancer relapse. Because gene expression data are high-dimensional, regularization is an effective approach for such analyses. In this chapter, we review several aspects of recent developments in penalized regression models for censored survival data with high-dimensional covariates, e.g. gene expressions. We first discuss the Cox proportional hazards model (Cox 1972) as the primary example and then consider the accelerated failure time model (Kalbfleisch and Prentice 2002).
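To make the idea of a penalized Cox model concrete, the following Python sketch evaluates a lasso-penalized negative log partial likelihood. It is an illustration under stated assumptions, not code from the chapter: it assumes no tied event times, the names `penalized_cox_nll` and `lam` are hypothetical, and it only evaluates the objective rather than minimizing it.

```python
import math


def penalized_cox_nll(times, events, X, beta, lam):
    """Lasso-penalized negative log partial likelihood for the Cox model.

    Assumes no tied event times. events[i] is 1 for an observed event
    and 0 for a censored observation. X is a list of covariate rows.
    """
    # Linear predictor x_i' beta for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored subjects contribute only through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk)
    # L1 penalty on the coefficients (assumed form; lam >= 0).
    penalty = lam * sum(abs(b) for b in beta)
    return -loglik + penalty
```

With high-dimensional covariates, an objective of this kind would typically be minimized over `beta` for a grid of `lam` values, with larger `lam` setting more coefficients exactly to zero.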
In this paper, we suggest a pretest estimation strategy for variable selection and for estimating the regression parameters using the quasi-likelihood method when uncertain prior information (UPI) exists. We also apply the lasso-type estimation and variable selection strategy and compare the relative performance of the lasso with the pretest and quasi-likelihood estimators. The performance of each estimator is evaluated in terms of the simulated mean square error. Further, we develop the asymptotic properties of the pretest estimator (PTE) using the notion of asymptotic distributional risk, and compare it with the unrestricted quasi-likelihood estimator (UE) and the restricted quasi-likelihood estimator (RE), respectively. The asymptotic result demonstrates the superiority of the pretest strategy over the UE and RE in a meaningful part of the parameter space. The simulation results show that when the UPI is correctly specified the PTE outperforms the lasso.
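The contrast between a pretest rule and a lasso-type rule can be sketched in the simplest possible setting. The Python below assumes a single coefficient with an approximately normal unrestricted estimate, which is far simpler than the quasi-likelihood setting of the paper. The function names, the default restricted value `theta0`, and the critical value 1.96 are illustrative assumptions.

```python
def pretest_estimate(theta_hat, se, theta0=0.0, crit=1.96):
    """Pretest estimator for one coefficient.

    Keeps the restricted value theta0 (the uncertain prior information)
    unless a two-sided z-test rejects it, in which case the unrestricted
    estimate theta_hat is returned.
    """
    z = (theta_hat - theta0) / se
    return theta_hat if abs(z) > crit else theta0


def soft_threshold(theta_hat, lam):
    """Lasso solution for one coefficient under an orthonormal design.

    Minimizes 0.5 * (theta_hat - theta)**2 + lam * abs(theta).
    """
    if theta_hat > lam:
        return theta_hat - lam
    if theta_hat < -lam:
        return theta_hat + lam
    return 0.0
```

In this toy setting the pretest rule either keeps the unrestricted estimate unchanged or replaces it with the restricted value, whereas the soft-threshold rule shrinks every surviving estimate toward zero by `lam`.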