  • Chapter (Free Access)

    WEIGHTED LASSO IN GRAPHICAL GAUSSIAN MODELING FOR LARGE GENE NETWORK ESTIMATION BASED ON MICROARRAY DATA

    We propose a statistical method based on graphical Gaussian models for estimating large gene networks from DNA microarray data. When estimating large gene networks, the number of genes exceeds the number of samples, so some restrictions on model building are needed. We propose weighted lasso estimation for graphical Gaussian models as a model of large gene networks. In the proposed method, structural learning for gene networks is equivalent to selecting the regularization parameters of the weighted lasso estimation. We investigate this problem from a Bayesian perspective and derive an empirical Bayesian information criterion for choosing these parameters. Unlike the Bayesian network approach, our method can find the optimal network structure and does not require a heuristic structural learning algorithm. We conduct Monte Carlo simulations to show the effectiveness of the proposed method. We also analyze Arabidopsis thaliana microarray data and estimate gene networks.

  • Article (Open Access)

    AN AI APPROACH TO MEASURING FINANCIAL RISK

    Artificial intelligence (AI) brings about new quantitative techniques to assess the state of an economy. Here, we describe a new measure for systemic risk: the Financial Risk Meter (FRM). This measure is based on the penalization parameter (λ) of a linear quantile lasso regression. The FRM is calculated by taking the average of the penalization parameters over the 100 largest US publicly traded financial institutions. We demonstrate the suitability of this AI-based risk measure by comparing the proposed FRM to other measures of systemic risk, such as VIX, SRISK and Google Trends. We find that mutual Granger causality exists between the FRM and these measures, which indicates the validity of the FRM as a systemic risk measure. The project is implemented using parallel computing, and the code is published on www.quantlet.de under the keyword FRM. The R package RiskAnalytics is another tool, intended to integrate and facilitate the research, calculation and analysis methods around the FRM project. The visualization and the up-to-date FRM can be found on hu.berlin/frm.
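    The averaging step described in this abstract can be sketched as follows. This is only an illustration of "average the penalization parameters across institutions": the function name `financial_risk_meter` and the example λ values are hypothetical, and the sketch does not reproduce the quantile lasso fits that would produce the λ values in the actual FRM code.

```python
from statistics import fmean


def financial_risk_meter(lambdas):
    """Average per-institution penalization parameters for one date.

    `lambdas` is assumed to hold one lambda per institution, each obtained
    elsewhere from a quantile lasso regression (not shown here).
    """
    if not lambdas:
        raise ValueError("need at least one penalization parameter")
    return fmean(lambdas)


# Hypothetical lambda values for three institutions on a single date.
example_lambdas = [0.02, 0.05, 0.03]
frm_value = financial_risk_meter(example_lambdas)
```

    In the setting of the abstract the list would contain 100 values, one per financial institution.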

  • Article (Open Access)

    PREDICTION OF HYPERTENSION RISKS WITH FEATURE SELECTION AND XGBOOST

    There are about 1 billion hypertensive patients worldwide. Hypertension has become the main cause of shortened lifespan and disability worldwide. In this paper, we construct a new model based on hybrid feature selection and the standard XGBoost for hypertension detection and prediction. After using Lasso regression to identify hypertension-related factors, we use the standard XGBoost model for hypertension prediction. The results of experiments conducted on data from the BRFSS show that the proposed model can achieve 77.2% accuracy and 84.6% AUC, both about 7% higher than those of the nonoptimized model. Our proposed model can be used not only to predict the risk of hypertension but also to provide customers with suggestions on how to lead a healthy lifestyle.

  • Article (No Access)

    A TRANSCRIPTOME ANALYSIS BY LASSO PENALIZED COX REGRESSION FOR PANCREATIC CANCER SURVIVAL

    Pancreatic cancer is the fourth leading cause of cancer deaths in the United States, with a five-year survival rate below 5% because it is rarely detected at an early stage. Identifying genes that are directly correlated with pancreatic cancer survival is crucial for pancreatic cancer diagnostics and treatment. However, no existing GWAS or transcriptome studies address this problem. We apply lasso penalized Cox regression to a transcriptome study to identify genes that are directly related to pancreatic cancer survival. This method can handle both the right censoring of survival times and the ultrahigh dimensionality of genetic data. A cyclic coordinate descent algorithm is employed to rapidly select the most relevant genes and eliminate the irrelevant ones. Twelve genes have been identified and verified to be directly correlated with pancreatic cancer survival time, and they can be used to predict the survival of future patients.
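    To illustrate how cyclic coordinate descent selects some variables and eliminates others, here is a minimal sketch for the simpler squared-error lasso, minimizing (1/2n)·||y − Xβ||² + λ·||β||₁. This is an assumption made for brevity: the article applies the idea to the Cox partial likelihood, which this sketch does not implement. The function names and the toy data are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for the squared-error lasso (not Cox)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):  # visit coordinates in a fixed cyclic order
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data: the first column carries signal, the second is nearly irrelevant.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta = lasso_coordinate_descent(X, y, lam=0.2)
```

    On this toy data the weak second coefficient is set exactly to zero while the first is kept but shrunk, which is the selection behavior the abstract relies on.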

  • Chapter (No Access)

    An Adaptive Estimation Method for Semiparametric Models and Dimension Reduction

    Xia, Tong, Li and Zhu (2002) proposed a general estimation method termed minimum average variance estimation (MAVE) for semiparametric models. The method has been found very useful for estimating complicated semiparametric models (Xia, Zhang and Tong, 2004; Xia and Härdle, 2006) and for general dimension reduction (Xia, 2008; Wang and Xia, 2008). The method is also easy to combine with other methods in order to incorporate additional statistical requirements (Wang and Yin, 2007). In this paper, we give a general review of the method and discuss some issues that arise in estimating semiparametric models and in dimension reduction (Li, 1991; Cook, 1998) when complicated statistical requirements are imposed, including quantile regression, sparsity of variables and censored data.