Over the last decades, the rapid development of next-generation sequencing has revolutionized gene discovery. These technologies have accelerated the mapping of single nucleotide polymorphisms (SNPs) across the human genome, revealing the extensive heterogeneity that characterizes individuals worldwide. Fractal dimension (FD) measures the degree of geometric irregularity, quantifying how “complex” a self-similar natural phenomenon is. We compared two FD algorithms, box-counting dimension (BCD) and Higuchi’s fractal dimension (HFD), for characterizing genome-wide patterns of SNPs extracted from the HapMap data set, which includes data from 1184 healthy subjects from 11 populations. In addition, we used cluster and classification analyses to relate the genetic distances within chromosomes, based on FD similarities, to the geographical distances among the 11 global populations. HFD outperformed BCD in both analyses. In the grand-average cluster analysis, assessed by the cophenetic correlation coefficient (for which values closer to 1 indicate a more accurate clustering solution), HFD scored 0.981 and BCD 0.956. In the classification of the 11 populations in the HapMap data set, HFD achieved 79.0% accuracy, 61.7% sensitivity, and 96.4% specificity, compared with 69.1% accuracy, 43.2% sensitivity, and 94.9% specificity for BCD. These results support HFD as a reliable measure for representing individual variation within all chromosomes and for categorizing individuals and global populations.
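To make the HFD measure concrete, the following is a minimal sketch of Higuchi's algorithm for a one-dimensional series. It is not the authors' implementation. The function name `higuchi_fd`, the default `kmax=8`, and the 0/1/2 genotype coding in the usage lines are assumptions made for illustration, since the abstract does not state how SNP data were encoded as a series or which parameters were used.

```python
import numpy as np


def higuchi_fd(x, kmax=8):
    """Estimate Higuchi's fractal dimension of a 1-D series.

    kmax (largest delay) is an illustrative default, not a value
    taken from the abstract. A constant series gives zero curve
    length and is not handled here.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, kmax + 1)
    mean_lengths = np.empty(kmax)

    for idx, k in enumerate(ks):
        lengths_k = []
        for m in range(k):  # starting offset, 0-indexed
            n_steps = (n - 1 - m) // k  # number of increments of size k
            if n_steps < 1:
                continue
            # Sub-series x[m], x[m + k], ..., x[m + n_steps * k]
            sub = x[m : m + n_steps * k + 1 : k]
            dist = np.abs(np.diff(sub)).sum()
            # Higuchi's normalisation for the differing sub-series lengths
            norm = (n - 1) / (n_steps * k)
            lengths_k.append(dist * norm / k)
        mean_lengths[idx] = np.mean(lengths_k)

    # L(k) ~ k^(-FD), so FD is the slope of log L(k) against log(1/k)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(mean_lengths), 1)
    return slope


# Usage with a hypothetical genotype series coded 0/1/2 (assumed encoding)
rng = np.random.default_rng(42)
genotypes = rng.integers(0, 3, size=2000)
fd_estimate = higuchi_fd(genotypes)
```

As a sanity check on the sketch, a straight line yields a dimension of 1, while uncorrelated noise yields a value close to 2, the upper bound for a curve in the plane.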
A key step in the study of Alzheimer's disease (AD) is to identify associations between genetic variations and intermediate phenotypes (e.g., brain structures). At the same time, it is crucial to develop a noninvasive means for AD diagnosis. Although these two tasks, association discovery and disease diagnosis, have been treated separately by a variety of approaches, they are tightly coupled due to their common biological basis. We hypothesize that the two tasks can benefit each other through a joint analysis, because (i) the association study discovers correlated biomarkers from different data sources, which may help improve diagnosis accuracy, and (ii) the disease status may help identify disease-sensitive associations between genetic variations and MRI features. Based on this hypothesis, we present a new sparse Bayesian approach for joint association study and disease diagnosis. In this approach, common latent features are extracted from different data sources using sparse projection matrices and are used to predict multiple disease severity levels through Gaussian process ordinal regression; in return, the disease status is used to guide the discovery of relationships between the data sources. The sparse projection matrices not only reveal the associations but also select groups of biomarkers related to AD. To learn the model from data, we develop an efficient variational expectation maximization algorithm. Simulation results demonstrate that our approach achieves higher accuracy in both predicting ordinal labels and discovering associations between data sources than alternative methods. We apply our approach to an imaging genetics dataset of AD. Our joint analysis approach not only identifies meaningful and interesting associations between genetic variations, brain structures, and AD status, but also achieves significantly higher accuracy for predicting ordinal AD stages than the competing methods.