EXPLORATORY MODELING OF YEAST STRESS RESPONSE AND ITS REGULATION WITH gCCA AND ASSOCIATIVE CLUSTERING
Abstract
We model dependencies between m multivariate, continuous-valued information sources by combining (i) generalized canonical correlation analysis (gCCA), which reduces dimensionality while preserving the dependencies among m - 1 of the sources, and (ii) associative clustering, which summarizes their dependencies with the remaining source. This new combination of methods avoids multiway associative clustering, which would require a multiway contingency table and hence suffer from the curse of dimensionality of the table. The method is applied to summarize properties of yeast stress by searching for dependencies (commonalities) between the expression of genes of baker's yeast, Saccharomyces cerevisiae, under various stressful treatments, and to summarize stress regulation by finally adding data about transcription factor binding sites.
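As a rough illustration of the two-stage idea, the following Python sketch (ours, not the implementation used in this work) computes a gCCA-style projection by whitening each source and taking the leading principal directions of the concatenated whitened data, and then cross-tabulates two cluster labelings into a two-way contingency table. The function names, the whitening regularizer eps, and the synthetic data are assumptions made for illustration. In particular, the fixed sign-based partitions in the demo are only a stand-in: associative clustering would instead optimize the partitions to maximize the dependency visible in the table.

```python
import numpy as np


def whiten(X, eps=1e-8):
    """Center X (samples x features) and whiten it to near-identity covariance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance (eps is an assumed regularizer).
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return Xc @ W


def gcca_project(views, n_components):
    """gCCA-style projection: PCA on the concatenation of per-view whitened data.

    Returns the projected samples and the leading eigenvalues; large
    eigenvalues indicate directions shared across the views.
    """
    Z = np.hstack([whiten(X) for X in views])
    cov = Z.T @ Z / (Z.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_components]
    return Z @ evecs[:, order], evals[order]


def contingency_table(labels_a, labels_b, k_a, k_b):
    """Count co-occurrences of two cluster labelings of the same samples."""
    table = np.zeros((k_a, k_b), dtype=int)
    np.add.at(table, (labels_a, labels_b), 1)
    return table


# Synthetic demo: three sources share one latent signal, a fourth is held out.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
views = [latent @ np.ones((1, d)) + 0.1 * rng.normal(size=(500, d))
         for d in (3, 4, 5)]
remaining = latent + 0.5 * rng.normal(size=(500, 1))

projection, eigenvalues = gcca_project(views, n_components=1)

# Stand-in partitions (NOT associative clustering): split each side by sign.
labels_proj = (projection[:, 0] > 0).astype(int)
labels_rem = (remaining[:, 0] > 0).astype(int)
print(contingency_table(labels_proj, labels_rem, 2, 2))
```

Because only one reduced representation and one remaining source are cross-tabulated, the table stays two-way regardless of m, which is the point of the combination described above.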