  • Article · No Access

    SCALING LARGE LEARNING PROBLEMS WITH HARD PARALLEL MIXTURES

    A challenge for statistical learning is to deal with large data sets, e.g. in data mining. The training time of ordinary Support Vector Machines (SVMs) is at least quadratic in the number of examples, which raises a serious research challenge if we want to deal with data sets of millions of examples. We propose a "hard parallelizable mixture" methodology which yields significantly reduced training time through modularization and parallelization: the training data is iteratively partitioned by a "gater" model in such a way that it becomes easy to learn an "expert" model separately in each region of the partition. A probabilistic extension and the use of a set of generative models allow representing the gater so that all pieces of the model are locally trained. For SVMs, time complexity appears empirically to grow locally linearly with the number of examples, while generalization performance can be enhanced. For the probabilistic version of the algorithm, the iterative algorithm provably goes down in a cost function that is an upper bound on the negative log-likelihood.

  • Article · No Access

    A TIED-MIXTURE 2D HMM FACIAL IMAGE RETRIEVAL SYSTEM

    In this paper, the effect of mixture tying on a second-order 2D Hidden Markov Model (HMM) is studied as applied to the face recognition problem. While tying HMM parameters is a well-known solution in the case of insufficient training data that leads to nonrobust estimation, it is used here to improve the overall performance in the small model case where the resolution in the observation space is the main problem.

    The fully-tied-mixture 2D HMM-based face recognition system is applied to the facial database of AT&T and the facial database of Georgia Institute of Technology. The performance of the proposed 2D HMM tied-mixture system is studied and the expected improvement is confirmed.

  • Article · Open Access

    Initializing the EM Algorithm for Univariate Gaussian, Multi-Component, Heteroscedastic Mixture Models by Dynamic Programming Partitions

    Setting the initial parameter values of mixture distributions estimated with the recursive EM algorithm is very important to the overall quality of estimation. None of the existing methods is suitable for heteroscedastic mixtures with a large number of components. We present a novel methodology for estimating the initial parameter values of univariate, heteroscedastic Gaussian mixtures, based on dynamic programming partitioning of the range of observations into bins. We evaluate variants of the dynamic programming method corresponding to different scoring functions for partitioning. We demonstrate the superior efficiency of the proposed method compared to existing techniques for both simulated and real datasets.
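
    The general idea of initializing a univariate mixture from a dynamic programming partition can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names `dp_partition` and `initial_parameters`, the `min_var` floor, and the scoring function (total within-bin sum of squared deviations) are all assumptions, since the abstract does not specify which scoring functions were evaluated.

```python
from statistics import fmean, pvariance


def dp_partition(values, k):
    """Split the sorted values into k contiguous bins.

    Assumed scoring function: total within-bin sum of squared deviations
    (the abstract does not say which scores the authors used).
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of observations")

    # Prefix sums make the cost of any bin xs[i:j] an O(1) lookup.
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[b][j]: minimal score for splitting xs[:j] into b bins.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for b in range(1, k + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = best[b - 1][i] + cost(i, j)
                if candidate < best[b][j]:
                    best[b][j] = candidate
                    back[b][j] = i

    # Walk the back-pointers from the last bin to the first.
    bins, j = [], n
    for b in range(k, 0, -1):
        i = back[b][j]
        bins.append(xs[i:j])
        j = i
    bins.reverse()
    return bins


def initial_parameters(values, k, min_var=1e-6):
    """Return one (weight, mean, variance) triple per bin as EM start values."""
    bins = dp_partition(values, k)
    n = sum(len(b) for b in bins)
    # min_var is an assumed floor so single-point bins do not yield zero variance.
    return [(len(b) / n, fmean(b), max(pvariance(b), min_var)) for b in bins]
```

    Each bin contributes its relative size as the component weight and its own mean and variance, so the components start heteroscedastic. The triple loop costs O(k·n²) time, which is acceptable for a sketch but would need a faster formulation for large samples.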

  • Chapter · Open Access

    TRACING CO-REGULATORY NETWORK DYNAMICS IN NOISY, SINGLE-CELL TRANSCRIPTOME TRAJECTORIES

    The availability of gene expression data at the single cell level makes it possible to probe the molecular underpinnings of complex biological processes such as differentiation and oncogenesis. Promising new methods have emerged for reconstructing a progression 'trajectory' from static single-cell transcriptome measurements. However, it remains unclear how to adequately model the appreciable level of noise in these data to elucidate gene regulatory network rewiring. Here, we present a framework called Single Cell Inference of MorphIng Trajectories and their Associated Regulation (SCIMITAR) that infers progressions from static single-cell transcriptomes by employing a continuous parametrization of Gaussian mixtures in high-dimensional curves. SCIMITAR yields rich models from the data that highlight genes with expression and co-expression patterns that are associated with the inferred progression. Further, SCIMITAR extracts regulatory states from the implicated trajectory-evolving co-expression networks. We benchmark the method on simulated data to show that it yields accurate cell ordering and gene network inferences. Applied to the interpretation of a single-cell human fetal neuron dataset, SCIMITAR finds progression-associated genes in cornerstone neural differentiation pathways missed by standard differential expression tests. Finally, by leveraging the rewiring of gene-gene co-expression relations across the progression, the method reveals the rise and fall of co-regulatory states and trajectory-dependent gene modules. These analyses implicate new transcription factors in neural differentiation, including putative co-factors for the multi-functional NFAT pathway.