  • Article (No Access)

    Eigenvalue-Corrected Natural Gradient Based on a New Approximation

    Second-order optimization methods for training deep neural networks (DNNs) have attracted considerable research interest. A recently proposed method, Eigenvalue-corrected Kronecker Factorization (EKFAC), interprets the natural gradient update as a diagonal method and corrects the inaccurate re-scaling factor in the KFAC eigenbasis. In addition, Trace-restricted Kronecker-factored Approximate Curvature (TKFAC), another recent approximation of the natural gradient, approximates the Fisher information matrix (FIM) as a constant multiple of the Kronecker product of two matrices, chosen so that the trace is preserved by the approximation. In this work, we combine the ideas of these two methods and propose Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC). The proposed method not only corrects the inexact re-scaling factor in the Kronecker-factored eigenbasis, but also adopts the approximation and the effective damping technique of TKFAC. We also discuss the differences and relationships among the related Kronecker-factored approximations. Empirically, our method outperforms SGD with momentum, Adam, EKFAC and TKFAC on several DNNs.
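    The two ingredients can be illustrated with a minimal NumPy sketch, assuming per-layer statistics estimated from a minibatch; this is illustrative only and not the authors' exact TEKFAC update (the damping in particular is simplified, and the function and variable names are ours).

        import numpy as np

        # Assumed per-layer quantities, estimated from a minibatch:
        #   A   : (n_in, n_in)     second moments of layer inputs
        #   B   : (n_out, n_out)   second moments of back-propagated output gradients
        #   dWs : (n, n_out, n_in) per-example weight gradients
        #   g   : (n_out, n_in)    minibatch gradient to precondition

        def trace_restricted_constant(A, B, dWs):
            # TKFAC-style idea: approximate the layer FIM as pi * (A kron B), with pi
            # chosen so the trace of the approximation equals the trace of the sampled
            # FIM, i.e. the mean of ||vec(dW)||^2 over examples.
            fim_trace = np.mean([np.sum(dW ** 2) for dW in dWs])
            return fim_trace / (np.trace(A) * np.trace(B))

        def eigenvalue_corrected_step(A, B, dWs, g, damping=1e-3):
            # EKFAC-style idea: keep the Kronecker-factored eigenbasis U_A kron U_B,
            # but re-estimate the per-coordinate re-scaling factors from the second
            # moments of the rotated per-example gradients.
            _, U_A = np.linalg.eigh(A)
            _, U_B = np.linalg.eigh(B)
            rotated = np.stack([U_B.T @ dW @ U_A for dW in dWs])
            s = np.mean(rotated ** 2, axis=0)        # corrected re-scaling factors
            g_rot = U_B.T @ g @ U_A
            return U_B @ (g_rot / (s + damping)) @ U_A.T

    How TEKFAC couples the trace-restriction constant with the corrected re-scaling factors and its damping follows the paper and is not reproduced here.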

  • Article (No Access)

    CONJUGATE AND NATURAL GRADIENT RULES FOR BYY HARMONY LEARNING ON GAUSSIAN MIXTURE WITH AUTOMATED MODEL SELECTION

    Under the Bayesian Ying–Yang (BYY) harmony learning theory, a harmony function has been developed on a bi-directional architecture of the BYY system for Gaussian mixtures, with the important feature that maximizing it via a general gradient rule performs model selection automatically during parameter learning on a set of samples drawn from a Gaussian mixture. This paper further proposes conjugate and natural gradient rules to implement the maximization of the harmony function, i.e. BYY harmony learning, efficiently on Gaussian mixtures. Simulation experiments demonstrate that the two new gradient rules not only work well but also converge more quickly than the general gradient rule.
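    For context, the generic forms of the two ascent rules on a harmony function $H(\theta)$ are sketched below; the paper derives their concrete instances for the Gaussian mixture parameterization, which are not reproduced here.

        % Natural gradient ascent ($F$ the Fisher information of the model):
        \theta_{t+1} = \theta_t + \eta_t\, F(\theta_t)^{-1} \nabla_\theta H(\theta_t)

        % Conjugate gradient ascent (Polak--Ribi\`ere form of $\beta_t$):
        d_t = \nabla_\theta H(\theta_t) + \beta_t\, d_{t-1}, \qquad
        \beta_t = \frac{\nabla H(\theta_t)^{\top}\bigl(\nabla H(\theta_t) - \nabla H(\theta_{t-1})\bigr)}
                       {\lVert \nabla H(\theta_{t-1}) \rVert^{2}}, \qquad
        \theta_{t+1} = \theta_t + \eta_t\, d_t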

  • Article (No Access)

    BLIND SEPARATION OF MIXED KURTOSIS SIGNED SIGNALS USING PARTIAL OBSERVATIONS AND LOW COMPLEXITY ACTIVATION FUNCTIONS

    Although several highly accurate blind source separation algorithms have been proposed in the literature, they must store and process the whole data set, which may be enormous in some situations. The large memory requirement and costly computation make such blind source separation infeasible to realise at the VLSI level. This paper addresses the problems of very large data sets and high computational complexity, so that the algorithms can run on-line and be implemented at the VLSI level with acceptable accuracy. Our approach partitions the observed signals into several parts and separates the partitioned observations with a simple activation function that requires only "shift-and-add" micro-operations; no division, multiplication or exponential operations are needed. Moreover, a method for obtaining an optimal initial de-mixing weight matrix that shortens the separation time is also presented. The proposed algorithm is tested on several benchmarks available online. The experimental results show that our solution performs comparably to other approaches while having lower space and time complexity.
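    A hedged sketch of the general scheme, not the paper's exact algorithm: a natural-gradient ICA update applied to one partition of the observations at a time, with an extended-infomax-style nonlinearity whose added term is switched by each source's kurtosis sign. The specific activation and function names below are illustrative assumptions; the sign test, one-bit shift and additions are the kind of operations a fixed-point VLSI design can realise cheaply.

        import numpy as np

        def shift_add_activation(y, kurt_sign):
            # y + sign(kurtosis) * sign(y): the sign is a comparison and the addition
            # is cheap in fixed point; kurt_sign (+1/-1 per source) is assumed to be
            # estimated elsewhere and is passed in here as an assumption.
            return y + kurt_sign[:, None] * np.sign(y)

        def separate(partitions, kurt_sign, lr=0.01, W=None):
            n = len(kurt_sign)
            W = np.eye(n) if W is None else W          # initial de-mixing matrix
            for X in partitions:                       # X: (n_sources, n_samples)
                Y = W @ X
                phi = shift_add_activation(Y, kurt_sign)
                # standard natural-gradient ICA update: W += lr * (I - E[phi(y) y^T]) W
                W += lr * (np.eye(n) - (phi @ Y.T) / X.shape[1]) @ W
            return W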