World Scientific
Batched Computation of the Singular Value Decompositions of Order Two by the AVX-512 Vectorization

    https://doi.org/10.1142/S0129626420500152
    Cited by: 3 (Source: Crossref)

    In this paper, a vectorized algorithm is proposed for simultaneously computing up to eight singular value decompositions (SVDs, each of the form A = UΣV*) of real or complex matrices of order two. The algorithm extends to a batch of such matrices of arbitrary length n, which arises, for example, in the annihilation part of the parallel Kogbetliantz algorithm for the SVD of matrices of order 2n. The SVD method for a single matrix of order two is derived first. It scales the input matrix A, in most instances error-free, such that the scaled singular values cannot overflow whenever the elements of A are finite, and then computes the URV factorization of the scaled matrix, followed by the SVD of the non-negative upper-triangular middle factor. A vector-friendly data layout for the batch is then introduced, where the same-indexed elements of each of the input and the output matrices form vectors, and the algorithm's steps over such vectors are described. The vectorized approach is shown to be about three times faster than processing each matrix in the batch separately, while slightly improving accuracy over the straightforward method for the 2×2 SVD.
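To make the batch layout and the scaling idea concrete, here is a minimal Python sketch for real matrices only. It is not the paper's algorithm: the URV factorization step is replaced by a well-known trigonometric closed form for the 2×2 SVD, and the loop over list positions merely stands in for AVX-512 vector lanes. The function names `svd2_scaled` and `batch_svd2`, the output keys, and the choice of scaling the largest magnitude into [0.5, 1) are all assumptions made for illustration.

```python
import math

def svd2_scaled(a11, a21, a12, a22):
    """Real 2x2 SVD of a power-of-two scaled matrix (illustrative stand-in).

    Returns (U, sigma, Vt, s) such that 2**s * A equals U * diag(sigma) * Vt
    up to rounding, with sigma[0] >= sigma[1] >= 0. Inputs must be finite.
    """
    m = max(abs(a11), abs(a21), abs(a12), abs(a22))
    # Assumed scaling rule: bring the largest magnitude into [0.5, 1), so the
    # Frobenius norm, and hence every scaled singular value, stays below 2.
    s = -math.frexp(m)[1] if m > 0.0 else 0
    # Multiplying by 2**s is exact unless a small element becomes subnormal.
    a, c, b, d = (math.ldexp(x, s) for x in (a11, a21, a12, a22))
    # Closed form: split A into a scaled rotation and a scaled reflection.
    e, f = 0.5 * (a + d), 0.5 * (a - d)
    g, h = 0.5 * (c + b), 0.5 * (c - b)
    q, r = math.hypot(e, h), math.hypot(f, g)
    a1, a2 = math.atan2(g, f), math.atan2(h, e)
    phi, theta = 0.5 * (a2 + a1), 0.5 * (a2 - a1)
    s1, s2 = q + r, q - r
    # A negative q - r means det(A) < 0; move the sign into the second row of Vt.
    sign = -1.0 if s2 < 0.0 else 1.0
    cu, su = math.cos(phi), math.sin(phi)
    cv, sv = math.cos(theta), math.sin(theta)
    u = ((cu, -su), (su, cu))
    vt = ((cv, -sv), (sign * sv, sign * cv))
    return u, (s1, sign * s2), vt, s

def batch_svd2(a11, a21, a12, a22):
    """Process a batch stored as four same-length lists, one per matrix element.

    The outputs use the same layout: one list per element of U, sigma, Vt,
    plus the per-matrix scaling exponents under the key "scale".
    """
    names = ("u11", "u21", "u12", "u22", "s1", "s2",
             "vt11", "vt21", "vt12", "vt22", "scale")
    out = {name: [] for name in names}
    for lane in zip(a11, a21, a12, a22):  # each position plays the role of a lane
        u, sig, vt, s = svd2_scaled(*lane)
        vals = (u[0][0], u[1][0], u[0][1], u[1][1], sig[0], sig[1],
                vt[0][0], vt[1][0], vt[0][1], vt[1][1], s)
        for name, v in zip(names, vals):
            out[name].append(v)
    return out
```

The sketch returns the scaled singular values together with the exponent `s` instead of unscaling them, since multiplying back by 2**(-s) could overflow for inputs near the top of the floating-point range. A caller that knows the values are moderate can recover a singular value with `math.ldexp(out["s1"][i], -out["scale"][i])`.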

    Supplementary material is available in the https://github.com/venovako/VecKog repository.

    This work is dedicated to the memory of Saša Singer.

    AMSC: 65F15, 65Y05, 65Y10