Clustering and presorting for parallel Burrows-Wheeler-based compression
Abstract
We describe practical improvements for parallel BWT-based lossless compressors, which are frequently used in modern big data applications. We propose a clustering-based data permutation approach that improves the compression ratio for data with significant alphabet variation, along with a faster string sorting approach that applies counting sort, which has O(n) complexity, together with permutation reindexing.
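As an illustration of the sorting idea only, the following minimal sketch shows one way counting sort can be combined with a reindexed permutation. The abstract does not specify the actual algorithm, so everything here is an assumption: the function names `counting_sort_permutation` and `sort_rotations_by_prefix` are hypothetical, the input is treated as a byte string, and rotations are ordered by a fixed-length prefix using repeated stable counting-sort passes.

```python
def counting_sort_permutation(keys, alphabet_size=256):
    """Return a stable permutation p with keys[p[0]] <= keys[p[1]] <= ...

    Runs in O(n + alphabet_size) time; no comparisons between keys are made.
    """
    counts = [0] * (alphabet_size + 1)
    for k in keys:
        counts[k + 1] += 1
    # After the prefix sum, counts[k] is the first output slot for key k.
    for s in range(alphabet_size):
        counts[s + 1] += counts[s]
    perm = [0] * len(keys)
    for i, k in enumerate(keys):
        perm[counts[k]] = i
        counts[k] += 1
    return perm


def sort_rotations_by_prefix(data, depth):
    """Order rotation start indices of `data` by their first `depth` bytes.

    Hypothetical helper: each pass sorts one byte position (last to first)
    and reindexes the current order through the resulting permutation,
    so the strings themselves are never moved.
    """
    n = len(data)
    order = list(range(n))
    for offset in range(depth - 1, -1, -1):
        keys = [data[(start + offset) % n] for start in order]
        perm = counting_sort_permutation(keys)
        order = [order[j] for j in perm]  # permutation reindexing
    return order
```

With `depth` equal to the input length the rotations are fully sorted, and the last column of the sorted rotations gives the BWT output. For `b"banana"` with `depth=6`, the sketch returns the order `[5, 3, 1, 0, 4, 2]`, whose last column spells `nnbaaa`. A production compressor would bound `depth` and handle ties differently; this sketch only shows the mechanics.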