The possibility of applying compressed matching to JPEG-encoded images is investigated and the problems raised by the scheme are discussed. Some of these problems can be solved by the use of auxiliary data, which yields various time/space trade-offs. Finally, approaches for dealing with extensions such as allowing scaling or rotations are suggested.
Denote by sq(w) the number of distinct squares in a string w and let S be the class of standard Sturmian words. They are generalizations of Fibonacci words and are important in combinatorics on words. For Fibonacci words the asymptotic behaviour of the number of runs and the number of squares is the same. We show that for Sturmian words the situation is quite different. The tight bound (4/5)|w| for the number of runs was given in [3]. In this paper we show that the tight bound for the maximal number of squares is (9/10)|w|. We use the results of [11], where exact (but not closed) complicated formulas were given for sq(w) for w ∈ S. We show that for all w ∈ S we have sq(w) ≤ (9/10)|w|, and there is an infinite sequence of words w_k ∈ S such that lim_{k→∞} |w_k| = ∞ and lim_{k→∞} sq(w_k)/|w_k| = 9/10.
Surprisingly, the maximal number of squares is attained by words whose recurrences have length only 5. This contrasts with the situation for Fibonacci words, even though standard Sturmian words are a natural extension of Fibonacci words. If this length drops to 4, the asymptotic behaviour of the maximal number of squares falls significantly below 9/10. The structure of Sturmian words rich in squares was discovered by us experimentally and verified theoretically. The upper bound is much harder; its proof is not a matter of simple calculations. The summation formulas for the number of squares are complicated and no closed formula is known, so some nontrivial reductions were necessary.
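The quantities above can be explored experimentally. Below is a minimal brute-force sketch in Python (ours, not the paper's method): it builds a standard Sturmian word from a directive sequence, under one common convention for the recurrence, and counts its distinct squares by direct substring comparison; the function names are illustrative.

```python
# Brute-force illustration of sq(w) on standard Sturmian words.
# Convention assumed here: w_{-1} = 'b', w_0 = 'a', w_{n+1} = w_n^{gamma_n} * w_{n-1};
# all gammas equal to 1 gives the Fibonacci-like words.

def standard_word(gammas, alphabet=("a", "b")):
    """Build a standard word from a directive sequence of positive integers."""
    prev, cur = alphabet[1], alphabet[0]
    for g in gammas:
        prev, cur = cur, cur * g + prev
    return cur

def sq(w):
    """Number of distinct squares xx occurring in w (O(n^3) brute force, small words only)."""
    squares = set()
    n = len(w)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                squares.add(w[i:i + 2 * half])
    return len(squares)

if __name__ == "__main__":
    for gammas in [(1, 1, 1, 1, 1, 1, 1, 1), (1, 2, 2, 3, 4)]:
        w = standard_word(gammas)
        print(gammas, len(w), sq(w), round(sq(w) / len(w), 3))
```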
Quite often a camera shooting a video sequence swings or rocks (for example, when the camera is hand-held). For compression of digitized video it can be important to compensate for the motion of the camera by realigning the frames, shifting them with respect to one another to absorb the movement of the camera. We show that the general version of optimal frame alignment, the Vector Alignment problem, is NP-complete. Moreover, we prove that the Run Length Coding Alignment problem, a restricted subcase of the Vector Alignment problem with direct application in video compression, can be solved in linear time by a dynamic programming algorithm.
To address the problem that a limited star-map field of view cannot capture the complete star map accurately, this paper studies the intrinsic and imaging features of stars and proposes a star-map stitching algorithm based on the principle of visual perception. Matching models in the time and space dimensions are constructed by simulating visual perception; the star and planet points are then preserved by dynamically searching for matching star groups, and the star map is stitched and reconstructed efficiently by building a sparse storage model. Experimental results show that the algorithm achieves data compression quickly, with a compression ratio of 99.54%, reduces the complexity of manual processing, and stitches star maps accurately.
This work proposes a lossless data compression algorithm for short data blocks. The proposed compression scheme combines a modified move-to-front algorithm with Huffman coding. This algorithm is applicable in storage systems where data compression is performed at the block level with short block sizes, in particular in non-volatile memories. For block sizes on the order of 1 kB, it provides a compression gain comparable to the Lempel–Ziv–Welch algorithm. Moreover, encoder and decoder architectures are proposed that have low memory requirements and provide fast data encoding and decoding.
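As a rough illustration of the front end described above, here is a minimal sketch of a plain move-to-front transform on a short byte block (the paper's modified MTF variant and its Huffman back end are not reproduced): small output indices dominate on locally repetitive data, which is what a subsequent Huffman coder exploits.

```python
# Plain move-to-front (MTF) transform over byte values 0..255.

def mtf_encode(block: bytes) -> list[int]:
    table = list(range(256))          # recency list over all byte values
    out = []
    for b in block:
        i = table.index(b)            # position of the symbol in the recency list
        out.append(i)
        table.pop(i)                  # move the symbol to the front
        table.insert(0, b)
    return out

def mtf_decode(indices: list[int]) -> bytes:
    table = list(range(256))
    out = bytearray()
    for i in indices:
        b = table.pop(i)
        out.append(b)
        table.insert(0, b)
    return bytes(out)

data = b"aaabbbaaacccaaa"
enc = mtf_encode(data)
assert mtf_decode(enc) == data
print(enc)   # small indices dominate, skewing the distribution in favour of Huffman coding
```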
In this paper, source coding or data compression is viewed as a measurement problem. Given a measurement device with fewer states than the observable of a stochastic source, how can one capture its essential information? We propose modeling stochastic sources as piecewise-linear discrete chaotic dynamical systems known as Generalized Luröth Series (GLS), which have their roots in Georg Cantor's work of 1869. These GLS are special maps with the property that their Lyapunov exponent is equal to the Shannon entropy of the source (up to a constant of proportionality). By successively approximating the source with GLS having fewer states (with the nearest Lyapunov exponent), we derive a binary coding algorithm which turns out to be a rediscovery of Huffman coding, the popular lossless compression algorithm used in the JPEG international standard for still image compression.
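Since the derived algorithm coincides with Huffman coding, a compact reference sketch of the classical Huffman procedure (not of the GLS approximation itself) may help; the heap-based construction below is standard.

```python
import heapq
from collections import Counter

# Classical Huffman coding: repeatedly merge the two lowest-weight subtrees.

def huffman_code(text: str) -> dict[str, str]:
    freq = Counter(text)
    # each heap entry: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol source
        return {s: "0" for s in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_code("abracadabra")
encoded = "".join(codes[s] for s in "abracadabra")
print(codes, len(encoded), "bits")
```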
Prediction is an important component in a variety of domains in Artificial Intelligence and Machine Learning, enabling intelligent systems to make more informed and reliable decisions. Certain domains require that prediction be performed on sequences of events that can typically be modeled as stochastic processes. This work presents Active LeZi (ALZ), a sequential prediction algorithm founded on an information-theoretic approach and based on the LZ78 family of data compression algorithms. The efficacy of this algorithm in a typical smart environment, the Smart Home, is demonstrated by employing it to predict device usage in the home. Its performance is tested on synthetic data sets that are representative of typical interactions between a Smart Home and its inhabitant. In addition, for the Smart Home environment, we introduce a method of learning a measure of the relative time between actions using ALZ, and demonstrate the efficacy of this approach on synthetic Smart Home data.
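For intuition, the sketch below implements a simplified LZ78-flavoured next-event predictor in the spirit of ALZ; the full ALZ sliding-window bookkeeping is omitted, and the single-letter device events are hypothetical.

```python
from collections import defaultdict

# Simplified LZ78-style predictor: accumulate follower counts for every context
# (suffix of the current phrase) and predict the most frequent follower of the
# longest matching context.

def build_model(sequence):
    counts = defaultdict(lambda: defaultdict(int))  # context -> next symbol -> count
    phrase, seen = "", set()
    for sym in sequence:
        for k in range(len(phrase) + 1):            # every suffix, incl. empty context
            counts[phrase[k:]][sym] += 1
        phrase += sym
        if phrase in seen:
            continue                                # keep extending until a new phrase appears
        seen.add(phrase)
        phrase = ""                                 # LZ78-style: start a new phrase
    return counts

def predict(counts, history):
    for k in range(len(history)):                   # longest matching context first
        ctx = history[k:]
        if ctx in counts:
            return max(counts[ctx], key=counts[ctx].get)
    return max(counts[""], key=counts[""].get)      # fall back to unconditional frequencies

seq = "ktcktcktc"   # hypothetical events: k = kitchen lamp, t = tv, c = coffee maker
model = build_model(seq)
print(predict(model, seq[-2:]))   # most likely next device event given recent history
```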
Agriculture faces several uncertain problems in making the best use of its natural resources. As a result, and in light of the growing threat of changing weather conditions, local soil conditions and meteorological data must be tracked closely to support timely, crop-friendly decisions. In the Internet of Things (IoT) era, deploying Wireless Sensor Networks (WSNs) as a low-cost remote monitoring and management system for these kinds of features is a viable choice. However, WSNs are hampered by the motes' limited energy sources, which reduce the network's overall lifespan. Each mote collects the tracked feature regularly and sends the data to the sink for further analysis, and transmitting such large amounts of data costs the sensor node considerable energy and network bandwidth. In this paper we propose a lightweight lossless compression algorithm based on Differential Encoding (DE) and Huffman coding, which is especially useful for IoT sensor nodes that track environmental features, particularly those with limited computing and memory resources. Rather than attempting to create novel ad hoc algorithms, we show that, given a general understanding of the features to be monitored, classical Huffman coding can effectively represent the same features measured at different times and locations. Even though the proposed system does not reach the theoretical limit, results using temperature measurements show that it outperforms standard methods built specifically for WSNs.
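A minimal sketch of the differential-encoding front end is shown below; the prefix table and the sample readings are illustrative only, not the Huffman table or data of the paper.

```python
# Differential encoding of integer sensor readings: consecutive samples are replaced
# by their differences, which cluster around zero and suit a small fixed prefix table.

def delta_encode(samples):
    prev, out = 0, []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def delta_decode(deltas):
    total, out = 0, []
    for d in deltas:
        total += d
        out.append(total)
    return out

# hypothetical prefix-free table for typical residuals; rare values would use an escape code
CODE = {0: "0", 1: "10", -1: "110", 2: "1110", -2: "11110"}

temps_x10 = [215, 215, 216, 216, 215, 217, 217]            # temperature * 10, integer readings
deltas = delta_encode(temps_x10)
assert delta_decode(deltas) == temps_x10
bits = "".join(CODE.get(d, "11111") for d in deltas[1:])   # first sample sent uncompressed
print(deltas, bits)
```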
A schema database functions as a repository for interconnected data points, facilitating comprehension of data structures by organizing information into tables with rows and columns. These databases use established connections to arrange data, with attribute values linking related tuples. This integrated approach to data management and distributed processing enables schema databases to maintain models even when the working set size surpasses available RAM. However, challenges such as data quality, storage, the scarcity of data science professionals, data validation, and sourcing from diverse origins persist. Notably, while schema databases excel at reviewing transactions, they often fall short in updating them effectively. To address these issues, a Chimp-based radial basis neural model (CbRBNM) is employed. Initially, a schemaless database was considered and integrated into the Python system. Subsequently, compression functions were applied to both schema and schema-less databases to optimize relational data size by eliminating redundant files. Performance validation involved calculating compression parameters, with the proposed method achieving a memory usage of 383.37 MB, a computation time of 0.455 s, a training time of 167.5 ms, and a compression rate of 5.60%. Extensive testing demonstrates that CbRBNM yields a favorable compression ratio and enables direct searching on compressed data, thereby enhancing query performance.
Nowadays, the digital recording of electrocardiographic (ECG) signals is common practice for the diagnosis and monitoring of patients suffering from cardiac disorders. In this paper we study the usefulness of Artificial Neural Networks (ANNs) for clinical diagnosis through the detection of arrhythmias and for reducing the large storage space occupied by ECG records.
Huffman coding has been widely used in data, image, and video compression. In this paper a novel maximal prefix coding is introduced. The relationship between Huffman coding and optimal maximal prefix coding is discussed. We show that all Huffman coding schemes are optimal maximal prefix coding schemes and that, conversely, optimal maximal prefix coding schemes need not be Huffman coding schemes. Moreover, it is proven that, for any maximal prefix code C, there exists an information source I = (Σ, P) such that C is exactly a Huffman code for I. Therefore, the class of Huffman codes essentially coincides with that of maximal prefix codes. A case study of data compression is also given. Compared with Huffman coding, maximal prefix coding can be used not only for statistical modeling but also for dictionary methods; it is also well suited to large information retrieval systems and to improving their security.
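As a small aside that may help fix the notion, a finite binary prefix code is maximal (no further codeword can be added without breaking the prefix property) exactly when its Kraft sum equals 1; the checker below is our illustration, not a construction from the paper.

```python
from itertools import combinations

# A prefix code is a set of codewords in which no codeword is a prefix of another;
# for a finite binary prefix code, maximality is equivalent to Kraft sum == 1.

def is_prefix_code(code):
    return not any(a.startswith(b) or b.startswith(a) for a, b in combinations(code, 2))

def is_maximal_prefix_code(code):
    return is_prefix_code(code) and sum(2.0 ** -len(c) for c in code) == 1.0

print(is_maximal_prefix_code({"0", "10", "110", "111"}))   # True: maximal prefix code
print(is_maximal_prefix_code({"0", "10", "110"}))          # False: "111" could still be added
```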
In the present digital era, with the widespread use of medical technologies and the massive generation of medical data by different imaging modalities, adequate storage, management, and transmission of biomedical images necessitate image compression techniques. Vector quantization (VQ) is an effective image compression approach, and a widely employed VQ technique is Linde–Buzo–Gray (LBG), which generates locally optimal codebooks for image compression. Codebook construction is treated as an optimization problem and solved with metaheuristic optimization techniques. In this view, this paper designs an effective biomedical image compression technique in the cloud computing (CC) environment using Harris Hawks Optimization (HHO)-based LBG techniques. The HHO-LBG algorithm achieves a smooth transition between exploration and exploitation. To investigate the performance of the HHO-LBG technique, an extensive set of simulations was carried out on benchmark biomedical images. The proposed HHO-LBG technique accomplished promising results in terms of compression performance and reconstructed image quality.
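For context, the sketch below shows the classical LBG codebook update that the HHO metaheuristic is used to improve upon; the Harris Hawks Optimization step itself is not reproduced, and the toy image blocks are random.

```python
import numpy as np

# Classical LBG iteration: assign training vectors to the nearest codeword,
# then move each codeword to the centroid of its assigned vectors.

def lbg(vectors, codebook_size, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # initialize the codebook with randomly chosen training vectors
    codebook = vectors[rng.choice(len(vectors), codebook_size, replace=False)].astype(float)
    for _ in range(iters):
        # distance from every training vector (e.g. a 4x4 image block) to every codeword
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(codebook_size):
            members = vectors[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook, labels

blocks = np.random.default_rng(1).integers(0, 256, size=(500, 16))   # toy 4x4 blocks
codebook, labels = lbg(blocks, codebook_size=8)
print(codebook.shape, labels[:10])   # the index stream `labels` is what would be stored
```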
Three-dimensional (3D) human mesh sequences obtained by 3D scanning equipment are often used in film and television, games, the internet, and other industries. However, because of the dense point clouds produced by 3D scanning equipment, the data for a single frame of a 3D human model are always large. Moreover, model topology can differ between frames, and the human body may even interact with other objects, so the content of the 3D models also varies in complex ways across frames. Traditional 3D model compression methods therefore cannot handle the compression of 3D human mesh sequences. To address this problem, we propose a compression method for 3D human mesh sequences based on the Laplace operator, and test it on the complex interactive behavior of a soccer player bouncing a ball. The method first detects the degree of mesh separation between the interacting object and the human body, and then divides the sequence into a series of fragments based on the consistency of the separation degrees. In each fragment, we employ a deformation algorithm to map the keyframe topology to the other frames, which improves the compression ratio of the sequence. Our work can be used for the storage of mesh sequences and in mobile applications by providing an approach to data compression.
Heart rate exhibits spontaneous fluctuations that are mainly modulated by control loops within the autonomic nervous system. Assessing the dynamics of heart rate fluctuations can provide valuable information about regulatory processes and pathophysiological behavior. In this paper, heart rate fluctuations and their entropy are assessed using an algorithmic information-theoretic concept based on a data compressor. First, the beat-to-beat fluctuations of heart rate are binary coded as decreases and increases, respectively. Subsequently, those symbol sequences are compressed using the LZ77 algorithm. The ratio of the length of the compressed sequence to the original length is used as an estimate of entropy. We investigated the compressibility of heart rate fluctuations in athletes before, during, and after a training camp. Heart rate time series were obtained from ECGs recorded over 30 minutes under supine resting conditions. We found a significant entropy reduction during the training camp, reflecting the effects of physical fatigue. In conclusion, compression entropy appears to be a suitable approach for assessing the complexity of heart rate fluctuations.
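A minimal sketch of this compression-entropy estimate is given below; zlib's LZ77-based deflate stands in for the LZ77 compressor used in the paper, and the binary coding convention and toy series are our choices.

```python
import zlib

# Binary-code beat-to-beat changes and use the compression ratio of that symbol
# sequence as an entropy estimate (lower ratio = more regular fluctuations).

def compression_entropy(rr_intervals_ms):
    symbols = "".join("1" if b < a else "0"
                      for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:]))
    raw = symbols.encode("ascii")
    compressed = zlib.compress(raw, level=9)
    return len(compressed) / len(raw)

regular = [800, 810] * 500                              # strictly alternating, highly compressible
irregular = [800 + (i * 37) % 23 for i in range(1000)]  # more irregular toy series
print(compression_entropy(regular), compression_entropy(irregular))
```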
Manufacturing systems are becoming larger and more complicated, and global manufacturing chains have become common in the new millennium. The Internet and intranets integrate advanced manufacturing systems. To perform remote monitoring and diagnosis in such chains and systems, real-time data compression has become a core factor in the efficient and effective exchange of information via computer networks. This paper presents a new technique for compressing data using a kernel-based method. Overcoming the drawback of support vector techniques, namely fast decompression but slow compression, the new method exhibits high speed in both phases. In addition, the new method can also be applied to pattern classification. Tests based on strain signals from sheet metal stamping operations show that the new method is very effective. The proposed technology has considerable potential for monitoring and controlling advanced manufacturing systems via the Internet or an intranet.
By combining T-spline and wavelet techniques, this paper proposes a new local fairing algorithm for B-spline surfaces. The new algorithm involves two steps. The first is to decompose a local region into a scale part and a detail part by wavelet analysis. The second is to reconstruct the surface by removing the detail part and then replacing the local region with the scale part. Because the scale part constitutes a T-spline, the new algorithm not only realizes local fairing but also simultaneously reduces the number of control points. The effectiveness of the new approach is verified on three examples, which also demonstrate that the major advantage of the new algorithm over traditional ones lies in its local fairing capability.
In this paper, the orthogonality of the coefficient matrices of wavelet filters is used to derive an energy equation relating a time-domain signal to its wavelet coefficients. Using the energy equation, the relationship between the wavelet coefficient error and the reconstruction error is obtained. The errors considered in this paper include truncation error and quantization error. This not only helps to control the reconstruction quality but also brings two advantages: (1) it is not necessary to perform an inverse transform to obtain the distortion caused by wavelet-based compression, which reduces the computational effort; (2) using the energy equation, we can search for a threshold value that attains a better compression ratio within a pre-specified percent root-mean-square difference (PRD). A compression algorithm with run-length encoding is proposed based on the energy equation. Finally, Matlab and the MIT-BIH database are used in simulations to verify the feasibility of the proposed method. The algorithm is also implemented on a DSP chip to examine its practicality and suitability. The required computation time for an ECG segment is less than 0.0786 s, which is fast enough to process real-time signals. As a result, the proposed algorithm is suitable for implementation on mobile ECG recording devices.
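The core idea, that the distortion can be measured entirely in the coefficient domain, can be illustrated with a one-level orthonormal Haar transform standing in for the paper's filter bank; the threshold value and test signal below are arbitrary.

```python
import numpy as np

# For an orthonormal transform, the energy of the coefficient error equals the energy
# of the reconstruction error (Parseval), so the PRD caused by thresholding can be
# estimated without running the inverse transform.

def haar_forward(x):
    x = np.asarray(x, dtype=float).reshape(-1, 2)
    approx = (x[:, 0] + x[:, 1]) / np.sqrt(2)
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2)
    return np.concatenate([approx, detail])

def haar_inverse(c):
    half = len(c) // 2
    approx, detail = c[:half], c[half:]
    x = np.empty(2 * half)
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

rng = np.random.default_rng(0)
ecg_like = np.sin(np.linspace(0, 20 * np.pi, 1024)) + 0.05 * rng.standard_normal(1024)

coeffs = haar_forward(ecg_like)
thresholded = np.where(np.abs(coeffs) > 0.1, coeffs, 0.0)       # simple hard threshold

# PRD estimated purely in the coefficient domain ...
prd_estimate = np.sqrt(np.sum((coeffs - thresholded) ** 2) / np.sum(ecg_like ** 2)) * 100
# ... matches the PRD measured after explicit reconstruction
recon = haar_inverse(thresholded)
prd_actual = np.sqrt(np.sum((ecg_like - recon) ** 2) / np.sum(ecg_like ** 2)) * 100
print(round(prd_estimate, 6), round(prd_actual, 6))
```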
The electrocardiogram (ECG) is one of the best representative physiological signals for assessing the state of the autonomic nervous system, which is primarily responsible for cardiac activity. ECG data compression plays a significant role in localized digital storage and efficient communication channel utilization in telemedicine applications. The efficiency of lossless and lossy compression systems depends on the methodologies used for compression and on the quality measures used to evaluate distortion. Depending on the domain, ECG data compression can be performed in either one dimension (1D) or two dimensions (2D), exploiting inter-beat correlation or both inter- and intra-beat correlation, respectively. In this paper, a comparative study of 1D and 2D ECG data compression methods from the existing literature is carried out to provide an update in this regard. ECG data compression techniques and algorithms in the 1D and 2D domains have their own merits and limitations. Recently, numerous techniques for 1D ECG data compression have been developed, in both the direct and the transform domain, and 2D ECG data compression based on period normalization and complexity sorting has also been reported. Finally, several practical issues are highlighted concerning the assessment of reconstructed signal quality, together with performance comparisons across existing 1D and 2D ECG compression methods and the digital signal processing systems on which they are implemented.
Owing to advances in technology, reconstructing three-dimensional representations of physical environments in real time using cheap, non-professional cameras and PCs has become possible. These advances will make 3-D tele-immersive environments available to everyone in the near future. However, the data captured by a Tele-Immersion (TI) system can be very large, and compression is needed to ensure real-time transmission of the data. Because of this real-time requirement, compressing TI data is a challenging task and no current techniques can effectively address this issue. In this paper, we propose a model-driven compression method. The main idea of this approach is to take advantage of prior knowledge of objects in the physical environment, e.g., human figures, and to represent their motions using just a few parameters. The proposed compression method provides tunable and high compression ratios (from 50:1 to 5000:1) with reasonable reconstruction quality. Moreover, the proposed method can estimate motions from the noisy data captured by our TI system in real time. The best way to visualize our results is via the videos at http://www.cs.gmu.edu/~jmlien/research/TI/.
Non-destructive testing is developing in the direction of automation and intelligence, but quantitative analysis of magnetic flux leakage signals remains difficult. The Mallat fast algorithm for the wavelet transform is briefly described. As an advanced digital signal processing method, the wavelet transform can be used to compress data and to separate signal from noise in magnetic flux leakage inspection. Introducing the wavelet transform into magnetic flux leakage inspection therefore makes quantitative analysis of the test signals highly feasible.