Data compression
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy, so no information is lost; lossy compression reduces bits by removing unnecessary or less important information.
The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding is done at the source of the data before it is stored or transmitted.
Data compression algorithms present a space–time complexity trade-off between the bytes needed to store or transmit information and the computational resources needed to perform the encoding and decoding. The design of compression schemes therefore involves balancing the degree of compression, the amount of distortion introduced (when using lossy compression), and the resources required to compress and decompress the data.
Lossless
The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage.
The strongest modern lossless compressors use probabilistic models, such as prediction by partial matching.
Archive software typically has the ability to adjust the "dictionary size", where a larger size demands more random-access memory during compression and decompression but achieves stronger compression, especially on repeating patterns in the files' content.[12][13]
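The effect of the dictionary size can be illustrated with Python's standard-library lzma module, which exposes the LZMA2 dictionary size directly. This is only a sketch: the synthetic data and the sizes compared are illustrative, and the actual benefit depends on how far apart repeated content lies in the input.

```python
import lzma
import os

# Two copies of the same pseudo-random 2 MiB block, 2 MiB apart: only a
# dictionary larger than that distance can "see" the earlier copy and
# encode the second half as a back-reference.
block = os.urandom(2 * 1024 * 1024)
data = block + block

def xz_size(payload: bytes, dict_size: int) -> int:
    """Compress with LZMA2 using an explicit dictionary size; return output size."""
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    return len(lzma.compress(payload, format=lzma.FORMAT_XZ, filters=filters))

for size in (1 << 16, 1 << 22, 1 << 24):  # 64 KiB, 4 MiB, 16 MiB dictionaries
    print(f"dict_size={size:>10,} bytes -> compressed to {xz_size(data, size):,} bytes")
```

With the small dictionary the output stays close to the full 4 MiB (random data itself is incompressible), while the larger dictionaries roughly halve it by reusing the first copy.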
Lossy
In the late 1980s, digital images became more common, and standards for lossless image compression emerged. In the early 1990s, lossy compression methods began to be widely used. In these schemes, some loss of information is accepted, as dropping nonessential detail can save storage space.
Most forms of lossy compression are based on transform coding, especially the discrete cosine transform (DCT).
Lossy image compression is used in digital cameras, to increase storage capacities. Similarly, DVDs, Blu-ray and streaming video use lossy video coding formats.
In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the audio signal. Compression of human speech is often performed with even more specialized techniques; speech coding is treated as a discipline distinct from general-purpose audio compression.
Lossy compression can cause generation loss.
Theory
The theoretical basis for compression is provided by information theory and, more specifically, Shannon's source coding theorem; domain-specific theories include algorithmic information theory for lossless compression and rate–distortion theory for lossy compression. These areas of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Other topics associated with compression include coding theory and statistical inference.[19]
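In rough terms, the source coding theorem says that the entropy of a source sets the limit on how far its output can be losslessly compressed. Stated for a memoryless discrete source (a standard textbook formulation, not tied to any particular reference cited here):

```latex
% Shannon entropy of a discrete source X emitting symbol x_i with probability p(x_i):
H(X) = -\sum_{i} p(x_i)\,\log_2 p(x_i) \qquad \text{(bits per symbol)}

% Source coding theorem: every uniquely decodable binary code has expected
% codeword length L \ge H(X), and codes exist achieving L < H(X) + 1.
H(X) \;\le\; L \;<\; H(X) + 1
```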
Machine learning
There is a close connection between machine learning and compression. A system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using arithmetic coding on the output distribution), while an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history).
An alternative view is that compression algorithms implicitly map strings into implicit feature space vectors, and compression-based similarity measures compute similarity within these feature spaces.
According to AIXI theory, a connection more directly explained in the Hutter Prize, the best possible compression of x is the smallest possible software that generates x. For example, in that model, a zip file's compressed size includes both the zip file and the unzipping software, since the data cannot be recovered without both, but there may be an even smaller combined form.
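This idea is formalized by Kolmogorov complexity in algorithmic information theory; the following is the standard definition, not anything specific to the Hutter Prize rules:

```latex
% Kolmogorov complexity of a string x relative to a universal machine U:
K_U(x) = \min \{\, |p| : U(p) = x \,\}
% i.e., the length of the shortest program p that outputs x. The size of a
% self-extracting archive of x (compressed data plus decompressor) is an
% upper bound on K_U(x).
```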
Examples of AI-powered audio/video compression software include
In unsupervised machine learning, k-means clustering can be used to compress data by grouping similar data points into clusters. This technique simplifies handling extensive datasets that lack predefined labels and finds widespread use in fields such as image compression.
Data compression aims to reduce the size of data files, enhancing storage efficiency and speeding up data transmission. K-means clustering, an unsupervised machine learning algorithm, is employed to partition a dataset into a specified number of clusters, k, each represented by the centroid of its points. Replacing groups of data points with their centroids preserves the core information of the original data while significantly decreasing the required storage space.
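A minimal sketch of k-means used as a vector quantizer, written against NumPy only; the 16-entry codebook, the synthetic "pixel" data and the helper function are illustrative choices rather than a prescribed algorithm from the sources cited above.

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(10_000, 3)).astype(float)  # stand-in RGB data

def kmeans(points: np.ndarray, k: int, iters: int = 20) -> tuple[np.ndarray, np.ndarray]:
    """Plain k-means: returns (centroids, label index per point)."""
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

codebook, labels = kmeans(pixels, k=16)
# "Compressed" form: a 16-entry codebook plus one 4-bit index per pixel,
# instead of 24 bits per pixel; decoding is simply a table lookup.
reconstructed = codebook[labels]
```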
Data differencing
Data compression can be viewed as a special case of data differencing. Data differencing consists of producing a difference given a source and a target, with patching reproducing the target given a source and a difference. Since there is no separate source and target in data compression, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a difference from nothing.
The term differential compression is used to emphasize the data differencing connection.
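As a rough illustration of the idea (not the VCDIFF format referenced below), the following sketch uses Python's difflib to describe a target as copy-from-source and insert operations, which is compact whenever source and target are similar:

```python
import difflib

source = b"The quick brown fox jumps over the lazy dog."
target = b"The quick brown fox leaps over the lazy dog!"

# Describe `target` relative to `source` as a list of copy/insert operations.
patch = []
matcher = difflib.SequenceMatcher(a=source, b=target)
for op, i1, i2, j1, j2 in matcher.get_opcodes():
    if op == "equal":
        patch.append(("copy", i1, i2 - i1))              # reuse bytes from the source
    else:
        patch.append(("insert", target[j1:j2]))          # ship only the new bytes

def apply_patch(src: bytes, ops) -> bytes:
    """Reconstruct the target from the source plus the patch."""
    out = bytearray()
    for entry in ops:
        if entry[0] == "copy":
            _, start, length = entry
            out += src[start:start + length]
        else:
            out += entry[1]
    return bytes(out)

assert apply_patch(source, patch) == target
```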
Uses
Image
Entropy coding originated in the 1940s with the introduction of Shannon–Fano coding,[31] the basis for Huffman coding which was developed in 1950.[32] Transform coding dates back to the late 1960s, with the introduction of fast Fourier transform (FFT) coding in 1968 and the Hadamard transform in 1969.[33]
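A compact sketch of Huffman coding in Python: the greedy merge of the two least frequent subtrees is the standard construction, while the data structures here are chosen only for brevity.

```python
import heapq
from collections import Counter

def huffman_code(text: str) -> dict[str, str]:
    """Return a prefix-free code in which frequent symbols get shorter bit strings."""
    # Heap entries: (subtree frequency, tie-breaker, {symbol: code bits so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two least frequent subtrees...
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}          # ...are merged, prefixing
        merged.update({s: "1" + c for s, c in right.items()})   # their codes with 0 / 1
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

message = "this is an example of a huffman tree"
codes = huffman_code(message)
bits = "".join(codes[ch] for ch in message)
# Common symbols such as ' ' end up with short codes; rare ones are longer.
print(len(bits), "bits vs", 8 * len(message), "bits uncompressed")
```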
An important image compression technique is the discrete cosine transform (DCT), a technique developed in the early 1970s. DCT is the basis for JPEG, a lossy compression format introduced by the Joint Photographic Experts Group in 1992.
Audio
Audio data compression, not to be confused with dynamic range compression, has the potential to reduce the transmission bandwidth and storage requirements of audio data. Audio compression algorithms are implemented in software as audio codecs. In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, quantization, DCT and linear prediction to reduce the amount of information used to represent the uncompressed data.
Lossy audio compression algorithms provide higher compression and are used in numerous audio applications including Vorbis and MP3. These algorithms almost all rely on psychoacoustics to eliminate or reduce fidelity of less audible sounds, thereby reducing the space required to store or transmit them.[2][46]
The acceptable trade-off between loss of audio quality and transmission or storage size depends upon the application. For example, one 640 MB compact disc (CD) holds approximately one hour of uncompressed high fidelity music, less than 2 hours of music compressed losslessly, or 7 hours of music compressed in the MP3 format at a medium bit rate. A digital sound recorder can typically store around 200 hours of clearly intelligible speech in 640 MB.[47]
Lossless audio compression produces a representation of digital data that can be decoded to an exact digital duplicate of the original. Compression ratios are around 50–60% of the original size,[48] which is similar to those for generic lossless data compression. Lossless codecs use curve fitting or linear prediction as a basis for estimating the signal. Parameters describing the estimation and the difference between the estimation and the actual signal are coded separately.[49]
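The following NumPy sketch shows the prediction-plus-residual idea in its simplest form, a first-order predictor on a synthetic signal; real lossless codecs such as FLAC use higher-order adaptive predictors and Rice/Golomb entropy coding of the residual.

```python
import numpy as np

rng = np.random.default_rng(1)
# A smooth "audio-like" signal: a slow sine plus a little noise, as integer samples.
t = np.arange(4096)
signal = (2000 * np.sin(2 * np.pi * t / 256) + rng.normal(0, 3, t.size)).astype(np.int32)

# First-order linear prediction: predict each sample as the previous one.
prediction = np.concatenate(([0], signal[:-1]))
residual = signal - prediction

# The residual is far less "spread out" than the raw samples, so an entropy
# coder needs fewer bits to represent it.
print("raw sample std:", signal.std().round(1))
print("residual std:  ", residual.std().round(1))

# Lossless reconstruction: cumulatively summing the residuals recovers the
# original samples exactly.
assert np.array_equal(np.cumsum(residual), signal)
```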
A number of lossless audio compression formats exist. See the list of lossless codecs for a listing.
Some formats are associated with a distinct system, such as Direct Stream Transfer, used in Super Audio CD, and Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray and HD DVD.
When audio files are to be processed, either by further compression or for editing, it is desirable to work from an unchanged original (uncompressed or losslessly compressed). Processing a lossily compressed file for some purpose usually produces a final result inferior to creating the same compressed file from an uncompressed original.
Lossy audio compression
Lossy audio compression is used in a wide range of applications. In addition to standalone audio-only applications of file playback in MP3 players or computers, digitally compressed audio streams are used in most video DVDs, digital television, streaming media on the Internet, satellite and cable radio, and increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression by discarding less-critical data based on psychoacoustic optimizations.
Psychoacoustics recognizes that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces redundancy by first identifying perceptually irrelevant sounds, that is, sounds that are very hard to hear. Typical examples include high frequencies or sounds that occur at the same time as louder sounds. Those irrelevant sounds are coded with decreased accuracy or not at all.
Due to the nature of lossy algorithms, audio quality suffers a digital generation loss when a file is decompressed and recompressed. This makes lossy compression unsuitable for storing intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, lossy formats such as MP3 are very popular with end users, as the file size is reduced to a small fraction of the original while remaining adequate for casual listening.
Several proprietary lossy compression algorithms have been developed that provide higher quality audio performance by using a combination of lossless and lossy algorithms with adaptive bit rates and lower compression ratios. Examples include aptX, LDAC, LHDC, MQA and SCL6.
Coding methods
To determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the modified discrete cosine transform (MDCT) to convert time domain sampled waveforms into a transform domain, typically the frequency domain. Once transformed, component frequencies can be prioritized according to how audible they are.
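The MDCT maps an overlapping, windowed block of 2N time-domain samples to N frequency coefficients (standard definition):

```latex
% MDCT of a windowed block of 2N samples x_0, ..., x_{2N-1}
% into N frequency coefficients X_0, ..., X_{N-1}:
X_k = \sum_{n=0}^{2N-1} x_n \,
      \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2} + \tfrac{N}{2}\right)
                  \left(k + \tfrac{1}{2}\right)\right],
      \qquad k = 0, \dots, N-1
% Successive blocks overlap by 50% (a lapped transform), which avoids blocking
% artifacts while keeping the coefficient count equal to the sample count.
```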
Other types of lossy compressors, such as the linear predictive coding (LPC) used with speech, are source-based coders. LPC uses a model of the human vocal tract to analyze speech sounds and infer the parameters used by the model to produce them moment to moment. These changing parameters are transmitted or stored and used to drive another model in the decoder which reproduces the sound.
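At its core, the LPC model predicts each sample as a weighted sum of previous samples, and only the predictor coefficients and a compact description of the residual (excitation) need to be transmitted:

```latex
% Linear predictive coding with p coefficients a_1, ..., a_p:
\hat{x}[n] = \sum_{i=1}^{p} a_i \, x[n-i], \qquad e[n] = x[n] - \hat{x}[n]
% The decoder runs the same filter, driven by the transmitted excitation e[n],
% to resynthesize the speech signal.
```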
Lossy formats are often used for the distribution of streaming audio or interactive communication (such as in cell phone networks). In such applications, the data must be decompressed as the data flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications.[50]
Latency is introduced by the methods used to encode and decode the data. Some codecs will analyze a longer segment, called a frame, of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time to decode. The inherent latency of the coding algorithm can be critical; for example, when there is a two-way transmission of data, such as with a telephone conversation, significant delays may seriously degrade the perceived quality.
In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, here latency refers to the number of samples that must be analyzed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23 ms.
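As a rough back-of-the-envelope relation (the frame length and sample rate below are illustrative, not the exact MP3 parameters):

```latex
% Algorithmic latency is roughly the analysis frame length divided by the sample rate:
\text{latency} \approx \frac{N_{\text{frame}}}{f_s}
  = \frac{1024\ \text{samples}}{44\,100\ \text{Hz}} \approx 23\ \text{ms}
```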
Speech encoding
Because the human voice occupies a relatively narrow frequency range and produces comparatively simple waveforms, speech can be encoded at high quality using a relatively low bit rate. This is accomplished, in general, by some combination of two approaches:
- Only encoding sounds that could be made by a single human voice.
- Throwing away more of the data in the signal—keeping just enough to reconstruct an "intelligible" voice rather than the full frequency range of human hearing.
The earliest algorithms used in speech encoding (and audio data compression in general) were the A-law algorithm and the μ-law algorithm.
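Both are logarithmic companding schemes; μ-law, for example, compresses the normalized amplitude axis so that quiet signals retain more quantization resolution (μ = 255 in the North American and Japanese standard):

```latex
% mu-law companding of a normalized sample x in [-1, 1]:
F(x) = \operatorname{sgn}(x)\,
       \frac{\ln\!\left(1 + \mu\,|x|\right)}{\ln\!\left(1 + \mu\right)},
\qquad -1 \le x \le 1,\ \ \mu = 255
```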
History
Early audio research was conducted at Bell Labs. There, in 1950, C. Chapin Cutler filed the patent on differential pulse-code modulation (DPCM).
The world's first commercial broadcast automation audio compression system was developed by Oscar Bonello, an engineering professor at the University of Buenos Aires.[63]
In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967, he started developing a practical application based on the recently developed IBM PC computer, and the broadcast automation system was launched in 1987 under the name Audicom.
A literature compendium for a large variety of audio coding systems was published in the IEEE's Journal on Selected Areas in Communications (JSAC), in February 1988. While there were some papers from before that time, this collection documented an entire variety of finished, working audio coders, nearly all of them using perceptual techniques and some kind of frequency analysis and back-end noiseless coding.[67]
Video
The two key video compression techniques used in video coding standards are the DCT and motion compensation (MC). Most video coding standards, such as the H.26x and MPEG formats, typically use motion-compensated DCT video coding (block motion compensation).
Most video codecs are used alongside audio compression techniques to store the separate but complementary data streams as one combined package using so-called container formats.[71]
Encoding theory
Video data may be represented as a series of still image frames. Such data usually contains abundant amounts of spatial and temporal redundancy. Video compression algorithms attempt to reduce redundancy and store information more compactly.
Most video compression formats and codecs exploit both spatial and temporal redundancy (e.g., through difference coding with motion compensation). Similarities can be encoded by storing only the differences between, for example, temporally adjacent frames (inter-frame coding) or spatially adjacent pixels (intra-frame coding).
The intra-frame video coding formats used in camcorders and video editing employ simpler compression that uses only intra-frame prediction. This simplifies video editing software, as it prevents a situation in which a compressed frame refers to data that the editor has deleted.
Usually, video compression additionally employs lossy compression techniques like quantization that reduce aspects of the source data that are (more or less) irrelevant to the human visual perception by exploiting perceptual features of human vision. For example, small differences in color are more difficult to perceive than are changes in brightness. Compression algorithms can average a color across these similar areas in a manner similar to those used in JPEG image compression.[10] As in all lossy compression, there is a trade-off between video quality and bit rate, cost of processing the compression and decompression, and system requirements. Highly compressed video may present visible or distracting artifacts.
Methods other than the prevalent DCT-based transform formats, such as fractal compression, matching pursuit and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products. Wavelet compression is used in still-image coders and in video coders without motion compensation.
Inter-frame coding
In inter-frame coding, individual frames of a video sequence are compared from one frame to the next, and the video compression codec records the differences to the reference frame. If the frame contains areas where nothing has moved, the system can simply issue a short command that copies that part of the previous frame into the next one. If sections of the frame move in a simple manner, the compressor can emit a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy. This longer command still remains much shorter than data generated by intra-frame compression. Usually, the encoder will also transmit a residue signal which describes the remaining more subtle differences to the reference imagery. Using entropy coding, these residue signals have a more compact representation than the full signal. In areas of video with more motion, the compression must encode more data to keep up with the larger number of pixels that are changing. Commonly during explosions, flames, flocks of animals, and in some panning shots, the high-frequency detail leads to quality decreases or to increases in the variable bitrate.
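A highly simplified NumPy sketch of this idea (sometimes called conditional replenishment): each block is either flagged as "copy from the reference frame" or transmitted as a residual. Real codecs add motion search, transforms and entropy coding on top of this; the frame and block sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, B = 240, 320, 16  # frame dimensions and block size (illustrative values)

previous = rng.integers(0, 256, (H, W)).astype(np.int16)
current = previous.copy()
# Only one region of the frame changes between the two frames.
current[64:112, 96:160] += rng.integers(-20, 20, (48, 64)).astype(np.int16)

# Encoder: for each block, either note "unchanged" or keep the residual.
encoded_blocks = []
for y in range(0, H, B):
    for x in range(0, W, B):
        residual = current[y:y+B, x:x+B] - previous[y:y+B, x:x+B]
        if np.any(residual):
            encoded_blocks.append((y, x, residual))   # changed: send the difference

unchanged = (H // B) * (W // B) - len(encoded_blocks)
print(f"{unchanged} blocks copied from the reference frame, "
      f"{len(encoded_blocks)} residual blocks transmitted")

# Decoder: start from the reference frame and apply the residuals.
decoded = previous.copy()
for y, x, residual in encoded_blocks:
    decoded[y:y+B, x:x+B] += residual
assert np.array_equal(decoded, current)
```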
Hybrid block-based transform formats
Many commonly used video compression methods (e.g., those in standards approved by the ITU-T or ISO) share the same basic architecture, which dates back to H.261, standardized in 1988 by the ITU-T. They mostly rely on the DCT, applied to rectangular blocks of neighboring pixels, and on temporal prediction using motion vectors, as well as, nowadays, an in-loop filtering step.
In the prediction stage, various deduplication and difference-coding techniques are applied that help decorrelate data and describe new data based on already transmitted data.
Then rectangular blocks of remaining pixel data are transformed to the frequency domain. In the main lossy processing stage, frequency domain data gets quantized in order to reduce information that is irrelevant to human visual perception.
In the last stage, statistical redundancy is largely eliminated by an entropy coder, which often applies some form of arithmetic coding.
In an additional in-loop filtering stage, various filters can be applied to the reconstructed image signal. Because these filters are computed inside the encoding loop, they can help compression: they can be applied to reference material before it is used in the prediction process, and they can be guided using the original signal. The most popular example is the deblocking filter, which blurs out blocking artifacts caused by quantization discontinuities at transform block boundaries.
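A sketch of the transform and quantization stages described above, applied to a single 8×8 residual block using SciPy's DCT routines; the uniform step size stands in for the quantization matrices and quality settings that real codecs use.

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(3)
# An 8x8 block of residual pixel data (what remains after prediction).
block = rng.normal(0, 10, (8, 8)).round()

# The forward 2-D DCT concentrates the block's energy into a few
# low-frequency coefficients.
coeffs = dctn(block, norm="ortho")

# Quantization: divide by a step size and round. This is the lossy step;
# most high-frequency coefficients become zero and are cheap to entropy-code.
step = 8.0
quantized = np.round(coeffs / step)
print(f"{np.count_nonzero(quantized == 0)} of 64 coefficients quantized to zero")

# Decoder: dequantize and inverse-transform; the result only approximates the input.
reconstructed = idctn(quantized * step, norm="ortho")
max_error = np.abs(reconstructed - block).max()
```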
History
In 1967, A.H. Robinson and C. Cherry proposed a run-length encoding bandwidth compression scheme for the transmission of analog television signals.
The most popular video coding standards used for codecs have been the MPEG standards. MPEG-1 was developed by the Motion Picture Experts Group (MPEG) in 1991, and it was designed to compress VHS-quality video.
Genetics
Outlook and currently unused potential
It is estimated that the total amount of data that is stored on the world's storage devices could be further compressed with existing compression algorithms by a remaining average factor of 4.5:1.
See also
References
- PCM source and thereby achieve a reduction in the overall source rate R.
- ^ a b Mahdi, O.A.; Mohammed, M.A.; Mohamed, A.J. (November 2012). "Implementing a Novel Approach an Convert Audio Compression to Text Coding via Hybrid Technique" (PDF). International Journal of Computer Science Issues. 9 (6, No. 3): 53–59. Archived (PDF) from the original on 2013-03-20. Retrieved 6 March 2013.
- ^ Pujar, J.H.; Kadlaskar, L.M. (May 2010). "A New Lossless Method of Image Compression and Decompression Using Huffman Coding Techniques" (PDF). Journal of Theoretical and Applied Information Technology. 15 (1): 18–23. Archived (PDF) from the original on 2010-05-24.
- ISBN 9781848000728.
- ISBN 978-81-8489-988-7.
- ^ Navqi, Saud; Naqvi, R.; Riaz, R.A.; Siddiqui, F. (April 2011). "Optimized RTL design and implementation of LZW algorithm for high bandwidth applications" (PDF). Electrical Review. 2011 (4): 279–285. Archived (PDF) from the original on 2013-05-20.
- ^ Document Management - Portable document format - Part 1: PDF1.7 (1st ed.). Adobe Systems Incorporated. July 1, 2008.
- ISBN 1-57955-008-8.
- ^ a b Mahmud, Salauddin (March 2012). "An Improved Data Compression Method for General Data" (PDF). International Journal of Scientific & Engineering Research. 3 (3): 2. Archived (PDF) from the original on 2013-11-02. Retrieved 6 March 2013.
- ^ a b Lane, Tom. "JPEG Image Compression FAQ, Part 1". Internet FAQ Archives. Independent JPEG Group. Retrieved 6 March 2013.
- S2CID 64404.
- ^ "How to choose optimal archiving settings – WinRAR".
- ^ "(Set compression Method) switch – 7zip". Archived from the original on 2022-04-09. Retrieved 2021-11-07.
- ISBN 978-1-57955-008-0.
- ^ Arcangel, Cory. "On Compression" (PDF). Archived (PDF) from the original on 2013-07-28. Retrieved 6 March 2013.
- ^ .
- ^ (PDF) from the original on 2016-12-08.
- ^ CCITT Study Group VIII und die Joint Photographic Experts Group (JPEG) von ISO/IEC Joint Technical Committee 1/Subcommittee 29/Working Group 10 (1993), "Annex D – Arithmetic coding", Recommendation T.81: Digital Compression and Coding of Continuous-tone Still images – Requirements and guidelines (PDF), pp. 54 ff, retrieved 2009-11-07
- ^ Marak, Laszlo. "On image compression" (PDF). University of Marne la Vallee. Archived from the original (PDF) on 28 May 2015. Retrieved 6 March 2013.
- ^ Mahoney, Matt. "Rationale for a Large Text Compression Benchmark". Florida Institute of Technology. Retrieved 5 March 2013.
- (PDF) from the original on 2009-07-09.
- S2CID 9376086.
- S2CID 12311412.
- ^ Gary Adcock (January 5, 2023). "What Is AI Video Compression?". massive.io. Retrieved 6 April 2023.
- arXiv:2006.09965 [eess.IV].
- ^ "What is Unsupervised Learning? | IBM". www.ibm.com. 23 September 2021. Retrieved 2024-02-05.
- ^ "Differentially private clustering for large-scale datasets". blog.research.google. 2023-05-25. Retrieved 2024-03-16.
- ^ Edwards, Benj (2023-09-28). "AI language models can exceed PNG and FLAC in lossless compression, says study". Ars Technica. Retrieved 2024-03-07.
- ^ Korn, D.; et al. (July 2002). "RFC 3284: The VCDIFF Generic Differencing and Compression Data Format". Internet Engineering Task Force. Retrieved 5 March 2013.
- ^ Korn, D.G.; Vo, K.P. (1995). B. Krishnamurthy (ed.). Vdelta: Differencing and Compression. Practical Reusable Unix Software. New York: John Wiley & Sons, Inc.
- (PDF) from the original on 2011-05-24. Retrieved 2019-04-21.
- (PDF) from the original on 2005-10-08
- .
- CCITT. September 1992. Retrieved 12 July 2019.
- BT.com. BT Group. 31 May 2018. Archived from the original on 5 August 2019. Retrieved 5 August 2019.
- ^ Baraniuk, Chris (15 October 2015). "Copy protections could come to JPEGs". BBC News. BBC. Retrieved 13 September 2019.
- ^ "What Is a JPEG? The Invisible Object You See Every Day". The Atlantic. 24 September 2013. Retrieved 13 September 2019.
- ^ "The GIF Controversy: A Software Developer's Perspective". 27 January 1995. Retrieved 26 May 2015.
- . Retrieved 2014-04-23.
- ISBN 9781461560319.
Basically, wavelet coding is a variant on DCT-based transform coding that reduces or eliminates some of its limitations. (...) Another advantage is that rather than working with 8 × 8 blocks of pixels, as do JPEG and other block-based DCT techniques, wavelet coding can simultaneously compress the entire image.
- ISBN 9781461507994.
- S2CID 2765169.
- ^ Sullivan, Gary (8–12 December 2003). "General characteristics and design considerations for temporal subband video coding". ITU-T. Video Coding Experts Group. Retrieved 13 September 2019.
- ISBN 9780080922508.
- ISBN 9780240806174.
- .
- ^ The Olympus WS-120 digital speech recorder, according to its manual, can store about 178 hours of speech-quality audio in .WMA format in 500 MB of flash memory.
- ^ Coalson, Josh. "FLAC Comparison". Retrieved 2020-08-23.
- ^ "Format overview". Retrieved 2020-08-23.
- ^ ISBN 9788190639675.
- ^ ISBN 9783642126512.
- ^ US patent 2605361, C. Chapin Cutler, "Differential Quantization of Communication Signals", issued 1952-07-29
- .
- ISSN 0005-8580.
- ^ ISBN 9783319056609.
- (PDF) from the original on 2010-07-04.
- ^ Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression" (PDF). University of Utah. Archived (PDF) from the original on 2014-01-24. Retrieved 14 July 2019.
- ISBN 9780387782638.
- S2CID 897622.
- ^ Brandenburg, Karlheinz (1999). "MP3 and AAC Explained" (PDF). Archived (PDF) from the original on 2017-02-13.
- S2CID 58446992.
- .
- ^ "Ricardo Sametband, La Nación Newspaper "Historia de un pionero en audio digital"" (in Spanish).
- ^ Zwicker, Eberhard; et al. (1967). The Ear As A Communication Receiver. Melville, NY: Acoustical Society of America. Archived from the original on 2000-09-14. Retrieved 2011-11-11.
- ^ "Summary of some of Solidyne's contributions to Broadcast Engineering". Brief History of Solidyne. Buenos Aires: Solidyne. Archived from the original on 8 March 2013. Retrieved 6 March 2013.
- ^ "Anuncio del Audicom, AES Journal, July-August 1992, Vol 40, # 7/8, pag 647".
- ^ "File Compression Possibilities". A Brief guide to compress a file in 4 different ways. 17 February 2017.
- ^ Dmitriy Vatolin; et al. (Graphics & Media Lab Video Group) (March 2007). Lossless Video Codecs Comparison '2007 (PDF) (Report). Moscow State University. Archived (PDF) from the original on 2008-05-15.
- ISBN 9780203904183.
- ISBN 9789812709998.
- ^ "Video Coding". CSIP website. Center for Signal and Information Processing, Georgia Institute of Technology. Archived from the original on 23 May 2013. Retrieved 6 March 2013.
- .
- ^ ISBN 9780852967102.
- doi:10.1117/12.2239493. Archived from the original on 2016-12-08. Lecture recording, from 3:05:10.
- ^ a b c d "The History of Video File Formats Infographic — RealPlayer". 22 April 2012.
- ^ "Patent statement declaration registered as H261-07". ITU. Retrieved 11 July 2019.
- ^ "MPEG-2 Patent List" (PDF). MPEG LA. Archived (PDF) from the original on 2019-05-29. Retrieved 7 July 2019.
- ^ "MPEG-4 Visual - Patent List" (PDF). MPEG LA. Archived (PDF) from the original on 2019-07-06. Retrieved 6 July 2019.
- ^ "AVC/H.264 – Patent List" (PDF). MPEG LA. Retrieved 6 July 2019.
- PMID 22844100.
- PMID 18996942.
- PMID 23793748.
- .
- ^ "Data Compression via Logic Synthesis" (PDF).
- S2CID 206531385.
External links
- "Part 3: Video compression", Data Compression Basics
- Pierre Larbier, Using 10-bit AVC/H.264 Encoding with 4:2:2 for Broadcast Contribution, Ateme, archived from the original on 2009-09-05
- Why does 10-bit save bandwidth (even when content is 8-bit)? at the Wayback Machine (archived 2017-08-30)
- Which compression technology should be used? at the Wayback Machine (archived 2017-08-30)
- Introduction to Compression Theory (PDF), Wiley, archived (PDF) from the original on 2007-09-28
- EBU subjective listening tests on low-bitrate audio codecs
- Audio Archiving Guide: Music Formats (Guide for helping a user pick out the right codec)
- MPEG 1&2 video compression intro (pdf format) at the Wayback Machine (archived September 28, 2007)
- hydrogenaudio wiki comparison
- Introduction to Data Compression by Guy E Blelloch from CMU
- Explanation of lossless signal compression method used by most codecs
- Videsignline – Intro to Video Compression at the Wayback Machine (archived 2010-03-15)
- Data Footprint Reduction Technology at the Wayback Machine (archived 2013-05-27)
- What is Run length Coding in video compression