Dictionary coder
This article needs additional citations for verification. (June 2020) |
A dictionary coder, also sometimes known as a substitution coder, is a class of
Methods and applications
Some dictionary coders use a 'static dictionary', one whose full set of strings is determined before coding begins and does not change during the coding process. This approach is most often used when the message or set of messages to be encoded is fixed and large; for instance, an application that stores the contents of a book in the limited storage space of a PDA generally builds a static dictionary from a concordance of the text and then uses that dictionary to compress the verses. This scheme of using Huffman coding to represent indices into a concordance has been called "Huffword".[1]
In a related and more general method, a dictionary is built from redundancy extracted from a data environment (various input streams) which dictionary is then used statically to compress a further input stream. For example, a dictionary is built from old English texts then is used to compress a book.[2]
More common are methods where the dictionary starts in some predetermined state but the contents change during the encoding process, based on the data that has already been encoded. Both the
LZ78 uses a more explicit dictionary structure; at the beginning of the encoding process, the dictionary is empty. An index value of zero is used to represent the end of a string, so the first index of the dictionary is one. At each step of the encoding process, if there is no match, then the last matching index (or zero) and character are both added to the dictionary and output to the compressed stream. If there is a match, then the working index is updated to the matching index, and nothing is output.
References
- ISBN 9780442018634.
- ^ Rodney J. Smith. Streaming Compression System Using Dynamic Connection Groups, US patent 5,748,955, priority date 20 December 1993.
See also
- Grammar-based code
- Entropy encoding