B.2 Lossless and Lossy Compression

There are actually two fundamentally different "styles" of data compression: lossless and lossy. This appendix is generally about lossless compression techniques, but the reader would be served to understand the distinction first. Lossless compression involves a transformation of the representation of a data set such that it is possible to reproduce exactly the original data set by performing a decompression transformation. Lossy compression is a representation that allows you to reproduce something "pretty much like" the original data set. As a plus for the lossy techniques, they can frequently produce far more compact data representations than lossless compression techniques can. Most often lossy compression techniques are used for images, sound files, and video. Lossy compression may be appropriate in these areas insofar as human observers do not perceive the literal bit-pattern of a digital image/sound, but rather more general "gestalt" features of the underlying image/sound.

From the point of view of "normal" data, lossy compression is not an option. We do not want a program that does "about the same" thing as the one we wrote. We do not want a database that contains "about the same" kind of information as what we put into it. At least not for most purposes (and I know of few practical uses of lossy compression outside of what are already approximate mimetic representations of the real world, likes images and sounds).