Compression of Mixed Images


In certain applications, such as archives of scanned documents (e.g., newspapers), the user may be interested primarily in text readability, while the visual quality of the rest of the image is of secondary importance. The compression of these mixed images (images that include text) is therefore a substantially different problem from the compression of typical photographic images.

Because it can capture the semantically important structure in an image, segmentation-based coding is a promising approach for compressing this type of image. Segmentation preserves much of the high-frequency content related to structure in the image, so segmentation-based methods avoid the problems prevalent in conventional transform-based coding, such as ringing and blurring around sharp edges at low bit rates. Moreover, the large contrast between the text and the background facilitates the segmentation of the text.

For automatic classification, the error (the difference between the original image and its approximation) is partitioned into square windows, and the statistics of each window (mean and variance) are computed individually. An adaptive threshold is then computed from the statistics of the error within the window and from the window size. Given the threshold, Td, each window is labeled as text or image (non-text) by the window decision function:

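\[
g(j) =
\begin{cases}
\text{Text}, & \text{if } \mathrm{Var}(j) > T_d \\
\text{Image (non-text)}, & \text{otherwise}
\end{cases}
\]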

where g(j) is the decision function for the jth window (j = 1, 2, ..., N), N is the number of windows, and Var(j) is the variance of the error in the jth window.
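
The window decision rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name classify_windows, the list-of-lists error layout, and the handling of partial windows at the image border are assumptions made for illustration. The source does not give the formula for the adaptive threshold, so Td is simply passed in as the parameter td.

```python
from statistics import pvariance


def classify_windows(error, window, td):
    """Label each window-by-window tile of a 2-D error array as 'text' or 'image'.

    error  : list of rows (lists of numbers), the approximation error
    window : side length of the square window, in pixels
    td     : threshold Td (assumed to be supplied by the caller)
    """
    rows, cols = len(error), len(error[0])
    labels = []
    for top in range(0, rows, window):
        label_row = []
        for left in range(0, cols, window):
            # Collect the error samples in this tile (border tiles may be smaller).
            tile = [error[r][c]
                    for r in range(top, min(top + window, rows))
                    for c in range(left, min(left + window, cols))]
            # Decision rule: text if the error variance exceeds Td, image otherwise.
            label_row.append("text" if pvariance(tile) > td else "image")
        labels.append(label_row)
    return labels


# Hypothetical 4x4 error image: only the top-right 2x2 window has large variance.
example_error = [
    [0, 0, 9, -9],
    [0, 0, -9, 9],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
example_labels = classify_windows(example_error, window=2, td=4.0)
```

In this example only the top-right window is labeled as text, because its error variance (81) exceeds the threshold while the other three windows have zero variance.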

Given the mixed image approximation, quantized error is added to the approximation only within the windows that are labeled as text. As a further refinement, only the error within a narrow band around the text contours (that fall in a text window) need be added, which further improves the achievable compression. The idea of adding error to the approximation within a narrow band around the text contours is shown below.

Selective addition of error in the textual portion.
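
The selective addition of error can be sketched in Python as follows. This is an illustration under stated assumptions, not the method as implemented in the source: the uniform quantizer, the boolean contour mask, the band measured as a Chebyshev distance in pixels, and all function and parameter names (quantize, near_contour, add_text_error, step, band) are hypothetical. The window labels are assumed to come from the window decision function described above.

```python
def quantize(value, step):
    """Uniform quantizer (an assumption; the source does not specify one)."""
    return step * round(value / step)


def near_contour(contour, r, c, band):
    """True if a contour pixel lies within `band` pixels of (r, c)."""
    rows, cols = len(contour), len(contour[0])
    for rr in range(max(0, r - band), min(rows, r + band + 1)):
        for cc in range(max(0, c - band), min(cols, c + band + 1)):
            if contour[rr][cc]:
                return True
    return False


def add_text_error(approx, error, labels, window, step, contour=None, band=1):
    """Add quantized error to the approximation inside text windows only.

    If a contour mask is given, the error is added only within `band`
    pixels of a contour pixel (the narrow-band refinement).
    """
    out = [row[:] for row in approx]  # leave the input approximation untouched
    for r in range(len(approx)):
        for c in range(len(approx[0])):
            if labels[r // window][c // window] != "text":
                continue  # image (non-text) windows keep the plain approximation
            if contour is not None and not near_contour(contour, r, c, band):
                continue  # outside the narrow band around the text contours
            out[r][c] += quantize(error[r][c], step)
    return out


# Hypothetical data: a flat approximation and one text window (top right).
approx = [[100] * 4 for _ in range(4)]
error = [
    [0, 0, 9, -9],
    [0, 0, -9, 9],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
labels = [["image", "text"], ["image", "image"]]
refined = add_text_error(approx, error, labels, window=2, step=4)
```

Passing a contour mask restricts the correction further, so fewer quantized error samples need to be coded; the trade-off between band width and text quality is not quantified in the source.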