Because it captures the semantically important structure in an image, segmentation-based coding is a promising approach for the compression of this type of image. Segmentation preserves much of the high-frequency content related to structure in the image, so segmentation-based methods do not suffer from the problems prevalent in conventional transform-based coding approaches. Moreover, the large contrast between the text and the background facilitates the segmentation of the text.
For automatic classification, the error is partitioned into square windows, and the statistics of each window (mean and variance) are computed individually. An adaptive threshold, which depends on the statistics of the error within the window as well as on the window size, is then computed. Given the threshold, Td, each window is labeled as text or image (non-text) by the window decision function:
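As a rough illustration of this windowing step, the following Python sketch computes the mean and variance of each window and labels the window as text or image. The form of `adaptive_threshold`, its parameters `base` and `k`, and the rule of comparing the window variance against Td are all assumptions made for illustration; the actual decision function may differ.

```python
from statistics import fmean, pvariance


def window_stats(error, size):
    """Return {(window_row, window_col): (mean, variance)} for a 2-D error array.

    `error` is a list of equal-length rows; windows at the right and bottom
    edges may be smaller than size x size.
    """
    rows, cols = len(error), len(error[0])
    stats = {}
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            pixels = [error[i][j]
                      for i in range(r, min(r + size, rows))
                      for j in range(c, min(c + size, cols))]
            stats[(r // size, c // size)] = (fmean(pixels), pvariance(pixels))
    return stats


def adaptive_threshold(mu, size, base=100.0, k=1.0):
    """Hypothetical threshold Td: depends on the window mean and window size.

    This form is an assumption for illustration only, not the paper's formula.
    """
    return base / size + k * abs(mu)


def label_windows(error, size):
    """Label each window 'text' if its error variance exceeds Td, else 'image'.

    The variance-versus-Td comparison is an assumed decision rule.
    """
    labels = {}
    for key, (mu, var) in window_stats(error, size).items():
        td = adaptive_threshold(mu, size)
        labels[key] = "text" if var > td else "image"
    return labels
```

Under these assumptions, a window with large error fluctuations is labeled text, while a window with near-zero error is labeled image. Keeping the threshold in a separate function makes it straightforward to substitute a different rule.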
Given the mixed image approximation, quantized error is added to the approximation only within the windows that are labeled as text. As a further refinement, the error can be added only within a narrow band around the text contours that fall in a text window, which enhances the achievable compression. The idea of adding error to the approximation within a narrow band around the text contours is shown below.
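In code, the selective addition of error might look like the following minimal sketch. It assumes a uniform quantizer with a hypothetical step size, a `labels` mapping from window index to "text" or "image", and an optional `band` set of pixel coordinates near the text contours. How that band is derived is left as an input, since it depends on the contour representation used.

```python
def quantize(value, step):
    """Uniform quantizer (an assumption): round to the nearest multiple of step."""
    return step * round(value / step)


def refine(approx, error, labels, size, step=8, band=None):
    """Add quantized error to `approx` only inside windows labeled 'text'.

    approx, error : 2-D lists of equal shape
    labels        : {(window_row, window_col): 'text' or 'image'}
    size          : window side length in pixels
    band          : optional set of (row, col) pixels near text contours
                    (hypothetical input); when given, error is added only
                    at those pixels
    """
    out = [row[:] for row in approx]  # copy so the input is left unchanged
    for i, row in enumerate(error):
        for j, e in enumerate(row):
            if labels.get((i // size, j // size)) != "text":
                continue  # image windows keep the plain approximation
            if band is not None and (i, j) not in band:
                continue  # outside the narrow band around the contours
            out[i][j] += quantize(e, step)
    return out
```

Restricting the update to `band` reduces the number of error samples that have to be coded, which is the source of the additional compression described above.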