Highly Compressed (MRC) PDF Export

Definition

  • MRC = Mixed Raster Content
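  • MRC = a method of compressing a raster image by segmenting it into layers (typically text and continuous-tone content), so that each layer can be compressed with the algorithm best suited to it. This improves both the compression ratio and the quality of the rendered image.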

Source: http://en.wikipedia.org/wiki/Mixed_raster_content

Benefits

MRC compression in PDFs produces significantly smaller files without visible degradation of the document's appearance. Files can be up to 10 times smaller than with JPEG compression alone, which makes MRC ideal for colour documents that are scanned and processed.

Improved MRC compression saves bandwidth and storage, especially for colour documents. Application users benefit from the combination of good visual quality with small file sizes - particularly useful for reading documents on mobile devices.

MRC Compression Scheme

The diagram illustrates the basic principle of MRC PDFs: different sections of a page are compressed with different compression algorithms. For example:

  • The black/white text layer might be compressed with CCITT Group 4
  • Images might be compressed with JPEG or JPEG 2000
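
The layer idea can be sketched in a few lines of Python. This is a minimal illustration only, not FineReader Engine code: the function names, the fixed threshold and the codec table are assumptions made for the example, and real MRC segmentation is far more elaborate.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale values, 0 (black) to 255 (white)

# Illustrative codec choice per layer (an assumption based on the examples above).
LAYER_CODECS = {
    "mask": "CCITT Group 4",            # 1-bit text mask
    "foreground": "JPEG or JPEG 2000",  # ink values under the mask
    "background": "JPEG or JPEG 2000",  # everything else
}


def split_layers(page: Image, threshold: int = 128) -> Tuple[Image, Image, Image]:
    """Split a page into a 1-bit mask, a foreground layer and a background layer."""
    # Pixels darker than the threshold are treated as text (mask = 1).
    mask = [[1 if px < threshold else 0 for px in row] for row in page]
    # Foreground keeps the original value where the mask is set, 0 elsewhere.
    foreground = [[px if m else 0 for px, m in zip(row, mrow)]
                  for row, mrow in zip(page, mask)]
    # Background keeps non-text pixels; text positions are filled with white.
    background = [[255 if m else px for px, m in zip(row, mrow)]
                  for row, mrow in zip(page, mask)]
    return mask, foreground, background


def merge_layers(mask: Image, foreground: Image, background: Image) -> Image:
    """Rebuild the page: take the foreground where the mask is set, else the background."""
    return [[f if m else b for m, f, b in zip(mrow, frow, brow)]
            for mrow, frow, brow in zip(mask, foreground, background)]
```

In this toy version the split is lossless. The size savings in a real MRC PDF come from compressing each layer separately, and usually lossily for the image layers.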

  • Document image files are usually very large because of the background, which often makes up as much as 90% of the file size. The background is, however, often unnecessary in the resulting document; it is the text and pictures that matter.
  • The MRC compression technology locates the colour background and either deletes it or compresses it to a high degree. This leaves text and pictures against a white background, which contributes to a smaller file size.
  • Picture objects (diagrams, graphs, logos, photos, drawings, stamps, signatures, etc.) are also slightly compressed, but only to an extent that doesn’t lower the quality.
  • The MRC technology analyzes the outlines of similar characters in the document, creates an average character template and uses it instead of a character itself. This leads to better readability, because some of the text defects are corrected, and the character outlines become more precise.
  • As a result, you get a smaller image which looks even better than before. The resulting document will have an unobtrusive bland background with fine text and pictures.
  • This “reconstruction” of the document can be useful when you have to deal with low quality images due to: bad lighting, out-of-focus photo, incorrect scanning/photo parameters, dark uncoated paper, or document dilapidation.
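
The character-template step described above can be illustrated with a per-pixel majority vote over several scans of the same character. This is only a sketch under simplifying assumptions: the glyphs are taken to be already segmented, aligned and equal in size, and `average_template` is a hypothetical helper, not the algorithm FineReader Engine actually uses.

```python
from typing import List

Glyph = List[List[int]]  # 1-bit bitmap: 1 = ink, 0 = paper


def average_template(glyphs: List[Glyph]) -> Glyph:
    """Build one template from several instances of the same character.

    Each output pixel is ink if more than half of the instances have ink there,
    so isolated defects in individual instances are voted out.
    """
    count = len(glyphs)
    height, width = len(glyphs[0]), len(glyphs[0][0])
    return [[1 if 2 * sum(g[y][x] for g in glyphs) > count else 0
             for x in range(width)]
            for y in range(height)]


# Three scans of a vertical stroke: one clean, one with a speck, one with a gap.
clean = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
speck = [[1, 1, 0], [0, 1, 0], [0, 1, 0]]
gap = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]

template = average_template([clean, speck, gap])
```

The template is stored once and referenced wherever the character occurs, which is why the approach both reduces file size and smooths out individual scanning defects.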

MRC Profiles in FineReader Engine 10

  • With FineReader Engine 10 (Win) more improvements to the MRC export were released:
    • For simplified fine tuning, there are now 6 new high-level parameters for PDF Export
    • New export profiles for different scenarios
      • Maximum Quality
      • Balanced (quality / file size)
      • Minimal Size
      • Maximum Speed

The image below shows a code sample that is installed with FineReader Engine 10. It makes the testing and implementation of a good PDF MRC export really easy.

