PDF/A Export

What is PDF/A

  • PDF/A is a file format and an ISO Standard for the long-term archiving of electronic documents.
  • PDF/A is in fact a subset of PDF, obtained by leaving out PDF features not suited to long-term archiving.
  • There are different levels of PDF/A:
    • PDF/A-1b - Level B compliance in Part 1
      PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document.
    • PDF/A-1a - Level A compliance in Part 1
      PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being “tagged”/“Tagged PDF”), with the objective of ensuring that document content can be searched and repurposed. PDF/A-1a also requires Unicode character maps.
    • PDF/A-2 - Part 2, based on PDF 1.7 (ISO 32000-1)
      PDF/A-2 is defined by ISO 19005-2:2011, published on June 20, 2011 under the formal name Document management – Electronic document file format for long-term preservation – Part 2: Use of ISO 32000-1 (PDF/A-2). It is a very recent standard and is not yet widely used.
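
A PDF/A file states which part and level it claims to meet in its XMP metadata, through the properties pdfaid:part and pdfaid:conformance. The following Python sketch reads that claim from an XMP packet. The function name read_pdfa_level and the sample packet are illustrative assumptions, and reading the claim is not the same as validating the file against the standard.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_level(xmp_packet):
    """Return the claimed level, e.g. 'PDF/A-1b', or None if no claim is found.

    Illustrative helper: it only reports what the metadata says and does
    not check whether the file actually complies.
    """
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # The properties may be written as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(f"{{{PDFAID_NS}}}part")
        conf = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part:
            return f"PDF/A-{part.strip()}{(conf or '').strip().lower()}"
    return None


# Hand-written sample packet, for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(read_pdfa_level(SAMPLE_XMP))  # PDF/A-1b
```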

PDF/A Minimum Requirements

  • Requirements that must be fulfilled for PDF/A compliance:
    • Audio and video content are forbidden.
    • JavaScript and executable file launches are forbidden.
    • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
    • Color spaces must be specified in a device-independent manner.
    • Encryption is forbidden.
    • Use of standards-based metadata is mandated.
    • External content references are forbidden.
    • LZW and JPEG2000 image compression are forbidden in PDF/A-1,
      but JPEG2000 compression is allowed in PDF/A-2.
    • Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but they are supported in PDF/A-2.
    • Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2.
    • Embedded files are forbidden in PDF/A-1, but PDF/A-2 offers the possibility to embed PDF/A files, allowing archiving of sets of documents in a single file.

Source: http://en.wikipedia.org/wiki/PDF/A
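
The restrictions listed above can be sketched as a small lookup table that maps a feature to the PDF/A parts in which it is permitted. This is a minimal illustration in Python, not a validator: the feature names are made-up labels, the table covers only the items from the list, and treating LZW as forbidden in PDF/A-2 as well is an assumption, since the list only states the rule for PDF/A-1.

```python
# Feature label -> set of PDF/A parts (1 or 2) in which the feature is permitted.
# The labels are illustrative and do not come from any library or from the standard.
ALLOWED_IN = {
    "audio_video": set(),
    "javascript": set(),
    "executable_launch": set(),
    "non_embedded_fonts": set(),
    "device_dependent_color": set(),
    "encryption": set(),
    "external_content": set(),
    "lzw_compression": set(),  # assumption: also forbidden in PDF/A-2
    "jpeg2000_compression": {2},
    "transparency": {2},
    "layers": {2},
    "embedded_pdfa_files": {2},
}


def violations(features, part):
    """Return a sorted list of the given features that are not permitted in `part`."""
    unknown = set(features) - set(ALLOWED_IN)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return sorted(f for f in features if part not in ALLOWED_IN[f])


scan = {"jpeg2000_compression", "transparency"}
print(violations(scan, 1))  # ['jpeg2000_compression', 'transparency']
print(violations(scan, 2))  # []
```

A document that uses JPEG2000 images or transparency therefore has to target PDF/A-2, or be converted (for example by recompressing images) before it can be exported as PDF/A-1.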

PDF/A Support in ABBYY Technology Products

PDF/A Export (PDF/A-1b and PDF/A-1a) is available in the following ABBYY technology products:

FineReader Engines - OCR & Document Conversion

FlexiCapture Engine - Separation, Classification & Data Capture

Recognition Server - Solution for server based processing and document capture

FlexiCapture - Solutions for Data Capture

About PDF/A-2 Support

The technical changes of PDF/A-2 are:

  • based on PDF 1.7 (ISO 32000-1)
  • highly efficient JPEG2000 compression allowed
  • support for transparency effects and layers
  • embedding of OpenType fonts
  • provisions for digital signatures in accordance with the
    PAdES (PDF Advanced Electronic Signatures) standard.
  • possibility to embed PDF/A files in PDF/A-2,
    allowing archiving of sets of documents as individual documents in a single file.

For OCR conversion, the most important of these features is JPEG2000 compression, which produces smaller output files.

ABBYY will add PDF/A-2 export in future product releases.
The first product will be FineReader Engine 11 Windows, targeted for the end of 2012. The update will be free for customers with a valid Software Maintenance contract.
Note: this is roadmap information, not an announcement, and it is subject to change.

As of: 03/2012

Further Information

ABBYY is a worldwide member of the PDF/A Competence Center and is committed to supporting PDF/A.


