PDF/A is a file format and an ISO standard for the long-term archiving of electronic documents.
PDF/A is in fact a subset of PDF, obtained by leaving out PDF features not suited to long-term archiving.
There are different levels of PDF/A:
PDF/A-1b - Level B compliance in Part 1
PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document.
PDF/A-1a - Level A compliance in Part 1
PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being “tagged” or “Tagged PDF”), with the objective of ensuring that document content can be searched and repurposed.
PDF/A-1a also requires Unicode character maps.
PDF/A-2 is based on ISO 32000-1 (PDF 1.7) and is defined by ISO 19005-2:2011, published on June 20, 2011 under the formal name Document management – Electronic document file format for long-term preservation – Part 2: Use of ISO 32000-1 (PDF/A-2).
PDF/A-2 is a very recent standard and is not widely used (yet).
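A PDF/A file states which part and conformance level it claims in its XMP metadata, using the pdfaid:part and pdfaid:conformance properties of the PDF/A identification schema. The following is a minimal Python sketch, assuming the XMP packet has already been extracted from the PDF as a string. The function name claimed_pdfa_level and the sample packet are hypothetical and for illustration only. Note that this reads the claim only; it does not validate that the file actually complies.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # PDF/A identification schema


def _pdfaid_value(desc, name):
    """Read a pdfaid property stored as an attribute or as a child element."""
    key = f"{{{PDFAID_NS}}}{name}"
    value = desc.get(key)
    if value is None:
        child = desc.find(key)
        if child is not None and child.text:
            value = child.text
    return value.strip() if value else None


def claimed_pdfa_level(xmp_packet):
    """Return the claimed level, e.g. 'PDF/A-1b', or None if there is no claim."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        part = _pdfaid_value(desc, "part")
        if part:
            conformance = _pdfaid_value(desc, "conformance") or ""
            return f"PDF/A-{part}{conformance.lower()}"
    return None


# Hypothetical sample packet, reduced to the PDF/A identification properties.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(claimed_pdfa_level(SAMPLE_XMP))  # PDF/A-1b
```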
Requirements that have to be fulfilled for PDF/A compliance:
Audio and video content are forbidden.
JavaScript and executable file launches are forbidden.
All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
Color spaces must be specified in a device-independent manner.
Encryption is forbidden.
Use of standards-based metadata is mandated.
External content references are forbidden.
LZW and JPEG 2000 image compressions are forbidden in PDF/A-1, but JPEG 2000 compression is allowed in PDF/A-2.
Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but they are supported in PDF/A-2.
Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2.
Embedded files are forbidden in PDF/A-1, but PDF/A-2 offers the possibility to embed PDF/A files, allowing archiving of sets of documents in a single file.
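The requirements above can be summarized as a small rule table. The following Python sketch is illustrative only: the DocumentFeatures record and the pdfa_violations function are hypothetical names, the feature flags are assumed to come from some separate analysis of the file, and the sketch covers only the rules listed here, not the full ISO 19005 requirements. It treats LZW as forbidden in both parts, since the list relaxes only JPEG 2000 for PDF/A-2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentFeatures:
    """Hypothetical summary of a PDF's features, filled in by some other tool."""
    has_audio_or_video: bool = False
    has_javascript_or_launch_actions: bool = False
    all_fonts_embedded: bool = True
    device_independent_color: bool = True
    is_encrypted: bool = False
    has_standard_metadata: bool = True
    has_external_references: bool = False
    uses_lzw: bool = False
    uses_jpeg2000: bool = False
    uses_transparency_or_layers: bool = False
    has_embedded_files: bool = False
    embedded_files_are_pdfa: bool = False


def pdfa_violations(doc: DocumentFeatures, part: int) -> List[str]:
    """List which of the rules above the document breaks for PDF/A part 1 or 2."""
    if part not in (1, 2):
        raise ValueError("this sketch only covers PDF/A-1 and PDF/A-2")
    problems = []
    # Rules stated without a version distinction.
    if doc.has_audio_or_video:
        problems.append("audio or video content")
    if doc.has_javascript_or_launch_actions:
        problems.append("JavaScript or executable file launches")
    if not doc.all_fonts_embedded:
        problems.append("fonts not embedded")
    if not doc.device_independent_color:
        problems.append("device-dependent color spaces")
    if doc.is_encrypted:
        problems.append("encryption")
    if not doc.has_standard_metadata:
        problems.append("missing standards-based metadata")
    if doc.has_external_references:
        problems.append("external content references")
    if doc.uses_lzw:  # assumption: only JPEG 2000 is relaxed in PDF/A-2
        problems.append("LZW compression")
    # Rules that differ between PDF/A-1 and PDF/A-2.
    if part == 1:
        if doc.uses_jpeg2000:
            problems.append("JPEG 2000 compression")
        if doc.uses_transparency_or_layers:
            problems.append("transparency or layers")
        if doc.has_embedded_files:
            problems.append("embedded files")
    elif doc.has_embedded_files and not doc.embedded_files_are_pdfa:
        problems.append("embedded files that are not PDF/A")
    return problems


scan = DocumentFeatures(uses_jpeg2000=True, uses_transparency_or_layers=True)
print(pdfa_violations(scan, part=1))  # ['JPEG 2000 compression', 'transparency or layers']
print(pdfa_violations(scan, part=2))  # []
```

The same document can therefore fail as PDF/A-1 and pass as PDF/A-2, which is why the target part has to be chosen before export.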
Source: http://en.wikipedia.org/wiki/PDF/A
PDF/A export (PDF/A-1b and PDF/A-1a) is available in the following ABBYY technology products:
FineReader Engines - OCR & Document Conversion
FlexiCapture Engine - Separation, Classification & Data Capture
Recognition Server - Solution for server based processing and document capture
FlexiCapture - Solutions for Data Capture
The technical changes of PDF/A-2 are:
based on PDF 1.7 (ISO 32000-1)
highly efficient JPEG2000 compression allowed
support for transparency effects and layers
embedding of OpenType fonts
provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard
possibility to embed PDF/A files in PDF/A-2, allowing archiving of sets of documents as individual documents in a single file
For OCR conversion, the most important of these changes is JPEG 2000 compression, which makes it possible to generate smaller output files.
ABBYY will add PDF/A-2 export in future product releases.
The first product will be FineReader Engine 11 Windows, targeted for the end of 2012. The update will be free for customers with a valid Software Maintenance contract.
… this is roadmap information, not an announcement, and it is subject to change
As of: 03/2012
ABBYY is a worldwide member of the PDF/A Competence Center and is committed to supporting PDF/A.