Recognition

Once the recognition areas are set up, character and word recognition are executed.

Character & Word Recognition

ABBYY OCR uses several internal classifiers to read a character. Here are some examples:

  • Pattern
  • Feature
  • Outline
  • Vector

Font Types

ABBYY SDKs have a built-in omnifont OCR engine, so they can recognise a large variety of font types and objects:

  • Standard fonts used in office environments, magazines, newspapers
  • Documents printed with dot-matrix printers or typewriters
  • Special fonts like OCR-A, OCR-B, MICR (E13B) and CMC7
  • Old fonts such as Fraktur and Schwabacher
  • Hand-printed characters (ICR) in various field borders and frames

Recognition Modes

FineReader Engine gives developers full processing control and therefore offers different recognition modes:

  • Normal
  • Fast
  • Balanced

There are also special recognition options for ICR and barcode reading.

Languages

The SDKs support over 190 OCR and over 110 ICR languages.
The list of supported languages can be found in the documentation of each version.

Dictionaries

After the individual characters are recognized, ABBYY technology uses language and dictionary information to make a final decision about characters that could not be identified with certainty.
The dictionaries are what make this final decision possible.

  • ABBYY SDKs come with a set of predefined morphological standard dictionaries.
  • Custom language definitions and dictionaries can be used to improve the recognition results.
  • Dictionaries held in RAM allow very fast access and can be fully controlled via the API.
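
The dictionary step described above can be illustrated with a minimal sketch. It is not ABBYY's algorithm or API: the function name `best_word`, the data layout and the scoring rule (product of character confidences, with dictionary words always preferred) are assumptions chosen only to show the idea.

```python
from itertools import product

def best_word(char_hypotheses, dictionary):
    """Pick the most plausible word from per-character candidates.

    char_hypotheses: one list per character position, each holding
                     (character, confidence) pairs. Hypothetical layout.
    dictionary:      set of known lowercase words.
    """
    best, best_score, best_in_dict = None, -1.0, False
    for combo in product(*char_hypotheses):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, conf in combo:
            score *= conf
        in_dict = word.lower() in dictionary
        # Assumed rule: a dictionary word beats any non-dictionary word;
        # within the same group the higher confidence product wins.
        if (in_dict, score) > (best_in_dict, best_score):
            best, best_score, best_in_dict = word, score, in_dict
    return best

# The second character is uncertain: "1" scores slightly higher than "l".
hypotheses = [
    [("c", 0.9)],
    [("1", 0.55), ("l", 0.45)],
    [("e", 0.9)],
    [("a", 0.9)],
    [("r", 0.9)],
]
with_dictionary = best_word(hypotheses, {"clear"})
without_dictionary = best_word(hypotheses, set())
```

In this sketch the dictionary overrides the slightly higher classifier score for "1", so the word is read as "clear"; without a dictionary the raw scores decide and the result is "c1ear".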

Advanced Options

  • Voting API: gives developers access to word-level and character-level hypotheses. This information can then be used in external voting systems.
  • Pattern training, e.g. for special characters or decorative fonts
  • Tuning of core recognition parameters: certain algorithms for pre-processing, document analysis and recognition can be switched on or off
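
As an illustration of how an external voting system might use such word-level hypotheses, here is a minimal sketch. The function `vote` and the input format are hypothetical and do not reflect the actual Voting API; the rule of summing confidences is just one simple voting scheme.

```python
from collections import defaultdict

def vote(hypotheses_per_engine):
    """Combine word hypotheses from several sources by summing confidences.

    hypotheses_per_engine: list of lists of (word, confidence) pairs,
                           one inner list per engine. Hypothetical format.
    """
    totals = defaultdict(float)
    for hypotheses in hypotheses_per_engine:
        for word, confidence in hypotheses:
            totals[word] += confidence
    # Highest summed confidence wins; iterating in sorted order makes
    # ties resolve to the alphabetically first word.
    return max(sorted(totals), key=lambda w: totals[w])

engine_a = [("invoice", 0.60), ("invo1ce", 0.40)]
engine_b = [("invo1ce", 0.50), ("invoice", 0.45)]
engine_c = [("invoice", 0.70)]
winner = vote([engine_a, engine_b, engine_c])
```

Here engine_b alone would prefer "invo1ce", but the combined evidence from all three engines favours "invoice".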

Intelligent PDF Processing

FineReader Engine 8.0/9.0 decides on a block-by-block basis whether to apply full recognition or to use the existing text layer.
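
The block-by-block decision can be pictured with the following sketch. The criteria FineReader Engine actually applies are not described here, so the `Block` type and the rule used (reuse the text layer only if it is non-empty and contains no Unicode replacement characters) are purely illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Block:
    # Hypothetical representation of one PDF block: the embedded text,
    # or None if the block is image-only.
    embedded_text: Optional[str]

def plan_block(block):
    """Return "reuse_text_layer" or "ocr" for a single block."""
    text = block.embedded_text
    # Assumed heuristic: the text layer is usable if it has visible
    # content and no U+FFFD replacement characters.
    if text and text.strip() and "\ufffd" not in text:
        return "reuse_text_layer"
    return "ocr"

page = [Block("Invoice No. 4711"), Block(None), Block("\ufffd\ufffd\ufffd")]
plan = [plan_block(b) for b in page]
```

With these assumptions, only the first block keeps its text layer; the image-only block and the block with a damaged text layer are sent to full recognition.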


… more to come
