English is the default recognition language. To change it, call the SetPredefinedTextLanguage method of the RecognizerParams object.
If a block contains text of different types, ABBYY FineReader Engine will still treat the whole block as text of a single type. To improve the quality of OCR, draw a separate block for the text of each type.
More details can be found in the documentation chapter: How autodetection works.
If the PossibleTextTypes property of the RecognizerParams object contains any combination of TT_MATRIX, TT_TYPEWRITER, TT_OCR_A, and TT_OCR_B, italic fonts and superscript/subscript will not be recognized, regardless of the values of the ProhibitItalic, ProhibitSubscript and ProhibitSuperscript properties of the RecognizerParams object.
More details can be found in the documentation chapter: How autodetection works.
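The rule about PossibleTextTypes can be sketched as a small decision function. This is only an illustration of the stated behavior, not the engine's code: the TextType enumeration, its numeric values, and the italic_recognized helper are hypothetical, and the sketch assumes that "any combination" means at least one of the four listed text types is present.

```python
from enum import IntFlag, auto


class TextType(IntFlag):
    # Hypothetical stand-ins for the TT_* constants; the numeric values are
    # placeholders and are NOT taken from the FineReader Engine API.
    NORMAL = auto()
    TYPEWRITER = auto()
    MATRIX = auto()
    OCR_A = auto()
    OCR_B = auto()


# Text types for which italic and superscript/subscript are never recognized.
RESTRICTED = (TextType.TYPEWRITER | TextType.MATRIX
              | TextType.OCR_A | TextType.OCR_B)


def italic_recognized(possible_text_types: TextType, prohibit_italic: bool) -> bool:
    """Return True if italic could be recognized under the rule in the text."""
    # Assumption: one restricted type is enough to switch italic off,
    # whatever the value of the prohibit flag.
    if possible_text_types & RESTRICTED:
        return False
    return not prohibit_italic
```

The same check would apply to the subscript and superscript flags.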
Yes, hieroglyphic characters have such recognition attributes.
More details can be found in the documentation chapters: Recognizing Hieroglyphic Languages, ExtendedRecAttributes, CharParams.
The CharConfidence property of the ExtendedRecAttributes, PlainText, and CharacterRecognitionVariant objects is a read-only property of type long which stores the character confidence. Its value ranges from 0 to 100; the special value 255 indicates that the confidence is undefined. The value is an estimate of the recognition confidence of a character, expressed in percentage points: the greater the value, the greater the confidence. Character confidence can be undefined, for example, for characters which were added during text editing.
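Client code that reads this value has to treat 255 separately from the 0 to 100 range. A minimal sketch of that handling follows; the confidence_percent helper and the UNDEFINED_CONFIDENCE name are hypothetical and not part of the FineReader Engine API, and the raw integer is assumed to have been read from CharConfidence already.

```python
from typing import Optional

# Sentinel described in the text: 255 means "confidence is undefined".
UNDEFINED_CONFIDENCE = 255


def confidence_percent(char_confidence: int) -> Optional[int]:
    """Map a raw CharConfidence value to a percentage, or None if undefined."""
    if char_confidence == UNDEFINED_CONFIDENCE:
        return None  # e.g. a character added during text editing
    if 0 <= char_confidence <= 100:
        return char_confidence
    # Anything else falls outside the documented range.
    raise ValueError(f"unexpected confidence value: {char_confidence}")
```

Returning None rather than 255 keeps the undefined case out of any later averaging or threshold comparison.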
Recognition confidence of a character image is a numerical estimate of how closely this image resembles the “ideal” image, whose recognition confidence would be 100%. When recognizing a character, the program produces several recognition variants which are ranked by their confidence values. For example, an image of the letter “e” may be recognized
The confidence values of all the recognition variants of a character need not add up to 100%. The hypothesis with the highest confidence rating is normally selected as the recognition result, but the choice also depends on the context (i.e. the word to which the character belongs) and on the results of a differential comparison. For example, if the word with the “e” hypothesis is not a dictionary word while the word with the “c” hypothesis is a dictionary word, the latter will be selected as the recognition result, and its confidence rating will be 85%. The remaining recognition variants can be obtained as hypotheses.
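The interplay between confidence ranking and dictionary context can be illustrated with a deliberately simplified sketch. It is not the engine's actual algorithm, which also uses differential comparison and other factors. The pick_variant function, the sample word, and the confidence of 95 assigned to “e” are all made up for illustration; only the 85 for “c” comes from the text.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int]  # (character, confidence in percentage points)


def pick_variant(prefix: str, variants: Iterable[Variant], suffix: str,
                 dictionary: Set[str]) -> Variant:
    """Prefer the most confident variant that forms a dictionary word."""
    # Rank the variants from highest to lowest confidence.
    ranked = sorted(variants, key=lambda v: v[1], reverse=True)
    for char, conf in ranked:
        if (prefix + char + suffix).lower() in dictionary:
            return char, conf
    # No variant yields a dictionary word: fall back to the top-ranked one.
    return ranked[0]


# "dise" is not a dictionary word, "disc" is, so "c" wins despite ranking lower.
result = pick_variant("dis", [("e", 95), ("c", 85)], "", {"disc"})
```

The selected variant keeps its own confidence value, so the result here carries 85, not the higher value of the rejected hypothesis.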
The IsSuspicious property of the CharParams object is a Boolean property. When it is set to TRUE, the character was recognized unreliably. The value is determined by an algorithm which takes into account a number of parameters, such as the recognition confidence of the character, nearby characters and their recognition confidence, hypotheses and their recognition confidence, the geometric parameters of the character, the context (i.e. the word to which the character belongs), etc.