Plugin ocra2ia

Presentation

This plugin makes it possible to perform optical character recognition (OCR) on 3 types of documents:

  • RIB: can read the Code Etablissement, Code Guichet, Numero de compte, Cle, IBAN, BIC, and the name and address of the account holder
  • Tax assessment: can read the tax amount, the date of issue, and the taxpayer's name and address
  • Identity card: can read the name, address, birth date, nationality, gender, ID number, ...

The plugin queries the A2iA engine (https://www.a2ia.com/en), which performs the OCR, and then returns the results in a HashMap.

Important

To work with A2iA, the plugin uses the Jacob library (https://sourceforge.net/projects/jacob-project). Jacob requires loading the Windows DLL file jacob-1.19-x64.dll. That is why the Lutèce site using this plugin must be deployed on a Windows server.
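The Windows requirement above can be checked before the DLL is needed. The following is only a minimal sketch, not the plugin's own loading code: it assumes the DLL location is passed to Jacob through the jacob.dll.path system property (the property Jacob's LibraryLoader looks up), and the class name JacobDllSketch and the default directory "." are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class JacobDllSketch {

    // DLL file name taken from the text above.
    static final String JACOB_DLL_NAME = "jacob-1.19-x64.dll";

    // System property that Jacob's LibraryLoader checks for an explicit DLL location.
    static final String JACOB_DLL_PATH_PROPERTY = "jacob.dll.path";

    /** Returns true when the given os.name value denotes a Windows system. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Builds the full path of the Jacob DLL inside the given directory. */
    static Path jacobDllPath(String dllDirectory) {
        return Paths.get(dllDirectory, JACOB_DLL_NAME);
    }

    public static void main(String[] args) {
        // Hypothetical directory: pass the real one as the first argument.
        String dllDirectory = args.length > 0 ? args[0] : ".";

        if (!isWindows(System.getProperty("os.name"))) {
            System.out.println("Not a Windows host: the Jacob DLL cannot be loaded here.");
            return;
        }

        // Assumption: the DLL is handed to Jacob through this system property.
        String fullPath = jacobDllPath(dllDirectory).toAbsolutePath().toString();
        System.setProperty(JACOB_DLL_PATH_PROPERTY, fullPath);
        System.out.println(JACOB_DLL_PATH_PROPERTY + " = " + fullPath);
    }
}
```

Failing early with a clear message is usually easier to diagnose than an UnsatisfiedLinkError raised later, when the first OCR request arrives.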

How to use it

The plugin contains a single Spring bean, OcrService, which offers a "proceed" method to launch the OCR and retrieve the results.

/**
 * Perform OCR with A2iA.
 *
 * @param byteImageContent
 *            image to process
 * @param strFileExtension
 *            image extension : values allowed : Tiff, Bmp, Jpeg
 * @param strDocumentType
 *            document type : values allowed : Rib, TaxAssessment, Identity
 * @return Map result of OCR
 * @throws OcrException
 *             the OcrException
 */
public Map<String, String> proceed( byte[] byteImageContent, String strFileExtension, String strDocumentType ) throws OcrException
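A call to "proceed" might look like the sketch below. To keep the example self-contained, OcrService and OcrException are declared here as local stand-ins that mirror the documented signature; in a real Lutèce site they come from the plugin, and the bean is obtained from the Spring context. The helper names toA2iaExtension and readRib, the fake service, and the keys it returns are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class OcrUsageSketch {

    /** Local stand-in for the plugin's OcrException (illustration only). */
    static class OcrException extends Exception {
        OcrException(String message) {
            super(message);
        }
    }

    /** Local stand-in mirroring the documented "proceed" signature. */
    interface OcrService {
        Map<String, String> proceed(byte[] byteImageContent, String strFileExtension,
                String strDocumentType) throws OcrException;
    }

    /** Maps a file name to one of the documented extension values: Tiff, Bmp, Jpeg. */
    static String toA2iaExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".tif") || lower.endsWith(".tiff")) {
            return "Tiff";
        }
        if (lower.endsWith(".bmp")) {
            return "Bmp";
        }
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) {
            return "Jpeg";
        }
        throw new IllegalArgumentException("Unsupported image type: " + fileName);
    }

    /** Runs the OCR on a RIB image, using the documented "Rib" document type. */
    static Map<String, String> readRib(OcrService service, byte[] image, String fileName)
            throws OcrException {
        return service.proceed(image, toA2iaExtension(fileName), "Rib");
    }

    public static void main(String[] args) throws OcrException {
        // Fake service that echoes its arguments; the real bean would call A2iA.
        OcrService fakeService = (content, extension, documentType) -> {
            Map<String, String> result = new HashMap<>();
            result.put("extension", extension);       // hypothetical key
            result.put("documentType", documentType); // hypothetical key
            return result;
        };

        Map<String, String> result = readRib(fakeService, new byte[] { 1, 2, 3 }, "rib-scan.JPG");
        result.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With the real service, the keys of the returned map are the ones configured through the ocra2ia.result.* properties.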

File ocra2ia.properties description

  • ocra2ia.jacob.dll : path to the directory that contains the jacob-1.19-x64.dll file.
  • ocra2ia.activex.clsid : CLSID of the A2iA ActiveX component. To find it, open the Windows Registry Editor and go to Computer\HKEY_CLASSES_ROOT\A2iAMobilityCOM.APIMobility64\CLSID.
  • ocra2ia.server.host : host of the A2iA server. Must be empty for localhost (Lutèce site and A2iA server on the same machine).
  • ocra2ia.server.port : port used to reach a remote A2iA server. Must be empty for localhost (Lutèce site and A2iA server on the same machine).
  • ocra2ia.param.dir : path to the A2iA param directory.
  • ocra2ia.document.rib : value for document type RIB.
  • ocra2ia.document.tax : value for document type Tax Assessment.
  • ocra2ia.document.identity : value for document type Identity card.
  • ocra2ia.extension.file : allowed file extensions (must always be equal to Tiff,Bmp,Jpeg).
  • ocra2ia.tbl.* : path to tbl document corresponding to the document type.
  • ocra2ia.result.* : key corresponding to the result of the ocr for a field.
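As an illustration of how these keys fit together, the sketch below loads a small sample configuration with java.util.Properties and applies two rules stated above: an empty host and port mean a local A2iA server, and the allowed extensions are Tiff, Bmp and Jpeg. The sample values (the directory path and the CLSID placeholder) and the class and method names are hypothetical; this is not the plugin's own configuration code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class Ocra2iaConfigSketch {

    // Sample content for ocra2ia.properties; the path and CLSID are placeholders.
    static final String SAMPLE = String.join("\n",
            "ocra2ia.jacob.dll=C:/a2ia/jacob",
            "ocra2ia.activex.clsid=REPLACE-WITH-CLSID-FROM-REGISTRY",
            "ocra2ia.server.host=",
            "ocra2ia.server.port=",
            "ocra2ia.extension.file=Tiff,Bmp,Jpeg");

    /** Parses properties text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Empty host and port mean the A2iA server runs on the same machine. */
    static boolean isLocalServer(Properties props) {
        String host = props.getProperty("ocra2ia.server.host", "").trim();
        String port = props.getProperty("ocra2ia.server.port", "").trim();
        return host.isEmpty() && port.isEmpty();
    }

    /** Checks an extension against the comma-separated ocra2ia.extension.file list. */
    static boolean isExtensionAllowed(Properties props, String extension) {
        String allowed = props.getProperty("ocra2ia.extension.file", "");
        for (String candidate : allowed.split(",")) {
            if (candidate.trim().equals(extension)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        System.out.println("Local A2iA server: " + isLocalServer(props));
        System.out.println("Jpeg allowed: " + isExtensionAllowed(props, "Jpeg"));
        System.out.println("Png allowed: " + isExtensionAllowed(props, "Png"));
    }
}
```

Forward slashes are used in the sample path because a backslash is an escape character in .properties files and would have to be doubled.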