LE QUY DON
Technical University

Contract metadata identification in Czech scanned documents

Ha, H.T. and Horák, A. and Bui, M.T. (2021) Contract metadata identification in Czech scanned documents. In: 13th International Conference on Agents and Artificial Intelligence, ICAART 2021, 4 February 2021 through 6 February 2021.

Full text not available from this repository.

Abstract

Although born-digital documents are now generally prevalent, the exchange of business documents often consists of processing their scanned image form, a general human-readable format with one-to-one correspondence to paper documents. Bulk processing of such scanned documents then requires human intervention to extract and enter the main document metadata. In this paper, we present the design and evaluation of a contract processing module in the OCRMiner system. The information extraction process combines layout properties with text analysis as input to a rule-based extraction with confidence score propagation. The first results are evaluated on public Czech contract documents, reaching an item extraction accuracy of almost 88%. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculties > Faculty of Special Equipments
Uncontrolled Keywords: Artificial intelligence; Character recognition; Copying; Metadata; Business documents; Confidence score; Contract document; Design and evaluations; Document metadatas; Human intervention; Item extraction; Processing modules; Electronic document exchange
Additional Information: Conference code: 167493. Language of original document: English.
URI: http://eprints.lqdtu.edu.vn/id/eprint/8805
