DocCreator: A New Software for Creating Synthetic Ground-Truthed Document Images
AbstractMost digital libraries that provide user-friendly interfaces, enabling quick and intuitive access to their resources, are based on Document Image Analysis and Recognition (DIAR) methods. Such DIAR methods need ground-truthed document images to be evaluated/compared and, in some cases, trained. Especially with the advent of deep learning-based approaches, the required size of annotated document datasets seems to be ever-growing. Manually annotating real documents has many drawbacks, which often leads to small reliably annotated datasets. In order to circumvent those drawbacks and enable the generation of massive ground-truthed data with high variability, we present DocCreator, a multi-platform and open-source software able to create many synthetic image documents with controlled ground truth. DocCreator has been used in various experiments, showing the interest of using such synthetic images to enrich the training stage of DIAR tools. View Full-Text
Scifeed alert for new publicationsNever miss any articles matching your research from any publisher
- Get alerts for new papers matching your research
- Find out the new papers from selected authors
- Updated daily for 49'000+ journals and 6000+ publishers
- Define your Scifeed now
Journet, N.; Visani, M.; Mansencal, B.; Van-Cuong, K.; Billy, A. DocCreator: A New Software for Creating Synthetic Ground-Truthed Document Images. J. Imaging 2017, 3, 62.
Journet N, Visani M, Mansencal B, Van-Cuong K, Billy A. DocCreator: A New Software for Creating Synthetic Ground-Truthed Document Images. Journal of Imaging. 2017; 3(4):62.Chicago/Turabian Style
Journet, Nicholas; Visani, Muriel; Mansencal, Boris; Van-Cuong, Kieu; Billy, Antoine. 2017. "DocCreator: A New Software for Creating Synthetic Ground-Truthed Document Images." J. Imaging 3, no. 4: 62.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers. See further details here.