In this era of digitization, most hardcopy documents are being transformed into digital formats. In the process of transformation, large quantities of documents are stored and preserved through electronic scanning. These documents are available from various sources such as ancient documentation, old legal records, medical reports, music scores, palm leaf, and reports on security-related issues. In particular, ancient and historical documents are hard to read due to their degradation in terms of low contrast and existence of corrupted artefacts. In recent times, degraded document binarization has been studied widely and several approaches were developed to deal with issues and challenges in document binarization. In this paper, a comprehensive review is conducted on the issues and challenges faced during the image binarization process, followed by insights on various methods used for image binarization. This paper also discusses the advanced methods used for the enhancement of degraded documents that improves the quality of documents during the binarization process. Further discussions are made on the effectiveness and robustness of existing methods, and there is still a scope to develop a hybrid approach that can deal with degraded document binarization more effectively.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited