An Experimental Comparison between Deep Learning and Classical Machine Learning Approaches for Writer Identification in Medieval Documents
Abstract
1. Introduction
2. Related Work
3. Layout Features for Writer Identification
4. Deep Learning for Writer Identification
5. The Final Page Classification Step
6. Experimental Comparisons
6.1. The Avila Dataset
6.2. Comparing the Performances of the Page Classification Systems
6.3. Testing the Reject Option
7. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
Layer | Type | Input Size | Output Size | Kernel Size | Rate |
---|---|---|---|---|---|
1 | Fully connected | 2048 | | | |
2 | Dropout | 2048 | 2048 | | |
3 | Fully connected | 2048 | | | |
4 | Softmax | | | | |

Model | Input Size | Number of Parameters |
---|---|---|
VGG19 | 512 | 5,259,264 |
ResNet50 | 2048 | 8,404,992 |
InceptionV3 | 2048 | 8,404,992 |
InceptionResNetV2 | 1536 | 7,356,416 |
NASNetLarge | 4032 | 12,468,224 |
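
The counts in the table above can be reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions that are not stated in the surviving text: the classifier head is taken to be feature vector -> 2048 (fully connected) -> dropout -> 2048 (fully connected) -> 8-way softmax layer, with bias terms not counted. The names `FEATURE_SIZES`, `HIDDEN_UNITS`, `NUM_CLASSES`, and `head_weight_count` are hypothetical and chosen for this example.

```python
# Feature-vector size produced by each backbone, as listed in the table.
FEATURE_SIZES = {
    "VGG19": 512,
    "ResNet50": 2048,
    "InceptionV3": 2048,
    "InceptionResNetV2": 1536,
    "NASNetLarge": 4032,
}

HIDDEN_UNITS = 2048  # width of the fully connected layers (from the layer table)
NUM_CLASSES = 8      # assumption: one output per writer (A, D, E, F, G, H, I, X)


def head_weight_count(feature_size, hidden=HIDDEN_UNITS, classes=NUM_CLASSES):
    """Count weights, without biases, in the assumed head:
    feature_size -> hidden -> hidden -> classes."""
    return feature_size * hidden + hidden * hidden + hidden * classes


counts = {name: head_weight_count(size) for name, size in FEATURE_SIZES.items()}

for name, n in counts.items():
    print(f"{name}: {n:,}")
```

Under these assumptions the computed values coincide with the five figures listed in the table, which suggests (but does not prove) that the third column reports the weights of the trainable classifier head rather than of the whole network.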

Model | Measure | Writer A | Writer D | Writer E | Writer F | Writer G | Writer H | Writer I | Writer X | Global Performance |
---|---|---|---|---|---|---|---|---|---|---|
LF-DT | Acc | 0.8811 | 0.9765 | 0.9695 | 0.8987 | 0.9730 | 0.9678 | 0.9872 | 0.9296 | 0.8298 |
LF-DT | F1 | 0.8549 | 0.7111 | 0.8903 | 0.8026 | 0.7541 | 0.7273 | 0.9278 | 0.5495 | 0.8199 |
LF-RF | Acc | 0.9237 | 0.5200 | 0.9377 | 0.8366 | 0.7000 | 0.7429 | 0.9400 | 0.8788 | 0.8604 |
LF-RF | F1 | 0.8647 | 0.6667 | 0.9211 | 0.8421 | 0.7200 | 0.8254 | 0.9592 | 0.6105 | 0.8440 |
LF-MLP | Acc | 0.7867 | 0.9505 | 0.9408 | 0.7880 | 0.9427 | 0.9486 | 0.9809 | 0.9165 | 0.7071 |
LF-MLP | F1 | 0.7573 | 0.1429 | 0.8079 | 0.6331 | 0.3913 | 0.5455 | 0.9143 | 0.5333 | 0.6830 |
DL-VGG19 | Acc | 0.9281 | 0.9474 | 0.5915 | 0.8947 | 0.7500 | 0.9655 | 0.6875 | 0.3125 | 0.8315 |
DL-VGG19 | F1 | 0.9264 | 0.6667 | 0.6412 | 0.8635 | 0.6316 | 0.9655 | 0.7674 | 0.4167 | 0.8274 |
DL-ResNet50 | Acc | 0.9065 | 1.0000 | 0.7324 | 0.7961 | 0.7083 | 0.8966 | 0.5000 | 0.9375 | 0.8285 |
DL-ResNet50 | F1 | 0.9097 | 0.6230 | 0.7879 | 0.8203 | 0.5965 | 0.9286 | 0.6154 | 0.8219 | 0.8307 |
DL-InceptionV3 | Acc | 0.9065 | 1.0000 | 0.8732 | 0.8289 | 0.9167 | 0.9655 | 0.9167 | 0.9688 | 0.8943 |
DL-InceptionV3 | F1 | 0.9351 | 0.8837 | 0.9254 | 0.8571 | 0.7458 | 0.9655 | 0.8544 | 0.8158 | 0.8970 |
DL-InceptionResNetV2 | Acc | 0.9928 | 1.0000 | 0.9859 | 0.9145 | 0.9583 | 0.9655 | 0.9167 | 0.9688 | 0.9648 |
DL-InceptionResNetV2 | F1 | 0.9910 | 0.9048 | 0.9859 | 0.9521 | 0.9388 | 0.9655 | 0.9462 | 0.8493 | 0.9656 |
DL-NASNetLarge | Acc | 0.9784 | 1.0000 | 0.8873 | 0.8816 | 0.9583 | 0.9655 | 0.8750 | 0.9688 | 0.9372 |
DL-NASNetLarge | F1 | 0.9680 | 0.9500 | 0.9265 | 0.9241 | 0.8364 | 0.9655 | 0.9130 | 0.8493 | 0.9379 |
Number of pages per class | | 278 | 19 | 71 | 152 | 24 | 29 | 48 | 32 | 653 |
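
As a rough illustration of how the per-writer and global figures may relate, the sketch below recovers the number of correctly classified pages per writer from the per-writer accuracy and the page counts, then divides by the total of 653 pages. This is an assumption about how the global accuracy was obtained, not a description taken from the text, and it is shown only for two of the DL rows. The names `PAGES`, `PER_WRITER_ACC`, and `global_accuracy` are hypothetical.

```python
# Pages per writer, copied from the last row of the results table.
PAGES = {"A": 278, "D": 19, "E": 71, "F": 152, "G": 24, "H": 29, "I": 48, "X": 32}

# Per-writer accuracy for two deep learning models, copied from the table.
PER_WRITER_ACC = {
    "DL-VGG19": {
        "A": 0.9281, "D": 0.9474, "E": 0.5915, "F": 0.8947,
        "G": 0.7500, "H": 0.9655, "I": 0.6875, "X": 0.3125,
    },
    "DL-InceptionResNetV2": {
        "A": 0.9928, "D": 1.0000, "E": 0.9859, "F": 0.9145,
        "G": 0.9583, "H": 0.9655, "I": 0.9167, "X": 0.9688,
    },
}


def global_accuracy(per_writer_acc, pages=PAGES):
    """Page-weighted accuracy: recover whole-page counts of correct
    classifications per writer, then divide by the total page count."""
    correct = sum(round(per_writer_acc[w] * n) for w, n in pages.items())
    return correct / sum(pages.values())


results = {model: global_accuracy(acc) for model, acc in PER_WRITER_ACC.items()}

for model, value in results.items():
    print(f"{model}: {value:.4f}")
```

For these two rows the result agrees with the Global Performance column to four decimals (0.8315 and 0.9648). The LF rows do not appear to follow the same relationship (for example, 0.5200 of 19 pages is not a whole number of pages), so they may have been computed on a different basis.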
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Citation: Cilia, N.D.; De Stefano, C.; Fontanella, F.; Marrocco, C.; Molinara, M.; Scotto di Freca, A. An Experimental Comparison between Deep Learning and Classical Machine Learning Approaches for Writer Identification in Medieval Documents. J. Imaging 2020, 6, 89. https://doi.org/10.3390/jimaging6090089