On the Development of the Hellenic Digital Library of Arabic Historical Sources: A Framework for Digital Scholarship in the Humanities
Abstract
1. Introduction
- Scholarly acceleration. By replacing months of archival travel with seconds of browser search, the platform shortens the research cycle and widens participation in Arabic studies;
- Pedagogical enrichment. Multilingual metadata will power bilingual lesson plans, enabling Greek and international classrooms to compare sources across languages;
- Civic dialogue. Open, verifiable evidence of past cultural exchange counters intolerance by showing that coexistence and mutual influence have deep historical roots.
2. Scholarly Objectives and Relevance
2.1. Fragmented Access and Its Consequences
2.2. Greece at the Crossroads
2.3. Scenarios
2.4. Relevance for Byzantium, Ottoman Studies, and Global Trade
2.5. Catalysing New Teaching and Public-Engagement Formats
3. Theoretical Framework and Design Principles
4. Methodological Implementation
- Scholarly significance—the frequency of citation in recent research, representation in university syllabi, and appearance in reference bibliographies.
- Rarity—the absence of modern critical editions or restricted physical access (e.g., single-copy manuscripts).
- Physical condition—items at risk from ink corrosion or brittle paper receive priority, provided conservation labs approve safe handling.
- Consulting Middle Eastern repositories through partner liaisons who can access non-public inventories.
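As a rough illustration of how the three selection criteria listed above (scholarly significance, rarity, physical condition) might be combined into a single ranking, the sketch below uses a weighted sum. The weights, the 0 to 1 scales, and the names `Candidate`, `WEIGHTS`, and `selection_score` are assumptions made for this example and are not taken from the project; the only rule drawn from the text is that at-risk items gain priority only when a conservation lab has approved handling.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    significance: float      # 0..1, e.g. derived from citations and syllabi (assumed scale)
    rarity: float            # 0..1, higher when no critical edition exists (assumed scale)
    condition_risk: float    # 0..1, higher for ink corrosion or brittle paper (assumed scale)
    handling_approved: bool  # conservation lab sign-off


# Hypothetical weights; a real project would set these through its governance process.
WEIGHTS = {"significance": 0.5, "rarity": 0.3, "condition_risk": 0.2}


def selection_score(c: Candidate) -> float:
    """Weighted sum of the criteria; condition counts only if handling is approved."""
    condition = c.condition_risk if c.handling_approved else 0.0
    score = (
        WEIGHTS["significance"] * c.significance
        + WEIGHTS["rarity"] * c.rarity
        + WEIGHTS["condition_risk"] * condition
    )
    return round(score, 3)


def rank(candidates):
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=selection_score, reverse=True)
```

Keeping the weights in one dictionary makes the scoring policy easy to review and revise without touching the ranking logic.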
Selection Scoring and Governance
5. Legal and Ethical Considerations
5.1. Risk Zone A: Post-1925 Editions Still Under Copyright
5.2. Risk Zone B: Unpublished Manuscripts with Contested Ownership
5.3. Risk Zone C: Culturally Sensitive Content
5.4. Rights–Clearance Workflow
- Initial rights survey: For every item on the short-list, the rights officer records publication date, author death date, editor death date, and current publisher or repository. A traffic-light flag (green for public domain, amber for uncertain, red for likely in-copyright) is stored in the union catalogue.
- Stakeholder contact: Amber and red items trigger template letters—available in Arabic, Greek, English, and French—sent to publishers, heirs, or archive directors. The letter explains the non-profit scope, the intended Creative Commons licence, and the technical safeguards against commercial exploitation.
- Negotiation and documentation: Where consent is granted, a Memorandum of Understanding specifies resolution, format, attribution wording, and revocation procedures. Where no response comes after 90 days, the dossier moves to fallback strategies: restricted images or metadata-only display.
- Repository update: Clearance outcomes propagate automatically to the CMS. Only green-lit assets pass to the digitisation queue; embargoed or partial-access items receive a manifest with a prominent “rights pending” icon and a summary note for users.
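The workflow above can be sketched in a few lines of Python. This is a minimal illustration, not the project's rights engine: the life-plus-70-years rule is an assumption standing in for a proper legal assessment, and the names `RightsRecord`, `traffic_light`, and `next_step` are hypothetical. The recorded fields, the three flag colours, and the 90-day fallback come from the steps described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

TERM_YEARS = 70                         # assumption: protection lasts life + 70 years
RESPONSE_WINDOW = timedelta(days=90)    # from the negotiation step


@dataclass
class RightsRecord:
    publication_year: Optional[int]
    author_death_year: Optional[int]
    editor_death_year: Optional[int]
    holder: str                         # current publisher or repository


def traffic_light(rec: RightsRecord, current_year: int) -> str:
    """Green = public domain, amber = uncertain, red = likely in copyright."""
    deaths = (rec.author_death_year, rec.editor_death_year)
    if rec.publication_year is None or any(d is None for d in deaths):
        return "amber"                  # missing data means the status is uncertain
    if current_year - max(deaths) > TERM_YEARS:
        return "green"
    return "red"


def next_step(flag: str, letter_sent: date, today: date, consent: bool) -> str:
    """Decide what happens to an item after the initial rights survey."""
    if flag == "green":
        return "digitisation queue"
    if consent:
        return "draft Memorandum of Understanding"
    if today - letter_sent > RESPONSE_WINDOW:
        return "fallback: restricted images or metadata-only display"
    return "await reply"
```

Treating any missing date as amber errs on the side of caution: an item is never flagged green on incomplete evidence.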
5.5. Licensing Strategy
5.6. GDPR Compliance
- Lawfulness and consent: Employment contracts, volunteer agreements, and mailing-list opt-ins include explicit clauses on data storage duration and access scope;
- Purpose limitation and minimisation: Only data essential for project delivery (e.g., email for GitHub (https://github.com) access, IP addresses for security logs) are retained;
- Storage limitation: Logs older than 180 days are anonymised; contributor contact details are deleted five years after final release unless renewed consent is given;
- Integrity and confidentiality: All databases use field-level encryption for personally identifiable information; daily backups are encrypted at rest;
- Accountability: A Data Protection Impact Assessment (DPIA) is filed with the Greek Data Protection Authority, and the project appoints a part-time Data Protection Officer.
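The storage-limitation rule lends itself to a small scheduled job. The sketch below is one possible shape for it, assuming log entries and contacts are plain dictionaries with `timestamp`, `ip`, and `renewed_consent` keys; those field names and the function names are hypothetical. Only the 180-day and five-year thresholds come from the text, and blanking the IP address is a simplified stand-in for whatever anonymisation method the project adopts.

```python
from datetime import datetime, timedelta

LOG_RETENTION = timedelta(days=180)   # logs older than this are anonymised
CONTACT_RETENTION_YEARS = 5           # contacts kept this long after final release


def anonymise_old_logs(logs, now):
    """Return copies of the log entries, with the IP blanked on entries past retention."""
    cutoff = now - LOG_RETENTION
    cleaned = []
    for entry in logs:
        entry = dict(entry)           # copy so the caller's data is not mutated
        if entry["timestamp"] < cutoff:
            entry["ip"] = None
        cleaned.append(entry)
    return cleaned


def contact_deadline(final_release):
    """Date on which contact details become due for deletion."""
    target_year = final_release.year + CONTACT_RETENTION_YEARS
    try:
        return final_release.replace(year=target_year)
    except ValueError:                # 29 February with a non-leap target year
        return final_release.replace(year=target_year, day=28)


def contacts_to_delete(contacts, final_release, now):
    """Contacts past the retention window that have not renewed consent."""
    if now < contact_deadline(final_release):
        return []
    return [c for c in contacts if not c.get("renewed_consent", False)]
```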
5.7. Cultural Sensitivity and the CARE Principles
- Collective Benefit: Scans are repatriated digitally to partner institutions in Egypt, Syria, and Morocco, giving local scholars free, immediate access;
- Authority to Control: Source repositories and, where appropriate, descendant communities can request takedown or licence adjustments through a documented governance channel;
- Responsibility: Attribution chains record both the holding library and the originating cultural group, acknowledging layered custodianship;
- Ethics: The sensitivity flagging system ensures that sacred or vulnerable narratives are not exposed without consultation.
6. Expected Academic and Societal Impact
6.1. Accelerating Academic Research
6.2. Pedagogical Transformation
6.3. Societal Benefits and Public Discourse
6.4. Illustrative Scenarios
- A Greek high school student is assigned a project on the Crusades. Using the HDB-AHS, she accesses both Arabic and Greek accounts of the siege of Acre, compares the different perspectives, and presents her findings on cross-cultural understanding to her class.
- A postgraduate researcher in economic history downloads a dataset of grain prices from fourteenth-century Arabic chronicles and integrates it with Venetian customs records. This allows him to test new hypotheses about Mediterranean market integration.
- A community museum curator in Thessaloniki designs an interactive exhibit using IIIF manifests from the HDB-AHS. Visitors explore digitized Arabic travelogues and Greek port records side by side, learning how trade and migration shaped the city’s diverse identity over centuries.
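The curator scenario depends on IIIF manifests that any compliant viewer can load. As a hedged sketch of what a minimal IIIF Presentation 3.0 manifest for one item could contain, the function below assembles the required structure as a Python dictionary. The URL layout, the `build_manifest` name, and the JPEG format are assumptions for illustration; the HDB-AHS has not published its manifest design.

```python
import json


def build_manifest(base_url, item_id, labels, pages):
    """Build a minimal IIIF Presentation 3.0 manifest.

    labels: dict mapping a language code to a title, e.g. {"el": ..., "ar": ..., "en": ...}
    pages:  list of (image_url, width, height) tuples, one per page image
    """
    canvases = []
    for n, (image_url, width, height) in enumerate(pages, start=1):
        canvas_id = f"{base_url}/iiif/{item_id}/canvas/{n}"  # hypothetical URL pattern
        canvases.append({
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/page/image",
                    "type": "Annotation",
                    "motivation": "painting",   # the image is the canvas content
                    "body": {
                        "id": image_url,
                        "type": "Image",
                        "format": "image/jpeg",  # assumed image format
                        "width": width,
                        "height": height,
                    },
                    "target": canvas_id,
                }],
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{base_url}/iiif/{item_id}/manifest",
        "type": "Manifest",
        "label": {lang: [text] for lang, text in labels.items()},
        "items": canvases,
    }


if __name__ == "__main__":
    demo = build_manifest(
        "https://example.org", "ms-0001",
        {"en": "Sample travelogue"},
        [("https://example.org/images/ms-0001/p1.jpg", 2000, 3000)],
    )
    print(json.dumps(demo, indent=2, ensure_ascii=False))
```

Because labels are keyed by language code, the same manifest can carry Greek, Arabic, and English titles side by side.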
6.5. Catalysing Downstream Projects
- Topic modelling: Once thousands of pages are tokenised, latent Dirichlet allocation (LDA) can surface thematic clusters (plague outbreaks, military taxation, pilgrimage logistics), guiding historians to research questions they might not otherwise have framed.
- Geographic network analysis: Place-name extraction plus GeoNames coordinates enable the construction of trade and communication networks visualised in Gephi (https://gephi.org/). Scholars will plot shifting hubs from Umayyad Damascus to Ottoman Constantinople, exposing macro-patterns invisible in isolated studies.
- Machine translation training: Parallel passages between Arabic originals and Greek paraphrases supply high-quality sentence pairs for transformer models, potentially boosting low-resource Arabic-Greek machine translation.
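To make the geographic network idea concrete, the sketch below turns place names that co-occur in the same passage into weighted edges and writes them as a CSV edge list with `Source`, `Target`, and `Weight` columns, which Gephi's spreadsheet importer accepts. It assumes place-name extraction has already happened upstream; the function names and the co-occurrence heuristic are illustrative choices, not the project's stated method.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(passages):
    """Count how often each pair of places appears in the same passage.

    passages: iterable of lists of place names, one list per passage.
    Returns a Counter keyed by alphabetically ordered (place_a, place_b) pairs.
    """
    edges = Counter()
    for places in passages:
        # set() ignores repeated mentions; sorted() keeps pair order stable
        for a, b in combinations(sorted(set(places)), 2):
            edges[(a, b)] += 1
    return edges


def to_gephi_csv(edges):
    """Serialise edges as a CSV edge list (Source, Target, Weight)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        ["Damascus", "Constantinople"],
        ["Damascus", "Constantinople", "Cairo"],
    ]
    print(to_gephi_csv(cooccurrence_edges(sample)))
```

Coordinates from GeoNames would be attached in a separate node table keyed by the same place names, so the edge list stays independent of the gazetteer lookup.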
6.6. Boosting Greek Capacity and Regional Leadership
- Ottoman Turkish manuscripts: hosted on the same backend infrastructure, with scripts and OCR models tuned to the Arabic-derived Ottoman hand.
- Syriac Christian chronicles: forming a Semitic-language cluster that complements Arabic records and showcases Greece as a Mediterranean digital-heritage hub.
7. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Acronym | Term | How we use it |
---|---|---|
FAIR | Findable, Accessible, Interoperable, Reusable | Primary technical framework guiding identifiers, metadata, and APIs. |
CARE | Collective Benefit, Authority to Control, Responsibility, Ethics | Governs community authority and access to sensitive materials. |
IIIF | International Image Interoperability Framework | We will publish Presentation 3.0 manifests and image services. |
OCR | Optical Character Recognition | We will benchmark tools (e.g., Tesseract for print; Kraken for handwriting) before selection. |
TEI | Text Encoding Initiative | We will use a minimal header and selected modules for item-level description. |
DOI | Digital Object Identifier | Persistent citation ID for items and releases. |
ARK | Archival Resource Key | Internal persistent ID; resolves to the same landing page as the DOI. |
API | Application Programming Interface | Public endpoints (IIIF; planned read APIs) to access data and metadata. |
Notes
1 | Authors’ note on the project stage: The HDB-AHS is currently in the pre-implementation stage. The team has completed corpus identification and designed the implementation phases and work-packages; the consortium and funding are being finalised. Technical activities (digitisation, OCR, platform deployment) will begin upon award, following a 36-month plan. This article should be read as a blueprint and impact case, not as a report of completed implementation. |
2 | https://www.europeana.eu/ (accessed on 5 May 2025). |
3 | https://developers.wellcomecollection.org/docs/iiif (accessed on 5 May 2025). |
4 | https://ymdi.uoregon.edu (accessed on 5 May 2025). |
5 | https://viaf.org/en (accessed on 5 May 2025). |
6 | https://www.geonames.org/ (accessed on 5 May 2025). |
7 | http://wamcp.bibalex.org/ (accessed on 5 May 2025). |
8 | https://www.ekt.gr/en/index (accessed on 5 May 2025). |
9 | |
10 | |
11 | https://www.loc.gov/item/2021666192/ (accessed on 5 May 2025). |
12 | https://pro.europeana.eu/page/linked-open-data-faq (accessed on 5 May 2025). |
13 | https://voyant-tools.org/ (accessed on 5 May 2025). |
14 | https://recogito.pelagios.org (accessed on 5 May 2025). |
15 | http://symogih.org/?q=node/78&lang=en (accessed on 5 May 2025). |
16 | https://projectmirador.org/ (accessed on 5 May 2025). |
17 | https://universalviewer.io/ (accessed on 5 May 2025). |
18 | https://www.levelaccess.com/blo (accessed on 5 May 2025). |
19 | https://library.parliament.gr/ (accessed on 5 May 2025). |
20 | |
21 | https://bib.hwg-lu.de/en/search-and-find/catalogues (accessed on 5 May 2025). |
22 | https://www.bl.uk/ (accessed on 5 May 2025). |
23 | https://github.com/tesseract-ocr/tesseract (accessed on 5 May 2025). |
24 | https://kraken.re/main/index.html (accessed on 5 May 2025). |
25 | https://pandas.pydata.or (accessed on 5 May 2025). |
Project | Primary Scope | Focus/Holdings | Content Type | Standards/Access | What HDB-AHS Adds |
---|---|---|---|---|---|
Qatar Digital Library (QDL) | Gulf history and Arabic-language heritage | British Library and partners | Digitised manuscripts, archives, maps | IIIF viewer and rich metadata; bilingual interface | Greece-anchored aggregation across Greek repositories; trilingual (GR/AR/EN); selection scoring + DOI/ARK policy; teaching kits for Greek curricula |
Digital Muṣḥaf Project | Qur’anic manuscripts | Specialist collections | High-quality facsimiles and codicological data | Public viewing emphasis | Broader corpus beyond Qur’an; programmatic access plan (IIIF v3 + APIs); integration with Greek catalogues |
Open Islamicate Texts Initiative (OpenITI) | Machine-readable texts (Arabic, Persian, etc.) | Large text corpora | OCR/edited text, not page images | Text formats and NLP pipelines | Image-first library with IIIF manifests; paired text where feasible; OCR benchmarking protocol |
Fihrist (union catalogue, UK) | Islamic manuscripts catalogue | Multiple UK libraries | Descriptive records (little or no images) | Catalogue aggregation | Union catalogue plus images, manifests, and persistent IDs; Greece-based holdings |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Karageorgoudis, E.; Papakostas, C.; Lianos Liantis, E.; Miotto, M. On the Development of the Hellenic Digital Library of Arabic Historical Sources: A Framework for Digital Scholarship in the Humanities. Heritage 2025, 8, 330. https://doi.org/10.3390/heritage8080330