Enhancing Discoverability: A Metadata Framework for Empirical Research in Theses
Abstract
1. Introduction
1.1. The Problem: A Hidden Empirical Investigation on Youth Unemployment
- The repository only stored basic metadata (title, author, date);
- No fields indicated that her study used interviews, focused on rural areas, or involved qualitative analysis;
- The search engine could not differentiate her work from hundreds of unrelated theses.
1.2. The Solution: A Hybrid Metadata Framework
- Research Strategy: Qualitative.
- Data Collection Method: Semi-structured interviews.
- Population: Unemployed youth (ages 18–30).
- Geographic Coverage: Rural Greece.
- A student researching qualitative methods could now filter by methodology;
- A policymaker looking for data on rural youth unemployment would find Maria’s thesis in seconds;
- The thesis becomes part of a discoverable, structured academic network.
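As a rough illustration of how such structured fields make a thesis findable, the following sketch filters a small in-memory collection by metadata values. The `ThesisRecord` class, the `filter_records` helper, the placeholder titles, and the second record are all hypothetical and exist only for this example; the prototype's actual search implementation is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThesisRecord:
    """Hypothetical record holding the four descriptors from the example above."""
    title: str
    research_strategy: str
    data_collection_method: str
    population: str
    geographic_coverage: str


def filter_records(records, **criteria):
    """Return the records whose fields equal every criterion (case-insensitive)."""
    def matches(record):
        return all(
            getattr(record, name).lower() == value.lower()
            for name, value in criteria.items()
        )
    return [record for record in records if matches(record)]


records = [
    ThesisRecord(
        title="Youth unemployment study (placeholder title)",
        research_strategy="Qualitative",
        data_collection_method="Semi-structured interviews",
        population="Unemployed youth (ages 18-30)",
        geographic_coverage="Rural Greece",
    ),
    # Invented contrast record, not taken from the study.
    ThesisRecord(
        title="Unrelated thesis (placeholder title)",
        research_strategy="Quantitative",
        data_collection_method="Questionnaire",
        population="University students",
        geographic_coverage="Athens",
    ),
]

qualitative_rural = filter_records(
    records, research_strategy="qualitative", geographic_coverage="rural greece"
)
print([record.title for record in qualitative_rural])
```

With only title, author, and date stored, neither criterion could be expressed as a query at all; the filter works only because the descriptors are separate fields.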
- We introduce a hybrid metadata framework that merges selected elements of Dublin Core and DDI, offering a user-friendly yet expressive structure for documenting empirical research in theses across methodological types.
- We address a well-documented gap in existing standards by proposing a model that is both granular and accessible, particularly for non-specialist users working with qualitative and mixed-methods research.
- We validate the framework through a structured analysis of actual student theses, which informed the refinement of metadata fields and ensured alignment with real documentation practices.
- We implement a web-based prototype using open-source technologies, enabling students and institutional staff to input structured metadata through an intuitive interface.
Novelty and Methodological Contribution
- Conducting a structured, thesis-specific content analysis to empirically inform field selection.
- Designing a hybrid model optimized for non-specialist academic contexts.
- Operationalizing the framework as a working prototype validated by real users.
- Addressing a documented but unresolved institutional need for empirical thesis metadata documentation.
2. Related Work
2.1. MARC
2.2. Dublin Core
2.3. Data Documentation Initiative (DDI)
2.4. Additional Metadata Standards in the Digital Library Ecosystem
2.5. Toward a Hybrid Model
3. Methodology and Design Process
3.1. Phase 1: Problem Identification and Gap Analysis
- Dublin Core is widely used in institutional repositories but lacks empirical research descriptors.
- DDI is robust but unsuitable for non-technical users and does not target theses.
- No metadata framework exists that is both empirically expressive and accessible for students or repository staff without specialist training.
3.2. Phase 2: Empirical Content Analysis of Theses
- Present in at least 70% (10 out of 14) of the theses, leaving the remaining fields optional;
- Cross-methodologically relevant (usable across qualitative, quantitative, and mixed methods);
- Essential for understanding empirical research design (e.g., sampling, data collection);
- Realistically identifiable and fillable by non-specialist users.
3.3. Phase 3: Hybrid Metadata Model Construction
- Part A—Research-Specific Fields (20 fields): Drawn from DDI, covering strategy, population, methodology, and data production.
- Part B—General Description Fields (13 fields): Based on Dublin Core and common thesis documentation fields (e.g., title, abstract, language).
3.4. Phase 4: Prototype Implementation
- Guided metadata entry via modular, type-aware forms.
- Export to XML for integration with institutional repositories.
- Basic filtering and search based on research method and metadata categories.
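The XML export step can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not the prototype's code: the root element `thesis` and the section names `dissertation` and `survey` are hypothetical, while the field names (`Title`, `typeOfDissertation`, `strategy`, `sampleProd`) follow Appendix D.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict) -> str:
    """Serialize a two-level metadata record (section -> field -> value) to XML."""
    root = ET.Element("thesis")  # hypothetical root element name
    for section, fields in record.items():
        section_el = ET.SubElement(root, section)
        for name, value in fields.items():
            # Each metadata field becomes one child element with text content.
            ET.SubElement(section_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


record = {
    "dissertation": {"Title": "Placeholder title", "typeOfDissertation": "graduate"},
    "survey": {"strategy": "qualitative", "sampleProd": "primary"},
}

xml_text = record_to_xml(record)
print(xml_text)
```

Keeping the general and empirical fields in separate sections mirrors the Part A/Part B split of the model and lets a repository ingest only the section it understands.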
3.5. Phase 5: Informal User-Centered Validation
4. Documentation Requirements for Theses
- Bibliographic theses (non-empirical).
- Theses with quantitative empirical research.
- Theses with qualitative empirical research.
- Theses with mixed-method empirical research.
4.1. Bibliographic Theses
4.2. Theses with Quantitative Research
4.3. Theses with Qualitative Research
4.4. Theses with Mixed-Method Research
5. Documentation Standards
5.1. The Data Documentation Initiative (DDI)
5.1.1. Overview
5.1.2. Structure
- docDscr: Description of the XML document.
- stdyDscr: Study-level metadata.
- fileDscr: Information about data files.
- dataDscr: Variable-level metadata.
- othMat: Supplementary materials.
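To make the five-part structure concrete, the sketch below builds an empty skeleton with one element per section listed above. It assumes the DDI Codebook root element name `codeBook` and omits namespaces, attributes, and all required child content, so the result is illustrative rather than a valid DDI instance.

```python
import xml.etree.ElementTree as ET

# The five top-level sections listed above, in document order.
DDI_SECTIONS = ["docDscr", "stdyDscr", "fileDscr", "dataDscr", "othMat"]


def ddi_skeleton() -> ET.Element:
    """Return an empty codebook element with one child per top-level section."""
    root = ET.Element("codeBook")  # assumed root element name, no namespace
    for tag in DDI_SECTIONS:
        ET.SubElement(root, tag)
    return root


skeleton = ddi_skeleton()
print([child.tag for child in skeleton])
```

Only `stdyDscr` is drawn on for the thesis model; the file-level and variable-level sections presuppose a deposited dataset, which most theses do not have.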
5.1.3. DDI Lite
5.1.4. DDI and Quantitative Research
5.2. Dublin Core
Overview
- Simple Dublin Core: Uses the original 15 core elements.
- Qualified Dublin Core: Adds three more elements (Audience, Provenance, Rights Holder) and allows element refinements (qualifiers) to support more accurate searches.
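The limits of the 15-element set for empirical research can be illustrated by partitioning a record into the fields Simple Dublin Core can carry and those it cannot. The element names below are the standard lowercase Dublin Core element names; the `partition_fields` helper and the sample record are hypothetical.

```python
# The 15 elements of Simple Dublin Core.
SIMPLE_DC = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# The three elements added by Qualified Dublin Core.
QUALIFIED_EXTRA = ("audience", "provenance", "rightsHolder")


def partition_fields(record: dict):
    """Split a record into fields expressible in Simple DC and the remainder."""
    simple = {k: v for k, v in record.items() if k in SIMPLE_DC}
    other = {k: v for k, v in record.items() if k not in SIMPLE_DC}
    return simple, other


record = {
    "title": "Placeholder title",
    "creator": "Placeholder author",
    "strategy": "qualitative",        # empirical field, no DC element
    "collectionMethod": "interview",  # empirical field, no DC element
}

simple, other = partition_fields(record)
print(sorted(simple), sorted(other))
```

The leftover fields could be squeezed into `description` as free text, but they would then no longer be separately searchable.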
5.3. Bridging to the Hybrid Model
- Adopting DDI’s core empirical fields (e.g., Research Strategy, Data Collection Method), prioritizing the most frequent ones, i.e., those that appeared in ≥70% of the analyzed theses;
- Preserving Dublin Core’s lightweight structure for general description.
6. Initial Metadata Model: Design and Investigation
6.1. The Empirical Investigation
6.2. Research Objective
6.3. Initial Metadata Template
6.4. Testing and Observations
- Some fields were too broad or vague to capture meaningful distinctions between research types.
- Other fields were frequently left incomplete because the relevant information was not present in the thesis text.
- Certain empirical features, especially in qualitative studies, were described inconsistently.
6.5. Research Purpose and Initial Form Design
- Dublin Core offers general information on theses.
- DDI can be adapted to cover essential elements of all types of empirical research (quantitative and qualitative).
- What are the essential elements of empirical research that must be included in the metadata form to ensure proper documentation?
Initial Metadata Entry Form (Prototype Documentation Model)
- Research strategy.
- Data source (primary/secondary).
- Time/space framework.
- Research questions.
- Hypotheses.
- Reference and collection dates.
- Countries, geographic coverage and units.
- Population and observation unit.
- Sample size and sampling method.
- Data collection and analysis methods.
- Recording and analytical tools.
6.6. Results of the Empirical Investigation
6.7. Table 7b—Final Selection of Fields
(a)
# | Field | Presence Count |
---|---|---|
1 | Title | 7 |
2 | Abstract | 7 |
3 | Purpose | 14 |
4 | Working Hypotheses | 12 |
5 | Other Sources | 3 |
6 | Research Problems | 2 |
7 | Ethics in Research | 5 |
(b)
# | Field | Presence Count |
---|---|---|
1 | Purpose | 14 |
2 | Working Hypotheses | 12 |
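The step from panel (a) to panel (b) can be reproduced with a short calculation. The sketch below applies the 70% rule to the presence counts listed in panel (a), using the 14 analyzed theses as the denominator; the function and variable names are illustrative.

```python
TOTAL_THESES = 14
THRESHOLD_PERCENT = 70

# Presence counts copied from panel (a).
presence_counts = {
    "Title": 7,
    "Abstract": 7,
    "Purpose": 14,
    "Working Hypotheses": 12,
    "Other Sources": 3,
    "Research Problems": 2,
    "Ethics in Research": 5,
}


def meets_threshold(count, total=TOTAL_THESES, percent=THRESHOLD_PERCENT):
    """True if count/total is at least percent%, using integer arithmetic."""
    return count * 100 >= percent * total


retained = [field for field, count in presence_counts.items() if meets_threshold(count)]
print(retained)  # ['Purpose', 'Working Hypotheses']
```

With 14 theses the rule is equivalent to requiring at least 10 occurrences, since 9 of 14 is about 64% and 10 of 14 is about 71%.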
6.7.1. Justification for the 70% Threshold
6.7.2. Balancing Usability and Empirical Coverage
6.7.3. Sensitivity Analysis of Threshold Selection
6.7.4. Proposed Documentation Schema
New Metadata Form
7. Discussion
7.1. Adaptation of the DDI Standard
7.2. Combining DDI and Dublin Core
8. Conclusions
Future Work
- Integrating the framework with institutional repository platforms (e.g., DSpace, EPrints) to enhance adoption and scalability.
- Mapping the 33-field schema to Schema.org vocabularies (e.g., ScholarlyArticle, Dataset) to improve thesis visibility on web search engines, potentially through a JSON-LD export feature alongside the existing XML output.
- Exploring RDF serialization of the metadata schema to enable linked data integration, allowing theses to be linked to related resources (e.g., datasets, publications, or ORCID profiles) for cross-repository queries and semantic search.
- Developing an NLP module (or using LLMs) to extract metadata fields (e.g., methodology, sample size) from thesis PDFs, reducing manual entry for students, with pilot testing using tools like GROBID or spaCy to validate the approach for legacy theses.
- Supporting multilingual metadata entry.
- Expanding the framework to cover student-authored empirical papers and conference submissions beyond theses.
- Conducting broader user testing across diverse academic departments and institutional contexts to assess usability, adoption, and scalability.
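As a rough sketch of what the proposed JSON-LD export might look like, the function below maps a few fields onto Schema.org properties. The choice of the Schema.org type `Thesis` and of the properties `name`, `abstract`, `keywords`, and `spatialCoverage` is an assumption made for this example (the list above mentions `ScholarlyArticle` and `Dataset` as candidates); the input keys follow the field names in Appendix D.

```python
import json


def to_json_ld(record: dict) -> str:
    """Map a flat metadata record to a Schema.org JSON-LD document (sketch)."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Thesis",  # assumed target type for this sketch
        "name": record["Title"],
        "abstract": record.get("Abstract", ""),
        "keywords": record.get("Keywords", []),
        "spatialCoverage": record.get("geographicalCover", ""),
    }
    # Drop properties for which the record supplied no value.
    doc = {key: value for key, value in doc.items() if value not in ("", [], None)}
    return json.dumps(doc, ensure_ascii=False, indent=2)


print(to_json_ld({
    "Title": "Placeholder title",
    "Keywords": ["youth unemployment", "qualitative research"],
    "geographicalCover": "Rural Greece",
}))
```

Empirical fields such as sampling method have no obvious Schema.org counterpart, so a full mapping of the 33 fields would need either extension properties or a fallback to free text.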
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Term | Description |
AI | Artificial Intelligence—Computer systems able to perform tasks that typically require human intelligence |
DCMI | Dublin Core Metadata Initiative—The organization responsible for maintaining Dublin Core metadata standards |
DDI | Data Documentation Initiative—A standard for documenting quantitative empirical research |
DDI Lite | A simplified version of DDI with fewer descriptive fields, intended for lighter applications |
DDI Study Description | The component of DDI used to describe the structure and content of quantitative research studies |
Dublin Core | A metadata standard used to describe web resources and digital objects in a general and simplified way |
Dublin Core Extensions | Additional elements beyond the core Dublin Core schema to support more detailed descriptions |
EAD | Encoded Archival Description—A standard for encoding archival finding aids |
ELLANIKOS | The Dublin Core-based documentation system for the gray literature at the University of the Aegean |
ETD | Electronic Theses and Dissertations—Digital versions of academic theses and dissertations |
FAIR | Findable, Accessible, Interoperable, and Reusable—Principles for scientific data management |
HEI | Higher Education Institution—Universities and colleges providing post-secondary education |
JSON-LD | JavaScript Object Notation for Linked Data—A method of encoding linked data using JSON |
MARC | Machine-Readable Cataloging—One of the oldest electronic cataloging systems for libraries |
Metadata | Data that describe other data, typically structured using description languages like XML |
METS | Metadata Encoding and Transmission Standard—A framework for encoding descriptive, administrative, and structural metadata |
MODS | Metadata Object Description Schema—A bibliographic element set for describing resources |
NLP | Natural Language Processing—A branch of AI that helps computers understand human language |
OAI-PMH | Open Archives Initiative Protocol for Metadata Harvesting—A protocol for harvesting metadata records |
ONIX | Online Information Exchange—A standard for representing and communicating book industry product information |
PREMIS | Preservation Metadata: Implementation Strategies—A standard for preservation metadata |
RDF | Resource Description Framework—A framework for representing information about resources on the web |
Schema.org | A collaborative vocabulary for structured data markup on web pages |
URI | Uniform Resource Identifier—A string of characters that unambiguously identifies a particular resource |
URL | Uniform Resource Locator—A reference to a web resource that specifies its location on a computer network |
W3C | World Wide Web Consortium—The main international standards organization for the World Wide Web |
XML | eXtensible Markup Language—A language used for describing and exchanging data through tags (fields), commonly applied in metadata schemes |
XSD | XML Schema Definition—A way to formally describe the elements in an XML document |
Appendix A. Overview of the DDI Study Description (Version 2)
Appendix A.1. The stdyDscr (2.0)
Appendix A.2. The Citation (2.1)
- 2.1.1 titlStmt–Title Statement for the research description.
- 2.1.1.1 Full title of the research (corresponds to the Dublin Core Title element).
- 2.1.1.2 Secondary title, used to amplify or state limitations of the main title. May repeat information from the main title.
- 2.1.1.3 Title by which the research is commonly referred to.
- 2.1.1.4 The title translated into another language.
- 2.1.1.5 Identification Number–Unique number or character sequence identifying the research (corresponds to the Dublin Core Identifier element).
- 2.1.2 rspStmt–Responsibility Statement: the person or body responsible for creating the research.
- 2.1.2.1 AuthEnty–Authoring Entity/Primary Investigator: the person, group, or service responsible for the intellectual content of the work (corresponds to the Dublin Core Creator element).
- 2.1.2.2 othId–Other Identification/Acknowledgments: responsibility statements not recorded in the title and responsibility statement area for the work. Named here are individuals or groups related to the work, or significant persons connected with previous editions, who have not already been mentioned by name in the description (corresponds to the Dublin Core Contributor element).
- 2.1.3 prodStmt–Production Statement: statements about the production of the research.
- 2.1.3.1 producer–The person or organization with financial or administrative responsibility for the physical processes by which the text was created (corresponds to the Dublin Core Publisher element).
- 2.1.3.2 copyright–Copyright statement.
- 2.1.3.3 prodDate–Production date, i.e., when the data were produced.
- 2.1.3.4 prodPlac–Place of Production: address of the organization that produced the research.
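The Dublin Core correspondences noted in this list amount to a small crosswalk, sketched below. The tags `titl` and `IDNo` are the DDI Codebook names assumed here for the full title and the identification number, which the list gives only by description; the helper name is hypothetical.

```python
# DDI citation tag -> Dublin Core element, per the correspondences listed above.
DDI_TO_DC = {
    "titl": "title",          # assumed DDI tag for the full title
    "IDNo": "identifier",     # assumed DDI tag for the identification number
    "AuthEnty": "creator",
    "othId": "contributor",
    "producer": "publisher",
}


def ddi_citation_to_dc(citation: dict) -> dict:
    """Translate the mappable fields of a DDI citation into Dublin Core keys."""
    return {
        DDI_TO_DC[tag]: value
        for tag, value in citation.items()
        if tag in DDI_TO_DC  # fields without a DC counterpart are skipped
    }


print(ddi_citation_to_dc({
    "titl": "Placeholder title",
    "AuthEnty": "Placeholder author",
    "copyright": "Placeholder statement",  # no correspondence listed above
}))
```

Only five of the citation fields carry over, which is why the hybrid model takes its general description from Dublin Core and reserves DDI for the study-level fields that have no Dublin Core equivalent.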
Appendix A.3. The stdyInfo (2.2)
- 2.2.1 subject–Generally describes the content of the data.
- 2.2.1.1 keyword–Keywords that characterize the data.
- 2.2.2 abstract–Presents the purpose, nature, and scope of the data collection, specific characteristics of its content, and the questions the researcher attempts to answer. A list of the main variables is also provided.
- 2.2.3 sumDscr–Presents information about the chronological and geographical coverage of the research and its units. In detail:
- 2.2.3.1 timePrd–The time period to which the data refer. This is neither the time when documentation was carried out nor the time when the data were collected.
- 2.2.3.2 collDate–The dates of data collection.
- 2.2.3.3 nation–The country or countries to which the data refer.
- 2.2.3.4 geogCover–The geographical coverage of the data, i.e., their complete geographical scope.
- 2.2.3.5 geogUnit–The smallest geographical unit (e.g., prefecture) covered by the data.
- 2.2.3.6 anlyUnit–The basic unit of analysis or observation that the files describe (e.g., individuals, families/households, groups, organizations, administrative units).
- 2.2.3.7 universe–The group of individuals or other elements that are the subject of the research and to which the results/data refer. Age, nationality, and residence usually help characterize a specific universe; many other factors may also apply, such as gender, race, income, or convictions. The universe may consist of elements other than persons, such as households, legal cases, deaths, or countries. Generally, it should be possible to determine from the description of the universe whether a specific individual or element (hypothetical or real) is a member of the research population.
- 2.2.3.8 dataKind–The type of data in the file: research data, census data, clinical-medical data, experimental data, psychological data, etc.
Appendix A.4. The Method (2.3)
- 2.3.1 dataColl–Contains information about the data collection methodology.
- 2.3.1.1 timeMeth–The temporal method of data collection.
- 2.3.1.2 dataCollector–The person or persons responsible for data collection.
- 2.3.1.3 frequency–The frequency (if any) of data collection.
- 2.3.1.4 sampProc–The type of sample (e.g., random).
- 2.3.1.5 collMode–The method of data collection (e.g., telephone interviews).
Appendix B. Dublin Core in Detail
Appendix C. XML Description of the Proposed Documentation System
XML Schema Definition (XSD) of the Proposed Model
Appendix D. Detailed Description of Fields
- Writer (Author Information)
- 1.1 surName–Author’s surname
- 1.2 firstName–Author’s first name
- 1.3 fathersName–Author’s father’s name
- 1.4 gender–Gender
- Dissertation (General Information about the Work)
- 2.1 Title–The title of the work
- 2.2 nameOfSupervisor–The name of the supervising professor
- 2.3 dateOfSupport–Date of defense
- 2.4 Abstract–Brief summary
- 2.5 Keywords–Key words (phrases)
- 2.6 altTitle–Alternative title (or title in English)
- 2.7 typeOfDissertation–Type of work (e.g., undergraduate, graduate)
- 2.8 Program–The name of the program under which the work was conducted
- 2.9 nameOfCom1–The name of committee member (1)
- 2.10 nameOfCom2–The name of committee member (2)
- 2.11 otherNotes–Other useful notes about the work
- 2.12 libraryLink–The internet link to the library
- 2.13 Institution–The institution where the work was conducted
- 2.14 Department–The department where the work was conducted
- Survey (General Information about Potential Empirical Research)
- 3.1 strategy–Research strategy (e.g., quantitative/qualitative/mixed)
- 3.2 sampleProd–Data production (e.g., primary/secondary)
- 3.3 SpaceTime–Space-Time Management
- 3.3.1 timeAnalysis–Temporal Analysis (e.g., Static, Dynamic, Comparative)
- 3.3.2 spaceAnalysis–Spatial Analysis (e.g., Single-regional)
- 3.4 Target–The purpose of the research
- 3.5 studyQuestions–Research questions
- 3.6 studyAssumptions–Research hypotheses
- 3.7 workAssumptions–Working hypotheses
- 3.8 Sample–Regarding the sample/population
- 3.8.1 Size–The size
- 3.8.2 Description–Description (e.g., specific group of people)
- 3.8.3 basicUnit–Basic observation unit (e.g., Individual)
- 3.8.4 collectionMethod–Sample collection method (e.g., Random) and description (e.g., sampling)
- 3.8.5 methodOfRecording–Sample recording method (e.g., Interview)
- 3.8.6 methodOfAnalysis–Analysis method (e.g., ANOVA, Relational)
- 3.8.7 recordingTools–Recording tools (e.g., Interview Guide)
- 3.8.8 analysisTools–Analysis tools
- 3.8.9 sampleTimeReference–Data reference time period (from-to)
- 3.8.10 sampleCollectionReference–Data collection time period (from-to)
- 3.8.11 statesOfReference–Country or countries of data reference
- 3.8.12 geographicalCover–Geographical coverage of data (e.g., regions)
- 3.8.13 geographicalUnit–The smallest geographical unit (e.g., communities)
Appendix E. User Feedback Questionnaire
- Section A: General Information
- 1. Your role (select one): □ Student □ Faculty Supervisor □ Repository Administrator
- 2. Have you used any metadata systems or repositories before? □ Yes □ No
- Section B: Usability (for all participants)
- 3. Was the interface easy to navigate? □ Very easy □ Somewhat easy □ Difficult □ Very difficult
- 4. Were the field labels and instructions clear? □ All clear □ Somewhat clear □ Mostly unclear □ Not clear at all
- 5. Which fields were unclear or confusing to you?
- 6. Did the system support your documentation needs for empirical content? □ Fully □ Partially □ Not at all. Comments:
- Section C: Role-Specific Questions
- 7. Did the system help you reflect more deeply on your research design and methodology? □ Yes □ No □ Not sure
- 8. Which features would make the process easier for you? (check all that apply) □ Field examples/tooltips □ Dropdown menus □ Field autofill □ Metadata preview before submission □ Other:
- 9. Do you believe this framework improves students’ research documentation? □ Yes □ Somewhat □ No
- 10. Would you use this tool in your supervision or teaching? □ Yes, in both □ Only in supervision □ Only in teaching □ Not interested
- 11. Is the system compatible with your current repository infrastructure (e.g., DSpace, EPrints)? □ Yes □ Partially □ No
- 12. What features would support scalability in your institution? (check all that apply) □ Bulk metadata import □ Mandatory field validation □ Role-based access control □ Logging and audit trail
- Five graduate students from the Department of Social Work
- Three faculty supervisors with thesis advising responsibilities
- One institutional repository administrator
- Summary of Key Findings
- Found the form-based interface intuitive and appreciated the structured separation between general and empirical metadata.
- Requested clearer definitions for fields such as Observation Unit, Geographical Unit, and Time/Space Management.
- Suggested adding tooltips, dropdowns for recurring values, and previews of the completed metadata record.
- One student noted: “It really helped me clarify how to document my sample and method without being too technical.”
- Supported the framework’s role in guiding students toward more complete methodological documentation.
- Recommended making certain fields (e.g., research hypotheses) optional for qualitative or exploratory work.
- Proposed including a general-purpose “Notes” field to capture context not reflected in standard metadata.
- Validated the compatibility of the XML export structure with institutional repositories.
- Suggested adding basic field validation and a batch import feature for legacy theses.
- Implications and Next Steps
- Integration of contextual help (tooltips) for complex fields
- Optional metadata fields for non-structured research outputs
- Role-based access control for students and repository staff
- Exploration of a batch-entry module for archival scalability
Appendix F. The Web Application Developed
Feature | Dublin Core | DDI | Schema.org | RDF | Proposed Model |
---|---|---|---|---|---|
Intended for Theses | × | × | × | × | ✓ |
Bibliographic Description | ✓ | ✓ | ✓ | ✓ | ✓ |
Empirical Research Metadata | × | ✓ | × | × | ✓ |
Support for Qualitative Methods | × | partial | × | × | ✓ |
Ease of Use for Students | ✓ | × | ✓ | × | ✓ |
Semantic Web Integration | × | × | ✓ | ✓ | × (Future Work) |
Metadata Field | Options |
---|---|
Research Strategy | 0 Quantitative, 1 Qualitative, 2 Mixed |
Data Source | 0 Primary, 1 Secondary, 2 Mixed |
Title | 0 Absent, 1 Present |
Time/Space Management | 0 Non-historical, 1 Historical-comparative |
Abstract | 0 Absent, 1 Present |
Purpose | 0 Absent, 1 Present |
Research Questions | 0 Absent, 1 Present |
Working Hypotheses | 0 Absent, 1 Present |
Research Hypotheses | 0 Absent, 1 Present |
Reference Date for Data | 0 Not Mentioned, 1 Mentioned |
Data Collection Date | 0 Not Mentioned, 1 Mentioned |
Country/Countries of Study | 0 Not Mentioned, 1 Mentioned |
Geographical Coverage | 0 Not Mentioned, 1 Mentioned |
Geographical Unit | 0 Not Mentioned, 1 Mentioned |
Population | 0 Not Mentioned, 1 Mentioned |
Observation Unit | 0 Not Mentioned, 1 Mentioned |
Sample | 0 Not Mentioned, otherwise the sample size |
Sampling Method | 0 Not Mentioned, 1 Mentioned |
Data Collection Method | 0 Not Mentioned, 1 Mentioned |
Analysis Method | 0 Not Mentioned, 1 Mentioned |
Other Sources | 0 Not Mentioned, 1 Mentioned |
Research Problems | 0 Not Mentioned, 1 Mentioned |
Research Ethics | 0 Not Mentioned, 1 Mentioned |
Recording Tools | 0 Not Mentioned, 1 Mentioned |
Analysis Tools | 0 Not Mentioned, 1 Mentioned |
Metadata Field | Options |
---|---|
1. Research Strategy | 0 Quantitative, 1 Qualitative, 2 Mixed |
2. Data Source | 0 Primary, 1 Secondary, 2 Mixed |
3. Time/Space Management | 0 Non-historical, 1 Historical-comparative |
4. Research Questions | 0 Absent, 1 Present |
5. Research Hypotheses | 0 Absent, 1 Present |
6. Reference Date for Data | 0 Not Mentioned, 1 Mentioned |
7. Data Collection Date | 0 Not Mentioned, 1 Mentioned |
8. Country/Countries of Study | 0 Not Mentioned, 1 Mentioned |
9. Geographical Coverage | 0 Not Mentioned, 1 Mentioned |
10. Geographical Unit | 0 Not Mentioned, 1 Mentioned |
11. Population | 0 Not Mentioned, 1 Mentioned |
12. Observation Unit | 0 Not Mentioned, 1 Mentioned |
13. Sample | 0 Not Mentioned, otherwise the sample size |
14. Sampling Method | 0 Not Mentioned, 1 Mentioned |
15. Data Collection Method | 0 Not Mentioned, 1 Mentioned |
16. Analysis Method | 0 Not Mentioned, 1 Mentioned |
17. Recording Tools | 0 Not Mentioned, 1 Mentioned |
18. Analysis Tools | 0 Not Mentioned, 1 Mentioned |
Field | 1st | 2nd | 3rd | 4th | 5th | 6th |
---|---|---|---|---|---|---|
Research Strategy | 0 | 0 | 0 | 0 | 0 | 0 |
Data Source | 0 | 0 | 0 | 0 | 0 | 0 |
Title | 0 | 0 | 0 | 1 | 0 | 1 |
Time/Space Management | 0 | 0 | 0 | 0 | 0 | 0 |
Abstract | 1 | 1 | 0 | 0 | 0 | 1 |
Purpose | 1 | 1 | 1 | 1 | 1 | 1 |
Research Questions | 0 | 1 | 1 | 1 | 1 | 1 |
Working Hypotheses | 0 | 1 | 1 | 1 | 1 | 1 |
Research Hypotheses | 0 | 1 | 1 | 1 | 1 | 1 |
Reference Date | 1 | 1 | 1 | 1 | 1 | 1 |
Data Collection Date | 1 | 1 | 1 | 1 | 1 | 1 |
Countries of Reference | 1 | 1 | 1 | 1 | 1 | 1 |
Geographical Coverage | 1 | 1 | 1 | 1 | 1 | 1 |
Geographical Unit | 1 | 1 | 1 | 1 | 1 | 1 |
Population | 1 | 1 | 1 | 1 | 1 | 1 |
Observation Unit | 1 | 1 | 1 | 1 | 1 | 1 |
Sample | 110 | 212 | 300 | 150 | 250 | 320 |
Sampling Method | 1 | 1 | 1 | 1 | 1 | 1 |
Recording Method | 1 | 1 | 1 | 1 | 1 | 1 |
Analysis Method (ANOVA/Relational) | 1 | 1 | 1 | 1 | 1 | 1 |
Other Sources | 0 | 0 | 0 | 0 | 0 | 0 |
Research Problems | 0 | 0 | 0 | 0 | 0 | 0 |
Ethics in Research | 1 | 1 | 0 | 1 | 0 | 0 |
Recording Tools | 1 | 1 | 1 | 1 | 1 | 1 |
Analysis Tools | 0 | 1 | 1 | 1 | 1 | 1 |
Field | 1st | 2nd | 3rd | 4th | 5th | 6th |
---|---|---|---|---|---|---|
Research Strategy | 1 | 1 | 1 | 1 | 1 | 1 |
Data Source | 0 | 1 | 1 | 1 | 1 | 1 |
Title | 0 | 0 | 1 | 1 | 1 | 1 |
Time/Space Management | 0 | 0 | 0 | 1 | 0 | 1 |
Abstract | 0 | 0 | 0 | 1 | 1 | 1 |
Purpose | 1 | 1 | 1 | 1 | 1 | 1 |
Research Questions | 1 | 1 | 1 | 1 | 1 | 1 |
Working Hypotheses | 1 | 1 | 1 | 1 | 1 | 0 |
Research Hypotheses | 1 | 1 | 1 | 0 | 0 | 0 |
Reference Date | 1 | 1 | 1 | 1 | 1 | 1 |
Data Collection Date | 1 | 0 | 1 | 1 | 1 | 1 |
Countries of Reference | 1 | 0 | 1 | 0 | 1 | 1 |
Geographical Coverage | 1 | 0 | 0 | 1 | 1 | 0 |
Geographical Unit | 1 | 0 | 0 | 1 | 1 | 1 |
Population | 1 | 0 | 1 | 1 | 1 | 0 |
Observation Unit | 1 | 1 | 1 | 1 | 1 | 1 |
Sample | 12 | 1 | 1 | 6 | 3 | 10 |
Sampling Method | 1 | 1 | 1 | 1 | 1 | 1 |
Recording Method | 1 | 1 | 1 | 1 | 1 | 1 |
Analysis Method (ANOVA/Relational) | 1 | 1 | 1 | 0 | 0 | 0 |
Other Sources | 0 | 0 | 0 | 1 | 1 | 0 |
Research Problems | 0 | 0 | 0 | 1 | 0 | 0 |
Ethics in Research | 0 | 0 | 0 | 1 | 0 | 0 |
Recording Tools | 1 | 0 | 1 | 1 | 1 | 0 |
Analysis Tools | 1 | 0 | 0 | 1 | 1 | 1 |
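
As a rough illustration of how a coded table like the one above can be summarized, the following Python sketch computes, for a few of its rows, the share of the six theses in which a field is coded 1. It assumes that 0 and 1 mean "not mentioned" and "mentioned" for these rows, as in the coding key for Analysis Tools. Rows that hold a category code or a count (Research Strategy, Sample) are left out. The names `CODED_ROWS`, `mention_rate`, and `rank_by_rate` are illustrative and are not part of the prototype.

```python
# Rows copied from the table above (six theses, Research Strategy coded 1).
# Assumption: for these fields 0 = not mentioned and 1 = mentioned, as in the
# coding key for Analysis Tools; other fields may use different codes.
CODED_ROWS = {
    "Other Sources":      [0, 0, 0, 1, 1, 0],
    "Research Problems":  [0, 0, 0, 1, 0, 0],
    "Ethics in Research": [0, 0, 0, 1, 0, 0],
    "Recording Tools":    [1, 0, 1, 1, 1, 0],
    "Analysis Tools":     [1, 0, 0, 1, 1, 1],
}


def mention_rate(values):
    """Share of theses in which the field is coded 1."""
    return sum(values) / len(values)


def rank_by_rate(rows):
    """Return (field, rate) pairs, least-documented fields first."""
    pairs = [(field, mention_rate(values)) for field, values in rows.items()]
    # sorted() is stable, so ties keep the row order of the table.
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for field, rate in rank_by_rate(CODED_ROWS):
        print(f"{field:<20} {rate:.0%}")
```

Ranking fields this way gives one possible view of which elements students document least often, which is the kind of evidence a content analysis can feed into field selection.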

Field | 1st | 2nd |
---|---|---|
Research Strategy | 2 | 2 |
Data Source | 0 | 1 |
Title | 0 | 1 |
Time/Space Management | 0 | 0 |
Abstract | 1 | 0 |
Purpose | 1 | 1 |
Research Questions | 1 | 1 |
Working Hypotheses | 1 | 1 |
Research Hypotheses | 1 | 1 |
Reference Date | 1 | 1 |
Data Collection Date | 0 | 1 |
Countries of Reference | 0 | 1 |
Geographical Coverage | 0 | 1 |
Geographical Unit | 0 | 0 |
Population | 1 | 1 |
Observation Unit | 1 | 1 |
Sample | 6 | 12 |
Sampling Method | 1 | 1 |
Recording Method | 1 | 1 |
Analysis Method (ANOVA/Relational) | 1 | 1 |
Other Sources | 0 | 1 |
Research Problems | 0 | 1 |
Ethics in Research | 0 | 1 |
Recording Tools | 1 | 1 |
Analysis Tools | 0 | 1 |

(a)

No. | Field (Description) |
---|---|
1 | Research Strategy—Type of research conducted |
2 | Data Production—How data were produced |
3 | Time/Space Management—Historical or non-historical comparative approach |
4 | Purpose |
5 | Research Questions—Clear presence of research questions |
6 | Working Hypotheses—Clear presence of working hypotheses |
7 | Research Hypotheses—Clear presence of research hypotheses |
8 | Reference Time of Data—When the data refer to |
9 | Data Collection Time—When the data were collected |
10 | Country/Countries of Reference—Where the study was conducted |
11 | Geographical Coverage—Areas covered (e.g., regions) |
12 | Geographical Unit—Units (e.g., cities) |
13 | Population—Which population the research refers to |
14 | Observation Unit—Subsets of the sample |
15 | Sample—Sample size |
16 | Sampling Method—How the sample was selected (e.g., random) |
17 | Data Recording Method—How data were recorded (e.g., interview) |
18 | Data Analysis Method—e.g., ANOVA, relational |
19 | Recording Tools—Presence of tools like interview guides |
20 | Analysis Tools—Presence of tools like SPSS |

(b)

No. | Field |
---|---|
1 | Author Name |
2 | Author Gender |
3 | Title |
4 | Supervisor Name |
5 | Defense Date |
6 | Number of Pages |
7 | Language |
8 | Abstract |
9 | Keywords |
10 | Alternative Title |
11 | Committee Member 1 |
12 | Committee Member 2 |
13 | Additional Notes |
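
To make the two field lists more concrete, the sketch below models a single thesis record in Python, pairing a few of the bibliographic fields from part (b) with a few of the empirical fields from part (a). It is a minimal sketch under stated assumptions and not the prototype's actual data model: the class names, attribute names, and the `missing_fields` helper are hypothetical, only a subset of the fields is shown, and the title and language values are placeholders. The empirical values are taken from the youth unemployment example in Section 1.2.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EmpiricalMetadata:
    """Hypothetical subset of the part (a) fields; None means 'not documented'."""
    research_strategy: Optional[str] = None
    population: Optional[str] = None
    geographical_coverage: Optional[str] = None
    sample_size: Optional[int] = None
    sampling_method: Optional[str] = None
    data_recording_method: Optional[str] = None


@dataclass
class ThesisRecord:
    """Hypothetical subset of the part (b) fields plus the empirical block."""
    author_name: str
    title: str
    language: str
    empirical: EmpiricalMetadata


def missing_fields(record: ThesisRecord) -> List[str]:
    """Names of the empirical fields that are still undocumented."""
    block = record.empirical
    return [f.name for f in fields(block) if getattr(block, f.name) is None]


# Illustrative record; title and language are placeholders, not source data.
record = ThesisRecord(
    author_name="Maria",
    title="(title not given)",
    language="unspecified",
    empirical=EmpiricalMetadata(
        research_strategy="Qualitative",
        population="Unemployed youth (ages 18-30)",
        geographical_coverage="Rural Greece",
        data_recording_method="Semi-structured interviews",
    ),
)

if __name__ == "__main__":
    print(missing_fields(record))
```

Keeping the empirical fields in their own block is one possible design choice; it lets an input form report which methodological details are still missing without touching the bibliographic part of the record.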
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Vassiliou, G.; Tsamis, G.; Chatzinikolaou, S.; Nipurakis, T.; Papadakis, N. Enhancing Discoverability: A Metadata Framework for Empirical Research in Theses. Algorithms 2025, 18, 490. https://doi.org/10.3390/a18080490