Ontological Representation of the Structure and Vocabulary of Modern Greek on the Protégé Platform

Samaridi, Nikoletta; Papakitsos, Evangelos; Karanikolas, Nikitas

doi:10.3390/computation12120249

Open Access Case Report


by Nikoletta Samaridi, Evangelos Papakitsos * and Nikitas Karanikolas

Department of Informatics and Computer Engineering, School of Engineering, University of West Attica, Ag. Spyridonos Str., Egaleo, 12243 Athens, Greece

* Author to whom correspondence should be addressed.

Computation 2024, 12(12), 249; https://doi.org/10.3390/computation12120249

Submission received: 4 November 2024 / Revised: 17 December 2024 / Accepted: 20 December 2024 / Published: 23 December 2024

(This article belongs to the Special Issue Recent Advances on Computational Linguistics and Natural Language Processing)


Abstract:

One of the central issues in Natural Language Processing (NLP) and Artificial Intelligence (AI) is the representation and modeling of language, with the aim of managing its structure and solving linguistic problems. In pursuit of the most efficient capture of knowledge about the Modern Greek language, and given the documented usability of ontological data structuring in the fields of the semantic web and cognitive computing, this paper presents a new ontology of the Modern Greek language at the level of structure and vocabulary, built on the Protégé platform. By expressing knowledge in this logical and structured form, the research processes and exploits the distributed semantics of linguistic information in an easy and useful way.

Keywords:

ontology; Protégé; Modern Greek; knowledge-based system; machine-readable dictionary; cognitive computing

1. Introduction

Understanding language is possible through its representation and interpretation, a task that is fundamental in the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI). While significant progress has been made in language representation, the challenge remains: how can we achieve a comprehensive ontological organization of a language’s vocabulary and structure at all levels of linguistic analysis? Furthermore, what benefits can this ontological approach offer to various scientific domains? Ontology has proven to be the most effective method for representing and interpreting linguistic knowledge [1], as it connects various levels of linguistic analysis (e.g., grammar, syntax, lexical analysis) through the interdependencies of its objects. Unlike traditional linguistic analysis, ontology offers a more integrated and flexible framework, allowing for a deeper and more complete understanding of language. This framework enables linguistic information to be represented in a way that is not only comprehensible to humans but also interpretable by computers, thereby expanding the possibilities for automated linguistic processing. Despite these advantages, particularly for widely spoken languages, existing state-of-the-art approaches to language representation often lack the necessary depth and coherence, especially when applied to less commonly represented languages like Modern Greek, which remains underrepresented with notable gaps in structured and computational approaches. The shortcomings of current methods lie in their limited ability to connect linguistic analysis across all levels, leading to fragmented or incomplete interpretations of linguistic data. This paper addresses these shortcomings by presenting a novel ontology for Modern Greek, developed using the Protégé platform. The proposed ontology provides a structured, logical framework for knowledge representation that encompasses both vocabulary and structure. 
It enables a comprehensive and integrated approach to linguistic analysis, offering a more complete and accurate representation of the language. By utilizing the principles of the semantic web and cognitive computing, this work demonstrates how ontology can support new technologies that make linguistic information both machine-readable and semantically interpretable. The significance of this study lies in its ability to bridge the gap between theoretical linguistic analysis and computational implementation for Modern Greek. By incorporating an ontological framework, this work addresses key weaknesses in current approaches and introduces innovative techniques for processing and representing linguistic knowledge. Considering the advantages of an ontological organization of language in cognitive computing, the work presented here attempts to create an ontological lexicon of Modern Greek using the Protégé platform. Although it is not the first such effort historically, it is at present the only active project of this scope, and its design and implementation details can be directly useful for other Indo-European languages that have a structure similar to that of Modern Greek, at least in terms of morphology and syntax.

2. Methodology and Usefulness

The study began with a systematic and organized review of the literature. The knowledge representation and reasoning methods that have been proposed in the context of developing knowledge-based systems [1] were studied and presented first. Subsequently, the various lexicographic environments that constitute semantic networks were investigated, together with the ontologies/lexica, the micro-formats most commonly found in Semantic Web documents, and the electronic dictionaries and text corpora of the Modern Greek language that have become linguistic resources and useful tools for Greek language study and natural language processing. Computational methods and tools for collecting, organizing, classifying, and automatically retrieving lexical information were also sought, in order to implement a conceptually structured lexical database as a means of overcoming the knowledge acquisition bottleneck for computational applications [2]. Finally, taking into account the current research challenges in the field of engineering and technologies described in [3], an attempt was made to create a language knowledge system that provides semantic integration and interoperability in an automatic and secure way. Such a system can help meet the current requirements for intelligent modeling, analysis, and usage services for web information by connecting heterogeneous systems and approaches through a standardized organization and coding of data coming from structured and semi-structured information sources. 
Thus, a transition was made from basic research to applied research, with the design and implementation of a new ontological knowledge system of the Modern Greek language, which contributes not only to the description but also to the understanding and interpretation of theoretical linguistic models and the production of practical results, changing the way in which natural language is computationally understood and processed.

A new ontological knowledge system was designed and implemented, called Multi-Solution Ontology (MSO) because it applies to a variety of domains and functions in language and other sciences. It was implemented on the Protégé platform, a widely used, open-source computational tool that is considered the most comprehensive, modular, and scalable system for writing, managing, and visualizing ontologies on the Semantic Web [3]. Protégé was specifically chosen for its ease of use, scalability, support for the OWL (Web Ontology Language) standard [4], and its active user community, which provides strong resources for troubleshooting and improvements. Protégé’s interactive graphical interface is particularly distinguished for its extensibility and constant evolution, with regular new software versions. OWL is the ontology language recommended by the W3C (World Wide Web Consortium). It has a well-defined syntax and semantics, supports semantic definitions, and provides logical inference mechanisms, allowing complex linguistic relationships to be expressed and implicit knowledge to be derived. This reasoning support enabled the solution of various linguistic problems during ontology development. The platform’s robust reasoning capabilities and its wide adoption in both academic and industrial settings make it an ideal choice for creating the Modern Greek ontology. Finally, two highly effective description logic reasoners, Pellet [5] and HermiT [6], were used to evaluate the system.

3. The New Ontological Knowledge System of Modern Greek

The new ontological knowledge system includes three “sub-ontologies” (which are actually three subclasses of the higher entity “Thing” of Protégé):

The Greek Language Ontology, which captures the structure of the Modern Greek language at its various levels of analysis;
The Greek Ontology Dictionary, which includes in its entries the vocabulary of the Modern Greek language;
The Supply Chain Ontology, which explores a vertical application of the language in the specific domain, as a proof-of-concept implementation.

3.1. The Ontology of Modern Greek Language

The implementation from scratch of a new ontology in the field of Modern Greek responds to:

the practical abandonment of the Greek WordNet [7], almost 20 years ago;
the absence of another ordered integrated presentation of its basic concepts, as revealed by a review of the literature [2];
and the necessity of unifying and integrating linguistic knowledge at all its levels.

The ontology proposed in this paper is a natural language simulation ontology, the design of which was a very challenging task, since reflections on various issues related to both the structure of the ontology and the structure of the language were constantly arising to give an accurate picture of the domain. That is, decisions had to be made on research questions about the choice of the precise modeling technique that would achieve the intended benefits, as well as decisions on research hypotheses about:

which concepts of language should be approached for a complete analysis of the field,
which constituents of the language should be considered primary or secondary,
how these constituents should be structured to have a more complete structure of the language at all its levels,
from which perspective natural language should be approached as a unified whole, etc.

The present ontology is an easily understood model that essentially leads to the decomposition of natural language entities (i.e., things that exist either physically or logically) into objects (i.e., separate entities that try to model and approximate the physical world as best as possible [8]). These objects are organized hierarchically into classes (which represent the concepts related to a field or some tasks and are usually organized in some taxonomic system), and the classes include the set of properties, the methods, and the messages or events to which they respond. This is a utilitarian ontology with the following characteristics:

it achieves the mapping of the properties of the concepts of Modern Greek;
it defines stable and well-defined semantic generalization and specialization relations between concepts;
it creates a hierarchical relation between concepts and their semantic associations by linking them as nodes;
the inheritance of properties, namely multiple and non-monotonic inheritance, is supported;
and terminologies supported by valid reasoning are developed, through a systematic way of encoding associations, based on rules, axioms, functions, and constraints, while conclusions are drawn.

As a result of all of the above, each ‘object’ contains all the relations that apply to it, which opens up practically unlimited possibilities of semantic expression.
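The inheritance behaviour described above can be illustrated with a minimal sketch. It assumes a toy hierarchy in which a class may have several superclasses (multiple inheritance) and a subclass may override an inherited default (a simple reading of non-monotonic inheritance). All class and property names are hypothetical and are not taken from the MSO. Note also that standard OWL reasoning is monotonic, so the sketch only illustrates the idea and does not reproduce how Protégé or its reasoners evaluate the ontology.

```python
from typing import Dict, List

# Hypothetical class names and property values, chosen only for illustration.
PARENTS: Dict[str, List[str]] = {
    "Word": [],
    "InflectedWord": ["Word"],
    "NominalWord": ["Word"],
    "Noun": ["InflectedWord", "NominalWord"],   # two superclasses: multiple inheritance
    "IndeclinableNoun": ["Noun"],
}

LOCAL_PROPERTIES: Dict[str, Dict[str, bool]] = {
    "Word": {"isUnitOfLanguage": True},
    "InflectedWord": {"hasEndings": True},
    "NominalWord": {"hasGender": True},
    "IndeclinableNoun": {"hasEndings": False},  # exception that overrides the default
}


def effective_properties(cls: str) -> Dict[str, bool]:
    """Merge properties from all superclasses, then apply the class's own values."""
    merged: Dict[str, bool] = {}
    for parent in PARENTS[cls]:
        merged.update(effective_properties(parent))
    # Values declared on the class itself take precedence over inherited ones.
    merged.update(LOCAL_PROPERTIES.get(cls, {}))
    return merged


noun_properties = effective_properties("Noun")
exception_properties = effective_properties("IndeclinableNoun")
```

In this sketch, `Noun` collects properties from both of its superclasses, while `IndeclinableNoun` keeps the inherited gender property but replaces the inherited value of `hasEndings`.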

3.2. The Greek Ontology Dictionary

The Greek Ontology Dictionary is the second ontology in the knowledge system of Modern Greek and is fully connected to the ontology of Modern Greek. It is a multi-purpose lexical database, supporting a variety of applications, such as phonetic, morphological, syntactic, and semantic processing of the language, as well as information retrieval, achieving a flexible, reusable, expandable, simple, and compact data representation.

More specifically, this dictionary records the Modern Greek vocabulary of written and spoken speech. The basic criterion for the compilation of the lexicon was the use of the word. For each lexical unit, this dictionary provides:

the existence of phonetic transcription;
the definition/interpretation;
morphological and grammatical information;
the connection of each lemma of a constituent word (noun, adjective, verb, etc.) with its declension paradigm;
the recording of the largest possible number of expressions and phrases of Modern Greek, related to the entry;
etymology;
synonyms, antonyms, diminutives, magnifying nouns;
English translation;
textual/phonological features;
examples of usage;
spelling;
and, occasionally, photographs related to each entry.

The dictionary contains about 49,546 entries/words, alphabetically classified as instances of classes, and at the same time provides the possibility of structuring all the linguistic material under nodes, based on semantic relations between concepts, such as meronymy/holonymy, synonymy/antonymy, literalism/metaphor, etc., with the hyperonymy/hyponymy relation predominating. Thus, this dictionary makes it possible to capture a wide range of semantic relations in a systematic way and allows the word to be treated as a combination of semantic and grammatical–syntactic information, in order to achieve the extraction and derivation of correct knowledge, through the connection with hierarchical/ontological correlations of morphological, syntactic and semantic information.

3.3. The Ontology of the Supply Chain

To apply the knowledge system for the Modern Greek language to a specific field of knowledge, this research focused on the supply chain, where the development of ontological schemas is almost non-existent [9], in order to contribute to a field in which there is an imperative need for research [10] on the conceptual understanding of knowledge. The supply chain ontology was thus implemented as a vertical application in the area of the Modern Greek language. This ontology, the third in the knowledge system for the Modern Greek language, is essentially an application ontology because, according to the classification of ontologies in [11], it provides the vocabulary for both a domain (the supply chain) and a specific task (the operations of logistics, which cover a wide range of tasks). This work, presented in [10,12], populates the lexical database of the Greek Ontology Dictionary [3] and supports and promotes natural language processing technologies by representing the terminological knowledge of this domain.

4. Designing the New Ontological Dictionary of Modern Greek

The Modern Greek Language Ontology is built on four (4) fundamental concepts (as top-level classes), each of which separately represents knowledge at the four (4) levels of the language: “morphology”, “syntax”, “semantics”, and “phonetics”. The levels of analysis of pragmatics and textual linguistics dealing with speakers’ use of language are the subject of future approaches. The field covered by ontology is mainly the grammatical-syntactic and semantic structure of language. Together with the ontological dictionary, which is the lexical basis of the knowledge system, it actually creates an ontological dictionary. In this ontology, the phonetic, morphological, syntactic, and semantic information of each lexical entry is explicitly stated through classes, sub-classes, and their interrelationships. It is both a taxonomy and ontology, which contains the definition of concepts and a large and extensible set of standardized types of relations and which, as a consequence, clearly describes and fully documents the concepts and their semantics, intending to categorize real-world objects. It is based on a taxonomy, containing two main hierarchies:

the hierarchy of concepts, which contains the entities,
and the hierarchy of conceptual relations, which contains the semantic relations of the entities.

Relationships are arranged, based on a unique identifier (MSO_ID), into hyperonyms and hyponyms, with a key feature of inheritance from hyperonyms to hyponyms. The relationships between lexical entities are only binary; that is, each concept, idea, and entity can be expressed as a collection of one or more binary relations between related objects. This reduces complexity, design effort, and access, processing, and memory costs, since the associations are explicit, discrete, and easily implementable.

The part of the ontology that represents the morphological information has as a basic unit (entity) in the hierarchy of classes, the part of speech, that by definition is a word (Appendix A–item 1), which in turn is a distinct unit of language. The part of the ontology representing syntactic information has as its highest hierarchical class the sentence and as subclasses the word sets resulting from the syntactic relations of words (e.g., nominal, verbal, prepositional, adverbial set, etc.). The part of ontology that refers to semantics has as its basic unit the concept of the word (Appendix A–item 2), with subclasses the conceptual/semantic relations that developed between words, such as hyperonymy-hyponymy, meronymy, synonymy–antonymy, ambiguity–polysemy, literalism–metaphor, etc. The part of the ontology that represents phonetic information has the word as its highest hierarchical class, with subclasses of the phonemes used in Modern Greek. The above parts of the ontology communicate with each other, thus encoding the semantic relationships that exist at the grammatical-syntactic and semantic levels. With this interconnection, the lexical meaning can be fully attributed through paradigmatic and constitutional relations (Appendix A–item 3) and the various associations of lexical entities/lemmas can infer all the derived forms of words and associations.

At this point, it should be noted that a decision had to be made from the outset as to which constituent element of the Modern Greek language would constitute both its beginning and its basis: the phoneme, the word, or the sentence? By studying the structure of the language, it was decided that the basic constituent of the language that can form the common basis at all four (4) levels (morphology, syntax, semantics, and phonetics) is the word (Figure 1). Specifically, these four classes are linked to each other through the sub-class “Λέξη_Word” (Figure 1, on the left), which is a common basis for all four (4) levels in the ontology. The rectangular shapes depict the entities, while the arrow lines depict the relationships between the entities. In Figure 1, the relationship between the entities ‘Λέξη_Word’ and ‘Φωνητική_Phonetic’ is declared through the object property ‘isBasisOf’.

Thus, at each level of language, the basic class in the hierarchy is the ‘word’. Even in the case of the domain of syntax, where the basis of syntax is normally the sentence, a relation was created linking the word to the sentence.

4.1. Design Methodology of the New Ontological Dictionary

The new ontological dictionary for the Modern Greek language was designed and implemented, using as a guide the ontology design steps of [13]. Thus, the ontological dictionary includes:

the definition of classes in the ontology and their organization into a taxonomic hierarchy (superclass–subclass);
the definition of the relations that link the classes to each other, i.e., the definition of object properties;
the definition of the individuals/instances, their annotation properties, and the relations between the individuals, i.e., the definition of the data properties;
the definition of the values of the object properties and of the data properties of the individuals;
the axioms, or the rules defined based on the relations between classes and individuals.
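The components listed above can be pictured with a small in-memory sketch. It is an illustrative data structure with hypothetical names (`MiniOntology`, `relate`, `hasEnglishTranslation`, the sample lemmas) and not the structure Protégé uses internally. The domain/range check is a toy consistency rule standing in for an axiom. In OWL itself, domain and range axioms are used for inference and do not reject data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MiniOntology:
    # class -> direct superclass (None for the root)
    subclass_of: Dict[str, Optional[str]] = field(default_factory=dict)
    # object property -> (domain class, range class)
    object_properties: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    # individual -> class
    instance_of: Dict[str, str] = field(default_factory=dict)
    # (individual, data property) -> literal value
    data_values: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # asserted relations between individuals
    assertions: List[Tuple[str, str, str]] = field(default_factory=list)

    def is_a(self, cls: str, ancestor: str) -> bool:
        """Walk up the taxonomy from cls until ancestor or the root is reached."""
        current: Optional[str] = cls
        while current is not None:
            if current == ancestor:
                return True
            current = self.subclass_of.get(current)
        return False

    def relate(self, subject: str, prop: str, obj: str) -> None:
        """Store a relation after a toy domain/range check (not OWL semantics)."""
        domain, range_ = self.object_properties[prop]
        if not self.is_a(self.instance_of[subject], domain):
            raise ValueError(f"{subject} is outside the domain of {prop}")
        if not self.is_a(self.instance_of[obj], range_):
            raise ValueError(f"{obj} is outside the range of {prop}")
        self.assertions.append((subject, prop, obj))


# Hypothetical sample content, for illustration only.
onto = MiniOntology()
onto.subclass_of.update(
    {"Word": None, "PartsOfSpeech": "Word", "Noun": "PartsOfSpeech", "Adjective": "PartsOfSpeech"}
)
onto.object_properties["isAccompaniedBy"] = ("Adjective", "Noun")
onto.instance_of.update({"καρέκλα": "Noun", "ξύλινος": "Adjective"})
onto.data_values[("καρέκλα", "hasEnglishTranslation")] = "chair"
onto.relate("ξύλινος", "isAccompaniedBy", "καρέκλα")
```

Each attribute corresponds to one item of the list: the taxonomy, the object properties, the individuals, the data property values, and a rule applied when relations are asserted.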

More specifically, the following methodology was followed:

4.1.1. Identifying the Scope of Definition

The scope and domain of the ontology were defined by formulating seventy-two (72) “Competency Questions” [14] that the ontology-based knowledge base is able to answer, according to the requirements and objectives that it should achieve. These questions were formulated based on the six (6) questions posed by J. Zachman in the Zachman Framework [15]: ‘What?’, ‘Who?’, ‘How?’, ‘Where?’, ‘When?’, and ‘Why?’, and relate to the level of operation of the ontology, e.g.:

What is the scope of definition that the ontology will cover?
What will we use the ontology for?
Who will use and maintain the ontology?
For which types of questions should the information in the ontology provide answers?
How will the four levels of linguistic analysis be linked? etc.

and to the level of operation of the language at its various levels, e.g.:

What is the definition-interpretation of a word?
What are the morphosyntactic features of a word?
Which words have only singular and which have only plural form?
Which parts of speech can form the first or second compound in a compound word?
How is the concept of polysemy defined in Modern Greek?
How does the origin of words affect the style of speech?
How can new words (derivative and compound) be derived automatically?
Why can a grammatical type be formed in various ways in Modern Greek (e.g., the monosyllabic and periphrastic formation of perfect tenses)?
When are the conjunctions introducing the same kind of subordinate clause used (e.g., when is the conjunction ‘since’ and when is the conjunction ‘while’ used in temporal clauses)?
Where (in what circumstances) is each of the words with similar meanings, e.g., synonyms, used? etc.

The range of questions can, of course, be as unlimited as the possibilities of natural language. The ontological dictionary nevertheless aims to cover all possible questions concerning linguistic phenomena at the four levels of language.
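As an illustration of how a competency question can be turned into a query, the following minimal sketch answers the question “Which words have only singular and which have only plural form?” over a handful of entries. The `Entry` record, its field names, and the sample lemmas are assumptions made for this example; they are hand-picked and are not taken from the ontology.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Entry:
    lemma: str
    part_of_speech: str
    # "singular" or "plural" when the word has only that number, else None.
    only_number: Optional[str] = None


# Hypothetical sample entries, for illustration only.
ENTRIES: Sequence[Entry] = (
    Entry("καρέκλα", "noun"),
    Entry("Χριστούγεννα", "noun", only_number="plural"),
    Entry("οξυγόνο", "noun", only_number="singular"),
    Entry("τρέχω", "verb"),
)


def words_with_only(number: str, entries: Sequence[Entry] = ENTRIES) -> List[str]:
    """Return the lemmas that are recorded as having only the given number."""
    return [entry.lemma for entry in entries if entry.only_number == number]


plural_only = words_with_only("plural")
singular_only = words_with_only("singular")
```

In an OWL ontology the same question would typically be answered by asking a reasoner for the members of the corresponding class. The list comprehension here only mimics that lookup.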

4.1.2. Ontology from Scratch

The possibility of reusing already existing ontologies was considered, but a search of the literature (in the Hellenic Academic Libraries Link, on 29 November 2022, with keywords: Ontology, Grammar, Greek Language) for corresponding ontologies returned no results. As a result, a prototype ontology was created from scratch, intended to serve as a model for future work in the field of NLP.

4.1.3. Language Resources

The most important terms used to represent the four levels of the language, as well as the entries of the lexical base of the ontological dictionary, were collected. The terms relating to grammar, syntax, and semantics were extracted from [16] and [17], which were the main sources for the material on the morphological variety of words and on syntax. The entries that form the lexical basis of the ontological dictionary are taken from [18] and were automatically extracted using Python code. These linguistic resources were chosen because of their widespread use and impact in primary and secondary education, their validity and reliability, and their open access on the World Wide Web. These are dictionaries that have been tested and have demonstrated their effectiveness, and their advantages are exploited in this ontological dictionary.

4.1.4. Definition of Classes-Subclasses

Classes and the hierarchy of classes, i.e., entities, objects, and their hierarchy, were defined using a combination of bottom-up and top-down approaches. A top-down development process starts with the definition of the most general concepts in the domain and the subsequent specialization of the concepts. In contrast, a bottom-up development process starts with the definition of the most specific classes, the leaves of the hierarchy, and the subsequent grouping of these classes into more general concepts. The most obvious concepts were defined first and then generalized and specialized appropriately. For example, the concept ‘Word’ is a specialization of the concepts ‘Morphology’, ‘Semantics’, ‘Phonetics’, and ‘Sentence’, as it is their basic component, while at the same time it is a generalization, being a super-class of the concept ‘Parts of Speech’. To make this generalization clear, the plural ‘Parts of Speech’ was used when naming the class rather than the singular ‘Part of Speech’.

4.1.5. Defining Relations

The relations between objects, i.e., the object properties, were defined, the number of which is ninety-two (92). The types of relations were designed according to three criteria:

relations are denoted by phrases that follow logical naming conventions, which means that they obey a logical pattern (e.g., isA, aKindOf, agreesIn, consistsOf, derivedFrom, describes, expresses, forms, functionsSyntacticallyAs, includes, isAccompaniedBy, isDistinguishedIn, isDividedInto, isPartOf, refersTo, relatedTo, represents);
each relation has its inverse, wherever this is possible (e.g., the relation ‘follows’ has its inverse ‘isFollowedBy’); and
the types of relations follow a taxonomy and extend to all four levels of language analysis to ensure consistency of the language.
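The second criterion, that each relation has its inverse wherever possible, can be sketched as follows. The `RelationStore` class and the sample statement are hypothetical. The sketch only shows the bookkeeping involved: once ‘follows’ and ‘isFollowedBy’ are declared inverses, asserting one direction also records the other. In OWL this pairing is declared with an inverse-property axiom and the second direction is derived by the reasoner.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


class RelationStore:
    """Toy store of binary relations that keeps inverse relations in sync."""

    def __init__(self) -> None:
        self.inverse_of: Dict[str, str] = {}
        self.triples: Set[Triple] = set()

    def declare_inverse(self, relation: str, inverse: str) -> None:
        # Register the pairing in both directions.
        self.inverse_of[relation] = inverse
        self.inverse_of[inverse] = relation

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.triples.add((subject, relation, obj))
        inverse = self.inverse_of.get(relation)
        if inverse is not None:
            # Record the same fact as seen from the other entity.
            self.triples.add((obj, inverse, subject))


store = RelationStore()
store.declare_inverse("follows", "isFollowedBy")
# Hypothetical statement between two classes, for illustration only.
store.add("Noun", "follows", "Article")
```

Because every relation is binary, a single triple and its inverse are enough to navigate the association from either entity.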

At this point, a peculiarity of the ontology should be pointed out. The relationships are specified and defined in great detail. Thus, relations refer not only to ‘what something is’ or ‘what happens’, but also to ‘what something can be’, ‘what something is likely to be’ or ‘what can happen’, and ‘what is likely to happen’. Thus, for example, we have:

the relation ‘isA’, and also the relations ‘canBe’, ‘mayBe’;
the relation ‘consistsOf’, and also the relation ‘mayConsistOf’;
the relation ‘expresses’, and also the relation ‘canExpress’;
the relation ‘takesAsNegation’, and also ‘canTakeAsNegation’;
the relation ‘isFormedBy’, and also ‘mayBeFormedBy’ and ‘canBeFormedBy’;
the relations ‘isAccompaniedBy’ and ‘canBeAccompaniedBy’, etc.

Also, not only ‘what is done’ between two entities is specified, but also ‘what is usually done’ or ‘can be usually done’ or ‘what is rarely done’. Thus, we have, for example:

the relation ‘isAccompaniedBy’, but also the relation ‘isUsuallyAccompaniedBy’;
the relation ‘forms’, but also the relation ‘usuallyForms’ or even the relation ‘canBeUsuallyFormedBy’;
the relation ‘usuallyTakesAnObjectIn’, but also the relation ‘rarelyTakesAnObjectIn’.

In linguistic ontology, the defined relations serve to capture the nuances of how words and linguistic constructs interact within sentences. These semantic relations are not only instrumental in defining the structure of language, but also in understanding its fluidity and potential for variation in different contexts. For example:

The relations “isAccompaniedBy/canBeAccompaniedBy” describe situations in which an entity (such as a word or phrase) coexists with another in a sentence. The relation “isAccompaniedBy” could describe how an adjective is often accompanied by a noun. The extended form “canBeAccompaniedBy” takes into account less frequent or potential occurrences, thereby introducing flexibility into the language model (e.g., a noun can be accompanied by an adverb).

The relations “consistsOf/mayConsistOf” describe the components that make up a larger structure. “consistsOf” is used to describe definite, regular components (e.g., a sentence consists of words), while “mayConsistOf” accounts for possible variations or less common structures (e.g., a noun phrase consists of a noun, but in some cases, it may also consist of an entire sentence).

The relation “isDistinguishedIn” specifies how certain elements in the language are distinguished within specific contexts, such as how a verb may have different meanings depending on the sentence structure or the presence of particular adverbs. For example, the verb “run” can be distinguished in terms of its meaning in the phrases “run in a race” vs. “run a business”.

The relations “isFormedBy/mayBeFormedBy” indicate how complex linguistic structures are created by simpler components. “isFormedBy” refers to the regular construction of a linguistic entity, such as how a word is formed by morphemes. “mayBeFormedBy” accounts for alternate forms or possible constructions, such as compound words that might be formed in different ways.

These relations play crucial roles in linguistic ontology, as they allow for the representation of not only what is linguistically standard but also what can occur under certain conditions or in less frequent scenarios. The flexibility of these relations ensures that the model can adapt to various linguistic phenomena, offering a dynamic representation of language. Since language is a living and ever-evolving entity, not a static system with fixed relations, its description must inherently reflect this adaptability through flexible connections between its components. Protégé provides tools to define, modify, and extend these relations over time, ensuring the ontology remains relevant as language evolves.
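One way to picture these graded relations is to attach a modality to each relation name and to filter statements by it. The sketch below is an assumption-laden illustration: the relation names and the example statements come from the text above, but the `Modality` scale, its ordering, and the grouping of names under a common base relation are choices made for this example and are not part of the ontology.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Modality(IntEnum):
    # Assumed ordering, from weakest to strongest claim.
    RARELY = 1
    POSSIBLE = 2   # names starting with can... / may...
    USUALLY = 3
    REGULAR = 4    # the plain relation, e.g. consistsOf


# relation name -> (base relation, modality); the grouping is an assumption.
RELATIONS: Dict[str, Tuple[str, Modality]] = {
    "isAccompaniedBy": ("accompaniedBy", Modality.REGULAR),
    "isUsuallyAccompaniedBy": ("accompaniedBy", Modality.USUALLY),
    "canBeAccompaniedBy": ("accompaniedBy", Modality.POSSIBLE),
    "consistsOf": ("consistsOf", Modality.REGULAR),
    "mayConsistOf": ("consistsOf", Modality.POSSIBLE),
}

# Statements paraphrasing the examples given in the text.
FACTS: List[Tuple[str, str, str]] = [
    ("Adjective", "isAccompaniedBy", "Noun"),
    ("Noun", "canBeAccompaniedBy", "Adverb"),
    ("Sentence", "consistsOf", "Word"),
    ("NounPhrase", "consistsOf", "Noun"),
    ("NounPhrase", "mayConsistOf", "Sentence"),
]


def related(subject: str, base: str, at_least: Modality = Modality.RARELY) -> List[str]:
    """Objects linked to subject by any variant of base at or above a modality."""
    return sorted(
        obj
        for subj, name, obj in FACTS
        if subj == subject
        and RELATIONS[name][0] == base
        and RELATIONS[name][1] >= at_least
    )


all_components = related("NounPhrase", "consistsOf")
regular_components = related("NounPhrase", "consistsOf", Modality.REGULAR)
```

Querying with a higher threshold keeps only the regular structure of the noun phrase, while the default threshold also returns the less common construction.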

4.1.6. Determining the Data Properties

One of the aims of the ontological dictionary is to provide answers to lexicographic and morphosyntactic questions concerning each word-lemma. Such questions concern gender (masculine, feminine, or neuter), number (singular or plural), declension (which declension category the word belongs to and how it is inflected), synonyms (or synonymous phrases), antonyms, diminutives, magnifying nouns, etymology, phonetic transcription, and different (second) forms of a word, since some words have additional forms that are differentiated phonetically or morphologically, e.g.:

αδίστακτος/adistaktos/ and αδίσταχτος/adistachtos/ (unscrupulous);
different forms of the masculine, e.g., two forms for the masculine noun upholsterer: ταπετσιέρης/tapetsieris/ and ταπετσέρης/tapetseris/;
different forms of the feminine, e.g., three forms for the feminine noun doctor: γιατρίνα/yatrina/, γιάτρισσα/yatrissa/, γιατρέσσα/yatressa/.

The questions also concern the other gender of a word (e.g., ο διευθυντής, the director, and η διευθύντρια, the directress) and the English translation (Appendix A–item 4) that a word-lemma may have. To answer such questions, the characteristics of object properties, i.e., the data properties (facets of slots), were defined, as shown in Figure 2. Like the lemmas, this information is automatically extracted from [18] using Python code.
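The kind of record that such data properties describe can be sketched as follows. The field names (`second_forms`, `other_gender`, `english`) are hypothetical stand-ins, since the actual property names appear only in Figure 2, and the sample values are the word pairs quoted above. The sketch also builds an index so that any recorded form leads back to its lemma.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class LemmaData:
    lemma: str
    english: Optional[str] = None
    second_forms: List[str] = field(default_factory=list)  # phonetic or morphological variants
    other_gender: Optional[str] = None                      # form of the other gender


# Sample values taken from the examples in the text; field names are assumptions.
LEMMAS = [
    LemmaData("αδίστακτος", english="unscrupulous", second_forms=["αδίσταχτος"]),
    LemmaData("ταπετσιέρης", english="upholsterer", second_forms=["ταπετσέρης"]),
    LemmaData("διευθυντής", english="director", other_gender="διευθύντρια"),
]


def build_form_index(lemmas: Iterable[LemmaData]) -> Dict[str, str]:
    """Map every recorded form (lemma, variant, other gender) to its lemma."""
    index: Dict[str, str] = {}
    for item in lemmas:
        for form in [item.lemma, *item.second_forms]:
            index[form] = item.lemma
        if item.other_gender is not None:
            index[item.other_gender] = item.lemma
    return index


FORM_INDEX = build_form_index(LEMMAS)
```

With such an index, a lookup of a second form or of the other-gender form resolves to the same dictionary entry as the lemma itself.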

4.1.7. Defining Instances

The instances/individuals of the ontology, which are the lexicographic entries/lemmas, were defined, and annotation properties were added to give information about each instance. Individual lexical types (e.g., ‘shelf’, ‘chair’), linguistic-grammatical, syntactic, and phonetic terms (e.g., cardinal numeral, subordinate clauses, posterior vowels), and multiword terms (e.g., ‘shelves free entry’) have been registered as lexicographic entries/lemmas. Entries/lemmas with different meanings (polysemy) have been registered as separate entries. At the grammatical level, the endings of declinable words and verbs were also registered as instances. The annotation properties relate to the definition of each lexicographic item, its pronunciation, its spelling, its meaning (label), its semantics, remarks on its inflection, the link referring to its inflection, its morphological variety, its type formation, its syntax, the idioms (phrase) (Appendix A–item 5), expressions, or sayings associated with it, examples of its use, its style in various communicative contexts, other information that may be considered important for a dictionary entry (comment or SeeAlso), an image of the object to which it refers, or a link pointing to an online source (SeeAlsoURL). Both data properties and annotation properties can be entered manually or automatically from digital data, depending on whether they can be digitally extracted.

4.1.8. Definition of Rules and Axioms

Finally, rules and axioms were defined, so that the ontological system works effectively and efficiently and the questions posed to it can be answered.

4.2. The Classes and Subclasses of Modern Greek Ontology

The class ‘New_Greek_Language’, which is (by a historical criterion) a subclass of the class ‘Greek_Language’, has as its subclasses the four levels of linguistic analysis: ‘Morphology’, ‘Semantics’, ‘Syntax’ and ‘Phonetics’.

4.2.1. The ‘Morphology’ Class

Starting the hierarchical classification from the object ‘Word’, the class ‘Morphology’ includes subclasses that have been derived using various multi-level division criteria (based on both the historical and the synchronic division of the language). They enable the user to search for grammatical information on all parts of speech (inflected and non-inflected): article (definite, indefinite, prepositional), pronoun (referential, indefinite, autopathetic, indicative, interrogative, possessive, definite and personal), name (noun, adjective, numeral), participle (inflected: passive present participle and passive past participle; non-inflected: active present participle), infinitive, verb (auxiliary, linking, deponent, impersonal, elliptic, consonantal-ending, vowel-ending), adverb (locative, modal, quantitative, etc., as well as adverbs in -α (/a/), -ώς (/os/), -ί (/i/), etc.), conjunction (coordinating or subordinating), preposition (common, obsolete, multi-word preposition), interjection, and particle. Information can also be derived about the number, inflection, case, gender, persons and inflected forms of all the declinable parts of speech. It is worth noting that, through the hyperonymy and hyponymy of the classes concerning the two numbers of the names of the Modern Greek language, the user can find all Greek words used only in the singular or only in the plural. In addition, through the classes relating to the numbers, genders, cases, and declension of the name types, the user can search for all the values of the endings in all cases (nominative, genitive, accusative, vocative) of nouns (parisyllabic and imparisyllabic) and of adjectives of all degrees (positive, comparative and superlative) in all three genders. The present ontology therefore implements the ontological organization of the endings of nominal types in the singular and plural, in all three genders, in all cases, and by inflection.
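The organization of nominal endings by number and case can be pictured with a small lookup table. The Python sketch below is only an illustration under stated assumptions: it covers a single paradigm (masculine nouns in -ος whose stress does not move, such as δρόμος), and the table name and the helper functions `decline` and `slots_for` are hypothetical, not part of the ontology.

```python
# Endings of masculine nouns in -ος (e.g., δρόμος), keyed by (number, case).
# Stress shifts, which occur in other paradigms, are deliberately ignored.
ENDINGS_MASCULINE_OS = {
    ("singular", "nominative"): "ος",
    ("singular", "genitive"): "ου",
    ("singular", "accusative"): "ο",
    ("singular", "vocative"): "ε",
    ("plural", "nominative"): "οι",
    ("plural", "genitive"): "ων",
    ("plural", "accusative"): "ους",
    ("plural", "vocative"): "οι",
}


def decline(stem, number, case):
    """Attach the ending stored for the requested number and case."""
    return stem + ENDINGS_MASCULINE_OS[(number, case)]


def slots_for(ending):
    """Return every (number, case) slot that a given ending can fill."""
    return sorted(
        slot for slot, value in ENDINGS_MASCULINE_OS.items() if value == ending
    )


genitive_singular = decline("δρόμ", "singular", "genitive")
slots_of_oi = slots_for("οι")
```

The reverse lookup in `slots_for` shows why the same ending may need to be recorded as a value of several instances: -οι fills both the nominative and the vocative plural.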

In addition to the declinable parts of speech, the user can search for grammatical information about the verb, e.g., the mood, function, conjugation, voice, tense, mode of action, persons, and number of a verb type. For example, the user can search the instances of the subclasses of the ‘Verb’ class for the endings of all persons in all tenses, in all conjugations, in both voices, in barytone as well as contract verbs; the ontological organization of the endings of all verb types is thereby implemented. Also, in the annotation properties of each tense, the user can find (a) the meaning of the tense, (b) the way it is formed, (c) examples of verbal types, (d) spelling observations, (e) semantic observations, and (f) irregular formations of verbs. In addition, it is possible to look up the inflection of each verb, based on its stem (present tense or aorist), its ending, its formation, or the inflection category to which it belongs. Finally, the classes ‘Transitive_Verbs’, ‘Intransitive_Verbs’ and ‘Deponent_Verbs’ are also linked through various binary relations to the entity ‘Syntax’, so that the user can search which verbs of Modern Greek are transitive or intransitive, what kind of objects they take, if any (direct or indirect object), and in which cases (genitive, accusative, ancient dative case) they take their objects. This ability is essential for the computational parsing of Modern Greek.
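The link between verbs and the cases of their objects can be sketched as a set of binary relations. In the Python example below, the relation names `hasDirectObjectCase` and `hasIndirectObjectCase`, the three sample verbs, and the `classify` helper are assumptions chosen for illustration; the actual object properties of the ontology may be named and structured differently.

```python
# Binary relations (verb, relation, case); relation names are hypothetical.
OBJECT_CASE_RELATIONS = [
    ("βλέπω", "hasDirectObjectCase", "accusative"),    # βλέπω = to see
    ("δίνω", "hasDirectObjectCase", "accusative"),     # δίνω = to give
    ("δίνω", "hasIndirectObjectCase", "genitive"),
]


def object_cases(verb):
    """Collect the object cases recorded for a verb."""
    return {
        relation: case
        for subject, relation, case in OBJECT_CASE_RELATIONS
        if subject == verb
    }


def classify(verb):
    """Derive a valency label from the recorded relations.

    A verb with no recorded object relation is treated as intransitive,
    which is a simplification made for this sketch.
    """
    cases = object_cases(verb)
    if "hasDirectObjectCase" in cases and "hasIndirectObjectCase" in cases:
        return "ditransitive"
    if cases:
        return "transitive"
    return "intransitive"


labels = {verb: classify(verb) for verb in ("βλέπω", "δίνω", "κοιμάμαι")}
```

A parser could consult such relations to decide whether a noun phrase in the genitive or accusative is a plausible object of the verb it follows.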

Regarding the non-inflected parts of speech, the user of the ontology can derive information about the uninflected noun forms (nouns, adjectives, pronouns, numerals), uninflected participles (active present tense participle), infinitives (past tense), adverbs (in the three degrees), prepositions (common, obsolete, multi-word), conjunctions (coordinating and subordinating), interjections and particles. For adverbs, the user can look for information both on how they are formed and on the values of their endings in all three degrees. In the class ‘Particle’, the user can search not only for the particles of Modern Greek but also for the words that accompany them. These particles, in addition to being instances of the ‘Particle’ class, were also registered as classes in the ontology, so that the constitutive relationships can be captured. The same holds for the ‘Preposition’ class: the various prepositions were registered both as instances and as classes, as they had to be connected to other classes through various relationships to declare the prepositional attributes created at the syntactic level as well as their meaning at the semantic level. In general, a basic principle of the ontology is that lemmas are registered as instances. However, any lemma that is a basic structural element for the formation of some grammatical-syntactic types was also registered as a class and not only as an instance.

4.2.2. The ‘Syntax’ Class

At the syntactic level, starting from the ‘Syntax’ class, which has ‘Sentence’ as its subclass, the user can search, through the hierarchical classification and the binary relations created between the classes, for all the information related to main and subordinate sentences, which, as objects of the ontology, are subclasses of the ‘Sentence’ class. Thus, the user can find out whether a sentence is:

a sentence of desire, of estimation, interrogative, or exclamatory (the division criterion being its content);
a simple, augmented, elliptical, or compound sentence (the division criterion being its basic terms);
an affirmative or negative sentence (the division criterion being its quality).

The user can also find all kinds of subordinate clauses and look up their introducing conjunctions, their moods of speech, the negations they accept, their modality, and the adverbial relations indicated in each clause. In addition, the user can search for the types of conditionals, the tenses and moods with which the hypothesis or the main clause of each conditional is uttered, and the relations expressed (fact, non-fact, expected, etc.).

In addition to subordinate clauses, the user can search for information about the main terms of a sentence (subject, verb, object, predicative), as well as about each lectical set (nominal, verbal, adverbial, prepositional phrase) that makes up a sentence and its function within it (e.g., a nominal phrase can function as subject, object, predicative or even adverbial attribute). It is also possible to investigate the types of predicative (predicative of the subject, predicative of the object, preventive predicative, adverbial predicative) and to link them to the verbs (linking or not) that each of them follows. Crucially, the user can also look for all the types of objects (direct, indirect/prepositional, corresponding), and find the verbs (transitive or ditransitive) that take each object as well as the cases (e.g., genitive, accusative, ancient dative) in which the nominal set that functions as an object is found. Finally, the ontology provides information on nominal (congruent or non-congruent) and adverbial attributes in all their forms. It is worth noting that all adverbial attributes are linked to the level of semantics, to indicate the adverbial relation they express.

4.2.3. The ‘Semantics’ Class

At the level of semantics, the basic class of which is ‘Concept_of_Word’, the user can find information on the origin of words (colloquial and literary words), their formation (derivation and compounding) and meaning, as well as on the modality (deontic and epistemic) of verbs. With regard to the meaning of words, the ontology provides information on topics such as ambiguity (structural and lexical), polysemy (antonyms, homonyms, paronyms, synonyms, tautonyms, hyperonyms, hyponyms, literalism, and metaphor), vagueness, gender-based meaning, and the meaning of adverbial attributes. Each class relating to an adverbial attribute concept enables the user to search for all the ways of expressing the specific concept (e.g., by an adverb, by a prepositional set, by an oblique case, etc.) and for all the adverbial concepts that a prepositional or adverbial phrase or, more generally, a lectical set with an adverbial function can express.

Regarding the derivation of words, the user can search in the instances of the subclasses of the class ‘Derivational_Ending’:

the derivational endings of adjectives derived from adverbs, verbs, adjectives, or nouns;
the derivational endings of adverbs;
the derivational endings of nouns; and
the derivational endings of verbs derived from names or non-inflected words.

It is also very important that the user can search, in the instances of the subclasses of the class ‘Derivative’, not only for the root but also for the derivational affixes of word formation: suffix, postfix, prefix, and symfix. This reflects the fact that, in order to derive words, the Greek language uses derivational affixes, which are nothing but morphemes. At the level of word synthesis, the ontology provides information on genuine and abusive synthesis as well as on all types of compounds (objective, ordinal, possessive, determinative, parasynthetic, and multiword compounds). In addition, the user of the ontology can search for the part of speech that occupies the position of the first or second component of a compound word and for the form that a component has taken during synthesis, e.g., the numeral ‘δύο’ (two) in synthesis is transformed into δι- /di-/ or δισ- /dis-/: δι-πλός (double) and also δι-σύλλαβος (di-syllabic). In this way, the ontology also makes it possible to create compounds, as it includes all the words that can form the first or second component of a word. Therefore, in the field of word derivation and word compounding, the ontology is a potential system of word derivation and word synthesis.
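The idea that the ontology can act as a system of word synthesis can be illustrated with a minimal Python sketch. It uses only the example given above (the numeral δύο with its combining forms δι- and δισ-); the table `COMBINING_FORMS`, the small `ATTESTED` lexicon, and both helper functions are hypothetical and stand in for lookups that would be made against the ontology.

```python
# Forms a first component takes inside a compound (example from the text).
COMBINING_FORMS = {"δύο": ("δι", "δισ")}

# Tiny stand-in for the lemmas registered in the ontology.
ATTESTED = {"διπλός", "δισύλλαβος"}


def compound_candidates(first, second):
    """Join every combining form of the first component with the second."""
    forms = COMBINING_FORMS.get(first, (first,))
    return [form + second for form in forms]


def attested_compounds(first, second, lexicon=ATTESTED):
    """Keep only the candidates that exist as registered lemmas."""
    return [word for word in compound_candidates(first, second) if word in lexicon]


candidates = compound_candidates("δύο", "πλός")
attested = attested_compounds("δύο", "σύλλαβος")
```

Generating every candidate and then filtering against the registered lemmas avoids over-derivation: a form such as δισπλός is produced as a candidate but rejected because it is not in the lexicon.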

4.2.4. The ‘Phonetics’ Class

The class ‘Phonetics’ has as its subclass the class ‘Word’, which has as its subclasses all kinds of words (e.g., simple, compound, derivative, original, radical, monosyllabic, disyllabic, trisyllabic, polysyllabic, proparoxytone, paroxytone, oxytone, atonal, enclitic, etc.). It is noteworthy that in the data properties of the individuals of the various classes and subclasses, the corresponding values of each object of the ontology are listed. Thus, for example, in the data properties of the individual ‘Diphthong_Vowel’ of the corresponding class, the values of the diphthong vowels have been registered. In the same way, the user can find the values of all vowels (front, central, back, or round, non-round, or high, low, and middle), consonants (voiceless, sonorous, or bidental, dental, laryngeal, sibilant, labial, liquid, nasal, or fricative and instant) as well as vowel and consonant complexes.
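How data properties allow vowels to be retrieved by their features can be sketched as follows. The five-vowel inventory is that of Modern Greek; the feature names (`backness`, `height`, `rounded`) and the `vowels_with` helper are assumptions made for this illustration and do not reproduce the actual data properties of the ontology.

```python
# The five vowel phonemes of Modern Greek with illustrative feature values.
VOWELS = {
    "i": {"backness": "front", "height": "high", "rounded": False},
    "e": {"backness": "front", "height": "mid", "rounded": False},
    "a": {"backness": "central", "height": "low", "rounded": False},
    "o": {"backness": "back", "height": "mid", "rounded": True},
    "u": {"backness": "back", "height": "high", "rounded": True},
}


def vowels_with(**features):
    """Return the vowels whose stored features match every requested value."""
    return [
        vowel
        for vowel, properties in VOWELS.items()
        if all(properties[name] == value for name, value in features.items())
    ]


back_vowels = vowels_with(backness="back")
high_unrounded = vowels_with(height="high", rounded=False)
```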

5. Results and Evaluation

The results from the application of the ontological dictionary demonstrated the special status it can occupy in the field of Natural Language Processing. Its most important advantages are:

the alphabetical listing (and coding) of lemmas, which allows for their easy searching;
the search for information about the various meanings and uses of words, their correct spelling, inflection and pronunciation, their valid etymology, their conceptual connection with other words, their position in the text and, more generally, their comprehensive mapping, representation and description;
the grouping of endings of nominal and verbal types, which makes it suitable for use by natural language automatic processing programs (taggers and lemmatizers);
the connection of all four levels of the language, which enables the encoding of the semantic relations existing at the grammatical-syntactic level;
the complete rendering of lexical meaning through the various correlations of lexical entities/lemmas, from which all the derived word forms and associations can be inferred, e.g., relations based on inflectional or derivational patterns such as συγγράφω > συγγράφεις (I write > you write) or συγγράφω > συγγραφέας (write > writer), respectively;
the possibility of multiple inheritance;
the possibility of dealing with cases of ambiguity;
the existence of an English translation for each lemma-instance, so that the ontology can be understood and used internationally, and also so that the differences between the Greek language and English can be more easily seen.

5.1. Connecting the Levels of Linguistic Analysis

In the common textbooks of Modern Greek grammar and syntax, the connection between the levels of linguistic analysis (morphology, syntax, semantics, and phonetics) cannot be captured under any circumstances. However, in the present ontology, this is achievable. Some examples, indicating the connection between the morphology level and the syntax level, are the mapping through object properties of:

the agreement of the verb with its subject (level of syntax) in person and number (level of morphology);
the agreement of the predicative with its subject (level of syntax) in gender, case and number (level of morphology);
the case (level of morphology) of uttering the direct, indirect and corresponding object (level of syntax);
the syntactic functions (level of syntax) that the different parts of speech can have (level of morphology) as subject, object, predicative, nominal or adverbial attributes, etc.

Also, examples that indicate the connection between the levels of morphology, syntax, and semantics are the mapping through object properties of:

the inflection and tense (level of morphology) of the utterance of subordinate clauses, as well as the rendering of conditional clauses (level of syntax), so that the meaning of each sentence (semantics level), even of conditionals, can be ontologically captured;
the manner of utterance (level of syntax) of the moods (level of morphology) in relation to the meaning of the moods (level of semantics);
the cases (level of morphology) with which the prepositions are formed into (level of syntax) and the meaning expressed by the prepositional phrases (level of semantics) that are created.
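One way to picture how object properties connect the levels of analysis is to tag each class with its level and then list the properties whose two ends lie on different levels. In the Python sketch below, the class names, property names, and level assignments are hypothetical examples loosely based on the cases listed above, not an extract from the ontology.

```python
# Level of linguistic analysis to which each (hypothetical) class belongs.
CLASS_LEVEL = {
    "Noun": "Morphology",
    "Subject": "Syntax",
    "Sentence": "Syntax",
    "Prepositional_Phrase": "Syntax",
    "Adverbial_Relation": "Semantics",
}

# Object properties as (domain class, property name, range class).
OBJECT_PROPERTIES = [
    ("Noun", "canFunctionAs", "Subject"),
    ("Sentence", "hasTerm", "Subject"),
    ("Prepositional_Phrase", "expresses", "Adverbial_Relation"),
]


def cross_level_links(properties=OBJECT_PROPERTIES):
    """Return the properties whose domain and range lie on different levels."""
    return [
        (domain, name, target)
        for domain, name, target in properties
        if CLASS_LEVEL[domain] != CLASS_LEVEL[target]
    ]


links = cross_level_links()
```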

5.2. Multiple Inheritance at All Levels of Linguistic Analysis

Of particular interest is the fact that the ontological dictionary allows multiple inheritance, i.e., a ‘child’ class inherits properties from different ‘parents’. For example, the class ‘Numeral_Adjective’ is a subclass of both the class ‘Numeral’ and the class ‘Adjective’, which means that the numeral adjective inherits the properties of both the numeral and the adjective. This possibility runs through all four levels of the language, meaning that a class can ‘descend’ from a ‘parent’ belonging to one level of the language and from another ‘parent’ belonging to another level. In this way, a user of the ontology who selects, e.g., the field of syntax can also find elements from the field of morphology, and vice versa, which cannot be achieved in conventional grammar and syntax textbooks.
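The multiple inheritance described in this paragraph can be sketched as a small class graph in which ‘Numeral_Adjective’ has two parents. The class names come from the text; the property names (`expressesNumber`, `modifiesNoun`) and the helper functions are assumptions introduced only for illustration.

```python
# Direct superclasses of each class (a class may have several parents).
SUPERCLASSES = {
    "Numeral_Adjective": ["Numeral", "Adjective"],
    "Numeral": [],
    "Adjective": [],
}

# Properties declared directly on each class; the names are hypothetical.
OWN_PROPERTIES = {
    "Numeral": {"expressesNumber"},
    "Adjective": {"modifiesNoun"},
    "Numeral_Adjective": set(),
}


def ancestors(cls):
    """List all superclasses of a class, nearest parents first, no repeats."""
    found = []
    for parent in SUPERCLASSES.get(cls, []):
        for candidate in [parent] + ancestors(parent):
            if candidate not in found:
                found.append(candidate)
    return found


def inherited_properties(cls):
    """Union of the class's own properties and those of all its ancestors."""
    properties = set(OWN_PROPERTIES.get(cls, set()))
    for ancestor in ancestors(cls):
        properties |= OWN_PROPERTIES.get(ancestor, set())
    return properties


parents = ancestors("Numeral_Adjective")
properties = inherited_properties("Numeral_Adjective")
```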

This possibility of multiple inheritance also affects the categorization of parts of speech. While generally following the categorization found in [16], one can easily realize that the categorization of parts of speech in the ontology is more complex than their categorization in a conventional grammar of Modern Greek. In other words, it is possible to have ‘multiple’ categorizations. However, there are also cases in which specific restrictions are placed on multiple inheritance, as a ‘child’ class may descend from two diametrically opposed ‘parents’ in terms of the properties it inherits. Noteworthy is the case of the infinitive and the participle, which:

combine features of the name and the verb;
sometimes have features of inflected types and sometimes features of non-inflected types, e.g., the non-inflected participle in -οντας/-ontas/ or -ώντας/-óntas/ and the inflected participle in -μένος/-menos/; and
belong to the impersonal moods which are non-inflected, but also to the pendant parts of the verb, which is an inflected grammatical type.

This means that in each case of the above, restrictions are set (by using rules) on which properties should be inherited from one ‘parent’ and which from the other ‘parent’, since the two ‘parents’ (e.g., ‘Non-Inflected_Parts_of_Speech’ and ‘Inflected_Parts_of_Speech’) have opposite properties.
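A rule that decides which ‘parent’ supplies a conflicting property might look like the following Python sketch. The two parent class names are taken from the text; the property `is_inflected`, the child class names, the rule table, and the `resolve` function are hypothetical, and the ontology's own rules may be expressed quite differently (for example, as axioms inside Protégé).

```python
# Property values declared on the two opposed parents (names from the text).
PARENT_PROPERTIES = {
    "Inflected_Parts_of_Speech": {"is_inflected": True},
    "Non-Inflected_Parts_of_Speech": {"is_inflected": False},
}

# Hypothetical rules: which parent supplies a property for a given child.
INHERITANCE_RULES = {
    ("Participle_in_-μένος", "is_inflected"): "Inflected_Parts_of_Speech",
    ("Participle_in_-οντας", "is_inflected"): "Non-Inflected_Parts_of_Speech",
}


def resolve(child, prop, parents):
    """Return the inherited value of a property, applying a rule on conflict."""
    values = {
        parent: PARENT_PROPERTIES[parent][prop]
        for parent in parents
        if prop in PARENT_PROPERTIES[parent]
    }
    if len(set(values.values())) <= 1:
        # No conflict: all parents agree (or none declares the property).
        return next(iter(values.values()), None)
    chosen = INHERITANCE_RULES.get((child, prop))
    if chosen is None:
        raise ValueError(f"no rule resolves {prop!r} for {child!r}")
    return values[chosen]


BOTH_PARENTS = ["Inflected_Parts_of_Speech", "Non-Inflected_Parts_of_Speech"]
passive = resolve("Participle_in_-μένος", "is_inflected", BOTH_PARENTS)
active = resolve("Participle_in_-οντας", "is_inflected", BOTH_PARENTS)
```

Raising an error when no rule applies makes an unresolved conflict visible instead of silently letting one parent win.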

Therefore, while in a conventional grammar of the Modern Greek language all adjectives, nouns, pronouns, participles, etc., are each categorized together, in the ontology there must be separate classes with a different hierarchy for each case, as different properties are inherited each time from the ‘parent’ superclass. That is why many classes are declared ‘EquivalentTo’ other classes. For example, the class ‘Numeral_Noun’ that is a subclass of the class ‘Numeral’ is equivalent to the class ‘Numeral_Noun’ that is a subclass of the class ‘Noun’. This is an advantage of the ontology, as it can capture the complexity of the relationships of the language constituents.

5.3. Linguistic Ambiguity

The biggest difficulty in natural language processing is ambiguity, which allows more than one interpretation at the syntactic, lexical, referential, semantic, or pragmatic level. Precisely because various ambiguities co-exist in the Modern Greek language, recording the linguistic issues in an unambiguous way is quite difficult, e.g., the case of the ending ‘-a’, which can be the ending of a noun, adjective, pronoun, verb, or adverb. However, these ambiguities are resolved in the present ontology of the Modern Greek language, because they are represented through ontological associations: a ‘child’ node can have more than one ‘parent’ node, so that a lexical entity can be a subclass of two or more classes. Thus, e.g., the ending ‘-a’ is a value of instances of several subclasses, and confusion is avoided. Therefore, by combining, when defining and prioritizing classes, the creation of a common super-class from two or more classes (bottom-up method) with the creation of classes from a super-class (top-down method), it is possible to avoid confusion and over-derivation.
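The treatment of an ambiguous ending can be sketched in Python as an index that maps the ending to every class it may belong to, followed by a narrowing step. The list of classes for the ending -α follows the text; the index structure, the accent-stripping step, the example word καλά, and the `narrow` helper are assumptions made for illustration.

```python
import unicodedata

# Classes whose instances list the ending -α as a value (list from the text).
ENDING_CLASSES = {"α": ["Noun", "Adjective", "Pronoun", "Verb", "Adverb"]}


def strip_accents(text):
    """Remove stress marks so that -ά and -α are matched alike."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidate_classes(word):
    """Return all classes compatible with the longest matching ending."""
    plain = strip_accents(word.lower())
    matches = [ending for ending in ENDING_CLASSES if plain.endswith(ending)]
    if not matches:
        return []
    return ENDING_CLASSES[max(matches, key=len)]


def narrow(word, allowed):
    """Keep only the candidates permitted by some external constraint."""
    return [cls for cls in candidate_classes(word) if cls in allowed]


all_candidates = candidate_classes("καλά")
after_constraint = narrow("καλά", {"Adverb"})
```

The ambiguity is kept explicit as a list of candidates, and a constraint from another level (here a hypothetical syntactic slot that admits only adverbs) selects among them.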

5.4. Automatic Management and Processing of Information

One of the main advantages of the ontological dictionary is that information held in other, non-semantic forms of organization, such as conventional relational databases, has been incorporated through matching mechanisms. Data available on the internet in structured and semi-structured sources were accessed, and the scalability problems that often appear in ontological descriptions with a large number of individuals were addressed. Thus, an important achievement is the automatic population of the ontology from [18], which was used as the basic linguistic resource of the Modern Greek language, as well as from other sources on the internet (e.g., the open data terms from [19] were used). That is, a framework was created to bring the data of the aforementioned electronic dictionary and other sources under a common ontological dictionary, and the resulting set is provided as a basis for internet information mapping and the creation of semantic web applications. The result is a lexical knowledge base, the scope of which covers almost all the lemmas of the Modern Greek language and includes hundreds of thousands of terms and relationships that connect the terms to each other. This means that the present Knowledge-Based System (KBS) can be used to explore additional applications, such as machine learning. That is, it can be used to improve the performance and coherence of systems in the fields of Natural Language Understanding (NLU), Natural Language Generation (NLG) and Large Language Models (LLMs), enhancing natural language understanding and generation through the structured representation of terms and the relationships between them, the categorization and classification of data, or the extraction of features.
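Automatic population of an ontology from dictionary records can be pictured as a conversion of each record into RDF statements. The Python sketch below writes N-Triples lines using only the standard library. The namespace `http://example.org/modern-greek#`, the record fields, and the property `hasEnglishTranslation` are hypothetical; the article does not describe the actual extraction code or vocabulary, so this is not a reconstruction of it.

```python
from urllib.parse import quote

BASE = "http://example.org/modern-greek#"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(local_name):
    """Build an IRI in the hypothetical namespace (non-ASCII is escaped)."""
    return f"<{BASE}{quote(local_name)}>"


def literal(text, language):
    """Build a language-tagged literal with basic escaping."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{language}'


def record_to_triples(record):
    """Turn one dictionary record into N-Triples lines."""
    subject = iri(record["lemma"])
    triples = [
        f"{subject} <{RDF_TYPE}> {iri(record['class'])} .",
        f"{subject} <{RDFS_LABEL}> {literal(record['lemma'], 'el')} .",
    ]
    if record.get("translation"):
        translation = literal(record["translation"], "en")
        triples.append(f"{subject} {iri('hasEnglishTranslation')} {translation} .")
    return triples


# Two records built from the example lemmas mentioned earlier in the article.
records = [
    {"lemma": "διευθυντής", "class": "Noun", "translation": "director"},
    {"lemma": "διευθύντρια", "class": "Noun", "translation": "directress"},
]
lines = [triple for record in records for triple in record_to_triples(record)]
```

A file of such lines could then be loaded into an ontology editor or triple store, so that the individuals do not have to be entered by hand.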

6. Conclusions and Future Approaches

Finally, it is concluded that a conceptually structured Knowledge-Based System (KBS) was implemented in the field of lexical semantics of the Modern Greek language that uses an ontological dictionary as a basis for the semantic integration of information. At the same time, a logico-semantic theory is developed, which defines in practice the types and structures of objects, properties, events, processes, and relations that exist in the system and are relevant to natural language. With this theory, language overcomes the division of its nature and finds its ontological unity. Language, according to linguistic structuralism, is a structural system, a system of interdependent relations between phonemes and formulas. In this way, it is treated in this research, since it is depicted as a network of elements, governed by interdependence relations and not as a set of individual elements of language, as is the case in traditional grammatical categories (noun, adjective, verb, etc.). Each word is examined synthetically, as an element in relation to the whole, and even as a network of value. It consists of connected and interdependent elements, beginning with the morpheme and ending with the complete sentence, which work together to structure the speech. Indeed, many linguistic elements are involved and many different linguistic processes are completed at all levels of language to achieve linguistic coherence and an integrated communication between transmitter and receiver.

At the same time, the present ontological linguistic knowledge system has the utility of all ontologies, as mentioned in [1]. This means that:

it is readable and understandable by computers (machine-readable);
it provides semantic description to the contents of the internet;
it makes it possible with simple reasoning mechanisms to perform concept-based search instead of keyword-based search, thus enabling semantic focus of queries, questions and answers in terms of more than one, and the use of text transformation operators;
it enables automated inference and reasoning services;
it is useful for sharing a common structure for understanding information [20,21] and reuse of knowledge, enabling the integration of heterogeneous information sources;
it is a powerful tool for database integration and natural language understanding [22];
it can participate in the use of different information sources in a variety of applications;
it includes an electronic conceptual dictionary of Modern Greek that organizes its linguistic material ontologically, according to the standards of dictionaries at the international level; and
it is the most appropriate way of representing linguistic knowledge, since it allows for the definition of relationships between words, which is not present in a standard dictionary [23].
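The difference between keyword-based and concept-based search mentioned in the list above can be shown with a toy Python example. The hyponym table, the sample documents, and both search functions are invented for illustration; a real system would draw the hyponymy relations from the ontology and use a reasoner or query language rather than plain string matching.

```python
# Toy hyponymy relation: a concept and the narrower terms below it.
HYPONYMS = {"furniture": ["chair", "shelf"]}


def expand(concept):
    """Return the concept together with all of its (transitive) hyponyms."""
    terms = [concept]
    for narrower in HYPONYMS.get(concept, []):
        terms.extend(term for term in expand(narrower) if term not in terms)
    return terms


def keyword_search(query, documents):
    """Match only documents that contain the query word itself."""
    return [doc for doc in documents if query in doc.lower().split()]


def concept_search(query, documents):
    """Match documents that contain the query word or any of its hyponyms."""
    terms = set(expand(query))
    return [doc for doc in documents if terms & set(doc.lower().split())]


documents = [
    "the chair is broken",
    "a shelf of books",
    "furniture on sale",
    "open the window",
]
by_keyword = keyword_search("furniture", documents)
by_concept = concept_search("furniture", documents)
```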

Regarding the experimental data used for building and validating the ontology, these were sourced from the Modern Greek linguistic resource mentioned in [18], as well as from additional internet-based open data sources, such as those listed in [19]. To ensure the accuracy and relevance of the ontology, various validation methods were employed. These included both manual verification by linguistic experts and automated checks against established linguistic databases. Furthermore, the ontology’s construction and validation were evaluated through its application in semantic web frameworks, where its effectiveness in providing meaningful relationships and mappings was tested. These validation techniques ensured that the resulting ontological dictionary is both comprehensive and reliable.

Of course, the construction of a linguistic ontology faces several technical challenges, such as data sparsity, subjectivity in the ontology construction process, and linguistic complexity. Data sparsity limits the ontology’s coverage, while subjectivity arises from varying decisions on how to categorize and define linguistic elements. Additionally, linguistic complexity, including ambiguities and irregularities, makes it difficult to accurately capture linguistic phenomena. Addressing these challenges requires a combination of strategies, including data augmentation, integration of multiple linguistic resources, collaboration with linguistic experts, and the adoption of flexible models to accommodate linguistic complexities.

The implementation of this linguistic knowledge system is not an end in itself. The definition of the data itself and its structure can be used by other programs, for various problem-solving methods, for real-world applications, etc. The ultimate goal is for the present ontology to provide possibilities for the introduction and integration of existing ontologies and compatibility with top-level ontologies, such as the Greek WordNet and BalkaNet ontologies, as well as the creation of an extensible repository of knowledge and information with potential applications in a variety of scientific fields, as noted below:

Linguistic and cross-linguistic research: Facilitates research in linguistics by offering a structured framework for analyzing and comparing languages.
Educational applications: Supports language learning tools by analyzing grammar, syntax, and linguistic structures.
Interconnection with other dictionaries and enrichment: Enables the integration and enhancement of existing dictionaries through semantic alignment and data sharing.
Search Engine Optimization (SEO): Improves semantic analysis of queries and search results, enhancing search engine performance.
Natural Language Generation (NLG): Assists in producing high-quality, context-sensitive content in various styles.
Natural Language Understanding (NLU): Enhances the interpretation of texts and conversations for more accurate understanding.
Interoperability and data exchange: Ensures seamless integration with other systems for improved data sharing and interaction.
Semantic web integration: Contributes to better categorization and searchability of data on the semantic web.
Decision-making systems: Supports decision-making processes in fields like healthcare, law, and administration by providing structured linguistic data.
Sentiment analysis: Assists in identifying emotional and ideological content within texts, aiding in sentiment analysis applications.

Finally, it is essential to understand that there is no ‘right’ way to model a domain. Instead, there are always viable alternatives, which almost always depend on the application the user wants to use and the extensions they want to provide.

Author Contributions

Conceptualization, methodology, investigation, resources, writing—original draft preparation, N.S.; formal analysis, software, data curation, writing—review and editing, E.P.; visualization, validation, supervision, project administration, N.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors express their thankfulness to the reviewers for their comments and suggestions that improved the content of this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

1. The term “word” means any linguistic unit that contains meaning and grammatical determination (see [24]).
2. The term “concept” means the totality of the main characteristics of a multitude of similar objects, concrete or abstract, as well as the permanent and definite representation of them formed in our minds (see [25]).
3. The relationships that connect linguistic elements/linguistic units to a specific linguistic environment or other similar units, through contrast and substitution, are paradigmatic [26]. The relations that the linguistic unit enters into as a result of its co-occurrence with other similar units are syntagmatic (constitutional).
4. There were, of course, cases in which there was no correspondence between the Greek terms and the English ones, due to the lack of the corresponding grammatical-syntactic phenomena in English. However, an attempt was made to render these terms as closely as possible to their meaning in Greek. Thus, for example, the term ‘ειδική πρόταση’, which does not exist in English, was rendered with the term ‘sentence_of_saying’.
5. The annotation property ‘phrase’ lists stereotyped expressions whose meaning has moved completely away from the literal one and no longer has any literal relation to the words of which they are composed, referring directly to a metaphor, e.g., “the last hole of the flute”, a phrase referring to a person whom we consider inferior and do not take into account (see [27]).

References

Samaridi, N.; Karanikolas, N.; Papoutsidakis, M.; Papakitsos, E. A Survey on Ontological Organization of Data in the Semantic Web. In Proceedings of the 14th International Scientific Conference “eRA 2021” in the Field of “Industry 4.0”, Data Management and Educational Parameters of Industry 4.0, Athens, Greece, 19 October 2021; Available online: http://era-conference.teipir.gr/ (accessed on 12 June 2024).
Samaridi, N.; Karanikolas, N.; Papakitsos, E. Lexicographic Environments in Natural Language Processing (NLP). In Proceedings of the 24th Pan-Hellenic Conference on Informatics (PCI 2020), Athens, Greece, 20–22 November 2020. [Google Scholar] [CrossRef]
Samaridi, N.; Karanikolas, N.; Papakitsos, E.; Papoutsidakis, M. Designing a Greek Electronic Dictionary based on Ontology. In Proceedings of the 24th Pan-Hellenic Conference on Informatics (PCI 2020), Athens, Greece, 20–22 November 2020. [Google Scholar] [CrossRef]
McGuinness, N.; Van Harmelen, F. OWL Web Ontology Language Overview. February 2004. Available online: www.w3.org/TR/owl-features/ (accessed on 27 July 2024).
Sirin, E.; Parsia, B.; Grau, B.C.; Kalyanpur, A.; Katz, Y. Pellet: A practical OWL-DL reasoner. J. Web Semant. Sci. Serv. Agents World Wide Web 2007, 5, 51–53. [Google Scholar] [CrossRef]
Shearer, R.; Motik, B.; Horrocks, I. HermiT: A Highly-Efficient OWL Reasoner. In Proceedings of the Fifth OWLED Workshop on OWL: Experiences and Directions, Collocated with the 7th International Semantic Web Conference (ISWC-2008), Karlsruhe, Germany, 26–27 October 2008. [Google Scholar]
Grigoriadou, M.; Kornilakis, H.; Galiotou, E.; Stamou, S.; Papakitsos, E. The Software Infrastructure for the Development and Validation of the Greek Wordnet. J. Inf. Sci. Technol. 2004, 7, 89–105. [Google Scholar]
Copeland, G.; Khoshafian, S. Identity and Versions for Complex Objects. In Proceedings of the 1986 International Workshop on Object-Oriented Database Systems, Pacific Grove, CA, USA, 23–26 September 1986; Available online: https://dl.acm.org/doi/10.5555/318826.318873 (accessed on 27 July 2024).
Stavroulas, G. Object-Oriented Methodology and Tools for the Integration of Structured and Semi-Structured Information Using Ontologies [Aντικειµενοστραφής Μεθοδολογία και Εργαλεία για την Oλοκλήρωση ∆οµηµένης και Hµι-∆οµηµένης Πληροφορίας µε Χρήση Oντολογιών]. Ph.D. Thesis, National Technical University of Athens, Athens, Greece, 2016. (In Greek). [Google Scholar]
Samaridi, N.E.; Karanikolas, N.N.; Papoutsidakis, M.; Papakitsos, E.C.; Papakitsos, C.E. A survey on supply chain ontologies. Int. J. Prod. Manag. Eng. 2023, 11, 89–101. [Google Scholar] [CrossRef]
Guarino, N., Ed.; Formal Ontology in Information Systems. In Proceedings of the FOIS’98, Trento, Italy, 6–8 June 1998; Available online: https://www.researchgate.net/publication/272169039_Formal_Ontologies_and_Information_Systems (accessed on 12 June 2024).
Samaridi, N.; Papakitsos, E.; Papoutsidakis, M.; Mouzala, M.; Karanikolas, N. Developing a Logistics Ontology for Natural Language Processing. WSEAS Trans. Inf. Sci. Appl. 2024, 21, 385–397.
Noy, N.; McGuinness, D.L. Ontology Development 101: A Guide to Creating Your First Ontology. Knowledge Systems Laboratory of Stanford University, Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, March 2001. Available online: https://protege.stanford.edu/publications/ontology_development/ontology101.pdf (accessed on 12 June 2024).
Grüninger, M.; Fox, M.S. Methodology for the Design and Evaluation of Ontologies. In Proceedings of the Workshop on Basic Ontological Issues in Knowledge Sharing, IJCAI-95, Montreal, QC, Canada, 19–21 August 1995.
Zachman, J.A. A framework for information systems architecture. IBM Syst. J. 1987, 26, 276–292.
Triantaphyllidis, M. Modern Greek Grammar; Update of the Small Modern Greek Grammar; OEDB: Athens, Greece, 2007. (In Greek).
Hatzisavvidis, S.; Hatzisavvidou, A. Grammar of the Modern Greek Language, A, B, C High School; ITYE “Diophantos”: Athens, Greece, 2011. (In Greek).
Electronic Dictionary of Common Modern Greek by M. Triantafyllidis. Available online: https://www.greek-language.gr/greekLang/modern_greek/tools/lexica/triantafyllides/ (accessed on 27 July 2024). (In Greek).
Supply Chain. Available online: https://www.supplychain.gr (accessed on 27 July 2024). (In Greek).
Musen, M.A. Dimensions of knowledge sharing and reuse. Comput. Biomed. Res. 1992, 25, 435–467.
Gruber, T.R. A Translation Approach to Portable Ontology Specifications. Knowl. Acquis. 1993, 5, 199–220.
Dahlgren, K. A Linguistic Ontology. Int. J. Hum.-Comput. Stud. 1995, 43, 809–818.
Markantonatou, S.; Fotopoulou, A. The tool “Ekfrasi”. In Proceedings of the 8th International Conference on Greek Linguistics, the Lexicography Workshop, Ioannina, Greece, 8–10 September 2007.
Gate for the Greek Language. Available online: https://www.greek-language.gr/greekLang/modern_greek/tools/lexica/triantafyllides/search.html?lq=%CE%BB%CE%AD%CE%BE%CE%B7&dq (accessed on 27 July 2024). (In Greek).
Gate for the Greek Language. Available online: https://www.greek-language.gr/greekLang/modern_greek/tools/lexica/triantafyllides/search.html?lq=%22%CE%AD%CE%BD%CE%BD%CE%BF%CE%B9%CE%B1+1%22&dq (accessed on 27 July 2024). (In Greek).
Babiniotis, G. Theoretical Linguistics, Introduction to Modern Linguistics [Θεωρητική Γλωσσολογία, Εισαγωγή στη Σύγχρονη Γλωσσολογία]; Babiniotis: Athens, Greece, 1988; p. 121. (In Greek).
Gate for the Greek Language. Available online: https://www.greek-language.gr/greekLang/modern_greek/tools/lexica/triantafyllides/semasiology.html#toc006 (accessed on 27 July 2024). (In Greek).

Figure 1. The four basic concepts (Μορφολογία_Morphology, Σύνταξη_Syntax, Σημασιολογία_Semantics, and Φωνητική_Phonetics) on which the ontology of Modern Greek is structured on the Protégé platform.

Figure 2. The data properties (GreekLanguageDataProperty) of the new Greek Language Ontology Dictionary on the Protégé platform.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
