Abstract
Ontologies provide a powerful method for representing, reusing, and sharing domain knowledge. They are extensively used in a wide range of disciplines, including artificial intelligence, knowledge engineering, biomedical informatics, and many more. For several reasons, developing domain ontologies is a challenging task. One of these reasons is that it is a complicated and time-consuming process. Multiple ontology development methodologies have already been proposed. However, there is room for improvement in terms of covering more activities during development (such as enrichment) and enhancing others (such as conceptualization). In this research, an enhanced ontology development methodology (ON-ODM) is proposed. Ontology-driven conceptual modeling (ODCM) and natural language processing (NLP) serve as the foundation of the proposed methodology. ODCM is defined as the utilization of ontological ideas from various areas to build engineering artifacts that improve conceptual modeling. NLP refers to the scientific discipline that employs computer techniques to analyze human language. The proposed ON-ODM is applied to build a tourism ontology that will be beneficial for a variety of applications, including e-tourism. The produced ontology is evaluated based on competency questions (CQs) and quality metrics. It is verified that the ontology answers SPARQL queries covering all CQ groups specified by domain experts. Quality metrics are used to compare the produced ontology with four existing tourism ontologies. For instance, according to the metrics related to conciseness, the produced ontology received a first place ranking when compared to the others, whereas it received a second place ranking regarding understandability. These results show that utilizing ODCM and NLP could facilitate and improve the development process, respectively.
1. Introduction
In recent years, semantic technologies have advanced rapidly. The semantic web is one such technology, concerned with transforming the Web from a repository of human-readable content into a format that machines can easily understand [1,2]. The popularity of semantic data models such as ontologies and knowledge graphs has grown significantly in recent years [3]. Ontologies are considered to be the backbone of the semantic web [4]. An ontology is defined as an “explicit specification of a conceptualization” [5]. Given that ontologies provide context and meaning to data, they are essential for efficient knowledge extraction and reuse. Additionally, they offer a remedy for syntactic and semantic interoperability problems, which hinder efficient information exchange and collaboration among heterogeneous systems [6]. Thus, ontologies have been employed in a wide range of applications across different domains [2], including, but not limited to, climate policy development [7], robot inspection systems [8], the study of terrorism [9], knowledge about digital extortion attacks [10], drones’ semantic trajectories [11], and sentiment analysis [12].
In this context, ontology engineering has lately garnered great attention [13]. It is defined as “the set of activities that concern the ontology development process, the ontology life cycle, and the methodologies, tools and languages for building ontologies” [14]. One of the objectives of this branch of engineering is to offer a method to develop ontologies. For several reasons, developing domain ontologies is a challenging task. One of these reasons is that it is a complicated and time-consuming process [15]. Further challenges and possible future directions have been proposed in [16].
Ontology development methodology provides guidelines about the organization of activities and tasks, the definition of transitions between them, the selection of methods applied in each task, recommendation of the most suitable tools, and so on. Despite the fact that a variety of methodologies have been proposed [4,15,17], there are still many open issues about ontology development that have yet to be answered [13]. New methodologies continue to be introduced as they propose ontology development from different perspectives and focus on different aspects [15]. In brief, there is no common consensus on an ideal methodology; however, the purpose of developing the ontology may aid in the selection of the most suitable methodology.
One of the most essential activities in the process of ontology engineering is conceptualization. It is concerned with recognizing concepts in the real world in order to construct a model of the relevant domain [18]. Enhancing the activity of conceptualization has a significant impact on the final ontology’s quality. The reason for this is that the quality of any model-based artifact is highly constrained by the quality of the model itself [19]. The researchers presented a novel method called ontology-driven conceptual modeling (ODCM) [20], which is defined as the utilization of ontological ideas to construct engineering objects that enhance theory and practice of conceptual modeling. OntoUML is one of the most popular languages in ODCM, which is “a language whose meta-model has been designed to comply with the ontological distinctions and axiomatization of a theoretically well-grounded foundational ontology named UFO (Unified Foundational Ontology)” [21]. UFO is “an axiomatic formal theory based on contributions from Formal Ontology in Philosophy, Philosophical Logics, Cognitive Psychology, and Linguistics” [22].
Natural Language Processing (NLP) is one of the most significant sciences utilized in the semantic web. NLP analyses human natural language in text format using computer techniques to obtain meaningful semantic information [23]. Several NLP methods have been utilized in conjunction with ontologies in many studies, for instance, [24,25,26].
The goal of this research is to propose an enhanced ontology development methodology (ON-ODM). Ontology-driven conceptual modeling (ODCM) and natural language processing (NLP) serve as the foundation of the proposed methodology. The proposed ON-ODM methodology is applied to build a tourism ontology, which will be useful for many applications such as e-tourism. The remaining sections of the paper are organized as follows. The paper begins with a review of the literature. The proposed methodology is then described in detail, followed by the results and discussion. Finally, the paper ends with conclusions and recommendations for further research.
2. Literature Review
Several ontology development methodologies have been defined in the literature. This is because there is no single correct methodology for constructing ontologies; the target application and the features it requires guide the selection of the most appropriate one. Therefore, authors have proposed methodologies that consider the development process from different perspectives and focus on various aspects. This section summarizes methodologies that have been proposed in the past five years. Table 1 compares 20 methodologies to the proposed ON-ODM. The comparison is based on 20 different criteria that cover all stages of the development process.
Table 1.
Comparison of ontology development methodologies (ODM) from 2018 to 2023.
Some criteria have been collected from [13,15,29,31,35,36,44], in addition to another set suggested in the current study. An outline of these criteria is given below.
- Domain requirements analysis (C1): analyzes the domain requirements.
- Conceptualization (C2): contains a conceptualization phase.
- Implementation (C3): transforms from model to ontology.
- Instantiation (C4): provides a method for populating the ontology.
- Enrichment (C5): enriches ontology concepts or relations automatically.
- Verification against CQs (C6): verified by answering competency questions.
- Evaluation (C7): applies evaluation techniques to assess the ontology’s quality.
- Maintenance (C8): supports ontology maintenance.
- Documentation (C9): offers comprehensive documentation with the ontology.
- Publication (C10): publishes the ontology online.
- Origins of methodology (C11): is based on well-designed methodologies.
- Reusability (C12): can be easily reused.
- Integration (C13): can be easily integrated with other ontologies.
- Interoperability (C14): concepts can be easily shared with other ontologies.
- Collaborative Construction (C15): supports construction by multiple engineers.
- Localization (C16): supports multiple languages.
- Detailed steps (C17): phases and activities are described in detail.
- Case study (C18): the methodology applied to a case study.
- Tools (C19): utilized tools are described clearly.
- Degree of automation (C20): fully automated (F) or semi-automated (S).
As noticed from Table 1:
- More than 75% of the methodologies provided C1, C2, C3, C4, C7, C17, C18, and C19.
- C11 was offered by 14 methodologies out of 20.
- Between 20% and 50% of the methodologies proposed C6, C8, C9, C12, C13, C14, and C15.
- Fewer than 25% fulfilled C5, C10, C16.
- Only two methodologies applied entirely automated processes. This is due to the fact that fully automated methods have some disadvantages when compared to other methods [27,42]. For instance, they may cause unexpected errors that compromise the integrity of the content.
Thus, the aim of this research is to propose the ON-ODM semi-automatic methodology that supports all the criteria discussed in Table 1.
The proposed methodology has been applied to the tourism domain. In this sector, a wide variety of ontologies have been introduced. The following are some that are accessible for download and will be used later on during the evaluation process:
- HONTOLOGY [45];
- IMHO_EVENTS [46];
- IMHO [46];
- TRAVEL [47].
3. Proposed Methodology
The study proposes an enhanced ontology development methodology (ON-ODM) that offers detailed guidelines for all crucial activities, from requirements specification to ontology assessment. ON-ODM is domain-independent and can be applied to any domain. In this paper, ON-ODM is applied to the context of tourism, because of its significance and impact on promoting the economy of any nation. Egyptian tourism is suggested as a case study due to its richness of tourist cities and monuments, which leads to an abundance of data to be used in the application. Figure 1 depicts the proposed methodology, which is made up of seven major modules. The following subsections provide a detailed explanation of these modules.
Figure 1.
Proposed ON-ODM methodology.
3.1. Requirements Acquisition Module
The aim of this module is to acquire the list of requirements that the output ontology should fulfill. In ON-ODM, the acquisition of requirements begins with the identification of the domain’s main information and user needs. It is followed by an analysis of these needs, and the final list of ontology requirements is specified after that. This module’s input consists of the available domain documentation in a variety of forms, in addition to related online resources and knowledge about the domain from experts. The output of this module is the list of ontology requirements in three different forms, which will be covered in further detail in the next subsections.
3.1.1. Identification
In this step, at first, the ontology engineer should list all of the resources that are accessible so that they can be used to obtain the necessary information and requirements. These resources include the following:
- Related domain documentation including textbooks, reports, HR documents, proposals, and more.
- Glossaries, descriptive dictionaries, and other lexicographic resources.
- Related online resources such as portals, videos and other Internet materials.
- Interviews with domain experts.
Then, all the above resources are utilized to extract the main information about the domain and the user needs that should be addressed later by the ontology. Natural language statements are used to outline this description in a document. A document called the Domain Description Document (DDD) is proposed. This document’s significance lies in several points:
- In the subsequent activities, it will aid in defining the specifications that should comprise the ontology.
- At the end, it will be employed to evaluate the ontology and determine whether or not it fulfilled the customer expectations for which it was created.
- It will be included in the ontology documentation, for its main role in helping to elucidate the domain, which will facilitate the use of the ontology in various applications.
Table 2 shows the suggested DDD template applied to the tourism domain.
Table 2.
The suggested DDD applied to the tourism domain.
This is a sample of how the DDD can be used to describe the domain and express the user requirements that reflect their final expectations.
3.1.2. Analysis
At this point, the DDD contains a wide range of user requirements listed as points. The next step is to analyze these needs and turn them into a set of distinct, well-defined functions that are free of repetitions and impractical requirements. The use case diagram is helpful for this purpose. It is one of the UML diagrams that focuses on the system requirements as seen from the user’s perspective and expresses them as system functionalities. Additionally, it shows the interaction between users (actors) and functions (use cases). The use case diagram’s most significant advantage is that it represents the domain in a form that is easy for different roles to comprehend, review, and evaluate. Any UML design tool can be utilized in this activity; in the tourism case study, the Rational Rose tool [48] is used.
A portion of the citizen actor use case diagram is illustrated in Figure 2. It includes 15 of the total number of functions that a citizen is capable of performing. For instance: “Obtain photographic permit”, “Organize activities and events”, “Browse directory”, and so on. The figure illustrates that twelve use cases are initiated directly by the citizen, while three are extended from other use cases.
Figure 2.
A fragment of the Egyptian tourism use case diagram—Citizen actor.
3.1.3. Specification
In this activity, the previous list of refined requirements is transformed into a final set of competency questions (CQs). This approach, which was first described in [49], is one of the most widely used methods for specifying ontology functional requirements. These questions are crucial for guiding the ontology development process, since the ontology in its complete version must be capable of answering them. Furthermore, they can be combined with expected results to be employed later in the ontology’s evaluation. Table 3 displays a sample of the proposed CQs for the Egyptian tourism case study. They are grouped into the following five main categories:
Table 3.
Proposed competency questions (CQs) for the Egyptian tourism case study.
- Antiquities;
- Museums;
- Tourism Companies;
- Hotels;
- Events.
At this point, the requirements acquisition module is complete, and three alternative versions of the requirements (DDD, use case diagram, and CQs) are available.
3.2. Ontology Development Module
In this module, the ontology is constructed on the basis of previously collected requirements. ON-ODM’s development module adheres to the fundamental structure proposed in METHONTOLOGY [50], as this research places the same emphasis on the conceptualization activity. The module consists of four main activities: Specification, Conceptualization, Formalization, and Implementation. They are explained in further detail in the next subsections. This module’s result is the initial version of the ontology.
3.2.1. Specification
The initial stage entails describing the ontology data in a textual document using natural language. The ontology metadata vocabulary (OMV) [51] is a typical suggestion for the ontologies’ description. It enables ontologies to be easily accessed, exchanged over the Internet, and integrated across different domains. The Egyptian tourism ontology’s OMV is listed in Table 4.
Table 4.
The proposed tourism domain ontology’s OMV.
3.2.2. Conceptualization
This activity is one of the strengths of the proposed methodology because of its reliance on ODCM, which applies ontological theories to enhance conceptual modeling [20]. The activity involves designing the model of the target domain using one of the ODCM languages. The conceptual model of the proposed case study is designed using the OntoUML language [21], whose class and relationship stereotypes are elaborated upon in [52]. OntoUML’s reliance on the UFO foundational ontology aids in the application of a common structure that supports reusability, integration, and interoperability. The OLED [53] tool is utilized, which is a model-based environment for formalizing, developing, and testing OntoUML models.
Due to its quite large size, just two portions of the Egyptian tourism conceptual model are depicted in Figure 3 and Figure 4. Figure 3 shows a sample of the various relations that the citizen ROLE is capable of performing. For instance, the Citizen can:
Figure 3.
A fragment of the Egyptian tourism OntoUML model—Citizen ROLE.
Figure 4.
A fragment of the Egyptian tourism OntoUML model—Consumer ROLE.
- Check various SUBKINDs of Internal News including (Job Vacancies, Training Courses, Workshops, and Scholarships).
- Buy Ticket of Archaeological Site or Museum such as (Scientific, Archaeological, and so on).
- Watch Videos.
- Obtain Bulk Visitor Permit.
- Obtain Photographic Permit.
- Organize Activities and Events.
- Browse several SUBKINDs of Directory such as (Tourism Companies, Hotels, and so on).
The conceptual model includes SUBKINDs of Place, including Hotel, Resort, Restaurant, Cafe, and Cruise. Some Reservations have a Type, such as Full, Half, Breakfast, or All inclusive. Hotels are composed of Floors, which are made up of Rooms. A Room may be Single, Double, Triple, or Suite.
The model also includes SUBKINDs of Transportation, such as Airplane, Ship, Bus, and Car. The Airplane is composed of many Flights.
For both Reservations, a Bill is issued. This Bill is associated with a Payment.
3.2.3. Formalization
The prior activity’s model is expressed in a modeling language that is only comprehensible by humans. The purpose of this activity is to transform this form into a new one that is interpreted by computer programs. This can be easily performed using the OLED code generation feature. The outcome is an ontology represented by Web Ontology Language (OWL), which is one of the most well-known ontology representation languages. Currently, there is an ontology called EGYTOUR that comprises 228 classes.
3.2.4. Implementation
In this activity, the ontology engineer populates the ontology by manually adding new data properties and individuals and assigning them to the appropriate ontology classes. This can be accomplished with the help of any ontology editor. In the proposed case study, the Protégé tool [54] was used. It offers many useful features, the most essential of which is that it supports collaborative construction. The resources defined in Section 3.1.1 assist the ontology engineer in obtaining the necessary information. Because the case study involves a very large amount of data, only a portion of it is used in this activity. This portion consists of 246 data properties and 1602 individuals. Figure 5 and Figure 6 depict a sample of EGYTOUR data properties and individuals, respectively.
Figure 5.
Sample of EGYTOUR data properties.
Figure 6.
Sample of EGYTOUR individuals.
3.3. Ontology Enrichment Module
Unlike many previous methodologies, ON-ODM considers the enrichment module as an essential step in developing ontologies. This is because it assists the ontology engineer with suggestions for classes and relationships that were missing in the initial version of the ontology. There are multiple approaches to enrich ontologies; ON-ODM suggests one that makes use of NLP techniques. Another method that depends on ontology matching was proposed in an earlier work [55]. The scope of this study is to enrich the ontology with relationships extracted from corpus. The module consists of three main activities: Preprocessing, Relations Extraction, and Enrichment. The module’s input is the initial version of the ontology in addition to the corpus documents. An enriched version of the ontology is the output. The module is thoroughly explained in the following sections.
3.3.1. Preprocessing
The names of the ontology classes should undergo some preparation before extracting the new relations from corpus. The preprocessing step is essential, because only the words that are determined at this stage will go through the subsequent activities. In ON-ODM, the preprocessing activity consists of four steps:
- Tokenization: segmenting class name into words.
- Non-alphabetic removal: removing numbers and special characters.
- Stop words removal: eliminating a list of commonly used words that contain very little beneficial information.
- Lemmatization: retrieving the base form of each word.
In EGYTOUR ontology, the preprocessing is fulfilled via the SpaCy library [56], which is a Python open-source library for advanced NLP techniques. It assists in developing applications that comprehend large volumes of text. SpaCy can be used in text preprocessing, natural language understanding, and information extraction. Table 5 lists some examples of the results of this activity.
Table 5.
Examples of the EGYTOUR ontology’s preprocessing results.
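The four preprocessing steps listed in this subsection can be sketched in plain Python as follows. This is a minimal illustration only, not the code used for EGYTOUR: the stop-word set and the lemma lookup table are tiny hypothetical stand-ins for the resources that an NLP library such as SpaCy provides, and the example class names are assumptions.

```python
import re

# Hypothetical, deliberately tiny resources. A real pipeline would use the
# stop-word list and lemmatizer of an NLP library instead.
STOP_WORDS = {"of", "the", "and", "for", "in", "a", "an", "to"}
LEMMA_LOOKUP = {"companies": "company", "hotels": "hotel", "sites": "site",
                "courses": "course", "museums": "museum"}


def tokenize(class_name):
    """Split a class name on camelCase boundaries, underscores, hyphens, spaces."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", class_name)
    return [t for t in re.split(r"[\s_\-]+", spaced) if t]


def remove_non_alphabetic(tokens):
    """Strip digits and special characters; drop tokens that become empty."""
    cleaned = (re.sub(r"[^A-Za-z]", "", t) for t in tokens)
    return [t for t in cleaned if t]


def remove_stop_words(tokens):
    """Drop common words that carry little information."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its base form (toy lookup, lower-cased fallback)."""
    return [LEMMA_LOOKUP.get(t.lower(), t.lower()) for t in tokens]


def preprocess(class_name):
    """Run the four steps in the order given in the text."""
    tokens = tokenize(class_name)
    tokens = remove_non_alphabetic(tokens)
    tokens = remove_stop_words(tokens)
    return lemmatize(tokens)
```

Under these assumptions, a hypothetical class named `TourismCompanies` would be reduced to the words `tourism` and `company`, which are the only words passed on to the extraction activity.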
3.3.2. Relations Extraction
In this activity, a text corpus is used to extract new candidates for relationships between classes. Any of the numerous NLP-based techniques for information extraction from text may be employed in this activity. Considering that NLP is not the main focus of this study, a simple and straightforward method is suggested in the tourism case study. The SpaCy library [56] is used for:
- Sentence segmentation: dividing the text into sentences.
- Searching for class lemma: looking up the class in the text.
- Results Tokenization: tokenizing a sentence into words.
- Part-Of-Speech (POS) tagging: assigning a type to each token (noun, verb, and so on).
- Verbs extraction: extracting verbs to be used later in naming extracted relations.
The result of this activity is a list of candidates for each class, together with a recommended list of verbs that can be used for identifying relationships. The Open American National Corpus (OANC) [57] is used to identify those new candidates for the EGYTOUR ontology. OANC is an entirely open repository of American English electronic texts. It has 8832 files with a total of approximately 15 million words. Dealing with such a large corpus is not an easy task; it is one of the significant challenges confronting NLP models. For that reason, only a sample of the EGYTOUR ontology is used to apply the extraction approach. This sample consists of 15 classes, yielding 1661 extracted candidates. Table 6 displays two candidates returned from the extraction step.
Table 6.
Examples of the EGYTOUR ontology’s relations extraction results.
The preprocessing and relations extraction activities are both outlined in Algorithms 1 and 2.
| Algorithm 1. Relations Extraction from Corpus: Main Algorithm | |
| INPUT: Proposed ontology (proposedonto) | |
| INPUT: Corpus documents | |
| OUTPUT: List of occurrences for ontology classes (occlist) | |
| BEGIN | |
| 1 | classeslist ← proposedonto.GETCLASSES() |
| 2 | LOAD corpus documents INTO documentslist |
| 3 | occlist ← empty list |
| 4 | FOR EACH c IN classeslist DO |
| 5 | name ← c.GETCLASSNAME() |
| //Preprocessing | |
| 6 | name.REMOVESTOPWORDS() |
| 7 | name.REMOVENONALPHABETIC() |
| 8 | lemma ← name.GETLEMMA() |
| 9 | FOR EACH doc IN documentslist DO |
| //Call Algorithm 2 and append the class occurrences found in this corpus document | |
| 10 | occlist.ADDALL(GETOCCURRENCES(lemma, doc)) |
| 11 | END FOR |
| 12 | END FOR |
| 13 | RETURN occlist |
| END | |
| Algorithm 2. Class Occurrences Extraction from Document | |
| INPUT: Lemma of the ontology class (lemma) | |
| INPUT: Corpus document (doc) | |
| OUTPUT: List of document statements in which the class occurred (outputlist) | |
| BEGIN | |
| 1 | outputlist ← empty list |
| //Sentence Segmentation | |
| 2 | senlist ← doc.GETSENTENCES() |
| 3 | FOR EACH s IN senlist DO |
| 4 | IF exists(lemma, s) THEN |
| //POS tagging | |
| 5 | POSTAG(s) |
| 6 | verbs ← s.EXTRACTVERBS() |
| 7 | outputlist.ADD(lemma, doc, s, verbs) |
| 8 | END IF |
| 9 | END FOR |
| 10 | RETURN outputlist |
| END | |
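For readers who prefer executable code, the two algorithms can be approximated by the following Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: sentence segmentation is a naive regular expression and verb detection is a toy lexicon (`TOY_VERBS`, hypothetical). In a SpaCy-based pipeline these would typically be replaced by the sentences in `Doc.sents` and by tokens whose `pos_` attribute is `"VERB"`.

```python
import re

# Hypothetical stand-in for POS tagging: a tiny verb lexicon.
TOY_VERBS = {"offers", "offer", "books", "book", "visit", "visits", "has"}


def split_sentences(text):
    """Naive sentence segmentation on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_verbs(sentence):
    """Return the words of the sentence that appear in the toy verb lexicon."""
    words = re.findall(r"[A-Za-z]+", sentence.lower())
    return [w for w in words if w in TOY_VERBS]


def get_occurrences(lemma, doc_id, text):
    """Algorithm 2: sentences of one document that mention the class lemma."""
    output = []
    # Whole-word, case-insensitive match. Inflected forms such as plurals are
    # missed here; matching on token lemmas would catch them.
    pattern = re.compile(rf"\b{re.escape(lemma)}\b", re.IGNORECASE)
    for sentence in split_sentences(text):
        if pattern.search(sentence):
            output.append((lemma, doc_id, sentence, extract_verbs(sentence)))
    return output


def extract_relations(class_lemmas, documents):
    """Algorithm 1: accumulate occurrences per class over all documents."""
    occurrences = {}
    for lemma in class_lemmas:
        found = []
        for doc_id, text in documents.items():
            found.extend(get_occurrences(lemma, doc_id, text))
        occurrences[lemma] = found
    return occurrences
```

The sketch assumes the class lemmas have already been produced by the preprocessing activity, and it keeps the results grouped by class so that each candidate can later be reviewed in the enrichment activity.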
3.3.3. Enrichment
In this activity, the ontology engineer decides the appropriate action for each candidate. Domain experts may also be consulted for guidance on the correct decision. Human intervention in this step is therefore crucial in order to avoid ambiguity and redundancy. The engineer approves a candidate if (1) the relation is new, (2) it is meaningful, and (3) both classes exist in the ontology. The engineer rejects a candidate if (1) the relation already exists in the ontology, (2) it has no meaning, or (3) one of the classes is not defined in the ontology. The approved candidates are added to the ontology as object properties between the participating classes. For instance, Table 7 shows the actions performed with examples from Table 6.
Table 7.
Examples of the EGYTOUR ontology’s enrichment results.
As a result of the enrichment process, 71 additional object properties are defined for the 15 classes specified in the previous activity. The metrics of the most recent version of the EGYTOUR ontology are illustrated in Figure 7.
Figure 7.
EGYTOUR metrics.
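The approval and rejection rules of the enrichment activity can be expressed as a small review function, sketched below. The `Candidate` structure and the `is_meaningful` callback are hypothetical: the callback stands for the human judgment of the ontology engineer or domain expert, which the methodology deliberately does not automate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str  # class the relation starts from
    verb: str    # verb proposed as the relation name
    target: str  # class the relation points to


def review(candidate, classes, existing_relations, is_meaningful):
    """Return (decision, reason) for one candidate relation."""
    # Rule 3: both participating classes must be defined in the ontology.
    if candidate.source not in classes or candidate.target not in classes:
        return "reject", "class not defined in the ontology"
    # Rule 1: the relation must be new.
    triple = (candidate.source, candidate.verb, candidate.target)
    if triple in existing_relations:
        return "reject", "relation already exists"
    # Rule 2: meaningfulness is decided by a human reviewer.
    if not is_meaningful(candidate):
        return "reject", "no meaning"
    return "approve", "new and meaningful"
```

Only the first two checks can be automated reliably; keeping the third as an injected callback mirrors the semi-automatic nature of the module.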
3.4. Ontology Assessment Module
There are many approaches for measuring the quality of the constructed ontology from different perspectives. Summaries of these approaches were proposed in many papers, such as [58,59,60,61,62,63]. The ontology engineer decides the best-fitting approach for each situation. In ON-ODM, two different approaches are suggested: (1) CQ-based verification and (2) Metric-based evaluation. Both approaches have many advantages, including:
- Cover different quality dimensions (expressiveness, accuracy, understandability, cohesion, and conciseness).
- Easily applied.
- Flexible and adaptable for application in a variety of contexts.
They will be described in depth in the next sections.
3.4.1. CQ-based Verification
In this method, the ontology is verified against a collection of predefined criteria, which are represented in the form of competency questions. This approach aids in evaluating the expressiveness criterion [59], which depends on the ontology’s ability to provide answers to competency questions. Because CQ specification is already a main step of ON-ODM (Section 3.1.3), the CQs are readily available for assessing the produced ontology. In the current step, the ontology engineer writes SPARQL queries to answer the CQs, executes the queries on the produced ontology, and then compares the outcomes with the expected results, which were also defined in Section 3.1.3. Table 8 provides some examples of SPARQL queries suggested to evaluate the EGYTOUR ontology.
Table 8.
Examples of EGYTOUR’s SPARQL queries.
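The comparison between actual and expected answers can be sketched as a short verification loop. The sketch below is illustrative only: `run_query` is a hypothetical callable that executes a SPARQL string against the ontology and returns the answers, and the query, namespace, and class name in `SAMPLE_QUERY` are assumptions, not the queries of Table 8.

```python
# Hypothetical query; the namespace and class name are placeholders.
SAMPLE_QUERY = """
PREFIX egy: <http://example.org/egytour#>
SELECT ?museum WHERE { ?museum a egy:Museum . }
"""


def verify_cqs(cq_specs, run_query):
    """Compare query answers with the expected answers of each CQ.

    cq_specs maps a CQ identifier to (sparql_query, expected_answers).
    run_query executes a SPARQL string and returns an iterable of answers.
    """
    report = {}
    for cq_id, (query, expected) in cq_specs.items():
        actual = set(run_query(query))
        expected = set(expected)
        report[cq_id] = {
            "passed": actual == expected,
            "missing": sorted(expected - actual),
            "unexpected": sorted(actual - expected),
        }
    return report
```

Reporting missing and unexpected answers separately helps distinguish gaps in the ontology content from errors in the query or in the expected results.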
3.4.2. Metric-Based Evaluation
The second approach is based on the computation of ontology quality metrics. Several metrics correlated to different ontology dimensions have been developed. The author of [60] offers a free online platform called OntoMetrics [64] for metric definition and calculation. In the proposed case study, OntoMetrics was used to calculate 11 different metrics, which are categorized into three groups (Schema, Knowledgebase, and Graph). Further information about calculation of the utilized metrics is provided in Table 9. As indicated in [65], these metrics are correlated to four ontology dimensions, as shown below:
Table 9.
OntoMetrics equations.
- Accuracy: Equations (1)–(3) and (8)–(11)
- Understandability: Equation (7)
- Cohesion: Equations (6) and (7)
- Conciseness: Equations (4) and (5)
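To make the ratio-based metrics more concrete, the sketch below computes five of them from raw counts. It assumes the commonly cited OntoQA-style definitions (attribute richness AR, inheritance richness IR, relationship richness RR, average population AP, and class richness CR); the equations in Table 9 remain the authoritative definitions, and both the function and its parameter names are hypothetical.

```python
def ratio_metrics(n_classes, n_attributes, n_subclass_rel,
                  n_other_rel, n_individuals, n_populated_classes):
    """Compute ratio-based schema and knowledgebase metrics from counts."""
    if n_classes <= 0:
        raise ValueError("the ontology must contain at least one class")
    total_rel = n_subclass_rel + n_other_rel
    return {
        # Schema metrics
        "AR": n_attributes / n_classes,      # attributes per class
        "IR": n_subclass_rel / n_classes,    # subclass relations per class
        "RR": n_other_rel / total_rel if total_rel else 0.0,
        # Knowledgebase metrics
        "AP": n_individuals / n_classes,     # individuals per class
        "CR": n_populated_classes / n_classes,
    }
```

For example, with hypothetical counts of 10 classes, 20 attributes, 8 subclass relations, 2 other relations, 50 individuals, and 5 populated classes, the function returns AR = 2.0, IR = 0.8, RR = 0.2, AP = 5.0, and CR = 0.5.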
3.5. Publication
The goal of this activity is to create a translation file for the final version of the ontology and then publish it online. This activity is strongly suggested, so that the constructed ontology supports localization and becomes available to others. Furthermore, the OMV suggested in Section 3.2.1 facilitates the ontology’s access and exchange over the Internet. The current version of the EGYTOUR ontology is available only in English. However, for the final release, an Arabic translation file will be created. Web Protégé [66] can be used to easily accomplish the publication activity.
3.6. Maintenance
This module handles making any necessary updates or corrections to the ontology. This occurs in two cases: after evaluation or after publishing the online version. Such updates may be required, as it is possible that the ontology might lack certain domain knowledge or contain some errors.
3.7. Documentation
The ON-ODM methodology gives the utmost importance to documentation. Since the first activity, many documents have been presented in different forms. The documentation activity continues until the completion of the ontology and its publication on the Internet with documents that explain all of its components to facilitate ontology reusability, integration, and interoperability.
4. Results and Discussion
The objective of this section is to list and discuss the evaluation results of the ontology that has been developed using the proposed methodology (ON-ODM).
4.1. CQ-Based Evaluation Results
The EGYTOUR SPARQL queries defined in Section 3.4.1 were executed in Protégé [54]. These queries represent one question per CQ group defined by the domain experts in Section 3.1.3. Figure 8, Figure 9, Figure 10, Figure 11 and Figure 12 show that EGYTOUR successfully answered all queries.
Figure 8.
SPARQL query and results of CQ1-1.
Figure 9.
SPARQL query and results of CQ2-1.
Figure 10.
SPARQL query and results of CQ3-1.
Figure 11.
SPARQL query and results of CQ4-1.
Figure 12.
SPARQL query and results of CQ5-1.
4.2. Metric-Based Evaluation Results
The OntoMetrics results of the EGYTOUR ontology are displayed in Table 10. Furthermore, four more tourism ontologies were downloaded: HONTOLOGY [45], IMHO_EVENTS [46], IMHO [46], and TRAVEL [47]. Table 10 compares the OntoMetrics results of the five ontologies, whereas Table 11, Table 12, Table 13 and Table 14 show the EGYTOUR metrics correlated to the four dimensions mentioned in Section 3.4.2.
Table 10.
OntoMetrics results.
Table 11.
Accuracy-correlated metrics.
Table 12.
Understandability-correlated metrics.
Table 13.
Cohesion-correlated metrics.
Table 14.
Conciseness-correlated metrics.
As displayed in Table 10, EGYTOUR is ranked first in the IR, AP, CR, AD, and MD metrics, second in AC, third in AR, ARC, and MB, and fourth in RR and AB.
Low values of AB and MB indicate that the ontology concentrates on the vertical rather than the horizontal modeling of hierarchies. This can be improved by defining additional classes at the same level (siblings).
In EGYTOUR, only a sample of the data properties and extracted relations were used, as mentioned in Section 3.2.4 and Section 3.3.2, respectively. Applying the complete case study data will raise the AR and RR values.
Regarding the ARC metric, it counts the number of roots that do not receive is-a relations. Therefore, EGYTOUR’s ARC value is low because it contains numerous is-a relations.
Accuracy-correlated metrics measure the degree to which the ontology represents the real-world domain [65]. As illustrated in Table 11, EGYTOUR placed third in the average of accuracy-related metrics due to low AR, RR, AB, and MB values. As mentioned above, increasing those metrics will improve EGYTOUR’s rank.
Understandability-correlated metrics determine the comprehension of the elements of the ontology [65]. According to Table 12, EGYTOUR is the second-ranked ontology. This is due to the high value of AC.
Cohesion-correlated metrics refer to the degree to which the classes in the ontology are related to one another [65]. As seen in Table 13, EGYTOUR has a high average of these metrics, indicating that classes are strongly related.
Conciseness-correlated metrics measure the degree to which the ontological information is useful [65]. Table 14 shows that EGYTOUR has the greatest average among the five ontologies. This suggests that EGYTOUR contains little unnecessary or duplicate information.
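The dimension-level comparison above ranks the ontologies by the average of the metrics correlated to each dimension. A minimal sketch of that calculation follows, assuming an unweighted mean of the raw metric values and that a higher average means a better rank. The function name `rank_by_dimension`, the ontology labels, the metric labels, and all numbers are hypothetical placeholders, not values from Tables 11 to 14.

```python
def rank_by_dimension(scores):
    """Rank ontologies by the unweighted mean of their metric values.

    scores: {ontology_name: {metric_name: value}}
    Returns a list of (ontology_name, average), best (highest) average first.
    """
    averages = {
        name: sum(metric_values.values()) / len(metric_values)
        for name, metric_values in scores.items()
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder input: hypothetical ontologies and metric values for one dimension.
example_scores = {
    "onto_a": {"metric_1": 2.0, "metric_2": 4.0},
    "onto_b": {"metric_1": 5.0, "metric_2": 7.0},
    "onto_c": {"metric_1": 1.0, "metric_2": 3.0},
}
ranking = rank_by_dimension(example_scores)
for position, (name, average) in enumerate(ranking, start=1):
    print(position, name, average)
```

A plain mean treats every metric as equally important and is sensitive to differences in scale between metrics, so normalizing the values first may be preferable when the metrics have very different ranges.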
5. Conclusions
Ontologies are widely used in a variety of applications and domains. Nonetheless, developing domain ontologies remains a difficult and time-consuming process. Many ontology development methodologies have already been proposed; however, certain activities have not yet been covered and others could be enhanced. Furthermore, the scarcity of details and applications provided with the methodologies makes most of them challenging to put into practice. In this article, an enhanced ontology development methodology called ON-ODM is proposed. The article offers four main contributions:
- ON-ODM concentrates on the requirements acquisition module, which contributes significantly to the final outcome. Therefore, the requirements are gathered from different perspectives and presented in three different forms (Domain Description Document (DDD), use case diagram, and competency questions).
- ON-ODM recommends ODCM for the conceptualization phase, which improves the conceptual modeling by incorporating the ontological theories when building the engineering artifacts.
- ON-ODM considers enrichment a main step in the ontology development cycle, so it suggests an NLP technique that extracts from a corpus a list of new candidate relations between classes.
- ON-ODM is applied in a comprehensive case study in the field of tourism. To make it easier for others to apply ON-ODM, each stage is documented in thorough detail. The created ontology was assessed using two different approaches.
In the tourism case study, only a portion of the data was used. When the complete case study data are applied, the following challenges might be encountered:
- The process of populating the ontology with manually extracted individuals.
- The large number of candidates returned during the process of relations extraction, which places a burden on the ontology engineer while reviewing and selecting the approved candidates.
Several directions could improve the proposed work in the future. One is to apply different advanced NLP techniques and observe how they affect the extracted list of candidates and, subsequently, the developed ontology. Another is to apply other approaches to ontology evaluation, such as an application-based approach. Finally, ON-ODM could be investigated in terms of dimensions such as efficiency, ease of use, and adaptability to different and more complex domains.
Author Contributions
Conceptualization: S.H.; methodology: S.H.; data collection: S.H.; analysis and interpretation of results: S.H.; writing—original draft preparation: S.H.; writing—review and editing: R.M.I., N.B. and M.H.; supervision: R.M.I., N.B. and M.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The most recent version of the EGYTOUR ontology has been made available at [67], together with the use case diagram, the OntoUML model, the SPARQL queries with their answers, and the OANC corpus.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Turchet, L.; Antoniazzi, F. Semantic Web of Musical Things: Achieving Interoperability in the Internet of Musical Things. J. Web Semant. 2023, 75, 100758.
- Liu, X.; Tong, Q.; Liu, X.; Qin, Z. Ontology Matching: State of the Art, Future Challenges, and Thinking Based on Utilized Information. IEEE Access 2021, 9, 91235–91243.
- Alexopoulos, P. Semantic Modeling for Data; O’Reilly Media: Sebastopol, CA, USA, 2020.
- Rawat, R. Logical Concept Mapping and Social Media Analytics Relating to Cyber Criminal Activities for Ontology Creation. Int. J. Inf. Technol. 2023, 15, 893–903.
- Gruber, T.R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. Int. J. Hum. Comput. Stud. 1995, 43, 907–928.
- Psarommatis, F.; Fraile, F.; Ameri, F. Zero Defect Manufacturing Ontology: A Preliminary Version Based on Standardized Terms. Comput. Ind. 2023, 145, 103832.
- Taylor, P.J. The Geographical Ontology Challenge in Attending to Anthropogenic Climate Change: Regional Geography Revisited. Tijdschr. Voor Econ. Soc. Geogr. 2023, 114, 63–70.
- Ma, L.; Hartmann, T. A Proposed Ontology to Support the Hardware Design of Building Inspection Robot Systems. Adv. Eng. Inform. 2023, 55, 101851.
- Al-Fayez, R.Q.; Al-Tawil, M.; Abu-Salih, B.; Eyadat, Z. GTDOnto: An Ontology for Organizing and Modeling Knowledge about Global Terrorism. Big Data Cogn. Comput. 2023, 7, 24.
- Keshavarzi, M.; Ghaffary, H.R. An Ontology-Driven Framework for Knowledge Representation of Digital Extortion Attacks. Comput. Human Behav. 2023, 139, 107520.
- Kotis, K.; Soularidis, A. ReconTraj4Drones: A Framework for the Reconstruction and Semantic Modeling of UAVs’ Trajectories on MovingPandas. Appl. Sci. 2023, 13, 670.
- Alexopoulos, P.; Wallace, M. Creating Domain-Specific Semantic Lexicons for Aspect-Based Sentiment Analysis. In Proceedings of the 10th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), Trento, Italy, 5–6 November 2015; IEEE: New York, NY, USA, 2015.
- Poveda-Villalón, M.; Fernández-Izquierdo, A.; Fernández-López, M.; García-Castro, R. LOT: An Industrial Oriented Ontology Engineering Framework. Eng. Appl. Artif. Intell. 2022, 111, 104755.
- Gómez-Pérez, A.; Fernández-López, M.; Corcho, O. Ontological Engineering; Springer-Verlag: London, UK, 2004; ISBN 1-85233-551-3.
- Mahmood, K.; Mokhtar, R.; Raza, M.A.; Noraziah, A.; Alkazemi, B. Ecological and Confined Domain Ontology Construction Scheme Using Concept Clustering for Knowledge Management. Appl. Sci. 2023, 13, 32.
- Tudorache, T. Ontology Engineering: Current State, Challenges, and Future Directions. Semant. Web 2020, 11, 125–138.
- Patel, A.S.; Merlino, G.; Puliafito, A.; Vyas, R.; Vyas, O.P.; Ojha, M.; Tiwari, V. An NLP-Guided Ontology Development and Refinement Approach to Represent and Query Visual Information. Expert Syst. Appl. 2023, 213, 118998.
- Trujillo, J.; Davis, K.C.; Du, X.; Damiani, E.; Storey, V.C. Conceptual Modeling in the Era of Big Data and Artificial Intelligence: Research Topics and Introduction to the Special Issue. Data Knowl. Eng. 2021, 135, 101911.
- Guizzardi, G. Theoretical Foundations and Engineering Tools for Building Ontologies as Reference Conceptual Models. Semant. Web 2010, 1, 3–10.
- Verdonck, M.; Gailly, F. Insights on the Use and Application of Ontology and Conceptual Modeling Languages in Ontology-Driven Conceptual Modeling. In Proceedings of the International Conference on Conceptual Modeling, Gifu, Japan, 14–17 November 2016; LNCS; Springer: Cham, Switzerland, 2016; Volume 9974, pp. 83–97.
- Guizzardi, G. Ontological Foundations for Structural Conceptual Models; Centre for Telematics and Information Technology: Enschede, The Netherlands, 2005; ISBN 9075176813.
- Guizzardi, G.; Wagner, G.; Almeida, J.P.A.; Guizzardi, R.S.S. Towards Ontological Foundations for Conceptual Modeling: The Unified Foundational Ontology (UFO) Story. Appl. Ontol. 2015, 10, 259–271.
- Rudwan, M.S.M.; Fonou-Dombeu, J.V. Machine Learning Selection of Candidate Ontologies for Automatic Extraction of Context Words and Axioms from Ontology Corpus. In Proceedings of the Information Integration and Web Intelligence; iiWAS; Pardede, E., Delir Haghighi, P., Khalil, I., Kotsis, G., Eds.; Springer: Cham, Switzerland, 2022; pp. 282–294.
- Ibrahim, S.; Fathalla, S.; Lehmann, J.; Jabeen, H. Toward the Multilingual Semantic Web: Multilingual Ontology Matching and Assessment. IEEE Access 2023, 11, 8581–8599.
- Sonfack Sounchio, S.; Kamsu-Foguem, B.; Geneste, L. Construction of a Base Ontology to Represent Accident Expertise Knowledge. Cogn. Technol. Work 2023, 1–19.
- Hari, A.; Kumar, P. WSD Based Ontology Learning from Unstructured Text Using Transformer. Procedia Comput. Sci. 2023, 218, 367–374.
- Rawsthorne, H.M.; Abadie, N.; Kergosien, E.; Duchêne, C.; Saux, É. ATONTE: Towards a New Methodology for Seed Ontology Development from Texts and Experts. In Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2022), Bolzano, Italy, 26–29 September 2022; Springer: Berlin/Heidelberg, Germany, 2022.
- Polenghi, A.; Roda, I.; Macchi, M.; Pozzetti, A.; Panetto, H. Knowledge Reuse for Ontology Modelling in Maintenance and Industrial Asset Management. J. Ind. Inf. Integr. 2022, 27, 100298.
- Sattar, A.; Ahmad, M.N.; Surin, E.S.M.; Mahmood, A.K. An Improved Methodology for Collaborative Construction of Reusable, Localized, and Shareable Ontology. IEEE Access 2021, 9, 17463–17484.
- Alaa, R.; Gawish, M.; Fernández-Veiga, M. Improving Recommendations for Online Retail Markets Based on Ontology Evolution. Electronics 2021, 10, 1650.
- Smirnov, A.; Levashova, T.; Ponomarev, A.; Shilov, N. Methodology for Multi-Aspect Ontology Development: Ontology for Decision Support Based on Human-Machine Collective Intelligence. IEEE Access 2021, 9, 135167–135185.
- Elnagar, S.; Yoon, V.; Thomas, M.A. An Automatic Ontology Generation Framework with An Organizational Perspective. In Proceedings of the Hawaii International Conference on System Sciences (HICSS-53), Wailea-Makena, HI, USA, 7–10 January 2020.
- Dera, E.; Frasincar, F.; Schouten, K.; Zhuang, L. SASOBUS: Semi-Automatic Sentiment Domain Ontology Building Using Synsets. In Proceedings of The Semantic Web, ESWC 2020, Crete, Greece, 31 May–4 June 2020; pp. 105–120.
- Guimarães, N.C.; De Carvalho, C.L. A Modular Framework for Ontology Learning from Text in Portuguese. Multi Sci. J. 2020, 3, 37–42.
- Lassaad, M.; Raja, H.; Ben Ghezala, H.H. “Onto-Computer-Project”, a Computer Project Domain Ontology: Construction and Validation. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 360–366.
- Milanifard, O.; Kahani, M. Proposing an Integrated Multi Source Ontology Construction Methodology. Comput. Knowl. Eng. 2020, 3, 11–24.
- Femi Aminu, E.; Oyefolahan, I.O.; Bashir Abdullahi, M.; Salaudeen, M.T. A Review on Ontology Development Methodologies for Developing Ontological Knowledge Representation Systems for Various Domains. Int. J. Inf. Eng. Electron. Bus. 2020, 12, 28–39.
- Yunianta, A.; Hoirul Basori, A.; Prabuwono, A.S.; Bramantoro, A.; Syamsuddin, I.; Yusof, N.; Almagrabi, A.O.; Alsubhi, K. OntoDI: The Methodology for Ontology Development on Data Integration. Int. J. Adv. Comput. Sci. Appl. 2019, 10, 160–168.
- Abdelghany, A.S.; Darwish, N.R.; Hefni, H.A. An Agile Methodology for Ontology Development. Int. J. Intell. Eng. Syst. 2019, 12, 170–181.
- Jacksi, K. Design and Implementation of E-Campus Ontology with a Hybrid Software Engineering Methodology. Sci. J. Univ. Zakho 2019, 7, 95–100.
- Alsanad, A.A.; Chikh, A.; Mirza, A. A Domain Ontology for Software Requirements Change Management in Global Software Development Environment. IEEE Access 2019, 7, 49352–49361.
- Fawei, B.; Pan, J.Z.; Kollingbaum, M.; Wyner, A.Z. A Methodology for a Criminal Law and Procedure Ontology for Legal Question Answering. In Proceedings of the Semantic Technology, JIST, Awaji City, Japan, 26–28 November 2018; pp. 198–214.
- John, S.; Shah, N.; Stewart, C. Towards a Software Centric Approach for Ontology Development: Novel Methodology and Its Application. In Proceedings of the 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE), Xi’An, China, 12–14 October 2018; IEEE: New York, NY, USA, 2018; pp. 139–146.
- Zulkipli, Z.Z.; Maskat, R.; Teo, N.H.I. A Systematic Literature Review of Automatic Ontology Construction. Indones. J. Electr. Eng. Comput. Sci. 2022, 28, 878.
- Hontology Ontology. Available online: https://portulanclarin.net/repository/browse/hontology/a83c9d04cb7a11e1a404080027e73ea2359e10ea62b940109aabe03684aa5ea4/ (accessed on 11 March 2023).
- Harmonise Ontology. Available online: https://sourceforge.net/projects/hmafra/ (accessed on 11 March 2023).
- Travel Ontology. Available online: https://protege.stanford.edu/ontologies/travel.owl (accessed on 11 March 2023).
- Boggs, W.; Boggs, M. Mastering UML with Rational Rose 2002; Sybex: Alameda, CA, USA, 2002; Volume 1, ISBN 0-7821-4017-3.
- Grüninger, M.; Fox, M.S. The Role of Competency Questions in Enterprise Engineering. In Benchmarking: Theory and Practice; Springer: Boston, MA, USA, 1995; pp. 22–31.
- Fernández-López, M.; Gómez-Pérez, A.; Juristo, N. METHONTOLOGY: From Ontological Art towards Ontological Engineering. In Proceedings of the Ontological Engineering AAAI97 Spring Symposium Series. American Association for Artificial Intelligence, Palo Alto, CA, USA, 24–25 March 1997.
- Hartmann, J.; Palma, R.; Sure, Y.; Suárez-Figueroa, M.C.; Haase, P.; Gómez-Pérez, A.; Studer, R. Ontology Metadata Vocabulary and Applications. In Proceedings of the OTM Workshops; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3762, pp. 906–915.
- Suchánek, M. OntoUML Specification Documentation. 2020. Available online: https://ontouml.readthedocs.io/_/downloads/en/latest/pdf/ (accessed on 19 April 2023).
- Guerson, J.; Sales, T.P.; Guizzardi, G.; Almeida, P.A. OntoUML Lightweight Editor: A Model-Based Environment to Build, Evaluate and Implement Reference Ontologies. In Proceedings of the 2015 IEEE 19th International Enterprise Distributed Object Computing Workshop, Adelaide, Australia, 21–25 September 2015; pp. 144–147.
- Musen, M.A. The Protégé Project: A Look Back and a Look Forward. AI Matters 2015, 1, 4–12.
- Haridy, S.; Ismail, R.M.; Badr, N.; Hashem, M. The Combination of Ontology-Driven Conceptual Modeling and Ontology Matching for Building Domain Ontologies: E-Government Case Study. Int. J. Comput. Their Appl. 2022, 29, 269–282.
- Honnibal, M.; Montani, I. SpaCy 2: Natural Language Understanding with Bloom Embeddings, Convolutional Neural Networks and Incremental Parsing. Available online: https://spacy.io/ (accessed on 3 March 2023).
- The Open American National Corpus. Available online: https://anc.org/ (accessed on 19 April 2023).
- Agárdi, A.; Kovács, L. Property-Based Quality Measures in Ontology Modeling. Appl. Sci. 2022, 12, 12475.
- Degbelo, A. A Snapshot of Ontology Evaluation Criteria and Strategies. In Proceedings of the 13th International Conference on Semantic Systems, Amsterdam, The Netherlands, 11–14 September 2017; pp. 1–8.
- Lantow, B. OntoMetrics: Putting Metrics into Use for Ontology Evaluation. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2016), Porto, Portugal, 12 May 2016; pp. 186–191.
- Raad, J.; Cruz, C. A Survey on Ontology Evaluation Methods. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, Lisbon, Portugal, 12–14 November 2015.
- Tartir, S.; Arpinar, I.B.; Sheth, A.P. Ontological Evaluation and Validation. In Theory and Applications of Ontology: Computer Applications; Springer: Dordrecht, The Netherlands, 2010; pp. 115–130.
- Vrandečić, D. Ontology Evaluation. In Handbook on Ontologies; Springer: Berlin, Germany, 2009; pp. 293–313.
- OntoMetrics. Available online: http://www.ontometrics.org (accessed on 9 March 2023).
- Fonou-Dombeu, J.V.; Viriri, S. OntoMetrics Evaluation of Quality of E-Government Ontologies. In Proceedings of the International Conference on Electronic Government and the Information Systems Perspective; Springer: Cham, Switzerland, 2019; pp. 189–203.
- Stanford University. Web Protégé. Available online: https://webprotege.stanford.edu/ (accessed on 14 April 2023).
- Haridy, S. EGYTOUR Ontology. Available online: https://drive.google.com/drive/folders/1WYJ_mji0SPMsInyjlVBP0lLtEnb3XsRM?usp=sharing (accessed on 13 March 2023).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).