1. Introduction
Many tasks first require retrieving, comparing, aggregating and organizing a large amount of information of many different kinds in order to make good and timely decisions. This is the case for sustainability-oriented decisions, if only because they have to balance economic, societal and environmental issues. This is also the case for many tasks in the management of risks or emergencies. For example, both Search&Rescue and preemptively reducing disaster risks require access to, and use of, many kinds of information or other resources, such as particular kinds of persons, detection devices, communication tools, maps, search methods and search software. These tasks also depend on many parameters such as the nature of the emergency, the weather, the terrain and the availability of the needed resources.
Ideally, to support such tasks and hence the findability, gathering, interoperability, reusability, integration and analysis of information potentially useful to those tasks or to the design of tools for those tasks, that information should be published, related and organized on the Web in places and in ways that allow people and software agents to (i) retrieve and compare information with respect to non-predefined sets of criteria, and (ii) complement information while keeping it organized and hence retrievable.
As explained below, one requirement for such an ideal and scalable organization – and thus a primary very general best practice for information dissemination and collaboration between people, organizations or software – is to represent and organize information either directly within knowledge representation bases (KR bases) or in ways that can be automatically imported into KR bases (e.g. in documents and databases that have been designed to allow such an importation). These KR bases can be either privately developed or, preferably, collaboratively developed.
In this article, these KR bases are simply called KBs and, before going further, need to be introduced in more detail. Such KBs do not store texts or other data; they store KRs (or simply, “knowledge”), i.e. logic-based representations of semantic relations between pieces of information – semantic relations being relations that can be represented in a logic-based way. The boxes and figures in Section 2.1 and Section 3 include many examples. In this article, the notions referred to by the words “knowledge” (KRs) and “data” are mutually exclusive. “Data” refers to information not explicitly organized – or poorly organized – by semantic relations, e.g. as in databases or XML documents: they are mainly organized by predefined structural relations (i.e. partOf ones) and few
semantic relations of very few predefined types (mostly typeOf relations and sometimes subtype relations). In KBs, unlike in relational databases, all the types (i.e. relation types and concept types) and their definitions are user-provided (not predefined by the database designer); most of the knowledge in many KBs is expressed via such definitions; large KBs such as CYC [1,2], Freebase [3] and DBpedia [4] have hundreds of thousands of subtype relations. Document-based technologies and database systems generally only handle data, although deductive databases may be seen as steps towards KBs. A KB is composed of an ontology and, generally, a base of facts. An ontology is (i) a formal terminology, i.e. a set of terms (alias, object identifiers) used in the representations stored in the KB, along with (ii) representations of term definitions, and thereby direct or indirect semantic relations between these terms. Databases and natural-language-based documents cannot automatically be converted into
KBs that are well-organized via generalization and implication relations, if only because these documents and bases most often lack the necessary information to derive such relations (these relations are rarely made explicit by document authors and even human readers often cannot infer such relations with certainty). These relations – and thus, manually or semi-automatically built KBs – are necessary for the support of (i)
semantic-based searches, via queries or navigation, and (ii) any
scalable way of integrating or organizing information. This explains why architectures or methodologies for building ontologies or systems exploiting them have already often been discussed regarding disaster risk reduction or management. For example, in February 2022, the digital library of the ISCRAM conferences (“Information Systems for Crisis Response and Management” conferences) included 64 articles with main fields mentioning ontologies, and 46 of these articles recorded “ontology” as a keyword.
There exist several small
top-level ontologies related to disaster risk reduction or management, e.g. the agent-oriented ontology of [
5] for better indexing and retrieving “disaster management plans” in document repositories for such plans, SEMA4A [
6] which supports alerting people about imminent disasters,
empathi [
7] which is more general and integrates some other ontologies, and POLARISCO [
8] which is a suite of ontologies formalizing and relating the terminologies and methods of various emergency response organizations (e.g. fire departments, police, and healthcare services). However, as of 2022, it seems that there is no large public content ontology related to disaster risk reduction or management, let alone KBs where people or organizations could relate or aggregate information. As an example, even though [
9] (which is also about disaster related terminologies) mentions past “massive efforts e.g. in European projects such as DISASTER (
cordis.europa.eu/project/id/285069 (accessed on 7 August 2022)), SecInCoRe (
cni.etit.tu-dortmund.de/research/projects/secincore (accessed on 7 August 2022)), EPI (
www.episecc.eu (accessed on 7 August 2022)), or CRISP (
cordis.europa.eu/project/id/607941/reporting/fr (accessed on 7 August 2022))”, the results of those projects were not KBs but reports about then planned works as well as advocated architectures or small models (top-level ontologies). There currently exist some large projects, such as the Norwegian INSITU (Sharing Incident and Threat Information for Common Situational Understanding) project (2019–2022) [
10], which focus on harmonizing terminologies or on tools for the
collaborative synthesis of information in classic media (databases, textual documents, maps, …), not via KBs. The use of classic media makes the harmonization of terminologies useful for supporting lexical searches (i.e. those proposed by current Web search engines and document editors; these are not semantic search tools). However, such a harmonization is a complex task which requires committees (hence a hierarchy-based decision-making organization) and it is useful only when its guidelines are followed (something that is not easy to do). Via KBs, harmonizing terminologies is not necessary since relations of equivalence or generalization between terms within KBs or across KBs can be added in a decentralized and incremental way by each provider of terms or knowledge. Tools that exploit these particular relations can allow users and knowledge providers to choose the terms they wish, without this decreasing knowledge retrievability.
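The role of such equivalence and generalization relations can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class `TermRelations`, the function `search` and all term names (e.g. `fire_dept:Casualty`) are hypothetical and do not come from any of the cited ontologies or tools.

```python
from collections import defaultdict


class TermRelations:
    """Toy store of equivalence and subtype relations between terms (hypothetical)."""

    def __init__(self):
        self._equiv = defaultdict(set)     # term -> directly equivalent terms
        self._subtypes = defaultdict(set)  # term -> direct subtypes

    def add_equivalence(self, a, b):
        # Equivalence is symmetric, so it is recorded in both directions.
        self._equiv[a].add(b)
        self._equiv[b].add(a)

    def add_subtype(self, general, specific):
        self._subtypes[general].add(specific)

    def expand(self, term):
        """Return the term plus all terms reachable via equivalence or specialization."""
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self._equiv[current])
            stack.extend(self._subtypes[current])
        return seen


def search(statements, relations, query_term):
    """Retrieve statements indexed by the query term, an equivalent term or a specialization."""
    wanted = relations.expand(query_term)
    return [text for term, text in statements if term in wanted]


relations = TermRelations()
# Each provider adds relations for its own terms, without any committee.
relations.add_equivalence("fire_dept:Casualty", "health:Victim")
relations.add_subtype("health:Victim", "health:TrappedVictim")

statements = [
    ("fire_dept:Casualty", "3 casualties reported at site A"),
    ("health:TrappedVictim", "1 trapped victim located in building B"),
    ("police:Roadblock", "roadblock on route 7"),
]

results = search(statements, relations, "fire_dept:Casualty")
print(results)
```

In this sketch, a query using the fire department's term also retrieves the statement indexed with the health service's more specialized term, although no terminology was harmonized beforehand.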
This article distinguishes two meanings for “knowledge sharing” (KS). The one here called “
restricted KS” is closer to
data(base) sharing: it is about (i) easing the exchange of structured information (KRs or structured data) between
particular agents (persons, businesses or applications) that
can discuss with each other to solve ambiguities or other problems, and (ii) the
complete or efficient exploitation of the information by these particular agents, for particular applications. The other meaning, here called “
general KS”, is about people relating or representing information within or between KBs in ways that maximize the retrievability and exploitation of the information by
any person and application. Examples of early landmark works related to general KS were Ontolingua (server, ontologies and vision) [
11] and the still on-going above-cited CYC project. These two meanings are unfortunately very rarely distinguished, even indirectly, e.g. by the World Wide Web Consortium (W3C). With respect to KS, the W3C has a “Semantic Web vision” [
12] of a “Web of
Linked data” [
13]. As the use of the word “data” may suggest, and as explained in
Section 2, the techniques and vision proposed for these Linked Data are mainly focused on restricted KS. Indeed, the W3C had to focus on the basics and
convince industries of the benefits of KBs over databases. However, after 1997 – the beginning of the popularization of the W3C visions and languages – KS was mainly learned about and operationalized via the W3C documents and languages, and thus almost all research work in KS was implicitly about restricted KS. Among research articles that relate to risk or emergency management and advocate using KBs, most rely on the W3C techniques or approach – e.g. the articles of [
14] (about ontology-supported rule-based reasoning), of [
15] (about ontology-supported access to particular databases) and of [
16] (about a small ontology mainly including 38 concept types and 21 subtype relations, about some crisis management procedures). Previous studies into risk/emergency management have not addressed general KS in these domains and are insufficient to address the large number of distributed and potentially useful sources of information for such management. This insufficiency is also one reason for the above-cited lack of large publicly accessible content ontologies or KBs related to disaster management.
When applied to programming – or, more generally, knowledge modeling and exploiting processes or techniques as well as rules or constraints (or data structures for them) – restricted KS means representing them (i) in a KB directly usable by KR-based software for a particular application, or (ii) in a KB from which a particular program can be manually or semi-automatically generated (this is model-based design/programming). With general KS, these process-related resources are represented and organized into an ontology where general logical specifications are incrementally (and cooperatively) specialized by more and more precise or restricted specifications, according to selected paradigms (e.g. logical, purely functional and state-based) and their associated primitives from particular logics and languages. Since these primitives can be defined or declared in an ontology, such an ontology can store and organize representations that are directly translatable into particular formal languages such as programming languages. Thus, if software components are stored in the lower levels of such an ontology, it may also be used as a scalable library of software components in various languages. Via the systematic use of specialization relations and the explicit representation of any implementation choices, general KS allows the representation of specifications that are language dependent or application dependent while still maximizing knowledge reuse and thus allowing knowledge users (not just knowledge providers) to make such choices.
Knowledge representation and sharing (KR&S) – or, a fortiori, general KS – and the exploitation of its results have various advantages for risk/emergency management. Before an emergency occurs, i.e. in the anticipation phase, KR&S helps in finding, organizing and analyzing resources (e.g. information for/on risk/emergency management techniques), designing tools (e.g. software, KB-based or not, and disaster area exploration robots) and testing them (e.g. via simulations). During an emergency, KR&S helps in finding and coordinating resources (e.g. information and people). After an emergency, KR&S helps in organizing and analyzing data collected during the emergency (e.g. data collected by the robots) and in exploiting it for validating or refining hypotheses, techniques, simulation data and tools, thus for generating new knowledge. All these kinds of help or support that KR&S provides for risk/emergency management derive from the knowledge integration and inferences it permits, compared to data-based technologies. Thus, in that respect, the help and support provided by KR&S technologies (like those of data-based technologies) do not depend on the context, e.g. earthquakes, fires, floods or volcanic eruptions. What changes depending on the context or domain is the knowledge that is represented, searched, retrieved and exploited, as well as particular features required for that, such as particular kinds of KR construct, logic or expressiveness, e.g. for spatial, temporal or probabilistic KR. The help provided by KR&S is better with general KS – hence with the techniques provided in this article – than with restricted KS since general KS (i) supports a better integration of knowledge by more people, hence more knowledge sources, and (ii) supports each knowledge provider, consumer or application in selecting, extending or creating the above-cited particular features they require.
Finally, regarding the context independence of the panorama of techniques provided in this article, it should also be noted that these techniques were developed by the first author due to some clear insufficiencies of existing KR&S technologies for general KS, in any domain. The next paragraph lists these insufficiencies.
Representing rules or filling in data acquisition forms for a particular application – or building a tool to support this – is different from representing knowledge for general KS purposes – or building a tool to support this, e.g. for allowing experts or companies in a particular domain to represent (in a shared KB) the products, services or knowledge they can provide, or for allowing researchers, lecturers and engineers to represent and integrate their knowledge in this shared KB for pedagogical or cooperation purposes. When representing knowledge for general KS purposes, some technological gaps in existing KR&S technologies often become apparent. First, starting from the most immediately apparent: reusing an existing large shared lexical ontology is necessary since otherwise every knowledge contributor would have (i) to define each term (word sense) and its generalizations, and (ii) to relate each of them to each other term of each other contributor; in other words, they would each have to spend months or years creating and relating their own large shared lexical ontologies. Second, extending the KR language used appears useful because it is almost never expressive or concise enough to allow entering all the particular required knowledge for the particular domain to represent. Third, for representing such an amount of complex knowledge, textual KR languages are much easier to use than graphical interfaces, in the same way that, for medium-to-large programs, textual programming languages are easier to use than graphical ones. Fourth, the KR languages used allow many ways to represent equivalent knowledge but the associated inference engines are not able to recognize the results as equivalent.
Fifth, separately-built KBs – hence poorly related KBs that are often inconsistent and implicitly redundant with each other – are not exploitable for general KS: they do not contain enough information for an inference engine to integrate them reliably (analogously, a person cannot integrate texts written in a language they do not understand). Thus, general KS requires shared KBs with KB editing protocols that (i) ensure that enough information is provided to reach and maintain a particular minimal organization in each shared KB, and (ii) do not restrict the knowledge the users want to enter in a KB as long as it is within the scope of this KB. Sixth, in addition to this inner-KB KS protocol, there is a need for inter-KB KS protocols since no single individual shared KB can host and efficiently manage all knowledge in all domains, or have a KB editing protocol that satisfies all knowledge contributors. Although identifying these problems is not too difficult when representing knowledge for general KS purposes, research avenues for solving them were original and ambitious: the work of developing and implementing all the underlying techniques, tools and general ontologies is difficult and very long.
Section 2 introduces complementary techniques for supporting general KS – and hence the ideal described in the second paragraph of this introduction – via four subsections, one for each of the following four complementary topics of such a support: KR language instruments, KR content instruments (reusable ontologies; this is the topic on which most general KS related research focuses), inner-KB content organization, and inter-KB content organization. While doing so,
Section 2 also gives (i) various rationales for the above-cited insufficiencies of classic techniques, and (ii) the constraints (or most important features to support) that explain why the provided solutions are proposed as answers to these insufficiencies. The originality of
Section 2 is in the panorama or synthesis itself, rather than in the depth of the description of the introduced or cited techniques, since the first author has previously published on several of these techniques but separately, not together. However, in
Section 2 some new elements are also introduced. Furthermore, the panorama shows that it is only together that these complementary techniques support general KS by collectively answering the following research question: how to allow Web users to collaboratively build KBs where pieces of information (i) are not implicitly “partially redundant or inconsistent”, neither internally nor with each other, (ii) are complete w.r.t. particular criteria and subjects selected by the KB creators, (iii) do not restrict the knowledge that people can provide nor force them to agree on beliefs or terminology, (iv) do not lead knowledge providers to duplicate information in various KBs, and (v) do not require people to search information in several KBs nor aggregate information from several KBs?
Via several examples,
Section 3, the second part of this article, shows how various kinds of information useful for risk/emergency management can be represented or categorized for the purpose of general KS.
Section 3.1 illustrates how to organize and represent a small terminology, and why performing such tasks is important.
Section 3.2 provides a general model for organizing and representing
Search&Rescue information; the logic-based representation of procedures and other description objects is illustrated and is original for such tasks.
Section 3.3 shows KRs for an automatic systematic exploration of a disaster area, e.g. by a rover (in this article, “rover” refers to an autonomous small vehicle such as those used for planetary surface exploration); the illustrated originality in
Section 3.3 is the representation of procedures.
Section 3.4 represents information about ways to design rovers that are adapted to a terrain; the illustrated originality is in showing how all the important information from three different research articles is synthesized, related and organized. The contents of all these KRs (models, procedures, techniques, …) and their use for designing the intended rovers are themselves validated by the designed prototype rover and its capabilities [
17].
2. Four Complementary Avenues for Supporting General Knowledge Sharing
2.1. Tools to Import/Export Any Kind of Knowledge, Even in User Specified Formal Languages
Knowledge representations (KRs) are logic statements. From a graph-oriented viewpoint, KRs are concept nodes (i.e. concept type instances, quantified or not) connected or connectable by relation nodes (or, more shortly, “relations”: existentially quantified instances of relation types). KRs are expressed in formal languages: KR languages (KRLs). In this article, a KB is a set of objects that are either types (objects that can have instances) or non-type objects. Statements (KRs) are non-type objects. Types are either concept types or relation types. In this article, a “term” is an object identifier that does not solely come from the used KRL, i.e. that is not solely predefined. A term is
defined or declared in an ontology. A KB is only an ontology if it has no base of facts, hence if all its statements are definitions.
Box 1,
Box 2 and
Section 3 give KR examples.
Box 3 illustrates simple semantic queries on KRs.
Box 1. Some equivalent formal representations of a very simple statement (in the names of the given KRLs, “/” separates the used “logic/abstract model(s)” part from the used “concrete syntax model” part, and means that the first one is linearized with the second one).
Box 2. Some equivalent formal representations of a more complex statement, one that cannot be represented in first-order logic (and, a fortiori, in RDF+OWL2; for the representation with the Turtle notation, the IKLmE logic and structural model is used).
Box 3. Some equivalent formal representations of two semantic queries on a KB.
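Independently of the boxes, the definitions given at the start of this subsection (a KB as an ontology plus, generally, a base of facts) and the notion of semantic query can be sketched in Python as follows. This is a deliberately simplified illustration, not one of the KRLs discussed here: the `KB` class, its single-supertype dictionary and all type and instance names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KB:
    # Ontology part (definitions): here only subtype relations, stored as
    # "type -> direct supertype". One supertype per type and no cycles are
    # simplifying assumptions of this sketch.
    subtype_of: dict = field(default_factory=dict)
    # Base of facts: (instance, concept type) pairs.
    facts: list = field(default_factory=list)

    @property
    def is_ontology(self):
        """A KB is only an ontology if it has no base of facts."""
        return not self.facts

    def specializes(self, concept_type, ancestor):
        """True if concept_type is the ancestor or one of its direct or indirect subtypes."""
        current = concept_type
        while current is not None:
            if current == ancestor:
                return True
            current = self.subtype_of.get(current)
        return False

    def instances_of(self, concept_type):
        """Semantic query: instances of the type or of any of its subtypes."""
        return [i for i, t in self.facts if self.specializes(t, concept_type)]


kb = KB(subtype_of={"Rover": "Vehicle", "Drone": "Vehicle", "Vehicle": "Physical_entity"})
only_definitions = kb.is_ontology  # no fact has been entered yet

kb.facts += [("rover_1", "Rover"), ("drone_7", "Drone"), ("map_3", "Map")]

semantic_hits = kb.instances_of("Vehicle")                    # follows subtype relations
lexical_hits = [i for i, t in kb.facts if t == "Vehicle"]     # exact string match only
print(semantic_hits, lexical_hits)
```

The query on `Vehicle` retrieves the rover and the drone through the subtype relations, whereas the purely lexical match on the string `Vehicle` retrieves nothing.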
When it comes to KR languages (KRLs), the W3C first proposes
a few ontologies for “KRL models”, i.e. logic and structural models, e.g. RDF for representing very simple logic formulas (existentially quantified conjunctive formulas), OWL2 for the use of the SROIQ description logic and RIF for representing rules of more expressive classic logics. The W3C also proposes
some notations, i.e. concrete syntax models, for the previous KRL models, e.g. the notations named RDF/XML, RDF/Turtle and RIF/XML.
Box 1 illustrates RDF/Turtle and the meaning of “/” in these names. There exist other standards for other KR logic models, e.g. the model of KIF (the ANSI “Knowledge Interchange Format”) and Common-Logic (CL, the ISO/IEC model for first-order logic), with various notations for them, e.g. Prefixed-KIF, Infix-KIF and XCL (“XML for CL”). However, as described by the next two paragraphs, the current standard or common KRLs have at least two problems for general KS, e.g. for risk/emergency management.
The first drawback of these KRLs is their expressiveness restrictions. Although these restrictions ensure that what is represented via these KRLs has some interesting properties (e.g. efficiency properties), these restrictions prevent the representation of some useful information: some KRs cannot be formally written. Then, these KRs cannot be shared, and this also often leads to the writing of KRs in ad hoc, imprecise or biased ways, hence in incorrect or far less exploitable ways. Conversely, for general KS, enabling people to write expressive KRs often has no downside since, when needed and whatever their expressiveness, KRs can be translated into less expressive ones. This can often be done automatically, to fit the needs of a particular application, by discarding the kind of information that this application cannot handle or does not require. Since such choices are application dependent, the knowledge users should make them, not the knowledge providers. KRs designed for particular applications are often unfit (too biased or restricted, ...) for other applications. As mentioned in other words within the introduction, in general KS, knowledge providers do not make application-dependent choices – or only as additional specializations, hence without restricting the possibilities of knowledge users. Since current or future risk/emergency management cannot be reduced to a list of particular applications, it is limited by expressiveness restrictions.
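The idea that the knowledge user, not the provider, decides what to discard can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `Statement` fields (`quantifier`, `believed_by`) and the function `simplify` are hypothetical, and whether dropping a given part preserves the intended meaning depends on the logic and on the application, which this toy example does not check.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """A relation plus optional, more expressive parts (illustrative names only)."""
    subject: str
    relation: str
    obj: str
    quantifier: Optional[str] = None    # e.g. a numerical quantifier
    believed_by: Optional[str] = None   # meta-statement: who asserts the relation


def simplify(statement, supports_quantifiers, supports_meta):
    """Return a copy keeping only the parts the target application can handle."""
    return replace(
        statement,
        quantifier=statement.quantifier if supports_quantifiers else None,
        believed_by=statement.believed_by if supports_meta else None,
    )


# The provider enters the expressive version once.
rich = Statement("team_A", "requires", "thermal_camera",
                 quantifier="at least 2", believed_by="fire_dept")

# Each knowledge user derives the lossy version that its application needs.
for_simple_app = simplify(rich, supports_quantifiers=False, supports_meta=False)
for_meta_app = simplify(rich, supports_quantifiers=False, supports_meta=True)
print(for_simple_app)
print(for_meta_app)
```

The expressive statement stays unchanged in the shared KB, so another application with different capabilities can later derive a different simplification from it.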
A second important drawback of these KRLs is that they are not “high-level”, meaning that they neither support nor lead to “normalized and easy to read or write” representations of many important notions such as numerical quantifiers, meta-statements, and interpretations of relations from collections. Hence,
even when similar pieces of information are represented, if different KRLs or different knowledge providers are involved, the results are generally so different that matching them to each other is difficult to do automatically, and hence so is searching or aggregating them. Using
ontology design patterns – such as those of [
18] – is difficult and only very partially addresses these issues; thus, it is rarely performed. In addition, for different domains or applications, it is often useful to use different notions and different ways to represent information. Viewing – and,
a fortiori, writing – KRs via current KR editors is even more restricting in terms of what can be displayed and expressed. E.g., graphics take a lot of space and thus do not allow people to simultaneously see and hence visually compare many KRs (this is problematic for KR understanding and browsing).
A first answer to these problems was (i) FL [
19], a KRL that has a very expressive, concise and configurable textual
notation, and (ii) FE [
20], an English-looking version of FL which can more easily be read by people with only a little training in KR. Like FL, FE can use an ontology even for logic-related terms such as quantifiers and hence can be a notation for any logic, unlike the other logic-based controlled languages, e.g. “Attempto Controlled English” and “Common Logic Controlled English”.
Box 1 and
Box 2 illustrate the expressiveness and high-levelness of FL and FE compared to some classic KRLs. The English statement in
Box 2 could have been represented in KIF (since it has a second-order logic notation interpreted into a first-order logic model) but in a less readable and normalizing way.
A more general and complementary answer is the design of an ontology of (i)
model components for logics,
and (ii)
notation components for these models. KRLO (KRL ontology) [
21] is a core for such an ontology: it supports the definition of KRL languages (and actually most formal languages). Furthermore, it is stored in a cooperatively-built shared KB (details in
Section 2.3), which allows Web users to extend KRLO and store the definitions of new KRLs. A library of software components exploiting such an ontology is currently being created. Via these components or modules, KB systems will be able to import/export from/to/between any such specified KR languages, and thus also perform particular kinds of KR translations (in addition, since the rules for such translations are also specified in the ontology, tool users will not only be able to select the rules that they want to be applied but also complement these rules). [
22] criticized KIF, and other would-be KRL interoperability standards, for necessarily packaging only a particular set of logic primitives and hence not actually supporting interoperability if the primitives of any logic cannot be defined with respect to each other with that KRL. The use of KRLO and of translation procedures based on it is a solution to this problem and can be seen as a way to have the interoperability advantage of standards without their expressiveness and notational restrictions. [
21] also shows how common notations such as Turtle or JSON-LD can be used for representing meta-statements and many kinds of quantifiers, albeit in a yet non-standard way.
Box 2 illustrates this with Turtle and IKLmE, a model that is part of KRLO and that represents the concept and relation types of IKL [
23], a first-order logic model that is an extension or variant of CL and KIF for interoperability purposes. Some other research projects had or have some similarities with the KRLO project but do not share the goal of supporting one shared ontology for any number of KRLs. Furthermore, KRLO is cooperatively extendable by Web users, as detailed in subsequent subsections, for general KS purposes as well as general translation purposes between KRLs. No other project related to KRL ontologies had the same goal as the KRLO project. The LATIN (Logic Atlas and Integrator) Project (2009–2012) [
24] represented translation relations between many particular logics. Ontohub [
25] is (i) a repository that included some KRL model representations and some translation relations between them, and (ii) an inference engine able to integrate ontologies based on different logics. ODM 1.1 [
26] is an ontology that relates some elements of some KRL models, mainly RDF, OWL, CL and Topic Maps.
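The separation between an abstract model and its concrete notations, which underlies the “model/notation” naming of KRLs and the import/export goal described above, can be sketched in Python as follows. This sketch does not reproduce KRLO or its software library: the `Relation` structure, the `NOTATIONS` registry, the `ex:` prefix and the three rendering functions are assumptions made for illustration, and the outputs only imitate the general look of Turtle and prefixed KIF for a single binary relation.

```python
from typing import Callable, Dict, NamedTuple


class Relation(NamedTuple):
    """Abstract (notation-independent) model of one binary relation."""
    rel_type: str
    source: str
    destination: str


def to_turtle_like(r: Relation) -> str:
    # Subject, predicate, object order with a hypothetical "ex:" prefix.
    return f"ex:{r.source} ex:{r.rel_type} ex:{r.destination} ."


def to_prefix_kif_like(r: Relation) -> str:
    # Prefix notation: relation type first, then its arguments.
    return f"({r.rel_type} {r.source} {r.destination})"


# Registry of notations; users can add entries without touching the model.
NOTATIONS: Dict[str, Callable[[Relation], str]] = {
    "turtle-like": to_turtle_like,
    "prefix-kif-like": to_prefix_kif_like,
}


def export(r: Relation, notation: str) -> str:
    """Linearize the same abstract relation with the selected notation."""
    return NOTATIONS[notation](r)


fact = Relation("locatedIn", "rover_1", "zone_A")
turtle_text = export(fact, "turtle-like")
kif_text = export(fact, "prefix-kif-like")

# A user-specified notation is added by declaring it, not by changing the model.
NOTATIONS["infix"] = lambda r: f"{r.source} {r.rel_type} {r.destination}"
infix_text = export(fact, "infix")
print(turtle_text, kif_text, infix_text, sep="\n")
```

Keeping the model fixed while notations are declared as separate, extendable entries is the design choice that the sketch is meant to convey; a full solution would also need parsers and the definition of the logic primitives themselves.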
2.2. General Purpose Ontologies Merging Top Level Ontologies and Lexical Ones
Foundational ontologies or, more generally, top-level ontologies define types that
support and guide the checking, organization and representation of the ontologies they are included in. Two examples of well-known general foundational ontologies are DOLCE [
27] and BFO [
28]. The previously cited POLARISCO [
8] relies on BFO for better formalizing and relating the terminologies and methods of various emergency response organizations.
Strictly speaking, lexical ontologies – e.g. ConceptNet 5.5 [
29] – organize and partially define various
meanings of words from natural languages and relate these words to these meanings. However, in this article, the expression “lexical ontologies” also refers to “large mappings between general KBs”, e.g. the lexical ontology of UMBEL (now retired but included into KBpedia [
30]) which had more than 35,000 types and 65,000 formal mappings between categories from (for example) OpenCyc, YAGO, DBpedia, GeoNames and schema.org.
Both kinds of ontologies – top-level ones and lexical ones – are domain-independent, thus usable in risk/emergency management. The more a KB reuses types from such ontologies, the easier it is for people to create, update or organize this KB and the more any of its content can be retrieved using these types. Similarly, the more types two KBs share and are based on (hence, especially types from such ontologies), the easier the content from these two KBs can be aligned or fully integrated. Below, the word “merge” is used for referring to any of these two processes. Since such ontologies are sets of definitions, as opposed to assertions of facts or beliefs, inconsistencies between these ontologies are telltales of conceptual mistakes, such as over-restrictions or misinterpretations. Thus, for the parts where these ontologies are not redundant with one another, such ontologies complement each other and, possibly after making some corrections, can be merged without this leading to inconsistencies.
The Multi-Source Ontology (MSO) [
31] is a step towards such a merged ontology. The MSO already merges several top-level ontologies as well as a lexical ontology derived from WordNet [
32]. It will be complemented with other top-level ontologies, typically those from other merges included in large general ontologies such as YAGO and DBpedia. However, unlike for other merges, the ones in the MSO follow the general KS supporting methods described in the next subsection. Here are examples of what this entails.
The MSO is in a cooperatively-built shared KB where it can be improved and complemented by Web users.
Modifications in such a KB are, whenever needed, “additive”, as opposed to “destructive”, since (i) a modification can be made by adding a relation that states how a newly entered KR
corrects another KR, (ii) KRs are represented as
viewpoints, preferences or beliefs from particular knowledge providers, and (iii) particular relations must be entered between opposing beliefs for them to be later automatically managed according to the wishes of each user. The next sub-section explains how. The other KS approaches are essentially based on helping the creation, handling, retrieval and aggregation of
(possibly competing) ontology modules – e.g. see [
33] – and
versions (for KBs, hence for ontology modules too) – e.g. see [
34]. Modules and versions are relation sets which may be “partially redundant and inconsistent” with each other, i.e., which may be competing. Thus, when creating a KB, such sets often require choices by ontology designers or users for selecting one or another. Using different modules or versions leads to different KBs, thus increasing the list that some knowledge users have to choose from and sometimes integrate. With the approach used in the MSO, additions do not require choices between relations, and particular modules or versions can still be extracted using semantic queries.
In accordance with the previous point, when an ontology is merged into the MSO, its content does not need to be – and is not – destructively modified to fix conflicts with other ontologies. Thus, no arbitrary choice has to be made and this eases the integration of later versions of these integrated ontologies.
The MSO has a top-level organized by subtype partitions, and thus has advantages similar to those of a decision tree for knowledge inference and retrieval purposes. This organization is kept when new KRs are added into the above-cited kind of “additive but consistent” shared KBs.
In addition to a lexical ontology and top-level ontologies, the MSO includes KRLO and hence types interesting for categorizing or representing software or procedures.
Section 3.2 shows how this last point is useful for risk/emergency management too.
2.3. KB Servers That Support Non-Restricting KB Sharing by Web Users
A user of a shared KB may want to complement it with a statement that contradicts another knowledge provider’s statement already in this KB. However, for general KS purposes, a KB should not include two statements that are logically inconsistent with one another, since classical logics – and therefore most inference engines – cannot handle KBs that are logically inconsistent (in other words, most KB management systems are not based on a paraconsistent logical system or a similar approach). Moreover, for general KS purposes, avoiding inconsistencies in a shared KB cannot be achieved by having a person or a committee decide whether to accept each new statement that is submitted to the KB. Indeed, this process is too slow to be scalable and it is important for general KS to preserve the possibilities for knowledge end-users to make selections themselves according to their particular needs. Similarly, general KS cannot use solutions based on selecting only consensual KRs or only KRs from a largest consistent subset of the KB. Using software to dispatch the submitted statements into different KBs (depending on various criteria) so that each resulting KB is internally consistent, e.g. as in the Co4 protocol for building consensual KBs [35], is also not a scalable solution: with such a method, the number of required KBs can grow exponentially and these KBs may be mostly redundant with one another.
Solutions start by associating each term (alias, identifier within the KB) and each statement with its source (its author or, if unknown, the source document). This is already a standard practice when it comes to terms (alias, object identifiers), e.g., the systematic use of URLs (with or without abbreviations) is advocated by the W3C. Regarding statements, making this association amounts to acknowledging that the statements which are usually called facts in KBs are actually beliefs: the associations between them and their sources become the actual facts. This association may be made via meta-statements that contextualize other statements to represent who created them or believes in them. (Unfortunately, as of 2022, the W3C has not yet made recommendations regarding ways to represent contextualizations and OWL does not support the representation of meta-statements.) More generally, in KBs that include such beliefs, the statements provided by users can be categorized as being either “beliefs” or “definitions”. The latter are always “true, by definition” since the meaning of the term they define is whatever its definitions specify (thus, if a definition of a term is inconsistent, this term means “something impossible”). For example, assuming that pm is an identifier for a particular user in a KB, then pm is entitled to create the term “pm:Table” (this identifier uses the term-prefixing syntax allowed by most KRLs advocated by the W3C) and to define it as a type for flying objects rather than as a type for some kinds of furniture. Thus, definitions do not need to be contextualized as beliefs are.
Thus, to avoid direct inconsistencies between statements from different contributors (knowledge providers), a shared KB may have an editing protocol that leads to the entering of beliefs instead of facts. When a contributor C is about to add a belief that the inference engine detects as being in conflict or partially redundant with another contributor’s belief already in the KB, the protocol may ask C to relate the two beliefs in order to (i) represent why this addition is necessary (this is also a way to make C realize that the addition is not necessary or has to be refined), and then (ii) let the inference engine exploit such relations between conflicting beliefs for making choices between them when such a choice is required. For example, if the statements “according to user X, birds fly” and “according to user Y, healthy adult carinate birds can fly” are both submitted, then a relation must be added between these statements to state whether the second statement is a correction (by Y) of the first statement, or whether the first statement is a correction (by X) of the second statement. Such a relation can then be exploited (according to application requirements or the preferences of the current user) for automatically or manually selecting which statement should be exploited by the inference engine in the cases where this engine must choose between the two statements. If the purpose is simply to retrieve knowledge, this choice may not be needed since, when two statements are potential answers to a query, a good and informative result may simply be to return both of them connected by the relevant corrective relation.
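The editing protocol outlined above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the protocol of any actual server: the class `SharedKB`, the identifiers `b1` and `b2`, and the function `toy_overlap` (a crude stand-in for the conflict or redundancy detection of an inference engine) are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class SharedKB:
    """Toy shared KB (hypothetical): every statement is stored as a belief of an author."""

    def __init__(self, overlaps: Callable[[str, str], bool]) -> None:
        self.beliefs: Dict[str, Tuple[str, str]] = {}    # id -> (author, text)
        self.relations: List[Tuple[str, str, str]] = []  # (from_id, type, to_id)
        self._overlaps = overlaps  # stand-in for the inference engine

    def add_belief(
        self,
        belief_id: str,
        author: str,
        text: str,
        relations: Iterable[Tuple[str, str]] = (),
    ) -> None:
        relations = list(relations)
        linked = {target for _type, target in relations}
        # Beliefs of other authors that overlap with the new one but are
        # not yet related to it by the contributor.
        unlinked = [
            other_id
            for other_id, (other_author, other_text) in self.beliefs.items()
            if other_author != author
            and self._overlaps(text, other_text)
            and other_id not in linked
        ]
        if unlinked:
            raise ValueError(f"relate {belief_id} to {unlinked} before adding it")
        self.beliefs[belief_id] = (author, text)
        self.relations.extend((belief_id, t, target) for t, target in relations)


def toy_overlap(a: str, b: str) -> bool:
    # Assumption: a real engine would use logical inference, not keywords.
    return "fly" in a and "fly" in b


kb = SharedKB(toy_overlap)
kb.add_belief("b1", "X", "birds fly")
try:
    kb.add_belief("b2", "Y", "healthy adult carinate birds can fly")
    rejected = False
except ValueError:
    rejected = True  # Y must first state how b2 relates to b1
kb.add_belief(
    "b2", "Y", "healthy adult carinate birds can fly",
    relations=[("correction", "b1")],
)
```

In this sketch nothing is ever removed: the first belief stays in the KB and the correction relation records how the second one relates to it.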
One particular rule for an automatic exploitation strategy may be a specification of the following informal rule: “when a choice between conflicting statements from trustable authors is needed, select the most corrected statements according to their inter-relations and then, if conflicts remain, generate all maximal sets of non-conflicting statements and give the results of the inferences made with each set”. Different users may refine or complement this rule in many ways.
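One possible reading of this informal rule can be sketched in Python as follows. The sketch makes two assumptions that are not stated in the rule itself: “most corrected statements” is read as “statements that are not themselves the target of a correction relation”, and maximal sets are enumerated by brute force, which is only workable for small inputs. The statements `s3` and `s4` are hypothetical examples added for illustration.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence, Set, Tuple


def uncorrected(
    statements: Sequence[str], corrections: Set[Tuple[str, str]]
) -> List[str]:
    """Keep statements that no other statement corrects.

    Each pair in `corrections` is (correcting_id, corrected_id).
    """
    corrected = {old for _new, old in corrections}
    return [s for s in statements if s not in corrected]


def maximal_consistent_sets(
    statements: Sequence[str], conflicts: Set[FrozenSet[str]]
) -> List[Set[str]]:
    """Enumerate all maximal subsets that contain no conflicting pair."""

    def consistent(subset: Tuple[str, ...]) -> bool:
        return not any(frozenset(p) in conflicts for p in combinations(subset, 2))

    found: List[Set[str]] = []
    # Larger subsets first: a smaller consistent subset is maximal only if
    # it is not strictly included in an already found one.
    for size in range(len(statements), 0, -1):
        for subset in combinations(statements, size):
            if consistent(subset) and not any(set(subset) < f for f in found):
                found.append(set(subset))
    return found


statements = ["s1", "s2", "s3", "s4"]
# s2 ("healthy adult carinate birds can fly") corrects s1 ("birds fly").
corrections = {("s2", "s1")}
# Hypothetical remaining conflict between two uncorrected statements.
conflicts = {frozenset({"s3", "s4"})}

kept = uncorrected(statements, corrections)
result = maximal_consistent_sets(kept, conflicts)
```

With these inputs, `kept` no longer contains `s1`, and `result` holds two alternative sets, one with `s3` and one with `s4`; inferences would then be run once per set.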
The shared KB editing protocol of the WebKB-2 server [36] implements and actually adds some precision to this general approach. This protocol uses the addition of particular relations to the KB not only to manage KB sharing conflicts but also to manage modifications to the KB: modifications are additive, not destructive. For example, when objects (relations or terms) are made obsolete by their creators but are still used by other agents, these objects are not fully removed but contextualized in a way indicating (i) regarding terms, who their new owners are, and (ii) regarding relations, who no longer believes in them. Regarding the addition of a belief that the inference engine detects as being in conflict or partially redundant with already stored ones, the main principle of this protocol is to ask the author of the belief to connect it to each of these particular other stored ones via a relation of a type derived from each of the following ones: “pm:correction”, “pm:logical_implication” (alias, “=>”) and “pm:generalization” (not all logical implications are generalizations). Here, “derived” means either “identical”, “inverse”, “exclusive with”, “subtype of”, “subtype of the inverse”, or “subtype of a type exclusive with”. E.g., “pm:non-corrective_specialization_only” is defined as a subtype of the inverse of “pm:generalization” as well as exclusive with both “pm:correction” and “=>”. Thus, all potentially conflicting or redundant statements are (directly or transitively) connected via these relations. This organization has many advantages for inferences, quality evaluations and checks of the KB, e.g. statements can be searched for via their exclusion to some other ones. Even more importantly for general KS, this organization supports automatic choices between conflicting statements via rules such as the one given in the previous paragraph.
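The notion of a relation type “derived from each of” the three base types can be made concrete with a small Python sketch. It only checks that a type definition positions the new type with respect to all three base types using one of the six listed kinds of derivation. The dictionary encoding and the function name are assumptions made for this illustration; they are not the representation used by WebKB-2.

```python
from typing import Dict

# The three base relation types named in the text
# ("pm:logical_implication" is the alias of "=>").
BASES = ("pm:correction", "pm:logical_implication", "pm:generalization")

# The six meanings of "derived" listed in the text.
DERIVATIONS = {
    "identical",
    "inverse",
    "exclusive with",
    "subtype of",
    "subtype of the inverse",
    "subtype of a type exclusive with",
}


def is_derived_from_each_base(definition: Dict[str, str]) -> bool:
    """True if the definition relates the new type to all three base types."""
    return all(definition.get(base) in DERIVATIONS for base in BASES)


# Encoding of the example given in the text.
non_corrective_specialization_only = {
    "pm:generalization": "subtype of the inverse",
    "pm:correction": "exclusive with",
    "pm:logical_implication": "exclusive with",
}

# A definition that says nothing about two of the base types.
incomplete = {"pm:correction": "identical"}
```

Under this encoding, the example type from the text passes the check, whereas a type positioned with respect to only one base type does not.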
Since knowledge providers can specify the above-cited relations even when an inference engine is not able to detect potential conflicts or implicit redundancies, knowledge providers can also specify such relations between informal statements within a KB or a semantic wiki. Thus, the above-described approach can also be used for organizing the content of a semantic wiki and thus avoiding or solving edit wars in it. To sum up, the approach described in the previous paragraph works with any kind of information, does not arbitrarily constrain what people can store or represent, and keeps the KB organized, at least insofar as people or the inference engine can detect redundancies or inconsistencies. In a fully formal KB, many implications have to be provided by knowledge providers (e.g., these implications may be rules these persons believe to be true) but generalization relations between statements can be automatically generated, e.g. for inference efficiency purposes. To obtain or keep a partially informal shared KB organized, and hence better exploit it for inferences and cooperation purposes, the more this KB uses informal terms in its statements, the more useful it is to also ask the knowledge providers to specify generalization relations between statements.
2.4. KB Servers That Support Networked KBs
As hinted in the introduction (first paragraph), there is a huge amount of information that can be valuable for a domain such as risk/emergency management (and the information can also be used for many other purposes). This information cannot all be stored in a single individual KB (alias, physical KB). An individual KB is a KB having one associated KB server that stores this KB and manages query/update accesses to it – one server or, for security purposes, a set of equivalent ones. As opposed to such a KB, a networked KB (alias, virtual KB) is composed of a network of individual KBs where the KB servers exchange information or forward queries among themselves.
The W3C has not made recommendations about networked KBs; it only advised KB authors to relate the terms of their KBs to terms of some other KBs. This advice tries to reduce the problems coming from the fact that most KBs are developed independently from one another, and hence are just structured data for one another since their ontologies are not related or poorly related. However, this strategy for partially independent development of KBs only very partially solves the above-mentioned problems: the more knowledge is added to such KBs, (i) the more inconsistencies and implicit redundancies they have between them, i.e. together, (ii) the harder it then is to align or integrate them, and (iii) each user wanting to reuse such KBs has to (re-)do such integration work. Although there are numerous approaches for partially automating such work or aspects of it, as for example recently summarized by [37], their success rates are necessarily limited: correctly and fully integrating two (partially) independently developed ontologies requires understanding the meaning of each object in these ontologies and hence, most often, finding information that is not represented in them.
Thus, for reasons similar to those given in the previous (sub-)sections, requirements for a networked KB to be scalable and interesting for general KS purposes are: (i) its overall content, i.e. the sum of its component KBs, should be as organized as if it were stored in one individual shared KB with the properties described in the previous subsection, (ii) neither the distribution of the KRs among the KBs, nor the process of adding an individual KB to a networked KB, should depend on a central authority (automated or not), and (iii) no user of the networked KB should have to know which component individual KB(s) to query or add to. Thus, ideally, for general KS on the Web, (i) there would exist at least one networked KB organizing all KRs on the Web, and (ii) additions or queries to one KB server would be automatically forwarded to the relevant KB servers.
These constraints are not satisfied by networked KBs based on distributed or federated database systems. Indeed, in these systems, the protocols that distribute or exchange information and forward queries exploit the fact that each individual KB or database has a fixed database schema or ontology, i.e. one that is not modified by its end-users (e.g. data providers). On the other hand, in general KS, the ontologies of the individual KBs are often updated by their contributors. Many networked KB architectures exploit such database systems, including the architectures advocated in risk/emergency management related articles, e.g. those of [15].
Similarly, these constraints are not satisfied by networked KBs based on peer-to-peer (P2P) protocols or multi-agent system (MAS) protocols. Indeed, for exploiting the KRs within these KBs – e.g., for the distribution, indexation or exchange of knowledge – these protocols also have to rely on some fixed and/or centralized ontologies (and/or use knowledge similarity measures or automatic ontology integration techniques when these approaches are sufficient for the intended task or domains; these measures or techniques may be provided by the individual servers, peers or agents). These fixed ontologies may be stored within the individual servers, software agents or peers – or sometimes even the P2P routing table, as described by [38]. They may also be external to them, with more structured networks (e.g. the use of super peers) or centralized solutions, for instance as described by [39,40,41].
For satisfying the above-cited constraints, the solution proposed in [19] by the first author of this present article is based on the notions of “(individual KB) scope” and “nexus for a scope”. The rest of this section presents the underlying ideas of a recent extension of this solution by the first author. An intensional scope is a KR specifying the kinds of objects (terms and KRs) that a shared individual KB server is committed to accept from Web users. This scope is chosen by the owner(s) of this shared individual KB. An intensional core scope is the part of an intensional scope that specifies the kinds of objects that the server is committed to accept even if, for each of these kinds of objects, another intensional core scope on the Web also includes this kind of object (i.e. if at least one other server has made the same storage commitment for this kind of object). An extensional scope is a structured textual Web document that lists each formal term (of the ontology of the individual KB) using a normalized expression of the form “<formal-term-main-identifier>__scope <URL_of_the_KB>”. Since extensional scopes are Web documents, such a format enables KB servers to exploit Google-like search engines for retrieving the addresses of KBs storing a particular term. A (scope) nexus is a KB server that has publicly published its intensional and extensional scopes on the Web, and has also specified within its non-core intensional scope that it is committed to accept storing the following kinds of terms and KRs whenever they do not fall in the scope of another existing nexus: (i) the subtypes, supertypes, types, instances of each type covered by the selected intensional scope, and (ii) the direct relations from each of these last objects (that are stored in this KB only as long as no other nexus stores them). (The WebKB-2 server that hosts the MSO is a nexus that has at least the MSO as intensional scope. Thus, this server can be used by any networked KB as one possible nexus for non-domain specific terms and KRs.) Then, “an individual KB (server) joining a networked KB” simply means that the KB server commits not only to be a nexus for its intensional scope but also to perform the following tasks whenever a command (query or update) is submitted to the KB server:
The first task is, insofar as the intensional scope allows it, to handle this command internally via the KB sharing protocol of WebKB-2 or another protocol with similar or better properties.
The second task is to forward this command to the KB servers which, given their scopes, may handle it, at least partly. These servers are retrieved via their published extensional scopes.
Thus, thanks to this propagation, each command is forwarded to all nexus that can handle it, and no KB server has to store all the terms of all the KBs, even for interpreting the published scopes of other nexus. To counterbalance the fact that some forwardings of KRs may not be correctly performed or may be lost due to network errors, i.e. to counterbalance the fact that this “push-based strategy” may not always work, each KB server may also search other nexus having scopes overlapping its own scopes and then import some KRs from these nexus: this is the complementary “pull-based strategy”. KB servers that have overlapping scopes may have overlapping content but this redundancy is not implicit and hence, as explained in the previous subsection, not harmful for general KS purposes.
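The push-based part of this mechanism can be sketched in Python as follows. This is a simplified illustration under several assumptions: extensional scopes are taken to be plain text with one “<term>__scope <URL>” expression per line, the intensional scope is reduced to a predicate on terms, and the term `ex:Search_method`, the `example.org` URLs and the function names are hypothetical. Retrieval of the published scopes (e.g. via a search engine) and the actual network calls are left out.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

SUFFIX = "__scope"


def parse_extensional_scopes(text: str) -> Dict[str, Set[str]]:
    """Index lines of the form '<term>__scope <URL>' as term -> KB URLs."""
    index: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].endswith(SUFFIX):
            term = parts[0][: -len(SUFFIX)]
            index.setdefault(term, set()).add(parts[1])
    return index


def route(
    command_terms: Iterable[str],
    own_url: str,
    in_own_scope: Callable[[str], bool],
    index: Dict[str, Set[str]],
) -> Tuple[bool, List[str]]:
    """Decide whether to handle a command locally and where to forward it."""
    terms = list(command_terms)
    # First task: handle the command internally if the scope allows it.
    handle_locally = any(in_own_scope(t) for t in terms)
    # Second task: forward to every other server whose scope lists a term.
    targets: Set[str] = set()
    for term in terms:
        targets |= index.get(term, set())
    targets.discard(own_url)
    return handle_locally, sorted(targets)


published = """\
pm:Table__scope https://kb-a.example.org/
pm:Table__scope https://kb-b.example.org/
ex:Search_method__scope https://kb-b.example.org/
"""

index = parse_extensional_scopes(published)
decision = route(
    ["ex:Search_method", "pm:Table"],
    own_url="https://kb-a.example.org/",
    in_own_scope=lambda term: term.startswith("pm:"),
    index=index,
)
```

Here the server at the first URL handles the command itself, since one of its terms falls in its own scope, and also forwards it to the second server; neither server needs a copy of the other's ontology to make this decision.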
To sum up, Section 2.4 showed how some inter-KB organization(s) can replicate an inner-KB organization that has advantages and supports that are described in Section 2.3 and Section 2.2, which themselves are made possible via the language-related techniques introduced in Section 2.1. Section 3 illustrates some applications of some ideas from Section 2.1 and Section 2.2 for some knowledge useful in risk/emergency management.