Semantic Web and Knowledge Graphs for Industry 4.0

In recent years, owing to technological advancements, the concept of Industry 4.0 (I4.0) has been gaining popularity while presenting several technical challenges that are being tackled by both the industrial and academic research communities. The Semantic Web, including Knowledge Graphs, is a promising technology that can play a significant role in realizing I4.0 implementations. This paper surveys the use of the Semantic Web and Knowledge Graphs for I4.0 from different perspectives such as managing information related to equipment maintenance, resource optimization, and the provision of on-time and on-demand production and services. Moreover, to address the limited depth and expressiveness of current ontologies, we propose an enhanced reference generalized ontological model (RGOM) based on the Reference Architecture Model for I4.0 (RAMI 4.0). RGOM can facilitate a range of I4.0 concepts including improved asset monitoring, production enhancement, reconfiguration of resources, process optimization, product orders and deliveries, and the life cycle of products. Our proposed RGOM can be used to generate a knowledge graph capable of providing answers in response to real-time queries.


Introduction
The emergence of the Internet of Things (IoT), the Internet of Services (IoS), and Cyber-Physical Systems (CPS), together with closer collaboration between human-machine and machine-machine systems, has revolutionized the current industrial landscape, resulting in the so-called Industry 4.0 (I4.0) [1]. Technological advancements and the proliferation of different types of field devices, such as sensors, embedded systems, and self-governed robots, have enhanced I4.0 production. These heterogeneous field devices communicate in real time and thereby generate a huge amount of valuable data during the manufacturing process. The generated data can play an important role in several respects, such as enhancing the life cycle of products, on-time and on-demand production, resource optimization, product customization, maintenance of machines, and logistics [2].
However, the heterogeneous nature of different devices, the variety of their generated data, and their interoperability (or lack thereof) present challenges for the efficient utilization of I4.0 industrial production. To tackle such challenges, the Semantic Web, including knowledge graphs, is one possible solution for obtaining and communicating domain knowledge among distributed I4.0 partners [3].
The Semantic Web has transformed the existing document-based web into a more intelligent system by integrating data and web content into a more structured web environment in which software agents can carry out tasks autonomously for users. The Semantic Web makes use of ontologies to represent information in a machine-processable structure [4]. Ontologies are data models that represent the semantics of domain concepts through ontological terms, i.e., classes (entities) and relationships (properties). An ontology defines the schema of a domain and does not include any information about particular individuals of that domain. For instance, Figure 1 illustrates an ontology with generalized terms, i.e., a class Book is linked to another class Author by a property hasAuthor. Inserting data instances into the ontological terms, on the other hand, yields a knowledge graph. An ontology is thus a subset of a knowledge graph and is needed for its development. For example, when specific instances, such as a book named Hour of the Witch written by the author Chris Bohjalian, are mapped onto the ontological terms of Figure 1, the result is a knowledge graph, as illustrated in Figure 2. Ehrlinger et al. reported several definitions of a knowledge graph and clarified the difference between an ontology and a knowledge graph [5]. The emergence of knowledge graphs provides an enterprise-ready data framework analogous to the current status of the Semantic Web by integrating knowledge storage and intelligent discovery. Graph embedding techniques are used to discover additional information from knowledge graphs [6]. Even though extensive work has been done in semantic data modelling to facilitate I4.0 applications, the currently available semantic model-based ontologies have several limitations owing to the complex nature of overall I4.0 systems. Three of the major issues are: (1) these production line models do not follow "Linked Data" principles and thus do not reuse existing vocabularies such as Dublin Core and schema.org; (2) the scope of these models is application-specific (i.e., they cover a limited area such as manufacturing processes, resources, etc.) rather than covering the overall I4.0 system, ranging from data generation to production; and (3) there is no Industry 4.0-ready knowledge graph that can answer queries, owing to a lack of real-time data availability [7,8].
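The distinction between an ontology and a knowledge graph drawn in the Book and Author example can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: triples are held as plain tuples rather than in an RDF store, and the instance identifiers (Hour_of_the_Witch, Chris_Bohjalian) and the helper function objects are names chosen here, not taken from Figures 1 and 2.

```python
# Schema-level triples (the ontology): two classes and the property linking them.
ontology = {
    ("Book", "rdf:type", "owl:Class"),
    ("Author", "rdf:type", "owl:Class"),
    ("hasAuthor", "rdfs:domain", "Book"),
    ("hasAuthor", "rdfs:range", "Author"),
}

# Instance-level triples: the individuals of the book example (identifiers are illustrative).
instances = {
    ("Hour_of_the_Witch", "rdf:type", "Book"),
    ("Chris_Bohjalian", "rdf:type", "Author"),
    ("Hour_of_the_Witch", "hasAuthor", "Chris_Bohjalian"),
}

# The knowledge graph is the ontology together with the instance data.
knowledge_graph = ontology | instances


def objects(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


authors = objects(knowledge_graph, "Hour_of_the_Witch", "hasAuthor")
print(authors)
```

The same lookup against the ontology alone returns nothing, which reflects the point made above: the schema carries no information about individuals until instances are added.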
In this paper, we summarise existing approaches for Industry 4.0 and the Semantic Web. Our aim is to highlight the major issues and opportunities that could arise from the merger of these existing technologies. In particular, this survey provides a comprehensive overview of the existing ontologies and concludes with an overview of how we can benefit from using these ontologies and creating a knowledge graph for Industry 4.0.
The rest of this paper is structured as follows: Section 2 explains the search methodology. I4.0 is highlighted in Section 3. The manufacturing production lines are described with their requirements, applications, and challenges in Section 4. The Semantic Web and ontologies for I4.0 are reviewed and analysed in Section 5. The Reference Generalized Ontological Model (RGOM) is explained in Section 6. Section 7 provides a detailed discussion summarising insights gained from the systematic review, and finally, Section 8 concludes the paper with possible future directions.

Methodology
The survey followed a three-stage methodology comprising (i) planning and scoping the review, (ii) filtering the review, and (iii) reporting the review [9]. In the first stage, the scope of the review was set to determine the literature's relevance to the Semantic Web and knowledge graphs in I4.0. This stage involved identifying the most suitable keywords for selecting articles. The keywords comprise two main parts, i.e., method and field: the first term represents the method, while the second term represents the field in which the method is utilized. One method keyword is combined with a field keyword at a time. For example, the field keyword industry 4.0 is combined with the method keyword ontology to search for ontologies for industry 4.0, and it is then combined with the method keyword knowledge graph to search for knowledge graph for industry 4.0. Likewise, the field keywords listed in Table 1 were combined one by one with the method keywords to cover all manufacturing and production synonyms and technologies in which knowledge graphs can be applied. The & is a Boolean AND that joins the method keyword with the field keyword, while the + is a Boolean OR that incorporates alternative keywords, synonyms, or spellings of the field keyword.
This stage provided the initial search of different databases, namely the ACM Digital Library, IEEE Xplore, ScienceDirect, and Scopus, with dates ranging from 2010 to 2020, which resulted in about 164 articles in total. Table 2 lists the digital libraries used for searching articles. Additionally, the Google Scholar search engine was used in order to include non-academic publications. The articles include academic as well as industry publications comprising conference papers, workshop papers, letters, journal articles, and peer-reviewed books. In reporting the literature, we included only full-text works based on an ontology proposal as well as the construction of a knowledge graph for smart manufacturing.
In the second stage, advanced filtration was applied by considering the different versions of the selected ontologies in conjunction with the titles and abstracts, which resulted in the selection of 110 more specific articles. In line with the selected ontology versions, the title and abstract of each research paper were studied to determine its relevance for inclusion. The filtration process was carried out using the following steps.
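The keyword combination scheme can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the exact search strings used in the survey: only the keywords ontology, knowledge graph, and industry 4.0 come from the text, while the alternative spellings, the second field group, and the use of parentheses for grouping are assumptions made here because Table 1 is not reproduced in this section.

```python
from itertools import product

# Method keywords named in the text.
METHOD_KEYWORDS = ["ontology", "knowledge graph"]

# Field keywords: each entry groups a keyword with alternative spellings.
# Only "industry 4.0" comes from the text; the other entries are hypothetical.
FIELD_KEYWORDS = [
    ["industry 4.0", "I4.0", "industrie 4.0"],
    ["smart manufacturing", "smart factory"],
]


def build_queries(methods, fields):
    """Combine one method keyword at a time with each field keyword group."""
    queries = []
    for method, alternatives in product(methods, fields):
        field_part = " + ".join(alternatives)          # "+" acts as Boolean OR
        queries.append(f"{method} & ({field_part})")   # "&" acts as Boolean AND
    return queries


queries = build_queries(METHOD_KEYWORDS, FIELD_KEYWORDS)
for q in queries:
    print(q)
```

With two method keywords and two field groups, the sketch yields four search strings, one per method and field pairing.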

• The most relevant ontologies covering reference architectures, manufacturing production line, predictive maintenance, and supply chain concepts of I4.0 were captured.

• All versions of the chosen ontologies were examined to understand their functional behaviour and their adaptation in the study.
The third stage is reporting the review and is composed of two steps. In the first step, a full-text reading approach was adopted to further narrow the search, which yielded 87 articles. This step excluded all papers summarizing work on the Semantic Web or knowledge graphs in smart manufacturing. In the second step, a total of 51 papers were found relevant for inclusion in the study. Each round contains only articles that were affirmed to be relevant in the previous round. The overall methodology adopted in this work is summarized in Figure 3.

Industry 4.0
I4.0 is an emerging topic, coined by researchers to denote a new era for industry, and it has been widely adopted throughout the scientific world as well as in industry, particularly in Germany [10]. A few other countries with manufacturing-based industries, such as Japan [11] and Korea [12], have also been influenced by the I4.0 concept and have launched related programs.
I4.0 aims to merge the advantages of technologies such as CPS, the Internet of Services (IoS), and IoT to create smart factories. In contrast to traditional systems, a CPS is a system of systems that requires different machines, materials, and humans to work together intelligently to enhance production [13].
The vision of I4.0 comprises nine main pillars [14]. Among them, system integration is one of the key pillars; it aims to mutually connect three factors, i.e., digitization and assimilation into a complex technical-economical network from any simple relation, digitization of the offered products and services, and new market models [10]. According to [15], devices need to be integrated along three dimensions to achieve the objective of establishing smart factories.

Vertical Integration in a Factory
Vertical integration refers to the integration of systems at different hierarchical manufacturing levels into one smart manufacturing solution. This integration spans from the production shop floor, which contains devices such as cyber-physical systems, actuators, and sensors, to the enterprise resource planning (ERP) system at the business planning level.

Horizontal Integration over Value Networks
Horizontal integration connects the resource and information networks within the value chain to achieve smooth collaboration between businesses and to deliver services and products in real time. Smart manufacturing factories reach the world by utilizing global production chains and data networks in their processes. Horizontal integration ensures that factories across the globe can interact as one smart factory.

End to End Integration
Product life cycle development involves many engineering activities to build a CPS, e.g., ideation, design, manufacturing, utilization, and closure. The objective of a CPS engineering activity is to provide a good-quality product, e.g., a comprehensive design of production plants, and to meet accurate time frames.
Integrating factories across the globe is a complicated task because they operate according to different rules, standards, and business practices. To solve the interoperability problem in such an integrated environment, I4.0 entities such as actuators, conveyors, and sensors need to be described semantically so that their meaning can be understood and shared. Moreover, I4.0 addresses research and development in the remaining eight main pillars to sustain the implementation of its principles in industry [16]. The focus of this work is to survey the recent literature in order to highlight the major issues in recent ontological solutions for I4.0 and to propose a generalized ontology framework covering most of the important I4.0 concepts.

Manufacturing Production Line
In this section, we give an overview of manufacturing production lines and their requirements, applications, and challenges. Later on, we demonstrate the effectiveness of the Semantic Web and related technologies in addressing these challenges. A manufacturing production line performs a set of consecutive and parallel actions, processes, or procedures set up at a factory site. A manufacturing process is a progressive approach broken down into relatively distinct, brief actions carried out at specially equipped components [17]. Components such as machines, workstations, and cells are assembled to make the desired product from raw materials. During a refining process, the raw materials are refined into a product that is fit for consumption. For example, textile source plants such as cotton, or agricultural products such as foodstuffs, need a series of processes to make them usable. Production lines manufacture a single product or a variety of products that are similar in design but differ in features.
The I4.0 manufacturing production line is responsible for mass production using advanced technology tools such as sensors, actuators, IoT, etc. To achieve mass production, certain requirements must be fulfilled, and different applications have been built to meet them. The requirements, applications, and challenges are related to each other as shown in Figure 4. In the I4.0 production line, technologically advanced tools and techniques are deployed based on the reference architectures to produce a variety of products. To produce on-demand and in-time mass products, the manufacturing production line has the following requirements.

Smooth Operation without Any Delay
Several factors, such as equipment, special parts needed for production, and factory overhead, can affect the smooth operation of the manufacturing production line. An outage of the power supply or network system, or a rise or fall in the ambient temperature of materials, can result in a temporary shutdown of manufacturing. Similarly, the maintenance of machines and other manufacturing equipment influences the service life of the equipment and its manufacturing efficiency, which can affect operation. Most of the time, however, it is machine failure that delays smooth operation [18]. To ensure the smooth operation of production without delay, an efficient failure control system needs to be implemented [19].

Maximum Optimization of the Process
Process optimization is a multi-feature problem aimed at enhancing technological performance. It involves a trade-off between performance with respect to time and performance with respect to the process, taking the dynamics of the process into account. Optimization relies heavily on the amount of data generated during the different production processes. Data-driven techniques such as prediction of process inefficiencies, process-based machine learning, connectivity and capture of real-time data, and digital twins can help to optimize and improve process performance [20,21]. Thus, optimizing the process can help reduce unnecessary cost, maximize the usage of resources, and increase production.

System Integration
One of the key requirements in the production line is the integration of systems, so that distinct systems can comprehend and access each other's information and functions. The scope of integration ranges from devices and sensors to other control systems providing different on-demand services. Effective integration of systems can result in effective communication between devices, systems, and sensors [22].

Reduction in Overall Process Time
The overall process time of manufacturing a product, also known as the product turnaround time (TAT), is defined as the time elapsed from the release of raw material to the completed product [23]. Reducing the overall process time is essential in any industry because of its key impact on higher product revenues and lower cost. The major components of the overall manufacturing process time are queue, processing, and shipping time [24].

Applications
The following applications are used to fulfil the aforementioned requirements.

Predictive Maintenance
In mechanical, electronic, or mechatronic devices, various problems such as wear and tear and faults can occur and cause disruptions [25]. Predictive maintenance facilitates smooth operations in any production line. It reduces unnecessary activities because it does not depend on periodic maintenance intervals bound to the average lifetime, and hence it brings down the number of maintenance activities over a machine's lifetime. In addition, both premature maintenance activities and late ones, such as equipment failure before the next periodic maintenance interval, can be avoided, since periodic intervals rely on an average lifetime that likely includes significant positive and negative deviations from the mean. The reduction of both unnecessary maintenance activities and fatal breakdowns results in increased productivity and less production downtime in a production line. Predictive maintenance observes these problems and predicts whether a failure or fault will occur in the device. The authors in [26,27] have used predictive maintenance approaches to reduce the delay in smooth operations, i.e., to decrease the number of machine failures resulting in unscheduled downtime.

Production Efficiency
I4.0 uses simulations to optimize processes and forecast production efficiency. Simulations are used to optimize a smart manufacturing process and improve its production capabilities. They give industries the chance to increase their productivity, which is critical under intense competition [28]. As explained by rut et al., inefficiencies in operational activities, such as poor information flow and a lack of connection between different sets of processes within a company, hamper day-to-day operations. With the growing market, the need to optimize manufacturing processes is gaining pace, as the goal is to provide high-quality products while maintaining an optimal level of resource consumption and availability; this has always been an area of concern for many companies. The optimization of production processes has made production lines efficient enough to produce mass products [29] and has eliminated halts of manufacturing equipment [30].

Semantic Modeling of the Production Line
Smart manufacturing requires the assimilation of vertical and horizontal components, systems, and services. This is needed for seamless information exchange between different systems that operate under a wide variety of communication standards [31]. For integrating systems in the manufacturing production line, the Semantic Web and ontologies have emerged as promising technologies [32]. The advantage of an ontology is that it semantically conceptualizes the domain knowledge of the manufacturing production line in order to integrate the different systems. Reasoners, together with rules written in the Semantic Web Rule Language (SWRL), are used to infer additional knowledge from a set of asserted facts or axioms.
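How a rule derives additional knowledge from asserted facts can be illustrated with a small forward-chaining sketch in plain Python. It does not use a real reasoner or actual SWRL syntax; the facts, the property names (hasSensor, observes, monitors), and the single rule are hypothetical and serve only to show the principle.

```python
# Asserted facts as (subject, predicate, object) triples; all names are hypothetical.
facts = {
    ("Extruder1", "hasSensor", "TempSensor7"),
    ("TempSensor7", "observes", "Temperature"),
    ("Conveyor2", "hasSensor", "SpeedSensor3"),
    ("SpeedSensor3", "observes", "BeltSpeed"),
}


def apply_rule(graph):
    """Rule: hasSensor(?m, ?s) AND observes(?s, ?p) -> monitors(?m, ?p)."""
    inferred = set()
    for machine, pred1, sensor in graph:
        if pred1 != "hasSensor":
            continue
        for subj2, pred2, prop in graph:
            if pred2 == "observes" and subj2 == sensor:
                inferred.add((machine, "monitors", prop))
    return inferred


def forward_chain(graph):
    """Apply the rule repeatedly until no new triples are produced."""
    closure = set(graph)
    while True:
        new = apply_rule(closure) - closure
        if not new:
            return closure
        closure |= new


closure = forward_chain(facts)
inferred_only = sorted(closure - facts)
print(inferred_only)
```

The two monitors triples in the result were never asserted; they follow from the rule, which is the kind of implicit knowledge a reasoner makes available to integrated systems.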

Production Scheduling
Scheduling is an approach that coordinates machines and resources in order to perform a given task within a certain time frame [33]. It determines when an activity in a production line has to start or stop. The required reduction in overall process time is achieved with production scheduling, which organizes and arranges the resources and activities needed in the manufacturing process. The motivation of production scheduling is to reduce queuing and production time by specifying which raw material is to be used, and on which equipment, to produce a product.

Challenges
These applications still face some challenges to meet the production line requirements.

Quality of Data
Quality data is crucial for any predictive maintenance system. Data quality can be divided into four areas: intrinsic, representational, accessibility, and contextual data quality [34]. Intrinsic data quality is about believability, reputation, accuracy, and objectivity. Representational data quality covers the dimensions of interpretability, ease of understanding, representational consistency, and concise representation, and has to do with how the data are formatted. Accessibility data quality concerns accessibility and security. Contextual data quality covers the dimensions of value, relevancy, timeliness, completeness, and appropriate amount of data; this area includes missing or incomplete data and is heavily dependent on the context. Together, these areas and dimensions give a coherent picture of data quality. The lack of data quality is a big challenge for companies trying to implement predictive maintenance systems at different levels of decision-making in a production line.
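As a small illustration of one contextual dimension, completeness, the following Python sketch computes the fraction of non-missing values per field in a batch of sensor readings. The records, the field names, and the definition of completeness as a simple ratio are assumptions made for this example and are not taken from [34].

```python
# Hypothetical sensor readings; None marks a missing value.
readings = [
    {"machine": "M1", "temperature": 71.5, "vibration": 0.02},
    {"machine": "M1", "temperature": None, "vibration": 0.03},
    {"machine": "M2", "temperature": 68.0, "vibration": None},
    {"machine": "M2", "temperature": 69.2, "vibration": 0.01},
]


def completeness(records, field):
    """Fraction of records in which the given field is present and not None."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)


report = {f: completeness(readings, f) for f in ("temperature", "vibration")}
print(report)
```

A predictive maintenance pipeline could use such a ratio as a gate, for example by refusing to train on fields whose completeness falls below a chosen threshold; the threshold itself would depend on the context, as noted above.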

Resource Consumption
At the machine level, different resources are utilized by the manufacturing process. Unnecessary consumption of resources by manufacturing processes strongly affects production efficiency. In addition, the absence of machine integration and communication results in the unnecessary operation of machines, which increases needless resource consumption. This also results in the consumption of power; for instance, machinery utilizes electricity and automobiles consume fuel. The goal of I4.0 is to accomplish low-cost production efficiency while leveraging automation.

Interoperability
Interoperability involves accessing real-time data, which opens up a new approach to how companies can improve their production operations. It allows manufacturing partners (including customers, suppliers, and other departments) and their machines to share information accurately and quickly. The result is more effective and more reliable operations. The goal of Industry 4.0 is to achieve low-cost production efficiencies while leveraging automation. However, there is a lack of a common information model that can integrate systems that use incompatible communication protocols and generate data in diverse formats on the production shop floor.

Multi-Line and Multi-Product Constraints
Given the conflicting goals in manufacturing production, there are quite a few constraints for multi-line and multi-product settings, such as the preferable product line segment, resource sharing constraints, and minimum run-length constraints. Many of the problems that arise during the design and execution of a manufacturing production line are balancing problems [35]. Balancing a production line involves keeping track of the number of workstations and the operations assigned to them; an overall task or operation is distributed among these workstations. In the production scheduling process, satisfying the aforementioned constraints is a very challenging task.
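The idea of distributing an overall task among workstations can be sketched with a simple greedy heuristic in Python. This is a deliberately simplified illustration and not the method of [35]: it assumes a fixed cycle time per station, takes the tasks in an order that is already feasible, and ignores the multi-line and multi-product constraints listed above. The task names and durations are hypothetical.

```python
def balance_line(task_times, cycle_time):
    """Greedily assign tasks, in the given order, to workstations.

    A new workstation is opened whenever adding the next task would push
    the current station's total time above the cycle time.
    """
    stations = []              # each station is a list of task names
    current, load = [], 0.0
    for task, duration in task_times:
        if duration > cycle_time:
            raise ValueError(f"task {task!r} exceeds the cycle time")
        if load + duration > cycle_time:
            stations.append(current)
            current, load = [], 0.0
        current.append(task)
        load += duration
    if current:
        stations.append(current)
    return stations


# Hypothetical tasks with durations in seconds, listed in a feasible order.
tasks = [("cut", 20), ("drill", 25), ("weld", 30), ("paint", 15), ("pack", 10)]
stations = balance_line(tasks, cycle_time=50)
print(stations)
```

Even in this reduced form, the result depends strongly on the task order and the cycle time, which hints at why satisfying the full set of constraints is hard in practice.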

Semantic Web and Ontologies for Industry 4.0
With the advancement of information technology, the mode of manufacturing is transforming from mechanization to intelligence and digitization. This transformation is driven by the utilization of the Internet of Things, sensors, and CPS [36,37]. All these devices and systems are vertically integrated in a smart factory [38].
Ontologies have emerged as a significant tool for representing the domain knowledge of I4.0 to support integration and interoperability [39,40]. Gruber et al. [41] define an ontology as the formal, explicit specification of a shared conceptualization. The basic elements of an ontology are concepts, the relations between them, and axioms. When instances of the concepts are populated into the ontology, it becomes a knowledge base, also known as a knowledge graph (https://web.stanford.edu/class/cs520/, accessed on 1 May 2021). There are two key elements in an ontology: terminological components (TBox) and assertional components (ABox), which specify the concepts and their instances, respectively. Ontologies support an automatic process known as reasoning to retrieve axioms that have not been explicitly incorporated in the knowledge graph. Based on the representation of conceptualizations, ontologies can be classified as general (or core) ontologies and domain ontologies [42]. Core ontologies represent conceptualizations that are domain-independent and universal, e.g., the time ontology [43] and the sensor ontology [44], and can be utilized in different scenarios across many realms of knowledge. Domain ontologies represent the knowledge of a specific domain, such as the ontology of the process specification language (PSL) [45] and the MAnufacturing Semantic ONtology (MASON) [46].
Ontologies have been widely used in I4.0 to model devices and their capabilities, parameters, processes, etc. Knowledge representation with ontologies has helped to solve various problems such as interoperability (between standards and devices), modelling of domain knowledge, and integration of IoT. Table 3 presents the research focus and datasets of the identified ontologies.

Ontologies for I4.0 Reference Architectures and Standards
Reference architectures use analogous standards and suffer from interoperability issues caused by these comparable standards. Several reference architectures and standards have been proposed to enable interoperability in smart industries. The standards and reference architectures define, classify, align, and integrate resources and processes along with the communication among them. Several research studies have resolved interoperability conflicts among I4.0 standards by using ontologies to communicate the mutual knowledge of standards. Recently, the interoperability problem among standards has been tackled by characterizing the standards through a proposed standard ontology (STO) [47]. Moreover, the description of standards helped to classify them from different viewpoints according to the reference architecture and to discover the relationships among I4.0 standards. Currently, the created dataset contains more than 60 standards and 20 standard organizations [47].
Another study proposed a heavy-weight ontology-based approach to explore the capability of ontologies in demonstrating and utilizing the semantics of standards in a smart manufacturing context [48]. Interoperability between integrated standards and the development of their semantics is achieved by identifying smart manufacturing standards and their semantic heterogeneity differences [49]. Many other studies are relevant to the analysis of I4.0 standards [50–52].

Ontologies for Industry 4.0 Manufacturing
Semantic modelling of the smart factory and the manufacturing production line, together with the interoperability of manufacturing systems, is a crucial feature of I4.0 production, in which tangible assets such as systems, devices, and sensors are connected with each other over the internet. In the I4.0 context, emphasis has been placed on the alignment of manufacturing systems, processes, resource reconfiguration, etc., in the production line. In the last decade, there have been numerous efforts to represent the domain knowledge of I4.0 in the form of modular ontologies, e.g., resource, device, process, and predictive maintenance ontologies, to meet manufacturing production requirements. There have been rigorous efforts to develop ontologies that semantically model the manufacturing production line with clear meaning. Buchgeher et al. conducted a survey on the role of knowledge graphs in production and manufacturing [53]. They reported the bibliometric facts, types of research, statistics, and application scenarios of knowledge graphs in manufacturing and production.
Recently, Kalaycı et al. proposed the SIB framework to integrate Bosch manufacturing data in order to analyse the surface mounting process pipeline [54]. To experiment with their framework, they developed a surface mounting (SMT) ontology to map the production line data. Grangel-González et al. layered domain ontologies on top of the SMT ontology to address the interoperability issue in manufacturing data [55]. Neither approach follows the Linked Open Data principle of reuse.
Wan et al. proposed a resource configuration-based ontology that describes the domain knowledge of the reconfiguration of sensible manufacturing resources using the Web Ontology Language (OWL) [56]. The objective of their work is to integrate CPS equipment through an ontology-based resource integration architecture. The generated data are stored in a relational database and are associated with and mapped onto the model instances of the manufacturing ontology. The proposed ontology for resource reconfiguration was examined using an intelligent manipulator as a use case, which verified its manufacturing feasibility. A reconfigurable I4.0 approach for pharmaceutical products has been proposed to meet the increasing requirements for flexibility, agility, and low cost in the health sector [57]. It comprises three layers, namely the executing, deployment, and perception layers. The knowledge graph employed in the perception layer represents the semantics of manufacturing based on the MASON ontology for scheduling the production plan. In the deployment layer, the IEC 61499 standard is implemented for modelling the functionality and control of machines. The feasibility of the proposed approach is validated with a use case of on-demand drug packing. Kovalenko et al. proposed the AutomationML ontology to represent the semantic modelling of cyber-physical systems, covering data exchange in I4.0 scenarios [58]. The semantic representation of I4.0 devices in the administration shell provides integration, identification, data availability, etc., of the devices [59,60].
Considering the domain of I4.0 resources, Ramírez-Durán et al. developed a semantic model (ExtruOnt) to describe the knowledge of a manufacturing machine known as an extruder, which executes the extrusion process [61]. The scope of ExtruOnt is confined to a specific domain: it provides information about extruder components, three-dimensional representations of components and their spatial connections, features, and the sensors capturing data about machine performance. However, it can be used as a reference model for developing ontologies that represent other manufacturing machines in I4.0.
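The step in which relational data are mapped onto ontology instances, as described for the work of Wan et al., can be pictured with a short Python sketch. It is not their actual mapping: the table rows, the column names, the class name Manipulator, and the property names are all hypothetical and chosen only to show how one row can become a set of instance triples.

```python
# Hypothetical relational rows, e.g. fetched from a manipulator status table.
rows = [
    {"id": 1, "device": "Manipulator_A", "state": "idle", "payload_kg": 2.5},
    {"id": 2, "device": "Manipulator_B", "state": "busy", "payload_kg": 4.0},
]

# Hypothetical mapping from column names to ontology properties.
COLUMN_TO_PROPERTY = {
    "state": "hasState",
    "payload_kg": "hasPayloadKg",
}


def row_to_triples(row, class_name="Manipulator"):
    """Map one relational row to instance triples of the given ontology class."""
    subject = row["device"]
    triples = [(subject, "rdf:type", class_name)]
    for column, prop in COLUMN_TO_PROPERTY.items():
        triples.append((subject, prop, row[column]))
    return triples


instance_triples = [t for row in rows for t in row_to_triples(row)]
for t in instance_triples:
    print(t)
```

Keeping the column-to-property mapping in a separate table, as done here, is one possible design choice; it lets the same function serve other device classes by swapping the mapping and the class name.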
An ontology covering both simple and combined capabilities of manufacturing resources has been proposed to describe the capabilities of a production system [62]. A preliminary conceptual model of resource capabilities was produced in [63]. The ontology development process followed the five stages of the ontology engineering methodology: feasibility study, kick-off, refinement, evaluation, and usage and evolution. According to Jarvenpaa et al., the Manufacturing Resource Capability Ontology (MaRCO) is used by resource vendors to represent the capabilities of the resources they offer and to publish them in digital marketplaces or global resource lists, which production companies or system integrators then browse when reconfiguring existing manufacturing systems or designing new ones. MaRCO aims to support matchmaking between the capabilities of resources and the production requirements of a product. Kaar et al. extracted context and information from the Reference Architectural Model Industrie (RAMI 4.0) [64] and suggested an ontology approach to integrate the Industry 4.0 process. This information was extracted from different sources, standards, architectures, and models relevant to I4.0. The objective of the ontology is to give an overview of the key concepts of RAMI4.0 and their relationships, enabling the identification of inconsistencies, gaps, and redundancies in the descriptions and definitions of its layers for I4.0 process development. They followed a middle-out approach to analyze the layers of RAMI4.0, breaking down the sentences referring to the information layer and transferring them into a concept map. The Semantic Manufacturing Ontology highlights the sequence of processes and machines required for an ordered workpiece (product) [65]; its Turtle file is available online (http://i40.semantic-interoperability.org/smo/smo.ttl, accessed on 1 May 2021). Mazzola et al. proposed the CDM-Core (http://sourceforge.net/projects/cdm-core/, accessed on 1 May 2021) ontology by re-using existing domain and core ontologies [66]. The authors claimed it to be the largest publicly available global ontology. However, they focused more on the service-oriented architecture and the monitoring of manufacturing services. There is no explicit information regarding the modelling of the manufacturing factory, and main concepts such as the type of processing and the type of machine are missing.
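The capability matchmaking that MaRCO targets can be illustrated with a minimal Python sketch. This is not the MaRCO implementation: the capability names, the `COMBINED_CAPABILITIES` rule table, and the `Resource` class are hypothetical, and the only idea taken from the text is that simple capabilities of co-operating resources can combine and are then compared against what a product requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A manufacturing resource advertised by a vendor (hypothetical model)."""
    name: str
    capabilities: frozenset


# Hypothetical rules: a combined capability emerges when a set of
# co-operating resources jointly offers all of its constituent capabilities.
COMBINED_CAPABILITIES = {
    "PickAndPlace": {"Grasping", "Moving"},
    "Screwing": {"Spinning", "Holding"},
}


def offered_capabilities(resources):
    """Union of the simple capabilities plus any combined ones they enable."""
    simple = set().union(*(r.capabilities for r in resources))
    combined = {
        name for name, parts in COMBINED_CAPABILITIES.items() if parts <= simple
    }
    return simple | combined


def matches(required, resources):
    """True if the resource combination covers every required capability."""
    return set(required) <= offered_capabilities(resources)


gripper = Resource("gripper", frozenset({"Grasping"}))
robot_arm = Resource("robot_arm", frozenset({"Moving"}))

print(matches({"PickAndPlace"}, [gripper, robot_arm]))  # True
print(matches({"PickAndPlace"}, [gripper]))             # False
```

In an ontology-based setting the rule table would be expressed as axioms or rules and evaluated by a reasoner; the set comparison here only mimics that outcome.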
A manufacturing system should be able to incorporate and assist humans (operators, technicians). Humans participate in the environment of automated systems, so the role of the operator in such an environment needs to be considered. Ferrer et al. proposed adding the skills and tasks performed by humans to a manufacturing ontology that uses CPS knowledge repositories [67]. Their work presented a semantic model that allows modelling of the operations carried out by human operators. However, they focused more on service orchestration during the production plans. Ahmad et al. proposed integrating manufacturing domain data such as Product, Process, and Resource (PPR) using an ontology approach for matching product requirements in assembly automation [68]. The PPR mapping information helps in deriving the processes and resources required to manufacture the designed product. For modular ontology design, Protégé (http://protege.stanford.edu/, accessed on 1 May 2021), an ontology editor tool, can be used to form a graph dataset including all imported ontologies.
Teslya et al. proposed an ontology-based approach to describe industrial components, merging four different scenarios to form an upper-level ontology [69]. Such a union makes it possible to change the created business process to boost product customization for the customer and reduce costs for producers. In another study [70], the researchers proposed an ontology model of RAMI4.0 to exchange information between assets and I4.0 components in a meaningful way.
In [7], the authors proposed an ontology by merging five ontologies (base, product, process, device, and parameter) to represent the manufacturing production process from order to completion of the product. The ontology is built on top of the product, process, device, and parameter ontologies so that they can interact with each other. Additionally, the order concept is modelled as a separate ontology that is linked with the product. A service-oriented architecture has been built on top of this ontology model to discover, select, organize, and consume semantic web services dynamically [71].
Seyedamir et al. utilized the concepts of manufacturing resource, process, and product from the ISA-95 standard [72]. They adopted semantic rules to infer implicit knowledge, making it possible to inspect which machines are needed to produce product variants. Saeidlou et al. designed an ontology model for the manufacturing domain and developed a semantic query algorithm to investigate the semantic richness of the results returned by the ontology model for a queried keyword [73].
Some of the most renowned ontologies in the manufacturing domain are the Process Specification Language (PSL) [45], ONTOlogy for Product Data Management (ONTO-PDM) [74], MAnufacturing Semantic ONtology (MASON) [46], and ADAptive holonic COntrol aRchitecture (ADACOR) [75], among others. The MASON ontology was developed to estimate the production cost of mechanical components. The PSL ontology is designed to enable accurate and comprehensive exchange of process information in manufacturing systems. Panetto et al. modelled product concepts based on two standards, ISO-10303 and IEC-62264, to facilitate interoperability between software applications exchanging product life cycle information. The PSL ontology represents the concepts of process modelling, planning, scheduling, simulation, etc., as axioms of first-order logic theories. The ADACOR ontology highlights knowledge related to customer work orders, production plans, and model operations. These ontologies are helpful for building an ontology model that covers the whole production line from customer order to product life cycle. A great amount of literature is also available on ontology-based agent systems, such as CORA [76], ROA Ontology [77], ORArch, and O4I4 Ontology [78], that perform main tasks in the manufacturing industry; these are out of the scope of this paper.

Ontologies for Industry 4.0 Predictive Maintenance
A process in the manufacturing production line is the sequence of actions performed on materials and energy to transform them into a finished product. These processes may be affected by faults and failures in the machines. Early detection of these failures helps to ensure the availability, efficiency, and high productivity of manufacturing processes. Usually, an abnormality is detected by analyzing data generated by the sensors placed on machine modules and in other areas of the manufacturing production line.
CPS benefit from predictive maintenance. Communication between the production entities within a CPS is performed intelligently and autonomously, which helps manufacturers to augment the production process.
Karray et al. proposed the Industrial MAintenance Management Ontology (IMAMO) to provide interoperability and create new knowledge that supports decision making in the maintenance process [26]. Schmidt et al. reported the problem of insufficient historical event data regarding factory maintenance [79]. Process and inspection data need to be correlated to enhance prognostics and diagnostics and thereby improve maintenance decisions. In another work, an ontology model was developed to capture the context of a flexible manufacturing system, which is used to make real-time decisions that optimize the key performance indicators [80].
Cao et al. developed a manufacturing predictive maintenance ontology (MPMO) to formally describe chronicle concepts and their relationships [81]. They proposed an algorithm to create Semantic Web Rule Language (SWRL) logic rules from historical events, providing formalized predictive outcomes. The combination of semantics and chronicle mining allows reasoning over temporal constraints to forecast future machinery malfunctions. There are two main issues in the proposed approach: (1) the partition methodology for numerical values and (2) the advancement of the ontology and rule base. Thus, it is hard for this method to predict critical failures.
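To make the idea of reasoning over temporal constraints concrete, the following Python sketch checks whether an event log contains a chronicle, i.e., an ordered set of events with bounded time gaps, and returns the outcome it predicts. It is an illustrative stand-in rather than the algorithm of [81]: the event names, the gap bounds, and the `recognise` helper are assumptions, and a real system would encode the pattern as SWRL rules evaluated by a reasoner.

```python
# A chronicle: ordered event types with [min, max] time gaps between
# consecutive events, plus a predicted outcome. All values are hypothetical.
CHRONICLE = {
    "events": ["high_temperature", "pressure_drop", "vibration_spike"],
    "gaps": [(0, 30), (5, 60)],  # allowed seconds between consecutive events
    "predicts": "machine_failure",
}


def recognise(chronicle, log):
    """Return the predicted outcome if the log contains the chronicle.

    `log` is a list of (timestamp_seconds, event_type) pairs. A depth-first
    search tries every candidate occurrence of each event in turn.
    """
    events, gaps = chronicle["events"], chronicle["gaps"]

    def search(index, previous_time):
        if index == len(events):
            return True  # every event matched within its time window
        for time, kind in log:
            if kind != events[index]:
                continue
            if index > 0:
                low, high = gaps[index - 1]
                if not low <= time - previous_time <= high:
                    continue
            if search(index + 1, time):
                return True
        return False

    return chronicle["predicts"] if search(0, None) else None


log = [(0, "high_temperature"), (10, "pressure_drop"), (40, "vibration_spike")]
print(recognise(CHRONICLE, log))  # machine_failure
```

The numeric gap bounds correspond to the partitioning of numerical values that the text identifies as one of the weak points of the approach: different partitions yield different rules.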
In I4.0 applications, monitoring critical components is a complex job that needs to be resolved in real time. The fuzzy-logic-based approach decomposes a complex real-time system into a simple weighted sum of linear subsystems and is actively used in several fields such as energy management [82][83][84]. The authors in [85] utilized a Fuzzy C-Means (FCM) based approach to tackle uncertainties and classify critical failures. Failure events are obtained from raw industrial data by applying Sequential Pattern Mining (SPM). FCM is then applied to the failure events along with their temporal information. A survey paper explored eMaintenance ontologies related to several fields [86]. However, its focus is more on the data problems in maintenance, standards, and eMaintenance tools.
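As a brief illustration of how FCM handles uncertainty, the sketch below computes the standard fuzzy membership of a single observation in each cluster, so that a failure event can belong partly to several classes instead of exactly one. It is not the pipeline of [85]: the one-dimensional feature, the cluster centres, and the fuzzifier value `m = 2` are assumptions chosen for the example.

```python
def fcm_memberships(point, centres, m=2.0):
    """Standard FCM membership of a 1-D point in each cluster.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d_j is the distance
    from the point to centre j and m > 1 is the fuzzifier.
    """
    distances = [abs(point - centre) for centre in centres]
    # A point lying exactly on one or more centres belongs fully to them.
    if 0.0 in distances:
        share = 1.0 / distances.count(0.0)
        return [share if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in distances)
        for d_j in distances
    ]


# Hypothetical feature: hours between consecutive failure events, with
# assumed centres for a "critical" (0 h) and a "non-critical" (4 h) cluster.
memberships = fcm_memberships(1.0, [0.0, 4.0])
print(memberships)  # approximately [0.9, 0.1]
```

The memberships always sum to one, so they can be read as degrees of belonging to each failure class. A full FCM run alternates this step with recomputing the centres until they stabilise.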

Ontologies for Industry 4.0 Supply Chain Management
A supply chain is a web of business services and delivery options that performs key functions: procuring raw materials, transforming these raw materials into products, and delivering these products to customers [87]. Ordinarily, the raw materials are acquired from distinct vendors and are transformed into the product at one or more production plants. The finished product is moved to storage in the warehouse. According to the characteristics defined in [88], the heterogeneous information flow of the supply chain network creates complex processes between partners that need to be adjusted for business profit. The same work proposes an ontology-based data integration framework that utilizes relational databases and data in XML format.
Current advances in internet-based technologies pave the way for the extended supply chain and bring new constraints, thereby helping to manage product-related information that comes from product models. The product data framework initially established by the PRoduct ONTOlogy (PRONTO) was extended to give the foundations for distributed data management (DPDM), which in turn helps to validate the data aggregation and disaggregation processes needed by logistics planning activities [27]. PRONTO modelled the product and its variant set without considering any standard or reference architecture and did not reflect any other part of the manufacturing production line [89].
In the supply chain, several disruptions such as factory fires, machine failures, and problems in the acquisition of raw material have been observed [86]. In [90], researchers proposed a decision support system based on an ontology model that applies Particle Swarm Optimization (PSO) to decide the optimal recovery action for a high resilience level. Similarly, in another study, an ontology and multi-agent techniques were used to propose a decision support framework for the supply chain of prefabricated components [91]. In another study, an ontology-based framework was proposed to optimize the logistics process [92].
There is literature available on ontologies for solving different problems related to supply chain management covering numerous domains [93][94][95].

Analysis of Existing Ontological Approaches
The emergence of the semantic web and knowledge graphs provides an interface among the various reference architectures to deduce the hidden relationships and the interoperability issues among them [47]. It is believed that coupling the semantic web with knowledge graphs could provide a universal model for accomplishing the overall design process by aligning all I4.0 reference architectures.
In the literature, several refined conceptualizations for the different domains of I4.0, such as sales, devices, processes, and products, are proposed by using either complete or modular ontologies, as depicted in Table 2. For example, the domain knowledge of the manufacturing production line is conceptualized by utilizing four different modular ontologies for device, product, process, and parameters [7]. On the one hand, the modular design provides the opportunity to extend the domain knowledge; on the other hand, maintaining the modules and managing their complexity requires some effort.
In I4.0, each machine unit or product at a given granularity level, such as cell, workstation, or component, is represented through a resource. Furthermore, each resource can be either a particular product or a service supporting a process, and may include a sub-process or sub-resource; e.g., a transport system at the workstation level includes a sub-process (clamping operation) and a resource (gripper). The authors in [7,60] have modelled the manufacturing production line by covering most of the resources, but their models still lack some of the main resources such as material, environment temperature, and humidity.
Additionally, it is extremely difficult to identify whether time is related to the production line or to product delivery, as it is bound to the OperatingHours class only. Another misconception is that completed products are considered part of the manufacturing ontology, whereas they should be part of the product ontology to be better modelled in terms of RAMI4.0. These ontologies also do not follow the linked data principle of re-using existing vocabularies. For instance, the property isPartOf links a machine with a manufacturing facility (i.e., the machine is part of the manufacturing facility); this property is already defined in the Dublin Core vocabulary, but it has not been re-used by the domain ontology [8].
The lack of an infrastructure that supports sharing data across different ontologies in a universal framework is one of the weaknesses of the existing work. Apart from that, the ontology-based solutions for I4.0 in the existing work face two major challenges: (1) an ontology should follow the standard reference architectures of I4.0, but none of the current ontologies is fully compatible with the reference architectures, and (2) although immense data is generated by I4.0 processes, machines, resources, etc., it is captured in IT systems in different formats and is not interoperable. Moreover, there are no known semantic models that can be applied on top of the data to cover all the concepts or processes involved in a typical smart manufacturing environment. Given these challenges, we need a semantic model that can be used to build a knowledge graph, which in turn provides solutions for Industry 4.0.
It is believed that the knowledge graphs could provide a baseline for implementing more efficient predictive maintenance and machine learning-based algorithms.In this connection, this study aims to provide insights into building the semantic web and knowledge graphs for enhancing the production line manufacturing of I4.0.
The overall schematics of this approach are shown in Figure 5. The resource ontology [8] is split into manufacturing, machine, and product ontologies to cover more concepts. In the machine ontology, new concepts such as machine capabilities and the power consumption of machines are added. The product ontology is enhanced with concepts used in the RAMI 4.0 life cycle dimension. Concepts from existing vocabularies have been reused. The benefit is twofold. First, it paves the way for academia to discuss topics relevant to this field of research and hence leads to intensive investigation. Second, these ontologies can be used by industry to implement a knowledge graph that provides solutions.

Reference Generalized Ontological Model
The proposed Reference Generalized Ontological Model (RGOM) is a universal platform composed of domain-specific and core ontologies along with author-identified concepts based on the Reference Architectural Model Industrie 4.0 (RAMI4.0) [7,8,61].
The methodology for the proposed reference generalized ontological model (RGOM) is composed of the following steps.

• A detailed survey is conducted by analyzing the recent literature for the ontological models for industry 4.0. In this step, key ontologies regarding the production line, supply chain, etc., were shortlisted based on the search methodology.
• Industry 4.0 architecture such as Reference architectural model Industrie 4.0 (RAMI 4.0) was studied to find out the requirements needed for the industry 4.0 production.
• A comparative study is then conducted to find out the gap between the standards and the current state of the art models. During this step, it was identified that the current ontologies do not follow the requirements of the RAMI4.0 and are unable to follow the reuse principle of linked open data.
• The existing vocabularies were reused with the additional concepts that were missing.

The whole process was performed iteratively.
Based on the literature review and RAMI4.0, the proposed RGOM considers core areas such as time, location, and sensor, and different domains such as product, process, and machine along with order, supply chain, warehouse, etc., and explores the concepts and relationships among them. This implies that the RGOM provides a detailed unified model that covers I4.0 domain knowledge from raw material to finished product, including supply to the customer, as well as monitoring the different situations of machines and processes. Machines and products are separated from the resource ontology [5] to form a machine ontology and a product ontology that accommodate more concepts and relations. For instance, the product ontology specifies concepts such as product (production of a product) and service (maintenance usage) adopted from RAMI4.0, and identified concepts such as the sales ontology are coupled to it. This helps to provide a full view of whether an order is placed for a service or for the manufacturing of a product; depending on the order, either the service or the resources in the manufacturing production line will be reconfigured. RGOM reuses existing vocabulary: for example, the machine, which is a manufacturing facility, is associated with the workstation by reusing the isPartOf property from the Dublin Core vocabulary. The processes happening at different times and locations are linked to the manufacturing resources by the process ontology using the performProcess property. The process ontology describes the basic taxonomy of all kinds of processes, from manufacturing to human processes and logistic operations. The sales ontology defines the customer order concepts for the product. An order can have various attributes such as design, quantity, and delivery date. The supply chain ontology can assist in monitoring the delivery of the manufactured product to the customer. Thus, while the core ontologies alone would not be able to answer why, where, and what types of questions, the RGOM is able to infer contextual information ranging from the situation of a particular entity to the complete production line. The consistency of the RGOM has been evaluated with the HermiT reasoner (http://www.hermit-reasoner.com/, accessed on 1 May 2021). Furthermore, the high-level representation of the proposed RGOM is illustrated in Figure 5. It provides a comprehensive correlation of all the concepts discussed in Table 4. The RGOM OWL file is available in the GitHub repository (https://github.com/MuhammadYahta/Smart-Manufacturing-Ontology, accessed on 1 May 2021).
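The way the isPartOf and performProcess properties link resources, locations, and processes can be sketched with a tiny in-memory triple set in Python. This is only an illustration and does not load the published RGOM file: the `http://example.org/rgom#` namespace and the individuals (`gripper1`, `workstation1`, `cell1`, `clamping`) are hypothetical, and treating isPartOf as transitive is an assumption of the sketch rather than something the Dublin Core vocabulary declares. Only the Dublin Core property URI is taken from the actual vocabulary.

```python
DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"  # reused vocabulary
EX = "http://example.org/rgom#"  # hypothetical namespace for this sketch

# Each triple is (subject, predicate, object).
triples = {
    (EX + "gripper1", DCTERMS_IS_PART_OF, EX + "workstation1"),
    (EX + "workstation1", DCTERMS_IS_PART_OF, EX + "cell1"),
    (EX + "gripper1", EX + "performProcess", EX + "clamping"),
}


def containers(entity, graph):
    """All entities that `entity` is part of, following isPartOf upwards."""
    found = []
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if (subject == current and predicate == DCTERMS_IS_PART_OF
                    and obj not in found):
                found.append(obj)
                frontier.append(obj)
    return found


def where_is_process(process, graph):
    """Answer a 'where' question: who performs the process and where it sits."""
    return {
        subject: containers(subject, graph)
        for subject, predicate, obj in graph
        if predicate == EX + "performProcess" and obj == process
    }


print(where_is_process(EX + "clamping", triples))
```

Because the containment property comes from a shared vocabulary, the same traversal would work on any other dataset that reuses it, which is the practical benefit of the reuse principle discussed in the text.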

Discussion
The purpose of this survey is to investigate the extensive literature on manufacturing production line ontologies and their knowledge graphs. It is clear from the literature that I4.0 is strongly supported by reference architectures and models. Different countries have presented their architectures as reference models for building I4.0. Some of them have aligned their architecture and model with another one based on analysis and comparison, which is a general approach to aligning reference models. The analysis and comparison approach makes use of the definitions included in each model to align the reference models.
The scope of this survey is limited to reviewing the existing manufacturing ontologies in order to develop an enterprise-level knowledge graph for the I4.0 manufacturing production line. To the best of our knowledge, current studies do not incorporate dissimilar data sources such as the measurements recorded by sensors (temperature, pressure, humidity, power), the material required for production, the quality of material, the operations performed by a machine or human, work orders, etc. Even though the data is captured in databases, integrating it in a unified way requires a lot of manual effort and time. Once such a knowledge graph is built, it can help in integrating the data from diverse sources. The unified model may be able to promptly answer queries and help to predict machine failures, optimize processes, etc., by applying machine learning or deep learning approaches.
There are still challenges missing in the literature that need to be addressed.

Open Challenges
The current ontologies for manufacturing production are unable to address the following challenges. We reviewed I4.0 manufacturing ontology models to highlight the missing areas that need to be worked on to build a solution provider.

Domain Knowledge Capture
A huge amount of data is generated by different sensors, devices, machines, actuators, the interaction of human operators, processes, etc., in smart manufacturing during the production.This information is rarely accessible in a combined way as IT systems capture such information into diverse databases.Based on the literature [61,62,68], the existing models can only accommodate limited parts of the manufacturing production and the rest of the data is wasted.Manufacturing production demands a domain ontology to save all the data produced in the line.

Knowledge Graphs
Knowledge graphs integrate data from heterogeneous sources in a given domain and provide a framework for analytics and data sharing among applications. One of the challenges in building a smart manufacturing knowledge graph is the unavailability of real-time data sets. The reason behind the data unavailability may be the improper modelling of the manufacturing production line.
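The integration of heterogeneous sources described above can be sketched in a few lines of Python. The two input structures stand in for a sensor database export and an order management system; the field names, the `http://example.org/factory#` namespace, and the property names are all hypothetical and only illustrate how records from separate systems become connected once they are mapped to triples over shared identifiers.

```python
EX = "http://example.org/factory#"  # hypothetical namespace

# Source 1: rows exported from a sensor database (hypothetical fields).
sensor_rows = [
    {"machine": "extruder1", "sensor": "temp1", "value": 212.5},
]
# Source 2: documents from an order management system (hypothetical fields).
order_docs = [
    {"order": "order42", "product": "pipeA", "machine": "extruder1"},
]


def to_triples(sensor_rows, order_docs):
    """Map both sources onto one set of (subject, predicate, object) triples."""
    graph = set()
    for row in sensor_rows:
        graph.add((EX + row["sensor"], EX + "observes", EX + row["machine"]))
        graph.add((EX + row["sensor"], EX + "hasValue", row["value"]))
    for doc in order_docs:
        graph.add((EX + doc["order"], EX + "requestsProduct", EX + doc["product"]))
        graph.add((EX + doc["product"], EX + "producedBy", EX + doc["machine"]))
    return graph


def readings_for_order(order, graph):
    """Follow order -> product -> machine -> sensor -> value across sources."""
    products = {o for s, p, o in graph
                if s == EX + order and p == EX + "requestsProduct"}
    machines = {o for s, p, o in graph
                if s in products and p == EX + "producedBy"}
    sensors = {s for s, p, o in graph
               if p == EX + "observes" and o in machines}
    return sorted(o for s, p, o in graph
                  if s in sensors and p == EX + "hasValue")


graph = to_triples(sensor_rows, order_docs)
print(readings_for_order("order42", graph))  # [212.5]
```

Neither source alone can answer which sensor readings relate to a given order; the question becomes answerable only because both are expressed against the same machine identifier.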

Comprehensive Information for Seamless Integration within and between Smart Factories
I4.0 lacks cross-domain collaboration between smart factories due to the focus on domain-specific applications [96]. Seamless cross-domain collaboration is required to infer useful information within and between smart factories from the knowledge graph. An intelligent autonomous system can then use the knowledge deduced from independent applications through semantic reasoning to monitor and process events.

Elastic and Customised Assembly Lines
The semantic web is capable of capturing the knowledge of a manufacturing domain, which makes assembly line devices and resources elastic so that they can be customized according to the product ordered by customers. Thus, proper semantic modelling of assembly lines avoids the unnecessary use of resources.

Intelligent and Adaptable Manufacturing
In order to achieve autonomous intelligence, where different machines can communicate and interact with each other, there is a need for a knowledge graph that can help machines to answer based on their experience. This will help the manufacturing resources to detect faults and failures in a more intelligent way.

Conclusions
In this paper, we first provided a comprehensive review of the available ontological models for building an I4.0 knowledge graph, which enabled us to find the knowledge gap in terms of open challenges and applications. Once the challenges and applications were identified, they were related through a logical one-to-one mapping mechanism. A reference generalized ontological model (RGOM) based on RAMI 4.0 was then developed by covering most of the core concepts in I4.0. The developed RGOM is a fundamental framework that could be populated with realistic data to test the knowledge graph, with the aim of obtaining adequately accurate responses to real-time queries related to the overall concepts of I4.0.
In future, the RGOM will be tested by using the Confirm Manufacturing (https://confirm.ie/) benchmark datasets for validating it against state-of-the-art ontological models by considering the accuracy and correctness of the query results.

Figure 1. Illustration of ontology defining the generalized concepts and their relationship.

Figure 2. Example of the knowledge graph obtained from the book_ontology (Figure 1).

Figure 3. An illustration of the methodology adopted for conducting the survey.

Figure 4. One to one mapping between requirements, applications, and challenges.

Figure 5. High level representation of Reference generalized ontological model.

Table 1. Identification of search keywords.

Table 2. Digital Libraries used for searching articles.

Table 3. Research focus, datasets of identified ontologies.