1. Introduction
The rapid advancement of Industry 4.0 (I4.0) technologies has brought many changes and benefits to the manufacturing sector [1]. However, these technological advancements have also had negative impacts on human beings. One point of concern is the potential displacement of human workers due to automation and the adoption of artificial intelligence (AI)-based systems [2]. This displacement leads to job losses and economic inequalities, as individuals may struggle to adapt to the changing skill requirements in the labor market [3,4]. Thus, industry and academia started the transition from I4.0 to Industry 5.0 (I5.0), which goes beyond technological advancements and brings a new paradigm shift in manufacturing processes. Its goal is to create a human-centered industry by integrating human workers and advanced robotic systems to enhance productivity and collaboration, ushering in a new era of intelligent, interconnected systems [5,6,7,8,9].
Alongside I4.0 technologies such as the Internet of things (IoT) [10] and its variants (the green IoT [11,12] and the industrial IoT [13,14]), big data analytics [15], and machine learning algorithms [16], this transition, illustrated in Figure 1, is enabled by the implementation of I5.0's main building blocks: individualized human–machine interaction technologies that interconnect and combine the strengths of humans and machines; bioinspired technologies and smart materials that allow for materials with embedded sensors and enhanced features while being recyclable; real-time-based digital twins and simulation to model entire systems; data transmission, storage, and analysis technologies that are able to handle data and system interoperability; AI to detect, for example, failures in complex and dynamic systems, leading to actionable intelligence; and technologies for energy efficiency, renewables, storage, and autonomy [17]. This results in an interconnected manufacturing ecosystem that enables the industry to respond to this paradigm shift.
I5.0 presents a human-centric transformative concept and brings a holistic development perspective that aims to foster a flourishing manufacturing system, embracing the integration of advanced technologies, collaborative human–machine interaction, and a focus on long-term viability and social well-being [18]. In I5.0, humans are not replaced by machines; instead, they work alongside robots and intelligent systems to leverage their respective strengths [19]. This collaboration allows more complex and creative tasks, with humans providing critical thinking, problem-solving, and adaptability, while machines contribute precision, speed, and repetitive task execution [20].
Moreover, a highly compatible concept for enabling I5.0 implementation is the intelligent space (IS). An IS, represented in Figure 2, can be defined as a physical space equipped with a network of sensors, which obtains information about the place it observes, and a network of actuators, which allows it to interact with users and change the environment through computing services. Such sensors, actuators, and computing services must be governed by an infrastructure capable of collecting and analyzing the information obtained by the sensors and making decisions [21,22,23]. One of the key concepts within ISs is that the environment can sense, interpret, and recognize user behavior, adapt to preferences, and provide natural interactions between humans and intelligent systems, using the IoT, AI, computer vision, and data analytics to create dynamic and adaptive environments in real time.
The integration of ISs and I5.0 has paved the way for the development of cognitive factories, marking a significant milestone in the evolution of manufacturing. The cognitive factory idea is not new and can be defined as a physical industrial environment transformed into an IS that can perceive, learn, and interact with humans and machines in a seamless and adaptive manner [24,25,26]. These factories can leverage the technologies already mentioned and illustrated in Figure 1 to enable real-time data collection and analysis, allowing continuous predictive maintenance and process optimization. In addition, by integrating cyberphysical systems (CPS), i.e., the convergence of operational technology (OT) and information technology (IT), they blur the boundaries between physical and digital systems, enhancing the communication and coordination between machines, humans, and products, resulting in a more agile and responsive manufacturing ecosystem [27].
Since a cognitive factory can be composed of smaller aggregated units containing sensors, devices, systems, and subsystems, integrated or not, in this article these smaller aggregated units are denominated cognitive cells, defined as fundamental components that operate as intelligent units equipped with cognitive technologies to enhance manufacturing processes. These cells represent a significant departure from traditional automation and digitalization approaches. While automation and digitalization focus on enhancing efficiency and productivity through the use of advanced technologies, the cognitive cell goes beyond that by incorporating cognitive capabilities and human–machine collaboration. Unlike traditional systems, the cognitive cell leverages AI, edge/fog computing, and advanced sensing capabilities to enable adaptive behavior. By integrating human intelligence and problem-solving skills with machine precision and speed, the cognitive cell creates a dynamic and interactive environment where humans and machines work together.
In this context, this article aims to explore the convergence of the IS and I5.0 concepts. The objective is to provide insights into the technical implementation challenges of a cognitive cell, as an example of a segment of a cognitive factory, by developing and implementing a replica in a laboratory. By examining the key points, this article contributes an analysis of the technological challenges associated with cognitive factory implementation, providing practical insights and possible solutions, and enhancing the knowledge base surrounding cognitive factory implementation, thereby facilitating informed decision-making and strategies for organizations aiming to adopt this advanced manufacturing paradigm.
The remainder of this paper is organized as follows: Section 2 presents related works; Section 3 presents the communication architecture; Section 4 describes a cognitive cell as a building block of the cognitive factory; Section 5 shows the important findings during the development of this study and outlines future steps; finally, Section 6 concludes this work.
2. Related Works
The transition from I4.0 to I5.0 has garnered significant attention in recent years, as technological advancements continue to reshape the industrial landscape and represent a paradigm shift in global industry. Extensive research has been conducted to explore the aspects and implications of this technological progression. Scholars such as [28,29] have examined the potential benefits of I5.0, emphasizing its ability to foster greater human–machine collaboration and enhance productivity. Additionally, the study developed by [30] delved into the challenges and barriers associated with this transition, including the need for upskilling the workforce and ensuring the ethical use of emerging technologies.
Moreover, researchers have investigated the key technological components that underpin the transition from I4.0 to I5.0. For instance, [31] conducted an in-depth survey on enabling technologies, and their findings highlight the potential use of edge computing, digital twins, cobots, the Internet of Everything (IoE), big data analytics, blockchain, and future 6G systems. In a similar vein, [32] explored the significance of big data with AI in I5.0. The study revealed how AI can augment decision-making processes, optimize production workflows, and enable adaptive and autonomous systems.
Scholars also consider that the transition to I5.0 is already underway. For example, the study by [33] proposed a cyberphysical human-centered system (CPHS) that utilized a hybrid edge computing architecture with smart mist-computing nodes to process thermal images and prevent industrial safety issues. In the work, the authors demonstrated in a real-world scenario that the developed CPHS algorithms could accurately detect human presence with low-power devices in a fast and efficient manner, complying with the I5.0 paradigm.
In the context of I5.0, the integration and coordination of interconnected systems play a vital role in achieving a consistent data flow and efficient resource utilization. Extensive research has been conducted to investigate the interconnection of edge devices and cloud infrastructure to create a distributed computing ecosystem. Scholars have proposed some architectural frameworks and protocols for enabling the integration and collaboration of these systems [33,34,35].
Edge computing has emerged as a promising paradigm to address these requirements since the arrival of I4.0. Researchers have recognized the significance of these computing models in enabling efficient data processing and analytics at the network edge. Studies have shown that edge computing facilitates real-time data processing and reduces network latency by moving computation closer to data sources [36,37,38]. Although focused on I4.0 rather than I5.0, [39] proposed an edge-computing-based industrial gateway for interfacing IT and OT that could enable I4.0's vertical and horizontal integration.
Similarly, fog computing extends the capabilities of edge computing by providing a decentralized computing infrastructure that bridges the gap between edge devices and the cloud, enabling data processing, storage, and analysis closer to the network edge [40,41].
Besides the aforementioned technologies, [42] proposed a communication architecture based on a message bus to enable communication to cross conventional physical borders and to provide scalability to growing data volumes and network sizes. The author argued that such an architecture had the potential to renew industrial communications. Finally, [43] presented a new middleware broker communication architecture for protocols used by the IoT. The paper focused on two key roles: the design of broker communication for industrial automation with consideration of common packet format parameters, and the development of a deep learning algorithm for the industrial IoT (IIoT).
In conclusion, the literature on communication architectures for cognitive factories reveals that they play an important role in supporting the integration and operation of cognitive technologies in manufacturing environments. The studies demonstrate the importance of designing efficient frameworks that enable real-time data exchange, support collaboration, and ensure interoperability within cognitive factory systems. By addressing the challenges and considerations associated with the communication architecture, researchers can advance the development and implementation of cognitive factories, paving the way for future advancements in intelligent manufacturing systems.
3. Communication Architecture
The communication architecture of a cognitive factory, although adaptable, is not a generic architecture. It has to comply with objectives to meet the specific requirements of its applications and rationalize the use of its infrastructure resources. To fulfill these main objectives, the architecture should possess characteristics such as interoperability, availability, reliability, and scalability. The International Telecommunication Union (ITU), through its Telecommunications Standardization Sector (ITU-T), approved the IoT reference model [44] for IoT applications, which is shown in Figure 3.
The mentioned reference model consists of four horizontal layers. The first layer is the application layer, which contains all IoT applications. Next, there is the service and application support layer, which comprises a set of common and specific functions to assist the applications. The third layer is the network layer, responsible for communication between different entities in the model. Lastly, there is the device layer, composed of physical elements and their interaction description with other architectural elements in the model. In addition to these four layers, the model includes management and security layers. These layers provide functionalities associated with the four initial horizontal layers of the model.
Besides the ITU-T reference model, some state-of-the-art works, such as [33,45], propose using a three-layer model (physical, processing, and application), where the denominations can vary but the functions are the same. The subsequent subsections delve into the intricacies of each layer of the ITU-T reference model, providing a more comprehensive understanding of their functionalities and operations.
3.1. Device Layer
The device layer, also known as the physical or perception layer, is the lowest layer of the architecture. It represents the physical devices that are connected and form the foundation of the system. This layer consists of a diverse range of devices such as sensors, actuators, wearables, embedded systems, gateways, and other connected hardware that are responsible for gathering data from the environment, performing local processing, and communicating with other devices or higher layers in the architecture. The primary function of the device layer is to establish physical connectivity and enable the transmission of data.
Moreover, this layer encompasses the actual communication technologies involved in its deployment, including wired or wireless communication protocols, and the necessary infrastructure to facilitate data transfer. Common communication technologies utilized in the device layer include Wi-Fi, Bluetooth, Zigbee, cellular networks, Ethernet, and others, while radiofrequency modules, antennas, and any other physical devices that are able to transmit and receive data are also included in this layer [46].
Within the cognitive factory context, the choice of communication technologies and protocols in the device layer depends on the specific requirements of the application, including factors such as range, data rate, power consumption, and compatibility with already deployed devices.
3.2. Network Layer
The network layer is a vital component of a layered model, which provides a framework for the design and implementation of communication networks. Positioned above the device layer and below the service/application support layer, the network layer focuses on routing and forwarding data packets across the network. Its primary goal is to establish end-to-end communication paths and ensure efficient delivery of data between source and destination devices. This layer can comprise different protocols, technologies, algorithms, and networking components, such as brokers, that facilitate the transmission of data among IoT devices, sensors, gateways, and other system components.
This layer ensures seamless connectivity and interoperability, enabling the data flow across different layers and devices in a secure and efficient manner [47], and also supports network interconnectivity by enabling communication between networks with different architectures or protocols. It achieves this through the use of entities that forward packets based on destination address information.
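As an illustration of how such forwarding entities operate, message brokers on this layer commonly route traffic by matching topic-style destination addresses against subscription patterns. The following is a minimal sketch of MQTT-style topic matching; it is a hypothetical helper written for this explanation, not code from the deployed system:

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Check whether an MQTT-style topic matches a subscription pattern.

    '+' matches exactly one topic level; '#' matches all remaining levels.
    """
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True          # multi-level wildcard matches the rest
        if i >= len(t_levels):
            return False         # pattern is longer than the topic
        if p != "+" and p != t_levels[i]:
            return False         # literal level mismatch
    return len(p_levels) == len(t_levels)
```

A broker holding the subscription `factory/+/temperature` would therefore deliver a message published on `factory/cell1/temperature` but not one on `factory/cell1/humidity`.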
3.3. Service/Application Support Layer
This layer supports the application layer by providing a wide range of services, such as storage, data processing, and service management [48]. One of the key aspects is to offer reliable and scalable services that can efficiently handle the data requirements of applications. This includes providing access to distributed file systems, databases, and object storage systems, as well as data processing through services for data transformation, aggregation, and analytics [49]. These technologies enable applications to leverage distributed computing resources and perform complex data processing tasks, such as real-time stream processing and batch analytics.
Furthermore, this layer also provides service management functionalities that ensure the availability, scalability, and reliability of application services. In this layer, platforms offer features for service discovery, load balancing, and fault tolerance, enabling applications to manage and orchestrate their services in distributed environments. These services simplify the deployment and management of applications by abstracting the complexities of distributed systems and providing mechanisms for automatic scaling, fault recovery, and service monitoring [49].
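To make the aggregation role of this layer concrete, the sketch below shows a hypothetical support-layer microservice that condenses a raw sensor stream into windowed statistics before handing it to applications. The class name and window size are illustrative assumptions, not part of the described system:

```python
from collections import deque
from statistics import mean

class WindowAggregator:
    """Hypothetical support-layer service: condenses a raw sensor
    stream into sliding-window statistics for consuming applications."""

    def __init__(self, window_size: int = 5):
        # deque with maxlen discards the oldest sample automatically
        self.window = deque(maxlen=window_size)

    def push(self, value: float) -> dict:
        """Add a new sample and return the current window summary."""
        self.window.append(value)
        return {
            "count": len(self.window),
            "mean": mean(self.window),
            "min": min(self.window),
            "max": max(self.window),
        }

# Example: three temperature samples through a 3-sample window
agg = WindowAggregator(window_size=3)
agg.push(10.0)
agg.push(20.0)
summary = agg.push(30.0)   # window now holds [10.0, 20.0, 30.0]
```

An application consuming `summary` receives only the derived statistics rather than every raw sample, which is precisely the data-reduction service this layer is meant to provide.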
3.4. Application Layer
The application layer represents the topmost layer of the architecture, focusing on the development and implementation of specific applications and services that leverage the capabilities of the cognitive factory. The application layer is responsible for providing the interface between end-users and other services within the factory domain [50].
The aim of this layer is to enable the development, deployment, and management of applications that utilize the data, services, and functionalities offered by the lower layers. It serves as the gateway for allowing users or internal systems to interact and access its resources.
This layer of the communication architecture consists of a wide range of tailored applications that cater to specific industry domains and use cases. These applications serve many purposes, including industrial monitoring and control, environmental sensing, asset tracking, energy management, and more [50,51]. The components of this layer are usually deployed as microservices across distributed servers, making use of data processing engines, analytics frameworks, visualization tools, and other relevant functionalities.
Moreover, it provides a high-level abstraction for users or developers, shielding them from the complexities of the underlying architecture and infrastructure. It offers a set of APIs, protocols, and tools that facilitate the development of other applications, enabling the integration of data from sensors, devices, and other sources.
Finally, it is worth mentioning that this layer also facilitates integration and interoperability with other ecosystems, enabling collaboration across several elements in the same or other domains.
4. Use Case: The Cognitive Cell
To demonstrate the benefits of a cognitive factory ecosystem, this section delves into a specific use case in which a cognitive cell, representing the entire cognitive factory, is developed in a laboratory. The focus of this use case is the validation of the IS and I5.0 concepts merged together in an environment where humans and cobots can safely work together. The proposed cognitive cell is composed of a robotic arm performing dynamic pick and place from the operator's hand, and decentralized services that ensure the safety of the human–robot interaction. The key aspects of the developed system are the integration of the OT and IT networks, a customized layer model, and interoperability among many devices.
4.1. Overview
The proposed cognitive cell system can be considered a small representation of a cognitive factory's capabilities, since the backbone of the system is implemented in the same way for both entities. Bearing that in mind, any application developed for the cognitive factory can be transferred to a cognitive cell test bed. What sets the factory apart from a cognitive cell is that the former is more complex, consisting of any number of cognitive cells, and has the responsibility of integrating them if an application requires it.
Regarding the system's capabilities, the main focus of its development was to ensure safety and collaboration between humans and a robotic arm in real-time and dynamic tasks, as an example to demonstrate the wide range of applications that can be developed. As can be seen in Figure 4, the whole ecosystem is composed of:
Two zones: a safe working area, highlighted with green color, and a forbidden zone, the red area in the figure;
A system for identifying the human position regarding the cell;
A system for calculating the object position and color recognition;
A cobot for performing tasks in cooperation with humans.
Section 4.3 and Section 4.4 highlight the crucial elements of the cognitive cell's overall implementation strategy.
4.2. Communication Architecture
Although some state-of-the-art works propose a three-layer model, the architecture proposed for this work was designed following the IoT reference model with some customization and is illustrated in Figure 5. As can be seen, the proposed architecture is composed of seven layers on six different levels: physical, gateway, edge/fog computing, network, middleware, application, and cloud services.
Additionally, it is important to highlight that the focus of this research was primarily on the communication architecture within the context of the cognitive factories. While infrastructure implementation plays a significant role in the overall deployment, it is beyond the scope of this article and will be published in a specific paper about the topic. Therefore, the discussion and analysis presented in this paper solely pertain to the communication architecture and its associated components.
The following subsections address the relevant aspects regarding the proposed communication architecture.
4.2.1. Gateway Layer
Due to its importance and the possibility of customization, the first modification was the creation of the gateway layer, which was initially a module inside the physical layer. This layer is responsible for collecting, sensing, and interpreting data to and from the physical environment. It acts as the interface between the physical world and the higher layers. Its main purpose is to capture data from devices and convert them into meaningful information, and to translate information into actions performed by actuators, hence the relevance of customization.
The collected data from the sensors are typically raw and unprocessed, representing the current state of the physical world. Once the data have been collected, this layer processes and interprets them, making them ready for consumption by other layers. This processing can involve data aggregation, fusion, or transformation to derive more meaningful insights or extract relevant features from the raw data.
On the other hand, when this layer receives instructions or commands from higher layers, it processes them and triggers the appropriate actions. This can involve activating or deactivating specific actuators, adjusting their operating parameters, or performing specific tasks based on the received instructions.
Overall, it is common to use microservices connected directly to devices deployed in the physical environment that serve as the bridge between the physical and digital worlds. For instance, in this use case, this layer was responsible for connecting the cobot to the message bus on the network layer. By establishing this connection, the cobot was readily available for receiving commands from any authorized services within the architecture.
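The bridging role of such a gateway microservice can be sketched as follows. The frame layout, routing key, and helper names are illustrative assumptions for this explanation; the paper does not publish its gateway code:

```python
import json
import time

def decode_frame(raw: bytes) -> dict:
    """Translate a raw device frame into a structured message.

    The assumed layout -- an ASCII device id, a comma, and a value in
    tenths of a unit -- is purely illustrative; real devices define
    their own formats.
    """
    device_id, value = raw.decode("ascii").split(",")
    return {
        "device": device_id,
        "value": int(value) / 10.0,  # tenths of a unit -> unit
        "ts": time.time(),
    }

def publish(channel, routing_key: str, message: dict) -> None:
    """Forward a decoded message to the message bus.

    `channel` stands in for an AMQP channel (e.g., one obtained from
    the pika client when the bus is RabbitMQ, as in this work).
    """
    channel.basic_publish(exchange="",
                          routing_key=routing_key,
                          body=json.dumps(message))
```

In this pattern, the gateway's translation logic (`decode_frame`) is kept separate from the transport (`publish`), so the same decoding can be reused regardless of which broker sits on the network layer.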
4.2.2. Edge/Fog Computing Layer
The edge/fog computing layer in this work was positioned at the same level as the gateway layer, since they have equal importance and are expected to work collaboratively to achieve similar objectives. This layer is where computing resources and services are located closer to the edge of the network, near the devices and sensors. It serves as an intermediary between the devices and the remainder of the architecture. The objective of this layer is to provide real-time data processing, volatile storage, and computation capabilities at the network edge, enabling faster response times, reduced latency, and improved efficiency in tasks.
Being part of this layer, devices can be strategically positioned within the ecosystem, enabling immediate responses to critical events, providing faster data insights, and reducing reliance on the application layer for processing. By analyzing data closer to their source, this layer minimizes the necessity of transmitting large volumes of raw data for processing, effectively addressing the challenges associated with latency and bandwidth. For this reason, in this use case, the stereo vision system was positioned on the edge layer.
Normally, in this layer, single-board computers are used for deploying the microservices and for connecting to the network infrastructure. Once connected to the infrastructure, the data exchange is performed through the network layer.
4.2.3. Cloud Services Layer
The cloud services layer refers to a specific component dedicated to facilitating the connectivity and integration with external entities. It acts as a bridge between the internal architecture and external resources, enabling communication and interoperability. This layer establishes connections and provides interfaces that allow the ecosystem to interact with third-party applications or other deployments.
The primary purpose of this layer is to enable data exchange, integration, and collaboration with external systems. It incorporates specialized APIs, protocols, and integration tools that facilitate communication. By leveraging industry-standard protocols, the cognitive factory, through this layer, can send and receive data, access services, and take advantage of functionalities offered by external systems.
The inclusion of a dedicated cloud computing layer for connecting external systems expands the functionality and value of the architecture solution. For example, it enables the transmission of sensor data to cloud-based analytics platforms for real-time analysis, facilitating advanced data processing, visualization, and insights. Integration with existing enterprise systems, such as MES, CRM, or SCM systems, is also possible, enhancing business processes and decision-making.
Finally, it is important to highlight that this layer also serves as a gateway for communication and integration, so depending on the system or service to be integrated, the data exchange can be performed either through the application layer or through the network layer.
4.2.4. OT and IT Network Integration by Brokers
OT and IT networks are two distinct types of networks with specific roles and characteristics in the industry. OT networks are primarily responsible for controlling and monitoring physical processes in critical infrastructures such as manufacturing plants, power grids, and transportation systems. On the other hand, IT networks are designed to handle digital information, support business operations, and enable communication and data sharing within organizations. Separating OT and IT networks is of the utmost importance in the industry for several reasons [52].
The utilization of brokers in OT and IT networks brings significant benefits by enabling a seamless communication and integration between these two domains. OT networks are typically associated with industrial control systems and devices that monitor and control physical processes, such as manufacturing plants, power grids, or transportation systems. On the other hand, IT networks deal with traditional computer networks, data centers, and enterprise systems. Furthermore, the brokers act as intermediaries that bridge the gap between OT and IT networks, facilitating interoperability and data exchange [13,53,54].
In the OT network, brokers play a significant role in enabling connectivity and data sharing between many OT devices and systems, providing a centralized platform for managing and controlling the flow of data. In the IT network, they serve as gateways for receiving and processing data from the OT network, providing the necessary translation and transformation capabilities from OT-specific protocols into IT-friendly formats that can be easily consumed by enterprise applications and systems. Thus, brokers can aggregate data from sensors, actuators, and control systems, and distribute the data to IT systems for analysis, decision-making, integration with enterprise applications, business intelligence, predictive analytics, and decision support.
Bearing this in mind, in this work, the OT and IT networks were integrated, as can be seen in Figure 6.
The OT network was integrated with the IT network through the utilization of two RabbitMQ brokers in a clustering configuration, with each instance in a different type of network. These brokers played a pivotal role as central brokers, facilitating the exchange of data between both domains. Furthermore, it is worth mentioning that RabbitMQ on the OT domain was connected to both the Mosquitto and Orion brokers to enhance the interconnectivity and interoperability between devices. This interconnected architecture involving these three brokers effectively bridged the gap between the networks, allowing efficient and secure communication.
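One recurring detail when bridging an MQTT broker such as Mosquitto to an AMQP broker such as RabbitMQ is translating between their addressing schemes: MQTT separates topic levels with `/`, while AMQP routing keys use `.`. The helper below sketches that translation; it is an illustrative fragment, not the bridging code actually deployed in this work:

```python
def ot_topic_to_amqp_key(mqtt_topic: str) -> str:
    """Map an MQTT topic from the OT-side broker onto an AMQP routing
    key for the IT-side RabbitMQ exchange.

    Strips any leading/trailing slashes, then swaps the level
    separator ('/' in MQTT, '.' in AMQP).
    """
    return mqtt_topic.strip("/").replace("/", ".")
```

A bridge service would apply this function to each message consumed from Mosquitto before republishing it on the RabbitMQ side, so that topic-based subscriptions on the OT network map cleanly onto routing-key bindings on the IT network.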
4.3. Physical Implementation
4.3.1. OT Network Perspective
The cognitive cell from the OT network’s perspective was composed of two elements: an edge stereo vision system (ESVS) performing sensor activities, and a collaborative robot node (CRN) as an actuator.
To create a stereo vision system, two parallel cameras with a known baseline distance were connected to a Raspberry Pi 4. By using two cameras in full high definition (FHD), the cognitive cell could perceive depth using computer vision triangulation, thus enabling the calculation of the height at which an object is located. In this work, the object was a monochromatic cube. Moreover, the cameras identified the object's color separately and compared the results to increase accuracy. This stereo vision system enhanced the cognitive capabilities of the cell.
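For two parallel cameras with a known baseline, depth recovery by triangulation reduces to the classic relation Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity between the two image points. The sketch below illustrates this; the numeric values in the usage example are illustrative, not the cell's actual calibration:

```python
def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth of a point seen by two parallel cameras.

    Z = f * B / d, with f in pixels, B in metres, and d (the disparity)
    in pixels; the result is in metres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_px * baseline_m / disparity_px

# Illustrative values: f = 700 px, B = 0.10 m, d = 35 px -> Z = 2.0 m
z = depth_from_disparity(700.0, 0.1, 35.0)
```

Since disparity shrinks as distance grows, small matching errors translate into large depth errors for far objects, which is one reason the cameras and object were kept close together in the cell.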
As in [55], the Raspberry Pi served as the central processing unit for this system and worked in the edge/fog layer. It received the visual data from the cameras and performed the necessary computations to extract relevant information, returning only the results to the rest of the system. This edge/fog approach ensured that processing and decision-making happened near the source of data, reducing latency and enabling real-time responsiveness.
Furthermore, the cognitive cell also had a cobot, an Elephant Robotics MyCobot 320-Pi, that interacted with operators collaboratively. The cobot was designed to pick objects from the operator's hand and place them in the correct box, matching the object's color. This cobot took advantage of the cognitive capabilities provided by the edge stereo vision system to ensure a safe and efficient interaction between humans and machines.
4.3.2. IT Network Perspective
From the IT perspective, the implementation of a cognitive cell involves leveraging advanced technologies to enhance the safety of industrial processes. Bearing this in mind, a Raspicam Sensor IMX477 [56], connected to a Raspberry Pi 3B, which includes an ARM quad-core 1.2 GHz Broadcom BCM2837 64-bit SoC and 1 GB of LPDDR2 RAM, was used to create an image capture node.
In this part of the implementation, the node implemented a microservice and used the gateway layer as a bridge to connect the physical device and continuously deliver captured images to the application layer, feeding the microservice responsible for detecting a person, analyzing their position in the real world, and deciding whether that position was in an allowed or forbidden area.
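The final allowed/forbidden decision of that microservice can be reduced to a geometric test on the detected person's ground-plane position. The sketch below models the forbidden zone as an axis-aligned rectangle; the coordinates are illustrative assumptions, not the cell's actual layout:

```python
def in_forbidden_area(x: float, y: float,
                      area: tuple = (1.0, 0.0, 2.0, 1.5)) -> bool:
    """Decide whether a detected person's ground-plane position falls
    inside the forbidden zone.

    The zone is modelled as an axis-aligned rectangle given as
    (x_min, y_min, x_max, y_max) in metres; the default values are
    illustrative only.
    """
    x_min, y_min, x_max, y_max = area
    return x_min <= x <= x_max and y_min <= y <= y_max
```

In a deployment, the image coordinates of the detection first have to be projected into the same real-world frame as the zone boundaries; the test itself then stays this simple.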
4.4. Software Implementation
The implemented software can be categorized into three distinct components, described in the following subsections: the edge stereo vision system, the collaborative robot node, and the human detection node.
4.4.1. Edge Stereo Vision System
The software implementation of an ESVS requires a series of virtual procedures to be carried out before the system can be successfully put into operation. To begin, each camera underwent an individual calibration using images of a calibration pattern taken from different angles. This process allowed the extraction of intrinsic parameters and distortion coefficients. After completing the calibration for each camera, a stereo calibration was carried out to determine the relative pose between the two cameras. Stereo images, where the calibration pattern was visible to both cameras, were captured, and the corresponding points were extracted. Subsequently, the stereo camera parameters and transformation matrix between the coordinate systems of each camera were calculated. These parameters enabled the system to perform computer vision triangulation effectively.
After the calibration procedure, the routine used is described by the flowchart depicted in
Figure 7.
To identify the object's color using both cameras, the developed software first removes all undesired elements from the images and then applies a color recognition algorithm. If both images yield the same color, that color is taken as the final result; otherwise, the algorithm selects the color with more detected pixels across the two cameras.
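The two-camera agreement rule can be sketched as follows; the color names and pixel counts are illustrative, and the actual masking and recognition steps are omitted:

```python
# Sketch of the color agreement rule: if both cameras detect the same color,
# accept it; otherwise pick the color with more detected pixels overall.
# The dictionaries map color names to mask pixel counts (illustrative values).

def resolve_color(cam_a: dict, cam_b: dict) -> str:
    """cam_a/cam_b map color name -> number of pixels matching that color's mask."""
    best_a = max(cam_a, key=cam_a.get)
    best_b = max(cam_b, key=cam_b.get)
    if best_a == best_b:
        return best_a                  # both cameras agree
    # Disagreement: take the color with the larger pixel count across both views.
    return best_a if cam_a[best_a] >= cam_b[best_b] else best_b

# Camera A strongly favors red, camera B weakly favors orange -> red wins
color = resolve_color({"red": 5200, "orange": 300}, {"orange": 4100, "red": 900})
```

This simple vote exploits the fact that similar hues (e.g., red versus orange) tend to be confused under varying lighting, while a strong pixel majority in one view usually indicates the true color.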
After performing color recognition, the cube's position on the workspace can be determined, as well as the destination box where it should be placed. The system used both cameras to triangulate the central point of the object's uppermost face, obtaining its 3D coordinates in the real world. For a correct detection, the reference frames of the cameras and the cobot must coincide. Once all pertinent information has been collected, it is transmitted to a remote procedure call (RPC) function at the cobot node.
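The requirement that the camera and cobot reference frames coincide can be made concrete with a rigid-transform sketch; the rotation, translation, and point values below are illustrative, not the system's calibrated extrinsics:

```python
# Sketch: a triangulated point is only meaningful to the cobot if the camera
# and robot reference frames coincide; otherwise a rigid transform is applied.
# R and t below are illustrative placeholder values.

def cam_to_robot(p_cam, R, t):
    """Apply p_robot = R * p_cam + t (row-major 3x3 rotation R, 3-vector t)."""
    return tuple(sum(R[i][j] * p_cam[j] for j in range(3)) + t[i] for i in range(3))

# Camera axes already aligned with the robot frame (R = identity), camera
# mounted 0.30 m above the robot base:
R_ident = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
p_robot = cam_to_robot((0.07, 0.03, 2.0), R_ident, (0.0, 0.0, -0.30))
# approximately (0.07, 0.03, 1.70)
```

When the frames are truly coincident, R is the identity and t is zero, and the triangulated coordinates can be passed to the cobot directly.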
4.4.2. Collaborative Robot Node
As shown in
Figure 8, the node awaits the reception of a message containing the pick-and-place positions. As soon as it receives the message and extracts its information, a single pick-and-place function is called. This function interpolates the joint angles from the current position to the target position in 50 points and then starts the movement through each point. Before sending the robot to a point, the node consumes from the human detection node the information about the forbidden area. If someone is detected in that area, the robot stops until the area is clear.
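The routine in Figure 8 can be sketched as follows; the function names, joint values, and polling mechanism are illustrative assumptions rather than the node's actual API:

```python
# Hedged sketch of the pick-and-place routine: interpolate the joint path in
# 50 points and, before each step, consume the human detection status,
# pausing while the forbidden zone is occupied.
import time

N_POINTS = 50

def interpolate_joints(current, target, n=N_POINTS):
    """Linear interpolation of joint angles from current to target in n steps."""
    return [
        [c + (t - c) * (k + 1) / n for c, t in zip(current, target)]
        for k in range(n)
    ]

def execute_path(current, target, zone_occupied, send_joints, poll_s=0.0):
    """Move through the interpolated path, stopping while a human is detected."""
    for waypoint in interpolate_joints(current, target):
        while zone_occupied():          # consume the human detection node status
            time.sleep(poll_s)          # wait until the area is clear
        send_joints(waypoint)           # command the cobot to the next point

# Usage with stub callbacks (no robot attached):
visited = []
execute_path([0, 0, 0, 0, 0, 0], [90, 45, -30, 10, 0, 20],
             zone_occupied=lambda: False, send_joints=visited.append)
# visited now holds 50 waypoints ending at the target joint angles
```

Checking the occupancy flag before every waypoint, rather than once per task, is what allows the robot to stop mid-trajectory as soon as a person enters the restricted area.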
Furthermore, with the proposed communication architecture, it is also possible to integrate the Robot Operating System (ROS). The cognitive cell does not replace ROS; rather, it can be complemented by integrating ROS as a microservice within the application layer. This integration further enhances the system's capabilities, benefiting from the extensive ecosystem of libraries, tools, and pre-existing functionalities specifically designed for robotic applications. Incorporating ROS as a microservice within the application layer adds another layer of flexibility to the cognitive cell, empowering the system to accomplish more complex tasks and operations.
4.4.3. Human Detection Node
For the human detection node, the first microservice (
Figure 9A), operating at the gateway layer, continuously captured images and made them accessible by publishing an AMQP message to any service at the application layer that required the image data. In turn, the second microservice (
Figure 9B) at the application layer, after consuming the provided image, utilized the YOLOv8 deep learning algorithm [
57] for human detection. From the detection, the node calculated the human's position, verified whether it fell within the restricted zone, and then published the status to the cobot node.
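The zone-membership check performed after detection can be sketched as below; the detector call itself is omitted, and the zone rectangle and the use of the bounding box's bottom-center as the ground contact point are illustrative assumptions:

```python
# Sketch of the application-layer check: given a person bounding box from a
# detector such as YOLOv8, decide whether the person falls within the
# restricted zone. Pixel coordinates below are illustrative.

def in_forbidden_zone(bbox, zone):
    """bbox = (x1, y1, x2, y2) in pixels; zone = (zx1, zy1, zx2, zy2).
    A person is 'in the zone' if their bottom-center point lies inside it."""
    x1, y1, x2, y2 = bbox
    foot_x, foot_y = (x1 + x2) / 2, y2   # bottom-center of the bounding box
    zx1, zy1, zx2, zy2 = zone
    return zx1 <= foot_x <= zx2 and zy1 <= foot_y <= zy2

ZONE = (400, 300, 900, 700)              # hypothetical pixel rectangle
status = in_forbidden_zone((500, 100, 650, 650), ZONE)
```

The resulting Boolean status is what would be published to the cobot node, which pauses its motion whenever the flag indicates an occupied zone.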
4.5. Experiments
4.5.1. Experimental Environment and Tests
In order to evaluate the overall efficiency of the developed cognitive cell, a real-world experimental environment was assembled, integrating the components outlined in
Section 4.3 and
Section 4.4 to operate collectively. The result is depicted in
Figure 10.
The cognitive cell was tested to evaluate its safety when detecting people in the forbidden zone, its robustness in recognizing the objects' colors accurately despite variations in lighting, and the interoperability between the cognitive cell's components using the proposed communication architecture. To assess the system's performance, a series of pick-and-place experiments for all six cube colors was conducted in conjunction with continuous image acquisition for human detection. The two experiments were conducted over the course of eight hours, with two executions per hour, and each execution was divided into two tests: only the operator, and the operator with a person in the forbidden zone. In total, 192 pick-and-place tasks were carried out, 4800 images were collected, and the results of all nodes were recorded for each task.
The test routines were:
1. The operator retrieves a cube from the cube dispenser and presents it within the operational area of the ESVS. To ensure a fair evaluation, no colors were repeated.
2. The system identifies both the color and position of the cube and sends the robot to pick it up from the operator's hand.
3. A person may either be present or absent within the restricted area.
4. The robot places the cube inside the box that matches its color.
Across all experiments, the physical setup remained consistent, with artificial lighting switched on and a glass wall featuring blinds in the raised position.
4.5.2. Results
The cognitive cell underwent testing using the accuracy metric for binary classification, as shown in Equation (
1), to evaluate its performance in recognizing colors.
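In its standard binary-classification form, the accuracy metric can be written as follows (a reconstruction consistent with the TP, TN, FP, and FN definitions used below for Equation (2); the article's original notation may differ):

```latex
% Hedged reconstruction of the accuracy metric referenced as Equations (1) and (2)
\begin{equation}
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
\end{equation}
```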
The color recognition accuracy is presented in
Table 1, demonstrating the performance of the ESVS. Using Equation (
1), the system achieved 100% accuracy for the red, green, and blue colors, while achieving 87.5% for yellow, 71.87% for purple, and 28.12% for orange. The overall accuracy was also calculated, resulting in an average of 73.65%.
To detect people within the forbidden zone during the 96 pick-and-place tasks, a microservice captured images at intervals of 200 milliseconds, i.e., with a capture rate of 5 frames per second. Additionally, it was determined that each instance of a human reaching the forbidden zone would last ten seconds. As a result, 4800 images were collected, programmatically analyzed, and the results were recorded.
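The capture arithmetic above can be verified directly, assuming the ten-second window corresponds to the capture period of each task:

```python
# Quick check of the capture arithmetic: a 200 ms interval gives 5 frames per
# second, and 96 tasks with a 10-second capture window each yield the 4800
# images reported.

interval_ms = 200
fps = 1000 // interval_ms              # 5 frames per second
tasks = 96
window_s = 10                          # seconds of capture per task (assumed)
total_images = tasks * window_s * fps  # 4800 images
```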
Subsequently, the results were measured using accuracy, Equation (
2), and a confusion matrix. The results were classified into four distinct categories for analysis: true positive (
TP), true negative (
TN), false positive (
FP), and false negative (
FN). The
TP classification was assigned when a human presence was correctly detected within the forbidden zone. Conversely, the
TN category denoted instances where the human was accurately identified outside the restricted zone. On the other hand, the
FP designation was given to cases where the system incorrectly detected a human presence in the zone despite there being none. Lastly, the
FN classification indicated instances where the system failed to detect the presence of a human within the zone, despite their actual presence.
The results can be seen in the confusion matrix shown in
Table 2, and it is important to highlight that this work did not train a custom YOLOv8 model; thus, the detections were obtained with the default pretrained model, filtered by the person class. Moreover, confusion matrices are commonly employed in assessing the performance of AI algorithms, particularly for problems that involve multiple classes. However, in the specific context of evaluating human detection, the output was simplified to just two classes: positive, indicating the detection of a human within the restricted zone, and negative, indicating the absence of a human within that zone.
The obtained results indicated that a total of 2743 and 1104 instances were correctly identified as true positives and negatives, respectively. While 340 images were incorrectly classified as false positives, there were 613 instances of false negatives recorded. Overall, using Equation (
2), the calculated accuracy was 80.1%.
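The reported overall accuracy follows directly from the confusion-matrix counts above, using the standard accuracy definition:

```python
# Verifying the reported overall accuracy from the confusion-matrix counts,
# with Accuracy = (TP + TN) / (TP + TN + FP + FN).

TP, TN, FP, FN = 2743, 1104, 340, 613
total = TP + TN + FP + FN              # 4800 images
accuracy = (TP + TN) / total
print(f"{accuracy:.1%}")               # prints "80.1%"
```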
The proposed communication architecture demonstrated its effectiveness in facilitating seamless communication and coordination between the OT and IT networks during the experiments. By utilizing brokers as intermediaries, the architecture enabled the exchange of real-time data between the different components of the system.
In the context of human detection, the communication architecture successfully connected the microservices responsible for capturing and processing data related to human localization. The brokers effectively facilitated the transfer of data between the OT and IT networks, allowing the CRN to receive relevant information for its functioning. Similarly, in color recognition, the communication architecture enabled the exchange of data between the edge system and the cobot. Moreover, by utilizing brokers, the architecture ensured that the communication inside the OT network occurred flawlessly between its elements.
5. Discussion
In this section, we delve into a detailed discussion examining the development, the results, and their implications for the proposed cognitive cell system. The evaluation of different aspects, including communication architecture, human detection, and color recognition, provides a comprehensive understanding of the system’s functionality aligned with the I5.0 concept. We divide our discussion and key points into five categories: experiment, cognitive cell, adherence to I5.0 concepts, challenges and limitations, and future research.
5.1. Experiment
Regarding the color recognition performance, the system demonstrated an excellent accuracy in recognizing primary colors such as red, green, and blue, achieving an accuracy rate of 100%. This highlights the system’s robust capabilities in identifying and distinguishing primary colors. However, when it came to secondary and tertiary shades, the system’s accuracy showed some limitations. The accuracy rates for yellow, purple, and orange were relatively lower, indicating potential areas for improvement in recognizing and distinguishing these colors. Several factors contributed to this discrepancy in accuracy. One such factor was the difference in lighting conditions during image capture, as expected. Thus, efforts to enhance color recognition accuracy should include strategies to account for variations in lighting conditions and ensure consistent and reliable color perception.
The results obtained from the evaluation of the cognitive cell system’s performance in detecting people within the restricted zone revealed an overall accuracy of 80.1%. One factor that could have contributed to this accuracy and to the recorded instances of false positives and false negatives was the position of the camera. The placement and angle of the camera in relation to the restricted zone impacted its field of view and the accuracy of the person detection. Variations in camera positioning can introduce occlusions or distortions that affect the system’s ability to accurately identify human presence. Thus, optimizing the camera’s position and angle within the workspace is crucial to minimize false and missed detection, ultimately improving the system’s overall accuracy in detecting people within the restricted zone.
Moreover, it is important to note that the focus of this article was not on the system’s performance in recognizing colors or detecting people, but rather on the implementation of the cognitive cell within the context of a cognitive factory. The system’s overall accuracy only demonstrates its capabilities and paves the way for many other applications.
5.2. Cognitive Cell
This work introduced a novel approach by integrating the IS concept with I5.0 principles, representing a significant advancement in the field of advanced and flexible manufacturing. By creating dynamic and adaptive environments that facilitate seamless interactions between humans and ISs, the proposed approach fosters human–machine collaboration and enables more efficient, human-centric, and responsive manufacturing processes. This integration has the potential to revolutionize industrial environments, transforming them into cognitive factories.
With the use of the ESVS, the system’s cognitive capabilities were improved in relevant ways. Firstly, the system enabled the perception of depth through computer vision triangulation, providing height calculations for objects in the workspace. Then, it enhanced the color recognition capabilities by utilizing two cameras and processing two views of the same object, enabling a more comprehensive color analysis.
Furthermore, the deployment of the cognitive cell with an edge system brought an additional advantage. By processing and making decisions near the data source, the system enabled real-time responsiveness, which is essential for time-critical industrial processes. The edge architecture also minimized the need for data transmission to the application layer, thereby reducing bandwidth requirements.
Although our use case did not directly incorporate learning and adaptive capabilities, the communication architecture design offers the foundation for future implementation of AI algorithms. The data storage and availability of any component, alongside the flexibility of the proposed architecture, which allows integration in the form of microservices, enable AI algorithms to be seamlessly incorporated into the system. This would allow cognitive cells to analyze data, detect patterns, and adapt themselves, through their actuators, to a specific operator or to operational requirements, similar to an IS.
The experiments conducted in this work demonstrated comprehensive layer utilization within the proposed communication architecture. The cobot, positioned in the device layer, actively participated in the execution of tasks. The microservice responsible for capturing images and the cobot used the gateway layer for integration into the system. The edge layer enabled the deployment of the ESVS. The network layer provided connectivity to all other layers, and, lastly, the human detection microservice operated within the application layer. Through the effective integration and utilization of these layers, the proposed architecture demonstrated its potential and maturity for implementation in cognitive factories.
Additionally, the results indicate that the proposed communication architecture effectively facilitated the communication between the OT and IT networks during the experiments, since the cobot performed all pick-and-place tasks regardless of the color recognition result and stopped every time a person was detected, correctly or incorrectly.
Finally, it is important to highlight that our research contributes to the field by presenting a practical application of I5.0 in a specific manufacturing context through the development of a laboratory replica of a cognitive cell. This showcases the implementation challenges associated with cognitive cells and provides valuable insights for the advancement of the I5.0 paradigm in flexible and advanced manufacturing. While our proposed ESVS framework demonstrates promising results, we acknowledge its limitations. It is not a one-size-fits-all solution for all use cases of cognitive cells. We recognize that in-depth discussions of specific subsystems and use cases fall beyond the scope of this article. However, we believe that future research should address these aspects to enhance the framework’s adaptability and applicability.
Regarding the validation process, we acknowledge the importance of exploring different use cases of cognitive cells to gain comprehensive insights into the framework’s effectiveness. We recognize the need for more diverse validation scenarios in future research.
5.3. Adherence to I5.0 Concepts
A cognitive cell aligns with I5.0’s principles and technologies, including individualized human–machine interaction, bioinspired technologies, digital twins and simulation, data transmission and storage, interoperability, artificial intelligence, and energy-efficient technologies [
17].
In terms of individualized human–machine interaction, the cognitive cell demonstrated a collaborative nature by implementing a cobot and a vision system to recognize colors. The operator interacted with the cobot by presenting a cube within the operational area, while the system autonomously identified the color and position of the cube, facilitating an effective interaction between humans and machines.
While this work did not directly incorporate bioinspired technologies or smart materials, the concept could be integrated in several ways. For instance, the robot's gripper could be designed to adapt to different surfaces and ergonomic requirements while using recyclable materials and embedded sensors for enhanced functionality.
The cognitive cell's real-time data availability allows for the creation of a digital twin or simulation. By using the middleware layer services to store all data generated by every component, a digital twin can be created; a virtual simulation of the cognitive cell can also be developed for further analysis and optimization.
Additionally, data transmission, storage, and analysis technologies are common components within a cognitive cell. In this work, the proposed communication architecture ensured data transmission and interoperability; however, its structure enabled the implementation, on the middleware layer, of any service that focused on data processing for learning processes, traceability, big data management, etc.
The cognitive cell also demonstrated the use of AI to detect the presence of a person within the restricted area. This highlighted the system's capability to handle complex and dynamic scenarios without human support; the same capability can be leveraged to learn from the collected information and adapt the environment accordingly.
Lastly, the cognitive cell can benefit from the implementation of energy-efficient technologies, renewable energy sources, storage systems, and autonomy mechanisms. These measures can enhance the overall sustainability and efficiency of the system’s operations. Optimizing energy consumption, utilizing renewable energy sources for power requirements, and incorporating energy storage and autonomous functionalities can contribute to a more environmentally friendly and efficient cognitive cell.
5.4. Challenges and Limitations
The transition to I5.0 and the implementation of cognitive factories face some challenges and limitations that are important to highlight. One major challenge is the costs associated with acquiring and implementing the necessary technology infrastructure and providing adequate training for the workforce. Small- and medium-sized enterprises (SMEs) may face particular difficulties due to their limited financial resources. Moreover, skill gaps within the existing workforce can hinder the successful integration and operation of cognitive cells, necessitating extensive training and upskilling efforts. Ensuring interoperability between different systems, devices, and platforms is another challenge, as proprietary protocols and interfaces complicate integration processes. Finally, the extensive collection, analysis, and sharing of data in I5.0 raise concerns about data security and privacy. Robust security measures, such as encryption and access controls, are crucial for safeguarding data integrity and maintaining privacy.
5.5. Future Research
To suggest future research, this section presents possible areas of study concerning the implementation of cognitive factories within the framework of I5.0.
Market trends: to investigate and analyze quantitative aspects related to market trends within the I5.0 segment, including future predictions concerning trends in work, the job market, and technology.
Further performance evaluation: to conduct a comprehensive performance evaluation of the cognitive factory implementation in a real-world industrial setting; gather data on key performance metrics such as productivity, efficiency, quality, and cost-effectiveness; and compare the results with traditional manufacturing systems to assess the overall effectiveness of the cognitive factory and validate its impact on operational performance.
Integration of the IoT and sensor networks: to explore the integration of IoT devices and sensor networks within the cognitive factory to enhance data collection and enable real-time monitoring of manufacturing processes; to investigate the deployment of smart sensors, RFID tags, and other IoT devices to capture and transmit data from many stages of production; and to analyze the potential benefits of leveraging these data for predictive maintenance, supply chain optimization, and intelligent decision-making.
Optimization of AI algorithms: to continuously optimize and refine the AI algorithms used in the cognitive factory; to investigate advanced machine learning techniques, such as reinforcement learning and deep reinforcement learning, to enhance the system’s ability to adapt, learn, and optimize manufacturing processes autonomously; additionally, to explore the integration of explainable AI techniques to provide transparency and interpretability in decision-making processes.
Scalability and flexibility: to investigate approaches to enhance the scalability and flexibility of cognitive factories; to explore modular designs and plug-and-play architectures that enable easy integration and reconfiguration of production modules, allowing manufacturers to adapt quickly to changing market demands and production requirements; and to analyze the use of cloud computing and edge computing technologies to provide scalable computational resources and enable real-time data processing and analysis.
With research about the mentioned topics, it will be possible to advance the understanding and implementation of cognitive factories, contributing to the ongoing transformation of the manufacturing industry, and fostering innovation, efficiency, and sustainability in the era of intelligent and connected production systems.
6. Conclusions
This article aimed to explore the convergence of IS and I5.0 concepts by developing and implementing a replica of a cognitive cell in a laboratory. The objective was to provide insights into the technical implementation challenges of a cognitive factory, specifically focusing on the safe collaboration between humans and robots. By analyzing the key points, this article contributed to the understanding of the technological challenges associated with a cognitive factory implementation, offering practical insights and potential solutions, and expanding the knowledge base surrounding this advanced manufacturing paradigm.
The developed cognitive cell, which represents a small-scale version of a cognitive factory, demonstrated the benefits of a cognitive factory ecosystem. It showcased the integration between the OT and IT networks, a customized layer model for the communication architecture, and the interoperability among devices. Moreover, the cognitive cell system served as a test bed for the cognitive factory, a more complex structure with the ability to integrate multiple cognitive cells.
The experiments conducted with the cognitive cell system demonstrated its effectiveness and robustness. The system successfully executed the proposed tasks, showcasing its ability to perform multiple operations in a controlled environment. The results indicated that the cognitive cell was able to recognize colors and localize humans with the accuracies reported above.
Overall, the findings of this study reinforce the significance of cognitive factories and their potential for revolutionizing the industry. The successful implementation of the cognitive cell system in a laboratory setting provided a solid foundation for future research and development in this field. By addressing the technological challenges and providing practical insights, this article contributed to the advancement of the cognitive factory implementation and supported the transition to I5.0 principles.