1. Introduction
Manufacturing enterprises operate in an increasingly complex environment where production systems must be managed across their complete life cycle, from initial installation and ramp-up through daily operation, maintenance interventions, continuous improvement initiatives, and eventual decommissioning. Effective life-cycle management requires seamless integration of information from diverse sources including production planning systems, shop-floor controllers, ambient intelligence sensors, and distributed human expertise to support timely, informed decisions about maintenance scheduling, problem diagnosis, hazard prevention, and system reconfiguration [1,2,3]. Traditional approaches to industrial automation have relied on centralized, monolithic control systems with limited flexibility and poor support for collaboration among geographically distributed stakeholders. These legacy systems struggle to accommodate the dynamic, knowledge-intensive decision-making processes required for modern manufacturing operations [4,5].
Service-oriented architecture (SOA) emerged in the mid-2000s as a promising paradigm for addressing these challenges by decomposing complex systems into loosely coupled, interoperable services that can be dynamically composed to support diverse business processes [6,7]. Early SOA implementations in manufacturing focused primarily on enterprise resource planning and supply chain integration at the business layer, with limited penetration into shop-floor operations due to real-time constraints and the prevalence of proprietary automation protocols [8,9]. However, advances in web services standards, ambient intelligence technologies, and knowledge management systems created new opportunities to apply SOA principles throughout the manufacturing enterprise, from business planning down to equipment monitoring and control [10,11].
This paper presents a comprehensive SOA for decision support in industrial life-cycle management that was originally developed and validated as part of the InLife European research project [12,13,14,15]. The architecture addresses three fundamental challenges that persist in contemporary manufacturing systems. First, it provides structured mechanisms for capturing, organizing, and retrieving knowledge about production systems, problems, and solutions that would otherwise remain scattered across multiple information systems and human experts. Second, it enables real-time collaboration among distributed stakeholders including equipment operators, maintenance technicians, system integrators, and equipment vendors through orchestrated service compositions that automatically identify relevant expertise and facilitate information exchange. Third, it implements predictive decision support through continuous monitoring of life-cycle parameters, risk assessment algorithms, and case-based reasoning that proactively identifies potential problems and recommends appropriate interventions before failures occur.
The original system was built using SOAP-based web services, XML data exchange, and browser-based user interfaces that represented the state-of-the-art SOA technology stack in the late 2000s. While these specific technologies have evolved significantly, the fundamental architectural patterns, service decomposition strategies, and orchestration approaches remain highly relevant to contemporary manufacturing systems. The emergence of Industry 4.0 has renewed interest in service-oriented approaches to manufacturing integration, with modern implementations leveraging RESTful APIs, OPC UA for device-level interoperability, Asset Administration Shells for standardized digital representations, and microservices architectures for improved modularity and scalability [16,17,18]. Understanding the design principles, implementation challenges, and validation results from early SOA deployments provides valuable insights for researchers and practitioners working to integrate these newer technologies into industrial environments.
This paper makes several contributions to the literature on service-oriented manufacturing systems. First, we provide a detailed description of a multi-layered SOA that successfully integrated collaborative services, life-cycle management functions, and decision support capabilities in real industrial settings. Second, we present quantitative validation results from two case studies demonstrating substantial improvements in problem resolution time, maintenance efficiency, and operational performance. Third, we analyze the evolution from traditional SOAP-based SOA to modern microservices architectures, identifying key technology transitions, persistent architectural patterns, and integration opportunities with Industry 4.0 standards. Fourth, we discuss lessons learned regarding service granularity, orchestration strategies, knowledge management integration, and real-time decision support that remain applicable to contemporary system designs.
Furthermore, this work represents one of the earliest fully validated industrial implementations of service-oriented architecture for life-cycle management and can be interpreted as a precursor to contemporary Industry 4.0 service-based manufacturing systems.
The remainder of this paper is organized as follows. Section 2 provides background on service-oriented architecture principles, industrial life-cycle management requirements, and the evolution toward Industry 4.0. Section 3 reviews related work on SOA in manufacturing, microservices architectures, and decision support integration. Section 4 presents the system architecture, including the core collaborative services, life-cycle management services, and orchestration logic. Section 5 describes the implementation, including the technology stack, service interfaces, knowledge management infrastructure, and risk assessment logic. Section 6 reports the industrial case studies and validation results. Section 7 discusses the main performance results, architectural implications, and integration paths with contemporary Industry 4.0 technologies. Section 8 outlines future research directions, and Section 9 concludes the paper.
4. System Architecture
The architecture presented in this section was developed in the context of the European project InLife—Integrated Ambient Intelligence and Knowledge-Based Services for Optimal Life-Cycle Impact of Complex Manufacturing and Assembly Lines, coordinated by UNINOVA [63].
4.1. Multi-Layered Architecture Overview
The service-oriented architecture for industrial life-cycle management implements a multi-layered design that separates concerns between foundational collaborative capabilities, domain-specific life-cycle management functions, and orchestration logic that coordinates service interactions [12,14]. This layered approach enables independent evolution of different system aspects while maintaining clear interfaces between layers. The architecture comprises three primary layers supported by cross-cutting infrastructure for knowledge management and ambient intelligence integration. The overall architectural structure is illustrated in Figure 1.
The bottom layer provides eight core collaborative services that implement fundamental capabilities required across diverse manufacturing scenarios. These services handle team management functions including identifying relevant expertise, composing appropriate teams, and initiating collaborative sessions [13]. They also provide information management capabilities including retrieving product and process knowledge, accessing ambient intelligence data, and maintaining traceability of collaborative activities. Finally, they offer communication services that enable distributed stakeholders to interact through appropriate channels. By implementing these capabilities as reusable services, the architecture avoids duplicating common functionality across different application domains.
The middle layer implements three collaborative application services that combine core services to support specific manufacturing workflows. These application services address collaborative problem solving, intelligent monitoring, and reconfiguration of automated manufacturing systems and products. Each application service orchestrates multiple core services according to domain-specific logic, managing the overall workflow while delegating specific tasks to lower-level services. This separation between generic collaborative capabilities and domain-specific workflows enables the architecture to support diverse manufacturing scenarios through different combinations of the same foundational services.
The top layer provides six life-cycle management services that address specific phases and activities in the equipment life cycle. These services implement condition-based maintenance, online remote diagnostics, hazard prevention, installation support, continuous improvement, and technology selection. Three of these services operate automatically in response to detected conditions, while three are invoked on-demand by users. The automatic services are orchestrated by a risk assessment module that continuously monitors life-cycle parameters and triggers appropriate services based on the criticality and characteristics of detected situations.
Cross-cutting infrastructure supports all layers through shared capabilities for knowledge management and ambient intelligence integration. The knowledge management infrastructure maintains a common repository of structured information about products, processes, equipment, problems, and solutions. It provides semantic-based tools for storing, retrieving, and reasoning about this knowledge using ontologies, case-based reasoning, and rule-based reasoning. The ambient intelligence infrastructure [14] processes data from distributed sensors and context-aware systems, enriching raw measurements with contextual information about operating conditions, operator actions, and environmental factors. Both infrastructure components expose their capabilities through service interfaces that can be invoked by services at any layer.
4.2. Core Collaborative Services
The eight core collaborative services provide foundational capabilities that higher-level services compose to implement complete workflows. The collaboration start-up service initializes new collaborative sessions by accepting information about the topic, context, goals, required efforts, timeframe, and budget constraints. It creates a structured data object representing the collaborative situation and stores it in the knowledge base for reference throughout the collaboration life cycle. This service establishes the foundation for subsequent team composition and information management activities by defining the scope and parameters of the collaborative effort.
The resource discovery service identifies actors with relevant expertise and availability to participate in collaborative activities. It accepts requests specifying areas of expertise along with constraints on effort, timeframe, and budget. The service searches among registered actors to find those with matching expertise profiles, then checks their present and future availability considering existing commitments. It returns a list of candidate actors along with information about their availability windows and associated costs, both direct compensation and indirect overhead. This information enables intelligent team composition that balances expertise requirements against resource constraints.
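The matching logic described above can be sketched as follows. The `Actor` record, the 8-hour working day, and the flat-rate cost model are illustrative assumptions made for this sketch, not the project's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class Actor:
    name: str
    expertise: set
    hourly_rate: float            # direct compensation
    overhead_rate: float          # indirect overhead
    booked_hours: dict = field(default_factory=dict)  # day -> hours committed

def discover_resources(actors, required_expertise, max_rate, needed_hours, day):
    """Filter registered actors by expertise profile, availability window,
    and total (direct + overhead) cost; return candidates, cheapest first."""
    candidates = []
    for actor in actors:
        if not required_expertise <= actor.expertise:
            continue                                  # expertise does not match
        free = 8 - actor.booked_hours.get(day, 0)     # assume an 8-hour day
        if free < needed_hours:
            continue                                  # no availability window
        total_rate = actor.hourly_rate + actor.overhead_rate
        if total_rate > max_rate:
            continue                                  # exceeds budget constraint
        candidates.append((actor.name, free, total_rate))
    return sorted(candidates, key=lambda c: c[2])
```

The returned tuples expose availability and combined cost so that the team composition step can weigh expertise coverage against resource constraints.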
The team composition service determines optimal team configurations based on multiple criteria including costs, expertise coverage, collaboration patterns, and organizational relationships. It accepts requests specifying the collaborative situation and desired team characteristics, then evaluates possible team configurations against the specified criteria. The service considers constraints including availability of actors, budget limitations, responsibility assignments, expertise rankings, and required skill combinations. It returns the most suitable team composition according to the specified optimization criteria, enabling automated team formation that would otherwise require extensive manual coordination.
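A minimal sketch of the multi-criteria selection, here reduced to skill coverage and cost under a budget; the exhaustive enumeration and the dictionary-based candidate records are hypothetical simplifications of the service's richer optimization criteria.

```python
from itertools import combinations

def compose_team(candidates, required_skills, budget):
    """Enumerate team configurations and return the cheapest one that
    covers every required skill within budget, or None if infeasible."""
    best = None
    for size in range(1, len(candidates) + 1):
        for team in combinations(candidates, size):
            skills = set().union(*(member["skills"] for member in team))
            cost = sum(member["cost"] for member in team)
            if required_skills <= skills and cost <= budget:
                if best is None or cost < best[1]:
                    best = (team, cost)
    return best  # (team members, total cost)
```

Exhaustive search is tractable only for small candidate pools; a production service would apply heuristics or constraint solving over the additional criteria listed above.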
The collaboration call service manages the initiation of collaborative sessions by notifying selected team members and managing their participation. It accepts a list of actors to be involved along with information about the collaborative situation, then sends notifications to all team members with relevant context and instructions. The service manages responses from team members, tracks participation status, and checks predefined conditions for collaboration initiation. This automated notification and tracking capability reduces the overhead of coordinating distributed teams and ensures all participants have necessary information to contribute effectively.
The product and process knowledge provision service retrieves relevant information about products, processes, systems, and problems to support collaborative decision-making. It accepts requests specifying the type of knowledge needed and the intended audience, then selects appropriate semantic-based knowledge management tools to locate relevant information. The service searches across multiple sources including the knowledge base, legacy systems, programmable logic controllers, shop-floor controllers, web resources, and ambient intelligence systems. It returns the requested knowledge with presentation formats appropriate for the specified actors, enabling informed decision-making based on comprehensive information.
The ambient intelligence information provision service delivers semantically enriched data from distributed sensors and context-aware systems. Unlike traditional sensor systems that provide only raw measurements, this service enriches data with contextual information about the environment, operating conditions, and related events. It accepts requests for specific information types, then retrieves relevant data from ambient intelligence systems along with associated context. The service returns both the requested measurements and additional environmental status information that helps interpret the data correctly, enabling more accurate situation assessment and decision-making.
The collaboration traceability service maintains comprehensive records of collaborative activities to support quality management, continuous improvement, and regulatory compliance. It accepts specifications of states to be monitored and predefined events that should trigger recording, then continuously captures information about participant interactions, information exchanges, decisions made, and actions taken. The service provides current collaboration state information on request and issues warnings when predefined conditions are detected. This traceability capability ensures that collaborative problem-solving activities are documented systematically rather than remaining in informal communications that are difficult to retrieve and analyze.
The standard communication services provide appropriate channels for distributed stakeholders to interact during collaborative sessions. These services propose and enable selection of communication mechanisms suited to each collaboration situation, considering factors such as the number of participants, geographic distribution, required interaction modes, and available infrastructure. Communication options may include text-based messaging, audio conferencing, video conferencing, shared workspaces, and document collaboration tools. By providing flexible communication support, these services enable effective collaboration regardless of participant locations and preferences.
4.3. Life-Cycle Management Services
The six life-cycle management services address specific activities and phases in the equipment life cycle, with three services operating automatically in response to detected conditions and three available on-demand for user-initiated activities. The condition-based maintenance service implements predictive maintenance by analyzing life-cycle parameters to identify developing problems and recommend appropriate interventions. When the risk assessment module detects a situation that is well-understood and not immediately critical, it triggers this service to suggest maintenance actions and determine optimal timing. The service considers production schedules to identify non-producing time slots when maintenance can be performed with minimal disruption. It stores information about solved cases including the actions performed and their outcomes, building a knowledge base that improves future maintenance recommendations through case-based reasoning.
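The idea of scheduling interventions into non-producing time slots can be illustrated as follows, assuming the production schedule is given as a list of busy `(start, end)` intervals in hours from now; the interval representation and the planning horizon are illustrative, not the system's actual scheduling model.

```python
def next_maintenance_slot(schedule, duration, horizon):
    """Scan busy (start, end) intervals for the first idle gap long
    enough to host a maintenance task of the given duration."""
    cursor = 0
    for start, end in sorted(schedule):
        if start - cursor >= duration:
            return (cursor, cursor + duration)   # gap before this busy block
        cursor = max(cursor, end)
    if horizon - cursor >= duration:
        return (cursor, cursor + duration)       # gap after the last block
    return None  # no suitable non-producing window within the horizon
```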
The online remote maintenance and diagnostics service supports problem resolution when available information is insufficient to determine appropriate actions or when no similar historical cases exist. This service is triggered when the risk assessment module identifies a problem that requires additional expertise or investigation. The service provides diagnostic capabilities and supports maintenance development by facilitating collaboration between shop-floor personnel and remote experts. It implements case-based reasoning to search for similar problems in the historical database, presenting potentially relevant cases to guide diagnostic activities. When a solution is identified, the service captures the problem description, diagnostic process, and resolution for future reference, continuously expanding the knowledge base.
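The case-retrieval step might be sketched as a weighted feature match; the feature set, weights, and acceptance threshold are hypothetical, and the original system relied on richer semantic-based tools.

```python
def case_similarity(new_case, past_case, weights):
    """Weighted exact-match similarity between two feature dictionaries."""
    matched = sum(w for feature, w in weights.items()
                  if new_case.get(feature) == past_case.get(feature))
    return matched / sum(weights.values())

def retrieve_similar(new_case, case_base, weights, threshold=0.6):
    """Return stored cases above the similarity threshold, best first."""
    scored = sorted(case_base,
                    key=lambda c: case_similarity(new_case, c, weights),
                    reverse=True)
    return [c for c in scored
            if case_similarity(new_case, c, weights) >= threshold]
```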
The prevention of hazardous situations service addresses critical conditions that pose immediate risks to manufacturing systems or operators. When the risk assessment module detects life-cycle parameters indicating hazardous conditions, it triggers this service to initiate appropriate protective actions. The service is customizable to user requirements and can trigger various responses including alarms to notify operators, emergency calls to summon assistance, or automatic stops to prevent damage or injury. The service implements qualitative risk assessment using categories including hazardous situations that significantly reduce system capability and safety margins, major situations that reduce functional capability, and minor situations that cause slight reductions in capability without compromising safety.
The installation and ramp-up support service assists with initial equipment deployment and performance validation. Unlike the automatic services, this service is activated only on user request during installation and commissioning phases. It provides structured workflows for configuring equipment, validating performance against specifications, training operators, and establishing baseline operating parameters. The service captures information about installation decisions, configuration settings, and initial performance characteristics that inform subsequent life-cycle management activities. This documentation ensures that knowledge gained during installation is preserved and accessible throughout the equipment life cycle.
The continuous improvement service supports ongoing optimization efforts to enhance productivity, quality, or efficiency. This on-demand service is invoked by users when opportunities for improvement are identified through operational experience or changing requirements. It provides structured workflows for analyzing current performance, identifying improvement opportunities, evaluating alternative approaches, and implementing changes. The service maintains traceability of improvement initiatives including the rationale for changes, expected benefits, actual results, and lessons learned. This systematic approach to continuous improvement ensures that optimization efforts are documented and their outcomes are captured for organizational learning.
The selection of manufacturing automation and logistics technology solution service assists with technology evaluation and procurement decisions. This on-demand service is invoked when organizations need to select equipment, software, or services from multiple alternatives. It provides structured comparison frameworks that evaluate options against relevant criteria including technical capabilities, costs, vendor support, and integration requirements. The service captures the evaluation process and rationale for technology selections, creating an audit trail that supports future decisions and helps avoid repeating past mistakes.
4.4. Orchestration and Integration Patterns
The orchestration of services across the multi-layered architecture follows a centralized pattern where a risk assessment module serves as the primary coordinator for automatic life-cycle management activities. This orchestration approach was selected based on manufacturing industry preferences for single-party controlled process flows that provide clear accountability and predictable behavior. The risk assessment module continuously monitors life-cycle parameters collected from ambient intelligence systems, shop-floor controllers, and other data sources. When a parameter crosses a predefined threshold, the module evaluates the situation to determine its criticality, identify probable root causes, and select the appropriate life-cycle management service to address the condition. The decision-routing logic implemented by the risk assessment module is depicted in Figure 2.
The risk assessment mechanism combines likelihood and impact estimates derived from both historical data and expert knowledge. The likelihood of a given situation is estimated based on similarity with previously observed cases stored in the knowledge repository, using a case-based reasoning approach where features such as equipment type, symptom patterns, and operational context are considered. Frequency of occurrence in past cases provides a primary indicator, complemented by expert-defined weighting when historical data is limited.
The impact component reflects the potential consequences of the situation, including production downtime, maintenance cost, and safety implications. These values are derived from historical records when available and, otherwise, from predefined expert assessments associated with each case type. Risk is then computed as a function of likelihood and impact and compared against predefined thresholds that determine the appropriate service response.
Threshold values are initially defined based on domain expertise and operational policies and are progressively refined through system use. Updates to likelihood and impact estimates are performed using a frequency-based approach as new cases are recorded, with expert validation ensuring consistency. This allows the system to incrementally adapt its decision behavior over time while maintaining operational reliability.
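The frequency-based estimates described above might be sketched as follows; the prior value, the multiplicative risk function, and the threshold pair are illustrative assumptions rather than the project's calibrated parameters.

```python
def likelihood(case_type, history, prior=0.05):
    """Frequency-based likelihood, falling back to an expert-defined
    prior when no historical cases have been recorded."""
    if not history:
        return prior
    return history.count(case_type) / len(history)

def risk_score(case_type, history, impact):
    """Risk as a function of likelihood and impact (here, their product)."""
    return likelihood(case_type, history) * impact

def classify(score, thresholds=(0.1, 0.4)):
    """Map a risk score onto the qualitative categories used in the text."""
    minor_limit, major_limit = thresholds
    if score >= major_limit:
        return "hazardous"
    if score >= minor_limit:
        return "major"
    return "minor"
```

As new cases are recorded, recomputing `likelihood` over the growing history implements the frequency-based refinement described above, with expert validation gating any threshold change.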
The risk assessment logic implements a decision tree that routes situations to different services based on their characteristics. When a life-cycle parameter indicates a critical condition that poses immediate risks, the prevention of hazardous situations service is triggered to initiate protective actions. When a parameter indicates a problem that is not immediately critical but for which insufficient information is available to determine appropriate actions, the online remote maintenance and diagnostics service is triggered to facilitate expert collaboration. When a parameter indicates a well-understood problem for which similar historical cases exist, the condition-based maintenance service is triggered to recommend proven solutions and schedule interventions. This routing logic ensures that each situation is handled by the most appropriate service based on its urgency and the available knowledge.
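This three-way routing reads naturally as a small decision function; the two boolean inputs are a deliberate simplification of the module's richer situation assessment.

```python
def route_situation(critical, known_solution):
    """Decision-tree routing: criticality first, then whether similar
    historical cases make the problem well understood."""
    if critical:
        return "prevention_of_hazardous_situations"
    if not known_solution:
        return "online_remote_maintenance_and_diagnostics"
    return "condition_based_maintenance"
```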
The integration between the service platform and shop-floor systems implements bidirectional data flows that enable both monitoring and control. Upward data flows carry real-time information about equipment status, process parameters, quality metrics, and ambient conditions from shop-floor systems to the service platform. This information feeds the life-cycle parameter monitoring and risk assessment processes that drive automatic service triggering. Downward data flows carry commands, configuration changes, and control signals from the service platform to shop-floor systems. These flows enable services to implement recommended actions such as adjusting operating parameters, scheduling maintenance activities, or initiating emergency stops. The bidirectional integration creates a closed-loop system where monitoring drives decision-making and decisions are enacted through control actions.
The integration with enterprise systems provides access to production planning, inventory management, and business intelligence information that informs life-cycle management decisions. Production schedules influence maintenance timing by identifying available windows when equipment can be taken offline without disrupting production targets. Inventory information affects maintenance planning by indicating parts availability and procurement lead times. Business intelligence about equipment performance, maintenance costs, and quality metrics supports continuous improvement initiatives and technology selection decisions. By integrating these enterprise information sources, the service platform can make more informed decisions that balance technical considerations against business objectives.
The knowledge management infrastructure serves as a central integration point that connects services across all layers. Services store structured information about their activities, decisions, and outcomes in the common repository, making this knowledge available to other services. For example, the condition-based maintenance service stores information about maintenance actions and their effectiveness, which the online remote diagnostics service can retrieve when addressing similar problems. The collaborative problem solving service stores information about problem-solving processes and solutions, which the continuous improvement service can analyze to identify recurring issues and improvement opportunities. This knowledge sharing across services enables organizational learning and continuous improvement of decision-making capabilities.
5. Implementation
5.1. Technology Stack
The original system implementation utilized SOAP-based web services as the primary mechanism for service exposure and invocation, reflecting the dominant SOA standards of the late 2000s. SOAP provided a standardized protocol for exchanging structured information in distributed environments, with strong support for complex data types, error handling, and security features. Services were described using Web Services Description Language documents that specified available operations, input and output parameters, and binding details. This standards-based approach enabled interoperability between services implemented in different programming languages and deployed on different platforms, supporting the heterogeneous technology environment typical of manufacturing enterprises.
XML served as the primary data exchange format throughout the system, providing a flexible, self-describing representation for complex data structures. Configuration files, service messages, knowledge base content, and integration payloads all utilized XML encoding. While XML’s verbosity imposed some overhead compared to more compact formats, its widespread tool support and human readability facilitated development and debugging. The system employed XML Schema definitions to validate message structures and ensure data consistency across service interactions. This rigorous approach to data validation helped prevent integration errors and improved overall system reliability.
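For illustration, a SOAP 1.1 request for a hypothetical `discoverResources` operation can be assembled with standard XML tooling; the service namespace and element names are invented here, since the real message structures were governed by the project's WSDL and XML Schema definitions.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.org/inlife/resource-discovery"   # invented namespace

def build_request(expertise, max_hours):
    """Assemble a SOAP 1.1 envelope carrying a hypothetical
    discoverResources operation in its Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SVC_NS}}}discoverResources")
    ET.SubElement(operation, f"{{{SVC_NS}}}expertise").text = expertise
    ET.SubElement(operation, f"{{{SVC_NS}}}maxHours").text = str(max_hours)
    return ET.tostring(envelope, encoding="unicode")

# Round-trip check: the serialized request parses back to the same payload.
request = build_request("welding-robot diagnostics", 16)
parsed = ET.fromstring(request)
assert parsed.find(f".//{{{SVC_NS}}}expertise").text == "welding-robot diagnostics"
```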
The sensor and context streams feeding the risk assessment engine served supervision, diagnostics, and maintenance support rather than hard real-time machine control. Consequently, the relevant update rates were on the order of seconds to minutes, depending on the monitored life-cycle parameter and the industrial scenario. Within this temporal scope, the use of SOAP/XML messaging and a centralized application server did not constitute a practical limitation for the validated services.
The architecture was designed so that time-critical control remained at shop-floor level, while the service platform operated at the decision-support level, where latency requirements were significantly less stringent. In the reported deployments, no cases were identified in which communication overhead invalidated the service response; however, the chosen implementation stack would not be appropriate for millisecond-level closed-loop control. This distinction motivated the later discussion in the paper regarding migration paths toward lighter and more distributed architectures in contemporary Industry 4.0 systems.
The user interface layer implemented a browser-based application architecture that combined multiple technologies to deliver rich, interactive experiences. Basic collaborative functionalities were provided through a wiki-based application that enabled users to create, edit, and share content collaboratively. More sophisticated functions were implemented using Enterprise JavaBeans deployed on a centralized application server, providing robust transaction management, security, and scalability. The client-side interface utilized Asynchronous JavaScript and XML techniques to enable dynamic updates without full page reloads, improving responsiveness and user experience. The XMLHttpRequest object facilitated asynchronous data exchange between the browser and server, while JavaScript provided client-side logic for manipulating the Document Object Model and responding to user interactions.
The communication infrastructure supported multiple channels for distributed collaboration including text messaging, audio conferencing, and video conferencing. Video and voice conferencing capabilities were implemented using the JABBER standard, which provided interoperable real-time communication across different client applications. This multi-channel communication support enabled stakeholders to select appropriate interaction modes based on the complexity of the problem, the number of participants, and available bandwidth. The integration of communication services with the broader service platform ensured that collaborative sessions were properly initialized, participants were notified, and interactions were captured for traceability.
The ambient intelligence infrastructure [14] processed data from distributed sensors and context-aware systems to provide enriched information about operating conditions and environmental factors. Unlike traditional sensor systems that simply forwarded raw measurements, the ambient intelligence processing module analyzed sensor data in context, correlating measurements with information about equipment states, operator actions, and environmental conditions. This contextual enrichment enabled more accurate interpretation of sensor data and better situation assessment. The processed information was made available through service interfaces that other components could invoke to access both current conditions and historical trends.
5.2. Service Design and Interfaces
The service interfaces were designed following customer-oriented principles that emphasized ease of use, clear semantics, and appropriate granularity. Each service exposed a focused set of operations that corresponded to meaningful business or technical functions rather than low-level implementation details. Operation names and parameter names were chosen to be self-explanatory, reducing the learning curve for developers integrating with the services. Input parameters were organized into logical groups that corresponded to natural units of information, avoiding overly complex parameter structures that would be difficult to construct correctly.
The core collaborative services implemented relatively fine-grained interfaces that provided atomic capabilities which could be composed by higher-level services. For example, the resource discovery service provided a single operation that accepted expertise requirements and constraints and returned matching actors, rather than exposing separate operations for searching, filtering, and ranking. This design simplified client code while maintaining flexibility through rich parameter structures that could express diverse requirements. The service internally implemented sophisticated logic for expertise matching, availability checking, and cost calculation, but exposed this complexity through a simple, high-level interface.
The collaborative application services implemented coarser-grained interfaces that orchestrated multiple core services to accomplish complete workflows. For example, the collaborative problem solving service exposed operations for initiating problem-solving sessions, requesting additional information, identifying causes, and implementing solutions. Each of these operations internally invoked multiple core services in appropriate sequences, managing the overall workflow while delegating specific tasks. This layered approach enabled clients to work at an appropriate level of abstraction, invoking high-level operations for complete workflows or lower-level operations for more fine-grained control.
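The layering described above can be sketched in a few lines of code. The following Python sketch is purely illustrative: the class and method names (`ResourceDiscoveryService`, `CollaborativeProblemSolvingService`, and so on) are assumptions for this example and are not the InLife API. It shows a core service exposing one focused operation and an application service orchestrating several core services behind a coarser-grained workflow operation.

```python
from dataclasses import dataclass

@dataclass
class Actor:
    name: str
    skills: set
    available: bool = True

class ResourceDiscoveryService:
    """Core service: a single high-level operation instead of
    separate search/filter/rank calls."""
    def __init__(self, actors):
        self.actors = actors

    def find_actors(self, required_skills, only_available=True):
        matches = [a for a in self.actors
                   if required_skills <= a.skills
                   and (a.available or not only_available)]
        # Rank by breadth of expertise beyond the minimum requirement.
        return sorted(matches, key=lambda a: len(a.skills), reverse=True)

class NotificationService:
    """Core service: record notifications sent to actors."""
    def __init__(self):
        self.sent = []

    def notify(self, actor, message):
        self.sent.append((actor.name, message))

class CollaborativeProblemSolvingService:
    """Application service: orchestrates core services in one workflow."""
    def __init__(self, discovery, notifier):
        self.discovery = discovery
        self.notifier = notifier

    def initiate_session(self, problem, required_skills):
        # Invoke core services in sequence, hiding the workflow from clients.
        team = self.discovery.find_actors(required_skills)
        for actor in team:
            self.notifier.notify(actor, f"Session opened for: {problem}")
        return {"problem": problem, "participants": [a.name for a in team]}
```

A client initiating a problem-solving session calls only `initiate_session`; discovery, ranking, and notification remain internal to the application layer, mirroring the separation between fine-grained core services and coarse-grained workflow services.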
The life-cycle management services implemented interfaces tailored to their specific domains and usage patterns. The automatic services exposed operations that accepted situation descriptions and returned recommended actions, supporting the risk assessment module’s orchestration logic. The on-demand services exposed operations that guided users through structured workflows, accepting inputs at each step and returning appropriate prompts and options for subsequent steps. All services implemented consistent error handling patterns that distinguished between client errors such as invalid parameters, server errors such as internal failures, and business logic errors such as insufficient information to complete requested operations.
To provide a more concrete illustration of service specification and orchestration,
Appendix A presents an example service contract for the Online Remote Maintenance and Diagnostics workflow, including representative operations, message structure, and the corresponding end-to-end orchestration logic.
Service security was implemented through a centralized application server architecture in which web services communicated with the application server for security-relevant functionalities. This design prevented direct access to sensitive operations from external clients, routing all requests through the application server where authentication, authorization, and audit logging could be consistently enforced. Authentication was based on user credentials managed by the platform, while authorization relied on role-based access control associated with users and business units.
The platform also maintained traceability of relevant interactions, including user access, problem handling steps, and service-related actions, thereby providing an auditable record of system use. In the context of the original deployment, this centralized approach offered a pragmatic balance between controlled access and implementation feasibility.
From a contemporary perspective, such an architecture would require additional hardening measures to address current OT threat models. These would include stronger network segmentation, certificate-based service identities, encrypted service-to-service communication, finer-grained access policies, and continuous monitoring of security-relevant events. In a modern Industry 4.0 setting, such protections could be supported by secure interoperability standards such as OPC UA security profiles and by zero-trust-inspired design principles.
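The centralized security pattern of the original deployment can be illustrated with a minimal sketch: every request passes through an application-server layer that authenticates the caller, checks role-based permissions, and logs the access for traceability. The class, role, and operation names below are invented for this example and do not reflect the InLife implementation details.

```python
class AccessDenied(Exception):
    pass

class ApplicationServer:
    """Single entry point enforcing authentication, role-based
    authorization, and audit logging for all service requests."""
    def __init__(self, users, role_permissions):
        self.users = users                        # user -> {"password", "role"}
        self.role_permissions = role_permissions  # role -> set of operations
        self.audit_log = []

    def authenticate(self, user, password):
        record = self.users.get(user)
        return record is not None and record["password"] == password

    def authorize(self, user, operation):
        role = self.users[user]["role"]
        return operation in self.role_permissions.get(role, set())

    def invoke(self, user, password, operation, handler, *args):
        # All requests are routed through this method, so security
        # checks and traceability are enforced consistently.
        if not self.authenticate(user, password):
            self.audit_log.append((user, operation, "auth-failed"))
            raise AccessDenied("invalid credentials")
        if not self.authorize(user, operation):
            self.audit_log.append((user, operation, "denied"))
            raise AccessDenied("operation not permitted for role")
        self.audit_log.append((user, operation, "ok"))
        return handler(*args)
```

In a hardened modern variant, the plaintext credential check would be replaced by certificate-based service identities and the audit log would feed continuous security monitoring, as discussed above.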
5.3. Knowledge Management Infrastructure
The knowledge management infrastructure maintained a comprehensive repository of structured information organized into three categories reflecting different update frequencies and usage patterns. Static data included models of automated manufacturing systems, products, and processes that changed only when the physical systems were modified or new products were introduced. This information provided the foundational context for interpreting dynamic data and reasoning about problems and solutions. The models captured hierarchical relationships between systems and subsystems, specifications of equipment capabilities and constraints, and standard operating procedures for different scenarios.
Dynamic product and process states captured real-time information from shop-floor systems and ambient intelligence sensors along with results from collaborative decision-making activities. This information included current equipment status, process parameter values, quality measurements, and environmental conditions. It also included decisions made during collaborative sessions, actions taken in response to problems, and measures of effectiveness for implemented solutions. The dynamic state information was continuously updated as new data arrived from monitoring systems and as collaborative activities progressed, providing a current view of the manufacturing environment.
Dynamic collaboration states maintained information about ongoing collaborative work including active sessions, participating actors, information exchanges, and progress toward goals. This information supported the collaboration traceability service by capturing who was involved in each decision, what information they considered, what alternatives they evaluated, and what rationale supported their final choices. The collaboration state information was particularly valuable for continuous improvement initiatives, enabling analysis of problem-solving patterns to identify opportunities for process optimization or knowledge capture.
The knowledge base implemented semantic-based tools that enabled sophisticated reasoning about stored information. An ontology defined the domain concepts and relationships relevant to manufacturing life-cycle management, providing a formal vocabulary for describing products, processes, equipment, problems, causes, and solutions. This ontology enabled semantic queries that could find relevant information based on conceptual relationships rather than just keyword matching. For example, a query for solutions to bearing failures could retrieve cases involving different specific bearing types because the ontology captured the relationship between specific bearing types and the general bearing concept.
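The bearing example above reduces to concept subsumption over an is-a hierarchy. The following sketch, with an invented taxonomy and case data, shows how a query for a general concept retrieves cases recorded against its specializations; a production ontology would of course be far richer.

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
PARENT = {
    "ball_bearing": "bearing",
    "roller_bearing": "bearing",
    "bearing": "component",
    "servo_motor": "component",
}

def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False

def semantic_query(cases, concept):
    """Return cases whose component concept is subsumed by `concept`,
    not merely those matching the keyword exactly."""
    return [c for c in cases if is_a(c["component"], concept)]
```

A keyword search for "bearing" would miss a case recorded as "roller_bearing"; the subsumption check above retrieves it because the ontology relates the specific type to the general concept.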
Case-based reasoning capabilities enabled the system to leverage historical problem-solving episodes when addressing new situations. When a problem was detected, the case-based reasoning engine searched the knowledge base for similar historical cases based on a combination of features including symptom description, involved production unit or equipment type, relevant life-cycle parameters, and operating context. Similarity assessment was therefore multi-attribute and supported the identification of cases that were not identical but operationally comparable.
Retrieved cases were ranked by relevance and presented to users together with information about the associated problems, diagnostic processes, actions implemented, and observed outcomes. In practice, the usefulness of retrieved cases was assessed by domain experts during diagnostics and maintenance activities, particularly through their contribution to faster identification of probable causes and selection of corrective actions. While no formal retrieval accuracy benchmark was conducted, the industrial evaluation indicated that access to similar past cases reduced diagnostic effort and supported shorter time-to-resolution in the validated scenarios. Successful problem resolutions were then stored as new cases, allowing the knowledge base to grow incrementally over time.
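Multi-attribute retrieval of the kind described above is commonly sketched as a weighted sum of local similarities. The attribute names, weights, and exact-match local measure below are illustrative assumptions, not the InLife similarity model; a real deployment would use graded measures per attribute.

```python
# Hypothetical attribute weights; they must sum to 1.0 for a normalized score.
WEIGHTS = {"symptom": 0.4, "equipment_type": 0.3, "context": 0.3}

def local_similarity(a, b):
    """Exact-match local similarity; real systems would use richer measures
    (taxonomic distance, numeric ranges, etc.)."""
    return 1.0 if a == b else 0.0

def case_similarity(query, case):
    """Aggregate weighted similarity across all attributes."""
    return sum(w * local_similarity(query[attr], case[attr])
               for attr, w in WEIGHTS.items())

def retrieve(query, case_base, top_n=3):
    """Rank the case base by similarity to the query and return the best."""
    ranked = sorted(case_base,
                    key=lambda c: case_similarity(query, c),
                    reverse=True)
    return ranked[:top_n]
```

Note that a case can rank highly without matching every attribute, which is exactly the "not identical but operationally comparable" behaviour described in the text.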
Rule-based reasoning capabilities encoded expert knowledge as condition-action rules that could be automatically evaluated against current situations. Rules captured heuristics such as "if a specific parameter exceeds a threshold under certain operating conditions, then a particular cause is likely" or "if a specific symptom pattern is observed, then a particular diagnostic procedure should be performed". The rule-based reasoning engine evaluated applicable rules against current situation descriptions and returned recommended actions along with explanations of the reasoning. Rules could be added, modified, or disabled by domain experts through administrative interfaces, enabling the knowledge base to evolve as understanding improved.
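A minimal condition-action rule engine in the spirit of this description pairs a predicate over the current situation with a recommended action and an explanation, and supports enabling or disabling rules. The thresholds, parameter names, and rules below are invented for illustration.

```python
class Rule:
    """A condition-action rule with an explanation and an enable flag,
    so experts can disable rules without deleting them."""
    def __init__(self, name, condition, action, explanation):
        self.name = name
        self.condition = condition      # callable: situation dict -> bool
        self.action = action
        self.explanation = explanation
        self.enabled = True

def evaluate(rules, situation):
    """Return (action, explanation) pairs for every enabled rule that fires."""
    return [(r.action, r.explanation)
            for r in rules if r.enabled and r.condition(situation)]

# Hypothetical example rules.
rules = [
    Rule("overtemp-under-load",
         lambda s: s["temperature"] > 80 and s["load"] > 0.9,
         "inspect_cooling_circuit",
         "High temperature under high load suggests a cooling problem."),
    Rule("vibration-pattern",
         lambda s: s["vibration_rms"] > 4.5,
         "run_bearing_diagnostic",
         "Elevated vibration indicates possible bearing wear."),
]
```

Because each fired rule carries its explanation, the engine can return recommendations together with the reasoning behind them, as the text describes.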
5.4. Risk Assessment and Decision Logic
The risk assessment module implemented the core orchestration logic that determined when and how to trigger life-cycle management services in response to detected conditions. The module continuously monitored life-cycle parameters collected from ambient intelligence systems, shop-floor controllers, and other data sources, comparing current values against predefined thresholds that indicated potential problems. When a parameter crossed a threshold, the module created a symptom object that captured the parameter identity, threshold violation details, current operating conditions, and relevant context. This symptom object served as the input to the risk assessment process that determined appropriate responses.
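The threshold-monitoring step described above can be sketched as follows: when a life-cycle parameter crosses its configured limit, a symptom object is created that bundles the parameter identity, the violation details, and the surrounding context. The field names and limit representation are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Symptom:
    """Input object for the risk assessment process (illustrative fields)."""
    parameter: str
    value: float
    threshold: float
    direction: str       # "above" or "below" the violated limit
    context: dict        # operating conditions at detection time

def check_thresholds(readings, limits, context):
    """Compare current readings against (low, high) limits per parameter
    and emit one Symptom for each violation."""
    symptoms = []
    for name, value in readings.items():
        low, high = limits[name]
        if value > high:
            symptoms.append(Symptom(name, value, high, "above", context))
        elif value < low:
            symptoms.append(Symptom(name, value, low, "below", context))
    return symptoms
```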
The risk assessment calculation implemented a probabilistic model that estimated both the likelihood and impact of potential consequences. The basic risk formula multiplied the probability of an incident occurring by its expected impact, where impact could be measured in monetary terms for equipment damage or production losses, or in safety terms for operator injuries. The advanced risk formula decomposed this calculation into conditional probabilities that captured the relationship between symptoms, causes, and consequences. Specifically, the risk for each potential cause was calculated as the probability of that cause given the observed symptom, multiplied by the probability of a specific consequence given that cause, multiplied by the impact of that consequence.
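The advanced formula above can be made concrete with a small worked example. For each candidate cause, the risk is P(cause | symptom) multiplied by the expected impact of its consequences, summed over the possible consequences of that cause. All probabilities and impact figures below are invented for illustration.

```python
def risk_per_cause(p_cause_given_symptom, p_consequence_given_cause, impact):
    """risk(cause) = P(cause | symptom)
                     * sum over consequences of P(consequence | cause) * impact."""
    risks = {}
    for cause, p_cause in p_cause_given_symptom.items():
        expected_impact = sum(
            p_cons * impact[consequence]
            for consequence, p_cons in p_consequence_given_cause[cause].items()
        )
        risks[cause] = p_cause * expected_impact
    return risks

# Hypothetical figures for a vibration symptom on one production unit.
p_cause = {"bearing_wear": 0.75, "misalignment": 0.25}
p_consequence = {
    "bearing_wear": {"line_stop": 0.25, "quality_loss": 0.5},
    "misalignment": {"line_stop": 0.125, "quality_loss": 0.25},
}
impact_eur = {"line_stop": 50_000, "quality_loss": 5_000}

# bearing_wear: 0.75 * (0.25 * 50000 + 0.5 * 5000) = 11250.0 EUR of risk
```

The cause with the highest risk value, not merely the most probable cause, drives the response, because a less likely cause with severe consequences can still dominate the calculation.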
The probability estimates were derived from multiple sources including historical data, expert judgment, and analytical models. Historical data provided empirical frequencies for how often specific symptoms led to specific causes and how often specific causes led to specific consequences. Expert judgment supplemented historical data for rare events or new equipment where sufficient historical data was not available. Analytical models such as Event Tree Analysis and Bayesian Networks provided structured approaches for combining multiple information sources and reasoning about causal relationships. The system kept probability estimates in the knowledge base and updated them as new evidence became available, implementing a learning capability that improved risk assessment accuracy over time.
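The learning capability described above, updating probability estimates as new evidence arrives, can be sketched as smoothed frequency counting: each resolved case that confirms which cause produced a symptom increments the corresponding count. The smoothing constant and class names are assumptions; a full implementation would use the Bayesian network machinery mentioned in the text.

```python
class ConditionalEstimator:
    """Estimate P(cause | symptom) from confirmed-case counts with
    Laplace smoothing, so causes never reach probability zero."""
    def __init__(self, causes, alpha=1.0):
        # Pseudo-counts encode the prior (expert judgment) before any data.
        self.counts = {c: alpha for c in causes}

    def update(self, confirmed_cause):
        """Record one resolved case that confirmed this cause."""
        self.counts[confirmed_cause] += 1

    def probability(self, cause):
        total = sum(self.counts.values())
        return self.counts[cause] / total
```

Starting from a uniform prior, repeated confirmations shift the estimate toward the empirically frequent cause while rarer causes retain small non-zero probabilities, mirroring the combination of expert judgment and historical data described above.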
The impact estimates considered multiple dimensions including equipment damage costs, production losses, quality impacts, and safety risks. Equipment damage costs included both repair expenses and potential replacement costs if damage was severe. Production losses accounted for downtime during repairs, reduced throughput during degraded operation, and potential missed delivery commitments. Quality impacts included scrap costs, rework expenses, and potential warranty claims for defective products. Safety risks considered potential operator injuries and associated medical costs, lost time, and regulatory penalties. The multi-dimensional impact assessment enabled the system to appropriately prioritize situations that posed different types of risks.
The risk levels were classified into qualitative categories that determined service routing decisions. Hazardous situations were defined as those that significantly reduced system capability and safety margins, requiring immediate protective action through the prevention of hazardous situations service. Major situations reduced functional capability but did not pose immediate safety risks, typically triggering the online remote maintenance and diagnostics service to engage expert support. Minor situations caused slight reductions in capability without compromising safety, typically handled through the condition-based maintenance service using established procedures. This qualitative classification simplified decision-making while ensuring that situations received appropriate responses based on their severity.
The decision logic for service selection implemented a rule-based approach that considered both risk level and information availability. When a hazardous situation was detected, the prevention of hazardous situations service was triggered immediately regardless of information availability, prioritizing safety over optimal problem resolution. When a non-hazardous situation was detected, the system evaluated whether sufficient information was available to determine appropriate actions. If similar historical cases existed and the situation was well-understood, the condition-based maintenance service was triggered to recommend proven solutions. If insufficient information was available or no similar cases existed, the online remote maintenance and diagnostics service was triggered to facilitate expert collaboration and knowledge development. This adaptive routing ensured that each situation was handled by the most appropriate service based on its characteristics.
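The routing rules above reduce to a short decision function: hazardous situations go straight to the hazard-prevention service regardless of information availability, and otherwise routing depends on whether similar historical cases make the situation well understood. The service names follow the text; the function signature is an assumption for this sketch.

```python
def route_service(risk_level, similar_cases_found):
    """Select the life-cycle management service for a detected situation.
    risk_level: "hazardous", "major", or "minor" (see the classification above).
    similar_cases_found: whether the knowledge base holds comparable cases."""
    if risk_level == "hazardous":
        # Safety first: trigger protection regardless of available knowledge.
        return "prevention_of_hazardous_situations"
    if similar_cases_found:
        # Well-understood situation: recommend proven corrective actions.
        return "condition_based_maintenance"
    # Insufficient knowledge: escalate to expert collaboration.
    return "online_remote_maintenance_and_diagnostics"
```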
6. Case Studies and Validation
6.1. Automotive Assembly Lines (German SME)
The first industrial validation was conducted at a German small-to-medium enterprise that supplied complex assembly lines for small motors to the automotive industry. The company operated multiple assembly lines with sophisticated automation including robots, transport systems, programmable logic controllers, and quality control systems. The manufacturing environment was characterized by high product variety, frequent changeovers, and demanding quality requirements typical of automotive supply chains. The company faced challenges with unstructured problem documentation, reactive maintenance practices, and limited collaboration between shop-floor personnel and remote equipment vendors.
This case was selected because it offered a representative environment for testing the InLife approach in discrete manufacturing with complex automation, distributed maintenance responsibility, and strong dependence on vendor–service-provider coordination. The assembly lines combined mechanical, pneumatic, and control subsystems, making them suitable for validating monitoring, structured diagnostics, and collaborative maintenance support. The implementation was influenced by the characteristics of this setting, particularly the need to integrate shop-floor events with external expertise, the high cost of downtime, and the practical importance of reducing invasive interventions and improving reuse of maintenance knowledge.
The service-oriented architecture was deployed to support collaborative problem solving, intelligent monitoring, and condition-based maintenance across the assembly lines. The ambient intelligence infrastructure [
14] was integrated with existing sensors and control systems to capture real-time data about equipment status, process parameters, and operating conditions. The knowledge management infrastructure was populated with models of the assembly lines, standard operating procedures, and historical maintenance records. The core collaborative services were configured to support teams spanning shop-floor operators, maintenance technicians, the company’s engineering staff, and technical support personnel from equipment vendors.
The implementation focused on three primary use cases that addressed the company’s most pressing challenges. The first use case supported collaborative problem solving when assembly line issues occurred, automatically identifying relevant expertise, initiating collaborative sessions, and capturing problem-solving processes and solutions. The second use case implemented intelligent monitoring of critical equipment parameters, enabling proactive detection of developing problems before they caused failures. The third use case provided condition-based maintenance recommendations that scheduled interventions based on actual equipment condition rather than fixed time intervals, reducing unnecessary maintenance while catching developing problems earlier.
The implementation was carried out incrementally, beginning with modelling the production units and relevant life-cycle parameters, followed by configuration of monitoring rules, integration of available shop-floor data sources, and progressive introduction of the collaborative and maintenance-oriented services. In practice, one of the main constraints was the need to fit the deployment into an already operational production environment without interfering with normal work. Another challenge was the initial effort required to structure knowledge about equipment, symptoms, and maintenance procedures in a form suitable for system use. User adoption also required a gradual shift from informal communication and experience-based intervention to more structured problem registration and knowledge reuse. Although this did not generate strong organizational resistance, it required familiarization and alignment of expectations among the involved technical actors.
The validation period extended over several months during which the system was used for real production support. Performance metrics were collected comparing operations before and after system deployment across multiple dimensions. The percentage of problems registered in a structured way increased from ca. 50% to ca. 100%. This improvement was attributed to the system’s automated problem registration and structured data capture, which made documentation easier and more valuable by enabling knowledge reuse. The flexibility of maintenance intervals improved from low to medium on a qualitative scale, reflecting the shift from fixed schedules to condition-based timing that better matched actual equipment needs.
The percentage of invasive maintenance procedures decreased from ca. 70% to ca. 30%. This improvement resulted from better diagnostic capabilities that enabled more targeted interventions and from predictive maintenance that caught problems earlier when simpler corrective actions were sufficient. Spare parts consumption decreased from ca. 60% to ca. 40% of baseline levels. This improvement was attributed to more accurate diagnostics that avoided unnecessary parts replacement and to earlier problem detection that prevented secondary damage requiring additional parts.
Manufacturing plant availability and performance improved from medium to high on a qualitative scale, reflecting reduced unplanned downtime, faster problem resolution, and more effective maintenance planning. The collaborative capabilities enabled faster engagement of remote expertise when needed, reducing the time required to diagnose and resolve complex problems. The condition-based maintenance capabilities enabled better alignment of maintenance activities with production schedules, minimizing disruption to production operations. The structured knowledge capture enabled organizational learning, with solutions to recurring problems becoming more efficient over time as the knowledge base grew.
6.2. Air Conditioning Manufacturing (Portuguese Company)
The second industrial validation was conducted at a Portuguese company that manufactured and assembled compression and sorption air conditioning systems ranging from 6 to 600 kW capacity. The company employed approximately 400 people and served both its own brand and original equipment manufacturer customers. The company’s products were installed at customer sites across diverse geographic locations, creating challenges for maintenance support and problem resolution. The company faced difficulties with unstructured problem reporting from customers, long problem resolution times requiring site visits, and limited ability to schedule preventive maintenance effectively.
This case was selected because it provided a complementary industrial context in which the managed assets were geographically distributed and maintenance support depended heavily on remote diagnosis and interaction with customer sites. The equipment involved medium-to-large air conditioning units instrumented with temperature and pressure sensing, making the case particularly suitable for validating condition-based maintenance and remote diagnostics services. The implementation was shaped by the need to combine manufacturer knowledge, customer-reported symptoms, and remotely acquired monitoring data in order to reduce travel, accelerate problem resolution, and improve the structuring of service knowledge across repeated maintenance situations.
The service-oriented architecture was deployed to support the complete life cycle of installed air conditioning systems from installation through ongoing maintenance and eventual replacement. The system was configured to support collaboration among customer operators, the company’s field service technicians, engineering staff at company headquarters, and component suppliers. The knowledge management infrastructure was populated with product models, installation procedures, maintenance guidelines, and historical problem records. The life-cycle management services were configured to support installation and ramp-up, condition-based maintenance, remote diagnostics, and continuous improvement.
The implementation emphasized remote support capabilities that could reduce the need for costly site visits while improving problem resolution speed. Customers were provided with interfaces to report problems and describe symptoms, which were captured in structured form and made available to service technicians. The case-based reasoning capabilities enabled technicians to search for similar historical problems and retrieve proven solutions. The remote diagnostics capabilities enabled technicians to guide customer personnel through diagnostic procedures and corrective actions without traveling to the site. When site visits were necessary, the structured problem information and diagnostic results enabled technicians to arrive better prepared with appropriate parts and tools.
The implementation also followed a staged process, starting with modelling the installed equipment and maintenance-related knowledge, followed by integration of monitoring data, configuration of service workflows, and training of the actors involved in remote support. A practical constraint in this case was the distributed nature of the installed units, which required combining manufacturer knowledge, customer-side information, and remote diagnostic interaction in a coherent service process. Another important aspect was the need to improve the quality and structure of problem descriptions, since prior practice relied heavily on phone-based reporting and paper documentation. The transition therefore involved not only technical integration, but also a gradual adoption of more formalized maintenance support procedures by both service staff and customer-side actors.
The validation period demonstrated substantial improvements across multiple performance dimensions. The number of problems registered in structured form increased from approximately one per month to ca. 100% of all problems, representing a dramatic improvement in problem documentation and knowledge capture. This improvement was attributed to the system making problem reporting easier for customers and more valuable for the company by enabling faster, more effective responses. The average time to solve problems decreased from two days to at most four hours. This improvement resulted from faster access to relevant expertise, better diagnostic support through case-based reasoning, and more effective remote collaboration.
The percentage of problems requiring diagnostic travel to customer sites decreased from ca. 85% to ca. 25%, representing a considerable reduction in costly site visits. This improvement was enabled by better remote diagnostic capabilities and by empowering customer personnel to perform corrective actions under remote guidance. The percentage of problems handled directly by customers without company involvement increased from 0% to ca. 25%. This improvement was enabled by providing customers with access to structured problem-solving knowledge and guided diagnostic procedures. The percentage of maintenance actions that were scheduled and performed as planned increased from ca. 70% to ca. 100%. This improvement resulted from better maintenance planning based on equipment condition and better coordination with customer operations.
The company projected that full deployment of the system across their installed base would yield a reduction of ca. 60% in maintenance costs through reduced travel, faster problem resolution, and more effective preventive maintenance. They also observed improved quality of services through more consistent problem resolution and better knowledge capture.
6.3. Performance Analysis
The reported performance improvements are based on observations collected during the pilot deployment phase of the InLife system in an industrial environment. The evaluation relied on a combination of system logs, maintenance records, and expert assessments from operators and engineers involved in the process. The comparison between “before” and “after” corresponds to operational practices prior to the introduction of the InLife services and those observed after their deployment over a comparable time window. The dataset comprises multiple incident handling cases (on the order of tens of occurrences), allowing consistent patterns to be identified across scenarios such as fault reporting, maintenance intervention, and resolution time.
It should be noted that, due to the nature of the industrial setting, the evaluation was not conducted as a controlled experiment. Factors such as variability in operational conditions, staffing, and production context were not isolated. Therefore, the results should be interpreted as indicative of practical impact observed in real-world conditions rather than as statistically controlled measurements.
The performance results from both case studies indicate consistent operational benefits from the proposed architecture, especially in structured problem registration, diagnostic support, and maintenance coordination. Across both settings, the combination of monitoring, knowledge reuse, and service orchestration improved the ability to document problems systematically and respond to them in a more informed and timely way. In the air conditioning case, this was reflected in shorter resolution times and reduced travel requirements; in the automotive case, it was reflected in more targeted maintenance actions and lower spare-parts consumption.
These improvements should be interpreted in light of the industrial context and the evaluation methodology adopted. The two case studies differed in baseline data availability, operational configuration, and service scope, which limits strict cross-case comparison of every indicator. Nevertheless, the recurrence of positive effects across both environments supports the conclusion that the proposed architecture improved diagnostic efficiency, maintenance planning, and knowledge capture under real operating conditions.
7. Results and Discussion
7.1. Quantitative Performance Metrics
The quantitative performance metrics from the two case studies provide strong evidence for the operational benefits of service-oriented architecture in industrial life-cycle management.
Table 1 summarizes the key performance improvements observed across both validation scenarios, demonstrating consistent benefits despite differences in manufacturing domains, company sizes, and specific use cases. Due to differences in baseline data availability across the two industrial settings, not all performance indicators could be measured in both cases.
The reported values are based on industrial observations, system records, and project assessment data. Approximate values, indicated by “ca.”, are used only where the original baseline data were available at operational rather than fine-grained measurement level.
Some performance indicators were not available for both case studies due to differences in data collection practices and system maturity prior to deployment. In particular, in the automotive case, several metrics such as travel reduction and spare parts optimization were not systematically recorded before the introduction of the InLife system, preventing a reliable “before vs. after” comparison. In contrast, the air conditioning manufacturing case had more structured historical records, enabling a broader set of indicators to be quantified.
While this limits direct comparability across all metrics, a core subset of indicators, such as structured problem registration and overall availability/performance, was consistently observed across both environments. The convergence of improvements in these common indicators supports the generalizability of the observed benefits, while the additional metrics provide complementary evidence in contexts where data was available.
The improvements in structured problem registration were particularly striking, with both companies achieving near-complete capture of problems in structured form compared to minimal or inconsistent documentation previously. This improvement has multiple downstream benefits beyond the immediate operational metrics. Structured problem data enables trend analysis to identify recurring issues that warrant systematic solutions rather than repeated firefighting. It supports root cause analysis by providing comprehensive information about symptoms, operating conditions, and contextual factors. It facilitates knowledge transfer by capturing expert problem-solving processes in forms that can be retrieved and applied by less experienced personnel. It enables performance benchmarking by providing consistent metrics for problem frequency, resolution time, and effectiveness of solutions.
The reduction in problem resolution time observed in the air conditioning case study represents a transformational improvement in service responsiveness. Reducing resolution time from two days to four hours changes the customer experience from frustrating delays to rapid support, potentially influencing purchasing decisions and customer loyalty. The time reduction also has direct cost implications by reducing the duration of equipment downtime and the associated production or comfort losses. The improvement was achieved through multiple mechanisms including faster expertise identification, immediate collaboration initiation, better diagnostic support through case-based reasoning, and more effective remote guidance of corrective actions.
The reduction in diagnostic travel requirements demonstrates substantial cost savings potential while also improving service speed. Travel costs for field service technicians include not only direct expenses for transportation and accommodation but also opportunity costs from time spent traveling rather than solving problems. The reduction in travel requirements was enabled by better remote diagnostic capabilities, structured knowledge that could guide on-site personnel, and collaborative tools that enabled remote experts to effectively support problem resolution. The remaining problems that still required site visits likely represented situations where physical intervention was necessary or where equipment complexity exceeded the capabilities of on-site personnel even with remote guidance.
The emergence of customer self-service capability represents a significant shift in the service model. Enabling customers to resolve one-quarter of problems without vendor involvement reduces service costs while improving customer satisfaction through faster resolution. This capability was enabled by providing customers with structured access to problem-solving knowledge, guided diagnostic procedures, and clear instructions for corrective actions. The self-service capability also has strategic implications for differentiating the company’s offerings and potentially enabling new service business models such as tiered support with premium pricing for immediate expert assistance.
The reduction in invasive maintenance procedures observed in the automotive case study demonstrates the value of better diagnostics and predictive maintenance. Invasive procedures that require significant equipment disassembly are costly in terms of both labor time and production disruption. They also carry risks of introducing new problems through reassembly errors or damage during disassembly. The reduction in invasive procedures was achieved through more accurate diagnostics that enabled targeted interventions and through earlier problem detection that allowed simpler corrective actions before problems escalated to require major interventions.
The reduction in spare parts consumption represents direct cost savings while also indicating improved diagnostic accuracy. Unnecessary parts replacement wastes both the parts cost and the labor cost for installation. It can also introduce new problems if replacement parts have different characteristics than original parts or if installation is not performed correctly. The reduction in parts consumption was achieved through better diagnostics that confirmed root causes before parts replacement and through predictive maintenance that caught problems earlier when repair rather than replacement was possible.
7.2. Architectural Comparison: SOA vs. Microservices
The evolution from traditional service-oriented architecture to microservices represents both continuity and change in architectural principles and implementation approaches. Both paradigms emphasize modularity, loose coupling, and service-based integration, but they differ in granularity, deployment models, governance approaches, and technology stacks. Understanding these differences is essential for organizations considering modernization of existing SOA implementations or development of new service-based manufacturing systems.
Traditional SOA implementations typically define services at relatively coarse granularity corresponding to significant business or technical capabilities. In the manufacturing life-cycle management architecture described in this paper, services such as collaborative problem solving, condition-based maintenance, and online remote diagnostics each encompass substantial functionality that orchestrates multiple lower-level operations. This coarse granularity simplifies service discovery and composition by reducing the number of services that must be understood and coordinated. However, it can also reduce flexibility by coupling multiple concerns within single services and making it difficult to scale or evolve individual capabilities independently.
Microservices architectures emphasize finer granularity with services decomposed around specific business capabilities or technical functions. In a microservices reimplementation of the life-cycle management system, capabilities such as symptom detection, case retrieval, risk calculation, and action recommendation might each be implemented as separate microservices rather than being combined within larger services. This finer granularity enables independent development, deployment, and scaling of individual capabilities. However, it also increases the number of services that must be discovered, composed, and monitored, potentially increasing operational complexity and communication overhead.
Deployment and operational models differ significantly between traditional SOA and microservices. Traditional SOA implementations often deploy multiple services within shared application servers or enterprise service buses, leveraging common infrastructure for concerns such as security, transaction management, and monitoring. This shared infrastructure approach simplifies some operational concerns but creates coupling between services that share infrastructure and can make it difficult to scale individual services independently. The manufacturing life-cycle management system followed this pattern with services deployed on a centralized application server that provided common security and transaction management.
Microservices architectures emphasize independent deployment with each service packaged with its required dependencies and deployed in isolated containers or virtual machines. This independence enables services to use different technology stacks, scale independently based on their specific load patterns, and be updated without affecting other services. However, independent deployment also requires more sophisticated operational tooling for service discovery, health monitoring, distributed logging, and failure recovery. Manufacturing environments must carefully evaluate whether the benefits of independent deployment justify the additional operational complexity, particularly for systems that must meet high availability and real-time requirements.
Communication patterns and protocols differ between traditional SOA and microservices implementations. Traditional SOA heavily utilized SOAP-based web services with XML messaging, providing strong typing, comprehensive error handling, and built-in security features. These capabilities came at the cost of significant overhead in message size and processing time. Modern microservices implementations typically favor RESTful APIs with JSON messaging that offer simpler integration and better performance for many use cases. However, REST’s stateless request-response model may not be ideal for all manufacturing scenarios, particularly those requiring publish-subscribe patterns or guaranteed message delivery. OPC UA has emerged as an important middle ground for manufacturing, providing service-oriented communication with industrial-specific features such as real-time data access, historical data retrieval, and alarm and event management.
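As an illustration of the lighter footprint of JSON messaging, a symptom event of the kind exchanged in such a system might be serialized as follows. This is a hedged sketch: the field names and values are hypothetical and are not taken from the original implementation.

```python
import json

# Hypothetical JSON payload for a symptom event reported over a REST API.
# All field names are illustrative stand-ins.
symptom_event = {
    "equipmentId": "press-07",
    "parameter": "hydraulic_pressure",
    "value": 182.5,
    "unit": "bar",
    "threshold": 175.0,
    "timestamp": "2024-05-01T08:30:00Z",
    "severity": "warning",
}

body = json.dumps(symptom_event)
# The same information in a SOAP envelope would add an XML
# Envelope/Header/Body wrapper plus namespace declarations,
# typically multiplying the byte count several times over.
print(len(body), "bytes as JSON")
```

The trade-off discussed above is visible even at this scale: the JSON form carries no schema or type information in the message itself, which is precisely what SOAP's verbosity paid for.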
Governance and coordination approaches differ between the two paradigms. Traditional SOA often implements centralized governance with enterprise-wide standards for service design, shared data models, and coordinated deployment processes. This centralized approach ensures consistency and interoperability but can slow innovation and create bottlenecks in large organizations. Microservices architectures emphasize decentralized governance with individual teams responsible for their services’ design, implementation, and operation. This decentralization enables faster innovation but requires careful attention to interface contracts, data consistency, and operational standards to prevent fragmentation.
The choice between traditional SOA and microservices for manufacturing applications depends on multiple factors including real-time requirements, operational maturity, organizational structure, and existing technology investments. Manufacturing control systems with millisecond-level timing requirements may be better served by traditional SOA or specialized industrial protocols rather than microservices with their additional communication overhead. Organizations with limited operational maturity may find traditional SOA’s shared infrastructure easier to manage than microservices’ distributed operational model. Organizations with strong functional silos may benefit from microservices’ support for independent team ownership, while organizations with integrated engineering teams may prefer traditional SOA’s centralized coordination. Organizations with significant investments in existing SOA infrastructure may find incremental evolution more practical than wholesale replacement with microservices.
Compared with much of the published literature on service-oriented manufacturing systems, the contribution of this work lies not only in proposing a service-based architecture, but in validating it in real industrial settings with explicit integration of monitoring, knowledge management, and decision support. Earlier studies often emphasized conceptual architectures, interoperability mechanisms, or prototype-level demonstrations, whereas the present work combines industrial deployment with observable operational impact. Its main strength is, therefore, the integration of service orientation with practical life-cycle decision support. Its limitations are also clear: the original implementation reflects the technological constraints of its time, and the empirical validation follows an industrial assessment logic instead of a statistically controlled experimental design. Accordingly, the results are best generalized at the level of architectural patterns and observed operational benefits, not as universally transferable numerical gains.
7.3. Integration with Industry 4.0 Technologies
The integration of service-oriented architecture with Industry 4.0 technologies creates opportunities to enhance manufacturing life-cycle management capabilities while addressing limitations of earlier implementations. Three technology areas are particularly relevant for modernizing the architecture described in this paper: OPC UA for device-level interoperability, Asset Administration Shells for standardized digital representations, and digital twins for virtual modeling and simulation.
OPC UA provides a comprehensive framework for industrial interoperability that combines information modeling, service-oriented communication, and security features specifically designed for manufacturing environments. The OPC UA information modeling capabilities enable rich, hierarchical representations of equipment, processes, and data that go beyond the simple tag-based models of earlier industrial protocols. These models can capture relationships between components, constraints on valid operations, and semantic descriptions that enable automated reasoning. The OPC UA service set includes not only basic data access but also sophisticated capabilities such as historical data retrieval, alarm and event management, and method invocation that align well with life-cycle management requirements.
Integrating OPC UA with the service-oriented life-cycle management architecture would involve exposing equipment capabilities through OPC UA servers that implement standardized information models. The life-cycle parameter monitoring module could subscribe to relevant OPC UA data items and events rather than polling proprietary interfaces, reducing integration complexity and improving real-time responsiveness. The risk assessment module could invoke OPC UA methods to execute diagnostic procedures or corrective actions on equipment, providing a standardized mechanism for control that works across equipment from different vendors. The knowledge management infrastructure could import OPC UA information models to automatically populate equipment models and relationships, reducing manual configuration effort and ensuring consistency between physical equipment and digital representations.
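The subscription-based monitoring described above can be sketched in plain Python as an observer pattern. This is a simplified model, not a working integration: a real implementation would use an OPC UA client library (for example, the Python asyncua package), and the node identifier below is an illustrative stand-in.

```python
# Minimal model of the publish/subscribe pattern behind OPC UA data-change
# subscriptions. Node ids, thresholds, and callbacks are illustrative.

class MonitoredNode:
    """Stands in for an OPC UA variable node on an equipment server."""
    def __init__(self, node_id):
        self.node_id = node_id
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def write(self, value):
        # On a real server, a data change triggers notifications to all
        # sessions holding a monitored item on this node.
        for cb in self._subscribers:
            cb(self.node_id, value)

alerts = []

def on_data_change(node_id, value):
    # Life-cycle parameter monitoring reacts to pushed updates
    # instead of polling a proprietary interface.
    if value > 90.0:
        alerts.append((node_id, value))

temperature = MonitoredNode("ns=2;s=Press07.Temperature")
temperature.subscribe(on_data_change)
temperature.write(85.0)   # below threshold: no alert
temperature.write(95.5)   # above threshold: alert recorded
print(alerts)
```

The key architectural point is that the monitoring module registers interest once and then reacts to pushed notifications, which is what reduces integration complexity relative to polling.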
The Asset Administration Shell (AAS) represents another important standardization effort that complements OPC UA by defining comprehensive digital representations of assets that span their complete life cycles. An AAS includes not only real-time data and control interfaces but also documentation, maintenance records, configuration information, and business data. The AAS specification defines standardized submodels for common information types such as technical data, operational data, and maintenance information, enabling consistent access to asset information regardless of vendor or asset type.
Integrating AAS with the service-oriented life-cycle management architecture would involve creating AAS instances for each managed asset and populating them with information from multiple sources. The static data in the knowledge management infrastructure could be structured according to AAS submodels, providing standardized representations that facilitate information exchange with other systems. The dynamic product and process states could be exposed through AAS interfaces, enabling external systems to access current equipment status and historical trends through standardized APIs. The life-cycle management services could be registered as AAS operations, enabling them to be discovered and invoked through standard AAS mechanisms rather than custom service registries.
A concrete migration path can be illustrated using the Online Remote Maintenance and Diagnostics workflow. In a contemporary implementation, the functions currently grouped within this service could be decomposed into microservices such as symptom ingestion, case retrieval, risk evaluation, notification, and maintenance action management. The symptom ingestion microservice would receive data from the shop floor, the case retrieval microservice would query the knowledge base for similar past cases, and the notification microservice would contact the responsible actors once the appropriate service path had been selected.
At the interoperability level, equipment events and state changes currently entering the platform through proprietary or SOAP-based interfaces could be exposed through OPC UA variables, events, and methods. For example, monitored life-cycle parameters such as temperature, pressure, or flow-rate anomalies could be represented as OPC UA data nodes and events, while actions such as initiating a diagnostic procedure or confirming a maintenance plan could be mapped to OPC UA methods.
The knowledge repository could, in turn, be aligned with Asset Administration Shell submodels by separating static asset descriptions, operational state information, and maintenance knowledge. Technical asset characteristics would map naturally to technical-data submodels, current and historical operating values to operational-data submodels, and problem/solution records to maintenance or condition-monitoring related submodels. In this way, the original repository structure would be preserved conceptually but exposed through standardized digital representations more compatible with current Industry 4.0 ecosystems.
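A minimal sketch of this repository-to-submodel mapping follows. The submodel names and fields are simplified stand-ins for the standardized AAS submodel templates, not the actual specification schemas, and the legacy record structure is hypothetical.

```python
# Illustrative mapping of a legacy repository record onto AAS-style
# submodels: static technical data, operational state, and maintenance
# knowledge are separated as described in the text.

legacy_record = {
    "asset": "chiller-12",
    "manufacturer": "ExampleCorp",
    "rated_power_kw": 45.0,
    "current_flow_lpm": 118.2,
    "problems": [{"symptom": "low flow", "solution": "replace filter"}],
}

def to_aas_submodels(record):
    return {
        "TechnicalData": {
            "Manufacturer": record["manufacturer"],
            "RatedPowerKW": record["rated_power_kw"],
        },
        "OperationalData": {
            "FlowRateLPM": record["current_flow_lpm"],
        },
        "MaintenanceKnowledge": {
            "Cases": record["problems"],
        },
    }

shell = {
    "assetId": legacy_record["asset"],
    "submodels": to_aas_submodels(legacy_record),
}
print(shell["submodels"]["TechnicalData"]["Manufacturer"])
```

As the text notes, the repository content is preserved; only its exposure changes, so external Industry 4.0 tools can navigate the shell without knowing the original schema.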
Digital twins provide virtual representations of physical assets and processes that enable simulation, prediction, and optimization without disrupting actual operations. A digital twin maintains synchronization with its physical counterpart through continuous data exchange, using the current physical state to update the virtual model and using virtual model predictions to inform physical operations. Digital twins can support life-cycle management by simulating the effects of proposed maintenance actions, predicting future equipment condition under different operating scenarios, and optimizing maintenance schedules to balance costs against risks.
Integrating digital twins with the service-oriented life-cycle management architecture would involve creating virtual models of managed equipment and processes and connecting them to the life-cycle parameter monitoring infrastructure. The digital twin models could consume the same sensor data and operating conditions as the risk assessment module, maintaining synchronized representations of equipment state. The condition-based maintenance service could query digital twin predictions about remaining useful life and optimal maintenance timing rather than relying solely on threshold-based rules. The online remote diagnostics service could use digital twin simulations to evaluate proposed diagnostic procedures or corrective actions before implementing them on physical equipment, reducing risks and improving effectiveness.
The combination of OPC UA for device connectivity, AAS for standardized digital representations, and digital twins for virtual modeling creates a powerful foundation for next-generation life-cycle management systems. These technologies address several limitations of the original implementation including proprietary device interfaces, inconsistent information models, and limited predictive capabilities. However, successful integration requires careful attention to several challenges including managing the complexity of multiple overlapping standards, ensuring consistency between physical equipment and digital representations, and meeting real-time requirements with the additional processing overhead of digital twin synchronization.
7.4. Lessons Learned and Design Implications
The development, deployment, and validation of the service-oriented architecture for industrial life-cycle management generated numerous insights about effective service design, implementation challenges, and organizational factors that influence success. These lessons learned remain relevant for contemporary system development efforts despite evolution in specific technologies and standards.
Service granularity emerged as a critical design decision that significantly impacts system usability, flexibility, and performance. The three-layer architecture with fine-grained core services, medium-grained application services, and coarse-grained life-cycle management services provided a good balance for the manufacturing life-cycle management domain. The fine-grained core services enabled flexible composition to support diverse workflows while avoiding excessive complexity in service orchestration. The coarse-grained life-cycle management services provided intuitive interfaces for domain-specific functions while internally orchestrating multiple lower-level services. Organizations developing service-oriented manufacturing systems should carefully analyze their domain to identify natural service boundaries that align with business capabilities and technical functions while avoiding both excessive granularity that increases complexity and insufficient granularity that reduces flexibility.
Orchestration versus choreography represents another important architectural decision that affects system behavior and operational characteristics. The centralized orchestration approach implemented through the risk assessment module provided clear accountability, predictable behavior, and straightforward monitoring that aligned well with manufacturing industry preferences. However, centralized orchestration also creates a potential bottleneck and single point of failure that must be carefully managed through redundancy and failover mechanisms. Choreography approaches where services coordinate through event exchanges rather than central control can provide better scalability and resilience but make it more difficult to understand and monitor overall system behavior. Manufacturing applications with clear process flows and accountability requirements may favor orchestration, while applications requiring high scalability and resilience may favor choreography or hybrid approaches.
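The orchestration style can be sketched as a single coordinator that owns the control flow. The service names, thresholds, and routing logic below are illustrative stand-ins for the kinds of steps the risk assessment module coordinated, not its actual implementation.

```python
# Orchestration sketch: one coordinator invokes each service in turn and
# owns the control flow, which makes behavior easy to monitor but
# concentrates responsibility in a single component.

def detect_symptom(reading):
    # Illustrative symptom detection on a single parameter.
    return {"symptom": "overpressure"} if reading > 175 else None

def assess_risk(symptom):
    return {"risk": "high", **symptom}

def route_to_service(assessment):
    return "online_remote_diagnostics" if assessment["risk"] == "high" else "log_only"

def orchestrator(reading):
    symptom = detect_symptom(reading)
    if symptom is None:
        return "no_action"
    return route_to_service(assess_risk(symptom))

print(orchestrator(182.0))
print(orchestrator(150.0))
```

In a choreographed variant, each function would instead publish an event and the next service would subscribe to it, removing the central coordinator at the cost of a less visible end-to-end flow.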
Knowledge management integration proved essential for enabling intelligent decision support and organizational learning. The combination of structured information models, case-based reasoning, and rule-based reasoning provided complementary capabilities that addressed different aspects of life-cycle management. Structured information models enabled consistent data capture and retrieval across diverse sources. Case-based reasoning leveraged historical experience to guide problem-solving when similar situations had been encountered previously. Rule-based reasoning encoded expert knowledge to support decision-making when historical cases were not available. Organizations implementing service-oriented manufacturing systems should plan for comprehensive knowledge management from the beginning rather than treating it as an afterthought, as the value of decision support capabilities depends critically on the quality and completeness of underlying knowledge.
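How case-based retrieval and rule-based fallback complement each other can be sketched as follows. The cases, rules, and the Jaccard similarity measure are illustrative choices for this sketch; the original system's similarity metrics and knowledge content are not reproduced here.

```python
# Sketch of complementary reasoning styles: case-based retrieval when
# relevant history exists, rule-based fallback when it does not.

CASES = [
    {"symptoms": {"vibration", "noise"}, "solution": "rebalance spindle"},
    {"symptoms": {"overheating", "noise"}, "solution": "clean cooling ducts"},
]

RULES = [
    (lambda s: "overpressure" in s, "inspect relief valve"),
]

def jaccard(a, b):
    # Set-overlap similarity between observed and recorded symptoms.
    return len(a & b) / len(a | b)

def suggest(symptoms, min_similarity=0.5):
    # Case-based reasoning: retrieve the most similar past case.
    best = max(CASES, key=lambda c: jaccard(symptoms, c["symptoms"]))
    if jaccard(symptoms, best["symptoms"]) >= min_similarity:
        return best["solution"]
    # Rule-based fallback encodes expert knowledge for unseen situations.
    for condition, action in RULES:
        if condition(symptoms):
            return action
    return "escalate to expert"

print(suggest({"vibration", "noise"}))   # case retrieved
print(suggest({"overpressure"}))         # no similar case, rule fires
```

The final fallback, escalation to a human expert, mirrors the collaborative problem-solving service: automated reasoning narrows the space, and people handle what remains.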
User interface design significantly influenced system adoption and usage patterns. The browser-based interface with wiki-like collaboration features and dynamic updates through AJAX techniques provided an accessible, familiar user experience that reduced training requirements and encouraged usage. However, the interface design also needed to accommodate diverse user roles including shop-floor operators, maintenance technicians, engineers, and managers with different information needs and interaction patterns. Organizations developing service-oriented manufacturing systems should invest in user-centered design processes that understand different user roles and design interfaces appropriate for each role’s needs and context.
Integration with existing systems presented both technical and organizational challenges that required careful management. The bidirectional integration with shop-floor control systems required understanding proprietary protocols, managing real-time data flows, and ensuring that service platform actions did not disrupt production operations. The integration with enterprise systems required aligning data models, managing access controls, and coordinating deployment schedules. Organizations implementing service-oriented manufacturing systems should allocate sufficient time and resources for integration activities, engage stakeholders from both IT and operational technology organizations, and plan for iterative integration that validates functionality incrementally rather than attempting big-bang deployments.
Organizational change management emerged as equally important as technical implementation for achieving successful outcomes. The shift from informal, reactive problem-solving to structured, proactive life-cycle management required changes in work practices, responsibilities, and performance metrics. Shop-floor personnel needed training not only in system operation but also in the value of structured problem documentation and the importance of capturing knowledge for organizational learning. Management needed to adjust performance metrics to recognize and reward proactive problem prevention rather than only measuring reactive problem resolution. Organizations implementing service-oriented manufacturing systems should plan for comprehensive change management including stakeholder engagement, training programs, and performance metric alignment.
Security and access control required careful attention to balance the need for information sharing with the need to protect sensitive data and control critical operations. The centralized application server architecture provided a clear point for enforcing security policies but also created a potential target for attacks. Role-based access controls enabled appropriate information sharing while preventing unauthorized access to sensitive functions. Audit logging provided traceability of actions for quality management and regulatory compliance. Organizations implementing service-oriented manufacturing systems should engage security experts early in the design process, implement defense-in-depth strategies with multiple security layers, and plan for regular security assessments and updates.
Performance and scalability considerations influenced both architectural decisions and technology selections. The SOAP-based web services provided strong standardization but imposed significant overhead that could impact performance for high-frequency interactions. The centralized application server simplified some operational concerns but created potential scalability bottlenecks. The browser-based user interface provided good accessibility but required careful optimization to maintain responsiveness with large data sets. Organizations implementing service-oriented manufacturing systems should establish clear performance requirements early in the design process, conduct performance testing throughout development, and plan for scalability through appropriate architectural patterns and technology selections.
8. Future Directions
8.1. Migration to Microservices
The migration from traditional service-oriented architecture to microservices represents a significant opportunity to enhance flexibility, scalability, and development velocity while also introducing new challenges that must be carefully managed. A successful migration strategy for the manufacturing life-cycle management system would follow an incremental approach that preserves existing functionality while gradually introducing microservices patterns and technologies.
The first phase of migration would focus on identifying service boundaries that align with microservices principles of single responsibility and independent deployability. The existing three-layer architecture provides a reasonable starting point, but some services may benefit from further decomposition. For example, the risk assessment module currently combines symptom detection, probability calculation, impact assessment, and service routing in a single component. These functions could be separated into distinct microservices that communicate through well-defined APIs, enabling independent evolution and scaling of each capability. The case-based reasoning and rule-based reasoning functions currently embedded in the knowledge management infrastructure could be exposed as separate microservices that other services invoke through standardized interfaces.
The second phase would address data management and consistency challenges that arise from microservices’ emphasis on decentralized data ownership. The current architecture maintains a common repository that all services access, providing strong consistency but creating coupling between services. A microservices architecture would give each service ownership of its own data store, improving independence but requiring careful design of data synchronization and consistency mechanisms. Event-driven patterns where services publish events about state changes and other services subscribe to relevant events could provide eventual consistency while maintaining loose coupling. The manufacturing domain’s requirements for data consistency and traceability need to be carefully analyzed to determine appropriate consistency models for different data types.
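The event-driven pattern described here can be sketched as a small in-process event bus, with each service keeping a private store that it updates from published events. The topics, stores, and payloads are illustrative; a production system would use a message broker with delivery guarantees.

```python
# Sketch of event-driven synchronization between services that own their
# own data stores: no shared database, eventual consistency via events.

subscribers = {}

def subscribe(topic, handler):
    subscribers.setdefault(topic, []).append(handler)

def publish(topic, event):
    for handler in subscribers.get(topic, []):
        handler(event)

# Each service maintains a private store and updates it from published
# events, so services stay loosely coupled.
maintenance_store = {}
reporting_store = {}

subscribe("equipment.state",
          lambda e: maintenance_store.update({e["id"]: e["state"]}))
subscribe("equipment.state",
          lambda e: reporting_store.update({e["id"]: e["state"]}))

publish("equipment.state", {"id": "press-07", "state": "degraded"})
print(maintenance_store, reporting_store)
```

Between the publish call and the handlers completing, the two stores can briefly disagree; that window is exactly the eventual-consistency trade-off the text says must be analyzed per data type.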
The third phase would implement containerization and orchestration infrastructure to support independent deployment and scaling of microservices. Container technologies such as Docker would package each microservice with its dependencies, ensuring consistent behavior across development, testing, and production environments. Container orchestration platforms such as Kubernetes would manage deployment, scaling, and health monitoring of microservices, automatically restarting failed services and distributing load across multiple instances. The manufacturing environment’s requirements for high availability and real-time performance would need to be carefully considered when configuring orchestration policies and resource allocations.
The fourth phase would address operational concerns including service discovery, distributed logging, distributed tracing, and performance monitoring. Service discovery mechanisms would enable microservices to locate and communicate with each other without hard-coded addresses, supporting dynamic deployment and scaling. Distributed logging would aggregate log messages from multiple microservices into centralized repositories that enable troubleshooting and analysis. Distributed tracing would track requests as they flow through multiple microservices, enabling performance analysis and bottleneck identification. Performance monitoring would collect metrics from individual microservices and provide dashboards showing overall system health and performance.
The migration strategy would need to carefully manage the coexistence of traditional SOA services and new microservices during the transition period. Adapter patterns could enable microservices to invoke existing SOA services and vice versa, allowing incremental migration without requiring simultaneous replacement of all services. API gateways could provide unified interfaces that abstract whether underlying implementations use traditional SOA or microservices, enabling transparent migration from client perspectives. The migration would prioritize services that would benefit most from microservices patterns such as those requiring independent scaling or rapid evolution, while potentially leaving stable, well-functioning services in traditional SOA form.
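The adapter pattern for coexistence can be sketched as a translation layer between a hypothetical JSON interface and a stand-in for a legacy SOAP-style endpoint. Both the message formats and the legacy service here are illustrative simplifications.

```python
import json
import xml.etree.ElementTree as ET

def legacy_soap_service(xml_request):
    # Stand-in for an existing SOAP endpoint: accepts XML, returns XML.
    equipment = ET.fromstring(xml_request).findtext("equipmentId")
    return (f"<response><status>ok</status>"
            f"<equipmentId>{equipment}</equipmentId></response>")

def adapter(json_request):
    # Translate JSON -> XML, invoke the legacy service, translate back.
    payload = json.loads(json_request)
    xml_request = (f"<request><equipmentId>"
                   f"{payload['equipmentId']}</equipmentId></request>")
    root = ET.fromstring(legacy_soap_service(xml_request))
    return json.dumps({"status": root.findtext("status"),
                       "equipmentId": root.findtext("equipmentId")})

print(adapter('{"equipmentId": "press-07"}'))
```

Because new clients see only the JSON interface, the legacy service behind the adapter can later be replaced by a microservice without any client change, which is what makes incremental migration workable.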
8.2. Digital Twin Integration
The integration of digital twin technology with the service-oriented life-cycle management architecture offers opportunities to enhance predictive capabilities, enable virtual commissioning and testing, and support optimization through simulation. A comprehensive digital twin integration strategy would address multiple aspects of the manufacturing life cycle from equipment design through operation and maintenance.
Equipment-level digital twins would provide virtual representations of individual machines and systems that maintain synchronization with their physical counterparts through continuous data exchange. These digital twins would consume the same sensor data and operating conditions as the life-cycle parameter monitoring module, using physics-based models or data-driven models to estimate internal states that cannot be directly measured. The digital twin models could predict remaining useful life based on current operating conditions and historical degradation patterns, enabling more accurate maintenance scheduling than threshold-based approaches. They could simulate the effects of proposed parameter changes or maintenance actions, enabling evaluation of alternatives before implementation on physical equipment.
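The difference between threshold-based rules and model-based remaining-useful-life (RUL) estimation can be sketched with a simple linear degradation model. The failure limit, wear data, and linear trend are illustrative; a real digital twin would use physics-based or learned degradation models as described above.

```python
# Contrast: a threshold alarm fires only once a limit is crossed, while a
# degradation model extrapolates the trend to estimate time-to-limit.

def threshold_alarm(value, limit=100.0):
    return value >= limit

def estimate_rul(history, limit=100.0):
    # Least-squares linear trend on the degradation indicator,
    # extrapolated forward to the failure limit.
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    if slope <= 0:
        return float("inf")  # no degradation trend observed
    return (limit - history[-1]) / slope  # time steps until limit

wear = [40, 44, 47, 52, 55, 60]
print(threshold_alarm(wear[-1]))   # no alarm yet
print(estimate_rul(wear))          # steps remaining until the limit
```

The sketch shows the practical gain: the threshold approach is silent until failure is imminent, while the model already yields a lead time that a maintenance scheduler can plan against.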
Process-level digital twins would represent complete manufacturing processes including multiple equipment items, material flows, and quality relationships. These digital twins would enable simulation of production scenarios to evaluate the impact of equipment failures, maintenance activities, or process changes on overall production performance. The condition-based maintenance service could use process-level digital twin simulations to optimize maintenance schedules that balance equipment-specific needs against production requirements, identifying maintenance windows that minimize production disruption. The continuous improvement service could use process-level simulations to evaluate proposed process changes and predict their effects on productivity, quality, and costs.
Product-level digital twins would represent individual product instances throughout their life cycles from design through manufacturing, operation, and eventual disposal. These digital twins would capture as-built configurations, manufacturing history, operational usage patterns, and maintenance records for each product instance. The installation and ramp-up support service could use product-level digital twins to guide commissioning activities based on specific product configurations and customer requirements. The online remote diagnostics service could use product-level digital twins to understand the specific history and configuration of equipment experiencing problems, enabling more accurate diagnosis and targeted solutions.
The integration of digital twins with the service-oriented architecture would require addressing several technical challenges. Synchronization between physical equipment and digital twins must maintain sufficient accuracy while managing communication bandwidth and computational resources. The manufacturing environment’s real-time requirements may limit the complexity of digital twin models that can be updated at required frequencies. Model validation and calibration processes must ensure that digital twin predictions remain accurate as equipment ages and operating conditions change. Security mechanisms must protect digital twin models and data from unauthorized access while enabling appropriate sharing with service providers and partners.
The digital twin integration would also create new opportunities for service-based business models. Equipment vendors could offer digital twin models as services that customers access through standardized interfaces, enabling sophisticated predictive maintenance without requiring customers to develop and maintain complex models. Third-party service providers could offer digital twin-based optimization services that analyze customer operations and recommend improvements. Collaborative digital twin platforms could enable sharing of anonymized operational data and model improvements across multiple organizations, accelerating learning and innovation.
8.3. AI/ML Enhancement Opportunities
The integration of artificial intelligence and machine learning technologies with the service-oriented life-cycle management architecture offers opportunities to enhance decision support capabilities, automate knowledge extraction, and improve prediction accuracy. Several specific AI/ML applications align well with the architecture’s existing capabilities and could provide substantial value.
Predictive maintenance models using machine learning could enhance the condition-based maintenance service by learning complex relationships between sensor data, operating conditions, and equipment failures. Traditional threshold-based approaches and physics-based models require explicit specification of relationships that may be difficult to determine for complex equipment. Machine learning models can automatically discover patterns in historical data that indicate developing problems, potentially identifying precursors that human experts have not recognized. Deep learning approaches using recurrent neural networks or transformers could model temporal dependencies in sensor data, capturing degradation patterns that unfold over extended time periods. The trained models could be deployed as microservices that accept current sensor data and operating conditions and return predictions about remaining useful life or failure probability.
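As a minimal stand-in for the trained models described above, the microservice-style interface can be illustrated with a simple degradation-trend extrapolation: fit a line to a degradation indicator and extrapolate to a failure threshold. The threshold and sensor values are hypothetical; a real deployment would replace this with a learned model behind the same interface.

```python
# Illustrative sketch only: a linear degradation-trend extrapolation used as
# a stand-in for a trained ML model behind a microservice-style interface.
# The failure threshold and readings below are hypothetical.

def estimate_remaining_useful_life(readings, failure_threshold):
    """Fit a least-squares line to a degradation indicator and extrapolate
    the number of future samples until the failure threshold is crossed."""
    n = len(readings)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(readings) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    if slope <= 0:                  # no measurable degradation trend
        return None
    crossing = (failure_threshold - intercept) / slope
    return max(0.0, crossing - (n - 1))   # samples remaining from "now"

# Vibration amplitude drifting upward toward a threshold of 10.0 (mm/s).
rul = estimate_remaining_useful_life([1.0, 2.0, 3.0, 4.0, 5.0], 10.0)
# slope = 1, intercept = 1 -> crosses 10.0 at x = 9 -> 5 samples remaining
```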
Anomaly detection using unsupervised learning could enhance the life-cycle parameter monitoring module by automatically identifying unusual patterns that may indicate problems. Traditional monitoring approaches require explicit specification of normal operating ranges and threshold values that may not capture complex multivariate relationships. Unsupervised learning approaches such as autoencoders or isolation forests can learn representations of normal behavior from historical data and flag deviations that may deserve investigation. These anomaly detection models could operate continuously on streaming sensor data, generating alerts when unusual patterns are detected. The alerts would feed into the risk assessment module’s existing logic for determining appropriate responses.
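The idea of learning a profile of normal behavior and flagging deviations can be sketched with per-channel z-scores. This is deliberately simpler than the autoencoders or isolation forests named above, standing in only to show the fit/score pattern; all sensor values are illustrative.

```python
import math

# Minimal sketch: per-channel z-score anomaly scoring learned from "normal"
# historical data. It stands in for the richer unsupervised models discussed
# in the text (autoencoders, isolation forests); values are illustrative.

def fit_normal_profile(samples):
    """Learn mean and standard deviation for each sensor channel."""
    profile = []
    for values in zip(*samples):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        profile.append((mean, math.sqrt(var) or 1e-9))  # avoid div by zero
    return profile

def anomaly_score(sample, profile):
    """Largest absolute z-score across channels; higher means more unusual."""
    return max(abs(v - mean) / std for v, (mean, std) in zip(sample, profile))

# Two channels, e.g. (temperature, vibration), under normal operation.
normal_data = [(10.0, 0.50), (10.2, 0.48), (9.8, 0.52), (10.1, 0.49)]
profile = fit_normal_profile(normal_data)

typical = anomaly_score((10.0, 0.50), profile)
unusual = anomaly_score((14.0, 0.90), profile)   # well outside the profile
```

An alerting threshold on the score would then feed the risk assessment module's existing response logic.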
Natural language processing could enhance the knowledge management infrastructure by automatically extracting structured information from unstructured text sources such as maintenance reports, problem descriptions, and solution documentation. The current system relies on manual entry of structured information, which creates friction and may result in incomplete or inconsistent documentation. NLP techniques including named entity recognition, relationship extraction, and text classification could automatically identify equipment references, problem types, causes, and solutions in free-text descriptions. The extracted information could populate the knowledge base and improve the effectiveness of case-based reasoning by increasing the coverage of historical cases.
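A toy rule-based extractor shows the kind of structured record such a pipeline would produce. This stands in for the NER and relation-extraction techniques named above; the equipment-ID pattern and problem vocabulary are purely illustrative.

```python
import re

# Toy sketch of rule-based extraction from a free-text maintenance report,
# standing in for the NER / relation-extraction techniques named in the
# text. The ID pattern and problem vocabulary are purely illustrative.

PROBLEM_KEYWORDS = {
    "overheating": "thermal",
    "vibration": "mechanical",
    "leak": "hydraulic",
    "short circuit": "electrical",
}

def extract_report_fields(text):
    """Return a structured record of equipment references and problem types."""
    lowered = text.lower()
    equipment = re.findall(r"\b[A-Z]{2,4}-\d{2,4}\b", text)   # e.g. "RBT-104"
    problems = sorted({cat for kw, cat in PROBLEM_KEYWORDS.items()
                       if kw in lowered})
    return {"equipment": equipment, "problem_types": problems}

record = extract_report_fields(
    "Robot RBT-104 stopped after overheating; coolant leak found at pump PMP-07."
)
```

Records like this could populate the knowledge base directly, enlarging the pool of cases available to case-based reasoning.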
Reinforcement learning could optimize maintenance scheduling and resource allocation decisions by learning policies that balance multiple objectives including equipment availability, maintenance costs, and production requirements. The current condition-based maintenance service uses relatively simple heuristics for scheduling decisions that may not fully account for complex interactions between multiple equipment items and production constraints. Reinforcement learning agents could learn optimal policies through simulation using the digital twin models, then deploy the learned policies to guide real maintenance decisions. The agents would continue learning from actual outcomes, adapting their policies as equipment characteristics and operating conditions evolve.
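The simulation-trained agent idea can be illustrated with tabular Q-learning on a toy single-machine maintenance problem. The states, rewards, and transition probabilities are invented for the example; a real agent would train against the process-level digital twin instead.

```python
import random

# Toy sketch of tabular Q-learning for a single-machine maintenance policy.
# States, rewards, and transition probabilities are invented for the example.

WEAR_LEVELS = 5            # 0 = new, 4 = worn out
ACTIONS = ("run", "maintain")

def step(wear, action, rng):
    """Return (next_wear, reward) for the toy machine model."""
    if action == "maintain":
        return 0, -3.0                       # maintenance cost, wear reset
    if wear == WEAR_LEVELS - 1 and rng.random() < 0.7:
        return 0, -20.0                      # breakdown: large downtime cost
    next_wear = min(wear + (1 if rng.random() < 0.5 else 0), WEAR_LEVELS - 1)
    return next_wear, 1.0                    # one unit of production

def train(episodes=2000, alpha=0.2, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(w, a): 0.0 for w in range(WEAR_LEVELS) for a in ACTIONS}
    for _ in range(episodes):
        wear = 0
        for _ in range(50):                  # 50 decision steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(wear, a)])  # exploit
            nxt, reward = step(wear, action, rng)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(wear, action)] += alpha * (reward + gamma * best_next
                                          - q[(wear, action)])
            wear = nxt
    return q

q = train()
policy = {w: max(ACTIONS, key=lambda a: q[(w, a)]) for w in range(WEAR_LEVELS)}
# The learned policy runs the machine at low wear and maintains near wear-out.
```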
Computer vision could enhance ambient intelligence capabilities by analyzing images and videos from cameras deployed in manufacturing environments. Analysis of visual data could detect operator actions, identify equipment conditions, and recognize safety hazards that are difficult to capture with traditional sensors. Deep learning models for object detection, semantic segmentation, and action recognition could process visual data in real-time, generating structured events that feed into the life-cycle parameter monitoring and risk assessment modules. The visual information could also support remote diagnostics by enabling experts to see equipment conditions without traveling to sites.
The integration of AI/ML capabilities would require addressing several challenges including data quality and availability, model training and validation, explainability and trust, and operational deployment. Manufacturing environments may have limited historical failure data for training predictive models, requiring techniques such as transfer learning or simulation-based training. Model predictions must be explainable to gain trust from operators and maintenance personnel who will act on the recommendations. Deployed models must be monitored for performance degradation and retrained as equipment and operating conditions change. The AI/ML integration should leverage the existing service-oriented architecture by implementing models as microservices with well-defined interfaces, enabling independent development and deployment while maintaining integration with existing capabilities.
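The "models as microservices with well-defined interfaces" idea can be sketched as a versioned request/response contract that any model, learned or rule-based, can sit behind. All field names are hypothetical, and the stand-in model below is a trivial linear mapping used only to make the interface concrete.

```python
from dataclasses import dataclass

# Hypothetical sketch of a model-as-microservice contract. Field names are
# illustrative; any trained model could be swapped in behind the same
# interface without changing its consumers.

@dataclass
class PredictionRequest:
    equipment_id: str
    model_version: str
    sensor_values: dict   # channel name -> latest reading

@dataclass
class PredictionResponse:
    equipment_id: str
    model_version: str
    failure_probability: float
    explanation: str      # supports the explainability requirement

def handle_prediction(request: PredictionRequest) -> PredictionResponse:
    """Stand-in model: maps temperature onto [60, 100] degrees C linearly.
    A trained model would replace this body, not the interface."""
    temp = request.sensor_values.get("temperature_c", 0.0)
    prob = min(1.0, max(0.0, (temp - 60.0) / 40.0))
    reason = f"temperature_c={temp} mapped linearly onto [60, 100] C"
    return PredictionResponse(request.equipment_id, request.model_version,
                              prob, reason)

resp = handle_prediction(
    PredictionRequest("RBT-104", "v1", {"temperature_c": 80.0}))
```

Carrying the model version and an explanation in every response also eases the monitoring and retraining obligations noted above.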
9. Conclusions
This paper has presented a comprehensive service-oriented architecture for decision support in industrial life-cycle management that successfully integrated collaborative services, predictive maintenance, and knowledge management capabilities in real manufacturing environments. The multi-layered architecture comprising eight core collaborative services, three application services, and six life-cycle management services demonstrated how service-oriented principles can address the complex integration and coordination challenges inherent in modern manufacturing operations. The orchestration approach using a risk assessment module to continuously monitor life-cycle parameters and trigger appropriate services provided effective automated decision support while maintaining clear accountability and predictable behavior.
The industrial validation in automotive assembly and air conditioning manufacturing demonstrated substantial operational improvements. The quantitative results provide strong evidence that service-oriented architecture can deliver significant value in industrial settings when properly designed and implemented. The consistency of improvements across different manufacturing domains and company sizes suggests that the architectural patterns and service designs are broadly applicable rather than narrowly tailored to specific contexts.
The observed benefits must, however, be interpreted in context. The two industrial cases differed in sector, installed base, operational processes, and availability of baseline data, and the evaluation was conducted under real industrial conditions rather than in controlled laboratory settings. For this reason, generalization holds primarily at the architectural level: the results support the relevance of combining monitoring, structured knowledge, and service orchestration for life-cycle decision support, even though the exact numerical gains will depend on local organizational and technical conditions.
In practical terms, the impact of the proposed approach was reflected in performance indicators such as increased structured problem registration, shorter diagnostic and resolution times, reduced need for travel, lower spare-parts usage, and improved maintenance planning. These effects indicate that the architecture contributed not only to better information integration, but also to more efficient service execution and knowledge reuse in daily operation. The resources required for deployment were primarily related to system configuration, modeling of production units and life-cycle parameters, integration with available monitoring infrastructure, and training of the involved users. Although the exact deployment effort depended on the industrial setting, the case studies show that the approach is most valuable in environments where downtime, diagnostic delays, and fragmented maintenance knowledge already impose significant operational cost.
The analysis of evolution from traditional SOAP-based SOA to modern microservices architectures and Industry 4.0 technologies provides valuable insights for organizations considering modernization of existing systems or development of new service-based manufacturing systems. While specific technologies have evolved significantly since the original implementation, the fundamental architectural principles of modularity, loose coupling, service composition, and orchestration remain highly relevant. The integration opportunities with OPC UA for device interoperability, Asset Administration Shells for standardized digital representations, and digital twins for virtual modeling create pathways to enhance the original architecture’s capabilities while preserving its proven design patterns.
The lessons learned regarding service granularity, orchestration strategies, knowledge management integration, user interface design, system integration, organizational change management, security, and performance provide practical guidance for practitioners developing service-oriented manufacturing systems. These lessons emphasize that successful implementation requires attention to both technical and organizational factors, with particular emphasis on user-centered design, comprehensive knowledge management, and effective change management to support adoption of new work practices.
Future research should build on these validated architectural principles by exploring their reimplementation through contemporary Industry 4.0 technologies, with particular attention to interoperable asset representations, distributed service architectures, and explainable adaptive decision support.