Article

Modular and Distributed Supervisory Control Framework for Intelligent Micro-Manufacturing Systems with Unreliable Events

by Gaosen Dong, Zhengfeng Ming * and Hesuan Hu
School of Electro-Mechanical Engineering, Xidian University, Xi’an 710071, China
* Author to whom correspondence should be addressed.
Micromachines 2025, 16(10), 1076; https://doi.org/10.3390/mi16101076
Submission received: 27 August 2025 / Revised: 19 September 2025 / Accepted: 21 September 2025 / Published: 23 September 2025

Abstract

This paper presents a modular and distributed supervisory control integration framework for intelligent micro-manufacturing systems (MMSs) under event-level failures. Addressing the increasing demand for scalable and reliable supervisory control in both micro- and smart manufacturing, the proposed approach equips each subsystem with a detector automaton that classifies runtime states into Strictly robust, Recoverably robust, or Non-robust categories. Distributed supervisors then make real-time local decisions to ensure fault-tolerant evolution of system behaviors. Unlike conventional centralized or Petri net-based methods, the proposed automaton-based framework supports modular design and structural scalability. Quantitative comparisons show that the robustness-detection cost scales approximately linearly with the summed sizes of local graphs, indicating good structural scalability. Simulation studies validate the feasibility and scalability of the framework, demonstrating its effectiveness in maintaining production cycle reachability and its integration potential for micro-electro-mechanical systems (MEMS)-based production lines, micro-fabrication platforms, and smart factory environments. These results confirm that the proposed method can serve as a robust and deployable control layer for next-generation intelligent and micro-manufacturing integration architectures.

1. Introduction

With the rapid advancement of industrial information integration and the emergence of smart and micro-manufacturing paradigms, there is a critical need for scalable and robust supervisory control frameworks in automated manufacturing systems (AMSs). Modern AMSs increasingly require seamless integration between event-driven supervisory logic and industrial information systems, edge computing platforms, and smart devices to ensure safe, adaptive, and resilient operation under uncertainty. This requirement is particularly critical for micro-fabrication and MEMS production, where process outcomes are highly sensitive to dynamic conditions (e.g., surface roughness and feature/yield variability) [1,2]. While recent data-driven approaches can predict quality indicators or optimize yield, they do not provide a formal, verifiable supervisory logic to govern how the system should react under uncertainty [1,2]. Our work addresses this gap by proposing a distributed, automaton-based supervisory framework.
Supervisory control, originally proposed by Ramadge and Wonham [3], has become a key method to restrict the behavior of discrete-event systems (DESs) by enabling or disabling controllable events to enforce desired specifications. Unlike traditional feedback control in continuous systems, supervisory control offers a formal, event-driven framework that has been widely applied to industrial scenarios such as manufacturing [4], traffic management [5], communication protocols [6], and robotic coordination [7].
The theoretical foundations of supervisory control are built upon the notions of controllability, observability, and nonblockingness. Over time, these foundations have been extended to support critical features such as fault diagnosis [8,9,10], detectability [11,12,13], opacity [14,15,16], communication delays [17,18], and attack modeling and resilience [19,20,21]. These extensions aim to enhance the observability, security, and robustness of DESs in uncertain environments, which is essential in domains such as smart manufacturing, cyber–physical systems, and critical infrastructure.
Such robustness concerns are especially critical in smart and micro-manufacturing systems, including MEMS-integrated production lines, where unreliable events and subsystem failures may lead to safety violations or incomplete tasks. This motivates robust and distributed supervisory control frameworks capable of handling such uncertainties. Among the application domains of DES theory, AMSs are one of the most prominent real-world implementations. In practice, AMSs frequently encounter uncertainties, including event failures, communication latency, sensor faults, and external disturbances, which may cause system behaviors to deviate from their intended trajectories and result in safety violations or degraded performance. To mitigate such risks, robust supervisory control in AMSs has attracted increasing research interest, with the objective of preserving core behaviors, task reachability, and nonblockingness even in the presence of local anomalies, structural perturbations, or environmental variations [22,23,24,25,26,27,28,29,30].
Much of the existing literature on robustness in AMSs is based on Petri net modeling. For instance, [25,29] introduced the concept of maximal perfect resource-transition circuits (MPC) to characterize blocking states caused by resource failures. Other studies, such as [26], proposed synthesis methods using strong covering structures or saturated siphon constructs to design robust supervisors. Reachability graph-based methods for robustness analysis have recently become mainstream [24,27,28,30].
Despite the progress achieved in Petri net-based AMS robustness, most of these studies rely on centralized control architectures. Few unified frameworks exist that adopt finite automata as the modeling basis while integrating unreliable event modeling, robustness classification, distributed detector synthesis, and local enforcement into a cohesive control strategy. Compared with Petri nets, automaton-based modeling offers a more direct semantic alignment with supervisory control logic, better supports modular composition, and simplifies detector construction and implementation, which is particularly important for MMSs that are highly sensitive to event failures and resource perturbations.
To address this gap, this paper proposes a distributed detector-based robust supervisory control framework specifically designed for AMSs subject to unreliable events and partial observation. The proposed method supports runtime robustness classification and localized control decision-making while preserving the global system’s safety and liveness properties.
Unlike previous studies that focus on Petri net-based robustness [24,27,28,30], this work builds a detector-based framework under automaton semantics, which simplifies classification logic and supports distributed enforcement without global synthesis. Compared with existing Petri net-based robustness control methods [29,30], the proposed automaton-based approach offers the following benefits:
  • Explicit state and transition structures better aligned with supervisory control logic;
  • Easier construction of local detectors via synchronous composition and projection;
  • Avoidance of state explosion during reachability graph enumeration.
Beyond formal clarity and modular implementation, the automaton-based strategy aligns well with runtime deployment needs in intelligent MMS/MEMS platforms because detector automata are simple to construct and deploy. This matters for AMSs that are highly sensitive to event failures and resource perturbations.
In the industrial context, modern micro-manufacturing cells (e.g., laser micromachining, micro-assembly/packaging, and micro-inspection) are organized as small DES subsystems that share resources such as vacuum pumps, precision stages, grippers, and inspection microscopes. Because parts and wafers are fragile and tolerances are tight, these cells are highly sensitive to event-level faults (loss of vacuum, misalignment, tool jam, vision rejection). Our framework fits this context for several reasons: (i) Event-level enforcement: robustness is decided at the event level, matching the granularity of the above faults. (ii) Local real-time decisions: supervisors inspect detector labels over the local event alphabet and a small set of enabled events, enabling millisecond-scale reactions on PLC/ROS without global search. (iii) Modularity in production cells: the detector-based design isolates faults locally and supports incremental cell integration across reconfigurable lines.
The main contributions of this paper include the following:
  • A modular and distributed supervisory control integration framework is proposed for AMSs/MMSs, supporting robust execution under event failures and seamless integration into industrial information systems without requiring centralized coordination or global model construction.
  • Local detector automata are designed for each subsystem to classify operational states into Strictly robust, Recoverably robust, or Non-robust categories, thereby enabling real-time decision-making, enhanced information integration, and local control based on observed data streams.
  • The proposed control strategy offers a scalable and adaptable industrial solution that avoids unsafe trajectories, supports controller reusability and system extensibility, and is compatible with deployment on PLC-based, edge computing, and industrial information management platforms. It is also suitable for micro-manufacturing execution systems and MEMS-oriented production platforms.
The remainder of this paper is organized as follows. Section 2 introduces the modeling framework and defines event unreliability. Section 3 presents the problem formulation and robustness classification. Section 4 details the construction of detectors and the distributed control strategy. Section 5 provides simulation validation. Section 6 concludes this paper.

2. Preliminaries and Modeling for Automated and Micro-Manufacturing Systems

This section introduces basic definitions and notations used throughout the paper. We establish a unified modeling framework based on modular DESs, define relevant language operations, and introduce event classifications that will be used to formally express robustness concepts.

2.1. Modular System Model

We consider an AMS/MMS composed of $N$ local modules. Each module is modeled as a deterministic finite-state automaton:
$$G_i = (X_i, E_i, f_i, \Gamma_i, x_{0,i}), \quad i = 1, \ldots, N,$$
where $X_i$ is the state set, $E_i$ the event set, $f_i$ the transition function, $\Gamma_i$ the set of enabled events, and $x_{0,i}$ the initial state. The global system is given by synchronous composition:
$$G = \big\|_{i=1}^{N} G_i = (Q, E, f, \Gamma, Q_0),$$
with $Q = \prod_{i=1}^{N} X_i$, $E = \bigcup_{i=1}^{N} E_i$, and $Q_0 = (x_{0,1}, \ldots, x_{0,N})$. Synchronization occurs over shared events.
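As a minimal sketch of the composition above, the following Python function builds the reachable part of the synchronous composition of two deterministic automata. The dictionary encoding (`events`, `trans`, `init`), the function name `sync_compose`, and the toy product/resource pair are illustrative assumptions, not part of the paper's model.

```python
def sync_compose(a, b):
    """Reachable synchronous composition of two deterministic automata.

    Each automaton is a dict (hypothetical encoding) with keys
    'events' (set), 'trans' ({(state, event): next_state}) and 'init'.
    """
    shared = a["events"] & b["events"]
    init = (a["init"], b["init"])
    states, trans, frontier = {init}, {}, [init]
    while frontier:
        xa, xb = frontier.pop()
        for e in a["events"] | b["events"]:
            na = a["trans"].get((xa, e))
            nb = b["trans"].get((xb, e))
            if e in shared:
                # A shared event fires only if both components enable it.
                if na is None or nb is None:
                    continue
                nxt = (na, nb)
            elif e in a["events"]:
                # Private event of a: b keeps its state.
                if na is None:
                    continue
                nxt = (na, xb)
            else:
                # Private event of b: a keeps its state.
                if nb is None:
                    continue
                nxt = (xa, nb)
            trans[((xa, xb), e)] = nxt
            if nxt not in states:
                states.add(nxt)
                frontier.append(nxt)
    return {"events": a["events"] | b["events"], "trans": trans,
            "init": init, "states": states}


# Toy example (assumed): a product that needs a resource for event "a".
product = {"events": {"a", "b"}, "init": "p0",
           "trans": {("p0", "a"): "p1", ("p1", "b"): "p0"}}
resource = {"events": {"a", "c"}, "init": "idle",
            "trans": {("idle", "a"): "busy", ("busy", "c"): "idle"}}
g = sync_compose(product, resource)
```

In this toy model the shared event `a` is blocked in the composed state `("p0", "busy")` because the resource component does not enable it there, which is the mechanism later used to model resource-side unreliability.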

2.2. Event Classification

The global event set $E$ is classified as follows:
  • Control-based: $E = E_c \cup E_{uc}$, where $E_c$ denotes the set of controllable events and $E_{uc}$ denotes the set of uncontrollable events.
  • Observation-based: $E = E_o \cup E_{uo}$, where $E_o$ denotes the set of observable events and $E_{uo}$ denotes the set of unobservable events.
  • Reliability-based: $E = E_r \cup E_{ur}$, where $E_r$ denotes the set of reliable events and $E_{ur}$ denotes the set of unreliable events.
Each local supervisor $S_i$ observes only $E_i^o \subseteq E_i$ and controls only $E_i^c \subseteq E_i \setminus E_{uc}$.

2.3. Language and Projections

Let $E^*$ denote the set of all finite event sequences. The language generated by $G$ is
$$L(G) = \{ s \in E^* \mid f(x_0, s)! \}.$$
Other useful constructs include the following:
  • Prefix closure: $\overline{L(G)} = \{ s \in E^* \mid \exists t \in E^*,\ st \in L(G) \}$;
  • Post-language: $L(G)/s = \{ t \in E^* \mid st \in L(G) \}$;
  • Projection: For $E_i^o \subseteq E$, define $P_i : E^* \to (E_i^o)^*$, erasing events not in $E_i^o$.
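The language operations above can be sketched on finite sets of strings as follows. Strings are encoded as Python tuples of event names; the function names and the sample language `K` are assumptions made for illustration only.

```python
def reached_state(trans, x0, s):
    """Return the state reached by s from x0, or None if f(x0, s) is undefined."""
    x = x0
    for e in s:
        if (x, e) not in trans:
            return None
        x = trans[(x, e)]
    return x


def prefix_closure(language):
    """All prefixes (including the empty string) of strings in a finite language."""
    return {t[:k] for t in language for k in range(len(t) + 1)}


def post_language(language, s):
    """Continuations t such that s followed by t lies in the language."""
    s = tuple(s)
    return {t[len(s):] for t in language if t[:len(s)] == s}


def project(s, observable):
    """Natural projection: erase every event not in the observable set."""
    return tuple(e for e in s if e in observable)


# Hypothetical finite language with a single three-event string.
K = {("e1", "e2", "e3")}
```

For example, `project(("e1", "e4", "e2"), {"e1", "e2"})` keeps only `e1` and `e2` in their original order.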

2.4. Robustness-Related Language Sets

To characterize system robustness, we define two special language sets.
The first is the unreliable string set:
$$E_{ur}^{+} = \{ s \in E^* \mid \exists e \in E_{ur},\ e \in s \},$$
which includes all strings containing at least one unreliable event.
The second is the terminal language $L_{end}$.
For each $G_i$, we define a set of semantically meaningful terminal states $X_i^{end} \subseteq X_i$, representing successful completion of production cycles. These states are declared based on the model’s physical context (e.g., reaching the last processing step of a product).
The global terminal state set is
$$X_{end} = \prod_{i=1}^{N} X_i^{end},$$
and the terminal language is
$$L_{end} = \{ s \in L(G) \mid f(x_0, s) \in X_{end} \}.$$
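To make the two sets concrete, the sketch below tests membership of a single string in the unreliable string set and in the terminal language. The three-step production cycle, the choice of `e2` as the only unreliable event, and the helper names are hypothetical and serve only as an illustration.

```python
def contains_unreliable(s, e_ur):
    """True iff the string s contains at least one unreliable event."""
    return any(e in e_ur for e in s)


def in_terminal_language(trans, x0, s, x_end):
    """True iff s is executable from x0 and ends in a terminal state."""
    x = x0
    for e in s:
        if (x, e) not in trans:
            return False  # s is not in L(G)
        x = trans[(x, e)]
    return x in x_end


# Hypothetical one-product cycle: x0 -e1-> x1 -e2-> x2 -e3-> x0.
cycle = {("x0", "e1"): "x1", ("x1", "e2"): "x2", ("x2", "e3"): "x0"}
X_END = {"x0"}   # cycle completed, ready for restart
E_UR = {"e2"}    # assumed unreliable event

full_cycle = ("e1", "e2", "e3")
```

Here `full_cycle` belongs to the terminal language but also to the unreliable string set, so it would be excluded under Strict robustness while remaining a legitimate completed cycle.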

2.5. Illustrative Example: A Structured AMS Model

2.5.1. AMS Layout and Structural Motivation

To provide physical insight into our modeling framework, we begin with a realistic layout of an AMS as shown in Figure 1. The system consists of four input/output conveyor lines, two industrial robots, and two CNC-type machines. The layout captures typical component interactions and material transport paths in a real-world AMS.
Based on this structure, we now abstract the behavior of each subsystem using modular finite-state automata, as described below.

2.5.2. Product Automata

Each product automaton $G_{P_i} = (X_{P_i}, E_{P_i}, f_{P_i}, x_{0,P_i})$ describes the flow of a specific product type. Figure 2 shows the four product automata $G_{P_1}$, $G_{P_2}$, $G_{P_3}$, and $G_{P_4}$. Transitions represent production steps; certain transitions such as $e_3$, $e_6$, $e_{10}$, and $e_{13}$ correspond to final product completion steps and return the system to the initial states $x_0$, $x_3$, $x_6$, and $x_{10}$, respectively. These states serve as terminal states in our modeling, indicating the completion of a full production cycle and readiness for the next.

2.5.3. Resource Automata

Each resource automaton $G_{R_j} = (X_{R_j}, E_{R_j}, f_{R_j}, x_{0,R_j})$ models the operational status of a shared resource (e.g., machine, buffer, workstation). Figure 3 shows four resource automata $G_{R_1}$ to $G_{R_4}$ that synchronize with corresponding product transitions to coordinate resource usage. For example, events $e_5$, $e_8$, $e_{10}$, and $e_{13}$ appear in both product and resource automata to represent shared transitions.

2.5.4. Global Synchronous System

The entire AMS is modeled as the synchronous composition
$$G = \Big( \big\|_{i=1}^{4} G_{P_i} \Big) \,\Big\|\, \Big( \big\|_{j=1}^{4} G_{R_j} \Big).$$
Here, each synchronization occurs over the intersection of shared events, such as $e_3$, $e_6$, $e_{10}$, and $e_{13}$, which represent jointly executed operations between product lines and corresponding resources. The transition structure of $G$ encodes all inter-module dependencies and provides the operational foundation for distributed supervision.
Figure 4 illustrates the global synchronization structure of the AMS, showing interactions among all product and resource modules via shared events.
In practice, due to the state explosion of full synchronization, we analyze subsystems (e.g., $G_{P_1} \| G_{R_1} \| G_{P_2}$) in subsequent sections to demonstrate key concepts such as detector construction and robust supervisory control.
Each final event $e_3$, $e_6$, $e_{10}$, and $e_{13}$ leads the system back to its initial states, indicating a completed processing cycle. Therefore, we define the terminal state set as
$$Q_{end} = \Big\{ (x_P, x_R) \in \Big( \prod_{i=1}^{4} X_{P_i} \Big) \times \Big( \prod_{j=1}^{4} X_{R_j} \Big) \;\Big|\; \forall i \in \{1, \ldots, 4\} : x_{P_i} = x_{0,P_i} \Big\}.$$
Rationale: In our AMS model, each terminal product event ($e_3$, $e_6$, $e_{10}$, $e_{13}$) synchronizes with resource-release transitions that return the involved resources to their idle nodes (see Figure 3). Hence, no explicit constraint on the resource component of $(x_P, x_R)$ is required.
The terminal language is then
$$L_{end} = \{ s \in L(G) \mid f(Q_0, s) \in Q_{end} \}.$$

2.5.5. Unreliable Events and Resource Failures

In this AMS framework, unreliable events represent potential failures occurring during interactions with shared resources. We focus specifically on faults related to the operation of resources $G_{R_3}$ and $G_{R_4}$.
Let $E_{ur} \subseteq E$ denote the set of unreliable events. We define the following:
  • $E_{ur} = \{ e_9, e_{10}, e_{12}, e_{13} \}$.
These events represent transitions in product automata that synchronize with resource automata $G_{R_3}$ and $G_{R_4}$. Failures in these resources (e.g., machine breakdowns or unavailable capacity) are modeled by assuming that the associated events may be disabled on the resource side. Supervisory strategies developed in later sections will aim to mitigate the risks posed by these unreliable events.
Definition 1.
An event $e \in E$ is called unreliable if it is shared with a faulty resource automaton and may fail to be enabled due to unexpected faults. The collection of such events is denoted by $E_{ur}$.
Proposition 1.
If $e \in E_{ur}$ is an unreliable event, then under fault conditions, $e$ is not guaranteed to be enabled in the global system $G$ even if all product-side conditions are satisfied.
Proof.
Since $G$ is the synchronous composition of product and resource automata, event $e \in E_{ur}$ is enabled in $G$ at a global state $q = (x_P, x_R)$ only if it is enabled in both the product component $G_P$ and the corresponding resource component $G_R$. If $G_R$ experiences a fault that disables $e$ (e.g., resource is busy, failed, or unavailable), then $e$ is not enabled in $G$ regardless of the status of $G_P$. Thus, unreliability at the resource level can directly disable the execution of $e$ in the global model.    □
Recall from Section 2.5.4 that only strings leading to complete and ready-for-restart configurations, i.e., strings in $L_{end}$, are considered safe terminations.

3. Problem Formulation for Distributed Robust Supervision

3.1. Problem Formulation

Let $G = (Q, E, f, Q_0)$ be the global plant defined in Section 2, where $E$ is already partitioned into
$$E = E_c \cup E_{uc}, \quad E = E_o \cup E_{uo}, \quad E = E_r \cup E_{ur}.$$
Each local supervisor $S_i$ is defined over a partial alphabet $E_i \subseteq E$, observes a subset $E_i^o \subseteq E_i$, and controls $E_i^c \subseteq E_i \cap E_c$.
Define the local observation projection:
$$P_i : E^* \to (E_i^o)^*,$$
which erases events not in $E_i^o$.
The local supervisor is a map:
$$S_i : (E_i^o)^* \to 2^{E_i^c},$$
and the joint distributed supervisor is
$$S(s) = \bigcap_{i=1}^{N} S_i(P_i(s)).$$
Then the controlled behavior of the system is
$$L(S/G) = \{ s \in L(G) \mid \forall k \le |s|,\ s_k \in S(s_{<k}) \cup E_{uc} \}.$$
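The definition of the controlled behavior can be illustrated with a small membership check. This is a sketch under stated assumptions: each local decision is lifted to the global alphabet by leaving events outside $E_i$ unrestricted, so that only supervisors that know an event can veto it; local observation uses $E_i$ itself (as in the full-observation setting); and the toy plant, the two supervisors, and all function names are hypothetical.

```python
def project(s, alphabet):
    """Natural projection of a string (tuple of events) onto an alphabet."""
    return tuple(e for e in s if e in alphabet)


def joint_decision(local_supervisors, s, global_events):
    """Fuse local decisions after the global string s.

    local_supervisors: list of (E_i, decide_i), where decide_i maps a
    locally observed string to the set of local events it enables.
    Assumption: supervisor i never restricts events outside E_i.
    """
    enabled = set(global_events)
    for alphabet, decide in local_supervisors:
        local_ok = decide(project(s, alphabet))
        enabled &= local_ok | (set(global_events) - alphabet)
    return enabled


def in_controlled_language(s, plant_trans, q0, local_supervisors,
                           global_events, e_uc=frozenset()):
    """True iff s is in L(G) and every event was enabled when it occurred."""
    q = q0
    for k, e in enumerate(s):
        if (q, e) not in plant_trans:
            return False
        allowed = joint_decision(local_supervisors, s[:k], global_events)
        if e not in allowed | set(e_uc):
            return False
        q = plant_trans[(q, e)]
    return True


# Hypothetical plant and supervisors.
EVENTS = {"a", "b", "c"}
plant = {("q0", "a"): "q1", ("q0", "b"): "q2",
         ("q1", "c"): "q3", ("q3", "b"): "q0"}
sup1 = ({"a", "b"}, lambda h: {"a", "b"})
# sup2 forbids "b" until it has observed at least one local event.
sup2 = ({"b", "c"}, lambda h: {"c"} if len(h) == 0 else {"b", "c"})
supervisors = [sup1, sup2]
```

With these definitions, the string `("a", "c", "b")` is accepted, whereas `("b",)` is rejected because the second supervisor disables `b` initially.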
Assumption 1.
(Full controllability and observability): Although the global event set includes uncontrollable ($E_{uc}$) and unobservable ($E_{uo}$) events, we assume that all locally relevant events are controllable and observable, i.e., $E_i^c = E_i$ and $E_i^o = E_i$. This assumption enables the focus to remain on robustness enforcement under unreliable events while avoiding complications due to uncontrollability or unobservability. The extension to partial observation and limited control authority will be pursued in future work.
Rationale: We adopt full controllability and observability locally to isolate the effect of unreliable events on robustness detection. This matches many PLC/ROS-based AMS cells where event execution and sensing at the cell level are fully actuated and instrumented; the treatment of limited control/partial observation is deferred to future work.

3.2. Distributed Robustness Criteria

We consider two types of robustness objectives:
Definition 2.
The system is Strictly robust under $S$ if
$$L(S/G) \cap E_{ur}^{+} = \emptyset,$$
that is, no string executed under control contains any unreliable event.
Definition 3.
The system is Recoverably robust under $S$ if
$$\forall s \in \overline{L(S/G)},\ \exists s_1 \in (E \setminus E_{ur})^* \ \text{such that}\ s s_1 \in L_{end}.$$
That is, for every prefix of the controlled behavior, there exists a continuation string without any unreliable events that drives the system into a terminal state.
Definition 4.
Given a set of subsystems $\{G_i\}$ with individual detector automata $\{D_i\}$, we say the system is distributedly robust with respect to event failures in $E_{ur}$ if and only if the following hold: (i) the local robustness labels of $D_i$ are sound and complete with respect to the local behavior of $G_i$; and (ii) the composition of local robustness classifications under the merging rule $\phi_{global}$ ensures global nonblocking and task reachability.
These robustness criteria reflect different tolerance levels to resource failures and will guide the synthesis of distributed supervisors in subsequent sections.
To further clarify these robustness definitions, Figure 5 illustrates three representative execution paths from the AMS model $G$:
(i) Top path: Violates both Strict and Recoverable robustness, as it executes $e_9 \in E_{ur}$ and no further continuation reaches a terminal state.
(ii) Middle path: Satisfies Strict robustness by using only reliable events and reaching a designated terminal state.
(iii) Bottom path: Satisfies Recoverable robustness by continuing through reliable events after $e_{12} \in E_{ur}$ to reach a terminal state.
Lemma 1.
If $s \in L(S/G)$ and $s$ contains an event $e \in E_{ur}$, then the supervisor $S$ cannot satisfy Strict robustness.
Proof.
By definition of Strict robustness, the set $L(S/G)$ must be disjoint from $E_{ur}^{+}$. If $s$ contains any $e \in E_{ur}$, then $s \in E_{ur}^{+}$, thus violating $L(S/G) \cap E_{ur}^{+} = \emptyset$. Therefore, $S$ fails to achieve Strict robustness.    □
The robustness distinctions illustrated above motivate a clear analytical foundation. Lemma 1 formally states the inherent limitation imposed by Strict robustness, which completely forbids occurrences of unreliable events.

3.3. Problem Statement

Problem: Given the global plant $G = \big\|_{i=1}^{N} G_i$, local alphabets $(E_i^o, E_i^c)$, and event partitions $(E_{ur}, E_{uc}, E_{uo})$, synthesize distributed supervisors $\{S_i\}_{i=1}^{N}$ such that the resulting behavior satisfies Strict or Recoverable robustness.
Key technical challenges: Each $S_i$ only observes partial behaviors and controls local events. Ensuring that the global behavior
$$L_{global} = \bigcap_{i=1}^{N} P_i^{-1}(L(S_i/G_i))$$
remains robust requires careful coordination under partial information.
Main conceptual challenges include the following:
  • Limited local knowledge: Although all events are assumed to be locally observable and controllable in this work, real-world systems may include unobservable or uncontrollable events, which complicate supervision.
  • Synchronization ambiguity: The execution of a shared event may depend on the state of another module that is not visible to the local supervisor.
  • Fault propagation risk: A single unreliable event can propagate failures through multiple modules unless proactively prevented.

4. Distributed Robust Supervisor Design

To enable robust supervision under partial observation and unreliable events, this section proposes a distributed synthesis strategy that decomposes global analysis into localized decisions.
Centralized robust supervisory synthesis typically requires the explicit construction of the global plant $G = \big\|_{i=1}^{N} G_i$, which suffers from severe state explosion due to synchronous product operations. In the presence of unreliable events, the analysis of global robust reachability becomes even more intractable, as failure propagation must be tracked across all subsystems.
Centralized synthesis suffers from lack of modularity, as any subsystem update requires re-synthesizing the entire global model. This hinders scalability, adaptability, and practical implementation.
In contrast, the proposed distributed approach constructs local detectors $D_i$ and supervisors $S_i$ without requiring the global synchronous product, leveraging only local event structures and known unreliable events. This enables scalable synthesis, localized diagnosis, and runtime efficiency while still preserving global robustness guarantees through conservative decision fusion.
In systems composed of multiple interacting automata with local observations and possible event unreliability, centralized control methods often suffer from state explosion and lack of structural scalability. Specifically, the construction of the global plant $G = G_1 \| G_2 \| \cdots \| G_N$ and the corresponding monolithic supervisor becomes impractical as the number of subsystems increases or the event space becomes dense. (Throughout this paper, all events are observable and controllable; ‘local observations’ means each detector $D_i$ evolves on its own alphabet $E_i^o$, not that some events are unobservable.)
To overcome these limitations, we propose a distributed robustness framework that decomposes the control synthesis problem into localized robustness detection and enforcement tasks. The core idea is to endow each subsystem $G_i$ with a local detector $D_i$ that classifies its states into Strictly robust, Recoverably robust, or Non-robust according to whether reliable paths to local terminal states exist. These detectors are constructed using only the state space of $G_i$ and the known unreliable event set $E_{ur}$.
Formally, for each local observation history $s_o \in (E_i^o)^*$, the supervisor $S_i$ determines the current state $y \in Y_{D_i}$ of the detector and applies a conservative event-enablement rule:
$$S_i(s_o) = \begin{cases} E_i^c & \text{if } \varphi_i(y) = \text{Strict}, \\ E_i^c \setminus \{ e \in E_i^c \mid e \to \text{Non-robust} \} & \text{if } \varphi_i(y) = \text{Recoverable}, \\ \emptyset & \text{if } \varphi_i(y) = \text{Non-robust}. \end{cases}$$
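The enablement rule can be sketched as a lookup on the detector's current state. One interpretive assumption is made here and should be read as such: an event is taken to "lead to Non-robust" when the detector transition on that event from the current state reaches a state labeled Non-robust. The detector encoding, labels, and names below are hypothetical.

```python
def local_supervisor(y, label, det_trans, controllable):
    """Events enabled by a local supervisor at detector state y.

    label:       {detector_state: "Strict" | "Recoverable" | "Non-robust"}
    det_trans:   {(detector_state, event): next_detector_state}
    Assumption:  "e leads to Non-robust" means det_trans[(y, e)] is
                 labeled Non-robust.
    """
    if label[y] == "Strict":
        return set(controllable)
    if label[y] == "Recoverable":
        risky = {e for e in controllable
                 if (y, e) in det_trans
                 and label[det_trans[(y, e)]] == "Non-robust"}
        return set(controllable) - risky
    return set()  # Non-robust: disable every controllable event


# Hypothetical four-state detector.
LABEL = {"y0": "Strict", "y1": "Recoverable",
         "y2": "Non-robust", "y3": "Strict"}
DET = {("y0", "a"): "y1", ("y1", "b"): "y2", ("y1", "c"): "y3"}
E_C = {"a", "b", "c"}
```

In this toy detector, the supervisor at `y1` disables only `b`, since `b` is the one event that would move the detector into a Non-robust state.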
Unlike traditional methods, our framework ensures that supervisors avoid unsafe behaviors without global coordination or fault observability. The distributed strategy guarantees that the global language
$$L_{global} = \bigcap_{i=1}^{N} P_i^{-1}(L(S_i/G_i))$$
remains within the Strictly or Recoverably robust trajectories of the overall system, as proven in Theorem 1.
This design philosophy transforms the robustness problem from a centralized model-checking challenge into a modular, scalable synthesis approach, enabling large-scale implementation across fault-prone and information-constrained systems.

4.1. Distributed Robust State Detectors

The cornerstone of distributed robust supervision is the ability of local detectors to accurately classify the system states with respect to robustness. Distributed robust state detectors perform this function by analyzing local event sequences and predicting future execution outcomes.

Formal Definition and Construction

Formally, a local robust state detector associated with the local supervisor $S_i$ is defined as
$$D_i = (Y_{D_i}, E_i^o, f_{D_i}, y_{D_i}^{0}, \varphi_i),$$
where
  • $Y_{D_i}$ is the finite set of detector states;
  • $E_i^o$ is the locally observable event set;
  • $f_{D_i} : Y_{D_i} \times E_i^o \to Y_{D_i}$ is the transition function;
  • $y_{D_i}^{0}$ is the initial state of the detector;
  • $\varphi_i : Y_{D_i} \to \{ \text{Strict}, \text{Recoverable}, \text{Non-robust} \}$ is the robustness classification function.
Detectors evolve on the locally observable alphabet $E_i^o$, while the robustness classification only needs the reliable/unreliable split inside the local alphabet $E_i$ inherited from the global partition (see the event-partition step below).
The construction method is as follows:
  • Define each local subsystem as a combination of a product automaton and its relevant resource automata:
    $$G_i = G_{P_i} \,\Big\|\, \Big( \big\|_{j \in J_i} G_{R_j} \Big),$$
    where $J_i$ indexes the resource automata relevant to $G_{P_i}$.
  • Track reachable states in the local subsystem based solely on local observable events.
  • Event partition: Globally, the event set is partitioned as $E = E_r \cup E_{ur}$ into reliable and unreliable events. For a local subsystem $G_i$ with alphabet $E_i \subseteq E$ and locally observable subset $E_i^o \subseteq E_i$, we inherit the global partition by intersection:
    $$E_{rel}^{(i)} := E_i \cap E_r, \quad E_{ur}^{(i)} := E_i \cap E_{ur}, \quad E_i = E_{rel}^{(i)} \cup E_{ur}^{(i)}.$$
    Events not in $E_i$ do not occur in $G_i$ and are irrelevant for local reasoning. The detector transitions use locally observable labels $E_i^o$, while robustness classification only needs the reliable/unreliable distinction inside $E_i$ as defined above. In Algorithm 1, we therefore operate on $G_i$ with two label sets: (i) $E_i$ when we compute forward reachability of arbitrary local prefixes, and (ii) $E_{rel}^{(i)}$ when we require reliable prefixes or suffixes. Throughout the algorithm we write $E_{rel}$ (resp. $E_{ur}$) for $E_{rel}^{(i)}$ (resp. $E_{ur}^{(i)}$) to simplify notation.
  • Classify states as follows:
    • Strict robustness: A state $y \in Y_i$ is Strict if there exists a reliable prefix $\sigma \in E_{rel}^*$ from $Y_i^0$ to $y$ and a reliable suffix $\tau \in E_{rel}^*$ from $y$ to a terminal state in $Y_i^{end}$. Equivalently, $y$ lies on a path that uses only reliable events up to $y$ and can continue by reliable events to $Y_i^{end}$.
    • Recoverable robustness: A state $y \in Y_i$ is Recoverable if there exists a path $s \in L(G_i)$ from $Y_i^0$ to $y$ such that every prefix $\bar{s}$ of $s$ reaches a state that admits a reliable suffix to $Y_i^{end}$. Hence, Strict and Recoverable share the same ‘reliable-suffix-to-terminal’ property; Strict additionally requires a reliable prefix, while Recoverable allows unreliable prefixes, provided that every prefix can still be recovered by some reliable suffix.
    • Non-robust: A state from which $Y_i^{end}$ cannot be reached using reliable events only. In Algorithm 1, every state that is neither Strict nor Recoverable receives this label.
Robust State Classification Algorithm 1 follows below.
Algorithm 1 Robust State Classification with Prefix–Suffix Semantics.
Require: Local automaton $G_i = (Y_i, E_i, \to_i)$; initial set $Y_i^0$; terminal set $Y_i^{end}$; reliable events $E_{rel} \subseteq E_i$; unreliable events $E_{ur} = E_i \setminus E_{rel}$.
Ensure: Label $\varphi_i : Y_i \to \{ \text{Strict}, \text{Recoverable}, \text{Non-robust} \}$.
  1: $B_{rel} \leftarrow$ BackwardClosure($Y_i^{end}$, $E_{rel}$) ▹ states admitting a reliable suffix
  2: $R \leftarrow$ ForwardClosureRestricted($Y_i^0$, $E_i$, $B_{rel}$) ▹ prefix stays inside $B_{rel}$
  3: $S \leftarrow$ ForwardClosureRestricted($Y_i^0$, $E_{rel}$, $B_{rel}$) ▹ reliable prefix inside $B_{rel}$
  4: $Rec \leftarrow R \setminus S$ ▹ Recoverable but not Strict
  5: for all $y \in Y_i$ do
  6:     if $y \in S$ then
  7:         $\varphi_i(y) \leftarrow$ Strict
  8:     else if $y \in Rec$ then
  9:         $\varphi_i(y) \leftarrow$ Recoverable
10:     else
11:         $\varphi_i(y) \leftarrow$ Non-robust
12:     end if
13: end for
The graph primitives invoked by Algorithm 1 are specified in Algorithm 2.
The connection to Algorithm 1 is described below.
Using the localized sets above, we first compute the reliable-suffix basin $B_{rel} :=$ BackwardClosure($Y_i^{end}$, $E_{rel}^{(i)}$), i.e., states that can reach $Y_i^{end}$ by reliable events only. A state has an arbitrary (possibly unreliable) prefix whose every prefix remains Recoverable iff it is forward-reachable from $Y_i^0$ within $B_{rel}$ using labels in $E_i$. A state is Strict iff, in addition, there exists a reliable prefix inside $B_{rel}$, obtained by forward closure from $Y_i^0$ with labels in $E_{rel}^{(i)}$. Consequently, the classification realizes the logical intent: Strict states admit a reliable prefix and a reliable suffix; Recoverable states admit a path whose every prefix can be recovered by some reliable suffix; and $\text{Strict} \subseteq \text{Recoverable}$ holds by construction.
Algorithm 2 Graph primitives used in Algorithm 1.
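  1: function BackwardClosure($T$, $A$) ▹ least fixpoint of $X \mapsto T \cup \mathrm{Pre}(X, A)$
  2:     $R \leftarrow T$
  3:     repeat
  4:         $R \leftarrow T \cup \{ y \in Y_i \mid \exists a \in A,\ \exists y' \in R : y \xrightarrow{a} y' \}$
  5:     until $R$ no longer changes
  6:     return $R$
  7: end function
  8: function ForwardClosureRestricted($S$, $A$, $C$) ▹ BFS restricted to $C$
  9:     $Q \leftarrow S \cap C$; $R \leftarrow S \cap C$
10:     while $Q \neq \emptyset$ do
11:         $y \leftarrow$ pop($Q$)
12:         for all $y \xrightarrow{a} y'$ with $a \in A$ do
13:             if $y' \in C$ and $y' \notin R$ then
14:                 $R \leftarrow R \cup \{ y' \}$; push($Q$, $y'$)
15:             end if
16:         end for
17:     end while
18:     return $R$
19: end function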
Complexity: Let $n_i := |Y_i|$ be the number of states of the local automaton $G_i$, $m_i := |{\to_i}|$ the number of transitions, and $m_i^{\mathrm{rel}} := |\{(y, a, y') \in {\to_i} \mid a \in E_{\mathrm{rel}}^{(i)}\}|$ the number of reliable-labeled transitions. All primitives used by Algorithm 1 are graph traversals (BFS/fixpoints) on finite graphs:
  • $\mathrm{BackwardClosure}(Y_i^{\mathrm{end}}, E_{\mathrm{rel}}^{(i)})$: reverse-BFS/least-fixpoint over reliable edges; it runs in $O(n_i + m_i^{\mathrm{rel}})$ time and $O(n_i)$ memory.
  • $\mathrm{ForwardClosureRestricted}(Y_i^{0}, E_i, B_{\mathrm{rel}})$: BFS restricted to $B_{\mathrm{rel}}$; it runs in $O(n_i + m_i)$ time and $O(n_i)$ memory.
  • $\mathrm{ForwardClosureRestricted}(Y_i^{0}, E_{\mathrm{rel}}^{(i)}, B_{\mathrm{rel}})$: BFS on the reliable subgraph inside $B_{\mathrm{rel}}$; it runs in $O(n_i + m_i^{\mathrm{rel}})$ time and $O(n_i)$ memory.
  • The final labeling loop over $Y_i$ is $O(n_i)$.
Hence, Algorithm 1 runs in overall time $O(n_i + m_i)$ and uses $O(n_i)$ memory, i.e., it is linear in the size of the local graph. Summed over all subsystems, labeling every detector costs $\sum_{i=1}^{N} O(n_i + m_i)$.
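As a reading aid, Algorithms 1 and 2 can be sketched in a few lines of Python. This is a minimal sketch and not the authors' Java implementation; the function names, the triple-based transition encoding, and the five-state toy detector are all assumptions made for illustration. The backward closure is written as a worklist search, which computes the same least fixpoint as the repeat-until loop of Algorithm 2 while visiting each edge at most once.

```python
from collections import deque


def backward_closure(targets, labels, transitions):
    """States that can reach `targets` using only events in `labels`."""
    preds = {}
    for src, event, dst in transitions:
        if event in labels:
            preds.setdefault(dst, set()).add(src)
    reached = set(targets)
    queue = deque(reached)
    while queue:
        state = queue.popleft()
        for pred in preds.get(state, ()):
            if pred not in reached:
                reached.add(pred)
                queue.append(pred)
    return reached


def forward_closure_restricted(sources, labels, region, transitions):
    """States reachable from `sources` via `labels` without leaving `region`."""
    succs = {}
    for src, event, dst in transitions:
        if event in labels:
            succs.setdefault(src, set()).add(dst)
    reached = set(sources) & set(region)
    queue = deque(reached)
    while queue:
        state = queue.popleft()
        for nxt in succs.get(state, ()):
            if nxt in region and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


def classify(states, initial, terminal, events, unreliable, transitions):
    """Label every state as Strict, Recoverable, or Non-robust (Algorithm 1)."""
    reliable = set(events) - set(unreliable)
    b_rel = backward_closure(terminal, reliable, transitions)
    r = forward_closure_restricted(initial, set(events), b_rel, transitions)
    s = forward_closure_restricted(initial, reliable, b_rel, transitions)
    labels = {}
    for y in states:
        if y in s:
            labels[y] = "Strict"
        elif y in r:
            labels[y] = "Recoverable"
        else:
            labels[y] = "Non-robust"
    return labels


# Hypothetical toy detector (not one of the paper's subsystems); "u" is unreliable.
TRANSITIONS = [
    ("y0", "a", "y1"), ("y0", "u", "y2"), ("y1", "b", "y3"),
    ("y1", "d", "y4"), ("y2", "c", "y3"), ("y4", "u", "y3"),
]
labels = classify(
    states=["y0", "y1", "y2", "y3", "y4"], initial={"y0"}, terminal={"y3"},
    events={"a", "b", "c", "d", "u"}, unreliable={"u"}, transitions=TRANSITIONS,
)
print(labels)
```

In this toy graph, y2 is reached only through the unreliable event u but still has a reliable suffix, so it is Recoverable; y4 can reach the terminal state only through u, so it is Non-robust.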
To enhance the interpretability of robustness classification under local observations, we visualize the reachable state space of the local detector $D_1$ for subsystem $G_1$.
To further explain the interpretation of each detector state in subsystem $G_1$, Table 1 lists the component-wise markings of each $y_i^{(1)} \in Y_{D_1}$, which represent the synchronized configurations of product and resource automata.
The same modeling principle and robustness classification procedure are applied to the remaining subsystems $G_2$–$G_4$, but their detector state tables are omitted for brevity.
To present the results of the local robust state classification intuitively, we provide the robustness-annotated state transition diagram for subsystem $G_1 = G_{P_1} \,\|\, G_{R_1} \,\|\, G_{R_2}$. Each state $y_i \in Y_{D_i}$ is colored according to its robustness label.
The states in Figure 6 correspond to $Y_{D_1} = \{y_0, y_1, \ldots, y_{10}\}$.
This structure reveals how unreliable events (e.g., $e_9$, $e_{10}$) impact recoverability. The same method is applied to the other subsystems in Figure 7, Figure 8 and Figure 9.
To improve clarity, we extract a representative fragment of the detector $D_3$ in subsystem $G_3$, emphasizing robustness-relevant states and transitions. Unlike the previous subsystems, $G_3$ contains intermediate states whose robustness classification depends on the presence of unreliable events in the prefix path.
Unlike the detectors for $G_1$ and $G_2$, the robustness structure of $G_4$ is dominated by Non-robust states. This is because many transitions in $G_4$ involve unreliable events, forming unrecoverable cycles or branches. Thus, local supervisory control in this subsystem must strictly avoid enabling transitions such as $e_9$, $e_{10}$, $e_{12}$, and $e_{13}$.

4.2. Global Robustness Classification and Guarantee

To enable distributed robustness enforcement, each local supervisor $S_i$ must make control decisions based on its current detector state $y_i$. The robustness classification $\varphi_i(y_i)$ determines which controllable events are allowed, depending on whether the state is Strict, Recoverable, or Non-robust.
Notation. For $e \in E_i$ and $y_i \in Y_i$, define $\mathrm{Post}_e(y_i) := \{\, y_i' \in Y_i \mid y_i \xrightarrow{e} y_i' \,\}$. A shared event is globally enabled at a joint state $y = (y_i)_{i \in N}$ iff it is enabled by every local supervisor that synchronizes on it (conjunctive fusion).
Below, Algorithm 3 summarizes the local decision-making rule.
Algorithm 3 Local supervisor enabling rule based on robustness.
Require: Local observation $s_o \in (E_i^{o})^{*}$
Ensure: Enabled set $S_i(s_o) \subseteq E_i$     (here $\mathrm{Strict}_i$ and $\mathrm{Rec}_i$ are the offline partitions precomputed by Algorithm 1).
1: $y_i \leftarrow f_{D_i}(y_i^{0}, s_o)$ ▹ current detector state
2: $r \leftarrow \varphi_i(y_i)$ ▹ robustness label
3: if $r =$ Strict then
4:     $S_i(s_o) \leftarrow \{\, e \in E_i \mid \mathrm{Post}_e(y_i) \subseteq \mathrm{Strict}_i \,\}$
5: else if $r =$ Recoverable then
6:     $S_i(s_o) \leftarrow \{\, e \in E_i \mid \mathrm{Post}_e(y_i) \subseteq \mathrm{Rec}_i \,\}$
7: else
8:     $S_i(s_o) \leftarrow \emptyset$
9: end if
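The enabling rule of Algorithm 3 can be illustrated with a short Python sketch. The names `post`, `detector_state`, and `enabled_events`, the toy transition list, and the hard-coded label table are hypothetical. Two assumptions go beyond the pseudocode: the detector is taken to be deterministic, and an event is enabled only if it is actually defined at the current state (non-empty Post set), since the set inclusion alone would hold vacuously for undefined events.

```python
def post(state, event, transitions):
    """Post_e(y): successors of `state` under `event`."""
    return {dst for src, ev, dst in transitions if src == state and ev == event}


def detector_state(initial, observation, transitions):
    """Replay an observed string; assumes a deterministic detector."""
    state = initial
    for event in observation:
        (state,) = post(state, event, transitions)  # raises if undefined or ambiguous
    return state


def enabled_events(state, labels, events, transitions):
    """Events enabled at `state` by the robustness-based rule."""
    label = labels[state]
    if label == "Non-robust":
        return set()
    strict = {y for y, lab in labels.items() if lab == "Strict"}
    rec = {y for y, lab in labels.items() if lab != "Non-robust"}  # Rec_i contains Strict_i
    allowed = strict if label == "Strict" else rec
    enabled = set()
    for event in events:
        targets = post(state, event, transitions)
        # Assumption: skip events that are not defined at this state.
        if targets and targets <= allowed:
            enabled.add(event)
    return enabled


# Hypothetical toy detector and labels (illustrative only).
TRANSITIONS = [
    ("y0", "a", "y1"), ("y0", "u", "y2"), ("y1", "b", "y3"),
    ("y1", "d", "y4"), ("y2", "c", "y3"), ("y4", "u", "y3"),
]
LABELS = {"y0": "Strict", "y1": "Strict", "y2": "Recoverable",
          "y3": "Strict", "y4": "Non-robust"}
EVENTS = {"a", "b", "c", "d", "u"}

current = detector_state("y0", ["a"], TRANSITIONS)
print(current, enabled_events(current, LABELS, EVENTS, TRANSITIONS))
```

At the Strict state y1, only b is enabled: d is blocked because it leads to the Non-robust state y4.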
Label order and global aggregation:
Let the label set be $L = \{\text{Non-robust}, \text{Recoverable}, \text{Strict}\}$ endowed with the total order $\text{Non-robust} \prec \text{Recoverable} \prec \text{Strict}$. For a joint detector state $y = (y_i)_{i \in N}$, the global robustness label is
$$\varphi_{\mathrm{global}}(y) = \min\nolimits_{\preceq,\, i \in N} \varphi_i(y_i),$$
where $\min_{\preceq}$ denotes the minimum with respect to the above total order (i.e., the weakest label dominates). Equivalently,
$$\varphi_{\mathrm{global}}(y) = \begin{cases} \text{Strict}, & \text{if } \forall i : \varphi_i(y_i) = \text{Strict}, \\ \text{Recoverable}, & \text{if } \forall i : \varphi_i(y_i) \in \{\text{Strict}, \text{Rec}\} \wedge (\exists j : \varphi_j(y_j) = \text{Rec}), \\ \text{Non-robust}, & \text{otherwise}. \end{cases}$$
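The aggregation rule amounts to taking a minimum over an ordered label type. A minimal Python sketch, in which the `Label` enum and `global_label` function are illustrative names rather than part of the paper:

```python
from enum import IntEnum


class Label(IntEnum):
    """Robustness labels ordered Non-robust < Recoverable < Strict."""
    NON_ROBUST = 0
    RECOVERABLE = 1
    STRICT = 2


def global_label(local_labels):
    """Weakest local label dominates; expects at least one local label."""
    return min(local_labels)


print(global_label([Label.STRICT, Label.RECOVERABLE, Label.STRICT]).name)
```

Encoding the order in the enum values keeps the rule a one-liner and makes the case distinction above a direct consequence of `min`.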
Lemma 2.
(Global robustness consistency.) If any local detector $D_i$ classifies a state as Non-robust, then the global system state is Non-robust.
Proof. 
By definition, global robustness classification is determined by the least robust local detector classification. Hence, a Non-robust local classification directly yields a global Non-robust state.    □
Lemma 3.
(Robustness propagation.) If all local detectors $D_i$ classify their current state $y_i$ as either Strict or Recoverable, then the global state is at least Recoverable.
Proof. 
By definition of the global classification rule $\varphi_{\mathrm{global}}(y) = \min_{\preceq,\, i \in N} \varphi_i(y_i)$, where $\varphi_i(y_i) \in \{\text{Strict}, \text{Recoverable}, \text{Non-robust}\}$, the absence of any Non-robust local state ensures that $\varphi_{\mathrm{global}}(y) \in \{\text{Strict}, \text{Recoverable}\}$.    □
Theorem 1
(Global robustness under reliable-event enforcement). Let $G = \|_{i \in N}\, G_i$ be the synchronous product of local automata. For each $i$, Algorithm 1 computes a partition $Y_i = \mathrm{Strict}_i \,\dot\cup\, (\mathrm{Rec}_i \setminus \mathrm{Strict}_i) \,\dot\cup\, \text{Non-robust}_i$ with $\mathrm{Strict}_i \subseteq \mathrm{Rec}_i$, where $\mathrm{Strict}_i$ (resp. $\mathrm{Rec}_i$) is the set of states that admit a reliable prefix (resp. a prefix whose every prefix remains Recoverable by some reliable suffix) to the terminal set $Y_i^{\mathrm{end}}$.
Each local supervisor $S_i$ applies the following reliable-state rule at its current detector state $y_i$:
$$\begin{aligned} y_i \in \mathrm{Strict}_i &: \text{enable exactly the events } e \in E_i \text{ with } \mathrm{Post}_e(y_i) \subseteq \mathrm{Strict}_i, \\ y_i \in \mathrm{Rec}_i \setminus \mathrm{Strict}_i &: \text{enable exactly the events } e \in E_i \text{ with } \mathrm{Post}_e(y_i) \subseteq \mathrm{Rec}_i, \\ y_i \in \text{Non-robust}_i &: \text{disable all events in } E_i. \end{aligned}$$
A shared event is globally enabled at a global state $y = (y_i)_{i \in N}$ iff it is enabled by every local supervisor that synchronizes on it (conjunctive fusion).
Then along every closed-loop execution, each visited global state $y$ satisfies $y_i \in \mathrm{Rec}_i$ for all $i \in N$. Equivalently, the global robustness label
$$\varphi_{\mathrm{global}}(y) := \min\nolimits_{\preceq,\, i \in N} \varphi_i(y_i) \qquad (\text{Strict} \succ \text{Recoverable} \succ \text{Non-robust})$$
always belongs to $\{\text{Strict}, \text{Recoverable}\}$. Moreover,
$$\varphi_{\mathrm{global}}(y) = \text{Strict} \iff \forall i \in N : y_i \in \mathrm{Strict}_i, \qquad \varphi_{\mathrm{global}}(y) = \text{Recoverable} \iff \exists i : y_i \in \mathrm{Rec}_i \setminus \mathrm{Strict}_i \;\wedge\; \nexists i : y_i \in \text{Non-robust}_i.$$
Proof. 
We show that $\prod_{i \in N} \mathrm{Rec}_i$ is an invariant of the closed loop.
By Algorithm 1, we first compute the reliable-suffix basin $B_{\mathrm{rel}} = \mathrm{BackwardClosure}(Y_i^{\mathrm{end}}, E_{\mathrm{rel}}^{(i)})$ and then $R = \mathrm{ForwardClosureRestricted}(Y_i^{0}, E_i, B_{\mathrm{rel}})$; hence, $\mathrm{Rec}_i = R \supseteq \mathrm{Strict}_i$ collects exactly the states that are forward-reachable from $Y_i^{0}$ while staying inside $B_{\mathrm{rel}}$. In particular, provided the initial state lies in $B_{\mathrm{rel}}$, we have $y_i^{0} \in \mathrm{Rec}_i$, so the initial global state $y^{0} = (y_i^{0})_{i \in N}$ belongs to $\prod_i \mathrm{Rec}_i$.
Let $y = (y_i)_{i \in N} \in \prod_i \mathrm{Rec}_i$ and suppose a global event $e$ occurs, leading to $y' = (y_i')_{i \in N}$. Components that do not synchronize on $e$ are unchanged, so it suffices to consider the involved ones. Because all events are controllable and the global enabling is conjunctive, $e$ can occur only if each involved supervisor $S_i$ enables $e$ at $y_i$. If $y_i \in \mathrm{Strict}_i$, the rule enables only transitions with $\mathrm{Post}_e(y_i) \subseteq \mathrm{Strict}_i$, where $\mathrm{Post}_e(y_i) := \{\, y_i' \mid y_i \xrightarrow{e} y_i' \,\}$; thus $y_i' \in \mathrm{Strict}_i \subseteq \mathrm{Rec}_i$. If $y_i \in \mathrm{Rec}_i \setminus \mathrm{Strict}_i$, the rule enables only transitions with $\mathrm{Post}_e(y_i) \subseteq \mathrm{Rec}_i$; hence $y_i' \in \mathrm{Rec}_i$. If for some $i$ we had $y_i \in \text{Non-robust}_i$, no event would be enabled, contradicting the occurrence of $e$ at $y$. Therefore, $y_i' \in \mathrm{Rec}_i$ and the invariant holds.
Consequently, no closed-loop execution can reach a local Non-robust state, i.e., $\varphi_i(y_i) \in \{\text{Strict}, \text{Recoverable}\}$ for all $i$, so $\varphi_{\mathrm{global}}(y) = \min_i \varphi_i(y_i) \in \{\text{Strict}, \text{Recoverable}\}$. The two characterizations of $\varphi_{\mathrm{global}}$ follow directly from the order $\text{Strict} \succ \text{Recoverable} \succ \text{Non-robust}$ and the invariance of $\prod_i \mathrm{Rec}_i$.    □
We construct a robust event-enablement table (Table 2) that maps local detector states to enabled events.
Table 2 lets the local supervisors look up the enabled events for each detector state, supporting rapid and precise decisions.
Section 5 provides experimental validation of the distributed robust supervisory strategy, verifying its efficacy under practical scenarios.

4.3. Structural Characterization of Local Robustness

The robustness of each subsystem $G_i$ is structurally influenced by its topological configuration and its interaction with unreliable events. In particular, the existence of cycles or interleaving paths involving events in $E_{ur}$ directly affects the classification of states in $Y_{D_i}$.
We formally observe the following:
Lemma 4
(Structural strictness via reliable closures). For subsystem $G_i$ with reliable alphabet $E_{\mathrm{rel}}^{(i)}$, define
$$F_{\mathrm{rel}} := \mathrm{ForwardClosure}(Y_i^{0}, E_{\mathrm{rel}}^{(i)}), \qquad B_{\mathrm{rel}} := \mathrm{BackwardClosure}(Y_i^{\mathrm{end}}, E_{\mathrm{rel}}^{(i)}).$$
Then a state $y \in Y_{D_i}$ is Strictly robust iff $y \in F_{\mathrm{rel}} \cap B_{\mathrm{rel}}$.
Lemma 5
(Cycle-induced Non-robustness (sufficient)). If a state $y$ lies on a cycle that contains some event in $E_{\mathrm{ur}}^{(i)}$ and $y \notin B_{\mathrm{rel}}$, then $y$ is classified as Non-robust.
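The characterization in Lemma 4 uses an unrestricted forward closure, whereas step 3 of Algorithm 1 uses a forward closure restricted to $B_{\mathrm{rel}}$. The following short argument, added here as a reading aid and not taken from the original text, indicates why the two descriptions of the Strict set coincide.

```latex
% Claim: S = F_rel \cap B_rel, where
%   S := ForwardClosureRestricted(Y_i^0, E_rel^{(i)}, B_rel).
\begin{itemize}
  \item[$(\subseteq)$] Every $y \in S$ is reached from $Y_i^{0}$ by reliable
        events, so $y \in F_{\mathrm{rel}}$; and $S \subseteq B_{\mathrm{rel}}$
        because the search never leaves $B_{\mathrm{rel}}$.
  \item[$(\supseteq)$] Let $y \in F_{\mathrm{rel}} \cap B_{\mathrm{rel}}$ with a
        reliable path $y^{0} \to y^{1} \to \dots \to y^{k} = y$,
        $y^{0} \in Y_i^{0}$. Since $y \in B_{\mathrm{rel}}$, there is a reliable
        path from $y$ to $Y_i^{\mathrm{end}}$. Each $y^{j}$ reaches $y$ by
        reliable events and therefore also reaches $Y_i^{\mathrm{end}}$ by
        reliable events, i.e., $y^{j} \in B_{\mathrm{rel}}$ for all $j$.
        The whole path stays inside $B_{\mathrm{rel}}$, hence $y \in S$.
\end{itemize}
```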
To summarize the overall distributed control flow, Figure 10 illustrates the high-level architecture of the proposed framework, where each subsystem constructs a local detector, and the global classification is obtained through the merging rule φ .

5. Experimental Validation

To evaluate the applicability of the proposed framework in automated and micro-manufacturing contexts, simulation studies were conducted on representative system models. All subsystem automata and detector structures were constructed based on the modular modeling approach proposed in Section 3. Algorithms 1 and 3 (robustness classification and local enabling) were implemented using a Java-based simulation framework developed by the authors. All simulations were executed on a Windows 10 workstation with an Intel Core i7 processor and 16 GB RAM.
To visualize the local detectors and robustness propagation structures, we manually generated state transition graphs using Microsoft Visio. Each diagram reflects the formal construction of detectors based on synchronous composition and robustness labeling, as defined in Section 4.

5.1. Experimental Setup

Each local plant $G_i$ is modeled as the synchronous composition of a product automaton $G_{P_i}$ and its corresponding resource automata $\{G_{R_j}\}$:
$$G_i = G_{P_i} \,\|\, \big(\|_{j \in J_i}\, G_{R_j}\big),$$
where $J_i$ denotes the index set of resources used by subsystem $i$. Four distributed subsystems are constructed as follows:
  • $G_1 = G_{P_1} \,\|\, G_{R_1} \,\|\, G_{R_2}$;
  • $G_2 = G_{P_2} \,\|\, G_{R_1} \,\|\, G_{R_2}$;
  • $G_3 = G_{P_3} \,\|\, G_{R_1} \,\|\, G_{R_2} \,\|\, G_{R_3} \,\|\, G_{R_4}$;
  • $G_4 = G_{P_4} \,\|\, G_{R_3} \,\|\, G_{R_4}$.
Unreliable events are defined as $E_{ur} = \{e_9, e_{10}, e_{12}, e_{13}\}$, corresponding to typical faults:
  • $e_9$: tool jam or axis over-current on a precision stage;
  • $e_{10}$: part/wafer misalignment detected by the vision system;
  • $e_{12}$: loss of vacuum or gripping failure during pick-place;
  • $e_{13}$: vision reject after micro-inspection.
A compact mapping is summarized in Table 3. These faults occur at the event granularity and directly impact shared resources, which is consistent with our robustness classification and local supervision.

5.2. Robustness Structure Analysis of Subsystems

We summarize the robustness classification across subsystems based on the constructed detector automata, as visualized in Figure 6, Figure 7, Figure 8 and Figure 9. Each figure highlights state robustness categories under local observation:
  • $G_1$ and $G_2$ contain mostly Strictly robust states, with only a few Non-robust configurations.
  • $G_3$ exhibits a mixed structure, including ambiguous and Recoverable states due to its complex resource interactions.
  • $G_4$ is entirely Non-robust, as every reachable state involves unreliable transitions.
These results validate the effectiveness of local detectors in classifying robustness-critical regions, which directly influence supervisory control strategies. A summary of robustness distribution is given in Table 4.

5.3. Supervisor Response to Event Sequences

Each local supervisor $S_i$ applies the robust event-enablement rule using its detector $D_i$ to classify states and selectively disable risky transitions.
In $G_1$ and $G_2$, only Strictly robust paths are allowed. In $G_3$, the supervisor permits Recoverable trajectories while blocking transitions to Non-robust regions. In $G_4$, all controllable transitions are disabled to avoid Non-robust cycles.

5.4. Structural Sensitivity and Scalability

Robustness is structurally influenced by the interaction between subsystems and unreliable events. As interleaving increases, detectors grow in size and more states become ambiguous or Non-robust. In $G_3$, complex resource sharing induces a significant number of states requiring Recoverable supervision. In contrast, $G_4$ demonstrates the fragility of subsystems lacking redundant safe paths.

5.5. Robustness Enforcement Consistency

The consistency between local supervisory decisions and global robustness outcomes is confirmed through representative simulations. The following observations are drawn from Figure 6, Figure 7, Figure 8 and Figure 9 and Section 5.2:
  • Supervisors in $G_1$ and $G_2$ enable only Strictly robust trajectories, which are also a subset of Recoverably robust behaviors.
  • Supervisors in $G_3$ permit Recoverable but not Strictly robust trajectories while preventing unsafe transitions.
  • Supervisors in $G_4$ disable all controllable events due to inherent Non-robustness.
These results confirm that the distributed supervisory strategy achieves global robustness through local enforcement without requiring centralized coordination.

5.6. Evaluation of Structural Performance Metrics

To complement the robustness enforcement results presented in Section 5.2, Section 5.3, Section 5.4 and Section 5.5, this section focuses on architectural-level performance indicators that reflect the practical value of the proposed distributed framework. Instead of measuring time-based performance—often dependent on platform-specific implementation—we evaluate structural properties that are stable across platforms and scale with system complexity. Specifically, we compare centralized and distributed designs in terms of reachable states, modularity, and extensibility.
We consider a partial plant composed of subsystems $G_1$ and $G_2$, defined as
$$G_1 = G_{P_1} \,\|\, G_{R_1} \,\|\, G_{R_2}, \qquad G_2 = G_{P_2} \,\|\, G_{R_1} \,\|\, G_{R_2}.$$
The centralized plant is constructed as
$$G_c = G_{P_1} \,\|\, G_{P_2} \,\|\, G_{R_1} \,\|\, G_{R_2},$$
while the distributed approach constructs independent local detectors $D_1$ and $D_2$ over $G_1$ and $G_2$, respectively. These subsystems and their corresponding structures are consistent with the plant models illustrated in Figure 2 and Figure 3 and the detector graphs in Figure 6, Figure 7, Figure 8 and Figure 9.
Table 5 summarizes the structural performance differences between the centralized and distributed architectures in terms of reachability, design reusability, and scalability.
Note: ‘Reachable States’ counts the reachable states of the detector(s)—a single monolithic detector for the centralized case or the sum of local detectors for the distributed case—not the size of the plant’s global synchronous product.
Scalability: The complexity of each local detector $D_i$ scales with the size of its local graph only. Algorithm 1 runs in $O(|Y_i| + |{\to_i}|)$ time and uses $O(|Y_i|)$ memory, where $Y_i$ and $\to_i$ are the reachable states and edges of the detector built on the local alphabet $E_i$ (product automaton of $G_{P_i}$ with its adjacent resources). Thus, growth is driven by the local interaction degree $d_i$ (number of shared resources/events), not by the total number of subsystems $N$. In our case study, the most connected subsystem $G_3$ yields more than 30 detector states, while $G_1$ and $G_2$ each have 11, showing that 'hub' modules can dominate the footprint, whereas other modules remain small. Practical mitigations include resource partitioning/decoupling, clustering or hierarchical detectors for hubs, and event abstraction to shrink $E_i$. The main observations are as follows:
  • The centralized model has fewer reachable states due to full synchronization and global pruning.
  • The distributed approach constructs reusable local detectors that enable modular expansion.
  • Adding new modules (e.g., $G_3$, $G_4$) in the centralized case requires reconstructing the full plant, while distributed controllers support incremental composition.
These metrics collectively confirm that although centralized synthesis produces compact models for small configurations, the distributed approach offers substantial structural advantages. It supports subsystem-level reuse, incremental integration, and scalable extension to larger system configurations—all without requiring full plant redesign.

5.7. Language-Level Comparison of Centralized and Distributed Control

To complement the structural evaluation presented in Section 5.6, we now compare the behavioral correctness of the proposed distributed supervisory scheme against a centralized baseline. The objective is to verify whether the distributed controllers can achieve equivalent robustness enforcement in terms of permissible event sequences and rejection of unsafe trajectories.
We consider the same partial plant $G = G_1 \,\|\, G_2$ and its centralized composition $G_c = G_{P_1} \,\|\, G_{P_2} \,\|\, G_{R_1} \,\|\, G_{R_2}$, as defined in Section 5.6. The centralized supervisor is synthesized over $G_c$ using robustness enforcement based on Recoverably robust trajectories. The distributed control scheme uses two detectors $D_1$ and $D_2$ to determine local robustness and enable event decisions based on the distributed rule:
$$S(s) = \bigcap_{i=1}^{2} S_i(P_i(s)),$$
where $P_i(s)$ denotes the projection of string $s$ onto the event set of $G_i$.
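The distributed rule can be sketched in Python as follows. The names `project` and `fused_enabled`, the two toy alphabets, and the two local rules are hypothetical. One assumption is made explicit: a supervisor only votes on events in its own alphabet, which is the conjunctive-fusion reading stated in Section 4.2 (a literal set intersection would otherwise block every private event).

```python
def project(string, alphabet):
    """Natural projection P_i: erase events outside the local alphabet."""
    return [event for event in string if event in alphabet]


def fused_enabled(string, supervisors):
    """Conjunctive fusion of local decisions after observing `string`.

    `supervisors` is a list of (alphabet, rule) pairs, where `rule` maps a
    projected string to the set of locally enabled events.
    """
    all_events = set().union(*(alphabet for alphabet, _ in supervisors))
    enabled = set()
    for event in all_events:
        # Assumption: only supervisors that synchronize on the event vote.
        voters = [(alphabet, rule) for alphabet, rule in supervisors
                  if event in alphabet]
        if all(event in rule(project(string, alphabet))
               for alphabet, rule in voters):
            enabled.add(event)
    return enabled


# Hypothetical local rules: "s" is shared, "a" and "b" are private.
def rule_1(observed):
    return {"a", "s"} if not observed else {"s"}


def rule_2(observed):
    return {"b"} if not observed else {"b", "s"}


SUPERVISORS = [({"a", "s"}, rule_1), ({"b", "s"}, rule_2)]
print(sorted(fused_enabled([], SUPERVISORS)))
print(sorted(fused_enabled(["b"], SUPERVISORS)))
```

Initially the shared event s is blocked because the second supervisor does not yet enable it; after b occurs, both supervisors agree and s becomes enabled.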
Let $L_c$ denote the language generated by the centralized supervisor and $L_d$ denote the language generated by the distributed scheme under detector-based coordination.
We now state a formal result that characterizes the behavioral soundness of the distributed supervisor.
Proposition 2.
Let $G = G_1 \,\|\, G_2$ and let $D_1$, $D_2$ be the robust state detectors constructed for $G_1$ and $G_2$, respectively. Let $L_c$ be the language of the centralized robust supervisor synthesized over $G_c = G_{P_1} \,\|\, G_{P_2} \,\|\, G_{R_1} \,\|\, G_{R_2}$, and let $L_d = \{\, s \in L(G) \mid D_i(P_i(s)) \neq \emptyset \text{ for all } i = 1, 2 \,\}$ be the language permitted by the distributed strategy.
Then, the following holds:
1. $L_d \subseteq L_c$;
2. $L(G) \setminus L_d \subseteq L(G) \setminus L_c$.
Proof. 
Consider any string $s \in L_d$. By definition of the distributed strategy, $D_1(P_1(s)) \neq \emptyset$ and $D_2(P_2(s)) \neq \emptyset$. According to the detector design (see Algorithm 1), this means that $P_1(s)$ leads to a state in $Y_D^{(1)}$ and $P_2(s)$ leads to a state in $Y_D^{(2)}$, both of which are either Strictly robust or Recoverably robust.
Since the centralized supervisor with language $L_c$ is synthesized over $G_c$, which contains the full behavior of $G$, and robustness pruning only removes strings violating Definition 3, any $s$ that passes all local robustness detectors also satisfies the global robustness condition. Thus, $s \in L_c$ and $L_d \subseteq L_c$.
Now consider any string $s \in L(G) \setminus L_d$. Then there exists some $i$ such that $D_i(P_i(s)) = \emptyset$, meaning $P_i(s)$ leads to a Non-robust state in $G_i$. Since the global supervisor enforces robustness over all components, such a string must also be removed from $L_c$ during centralized pruning. Therefore, $s \notin L_c$ and $s \in L(G) \setminus L_c$, implying $L(G) \setminus L_d \subseteq L(G) \setminus L_c$.    □

5.8. Trajectory-Level Validation of Recoverable Robustness

While the structural and language-level analyses in Section 5.6 and Section 5.7 demonstrate that the distributed control framework enforces robustness consistently, it remains essential to validate that the detectors allow appropriate runtime execution for strings that meet the Recoverable robustness condition. In this section, we evaluate a specific trajectory under the distributed supervisory scheme to illustrate the runtime behavior and semantic interpretation of Recoverable robustness.
We consider the subsystem $G_3$ with its detector $D_3$ illustrated in Figure 8. The following event sequence is analyzed:
$$s = e_1\, e_2\, e_{11}\, e_{12}\, e_1\, e_3.$$
This path contains the unreliable event $e_{12} \in E_{ur}$ and tests the distributed supervisor's ability to accept partially unreliable behavior while preserving the ability to reach a terminal configuration.
Let $y$ denote the state reached after executing the prefix $e_1 e_2 e_{11} e_{12}$. According to the detector classification in $D_3$, the state $y = y_4^{(3)}$ is labeled as Recoverably robust. This classification is justified because, although $e_{12}$ is an unreliable event, there exists a valid recovery sequence $e_1 e_3$ from $y$ that leads to the terminal state $Y^{\mathrm{end}}$. Thus, the Recoverably robust label permits continuation beyond an unreliable event provided a complete recovery path is guaranteed.
At the detector state $y = y_4^{(3)}$, the supervisor evaluates whether to allow the event $e_1$ to continue execution. Since the detector classification is Recoverable and a valid recovery trajectory exists via $e_3$, the system permits $e_1$ to occur.
After executing $e_1$ and then $e_3$, the system reaches the terminal configuration $y_6^{(3)}$. According to the terminal state definition introduced in Section 2.4, $y_6^{(3)}$ lies in the local terminal state set $Y_{(3)}^{\mathrm{end}}$ of subsystem $G_3$. This demonstrates that the recovery path exists and is operational, validating the runtime semantics of the Recoverably robust label and confirming that the execution leads to a semantically complete production cycle.
Figure 11 visualizes the state transitions along the trajectory $s$, highlighting the transitions involving unreliable events and the eventual recovery via $e_3$. The states along the path are as follows:
  • $y_0^{(3)}$ (initial) $\xrightarrow{e_1} y_1^{(3)}$;
  • $y_1^{(3)} \xrightarrow{e_2} y_2^{(3)}$;
  • $y_2^{(3)} \xrightarrow{e_{11}} y_3^{(3)}$;
  • $y_3^{(3)} \xrightarrow{e_{12}} y_4^{(3)}$ (Recoverable robustness);
  • $y_4^{(3)} \xrightarrow{e_1} y_5^{(3)} \xrightarrow{e_3} y_6^{(3)}$ (terminal).
This example confirms that the distributed supervisor correctly interprets the Recoverably robust classification and allows execution of unreliable events only when a valid recovery path exists. Such behavior illustrates the semantic soundness of the detector-based control strategy in runtime decision-making.
These structural properties confirm the feasibility of implementing the proposed detector-based supervisors in real-world industrial environments. In particular, the modular architecture is well-suited for deployment on edge controllers or Programmable Logic Controllers (PLCs) within smart factories, enabling real-time detection and mitigation of failures without centralized coordination.

5.9. Comparison with Existing Methods

To further assess the efficacy of the proposed distributed framework, we qualitatively compare it with several representative supervisory-control approaches: centralized robust synthesis, static fault-tolerant control (FTC), modular supervision, and our distributed robust scheme (see Table 6). The comparison focuses on unreliable-event handling, Recoverable robustness, state-space scalability, supervisor reusability, and incremental integration.
The centralized strategy provides strong guarantees but suffers from state explosion and poor scalability. Static FTC approaches enable some resilience, yet they typically rely on predefined failure models and cannot adapt at runtime. Modular architectures support structural reuse but lack explicit robustness enforcement against event uncertainty.
Compared with a centralized robust supervisor, the proposed distributed scheme achieves the same safety guarantees (see Proposition 2 and Section 5.7) while avoiding global runtime search. At runtime, decisions are made locally by inspecting the label of the current detector state and a small set of enabled events; therefore, the decision cost depends on the local event alphabet and the detector size rather than the size of the monolithic product automaton (Table 6). In practice, this reduces decision latency and improves controller reuse and scalability (cf. Section 5.6 and Table 5). Because timing strongly depends on platform and implementation (PLC vs. ROS, CPU load, I/O latency, etc.), we deliberately do not report a fixed percentage improvement and leave a cross-platform timing benchmark as future work.
The detector-based control structure is amenable to modular implementation, since each local subsystem requires only partial event monitoring and local classification. The architecture is compatible with PLC-based or ROS-based deployments and supports incremental system expansion and fault isolation, and it is therefore a strong candidate for practical cyber–physical manufacturing environments.
Practical implementation for micro-manufacturing: Each detector can be deployed as local logic on PLCs (IEC 61131-3 [31]) or ROS edge controllers. Because only partial events are monitored and the detector automaton is small, runtime checks are constant-time with negligible memory footprint compared to a monolithic product automaton. This enables short-cycle reactions, fault isolation at the cell level, and incremental expansion of production lines—key properties for micro-manufacturing and MEMS-based execution platforms (see also Section 5.1/Table 3).
Network latency and deployment: Our enabling rule is conjunctive and event-driven without a global clock. Latency mainly affects throughput (waiting for all involved local supervisors to enable a shared event) but not the correctness of the robustness guarantee. For time-critical shared operations, we recommend co-locating the relevant supervisors with the shared resource (e.g., on the same PLC rack or ROS edge) or using industrial fieldbuses with bounded jitter. A quantitative latency budget and its impact on cycle time are part of our planned hardware testbed.

5.10. Cost and Scalability Micro-Study

Setup: We use the running example with $N = 4$ product automata and 4 resource automata. For each local composition $G_i = (Y_i, E_i, \to_i)$, we run Algorithm 1 once and record the following platform-agnostic counters: (i) graph sizes $n_i = |Y_i|$, $m_i = |{\to_i}|$, and $m_i^{\mathrm{rel}} = |\{(y, a, y') \in {\to_i} \mid a \in E_{\mathrm{rel}}^{(i)}\}|$; (ii) the intermediate sets of Algorithm 1: $|B_{\mathrm{rel}}|$, $|R|$, $|S|$; and (iii) the final label counts $|\mathrm{Strict}_i|$, $|\mathrm{Rec}_i|$, $|\text{Non-robust}_i|$.
Unless otherwise stated, the terminal set $Y_i^{\mathrm{end}}$ used by Algorithm 1 is obtained by the product–events semantics: let $E_{\mathrm{term}}^{\mathrm{prod}}$ be the union of terminal events declared by the product automata. Whenever a shared event $a \in E_{\mathrm{term}}^{\mathrm{prod}}$ fires in the synchronous product that defines $G_i$, the successor global state is inserted into $Y_i^{\mathrm{end}}$. Multiple successors reached by the same terminal event at the same global state are deduplicated. Reliable events are $E_{\mathrm{rel}}^{(i)} = E_i \setminus E_{\mathrm{ur}}^{(i)}$ as in Section 4, and all events are controllable/observable (Assumption 1 in Section 3).
Centralized structural baseline: As a baseline, we build the synchronous product of the eight base automata once and report its reachable sizes $|X|$ and $|{\to}|$. Table 7 compares the centralized reachable graph with the sum of locals. We also report the (purely structural) explosion factors
$$\mathrm{EF}_{\mathrm{states}} = \frac{|X|}{\sum_i n_i}, \qquad \mathrm{EF}_{\mathrm{edges}} = \frac{|{\to}|}{\sum_i m_i}.$$
These ratios quantify how much larger the centralized model is than the aggregate of local models, independent of execution platforms.
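For concreteness, the two ratios can be computed as below. The function name and every number in the example are made up for illustration; they are not values from Table 7.

```python
def explosion_factors(central_states, central_edges, local_sizes):
    """Return (EF_states, EF_edges) for a list of local (n_i, m_i) pairs."""
    total_states = sum(n for n, _ in local_sizes)
    total_edges = sum(m for _, m in local_sizes)
    return central_states / total_states, central_edges / total_edges


# Hypothetical sizes: centralized graph with 120 states and 300 edges,
# three local graphs with (n_i, m_i) = (5, 6), (4, 5), (3, 4).
ef_states, ef_edges = explosion_factors(120, 300, [(5, 6), (4, 5), (3, 4)])
print(ef_states, ef_edges)
```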
Workload of Algorithm 1 (platform-agnostic): For each $G_i$ we also accumulate the edge visits of the three graph primitives used by Algorithm 1 (Algorithm 2):
  • $c_i^{\mathrm{back}}$: number of reliable edges inspected by $\mathrm{BackwardClosure}(Y_i^{\mathrm{end}}, E_{\mathrm{rel}}^{(i)})$;
  • $c_i^{\mathrm{all}}$: number of edges scanned by $\mathrm{ForwardClosureRestricted}(Y_i^{0}, E_i, B_{\mathrm{rel}})$;
  • $c_i^{\mathrm{rel}}$: number of reliable edges scanned by $\mathrm{ForwardClosureRestricted}(Y_i^{0}, E_{\mathrm{rel}}^{(i)}, B_{\mathrm{rel}})$.
We also record the peak queue length $q_i^{\max}$ among these BFS/fixpoint procedures, which serves as a proxy for memory usage. Table 8 summarizes the counts. In all cases, the total work $c_i^{\mathrm{back}} + c_i^{\mathrm{all}} + c_i^{\mathrm{rel}}$ empirically matches the linear-time bound $O(n_i + m_i)$ on $G_i$.
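One way to collect such counters is to instrument the BFS itself. The sketch below does this for the restricted forward closure; the function name, the counting convention (one visit per outgoing edge of each dequeued state), and the toy graph are assumptions, and the authors' own instrumentation may differ.

```python
from collections import deque


def counted_forward_closure(sources, labels, region, transitions):
    """Restricted BFS returning (reached, edge_visits, peak_queue_length)."""
    succs = {}
    for src, event, dst in transitions:
        if event in labels:
            succs.setdefault(src, []).append(dst)
    reached = set(sources) & set(region)
    queue = deque(sorted(reached))  # sorted only to make the run deterministic
    edge_visits = 0
    peak_queue = len(queue)
    while queue:
        state = queue.popleft()
        for nxt in succs.get(state, ()):
            edge_visits += 1  # each outgoing edge of a dequeued state is scanned once
            if nxt in region and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
                peak_queue = max(peak_queue, len(queue))
    return reached, edge_visits, peak_queue


# Hypothetical toy graph; B_REL is assumed to be precomputed.
TRANSITIONS = [
    ("y0", "a", "y1"), ("y0", "u", "y2"), ("y1", "b", "y3"),
    ("y1", "d", "y4"), ("y2", "c", "y3"), ("y4", "u", "y3"),
]
B_REL = {"y0", "y1", "y2", "y3"}
reached, c_all, q_max = counted_forward_closure(
    {"y0"}, {"a", "b", "c", "d", "u"}, B_REL, TRANSITIONS)
print(sorted(reached), c_all, q_max)
```

Because every state is dequeued at most once, the edge-visit counter can never exceed the number of transitions, which is the structural reason behind the linear bound.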
Label distributions (Strict ⊆ Rec): To avoid ambiguity and to respect $\mathrm{Strict}_i \subseteq \mathrm{Rec}_i$, we report $|\mathrm{Strict}_i|$, $|\mathrm{Rec}_i \setminus \mathrm{Strict}_i|$, their sum $|\mathrm{Rec}_i|$, and $|\text{Non-robust}_i|$. We also list $|Y_i^{\mathrm{end}}|$ (after deduplicating multiple successors produced by the same terminal event at the same global state). For instance, in $G_4$, all paths to the single terminal state are via unreliable events; hence, $|\mathrm{Strict}_4| = |\mathrm{Rec}_4| = 0$ and $|\text{Non-robust}_4| = |Y_4|$. The results in Table 9 are consistent with these semantics.
Optional sensitivity to unreliable events: We vary the fraction $p$ of events marked as unreliable and recompute the labels using the same product–events terminals. A simple trend plot can show how increasing unreliability reduces $|\mathrm{Rec}_i|$ and increases $|\text{Non-robust}_i|$.
Takeaway. The centralized reachable graph is an order of magnitude larger than the sum of locals (Table 7), whereas our detection cost scales linearly in $n_i + m_i$ on each $G_i$ (Table 8). The proposed distributed scheme therefore avoids constructing the global synchronous product at runtime and scales structurally with the sum of local sizes. This substantiates the claims on modularity and scalability without relying on platform-dependent wall-clock timing.

6. Conclusions

In conclusion, the proposed modular and distributed supervisory control integration framework provides a scalable and robust solution for automated and micro-manufacturing systems experiencing event-level failures. The method's compatibility with industrial information integration standards, micro-fabrication platforms, MEMS-oriented production environments, and edge control infrastructures makes it highly suitable for deployment in real-world smart and micro-manufacturing applications. Simulation studies confirm its effectiveness in maintaining system robustness and adaptability while avoiding unsafe trajectories. Future work will focus on practical deployment and integration within industrial and micro-manufacturing execution systems, as well as experimental validation on MEMS-based devices, industrial edge controllers, and cloud-based control infrastructures. To facilitate deployment, we also distill two practice-oriented points clarified in this revision: the online update of unreliable events and the applicability beyond full local controllability/observability.
Online update of unreliable events: The detector graph does not depend on the reliable/unreliable split; only the labels produced by Algorithm 1 do. Therefore, if the set $E_{ur}$ changes at runtime, each affected subsystem can recompute labels locally in linear time $O(|Y_i| + |{\to_i}|)$ or hot-swap one of a few precomputed label tables for anticipated modes (e.g., a 'resource-unreliable' flag). The update is safety-monotone: declaring more events as unreliable can only shrink the enabled set and thus preserves safety while possibly becoming more conservative until a reliable suffix exists.
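The hot-swap idea can be sketched as a table of precomputed label sets keyed by operating mode. Everything named here (`reach`, `label_sets`, the mode names, the toy graph) is hypothetical; the sketch only illustrates that the graph is reused unchanged and that the Strict and Recoverable sets shrink when more events are declared unreliable.

```python
def reach(start, edges, region):
    """States reachable from `start` along `edges` without leaving `region`."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen = set(start) & set(region)
    stack = list(seen)
    while stack:
        node = stack.pop()
        for dst in adjacency.get(node, ()):
            if dst in region and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def label_sets(states, initial, terminal, transitions, unreliable):
    """Return (Strict_i, Rec_i) for one choice of unreliable events."""
    reliable_edges = [(s, d) for s, e, d in transitions if e not in unreliable]
    all_edges = [(s, d) for s, _, d in transitions]
    reversed_reliable = [(d, s) for s, d in reliable_edges]
    b_rel = reach(terminal, reversed_reliable, set(states))
    rec = reach(initial, all_edges, b_rel)
    strict = reach(initial, reliable_edges, b_rel)
    return strict, rec


# Hypothetical detector graph; it stays fixed across modes.
STATES = ["y0", "y1", "y2", "y3", "y4"]
TRANSITIONS = [
    ("y0", "a", "y1"), ("y0", "u", "y2"), ("y1", "b", "y3"),
    ("y1", "d", "y4"), ("y2", "c", "y3"), ("y4", "u", "y3"),
]
# Anticipated modes (assumed): each maps to its unreliable-event set.
MODES = {"nominal": {"u"}, "c_degraded": {"u", "c"}}
LABEL_TABLES = {
    mode: label_sets(STATES, {"y0"}, {"y3"}, TRANSITIONS, unreliable)
    for mode, unreliable in MODES.items()
}

strict_nominal, rec_nominal = LABEL_TABLES["nominal"]
strict_degraded, rec_degraded = LABEL_TABLES["c_degraded"]  # hot swap
print(sorted(rec_nominal), sorted(rec_degraded))
```

In the degraded mode, y2 loses its only reliable suffix (the event c) and drops out of the Recoverable set, while the other labels are unchanged.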
Finally, we note that the above study is carried out under a simplifying assumption of full local controllability and observability; the paragraph below outlines how the framework can be relaxed when this assumption is violated.
Limitations and extensions beyond full controllability/observability: The results above rely on full local controllability and observability. When some events are uncontrollable or unobservable, Algorithm 1 can be adapted as follows:
  • Limited control: Replace the reliable label set by the enforceable set $E_i^{enf} := E_i^{rel} \cap E_i^{c}$. Then, in Step 3 (ForwardClosureRestricted on reliable labels), use $E_i^{enf}$ instead of $E_i^{rel}$. This ensures that Strict states admit a reliable and controllable prefix inside $B_{rel}$.
  • Partial observation: Build an observer (or belief-state) automaton over the observable alphabet $E_i^{o}$ (or, equivalently, run Algorithm 1 on the fly over observed state sets using the projection $P_i$). This yields labels consistent with what supervisors can infer from observations.
  • Mixed case: Combine the two by running Algorithm 1 on the observer with enforceable labels $E_i^{enf}$.
A rigorous development, together with complexity/approximation techniques to mitigate observer blow-up, is left as future work.
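The two building blocks named in the list above, the enforceable event set and an observer over the observable alphabet, can be sketched with a standard subset construction. This is a sketch under stated assumptions rather than the authors' construction: the transition table `DELTA`, the event names, and the helper names `enforceable`, `unobservable_reach`, and `build_observer` are all hypothetical.

```python
from collections import defaultdict, deque


def enforceable(reliable, controllable):
    """Enforceable events: reliable AND controllable (set intersection)."""
    return set(reliable) & set(controllable)


def unobservable_reach(states, delta, observable):
    """Close a state set under transitions labelled by unobservable events."""
    reach = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for event, nxt in delta.get(state, []):
            if event not in observable and nxt not in reach:
                reach.add(nxt)
                stack.append(nxt)
    return frozenset(reach)


def build_observer(initial, delta, observable):
    """Subset construction: belief states linked by observable events."""
    start = unobservable_reach({initial}, delta, observable)
    observer = {}
    queue = deque([start])
    while queue:
        belief = queue.popleft()
        if belief in observer:
            continue
        moves = defaultdict(set)
        for state in belief:
            for event, nxt in delta.get(state, []):
                if event in observable:
                    moves[event].add(nxt)
        observer[belief] = {}
        for event, targets in moves.items():
            nxt_belief = unobservable_reach(targets, delta, observable)
            observer[belief][event] = nxt_belief
            queue.append(nxt_belief)
    return start, observer


# Hypothetical local automaton: state -> list of (event, next_state).
DELTA = {
    "y0": [("e1", "y1"), ("u", "y2")],
    "y1": [("e2", "y3")],
    "y2": [("e2", "y4")],
}
OBSERVABLE = {"e1", "e2"}  # "u" is unobservable in this example

start, observer = build_observer("y0", DELTA, OBSERVABLE)
e_enf = enforceable({"e1", "e2", "u"}, {"e1", "e12"})
```

A labelling procedure would then run over the belief states in `observer` instead of the original states. The number of belief states can grow exponentially in the worst case, which is the observer blow-up mentioned above.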

Author Contributions

Conceptualization, G.D.; Methodology, G.D. and H.H.; Investigation, H.H.; Software, G.D.; Validation, G.D., Z.M. and H.H.; Formal analysis, G.D. and Z.M.; Investigation, G.D.; Resources, G.D.; Data curation, G.D.; Writing—original draft preparation, G.D. and Z.M.; Writing—review & editing, G.D., Z.M. and H.H.; Visualization, G.D.; Supervision, G.D. and Z.M.; Project administration, Z.M.; Funding acquisition, Z.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study does not involve any human participants, animals, or sensitive data. No ethical approval was required for this research.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Balázs, B.Z.; Geier, N.; Takács, M.; Davim, J.P. A review on micro-milling: Recent advances and future trends. Int. J. Adv. Manuf. Technol. 2020, 112, 655–684. [Google Scholar] [CrossRef]
  2. Shavezipur, M.; Ponnambalam, K.; Khajepour, A. Fabrication uncertainties and yield optimization in MEMS tunable capacitors. Sens. Actuators A Phys. 2008, 141, 356–368. [Google Scholar] [CrossRef]
  3. Ramadge, P.J.; Wonham, W.M. The control of discrete event systems. Proc. IEEE 1989, 77, 81–98. [Google Scholar] [CrossRef]
  4. Takai, S. Verification of robust diagnosability for partially observed discrete event systems. Automatica 2012, 48, 1913–1919. [Google Scholar] [CrossRef]
  5. Diene, O.; Moreira, M.V.; Silva, E.A.; Alvarez, V.R.; Nascimento, C.F. Diagnosability of hybrid systems. IEEE Trans. Control Syst. Technol. 2017, 27, 386–393. [Google Scholar] [CrossRef]
  6. Carvalho, L.K.; Basilio, J.C.; Moreira, M.V. Robust diagnosis of discrete event systems against intermittent loss of observations. Automatica 2012, 48, 2068–2078. [Google Scholar] [CrossRef]
  7. Zhou, Y.; Hu, H.; Liu, Y.; Lin, S.W.; Ding, Z. A distributed method to avoid higher-order deadlocks in multi-robot systems. Automatica 2020, 112, 108706. [Google Scholar] [CrossRef]
  8. Carvalho, L.K.; Moreira, M.V.; Basilio, J.C. Comparative analysis of related notions of robust diagnosability of discrete-event systems. Annu. Rev. Control 2021, 51, 23–36. [Google Scholar]
  9. Cao, L.; Shu, S.; Lin, F.; Chen, Q.; Liu, C. Weak diagnosability of discrete-event systems. IEEE Trans. Control Netw. Syst. 2021, 9, 184–196. [Google Scholar] [CrossRef]
  10. Dong, W.; Yin, X.; Li, S. A uniform framework for diagnosis of discrete-event systems with unreliable sensors using linear temporal logic. IEEE Trans. Autom. Control 2023, 69, 145–160. [Google Scholar] [CrossRef]
  11. Shu, S.; Lin, F.; Ying, H. Detectability of discrete event systems. IEEE Trans. Autom. Control 2007, 52, 2356–2359. [Google Scholar] [CrossRef]
  12. Shu, S.; Lin, F. Generalized detectability for discrete event systems. Syst. Control Lett. 2011, 60, 310–317. [Google Scholar] [CrossRef] [PubMed]
  13. Shu, S.; Lin, F. Delayed detectability of discrete event systems. IEEE Trans. Autom. Control 2012, 58, 862–875. [Google Scholar] [CrossRef]
  14. Xie, Y.; Yin, X.; Li, S. Opacity enforcing supervisory control using nondeterministic supervisors. IEEE Trans. Autom. Control 2021, 67, 6567–6582. [Google Scholar] [CrossRef]
  15. Jacob, R.; Lesage, J.J.; Faure, J.M. Overview of discrete event systems opacity: Models, validation, and quantification. Annu. Rev. Control 2016, 41, 135–146. [Google Scholar] [CrossRef]
  16. Han, X.; Zhang, K.; Zhang, J.; Li, Z.; Chen, Z. Strong current-state and initial-state opacity of discrete-event systems. Automatica 2023, 148, 110756. [Google Scholar] [CrossRef]
  17. Shu, S.; Lin, F. Decentralized control of networked discrete event systems with communication delays. Automatica 2014, 50, 2108–2112. [Google Scholar] [CrossRef]
  18. Shu, S.; Lin, F. Deterministic networked control of discrete event systems with nondeterministic communication delays. IEEE Trans. Autom. Control 2016, 62, 190–205. [Google Scholar] [CrossRef]
  19. Wang, Y.; Li, Y.; Yu, Z.; Wu, N.; Li, Z. Supervisory control of discrete-event systems under external attacks. Inform. Sci. 2021, 562, 398–413. [Google Scholar] [CrossRef]
  20. Meira-Goes, R.; Kang, E.; Kwong, R.H. Synthesis of sensor deception attacks at the supervisory layer of cyber-physical systems. Automatica 2020, 121, 109172. [Google Scholar] [CrossRef]
  21. Meira-Goes, R.; Lafortune, S.; Marchand, H. Synthesis of supervisors robust against sensor deception attacks. IEEE Trans. Autom. Control 2021, 66, 4990–4997. [Google Scholar] [CrossRef]
  22. Chew, S.F.; Lawley, M.A. Robust supervisory control for production systems with multiple resource failures. IEEE Trans. Autom. Sci. Eng. 2006, 3, 309–323. [Google Scholar] [CrossRef]
  23. Feng, Y.; Xing, K.; Zhou, M.; Chen, H.; Tian, F. Polynomial-complexity robust deadlock controllers for a class of automated manufacturing systems with unreliable resources using Petri nets. Inf. Sci. 2020, 533, 181–189. [Google Scholar] [CrossRef]
  24. Yang, B.; Hu, H. Maximally permissive robustness analysis of automated manufacturing systems with multiple unreliable resources. IEEE Trans. Syst. Man Cybern. Syst. 2022, 53, 3527–3539. [Google Scholar] [CrossRef]
  25. Liu, H.; Feng, Y.; Li, J.; Luo, J. Robust Petri net controllers for flexible manufacturing systems with multitype and multiunit unreliable resources. IEEE Trans. Syst. Man Cybern. Syst. 2022, 53, 1431–1444. [Google Scholar] [CrossRef]
  26. Zhang, Z.; Liu, G.; Barkaoui, K.; Li, Z. Adaptive deadlock control for a class of Petri nets with unreliable resources. IEEE Trans. Syst. Man Cybern. Syst. 2021, 52, 3113–3125. [Google Scholar] [CrossRef]
  27. Yang, B.; Hu, H. On the Equivalence Between Robustness and Liveness in Automated Manufacturing Systems. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 7495–7507. [Google Scholar] [CrossRef]
  28. Yang, B.; Hu, H. Decentralized Enforcement of Linear State Specifications for Augmented Marked Graphs with a Coordinator. IEEE Trans. Control Syst. Technol. 2023, 32, 413–427. [Google Scholar] [CrossRef]
  29. Feng, Y.; Ren, S.; Ren, X.; Chen, H.; Yang, Y. Small-size liveness-enforcing supervisor for automated manufacturing systems using the theory of transition cover. IEEE Trans. Syst. Man Cybern. Syst. 2022, 53, 2222–2235. [Google Scholar] [CrossRef]
  30. Li, J.; Hu, H. A Maximally Permissive Robustness Analysis for Automated Manufacturing Systems Allowing Multiple Server Failures. IEEE Trans. Autom. Sci. Eng. 2025, 22, 12485–12499. [Google Scholar] [CrossRef]
  31. IEC 61131-3; Programmable Controllers—Part 3: Programming Languages. International Electrotechnical Commission: Geneva, Switzerland, 2003.
Figure 1. Illustrative AMS layout including input/output conveyors, robotic arms, and machines. Arrows show material flows and resource-coordination links; arrow colors are only for visual distinction of different paths and do not encode additional semantics.
Figure 2. Product automata $G_{P_1}$ to $G_{P_4}$ with terminal states $x_0$, $x_3$, $x_6$, $x_{10}$.
Figure 3. Resource automata $G_{R_1}$ to $G_{R_4}$ showing synchronization with product events.
Figure 4. Global AMS synchronization structure.
Figure 5. Execution-path classification in the AMS model. Red arcs indicate unreliable events $e \in E_{ur}$; black arcs indicate reliable events. Top path: not recoverable. Middle path: strict robustness. Bottom path: recoverable robustness.
Figure 6. Local state detector of subsystem $G_1$ with robustness labels. Blue nodes: Strict; red nodes: Non-robust. (There are no Recoverable-only (green) states in this subgraph.) Red arrows: $E_{ur}$; black arrows: $E_{rel}$.
Figure 7. Robustness-classified local state detector $D_i$ for subsystem $G_2$. Blue states are Strictly (and Recoverably) robust; red states are Non-robust. Red edges indicate unreliable events ($E_{ur}$); black edges indicate reliable events.
Figure 8. Partial detector of $G_3$ showing representative robustness-relevant paths. Blue nodes: Strict; green nodes: Recoverable-only ($\mathrm{Rec} \setminus \mathrm{Strict}$); red nodes: Non-robust. Labels are computed per state by Algorithm 1 and are path-independent. Red arrows: $E_{ur}$; black arrows: $E_{rel}$.
Figure 9. All states in this diagram are Non-robust due to the presence of unrecoverable transitions triggered by unreliable events. Red edges: unreliable; black edges: reliable.
Figure 10. Distributed robust control structure with local detectors and global merging via φ rule. The classification result determines the global robustness label of the current joint state, guiding the distributed execution decision.
Figure 11. Trajectory segment illustrating Recoverable-robustness enforcement along the path $s = e_1 e_2 e_{11} e_{12} e_1 e_3$. Red arc $e_{12}$: unreliable event ($e_{12} \in E_{ur}$). Black arcs: reliable events. Green-highlighted state $y_4^{(3)}$: Recoverable (a recovery sequence $e_1 e_3$ leads to the terminal state $y_6^{(3)}$).
Table 1. Robustness-classified detector states $Y_{D_1}$ of subsystem $G_1$.

| State | Component Markings (Product + Resources) |
|---|---|
| $y_0^{(1)}$ | $(x_0, x_{13}, x_{14})$ |
| $y_1^{(1)}$ | $(x_1, x_1, x_{14})$ |
| $y_2^{(1)}$ | $(x_0, x_4, x_{14})$ |
| $y_3^{(1)}$ | $(x_0, x_7, x_7)$ |
| $y_4^{(1)}$ | $(x_2, x_{13}, x_2)$ |
| $y_5^{(1)}$ | $(x_0, x_{13}, x_5)$ |
| $y_6^{(1)}$ | $(x_0, x_8, x_8)$ |
| $y_7^{(1)}$ | $(x_2, x_4, x_2)$ |
| $y_8^{(1)}$ | $(x_1, x_1, x_5)$ |
| $y_9^{(1)}$ | $(x_0, x_4, x_5)$ |
| $y_{10}^{(1)}$ | $(x_0, x_9, x_{14})$ |

Table 2. Reliable-event enabling rule induced by the local detector.

| Robustness Label | Enabled Events at State $y_i$ |
|---|---|
| Strict | $\{ e \in E_i \mid \mathrm{Post}_e(y_i) \in \mathrm{Strict}_i \}$ |
| Recoverable | $\{ e \in E_i \mid \mathrm{Post}_e(y_i) \in \mathrm{Rec}_i \}$ |
| Non-robust | $\emptyset$ |

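The enabling rule of Table 2 can be read as a lookup over the label table. The sketch below is illustrative only: the detector `POST`, the label assignment `LABEL`, and the function name `enabled_events` are hypothetical, and it assumes that a Non-robust state enables no events and that every Strict state also counts as Recoverable (Strict ⊆ Rec).

```python
STRICT, REC, NON_ROBUST = "Strict", "Recoverable", "Non-robust"


def enabled_events(state, label, post):
    """Events enabled at `state` under the Table 2 rule.

    label: state -> robustness label
    post:  state -> {event: successor state}
    """
    strict = {s for s, lab in label.items() if lab == STRICT}
    rec = strict | {s for s, lab in label.items() if lab == REC}

    if label[state] == STRICT:
        allowed = strict      # successors must stay Strict
    elif label[state] == REC:
        allowed = rec         # successors must stay Recoverable
    else:
        return set()          # assumption: Non-robust enables nothing

    return {e for e, nxt in post.get(state, {}).items() if nxt in allowed}


# Hypothetical local detector and label table.
POST = {
    "y0": {"e1": "y1", "e4": "y3"},
    "y1": {"e2": "y2", "e5": "y4"},
    "y4": {"e3": "y2", "e6": "y3"},
}
LABEL = {
    "y0": STRICT,
    "y1": STRICT,
    "y2": STRICT,
    "y3": NON_ROBUST,
    "y4": REC,
}

at_y0 = enabled_events("y0", LABEL, POST)
at_y1 = enabled_events("y1", LABEL, POST)
at_y4 = enabled_events("y4", LABEL, POST)
at_y3 = enabled_events("y3", LABEL, POST)
```

Because the rule depends only on the label table, replacing `LABEL` with a table computed for a different unreliable-event set changes the enabled sets without touching `POST`.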
Table 3. Mapping of unreliable events to typical micro-manufacturing faults.

| Unreliable Event | Physical Fault in Micro-Manufacturing |
|---|---|
| $e_9$ | Tool jam/stage over-current |
| $e_{10}$ | Misalignment detected by vision |
| $e_{12}$ | Loss of vacuum/grip failure |
| $e_{13}$ | Vision reject after inspection |

Table 4. Robustness classification summary for each subsystem.

| Subsystem | Total States | Strictly Robust | Non-Robust |
|---|---|---|---|
| G1 | 11 | 8 | 3 |
| G2 | 11 | 8 | 3 |
| G3 | >30 | mixed | 3+ |
| G4 | 5 | 0 | 5 |

Table 5. Centralized vs. distributed structural performance comparison.

| Metric | Centralized | Distributed | Observation |
|---|---|---|---|
| Reachable States | 10 | 11 + 11 = 22 | Distributed slightly larger |
| Supervisor Reusability | No (monolithic) | Yes (modular) | Enables reuse |
| Incremental Integration | No | Yes | Distributed supports local addition |
| Supports $G_3$, $G_4$ Extension | No | Yes | Distributed scalable to larger systems |

Table 6. Qualitative comparison between centralized, static FTC, modular supervision, and the proposed distributed robust supervision.
Feature | Centralized | Static FTC | Modular Sup. | Proposed (Dist. Robust)
Unreliable Event Handling✓ (predefined)✓ (dynamic)
Recoverable Robustness
State-Space Scalability
Supervisor Reusability
Incremental Integration
Table 7. Structural size comparison: centralized reachable graph vs. sum of locals.

| | Centralized states $\lvert X \rvert$ | Centralized edges | $\sum_i n_i$ | $\sum_i m_i$ |
|---|---|---|---|---|
| Counts | 26 | 60 | 69 | 151 |

Explosion factors: $EF_{states} = 0.377$, $EF_{edges} = 0.397$.

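The reported explosion factors are consistent with dividing each centralized count by the corresponding summed local count. That reading is an assumption inferred from the numbers, not a definition quoted from the text. A short check, using the per-subsystem sizes $n_i$ and $m_i$ listed in Table 8 (variable names are illustrative):

```python
# Per-subsystem detector sizes as listed in Table 8 (subsystems 1 to 4).
local_states = [11, 11, 42, 5]   # n_i
local_edges = [16, 16, 113, 6]   # m_i

# Centralized reachable-graph sizes as listed in Table 7.
central_states = 26
central_edges = 60

sum_states = sum(local_states)
sum_edges = sum(local_edges)

# Assumed definition: centralized count / sum of local counts.
ef_states = round(central_states / sum_states, 3)
ef_edges = round(central_edges / sum_edges, 3)

print(sum_states, sum_edges, ef_states, ef_edges)
```

Both ratios round to the values reported in Table 7, and the sums match the $\sum_i n_i$ and $\sum_i m_i$ columns.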
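Table 8. Per-automaton structural workload (edge visits) and peak queues.

| $i$ | $n_i$ | $m_i$ | $\lvert B_{rel} \rvert$ | $c_i^{back}$ | $c_i^{all}$ | $c_i^{rel}$ | $q_i^{max}$ |
|---|---|---|---|---|---|---|---|
| 1 | 11 | 16 | 8 | 12 | 12 | 12 | 7 |
| 2 | 11 | 16 | 8 | 12 | 12 | 12 | 7 |
| 3 | 42 | 113 | 37 | 75 | 102 | 38 | 27 |
| 4 | 5 | 6 | 0 | 0 | 0 | 0 | 6 |
| | 69 | 151 | 53 | 99 | 126 | 62 | 37 |
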
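Table 9. Per-automaton label counts (Strict ⊆ Rec) and terminal set sizes.

| $i$ | $\lvert \mathrm{Strict}_i \rvert$ | $\lvert \mathrm{Rec}_i \setminus \mathrm{Strict}_i \rvert$ | $\lvert \mathrm{Rec}_i \rvert$ | $\lvert \text{Non-robust}_i \rvert$ | $\lvert Y_i^{end} \rvert$ |
|---|---|---|---|---|---|
| 1 | 8 | 0 | 8 | 8 | 3 |
| 2 | 8 | 0 | 8 | 8 | 3 |
| 3 | 19 | 18 | 37 | 37 | 26 |
| 4 | 0 | 0 | 0 | 5 | 1 |
| | 35 | 18 | 53 | 53 | 33 |
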
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dong, G.; Ming, Z.; Hu, H. Modular and Distributed Supervisory Control Framework for Intelligent Micro-Manufacturing Systems with Unreliable Events. Micromachines 2025, 16, 1076. https://doi.org/10.3390/mi16101076
