An Overview of the Application of Modern Statistical Techniques in Semiconductor Manufacturing

Chen, Hsuan-Yu; Chen, Chiachung

doi:10.3390/asi9040083

Open Access Review

by Hsuan-Yu Chen ¹ and Chiachung Chen ²,*

¹ Africa Industrial Research Center, National Chung Hsing University, Taichung 40227, Taiwan

² Department of Bio-Industrial Mechatronics Engineering, National Chung Hsing University, Taichung 40227, Taiwan

* Author to whom correspondence should be addressed.

Appl. Syst. Innov. 2026, 9(4), 83; https://doi.org/10.3390/asi9040083

Submission received: 3 March 2026 / Revised: 10 April 2026 / Accepted: 16 April 2026 / Published: 21 April 2026

(This article belongs to the Special Issue Feature Papers in the ‘Industrial and Manufacturing Engineering’ Section)

Abstract

The semiconductor industry has long relied on Statistical Process Control (SPC) for yield and reliability management. In early technology nodes, classic univariate tools such as Shewhart charts, cumulative sum (CUSUM) charts, exponentially weighted moving average (EWMA) charts, and the Cp/Cpk capability indices could effectively monitor a limited set of key variables. However, sub-5 nm and emerging 3 nm technologies have fundamentally changed the statistical environment. Advanced patterning, high-aspect-ratio etching, atomic layer deposition (ALD), chemical-mechanical polishing (CMP), and novel materials have drastically narrowed the process window. At these scales, nanometer-level deviations in critical dimensions (CD), overlay, or surface roughness can significantly impact yield. Simultaneously, modern wafer fabs generate massive amounts of high-frequency sensor data and high-dimensional metrology data. Traditional SPC assumptions, such as independence, normality, low dimensionality, and stationarity, often do not hold. Semiconductor data exhibit: (i) extremely high dimensionality and strong correlations among variables; (ii) a hierarchical structure encompassing fab → tool → chamber → recipe → batch → wafer → field; and (iii) metrology delays and sampling limitations leading to incomplete and asynchronous observations. To address these challenges, this paper reviews advanced statistical methods applicable to wafer fabrication. These methods include multivariate statistical process control (MSPC) approaches such as Hotelling T² statistics, PCA/PLS monitoring that combines T² and Q statistics, and contribution diagnostics, as well as time-series drift and change-point detection and Bayesian hierarchical modeling for uncertainty-aware monitoring in data-limited scenarios. Furthermore, we discuss how to integrate these methods with fault detection and classification (FDC), run-to-run (R2R) control, advanced process control (APC), and manufacturing execution systems (MES). This paper focuses on scalable, interpretable, and maintainable implementations that transform statistical analysis from a passive monitoring tool into an active component of data-driven fab control.

Keywords:

statistical techniques; Statistical Process Control; multivariate methods; semiconductor; Bayesian hierarchical modeling

1. Introduction

The semiconductor industry has long been at the forefront of applying statistical thinking to complex manufacturing systems [1,2,3,4,5]. Since statistical quality control was first applied to diffusion, oxidation, and thin-film processes in the mid-20th century, SPC has served as a cornerstone of yield management and reliability assurance [6,7]. From early planar technologies to today’s sub-5 nm and emerging 3 nm technology nodes, statistical methods have evolved with the increasing complexity of device architectures, materials, and equipment [6,8]. However, the statistical challenges faced by contemporary wafer fabrication plants (fabs) differ fundamentally from those of previous generations [6,7,8].

In the early days of integrated circuit (IC) manufacturing, processes were relatively simple, the number of critical-to-quality (CTQ) parameters was limited, and design margins were relatively generous [6,7]. Process engineers typically monitored a set of controllable variables, such as film thickness, sheet resistance, and junction depth, at a low sampling frequency [6,7]. In this context, classic univariate SPC tools, including Shewhart control charts, cumulative sum (CUSUM) charts, exponentially weighted moving average (EWMA) charts, and capability indices (Cp, Cpk), were usually sufficient to detect assignable causes and maintain acceptable quality levels [1,2,3,4,5]. Deviations were typically attributed to discrete equipment failures or operator-related factors, and corrective actions could be implemented locally with minimal impact on the system [5,6,7].

However, as Moore’s Law drives the continuous shrinking of device dimensions, the statistical landscape is also changing [6,8]. Advanced nodes introduce complex multi-patterning lithography, high-aspect-ratio etching, atomic layer deposition (ALD), and novel materials, such as high-k dielectrics and metal gates [6,8]. Process windows are shrinking dramatically: nanometer-level deviations in critical dimensions (CD), overlay accuracy, or line edge roughness can determine device yield and potentially trigger catastrophic losses [6,8]. Furthermore, advanced architectures such as FinFETs, gate-all-around transistors, and 3D NAND intensify interdependence between vertical and lateral processes, increasing cross-module variation propagation (e.g., lithography disturbances affecting downstream etch, deposition, and CMP) [6,8,9].

In this highly sensitive environment, the economic consequences of undetected deviations are becoming increasingly severe [6,8]. Subtle variations in lithography dose or plasma etch conditions, often hard to detect under traditional univariate limits, can degrade electrical performance across large wafer populations [6,7,8]. Given the high capital intensity of modern fabs and the high value of advanced-node wafers, even modest yield loss can translate into substantial economic impact; thus, monitoring has shifted from passively detecting large deviations to proactively identifying subtle, multivariable, and dynamic process changes [6,7,8,10,11].

Simultaneously, technological advancements in equipment, automation, and data infrastructure have driven unprecedented growth in the volume of available data. Modern semiconductor equipment carries sensor arrays that acquire temperature, pressure, flow, RF power, vibration, and endpoint signals at fine time resolution, while fab information systems increasingly integrate these streams with metrology and production context [6,8]. Metrology systems, such as CD-SEM, overlay tools, optical scatterometry, and e-beam inspection, produce high-dimensional spatial and within-wafer data, creating complex spatiotemporal structures across the wafer/lot/tool hierarchy [6,8]. The resulting manufacturing environment is characterized by high throughput, high speed, and diverse data modalities [6,8].

The surge in data volume presents both opportunities and challenges. On the one hand, massive amounts of sensor data enable early detection of tool degradation and drift [8]. On the other hand, traditional SPC assumptions—independence, normality, and low dimensionality—often fail, due to strong correlations, non-Gaussianity, multimodality, and temporal evolution [11,12]. Applying many univariate charts independently can inflate false alarms and obscure root causes, motivating multivariate and model-based monitoring [11,12,13].

Therefore, modern statistical techniques have become indispensable to semiconductor process engineering. Multivariate statistical process control (MSPC), including principal component analysis (PCA) and partial least squares (PLS), can monitor correlated sensor arrays while reducing dimensionality. In typical plasma etch and lithography applications, PCA-based monitoring reduces dimensionality by 70–95% (e.g., from 100–300 sensors to 5–15 principal components) while retaining 90–98% of total variance. This dimensionality reduction is reported to reduce false alarm rates by 30–50% and improve fault detection rates by 10–25% compared with univariate SPC [12,13,14,15,16].

Dynamic and time-series model extensions (e.g., dynamic PCA, multiway PCA/PLS, and latent-variable time-series models) capture temporal dependence and batch- or recipe-level structure. These methods typically reduce detection delay by 20–40% and improve sensitivity for early-stage drift detection by 15–30%, especially in run-to-run (R2R) controlled processes. For batch-like semiconductor steps (e.g., CMP or deposition), multiway models can improve fault classification accuracy from ~70–80% to 85–95% [15,16].

Advanced anomaly detection and fault detection/classification approaches, ranging from fault-specific statistical schemes to machine learning methods such as k-nearest neighbors (kNN) and Gaussian mixture models (GMM), help identify previously unseen failure modes and nonlinear interactions. In practice, kNN-based detectors achieve 85–95% detection accuracy with false alarm rates typically below 5–10%. GMM-based monitoring improves classification accuracy by 10–20% over linear MSPC in nonlinear processes. Hybrid ML-statistical approaches can detect previously unseen faults with 70–90% recall, compared to <60% for purely linear models [9,10,17]. Predictive modeling and virtual metrology (VM) further link equipment and in situ signals to wafer-level outputs, enabling feedforward/feedback action within advanced process control (APC) architectures. VM models (e.g., PLS regression, neural networks, or ensemble models) typically achieve the following [8,18,19,20]:

(a) Prediction accuracy (R²): 0.85–0.98 for CD, thickness, or etch rate.
(b) RMSE reduction: 20–50% compared with baseline empirical models.
(c) Metrology cycle time reduction: 30–70%, by replacing or reducing physical measurements.
(d) Yield improvement: 2–5% through faster corrective action and tighter control loops.

Importantly, the evolution of statistical practice in semiconductor manufacturing extends beyond algorithms into cross-system integration. Manufacturing execution systems (MES), fault detection and classification (FDC) platforms, run-to-run (R2R) controllers, and yield-management databases must interoperate to enable statistical insight to drive automated decision loops [8,20,21]. Standardized APC and CIM framework interfaces (e.g., SEMI CIM/APC component specifications) and factory-integration roadmaps reflect this shift toward coordinated, real-time, data-driven manufacturing ecosystems [8,20,21].

A suitable unified conceptual framework for this paper is: Process → Data → Model → Decision → Control → Learning/Update.

In semiconductor manufacturing, the process layer comprises tools, chambers, recipes, wafers, and metrology steps that introduce variation, drift, and faults. These operations produce the data layer, including high-frequency equipment sensors, wafer/lot context, metrology, electrical test, and maintenance/MES records. The model layer converts raw data into statistical insights using MSPC for multivariate monitoring, time-series models for drift and temporal patterns, Bayesian methods for uncertainty in sparse data, DOE/response surfaces for optimization, and FDC for equipment health diagnosis. Based on these outputs, the decision layer evaluates whether the risk is acceptable, whether an abnormality is emerging, and which action is economically justified, such as a lot hold, sampling escalation, recipe correction, maintenance trigger, or requalification. These decisions feed the control layer, where APC/R2R controllers, MES workflows, and engineering actions implement feedforward/feedback corrections to stabilize CD, overlay, thickness, etch rate, and yield. Finally, the system should include a learning/update loop, because semiconductor environments are non-stationary: maintenance, chamber aging, recipe changes, and node transitions require model recalibration, threshold adjustment, and periodic retraining. In this way, the paper’s individual methods are not isolated techniques but coordinated parts of a closed-loop manufacturing intelligence architecture. This interpretation is consistent with the emphasis throughout this Introduction on integration among MES, FDC, R2R, APC, and statistical monitoring, and with the Conclusion, which describes these methods as a “coherent ecosystem” and a “deeply integrated statistical architecture.”

Figure 1 contrasts traditional and modern statistical techniques in semiconductor manufacturing. Traditional methods rely on univariate SPC under simplifying assumptions and address localized faults with limited data. In contrast, modern approaches handle high-dimensional, correlated, and dynamic data using MSPC, time-series models, and advanced analytics. Integrated with APC, FDC, and MES, they enable predictive monitoring, system-level insight, and proactive, data-driven control.

Furthermore, the human factor remains crucial. Effective deployment of advanced statistical techniques requires collaboration among statisticians, process engineers, equipment engineers, and data scientists, with an emphasis on interpretability, robustness to tool-to-tool/chamber-to-chamber variation, scalability, and maintainability in high-throughput operations [6,11,12]. Models must remain reliable under tool matching, maintenance events, inter-chamber variability, and technology node transitions, which highlights that sustainable implementation is as critical as methodological novelty [6,8,11,20].

Against this backdrop, this paper reviews the application of modern statistical techniques in semiconductor manufacturing, with a focus on wafer fabrication. We explore fundamental principles, methodological advancements, and practical applications in multivariate monitoring, anomaly detection, predictive modeling, and integrated process control [6,8,11,12,21]. This paper emphasizes the unique challenges of the semiconductor environment—high-dimensionality, spatiotemporal structure, extreme sensitivity to change, and stringent reliability requirements—and aims to provide a coherent framework linking statistical theory with manufacturing practice [6,8,11,21].

This review provides a systematic integration of statistical methodologies for semiconductor manufacturing, unifying MSPC, time-series modeling, Bayesian inference, DOE, and FDC into a coherent, multi-layer analytical framework. Its key scientific contribution lies in moving beyond isolated methods to demonstrate how different statistical paradigms address complementary challenges—correlation, temporal dynamics, uncertainty, and equipment-level variability—within a closed-loop control architecture. By linking data characteristics to appropriate modeling strategies and decision mechanisms, the review establishes a structured foundation for next-generation, data-driven fabs that require robustness, adaptability, and interpretability.

2. Characteristics of Semiconductor Manufacturing Data

To understand why modern statistical methods are indispensable in wafer manufacturing, it is necessary to examine the inherent characteristics of semiconductor manufacturing data carefully. Unlike the relatively low-dimensional, independent, and regularly sampled measurement data in traditional large-scale production environments, semiconductor data are complex, multi-scale, sparse, and dynamically evolving [6,11,12]. These characteristics fundamentally challenge traditional statistical assumptions and have prompted the adoption of advanced modeling and monitoring frameworks [6,12,22].

Semiconductor manufacturing data are not only massive in volume but also highly complex in structure. They exhibit strong physical coupling, multi-layered structures, spatiotemporal dependencies, and operational constraints, including sampling costs and measurement delays [6,23,24,25,26,27]. Furthermore, the economic pressures of advanced process nodes underscore the importance of detecting subtle statistical shifts hidden within this complexity [6,27]. The following sections describe the key differences between semiconductor data and data encountered in simpler manufacturing systems.

2.1. High Dimensionality and Correlation

Modern semiconductor manufacturing processes generate extremely high-dimensional data streams. A single process unit (e.g., plasma etcher or CVD reactor) can generate hundreds of sensor variables during each wafer process, including pressure, gas flow, RF power, temperature, and endpoint signals [6,9,10,22]. These signals are often sampled at high frequencies, resulting in rich time trajectories rather than scalar summaries, and the dimensionality can quickly grow to thousands of variables across the line [6,9,10,22].

Metrology systems further complicate the data. For example, lithography performance cannot be fully described by a single metric such as critical dimension (CD). Engineers evaluate multiple interrelated properties, including CD mean and uniformity, overlay (X/Y), focus/exposure sensitivity, line edge roughness (LER), line width roughness (LWR), and within-wafer spatial signatures [6,23,24]. These metrics are coupled through optical, chemical, and mechanical interactions; consequently, changes in one factor (e.g., focus) can co-vary with CD/overlay, while resist/process conditions can influence CD uniformity and roughness [6,23].

Treating these variables individually with univariate control charts ignores the inherent correlation structure imposed by process physics. Monitoring many correlated variables independently increases false-alarm rates and can mask joint failure modes that manifest only in a multivariate space [11,12,13]. For example, coordinated moderate shifts across multiple related variables may evade univariate thresholds yet represent a significant multivariate deviation [12,13].

High-dimensional data can cause the “curse of dimensionality,” in which estimation becomes unstable when the number of variables approaches or exceeds the number of observations. This is common in fabs with many sensors but limited labeled excursions or metrology samples [9,10,22]. Therefore, dimensionality reduction, latent-variable modeling, and structured regularization become key tools for extracting meaningful patterns from correlated high-dimensional data [11,12,13,15].

2.2. Hierarchical Structure

Semiconductor manufacturing data possess an inherent hierarchical structure that reflects the physical and operational structure of a wafer fab. This hierarchy can be summarized as:

Factory → Tool → Chamber → Recipe → Batch → Wafer → Die/Device → Measurement Point [6,23].

Each level introduces distinct sources of variation, including tool calibration drift, chamber differences, recipe settings, batch history, wafer environment, and within-wafer spatial inhomogeneity [6,23,24].

Ignoring this hierarchical structure can obscure major variance components. Indiscriminate pooling across chambers/tools can cause systematic differences (e.g., chamber effects) to be misinterpreted as random noise, while tool-matching challenges require models that explicitly account for grouping and cross-level effects [6,23,25]. Cross-level interactions are common: chamber maintenance can shift wafer-level CD distributions, batch sequencing can induce autocorrelation, and spatial patterns may arise from temperature gradients or flow asymmetry [6,23,24].

Hierarchical statistical methods—such as mixed-effects models, variance component analysis, and hierarchical Bayesian frameworks—are valuable because they decompose variability at different levels. This supports targeted interventions, such as chamber tuning or recipe adjustments [23,25].
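As a minimal sketch of the variance decomposition described above, the following uses a balanced one-way random-effects layout (chambers as groups, wafers within chambers) and method-of-moments estimates. The CD readings, the chamber count, and the function name `variance_components` are illustrative assumptions, not data or code from the cited studies; a production analysis would more likely use a full mixed-effects or hierarchical Bayesian model.

```python
from statistics import mean

def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects layout.

    Returns (between-group variance, within-group variance).
    """
    k = len(groups)       # number of groups (e.g., chambers)
    n = len(groups[0])    # observations per group; balanced design assumed
    if any(len(g) != n for g in groups):
        raise ValueError("balanced groups required for this simple estimator")
    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # equals the overall mean when balanced
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))
    var_between = max(0.0, (ms_between - ms_within) / n)  # truncate at zero
    return var_between, ms_within

# Hypothetical CD readings (nm): three chambers, four wafers each.
cd_by_chamber = [
    [20.1, 20.3, 19.9, 20.1],
    [20.9, 21.1, 20.8, 21.2],
    [19.6, 19.4, 19.5, 19.5],
]
var_chamber, var_wafer = variance_components(cd_by_chamber)
share = var_chamber / (var_chamber + var_wafer)
print(f"chamber variance {var_chamber:.3f}, wafer variance {var_wafer:.3f}, "
      f"chamber share {share:.0%}")
```

In this made-up example almost all of the variance sits at the chamber level, which is the kind of result that would point toward chamber matching rather than recipe changes.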

Figure 2 illustrates the hierarchical structure of variability in semiconductor manufacturing, spanning from factory-level influences to measurement-point variability. Each level introduces distinct sources of variation, including tool drift, chamber differences, and wafer-level effects. The framework highlights the multi-scale nature of process variation, emphasizing the need for hierarchical modeling and integrated control to accurately diagnose, monitor, and optimize manufacturing performance.

2.3. Sparse and Delayed Measurements

Despite abundant sensor data, critical quality measurements in semiconductor manufacturing are often scarce and delayed. Metrology steps (CD-SEM, overlay, inspection, and electrical test) are expensive and time-consuming, making 100% measurement economically impractical; thus, fabs rely on sampling strategies that measure only subsets of wafers/lots/steps [27,28,29,30].

This sparsity creates several statistical challenges: reduced sensitivity to small drifts, potential sampling bias (especially under adaptive/non-uniform sampling), and measurement noise that can be non-negligible relative to nanoscale process variation [27,30]. Equally important is latency: many metrology results arrive hours or days after the relevant step, while production continues and drift may accumulate [27,28,29,30]. As a result, decisions must often be made under partial observability using surrogate variables (equipment sensors) or predictive models that infer quality before metrology returns [26,27,28,30].

Sparse and delayed feedback has therefore motivated predictive and feedforward control strategies, including virtual metrology (VM) and VM-enabled R2R control, which predict metrology targets from process/equipment context and provide timelier control actions [26,27,28,29,30]. Robust handling of missing data, irregular sampling, and asynchronous streams is essential for deploying these models reliably [26,28,29,30].
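The idea of virtual metrology can be sketched with a deliberately simple stand-in: an ordinary least-squares model (not PLS or a neural network) trained only on the sampled wafers and then used to predict the unmeasured ones. Everything below is synthetic and assumed for illustration, including the three sensor features, the 1-in-10 sampling plan, the generating coefficients, and the helper names `fit_linear_vm` and `predict_vm`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-wafer sensor summaries (three hypothetical features).
n_wafers = 200
sensors = rng.normal(size=(n_wafers, 3))
# Assumed ground truth, used only to generate illustrative thickness values.
thickness = (100.0 + 2.0 * sensors[:, 0] - 1.5 * sensors[:, 1]
             + rng.normal(scale=0.2, size=n_wafers))

# Only every 10th wafer is sent to metrology.
measured = np.zeros(n_wafers, dtype=bool)
measured[::10] = True

def fit_linear_vm(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_vm(coef, X):
    """Predict the metrology target from sensor features."""
    return coef[0] + X @ coef[1:]

coef = fit_linear_vm(sensors[measured], thickness[measured])
pred = predict_vm(coef, sensors[~measured])
rmse = float(np.sqrt(np.mean((pred - thickness[~measured]) ** 2)))
spread = float(np.std(thickness[~measured]))
print(f"VM RMSE on unmeasured wafers: {rmse:.2f} (raw spread {spread:.2f})")
```

The sketch ignores the issues the text stresses for real deployments, namely missing values, irregular sampling, asynchronous streams, and model drift, each of which would need explicit handling.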

2.4. Non-Normal and Non-Stationary Behavior

Traditional SPC often assumes normality, independence, and stationarity; however, semiconductor data frequently violate these assumptions [11,12]. Non-Gaussian behavior is common: particle and defect counts can be over-dispersed and/or zero-inflated, while other quality metrics may be skewed or heavy-tailed depending on mechanisms and sampling/inspection regimes [31,32].
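Two quick diagnostics make the over-dispersion and zero-inflation described above concrete. The sketch below compares a set of counts against what a Poisson model with the same mean would imply; the particle counts and the function names are hypothetical, chosen only to illustrate the calculation.

```python
from math import exp
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson counts."""
    return variance(counts) / mean(counts)

def excess_zero_ratio(counts):
    """Observed zero fraction relative to the zero fraction implied by a
    Poisson distribution with the same mean."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed / exp(-mean(counts))

# Hypothetical per-wafer particle counts: mostly clean wafers, a few bursts.
particles = [0, 0, 1, 0, 0, 12, 0, 2, 0, 0, 0, 25, 1, 0, 0, 3]
print(f"dispersion index:  {dispersion_index(particles):.1f}")
print(f"excess-zero ratio: {excess_zero_ratio(particles):.1f}")
```

Values well above 1 for either quantity suggest that a Poisson-based c-chart would be miscalibrated and that a negative binomial or zero-inflated model may be a better starting point.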

Nonstationarity is also pervasive. Equipment ages and degrades; preventive maintenance resets baselines; recipes and targets evolve; materials and suppliers change; and product mix shifts—all of which alter the underlying data distributions [6,11,28]. Drift can be gradual, while abrupt shifts can occur after cleaning or hardware replacement; recipe transitions may change sensor regimes and degrade static FDC models unless adaptation is used [28,33]. Environmental factors (e.g., humidity/temperature) and scheduling/queuing can further induce time-varying correlations [6,11].

Nonstationarity challenges static control limits and fixed baselines. Therefore, models must support time-varying parameters, adaptive thresholds, and drift detection—often via dynamic latent-variable monitoring (e.g., dynamic PCA), moving-window/recursive updates, and online learning frameworks [15,28,34].
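A minimal sketch of the moving-window idea follows, assuming a single monitored variable, a 30-run window, and 3-sigma limits recomputed from that window. The signal is synthetic (a slow ramp, Gaussian noise, and one injected spike), and the function names are illustrative. Flagged points are kept out of the window so that they do not contaminate the baseline, which is one possible design choice among several.

```python
import random
from collections import deque
from statistics import mean, stdev

def moving_window_alarms(values, window=30, k=3.0):
    """Flag points outside mean +/- k*sd of the most recent unflagged window."""
    recent = deque(maxlen=window)
    alarms = []
    for i, x in enumerate(values):
        if len(recent) == window:
            center, spread = mean(recent), stdev(recent)
            if abs(x - center) > k * spread:
                alarms.append(i)
                continue  # keep flagged points out of the baseline
        recent.append(x)
    return alarms

def fixed_limit_alarms(values, n_baseline=30, k=3.0):
    """Same rule, but with limits frozen from the first n_baseline points."""
    base = values[:n_baseline]
    center, spread = mean(base), stdev(base)
    return [i for i, x in enumerate(values[n_baseline:], start=n_baseline)
            if abs(x - center) > k * spread]

# Synthetic signal: slow drift plus noise, with one abrupt excursion at run 200.
rng = random.Random(0)
signal = [0.01 * i + rng.gauss(0.0, 0.2) for i in range(300)]
signal[200] += 2.0

adaptive = moving_window_alarms(signal)
fixed = fixed_limit_alarms(signal)
print(f"adaptive limits: {len(adaptive)} alarms; fixed limits: {len(fixed)} alarms")
```

The adaptive limits follow the slow ramp and stay quiet about it, which keeps nuisance alarms down but also means the drift itself goes unreported; a separate drift detector is still needed. Note also that after a genuine sustained step change this exclusion rule would keep alarming until the baseline is explicitly reset.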

Collectively, high dimensionality, correlation, hierarchical structure, sparse and delayed measurements, and non-stationary behavior define the complexity of semiconductor data and explain why simple, independent, static tools are inadequate for modern wafer fabrication [6,11,12,27]. Instead, effective process control requires a multivariate, hierarchical, dynamic, and adaptive statistical framework that can handle structure, uncertainty, and continuous evolution [6,11,12,28].

In summary, semiconductor manufacturing data are highly complex, characterized by high dimensionality, strong correlations, hierarchical structure, sparsity, and nonstationarity. Processes generate numerous interdependent variables with spatiotemporal dynamics, making univariate methods insufficient. Hierarchical structures—from the factory to the measurement point—introduce multilevel variability that requires advanced modeling. Despite abundant sensor data, critical measurements are sparse and delayed, necessitating predictive approaches such as virtual metrology. Additionally, data often violate assumptions of normality and stationarity due to drift, maintenance, and process changes. These features demand multivariate, adaptive, and hierarchical statistical frameworks that can handle uncertainty, dynamic behavior, and complex system interactions for effective monitoring and control.

3. Multivariate Statistical Process Control

Semiconductor manufacturing data are inherently high-dimensional, strongly correlated, hierarchical, and non-stationary [6,35]. These characteristics pose a fundamental challenge to traditional univariate statistical process control (SPC) frameworks [5,35]. Monitoring each quality variable individually cannot capture the physical coupling relationships inherent in advanced processes. Therefore, MSPC has become a core methodological foundation of modern wafer manufacturing, enabling simultaneous monitoring of multiple interrelated process and quality characteristics [6,35].

MSPC extends the classic SPC concept to a multidimensional space, evaluating the joint distribution of variables rather than assessing each variable independently [5,35]. By explicitly considering correlation structures and covariance patterns, MSPC combines statistical monitoring with the coupled physics of process/tool behavior [6,35]. At advanced technology nodes, even subtle coordinated drifts can significantly impact yield, making this joint perspective not only beneficial but often crucial [6,35].

3.1. Motivation for Adopting a Multivariate Approach

Traditional SPC tools (e.g., Shewhart, CUSUM, EWMA charts) typically monitor individual variables under the implicit assumption of independence [5]. However, in semiconductor manufacturing, this assumption rarely holds because process variables are physically and operationally linked. In photolithography, CD, overlay, focus, and dose interact; in plasma etching, RF power, pressure, gas chemistry, and endpoint co-evolve; in CMP, downforce, slurry flow, pad condition, and removal rate interact; and electrical test parameters form interrelated vectors rather than isolated metrics [6,35].

Independent monitoring of many correlated variables creates two key issues. First, multiple charts running simultaneously inflate the overall false-alarm probability under correlation, even when each chart is tuned to a low Type I error rate [5,35]. Second, moderate but coordinated shifts across several variables can evade univariate limits, delaying detection until yield loss becomes visible [5,35]. Multivariate approaches address these problems by modeling the measurement vector relative to a historical multivariate baseline—i.e., testing whether joint behavior remains consistent with normal operating conditions [5,35].
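The first issue can be illustrated with simple arithmetic. The sketch below assumes independent charts purely to keep the calculation closed-form; by Šidák's inequality, for jointly normal variables with two-sided limits this independent-case value is an upper bound on the family-wise rate, so the correlated case is no worse than the figures printed here, although it remains above the per-chart rate. The function names and the chart counts are illustrative.

```python
def familywise_false_alarm(alpha: float, n_charts: int) -> float:
    """Probability of at least one false alarm per sampling instant
    across n_charts independent charts, each with false-alarm rate alpha."""
    return 1.0 - (1.0 - alpha) ** n_charts

def per_chart_alpha(target: float, n_charts: int) -> float:
    """Per-chart alpha that yields the target family-wise rate
    (Sidak correction, independent charts assumed)."""
    return 1.0 - (1.0 - target) ** (1.0 / n_charts)

alpha = 0.0027  # nominal rate of a two-sided 3-sigma Shewhart chart under normality
for m in (1, 10, 50, 200):
    print(f"{m:>3} charts -> family-wise rate {familywise_false_alarm(alpha, m):.3f}")

print(f"per-chart alpha for 200 charts at a 0.0027 family-wise target: "
      f"{per_chart_alpha(0.0027, 200):.2e}")
```

Tightening each chart to hold the family-wise rate, as in `per_chart_alpha`, widens the individual limits and further reduces sensitivity to the coordinated moderate shifts described as the second issue.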

MSPC also strengthens diagnosis. Decomposition and contribution analyses help identify which variables (or blocks of variables) drive a multivariate alarm, supporting root-cause analysis in complex tools and integrated modules [34,35].

3.2. Hotelling T² Control Chart

Hotelling’s T² control chart is a foundational tool for multivariate monitoring [36]. The T² statistic is the squared Mahalanobis distance between an observed vector and the historical mean, in which deviations are weighted by the inverse of the covariance matrix [37]. This generalizes the univariate standardized distance to a multivariate setting and naturally accounts for correlation: directions with higher natural variability contribute less than those with tighter control [36,37].
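A small numerical sketch shows how the statistic treats correlation. It assumes two standardized metrics with a known baseline mean of zero and a known correlation of 0.9; both numbers are hypothetical. With known parameters and two variables, T² follows a chi-square distribution with 2 degrees of freedom, whose upper quantile is -2 ln(alpha); when the mean and covariance are estimated from data, an F-based limit would be used instead.

```python
import math
import numpy as np

def hotelling_t2(x, mean, cov):
    """Squared Mahalanobis distance of observation x from the baseline mean."""
    d = np.asarray(x, dtype=float) - mean
    # Solve cov @ z = d rather than forming the explicit inverse.
    return float(d @ np.linalg.solve(cov, d))

# Hypothetical baseline: two standardized, strongly correlated metrics.
mean = np.zeros(2)
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

# Chi-square (2 df) upper limit at the conventional 0.0027 false-alarm rate.
limit = -2.0 * math.log(0.0027)

along = hotelling_t2([2.0, 2.0], mean, cov)     # consistent with the correlation
against = hotelling_t2([2.0, -2.0], mean, cov)  # breaks the correlation

print(f"limit {limit:.2f}")
print(f"T2 along the correlation:   {along:.2f}")
print(f"T2 against the correlation: {against:.2f}")
```

Both test points sit only 2 sigma from the mean on each axis, so neither would trip a univariate 3-sigma chart, yet the point that contradicts the baseline correlation far exceeds the multivariate limit.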

In semiconductor applications, T² charts are used to jointly monitor vectors such as CD/overlay-related metrics, electrical parametric test vectors, multilayer film thickness measurements, and sensor-summary health indicators [38,39,40]. Their appeal lies in conceptual clarity and compressing multivariate deviation into a single scalar statistic under multivariate normality assumptions [5,36].

However, T² has practical limits in high dimensions. Reliable covariance estimation requires sufficient historical data; when the number of variables approaches or exceeds the sample size, covariance estimates become unstable or singular, and the assumptions of stationarity and multivariate normality may be violated [5,35,38]. Thus, direct T² monitoring is commonly restricted to moderate-dimensional vectors or applied after dimensionality reduction [5,35].

3.3. MSPC Based on Principal Component Analysis

To address high dimensionality and multicollinearity, PCA-based MSPC is among the most widely used approaches [13,35]. PCA transforms correlated variables into orthogonal principal components (latent variables) that capture dominant variation patterns; in fabs, these often correspond to meaningful physical factors (e.g., chamber drift, systematic overlay bias, shared endpoint trajectory variation) [6,35].

PCA-based MSPC typically monitors two complementary statistics. The T² statistic in the score (PC) space detects shifts along the directions of dominant historical variability, while the Q statistic (squared prediction error; SPE) monitors residual variation orthogonal to the PCA model—capturing behaviors not explained by established patterns [13]. Together, T² and Q provide broad coverage for both “known-variation” shifts and novel anomalies [13,35].
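The complementary roles of T² and Q can be sketched on synthetic data. The example below assumes five sensors driven by one latent factor plus small noise, autoscaling before PCA, a one-component model, and empirical 99th-percentile limits taken from the training data; these choices, the loadings in `weights`, and the function names are illustrative assumptions rather than a recommended configuration.

```python
import numpy as np

def fit_pca_monitor(X, n_components):
    """Fit a PCA monitoring model on in-control data X (rows = observations)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                             # autoscale each variable
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                   # p x k loading matrix
    eigvals = s[:n_components] ** 2 / (len(X) - 1)   # variances of the scores
    return mean, std, loadings, eigvals

def t2_and_q(x, model):
    """Return (T2 in the score space, Q = squared prediction error)."""
    mean, std, loadings, eigvals = model
    z = (np.asarray(x, dtype=float) - mean) / std
    scores = loadings.T @ z
    t2 = float(np.sum(scores ** 2 / eigvals))
    residual = z - loadings @ scores                 # part the model cannot explain
    return t2, float(residual @ residual)

# Synthetic in-control data: one latent factor drives five hypothetical sensors.
rng = np.random.default_rng(0)
weights = np.array([1.0, 0.8, -0.6, 0.9, 0.7])
factor = rng.normal(size=500)
X = factor[:, None] * weights + 0.1 * rng.normal(size=(500, 5))

model = fit_pca_monitor(X, n_components=1)
train = np.array([t2_and_q(row, model) for row in X])
t2_limit, q_limit = np.percentile(train, 99, axis=0)  # empirical limits

shift = t2_and_q(4.0 * weights, model)                # large move along the known pattern
broken = t2_and_q([0.0, 0.0, 1.0, 0.0, 0.0], model)   # one sensor breaks the pattern

print(f"limits: T2 {t2_limit:.2f}, Q {q_limit:.3f}")
print(f"shift along pattern: T2 {shift[0]:.2f}, Q {shift[1]:.3f}")
print(f"broken correlation:  T2 {broken[0]:.2f}, Q {broken[1]:.3f}")
```

The first test point is an unusually large excursion of the familiar pattern and is caught by T², while the second stays modest on every individual sensor but violates the learned correlation and is caught by Q.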

In semiconductor practice, PCA-based MSPC is used to monitor post-lithography metrology, plasma tool data (including multiblock structures), etch/deposition sensor trajectories, and other multivariate indicators; contribution analysis enhances interpretability by linking alarms to original variables or blocks [40].

3.4. Expansion and Practical Considerations

Classic PCA-based monitoring often assumes linear structure and stable baselines, but semiconductor processes exhibit nonlinear coupling and evolving operating regimes [6,35,41]. Kernel PCA extends PCA by nonlinear mapping into a feature space, enabling monitoring of curved manifolds and nonlinear relationships [42,43]. Adaptive and recursive PCA approaches update the model online to accommodate gradual drift, while moving-window strategies re-estimate models from recent data to maintain sensitivity under changing conditions [41].

Implementation success depends on governance as much as statistics. Baseline (NOC) selection must exclude latent anomalies to avoid contamination; recalibration is required after changes to hardware, recipes, or technology; false alarms must be controlled to maintain trust; computational efficiency is critical for high-frequency data; and interpretability ensures that alarms lead to actionable diagnostics rather than opaque scores [5,6,40,43].

Figure 3 presents the MSPC framework for semiconductor manufacturing, highlighting its role in handling high-dimensional, correlated, and non-stationary data. It integrates Hotelling’s T², PCA-based monitoring, and advanced extensions to detect both dominant and novel variations. With proper implementation and governance, MSPC enhances early fault detection, supports root-cause diagnosis, and enables data-driven decision-making in complex manufacturing systems.

In summary, MSPC provides a unified framework to address the high dimensionality, correlation, and nonstationarity of semiconductor data by modeling variables jointly rather than independently. Classical tools like Hotelling’s T² offer intuitive multivariate monitoring but are limited in high dimensions, motivating PCA-based approaches that reduce dimensionality while preserving key structures of variation. Extensions such as kernel and adaptive PCA further accommodate nonlinear behavior and evolving baselines. Critically, MSPC not only improves detection sensitivity and reduces false alarms but also enhances diagnosis through contribution analysis, making it both a statistical and engineering tool essential for robust, interpretable, and adaptive process control in advanced fabs [6,35,40,41].

4. Time-Series Modeling and Drift Detection

While MSPC can handle correlations and high-dimensional problems, semiconductor manufacturing still requires explicit handling of temporal behavior. The data generated during wafer fabrication are inherently sequential: wafers are processed chronologically, chambers gradually age, preventive maintenance resets equipment baselines, and environmental fluctuations introduce time-varying patterns [6,44,45]. Many yield losses arise not from sudden failures but from gradual drift that accumulates over multiple runs, a problem closely related to R2R control in fabs [44,45].

Therefore, time-series modeling and drift detection are important complements to multivariate monitoring. These methods explicitly incorporate time dependence, autocorrelation, and dynamic evolution [46,47]. By modeling process behavior changes over time, they can identify subtle degradation mechanisms earlier and support more precise, timely intervention strategies in advanced manufacturing control loops [44,45].

4.1. Drift Characteristics of Semiconductor Manufacturing Processes

In advanced semiconductor manufacturing, catastrophic hardware failures are relatively rare compared with gradual performance degradation, which often appears as small deviations that slowly push processes toward specification limits [6,44,45]. Multiple mechanisms contribute, including chamber contamination buildup, CMP consumable wear, sensor calibration drift, and lithography optical degradation, all of which can subtly affect CD and overlay [6,46,47]. These effects often manifest as slow monotonic trends or low-frequency oscillations rather than step changes [44,45,46].

Such drift can remain within univariate limits for long periods (e.g., CD shifts of a few nanometers per lot), yet, cumulatively, become yield-relevant after many runs [6,44]. Wafer-to-wafer noise can mask drift, so distinguishing true degradation from random variation requires time-aware statistical methods rather than independence assumptions [48,49].

4.2. Exponentially Weighted Moving Average and Cumulative Sum Chart

Classic Shewhart charts are effective for large, abrupt shifts but relatively insensitive to small sustained deviations [5]. To improve sensitivity to slow drift, fabs widely use EWMA and CUSUM charts [2,3].

EWMA applies exponentially decaying weights to past observations, with a smoothing parameter controlling the memory length; smaller smoothing parameters emphasize long-term accumulation and increase sensitivity to slow drift [3]. In advanced nodes, sub-nanometer-level shifts in CD, overlay, film thickness, or etch rate can affect yield, and EWMA charts provide practical early warning by filtering short-term noise while retaining underlying trends [3,6,50].
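As a minimal sketch of the EWMA recursion described above, the following Python fragment computes the statistic z_t = λ·x_t + (1 − λ)·z_{t−1} with its exact time-varying control limits. The CD readings, the target, the assumed in-control standard deviation, and the choices λ = 0.2 and L = 3 are illustrative assumptions, not values from the reviewed literature.

```python
import math

def ewma_chart(x, target, sigma, lam=0.2, L=3.0):
    """Return a list of (z, lower, upper, alarm) for each observation."""
    z = target                      # start the EWMA at the process target
    out = []
    for t, xt in enumerate(x, start=1):
        z = lam * xt + (1.0 - lam) * z
        # Exact (time-varying) standard deviation of the EWMA statistic
        sd = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        lo, hi = target - L * sd, target + L * sd
        out.append((z, lo, hi, not (lo <= z <= hi)))
    return out

# Hypothetical CD readings (nm): 10 in-control runs, then a sustained +0.5 nm shift
cd = [20.0 + d for d in (0.1, -0.2, 0.0, 0.15, -0.1, 0.05, -0.05, 0.1, -0.15, 0.0)]
cd += [20.5 + d for d in (0.1, -0.1, 0.0, 0.05, -0.05, 0.1, 0.0, -0.1, 0.05, 0.0)]

result = ewma_chart(cd, target=20.0, sigma=0.2, lam=0.2, L=3.0)
first_alarm = next((i for i, r in enumerate(result) if r[3]), None)
print("first alarm at run index:", first_alarm)
```

In this toy series the shifted readings stay at or inside a 3-sigma Shewhart band (20.0 ± 0.6 nm), while the EWMA statistic leaves its narrower limits a few runs after the shift begins.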

CUSUM accumulates deviations from a target, so persistent small biases create a steadily growing statistic that eventually crosses a decision threshold [2]. This makes CUSUM particularly effective for detecting small but systematic shifts in metrics such as overlay alignment, etch behavior, or electrical parametric vectors [2,6,50].
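The accumulation mechanism can be sketched with a two-sided tabular CUSUM. The overlay values below are hypothetical, and the reference value k = 0.5 and decision interval h = 5 (both in units of an assumed unit standard deviation) are conventional textbook settings rather than recommendations from the cited studies.

```python
def tabular_cusum(x, target, k, h):
    """Two-sided tabular CUSUM; returns a list of (c_plus, c_minus, alarm)."""
    c_plus = c_minus = 0.0
    out = []
    for xt in x:
        # Accumulate only the part of each deviation that exceeds the slack k
        c_plus = max(0.0, c_plus + (xt - target) - k)
        c_minus = max(0.0, c_minus - (xt - target) - k)
        out.append((c_plus, c_minus, c_plus > h or c_minus > h))
    return out

# Hypothetical overlay error (nm): 8 in-control runs, then a persistent bias near +1
overlay = [0.3, -0.5, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2,
           1.2, 0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.1, 0.9, 1.0, 1.2]

chart = tabular_cusum(overlay, target=0.0, k=0.5, h=5.0)
first_alarm = next((i for i, c in enumerate(chart) if c[2]), None)
print("first alarm at run index:", first_alarm)
```

No single reading here exceeds 1.3 nm, yet the upper sum grows by roughly 0.5 per run after the bias appears and eventually crosses h.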

However, EWMA/CUSUM are typically derived under relatively stable baseline conditions and often assume weak autocorrelation (or require explicit handling of it) [2,3,5]. In environments with strong time dependence, regime changes, or structural drift, richer time-series models can be needed for robust monitoring [48,49].

4.3. Autoregressive Model and State-Space Model

Beyond control charts, explicit time-series models provide a richer characterization of process dynamics by modeling dependence on previous observations and/or latent states [48,49].

ARIMA models (Box–Jenkins methodology) represent a variable as a function of past values and past forecast errors and are widely used for autocorrelated sequences [51]. In fabs, ARIMA-style residual monitoring can be used: predictable temporal patterns are modeled, and deviations from predictions are treated as anomalies [51]. Their linear structure, however, can limit the representation of nonlinear interactions and multivariate coupling common in complex tools unless extended [46,47].
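The residual-monitoring idea can be illustrated with the simplest member of the ARIMA family, an AR(1) model fitted by least squares. This is a deliberately reduced sketch: the simulated series, the injected excursion, and the 3-sigma residual limit are assumptions for illustration, and a production system would normally rely on a full ARIMA implementation with order selection.

```python
import random
import statistics

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi, resid_sd)."""
    prev, curr = x[:-1], x[1:]
    mp, mc = statistics.fmean(prev), statistics.fmean(curr)
    sxx = sum((p - mp) ** 2 for p in prev)
    sxy = sum((p - mp) * (q - mc) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mc - phi * mp
    resid = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, statistics.pstdev(resid)

def residual_alarms(x, c, phi, sd, L=3.0):
    """Indices t whose one-step-ahead prediction residual exceeds L * sd."""
    return [t for t in range(1, len(x))
            if abs(x[t] - (c + phi * x[t - 1])) > L * sd]

# Simulated autocorrelated sensor summary (hypothetical), true phi = 0.7
rng = random.Random(0)              # fixed seed so the sketch is reproducible
x = [0.0]
for _ in range(499):
    x.append(0.7 * x[-1] + rng.gauss(0.0, 1.0))

c, phi, sd = fit_ar1(x)
monitored = list(x)
monitored[250] += 8.0               # inject a hypothetical excursion
alarms = residual_alarms(monitored, c, phi, sd)
print("estimated phi:", round(phi, 3), "alarm indices:", alarms)
```

Charting residuals rather than raw values removes the predictable autocorrelation, so the limits apply to what the model cannot explain.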

State-space models generalize time-series modeling by linking measurements to latent process states that evolve dynamically; the Kalman filter provides a recursive estimator that fuses prior state estimates with new noisy measurements [52,53,54]. This framework aligns with semiconductor R2R/APC, where control actions (e.g., lithography dose updates, CMP compensation, etch correction) are computed from estimated tool or chamber states and updated with new feedback [44,45,47]. State-space methods explicitly model process and measurement noise, adapt smoothly to gradual drift, and integrate with feedback/feedforward architectures—bridging statistical inference and control engineering in adaptive manufacturing systems [52,53,54].
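A minimal sketch of the recursive estimator, assuming the simplest state-space form (a scalar local-level model in which the latent tool state follows a random walk and is observed with noise). The variances q and r, the initial values, and the film-thickness series are hypothetical.

```python
def kalman_local_level(y, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[t] = x[t-1] + w[t], y[t] = x[t] + v[t].

    q: process-noise variance, r: measurement-noise variance.
    Returns a list of (state_estimate, estimate_variance) after each update.
    """
    x, p = x0, p0
    est = []
    for yt in y:
        p = p + q                   # predict: random-walk state, variance grows by q
        k = p / (p + r)             # Kalman gain
        x = x + k * (yt - x)        # update with the innovation (yt - x)
        p = (1.0 - k) * p           # posterior variance
        est.append((x, p))
    return est

# Hypothetical film thickness (nm): slow upward drift plus alternating measurement error
thickness = [100.0 + 0.05 * t + (0.4 if t % 2 else -0.4) for t in range(40)]
est = kalman_local_level(thickness, q=0.01, r=0.16, x0=100.0, p0=1.0)
print("last estimate:", round(est[-1][0], 3), "variance:", round(est[-1][1], 4))
```

The ratio q/r acts as the tuning knob: a larger q makes the filter follow drift more quickly, while a larger r makes it smooth more heavily.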

4.4. Change Point Detection

While incremental drift is common, discrete structural changes occur after maintenance, recipe/formulation modifications, hardware replacements, or supplier/material changes [6,44,45]. Change point detection aims to localize the time at which statistical properties change (mean/variance/covariance or full distribution), supporting root-cause diagnosis, data segmentation for model training, and ensuring model validity across regimes [55,56,57,58,59,60].

Classical and modern approaches include likelihood-based/sequential procedures, Bayesian segmentation and online change point inference, and nonparametric tests [55,56,57,58,59]. In fab practice, change point analysis links yield excursions to maintenance events, detects chamber degradation onset, and avoids pooling heterogeneous data that could reduce sensitivity or bias baselines [6,44,45,55].
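As one simple instance of the likelihood-based family, the sketch below locates a single mean shift by choosing the split that minimizes the two-segment sum of squared errors (equivalent to maximum likelihood for a Gaussian mean shift with constant variance). The etch-rate values and the minimum segment length are hypothetical, and the function does not by itself test whether a change is statistically significant.

```python
def best_mean_split(x, min_seg=3):
    """Return (tau, gain): x[:tau] and x[tau:] are the two segments that
    minimize total SSE; gain is the SSE reduction versus a single mean."""
    n = len(x)
    total = sum(x)
    total_sq = sum(v * v for v in x)
    sse_all = total_sq - total * total / n
    best_tau, best_sse = None, float("inf")
    left_sum = left_sq = 0.0
    for tau in range(1, n):
        v = x[tau - 1]
        left_sum += v
        left_sq += v * v
        if tau < min_seg or n - tau < min_seg:
            continue                # keep both segments long enough to estimate a mean
        right_sum, right_sq = total - left_sum, total_sq - left_sq
        sse = (left_sq - left_sum ** 2 / tau) + (right_sq - right_sum ** 2 / (n - tau))
        if sse < best_sse:
            best_tau, best_sse = tau, sse
    return best_tau, sse_all - best_sse

# Hypothetical etch rate (nm/min) before and after a preventive-maintenance event
etch_rate = [50.1, 49.8, 50.2, 49.9, 50.0, 50.1, 49.9, 50.0,
             51.0, 51.2, 50.9, 51.1, 51.0, 50.8, 51.1]
tau, gain = best_mean_split(etch_rate)
print("estimated change point index:", tau)
```

The estimated index can then be compared with maintenance logs, and baselines can be re-estimated separately on each side of the split instead of pooling both regimes.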

Overall, time-series modeling and drift detection are essential in semiconductor manufacturing because of gradual degradation, autocorrelation, and evolving baselines [6,44,45]. EWMA and CUSUM detect small shifts, ARIMA and state-space models enable prediction and residual monitoring, and change-point methods identify structural transitions [2,3,51,52,53,54,55,56,57,58,59,60]. Together, these methods enable earlier intervention, stronger diagnosis, and sustained stability in advanced wafer fabrication [6,44,45,55].

Table 1 lists the time-series modeling and drift-detection methods used in semiconductor manufacturing.

In summary, time-series modeling complements MSPC by explicitly capturing temporal dependence, gradual drift, and regime evolution inherent in semiconductor processes. Simple tools such as EWMA and CUSUM provide practical sensitivity to small, persistent shifts. At the same time, ARIMA and state-space models enable predictive monitoring and integration with APC through dynamic state estimation. Change-point detection further enhances robustness by identifying structural transitions linked to maintenance or process changes. Together, these approaches transform raw sequential data into actionable insights, enabling earlier fault detection, improved diagnosis, and tighter control, thereby supporting stable, high-yield manufacturing under continuously evolving process conditions.

Table 2 lists the quantitative criteria and a direct comparison of the methods.

The practical ranking by criterion is as follows:

1. For small sustained drift: CUSUM > EWMA > Kalman > ARIMA > Change-point
2. For noisy measurements with evolving baseline: Kalman > ARIMA > EWMA > CUSUM > Change-point
3. For abrupt post-maintenance or recipe changes: Change-point > CUSUM > EWMA > Kalman > ARIMA
4. For forecasting and residual-based monitoring: Kalman ≈ ARIMA > EWMA > CUSUM > Change-point
5. For simplicity and shop-floor deployment: EWMA ≈ CUSUM > Change-point > ARIMA > Kalman

5. Bayesian Statistical Methods

As semiconductor processes become more complex and capital-intensive, statistical methods must handle sparse data, high uncertainty, and multilevel variability while leveraging process knowledge. Bayesian methods provide a flexible framework by treating parameters as random variables and updating beliefs with new data, which makes them well suited to sequential, information-rich yet data-limited wafer manufacturing [6,61].

Unlike frequentist approaches that rely heavily on large-sample approximations and fixed-parameter assumptions, Bayesian inference can explicitly quantify uncertainty and incorporate prior knowledge [61,62]. This capability is particularly valuable in semiconductor manufacturing, where physical understanding of tool behavior, process physics, and historical baselines is often available even when current-node data are sparse (e.g., early ramp) [6,61].

5.1. Theoretical Basis of Bayesian Methods

Several structural characteristics of semiconductor manufacturing make Bayesian methods especially suitable. A key driver is metrology sparsity: at advanced nodes, only a small fraction of wafers receive detailed CD-SEM/overlay/e-test due to cost and capacity constraints [6]. With small sample sizes, classical estimators can be unstable; Bayesian methods improve stability by combining limited data with priors that reflect historical baselines or engineering expectations [61,63].

Uncertainty quantification is another motivation. Bayesian analysis supports decision-making by using posterior distributions (not just point estimates), allowing engineers to reason about the probability that CD shift, etch-rate drift, or overlay bias exceeds risk thresholds—which is important when economic stakes are high [61,62]. The sequential nature of wafer processing aligns naturally with Bayesian updating. Each new batch/metrology result recursively refines the posterior belief about the current process state, enabling integration with monitoring and APC loops [6,61].

5.2. Bayesian Control Chart

Traditional control charts rely on fixed parameter estimates from historical “in-control” data; these can be fragile when measurement frequency is low or operating conditions evolve [5,6]. Bayesian control charts extend SPC by placing priors on parameters (e.g., mean/variance) and updating them with each new observation [63,64,65,66].

In Bayesian mean monitoring, the process mean is treated as random; as new CD or overlay data arrive, the posterior distribution is updated, and monitoring can be framed as the posterior probability of exceeding a specification or risk threshold (risk-based alarming rather than fixed limits) [66,67,68]. This structure improves robustness under small samples and supports dynamic adaptation as data accumulate [66,67,68].
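A minimal sketch of risk-based alarming, assuming a static process mean with a conjugate normal prior and a known measurement variance. The prior, the measurement variance, the 20.5 nm risk threshold, the 0.95 alarm probability, and the readings are all hypothetical; real deployments would also need to model drift in the mean.

```python
import math

def update_normal_mean(mu0, tau0_sq, y, sigma_sq):
    """Conjugate update of a N(mu0, tau0_sq) prior on the mean, given one
    observation y with known measurement variance sigma_sq."""
    post_var = 1.0 / (1.0 / tau0_sq + 1.0 / sigma_sq)
    post_mean = post_var * (mu0 / tau0_sq + y / sigma_sq)
    return post_mean, post_var

def prob_above(mean, var, threshold):
    """P(mu > threshold) when mu ~ N(mean, var)."""
    z = (threshold - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean, var = 20.0, 0.25              # hypothetical prior on the CD mean (nm, nm^2)
alarms = []
for y in [20.3, 20.6, 20.7, 20.8, 20.9, 21.0, 21.0]:   # hypothetical sparse CD readings
    mean, var = update_normal_mean(mean, var, y, sigma_sq=0.09)
    # Alarm when the posterior probability of exceeding the threshold is high
    alarms.append(prob_above(mean, var, threshold=20.5) > 0.95)
print(alarms)
```

The alarm is raised on accumulated evidence about the mean rather than on any single reading, which is the practical difference from a fixed-limit chart.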

In wafer-fab contexts where drift is slow and measurement intervals are long, Bayesian charting ideas are often paired with predictive monitoring and health indicators to maintain sensitivity without excessive false alarms [6,68]. Prior choice remains critical: overly informative priors can delay detection of real shifts, while overly diffuse priors reduce practical value—so sensitivity analysis and historical validation are essential [61].

5.3. Hierarchical Bayesian Model

The multi-layered structure of semiconductor data (factory → tool → chamber → lot → wafer → site) naturally motivates hierarchical Bayesian modeling, where lower-level parameters are drawn from higher-level distributions—enabling partial pooling (“shrinkage”) and more stable estimates when local data are limited [6,64].

For example, inter-tool overlay performance can be modeled with tool-level random effects around a fleet distribution, and chamber-level etch-rate deviations can be estimated by shrinking toward the toolset mean, improving stability for chambers with few observations [6,64,65]. This supports probabilistic tool/chamber comparison (tool matching), variance decomposition across levels, and structured uncertainty accounting in large fabs operating many nominally identical tools [6,64].
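The shrinkage effect can be sketched with a two-level normal model in which the within-chamber and between-chamber variances are assumed known and the fleet mean is replaced by the pooled grand mean (a plug-in simplification, not a full hierarchical posterior). Chamber names and etch-rate values are hypothetical.

```python
def shrink_means(groups, sigma_sq, tau_sq):
    """Partial pooling of group means.

    groups: {name: [observations]}
    sigma_sq: within-chamber variance (assumed known)
    tau_sq: between-chamber variance (assumed known)
    """
    all_obs = [v for obs in groups.values() for v in obs]
    grand = sum(all_obs) / len(all_obs)
    out = {}
    for name, obs in groups.items():
        n = len(obs)
        ybar = sum(obs) / n
        w = tau_sq / (tau_sq + sigma_sq / n)    # weight on the chamber's own mean
        out[name] = w * ybar + (1.0 - w) * grand
    return out

# Hypothetical etch rates (nm/min): chamber A is well sampled, chamber B has one reading
chambers = {"A": [100.0] * 9, "B": [110.0]}
est = shrink_means(chambers, sigma_sq=4.0, tau_sq=1.0)
print(est)
```

The single reading from chamber B is pulled strongly toward the fleet value, whereas the well-sampled chamber A keeps most of its own mean.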

Advances in computation—especially MCMC and variational inference—make hierarchical Bayesian approaches increasingly practical at an industrial scale. However, computational efficiency and monitoring latency remain key concerns for high-throughput deployment [61,69,70,71].

5.4. Bayesian Decision Making

Beyond estimation and monitoring, Bayesian methods provide a coherent basis for decision-making under uncertainty. Bayesian decision theory combines posterior distributions with explicit loss/cost functions, enabling economically rational interventions (e.g., stop-and-fix vs. continue production) when balancing yield risk and capacity/downtime cost [62].
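The stop-versus-continue trade-off can be written as a comparison of posterior expected losses. The loss table below uses arbitrary hypothetical cost units; in practice these would come from scrap, rework, and downtime estimates.

```python
def choose_action(p_fault, loss):
    """Pick the action with the smallest posterior expected loss.

    p_fault: posterior probability that the process is in a faulty state
    loss: {action: {"fault": cost, "ok": cost}}
    Returns (best_action, expected_losses).
    """
    expected = {a: p_fault * l["fault"] + (1.0 - p_fault) * l["ok"]
                for a, l in loss.items()}
    return min(expected, key=expected.get), expected

# Hypothetical costs: stopping always costs 10 units of downtime,
# continuing costs 100 units of scrap only if a fault is really present
loss = {
    "stop":     {"fault": 10.0, "ok": 10.0},
    "continue": {"fault": 100.0, "ok": 0.0},
}
for p in (0.05, 0.20):
    action, expected = choose_action(p, loss)
    print(p, action, expected)
```

With these costs the break-even posterior probability is 0.10, so the same monitoring signal can justify different actions depending on the economics attached to it.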

In semiconductor manufacturing, Bayesian decision frameworks have been applied to optimize in-line metrology and inspection strategies under uncertainty, selecting sampling rates/periods/lots to maximize information while controlling productivity risk [69]. Bayesian optimization has also been explored to accelerate process development when experiments are costly, by explicitly balancing expected improvement against uncertainty [70].

Overall, Bayesian statistical methods provide a robust framework for a fab environment characterized by sparse data, hierarchical variability, evolving baselines, and high economic stakes. Through posterior updating, hierarchical modeling, and decision-theoretic action, Bayesian approaches can strengthen monitoring, diagnosis, and control in modern wafer fabrication [6,61,62].

In summary, Bayesian methods provide a coherent framework for semiconductor manufacturing by integrating prior knowledge, sparse data, and hierarchical variability into probabilistic inference. They enhance stability under limited samples, enable uncertainty-aware monitoring through posterior updating, and support multi-level modeling via hierarchical structures. Bayesian control charts and decision theory further allow adaptive, risk-based interventions aligned with economic objectives. Despite computational challenges, these approaches bridge statistical inference and engineering judgment, offering robust, interpretable, and data-efficient solutions for monitoring, diagnosis, and control in complex, high-stakes fab environments with evolving process conditions.

6. Experimental Design and Response Surface Modeling

Since the early stages of integrated circuit manufacturing, Design of Experiments (DOE) has played a fundamental role in semiconductor process engineering [72,73,74]. While statistical process control ensures stability in production, DOE provides a structured approach to identify causal relationships, quantify sensitivities, and determine robust operating conditions [72,73,74]. In advanced nodes with extremely narrow process windows and high wafer value, experiments must balance scientific rigor with economic efficiency, motivating a shift from fixed one-shot designs toward sequential, model-driven, and uncertainty-aware frameworks [73,75,76,77,78].

6.1. Evolution of DOE in Semiconductor Manufacturing

In earlier technology generations, processes had wider tolerance ranges and fewer interacting parameters; classic factorial and response surface methodology (RSM) designs were practical and effective [72,73,74]. Full or fractional factorial experiments could estimate main effects and low-order interactions among parameters such as temperature, pressure, gas flow, and exposure dose [72,73]. Central composite and Box–Behnken designs were commonly used to approximate second-order response surfaces and identify optimal setpoints inside a defined process window [73,74].
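As a minimal sketch of the classic approach, the fragment below builds a two-level full factorial design in coded units and estimates main effects as the difference between the mean response at the high and low levels. The three factors and the synthetic response function are hypothetical.

```python
from itertools import product

def full_factorial(k):
    """All 2**k runs of a two-level design in coded units (-1, +1)."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def main_effects(design, y):
    """Main effect of each factor: mean(y at +1) minus mean(y at -1)."""
    k = len(design[0])
    half = len(design) / 2
    return [sum(yi * run[j] for run, yi in zip(design, y)) / half
            for j in range(k)]

# Hypothetical factors: A = temperature, B = pressure, C = gas flow (coded units)
design = full_factorial(3)
# Synthetic response with an A effect, a B effect, and an A*C interaction
y = [10.0 + 2.0 * a - 1.0 * b + 0.5 * a * c for a, b, c in design]
effects = main_effects(design, y)
print(effects)
```

Because the design columns are orthogonal, the A*C interaction does not bias the main-effect estimates; estimating the interaction itself would use the product column a*c in the same way.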

As device geometry shrinks and process complexity increases, the limitations of traditional DOE become more apparent. Modern tools expose dozens of controllable factors with nonlinear, context-dependent interactions, making full-factor (or even rich fractional-factor) exploration prohibitively expensive—especially given advanced-node wafer costs and the operational risks of perturbing downstream modules [73,75,76,77,78]. Consequently, contemporary fab experimentation emphasizes information efficiency: extracting maximum learning from limited trials rather than exhaustively mapping the full factor space [73,75,79].

This shift is evident in the semiconductor literature, which integrates DOE with simulation, surrogate response surfaces, and optimization workflows (e.g., DOE/RSM coupled with process and device simulations for IC process optimization and design centering) [75,76,77,78].

6.2. Sequential and Adaptive Experimental Design

Sequential experimentation represents a conceptual shift from static design matrices to iterative learning cycles: run a small set of experiments, analyze results, and choose the next points based on uncertainty, expected improvement, or operational constraints [80,81,82]. This is particularly aligned with semiconductor realities (tight wafer budgets, rapidly evolving knowledge during ramp, and frequent baseline changes), where early screening can eliminate irrelevant factors, and later runs can focus locally on the most promising subspace [73,80,81].

In practice, sequential strategies often combine: (i) a screening phase (factor discovery), followed by (ii) local modeling/optimization around a feasible operating region [73,80]. This structure preserves statistical discipline while respecting limited capacity and minimizing the risk of exploring unsafe yield/reliability regions [73,80].

6.3. Gaussian Process Modeling

Gaussian process (GP) regression has become a powerful response-surface modeling tool for semiconductor experimental design because it is nonparametric and more flexible than fixed-form polynomial RSM [82,83]. Given observed data, GP models provide both a predicted mean and a predicted variance at untested points—directly quantifying predictive uncertainty under sparse data [83,84].

This uncertainty output makes GPs a natural backbone for Bayesian optimization: acquisition functions (e.g., expected improvement) explicitly trade off between exploiting promising regions and exploring uncertain regions to efficiently select the next experiment [82,84]. For nonlinear couplings in semiconductor processes—such as plasma chemistry interactions or lithography focus–exposure—GP surrogates capture curvature and complex interactions better than quadratic models while providing principled uncertainty estimates [82,83,84].
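The sketch below illustrates GP prediction and the expected-improvement acquisition for a single input, using only the standard library. It assumes a zero-mean prior (so responses should be centered), a squared-exponential kernel with fixed hyperparameters, and a minimization objective; the data and settings are hypothetical, and a practical implementation would estimate hyperparameters from data.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X, y, xs, noise=1e-6, length=1.0, var=1.0):
    """Posterior mean and variance of a zero-mean GP at a test input xs."""
    K = [[rbf(a, b, length, var) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    ks = [rbf(a, xs, length, var) for a in X]
    mean = sum(k * a for k, a in zip(ks, chol_solve(L, y)))
    variance = var - sum(k * v for k, v in zip(ks, chol_solve(L, ks)))
    return mean, max(variance, 0.0)

def expected_improvement(mean, variance, best):
    """Expected improvement over the best (lowest) observed response."""
    sd = math.sqrt(variance)
    if sd == 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return (best - mean) * cdf + sd * pdf

# Hypothetical centered responses (e.g., CD error) at four coded recipe settings
X, y = [0.0, 1.0, 2.0, 3.0], [0.8, -0.3, -0.5, 0.6]
grid = [i * 0.1 for i in range(0, 41)]
scores = [expected_improvement(*gp_predict(X, y, g), best=min(y)) for g in grid]
next_x = grid[scores.index(max(scores))]
print("next setting to try:", round(next_x, 2))
```

The acquisition value is high where the predicted mean is low or the predictive variance is large, which is how the exploitation and exploration terms enter the choice of the next experiment.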

Although vanilla GP training scales poorly with dataset size, sparse/approximate GP methods (e.g., inducing-point approximations) have improved scalability and enabled broader industrial applicability [74].

6.4. Applications in Advanced Process Development

Combining sequential DOE with GP modeling has shown clear benefits in advanced semiconductor process development. In lithography optimization, the traditional focus–exposure matrix (FEM) is used to assess process windows; GP surrogate models can interpolate performance metrics at untested focus/exposure settings, improving identification of robust operating regions under tight CD tolerances [82,83,85].

In plasma etching, selecting recipes that balance profile control, selectivity, and damage requires navigating nonlinear interactions among RF power, pressure, gas composition, and bias. GP-guided sequential experiments can accelerate convergence toward feasible trade-offs in this multi-objective space [74,82,83,84]. In CMP, removal rate and within-wafer non-uniformity depend jointly on downforce, platen speed, slurry properties, and pad condition; model-based sequential design can explore this space efficiently while minimizing wafer consumption [73,82,83].

More broadly, GP-based DOE provides a structured framework for cost-effective exploration of high-dimensional recipe spaces. By combining predictive modeling with decision rules that account for uncertainty, engineers can accelerate optimization while maintaining yield and production stability [74,82,83,84,86].

Figure 4 shows experimental design and response surface modeling for semiconductor manufacturing as a unified workflow for process optimization. Classical DOE establishes baseline factor–response relationships, while sequential experimentation iteratively refines conditions. Gaussian process modeling and Bayesian optimization enable uncertainty-aware prediction and efficient search. Centered on response surface modeling, the framework integrates these methods to support adaptive, data-driven optimization, improving process performance, efficiency, and decision-making in advanced manufacturing.

In summary, DOE is essential in semiconductor manufacturing for identifying causal relationships and optimizing processes. While traditional factorial and response surface methods were effective in earlier nodes, increasing complexity and cost have driven a shift toward sequential, adaptive, and model-based approaches. Modern DOE emphasizes information efficiency, using iterative experimentation to refine process understanding. Gaussian process (GP) modeling enhances this by providing flexible, nonparametric response surfaces with uncertainty quantification, enabling Bayesian optimization. Applications in lithography, etching, and CMP show improved efficiency and reduced wafer usage. Overall, integrating sequential DOE with GP modeling supports cost-effective, data-driven optimization in high-dimensional and dynamic manufacturing environments.

7. Fault Detection and Classification

As semiconductor manufacturing systems become increasingly complex and data-intensive, statistical monitoring has expanded from product quality characteristics to equipment health itself [6,17,68,87]. Modern wafer fabs rely on automated, sensor-rich tools whose performance directly determines yield, throughput, and reliability [6,87]. Fault detection and classification (FDC) has become a critical infrastructure for real-time equipment monitoring, identifying anomalies before they propagate into measurable product defects [6,17,68,87]. Unlike traditional SPC, which relies mainly on downstream metrology or electrical test, FDC operates upstream at the equipment level, using high-frequency sensor traces to detect abnormal tool behavior before wafers suffer irreversible damage [6,68,87].

7.1. From Quality Control to Equipment Health Monitoring

Historically, fab monitoring emphasized product metrics such as CD, overlay, film thickness, and electrical parameters [6,87]. While effective, this approach often identifies problems only after wafers are processed and measured—too late to prevent scrap/rework when wafer value is high [6,87].

FDC shifts the focus to proactive equipment health monitoring. Modern tools provide dense sensor arrays (temperature, pressure, gas flow, RF power, vibration, optical emission, endpoint signals) with high temporal resolution, capturing detailed “fingerprints” of each wafer run [6,17,68,87]. Subtle deviations in these traces can indicate chamber contamination, flow imbalance, component degradation, or plasma instability long before downstream metrology shows excursions [17,68].

The objective of FDC is to detect abnormal tool behavior early and support rapid diagnosis/classification so engineers can intervene before production losses occur [17,68]. This proactive stance reflects advanced-node economics and represents a shift from reactive quality control to predictive equipment assurance [6,87].

7.2. Statistical Basis of FDC

Despite modern implementations, many FDC systems remain grounded in classical multivariate statistics: transforming high-volume sensor traces into indicators that separate normal from abnormal behavior [6,17,68,87].

PCA-based monitoring is widely used in FDC to compress high-dimensional sensor arrays into latent variables that capture dominant patterns of variation [12,13]. Deviations are commonly detected using Hotelling’s T² (changes along the modeled subspace) and SPE/Q (residual deviation outside the subspace), which is particularly effective in etch/deposition where strong physics-driven correlations exist among sensors [12,13]. Multiblock extensions further improve fault isolation by respecting tool/module groupings and enabling structured contribution diagnostics [40].

Beyond PCA, data-driven classifiers are also used for fault detection and diagnosis—e.g., distance-based and mixture-model approaches that support both detection and (in some cases) fault-type discrimination [9,10]. In addition, fault-specific charting and discriminant-style methods have been demonstrated to jointly detect and classify specific tool faults, improving sensitivity and speeding response [17].

Multivariate control charts remain an important foundation: they evaluate the joint behavior of sensor vectors rather than charting each sensor independently, helping to control false alarms in the presence of correlation [12,13,68]. In practice, baseline models are built from normal operation (NOC) data and thresholds are derived statistically, with strong emphasis on interpretability and drill-down diagnostics (e.g., contribution analysis) to support engineer action [12,40].
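To illustrate why joint monitoring matters, the sketch below computes Hotelling's T² for two correlated sensors from a small NOC sample. The sensor values are hypothetical, and the control limit uses a large-sample chi-square approximation with two degrees of freedom (for which the upper-tail quantile is −2·ln α); with few NOC samples an F-based limit would be more appropriate.

```python
import math

def fit_bivariate(data):
    """Mean vector and sample covariance (s11, s12, s22) from NOC pairs."""
    n = len(data)
    m1 = sum(d[0] for d in data) / n
    m2 = sum(d[1] for d in data) / n
    s11 = sum((d[0] - m1) ** 2 for d in data) / (n - 1)
    s22 = sum((d[1] - m2) ** 2 for d in data) / (n - 1)
    s12 = sum((d[0] - m1) * (d[1] - m2) for d in data) / (n - 1)
    return (m1, m2), (s11, s12, s22)

def t2(x, mean, cov):
    """Hotelling T^2 = (x - mean)^T S^{-1} (x - mean) for two variables."""
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    return (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det

# Hypothetical NOC data: (RF power in W, chamber pressure in mTorr), positively correlated
noc = [(100.0, 5.0), (101.0, 5.1), (99.0, 4.9), (102.0, 5.25), (98.0, 4.8),
       (100.5, 5.0), (99.5, 5.0), (101.5, 5.1), (98.5, 4.9), (100.0, 4.95)]
mean, cov = fit_bivariate(noc)
limit = -2.0 * math.log(0.0027)          # chi-square(2) limit at alpha = 0.0027

consistent = t2((101.5, 5.15), mean, cov)   # follows the usual correlation
broken = t2((101.5, 4.85), mean, cov)       # same magnitudes, opposite direction
print(round(consistent, 2), round(broken, 2), round(limit, 2))
```

Both test points lie well within the individual ranges of each sensor, but only the one that breaks the learned correlation produces a large T², which separate univariate charts would miss.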

A core practical issue is model maintenance. Aging, cleaning, preventive maintenance, hardware upgrades, and recipe transitions change the data distribution, requiring recalibration and adaptation to retain sensitivity without overwhelming false alarms [33,87]. Balancing robustness and responsiveness is therefore central to industrial FDC design [33,87].

7.3. Time–Frequency Models and Sequence Models

Many equipment anomalies are dynamic during a run (e.g., ignition transients, endpoint oscillations, periodic instabilities) and can be missed by purely aggregated statistics [68,88]. As a result, advanced FDC incorporates time–frequency and sequence modeling.

Time–frequency analysis (e.g., wavelets, short-time Fourier transform) decomposes sensor signals to reveal evolving spectral content, which is valuable for detecting oscillatory behaviors that foreshadow plasma instability or arcing [88,89,90]. Wavelet-domain features are commonly used for transient/localized deviations and nonstationary signals [89,90].

Sequence models capture phase-structured tool operation. Hidden Markov models (HMMs) and related state-space approaches model a wafer run as a sequence of discrete states (load, stabilize, process, unload). Abnormal transitions, dwell times, or unlikely state sequences can signal the development of faults [91]. This provides a probabilistic lens on pattern deviation rather than only numeric deviation [91].
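A minimal sketch of sequence scoring with a discrete HMM: the scaled forward algorithm returns the log-likelihood of an observed sequence, and an unusually low value flags an atypical run. The three-phase left-to-right model, the discretized sensor levels, and all probabilities are hypothetical and would normally be estimated from NOC runs.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs) under a discrete HMM.

    start[i]: initial state probability; trans[i][j]: transition probability;
    emit[i][o]: probability that state i emits symbol o.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")    # sequence is impossible under the model
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]   # rescale to avoid underflow
    return loglik

# Hypothetical phases: 0 = stabilize, 1 = process, 2 = purge (left-to-right)
start = [1.0, 0.0, 0.0]
trans = [[0.7, 0.3, 0.0],
         [0.0, 0.8, 0.2],
         [0.0, 0.0, 1.0]]
# Discretized sensor level: 0 = low, 1 = mid, 2 = high
emit = [[0.80, 0.15, 0.05],
        [0.05, 0.15, 0.80],
        [0.15, 0.80, 0.05]]

normal_run = [0, 0, 2, 2, 2, 2, 1, 1]
odd_run = [0, 2, 0, 2, 0, 2, 1, 1]      # oscillates between phases
print(forward_loglik(normal_run, start, trans, emit),
      forward_loglik(odd_run, start, trans, emit))
```

A threshold on the per-run log-likelihood, set from the distribution observed on normal runs, turns this score into a detection rule.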

These dynamic approaches complement static MSPC by improving sensitivity to transient and subtle anomalies during specific phases of the run [68,88].

7.4. Integration with Statistical Quality Control

The greatest value of FDC emerges when integrated with downstream SPC/APC to form a closed-loop monitoring-and-intervention ecosystem [6,87,92]. Equipment-level anomalies detected by FDC can be correlated with later CD/overlay drift or yield loss, strengthening root-cause analysis by linking sensor patterns to product impact [6,87].

Integrated frameworks also improve decision confidence: concurrent FDC alarms and early quality-drift signals increase the likelihood of true process problems, while stable quality despite sensor deviations can trigger threshold/model tuning to reduce unnecessary tool stops [6,87]. This integration supports both feedforward and feedback action—tool holds, recipe adjustments, maintenance triggers—before wafers reach critical metrology steps, and historical linking between sensor signatures and yield outcomes improves predictive accuracy over time [6,87,92].

In summary, FDC represents a major innovation in semiconductor statistical practice, shifting from reactive, product-based monitoring to proactive equipment health management using high-frequency sensor data and multivariate modeling. Grounded in MSPC and enhanced by dynamic methods such as time–frequency analysis and sequence modeling, FDC improves sensitivity to transient and phase-specific anomalies. Its effectiveness relies on continuous model adaptation and tight integration with SPC/APC, enabling earlier fault detection, better diagnosis, and coordinated intervention, forming a closed-loop framework that improves yield, tool reliability, and efficiency in advanced fabs [6,17,68,87,92].

8. In-Process Control and Statistical Modeling

As semiconductor manufacturing processes become increasingly sensitive to minute variations in parameters, maintaining static recipe settings is no longer sufficient to ensure consistent product quality [6,44]. Even well-defined processes can drift due to chamber aging, consumable wear, environmental changes, and inter-tool variability [6,93]. Therefore, run-to-run (R2R) control, in which recipe settings are adjusted between successive runs, has become a cornerstone of APC in modern wafer manufacturing [6,44]. By dynamically adjusting recipe parameters using feedback from previous wafers or lots, R2R systems compensate for predictable variations and stabilize outputs such as CD, overlay, film thickness, and etch bias [6,44,94].

8.1. Control Principles of Inter-Run Operation

R2R control is based on iterative feedback: after each wafer/lot is processed and measured, the controller computes the deviation from the target and updates the controllable recipe inputs for subsequent runs [44]. Unlike real-time loops operating within a single wafer cycle, R2R operates on a longer time scale—correcting between-run deviations with delayed feedback, reflecting the reality that critical metrology is often available only after processing [6,44].

Statistical modeling is central: a quantitative input–output relationship is required to translate measured error into corrective action. For example, lithography CD can be locally approximated as a function of exposure dose (and sometimes focus), enabling dose updates to compensate for CD bias; CMP removal rate can be corrected by adjusting pressure or time; etch outputs can be compensated by updating setpoints based on inferred chamber state [6,44]. Without a model, feedback adjustments can become ad hoc and may destabilize the process [6,44].
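One common way to turn such a model into a correction rule is an EWMA-based R2R controller. The sketch below assumes a linear process y = a + b·u with a known gain b and an unknown offset a that the controller tracks; the target, gain, disturbance, and weight λ are hypothetical, and measurement noise is omitted to keep the behavior easy to follow.

```python
def r2r_ewma(target, gain, disturbances, lam=0.3, a0=0.0):
    """EWMA run-to-run controller for y = a + gain * u.

    disturbances: the true offset `a` at each run (unknown to the controller).
    Returns a list of (u, y, a_hat) per run.
    """
    a_hat = a0
    history = []
    for a_true in disturbances:
        u = (target - a_hat) / gain          # recipe input chosen from the current estimate
        y = a_true + gain * u                # measured output (noise-free sketch)
        # Update the offset estimate from the observed model error
        a_hat = lam * (y - gain * u) + (1.0 - lam) * a_hat
        history.append((u, y, a_hat))
    return history

# Hypothetical case: CD target of 50 nm, gain of 1.5 nm per dose unit,
# and a constant 2 nm offset that appears after a maintenance event
history = r2r_ewma(target=50.0, gain=1.5, disturbances=[2.0] * 30, lam=0.3)
print("first output:", history[0][1], "last output:", round(history[-1][1], 4))
```

The weight λ sets the trade-off described in the text: larger values correct faster but pass more measurement noise into the recipe, while smaller values are smoother but slower to remove an offset.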

8.2. Model-Based Controllers

The effectiveness of R2R depends on the accuracy and adaptability of the underlying model. Linear regression (local linearization) is commonly used because many unit processes can be approximated as linear within a narrow operating window—yielding transparent and computationally efficient correction rules [44,94].

However, semiconductor processes are rarely static. Drift, measurement noise, and unobserved disturbances motivate state-space (latent-state) modeling, in which hidden process states represent tool bias/chamber conditions and evolve over time [44,52]. Kalman filtering provides a recursive estimator that combines prior state estimates with new noisy measurements to generate stable adaptive corrections—widely aligned with R2R/APC practice in fabs [11,44,52,95].

Model effectiveness must be continuously evaluated. Hardware changes, recipe/material updates, supplier changes, and node transitions can shift the true input–output relationship; if the model is stale, corrections may become inappropriate, leading to overcompensation or instability [6,44]. Hence, periodic recalibration, validation with new data, and residual monitoring are essential for sustainable deployment [6,44].

8.3. Role of SPC in R2R Systems

In an integrated architecture, the R2R controller performs active correction, while SPC plays a supervisory role by monitoring residuals (predicted vs. observed) and verifying that model assumptions remain valid [44]. If residuals show abnormal patterns, increased variance, or systematic bias, SPC can flag that the model no longer represents the process, triggering investigation, re-identification, or maintenance actions [44].

This two-layer structure improves robustness: R2R reduces routine drift, while SPC protects against structural mismatch, sensor faults, and unmodeled disturbances—thereby helping to avoid over-adjustment due to noise or outliers [44]. In advanced fabs, R2R control, SPC monitoring, and equipment health diagnostics (e.g., FDC) increasingly work together as a closed-loop ecosystem that stabilizes routine variability, preserves model integrity, and enables proactive intervention [6,44].

In summary, run-to-run (R2R) control and statistical modeling are essential in advanced semiconductor manufacturing [96,97,98]. Model-based R2R controllers (linear and state-space/Kalman) compensate for predictable drift under delayed metrology, while SPC supervision maintains model validity and supports robust long-term operation [6,44,52]. A comparative analysis of MSPC, time-series models, Bayesian approaches, and FDC is given in Table 3.

Table 3 presents a structured comparison of MSPC, time-series and drift-detection models, Bayesian approaches, and FDC, showing that no single framework is sufficient for all monitoring and decision-making needs in semiconductor manufacturing. MSPC remains foundational because it can handle high-dimensional, correlated variables and reduce false alarms compared with univariate SPC. However, its effectiveness is limited in very high-dimensional settings with unstable covariance estimation, shifting baselines, or strong nonlinearity. Thus, MSPC serves well as a core multivariate monitoring layer but is incomplete as a standalone solution in dynamic environments.

In contrast, time-series and drift-detection methods focus on temporal evolution, gradual drift, and regime changes. They are better suited than MSPC for detecting slow deterioration and autocorrelated behavior, offering strong temporal sensitivity, especially for early drift detection. However, they often rely on stationarity assumptions, require frequent updates, and may struggle with nonlinear multivariate coupling. Therefore, while more dynamic, they are narrower unless integrated with broader frameworks.

Bayesian approaches differ in their emphasis on uncertainty representation and probabilistic decision-making. They are particularly effective in sparse-data conditions, early ramp stages, and hierarchical settings where prior knowledge is important. Compared with MSPC and time-series methods, they better incorporate uncertainty and support sequential updating and risk-based decisions. However, they are computationally intensive, sensitive to prior choices, and may introduce latency in real-time applications, limiting their practicality for high-throughput deployment.

FDC shifts monitoring upstream to equipment-level behavior using high-frequency sensor data. Its key advantage is early detection of equipment anomalies before wafer defects occur, making it powerful for proactive monitoring and predictive maintenance. However, it requires significant maintenance, is sensitive to changes in tools and recipes, and demands substantial computational and data resources. As a result, FDC is most effective when integrated with SPC/APC rather than used alone.

Overall, these approaches should be viewed as complementary rather than competing. MSPC is suited for multivariate quality monitoring, time-series methods for temporal dynamics, Bayesian approaches for uncertainty-aware decisions, and FDC for equipment-level detection. Each captures only part of the system—correlation, time dependence, uncertainty, or equipment behavior—so an integrated strategy tailored to problem scale and data characteristics is more effective than relying on any single method.

9. Statistics and Machine Learning: Complementary Roles

The rapid growth of sensor data, computing power, and storage infrastructure in semiconductor manufacturing has accelerated the adoption of machine learning (ML) for yield prediction, defect classification, and equipment anomaly detection [87,93]. However, despite strong predictive performance, ML cannot replace statistical thinking. Instead, modern fab analytics relies on a complementary blend of statistical rigor and ML flexibility, where sustainable deployment in high-volume fabs requires not only accuracy, but also interpretability, robustness to change, and full lifecycle management [87,99,100].

9.1. Limitations of Pure Data-Driven Models

ML methods are effective at capturing nonlinear relationships in high-dimensional fab data (e.g., sensor trajectories and wafer-map patterns), enabling applications such as wafer map pattern recognition and defect classification [96,97,98]. However, purely data-driven models face major limitations in manufacturing.

A key issue is sensitivity to distribution shifts/concept drift. Semiconductor processes evolve due to aging, preventive maintenance, recipe changes, and product mix changes, so models trained on historical distributions can degrade silently when the underlying data-generating process changes [99,100]. Without explicit drift monitoring and recalibration mechanisms, performance can deteriorate unnoticed in production [12,99,100,101].

Interpretability is another limitation. Many high-performing models behave like black boxes, but fabs require root-cause reasoning and defensible interventions (scrap decisions, tool downtime). Explainability methods such as LIME and SHAP are commonly used to help bridge this gap by providing local or additive feature-attribution explanations [102,103].

Overfitting is also a real risk: fab datasets can be extremely high-dimensional, while labeled data on yield excursions or rare failure modes is limited, allowing complex models to fit noise rather than stable process signals—leading to brittle real-world behavior [87,99]. Therefore, ML alone is insufficient; statistical principles remain essential for reliability, robustness, and safe deployment.

9.2. Statistically Informed Machine Learning

Best practices in semiconductor analytics increasingly treat statistics and ML as an integrated workflow rather than competing paradigms [87]. Statistical analysis guides feature engineering/selection (e.g., correlation analysis, PCA for redundancy reduction, structured decomposition), improving efficiency and reducing the risk of overfitting before ML training [87,93].
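The correlation-based screening step can be sketched as a greedy filter that drops any feature nearly collinear with one already kept. The feature names, the 0.95 threshold, and the helper names below are hypothetical; PCA or a structured decomposition would be alternative routes to the same goal.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    features: dict mapping feature name -> list of values (insertion order matters).
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical sensor summaries for five wafers.
sensors = {
    "pressure": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pressure_backup": [2.0, 4.0, 6.0, 8.0, 10.1],  # near duplicate of pressure
    "rf_power": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_redundant(sensors)
```

Because the filter is greedy, the result depends on feature order; placing physically preferred sensors first is one simple way to make the choice deliberate.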

Rigorous validation—cross-validation, hold-out strategies, and uncertainty reporting—provides unbiased performance assessment and contextualizes outputs with confidence/uncertainty, improving interpretability and decision quality [99,104]. Statistical drift detection and monitoring (including performance-based monitoring of prediction error) is then used to detect distribution shifts and trigger retraining/recalibration, preventing silent degradation [99,100,101,104].
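Because fab data are time-ordered and subject to drift, one validation variant that may be appropriate is forward chaining, in which each fold trains only on runs that precede its test block. The text above does not prescribe this scheme; the sketch below, including the function name and its parameters, is an assumption offered for illustration.

```python
def forward_chaining_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Every test block lies strictly after its training data, so the
    estimated error reflects prediction of future runs, not interpolation.
    """
    fold_size = (n_samples - min_train) // n_folds
    for i in range(n_folds):
        train_end = min_train + i * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

# Demo: 10 chronologically ordered lots, 3 folds, at least 4 lots for training.
splits = list(forward_chaining_splits(n_samples=10, n_folds=3, min_train=4))
```

Reporting the spread of the per-fold errors, not only their mean, gives a first indication of how stable the model is across time.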

In this hybrid view, ML provides nonlinear modeling power (e.g., wafer-map classification and defect-pattern recognition), while statistics provides guardrails (monitoring, uncertainty quantification, and diagnostic structure); together they yield operationally reliable predictive systems [96,97,98,99,100,101].

9.3. Model Governance and Lifecycle Management

Production ML models in fabs continuously influence operational decisions, so they require governance comparable to that of process equipment [87,99,100]. Statistical monitoring of residuals, prediction errors, and calibration metrics can detect degradation early. Performance-based drift detection methods (including error-rate monitoring approaches such as EWMA-based drift detection) provide practical mechanisms for flagging changes and triggering response actions [99,100,101,104].
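A simplified sketch of performance-based drift detection is given below: an EWMA of a model's 0/1 error stream is compared against an upper control limit. It assumes a fixed, known baseline error rate, whereas published EWMA-based detectors usually estimate the baseline online; the function name and parameter values are illustrative.

```python
from math import sqrt

def ewma_error_drift(errors, baseline_rate, lam=0.2, width=3.0):
    """One-sided EWMA chart on a stream of 0/1 prediction errors.

    errors:        iterable with 1 for a misprediction, 0 otherwise
    baseline_rate: assumed in-control error rate of the deployed model
    Returns the index of the first alarm, or None.
    """
    z = baseline_rate
    sigma0 = sqrt(baseline_rate * (1.0 - baseline_rate))  # Bernoulli std. dev.
    for t, e in enumerate(errors, start=1):
        z = lam * e + (1.0 - lam) * z
        # Standard deviation of the EWMA statistic after t updates.
        sigma_z = sigma0 * sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        if z > baseline_rate + width * sigma_z:
            return t - 1
    return None

# Demo: 50 correct predictions, then the model starts failing on every sample.
drift_index = ewma_error_drift([0] * 50 + [1] * 10, baseline_rate=0.1)
stable = ewma_error_drift([0] * 60, baseline_rate=0.1)
```

In a governed deployment, such an alarm would map to a documented response, for example freezing automated actions and opening a retraining review.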

Documentation, version control, and retraining criteria are equally critical because incremental changes in the fab can invalidate models over time [87,99]. Interpretability tools (e.g., LIME/SHAP) and statistical diagnostics (sensitivity analysis, uncertainty intervals) improve trust and enable collaboration between data scientists and process engineers by aligning model outputs with physical process understanding [102,103].

Statistics and machine learning thus play complementary roles in semiconductor analytics: machine learning provides flexible nonlinear prediction and pattern recognition, while statistical methods ensure robustness, interpretability, and long-term maintainability, together forming a balanced analytical ecosystem for high-precision manufacturing. This synergy is particularly evident in state-of-the-art AI/ML approaches for in situ analysis and autonomous control of epitaxial growth. Recent studies demonstrate that convolutional neural networks (CNNs) can classify reflection high-energy electron diffraction (RHEED) images with near-human or superior accuracy, enabling real-time identification of growth modes (such as Volmer–Weber, Stranski–Krastanov, and Frank–van der Merwe) and surface reconstructions from continuous video streams under varying observation conditions. Beyond supervised learning, unsupervised and self-supervised pipelines, including deep feature extraction with pretrained CNNs combined with clustering or change-point detection, enable on-the-fly featurization of RHEED data, phase boundary mapping, and real-time anomaly detection. These capabilities provide the foundation for robust feedback control in molecular beam epitaxy (MBE) and related epitaxial processes, illustrating how integrating statistical rigor with machine-learning adaptability enables intelligent, autonomous semiconductor manufacturing systems [87,96,97,99,100,101,102,103,104].

In summary, machine learning (ML) is increasingly used in semiconductor manufacturing for prediction and pattern recognition, but it cannot replace statistical methods. Purely data-driven models face challenges such as sensitivity to process drift, limited interpretability, and overfitting due to high-dimensional data and scarce labels. Therefore, statistics and ML are best used together: statistics supports feature selection, validation, uncertainty quantification, and drift monitoring, while ML provides flexible nonlinear modeling. Effective deployment also requires model governance, including monitoring, retraining, and interpretability tools. This integration enables robust, adaptive, and trustworthy systems that support advanced applications such as real-time epitaxial growth monitoring and autonomous process control.

10. Integration with Manufacturing Systems

The full value of statistical techniques in modern semiconductor manufacturing lies not in isolated analytical models, but in their seamless integration with the broader manufacturing ecosystem [6,87,105,106]. Advanced wafer fabs operate as highly automated cyber-physical systems, with data flowing between equipment sensors, metrology tools, manufacturing execution systems (MES), APC platforms, and yield management databases [6,106]. Therefore, statistical insights must be embedded into this infrastructure to influence operational decisions in (near) real time [87,105,106].

10.1. Integration with MES and APC Architectures

The MES is the operational backbone of a wafer fab, tracking lot movement, equipment status, recipe configuration, and process history [6,21]. Statistical monitoring outputs—control chart events, multivariate anomaly scores, drift indicators, and model health metrics—must be transmitted to MES/APC layers in a structured and actionable form so they can trigger standardized workflows (e.g., lot hold, engineering disposition, sampling plan escalation, or route-to-metrology actions) [6,87,105,106].

APC platforms extend this by embedding statistical estimators and models directly into automated control loops—such as R2R controllers, feedforward corrections, and deviation compensation logic [6,20,87]. For example, CD metrology can be consumed by dose/overlay correction models, and equipment health indicators from FDC/EDA streams can inform maintenance plans and tool qualification decisions [87,107,108,109,110]. This closed-loop integration ensures that statistical learning translates quickly into stability and yield protection [6,87,108,109,110].

A core implementation challenge is operational reliability: statistical outputs must be interpretable, time-aligned with production events, and robust to asynchronous metrology and data delays [87,106]. Hence, practical integration architectures include data-quality checks and validation layers (e.g., event correlation, timestamp alignment, route verification) before automated actions are executed [87,106].
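The timestamp-alignment check can be sketched as an "as-of" join: each metrology result is attached to the most recent process event that precedes it, and rejected if the lag is implausibly long. The data layout, the function name `align_to_events`, and the `max_lag` tolerance are assumptions for this example.

```python
from bisect import bisect_right

def align_to_events(event_times, metrology, max_lag):
    """Match each metrology reading to the latest process event at or before it.

    event_times: sorted list of process-event timestamps (seconds)
    metrology:   list of (timestamp, value) pairs, in any order
    max_lag:     largest accepted delay between event and measurement
    Readings with no acceptable event are returned with None as the event,
    so that a validation layer can hold them back from automated actions.
    """
    aligned = []
    for ts, value in metrology:
        i = bisect_right(event_times, ts) - 1  # last event with time <= ts
        if i >= 0 and ts - event_times[i] <= max_lag:
            aligned.append((event_times[i], value))
        else:
            aligned.append((None, value))
    return aligned

# Demo with hypothetical timestamps.
events = [100, 200, 300]
readings = [(150, 1.0), (90, 2.0), (900, 3.0), (200, 4.0)]
matched = align_to_events(events, readings, max_lag=120)
```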

10.2. Automated Decision Support and Smart Manufacturing

Beyond monitoring and control, integrated statistical systems enable automated decision support at the fab level. When statistical alerts connect to dispatch/scheduling, risk-aware routing can be implemented—for instance, sending lots processed on a tool with elevated anomaly/drift scores to expedited metrology or engineering review, while allowing stable tools to run leaner sampling plans to improve throughput [87,106,111,112].

Recipe management can also benefit from model-based adjustments, which can automatically correct small deviations within guard bands. At the same time, statistical predictions of degradation (from equipment data streams) can guide preventive maintenance scheduling to reduce unplanned downtime [87,106,107]. At the factory scale, integration among MES, APC, equipment data acquisition (EDA/Interface A), yield/defect databases, and decision support systems forms the foundation of “smart manufacturing” in semiconductor fabs—moving statistics from retrospective reporting into operational automation [87,106].

In summary, statistical methods deliver value in semiconductor manufacturing only when integrated with MES, APC, and fab-wide systems. Embedded in automated workflows, they enable real-time monitoring, control, and decision-making. This integration supports closed-loop optimization, predictive maintenance, and risk-based routing. Key challenges include data alignment, interpretability, and reliability, but successful implementation enables smart manufacturing and improved yield performance.

11. Emerging Trends and Challenges

The ongoing development of semiconductor manufacturing technology poses challenges for both engineering practice and statistical methods. As device architectures become more complex and process windows shrink, statistical systems must evolve in tandem [87,106]. Future directions are increasingly shaped by the convergence of real-time/streaming analytics, uncertainty-aware decision frameworks, hybrid physics–data modeling, and digital-twin architectures, while raising new challenges in complexity, interpretability, and long-term sustainability [87,106,113,114,115,116,117,118,119].

11.1. Real-Time and Streaming Analysis

A major trend is the shift from batch analytics to real-time and streaming analytics. Modern tools generate high-frequency sensor traces per wafer, creating continuous data streams rather than occasional summary measurements [87,106,113]. This motivates online monitoring algorithms designed for incremental updates and rapid detection of abnormal behavior under high dimensionality [111,113].

For example, high-dimensional data-stream monitoring methods have been developed to efficiently detect abnormal changes when many streams must be monitored simultaneously [111]. In semiconductor equipment contexts, wafer-to-wafer monitoring using data stream mining has been demonstrated for etch tools, explicitly treating trace data as streaming signals to enable near-run-level detection [112].

Implementing streaming analytics requires scalable computing and careful algorithm design under latency constraints, while remaining robust to noise and transient fluctuations [87,106,113]. Hence, balancing response speed versus stability (false alarms vs. missed detections) becomes a central design tradeoff in advanced fabs [106,113].
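The incremental-update idea can be illustrated with a univariate monitor that maintains a running mean and variance (Welford's algorithm) and flags points far from the running mean. The class name, the warm-up length, and the z-score limit are hypothetical; a fab deployment would more likely be multivariate and would handle drift explicitly.

```python
from math import sqrt

class StreamingMonitor:
    """Online mean/variance (Welford) with a simple z-score alarm."""

    def __init__(self, z_limit=4.0, warmup=30):
        self.z_limit = z_limit
        self.warmup = warmup  # samples required before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Return True if x is flagged; flagged points are not absorbed."""
        if self.n >= self.warmup:
            sd = sqrt(self.m2 / (self.n - 1))
            if sd > 0.0 and abs(x - self.mean) > self.z_limit * sd:
                return True
        # O(1) update: no history is stored, which suits high-rate streams.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return False

# Demo: a stable trace summary followed by a large excursion.
monitor = StreamingMonitor()
flags = [monitor.update(v) for v in [9.0, 11.0] * 20]
spike_flag = monitor.update(20.0)
```

The `z_limit` and `warmup` settings expose the speed-versus-stability tradeoff directly: tighter limits and shorter warm-up react sooner but raise the false-alarm rate.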

11.2. Uncertainty Quantification and Risk-Based Decision Making

As capital intensity rises, decision-making increasingly requires explicit uncertainty quantification (UQ) and risk-based thresholds rather than binary alarms [61,62]. Modern statistical systems increasingly report posterior probabilities, predictive distributions, or confidence intervals—enabling engineers to weigh not only “most likely” outcomes but also tail risks of exceeding specs [61,62].

Risk-based decision frameworks integrate statistical inference with cost functions (e.g., yield loss, downtime, cycle-time impact), supporting economically rational interventions (e.g., hold vs. continue, maintenance timing, sampling escalation) [62]. This shifts monitoring from purely technical metrics toward operational optimization guided by quantified uncertainty [61,62].
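A minimal sketch of such a cost-aware rule follows. It assumes a normal predictive distribution for the measured quantity and a simple two-action choice; the cost figures, specification limits, and function names are invented for illustration.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def prob_out_of_spec(mean, sd, lsl, usl):
    """Probability that the predicted value falls outside [lsl, usl]."""
    return normal_cdf(lsl, mean, sd) + (1.0 - normal_cdf(usl, mean, sd))

def decide(mean, sd, lsl, usl, scrap_cost, hold_cost):
    """Hold the lot when the expected scrap loss exceeds the cost of holding."""
    p = prob_out_of_spec(mean, sd, lsl, usl)
    action = "hold" if p * scrap_cost > hold_cost else "continue"
    return action, p

# Hypothetical CD prediction in nm with spec limits 7 to 13 nm.
action_centered, p_centered = decide(10.0, 1.0, 7.0, 13.0, scrap_cost=100_000, hold_cost=500)
action_shifted, p_shifted = decide(12.0, 1.0, 7.0, 13.0, scrap_cost=100_000, hold_cost=500)
```

The same predictive mean can lead to different actions when the predictive standard deviation changes, which is the practical benefit of reporting uncertainty instead of a binary alarm.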

11.3. Physics-Based Learning and Digital Twins

Purely data-driven models can struggle under distribution shift or limited labeled events; embedding physical constraints and mechanistic structure can improve robustness and interpretability [117,118]. In manufacturing more broadly, physics-informed ML frameworks (including physics-informed neural networks and constraint-embedded learning) are increasingly used to improve generalization and trustworthiness when data are sparse or processes are nonstationary [117,118].
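One simple form of mechanistic structure is a residual (gray-box) model: a first-principles baseline supplies the main trend, and a small data-driven term learns only what the baseline misses. The sketch below is not a physics-informed neural network; the deposition model `thickness = rate * time`, the wafer-count covariate, and all names and numbers are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def physics_thickness(rate_nm_per_s, time_s):
    """Assumed mechanistic baseline: thickness grows linearly with time."""
    return rate_nm_per_s * time_s

def fit_hybrid(times, wafer_counts, measured, rate):
    """Fit a linear correction to the residuals of the physics baseline."""
    residuals = [m - physics_thickness(rate, t) for t, m in zip(times, measured)]
    slope, intercept = fit_line(wafer_counts, residuals)

    def predict(time_s, wafer_count):
        return physics_thickness(rate, time_s) + intercept + slope * wafer_count

    return predict

# Synthetic data: the chamber loses 0.01 nm per processed wafer, plus a 0.5 nm offset.
times = [30.0, 45.0, 60.0, 75.0, 90.0]
counts = [100.0, 200.0, 300.0, 400.0, 500.0]
measured = [2.0 * t - 0.01 * c + 0.5 for t, c in zip(times, counts)]
predict = fit_hybrid(times, counts, measured, rate=2.0)
prediction = predict(60.0, 500.0)
```

Because the learned term only has to explain a small residual, it needs far fewer data than a model that must rediscover the time dependence on its own.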

This hybrid modeling direction aligns closely with digital twin development. In manufacturing, digital twins are commonly framed as tightly coupled digital representations with varying levels of synchronization and integration (e.g., digital models, digital shadows, digital twins) [114]. In semiconductor factory integration roadmapping, digital twins are explicitly positioned across ISA-95 levels to support applications such as R2R control, predictive maintenance, and virtual metrology that require continuous data integration and model updates [106].

Standardization efforts further shape this space. ISO 23247 provides an architecture for digital twin frameworks in manufacturing [115], and standards-focused analyses emphasize the need for interoperable, “trustworthy” digital twins and guidance on implementation [116]. In practice, building and maintaining semiconductor-relevant twins remains challenging due to the calibration burden, data governance, and the need for continuous alignment between models and evolving tool states [106,114,115,116].

11.4. Challenges of Complexity and Model Lifecycle Management

Despite progress, increasing model complexity (hybrid physics–ML, streaming MSPC, fab-wide decision automation) raises interpretability challenges and can erode engineer trust if diagnostics are not actionable [87,106,113,117,118]. The rapidly evolving fab environment also intensifies lifecycle demands: models can become obsolete after upgrades, material changes, node transitions, or shifts in operating regimes [87,106,113].

From an operational perspective, real-world ML systems often accumulate “technical debt” (hidden maintenance costs, entanglement, feedback loops, and brittle dependencies), making robust governance, monitoring, retraining protocols, and versioning essential for sustainability [119]. Therefore, the future of statistical systems in advanced wafer fabs depends not only on algorithmic innovation, but also on standardized practices for interpretability, validation, drift handling, and long-horizon model stewardship [87,106,113,119].

11.5. Other Emerging Directions

Emerging directions in semiconductor manufacturing increasingly emphasize the convergence of AI/ML, physics-based modeling, and real-time system integration. AI/ML techniques are expanding beyond prediction toward autonomous decision-making, enabling advanced anomaly detection, virtual metrology, and adaptive control under complex nonlinear conditions [106,117,118]. However, their integration requires improved interpretability and robustness to ensure trust in high-stakes environments. Digital twins are gaining prominence as virtual replicas of tools and processes, enabling simulation-driven optimization, predictive maintenance, and cross-fab knowledge transfer, supported by standardized architectures and interoperable data frameworks [106,114,116].

Simultaneously, physics-informed and hybrid models are emerging to bridge first-principles understanding with data-driven learning, improving extrapolation and stability under limited data [87,106]. Finally, real-time adaptive systems—combining streaming analytics, Bayesian updating, and APC—enable continuous learning and rapid response to drift and variability. Together, these directions point toward fully integrated, intelligent manufacturing systems capable of autonomous optimization and resilient operation [113,117,118].

The emerging trends and challenges of statistical techniques in semiconductor manufacturing are listed in Table 4.

In summary, semiconductor manufacturing is evolving toward real-time, data-driven, and uncertainty-aware systems as process complexity increases. Key trends include streaming analytics for high-frequency data, enabling rapid anomaly detection under latency constraints, and risk-based decision-making through uncertainty quantification. Hybrid physics–data models and digital twins improve robustness, interpretability, and system integration but face challenges in calibration and data governance. Increasing model complexity raises issues in interpretability and lifecycle management, requiring robust monitoring and retraining strategies. Emerging directions emphasize AI-driven autonomy, adaptive control, and integrated architectures, aiming to achieve intelligent, resilient, and self-optimizing manufacturing systems while ensuring long-term sustainability and trust.

12. Conclusions

The evolution of semiconductor manufacturing technology over the past few decades has fundamentally transformed the role of statistics in industrial practice. Initially used to monitor limited process parameters, statistics has evolved into a comprehensive analytical framework that underpins advanced process control (APC), equipment health monitoring, and intelligent decision support. Modern wafer fabrication environments generate massive volumes of correlated, hierarchical, and dynamic data, requiring methodologies that go far beyond traditional univariate statistical process control (SPC).

This review demonstrates that MSPC, time-series modeling, Bayesian inference, adaptive experimental design, fault detection and classification (FDC), run-to-run (R2R) control architectures, and machine learning collectively form an integrated analytical ecosystem. These approaches enable early detection of subtle deviations, accurate modeling of complex input–output relationships, structured handling of sparse and delayed measurements, and probabilistic decision-making aligned with economic objectives. However, their effectiveness depends not only on methodological sophistication but also on seamless integration with manufacturing execution systems (MES), APC platforms, and fab-wide digital infrastructures.

As semiconductor technologies continue to advance, with shrinking device geometries and increasingly complex architectures, process windows will narrow and sensitivity to variation will intensify. Consequently, the demand for advanced statistical methodologies will continue to grow. Emerging paradigms—including real-time analytics, uncertainty quantification, physics-informed modeling, and digital twin frameworks—are expanding the scope of statistical responsibilities and reshaping the architecture of modern wafer fabs. Future manufacturing systems will rely on deeply integrated statistical frameworks that are continuously operational, dynamically adaptive, and tightly coupled with automation and machine learning systems.

Looking forward, the trends highlighted in Section 11 point toward a transition to highly integrated, real-time, and uncertainty-aware manufacturing environments. Several concrete research directions emerge.

First, hybrid physics–data modeling frameworks should be further developed to integrate first-principles understanding with statistical and machine learning approaches. Such models can improve robustness, interpretability, and extrapolation capability, particularly under distribution shift and limited data conditions, and are essential for reliable digital twin implementation.

Second, real-time and streaming analytics must advance toward ultra-low-latency, high-dimensional monitoring systems. Future research should focus on incremental learning algorithms, adaptive thresholding, and multi-stream data fusion, as well as efficient computational architectures such as edge–cloud collaboration to support continuous wafer-level and tool-level monitoring.

Third, uncertainty-aware and risk-based control systems should be expanded. Integrating Bayesian inference, predictive distributions, and cost-aware decision policies into APC frameworks will enable more informed decisions, such as dynamic sampling, predictive maintenance, and economically optimized process adjustments.

Fourth, model lifecycle management and sustainability must become a formalized research area. As fab conditions evolve, models require systematic approaches for drift detection, retraining, validation, and version control. Establishing standardized protocols for model governance is critical to ensuring long-term reliability and mitigating technical debt in deployed systems.

Fifth, digital twin integration and cross-level system architectures require further development. Future work should address multi-scale integration across tools, processes, and fab systems, supported by interoperable data frameworks, standardization efforts, and robust validation methodologies to ensure trustworthy deployment.

Finally, interpretability and human–machine collaboration should be emphasized. As models grow in complexity, explainable AI techniques tailored to semiconductor processes are essential for providing actionable insights, supporting root-cause diagnosis, and maintaining engineers’ trust in automated decision systems.

In conclusion, semiconductor manufacturing has elevated statistics from an auxiliary tool to a foundational component of intelligent production systems. The next generation of wafer fabs will depend on deeply integrated, adaptive, and interpretable statistical architectures in which data, models, and control systems continuously co-evolve. Achieving this vision will require not only advances in methodology but also progress in system integration, computational infrastructure, and model governance, ultimately enabling sustainable improvements in yield, efficiency, and reliability.

Author Contributions

Conceptualization, H.-Y.C. and C.C.; methodology, H.-Y.C. and C.C.; software, C.C.; formal analysis, H.-Y.C.; investigation, H.-Y.C. and C.C.; data curation, H.-Y.C.; writing—original draft preparation, H.-Y.C. and C.C.; writing—review and editing, H.-Y.C. and C.C.; visualization, C.C.; supervision, C.C.; project administration, C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data are unavailable because a statement is still pending.

Acknowledgments

During the preparation of this manuscript/study, the authors used Grammarly, version 14.1267.0, to revise the English. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Shewhart, W.A. Economic Control of Quality of Manufactured Product; Barakaldo Books: Barakaldo, Spain, 2022.
Page, E.S. Continuous inspection schemes. Biometrika 1954, 41, 100–115.
Roberts, S.W. Control chart tests based on geometric moving averages. Technometrics 2000, 42, 97–101.
Kane, V.E. Process capability indices. J. Qual. Technol. 1986, 18, 41–52.
Montgomery, D.C. Introduction to Statistical Quality Control, 6th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2009.
May, G.S.; Spanos, C.J. Fundamentals of Semiconductor Manufacturing and Process Control; John Wiley & Sons: Hoboken, NJ, USA, 2006.
Spanos, C.J. Statistical process control in semiconductor manufacturing. Proc. IEEE 1992, 80, 819–830.
International Technology Roadmap for Semiconductors (ITRS). Factory Integration (ITRS 2.0). 2015. Available online: https://www.semiconductors.org/resources/2015-international-technology-roadmap-for-semiconductors-itrs/ (accessed on 13 February 2026).
He, Q.P.; Wang, J. Fault detection using the k-nearest neighbor rule for semiconductor manufacturing processes. IEEE Trans. Semicond. Manuf. 2007, 20, 345–354.
Yu, J. Fault detection using principal components-based Gaussian mixture model for semiconductor manufacturing processes. IEEE Trans. Semicond. Manuf. 2011, 24, 432–444.
Qin, S.J. Statistical process monitoring: Basics and beyond. J. Chemom. 2003, 17, 480–502.
Kourti, T.; MacGregor, J.F. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemom. Intell. Lab. Syst. 1995, 28, 3–21.
Jackson, J.E.; Mudholkar, G.S. Control procedures for residuals associated with principal component analysis. Technometrics 1979, 21, 341–349.
Wold, S.; Ruhe, A.; Wold, H.; Dunn, W.J. The collinearity problem in linear regression: The partial least squares (PLS) approach to generalized inverses. SIAM J. Sci. Stat. Comput. 1984, 5, 735–743.
Ku, W.; Storer, R.H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179–196.
Nomikos, P.; MacGregor, J.F. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37, 41–59.
Goodlin, B.E.; Boning, D.S.; Sawin, H.H.; Wise, B.M. Simultaneous fault detection and classification for semiconductor manufacturing tools. J. Electrochem. Soc. 2002, 150, 778–784.
Chien, C.F.; Chen, Y.J.; Hsu, C.Y.; Wang, H.K. Overlay error compensation using advanced process control with dynamically adjusted proportional-integral R2R controller. IEEE Trans. Autom. Sci. Eng. 2013, 11, 473–484.
Butler, S.W. Process control in semiconductor manufacturing. J. Vac. Sci. Technol. B 1995, 13, 1917–1927.
SEMI E93; Provisional Specification for CIM Framework Advanced Process Control Component. SEMI: Milpitas, CA, USA, 2000.
SEMI E81; Provisional Specification for CIM Framework Domain Architecture. SEMI: Milpitas, CA, USA, 2000.
Hsu, C.Y.; Liu, W.C. Multiple time-series convolutional neural network for fault detection and diagnosis and empirical study in semiconductor manufacturing. J. Intell. Manuf. 2021, 32, 823–836.
Bao, L.; Wang, K.; Jin, R. A hierarchical model for characterizing spatial wafer variations. Int. J. Prod. Res. 2014, 52, 1827–1842.
Ezzat, A.A.; Liu, S.; Hochbaum, D.S.; Ding, Y. A graph-theoretic approach for spatial filtering and its impact on mixed-type spatial pattern recognition in wafer bin maps. IEEE Trans. Semicond. Manuf. 2021, 34, 194–206.
Pinheiro, J.C.; Bates, D.M. Mixed-Effects Models in S and S-PLUS; Springer: New York, NY, USA, 2000.
Cheng, F.T.; Chen, Y.T.; Su, Y.C.; Zeng, D.L. Evaluating the reliance level of a virtual metrology system. IEEE Trans. Semicond. Manuf. 2008, 21, 92–103.
Bunday, B.D.; Allgair, J.A.; Caldwell, M.; Solecky, E.P.; Archie, C.N.; Rice, B.J.; Emami, I. Value-added metrology. IEEE Trans. Semicond. Manuf. 2007, 20, 266–277.
Wu, W.M.; Cheng, F.T.; Kong, F.W. Dynamic-moving-window scheme for virtual-metrology model refreshing. IEEE Trans. Semicond. Manuf. 2012, 25, 238–246.
Kao, C.A.; Cheng, F.T.; Wu, W.M.; Kong, F.W.; Huang, H.H. Run-to-run control utilizing virtual metrology with reliance index. IEEE Trans. Semicond. Manuf. 2013, 26, 69–81.
Kang, P.; Kim, D.; Lee, H.J.; Doh, S.; Cho, S. Virtual metrology for run-to-run control in semiconductor manufacturing. Expert Syst. Appl. 2011, 38, 2508–2522.
Lambert, D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 1992, 34, 1–14.
Hilbe, J.M. Negative Binomial Regression, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011.
Shim, J.; Cho, S.; Kum, E.; Jeong, S. Adaptive fault detection framework for recipe transition in semiconductor manufacturing. Comput. Ind. Eng. 2021, 161, 107632.
Alshammri, F.; Pan, J. Moving dynamic principal component analysis for non-stationary multivariate time series. Comput. Stat. 2021, 36, 2247–2287.
MacGregor, J.F.; Kourti, T. Statistical process control of multivariate processes. Control Eng. Pract. 1995, 3, 403–414.
Hotelling, H. Multivariate quality control, illustrated by the air testing of sample bombsights. In Techniques of Statistical Analysis; Eisenhart, C., Hastay, M.W., Wallis, W.A., Eds.; McGraw-Hill: New York, NY, USA, 1947; pp. 111–184.
Mahalanobis, P.C. On the generalized distance in statistics. Sankhya A 2018, 80, S1–S7.
Tracy, N.D.; Young, J.C.; Mason, R.L. Multivariate control charts for individual observations. J. Qual. Technol. 1992, 24, 88–95.
Mason, R.L.; Tracy, N.D.; Young, J.C. Decomposition of T² for multivariate control chart interpretation. J. Qual. Technol. 1995, 27, 99–108.
Cherry, G.A.; Qin, S.J. Multiblock principal component analysis based on a combined index for semiconductor fault detection and diagnosis. IEEE Trans. Semicond. Manuf. 2006, 19, 159–172.
Li, W.; Yue, H.H.; Valle-Cervantes, S.; Qin, S.J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10, 471–486.
Schölkopf, B.; Smola, A.; Müller, K.-R. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998, 10, 1299–1319. [Google Scholar] [CrossRef]
Cheng, C.-Y.; Hsu, C.-C.; Chen, M.-C. Adaptive kernel principal component analysis (KPCA) for monitoring small disturbances of nonlinear processes. Ind. Eng. Chem. Res. 2010, 49, 2254–2262. [Google Scholar] [CrossRef]
Sachs, E.; Hu, A.; Ingolfsson, A. Run-by-run process control: Combining SPC and feedback control. IEEE Trans. Semicond. Manuf. 1995, 8, 26–43. [Google Scholar] [CrossRef]
Spanos, C.J.; Guo, H.F.; Miller, A.; Levine-Parrill, J. Real-time statistical process control using tool data. IEEE Trans. Semicond. Manuf. 1992, 5, 308–318. [Google Scholar] [CrossRef]
Guo, R.S.; Chen, A.; Chen, J.J. Run-to-run control schemes for CMP process subject to deterministic drifts. In Proceedings of the 2000 Semiconductor Manufacturing Technology Workshop (Cat. No. 00EX406), Hsinchu, Taiwan, 14–15 June 2000; IEEE: New York, NY, USA, 2000; pp. 251–258. [Google Scholar]
Chen, J.H.; Kuo, T.W.; Lee, A.C. Run-by-run process control of metal sputter deposition: Combining time series and extended Kalman filter. IEEE Trans. Semicond. Manuf. 2007, 20, 278–285. [Google Scholar] [CrossRef]
Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, USA, 1994. [Google Scholar]
Shumway, R.H.; Stoffer, D.S. Time Series Analysis and Its Applications: With R Examples, 4th ed.; Springer: Cham, Switzerland, 2017. [Google Scholar]
Hawkins, D.M.; Olwell, D.H. Cumulative Sum Charts and Charting for Quality Improvement; Springer Science & Business Media: New York, NY, USA, 2012. [Google Scholar]
Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
Kalman, R.E. A new approach to linear filtering and prediction problems. J. Basic Eng. 1960, 82, 35–45. [Google Scholar] [CrossRef]
Durbin, J.; Koopman, S.J. Time Series Analysis by State Space Methods, 2nd ed.; Oxford University Press: Oxford, UK, 2012. [Google Scholar]
Jazwinski, A.H. Stochastic Processes and Filtering Theory; Courier Corporation: Mineola, NY, USA, 2007. [Google Scholar]
Basseville, M.; Nikiforov, I.V. Detection of Abrupt Changes: Theory and Application; Prentice Hall: Englewood Cliffs, NJ, USA, 1993. [Google Scholar]
Pettitt, A.N. A non-parametric approach to the change-point problem. J. R. Stat. Soc. Ser. C 1979, 28, 126–135. [Google Scholar] [CrossRef]
Adams, R.P.; MacKay, D.J.C. Bayesian online changepoint detection. arXiv 2007, arXiv:0710.3742. [Google Scholar] [CrossRef]
Killick, R.; Fearnhead, P.; Eckley, I.A. Optimal detection of changepoints with a linear computational cost. J. Am. Stat. Assoc. 2012, 107, 1590–1598. [Google Scholar] [CrossRef]
Brodsky, E.; Darkhovsky, B.S. Nonparametric Methods in Change Point Problems; Springer Science & Business Media: New York, NY, USA, 2013; Volume 243. [Google Scholar]
Fearnhead, P.; Liu, Z. On-line inference for multiple changepoint problems. J. R. Stat. Soc. Ser. B 2007, 69, 589–605. [Google Scholar] [CrossRef]
Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis; Chapman & Hall/CRC: Boca Raton, FL, USA, 2013. [Google Scholar]
Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer Science & Business Media: New York, NY, USA, 2013. [Google Scholar]
Girshick, M.A.; Rubin, H. A Bayes approach to a quality control model. Ann. Math. Stat. 1952, 23, 114–125. [Google Scholar] [CrossRef]
Lindley, D.V.; Smith, A.F.M. Bayes estimates for the linear model. J. R. Stat. Soc. Ser. B 1972, 34, 1–18. [Google Scholar] [CrossRef]
Gelman, A. Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Anal. 2006, 1, 515–534. [Google Scholar] [CrossRef]
Naderkhani, F.; Makis, V. Economic design of multivariate Bayesian control chart with two sampling intervals. Int. J. Prod. Econ. 2016, 174, 29–42. [Google Scholar] [CrossRef]
Lin, J.; Wang, K. A Bayesian framework for online parameter estimation and process adjustment using categorical observations. IIE Trans. 2012, 44, 291–300. [Google Scholar] [CrossRef]
Chen, A.; Blue, J. Recipe-independent indicator for tool health diagnosis and predictive maintenance. IEEE Trans. Semicond. Manuf. 2009, 22, 522–535. [Google Scholar] [CrossRef]
Chien, C.F.; Van Nguyen, T.H.; Li, Y.C.; Chen, Y.J. Bayesian decision analysis for optimizing in-line metrology and defect inspection strategy for sustainable semiconductor manufacturing and an empirical study. Comput. Ind. Eng. 2023, 182, 109421. [Google Scholar] [CrossRef]
Kanarik, K.J.; Osowiecki, W.T.; Lu, Y.; Talukder, D.; Roschewsky, N.; Park, S.N.; Gottscho, R.A. Human–machine collaboration for improving semiconductor process development. Nature 2023, 616, 707–711. [Google Scholar] [CrossRef]
Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877. [Google Scholar] [CrossRef]
Box, G.E.P.; Hunter, J.S.; Hunter, W.G. Statistics for Experimenters: Design, Innovation, and Discovery, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2005. [Google Scholar]
Myers, R.H.; Montgomery, D.C.; Anderson-Cook, C.M. Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 4th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2016. [Google Scholar]
Box, G.E.P.; Draper, N.R. Empirical Model-Building and Response Surfaces; John Wiley & Sons: New York, NY, USA, 1987. [Google Scholar]
Boning, D.S.; Mozumder, P.K. DOE/Opt: A system for design of experiments, response surface modeling, and optimization using process and device simulation. IEEE Trans. Semicond. Manuf. 1994, 7, 233–244. [Google Scholar] [CrossRef]
Gaston, G.J.; Walton, A.J. The integration of simulation and response surface methodology for the optimization of IC processes. IEEE Trans. Semicond. Manuf. 1994, 7, 22–33. [Google Scholar] [CrossRef]
Shumate, D.A.; Montgomery, D.C. Development of a TiW plasma etch process using a mixture experiment and response surface optimization. IEEE Trans. Semicond. Manuf. 1996, 9, 335–343. [Google Scholar] [CrossRef]
Boning, D.S.; McIlrath, M.B.; Penfield, P.; Sachs, E.M. A general semiconductor process modeling framework. IEEE Trans. Semicond. Manuf. 1992, 5, 266–280. [Google Scholar] [CrossRef]
Montgomery, D.C. Design and Analysis of Experiments, 10th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2019. [Google Scholar]
Chaloner, K.; Verdinelli, I. Bayesian experimental design: A review. Stat. Sci. 1995, 10, 273–304. [Google Scholar] [CrossRef]
Atkinson, A.C. Optimum experimental design. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient global optimization of expensive black-box functions. J. Glob. Optim. 1998, 13, 455–492. [Google Scholar] [CrossRef]
Rasmussen, C.E. Gaussian processes in machine learning. In Proceedings of the Summer School on Machine Learning, Berlin, Germany, 2–14 February 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 63–71. [Google Scholar]
Mockus, J. Bayesian Approach to Global Optimization: Theory and Applications; Springer Science & Business Media: New York, NY, USA, 2012. [Google Scholar]
Mack, C. Fundamental Principles of Optical Lithography: The Science of Microfabrication; John Wiley & Sons: Hoboken, NJ, USA, 2008. [Google Scholar]
Quiñonero-Candela, J.; Rasmussen, C.E. A unifying view of sparse approximate Gaussian process regression. J. Mach. Learn. Res. 2005, 6, 1939–1959. [Google Scholar]
Moyne, J.; Samantaray, J.; Armacost, M. Big data capabilities applied to semiconductor manufacturing: Advanced process control. IEEE Trans. Semicond. Manuf. 2016, 29, 283–291. [Google Scholar] [CrossRef]
Cohen, L. Time-Frequency Analysis; Prentice Hall: Englewood Cliffs, NJ, USA, 1995. [Google Scholar]
Mallat, S. A Wavelet Tour of Signal Processing; Academic Press: San Diego, CA, USA, 1999. [Google Scholar]
Jeong, Y.S.; Kim, B.; Ko, Y.D. Exponentially weighted moving average-based procedure with adaptive thresholding for monitoring nonlinear profiles: Monitoring of plasma etch process in semiconductor manufacturing. Expert Syst. Appl. 2013, 40, 5688–5693. [Google Scholar] [CrossRef]
Rabiner, L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 1989, 77, 257–286. [Google Scholar] [CrossRef] [PubMed]
Hong, S.J.; Lim, W.Y.; Cheong, T.; May, G.S. Fault detection and classification in plasma etch equipment for semiconductor manufacturing e-diagnostics. IEEE Trans. Semicond. Manuf. 2011, 25, 83–93. [Google Scholar] [CrossRef]
Susto, G.A.; Schirru, A.; Pampuri, S.; McLoone, S.; Beghi, A. Machine learning for predictive maintenance: A multiple classifier approach. IEEE Trans. Ind. Inform. 2014, 11, 812–820. [Google Scholar] [CrossRef]
Park, S.J.; Lee, M.S.; Shin, S.Y.; Cho, K.H.; Lim, J.T.; Cho, B.S.; Park, C.H. Run-to-run overlay control of steppers in semiconductor manufacturing systems based on history data analysis and neural network modeling. IEEE Trans. Semicond. Manuf. 2005, 18, 605–613. [Google Scholar] [CrossRef]
Musacchio, J.; Rangan, S.; Spanos, C.; Poolla, K. On the utility of run to run control in semiconductor manufacturing. In Proceedings of the 1997 IEEE International Symposium on Semiconductor Manufacturing, San Francisco, CA, USA, 6–8 October 1997; IEEE: New York, NY, USA, 1997; p. D9-12. [Google Scholar]
Wu, M.J.; Jang, J.S.R.; Chen, J.L. Wafer map failure pattern recognition and similarity ranking for large-scale data sets. IEEE Trans. Semicond. Manuf. 2014, 28, 1–12. [Google Scholar] [CrossRef]
Yang, Y.; Sun, M. Semiconductor defect pattern classification by self-proliferation-and-attention neural network. IEEE Trans. Semicond. Manuf. 2021, 35, 16–23. [Google Scholar] [CrossRef]
Cheng, K.C.C.; Chen, L.L.Y.; Li, J.W.; Li, K.S.M.; Tsai, N.C.Y.; Wang, S.J.; Huang, A.Y.A.; Chou, L.; Lee, C.S.; Chen, J.-E.; et al. Machine learning-based detection method for wafer test induced defects. IEEE Trans. Semicond. Manuf. 2021, 34, 161–167. [Google Scholar] [CrossRef]
Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A. A survey on concept drift adaptation. ACM Comput. Surv. 2014, 46, 1–37. [Google Scholar] [CrossRef]
Bayram, F.; Ahmed, B.S.; Kassler, A. From concept drift to model degradation: An overview on performance-aware drift detectors. Knowl.-Based Syst. 2022, 245, 108632. [Google Scholar] [CrossRef]
Ross, G.J.; Adams, N.M.; Tasoulis, D.K.; Hand, D.J. Exponentially weighted moving average charts for detecting concept drift. Pattern Recognit. Lett. 2012, 33, 191–198. [Google Scholar] [CrossRef]
Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 1135–1144. [Google Scholar]
Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017); Curran Associates, Inc.: Red Hook, NY, USA, 2017. [Google Scholar]
Gama, J.; Sebastião, R.; Rodrigues, P.P. On evaluating stream learning algorithms. Mach. Learn. 2013, 90, 317–346. [Google Scholar] [CrossRef]
Wang, S.; Botros, Y.; Martin, J.W. Enabling robustness and flexibility of equipment data collection through SEMI EDA standards. In Proceedings of the 2004 IEEE/SEMI Advanced Semiconductor Manufacturing Conference and Workshop, Boston, MA, USA, 4–6 May 2004; IEEE: New York, NY, USA, 2004; pp. 165–169. [Google Scholar]
IEEE. International Roadmap for Devices and Systems (IRDS): Factory Integration; IEEE: Piscataway, NJ, USA, 2022; Available online: https://irds.ieee.org/images/files/pdf/2023/2023IRDS_FAC.pdf (accessed on 13 February 2026).
Khan, A.A.; Moyne, J.R.; Tilbury, D.M. An approach for factory-wide control utilizing virtual metrology. IEEE Trans. Semicond. Manuf. 2007, 20, 364–375. [Google Scholar] [CrossRef]
Qin, S.J.; Cherry, G.; Good, R.; Wang, J.; Harrison, C.A. Semiconductor manufacturing process control and monitoring: A fab-wide framework. J. Process Control 2006, 16, 179–191. [Google Scholar]
Cheng, F.T.; Chang, J.Y.C.; Kao, C.A.; Chen, Y.L.; Peng, J.L. Configuring AVM as a MES component. In Proceedings of the 2010 IEEE/SEMI Advanced Semiconductor Manufacturing Conference (ASMC), San Francisco, CA, USA, 11–13 July 2010; IEEE: New York, NY, USA, 2010; pp. 226–231. [Google Scholar]
International Society of Automation (ISA). Enterprise-Control System Integration—Part 1: Models and Terminology; International Society of Automation: Research Triangle Park, NC, USA, 2010. [Google Scholar]
Zou, C.; Wang, Z.; Zi, X.; Jiang, W. An efficient online monitoring method for high-dimensional data streams. Technometrics 2015, 57, 374–387. [Google Scholar] [CrossRef]
Ko, J.M.; Hong, S.R.; Choi, J.Y.; Kim, C.O. Wafer-to-wafer process fault detection using data stream mining techniques. Int. J. Precis. Eng. Manuf. 2013, 14, 103–113. [Google Scholar] [CrossRef]
Colosimo, B.M.; Jones-Farmer, L.A.; Megahed, F.M.; Paynabar, K.; Ranjan, C.; Woodall, W.H. Statistical process monitoring from industry 2.0 to industry 4.0: Insights into research and practice. Technometrics 2024, 66, 507–530. [Google Scholar] [CrossRef]
Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital twin in manufacturing: A categorical literature review and classification. IFAC-PapersOnLine 2018, 51, 1016–1022. [Google Scholar] [CrossRef]
ISO 23247-1:2021; Automation Systems and Integration—Digital Twin Framework for Manufacturing—Part 1: Overview and General Principles. International Organization for Standardization: Geneva, Switzerland, 2021.
Shao, G. Manufacturing digital twin standards. In Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, Linz, Austria, 22–27 September 2024; ACM: New York, NY, USA, 2024; pp. 370–377. [Google Scholar]
Pan, L.; Li, G.; Zhu, T.; Liu, D.; Wang, Y.; Lu, Y. Physics-informed machine learning in design and manufacturing: Status and challenges. J. Comput. Inf. Sci. Eng. 2025, 25, 120804. [Google Scholar] [CrossRef]
Leng, J.; Zuo, K.; Xu, C.; Zhou, X.; Zheng, S.; Kang, J.; Gao, R.X. Physics-informed machine learning in intelligent manufacturing: A review. J. Intell. Manuf. 2025, 1–43. [Google Scholar] [CrossRef]
Sculley, D.; Holt, G.; Golovin, D.; Davydov, E.; Phillips, T.; Ebner, D.; Chaudhary, V.; Young, M.; Crespo, J.F.; Dennison, D. Hidden technical debt in machine learning systems. In Advances in Neural Information Processing Systems 28 (NeurIPS 2015); Curran Associates, Inc.: Red Hook, NY, USA, 2015; pp. 2503–2511. [Google Scholar]

Figure 1. Traditional and modern statistical techniques in semiconductor manufacturing. Arrows represent processes.

Figure 2. Hierarchical structure in semiconductor manufacturing. Arrows represent processes.

Figure 3. Multivariate Statistical Process Control (MSPC) in semiconductor manufacturing. Arrows represent processes.

Figure 4. Experimental design and response surface modeling in semiconductor manufacturing. Arrows represent processes.

Table 1. Time-Series Modeling and Drift Detection in Semiconductor Manufacturing.

Method/Category	Core Idea (What It Models)	Drift/Anomaly Types Best Detected	Strengths for Fab Use	Key Limitations/Cautions	Typical Semiconductor Use Cases	Representative Literature
Why time-series methods are needed (context)	Fab data are sequential: tool age, PM resets, and environments vary. Yield loss often comes from accumulated small deviations across runs (R2R context).	Slow drift, low-frequency oscillations, post-PM shifts, time-varying regimes.	Captures autocorrelation and dynamics, enabling earlier detection of subtle degradation compared with static SPC.	Ignoring the time structure allows noise to mask drift, delaying detection and giving a false sense of stability.	R2R/APC loops; long-term monitoring of CD, overlay, thickness, etch rate; alignment with maintenance cycles.	[6,44,45,46,47]
Drift characteristics in advanced tools (process reality)	Drift arises from physical causes such as chamber contamination, CMP pad/slurry wear, sensor calibration drift, and lithography optics degradation.	Gradual trends and oscillations rather than abrupt failures.	Supports mechanism-aware monitoring design, focusing on small but persistent changes and temporal structure.	High variability can obscure drift; independence assumptions often fail; one must separate noise from degradation.	Plasma etch/deposition monitoring, CMP stability, lithography CD/overlay, sensor drift tracking.	[6,44,45,46,47,48,49]
EWMA control chart	Uses exponentially weighted averages; recent data have more influence, smoothing noise while tracking slow change.	Small, sustained mean shifts and gradual drift.	Simple, widely used, and effective for detecting subtle long-term trends while reducing noise.	Sensitive to parameter choice; autocorrelation and regime changes may distort control limits.	CD/overlay monitoring, film-thickness, etch-rate trends, and smoothed metrology signals.	[2,3,5,6,50]
CUSUM chart	Accumulates deviations from a target over time; persistent, small biases add up until the detection threshold is reached.	Small systematic shifts and persistent bias.	Highly sensitive to small changes; matches the gradual degradation behavior.	Requires careful threshold tuning; may produce false alarms under correlated noise or regime shifts.	Overlay bias detection, etch rate drift, and electrical parameter shifts.	[2,5,6,50]
AR/ARIMA modeling (Box–Jenkins)	Models a variable using past values and prediction errors; anomalies are detected via residuals from forecasts.	Autocorrelated sequences and predictable temporal patterns; anomalies appear as residual outliers.	Explicitly models time dependence; supports prediction and residual-based monitoring.	Mainly linear; may not capture nonlinear interactions; requires updating across regime changes.	Residual monitoring for CD/overlay/thickness; run-to-run trends; tool condition tracking.	[48,49,51]
State-space models + Kalman filtering	Models a hidden system state that evolves over time and is observed through noisy measurements; recursively updates the state estimate.	Gradual drift, evolving baselines, noisy measurements.	Integrates naturally with R2R/APC; separates process and measurement noise; enables adaptive control.	Requires correct model structure and noise assumptions; complex for multivariate systems.	Lithography dose/focus control, CMP compensation, etch correction, and chamber state estimation.	[44,45,47,52,53,54]
Change point detection (offline/online)	Detects times when statistical properties change, segmenting data into distinct regimes.	Abrupt shifts such as post-maintenance changes, recipe updates, or new degradation phases.	Improves diagnosis and modeling by linking changes to events and enabling regime-specific analysis.	Trade-off between detection delay and false alarms; gradual drift and multiple changes complicate detection.	Maintenance impact analysis, degradation-onset detection, and regime segmentation for modeling.	[6,44,45,55,56,57,58,59,60]
Integrated takeaway (combined framework)	Combines EWMA/CUSUM (fast detection), ARIMA/state-space (prediction and control), and change point methods (regime management).	Both gradual drift and discrete changes across evolving baselines.	Provides comprehensive monitoring, earlier intervention, and improved diagnosis for advanced fabs.	Requires handling autocorrelation, regime shifts, and model updates; increased system complexity.	Full monitoring of CD, overlay, thickness, etch, and CMP; integration with R2R/APC systems.	[2,3,4,5,6,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60]
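
The EWMA and CUSUM rows of Table 1 can be illustrated with a minimal sketch. It assumes independent observations with a known in-control target and standard deviation, which the table itself warns may not hold for autocorrelated fab data. The function names, the smoothing constant `lam = 0.2`, the limit width of 3, the CUSUM allowance `k = 0.5`, the threshold `h = 5`, and the simulated film-thickness drift are all illustrative assumptions, not values taken from the reviewed literature.

```python
import random


def ewma_first_alarm(xs, target, sigma, lam=0.2, width=3.0):
    """Return the 0-based index of the first EWMA alarm, or None."""
    z = target
    for t, x in enumerate(xs, start=1):
        # Recent observations get weight lam; older ones decay geometrically.
        z = lam * x + (1.0 - lam) * z
        # Exact variance of the EWMA statistic after t independent observations.
        var = sigma ** 2 * lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        if abs(z - target) > width * var ** 0.5:
            return t - 1
    return None


def cusum_first_alarm(xs, target, sigma, k=0.5, h=5.0):
    """Two-sided tabular CUSUM; k (allowance) and h (threshold) are in sigma units."""
    upper = lower = 0.0
    for t, x in enumerate(xs):
        u = (x - target) / sigma
        # Deviations beyond the allowance accumulate; smaller ones are absorbed.
        upper = max(0.0, upper + u - k)
        lower = max(0.0, lower - u - k)
        if upper > h or lower > h:
            return t
    return None


# Hypothetical standardized thickness deviations: 50 in-control runs,
# followed by a slow upward drift of 0.02 sigma per run.
rng = random.Random(0)
runs = [rng.gauss(0.0, 1.0) for _ in range(50)]
runs += [rng.gauss(0.02 * i, 1.0) for i in range(1, 151)]

print("EWMA first alarm at run:", ewma_first_alarm(runs, 0.0, 1.0))
print("CUSUM first alarm at run:", cusum_first_alarm(runs, 0.0, 1.0))
```

Both charts keep only a running statistic per monitored variable, so they are cheap to run online. For autocorrelated data, one common workaround is to apply the same charts to the residuals of a fitted time-series model rather than to the raw measurements.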

Table 2. The quantitative criteria and direct comparison between methods.

Method	Best at Detecting	Detection Delay	False Alarm Risk	Small-Shift Sensitivity	Prediction Ability	Regime-Shift Handling	Complexity	Typical Quantitative Strengths
EWMA	gradual drift	low	medium	high	low	low	low	strong for 0.5σ–1σ shifts; simple online use
CUSUM	persistent small bias	very low	medium-high if poorly tuned	very high	low	low	low	often best for the smallest sustained shifts
AR/ARIMA	autocorrelated patterns	medium	low-medium	medium	high	low-medium	medium	good forecast RMSE and residual monitoring
State-space/Kalman	noisy evolving drift	low	low	high	very high	medium	high	strong state estimation, adaptive filtering
Change-point detection	abrupt shifts/regime changes	very low for abrupt changes; high for slow drift	depends on the threshold	low for tiny gradual drift	low	very high	medium-high	best onset localization for discrete changes
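
The state-space/Kalman row can be sketched with the simplest possible case, a scalar local-level (random-walk) model: the hidden tool state follows x_t = x_{t-1} + w_t and each metrology reading is y_t = x_t + v_t. This is an illustration under assumed noise variances `q` and `r`, not a model from any cited study; the function name and the simulated chamber drift are hypothetical. Runs without metrology are passed as `None` and handled by a prediction-only step.

```python
import random


def local_level_filter(ys, q, r, m0=0.0, p0=1e6):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, y_t = x_t + v_t.

    q and r are the assumed process and measurement noise variances.
    Returns the filtered state means and the standardized innovations.
    """
    m, p = m0, p0
    states, innovations = [], []
    for y in ys:
        p_pred = p + q  # predict: a random walk keeps the mean, inflates variance
        if y is None:
            # No metrology for this run: carry the prediction forward.
            p = p_pred
            innovations.append(None)
        else:
            s = p_pred + r      # innovation variance
            gain = p_pred / s   # Kalman gain
            e = y - m           # innovation (one-step-ahead forecast error)
            m = m + gain * e
            p = (1.0 - gain) * p_pred
            innovations.append(e / s ** 0.5)
        states.append(m)
    return states, innovations


# Hypothetical chamber drift of 0.05 units per run, measured on every second run.
rng = random.Random(1)
truth = [0.05 * t for t in range(100)]
obs = [x + rng.gauss(0.0, 1.0) if t % 2 == 0 else None
       for t, x in enumerate(truth)]

est, innov = local_level_filter(obs, q=0.05, r=1.0)
print("final true state:", truth[-1], "final estimate:", round(est[-1], 2))
```

The standardized innovations are approximately white with unit variance when the model fits, so they can be passed to an EWMA or CUSUM chart as a residual-based monitor. A pure random-walk state lags behind a steady ramp; adding a slope component to the state is the usual extension when drift is persistent.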

Table 3. Comparative analysis between methods: MSPC, time-series models, Bayesian approaches, and FDC.

Aspect	MSPC	Time-Series Models and Drift Detection	Bayesian Approaches	FDC (Fault Detection and Classification)
Core idea	Jointly model correlated multivariate variables using covariance structure and latent variables	Model temporal dependence, autocorrelation, and drift evolution over time	Treat parameters as probabilistic; update beliefs using prior + data (posterior)	Monitor equipment health using high-frequency sensor data and multivariate/dynamic models
Primary data characteristics addressed	High dimensionality, correlation, multicollinearity, moderate non-stationarity	Sequential data, temporal dynamics, drift, regime changes	Sparse data, uncertainty, hierarchical structure, evolving baselines	High-frequency sensor streams, dynamic signals, and equipment-level variability
Typical methods	Hotelling T², PCA/PLS, kernel/adaptive PCA	EWMA, CUSUM, ARIMA, state-space (Kalman), change-point detection	Bayesian inference, Bayesian control charts, hierarchical models, decision theory	PCA-based monitoring, multivariate charts, kNN/GMM classifiers, time–frequency and sequence models
Detection capability	Detects correlated and coordinated multivariate shifts	Detects gradual drift, autocorrelation, and structural changes	Detects shifts under uncertainty; robust with small samples	Detects early equipment anomalies before wafer-level defects
Sensitivity to subtle changes	High for correlated small shifts; improved vs. univariate SPC	Very high for slow drift (EWMA/CUSUM); strong for temporal patterns	High when priors are informative; supports probabilistic thresholds	High, especially for transient and phase-specific anomalies
Strengths	Reduces false alarms under correlation; enables contribution-based diagnosis; handles high-dimensional data	Captures process evolution; supports early detection and predictive monitoring; integrates with APC	Handles sparse data; quantifies uncertainty; supports hierarchical modeling and risk-based decisions	Enables proactive monitoring, upstream detection, and integrates sensor physics with statistical learning
Limitations	Covariance estimation is unstable in very high dimensions; assumes baseline stability; often linear	May require stationarity or model updating; limited for nonlinear multivariate coupling unless extended	Computationally intensive; sensitive to prior choice; latency concerns in real-time use	Requires frequent model maintenance; sensitive to tool changes; high computational and data demands
Interpretability	Moderate–high (via contribution analysis and latent variables)	Moderate (depends on model; control charts intuitive, ARIMA less so)	High (probabilistic interpretation, uncertainty quantification)	Moderate (diagnostics possible but complex for ML/time-frequency models)
Adaptability to process change	Moderate (adaptive PCA, moving window methods)	High (state-space, change-point detection enables adaptation)	Very high (sequential updating and hierarchical pooling)	High but requires recalibration after maintenance or recipe change
Typical fab applications	CD/overlay monitoring, plasma/CMP sensor analysis, tool health indicators	Drift monitoring (CD, overlay, etch rate), R2R/APC integration, maintenance impact analysis	Early ramp processes, sparse metrology environments, tool matching, risk-based decisions	Real-time tool monitoring, anomaly detection, fault classification, predictive maintenance
Role in the control framework	Core monitoring layer for multivariate quality control	Dynamic monitoring and prediction layer	Decision-making and uncertainty quantification layer	Upstream monitoring and diagnostic layer integrated with SPC/APC
Key advantage (summary)	Captures correlation and reduces dimensionality for robust multivariate monitoring	Captures time evolution and enables early drift detection	Integrates prior knowledge with data for robust inference under uncertainty	Moves monitoring upstream to the equipment level for early fault prevention
Key references	[5,6,13,35,40,41]	[2,3,6,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60]	[6,61,62]	[6,17,68,87,92]
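
The claim in the MSPC column that joint monitoring detects correlated shifts missed by univariate charts can be made concrete with a two-variable Hotelling T² sketch. It assumes the in-control mean and covariance are known, in which case T² follows a chi-square distribution with 2 degrees of freedom and the control limit has the closed form -2 ln(alpha). With estimated parameters an F-based limit would normally be used instead. The function names, the correlation of 0.9, and the two test points are illustrative assumptions.

```python
import math


def hotelling_t2_2d(x, mean, cov):
    """T^2 = (x - mean)^T cov^{-1} (x - mean) for two variables."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def chi2_limit_2dof(alpha):
    """Upper control limit for T^2 with 2 variables and known parameters."""
    return -2.0 * math.log(alpha)


# Hypothetical standardized CD and overlay readings with correlation 0.9.
mean = (0.0, 0.0)
cov = ((1.0, 0.9), (0.9, 1.0))
ucl = chi2_limit_2dof(0.0027)

for point in [(2.0, 2.0), (2.0, -2.0)]:
    t2 = hotelling_t2_2d(point, mean, cov)
    print(point, "T2 =", round(t2, 2), "alarm" if t2 > ucl else "in control")
```

Both points lie within ±3 sigma on each variable separately, so neither would trigger a univariate Shewhart chart. Only the second point violates the correlation structure, and only that one produces a large T².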

Table 4. Emerging Trends and Challenges of Statistical Techniques in Semiconductor Manufacturing.

Trend/Challenge Area	What Is Changing (Technical Driver)	Statistical/Analytics Implication	Representative Methods/Approaches	Practical Implementation Challenges	Impact on Fab Operations	Representative Literature
Overall direction	Shrinking process windows, greater tool complexity, and massive data volumes demand faster, more reliable decision-making.	Shift from offline SPC to continuous, model-integrated, uncertainty-aware monitoring and control.	Streaming MSPC, risk-based decisioning, physics-informed ML, digital twins.	Scaling, interpretability, governance, and maintaining complex model systems.	Faster response and improved robustness, but a higher maintenance burden.	[87,106,113,114,115,116,117,118,119]
Real-time and streaming analysis	High-frequency sensors generate continuous wafer-level data streams.	Need online, low-latency, high-dimensional algorithms handling autocorrelation.	Streaming monitoring, online anomaly/change detection, and data stream mining.	Compute limits, noise robustness, and balancing speed vs. accuracy.	Earlier detection and intervention support real-time equipment monitoring.	[87,106,111,112,113]
Speed–stability tradeoff	Faster detection increases false alarms; conservative settings delay detection.	Monitoring becomes a tuning problem under nonstationary conditions.	Adaptive thresholds, incremental updates, robust statistics.	Requires operational policies, escalation rules, and alignment with workflows.	Affects tool utilization, downtime, and engineer trust.	[87,106,113]
Uncertainty quantification (UQ)	High wafer value requires decisions beyond simple alarms.	Systems provide probabilities or prediction intervals instead of binary outputs.	Bayesian inference, probabilistic thresholds, predictive distributions.	Communicating uncertainty, selecting thresholds, and ensuring calibration.	Better prioritization; avoids overreaction to noise; supports informed decisions.	[61,62]
Risk-based decision making	Operational decisions involve economic trade-offs (yield vs. downtime vs. cycle time).	Integrates statistical inference with cost/loss functions.	Bayesian decision frameworks, cost-aware policies.	Defining loss functions, aligning with organizational goals, and avoiding bias.	More consistent and economically aligned interventions.	[61,62]
Physics-based learning (hybrid models)	Pure data-driven models struggle under distribution shift or limited fault data.	Combine physics constraints with data-driven models for robustness.	Physics-informed ML, constraint-based learning.	Requires accurate physics knowledge and integration with real data.	Improved generalization, stability, and interpretability.	[117,118]
Digital twin architectures	Need synchronized digital representations of tools and processes.	Twins integrate monitoring, prediction, and control in a unified system.	Digital models, digital shadows, full digital twins.	Continuous calibration, data integration, and governance challenges.	Enables predictive maintenance, virtual metrology, and coordinated control.	[106,114]
Standards and interoperability	Cross-vendor systems require common architectures and interfaces.	Standardization supports integration, validation, and trust.	ISO 23247 digital twin framework, interoperable platforms.	Integrating legacy systems, ensuring traceability and reliability.	Faster deployment and improved system consistency.	[115,116]
Complexity and interpretability	Advanced models increase system complexity and reduce transparency.	Interpretability becomes essential for trust and usability.	Explainable AI, diagnostic tools, physics-based constraints.	Explaining decisions under changing conditions; avoiding black-box outputs.	Influences adoption, troubleshooting efficiency, and decision quality.	[87,106,113,117,118]
Model lifecycle and technical debt	Rapid process evolution makes models outdated; complexity accumulates hidden costs.	Requires structured model governance and lifecycle management.	Monitoring, retraining, validation, version control, and MLOps practices.	Managing dependencies, drift, ownership, and long-term maintenance.	Determines long-term effectiveness and prevents performance degradation.	[87,106,113,119]
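
The uncertainty quantification and risk-based decision rows of Table 4 can be sketched with a conjugate normal-normal update followed by an expected-loss comparison. This is a simplified illustration: it assumes a known measurement standard deviation, a normal prior on the tool's mean offset, and two hypothetical cost figures. The function names, the prior, the readings, and the costs are assumptions for illustration only.

```python
import math


def posterior_normal(prior_mean, prior_sd, ys, meas_sd):
    """Posterior mean and sd of a normal mean with known measurement sd."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(ys) / meas_sd ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + sum(ys) / meas_sd ** 2)
    return post_mean, post_var ** 0.5


def prob_exceeds(post_mean, post_sd, limit):
    """Posterior probability that the true mean offset is above the limit."""
    z = (limit - post_mean) / post_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def should_intervene(p_shift, cost_missed_shift, cost_false_alarm):
    """Intervene when the expected loss of waiting exceeds that of acting."""
    return p_shift * cost_missed_shift > (1.0 - p_shift) * cost_false_alarm


# Hypothetical sparse metrology: three overlay offsets (nm) after maintenance.
readings = [1.1, 0.7, 1.4]
m, s = posterior_normal(prior_mean=0.0, prior_sd=1.0, ys=readings, meas_sd=0.8)
p = prob_exceeds(m, s, limit=0.5)

print("posterior mean:", round(m, 3), "posterior sd:", round(s, 3))
print("P(offset > 0.5 nm):", round(p, 3))
print("intervene:", should_intervene(p, cost_missed_shift=50.0, cost_false_alarm=5.0))
```

The output is a probability rather than a binary alarm, and the action threshold follows from the stated costs instead of a fixed sigma multiple. In practice, the loss function and the prior would need to be agreed with process owners and checked for calibration.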

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the International Institute of Knowledge Innovation and Invention. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

Share and Cite

MDPI and ACS Style

Chen, H.-Y.; Chen, C. An Overview of the Application of Modern Statistical Techniques in Semiconductor Manufacturing. Appl. Syst. Innov. 2026, 9, 83. https://doi.org/10.3390/asi9040083


3.2. Hotelling T² Control Chart