1. Introduction
Semiconductor manufacturing, a cornerstone of the modern digital economy, is characterized by its exceptional technological complexity and capital intensity [1]. The fabrication of integrated circuits (ICs) entails hundreds of meticulously controlled steps within highly specialized cleanrooms, where precise regulation of environmental parameters—including temperature, particle concentration, air pressure, vibration, and humidity—is paramount for achieving high yields and product reliability [2]. Among these, humidity control is particularly critical due to the pervasive and multifaceted interactions of water molecules with semiconductor materials, chemicals, and process equipment [3].
Even trace amounts of moisture can induce significant detrimental effects throughout the manufacturing process. During front-end-of-line (FEOL) stages, adsorbed moisture can accelerate native oxide growth on silicon wafers, compromising surface integrity and disrupting subsequent epitaxial or gate oxide formation [4]. In photolithography, humidity fluctuations alter photoresist properties, leading to critical dimension variations and patterning errors [5]. During etching and deposition, moisture acts as an unintended contaminant, resulting in incomplete reactions, poor film coverage, and incorporated impurities [6]. In back-end-of-line (BEOL) processes, moisture, especially in the presence of ionic contaminants and electrical bias, accelerates electrochemical migration and corrosion in metallic interconnects, causing dendrite formation, voids, and circuit failures that severely impact device lifetime and reliability [7,8]. The economic stakes are substantial: a single humidity excursion can lead to the loss of entire wafer batches, each valued at tens to hundreds of thousands of dollars, alongside costly production downtime [9].
Conventional humidity monitoring in fabrication facilities relies on networks of physical sensors (e.g., capacitive, resistive) deployed at discrete locations [10]. While essential, this approach suffers from inherent limitations: limited spatial resolution; a temporal disconnect between a humidity event and its observable impact on product quality; and an indirect, probabilistic correlation between sensor readings and wafer yield [11]. These challenges underscore the need for a paradigm shift from merely monitoring ambient conditions to establishing a direct, near-real-time link between environmental factors and product outcomes.
Artificial Intelligence (AI) offers transformative potential in this regard. While prior research has extensively applied machine learning (ML)—particularly deep learning-based computer vision—to automated wafer defect inspection and classification [12,13], the focus has largely remained on post facto quality control. A promising yet underexplored direction is leveraging AI not just to classify defects, but to use product-related data (e.g., defect labels coupled with in situ process sensor logs) as a high-resolution, distributed sensor network for proactive environmental diagnostics. Specifically, if AI models can learn the subtle signatures of humidity-induced anomalies from multi-source process and sensor data, they could enable rapid, indirect detection of humidity issues, effectively using the process outcome and its associated data as a sensitive environmental probe [14].
However, most existing AI models for semiconductor monitoring are static, trained on historical data and deployed with fixed parameters. This poses a significant limitation in dynamic manufacturing environments where operational conditions, process recipes, and tool states continuously evolve, causing data distributions to shift over time. A model that cannot adapt may experience performance degradation, failing to detect novel or gradually emerging humidity-related patterns.
To address this gap, this paper proposes an AI-driven framework centered on Lifelong Learning for rapid humidity detection in semiconductor manufacturing [15]. Our core premise is that by integrating an adaptive, continuously learning mechanism, an AI system can maintain high detection accuracy over extended periods, adapting to new data batches and evolving environmental patterns without requiring complete retraining. The proposed framework moves beyond static data analysis: it employs a Lifelong Boosting Learning strategy, which combines ensemble methods with a knowledge retention and reuse mechanism to continuously adapt to streaming sensor data. This allows the model to incrementally learn from sequential data arrivals, preserving valuable historical knowledge while efficiently incorporating new information about humidity-related process deviations.
Based on this assessment of the current research landscape, the following research questions are proposed.
How can artificial intelligence leverage product-related data to establish a direct, near-real-time link between environmental factors and product outcomes for proactive environmental diagnostics?
How can an AI system continuously adapt to evolving operational conditions and data distribution shifts in dynamic manufacturing environments without experiencing performance degradation or requiring complete retraining?
The main contributions of this study are as follows.
A novel “environmental probe” paradigm is proposed, wherein heterogeneous process and wafer defect data are leveraged to establish a direct, near-real-time diagnostic link between ambient humidity and product quality outcomes.
The Lifelong Boosting Learning (Boost) framework is developed, which integrates an adaptive ensemble mechanism with time-weighted knowledge retention to effectively mitigate catastrophic forgetting and adapt to non-stationary manufacturing patterns.
A robust evaluation methodology for environmental diagnostics under extreme class imbalance is established; it is demonstrated through macro-averaged metrics that the proposed system achieves superior detection sensitivity even when subject to stochastic sensor perturbations.
The remainder of this paper is organized as follows: Section 2 presents a comprehensive review of related works in environmental monitoring, wafer defect classification, and smart manufacturing. Section 3 describes the proposed Boost framework, including the problem definition and algorithm design. Section 4 details the experimental evaluation, baseline comparisons, and result analysis. Section 5 provides a discussion of the findings, implications, and current limitations. Finally, Section 6 concludes the paper and outlines future research directions.
2. Related Works
In this section, we provide a comprehensive review of research streams closely related to the AI-driven framework proposed in this study.
The pursuit of robust humidity control and its intricate relationship with product quality in semiconductor manufacturing has spawned extensive research across multiple disciplines, including materials science, process engineering, environmental control, and, more recently, data science. This section reviews the foundational and contemporary work in three interconnected domains: (1) the impact of humidity and established monitoring techniques in fabs, (2) the evolution of wafer defect inspection and classification methodologies, and (3) the proliferation of AI and data-driven approaches in smart manufacturing. Critically, we examine the current state of the art at the intersection of these domains, identifying a conspicuous gap that the present study aims to address—specifically, the lack of an adaptive, continuously learning system that can perform real-time humidity diagnostics from product-inspection data.
The criticality of atmospheric moisture control in IC production has been understood since the industry’s early days. Foundational research meticulously documented the mechanisms of humidity-induced failure. Studies on surface chemistry elucidated how water molecules physisorb and chemisorb on silicon and other semiconductor surfaces, leading to uncontrolled oxide growth and surface state changes that degrade device performance [16]. In the realm of metallization, the electrochemistry of corrosion processes—particularly for aluminum and, later, copper interconnects—was detailed, establishing models for failure acceleration under conditions of elevated humidity and bias [17]. Furthermore, process-specific investigations revealed how humidity affects photoresist kinetics, the stability of etching plasmas, and the nucleation kinetics in chemical and physical vapor deposition chambers [18]. This body of work firmly established the quantitative relationships between Relative Humidity (RH) levels, often in the parts-per-million (ppm) range for specific process steps, and the propensity for various defect modes.
To manage this risk, the industry adopted a multi-faceted engineering approach centered on physical sensor networks. The cleanroom environmental control system (ECS) is typically governed by a vast array of hygrometers. Capacitive polymer sensors are most common for general ambient monitoring due to their good balance of accuracy, response time, and cost [19]. For more demanding applications, such as inside lithography scanners or deposition tools, chilled mirror hygrometers, which provide a fundamental dew-point measurement, are often employed for their high precision. These sensors feed data into a supervisory control and data acquisition (SCADA) or distributed control system (DCS), which orchestrates heating, ventilation, and air conditioning (HVAC) units, desiccant dryers, and humidifiers to maintain setpoints.
However, the limitations of this traditional paradigm are well-recognized in both academic and industrial literature. The challenge of spatial resolution is frequently cited; a sensor provides a point measurement that may not represent conditions a few meters away, especially near equipment with local heat loads or chemical emissions. The concept of “micro-environments” within the broader cleanroom is a key area of study. Researchers have used computational fluid dynamics (CFD) modeling to predict airflow and humidity gradients, and some have proposed dense grids of low-cost wireless sensor nodes to empirically map these variations [20]. Another significant limitation is the latency between a humidity excursion and its confirmation via product metrology. This delay hinders rapid corrective action and complicates root cause analysis, as engineers must retrospectively correlate sensor logs with electrical test results from wafers processed days or weeks earlier [21]. While advanced sensors and tighter control loops have improved, the fundamental disconnection between the environmental parameter (humidity) and the quality metric (wafer defects) remains a persistent challenge.
Parallel to environmental control, the field of wafer defect inspection has undergone its own revolution, evolving from manual microscope examination to fully automated, high-speed image acquisition and analysis. Automated Optical Inspection (AOI) and Scanning Electron Microscope (SEM) review tools are now ubiquitous in fabs. These systems generate terabytes of image data daily, depicting the patterned wafer surface at high resolution. The initial task of defect detection—finding anomalous pixels against a complex background—is largely solved through sophisticated image subtraction techniques, comparing a die image to a reference die or a golden template [22].
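As a minimal illustration of the die-to-reference comparison described above, the following sketch flags pixels that deviate from a golden template beyond a threshold. Real AOI pipelines add sub-pixel alignment, illumination normalization, and noise filtering; the function name and threshold here are illustrative.

```python
import numpy as np

def detect_defects(die, golden, threshold=0.2):
    """Return a boolean defect mask: pixels whose absolute deviation
    from the golden template exceeds the threshold."""
    diff = np.abs(die.astype(float) - golden.astype(float))
    return diff > threshold

# Synthetic example: a clean 8x8 die image with one anomalous pixel
golden = np.zeros((8, 8))
die = golden.copy()
die[3, 4] = 1.0  # simulated particle defect
mask = detect_defects(die, golden)
print(mask.sum())  # -> 1
```

The resulting mask is what downstream classifiers consume once connected anomalous regions are grouped into defect candidates.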
The subsequent and more complex task is defect classification: categorizing each detected anomaly into a predefined type (e.g., particle, scratch, pattern bridge, residue, pit). The classification is crucial for determining the defect’s root cause and initiating the correct corrective action. Historically, this was accomplished using rule-based algorithms and handcrafted feature descriptors. Engineers would define thresholds for geometric features (size, aspect ratio, perimeter), texture, and intensity profiles [23]. While effective for simple, high-contrast defects, these methods struggled with subtle or complex defect morphologies and required constant tuning as new process layers were introduced.
The advent of ML, and later deep learning, fundamentally transformed this field. Early ML approaches utilized classical algorithms like Support Vector Machines (SVMs), decision trees, and K-Nearest Neighbors (KNN), but they still relied on human experts to define and extract the relevant feature vectors from defect images [24]. The breakthrough came with the application of deep Convolutional Neural Networks (CNNs). CNNs automate the feature extraction process, learning hierarchical representations of visual patterns directly from raw pixel data. Models based on architectures like ResNet, VGG, and Inception, often pre-trained on large natural image datasets and fine-tuned on proprietary wafer image datasets, have demonstrated near-perfect classification accuracies in research settings [25]. These models can distinguish between dozens of defect types with a consistency and speed unattainable by human inspectors or older algorithms [26].
This powerful capability has turned defect classification from a purely quality control activity into a rich source of data for fab intelligence. Modern systems not only classify defects but also cluster them, track their spatial signatures (e.g., wafer-edge rings, random distributions, reticle fingerprints), and correlate them with process tool and chamber histories [27]. The defect map has become a diagnostic language. However, the translation of this language to pinpoint specific environmental root causes, like a precise humidity condition, remains largely interpretive and reliant on the experience of process integration engineers. The link is associative and manual, not yet encoded into the AI models themselves.
The application of AI in semiconductor manufacturing extends far beyond defect inspection, forming the core of the Industry 4.0 or “Smart Manufacturing” transformation. This movement seeks to leverage the massive volumes of data generated by equipment sensors (equipment engineering systems, or EES), process state logs, metrology tools, and testers to optimize every aspect of production [28].
A prominent application is predictive maintenance (PdM), where ML models analyze multivariate time-series data from pumps, robots, and plasma sources to predict impending failures before they cause unscheduled downtime. Virtual metrology (VM) uses sensor data from a process tool to predict the outcome of a physical measurement (like film thickness or critical dimension), reducing the need for slow, destructive sampling [29]. Run-to-run (R2R) control employs algorithms to adjust recipe parameters for each subsequent wafer lot based on measured results from the previous lot, compensating for process drift. Yield prediction models attempt to forecast the final test yield of a wafer or lot based on early metrology and inspection data, enabling early disposition of potentially bad material [30].
A subset of this research focuses specifically on the impact of the fab environment on yield and quality. Studies have applied statistical methods and basic ML models (like linear regression or random forests) to find correlations between aggregated environmental data—average temperature, humidity, particle counts—and final yield metrics. Some have explored more sophisticated time-series analysis to link environmental excursions recorded by the facility system with yield dips observed later in the production timeline [31]. These studies affirm that environmental factors are significant variables in yield models.
Despite the mature state of research in the three domains outlined above, a critical synthesis is missing. On one side, we have highly advanced AI models that excel at classifying wafer defects from images. On the other, we have a well-understood physical problem: humidity causes specific, identifiable types of defects. Yet, the current state of the art keeps these threads separate. Environmental monitoring systems produce humidity data streams. Defect inspection systems produce classified defect maps and images. Engineers and yield analysis software attempt to correlate them post facto.
The gap is the absence of an AI framework that directly internalizes the physics of humidity-induced failure by learning to recognize its visual signature as a primary classification objective. Existing defect classifiers may identify “corrosion” or “haze,” but they do not explicitly attribute it to an environmental root cause during the classification act. Furthermore, the potential to use the spatial and temporal distribution of these classified defects as a high-resolution, product-integrated sensor network for humidity is largely unexplored. No published work, to our knowledge, has proposed and demonstrated an end-to-end AI model trained specifically to detect and diagnose humidity issues by analyzing wafer defect imagery as the primary input signal, thereby creating a closed-loop link between product quality and environmental health.
Importantly, while AI methods for defect classification and environmental correlation have advanced, most existing models are static—trained on fixed datasets and deployed without ongoing adaptation. Semiconductor manufacturing, however, is a dynamic environment where process conditions, tool states, and ambient parameters continuously evolve. This leads to data distribution shifts over time, which can degrade the performance of static models. There is a pressing need for AI systems that can learn continuously, adapting to new patterns while retaining previously acquired knowledge—a capability known as lifelong or continual learning. Yet, the integration of lifelong learning mechanisms into environmental diagnostics for semiconductor manufacturing remains notably absent in the literature.
This paper positions itself directly within this gap. Our work is distinct from (a) pure environmental monitoring research, as we use the product (wafer) as the sensor; (b) pure defect classification research, as we specifically curate and model a subset of defects tied to a single environmental factor; and (c) broad yield–environment correlation studies, as we propose a real-time, image-based diagnostic tool rather than a retrospective statistical model. Furthermore, we introduce a lifelong boosting learning framework that enables continuous adaptation and knowledge retention, addressing the dynamic nature of semiconductor production environments. By bridging these fields, we aim to demonstrate a novel paradigm for intelligent environmental assurance in semiconductor manufacturing.
3. Lifelong Boosting Learning for Environment Humidity Detection
This section provides a detailed elaboration of the proposed Boost framework, which is designed to enable rapid and adaptive detection of humidity anomalies in semiconductor manufacturing environments by continuously learning from sequential data batches.
3.1. Motivation
Humidity control is critical in semiconductor manufacturing, as slight deviations can cause electrostatic discharge, material degradation, lithography defects, and yield loss. Although modern facilities deploy dense physical sensor networks, most existing detection systems rely on fixed thresholds or static ML models, which face significant limitations in dynamic production settings. Environmental humidity is highly non-stationary due to fluctuations in production load, equipment states, airflow, and seasonal variations, leading to performance degradation in models trained offline. Rapid detection is particularly challenging, as humidity anomalies can emerge and propagate quickly, while conventional approaches lack the agility to adapt in real time. Furthermore, long-term knowledge reuse is seldom implemented; recurring humidity patterns across production batches are rarely systematically captured or leveraged. To address these gaps, we propose the Boost framework, which integrates adaptive ensemble learning with incremental knowledge retention and reuse. This design enables fast, robust, and continuously improving humidity detection by learning from data streams over time, effectively handling the evolving nature of semiconductor production environments [32,33].
3.2. Problem Definition
In semiconductor fabrication, environmental humidity must be maintained within a strict operational range to prevent defects. Let $h_t$ denote the observed humidity level at time $t$. The safe operational range is defined as $[h_{\min}, h_{\max}]$, where $h_{\min}$ and $h_{\max}$ are process-dependent thresholds. A humidity anomaly is declared if $h_t$ deviates beyond this range, i.e., $h_t < h_{\min}$ or $h_t > h_{\max}$.

Given a multi-dimensional feature vector $\mathbf{x}_t$ extracted from process sensors and contextual data at time $t$, the goal is to learn a detection function $f: \mathbf{x}_t \mapsto y_t \in \{0, 1\}$ that predicts whether the corresponding humidity level is abnormal ($y_t = 1$) or normal ($y_t = 0$).

The problem is formulated as a binary classification task over sequential data batches $\mathcal{D}_1, \mathcal{D}_2, \ldots$, where each batch $\mathcal{D}_b$ corresponds to a production period. The model must adapt to potential distribution shifts across batches while retaining useful knowledge from previous batches, without catastrophic forgetting.
To quantify detection performance, we define a weighted loss function that accounts for both false alarms and missed detections. Let $\ell(\cdot, \cdot)$ be a differentiable loss (e.g., binary cross-entropy). The objective for batch $\mathcal{D}_b$ is to minimize the empirical risk, as defined in Equation (1):

$$\min_{f} \; \frac{1}{|\mathcal{D}_b|} \sum_{(\mathbf{x}_i,\, y_i) \in \mathcal{D}_b} w_i \,\ell\big(f(\mathbf{x}_i),\, y_i\big) \;+\; \lambda\, \Omega(f), \tag{1}$$

where $w_i$ is a sample weight that can be adjusted to emphasize hard or critical examples, $\lambda$ controls the regularization strength, and $\Omega(f)$ is a knowledge-driven regularization term. To encourage knowledge reuse and mitigate catastrophic forgetting, we define $\Omega(f)$ as the functional distance between the current candidate model $f$ and the ensemble of previously stored experts $\{F_1, \ldots, F_{t-1}\}$, as presented in Equation (2):

$$\Omega(f) = d\big(f,\, \{F_1, \ldots, F_{t-1}\}\big), \tag{2}$$

where $d(\cdot, \cdot)$ can be implemented as the Kullback–Leibler (KL) divergence or a parameter-based Euclidean distance. This constraint ensures that the updated model incorporates new humidity patterns while remaining consistent with previously acquired environmental knowledge.
An activation (alert) function $A(\cdot)$ is defined to map the model’s continuous output (e.g., probability of anomaly) to a discrete alert decision. Let $p_t = P(y_t = 1 \mid \mathbf{x}_t)$ be the predicted anomaly probability. The alert is triggered according to Equation (3):

$$A(\mathbf{x}_t) = \mathbb{I}\big[\, p_t \geq \tau \,\big], \tag{3}$$

where $\tau$ is a sensitivity threshold that can be tuned to balance precision and recall, and $\mathbb{I}[\cdot]$ is the indicator function. In practice, $\tau$ can be adapted online based on the observed false alarm rate or operational requirements.
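To make Equations (1)–(3) concrete, the sketch below instantiates the loss with binary cross-entropy and the functional distance with a parameter-based Euclidean norm, the two example choices named above. The function names and the stand-in parameter vectors are illustrative, not the authors' exact implementation.

```python
import numpy as np

def empirical_risk(probs, labels, sample_weights, lam, theta, theta_old):
    """Equations (1)-(2): weighted binary cross-entropy plus a
    knowledge-retention penalty. `theta_old` stands in for the stored
    experts; the squared Euclidean parameter distance is one admissible
    choice for the functional distance d(., .)."""
    eps = 1e-12
    bce = -(labels * np.log(probs + eps)
            + (1 - labels) * np.log(1 - probs + eps))
    omega = np.sum((theta - theta_old) ** 2)  # regularizer Omega(f)
    return np.mean(sample_weights * bce) + lam * omega

def alert(prob, tau=0.5):
    """Equation (3): indicator that the anomaly probability reaches tau."""
    return int(prob >= tau)

print(alert(0.7, tau=0.5))  # -> 1
print(alert(0.3, tau=0.5))  # -> 0
```

Raising `lam` pulls the candidate model toward the stored experts, trading plasticity for retention, which is exactly the tension Equation (2) is meant to control.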
3.3. Lifelong Boosting Model
The proposed lifelong learning framework is designed to handle environmental data in semiconductor manufacturing. Unlike traditional methods that either retrain the model from scratch or selectively reuse weak learners, our framework adopts a “time-weighted ensemble of ensembles” strategy. This method generates a dedicated strong classifier for each incoming data batch and dynamically integrates them into a global model through a time decay mechanism, ensuring that the system prioritizes the latest data patterns while retaining historical knowledge to prevent catastrophic forgetting.
The architecture of this framework is shown in Figure 1. Its workflow is carried out in a linearly cumulative manner over time steps ($t = 1, 2, \ldots$):
1. Data Input: The system receives sensor data streams in discrete batches ($\mathcal{D}_1, \mathcal{D}_2, \ldots$).
2. Batch-Specific Model Training: For each batch $\mathcal{D}_t$, a robust ensemble model $F_t$ is trained. This step builds a powerful classifier by aggregating multiple weak learners (such as decision trees), focusing on the difficult samples in that specific batch.
3. Model Storage: Rather than discarding historical models, each trained batch model is serialized and stored in the historical knowledge base. This forms an expert sequence, with each expert specializing in the environmental distribution of its corresponding time period.
4. Model Fusion: The core of the lifelong learning mechanism is the fusion layer. The framework aggregates the decisions of all stored models ($F_1, \ldots, F_t$) to construct the final global model $G_t$.
5. Real-Time Detection: The fused global model is used for real-time monitoring, and anomaly alerts are issued based on the consensus of historical and current experts.
Let $\mathcal{D}_t$ represent the data batch arriving at time $t$. For this batch, we train a powerful classifier $F_t$, which aggregates $T$ weak learners with weights $\alpha_j$ using a boosting algorithm, expressed in Equation (4):

$$F_t(\mathbf{x}) = \operatorname{sign}\!\left(\sum_{j=1}^{T} \alpha_j\, h_j(\mathbf{x})\right), \tag{4}$$

where $h_j$ represents the $j$-th weak learner trained on $\mathcal{D}_t$, and $\alpha_j$ is its corresponding weight based on classification error.
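As an illustrative sketch of the per-batch boosting step in Equation (4), the following NumPy implementation uses threshold decision stumps as weak learners. The stump learner, function names, and default round count are assumptions for illustration, not the authors' exact configuration.

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the weighted-error-minimizing threshold stump; y in {-1, +1}."""
    best = None
    for j in range(X.shape[1]):                 # candidate feature
        for thr in np.unique(X[:, j]):          # candidate threshold
            for sign in (1, -1):                # candidate polarity
                pred = sign * np.where(X[:, j] >= thr, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def train_batch_model(X, y, rounds=10):
    """Equation (4): aggregate T weak learners with error-based weights."""
    w = np.full(len(y), 1.0 / len(y))           # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # learner weight alpha_j
        pred = sign * np.where(X[:, j] >= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)          # up-weight misclassified
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict_batch_model(ensemble, X):
    """F_t(x): sign of the alpha-weighted sum of weak-learner votes."""
    score = sum(a * s * np.where(X[:, j] >= t, 1, -1)
                for a, j, t, s in ensemble)
    return np.sign(score)
```

A model trained this way on one batch plays the role of a single stored expert $F_t$ in the repository.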
To construct the global lifelong model $G_t$ at the current time $t$, we fuse all historical models $F_k$ (where $1 \leq k \leq t$). We introduce a Time-Decay Weighting Function $w_k$ to address concept drift. The hypothesis is that recent models are more reflective of the current manufacturing environment than older models. The weighting function is defined as shown in Equation (5):

$$w_k = e^{-\gamma\,(t - k)}, \qquad 1 \leq k \leq t, \tag{5}$$

where $\gamma > 0$ is the forgetting factor. When $k = t$ (current batch), the weight is $w_t = 1$ (maximum influence). As the time gap $t - k$ increases, the influence of older models decays exponentially.
The final global decision function is given by the weighted majority vote of all batch models, computed via Equation (6):

$$G_t(\mathbf{x}) = \operatorname{sign}\!\left(\sum_{k=1}^{t} w_k\, F_k(\mathbf{x})\right). \tag{6}$$

This form enables the framework to adjust continuously: new patterns are captured by the most recent model $F_t$, which carries the highest weight, while recurring historical anomalies are identified by the series of models $F_1$ to $F_{t-1}$, thereby forming a stable and reliable lifelong learning detection mechanism.
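The time-decay fusion of Equations (5) and (6) can be sketched as follows; the stand-in experts and the value of $\gamma$ are illustrative only.

```python
import numpy as np

def fuse(experts, X, gamma=0.5):
    """Equations (5)-(6): exponentially decayed weighted majority vote.

    `experts` is the chronologically ordered repository [F_1, ..., F_t];
    each expert maps a feature matrix to predictions in {-1, +1}.
    """
    t = len(experts)
    votes = np.zeros(len(X))
    for k, expert in enumerate(experts, start=1):
        w_k = np.exp(-gamma * (t - k))  # Equation (5); w_t = 1
        votes += w_k * expert(X)
    return np.sign(votes)               # Equation (6)

# Illustration with stand-in experts: an older expert that always flags
# anomalies and a current expert that never does.
old = lambda X: np.ones(len(X))
new = lambda X: -np.ones(len(X))
print(fuse([old, new], np.zeros((3, 1)), gamma=1.0))  # recent expert dominates
```

With $\gamma = 1$ the older expert's vote is scaled by $e^{-1} \approx 0.37$, so the current expert's decision prevails, matching the framework's bias toward the latest environmental distribution.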
3.4. Algorithm Design
The implementation details of the lifelong boosting learning framework are elaborated in Algorithm 1. For each newly received data batch, the algorithm performs a Training-Storage-Fusion-Detection cycle to ensure continuous adaptation to the semiconductor manufacturing environment.
Algorithm 1: Lifelong Boosting Learning for Environment Humidity Detection

Require: New data batch $\mathcal{D}_t$ at time $t$; historical model repository $\mathcal{M}$; decay factor $\gamma$; boosting rounds $T$; alert threshold $\tau$.
Ensure: Anomaly alerts for $\mathcal{D}_t$; updated repository $\mathcal{M}$.

1: Step 1: Data Input
2: Input sensor data stream $\mathcal{D}_t$.
3: Step 2: Batch Model Training
4: Initialize sample weights $w_i = 1/N$ for the $N$ training samples in $\mathcal{D}_t$.
5: Initialize batch model ensemble $F_t \leftarrow \emptyset$.
6: for $j = 1$ to $T$ do
7:   Train weak learner $h_j$ on $\mathcal{D}_t$ using weights $w_i$.
8:   Calculate error $\varepsilon_j$ and learner weight $\alpha_j$.
9:   Update sample weights (increase weight for misclassified samples).
10:  Add $h_j$ to the ensemble: $F_t \leftarrow F_t \cup \{(\alpha_j, h_j)\}$.
11: end for
12: Formulate the batch strong classifier: $F_t(\mathbf{x}) = \operatorname{sign}\big(\sum_{j=1}^{T} \alpha_j h_j(\mathbf{x})\big)$.
13: Step 3: Model Storage
14: Store the newly trained model: $\mathcal{M} \leftarrow \mathcal{M} \cup \{F_t\}$.
15: Step 4: Global Model Fusion
16: for $k = 1$ to $t$ do
17:   Calculate time weight: $w_k = e^{-\gamma (t - k)}$.
18: end for
19: Construct the global model using Equation (6): $G_t(\mathbf{x}) = \operatorname{sign}\big(\sum_{k=1}^{t} w_k F_k(\mathbf{x})\big)$.
20: Step 5: Inference Calculation
21: Apply the global model $G_t$ to the testing stream of $\mathcal{D}_t$.
22: Compute the anomaly probability score $p_t$ for current sensor readings.
23: Step 6: Real-Time Semiconductor Environment Detection
24: return Anomaly alerts for $\mathcal{D}_t$ (alert triggered when $p_t \geq \tau$) and the updated repository $\mathcal{M}$.
The computational complexity of Algorithm 1 consists of a constant training cost and a linearly growing inference cost. Specifically, training a new batch model (Step 2) depends only on the boosting rounds $T$ and the batch size $N$ (typically on the order of $O(T \cdot N \log N)$ for decision-tree weak learners), ensuring efficient updates independent of the accumulated history. In contrast, the Global Model Fusion and Inference phases (Steps 4–5) require aggregating predictions from all $t$ historical models stored in $\mathcal{M}$, leading to a time and space complexity of $O(t)$. This linear growth reflects the trade-off between preventing catastrophic forgetting and resource consumption, although the inference latency can be effectively mitigated by parallelizing the evaluation of the independent historical estimators.
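To illustrate the parallelization remark, one possible sketch fans the $O(t)$ fusion pass out over a thread pool. The function name and the expert interface (a scikit-learn-style `predict` returning labels in {0, 1}) are assumptions for illustration.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fused_predict_parallel(experts, weights, X):
    """Query each stored historical expert on its own worker thread,
    then apply the decayed weighted vote of Equation (6).

    `experts` are any objects exposing predict(X) -> labels in {0, 1};
    `weights` are the corresponding time-decay weights w_k.
    """
    with ThreadPoolExecutor() as pool:
        preds = list(pool.map(lambda e: e.predict(X), experts))
    # Map {0, 1} labels to {-1, +1} votes before weighting.
    votes = sum(w * (2 * p - 1) for w, p in zip(weights, preds))
    return (votes >= 0).astype(int)
```

Because each expert is independent and frozen once stored, this evaluation is embarrassingly parallel, so wall-clock inference latency need not grow linearly with the repository size even though total work does.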
5. Discussion
This study presents the Lifelong Boosting Learning framework for monitoring ambient humidity in semiconductor manufacturing through the analysis of process sensor data correlated with wafer defects. Experimental results demonstrate that Boost delivers superior classification performance, validating the efficacy of the proposed method in capturing subtle patterns associated with humidity-induced anomalies. This finding is significant as it provides an indirect, data-driven method for humidity surveillance and directly links process parameters to environmental conditions, facilitating a paradigm shift from passive environmental monitoring to active, product-informed diagnostics.
The strength of Boost lies in its innovative integration of lifelong learning principles with boosting ensemble methods. By using process and defect data as a distributed “sensor,” the framework overcomes spatial resolution limitations and temporal delays inherent in traditional point-based hygrometers. The lifelong learning mechanism allows Boost to continuously adapt to data distribution shifts during production, enabling systematic knowledge accumulation and reuse—a crucial capability for long-term, multi-batch semiconductor manufacturing.
However, several limitations should be acknowledged. First, while the Boost framework is mathematically designed to handle sequential data batches and continuous adaptation, our current experimental validation was conducted on a static dataset. This initial evaluation primarily served to establish baseline classification metrics and validate the theoretical mechanics of the ensemble structure; a true validation of its continual learning capabilities requires sequential, time-series data streams that exhibit concept drift. Second, the model’s performance relies heavily on the quality of labeled humidity-related defect data, and in practice, defects often result from multiple interacting environmental and process factors. Testing the framework on dynamic data streams from an operational environment is therefore a primary focus of our future work, alongside extending the framework to multi-parameter diagnostics.
Future research directions include: (1) incorporating semi-supervised or self-supervised learning to reduce dependency on large labeled datasets; (2) combining physics-based models with data-driven approaches to enhance interpretability and generalization; and (3) extending Boost to monitor other critical environmental parameters (e.g., temperature, airborne particles), thereby constructing a multi-dimensional environmental sensing system for semiconductor manufacturing.