Article

Advancing YOLOv8-Based Wafer Notch-Angle Detection Using Oriented Bounding Boxes, Hyperparameter Tuning, Architecture Refinement, and Transfer Learning

Department of Mechanical Convergence Engineering, Hanyang University, Seoul 04763, Republic of Korea
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(21), 11507; https://doi.org/10.3390/app152111507
Submission received: 2 October 2025 / Revised: 21 October 2025 / Accepted: 24 October 2025 / Published: 28 October 2025

Abstract

Accurate angular alignment of wafers is essential in ion implantation to prevent channeling effects that degrade device performance. This study proposes a real-time notch-angle-detection system based on you only look once version 8 with oriented bounding boxes (YOLOv8-OBB). The proposed method compares YOLOv8 and YOLOv8-OBB, demonstrating the superiority of the latter in accurately capturing rotational features. To enhance detection performance, hyperparameters—including initial learning rate (Lr0), weight decay, and optimizer—are optimized using a one-factor-at-a-time (OFAT) approach followed by grid search. Architectural improvements, including spatial pyramid pooling fast with large selective kernel attention (SPPF_LSKA), a bidirectional feature pyramid network (BiFPN), and a high-resolution detection head (P2 head), are incorporated to improve small-object detection. Furthermore, a gradual unfreezing strategy is employed to support more effective and stable transfer learning. The final system is evaluated over 100 training epochs and tracked up to 5000 epochs to verify long-term stability. Compared to baseline models, it achieves higher accuracy and robustness in angle-sensitive scenarios, offering a reliable and scalable solution for high-precision wafer-notch detection in semiconductor manufacturing.

1. Introduction

Ion implantation modifies the electrical properties of wafers by introducing dopants at precisely controlled angles of a wafer notch. Incorrect implantation angles can lead to a channeling effect, where ions travel deeper along the crystallographic directions owing to reduced atomic collisions, thereby resulting in an uneven dopant distribution and degraded device performance. Therefore, accurate angular control is essential, and a critical prerequisite is the reliable detection of the wafer notch, which determines the wafer orientation. If wafer orientation can be detected in real time during the ion implantation process, misaligned or defective wafers can be identified immediately rather than after processing, enabling early correction of implantation errors and improving overall process yield [1,2,3,4].
The application of machine learning has rapidly expanded across sectors, reflecting its versatility and effectiveness in problem-solving [5,6,7,8]. Its integration with real-time monitoring and green technologies has drawn growing interest, enabling applications in intelligent food packaging, smart farming for plant disease detection, CO2 tracking in refineries, and fuel-efficient marine vessel design [9,10,11,12]. Recent surveys on object detection have summarized the evolution of detection algorithms from traditional handcrafted-feature approaches to deep-learning–based frameworks such as region-based convolutional neural networks (R-CNN), Faster R-CNN, the single shot multibox detector (SSD), and you only look once (YOLO) [13]. These reviews further highlight the emergence of anchor-free architectures and transformer-based models, along with lightweight networks such as MobileNet and EfficientNet that enable real-time deployment on edge devices. Collectively, these advancements demonstrate the continuous effort to improve detection accuracy, computational efficiency, and generalization across complex visual environments—an evolution that forms the technical foundation for advanced vision-based automation across precision manufacturing sectors. This trend underscores the importance of real-time, sustainability-driven technologies in high-precision industries such as semiconductor fabrication. Among these, ion implantation is particularly sensitive to wafer orientation and angular alignment. Recent studies on industrial pre-alignment systems have emphasized the critical role of vision-based precision alignment in various manufacturing contexts, including wafer dicing, robotic assembly, and micro-electromechanical systems fabrication [14,15,16].
To address this, real-time notch recognition systems have been introduced to detect and correct angular deviations during processing, thereby reducing early-stage defects and improving process stability. Misalignment remains a frequent issue, leading to wafer scrap, material waste, and higher costs, while also increasing the risk of channeling-induced defects that degrade device performance. Therefore, developing robust, real-time detection algorithms is essential for improving yield and advancing sustainable semiconductor manufacturing.
To improve wafer-orientation detection, real-time notch-angle detection has been proposed to enable the immediate correction of angular deviations during implantation, thereby preventing defects before they occur and significantly improving process reliability. Wang et al. [17] enhanced detection accuracy by integrating image-enhancement techniques, such as the real enhanced super-resolution generative adversarial network (Real-ESRGAN) and contrast limited adaptive histogram equalization, with the YOLOv4 object-detection model. To date, however, only horizontal bounding box (HBB) models have been studied for wafer-notch detection.
However, no previous studies have applied oriented bounding box (OBB) models to wafer-notch detection in combination with hyperparameter tuning—such as initial learning rate (Lr0), weight decay, and optimizer selection—architectural enhancements including spatial pyramid pooling fast with large selective kernel attention (SPPF_LSKA), a bidirectional feature pyramid network (BiFPN), and a high-resolution detection head (P2 head), as well as gradual unfreezing-based transfer learning. Given their orientation-dependent features, wafer notches require methods capable of capturing angular variations. Prior studies have shown that OBB offers more precise localization than HBB, making it suitable for angle-sensitive tasks such as notch detection [18,19,20]. In addition, the YOLO family, with its favorable balance of accuracy and inference speed, has proven effective for real-time applications and is therefore adopted as the core framework in this study [21,22,23,24].
Building on the proven effectiveness of YOLO for real-time detection, recent studies have shown that hyperparameter optimization during training can significantly boost deep-learning performance [25,26,27,28]. Architectural enhancements—such as SPPF_LSKA, BiFPN, and a P2 head—have also been found effective for small-object detection tasks like wafer-notch localization, which involve fine-grained angular features [29,30,31]. Additionally, gradual unfreezing transfer learning, where model layers are fine-tuned sequentially, has improved detection performance in domains such as medical imaging, quality assurance, and deepfake detection [32,33,34]. However, these strategies—hyperparameter optimization, architectural improvements, and gradual unfreezing—have not yet been applied to real-time wafer-notch detection.
This study aims to enhance real-time wafer-notch detection by establishing the superiority of OBB models over conventional HBB approaches and improving the baseline YOLOv8-OBB framework. To this end, a two-stage hyperparameter optimization is conducted: an initial one factor at a time (OFAT) approach followed by grid search, focusing on Lr0, weight decay, and optimizer—stochastic gradient descent (SGD), adaptive moment estimation (Adam), and Adam with decoupled weight decay (AdamW). In parallel, the architecture is refined to improve small-object-detection performance by integrating SPPF_LSKA, BiFPN, and a P2 head. Furthermore, gradual unfreezing transfer learning is applied to enhance training stability and wafer-domain adaptation. The performances of YOLOv8, YOLOv8-OBB, and the improved models are evaluated over 100 epochs and tracked up to 5000 epochs.

2. YOLOv8-Based Detection Models and Optimization Methods

2.1. YOLOv8 and YOLOv8-OBB

YOLOv8, introduced by Ultralytics (Frederick, MD, USA) in 2023, offers significant improvements in real-time computer vision tasks such as object detection, image classification, and instance segmentation. The key configuration parameters and computational specifications of the YOLOv8-nano model used in this study are summarized in Table 1. As shown in Figure 1, it refines YOLOv5 by adopting an anchor-free detection approach, which improves object-center precision, simplifies non-maximum suppression (NMS), and enhances generalization. Architectural updates include replacing the 6 × 6 convolution layer (Conv) with a 3 × 3 Conv, and replacing the C3 module (cross stage partial block with three Conv) with the C2f module (cross stage partial block with two Conv and feature fusion), resulting in a faster and more efficient network. The spatial pyramid pooling fast (SPPF) module enhances multi-scale feature extraction, while feature concatenation (Concat) in the neck and mosaic augmentation improve training performance [35,36,37].
YOLOv8-OBB extends this architecture with a dedicated rotation-prediction head, enabling the generation of rotated bounding boxes based on angle-sensitive features. As illustrated in Figure 2, this enhancement improves the detection of arbitrarily oriented objects and increases accuracy in tasks requiring geometric alignment.
However, it should be noted that this study employed only a single baseline model (YOLOv8) for comparison. Although this experimental setup does not provide a comprehensive comparison across different detector types, it allowed for a clear and focused evaluation of the proposed architecture’s effectiveness.

2.2. Effect of Hyperparameters on Model Training and Optimization Methods

In YOLOv8-based object detection, tuning key hyperparameters is critical for stable convergence and strong generalization. Lr0 sets the initial learning rate; higher values speed up training but risk instability, while lower values ensure smoother convergence. Weight decay acts as a regularizer to prevent overfitting, and the choice of optimizer—such as SGD, Adam, or AdamW—affects convergence behavior. SGD updates weights using a fixed learning rate and has slow but stable convergence. Adam combines momentum and adaptive learning rates for faster convergence, while AdamW improves upon Adam by decoupling weight decay from the gradient update for better regularization.
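The decoupling that distinguishes AdamW from Adam can be illustrated with a minimal single-parameter sketch. This is a simplified scalar update (one step, bias-corrected moments, no AMSGrad) written for illustration only; it is not the Ultralytics or PyTorch implementation.

```python
import math

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-8, wd=0.01, decoupled=False):
    """One update on a single scalar weight.

    decoupled=False : Adam with the L2 penalty folded into the gradient,
                      so the decay is rescaled by the adaptive terms.
    decoupled=True  : AdamW, where the decay is applied directly to the
                      weight, independent of the gradient statistics.
    """
    if not decoupled:
        g = g + wd * w                      # coupled L2 penalty
    m = beta1 * m + (1 - beta1) * g         # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g * g     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)            # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        w = w - lr * wd * w                 # decay outside the adaptive update
    return w, m, v

# Identical weight and gradient; only the weight-decay handling differs.
w_adam, _, _ = adam_step(1.0, 0.5, 0.0, 0.0, t=1, decoupled=False)
w_adamw, _, _ = adam_step(1.0, 0.5, 0.0, 0.0, t=1, decoupled=True)
print(w_adam, w_adamw)
```

Because the coupled penalty passes through the second-moment normalization, Adam effectively applies a weaker, gradient-dependent decay, whereas AdamW shrinks the weight by a fixed fraction per step.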
To optimize these parameters, two complementary strategies are employed. First, the OFAT method evaluates the individual impact of Lr0, weight decay, and optimizer. Grid search is then used to fine-tune combinations of these parameters by exhaustively testing predefined values, enabling the discovery of optimal configurations for accuracy and training stability.
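The two-stage search above can be sketched as follows. The `evaluate` function is a hypothetical stand-in for a full train-and-evaluate run; its toy response surface is seeded with the single-factor scores reported later but deliberately contains no parameter interactions, so it does not reproduce the joint-search outcome of Section 4.2.

```python
import itertools

# Hypothetical stand-in for a full train-and-evaluate run; in practice this
# would train YOLOv8-OBB with the given settings and return mAP50-95 on the
# validation set.  Illustrative only.
def evaluate(lr0, weight_decay, optimizer):
    base = {"SGD": 0.452, "Adam": 0.564, "AdamW": 0.550}[optimizer]
    penalty = 0.1 if weight_decay >= 0.05 else 0.0   # excessive regularization
    return base - penalty - abs(lr0 - 0.001)         # toy learning-rate effect

# Stage 1 (OFAT): vary one factor at a time around a fixed baseline.
baseline = {"lr0": 0.01, "weight_decay": 0.0005, "optimizer": "AdamW"}
ofat = {lr0: evaluate(**dict(baseline, lr0=lr0))
        for lr0 in (0.0001, 0.001, 0.01)}

# Stage 2 (grid search): exhaustively test the joint combinations.
grid = itertools.product(
    ("SGD", "Adam", "AdamW"),           # optimizer
    (0.0001, 0.001, 0.01),              # Lr0
    (0.00005, 0.0005, 0.005, 0.05),     # weight decay
)
best = max(grid, key=lambda c: evaluate(c[1], c[2], c[0]))
print("OFAT best Lr0:", max(ofat, key=ofat.get))
print("grid-search best (optimizer, Lr0, weight decay):", best)
```

OFAT narrows each factor independently, after which the exhaustive product evaluates every joint combination, capturing interactions that OFAT cannot.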
Although automated hyperparameter optimization frameworks such as Bayesian optimization can theoretically find more global optima, a manual tuning strategy was intentionally adopted in this study due to several practical and domain-specific considerations. Manual tuning is computationally efficient, simple to implement, and allows direct control over key parameters without the need for large-scale graphics processing unit (GPU) resources. Moreover, the ion implantation environment exhibits low variability—illumination is stable, the vision camera is fixed inside the chamber, and the wafer remains stationary only for brief alignment moments—making a streamlined manual tuning approach both feasible and effective. This simplified strategy focuses on stable convergence and precise angular detection performance under realistic process conditions.
Similar manual tuning approaches have also been successfully applied in prior YOLO-based studies, demonstrating that task-specific manual adjustment can be both computationally efficient and effective for domain-constrained applications [38,39,40]. However, it should be noted that such manual tuning does not guarantee a global optimum, as finer parameter configurations might be overlooked. Despite this trade-off, the chosen approach provides an appropriate balance between practicality, interpretability, and accuracy within the constrained industrial context of this study.

2.3. Architecture Improvement Based on YOLOv8-OBB

Figure 3 shows the enhanced YOLOv8-OBB architecture tailored for the precise detection of small angular wafer notches. Three architectural modifications were introduced to improve multi-scale feature fusion and localization accuracy. The original SPPF in the backbone was replaced with SPPF_LSKA to enhance spatial feature extraction, while BiFPN modules were inserted in the neck to replace conventional Concat operations, enabling bidirectional feature routing and stronger multi-scale aggregation. In addition, an extra P2 head was appended to the detection branch to exploit higher-resolution features for finer localization of small notches.
Table 2 summarizes the structural differences between the baseline YOLOv8-OBB and the improved model, including the corresponding channel configurations and module replacements. The comparison indicates that the proposed network preserves the original channel hierarchy, maintaining 256, 512, and 1024 channels for the medium-resolution detection head (P3), low-resolution detection head (P4), and lower-resolution detection head (P5) feature maps, respectively, while introducing a new P2 level with 128 channels. This modification provides higher-resolution information for small-object detection while maintaining full compatibility with the baseline architecture.
The SPPF_LSKA module refines the SPPF structure by integrating the large selective kernel attention (LSKA) mechanism. While the standard SPPF captures multi-scale contextual information through pooling layers with a five-by-five kernel size, it has limited ability to model long-range dependencies and often causes feature loss for small targets during down-sampling. The incorporation of LSKA expands the effective receptive field using separable large-kernel convolutions, allowing the network to capture broader contextual information without excessive computational cost. Furthermore, the attention mechanism adaptively assigns weights to feature channels, strengthening the representation of fine-grained details such as edges and textures that are critical for accurate localization. Overall, this module enhances both the semantic richness and spatial precision of the extracted features [41].
The BiFPN facilitates bidirectional feature fusion across multiple scales by enabling top-down and bottom-up information flow between feature layers. It introduces learnable weights to adaptively balance the contributions of shallow and deep features during fusion, thereby emphasizing informative representations while suppressing redundant ones. As illustrated in Figure 3, the BiFPN establishes fully bidirectional routing among the P2–P5 feature levels, where solid and dashed lines indicate top-down and bottom-up pathways, respectively. In addition, BiFPN simplifies the fusion structure by eliminating unnecessary connections, improving computational efficiency without compromising accuracy [42].
The P2 head incorporates a higher-resolution feature level to address the loss of fine details that typically occurs during repeated down-sampling. This additional head allows the network to retain texture and edge information essential for the precise localization of very small and densely distributed targets, complementing the deeper layers responsible for high-level semantics. The inclusion of P2 improves overall feature completeness and detection robustness with minimal computational overhead [43].
Collectively, these architectural enhancements strengthen the YOLOv8-OBB network’s ability to preserve fine-grained spatial details and improve detection accuracy for small and low-contrast objects under complex visual conditions.

2.4. Gradual Unfreezing Transfer Learning

Transfer learning improves performance on target tasks with limited data by leveraging knowledge from pretrained models. Commonly, early layers capturing general features are frozen while only the final layers are fine-tuned, reducing overfitting but limiting adaptability. To overcome this, gradual unfreezing has been proposed as a more flexible strategy. It incrementally unfreezes layers during training, enabling progressive adaptation while preserving stable representations.
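A stage-wise gradual-unfreezing schedule can be sketched as a simple function of the epoch. The frozen-layer counts (10, 7, 5, 3, 1) follow the pyramid stages used later in this paper; the fixed two-epoch interval between stages is an assumption made here for illustration.

```python
# Sketch of a stage-wise gradual-unfreezing schedule.  The frozen-layer
# counts (10 -> 7 -> 5 -> 3 -> 1) mirror the pyramid stages described in
# this study; the step of 2 epochs between stages is an assumption.
def frozen_layers(epoch, start_epoch=10, step=2, stages=(10, 7, 5, 3, 1)):
    """Number of backbone layers kept frozen at a given epoch."""
    if epoch < start_epoch:
        return stages[0]                      # whole backbone frozen
    idx = min((epoch - start_epoch) // step + 1, len(stages) - 1)
    return stages[idx]                        # progressively unfrozen

schedule = [frozen_layers(e) for e in range(20)]
print(schedule)
```

In a training loop, the returned count would decide which backbone parameters receive `requires_grad = True` at the start of each epoch, so early layers keep their pretrained representations until the later stages adapt.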

3. Performance Metric and Experimental Setup

3.1. Evaluation Indicator and Implementation Environment

In this study, notch-detection performance was evaluated using the mean average precision over intersection over union (IoU) thresholds ranging from 0.50 to 0.95 (mAP50–95), a widely adopted metric that reflects both classification and localization accuracy. The average precision (AP) is first computed as the area under the precision–recall curve for a given class, where precision is the ratio of correctly predicted positives to all predicted positives, and recall is the ratio of correctly predicted positives to all actual positives. The mAP50–95 was obtained by averaging the AP across multiple IoU thresholds ranging from 0.50 to 0.95 in increments of 0.05. The notation “50–95” in mAP50–95 represents this IoU range and step size, following the Common Objects in Context (COCO)-style mAP evaluation protocol.
In this work, a total of 33 angular classes were defined according to the discrete notch-angle intervals, and the final mAP50–95 value was calculated as the mean of APs across these classes. This metric provides a rigorous and comprehensive assessment of the detection performance across varying levels of overlap criteria. In Equation (1), N denotes the number of object classes used for averaging, and i indicates the index corresponding to each class.
$$\mathrm{mAP}_{50\text{--}95} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_{i} \quad (1)$$
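The class averaging of Equation (1), combined with the 0.50–0.95 IoU sweep, can be sketched as follows. The per-class AP table here is hypothetical input that would, in practice, come from the precision–recall curves of the 33 angular classes.

```python
def map50_95(ap_table):
    """ap_table: {class_id: {iou_threshold: AP}} -> scalar mAP50-95."""
    thresholds = [round(0.50 + 0.05 * k, 2) for k in range(10)]  # 0.50..0.95
    per_class = [sum(aps[t] for t in thresholds) / len(thresholds)
                 for aps in ap_table.values()]   # AP averaged over IoU sweep
    return sum(per_class) / len(per_class)       # Equation (1): mean over classes

# Hypothetical two-class example with constant AP across all thresholds.
toy = {c: {round(0.50 + 0.05 * k, 2): ap for k in range(10)}
       for c, ap in [(0, 0.8), (1, 0.6)]}
print(map50_95(toy))
```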
In this study, mAP50–95 was evaluated using the test dataset, which was held out from training and used only for final performance assessment.
The model was implemented using Python 3.11.11 in combination with the PyTorch 2.5.1 deep-learning framework. The process was executed on a system equipped with a central processing unit (CPU) and GPU utilizing a compute unified device architecture (CUDA) platform for accelerated computation. The detailed software and hardware specifications are presented in Table 3.
Most of the hyperparameters followed the default configurations provided by the YOLOv8 framework, with the nano-variant selected as the model scale. A summary of the default hyperparameter settings and model scales is provided in Table 4.

3.2. Dataset Description and Preparation

A total of 3517 wafer images were used in this study. An industrial vision camera was installed inside the moving stage into which the wafer is loaded, and was synchronized with a control computer to record one complete rotational cycle at various angular positions. The camera was configured to capture grayscale images only, in order to minimize color-related variability and reduce environmental complexity under a controlled acquisition setup. The recorded video sequences were decomposed into individual image frames and categorized into four representative process conditions: stationary wafers with visible notches (positive samples), frames without any wafer present, moving wafers without visible notches, and moving wafers with partially visible notches. Considering that wafers remain stationary only for a brief moment during actual operation, 10 images per angular position were collected for the positive dataset, covering angular ranges of 355-5° (−5–+5°), 40–50°, and 107–117°, which represent the primary notch orientations observed in practice. The detected notch angle was subsequently transmitted to the stage control unit for angular correction, completing the end-to-end pre-alignment process within the industrial vision system.
The annotation of notch regions was performed using the roLabelImg tool, which supports OBB labeling. A total of 330 stationary wafer images were manually annotated according to their respective angular orientations, enabling the YOLOv8-OBB detector (Ultralytics, Frederick, MD, USA) to learn precise rotation-aware representations of wafer notches. For consistency across subsets, the bounding box dimensions were set to 30 by 60 pixels for the training set, 33 by 66 pixels for the validation set, and 36 by 72 pixels for the testing set. The entire dataset was divided into 2110 images for training, 703 for validation, and 704 for testing at a ratio of 6:2:2 using a Python-based script that randomly selected frames rather than following sequential order, thereby preventing temporal bias. The training set was used for model optimization, while the validation set was employed to monitor the training progress and evaluate epoch-wise performance trends. The test set was held out and used only for the final quantitative performance evaluation of the trained models.
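The random 6:2:2 split described above can be sketched as follows. The file names are hypothetical, and the fixed seed is an assumption added for reproducibility.

```python
import random

def split_dataset(frames, ratios=(0.6, 0.2, 0.2), seed=42):
    """Randomly split frames 6:2:2, avoiding the temporal bias of a
    sequential split."""
    frames = list(frames)
    random.Random(seed).shuffle(frames)       # random order, not frame order
    n_train = round(len(frames) * ratios[0])
    n_val = round(len(frames) * ratios[1])
    return (frames[:n_train],                 # training
            frames[n_train:n_train + n_val],  # validation
            frames[n_train + n_val:])         # held-out test

# 3517 extracted frames, matching the dataset size in this study.
frames = [f"frame_{i:04d}.png" for i in range(3517)]
train, val, test = split_dataset(frames)
print(len(train), len(val), len(test))  # 2110 703 704
```

With 3517 frames, rounding the 0.6 and 0.2 fractions reproduces the 2110/703/704 partition reported above.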
The dataset included different visual cases to represent the variability in wafer appearance, as illustrated in Figure 4.

4. Results and Discussion

4.1. Comparison of YOLOv8 and YOLOv8-OBB

Figure 5 shows a performance comparison between YOLOv8 and YOLOv8-OBB for wafer-notch-angle detection using the default hyperparameters listed in Table 4. YOLOv8 achieved an mAP50–95 of 0.319, whereas the YOLOv8-OBB model reached 0.363, an improvement of 4.4 percentage points. This gain highlights the effectiveness of the OBB model in capturing rotational features and minimizing background interference, demonstrating its superior suitability for angle-sensitive detection tasks, such as wafer-notch localization.

4.2. Tuning of Model Parameters

Figure 6 shows the results of the OFAT experiments conducted to individually optimize Lr0, weight decay, and the optimizer settings based on mAP50–95. Among the Lr0 values tested, 0.001 achieved the highest performance with an mAP50–95 of 0.549, outperforming both 0.0001 (0.252) and 0.01 (0.363). This indicates that extremely low or high learning rates hinder stable convergence or overshoot optimal updates. In the case of weight decay, performance remained relatively stable across a wide range of values, with a slight peak at 0.005 (0.550), closely followed by 0.0005 (0.549) and 0.00005 (0.544). However, a noticeable decrease occurred at 0.05 (0.518), suggesting that excessive regularization can impair learning. The Adam optimizer yielded the highest mAP50–95 of 0.564, followed closely by AdamW at 0.550, whereas SGD yielded the lowest performance with an mAP50–95 of 0.452. These findings indicate that adaptive optimizers are more effective for the OBB detection task, with Adam offering the best initial performance under fixed conditions. Collectively, these OFAT results provide a reliable reference for narrowing the hyperparameter search space before applying the joint optimization methods.
Figure 7 shows the results of a grid search across combinations of optimizers (SGD, Adam, and AdamW), Lr0 (0.0001, 0.001, and 0.01), and weight decay (0.00005, 0.0005, 0.005, and 0.05), evaluated using mAP50–95. The best performance (0.649) was achieved with SGD, Lr0 of 0.01, and a weight decay of 0.0005; however, sharp performance decreases were observed under stronger regularization, falling to 0.291 at a weight decay of 0.05. AdamW yielded consistent results (e.g., 0.544 at 0.00005 and 0.550 at 0.005), whereas Adam peaked at 0.564 with a weight decay of 0.005 and Lr0 of 0.001. However, its performance decreased to 0.208 under harsher settings with a weight decay of 0.05 and Lr0 of 0.01. These trends suggest that SGD can be highly effective when finely tuned, whereas AdamW provides a more robust and reliable performance across a wider hyperparameter space, making it preferable in scenarios with limited tuning capacity. It is worth noting that the OFAT and grid search results show differing optimizer trends—Adam performed best in the OFAT analysis, whereas SGD achieved the highest performance in the joint grid search. This discrepancy arises because OFAT varies one factor at a time under fixed conditions, thereby neglecting potential interactions among hyperparameters. In contrast, the grid search jointly optimizes multiple parameters, where the combination of a tuned learning rate and weight decay allows SGD to generalize more effectively, resulting in superior overall performance.

4.3. Architectural Enhancement and Transfer Learning via Gradual Unfreezing

Figure 8 shows the results of an ablation study conducted to evaluate the impact of architectural modifications on the detection performance by incrementally applying three modules, SPPF_LSKA, BiFPN, and P2 head, to the baseline YOLOv8-OBB model. Using the optimized training configuration from the previous section (SGD optimizer, Lr0 of 0.01, and weight decay of 0.0005), the study revealed that the P2 head alone provided the most significant standalone improvement, increasing mAP50–95 from 0.649 to 0.712. This underscores the importance of incorporating high-resolution features from shallow layers, which are particularly effective for detecting small and low-texture targets such as wafer notches. In contrast, combinations that excluded the P2 head, such as SPPF_LSKA with BiFPN, did not improve and occasionally degraded the performance, indicating that structural modules alone are insufficient without fine-grained spatial information. Notably, when all three components were integrated, the model achieved the highest performance, reaching an mAP50–95 of 0.726, demonstrating the synergistic effects of spatial attention, multiscale feature fusion, and enhanced shallow-level prediction.
Figure 9 shows the results of an ablation study investigating the effect of gradual unfreezing transfer learning on the model performance using the optimized YOLOv8-OBB architecture that incorporates the SPPF_LSKA, BiFPN, and P2 head modules shown in Figure 3. The training was performed with the SGD optimizer, an Lr0 of 0.01, and a weight decay of 0.0005, following the same hyperparameter settings used in the previous optimization experiments. In this setup, the backbone was initially frozen and progressively unfrozen in pyramid stages corresponding to the feature levels (P1–P5), reducing the number of frozen layers from ten to seven, five, three, and one; unfreezing schedules beginning at epochs 4, 6, 8, 10, 12, and 14 were compared within a total of 100 training epochs. A no-freezing baseline was also included for comparison. The results demonstrated that unfreezing the final backbone stages around epochs 10–12 yielded the optimal performance, achieving the highest mAP50–95 of 0.739 at epoch 12 and 0.728 at epoch 10, both surpassing the no-freezing baseline of 0.726. In contrast, early unfreezing schedules (epochs 4 and 6) resulted in lower scores of 0.698 and 0.690, likely due to premature disruption of pre-trained representations, while delayed unfreezing at epoch 14 also underperformed with a score of 0.668, suggesting insufficient adaptation time. These findings indicate that precisely timed, stage-wise unfreezing—particularly within the 10–12 epoch range—effectively balances the preservation of general features with task-specific learning, leading to improved detection performance.

4.4. Detection Results and Validation with Prolonged Training

Figure 10 shows a comparative analysis of the detection performance across the three models—YOLOv8, baseline YOLOv8-OBB, and the proposed enhanced model—evaluated using representative images within three angular intervals: 355-5° (−5–+5°), 40–50°, and 107–117°. Both the YOLOv8 and YOLOv8-OBB models were trained using the hyperparameter configurations listed in Table 4. YOLOv8 failed to detect any target, resulting in 33 missed cases. The baseline YOLOv8-OBB model showed partial improvement, detecting notches at select angles, such as 355°, 356°, 357°, and 3°, but still failed in most cases and exhibited low confidence scores. In contrast, the enhanced model achieved robust detection with confidence scores above 0.5 in most cases. It correctly identified notches in 23 out of 33 images, with 10 false positives and no complete detection failures. Although false positives occurred at a few angles, including 2°, 4°, 41–43°, 47°, 49°, 108°, 109°, and 116°, this model consistently outperformed the other models across all angular ranges. These results confirmed that the proposed architectural and training enhancements substantially improved the ability of the model to perform reliable and angle-sensitive detection, demonstrating its practical effectiveness in real-world scenarios requiring high-precision localization.
Figure 11 shows a large-scale training analysis of up to 5000 epochs to evaluate the convergence and final detection accuracy across the three models. To examine the trend of mAP50–95 on the test dataset at different training epochs, the weights were saved at fixed intervals and evaluated after each checkpoint. Early stopping was not applied, as it may prematurely terminate the training process near the plateau point, while the model still retains potential for gradual performance improvement beyond this stage. YOLOv8 converged quickly with a peak mAP50–95 of 0.565 at epoch 300 but plateaued early with no further improvements. In contrast, YOLOv8-OBB exhibited slower convergence but continued to improve until epoch 800, reaching an mAP50–95 of 0.772. The enhanced model demonstrated a superior performance, achieving the highest mAP50–95 of 0.837 with convergence at approximately 800 epochs. This gain is attributed to the optimized training settings, particularly the use of SGD over AdamW, which led to more stable and efficient learning, and to the architectural upgrades (SPPF_LSKA, BiFPN, and P2 head). Additionally, the gradual unfreezing of the backbone during transfer learning helped preserve pretrained knowledge early while promoting task-specific adaptation, collectively enhancing generalization and training robustness.

5. Conclusions

This study proposes a series of improvements to the YOLOv8-OBB model to enhance wafer notch-angle-detection performance. The study first focused on the hyperparameter optimization of Lr0, weight decay, and the optimizer using both the OFAT and grid-search strategies. This was followed by architectural enhancements to improve the detection of small rotational features. Finally, the application of a gradual unfreezing transfer-learning strategy was used to further enhance model adaptability. The combination of the OBB with targeted model enhancements significantly improved the detection accuracy in complex angular environments.
The experimental results confirmed the effectiveness of the proposed approach. In comparative evaluations, the YOLOv8-OBB model demonstrated improved performance compared with the original YOLOv8 model, highlighting the effectiveness of OBB in capturing rotational object characteristics. Hyperparameter tuning using OFAT and grid search, particularly the optimization of the Lr0, weight decay, and optimizer, enhanced the training stability and accelerated convergence. Building on this foundation, the incorporation of architectural modules such as SPPF_LSKA, BiFPN, and the P2 head improved fine-grained feature extraction and multiscale representation. Furthermore, the application of a gradual unfreezing strategy in transfer learning enabled the progressive adaptation of pretrained layers, enhancing generalization and robustness. The final model, which integrated all the optimizations, demonstrated superior detection performance and sustained accuracy across extended training durations compared with the baseline configurations.
For reliable implementation in high-precision vision systems, particularly within semiconductor manufacturing processes, the combined use of the OBB, systematic hyperparameter optimization, architectural refinement, and transfer learning strategies should be considered. This integrated approach enhanced the detection accuracy, generalization capability, and angular-localization robustness, thereby offering a rigorously validated framework suitable for precision-critical applications.
Furthermore, the improvement in mAP50–95 directly translates to a higher correct detection rate of wafer notches, which plays a key role in maintaining precise wafer alignment during ion implantation. By enabling accurate and real-time identification of wafer orientation, the proposed method facilitates early detection of misaligned or defective wafers, thereby reducing process errors and improving overall manufacturing yield. Consequently, the proposed framework contributes not only to algorithmic performance but also to practical advancements in smart semiconductor manufacturing.
However, it should be noted that mAP50–95 primarily evaluates detection accuracy based on spatial overlap and does not directly quantify the angular deviation between the predicted and actual notch orientations. Future studies will therefore incorporate direct angular error measurements to complement the mAP-based evaluation, ensuring a more task-aligned assessment of angular precision.
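A direct angular-deviation metric of the kind proposed for future work must handle wrap-around, so that predictions of 358° and 2° differ by 4° rather than 356°. The functions below are a sketch of such a metric, not part of the published evaluation.

```python
def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two orientations in degrees (0-180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def mean_angular_error(pred_angles, true_angles):
    """Mean absolute angular deviation over paired predictions and ground truths."""
    errors = [angular_error(p, t) for p, t in zip(pred_angles, true_angles)]
    return sum(errors) / len(errors)

print(angular_error(358.0, 2.0))  # wrap-around case: 4.0, not 356.0
print(mean_angular_error([3.0, 50.0, 107.0], [1.0, 48.0, 110.0]))
```

Reporting this alongside mAP50–95 would directly quantify how closely the predicted notch orientation matches the ground truth, complementing the overlap-based score.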
Beyond this metric-related consideration, several additional limitations and potential directions for improvement can be identified. While the present study adopted a sequential improvement strategy, progressing through hyperparameter tuning, structural refinement, and gradual unfreezing transfer learning, future work will include more detailed experiments to analyze the contribution of each methodological stage to the overall performance gain. Furthermore, beyond the manual hyperparameter tuning employed in this study, automated or hybrid optimization frameworks (e.g., Bayesian optimization) will be explored. Although such methods require greater computational resources, they offer a promising path toward globally optimized parameter configurations and improved model robustness.

Author Contributions

Conceptualization, E.S.J. and S.J.M.; methodology, E.S.J. and H.J.S.; software, E.S.J. and H.J.S.; validation, S.J.M.; formal analysis, E.S.J. and S.J.M.; investigation, E.S.J.; data curation, E.S.J. and H.J.S.; resources, S.J.M.; funding acquisition, S.J.M.; writing—original draft preparation, E.S.J.; writing—review and editing, S.J.M.; visualization, E.S.J. and H.J.S.; supervision, S.J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) under the project “The development of hydrogen fuel cell power pack over 200 kW for ships” (2410010247, No. RS-2024-00420215).

Data Availability Statement

Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. YOLOv8 architecture. Gray blocks represent Conv, yellow blocks denote C2f modules, the orange block indicates the SPPF layer, green blocks correspond to concatenation operations, blue blocks indicate upsampling operations, and purple blocks represent the detection heads.
Figure 2. YOLOv8-OBB architecture. The color scheme follows that of Figure 1, with the detection heads replaced by OBB outputs.
Figure 3. Improved YOLOv8-OBB architecture. (a) Unannotated version of the architecture; (b) refined model architecture, with structural modifications visually marked in red. The color scheme follows that of Figure 2, where orange represents the SPPF_LSKA module, green indicates the BiFPN layers, and the additional P2 head is newly incorporated.
Figure 4. Image cases for training. (a) Absent wafer; (b) wafer in motion; (c) stationary wafer with a red box highlighting the area containing the notch; (d) enlarged view of the notch highlighted in red (this image case is not used as an example).
Figure 5. mAP50–95 evaluation for YOLOv8 and YOLOv8-OBB.
Figure 6. Model-performance comparison across different hyperparameter values using OFAT. (a) Lr0; (b) weight decay; (c) optimizer.
Figure 7. Grid-search results for mAP50–95 using combinations of optimizer types, Lr0, and weight-decay values.
Figure 8. Ablation-study results showing mAP50–95 for different combinations of SPPF_LSKA, BiFPN, and P2 head modules.
Figure 9. Ablation study of gradual unfreezing with varying epoch thresholds for layer release in the backbone network.
Figure 10. Detection results across representative angular ranges: (a) 3° from the 355–5° (i.e., −5° to +5°) range, (b) 50° from the 40–50° range, and (c) 107° from the 107–117° range. Each example shows the predicted notch angle and confidence score for YOLOv8, YOLOv8-OBB, and the improved model, enabling a visual comparison of the detection performances.
Figure 11. mAP50–95 performance trends over extended training epochs for YOLOv8, YOLOv8-OBB, and the improved model.
Table 1. YOLOv8 model’s complexity and computational cost.
Model size | Nano
Number of parameters | 3,157,200
Gradients | 3,157,184
GFLOPs | 8.9
Table 2. Structural comparison between the baseline and the improved model.
Stage | Baseline | Improved Model | Channel Width | Structural Change
Backbone | SPPF | SPPF_LSKA | 1024 | Module replacement
Neck | Concat (P3–P5) | BiFPN (P3–P5) | P2 with 128 channels added | Module replacement
Head | Detect/OBB (P3–P5) | OBB (P2–P5) | Additional P2 branch | Branch extension
Table 3. Software and hardware settings.
Platform | Description
System | Windows 11
Integrated development environment | Visual Studio Code
Virtual environment | Anaconda Prompt
GPU | NVIDIA GeForce RTX 4090
CPU | AMD Ryzen 7 5700X 8-core processor, 3401 MHz
Framework | PyTorch 2.5.1
CUDA | 12.4
Language | Python 3.11.11
Ultralytics | 8.3.51
Table 4. Baseline Hyperparameter Configurations.
Hyperparameter | Configuration
Model scale | Nano
Lr0 | 0.01
Weight decay | 0.0005
Optimizer | AdamW
Epochs | 100
Patience | 0

Share and Cite

MDPI and ACS Style

Jun, E.S.; Sim, H.J.; Moon, S.J. Advancing YOLOv8-Based Wafer Notch-Angle Detection Using Oriented Bounding Boxes, Hyperparameter Tuning, Architecture Refinement, and Transfer Learning. Appl. Sci. 2025, 15, 11507. https://doi.org/10.3390/app152111507

