Deep Learning for Classification of Internal Defects in Fused Filament Fabrication Using Optical Coherence Tomography

Lang, Valentin; Zhu, Qichen; Kopycinska-Müller, Malgorzata; Ihlenfeldt, Steffen

doi:10.3390/asi9020042

by Valentin Lang ¹,*, Qichen Zhu ¹, Malgorzata Kopycinska-Müller ² and Steffen Ihlenfeldt ¹,³

¹ Institute of Mechatronic Engineering, Faculty of Mechanical Engineering, Technische Universität Dresden TUD, 01069 Dresden, Germany

² Fraunhofer Institute for Ceramic Technologies and Systems IKTS, 01277 Dresden, Germany

³ Fraunhofer Institute for Machine Tools and Forming Technology IWU, 09126 Chemnitz, Germany

* Author to whom correspondence should be addressed.

Appl. Syst. Innov. 2026, 9(2), 42; https://doi.org/10.3390/asi9020042

Submission received: 23 December 2025 / Revised: 5 February 2026 / Accepted: 11 February 2026 / Published: 14 February 2026

(This article belongs to the Special Issue AI-Driven Decision Support for Systemic Innovation)

Abstract

Additive manufacturing is increasingly adopted for the industrial production of small series of functional components, particularly in thermoplastic strand extrusion processes such as Fused Filament Fabrication. This transition relies on technological advances addressing key process limitations, including dimensional instability, weak interlayer bonding, extrusion defects, moisture sensitivity, and insufficient melting. Process monitoring therefore focuses on early defect detection to minimize failed builds and costs, while ultimately enabling process optimization and adaptive control to mitigate defects during fabrication. For this purpose, a data processing pipeline for monitoring Optical Coherence Tomography images acquired in Fused Filament Fabrication is introduced. Convolutional neural networks are used for the automatic classification of tomographic cross-sections. A dataset of tomographic images passes semi-automatic labeling, preprocessing, model training and evaluation. A sliding window detects outlier regions in the tomographic cross-sections, while masks suppress peripheral noise, enabling label generation based on outlier ratios. Data are split into training, validation, and test sets using block-based partitioning to limit leakage. The classification model employs a ResNet-V2 architecture with BottleneckV2 modules. Hyperparameters are optimized, with N = 2, K = 2, dropout 0.5, and learning rate 0.001 yielding best performance. The model achieves 0.9446 accuracy and outperforms EfficientNet-B0 and VGG16 in accuracy and efficiency.

Keywords: artificial intelligence; computer vision; image classification; convolutional neural network; machine learning; deep learning; additive manufacturing; fused filament fabrication; optical coherence tomography

1. Introduction

Additive manufacturing has reached a stage of transition from purely stand-alone prototype production to widespread industrial application for the manufacture of small series of sophisticated functional components. This applies in principle to all additive manufacturing processes, in particular to printing thermoplastics using strand extrusion processes, commonly referred to as fused filament fabrication (FFF), and also alternatively referred to as fused deposition modeling (FDM). The transformation towards widespread industrial utilization is accompanied by consistent technological improvements addressing inherent process challenges, such as the insufficient dimensional stability of components [1], poor layer adhesion [2], inadequate extrusion [3], material breakage as a result of moisture [4], and poor print quality due to insufficient melting of the material [5].

In this context, the primary objective of process monitoring is to identify defects at an early stage so that printing of defective parts can be stopped as soon as possible, keeping the excess costs of process failures to a minimum. In the long term, print monitoring should also enable the optimization of printing processes and, via process control, ideally provide adaptive processes that can repair prints when a defect occurs.

Optical Coherence Tomography (OCT) is one of the few high-resolution imaging techniques that provides volumetric information about transparent or semi-transparent objects and is suitable for direct integration into production hardware for in-line monitoring. In contrast to well-established Computed Tomography (CT), OCT uses low-energy radiation, making it an attractive sensor technology for monitoring running production processes. OCT was originally developed as a diagnostic tool for ophthalmology [6]. Since then, it has become a routine diagnostic tool in this field, and there is growing interest in adopting it for dermatology and stomatology [7]. The application of OCT in material diagnostics is also increasing [8]. OCT utilizes a short-coherence light source. As the method is based on Michelson interferometry, the light is split into two beams: a reference arm and a sample arm. The sample arm is focused on the object under investigation. Depending on the optical properties of the sample, the focused light propagates through the material and is partially scattered at the surface and at local inhomogeneities. These inhomogeneities include variations in material density, optical properties, contamination, pores, delamination, and other defects.

In this study, raw tomographic data from OCT measurements on additively manufactured (AM) samples, in particular FFF samples, are analyzed using deep learning methods of computer vision. The aim is to assess the internal material quality of the section shown in each tomogram, in preparation for prospective automated in-line monitoring of FFF printing. Convolutional neural networks (CNN) are used for automatic feature extraction and the classification of tomographic cross-sections. The tomographic information is acquired using a commercially available OCT system whose dimensions and mass make it well suited for capturing the interior of the printed material volume directly at the material fusion location during a running printing operation. The concept envisages continuously scanning through the top layer, i.e., the currently printed layer, at least deeply enough to record information from the interface between superimposed printed layers. The junction between printed layers is of significant interest for the structural strength of printed components, so in-line condition information from this junction area is key to effective monitoring of component quality.

2. State of the Art

In additive manufacturing, common real-time inspection techniques focus on optical inspection, thermal imaging, and acoustic monitoring [9]. These technologies are widely adopted due to efficiency, ease of integration, and applicability to various materials and processes.

Approaches to data-based monitoring in additive manufacturing typically leverage advanced processing and analysis methods, ranging from feature extraction from sources such as melt pool images or heat maps to statistical methods that detect deviations from reference conditions. Machine learning (ML) is playing an increasingly important role in this context. Supervised models classify defects or predict porosity [10,11,12], unsupervised models detect anomalies [13,14,15], and physics-based approaches combine simulations with sensor data [16], while digital twins provide virtual real-time replicas of the process [17]. Monitoring strategies are either ex situ, involving CT or X-ray scans [18], surface profilometry [19], and destructive testing for validation [20], or in situ, incorporating layer-by-layer inspection [21], melt pool tracking [22], and adaptive feedback [23]. Major challenges include managing the vast amounts of data generated during printing, distinguishing between genuine defects and noise, and a lack of standardized frameworks, which complicates integration into control loop systems [24]. Despite all obstacles, applications can be found in aerospace, medicine, and the automotive industry, where reliable monitoring supports certification and quality assurance, as well as in research and development, where processes, structures, and properties need to be linked together [25].

Computer vision is a science at the interface between computer science and engineering and is dedicated to processing and analyzing images captured by cameras in a variety of ways in order to understand their content or extract information. Typical tasks of computer vision include object recognition and determining geometric structures of objects and movements, using image processing algorithms such as segmentation and pattern recognition methods such as object classification [26,27].

Optical monitoring is frequently used in additive manufacturing as it is non-contact, offers high temporal and spatial resolution, and provides direct insight into defects such as porosity, lack of fusion, or overheating. Different types of sensors can serve complementary roles, with high-speed cameras capturing detailed images of melt pools but generating enormous amounts of data [25,28], whereas photodiodes provide ultra-fast signals with low memory requirements based on melt pool emissions, but without spatial detail [29,30]. Infrared cameras map thermal fields to evaluate cooling rates and hot spots, but face challenges with emissivity [31,32]. Laser profilometers capture the surface topography of printed layers to enable direct conclusions about material quality and, extending on this, predictions regarding final component quality [21,33]. These sensors generate large, complex datasets that require complex processing, from image analysis and deep learning for defect classification to signal processing and multi-sensor fusion for accurate and responsive monitoring. Nevertheless, major challenges remain in processing huge amounts of data, reliably assigning signals to fault types, and standardizing procedures. Future trends are focused on control loop systems that adjust process parameters (nozzle temperature, bed temperature, feed speed, extrusion speed) directly based on optical feedback [34,35,36].

OCT is essentially an optical analogue of ultrasound imaging, in which low-coherence interferometry enables resolution in the µm range and depth-resolved imaging up to a depth of several millimeters in real time [37]. OCT as a high-resolution imaging technique increasingly excels in a wide range of applications beyond medical imaging, including materials science, and thus recently in additive manufacturing [6,38]. OCT provides information about the internal structure of a sample by measuring the coherence of light waves, enabling the creation of three-dimensional images and thus the monitoring of internal alterations in material or tissue [39]. The technique is particularly suitable for examining transparent or translucent materials such as polymers, biological tissue, and thin films, while being non-invasive, delivering real-time images, and offering high resolution [40]. Unlike conventional optical sensors such as cameras, photodiodes, or IR systems, which are limited to surface observations, OCT can visualize features beneath the surface in recently solidified layers, making it extremely valuable for detecting defects, near-surface pores, delaminations, and even surface roughness [41,42,43]. Demonstrated applications include powder bed fusion, where OCT measures layer thickness and detects subsurface cracks [30,44], as well as directed energy deposition, where OCT tracks bead geometry and ensures consistency between layers [45]. Research has shown that OCT can measure melt track depth and correlate it with process parameters, allowing for the prevention of porosity [46]. Its strengths, namely high-resolution volumetric data acquisition and non-contact operation with great potential for integrated control, are offset by challenges: limited penetration depth, line-of-sight requirements, complexity of integration, and the need to manage vast 3D datasets [8]. Nevertheless, future developments in OCT are expected to complement traditional optical inspection by combining surface and subsurface information and feeding data into digital twins, which would prove particularly valuable in industries such as aerospace, electronics, optics and medical devices, where internal defects are deemed unacceptable [8,47,48,49].

3. Materials and Methods

Tomographic cross-sectional OCT images pass through a processing pipeline consisting of labeling, preprocessing, model training and evaluation (see Figure 1). Labeling constitutes a statistical approach employing sliding window thresholding of Z-scores, subsequent morphological operations, and thresholding outlier ratios. The features derived in the labeling approach are not further processed in model training. The data preprocessing handles raw OCT cross-sectional images and involves cropping, normalization, histogram matching, data splitting and augmentation. Spatial continuity and correlation of B-scan images are considered in data splitting using block-based partitioning to split training, validation, and test sets. ResNet-V2 [50] builds the platform for the deep learning model for the classification task, with BottleneckV2 modules applied to the network. To optimize model performance, the width multiplication factor K and number of bottleneck modules N in the network are tuned.

The model is trained on the raw OCT data to classify the OCT images as ‘good’ or ‘bad’ with regard to the internal material state of the printed volume. The criteria for assessing the model’s performance include accuracy and loss curves, the confusion matrix, precision (‘good’), recall (‘bad’), and the F1 score. Accuracy and loss values are recorded at each training epoch and plotted as curves upon completion of training. As misclassifying a ‘bad’ sample as ‘good’ has a significant impact on the subsequent printing process, special emphasis is placed on recall (‘bad’) and precision (‘good’). A high precision (‘good’) indicates a low rate of misclassifying ‘bad’ samples as ‘good’. A high recall (‘bad’) ensures that most ‘bad’ samples are detected, minimizing missed detections. The deep learning models in this study are constructed, trained, and validated using the PyTorch 2.3.0 library, with data preprocessing (image transformation, normalization, and enhancement operations) done with the NumPy and TorchVision libraries. Training and testing run in a Python 3.11 environment on an NVIDIA RTX 3070 GPU.
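The two emphasized metrics can be read directly off the confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the label encoding (1 = 'bad', 0 = 'good'), the function name safety_metrics, and the choice to report F1 for the 'bad' class are assumptions made for illustration.

```python
def safety_metrics(y_true, y_pred):
    """Metrics for binary labels, assuming 1 = 'bad' and 0 = 'good'."""
    pairs = list(zip(y_true, y_pred))
    bad_as_bad = sum(t == 1 and p == 1 for t, p in pairs)
    bad_as_good = sum(t == 1 and p == 0 for t, p in pairs)   # missed defects
    good_as_good = sum(t == 0 and p == 0 for t, p in pairs)
    good_as_bad = sum(t == 0 and p == 1 for t, p in pairs)   # false alarms

    def ratio(num, den):
        # Guard against empty classes instead of dividing by zero.
        return num / den if den else 0.0

    # Precision ('good'): of all images predicted 'good', how many are truly 'good'.
    precision_good = ratio(good_as_good, good_as_good + bad_as_good)
    # Recall ('bad'): of all truly 'bad' images, how many were detected.
    recall_bad = ratio(bad_as_bad, bad_as_bad + bad_as_good)
    precision_bad = ratio(bad_as_bad, bad_as_bad + good_as_bad)
    return {
        "accuracy": ratio(bad_as_bad + good_as_good, len(pairs)),
        "precision_good": precision_good,
        "recall_bad": recall_bad,
        # F1 computed for the 'bad' class (an assumption; the text does not say which class).
        "f1_bad": ratio(2 * precision_bad * recall_bad, precision_bad + recall_bad),
    }


metrics = safety_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

Both precision ('good') and recall ('bad') are lowered by the same error type, a 'bad' image predicted as 'good', which is why the two are tracked together.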

3.1. Data Acquisition

The experimental setup (see Table 1) for ex situ capturing of process-related tomographic image data is based on printing experiments on a Raise3D Pro3. With a layer height of 0.2 mm, lines with a total length of 10 mm were printed on top of each other on a square base area of 10 × 10 mm², with different numbers of layers, using PA12 and PLA. A nozzle diameter of 0.4 mm, generally the most widely used in FFF, was employed. The average nozzle temperature was 205 °C for PLA and 265 °C for PA12. The average bed temperature was 55 °C.

The data were acquired using a Spectral Domain OCT (SD-OCT) system [51]. A schematic representation of the system is shown in Figure 2, and a detailed description of the data processing chain for SD-OCT and other OCT variations can be found in [52,53]. The basic output unit in a scanning SD-OCT system is the depth profile recorded at a single location, known as an A-scan. As the light beam moves along one axis, multiple A-scans form a B-scan, which represents a virtual cross-sectional image of the sample. A series of B-scans acquired along the other axis creates a tomogram. This terminology is similar to that used in ultrasound imaging.

The test samples were imaged using a commercially available ThorLabs Telesto OCT system, operating at a central wavelength of 1300 nm. The axial resolution in air was 6.95 µm, and the optical components provided a lateral resolution of 7 µm. Note that in SD-OCT systems, axial resolution also depends on the refractive index of the imaged material. The maximum field of view was 9 × 9 mm², and the system could acquire A-scans at a rate of 76 kHz.

Data analyzed in this study were acquired ex situ. Samples were placed under the OCT sensor and aligned, and tomograms were recorded at specific locations for structures longer than 10 mm. Raw data were exported as stacks of matrices in TIF format. These could be viewed in Fiji software as images, where gray values corresponded to the signal intensity measured by the OCT system in dB.

OCT images are inherently noisy. The recorded images therefore contain artifacts and false defect signals caused by the imaging mechanism and/or the sample structure:

Point-like bright spots are artifacts caused by speckle noise or tiny scattering [54];
Vertical shadows indicate signal loss artifacts caused by superseding, highly reflective, or opaque structures that prevent light from reaching underlying layers [55];
Vertical bright lines are caused by interaction between scan synchronization and periodic sample structures, resulting in local signal duplication or misalignment [55].

Samples consisting of single, double, and triple layers of material were imaged, and the corresponding data stacks were used for further analysis. In total, 8135 images were available for training and testing the model (Table 2). These images come in TIF file format, with each image accompanied by a TXT file containing A-scan-specific intensity slope values for each B-scan. By processing the TIF files, each B-scan image can be extracted for subsequent analysis. The image files obtained are listed in Table 2 along with their characteristics. The table shows the number of B-scans obtained in each TIF stack and indicates the output format of the B-scans contained as well as the materials of the corresponding samples.

3.2. Data Exploration and Data Labeling

The TIF files, shown as ‘C-scan’ in Figure 2c, constitute the initial files of this study, containing the entire scan spatial image. By processing the TIF file, each B-scan image can be derived for subsequent analysis. Figure 2c and Figure 3a show a B-scan image of an OCT scan exhibiting significant vertical artifacts, as it is a composite of several consecutive A-scans. The slope map is calculated by moving a 10-pixel sliding window along each A-scan (vertical direction, i.e., columns of the image array). For each B-scan image, a series of corresponding slope value data in txt format is generated. The sliding window is selected so as to capture local variations around each pixel point and represent its structural features in the B-scan image, especially where object edges and internal defects occur, which appear as distinct bright areas on the OCT image. The occurrence of these bright areas is related to the scattering and reflection of light at defects, and a slope analysis of the sliding window can effectively extract these bright areas while avoiding errors caused by noise along the vertical direction. This method identifies signs of structural changes appearing in the images, and thus facilitates distinct localization of outlier locations.
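The sliding-window slope computation can be sketched as follows. This is an illustration under assumptions: the text does not state how the slope is estimated, so a least-squares linear fit of intensity against depth inside each 10-pixel window is assumed here, and the function name slope_map is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def slope_map(bscan: np.ndarray, window: int = 10) -> np.ndarray:
    """Least-squares slope of intensity vs. depth in a window sliding down each column."""
    # Shape (rows - window + 1, cols, window): one window per start row and column.
    windows = sliding_window_view(bscan.astype(np.float64), window, axis=0)
    depth = np.arange(window, dtype=np.float64)
    depth_centered = depth - depth.mean()
    # slope = sum((x - mean_x) * y) / sum((x - mean_x)^2); the mean of y cancels out.
    return windows @ depth_centered / np.sum(depth_centered ** 2)


# Synthetic check: an intensity ramp rising by 2 per pixel along every A-scan.
ramp = np.tile(2.0 * np.arange(50.0)[:, None], (1, 4))
slopes = slope_map(ramp)
```

With a 10-pixel window, each column of the result is 9 pixels shorter than the input column, since only complete windows are evaluated.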

Figure 3b is based on the slope value document corresponding to the B-scan image in Figure 3a. The slope analysis clearly highlights both the edge information and the internal outlier information. Although the column length is reduced by 9 pixels due to the sliding window operation, this has a negligible effect on the subsequent analysis. The reason is that the upper edge area of the B-scan consists mainly of air and not of a 3D-printed area. Therefore, deleting this section has no impact on the results of the AM area analysis. The sliding window method continues in effect for detecting feature changes in the AM area, assuring that valid information about structural changes in the center area of the image is prioritized.

Before conducting a Z-score outlier analysis for an input document with slope values, it is required to verify that the data is approximately normally distributed. According to the Q-Q plot shown in Figure 3c, the data in the middle can reasonably be described as a normal distribution, but there are deviations at the ends of the data, which show extreme values or hard ends, indicating the presence of excessive outliers or deviations. The Z-score can effectively identify these data points that deviate from normal. Since most of the data follows a normal distribution, using the Z-score in measuring the distance from the mean is a sensible choice to reveal both the central tendency of the data and the abnormal outliers.

The applied Z-score-based evaluation method standardizes the data and uses plus or minus 2 standard deviations as criteria for determining outliers. In this way, outliers in the data can be effectively identified. The results are shown in Figure 3d, with the red markings indicating identified outliers. These outlier regions essentially correspond to the highlighted areas at the edges and inside the image, resulting in an extraction effect. A binary outlier matrix is created as a txt file for subsequent mask creation.
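A minimal sketch of this thresholding step is given below. It assumes the mean and standard deviation are taken over the whole slope map of one B-scan, which the text does not specify; the function name zscore_outliers is hypothetical.

```python
import numpy as np


def zscore_outliers(slopes: np.ndarray, threshold: float = 2.0) -> np.ndarray:
    """Binary matrix: 1 where the slope lies more than `threshold` std devs from the mean."""
    slopes = np.asarray(slopes, dtype=np.float64)
    std = slopes.std()
    if std == 0:
        # A constant map has no outliers by this definition.
        return np.zeros(slopes.shape, dtype=np.uint8)
    z = (slopes - slopes.mean()) / std
    return (np.abs(z) > threshold).astype(np.uint8)


# Synthetic check: a flat slope map with a single strong response.
demo = np.zeros((10, 10))
demo[4, 7] = 100.0
outlier_matrix = zscore_outliers(demo)
```

The resulting 0/1 matrix could be written to a text file with np.savetxt(path, outlier_matrix, fmt="%d") for the later masking steps.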

Since this study addresses the classification of the internal state of printed volumes, a mask is used to clear the surface edges and the surrounding air. Initially, only the upper three-quarters of the outlier image are processed, because this part contains the relevant information about the target. The lower quarter of the image contains irregularly distributed outlier points that are not suitable for image processing. Therefore, for each image file, the upper 75% is used as the basis for subsequent processing. Morphological dilation and erosion operations are performed to highlight the target area and eliminate noise, as shown in Table 3.

Opening performs erosion first, followed by dilation. Its primary function is to remove small, isolated noise points while preserving larger target areas. Closing performs dilation first, followed by erosion. Its primary function is to fill small voids in the target region and connect disconnected sections. The ‘Parameter’ column of Table 3 lists the parameters for erosion and dilation, in the order in which the two operations are applied during opening or closing. Figure 4a–d show the effects of the opening and closing operations on the outlier images. Figure 4a,c show the original images, and Figure 4b,d show the processed results. The opening operation effectively removes the small outliers at both edges, while the closing operation connects the disrupted edges.
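The two composite operations can be sketched in plain NumPy. The square k × k structuring element and its size are assumptions for illustration and do not reproduce the parameters of Table 3. The border handling (pad with set pixels for erosion, with unset pixels for dilation) is also an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _neighborhoods(mask: np.ndarray, k: int, pad_value: bool) -> np.ndarray:
    """All k x k neighborhoods of a binary mask (k odd), shape (rows, cols, k, k)."""
    padded = np.pad(mask.astype(bool), k // 2, constant_values=pad_value)
    return sliding_window_view(padded, (k, k))


def erode(mask: np.ndarray, k: int = 3) -> np.ndarray:
    # A pixel survives only if every pixel in its neighborhood is set.
    return _neighborhoods(mask, k, True).all(axis=(-2, -1))


def dilate(mask: np.ndarray, k: int = 3) -> np.ndarray:
    # A pixel is set if any pixel in its neighborhood is set.
    return _neighborhoods(mask, k, False).any(axis=(-2, -1))


def opening(mask: np.ndarray, k: int = 3) -> np.ndarray:
    """Erosion then dilation: removes isolated specks smaller than the element."""
    return dilate(erode(mask, k), k)


def closing(mask: np.ndarray, k: int = 3) -> np.ndarray:
    """Dilation then erosion: fills small holes and bridges narrow gaps."""
    return erode(dilate(mask, k), k)


block = np.zeros((9, 9), dtype=bool)
block[2:7, 2:7] = True
speckled = block.copy()
speckled[0, 8] = True      # isolated noise pixel
cleaned = opening(speckled)
```

In this toy case, opening removes the isolated pixel and leaves the 5 × 5 block unchanged, and closing would fill a one-pixel hole inside the block.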

In the morphologically processed image, interpolation is used to fill columns that are entirely blank, as shown in Figure 5a. The topmost 1-valued points on both sides of the blank columns are located, and spline interpolation between them creates a smooth boundary line that bridges the blank sections. Since the image has already undergone morphological operations, small outliers do not induce boundary fluctuations during interpolation, ensuring smooth and continuous boundary regions.

This step results in a continuous boundary between the top of the print and the air. The complete image is created by combining the processed upper-three-quarters image with the original lower-quarter image.

Based on the recognizable material edges of each printed piece in the outlier image, an average thickness of 15 pixels can be determined for the boundary area. To remove the effect of the edges of the prints for subsequent analysis, for each pixel column, the first 15 pixels in the column with a value of 1 are set from 1 to 0, effectively removing the surface area on average, and the final binary mask is obtained as shown in Figure 5b. Figure 5c shows the B-scan plot of the original OCT combined with the masks for the detected outliers (red dots) and the mask for distinguishing between the environment and material volume, with the green part marking the inner area of the FFF print.
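Clearing the first 15 set pixels of every column can be done without an explicit loop by ranking the set pixels from the top. This is a sketch; the function name remove_surface_layer is hypothetical, and the mask is assumed to be a 2D array with depth along axis 0.

```python
import numpy as np


def remove_surface_layer(mask: np.ndarray, thickness: int = 15) -> np.ndarray:
    """Set the first `thickness` 1-valued pixels of each column (from the top) to 0."""
    mask = mask.astype(bool)
    # For a set pixel, this is its 1-based rank among the set pixels of its column.
    rank_from_top = np.cumsum(mask, axis=0)
    return (mask & (rank_from_top > thickness)).astype(np.uint8)


demo = np.zeros((40, 3), dtype=np.uint8)
demo[5:, 0] = 1       # material starts at row 5 and continues to the bottom
demo[10:20, 1] = 1    # only 10 set pixels: fully removed
interior = remove_surface_layer(demo)
```

Columns with fewer than 15 set pixels end up empty, and blank columns stay blank.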

The percentage of outliers within the examined section of the printed part is the area of the detected outliers inside the material volume divided by the area of the material volume, i.e., the proportion of the red area within the green area.
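As a minimal sketch, the ratio can be computed from the two binary masks. The function name outlier_percentage and the convention of returning 0 for an empty material mask are assumptions.

```python
import numpy as np


def outlier_percentage(outliers: np.ndarray, material: np.ndarray) -> float:
    """Percentage of material-mask pixels that are also flagged as outliers."""
    material = material.astype(bool)
    inside = np.logical_and(outliers.astype(bool), material)
    area = int(material.sum())
    # Outliers outside the material mask (air, surface strip) are ignored.
    return 100.0 * int(inside.sum()) / area if area else 0.0


material = np.zeros((20, 20), dtype=np.uint8)
material[5:15, 5:15] = 1               # 100 material pixels
outliers = np.zeros_like(material)
outliers[5, 5:10] = 1                  # 5 outliers inside the material
outliers[0, 0:3] = 1                   # 3 outliers in the air region
score = outlier_percentage(outliers, material)
```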

As evident from Figure 6, the percentage and variance of ‘A-09_1_layer’, ‘X5Y4_1_layer’, ‘X5Y4_2_layer’, and ‘X5Y4_3_layer’ are significantly lower than those of the other B-scan image sets, which demonstrates the consistency and stability of these datasets in the overall data.

In particular, the datasets ‘X5Y4_1_layer’, ‘X5Y4_2_layer’, and ‘X5Y4_3_layer’ show lower outlier rates, more stable ratios, and lower variances compared to the other datasets. This suggests that these images are more reliable for determining the boundary between the labels ‘good’ and ‘bad’. However, since ‘X5Y4’ and ‘A-09’ represent two different printing materials, the ‘A-09_1_layer’ dataset is considered in the experiments to ensure the completeness and reliability of the threshold determination. Finally, to establish the criteria for image labeling, the labeling threshold is determined based on the distribution covering 95% of the data from the four B-scan image sets: ‘X5Y4_1_layer’, ‘X5Y4_2_layer’, ‘X5Y4_3_layer’, and ‘A-09_1_layer’. The 95th percentile threshold method is a common choice for image classification and quality control in additive manufacturing. It effectively disregards outliers or noise by averaging the analysis results across the bulk of the data, ensuring that the labeling threshold is more representative and robust. The 95th percentile is used in additive manufacturing to evaluate the effects of powder quality and build orientation, for example, and this method is also frequently employed in image classification to establish classification criteria with high accuracy [56,57].

The analysis results in a value of 0.93 for the 95th percentile. Consequently, images with outlier scores greater than or equal to 0.93 are classified as ‘bad’, while images with scores less than 0.93 are classified as ‘good’. This threshold not only takes into account the differences in the assessment of different datasets, but also different material properties, thus providing the overall most reliable basis for the classification process.
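The threshold rule can be sketched as below. The function names are hypothetical, NumPy's default linear-interpolation percentile is an assumption, and the scores are taken in whatever unit the outlier score is expressed in the study.

```python
import numpy as np


def labeling_threshold(reference_scores, percentile: float = 95.0) -> float:
    """Percentile of the outlier scores pooled from the reference B-scan sets."""
    return float(np.percentile(np.asarray(reference_scores, dtype=float), percentile))


def label_images(scores, threshold: float):
    """'bad' if the outlier score reaches the threshold, otherwise 'good'."""
    return ["bad" if s >= threshold else "good" for s in scores]


# With the threshold reported in the text:
labels = label_images([0.50, 0.93, 1.20], threshold=0.93)
```

Note that a score exactly equal to the threshold is labeled 'bad', matching the "greater than or equal to" rule stated above.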

3.3. Data Preprocessing

3.3.1. Image Preprocessing and Conversion

First, resizing is performed by uniformly reducing all images to 224 × 224 pixels given the required consistency of the size of images introduced into the deep learning model. This size is primarily chosen to accommodate common CNN architectures, such as ResNet, where this size retains sufficient detail without consuming significant memory and computing resources.

Data augmentation is then performed to improve the model’s generalization ability and reduce the risk of overfitting. Augmentation is applied to the training set only: the training images are randomly flipped horizontally, which helps the model learn features that are independent of viewing direction, a reasonable choice for images without a preferred horizontal orientation.

Finally, normalization is performed. In deep learning models, the range of input values has an important influence on training efficiency and convergence speed. The image pixel values are normalized using a mean of 0.1605 and a standard deviation of 0.1056. Normalization helps improve training stability and avoid exploding or vanishing gradients caused by excessively large or small pixel values.
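The flip and normalization steps can be illustrated in NumPy. This sketch assumes that the two constants are dataset statistics applied as (x - mean) / std to pixel values already scaled to [0, 1], and that the flip probability is 0.5; neither detail is stated in the text. In a PyTorch pipeline, torchvision.transforms.RandomHorizontalFlip and torchvision.transforms.Normalize play the corresponding roles.

```python
import numpy as np

MEAN, STD = 0.1605, 0.1056  # values reported in the text


def random_hflip(img: np.ndarray, rng: np.random.Generator, p: float = 0.5) -> np.ndarray:
    """Mirror the image left-right with probability p (training set only)."""
    return img[:, ::-1] if rng.random() < p else img


def normalize(img: np.ndarray) -> np.ndarray:
    """Shift and scale pixel values, assumed to be in [0, 1], by the dataset statistics."""
    return (img.astype(np.float32) - MEAN) / STD


rng = np.random.default_rng(0)
image = rng.random((224, 224))              # stand-in for a resized B-scan
train_input = normalize(random_hflip(image, rng))
eval_input = normalize(image)               # validation/test: normalization only
```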

The results of the preprocessing are shown in Figure 7, from left to right: the Resized plot (Figure 7a), the randomized horizontally flipped plot (Figure 7b), and the Normalized plot (Figure 7c).

3.3.2. Data Transformation

Since two different materials, PA12 and PLA, are involved in this study, the surface structure and scanning properties of these materials can lead to significant differences in the distribution of gray values in the images. In order to avoid the influence of such material differences on the model classification performance, histogram matching (HM) is applied to make the images of different materials more consistent in terms of gray value distribution. Histogram transformation in OCT image processing has been shown to be effective for adapting image features from one material to another, especially when combined with deep learning techniques. With such image transformation techniques, OCT images of different materials can be normalized to similar image features for training in a unified model, improving the accuracy of image classification or other recognition tasks [58,59].

Since most images in the ‘X5Y4’ dataset manufactured from PLA material are labeled as ‘good’, only the images labeled as ‘good’ from the ‘A-09_1_layer’ material dataset manufactured from PA12 material are used to calculate the average cumulative distribution, as shown in Figure 8a. The gray scale histograms of the ‘X5Y4’ image sets made of PLA material are then adjusted to the average cumulative distribution using histogram matching. Figure 8b shows the version before adjustment and Figure 8c shows the image after adjustment. It can be observed that the image processed by histogram matching after adjustment has a gray value distribution that is similar to the average cumulative distribution of the reference images, as shown in Figure 8b, and can be directly utilized for model training and validation.
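Histogram matching against an averaged reference distribution can be sketched as follows. The sketch assumes 8-bit grayscale images (uint8) and a lookup-table mapping based on the cumulative distribution functions; the function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def average_cdf(images) -> np.ndarray:
    """Mean cumulative gray-value distribution over a set of reference images."""
    cdfs = []
    for img in images:
        hist, _ = np.histogram(img, bins=256, range=(0, 256))
        cdfs.append(np.cumsum(hist) / hist.sum())
    return np.mean(cdfs, axis=0)


def match_to_cdf(img: np.ndarray, target_cdf: np.ndarray) -> np.ndarray:
    """Remap gray levels of a uint8 image so that its CDF follows target_cdf."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    source_cdf = np.cumsum(hist) / hist.sum()
    # For each source level, pick the first target level whose CDF reaches the same quantile.
    lut = np.searchsorted(target_cdf, source_cdf, side="left")
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]


reference = [np.full((4, 4), 200, dtype=np.uint8)]        # stand-in for 'good' reference images
source = np.array([[10, 20], [30, 40]], dtype=np.uint8)   # stand-in for an image to adjust
matched = match_to_cdf(source, average_cdf(reference))
```

In this toy case the reference contains a single gray level, so every source pixel is mapped to that level. Real B-scans yield a smooth, monotonic lookup table.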

3.3.3. Dataset Loading and Splitting

Due to the spatial continuity of the B-scan images during the OCT scanning process, there is a high degree of similarity between adjacent B-scan images. To avoid data leakage, which is very likely to occur if the training and test sets contain excessively similar images, a block-based splitting strategy is applied. With this strategy, M consecutive B-scan images are treated as one block, and whole blocks are randomly assigned to the training, validation, and test sets. Images within the same block therefore end up in the same set, which greatly reduces the chance that adjacent or excessively similar images appear in different sets.

Small block-size M: Adjacent images can be split into different datasets, which increases the risk of data leakage and reduces the accuracy of the model evaluation [60];
Large block-size M: Reduced probability of data leakage but prone to excessive differences between training, validation, and test sets, compromising the model’s generalization ability [61].

To ensure the reproducibility and accuracy of the experiments, the following steps are used to divide the dataset:

Determination of the number of blocks: All B-scan images are divided into blocks of M consecutive B-scan images each, which serve as the basic unit of data division.
Division of training, validation, and test data: The division is 7:2:1, with 70% of the data used for training, 20% for validation, and 10% for testing. Augmentation and normalization operations are applied to the training dataset. The validation and test sets retain the original structure of the images and are only normalized, allowing better simulation of the performance in a real-world environment.
Use of random seeds: The random seed 42 is used, which ensures that the results are consistent each time the data is split.
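The steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `block_split`, the use of Python's `random` module, and the rounding of block counts are assumptions, and the 1000 images in the usage line are a hypothetical dataset size.

```python
import random


def block_split(num_images, block_size, ratios=(0.7, 0.2, 0.1), seed=42):
    """Assign whole blocks of consecutive image indices to train/val/test."""
    blocks = [list(range(start, min(start + block_size, num_images)))
              for start in range(0, num_images, block_size)]
    rng = random.Random(seed)      # fixed seed: identical split on every run
    rng.shuffle(blocks)            # blocks, not single images, are shuffled
    n_train = round(ratios[0] * len(blocks))
    n_val = round(ratios[1] * len(blocks))
    parts = {
        "train": blocks[:n_train],
        "val": blocks[n_train:n_train + n_val],
        "test": blocks[n_train + n_val:],
    }
    # Flatten each list of blocks into a list of image indices.
    return {name: [i for block in part for i in block]
            for name, part in parts.items()}


split = block_split(num_images=1000, block_size=10)
```

Because only whole blocks are moved, two images that share a block can never end up in different datasets; images on either side of a block boundary still can, which is why the block size M matters.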

When loading the training data, a random shuffling strategy is used to ensure that the order of the images is different for each training run, so that the model is not dependent on a specific image order.

Table 4 shows the number of samples within training, validation, and test sets for each classification method for each block size category.

3.4. Modeling

3.4.1. Model Architecture

An improved residual unit based on ResNet-V2 [50] is adopted to reduce the number of network layers and increase the width of the network, and the resulting deep residual network model is applied to the classification of B-scan images of 3D-printed parts. The goal is to achieve good image classification performance with limited computing resources by using a shallower residual network, since model inference is intended to run close to the machine for real-time process monitoring, which requires a certain degree of efficiency.

This network, based on the ResNet-V2 architecture, uses the BottleneckV2 module, with the input sizes and depth of the network adjusted according to the binary classification task into ‘good’ and ‘bad’. According to Table 5, the ResNet-V2 network architecture consists of the following modules:

Convolutional Layer: A 6 × 6 convolutional kernel with 64 output channels and a padding of 3 is used for initial feature extraction, followed by batch normalization and ReLU activation to normalize the features and introduce nonlinearity.
Maximum pooling layer: A 2 × 2 pooling kernel is used with a step size of 2 and no padding (padding = 0), reducing the feature map size from 224 × 224 to 112 × 112.
Residual module: Each residual module consists of several BottleneckV2 modules, where N is the number of BottleneckV2 modules contained in each residual module and K is the multiplication factor of the number of channels, which is the width of the network, as shown in Figure 9a. The ‘pre-activation’ method, specifically ‘BN-ReLU-Conv’, is used. The N BottleneckV2 modules in each residual module are linked by residual connections, enabling the input features to be passed directly to the next layer (see Figure 9b,c).
Adaptive Average Pooling: Adaptive Average Pooling is used to reduce the size of the feature maps and thus extract high-level global features.
Fully Connected Layer: A dropout layer is implemented before a fully connected layer, which maps the input features to classification outputs ‘good’ and ‘bad’, converting them into probability distributions by a softmax function.
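Two of the steps in this list reduce to short formulas, sketched below under stated assumptions. The helper `output_size` uses the common floor convention for convolution and pooling layers, `softmax` is the standard definition, and the two logits are hypothetical values chosen only for illustration.

```python
import math


def output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)                      # subtract the maximum for stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# 2 x 2 max pooling with stride 2 and no padding halves a 224-pixel edge.
pooled = output_size(224, kernel=2, stride=2, padding=0)

# Two hypothetical logits for the classes ('good', 'bad').
probs = softmax([2.0, 0.0])
```

The pooling line reproduces the reduction from 224 × 224 to 112 × 112 stated above; the softmax output is what the final layer would report as class probabilities for one image.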

In the evolved residual network model, the residual unit has been modified to reduce the depth of the network while increasing its width. In particular, the total number of layers in the network has been reduced and the width of each layer has been expanded by increasing the number of channels. The total number of layers in the network is 2N + 2, with N representing the number of BottleneckV2 modules per residual block.

The following adjustments to the model structure are considered:

Bottleneck design: The BottleneckV2 module is used with an expansion ratio of 4, reducing the computational load on the middle layer of the network while maintaining strong feature representation capability at deep layers [62];
Multi-layer design: The ResNet-V2 network architecture consists of four residual modules corresponding to 64, 128, 256, and 512 channels. Multiple bottleneck blocks are embedded in each module, allowing the model to extract more abstract, higher-level features layer by layer;
Dropout: A dropout layer is included before the fully connected layer to randomly drop neurons to improve the robustness of the model and avoid the model being overly dependent on the training set [63];
Tuning of K and N: Parameters K and N are not fixed in advance, as they are determined through tuning experiments.

3.4.2. Hyperparameter Configuration

During the training process, default hyperparameters are set to ensure that the ResNet-V2 network converges within a reasonable time frame while achieving optimal performance. In the comparative experiments outlined, only the hyperparameters under comparison are modified, while all other settings remain unchanged to ensure a valid comparison. The specific training configuration is as follows:

Learning Rate: An initial learning rate of 0.001 is used, along with a step decay schedule. The learning rate decay is applied every 5 epochs, with a decay factor of 0.1.
Optimizer: The Stochastic Gradient Descent (SGD) optimizer with momentum is employed, using a momentum value of 0.9 and weight decay of 0.0005.
Batch Size: A batch size of 32 is chosen based on the available memory of the 8 GB RTX 3070 GPU employed.
Dropout: A dropout rate of 0.5 is applied to reduce overfitting.
Early Stopping: Early stopping with a patience of 12 epochs is used to prevent overtraining and ensure the model does not continue training once performance plateaus.
Epochs: The maximum number of epochs is set to 40.
Loss Function: The model is optimized using cross-entropy loss, appropriate for the binary classification task.
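The learning rate schedule and the stopping rule from this configuration can be sketched in a few lines. The sketch assumes 0-based epoch indices, a first decay at epoch index 5, and validation loss as the monitored quantity; none of these details are stated in the text, and the names `step_decay_lr`, `EarlyStopping`, and `run` are hypothetical.

```python
def step_decay_lr(epoch, initial_lr=0.001, step=5, factor=0.1):
    """Learning rate for a 0-based epoch index under step decay."""
    return initial_lr * factor ** (epoch // step)


class EarlyStopping:
    """Signal a stop once the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=12):
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def update(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def run(val_losses, max_epochs=40):
    """Replay a list of validation losses; return the number of epochs trained."""
    stopper = EarlyStopping()
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        lr = step_decay_lr(epoch)   # would be handed to the optimizer here
        if stopper.update(loss):
            return epoch + 1
    return min(len(val_losses), max_epochs)
```

With a patience of 12 and at most 40 epochs, a run whose validation loss stops improving after the second epoch would end after 14 epochs in this sketch.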

4. Results

4.1. Block Size Optimization

Results are primarily evaluated based on accuracy and loss curves in the training and validation sets. Since the goal is to investigate the influence of block size M on results with randomly assigned datasets, no data augmentation is applied to the training set. This is because data augmentation can disrupt the spatial correlation between consecutive B-scan images; for example, through horizontal flipping transformations. Such distortion can affect the accuracy of the experimental results and cause the model’s effective performance on this type of structured data to be misrepresented.

Figure 10a–h show the evolution of accuracy and loss over the course of training for M values of 5, 10, 20, and 40. The test curve is flat, as the test dataset is applied only once, using the fully trained model. Figure 10a,b display the results for M = 5. The accuracy of the validation set consistently remains higher than that of the training set, and the validation loss is consistently lower than that of the training set. The accuracy of the test set is slightly lower than the accuracy of the validation set, but higher than the accuracy of the training set. This phenomenon is counterintuitive for deep-learning training processes.

In contrast, the accuracy and loss trends of the training and validation sets in Figure 10c,d appear to be normal, suggesting that the model with M = 10 shows neither signs of overfitting nor of underfitting. Similarly, the trends shown in Figure 10e,f for M = 20 appear typical for deep learning processes.

Figure 10g,h show the training conditions that result when block size M = 40 is applied. The loss of the validation set increases with the number of epochs, while the accuracy decreases with increasing number of epochs, indicating the occurrence of overfitting. Although a larger block size reduces data leakage, it can also lead to excessive differences between training, validation, and test sets, which has a negative effect on the model’s generalization ability.

In summary, the model performs well with block sizes M = 10 and M = 20, as the loss and accuracy curves for these settings show no apparent indication of data leakage or of excessive feature differences between the datasets. Given the similar results, setting the block size M to 10 provides more flexible data partitioning, reducing the risk of overfitting while increasing data randomness, thereby improving the model’s generalization ability. Therefore, the dataset with block size M = 10 is selected for the subsequent experiments.

4.2. Hyperparameter Comparison

In order to investigate the effect of the multiplication factor K and the number of bottleneck modules N on the network performance, five sets of experiments are designed. The combinations of K and N are shown in Table 6, which also lists accuracy, recall, precision, and F1-score for the different model structures. Particular attention is paid to recall (‘bad’) and precision (‘good’), since misclassifying a ‘bad’ sample as ‘good’ has significant implications for the subsequent printing process in practical applications. Three experiments are conducted for each model structure, and average values of the test sets are provided upon completion.
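The two metrics highlighted here both react to the same kind of error, a ‘bad’ sample predicted as ‘good’. The following sketch uses the standard definitions of recall, precision, and F1-score; the function name `class_metrics` and the eight toy labels are hypothetical and are not taken from Table 6.

```python
def class_metrics(true_labels, predicted_labels, positive):
    """Recall, precision and F1-score with `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "precision": precision, "f1": f1}


# Toy example: one 'bad' sample is missed and one 'good' sample is rejected.
y_true = ["bad", "bad", "bad", "good", "good", "good", "good", "good"]
y_pred = ["bad", "bad", "good", "good", "good", "good", "good", "bad"]

recall_bad = class_metrics(y_true, y_pred, positive="bad")["recall"]
precision_good = class_metrics(y_true, y_pred, positive="good")["precision"]
```

In the toy data the single missed ‘bad’ sample lowers recall (‘bad’) to 2/3 and precision (‘good’) to 4/5, which is why the two metrics are read together.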

Model performance is also evaluated in conjunction with the accuracy evolution curves during training. Analysis of the accuracy curves shows that the accuracy of the validation and test sets at K = 1 and N = 1, shown in Figure 11a, as well as at K = 1 and N = 2, shown in Figure 11b, and further at K = 2 and N = 1, shown in Figure 11c, is significantly above the accuracy of the training set.

This characteristic may result from insufficient model complexity and consequent underfitting for low values of N and K (e.g., K = 1, N = 1 or K = 2, N = 1), as the model lacks sufficient parameters to capture complex training patterns. In addition, data augmentation techniques such as random horizontal mirroring are applied only to the training set, which can further lower training accuracy compared to validation performance.

To ensure the generalization ability of the model, the subsequent comparisons are only performed between the two structures K = 2 and N = 2, and K = 3 and N = 2. The accuracy curves are shown in Figure 11d,e, and the results, in particular the accuracy curves for the training, validation, and test sets, are in line with expectations. The experiments show that the K = 2 and N = 2 structure performs better on several metrics, including recall (‘bad’), precision (‘good’), accuracy, and F1 score. Therefore, the K = 2 and N = 2 structure is chosen as the final architecture for the ResNet-V2 model.

Four different dropout values are tested; the results are presented in Table 7. The model achieves the best accuracy and F1-score at a dropout value of 0.5, whereas recall (‘bad’) and precision (‘good’) improve as the dropout value increases up to 0.8.

Although a dropout rate of 0.8 is commonly considered extremely high, it apparently proves most effective for recall (‘bad’) and precision (‘good’) in this study. This is attributed to significant noise in the OCT images, where a higher dropout rate helps the model ignore noise and focus on extracting more relevant features. Therefore, the dropout value is ultimately set to 0.8.

Two different learning rate values are tested, with the results shown in Table 8. According to the results, all four metrics under consideration are better with the initial learning rate set to 1.0 × 10⁻³ than with the learning rate set to 1.0 × 10⁻⁴. Therefore, 1.0 × 10⁻³ is selected as the initial learning rate.

4.3. Test Performance

The performance of the model and the feature extraction effect are finally examined through the analysis of test results of the optimized ResNet-V2 network using confusion matrix and Grad-CAM heatmap visualization.

Notably, elements of the class ‘good’ are clustered more closely, indicating that the model exhibits a higher degree of distinguishability and consistency in feature extraction for elements of the class ‘good’. This is consistent with the results of the confusion matrix in Figure 12, where the number of misclassifications for elements of the label ‘good’ is significantly lower than for elements of the label ‘bad’.

Although the overall separation between category ‘bad’ and category ‘good’ is pronounced, there are some areas of overlap or proximity between the two categories. In addition, the points in category ‘bad’ appear to be more scattered than those in category ‘good’, and the clustering is relatively loose. This suggests that the model is less stable in extracting features from elements in the class ‘bad’ as compared to extracting features from elements in the class ‘good’, resulting in poorer performance in classifying samples in the class ‘bad’ and a correspondingly higher misclassification rate for samples in the class ‘bad’.

Grad-CAM heatmaps are used to investigate model problems and possible causes in the event of misclassifications. Figure 13a,b show an example of a sample that was in fact labeled as ‘bad’ but classified as ‘good’. The outlier ratio of this sample is 1.03%, which is close to the threshold of 0.93%.

Figure 13c,d illustrate a case in which a sample truly labeled as ‘good’ is misclassified as ‘bad’. The outlier ratio of this sample is 0.15%. Figure 13e,f show another case in which a sample truly labeled as ‘good’ is incorrectly classified as ‘bad’, even though this sample has a true outlier ratio of only 0.05%.
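How the three examples relate to the labeling threshold can be made explicit with a small sketch. It assumes that a cross-section is labeled ‘bad’ when its outlier ratio exceeds the threshold (whether the comparison is strict is not stated here); the function names are hypothetical.

```python
THRESHOLD_PERCENT = 0.93  # labeling threshold quoted in the text


def label_from_outlier_ratio(ratio_percent, threshold=THRESHOLD_PERCENT):
    """Assumed rule: 'bad' when the outlier ratio exceeds the threshold."""
    return "bad" if ratio_percent > threshold else "good"


def distance_to_threshold(ratio_percent, threshold=THRESHOLD_PERCENT):
    """Absolute distance of a sample's outlier ratio from the threshold."""
    return abs(ratio_percent - threshold)


# Outlier ratios (in percent) of the three misclassified samples.
samples = {"Figure 13a,b": 1.03, "Figure 13c,d": 0.15, "Figure 13e,f": 0.05}
labels = {name: label_from_outlier_ratio(r) for name, r in samples.items()}
distances = {name: distance_to_threshold(r) for name, r in samples.items()}
```

Only the first sample lies within about 0.1 percentage points of the threshold, so a borderline label could explain that error but not the other two.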

By comparing the model with the widely used models EfficientNet-B0 and VGG16, the model’s performance in terms of generalization ability, robustness, and computational efficiency is assessed. The hyperparameters for transfer learning are set according to Section 3.4.2 by default, with the dropout rate set to 0.8. The training results are summarized in Table 9.

To conform to the three-channel input format of the pre-trained EfficientNet-B0 and VGG16 architectures, the single-channel images are converted into pseudo-RGB images. For each original grayscale image, the pixel intensities are replicated along the channel dimension to obtain the three-channel input tensor.
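The replication amounts to copying the same intensity plane three times. The sketch below uses nested Python lists and a channel-first layout (3 × H × W); both choices, and the name `to_pseudo_rgb`, are assumptions made for illustration. In an array library this is typically a single repeat or stack operation.

```python
def to_pseudo_rgb(gray_image):
    """Replicate an H x W grayscale image into a 3 x H x W channel-first layout."""
    return [[row[:] for row in gray_image] for _ in range(3)]


# A tiny 2 x 2 grayscale image with hypothetical intensities.
gray = [[0, 128], [255, 64]]
rgb = to_pseudo_rgb(gray)
```

All three channels carry identical values, so no color information is added; the conversion only satisfies the input shape expected by the pre-trained networks.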

5. Discussion

Although the motivation is clearly focused on preparing grounds for in-line process monitoring, the experiments documented in this study are conducted ex situ. The successful integration of OCT sensor heads into printers has already been demonstrated [49,64]. This has shown that neither the spatial dimensions nor the mass of OCT sensor heads constitute an obstacle to their integration into a printer. The energy required or the burden on the environment, for example due to radiation, also do not represent an obstacle. If the measurement is to be carried out in-line without any loss of time, rather than with a time delay inside the printing compartment, then the biggest hurdle will be the accessibility of the beams to the location of the material deposition. This means that a measuring setup must either be positioned directly after the material deposition, meaning after the nozzle, which requires the measuring setup to be able to rotate around the nozzle, or it could be positioned at an angle to the nozzle so that the location of material deposition directly below the nozzle is irradiated via an inclined beam angle.

The binary labels serving as ground truth in the proposed approach are derived from an OCT image-based heuristic using outlier area ratios. This makes the label generation rely on the same OCT-derived features that are later used for training the classification model, which introduces a degree of circularity. Therefore, the performance of the trained classification model only reflects consistency with the adopted labeling strategy rather than absolute physical defect validation. As a consequence, future work must involve complementary validation methods, such as CT.

Evolutions of accuracy and loss, respectively, across training progressions for block size values M of 5, 10, 20, and 40 are investigated. For block size M = 5, the accuracy of the validation set consistently remains higher than that of the training set, and the accuracy of the test set is slightly lower than the accuracy of the validation set, but higher than the accuracy of the training set, which is counterintuitive for training processes.

In a well-behaved training process, the validation metrics closely track the trend observed on the training data. Over the course of training, both training and validation loss decrease and both accuracies increase in a similar manner, indicating that the model is learning informative and transferable patterns rather than memorizing training examples. Validation performance is typically slightly worse than training performance; this difference, often referred to as the generalization gap, remains modest and stable throughout training. A moderate and consistent generalization gap is to be expected and reflects the inherent difference between optimizing a model on the training set and evaluating it on unseen data [65,66]. One possible explanation for the missing generalization gap in the case of block size M = 5 is that the block size is too small, causing similar neighboring images to be split into different datasets and resulting in data leakage.
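The generalization gap described above can be checked directly from logged accuracy curves. The sketch below defines the gap as training accuracy minus validation accuracy per epoch; the function names and the two short curves are hypothetical and do not reproduce the values in Figure 10.

```python
def generalization_gaps(train_acc, val_acc):
    """Per-epoch gap, defined here as training minus validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]


def validation_consistently_higher(train_acc, val_acc):
    """True if validation accuracy exceeds training accuracy in every epoch."""
    return all(gap < 0 for gap in generalization_gaps(train_acc, val_acc))


# Hypothetical curves: an expected pattern and a suspicious one.
expected = validation_consistently_higher([0.80, 0.88, 0.93], [0.78, 0.85, 0.90])
suspicious = validation_consistently_higher([0.80, 0.88, 0.93], [0.85, 0.92, 0.96])
```

A gap that stays negative across all epochs, as in the second curve pair, is the pattern that the text associates with possible leakage at M = 5.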

The effect of five sets of multiplication factor K and number of bottleneck modules N on model performance is investigated. For K = 1 and N = 1, K = 1 and N = 2, and K = 2 and N = 1, the accuracy of validation and test sets is significantly above the accuracy of the training set. It is possible that the reason for this phenomenon is a lack of complexity in the model, meaning that it is underfitted. If N and K are low, as for K = 1 and N = 1 or K = 2 and N = 1, for example, the model does not have enough parameters to capture the complex patterns in the training data, resulting in the underfitting of the training set. The accuracy of the validation set is relatively high, as the examples in the validation set can be relatively simple and the oversimplification of the model does not have as strong a negative impact on the examples in the validation set. Additionally, data augmentation methods such as random horizontal mirroring are applied to the training set, which can result in lower training accuracy compared to the validation set.

Four dropout values are evaluated (Table 7). The model achieves the highest accuracy and F1-score at a dropout of 0.5, while recall (‘bad’) and precision (‘good’) improve with increasing dropout up to 0.8. In general, a higher dropout value is effective in improving the model’s generalization ability; nevertheless, a dropout rate of 0.8 is commonly considered extremely high [63]. Such a high dropout rate can be justified in specific scenarios in which an aggressive regularization strategy is required [67]. A relatively small training dataset is likely to lead to severe overfitting. If the model is unable to generalize, a high dropout rate can be applied to severely limit the effective capacity of the network and force it to learn more robust, distributed representations. Ideally, a high dropout is applied exclusively to fully connected classifier heads and not to convolutional layers. Fully connected layers often contain a disproportionate number of parameters and are particularly prone to overfitting. In these cases, a higher dropout rate acts as structural regularization, preventing the co-adaptation of neurons and improving generalization, while leaving earlier convolutional feature extractors largely untouched [68]. An edge case occurs when dropout is intentionally used to approximate ensemble-like behavior. High dropout can effectively serve as implicit training of a large number of subnetworks whose predictions are averaged at inference time. Thus, a higher dropout rate can be used to improve robustness and reduce variance [63]. However, it is worth considering applying dropout values of up to 0.8 selectively, in the final layers of a network rather than throughout the entire architecture [69].
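What a dropout rate of 0.8 means in practice can be seen in a minimal sketch of inverted dropout, the variant commonly used in deep learning frameworks. Whether the study's implementation rescales in exactly this way is an assumption; the function name `dropout` and the constant input vector are hypothetical.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Inverted dropout: zero each value with probability `rate`, rescale survivors."""
    if not training or rate == 0.0:
        return list(values)           # inference: activations pass unchanged
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


# With rate 0.8, roughly one in five activations survives and is scaled by 5.
activations = [1.0] * 100000
dropped = dropout(activations, rate=0.8, rng=random.Random(0))
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
```

Each training step therefore sees only about 20% of the features entering the fully connected layer, which illustrates why such a rate strongly limits the effective capacity of the classifier head.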

The optimized model delivers results based on test data with an accuracy of 0.9446, a recall (‘bad’) of 0.9227, and a precision (‘good’) of 0.9175. An overall accuracy of 0.9446 for the test data indicates that the model effectively generalizes to unknown samples and correctly classifies the vast majority of images. The recall value of 0.9227 for the class ‘bad’ shows that more than 92% of the samples that are truly defective are correctly identified. This is particularly important in quality assurance or safety-critical applications, where missing a defective item can have significant consequences. The high recall value therefore shows that the model is effective in minimizing false negatives for the critical class. At the same time, a precision of 0.9175 for the class ‘good’ shows that the model is correct in the vast majority of cases when it predicts a sample as non-defective. This suggests that the classifier does not excessively penalize normal samples and maintains an appropriate balance between rejecting defective items and correctly accepting good items. These metrics are internally consistent and suggest that the classifier achieves a favorable compromise between sensitivity and reliability. Overall, the reported performance can be considered strong and credible for an applied image classification task and would generally be suitable for further validation or deployment testing.

Grad-CAM heatmaps shown in Figure 13 illustrate causes of model problems and misclassifications. Figure 13a shows that the sample has extra noise in the vertical direction, which may have affected the model’s assessment. The Grad-CAM heatmap shown in Figure 13b shows that the red areas of the model focus are mainly concentrated on the areas of severe noise rather than on the internal structure of the 3D-printed part. This suggests that the model is not sufficiently robust against such signal noise and is prone to being influenced by external noise signals, leading to misclassifications.

In Figure 13d, the Grad-CAM heatmap shows that the red areas of high attention are mainly concentrated upon the edges towards the upper air periphery area of the 3D-printed part, thus deviating from the critical structural areas of the material interior and leading to a misclassification of the model. This suggests that during the processing of such samples, the model is not properly focused on the relevant features of the 3D-printed part, but is instead distracted by irrelevant areas.

In the case of misclassification illustrated in Figure 13e,f, the image was first preprocessed by histogram matching. It can be observed that after histogram matching, dense white areas appear on both sides of Figure 13e. This phenomenon is also reflected in the Grad-CAM visualization in Figure 13f, where the model’s attention is drawn to these white areas, leading to misclassification. Although most histogram-matched images are correctly classified, this example shows that this technique is subject to a degree of residual uncertainty, which in certain cases can lead to noise or interference and compromise the classification performance of the model.

Although the developed model achieves an overall accuracy of 94%, the misclassifications reveal areas where the model’s performance can be further improved. In particular, the model is sensitive to noise, edge effects, and preprocessing techniques such as histogram matching, which in certain cases can shift the model’s focus to irrelevant areas. While these issues have only a marginal impact on the overall classification performance, addressing them is expected to improve robustness and accuracy.

A comparison of the results shows that ResNet-V2 outperforms EfficientNet-B0 and VGG16 in every metric. Furthermore, the ResNet-V2 model developed in this study has approximately 3.5 million parameters, while EfficientNet-B0, which is used for binary classification tasks, has approximately 5.3 million parameters, and VGG16 has over 138 million parameters. It is worth noting that ResNet-V2 achieves better performance than the EfficientNet-B0 and VGG16 models used in the comparison, with far fewer parameters. This suggests that using ResNet-V2 significantly improves computational efficiency and reduces memory requirements while maintaining model accuracy. This makes ResNet-V2 the better choice for binary classification tasks in practical applications, especially in environments with limited computing resources.
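The parameter counts translate into approximate weight storage as follows. The sketch assumes 32-bit floating-point weights (4 bytes per parameter) and counts only the weights, not activations or optimizer state; the function name `weight_memory_mb` is hypothetical.

```python
# Approximate parameter counts as quoted in the text.
PARAMS = {
    "ResNet-V2 (this study)": 3.5e6,
    "EfficientNet-B0": 5.3e6,
    "VGG16": 138e6,
}
BYTES_PER_PARAM = 4  # assumption: float32 weights


def weight_memory_mb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Storage needed for the weights alone, in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


memory_mb = {name: weight_memory_mb(n) for name, n in PARAMS.items()}
vgg_ratio = PARAMS["VGG16"] / PARAMS["ResNet-V2 (this study)"]
```

Under these assumptions the weights occupy about 14 MB, 21 MB, and 552 MB respectively, and VGG16 holds roughly 39 times as many parameters as the model developed in this study.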

Since the experiments are performed using two materials and a single printer setup, the demonstrated performance is strongly dependent on the materials, equipment, and process parameters used. As far as the printing equipment used allows integration of an OCT sensor in terms of design and kinematics, transferring the approach examined to other equipment would probably not have to be considered critical. A transfer of this approach to other printing materials is certainly the greatest challenge in disseminating the approach demonstrated. In order to detect defects below the surface, the light used by OCT must be able to penetrate the material to the relevant depths. Commercially available systems operate with wavelengths in the range of 800 nm to 1500 nm for the central wavelength of the spectrum used. Depending on the transparency of the feedstock used and the thickness of the layer produced during the manufacturing process, the number of layers made visible by OCT can vary from a single layer to several layers [49].

The study addresses a classification into ‘good’ and ‘bad’ as expression of overall internal quality instead of explicitly distinguishing between different defect types as result of specific defect mechanisms. Defect types in FFF are numerous and can be attributed to both material properties and unsuitable printing process parameters [70]. Since this study’s approach involves using OCT scans as source data for defect detection, the main focus in the future will primarily remain on identifying internal defects, in particular delamination, cracks, gaps, and voids. The term ‘gaps’ in this regard refers not only to gaps in the material, but also to other types of defects such as under-extrusion and cavities, which can be grouped into a single category, as they are expected to have a similar appearance in OCT scans, since they involve local missing material. A more precise differentiation of the causes of delamination, cracks, gaps, and voids is not expected to be reliably possible due to the limited resolution of OCT. In any case, molecular defects are not expected to be detectable due to this resolution limit.

Model training involves stochastic processes that may affect performance. Experiments in this study were conducted using a single training run per configuration. Performance variance across multiple runs is therefore not explicitly quantified, which constitutes a limitation of this study. The reported results are nevertheless considered sufficient to support the comparative and methodological conclusions of this work.

6. Conclusions

CNNs based on ResNet-V2 are used for the classification of tomographic cross-sections. A dataset of 8135 OCT images passes through semi-automatic labeling, preprocessing, model training, and evaluation. A sliding window identifies outlier regions in the tomographic cross-sections, while masks suppress peripheral noise, enabling label generation based on outlier ratios. Data are split using block-based partitioning to limit leakage.

It is confirmed that a combination of width multiplication factor K = 2 and number of bottleneck modules N = 2 exhibits superior performance across various metrics, particularly in recall (‘bad’) and precision (‘good’), which is of particular importance in the detection of defective printed parts. In addition, other hyperparameters including dropout and learning rate are compared and tested, eventually determining that a dropout of 0.5 or 0.8, depending on the evaluation metrics emphasized, and an initial learning rate of 0.001 are effective for achieving optimum generalization ability and classification accuracy of the model. The optimized model delivers results based on test data with an accuracy of 0.9446, a recall (‘bad’) of 0.9227, and a precision (‘good’) of 0.9175. The effectiveness of the ResNet-V2 model is verified by conducting comparative experiments with alternative state-of-the-art CNN models, including EfficientNet-B0 and VGG16, demonstrating that the custom ResNet-V2 model outperforms the other models in terms of classification accuracy.

In future research and applications, the types of materials and the total number of image sets are to be extended to grant the model better generalization ability and robustness, and different classes of defects shall be considered.

Author Contributions

Conceptualization, V.L. and M.K.-M.; methodology, V.L., Q.Z. and M.K.-M.; software, Q.Z.; validation, Q.Z. and M.K.-M.; formal analysis, Q.Z. and M.K.-M.; investigation, M.K.-M.; resources, V.L. and M.K.-M.; data curation, Q.Z. and M.K.-M.; writing—original draft preparation, V.L. and Q.Z.; writing—review and editing, V.L.; visualization, V.L. and Q.Z.; supervision, S.I.; project administration, V.L.; and funding acquisition, V.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Federal Republic of Germany’s Ministry Bundesministerium für Wirtschaft und Klimaschutz (BMWK)/Federal Ministry for Economic Affairs and Climate Action, grant number 16KN 112726.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
AM	Additive Manufacturing
ANN	Artificial Neural Network
CNN	Convolutional Neural Network
CT	Computed Tomography
DL	Deep Learning
FDM	Fused Deposition Modeling
FFF	Fused Filament Fabrication
HM	Histogram Matching
ML	Machine Learning
OCT	Optical Coherence Tomography
PC	Polycarbonate
PLA	Polylactic Acid

Figure 1. Proposed processing pipeline for label generation, preprocessing, model training, validation and evaluation.

Figure 2. (a) Principle of optical coherence tomography (OCT), (b) derivation of tomograms on the basis of measured light intensities, and (c) correlation of measurement beam or 1D tomogram (A-scan), aggregation to 2D tomogram (B-scan), and aggregation to 3D tomogram (C-scan).

Figure 3. (a) OCT B-scan image as a composite of consecutive A-scans, (b) slope value document corresponding to the B-scan, (c) Q-Q plot corresponding to the B-scan, and (d) resulting outlier map with red markings indicating identified outliers corresponding to the B-scan.

Figure 4. Upper three quarters of binary outlier maps: (a) original image of ‘A-09_2’, (b) corresponding map after morphological Closing, (c) original image of ‘T-16_1’, and (d) corresponding map after morphological Opening.

Figure 5. (a) Spline interpolation, (b) binary mask separating background (blue) and material volume (green), and (c) outlier image with red outliers and binary mask separating background and material volume.

Figure 6. Violin plot of outlier area ratios.
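
The outlier area ratio summarized in Figure 6 can be illustrated with a short sketch. It assumes that the outlier map and the material mask are equally sized binary grids and that a B-scan is labeled ‘bad’ once the ratio exceeds a cut-off. The function names and the threshold of 0.5 are hypothetical and are not taken from the article.

```python
def outlier_area_ratio(outlier_map, material_mask):
    """Share of material pixels that are flagged as outliers."""
    material = sum(sum(row) for row in material_mask)
    if material == 0:
        return 0.0
    # Count only outliers inside the mask; background pixels are ignored.
    hits = sum(
        1
        for o_row, m_row in zip(outlier_map, material_mask)
        for o, m in zip(o_row, m_row)
        if o and m
    )
    return hits / material


def label_from_ratio(ratio, threshold):
    """Hypothetical labeling rule: above the threshold counts as 'bad'."""
    return "bad" if ratio > threshold else "good"


# Toy 2 x 3 example (1 = outlier / material, 0 = otherwise).
outliers = [[0, 1, 1],
            [0, 0, 1]]
mask = [[0, 1, 1],
        [0, 1, 1]]
ratio = outlier_area_ratio(outliers, mask)
print(ratio, label_from_ratio(ratio, threshold=0.5))
```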

Figure 7. Example of image preprocessing of the training dataset: (a) resized, (b) flipped, and (c) normalized.

Figure 8. (a) Average cumulative distribution of the ‘good’ label in the ‘A-09_1_layer’, (b) example of a reference image from the ‘A-09_1_layer’, (c) original ‘X5Y4’ image without histogram matching, and (d) ‘X5Y4’ image after histogram matching.
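
Histogram matching, as shown in Figure 8, maps each gray level of an image to the reference gray level with the same cumulative probability. The following is a minimal sketch of that standard technique in plain Python, assuming integer gray values and a reference supplied as a cumulative distribution. The function names and the four-level toy data are illustrative assumptions, not the authors' implementation.

```python
from bisect import bisect_left


def cumulative_distribution(values, levels):
    """Empirical CDF of integer gray values in the range 0..levels-1."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / len(values))
    return cdf


def match_histogram(source, reference_cdf):
    """Remap source gray values so their CDF follows reference_cdf."""
    levels = len(reference_cdf)
    source_cdf = cumulative_distribution(source, levels)
    # For every source level, pick the first reference level whose
    # cumulative probability reaches that of the source level.
    lookup = [min(bisect_left(reference_cdf, p), levels - 1) for p in source_cdf]
    return [lookup[v] for v in source]


# Toy example: a dark image is matched to a brighter reference.
reference_cdf = cumulative_distribution([2, 2, 3, 3], levels=4)
print(match_histogram([0, 0, 1, 1], reference_cdf))
```

A reference averaged over several images, as in panel (a), could be passed in the same way as long as it is non-decreasing and ends at 1.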

Figure 9. (a) Full pre-activation residual unit, (b) network architecture ResNet-V2 with number of bottlenecks N = 1, and (c) network architecture ResNet-V2 with number of bottlenecks N = 2.

Figure 10. Training plots with model parameters set as N = 2 and K = 2 and varying block sizes M: (a) loss for M = 5, (b) accuracy for M = 5, (c) loss for M = 10, (d) accuracy for M = 10, (e) loss for M = 20, (f) accuracy for M = 20, (g) loss for M = 40, and (h) accuracy for M = 40.

Figure 11. Accuracy plots with varying parameters of multiplication factor K and number of bottleneck modules N: (a) K = 1 & N = 1, (b) K = 1 & N = 2, (c) K = 2 & N = 1, (d) K = 2 & N = 2, and (e) K = 3 & N = 2.

Figure 12. Confusion matrix of the test data.

Figure 13. Misclassifications analysed via Grad-CAM: (a,b) PA12 sample ‘A-09_2’ of true class ‘bad’ predicted as ‘good’, (c,d) PA12 sample ‘A-09_1’ of true class ‘good’ predicted as ‘bad’, and (e,f) PLA sample ‘X5Y4_2’ of true class ‘good’ predicted as ‘bad’.

Table 1. Parameters of the printing experiments yielding the samples for the tomographic investigation.

Parameter	Unit	Information
Printer model	-	Raise3D Pro3
Filament type	-	Raise3D PLA & Raise3D PC
Nozzle diameter	mm	0.4
Layer height	mm	0.2
Line width	mm	0.4
Nozzle temperature	°C	PLA: 205 & PC: 265
Bed temperature	°C	55
Printing speed	mm/s	60
Flow rate	%	100
Cooling	%	100
Number of layers	-	Raft + 1 layer/2 layers/3 layers

Table 2. Structure of the initial dataset consisting of OCT scans of various FFF-printed samples.

Sample	Number of B-Scans	Format of B-Scans	Sample Material
`A-09_1_layer`	1000	260 × 131	PA12
`A-09_2_layer`	1000	317 × 158	PA12
`A-09_3_layer`	1000	281 × 177	PA12
`T-16_1_layer`	999	295 × 138	PA12
`T-16_2_layer`	999	305 × 174	PA12
`T-16_3_layer`	999	271 × 220	PA12
`X5Y4_1_layer`	714	295 × 140	Raise3D PLA Red
`X5Y4_2_layer`	714	303 × 150	Raise3D PLA Red
`X5Y4_3_layer`	711	305 × 187	Raise3D PLA Red
Image sum	8135

Table 3. Morphological Treatment used for OCT outlier binary images.

Sample	Morphological Treatment	Parameter
`A-09_1_layer`	Opening	(1, 10); (2, 20)
`A-09_2_layer`	Closing	(1, 40); (2, 40)
`A-09_3_layer`	Closing	(1, 40); (2, 40)
`T-16_1_layer`	Closing	(1, 10); (2, 20)
`T-16_2_layer`	Opening	(1, 10); (2, 20)
`T-16_3_layer`	Opening	(1, 40); (2, 40)
`X5Y4_1_layer`	Opening	(1, 10); (2, 20)
`X5Y4_2_layer`	Opening	(1, 10); (2, 20)
`X5Y4_3_layer`	Opening	(1, 5); (2, 5)
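
Opening (erosion followed by dilation) removes foreground structures narrower than the structuring element, whereas Closing (dilation followed by erosion) fills gaps narrower than it. A minimal plain-Python sketch of both operations on a binary map follows. The rectangular structuring element, its 1 × 3 size, the border handling, and the function names are assumptions for illustration and do not reproduce the parameters listed in Table 3.

```python
def _filter(img, kh, kw, reducer, pad):
    """Slide a kh x kw window over a binary image and reduce each window."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[rr][cc] if 0 <= rr < h and 0 <= cc < w else pad
                for rr in range(r - kh // 2, r - kh // 2 + kh)
                for cc in range(c - kw // 2, c - kw // 2 + kw)
            ]
            row.append(reducer(window))
        out.append(row)
    return out


def erode(img, kh, kw):
    # Pixels outside the image count as foreground, so borders are not eroded.
    return _filter(img, kh, kw, min, pad=1)


def dilate(img, kh, kw):
    # Pixels outside the image count as background.
    return _filter(img, kh, kw, max, pad=0)


def opening(img, kh, kw):
    return dilate(erode(img, kh, kw), kh, kw)


def closing(img, kh, kw):
    return erode(dilate(img, kh, kw), kh, kw)


speckle = [[0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0]]  # isolated pixel plus a 3-pixel run
gap = [[0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0]]      # one-pixel hole inside a run
print(opening(speckle, 1, 3))  # isolated pixel removed, run kept
print(closing(gap, 1, 3))      # hole filled, outer extent unchanged
```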

Table 4. Dataset split for different values of block size M in training, validation, and test data.

M	Label	Train	Validation	Test
5	Good	2612	745	380
	Bad	3078	880	440
	Total	5690	1625	820
10	Good	2607	740	390
	Bad	3080	878	440
	Total	5687	1618	830
20	Good	2607	740	390
	Bad	3080	878	440
	Total	5687	1618	830
40	Good	2577	720	440
	Bad	3080	880	438
	Total	5657	1600	878
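
Block-based partitioning keeps runs of M consecutive B-scans together so that near-identical neighbouring slices do not end up on both sides of a split. The following is a minimal sketch, assuming that whole blocks are shuffled and distributed in a roughly 70/20/10 ratio, which matches the proportions visible in Table 4. The function name, the seed, and the shuffling strategy are hypothetical, and the article's exact assignment rule may differ.

```python
import random


def block_split(n_images, block_size, fractions=(0.7, 0.2, 0.1), seed=0):
    """Assign whole blocks of consecutive image indices to three subsets."""
    # Group consecutive indices into blocks of at most block_size images.
    blocks = [
        list(range(start, min(start + block_size, n_images)))
        for start in range(0, n_images, block_size)
    ]
    rng = random.Random(seed)
    rng.shuffle(blocks)

    n_train = round(fractions[0] * len(blocks))
    n_val = round(fractions[1] * len(blocks))
    parts = {
        "train": blocks[:n_train],
        "validation": blocks[n_train:n_train + n_val],
        "test": blocks[n_train + n_val:],
    }
    # Flatten each subset back into a sorted list of image indices.
    return {
        name: sorted(i for block in part for i in block)
        for name, part in parts.items()
    }


splits = block_split(n_images=1000, block_size=10)
print({name: len(indices) for name, indices in splits.items()})
```

Because assignment happens per block, a larger M reduces the number of block boundaries at which adjacent slices can fall into different subsets, at the cost of a coarser split.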

Table 5. Architecture of a custom model based on ResNet-V2.

Group Name	Output Size	Depth
`Convolutional Layer`	224 × 224	6 × 6, 64
`Max-pooling`	112 × 112	2 × 2, stride 2
`BottleneckV2_1`	112 × 112	$(\begin{matrix} 1 \times 1 & 64 \times K \\ 3 \times 3 & 64 \times K \\ 1 \times 1 & 256 \times K \end{matrix}) \times N$
`BottleneckV2_2`	56 × 56	$(\begin{matrix} 1 \times 1 & 128 \times K \\ 3 \times 3 & 128 \times K \\ 1 \times 1 & 512 \times K \end{matrix}) \times N$
`BottleneckV2_3`	28 × 28	$(\begin{matrix} 1 \times 1 & 256 \times K \\ 3 \times 3 & 256 \times K \\ 1 \times 1 & 1024 \times K \end{matrix}) \times N$
`BottleneckV2_4`	14 × 14	$(\begin{matrix} 1 \times 1 & 512 \times K \\ 3 \times 3 & 512 \times K \\ 1 \times 1 & 2048 \times K \end{matrix}) \times N$
`Avg-pooling`	1 × 1	2048 × K, softmax
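
The widths in Table 5 scale linearly with the multiplication factor K, and the depth grows with the number of bottleneck modules N. The sketch below tabulates the (kernel size, filters) triplets per stage and counts layers under the assumption that the total comprises one stem convolution, three convolutions per bottleneck module, and one final classification layer. This counting rule is an inference that reproduces the totals of 14 (N = 1) and 26 (N = 2) listed in Table 6, and the function names are hypothetical.

```python
BASE_WIDTHS = (64, 128, 256, 512)  # base filter counts of the four stages


def stage_channels(k):
    """(kernel size, filters) of the three convolutions in each stage."""
    return [[(1, w * k), (3, w * k), (1, 4 * w * k)] for w in BASE_WIDTHS]


def total_layers(n):
    """Assumed count: stem conv + 3 convs per bottleneck + classifier."""
    return 1 + 3 * len(BASE_WIDTHS) * n + 1


for k, n in [(1, 1), (2, 1), (1, 2), (2, 2), (3, 2)]:
    widest = stage_channels(k)[-1][-1][1]
    print(f"K={k}, N={n}: {total_layers(n)} layers, final width {widest}")
```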

Table 6. Model performance comparison with different architecture values of multiplication factor K and number of bottleneck modules N.

Configuration	Total Layers	Accuracy	Recall (‘Bad’)	Precision (‘Good’)	F1
K = 1, N = 1	14	0.9470	0.9212	0.9166	0.9470
K = 2, N = 1	14	0.9462	0.9280	0.9238	0.9462
K = 1, N = 2	26	0.9458	0.9295	0.9225	0.9458
K = 2, N = 2	26	0.9442	0.9242	0.9188	0.9442
K = 3, N = 2	26	0.9430	0.9212	0.9159	0.9430
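
The metrics reported in Tables 6–9 can be derived from a two-class confusion matrix. A minimal sketch with made-up counts follows. It also shows that a micro-averaged F1 score equals accuracy for single-label classification, which would be consistent with the identical Accuracy and F1 columns, although the averaging mode used by the authors is an assumption here.

```python
def binary_metrics(conf):
    """Compute metrics from conf[true_label][predicted_label] counts."""
    classes = ("good", "bad")
    total = sum(conf[t][p] for t in classes for p in classes)
    correct = sum(conf[c][c] for c in classes)
    errors = total - correct

    accuracy = correct / total
    # Recall for 'bad': share of truly bad samples that were detected.
    recall_bad = conf["bad"]["bad"] / (conf["bad"]["bad"] + conf["bad"]["good"])
    # Precision for 'good': share of 'good' predictions that are truly good.
    precision_good = conf["good"]["good"] / (
        conf["good"]["good"] + conf["bad"]["good"]
    )
    # Micro-averaging pools counts over both classes: every error is one
    # false positive (predicted class) and one false negative (true class).
    micro_f1 = 2 * correct / (2 * correct + errors + errors)
    return {
        "accuracy": accuracy,
        "recall_bad": recall_bad,
        "precision_good": precision_good,
        "micro_f1": micro_f1,
    }


# Hypothetical counts, not taken from the article.
example = {"good": {"good": 90, "bad": 10}, "bad": {"good": 5, "bad": 95}}
metrics = binary_metrics(example)
print(metrics)
```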

Table 7. Model performance comparison with different dropout values.

Dropout	Accuracy	Recall (‘Bad’)	Precision (‘Good’)	F1
0.0	0.9414	0.9174	0.9124	0.9414
0.2	0.9406	0.9212	0.9154	0.9406
0.5	0.9442	0.9242	0.9188	0.9442
0.8	0.9434	0.9303	0.9242	0.9434

Table 8. Model performance comparison with different initial learning rates.

Learning Rate	Accuracy	Recall (‘Bad’)	Precision (‘Good’)	F1
1.0 × 10⁻³	0.9434	0.9303	0.9242	0.9434
1.0 × 10⁻⁴	0.9305	0.9152	0.9083	0.9305

Table 9. Comparison with comparable convolutional neural networks.

Model	Accuracy	Recall (‘Bad’)	Precision (‘Good’)	F1
ResNet-V2	0.9434	0.9303	0.9242	0.9434
EfficientNet-B0	0.9285	0.9242	0.9162	0.9285
VGG16	0.9301	0.9227	0.9149	0.9301

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the International Institute of Knowledge Innovation and Invention. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

