Article

Classification of Double-Bottom U-Shaped Weld Joints Using Synthetic Images and Image Splitting †

by Gyeonghoon Kang and Namkug Ku *
Department of Marine Design Convergence Engineering, Pukyong National University, Busan 48513, Republic of Korea
*
Author to whom correspondence should be addressed.
This article is a revised and substantially expanded version of a conference presentation/abstract entitled Classification of Ship Double-Bottom Weld Joints Using Synthetic Images. In Proceedings of the 2025 Fall Academic Conference and General Assembly of the Korean Society of Ocean Engineers, Jeju, Republic of Korea, 29–31 October 2025.
J. Mar. Sci. Eng. 2026, 14(2), 224; https://doi.org/10.3390/jmse14020224
Submission received: 16 December 2025 / Revised: 10 January 2026 / Accepted: 19 January 2026 / Published: 21 January 2026
(This article belongs to the Section Ocean Engineering)

Abstract

The shipbuilding industry relies heavily on welding, which accounts for approximately 70% of the overall production process. However, the recent decline in skilled workers, together with rising labor costs, has accelerated the automation of shipbuilding operations. In particular, welding activities are concentrated in the double-bottom region of ships, where collaborative robots are increasingly being introduced to alleviate workforce shortages. Because these robots must directly recognize U-shaped weld joints, this study proposes an image-based classification system capable of automatically identifying and classifying such joints. In double-bottom structures, U-shaped weld joints can be categorized into 176 types according to combinations of collar plate type, slot, watertight feature, and girder. To distinguish these types, deep learning-based image recognition is employed. To construct a large-scale training dataset, 3D Computer-Aided Design (CAD) models were automatically generated using Open Cascade and subsequently rendered to produce synthetic images. Furthermore, to improve classification performance, the input images were split into left, right, upper, and lower regions for both training and inference, and the class definitions for each region were simplified based on the presence or absence of key features. Consequently, classification accuracy was significantly improved compared with an approach using non-split images.

1. Introduction

1.1. Background

The shipbuilding industry is increasingly adopting artificial intelligence- and robotics-based process automation to address the decline in skilled labor and rising labor costs. In particular, as welding processes account for approximately 70% of overall operations, welding automation technologies are becoming increasingly important. However, ship structures consist of complex geometries formed by large steel plates, and production is typically characterized by small-batch and customized manufacturing. As a result, the implementation of automation in shipbuilding is significantly more challenging than in mass-production and highly standardized industries such as the automotive or semiconductor sectors.
Among ship structures, the U-shaped weld joints in the double-bottom region are representative areas where welding operations are highly concentrated, and their shapes and dimensions vary even within the same vessel. As shown in Figure 1, the intersection between the web frame and the longitudinal stiffener is defined as a U-shaped weld joint, and various configurations of U-shaped weld joints are illustrated in Figure 2. The welding of U-shaped weld joints is increasingly being automated using collaborative robots. However, in conventional robot-assisted welding, the weld joint is first visually inspected by a human operator, and its geometric information is manually provided to the robot. In this study, we aim to develop a classification system that enables a robot to recognize and classify U-shaped weld joints using camera-based perception without human intervention, thereby allowing collaborative robots to perform automatic welding more efficiently based on the classified joint types.

1.2. Objectives of the Research

U-shaped weld joints in a ship’s double-bottom structure can be classified into 176 distinct types based on combinations of their structural features. For automated welding using collaborative robots, accurate recognition of these U-shaped weld joints is essential. Accordingly, this study aims to develop an image-based automatic classification method for U-shaped weld joints. The application of image recognition techniques requires a large number of high-quality images. However, acquiring diverse real-world images of weld joints in shipyard environments is challenging. To address this issue, 3D Computer-Aided Design (CAD) models of U-shaped weld joints were automatically generated using Open Cascade and subsequently processed using Blender. By varying lighting conditions, camera angles, and surface textures, realistic welding environments were simulated, enabling the generation of synthetic images of U-shaped weld joints.

1.3. Previous Research

Shipyards present significant challenges for collecting visual data due to the complex arrangement of large structures, confined workspaces, and environmental variations such as lighting changes, reflections, and occlusions. In addition, the geometries of ship structures vary depending on component types and arrangement configurations, making it difficult to secure sufficiently large datasets for shape classification. To overcome these limitations, recent studies have adopted approaches that utilize CAD models and digital twins to generate structural geometries and supplement insufficient real data with synthetic data for training deep learning-based classification and detection models. Kim et al. [2] addressed the difficulty of capturing real images of U-shaped weld joints in shipyard environments by generating digital twin-based synthetic images. By adjusting rendering factors such as texture noise, illumination variations, additional light sources, and camera angles, they simulated real-world imaging conditions and applied the generated images to You Only Look Once (YOLO)v8, Faster Region-based Convolutional Neural Network (R-CNN), and Cascade R-CNN, achieving mean Average Precision (mAP)@50 of 85.6%, mAP@[0.5:0.95] of 64.4%, and an average Intersection over Union (IoU) of 88%.
Chon et al. [3] trained a Convolutional Neural Network (CNN)-based classification model for hull block number identification using multi-view image sets derived from 3D CAD data. Synthetic images were generated by projecting hull block CAD models from multiple viewpoints, while real photographs of 3D-printed block models were used for testing. The results demonstrated that models trained solely on CAD-based synthetic images achieved high classification accuracy on real images, highlighting the feasibility of synthetic data-driven hull block classification.
Na et al. [4] trained a ResNet-50 classification model using vector-based CAD geometry representations for ship structure nesting tasks. Their dataset consisted of 23,201 structural components, and symmetry-based augmentation strategies such as horizontal and vertical flipping were applied to simulate variations in placement orientation. This approach achieved a classification accuracy of approximately 98%.
Alongside research on geometric classification and structural understanding of ship structures, studies on weld quality recognition and weld seam recognition in real industrial environments have also been actively conducted. Palma-Ramírez et al. [5] performed transfer learning based on a ResNet-50 model for four classes—crack, porosity, lack of fusion, and no defect—appearing in radiographic images. They improved generalization performance by applying data augmentation, normalization, and 5-fold stratified cross-validation, and verified generalizability using public datasets (RIAWELC and GDXray) and a low-quality real-field dataset. They achieved high classification accuracies of 98.75% and 90.25% on the RIAWELC and GDXray datasets, respectively, and recorded an accuracy of 75.83% on the low-quality real-field dataset. Similarly, Kumaresan et al. [6] conducted transfer learning on limited welding X-ray images using VGG16 and ResNet-50 pre-trained on ImageNet. After manually separating the weld region and augmenting it for training, they reported an average classification accuracy of approximately 90% with the VGG16-based model, which showed relatively stable training and generalization tendencies compared with the ResNet-50-based model. Zhao et al. [7] constructed a dataset by augmenting 1016 real images collected from open-source repositories and laboratory settings into 3048 images via brightness adjustment. Using a 7:3 train–test split, they developed a lightweight model by integrating a MobileNetV3 backbone and an Efficient Multi-scale Attention (EMA) mechanism into the YOLOv8s-seg framework. By deploying this model on a wall-climbing robot equipped with magnetic adhesion and Mecanum wheels, they implemented autonomous weld seam tracking, achieving 97.8% accuracy and a fast inference time of 54 ms in an edge-device environment. Meanwhile, research aiming to directly utilize 3D spatial information beyond 2D images has continued. Yin et al. [8] performed weld seam segmentation of ship sub-assembly structures using 3D point clouds acquired in a shipyard environment with stereo vision cameras. The study adopted PointNet++ as the backbone and achieved a mean Intersection over Union (mIoU) of 83.38% and an inference speed of 15 Frames Per Second (FPS) for five weld seam classes by improving edge detection and sampling strategies.
Beyond shipbuilding, synthetic image-based approaches have been widely adopted in manufacturing and industrial applications to address the scarcity of real images. Domain randomization techniques, which introduce random variations in lighting, backgrounds, and materials in 3D CAD models, have been shown to improve model generalization to real-world environments. Fresnillo et al. [9] generated 25,000 synthetic images of industrial cables using Blender and achieved IoU values of 76–80% with FCN and DeepLabV3 models. Wang et al. [10] reported that a model trained solely on synthetic images for automotive cable harness detection achieved an IoU of 95.4% on real vehicle images, outperforming models trained only on real data. Valtchev and Wu [11] also demonstrated that similar domain randomization techniques reduced the domain gap, achieving an accuracy improvement of approximately 12 percentage points, thereby confirming that synthetic images can compensate for real-world environmental constraints.
The effectiveness of synthetic data has also been demonstrated through mixed training and generative approaches. Ruediger-Flore et al. [12] improved classification accuracy from 81.7% to 96.3% by training on a combination of synthetic and real images for 30 types of manufacturing parts. Eversberg and Lambrecht [13] combined Physically Based Rendering (PBR)-based synthetic images with active learning, improving mAP@[0.5:0.95] from 49.8% to 79.1% using minimal real data. Chen et al. [14] applied Wasserstein Generative Adversarial Networks (GANs) to generate synthetic P–S–N curve data for gear fatigue testing, reducing relative stress error by over 50% and decreasing the required number of specimens by 43.7%.
These studies demonstrate that CAD-based synthetic data generation effectively mitigates real-world data limitations and enhances learning performance. Building on this trend, the present study aims to improve classification accuracy for real-world images by applying CAD-based synthetic images and a split learning strategy tailored to U-shaped weld joint classification in shipbuilding.

2. Categorization of U-Shaped Weld Joints in Double-Bottom Structures

U-shaped weld joints in a ship’s double-bottom structure exhibit various features, as shown in Figure 3a,b, and are classified according to the presence or absence of collar plates, slots, watertight features, and girders. A collar plate is a reinforcement plate attached around the opening where a longitudinal stiffener passes through the web frame; it serves to mitigate stress concentration around the opening and enhance structural strength.
Regarding the collar plate type, Figure 4 illustrates that U-shaped weld joints can be categorized into nine types based on the presence or absence of the collar plate, its position, and its front–rear orientation. In Figure 4, the grey shading highlights the collar plate, and the circles indicate the positions of the slots. Type T1 represents the configuration without a collar plate. Types T2 and T4 have collar plates positioned on the left and right sides, respectively, and are detached from the bottom plate. In contrast, Types T6 and T8 have collar plates attached to the bottom plate on the left and right sides, respectively. Types T3, T5, T7, and T9 correspond to the reverse-view configurations of Types T2, T4, T6, and T8, respectively. As these configurations are viewed from the opposite direction, the collar plate is located on the rear side relative to the viewing direction, as shown in Figure 2a,d.
The slots shown in Figure 5 serve a drainage function and are classified into four configurations based on their position and presence: Left Slot (LS), Right Slot (RS), All Slot (AS), and No Slot (NS). The locations of the slots in Figure 5 are represented by circles. The watertight feature shown in Figure 6 refers to the upper part of the opening where the longitudinal stiffener passes through the web frame. Configurations in which this upper opening is blocked are classified as the watertight feature type and are further categorized into four types: left watertight feature (LW), right watertight feature (RW), all watertight feature (AW), and no watertight feature (NW). In Figure 6, the grey shading highlights the plate blocking the upper opening, which constitutes the watertight feature. Finally, Figure 7 presents the classification based on the presence and location of girders. In Figure 7, the girder is represented by a thick vertical line on either the left or right side, following the structural shape introduced in Figure 3, to clearly identify its position. Configurations with a girder on the left are designated as Left Girder (LG), those with a girder on the right as Right Girder (RG), and those with no girders on either side as No Girder (NG).
A U-shaped weld joint is composed of four elements: collar plate, slot, watertight feature, and girder. In this study, the configuration of a U-shaped weld joint is expressed through combinations of these elements. The notation order of the configuration is defined as collar plate type → slot → watertight feature → girder. For example, a configuration consisting of a collar plate attached to the bottom on the right side and positioned toward the rear relative to the viewing direction (T7), a slot located on the left (LS), an unblocked watertight feature (NW), and a girder on the left (LG) is denoted as T7LSNWLG, as shown in Figure 2a. Using this combination scheme, a total of 176 types of U-shaped weld joints can be generated, and each configuration may vary in dimensions or shape depending on the ship type.
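To make the notation concrete, the following small Python sketch (illustrative only, not the authors' code) composes a configuration code in the fixed order collar plate → slot → watertight feature → girder. Note that only 176 of the 9 × 4 × 4 × 3 = 432 raw code combinations correspond to the structurally valid types counted in this paper.

```python
# Illustrative sketch: compose a U-shaped weld joint configuration code.
COLLAR_TYPES = [f"T{i}" for i in range(1, 10)]   # T1..T9
SLOTS = ["LS", "RS", "AS", "NS"]
WATERTIGHT = ["LW", "RW", "AW", "NW"]
GIRDERS = ["LG", "RG", "NG"]

def make_label(collar: str, slot: str, watertight: str, girder: str) -> str:
    """Concatenate feature codes in the fixed order collar -> slot -> watertight -> girder."""
    assert collar in COLLAR_TYPES and slot in SLOTS
    assert watertight in WATERTIGHT and girder in GIRDERS
    return collar + slot + watertight + girder

print(make_label("T7", "LS", "NW", "LG"))        # -> "T7LSNWLG", the example in the text
```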

3. Image Recognition Algorithms

Image recognition technology is generally categorized into three types: classification, object detection, and segmentation. Since the objective of this study is to classify real-world images using synthetic images generated from CAD models, image classification was selected for model training. Two classification models were employed: ResNet-18, a Convolutional Neural Network (CNN), and the Vision Transformer (ViT). Both models were fine-tuned from ImageNet pre-trained weights via transfer learning.

3.1. Convolutional Neural Network (CNN)

Convolutional Neural Networks (CNNs) progressively extract local features from input images by repeatedly applying convolution and pooling operations. The ResNet-18 model employed in this study incorporates the residual block proposed by He et al. [15], shown in Figure 8, which was designed to mitigate performance degradation and vanishing gradients as network depth increases. In this study, a ResNet-18 model pre-trained on the ImageNet-1k dataset was adopted, and its final fully connected layer was modified to match the number of classes in the experimental dataset.
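To make the residual structure concrete, the minimal PyTorch sketch below (ours, not the authors' code) shows a ResNet-18-style basic block and the head replacement described above; the two-class head is an illustrative choice.

```python
import torch
import torch.nn as nn
from torchvision import models

class BasicBlock(nn.Module):
    """ResNet basic block after He et al. [15]: output = ReLU(F(x) + x)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x                              # skip connection carries the input forward
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)          # residual addition eases gradient flow

# Transfer learning as described in the text: load ImageNet-1k weights and
# replace the final fully connected layer with a task-specific head.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)     # e.g., 2 classes for presence/absence
```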

3.2. Vision Transformer (ViT)

The Vision Transformer (ViT) is a Transformer-based model that processes images by partitioning them into patches. As shown in Figure 9, the input image is divided into patches of 16 × 16 pixels. Each patch is linearly embedded, and a class token and positional embeddings are added before the sequence is fed into the Transformer encoder to learn global features. In this study, the ViT-Base-Patch16-224-in21k model developed by Google was employed, and transfer learning was performed using weights pre-trained on the ImageNet-21k dataset.
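One way to realize this with the Hugging Face library is sketched below (a sketch under our assumptions, not the authors' exact preprocessing code): the positional embeddings learned on the 14 × 14 patch grid at 224 × 224 resolution are interpolated at run time so that a 112 × 224 input yields a 7 × 14 grid.

```python
import torch
from transformers import ViTForImageClassification

# Load ImageNet-21k pre-trained weights; the classification head is newly
# initialized for the (illustrative) two-class task.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k", num_labels=2
)

# A 112 x 224 input produces (112/16) x (224/16) = 7 x 14 patches; asking the
# model to interpolate its positional embeddings accommodates this grid.
pixel_values = torch.randn(1, 3, 112, 224)
outputs = model(pixel_values, interpolate_pos_encoding=True)
print(outputs.logits.shape)   # torch.Size([1, 2])
```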

3.3. Training Configuration for Image Classification Models

The experimental environment and training settings for this study are as follows. Model implementation and training were performed using PyTorch version 1.7.1 for ResNet-18 and the Hugging Face Transformers library version 4.44.2 for ViT. Data preprocessing was applied differently according to the characteristics of each model. For ResNet-18, the split images of 112 × 224 pixels were resized to the model input size of 224 × 224 pixels via interpolation without preserving the aspect ratio. Conversely, for the ViT model, images of 112 × 224 pixels were processed using the 2D interpolation method proposed by Dosovitskiy et al. [16], maintaining the vertical resolution while interpolating every two horizontal pixels into one. This reconfigured the original 14 × 14 patch structure into 7 × 14, enabling effective training. For hyperparameter tuning, both models employed a cosine annealing learning rate scheduler to induce stable convergence by gradually decreasing the initial learning rate [18]. The learning rate was set to 0.001 for ResNet-18, while the learning rate for ViT was adjusted within the range of 0.0003 to 0.00001 depending on the configuration. The batch size was set to 64, and the maximum number of epochs was set to 200. To prevent overfitting, early stopping was applied to both ResNet-18 and ViT, whereby training was terminated if the test accuracy did not improve for 20 consecutive epochs.
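This configuration can be sketched as follows; the data loaders, the evaluate helper, and the optimizer choice (Adam here) are our assumptions, while the scheduler, learning rate, epoch budget, and patience follow the text.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # 0.001 for ResNet-18
scheduler = CosineAnnealingLR(optimizer, T_max=200)         # cosine decay over 200 epochs
criterion = torch.nn.CrossEntropyLoss()

best_acc, patience, stale = 0.0, 20, 0
for epoch in range(200):
    model.train()
    for images, labels in train_loader:          # assumed PyTorch DataLoader
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()

    acc = evaluate(model, test_loader)           # assumed accuracy-computing helper
    if acc > best_acc:
        best_acc, stale = acc, 0                 # improvement: reset patience counter
    else:
        stale += 1
        if stale >= patience:                    # early stopping after 20 stagnant epochs
            break
```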

4. Synthetic Image Generation

4.1. 3D Modeling with Open Cascade

This study adopted an approach that replaces actual photographed images with 3D CAD-based synthetic images to secure a large volume of training data. To achieve this, the 176 U-shaped weld joint configurations were first modeled in CAD. To obtain diverse training data, manual CAD modeling was avoided; instead, the modeling procedure was automated with the Open Cascade geometry kernel (version 7.9.0) through its Python bindings. By defining the key dimensions of the U-shaped weld joint as variables and input parameters, a system was developed that can automatically generate all 176 configurations by modifying only the dimensional values.
The parameters defined for U-shaped weld joint modeling in this study are shown in Figure 10a. In Figure 10a, each parameter is labeled using the same numbering scheme as Table 1, and text annotations and dimension arrows are displayed in red to clearly indicate the locations and dimensions of the parameters. In particular, the Longi. length parameter, which may cause ambiguity in dimensional interpretation, is highlighted in blue to enhance visual clarity and readability. First, the width and height of the web frame are defined. The spacing, height, and length between longitudinal stiffeners are defined as Longi. space, Longi. height, and Longi. length, respectively. The thickness and width of the face plate of the longitudinal stiffener are denoted as Longi. face thickness and stiffener width, respectively. The width and height of the watertight feature hole, as well as the overall watertight feature height, are represented as watertight feature hole width, watertight feature hole height, and watertight feature height, respectively. The width, height, and radius of the collar plate are denoted as collar plate width, collar plate height, and collar plate hole radius, respectively, and the radius of the slot is represented as slot radius. Examples of each parameter are summarized in Table 1, and the CAD model generated by inputting these parameters into the CAD-based automatic generation program is shown in Figure 10b.
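A much-simplified sketch of this parametric approach using the pythonocc bindings is shown below. The authors' generator covers all 176 configurations; this fragment only builds a web-frame plate and cuts a single slot at the bottom edge, with the plate thickness and slot position chosen arbitrarily for illustration.

```python
from OCC.Core.gp import gp_Pnt, gp_Dir, gp_Ax2
from OCC.Core.BRepPrimAPI import BRepPrimAPI_MakeBox, BRepPrimAPI_MakeCylinder
from OCC.Core.BRepAlgoAPI import BRepAlgoAPI_Cut
from OCC.Core.STEPControl import STEPControl_Writer, STEPControl_AsIs

# Dimensions (mm) taken from Table 1; thickness and slot position are illustrative.
params = {"web_width": 2400.0, "web_height": 1200.0, "web_thickness": 12.0,
          "slot_radius": 48.0, "slot_x": 840.0}

# Web frame as a thin box: width along X, thickness along Y, height along Z.
plate = BRepPrimAPI_MakeBox(gp_Pnt(0, 0, 0), params["web_width"],
                            params["web_thickness"], params["web_height"]).Shape()

# Slot as a Y-axis cylinder slightly longer than the plate thickness,
# boolean-subtracted so it pierces the plate completely at the bottom edge.
axis = gp_Ax2(gp_Pnt(params["slot_x"], -1.0, 0.0), gp_Dir(0, 1, 0))
slot = BRepPrimAPI_MakeCylinder(axis, params["slot_radius"],
                                params["web_thickness"] + 2.0).Shape()
joint = BRepAlgoAPI_Cut(plate, slot).Shape()

# Export the solid (e.g., for conversion to a mesh format that Blender can import).
writer = STEPControl_Writer()
writer.Transfer(joint, STEPControl_AsIs)
writer.Write("u_joint_example.step")
```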

4.2. Rendering of U-Shaped Weld Joint CAD Models

The generated CAD models were automatically converted into a large number of synthetic images using Blender version 3.6. As shown in Figure 11, ten types of steel plate textures were obtained from photographs of steel plate surfaces taken at an actual shipyard. These textures were mapped onto the CAD models to achieve realistic surface characteristics. Subsequently, rendering settings such as lighting, material representation, and shadows were adjusted within Blender to generate synthetic images in which the mapped textures appeared close to actual photographs [17]. The specific rendering settings used are presented in Table 2. First, the Noise Threshold was disabled to ensure that all pixels were fully rendered, thereby enhancing image sharpness. Meanwhile, the View Transform was set to Raw during rendering to prevent color distortion caused by tone mapping and to preserve the characteristics of the original data.
The mapped CAD models were rendered under three realistic lighting conditions, as shown in Figure 12. In addition, synthetic images were generated from multiple viewpoints by varying the camera angles (−3°, −2°, −1°, 0°, 1°, 2°, 3°), as illustrated in Figure 13 [19]. A total of 7392 synthetic images were generated by rendering the 176 U-shaped weld joint configurations under seven viewing angles and three lighting conditions, with two different dimensional settings applied.
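A minimal bpy sketch of such a rendering loop is given below. The scene, model import, and light rigs are assumed to be set up already; set_lighting is a hypothetical helper, and treating each angle as a camera yaw offset in degrees is our interpretation.

```python
import bpy
import math

# Apply the Version 2 settings from Table 2, then render all viewpoint/lighting combinations.
scene = bpy.context.scene
scene.render.engine = 'CYCLES'
scene.cycles.use_adaptive_sampling = False     # "Noise Threshold: Disabled" - render all pixels
scene.view_settings.view_transform = 'Raw'     # avoid tone-mapping color distortion

camera = scene.camera                          # assumed to exist in the .blend file
base_yaw = camera.rotation_euler.z
for light_id in range(3):                      # three lighting conditions (Figure 12)
    set_lighting(light_id)                     # hypothetical helper switching light rigs
    for angle in (-3, -2, -1, 0, 1, 2, 3):     # seven camera angles (Figure 13)
        camera.rotation_euler.z = base_yaw + math.radians(angle)
        scene.render.filepath = f"//render/light{light_id}_ang{angle:+d}.png"
        bpy.ops.render.render(write_still=True)
```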
Figure 14 provides a visual comparison of the synthetic rendering results for Version 1 and Version 2 based on the changes listed in Table 2. Compared to Version 1, Version 2 exhibits clearer images with more pronounced light reflection and scattering effects.
Table 3 compares the results of four-class classification—Left Slot (LS), Right Slot (RS), No Slot (NS), and All Slot (AS)—based on the slot feature. Since left–right information is critical for distinguishing these four classes, flip augmentation of synthetic images was not applied during image recognition model training. Classification accuracy was calculated according to the method described in Section 5.1. The results indicate that Version 2 achieves higher classification accuracy than Version 1. This demonstrates that generating synthetic images with higher clarity and greater similarity to real images can improve model training performance.

4.3. Image Augmentation

Image augmentation is a crucial process for improving the generalization performance of model training. In this study, multiple augmentation techniques were applied to ensure diversity in the synthetic images: cropping, brightness and contrast adjustment, and rotation [20,21,22,23]. First, images were randomly cropped to a square region covering 90–100% of the original dimensions (224 × 224 pixels), and the cropped images were resized back to 224 × 224 pixels to match the input specification of the image recognition model. Brightness and contrast were adjusted by random values within a ±20% range, while rotation was applied with random values within a ±8° range. Throughout the augmentation process, care was taken to ensure that key features such as the collar plate, slot, watertight feature, and girder were not lost. These augmentations were applied in different combinations at each epoch, following the strategy proposed by Krizhevsky et al. [24], to increase the diversity of training images. Figure 15 illustrates the augmentation that induces the largest variation from the original image: the result of applying a 90% crop, 20% brightness and contrast adjustment, and an 8° rotation to the original 224 × 224 pixel image. The augmentation was implemented during the preprocessing stage of the PyTorch-based data loader, and the training images were normalized using the ImageNet [25] mean values of [0.485, 0.456, 0.406] and standard deviations of [0.229, 0.224, 0.225]. This normalization matches the statistics of the ImageNet dataset used during pre-training, thereby preventing degradation in learning performance.
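The pipeline can be expressed with torchvision transforms roughly as follows; the mapping of the stated ranges onto specific transform parameters is our interpretation. Because the transforms are re-sampled each time the data loader accesses an image, every epoch sees a different combination, matching the per-epoch strategy described above.

```python
from torchvision import transforms

train_transform = transforms.Compose([
    # Keep 81-100% of the area, i.e., roughly a 90-100% crop per side, then
    # resize back to the 224 x 224 input specification.
    transforms.RandomResizedCrop(224, scale=(0.81, 1.0), ratio=(1.0, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # +/-20% adjustment
    transforms.RandomRotation(degrees=8),                   # +/-8 degree rotation
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],        # ImageNet statistics [25]
                         std=[0.229, 0.224, 0.225]),
])
```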

5. Synthetic Image Generation Based on Image Splitting

5.1. Necessity of Image Splitting

U-shaped weld joints are classified according to four criteria—collar plate type, slot, watertight feature, and girder—as described in Section 2. The numbers of classes for these criteria are 5, 4, 4, and 3, respectively. Although nine collar plate types were introduced in Section 2, only five classes are used in this study, as only five distinct configurations could be secured from actual shipyard photographs.
For the training whose results are presented in Table 4, the synthetic images used for collar plate classification were limited to types T1, T6, T7, T8, and T9; accordingly, a total of 5376 images were used to classify the collar plate type. For classifying the presence or absence of slots, watertight features, and girders, a total of 7392 synthetic images were used for training. For all four classification tasks, the test dataset consisted of 513 real images acquired from an actual shipyard. Classification accuracy was calculated as the ratio of correctly classified real images to the total number of test images. The average accuracies of ResNet-18 and ViT for classifying the collar plate type and the presence or absence of slots, watertight features, and girders were 56.04%, 73.77%, 45.71%, and 86.06%, respectively (Table 4). These results can be attributed to the fact that full-shape synthetic images of U-shaped weld joints contain multiple structural features simultaneously, making it difficult for the image recognition models to learn and discriminate individual features distinctly. Specifically, according to Raghu et al. [26], the global information aggregation characteristic of ViTs can interfere with the recognition of localized weld features. To address this limitation and improve classification accuracy, the input images were split to isolate the regions containing the target features, enabling the models to focus on learning each feature independently.

5.2. Image Splitting

To improve the performance of U-shaped weld joint classification, this study constructed input images by splitting the regions in which each feature is located, rather than using the entire joint image. As shown in Figure 16, the images were split based on the feature regions corresponding to (a) collar plate type, (b) slot, (c) watertight feature, and (d) girder. The collar plate type region was split into two parts in the left–right direction, while the slot region was split into four parts. For the watertight feature, we first split the image into left and right parts and then extracted the central region. The size of this region was set to be the same as each patch when the image is divided into four parts, and the watertight feature is included in the defined middle region in all cases. Similarly, the girder region was split into two parts in the left–right direction, following the same approach as the collar plate type. This splitting strategy was applied identically to both synthetic images and real images captured in the shipyard.
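A sketch of this splitting scheme is given below; the exact crop coordinates, particularly for the vertically centred watertight-feature strips, are our reading of Figure 16 rather than the authors' published code.

```python
from PIL import Image

def split_regions(path: str) -> dict:
    """Crop the feature regions of a full U-shaped weld joint image."""
    img = Image.open(path)
    w, h = img.size                       # e.g., 224 x 224
    return {
        # 2-region split (collar plate type and girder): left/right halves
        "left":  img.crop((0, 0, w // 2, h)),
        "right": img.crop((w // 2, 0, w, h)),
        # 4-region split (slot): the bottom-left/bottom-right quadrants carry the slots
        "bottom_left":  img.crop((0, h // 2, w // 2, h)),
        "bottom_right": img.crop((w // 2, h // 2, w, h)),
        # watertight feature: per half, a centre region with the size of one quadrant
        "middle_left":  img.crop((0, h // 4, w // 2, 3 * h // 4)),
        "middle_right": img.crop((w // 2, h // 4, w, 3 * h // 4)),
    }
```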
Table 5 presents the experiments conducted to justify the 4-region splitting granularity for the slot feature. When 2-region splitting was applied, the accuracies for the Left and Right regions were 86.55% and 92.20%, respectively; this is likely because features other than the slot were included within the split regions, lowering slot classification performance relative to 4-region splitting. Under 4-region splitting, the accuracies for the Left and Right regions were 97.08% and 95.30%, respectively. For 6-region splitting, the accuracies were 92.59% and 96.69%, and the Right-side result improved over 4-region splitting. However, the Left-side accuracy under 6-region splitting (92.59%) was lower than under 4-region splitting (97.08%), producing a larger performance gap between the Left and Right regions than in the 4-region case. Therefore, we adopted 4-region splitting to maintain consistent performance. In addition, because the collar plate occupies a relatively large area and is attached to either the left or right side, as shown in Figure 16a, 2-region splitting was applied to distinguish its location and characteristics.

5.3. Class Composition of Split Images

First, the classification based on collar plate type was performed stepwise as follows. As shown in Figure 17a,b, the input image was first split into left and right regions, and the class was determined based on the presence or absence of a collar plate in each region. Subsequently, for regions in which a collar plate was present, as shown in Figure 18, the class was further distinguished according to whether the collar plate was located (a) at the front or (b) at the rear. Through this procedure, the presence and front/rear position of the collar plate on both the left and right sides were sequentially identified. By combining this information, collar plate types T1 to T9 could be classified.
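Combining the split-wise predictions into a collar plate type can be sketched as follows for the five types used in this study; the mapping is derived from the descriptions in Section 2, and the function and argument names are illustrative.

```python
from typing import Optional

def combine_collar_predictions(left_present: bool, right_present: bool,
                               position: Optional[str]) -> str:
    """Map per-region predictions to a collar plate type; position is 'front'
    or 'rear' for the side on which a collar plate was detected."""
    if not left_present and not right_present:
        return "T1"                                   # no collar plate
    if left_present:
        return "T6" if position == "front" else "T9"  # left: front = T6, rear = T9
    return "T8" if position == "front" else "T7"      # right: front = T8, rear = T7

print(combine_collar_predictions(False, True, "rear"))  # -> "T7"
```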
Slots were classified based on their presence or absence by splitting the image into four regions—top, bottom, left, and right—as shown in Figure 19a,b. Two classes were assigned to the bottom-left and bottom-right regions to determine the presence or absence of slots. As shown in Figure 20a,b, the watertight feature was first split into left and right sections. Subsequently, two classes were assigned to the middle-left and middle-right regions, where the watertight feature appears, to classify the presence or absence of the watertight feature.
The girder region was split into two areas—left and right—as shown in Figure 21a,b. Within each region, the presence or absence of a girder was classified into two classes.
The synthetic dataset generated in this study consists of a total of 7392 images obtained by rendering 176 different types of U-shaped weld joints with seven viewing angles, three lighting conditions, and two scale settings. Because the synthetic images were divided into U-shaped weld-joint categories, the number of images included in each category may vary depending on the combination of weld-joint features. Accordingly, the number of synthetic images for each U-shaped weld-joint category is presented in Table 6.
For the collar plate presence/absence classification, the numbers of synthetic images were 2016 for presence and 3360 for absence, and the front/rear classification used 1008 images per class. For the slot presence/absence classification, the numbers were 3192 for presence and 4200 for absence; for the watertight feature, they were 3024 for presence and 4368 for absence. Thus, for the collar plate type, slot, and watertight feature, the differences between the two classes were not large. In contrast, for the girder presence/absence classification, the numbers were 1344 for presence and 6048 for absence, a relatively large difference in distribution between the classes, and a similar tendency was also observed in the real images. The real test set consists of images collected in an actual shipyard environment, and its distribution may be unbalanced due to the nature of field data. As shown in Table 6, for the slot feature, the left side consists of 127 images for presence and 386 for absence, and the right side of 186 for presence and 327 for absence. For the girder feature, the left side consists of 73 images for presence and 440 for absence, and the right side of 34 for presence and 479 for absence, indicating an imbalanced distribution. Since the real images were used only for testing the image recognition models, the class imbalance in the test images did not affect model training.

6. Results

Table 4 shows the classification performance when the models were trained using full-shape images as input. The models were evaluated on real shipyard images for four feature categories: collar plate type, slot, watertight feature, and girder. As shown in Table 4, ResNet-18 achieved an accuracy of 52.63% for collar plate type classification. It achieved 76.02% for slot classification, 54.97% for watertight feature classification, and 87.13% for girder classification. ViT achieved 59.45% for collar plate type classification. It achieved 71.53% for slot classification, 36.45% for watertight feature classification, and 84.99% for girder classification. The average accuracy of ResNet-18 and ViT was 56.04% for collar plate type, 73.77% for slot, 45.71% for watertight feature, and 86.06% for girder.
Table 7, Table 8, Table 9, Table 10 and Table 11 present the feature-wise classification performance and computational cost after applying image splitting. Table 7 reports the results for collar plate presence/absence classification. In the left region, ResNet-18 achieved 91.23%, ViT achieved 92.29%, VGG16 achieved 91.42%, and ResNet-50 achieved 91.81%, showing similar performance. In the right region, ResNet-18 achieved 98.25%, ViT achieved 95.90%, VGG16 achieved 98.64%, and ResNet-50 achieved 98.64%, showing overall high accuracy. Table 8 reports the results for collar plate front/rear classification. ResNet-18 achieved 96.29% in the left region and 100.00% in the right region. ResNet-50 also achieved 100.00% in the right region. VGG16 achieved the highest accuracy in the left region with 97.66%. ViT achieved 94.15% in the left region and 94.66% in the right region. Table 9 reports the results for slot presence/absence classification. ResNet-50 achieved the highest accuracy in the left region with 97.47%, and ResNet-18 achieved 97.08% in the same region. In the right region, ResNet-18 achieved the highest accuracy with 95.30%. ViT achieved 94.15% in the right region, VGG16 achieved 91.03%, and ResNet-50 achieved 92.20%. Table 10 reports the results for watertight feature presence/absence classification, and the accuracy was relatively lower than that of the other feature categories. ResNet-18 achieved the highest performance on both regions, with 86.72% in the left region and 80.86% in the right region. ViT achieved 82.81% in the left region and 74.60% in the right region. VGG16 achieved 80.31% in the left region and 73.68% in the right region. ResNet-50 achieved 76.61% in the left region and 74.46% in the right region. Table 11 reports the results for girder presence/absence classification. All models achieved high accuracy, ranging from 93.76% to 97.46%. ViT achieved the highest accuracy in the left region with 97.46%. VGG16 achieved the highest accuracy in the right region with 95.32%. ResNet-18 also maintained high performance, achieving 96.88% in the left region and 94.54% in the right region.
Table 7, Table 8, Table 9, Table 10 and Table 11 also report time per epoch and peak Video Random Access Memory (VRAM), enabling a quantitative comparison of training efficiency and resource consumption. ResNet-18 had the lowest peak VRAM, which was 1.82 GB. ResNet-18 also showed the shortest time per epoch. The time per epoch for collar plate front/rear classification was 4.86–4.93 s. The time per epoch for slot presence/absence classification was 7.95–8.04 s. The time per epoch for watertight feature presence/absence classification was 8.11–8.20 s. ViT had a peak VRAM of 6.35 GB and required approximately 20 s/epoch for slot and watertight feature training. VGG16 had the highest peak VRAM, which was 11.78 GB, and required the longest training time of 19–26 s/epoch. ResNet-50 had a peak VRAM of 7.98–7.99 GB and required 14–18 s/epoch for training. Consequently, the image splitting-based learning presented in Table 7, Table 8, Table 9, Table 10 and Table 11 generally improved feature-wise classification performance compared with the full-shape learning shown in Table 4. Considering both accuracy and computational cost, ResNet-18 maintained high accuracy for U-shaped weld joint feature classification while exhibiting the lowest VRAM usage and the shortest training time.

7. Conclusions and Future Work

This study proposes a method for constructing an automated classification system in which a collaborative robot recognizes and classifies the shapes of U-shaped weld joints in a ship’s double-bottom structure. Using Open Cascade, 3D CAD models of 176 types of U-shaped weld joints were automatically generated, and a large number of synthetic images were produced in Blender under various lighting conditions to replace actual captured images. When an image recognition model was trained directly on full-shape images, multiple weld features were included simultaneously in a single image, and the performance of full-shape feature classification (collar plate type, slot, watertight feature, and girder) was low, as shown in Table 4: the average accuracies were 56.04%, 73.77%, 45.71%, and 86.06%, respectively. To address this, this study constructed images by splitting the regions corresponding to each feature and used only the split regions for training, thereby improving the feature-wise classification accuracy. As a result, it was confirmed that training on a purely synthetic image-based dataset can improve the classification performance on real U-shaped weld joint shapes.
In addition, this study used ResNet-18 and ViT, and comparative experiments were conducted with ResNet-50 and VGG16, as used in Palma-Ramírez et al. [5] and Kumaresan et al. [6]. The results in Table 7, Table 8, Table 9, Table 10 and Table 11 show that image-splitting-based learning consistently improves feature recognition performance overall. In particular, ResNet-18 achieved over 90% accuracy for collar plate presence/absence and front/rear, slot presence/absence, and girder presence/absence classification, while maintaining low Graphics Processing Unit (GPU) memory usage and a fast training time per epoch. ResNet-18 also showed similar or higher performance compared with ViT, VGG16, and ResNet-50, which have larger numbers of parameters. In contrast, the classification accuracy for watertight feature presence/absence was relatively lower than that of the other items; this is likely because some samples in the watertight feature Absence class also contained girder shapes.
Future work aims to improve the performance of the synthetic image-based classification model to a level applicable in real shipyard environments. In addition, based on the U-shaped weld joint shapes classified in this study, we intend to implement a system that automatically extracts the dimensions of the corresponding shape from captured images and uses them to perform automated welding of U-shaped weld joints.

Author Contributions

Conceptualization, G.K. and N.K.; Methodology, N.K.; Software, G.K.; Validation, G.K. and N.K.; Formal analysis, G.K.; Investigation, G.K.; Data curation, G.K.; Writing—original draft preparation, G.K.; Writing—review and editing, N.K.; Visualization, G.K.; Supervision, N.K.; Project administration, N.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (RS-2025-02263945, HRD Program for Industrial Innovation) and by the Regional Innovation System & Education (RISE) program through the Institute for Regional Innovation System & Education in Busan Metropolitan City, funded by the Ministry of Education (MOE) and the Busan Metropolitan City, Republic of Korea (2025-RISE-02-001-030).

Data Availability Statement

Data are contained within the article.

Acknowledgments

This article is a revised and substantially expanded version of a conference presentation/abstract [27].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. JL Heavy Industry Co., Ltd. Assembly. Available online: https://jl-hi.co.kr/%ec%82%ac%ec%97%85%ec%98%81%ec%97%ad/assembly (accessed on 16 December 2025).
  2. Kim, J.-H.; Park, I.-O.; Lee, C.-M.; Kim, H.-J.; Jeon, D.-H.; Kim, H.-J.; Choi, D.-J.; Kim, W.-S. Design of a robust digital twin simulator for generating noise-resilient welding data in shipyard block assembly. J. Korea Multimed. Soc. 2025, 28, 395–411. [Google Scholar] [CrossRef]
  3. Chon, H.; Oh, D.; Noh, J. Classification of hull blocks of ships using CNN with multi-view image set from 3D CAD data. J. Mar. Sci. Eng. 2023, 11, 333. [Google Scholar] [CrossRef]
  4. Na, G.Y.; Cheon, S.; Yang, J. A deep learning-based part classification method for ship part pairing in nesting problems. Korean J. Comput. Des. Eng. 2021, 26, 514–524. [Google Scholar] [CrossRef]
  5. Palma-Ramírez, D.; Ross-Veitía, B.D.; Font-Ariosa, P.; Espinel-Hernández, A.; Sanchez-Roca, A.; Carvajal-Fals, H.; Nuñez-Alvarez, J.R.; Hernández-Herrera, H. Deep convolutional neural network for weld defect classification in radiographic images. Heliyon 2024, 10, e30590. [Google Scholar] [CrossRef] [PubMed]
  6. Kumaresan, S.; Aultrin, K.S.J.; Kumar, S.S.; Dev Anand, M. Deep learning-based weld defect classification using VGG16 transfer learning adaptive fine-tuning. Int. J. Interact. Des. Manuf. 2023, 17, 2999–3010. [Google Scholar] [CrossRef] [PubMed]
  7. Zhao, M.; Liu, X.; Wang, K.; Liu, Z.; Dong, Q.; Wang, P.; Su, Y. Welding Seam Tracking and Inspection Robot Based on Improved YOLOv8s-Seg Model. Sensors 2024, 24, 4690. [Google Scholar] [CrossRef] [PubMed]
  8. Yin, X.; Wei, N.; Kang, S.; Jiang, J.; Zhang, R. Exploring the Capability of Deep Neural Network on Ship Sub-assembly Weld Seam Recognition. In 2024 3rd International Conference on Automation, Robotics and Computer Engineering (ICARCE); IEEE: New York, NY, USA, 2024; pp. 6–11. [Google Scholar] [CrossRef]
  9. Malvido Fresnillo, P.; Mohammed, W.M.; Vasudevan, S.; Perez Garcia, J.A.; Martinez Lastra, J.L. Generation of realistic synthetic cable images to train deep learning segmentation models. Mach. Vis. Appl. 2024, 35, 84. [Google Scholar] [CrossRef]
  10. Wang, Y.; Deng, W.; Liu, Z.; Wang, J. Deep learning-based vehicle detection with synthetic image data. IET Intell. Transp. Syst. 2019, 13, 1097–1105. [Google Scholar] [CrossRef]
  11. Valtchev, S.Z.; Wu, J. Domain randomization for neural network classification. J. Big Data 2021, 8, 94. [Google Scholar] [CrossRef] [PubMed]
  12. Ruediger-Flore, P.; Glatt, M.; Hussong, M.; Aurich, J.C. CAD-based data augmentation and transfer learning empowers part classification in manufacturing. Int. J. Adv. Manuf. Technol. 2023, 125, 5605–5618. [Google Scholar] [CrossRef]
  13. Eversberg, L.; Lambrecht, J. Combining synthetic images and deep active learning: Data-efficient training of an industrial object detection model. J. Imaging 2024, 10, 16. [Google Scholar] [CrossRef] [PubMed]
  14. Chen, Y.; Li, Y.; Li, G.; Liu, H.; Lu, Z. Data augmentation of gear fatigue test using generative adversarial networks. J. Mech. Sci. Technol. 2025, 39, 5051–5063. [Google Scholar] [CrossRef]
  15. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  16. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representations (ICLR), Vienna, Austria, 3–7 May 2021. [Google Scholar] [CrossRef]
  17. Tobin, J.; Fong, R.; Ray, A.; Schneider, J.; Zaremba, W.; Abbeel, P. Domain randomization for transferring deep neural networks from simulation to the real world. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 23–30. [Google Scholar] [CrossRef]
  18. Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017. [Google Scholar] [CrossRef]
  19. Tremblay, J.; Prakash, A.; Acuna, D.; Brophy, M.; Jampani, V.; Anil, C.; To, T.; Cameracci, E.; Boochoon, S.; Birchfield, S. Training deep networks with synthetic data: Bridging the reality gap by domain randomization. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 969–977. [Google Scholar] [CrossRef]
  20. Chaganti, S.Y.; Nanda, I.; Pandi, K.R.; Prudhvith, T.G.N.R.S.N.; Kumar, N. Image classification using SVM and CNN. In Proceedings of the 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA), Gunupur, India, 13–14 March 2020; pp. 1–5. [Google Scholar] [CrossRef]
  21. Xu, M.; Yoon, S.; Fuentes, A.; Park, D.S. A comprehensive survey of image augmentation techniques for deep learning. Pattern Recognit. 2023, 137, 109347. [Google Scholar] [CrossRef]
  22. Kim, B.; Park, W.; Kim, K.; Kim, H.Y. CNN-based classification of the laser assembly process for ultra-small batteries. J. Mech. Sci. Technol. 2023, 37, 6181–6192. [Google Scholar] [CrossRef]
  23. Lim, J.J.; Kim, D.W.; Hong, W.H.; Kim, M.; Lee, D.H.; Kim, S.Y.; Jeong, J.H. Application of convolutional neural network (CNN) to recognize ship structures. Sensors 2022, 22, 3824. [Google Scholar] [CrossRef] [PubMed]
  24. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  25. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
  26. Raghu, M.; Unterthiner, T.; Kornblith, S.; Zhang, C.; Dosovitskiy, A. Do vision transformers see like convolutional neural networks? Adv. Neural Inf. Process. Syst. 2021, 34, 12116–12128. [Google Scholar] [CrossRef]
  27. Kang, G.; Cho, J.; Park, Y.; Son, K.; Lee, J.; Jung, H.; Ku, N. Classification of Ship Double-Bottom Weld Joints Using Synthetic Images. In Proceedings of the 2025 Fall Academic Conference and General Assembly of the Korean Society of Ocean Engineers, Jeju, Republic of Korea, 29–31 October 2025; Available online: https://hanrimmobile.cafe24.com/mobile/ksoe/2025F/main.html (accessed on 16 December 2025).
Figure 1. Double-bottom structure of a ship showing the web frame and longitudinal stiffener (Longi. Stiffener). Reproduced with permission from Ref. [1]. Copyright 2024 JL Heavy Industries Co., Ltd.
Figure 2. Various shapes of U-shaped weld joints: (a) T6ASNWNG; (b) T7NSNWNG; (c) T8NSRWLG; (d) T9NSNWNG.
Figure 3. Structural components of U-shaped weld joints in a double-bottom ship structure: (a) collar plate, slot, and watertight feature; (b) girder and watertight feature.
Figure 4. Differentiation by collar plate types: T1, T2, T3, T4, T5, T6, T7, T8, and T9.
Figure 5. Differentiation by slot: left slot (LS), right slot (RS), all slot (AS), and no slot (NS).
Figure 6. Differentiation by watertight feature: left watertight feature (LW), right watertight feature (RW), all watertight feature (AW), and no watertight feature (NW).
Figure 7. Differentiation by girder: left girder (LG), right girder (RG), and no girder (NG).
Figure 8. Residual learning structure of ResNet-18 (schematic illustration by the authors based on [15]).
Figure 9. Vision Transformer structure (schematic illustration by the authors based on [16]).
Figure 10. Example of dimensional parameters for a U-shaped weld joint: (a) U-shaped weld joint modeling parameters; (b) CAD model generated using the parameters listed in Table 1.
Figure 11. Steel plate textures collected from shipyard environments.
Figure 12. Three lighting conditions configured in Blender and the corresponding rendered examples for synthetic image generation.
Figure 13. Synthetic renderings of U-shaped weld joints rendered at seven camera angles (−3°, −2°, −1°, 0°, 1°, 2°, 3°).
Figure 14. Rendering development process of U-shaped weld joints.
Figure 15. Examples of data augmentation methods used in this study.
Figure 16. Split regions for each feature category: (a) Collar plate type; (b) Slot; (c) Watertight feature; (d) Girder.
Figure 17. Class composition based on the presence or absence of a collar plate: (a) Collar plate present; (b) Collar plate absent.
Figure 18. Class composition according to the front or rear position of the collar plate: (a) Collar plate front; (b) Collar plate rear.
Figure 19. Class composition based on the presence or absence of a slot: (a) Slot present; (b) Slot absent.
Figure 20. Class composition based on the presence or absence of a watertight feature: (a) Watertight feature present; (b) Watertight feature absent.
Figure 21. Class composition based on the presence or absence of a girder: (a) Girder present; (b) Girder absent.
Table 1. Parameters used for U-shaped weld joint modeling.

Parameter Number | Parameter | Value (mm)
1 | Web Frame Width | 2400
2 | Web Frame Height | 1200
3 | Longi. Space | 840
4 | Longi. Height | 560
5 | Longi. Length | 1200
6 | Longi. Face Thickness | 12
7 | Stiffener Width | 48
8 | Watertight Feature Hole Width | 65
9 | Watertight Feature Hole Height | 56
10 | Watertight Feature Height | 588
11 | Collar Plate Width | 264
12 | Collar Plate Height | 400
13 | Collar Plate Hole Radius | 24
14 | Slot Radius | 48
Table 2. Changes in Blender rendering settings.

Setting | Version 1 | Version 2
Noise Threshold | Enabled (0.01) | Disabled
View Transform | Filmic | Raw
Table 3. Comparison of classification results across slot synthetic image versions 1–2.

Version | Number of Classes | Classification Model | Accuracy (%)
Version 1 | 4 | ResNet-18 | 75.44
Version 1 | 4 | ViT | 67.25
Version 2 | 4 | ResNet-18 | 76.02
Version 2 | 4 | ViT | 71.53
Table 4. Classification performance using full-shape images for four feature types.

Feature | Number of Classes | Classification Model | Accuracy (%)
Collar Plate Type | 5 | ResNet-18 | 52.63
Collar Plate Type | 5 | ViT | 59.45
Slot | 4 | ResNet-18 | 76.02
Slot | 4 | ViT | 71.53
Watertight Feature | 4 | ResNet-18 | 54.97
Watertight Feature | 4 | ViT | 36.45
Girder | 3 | ResNet-18 | 87.13
Girder | 3 | ViT | 84.99
Table 5. Comparison of splitting schemes for the slot feature using a ResNet-18 model.

Classification Model | Split Part | Split Scheme | Learning Rate | Accuracy (%)
ResNet-18 | Left | 2-Region (L/R) | 0.001 | 86.55
ResNet-18 | Right | 2-Region (L/R) | 0.001 | 92.20
ResNet-18 | Left | 4-Region (T/B/L/R) | 0.001 | 97.08
ResNet-18 | Right | 4-Region (T/B/L/R) | 0.001 | 95.30
ResNet-18 | Left | 6-Region | 0.001 | 92.59
ResNet-18 | Right | 6-Region | 0.001 | 96.69
Table 6. Distribution of synthetic training images and real test images across five weld-joint categories.

Feature | Split Part | Classification Criterion | Number of Train Images | Number of Test Images
Collar Plate Type | Left | Presence | 2016 | 171
Collar Plate Type | Left | Absence | 3360 | 342
Collar Plate Type | Left | Front | 1008 | 65
Collar Plate Type | Left | Rear | 1008 | 106
Collar Plate Type | Right | Presence | 2016 | 75
Collar Plate Type | Right | Absence | 3360 | 438
Collar Plate Type | Right | Front | 1008 | 23
Collar Plate Type | Right | Rear | 1008 | 52
Slot | Left | Presence | 3192 | 127
Slot | Left | Absence | 4200 | 386
Slot | Right | Presence | 3192 | 186
Slot | Right | Absence | 4200 | 327
Watertight Feature | Left | Presence | 3024 | 162
Watertight Feature | Left | Absence | 4368 | 351
Watertight Feature | Right | Presence | 3024 | 177
Watertight Feature | Right | Absence | 4368 | 336
Girder | Left | Presence | 1344 | 73
Girder | Left | Absence | 6048 | 440
Girder | Right | Presence | 1344 | 34
Girder | Right | Absence | 6048 | 479
Table 7. Comparison of recognition performance and computational cost for collar plate presence/absence classification using image splitting.

Classification Model | Split Part | Learning Rate | Time/Epoch (s) | Peak VRAM (GB) | Accuracy (%)
ResNet-18 | Left | 0.001 | 7.17 | 1.82 | 91.23
ResNet-18 | Right | 0.001 | 7.24 | 1.82 | 98.25
ViT | Left | 0.0001 | 14.45 | 6.35 | 92.29
ViT | Right | 0.0001 | 14.85 | 6.35 | 95.90
VGG16 | Left | 0.001 | 19.64 | 11.78 | 91.42
VGG16 | Right | 0.001 | 19.85 | 11.78 | 98.64
ResNet-50 | Left | 0.001 | 14.35 | 7.99 | 91.81
ResNet-50 | Right | 0.001 | 14.31 | 7.99 | 98.64
Table 8. Comparison of recognition performance and computational cost for collar plate front/rear classification using image splitting.

Classification Model | Split Part | Learning Rate | Time/Epoch (s) | Peak VRAM (GB) | Accuracy (%)
ResNet-18 | Left | 0.001 | 4.86 | 1.82 | 96.29
ResNet-18 | Right | 0.001 | 4.93 | 1.82 | 100.00
ViT | Left | 0.0001 | 5.39 | 6.35 | 94.15
ViT | Right | 0.0001 | 5.46 | 6.35 | 94.66
VGG16 | Left | 0.001 | 9.84 | 11.78 | 97.66
VGG16 | Right | 0.001 | 9.62 | 11.78 | 96.00
ResNet-50 | Left | 0.001 | 7.83 | 7.99 | 92.98
ResNet-50 | Right | 0.001 | 7.64 | 7.99 | 100.00
Table 9. Comparison of recognition performance and computational cost for slot presence/absence classification using image splitting.

Classification Model | Split Part | Learning Rate | Time/Epoch (s) | Peak VRAM (GB) | Accuracy (%)
ResNet-18 | Left | 0.001 | 7.95 | 1.82 | 97.08
ResNet-18 | Right | 0.001 | 8.04 | 1.82 | 95.30
ViT | Left | 0.0001 | 20.34 | 6.35 | 96.10
ViT | Right | 0.0001 | 20.12 | 6.35 | 94.15
VGG16 | Left | 0.001 | 26.15 | 11.78 | 93.37
VGG16 | Right | 0.001 | 26.31 | 11.78 | 91.03
ResNet-50 | Left | 0.001 | 18.38 | 7.98 | 97.47
ResNet-50 | Right | 0.001 | 18.46 | 7.98 | 92.20
Table 10. Comparison of recognition performance and computational cost for watertight feature presence/absence classification using image splitting.

Classification Model | Split Part | Learning Rate | Time/Epoch (s) | Peak VRAM (GB) | Accuracy (%)
ResNet-18 | Left | 0.001 | 8.20 | 1.82 | 86.72
ResNet-18 | Right | 0.001 | 8.11 | 1.82 | 80.86
ViT | Left | 0.0001 | 20.06 | 6.35 | 82.81
ViT | Right | 0.0001 | 20.20 | 6.35 | 74.60
VGG16 | Left | 0.001 | 25.70 | 11.78 | 80.31
VGG16 | Right | 0.001 | 25.57 | 11.78 | 73.68
ResNet-50 | Left | 0.001 | 18.67 | 7.98 | 76.61
ResNet-50 | Right | 0.001 | 18.59 | 7.98 | 74.46
Table 11. Comparison of recognition performance and computational cost for girder presence/absence classification using image splitting.

Classification Model | Split Part | Learning Rate | Time/Epoch (s) | Peak VRAM (GB) | Accuracy (%)
ResNet-18 | Left | 0.001 | 8.03 | 1.82 | 96.88
ResNet-18 | Right | 0.001 | 8.01 | 1.82 | 94.54
ViT | Left | 0.0001 | 20.04 | 6.35 | 97.46
ViT | Right | 0.0001 | 20.07 | 6.35 | 93.76
VGG16 | Left | 0.001 | 25.90 | 11.78 | 96.10
VGG16 | Right | 0.001 | 25.83 | 11.78 | 95.32
ResNet-50 | Left | 0.001 | 18.12 | 7.98 | 95.71
ResNet-50 | Right | 0.001 | 18.35 | 7.98 | 94.35
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
