Towards Industrial Surface Roughness Screening from OCT Images Using a Multimodal Large Language Model

Sabuncu, Metin; Avci, Sonay Onur

doi:10.3390/app16126010


by Metin Sabuncu ¹,* and Sonay Onur Avci ²

¹ Department of Electrical and Electronic Engineering, Dokuz Eylul University, İzmir 35390, Türkiye

² The Graduate School of Natural and Applied Sciences, Dokuz Eylul University, İzmir 35390, Türkiye

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(12), 6010; https://doi.org/10.3390/app16126010 (registering DOI)

Submission received: 11 May 2026 / Revised: 6 June 2026 / Accepted: 11 June 2026 / Published: 13 June 2026

(This article belongs to the Special Issue Future Applications of Large Language Models)


Abstract

Rapid and non-contact surface inspection is essential for quality control in modern production. Optical coherence tomography (OCT) can image a surface without contact, but turning those images into roughness parameters usually requires specialized processing software. This study examined whether a multimodal large language model (LLM) could estimate roughness parameters directly from OCT B-scans as a screening tool. The study was designed as a controlled macro-scale proof of concept using periodic, analytically defined phantoms rather than as validation on stochastic industrial micro-roughness. Five test surfaces with exactly known geometries were designed, 3D-printed, and scanned with a spectral-domain OCT system. For each surface, roughness values were computed from the theoretical shape, extracted from the OCT image using MATLAB, and also estimated by the LLM from the same image. The repeatability of the LLM was checked by running the same prompt ten times per surface. On a sawtooth profile, the LLM estimates varied by 3.8% for Ra, 4.2% for Rq, 3.5% for Rp, 2.8% for Rv, and 3.1% for Rt. Across all five surfaces, the variation in Ra and Rq was around 3–5%, and for Rt, it stayed below 5%. The results show that a generative AI approach can produce repeatable roughness estimates that are useful for comparative screening. This method offers a flexible option for surface comparison and AI-assisted quality control when calibrated measurements are not required.

Keywords: generative AI; multimodal large language models; optical coherence tomography; surface roughness; intelligent quality control; metrology; industrial inspection; non-contact measurement

1. Introduction

Fast, non-contact inspection of surfaces is a key requirement in automated production lines today. The shape and texture of a surface directly influence how a part performs mechanically, optically, and electrically [1]. In fields such as additive manufacturing, precision machining, microelectronics, polymer processing, and high-voltage insulation, even small topographic deviations can alter fatigue life, bond strength, heat transfer, or electric field behavior [2,3,4].

As manufacturing tolerances become stricter and quality assurance moves towards full automation, the industry needs measurement tools that are high-resolution, non-destructive, and fast enough for real-time use. These tools must quantify roughness and catch defects without sample preparation or physical contact [5,6,7,8].

Optical coherence tomography (OCT) is one such technique. It is a light-based, non-invasive imaging method that resolves features at the micrometer scale. OCT has gained ground in industrial inspection, where it is used to examine both surface and subsurface details [9,10]. Originally developed for medical imaging, OCT has been applied to non-destructive evaluation and metrology tasks: inspecting layered structures, monitoring quality in polymers, checking microfabricated components, studying cultural heritage objects, and analyzing additively manufactured parts [11,12,13,14,15,16]. Because OCT captures cross-sectional images (B-scans), it can be used to build two-dimensional height maps that follow standard surface texture protocols [17,18,19].

Spectral-domain OCT (SD-OCT) systems have improved speed and sensitivity through Fourier-domain signal processing, making it possible to scan large areas quickly for production monitoring [20,21,22,23]. Systems operating around 900–950 nm are well suited for polymers and 3D-printed materials, as near-infrared light in this window reveals both the surface and shallow subsurface structure [24,25,26]. Broadband illumination also reduces speckle noise, which improves image contrast and leads to more reliable surface detection [27]. For these reasons, OCT is a strong candidate for contact-free roughness assessment of 3D-printed parts. However, it must be noted that OCT cannot match the lateral resolution of stylus instruments or interferometric microscopes when measuring micro-scale roughness. For this reason, the present study concentrates on macro-scale structured surfaces whose features are clearly visible in OCT images. On such surfaces, traditional contact profilometry may be unsuitable because of material compliance, part geometry, or the risk of scratching.

Turning OCT scans into roughness numbers requires post-processing. The usual workflow includes detecting the surface boundary, filtering noise, correcting phase, flattening, leveling, calibrating pixels, and finally computing the profile parameters [28,29]. These steps need signal-processing knowledge, so users often rely on proprietary software. Such software can be rigid and may demand training, which slows down its uptake on the factory floor [30,31,32].

At the same time, large language models (LLMs) have changed what is possible in artificial intelligence. Models such as ChatGPT, Gemini, Claude, Grok, and LLaMA now process images alongside text and structured data [33,34]. They can handle calculations, read images, and reason about physical quantities in engineering and inspection tasks [35,36]. In medicine, multimodal LLMs have already been tested on OCT scans for tasks like report writing [37,38]. In manufacturing, LLMs have been used for fault finding and quality checks, for instance, in textile inspection [39,40,41].

To our knowledge, no study has systematically tested whether a multimodal LLM can estimate surface roughness parameters directly from OCT images. Most OCT roughness studies use either hand-crafted algorithms or AI methods that need supervised training on large, labeled datasets [42]. LLMs, by contrast, could extract information from an OCT image and compute roughness metrics without any custom feature extraction, provided the right prompt is given. They might also combine data from a CAD model with what they see in the scan. This could greatly reduce the burden of writing and maintaining dedicated image-processing code, though expert judgment is still needed to design good prompts and to interpret the outputs [43]. The proposed LLM-based workflow should not be interpreted as eliminating all OCT data preparation. OCT system-level reconstruction, pixel-to-physical unit scaling, calibration, and image export remain necessary. The intended simplification is the removal of a dedicated application-specific surface extraction and parameter computation stage.

This possibility suggests a new approach: LLM-assisted OCT surface screening, where a general-purpose multimodal model supports part of the analysis. To test this idea, we built reference surfaces whose exact shape was known from the CAD file. The test objects were deliberately restricted to macro-scale, periodic, and analytically defined geometries so that theoretical reference values, MATLAB-based OCT extraction, and LLM-based estimates could be compared under controlled conditions. These controlled structures let us check whether LLM-based analysis can rank surfaces by roughness and give repeatable results, the two things a screening tool must do well.

Using phantoms to test imaging systems is standard in medical OCT [44]. We followed the same principle but designed our phantoms for surface-texture analysis, not to imitate tissue [45]. The 3D-printed surfaces had precise, predefined shapes, so we could compute theoretical roughness parameters and compare them with the OCT-based values. Even though 3D-printed phantoms regularly appear in OCT research, they have not been used before to test AI- or LLM-assisted roughness estimation. Therefore, our phantoms act purely as test objects. They help us examine the feasibility and limits of the LLM approach, without claiming that it replaces industrial profilometry.

We imaged the phantom surfaces with an SD-OCT system. The analysis brought together the OCT data, the reference CAD models (STL files), and the LLM reasoning. The LLM was shown the OCT B-scan and a prompt with the physical dimensions, and it was asked to estimate the roughness parameters based on the visible surface profile.

The study examines whether a multimodal LLM can serve as a screening aid for OCT-based roughness. The aim is not to turn the LLM into a measurement instrument, but to see if its outputs can support comparative screening decisions. The engineered phantoms, with their analytically defined geometries, allow controlled and repeatable comparisons with theoretical values and with conventional MATLAB-based extraction. The results show that OCT imaging, together with a multimodal LLM, can act as a proof of concept for AI-assisted screening in quality-control workflows where metrological traceability is not required. The work thus offers a practical generative AI case for intelligent industrial inspection.

2. Materials and Methods

Five reference surface profiles (G1–G5) were designed for the study. Their exact shapes are shown in Figure 1. The geometries were chosen because their roughness parameters can be calculated directly from theory and because they can be printed reliably with a resin 3D printer. Working with analytically defined shapes makes it possible to compare theoretical roughness with the values obtained later from OCT scans.

Each phantom had a flat 20 mm × 20 mm base and a patterned top face. All vertical features were kept within 1.7 mm height, which matches the axial imaging range of the OCT system. In this way, the whole surface profile fits into a single cross-sectional image. The patterns were periodic and repeated evenly over the entire phantom.

The five geometries were as follows:

Sawtooth: a triangular wave with sharp peaks and straight ramps.

Rounded ridge: semicylindrical ridges that give smooth transitions between peaks and valleys.

Castle wall: wide rectangular steps with flat tops and deep valleys.

Grating: narrow rectangular steps spaced more densely than the castle wall.

Spike array: sharp, spike-like peaks that produce a high-kurtosis height distribution.

Figure 1 illustrates the idealized surface profiles corresponding to the five test geometries: (G1) sawtooth, (G2) rounded ridge, (G3) castle wall, (G4) grating, and (G5) spike array. Although the phantoms are three-dimensional objects, all surface roughness analysis in this study was performed on one-dimensional roughness profiles extracted along the OCT scan direction. The same geometry was replicated orthogonally across the phantom surface to ensure consistency over the scanned area.

The general workflow of our experimental analysis is shown in Figure 2. The workflow consists of three parallel analysis branches: analytical calculation from ideal geometry, MATLAB-based extraction from the OCT B-scan, and LLM-based estimation from the same OCT image. The repeatability of the LLM branch was assessed by submitting the same image and prompt repeatedly and calculating the mean, standard deviation, and coefficient of variation of the outputs.

In the original LLM prompt, the global peak-to-valley parameter was requested using the symbol Rz. Following the reviewer’s comment on ISO nomenclature, this quantity is reported in the revised manuscript as Rt, because the LLM output corresponded to the total profile height over the full 10 mm evaluation length, calculated as Rp + Rv. The numerical values were not changed.

The analytically defined phantom surfaces were (i) designed in CAD and exported as STL files, (ii) fabricated via 3D printing, (iii) imaged using SD-OCT to obtain B-scan cross-sections, and (iv) analyzed in three parallel ways: (1) ideal/reference surface roughness (SR) computed directly from the geometry, (2) experimental SR computed from the OCT B-scans using a MATLAB surface-extraction routine, and (3) LLM-assisted SR estimated from the same OCT B-scans using ChatGPT.

All phantom surfaces were fabricated using resin-based additive manufacturing. The geometries were designed in CAD and exported as STL files, which were printed using a commercial vat photopolymerization process with Anycubic Basic photopolymer resin. This printing approach provides reliable surface formation, making it suitable for producing periodic test phantoms. The printed samples were post-processed according to standard procedures and used directly for OCT imaging.

All samples were imaged using an SD-OCT system with a broadband illumination source centered at 930 nm, which reduced speckle noise. The axial imaging depth of 1.7 mm was sufficient to fully capture the engineered surface features of all phantom geometries. The remaining system parameters were an axial scan rate of 1.2 kHz, an axial resolution of 7 μm in air, and a lateral resolution of approximately 8 μm in the standard configuration. These parameters are sufficient to resolve the macro-scale surface features (0.1–0.5 mm amplitude) investigated in this study.

Each OCT B-scan consisted of 512 × 512 pixels, corresponding to an axial field of view of 1.7 mm and a lateral scan length of 10 mm. The lateral scan range was deliberately set to 10 mm to match the evaluation length used in all surface roughness computations. This ensured direct correspondence between OCT-derived profiles, analytical reference profiles, and numerical models.
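The scan geometry stated above fixes the pixel pitch in both directions. A minimal sketch of that arithmetic follows; the variable names are illustrative and are not taken from the authors' software.

```python
# Scan geometry as stated in the text: 512 x 512 pixels,
# 10 mm lateral scan length, 1.7 mm axial field of view.
LATERAL_MM = 10.0
AXIAL_MM = 1.7
N_COLS = 512
N_ROWS = 512

dx_mm = LATERAL_MM / N_COLS   # lateral pixel pitch (mm per column)
dz_mm = AXIAL_MM / N_ROWS     # axial pixel pitch (mm per row)

# Number of axial pixels spanned by the smallest feature amplitude
# mentioned in the text (0.1 mm).
pixels_per_smallest_feature = 0.1 / dz_mm

print(f"lateral pitch: {dx_mm * 1000:.1f} um, axial pitch: {dz_mm * 1000:.2f} um")
print(f"0.1 mm amplitude spans about {pixels_per_smallest_feature:.0f} axial pixels")
```

The lateral pitch works out to roughly 19.5 μm per column and the axial pitch to roughly 3.3 μm per row. The lateral sampling interval is therefore coarser than the quoted 8 μm optical lateral resolution, while a 0.1 mm feature still spans about 30 axial pixels.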

Prior to image acquisition, the optical imaging height and focal position were adjusted so that the air–sample interface was fully contained within the axial scan window. This avoided truncation or saturation effects and ensured accurate capture of the surface profile.

The OCT image acquisition followed a standard SD-OCT configuration, as illustrated in Figure 3. The 930 nm infrared beam was delivered to the sample surface via an optical fiber. Light backscattered from the surface passed through a diffraction grating, which separated the spectral components. A collimating lens focused the dispersed light onto a CCD detector, enabling Fourier-domain signal reconstruction. The OCT control software reconstructed each B-scan and exported it as an uncompressed PNG image.

For quantitative surface roughness analysis, a single representative OCT B-scan was acquired for each geometry. The same B-scan was used for both MATLAB-based and AI-based analyses to ensure direct comparison under identical input data. This approach isolates differences arising from the analysis method rather than from scan-to-scan variability.

The five OCT B-scans used for quantitative analysis, corresponding to the sawtooth, rounded ridge, castle wall, grating, and spike array geometries, are presented in Section 3 without digital filtering, smoothing, or post-processing beyond OCT system reconstruction, so that all analyzed features originate directly from the acquired signal.

Surface roughness (SR) parameters were computed for ideal geometries and experimental OCT-derived profiles in accordance with ISO profile-based surface texture standards. No cut-off wavelength (λc) filter was applied because the deterministic periodic profiles contain no long-wavelength waviness components. Applying a filter would remove part of the designed geometry and reduce comparability among analytical theory, MATLAB extraction, and LLM estimates; thus, the unfiltered primary profile is the appropriate representation for this controlled study. Consequently, the parameters strictly correspond to the primary profile (P-parameters) as defined in ISO 21920-2 [46]. To be consistent with the surface roughness literature, we retain the conventional ‘R’ notation throughout.

In all cases, the surface was treated as a one-dimensional roughness profile (z(x)) evaluated over a fixed evaluation length (L) of 10 mm. The following standardized roughness parameters were calculated: arithmetical mean height (Ra), root-mean-square height (Rq), maximum peak height (Rp), maximum pit depth (Rv), maximum profile height (Rt = Rp + Rv), skewness (Rsk), and kurtosis (Rku). These parameters were computed using continuous formulations consistent with ISO definitions, where Ra and Rq represent first- and second-order amplitude measures, while Rsk and Rku characterize the asymmetry and peakedness of the height distribution, respectively. In this study, the parameters Rp, Rv, and Rt denote the global maximum values over the full evaluation length rather than averages over sampling sections. Table 1 lists these SR parameters and their mathematical definitions.

The function z(x) denotes the leveled one-dimensional roughness profile and represents the vertical deviation of the surface height from the mean line of the profile at lateral position x, expressed in physical units (mm). The mean line is defined as the arithmetic mean of all z(x) values over the evaluation length. The term “leveled” indicates that the raw surface profile has been detrended to remove any mean-line offset.
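The parameter definitions above can be sketched in discrete form for a uniformly sampled profile. This is an illustrative Python version, not the authors' MATLAB code; the function name, the 1 mm period of the test wave, and the mid-pixel sampling are assumptions made for the example.

```python
import numpy as np

def roughness_parameters(z_mm):
    """Profile parameters of a 1-D height profile sampled uniformly in x (mm)."""
    z = np.asarray(z_mm, dtype=float)
    z = z - z.mean()                 # level: heights relative to the mean line
    ra = np.mean(np.abs(z))          # arithmetical mean height
    rq = np.sqrt(np.mean(z ** 2))    # root-mean-square height
    rp = z.max()                     # highest peak above the mean line
    rv = -z.min()                    # deepest pit below the mean line (positive)
    return {
        "Ra": ra, "Rq": rq, "Rp": rp, "Rv": rv, "Rt": rp + rv,
        "Rsk": np.mean(z ** 3) / rq ** 3,   # skewness of the height distribution
        "Rku": np.mean(z ** 4) / rq ** 4,   # kurtosis of the height distribution
    }

# Illustrative input only: an ideal triangular wave, 0.5 mm peak to valley,
# with a made-up 1 mm period, sampled at 512 points over 10 mm.
x = (np.arange(512) + 0.5) * (10.0 / 512)
phase = (x / 1.0) % 1.0
z = 0.5 * np.abs(2.0 * phase - 1.0)
params = roughness_parameters(z)
```

For this synthetic input the sketch gives Ra close to 0.125 mm and Rq close to 0.144 mm, in line with the theoretical sawtooth values quoted in the Discussion. Rt comes out slightly below 0.5 mm because the discrete samples do not land exactly on the peaks and valleys, which illustrates why the extreme-value parameters are more sensitive to sampling than Ra and Rq.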

For experimental analysis, OCT B-scan images were imported into MATLAB as grayscale intensity matrices corresponding to a 10 mm lateral scan length and a 1.7 mm axial range. Each image consisted of 512 × 512 pixels. The surface profile was extracted column-wise by detecting the uppermost high-intensity interface corresponding to the air–material boundary. MATLAB R2025a (The MathWorks, Inc., Natick, MA, USA) was used to perform the necessary calculations and processing operations.
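The column-wise detection described above can be sketched as follows. The paper does not state the detection criterion, so the fixed intensity threshold here is a hypothetical stand-in, and the function name and synthetic test image are likewise illustrative.

```python
import numpy as np

def extract_surface_profile(image, threshold, axial_range_mm=1.7):
    """Return the surface height (mm) in each column of a grayscale B-scan.

    image: 2-D array with row 0 at the top of the scan (shallowest depth).
    threshold: intensity at or above which a pixel counts as the interface
               (hypothetical criterion; not specified in the paper).
    """
    img = np.asarray(image, dtype=float)
    n_rows = img.shape[0]
    mask = img >= threshold
    first_row = mask.argmax(axis=0).astype(float)   # first bright pixel from the top
    first_row[~mask.any(axis=0)] = np.nan           # columns with no interface found
    depth_mm = first_row * (axial_range_mm / n_rows)
    return -depth_mm                                # deeper interface = lower height

# Synthetic check: a bright half-space whose top edge follows a known step.
edge_rows = np.where(np.arange(512) < 256, 100, 160)
scan = np.zeros((512, 512))
for col, row in enumerate(edge_rows):
    scan[row:, col] = 1.0
profile = extract_surface_profile(scan, threshold=0.5)
step_height_mm = profile[0] - profile[511]
```

The returned profile is in millimetres but is not yet leveled; subtracting its mean gives the z(x) used for the roughness parameters. On real OCT data a fixed threshold is sensitive to speckle and shadowing, so this should be read as the simplest possible variant of the idea.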

Surface roughness parameters were computed directly from the raw extracted OCT profiles without applying distortion correction, envelope rescaling, or spatial filtering. Minor deviations between analytical and MATLAB-computed values arise from numerical discretization and finite sampling, whereas larger differences reflect deviations of the fabricated surfaces from ideal geometry due to additive manufacturing limitations. MATLAB-derived roughness values obtained from raw OCT images were treated as the experimental reference.

For the AI-assisted analysis, the same OCT images were uploaded without preprocessing. In the following, this LLM-based workflow is referred to as AI-assisted surface roughness screening. The model was instructed to estimate the roughness parameters Ra, Rq, Rp, Rv, Rt, Rsk, and Rku over the 10 mm evaluation length.

Surface roughness screening from OCT B-scans was performed using a large vision-language model, ChatGPT (GPT-5.2, OpenAI). The model was accessed via the ChatGPT interface. The main estimation sessions were performed on 18 December 2025 and repeated/confirmed on 10 February 2026 (model version GPT-5.2). A later embedded-scale-bar check was performed on 30 May 2026 and 1 June 2026 (model version GPT-5.5 Thinking). The interface did not expose fixed seed, temperature, or top-p controls.

Because the LLM does not return the extracted height profile, its computational pathway cannot be audited. Thus, this method is intended for rapid comparative screening rather than metrology. The LLM does not perform explicit numerical surface extraction. Instead, it interprets OCT image structure and produces parameter estimates consistent with standard roughness descriptors. Therefore, the outputs represent image-based approximations rather than deterministic computations.

OCT B-scans were uploaded as PNG images exported directly from the OCT system. Images were not resized, filtered, or contrast-enhanced prior to upload. Axial and lateral scales were provided in the text prompt (10 mm lateral over 512 px; 1.7 mm axial over 512 px), and no graphical scale bars were embedded in the images. Therefore, the model relied on the stated physical scaling.

To assess repeatability, ten independent estimation sessions were performed using identical prompts and images, with each response treated as a new estimate. The LLM outputs were benchmarked against analytical (theoretical) values and MATLAB-extracted profiles to evaluate their usefulness for screening.
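The repeatability statistics for one parameter can be summarized as sketched below. The sample standard deviation (n - 1 denominator) and the near-zero cutoff for suppressing the CV are assumptions, since the paper does not specify either, and the input numbers are placeholders rather than study data.

```python
import statistics

def repeatability_summary(values, near_zero=0.05):
    """Mean, sample SD, CV (%), and range of repeated estimates of one parameter.

    near_zero is a hypothetical cutoff: below it the CV is suppressed because
    dividing by a mean close to zero makes the ratio numerically unstable.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 denominator), an assumption
    cv_percent = None if abs(mean) < near_zero else 100.0 * sd / abs(mean)
    return {
        "mean": mean,
        "sd": sd,
        "cv_percent": cv_percent,
        "min": min(values),
        "max": max(values),
    }

# Placeholder numbers chosen for easy arithmetic; they are NOT data from the study.
demo_runs = [10.0, 10.4, 9.6, 10.2, 9.8, 10.0, 10.4, 9.6, 10.2, 9.8]
summary = repeatability_summary(demo_runs)

# A parameter whose mean sits near zero (as reported for Rsk) gets no CV.
near_zero_summary = repeatability_summary([-0.02, 0.03, 0.01])
```

Reporting SD and min–max instead of CV for near-zero means is the same choice the text describes for Rsk.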

3. Results

This section presents the surface roughness results obtained using three approaches: analytical calculation, OCT measurements analyzed in MATLAB, and AI-driven analysis using a large language model. Five engineered surface geometries (G1–G5) were evaluated: sawtooth, rounded ridge, castle wall, grating, and spike array. For each geometry, roughness parameters were computed (i) theoretically from the ideal shape, (ii) conventionally from OCT images using MATLAB, and (iii) independently from the same OCT images using an AI-assisted analysis. This comparison evaluates the feasibility of AI-assisted surface roughness screening.

The analytically derived surface roughness parameters provide the theoretical reference based on the ideal geometric definitions. Because the surfaces are synthetically designed and periodic, roughness parameters were computed over a 10 mm evaluation length. The theoretical values distinguish the geometries by amplitude, spatial frequency, and symmetry. These analytical results establish the expected roughness behavior in the absence of fabrication or measurement imperfections and serve as the baseline for comparison with OCT and AI results.

Representative raw OCT images for each geometry are shown in Figure 4 without filtering or post-processing. In all cases, the primary surface interface is visible within the axial imaging range, confirming that the fabricated structures were captured by the OCT system. Deviations from the ideal geometry are observable, particularly for closely spaced features or curved regions, consistent with printing limits and finite OCT resolution. These exact scans were used for both the MATLAB-based and AI-assisted surface roughness analyses.

To illustrate the MATLAB-based surface roughness extraction, the castle wall geometry is presented as a representative example because its step features and symmetry allow clear visualization of surface extraction and roughness computation. Figure 5a shows the CAD/STL model of the castle wall surface and the cross-section used for roughness evaluation. Figure 5b shows the corresponding OCT B-scan scaled to physical units (axial range of 1.7 mm, lateral length of 10 mm, 512 × 512 pixels). The air–material interface was detected column-by-column as the uppermost high-intensity boundary and overlaid in green. The detected profile was converted from pixel units to millimeters using known OCT scaling factors and evaluated over a 10 mm length (Figure 5c). Surface roughness parameters were then computed from the extracted profile using ISO formulations (Figure 5d).

For the castle wall geometry, MATLAB yielded Ra = 0.120 mm and Rq = 0.127 mm. The maximum peak height and pit depth were Rp = 0.183 mm and Rv = 0.192 mm, giving Rt = 0.375 mm. The skewness and kurtosis were Rsk = −0.216 and Rku = 1.390, indicating a near-symmetric height distribution with low peakedness, consistent with a plateau-dominated step morphology. The same MATLAB procedure was applied to the remaining geometries, and the results are summarized in Table 2.

Figure 6 illustrates OCT-based surface roughness analysis using a large language model for the sawtooth geometry. The OCT B-scan again corresponds to a 10 mm lateral scan length and a 1.7 mm axial imaging depth, sampled on a 512 × 512 pixel grid. The periodic triangular surface morphology is clearly resolved along the air–material interface.

The OCT image was provided to the LLM without numerical preprocessing or reference values. The LLM was instructed to interpret the OCT B-scan as a one-dimensional surface profile along the lateral direction and compute the roughness parameters. For the sawtooth surface, the LLM estimated Ra = 0.122 mm and Rq = 0.141 mm. The extreme-value parameters were Rp = 0.248 mm and Rv = 0.283 mm, giving Rt = 0.531 mm. Skewness was near zero (Rsk = −0.03), and kurtosis was Rku = 1.84. Because the AI-based analysis and MATLAB used the same OCT scan, differences reflect differences in interpretation and numerical estimation rather than measurement variability.

Table 2 reports the surface roughness parameters for all five geometries, comparing analytical theory with OCT-based results obtained using MATLAB and an AI model. The analytical values represent the ideal reference. MATLAB provides a conventional OCT-based extraction, while AI values are obtained independently.

To separate LLM interpretation differences from deviations between the ideal CAD geometry and the fabricated samples, the LLM results were compared directly with MATLAB-based OCT extraction from the same B-scans in Table 3. This comparison is more appropriate for evaluating LLM visual inference because both methods operate on identical experimental OCT input data. The ideal theoretical values are retained as geometric references, but differences from theory include fabrication and imaging effects in addition to analysis-method differences.

To evaluate whether the LLM preserves comparative roughness trends, the five geometries were ranked according to Ra using both MATLAB-based OCT extraction and LLM-based estimation in Table 4. The ranking was identical for the two methods: rounded ridge, grating, castle wall, sawtooth, and spike array from lowest to highest Ra. This corresponds to a Spearman rank correlation of 1.00 for Ra. This result supports the use of the LLM approach for comparative macro-scale screening under controlled conditions, although it does not imply metrological accuracy for all roughness parameters.
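The rank correlation quoted above follows directly from the two orderings. A minimal sketch using the tie-free Spearman formula is given below; only the ordering stated in the text is used, and the helper name is illustrative.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))

# Ordering from lowest to highest Ra, as reported for both methods.
order = ["rounded ridge", "grating", "castle wall", "sawtooth", "spike array"]
matlab_rank = {name: i + 1 for i, name in enumerate(order)}
llm_rank = dict(matlab_rank)   # the text reports an identical ordering for the LLM

rho = spearman_rho([matlab_rank[g] for g in order],
                   [llm_rank[g] for g in order])

# For comparison: a hypothetical swap of two neighbouring surfaces.
rho_one_swap = spearman_rho([1, 2, 3, 4, 5], [1, 2, 4, 3, 5])
```

With five surfaces, a single swap of adjacent ranks would already lower the coefficient to 0.9, so a value of 1.00 on this sample size indicates identical ordering but carries limited statistical weight.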

To assess robustness, each estimation was repeated 10 times on the same scan. The method showed high repeatability with coefficients of variation (CV) for Ra and Rq below 5% across all geometries (see Table 5 and Table 6).

To evaluate LLM repeatability for the sawtooth geometry (G1), 10 independent estimations were performed on the same OCT B-scan using identical prompts and settings. CV is not reported for Rsk because the mean is close to zero, making CV numerically unstable; SD and min–max are provided instead.

The variability was low, with coefficients of variation of 3.8% for Ra and 4.2% for Rq. Similar repeatability was observed for Rp, Rv, and Rt. The repeatability analysis reflects inference repeatability: the stability of LLM outputs when the same B-scan and prompt are submitted repeatedly. It does not assess measurement repeatability across different OCT scans, scan positions, or sample regions. The use of the same representative B-scan was deliberate in order to compare MATLAB-based extraction and LLM-based estimation using identical OCT input data. Future studies should include multiple independent B-scans per surface to quantify scan-to-scan variability, spatial variability, and total measurement uncertainty.

These results indicate that the AI-based method provides stable roughness estimates when the input image and prompt are fixed.

LLM repeatability was evaluated by performing 10 independent estimations per geometry using identical prompts and input images.

Across all tested surfaces, the coefficient of variation ranged from 3 to 5% for Ra and Rq, and remained below 5% for Rt.

The results indicate stable LLM-based roughness estimation across different surface morphologies and roughness amplitudes.

4. Discussion

This study evaluated whether a multimodal LLM can serve as a screening tool for surface roughness assessment from OCT images. Two screening-relevant properties were examined: the ability to preserve the correct roughness ranking across different surface geometries and the repeatability of the numerical outputs under fixed conditions. The percentage deviations of each method from the theoretical ideal are reported in Table 2. Agreement between the two analysis approaches was closest for the amplitude parameters Ra and Rq. Larger differences appeared for the extreme-value parameters Rp, Rv, and Rt, and for the higher-order moments Rsk and Rku.

For the sawtooth surface, the LLM gave Ra = 0.122 mm and Rq = 0.142 mm. Theoretical values were 0.125 mm and 0.144 mm. MATLAB gave 0.121 mm and 0.140 mm. Similar proximity was observed for the castle wall and grating. For these geometries, Ra and Rq stayed within a few percent of theory and MATLAB. This indicates that average amplitude information in the OCT image is preserved well enough for both methods to recover similar values. Ra and Rq are widely used descriptors of surface quality in tribology, optics, and contact mechanics [47,48].
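The theoretical sawtooth values quoted above can be reproduced with a short derivation. It assumes an ideal wave with straight ramps and total height H, evaluated over a whole number of periods, so that the heights about the mean line are uniformly distributed on [-H/2, H/2]. The symbol H is introduced here for illustration.

```latex
\begin{align}
R_a &= \frac{1}{H}\int_{-H/2}^{H/2} |z|\,\mathrm{d}z = \frac{H}{4}, \\
R_q &= \sqrt{\frac{1}{H}\int_{-H/2}^{H/2} z^{2}\,\mathrm{d}z} = \frac{H}{2\sqrt{3}}, \\
R_{sk} &= \frac{1}{R_q^{3}}\cdot\frac{1}{H}\int_{-H/2}^{H/2} z^{3}\,\mathrm{d}z = 0, \\
R_{ku} &= \frac{1}{R_q^{4}}\cdot\frac{1}{H}\int_{-H/2}^{H/2} z^{4}\,\mathrm{d}z
        = \frac{H^{4}/80}{H^{4}/144} = \frac{9}{5}.
\end{align}
```

With H = 0.500 mm this gives Ra = 0.125 mm and Rq ≈ 0.144 mm, matching the theoretical values stated above, together with Rsk = 0 and Rku = 1.8, which are close to the LLM estimates of −0.03 and 1.84 reported in Section 3.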

A preliminary scale-bar check was performed using the representative sawtooth OCT B-scan. A digital scale reference was embedded into the image, while the prompt was kept identical to the original prompt except for stating that the image included a visual scale reference. In this single-run check, the LLM estimated Ra = 0.127 mm, Rq = 0.146 mm, Rp = 0.279 mm, Rv = 0.253 mm, Rt = 0.532 mm, Rsk = 0.014, and Rku = 1.78. These values were close to the original LLM estimates for Ra, Rq, and Rt, while Rp, Rv, and Rsk moved closer to the MATLAB-based extraction. Because this check was performed once and for one geometry, it is treated as preliminary. Systematic testing of text-only scaling, embedded scale bars, and axis-annotated OCT images remains future work.

Repeatability was evaluated using ten runs on the same image. For G1, the coefficient of variation was 3.8% for Ra and 4.2% for Rq. Across all geometries, Ra and Rq variation stayed between 3% and 5%. These values indicate stable outputs under fixed image and prompt conditions. As noted in Section 3, this is inference repeatability rather than measurement accuracy: it does not capture variation across different OCT scans, scan positions, or sample regions, which future studies should quantify using multiple independent B-scans per surface.

The repeatability analysis used ten independent LLM runs for each OCT image. This approach provides an initial indication of consistency across runs. However, ten runs do not provide a statistically exhaustive characterization of the model output distribution. A larger sample size, such as 30 runs or more, would offer more robust statistical estimates. Specifically, it would improve the calculation of the coefficient of variation and confidence intervals. For this study, the original dataset of ten repeated runs was maintained. This decision ensured that the repeatability data matched the exact model version and inference conditions used for the primary roughness estimates. Future research should evaluate larger datasets of repeated runs under controlled conditions. This will allow for a more complete quantification of the statistical robustness of LLM-based roughness estimation.

Differences increased for the extreme-value parameters. For the sawtooth geometry, Rt was 0.500 mm in theory, 0.521 mm from MATLAB, and 0.532 mm from the LLM. A similar trend was observed for the spike array. These parameters depend strongly on the correct identification of local peaks and valleys, and even small variations in the extracted profile can produce large changes in their values. In addition to algorithmic sensitivity, fabrication effects and the finite lateral and axial resolution of the OCT system influence the measured extremes. Step-like and high-gradient features commonly exhibit deviations from their ideal CAD definitions, which is consistent with known additive manufacturing effects such as edge rounding, material accumulation, and feature smoothing during photopolymerization [49].

The LLM tended to produce larger extreme values than MATLAB, indicating a higher sensitivity to visually prominent peaks and valleys in the OCT images. The most pronounced deviation occurred for the maximum pit depth Rv of the spike array geometry, where the LLM estimate exceeded the theoretical value by 146% (0.246 mm versus 0.100 mm). The MATLAB-based estimate also showed a substantial deviation (+38%), suggesting that the fabricated valleys were deeper or differently shaped than the ideal design, likely due to incomplete resin drainage and meniscus effects during printing. The additional amplification observed in the LLM output indicates that low-intensity or shadowed regions in the OCT image may be interpreted as deeper pits, further increasing the apparent valley depth.

From a methodological perspective, this behavior reflects the difference between deterministic profile extraction and image-based visual inference. The LLM does not explicitly reconstruct the surface profile but instead infers roughness characteristics from image appearance. As a result, it may emphasize visually salient features, particularly in regions with strong contrast or shadowing. This property can be advantageous for screening, as exaggerated valley depths may highlight potential structural weaknesses or defect-prone regions. However, it also introduces the possibility of overestimation when the outputs are interpreted quantitatively. Accordingly, extreme value parameters derived from the LLM should be understood as conservative, image-based approximations rather than metrologically traceable measurements. Therefore, LLM-derived Rp, Rv, and Rt values should not be used for critical defect-depth evaluation without independent validation.

In many engineering applications, the highest peaks and deepest valleys influence functional behavior such as contact, sealing, and fatigue initiation [50]. Sensitivity to these features may therefore be useful for conservative screening. At the same time, these parameters are the most sensitive to the detection strategy and should be interpreted cautiously.

Skewness and kurtosis varied more than Ra and Rq. For the castle wall, Rsk changed from 0.000 in theory to −0.216 in MATLAB and +0.151 in the LLM. These parameters depend on the height distribution. Small detection differences can change their sign. The general trend across geometries remained reasonable, with negative values for valley-dominated shapes and positive values for peak-dominated shapes. Although absolute agreement with theory was limited, the qualitative behavior was preserved. Since the LLM does not provide the underlying extracted height profile or height histogram, Rsk and Rku should be interpreted as qualitative indicators of profile asymmetry and peakedness rather than as traceable quantitative metrology outputs.

The two methods operate differently. MATLAB computes roughness from an extracted height profile. The LLM analyses roughness from image appearance. The agreement for Ra and Rq indicates that amplitude information is visible in the OCT data. Divergence for higher-order parameters shows sensitivity to interpretation. This behavior is consistent with prior work showing that OCT-based surface analysis depends on interface detection strategy and image quality [51,52,53]. A limitation of the present validation strategy is that the MATLAB-based extraction is not an independent ground-truth measurement. Both MATLAB and LLM analyses are based on the same OCT B-scans and therefore share possible OCT-related error sources, including finite resolution, speckle, shadowing, and calibration uncertainty. The MATLAB results are used here as a reproducible OCT-derived reference for method comparison, not as a traceable metrological standard. Future work should compare both OCT-based methods with independent reference measurements such as contact profilometry, confocal microscopy, or interferometric profilometry.

The present study was designed as a controlled proof of concept. Deterministic MATLAB or Python scripts remain the preferred approach when a validated surface-extraction algorithm is available. Such scripts are faster, cheaper, auditable, and fully reproducible. The LLM-based approach investigated here is therefore not proposed as a replacement for conventional image processing or calibrated metrology. Its potential value lies in a different use case: low-code exploratory screening, rapid interpretation of heterogeneous image formats, integration of visual inspection with textual metadata, and preliminary decision support for users who may not have access to customized surface-analysis software. These potential advantages must be balanced against important drawbacks, including non-deterministic outputs, lack of an auditable extracted height profile, possible visual hallucination, latency, computational cost, and dependence on the model version.

Accordingly, the phantoms were deliberately chosen to be macroscopic, periodic, and high-contrast. The findings of this study are limited to macro-scale, high-contrast, periodic surfaces with clearly visible OCT interfaces. They should not be generalized to stochastic micro-roughness, multi-material industrial surfaces, or surfaces with strong scattering, transparency, saturation, or poor interface visibility without further validation.

The analytically defined phantoms provided a baseline against which the LLM’s performance could be quantitatively assessed. Further extension to stochastic micro-roughness typical of industrial surfaces is a natural next step. The axial and lateral resolutions of the OCT were appropriate for the feature sizes investigated. For substantially finer textures, higher-resolution imaging systems can be applied. The computed parameters correspond to the primary profile (P-parameters) as defined in ISO 21920-2. For surfaces carrying superimposed waviness, appropriate filtering according to the relevant standard would be mandatory.

One representative B-scan per geometry was analyzed. This choice was deliberate: it allows MATLAB-based extraction and LLM-based estimation to be compared on identical OCT input data, and it follows the practice common in many industrial OCT inspections, where assessment is based on a limited number of scans. The design isolates differences caused by the analysis method rather than differences caused by scan position, local surface variability, or sample-to-sample variation. However, it also means that spatial variability across the surface was not evaluated. Therefore, the reported repeatability reflects LLM inference repeatability under fixed image and prompt conditions, not full OCT measurement repeatability or three-dimensional surface robustness. Future work should include multiple B-scans per sample or volumetric OCT data to assess scan position dependence and spatial variability. Certain industrial applications would benefit from incorporating multi-scan sampling and formal uncertainty evaluation.

The LLM outputs were obtained with a specific commercial model accessed through a web interface. The exact prompt (Appendix B) and all raw outputs (Appendix A) are provided to enable verification of the reported statistics.

The present workflow was performed through a web-based LLM interface and is not suitable for millisecond-level inline inspection. Individual queries may take several seconds and can depend on server load, network conditions, and response length. Therefore, the method should be interpreted as an offline or near-line decision-support tool. Real-time implementation would require API-based or local model deployment, fixed model versions, optimized prompting, and dedicated latency benchmarking.

The LLM’s internal estimation path cannot be inspected, and this black-box nature limits interpretation and generalization. We cannot know whether the model extracts a true height profile or simply maps image textures to numbers. Unlike MATLAB extraction, the LLM does not return an auditable surface profile, boundary location, or height histogram. Therefore, it is not possible to determine whether a given estimate was based on the true air–material interface, image contrast, shadowing, learned shape priors, or prompt-dependent reasoning. This lack of traceability limits the method for calibrated metrology and increases the risk of poor generalization to other OCT systems, materials, and surface types. The observed repeatability confirms stable outputs under fixed conditions, but it does not guarantee that the results will generalize to other surface types. Systematic biases could exist that happen to align with our test geometries. These limits are acceptable for a screening tool, which only directs attention to suspect surfaces for further inspection, but they prevent metrological use. The LLM should therefore be regarded as an exploratory screening assistant rather than a traceable measurement instrument. Future work could ask the model to output intermediate steps, such as extracted profile coordinates. This might improve transparency and bring the method closer to auditable measurement. Additionally, one could test different material classes, including metallic, polymeric, textile, and additively manufactured surfaces with stochastic roughness.

Taken together, the results demonstrate that OCT images of the tested structured geometries contain sufficient information for approximate roughness analysis using both deterministic extraction and image-based generative AI interpretation. The LLM-based approach produced repeatable roughness estimates that preserved the approximate amplitude ranking of the tested surfaces, particularly for Ra and Rq. Therefore, its most appropriate role is comparative amplitude screening rather than critical metrological evaluation. It may serve as a complementary tool for rapid, non-contact inspection when comparative surface assessment is the goal and metrological traceability is not a regulatory requirement. However, LLM-derived extreme-value parameters and higher-order descriptors should not be used for critical defect-depth or functional surface qualification without independent validation.

This work establishes a practical foundation for AI-assisted surface inspection within the Industry 4.0 framework.

5. Conclusions

This study tested whether a multimodal LLM can serve as a screening tool for surface roughness from OCT B-scan images. MATLAB-based extraction and analytical geometry provided the reference values. Across five engineered surface geometries, the LLM produced amplitude estimates that agreed reasonably with theory and MATLAB. The LLM also ranked the surfaces by Ra in the same order as the MATLAB-based extraction. Agreement was strongest for Ra and Rq, which are less sensitive to local extremes. The repeatability of the LLM outputs was high, with coefficients of variation below 5% for Ra, Rq, and Rt.

Larger deviations appeared for extreme-value parameters on surfaces with sharp or curved features. This shows that the model is sensitive to visually prominent peaks and valleys. This sensitivity can be useful for conservative screening. Higher-order moments kept their qualitative trends but had limited numerical agreement with the design. This reflects both fabrication deviations and the inherent sensitivity of these parameters.

The method is a proof of concept for rapid, non-contact roughness screening. It is not a replacement for calibrated profilometry. Therefore, LLM-derived extreme-value parameters (Rp, Rv, Rt) must not be used for critical defect-depth evaluation or safety-relevant decisions without independent deterministic verification, as the model may hallucinate peaks or valleys from OCT shadowing and contrast artifacts.

Future work on more surface types, multi-scan sampling, and formal uncertainty evaluation will help mature the approach. This work shows that generative AI-based interpretation of OCT images can be a practical tool for image-based surface inspection. It is suited for cases where surface features are well resolved, and comparative ranking is the main objective.

Author Contributions

Conceptualization, M.S.; methodology, M.S. and S.O.A.; software, S.O.A.; validation, M.S. and S.O.A.; formal analysis, S.O.A.; investigation, M.S.; resources, M.S.; data curation, S.O.A.; writing—original draft preparation, M.S.; writing—review and editing, M.S. and S.O.A.; visualization, M.S.; supervision, M.S.; project administration, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw OCT B-scan images used for quantitative surface roughness analysis in this study are provided in Appendix A. All data supporting the reported results are included within the manuscript and Appendices A–C.

Acknowledgments

We thank Yener Simsek for his technical expertise and support with the 3D printing. During the preparation of this study, the authors used ChatGPT (OpenAI) to analyze optical coherence tomography (OCT) B-scan images to estimate surface roughness parameters. The authors reviewed and verified all outputs and take full responsibility for the content, analysis, and conclusions presented in this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
CAD	Computer-Aided Design
ISO	International Organization for Standardization
LLM	Large Language Model
NDE	Non-Destructive Evaluation
OCT	Optical Coherence Tomography
SD-OCT	Spectral-Domain Optical Coherence Tomography
SR	Surface Roughness
STL	Standard Tessellation Language (stereolithography file format)

Appendix A. Raw OCT B-Scan Images Used for Quantitative Analysis

This appendix presents the raw spectral-domain OCT B-scan images used for quantitative surface roughness analysis in this study. One representative unprocessed OCT image is provided for each surface geometry: sawtooth, rounded ridge, castle wall, grating, and spike array. All images correspond exactly to the scans analyzed using both the MATLAB-based and the ChatGPT-based analysis described in the main text.

The images are shown without filtering, smoothing, envelope correction, or post-processing of any kind. This is intended to demonstrate that the surface features used for roughness estimation are directly visible in the raw OCT data. Each B-scan has a lateral scan length of 10 mm and an axial imaging depth of 1.7 mm, sampled on a 512 × 512 pixel grid.
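The stated scan dimensions fix the physical size of one pixel. The sketch below derives it, assuming a uniform pixel pitch equal to span divided by pixel count and image row 0 at the top; the constant and function names are illustrative.

```python
LATERAL_SPAN_MM = 10.0  # lateral scan length stated above
AXIAL_SPAN_MM = 1.7     # axial imaging depth stated above
N_PIXELS = 512          # samples along each axis

lateral_mm_per_px = LATERAL_SPAN_MM / N_PIXELS
axial_mm_per_px = AXIAL_SPAN_MM / N_PIXELS


def rows_to_height_mm(row_indices):
    """Convert interface row indices (row 0 at the top) to heights in mm, measured upward."""
    return [(N_PIXELS - 1 - row) * axial_mm_per_px for row in row_indices]


print(round(lateral_mm_per_px * 1000, 1), "um per pixel laterally")
print(round(axial_mm_per_px * 1000, 2), "um per pixel axially")
```

Under these assumptions one axial pixel corresponds to roughly 3.3 µm, which is small compared with the 0.25 to 0.5 mm peak-to-valley heights of the phantoms.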

These raw OCT images serve as the direct input for both analysis approaches and support the reproducibility of the results reported.

Figure A1. Raw OCT B-scan of the sawtooth geometry used for surface roughness analysis.

Figure A2. Raw OCT B-scan of the rounded ridge geometry used for surface roughness analysis.

Figure A3. Raw OCT B-scan of the castle wall geometry used for surface roughness analysis.

Figure A4. Raw OCT B-scan of the grating geometry used for surface roughness analysis.

Figure A5. Raw OCT B-scan of the spike array geometry used for surface roughness analysis.

Figure A6. OCT B-scan of the sawtooth geometry with scale bars added. This image was not used for the main results, which were derived directly from the raw OCT data.

Appendix B. AI Prompt

“This image is an OCT B-scan of a surface. The lateral axis spans 10 mm over 512 pixels, and the vertical axis spans 1.7 mm over 512 pixels. Estimate the surface roughness parameters Ra, Rq, Rp, Rv, Rz, Rsk, and Rku based on the visible surface profile. Assume standard profile roughness definitions and provide numerical values in millimeters.”

Appendix C. Analytical Reference Surface Roughness Parameters

The theoretical surface roughness parameters for each geometry were computed analytically from the idealized one-dimensional profile definitions using ISO profile-based surface texture formulations. Calculations were performed over a 10 mm evaluation length, matching the OCT scan length. Because the geometries are periodic and analytically defined, closed-form expressions for Ra, Rq, Rp, Rv, Rt, Rsk, and Rku can be obtained directly from the known profile shapes. The resulting theoretical values are summarized in Table A1 and serve as the reference baseline for comparison with OCT-based MATLAB and ChatGPT analyses.
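As a worked instance of such a closed-form calculation, consider the sawtooth with Rt = 0.500 mm, so that the profile spans −A to +A about the mean line with A = 0.250 mm. Assuming ideal linear flanks and an evaluation length covering a whole number of periods, the heights are uniformly distributed on [−A, A], which gives:

$$Ra = \frac{1}{2A}\int_{-A}^{A} |z| \, dz = \frac{A}{2} = 0.125\ \text{mm}, \qquad Rq = \sqrt{\frac{1}{2A}\int_{-A}^{A} z^{2} \, dz} = \frac{A}{\sqrt{3}} \approx 0.144\ \text{mm}$$

$$Rsk = 0, \qquad Rku = \frac{\frac{1}{2A}\int_{-A}^{A} z^{4} \, dz}{Rq^{4}} = \frac{A^{4}/5}{A^{4}/9} = 1.8$$

These values agree with the sawtooth column of Table A1.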

Table A1. Comparison of theoretical surface roughness parameters.

SR\Geometry	Sawtooth	Rounded Ridge	Castle Wall	Grating	Spike Array
Ra (theory), mm	0.125	0.090	0.125	0.125	0.128
Rq (theory), mm	0.144	0.112	0.125	0.125	0.153
Rp (theory), mm	0.250	0.107	0.125	0.125	0.400
Rv (theory), mm	0.250	0.393	0.125	0.125	0.100
Rt (theory), mm	0.500	0.500	0.250	0.250	0.500
Rsk (theory)	0.000	−1.152	0.000	0.000	1.262
Rku (theory)	1.800	3.461	1.000	1.000	3.122

References

Sarieddine, R.; Kadiri, H.; Guelorget, B.; Le Cunff, L.; Alhussein, A.; Habchi, R.; Lérondel, G. A Review on Potential Mechanically Resistant Materials for Optical Multifunctional Surfaces: Bioinspired Surfaces with Advanced Properties. Adv. Mater. Interfaces 2024, 11, 2300793.
Wang, H.; Lee, Y.J.; Bai, Y.; Zhang, J. Post-Processing Techniques for Metal-Based Additive Manufacturing: Towards Precision Fabrication; CRC Press: Boca Raton, FL, USA, 2023.
Leach, R. (Ed.) Characterisation of Areal Surface Texture; Springer: Berlin/Heidelberg, Germany, 2013; Volume 1.
Townsend, A.; Senin, N.; Blunt, L.; Leach, R.K.; Taylor, J.S. Surface Texture Metrology for Metal Additive Manufacturing: A Review. Precis. Eng. 2016, 46, 34–47.
Sushil, S.K.; Ramkumar, J.; Chandraprakash, C. Surface Roughness Analysis: A Comprehensive Review of Measurement Techniques, Methodologies, and Modeling. J. Micromanuf. 2025, 8, 107–130.
Thomas, T.R. Roughness and Function. Surf. Topogr. Metrol. Prop. 2013, 2, 014001.
Hocken, R.J.; Pereira, P.H. (Eds.) Coordinate Measuring Machines and Systems, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2012; Volume 6.
Chen, Y.; Ding, Y.; Zhao, F.; Zhang, E.; Wu, Z.; Shao, L. Surface Defect Detection Methods for Industrial Products: A Review. Appl. Sci. 2021, 11, 7657.
Fu, M.Y.; Yin, Z.H.; Yao, X.Y.; Xu, J.; Liu, Y.; Dong, Y.; Shen, Y.C. The Progress of Optical Coherence Tomography in Industry Applications. Adv. Devices Instrum. 2024, 5, 0053.
Sabuncu, M.; Özdemir, H. Identifying Leather Type and Authenticity by Optical Coherence Tomography. Int. J. Cloth. Sci. Technol. 2024, 36, 1–16.
Huang, D.; Swanson, E.A.; Lin, C.P.; Schuman, J.S.; Stinson, W.G.; Chang, W.; Hee, M.R.; Flotte, T.; Gregory, K.; Puliafito, C.A.; et al. Optical Coherence Tomography. Science 1991, 254, 1178–1181.
Kennedy, B.F.; Kennedy, K.M.; Sampson, D.D. A Review of Optical Coherence Elastography: Fundamentals, Techniques and Prospects. IEEE J. Sel. Top. Quantum Electron. 2013, 20, 272–288.
Wolfgang, M.; Kern, A.; Deng, S.; Stranzinger, S.; Liu, M.; Drexler, W.; Haindl, R. Ultra-High-Resolution Optical Coherence Tomography for the Investigation of Thin Multilayered Pharmaceutical Coatings. Int. J. Pharm. 2023, 643, 123096.
Sabuncu, M.; Akdoğan, M. Photonic Imaging with Optical Coherence Tomography for Quality Monitoring in the Poultry Industry: A Preliminary Study. Braz. J. Poult. Sci. 2015, 17, 319–324.
Shirazi, M.F.; Park, K.; Wijesinghe, R.E.; Jeong, H.; Han, S.; Kim, P.; Jeon, M.; Kim, J. Fast Industrial Inspection of Optical Thin Film Using Optical Coherence Tomography. Sensors 2016, 16, 1598.
Bellezza Prini, C.; Buscaglia, P.; Olivero, M.; Re, A.; Grassini, S.; Vallan, A.; Perrone, G. Evaluation of a Budget Optical Coherence Tomography for Cleaning Treatments of Painted Ancient Artifacts. Meas. Sci. Technol. 2025, 36, 075206.
Sabuncu, M.; Özdemir, H. Contactless Measurement of Fabric Thickness Using Optical Coherence Tomography. J. Text. Inst. 2022, 113, 713–717.
Hutiu, G.; Dimb, A.L.; Duma, V.F.; Demian, D.; Bradu, A.; Podoleanu, A.G. Roughness Measurements Using Optical Coherence Tomography: A Preliminary Study. In Seventh International Conference on Lasers in Medicine; SPIE: Bellingham, WA, USA, 2018; Volume 10831, p. 108310E.
Palová, K.; Kelemenová, T.; Kelemen, M. Measuring Procedures for Evaluating the Surface Roughness of Machined Parts. Appl. Sci. 2023, 13, 9385.
Yang, X.; Zhang, Z.; Li, X.; Lin, H.; Lawman, S.; Stoyanov, S.; Zheng, Y. High-Speed Low-Cost Line-Field Spectral-Domain Optical Coherence Tomography for Industrial Applications. Opt. Lasers Eng. 2025, 184, 108631.
Sabuncu, M.; Özdemir, H. Optical Coherence Tomography Imaging Can Identify Merino Lambs’ Wool Using Automatic Machine Learning Vision. Text. Res. J. 2023, 93, 4611–4623.
Choma, M.A.; Sarunic, M.V.; Yang, C.; Izatt, J.A. Sensitivity Advantage of Swept Source and Fourier Domain Optical Coherence Tomography. Opt. Express 2003, 11, 2183–2189.
Leitgeb, R.; Hitzenberger, C.K.; Fercher, A.F. Performance of Fourier Domain vs. Time Domain Optical Coherence Tomography. Opt. Express 2003, 11, 889–894.
Stevens, L.M.; Tagnon, C.; Page, Z.A. “Invisible” Digital Light Processing 3D Printing with Near Infrared Light. ACS Appl. Mater. Interfaces 2022, 14, 22912–22920.
De Pretto, L.R.; Amaral, M.M.; Freitas, A.Z.D.; Raele, M.P. Nondestructive Evaluation of Fused Filament Fabrication 3D Printed Structures Using Optical Coherence Tomography. Rapid Prototyp. J. 2020, 26, 1853–1860.
Lauri, J.; Avsievich, T.; Sieryi, O.; Bykov, A.; Fabritius, T. 1.5-μm Optical Coherence Tomography for Quality Inspection of 3D-Printed Scattering Phantoms. In Proceedings of the 2024 IEEE International Instrumentation and Measurement Technology Conference, Glasgow, UK, 20–23 May 2024.
Yılmazlar, I.; Sabuncu, M. Implementation of a Current Drive Modulator for Effective Speckle Suppression in a Laser Projection System. IEEE Photonics J. 2015, 7, 6901166.
Lee, J.; Saleah, S.A.; Jeon, B.; Wijesinghe, R.E.; Lee, D.E.; Jeon, M.; Kim, J. Assessment of the Inner Surface Roughness of 3D Printed Dental Crowns via Optical Coherence Tomography Using a Roughness Quantification Algorithm. IEEE Access 2020, 8, 133854–133864.
Hofer, B.; Považay, B.; Hermann, B.; Unterhuber, A.; Matz, G.; Hlawatsch, F.; Drexler, W. Signal post processing in frequency domain OCT and OCM using a filter bank approach. In Three-Dimensional and Multidimensional Microscopy: Image Acquisition and Processing Xiv; SPIE: Bellingham, WA, USA, 2007; Volume 6443, pp. 129–134.
Thomas, T.R. Characterization of Surface Roughness. Precis. Eng. 1981, 3, 97–104.
Josso, B.; Burton, D.R.; Lalor, M.J. Wavelet Strategy for Surface Roughness Analysis and Characterisation. Comput. Methods Appl. Mech. Eng. 2001, 191, 829–842.
Chu, Z.; Weng, G.; Yu, L. Real-Time Industrial Surface Defect Detection Based on Lightweight Convolutional Neural Networks. Artif. Intell. Mach. Learn. Rev. 2024, 5, 36–53.
Matarazzo, A.; Torlone, R. A Survey on Large Language Models with Some Insights on Their Capabilities and Limitations. arXiv 2025, arXiv:2501.04040.
Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
Shojaee, P.; Meidani, K.; Gupta, S.; Farimani, A.B.; Reddy, C.K. LLM-SR: Scientific Equation Discovery via Programming with Large Language Models. arXiv 2024, arXiv:2404.18400.
Hu, M.; Ma, C.; Li, W.; Xu, W.; Wu, J.; Hu, J.; Wang, X.; Zhou, B. A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers. arXiv 2025, arXiv:2508.21148.
Betzler, B.K.; Chen, H.; Cheng, C.Y.; Lee, C.S.; Ning, G.; Song, S.J.; Tan, G.S.W.; Ting, D.S.W.; Wong, D.; Wong, T.Y. Large Language Models and Their Impact in Ophthalmology. Lancet Digit. Health 2023, 5, e917–e924.
Yang, X.; Xiao, Y.; Liu, D.; Zhang, Y.; Deng, H.; Huang, J.; Chen, M.; Ye, H.; Xu, C. Enhancing Doctor-Patient Communication Using Large Language Models for Pathology Report Interpretation. BMC Med. Inform. Decis. Mak. 2025, 25, 36.
Pang, Y.; Huang, T.; Wang, Q. AI and Data-Driven Advancements in Industry 4.0. Sensors 2025, 25, 2249.
Alsaif, K.M.; Albeshri, A.A.; Khemakhem, M.A.; Eassa, F.E. Multimodal Large Language Model-Based Fault Detection and Diagnosis in Context of Industry 4.0. Electronics 2024, 13, 4912.
Sabuncu, M.; Özdemir, H. Enhancing Textile Industry Quality Monitoring: Integrating ChatGPT and OCT for Advanced AI-Driven Solutions. J. Text. Inst. 2025, 117, 630–644.
Singh, P. Systematic Review of Data-Centric Approaches in Artificial Intelligence and Machine Learning. Data Sci. Manag. 2023, 6, 144–157.
Long, S.; Tan, J.; Mao, B.; Tang, F.; Li, Y.; Zhao, M.; Kato, N. A Survey on Intelligent Network Operations and Performance Optimization Based on Large Language Models. IEEE Commun. Surv. Tutor. 2025, 27, 3915–3949.
Stupic, K.F.; Ainslie, M.; Boss, M.A.; Charles, C.; Dienstfrey, A.M.; Evelhoch, J.L.; Finn, P.; Giaquinto, R.O.; Kaufman, P.A.; Koay, C.G.; et al. A Standard System Phantom for Magnetic Resonance Imaging. Magn. Reson. Med. 2021, 86, 1194–1211.
Valery, S.; Ksenia, K.; Viktor, D.; Elena, P.; Mikhail, A.; Ulyana, L.; Andrey, D. Fluorescence Imaging System for Biological Tissues Diagnosis: Phantom and Animal Studies. J. Biomed. Photonics Eng. 2020, 6, 010303.
ISO 21920-2:2021; Geometrical Product Specifications (GPS)—Surface Texture: Profile—Part 2: Terms, Definitions and Surface Texture Parameters. International Organization for Standardization: Geneva, Switzerland, 2021.
Eifler, M.; Brodmann, B.; Müller, A.; Seewig, J. Comprehensive analysis of surfaces featuring functional characteristics by angular-resolved scattering light measurement. In Optical Manufacturing and Testing; SPIE: Bellingham, WA, USA, 2024; Volume 13134, p. 1313408.
Suh, A.Y.; Polycarpou, A.A.; Conry, T.F. Detailed Surface Roughness Characterization of Engineering Surfaces Undergoing Tribological Testing Leading to Scuffing. Wear 2003, 255, 556–568.
Boschetto, A.; Bottini, L. Accuracy prediction in fused deposition modeling. Int. J. Adv. Manuf. Technol. 2014, 73, 913–928.
Liu, H.; Bhushan, B. Bending and fatigue study on a nanoscale hinge by an atomic force microscope. Nanotechnology 2004, 15, 1246–1255.
Kahatapitiya, N.S.; Saleah, S.A.; Seong, D.; Ravichandran, N.K.; Han, S.; Wijesinghe, R.E.; Jeon, M.; Kim, J. Optical Coherence Tomography for High-Precision Industrial Inspection in Industry 4.0: Advances, Challenges, and Future Trends. Laser Photonics Rev. 2026, 20, e02290.
Ghosh, S.; Knoblauch, R.; El Mansori, M.; Corleto, C. Towards AI driven surface roughness evaluation in manufacturing: A prospective study. J. Intell. Manuf. 2025, 36, 4519–4548.
Batu, T.; Lemu, H.G.; Shimels, H. Application of artificial intelligence for surface roughness prediction of additively manufactured components. Materials 2023, 16, 6266.

Figure 1. One-dimensional surface profiles used in this study for the five test geometries: (G1) sawtooth, (G2) rounded ridge (semicylindrical), (G3) castle wall (sparse rectangular), (G4) micro-grating (dense rectangular), and (G5) spike array.

Figure 2. Workflow of the proposed OCT and LLM surface roughness analysis. CAD test surfaces are first designed as STL files, 3D printed, scanned with OCT, and then analyzed using (1) reference roughness calculated from geometry, (2) MATLAB-based extraction from OCT, and (3) LLM-assisted analysis from OCT images. The LLM branch includes repeated inference using the same image and prompt to assess output repeatability under fixed input conditions.

Figure 3. Schematic of the SD-OCT imaging and AI-assisted surface roughness analysis.

Figure 4. Raw SD-OCT images used for surface roughness analysis for the five test geometries: (a) sawtooth, (b) rounded ridge (semicylindrical), (c) castle wall (sparse rectangular), (d) micro-grating (dense rectangular), and (e) spike array.

Figure 5. MATLAB-based surface roughness extraction for the castle wall geometry. (a) CAD/STL model with the cross-section used for roughness evaluation. (b) Raw OCT B-scan scaled to physical units, with the detected air–material interface overlaid in green. (c) Extracted surface profile over a 10 mm length. (d) MATLAB computed surface roughness parameters from the surface profile.

Figure 6. OCT surface roughness analysis using a large language model. (a) Raw OCT image acquired over a lateral length of 10 mm with an axial imaging depth of 1.7 mm, uploaded with the prompt. (b) Original LLM output for the sawtooth geometry. In the original output, the global peak-to-valley value was labeled Rz; in the revised manuscript, this same quantity is reported as Rt for consistency with ISO terminology.

Table 1. Surface roughness parameters used in this study.

Symbol	Parameter	Mathematical Definition
Ra	Arithmetical Mean Height	$Ra = \frac{1}{L} \int_{0}^{L} \left| z(x) \right| \, dx$
Rq	Root-Mean-Square Height	$Rq = \sqrt{\frac{1}{L} \int_{0}^{L} z(x)^{2} \, dx}$
Rp	Maximum Peak Height	$Rp = \max\left( z(x) \right)$
Rv	Maximum Pit Depth	$Rv = \left| \min\left( z(x) \right) \right|$
Rt	Maximum Profile Height	$Rt = Rp + Rv$
Rsk	Skewness	$Rsk = \frac{\frac{1}{L} \int_{0}^{L} z(x)^{3} \, dx}{Rq^{3}}$
Rku	Kurtosis	$Rku = \frac{\frac{1}{L} \int_{0}^{L} z(x)^{4} \, dx}{Rq^{4}}$
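The definitions in Table 1 can be sketched in discrete form for a uniformly sampled height profile. The snippet below is an illustration only and is not the MATLAB code used in the study: the function names are hypothetical, heights are taken relative to the mean line, and the ideal sawtooth uses an assumed 1 mm period (the period does not affect the resulting values).

```python
import math


def roughness_parameters(z):
    """Discrete versions of the Table 1 parameters for uniformly sampled heights z (mm)."""
    n = len(z)
    mean = sum(z) / n
    d = [v - mean for v in z]  # heights relative to the mean line
    ra = sum(abs(v) for v in d) / n
    rq = math.sqrt(sum(v * v for v in d) / n)
    rp = max(d)
    rv = abs(min(d))
    return {
        "Ra": ra,
        "Rq": rq,
        "Rp": rp,
        "Rv": rv,
        "Rt": rp + rv,
        "Rsk": sum(v ** 3 for v in d) / n / rq ** 3,
        "Rku": sum(v ** 4 for v in d) / n / rq ** 4,
    }


def sawtooth_profile(amplitude=0.25, period=1.0, length=10.0, n=10000):
    """Ideal ramp sawtooth spanning -amplitude..+amplitude, sampled at interval midpoints."""
    step = length / n
    profile = []
    for i in range(n):
        phase = ((i + 0.5) * step / period) % 1.0  # position within the period, 0..1
        profile.append(amplitude * (2.0 * phase - 1.0))
    return profile


params = roughness_parameters(sawtooth_profile())
print({name: round(value, 3) for name, value in params.items()})
```

For this ideal profile the computed values lie close to the theoretical sawtooth entries (Ra 0.125 mm, Rq 0.144 mm, Rt 0.500 mm, Rsk 0, Rku 1.8), which gives a quick sanity check on any implementation before it is applied to an extracted OCT profile.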

Table 2. Comparison of surface roughness parameters for the five test geometries obtained from analytical theory, MATLAB-based OCT extraction, and LLM-based analysis. Values in parentheses for Ra, Rq, Rp, Rv, and Rt denote the signed percentage deviation from the theoretical reference.

Surface Roughness\Geometry	Sawtooth	Rounded Ridge	Castle Wall	Grating	Spike Array
Ra (theory), mm	0.125	0.090	0.125	0.125	0.128
Ra (MATLAB)	0.121 (−3.2)	0.071 (−21.1)	0.120 (−4)	0.107 (−14.4)	0.141 (10.2)
Ra (LLM)	0.122 (−2.4)	0.098 (8.9)	0.119 (−4.8)	0.108 (−13.6)	0.140 (9.4)
Rq (theory), mm	0.144	0.112	0.125	0.125	0.153
Rq (MATLAB)	0.140 (−2.8)	0.086 (−23.2)	0.127 (1.6)	0.120 (−4.0)	0.166 (8.5)
Rq (LLM)	0.142 (−1.4)	0.114 (1.8)	0.124 (−0.8)	0.119 (−4.8)	0.167 (9.2)
Rp (theory), mm	0.250	0.107	0.125	0.125	0.400
Rp (MATLAB)	0.278 (11.2)	0.168 (57)	0.183 (46.4)	0.199 (59.2)	0.423 (5.8)
Rp (LLM)	0.249 (−0.4)	0.221 (106.5)	0.188 (50.4)	0.186 (48.8)	0.438 (9.5)
Rv (theory), mm	0.250	0.393	0.125	0.125	0.100
Rv (MATLAB)	0.243 (−2.8)	0.184 (−53.2)	0.192 (53.6)	0.196 (56.8)	0.138 (38)
Rv (LLM)	0.283 (13.2)	0.248 (−36.9)	0.190 (52)	0.212 (69.6)	0.246 (146)
Rt (theory), mm	0.500	0.500	0.250	0.250	0.500
Rt (MATLAB)	0.521 (4.2)	0.352 (−29.6)	0.375 (50)	0.395 (58)	0.561 (12.2)
Rt (LLM)	0.532 (6.4)	0.469 (−6.2)	0.378 (51.2)	0.398 (59.2)	0.684 (36.8)
Rsk (theory)	0.000	−1.152	0.000	0.000	1.262
Rsk (MATLAB)	0.043	−0.344	−0.216	−0.339	1.136
Rsk (LLM)	−0.034	−0.210	0.151	0.287	1.070
Rku (theory)	1.800	3.461	1.000	1.000	3.122
Rku (MATLAB)	1.838	2.323	1.390	1.614	2.850
Rku (LLM)	1.848	2.180	1.343	1.549	2.860

Table 3. Relative difference between LLM-based estimates and MATLAB-based OCT extraction from the same B-scan.

Parameter	Sawtooth	Rounded Ridge	Castle Wall	Grating	Spike Array
Ra	0.8%	38%	−0.8%	0.9%	−0.7%
Rq	1.4%	33%	−2.4%	−0.8%	0.6%
Rp	−10.4%	31.6%	2.7%	−6.5%	3.5%
Rv	16.5%	34.8%	−1%	8.2%	78.3%
Rt	2.1%	33.2%	0.8%	0.8%	21.9%
Rsk (difference)	−0.077	0.134	0.367	0.626	−0.066
Rku	0.5%	−6.2%	−3.4%	−4%	0.4%

Relative differences were calculated as [(LLM − MATLAB)/MATLAB] × 100. For Rsk, the signed difference (LLM − MATLAB) is reported instead of a percentage because skewness is dimensionless and can change sign.
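
As a worked check of this formula, the following Python sketch recomputes the sawtooth column of Table 3 from the MATLAB and LLM values listed in Table 2. The function name `relative_difference` and the dictionary layout are illustrative assumptions; only the formula and the input numbers come from the tables.

```python
def relative_difference(llm, matlab):
    """Relative difference in percent: (LLM - MATLAB) / MATLAB * 100."""
    return (llm - matlab) / matlab * 100.0

# Sawtooth values taken from Table 2 (mm, except Rku, which is dimensionless).
matlab = {"Ra": 0.121, "Rq": 0.140, "Rp": 0.278, "Rv": 0.243, "Rt": 0.521, "Rku": 1.838}
llm = {"Ra": 0.122, "Rq": 0.142, "Rp": 0.249, "Rv": 0.283, "Rt": 0.532, "Rku": 1.848}

sawtooth = {name: round(relative_difference(llm[name], matlab[name]), 1) for name in matlab}

# Rsk is compared as a plain signed difference, not as a percentage.
rsk_difference = round(-0.034 - 0.043, 3)

print(sawtooth)
print("Rsk difference:", rsk_difference)
```

The same function applies to the signed percentage deviations in Table 2 if the theoretical value is passed as the reference in place of the MATLAB value.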

Table 4. Ranking of the five geometries by ascending Ra, as obtained from the MATLAB-based extraction and from the LLM.

Rank	Geometry	Ra (MATLAB), mm	Ra (LLM), mm
1	Rounded Ridge	0.071	0.098
2	Grating	0.107	0.108
3	Castle Wall	0.120	0.119
4	Sawtooth	0.121	0.122
5	Spike Array	0.141	0.140
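
The ranking comparison in Table 4 can be reproduced with a short Python sketch that sorts the geometries by Ra for each method and checks whether the two orders agree. The helper name `rank_by_ra` is a hypothetical choice; the Ra values are those listed in the table.

```python
# Ra values (mm) from Table 4.
ra_matlab = {
    "Sawtooth": 0.121,
    "Rounded Ridge": 0.071,
    "Castle Wall": 0.120,
    "Grating": 0.107,
    "Spike Array": 0.141,
}
ra_llm = {
    "Sawtooth": 0.122,
    "Rounded Ridge": 0.098,
    "Castle Wall": 0.119,
    "Grating": 0.108,
    "Spike Array": 0.140,
}

def rank_by_ra(values):
    """Return geometry names ordered from smoothest (lowest Ra) to roughest."""
    return sorted(values, key=values.get)

rank_matlab = rank_by_ra(ra_matlab)
rank_llm = rank_by_ra(ra_llm)
same_order = rank_matlab == rank_llm

for position, name in enumerate(rank_matlab, start=1):
    print(position, name)
print("Rankings agree:", same_order)
```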

Table 5. Repeatability of LLM-based roughness estimation for G1 (sawtooth) over ten runs using the same OCT scan and prompt.

Parameter	Mean (mm)	SD (mm)	CV (%)	Min (mm)	Max (mm)
Ra	0.122	0.0046	3.8	0.115	0.129
Rq	0.142	0.0059	4.2	0.133	0.151
Rp	0.249	0.0086	3.5	0.236	0.262
Rv	0.283	0.0080	2.8	0.271	0.294
Rt	0.532	0.0166	3.1	0.507	0.556
Rsk	−0.034	0.016	-	−0.060	−0.010
Rku	1.848	0.069	3.7	1.750	1.960
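
The CV column of Table 5 is consistent with the usual definition of the coefficient of variation, CV = SD / |mean| × 100. The Python sketch below recomputes it from the mean and SD columns. It assumes this definition, which the table does not state explicitly, and it omits Rsk because Table 5 reports no CV for that parameter. The name `cv_percent` is illustrative.

```python
def cv_percent(mean, sd):
    """Coefficient of variation in percent (assumed definition: SD / |mean| * 100)."""
    return sd / abs(mean) * 100.0

# (mean, SD) pairs from Table 5; Rsk is left out because no CV is reported for it.
table5 = {
    "Ra": (0.122, 0.0046),
    "Rq": (0.142, 0.0059),
    "Rp": (0.249, 0.0086),
    "Rv": (0.283, 0.0080),
    "Rt": (0.532, 0.0166),
    "Rku": (1.848, 0.069),
}

cv = {name: round(cv_percent(mean, sd), 1) for name, (mean, sd) in table5.items()}
print(cv)
```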

Table 6. Summary of LLM repeatability across the five surface geometries (10 independent runs per geometry).

Geometry	Ra CV (%)	Rq CV (%)	Rt CV (%)
Sawtooth	3.8	4.2	3.1
Rounded Ridge	3.4	3.6	3.0
Castle Wall	4.3	4.3	4.1
Grating	4.7	4.7	4.6
Spike Array	5.2	5.2	4.8

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

