1. Introduction
Structured light techniques, with their non-contact nature, high accuracy, and high sensitivity, have emerged as one of the most popular 3D measurement technologies [1,2,3]. They are widely applied in industrial manufacturing, including workpiece defect inspection, reverse engineering, and industrial material analysis. These techniques are based on the point-to-point triangulation principle, which ensures superior performance for diffuse-reflective surfaces. Even in the presence of highly reflective surfaces, numerous high dynamic range approaches have been developed to enhance measurement accuracy [4,5,6].
However, as measurement scenarios grow increasingly complex, the capability to measure only diffuse objects is no longer sufficient to meet modern application demands. When confronted with translucent or even transparent objects, conventional structured light techniques often fail. This is primarily due to the subsurface scattering effect in translucent media, which violates the fundamental point-to-point triangulation rule underlying structured light measurements. As a result, the acquired fringe patterns suffer from low modulation, low contrast, and low signal-to-noise ratio (SNR). The degraded structured light fringes introduce severe phase noise and geometric errors, making it extremely challenging to recover high-precision 3D morphology of translucent materials. High-precision measurement of translucent scattering media is critically important in industrial and medical applications, such as resin composite deformation analysis, jade artifact inspection, and 3D reconstruction of biological tissues [7,8,9]. Therefore, there is an urgent need to develop reliable methods for translucent objects.
To address these challenges, researchers have developed a series of innovative methods. Nayar et al. pioneered the analysis of illumination components in complex scenes, distinguishing between direct illumination (the desired signal) and indirect illumination (considered interference) [10]. The indirect contributions, including interreflections, subsurface scattering, ambient light, and their coupling, are collectively termed global illumination. Notably, subsurface scattering from translucent objects constitutes a major global illumination component. Building on this insight, Nayar proposed high-frequency illumination to separate direct and global illumination, effectively suppressing global components and demonstrating promise for translucent object measurement. Inspired by Nayar's work, Chen et al. introduced the modulated phase-shifting method, which employs high-frequency modulation of low-frequency patterns to isolate indirect light [11]. However, such high-frequency fringe projection strategies often result in low-contrast images, limiting their practical utility. Subsequently, Gupta et al. developed the micro-phase-shifting method and the XOR Gray-code method, enabling high-quality phase recovery without requiring indirect light separation [12,13]. By projecting two sets of high-frequency fringes and constructing all-high-frequency Gray-code patterns, these approaches achieve measurement under global illumination. However, they assume the presence of only low-frequency indirect components, thus offering only partial mitigation of indirect reflection effects.
Further advancements leveraged polarization properties to suppress scattering. Chen et al. combined polarization techniques with optical fibers to separate light intensities, achieving high fringe contrast and high-quality measurements of translucent objects [14,15]. Nevertheless, this method increases system complexity, involves cumbersome operations, and suffers from low measurement efficiency. Lutzke et al. employed Monte Carlo (MC) simulations to model translucent imaging, facilitating the study of error compensation algorithms [16,17]. Rao et al. proposed a local blur-based error compensation method, analyzing the phase errors induced by camera/projector defocus and approximating subsurface scattering as localized blur [18]. While effective, this method has two limitations: (1) it applies only to short-range scattering, and (2) it has low computational efficiency. Xu et al. established a relationship between phase error and fringe frequency, employing temporal denoising and improved phase unwrapping to significantly enhance measurement quality [19]. This method demonstrates that fringe noise can be effectively reduced by temporal averaging, thereby improving fringe modulation. However, it requires multiple temporal samples per fringe image, leading to high time costs.
Jiang et al. integrated single-pixel imaging (SPI) with structured light, achieving promising results for translucent measurements [20,21]. This technique calculates the optical transmission coefficient for each pixel, restoring high-modulation fringes that are not disturbed by scattered light. For instance, Lyu et al. applied SPI to underwater environments, effectively enhancing fringe contrast to overcome the scattering effect of turbid liquid [22]. However, SPI's need for massive image acquisitions severely restricts measurement speed, limiting it to static scenarios. Wu et al. later introduced multi-scale parallel SPI, drastically reducing the required fringe count and enabling dynamic measurements in complex scenes, including high-quality 3D reconstruction of translucent objects [23]. Despite this, SPI-assisted methods still cannot match the efficiency of conventional structured light techniques (e.g., nine frames for three-frequency phase unwrapping or seven for Gray codes). Moreover, parallel SPI demands substantial computational resources due to per-pixel calculation of light transport coefficients and direct-illumination peak localization.
Recently, learning-based methods have gained traction in structured light measurement, particularly for fringe analysis, error compensation, and phase recovery [24,25,26,27]. Feng et al. proposed the first deep learning-based fringe analysis framework [28], where a neural network directly maps single-frame fringes to the numerator/denominator images of the arctangent function, enabling single-frame phase extraction. Capitalizing on deep neural networks' powerful nonlinear fitting capability, this approach excels in fringe restoration and enhancement. In our latest work [29], we combined deep learning with Bayesian inference for scattering media measurement, enhancing degraded fringes into high-modulation patterns and significantly improving measurement quality. However, as a supervised learning method, it faces challenges in dataset preparation: ground-truth labels require spray-based coating, which may damage specimens and involves tedious operations.
In this work, we developed a self-learning fringe domain transformation framework to achieve fringe modulation enhancement and denoising for translucent object measurement. The proposed method establishes a cyclic generative architecture that systematically transforms degraded fringe patterns into high-modulation counterparts through a cycle-consistency mechanism. By integrating both numerical constraints on fringe intensity distributions and physical constraints governing phase relationships, the framework suppresses phase-shift errors and improves phase quality. The elimination of labeled data requirements represents a significant advancement, reducing dependence on curated datasets while maintaining measurement accuracy. Experimental validation across multiple scattering media demonstrates the method's robustness in recovering fine surface features under challenging optical conditions.
2. Methods
2.1. Subsurface Scattering in Structured Light Projection
In conventional structured light measurement systems operating under ideal conditions, the point-to-point triangulation principle governs the optical behavior: each camera pixel receives light exclusively from a corresponding projector pixel. This paradigm holds true for Lambertian surfaces, where the captured fringe patterns can be mathematically described as

$$I_n(x, y) = a(x, y) + b(x, y)\cos\!\left[\Phi(x, y) - \frac{2\pi n}{N}\right], \quad n = 1, 2, \ldots, N,$$

where a(x, y) represents the background intensity, b(x, y) denotes the fringe modulation, Φ(x, y) is the phase distribution, and N is the number of phase-shifting steps. When measuring translucent objects, the incident structured light partially penetrates the surface and undergoes multiple subsurface scattering events within the superficial layers before re-emerging. This subsurface scattering effect introduces significant deviations from ideal measurement conditions. As illustrated in Figure 1, for any measured pixel A, the received intensity incorporates not only its direct reflection component but also scattering contributions from neighboring surface points (A1–An), which violates the fundamental point-to-point measurement principle. As a result, the camera detects a composite signal comprising both direct surface reflection (the desired fringe) and subsurface scattering components (noise). For a fringe image, the intensity of each pixel is corrupted by the scattering noise, so the SNR and the contrast of the fringe are significantly reduced. Physically, this phenomenon can be modeled as the convolution of the ideal fringe patterns with a point spread function (PSF), yielding the degraded pattern expression

$$I_n^{d}(x, y) = R(x, y)\,I_n(x, y) + S_n(x, y) + E(x, y) + G(x, y),$$

where E(x, y) and R(x, y) denote the environmental intensity and the media reflectance, respectively; G(x, y) is the white random noise; and Sn(x, y) represents the scattering component,

$$S_n(x, y) = \sum_{i=1}^{H}\sum_{j=1}^{W} P(x - i, y - j)\, I_n(i, j),$$

where P(x, y) represents the PSF, and H and W represent the height and width of the fringe patterns. Consequently, translucent object measurement transforms the conventional point-to-point model into an area-to-point paradigm, manifesting as low fringe modulation and poor SNR. The coupled scattering components introduce systematic phase errors, fundamentally limiting measurement accuracy in such scenarios.
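To make this degradation model concrete, the following minimal Python sketch simulates how an ideal phase-shifted fringe is degraded by subsurface scattering. It assumes a Gaussian PSF and uses hypothetical parameter values; neither is specified in this paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Hypothetical parameters for illustration only.
H, W, N = 512, 512, 4      # image size and number of phase-shifting steps
period = 32                # fringe period in pixels
a, b = 0.5, 0.4            # background intensity a(x,y) and modulation b(x,y)

phase = 2 * np.pi * np.arange(W) / period          # Phi(x,y) of a flat plane

def ideal_fringe(n):
    """Ideal n-th phase-shifted fringe on a Lambertian surface."""
    row = a + b * np.cos(phase - 2 * np.pi * n / N)
    return np.tile(row, (H, 1))

def degraded_fringe(n, sigma=8.0, R=0.3, E=0.05, noise_std=0.01):
    """Degraded fringe: direct term R*I_n plus scattering S_n = P * I_n
    (modeled here as a Gaussian blur), environmental intensity E, and
    white random noise G."""
    I_n = ideal_fringe(n)
    S_n = gaussian_filter(I_n, sigma)              # scattering component
    G = np.random.normal(0.0, noise_std, I_n.shape)
    return R * I_n + S_n + E + G

fringes = [degraded_fringe(n) for n in range(1, N + 1)]
```

Because the blurred scattering term rides on top of the attenuated direct term, the simulated fringes exhibit exactly the low modulation and low contrast described above.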
2.2. Self-Learning-Based Fringe Domain Conversion
The developed self-learning-based fringe domain conversion framework aims to enhance the contrast and eliminate the noise of the fringes. It treats fringe patterns from translucent surfaces and ideal fringe patterns from diffuse surfaces as distinct image domains for cross-domain mapping; these two kinds of fringe patterns belong to the degraded domain and the enhanced domain, respectively. As shown in Figure 2, our approach implements a cyclic generative architecture that transforms low-modulation, low-SNR fringe patterns into high-quality sinusoidal fringes, employing two generators G and F and two discriminators DX and DY to establish a bidirectional fringe domain mapping.

First, generator G converts degraded-domain images to the enhanced domain, while F performs the inverse transformation. The discriminators maintain domain-specific characteristics by evaluating image authenticity. Cycle-consistency constraints [30] preserve structural integrity by minimizing the differences between original and reconstructed images. The system processes three key components simultaneously: the fringe terms M(x, y) and D(x, y), along with the wrapped phase φ(x, y), ensuring comprehensive fringe enhancement while maintaining phase accuracy. The wrapped phase is calculated as

$$\varphi(x, y) = \arctan\frac{M(x, y)}{D(x, y)}.$$

The normalized numerator M(x, y) and denominator D(x, y) of the arctangent function can be calculated by the following equation [31]:

$$M(x, y) = \frac{2}{N}\sum_{n=1}^{N} I_n(x, y)\sin\!\left(\frac{2\pi n}{N}\right), \qquad D(x, y) = \frac{2}{N}\sum_{n=1}^{N} I_n(x, y)\cos\!\left(\frac{2\pi n}{N}\right).$$
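This computation is straightforward to express in code; a minimal NumPy sketch, with the 2/N normalization following the equation above, is:

```python
import numpy as np

def fringe_terms(fringes):
    """Compute the normalized numerator M, denominator D, and the wrapped
    phase from an (N, H, W) stack of phase-shifted fringe images."""
    I = np.asarray(fringes)                       # shape (N, H, W)
    N = I.shape[0]
    n = np.arange(1, N + 1).reshape(-1, 1, 1)     # phase-shift indices
    M = (2.0 / N) * np.sum(I * np.sin(2 * np.pi * n / N), axis=0)
    D = (2.0 / N) * np.sum(I * np.cos(2 * np.pi * n / N), axis=0)
    phi = np.arctan2(M, D)                        # wrapped phase in (-pi, pi]
    return M, D, phi
```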
The proposed network architecture specifically addresses the challenges of processing low-SNR fringe patterns through specialized modules in both the generator and the discriminator, as shown in Figure 3a and Figure 3b, respectively. The generator utilizes the first stage of the HINet structure [32], which employs a 3 × 3 convolutional layer to preserve high-frequency fringe details while filtering random noise. Its encoder progressively extracts multi-scale features through four cascaded half-instance normalization blocks (Figure 4a), which uniquely combine instance and batch normalization to maintain fringe periodicity while suppressing scattering-induced artifacts. The integrated attention mechanism dynamically weights feature importance, effectively enhancing valid fringe signals over noise components. During decoding, four residual blocks with skip connections recover modulation details lost in the degraded patterns (Figure 4b), while 2 × 2 deconvolution layers precisely reconstruct the fringe spacing. Meanwhile, the supervised attention (SA) block actively suppresses invalid features caused by strong scattering (Figure 4c). The discriminator's 4 × 4 convolutional layers with LeakyReLU progressively learn hierarchical features from both the spatial and frequency domains, enabling robust identification of authentic fringe characteristics against scattering noise. Its final binary classifier focuses particularly on fringe continuity and modulation depth (>0.6 for enhanced patterns), ensuring physically plausible transformations. This comprehensive architecture demonstrates superior performance in preserving phase information while enhancing low-contrast fringes to near-ideal quality.
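For readers unfamiliar with half-instance normalization, the sketch below illustrates the core idea, following the general HINet design [32]; the layer sizes, activation slope, and residual path here are assumptions rather than the exact configuration of Figure 4a:

```python
import torch
import torch.nn as nn

class HalfInstanceNormBlock(nn.Module):
    """Simplified half-instance normalization block: instance normalization
    is applied to half of the channels while the other half passes through
    unchanged, then the two halves are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.norm = nn.InstanceNorm2d(out_ch // 2, affine=True)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        y = self.conv1(x)
        y1, y2 = torch.chunk(y, 2, dim=1)           # split channels in half
        y = torch.cat([self.norm(y1), y2], dim=1)   # normalize only one half
        y = self.act(self.conv2(self.act(y)))
        return y + self.skip(x)                     # residual connection
```

Normalizing only half of the channels stabilizes per-sample contrast statistics, while the unnormalized half retains the absolute intensity information needed to preserve fringe modulation.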
2.3. Design of the Loss Function
To achieve optimal generation performance while strictly maintaining cycle consistency, we designed a composite loss function comprising three critical components: the cycle consistency loss (Lcyc), the adversarial loss (Ladv), and the identity loss (Lide), with weighting coefficients (α = 10, β = 1, γ = 5) carefully balanced to prevent any single objective from dominating the training process. The total loss function can be expressed as

$$L_{\text{total}} = \alpha L_{\text{cyc}} + \beta L_{\text{adv}} + \gamma L_{\text{ide}}.$$

The empirically determined weightings (α > γ > β) reflect the hierarchical importance of these objectives, with the dominant cycle consistency term (α = 10) guaranteeing high structural similarity (SSIM) in round-trip transformations, while the identity loss (γ = 5) preserves the original fringe modulation in target-domain images. This balanced loss formulation demonstrates effectiveness in handling challenging scattering conditions.
Specifically, the cycle consistency loss is designed as

$$L_{\text{cyc}} = \mathbb{E}_{X}\big[\lVert F(G(X)) - X \rVert_{1}\big] + \mathbb{E}_{Y}\big[\lVert G(F(Y)) - Y \rVert_{1}\big],$$

where F(X) and F(Y) are the images generated after the input images pass through generator F. This cyclic consistency mechanism enforces bidirectional domain preservation through L1-norm constraints. When a degraded-domain fringe pattern X undergoes sequential transformations through both generators (X → G(X) → F(G(X))), the reconstructed pattern maintains essential structural similarity with the original input. Similarly, enhanced-domain patterns Y preserve their core features when processed through the inverse transformation path (Y → F(Y) → G(F(Y))).
The adversarial loss is defined as

$$L_{\text{adv}} = \mathbb{E}_{Y}\big[\log D_{Y}(Y)\big] + \mathbb{E}_{X}\big[\log\big(1 - D_{Y}(G(X))\big)\big],$$

where X and Y are the degraded-domain and enhanced-domain images, respectively, and G(X) is the image generated after the input image passes through generator G. This adversarial loss follows the conventional GAN formulation, in which the generator G learns to produce enhanced-domain patterns indistinguishable from real samples (Y ≈ G(X)), while the discriminator DY progressively improves its ability to differentiate between generated and authentic enhanced-domain fringes. Additionally, our adversarial training is augmented with total variation (TV) regularization (λ = 0.1) to suppress noise artifacts by minimizing pixel-wise intensity variations (TV loss = Σ|∇G(X)|), resulting in smoother output patterns than baseline implementations.
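A hedged sketch of these terms is given below. The binary cross-entropy form matches the log formulation above, and the λ = 0.1 TV weight follows the text; other GAN variants (e.g., least-squares) would be a drop-in replacement:

```python
import torch
import torch.nn.functional as nnF

def tv_loss(img):
    """Total variation: mean absolute horizontal and vertical gradients."""
    dh = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    dv = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    return dh + dv

def generator_adversarial_loss(D_Y, fake_Y, lam=0.1):
    """G is rewarded when D_Y classifies G(X) as real; the TV term
    penalizes pixel-wise intensity variations to suppress noise artifacts."""
    pred = D_Y(fake_Y)
    adv = nnF.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
    return adv + lam * tv_loss(fake_Y)

def discriminator_adversarial_loss(D_Y, real_Y, fake_Y):
    """D_Y separates authentic enhanced-domain fringes from generated ones."""
    real = D_Y(real_Y)
    fake = D_Y(fake_Y.detach())
    return (nnF.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + nnF.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
```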
The identity loss is expressed as

$$L_{\text{ide}} = \mathbb{E}_{Y}\big[\lVert G(Y) - Y \rVert_{1}\big] + \mathbb{E}_{X}\big[\lVert F(X) - X \rVert_{1}\big].$$

It serves as a regularizer to prevent unnecessary modifications to already domain-appropriate images, which is particularly crucial for maintaining fringe periodicity in enhanced-domain inputs, where excessive processing could alter valid sinusoidal patterns.
It is noted that the network training simultaneously optimizes the losses for M, D, and the calculated wrapped phase φ as follows:

$$L = \omega_{1} L_{\text{total}\_M} + \omega_{2} L_{\text{total}\_D} + \omega_{3} L_{\text{total}\_\varphi},$$

where Ltotal_M and Ltotal_D provide numerical constraints on the fidelity of the intensity transformation, Ltotal_φ enforces a physical constraint to mitigate phase-shift errors, and ω1–ω3 are the weights of each constraint. During training, Ltotal_M and Ltotal_D decrease faster than Ltotal_φ, and Ltotal_φ serves only as an auxiliary term constraining the suppression of phase-shift errors. Based on these considerations, the weight ω3 is set to a smaller value to prevent the early stage of training from focusing on reducing Ltotal_φ.
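The sketch below assumes the phase constraint is evaluated between an input pair (M, D) and its round-trip reconstruction, and uses illustrative weight values, since the paper specifies only that ω3 is kept small:

```python
import torch
import torch.nn.functional as nnF

def phase_constraint(M_in, D_in, M_cyc, D_cyc):
    """Physical constraint: the wrapped phase of the round-trip reconstruction
    should match that of the input; atan2 keeps the computation differentiable
    so gradients flow back through the generated M and D."""
    return nnF.l1_loss(torch.atan2(M_cyc, D_cyc), torch.atan2(M_in, D_in))

def total_objective(L_total_M, L_total_D, L_total_phi, w1=1.0, w2=1.0, w3=0.1):
    """Weighted sum of the two numerical constraints and the auxiliary
    phase constraint (w1-w3 are illustrative values)."""
    return w1 * L_total_M + w2 * L_total_D + w3 * L_total_phi
```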
2.4. Training Dataset and Implementation
In conventional supervised deep learning approaches for structured light measurement, each input fringe pattern requires precisely matched ground truth data. For translucent object measurement, obtaining ideal reference fringes typically involves coating specimens with diffusible sprays to convert their surface properties from translucent to Lambertian, thereby eliminating subsurface scattering effects. This process proves particularly cumbersome, often requiring several hours of drying time per specimen, which significantly impedes practical dataset preparation.
The proposed self-supervised framework fundamentally overcomes this limitation by eliminating the need for paired training data. Our method simply requires two distinct sets of unpaired fringe patterns: one collected from various translucent samples and another from diffuse reference targets, with no spatial or temporal correspondence needed between them. Specifically, we acquired fringe data from 10 different translucent specimens (including dental composites and silicone phantoms) and 10 diffuse reference samples, capturing multiple surface positions per sample. Through systematic data augmentation involving rotation (±30°), scaling (0.8–1.2×), and translation (±10% FOV), we generated 1000 real unpaired fringe patterns.
To substantially expand the diversity of training patterns, we leveraged our established physical simulation framework to generate 3000 simulation fringe pairs per domain, meticulously modeling the degraded domain through convolution of ideal fringes, while the enhanced domain comprised purely synthetic sinusoidal patterns with high modulation. These simulation datasets were deliberately generated through independent processes for each domain to preserve the essential unpaired characteristic of the training set. By integrating experimentally measured fringe patterns from physical specimens with carefully simulated data generated through our physics-based modeling framework, we established a comprehensive training dataset comprising 4000 samples each for both the degraded and enhanced domains, which served as the foundation for our self-supervised learning network. The strategic combination of empirical and simulated data enabled robust network training while maintaining the critical unpaired relationship between domains, with the physical measurements capturing authentic scattering phenomena and the synthetic data ensuring sufficient variation in fringe frequencies and modulation depths. This balanced dataset composition proved particularly effective for the cycle-consistent domain transformation task, as evidenced by the network’s ability to generalize across different material types.
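The augmentation described above maps naturally onto standard tooling; a minimal torchvision sketch (the crop size and transform ordering are assumptions) is:

```python
import torchvision.transforms as T

# Rotation within +/-30 degrees, scaling 0.8-1.2x, translation up to 10% of FOV.
augment = T.Compose([
    T.RandomAffine(degrees=30, translate=(0.1, 0.1), scale=(0.8, 1.2)),
    T.RandomCrop(512),   # match the 512 x 512 network input size
])
```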
The network training was implemented in PyTorch 2.7.1 with 512 × 512 pixel inputs to optimize GPU memory utilization on an NVIDIA RTX 4060 (8 GB VRAM) platform. The generator's learning rate follows a cosine annealing schedule, varying from an initial 2 × 10⁻⁶ to a minimum of 1 × 10⁻⁶. The discriminator adopts an epoch-based decay strategy that linearly reduces its learning rate from an initial value of 2 × 10⁻⁴. Training runs for 50 epochs and completes in approximately 9 h.
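A sketch of this schedule in PyTorch follows; the optimizer choice (Adam), the per-epoch stepping, and the discriminator's final decay factor are assumptions, while the learning-rate values follow the text:

```python
import torch

generator = torch.nn.Conv2d(1, 1, 3, padding=1)       # placeholder for G
discriminator = torch.nn.Conv2d(1, 1, 4, stride=2)    # placeholder for D_Y

G_opt = torch.optim.Adam(generator.parameters(), lr=2e-6)
D_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

EPOCHS = 50
# Generator: cosine annealing from 2e-6 down to a floor of 1e-6.
G_sched = torch.optim.lr_scheduler.CosineAnnealingLR(G_opt, T_max=EPOCHS,
                                                     eta_min=1e-6)
# Discriminator: epoch-based linear decay of the learning rate.
D_sched = torch.optim.lr_scheduler.LinearLR(D_opt, start_factor=1.0,
                                            end_factor=0.1, total_iters=EPOCHS)

for epoch in range(EPOCHS):
    # ... one epoch of adversarial training over the unpaired domains ...
    G_sched.step()
    D_sched.step()
```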
3. Experiments
We conducted multiple sets of 3D measurements on translucent media using a self-built mesoscale FPP measurement system, where the projector is a TI DLP2010EVM-LC with a resolution of 854 × 480 and the camera is a Basler daA720-520uc with a resolution of 720 × 540. The measurement volume is approximately 17 mm (length) × 10 mm (width) × 10 mm (depth). The test objects comprised various subsurface scattering materials, including optical glue, frosted glass, jade, and silicone. To validate the performance of the proposed self-learning method, the experimental results were compared with those obtained from traditional phase-shifting profilometry (PSP), the supervised learning-based method [29], and the spray-coating method. The spray-coating results were regarded as ground truth. In all experiments, multi-frequency heterodyne phase unwrapping was employed, followed by 3D reconstruction using a stereo calibration model.
The first experiment involved the measurement of optical glue. Figure 5a shows the region of interest on the test object. Figure 5b–e present the fringe term M computed using traditional PSP, the supervised learning-based method, the proposed method, and the spray-coating method, respectively. Traditional PSP exhibited significantly low modulation, whereas both deep learning-based methods enhanced the fringes to high quality. Due to non-uniform reflectivity, the spray-coated sample also displayed uneven fringe modulation. Figure 5f–i show the wrapped phase maps obtained by the four methods. Except for the noisy phase from traditional PSP, the other methods yielded satisfactory results, demonstrating the superiority of the proposed approach in fringe enhancement.
Figure 6 compares the 3D reconstructions obtained by the four methods. Severe phase noise in traditional PSP led to substantial geometric errors and depth information loss, resulting in a rough surface that poorly represented the true 3D profile. In contrast, both deep learning-based methods achieved complete and smooth 3D reconstructions. Notably, the proposed method produced a smoother surface than the supervised learning-based approach, though this may slightly compromise fine details in complex surface measurements. While the spray-coating method inherently avoids subsurface scattering effects, coating non-uniformity may introduce measurement inaccuracies.
The second experiment involved high-resolution measurement of a localized region on a frosted glass cup surface. Figure 7a delineates the region of interest (ROI). Figure 7b–e present the fringe patterns obtained from the four methods, with localized insets revealing critical details. Conventional PSP yielded severely degraded fringe quality, with near-complete fringe disappearance in specific regions (Figure 7b). The supervised learning approach demonstrated notable enhancement in low-modulation regions; however, its performance deteriorated in areas with abrupt surface gradients, manifesting as residual noise artifacts (Figure 7c). In contrast, our proposed method successfully reconstructed high-fidelity fringe patterns across all challenging regions (Figure 7d). This superior performance stems from our fringe-domain transformation model, which leverages advanced image generation capabilities to optimize local fringe quality through global feature integration. The close agreement between our results and the spray-coating reference (Figure 7e) further validates the method's reliability. The corresponding wrapped phase maps are shown in Figure 7f–i. The phase quality directly correlates with the fringe modulation characteristics, as expected from fundamental phase retrieval principles.
Three-dimensional reconstructions derived from these phase maps are presented in Figure 8. The conventional method exhibited significant data loss in high-gradient regions due to fringe extinction. While both deep learning approaches improved reconstruction completeness, residual artifacts persisted. Our method achieved superior topological continuity compared to the supervised alternative. Cross-sectional profiles (Figure 8e,f) provide a quantitative comparison. The conventional method's profile contained substantial noise, whereas the other three methods produced smooth contours. Notably, our results showed near-perfect overlap with the supervised learning output, demonstrating that our self-supervised approach achieves comparable performance to supervised methods in 3D metrology. Quantitative error analysis relative to the spray-coating reference is shown in Figure 9. Our method reduced the RMSE by 50% compared to conventional PSP. More significantly, it achieved an RMSE equivalent to that of the supervised method, confirming measurement accuracy parity while eliminating the need for labeled training data.
The third experiment evaluated the proposed method's performance on complex surfaces using a semi-transparent jade sample with intricate facial features (Figure 10). Both deep learning approaches successfully enhanced the degraded fringe patterns to high-modulation states (Figure 10c,d). While the spray-coating method exhibited a non-uniform modulation distribution, it preserved the finest details (Figure 10e), establishing an optimal reference for 3D reconstruction. Phase analysis revealed significant noise contamination in the conventional PSP results, leading to rough surface reconstructions. Although the learning-based methods produced cleaner phase maps, excessive smoothing caused subtle height information loss, a common trade-off in data-driven approaches. Three-dimensional reconstruction results are presented in Figure 11. The conventional method failed to capture complex geometrical variations (Figure 11a), while both learning-based approaches achieved comparable detail reconstruction (Figure 11b,c). The spray-coating reference (Figure 11d) demonstrated superior edge definition and natural transitions, highlighting the remaining challenges for computational methods. Quantitative error analysis (Figure 12) confirmed our method's parity with supervised learning in RMSE performance.
The last experiment measured the dynamic process of extrusion deformation of a translucent silica gel. Because the dynamic process is not repeatable, the results cannot be compared with those of the spray-coating method. We therefore compare the 3D shape reconstruction results of the three remaining methods across five frames. Figure 13a1,a2 show sketch maps of the measured object before and after deformation, and Figure 13a3 is the captured image of the measured object. Figure 13b–d show the results of traditional PSP, the supervised learning method, and the proposed method, respectively. The experimental results show that the proposed approach not only significantly outperformed conventional PSP under dynamic measurement conditions but also achieved reconstruction quality equivalent to supervised learning while maintaining robust resistance to motion-induced artifacts. These results collectively validate the method's applicability to time-varying measurements of scattering media, demonstrating performance parity with supervised approaches and representing a notable advancement in dynamic 3D metrology. It should be noted that, since the image acquisition and network processing programs have not yet been integrated, the experimental results were not inferred in real time; the offline inference speed of the proposed method is about 6 ms for each pair of M and D. The successful implementation under dynamic conditions highlights the method's practical utility for real-world applications involving deformable translucent materials.
4. Discussion
Multiple experiments on translucent materials have demonstrated that the proposed method exhibits superior performance in both measurement accuracy and efficiency. Compared to existing methods, its main breakthroughs are discussed as follows:
First, there is no sacrifice in measurement efficiency. Although existing methods such as temporal denoising [19], single-pixel imaging [20], and error compensation [18] can improve the measurement accuracy of translucent materials, they reduce measurement efficiency to varying degrees, making dynamic measurement difficult. For example, temporal denoising requires repeated sampling of the same scene n times, meaning n times more fringe patterns must be projected and captured. Single-pixel imaging also relies on a large number of Fourier basis patterns to recover the light transport coefficients (LTCs). Furthermore, it requires independent computation of the LTCs for each pixel, leading to significant computational overhead and time consumption. Error compensation methods, based on phase error models, require recovering the PSF for each pixel, which significantly reduces computational efficiency. In contrast, our method requires neither additional fringe images nor pixel-wise physical model calculations. The global image enhancement performed by the deep network enables the proposed method to achieve fast 3D measurement.
Second, there is no need for additional hardware. Some existing methods suppress subsurface scattering effects by leveraging the polarization properties of light [14,15]. This approach requires adding extra polarizing components to the system, which increases cost and system complexity to some extent. In comparison, our method does not rely on any additional hardware; it enhances the modulation and reduces the noise of the original degraded images, ultimately enabling flexible 3D measurement under scattering conditions.
Third, no paired labeled data are required for training. Existing deep learning-based image processing often depends on paired datasets [33,34]. While these supervised learning approaches perform excellently in image enhancement, they are often limited in applications where labeled data are difficult to obtain. Therefore, compared to traditional supervised learning methods, our method effectively eliminates the dependency on labeled data through a self-learning strategy using unpaired data. At the same time, it achieves accuracy and speed comparable to supervised learning approaches.
However, there are still several aspects that require improvement: (1) The training of the generative adversarial model is relatively complex and time-consuming. We hope to optimize the network learning mechanism and incorporate more potential physical constraints to enhance its self-learning effectiveness. (2) Our current self-supervised learning implementation uses four network models (two generators and two discriminators). In the future, we aim to explore lightweight self-learning strategies to reduce computational resource requirements. (3) The multi-frequency heterodyne method requires nine fringe images. We hope to further reduce the number of fringe patterns in the future to increase measurement speed.