A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method

Meng, Yongbo; Wu, Liqing; Yuan, Bo; Zhou, Xingping; Li, Yan; Zhang, Zhijun; Shi, Yuechun

doi:10.3390/mi17010033

Open AccessArticle

A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method

by

Yongbo Meng

^1,2,

Liqing Wu

^1,*,

Bo Yuan

¹,

Xingping Zhou

³,

Yan Li

²,

Zhijun Zhang

¹ and

Yuechun Shi

¹

Yongjiang Laboratory, Ningbo 315202, China

²

Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China

³

Institute of Quantum Information and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

^*

Author to whom correspondence should be addressed.

Micromachines 2026, 17(1), 33; https://doi.org/10.3390/mi17010033 (registering DOI)

Submission received: 24 November 2025 / Revised: 17 December 2025 / Accepted: 26 December 2025 / Published: 27 December 2025

(This article belongs to the Special Issue Photonic Integration Technology and Semiconductor Laser Array Chip Technology)

Download

Browse Figures

Versions Notes

Abstract

As an interposer, Through-Glass Vias (TGVs) play a critical role in advanced packaging such as Co-packaged optics (CPO). Currently, due to the complex influence of laser wet-etching process parameters, the precise bidirectional prediction of TGV parameters and the etching morphology still remains a challenge. In this paper, a bidirectional design method for TGVs is proposed, which is based on the cross-modal learning method. By integrating a Cellular Automaton Etch-Diffusion (CAED) physical model with a Stable Diffusion (SD) architecture, accurate forward prediction from laser parameters to TGV morphology is realized successfully. In addition, the Contrastive Language–Image Pre-training (CLIP) model is also applied to achieve an efficient inverse design of TGVs. Furthermore, the generalization ability is examined in this paper, demonstrating significant robustness and stability of the generative model. The results provide an efficient method for enhancing TGV quality within a deep learning framework.

Keywords:

laser material processing; glass processing; selective laser etching; ultrafast optics; deep learning

1. Introduction

To meet the demands of advanced three-dimensional packaging [1], heterogeneous integration has emerged as a critical pathway for enhancing integration density and device performance [2]. Due to the low electrical loss at high frequencies, relatively high stiffness, and adjustable thermal expansion coefficient of glass substrates, the Through-Glass Via (TGV) technique enables the fabrication and metallization of high-density, high-aspect-ratio glass vias, reducing device size while maintaining high-performance electrical interconnects [3,4,5]. The selective laser wet-etching (SLE) technique, with high fabrication precision, rapid processing, and low cost, has been an important means for TGV fabrication [6,7]. The SLE process is optimized through the adjustment of laser parameters and the implementation of various scanning strategies including segment scanning and contour scanning, based on a large number of experimental data [8,9]. In order to investigate the effects of laser parameters including spot radius and pulse energy on etching selectivity and key TGV morphological parameters such as aspect ratio, taper, and irregularity, the micro-crack-assisted etching model is developed to predict TGV morphology through the integration of laser processing, chemical etching, and substrate parameters [10]. A diffusion model is employed to explain the mechanism behind the high selectivity [9]. Moreover, some studies have established wet-etching models [5,11,12] and performed electromagnetic simulations on TGVs using HFSS software [5] to optimize the entire SLE process. Among them, the Cellular Automaton Etch-Diffusion (CAED) model [12] can be applicable to the complex chamfer geometries experimentally. In addition, the etching selectivity and diffusion coefficient, determined by fitting the etch depth to the etch time using the superdiffusion model, are used to predict the microchannel etch depth under specific laser and etching parameters for different etch times [13]. Since the SLE process involves multiple physical parameters such as laser pulse energy, repetition rate, and etchant concentration, the numerical simulation method is restricted in precisely determining the influence of each parameter and simultaneously adjusting multiple parameters. Existing studies have mainly optimized SLE-based TGV fabrication through extensive experimental parameter tuning and scanning strategies [6,7,8,9], and several physics-based models (e.g., micro-crack-assisted etching and diffusion-related wet-etching models) have been developed to explain the etching selectivity and to predict etching outcomes in a forward manner [5,9,10,11,12,13]. Currently, the main difficulties of SLE lie in the efficiency of visualizing the etching morphology in simulation, as well as in inversely deducing SLE processing parameters from etching morphology. In particular, there are few relevant reports on the inverse design for extracting laser parameters directly from the etching morphology. Therefore, it is necessary to develop an efficient method for the bidirectional design of TGVs.

The cross-modal learning framework possesses powerful perception and association capabilities for complex information by fusing heterogeneous data such as text and images [14,15]. Hence, this framework has been widely applied in multiple fields [16,17,18] to figure out synthesis problems. For instance, the video denoising diffusion model generates full-field stress evolution video frame sequences conditioned on stress–strain curves, aiding inverse design involving contact/buckling mechanisms in mechanical metamaterials [16]. Text encoders and cascaded diffusion models can achieve non-Abelian gauge field generation in circuit design, with a verification accuracy of 77.4% [17]. The Contrastive Language–Image Pre-training (CLIP) model successfully constructs a cross-modal semantic space between photonic crystal structural parameters and optical field distributions, while the Stable Diffusion (SD) model can generate optical fields from structural information [18]. Obviously, the cross-modal learning algorithm can provide a more efficient approach for complex physics models.

In this paper, a physics-based CAED simulator is combined with the cross-modal learning framework to enable bidirectional prediction of the SLE process. The CAED model is established with Super-Gaussian and Bessel-Gaussian beams of ultrafast laser, which separates the glass substrate into a cellular grid and describes the wet-etching evolution by local reaction–diffusion rules on the grid. The etching state of each cell is updated according to the etching selectivity and the neighboring cell states, thereby enabling an efficient simulation of morphology evolution during SLE wet etching. Based on this CAED method, the “SLE parameters–TGV morphology images” datasets are generated, providing physics-consistent supervision for training and evaluating the subsequent deep learning models. A cross-modal self-supervised learning framework is applied for the generation–inversion prediction task under the constraints of the SLE process. Under the constraints of the laser modification and wet-etching process, the SD architecture can achieve a forward design from SLE process parameters to TGV morphology images. Furthermore, using a CLIP algorithm it can establish a deep semantic alignment between TGV morphology images and SLE processing parameters, as well as enable the inverse extraction of laser parameters. Moreover, we further evaluate the model’s robustness and generalization across the parameter space to better characterize its applicability beyond the training distribution. The simulation results show that this work provides a new approach of intelligent design and optimization for the TGV fabrication process.

2. Principle

The SLE process can be divided into two major steps: laser-induced modification and wet etching. Figure 1a describes the principle of SLE, where a laser beam is illuminated on a glass substrate to create a modified area. Subsequently, after being corroded by the acidic etching solution, the modified area will exhibit higher etch selectivity than the unmodified area. Hence, by modulating the laser parameters, the modified area can be changed to obtain the desired TGV morphology. This paper employs the Super-Gaussian beams for layered scanning and the Bessel-Gaussian beams for single-pass penetration scanning to fabricate TGVs, respectively. Super-Gaussian beams can provide high selectivity and fine processing yet are limited in single-pulse processing ability. Conversely, Bessel-Gaussian beams allow efficient single-pulse modification at the expense of processing quality and selectivity. As shown in Figure 1a, the two types of laser beams are used to modify the glass.

A CAED model is employed to simulate the wet-etching process as shown in Figure 1b, which can be described by a set of two-dimensional reaction–diffusion equations [12] (as shown in Equation (1)). Equation (1) also can be extended to 3D. In order to facilitate the recognition of the SD model by the 2D image, only 2D models are considered here. In addition, the etching time is correlated with the third dimension (the z-direction) in the simulation with generality. Hence, the 2D TGV morphology in this paper is the cross-sectional view of the center of the 3D glass substrate after etching. In CAED model, the glass substrate can be regarded as a cuboid composed of multiple layers of two-dimensional planes, each of which is divided into m × n grids. The parameter A_m,n is introduced to represent the glass concentration at the (m, n) grid. When A_m,n = 1 and A_m,n = 0, it signifies that the glass is unetched and completely etched, respectively. The initial concentration of the glass substrate remains constant. In the etching process, assuming that the chemical corrosion solution is uniformly diffused, the etch rate at grid point (m, n) is driven by the etch states of its four surrounding grid points (m − 1, n), (m + 1, n), (m, n − 1), (m, n + 1).

\frac{1}{B} \frac{\partial}{\partial t} A (t, m, n) = (\frac{\partial^{2}}{\partial n^{2}} + \frac{\partial^{2}}{\partial m^{2}}) A (t, m, n) - (4 + \frac{S (m, n)}{B}) \cdot (1 - A (t, m, n))

(1)

S_{m, n} = S_{0} \cdot \exp (- \frac{| n |^{N}}{| ω_{n} |^{N}} - \frac{| m |^{N}}{| ω_{m} |^{N}})

(2)

S_{m, n} = S_{0} \cdot \exp (\frac{- 2 r^{2}}{ω^{2}}) \cdot {|J_{1} (k_{r} r)|}^{2}

(3)

In Equation (1), B represents the uniform etch rate of the chemical corrosion solution, and A(t, m, n) represents the glass concentration at position (m, n) at time t. S_m_,n represents the etch selectivity of the modified region related to laser parameters, where the peak etch selectivity S₀ can be equivalently expressed as the peak energy of the laser. The etch selectivity for Super-Gaussian and Bessel-Gaussian beams is defined as shown in Equation (2) and Equation (3), respectively. Where ω represents the beam waist radius, N is the Super-Gaussian exponent; K_r is the radial wave vector; and r is the radial distance. When using the Super-Gaussian beam as the light source, the value of S₀ is 10⁶. When using the Bessel beam as the light source, the value of S₀ depends on the beam waist radius and pulse energy. The pulse duration of the laser is 300 fs, and the pulse energy is about 10 μJ.

The SD algorithm can be used to build the text (laser parameters)-to-image (TGV morphology) model. Based on the datasets collected from the above CAED model, a denoising diffusion model [19] is used to generate TGV morphology images from SLE process parameters in SD algorithm. During model inference, a text encoder converts laser parameters into text embeddings, which can be recognized by the conditional image generation module. After sampling an initial noise, the U-Net network iteratively denoises under text guidance via Denoising Diffusion Implicit Models to restore the latent feature corresponding to the etching morphology. Subsequently, the decoder of the Variational Auto Encoder (VAE) outputs the TGV morphology image. During training, the text encoder first processes the laser parameters, while the VAE encoder compresses the TGV morphology image into a latent feature. A random timestep is then sampled, and noise is added to the latent feature. Leveraging cross-attention mechanisms, the U-Net network learns to predict the noise based on the SLE text embeddings, noisy TGV latent variables, and the timestep. The U-Net parameters are optimized by minimizing the mean squared error between the predicted noise and the true noise. Figure 2 illustrates the schematic of the forward and reverse diffusion processes. The forward diffusion process is a Markov Chain Process that gradually adds Gaussian noise to the original TGV image, transforming the initial image into a pure noise image over 100 steps. The reverse diffusion process is the inverse Markov Chain Process simulated by the U-Net network, which gradually denoises a sample from a pure Gaussian noise distribution to generate the target TGV morphology image. Using the same training framework, four hole type (straight hole, tapered hole, trapezoidal hole, and waist hole) datasets are mixed as the training datasets for the SD model. It is worth noting that, although the training framework is fully general, for different hole types, the generalization ability, inference efficiency, and attainable image quality of the SD model will vary. Hence, in the subsequent result analysis, three evaluation metrics are employed to assess the training effects of different hole types, and the Gaussian Process Regression (GPR) model is used to calculate the generalization ability. In the training process, the datasets of straight hole, tapered hole, and trapezoidal hole each consist of 8000 images, while the relatively complex waist hole dataset contains 10,000 images. In addition, the proportion of the training set, validation set, and test set in the SD model is 7:2:1.

The CLIP contrastive learning method [20] is used to map TGV morphology images and SLE processing parameters into the same embedding space. Through contrastive learning, the embedding distance is minimized for matched morphology images and SLE parameters, while the distance is maximized for mismatched pairs. The schematic diagram of the CLIP model is shown in Figure 3. For the Super-Gaussian beam, the SLE etching capability is determined by three key parameters: the number of pulses, the beam waist radius, and the Super-Gaussian exponent, which are selected as the input text. In the cross-modal space, association and matching of cross-modal information are achieved by computing the cosine similarity between pairwise SLE parameter text vectors I and TGV morphology image vectors T. The cosine similarity formula is shown in Equation (4), where

| | I | |

represents the text vector norm. When similarity (I, T) equals 1, it indicates that the TGV morphology image and SLE process parameters pair are close in the semantic space, meaning they are a perfect match.

s i m i l a r i t y (I, T) = \frac{I \cdot T}{| | I | | \times | | T | |}

(4)

3. Results and Analysis

To assess the reliability and accuracy of the TGV morphology images generated by the SD model, three evaluation metrics are used in this paper: Scale-Invariant Feature Transform (SIFT) [21], Perceptual Hash (pHash) [22], and Structural Similarity Index (SSIM) [23], as shown in Figure 4. These metrics capture different physically meaningful aspects of agreement between the SD-generated morphology and the CAED reference. SIFT is feature- and edge-sensitive because it compares local feature points; SSIM is structure-sensitive because it jointly evaluates luminance, contrast, and structural consistency; and pHash emphasizes global perceptual similarity by comparing low-frequency content. The SIFT similarity is computed by matching feature points using a nearest-neighbor distance ratio criterion. The similarity score is defined as the fraction of matched feature points (Equation (5)), where N_ref and N_gen denote the numbers of feature points extracted from the reference and generated images, respectively, and N_match denotes the number of matched feature points. SSIM measures structural similarity between two images (Equation (6)). In the SSIM formulation,

μ_{x}

and

μ_{y}

are the mean intensities of images x and y;

σ_{x}^{2}

and

σ_{y}^{2}

are their variances; and

σ_{x y}

is the covariance between x and y. The pHash evaluates perceptual similarity by converting images to grayscale, extracting low-frequency components, and constructing a compact hash representation. As shown in Equation (7), the low-frequency coefficient F(i, j) at pixel position (i, j) is used to compute the average over the low-frequency region, where W and H denote the image width and height, respectively.

S i m i l a r i t y = \frac{N_{match}}{\min (N_{ref}, N_{gen})}

(5)

SSIM (x, y) = \frac{(2 μ_{x} μ_{y} + C_{1}) (2 σ_{x y} + C_{2})}{(μ_{x}^{2} + μ_{y}^{2} + C_{1}) (σ_{x}^{2} + σ_{y}^{2} + C_{2})}

(6)

μ = \frac{1}{W \times H} \sum_{i = 1}^{H} \sum_{j = 1}^{W} F_{low} (i, j)

(7)

It is found that the results using Bessel-Gaussian beams generally surpass those using Super-Gaussian beams. The fundamental reason lies in the unique energy distribution of Bessel-Gaussian beams containing high-intensity diffraction rings, causing the edge light intensity to be significantly higher than the modification threshold during laser etching of glass. As a result, the material undergoes complete modification and forms sharp boundaries in the CAED model simulation process. In contrast, Super-Gaussian beams exhibit insufficient edge light intensity, leading to a gradual modification transition zone along the boundary of the TGV morphology. The SD model struggles to accurately generate the gradual edge details anticipated by Super-Gaussian simulations. Due to high sensitivity to local features (especially edges), the SIFT can effectively distinguish the edge differences between the simulations using the CAED model and generations from the SD model. Additionally, Bessel-Gaussian simulations require fewer input parameters, reducing the learning difficulty for the SD model and enhancing generation quality.

As shown in Figure 5, four sets of TGV morphology images with different SLE process parameters are input to the fine-tuned CLIP model for TGV image–SLE text matching test. Four types of TGV structures are fabricated with distinct SLE parameters (pulse count, beam waist radius, and Super-Gaussian exponent). In Figure 5, an epoch refers to one full pass of the CLIP training procedure over the entire training datasets. Specifically, the CLIP model processes all training image–text pairs once via multiple mini-batches and updates parameters through backpropagation and gradient-based optimization. Then it performs inference with the updated parameters to obtain the matching probabilities, which represents the degree of matching between the text and the image. Figure 5a shows the process of extracting laser parameters such as a pulse count of 60, a Gaussian beam waist radius of 2.40

μ m

, and a Gaussian exponent of 3 from the straight hole morphology image. Figure 5b illustrates the extraction of laser parameters, including a pulse count of 55, a Gaussian beam waist radius of 3.90

μ m

, and a Gaussian exponent of 7, from the tapered hole morphology image. Similarly, Figure 5c,d present the reverse extraction processes of laser parameters for the trapezoidal hole and waist hole, respectively. Additionally, an interfering factor is introduced by inputting a set of text descriptions similar to the actual SLE process parameters. The results show that, in the early training stages, the model shows the prediction probability for all candidate SLE parameters. As training epochs increase, the probability of mismatched parameters gradually converges to 0, while the probability of matched parameters converges to 1. Hence, an inverse prediction model is successfully set up to estimate SLE process parameters from TGV morphology images. As shown in Figure 5a,b, after 40 and 60 training epochs, the predictions for both tapered and straight holes gradually approach the actual values. While more-complex waist holes and trapezoidal holes require about 100 epochs to reach a stable state. The results can well demonstrate the inverse design capability of proposed CLIP model for TGVs.

Furthermore, in order to evaluate the generalization ability of the established model on untrained data, a Gaussian Process Regression model based on Bayesian optimization [24] is constructed. The input features of the GPR model are described by a three-dimensional parameter vector X: Super-Gaussian exponent N, beam waist radius ω₀, hole-type number type. The output target y is the SSIM ranging from 0 to 1. The SSIM value represents the degree of similarity between the etching morphology generated by the model and the ideal morphology, signifying the generalization ability. The kernel function hyper-parameters θ are globally optimized via Bayesian optimization, with the objective function being the minimization of the negative log marginal likelihood:

L (θ) = - \log p (y | X, θ)

. In the optimization process, it reaches the global optimum at the 18th iteration. The final solved parameters in the optimization are as follows: signal variance Ϭ_f² = 3.3718; characteristic length scales ℓ_N = 1.7400, ℓ_ω0 = 2.8285, ℓ_type = 0.0266; noise standard deviation Ϭ_n = 0.0126; basis function coefficient β = [0.9717]; bias term as a constant. The goodness-of-fit (R²) is equal to 0.983, indicating that the model accurately captures the nonlinear relationship between SSIM and SLE process parameters. The very small length scale for hole type suggests a strong influence of hole type on SSIM, where slight changes in hole shape cause significant output fluctuations. In contrast, the beam waist radius has the largest length scale value, indicating that SSIM changes smoothly with radius, and SD has strong robustness to radius changes. The trained GPR model predicts the SSIM, computes its 95% confidence interval, and quantifies the distance between test points and the training distribution with the Mahalanobis distance [25].

Figure 6 shows the distribution of generalization ability within the untrained parameter space. To quantify the deviation of the testing parameter and the training data, the Mahalanobis distance is introduced as a covariance-aware measure of distribution shift in the joint parameter space. Each sample is represented by the parameter vector

x = {[N, ω_{0}, t y p e]}^{T}

, where

N

is the Super-Gaussian exponent;

ω_{0}

is the beam waist radius; and

t y p e

denotes the hole-type index. Using all training samples

{x_{i}}_{i = 1}^{M}

, the mean vector

μ = \frac{1}{M} \sum_{i = 1}^{M} x_{i}

and covariance matrix

Σ = \frac{1}{M - 1} \sum_{i = 1}^{M} (x_{i} - μ) {(x_{i} - μ)}^{T}

are computed, respectively. The Mahalanobis distance of a test sample is then defined as

D_{M} (x) = {(x - μ)}^{T} Σ^{- 1} (x - μ)

. Compared with Euclidean distance,

D_{M} (x)

accounts for differences in parameter scales and correlations and thus provides a more appropriate indicator of distribution shift. A small value of

D_{M}

indicates that the test condition lies within (or close to) the high-density region covered by the training data, whereas a large value of

D_{M}

corresponds to the out-of-distribution case. Although all hole types maintain high fidelity, the generalization capability of SD model shows a decaying trend. As the Mahalanobis distance gradually increases, differences among hole types emerge: trapezoidal holes exhibit severe decay, while waist holes maintain optimal robustness. The generalization capabilities of hole type mainly rely on the training data distribution—waist hole samples constitute a larger proportion of the total dataset, leading the model to learn more thoroughly about common hole types that frequently appear in the dataset.

4. Discussion

The proposed method in this paper effectively leverages physical priors from the CAED model to enhance the accuracy and efficiency of both forward prediction and inverse design of TGVs. The results demonstrate that this method maintains high-quality performance across various TGV types, with its robustness closely related to the training data distribution. Notably, for structures that are well represented in the training set, such as waist holes, the established model exhibits better robustness.

Under the current conditions of small-scale datasets, the predictive capability of model gradually weakens as the Mahalanobis distance between test samples and the training distribution increases. This limitation can be alleviated by expanding the size of the dataset and strengthening the physical constraints. On the other hand, to better align with the actual TGV manufacturing process, this study can further expand the range of process parameters (such as etching time, etching solution concentration and temperature, and etching solution spray direction) to establish a complete process–morphology mapping relationship. By incorporating these parameters into the text–image cross-modal learning framework, not only can the microscopic mechanism of TGV formation under the coupling of multiple physical fields be revealed, but also the prediction accuracy and process guidance value of the model in real industrial scenarios can be enhanced.

In addition, although the current model already involves the basic structures of TGVs such as straight holes, tapered holes, trapezoidal holes, and waist holes, it should be extended to accommodate more-complex geometric morphologies. In the practical application of advanced packaging, it often also contains complex non-standard hole structures (such as bell-shaped holes, asymmetric holes, multi-step holes, and even three-dimensional curved micro-channels). This can further enhance the model’s recognition and generation capabilities, thereby enabling the generation of more-complex TGV geometric forms. This anisotropic hole structure has unique advantages in aspects such as high-frequency signal transmission and thermal stress matching, but the coupling of laser modification and etching dynamics of the physical model is more complex than simple hole structures. The physical model used in this paper is an approximate SLE simulation model including two processes: laser-induced modification and chemical corrosion, while the thermal accumulation, reactant accumulation, and femtosecond filamentation effects during the wet-etching process in the experiment were not fully considered. Therefore, in subsequent research, to enhance the model’s adaptability to diverse TGV morphologies and the flexibility of inverse design, it is necessary to combine multi-physics field simulation models, cross-modal deep learning, and reinforcement learning algorithms. This is expected to support the future development of the TGV technique towards functionalization and customization.

Furthermore, the aforementioned technique has certain requirements for the input dimensions and complexity of the AI algorithm. To address this challenge, advanced architectures such as attention mechanisms and graph neural networks could be introduced to enhance the model’s ability in capturing the implicit relationships between high-dimensional parameters of SLE processes and complex morphologies of TGVs. Meanwhile, a multi-task learning strategy could enable the simultaneous output of key quality indicators (such as surface roughness and hole diameter error), thereby providing more-comprehensive decision support for process optimization.

5. Conclusions

This paper introduces a cross-modal deep learning framework into the physical model of selective laser etching to establish the bidirectional design of TGV structures. A forward model based on a Stable Diffusion architecture is proposed to generate TGV morphology from process parameters and an inverse model is constructed using a CLIP algorithm to extract parameters from the target morphology. In addition, the established modal has been proven by the Gaussian Process Regression algorithm to have high accuracy and robust generalization across various TGV types. The approach in this paper provides a novel, efficient pathway for the intelligent design and optimization of TGVs and potentially other complex fabrication processes in Co-packaged optics.

Author Contributions

All authors contributed to the study conception and design. Physical modeling, data collection, and analysis were performed by L.W. and Y.M. The first draft of the manuscript was written by Y.M., and all authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Key R&D Program of Zhejiang 2024SSYS0045; Ningbo Natural Science Foundation Youth Doctoral Innovation Research Project (2023J397); National Natural Science Foundation of China under (Grant 12404365).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Lancaster, A.; Keswani, M. Integrated circuit packaging review with an emphasis on 3D packaging. Integration 2018, 60, 204–212. [Google Scholar] [CrossRef]
Salahuddin, S.; Ni, K.; Datta, S. The era of hyper-scaling in electronics. Nat. Electron. 2018, 1, 442–450. [Google Scholar] [CrossRef]
Choi, K.; Kim, S.W.; Lee, J.H.; Chu, B.; Jeong, D.Y. Eco-friendly glass wet etching for MEMS application: A review. J. Am. Ceram. Soc. 2024, 107, 6497–6515. [Google Scholar] [CrossRef]
Lai, Y.; Pan, K.; Park, S. Thermo-mechanical reliability of glass substrate and Through Glass Vias (TGV): A comprehensive review. Microelectron. Reliab. 2024, 161, 115477. [Google Scholar] [CrossRef]
Zhan, Y.; Yin, K.; Nie, M.; Dou, G. Selective laser-induced etching of borosilicate glass in hydrofluoric acid. In Proceedings of the 2023 24th International Conference on Electronic Packaging Technology (ICEPT), Dalian, China, 8–11 August 2023. [Google Scholar]
Kratz, M.; Kniffler, M.; Häfner, C.L. Influence of flexible multibeam intensity distributions on selective laser-induced etching process regimes. Opt. Express 2024, 32, 36453–36468. [Google Scholar] [CrossRef] [PubMed]
Chen, J.; Lu, X.; Wen, Q.; Jiang, F.; Lu, J.; Lei, D.; Pan, Y. Review on laser-induced etching processing technology for transparent hard and brittle materials. Int. J. Adv. Manuf. Technol. 2021, 117, 2545–2564. [Google Scholar] [CrossRef]
Butkutė, A.; Baravykas, T.; Stančikas, J.; Tičkūnas, T.; Vargalis, R.; Paipulas, D.; Sirutkaitis, V.; Jonušauskas, L. Optimization of selective laser etching (SLE) for glass micromechanical structure fabrication. Opt. Express 2021, 29, 23487–23499. [Google Scholar] [CrossRef] [PubMed]
Liu, J.; Xia, C.; Ming, X.; Dou, G.; Yin, C. Ultrahigh Aspect Ratio Through Glass Vias Perforation Utilizing Selective Laser-Induced Etching with Nanochannels. Adv. Eng. Mater. 2024, 26, 2400397. [Google Scholar] [CrossRef]
Liu, J.; Yin, C.; Xia, C.; Ming, X.; Dou, G. Precise Morphology Tailoring of Through Glass Vias Perforated by Selective Laser-Induced Etching on Fused Silica. JOM 2025, 77, 4150–4160. [Google Scholar] [CrossRef]
Axt, H.; Kratz, M.; Peters, C.; Bornschlegel, B.; Hinke, C. 3D target shape retention within selective laser-induced etching (SLE) via simulation of chemical etching. In Proceedings of the Laser-based Micro- and Nanoprocessing XIX, San Francisco, CA, USA, 28 January 2025. [Google Scholar]
Wimmer, M.; Kaiser, M.; Kieiner, J.; Wolff, J.; Kahmann, M.; Flamm, D. Selective laser etching of displays: Closing the gap between optical simulations and fabrication. In Proceedings of the Laser Applications in Microelectronic and Optoelectronic Manufacturing (LAMOM) XXIX, San Francisco, CA, USA, 27 January–1 February 2024. [Google Scholar]
Hasse, K.; Nwatu, D.; Müller, S.; Suntsov, S.; Kip, D. Fabrication of microchannels in lithium tantalate by selective etching of structures inscribed with a femtosecond laser. J. Phys. Photonics 2025, 7, 035014. [Google Scholar] [CrossRef]
Wang, H.; Fu, T.; Du, Y.; Gao, W.; Huang, K.; Liu, Z.; Chandak, P.; Liu, S.; Van Katwyk, P.; Deac, A.; et al. Scientific discovery in the age of artificial intelligence. Nature 2023, 620, 47–60. [Google Scholar] [CrossRef] [PubMed]
Wang, X.; Chen, G.; Qian, G.; Gao, P.; Wei, X.Y.; Wang, Y.; Tian, Y.; Gao, W. Large-scale multi-modal pre-trained models: A comprehensive survey. Mach. Intell. Res. 2023, 20, 447–482. [Google Scholar] [CrossRef]
Bastek, J.H.; Kochmann, D.M. Inverse design of nonlinear mechanical metamaterials via video denoising diffusion models. Nat. Mach. Intell. 2023, 5, 1466–1475. [Google Scholar] [CrossRef]
Sun, J.; Chen, X.; Wang, X.; Zhu, D.; Zhou, X. Self-reasoning assisted learning for non-Abelian gauge fields design. Phys. Rev. B 2025, 111, 094111. [Google Scholar] [CrossRef]
Sun, J.; Chen, X.; Wang, X.; Zhu, D.; Zhou, X. Photonic modes prediction via multi-modal diffusion model. Mach. Learn. Sci. Technol. 2024, 5, 035069. [Google Scholar] [CrossRef]
Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10684–10695. [Google Scholar]
Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 8748–8763. [Google Scholar]
Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
Samanta, P.; Jain, S. Analysis of perceptual hashing algorithms in image manipulation detection. Procedia Comput. Sci. 2021, 185, 203–212. [Google Scholar] [CrossRef]
Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. mage quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2024, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
Seeger, M. Gaussian processes for machine learning. Int. J. Neural Syst. 2004, 14, 69–106. [Google Scholar] [CrossRef]
Weinberger, K.Q.; Saul, L.K. Distance Metric Learning for Large Margin Nearest Neighbor Classification. J. Mach. Learn. Res. 2009, 10, 207–244. [Google Scholar]

Figure 1. Schematic diagram of the SLE and modeling. (a) The target area on the glass substrate is modified using Super-Gaussian or Bessel-Gaussian beams, followed by wet etching of the modified region. (b) In the Cellular Automaton Etch-Diffusion model, the glass substrate is divided into several adjacent cells.

Figure 2. Schematic diagram of the forward diffusion and reverse diffusion in SD algorithm.

Figure 3. Schematic diagram of the CLIP algorithm. The optimizer achieves precise matching and prediction between SLE text feature vectors and TGV image feature vectors by maximizing the cosine similarity.

Figure 4. The stable diffusion model generation results are displayed in a quality evaluation chart. TGVs of various types, including straight holes, tapered holes, trapezoidal holes, and waist holes, are generated using SLE text parameters based on Super-Gaussian and Bessel-Gaussian beams. The theoretical TGV images and the generated TGV images are evaluated using SIFT, pHash, and SSIM evaluation metrics.

Figure 5. CLIP model prediction results for SLE parameters. The most probable SLE parameters are inversely derived by inputting TGV morphology images. In the legend, the SLE process parameters are set as pulse number_beam waist radius_Super-Gaussian exponent. (a), (b), (c), and (d) describe the process of reversely extracting laser parameters from the etching morphologies of straight holes, tapered holes, trapezoidal holes, and waist holes, respectively.

Figure 6. SD model generalization ability test results. Model generalization capability weakens as the Mahalanobis distance from the training dataset increases.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Meng, Y.; Wu, L.; Yuan, B.; Zhou, X.; Li, Y.; Zhang, Z.; Shi, Y. A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method. Micromachines 2026, 17, 33. https://doi.org/10.3390/mi17010033

AMA Style

Meng Y, Wu L, Yuan B, Zhou X, Li Y, Zhang Z, Shi Y. A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method. Micromachines. 2026; 17(1):33. https://doi.org/10.3390/mi17010033

Chicago/Turabian Style

Meng, Yongbo, Liqing Wu, Bo Yuan, Xingping Zhou, Yan Li, Zhijun Zhang, and Yuechun Shi. 2026. "A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method" Micromachines 17, no. 1: 33. https://doi.org/10.3390/mi17010033

APA Style

Meng, Y., Wu, L., Yuan, B., Zhou, X., Li, Y., Zhang, Z., & Shi, Y. (2026). A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method. Micromachines, 17(1), 33. https://doi.org/10.3390/mi17010033

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Article metric data becomes available approximately 24 hours after publication online.

Article Menu

A Bidirectional Design Method for Through-Glass Vias with Selective Laser Wet Etching Based on the Cross-Modal Learning Method

Abstract

1. Introduction

2. Principle

3. Results and Analysis

4. Discussion

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI