1. Introduction
Hermite–Gaussian (HG) modes are solutions of the wave equation in Cartesian coordinates under the paraxial approximation and represent the eigenmodes of optical resonators with square mirrors. They are commonly denoted HGnm, with n and m indicating the number of nodes along the x and y axes of the light spot, respectively. As n and m increase, the size of the light spot also increases. Higher-order HG modes not only carry richer information but also provide more degrees of freedom, making them uniquely advantageous for addressing the capacity crisis in optical communications and enabling the sustainable expansion of high-speed, high-capacity optical communication systems [1,2,3,4]. For example, higher-order HG modes are utilized in optical tweezers to trap and manipulate microscopic particles [5]. In quantum communication, they enable the encoding of high-dimensional quantum states, thereby enhancing both the channel capacity and the security of quantum communication [6]. In laser processing, they can be employed to achieve high-precision material processing [7]. In precision measurement, they provide higher accuracy in beam displacement measurements [8].
Currently, there are two primary approaches for generating higher-order HG modes: generating them directly within the laser cavity or shaping the beam outside the cavity. Although direct generation in the cavity is simple and convenient, it requires the integration of specialized optical components, which can complicate the structure and reduce its stability [9,10,11,12]. Alternatively, beam shaping outside the cavity uses diffractive optical elements, such as a spatial light modulator (SLM), to convert the fundamental mode into higher-order HG modes by adjusting the loaded hologram, offering greater flexibility [13,14,15,16,17].
Utilizing an SLM together with a phase retrieval algorithm enables the efficient generation of higher-order HG modes. The Gerchberg–Saxton (GS) algorithm is a commonly used phase retrieval method that iteratively optimizes the target phase distribution through Fourier transforms [18]. However, the GS algorithm has inherent limitations, such as the existence of multiple solutions and a slow convergence speed. Several improved methods have been proposed to enhance its performance [19,20], but these modified algorithms often introduce new variables, increasing the complexity and requiring more experience to master their convergence characteristics and to select the most suitable approach.
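For concreteness, the GS iteration described above can be sketched as follows (a minimal NumPy illustration; the propagation model, normalization, and stopping criterion are simplified assumptions rather than the exact implementation used in this work):

```python
import numpy as np

def gerchberg_saxton(target_amp, source_amp, n_iter=200):
    """Classic GS loop: retrieve the SLM phase that maps the source amplitude
    (front focal plane) onto the target amplitude (back focal plane)."""
    phase = np.random.uniform(0, 2 * np.pi, source_amp.shape)
    for _ in range(n_iter):
        # forward propagation (Fourier transform between the focal planes)
        far = np.fft.fftshift(np.fft.fft2(source_amp * np.exp(1j * phase)))
        # impose the target amplitude, keep the propagated phase
        far = target_amp * np.exp(1j * np.angle(far))
        # backward propagation
        near = np.fft.ifft2(np.fft.ifftshift(far))
        # impose the source amplitude, keep the retrieved phase
        phase = np.angle(near)
    return phase  # hologram phase to load onto the SLM
```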
Owing to the development of deep learning, convolutional neural networks (CNNs) have showcased outstanding capabilities in various fields, such as computer vision, image processing, and pattern recognition [21]. By utilizing large-scale datasets, CNNs can effectively learn to solve optical imaging problems and have been applied to many challenging tasks, including super-resolution imaging [22], tomography [23], and holographic imaging [24]. Therefore, harnessing CNNs to generate the higher-order HG modes has become a topic worthy of further investigation.
In our previous work, we proposed a data-driven deep learning method that combines a CNN with the GS algorithm [25]. In this method, the CNN is trained to learn the relationship between the output and input intensity distributions of the GS algorithm and then predicts the corresponding input intensity distribution from the target intensity distribution. The predicted input intensity distribution is fed into the GS algorithm to obtain the hologram loaded onto an SLM. Using this data-driven deep learning method, we generated higher-order HG modes of different orders as well as optical fields with arbitrary intensity distributions, and compared with the GS algorithm, it generates target optical fields of higher quality. However, this method employs a supervised learning strategy that relies heavily on a substantial volume of labeled data, and the resulting model generally exhibits limited interpretability and generalization capability, which constrains its further practical application and development.
To overcome this heavy reliance on large amounts of training data, hybrid approaches that integrate a physical model with deep learning have emerged. Such an approach combines the strengths of both: it leverages the prior knowledge and interpretability inherent in the physical model while harnessing the powerful automatic feature extraction and generalization capabilities of deep learning. Untrained neural networks combined with physical models have already been used for phase retrieval, and their superior performance has been experimentally validated [26]. This paper proposes a beam shaping method, termed the GS-CNN, which integrates the physical model of the GS algorithm with a neural network. Specifically, the GS-CNN builds its network architecture around the Fourier transform between the front and back focal planes, the physical model on which the GS algorithm relies. This design fully exploits the CNN's capability for autonomous learning and adaptation while incorporating the prior knowledge of the physical model as guidance, combining the strengths of both approaches. Using the GS-CNN, we have successfully generated higher-order HG modes of different orders, as well as optical fields with arbitrary intensity distributions. Experimental results demonstrate that, compared with the traditional data-driven CNN method, the GS-CNN significantly improves efficiency and enhances the quality of the generated modes. Moreover, because the physical model is embedded in the network, the model also possesses a degree of interpretability, providing solid support for the further development and application of beam shaping technology.
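As an illustration of this architecture-level integration, the core physical model can be written as a differentiable layer that propagates the SLM-plane field to the back focal plane via a Fourier transform (a minimal PyTorch sketch under simplified assumptions; the class and variable names are ours, not the exact implementation):

```python
import torch

class FourierPropagation(torch.nn.Module):
    """Physical-model layer: lens Fourier transform from the front (SLM)
    focal plane to the back focal plane, as used in the GS algorithm."""
    def forward(self, amplitude: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
        field = torch.polar(amplitude, phase)                  # complex field at the SLM plane
        far_field = torch.fft.fftshift(torch.fft.fft2(field))  # propagate to the back focal plane
        return torch.abs(far_field) ** 2                       # intensity at the back focal plane
```

Because this layer is fully differentiable, the error between its output intensity and the target intensity can be backpropagated through the physical model into the CNN that predicts the phase.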
3. Results and Discussion
U-Net is widely used in computational imaging because its encoder–decoder architecture, combined with skip connections, balances global and local information, achieves efficient multi-scale feature fusion, and generalizes well. Moreover, it offers high parameter efficiency and fast training convergence, making it suitable for a variety of imaging tasks. Therefore, U-Net is selected as the backbone architecture of the GS-CNN.
In this paper, we propose an improved U-Net architecture whose core innovations are multi-scale phase feature extraction and a dynamic fusion mechanism. The target intensity distributions (HG10, HG30, and CTGU) were all simulated numerically in Python 3.6.5; the HG10 and HG30 modes were generated from the theoretical expressions of the higher-order HG modes [2]. For both the training set and the test set, two groups of HG10, HG30, and CTGU intensity distributions with different spot sizes were selected for computational analysis. Since the GS-CNN requires only a single target optical field intensity distribution as its input, there is no need to split the data into training and test sets in the traditional proportional manner: one group of HG10, HG30, and CTGU intensity distributions is used for training, and the other group is used to validate the model's generalization ability.

In the encoder, the model employs a four-stage progressive downsampling structure. Each stage consists of two 3 × 3 convolutional layers, with 32, 64, 128, and 256 channels in sequence, accompanied by batch normalization and LeakyReLU activation (α = 0.2). This design effectively extracts multi-scale features, from local phase gradients in the shallow layers to the full-field phase distribution in the deep layers. In particular, we introduce a residual connection in the third encoder stage: the input of this stage is added directly to the output of its two 3 × 3 convolutional layers (with 128 channels) to enhance the model's ability to represent phase jump regions. The bottleneck adopts a full-resolution densely connected block with 512 channels; by maintaining the full resolution and using dense connections, this module makes full use of the feature information and establishes a non-linear mapping between the wrapped phase and the true phase through cascaded 1 × 1 and 3 × 3 convolutional layers. In the decoder, transposed convolutions perform 2× upsampling, and channel attention-weighted fusion is applied with the feature maps of the corresponding encoder layers. We employed the RMSE between the target intensity distribution and the generated intensity distribution of the higher-order HG modes as the loss function to guide the training and optimization of the neural network.
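A minimal sketch of one such encoder stage, under the configuration stated above (3 × 3 convolutions, batch normalization, LeakyReLU with α = 0.2, and the residual connection used in the third stage), is shown below; the downsampling, dense bottleneck, and decoder are omitted, and the exact implementation may differ:

```python
import torch

def conv_block(in_ch: int, out_ch: int) -> torch.nn.Sequential:
    """Two 3x3 convolutions, each followed by batch normalization and LeakyReLU (alpha = 0.2)."""
    return torch.nn.Sequential(
        torch.nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        torch.nn.BatchNorm2d(out_ch),
        torch.nn.LeakyReLU(0.2, inplace=True),
        torch.nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        torch.nn.BatchNorm2d(out_ch),
        torch.nn.LeakyReLU(0.2, inplace=True),
    )

class ResidualEncoderStage(torch.nn.Module):
    """Third encoder stage (128 channels): the stage input is added to the
    output of the two convolutional layers (residual connection)."""
    def __init__(self, channels: int = 128):
        super().__init__()
        self.block = conv_block(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x) + x
```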
The channel attention-weighted fusion obtains channel weights by applying global average pooling and a fully connected layer along the channel dimension of the feature maps and then multiplies these weights with the feature maps to achieve fusion. This design significantly improves the sharpness of phase edges. The final output layer uses an improved periodic activation function (with a period of 2π) to predict the unwrapped phase directly, avoiding the error accumulation of the phase unwrapping step in traditional methods. The entire network adopts an end-to-end training strategy.
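One possible realization of this attention-weighted skip fusion is sketched below (an SE-style gating module written in PyTorch; the reduction ratio and the way decoder and encoder features are combined are our assumptions rather than the exact design):

```python
import torch

class ChannelAttentionFusion(torch.nn.Module):
    """Weight the encoder skip features with channel attention (global average
    pooling + fully connected layers) before fusing them with decoder features."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = torch.nn.Sequential(
            torch.nn.Linear(channels, channels // reduction),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(channels // reduction, channels),
            torch.nn.Sigmoid(),
        )

    def forward(self, decoder_feat: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = encoder_feat.shape
        weights = self.fc(encoder_feat.mean(dim=(2, 3))).view(b, c, 1, 1)  # per-channel weights
        return decoder_feat + weights * encoder_feat                       # weighted fusion
```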
The training and testing of the neural network model were both carried out on a computer running the Ubuntu 24.04.1 LTS operating system, equipped with an Intel Xeon Gold 6438Y processor and an NVIDIA RTX A6000 GPU with 48 GB of memory. We used the PyTorch 2.5 deep learning framework, wrote the code in Python 3.6.5 based on this framework, and used the PyCharm 2023.1 platform for code development. The experimental parameters are as follows: the beam wavelength is 1064 nm, the SLM has a resolution of 1108 × 1108, and the waist radius of the incident Gaussian beam is 5 mm. We published our code at
https://github.com/zaishuiyifang123/HG (accessed on 25 July 2025).
The target intensity distributions are selected as the HG10 mode and HG30 mode.
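For reference, target intensities of this kind can be computed directly from the theoretical HG mode expressions mentioned above, as in the following sketch (NumPy/SciPy; the grid size, beam waist, and normalization are illustrative choices, not the exact settings used in this work):

```python
import numpy as np
from scipy.special import hermite

def hg_intensity(n: int, m: int, w0: float = 1.0, size: int = 512, extent: float = 3.0) -> np.ndarray:
    """Normalized intensity of an HGnm mode at the beam waist w0."""
    x = np.linspace(-extent * w0, extent * w0, size)
    X, Y = np.meshgrid(x, x)
    Hn, Hm = hermite(n), hermite(m)          # physicists' Hermite polynomials
    field = (Hn(np.sqrt(2) * X / w0) * Hm(np.sqrt(2) * Y / w0)
             * np.exp(-(X**2 + Y**2) / w0**2))
    intensity = field**2
    return intensity / intensity.max()

hg10 = hg_intensity(1, 0)   # one node along x
hg30 = hg_intensity(3, 0)   # three nodes along x
```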
Figure 3 illustrates the intensity distributions of the HG10 and HG30 modes generated by the GS-CNN, along with the required holograms. Figure 3a shows the standard intensity distributions of the HG10 and HG30 modes, Figure 3b presents the intensity distributions generated by the GS-CNN, and Figure 3c displays the holograms to be loaded. As illustrated in Figure 3, the GS-CNN is capable of generating an intensity distribution that closely approximates the standard one. It can also be observed that as the order of the generated mode increases, the mode distribution becomes more complex and, accordingly, the required hologram becomes more intricate. In the holograms, different gray values represent different phase values. In addition, traces of the HG10 and HG30 intensity distributions can be discerned in the holograms. This occurs because, while optimizing the phase distribution to match the target intensity, the algorithm inadvertently embeds modulation features related to the target intensity distribution into the phase information [27].
In our study, we opted for a resolution of 512 × 512 pixels. Figure 4 illustrates the variation in the loss function value of the GS-CNN over 20,000 training epochs. As evident from Figure 4, the loss function value tends to stabilize and reaches a near-minimal level after 10,000 training epochs. Consequently, we set the number of training epochs to 10,000.
Next, we proceeded with a quantitative analysis. We compared the GS-CNN with the data-driven deep learning method [25] through experimental simulations. For both methods, we conducted simulations employing the aforementioned neural network architecture and parameters.
Figure 5 shows the experimental results for the generated HG10 mode, the HG30 mode, and the abbreviation “CTGU” (representing our university); taking “CTGU” with a uniform intensity distribution as an example illustrates that the GS-CNN can effectively generate arbitrary intensity distributions. To ensure a fair and objective evaluation of the mode quality obtained with the two methods, we used the same incident Gaussian beam for both, with a beam size of 3 mm and a planar wavefront. The first column in Figure 5 displays the standard intensity distributions of the generated modes, the second column shows the intensity distributions generated by the data-driven deep learning method, and the third column presents the intensity distributions generated by the GS-CNN. We use the RMSE, peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) to evaluate the quality of the generated modes. As can be seen from Figure 5, compared with the HG10 mode, HG30 mode, and CTGU generated by data-driven deep learning, the RMSE values of the three modes generated by the GS-CNN decreased by 0.01, 0.012, and 0.016, respectively, the PSNRs improved by 4.6 dB, 4.0 dB, and 2.3 dB, respectively, and the SSIMs improved by 0.127, 0.054, and 0.057, respectively. The results also show that as the spatial distribution of the generated mode becomes more complex, the quality of the mode generation gradually declines. The differences in the difficulty of generating different higher-order modes stem primarily from the combined influence of mode complexity and neural network performance. Firstly, higher-order modes exhibit more complex spatial distributions and contain a greater number of nodes, necessitating more precise phase control; as illustrated in Figure 3, the hologram required for the HG30 mode is significantly more intricate than that for the HG10 mode, and complex structures are more susceptible to noise and systematic errors, which increases the generation difficulty. Secondly, neural network architectures encounter bottlenecks in feature extraction and generalization when dealing with complex mode distributions: the more complex the mode distribution, the stronger the multi-scale feature fusion capability required to capture both local details and global structures.
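The three metrics used above can be computed as in the following sketch (NumPy plus scikit-image; the assumption that both intensity patterns are normalized to [0, 1] is ours):

```python
import numpy as np
from skimage.metrics import structural_similarity

def evaluate(generated: np.ndarray, target: np.ndarray):
    """RMSE, PSNR, and SSIM between a generated and a target intensity pattern."""
    rmse = np.sqrt(np.mean((generated - target) ** 2))
    psnr = 20 * np.log10(1.0 / rmse)                      # peak value is 1 after normalization
    ssim = structural_similarity(generated, target, data_range=1.0)
    return rmse, psnr, ssim
```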
As can be clearly seen from Figure 5, the intensity distributions of the modes generated by data-driven deep learning exhibit a speckled pattern, while the intensity distributions of the modes generated by the GS-CNN are more uniform and free of speckles. The reason is that data-driven deep learning relies heavily on a large amount of labeled data. This purely data-oriented learning approach leaves the model without prior physical knowledge, so non-essential features in the training data, such as noise, are easily captured by the model. In contrast, the GS-CNN incorporates the physical model as a constraint in the training process. The prior knowledge provided by the physical model enhances the model's generalization ability, enabling precise control over mode generation and ultimately resulting in a more uniform intensity distribution.
We also analyzed the runtime of the two methods in detail. The data-driven deep learning method uses 5000 pairs of training data, and its network training takes approximately 49 h. In contrast, the GS-CNN does not require pre-training; its overall computation time is concentrated mainly in the iterative optimization between the neural network and the physical model, which takes about 6 min for 20,000 epochs. This method thus avoids lengthy data collection and training procedures while ensuring high reconstruction quality, fully demonstrating its advantage in computation time.
The data-driven deep learning method, as a traditional application of deep learning in computational imaging, relies heavily on large-scale labeled datasets. During training, such methods require tens of thousands of input–output data pairs and continuously adjust the weights and biases of the neural network to learn the mapping from input to output. Moreover, because of their excessive reliance on the statistical patterns of the training data, their performance often declines sharply when they encounter new scenarios that differ significantly from the training data, leading to prediction errors or noise interference [28,29]. In contrast, the GS-CNN adopts a novel neural network design paradigm, namely a physics-enhanced deep neural network that does not require pre-training. Its core idea is to integrate the physical model directly into the neural network architecture. Through the interaction between the physical model and the neural network, it eliminates the reliance on large-scale labeled data and achieves high-quality phase reconstruction with only a single image of the target light field intensity distribution as input; the constraints of the physical model and the optimization of the neural network together yield a high-quality light field intensity distribution. This not only significantly simplifies data acquisition and processing but also enhances the model's adaptability and robustness in new scenarios.
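To make this training-free, single-input paradigm concrete, the optimization loop can be sketched as follows (a simplified PyTorch illustration building on the Fourier-propagation idea above; the placeholder CNN, learning rate, and normalization are our assumptions, not the exact implementation):

```python
import math
import torch

# Placeholder CNN standing in for the improved U-Net: maps the target intensity to a phase map.
net = torch.nn.Sequential(
    torch.nn.Conv2d(1, 32, kernel_size=3, padding=1), torch.nn.LeakyReLU(0.2),
    torch.nn.Conv2d(32, 1, kernel_size=3, padding=1),
)

def physics(amplitude: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
    """GS physical model: Fourier transform between the focal planes."""
    field = torch.polar(amplitude, phase)
    return torch.abs(torch.fft.fftshift(torch.fft.fft2(field), dim=(-2, -1))) ** 2

target = torch.rand(1, 1, 512, 512)      # single target intensity (placeholder data)
gaussian = torch.ones(1, 1, 512, 512)    # incident Gaussian amplitude (placeholder)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)

for epoch in range(10000):
    phase = 2 * math.pi * torch.sigmoid(net(target))            # predicted hologram phase
    generated = physics(gaussian, phase)
    generated = generated / generated.max()                      # normalize to [0, 1]
    loss = torch.sqrt(torch.mean((generated - target) ** 2))     # RMSE loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```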
The integration of the physical model endows the network with interpretability and enables the GS-CNN to be trained with just a single intensity distribution, freeing it from the heavy reliance of traditional data-driven methods on massive amounts of labeled data. Compared with traditional iterative methods such as the GS algorithm, the GS-CNN not only maintains physical interpretability but also accelerates phase retrieval through the neural network while generating modes of higher quality. Furthermore, in contrast to general supervised models, the GS-CNN does not require the construction of a large-scale dataset; it can complete training with just one set of data, significantly reducing data collection costs. However, the GS-CNN also faces certain challenges, such as the strong influence of the physical model's accuracy on the reconstruction results and potential convergence issues in certain extreme cases. The GS-CNN highlights the value of using physical models as constraints for neural networks and provides an initial verification of the advantages of combining physical models with neural networks. Nevertheless, the current study still has shortcomings: it lacks systematic sensitivity analyses of variations in beam parameters (such as the beam size and wavefront phase structure) and noise levels (including detection noise and environmental interference). In subsequent work, we plan to conduct controlled-variable experiments, such as adjusting the beam waist size and introducing Gaussian white noise, to quantitatively evaluate the model's robustness to parameter perturbations, thereby providing stronger evidence for the reliability of this method in complex scenarios.
In comparison with the traditional phase encoding method, which transforms a complex amplitude distribution function into a pure phase distribution function [30,31,32], the GS-CNN achieves algorithmic innovation by integrating a physical model with deep learning and directly learns the rules for generating phase distributions through the neural network. In terms of application scenarios, the traditional phase encoding method offers distinct advantages in dynamic light field modulation, whereas the GS-CNN is better suited to high-precision light field generation. The GS-CNN also differs from other hybrid approaches, such as Physics-Informed Neural Networks (PINNs). In terms of physical model integration, the GS-CNN makes the physical process a core component of the network computation through architecture-level fusion (replacing traditional convolutional layers with Fourier transform layers from the GS algorithm), whereas PINNs only impose indirect constraints by adding physical equation residual terms to the loss function. Regarding data dependency, the GS-CNN requires only a single target optical field intensity distribution for training, completely eliminating the need for paired datasets, while PINNs still require some labeled data to assist the convergence of the physical constraints. In terms of computational efficiency, the GS-CNN reduces the training time to 6 min (20,000 iterations), whereas the training efficiency of PINNs is strongly influenced by the complexity of the physical equations. Additionally, the intermediate-layer outputs of the GS-CNN (such as phase distributions) have clear physical meanings, supporting process traceability.