1. Introduction
Adaptive optics (AO) corrects optical systems in real time and reduces wavefront distortion by controlling a wavefront corrector such as a deformable mirror (DM) or a spatial light modulator (SLM). The technology is widely used in astronomical telescopes and free-space optical (FSO) communication to reduce the impact of atmospheric turbulence on light transmission [1,2,3,4]. AO systems are divided into conventional AO systems and wavefront sensorless (WFS-less) AO systems. WFS-less AO systems compute a performance metric directly from the light intensity recorded by the image sensor and, by optimizing that metric, obtain the control signals required by the wavefront corrector. This greatly reduces the complexity of the AO system and widens its range of applications [5,6,7]. Currently, the performance of WFS-less AO systems depends mainly on the wavefront correction control algorithm adopted. These algorithms can be roughly divided into three categories: model-based control methods, which include geometric optics principles, nonlinear optimization, and modal methods; intelligent-optimization-based methods, which include the genetic algorithm (GA) [8,9], particle swarm optimization (PSO) [10], the differential evolution algorithm (DEA) [11], simulated annealing (SA) [12], and the Runge–Kutta optimizer (RUN) [13,14]; and gradient-descent-based control methods, such as the sequential gradient descent algorithm, multielement high-frequency vibration, and the stochastic parallel gradient descent (SPGD) algorithm [15,16,17,18,19]. Model-based control methods converge faster, but they require an accurate mathematical model of the physical characteristics of the system before a control algorithm can be designed. Intelligent optimization control methods, in turn, have higher computational complexity [20,21,22]. In contrast, gradient descent control algorithms do not depend on a system model, are computationally simple, and depend less on the optical system.
SPGD is currently the most representative and widely applied control algorithm because of its simple implementation and strong overall correction ability. Conventional SPGD algorithms use a fixed gain coefficient, which results in slow convergence or easy trapping in local optima and therefore in low correction performance. To address these problems, optimizers developed for deep learning, such as momentum, AdaGrad, Adam, and Nadam, have been integrated into the SPGD algorithm, significantly accelerating the iteration process. These improved SPGD algorithms have been validated through both theoretical analysis and simulations, demonstrating their effectiveness in wavefront distortion correction [23,24,25,26]. The Sophia optimizer is a second-order stochastic optimizer that can achieve faster convergence and better optimization results in deep learning models [27]. It was designed to reduce the cost of model training and improve training efficiency. It uses a low-cost stochastic estimate of the diagonal of the Hessian as a preconditioner and introduces a clipping mechanism to bound the size of each update. This approach retains the advantages of second-order optimization while remaining computationally feasible. Compared with the traditional Adam optimizer, it performs better in terms of validation loss, total computational cost, and actual training time. It can halve the number of training steps needed to reach the same loss, effectively reducing the total computational resources required.
To address the issues of slow convergence speed and susceptibility to local optima in the SPGD algorithm, this paper proposes a novel adaptive gain stochastic parallel gradient descent algorithm (Sophia-SPGD). The algorithm incorporates the lightweight second-order optimizer Sophia, which is utilized in deep learning, into the SPGD. It approximates the gradient by leveraging changes in the image performance index. By incorporating an adaptive-learning-rate adjustment mechanism, first- and second-order momentum corrections, and a clipping mechanism to stabilize the control of the gradient descent direction and step size, Sophia-SPGD enables precise and rapid updates to the wavefront corrector voltage signals. Through the modification of gradients and learning rates, the algorithm enhances the robustness of the optimization process, significantly improving both the convergence speed and accuracy of wavefront correction. This is highly important for the real-time performance and lightweight nature of WFS-less AO.
2. Methods
As illustrated in Figure 1, the WFS-less AO system uses a deformable mirror (DM) for wavefront correction and a charge-coupled device (CCD) as the image sensor, alongside other components such as the wavefront control module. During transmission, the incoming laser beam passes through a nonuniform medium, which distorts its wavefront phase. The distorted laser beam is then reflected by the deformable mirror and passes through a converging lens to form an image on the CCD camera. The CCD camera captures this image and transmits it to the controller. The controller reads the far-field spot image and uses a blind optimization algorithm based on the optimization metrics of the far-field spot image to generate control voltage signals. These signals are amplified by a high-voltage amplifier and drive the deformable mirror to complete one cycle of closed-loop correction.
The deformable mirror compensates for distortion by altering the shape of its reflective surface to counteract the target wavefront distortion. There are many ways to represent the surface shape and the wavefront; the most direct is to represent the surface shape as $z(x, y)$, the distance of the point at coordinates $(x, y)$ from the horizontal plane. According to the working principle of the deformable mirror, the compensatory surface shape $m(x, y)$ produced by the mirror can be represented by a linear combination of the influence functions of the actuators:

$$m(x, y) = \sum_{i=1}^{N} v_i S_i(x, y)$$

$$S_i(x, y) = \exp\left[\ln\omega \left(\frac{\sqrt{(x - x_i)^2 + (y - y_i)^2}}{d}\right)^{\alpha}\right]$$

Here, $v_i$ is the control signal of the i-th DM actuator, $S_i(x, y)$ is the influence function of the i-th DM actuator, $(x, y)$ represents the coordinates on the wavefront plane, $(x_i, y_i)$ represents the coordinates of the i-th actuator on the DM, $d$ represents the spacing between adjacent actuators, $\omega$ represents the coupling value between actuators, and $\alpha$ represents the Gaussian index [13,14].
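The linear-combination model above can be illustrated with a short numerical sketch. It assumes the Gaussian-type influence function commonly used in DM modelling, $S_i = \exp[\ln\omega\,(r_i/d)^{\alpha}]$ with $r_i$ the distance to actuator i; the 3×3 actuator grid, the coupling value 0.08, and the Gaussian index 2 are hypothetical values chosen only for illustration, not the parameters of the 97-element mirror.

```python
import numpy as np

def influence(x, y, xi, yi, d, omega=0.08, alpha=2.0):
    """Gaussian-type influence function of one actuator centred at (xi, yi).

    omega (coupling) and alpha (Gaussian index) are illustrative defaults.
    """
    r = np.hypot(x - xi, y - yi)
    return np.exp(np.log(omega) * (r / d) ** alpha)

def mirror_surface(v, centers, x, y, d, omega=0.08, alpha=2.0):
    """Surface shape as the sum of influence functions weighted by control signals v."""
    surface = np.zeros_like(x, dtype=float)
    for vi, (xi, yi) in zip(v, centers):
        surface += vi * influence(x, y, xi, yi, d, omega, alpha)
    return surface

# Hypothetical 3x3 actuator grid with unit spacing.
d = 1.0
centers = [(i * d, j * d) for i in (-1, 0, 1) for j in (-1, 0, 1)]
x, y = np.meshgrid(np.linspace(-1.5, 1.5, 61), np.linspace(-1.5, 1.5, 61))

v = np.zeros(len(centers))
v[4] = 1.0  # poke only the central actuator, located at (0, 0)
s = mirror_surface(v, centers, x, y, d)
```

With only the central actuator driven, the surface equals 1 at that actuator and falls to the coupling value at the position of its nearest neighbour, which is what the coupling parameter is meant to express.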
In the simulation experiment, 100 Kolmogorov phase screens are generated under turbulence strengths of D/r0 = 10, 15, and 20 [28]. Each phase screen comprises Zernike aberrations from the 3rd to the 104th order, excluding tilt terms. The statistical properties of the generated phase screens conform to the Kolmogorov spectrum, and the screens are uncorrelated with one another. The turbulence intensity is denoted as D/r0, where D is the telescope aperture and r0 is the atmospheric coherence length. The optimization algorithm uses the mean radius (MR) and mean radius ratio (MRR) [29] as optimization metrics; the Strehl ratio (SR) [30] serves as an indicator metric. The MRR is the ratio between the average radius of the ideal far-field point spread function (PSF) and the average radius of the corrected PSF.
Let $\phi(x, y)$ be the phase distribution of the wavefront. Within a circular aperture, the complex amplitude of the wavefront $U(x, y)$ can be expressed as:

$$U(x, y) = P(x, y)\exp[i\phi(x, y)]$$

where $P(x, y)$ is the aperture function, equal to 1 inside the circular aperture and 0 outside. The relationship between the PSF and the complex amplitude of the wavefront is described by the Fourier transform. Performing a two-dimensional Fourier transform on $U(x, y)$ yields its frequency-domain representation $\tilde{U}(f_x, f_y)$, and the PSF is obtained as the squared modulus $|\tilde{U}(f_x, f_y)|^2$.
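The Fourier relationship between the pupil field and the PSF can be sketched with a discrete FFT. The grid size, the aperture radius, and the white-noise phase below are hypothetical placeholders; in particular, the random phase is not a Kolmogorov screen and only stands in for an arbitrary aberration.

```python
import numpy as np

def psf_from_phase(phase, pupil):
    """PSF as the squared modulus of the 2-D Fourier transform of the pupil field."""
    field = pupil * np.exp(1j * phase)  # complex amplitude inside the aperture
    far_field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field)))
    return np.abs(far_field) ** 2

n = 128
yy, xx = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
pupil = (np.hypot(xx, yy) <= n // 8).astype(float)  # circular aperture, zero-padded

ideal_psf = psf_from_phase(np.zeros((n, n)), pupil)

rng = np.random.default_rng(0)
random_phase = rng.normal(0.0, 1.0, (n, n)) * pupil  # placeholder aberration (radians)
aberrated_psf = psf_from_phase(random_phase, pupil)
```

The aberration redistributes energy without changing the total, so the aberrated PSF has the same sum as the ideal one but a lower peak.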
The formulas for the SR, mean radius (MR), and MRR are given below:

$$SR = \frac{I_{peak}}{I_0}$$

$$MR = \frac{\iint \sqrt{(x - \bar{x})^2 + (y - \bar{y})^2}\, I(x, y)\, dx\, dy}{\iint I(x, y)\, dx\, dy}$$

$$MRR = \frac{MR_{ideal}}{MR}$$

where $I_{peak}$ is the peak intensity of the image formed by the optical system with aberrations, $I_0$ is the peak intensity of the ideal diffraction-limited image, $I(x, y)$ is the far-field intensity distribution with centroid $(\bar{x}, \bar{y})$, and $MR_{ideal}$ is the ideal mean radius. The SR is frequently used in numerical simulations as the system's objective function because of its low computational demands and simple structure. SR values close to 1 indicate minimal aberration effects and a high image quality, whereas lower values suggest significant aberrations and a reduced image quality. The mean radius is an essential parameter for evaluating the quality of an optical system, as it reflects the deviation of the actual wavefront from the ideal wavefront through the spread of the far-field spot. A smaller mean radius signifies a closer approximation to the ideal, diffraction-limited spot, indicating a higher optical quality and enhanced imaging performance. Throughout the algorithm's iterative process, MR values are recorded. For analytical convenience, changes in the SR values are recorded alongside them.
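The three metrics can be computed from sampled far-field images as in the following sketch. It assumes that the mean radius is the intensity-weighted mean distance from the spot centroid, which is the usual definition but is an assumption here; the two Gaussian spots are synthetic stand-ins for an ideal and a corrected PSF carrying the same total energy.

```python
import numpy as np

def strehl_ratio(psf, ideal_psf):
    """Peak intensity of the aberrated image over the ideal peak intensity."""
    return psf.max() / ideal_psf.max()

def mean_radius(img):
    """Intensity-weighted mean distance from the intensity centroid (in pixels)."""
    total = img.sum()
    yy, xx = np.indices(img.shape)
    cx = (xx * img).sum() / total
    cy = (yy * img).sum() / total
    return (np.hypot(xx - cx, yy - cy) * img).sum() / total

def mean_radius_ratio(psf, ideal_psf):
    """Ideal mean radius divided by the mean radius of the given PSF."""
    return mean_radius(ideal_psf) / mean_radius(psf)

# Synthetic spots: a narrow "ideal" Gaussian and a spot twice as wide,
# both normalized to unit total energy.
yy, xx = np.indices((65, 65))
r2 = (xx - 32) ** 2 + (yy - 32) ** 2
ideal = np.exp(-r2 / (2 * 2.0 ** 2))
ideal /= ideal.sum()
blurred = np.exp(-r2 / (2 * 4.0 ** 2))
blurred /= blurred.sum()

sr = strehl_ratio(blurred, ideal)
mrr = mean_radius_ratio(blurred, ideal)
```

Doubling the spot width at constant energy lowers the SR to roughly a quarter and the MRR to roughly a half, which shows why both metrics approach 1 as the correction improves.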
In this paper, we introduce a Sophia-optimized SPGD algorithm, referred to as Sophia-SPGD, which integrates the Sophia optimizer from deep learning with the traditional SPGD algorithm. The variation in the performance metrics of the spot image is approximated as a gradient. The algorithm incorporates momentum, adaptivity, and gradient clipping techniques in the gradient descent process. By utilizing the gradient momentum method, the algorithm compares the variation in gradients from the current and previous iterations. If the variation in gradients is in the same direction, it generates a positive excitation, whereas opposing directions cause a negative excitation. This method effectively minimizes oscillations throughout the iteration process, enhancing the stability and efficiency of the optimization. The learning rate is adaptively adjusted to mitigate local convergence issues. Additionally, the algorithm incorporates gradient clipping to control the step size of the gradient descent, ensuring that the performance indices of spot images do not fluctuate dramatically in the later stages of the iterations. This approach streamlines the optimization process, enhancing the efficiency and stability of wavefront corrections in WFS-less adaptive optics systems. The Sophia-SPGD algorithm flowchart is shown in Figure 2.
The core concept of the stochastic parallel gradient descent (SPGD) algorithm is to estimate the gradient with respect to the control parameters from the change in the performance metric $\delta J$ produced by a randomized perturbation voltage vector $\delta u$. The algorithm iteratively searches for control voltage vectors along the estimated gradient direction until the performance metric reaches its maximum. Here, the performance metric is a function of the control voltage vector, expressed as $J = J(u)$, where $u = \{u_1, u_2, \ldots, u_N\}$ represents the control voltages for the individual units of the wavefront corrector.

The specific process is as follows: First, the system randomly generates a small perturbation voltage vector $\delta u^{(k)}$. Then, the positive perturbation voltages $u^{(k)} + \delta u^{(k)}$ are applied to the DM, the CCD image is acquired, and the positive performance metric $J_{+}^{(k)}$ is calculated; the negative perturbation voltages $u^{(k)} - \delta u^{(k)}$ are applied to the DM in the same way to obtain the negative performance metric $J_{-}^{(k)}$. The variation in the performance metric and the voltage update are calculated as follows:

$$\delta J^{(k)} = J_{+}^{(k)} - J_{-}^{(k)}$$

$$u^{(k+1)} = u^{(k)} + \gamma\, \delta J^{(k)}\, \delta u^{(k)}$$

where $\gamma$ represents the gain coefficient, which is generally a positive value; $u^{(k)}$ is the voltage of the deformable mirror controller added during the k-th iteration; $N$ denotes the number of corrective units in the wavefront corrector; and $\delta u^{(k)}$ is the random perturbation voltage vector applied during the k-th iteration. The control voltage vector $u^{(k+1)}$ is calculated in accordance with Formula (6) and applied to the deformable mirror. Images are collected to obtain the correction effect, and the k-th iteration is completed.
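The two-sided perturbation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the perturbations are drawn as ±amplitude with equal probability, the gain and amplitude values are arbitrary, and `toy_metric` is a hypothetical quadratic stand-in for the image-based metric that a real system would measure on the CCD.

```python
import numpy as np

def spgd(metric, n_channels, gain=0.5, amplitude=0.1, iterations=500, seed=0):
    """Fixed-gain SPGD that maximizes `metric` over a control voltage vector."""
    rng = np.random.default_rng(seed)
    u = np.zeros(n_channels)
    history = []
    for _ in range(iterations):
        # Random perturbation: each channel gets +amplitude or -amplitude (assumed).
        du = amplitude * rng.choice([-1.0, 1.0], size=n_channels)
        # Two-sided measurement of the change in the performance metric.
        dj = metric(u + du) - metric(u - du)
        # Fixed-gain update along the estimated gradient direction.
        u = u + gain * dj * du
        history.append(metric(u))
    return u, history

# Hypothetical stand-in for the optical system: the metric peaks (at 0)
# when the voltages match a hidden target vector.
target = np.linspace(-1.0, 1.0, 8)

def toy_metric(u):
    return -float(np.sum((u - target) ** 2))

u_final, history = spgd(toy_metric, n_channels=8)
```

Each iteration costs two metric evaluations regardless of the number of channels, which is the property that makes the method attractive for many-actuator correctors.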
SPGD is widely recognized as an effective method for performance metric optimization; however, the use of a fixed gain coefficient limits its adaptability for wavefront correction in WFS-less AO systems. This leads to suboptimal correction results, slow correction, and a tendency to fall into local optima. To increase the convergence speed and reduce the probability of falling into a local optimum, momentum and adaptive learning rates are the most common remedies.
In the Sophia-SPGD algorithm, the first-order gradient and the second-order gradient are calculated according to Equations (11) and (12):
where
is the sign function, which determines the sign of a number. The sign of
and its performance are determined by the direction of optimization of the performance metrics. If the performance metrics increase,
takes a positive value; otherwise, it takes a negative value.
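In the Sophia-SPGD algorithm, the momentum terms are introduced to accelerate learning and reduce oscillations. Here, $m^{(k)}$ represents the first-order momentum term, and $h^{(k)}$ represents the second-order momentum term. $\beta_1$ and $\beta_2$ are momentum hyperparameters, typically ranging between 0 and 1. The first-order momentum of the gradient $m^{(k)}$ and the second-order momentum of the gradient $h^{(k)}$ are calculated according to Equations (13) and (14) as follows:

$$m^{(k)} = \beta_1 m^{(k-1)} + (1 - \beta_1)\, g^{(k)}$$

$$h^{(k)} = \beta_2 h^{(k-1)} + (1 - \beta_2)\, \hat{h}^{(k)}$$

where $g^{(k)}$ and $\hat{h}^{(k)}$ denote the first-order and second-order gradients of Equations (11) and (12), respectively.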
These formulas imply that the current momentum is a weighted average of the previous momentum and the current gradient. These two momentum terms play crucial roles in the parameter update process. By integrating past momentum with current gradient information, the algorithm progresses more steadily toward the optimal solution. This approach prevents overly sensitive parameter changes in situations of drastic gradient variations and enhances the convergence speed and robustness of the algorithm.
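The smoothing effect of such a weighted average can be seen in a small numerical sketch. The coefficient 0.9 and the two gradient sequences are hypothetical values chosen for illustration.

```python
def ema_update(previous, current, beta):
    """Weighted average of the previous momentum and the current gradient."""
    return beta * previous + (1.0 - beta) * current

def run_momentum(gradients, beta=0.9):
    """Return the momentum value after each gradient in the sequence."""
    momentum, trace = 0.0, []
    for g in gradients:
        momentum = ema_update(momentum, g, beta)
        trace.append(momentum)
    return trace

# Gradients that keep the same sign reinforce each other ...
consistent = run_momentum([1.0] * 50)
# ... whereas gradients that flip sign every iteration largely cancel.
oscillating = run_momentum([1.0 if k % 2 == 0 else -1.0 for k in range(50)])
```

With a consistent gradient the momentum approaches the gradient value itself, while with an alternating gradient it never exceeds a tenth of the gradient magnitude, which is the oscillation damping described above.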
The primary role of adopting an adaptive learning rate is to balance the convergence speed and avoid becoming trapped in local optima during the optimization process. Adaptive-learning-rate algorithms automatically adjust the learning rate according to different parameters and formula conditions. In the initial stages of iteration, a higher learning rate is provided to accelerate convergence. As the iteration progresses and the performance indices gradually approach their extremum, the learning rate automatically decreases to prevent significant fluctuations near the optimal solution, thus ensuring stable convergence to the optimum. The relationship between the learning rate
and the number of iterations
is expressed as
Here, is a manually predefined initial learning rate. is the attenuation rate.
Introducing weight decay
when updating control voltages can smooth the changes in parameter update, thereby accelerating the convergence speed. As discussed above, the Sophia-SPGD algorithm is used to update the control voltage vector computation formula as follows:
where
represents the voltage signal for the
iteration and where
is the total number of iterations. The clipping function
is used to limit
within the range of the adaptive function
, and
$\epsilon$ is a very small constant set to $10^{-8}$ to avoid division by zero. In the above equation, the typical parameters are
,
,
,
,
. Using an adaptive function to limit the range of the clipping function can effectively increase the convergence accuracy in the later stages of iteration and prevent falling into a local optimum.
The implementation of the Sophia-SPGD algorithm is comprehensively described in Algorithm 1.
Algorithm 1. Sophia-SPGD
Inputs: The learning rate , the initial learning rate , hyperparameters , , , , the constant , the initial momentum coefficient .
Output: Calculated control voltage vectors
1: Set , where N is the number of corrector channels. Set , , .
2: for k = 1 to do
3: Randomly generate perturbed voltages
4: Compute the perturbation of performance indicators , , and
5: Compute
6:
7:
8:
9:
10:
End
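The overall structure of such a loop (perturb, measure, update the two momenta, decay the learning rate, apply weight decay, clip the preconditioned step) can be sketched as below. This is only an illustration under several assumptions that are not taken from the paper: the curvature proxy is the squared gradient estimate, the learning rate follows an inverse-time decay, the clipping bound `rho` is a fixed constant rather than an adaptive function, and all hyperparameter values as well as `toy_metric` are hypothetical.

```python
import numpy as np

def sophia_spgd(metric, n_channels, eta0=0.05, decay=0.01, beta1=0.9, beta2=0.99,
                rho=1.0, weight_decay=1e-3, eps=1e-8, amplitude=0.05,
                iterations=1500, seed=0):
    """Sophia-style SPGD sketch that maximizes `metric` (assumptions in comments)."""
    rng = np.random.default_rng(seed)
    u = np.zeros(n_channels)
    m = np.zeros(n_channels)  # first-order momentum
    h = np.zeros(n_channels)  # second-order momentum
    history = []
    for k in range(1, iterations + 1):
        du = amplitude * rng.choice([-1.0, 1.0], size=n_channels)
        dj = metric(u + du) - metric(u - du)
        g = dj * du                              # SPGD gradient estimate
        m = beta1 * m + (1.0 - beta1) * g
        h = beta2 * h + (1.0 - beta2) * g * g    # ASSUMED curvature proxy
        eta = eta0 / (1.0 + decay * k)           # ASSUMED inverse-time decay
        # Preconditioned step, clipped element-wise to [-rho, rho] (fixed bound assumed).
        step = np.clip(m / np.maximum(h, eps), -rho, rho)
        # Weight decay on the voltages, then ascent along the clipped step.
        u = (1.0 - eta * weight_decay) * u + eta * step
        history.append(metric(u))
    return u, history

# Hypothetical stand-in for the measured image metric.
target = np.linspace(-1.0, 1.0, 8)

def toy_metric(u):
    return -float(np.sum((u - target) ** 2))

u_final, history = sophia_spgd(toy_metric, n_channels=8)
```

Because of the clipping, no channel can move by more than `eta * rho` in one iteration, so the decaying learning rate directly bounds the late-stage voltage changes.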
To implement SPGD or any of its variants, a performance metric must be specified. Here, the MR and SR are used to measure the correction performance of the algorithm and are defined as shown in Equations (3) and (4). Through numerical simulation, we can further observe and assess the performance of the Sophia-SPGD algorithm in practical scenarios to determine its effectiveness and advantages in enhancing correction performance.
3. Simulations and Analysis
To analyze the feasibility of the Sophia-SPGD algorithm, we first select the optimal parameters for each algorithm on the basis of many simulation experiments under various turbulence intensities. Then, we use both algorithms to correct the same set of randomly generated wavefront aberrations and compare the simulation results. The number of correction iterations is set at 1500. A 97-element DM is introduced to perform the simulations.
Figure 3 shows the obtained averaged SR and MRR adaptation curves. Except for the case with turbulence intensity D/r0 = 10, the correction capability of Sophia-SPGD surpasses that of SPGD, particularly in the presence of stronger turbulence.
During the simulation process, 100 frames of random wavefront aberrations under turbulence intensities of D/r0 = 10, 15, and 20 were used as correction subjects to analyze the convergence speeds of the Sophia-SPGD and SPGD control algorithms. The average correction results over the 100 frames of random aberrations serve as the experimental outcome, with each algorithm iterating 1500 times. The average MRR and average SR adaptation curves are presented in Figure 4. Both the SPGD and Sophia-SPGD algorithms have sufficiently converged after 1500 iterations. Comparing the convergence curves of the two algorithms under the same turbulence conditions shows that, at lower turbulence, Sophia-SPGD does not demonstrate a significant advantage in convergence speed over SPGD. However, a comparison of the convergence curves under varying turbulence conditions reveals that as the turbulence intensity increases, the convergence performance of the Sophia-SPGD control algorithm progressively exceeds that of the SPGD control algorithm. Both algorithms can optimize and compensate for aberrations to some extent. However, under high-turbulence conditions, the correction accuracy is degraded by the limited correction capacity of the deformable mirror.
To examine the wavefront correction effects of the SPGD and Sophia-SPGD algorithms in more detail, a typical phase screen generated under a turbulence strength of D/r0 = 10 is selected. The uncorrected wavefront, its corresponding 3rd–18th Zernike coefficients, and the initial uncorrected PSF are provided in Figure 5a, b, and c, respectively. The simulation results obtained via the SPGD algorithm are shown in Figure 5d–f, and the simulation results obtained via the Sophia-SPGD algorithm are shown in Figure 5g–i. After 500 iterations, both algorithms eliminate the wavefront aberration effectively, and the corresponding far-field PSF is well formed. Judging from the distribution of the Zernike coefficients of the wavefront residuals after correction by the two algorithms, the correction performance of the Sophia-SPGD algorithm is better than that of the SPGD algorithm.
To investigate whether the algorithms fluctuate in the later iterations of the wavefront correction, we plot the SR iteration curves of the SPGD and Sophia-SPGD algorithms (Figure 6). The atmospheric turbulence strength is D/r0 = 15 for all of these realizations. During the iteration, the SR curve of conventional SPGD exhibits fluctuations. In contrast, the SR curve of the Sophia-SPGD algorithm is relatively smooth and shows no significant fluctuations in the later stage of the iteration.
In addition, to better verify the stability of the algorithm, we use the Euclidean norm $\|\Delta u^{(k)}\|_2$ to measure the update of the control voltage vector $u$ during the iteration. Figure 7 illustrates $\|\Delta u^{(k)}\|_2$ at each step of each algorithm throughout the iteration process. During the iteration of SPGD, the step sizes change drastically between adjacent iterations. In contrast, during the iteration of Sophia-SPGD, the changes between adjacent step sizes are gentle and tend to stabilize in the later stage of the iteration.
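A per-iteration update norm of this kind can be computed from a stored voltage history as in the sketch below. The array layout (one row per iteration, one column per actuator) and the small example history are assumptions made for illustration.

```python
import numpy as np

def update_norms(voltage_history):
    """Euclidean norm of the change in the control voltage vector at each iteration.

    voltage_history: array-like of shape (iterations + 1, channels).
    """
    u = np.asarray(voltage_history, dtype=float)
    return np.linalg.norm(np.diff(u, axis=0), axis=1)

# Hypothetical history of a two-channel corrector over two iterations.
norms = update_norms([[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]])
```

A sequence of norms that varies smoothly and shrinks over time indicates stable late-stage behaviour, whereas large jumps between neighbouring entries indicate the drastic step changes described for fixed-gain SPGD.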
Convergence speed is a critical metric in the control of adaptive optics systems. Under the three turbulence conditions, the initial average MRR values were 0.304, 0.227, and 0.186, respectively; after correction by the SPGD control algorithm, the average MRR values at the convergence extremum were 0.753, 0.527, and 0.428, respectively. Taking the initial value plus 80% of the MRR correction range under each turbulence condition yielded values of 0.663, 0.467, and 0.380. These values serve as benchmarks to analyze and compare the number of iterations required by the two control algorithms to achieve the same correction effect, as shown in Table 1. The simulation data demonstrate that the Sophia-SPGD algorithm is more efficient than the SPGD algorithm. At a turbulence strength of D/r0 = 10, the correction speed of the Sophia-SPGD algorithm is 35.6% faster than that of the SPGD algorithm; at D/r0 = 15, it is 75% faster; and at D/r0 = 20, it is 79.5% faster.
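The three benchmark values quoted above are consistent with taking the initial value plus 80% of the improvement achieved by SPGD. That reading is inferred from the numbers rather than stated explicitly, and the short check below reproduces it.

```python
def benchmark(initial, converged, fraction=0.8):
    """Metric value reached after `fraction` of the total correction range."""
    return initial + fraction * (converged - initial)

# Initial and SPGD-converged averages quoted for D/r0 = 10, 15, and 20.
initial_values = [0.304, 0.227, 0.186]
converged_values = [0.753, 0.527, 0.428]

benchmarks = [round(benchmark(a, b), 3)
              for a, b in zip(initial_values, converged_values)]
```

The rounded results are 0.663, 0.467, and 0.380, matching the quoted benchmarks.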
Under the three turbulence conditions, the initial average SR values were 0.136, 0.084, and 0.058; after correction by the SPGD control algorithm, the average SR values at the convergence extremum were 0.899, 0.714, and 0.575, respectively. Calculating 80% of the SR correction range under different turbulence conditions yielded values of 0.746, 0.573, and 0.472. These values serve as benchmarks to analyze and compare the number of iterations required by the two control algorithms to achieve the same correction effect, as shown in Table 2. At a turbulence strength of D/r0 = 10, the correction speed of Sophia-SPGD is 65.9% faster than that of SPGD; at D/r0 = 15, it is 80.4% faster; and at D/r0 = 20, it is 93.2% faster.
In conclusion, the convergence results for both the MRR and SR indicate that as the turbulence intensity increases, the achievable correction becomes constrained. Notably, under conditions of strong turbulence, Sophia-SPGD consistently achieves faster convergence than SPGD.
4. Experiments and Results
To evaluate the correction capability of the wavefront sensorless adaptive optics system based on the Sophia-SPGD algorithm, the aberrations within the experimental system were selected as the reference aberrations. A 97-element piezoelectric deformable mirror was controlled to correct these aberrations, and experiments were conducted to verify the correction performance for both point targets and extended targets. The deformable mirror used was a high-speed deformable mirror with 97 piezoelectric elements manufactured by ALPAO, France, with an effective aperture of 22.5 mm and an actuator spacing of 2.5 mm. The CCD camera utilized was the Prime BSI Express Scientific camera, which was developed and produced by Teledyne Photometrics. The experimental results were evaluated via the MR as a performance metric.
In the point target correction experiment, the setup followed the optical path shown in Figure 1. We employed an adjustable-power semiconductor laser (wavelength of 635 nm) as the light source. During the experiment, the laser emitted a 22.5 mm diameter beam through a collimating system, matching the effective aperture of the deformable mirror. The beam was reflected by the deformable mirror, passed through a beam reducer (12.5×), and was finally focused onto the target surface of the CCD camera by a focusing lens (f = 150 mm). An image acquisition card captured the spot information detected by the CCD and transferred it to a MATLAB 2022b program on the computer for analysis. The program calculated the average radius of the spot within a selected area as the performance metric for the algorithm. The required control voltages for the deformable mirror were converted by a D/A card, amplified by a high-voltage amplifier, and applied to the 97 elements of the mirror, causing it to deform and optimize the spot radius. The algorithm iterated until it reached a predetermined number of iterations or converged to an optimal value. The algorithm used in this experiment was the same as that used in the previously described simulation experiment.
Figure 8a displays the far-field intensity distributions before correction under the static aberration of the system. The diffraction limit of the experimental system was calculated to be 19.8 pixels. The far-field spot patterns show that the SPGD algorithm reduced the MR from 89 pixels to 36 pixels, whereas the Sophia-SPGD algorithm reduced it from 89 pixels to 27 pixels. Additionally, the number of iterations required to reach a far-field spot size of three times the diffraction limit was 152 and 133 for the SPGD and Sophia-SPGD algorithms, respectively. Three-dimensional far-field intensity distributions are shown in Figure 8b,c, where it can be clearly seen that the largest central energy was obtained with the Sophia-SPGD correction.
Figure 9 shows the experimentally obtained MR curves during correction via the SPGD and Sophia-SPGD algorithms.
The layout of the extended-target-correction experimental system is shown in Figure 10. We used an LED with a center wavelength of 570 nm as the light source. The extended target was a resolution test board (model GH-YP832, Guangzhou, China), which consisted of 29 sets of four-direction stripes with line widths from 2.5 µm to 500 µm. The diffraction limit of the constructed optical system was 25 µm. The deformable mirror used for aberration correction was a 97-unit MEMS DM made by ALPAO, Grenoble, France. The camera was a 16-bit CCD camera. Lens L1 has a focal length of 400 mm, and L2 has a focal length of 100 mm; together they form a 4f system. The process is similar to that of the point target tests: an image acquisition card collects information about the extended target from the CCD camera and transfers it to MATLAB 2022b programs on a computer for analysis. The program evaluates the algorithm's performance via an image frequency evaluation function within a selected area. The control voltages for the deformable mirror, which are calculated via the algorithm, are converted and amplified before being applied to the 97 elements of the mirror, which induces deformation to optimize the performance metrics. The algorithm iterates until it reaches a predetermined number of iterations or converges to the optimal value. The algorithm used in this experiment was the same as that employed in the previous simulation experiments.
Figure 11a shows that the original stripe image is blurred in detail. After correction with the SPGD algorithm (Figure 11b), although there was some improvement in image detail, blurriness remained. Figure 11c shows that the newly proposed Sophia-SPGD algorithm successfully removed the blur, resulting in a clearer overall structure that more closely approximated the ideal diffraction-limited image. Image quality was assessed via the custom-modified frequency assessment function F. The F value of the image before correction was 4.1275; after correction, the F value of the SPGD algorithm was 4.6399, and the F value of the Sophia-SPGD algorithm was 5.8653.
F value curves during correction via the SPGD and Sophia-SPGD algorithms are shown in Figure 12. Compared with SPGD, Sophia-SPGD requires fewer iterations to achieve the same image quality evaluation metric. The images corrected by the SPGD algorithm were unsatisfactory, whereas the images corrected by Sophia-SPGD had satisfactory visual results.