Dispersion-Oriented Inverse Design of Photonic-Crystal Fiber for Four-Wave Mixing Application

Abstract: In this paper, we demonstrate the application of a deep learning neural network (DNN) in the dispersion-oriented inverse design of photonic-crystal fiber (PCF) for the fine-tuning of four-wave mixing (FWM). An empirical formula of PCF dispersion is applied instead of numerical simulation to generate a large dataset of phase-matching curves of various PCF designs, which significantly improves the accuracy of the DNN prediction. The accuracies of the DNN-predicted PCF structure parameters are all above 95%. Simulations of the DNN-predicted PCF structures demonstrate that the FWM wavelength has a mean absolute error (MAE) of 1.92 nm from the design target. With the help of the DNN, we designed and fabricated a specific PCF for wavelength conversion via FWM from 1064 nm to 770 nm for biomedical imaging applications. Pumped by a microchip laser at 1064 nm, the signal wavelength is measured at 770.2 nm.


Introduction
Photonic crystal fiber (PCF) provides tight confinement of light and dispersion design flexibility [1]. By designing the air hole diameter (d) and hole pitch (Λ), as shown in Figure 1, we can precisely control the mode properties [2]. PCFs with ultra-flattened dispersion [3] and a tunable zero-group-dispersion wavelength in the visible region [4] have been studied. PCF is therefore a desirable medium for sensors; for example, two elliptical channels on either side of the PCF core, each with a distinct coating film, provide a response to magnetic field and temperature via SPR [5,6]. Moreover, PCF is a favorable medium for efficiently acquiring various nonlinear optical effects, such as four-wave mixing (FWM) [7]. FWM in PCF enables a wide range of nonlinear wavelength conversion through careful design of the PCF dispersion [8]. Y. Chen reported an ultraviolet (UV) signal as short as 342 nm generated by pumping a well-designed photonic crystal fiber with a 532 nm microchip laser [9]. Using a dispersion-engineered tapered photonic crystal fiber, Sidi-Ely et al. numerically demonstrated parametric gain around 2 μm in the mid-infrared region [10]. R. Jiang et al. presented a WDM system employing a single-pump parametric process in a PCF with two ZDWs, in which the 1550 nm band is translated to the visible 500 nm spectral window [11]. Developing efficient methods of PCF dispersion design is essential for generating light at an arbitrary wavelength via FWM, with applications in wavelength division multiplexing for future high-capacity all-optical networks [12] and in excitation light sources for microspectroscopy and gas lasers [13].
Currently, the design of PCF for FWM mostly relies on manual trial-and-error iteration, which is time-consuming and experience-dependent [14]. The finite element method (FEM) is typically used to calculate the modal properties of photonic crystal fiber with high accuracy. However, the calculation takes several tens of minutes, depending on the mesh resolution and the complexity of the structure. The trial-and-error approach to PCF design therefore becomes impractical for fine-tuning wavelength conversion.
Computer-aided fiber design (CAFD) has been widely applied to dispersion-oriented inverse PCF design and optimization in place of manual trial and error. Reported CAFD methods are based either on deep learning neural networks (DNN) [15] or on traditional optimizers, e.g., genetic algorithms (GA) [16], particle swarm optimization (PSO) [17], and simulated annealing (SA) [18]. Traditional optimizers iterate over forward calculations and often take several days to complete a design. A trained DNN, which is applicable in both the forward and the inverse direction of fiber design, takes only a few seconds to return a result.
F. Poletti et al. employed a GA for the dispersion optimization of PCF and achieved a dispersion of 0 ± 0.1 ps/nm/km in the wavelength range between 1.5 and 1.6 μm [19]. However, it took between 12 and 20 h to reach convergence. Wang et al. proposed substituting the FEM simulation with an empirical dispersion formula [2] in the design of dispersion-oriented fiber. They employed a differential evolution (DE) genetic algorithm to complete the inverse EGI-PCF design process [20], which reduced the computation time to the order of seconds. However, as the number of parameters increases, the optimization time increases dramatically.
To overcome these obstacles, we propose applying a deep learning neural network (DNN) in the computer-aided inverse design of PCF for FWM. A DNN uses multiple layers of nonlinear processing units for highly accurate feature extraction and transformation [21]. Owing to its universality and high accuracy, it has been demonstrated in a broad range of photonics applications, e.g., optical image reconstruction [22] and signal analysis in optical communication [23]. In our work, the DNN maps the FWM wavelength conversion to its corresponding PCF structure. First, to accelerate the generation of the training dataset, we adopt the method from previous work [2]. We applied the finite element method (FEM) to simulate PCF dispersions with various transverse structures over a spectrum from 0.5 μm to 5 μm and derived a simulation-based empirical formula for PCF dispersion that depends only on the two structural parameters. Next, we calculated phase-matching curves from momentum and energy conservation and labeled them with the corresponding PCF structures as training data for the DNN. The trained DNN takes only 0.2 s on average on a laptop to predict a PCF structure. The predictive accuracies of Λ and d/Λ are all above 95%. Simulations of the DNN-predicted PCF structures demonstrate that the FWM wavelength has a mean absolute error (MAE) of 1.92 nm from the design target. Using our method, we designed and fabricated a PCF for wavelength conversion via FWM from 1064 nm to 770 nm, which finds application in two-photon absorption spectroscopy in biomedical imaging. Pumped by a microchip laser at 1064 nm, the signal is measured at 770.2 nm, which is offset by 0.2 nm from the target.

Principle of Phase-Matching in FWM
The nonlinear wavelength conversion of FWM is determined by the phase-matching curve, which satisfies energy and momentum conservation [24]. Equations (1) and (2) represent the energy and momentum conservation formulas, respectively. The phase-matching curves comprise a series of calculated signal and idler wavelengths spanning the pump wavelength range.
where $\lambda_j$ is the wavelength, $\omega_j$ is the angular frequency, and $n_j$ is the effective refractive index ($j$ = p, s, and i, representing pump, signal, and idler, respectively).

The term containing $P$ is the nonlinear phase mismatch, where $P$ is the pump power and $n_2$ is the nonlinear refractive index [25]. $A_\mathrm{eff}$ is the effective area of the fundamental mode, which is defined by Equation (3) [24].
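For orientation, the two conservation laws for degenerate (single-pump) FWM are commonly written in the textbook form below. This is a sketch under stated assumptions rather than a reproduction of Equations (1) and (2): it assumes that the nonlinear contribution enters as $2\gamma P$ with $\gamma = 2\pi n_2 / (\lambda_p A_\mathrm{eff})$, and the notation of the original equations may differ.

```latex
\begin{align}
  2\omega_p &= \omega_s + \omega_i
  \quad\Longleftrightarrow\quad
  \frac{2}{\lambda_p} = \frac{1}{\lambda_s} + \frac{1}{\lambda_i}, \\
  \frac{2 n_p \omega_p}{c} &= \frac{n_s \omega_s}{c} + \frac{n_i \omega_i}{c} + 2\gamma P,
  \qquad
  \gamma = \frac{2\pi n_2}{\lambda_p A_\mathrm{eff}} .
\end{align}
```

For a given pump wavelength, the first relation ties the idler to the signal, so the second relation is left with a single unknown; sweeping the pump wavelength then traces out the phase-matching curve.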

Empirical Formula of PCF Dispersion
Usually, PCF is numerically simulated to obtain the dependence of the effective index on wavelength, i.e., the dispersion curve. Then, the signal and idler wavelengths are calculated according to Equations (1) and (2). However, it often takes a laptop tens of minutes to obtain the effective index at a single wavelength using FEM software, e.g., COMSOL. To obtain a dispersion dataset of various PCF designs that is large enough to train the DNN, we instead build an empirical formula, represented by Equations (3)-(5), following the method reported in [2]. To obtain the fitting data for the empirical formula, COMSOL was used to numerically calculate the effective indices of the fundamental mode of PCFs with 8 layers of uniform air holes in a hexagonal lattice at wavelengths from λ = 0.5 μm to 5 μm (in steps of 50 nm). The pitch range is set as Λ = 2 μm to 5 μm (in steps of 0.1 μm), and the ratio range is set as d/Λ = 0.25 to 0.45 (in steps of 0.05), matching the single-mode PCF designs reported in practical applications. Next, we apply the universal conjugate gradient algorithm [26] to search for the fitting parameters that define $A_i$ and $B_i$ ($i$ = 1-4), which are summarized in Table 1. It is noteworthy that $n_\mathrm{co}$ is calculated according to the Sellmeier equation of silica glass [19] rather than taken as a constant of 1.45. Thus, the variation of the material refractive index with wavelength is included, which improves the prediction accuracy of FWM by the DNN. Compared with [2], the average error between empirical formulas and FEM simulations is reduced from 4.42 10 to 1.44 10 . The maximum and minimum errors of the effective index are 7.63 10 and 3.26 10 , respectively. Evaluating one effective index with the empirical formula takes 0.006 s on average. Compared to FEM approaches in COMSOL, the computation is at least six orders of magnitude faster.
where $n_\mathrm{eff}$ is the effective index of the fundamental guided mode; $n_\mathrm{co}$ is the refractive index of the silica glass core calculated according to the Sellmeier equation; $V$ and $W$ are the normalized frequency and the normalized transverse attenuation constant, respectively; $\lambda$ is the operating wavelength; and $a_\mathrm{eff}$ is the effective core radius, here assumed to be $\Lambda/\sqrt{3}$ [2].
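As a rough illustration of how these quantities combine, the sketch below evaluates the Sellmeier index of fused silica and converts a pair of $V$ and $W$ values into an effective index. It assumes the definitions of $V$ and $W$ used in [2], which give $V^2 - W^2 = (2\pi a_\mathrm{eff}/\lambda)^2 (n_\mathrm{co}^2 - n_\mathrm{eff}^2)$. The Sellmeier coefficients are the commonly tabulated Malitson values for fused silica; the function names and the example $V$ and $W$ values are hypothetical and are not taken from Table 1.

```python
import math

# Sellmeier coefficients of fused silica (Malitson, 1965); wavelengths in micrometres.
SELLMEIER_B = (0.6961663, 0.4079426, 0.8974794)
SELLMEIER_C_UM = (0.0684043, 0.1162414, 9.896161)


def n_silica(wavelength_um: float) -> float:
    """Refractive index of fused silica from the three-term Sellmeier equation."""
    lam2 = wavelength_um ** 2
    terms = (b * lam2 / (lam2 - c ** 2) for b, c in zip(SELLMEIER_B, SELLMEIER_C_UM))
    return math.sqrt(1.0 + sum(terms))


def n_eff_from_vw(wavelength_um: float, pitch_um: float, v: float, w: float) -> float:
    """Effective index from V and W via V^2 - W^2 = (k a_eff)^2 (n_co^2 - n_eff^2)."""
    a_eff = pitch_um / math.sqrt(3.0)            # effective core radius, Lambda / sqrt(3)
    k_a = 2.0 * math.pi * a_eff / wavelength_um  # dimensionless product k * a_eff
    n_co = n_silica(wavelength_um)
    return math.sqrt(n_co ** 2 - (v ** 2 - w ** 2) / k_a ** 2)


# Hypothetical V and W values, for illustration only (not fitted values).
n_co_1064 = n_silica(1.064)
n_eff_1064 = n_eff_from_vw(1.064, pitch_um=3.93, v=3.0, w=2.5)
print(f"n_co = {n_co_1064:.4f}, n_eff = {n_eff_1064:.4f}")
```

In an actual workflow, `v` and `w` would come from the fitted expressions in $\lambda/\Lambda$ and $d/\Lambda$ rather than being supplied by hand.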

The Calculation of Effective Modal Area
$A_\mathrm{eff}$ is the effective area of the fundamental mode, defined by Equation (3) [24]. The field integrals in this definition are evaluated using COMSOL. $A_\mathrm{eff}$ depends much less sensitively on wavelength and on the PCF structural parameters than the effective index does, and it has little impact on the phase-matching calculation. The scanning steps are therefore readjusted to 50 nm in wavelength, 0.1 μm in pitch, and 0.05 in d/Λ, which gives a total of 2635 calculated effective modal areas. The wavelength range is set as 0.8 μm to 1.6 μm, which is the pump wavelength range discussed in this article.
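The sketch below shows how an effective area can be evaluated from a sampled mode field, assuming the commonly used definition $A_\mathrm{eff} = (\iint |E|^2\,dA)^2 / \iint |E|^4\,dA$; whether this matches the exact form used in the paper is an assumption. A Gaussian field stands in for the simulated mode, because its effective area is known analytically to be $\pi w^2$. The function name, grid, and mode radius are hypothetical.

```python
import math


def effective_area(field, xs, ys):
    """A_eff = (sum |E|^2 dA)^2 / (sum |E|^4 dA) on a uniform rectangular grid."""
    cell = (xs[1] - xs[0]) * (ys[1] - ys[0])  # area of one grid cell
    sum_e2 = 0.0
    sum_e4 = 0.0
    for x in xs:
        for y in ys:
            e2 = abs(field(x, y)) ** 2
            sum_e2 += e2
            sum_e4 += e2 * e2
    return (sum_e2 * cell) ** 2 / (sum_e4 * cell)


# Hypothetical Gaussian mode with 1/e field radius w (micrometres).
w = 2.0
axis = [-10.0 + 0.1 * k for k in range(201)]  # -10 um to +10 um in 0.1 um steps
a_eff = effective_area(lambda x, y: math.exp(-(x * x + y * y) / w ** 2), axis, axis)
print(f"numerical A_eff = {a_eff:.4f} um^2, analytic pi*w^2 = {math.pi * w ** 2:.4f} um^2")
```

With an FEM solver, the same two sums would be taken over the exported field samples instead of an analytic function.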

Data Preparation for the DNN Model
As shown in Figure 2, 5 groups of labeled data, corresponding to 5 pump power configurations (i.e., 0.1 kW, 1 kW, 10 kW, 100 kW, and 1000 kW), are prepared for the DNN models. Accordingly, five separate DNNs are trained to verify the universality of our proposed method. In detail, the data group for each power configuration contains 12,621 phase-matching curves, each linked with the PCF structural parameters Λ and the ratio r (defined as d/Λ). The parameter Λ varies from 2 to 5 μm (in steps of 5 nm), and the ratio varies from 0.25 to 0.45 (in steps of 0.01). The effective index in Equation (2) is represented by the empirical formula in Equations (3)-(5). Using the empirical dispersion formula in Equation (2), phase-matched curves are computed by scanning the pump wavelength from 0.8 to 1.6 μm at an interval of 1 nm. The 12,621 labeled data points are divided into training, validation, and testing sets of 10,000, 2000, and 621, respectively. All values of $A_\mathrm{eff}$ in the 5 training datasets are generated from the 2635 simulated results in Section 2.3. Further details of the data processing methods are given in [27].
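The dataset sizes quoted in the text follow directly from the scan ranges and steps, as the short check below shows. The helper name `grid` is hypothetical, and the random shuffle before splitting is an assumption, since the text does not state how the three subsets were drawn.

```python
import random


def grid(start, stop, step, digits=3):
    """Inclusive, evenly spaced grid that avoids floating-point drift."""
    count = round((stop - start) / step) + 1
    return [round(start + k * step, digits) for k in range(count)]


# Structure labels for the DNN: pitch in micrometres and ratio d/pitch.
pitches_um = grid(2.0, 5.0, 0.005)       # 601 values (5 nm steps)
ratios = grid(0.25, 0.45, 0.01)          # 21 values
labels = [(p, r) for p in pitches_um for r in ratios]

# Pump wavelengths scanned for every phase-matching curve.
pump_um = grid(0.8, 1.6, 0.001)          # 801 values (1 nm steps)

# Coarser grid used for the effective-area simulations.
area_points = (len(grid(0.8, 1.6, 0.05))     # 17 wavelengths
               * len(grid(2.0, 5.0, 0.1))    # 31 pitches
               * len(grid(0.25, 0.45, 0.05)))  # 5 ratios

# Assumed random split into training, validation, and test subsets.
rng = random.Random(0)
rng.shuffle(labels)
train, val, test = labels[:10000], labels[10000:12000], labels[12000:]
print(len(labels), area_points, len(train), len(val), len(test))
```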

Implementation of the DNN Algorithm
The diagram of our DNN architecture is plotted in Figure 3. The neural network connects the input, a sequence of feature wavelengths, to the output, the PCF parameters (d and Λ). In the middle of Figure 3, the DNN first applies batch normalization to reduce the influence of network initialization and to speed up the training process. A ReLU activation follows to introduce nonlinearity. Next come four sub-layer groups that share the same structure. Each group comprises a fully connected layer, a ReLU activation layer, and a batch-normalization layer. The layer widths of the four sub-layer groups are 20, 13, 13, and 13 neurons, respectively. Before the output, a final fully connected layer reduces the number of nodes from 13 to 2. The 5 DNNs were trained with the same structure, shown in the middle part of Figure 3.
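The layer ordering described above can be sketched as a plain-Python forward pass. This is a minimal illustration rather than the authors' implementation: the input length `N_INPUTS` is hypothetical (the text does not state it), the weights are random, and the learnable scale and shift of batch normalization are omitted.

```python
import random

HIDDEN_WIDTHS = (20, 13, 13, 13)  # widths of the four sub-layer groups (from the text)
N_OUTPUTS = 2                     # predicted pitch and d/pitch
N_INPUTS = 50                     # hypothetical length of the input wavelength sequence


def layer_shapes(n_inputs):
    """(fan_in, fan_out) of every fully connected layer, in order."""
    widths = (n_inputs, *HIDDEN_WIDTHS, N_OUTPUTS)
    return list(zip(widths[:-1], widths[1:]))


def init_params(n_inputs, seed=0):
    """Random weights and zero biases for each fully connected layer."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(fan_in)] for _ in range(fan_out)],
             [0.0] * fan_out)
            for fan_in, fan_out in layer_shapes(n_inputs)]


def dense(batch, weights, bias):
    return [[sum(w * x for w, x in zip(row, sample)) + b
             for row, b in zip(weights, bias)] for sample in batch]


def relu(batch):
    return [[max(0.0, v) for v in sample] for sample in batch]


def batch_norm(batch, eps=1e-5):
    """Normalize every feature to zero mean and unit variance over the batch."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n for col, m in zip(columns, means)]
    return [[(v - m) / (var + eps) ** 0.5 for v, m, var in zip(sample, means, variances)]
            for sample in batch]


def forward(batch, params):
    x = relu(batch_norm(batch))                   # input batch norm, then ReLU
    for weights, bias in params[:-1]:             # four groups: FC, ReLU, batch norm
        x = batch_norm(relu(dense(x, weights, bias)))
    weights, bias = params[-1]                    # final FC layer: 13 -> 2
    return dense(x, weights, bias)


rng = random.Random(1)
demo_batch = [[rng.uniform(0.5, 2.0) for _ in range(N_INPUTS)] for _ in range(4)]
outputs = forward(demo_batch, init_params(N_INPUTS))
print(layer_shapes(N_INPUTS), len(outputs), len(outputs[0]))
```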
Model training can be described as finding the best-fit network parameters (i.e., the weights ($\omega$) and biases ($b$)) by minimizing a loss function. The loss function of our DNN is expressed in Equation (7) [21] and represents the error between the actual and predicted values. Gradient descent and back-propagation (BP) algorithms [28] are employed to minimize this discrepancy. In every epoch, the network parameters $\omega$ and $b$ are updated through chain-rule differentiation (see Equation (7)). After a sufficient number of iterations, the loss function gradually converges, indicating that the DNN has learned the high-dimensional relationship between the input and the output. The average training time for our DNN model is 2 min 24 s. The corresponding formulas for the DNN models are presented as follows.

 
where $\omega$ and $b$ correspond to the weights and biases, respectively; $L$ is the multi-processing layer; $m$ is the number of layers; and $\lambda$ is the regularization parameter.
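A loss of this kind, a data-misfit term plus a weight penalty, can be sketched as follows. This is a generic regularized mean-squared-error form given under stated assumptions; the normalization and the exact penalty of Equation (7) may differ, and the function and argument names are hypothetical.

```python
def regularized_mse(targets, predictions, weights, reg=1e-4):
    """Half mean squared error over (pitch, ratio) pairs plus an L2 weight penalty.

    targets, predictions: lists of (pitch, ratio) tuples of equal length.
    weights: flat list of all network weights (biases are not penalized here).
    reg: regularization parameter (assumed L2 form).
    """
    n_samples = len(targets)
    misfit = sum((t_pitch - p_pitch) ** 2 + (t_ratio - p_ratio) ** 2
                 for (t_pitch, t_ratio), (p_pitch, p_ratio) in zip(targets, predictions))
    penalty = 0.5 * reg * sum(w * w for w in weights)
    return misfit / (2.0 * n_samples) + penalty


# Hypothetical values: one sample with a 0.1 um pitch error and an exact ratio.
example_loss = regularized_mse([(3.9, 0.37)], [(4.0, 0.37)], weights=[0.2, -0.1])
print(example_loss)
```

Gradient descent then repeatedly moves each weight and bias a small step against the gradient of this scalar, with back-propagation supplying the gradients layer by layer.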

Evaluation of Trained DNN for FWM
To evaluate the DNN performance, the following three aspects have been examined. Firstly, we estimate the DNN stability by analyzing the training progression. As previously discussed, 5 DNNs have been established by the same method using data for 5 different pump powers (i.e., 0.1 kW, 1 kW, 10 kW, 100 kW, and 1000 kW). Here, we present the training process for the 0.1 kW (100 W) dataset. Figure 4 displays the progression of this model's training and validation, where the left and right graphs show the loss and the root mean square error (RMSE), respectively. After about 20 epochs, the training loss approaches 0 and remains at this level for the remaining epochs (see the solid blue lines in the diagrams below). The validation values (see the solid red lines in the two diagrams) closely follow the training curves. These curves confirm that our network has converged and that the model is stable. The other 4 DNN models, trained with pump powers of 1 kW, 10 kW, 100 kW, and 1000 kW, undergo the same examination and exhibit similar loss and RMSE behaviors. Next, we test our five trained DNNs using the held-out test datasets (621 data points each). Table 2 compares the DNN-predicted PCF structure parameters (PDNN) and the ground truth values in the test dataset (PTEST). A random function is used to choose 6 data sets out of the 621 for each of the 5 trained DNNs. We calculate the accuracy rates by comparing the DNN-predicted PCF structural values with the prepared test dataset (i.e., 5 groups of 621 data points). The accuracy is defined by Equation (9), and the resulting values for Λ and d/Λ are plotted in Figure 5.

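\[
\mathrm{accuracy} = \left(1 - \frac{\left|P_\mathrm{DNN} - P_\mathrm{test}\right|}{P_\mathrm{test}}\right) \times 100\% \qquad (9)
\]
where $P_\mathrm{DNN}$ is the DNN-predicted value, and $P_\mathrm{test}$ is the ground truth in the test dataset.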
In general, the accuracy rates of the 5 DNNs for Λ (blue square dots) and d/Λ (red square dots) are all above 95%, showing that our method can precisely output the required PCF geometries regardless of the pump power. The wavelength error of FWM conversion using DNN-estimated fiber structures is defined as the absolute deviation between the FWM-converted wavelength of the DNN-predicted structure and the target wavelength. The results are shown in Figure 6a-d for pump wavelengths of 852 nm, 976 nm, 1064 nm, and 1310 nm, respectively. In each diagram, the horizontal axis corresponds to the 5 trained DNNs (see tick labels of 0.1 kW, 1 kW, 10 kW, 100 kW, and 1000 kW). For 7 targeted wavelengths randomly chosen for each pump wavelength, the offsets of the converted wavelengths predicted by the DNNs are averaged, giving 1.406 nm, 2.38 nm, 2.494 nm, and 1.41 nm, respectively, and an overall average error of 1.92 nm. These data show that our DNN models reach a high accuracy level, which matters because even a slight systematic distortion of the structure could yield a different phase-matching profile.
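The two figures of merit used here can be reproduced with a few lines. The accuracy function assumes that Equation (9) is one minus the relative error, expressed in percent; the per-pump errors are the values quoted above, and their plain average gives the reported 1.92 nm. The function and variable names, and the example pitch values, are hypothetical.

```python
def accuracy_percent(p_dnn, p_test):
    """Accuracy as one minus the relative error, in percent (assumed form of Eq. (9))."""
    return (1.0 - abs(p_dnn - p_test) / abs(p_test)) * 100.0


# Average wavelength error per pump wavelength (nm), as quoted in the text.
error_by_pump_nm = {852: 1.406, 976: 2.38, 1064: 2.494, 1310: 1.41}
mean_error_nm = sum(error_by_pump_nm.values()) / len(error_by_pump_nm)

# Hypothetical example: a predicted pitch of 3.9 um against a true pitch of 4.0 um.
example_accuracy = accuracy_percent(3.9, 4.0)
print(f"mean error = {mean_error_nm:.2f} nm, example accuracy = {example_accuracy:.1f}%")
```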
Overall, these three sets of results demonstrate that DNNs are capable of "learning" the inverse mapping between the wavelength characteristic sequence and the PCF structure.

Experimental Demonstration
We apply our inverse design method to a PCF that generates coherent light at 770 nm via FWM when pumped at 1064 nm. The wavelength of 770 nm is a fluorescence excitation wavelength commonly used in two-photon imaging systems [29].
We assumed a pump power of 6 kW and obtained the optimal fiber structure parameters Λ = 3.93 μm and d/Λ = 0.37 predicted by the trained DNN. We performed numerical simulations using COMSOL to verify the predicted structure parameters. According to the calculated phase-matching curve, FWM in the PCF generates signal and idler wavelengths at 770 nm and 1721.2 nm, respectively, under 1064 nm pumping, as shown in Figure 7, which satisfies our design target. According to the structural parameters predicted by the neural network, we fabricated a PCF with five layers of homogeneous hexagonal air holes using the stack-and-draw method. As discussed in [19,30], the PCF dispersion is most affected by the three innermost rings of air holes, while the contribution from the sixth ring and beyond is almost negligible. Therefore, a PCF with five layers of cladding was fabricated. The structural parameters of the fabricated PCF, measured with an optical microscope, are Λ = 3.82 μm and d/Λ = 0.41. The scanning electron micrograph (SEM) of the manufactured fiber is shown in the inset of Figure 8a. We set up the FWM experiment as shown in Figure 8a. The experiment was conducted at a room temperature of 25 °C. The pump source is a microchip laser (pulse power SNP-20F-100, 600 ps pulse duration at 1064 nm, 19 kHz repetition rate, 140 mW average power). The pump was coupled into 2.38 m of the PCF via a pair of mirrors and a lens (focal length: 7 mm). The FWM wavelengths were measured at the output end of the fiber. Output spectra were recorded with an optical spectrum analyzer (OSA) (Yokogawa, AQ6374).
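Two of the numbers in this paragraph can be cross-checked with simple arithmetic. Energy conservation for degenerate FWM, $2/\lambda_p = 1/\lambda_s + 1/\lambda_i$, fixes the idler once the pump and signal are chosen, and the quoted laser specifications give an order-of-magnitude estimate of the peak power. The peak-power estimate assumes rectangular pulses and ignores coupling losses; both function names are hypothetical.

```python
def idler_wavelength_nm(pump_nm, signal_nm):
    """Idler wavelength from energy conservation: 2/pump = 1/signal + 1/idler."""
    return 1.0 / (2.0 / pump_nm - 1.0 / signal_nm)


def peak_power_w(average_power_w, repetition_rate_hz, pulse_duration_s):
    """Peak power for rectangular pulses: pulse energy divided by pulse duration."""
    return average_power_w / (repetition_rate_hz * pulse_duration_s)


target_idler_nm = idler_wavelength_nm(1064.0, 770.0)   # design target
laser_peak_w = peak_power_w(0.140, 19e3, 600e-12)      # 140 mW, 19 kHz, 600 ps
print(f"idler = {target_idler_nm:.1f} nm, laser peak power = {laser_peak_w / 1e3:.1f} kW")
```

The computed idler agrees with the 1721.2 nm design value, and the estimated peak power of the laser is of the same order as the 6 kW assumed for the design.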
Figure 8b shows that the signal wavelength is 770.2 nm and the idler wavelength is 1724.9 nm. Compared with the target signal at 770 nm and idler at 1721.2 nm, the errors of the signal and idler are 0.2 nm and 3.7 nm, respectively. Table 3 displays the most important PCF structure and wavelength data. Temperature control of the PCF is often used to tune the FWM wavelength offset in experiments. Grzegorz et al. have demonstrated that the ZDW thermal shift for fused silica fibers is +0.020 nm/°C. The four-wave-mixing phase-matching condition changes as the ZDW of the fiber moves farther away from the pump wavelength with rising temperature [31]. We set up the experiment as illustrated in Figure 8 and increased the PCF temperature from 30 °C to 100 °C (in steps of 10 °C). Table 4 summarizes the measured FWM wavelengths at different temperatures. The FWM tuning rate is about 0.1 nm per 10 °C on average.

Figure 1. Scheme of PCF inverse design. Forward: the PCF structure determines the phase-matching calculation. There is a one-to-one correspondence between the PCF structure (Λ, d) and the PCF index profile. For a fixed pump power, one specific phase-matching curve is determined. The training data are 12,621 phase-matching curves labeled with various PCF structures. Inverse: PCF design for wavelength conversion based on the DNN algorithm.

$A_i$ and $B_i$ are the coefficients that express the normalized frequency ($V$) and the attenuation constant ($W$) as functions of $\lambda/\Lambda$ and $d/\Lambda$. The fitting parameters ($i$ = 1-4) represent the functional relation of $A_i$ and $B_i$ with $d/\Lambda$ [2].

Figure 2. The construction diagram of the labeled data for training.

Figure 3. The flow chart of the inverse design based on the DNN algorithm. The input is the pair of pump and target wavelengths, and the output of the DNN is the predicted PCF structure on demand. The inverse design connects the input feature wavelengths (pump and target) to the output PCF parameters (d and Λ). On the left side of Figure 3, the input data are modified to characterize the represented PCF precisely. The modification starts by locating the phase-matching curve closest to the targeted pump and target wavelength position. A new curve is then generated by shifting this nearest curve so that it passes through the targeted position. Lastly, the vertical-axis values of the new curve are extracted and fed into the DNN.

$\Lambda^{(i)}$ and $r^{(i)}$ are the actual outputs of Λ and d/Λ, while $\hat{\Lambda}^{(i)}$ and $\hat{r}^{(i)}$ are the predicted values. The ratio $r$ refers to d/Λ.

Figure 4. The loss (a) and RMSE (b) plots for the training (red lines) and validation (blue curves) datasets.

Figure 5. The average accuracy rate of DNN-predicted PCF structure parameters Λ and d/Λ under five different pump powers.

Figure 6. (a-d) The simulated converted wavelength error under different pump powers with pump wavelengths of 852 nm, 976 nm, 1064 nm, and 1310 nm, respectively.

Figure 8. (a) The diagram of the experimental setup for four-wave mixing with the fabricated PCF. The inset is a scanning electron micrograph (SEM) of the PCF fabricated in this work. (b) The output spectrum of 2.38 m of the fabricated PCF. The signal wavelength is 770.2 nm; the idler wavelength is 1724.9 nm.

Table 2. The ground truth values in the test dataset (PTEST) and the DNN-predicted PCF structure parameters (PDNN).

Table 4. FWM in PCF at different temperatures.