Article

A Quick Method for Predicting Reflectance Spectra of Nanophotonic Devices via Artificial Neural Network

Electronic Engineering College, Heilongjiang University, Harbin 150080, China
* Author to whom correspondence should be addressed.
Nanomaterials 2023, 13(21), 2839; https://doi.org/10.3390/nano13212839
Submission received: 20 September 2023 / Revised: 17 October 2023 / Accepted: 25 October 2023 / Published: 26 October 2023

Abstract

Nanophotonics exploits the interaction between light and subwavelength structures to design nanophotonic devices that exhibit unique optical, electromagnetic, and acoustic properties not found in natural materials. However, such design usually requires considerable expertise and many time-consuming electromagnetic simulations. With the continuous development of artificial intelligence, attention is turning to deep learning for the design of nanophotonic devices. Deep learning models fit the mapping between input parameters and outputs using weights and biases, and can return results in milliseconds to seconds. In this paper, we use the finite-difference time-domain (FDTD) method for simulations and obtain the reflectance spectra of 2430 different structures. Based on these reflectance spectra, we train a neural network that can quickly predict the reflectance spectra of unseen structures. The effectiveness of this method is verified by comparing the predicted results to the simulation results. Almost all results maintain the main trend: the mean squared error (MSE) of 94% of the predictions is below 10⁻³ and all are below 10⁻², while the mean absolute error (MAE) of 97% of the predictions is below 2 × 10⁻². This approach can speed up device design and optimization and provides a reference for researchers.

1. Introduction

Nanophotonic devices refer to photonic devices with dimensions on the nanometer scale that utilize the unique properties of nanoscale structures and the light–matter interaction to control and manipulate the propagation, emission, and detection of light. These devices have been widely applied in information transmission and processing, biomedical applications, and sensing [1,2,3]. The performance and applications of nanophotonic devices are influenced by the materials, dimensions, shapes, and structures of the devices [4,5,6,7,8,9], requiring in-depth research in device design, fabrication, and characterization [10]. Traditional approaches typically involve electromagnetic simulations based on software packages [11,12], such as the finite-difference time-domain (FDTD) method or the finite element method (FEM), to simulate the spectra, magnetic fields, and electric fields. However, optimizing the design parameters often requires manual, trial-and-error adjustments in which unsuitable results are discarded, and this process can take weeks or even longer, requiring specialized expertise throughout. With the advancement of computer science [13,14,15,16,17], attention has turned to deep learning methods [18,19].
In 2015, LeCun et al. published a review of deep learning [20], which since 2012 had achieved tremendous breakthroughs, surpassing previous techniques in image [21], speech, and text processing. Deep learning methods have also been applied to the design of nanophotonic devices [22,23,24,25,26,27,28,29,30,31,32,33,34,35]. In 2018, Peurifoy et al. explored the application of artificial neural networks to the design of nanophotonic particles. Their network consisted of three inputs, seven outputs, and a hidden layer with five neurons, achieving accurate prediction of scattering spectra for simple nanoparticles [22]. Their network and nanoparticles are simple, but they illustrate the principle vividly. In the same year, Malkiel et al. utilized a deep neural network (DNN) to design and characterize "H"-shaped nanostructures [24]. Their network exceeds ten layers and has a large number of nodes trained on a large data set; in fact, most neural network methods follow this pattern. The aforementioned studies demonstrate the powerful capabilities of deep learning methods.
However, whether simple neural network structures can provide efficient prediction and classification remains a question. If simple shallow neural networks work, they will provide a faster method for people working with nonlinear data. Because of the nonlinear relationship between nanophotonic devices and their optical responses, we use an artificial neural network to learn the hidden relationship between the reflectance spectra and the structural parameters of nanophotonic devices. This paper discusses the application of a simple shallow ANN with three hidden layers to accurately predict the reflectance spectra trends of nanophotonic devices. Taking the Au-SiO2-Au structure as a simple example, model training is completed through appropriate data collection, data processing, and network training, so that structural parameters can be input into the model to quickly predict the reflectance spectra of the Au-SiO2-Au structure. The reflectance spectrum is a crucial indicator in the design of nanophotonic devices and is highly sensitive to the structural parameters of the devices. If a nanophotonic device has 5 parameters with 50 selectable values each, there are 50^5 ≈ 3.1 × 10^8 possible combinations, which would require substantial computational resources to simulate exhaustively; an ANN can instead quickly predict spectral trends using only thousands of data groups rather than traversing the hundreds of millions of combinations. Comparisons between model predictions and real values demonstrate that the model can accurately predict the reflectance spectrum trend of the Au-SiO2-Au structure. It is worth noting that most models are tested on a test set (a subset of the data set) rather than on data points outside the data set. Therefore, in this study, we conduct additional tests on structural values outside the data set. Compared with traditional methods (numerical methods such as FDTD and FEM), this approach significantly reduces the time required for prediction. Automated prediction through models and scripts greatly reduces the time needed to collect and organize data and provides a quick reference. It is like a photo album on a mobile phone: one can flip through the prediction results (pictures) one by one to determine whether a desired spectral trend appears. This research extends the application of the combination of nanophotonics and deep learning, providing a rapid reference for the design of nanophotonic devices and showing promising prospects for future developments.

2. Materials and Methods

2.1. Nanophotonic Device Model

We adapt the structure of Yao et al. [36] by adjusting its dimensions and adding a gold ring, and simulate it using the finite-difference time-domain (FDTD) method in Lumerical FDTD Solutions [http://www.lumerical.com/tcad-products/fdtd/] (accessed on 12 May 2023), as shown in Figure 1a,b. The substrate is Si, the intermediate metal material is Au, and SiO2 serves as the dielectric. All three layers have equal length and width, with x = y = 200 nm. The thickness of the Si substrate is HSi = 150 nm, the thickness of Au on the Si layer is HAu = 100 nm, and the thickness of SiO2 is HSiO2 = 30 nm, giving a total thickness of H = 280 nm for the three layers. On top there are a gold circle a1 and a gold ring a2 whose radii and heights can be controlled separately. For convenience of statistical analysis, we set specific structural parameter spacings; in fact, there are countless combinations within the value ranges of h2, ro, ri, h1, and r1. The parameter combinations we set are as follows.
The initial radius of the gold circle is r1 = 10 nm; the radius is increased by 10 nm in each of 8 cycles, giving radii of 10 nm, 20 nm, …, 80 nm, and 90 nm. The initial height of the gold circle is h1 = 25 nm; the height is increased by 25 nm in each of 2 cycles, giving heights of 25 nm, 50 nm, and 75 nm. The initial inner radius of the gold ring is ri = 90 nm; the inner radius is increased by 10 nm in each of 4 cycles, giving inner radii of 90 nm, 100 nm, 110 nm, 120 nm, and 130 nm. The initial outer radius of the ring is ro = 140 nm; the outer radius is increased by 10 nm in each of 5 cycles, giving outer radii of 140 nm, 150 nm, 160 nm, 170 nm, 180 nm, and 190 nm. The initial height of the gold ring is h2 = 25 nm; the height is increased by 25 nm in each of 2 cycles, giving heights of 25 nm, 50 nm, and 75 nm. Arranging and combining the different radii and heights of the circle and ring gives a total of t = 2430 combinations. A plane-wave light source with a wavelength range of 400–700 nm is placed above the material, perpendicular to the X-Y plane and parallel to the Z-axis. Periodic boundary conditions are set in the X and Y directions. The FDTD mesh accuracy is set to 2 (mesh accuracy is an indicator of the mesh size used by the FDTD adaptive mesh; there are 8 accuracy levels, and level 2 corresponds to 1/10 of the minimum wavelength), the minimum wavelength for calculation is 0.1 nm, and 100 monitoring frequency points are recorded from 400 nm to 700 nm.
We use an FDTD script to carry out batch simulations, with one parameter changed in each cycle; the order of parameter changes is h2, ro, ri, h1, and r1. A total of 2430 simulations are carried out, taking 45 h. The data from each individual simulation, which takes 60 to 200 s, are collected automatically by the script. Each simulation yields 5 geometric parameters and 100 reflectance values, which together form the data set.
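To make the size of this parameter sweep concrete, the short Python sketch below enumerates the same grid of radii and heights with itertools.product. It only illustrates the combinatorics; it is not the Lumerical batch script used for the simulations, and the variable names are ours.

```python
from itertools import product

# Structural parameter grid described above (all values in nm).
r1_values = range(10, 91, 10)   # gold circle radius: 10-90 nm, 9 values
h1_values = (25, 50, 75)        # gold circle height, 3 values
ri_values = range(90, 131, 10)  # gold ring inner radius: 90-130 nm, 5 values
ro_values = range(140, 191, 10) # gold ring outer radius: 140-190 nm, 6 values
h2_values = (25, 50, 75)        # gold ring height, 3 values

combinations = list(product(h2_values, ro_values, ri_values, h1_values, r1_values))
print(len(combinations))        # 3 * 6 * 5 * 3 * 9 = 2430 structures
```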
We input over 2000 sets of simulation data into the neural network. Although generating the simulations requires considerable time and computing resources, once the neural network is trained, the model can accurately predict the spectra of countless structural combinations in just a few milliseconds to seconds.

2.2. Artificial Neural Network

Within the realm of deep learning, supervised learning stands as the prevailing approach. In this paradigm, a data set comprises multiple samples, with each sample containing an input and its corresponding output. The structure of a basic neural network encompasses an input layer, hidden layers, and an output layer, as depicted in Figure 2a. In the context of our paper, a reflectance spectrum is represented by a series of discrete points, as illustrated in Figure 2b. For the implementation of our work, we leverage the PyTorch framework [14], which provides a robust and efficient platform for our deep learning experiments.
An artificial neural network (ANN) is a computational model designed to mimic the structure and function of biological neural networks. It consists of many neural units that interact through weighted connections, forming a hierarchical structure. ANNs are used in fields such as pattern recognition, classification, data mining, natural language processing, and image processing, and have become an important component of machine learning. The most basic building block of an ANN is the neuron. A neuron receives inputs from other neurons, weights these inputs by their respective connection strengths, and adds a bias value before feeding the result into an activation function for nonlinear transformation. The output of one neuron can then serve as an input to subsequent neurons, forming layers of interconnected neurons that constitute the network. In an ANN, the backpropagation algorithm is commonly used to iterate repeatedly over a training set, adjusting the connection weights and bias values to minimize the error between the network outputs and the real results. After training, an ANN can be used for prediction and classification on new data [18,19,37]. The advantages of ANNs include good nonlinear fitting ability, adaptability, robustness, and generalization; their disadvantages include a tendency to overfit, the need for large amounts of training data, and long computation times. In recent years, with the rise of deep learning, the deep neural network (DNN) has become a widely used form of ANN and has achieved many significant research outcomes [30,38].
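As a minimal illustration of the neuron computation just described (weighted sum, bias, and nonlinear activation), the following PyTorch sketch uses arbitrary example values; it is not part of the model used in this paper.

```python
import torch

# A single artificial neuron: weighted sum of inputs plus bias,
# followed by a nonlinear activation. Values are arbitrary examples.
x = torch.tensor([0.5, -1.0, 2.0])   # inputs from preceding neurons
w = torch.tensor([0.8, 0.3, -0.5])   # connection weights
b = torch.tensor(0.1)                # bias

z = torch.dot(w, x) + b              # weighted sum + bias
y = torch.relu(z)                    # nonlinear activation (ReLU)
print(y)
```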
When a neural network fits a particular set of data well but fails to fit unseen data, this is called overfitting. Adjusting hyperparameters can improve the model's fitting ability and prevent overfitting and underfitting. The following are some important hyperparameters of a neural network: the loss function, number of epochs, learning rate, activation function, and number of nodes [39,40].
(1) The loss function is used to measure the gap between a model’s predicted output and real output. In deep learning, a loss function is typically used to guide the model’s learning process by minimizing the loss value, thereby improving the accuracy of the model. Different tasks require different types of loss functions. Cross entropy is commonly used as a loss function in classification problems, while mean squared error or mean absolute error is generally employed in regression tasks. The selection of a loss function is also influenced by the complexity of the model architecture and task requirements.
(2) An epoch refers to one complete pass of the entire training data set through the model. During training, the data are usually divided into batches, each containing multiple samples. The model performs forward propagation, loss calculation, backward propagation, and parameter updates on each batch; once all batches in the training data set have been used, one epoch is complete. In deep learning, multiple epochs are usually required to improve accuracy and generalization.
(3) Learning rate is a hyperparameter used in neural network training to control the step size taken when updating weights during each iteration (or each epoch). A larger learning rate results in a bigger step size, accelerating model convergence, but may cause the model to oscillate around the optimal point or even fail to converge. Conversely, a small learning rate slows down model convergence, requiring more iterations to reach the optimal point. Therefore, choosing a reasonable learning rate is crucial for neural network training. Typically, the optimal learning rate is selected through experiments.
(4) An activation function transforms input signals into output signals. In neural networks, activation functions introduce nonlinear characteristics to neurons, allowing the network to model more complex data. Popular activation functions include the Sigmoid, Tanh, and ReLU functions. These three activation functions can be expressed mathematically as follows and are plotted in Figure 3:
$$\mathrm{Sigmoid:}\quad \sigma(x) = \frac{1}{1 + e^{-x}}$$
$$\mathrm{Tanh:}\quad \sigma(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
$$\mathrm{ReLU:}\quad \sigma(x) = \max(0, x)$$
(5) The number of nodes typically refers to the number of neurons in each layer of a neural network. In the input layer, the number of nodes equals the dimensionality of the input data; in the output layer, it equals the dimensionality of the output. Increasing the number of nodes in the hidden layers theoretically enhances the model's expressive power, but it also increases training time and resource consumption and carries a risk of overfitting.
(6) An optimizer is an algorithm used to update the model parameters with the aim of minimizing the loss function [41]. The choice of optimizer affects the convergence speed and accuracy of the model [42]. Some commonly used optimizers are listed below, followed by a short sketch of how they are instantiated:
Stochastic gradient descent (SGD): It updates the parameters using the gradient of a single sample at a time. It is suitable for small data sets [20].
Batch gradient descent (BGD): It updates the parameters using the gradient computed from the entire data set. It is suitable for large data sets but has a slower convergence speed.
Adam optimizer: The Adam optimizer can be considered as one of the most widely used optimization algorithms due to its fast convergence and stable convergence process. It requires only first-order gradients and consumes minimal memory.
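For reference, these optimizers are available in PyTorch as torch.optim classes; the sketch below shows how they might be instantiated for a placeholder model. The learning rates here are illustrative only and are not the final setting used in Section 2.4.

```python
import torch
import torch.nn as nn

model = nn.Linear(5, 100)  # placeholder model for illustration only

# SGD and Adam differ in how gradients are used to update the parameters;
# both are provided by torch.optim.
sgd_optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
adam_optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```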

2.3. Training Process

In deep learning, training sets, validation sets, and test sets are data sets used for training models, optimizing models, and evaluating the performance of a model. Typically, the training set constitutes most of the data set and is used to train the model. The validation set is usually a randomly selected portion of the training set. The test set is an independent data set that has never been exposed to the model and is used to evaluate the performance of the deep learning model.
In Section 2.1, we collect a data set of 2430 samples by FDTD. Each sample consists of a set of inputs, called features, and a set of reflectance spectrum outputs, called labels, which together form the data set. The radii and heights of the top circle and ring of the device are assigned as input variables to the nodes of the input layer, and the reflectance spectrum corresponding to each set of input variables is taken as the output. The data collection process takes 45 h on a computer with a 2.90 GHz CPU, 16 GB of RAM, and an RTX 3060 GPU with 12 GB of memory.
During the data preparation phase, we randomly split the collected 2430 samples into two separate data sets: a training set and a test set. The training set accounts for 90% of the data set, while the test set accounts for 10%.
During model training, we further split the previously defined training set into a training subset and a validation subset. The validation subset is randomly sampled from the training set, typically using a 70% training and 30% validation ratio, to which we also adhere. After training the model on the training subset, we calculate metrics such as the loss function on the validation subset to assess the model’s performance and identify areas for improvement. If the model performs poorly on the validation subset, we adjust and optimize the model hyperparameters to enhance its performance. Finally, we evaluate the model’s prediction capabilities using the test set.
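A minimal PyTorch sketch of this splitting scheme is shown below, assuming the 2430 samples are already loaded as tensors; the random placeholders stand in for the actual features (5 structural parameters) and labels (100 reflectance values).

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder data: 2430 samples of 5 structural parameters and 100 reflectance values.
features = torch.rand(2430, 5)
labels = torch.rand(2430, 100)
dataset = TensorDataset(features, labels)

# 90% / 10% split into training set and test set.
n_test = int(0.1 * len(dataset))                              # 243 samples
train_full, test_set = random_split(dataset, [len(dataset) - n_test, n_test])

# Within the training set: 70% training subset, 30% validation subset.
n_val = int(0.3 * len(train_full))
train_subset, val_subset = random_split(train_full, [len(train_full) - n_val, n_val])
```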

2.4. Artificial Neural Network Structure

Our artificial neural network (ANN) is structured with 3 hidden layers, each consisting of 300 nodes, as depicted in Figure 4a. Having a smaller number of nodes helps prevent overfitting and ensures computational efficiency. We utilize ReLU as the activation function, which is a common choice for deep learning due to its ability to model complex nonlinear relationships.
To train our network, we employ the Adam optimization algorithm. Adam is well-suited for tasks like ours, offering a good balance between computational efficiency and convergence speed. The input layer of our neural network comprises 5 nodes, representing the radii and heights of a1 and a2 (r1, h1, ri, ro, h2). The output layer is made up of 100 nodes, corresponding to the discrete reflectance spectrum points spanning from 400 nm to 700 nm; these nodes capture the spectral information generated by the input parameters. In this paper, we treat spectral prediction as a regression problem and assess the performance of our model primarily using the MSE as the loss function. A lower MSE indicates a better fit of the model to the data.
To optimize the training process, we set the learning rate to 0.001. This rate controls the step size during training and is chosen to balance training speed and stability. Furthermore, we employ a batch size of 32, which means that we update the weights of the neural network after every batch of 32 data points, a common practice to enhance training efficiency.
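The following PyTorch sketch is one way to realize the architecture and hyperparameters described above (5 inputs, three hidden layers of 300 ReLU units, 100 outputs, MSE loss, Adam with a learning rate of 0.001, batch size 32). It is a reconstruction for illustration, not the authors' released code, and the random tensors stand in for the real training data.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SpectraANN(nn.Module):
    """Maps 5 structural parameters to 100 discrete reflectance values."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, 300), nn.ReLU(),
            nn.Linear(300, 300), nn.ReLU(),
            nn.Linear(300, 300), nn.ReLU(),
            nn.Linear(300, 100),
        )

    def forward(self, x):
        return self.net(x)

# Placeholder data standing in for the training subset
# (roughly 70% of the 90% training split of the 2430 samples).
features = torch.rand(1530, 5)
labels = torch.rand(1530, 100)
train_loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

model = SpectraANN()
criterion = nn.MSELoss()                                    # regression loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # learning rate 0.001

for epoch in range(6000):            # the paper trains for 6000 epochs
    for x_batch, y_batch in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x_batch), y_batch)
        loss.backward()
        optimizer.step()
```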

3. Results

Mean squared error (MSE) is widely used in regression problems and represents the mean of the squared differences between the predicted values and the real values. In the context of predicting discrete spectrum points, we treat the task as a regression problem and use the MSE to evaluate the accuracy of the prediction model. A higher MSE indicates larger prediction errors, while a lower MSE indicates smaller prediction errors. During model optimization, the model parameters are adjusted to minimize the MSE, making the predicted values $\hat{y}_i$ closer to the real values $y_i$, where n = 100 is the number of discrete spectrum values. The formula for MSE is as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$
The loss curve of the network is shown in Figure 4b,c. The training process is completed with 6000 epochs, and the final values of the training loss and validation loss are 3.0271 × 10−4 and 7.1863 × 10−4, respectively. The neural network shows ideal training and validation results.

3.1. Testing of the Test Set

After obtaining the optimal hyperparameters, the final step is to evaluate the performance of the trained model on the previously unseen test set, which is held out from the initial data set and contains 243 groups (10% of 2430). We use the trained model to predict the reflectance spectra for different radii and heights and compare them to the FDTD results, measuring the difference between the predicted values and the real values using the MSE and the mean absolute error.
The mean absolute error (MAE) is a common metric used to evaluate the difference between the predicted values $\hat{y}_i$ and the real values $y_i$; it measures the average absolute difference between them, where n = 100 is the number of discrete spectrum values. The formula for MAE is as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
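As a minimal illustration, the MSE and MAE between a predicted and a simulated spectrum can be computed directly from the two arrays of n = 100 points; the tensors below are random placeholders, not results from the paper.

```python
import torch

# Placeholder spectra: 100 discrete reflectance values each.
y_pred = torch.rand(100)   # ANN-predicted spectrum
y_real = torch.rand(100)   # FDTD-simulated spectrum

mse = torch.mean((y_pred - y_real) ** 2)       # mean squared error
mae = torch.mean(torch.abs(y_pred - y_real))   # mean absolute error
print(mse.item(), mae.item())
```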
Figure 5 shows the MSE and MAE between the FDTD simulations and the predictions on the test set. The vast majority of MSEs are less than 0.001, and the vast majority of MAEs are less than 0.02. For display, both metrics are plotted in logarithmic form, obtained by taking the logarithm (log) of their respective values.
The predicted results clearly showcase the proficiency of the artificial neural network (ANN) model in accurately predicting the reflectance spectra of photonic devices. Figure 6 provides a comprehensive comparison between representative ANN-predicted spectra and the corresponding spectra obtained through FDTD simulations; specifically, we have selected four samples to show the efficacy of the ANN model. Figure 5b shows that the MAE of 97% of the prediction results is less than 0.02. Figure 6a shows a prediction result with an MAE of 0.0365, which is larger than that of 97% of the other predictions. In addition, because of how the Lumerical FDTD (2018a) software samples the spectrum, the wavelength data are not in an arithmetic sequence (see the discrete point positions in Figure 2b, which are dense at shorter wavelengths and sparse at longer wavelengths), so all prediction results show larger errors in the longer-wavelength half.
Figure 6 illustrates several prediction results, and it can be observed that the majority of the results follow the main trends. This highlights the high reliability and practicality of the ANN model in capturing the essential features of the reflectance spectra. The model takes only about 33 s to produce the discrete spectral data and comparison images for the 243 sets of structural parameters in the test set, i.e., about 0.13 s per set of structural parameters, which is significantly faster than the traditional approach, in which a single FDTD (Lumerical) simulation of the structure in this paper takes 60–200 s.
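For completeness, a hedged sketch of the inference step is given below; the untrained placeholder network stands in for the trained model, so the timing and outputs will not match the values reported above.

```python
import time
import torch
import torch.nn as nn

# Untrained placeholder with the same shape as the trained network in Section 2.4.
model = nn.Sequential(
    nn.Linear(5, 300), nn.ReLU(),
    nn.Linear(300, 300), nn.ReLU(),
    nn.Linear(300, 300), nn.ReLU(),
    nn.Linear(300, 100),
)
test_features = torch.rand(243, 5)   # placeholder for the 243 test structures

model.eval()
start = time.time()
with torch.no_grad():                # inference only, no gradient tracking
    predicted_spectra = model(test_features)   # shape: (243, 100)
print(f"{time.time() - start:.3f} s for {len(test_features)} structures")
```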

3.2. Testing Outside the Data Set

In our paper, we utilize a data set to train and evaluate our model. However, to test the generalization capability of the model, we also conduct tests on structures outside the data set, with parameters within the ranges of r1, h1, ri, ro, and h2 but not on the sampled grid. Figure 7 presents two predicted results that maintain the main trends: the structure in Figure 7a, where r1 = 50 nm, h1 = 49 nm, ri = 131 nm, ro = 172 nm, and h2 = 74 nm, and in Figure 7b, where r1 = 11 nm, h1 = 75 nm, ri = 130 nm, ro = 161 nm, and h2 = 49 nm.

3.3. Model’s Application Potential

The preceding sections deal with tests of the model's accuracy. During this testing, we also discover that the structure with r1 = 49 nm, h1 = 49 nm, ri = 100 nm, ro = 142 nm, and h2 = 52 nm (parameters not included in the previously collected data set) exhibits significant absorption capability. We validate this structure using the FDTD method, as shown in Figure 8. Although this structure is presented merely as a straightforward example to illustrate the neural network approach, it is evident that it exhibits excellent absorption performance between 400 and 570 nm. The absorption rate approaches 1, indicating minimal energy loss in this wavelength range and high stability, making it suitable for absorber applications.
However, it is important to note that the polarization properties of the incident light consistently affect the absorber's absorption rate. Therefore, we analyze the variation in absorption rate by altering the polarization angle of the incident light. Setting the polarization angle θ to 0°, 10°, 20°, and 30°, it becomes apparent that within the 400–570 nm wavelength range, the average absorption rate remains at 90% when the polarization angle is 10°, demonstrating that the absorber possesses a strong absorption rate. As the polarization angle increases, the absorption rate gradually decreases. Hence, adjusting the polarization angle of the light source provides a means of controlling the absorption rate.

4. Discussion

In this paper, the main goal is to use the deep learning method, especially the neural network, to quickly and accurately predict the reflectance spectra of nanophotonic devices, taking a simple Au-SiO2-Au structure as an example. We design an artificial neural network, and through a series of tests, the accuracy and rapidity of the artificial neural network in predicting the reflectance spectra are verified.
To train the ANN, a small yet diverse data set is created by varying the radii and heights of the gold circle and gold ring. This data set serves as a robust foundation for training the network. A simplified ANN architecture with three hidden layers is selected, allowing for efficient training and deployment of the model.
The test results showcase impressive performance metrics. Approximately 94% of the predictions achieve an MSE below 10⁻³, while all predictions within the test set exhibit an MSE below 10⁻² across the wavelength range of 400 nm to 700 nm; 97% of the predictions demonstrate an MAE below 2 × 10⁻² in the same range. Importantly, we also conduct tests on structures outside the data set to assess model performance. The trained neural network model preserves the main features and trends of reflectance with respect to wavelength, ensuring the fidelity of the predicted spectra.
The trained ANN model demonstrates remarkable speed, obtaining spectral data for any parameter combination within the studied ranges in just 0.13 s, a significant improvement over the traditional numerical methods, which take 60–200 s per simulation. Moreover, it maintains the main spectral trends. This efficient prediction method holds great promise for optimizing the design of nanophotonic devices and reducing the associated costs.

Author Contributions

Conceptualization, Y.G.; Methodology, Y.G.; Software, R.W.; Validation, R.W. and G.W.; Formal analysis, G.W.; Resources, Y.G.; Data curation, R.W., B.Z. and G.W.; Writing—original draft, R.W., B.Z., G.W. and Y.G.; Writing—review & editing, R.W., B.Z. and Y.G.; Visualization, R.W.; Supervision, B.Z. and Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All content and data have been displayed in the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lu, C.C.; Yuan, H.Y.; Zhang, H.Y.; Zhao, W.; Zhang, N.E.; Zheng, Y.J.; Elshahat, S.; Liu, Y.C. On-chip topological nanophotonic devices. Chip 2022, 1, 100025.
  2. Mohanty, A.; Li, Q.; Tadayon, M.A.; Roberts, S.P.; Bhatt, G.R.; Shim, E.; Ji, X.; Cardenas, J.; Miller, S.A.; Kepecs, A.; et al. Reconfigurable nanophotonic silicon probes for sub-millisecond deep-brain optical stimulation. Nat. Biomed. Eng. 2020, 4, 223–231.
  3. Altug, H.; Oh, S.-H.; Maier, S.A.; Homola, J. Advances and applications of nanophotonic biosensors. Nat. Nanotechnol. 2022, 17, 5–16.
  4. Li, S.; Miao, P.; Zhang, Y.; Wu, J.; Zhang, B.; Du, Y.; Han, X.; Sun, J.; Xu, P. Recent advances in plasmonic nanostructures for enhanced photocatalysis and electrocatalysis. Adv. Mater. 2021, 33, 2000086.
  5. Anker, J.N.; Hall, W.P.; Lyandres, O.; Shah, N.C.; Zhao, J.; van Duyne, R.P. Biosensing with plasmonic nanosensors. Nat. Mater. 2008, 7, 442–453.
  6. Fan, X.; White, I.M.; Shopova, S.I.; Zhu, H.; Suter, J.D.; Sun, Y. Sensitive optical biosensors for unlabeled targets: A review. Anal. Chim. Acta 2008, 620, 8–26.
  7. He, Z.; Xue, W.; Cui, W.; Li, C.; Li, Z.; Pu, L.; Feng, J.; Xiao, X.; Wang, X.; Li, A.G. Tunable Fano resonance and enhanced sensing in a simple Au/TiO2 hybrid metasurface. Nanomaterials 2020, 10, 687.
  8. Zhang, J.; Fang, Z.; Lin, J.; Zhou, J.; Wang, M.; Wu, R.; Gao, R.; Cheng, Y. Fabrication of crystalline microresonators of high quality factors with a controllable wedge angle on lithium niobate on insulator. Nanomaterials 2019, 9, 1218.
  9. Abdollahramezani, S.; Hemmatyar, O.; Taghinejad, M.; Taghinejad, H.; Kiarashinejad, Y.; Zandehshahvar, M.; Fan, T.; Deshmukh, S.; Eftekhar, A.A.; Cai, W.; et al. Dynamic hybrid metasurfaces. Nano Lett. 2021, 21, 1238–1245.
  10. Abdulhalim, I. Simplified optical scatterometry for periodic nanoarrays in the near-quasi-static limit. Appl. Opt. 2007, 46, 2219–2228.
  11. Oskooi, A.F.; Roundy, D.; Ibanescu, M.; Bermel, P.; Joannopoulos, J.; Johnson, S.G. MEEP: A flexible free-software package for electromagnetic simulations by the FDTD method. Comput. Phys. Commun. 2010, 181, 687–702.
  12. COMSOL Multiphysics, v. 5.6; COMSOL AB: Stockholm, Sweden, 2020. Available online: www.comsol.com (accessed on 12 May 2023).
  13. Côté, D. Using machine learning in communication networks. J. Opt. Commun. Netw. 2018, 10, D100–D109.
  14. Paszke, A.; Gross, S.; Chintala, S.; Chanan, G.; Yang, E.; DeVito, Z.; Lin, Z.; Desmaison, A.; Antiga, L.; Lerer, A. Automatic differentiation in PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
  15. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
  16. Deng, L.; Yu, D. Deep learning: Methods and applications. Found. Trends Signal Process. 2014, 7, 197–387.
  17. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  18. Abdollahramezani, S.; Hemmatyar, O.; Taghinejad, H.; Krasnok, A.; Kiarashinejad, Y.; Zandehshahvar, M.; Alù, A.; Adibi, A. Tunable nanophotonics enabled by chalcogenide phase-change materials. Nanophotonics 2020, 9, 1189–1241.
  19. Abdollahramezani, S.; Hemmatyar, O.; Taghinejad, M.; Taghinejad, H.; Krasnok, A.; Eftekhar, A.A.; Teichrib, C.; Deshmukh, S.; El-Sayed, M.A.; Pop, E.; et al. Electrically driven reprogrammable phase-change metasurface reaching 80% efficiency. Nat. Commun. 2022, 13, 1696.
  20. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  21. Barbastathis, G.; Ozcan, A.; Situ, G. On the use of deep learning for computational imaging. Optica 2019, 6, 921–943.
  22. Peurifoy, J.; Shen, Y.; Jing, L.; Yang, Y.; Cano-Renteria, F.; DeLacy, B.G.; Joannopoulos, J.D.; Tegmark, M.; Soljačić, M. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. Adv. 2018, 4, eaar4206.
  23. Li, X.; Shu, J.; Gu, W.; Gao, L. Deep neural network for plasmonic sensor modeling. Opt. Mater. Express 2019, 9, 3857–3862.
  24. Malkiel, I.; Mrejen, M.; Nagler, A.; Arieli, U.; Wolf, L.; Suchowski, H. Plasmonic nanostructure design and characterization via deep learning. Light Sci. Appl. 2018, 7, 60.
  25. Chugh, S.; Ghosh, S.; Gulistan, A.; Rahman, B.M.A. Machine learning regression approach to the nanophotonic waveguide analyses. J. Light. Technol. 2019, 37, 6080–6089.
  26. An, S.; Fowler, C.; Zheng, B.; Shalaginov, M.Y.; Tang, H.; Li, H.; Zhou, L.; Ding, J.; Agarwal, A.M.; Rivero-Baleine, C.; et al. A deep learning approach for objective-driven all-dielectric metasurface design. ACS Photonics 2019, 6, 3196–3207.
  27. Liu, Z.; Zhu, D.; Rodrigues, S.P.; Lee, K.-T.; Cai, W. A generative model for the inverse design of metamaterials. Bull. Am. Phys. Soc. 2019, 64, S21.005.
  28. Kojima, K.; Koike-Akino, T.; Tang, Y.; Wang, Y. Inverse design for integrated photonics using deep neural network. In Integrated Photonics Research, Silicon and Nanophotonics; Optica Publishing Group: Washington, DC, USA, 2021; p. IF3A-6.
  29. Tahersima, M.H.; Kojima, K.; Koike-Akino, T.; Jha, D.; Wang, B.; Lin, C.; Parsons, K. Deep neural network inverse design of integrated photonic power splitters. Sci. Rep. 2019, 9, 1368.
  30. Wu, Q.; Li, X.; Jiang, L.; Xu, X.; Fang, D.; Zhang, J.; Song, C.; Yu, Z.; Wang, L.; Gao, L. Deep neural network for designing near- and far-field properties in plasmonic antennas. Opt. Mater. Express 2021, 11, 1907–1917.
  31. Kallioniemi, I.; Saarinen, J.; Oja, E. Optical scatterometry of subwavelength diffraction gratings: Neural-network approach. Appl. Opt. 1998, 37, 5830–5835.
  32. Pasha, D.; Abuleil, M.J.; August, I.Y.; Abdulhalim, I. Faster multispectral imager based on thin liquid crystal modulator and 3D neural network lattice. Laser Photonics Rev. 2023, 17, 2200913.
  33. Balin, I.; Garmider, V.; Long, Y.; Abdulhalim, I. Training artificial neural network for optimization of nanostructured VO2-based smart window performance. Opt. Express 2019, 27, A1030–A1040.
  34. So, S.; Badloe, T.; Noh, J.; Bravo-Abad, J.; Rho, J. Deep learning enabled inverse design in nanophotonics. Nanophotonics 2020, 9, 1041–1057.
  35. Park, J.; Kim, S.; Nam, D.W.; Chung, H.; Park, C.Y.; Jang, M.S. Free-form optimization of nanophotonic devices: From classical methods to deep learning. Nanophotonics 2022, 11, 1809–1845.
  36. Yao, Z.; Xia, X.; Hou, Y.; Zhang, P.; Zhai, X.; Chen, Y. Metasurface-enhanced optical lever sensitivity for atomic force microscopy. Nanotechnology 2019, 30, 365501.
  37. Gui, S.; Shi, M.; Li, Z.; Wu, H.; Ren, Q.; Zhao, J. A deep-learning-based method for optical transmission link assessment applied to optical clock comparisons. Photonics 2023, 10, 920.
  38. Aldahdooh, A.; Hamidouche, W.; Fezza, S.A.; Déforges, O. Adversarial example detection for DNN models: A review and experimental comparison. Artif. Intell. Rev. 2022, 55, 4403–4462.
  39. Fumo, D. A gentle introduction to neural networks series—part 1. Towards Data Sci. 2017, 14–20.
  40. Charu, C.A. Neural Networks and Deep Learning: A Textbook; Springer: Cham, Switzerland, 2018.
  41. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747.
  42. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Figure 1. (a) Device 3D visualization of the structure; (b) top-view layout of the structure.
Figure 2. (a) A simple neural network. (b) Reflectance spectrum represented by discrete reflectance spectrum data points in this paper.
Figure 3. (a) Sigmoid function and (b) Tanh function are commonly used continuous, smooth, and monotonically increasing functions. (c) ReLU function is a nonlinear, simple, and computationally efficient function that effectively solves the vanishing gradient problem.
Figure 4. Artificial neural network (ANN) and loss. (a) Network structure, one input layer, three hidden layers, and one output layer. (b,c) Training and validation loss after 6000 epochs with MSE as the loss function. The final MSE values of the training loss and validation loss are 3.0271 × 10−4 and 7.1863 × 10−4.
Figure 5. The distribution of MSE and MAE for the test set. (a) The MSE between the predicted and real values; all MSEs are below 10⁻², and the black line indicates that 94% of the MSEs of the test set are below 10⁻³. (b) The MAE between the predicted and real values; the yellow line indicates that the MAE of 97% of the test set is below 2 × 10⁻².
Figure 6. Comparison of predicted and real reflectance spectra values. (a) Sample MAE is 3.65 × 10−2, where r1 = 90 nm, h1 = 50 nm, ri = 90 nm, ro = 140 nm, and h2 = 50 nm. (b) Sample MAE is 3.54 × 10−3, where r1 = 20 nm, h1 = 25 nm, ri = 130 nm, ro = 170 nm, and h2 = 75 nm. (c) Sample MAE is 8.89 × 10−3, where r1 = 70 nm, h1 = 25 nm, ri = 130 nm, ro = 190 nm, and h2 = 75 nm. (d) Sample MAE is 4.41 × 10−3, where r1 = 60 nm, h1 = 75 nm, ri = 110 nm, ro = 190 nm, and h2 = 50 nm.
Figure 7. Comparison of predicted and real reflectance spectra values. Sample (a), where r1 = 50 nm, h1 = 49 nm, ri = 131 nm, ro = 172 nm, and h2 = 74 nm. Sample (b), where r1 = 11 nm, h1 = 75 nm, ri = 130 nm, ro = 161 nm, and h2 = 49 nm.
Figure 8. At r1 = 49 nm, h1 = 49 nm, ri = 100 nm, ro = 142 nm, and h2 = 52 nm, the absorption rates at different polarization angles (θ) of 0°, 10°, 20°, and 30° are examined. It can be observed that the absorption rate decreases with an increase in the polarization angle. Within the 400–570 nm wavelength range, at θ = 0° and 10°, this structure exhibits an absorption rate of over 90%.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
