1. Introduction
Photovoltaic (PV) energy shares in electricity generation have continually grown since the beginning of commercial silicon-based solar cells over 50 years ago [
1]. PV energy is also predicted to become the leading renewable energy source owing to its availability, falling prices, and technology improvements [
1,
2]. To meet these growth expectations, most PV research has been dedicated to raising the efficiency limits of PV cells and lowering the manufacturing cost per unit of energy.
One way to improve PV conversion efficiency is to modify material properties. A well-known effect called light trapping, which is mostly achieved by surface patterning of the cell and by inducing plasmonic effects, can significantly improve solar light absorption in silicon [
3,
4]. These techniques increase effective optical thickness without actually increasing the physical thickness of PV material, thus avoiding undesirable carrier recombination [
5]. Carrier recombination hinders the conversion of absorbed photons into photocurrent and thus degrades the electrical performance of the solar cell.
Gaining physical insight into how the optical performance of a thin film depends on the shapes, dimensions, material choices and other parameters of plasmonic nano-textures or nano-particles is critical in designing efficient cells. This has been the subject of extensive study in the field of nano-technology over the past 15 years, and it has led to several design guidelines. In general, it is agreed that particle shape, dimension and position in the cell should be taken into account for a rigorous design of a plasmonic photovoltaic device [
6]. In addition, precise computational simulators that model electromagnetic equations and material properties at nano-scale and solar optical wavelengths should be accompanied by powerful optimization algorithms for a feasible and efficient design [
7,
8,
9,
10,
11,
12,
13]. Optical modeling of thin film PV cells requires solving Maxwell’s electromagnetic equations because the interaction of light with subwavelength structures cannot be explained with simple ray tracing models. Solving Maxwell’s equations, in turn, requires spatial and temporal discretization of complex domains, typically with numerical solvers such as the Finite Difference Time Domain (FDTD) method. These methods demand extensive computational resources and time. When searching for the best optical properties in a design and optimization framework, many repeated FDTD simulations must be carried out over an entire wavelength range, which makes the search extremely burdensome. When more than a handful of parameters are to be optimized, the procedure becomes so time-consuming that, even with state-of-the-art numerical optimization algorithms, a rigorous design is practically infeasible. A practical remedy to this challenge is “surrogate modeling”, which means replacing the black-box (FDTD) simulations with an accurate regression model. Such a model can be used for both optimization and analysis, leading to the concept of Surrogate-Based Optimization (SBO). Neural Networks (NN) are well-studied models in machine learning with the ability to approximate functions of arbitrarily high nonlinearity. NNs have proven useful in many engineering problems [
14,
15,
16,
17] as function approximators. However, to the best of our knowledge, NNs (and, more broadly, surrogate models of any kind) have never been used in the optimization of photovoltaic cells, and in particular for optics equations at subwavelength scales. This paper aims to demonstrate this capability for learning and optimization for the first time.
In this work, we propose using an NN as a surrogate model to design a plasmonic organic photovoltaic (OPV) device. The details of the physical model and advantages of OPV are presented in
Section 2. The rest of the paper is organized as follows: a brief explanation of NN-based optimization is given in
Section 3 and the results of the optimization are presented in
Section 4. Sensitivity analysis is also conducted to predict the dependence of the results on small changes in the inputs.
2. Description of the Physical Model
OPV provides ease of fabrication and inexpensive material choices for the active layer [
18,
19] even though the power conversion efficiency (PCE) is lower than that of inorganic rivals. The PCE of these structures has improved significantly with the use of a bulk heterojunction (BHJ) blend instead of a bilayer donor/acceptor design, owing to the large donor–acceptor interfacial area of the BHJ [
20]. Recently, researchers have made several efforts at optimizing the nanomorphology of OPV [
21,
22,
23,
24,
25]. In general, an increase in optical efficiency is accompanied by an increase in optical thickness, which increases recombination when electron–hole pairs are generated farther from the p–n junction than the collection length [
5]. Therefore, even though the absorption efficiency is improved, increased recombination hinders photocurrent generation. One method to increase absorbed power without increasing the thickness of the absorbing layer is to induce plasmonic effects using metallic nanoparticles. Plasmonics deals with the behavior of free electrons at a metal–dielectric or metal–semiconductor interface. When light hits a metallic surface, free electrons are excited and an electric field is created. These excitations are called surface plasmon polaritons (SPPs), and they increase the number of electron–hole pairs created [
26]. Specifically, SPPs act through three mechanisms: multiple scattering of light, generation of electron–hole pairs by near-field effects, and coupling of light into surface plasmon polariton modes [
6].
A standard configuration of OPV with silver nanospheres is shown in
Figure 1. Ag nanospheres are assumed to have radius
and are placed inside a poly(3-hexylthiophene):(6,6)-phenyl-C61-butyric-acid-methyl ester (P3HT:PCBM) layer of thickness
at a vertical distance
from the bottom and are repeated at a period of
. The Ag-filled active layer is stacked by poly(3,4-ethylenedioxythiophene): poly(styrenesulfonate) (PEDOT:PSS) with thickness
and an aluminum back reflector layer. The 3D view of the proposed OPV and the simplified 2D view are presented in
Figure 1a,b. The problem is reduced from 3D to 2D based on the premises of the study by Moreno et al. [
27]. In all the simulations of the present study, a plane wave source is propagated from the top to the bottom at a specified wavelength
and at incident angle
. Bloch and perfectly matched layer (PML) boundary conditions are imposed for
and
coordinates, respectively. The real and imaginary parts of the optical constants of the materials used in the simulations are taken from the literature [
28,
29,
30].
The enhancement in the absorbed power can be quantified by the absorption enhancement factor (EF). This quantity is defined as the ratio of the number of photons absorbed by the active layer of the plasmonic photovoltaic cell to the absorbed photons without plasmonic contribution (i.e., bare thin film). One of the design criteria for the cell geometry is to maximize EF. In mathematical terms:
where
and
are the absorbed optical power by plasmonic (with nanoparticle) and bare (without nanoparticle) photovoltaic cells,
is the AM 1.5 standard terrestrial solar spectrum [
31] and integration is done over the wavelength range of interest.
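As a rough illustration of this definition, the following Python sketch evaluates EF as the ratio of two spectrum-weighted integrals using the trapezoidal rule. The function names, the 400 to 800 nm range, the Gaussian stand-in for the solar spectrum and the flat absorption spectra are all hypothetical placeholders, not values from this study. The `solar` argument may hold spectral irradiance or photon flux, depending on which weighting is intended.

```python
import math

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def enhancement_factor(wavelengths, a_plasmonic, a_bare, solar):
    """Ratio of solar-weighted absorption with and without nanoparticles."""
    num = trapezoid([a * s for a, s in zip(a_plasmonic, solar)], wavelengths)
    den = trapezoid([a * s for a, s in zip(a_bare, solar)], wavelengths)
    return num / den

# Placeholder data (assumed): 1 nm grid, Gaussian stand-in for the solar
# spectrum, and flat absorption spectra instead of FDTD results.
wavelengths = [400.0 + i for i in range(401)]
solar = [math.exp(-((w - 550.0) / 150.0) ** 2) for w in wavelengths]
a_bare = [0.40] * len(wavelengths)
a_plasmonic = [0.50] * len(wavelengths)

ef = enhancement_factor(wavelengths, a_plasmonic, a_bare, solar)
print(f"EF = {ef:.3f}")
```

With flat spectra the weighting cancels, so the sketch returns the ratio 0.50/0.40 = 1.25; with real spectra the result depends on where in the solar spectrum the enhancement occurs.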
The absorbed fraction of optical power can be obtained by solving the underlying physical equations, i.e., Maxwell’s electromagnetic equations. Maxwell’s equations are a set of partial differential equations that describe the relationship between electric and magnetic fields and incident light. They are generally solved numerically, except for a few simple cases where analytical solutions exist. One of the most popular methods for solving them is the Finite Difference Time Domain (FDTD) method, which discretizes space and time on a staggered grid built from Yee cells. In this study, a commercial software package [
32] provided by Lumerical Inc. (Vancouver, Canada) is used to facilitate FDTD simulations. EF is calculated for a photovoltaic cell structure in
Figure 1 for
,
,
,
and for different
. Computed EF values are compared with Finite Element Method (FEM) simulations by Shen et al. [
30] in
Figure 2.
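To give a feel for the staggered-grid time stepping that FDTD relies on, the sketch below advances a one-dimensional vacuum problem in normalized units. It is only a toy illustration and not the commercial solver used in this study: the grid size, Courant number, source position and Gaussian pulse are assumed values, and the end cells act as perfect conductors rather than PML or Bloch boundaries.

```python
import math

def fdtd_1d(n_cells=200, n_steps=150, courant=0.5, src=50):
    """Leapfrog update of Ez and Hy on a staggered 1D grid in vacuum."""
    ez = [0.0] * n_cells          # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field at half-integer points
    for t in range(n_steps):
        # Magnetic field update from the spatial difference of Ez
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Electric field update; end cells stay 0 (perfect conductor)
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft Gaussian source added to the field at one cell
        ez[src] += math.exp(-((t - 30) / 10.0) ** 2)
    return ez

ez = fdtd_1d()
peak = max(abs(v) for v in ez)
print(f"peak |Ez| after 150 steps: {peak:.3f}")
```

Even this toy problem needs one sweep over the grid per time step; a 2D or 3D nano-structure repeated for every wavelength and candidate geometry multiplies that cost, which is what motivates the surrogate model.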
Despite these powerful solution techniques and parallelization options, the numerical solution of Maxwell’s equations for nano-structures is very time-consuming and can be an obstacle to direct optimization. Surrogate modeling can be used to approximate the absorptivity as a function of the inputs, namely OPV geometry and source properties. In what follows, we explain the use of NNs as a surrogate model and describe the procedures for generating training and validation data, model fitting and validation, along with the necessary mathematical background.
3. Surrogate-Based Modeling and Optimization
Surrogate modeling starts with a proper sampling scheme (also known as design of experiments). After sampling points in the input space are collected, they are used in the forward problem to obtain an input–output set of data for training. This set is used for determining the corresponding parameters of the surrogate model [
33]. This procedure is called model training (fitting). An additional set of input–output pairs (validation set) is used to validate the trained surrogate model. Cross validation (CV) [
33] is often used as a reliable technique because of its intuitive formulation and unbiased error estimation. In k-fold CV, the data is divided into k folds and training is repeated k times; each time, (k − 1) of the folds are used for training and the remaining fold is held out for validation. The training-validation set can be sampled in various ways such as Uniform Sampling (US), Latin Hypercube Sampling (LHS) and Orthogonal Arrays (OA) [
33]. The purpose of sampling is to represent the input space as well as possible while keeping the number of sample points to a minimum, in order to reduce the computational cost of the forward problem.
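The sampling and validation steps described above can be sketched in a few lines of Python. This is a minimal illustration under assumed settings: the two parameter ranges, the 20 samples and the 5 folds are hypothetical, and the helper names `latin_hypercube` and `k_fold_splits` are introduced here only for the example.

```python
import random

def latin_hypercube(n_samples, bounds, rng):
    """Draw one point per stratum in every dimension, strata shuffled per axis."""
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append([lo + (s + rng.random()) / n_samples * (hi - lo)
                        for s in strata])
    return [list(point) for point in zip(*columns)]

def k_fold_splits(n_samples, k, rng):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

rng = random.Random(0)
bounds = [(0.0, 1.0), (10.0, 50.0)]   # hypothetical parameter ranges
samples = latin_hypercube(20, bounds, rng)
splits = list(k_fold_splits(len(samples), 5, rng))
print(len(samples), "samples,", len(splits), "folds")
```

Each sample would be passed to the forward solver once; the resulting input-output pairs are then reused across all k training runs, so CV adds training cost but no extra simulations.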
3.1. Neural Network Model of Absorptivity
An NN is a well-established regression tool that can approximate almost any function regardless of the degree of nonlinearity [
34,
35]. An NN represents the relationship between input and output as a series of functions evaluated at the artificial neurons. The output of the NN model is
where
is the normalized output vector and
is the coefficient matrix of the ith layer and
is the number of layers.
is the input vector normalized to
range by
. The output is then renormalized to
to obtain NN absorptivity,
and
.
is found as a result of NN training by minimizing the error between NN output and target. The details of NN training are not included here for the sake of brevity, but are presented in
Appendix A. Interested readers are referred to [
34,
35] for further details. One advantage of the present NN model is that it represents both plasmonic and bare structures in a single model, thus avoiding additional computational cost.
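The forward evaluation of such a model can be sketched as follows. This is an assumption-laden illustration rather than the network used in this study: it assumes inputs scaled to [-1, 1], tanh hidden layers, a linear output layer, separate bias vectors, and an output mapped from [-1, 1] back to an absorptivity in [0, 1]. The weights shown are arbitrary placeholders, not trained values.

```python
import math

def scale(x, lo, hi, a=-1.0, b=1.0):
    """Map x linearly from [lo, hi] onto [a, b]."""
    return a + (x - lo) * (b - a) / (hi - lo)

def forward(x, weights, biases):
    """Feed-forward pass: tanh hidden layers followed by a linear output layer."""
    h = x
    for i, (w_mat, b_vec) in enumerate(zip(weights, biases)):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(w_mat, b_vec)]
        last = i == len(weights) - 1
        h = z if last else [math.tanh(v) for v in z]
    return h

def predict_absorptivity(x_raw, x_lo, x_hi, weights, biases):
    """Normalize inputs, run the network, and map the output back to [0, 1]."""
    x = [scale(v, lo, hi) for v, lo, hi in zip(x_raw, x_lo, x_hi)]
    y = forward(x, weights, biases)[0]
    return scale(y, -1.0, 1.0, 0.0, 1.0)

# Placeholder network (assumed): 2 inputs, 2 hidden neurons, 1 output
x_lo, x_hi = [0.0, 0.0], [10.0, 10.0]
weights = [[[0.5, -0.5], [1.0, 1.0]], [[1.0, 0.0]]]
biases = [[0.0, 0.0], [0.0]]
a_nn = predict_absorptivity([5.0, 5.0], x_lo, x_hi, weights, biases)
print(f"predicted absorptivity: {a_nn:.3f}")
```

Because one evaluation is only a handful of multiply-adds, the surrogate can be queried at every wavelength step for both the plasmonic and bare geometries at negligible cost.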
3.2. Objective Function
The objective of the present optimization problem is to maximize EF by modifying the cell geometry. One reason for choosing EF as the objective function is that maximizing EF tends to drive the algorithm toward a thinner active layer; the likelihood of recombination is therefore also reduced, even though photocurrent is not part of the objective.
Therefore, the present optimization problem can be formulated as
where
is the geometry vector with Ag nanoparticles and
is the bare geometry without the nanoparticles, i.e.,
, and the lower and upper limits for the geometry vector are
and
. The bounds are the same as the bounds of training and validation sets except the lower bound of
.
in order to avoid short-circuit possibility due to Ag–Al contact.
EF is then calculated as the ratio of the integrals in the numerator and denominator of (16), each evaluated with the trapezoidal method from the surrogate model outputs
and
for each wavelength increment (1 nm). The cost function in the optimization problem, however, is set to the inverse of EF and penalty terms are added [
36] in order to obtain an unconstrained minimization problem:
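One common way to build such a cost is to add a quadratic penalty for each violated bound to the inverse of EF. The sketch below assumes that form; the penalty weight `mu`, the unit-box bounds and the `toy_ef` stand-in for the surrogate-based EF are hypothetical and are not taken from this study.

```python
def penalized_cost(x, ef_model, lower, upper, mu=1e3):
    """Return 1/EF plus a quadratic penalty on violated box bounds."""
    violation = sum(max(0.0, lo - v) ** 2 + max(0.0, v - hi) ** 2
                    for v, lo, hi in zip(x, lower, upper))
    return 1.0 / ef_model(x) + mu * violation

def toy_ef(x):
    """Hypothetical smooth stand-in for the surrogate-based EF (always > 1)."""
    return 1.0 + 1.0 / (1.0 + sum((v - 0.5) ** 2 for v in x))

lower, upper = [0.0, 0.0], [1.0, 1.0]
inside = penalized_cost([0.5, 0.5], toy_ef, lower, upper)
outside = penalized_cost([1.2, 0.5], toy_ef, lower, upper)
print(f"feasible point: {inside:.3f}, infeasible point: {outside:.3f}")
```

Inside the bounds the penalty vanishes and the cost is exactly 1/EF, so minimizing it is equivalent to maximizing EF; outside, the penalty grows with the squared distance to the feasible box.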
3.3. Optimization Algorithms
Numerical optimization methods can be classified as stochastic and deterministic [
37]. In stochastic optimization, the next candidate search point is selected randomly based on a current selection distribution, while in deterministic search no randomness is involved. The most notable deterministic search methods are gradient-based algorithms, which rely on first and/or second order derivatives (i.e., gradient and Hessian) and use a line search approach based on those. Some examples of stochastic methods are nature-inspired algorithms, such as artificial bee colony [
38], genetic algorithms [
39], Tabu search [
40] and Simulated Annealing [
41]. Examples of gradient-based optimization algorithms include the conjugate gradient, Levenberg–Marquardt and quasi-Newton methods [
36]. In the current work, we use a hybrid of global Simulated Annealing (SA) and Quasi-Newton (QN) techniques. The details of these algorithms can be found in references [
36,
41]. For objective functions linked to NNs, the gradient can be computed directly using a back-propagation sensitivity quantity, which is elaborated in
Appendix A.
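A two-stage search of this kind can be sketched as a global simulated annealing pass followed by a local quasi-Newton (BFGS) refinement. The code below is an illustration under several assumptions and is not the implementation used in this study: the objective `cost` is a hypothetical multimodal stand-in for the penalized 1/EF, the cooling schedule and step size are arbitrary, and the gradient is taken by central differences where back-propagation through the NN would be used in practice.

```python
import math
import random

def cost(x):
    """Hypothetical multimodal stand-in for the penalized 1/EF objective."""
    return sum((v - 0.3) ** 2 + 0.05 * (1.0 - math.cos(12.0 * (v - 0.3)))
               for v in x)

def num_grad(f, x, h=1e-6):
    """Central-difference gradient (back-propagation would replace this)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def simulated_annealing(f, bounds, rng, n_iter=3000, t0=1.0, step=0.1):
    """Global stage: random perturbations with Metropolis acceptance."""
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = f(x)
    best, f_best = list(x), fx
    for k in range(n_iter):
        temp = t0 * (1.0 - k / n_iter) + 1e-9      # linear cooling schedule
        cand = [min(hi, max(lo, v + rng.gauss(0.0, step * (hi - lo))))
                for v, (lo, hi) in zip(x, bounds)]
        fc = f(cand)
        if fc < fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < f_best:
                best, f_best = list(x), fx
    return best, f_best

def bfgs(f, x0, n_iter=100, tol=1e-8):
    """Local stage: quasi-Newton (BFGS) with a backtracking line search."""
    n = len(x0)
    x, fx = list(x0), f(x0)
    g = num_grad(f, x)
    h_inv = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n_iter):
        if math.sqrt(sum(v * v for v in g)) < tol:
            break
        p = [-sum(h_inv[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * pi for gi, pi in zip(g, p))
        a = 1.0
        while a > 1e-12:
            x_new = [xi + a * pi for xi, pi in zip(x, p)]
            f_new = f(x_new)
            if f_new <= fx + 1e-4 * a * slope:      # Armijo condition
                break
            a *= 0.5
        else:
            break                                    # no acceptable step found
        g_new = num_grad(f, x_new)
        s = [b - c for b, c in zip(x_new, x)]
        y = [b - c for b, c in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:                               # keep h_inv positive definite
            hy = [sum(h_inv[i][j] * y[j] for j in range(n)) for i in range(n)]
            yhy = sum(yi * hi for yi, hi in zip(y, hy))
            for i in range(n):
                for j in range(n):
                    h_inv[i][j] += ((1.0 + yhy / sy) * s[i] * s[j]
                                    - hy[i] * s[j] - s[i] * hy[j]) / sy
        x, fx, g = x_new, f_new, g_new
    return x, fx

rng = random.Random(1)
x_sa, f_sa = simulated_annealing(cost, [(0.0, 1.0), (0.0, 1.0)], rng)
x_qn, f_qn = bfgs(cost, x_sa)
print("SA cost:", f_sa, "after QN refinement:", f_qn)
```

The division of labor is the point of the hybrid: the stochastic stage is meant to escape local minima and land in a promising basin, and the quasi-Newton stage then converges quickly within that basin using gradient information.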