Critical currents, critical temperatures, and critical fields are central quantities in the study of superconductivity. In this work, we study critical currents in the current–voltage characteristics of a diluted square lattice on an Nb film. Our measurements rely on a commercially available Physical Properties Measurement System, which can be time-consuming and costly when measurements must be repeated over a wide range of parameters. We therefore propose a technique based on artificial neural networks to extrapolate these curves to temperatures and magnetic fields that were not measured. We demonstrate that the proposed approach predicts the curves with high precision and minimal overhead, and that it may also be adopted for prediction in other types of regular and diluted lattices. In addition, we present a detailed comparison of three artificial neural network architectures with respect to their prediction efficiency, computation time, and number of iterations needed to converge to an optimal solution.
1. Introduction
The mixed state in superconductors is marked by the existence of vortices, whose behavior is an active research area in low-temperature physics. The vortices can be in liquid, glassy, or crystalline phases. These vortex phases can be studied in high-temperature superconductor systems and in type II superconducting thin films with an array of dots or antidots. Over the last few decades, various properties of superconducting thin films with an array of artificial pinning centers have been explored [1,2,3,4,5]. Different geometrical structures [2,3,5,6,7,8,9,10] have been used in the composition of the array of dots or antidots. Previous works showed that using a diluted array of antidots increases the pinning effect along with energy conservation. In this work, an experimental setup based on a diluted square array of antidots is used to measure its current–voltage (IV) behavior.
It has been noted in previous works that repeated transport measurements may become costly and cumbersome to obtain. As a result, there is a clear need for a theoretical or formal model that can approximate the IV curves of superconducting films and thereby relieve researchers from repeatedly measuring these physical properties. Artificial neural networks (ANNs) are among the most widely adopted techniques for modeling complex systems. The concept of an ANN is derived from the working of the human nervous system, in which neurons connect with one another through network coefficients (called weights). The working of ANNs is well addressed in the literature. ANNs learn and identify relationships between a system’s parameters, which gives them a strong approximation and prediction capability. Other advantages associated with the use of ANNs include modeling non-linearity [14,15,16,17], fault tolerance, parallelism, robustness of the learning process, and the ability to handle fuzzy information [18,19]. The advantage of using ANNs over other statistical methods, such as linear and nonlinear regression techniques, has been advocated on multiple occasions for various applications. For example, Üreyen and Gürkan [20] compared the two techniques for the prediction of yarn tensile properties, Ghanbari et al. [21] carried out the same comparison for Iran’s annual electricity load, and Elminir et al. [22] compared ANNs with linear regression models for predicting hourly and daily diffuse fraction. All of them concluded that the ANN was the better prediction approach.
Despite these features, ANNs have had very limited application in the field of materials science [11,23,24], let alone in the prediction of IV curves. In [11], ANNs were used to predict the current–voltage curves for a square array of nano-engineered periodic antidots. Diluted square arrays, which are formed by removing a quarter of the sites from the original square lattice, offer a larger interstitial area than the original square lattice. This means that a large number of interstitial vortices may easily be accommodated, leading to increased energy conservation; this is commonly referred to as the caging effect. In this work, we predict the IV curves for a diluted square array of antidots, and propose an ANN-based framework for extrapolating the IV behavior over a wide range of temperature and magnetic field values. In addition, we present a thorough comparison of three different ANN architectures, each trained with six different training algorithms. The comparison is based on prediction accuracy in terms of mean squared error (MSE), the number of iterations needed to converge (epochs), and training time.
Our findings may be used as a benchmark in any follow-up work concerning the study of IV characteristics of any regular or diluted lattice, since we pinpoint the pros and cons of several architectures and training algorithms, and conclude on the most suitable options for our specific application. The rest of the paper is organized as follows. Section 2 summarizes the experimental details for acquiring the datasets. The choice of the architectures and training algorithms is given in Section 3. Simulation results and a comparison are given in Section 4, followed by the concluding remarks in Section 5.
2. Experimental Setup and Transport Measurements
2.1. Experimental Setup
For this work, we deposited a high-quality 60-nm-thick superconducting Nb film on a SiO$_2$ substrate. Ultraviolet photolithography and reactive ion etching techniques were used to fabricate the microbridges for transport measurements, followed by standard lithography on a polymethyl methacrylate (PMMA) resist layer to obtain the desired arrays. Magnetically enhanced reactive ion etching was used to transfer the patterns to the film. Our measurements were carried out using a commercially available Physical Properties Measurement System (PPMS) from Quantum Design. A scanning electron micrograph (SEM) (HITACHI, Tokyo, Japan) of the diluted square array is shown in Figure 1.
2.2. Measurements Using PPMS
The patterned superconducting film with a diluted square lattice had a critical temperature $T_c$ of 8.646 K, which is lower than the $T_c$ of the unpatterned film. For transport measurements, we placed the sample in liquid helium to help reduce contact heating. Figure 2 shows the voltage measurements at different temperatures below $T_c$ and at zero applied field. The change in slope of these IV curves suggests the existence of three regions: one where the voltage is almost zero, a second where it gradually increases, and a third in which the current increases linearly with voltage. While the Shapiro steps may be clearly observed in the second region within this temperature range, they continue to weaken until they vanish completely at higher temperatures; this happens mainly due to thermal fluctuations. Our observations are in agreement with those reported in the literature.
For seven constant values of temperature (8.0, 8.1, 8.2, 8.3, 8.4, 8.5 and 8.6 K) and four constant values of magnetic fields (0, 100, 200 and 300 Oe), the measuring unit generated a dataset of 7 × 4 × 642 IV values. The next section describes the ANN methodology to be used for prediction.
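As a rough illustration of the size of this dataset, the sketch below enumerates the measurement conditions listed above. It assumes that "7 × 4 × 642" means 642 IV samples for each (temperature, field) pair; the variable names are illustrative and not taken from the measurement software.

```python
from itertools import product

# Measurement conditions quoted in the text.
temperatures_K = [8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6]
fields_Oe = [0, 100, 200, 300]

# Assumption: 642 IV samples are recorded per (temperature, field) pair.
POINTS_PER_CURVE = 642

# Every (T, H) combination corresponds to one measured IV curve.
conditions = list(product(temperatures_K, fields_Oe))
total_samples = len(conditions) * POINTS_PER_CURVE

print(len(conditions), "curves,", total_samples, "IV samples in total")
```

Under that reading, the dataset holds 28 curves and 17,976 IV samples.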
3. ANN Architectures and Training Algorithms
The topology of an ANN can be described as a directed graph of nodes with a transfer function $y_i = f_i\left(\sum_{j} w_{ij}\, y_j + b_i\right)$, where $y_i$ is called the state variable of node i, $w_{ij}$ is a real-valued weight between two nodes i and j, $b_i$ is a real-valued bias term, and $f_i$, typically chosen to be a step or a linear function, is called the activation function. This transfer function also represents the expected output of a system comprising just the input and output layers. It is well known that such a system cannot implement many functions; it is usually necessary to incorporate a few hidden layers. Setti and Rao [26] showed that two hidden layers are usually an optimum choice capable of representing most of the desired functions. In this work, we also fix the number of hidden layers to two; however, we vary the number of neurons in the hidden layers to obtain comparable prediction efficiencies.
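The output, $\hat{y}$, of a network having two hidden layers is given by Equation (1):
$$\hat{y} = f_{o}\left(\sum_{k} w^{o}_{k}\, f_{H2}\left(\sum_{j} w^{H2}_{jk}\, f_{H1}\left(\sum_{i} w^{H1}_{ij}\, x_i + b^{H1}_{j}\right) + b^{H2}_{k}\right) + b^{o}\right) \qquad (1)$$
where:
$w^{o}_{k}$ represents the weights from neuron k in the second hidden layer to the output neurons.
$w^{H2}_{jk}$ represents the weights from neuron j in the first hidden layer to neuron k in the second layer.
$w^{H1}_{ij}$ represents the weights from neuron i in the input layer to neuron j in the first hidden layer.
$x_i$ represents the ith element in the input layer.
$b^{H1}_{j}$, $b^{H2}_{k}$, and $b^{o}$ represent the bias values for the hidden and output layers.
$f_{H1}$, $f_{H2}$, and $f_{o}$ are the activation functions: H1, H2, and o stand for the first and second hidden and the output layers, respectively.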
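The forward pass through a network with two hidden layers can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the parameter names (`W1`, `b1`, and so on) are hypothetical, the weights are arbitrary, and the choice of tanh hidden units with a linear output follows the activation functions named later in this section.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer; row j of `weights` feeds neuron j."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, params):
    """Two hidden layers (tanh) followed by a single linear output neuron."""
    h1 = dense(x, params["W1"], params["b1"], math.tanh)      # first hidden layer
    h2 = dense(h1, params["W2"], params["b2"], math.tanh)     # second hidden layer
    out = dense(h2, params["Wo"], params["bo"], lambda z: z)  # linear output
    return out[0]

# Hypothetical tiny network: 3 inputs, 2 and 2 hidden neurons, arbitrary weights.
example_params = {
    "W1": [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]], "b1": [0.0, 0.1],
    "W2": [[0.4, -0.3], [0.2, 0.2]],            "b2": [0.05, -0.05],
    "Wo": [[1.0, -1.0]],                        "bo": [0.0],
}
prediction = forward([0.5, 0.3, 0.8], example_params)
```

The three inputs here stand in for whatever quantities are fed to the network (for instance temperature, field, and current, which is an assumption about the input encoding).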
The primary objective of this analysis is to minimize a cost function given by Equation (2):
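Assuming that the cost function is the mean squared error used as the accuracy metric in Section 4, it can be written as follows. The symbols are notation assumed here: $N$ is the number of training samples, $t_n$ is the measured target value for sample $n$, and $\hat{y}_n$ is the corresponding network output.

```latex
E = \frac{1}{N} \sum_{n=1}^{N} \left( t_n - \hat{y}_n \right)^2
```

Training then amounts to adjusting the weights and biases so that $E$ becomes as small as possible on the training data.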
The layers in the directed graphs are usually arranged in one of the three possible options, giving rise to three different ANN architectures. More specifically, the manner in which the layers are interconnected describes a particular architecture. In what follows, we briefly describe each architecture; comparing them in terms of prediction accuracy, epochs, and training time is the contribution of this work—this is covered in Section 4.
3.1. Feedforward Networks
Figure 3a shows an example of a feedforward network; for simplicity, the system is shown with one hidden layer. In such networks, each layer is connected only to its immediate neighbors: it takes input from the preceding layer and passes its output to the subsequent one. In this way, a mapping between input and output is achieved.
3.2. Cascade-Forward Networks
Cascade-forward networks are a variant of feedforward networks. In this case, each layer is connected not only to the preceding layer but also directly to the input; this is shown in Figure 3b. This slightly modifies Equation (1) by adding weighted input terms to the arguments of the second hidden layer and the output layer.
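The extra input connections can be made concrete with a short sketch. It follows the description above (each later layer also receives the raw input) and is not a reproduction of MATLAB's implementation; all parameter names such as `W2_in` and `Wo_in` are hypothetical.

```python
import math

def weighted_sum(weights, values):
    return sum(w * v for w, v in zip(weights, values))

def cascade_forward(x, p):
    """Two tanh hidden layers and a linear output, with direct input connections."""
    # First hidden layer: sees the input only, as in a feedforward network.
    h1 = [math.tanh(weighted_sum(row, x) + b)
          for row, b in zip(p["W1_in"], p["b1"])]
    # Second hidden layer: sees the first hidden layer AND the raw input.
    h2 = [math.tanh(weighted_sum(r_h, h1) + weighted_sum(r_x, x) + b)
          for r_h, r_x, b in zip(p["W2_h1"], p["W2_in"], p["b2"])]
    # Output neuron: sees the second hidden layer AND the raw input.
    return weighted_sum(p["Wo_h2"], h2) + weighted_sum(p["Wo_in"], x) + p["bo"]
```

Setting the direct-input weights (`W2_in`, `Wo_in`) to zero recovers an ordinary feedforward network, which is one way to see that the cascade-forward form is a strict generalization.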
3.3. Layer-Recurrent Networks
In layer-recurrent networks, each hidden layer has a recurrent connection with additional tap delays. The circles in the hidden layer of Figure 3c depict these delays. This feature allows such networks to have a dynamic response to time series data. In our work, we have used a tap delay of two in each hidden layer.
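A single recurrent layer with tap delays might look like the sketch below. It assumes that "a tap delay of two" means the layer receives its own outputs from the two previous time steps, and that the delayed outputs start at zero; the function and parameter names are illustrative only.

```python
import math
from collections import deque

def recurrent_layer(sequence, W_in, W_delay, b, delays=2):
    """tanh layer whose neurons also see the layer's outputs from earlier steps.

    W_in[j]       : input weights of neuron j
    W_delay[d][j] : weights of neuron j on the layer output delayed by d + 1 steps
    """
    n = len(b)
    # history[0] is the output one step back, history[1] two steps back, ...
    history = deque([[0.0] * n for _ in range(delays)], maxlen=delays)
    outputs = []
    for x in sequence:
        a = []
        for j in range(n):
            z = b[j] + sum(w * xi for w, xi in zip(W_in[j], x))
            for d in range(delays):
                z += sum(w * prev for w, prev in zip(W_delay[d][j], history[d]))
            a.append(math.tanh(z))
        history.appendleft(a)  # the oldest entry drops off automatically
        outputs.append(a)
    return outputs
```

Because each output depends on earlier outputs, the order in which samples are presented matters for this architecture, unlike the two feedforward variants.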
In all of our simulations, we used tan-sigmoid (a hyperbolic tangent sigmoid function) in the hidden layers and purelin (a linear function) in the output layer as the activation functions. As far as training the network is concerned, several algorithms have been proposed in the literature. The most widely adopted among those, which we also consider in this work, include Levenberg–Marquardt (LM) [27], Bayesian Regularization (BR) [28], Resilient Backpropagation (NR), Conjugate Gradient (CGF) [29,30,31,32,33], Quasi-Newton Backpropagation (BFGS), and Variable Learning Rate Backpropagation with Momentum (GDX).
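For reference, the two activation functions can be written out directly. The sketch below uses the standard closed form of the tan-sigmoid, 2 / (1 + exp(-2x)) - 1, which is mathematically identical to tanh(x); the Python function names simply mirror the MATLAB names.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid; maps any real input into (-1, 1)."""
    # Note: math.exp overflows for very large negative x (roughly x < -354).
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def purelin(x):
    """Linear (identity) activation used in the output layer."""
    return x

for value in (-1.0, 0.0, 0.5):
    print(value, tansig(value), math.tanh(value), purelin(value))
```

The bounded hidden-layer activation supplies the nonlinearity, while the linear output lets the network produce values outside (-1, 1).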
4. Simulation Results
Our experimental setup generated a dataset comprising several values of current and voltage for a wide range of magnetic field and temperature readings, a few entries of which are given in Table 1. Our methodology used 70 percent of these values as training data, while the rest was used for validation and prediction. We used MATLAB’s Neural Network Toolbox to perform these simulations. As mentioned already, we implemented three architectures; each was trained using six algorithms, where each algorithm was run ten times with a different number of neurons in the hidden layers. This generated a total of 180 different models (6 algorithms × 10 configurations × 3 architectures). Table 2 summarizes the six algorithms and the ten neuron configurations used with each algorithm, leading to sixty entries in total. Note that an entry [x y] under No. of neurons indicates x neurons in the first hidden layer and y in the second.
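The model count and the 70 percent training split can be checked with a short sketch. Only the neuron configurations [5 2], [8 4], and [12 10] are named in the text; the remaining seven entries below are placeholders, and the random shuffling is an assumption about how the split was drawn.

```python
import random
from itertools import product

architectures = ["feedforward", "cascade-forward", "layer-recurrent"]
algorithms = ["LM", "BR", "NR", "CGF", "BFGS", "GDX"]
# Three configurations come from the text; the rest are hypothetical placeholders.
hidden_sizes = [(5, 2), (8, 4), (12, 10)] + [f"config_{k}" for k in range(4, 11)]

# One model per (architecture, algorithm, neuron configuration) combination.
models = list(product(architectures, algorithms, hidden_sizes))
print(len(models), "models")

def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and cut them into training and held-out parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]

train_idx, held_out_idx = split_indices(7 * 4 * 642)
print(len(train_idx), "training samples,", len(held_out_idx), "held out")
```

Fixing the seed makes the split reproducible, which matters when many models are compared on the same data.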
Each one of the 180 models was trained five times, and the best results in terms of minimum MSE, epochs, and training time were saved. The obtained results showed MSE in the range of to . For the purpose of cross-validation, results from all of the models were compared with the actual data generated by the PPMS. Figure 4 and Figure 5 show the actual and ANN-predicted IV curves for the diluted square array of antidots. The measurements—specifically for testing—were taken with the following parameters: temperature = 8.5 K, magnetic field = 300 Oe, and temperature = 8.4 K, magnetic field = 100 Oe, respectively, for the two figures, while current was varied from 0 to 8 mA in each case. Note that these values were deliberately not included in the training process of the ANN—they have been explicitly used for validation.
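The "train five times and keep the best" step could be organized as in the sketch below. It assumes that the best value of each metric is recorded separately across the five runs; `train_once` is a hypothetical callable standing in for one training run, and the numbers in the stub are synthetic, not results from this study.

```python
def best_of_runs(train_once, runs=5):
    """Run training `runs` times and keep the best value of each metric."""
    results = [train_once(seed) for seed in range(runs)]
    return {
        "mse": min(r["mse"] for r in results),
        "epochs": min(r["epochs"] for r in results),
        "time_s": min(r["time_s"] for r in results),
    }

def fake_train(seed):
    """Stub returning synthetic metrics so that the sketch is runnable."""
    return {"mse": 1.0 / (seed + 1), "epochs": 50 - seed, "time_s": 2.0 + 0.1 * seed}

summary = best_of_runs(fake_train)
print(summary)
```

Repeating the training guards against an unlucky random weight initialization dominating the comparison between models.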
Figure 6 shows the MSE of the IV curves predicted by the 180 models. Note that the horizontal axis corresponds to the sixty entries of Table 2. It may be observed in the figure that the feedforward network with [12 10] neurons in the hidden layers and BR as the training algorithm achieves the lowest MSE (i.e., ). However, there are other results that have MSE in the same range, mostly trained with BR.
Figure 7 and Figure 8 show the results with respect to number of iterations (epochs) and training time, respectively. It may be observed that while the cascaded network with [5 2] neurons and CGF as the training algorithm gives the best result with respect to the smallest number of epochs (i.e., 9), the feedforward network with [8 4] neurons and NR as the training algorithm gives the best result with respect to minimum training time (i.e., 0.109 s). These two parameters, epochs and training time, are more useful than MSE in studies requiring real-time training and prediction, since smaller delays yield faster systems. Note that Figure 6, Figure 7 and Figure 8 each correspond to a temperature of 8.5 K and a magnetic field of 300 Oe.
We have summarized the best results for each measured parameter in Table 3. It can be noted that although the feedforward network with BR gives minimum MSE, it takes a large number of iterations and training time to converge. Naturally, it will not be the optimum choice in real-time systems. Similarly, the other results show that the systems that converge faster yield large MSE—making them unsuitable in systems requiring high precision.
5. Conclusions
We have proposed a method based on ANNs for predicting the IV curves of a diluted square array of antidots on an Nb film at different applied fields and temperatures. Because of their exceptional approximation capability, ANNs have recently been recommended for the prediction of IV curves in superconducting films. Their increasing role in this field motivated us to present a thorough analysis of three different architectures (namely, feedforward, cascaded, and layer-recurrent networks), which were trained using six different learning algorithms. Each algorithm was executed for ten different configurations of the number of neurons in the hidden layers, resulting in a total of sixty ANN models for each architecture. Our results, based on MATLAB simulations, suggest that feedforward networks trained with BR achieve the lowest MSE but take a long time to converge, while those converging faster (in terms of number of iterations and training time) yield larger MSEs.
Since we pinpoint the pros and cons of various architectures with various possible configurations, our proposed framework may be used as a benchmark in relevant works utilizing ANNs in the prediction of IV curves. It is widely known that each geometry of arrays of antidots exhibits different current–voltage curves, leading to vastly varying critical currents and critical temperatures. We believe it is prudent to study the effectiveness of our approach for each of those geometries, and we leave this as follow-up work.
Author Contributions
Muhammad Kamran performed the transport measurements on the Physical Properties Measurement System. Sajjad Ali Haider was responsible for the research work related to Artificial Neural Networks; he performed all the simulations. Tallha Akram is an expert in algorithms and formal methods; in this work he was responsible for choosing the algorithms and architectures, and reviewed the simulation results. Syed Rameez Naqvi conceived the entire concept, analyzed the data obtained from the Physical Properties Measurement System, and did all the write-up.
Conflicts of Interest
The authors declare no conflict of interest.
References
[1] Baert, M.; Metlushko, V.; Jonckheere, R.; Moshchalkov, V.; Bruynseraede, Y. Composite flux-line lattices stabilized in superconducting films by a regular array of artificial defects. Phys. Rev. Lett. 1995, 74, 3269.
[2] Martin, J.I.; Vélez, M.; Hoffmann, A.; Schuller, I.K.; Vicent, J. Artificially induced reconfiguration of the vortex lattice by arrays of magnetic dots. Phys. Rev. Lett. 1999, 83, 1022.
[3] Latimer, M.; Berdiyorov, G.; Xiao, Z.; Kwok, W.; Peeters, F. Vortex interaction enhanced saturation number and caging effect in a superconducting film with a honeycomb array of nanoscale holes. Phys. Rev. B 2012, 85, 012505.
[4] Villegas, J.; Savel’ev, S.; Nori, F.; Gonzalez, E.; Anguita, J.; Garcia, R.; Vicent, J. A superconducting reversible rectifier that controls the motion of magnetic flux quanta. Science 2003, 302, 1188–1191.
[5] Kamran, M.; Naqvi, S.R.; Kiani, F.; Basit, A.; Wazir, Z.; He, S.K.; Zhao, S.P.; Qiu, X.G. Absence of Reconfiguration for Extreme Periods of Rectangular Array of Holes. J. Supercond. Novel Magn. 2015, 28, 3311–3315.
[6] Martin, J.I.; Vélez, M.; Nogues, J.; Schuller, I.K. Flux pinning in a superconductor by an array of submicrometer magnetic dots. Phys. Rev. Lett. 1997, 79, 1929.
[7] Jaccard, Y.; Martin, J.; Cyrille, M.C.; Vélez, M.; Vicent, J.; Schuller, I.K. Magnetic pinning of the vortex lattice by arrays of submicrometric dots. Phys. Rev. B 1998, 58, 8232.
[8] Cuppens, J.; Ataklti, G.; Gillijns, W.; Van de Vondel, J.; Moshchalkov, V.; Silhanek, A. Vortex dynamics in a superconducting film with a kagomé and a honeycomb pinning landscape. J. Supercond. Novel Magn. 2011, 24, 7–11.
[9] De Lara, D.P.; Alija, A.; Gonzalez, E.; Velez, M.; Martin, J.; Vicent, J.L. Vortex ratchet reversal at fractional matching fields in kagomélike array with symmetric pinning centers. Phys. Rev. B 2010, 82, 174503.
[10] He, S.; Zhang, W.; Liu, H.; Xue, G.; Li, B.; Xiao, H.; Wen, Z.; Han, X.; Zhao, S.; Gu, C.; et al. Wire network behavior in superconducting Nb films with diluted triangular arrays of holes. J. Phys. Condens. Matter 2012, 24, 155702.
[11] Kamran, M.; Haider, S.; Akram, T.; Naqvi, S.; He, S. Prediction of IV curves for a superconducting thin film using artificial neural networks. Superlattices Microstruct. 2016, 95, 88–94.
[12] Guojin, C.; Miaofen, Z.; Honghao, Y.; Yan, L. Application of Neural Networks in Image Definition Recognition. In Proceedings of the IEEE International Conference on Signal Processing and Communications, Dubai, UAE, 24–27 November 2007; pp. 1207–1210.
[13] Güçlü, U.; van Gerven, M.A.J. Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream. J. Neurosci. 2015, 35, 10005–10014.
[14] Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314.
[15] Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 1991, 4, 251–257.
[16] Hornik, K. Some new results on neural network approximation. Neural Netw. 1993, 6, 1069–1072.
[17] Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366.
[18] Bielecki, A.; Ombach, J. Dynamical properties of a perceptron learning process: Structural stability under numerics and shadowing. J. Nonlinear Sci. 2011, 21, 579–593.
[19] Meireles, M.; Almeida, P.; Simoes, M. A comprehensive review for industrial applicability of artificial neural networks. IEEE Trans. Ind. Electr. 2003, 50, 585–601.
[20] Üreyen, M.E.; Gürkan, P. Comparison of artificial neural network and linear regression models for prediction of ring spun yarn properties. I. Prediction of yarn tensile properties. Fibers Polym. 2008, 9, 87–91.
[21] Ghanbari, A.; Naghavi, A.; Ghaderi, S.; Sabaghian, M. Artificial Neural Networks and regression approaches comparison for forecasting Iran’s annual electricity load. In Proceedings of the International Conference on Power Engineering, Energy and Electrical Drives, Lisbon, Portugal, 18–20 March 2009; pp. 675–679.
[22] Elminir, H.K.; Azzam, Y.A.; Younes, F.I. Prediction of hourly and daily diffuse fraction using neural network, as compared to linear regression models. Energy 2007, 32, 1513–1523.
[23] Quan, G.Z.; Pan, J.; Wang, X. Prediction of the Hot Compressive Deformation Behavior for Superalloy Nimonic 80A by BP-ANN Model. Appl. Sci. 2016, 6, 66.
[24] Zhao, M.; Li, Z.; He, W. Classifying Four Carbon Fiber Fabrics via Machine Learning: A Comparative Study Using ANNs and SVM. Appl. Sci. 2016, 6, 209.
[25] Odagawa, A.; Sakai, M.; Adachi, H.; Setsune, K.; Hirao, T.; Yoshida, K. Observation of intrinsic Josephson junction properties on (Bi, Pb) SrCaCuO thin films. Jpn. J. Appl. Phys. 1997, 36, L21.
[26] Setti, S.G.; Rao, R. Artificial neural network approach for prediction of stress–strain curve of near β titanium alloy. Rare Met. 2014, 33, 249–257.
[27] Marquardt, D.W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 1963, 11, 431–441.
[28] MacKay, D.J. A practical Bayesian framework for backpropagation networks. Neural Comput. 1992, 4, 448–472.
[29] Beale, E. A derivation of conjugate gradients. In Numerical Methods for Nonlinear Optimization; Academic Press Inc.: Cambridge, MA, USA, 1972; pp. 39–43.
[30] Hestenes, M.R. Conjugate Direction Methods in Optimization; Springer Science & Business Media: New York, NY, USA, 2012; Volume 12.
[31] Johansson, E.M.; Dowla, F.U.; Goodman, D.M. Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. Int. J. Neural Syst. 1991, 2, 291–301.
[32] Powell, M.J.D. Restart procedures for the conjugate gradient method. Math. Program. 1977, 12, 241–254.
[33] Battiti, R.; Masulli, F. BFGS optimization for faster and automated supervised learning. In International Neural Network Conference; Springer: Berlin/Heidelberg, Germany, 1990; pp. 757–760.
[34] Beale, M.H.; Hagan, M.T.; Demuth, H.B. Neural network toolbox 7. In Matlab’s User Guide; Mathworks: Natick, MA, USA, 2010.
Figure 1. Scanning electron micrograph (SEM) of a superconducting Nb film with a diluted square array of holes.
Figure 2. The current–voltage (IV) characteristic of a diluted square array of holes at different temperatures.