# Adiabatic Quantum Computation Applied to Deep Learning Networks


## Abstract


## 1. Introduction

#### 1.1. Boltzmann Machines

#### 1.2. Convolutional Neural Networks

#### 1.3. Spiking Neural Networks

#### 1.4. Challenges

## 2. Related Work

## 3. Approach and Data

#### 3.1. Adiabatic Quantum Computation

#### 3.2. The Superconducting Quantum Adiabatic Processor

#### 3.3. Implementing a Boltzmann Machine on D-Wave

## 4. Results

## 5. Alternative Approaches

#### 5.1. HPC

#### 5.2. Neuromorphic

## 6. Discussion

#### Future Work

## 7. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A.

#### Appendix A.1. Related Works for High Performance Computing

#### Appendix A.2. Related Works for Neuromorphic Computing

## Appendix B.

#### Appendix B.1. Description of High Performance Computing

#### Appendix B.2. Description of Neuromorphic Computing

…₂) [53], tantalum-oxide (TaO₂) [54], and titanium-oxide (TiO₂) [55]. All of these memristive material stacks consist of an oxide layer sandwiched between two metallic layers. Depending on the polarity and magnitude of an applied voltage bias, the oxide layer transitions between less and more conductive states, providing the switching characteristics desirable for representing synaptic weights.

## References

- Ackley, D.H.; Hinton, G.E.; Sejnowski, T.J. A learning algorithm for Boltzmann machines. Cogn. Sci. **1985**, 9, 147–169.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science **2006**, 313, 504–507.
- Hinton, G.E.; Osindero, S.; Teh, Y.W. A Fast Learning Algorithm for Deep Belief Nets. Neural Comput. **2006**, 18, 1527–1554.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE **1998**, 86, 2278–2324.
- Scherer, D.; Müller, A.; Behnke, S. Evaluation of pooling operations in convolutional architectures for object recognition. In International Conference on Artificial Neural Networks; Springer: Berlin/Heidelberg, Germany, 2010; pp. 92–101.
- Esser, S.K.; Merolla, P.A.; Arthur, J.V.; Cassidy, A.S.; Appuswamy, R.; Andreopoulos, A.; Berg, D.J.; McKinstry, J.L.; Melano, T.; Barch, D.R.; et al. Convolutional networks for fast, energy-efficient neuromorphic computing. Proc. Natl. Acad. Sci. USA **2016**, 113, 11441–11446.
- Indiveri, G.; Corradi, F.; Qiao, N. Neuromorphic Architectures for Spiking Deep Neural Networks. In Proceedings of the IEEE International Electron Devices Meeting (IEDM), Washington, DC, USA, 7–9 December 2015.
- Wiebe, N.; Kapoor, A.; Svore, K.M. Quantum deep learning. Quantum Inf. Comput. **2016**, 16, 0541–0587.
- LeCun, Y.; Cortes, C.; Burges, C.J. The MNIST Database of Handwritten Digits. 1998. Available online: http://yann.lecun.com/exdb/mnist/ (accessed on 17 May 2018).
- Feynman, R.P. Simulating physics with computers. Int. J. Theor. Phys. **1982**, 21, 467–488.
- Shor, P.W. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. SIAM J. Comput. **1997**, 26, 1484–1509.
- Raussendorf, R.; Briegel, H.J. A One-Way Quantum Computer. Phys. Rev. Lett. **2001**, 86, 5188–5191.
- Farhi, E.; Goldstone, J.; Gutmann, S.; Sipser, M. Quantum Computation by Adiabatic Evolution. arXiv **2000**, arXiv:quant-ph/0001106.
- Adachi, S.H.; Henderson, M.P. Application of Quantum Annealing to Training of Deep Neural Networks. arXiv **2015**, arXiv:1510.06356v1.
- Benedetti, M.; Realpe-Gómez, J.; Biswas, R.; Perdomo-Ortiz, A. Estimation of effective temperatures in quantum annealers for sampling applications: A case study with possible applications in deep learning. Phys. Rev. A **2016**, 94, 022308.
- Benedetti, M.; Realpe-Gómez, J.; Biswas, R.; Perdomo-Ortiz, A. Quantum-assisted learning of graphical models with arbitrary pairwise connectivity. arXiv **2016**, arXiv:1609.02542v2.
- Terwilliger, A.M.; Perdue, G.N.; Isele, D.; Patton, R.M.; Young, S.R. Vertex Reconstruction of Neutrino Interactions Using Deep Learning. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 2275–2281.
- Barahona, F. On the computational complexity of Ising spin glass models. J. Phys. A Math. Gen. **1982**, 15, 3241.
- Born, M.; Fock, V. Beweis des Adiabatensatzes. Z. Phys. **1928**, 51, 165–180.
- Sarandy, M.S.; Wu, L.A.; Lidar, D.A. Consistency of the Adiabatic Theorem. Quantum Inf. Process. **2004**, 3, 331–349.
- Boixo, S.; Somma, R.D. Necessary condition for the quantum adiabatic approximation. Phys. Rev. A **2010**, 81, 032308.
- Somma, R.D.; Boixo, S. Spectral Gap Amplification. arXiv **2011**, arXiv:1110.2494.
- Harris, R.; Johnson, M.W.; Lanting, T.; Berkley, A.J.; Johansson, J.; Bunyk, P.; Tolkacheva, E.; Ladizinsky, E.; Ladizinsky, N.; Oh, T.; et al. Experimental investigation of an eight-qubit unit cell in a superconducting optimization processor. Phys. Rev. B **2010**, 82, 024511.
- Choi, V. Minor-embedding in adiabatic quantum computation: II. Minor-universal graph design. arXiv **2011**, arXiv:1001.3116.
- Salakhutdinov, R.; Hinton, G. Deep Boltzmann machines. In Proceedings of Artificial Intelligence and Statistics, Beach, FL, USA, 16–18 April 2009; pp. 448–455.
- Hinton, G. A practical guide to training restricted Boltzmann machines. Momentum **2010**, 9, 926.
- Diehl, P.U.; Neil, D.; Binas, J.; Cook, M.; Liu, S.C.; Pfeiffer, M. Fast-Classifying, High-Accuracy Spiking Deep Networks Through Weight and Threshold Balancing. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–16 July 2015; pp. 1–8.
- Potok, T.E.; Schuman, C.; Young, S.; Patton, R.; Spedalieri, F.; Liu, J.; Yao, K.T.; Rose, G.; Chakma, G. A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers. In Proceedings of the Machine Learning in HPC Environments, Salt Lake City, UT, USA, 30 January 2017; pp. 47–55.
- Young, S.R.; Rose, D.C.; Karnowski, T.P.; Lim, S.H.; Patton, R.M. Optimizing Deep Learning Hyper-Parameters Through an Evolutionary Algorithm. In Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments, Austin, TX, USA, 15–20 November 2015; pp. 1–5.
- Young, S.R.; Rose, D.C.; Johnston, T.; Heller, W.T.; Karnowski, T.P.; Potok, T.E.; Patton, R.M.; Perdue, G.; Miller, J. Evolving Deep Networks Using HPC. In Proceedings of the Machine Learning on HPC Environments, Denver, CO, USA, 12–17 November 2017; p. 7.
- Sayyaparaju, S.; Chakma, G.; Amer, S.; Rose, G.S. Circuit Techniques for Online Learning of Memristive Synapses in CMOS-Memristor Neuromorphic Systems. In Proceedings of the Great Lakes Symposium on VLSI, Lake Louise, AB, Canada, 10–12 May 2017; pp. 479–482.
- Liu, C.; Yang, Q.; Yan, B.; Yang, J.; Du, X.; Zhu, W.; Jiang, H.; Wu, Q.; Barnell, M.; Li, H. A Memristor Crossbar Based Computing Engine Optimized for High Speed and Accuracy. In Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Pittsburgh, PA, USA, 11–13 July 2016; pp. 110–115.
- Farabet, C.; Martini, B.; Akselrod, P.; Talay, S.; LeCun, Y.; Culurciello, E. Hardware accelerated convolutional neural networks for synthetic vision systems. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, Vienna, Austria, 30 May–2 June 2010; pp. 257–260.
- Yepes, A.; Tang, J.; Mashford, B.J. Improving classification accuracy of feedforward neural networks for spiking neuromorphic chips. arXiv **2017**, arXiv:1705.07755.
- Schuman, C.D.; Potok, T.E.; Young, S.; Patton, R.; Perdue, G.; Chakma, G.; Wyer, A.; Rose, G.S. Neuromorphic Computing for Temporal Scientific Data Classification. In Proceedings of the Neuromorphic Computing Symposium, Knoxville, TN, USA, 17–19 July 2018.
- Coates, A.; Huval, B.; Wang, T.; Wu, D.J.; Catanzaro, B.; Ng, A.Y. Deep Learning with COTS HPC Systems. In Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, GA, USA, 16–21 June 2013; pp. 1337–1345.
- Cassidy, A.S.; Merolla, P.; Arthur, J.V.; Esser, S.K.; Jackson, B.; Alvarez-Icaza, R.; Datta, P.; Sawada, J.; Wong, T.M.; Feldman, V.; et al. Cognitive Computing Building Block: A Versatile and Efficient Digital Neuron Model for Neurosynaptic Cores. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2013; pp. 1–10.
- Shen, J.; Ma, D.; Gu, Z.; Zhang, M.; Zhu, X.; Xu, X.; Xu, Q.; Shen, Y.; Pan, G. Darwin: A neuromorphic hardware co-processor based on Spiking Neural Networks. Sci. China Inf. Sci. **2016**, 59, 1–5.
- Jouppi, N. Google Supercharges Machine Learning Tasks with TPU Custom Chip. 2016. Available online: https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html (accessed on 17 May 2018).
- Nervana. Nervana Engine. 2016. Available online: https://www.nervanasys.com/technology/engine/ (accessed on 30 January 2017).
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. **1997**, 9, 1735–1780.
- Esser, S.K.; Appuswamy, R.; Merolla, P.; Arthur, J.V.; Modha, D.S. Backpropagation for energy-efficient neuromorphic computing. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Montréal, QC, Canada, 7–12 December 2015; pp. 1117–1125.
- Arthur, J.V.; Merolla, P.A.; Akopyan, F.; Alvarez, R.; Cassidy, A.; Chandra, S.; Esser, S.K.; Imam, N.; Risk, W.; Rubin, D.B.; et al. Building Block of a Programmable Neuromorphic Substrate: A Digital Neurosynaptic Core. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Brisbane, Australia, 10–15 June 2012; pp. 1–8.
- Bohte, S.M.; Kok, J.N.; La Poutre, H. Error-backpropagation in temporally encoded networks of spiking neurons. Neurocomputing **2002**, 48, 17–37.
- Schrauwen, B.; Van Campenhout, J. Extending Spikeprop. In Proceedings of the IEEE International Joint Conference on Neural Networks, Budapest, Hungary, 25–29 July 2004; Volume 1, pp. 471–475.
- Song, S.; Miller, K.D.; Abbott, L.F. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat. Neurosci. **2000**, 3, 919–926.
- Williams, R.S. How We Found The Missing Memristor. IEEE Spectr. **2008**, 45, 28–35.
- Jo, S.H.; Chang, T.; Ebong, I.; Bhadviya, B.B.; Mazumder, P.; Lu, W. Nanoscale memristor device as synapse in neuromorphic systems. Nano Lett. **2010**, 10, 1297–1301.
- Kim, K.H.; Gaba, S.; Wheeler, D.; Cruz-Albrecht, J.M.; Hussain, T.; Srinivasa, N.; Lu, W. A functional hybrid memristor crossbar-array/CMOS system for data storage and neuromorphic applications. Nano Lett. **2011**, 12, 389–395.
- Prezioso, M.; Merrikh-Bayat, F.; Hoskins, B.; Adam, G.; Likharev, K.K.; Strukov, D.B. Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature **2015**, 521, 61–64.
- Schuman, C.D.; Plank, J.S.; Disney, A.; Reynolds, J. An Evolutionary Optimization Framework for Neural Networks and Neuromorphic Architectures. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, USA, 24–29 July 2016; pp. 145–154.
- Schuman, C.D.; Birdwell, J.D.; Dean, M.E. Spatiotemporal Classification Using Neuroscience-Inspired Dynamic Architectures. Proc. Comput. Sci. **2014**, 41, 89–97.
- Cady, N.; Beckmann, K.; Manem, H.; Dean, M.; Rose, G.; Nostrand, J.V. Towards Memristive Dynamic Adaptive Neural Network Arrays. In Proceedings of the Government Microcircuit Applications and Critical Technology Conference (GOMACTech), Orlando, FL, USA, 14–17 March 2016.
- Yang, J.J.; Zhang, M.; Strachan, J.P.; Miao, F.; Pickett, M.D.; Kelley, R.D.; Medeiros-Ribeiro, G.; Williams, R.S. High switching endurance in TaOx memristive devices. Appl. Phys. Lett. **2010**, 97, 232102.
- Medeiros-Ribeiro, G.; Perner, F.; Carter, R.; Abdalla, H.; Pickett, M.D.; Williams, R.S. Lognormal switching times for titanium dioxide bipolar memristors: origin and resolution. Nanotechnology **2011**, 22, 095702.

**Figure 1.** A Boltzmann machine is divided into a visible layer, representing the data input, and a hidden layer, which represents latent factors controlling the data distribution. This diagram shows the restricted Boltzmann machine, or RBM, in which intralayer connections are prohibited. Each connection between units is a separate weight parameter which is discovered through training.
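The restriction in the caption has a practical payoff: with no intralayer connections, every hidden unit is conditionally independent given the visible layer, so the hidden layer can be sampled in one pass. A minimal plain-Python sketch of that sampling rule, with hypothetical toy sizes (not the authors' implementation):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(visible, weights, hidden_bias, rng):
    """Sample each hidden unit given the visible layer.

    Because an RBM has no hidden-to-hidden connections, each hidden
    unit's activation probability depends only on the visible units,
    so all hidden units can be sampled independently.
    """
    hidden = []
    for j, b in enumerate(hidden_bias):
        activation = b + sum(v * weights[i][j] for i, v in enumerate(visible))
        p = sigmoid(activation)
        hidden.append(1 if rng.random() < p else 0)
    return hidden

# Toy example: 4 visible units, 3 hidden units, zero-initialized weights.
rng = random.Random(0)
weights = [[0.0] * 3 for _ in range(4)]
h = sample_hidden([1, 0, 1, 1], weights, [0.0, 0.0, 0.0], rng)
```

Alternating this step with the analogous visible-given-hidden step yields the Gibbs chain used in contrastive-divergence training.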

**Figure 2.** A convolutional neural network is composed of a series of alternating convolutional and pooling layers. Each convolutional layer extracts features from its preceding layer to form feature maps. These feature maps are then down-sampled by a pooling layer to exploit data locality. A perceptron, a simple type of classification network, is placed as the last layer of the CNN.
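The convolve-then-pool pair in the caption can be sketched directly. The following illustrative plain-Python snippet (toy image and kernel are hypothetical, chosen only to show the mechanics) computes a "valid" 2D convolution and a 2 × 2 max-pool:

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution: slide the kernel over the image and take a
    weighted sum at each position, producing one feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(fmap, size=2):
    """Down-sample a feature map by taking the max of each size x size tile."""
    return [[max(fmap[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# A 4 x 4 image with a vertical edge down the middle.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge_kernel = [[-1, 1]]            # responds to left-to-right intensity steps
fmap = convolve2d(image, edge_kernel)  # 4 x 3 feature map, peaks at the edge
pooled = max_pool(fmap)                # down-sampled by 2 x 2 pooling
```

The pooled output keeps the edge response while discarding its exact position, which is the locality-exploiting behavior the caption describes.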

**Figure 3.** The connectivity in a CNN is sparse relative to the previously shown BM model. Additionally, the set of weights is shared between units, unlike in BMs. In this illustration we symbolize this with the red, green, and blue connections to show that each unit in the convolutional layer applies the same operation to different segments of the input.

**Figure 4.** Our LBM model added connectivity between units in the hidden layer, shown in red. RBMs prohibit such intralayer connections because they add too much computational complexity for classical machines. We represented the hidden layer (outlined in blue) on the D-Wave device. The connections between hidden units were 4-by-4 bipartite due to the device’s physical topology constraints.

**Figure 5.** The hidden layer from Figure 4 is represented in one of D-Wave’s chimera cells here, with the cell’s bipartite connectivity made more obvious. The input/visible units of the LBM are left on a classical machine. Their contributions to the activity of the hidden units are reduced to an activity bias (represented with ± symbols) on those units. Figure 6 shows the overall chimera topology of the D-Wave device.
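The reduction described in the caption (visible units stay classical; their influence folds into per-qubit biases, while hidden-to-hidden weights map onto a cell's bipartite couplers) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not D-Wave's API: the function name and the 8-unit/16-coupler layout of a single chimera cell are taken from the figure descriptions, and all numeric values in the example are hypothetical.

```python
def chimera_cell_problem(visible, vis_to_hid, hidden_bias, intra_weights):
    """Reduce an LBM's visible layer to biases on 8 hidden units and lay out
    hidden-to-hidden weights on one chimera cell's bipartite couplers.

    A chimera cell couples each of hidden units 0-3 to each of units 4-7,
    so only those 4 x 4 = 16 pairs receive a coupler strength.
    """
    # Each hidden unit's bias absorbs the (fixed) contribution of the
    # classically held visible units: b_j + sum_i v_i * W[i][j].
    biases = [hidden_bias[j] + sum(v * vis_to_hid[i][j]
                                   for i, v in enumerate(visible))
              for j in range(8)]
    # Bipartite couplers: left partition {0..3} to right partition {4..7}.
    couplers = {(a, b): intra_weights[a][b - 4]
                for a in range(4) for b in range(4, 8)}
    return biases, couplers

# Hypothetical toy instance: 2 visible units feeding 8 hidden units.
visible = [1, 0]
vis_to_hid = [[0.5] * 8, [0.25] * 8]
intra = [[0.1] * 4 for _ in range(4)]
biases, couplers = chimera_cell_problem(visible, vis_to_hid, [0.0] * 8, intra)
```

The returned biases and couplers correspond to the per-qubit fields and pairwise interaction strengths one would hand to an annealer or its simulator.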

**Figure 6.** Chimera graphs are composed of 8-qubit cells featuring bipartite connectivity. Each cell’s partition is connected to another partition in the adjacent cells.

**Figure 7.** An initial experiment to demonstrate LBM utility. Reconstruction error (sum of squared error) of BMs trained on simulated data using no intralayer connections and using random intralayer connections with a small (0.0001) hidden-to-hidden weight learning rate. Here we show five RBMs (**red**) and five LBMs (**blue**), and the results suggest even just the presence of relatively static intralayer connections gives LBMs a performance advantage over RBMs. We obtained these results from the quantum annealing simulator provided by D-Wave.
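The reconstruction-error metric reported throughout the figures is the sum of squared error between an input vector and the model's reconstruction of it; as a one-function plain-Python sketch (illustrative only, with made-up vectors):

```python
def reconstruction_error(data, reconstruction):
    """Sum of squared error between an input vector and its reconstruction,
    the metric plotted in the reconstruction-error figures."""
    return sum((d - r) ** 2 for d, r in zip(data, reconstruction))

# One mismatched unit out of four contributes an error of 1.
err = reconstruction_error([1, 0, 1, 1], [1, 0, 0, 1])
```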

**Figure 8.** Reconstruction error and classification rate over 25 training epochs using 6000 MNIST images for training and 6000 for testing. Reconstruction error decreases as classification rate rises, confirming that the RBM learns the MNIST data distribution.

**Figure 9.** RBM and LBM performance on the MNIST digit classification task. The LBM tends to label the digits slightly better and produces lower reconstruction error than the RBM.

**Figure 10.** Comparison of RBM against LBM trained on neutrino data using a software simulator. Weights are randomly initialized from a normal distribution. The change in learning rate at epoch 5 is due to a change in the momentum parameter in the algorithm, which is designed to speed up training. The graph shows the mean performance of five different RBMs and five different LBMs and suggests the mean reconstruction errors of the RBM and LBM are significantly different.

**Figure 11.** Another comparison of RBM against LBM run on neutrino data using D-Wave hardware. Both the RBM and LBM are initialized from the same pre-trained model. The pre-trained model is an RBM run for three epochs on a classical machine. The graph shows the mean performance of five different RBMs and five different LBMs, suggesting the performance difference between RBM and LBM persists on hardware.

**Figure 12.** A comparison of the platforms, deep learning approaches, contributions, and significance of the results from the MNIST experiment.

**Figure 13.** A proposed architecture showing how the three approaches, quantum, HPC, and neuromorphic, can be used to improve a deep learning approach. Image data can be analyzed using a CNN rapidly derived via HPC, with the top layers implemented as an LBM on a quantum computer. The top layers have fewer inputs and require greater representational capability, which plays to both the strengths and the limitations of a quantum approach. The temporal aspect of the data can be analyzed using an SNN. Finally, the image and temporal models are merged to provide a richer and, we believe, more accurate model, with the aim of deployment on very low-power neuromorphic hardware.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Liu, J.; Spedalieri, F.M.; Yao, K.-T.; Potok, T.E.; Schuman, C.; Young, S.; Patton, R.; Rose, G.S.; Chakma, G.
Adiabatic Quantum Computation Applied to Deep Learning Networks. *Entropy* **2018**, *20*, 380.
https://doi.org/10.3390/e20050380


**Chicago/Turabian Style**

Liu, Jeremy, Federico M. Spedalieri, Ke-Thia Yao, Thomas E. Potok, Catherine Schuman, Steven Young, Robert Patton, Garrett S. Rose, and Gangotree Chakma.
2018. "Adiabatic Quantum Computation Applied to Deep Learning Networks" *Entropy* 20, no. 5: 380.
https://doi.org/10.3390/e20050380