
Neuromorphic Binarized Polariton Networks.

Rafał Mirek¹, Andrzej Opala², Paolo Comaron², Magdalena Furman¹, Mateusz Król¹, Krzysztof Tyszka¹, Bartłomiej Seredyński¹, Dario Ballarini³, Daniele Sanvitto³, Timothy C H Liew⁴, Wojciech Pacuski¹, Jan Suffczyński¹, Jacek Szczytko¹, Michał Matuszewski², Barbara Piętka¹.

Abstract

The rapid development of artificial neural networks and applied artificial intelligence has led to many applications. However, current software implementation of neural networks is severely limited in terms of performance and energy efficiency. It is believed that further progress requires the development of neuromorphic systems, in which hardware directly mimics the neuronal network structure of a human brain. Here, we propose theoretically and realize experimentally an optical network of nodes performing binary operations. The nonlinearity required for efficient computation is provided by semiconductor microcavities in the strong quantum light-matter coupling regime, which exhibit exciton-polariton interactions. We demonstrate the system performance against a pattern recognition task, obtaining accuracy on a par with state-of-the-art hardware implementations. Our work opens the way to ultrafast and energy-efficient neuromorphic systems taking advantage of ultrastrong optical nonlinearity of polaritons.

Keywords: binary network; exciton-polaritons; microcavities; nonlinear optics; semiconductors

Year: 2021 PMID: 33635656 PMCID: PMC8155323 DOI: 10.1021/acs.nanolett.0c04696

Source DB: PubMed Journal: Nano Lett ISSN: 1530-6984 Impact factor: 11.189

Introduction

The human brain, despite consuming only about 15 W of power, is superior to the most advanced modern supercomputers in many practical tasks, such as object detection and classification. Artificial neural networks (ANNs) are an approach to data processing that mimics the operation of a biological network of neurons, allowing researchers to implement machine learning. Recent years have witnessed immense progress in ANN-based applied artificial intelligence, which has found many important applications in a growing diversity of fields, including medicine, logistics, finance, marketing, defense, agriculture, quantum science, geoscience, gaming, information technology, cybersecurity, language processing, robotics, and autonomous vehicles.[1,2] As the amount of data continually grows, there is an urgent need to provide faster and more energy efficient systems. However, in comparison with the human brain, software simulations of neural networks are inefficient.[3] In the von Neumann architecture, prevalent in conventional computers, the memory and processing units are physically separated, which results in a communication bottleneck. Moreover, the development of current semiconductor technology is bounded by the practical limit of Moore’s law and Amdahl’s law, which hinder the further increase of computational power through the decrease of system size or the increase of the number of processing units.[4] These bounds are largely due to the limited energy efficiency of memory, communication channels, and processing units, which no longer improves exponentially as in the previous decades.[5] Therefore, it is crucial to find an energy-efficient and powerful alternative for big data processing. 
Such a platform is required to realize a neuromorphic approach to neural networks, in which the massively parallel structure of the network is realized physically rather than simulated.[3] In this context, photonic systems are natural candidates;[4,6−12] however, most of the existing realizations were only able to perform basic machine learning tasks, and the advantage of optical systems in terms of speed or energy efficiency has not been clearly demonstrated. Recently, semiconductor microcavities in the quantum strong-coupling regime have emerged as a promising hardware platform for machine learning.[13,14] Exciton-polaritons are quasiparticles resulting from the coupling between photons and excitons in this system.[15,16] They exhibit properties of both light and matter. Electrostatic interactions of excitons lead to optical nonlinearity orders of magnitude stronger than in conventional optical media.[17,18] The cavity photon lifetime results in a picosecond reaction time. The extremely low effective mass of polaritons allows for Bose–Einstein condensation,[16,19] recently realized at room temperature in organic and inorganic materials,[20−22] demonstrating strong nonlinear effects.[23,24] Basic logic elements such as polariton switches, transistors, and gates have been realized.[18,25−30] A system consisting of a polariton microcavity and an off-line classifier was demonstrated to outperform linear classification algorithms.[14] To solve practical tasks of high complexity, a neural network has to perform a nonlinear transformation of input data into an effective higher-dimensional space. 
This allows for determining the result with a linear classification at the output layer.[31] Recently, binarized neural networks, in which the activations or weights of connections are two-level and the neurons perform simple binary operations, have received much attention.[32,33] Binarized networks are characterized by a greatly improved speed and energy efficiency, at the cost of a minimal reduction of inference accuracy. Here, we propose theoretically and realize experimentally a binary network implemented in a polariton microcavity system. Importantly, the hardware of the network is composed of energy-passive optical elements only, such as resonators, beam splitters, and optical filters. We demonstrate that binarized neurons can operate in a fully all-optical mode, which allows for exploiting the intrinsic ultrashort time scales and high energy efficiency of photonics.[34] The energy cost of a single binary operation is measured to be of the order of picojoules, which is comparable to the state-of-the-art electronic neuromorphic implementations, while the computation time scales are in the picosecond range. We demonstrate approximately 96% classification accuracy of handwritten digits from the Modified National Institute of Standards and Technology (MNIST) data set, using a simple single-hidden-layer network in a noisy experimental environment.

Results

All-Optical XOR Logic Gate

The first step in the implementation of a binarized network is the realization of its basic building block,[33] a single XOR gate. The XOR task is a generic example of a problem not solvable using a perceptron or a linear classifier, see Figure 1a. Therefore, it is a benchmark of the capability to solve problems that require a nonlinear transformation. The principle of the implementation is depicted in Figure 1b. In addition to the inputs, which correspond to the two-dimensional xy plane in Figure 1b, a nonlinear feature (z axis) is provided by a micrometer-sized exciton–polariton condensate.

Figure 1

Nonlinear classification and experimental realization. (a) The XOR operation is a generic classification problem that is linearly inseparable in the space of inputs: there exists no straight line separating points corresponding to the “0” and “1” results marked with blue and orange circles, respectively. (b) An additional feature, represented by the z axis, which is a nonlinear function of inputs, allows for performing classification with a two-dimensional plane. (c) Experimental realization in an exciton–polariton system. A series of picosecond pulses encoding the inputs is incident on a semiconductor microcavity in the strong coupling regime, triggering a nonlinear response as a result of bosonic condensation. The emission is used to perform linear classification.

In our experiment, the microcavity consists of two CdTe-based Bragg mirrors, separated by an approximately 600 nm thick (Cd,Zn,Mg)Te layer. At the antinodes of the electromagnetic standing wave, six (Cd,Zn,Mn)Te quantum wells (QWs) are introduced for efficient coupling of QW excitons and the photonic modes (see the Supporting Information for more details). We excite two spatially separated localized condensation sites, Figure 1c, with a series of nonresonant picosecond laser pulses, encoding the corresponding inputs with low (0) or high (1) pulse energy. The two sites are localized close to each other, with a 2 μm distance, which results in a Josephson junction type coupling.[35,36] The light emitted from condensation sites is a nonlinear transformation of the inputs, directed to the linear classifier. The classifier is trained to distinguish “0” and “1” results by adjusting output weights, or the cut in the feature space (Figure 1b).

Figure 2 shows the results obtained using an optoelectronic setup. The photoluminescence of a condensation site as a function of the combined pulse energy of the two inputs resembles the ReLU (rectified linear unit) activation function, see Figure 2a. Figure 2b shows the energy-integrated output intensity from one of the sites for the four possible binary input combinations. The emission intensity from the two sites is converted to electronic signals by the camera and used to infer the result using linear classification. As demonstrated in Figure 2c, the accuracy (the ratio of correct to total predictions) of the XOR gate depends on the degree of nonlinearity η (see the Supporting Information for the definition of η), and an almost perfect operation is obtained for η ≈ 5. Our system achieved perfect accuracy (no mistakes in several hundred thousand operations) due to the nonlinearity reaching η ≈ 50.
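The role of the nonlinear feature can be illustrated with a small numerical sketch. It is not the experimental procedure: the ReLU threshold of 1.0, the coarse weight grid, and the function names are assumptions chosen for illustration. The code searches for the best thresholded linear classifier on the two inputs alone, and then again after a ReLU-like third feature of the combined input has been appended.

```python
from itertools import product

import numpy as np

# The four binary input pairs and their XOR labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])


def relu_feature(inputs, threshold=1.0):
    """ReLU of the combined input; the threshold of 1.0 is an assumed value."""
    return np.maximum(inputs.sum(axis=1) - threshold, 0.0)


def best_linear_accuracy(features, labels, grid=np.linspace(-2, 2, 9)):
    """Best accuracy of the rule (w . f + b > 0) over a coarse grid of w and b."""
    best = 0.0
    for params in product(grid, repeat=features.shape[1] + 1):
        w, b = np.array(params[:-1]), params[-1]
        predictions = (features @ w + b > 0).astype(int)
        best = max(best, float(np.mean(predictions == labels)))
    return best


# Inputs only: XOR is not linearly separable, so one point is always wrong.
acc_inputs_only = best_linear_accuracy(X, y)

# Inputs plus the nonlinear feature: a separating plane exists.
X_lifted = np.column_stack([X, relu_feature(X)])
acc_with_feature = best_linear_accuracy(X_lifted, y)

print(acc_inputs_only, acc_with_feature)
```

In this toy setting the search reports 0.75 for the raw inputs and 1.0 once the third feature is added, for example with weights (1, 1, −2) and bias −0.5.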

Figure 2

Optoelectronic machine learning. (a) The nonlinear dependence of the total emission intensity from the condensation site on the energy of two input pulses. (b) Emission in the four input configurations demonstrates nonlinearity. Insets show typical real-space emission observed on a CCD camera for each realization. The same color scale is preserved for each panel. Image size is ∼7 μm × 7 μm. (c) Accuracy of the XOR gate as a function of the useful degree of nonlinearity η. (d) Accuracy of the MNIST handwritten digit prediction versus the number of XOR gates. Dashed lines show the benchmarks of software linear classification for the full and binarized MNIST input. (e) Conceptual scheme of the network with a single hidden layer of XOR gates.

Having constructed the XOR unit, we build a binary network with a single hidden layer of several thousand (Ngates) XOR gates, see Figure 2e. We consider the handwritten digit recognition task using the MNIST data set, which consists of 60000 training samples and 10000 testing samples of 28 × 28 grayscale images.[37] At the input, we convert each image into a black and white bitmap, and assign a random pair of pixels from the 28 × 28 image to each of the gates, see Figure 2e. The same pairs of pixel positions, denoted by p1···pn, are assigned to the same gates 1···n for all digits. This allows us to detect nontrivial correlations between pixels even in the single-layer network. The above stage does not require any nonlinear operation and can be implemented all-optically, for example, using a three-dimensional laser-written waveguide array.[38] Since the assignment is random and does not change during training, the structure of the network can be considered as a binary generalization of extreme learning machines.[39] Deep networks with more complex structures can be implemented by cascading layers of XOR gates.[33] To demonstrate the capability of the network, we use time multiplexing to realize all gates in the hidden layer. 
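The hidden layer described above, with fixed random pixel pairs each feeding one XOR gate, can be sketched in software as follows. This is a minimal illustration rather than the authors' code: the binarization threshold of 0.5, the exclusion of self-pairs, the seed, and the function names are assumptions, and random arrays stand in for MNIST images.

```python
import numpy as np


def make_pixel_pairs(n_pixels, n_gates, seed=0):
    """Draw one fixed random pair of distinct pixel indices per gate."""
    rng = np.random.default_rng(seed)
    first = rng.integers(0, n_pixels, size=n_gates)
    # An offset in 1..n_pixels-1 guarantees the second pixel differs
    # from the first (excluding self-pairs is an assumption of this sketch).
    offset = rng.integers(1, n_pixels, size=n_gates)
    second = (first + offset) % n_pixels
    return np.stack([first, second], axis=1)


def binarize(images, threshold=0.5):
    """Convert grayscale images in [0, 1] to black-and-white bitmaps."""
    return (images > threshold).astype(np.uint8)


def xor_layer(binary_images, pairs):
    """Apply one XOR gate per pixel pair; the same pairs are used for every image."""
    flat = binary_images.reshape(len(binary_images), -1)
    return flat[:, pairs[:, 0]] ^ flat[:, pairs[:, 1]]


# Stand-in data: five random 28 x 28 "images" instead of MNIST digits.
rng = np.random.default_rng(1)
images = rng.random((5, 28, 28))

pairs = make_pixel_pairs(28 * 28, n_gates=1000)
features = xor_layer(binarize(images), pairs)
print(features.shape)
```

Each row of `features` is the binary hidden-layer output for one image; a linear readout would then be trained on these rows while the pairs stay fixed.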
Logistic regression is used to determine the optimal classification hyperplane in the Ngates-dimensional space (see the Supporting Information for details). The results are shown in Figure 2d, where we plot the accuracy of inference as a function of Ngates. For around 10000 gates the accuracy reaches a plateau at the level of approximately 96%. This is comparable to or higher than that for the state-of-the-art neuromorphic implementations[9,14,40−42] and is considerably higher than the accuracy of pure software linear classification of the grayscale MNIST data set (92.7%) or its binarized version (91.5%), obtained with a logistic regression algorithm.

Similar to the majority of photonic realizations,[7,8,40,43] in the above scheme the linear classification is implemented electronically. This limits the speed and energy efficiency of the system. To solve this issue, we demonstrate that binarized neurons can operate in the all-optical configuration. Such a device is a photonic analog of neural network accelerators.[44−47] In Figure 3a, we show the modified setup of the XOR gate, in which the linear classification is performed by optical elements only. The input pulses are directed at beam splitters, which create auxiliary optical paths bypassing the microcavity. In contrast to the previous scheme, a single condensation site is excited by both input pulses. The weights w1 and w2 of direct connections between the input and the output are implemented with neutral density filters, which reduce the pulse intensity in a controlled way. Since the emission from the condensate is always darker than the input pulses, the emission weight is set to unity. The emission from the condensate mixed with the two weighted auxiliary pulses constitutes the optical output of the gate. This intensity mixing effectively performs a simple three-component vector-matrix multiplication, which is necessary to perform the classification in the three-dimensional feature space (see the Supporting Information for details).
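The intensity mixing can be illustrated with a toy model in which the gate output is w1·(input 1) + w2·(input 2) + 1·(emission). Everything numerical here is hypothetical: the pulse energies for "0" and "1", the two emission curves, and the simplification w1 = w2 = w are assumptions made for this sketch, not measured values. The sketch computes the weight that equalizes the "00" and "11" output levels and shows that it is non-negative only for an emission curve that falls off at high energy.

```python
import numpy as np

# Hypothetical pulse energies encoding "0" and "1" (arbitrary units).
E_LOW, E_HIGH = 0.2, 1.5


def filtered_emission(total_energy):
    """Toy non-monotonic response: rises, then decreases at high energy."""
    return total_energy * np.exp(-total_energy)


def relu_emission(total_energy, threshold=1.0):
    """Toy monotonic (ReLU-like) response, for comparison."""
    return np.maximum(total_energy - threshold, 0.0)


def balancing_weight(emission):
    """Symmetric weight w = w1 = w2 that makes the '00' and '11' outputs equal.

    Solves 2*w*E_LOW + emission(2*E_LOW) == 2*w*E_HIGH + emission(2*E_HIGH).
    """
    a = emission(2 * E_LOW)
    c = emission(2 * E_HIGH)
    return (a - c) / (2 * (E_HIGH - E_LOW))


def gate_output(bit1, bit2, w, emission):
    """Weighted auxiliary pulses plus emission (emission weight fixed to 1)."""
    e1 = E_HIGH if bit1 else E_LOW
    e2 = E_HIGH if bit2 else E_LOW
    return w * e1 + w * e2 + 1.0 * emission(e1 + e2)


w = balancing_weight(filtered_emission)
levels = {(b1, b2): gate_output(b1, b2, w, filtered_emission)
          for b1 in (0, 1) for b2 in (0, 1)}
w_relu = balancing_weight(relu_emission)

print("weight for non-monotonic emission:", w)
print("output levels:", levels)
print("weight required for monotonic emission:", w_relu)
```

In this toy model the non-monotonic curve gives a small positive weight with equal "00" and "11" levels lying below the "01" and "10" levels, whereas the monotonic curve would require a negative weight, which a neutral density filter cannot provide.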

Figure 3

All-optical implementation of XOR gate. (a) Scheme of the experimental setup, in which the linear classification of Figure 1b is implemented with two auxiliary pulse paths controlled with neutral density filters, corresponding to weights w1 and w2. (b) Dependence of emission intensity on the energy of excitation pulses for equal pulse energy in both pulses. The spectral filter placed behind the sample allows for obtaining a negative differential response of the condensate emission. (c) Measured filtered emission intensity for all four combinations of inputs (blue) and the output intensity of the all-optical XOR gate (dark blue), which consists of the emission combined with the weighted inputs. Black dashed lines separate realizations of different inputs. Red dashed lines indicate the gate output intensity levels corresponding to results “0” and “1”. Insets show typical real-space emission observed on a CCD camera for each realization. The same color scale is preserved for each panel. Image size is ∼6 μm × 6 μm.

The nonlinear element has to exhibit a negative differential input–output dependence in a range of excitation powers, as shown in Figure 3b. As the filter weights w1 and w2 cannot be negative, a monotonically increasing dependence would not lead to a useful gate (see the Supporting Information). We use a long-pass spectral filter placed behind the cavity to obtain the negative response shown in Figure 3b. In the “11” input configuration the polariton–polariton interactions shift the emission to higher frequencies, which are blocked by the filter. This method allows for obtaining the well-defined “0” and “1” output levels, which are consistent for all input configurations, see Figure 3c. The noise of the output results mostly from the limited stability of our laser.

To estimate the energy efficiency we determine the input pulse energy required for a single gate operation. The power of input pulses in the “1” state was measured using a power meter at the entrance to the microcavity to be 1.2 mW at 76 MHz repetition rate, which gives approximately 16 pJ pulse energy per gate operation, while the energy of auxiliary pulses was much lower. The approximate cost is around 16 pJ per synaptic operation, comparable to the state-of-the-art neuromorphic electronic implementations.[3]
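The pulse energy quoted above follows from dividing the average power by the repetition rate. A short check of the arithmetic, using only the two measured values from the text (the variable names are illustrative):

```python
# Values quoted in the text.
average_power_w = 1.2e-3      # 1.2 mW average power in the "1" state
repetition_rate_hz = 76e6     # 76 MHz pulse repetition rate

# Energy per pulse = average power / number of pulses per second.
pulse_energy_j = average_power_w / repetition_rate_hz
pulse_energy_pj = pulse_energy_j * 1e12

print(f"{pulse_energy_pj:.1f} pJ per pulse")
```

This gives about 15.8 pJ, which rounds to the approximately 16 pJ stated for a single gate operation.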

Discussion

The radical change of the paradigm of computation allows us to propose an optical system that can be realized with currently available optical elements. In particular, the system does not require a separate memory unit, as all information is carried by photons propagating through the network. We emphasize that despite the binary structure of the network, which is based on XOR gates, we go beyond the traditional digital computer architecture. Our approach reveals the potential of semiconductor microcavity systems as a platform for energy efficient information processing.

22 in total

1. Control and ultrafast dynamics of a two-fluid polariton switch.

Authors: M De Giorgi; D Ballarini; E Cancellieri; F M Marchetti; M H Szymanska; C Tejedor; R Cingolani; E Giacobino; A Bramati; G Gigli; D Sanvitto
Journal: Phys Rev Lett Date: 2012-12-27 Impact factor: 9.161

2. Photonic Floquet topological insulators.

Authors: Mikael C Rechtsman; Julia M Zeuner; Yonatan Plotnik; Yaakov Lumer; Daniel Podolsky; Felix Dreisow; Stefan Nolte; Mordechai Segev; Alexander Szameit
Journal: Nature Date: 2013-04-11 Impact factor: 49.962

3. Artificial brains. A million spiking-neuron integrated circuit with a scalable communication network and interface.

Authors: Paul A Merolla; John V Arthur; Rodrigo Alvarez-Icaza; Andrew S Cassidy; Jun Sawada; Filipp Akopyan; Bryan L Jackson; Nabil Imam; Chen Guo; Yutaka Nakamura; Bernard Brezzo; Ivan Vo; Steven K Esser; Rathinakumar Appuswamy; Brian Taba; Arnon Amir; Myron D Flickner; William P Risk; Rajit Manohar; Dharmendra S Modha
Journal: Science Date: 2014-08-07 Impact factor: 47.728

4. Nonlinear interactions in an organic polariton condensate.

Authors: K S Daskalakis; S A Maier; R Murray; S Kéna-Cohen
Journal: Nat Mater Date: 2014-02-09 Impact factor: 43.841

5. A sub-femtojoule electrical spin-switch based on optically trapped polariton condensates.

Authors: Alexander Dreismann; Hamid Ohadi; Yago Del Valle-Inclan Redondo; Ryan Balili; Yuri G Rubo; Simeon I Tsintzos; George Deligeorgis; Zacharias Hatzopoulos; Pavlos G Savvidis; Jeremy J Baumberg
Journal: Nat Mater Date: 2016-08-08 Impact factor: 43.841

6. Classification with a disordered dopant-atom network in silicon.

Authors: Tao Chen; Jeroen van Gelder; Bram van de Ven; Sergey V Amitonov; Bram de Wilde; Hans-Christian Ruiz Euler; Hajo Broersma; Peter A Bobbert; Floris A Zwanenburg; Wilfred G van der Wiel
Journal: Nature Date: 2020-01-15 Impact factor: 49.962

7. Neuromorphic computing with nanoscale spintronic oscillators.

Authors: Jacob Torrejon; Mathieu Riou; Flavio Abreu Araujo; Sumito Tsunegi; Guru Khalsa; Damien Querlioz; Paolo Bortolotti; Vincent Cros; Kay Yakushiji; Akio Fukushima; Hitoshi Kubota; Shinji Yuasa; Mark D Stiles; Julie Grollier
Journal: Nature Date: 2017-07-26 Impact factor: 49.962