
Decision maker based on nanoscale photo-excitation transfer.

Song-Ju Kim, Makoto Naruse, Masashi Aono, Motoichi Ohtsu, Masahiko Hara.

Abstract

Decision-making is one of the most important intellectual abilities of the human brain. Here we propose an efficient decision-making system which uses optical energy transfer between quantum dots (QDs) mediated by optical near-field interactions occurring at scales far below the wavelength of light. The simulation results indicate that our system outperforms the softmax rule, which is known as the best-fitting algorithm for human decision-making behaviour. This suggests that we can produce a nano-system which makes decisions efficiently and adaptively by exploiting the intrinsic spatiotemporal dynamics involving QDs mediated by optical near-field interactions.

Year:  2013        PMID: 23928655      PMCID: PMC3738946          DOI: 10.1038/srep02370

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Which clothes should I wear? Which restaurant should we choose for lunch? Which article should I read first? We make many decisions in our daily lives. Can we look to nature to find a method for ‘efficient decision-making’? For formal discussion, let us focus on the multi-armed bandit problem (BP), stated as follows. Consider a number of slot machines. Each machine k (k ∈ {1, 2, …, N}), when pulled, rewards the player with a coin with a certain probability P_k. For simplicity, we assume that the reward from one machine is the same as that from another machine. To maximise the total reward, the player must judge quickly and accurately which machine has the highest probability of giving a reward. To accomplish this, the player should gather information about many machines in an effort to determine which machine is the best; however, in this process, the player should not fail to exploit the reward from the known best machine. These requirements are not easily met simultaneously, because there is a trade-off between ‘exploration’ and ‘exploitation’. The BP asks for the optimal strategy for maximising the total reward under these incompatible demands: either exploiting the rewards promised by information already collected, or exploring for new information that may yield higher pay-offs at some risk. Living organisms commonly encounter this ‘exploration-exploitation dilemma’ in their struggle to survive in the unknown world. This dilemma has no known generally optimal solution. What strategies do humans and animals exploit to resolve this dilemma? Daw et al. found that the softmax rule is the best-fitting algorithm for human decision-making behaviour in the BP task [1]. The softmax rule uses the randomness of the selection specified by a parameter analogous to the temperature in the Boltzmann distribution (see Methods). The findings of Daw et al. raised many exciting questions for future brain research [2].
How humans and animals respond to the dilemma, and the underlying neural mechanisms, remain important open questions. The BP was originally described by Robbins [3], although essentially the same problem was also studied by Thompson [4]. However, the optimal strategy is known only for a limited class of problems in which the reward distributions are assumed to be ‘known’ to the players, and an index called ‘the Gittins index’ is computed [5, 6]. Furthermore, computing the Gittins index in practice is not tractable for many problems. Auer et al. proposed another index expressed as a simple function of the total reward obtained from a machine [7]. This upper confidence bound (UCB) algorithm is used worldwide for many applications, such as Monte-Carlo tree searches [8, 9], web content optimization [10] and information and communications technology (ICT) [11–14]. Kim et al. proposed a decision-making algorithm called the ‘tug-of-war model’ (TOW); it was inspired by the true slime mold Physarum [15, 16], which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a ‘nonlocal correlation’ among the branches, that is, the volume increment in one branch is immediately compensated for by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision-making. Thus, the TOW is a dynamical system which describes the spatiotemporal dynamics of a physical object (i.e. an amoeboid organism). The TOW connected ‘natural phenomena’ to ‘decision-making’ for the first time. This approach enables us to realise an ‘efficient decision maker’, that is, an object which can make decisions efficiently. Here we demonstrate the physical implementation of the TOW with quantum dots (QDs) and optical near-field interactions by using numerical simulations.
Semiconductor QDs have been used for innovative nanophotonic devices [17, 18], and optical near-field interactions have been successfully applied to solar cells [19], LEDs [20] and diode lasers [21], among others. We have already proposed QD systems for computing applications, such as the constraint satisfaction problem (CSP) and the satisfiability problem (SAT) [22, 23]. We introduce a new application for decision-making by making use of optical energy transfer between QDs mediated by optical near-field interactions. We use three types of cubic QDs with side lengths of a, √2a and 2a, which are respectively represented by QD_S, QD_M and QD_L [Fig. 1(a)]. We assume that five QDs are one-dimensionally arranged in ‘L-M-S-M-L’ or ‘M-L-S-L-M’ order as shown in Fig. 1(b), where S, M and L are simplified representations of QD_S, QD_M and QD_L, respectively. Owing to the steep electric fields in the vicinity of these QDs, an optical excitation can be transferred between QDs through resonant energy levels mediated by optical near-field interactions [24–28]. Here we should note that an optical excitation is usually transferred from smaller QDs to larger ones owing to energy dissipation processes occurring at the larger QDs (details are described in Supplementary Information). In addition, an optical near-field interaction follows a Yukawa-type potential,

$$U(r) = \frac{A \exp(-r/a)}{r}, \qquad (1)$$

meaning that it can be engineered through the inter-dot distances, where r is the inter-dot distance, and A and a are constants [29].
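The choice of side lengths a, √2a and 2a can be checked with a short calculation. The sketch below assumes the textbook infinite-well (particle-in-a-box) model, in which a level (n_x, n_y, n_z) of a cube with side L has energy proportional to (n_x² + n_y² + n_z²)/L²; this model and the helper name `box_level` are illustrative assumptions, not taken from the paper. Under that assumption, S1 lines up with M2 and L3, and M1 lines up with L2, while L1 is the lowest level in the system.

```python
from math import isclose, sqrt


def box_level(n, side):
    """Energy of level n = (nx, ny, nz) in a cubic infinite well.

    Units of h^2 / (8 m) are dropped; only ratios matter here.
    Assumption: idealised particle-in-a-box model.
    """
    nx, ny, nz = n
    return (nx**2 + ny**2 + nz**2) / side**2


a = 1.0  # side length of the small dot, arbitrary units
side = {"S": a, "M": sqrt(2) * a, "L": 2 * a}

levels = {
    "S1": box_level((1, 1, 1), side["S"]),
    "M2": box_level((2, 1, 1), side["M"]),
    "M1": box_level((1, 1, 1), side["M"]),
    "L3": box_level((2, 2, 2), side["L"]),
    "L2": box_level((2, 1, 1), side["L"]),
    "L1": box_level((1, 1, 1), side["L"]),
}

for name, energy in levels.items():
    print(f"{name}: {energy:.3f}")
```

In this idealised picture the resonant levels provide the transfer paths, and relaxation to the lower level inside each larger dot makes the transfer effectively one-directional, as described in the text.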
Figure 1

(a) Energy transfer between quantum dots (QDs). Two cubic quantum dots QD_S and QD_M, whose side lengths are a and √2a, respectively, are located close to each other. Optical excitations in QD_S can be transferred to the neighbouring structure QD_M via optical near-field interactions [29], because there exists a resonance between the level of quantum number (1, 1, 1) for QD_S (denoted by S1) and that of quantum number (2, 1, 1) for QD_M (M2). (b) QD-based decision maker. The system consists of five QDs denoted QD_S, QD_ML, QD_MR, QD_LL and QD_LR. The energy levels in the system are summarised as follows. The (2, 1, 1)-levels of QD_ML, QD_MR, QD_LL and QD_LR are respectively denoted by ML2, MR2, LL2 and LR2. The (1, 1, 1)-levels of QD_ML, QD_MR, QD_LL and QD_LR are respectively denoted by ML1, MR1, LL1 and LR1. The (2, 2, 2)-levels of QD_LL and QD_LR are respectively denoted by LL3 and LR3. The optical near-field interactions between the dots are also indicated. (c) Schematic summary of the state transitions, showing the relaxation rates and the radiative decay rates.

When an optical excitation is generated at QD_S, it is transferred to the lowest energy levels in the large dots (QD_L); we observe negligible radiation from the middle-sized dots (QD_M). However, when the lowest energy levels of the large dots are occupied by control lights, which induce state-filling effects, the optical excitation at QD_S is more likely to be radiated from the middle-sized dots [30]. Here we consider the photon radiation from the left or the right QD_M as the decision to select slot machine A or B, respectively. The intensities of the control light that induces state-filling at the left and right QD_L are modulated on the basis of the rewards obtained from the chosen slot machine. We call such a decision-making system the ‘QD-based decision maker (QDM)’. The QDM can be easily extended to N-armed (N > 2) cases, although we demonstrate only the two-armed case in this study. It should be noted that the optical excitation transfer between QDs mediated by optical near-field interactions is fundamentally probabilistic; this is described below in detail on the basis of a density matrix formalism. Until energy dissipation is induced, an optical excitation simultaneously interacts with all potentially transferable destination QDs in the resonant energy level. We exploit such probabilistic behaviour as the exploration function of decision-making. It should also be emphasised that, conventionally, propagating light is assumed to interact with nanostructured matter in a spatially uniform manner (a well-known principle referred to as the long-wavelength approximation), from which state transition rules for optical transitions are derived, including dipole-forbidden transitions. However, such an approximation is not valid for optical near-field interactions in the subwavelength vicinity of an optical source; the inhomogeneity of the rapidly decaying optical near-fields makes even conventionally dipole-forbidden transitions allowable [17, 22, 23].
We introduce quantum mechanical modelling of the total system based on a density matrix formalism. For simplicity, we assume a single-excitation system. There are in total 11 energy levels in the system: S1 in QD_S; ML1 and ML2 in QD_ML; LL1, LL2 and LL3 in QD_LL; MR1 and MR2 in QD_MR; LR1, LR2 and LR3 in QD_LR. Therefore the number of different states occupying these energy levels is 12 including the vacancy state, as schematically shown in Fig. 1(b). Because fast inter-sublevel transitions in the middle-sized and large dots are assumed, it is useful to establish theoretical treatments on the basis of the exciton population in the system composed of QD_S, the two QD_M and the two QD_L, where 11 basis states are assumed, as schematically illustrated in Fig. 1(c).

Results

Here we propose the QDM that is based on the five-QD system shown in Fig. 1(b). Through optical near-field interactions, an optical excitation generated at QD_S is transferred to the lowest energy levels in the largest QDs, namely the energy level LL1 or LR1. However, when LL1 and LR1 are occupied by other excitations, the input excitation generated at S1 should be relaxed from the middle-sized QDs, that is, from ML1 and MR1. The idea of the QDM is to induce state-filling at LL1 and/or LR1 while observing radiation from ML1 and MR1. If radiation occurs in ML1 (MR1), then we consider that the system selects machine A (B). We can modulate these radiation probabilities by changing the intensity of the incident light. We adopt the intensity adjuster (IA) to modulate the intensity of incident light to the large QDs, as shown at the bottom of Fig. 2. The initial position of the IA is zero. In this case, the same intensity of light is applied to both LL1 and LR1. If we move the IA to the right, the intensity on the right increases and that on the left decreases. In contrast, if we move the IA to the left, the intensity on the left increases and that on the right decreases. This situation can be described by relaxation rate parameters that are functions of the IA position j.
Figure 2

Intensity adjuster (IA) and difference between radiation probabilities from ML1 and MR1.

The difference between the radiation probabilities S_A(j) and S_B(j) as a function of the IA position j, calculated from the quantum master equation of the total system, is denoted by the solid red line. As supporting information, the dashed line denotes the result for a second set of parameter values.

In advance, we calculated the two radiation probabilities from ML1 (left) and MR1 (right), namely S_A(j) and S_B(j), in each of the 201 states (−100 ≤ j ≤ 100) by using the quantum master equation of the total system (see Supplementary Information). There are 101 independent values, because the arrangement is left-right symmetric. Results are shown in Fig. 2 as the solid red line. If j > 0, the intensity of incident light to LR1 increases (the corresponding relaxation rate parameter decreases by j/10,000), while that to LL1 decreases (the corresponding parameter increases by j/10,000). Correspondingly, the radiation probability S_B(j) is larger than S_A(j) for j > 0, while S_A(j) is larger than S_B(j) for j < 0. For j < −100, we used the probabilities S_A(−100) and S_B(−100). Similarly, we used S_A(100) and S_B(100) for j > 100. The dynamics of the IA is set up as follows. The IA position is moved according to the reward from a slot machine. Here the parameter D is the unit of the move.
(1) Set the IA position j to 0.
(2) Select machine A or B by using S_A(j) and S_B(j).
(3) Play on the selected machine.
(4) If a coin is dispensed, then move the IA in the selected machine's direction, that is, j = j − D for A, and j = j + D for B. If no coin is dispensed, then move the IA in the direction opposite to the selected machine, that is, j = j + D for A, and j = j − D for B.
(5) Repeat from step (2).
In this way, the system selects A or B, and the IA moves to the right or left according to the reward. Figures 3(a) and (b) demonstrate the efficiency (cumulative rate of correct selections) for the QDM (solid red line) and the softmax rule with optimised parameter τ (dashed line) in the case where the reward probabilities of the slot machines are (a) P_A = 0.2 and P_B = 0.8 and (b) P_A = 0.4 and P_B = 0.6. In these cases, the correct selection is ‘B’ because P_B is larger than P_A. These cumulative rates of correct selections are average values over 1,000 samples. Hence, each value corresponds to the average number of coins acquired from the slot machines.
Even with a non-optimised parameter D, the performance of the QDM is higher than that of the softmax rule with optimised parameter τ over a wide parameter range of D = 10–100, although we show only the D = 50 case in Figs. 3(a) and (b).
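The IA procedure above can be sketched as a short simulation loop. This is a minimal illustration only: the true radiation probabilities come from the quantum master equation, which is not reproduced here, so the function `s_b` below is a hypothetical logistic stand-in that merely shares the qualitative features stated in the text (S_B > S_A for j > 0, symmetry, saturation outside −100 ≤ j ≤ 100). The names `s_b`, `run_qdm` and the `width` parameter are assumptions.

```python
import math
import random


def s_b(j, j_max=100, width=25.0):
    """Hypothetical stand-in for the probability of radiation from MR1.

    NOT the master-equation result; a logistic curve chosen only
    so that s_b(j) > 0.5 for j > 0 and s_b(-j) = 1 - s_b(j).
    """
    j = max(-j_max, min(j_max, j))  # saturate outside [-100, 100]
    return 1.0 / (1.0 + math.exp(-j / width))


def run_qdm(p_a, p_b, steps=1000, d=50, seed=0):
    """Return the rate of correct selections over `steps` plays."""
    rng = random.Random(seed)
    best = "B" if p_b > p_a else "A"
    j = 0  # step (1): IA starts at the centre
    correct = 0
    for _ in range(steps):
        # step (2): choose B with probability s_b(j), else A
        choice = "B" if rng.random() < s_b(j) else "A"
        # step (3): play the chosen machine
        rewarded = rng.random() < (p_b if choice == "B" else p_a)
        # step (4): move toward the chosen machine on reward, away otherwise
        toward = d if choice == "B" else -d
        j += toward if rewarded else -toward
        correct += choice == best
    return correct / steps


rate = run_qdm(0.2, 0.8)
print(f"rate of correct selections: {rate:.3f}")
```

Note that when P_A + P_B = 1, the expected move of j per step is the same whichever machine is played (for P_B = 0.8 and D = 50 it is +30), which is why the adjuster drifts steadily toward the better machine in this sketch.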
Figure 3

(a) Efficiency comparison 1. The efficiency comparison between the QDM and the softmax rule for slot machine reward probabilities of P_A = 0.2 and P_B = 0.8. The cumulative rates of correct selections for the QDM with fixed parameter D = 50 (solid red line) and the softmax rule with optimised parameter τ = 0.40 (dashed line) are shown. (b) Efficiency comparison 2. The efficiency comparison between the QDM and the softmax rule for P_A = 0.4 and P_B = 0.6. The cumulative rates of correct selections for the QDM with fixed parameter D = 50 (solid red line) and the softmax rule with optimised parameter τ = 0.25 (dashed line) are shown. (c) Adaptability comparison. The adaptability comparison between the QDM and the softmax rule for P_A = 0.4 and P_B = 0.6. Every 3,000 steps, the two reward probabilities are swapped. The percentages of correct selections for the QDM with fixed parameter D = 50 (red line) and the softmax rule with the optimised parameter τ = 0.08 (black line) are shown. In this simulation, we used the forgetting parameter α = 0.999 (see Methods).

Figure 3(c) shows the adaptability (percentage of correct selections) for the QDM (red line) and the softmax rule (black line). The parameter τ of the softmax rule was optimised to obtain the fastest adaptation. The switching occurs at time steps t = 3,000 and 6,000. Up to t = 3,000, the systems adapt to the initial environment (P_A = 0.4 and P_B = 0.6). Between t = 3,000 and 6,000, the systems adapt to the new environment (P_A = 0.6 and P_B = 0.4). Beyond t = 6,000, the systems adapt to the first environment again (P_A = 0.4 and P_B = 0.6). Note that we set β = 0 at the switching points because the softmax rule cannot detect the environmental change. Nevertheless, the adaptation (slope) of the QDM was found to be faster (steeper) than that of the softmax rule, both to the initial environment (P_A = 0.4 and P_B = 0.6) and to the new environment (P_A = 0.6 and P_B = 0.4). Thus, we can conclude that the QDM has better adaptability than the softmax rule.

Discussion

In summary, we have demonstrated, by using computer simulations, a QDM based on optical energy transfer occurring at scales far below the wavelength of light. This paves a new way for utilising quantum nanostructures and the inherent spatiotemporal dynamics of optical near-field interactions for a totally new application, that is, ‘efficient decision-making’. Surprisingly, the performance of the QDM is better than that of the softmax rule in our simulation studies. Moreover, the QDM exhibits superior flexibility in adapting to environmental changes, which is an essential property for living organisms to survive in uncertain environments. Finally, there are a few additional remarks. First, it has been demonstrated that optical energy transfer is about 10^4 times more energy efficient than the bit-flip energy of conventional electrical devices [31]. Furthermore, nanophotonic devices based on optical energy transfer with such energy efficiency have been experimentally demonstrated, including input and output interfaces with the optical far-field [32]. These studies indicate that the QDM is highly energy efficient. The second remark is about the experimental implementation of size- and position-controlled quantum nanostructures. Kawazoe et al. successfully demonstrated a room-temperature-operated two-layer QD device by utilising highly sophisticated molecular beam epitaxy (MBE) [17]. In addition, Akahane et al. succeeded in realising more than 100 layers of size-controlled QDs [33]. Besides, DNA-based self-assembly technology can also be a solution for realising controlled nanostructures [34]. Other nanomaterial systems, such as nanodiamonds [35], nanowires [36] and nanostructures formed by droplet epitaxy [37], have also been showing rapid progress. These technologies provide feasible solutions. Finally, in the demonstration of decision-making in this study, for simplicity, we dealt with restricted problems, namely, P_A + P_B = 1.
However, general problems can also be solved by an extended QDM, although the IA dynamics becomes a bit more complicated. The IA and its dynamics can also be implemented by using QDs.

Methods

Calculation of radiation probabilities S_A(j) and S_B(j)

Using the quantum Liouville equation (see Supplementary Information), we can calculate the probabilities of radiation from ML1 and MR1. We used the relaxation rate and radiative decay parameters shown in Fig. 1(c), some of which depend on the position j of the intensity adjuster (IA). Note that we chose the optical near-field interaction parameters so that the QDs are one-dimensionally arranged in ‘M-L-S-L-M’ order for this calculation. Here we used A = 0.1 and a = 10^−9 for the parameters in Eq. (1). The two inter-dot distances r1 and r2 are then 20 nm and 24.5 nm, respectively. Finally, we obtained two radiation probabilities whose maximum ratio is 8520.82. The difference between the radiation probabilities S_A(j) and S_B(j) is shown by the solid red line in Fig. 2.
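As a rough numerical check of these values, the sketch below evaluates a Yukawa-type potential at the two distances. It assumes that Eq. (1) has the standard form U(r) = A exp(−r/a)/r with r and a in metres; the function name `yukawa` and the choice of units are assumptions, and the units of U itself are left open. Under these assumptions the two couplings differ by about two orders of magnitude, with inverse values of roughly 10^2 and 10^4.

```python
import math


def yukawa(r, A=0.1, a=1e-9):
    """Yukawa-type coupling U(r) = A * exp(-r / a) / r.

    Assumption: r and a are in metres; A = 0.1 and a = 1e-9
    are the constants quoted in Methods.
    """
    return A * math.exp(-r / a) / r


u1 = yukawa(20e-9)    # coupling at r1 = 20 nm
u2 = yukawa(24.5e-9)  # coupling at r2 = 24.5 nm

print(f"U(r1) = {u1:.3e}, 1/U(r1) = {1 / u1:.1f}")
print(f"U(r2) = {u2:.3e}, 1/U(r2) = {1 / u2:.1f}")
print(f"ratio U(r1)/U(r2) = {u1 / u2:.1f}")
```

The steep exponential means that a change of only 4.5 nm in distance changes the coupling by a factor of about 100, which is what makes the inter-dot distance a practical design parameter.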

Algorithms

Softmax rule

In the softmax rule, the probability of selecting A or B, P'_A(t) or P'_B(t), is given by the following Boltzmann distributions:

$$P'_A(t) = \frac{\exp[\beta\, Q_A(t)]}{\exp[\beta\, Q_A(t)] + \exp[\beta\, Q_B(t)]}, \qquad P'_B(t) = \frac{\exp[\beta\, Q_B(t)]}{\exp[\beta\, Q_A(t)] + \exp[\beta\, Q_B(t)]},$$

where Q_k (k ∈ {A, B}) is the estimate of the reward probability P_k of slot machine k, and the ‘temperature’ β is a parameter. The estimates are computed from the reward and selection histories: w_k(t) (k ∈ {A, B}) is 1 if a reward is dispensed from machine k at time t and 0 otherwise, and m_k(t) (k ∈ {A, B}) is 1 if machine k is selected at time t and 0 otherwise. The parameter α controls the forgetting of past information. We used α = 1 for Figs. 3(a) and (b), and α = 0.999 for (c). In our study, the ‘temperature’ β was modified to a time-dependent form whose growth rate is determined by the parameter τ. β = 0 corresponds to a random selection, and β → ∞ corresponds to a greedy action. Greedy action means that the player selects A if Q_A > Q_B, or selects B if Q_A < Q_B.
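A minimal sketch of the softmax rule follows. Two ingredients are assumptions made for illustration rather than the paper's stated formulas: the estimate Q_k is taken as the ratio of discounted sums of w_k(t) and m_k(t) with discount α, and the time dependence of β is taken to be linear, β = τ·t. The names `softmax_probs`, `DiscountedEstimate` and `run_softmax` are hypothetical.

```python
import math
import random


def softmax_probs(q_a, q_b, beta):
    """Boltzmann selection probabilities (P'_A, P'_B) for two arms."""
    top = max(q_a, q_b)  # subtract the max to avoid overflow
    e_a = math.exp(beta * (q_a - top))
    e_b = math.exp(beta * (q_b - top))
    return e_a / (e_a + e_b), e_b / (e_a + e_b)


class DiscountedEstimate:
    """Assumed estimate Q = W / M with discounted sums of w(t), m(t)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.wins = 0.0   # discounted sum of w(t)
        self.plays = 0.0  # discounted sum of m(t)

    def update(self, selected, rewarded):
        self.wins = self.alpha * self.wins + (1.0 if selected and rewarded else 0.0)
        self.plays = self.alpha * self.plays + (1.0 if selected else 0.0)

    @property
    def value(self):
        return self.wins / self.plays if self.plays > 0 else 0.0


def run_softmax(p_a, p_b, tau, steps=1000, alpha=1.0, seed=0):
    """Return the rate of correct selections over `steps` plays."""
    rng = random.Random(seed)
    probs = {"A": p_a, "B": p_b}
    est = {k: DiscountedEstimate(alpha) for k in probs}
    best = max(probs, key=probs.get)
    correct = 0
    for t in range(1, steps + 1):
        beta = tau * t  # assumed linear growth of beta with time
        prob_a, _ = softmax_probs(est["A"].value, est["B"].value, beta)
        choice = "A" if rng.random() < prob_a else "B"
        rewarded = rng.random() < probs[choice]
        for k in est:
            est[k].update(selected=(k == choice), rewarded=rewarded)
        correct += choice == best
    return correct / steps


print(softmax_probs(0.3, 0.7, 0.0))    # random selection at beta = 0
print(softmax_probs(0.2, 0.8, 1000.0)) # nearly greedy at large beta
print(run_softmax(0.2, 0.8, tau=0.40))
```

Because β grows with time in this sketch, early estimates carry a lot of weight: an unlucky start can lock the rule onto the worse machine, which is one way to see why τ has to be tuned.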

Dynamics of the intensity adjuster

We adopt the IA to modulate the intensity of incident light to the large QDs, as shown at the bottom of Fig. 2. The dynamics of the IA also uses estimates Q_k (k ∈ {A, B}), which are different from those of the softmax rule. The IA position j is determined by

$$j(t) = \mathrm{floor}\big(Q_B(t) - Q_A(t)\big), \qquad Q_k(t) = r_k(t) + \alpha\, Q_k(t-1),$$

where the function floor(x) truncates the fractional part of x. Here r_k(t) (k ∈ {A, B}) is D if a reward is dispensed from machine k at time t, −D if a reward is not dispensed from machine k, and 0 if the system does not select machine k. D is a parameter, and α is a parameter that controls the forgetting of past information. We used α = 1 for Figs. 3(a) and (b), and α = 0.999 for (c). If we set α = 1, the dynamics can be stated by the following rules. If a coin is dispensed, then move the IA in the selected machine's direction, that is, j = j − D for A and j = j + D for B. If a coin is not dispensed, then move the IA in the direction opposite to the selected machine, that is, j = j + D for A and j = j − D for B.
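The IA update with forgetting can be sketched as a small class. The recursion used here, Q_k ← α·Q_k + r_k with j = floor(Q_B − Q_A), is an assumed form chosen because it reduces to the α = 1 rules stated above; the class name `IntensityAdjuster` is hypothetical. The sketch uses `math.floor`; if the intended operation is truncation toward zero, `math.trunc` would give a different result for negative non-integer values.

```python
import math


class IntensityAdjuster:
    """Assumed IA dynamics: Q_k <- alpha * Q_k + r_k, j = floor(Q_B - Q_A)."""

    def __init__(self, d=50, alpha=1.0):
        self.d = d
        self.alpha = alpha
        self.q = {"A": 0.0, "B": 0.0}

    def update(self, choice, rewarded):
        for k in self.q:
            r = 0.0  # r_k = 0 for the machine that was not selected
            if k == choice:
                r = self.d if rewarded else -self.d
            self.q[k] = self.alpha * self.q[k] + r

    @property
    def position(self):
        return math.floor(self.q["B"] - self.q["A"])


ia = IntensityAdjuster(d=50, alpha=1.0)
ia.update("A", rewarded=True)
print(ia.position)  # reward from A moves the IA to the left
ia.update("A", rewarded=False)
print(ia.position)  # a miss on A moves it back to the right

forgetful = IntensityAdjuster(d=50, alpha=0.999)
forgetful.update("B", rewarded=True)
forgetful.update("B", rewarded=True)
print(forgetful.position)  # slightly below 100 because of forgetting
```

With α < 1 the old evidence decays geometrically, so after the reward probabilities are swapped the adjuster can drift back across zero instead of staying pinned at a large |j|.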

Author Contributions

S.-J.K., M.N. and M.A. designed the research. S.-J.K. performed the computer simulations and analysed the data. M.N. and M.A. helped with the modelling. M.O. and M.H. helped with the analysis. S.-J.K. and M.N. wrote the manuscript. All authors reviewed the manuscript.
  10 in total

1.  Spectrally resolved dynamics of energy transfer in quantum-dot assemblies: towards engineered energy flows in artificial materials.

Authors:  S A Crooker; J A Hollingsworth; S Tretiak; V I Klimov
Journal:  Phys Rev Lett       Date:  2002-10-14       Impact factor: 9.161

Review 2.  Assembly of hybrid photonic architectures from nanophotonic constituents.

Authors:  Oliver Benson
Journal:  Nature       Date:  2011-12-08       Impact factor: 49.962

3.  Tug-of-war model for the two-bandit problem: nonlocally-correlated parallel exploration via resource conservation.

Authors:  Song-Ju Kim; Masashi Aono; Masahiko Hara
Journal:  Biosystems       Date:  2010-04-22       Impact factor: 1.973

4.  Lower bound of energy dissipation in optical excitation transfer via optical near-field interactions.

Authors:  Makoto Naruse; Hirokazu Hori; Kiyoshi Kobayashi; Petter Holmström; Lars Thylén; Motoichi Ohtsu
Journal:  Opt Express       Date:  2010-11-08       Impact factor: 3.894

5.  Optical control of excitons in a pair of quantum dots coupled by the dipole-dipole interaction.

Authors:  Thomas Unold; Kerstin Mueller; Christoph Lienau; Thomas Elsaesser; Andreas D Wieck
Journal:  Phys Rev Lett       Date:  2005-04-08       Impact factor: 9.161

6.  Coherent exciton-surface-plasmon-polariton interaction in hybrid metal-semiconductor nanostructures.

Authors:  P Vasa; R Pomraenke; S Schwieger; Yu I Mazur; Vas Kunets; P Srinivasan; E Johnson; J E Kihm; D S Kim; E Runge; G Salamo; C Lienau
Journal:  Phys Rev Lett       Date:  2008-09-08       Impact factor: 9.161

7.  Near-field optical microscopy with a nanodiamond-based single-photon tip.

Authors:  Aurélien Cuche; Aurélien Drezet; Yannick Sonnefraud; Orestis Faklaris; François Treussart; Jean-François Roch; Serge Huant
Journal:  Opt Express       Date:  2009-10-26       Impact factor: 3.894

8.  Amoeba-inspired nanoarchitectonic computing: solving intractable computational problems using nanoscale photoexcitation transfer dynamics.

Authors:  Masashi Aono; Makoto Naruse; Song-Ju Kim; Masamitsu Wakabayashi; Hirokazu Hori; Motoichi Ohtsu; Masahiko Hara
Journal:  Langmuir       Date:  2013-04-08       Impact factor: 3.882

9.  Cortical substrates for exploratory decisions in humans.

Authors:  Nathaniel D Daw; John P O'Doherty; Peter Dayan; Ben Seymour; Raymond J Dolan
Journal:  Nature       Date:  2006-06-15       Impact factor: 49.962

Review 10.  Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration.

Authors:  Jonathan D Cohen; Samuel M McClure; Angela J Yu
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2007-05-29       Impact factor: 6.237

