The present disclosure relates to an optical associative learning element and a method of performing an associative learning operation in the optical domain using the optical associative learning element.
Artificial intelligence (AI) seeks to build, engineer, control and design neuromorphic networks that are on par with or perhaps even more elegant than biological neural networks in nature. Associative learning in the form of classical conditioning is often linked with the ability of humans and animals to solve complex multivariate problems with relative ease. Inspired by the same principles, associative learning has been used to augment human work by taking advantage of statistical data inputs that occur simultaneously and thereby forming associations between them.
In autonomous systems, associative learning has been explicitly used to provide machine learning capabilities, for example, the aptitude to predict rare events from temporal and sequential patterns of timestamped observations. The ability to associate can also facilitate sophisticated machine intelligence with a vast array of data analytic applications such as predicting telecommunication equipment failures and mitigating credit card transaction frauds.
On a larger scale, based on one example machine learning architecture (see U.S. Pat. No. 5,588,091, issued Dec. 24, 1996), the computational effort of artificial neural networks using associative learning elements as building blocks scales linearly with the number of connections, in contrast to the non-linear scaling in the conventional Hebbian learning-based networks. Given the typically large datasets necessary in machine learning, this can substantially downscale the training time, energy usage and network size.
Accordingly it is an object of the present disclosure to provide an associative learning element capable of input data association.
The project leading to this application has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 780848.
According to a first aspect of the present disclosure there is provided an optical associative learning element comprising a first waveguide, a second waveguide and a modulating element, wherein:
The first state may comprise a crystalline state of the modulating element. The second state may comprise a less crystalline state, e.g. an amorphous state, of the modulating element. The second state may be a state in which a fractional volume of the modulating element is amorphous and the remaining volume of the modulating element is crystalline. The fractional volume of the modulating element which is amorphous may be larger in the second state than in the first state.
Implementing the associative learning element on an optical platform offers the advantage of broad bandwidth and power-efficient data transmission using a CMOS-compatible fabrication process. Further, photonic networks are inherently scalable and therefore well-suited to implementing an on-chip artificial neural network based on interlinked associative learning elements according to the first aspect.
In some embodiments, the modulating element is configured to modify the amount of coupling between the first and second waveguides in the second directional coupler dependent on the state of the modulating element.
In some embodiments, the modulating element is not evanescently coupled to the second waveguide in the first directional coupler. For example, the modulating element may only extend over the second waveguide in the portion of the second waveguide corresponding to the second directional coupler.
In some embodiments, the first directional coupler is arranged such that when a first optical field is carried by the first waveguide in the absence of a second optical field being contemporaneously carried by the second waveguide, the residual intensity of the first optical field in the first waveguide at the interface between the first and second directional couplers is at least half the initial intensity of the first optical field, preferably greater than 80% of the initial intensity of the first optical field.
In other words, the first directional coupler is arranged to minimize single-input coupling between the first and second waveguides in the first directional coupler.
In some embodiments, the second directional coupler is arranged such that:
In some embodiments, the magnitude of the difference between I4 and I3 is less than or equal to 10% of the magnitude of the difference between I2 and I1, preferably less than or equal to 5% of the magnitude of the difference between I2 and I1, more preferably less than or equal to 1% of the magnitude of the difference between I2 and I1.
This is indicative of the modulating element regulating the output response of the learning element. The second state may be equivalent to a ‘post-learning’ (trained) state and the first state may be equivalent to a ‘before learning’ (untrained) state of the learning element. In the post-learning state, two similar optical input fields incident separately in the first and second waveguides of the learning element may produce similar outputs from the first waveguide after the second directional coupler. By contrast, in the before learning state the same two optical input fields may produce dissimilar outputs from the first waveguide after the second directional coupler. The two input fields may be analogous to unconditioned (UCS) and neutral/conditioned stimuli (NS/CS) as per classical conditioning. The output of the first waveguide after the second directional coupler may be analogous to the response (R) of the learning element, which is modulated by the modulating element.
In some embodiments, the first and second directional couplers are arranged such that the state of the modulating element can be switched from said first state to said second state by introducing a first optical field into the first waveguide contemporaneously with a second optical field into the second waveguide.
In some embodiments, the first optical field comprises a first optical pulse or a train of first optical pulses and the second optical field comprises a second optical pulse or a train of second optical pulses, wherein the first optical pulse or pulses are temporally overlapped with the second optical pulse or pulses in the first directional coupler. The first and second optical pulses may have a defined optical phase delay between them, such as, for example, a temporal delay in the range 0.66 fs to 1.155 fs, e.g. 0.825 fs for the waveguide structures of effective refractive index neff=1.59 at optical wavelength 1580 nm. This corresponds to a phase offset/delay in the range 0.4π radians to 0.7π radians, e.g. 0.5π radians.
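By way of illustration only, the quoted delays and phase offsets can be related by a short calculation. The relation used below, Δt = φ·λ/(2π·c·neff), is not stated in the disclosure; it is inferred here because it reproduces the quoted values (0.66 fs, 0.825 fs and 1.155 fs) to within rounding for neff=1.59 at 1580 nm. The function name is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def phase_to_delay_fs(phase_rad, wavelength_m=1580e-9, n_eff=1.59):
    """Temporal delay in fs for a given phase offset.

    Assumed relation (inferred from the quoted numbers, not from the text):
        dt = phase * wavelength / (2 * pi * c * n_eff)
    """
    return phase_rad * wavelength_m / (2.0 * math.pi * C * n_eff) * 1e15


# Phase offsets quoted in the text: 0.4*pi, 0.5*pi and 0.7*pi radians.
for frac in (0.4, 0.5, 0.7):
    delay = phase_to_delay_fs(frac * math.pi)
    print(f"{frac:.1f} pi rad -> {delay:.3f} fs")
```

The computed delays differ from the quoted ones only in the third decimal place, which is consistent with the quoted figures having been rounded.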
In some embodiments, the modulating element comprises a phase change material.
In some embodiments, the modulating element comprises a material comprising a compound or alloy of a combination of elements selected from the following list of combinations: GeSbTe, VOx, NbOx, GeTe, GeSb, GaSb, AgInSbTe, InSb, InSbTe, InSe, SbTe, TeGeSbS, AgSbSe, SbSe, GeSbMnSn, AgSbTe, AuSbTe, and AlSb.
In some embodiments, the second waveguide is tapered in the portion corresponding to the second directional coupler, such that a width of the second waveguide in the first directional coupler is greater than a corresponding width of the second waveguide in the second directional coupler.
In some embodiments, the width of the second waveguide in the first directional coupler is in the range 1.05 μm to 1.15 μm and the width of the second waveguide in the second directional coupler is in the range 0.95 μm to 1.04 μm and the second waveguide tapers over a distance in the range 0.4 μm to 0.6 μm.
In some embodiments, the length of the first directional coupler is in the range 1.5 μm to 3.0 μm and the length of the second directional coupler is in the range 10 μm to 20 μm. The ratio of the length of the first directional coupler to the length of the second directional coupler may be in the range 0.05 to 0.30.
In some embodiments, the gap between the first and second waveguides is in the range 0.05 μm to 0.15 μm.
It should be appreciated that the dimensions disclosed herein are exemplary only. The dimensions will in general depend on the effective refractive indices of the first and second waveguides that form the first and second directional couplers.
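As a convenience only, the exemplary ranges given above can be collected into a small check that flags which parameters of a candidate geometry fall outside them. The dictionary keys and the function name are hypothetical, and since the dimensions are exemplary, a flagged parameter does not imply a non-working device.

```python
# Exemplary ranges quoted in the text, in micrometres: (minimum, maximum).
RANGES_UM = {
    "w2_coupler1": (1.05, 1.15),   # second-waveguide width, first coupler
    "w2_coupler2": (0.95, 1.04),   # second-waveguide width, second coupler
    "taper_length": (0.4, 0.6),    # distance over which the waveguide tapers
    "len_coupler1": (1.5, 3.0),    # length of the first directional coupler
    "len_coupler2": (10.0, 20.0),  # length of the second directional coupler
    "gap": (0.05, 0.15),           # gap between the two waveguides
}
LENGTH_RATIO_RANGE = (0.05, 0.30)  # len_coupler1 / len_coupler2


def out_of_range(geometry_um):
    """Return the names of parameters outside the exemplary ranges."""
    bad = [
        name
        for name, (low, high) in RANGES_UM.items()
        if not low <= geometry_um[name] <= high
    ]
    ratio = geometry_um["len_coupler1"] / geometry_um["len_coupler2"]
    if not LENGTH_RATIO_RANGE[0] <= ratio <= LENGTH_RATIO_RANGE[1]:
        bad.append("length_ratio")
    return bad


# Hypothetical candidate chosen near the middle of each range.
candidate = {
    "w2_coupler1": 1.10,
    "w2_coupler2": 1.00,
    "taper_length": 0.5,
    "len_coupler1": 2.0,
    "len_coupler2": 15.0,
    "gap": 0.10,
}
print(out_of_range(candidate))
```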
According to a second aspect of the present disclosure there is provided a photonic chip comprising:
The optical phase delay introduced by the first and second spatial paths on the photonic chip may be in the range 0.66 fs to 1.155 fs, e.g. 0.825 fs. This corresponds to a phase offset/delay in the range 0.4π radians to 0.7π radians, e.g. 0.5π radians.
In some embodiments, the optical phase delay and the first directional coupler are arranged such that optical intensity is accumulated in the second waveguide at the interface between the first and second directional couplers of the learning element when both the first and second waveguides carry optical fields contemporaneously. The optical phase delay and the first directional coupler may be arranged to maximise the accumulated optical intensity.
In this manner, when the first and second optical fields, e.g. representative of UCS and NS/CS inputs, are incident together into the optical associative learning element, optical intensity is accumulated in the second waveguide which can lead to switching of the state of the modulating element from e.g. a crystalline state to a less crystalline/amorphous state. This results in a change of the output response R of the learning element such that UCS and NS/CS single inputs result in a similar output after the state of the modulating element has been switched, which is indicative of associative learning.
In some embodiments, the optical phase delay and the first directional coupler are together arranged such that when a first optical field is carried by the first waveguide and contemporaneously a second optical field is carried by the second waveguide, the first directional coupler transfers at least a portion, e.g. at least 10% or at least 20% or at least 50% or at least 80%, of the initial intensity of the first optical field from the first waveguide to the second waveguide, such that the total optical intensity in the second waveguide at the interface between the first and second directional couplers is greater than the total optical intensity in the second waveguide at the start of the first directional coupler.
The portion of the intensity of the second optical field transferred from the second waveguide to the first waveguide may be less than 10%, e.g. less than 5% or more preferably less than 1%. In other words, in the first directional coupler, the total optical intensity associated with first and second optical fields carried by the first and second waveguides respectively is substantially accumulated in the second waveguide at the interface between the first and second directional couplers. For example, at least 80% of the total intensity may be accumulated in the second waveguide, preferably at least 90%, more preferably at least 95%.
According to a third aspect of the present disclosure there is provided an optical system comprising:
In some embodiments the light source comprises a first laser, a second laser and an optical combiner.
In some embodiments, the first laser is arranged to produce first optical pulses having a first wavelength and the second laser is arranged to produce second optical pulses having a second wavelength, different from the first wavelength.
In some embodiments, the optical combiner is arranged to receive the first and second optical pulses from the first and second lasers and combine them into a common spatial mode.
In some embodiments, an output of the optical combiner is coupled to the input coupler of the photonic chip.
In some embodiments, the first and second spatial paths on the photonic chip comprise a first ring resonator and a second ring resonator respectively.
In some embodiments, the first ring resonator is arranged to select said first wavelength from said first portion and the second ring resonator is arranged to select said second wavelength from said second portion.
In some embodiments, the outputs of the first and second ring resonators are coupled to the first and second waveguides respectively of the optical associative learning element, prior to the first directional coupler.
In some embodiments, the detector arrangement comprises a beam splitter, a first optical tuneable filter, a second optical tuneable filter, a first photodiode and a second photodiode.
In some embodiments, the beam splitter is arranged to split the optical intensity from the first waveguide into a first spatial mode and a second spatial mode.
In some embodiments, the first optical tuneable filter is arranged to select the first wavelength in the first spatial mode and the second optical tuneable filter is arranged to select the second wavelength in the second spatial mode.
In some embodiments, the first photodiode is arranged to detect optical intensity after the first optical tuneable filter and the second photodiode is arranged to detect optical intensity after the second optical tuneable filter.
In some embodiments, the optical system further comprises a controller arranged to control the light source to produce a pre-determined sequence of optical fields.
In some embodiments, the controller is further arranged to receive one or more readouts from the detector arrangement.
In some embodiments, the controller is further arranged to determine a learning status of the optical associative learning element based on the one or more readouts.
In some embodiments, the light source further comprises a third laser arranged to provide optical pump pulses for resetting the state of the modulating element to a predetermined state, thereby undoing a prior training process of the optical associative learning element.
In some embodiments, the controller is arranged to determine a learning status of the optical associative learning element by controlling the light source to transmit first and second optical fields having first and second wavelengths through the first and second waveguides respectively and monitoring the output of the detector arrangement and determining therefrom optical transmittance factors of the first and second optical fields through the optical associative learning element.
In some embodiments, the controller is arranged to determine that the optical associative learning element is in a trained state if the optical transmittance factors of the first and second optical fields through the optical associative learning element are within 10% of each other, preferably if they are within 5% of each other.
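The trained-state criterion can be sketched as a simple comparison. The disclosure does not define how "within 10% of each other" is to be computed; the sketch below assumes the difference is taken relative to the larger of the two transmittance factors. The function name and the sample values are hypothetical.

```python
def is_trained(t_first, t_second, tolerance=0.10):
    """Return True if the two transmittance factors are within `tolerance`.

    Assumption: the difference is measured relative to the larger factor.
    """
    larger = max(abs(t_first), abs(t_second))
    if larger == 0.0:
        return True  # both factors are zero, so they are trivially equal
    return abs(t_first - t_second) <= tolerance * larger


# Hypothetical readouts for the first and second optical fields.
print(is_trained(0.50, 0.47))                  # default 10% criterion
print(is_trained(0.50, 0.47, tolerance=0.05))  # stricter 5% criterion
```

Under a different reading of the criterion (for example, a difference relative to the mean of the two factors), only the normalisation line would change.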
According to a fourth aspect of the present disclosure, there is provided an optical artificial neural network, comprising a plurality of optical associative learning elements according to the first aspect, wherein at least two of the optical associative learning elements are coupled together.
In some embodiments, the output of the first waveguide of a first one of the plurality of optical associative learning elements is coupled to the input of the first or second waveguide of a second one of the plurality of optical associative learning elements.
In some embodiments, the optical artificial neural network further comprises a controller configured to implement a pattern recognition algorithm on the optical artificial neural network.
According to a fifth aspect of the present disclosure, there is provided a method of performing an associative learning operation in the optical domain using an optical associative learning element according to the first aspect, the method comprising: providing first and second optical fields contemporaneously to the first and second waveguides respectively thereby modifying a state of the modulating element.
In some embodiments, modifying a state of the modulating element comprises changing the state of the modulating element from a more crystalline state to a less crystalline state, such as an amorphous state.
In some embodiments, the first directional coupler accumulates optical intensity associated with the first and second optical fields in the second waveguide at the interface between the first and second directional couplers.
In some embodiments, the method further comprises selecting a relative phase delay between the first and second optical fields in order to maximize an accumulated optical intensity in the second waveguide at the interface between the first directional coupler and the second directional coupler.
In some embodiments, the step of selecting a relative phase delay comprises performing a numerical simulation of the device in order to determine an optimal value of the relative phase delay.
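A much-simplified stand-in for such a simulation is sketched below. It replaces the device with an ideal, lossless, symmetric directional coupler described by the standard coupled-mode transfer matrix, and searches a grid of relative phases for the one that maximises the intensity in the second waveguide. The coupling parameter `kappa_l`, the unit input amplitudes and the sign convention (the field in the first waveguide carries the extra phase) are assumptions for illustration; a simulation of the actual device would be needed to obtain a design value.

```python
import cmath
import math


def intensity_in_second_waveguide(phase_rad, kappa_l=0.3):
    """Intensity in waveguide 2 after an ideal lossless symmetric coupler.

    Both inputs have unit amplitude; the field in waveguide 1 carries an
    extra phase `phase_rad` relative to the field in waveguide 2.
    """
    a1 = cmath.exp(1j * phase_rad)
    a2 = 1.0 + 0.0j
    out2 = -1j * math.sin(kappa_l) * a1 + math.cos(kappa_l) * a2
    return abs(out2) ** 2


def best_phase(steps=2000, kappa_l=0.3):
    """Grid-search the relative phase in [0, 2*pi) maximising the intensity."""
    grid = [2.0 * math.pi * k / steps for k in range(steps)]
    return max(grid, key=lambda p: intensity_in_second_waveguide(p, kappa_l))


phi = best_phase()
print(f"optimal relative phase ~ {phi / math.pi:.3f} pi rad")
```

In this toy model the intensity in the second waveguide is 1 + sin(2κL)·sin(φ), so the search settles at a relative phase of 0.5π radians, which is consistent with the example value given for the phase offset.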
In some embodiments, the step of selecting a relative phase delay comprises configuring a light source to provide first and second optical fields having a defined phase delay to each other, e.g. the optimal phase delay, to the first and second waveguides of the learning element respectively.
In some embodiments, the method further comprises determining a learning status of the device by determining optical transmittance factors through the device for optical fields coupled to inputs of the first and second waveguides.
In some embodiments, the device is deemed to be in a trained state if said optical transmittance factors are within 10% of each other, preferably within 5% of each other.
In some embodiments, the method further comprises resetting the device by providing pump optical pulses to the second waveguide in order to crystallise the modulating element.
According to a sixth aspect of the present disclosure, there is provided a system comprising an optical associative learning element according to the first aspect coupled to a light source, wherein the light source is operable to provide first and second optical fields to the first and second waveguides of the first directional coupler, the first and second optical fields having a pre-determined relative phase delay between them, wherein the relative phase delay and the first directional coupler are together arranged such that optical intensity is accumulated in the second waveguide at the interface between the first and second directional couplers.
In this manner, when the first and second optical fields, e.g. representative of UCS and NS/CS inputs, are incident together into the optical associative learning element, optical intensity is accumulated in the second waveguide which can lead to switching of the state of the modulating element from e.g. a crystalline state to a less crystalline/amorphous state. This results in a change of the output response R of the learning element such that UCS and NS/CS single inputs result in a similar output after the state of the modulating element has been switched.
The light source may comprise a controller which is operable to adjust the relative phase delay between the first and second optical fields.
The light source may comprise first and second phase-locked lasers and a phase modulator operable to adjust the relative phase delay between outputs of the first and second lasers to provide said first and second optical fields to the learning element.
The relative phase delay may be in the range 0.66 fs to 1.155 fs, e.g. 0.825 fs. This corresponds to a phase offset/delay in the range 0.4π radians to 0.7π radians, e.g. 0.5π radians.
The system may further comprise a detector arrangement coupled to the first waveguide of the optical associative learning element at the output of the second directional coupler thereof.
The system may further comprise a controller arranged to control the light source to produce a pre-determined sequence of optical fields. The controller may also be arranged to monitor an output of the detector arrangement, e.g. to determine a learning status of the learning element.
The features (including optional features) of any aspect may be combined with those of any other aspect, as appropriate.
Example embodiments will be described, by way of example only, with reference to the drawings, in which:
It should be noted that the Figures are diagrammatic and not drawn to scale. Relative dimensions and proportions of parts of these Figures have been shown exaggerated or reduced in size, for the sake of clarity and convenience in the drawings. The same reference signs are generally used to refer to corresponding or similar features in modified and different embodiments.
Classical conditioning was initially described in Ivan Pavlov's dog experiment in 1927. In the experiment, food was the UCS that triggered an unconditioned response (UCR), i.e. the dog's salivation, while the sound of the ringing bell was the NS or CS. The bell (NS/CS) only triggered the salivation response R after the ringing bell was associated by repetition with food. Thus, these initially distinct responses eventually converged to a single response after repeated co-occurrence of the stimuli, which associated the stimuli.
Two main roles of the simplified neural circuitry of
According to the present disclosure, with reference to
The optical associative learning element 200 is arranged to accumulate optical intensity in the second waveguide 204 at the interface 214 between the first 208 and second 210 directional couplers when both the first 202 and second 204 waveguides carry optical fields contemporaneously, i.e. when the optical fields carried by the first 202 and second 204 waveguides are substantially overlapping in time. For example, the first directional coupler 208 is arranged such that when a first optical field is carried by the first waveguide 202 and contemporaneously a second optical field is carried by the second waveguide 204, the first directional coupler 208 transfers at least a portion of the intensity of the first optical field from the first waveguide 202 to the second waveguide 204, such that the total optical intensity in the second waveguide 204 at the interface 214 between the first 208 and second 210 directional couplers is greater than the total optical intensity in the second waveguide 204 at the start of the first directional coupler 208.
In this manner, the net optical energy/intensity from both the inputs (UCS and NS/CS) is converged in the lower waveguide 204 of the first coupler 208. This can cause a fractional volume of the modulating element 206 to be switched to a different state. For example, the modulating element may comprise a phase change material (PCM) and the converged optical energy/intensity causes a fractional volume of the modulating element 206 to be switched from a crystalline state to an amorphous state. With more converging learning optical fields (e.g. pulses), a larger volume of material of the modulating element 206 switches from crystalline to amorphous which could be considered to correspond to a switching from a “before learning” state to an “after learning” state. This is illustrated in
In some embodiments, the PCM 206 deposited on the second waveguide 204 is a germanium antimony tellurium alloy Ge2Sb2Te5 (GST). In general, the modulating element 206 comprises a material comprising a compound or alloy of a combination of elements selected from the following list of combinations: GeSbTe, VOx, NbOx, GeTe, GeSb, GaSb, AgInSbTe, InSb, InSbTe, InSe, SbTe, TeGeSbS, AgSbSe, SbSe, GeSbMnSn, AgSbTe, AuSbTe, and AlSb. GST is well-suited as it has a low structural phase transition time (sub-ns amorphization and few-ns crystallization time), high cycling endurance (~10¹² cycles), and long retention time (>10 years at room temperature). In some embodiments, a thin capping layer of indium tin oxide (ITO) may be additionally deposited on the PCM cell to prevent oxidation, and to localize optically-induced heat for PCM structural phase switching.
The first directional coupler 208 performs the function of determining the input optical intensity combinations (input to the first 202 and second 204 waveguides) that sufficiently trigger the associative learning process. Meanwhile, the second directional coupler 210 regulates the output response R which is measured from the output of the first waveguide 202. The lower (second) waveguide 204 of the first directional coupler 208 is the site where optical energy from the UCS and NS/CS inputs accumulates for structural phase switching to occur in the modulating element 206, thereby regulating the output response R. It is desirable that the optical associative learning element 200 associatively learns only upon two-input incidence, i.e. when optical fields are present in the first 202 and second waveguides 204 contemporaneously. As mentioned above, the first directional coupler 208 is configured to accumulate optical intensity in the lower waveguide 204 at the interface 214 for switching the state of the modulating element 206—which constitutes the learning process. The regulation of the output response R is performed by the second coupler 210. This is measured upon one-input incidence, i.e. a single optical field incident either in the first waveguide 202 or the second waveguide 204.
In embodiments, the photonic chip 416 comprises a first ring resonator 410 and a second ring resonator 412 in the first 422 and second 424 spatial paths respectively. The light source 402 comprises a first laser 404, a second laser 406 and an optical combiner 408. The first laser 404 is arranged to produce first optical pulses having a first wavelength λB. The second laser 406 is arranged to produce second optical pulses having a second wavelength λA, in general different from the first wavelength. The optical combiner 408 is arranged to receive the first and second optical pulses from the first and second lasers and combine them into a common spatial mode, e.g. in a single fiber optical cable or waveguide. The first ring resonator is arranged to receive a first portion of the output intensity of the optical combiner and the second ring resonator is arranged to receive a second portion of the output intensity of the optical combiner. The first ring resonator is arranged to select said first wavelength from said first portion and the second ring resonator is arranged to select said second wavelength from said second portion. The outputs of the first and second ring resonators are coupled to the first 202 and second 204 waveguides of the learning element 200 respectively, prior to the first directional coupler 208. The first 404 and second 406 lasers represent the UCS and NS/CS stimuli. After passing through the ring resonators, only UCS or NS/CS is selected for each waveguide at the resonant wavelength λB or λA and sent to the element 200. Thus, control of wavelengths helps regulate the device operation.
The optical phase difference or optical delay between the optical fields carried by the first 202 and second waveguides 204 in the learning element 200 affects how energy is coupled in the first directional coupler 208 and therefore the extent to which optical energy/intensity is accumulated in the lower waveguide 204. The relative time delay of the optical phases between the UCS (upper waveguide 202) and NS/CS (lower waveguide 204) inputs may be denoted Δt = tUCS − tNS/CS, where tUCS and tNS/CS are the times at which the respective optical field input signals UCS and NS/CS are referenced to the same point in phase. In embodiments, phase delay control is achieved using an on-chip photonic layout 416 which contains the learning element 200 in addition to the ring resonators 410, 412 and spatial paths 422 and 424. The layout 416 locks the time delay of the phases (phase delay) as a function of spatial path length difference from the optical splitter 418, contained on the layout 416 and arranged to receive the output of the optical combiner 408, to the first 202 and second 204 waveguide inputs of the element 200. Given the broadband response of the optical element, the relative time delay of the phases to the waveguide inputs of the element can be precisely defined with respect to the input wavelength to the layout. On the other hand, to enable single input incidences to the element, a respective ring resonator 410, 412 is coupled to the two waveguide paths prior to the input ports of the element. The single UCS (NS/CS) input is incident when the input to the on-chip layout is of ring B (A) resonant wavelength λB (λA). By precisely defining the optical wavelength of the input laser source, the on-chip layout sorts both the single- and two-input incidences to the element. Simultaneous real-time monitoring of the element is carried out by using a photodetector 414 to measure the output transmission and thereby determine the learning element response R.
Example physical parameters of the learning element 200 were determined using coupled mode theory. The first directional coupler 208 effectively performs the function of determining the input optical field combinations that sufficiently trigger the associative learning process, whilst the second directional coupler 210 is used for regulating the output response R. In other words, two-input coupling to the second waveguide 204 at the interface 214 between the first 208 and second 210 directional couplers should be enhanced and ‘one-input coupling’ from the first waveguide 202 to the second waveguide 204 should be impeded by exploiting the critical coupling length contrast between the one-input case and two-input case. On the other hand, in the second directional coupler 210, the difference in output response R due to the loss contrast between the PCM 206 structural states that represent the before and after learning cases is exploited.
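The contrast between the one-input and two-input cases can be illustrated with a textbook coupled-mode model. The sketch below treats the first directional coupler as an ideal, lossless, symmetric coupler and compares the fraction of the output intensity found in the second waveguide for a single input and for two inputs with a 0.5π relative phase. The value of `KAPPA_L` (coupling coefficient times coupler length) is hypothetical and is not derived from the example dimensions; the model also ignores the modulating element, waveguide asymmetry and loss.

```python
import cmath
import math


def coupler_outputs(a1, a2, kappa_l):
    """Output field amplitudes of an ideal lossless symmetric coupler."""
    c, s = math.cos(kappa_l), math.sin(kappa_l)
    return c * a1 - 1j * s * a2, -1j * s * a1 + c * a2


def fraction_in_second(a1, a2, kappa_l):
    """Fraction of the total output intensity found in waveguide 2."""
    b1, b2 = coupler_outputs(a1, a2, kappa_l)
    p1, p2 = abs(b1) ** 2, abs(b2) ** 2
    return p2 / (p1 + p2)


KAPPA_L = 0.3  # hypothetical value, chosen only for illustration

# One input: a field in waveguide 1 only.
one_input = fraction_in_second(1.0, 0.0, KAPPA_L)

# Two inputs: equal amplitudes, field in waveguide 1 leading by 0.5*pi.
two_input = fraction_in_second(cmath.exp(1j * math.pi / 2), 1.0, KAPPA_L)

print(f"one input : {one_input:.3f} of the output is in waveguide 2")
print(f"two inputs: {two_input:.3f} of the output is in waveguide 2")
```

In this model the one-input cross-coupling scales as sin²(κL), whereas the two-input transfer scales as sin(2κL), so a short coupler suppresses the former while retaining much of the latter. This is one way to picture the coupling length contrast described above.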
In one example, the first directional coupler 208 has a length of 2 μm and the second directional coupler 210 has a length of 15 μm. The first waveguide 202 is a plain waveguide consistently of nominal width 0.9 μm. In the second waveguide 204, the width of the segment corresponding to the first directional coupler 208 is 0.9 μm, whereas the width of the segment corresponding to the second directional coupler 210 is tapered from 0.9 μm to 0.8 μm. This tapering compensates for the non-zero permittivity of the PCM 206 which contributes to the effective refractive index of the waveguide. The tapering therefore provides optimal inter-waveguide coupling with a waveguide separation gap of 0.1 μm. However, it should be appreciated that the tapering is not essential and the learning element can still function without it. It should be appreciated that the learning element 200 capitalizes on PCM optical loss contrast between the two phases (crystalline and amorphous) to absorb and direct the optical field before and after the learning process. The use of directional couplers ensures the applicability of the element over a broad optical wavelength range.
Simulations were performed based on the above exemplary dimensions of the learning element 200 using three-dimensional finite-difference time-domain (FDTD) numerical simulation, the results of which are shown in
Experimental results are presented in
At the start of the experiment, UCS pump input pulses at 14.5 mW power were sent into the learning element 200 (in the first waveguide 202) in events 1 and 2. It was observed that the readouts remained at the baselines. The readouts likewise remained the same when only NS pump input pulses at 14.5 mW power were sent into the learning element 200 (in the second waveguide 204) in events 3 and 4. However, when both UCS and NS pump pulses were sent together with a fixed phase delay (according to this example, Δt=0.825 fs) at 6.6 mW each (13.2 mW total) in event 5, the transmission change (ΔTr/Tr0) for the UCS and NS probe readouts changed by ~−4% and ~+4% respectively. As the input pump pulse power was increased from 6.6 mW to 14.5 mW each in events 6-8, the probe readouts further changed by nearly −7% and +7% respectively, both of which were well above the UCR/CR response threshold at Tr~0.07. The experiment confirms the association of input NS/CS (analogous to the ringing bell in Pavlov's dog experiment) to input UCS (analogous to the food in Pavlov's dog experiment) through its output CR which is the learned response from UCR (analogous to salivation in Pavlov's dog experiment), after the temporal pairing of UCS and NS pump inputs in events 5-8 that caused the PCM to switch towards a more amorphous state (in contrast to that in events 1-4).
The reversibility of the associative learning process is further shown in
With reference to
It should be appreciated that in embodiments, Δt influences the accumulated optical pump field at the interface 214 between the first 208 and second 210 directional couplers of the learning element 200.
The Δt-dependence of the coupling provides greater on-demand control to generalize, discriminate and scale the pulse wavelengths that can induce the learning process when both inputs are sent to the learning element 200. Given the sinusoidal/modular nature of Δt, sending both pump pulses can produce the same output probe response at a set of predetermined regularly-spaced wavelengths, in contrast to the single-input incidence case. The wavelength-insensitive feature of the element upon single-input incidence is due to the non-temporally resonant cascaded structures (i.e. the cascaded first 208 and second 210 directional couplers) that make up the element, whose broadband response is limited only by the change in coupling strength as the wavelength is varied. The timing-dependent plasticity of the associative learning element is consistent with the STDP rule albeit at a different order, thus permitting the associative implementation of input-input temporal contiguity in photonic neuromorphic systems.
Table 1 summarizes the minimum active volume and learning energy of other associative learning devices, except that of the synthetic biological genetic device which cannot be determined. These known electronic and optoelectronic associative learning devices range from ~0.1 to 10¹⁰ μm³ in active volume and consume ~2.63 to 10⁵ nJ of energy per learning event. In comparison, the all-optical associative learning element 200 according to the present disclosure exhibits favourable characteristics in terms of dimensions and energy usage, with a low active volume of 0.12 μm³ and a minimum learning energy of 1.8 nJ. In an embodiment, the single-element device has an area of 3 μm × 17 μm.
The associative learning element 200 of the present disclosure may be employed as a building block in artificial neural networks, with reduced energy consumption, as is apparent from the data presented in Table 1. Conventional artificial neural networks based on the Hebbian learning rule adopt the backpropagation algorithm, with an inherent nonlinear scaling (O(N²~N³)) of computational effort with the synaptic connection number N. In contrast, the computational effort in neural networks that are built on associative learning scales linearly (O(N)) with N (see U.S. Pat. No. 5,588,091). Considering the typically large training input datasets required to solve a particular machine learning task, it follows that the number of iterations needed to achieve convergence can be significantly reduced by using associative learning elements, thus substantially downscaling the training time and energy usage of the neural network. Therefore it should be appreciated that the present disclosure also provides photonic neural networks built on the optical associative learning element 200 according to the present disclosure, with applications in noisy pattern recognition and classification, for example.
The relation between the learning element 200 output response R and input stimuli S can be expressed in the compact matrix notation R=M(II) M(I) S, where the 2×1 column vector S=(UCS, NS/CS)^T, while the 2×2 and 1×2 matrices that describe the first 208 and second 210 directional couplers respectively are given by:
in which κ is the waveguide mode coupling coefficient, θb=cosh⁻¹(γcrys/4κ), θa=sin⁻¹(γam/4κ), l1 is the length of the first directional coupler 208 and l2 is the length of the second directional coupler 210.
In the first directional coupler 208, when two identical inputs E0 of the same wavelength λ0 are sent into the learning element 200, the total field coupled to the respective waveguides at the interface 214 between the first 208 and second 210 directional couplers is the product of matrix M(I) and column vector $(e^{-i\omega\Delta t}, 1)^T$, where ω=(2πc/λeff), c is the vacuum speed of light, λeff=λ0/neff is the effective wavelength in the waveguide, and neff is the effective refractive index in the waveguide. It follows that the field intensity at the second waveguide 204 at the interface 214 is $|E_{l_1}|^2_{\mathrm{two}}=E_0^2(1+\sin(2\kappa l_1)\sin(\omega\Delta t))$. In comparison, for one-input incidence, the coupled field intensity is $|E_{l_1}|^2_{\mathrm{one}}=E_0^2\sin^2(\kappa l_1)$. Thus, the critical coupling (maximum energy transfer) length of the first directional coupler 208 is lcrit=π/(2κ) for one-input incidence and lcrit/2 (at ωΔt=π/2) for two-input incidence. Given κ=0.157 μm⁻¹ in embodiments of the present disclosure, this gives, in units of $E_0^2$, $|E_{l_1}|^2_{\mathrm{two}}=1.588$ (for ωΔt=π/2) and $|E_{l_1}|^2_{\mathrm{one}}=0.095$ at l1=2 μm. From a PCM 206 switching energy threshold perspective, the ratio $|E_{l_1}|^2_{\mathrm{two}}/(1-|E_{l_1}|^2_{\mathrm{one}})=1.755$ is indicative of associative learning because of the significant energy surplus upon two-input incidence relative to the maximum energy from one-input incidence.
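The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that evaluates the two closed-form intensity expressions with intensities normalized to E0² and ωΔt=π/2; the function and variable names are illustrative only and do not appear in the disclosure.

```python
import math

KAPPA = 0.157  # coupling coefficient in 1/um, as quoted in the text
L1 = 2.0       # length of the first directional coupler in um


def two_input_intensity(kappa, length, phase):
    """Normalized intensity in the second waveguide for two equal inputs.

    `phase` stands for the product omega * delta_t.
    """
    return 1.0 + math.sin(2.0 * kappa * length) * math.sin(phase)


def one_input_intensity(kappa, length):
    """Normalized intensity coupled across for a single input."""
    return math.sin(kappa * length) ** 2


two = two_input_intensity(KAPPA, L1, math.pi / 2.0)
one = one_input_intensity(KAPPA, L1)
ratio = two / (1.0 - one)

# Lengths at which energy transfer is maximal for one and two inputs.
one_input_critical = math.pi / (2.0 * KAPPA)
two_input_critical = math.pi / (4.0 * KAPPA)

print(f"two-input intensity: {two:.3f}")
print(f"one-input intensity: {one:.3f}")
print(f"ratio:               {ratio:.3f}")
print(f"critical lengths:    {one_input_critical:.2f} um, {two_input_critical:.2f} um")
```

With these inputs the script reproduces the values 1.588, 0.095 and 1.755 stated above, and gives critical lengths of roughly 10 μm and 5 μm.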
In the second directional coupler 210, the relative change in output response R, which is measured for one-input incidences, can be estimated largely based on M(II) because $|E_{l_1}|^2_{\mathrm{one}}$ in the first cascade (l1=2 μm) is negligibly low. Thus, the ratio $\eta=|R_{\mathrm{NS/CS}}/R_{\mathrm{UCS}}|^2$ can be approximated as $\eta\approx|M^{(II)}_{12}/M^{(II)}_{11}|^2$. This leads to $\eta_b\approx|\sinh(\kappa l_2\sinh\theta_b)/\sinh(\kappa l_2\sinh\theta_b+\theta_b)|^2$ and $\eta_a\approx|\sin(\kappa l_2\cos\theta_a)/\cos(\kappa l_2\cos\theta_a-\theta_a)|^2$ (subscripts 'b' and 'a' denote before and after learning). Additionally, the output transmission difference between UCRb and UCRa can be denoted as $\Delta|R|^2=|M^{(II)}_{11b}|^2-|M^{(II)}_{11a}|^2$ where the alphabetic subscripts likewise denote the learning states. Given γcrys=7.65κ and γam=0.24κ according to the present disclosure, ηb≈0.072 and ηa≈1.006 at l2=15 μm. Therefore it is possible to attain ηb≪ηa due to the unbounded sinh and positive unbounded cosh functions which cause ηb→0 with the substantially large γcrys. The set of relations ηb≪ηa and ηa≈1 is the second signature of associative learning because the output R upon NS/CS input incidence transitions from a significantly low value (ηb≪ηa) to that of UCS (ηa≈1) which remains within the same transmission range (Δ|R|²<0.5).
In embodiments, the optical associative learning element 200 was fabricated on a Si3N4/SiO2 platform. Electron beam lithography (JEOL 5500FS, JEOL Ltd.) was used at 50 kV to define the Si3N4 structure on the Ma-N 2403 negative-tone resist-coated substrate. After the development process, reactive ion etching (PlasmaPro 80, Oxford Instruments) was performed in CHF3/O2/Ar to etch down 330 nm of Si3N4. A subsequent step of electron beam lithography was implemented on a poly(methyl methacrylate) (PMMA) positive resist-coated substrate to open a window for the PCM cell. This was followed by the sputter-deposition of 10-nm GST/10-nm ITO on the substrate. The element characterization process was performed using a high resolution emission gun SEM (Hitachi S-4300 SEM system—Ibaraki, Japan) with low accelerating voltage (1 to 3 kV) at a working distance of ˜13 mm.
An exemplary optical setup 1000 employing an optical associative learning element 200 according to the present disclosure is illustrated schematically in
In some embodiments, the optical associative learning element 200 consists of two cascaded optical directional couplers 208 and 210. For brevity and consistency, these are referred to as cascade I and II in the following paragraphs. The directional couplers, made up of two parallel channel optical waveguides 202 and 204 in close proximity, allow optical energy exchange between the guided modes of adjacent waveguides. The lower waveguide 204 of cascade I (segment L1) is the site where optical energy from the UCS and NS/CS inputs accumulates for PCM 206 structural phase switching to occur at the lower waveguide 204 of cascade II (segment L2), thus regulating the output response R of the element 200.
For the case of a lossy bottom waveguide with similar propagation constants, one can theoretically treat the optical modes in the element starting from the coupled-mode equations da/dx=iκb and db/dx=iκa−(γ/2)b where the normalized x-direction spatially dependent mode amplitudes of the coupled upper and lower waveguides are denoted by a and b; κ is the coupling coefficient, and γ is the loss coefficient of mode b due to the PCM. To ensure the relevance of these equations, the difference in propagation constant is compensated by tapering the second waveguide 204 on which the PCM patch 206 is deposited, which is comparable to using a lossy material with diminishing real permittivity in passive parity-time symmetric directional couplers. Because cascades I and II are respectively without and with the PCM 206, the modes in cascade II are first solved for and then conveniently it is possible to obtain the solution for cascade I by letting γ→0, before cascading both matrices to solve for the output R with respect to the UCS and NS/CS inputs.
For γ/4κ≤1, given the [−1, +1] range of a sine function, let γ/4κ=sin θ to arrive at
where a0 and b0 are the fields a(x=0) and b(x=0), which we relate to the general notations a(x) and b(x) after applying the initial boundary condition to the equations. For γ/4κ>1, let γ/4κ=cosh θ given the [1, +∞) range of a hyperbolic cosine function. Following through the same procedure, this gives
From equation 3 the input-output relation of cascade I is obtained by letting γ→0.
To describe the output R as a function of the inputs UCS and CS, one can multiply the 2×2 matrix in equation 3 after letting γ→0 for cascade I by that of equation 3 when γ/4κ≤1 or equation 4 when γ/4κ>1 for cascade II. The 2×2 matrix in cascade II can be reduced to a 1×2 matrix because only the output field on the upper waveguide 202 of cascade II represents the output R. The equation for the overall system can thus be concisely written as R=M(II) M(I) S where S=(UCS, NS/CS)T is the column vector that denotes the respective inputs to the element while the matrices M(I) and M(II) respectively describe the optical coupling tendencies in the cascaded sections of the lengths x=l1 and x=l2.
in which θ=sin−1(γ/4κ) when γ/4κ≤1 and θ=cosh−1(γ/4κ) when γ/4κ>1. Here, the inputs to the first and second cascades are respectively at x1=0 and x2=0.
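For reference, the transfer relations for the two loss regimes can be sketched by solving the coupled-mode equations da/dx=iκb and db/dx=iκa−(γ/2)b directly for initial amplitudes a0 and b0. The forms below are a reconstruction under that assumption, with φ and ψ introduced here as shorthand; prefactors and sign conventions may differ from those used in the numbered equations of the disclosure.

```latex
% Regime gamma/(4 kappa) <= 1, with sin(theta) = gamma/(4 kappa) and phi = kappa x cos(theta)
\begin{pmatrix} a(x) \\ b(x) \end{pmatrix}
= \frac{e^{-\kappa x \sin\theta}}{\cos\theta}
\begin{pmatrix}
\cos(\varphi - \theta) & i\sin\varphi \\
i\sin\varphi & \cos(\varphi + \theta)
\end{pmatrix}
\begin{pmatrix} a_0 \\ b_0 \end{pmatrix},
\qquad \varphi = \kappa x \cos\theta

% Regime gamma/(4 kappa) > 1, with cosh(theta) = gamma/(4 kappa) and psi = kappa x sinh(theta)
\begin{pmatrix} a(x) \\ b(x) \end{pmatrix}
= \frac{e^{-\kappa x \cosh\theta}}{\sinh\theta}
\begin{pmatrix}
\sinh(\psi + \theta) & i\sinh\psi \\
i\sinh\psi & \sinh(\theta - \psi)
\end{pmatrix}
\begin{pmatrix} a_0 \\ b_0 \end{pmatrix},
\qquad \psi = \kappa x \sinh\theta

% Lossless limit gamma -> 0 (theta -> 0), which describes cascade I
M^{(I)} =
\begin{pmatrix}
\cos(\kappa l_1) & i\sin(\kappa l_1) \\
i\sin(\kappa l_1) & \cos(\kappa l_1)
\end{pmatrix}
```

Because only the field on the upper waveguide 202 represents the output R, the 1×2 matrix for cascade II corresponds to the top row of the relevant 2×2 matrix evaluated at x=l2. The ratio of the magnitudes of its two entries is consistent with the expressions for ηb and ηa given earlier.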
From eigenmode simulations of the structure, estimates of the parameter values were obtained as κ=0.157 μm⁻¹, γcrys=7.65κ and γam=0.24κ using the eigenvalue splitting equation $\Delta\beta_\pm=2i(\kappa^2+(\gamma/4\kappa)^2)^{1/2}$ which directly follows from the coupled mode equations, where γcrys and γam are the loss coefficients γ when the PCM is in the crystalline and amorphous structural phases, respectively. Because γcrys/4κ>1 and γam/4κ≤1, equation 5 and equation 6 can be written respectively as equations 1 and 2 above.
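As a numerical cross-check of the coupled-mode treatment, the sketch below propagates the two mode amplitudes through a lossy coupler in two ways: by exponentiating the 2×2 coupled-mode operator, and by evaluating closed-form trigonometric and hyperbolic solutions obtained by solving the same equations by hand. The closed forms, the function names and the use of an eigendecomposition are assumptions of this example rather than the formulation of the disclosure; only the parameter values are taken from the text.

```python
import numpy as np

KAPPA = 0.157             # coupling coefficient, 1/um (from the text)
GAMMA_CRYS = 7.65 * KAPPA  # loss coefficient, crystalline PCM (from the text)
GAMMA_AM = 0.24 * KAPPA    # loss coefficient, amorphous PCM (from the text)
L2 = 15.0                  # length of the second coupler, um


def transfer_numeric(kappa, gamma, x):
    """exp(A x) for d/dx [a, b] = A [a, b], via eigendecomposition of A."""
    A = np.array([[0.0, 1j * kappa],
                  [1j * kappa, -gamma / 2.0]], dtype=complex)
    eigvals, eigvecs = np.linalg.eig(A)
    return eigvecs @ np.diag(np.exp(eigvals * x)) @ np.linalg.inv(eigvecs)


def transfer_closed_form(kappa, gamma, x):
    """Hand-derived solution of the same equations (assumed form)."""
    r = gamma / (4.0 * kappa)
    if r < 1.0:
        theta = np.arcsin(r)
        phi = kappa * x * np.cos(theta)
        pref = np.exp(-kappa * x * np.sin(theta)) / np.cos(theta)
        return pref * np.array([[np.cos(phi - theta), 1j * np.sin(phi)],
                                [1j * np.sin(phi), np.cos(phi + theta)]])
    if r > 1.0:
        theta = np.arccosh(r)
        psi = kappa * x * np.sinh(theta)
        pref = np.exp(-kappa * x * np.cosh(theta)) / np.sinh(theta)
        return pref * np.array([[np.sinh(psi + theta), 1j * np.sinh(psi)],
                                [1j * np.sinh(psi), np.sinh(theta - psi)]])
    # gamma = 4*kappa is a degenerate case that this sketch does not handle.
    raise ValueError("gamma / (4 * kappa) == 1 is not covered")


for label, gamma in [("crystalline", GAMMA_CRYS),
                     ("amorphous", GAMMA_AM),
                     ("lossless", 0.0)]:
    numeric = transfer_numeric(KAPPA, gamma, L2)
    closed = transfer_closed_form(KAPPA, gamma, L2)
    deviation = np.max(np.abs(numeric - closed))
    print(f"{label:12s} max |numeric - closed form| = {deviation:.2e}")
```

Agreement between the two routes in all three cases indicates that the sine substitution (γ/4κ≤1) and the hyperbolic cosine substitution (γ/4κ>1) describe the same underlying propagation, and that the lossless limit recovers the plain directional coupler of cascade I.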
When two optical inputs of the same magnitude E0 and wavelength λ0 are launched into the element 200, the total field coupled to the respective waveguides at L1 is scaled by the product of matrix M(I) and column vector $(e^{-i\omega\Delta t}, 1)^T$. The inputs to the element can thus be rewritten as $a_0=E_0e^{-i\omega\Delta t}$ and $b_0=E_0$ where the angular frequency ω=(2πc/neff λ0), c is the vacuum speed of light, and neff is the waveguide effective refractive index. It follows that the field intensities coupled to the upper and lower waveguides in the first cascade are respectively given by
$$|E_{\mathrm{upper}}|^2=\left|E_0e^{-i\omega\Delta t}M^{(I)}_{11}+E_0M^{(I)}_{12}\right|^2=E_0^2\left(1-\sin(2\kappa l_1)\sin(\omega\Delta t)\right) \tag{7}$$
$$|E_{\mathrm{lower}}|^2=\left|E_0e^{-i\omega\Delta t}M^{(I)}_{21}+E_0M^{(I)}_{22}\right|^2=E_0^2\left(1+\sin(2\kappa l_1)\sin(\omega\Delta t)\right) \tag{8}$$
At κl1=π/4 when ωΔt=π/2, it follows that the field intensities coupled to the upper and lower waveguides are respectively 0 and $2E_0^2$, which is indicative of two-input critical coupling. This implies that the two-input critical coupling length at l1=π/4κ is half the single-input critical coupling length at l1=π/2κ.
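The interference behind equations 7 and 8 can be illustrated by applying a lossless coupler matrix to two complex input amplitudes and comparing the resulting intensities with the closed-form expressions. The sketch below assumes the standard lossless form of M(I), with cos(κl1) on the diagonal and i·sin(κl1) off the diagonal, which is the γ→0 limit of the coupled-mode equations; the function names are illustrative.

```python
import numpy as np


def coupler_matrix(kappa_l):
    """Lossless directional coupler matrix for a given product kappa * l1."""
    c, s = np.cos(kappa_l), np.sin(kappa_l)
    return np.array([[c, 1j * s],
                     [1j * s, c]])


def output_intensities(kappa_l, phase, e0=1.0):
    """Return (upper, lower) intensities for two inputs of amplitude e0.

    `phase` stands for omega * delta_t, applied to the upper input.
    """
    inputs = e0 * np.array([np.exp(-1j * phase), 1.0])
    fields = coupler_matrix(kappa_l) @ inputs
    return np.abs(fields) ** 2


def closed_form(kappa_l, phase, e0=1.0):
    """Closed-form intensities in the form of equations 7 and 8."""
    swing = np.sin(2.0 * kappa_l) * np.sin(phase)
    return e0 ** 2 * np.array([1.0 - swing, 1.0 + swing])


# Two-input critical coupling: kappa * l1 = pi/4 and omega * delta_t = pi/2.
upper, lower = output_intensities(np.pi / 4.0, np.pi / 2.0)
print(f"upper = {upper:.3f}, lower = {lower:.3f}")

# Compare the matrix calculation with the closed form over a parameter grid.
grid = [(kl, ph) for kl in np.linspace(0.0, np.pi, 9)
        for ph in np.linspace(0.0, 2.0 * np.pi, 9)]
max_error = max(np.max(np.abs(output_intensities(kl, ph) - closed_form(kl, ph)))
                for kl, ph in grid)
print(f"largest deviation on the grid: {max_error:.1e}")
```

At the two-input critical coupling point all of the input energy appears in the lower waveguide, which is the condition under which the pump field accumulated at the PCM is largest.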
With reference to
To conveniently turn on/off the UCS (NS/CS) input and to precisely define the time delay of the phases between the UCS (NS/CS) inputs, the associative learning element 200 is integrated on an on-chip structure 416 as described above. When the input to the on-chip structure is of ring B (A) resonant wavelength λB (λA), the UCS (NS/CS) inputs are incident to the associative learning element 200.
While the single-input probe readouts are carried out at the resonant wavelengths λA and λB, the two-input pump signals (which induce associative learning) are launched at the non-resonant wavelengths of the ring resonators. The time delay between the inputs Δt can be conveniently defined from the spectrum in
The simulation results can be further corroborated by equations 7 and 8, which give $|E_{L1}|^2_{\mathrm{lower}}\to 0$ when Δt=2.475 fs at x2=5 μm and $|E_{L2}|^2_{\mathrm{lower}}\to 1.588$ when Δt=0.825 fs at x2=2 μm (at the interface 214). To compare the field magnitude at the interface 214 in the second waveguide 204, the electric field profile of the vertical cross-section was retrieved, shown in
In
The results disclosed herein show that after the associative learning process, input CS comes to suggest input UCS, which reflects the typical one-way associative learning process NS/CS→UCS. Additionally, with reference to
Based on these exemplary dimensions, three-dimensional FDTD numerical simulations of the structure were performed before and after the learning process to corroborate the χ and η calculations above.
For optical neuromorphic computing applications that require the ability to handle rapid bursts of traffic and heavy loads with little or no notice, it is desirable to have a scalable monolithic hardware system architecture. The optical associative learning element 200 according to the present disclosure can serve as a building block in a neuromorphic network. As disclosed herein, the all-optical associative learning element 200 can be integrated onto a platform (i.e. photonic chip 416) which locks the phase difference between the UCS and NS/CS as a function of the input optical frequency after the optical input through e.g. an apodized grating coupler was divided equally by the on-chip optical splitter 418. This approach capitalizes on the fact that the all-optical associative learning element 200 consists of cascaded first 208 and second 210 directional couplers, which have been found to be robust to stimuli wavelength difference within a reasonably wide wavelength range. An all-optical phase shifter may be introduced on a first layer of the neuromorphic network. Subsequent layers may require only judicious determination of the path length between one associative learning node to another (as demonstrated herein) once the operating optical wavelength has been determined for the prospective scalable neuromorphic network. Several all-optical artificial neural network architectures based on the associative learning element 200 are disclosed herein.
Typical artificial neural networks originate from the Hebbian learning rule, which describes how neuronal activities affect the connections between neurons, i.e., biological neural plasticity. The rule states that the synaptic weight of a neural connection is adjusted based on the relative timing between the activities of the two neurons on either side of a synapse (pre-synaptic and post-synaptic activities). An example of a scheme to artificially implement the spike-based formulation of the Hebbian learning rule, known as spike-timing dependent plasticity (STDP), is shown in
On the other hand, associative learning for machine learning is based on empirical evidence of the learning process in the marine snail Hermissenda crassicornis and the hippocampus of the rabbit. Inspired by the learning process in these biological neural systems, a distinct type of artificial neural network based on associative learning has been proposed, with the basic neural connection shown in
Although the appended claims are directed to particular combinations of features, it should be understood that the scope of the disclosure of the present invention also includes any novel feature or any novel combination of features disclosed herein either explicitly or implicitly or any generalisation thereof, whether or not it relates to the same invention as presently claimed in any claim and whether or not it mitigates any or all of the same technical problems as does the present invention.
Features which are described in the context of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub combination. The applicant hereby gives notice that new claims may be formulated to such features and/or combinations of such features during the prosecution of the present application or of any further application derived therefrom.
For the sake of completeness it is also stated that the term “comprising” does not exclude other elements or steps, the term “a” or “an” does not exclude a plurality and reference signs in the claims shall not be construed as limiting the scope of the claims.
Number | Date | Country | Kind |
---|---|---|---|
1908760.0 | Jun 2019 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2020/051358 | 6/4/2020 | WO |