Many problems in logistics, financial portfolio management, drug discovery, and other application domains require finding an assignment of values to their inputs (typically called variables) with the goal of optimizing an objective. For instance, such problems include “combinatorial optimization problems”. Unlike other areas of optimization, combinatorial optimization relates to problems where variables take values from a finite set. For example, valid assignments could be binary (e.g. whether to make an investment or not), from a limited set (e.g. one of three available routes to pick), or, in general from a finite subset of the integers. In such problems, there is a finite set of ways for combining the values of each variable. In principle, it is possible to enumerate all possible combinations and find the optimal assignment. In practice, however, such exhaustive search is infeasible for problems of even moderate sizes, as the set of combinations is extremely large (exponential in the number of variables).
There has been extensive work towards understanding the structure of such problems. A subset of combinatorial optimization problems belongs to the category of problems known as NP-complete (where NP stands for nondeterministic polynomial time). NP-completeness is a concept from computational complexity theory: any NP-complete problem can be transformed, in polynomial time, into any other NP-complete problem. An efficient solver for any one NP-complete problem would therefore imply that every NP-complete problem can be solved efficiently. All NP-complete problems also belong to a larger class of problems known as ‘NP-hard’; every problem in NP, and hence every NP-complete problem, can be transformed into any NP-hard problem.
The term “efficient” in this setting means finding a solution to the problem without enumerating all possibilities. Specifically, an efficient solution to a graph optimization problem as described herein is a solution whereby the amount of time taken to find the solution scales polynomially with the number of variables of the problem (such as graph vertices), whereas enumerating all possible solutions takes time exponential in the number of variables of the problem. However, it is widely believed, although not proven, that no such efficient solver exists. Instead, work in this area has focused on devising algorithms that find solutions that are “good enough”; often there are no assurances that such approximation algorithms will indeed provide an answer which is close enough to the exact solution.
A variety of combinatorial optimization problems exist, and, as described above, NP-complete problems may be transformed to other NP-complete problems. For example, the traveling salesman problem is defined as follows: for a given set of cities and pairwise distances between cities, the problem is to find a path via all the cities, wherein each city is visited exactly once, such that the path has the shortest total length.
A general form of combinatorial optimization problem is the quadratic unconstrained binary optimization (QUBO) problem, which is defined by a set of binary variables V={v1, v2, . . . , vN}, each taking a value of either 0 or 1, and a formula ΣiΣjQij·vi·vj to be minimised, where the coefficient Qij defines the interaction between variables vi and vj. The travelling salesman problem can be formulated as a QUBO problem, by defining variables as positions in the path between each city for each possible city to be visited (for example, the first variable may indicate whether or not London is the first city visited), and with the distances between cities encoded in the matrix Q, such that the total distance is minimised subject to the constraint that all cities are visited exactly once. QUBO is a type of polynomial unconstrained binary optimization (PUBO) problem, which assigns values to a set of binary variables V={v1, v2, . . . , vN} so as to minimise a formula ΣV′⊆V QV′·Πv∈V′ v, where in this case the coefficients QV′ may encode interactions between any number of variables. As mentioned above, it is possible to transform between different formulations of NP-hard problems. It is possible to transform PUBO problems to QUBO by introducing auxiliary variables and terms in the formula.
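The QUBO objective described above can be sketched in a few lines. The following is a minimal illustration only: the function names and the 3-variable matrix are hypothetical and chosen for this example, and the exhaustive search is practical only for very small N.

```python
import itertools
import numpy as np

def qubo_value(Q, v):
    """Return sum_i sum_j Q[i, j] * v[i] * v[j] for a 0/1 assignment v."""
    v = np.asarray(v, dtype=float)
    return float(v @ Q @ v)

def brute_force_qubo(Q):
    """Enumerate all 2**N assignments and return the minimiser and its value."""
    n = Q.shape[0]
    best = min(itertools.product((0, 1), repeat=n),
               key=lambda v: qubo_value(Q, v))
    return best, qubo_value(Q, best)

# Hypothetical instance: rewards on the diagonal, penalties for
# switching on neighbouring variables together.
Q_example = np.array([[-1.0, 2.0, 0.0],
                      [0.0, -1.0, 2.0],
                      [0.0, 0.0, -1.0]])
best_v, best_val = brute_force_qubo(Q_example)
print(best_v, best_val)
```

The enumeration visits 2^N assignments, which is the exponential cost that motivates the approximate and hardware-based solvers discussed in this disclosure.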
A formulation called the Ising model, used in physics to model ferromagnetism and other physical processes, is equivalent to the QUBO problem defined above. The Ising model is described in terms of a physical system with variables that can exist in two discrete states, where these variables can interact with each other, and the total energy of the system is given by H(σ)=−Σi,jJijσiσj−μΣihiσi. In the Ising formulation, the binary variables (sometimes referred to as ‘spins’) are typically assigned to one of +1/−1, rather than 1/0 or any other binary assignment. However, the Ising formulation can easily be mapped to the QUBO formulation for Boolean variables, by applying the formula: σi=2vi−1.
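As an illustration of this mapping, substituting vi=(σi+1)/2 into the QUBO formula gives an Ising energy plus a constant. The sketch below assumes μ=1 and uses hypothetical function names; it checks numerically, over every assignment of a small example, that the two formulations agree.

```python
import itertools
import numpy as np

def qubo_to_ising(Q):
    """Rewrite sum_ij Q_ij v_i v_j as -sum_ij J_ij s_i s_j - sum_i h_i s_i + offset,
    using s_i = 2*v_i - 1 (mu is taken as 1 in this sketch)."""
    Q = np.asarray(Q, dtype=float)
    J = -Q / 4.0
    h = -(Q.sum(axis=1) + Q.sum(axis=0)) / 4.0
    offset = Q.sum() / 4.0
    return J, h, offset

def ising_energy(J, h, s):
    """H(s) = -sum_ij J_ij s_i s_j - sum_i h_i s_i."""
    s = np.asarray(s, dtype=float)
    return float(-(s @ J @ s) - h @ s)

# Hypothetical QUBO matrix used only to check the mapping.
Q_demo = np.array([[-1.0, 2.0, 0.0],
                   [0.0, -1.0, 2.0],
                   [0.0, 0.0, -1.0]])
J_demo, h_demo, offset_demo = qubo_to_ising(Q_demo)

# Largest disagreement between the QUBO value and the mapped Ising energy.
max_gap = max(
    abs(np.array(v) @ Q_demo @ np.array(v)
        - (ising_energy(J_demo, h_demo, 2 * np.array(v) - 1) + offset_demo))
    for v in itertools.product((0, 1), repeat=3)
)
print(max_gap)
```

Because the offset is constant, an assignment minimising the Ising energy also minimises the QUBO formula.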
Note that the notation used above is slightly different between the QUBO and the Ising formulation, with the variables to be assigned represented by σ and the interaction coefficients represented by J for the Ising model. For simplicity, in this application the notation σ will be used for variables to which values are assigned and J will be used to denote arrays of interaction coefficients in the context of an Ising solver. However, Q and v may also be used herein in general to denote a matrix of weights and a variable, respectively.
The second term −μΣihiσi in the above expression for the total energy represents the effect of an external ‘field’ or some external effect on the system being modelled. For example, in a ferromagnetic material, the first term represents the energy contribution for interactions between magnetic dipoles, while the second term represents the energy of the system due to an external magnetic field. Many problems are modelled as Ising problems without external fields, as it is much simpler to solve the Ising problem in the case of no external field. However, it is possible to convert a problem with an external field to a problem without an external field by introducing an extra spin and additional edges with weights chosen carefully. Any problems or models referred to as Ising problems in the description below assume no external field, or a problem already converted to one with no external field.
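One common way to carry out this conversion, sketched below under stated assumptions, is to add one extra spin σ0 coupled to every original spin i with weight hi/2 in each direction. With σ0 fixed at +1 the field-free energy equals the original energy (μ taken as 1), and because flipping every spin leaves a field-free energy unchanged, any minimiser can be flipped so that σ0=+1. The function names and random test data are hypothetical.

```python
import itertools
import numpy as np

def absorb_field(J, h):
    """Return an (N+1)x(N+1) coupling matrix with the field h folded
    into an extra spin at index 0."""
    J = np.asarray(J, dtype=float)
    h = np.asarray(h, dtype=float)
    n = h.shape[0]
    J_ext = np.zeros((n + 1, n + 1))
    J_ext[1:, 1:] = J
    J_ext[0, 1:] = h / 2.0  # half on each side: entries (0, i) and (i, 0) sum to h_i
    J_ext[1:, 0] = h / 2.0
    return J_ext

def energy_with_field(J, h, s):
    return float(-(s @ J @ s) - h @ s)

def energy_no_field(J, s):
    return float(-(s @ J @ s))

rng = np.random.default_rng(0)
J_rand = rng.normal(size=(4, 4))
h_rand = rng.normal(size=4)
J_big = absorb_field(J_rand, h_rand)

worst = 0.0
for s in itertools.product((-1.0, 1.0), repeat=4):
    s = np.array(s)
    fixed = np.concatenate(([1.0], s))  # extra spin held at +1
    worst = max(worst, abs(energy_with_field(J_rand, h_rand, s)
                           - energy_no_field(J_big, fixed)))
    # Flipping every spin, including the extra one, changes nothing.
    worst = max(worst, abs(energy_no_field(J_big, fixed)
                           - energy_no_field(J_big, -fixed)))
print(worst)
```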
Until recently, algorithms for combinatorial optimization have typically been implemented in digital hardware, such as commodity CPUs, FPGAs, GPUs, and ASICs. Digital hardware has great advantages with respect to flexibility (i.e. the ability to program different algorithms), and reliability. However, digital solutions are also limited by the speed of execution and power consumption. In the past, improved computational power and reduced consumption could be achieved for each generation of digital hardware. It is widely predicted that improving performance of digital hardware will be increasingly difficult, as fundamental physical limits are approached. Searching for better answers for combinatorial optimization problems or tackling larger instances of those will come at a greater hardware cost.
However, more recently, there have been attempts to solve such problems using hardware based on non-digital physical processes. A popular realization of the Ising model with a physical process uses quantum annealers. In existing systems, the problem variables are represented by quantum bits (qubits), taking values +1 and −1, usually referred to as “spins”. However, the qubit topology of such systems does not allow full connectivity. Instead, the qubits interconnect in an architecture comprising sets of connected unit cells, each with four horizontal qubits connected to four vertical qubits via couplers. Unit cells are tiled vertically and horizontally with adjacent qubits connected, creating a lattice of sparsely connected qubits. The limited connectivity of this architecture has undesirable implications, resulting in inefficient mappings of the problem variables onto the spins, i.e. the number of qubits required for the physical system to represent the problem is much higher than the original number of variables.
Due to this inherent physical limitation of quantum annealer hardware, algorithms have been developed which can run on classical hardware and are inspired by the physical properties of quantum systems. For example, Microsoft Azure has developed Quantum Inspired Optimization (QIO) algorithms, which have shown good promise in approximately solving PUBO problems.
In an optical solver, light signals are used to represent the input variables (e.g. σi, i=1 . . . N, in the Ising problem), and an optical element is used to combine the signals in a way that models the interaction between the variables (e.g. the matrix J in the Ising problem). Optical elements that perform a vector-by-vector multiplication in the optical domain, such as a liquid crystal display or a ring-resonator, are known in the art. The summation (Σ) can be implemented using a photodetector that can perform coherent or incoherent addition of signals falling upon the photodetector's light sensing element.
In the cases where the inputs to the solver (i.e. the variables whose values are to be determined) can take binary positive and negative values, such as −1 and +1, or −½ and +½, then these are sometimes referred to as “spins” merely by analogy with the quantum property of spin. However in such a context, this does not actually mean the quantum property of spin. Instead the two possible “spin” values simply refer to two possible values of a binary variable, and could be represented using, for example, two different values of the amplitude, or phase, of the light.
State of the art solutions based on or inspired by optics propose either digital-only approaches (see Toshiba's Solving Traveling Salesman Problem with SBM [Simulated Bifurcation Machine], Ikuko Hasumi https://medium.com/toshiba-sbm/solving-traveling-salesman-problem-with-sbm-simulated-bifurcation-machine-89740c83ed37) or hybrid approaches, as per Bohm, Fabian, Guy Verschaffelt, and Guy Van der Sande. “A poor man's coherent Ising machine based on opto-electronic feedback systems for solving optimization problems.” Nature communications 10.1 (2019): 1-9 (https://www.nature.com/articles/s41467-019-11484-3) and Inagaki, Takahiro, et al. “A coherent Ising machine for 2000-node optimization problems.” Science 354.6312 (2016): 603-606 (https://science.sciencemag.org/content/354/6312/603).
In hybrid approaches, a building block to generate a signal representing the variable values is typically implemented in optical hardware, but the logic to compute the variable interactions is implemented using digital hardware and hardware to convert between optical and digital domains. By contrast, in ‘all-analogue’ solvers, non-digital hardware is instead used to convert a signal between optical (i.e. a light signal) and analogue electronic domains. An advantage of all-analogue solvers is the speed at which optical and analogue electronic signals can be transmitted (digital electronics are inherently much slower due to the need to clock sequences of bits through flip-flops). Implementing part of the iteration in the digital domain defeats the point of an all-analogue solver, which is the speed of transmission compared to digital electronics. The speed of the system will be limited by the slowest part, so the inclusion of any digital electronics negates the benefit of optical solvers.
An all-optical solution has been proposed and demonstrated with all-to-all connectivity for 4 spins/variables and partial connectivity for 16 spins/variables (Marandi, A., Wang, Z., Takata, K., Byer, R. L. & Yamamoto, Y. “Network of time-multiplexed optical parametric oscillators as a coherent Ising machine.” Nat. Photonics 8, 937-942 (2014); Takata, K., et al. “A 16-bit Coherent Ising Machine for One-Dimensional Ring and Cubic Graph Problems.” Scientific Reports (2016)).
State of the art all-optical solvers generate variables using optical signals in a time-division multiplexing architecture. I.e. the signals are multiplexed in series into the same beam of light, and a different delay path is introduced for each variable so that the signals can then be combined in order to model the interactions between the variables. However, for time-division multiplexing, because spin generation is carried out in series, the time complexity of the solver is linear in the number of variables being modelled.
Solvers which are implemented wholly in the analogue domain use optical or analogue electronic vector multipliers to model the ‘spin’ interactions of Ising systems. Implementing optical vector multipliers has the advantage pointed out above, that it leverages the speed of optical transmission.
Existing technologies to perform vector multiplication in the optical domain using light signals include spatial light modulators (SLM), ring resonators and Mach-Zehnder interferometers (MZI). A challenge in designing an optical vector multiplier is in maximizing the speed and efficiency of the multiplication. Typically, multiplication of light signals is done by applying an optical element to alter the value of the input by a predetermined amount. This may be achieved, for example, by passing the input signal through a lossy element, with the intensity of the optical signal being reduced by a predetermined amount depending on the configuration of the optical element. Examples of optical elements used for this purpose include liquid crystal displays and ring resonators.
Some optical vector multipliers represent the input elements of the vector by modulated light signals of the same wavelength. A difficulty with this is that the signal resulting from the vector multiplication of such a set of signals by an array of lossy elements is a single beam of light of a single wavelength. Since the light is all of one wavelength, to detect the correct value of the weighted sum, coherent addition of the weighted inputs must be performed. Spatial light modulators are used in this way, such that the result of the vector multiplication is obtained by coherent addition after modulating by the elements of the SLM. However, a difficulty with coherent addition is that the optical signals being added need to be phase-matched so that the electrical fields are added correctly. This is difficult to achieve at optical frequencies of the order of hundreds of THz for systems which may be configured to process hundreds of inputs.
Ring resonators on the other hand use wavelength-division multiplexing, which represents the input vector by a wavelength-multiplexed light signal comprising a unique wavelength carrier for each element of the input vector to be multiplied. Ring resonators are placed next to a waveguide to couple only the light of a specific wavelength. When the length of the ring circumference is equal to an integer number of wavelengths which is the resonance condition, the optical signal in the waveguide is dropped to the ring and lost due to the scattering. The levels of light intensity are considered as data values in analog form and multiplications are implemented by the ring resonators in which the loss rate can be controlled by injecting a charge into the ring or changing its temperature. However, a potential disadvantage of ring resonators is that the signal containing the full input vector passes through the series of ring resonators, which leads to losses during optical transmission.
Mach-Zehnder interferometers consist of two directional couplers (DCs), in which two input signals are equally divided between two output ports, and two phase shifters (PSs) that modulate the input signals. The MZI can distribute the light signal of each input port to the two output ports at an arbitrary ratio by adjusting the phase shift, so that it can be used to implement a 2×2 unitary transformation. As in the case of ring resonators, optical vector multiplication using MZIs suffers increasing system losses as the number of inputs to the system increases, with losses increasing linearly with the number of inputs as each input has to go through more devices.
By contrast, for an SLM, each individual signal only passes once through a single modulator of the SLM, which minimizes system losses, such that the loss is constant as the number of inputs increases. However, as noted above, existing SLMs are used with light sources of the same wavelength, which requires coherent addition.
It would therefore be desirable to provide an alternative form of apparatus for performing vector multiplication in the optical domain.
In a different field of technology, wavelength selective switch devices are currently used in telecommunication applications, typically to add or drop specific wavelengths from a signal at specific transmission nodes, and to guarantee that signals are flat across the transmitted spectrum. Wavelength selective switches are typically used with a single input fiber on one side, carrying signals of different wavelengths, which are dispersed by a dispersive element that directs different wavelengths to different physical positions on an SLM, which attenuates the signals based on their respective wavelengths. This is coupled to multiple output fibers on the other side for the attenuated signals of different wavelengths. This device is bi-directional.
Described herein is a wavelength selective switch configured to carry out optical vector multiplication on a set of input signals of different wavelengths. A spatial light modulator is used to apply a specific loss factor to each input signal at a different respective wavelength, while a diffraction element is used to combine the resulting weighted signal into one beam for detection at a photodetector to compute a weighted sum of the inputs. This benefits from the advantage of the spatial light modulator, which has minimal system losses since each signal is passed through a single cell, and the advantage of incoherent detection at the photodetector, since all the signals are of different wavelengths (within the bandwidth of the photodetector).
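The benefit of incoherent detection can be illustrated with a toy numerical model. This is an idealised sketch, not a model of any particular device: it assumes each input is a non-negative intensity on its own wavelength, each modulator cell is a pure attenuator with transmission between 0 and 1, and the photodetector simply sums intensities. A same-wavelength (coherent) sum is included for contrast. All names and values are hypothetical.

```python
import numpy as np

def incoherent_weighted_sum(intensities, transmissions):
    """Idealised detector output for distinct wavelengths:
    attenuated intensities add, giving the dot product w . x."""
    x = np.asarray(intensities, dtype=float)
    w = np.asarray(transmissions, dtype=float)
    if np.any((w < 0) | (w > 1)):
        raise ValueError("a lossy element can only attenuate: 0 <= w <= 1")
    return float(np.sum(w * x))

def coherent_sum_intensity(intensities, transmissions, phases):
    """Same-wavelength case for contrast: fields add before detection,
    so the detected intensity depends on the relative phases."""
    x = np.asarray(intensities, dtype=float)
    w = np.asarray(transmissions, dtype=float)
    fields = np.sqrt(w * x) * np.exp(1j * np.asarray(phases, dtype=float))
    return float(np.abs(fields.sum()) ** 2)

x_in = [1.0, 0.5, 0.25]   # hypothetical input intensities
w_in = [0.5, 1.0, 0.0]    # hypothetical per-wavelength transmissions

detector = incoherent_weighted_sum(x_in, w_in)
in_phase = coherent_sum_intensity(x_in, w_in, [0.0, 0.0, 0.0])
cancelled = coherent_sum_intensity(x_in, w_in, [0.0, np.pi, 0.0])
print(detector, in_phase, cancelled)
```

In this toy model the incoherent output is the weighted sum regardless of phase, while the coherent output swings between constructive and destructive interference, which is why phase matching is required in the same-wavelength case.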
Optical elements such as lenslet arrays and diffraction elements may be configured to collimate and direct the signals more effectively, enabling the wavelength-selective switch to be scaled out to a larger number of input signals, and enabling the use of the wavelength selective switch as a vector multiplier, for example, in an optical Ising solver architecture modelling a large number of spin variables (typically hundreds or thousands of spins). This provides an advantage over typical SLM implementations of optical vector multipliers using lenses, which cannot be scaled easily to large numbers of inputs.
A first aspect disclosed herein provides apparatus for performing vector-by-vector multiplication in an optical domain, the apparatus comprising: a plurality of light signal generators, each arranged to emit a respective beam of light having a different respective carrier wavelength modulated with a respective input signal modelling a respective variable of a vector of variables; one or more sets of light modulator elements, wherein within each set, each light modulator element in the set is arranged to receive the beam of light modulated with a different corresponding one of said input signals, and apply a corresponding weight from a vector of weights in order to produce a corresponding weighted optical signal; a respective photosensor element for each of said sets; and one or more optical combining elements arranged, for each of said sets, to direct the weighted optical signals of the set onto the respective photosensor element and thereby produce a respective output in the form of an analogue electronic signal summing the weighted optical signals of the respective set.
Another aspect disclosed herein provides a method for performing vector-by-vector multiplication in an optical domain, the method comprising: generating, by a plurality of light sources, a plurality of beams of light, each having a different respective carrier wavelength modulated with a respective input signal modelling a respective variable of a vector of variables; receiving each beam of light at one or more sets of light modulator elements; and for each set of light modulator elements: applying a weight from a vector of weights to each beam of light by a respective light modulator element to produce a respective weighted optical signal; and directing, by one or more optical combining elements, the weighted optical signal onto a respective photosensor element thereby producing a respective output in the form of an analogue electronic signal summing the weighted optical signals of the respective set.
For a better understanding of the present disclosure, and to show how embodiments of the same may be put into effect, reference is made to the accompanying figures in which:
A general combinatorial optimisation problem can be solved by first mapping said problem to a QUBO problem, which can then be mapped to an Ising problem. For many problems, there is a known mapping to the QUBO formulation. For others, a mapping may have to be derived. The mapping of general NP-hard problems to a QUBO or Ising formulation is a topic that, in itself, will be understood by a person skilled in the art of mathematics. For example, a problem expressed in PUBO form, for example a cubic unconstrained binary optimization problem with the formula Σi,j,k Qijk·vi·vj·vk, may be expressed as a QUBO problem by introducing extra variables and terms, and may thus be solved by the Ising solver disclosed herein. The solver disclosed herein provides a solution to an Ising problem, which can be used to solve any NP-hard problem for which a mapping to that problem can be found.
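One known way of introducing such extra variables, shown here purely as an illustration and not necessarily the reduction intended above, is the Rosenberg penalty: an auxiliary binary variable y stands in for the product v1·v2, and a quadratic penalty P·(v1·v2 − 2·v1·y − 2·v2·y + 3·y) is zero exactly when y=v1·v2 and at least P otherwise. The sketch below checks by enumeration that a single cubic term c·v1·v2·v3 is reproduced when P is large enough; the function names and constants are hypothetical.

```python
import itertools

def cubic_term(c, v1, v2, v3):
    return c * v1 * v2 * v3

def quadratized(c, v1, v2, v3, y, penalty):
    """Quadratic replacement: y plays the role of v1*v2.
    The penalty is 0 when y == v1*v2 and >= `penalty` otherwise."""
    return c * y * v3 + penalty * (v1 * v2 - 2 * v1 * y - 2 * v2 * y + 3 * y)

def check_reduction(c, penalty):
    """True if minimising over y reproduces the cubic term for every input."""
    for v1, v2, v3 in itertools.product((0, 1), repeat=3):
        best = min(quadratized(c, v1, v2, v3, y, penalty) for y in (0, 1))
        if best != cubic_term(c, v1, v2, v3):
            return False
    return True

print(check_reduction(-2.0, 3.0))  # penalty larger than |c|
print(check_reduction(-2.0, 1.0))  # penalty too small
```

The penalty weight has to exceed the magnitude of the coefficient being replaced; otherwise the minimiser can profit from setting y inconsistently with v1·v2.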
In doing this, the problem is mapped to a physical system whose total energy is given by the Ising Hamiltonian, i.e. −Σi,jJijσiσj (assuming no external field). To map the given problem to an Ising system, the matrix J needs to be determined such that minimizing the total energy −Σi,jJij·σi·σj (i.e. maximizing Σi,jJij·σiσj) is equivalent to optimizing the problem.
The Hamiltonian is therefore a sum of a plurality of terms Jij·σi·σj, each being a product of a respective subset of the variables σi, σj with a corresponding weight Jij (so the first term is the subset of one variable σ1 multiplied with itself and the weight J11, and the second term is the subset σ1, σ2 multiplied together and with the weight J12, etc.).
As can be seen, this sum can be broken down into a series of vector-by-vector (dot product) multiplications, by taking σi out to the left of the sum:
−Σi,jJij·σi·σj = −Σiσi·(ΣjJij·σj)
= −[σ1·(σ1J11+σ2J12+ . . . +σNJ1N)
+σ2·(σ1J21+σ2J22+ . . . +σNJ2N)
+ . . .
+σN·(σ1JN1+σ2JN2+ . . . +σNJNN)]
In this final representation, the sum in each line is an individual vector multiplication which represents a contribution of a different respective one of the variables σi to the energy in the Hamiltonian. I.e. the vector multiplication (σ1J11+σ2J12+ . . . +σNJ1N) is the contribution of σ1 toward the energy, and (σ1J21+σ2J22+ . . . +σNJ2N) is the contribution of σ2, etc. The weights represent the interactions between variables (so J11 is the interaction of σ1 with itself, J12 is the interaction between σ1 and σ2, etc.). The weights are set depending on the problem being modelled (and for any given problem some weights may be zero). In the system of
An example application is the travelling salesperson problem. In a simple example, imagine there are three cities the salesperson needs to visit: London, Edinburgh and Cardiff. These can be modelled with nine variables in a QUBO problem: v1 represents London being visited first, v2 represents London visited second, v3 represents London third, v4 represents Edinburgh first, v5 represents Edinburgh second, v6 represents Edinburgh third, v7 represents Cardiff first, v8 represents Cardiff second, and v9 represents Cardiff third. The elements of the matrix Qij represent the penalties of travelling between corresponding pairs of cities. So Q15 (London first and Edinburgh second) is the distance penalty for London to Edinburgh, etc. Note that some weights, such as Q19 (London first then Cardiff third), are set to zero since they are not meaningful in this problem, as the total distance is determined by the distances between consecutive cities only. Other weights, such as Q12 or Q13 (London first then London second, or London first then London third), may be set to large penalty values so as to impose the constraint that each city is visited once. The QUBO problem (minimizing ΣiΣjQij·vi·vj) can then be transformed into an Ising problem (minimize the energy in the Hamiltonian term −Σi,jJijσiσj) and solved using the solver system of
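The three-city example can be written out as a small QUBO and solved by enumeration. The sketch below is illustrative only: the distances are made-up values, the penalty weight A is an arbitrary choice larger than any distance, and the constraints are encoded with one common construction (a squared "exactly one" penalty per city and per position), which is an assumption rather than the only possible encoding. Zero-based indices 0..8 correspond to v1..v9 in the text.

```python
import itertools
import numpy as np

cities = ["London", "Edinburgh", "Cardiff"]
# Made-up distances for illustration only.
dist = {("London", "Edinburgh"): 530.0,
        ("London", "Cardiff"): 210.0,
        ("Edinburgh", "Cardiff"): 500.0}

def d(a, b):
    return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

n = len(cities)
A = 2000.0  # hypothetical penalty weight, larger than any distance

def idx(city, pos):
    """Index of the variable 'city is visited at position pos'."""
    return n * city + pos

Q = np.zeros((n * n, n * n))

# Distance between cities placed in consecutive positions, e.g. Q15 in the text.
for a, b in itertools.permutations(range(n), 2):
    for p in range(n - 1):
        Q[idx(a, p), idx(b, p + 1)] += d(cities[a], cities[b])

def add_exactly_one(group):
    """Add A*(sum(group) - 1)**2 without its constant term A.
    For binary variables this is -A on the diagonal and 2A per pair."""
    for i in group:
        Q[i, i] -= A
    for i, j in itertools.combinations(group, 2):
        Q[i, j] += 2 * A

for c in range(n):  # each city appears in exactly one position
    add_exactly_one([idx(c, p) for p in range(n)])
for p in range(n):  # each position holds exactly one city
    add_exactly_one([idx(c, p) for c in range(n)])

best = min(itertools.product((0, 1), repeat=n * n),
           key=lambda v: np.array(v) @ Q @ np.array(v))
route = tuple(cities[c] for p in range(n) for c in range(n) if best[idx(c, p)])
# Add back the 2n dropped constants to recover the path length.
route_length = float(np.array(best) @ Q @ np.array(best) + 2 * n * A)
print(route, route_length)
```

With these made-up distances the shortest path puts Cardiff in the middle; the reverse ordering has the same length, so either may be returned.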
Another example is a molecular similarity problem for estimating the molecular similarity between two molecules. E.g. this could be used to estimate that one molecule is likely to block another for use in a drug. Modelling molecular similarity as a QUBO problem is, in itself, known in the art.
An update rule may be derived for adapting the signals generated by the system in the direction that minimizes the Hamiltonian of the Ising system modelling the problem. A possible update equation for the Ising model may be written as follows:
is approximately equal to the update equation above for appropriately chosen constants α and β. Cosine squared is a useful approximation for optical spin generation in particular, due to specific hardware that can readily compute this function, described later. However, a different approximation may be made to evaluate other approximations of Equation 2 above. For example, analogue electronic components may be used to evaluate terms of a Taylor expansion of Equation 2 directly and generate an analogue signal for the updated value of xi. For example, a cubic or quintic approximation of Equation 2 could be used (an expansion only up to the cubic or 5th-order term, respectively). Indeed, the reason for using cos²(x−π/4)−½ in Equation 1 is that it approximates to x−x³ for small enough x. In other examples, any other formula that provides a similar approximation (i.e. it is linear for x around 0) could also work.
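The small-x behaviour of the cosine-squared term can be checked numerically. By the double-angle identity, cos²(x−π/4)−½ equals ½·sin(2x), whose Taylor expansion begins x − (2/3)·x³, i.e. a linear term minus a cubic term (the same shape as x−x³, up to the coefficient of the cubic term). The sketch below compares these expressions on a small interval; the interval and names are arbitrary choices for illustration.

```python
import numpy as np

def cos2_update(x):
    """The cosine-squared nonlinearity cos^2(x - pi/4) - 1/2."""
    return np.cos(x - np.pi / 4) ** 2 - 0.5

x = np.linspace(-0.3, 0.3, 61)

# Exact identity: cos^2(x - pi/4) - 1/2 == sin(2x) / 2
identity_gap = np.max(np.abs(cos2_update(x) - 0.5 * np.sin(2 * x)))
# Cubic truncation of the Taylor series
cubic_gap = np.max(np.abs(cos2_update(x) - (x - (2.0 / 3.0) * x ** 3)))
# Linear truncation, for comparison
linear_gap = np.max(np.abs(cos2_update(x) - x))

print(identity_gap, cubic_gap, linear_gap)
```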
As shown in
The above formula will be described in further detail later. Other formulations are possible. Whatever formulations are used, the underlying property of the update equation is that it pushes or adapts the signals xi such that the physical system being modelled tends towards a minimum energy (i.e. the minimum of the Hamiltonian given above). This is driven by the term βΣjJijxj[k], which represents a contribution of the given signal to the energy of the overall system, and whose sign determines the direction of the update. In other words, this term provides feedback to the signal generator 100 to adapt the respective modelling signal xi in the respective channel 102. The sign of this feedback causes an adaptation in the respective modelling signal (xi) which drives that signal in a direction which reduces the overall energy in the Hamiltonian of the system. A value of this feedback determines the degree of the adaptation (optionally damped relative to the signal xi by the coefficients α and β).
Note that, while the solver determines signals directly representing Ising variables σi, this is equivalent to finding an optimal assignment of the QUBO variables vi, and may be transformed into a different set of variables in the form of the original problem. However, it is important that a mapping exists between a set of Ising variables (spins) which can be determined by the solver and a set of variables optimizing the original problem. Note that in the below description, either of vi or σi may be used to denote a binary variable modelled by the solver.
An all-analogue solver can be implemented which models the value of the binary variables σi as optical or electrical analogue signals, and which performs the above update for each modelled variable using a combination of non-digital hardware components. The solver generates an initial set of signals representing a given assignment of variables and generates new signals in a series of iterative steps based on a feedback signal computed using interaction logic implemented in analogue electronic or optical hardware. An example implementation of a solver architecture for an Ising problem is described in more detail later.
There are many choices of solver configuration which may be arranged according to the present disclosure, each configuration generating a feedback signal which encourages the signals generated over time into a set of signals which minimize the total energy of an ‘Ising’ system, which can be mapped to an optimal assignment of variables for the given problem definition.
The present disclosure provides a novel architecture for solving combinatorial optimization problems which can be mapped to Ising problems of N variables (sometimes referred to as ‘spins’), wherein the variables of the problem are modelled by a set of N distinct hardware channels, and updated iteratively based on feedback provided by signal interaction logic modelling the interaction of the variables according to the given problem definition. The system operates only in the optical and analogue electronic domains, and the signal interaction logic may be implemented by either optical or analogue electronic hardware. This will now be described in further detail with reference to
A first channel 102 is configured to compute a modelling signal x1 corresponding to an Ising variable σ1 taking either a positive or a negative “spin” value, with the modelling variable x1 updated based on the feedback received at each iteration of the optimization. Note that while the variable σ being modelled may be binary, the modelling signal x may take a soft value that can vary between the two possible binary values of the variable. The process of determining the contribution by each channel will now be described. Note that each channel comprises hardware components which carry out the same steps to compute its respective contribution to the function.
Each channel 102 comprises a signal generator 100, a splitter 106 and signal interaction logic 104, each of which may comprise one or more hardware components. Note that ‘logic’ as used herein in this context does not refer to digital logic, but rather refers to signal operations carried out using analogue or optical hardware. The signal generator 100 generates a modelling signal for σi, with a measurable property of the signal representing a binary value of the variable σi. The signal may, for example, be an optical signal generated by a light source such as a laser. An optical modulator may be used to modulate a property of the optical signal to model the variable σi. For a binary variable σi to be encoded in the value of a property such as amplitude, a mapping should be defined between the possible modulated property values (amplitude) and the binary values (e.g. 1 and −1). For example, xi may lie in the range [−a, +a], where a is some constant, and where a positive amplitude maps to an Ising variable σi=1, and a negative amplitude maps to an Ising variable σi=−1. Once a modulated signal modelling the variable σi has been generated (this may be referred to herein as a ‘modelling signal’ xi), this signal can be copied by applying a splitter 106, to generate multiple instances of that modelling signal xi encoding the same variable σi, which can be communicated to other channels as shown by the arrows in
The signal interaction logic 104 receives multiple modelling signals, representing a vector of variables, with each signal received from the splitter 106 of a respective channel j. The interaction logic 104 comprises a vector-by-vector multiplier that combines the modelling signals xi into a signal representing a weighted sum of the modelled variables, with the weights corresponding to the relevant elements of the matrix J defining the spin interaction for the Ising problem. There are various possible hardware configurations that may be used to perform vector-by-vector multiplication. One example disclosed herein is a wavelength selective switch (WSS). This is described in further detail later. Optical vector-by-vector multiplication may alternatively be carried out by other, known optical technology including spatial light modulators (SLM), ring resonators or Mach-Zehnder interferometers (MZIs), or some combination of such technologies or other suitable optical components. As another alternative, the vector-by-vector multiplication operation may also be implemented in the analogue electronic domain (i.e. using electrical signals), for example by using memristors.
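Numerically, the operation performed by the interaction logic of channel i amounts to a dot product of row i of J with the vector of modelling signals. The following minimal sketch (hypothetical function names, random test data) shows that accumulating these per-channel products reproduces the Ising energy −Σi,jJij·σi·σj computed as a direct double sum.

```python
import numpy as np

def channel_feedback(J_row, x):
    """One channel's vector-by-vector product: sum_j J_ij * x_j."""
    return float(np.dot(J_row, x))

def energy_from_channels(J, s):
    """-sum_i s_i * (sum_j J_ij * s_j), accumulated one channel at a time."""
    return -sum(s[i] * channel_feedback(J[i], s) for i in range(len(s)))

def energy_double_sum(J, s):
    """Direct evaluation of -sum_ij J_ij * s_i * s_j."""
    n = len(s)
    return -sum(J[i, j] * s[i] * s[j] for i in range(n) for j in range(n))

rng = np.random.default_rng(1)
J_test = rng.normal(size=(5, 5))
s_test = rng.choice([-1.0, 1.0], size=5)

gap = abs(energy_from_channels(J_test, s_test) - energy_double_sum(J_test, s_test))
print(gap)
```

Each of the N dot products is independent of the others, which is what allows the N channels to compute their feedback in parallel.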
Note that, while
The feedback signal is passed back along a feedback path 108 to the signal generator 100, which determines a new signal according to the hardware of the system. The updated signal may be generated, for example, by passing the feedback signal to a modulator to modulate the input signal from a light source, and detecting the resulting optical signal with a photodiode. Alternatively, in some embodiments, an analogue electronic signal encoding the feedback signal may be generated directly using analogue electronic components, for example, by using memristors. Either way, the system is designed such that over time it tends to a stable state which maps to an optimal assignment of the variables, i.e. one which minimizes the energy function for the given problem formulation.
Each channel updates its signals according to the same scheme described above, until a stable state is reached for all signals, corresponding to a particular assignment of variables to values. The pairwise interactions of an arbitrary number of variables σ1, . . . , σN may be modelled in this way, by setting up N channels and splitting each signal to N identical copies of the signal, one to be sent to each channel.
Each channel 102 iteratively generates an updated modelling signal xi according to a feedback signal until the system settles into a stable set of states, representing an optimal assignment of variables according to the optimization problem to be solved. As described above, an update of the signal is given by the update equation, for example:
where xi[k] is the modelling signal at the kth iteration, Jij is the coefficient defining the interaction between the ith and jth variables according to the given problem as mapped to an Ising system, α and β are multiplicative constants, and δi[k] is a Gaussian noise term. The factors α and β are chosen so as to control the size of the update of each variable, where a large α relative to β causes the signal to move slowly in the direction given by the β term, i.e. β·Σj Jij xj[k]. This is important in a system of many variables, as large updates at each step can prevent convergence of the full system to a suitable local optimum. Similarly, the noise term provides a perturbation to the signal at each step to ensure that the system does not become ‘stuck’ in a local minimum that is a poor approximation of an optimal set of variables. The above equation may be derived mathematically by applying known principles based on the Hamiltonian of the Ising model and using sensible approximations. In particular, the cos²( ) term approximates the optimal update, which is easily applied using particular optical hardware, described later. The operation of a single channel 102 will now be described with reference to
An initial signal is generated by the spin generation hardware 300 representing an initial binary value of the variable σi modelled by the given channel. Note that ‘spin’ is used herein to refer to a signal representing a binary variable of an Ising system, and should not be confused with the quantum mechanical definition of spin. An example implementation of the hardware components of the spin generation hardware 300 is described in more detail below, with reference to
Note that in this embodiment, the spin generator 300 comprises only part of the signal generator 100 of
Along the first path, the signal is combined with the output of a light source 302, which is a laser at a specific wavelength, in a modulator 304 to modulate the laser beam, thereby generating a modelling signal xi, as described above with reference to
The modulator 304 sends the modelling signal to a 1-to-N splitter 306, which communicates an identical optical signal to a vector-by-vector multiplier 314 (VVM) in each of the N channels of the system. In the example of
In embodiments, the signal output by the VVM 314 remains in the optical domain, as shown by the unbroken lines in
term of Equation 1 is implemented in hardware by setting the modulator at a specific operating point. A Gaussian noise term 320 (corresponding to δ[k] in Eq. 1) is added to the feedback signal by analogue hardware 318 configured to perform addition of electrical signals, such as an electronic mixer. A Gaussian distribution may be defined, from which the Gaussian noise term δ[k] may be sampled at each iteration. Hardware for adding electrical signals is well known in the art and will not be described further herein. δ[k] is assumed to be small random (Gaussian) noise. In each iteration, δ[k] takes a new value (from the same distribution).
Along the second path, the signal i is output to an amplifier which amplifies the electrical signal, representing the multiplication of the variable σi by a constant α, shown in Eq. 1. This is added to the sum
which is communicated along a feedback path from the analogue addition hardware 318, to obtain a signal
Finally, the updated signal is determined in the spin generation hardware 300, which modulates an optical signal based on the feedback signal 108 to compute a cosine of the feedback signal, detecting this signal at a photodetector and adding a second adaptive term, in order to evaluate the full expression of Eq. 1 and output an analogue electronic signal. Note that direct detection by the photodetector generates the square of the cosine in Eq. 1, as the photodetector measures intensity of the optical signal which is proportional to the square of the signal itself. For this reason, direct detection cannot be used for phase-modulated signals, as all phase information is lost in the detection of light intensity.
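The role of the squared cosine can be illustrated with a short identity. This is a sketch only: the numerical operating point of the modulator and the value of the added term are not stated in this text, so a bias of −π/4 and an offset of −1/2 are assumed here purely for illustration, with z standing for the combined argument α·xi[k] + β·Σj Jij xj[k] + δi[k].

```latex
\cos^{2}\!\left(z - \tfrac{\pi}{4}\right) - \tfrac{1}{2}
  = \tfrac{1}{2}\left[1 + \cos\!\left(2z - \tfrac{\pi}{2}\right)\right] - \tfrac{1}{2}
  = \tfrac{1}{2}\sin(2z)
  \approx z \qquad (|z| \ll 1)
```

Under these assumed values the detected signal is bounded within [−1/2, +1/2], is odd in z, and is approximately linear for small z, so a positive argument yields a positive signal and a negative argument a negative one, even though the photodetector itself only measures a non-negative intensity.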
An example of the evaluation of the update equation by the spin generator 300 is described below with reference to
Note that multiple components operating together in
Each channel i is implemented in hardware which computes updates to that channel's signal in parallel. Updates continue until the system is stopped, for example after a predetermined stopping point of M iterations. Alternatively, the signals may be measured periodically, and the system stopped if there are no changes observed between subsequent measurements. An approximate solution is found when the system stabilizes, i.e. the set of variables modelled by the generated signals stays constant from one iteration to the next. This stable set of signals may then be mapped directly to an assignment of N variables which approximates the solution for the given Ising problem.
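The iteration and the two stopping rules described above can be simulated in software. The following is a minimal sketch, not the disclosed hardware: the function name `run_solver`, the parameter values, the −π/4 bias and −1/2 offset inside the squared cosine, and the example coupling matrix are all assumptions made for illustration. With the sign convention assumed in this sketch, positive couplings favour aligned spins.

```python
import math
import random


def run_solver(J, alpha=0.9, beta=0.1, noise_std=0.01,
               max_iters=500, patience=50, seed=0):
    """Simulate N channels updating in parallel; return (spins, iterations).

    ASSUMPTION: each update is cos^2(z - pi/4) - 1/2 with
    z = alpha*x_i + beta*sum_j J[i][j]*x_j + noise. The bias and offset
    values are illustrative and are not taken from the text.
    """
    rng = random.Random(seed)
    n = len(J)
    # Small random starting amplitudes, one modelling signal per channel.
    x = [rng.uniform(-0.01, 0.01) for _ in range(n)]
    spins = [1 if v >= 0 else -1 for v in x]
    unchanged = 0
    iterations = 0
    for _ in range(max_iters):  # hard stop after max_iters iterations
        iterations += 1
        # Every channel reads the same snapshot of x (parallel update).
        new_x = []
        for i in range(n):
            interaction = sum(J[i][j] * x[j] for j in range(n))
            z = alpha * x[i] + beta * interaction + rng.gauss(0.0, noise_std)
            new_x.append(math.cos(z - math.pi / 4) ** 2 - 0.5)
        x = new_x
        # Map amplitudes to spins: positive -> +1, negative -> -1.
        new_spins = [1 if v >= 0 else -1 for v in x]
        unchanged = unchanged + 1 if new_spins == spins else 0
        spins = new_spins
        if unchanged >= patience:  # stop once the assignment is stable
            break
    return spins, iterations


# Hypothetical 4-variable problem in which every pair of spins is coupled equally.
J_example = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]
spins, iterations = run_solver(J_example)
print(spins, iterations)
```

The `patience` counter plays the role of the periodic measurement: the run ends early when the spin assignment has not changed for that many consecutive iterations, and otherwise ends at `max_iters`.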
The example embodiment shown in
During each iteration of the example solver shown in
The signal-to-signal interaction logic 502 of
Note that in the example embodiment described above, the vector-by-vector multiplication operation of the signal interaction logic 504 is implemented in the optical domain, e.g. by a wavelength selective switch, described later. However, in other embodiments, the signal interaction may be implemented in the analogue electronic domain. Similarly, in some embodiments, other arithmetic operations, such as addition of signals, may be carried out in the optical domain rather than the analogue electronic domain. The process shown in
As described above, an advantage of an architecture described herein is that it uses a ‘space-division’ multiplexing architecture, meaning that a system of N variables is modelled using separate hardware for each variable. Some state-of-the-art solvers, by contrast, use a time-division multiplexing architecture.
By contrast, the space-division multiplexing architecture shown in
As described with reference to
One possible detection scheme that allows detection of positive and negative values is coherent detection, which measures the amplitude and phase information of the received optical signal, which can be either positive or negative. However, a disadvantage of coherent detection is that it is more complex to implement than direct detection of light intensity. Coherent detection schemes often require digital signal processing. Some of the advantages of processing signals in the optical and analogue electronic domains, such as the speed of transmission of the signal, are lost or diminished if the signal is converted back to the digital domain to carry out coherent detection.
An alternative detection method uses direct detection, i.e. detection of light intensity, which does not require the system complexity of coherent detection. Direct detection measures a positive-only signal in the analogue electronic domain, which may then be offset in the analogue electronic domain by adding or subtracting adaptive terms to correct the range of the signal to allow positive or negative values. This may be referred to as ‘differential detection’. Similar detection schemes are used in telecommunications to detect binary phase shift keying signals, which are real-valued.
A schematic illustration of this direct detection scheme is shown in
This signal is converted into an analogue signal by detecting it at photodetector 308. However, the detected signal is restricted to be positive only, as the photodetector 404 measures light intensity, which cannot have negative values. To correct this, the output signals of the VVM operation are corrected to allow positive or negative values by adding a DC offset term, shown in
This enables measurement of positive and negative signals required by the solver by adjusting the signal in the analogue domain. This differential detection scheme is simpler than a coherent one and can be implemented easily to convert the signal directly from the optical to the analogue electronic domain. However, for the VVM output, if the given input signals have different wavelengths, attention should be paid that the path lengths of all signals are matched. Incoherent addition of signals will be described in more detail below in the context of the operation of a wavelength selective switch.
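The effect of the offset terms can be illustrated numerically. The sketch below assumes one particular encoding, which is not taken from the text: each signed input in [−a_max, +a_max] is mapped linearly to a non-negative optical power in [0, 1], and each signed weight in [−1, +1] to a transmission in [0, 1]. The function name `differential_detect` and the idea that the total input power is available as a reference are likewise hypothetical.

```python
def differential_detect(x, w, a_max=1.0):
    """Recover sum_j w[j]*x[j] from a positive-only intensity measurement.

    ASSUMED encoding (illustrative only):
      power        p_j = (x_j + a_max) / (2*a_max)   in [0, 1]
      transmission t_j = (w_j + 1) / 2               in [0, 1]
    """
    n = len(x)
    p = [(xj + a_max) / (2 * a_max) for xj in x]
    t = [(wj + 1) / 2 for wj in w]
    # The photodetector only sees the total, non-negative, weighted power.
    detected = sum(tj * pj for tj, pj in zip(t, p))
    # Offset fixed by the weights alone (could be precomputed).
    weight_offset = sum(t)
    # Offset that depends on the current signals, hence "adaptive".
    signal_offset = sum(p)
    # Expanding w*x = a_max*(2t - 1)*(2p - 1) and summing gives:
    return a_max * (4 * detected - 2 * weight_offset - 2 * signal_offset + n)


# Signed inputs and weights; the direct result is 0.5 + 0.5 + 0.0 = 1.0.
print(differential_detect([0.5, -1.0, 0.25], [1.0, -0.5, 0.0]))
```

In this encoding one offset depends only on the weights, while the other tracks the current signal values and must be updated at every iteration, which is one way to read the term "adaptive" used above.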
While this differential detection scheme is described above in relation to the present solver architecture, direct detection with adaptive offset terms can be used for any application in which optical vector-by-vector multiplication operations taking real positive and negative values can be implemented. For example, this may be used in machine learning applications, such as deep neural networks, in which input vectors may be multiplied by network weights. This differential detection scheme may be applied to applications using various types of optical VVMs such as spatial light modulators (SLM), ring resonators, or wavelength selective switches, described in more detail below. This differential detection method has the advantage of allowing operations to be carried out in the optical domain, providing a significant speed improvement over digital operations, while enabling the desired range of real valued signals to be modelled, without requiring the difficult implementation of coherent detection schemes. Such a differential detection scheme may be implemented without requiring phase sensitivity of the system if different wavelengths are used for the input signals of the optical VVM, such as in a wavelength-selective switch or ring resonator VVM.
Note that in the described embodiments, the solver models x and the corresponding feedback signal in the form of positive and negative “spin” signals representing Ising variables, e.g. −1/1, as a method to solve QUBO problems which are easily mapped to Ising problems. The sign of the feedback signal represents the direction in which to drive the modelling signal x to reduce the energy of the Hamiltonian. However, in other embodiments, it is not excluded that purely positive signals could be used. Instead the matrix J may include positive and negative weights. In such embodiments the DC offset 310, 320 is not necessarily required. For example, QUBO variables 1/0 may be modelled directly. In this case, the positive signals generated by direct detection may not need to be corrected.
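The mapping between QUBO and Ising variables mentioned above follows from the substitution x = (1 + σ)/2. The sketch below uses the convention E(σ) = Σ Jij σi σj + Σ hi σi + offset, which is an assumption of this example; the Hamiltonian used elsewhere in this document may differ by a sign or a constant factor. The function names are hypothetical.

```python
from itertools import product


def qubo_to_ising(Q):
    """Convert a QUBO matrix Q (variables in {0, 1}) to Ising form (spins in {-1, +1})."""
    n = len(Q)
    # Pairwise couplings; the diagonal is absorbed into the constant offset.
    J = [[0.0 if i == j else Q[i][j] / 4 for j in range(n)] for i in range(n)]
    # Linear (field) terms collect the row and column sums of Q.
    h = [sum(Q[i][j] + Q[j][i] for j in range(n)) / 4 for i in range(n)]
    offset = (sum(sum(row) for row in Q) + sum(Q[i][i] for i in range(n))) / 4
    return J, h, offset


def qubo_energy(Q, x):
    n = len(Q)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def ising_energy(J, h, offset, s):
    n = len(J)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return pair + sum(h[i] * s[i] for i in range(n)) + offset


# Hypothetical 3-variable QUBO; the two energies agree on every assignment.
Q = [[1, -2, 0], [0, 3, 4], [0, 0, -1]]
J, h, offset = qubo_to_ising(Q)
for x in product([0, 1], repeat=3):
    s = [2 * xi - 1 for xi in x]
    assert abs(qubo_energy(Q, x) - ising_energy(J, h, offset, s)) < 1e-12
```

Because the two energies differ only by relabelling the variables, an assignment that minimizes one also minimizes the other.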
As described above, each channel may implement a respective vector-by-vector multiplier as part of the interaction logic 104. Various possible vector-by-vector multiplier configurations may be used with the solver architecture disclosed herein. Some VVMs may be implemented entirely in the optical domain, such as spatial light modulators, ring resonators, and Mach-Zehnder Interferometers. Other VVMs may be implemented in the analogue electronic domain, for example using memristors to compute the weighted sum of electrical signals.
One example of an optical VVM (OVVM) which is disclosed herein for use in some embodiments of the solver architecture is a wavelength selective switch (WSS). WSSs are used in telecommunication applications, where they allow signals at different wavelengths to be independently optimized to guarantee that all the signals are transmitted at the same power, as well as allowing signals of different wavelengths to be combined together in a single optical fiber or vice versa for add or drop functions at transmission nodes.
The use of a WSS for optical vector multiplication relies on two capabilities. A WSS emulates the product function by attenuating (weighting) each individual wavelength, and it emulates the addition function by combining the different wavelengths into a single fiber, where they are subsequently detected by at least one photodetector.
A vector-by-matrix multiplication can be broken down into a series of vector-by-vector products of the following form:
$$o_i = \sum_j W_{ij}\, y_j$$
Each element oi of the output vector o is the sum of the elements of the ith row of the weight matrix W, each multiplied by the corresponding element of the input vector y.
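The decomposition can be written out directly. In this minimal sketch the function names are hypothetical; each call to `vector_by_vector` stands for the operation performed by one multiplier, and `vector_by_matrix` simply repeats it for every row of W.

```python
def vector_by_vector(weights, y):
    """Weighted sum of the input vector y with one row of weights."""
    return sum(w * v for w, v in zip(weights, y))


def vector_by_matrix(W, y):
    """Output vector o, where o[i] is the vector-by-vector product of row i of W with y."""
    return [vector_by_vector(row, y) for row in W]


W = [[1, 2], [3, 4]]
y = [5, 6]
print(vector_by_matrix(W, y))  # prints [17, 39]
```

In the per-channel architecture each row is handled by a separate multiplier, so all rows can be evaluated at the same time rather than in a loop.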
The configuration of
The operation of a wavelength selective switch to perform vector-by-vector or vector-by-matrix multiplication based on the above principle will now be described with reference to
The input vector v is represented by a set of optical signals 800 of different wavelengths, which may be, for example, a set of modelling signals {x1, . . . , xN} received from the N channels of a solver such as the one shown in
Note: for simplicity of illustration the fibers 808, 818 for only one channel 102 of the solver are shown in
The corresponding elements of the weight matrix Q are implemented in a spatial light modulator (SLM) 810, one example of which is a liquid crystal on silicon spatial light modulator (LCoS-SLM), which modulates each input optical signal 800 by a specific factor as described above. In this case, the signals are modulated by a factor dependent on the wavelength of the input, where each column of the SLM 810 corresponds to a different incident wavelength. The input signals 808 are passed through a lens to ensure that each of the signals reaches the SLM in the correct horizontal position for its respective wavelength.
The output signal for a given channel is obtained by combining the modulated optical signals into a single beam 818, which is then detected at a photodetector 820. The combination of the various optical signals, each having a different wavelength, into a single beam at the photodetector 820 may be referred to as wavelength-division multiplexing (WDM). This is facilitated by an arrangement of one or more lenses 816 and/or dispersive element(s) 814 (e.g. diffraction elements such as prisms or diffraction gratings), while the SLM applies an independent weight to each individual wavelength.
The photodetector 820 performs incoherent addition of the various constituent light signals of different wavelengths. In order for the incoherent detection to compute the sum of the intensities of the constituent signals, it should be ensured that the difference in frequency of the respective signals being combined is much larger than the frequency bandwidth of the photodetector, meaning that the photodetector does not detect cross-terms from the interaction of the signals with each other. Incoherent detection of signals of different wavelengths does not require the signals to be phase matched. By contrast, if using a VVM architecture that takes as input light sources of the same wavelength, coherent addition must be performed at the detector, which imposes the difficult requirement that all signals be phase matched.
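The bandwidth condition can be checked with a small numerical model. The sketch below is an idealisation under stated assumptions: two monochromatic fields with amplitudes a1 and a2, a detector that simply averages the intensity over one response window, and a parameter `beat_cycles` equal to the frequency difference multiplied by the window length. The function name and the chosen numbers are hypothetical.

```python
import math


def mean_detected_intensity(a1, a2, beat_cycles, phase, samples=20000):
    """Average |E1 + E2|^2 over one detector response window.

    beat_cycles: (f1 - f2) * window length; 0 means identical wavelengths.
    phase: relative optical phase between the two fields, in radians.
    """
    total = 0.0
    for n in range(samples):
        t = (n + 0.5) / samples  # time as a fraction of the window
        beat = 2 * math.pi * beat_cycles * t + phase
        # Intensity = sum of individual intensities + interference cross-term.
        total += a1 * a1 + a2 * a2 + 2 * a1 * a2 * math.cos(beat)
    return total / samples


# Different wavelengths: the cross-term averages out, whatever the phase.
for phase in (0.0, math.pi / 2, math.pi):
    print(round(mean_detected_intensity(1.0, 1.0, 50.3, phase), 2))

# Same wavelength: the result depends strongly on the relative phase.
print(mean_detected_intensity(1.0, 1.0, 0.0, 0.0))
print(mean_detected_intensity(1.0, 1.0, 0.0, math.pi))
```

With many beat cycles inside the window the reading stays close to a1² + a2², the plain sum of intensities, while for equal wavelengths it swings between constructive and destructive interference, which is why phase matching is then required.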
An architecture similar to that shown in
As described above, the solver described herein for Ising problems may be implemented in one of two architectures. In the first, as shown in
However, in the second architecture, a global vector-by-matrix multiplier (VMM) may be implemented, wherein the channels of the solver each provide their modelling signal xi to the VMM to form an input vector, the matrix in full being implemented in this VMM. The solver architecture shown in
An example WSS architecture is now described which extends the architecture of the WSS vector-by-vector multiplier to carry out vector-by-matrix operations. This architecture has the advantage of being capable of processing many more spins simultaneously than the vector-by-vector WSS described above.
To use a spatial light modulator for vector-by-matrix multiplication in this example solver architecture, the vertical axis of the SLM needs to provide different weights even for the same wavelength, so that the whole functionality of the vector-by-matrix multiplication is achieved. This is because, for matrix multiplication, the input vector needs to be multiplied by each row of the matrix Q to generate the full output vector. The SLM 902 is a modified version of that shown in
In the example solver architecture with a global VMM, a single input array 908 comprises the modelling signals xi generated at each channel. This vector is passed through a lenslet array 900 having a particular geometry that causes the signals to spread out vertically, while collimating the beam in the horizontal direction of the SLM 902 corresponding to that signal's wavelength. This allows more input signals of different wavelengths to be processed at a single SLM. Moving from a single lens as in
Note that in the architecture of
The SLM 902 comprises a 2D array of modulators, each element of the array applying a respective weight to the received input signal, in contrast with the SLM described for the vector-by-vector multiplier in
In embodiments, the output signals may be directed from the element 814 via one or more lenses, to direct the signals into a beam at the correct vertical height to be detected using incoherent addition at the photodetector corresponding to the output vector element represented by that beam. For example, another lenslet array may be included between the dispersive element 814 and the multiple channels (potentially fibers) at the end of the system.
The photodetector array 904 is arranged as a set of photodetectors in a vertical array, each combined signal directed from the dispersive element 814 corresponding with the output signal for a different channel.
A solver which uses a vector-by-matrix multiplier architecture described above allows simultaneous processing of the interaction of spins for all channels using a single hardware arrangement such as the one shown in
Optical vector multiplication has also been implemented by a number of existing technologies, such as spatial light modulators which do not use wavelength division multiplexing, ring resonators, and Mach-Zehnder interferometers. Such technologies are described in detail for example in K. Kitayama et al, “Novel frontier of photonics for data processing-Photonic accelerator”, APL Photonics 2019, https://doi.org/10.1063/1.5108912, which is incorporated herein by reference in its entirety. The wavelength-selective switch implementation combines the spatial modulation of SLMs with the wavelength division of ring resonators, but where a ring resonator implementation requires the input signal to be passed through a series of ring resonators, the SLM only requires each signal to be passed through a single modulator, which is an advantage in terms of system losses. SLM VMM implementations do not use wavelength division, and instead use a single optical source, and use coherent addition at the photodetectors to compute the weighted sum for each element of the output array. The wavelength selective switch combines the advantages of both these techniques.
The above description of wavelength-selective switches refers to their implementation in a solver architecture such as that described herein. However, vector-by-matrix multiplication has many applications, particularly in machine learning, for example to apply weights of a neural network to input vectors. The wavelength selective switch described herein may be used in such applications. Similarly, the wavelength selective switch VMM may be applied to other solver architectures, such as the time-division multiplexing architecture shown in
The techniques disclosed herein can be applied to a wide range of applications. In particular, the solver implementation disclosed herein can be used to solve any NP-hard problem for which a known transformation to the Ising formulation exists. A well-known example of such problems is the Travelling Salesman problem. The solver may also be used for problems in other fields, for example in determining molecular similarity, for which work has been done to transform a graph similarity problem on graphical representations of molecules into a QUBO formulation. This work is described in Hernandez, Maritza, et al. “A quantum-inspired method for three-dimensional ligand-based virtual screening.” Journal of Chemical Information and Modeling 59.10 (2019): 4475-4485.
It will be appreciated that the above embodiments have been described by way of example. Other variants and applications of the disclosed techniques may become apparent to a person skilled in the art once given the disclosure of the concepts herein.
More generally, according to one aspect disclosed herein, there is provided apparatus for performing vector-by-vector multiplication in an optical domain, the apparatus comprising: a plurality of light signal generators, each arranged to emit a respective beam of light having a different respective carrier wavelength modulated with a respective input signal modelling a respective variable of a vector of variables;
In embodiments, the one or more optical combiner elements comprise one or both of: a diffraction element, and/or a lenslet array.
In embodiments, each of the light signal generators comprises a laser, the light beams being laser beams.
In embodiments, the apparatus is for performing vector-by-matrix multiplication, wherein: the apparatus comprises a plurality of said sets of light modulator elements; and the apparatus comprises one or more optical splitter elements arranged to split each of said beams of light so as to direct the respective optical signal to the corresponding light modulator element in each set, each set of light modulator elements modelling a different vector of weights from a matrix of weights.
In embodiments, the spatial light modulator elements are incorporated in a same plate of the spatial light modulator.
In embodiments, the spatial light modulator elements are arranged in a 2D array.
In embodiments, each set of light modulator elements is arranged in a respective row of the 2D array, and different light modulator elements within each set correspond to different columns of the 2D array, or vice versa.
In embodiments, the one or more optical splitter elements comprise a lenslet array.
In embodiments, the apparatus is configured to optimize a function comprising a weighted sum of a plurality of terms, each term comprising a product of a corresponding subset of the variables from said vector of variables, and each term being weighted by a corresponding weight from the matrix of weights which models interactions between the variables; wherein the apparatus is arranged into a plurality of channels, each channel comprising: a respective one of the light signal generators, a respective one of the sets of light modulator elements, and a respective one of the photosensor elements; wherein the output from the respective photosensor element of each channel represents a respective contribution of the respective variable to a Hamiltonian of the function; and wherein each channel further comprises a feedback path arranged to return a feedback signal based on the respective output back to the respective light signal generator, and the respective light signal generator is configured to adapt the respective input signal in dependence on the respective feedback signal, the light signal generators being configured to perform said adaptation such that the apparatus tends towards a state in which the energy of the Hamiltonian is minimized.
In embodiments, said function comprises an Ising problem.
In embodiments, said function comprises a QUBO problem.
In embodiments, the feedback path is arranged to introduce a noise component into the respective feedback signal before returning it to the respective light signal generator.
In embodiments, each set of light modulator elements is arranged to perform a respective vector-by-vector multiplication of a respective node in an artificial neural network, or the apparatus is arranged to perform a vector-by-matrix multiplication of a layer in an artificial neural network.
In embodiments, the respective input signal is modulated into each beam of light using amplitude modulation, and each photodetector element is arranged to produce its respective output signal by incoherent detection.
Another aspect disclosed herein provides a method for performing vector-by-vector multiplication in an optical domain, the method comprising: generating, by a plurality of light sources, a plurality of beams of light, each having a different respective carrier wavelength modulated with a respective input signal modelling a respective variable of a vector of variables; receiving each beam of light at one or more sets of light modulator elements; and for each set of light modulator elements: applying a weight from a vector of weights to each beam of light by a respective light modulator element to produce a respective weighted optical signal; and directing, by one or more optical combining elements, the weighted optical signal onto a respective photosensor element thereby producing a respective output in the form of an analogue electronic signal summing the weighted optical signals of the respective set.
In embodiments the method may further comprise steps in accordance with any of the system features disclosed herein.
Number | Date | Country | Kind |
---|---|---|---|
21155428.2 | Feb 2021 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/014174 | 1/28/2022 | WO |