The present technology concerns a wave propagation computing (WPC) device, such as an acoustic wave reservoir computing (AWRC) device, for performing computations by random projection. In some applications, the AWRC device may be used for signal analysis or machine learning.
The field of computer and information technology is being impacted simultaneously by two fundamental changes: (1) the types and quantity of data being collected are growing exponentially; and (2) the increase in raw computing power over time (i.e. Moore's Law) is slowing down and may stop altogether within 10 years.
It is estimated that human activity generates 2.5 quintillion (2.5×10¹⁸) bytes of data per day. Until recently, recorded data consisted mostly of text and sound, both of which are dense and form comparatively small data sets. Today, much of the recorded data consists of images and videos, which possess attributes (or features) numbering from the thousands to the millions and are typically extremely sparse. This state of affairs is often called Big Data.
Data is converted into useful information using computer hardware and algorithms. Traditional statistical analysis techniques were developed for smaller and denser data sets, and require prohibitive resources (in cost of computer hardware, and cost of electrical power) to process large and sparse data sets. To convert image and video data into useful information, we need new analysis algorithms (and the appropriate computer hardware) that do not consume prohibitive amounts of power.
Moore's Law has been the driving force behind the computer revolution.
One approach being pursued to process large and sparse data sets effectively despite the slowdown of computer chip development is the use of artificial (i.e. electronic) neural networks (NN) (cf. Jürgen Schmidhuber, “Deep learning in neural networks: An overview”, Neural Networks, Volume 61, January 2015, p. 85-117). Artificial neural networks are formed of electronic processing elements that are interconnected in a network that loosely mimics those found in the brain. Every electronic neuron outputs a weighted (and, optionally, non-linearly transformed) average of its inputs. The network as a whole transforms one or more inputs into one or more outputs. Artificial neural networks are already used to analyze pictures and video in demanding applications such as self-driving vehicles.
The feed-forward neural network was the first type of neural network to be invented (cf. Simon Haykin in Nonlinear Dynamical Systems: Feedforward Neural Network Perspectives, Wiley Publishers, ISBN: 978-0-471-34911-2, February 2001). In such a network, there are no cycles in the interconnect network. The information moves in one direction, from the inputs to the outputs, via the interconnected neurons. The neurons are typically organized in layers, with each layer taking as input the outputs of the previous layer. The inputs to the first layer are the network inputs. The outputs from the last layer are the network outputs. The intermediate layers are called “hidden layers”.
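The layered structure described above can be illustrated with a short, hypothetical sketch (a toy network with random, untrained weights; the dimensions and the tanh nonlinearity are illustrative assumptions, not taken from any cited work):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One feed-forward layer: a weighted sum of the inputs passed
    through a nonlinearity (tanh here)."""
    return np.tanh(w @ x + b)

# A toy 3-input network with one 5-neuron hidden layer and 2 outputs.
w1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)  # hidden layer
w2, b2 = rng.standard_normal((2, 5)), rng.standard_normal(2)  # output layer

x = np.array([0.1, -0.4, 0.7])   # network inputs
hidden = layer(x, w1, b1)        # outputs of the "hidden layer"
outputs = layer(hidden, w2, b2)  # network outputs
print(outputs.shape)             # (2,)
```

Information flows strictly from `x` through `hidden` to `outputs`, with no cycles, which is the defining property of a feed-forward network.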
Recurrent Neural Networks (RNN) are a class of neural network architectures. They were proposed to extend traditional neural networks to the modeling of dynamical systems by introducing cycles in the neuronal interconnections (cf. Zachary C. Lipton, John Berkowitz, Charles Elkan, “A Critical Review of Recurrent Neural Networks for Sequence Learning”, arXiv: 1506.00019 [cs.LG]).
Neural networks have traditionally been implemented on general-purpose CPUs. But neural network computational techniques are resource-intensive; even a moderately-complex neural network architecture requires significant computational resources: CPU time, memory and storage. For example, GoogLeNet (cf. Szegedy et al., “Going Deeper with Convolutions”, Proceedings of the Computer Vision and Pattern Recognition (CVPR) Conference, 2015) uses 22 layers and requires over 6.8 million parameters and 1.5 billion operations to characterize and run the network. This high resource requirement was one of the reasons that neural networks remained at the fringes of computer science until faster computer technology and cheap memory and storage became widely available.
One approach to reducing the computation time is to use graphical processing units (GPUs). GPUs were originally created to perform the specialized mathematical computations necessary for graphical processing, 3D modeling and graphical video games. They were later repurposed to handle the mathematics and the large data sets used in deep (multi-layer) neural networks (cf. A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks”, In Advances in Neural Information Processing Systems 25, pages 1106-1114, 2012). GPUs have become the standard hardware for production neural networks and are widely used in industry. However, GPUs consume hundreds of watts of power and must be placed on cooled racks, and are therefore not suitable for portable applications.
It is worth noting that GPUs are not optimized specifically for neural networks; they contain circuits that provide several other specialized functions for 2D and 3D image and video applications. An ASIC specifically designed and optimized for neural networks would yield benefits at economies of scale, at the cost of a significant up-front investment. The Google Tensor-Processing Unit (TPU) (cf. Google Cloud Platform Blog, https://cloudplatform.googleblog.com/2017/04/quantifying-the-performance-of-the-TPU-our-first-machine-learning-chip.html, retrieved on May 16, 2017) is one such ASIC. The TPU is smaller and less power-hungry than a GPU—about the size of a typical hard drive—but must still be rack-mounted. The architecture of the TPU is reported in U.S. Patent Publication Nos. 2016/0342889A1, 2016/0342890A1, 2016/0342891A1 and 2016/0342892A1.
Movidius, Inc. is a company that makes a vision processor that is programmable to different architectures (cf. David Moloney, “1TOPS/W software programmable media processor”, Hot Chips 23 Symposium (HCS), 2011). Movidius was recently acquired by Intel Corp. Movidius' technology is based on best architectural practices from GPUs, ASICs and FPGAs (Field-Programmable Gate Arrays).
Nervana Systems, Inc., a company that was recently acquired by Intel Corp. (cf. Jeremy Hsu, “Nervana systems: Turning neural networks into a service”, IEEE Spectrum, 53(6):19-19, 19 May 2016), has developed a custom ASIC that is interconnected with other ASICs and dedicated memory to perform neural network computations. The system is available as a cloud-based service.
However, to achieve high performance, digital circuit implementations of neural networks require the use of the most advanced, and therefore the most expensive, semiconductor circuit manufacturing technologies available to date, which results in a high per-unit cost of digital neural networks.
The digital approaches summarized above are all based on a Von Neumann architecture—programs and data are held in a common memory store, and an instruction and data operation cannot occur simultaneously. A different approach is to implement the neural network in analog circuits, with analog-to-digital and digital-to-analog conversion circuits to transfer inputs and outputs on and off chip. Analog circuits have the advantage over digital circuits that low-resolution analog computation circuits (such as adders and multipliers) require fewer transistors than corresponding-resolution digital computation circuits, and therefore have a lower cost of manufacturing and lower power consumption.
AT&T developed the ANNA neural network chip in the 1990s (cf. Eduard Sackinger et al., “Application of the ANNA Neural Network Chip to High-Speed Character Recognition”, IEEE Transactions on Neural Networks, 1992, 3(3):498-505). It was manufactured in 0.9 um CMOS technology, with 180,000 transistors that implemented 4096 synapses. It was capable of recognizing 1000 handwritten characters per second.
More recently, IBM Corp. has developed TrueNorth, a 5.4 billion transistor chip that combines 1 million spiking neurons and 256 million synapses (cf. Paul A. Merolla et al., “A million spiking-neuron integrated circuit with a scalable communication network and interface”, Science, 8 Aug. 2014, 345(6197):668-673). TrueNorth can process an input stream of 400-pixel-by-240-pixel video at 30 frames per second, and perform multi-object detection at a power consumption of 63 mW.
However, with every new generation of CMOS technology below 65 nm, analog circuits lose some of their performance advantage over digital circuits: the voltage gain of MOSFETs keeps decreasing, and the performance variability between MOSFETs that are designed to be identical keeps increasing. Both of these basic trends make analog circuits larger and more complex (to make up for the poor voltage gain or/and to correct transistor-to-transistor variability issues), which directly increases the size and power consumption of analog circuits, and thus decreases their performance advantage over digital circuits. As a result, analog circuit implementations of neural networks offer only a limited cost and power improvement over digital circuit neural networks, and alternate fabrication methods or/and computing schemes are needed.
Recently, there has been an interest in using optical technologies to perform certain computations for neural network applications. It is hoped that optics-based computational circuits will achieve a much higher speed of operation than electronic/semiconductor implementations of functionally-similar computational circuits.
Neural networks implemented in photonic systems derive the benefits of optical physics; linear transformation and some matrix operations can be performed entirely in the optical domain. For example, Coriant Technology and Twitter have described optical systems (cf. Yichen Shen et al., “Deep Learning with Coherent Nanophotonic Circuits”, arXiv: 1610.02365 Phys. Op., 7 Oct. 2016) that may achieve high speed of operation and low power consumption.
However, to date, optics-based implementations of neural network computation circuits are bulky and costly (compared to all-semiconductor implementations) due to the lack of very-small-size low-cost light modulation circuits (which are required to generate signals and perform multiplications), and it is not yet apparent how this basic technological limitation may be resolved.
Therefore, there is a need for improved computational circuits for neural networks and more generally machine learning applications, which combine low cost, small size, high speed of operation, and low power consumption.
Another mathematical technique used to analyze sparse data sets is random projection. It is a different technique from neural networks, but it is applicable to a wide range of machine learning, signal analysis and classification applications. Conceptually, random projection consists of transforming low-dimensionality input data into higher-dimensionality output data, such that the output data can easily be separated (or classified) into their constituent independent components.
As stated above, many types of very large data sets are very sparse in their attribute or feature space. This occurs because it is nearly impossible to obtain a uniform sampling of points along every attribute/feature axis. As a result, in the high-dimensional feature space (generally a Hilbert space), the vectors of very large data sets are distributed very sparsely. Further, Euclidean distances have a reduced significance: the ratio of the largest to the smallest distance in the feature space goes to 1 as the dimensionality approaches infinity (cf. D. L. Donoho, “High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality,” Lecture on Aug. 8, 2000, to the American Mathematical Society “Math Challenges of the 21st Century”; available from http://www-stat.stanford.edu/˜donoho/, retrieved on Dec. 10, 2016).
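This concentration of Euclidean distances can be checked numerically. The sketch below is a hypothetical illustration (the function name and sample sizes are arbitrary choices): it draws uniform random points and reports the ratio of the largest to the smallest pairwise distance, which falls toward 1 as the dimension grows.

```python
import numpy as np

def distance_contrast(n_points, dim, seed=0):
    """Ratio of the largest to the smallest pairwise Euclidean distance
    among n_points uniform random points in the unit cube [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_points, dim))
    sq = (x * x).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared distances
    d2 = d2[np.triu_indices(n_points, k=1)]           # unique pairs only
    return float(np.sqrt(d2.max() / d2.min()))

for dim in (2, 10, 100, 1000, 10000):
    print(dim, round(distance_contrast(200, dim), 2))
# The printed ratio falls toward 1 as the dimension grows.
```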
Random Projection (RP) and related methods like Principal Component Analysis (PCA) have been in use for decades to reduce the dimensionality of large datasets while still preserving the distance metric (cf. Alireza Sarveniazi, “An Actual Survey of Dimensionality Reduction”, American Journal of Computational Mathematics, 4:55-72, 2014).
The Johnson-Lindenstrauss lemma (cf. W. B. Johnson and J. Lindenstrauss, “Extensions of Lipschitz mappings into a Hilbert space”, Conference in Modern Analysis and Probability, Contemporary Mathematics, 26:189-206, 1984) forms the basis of random projection. It states that if points in a vector space are of sufficiently high dimension, then they may be projected into a suitable lower-dimensional space in a way which approximately preserves the distances between the points. That is, in random projection, the original d-dimensional data is projected to a k-dimensional (k<<d) subspace using a random k×d matrix.
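The projection by a random k×d matrix can be sketched in a few lines. The example below is a hypothetical illustration (the function name, the Gaussian entries scaled by 1/sqrt(k), and the chosen sizes n = 30, d = 3000, k = 1000 are assumptions for demonstration, not prescribed by the lemma):

```python
import numpy as np

def random_projection(data, k, seed=0):
    """Project rows of `data` (n x d) onto k dimensions with a random
    Gaussian k x d matrix, scaled by 1/sqrt(k) so that Euclidean
    distances are approximately preserved (Johnson-Lindenstrauss)."""
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    r = rng.standard_normal((k, d)) / np.sqrt(k)  # random k x d matrix
    return data @ r.T

rng = np.random.default_rng(1)
x = rng.standard_normal((30, 3000))    # 30 points in d = 3000 dimensions
y = random_projection(x, k=1000)       # projected to k = 1000 << d

# Compare pairwise distances before and after the projection.
dx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
dy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
mask = ~np.eye(30, dtype=bool)
ratio = dy[mask] / dx[mask]
print(ratio.min(), ratio.max())        # both close to 1
```

The ratios cluster tightly around 1, showing that the distance metric survives the dimensionality reduction.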
A somewhat related concept is that of random selection of features (cf. Sachin Mylavarapu and Ata Kaban, “Random projections versus random selection of features for classification of high dimensional data”, Proceedings of the 13th UK Workshop on Computational Intelligence (UKCI), 2013, 9-11 Sept. 2013).
Random projection is traditionally implemented on general-purpose computers, and so suffers from the limitations of common CPU systems: comparatively high power consumption, and a limited speed due to the CPU/DRAM memory bottleneck.
Recently, the company LightON has proposed to use optical processing to perform random projection (cf. A. Saade et al., “Random Projections through multiple optical scattering: Approximating kernels at the speed of light”, Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6215-6219). The projection is achieved by the scattering of laser light in random media. The transmitted light is collected by a camera that records the resulting interference pattern, which forms the output. Such a system would have a high data throughput, but the proposed implementation is large and comparatively costly. Also, the proposed implementation has a fixed (i.e. not programmable and not reconfigurable) response. Improved implementations of random projection are needed.
An alternate mathematical technique to analyze sparse datasets is reservoir computing. Reservoir computing networks are a type of recurrent neural network, but they are fundamentally different from multi-layer perceptron networks and therefore deserve to be considered as their own class of computing devices. Reservoir computing networks are applicable to a wide range of machine learning, signal analysis and classification applications.
In the context of machine learning, a reservoir is a group of nodes (or neurons), linear or nonlinear, that are interconnected in a network (see, e.g.,
Mathematically, a reservoir can be shown to perform a random projection of its input space onto its output space. A reservoir whose nodes and connections are linear is a linear time-invariant (LTI) system, and therefore can be thought of as performing Principal Component Analysis (PCA), which is a linear method of separating data into its components based on its eigenvectors. A reservoir whose nodes or/and connections are non-linear has much richer dynamics than a linear reservoir. Research into nonlinear reservoirs is broadly classified into two areas: Echo State Networks (ESN) and Liquid State Machines (LSM).
When used for a classification application, a reservoir computing device typically has a reservoir and an extra “output layer” of neurons that is fully connected to the reservoir neurons. The weights of the connections between this output layer and the reservoir neurons are selected to perform the desired classification operation. Effectively, the reservoir computes a large variety of non-linear functions of the input data, and the output neurons select those functions that achieve the desired classification operation. Classifiers built on reservoir computing networks have the advantage of being easier to train (i.e. configure) than multi-layer perceptrons and other traditional neural network architectures. For applications that involve the classification of time sequences of data, classifiers built on reservoir computing networks have the advantage of being much easier to train than MLP-type recurrent neural networks.
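This division of labor, a fixed reservoir computing many nonlinear functions of the input while only a trained output layer selects among them, can be sketched in software. The echo-state-style example below is a hypothetical digital analogue, not the acoustic device of the present disclosure; the reservoir size, spectral radius, input scaling, delay task and ridge constant are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 100, 1

# Fixed random reservoir: input weights, plus a recurrent matrix scaled
# to a spectral radius below 1 (the usual "echo state" condition).
w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
w = rng.standard_normal((n_res, n_res))
w *= 0.9 / np.max(np.abs(np.linalg.eigvals(w)))

def run_reservoir(u):
    """Drive the reservoir with the scalar input sequence u and collect
    the tanh neuron states at every time step."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(w @ x + w_in @ np.atleast_1d(u_t))
        states.append(x.copy())
    return np.array(states)

# Toy task: reproduce the input delayed by 5 steps (a memory test).
u = rng.uniform(-1, 1, 500)
target = np.roll(u, 5)
states = run_reservoir(u)

# Only the output layer is trained, here by ridge regression on the
# reservoir states; the reservoir itself is never modified.
s, t = states[50:], target[50:]        # discard the initial transient
w_out = np.linalg.solve(s.T @ s + 1e-6 * np.eye(n_res), s.T @ t)
pred = s @ w_out
print("training RMSE:", np.sqrt(np.mean((pred - t) ** 2)))
```

Note that training touches only `w_out`, a single linear solve, which is why reservoir-based classifiers are much easier to train than fully trained recurrent networks.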
Reservoir computing networks have been implemented primarily on digital electronic circuits, including general-purpose computers, ASICs, FPGAs and other specialized digital processors. For example, U.S. Pat. No. 7,321,882 (“Method for supervised teaching of a recurrent artificial neural network”) issued to Herbert Jaeger teaches how an ESN can be implemented on digital computers. However, when used for classification applications, this type of implementation involves a massive amount of computation (the state of every neuron in the reservoir must be explicitly computed at every time step) even though only a small number of those computations is required to perform the classification task. As a result, digital electronic implementations of reservoir computing networks have comparatively high power consumption. In the present document, we refer to implementations of reservoir computing networks on digital electronic circuits as digital reservoirs.
Reservoir computing networks have also been implemented using optical techniques, but many such demonstrations use an optical fiber as reservoir (i.e. the reservoir of randomly- and recurrently-connected neurons is replaced by a simple delay line), which reduces the functionality and computational capability of these networks. Other optics-based demonstrations use optical components that are comparatively large and costly. Finally, reservoir computing networks have been implemented using water as reservoir, but such demonstrations do not scale beyond limited proof-of-concept examples.
Embodiments of the present technology may be directed to an acoustic-wave reservoir computing (AWRC) device that performs computations by random projection. In some embodiments, the AWRC device is used as part of a machine learning system or as part of a more generic signal analysis system. The AWRC device takes in multiple electrical input signals and delivers multiple output signals. It performs computations on these input signals to generate the output signals. It performs the computations using acoustic (or electro-mechanical) components and techniques, rather than using electronic components (such as CMOS logic gates or MOSFET transistors) as is commonly done in digital reservoirs.
One aspect of the present disclosure relates to a wave propagation computing (WPC) device for computing random projections, the WPC device having: an analog random projection medium; a plurality of boundaries that demarcate at least one active region in the medium as one or more cavities; and a plurality of transducers connected to the medium, the plurality of transducers including at least one transducer to convert an electrical input signal into signal waves that propagate in the medium, and at least one transducer to convert the signal waves that propagate in the medium into an electrical output signal.
In some embodiments, the medium has asymmetric geometric boundaries. In some embodiments, the medium provides non-linear propagation of the signal waves. In some embodiments, the medium provides a multi-resonant frequency response over at least one decade in frequency. In some embodiments, the medium has a piezoelectric material. In some embodiments, the medium has a thin-film piezoelectric material. In some embodiments, the medium has internal and/or external impedance discontinuities. In some embodiments, the impedance discontinuities are one or more of structure and material discontinuities. In some embodiments, the medium has one or more of a through hole, a partial hole, a local thickness increase, or a particulate/material inclusion. In some embodiments, the medium comprises two or more media. In some embodiments, the medium is demarcated by a plurality of surfaces to reflect the signal waves, the plurality of surfaces forming a three-dimensional structure. In some embodiments, the medium has a tunable propagation medium with one or more material properties that can be altered after manufacturing in a repeatable manner. In some embodiments, the material properties are one or more of a coefficient of a stiffness matrix, a modulus of elasticity, a Poisson ratio, or a wave velocity. In some embodiments, the material properties can be altered by the application of an electric field.
In some embodiments, a transducer of the plurality of transducers provides a non-linear electrical output signal. In some embodiments, a transducer of the plurality of transducers is a microelectromechanical systems (MEMS) device. In some embodiments, at least two of the transducers are electrically connected via an optional external circuit to form a feedback path. In some embodiments, at least two of the transducers are electrically connected via an optional external circuit to form a self-test path. In some embodiments, the transducers are positioned along a lateral periphery of the medium. In some embodiments, a transducer is positioned within an interior of the medium. In some embodiments, the transducers are positioned across a surface of the medium.
In some embodiments, the signal waves are acoustic waves. In some embodiments, the signal waves are elasto-acoustic waves. In some embodiments, the signal waves are electromagnetic waves.
In some embodiments, the WPC device further has a substrate and a suspension structure connecting the cavity to the substrate, wherein the suspension structure isolates the cavity from the environment. In some embodiments, the medium is formed by a Micro-Electro-Mechanical Systems (MEMS) thin-film structure.
Another aspect of the present disclosure relates to a compound WPC device having two or more WPC devices and an interconnect architecture connecting the two or more WPC devices. In some embodiments, the interconnect architecture is a MEMS structure. In some embodiments, the interconnect architecture is a circuit. For example, the compound WPC device can be configured to operate on different portions of a data sample simultaneously, on multiple data samples simultaneously, or any combination thereof. The interconnect architecture is configurable by the skilled person to achieve device performance objectives.
Yet another aspect of the present disclosure relates to a method for performing computations with an analog random projection device, the method includes: sending a plurality of electrical input signals to a plurality of input transducers connected to an analog random projection device, wherein the input transducers convert the electrical input signals into signal waves to propagate in a medium of the analog random projection device; physically propagating the signal waves within the medium; and receiving a plurality of electrical output signals from a plurality of output transducers connected to the medium, wherein the output transducers generate the electrical output signals from the signal waves that propagate in the medium.
In some embodiments, the method further includes processing the electrical output signals to perform one or more signal processing or machine learning operations. In some embodiments, the medium has an asymmetric geometry. In some embodiments, the medium has impedance discontinuities. In some embodiments, at least two of the transducers are electrically connected to form a feedback path.
Embodiments of the present disclosure are described in detail with reference to the drawing figures wherein like reference numerals identify similar or identical elements. It is to be understood that the disclosed embodiments are merely examples of the disclosure, which may be embodied in various forms. Well-known functions or constructions are not described in detail to avoid obscuring the present disclosure in unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
In the following, we describe acoustic (or elastic, or electro-mechanical) reservoirs, but one skilled in the art will understand that these descriptions apply equally to other types of physical reservoirs based on electromagnetic, thermomechanical, diffusion, or other physical operating principles whereby waves can propagate in a cavity demarcated by boundary conditions. For example, an electromagnetic reservoir can be designed and fabricated to exhibit a response similar to that of the AWRC device, and therefore can be exploited for the high-dimensional projection of input data. As used herein, propagation of signal waves in such devices may be understood to constitute any of dispersal, radiating, scattering and/or broadcasting of the signal waves. In this document, the word “acoustic” and the word “elasto-acoustic” are used. In physical systems, “acoustic” may imply waves in air/gas, i.e. sound, whereas “elasto-acoustic” may imply waves in solid/liquid material/medium. In this document and depending on its context, use of the word “acoustic” may be generally understood to concern “elasto-acoustic” and, in particular, the propagation of waves within a solid and/or liquid medium. Nevertheless, the present technology may also be applied with devices that propagate waves in air/gas as in sound waves.
As shown in
The acoustic cavity 85 may be made of a material (or propagation medium) that supports the propagation of acoustic waves (e.g. 92 and 97). Acoustic waves are also called elasto-acoustic waves or elastic waves. The cavity may be designed such that waves reflect at impedance discontinuities located at the outer boundaries of the cavity, as well as at impedance discontinuities deliberately located within the cavity, e.g. 95 and 96. Such reflections 97 yield multiple wave propagation paths (e.g. 93 and 94) between the input transducers (e.g. 82) and the output transducers (e.g. 83). The reflections, as well as wave dispersion, mode conversion, reservoir tuning, the use of a non-linear propagation medium, the appropriate selection of transducer location, etc. contribute to achieving a rich dynamic response.
As shown in
As shown in
The AWRC device can also include electronic circuits, such as input interface circuits (e.g. 88) that drive the input transducers, or/and output interface circuits (e.g. 89) that amplify the low-level signals generated by the output transducers. These circuits can be linear or non-linear, and they can substantially affect the overall functionality or response of the AWRC device, as is readily appreciated by one skilled in the art. In addition, the AWRC device can include other circuits, such as circuits to establish feedback loops around the reservoir (e.g. 99).
One method of operation of an AWRC device is depicted in
As depicted in
Likewise, the acoustic wave 106A propagates through the cavity, reflects at the outer boundaries of the cavity, and reaches output transducer 107B via multiple propagation paths, as depicted on
In
In some applications of the AWRC device, the input signals may be more complex than depicted here, so the output signals may be much more complex as well. In addition, in some AWRC devices, impedance discontinuities may be incorporated into the cavities, so many more paths of propagation may exist. Also, in some AWRC devices, the propagation waves undergo dispersion, mode conversion, attenuation, and other linear and non-linear transformations, which are not depicted in
In some applications, input signals 101A and 101B may be applied simultaneously to input transducers 105A and 105B, so the waves 106A and 106B generated by input transducers 105A and 105B propagate simultaneously in the cavity 103 and combine to yield output signals 102A and 102B, as depicted on
AWRC devices can perform random projection on any data that can be encoded into a time series. This includes, without limitation, speech processing, image and video classification, sequence-to-sequence learning, autoencoders, and sensor data fusion applications.
Note that, depending on the nature of the data and the application of the AWRC device, the output signals can be read from the output transducers simultaneously with the input signals being applied to the input transducers, or after some delay. For example, for applications that involve a continuous input data stream (that is, input data that is continuous over time, or continuous over a duration that is much longer than the memory duration of the AWRC device), such as when an AWRC device is used to detect a pattern in a continuous sensor data feed, the input signals are applied to the input transducers continuously, and the output signals are read from the output transducers simultaneously, or after a brief delay that allows the acoustic waves to propagate in the cavity. Alternately, for applications that involve discrete input data sets (that is, input data that is applied to the input transducers in a time comparable to or shorter than the memory duration of the AWRC device), such as when an AWRC device is used to detect a pattern in data sets generated by different sensors (so each data set is discrete, as defined above), the operation proceeds in phases: first, the input signals are applied to the input transducers; then the acoustic waves are allowed to propagate in the cavity (and, optionally, feedback is applied, as described below); then the output signals are read from the output transducers; and finally the acoustic waves are allowed to dissipate so the reservoir can return to a quiescent state before the next input data set is applied.
Note also that AWRC devices are not Turing machines because their operation is not defined by a sequential algorithm. In addition, they do not implement a von-Neumann computing architecture, because there is no separation between the “program” and the data.
A desirable characteristic of the response of an AWRC device is its randomness. Excluding random noise, the response of AWRC devices is deterministic. However, the response of any given AWRC device is random in the sense that it is difficult to ascertain accurately except by direct measurement. Thus, the response of AWRC devices is pseudo-random. The response is highly complex (as discussed below) and it is determined by the shape of the cavity, the location of impedance discontinuities within the cavity, the location of the input and output transducers, etc. The pseudo-random response of an AWRC device can be changed by varying its cavity shape, the location of impedance discontinuities within its cavity, the location of its input and output transducers, etc.
To provide an extra degree of pseudo-randomness, it is also possible to randomly assign the device's input and output signals to the reservoir's input and output transducers.
In addition, the pseudo-random nature of AWRC devices can be enhanced by deliberately allowing random defects to be built in the cavity during manufacturing. Such defects can include material discontinuities, particulate inclusions, vacancies, etc. The defects induce impedance discontinuities, which in turn cause wave reflection, wave dispersion, mode conversion, etc.
In addition, for applications where true randomness is required, the amplitude of the input signals can be decreased until noise generated by the cavity, the input transducers or the output transducers yields the desired signal-to-noise ratio.
Another desirable characteristic of the response of an AWRC device is its dimensionality. A high-dimension response is often preferred for random projection operations. As used herein, an AWRC device with a high-dimension response has a rich dynamic response. The richness of the reservoir dynamics depends on the number of modes that are stimulated by the applied input signals.
To achieve a rich dynamic response, a cavity may be designed to achieve a proper tradeoff between the number of modes and the quality factor (Q). At one extreme, a cavity could support a single mode (i.e., the cavity would behave like a single-mode transmission line or a single-mode resonator) with a very high quality factor. At the other extreme, a cavity could support thousands of modes, with each mode having a very low Q. At either extremum, the input signals would not be projected onto a sufficient number of usable dimensions for random projection. Therefore, it is desirable to design the AWRC device reservoir to operate between these extrema so as to achieve a rich dynamic response. Design methods that achieve a suitably rich dynamic response are discussed below.
In a physical reservoir, a material element (e.g. an atom of the propagation medium) is connected locally to neighboring material elements in the cavity. Thus, the projection matrix of an AWRC device has fixed local connectivity, whereas the projection matrix of a digital reservoir (e.g. an Echo State Network) has global connectivity (i.e. every node of the reservoir is connected to every other node).
In addition, the input and output transducers (i.e. the input and output ports) of an AWRC device are connected to select locations of the cavity (where the transducers are physically located), whereas the input and output ports of a digital reservoir are connected to every node in the reservoir. The limited connectivity of AWRC device input and output transducers can achieve a form of dropout.
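The contrast between local and global connectivity can be illustrated with a short sketch; the node count and coupling bandwidth below are arbitrary illustrative values, not device parameters:

```python
import numpy as np

# Sketch of local vs. global connectivity: a physical reservoir couples each
# node only to nearby nodes (banded matrix), while a digital reservoir such as
# an Echo State Network couples every node to every node (dense matrix).
rng = np.random.default_rng(0)
n = 100          # number of reservoir nodes (illustrative)
bandwidth = 3    # local coupling range (illustrative)

i, j = np.indices((n, n))
local_mask = np.abs(i - j) <= bandwidth
W_local = rng.standard_normal((n, n)) * local_mask    # local connectivity
W_global = rng.standard_normal((n, n))                # global connectivity

local_fraction = np.count_nonzero(local_mask) / local_mask.size
```

In this sketch fewer than 10% of the possible node-to-node couplings are nonzero in the local case, while the digital reservoir matrix is fully dense.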
Further, the cavity of an AWRC device may not have gain, so its response may be intrinsically dissipative and bounded, whereas the spectral radius of the connection matrix of digital reservoirs must be normalized to ensure that the response of the reservoir does not diverge. An additional benefit of the finite quality factor of the cavity of an AWRC device is that the reservoir noise can perform L2 regularization.
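The spectral-radius normalization that a digital reservoir requires, and that a dissipative cavity provides by construction, might be sketched as follows (the matrix size and the 0.9 target radius are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
W = rng.standard_normal((n, n))          # digital reservoir connection matrix

# A digital reservoir must rescale W so its spectral radius is below 1,
# otherwise the reservoir state can diverge; a dissipative acoustic cavity
# is bounded by construction and needs no such step.
rho = np.max(np.abs(np.linalg.eigvals(W)))
W_stable = 0.9 * W / rho                 # target spectral radius 0.9 (assumption)

x = rng.standard_normal(n)
for _ in range(200):
    x = np.tanh(W_stable @ x)            # bounded nonlinearity keeps state finite
state_norm = float(np.linalg.norm(x))
```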
An AWRC device can perform (1) the random projection of the input space onto a higher-dimensional space, followed by (2) a random projection onto the output space. As a consequence, the outputs are a frame of the inputs (as described below).
Physical systems such as an acoustic-wave reservoir have finite energy, therefore the signals that are generated, propagated, dissipated, reflected and received within a physical system all have finite energy. All the possible signals form a special Hilbert space called an L2 space, or Signal Space. Signal Space can be spanned by an orthonormal basis formed by the coefficients of a Fourier series. However, using an orthonormal basis is not always preferred. Sometimes the noise in the signal or signal loss causes the basis to cease being orthonormal. Therefore, in the signal processing state of the art, “frames” (cf. Jelena Kovacevic and Amina Chebira, “An Introduction to Frames”, Foundations and Trends in Signal Processing, 2(1):1-94, 2008) are used. Frames are sets of vectors that span the space but need not form an orthonormal basis. Since a frame is not an orthonormal basis, a vector in the space can be represented in more than one way. This redundancy allows for fault tolerance and noise mitigation (cf. Stephane Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, Academic Press, Dec. 25, 2008). This is a key advantage of using acoustic wave reservoirs to perform random projection. It is possible to select a frame that spans the Signal Space using methods known by those skilled in the art.
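As an illustration of frame redundancy, the classic three-vector “Mercedes-Benz” frame in the plane is a tight frame: three coefficients redundantly represent a two-dimensional vector, yet the vector can be reconstructed exactly. The example below is a sketch, not a construction from this disclosure:

```python
import numpy as np

# Three unit vectors 120 degrees apart (the "Mercedes-Benz" frame) form a
# tight frame for R^2: a redundant set that is not an orthonormal basis but
# still allows exact reconstruction of any vector in the plane.
angles = np.pi / 2 + np.array([0.0, 2.0, 4.0]) * np.pi / 3
F = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # 3 x 2 analysis matrix

x = np.array([0.7, -1.2])
coeffs = F @ x            # three redundant frame coefficients for a 2-vector

# For this frame the frame operator F^T F equals (3/2) * I, so reconstruction
# is simply x = (2/3) * F^T coeffs.
x_rec = (2.0 / 3.0) * (F.T @ coeffs)
```

Because the representation is redundant, corrupting or losing one coefficient still leaves enough information to approximate the original vector, which is the fault-tolerance property noted above.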
Consider what happens when a signal is injected into a physical reservoir, such as an electromechanical reservoir, via one or more input transducers. The injected signal stimulates one or more modes of the reservoir. The elastic wave propagates through the reservoir—in the bulk and/or the surface of the material. At internal or external impedance discontinuities, the wave is simultaneously reflected and transmitted. The amount of reflection and transmission depends on the impedance mismatch at the interface. In addition, wave propagation causes the generation of secondary waves that propagate along other directions, as a consequence of off-axis elements in the stiffness matrix.
The wave is received at one or more output transducers. The signals read out by the output transducers are linear combinations of the input signals. Thus, the outputs are not single Fourier series components. Rather, they are frame expansions over a (non-orthonormal, redundant) spanning set of the Signal Space.
Let xj, j=1 . . . M be the inputs to the electromechanical reservoir. Let yi, i=1 . . . N be the outputs from the reservoir. Then, the Fourier expansion of each input is of the form:

xj(t) = Σk ak·e^(iωk·t)
where coefficients ak are the coefficients of the complex frequency components ωk of the reservoir. A similar expansion can be written for the outputs yi. However, the coefficients of the terms in yi are redundant linear combinations of the corresponding coefficients in xj (i.e. a frame):

yi(t) = Σk bk·e^(iωk·t)
where coefficients bk are not necessarily equal to ak, and coefficients cj describe a (possibly non-unique) relationship between the frame bases and the orthonormal input bases. The set of coefficients bk forms the frame for that particular output yi. Note that frequency components in a particular output are repeated across multiple outputs. Taken together, the outputs are a redundant representation of the input space, i.e. a “cluster ensemble” (cf. X. Z. Fern and C. Brodley, “Random projection for high dimensional data clustering: A cluster ensemble approach”, Proceedings of the Twentieth International Conference on Machine Learning, 2003).
A frame is more formally defined as a set of vectors F={fi} for which there exist constants A and B, 0<A≤B<∞, such that:

A·‖x‖² ≤ Σi |⟨x, fi⟩|² ≤ B·‖x‖²
for every vector x in the Hilbert space.
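For a finite set of vectors, the frame bounds A and B are the extreme eigenvalues of the frame operator, i.e. the squared smallest and largest singular values of the analysis matrix. The sketch below uses arbitrary random vectors for illustration:

```python
import numpy as np

# Sketch: the frame bounds A and B of a finite set of vectors (the rows of F)
# are the extreme eigenvalues of the frame operator F^T F, i.e. the squared
# smallest and largest singular values of F.
def frame_bounds(F):
    s = np.linalg.svd(F, compute_uv=False)   # singular values, descending
    return s[-1] ** 2, s[0] ** 2

rng = np.random.default_rng(2)
F = rng.standard_normal((8, 3))   # 8 random frame vectors in R^3 (illustrative)
A, B = frame_bounds(F)
```

Checking the defining inequality for a test vector confirms that the computed A and B bracket the summed squared inner products.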
In the embodiment of the Acoustic Wave Reservoir Computing depicted on
As mentioned above,
Signals are carried in the AWRC device of
In this document, a thin-film piezoelectric cavity is depicted in many of the figures. However, it is readily understood that a cavity can be designed to operate based on electromagnetic, thermomechanical, diffusion, or other physical operating principles. Further, the cavity can have a full 3D shape.
Signals are carried in the cavity by Lamb waves and Shear-Horizontal waves. The elastic wave equation describes the displacement u of a wave in terms of spatial coordinates x and time t:

∂²u/∂t² = c²·∇²u
where c is the propagation velocity and ∇² is the Laplace operator over the spatial coordinates x.
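A minimal one-dimensional finite-difference sketch of this wave equation shows a pulse propagating and reflecting inside a bounded cavity; the grid size, velocity, initial pulse, and boundary treatment are illustrative assumptions:

```python
import numpy as np

# 1D finite-difference sketch of the elastic wave equation u_tt = c^2 * u_xx
# with clamped (reflecting) ends; parameters are illustrative assumptions.
nx, nt = 200, 300
c, dx = 1.0, 1.0
dt = 0.5 * dx / c                  # satisfies the CFL stability condition

grid = np.arange(nx)
u = np.exp(-0.05 * (grid - nx // 2) ** 2)   # initial Gaussian displacement pulse
u_prev = u.copy()                           # zero initial velocity

for _ in range(nt):
    lap = np.zeros(nx)
    lap[1:-1] = u[:-2] - 2.0 * u[1:-1] + u[2:]
    u_next = 2.0 * u - u_prev + (c * dt / dx) ** 2 * lap
    u_next[0] = u_next[-1] = 0.0            # clamped boundaries reflect the wave
    u_prev, u = u, u_next
```

The clamped ends play the role of the cavity's outer boundary: the initial pulse splits, travels outward, and reflects back, which is the mechanism by which a bounded cavity mixes and "remembers" its inputs.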
Lamb waves are elastic waves in a plate. Their particle motion lies in the plane that contains the direction of wave propagation and the plate normal. They are guided waves because they are constrained by the geometry of the media in which they propagate. In an infinite solid plate of thickness d, the sinusoidal solutions of the elastic wave equation are of the form:
u = Ax(z)·e^(i(ωt−kx))

v = Az(z)·e^(i(ωt−kx))

where u and v represent the x- and z-axis displacements, ω is the angular frequency, k is the wavevector, and Ax(z) and Az(z) are the amplitude profiles through the plate thickness. The wave propagates along the x-axis with a frequency of ω/2π. Lamb waves therefore have no motion in the y-direction. Motion in the y-direction is caused by Shear-Horizontal (SH) waves which, together with Lamb waves, can propagate with straight wave fronts.
Two important modes, S0 and A0, are noted here for their ability to exist over the entire frequency spectrum. They are also called the Extensional mode and Flexural mode, respectively. The AWRC device makes use of both of these modes for operation. However, it is understood by one skilled in the art that higher-order modes exist and can be adapted for use in such devices.
An AWRC device can be designed to use other waves than the laterally-propagating waves discussed above. Conventional elastic theory makes an assumption of infinitesimally small rotational gradients at each material point. Allowing for non-zero rotations adds geometric (kinematic) nonlinearity into the elastic behavior of the cavity. A properly configured transducer can launch or sense such rotational traveling waves, as depicted on
3. Size of the Cavity
The dimensions of the cavity, along with the acoustic properties of the cavity material, determine the memory duration of an AWRC device, i.e. how long the input signals are remembered (just as the memory of a transmission line increases with its length) and projected. The reservoir may be large enough to accommodate all the inputs (if those are applied over a period of time) and outputs (if those are read out over a period of time), while simultaneously not being so large that the signals attenuate excessively before the computation is complete. Manufacturing considerations have an effect on the bounds on the cavity size.
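A back-of-envelope estimate of memory duration can use the common ring-down rule of thumb that a resonance's amplitude decays to 1/e in roughly Q/(π·f) seconds. The frequency, Q, and velocity values below are illustrative assumptions, not values from this disclosure:

```python
import math

# Back-of-envelope memory-duration estimate; frequency, Q, and velocity are
# illustrative assumptions, not device values.
f = 100e6          # operating frequency: 100 MHz
Q = 1000           # effective quality factor of the cavity
c = 8000.0         # acoustic velocity in m/s (within the 1,000-15,000 m/s range)

# Ring-down rule of thumb: amplitude decays to 1/e in roughly Q/(pi*f) seconds.
tau = Q / (math.pi * f)            # a few microseconds of memory
path_length = c * tau              # distance a wave travels before decaying
```

With these numbers the wave travels a few centimeters before decaying, which sets a natural scale against which the cavity dimensions can be sized.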
An AWRC device may achieve a rich dynamic response by the use of a complex cavity shape. A complex cavity shape combined with proper placement of the input transducers, the output transducers, the impedance discontinuities located in the cavity, etc., can result in a high number of propagation paths for the waves that travel in the cavity, with a wide range of reflections. Reflections (which yield mode conversions), wave dispersion, and the other phenomena discussed below result in a rich dynamic response.
The cavity geometry can be convex or non-convex. In many situations, convex geometries are the most suitable configuration. Non-convex geometries can be used if it is known (or desired for the application under consideration) that some subset of inputs have only a weak relationship with a subset of the outputs. In such a situation, it can be advantageous to place the transducers for these inputs in a portion of the cavity that is topologically weakly connected to the rest of the cavity.
An AWRC device may achieve a rich dynamic response by the use of designed impedance discontinuities within the cavity (e.g. 95 and 96 on
Impedance discontinuities within the cavity induce reflections and mode conversions, and add to the reflections and mode conversions induced by the outer boundaries of the cavity. To enhance the richness of the dynamic response of the AWRC device, the internal impedance discontinuities are located randomly in the cavity. However, it has been shown (cf. Abel Klein and Andrew Koines in “A General Framework for Localization of Classical Waves: I. Inhomogeneous Media and Defect Eigenmodes”, Mathematical Physics, Analysis and Geometry, 2001, 4(2):97-130, and “A General Framework for Localization of Classical Waves: II. Random Media”, Mathematical Physics, Analysis and Geometry, 2004, 7(2): 151-185) that a very disordered medium can cause the wave energy to become partially trapped in a part of the cavity, and not interact with the rest of the cavity. To avoid this condition, the nonuniformity should be designed with care.
An AWRC device may achieve a rich dynamic response by use of wave mode conversion in the cavity. Mode conversion is a property of elastic waves propagating in a medium. When the propagating (say, longitudinal) wave is incident on an internal or external boundary, the longitudinal wave is reflected. In addition, due to transverse motion at the boundary, SH waves are generated. These SH waves propagate to generate further longitudinal and/or SH waves at other boundaries. The degree of mode conversion is dependent on the boundary geometry (both internal and external), as well as the Poisson ratio of the materials in the media.
The richness of the dynamic response of an AWRC device may be enhanced by the use of wave diffraction in the cavity. A wave undergoes diffraction when it encounters an obstacle (created by a localized impedance discontinuity) that bends the wave, or when the wave spreads after emerging from an opening (also created by an impedance discontinuity). The diffraction effects are more pronounced when the size of the obstacle or opening is comparable to the wavelength of the wave.
The richness of the dynamic response of an AWRC device may be enhanced by the use of wave dispersion in the cavity. Elastic waves propagating in a medium exhibit dispersion when the velocity of wave propagation c depends on frequency ω, and/or on the material properties of the medium. Dispersion results in waves separating into their constituent frequencies as they propagate through the medium.
The richness of the dynamic response of an AWRC device can be enhanced by the use of a non-uniform propagation medium. The cavity can be composed of several separate regions (larger than the localized impedance discontinuities discussed above), each made of a separate medium, so the propagating waves undergo refraction as they pass from one region into another.
The richness of the dynamic response of an AWRC device can be enhanced by the use of a non-linear propagation medium. Nonlinearity can be introduced when the cavity is made of a piezoelectric material, in which case the mechanical response and the electrostatic response of the propagating medium are coupled, and the stiffness matrix of the medium is nonlinear.
Furthermore, non-linearity can be created when the input signals are strong enough to push the cavity or/and transducer materials out of their elastic-response regime, or to saturate the strain distribution in the cavity.
The richness of the dynamic response of an AWRC device can be enhanced by the use of feedback. The addition of explicit non-linear feedback into the reservoir causes it to cease to be an LTI system, but can enhance the richness of its dynamic response and thereby introduce new capabilities into its projection abilities.
Electrical feedback can be provided from one or more output transducers (e.g. 98B on
The electrical feedback operation can have gain; it can be linear or non-linear; it can be instantaneous or time-delayed, etc. Feedback can also be provided by mechanical connections or by other means.
In the context of machine learning, output-to-input feedback can be thought of as “teacher forcing” wherein the target is fed back into the input. This feature can be disabled during training or open loop operation. It is also possible to duty cycle these connections to mix teacher forcing and training.
The richness of the dynamic response of an AWRC device can be enhanced by the use of a tunable propagation medium. We refer to a tunable propagation medium as a medium that has a propagation-related property that can be altered (or changed or tuned) after manufacturing, in a repeatable manner. For an acoustic medium, the material properties that affect wave propagation include the coefficients of the stiffness matrix, modulus of elasticity, Poisson ratio, wave velocity, etc. Material properties can be altered by electrical or magnetic or optical or thermal means.
Certain material property tuning processes are extremely fast—of the order of nanoseconds—and can enable new capabilities in an AWRC device-based machine learning system.
Certain material properties can be altered in a continuous manner, others in a discrete manner. In both cases, tunability can be exploited as a switch. A reservoir that incorporates a tunable material can be switched from one state to another. Thus, it is possible to “store” two or more different projections indexed by the tuning control means.
If the material property change is continuous, the material properties can be perturbed or dithered, and the effect on the operation of the AWRC device (and, if that is the case, on the learning process itself) can be observed.
In an AWRC device, input signals may be applied to piezoelectric transducers to create traveling waves, via the reverse piezoelectric effect, that couple into the reservoir cavity. Waves from the cavity are sensed and read out by piezoelectric transducers using the direct piezoelectric effect. In the following discussion, piezoelectric transducers are assumed, but other transducers can perform similar functions.
Piezoelectric transducers convert electrical voltage or current to strain energy at the inputs, and strain energy to electrical voltage or current at the outputs. Piezoelectric transduction typically uses a single mode to perform the transduction. Modes that are most often used are the TE-mode or the LE-mode. In the TE-mode, the displacement or strain is in the direction of the applied electrical field (for the reverse piezoelectric effect); or the generated voltage is in the direction of the applied strain (for the direct piezoelectric effect). In the LE-mode, also known as a lateral, or contour, or Lamb mode, the strain is perpendicular to the direction of the applied electric field (and vice versa for the direct piezoelectric effect). For both modes, as a consequence of the Poisson ratio of the material, a strain in one direction results in a related strain in the orthogonal directions. TE-mode transducer embodiments can use this property to launch inputs and receive outputs from the cavity. Other modes, including without limitation bulk and surface modes, can be used for transduction.
The transducers can be designed to achieve a specific response, i.e. to generate and couple into the reservoir a wave of specific characteristics. Several transducer parameters can be designed: the transducer's location, its total area, its shape (e.g. a point, a line or area port to the cavity), its connection to the cavity, etc. In addition, transducers can be designed to be operated in groups. For example, multiple transducers (whether adjacent or not) can be coupled or activated together or with a certain time delay so that the acoustic wave is launched into the reservoir along a preferred axis or direction, or with a specific propagation lag or phase shift between the various transducers.
With a thin-film cavity manufactured with common thin-film MEMS processes, transducers can be readily placed on the lateral periphery of a cavity, as depicted in
It can be advantageous to define transducers at many locations of the cavity, then select a random subset of transducers as input transducers and output transducers as a method to increase the variety of response dynamics of the reservoir.
Further, it can be advantageous to locate transducers across (on the top or the bottom of) the cavity, for example in cases where the AWRC device is to be used with 2D input data, such as images.
In many of the embodiments presented herein, the piezoelectric transducers operate either as input transducers or as output transducers. In many applications, the input transducers are connected to driver amplifiers or other electrical or electronic circuits to filter, amplify, or otherwise shape the waves generated by these input transducers. Likewise, in many applications, output transducers are connected to low-noise amplifiers or other electrical or electronic circuits to amplify, filter, or otherwise shape the output signals.
Alternately, because of the duality of the forward and reverse piezoelectric effects, piezoelectric transducers can be operated sequentially as input transducers and output transducers. Sequential input/output operation is possible when the transducers (and the AWRC device) are designed to sense the output signals at different times than when the input signals are applied. In such a sequential mode of operation, for example, the output signals may be sensed after the input signals are applied; alternately, the output signal sensing operations may be time-interleaved with the application of the input signals. A sequential dual-functionality transducer may be connected to the appropriate data source (optionally through a driver amplifier) when it is operated as an input transducer, and it may be connected to the appropriate data sink (optionally through a low noise amplifier) when it is operated as an output transducer. The connections are switched according to the then-current function of the transducer. Note that the dual functionality concept applies to other types of transducers such as inductive transducers, which can be used in electromagnetic-wave reservoir computing devices.
Alternately, in networked reservoir applications, certain piezoelectric transducers can be used as simultaneously bi-directional transducers to connect separate reservoirs to form a compound reservoir. As depicted in
Transducers in AWRC devices are typically used to apply input signals to the reservoir and read out output signals from the reservoir, to perform random projection computations.
However, transducers can be used for test and characterization purposes. Transducers can be used to measure or monitor the properties of the reservoir, such as wave velocity, transducer gain, loss in the cavity, etc. A simple time-of-flight measurement between two test transducers can be used to characterize the wave velocity of the cavity.
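A time-of-flight measurement of this kind can be sketched by cross-correlating a transmitted test pulse with the copy received at a second transducer. The sample rate, transducer spacing, and simulated delay below are illustrative assumptions:

```python
import numpy as np

# Time-of-flight sketch: estimate wave velocity from the delay between a
# transmitted test pulse and its copy received at a second transducer.
# Sample rate, transducer spacing, and the simulated delay are assumptions.
fs = 1e8                     # sample rate: 100 MHz
distance = 0.004             # 4 mm between the two test transducers
delay_samples = 50           # simulated propagation delay

t = np.arange(500)
tx = np.exp(-0.01 * (t - 100.0) ** 2)     # transmitted Gaussian pulse
rx = np.roll(tx, delay_samples)           # received pulse: delayed copy

# The cross-correlation peak gives the lag between the two signals.
lag = int(np.argmax(np.correlate(rx, tx, mode="full"))) - (len(tx) - 1)
velocity = distance / (lag / fs)          # estimated wave velocity, m/s
```

With these assumed numbers the recovered velocity is 8,000 m/s, within the solid-state acoustic range cited elsewhere in this disclosure.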
Transducers can be dedicated to performing test and characterization. Alternately, input or/and output transducers that are used during the regular operation of the AWRC device can be re-purposed temporarily to perform test and characterization.
As waves propagate in the cavity, some waves reach and impinge on the input and output transducers. If a transducer presents an acoustic impedance to impinging waves that is equal to the acoustic impedance of the cavity, then all impinging acoustic wave energy is converted to electrical energy by the transducer. Alternately, if a transducer presents an acoustic impedance that is not equal to the acoustic impedance of the cavity, then part of the impinging wave will reflect off the transducer.
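For normal incidence on a planar interface, the reflected amplitude fraction follows the standard impedance-mismatch formula; a sketch with illustrative impedance values (arbitrary units, not device data):

```python
def reflection_coefficient(z_cavity, z_transducer):
    """Amplitude fraction of a normally incident wave reflected at the
    cavity/transducer interface (standard impedance-mismatch formula)."""
    return (z_transducer - z_cavity) / (z_transducer + z_cavity)

# Illustrative impedance values (arbitrary units, not device data):
r_matched = reflection_coefficient(1.5e7, 1.5e7)      # matched: no reflection
r_mismatched = reflection_coefficient(1.5e7, 4.5e7)   # mismatched: partial
```

A matched transducer (reflection coefficient zero) absorbs all impinging energy; any mismatch sends a fraction of the wave back into the cavity.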
The acoustic impedance presented by an output transducer to the waves propagating in the cavity is set by the response of the transducer itself and by the impedance presented by the electronic sense/amplification circuit connected to the output transducer. Likewise, the acoustic impedance presented by an input transducer to waves propagating in the cavity is set by the response of the transducer itself and by the impedance presented by the electronic drive circuit connected to the input transducer. Therefore, it is possible to vary the acoustic impedance presented by transducers, and thereby the response of the reservoir.
It is possible to use the transducer-impedance-based reservoir tuning scheme described above to enable certain unsupervised learning tasks. When a wave impinges on a transducer whose acoustic impedance does not match that of the cavity, a portion of the wave is converted to electrical energy by the transducer, and a portion is reflected back into the cavity. This reflected wave can be thought of as back propagation of the error signal.
Since the transducer impedance is tunable, it is possible to tune the impedance of the transducers to achieve certain goals, much like selecting weights in a neural network. Specifically, the transducer impedance values can be adjusted until the impinging wave no longer reflects into the cavity, and all of the wave energy is converted out of the cavity.
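The analogy with selecting weights can be illustrated as a toy gradient-descent loop on a normalized scalar impedance. This is a deliberately simplified model for illustration, not an algorithm from this disclosure:

```python
# Toy tuning loop on a normalized scalar impedance (z_cavity = 1): descend the
# squared reflection coefficient until the transducer absorbs the wave fully.
# A deliberately simplified model, not a device algorithm.
def reflection(z, z_cavity=1.0):
    return (z - z_cavity) / (z + z_cavity)

z = 3.0                          # initial mismatched transducer impedance
for _ in range(200):
    r = reflection(z)
    grad = 2.0 * r * 2.0 / (z + 1.0) ** 2    # d(r^2)/dz for z_cavity = 1
    z -= 2.0 * grad                          # gradient step (rate chosen ad hoc)
```

The loop drives the reflection coefficient toward zero, i.e. toward the matched condition in which all impinging wave energy is converted out of the cavity.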
The tuning process can be automated further. Since the acoustic impedance of a transducer is determined in part by the impedance of the circuit that is connected to the transducer, then these transducer circuit impedances can act as training targets. The tuning process of the AWRC device consists in adjusting the feedback gains to match the transducer circuit impedances.
In the AWRC device depicted in
The reservoir and its suspension are built on a substrate. The substrate can contain electrical circuits. The reservoir and suspension may be isolated from the environment. This can be done using a device-level cap, or a wafer level cap can be made over the reservoir structure via wafer-to-wafer bonding. The reservoir need not be hermetically isolated from the environment; a cap that substantially protects it from external impurities and moisture is sufficient for most applications. If the substrate does not include circuits, the reservoir die can be co-packaged with a circuit die.
The size of MEMS-based reservoirs is limited by manufacturing and other constraints. One method to create a large reservoir is to connect several reservoirs, as depicted on
A network of reservoirs can be used in several ways. The network can accommodate many more ports than a single reservoir. The network can be designed to have a longer-duration memory or a more complex response than a single reservoir. The network can be configured to perform incremental learning. For example, if the network is learning a relationship between an input and an ultimate output that is a long sequence of convolutional operations, the relationship can be learned in discrete stages, using a discrete reservoir to learn each stage. According to such operation, a first reservoir learns a first relationship between the original input and a first intermediate output; a second reservoir then learns a relationship between the first intermediate output (which serves as the input of the second stage) and either the ultimate output (in a two-stage example) or a second intermediate output (if the learning requires more than two stages). Compound reservoirs can be suited for applications such as sequence-to-sequence learning, autoencoders, denoising autoencoders, stacked autoencoders, and adversarial reservoirs, without limitation.
Further, it is possible to implement committee-based learning with groups of AWRC devices. Ensembles or groups of AWRC devices can be trained on the same task, with different sets of random ports. At the outputs, a voting scheme selects the best set of parameters after training.
An AWRC device differs from other devices, both operationally and structurally, in many different ways.
For example, an AWRC device may differ significantly from existing implementations of random projection and neural network computational circuits.
First, AWRC devices have a physical reservoir, and therefore feature a local connectivity inside the reservoir. By contrast, traditional reservoir computing concepts (Echo State Networks and Liquid State Machines) have a digital reservoir, and therefore feature global node-to-node connectivity inside the reservoir. In addition, physical reservoirs are intrinsically lossy, whereas digital reservoirs can have gain greater than 1 between nodes, and nodes can form closed loops that have a divergent or oscillatory response. Therefore, AWRC devices are more likely to be numerically stable.
Second, AWRC devices perform computations passively by letting acoustic waves propagate in a cavity. As a result, the high-dimensional projection of input data is performed without power consumption. By contrast, circuit implementations of reservoir computing devices perform thousands of computations (additions and multiplications on binary data) explicitly at every time step of operation of the reservoir, and each computation dissipates some electrical power.
Third, AWRC devices use acoustic waves that propagate in solid-state elastic materials to compute and transport data from inputs to outputs. Such materials have an acoustic wave velocity that ranges from 1,000 meters per second to 15,000 meters per second. By contrast, in reservoir computing devices that use optical waves or electromagnetic waves, the wave velocity ranges from 75,000,000 meters per second (in a material such as germanium) to 300,000,000 meters per second (in vacuum), which is 5,000 to 300,000 times faster than the acoustic wave velocities cited above. As a result, reservoir computing devices that use optical waves or electromagnetic waves would be much larger than AWRC devices for applications that have equal input data rates.
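The size argument follows directly from the velocity ratio: the length of medium needed to hold a given signal duration "in flight" scales with wave velocity. A back-of-envelope sketch with illustrative numbers:

```python
# Back-of-envelope device-size comparison: the length of medium needed to hold
# a given signal duration "in flight" scales with wave velocity. Values are
# illustrative assumptions.
signal_duration = 1e-6           # 1 microsecond of data held in the reservoir
v_acoustic = 8000.0              # m/s, solid-state acoustic wave
v_em = 300_000_000.0             # m/s, electromagnetic wave in vacuum

len_acoustic = v_acoustic * signal_duration   # meters of acoustic medium needed
len_em = v_em * signal_duration               # meters of electromagnetic medium
ratio = len_em / len_acoustic
```

Under these assumptions, one microsecond of signal occupies 8 mm of acoustic medium but 300 m of electromagnetic path, a factor of 37,500 in linear size.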
Therefore, it is clear that AWRC devices may differ operationally from other devices in several ways.
As another example, AWRC devices are unique in their use of wave propagation in a reservoir to perform random projection computations. Known devices that most resemble AWRC devices in terms of structure are bulk acoustic wave (BAW) resonators and surface acoustic wave (SAW) filters. But even BAW resonators and SAW filters differ greatly from AWRC devices.
First, compare AWRC devices to BAW resonators. In terms of acoustic operation, BAW resonators function by confining a standing wave in a cavity that supports a single mode of resonance with very high quality factor. By contrast, AWRC devices function with traveling waves, in a cavity that supports tens to hundreds of modes, each with a moderate quality factor. In terms of construction, BAW resonators are manufactured to minimize the non-uniformity of the acoustic propagation medium, so as to minimize wave reflections, mode conversion and loss. By contrast, AWRC devices may be manufactured with deliberate non-uniformities in the acoustic propagation medium, so as to induce wave reflections and mode conversion. In terms of overall response, BAW resonators are designed and manufactured to have a very linear response. By contrast, AWRC devices are typically designed and manufactured to have a non-linear response.
Second, compare AWRC devices to SAW filters. In terms of frequency response, SAW filters have a frequency response in a relatively narrow range of frequencies. By contrast, AWRC devices have a multi-resonant frequency response over at least one decade in frequency. In terms of construction, SAW filters are manufactured to minimize the non-uniformity of the acoustic propagation medium, so as to minimize wave reflections, mode conversion and loss. By contrast, AWRC devices may be manufactured with deliberate non-uniformities in the acoustic propagation medium, so as to induce wave reflections and mode conversion. In terms of overall response, SAW filters are designed and manufactured to have a very linear response. By contrast, AWRC devices are typically designed and manufactured to have a non-linear response.
Therefore, it is clear that AWRC devices differ structurally from the other devices in several ways.
More generally, waves generated by two input transducers that are adjacent will interact with each other before they interact with waves generated by input transducers that are separated by a longer distance.
When the input data has a geometrical structure (e.g. the data source is an image, which is formed by a 2D array of pixels), the input transducers located on the cavity 121 have an identical geometrical structure (e.g. a 2D array structure), and the input data is applied to the input transducers with a mapping that preserves the geometrical structure of the input data (e.g. pixels located at opposite corners of the source image are applied to the cavity at opposite corners of the array of input transducers), then, as a consequence of the physics of elastic wave propagation in the cavity, pixels close to each other interact with each other before pixels that are further apart. This behavior is repeated at all scales, viz. 2×2, 3×3, 4×4, etc. This may be interpreted as the reservoir applying a convolutional kernel to the image. After a period of time, the transducers at output ports 107 sense the arriving waves 247 and convert them to output voltage time series 102.
From the foregoing and with reference to the various figure drawings, those skilled in the art will appreciate that certain modifications can also be made to the present disclosure without departing from the scope of the same. While several embodiments of the disclosure have been shown in the drawings, it is not intended that the disclosure be limited thereto, as it is intended that the disclosure be as broad in scope as the art will allow and that the specification be read likewise. Therefore, the above description should not be construed as limiting, but merely as exemplifications of particular embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the claims appended hereto.
The present application claims the benefit of the filing date of U.S. Provisional Application No. 62/520,167 filed Jun. 15, 2017, the disclosure of which is hereby incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2018/037009 | 6/12/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62520167 | Jun 2017 | US |