The presently disclosed subject matter relates to methods and systems for the encoding and reconstruction of signals encoded with time encoding machines (TEM), and more particularly to the reconstruction of signals encoded with TEMs with the use of recurrent neural networks.
Most signals in the natural world are analog, i.e., they cover a continuous range of amplitude values. However, most computer systems for processing these signals are binary digital systems. Synchronous analog-to-digital (A/D) converters can be used to capture analog signals and present a digital approximation of the input signal to a computer processor. That is, at certain moments in time synchronized to a system clock, the amplitude of the signal of interest is captured as a digital value. When sampling the amplitude of an analog signal, each bit in the digital representation of the signal represents an increment of voltage, which defines the resolution of the A/D converter. Analog-to-digital conversion is used in many applications, such as communications where a signal to be communicated can be converted from an analog signal, such as voice, to a digital signal prior to transport along a transmission line.
Applying traditional sampling theory, a band limited signal can be represented with a quantifiable error by sampling the analog signal at a sampling rate at or above what is commonly referred to as the Nyquist sampling rate. It is a trend in electronic circuit design to reduce the available operating voltage provided to integrated circuit devices. In this regard, power supply voltages for circuits are generally decreasing. While digital signals can be processed at the lower supply voltages, traditional synchronous sampling of the amplitude of a signal becomes more difficult as the available power supply voltage is reduced and each bit in the A/D or D/A converter reflects a substantially lower voltage increment.
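The effect of a shrinking supply voltage on amplitude resolution can be illustrated with a short calculation. The sketch below assumes an ideal N-bit converter whose full-scale range equals the supply voltage, so that one least significant bit spans the full-scale voltage divided by 2^N. The function name `lsb_volts` and the example voltages are illustrative assumptions and are not values from this disclosure.

```python
def lsb_volts(full_scale_volts: float, bits: int) -> float:
    """Voltage increment represented by one least significant bit of an
    ideal converter whose full-scale range is `full_scale_volts`."""
    return full_scale_volts / (2 ** bits)


# Hypothetical supply voltages for a 12-bit converter (assumed values).
for supply in (5.0, 3.3, 1.0):
    increment_uv = lsb_volts(supply, 12) * 1e6
    print(f"{supply:.1f} V supply -> one bit spans {increment_uv:.1f} microvolts")
```

As the assumed supply drops, each bit must resolve a proportionally smaller voltage step, which is the difficulty the paragraph above describes.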
Time Encoding Machines (TEMs) can encode analog information in the time domain using only asynchronous circuits. Representation in the time domain can be an alternative to the classical sampling representation in the amplitude domain. Applications for TEMs can be found in low power nano-sensors for analog-to-discrete (A/D) conversion as well as in modeling olfactory systems, vision and audition in neuroscience.
Methods and systems for reconstructing TEM-encoded signals using recurrent neural networks are disclosed herein.
According to some embodiments of the disclosed subject matter, methods for reconstructing a signal encoded with a time encoding machine (TEM) using a recurrent neural network include first receiving a TEM-encoded signal and processing it for input into a recurrent neural network. The original signal is then reconstructed using the recurrent neural network.
In one embodiment, the reconstruction process includes formulating the reconstruction into a variational problem having a solution equal to a summation of a series of functions multiplied by a series of coefficients. The coefficients can be obtained by solving an optimization problem. The optimization problem can be solved by a recurrent neural network whose architecture can be defined according to a particular differential equation. The original analog signal can then be reconstructed using the coefficients.
In another embodiment, the TEM-encoded signal can be a Video Time Encoding Machine (vTEM) encoded signal. The recurrent neural network used for reconstruction can have a plurality of inputs for a plurality of signals generated by the vTEM.
According to some embodiments of the disclosed subject matter, systems for reconstructing a signal encoded with a TEM using a recurrent neural network include at least one input for receiving a TEM-encoded signal. The input can then pass a signal along an arrangement of adders, integrators, multipliers, and/or piecewise linear activators arranged to reflect a recurrent neural network defined by a particular differential equation. The system can have a plurality of outputs for providing at least one coefficient representing a reconstructed signal.
In some embodiments, the recurrent neural network can include three layers. The first layer can consist of a plurality of multiply/add units. Each of the nodes in the first layer can be operatively connected to a second layer, which also includes a plurality of multiply/add units. The third layer, also consisting of a plurality of multiply/add units, can calculate a gradient weighted by a learning rate and can output the time derivative of the coefficients. The outputs can then be integrated and fed back into the first layer.
The presently disclosed subject matter provides techniques for encoding and decoding an analog signal into the time domain, also referred to as the spike domain. More particularly, the presently disclosed subject matter provides for the use of recurrent neural networks in encoding and decoding analog signals into and from the time domain and the encoding and decoding of visual stimuli with recurrent neural networks.
The present application makes use of time encoding, a time encoding machine (TEM) and a time decoding machine (TDM). Time encoding is a real-time asynchronous mechanism of mapping the amplitude of a bandlimited signal $u = u(t)$, $t \in \mathbb{R}$, into a strictly increasing time sequence $(t_k)$, $k \in \mathbb{Z}$, where $\mathbb{R}$ and $\mathbb{Z}$ denote the sets of real numbers and integers, respectively. It should be noted that throughout this specification any alternative rendering of the symbol $\mathbb{R}$ should be construed in the same manner, to mean the set of real numbers. A Time Encoding Machine (TEM) is the realization of an asynchronous time encoding mechanism. A Time Decoding Machine (TDM) is the realization of an algorithm for signal recovery. With increasing device speeds, TEMs are able to better leverage a temporal model of encoding a signal. The interest in temporal encoding in neuroscience is closely linked with the natural representation of sensory stimuli (signals) as a sequence of action potentials (spikes). Spikes can be discrete time events that carry information about stimuli.
Time Encoding Machines (TEMs) model the representation (encoding) of stimuli by sensory systems with neural circuits that communicate via spikes (action potentials). TEMs asynchronously encode time-varying analog stimuli into a multidimensional time sequence. TEMs can also be implemented in hardware. For example, Asynchronous Sigma-Delta Modulators (ASDMs), which have been shown to be an instance of TEMs, can be robustly implemented in low power analog VLSI. With ever decreasing voltages and increasing clock rates, high precision quantizers in the amplitude domain are more and more difficult to implement. Representing information in the time domain follows the miniaturization trends of nanotechnology and demonstrates its potential for next generation silicon based signal encoders.
Asynchronous Sigma/Delta modulators as well as FM modulators can encode information in the time domain as described in “Perfect Recovery and Sensitivity Analysis of Time Encoded Bandlimited Signals” by A. A. Lazar and L. T. Toth (IEEE Transactions on Circuits and Systems-I: Regular Papers, 51(10):2060-2073, October 2004), which is incorporated by reference. More general TEMs with multiplicative coupling, feedforward and feedback have also been characterized by A. A. Lazar in “Time Encoding Machines with Multiplicative Coupling, Feedback and Feedforward” (IEEE Transactions on Circuits and Systems II: Express Briefs, 53(8):672-676, August 2006), which is incorporated by reference. TEMs realized as single and as a population of integrate-and-fire neurons are described by A. A. Lazar in “Multichannel Time Encoding with Integrate-and-Fire Neurons” (Neurocomputing, 65-66:401-407, 2005) and “Information Representation with an Ensemble of Hodgkin-Huxley Neurons” (Neurocomputing, 70:1764-1771, June 2007), both of which are incorporated by reference. Single-input multiple-output (SIMO) TEMs are described in “Faithful Representation of Stimuli with a Population of Integrate-and-Fire Neurons” by A. A. Lazar and E. A. Pnevmatikakis (Neural Computation), which is incorporated by reference.
Disclosed herein are methods and systems of reconstructing a signal encoded with a time encoding machine (TEM) using a recurrent neural network. Examples will now be given showing exemplary embodiments of the disclosed subject matter. First, a method and system are provided for reconstruction of a TEM-encoded signal encoded with a single-input single-output TEM. For purposes of illustration and not limitation, a model of the signal and an overview of the encoding process are provided. One of ordinary skill will appreciate that other suitable models and encoding processes can be used in accordance with the subject matter disclosed herein. Next, the methods and systems for reconstruction of a single-input single-output TEM-encoded signal will be expanded to reconstruction of multi-dimensional signals, for example the reconstruction of space-time stimuli encoded with Video Time Encoding Machines (vTEMs). For purposes of illustration, the exemplary model of the single-input single-output signal is extended to the space-time vTEM encoded signal. An overview of vTEM encoding will also be provided. One of ordinary skill will appreciate that a variety of other suitable models and encoding processes can be used, and that the examples provided herein are not for purposes of limitation. For example, the presently disclosed subject matter can reconstruct olfactory or auditory stimuli as well as visual stimuli.
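In the case of a single-input single-output TEM that encodes time-varying signals, time-varying stimuli (signals) can be elements of a space of trigonometric polynomials. For example, in the Hilbert space $\mathcal{H}_t$, every element $u = u(t)$, $t \in \mathbb{R}$, is of the form
$$u(t) = \sum_{m_t=-M_t}^{M_t} c_{m_t}\, e_{m_t}(t),$$
with
$$e_{m_t}(t) = \exp\left(j\,\frac{m_t \Omega_t}{M_t}\, t\right)$$
an element of the basis spanning the space $\mathcal{H}_t$; $\Omega_t$ and $M_t$ are the bandwidth and the order of the space of trigonometric polynomials, respectively. Every element in this Hilbert space is periodic with period
$$S_t = \frac{2\pi M_t}{\Omega_t}.$$
Assuming that all signals are real, $c_{-m_t} = \overline{c_{m_t}}$, where the overline denotes complex conjugation.
The inner product in $\mathcal{H}_t$ can be defined in the usual way: $\forall u, v \in \mathcal{H}_t$,
$$\langle u, v \rangle = \frac{1}{S_t} \int_0^{S_t} u(t)\,\overline{v(t)}\,dt.$$
The space of trigonometric polynomials can be a finite dimensional Hilbert space, and therefore, a Reproducing Kernel Hilbert Space (RKHS), with reproducing kernel
$$K(t, s) = \sum_{m_t=-M_t}^{M_t} e_{m_t}(t - s),$$
with $t, s \in \mathbb{R}$.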
Modeling the set of stimuli in a Hilbert space can enable the use of geometry of the space to reduce stimulus encoding to projections on a set of functions.
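The reproducing property that makes this projection view possible can be checked numerically. The following is a minimal sketch under stated assumptions: basis functions of the form e_m(t) = exp(j m Ω t / M), an inner product normalized by the period S = 2πM/Ω, and a kernel K(t, s) equal to the sum of e_m(t − s) over all orders m. The numerical values of `OMEGA`, `M` and the coefficients are illustrative only.

```python
import cmath
import math

OMEGA = 2 * math.pi * 10      # bandwidth in rad/s (illustrative value)
M = 3                         # order of the space (illustrative value)
S = 2 * math.pi * M / OMEGA   # period shared by every element of the space


def e(m: int, t: float) -> complex:
    """Basis function e_m(t) = exp(j * m * OMEGA * t / M)."""
    return cmath.exp(1j * m * OMEGA * t / M)


def kernel(t: float, s: float) -> complex:
    """Candidate reproducing kernel K(t, s) = sum over m of e_m(t - s)."""
    return sum(e(m, t - s) for m in range(-M, M + 1))


# Coefficients with c_{-m} = conj(c_m), so that u is real valued.
coeffs = {0: 0.5, 1: 1 - 2j, 2: 0.3 + 0.1j, 3: -0.7j}
coeffs.update({-m: c.conjugate() for m, c in list(coeffs.items()) if m > 0})


def u(t: float) -> complex:
    """Trigonometric polynomial with the coefficients above."""
    return sum(c * e(m, t) for m, c in coeffs.items())


def inner(f, g, steps: int = 2000) -> complex:
    """<f, g> = (1/S) * integral over one period of f(t) * conj(g(t)) dt,
    approximated with a rectangle rule."""
    dt = S / steps
    return sum(f(i * dt) * g(i * dt).conjugate() for i in range(steps)) * dt / S


t0 = 0.0123
# Projecting u onto the kernel centered at t0 should return u(t0).
reproduced = inner(u, lambda s: kernel(s, t0))
```

Under these assumptions, `reproduced` agrees with `u(t0)` up to rounding error, which is the sense in which evaluating a stimulus reduces to an inner product.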
Encoding of a time-varying signal, for example, with a TEM, can consist of two cascaded modules. The signal can be passed through a temporal receptive field $D_T(t)$ before being fed into a neural circuit. The processing of the temporal receptive field can be modeled as filtering. For example, an operator ${}^{r}\mathcal{L}: \mathcal{H}_t \to \mathcal{H}_t$ can be defined such that
$$v(t) = ({}^{r}\mathcal{L}u)(t) = \int_{\mathbb{R}} D_T(t-s)\,u(s)\,ds = (D_T * u)(t).$$
The neural circuit can encode the output of the receptive field. The neural circuit can be realized with different neural models. For example, the neural circuit can be an Integrate-And-Fire (IAF) neuron, a Hodgkin-Huxley neuron, an Asynchronous Sigma-Delta Modulator (ASDM), or other suitable model. Spike times of a spike train at the output of the neural circuit can be modeled as (tk), k=0, 1, 2, . . . , n.
The operation of the neural circuit can be described by a bounded linear functional ${}^{T}\mathcal{L}_k: \mathcal{H}_t \to \mathbb{R}$. The explicit formula of this functional can be determined by the t-transform of the neuron, given by ${}^{T}\mathcal{L}_k u = q_k$ for $u \in \mathcal{H}_t$, where ${}^{T}\mathcal{L}_k$ and $q_k$ usually depend on $(t_k)$, $k = 0, 1, 2, \ldots, n$. For example, for an ideal IAF neuron, the t-transform given by $\int_{t_k}^{t_{k+1}} u(s)\,ds = \kappa\delta - b(t_{k+1} - t_k)$ leads to ${}^{T}\mathcal{L}_k u = \int_{t_k}^{t_{k+1}} u(s)\,ds$ and $q_k = \kappa\delta - b(t_{k+1} - t_k)$, where $\kappa$, $\delta$ and $b$ are, respectively, the integration constant, the threshold and the bias of the IAF neuron.
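The t-transform of the ideal IAF neuron can be illustrated with a small simulation. The following is a minimal sketch, assuming the neuron integrates (u(t) + b)/κ with a forward-Euler step and fires whenever the integral reaches the threshold δ. The function names, the step size and the parameter values are illustrative assumptions.

```python
import math


def iaf_encode(u, t_end, dt=1e-5, bias=2.0, kappa=1.0, delta=0.05):
    """Ideal integrate-and-fire neuron: integrate (u(t) + bias) / kappa and
    record a spike time each time the integral reaches the threshold delta."""
    spikes, v, t = [], 0.0, 0.0
    while t < t_end:
        v += dt * (u(t) + bias) / kappa   # forward-Euler integration step
        t += dt
        if v >= delta:
            spikes.append(t)
            v -= delta                    # reset, keeping the small overshoot
    return spikes


def t_transform_measurements(spikes, bias=2.0, kappa=1.0, delta=0.05):
    """q_k = kappa * delta - bias * (t_{k+1} - t_k) for consecutive spikes."""
    return [kappa * delta - bias * (t1 - t0)
            for t0, t1 in zip(spikes, spikes[1:])]


def u(t):
    """Illustrative input signal with |u(t)| smaller than the bias."""
    return math.sin(2 * math.pi * 5 * t)


spikes = iaf_encode(u, t_end=0.5)
q = t_transform_measurements(spikes)
```

Each `q[k]` computed from the spike times alone approximates the integral of the input between `spikes[k]` and `spikes[k + 1]`, so the spike train carries amplitude information without any amplitude sample being stored.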
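Combining the two cascaded modules together, bounded linear functionals $\mathcal{L}_k: \mathcal{H}_t \to \mathbb{R}$ can be defined as $\mathcal{L}_k = {}^{T}\mathcal{L}_k\,{}^{r}\mathcal{L}$ so that $\mathcal{L}_k u = {}^{T}\mathcal{L}_k\,{}^{r}\mathcal{L}u = q_k$. By the Riesz representation theorem, these functionals can be expressed in inner product form as $\mathcal{L}_k u = \langle u, \phi_k \rangle$ for all $u \in \mathcal{H}_t$, where, by the reproducing property, $\phi_k(t) = \langle \phi_k, K_t \rangle = \overline{\mathcal{L}_k K_t}$, with $K_t(s) = K(s, t)$.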
Because the inner products are merely projections of the time-varying stimulus onto the axes defined by the φk's, encoding can be interpreted as generalized sampling, and the qk's are measurements given by sampling the signal. Unlike traditional sampling, the sampling functions in time encoding are signal dependent.
In one aspect of the disclosed subject matter, signals encoded with a time encoding machine (TEM) can be reconstructed using a recurrent neural network. In some embodiments, the recurrent neural network can decode a TEM-encoded signal that has been encoded by a single-input single-output TEM as described above. Reference will now be made to particular embodiments of the disclosed subject matter for reconstructing a TEM-encoded signal with a recurrent neural network for purposes of illustration. However, one of ordinary skill will recognize that other suitable variations exist, and thus the following discussion is not intended to be limiting.
Reconstruction of a time-varying signal $u$ from a TEM-encoded signal can be formulated as a variational problem. In one embodiment, the reconstruction is formulated into the variational problem
$$\hat{u} = \underset{u \in \mathcal{H}_t}{\operatorname{argmin}} \left\{ \sum_{k=1}^{n} \left( \langle u, \phi_k \rangle - q_k \right)^2 + n\lambda \|u\|_{\mathcal{H}_t}^2 \right\},$$
where $\lambda$ is a smoothing parameter. By the Representer Theorem, the solution to this problem takes the form
$$\hat{u} = \sum_{k=1}^{n} c_k \phi_k.$$
Substituting the solution into the problem, the coefficients $c_k$ can be obtained by solving the unconstrained optimization problem minimize $\|Gc - q\|_2^2 + n\lambda c^T G c$, where $c = [c_1, c_2, \ldots, c_n]^T$, $q = [q_1, q_2, \ldots, q_n]^T$, $I$ is the $n \times n$ identity matrix and $G$ is a symmetric matrix with entries
$$[G]_{kl} = \langle \phi_k, \phi_l \rangle.$$
This minimization problem has an explicit analytical solution with $c$ the solution of the system of linear equations $G^T(G + n\lambda I)c = G^T q$. Therefore, reconstruction of a time-varying signal can be accomplished by solving a system of linear equations.
Particularly in the case where the matrix G is singular, this system of linear equations can be solved by taking the Moore-Penrose pseudo-inverse (hereinafter, “pseudo-inverse”). However, calculating the Moore-Penrose pseudo-inverse is typically computationally intensive. For example, one conventional algorithm for evaluating the pseudo-inverse is based on singular value decomposition (SVD), which is particularly computationally demanding. Recurrent neural networks can be used to efficiently solve optimization problems. These networks can have structures that can be easily implemented in analog VLSI.
In one embodiment, a recurrent neural network can be used to solve the system of linear equations. For example, using a general gradient approach for solving the unconstrained optimization problem minimize $\|Gc - q\|_2^2 + n\lambda c^T G c$, a set of differential equations can be considered:
$$\frac{dc}{dt} = -\mu(c, t)\,\nabla E(c),$$
with initial condition $c(0) = 0$, where $E(c) = \frac{1}{2}\left(\|Gc - q\|_2^2 + n\lambda c^T G c\right)$, and $\mu(c, t)$ is an $n \times n$ symmetric positive definite matrix that determines the speed of convergence and whose entries usually depend on the variables $c(t)$ and time $t$. These differential equations define the architecture of the recurrent neural network. It follows that $\nabla E(c) = G^T((G + n\lambda I)c - q)$. Because $E(c)$ is convex in $c$, the system of differential equations asymptotically approaches the unique solution of the regularized optimization problem minimize $\|Gc - q\|_2^2 + n\lambda c^T G c$. Consequently, the coefficients of the reconstruction can be taken as the steady state of the network, $c = \lim_{t \to \infty} c(t)$.
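The gradient dynamics can be simulated in software to see the coefficients settle. The following is a minimal sketch, assuming the convergence matrix μ is a scalar multiple of the identity and that the continuous-time equation dc/dt = −μ∇E(c) is integrated with a forward-Euler step. The matrix `G`, the vector `q` and all parameter values are made-up illustrations, and the helper names are hypothetical.

```python
def mat_vec(a, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def grad_E(G, q, c, n_lambda):
    """Gradient G^T((G + n*lambda*I)c - q) of the regularized cost E(c)."""
    residual = [gc + n_lambda * ci - qi
                for gc, ci, qi in zip(mat_vec(G, c), c, q)]
    return mat_vec(transpose(G), residual)


def gradient_flow(G, q, n_lambda, mu=0.05, dt=1.0, steps=20000):
    """Forward-Euler integration of dc/dt = -mu * grad E(c) with c(0) = 0.
    Here mu is a scalar gain (an assumption; a matrix gain is also possible)."""
    c = [0.0] * len(q)
    for _ in range(steps):
        g = grad_E(G, q, c, n_lambda)
        c = [ci - dt * mu * gi for ci, gi in zip(c, g)]
    return c


# Small symmetric positive definite example (illustrative values only).
G = [[2.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.5]]
q = [1.0, -0.5, 0.25]
n_lambda = 0.01
c = gradient_flow(G, q, n_lambda)
```

At the steady state the gradient vanishes, so `c` satisfies the linear system without any explicit matrix inversion. In an analog realization the Euler loop would be replaced by integrators operating in continuous time.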
The set of differential equations can be mapped into a recurrent neural network, for example as depicted in
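Alternatively, in another embodiment, the reconstruction of the TEM-encoded signal can be formulated as the spline interpolation problem
$$\hat{u} = \underset{u \in \mathcal{H}_t,\ \{\langle u, \phi_k \rangle = q_k\}_{k=1}^{n}}{\operatorname{argmin}} \|u\|_{\mathcal{H}_t}^2,$$
that seeks to minimize the norm as well as satisfy all the t-transform equations. This problem can also have a solution that takes the form
$$\hat{u} = \sum_{k=1}^{n} c_k \phi_k.$$
Substituting the solution into the interpolation problem, the vector of coefficients $c$ is the solution of the optimization problem
$$\text{minimize } \tfrac{1}{2} c^T G c \quad \text{subject to } Gc = q,$$
where $c = [c_1, c_2, \ldots, c_n]^T$, $q = [q_1, q_2, \ldots, q_n]^T$, and $G$ is a symmetric matrix with entries
$$[G]_{kl} = \langle \phi_k, \phi_l \rangle.$$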
Due to the RKHS property, G is a positive semidefinite matrix. Therefore, the optimization problem is a convex quadratic programming problem with equality constraints.
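The optimization problem can be reformulated as a standard quadratic programming problem. By setting $x = [x_+^T\ x_-^T]^T$ and imposing $x_+ \ge 0$, $x_- \ge 0$ such that $c = x_+ - x_-$, the convex programming problem
$$\text{minimize } \tfrac{1}{2} x^T Q x \quad \text{subject to } Ax = q,\ x \ge 0$$
is obtained, where
$$Q = \begin{bmatrix} G & -G \\ -G & G \end{bmatrix}, \qquad A = \begin{bmatrix} G & -G \end{bmatrix}.$$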
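The variable splitting can be checked on a small example. The following is a minimal sketch, assuming the block matrices implied by the substitution c = x₊ − x₋ are Q = [[G, −G], [−G, G]] and A = [G, −G]. The helper names and the numerical values are hypothetical.

```python
def split(c):
    """Write c = x_plus - x_minus with both parts nonnegative, stacked as x."""
    x_plus = [max(0.0, ci) for ci in c]
    x_minus = [max(0.0, -ci) for ci in c]
    return x_plus + x_minus


def build_qp(G):
    """Block matrices Q = [[G, -G], [-G, G]] and A = [G, -G]."""
    neg = [[-g for g in row] for row in G]
    top = [r + n for r, n in zip(G, neg)]
    bottom = [n + r for r, n in zip(G, neg)]
    return top + bottom, [row[:] for row in top]


def mat_vec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def quad_form(a, x):
    """x^T a x for a matrix given as a list of rows."""
    return sum(xi * yi for xi, yi in zip(x, mat_vec(a, x)))


G = [[2.0, 0.5], [0.5, 1.0]]   # illustrative symmetric matrix
c = [0.7, -1.2]                # illustrative coefficient vector
x = split(c)
Q, A = build_qp(G)
```

With these definitions `quad_form(Q, x)` equals `quad_form(G, c)` and `mat_vec(A, x)` equals `mat_vec(G, c)`, so the objective and the equality constraints are unchanged while every unknown becomes nonnegative.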
A recurrent network can be constructed that solves the convex programming problem, given by the differential equation
where $(x)^+ = [(x_1)^+, \ldots, (x_n)^+]^T$ and $(x_i)^+ = \max\{0, x_i\}$, $\alpha$ is a positive constant and $\beta > 0$ is the scaling constant. In one embodiment, this network can be a neural network depicted in
In various embodiments, the recurrent neural networks, including those described above, can be realized with adders, integrators, multipliers and piecewise linear activation functions. The recurrent neural networks disclosed herein can be highly parallel, and thus can solve large scale problems in real time when implemented in analog VLSI.
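Just as in the case of single-input single-output TEM-encoded signals, visual signals can be modeled as elements of the vector space of tri-variable trigonometric polynomials, denoted by $\mathcal{H}$. Each element $I \in \mathcal{H}$ is of the form
$$I(x, y, t) = \sum_{m_x=-M_x}^{M_x} \sum_{m_y=-M_y}^{M_y} \sum_{m_t=-M_t}^{M_t} c_{m_x, m_y, m_t}\, e_{m_x, m_y, m_t}(x, y, t),$$
where $c_{m_x, m_y, m_t}$ are the coefficients and the functions
$$e_{m_x, m_y, m_t}(x, y, t) = \exp\left(j\,\frac{m_x \Omega_x}{M_x}\, x + j\,\frac{m_y \Omega_y}{M_y}\, y + j\,\frac{m_t \Omega_t}{M_t}\, t\right),$$
$m_x = -M_x, \ldots, M_x$, $m_y = -M_y, \ldots, M_y$, $m_t = -M_t, \ldots, M_t$, constitute a basis of $\mathcal{H}$ and $(x, y, t) \in \mathbb{R}^3$. $(\Omega_x, \Omega_y, \Omega_t)$ and $(M_x, M_y, M_t)$ are, respectively, the bandwidth and the order of the trigonometric polynomials in each variable. An element $I \in \mathcal{H}$ is also, respectively, periodic in each variable with period
$$S_x = \frac{2\pi M_x}{\Omega_x}, \qquad S_y = \frac{2\pi M_y}{\Omega_y}, \qquad S_t = \frac{2\pi M_t}{\Omega_t}.$$
The inner product can be defined $\forall I_1, I_2 \in \mathcal{H}$ as
$$\langle I_1, I_2 \rangle = \frac{1}{S_x S_y S_t} \int_0^{S_x}\!\!\int_0^{S_y}\!\!\int_0^{S_t} I_1(x, y, t)\,\overline{I_2(x, y, t)}\,dt\,dy\,dx.$$
By defining the inner product as such, the space of trigonometric polynomials is a Hilbert space. Since $\mathcal{H}$ is finite dimensional it is also an RKHS with reproducing kernel
$$K(x, y, t; x', y', t') = \sum_{m_x=-M_x}^{M_x} \sum_{m_y=-M_y}^{M_y} \sum_{m_t=-M_t}^{M_t} e_{m_x, m_y, m_t}(x - x', y - y', t - t').$$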
Video Time Encoding Machines (vTEMs) encode space-time signals into the spike domain.
The receptive fields can be considered as spatio-temporal linear filters that preprocess the visual signals and feed them into the IAF neurons. The operation of the jth visual receptive field $D^j(x, y, t)$ can be given by the operator ${}^{S}\mathcal{L}^j: \mathcal{H} \to \mathcal{H}_t$ defined by
$$({}^{S}\mathcal{L}^j I)(t) = \int_{\mathbb{R}} \left( \iint_{\mathbb{R}^2} D^j(x, y, s)\, I(x, y, t - s)\,dx\,dy \right) ds,$$
where $\mathcal{H}_t$ denotes the univariable trigonometric polynomial space with bandwidth $\Omega_t$ and order $M_t$. The operator maps a 3-D space into a 1-D space.
A simplified case of the visual receptive field, where the field is spatio-temporally separable, can be considered. In this case, the receptive fields can be separated into a spatial receptive field $D_S^j(x, y)$ and a temporal receptive field $D_T^j(t)$ such that $D^j(x, y, t) = D_S^j(x, y)\,D_T^j(t)$. The spatial receptive fields can be, for example, Gabor receptive fields or Difference of Gaussian receptive fields. Each spatial receptive field can be derived from a mother wavelet. For example, given the mother wavelet $\gamma(x, y)$, the set of all receptive fields can be obtained by performing the following three operations or their combinations: Dilation $D_\alpha$, $\alpha \in \mathbb{R}_+$: $D_\alpha \gamma(x, y) = |\alpha|^{-1}\,\gamma(\alpha^{-1}x, \alpha^{-1}y)$,
Rotation $R_\theta$, $\theta \in [0, 2\pi)$: $R_\theta \gamma(x, y) = \gamma(x\cos\theta + y\sin\theta, -x\sin\theta + y\cos\theta)$, and Translation $T_{(x_0, y_0)}$, $(x_0, y_0) \in \mathbb{R}^2$: $T_{(x_0, y_0)} \gamma(x, y) = \gamma(x - x_0, y - y_0)$.
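A bank of spatial receptive fields can be generated by composing the three operations. The following is a minimal sketch: the rotation follows the formula given above, while the dilation normalization (division by |α|), the translation form γ(x − x₀, y − y₀) and the Gabor-like mother wavelet are assumptions chosen for illustration. All function names are hypothetical.

```python
import math


def mother_wavelet(x, y):
    """Illustrative Gabor-like mother wavelet (an assumed example)."""
    return math.exp(-(x * x + y * y) / 2.0) * math.cos(2.5 * x)


def dilate(gamma, alpha):
    """Assumed normalization: (D_a gamma)(x, y) = gamma(x / a, y / a) / |a|."""
    return lambda x, y: gamma(x / alpha, y / alpha) / abs(alpha)


def rotate(gamma, theta):
    """(R_theta gamma)(x, y) = gamma(x cos + y sin, -x sin + y cos)."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda x, y: gamma(x * c + y * s, -x * s + y * c)


def translate(gamma, x0, y0):
    """Assumed form: (T gamma)(x, y) = gamma(x - x0, y - y0)."""
    return lambda x, y: gamma(x - x0, y - y0)


def receptive_field(alpha, theta, x0, y0):
    """Rotate the mother wavelet, then dilate it, then translate it."""
    return translate(dilate(rotate(mother_wavelet, theta), alpha), x0, y0)


# Two scales, four orientations and two centers give sixteen receptive fields.
bank = [receptive_field(a, k * math.pi / 4, x0, 0.0)
        for a in (1.0, 2.0) for k in range(4) for x0 in (-1.0, 1.0)]
```

Because each operation returns a new function of (x, y), any combination of scale, orientation and position can be produced from the single mother wavelet, which is how a large population of receptive fields can be described with few parameters.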
Each output of the visual receptive fields can then be fed into a neural circuit. The output of the jth neural circuit can be denoted $(t_k^j)$, $k = 1, 2, \ldots, n_j$, and the operation of the neural circuit can be described by a bounded linear functional ${}^{T}\mathcal{L}_k^j: \mathcal{H}_t \to \mathbb{R}$, where ${}^{T}\mathcal{L}_k^j u = q_k^j$ for $u \in \mathcal{H}_t$.
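Combining the two cascaded modules together and assuming a total of N visual receptive fields, the jth of which is connected to the jth neural circuit that generates one spike train $(t_k^j)$, $k = 1, 2, \ldots, n_j$, $j = 1, 2, \ldots, N$, bounded linear functionals $\mathcal{L}_k^j: \mathcal{H} \to \mathbb{R}$ can be defined as $\mathcal{L}_k^j = {}^{T}\mathcal{L}_k^j\,{}^{S}\mathcal{L}^j$ so that $\mathcal{L}_k^j I = {}^{T}\mathcal{L}_k^j\,{}^{S}\mathcal{L}^j I = \langle I, \phi_k^j \rangle = q_k^j$, where $\phi_k^j(x, y, t) = \langle \phi_k^j, K_{x,y,t} \rangle = \overline{\mathcal{L}_k^j K_{x,y,t}}$, with $K_{x,y,t}(x', y', t') = K(x', y', t'; x, y, t)$.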
In one aspect of the disclosed subject matter, signals encoded with a video time encoding machine (vTEM) can be reconstructed using a recurrent neural network. For purposes of illustration, reference will now be made to particular embodiments of the disclosed subject matter for reconstructing a vTEM-encoded signal with a recurrent neural network. However, one of ordinary skill will recognize that other suitable variations exist, and thus the following discussion is not intended to be limiting.
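As in the case of single-input single-output TEMs, the output spike trains of vTEMs can be used for reconstruction of the video signal. In one embodiment, the notion of the single-input single-output case can be adapted and reconstruction can again be formulated as a variational problem of the form
$$\hat{I} = \underset{I \in \mathcal{H}}{\operatorname{argmin}} \left\{ \sum_{j=1}^{N} \sum_{k=1}^{n_j} \left( \langle I, \phi_k^j \rangle - q_k^j \right)^2 + n\lambda \|I\|_{\mathcal{H}}^2 \right\},$$
where $\lambda$ is the smoothing parameter and $n = \sum_{j=1}^{N} n_j$ is the total number of spikes. The solution to the variational problem can take the form
$$\hat{I} = \sum_{j=1}^{N} \sum_{k=1}^{n_j} c_k^j \phi_k^j,$$
where $c = [c_1^1, c_2^1, \ldots, c_{n_1}^1, c_1^2, \ldots, c_{n_N}^N]^T$ satisfies the system of linear equations $G^T(G + n\lambda I)c = G^T q$ with $q = [q_1^1, q_2^1, \ldots, q_{n_1}^1, q_1^2, \ldots, q_{n_N}^N]^T$, $I$ the $n \times n$ identity matrix, and $G$ the block matrix
$$G = \begin{bmatrix} G^{11} & \cdots & G^{1N} \\ \vdots & \ddots & \vdots \\ G^{N1} & \cdots & G^{NN} \end{bmatrix}$$
with entries of each block given by $[G^{ij}]_{kl} = \langle \phi_k^i, \phi_l^j \rangle$. Thus, again the reconstruction problem reduces to solving a system of linear equations.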
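In one embodiment, a recurrent neural network can be defined, similar to the case of a single-input single-output TEM-encoded signal, by
$$\frac{dc}{dt} = -\mu\,\nabla E(c),$$
with $c = [c_1^1, c_2^1, \ldots, c_{n_1}^1, c_1^2, \ldots, c_{n_N}^N]^T$, $\nabla E(c) = G^T((G + n\lambda I)c - q)$, $q = [q_1^1, q_2^1, \ldots, q_{n_1}^1, q_1^2, \ldots, q_{n_N}^N]^T$, and $G$ the block matrix
$$G = \begin{bmatrix} G^{11} & \cdots & G^{1N} \\ \vdots & \ddots & \vdots \\ G^{N1} & \cdots & G^{NN} \end{bmatrix}$$
with entries of each block given by $[G^{ij}]_{kl} = \langle \phi_k^i, \phi_l^j \rangle$.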
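In another embodiment, and again expanding the single-input single-output TEM case, the reconstruction can be formed as a spline interpolation problem
$$\hat{I} = \underset{I \in \mathcal{H},\ \{\langle I, \phi_k^j \rangle = q_k^j\}}{\operatorname{argmin}} \|I\|_{\mathcal{H}}^2.$$
The solution can take the form
$$\hat{I} = \sum_{j=1}^{N} \sum_{k=1}^{n_j} c_k^j \phi_k^j,$$
and vector $c$ is the solution to the optimization problem
$$\text{minimize } \tfrac{1}{2} c^T G c \quad \text{subject to } Gc = q,$$
where $c = [c_1^1, c_2^1, \ldots, c_{n_1}^1, c_1^2, \ldots, c_{n_N}^N]^T$, $q = [q_1^1, q_2^1, \ldots, q_{n_1}^1, q_1^2, \ldots, q_{n_N}^N]^T$, and $G$ is the block matrix
$$G = \begin{bmatrix} G^{11} & \cdots & G^{1N} \\ \vdots & \ddots & \vdots \\ G^{N1} & \cdots & G^{NN} \end{bmatrix}$$
with entries of each block given by $[G^{ij}]_{kl} = \langle \phi_k^i, \phi_l^j \rangle$.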
In one embodiment, a recurrent neural network can be defined, similar to the case of a single-input single-output TEM-encoded signal, by
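with $c = [c_1^1, c_2^1, \ldots, c_{n_1}^1, c_1^2, \ldots, c_{n_N}^N]^T$, $q = [q_1^1, q_2^1, \ldots, q_{n_1}^1, q_1^2, \ldots, q_{n_N}^N]^T$, and $G$ the block matrix
$$G = \begin{bmatrix} G^{11} & \cdots & G^{1N} \\ \vdots & \ddots & \vdots \\ G^{N1} & \cdots & G^{NN} \end{bmatrix}$$
with entries of each block given by $[G^{ij}]_{kl} = \langle \phi_k^i, \phi_l^j \rangle$.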
A schematic diagram of an exemplary embodiment of the disclosed subject matter is depicted in
In some embodiments of the presently disclosed subject matter, the vTEM encoded signal can be encoded with neurons with a random threshold. Reconstruction of signals encoded with neurons with a random threshold can again be formulated as a variational approach, for example, by considering the reconstruction as the solution to a smoothing spline problem.
In one aspect of the disclosed subject matter, a system for reconstructing a TEM-encoded signal using a recurrent neural network comprises at least one input for receiving a TEM-encoded signal. Adders, integrators, multipliers, and/or piecewise linear activators can be arranged suitably such that the neural network is a map of at least one differential equation. The signal can be input into the at least one input, processed by the neural network, and output through at least one output. The neural network can also have feedback, such that the outer layer sends a signal back to the first layer of the network. The outputs can be optionally integrated or otherwise processed.
In one embodiment, the system for reconstructing a TEM-encoded signal can comprise a GPU cluster. For example, a GPU's intrinsically parallel architecture can be exploited to realize a recurrent neural network. Multiple GPUs can be used for signal reconstruction, the hosts of which can be connected using a switch fabric, with peer-to-peer communication accomplished, for example, through the Message Passing Interface (MPI).
In another aspect of the disclosed subject matter, vTEM encoded signals can be first divided into smaller volumes. The volumes can then be reconstructed and finally the reconstructed volumes can be stitched together. For example, in one embodiment, a space-time video sequence can be divided into fixed sized, overlapping volume segments.
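The divide, reconstruct and stitch procedure can be sketched along a single axis; the same index arithmetic would apply independently to each of x, y and t. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the last segment is shifted back so that it ends at the boundary, and overlapping samples are blended by plain averaging, which is an assumed rule that the text does not specify.

```python
def segment_bounds(length, size, overlap):
    """Start/stop indices of fixed-size segments that overlap by `overlap`
    samples; the last segment is shifted back so that it ends at `length`."""
    if not 0 <= overlap < size <= length:
        raise ValueError("need 0 <= overlap < size <= length")
    step = size - overlap
    starts = list(range(0, length - size + 1, step))
    if starts[-1] + size < length:
        starts.append(length - size)
    return [(s, s + size) for s in starts]


def stitch(segments, bounds, length):
    """Recombine reconstructed segments, averaging where they overlap
    (averaging is an assumed blending rule)."""
    total, count = [0.0] * length, [0] * length
    for seg, (start, stop) in zip(segments, bounds):
        for i in range(start, stop):
            total[i] += seg[i - start]
            count[i] += 1
    return [t / c for t, c in zip(total, count)]


frames = [float(i) for i in range(10)]       # stand-in for one axis of a video
bounds = segment_bounds(len(frames), size=4, overlap=2)
pieces = [frames[a:b] for a, b in bounds]    # placeholder for reconstruction
restored = stitch(pieces, bounds, len(frames))
```

In this sketch the "reconstruction" of each piece is simply a copy, so `restored` equals `frames`. In practice each segment would be decoded independently, which keeps the size of each optimization problem fixed regardless of the length of the video.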
The disclosed subject matter and methods can be implemented in software stored on computer readable storage media, such as a hard disk, flash disk, magnetic tape, optical disk, network drive, or other computer readable medium. The software can be executed by a processor capable of reading the stored software and carrying out the instructions therein.
The foregoing merely illustrates the principles of the disclosed subject matter. Various modifications and alterations to the described embodiments will be apparent to those skilled in the art in view of the teaching herein. It will thus be appreciated that those skilled in the art will be able to devise numerous techniques which, although not explicitly described herein, embody the principles of the disclosed subject matter and are thus within the spirit and scope of the disclosed subject matter.
This application is a continuation of International Application Serial No. PCT/US2012/024413, filed Feb. 9, 2012, which claims priority to U.S. Provisional Application Ser. No. 61/441,203, filed Feb. 9, 2011, each of which is hereby incorporated by reference in its entirety.
This invention was made with government support under grants FA9550-01-1-0350, awarded by USAR/AFOSR, CNS 0855217, awarded by the National Science Foundation, and CNS 0958379, awarded by the National Science Foundation. The government has certain rights in the invention.
Number | Date | Country
---|---|---
61441203 | Feb 2011 | US

Relation | Number | Date | Country
---|---|---|---
Parent | PCT/US2012/024413 | Feb 2012 | US
Child | 13948615 | | US