System and method for nonlinear signal enhancement that bypasses a noisy phase of a signal

Information

  • Patent Grant
  • Patent Number
    7,392,181
  • Date Filed
    Wednesday, March 2, 2005
  • Date Issued
    Tuesday, June 24, 2008
Abstract
A system and method for nonlinear signal enhancement is provided. The method comprises: performing a linear transformation on a measured signal comprising a source component and a noise component; determining a modulus of the linear transformed signal; estimating a noise-free part of the linear transformed signal; and reconstructing the source component of the measured signal using the noise-free part of the linear transformed signal.
Description
BACKGROUND OF THE INVENTION

1. Technical Field


The present invention relates to signal enhancement, and more particularly, to a system and method for nonlinear signal enhancement that does not use a noisy phase of a signal.


2. Discussion of the Related Art


Current signal enhancement techniques are directed to suppressing noise and improving the perceptual quality of a signal of interest. For example, by using signal enhancement algorithms, signal enhancement techniques can remove unwanted noise and interference found in speech and other audio signals while minimizing degradation to the signal of interest. Similarly, image enhancement techniques aim to improve the quality of a picture for viewing. In both cases, however, there is room for improvement due to the random nature of noise and the inherent complexities involved in speech and signal recognition.


Current signal enhancement techniques follow the approach shown, for example, in FIG. 1. As shown in FIG. 1, a linear transformation such as a Fourier transform is applied to a noisy signal giving a representation of the signal in the transformed domain (110). The modulus or absolute value of the transformed signal is then determined (120) and a statistical estimate of a noise free part of the signal is computed (130). As the statistical estimate is being computed, the phase of the transformed signal is found (140). The product of the statistical estimate and the phase of the transformed signal is then determined (150) and an inverse linear transform is applied to the product to invert the product back into its original domain (160), thus resulting in a cleaned version of the signal.


Such algorithms have been shown to yield significant improvements for large classes of signals. However, recent psychoacoustic studies have shown that signal quality is very dependent on phase estimation. For example, if one takes a speech signal, performs a Linear Predictive Coding analysis and uses random white noise for excitation, the reconstructed signal in the time domain sounds very machine-made. Yet, if one uses custom excitation signals, the signal quality improves dramatically; however, this technique requires an estimate of the signal phase.


In order to enhance signal quality, information loss that results when taking the modulus of a signal has been considered. For example, in optics-based applications, a discrete signal may be reconstructed from the modulus of its Fourier transform under constraints in both the original and Fourier domain. For finite signals, the approach uses the Fourier transform with redundancy, and all signals having the same modulus of the Fourier transform satisfy a polynomial factorization. Thus, in one dimension, this factorization has an exponential number of possible solutions and in higher dimensions, the factorization is shown to have a unique solution.


Accordingly, there is a need for a technique of accurately reconstructing a signal without using its noisy phase or estimation and that takes into account information loss of the modulus of the signal.


SUMMARY OF THE INVENTION

The present invention overcomes the foregoing and other problems encountered in the known teachings by providing a system and method for nonlinear signal enhancement that bypasses a noisy phase of a signal.


In one embodiment of the present invention, a method for nonlinear signal enhancement, comprises: performing a linear transformation on a measured signal comprising a source component and a noise component; determining a modulus of the linear transformed signal; estimating a noise-free part of the linear transformed signal; and reconstructing the source component of the measured signal using the noise-free part of the linear transformed signal.


The step of reconstructing the source component of the measured signal, comprises: performing a nonlinear transformation on the noise-free part of the linear transformed signal; determining a sign of the source component of the measured signal; determining a product of the nonlinear transformed signal and the sign; and performing an overlap-add procedure using the product of the nonlinear transformed signal and the sign.


The linear transformation is one of a Fourier transform and a wavelet transform. The noise-free part of the linear transformed signal is estimated using one of a Wiener filtering technique and an Ephraim-Malah estimation technique.


The noise-free part of the linear transformed signal is estimated by solving:

Y(k,ω)=√(|X(k,ω)|²−R_n(k,ω)) if |X(k,ω)|²≧R_n(k,ω), and Y(k,ω)=0 otherwise,

where

R_n(k,ω)=min_{k−W≦k′<k} R_x(k′,ω),

and

R_x(k,ω)=(1−β)R_x(k−1,ω)+β|X(k,ω)|².










The step of reconstructing the source component of the measured signal comprises: defining a three layer neural network by:

q_k=σ(Σ_{f=1}^{F} a_kf Z_f+θ_k), 1≦k≦L, and

z_m=σ(Σ_{k=1}^{L} b_mk q_k+τ_m), 1≦m≦M;

performing a nonlinear transformation on the noise-free part of the linear transformed signal by solving:

u_m=z_m √(Y_1²+ . . . +Y_F²)/√(z_1²+ . . . +z_M²);

determining a sign of the source component of the measured signal by solving:

ρ=+1 if Σ_{k=1}^{M}|x_k−u_k|²≦Σ_{k=1}^{M}|x_k+u_k|², and ρ=−1 otherwise;
determining a product of the nonlinear transformed signal and the sign; and performing an overlap-add procedure using the product of the nonlinear transformed signal and the sign.


The method further comprises iterating:

π^{t+1}=π^t−α(∂/∂π)Σ_{m=1}^{M}|u_m−s_m|²,
until π converges, wherein π=(A, B, θ, τ). The noise-free part of the linear transformed signal is estimated by solving:

min_{0≦α_k<2π, 2≦k≦F} Σ_{k=1}^{F}|Y_k−(TU(Ỹ))_k|², Ỹ=(e^{jα_k}Y_k)_{1≦k≦F}, α_1=0;

and the step of reconstructing the source component of the measured signal comprises: performing a nonlinear transformation on the noise-free part of the linear transformed signal by solving:

z=U(Y^0), Y_k^0=e^{jα_k^0}Y_k;

determining a sign of the source component of the measured signal by solving:

ρ=+1 if Σ_{k=1}^{M}|x_k−u_k|²≦Σ_{k=1}^{M}|x_k+u_k|², and ρ=−1 otherwise;

determining a product of the nonlinear transformed signal and the sign; and performing an overlap-add procedure using the product of the nonlinear transformed signal and the sign.


The step of reconstructing the source component of the measured signal comprises: (i) setting k=0, Y^0=Y; (ii) computing z^k=UY^k; (iii) computing W=Tz^k; (iv) computing Y^{k+1} using:

Y^{k+1}(n)=Y(n)W(n)/|W(n)|, n=1, 2, . . . , F,

wherein if ∥Y^k−Y^{k+1}∥>ε: incrementing k=k+1 and repeating steps (ii-iv); and estimating the source component of the measured signal using z^k. The method further comprises outputting the reconstructed source component of the measured signal.


In another embodiment of the present invention, a system for nonlinear signal enhancement, comprises: a memory device for storing a program; a processor in communication with the memory device, the processor operative with the program to: perform a linear transformation on a measured signal comprising a source component and a noise component; determine a modulus of the linear transformed signal; estimate a noise-free part of the linear transformed signal; and reconstruct the source component of the measured signal using the noise-free part of the linear transformed signal.


When the source component of the measured signal is reconstructed the processor is further operative with the program code to: perform a nonlinear transformation on the noise-free part of the linear transformed signal; determine a sign of the source component of the measured signal; determine a product of the nonlinear transformed signal and the sign; and perform an overlap-add procedure using the product of the nonlinear transformed signal and the sign. The measured signal is received using one of a microphone and a database comprising one of audio signals and image signals.


When the source component of the measured signal is reconstructed the processor is further operative with the program code to: define a three layer neural network by:

q_k=σ(Σ_{f=1}^{F} a_kf Z_f+θ_k), 1≦k≦L, and

z_m=σ(Σ_{k=1}^{L} b_mk q_k+τ_m), 1≦m≦M;

perform a nonlinear transformation on the noise-free part of the linear transformed signal by solving:

u_m=z_m √(Y_1²+ . . . +Y_F²)/√(z_1²+ . . . +z_M²);
determine a sign of the source component of the measured signal by solving:

ρ=+1 if Σ_{k=1}^{M}|x_k−u_k|²≦Σ_{k=1}^{M}|x_k+u_k|², and ρ=−1 otherwise;

determine a product of the nonlinear transformed signal and the sign; and perform an overlap-add procedure using the product of the nonlinear transformed signal and the sign.


The noise-free part of the linear transformed signal is estimated by solving:

min_{0≦α_k<2π, 2≦k≦F} Σ_{k=1}^{F}|Y_k−(TU(Ỹ))_k|², Ỹ=(e^{jα_k}Y_k)_{1≦k≦F}, α_1=0;

and when the source component of the measured signal is reconstructed the processor is further operative with the program code to: perform a nonlinear transformation on the noise-free part of the linear transformed signal by solving:

z=U(Y^0), Y_k^0=e^{jα_k^0}Y_k;

determine a sign of the source component of the measured signal by solving:

ρ=+1 if Σ_{k=1}^{M}|x_k−u_k|²≦Σ_{k=1}^{M}|x_k+u_k|², and ρ=−1 otherwise;

determine a product of the nonlinear transformed signal and the sign; and perform an overlap-add procedure using the product of the nonlinear transformed signal and the sign.


When the source component of the measured signal is reconstructed the processor is further operative with the program code to: (i) set k=0, Y^0=Y; (ii) compute z^k=UY^k; (iii) compute W=Tz^k; (iv) compute Y^{k+1} using:

Y^{k+1}(n)=Y(n)W(n)/|W(n)|, n=1, 2, . . . , F,

wherein if ∥Y^k−Y^{k+1}∥>ε: increment k=k+1 and repeat steps (ii-iv); and estimate the source component of the measured signal using z^k.


The processor is further operative with the program code to output the reconstructed source component of the measured signal. The reconstructed source component of the measured signal is output to one of a loudspeaker and an automatic speech recognition system.


In yet another embodiment of the present invention, a method for nonlinear signal enhancement, comprises: receiving a signal comprising a source component and a noise component; performing a linear transformation on the received signal; determining an absolute value of the linear transformed signal; estimating a noise-free part of the linear transformed signal; performing a nonlinear transformation on the noise-free part of the linear transformed signal; determining a sign of the source component of the received signal; determining a product of the nonlinear transformed signal and the sign; and performing an overlap-add procedure on the product of the nonlinear transformed signal and the sign to form a reconstructed signal of the source component of the received signal, wherein the reconstructed signal does not comprise the noise component of the received signal; and outputting the reconstructed signal. The received signal is one of a speech signal and an image signal.


The foregoing features are of representative embodiments and are presented to assist in understanding the invention. It should be understood that they are not intended to be considered limitations on the invention as defined by the claims, or limitations on equivalents to the claims. Therefore, this summary of features should not be considered dispositive in determining equivalents. Additional features of the invention will become apparent in the following description, from the drawings and from the claims.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a flowchart illustrating a conventional signal enhancement method;



FIG. 2 is a block diagram of a computer system for use with an exemplary embodiment of the present invention; and



FIG. 3 is a flowchart illustrating a method for nonlinear signal enhancement that bypasses a noisy phase of a signal according to an exemplary embodiment of the present invention.





DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS


FIG. 2 is a block diagram of a computer system 200 for use with an exemplary embodiment of the present invention. As shown in FIG. 2, the system 200 includes, inter alia, a personal computer (PC) 210 connected to an input device 270 and an output device 280. The PC 210, which may be a portable or laptop computer or a personal digital assistant (PDA), includes a central processing unit (CPU) 220 and a memory 230. The CPU 220 includes a nonlinear signal enhancement module 260 that includes one or more methods for performing nonlinear signal enhancement that does not use a noisy phase or estimate of a signal.


The memory 230 includes a random access memory (RAM) 240 and a read only memory (ROM) 250. The memory 230 can also include a database, disk drive, tape drive or a combination thereof. The RAM 240 functions as a data memory that stores data used during execution of a program in the CPU 220 and is used as a work area. The ROM 250 functions as a program memory for storing a program executed in the CPU 220. The input device 270 is constituted by a keyboard, mouse, microphone or an array of microphones and the output device 280 is constituted by a liquid crystal display (LCD), cathode ray tube (CRT) display, printer or loudspeaker.


Before describing the method of nonlinear signal enhancement according to an exemplary embodiment of the present invention, its derivation will be discussed.


1. Initial Considerations


In formulating the method of nonlinear signal enhancement, an additive model given by equation (1) is first considered.

x(t)=s(t)+n(t), 0≦t≦T  (1)


As shown in equation (1), x(t) is a measured signal, s(t) is an unknown source signal and n(t) is a noise signal; all signals are considered at a time t. The signal (x(t))0≦t≦T is “vectorized” into sequence vectors (x(k))0≦k≦K, where each x(k) is an M-vector x(k)_n=x(kB+n), where 0≦n≦M−1, and B is a time step, which is roughly BK=T. A window g of a size M is applied to the measured signal followed by the Fourier transform of equation (2):

X(k,ω)=(1/F)Σ_{n=0}^{M−1} e^{−2πjωn/F} x(k)_n g(n), 0≦ω≦F−1  (2)
where the window g defines a linear operator T:R^M→C^F. When F>M, equation (2) may be referred to as redundant by oversampling in the frequency domain, while the overlap fraction M/B represents redundancy by oversampling in the time domain. This redundancy of equation (2) is then considered.


For example, the inversion of the transformation of equation (2), which is a linear transform, is implemented using an “overlap-add” procedure shown below in equation (3):

z(t)=(1/F)Σ_k Σ_{ω=0}^{F−1} e^{2πjωt/F} Z(k,ω) g̃(t−kB)  (3)
where k ranges over the set of integers so that 0≦t−kB≦M−1, and a “dual” window g̃, which depends on the parameters M and B, is computed to give a perfect reconstruction z=x when Z=X. The inversion of equation (3) defines a linear operator U:C^F→R^M and the perfect reconstruction condition reads UT=I, where I is the identity matrix of a size M.
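The analysis/overlap-add pair of equations (2) and (3) can be sketched in a few lines of numpy. This is a minimal sketch under stated assumptions, not the patent's exact operators: the 1/F normalization is dropped (fft/ifft already form an exact pair), and the dual window is realized by dividing by the sum of squared, shifted analysis windows, a standard weighted overlap-add construction that enforces UT=I.

```python
import numpy as np

def analysis(x, g, B, F):
    # windowed DFT of each hop of size B, in the spirit of equation (2)
    M = len(g)
    K = (len(x) - M) // B + 1
    return np.stack([np.fft.fft(x[k * B:k * B + M] * g, n=F) for k in range(K)])

def synthesis(X, g, B, length):
    # overlap-add as in equation (3); dividing by the window-overlap sum
    # realizes the "dual" window so that synthesis(analysis(x)) == x
    K, F = X.shape
    M = len(g)
    z = np.zeros(length)
    denom = np.zeros(length)
    for k in range(K):
        frame = np.real(np.fft.ifft(X[k]))[:M]   # recovers x*g on this frame
        z[k * B:k * B + M] += frame * g
        denom[k * B:k * B + M] += g ** 2
    return z / np.maximum(denom, 1e-12)

M, B, F = 64, 16, 128                 # F > M: redundant (oversampled) transform
g = np.hamming(M)                     # window nonzero at the frame edges
x = np.random.default_rng(0).normal(size=M + 9 * B)
z = synthesis(analysis(x, g, B, F), g, B, len(x))
assert np.allclose(z, x)              # perfect reconstruction: UT = I
```

The Hamming window is chosen only because it is nonzero at the frame edges, which keeps the overlap sum bounded away from zero everywhere.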


Conventional noise reduction algorithms use equation (2) as an analysis operator to represent the measured signal in the time-frequency domain. In the time-frequency domain, the estimation procedure implements a nonlinear estimator E(.) of the type shown below in equation (4):

Y(k,ω)=E(|X(k,ω)|)  (4)

followed by the inversion of equation (3) with,

Z(k,ω)=Y(k,ω)X(k,ω)/|X(k,ω)|  (5)

Nonlinear functions of E(.), which are used to implement existing estimation algorithms, require additional information such as the statistics of the noise component or of the signal component, which are obtained separately. These algorithms use equations (3 and 5) to revert to the time-domain. Given the initial considerations, the method of the present invention will now be discussed where the transformation of equation (2) is inverted using only the absolute values of X(k,ω).
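The conventional amplitude-only update of equations (4) and (5) can be made concrete in a few lines; the factor 0.8 below is a placeholder for a real estimator E(.), chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
F = 16
X = rng.normal(size=F) + 1j * rng.normal(size=F)   # noisy transformed frame
Y = 0.8 * np.abs(X)                                # stand-in for E(|X|), equation (4)
Z = Y * X / np.abs(X)                              # equation (5): new amplitude, old phase

# the estimate rescales the amplitude but keeps the noisy phase untouched
assert np.allclose(np.abs(Z), Y)
assert np.allclose(np.angle(Z), np.angle(X))
```

This is exactly the limitation the present method targets: the phase of Z is still the phase of the noisy X.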


2. The Reconstruction Scheme


Starting with, for example, only two samples, x1, x2, the following four linear transformations shown below in equations (6-9) followed by a modulus are considered:

Y1=|x1+x2|  (6)
Y2=|x1+jx2|  (7)
Y3=|x1−2x2|  (8)
Y4=|x1−2jx2|  (9)


Direct computations of equations (6-9) produce the following equations (10 and 11):

x_1=±√(3Y_2²−Y_1²−½Y_3²)  (10)

x_2=(Y_1²−Y_2²)/(2x_1)  (11)

for x_1≠0,

x_1=0  (12)

x_2=±√(½Y_3²−Y_1²)  (13)
otherwise, equations (12 and 13) are produced. As can be seen, there remains an ambiguity regarding the signs (e.g., +/− signs). Therefore, a stochastic principle can be used to determine whether the sign is + or −. To determine a solution, the set (Y1, Y2, Y3, Y4) of nonnegative numbers from equations (6-9) has to satisfy a series of constraints shown in equations (14 and 15):

2(Y_1²−Y_2²)=Y_4²−Y_3²  (14)

3Y_2²≧Y_1²+½Y_3²  (15)

If, for example, there is a 4-tuple of the set of nonnegative numbers (Y1, Y2, Y3, Y4) that do not satisfy equations (14 and 15), then the reconstruction scheme may be used to interpolate between admissible values.
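The consistency constraints (14) and (15) can be checked numerically. The sketch below draws random real pairs (x1, x2), forms the four moduli of equations (6-9), and verifies both relations; identity (14) holds exactly because both sides equal 4·x1·x2.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    x1, x2 = rng.normal(size=2)
    Y1 = abs(x1 + x2)            # equation (6)
    Y2 = abs(x1 + 1j * x2)       # equation (7)
    Y3 = abs(x1 - 2 * x2)        # equation (8)
    Y4 = abs(x1 - 2j * x2)       # equation (9)
    # constraint (14): both sides equal 4*x1*x2
    assert np.isclose(2 * (Y1**2 - Y2**2), Y4**2 - Y3**2)
    # constraint (15): the slack is a nonnegative multiple of x1**2
    assert 3 * Y2**2 >= Y1**2 + Y3**2 / 2 - 1e-12
```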


The algorithms to be discussed below evolve from the above reconstruction scheme. In these algorithms, one first considers the general case of an F-vector (Y1, Y2 . . . YF). Then, linear transformations coming from Parseval frames will be considered, thus leading to a scaling as shown in equation (16).

Σ_{n=0}^{M−1}|x(k)_n|²=Σ_{f=1}^{F} Y_f²  (16)

Consequently, an inversion will be defined as shown in equation (17),

Z_k=Y_k/√(Y_1²+ . . . +Y_F²), 1≦k≦F  (17)

in addition, the inversion map will implement a map shown in equation (18):

Q:S^{F−1}∩(R_+)^F→S^{M−1}  (18)

between the F−1 dimensional unit sphere with nonnegative entries S^{F−1}∩(R_+)^F and the M−1 dimensional unit sphere S^{M−1}.


2.1. The Neural Network Algorithm


In the neural network algorithm, one first considers a 3-layer neural network defined by equations (19 and 20):

q_k=σ(Σ_{f=1}^{F} a_kf Z_f+θ_k), 1≦k≦L  (19)

z_m=σ(Σ_{k=1}^{L} b_mk q_k+τ_m), 1≦m≦M  (20)
where A=(a_kf)1≦k≦L, 1≦f≦F, B=(b_mk)1≦m≦M, 1≦k≦L, and θ=(θ_k)1≦k≦L, τ=(τ_m)1≦m≦M are network parameters. They may be compactly written as π=(A, B, θ, τ). As shown in equations (19 and 20), an input vector Z=(Z1, Z2 . . . ZF) is processed to produce an output vector z=(z1, z2 . . . zM). To achieve the mapping shown in equation (18), the network output z=(z_m)1≦m≦M has to be normalized to the norm of the input vector Y, and then assigned a sign as shown below in equation (21):

u_m=ρ z_m √(Y_1²+ . . . +Y_F²)/√(z_1²+ . . . +z_M²)  (21)

The sign ρ is decided based on a maximum likelihood estimation principle and, assuming the noise is Gaussian with a variance b², the two likelihoods are defined in equations (22 and 23):

p(x=u+n|ρ=+1)∝exp(−(1/2b²)Σ_{k=1}^{M}|x_k−u_k|²)  (22)

p(x=u+n|ρ=−1)∝exp(−(1/2b²)Σ_{k=1}^{M}|x_k+u_k|²)  (23)

Therefore, the sign is determined by equation (24):

ρ=+1 if Σ_{k=1}^{M}|x_k−u_k|²≦Σ_{k=1}^{M}|x_k+u_k|², and ρ=−1 otherwise  (24)
The training of the network, for example, the learning of the parameters of π=(A, B, θ, τ), may be done as shown in equation (25):

π^{t+1}=π^t−α(∂/∂π)Σ_{m=1}^{M}|u_m−s_m|²  (25)

with the learning rate α=10⁻⁸, and Y=|T(s+v)| with T being the linear analysis operator shown, for example, in equation (2), and (v=(v_m)1≦m≦M, s=(s_m)1≦m≦M) drawn from a training set made of speech and noise signals. For training one can generate random vectors of M components or use a database of speech and noise signals and divide each signal into vectors of a size M.
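The forward pass, normalization and sign selection of this algorithm can be sketched with numpy. This is a minimal sketch under stated assumptions: the sizes and random parameter values are illustrative, and σ is taken to be the logistic sigmoid, which the text does not fix.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def reconstruct(Z, Y, x, A, B, theta, tau):
    # forward pass of the 3-layer network, equations (19) and (20)
    q = sigmoid(A @ Z + theta)
    z = sigmoid(B @ q + tau)
    # normalization to the norm of the input vector, equation (21) without rho
    u = z * np.linalg.norm(Y) / np.linalg.norm(z)
    # equation (24): pick the sign with the higher Gaussian likelihood
    rho = 1.0 if np.sum((x - u) ** 2) <= np.sum((x + u) ** 2) else -1.0
    return rho * u

rng = np.random.default_rng(1)
F, L, M = 8, 6, 4
A, Bm = rng.normal(size=(L, F)), rng.normal(size=(M, L))
theta, tau = rng.normal(size=L), rng.normal(size=M)
Y = np.abs(rng.normal(size=F))       # noise-free modulus estimate
x = rng.normal(size=M)               # reference frame, used only for the sign
u = reconstruct(Y, Y, x, A, Bm, theta, tau)

# the output is normalized to the norm of Y, as the mapping (18) requires
assert np.isclose(np.linalg.norm(u), np.linalg.norm(Y))
```

Training per equation (25) would adjust (A, B, θ, τ) by gradient descent on Σ|u_m−s_m|²; it is omitted here for brevity.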


2.2. The Distance Optimization Algorithm


In the distance optimization algorithm, a distance minimization criterion may be used. More specifically, let Σ denote the set of all possible F-nonnegative vectors obtained by taking the absolute value of the linear transformation T in equation (2). Given an F-vector Y∈(R_+)^F, which is not necessarily in Σ, the closest element Y′ in Σ to Y is found, and then it is nonlinearly inverted as shown below in equation (26).

Ŷ=arg min_{Y′∈Σ} Σ_{k=1}^{F}|Y_k−Y′_k|²  (26)
Equation (26) is equivalent to the optimization shown in equation (27):

min_{0≦α_k<2π, 2≦k≦F} Σ_{k=1}^{F}|Y_k−(TU(Ỹ))_k|², Ỹ=(e^{jα_k}Y_k)_{1≦k≦F}, α_1=0  (27)
where α_1=0 fixes the sign ambiguity. This optimization then defines the inverse in equation (28):

z=ρU(Y^0), Y_k^0=e^{jα_k^0}Y_k  (28)

with α^0=(α_k^0)1≦k≦F being the solution of equation (27) and ρ=±1 being defined by equation (24). The optimization of equation (27) can then be performed using a gradient algorithm.


2.3. The Iterative Signal Reconstruction Algorithm


The iterative signal reconstruction algorithm works as follows. Let Y∈(R_+)^F be the vector of, for example, absolute value estimates shown by (Z) in FIG. 3. Next, let T denote the linear transformation as shown by (X) in FIG. 3, and let U denote a left inverse. In other words, let U denote another linear transformation that can be used with T to perform a perfect reconstruction, UT=Identity (e.g., T can be a Fourier transform, and U can be an inverse Fourier transform). Then choose ε>0 as a stopping threshold, e.g., ε=10⁻³. The iterative signal reconstruction algorithm then iterates the following steps:


(1) Set k=0, Y0=Y.


(2) Compute zk=UYk.


(3) Compute W=Tzk.


(4) Compute equation (29) below.

Y^{k+1}(n)=Y(n)W(n)/|W(n)|, n=1, 2, . . . , F  (29)

(5) If ∥Yk−Yk+1∥>ε then increment k=k+1 and go to step 2; otherwise stop.


The estimated or reconstructed signal is indicated by the last computed zk.
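Steps (1)-(5) can be sketched with an oversampled DFT as T and its Moore-Penrose left inverse as U; this pair is an illustrative choice satisfying UT=I, not the only admissible one. Each pass keeps the known moduli Y and takes the phase from the current frame estimate.

```python
import numpy as np

# a redundant linear transform T: R^M -> C^F and a left inverse U, U T = I
M, F = 8, 32
n, f = np.arange(M), np.arange(F)
T = np.exp(-2j * np.pi * np.outer(f, n) / F) / np.sqrt(F)
U = np.linalg.pinv(T)

rng = np.random.default_rng(5)
x = rng.normal(size=M)               # unknown source frame
Y = np.abs(T @ x)                    # only the moduli are observed

Yk = Y.astype(complex)               # step (1)
z = np.real(U @ Yk)
err0 = np.linalg.norm(np.abs(T @ z) - Y)        # initial modulus mismatch
for _ in range(200):                             # iteration cap for the sketch
    W = T @ z                                    # steps (2)-(3)
    Y_next = Y * W / np.maximum(np.abs(W), 1e-12)  # step (4), equation (29)
    if np.linalg.norm(Yk - Y_next) <= 1e-3:      # step (5)
        Yk = Y_next
        break
    Yk = Y_next
    z = np.real(U @ Yk)

err = np.linalg.norm(np.abs(T @ z) - Y)          # final modulus mismatch
assert err <= err0 + 1e-9                        # mismatch never increases
```

Because TU is the orthogonal projector onto the range of T, the modulus mismatch is non-increasing from one iteration to the next, which is what the assertion checks.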


2.4. Nonlinear Signal Reconstruction


Nonlinear signal reconstruction that does not use a noise component of a signal will now be discussed with reference to FIG. 3. As shown in FIG. 3, a linear transformation such as a Fourier transform or a wavelet transform is applied to a signal (x) (shown in equation (1)) comprising a noise component and an unknown source component giving a representation of the signal (X) in the transformed domain (310). The signal (x) may be acquired using a microphone or a database comprising audio signals and image signals. After the signal (x) is transformed, the modulus or absolute value (Z) of the transformed signal (X) is determined (320). Upon determining the modulus (Z) of the transformed signal, a statistical estimation (Y) of a noise free part of the transformed signal is determined (330).


The statistical estimation may be performed by using equation (4), Wiener filtering or by using an Ephraim-Malah estimation technique. The statistical estimation may also be performed using a spectral subtraction technique by solving equations (30), (31) and (32):

Y(k,ω)=√(|X(k,ω)|²−R_n(k,ω)) if |X(k,ω)|²≧R_n(k,ω), and Y(k,ω)=0 otherwise  (30)

R_n(k,ω)=min_{k−W≦k′<k} R_x(k′,ω)  (31)

R_x(k,ω)=(1−β)R_x(k−1,ω)+β|X(k,ω)|²  (32)
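A minimal sketch of equations (30)-(32): the power spectrum is recursively smoothed (32), a noise floor is taken as a running minimum over the preceding W frames (31), and the floor is subtracted in the power domain with clipping at zero (30). The values of beta and W are illustrative, not prescribed by the text.

```python
import numpy as np

def spectral_subtraction(X, beta=0.1, W=20):
    # X: complex spectrogram of shape (K, n_freq); returns magnitude estimate Y
    K = X.shape[0]
    P = np.abs(X) ** 2
    Rx = np.zeros_like(P)
    Rn = np.zeros_like(P)
    Y = np.zeros_like(P)
    for k in range(K):
        # (32): recursive smoothing of the power spectrum
        Rx[k] = (1 - beta) * (Rx[k - 1] if k > 0 else P[0]) + beta * P[k]
        # (31): noise floor = minimum of Rx over frames k-W <= k' < k
        Rn[k] = Rx[max(0, k - W):k].min(axis=0) if k > 0 else Rx[0]
        # (30): subtract the floor in the power domain, clip at zero
        Y[k] = np.sqrt(np.maximum(P[k] - Rn[k], 0.0))
    return Y

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 16)) + 1j * rng.normal(size=(50, 16))
Y = spectral_subtraction(X)
assert Y.shape == X.shape
assert (Y >= 0).all() and (Y <= np.abs(X) + 1e-9).all()
```

The final assertion reflects that the estimate can only attenuate: Y never exceeds the observed magnitude |X|.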







After performing the statistical estimation (Y) of the noise free part of the transformed signal, the unknown source component of the signal (x) is reconstructed using the statistical estimation (Y) of the noise-free part of the linear transformed signal (340). The unknown source component of the signal (x) may be reconstructed by using the neural network, distance optimization and iterative signal reconstruction algorithms. In reconstructing the unknown source component using an alternative variant of the neural network and distance optimization algorithms, a nonlinear transformation or mapping of the statistical estimation (Y) is performed (350). The nonlinear mapping in the neural network algorithm may be performed by using equation (33),

u_m=z_m √(Y_1²+ . . . +Y_F²)/√(z_1²+ . . . +z_M²)  (33)
after defining a three layer neural network using equations (19 and 20). Nonlinear mapping in the distance optimization algorithm may be performed by using equation (34),

z=U(Y^0), Y_k^0=e^{jα_k^0}Y_k  (34)


After nonlinear mapping is performed, the sign (ρ) of the source component of the signal (x) is determined (360). The sign (ρ) is determined by using equation (24). After determining the sign (ρ), the product (u/z) of the sign (ρ) and the nonlinear mapped signal (z) is determined (370). Upon determining the product (u/z), a conventional overlap-add procedure such as that shown in equation (3), is performed on the product (u/z) (380) and a cleaned signal or reconstructed unknown source signal Ŝ results, which may then be output to a loudspeaker or further analyzed by an automatic speech recognition system.


2.5. Proofs


A set of proofs, which show nonlinear mapping is mathematically well defined, will now be described. This set of proofs is described in the paper entitled, “On Signal Reconstruction Without Noisy Phase”, Radu Balan, Pete Casazza, Dan Edidin, Dec. 20, 2004, available at (http://front.math.ucdavis.edu/author/Balan-R*), a copy of which is herein incorporated by reference.


By constructing new classes of Parseval frames for a Hilbert space, it will be shown that such frames allow for the reconstruction of a signal without using its noisy phase or its estimation. Frames are redundant systems of vectors in Hilbert spaces. They satisfy the property of perfect reconstruction, in that any vector of the Hilbert space can be synthesized back from its inner products with the frame vectors. More precisely, the linear transformation from the initial Hilbert space to the space of coefficients obtained by taking the inner product of a vector with the frame vectors is injective and hence admits a left inverse. The following proofs will show what kind of reconstruction is possible if one only has knowledge of the absolute values of the frame coefficients.


First, consider a Hilbert space H with a scalar product <,>. A finite or countable set of vectors F={f_i, i∈I} of H is called a frame if there are two positive constants A, B>0 such that for every vector x∈H,

A∥x∥²≦Σ_{i∈I}|<x, f_i>|²≦B∥x∥²  (2.1)
The frame is tight when the constants can be chosen equal to one another, A=B. For A=B=1, F is called a Parseval frame. The numbers <x, f_i> are called frame coefficients.
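A Parseval frame satisfying (2.1) with A=B=1 is easy to exhibit numerically: keep the first M columns of a unitary matrix and read its F rows as the frame vectors. The construction below (a truncated, normalized DFT matrix) is an illustrative choice, not one prescribed by the text.

```python
import numpy as np

M, F = 4, 7
dft = np.fft.fft(np.eye(F)) / np.sqrt(F)     # unitary F x F DFT matrix
T = dft[:, :M]                               # analysis operator, F x M

x = np.random.default_rng(3).normal(size=M)
coeffs = T @ x                               # frame coefficients <x, f_i>

# Parseval identity: the sum of squared moduli equals the squared norm of x
assert np.isclose(np.sum(np.abs(coeffs) ** 2), np.sum(x ** 2))
```

Because the F rows come from orthonormal columns, T satisfies T*T=I, which is exactly the A=B=1 case of (2.1).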


To a frame F associate the analysis and synthesis operators defined by:

T:H→l²(I), T(x)={<x, f_i>}_{i∈I}  (2.2)

T*:l²(I)→H, T*(c)=Σ_{i∈I} c_i f_i  (2.3)

which are well defined due to equation (2.1), and are adjoint to one another. The range of T in l²(I) is called the range of coefficients. The frame operator defined by S=T*T:H→H is invertible by equation (2.1) and provides the perfect reconstruction formula:

x=Σ_{i∈I}<x, f_i>S⁻¹f_i  (2.4)
Consider now, the nonlinear mapping:

M_a:H→l²(I), M_a(x)={|<x, f_i>|}_{i∈I}  (2.5)

obtained by taking the absolute value of each entry of the analysis operator. Denote by H_r the quotient space H_r=H/~ obtained by identifying two vectors that differ by a constant phase factor: x~y if there is a scalar c with |c|=1 so that y=cx. For real Hilbert spaces c can only be +1 or −1, and thus H_r=H/{±1}. For complex Hilbert spaces c can be any complex number of modulus one, e^{jφ}, and then H_r=H/T¹, where T¹ is the complex unit circle. Thus, two vectors of H in the same ray would have the same image through M_a, and the nonlinear mapping M_a extends to H_r as:

M:H_r→l²(I), M(x̂)={|<x, f_i>|}_{i∈I}, x∈x̂  (2.6)


The following description will also center around the injectivity of the map M. For example, when it is injective, M admits a left inverse, meaning that any vector (e.g., signal) in H can be reconstructed up to a constant phase factor from the modulus of its frame coefficients.


Referring back to the conventional signal enhancement method shown in FIG. 1, the Ephraim-Malah noise reduction method may be used. Let {x(t), t=1, 2 . . . T} be the samples of a speech signal. These samples are first transformed into the time-frequency domain through:

X(k,ω)=Σ_{t=0}^{M−1} g(t)x(t+kN)e^{−2πiωt/M}, k=0, 1, . . . , (T−M)/N, ω∈{0, 1, . . . , M−1}  (2.7)


where g is the analysis window, and M, N are respectively the window size and the time step. Next, a nonlinear transformation is applied to |X(k, ω)| to produce a minimum mean square error (MMSE) estimate of the short-time spectral amplitude:

Y(k,ω)=(√π/2)(√(υ(k,ω))/γ(k,ω))exp(−υ(k,ω)/2)[(1+υ(k,ω))I_0(υ(k,ω)/2)+υ(k,ω)I_1(υ(k,ω)/2)]|X(k,ω)|  (2.8)

where I0, I1 are modified Bessel functions of zero and first order, and υ(k, ω), γ(k, ω) are estimates of certain signal-to-noise ratios. The speech signal windowed Fourier coefficients are estimated by:

X̂(k,ω)=Y(k,ω)·X(k,ω)/|X(k,ω)|  (2.9)

and then transformed back into the time domain through an overlap-add procedure:

x̂(t)=ΣkΣω=0M−1X̂(k,ω)e2πiω(t−kN)/Mh(t−kN)  (2.10)

where h is the synthesis window. This example illustrates that nonlinear estimation in the representation domain modifies only the amplitude of the transformed signal and keeps its noisy phase.
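The conventional pipeline of FIG. 1 and equations (2.7)-(2.10) can be sketched as follows: analyze, modify only the magnitude, reattach the noisy phase, then overlap-add. The Hann windows and the simple magnitude-subtraction gain below stand in for the Ephraim-Malah estimator of (2.8); they, and the toy signal, are illustrative assumptions.

```python
import numpy as np

# Sketch of the conventional enhancement pipeline: the nonlinear estimate
# acts on the magnitude only, while the noisy phase is reused, as in (2.9).
rng = np.random.default_rng(1)
Mw, Nh = 64, 32                          # window size M and time step N of (2.7)
T = 512
t = np.arange(T)
clean = np.sin(2 * np.pi * 0.05 * t)
noisy = clean + 0.1 * rng.standard_normal(T)

g = np.hanning(Mw)                       # analysis window g of (2.7)
h = np.hanning(Mw)                       # synthesis window h of (2.10)

xhat = np.zeros(T)
wsum = np.zeros(T)
for k0 in range(0, T - Mw + 1, Nh):
    X = np.fft.fft(g * noisy[k0:k0 + Mw])     # one frame of (2.7)
    mag, phase = np.abs(X), np.angle(X)
    # Placeholder nonlinear estimate of the clean magnitude; the noisy
    # phase is kept unchanged, which is the step the invention bypasses.
    Y = np.maximum(mag - 0.1 * np.sqrt(Mw), 0.0)
    Xhat = Y * np.exp(1j * phase)
    xhat[k0:k0 + Mw] += np.real(np.fft.ifft(Xhat)) * h   # overlap-add (2.10)
    wsum[k0:k0 + Mw] += g * h

valid = wsum > 1e-8                      # normalize overlapping windows
xhat[valid] /= wsum[valid]
```

The gain here is a crude spectral floor rather than the MMSE gain of (2.8); the point of the sketch is only the flow of magnitude and phase through the pipeline.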


Similarly, given a voice signal {x(t), t=1, 2 . . . T}, an automatic speech recognition system outputs a sequence of recognized phonemes from an alphabet. The voice signal is transformed into the time-frequency domain by the same discrete windowed Fourier transform shown in equation (2.7). The real cepstral coefficients Cx(k, ω) are defined as the logarithm of the modulus of X(k, ω):

Cx(k,ω)=log(|X(k,ω)|)  (2.11)


Two rationales have been discussed for using this object. First, the recorded signal x(t) is a convolution of the voice signal s(t) with the source-to-microphone (e.g., channel) impulse response h. In the time frequency domain, the convolution almost becomes a multiplication and the cepstral coefficients decouple as shown in equation (2.12):

Cx(k,ω)=log(|H(ω)|)+Cs(k,ω)  (2.12)

where H(ω) is the channel transfer function, and Cs is the voice signal cepstral coefficient. Because the channel function is invariant, by subtracting the time average we obtain:

Fx(k,ω)=Cx(k,ω)−ε[Cx(·,ω)]=Cs(k,ω)−ε[Cs(·,ω)]  (2.13)

where ε is the time average operator. Thus, Fx encodes information regarding the speech signal alone, independent of the reverberant environment. The second rationale for using Cx and thus Fx, is that phase does not matter in speech recognition. Thus, by taking the modulus in equation (2.11) one does not lose information about the message.
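The decoupling (2.12) and the channel removal (2.13) can be verified on a toy example. The synthetic spectrogram and channel below are illustrative assumptions; only the algebra of equations (2.11)-(2.13) is taken from the description.

```python
import numpy as np

# Sketch of equations (2.11)-(2.13): cepstral coefficients from the modulus
# of windowed Fourier coefficients, then cepstral mean subtraction.
rng = np.random.default_rng(2)
K, M = 20, 16                            # number of frames, frequency bins
S_spec = np.abs(rng.standard_normal((K, M))) + 0.1   # |S(k, w)|, voice
H = np.abs(rng.standard_normal(M)) + 0.1             # |H(w)|, channel

X_spec = S_spec * H                      # |X| = |H| |S| in time-frequency
Cx = np.log(X_spec)                      # equation (2.11)
Cs = np.log(S_spec)

# Equation (2.12): Cx decouples as log|H(w)| + Cs(k, w)
decoupled = np.allclose(Cx, np.log(H)[None, :] + Cs)

# Equation (2.13): subtracting the time average removes the channel term
Fx = Cx - Cx.mean(axis=0)
Fs = Cs - Cs.mean(axis=0)
channel_free = np.allclose(Fx, Fs)
```

Because the channel term log|H(ω)| is constant in time, the time average removes it exactly, leaving features that depend on the speech alone.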


Returning to the automatic speech recognition system, the spectral coefficients Fx are fed into several hidden Markov models, one hidden Markov model for each phoneme. The outputs of the hidden Markov models give an utterance likelihood of a particular phoneme and the automatic speech recognition system chooses the phoneme with the largest likelihood. This example also illustrates that the phase of the transformed-domain signal plays either a secondary role or none whatsoever.


2.5(a). Analysis of M for Real Frames


Consider the case H=RN, where the index set I has cardinality M, I={1, 2 . . . M}. Then l2(I)≃RM. For a frame F={f1, f2 . . . fM} of RN, denote by T the analysis operator:

T:RN→RM, T(x)=Σk=1M<x, fk>ek  (2.14)
where {e1, e2 . . . eM} is the canonical basis of RM. Let W denote the range of the analysis map T(RN), which is an N-dimensional subspace of RM. Recall the nonlinear map we are using:

M:RN/{±1}→RM, M(x̂)=Σk=1M|<x, fk>|ek, xεx̂  (2.15)
When there is no confusion, F will be dropped from the notation.


Two frames {fi}iεI and {gi}iεI are equivalent if there is an invertible operator T on H with T(fi)=gi, for all iεI. It is known that two frames are equivalent if and only if their associated analysis operators have the same range. It follows that M-element frames of RN are parameterized by the fiber bundle F(N, M; R), which is the GL(N, R) bundle over the Grassmanian Gr(N,M).


The analysis will now be reduced to equivalent classes of frames:


Proposition 2.1. For any two frames F and G that have the same range of coefficients, MF is injective if and only if MG is injective.


Proof. Any two frames F={fk} and G={gk} that have the same range of coefficients are equivalent, i.e., there is an invertible R:RN→RN so that gk=Rfk, 1≦k≦M. Their associated nonlinear maps MF and MG satisfy MG(x)=MF(R*x). This shows that MF is injective if and only if MG is injective. Consequently, the property of injectivity of M depends only on the subspace of coefficients W in Gr(N,M).


This result shows that for two frames corresponding to two points in the same fiber of F(N, M; R), injectivity of the associated nonlinear maps jointly holds or fails. Because of this, one may use the topology induced by the base manifold Gr(N,M) of the fiber bundle F(N, M; R) on the set of M-element frames of RN.


If {fi}iεI is a frame with a frame operator S then {S−1/2fi}iεI is a Parseval frame which is equivalent to {fi}iεI and called the canonical Parseval frame associated to {fi}iεI. Also, {S−1fi}iεI is a frame equivalent to {fi}iεI and is called the canonical dual frame associated to {fi}iεI. Proposition 2.1 shows that when the nonlinear map MF is injective then the same property holds for the canonical dual frame and the canonical Parseval frame.
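The canonical Parseval frame {S−1/2fi} and canonical dual frame {S−1fi} can be computed directly. The random frame below is an illustrative assumption; the two defining properties checked at the end follow from the definitions above.

```python
import numpy as np

# Sketch of the canonical Parseval frame {S^{-1/2} f_i} and the canonical
# dual frame {S^{-1} f_i} associated to a frame {f_i}.
rng = np.random.default_rng(3)
N, M = 3, 7
F = rng.standard_normal((M, N))          # rows are the frame vectors f_i
S = F.T @ F                              # frame operator

# S^{-1/2} via the spectral decomposition of the positive matrix S
w, V = np.linalg.eigh(S)
S_inv_half = V @ np.diag(w ** -0.5) @ V.T

G_parseval = F @ S_inv_half              # rows are S^{-1/2} f_i
G_dual = F @ np.linalg.inv(S)            # rows are S^{-1} f_i

# Parseval property: the new frame operator is the identity
parseval_ok = np.allclose(G_parseval.T @ G_parseval, np.eye(N))
# Dual property: sum_i <x, f_i> S^{-1} f_i reconstructs x, per (2.4)
x = rng.standard_normal(N)
dual_ok = np.allclose(G_dual.T @ (F @ x), x)
```

By Proposition 2.1, injectivity of the modulus map is shared by a frame, its canonical dual, and its canonical Parseval frame, since all three have the same range of coefficients.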


Given S⊂{1, 2 . . . M} define a map σS:RM→RM by the equation:

σS(a1, . . . ,aM)=((−1)S(1)a1, . . . ,(−1)S(M)aM)  (2.16)


Clearly σS2=id and σSC=−σS where SC is the complement of S. Let LS denote the (M−|S|)-dimensional linear subspace of RM where LS={(a1, a2 . . . aM)|ai=0, iεS}, and let PS:RM→LS denote the orthogonal projection onto this subspace. Thus, (PS(u))i=0, if iεS, and (PS(u))i=ui, if iεSC. For every vector uεRM, σS(u)=u iff uεLS. Likewise σS(u)=−u iff uε(LS)C, where (LS)C denotes the analogous subspace LSC for the complement set SC. Note:

PS(u)=(u+σS(u))/2, PSC(u)=(u−σS(u))/2  (2.17)
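The sign maps σS of (2.16) and the projection identity (2.17) are easy to check numerically. The dimension M and the subset S below are illustrative assumptions.

```python
import numpy as np

# Sketch of sigma_S from (2.16) and the identity (2.17).
rng = np.random.default_rng(4)
M = 6
S = {1, 3}                               # subset of {0, ..., M-1} (0-based)
sign = np.array([-1.0 if i in S else 1.0 for i in range(M)])

def sigma_S(u):
    """(sigma_S u)_i = (-1)^{S(i)} u_i."""
    return sign * u

u = rng.standard_normal(M)

# sigma_S is an involution: sigma_S(sigma_S(u)) = u
involution = np.allclose(sigma_S(sigma_S(u)), u)

# Equation (2.17): P_S u = (u + sigma_S u)/2, P_{S^c} u = (u - sigma_S u)/2
P_S_u = 0.5 * (u + sigma_S(u))           # zeroes the coordinates in S
P_Sc_u = 0.5 * (u - sigma_S(u))          # zeroes the coordinates outside S
split_ok = np.allclose(P_S_u + P_Sc_u, u) and np.all(P_S_u[list(S)] == 0)
```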

Theorem 2.2. (Real Frames). If M≧2N−1 then for a generic frame F, M is injective. Generic means an open dense subset of the set of all M-element frames in RN.


Proof. Suppose that x and x′ have the same image under M=MF. Let a1, a2 . . . aM be the frame coefficients of x, and a′1, a′2 . . . a′M the frame coefficients of x′. Then a′i=±ai for each i. In particular, there is a subset S⊂{1, 2 . . . M} of indices such that a′i=(−1)S(i)ai, where the function S(i) is the characteristic function of S and is defined by the rule that S(i)=1 if iεS and S(i)=0 if i∉S. Thus two vectors x, x′ have the same image under M if there is a subset S⊂{1, 2 . . . M} such that (a1, a2 . . . aM) and ((−1)S(1)a1, (−1)S(2)a2 . . . (−1)S(M)aM) are both in W, the range of coefficients associated with F.


To finish the proof it will be shown that when M≧2N−1 such a condition is not possible for a generic N-dimensional subspace W⊂RM; that is, the set of W's for which it fails forms a dense (Zariski) open set in the Grassmanian Gr(N,M). In particular, the probability that a randomly chosen W admits such a pair is 0.


To finish the proof the following Lemma is needed.


Lemma 2.3. If M≧2N−1 then the following holds for a generic N-dimensional subspace W⊂RM. Given uεW then σs(u)εW iff σs(u)=±u.


Proof. Suppose uεW and σs(u)≠±u but σs(u) εW. Because σs is an involution, u+σs(u) is fixed by σs and is non-zero, thus W∩LS≠0. Likewise,

0≠u−σS(u)=u+σSc(u)  (2.18)


Therefore, W∩(LS)C≠0.


Now LS and (LS)C are fixed linear subspaces of dimension M−|S| and |S|. If M≧2N−1 then one of these subspaces has a co-dimension greater than or equal to N. However, a generic linear subspace W of dimension N intersects a fixed linear subspace of co-dimension greater than or equal to N only at 0. Therefore, if W is generic and u, σS(u)εW then σS(u)=±u, which ends the proof of the Lemma.


The proof of the theorem now follows from the fact that if W is in the intersection of generic conditions imposed by the proposition for each subset S⊂{1, 2 . . . M} then W satisfies the conclusion of the theorem.


The proof of Lemma 2.3 actually shows:


Corollary 2.4. The map M is injective if, for every subset S⊂{1, 2 . . . M}, whenever there is a non-zero element uεW with uεLS, then W∩(LS)C={0}.


It will now be observed that this bound is sharp.


Proposition 2.5. If M≦2N−2, then injectivity fails for all M-element frames.


Proof of Proposition 2.5. Because M≦2N−2, we have that 2(M−N+1)≦M. Let (ei)i=1M be the canonical orthonormal basis of RM. One can then write (ei)i=1M=(ei)i=1k∪(ei)i=k+1M where both k and M−k are ≧M−N+1.


Let W be any N-dimensional subspace of RM. Because k≧M−N+1, we have k+N>M, so span((ei)i=1k) intersects W non-trivially: there exists a nonzero vector uεspan((ei)i=1k)∩W. Similarly, there is a nonzero vector v in span((ei)i=k+1M)∩W. Then M(u+v)=M(u−v) while u+v≠±(u−v), so by the above corollary M cannot be injective.
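The failure mechanism of Proposition 2.5 can be exhibited numerically. The random 4-element frame of R^3 below (M=2N−2) is an illustrative assumption; the code finds the two disjointly supported vectors of the proof and confirms that the modulus map identifies two genuinely different signals.

```python
import numpy as np

# Numeric illustration of Proposition 2.5 with N = 3, M = 2N - 2 = 4.
rng = np.random.default_rng(5)
N, M = 3, 4
F = rng.standard_normal((M, N))          # rows are the frame vectors

# W = range of the analysis operator, a 3-dim subspace of R^4.
# Find coefficient vectors supported on {0, 1} and on {2, 3} respectively.
def in_W_with_support(keep):
    # require <x, f_i> = 0 for every index i outside `keep`
    A = F[[i for i in range(M) if i not in keep]]
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]                        # nonzero vector in the null space

x_u = in_W_with_support({0, 1})
x_v = in_W_with_support({2, 3})

x, y = x_u + x_v, x_u - x_v
Ma = lambda z: np.abs(F @ z)

# Same moduli, yet x is not ±y: the modulus map is not injective.
same_modulus = np.allclose(Ma(x), Ma(y))
distinct = not (np.allclose(x, y) or np.allclose(x, -y))
```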


The next result provides an easy way for frames to satisfy the above condition.


Corollary 2.6. If F is an M-element frame for RN with M≧2N−1 having the property that every N-element subset of the frame is linearly independent, then M is injective.


Proof. Under the given conditions, no nonzero element of W vanishes in N or more coordinates, so the condition of Corollary 2.4 is satisfied and the Corollary holds.


Corollary 2.7. (1) If M=2N−1, then the condition given in Corollary 2.6 is also necessary. (2) If M≧2N, this condition is no longer necessary.


Proof. (1) For (1) in Corollary 2.7, the contrapositive will be proven. Let M=2N−1 and assume there is an N-element subset (fi)iεS of F which is not linearly independent. Then there is a nonzero xε(span(fi)iεS)⊥⊂RN. Hence, 0≠u=T(x)εLS∩W. On the other hand, because dim(span(fi)(iεS)C)≦N−1, there is a nonzero yε(span(fi)(iεS)C)⊥⊂RN so that 0≠v=T(y)ε(LS)C∩W. Then, as in Proposition 2.5, M(u+v)=M(u−v), so M is not injective.


(2) If M≧2N, an M-element frame for RN for which M is injective but which has a linearly dependent N-element subset can be constructed. Let F′={f1, . . . f2N−1} be a frame for RN so that any N-element subset is linearly independent. By Corollary 2.6, the map MF′ is injective. Now extend this frame to F={f1 . . . fM} by setting f2N= . . . =fM=f2N−1. The map MF extends MF′ and therefore remains injective, whereas any N-element subset that contains two vectors from {f2N−1, f2N . . . fM} is no longer linearly independent.


It is noted that the above-mentioned frames can be constructed “by hand”. For example, start with an orthonormal basis for RN, say (fi)i=1N. Assume that a set of vectors (fi)i=1M has been constructed such that every subset of N vectors is linearly independent. Consider the spans of all the (N−1)-element subsets of (fi)i=1M. Pick fM+1 not in the span of any of these subsets. Then (fi)i=1M+1 has the property that every N-element subset is linearly independent.
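The "by hand" construction above can be sketched as a greedy loop. Drawing each new vector at random (which avoids the finitely many (N−1)-dimensional spans with probability one) and the determinant-based independence test are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

# Grow a frame of R^N one vector at a time, keeping every N-element
# subset linearly independent, as in the construction described above.
rng = np.random.default_rng(6)
N, M = 3, 7                              # target M >= 2N - 1 = 5

def full_spark(vectors):
    """True if every N-element subset of the rows is linearly independent."""
    return all(
        abs(np.linalg.det(vectors[list(idx)])) > 1e-10
        for idx in combinations(range(len(vectors)), N)
    )

F = np.eye(N)                            # start from an orthonormal basis
while len(F) < M:
    candidate = rng.standard_normal(N)
    trial = np.vstack([F, candidate])
    if full_spark(trial):                # keep only full-spark extensions
        F = trial

frame_ok = full_spark(F) and len(F) == M
```

By Corollary 2.6, the resulting frame (here M=7≧2N−1=5 with every 3-element subset independent) yields an injective modulus map M.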


Now a slightly different proof of this result will be provided that gives necessary and sufficient conditions for a frame to have the required properties.


Theorem 2.8. Let (fi)i=1M be a frame for RN. The following are equivalent:


(1) The map M is injective; and


(2) For every subset S⊂{1, 2 . . . M}, either (fi)iεS spans RN or (fi)(iεS)C spans RN.


Proof. (1)→(2): The contrapositive is proven. Therefore, assume that there is a subset S⊂{1, 2 . . . M} so that neither {fi; iεS} nor {fi; iεSC} spans RN. Hence there are nonzero vectors x, yεRN so that x⊥span(fi)iεS and y⊥span(fi)iεSC. Then 0≠T(x)εLS∩W and 0≠T(y)ε(LS)C∩W. As in Proposition 2.5, M cannot be injective.


(2)→(1): Assume M({circumflex over (x)})=M(ŷ) for some {circumflex over (x)}, ŷ. This means for every 1≦j≦M, |<x, fj>|=|<y, fj>| where xε{circumflex over (x)} and yεŷ. Let,
S={j:<x,fj>=−<y,fj>}  (2.19)
and note,
SC={j:<x,fj>=<y,fj>}  (2.20)


Now, x+y⊥span(fi)iεS and x−y⊥span(fi)iεSC. Assume that {fi; iεS} spans RN. Then x+y=0 and thus {circumflex over (x)}=ŷ. If {fi; iεSC} spans RN then x−y=0 and again {circumflex over (x)}=ŷ. Either way {circumflex over (x)}=ŷ, which proves M is injective.
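The criterion of Theorem 2.8 can be tested mechanically for small frames. The exhaustive check over all subsets and the two toy frames below are illustrative assumptions; the test is exponential in M and is meant only for small examples.

```python
import numpy as np
from itertools import combinations

# Check of the Theorem 2.8 criterion: M is injective iff for every subset
# S of indices, {f_i : i in S} or {f_i : i in S^c} spans R^N.
def modulus_map_injective(F, N, tol=1e-10):
    """F: M x N array whose rows are the frame vectors."""
    M = len(F)
    def spans(rows):
        return len(rows) > 0 and np.linalg.matrix_rank(rows, tol=tol) == N
    return all(
        spans(F[list(S)]) or spans(F[[i for i in range(M) if i not in S]])
        for r in range(M + 1)
        for S in combinations(range(M), r)
    )

# Full-spark frame with M = 2N - 1 = 3 vectors in R^2: criterion holds.
good = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
injective_good = modulus_map_injective(good, 2)

# Frame {e1, e1, e2}: for S = {0, 1} neither part spans R^2, so it fails.
bad = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
injective_bad = modulus_map_injective(bad, 2)
```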


For M<2N−1 there are numerous frames for which M is not injective. However for a generic frame, it can be shown that the set of rays that can be reconstructed from the image under M is open dense in RN/{±1}.


Theorem 2.9. Assume M>N. Then for a generic frame FεF[N, M; R], the set of vectors xεRN so that (MF)−1 (MaF(x)) consists of one point in RN/{±1} has a dense interior in RN.


Proof. Let F be an M-element frame in RN. Then F is similar to a frame G that consists of the union of the canonical basis of RN, {d1 . . . dN}, with some other set of M−N vectors. Let G={gk; 1≦k≦M}. Thus, gkj=dj, 1≦j≦N, for some N elements {k1, k2 . . . kN} of {1, 2 . . . M}. Consider now the set B of frames F so that the similar frame G constructed above has a vector gk0 with all entries nonzero,

B={FεF[N, M; R] | G={gk}, {d1, . . . , dN}⊂G, Πj=1N<gk0, dj>≠0 for some k0}  (2.21)
Clearly B is open and dense in F[N, M; R]. Thus, generically FεB. Let G={gk; 1≦k≦M} be its similar frame satisfying the condition above. Next, it is proved that the set X=XF of vectors xεRN such that (MG)−1(MaG(x)) has more than one point is thin, i.e., it is included in a set whose complement is open and dense in RN. The claim is that X⊂∪S(VS+∪VS−) where (VS±)S⊂{1, 2 . . . N} are linear subspaces of RN of codimension 1 indexed by subsets S of {1, 2 . . . N}. This claim will conclude the proof of Theorem 2.9.


To verify the claim, let x, yεRN so that MaG(x)=MaG(y) and yet x≠y and x≠−y. Because G contains the canonical basis of RN, |xk|=|yk| for all 1≦k≦N. Then there is a subset S⊂{1, 2 . . . N} so that yk=(−1)S(k)xk. Note S≠Ø and S≠{1, 2 . . . N}. Denote by DS the diagonal N×N matrix with (DS)kk=(−1)S(k). Thus y=DSx and yet DS≠±I. Let gk0εG be such that none of its entries vanish. Then |<x, gk0>|=|<y, gk0>| implies,

<x,(I±DS)gk0>=0  (2.22)


This proves the set XG is included into the union of 2(2N−2) linear subspaces of codimension 1,

∪S≠Ø,SC≠Ø({(I−DS)gk0}⊥∪{(I+DS)gk0}⊥)  (2.23)


Because F is similar to G, XF is included into the image of the above set through a linear invertible map, which proves the claim.


2.5(b). Analysis of M for Complex Frames


In this section the Hilbert space is CN. For an M-element frame F={f1, f2 . . . fM} of CN the analysis operator is defined by equation (2.14), where the scalar product is

<x, y>=Σk=1Nx(k)ȳ(k).
The range of coefficients, i.e., the range of the analysis operator, is a complex N-dimensional subspace of CM that is denoted again by W. The nonlinear map to be analyzed is given by:

M:CN/T1→RM, M(x̂)=Σk=1M|<x, fk>|ek, xεx̂  (2.24)

where two vectors x, yε{circumflex over (x)} if there is a scalar cεC with |c|=1 so that y=cx.
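The phase invariance of the complex modulus map (2.24) can be checked directly. The random complex frame, the test vectors, and the dimensions below are illustrative assumptions.

```python
import numpy as np

# Sketch of the complex modulus map (2.24): Ma is invariant under a global
# phase factor c with |c| = 1, so it descends to C^N / T^1.
rng = np.random.default_rng(7)
N, M = 3, 10                             # M >= 4N - 2 = 10, cf. Theorem 3.1
F = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

def Ma(x):
    # <x, f_k> = sum_j x(j) conj(f_k(j)), per the scalar product above
    return np.abs(F.conj() @ x)

x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
c = np.exp(1j * 0.7)                     # unit-modulus scalar
phase_invariant = np.allclose(Ma(x), Ma(c * x))

# A vector differing by more than a global phase generically changes the
# moduli, consistent with the generic injectivity of M on C^N / T^1.
y = x + np.array([0.5, 0.0, 0.0])
separated = not np.allclose(Ma(x), Ma(y))
```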


By the equivalence, M-element frames of CN are parameterized by points of the fiber bundle F(N, M; C), the GL(N, C) bundle over the complex Grassmanian Gr(N, M)C.


Proposition 2.1 holds true for complex frames as well. Thus, without loss of generality, the topology induced by the base manifold of F(N, M; C) on the set of M-element frames of CN will be used. As in the real case, the question about M-element frames in CN is reduced to a question about the Grassmanian of N-planes in CM. First, the following theorem is proved.


Theorem 3.1. If M≧4N−2 then the generic N-plane W in CM has the property that if v=(v1, v2 . . . vM) and w=(w1, w2 . . . wM) are vectors in W such that |vi|=|wi| for all i then v=λw for some complex number λ of modulus 1.


Proof. Say that an N-plane W has property (*) if there are non-parallel vectors v, w in W such that |vi|=|wi| for all i. Recall that two vectors x, y are parallel if there is a scalar cεC so that y=cx.


Given an N-plane W it may be assumed, after reordering coordinates on CM, that W is the span of the rows of an N×M matrix of the form:

[1 0 . . . 0 uN+1,1 . . . uM,1]
[0 1 . . . 0 uN+1,2 . . . uM,2]
[ . . . . . . ]
[0 0 . . . 1 uN+1,N . . . uM,N]  (2.25)
where the N(M−N) entries {ui,j} are viewed as indeterminates. Thus, Gr(N, M)C is isomorphic to CN(M−N) in a neighborhood of W.


Now suppose that W satisfies (*) and v and w are two non-parallel vectors whose entries have the same modulus. The choice of basis for W ensures that one of the first N entries in v (and hence w) is nonzero. Because one only cares about these vectors up to rescaling, it may be assumed, after reordering, that v1=w1=1. In addition, because the vectors are non-parallel, there is some i≦N with vi≠wi and vi≠0. After again reordering, it can be assumed that v2≠w2 and v2≠0.


Then set λ1=1. By assumption there are numbers λ2 . . . λM with λ2≠1 such that wi=λivi for i=1, 2 . . . M. Expanding in terms of the basis for W, for i>N,

vi=Σj=1Nvjui,j, and wi=Σj=1Nλjvjui,j.
Thus if W satisfies (*) there must be λ2 . . . λNεT1 (with λ2≠1) and v2 . . . vNεC such that for all N+1≦i≦M one has,

|Σj=1Nvjui,j|=|Σj=1Nλjvjui,j|  (2.26)
Consider now, the variety Y of all tuples,

(W, v2, . . . , vN, λ2, . . . , λN)  (2.27)

as above. Because v2≠0 and λ2≠1 this variety is locally isomorphic to the real (2N(M−N)+3N−3)-dimensional variety,

CN(M−N)×(C\{0})×(C)N−2×(T1\{1})×(T1)N−2  (2.28)


The locus in Gr(N, M)C of planes satisfying the property (*) is denoted by X. This variety is the image, under projection to the first factor, of the subvariety of Y cut out by the M−N equations (2.26) for N+1≦i≦M. The analysis of equation (2.26) is summarized by the following result.


Lemma 3.2. The M−N equations of (2.26) are independent. Hence X is a variety of real dimension at most 2N(M−N)+3N−3−(M−N).


Proof of Lemma 3.2. For any choice of 0≠v2, v3 . . . vN and 1≠λ2, λ3 . . . λN the equation,

|Σj=1Nvjui,j|2=|Σj=1Nλjvjui,j|2  (2.29)
is non-degenerate. Because the variables ui,1 . . . ui,N appear in exactly one equation, these equations (for fixed v2, v3 . . . vN, λ2, λ3 . . . λN) define a subvariety of CN(M−N) of real codimension at least M−N. Because this is true for all choices, it follows that the equations are independent.


From this Lemma, it follows that the locus of N-planes satisfying (*) has real dimension at most 2N(M−N)+3N−3−(M−N). Therefore, if 3N−3−(M−N)<0, i.e., if M≧4N−2, this locus cannot be all of Gr(N, M)C, thus ending the proof of Theorem 3.1.


The main result in the complex case then follows from Theorem 3.1.


Theorem 3.3 (Complex Frames). If M≧4N−2 then MF is injective for a generic frame F={f1, f2 . . . fM}.


Lemma 3.2 yields the following result.


Theorem 3.4. If M≧2N then for a generic frame FεF[N, M; C] the set of vectors xεCN such that (MF)−1(MaF(x)) consists of one point in CN/T1 has a dense interior in CN.


Proof. From Lemma 3.2, for a generic frame the M−N equations of (2.26) in the 2(N−1) indeterminates (v2 . . . vN, λ2 . . . λN) are independent. Note there are 3(N−1) real valued unknowns and M−N equations. Hence, the set of {(v2 . . . vN)} in CN−1 for which there are (λ2 . . . λN) such that equation (2.26) has a solution in (C\{0})×(C)N−2×(T1\{1})×(T1)N−2 has real dimension at most 3(N−1)−(M−N)=4N−3−M. For M≧2N it follows 3(N−1)−(M−N)<2(N−1), which shows the set of v=(v1 . . . vN) such that (MF)−1(MaF(v)) has more than one point is thin in CN, i.e., its complement has a dense interior.


The optimal bound for the complex case is thus believed to be 4N−2. However, this case differs from the real case in that complex frames with only 2N−1 elements cannot have MF injective. To see this, first note that the proof of Theorem 2.8, (1)→(2), does not use the fact that the frames are real. So in the complex case, one has:


Proposition 3.5. If {fj}jεI is a complex frame and MF is injective, then for every S⊂{1, 2 . . . M}, if LS∩W≠0 then (LS)C∩W={0}. Hence, for every such S, either {fj}jεS or {fj}jεSC spans H.


Now it will be shown that complex frames must contain at least 2N elements for MF to be injective.


Proposition 3.6 (Complex Frames). If MF is injective then M≧2N.


Proof. Assume M=2N−1; it will be shown that in this case MF is not injective. Let {zj}j=1N be a basis for W and let P be the orthogonal projection onto the first N−1 unit vectors in CM. Then {Pzj}j=1N sits in an (N−1)-dimensional space and so there are complex scalars {aj}j=1N, not all zero, so that Σj=1NajPzj=0. In other words, there is a vector 0≠yεW with support y⊂{N, N+1 . . . 2N−1}. Similarly, there is a vector 0≠xεW with support x⊂{1, 2 . . . N}. If x(N)=0 or y(N)=0 we contradict Proposition 3.5. In addition, if x(i)=0 for all i<N, then (y−cx)(N)=0 for c=y(N)x̄(N)/|x(N)|2.
Now, x and y−cx are in W and have disjoint support, so the map is not injective. Otherwise, let,

z=(x̄(N)/|x(N)|2)x, w=i(ȳ(N)/|y(N)|2)y  (2.30)
Now, z, wεW and z(N)=1 and w(N)=i. Hence, |z(j)+w(j)|=|z(j)−w(j)| for all j. It follows that there is a complex number c with |c|=1 so that z+w=c(z−w). Because z(i)≠0 for some i<N (where w(i)=0), c=1 and thus w=0, which is a contradiction because w(N)=i≠0.


Thus, in accordance with an exemplary embodiment of the present invention, by constructing new classes of Parseval frames for a Hilbert space, an original signal that includes an unknown source component and a noise component is reconstructed without using the noisy phase of its transform or an estimate thereof. Therefore, by using the modulus information available in the transformed domain of a signal, signal reconstruction may take place without an estimate of the signal phase.


It is to be further understood that because some of the constituent system components and method steps depicted in the accompanying figures may be implemented in software, the actual connections between the system components (or the process steps) may differ depending on the manner in which the present invention is programmed. Given the teachings of the present invention provided herein, one of ordinary skill in the art will be able to contemplate these and similar implementations or configurations of the present invention.


It should also be understood that the above description is only representative of illustrative embodiments. For the convenience of the reader, the above description has focused on a representative sample of possible embodiments, a sample that is illustrative of the principles of the invention. The description has not attempted to exhaustively enumerate all possible variations. That alternative embodiments may not have been presented for a specific portion of the invention, or that further undescribed alternatives may be available for a portion, is not to be considered a disclaimer of those alternate embodiments. Other applications and embodiments can be implemented without departing from the spirit and scope of the present invention.


It is therefore intended, that the invention not be limited to the specifically described embodiments, because numerous permutations and combinations of the above and implementations involving non-inventive substitutions for the above can be created, but the invention is to be defined in accordance with the claims that follow. It can be appreciated that many of those undescribed embodiments are within the literal scope of the following claims, and that others are equivalent.

Claims
  • 1. A method for nonlinear signal enhancement, comprising: performing a linear transformation on a measured signal comprising a source component and a noise component;determining a modulus of the linear transformed signal;estimating a noise-free part of the linear transformed signal; andreconstructing the source component of the measured signal using the noise-free part of the linear transformed signal.
  • 2. The method of claim 1, wherein the step of reconstructing the source component of the measured signal, comprises: performing a nonlinear transformation on the noise-free part of the linear transformed signal;determining a sign of the source component of the measured signal;determining a product of the nonlinear transformed signal and the sign; andperforming an overlap-add procedure using the product of the nonlinear transformed signal and the sign.
  • 3. The method of claim 1, wherein the linear transformation is one of a Fourier transform and a wavelet transform.
  • 4. The method of claim 1, wherein the noise-free part of the linear transformed signal is estimated using one of a Wiener filtering technique and an Ephraim-Malah estimation technique.
  • 5. The method of claim 1, wherein the noise-free part of the linear transformed signal is estimated by solving:
  • 6. The method of claim 1, wherein the step of reconstructing the source component of the measured signal, comprises: defining a three layer neural network by:
  • 7. The method of claim 6, further comprising: iterating:
  • 8. The method of claim 1, wherein the noise-free part of the linear transformed signal is estimated by solving:
  • 9. The method of claim 1, wherein the step of reconstructing the source component of the measured signal, comprises: (i) setting k=0, Y0=Y;(ii) computing zk=UYk;(iii) computing W=Tzk;(iv) computing Y0 using:
  • 10. The method of claim 1, further comprising: outputting the reconstructed source component of the measured signal.
  • 11. A system for nonlinear signal enhancement, comprising: a memory device for storing a program;a processor in communication with the memory device, the processor operative with the program to:perform a linear transformation on a measured signal comprising a source component and a noise component;determine a modulus of the linear transformed signal;estimate a noise-free part of the linear transformed signal; andreconstruct the source component of the measured signal using the noise-free part of the linear transformed signal.
  • 12. The system of claim 11, wherein when the source component of the measured signal is reconstructed the processor is further operative with the program code to: perform a nonlinear transformation on the noise-free part of the linear transformed signal;determine a sign of the source component of the measured signal;determine a product of the nonlinear transformed signal and the sign; andperform an overlap-add procedure using the product of the nonlinear transformed signal and the sign.
  • 13. The system of claim 11, wherein the measured signal is received using one of a microphone and a database comprising one of audio signals and image signals.
  • 14. The method of claim 11, wherein when the source component of the measured signal is reconstructed the processor is further operative with the program code to: define a three layer neural network by:
  • 15. The method of claim 11, wherein the noise-free part of the linear transformed signal is estimated by solving:
  • 16. The method of claim 11, wherein when the source component of the measured signal is reconstructed the processor is further operative with the program code to: (i) set k=0, Y0=Y;(ii) compute zk=UYk;(iii) compute W=Tzk;(iv) compute Y0 using:
  • 17. The system of claim 11, wherein the processor is further operative with the program code to: output the reconstructed source component of the measured signal.
  • 18. The system of claim 17, wherein the reconstructed source component of the measured signal is output to one of a loudspeaker and an automatic speech recognition system.
  • 19. A method for nonlinear signal enhancement, comprising: receiving a signal comprising a source component and a noise component;performing a linear transformation on the received signal;determining an absolute value of the linear transformed signal;estimating a noise-free part of the linear transformed signal;performing a nonlinear transformation on the noise-free part of the linear transformed signal;determining a sign of the source component of the received signal;determining a product of the nonlinear transformed signal and the sign; andperforming an overlap-add procedure on the product of the nonlinear transformed signal and the sign to form a reconstructed signal of the source component of the received signal, wherein the reconstructed signal does not comprise the noise component of the received signal; andoutputting the reconstructed signal.
  • 20. The method of claim 19, wherein the received signal is one of a speech signal and an image signal.
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 60/550,751, filed Mar. 5, 2004, a copy of which is herein incorporated by reference.

US Referenced Citations (6)
Number Name Date Kind
6343268 Balan et al. Jan 2002 B1
6952482 Balan et al. Oct 2005 B2
7088831 Rosca et al. Aug 2006 B2
7149691 Balan et al. Dec 2006 B2
7158933 Balan et al. Jan 2007 B2
7266303 Linden et al. Sep 2007 B2
Related Publications (1)
Number Date Country
20050196065 A1 Sep 2005 US
Provisional Applications (1)
Number Date Country
60550751 Mar 2004 US