The invention relates to a linking unit according to the preamble of claim 1. The linking unit serves for generating linking information indicating components of consecutive (typically overlapping) extended segments sp and sc which may be linked together in order to form a sinusoidal track, the segments sp and sc approximating consecutive segments of a sinusoidal audio or speech signal s.
The invention further relates to a parametric encoder according to the preamble of claim 8 and a method for generating said linking information according to the preamble of claim 9.
In the prior art, two substantially different approaches are known for providing the linking information L used to establish sinusoidal tracks over consecutive segments. According to a first approach, as described in WO 00/79519 (PHN 017502 EP.P), partial signals of an original audio or speech signal are reconstructed based on sinusoidal input data including amplitude, frequency and phase information from a previous and a current segment. These reconstructed partial signals are compared with the original audio or speech signal. The weighted mean-squared error signal was proposed as a criterion to select relevant links, i.e. to generate the linking information L.
This first approach does not only take amplitude and frequency information into account for optimally linking consecutive segments but also considers phase information of the components of the previous and the current segment. However, the drawback of this first approach is its computational burden and the fact that the original signal is required to generate the linking information.
According to a second approach known in the art the linking information is generated by only considering the amplitude and the frequency information from the sinusoidal code data from the current and the previous segment but not their phase information. Said second approach is now described by referring to
Consequently, the linking information L indicates those pairs of components of consecutive extended segments which may be linked together when restoring the audio or speech signal s after storage or transmission such that transitions between consecutive segments or components thereof are as smooth as possible. Smooth transitions lead to an improved quality of the restored signal.
Hereinafter linked components continuing over consecutive segments are referred to as sinusoidal track even if the separate components include slight variations, e.g. amplitude or frequency variations.
An advanced application of that second approach has been described by B. Edler, H. Purnhagen, and C. Ferekidis, in “ASAC-Analysis/synthesis codec for very low bit rates”, Preprint 4179 (F-6) 100th AES Convention, Copenhagen, 11–14 May, 1996.
In that article the authors propose a combination of relative distances in frequency and amplitudes as an additional criterion for generating the linking information. In other words, the linking information indicates if and which components of the previous and the current segment are considered to be local estimates belonging to the same sinusoidal track.
Advantageously according to the second approach the generation of the linking information is done without considering the original audio or speech signal; however, since generation of the linking information according to the second approach is based on estimated sinusoidal code data only, the generated linking information may be wrong and incorrect tracks may be provided.
Starting from said second approach it is the object of the present invention to further develop a known linking unit, a parametric encoder and a method for generating linking information such that the selection of components of consecutive segments suitable for being linked together is improved resulting in a definition of a correct sinusoidal track.
That object is solved by the subject matter of claim 1. According to the characterising portion of claim 1, enlarged sinusoidal code data shall be provided comprising not only amplitude and frequency information but also information about the phase of at least some of the M components xm and at least some of the N components yn. Further, the calculating unit of the linking unit is adapted to calculate the similarity matrix S(m,n) by additionally considering the phase consistency between the m'th component xm of the extended previous segment sp and the n'th component yn of the extended current segment sc.
Advantageously, the proposed linking unit uses only estimated sinusoidal code data, including phase information, for generating the linking information. By additionally considering the phase information, a more accurate determination of the similarity matrix and thus a more reliable determination of the linking information (in comparison to the second approach known in the art) is possible without considering the original audio or speech signal s.
According to a first embodiment the calculating unit comprises a first pattern generating unit for generating said M complex components xm(t) of the extended previous segment sp and a second pattern generating unit for generating said N complex components yn(t) of the extended current segment sc. The explicit calculation of these complex and time-dependent components is required according to the invention in order to be able to evaluate the phase consistency between each of said components of the previous and of the current segment.
Advantageously, the calculating module is adapted to calculate the similarity matrix S(m,n) as a product of a first similarity matrix S1(m,n) representing the similarity in shape and a second similarity matrix S2(m,n) representing the similarity in amplitude between the components m and n. Further advantageous embodiments of the linking unit are the subject matter of the dependent claims 4 to 7.
The object of the invention is further solved by a parametric encoder according to claim 8 and a method for generating linking information according to claim 9. The advantages of the parametric encoder and of the method substantially correspond to the advantages mentioned above with reference to the linking unit.
Five figures are accompanying the description, wherein
Before a preferred embodiment of the invention is described by referring to the figures, a preliminary remark is made to provide some background information about the sinusoidal modelling of the signal segments in general.
In sinusoidal modelling, the models are typically of the form (or can be rewritten as such)
$$\mathrm{seg}(t) = \Re\Big\{ \sum_{k=1}^{K} u_k(t) \Big\} \qquad (1)$$
where seg is a segment approximating or modelling a segment of a sinusoidal signal s. In these models the segment seg is represented by an extension as given on the right-hand side of equation (1), wherein $\Re$ denotes the real part of a complex variable and uk are the K underlying sinusoidal or sinusoidal-like segment components of the segment seg.
In particular, for a pure first sinusoidal model (extension), the segment's components are
$$u_k(t) = A_k e^{j(\omega_k t + \mu_k)}$$
with Ak, ωk and μk the (real-valued) amplitude, frequency and phase, respectively, and j=√(−1).
According to a second model the components of the segment are defined as:
$$u_k(t) = A_k e^{(\sigma_k + j\omega_k)t + j\mu_k} \qquad (2)$$
where Ak, ωk and μk are as in the pure sinusoidal model and an additional parameter σk appears. σk is a real parameter which captures amplitude changes within a segment.
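The first two models can be illustrated with a minimal Python sketch. It assumes a discrete time index t in samples and takes the segment value as the real part of the sum of its components, as the text around equation (1) describes. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import cmath

def component(t, amp, omega, phase, sigma=0.0):
    """Complex component A * exp((sigma + j*omega) * t + j*phase), as in equation (2).

    With sigma = 0 this reduces to the pure sinusoidal model.
    """
    return amp * cmath.exp((sigma + 1j * omega) * t + 1j * phase)

def segment(t, params):
    """Segment value at time t: real part of the sum of all components."""
    return sum(component(t, *p) for p in params).real

# Hypothetical parameters per component: (amplitude, frequency in rad/sample, phase, sigma).
params = [
    (1.0, 0.30, 0.0, 0.0),    # pure sinusoid, constant amplitude
    (0.5, 0.90, 1.2, -0.01),  # slowly decaying sinusoid (sigma < 0)
]
samples = [segment(t, params) for t in range(8)]
```

A negative sigma makes the magnitude of a component decay within the segment, whereas sigma = 0 keeps it constant at the amplitude A.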
A third, more elaborate known model based on polynomials is:
with real parameters bk,m and Φk,n or complex amplitudes Bk,m=bk,mejΦ
Finally, according to a fourth model, the components of the segments are defined as:
with real parameters θk,n and complex parameters Ck,m.
If two consecutive signal segments sp and sc (previous and current segment, respectively) are considered then there is typically an overlap in their support. Hereinafter uk in the previous segment is denoted by xm (m=1, . . . , M) and uk in the current segment is denoted by yn(n=1, . . . , N). In order that profitable (in a coding sense) links are established, it seems reasonable to speak of a link between a component m from sp and a component n from sc only if xm(t) and yn(t) are similar within the overlap area.
In the following preferred embodiments of the invention will be described by referring to
The calculating unit 120 does not only receive sinusoidal code data in the form of amplitude and frequency data of the previous and the current segment but receives enlarged sinusoidal code data further comprising information about the phase of each of the M components xm of the previous segment sp and each of the N components yn of the current segment sc.
Consequently, the calculating unit 120 is adapted to calculate the similarity matrix S(m,n) not only by considering the amplitude and frequency data but additionally by considering the phase consistency between the m'th component xm of the extended previous segment sp and the n'th component yn of the extended current segment sc for m=1 . . . M and n=1 . . . N. The evaluating unit 140 receives and evaluates the similarity matrix S(m,n) output from said calculating unit 120 in order to generate said linking information L by selecting those pairs of components (m,n) the similarity of which is maximal.
The components xm(t) and yn(t) are explicitly generated and input to the calculation module 126 in order to determine the phase consistency between two components m and n and to use that phase consistency information for calculating the similarity matrix.
In the following two embodiments of the invention will be described for carrying out the calculation of the similarity matrix S(m,n). Both embodiments have in common that the similarity matrix is preferably but not necessarily calculated by multiplying a first similarity matrix S1(m,n) representing the similarity in shape between the two components m and n with a second similarity matrix S2(m,n) representing the similarity in amplitude between said components m and n. Then the similarity matrix is calculated according to:
S(m,n)=S1(m,n)S2(m,n). (5)
S(m,n)=0 means that there is no link and the larger S(m,n) is, the more likely it is that this can be exploited profitably as a link in a sinusoidal coding scheme.
The first embodiment for calculating the similarity matrix S is based on the consideration of the similarity of the previous and the current segment within a complete overlapping area. The aim of said first embodiment is to identify components of the previous and the current segment which are similar. This can be done by a correlation method. Thus, according to the first embodiment a correlation coefficient ρm,n is defined by
where xm (m=[1,M]) represents the set of components xm of the previous segment sp and yn (n=[1,N]) represents the set of components yn of the current segment sc. Further, w(t) represents a window function and Exm represents the energy in the component xm according to:
Analogously, Eyn represents the energy in the component yn according to
Consequently, ρm,n is a complex number which, for a link, should be close to 1. Therefore, the first similarity matrix S1(m,n) is built as a (partial) similarity measure by:
with 0<D1<1.
Additionally, the equivalence in amplitude (or, more particular, in energy) can be taken into account by considering:
Again, for a link, Rm,n should be a value close to 1 (in contrast to ρm,n, Rm,n is real-valued), and S2(m,n) can act as a similarity measure, defined by
with 0<D2<1.
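The following Python sketch illustrates the correlation-based idea of the first embodiment. The formulas are assumptions, since the original equations are not reproduced here: the correlation coefficient is taken as the windowed inner product of the two components normalised by the square root of their windowed energies, the amplitude measure R as the ratio of the smaller to the larger energy, and each partial similarity as a hypothetical linear ramp that falls to zero at a distance D1 or D2 from the ideal value 1. All function names are illustrative.

```python
import cmath
import math

def energy(x, w):
    """Windowed energy of a complex component (assumed form)."""
    return sum(wi * abs(xi) ** 2 for xi, wi in zip(x, w))

def correlation(x, y, w):
    """Complex windowed correlation coefficient (assumed form); 1 means identical shape and phase."""
    num = sum(wi * xi * yi.conjugate() for xi, yi, wi in zip(x, y, w))
    return num / math.sqrt(energy(x, w) * energy(y, w))

def ramp(distance, d):
    """Hypothetical mapping: 1 at distance 0, falling linearly to 0 at distance d."""
    return max(0.0, 1.0 - distance / d)

def similarity(x, y, w, d1=0.5, d2=0.5):
    """S = S1 * S2 for one pair of components over the overlap area."""
    rho = correlation(x, y, w)
    ex, ey = energy(x, w), energy(y, w)
    r = min(ex, ey) / max(ex, ey)   # assumed real-valued amplitude measure
    s1 = ramp(abs(1 - rho), d1)     # shape and phase consistency
    s2 = ramp(1 - r, d2)            # amplitude (energy) consistency
    return s1 * s2

# Overlap area of T samples with a strictly positive Hann-like window.
T = 32
w = [0.5 - 0.5 * math.cos(2 * math.pi * (t + 0.5) / T) for t in range(T)]
x = [cmath.exp(1j * (0.30 * t + 0.2)) for t in range(T)]
y_same = list(x)                 # same amplitude, frequency and phase
y_flipped = [-v for v in x]      # same amplitude and frequency, phase shifted by pi

s_same = similarity(x, y_same, w)
s_flipped = similarity(x, y_flipped, w)
```

Under these assumptions the phase-shifted component receives no similarity although its amplitude and frequency match exactly, which is the case that a criterion based on amplitude and frequency alone cannot distinguish.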
If the previous segment sp is represented by M components and if the current segment sc is represented by N components, the first matrix S1 and the second matrix S2 as well as the overall similarity matrix S are M×N matrices. The entries of said matrix S establish if there exist links and, if so, which are the most profitable ones. The most profitable ones are the ones the similarity values of which are maximal. This evaluation of the similarity matrix S(m,n) is done in the evaluating unit 140.
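One possible way to evaluate the similarity matrix is sketched below in Python. The text only states that the pairs with maximal similarity are selected; the greedy one-to-one strategy, the zero-based indices and the name `select_links` are assumptions made for this illustration.

```python
def select_links(S):
    """Greedily pick pairs (m, n) in order of decreasing similarity.

    Each component of the previous segment (row m) and of the current
    segment (column n) is linked at most once; entries equal to zero
    never produce a link.
    """
    candidates = sorted(
        ((S[m][n], m, n)
         for m in range(len(S))
         for n in range(len(S[m]))
         if S[m][n] > 0.0),
        reverse=True,
    )
    used_m, used_n, links = set(), set(), []
    for _, m, n in candidates:
        if m not in used_m and n not in used_n:
            links.append((m, n))
            used_m.add(m)
            used_n.add(n)
    return links

# Hypothetical 2 x 3 similarity matrix (M = 2 previous, N = 3 current components).
S = [
    [0.9, 0.2, 0.0],
    [0.3, 0.0, 0.6],
]
links = select_links(S)
```

In this example the current component with index 1 remains unlinked, so it would mark the possible start of a new track.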
The second embodiment of the invention for calculating the similarity matrix S represents a simplification of the first embodiment. More specifically, not the whole overlapping region between the consecutive segments but only the midpoint of said region is considered. At this point, hereinafter referred to as sample t0, a link requires that
xm(t0)≈yn(t0) (11)
In that second embodiment it is desired that in the neighbourhood of t0 the components are matched as well. This is realised if the progression (the stride) in the components is (nearly) the same. This is preferably evaluated by the ratio of the components of the two consecutive segments sp and sc according to
In order to select links the first (partial) similarity matrix is now defined as:
with 0<D3<1.
Here, the amplitude similarity is involved in a relative way. This agrees with psycho-acoustic relevance and distance criteria.
The second partial similarity matrix S2 is defined as:
with 0<D4<1.
The second embodiment for calculating the overall similarity matrix S differs from the first embodiment in that the components xm and yn need only be generated at specific instants, namely t0 and t0+1.
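The midpoint idea can be sketched in Python as follows. The exact ratio and the definitions of the two partial similarity matrices are not reproduced here, so the two distances below are assumptions: a relative difference of the complex values at t0, and a difference of the one-sample strides u(t0+1)/u(t0). A common time axis for both segments is also assumed, and all names and parameter values are hypothetical.

```python
import cmath

def component(t, amp, omega, phase, sigma=0.0):
    """Complex component as in equation (2)."""
    return amp * cmath.exp((sigma + 1j * omega) * t + 1j * phase)

def midpoint_distances(x_params, y_params, t0):
    """Compare two components using only the samples t0 and t0 + 1.

    Returns (value_dist, stride_dist): the relative difference of the
    complex values at t0, and the difference between the one-sample
    progressions u(t0 + 1) / u(t0) of the two components.
    """
    x0, x1 = component(t0, *x_params), component(t0 + 1, *x_params)
    y0, y1 = component(t0, *y_params), component(t0 + 1, *y_params)
    value_dist = abs(x0 - y0) / max(abs(x0), abs(y0))
    stride_dist = abs(x1 / x0 - y1 / y0)
    return value_dist, stride_dist

# Hypothetical parameters: (amplitude, frequency in rad/sample, phase).
x = (1.00, 0.30, 0.20)
y = (1.05, 0.31, 0.04)                # close to x in value and stride at t0 = 16
z = (1.00, 0.30, 0.20 + cmath.pi)     # same amplitude and frequency, opposite phase

near = midpoint_distances(x, y, 16)
opposite = midpoint_distances(x, z, 16)
```

For the component of equation (2) the stride equals exp(σ + jω), so it captures frequency and amplitude change, while the value at t0 additionally captures the phase. Small distances would then be mapped to similarities close to 1, for example with thresholds in the role of D3 and D4.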
For real audio signals it has been noted that taking phase information into account improves the quality of the coded material. However, in the encoder 400 the phase information is used only if a continuation of a track is searched. If a frequency from the data of the previous frame does not have a backward connection (i.e., it is not yet a track but may, after linking with the current frame data, become the start of a track), then the phase information is not used and the linking relies on the previous linking procedures based on frequency and amplitude data only. The reason for this is that at the start of the track the phase is usually not well-defined. This means that the linking information of the previous segment sp is input to the calculating module 126 in
Instead of looking at (relative) differences between the complex values xm and yn, the real and imaginary parts, or the amplitudes and phases, can also be examined and used to construct the similarity criterion. This has the advantage that, instead of the two parameters that control the similarity measure given above, one or more parameters per considered variable are obtained. Expressed in real parameters instead of complex ones, this typically results in twice as many parameters. For example, splitting the complex signals into amplitudes and phases has the interesting property that the similarity measure for the phases can more easily be made frequency-dependent.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Number | Date | Country | Kind |
---|---|---|---|
01200144 | Jan 2001 | EP | regional |
01202613 | Jul 2001 | EP | regional |

Number | Name | Date | Kind |
---|---|---|---|
4885790 | McAulay et al. | Dec 1989 | A |
4937873 | McAulay et al. | Jun 1990 | A |
5504833 | George et al. | Apr 1996 | A |

Number | Date | Country |
---|---|---|
WO8909985 | Oct 1989 | WO |
WO0079519 | Dec 2000 | WO |

Number | Date | Country |
---|---|---|
20020133358 A1 | Sep 2002 | US |