A more detailed understanding of the invention may be had from the following description of preferred embodiments, given by way of example only, and to be understood in conjunction with the accompanying drawings in which:
In the following detailed description of the various embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims and their equivalents.
Referring to
The noise is calculated assuming that the reconstructed values are zero for each critical band. The noise for each critical band is calculated using the equation
Wherein X[k] are the original spectral coefficients, A[k] is the outer ear transform, and B[b] are the final excitation values.
The excitation for each critical band is computed assuming that the critical band is zeroed out. All critical bands with spectral coefficient values lower than the current critical band are also assumed to have been zeroed out for the purpose of excitation computation.
At step 120, a quantization is performed on the original spectral coefficients associated with each of the critical bands within the frame to obtain quantized spectral coefficients. In some embodiments, an encoder applies a uniform, scalar quantization step size to a block of spectral data that was previously weighted by critical bands according to a quantization matrix. Alternatively, the encoder applies a non-uniform quantization to weight the block by quantization bands, or applies the quantization matrix and the uniform, scalar quantization step size.
At step 130, an inverse quantization is performed on the obtained quantized spectral coefficients for each of the critical bands within the frame to obtain reconstructed spectral coefficients. In some embodiments, an encoder reconstructs the block of spectral data from the quantized data. For example, the encoder applies the inverse quantization to reconstruct the block, and then applies an inverse multi-channel transform to return the block to independently coded channels.
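The quantization of step 120 and the inverse quantization of step 130 can be illustrated with a minimal Python sketch. It assumes a single uniform, scalar step size and omits the quantization matrix and the inverse multi-channel transform; the function names `quantize` and `dequantize` and the sample values are hypothetical and do not come from the described embodiments.

```python
def quantize(coeffs, step):
    """Uniform scalar quantization: index of the nearest multiple of `step`.

    Assumption: one step size for the whole block (no quantization matrix).
    """
    return [int(round(c / step)) for c in coeffs]


def dequantize(indices, step):
    """Inverse quantization: map each index back to a reconstructed value."""
    return [q * step for q in indices]


# Hypothetical spectral coefficients X[k] and step size.
X = [0.9, -2.4, 3.1, 0.2]
step = 0.5

Q = quantize(X, step)      # quantized spectral coefficients
Xr = dequantize(Q, step)   # reconstructed spectral coefficients Xr[k]
```

With a uniform step size, each reconstructed value differs from the original by at most half a step, which is the error the subsequent quality measurement evaluates.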
In these embodiments, the encoder processes the reconstructed block in critical bands according to an auditory model. The number and placement of the critical bands depends on the auditory model, and may be different from the number and placement of quantization bands. By processing the block by critical bands, the encoder improves the accuracy of subsequent quality measurements.
At step 140, the method 100 determines whether to use pre-computed NER values associated with the critical bands, as a function of the obtained reconstructed spectral coefficients, or to compute new NER values for the critical bands using the original excitation values.
This step involves measuring quality of the reconstructed block, for example, measuring the NER as described above.
In some embodiments, the noise pattern between the original transform coefficients X[k] and the reconstructed transform coefficients Xr[k] is computed from their sample-by-sample differences. An outer ear transfer function A[k] is applied to each difference to obtain the distortion coefficients N[k]:
N[k]=A[k](X[k]−Xr[k])
Using the distortion coefficients N[k] thus obtained, the noise pattern in critical band ‘b’ is accumulated over the length of the critical band B[b], as described above.
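The noise pattern computation can be sketched in Python as follows. The sketch assumes that the accumulation within a critical band is a sum of squared distortion coefficients (an energy), which the text does not state explicitly, and that the bands are described by a hypothetical list `band_edges` of coefficient indices; the function name `band_noise` is likewise illustrative.

```python
def band_noise(X, Xr, A, band_edges):
    """Accumulate weighted quantization noise per critical band.

    band_edges[b] and band_edges[b + 1] delimit critical band b
    (hypothetical layout). Accumulating squared values is an assumption.
    """
    # N[k] = A[k] * (X[k] - Xr[k])
    N = [a * (x - xr) for x, xr, a in zip(X, Xr, A)]
    noise = []
    for b in range(len(band_edges) - 1):
        lo, hi = band_edges[b], band_edges[b + 1]
        noise.append(sum(n * n for n in N[lo:hi]))
    return noise


# Hypothetical data: four coefficients split into two critical bands.
noise = band_noise(
    X=[1.0, 2.0, 3.0, 4.0],
    Xr=[1.0, 1.5, 3.0, 3.0],
    A=[1.0, 1.0, 2.0, 2.0],
    band_edges=[0, 2, 4],
)
```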
In some embodiments, the excitation pattern is computed using the steps outlined below. Transform coefficients X[k] are multiplied by the outer ear transform A[k] to obtain Y[k]:
Y[k]=X[k]*A[k]
The energies of the coefficients Y[k] are summed within each critical band to obtain En[b].
Frequency smearing is performed on En[b] bands. This can involve a process of convolution of En[b] with a level dependent spreading function to obtain Ec[b]. This spreading function models the frequency masking phenomenon of the inner ear.
Time smearing is performed on Ec[b] to obtain the final excitation values E[b]. Time smearing can involve first order low pass filtering on the excitation values on a per-band basis.
E[b]=a*Eprev[b]+(1−a)*Ec[b]
Wherein Eprev[b] is the excitation value of band b in the previous frame, and a is the coefficient of the first order low pass filter.
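The excitation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the spreading function here is a fixed, level-independent set of weights (the text calls for a level dependent one), the band layout `band_edges` is hypothetical, and the function names are not taken from the described embodiments.

```python
def band_energy(X, A, band_edges):
    """En[b]: energy of Y[k] = X[k] * A[k] summed within each critical band."""
    Y = [x * a for x, a in zip(X, A)]
    return [
        sum(y * y for y in Y[band_edges[b]:band_edges[b + 1]])
        for b in range(len(band_edges) - 1)
    ]


def frequency_smear(En, kernel):
    """Ec[b]: weighted sum of neighbouring band energies.

    kernel[i] weights the band at offset (i - len(kernel) // 2) from b.
    Assumption: fixed weights, a simplification of a level dependent
    spreading function.
    """
    half = len(kernel) // 2
    Ec = []
    for b in range(len(En)):
        total = 0.0
        for i, w in enumerate(kernel):
            j = b + i - half
            if 0 <= j < len(En):
                total += w * En[j]
        Ec.append(total)
    return Ec


def time_smear(Ec, Eprev, a):
    """E[b] = a * Eprev[b] + (1 - a) * Ec[b], a first order low pass filter."""
    return [a * p + (1.0 - a) * c for p, c in zip(Eprev, Ec)]


# Hypothetical frame: four coefficients, two critical bands, no history.
En = band_energy([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0], [0, 2, 4])
Ec = frequency_smear(En, kernel=[0.5, 1.0, 0.5])
E = time_smear(Ec, Eprev=[0.0, 0.0], a=0.5)
```

A larger value of `a` gives more weight to the previous frame, so the excitation decays more slowly after a loud frame.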
At step 150, an overall perceptual distortion of the frame is computed by using either the pre-computed NER values or new NER values computed using the original excitation values based on the determination at step 140.
At step 160, the computed NER values associated with the critical bands are summed to obtain a summed NER value. At step 170, the method 100 compares the summed NER value with a target NER value and determines whether a target NER is achieved. The method 100 goes to step 180 and continues with the bit-rate loop process if the target NER is achieved. The method 100 goes to step 120 and repeats steps 120-170 if the target NER is not achieved.
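The loop formed by steps 120-170 can be sketched in Python as follows. Several points are assumptions made only for illustration: the NER of a band is taken as its accumulated squared noise divided by its excitation, the target is treated as achieved when the summed NER does not exceed it, and the quantization step size is halved on each repetition (the text does not say what changes between iterations). The sketch always recomputes the noise and does not model the choice made at step 140. The names `summed_ner` and `rate_loop` are hypothetical.

```python
def summed_ner(X, Xr, A, E, band_edges):
    """Sum over critical bands of noise[b] / E[b] (assumed NER definition)."""
    total = 0.0
    for b in range(len(band_edges) - 1):
        lo, hi = band_edges[b], band_edges[b + 1]
        noise = sum((A[k] * (X[k] - Xr[k])) ** 2 for k in range(lo, hi))
        total += noise / E[b]
    return total


def rate_loop(X, A, E, band_edges, target_ner, step=1.0, shrink=0.5,
              max_iters=32):
    """Repeat quantization until the summed NER meets the target.

    E holds excitation values computed once, before quantization.
    Shrinking the step size on failure is an assumption of this sketch.
    """
    for _ in range(max_iters):
        Q = [int(round(x / step)) for x in X]       # step 120
        Xr = [q * step for q in Q]                  # step 130
        ner = summed_ner(X, Xr, A, E, band_edges)   # steps 140-160
        if ner <= target_ner:                       # step 170
            return step, Q, ner                     # continue at step 180
        step *= shrink
    raise RuntimeError("target NER not reached within max_iters")


# Hypothetical frame with two critical bands and unit excitation.
final_step, Q, ner = rate_loop(
    X=[0.9, -2.4, 3.1, 0.2],
    A=[1.0, 1.0, 1.0, 1.0],
    E=[1.0, 1.0],
    band_edges=[0, 2, 4],
    target_ner=0.05,
)
```

Because the excitation values `E` are passed in rather than recomputed inside the loop, each iteration costs only a quantization pass and a noise accumulation.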
Although the method 100 includes steps 110-180 that are arranged serially in the exemplary embodiments, other embodiments of the present subject matter may execute two or more acts in parallel, using multiple processors or a single processor organized as two or more virtual machines or sub-processors. Moreover, still other embodiments may implement the acts as two or more specific interconnected hardware modules with related control and data signals communicated between and through the modules, or as portions of an application-specific integrated circuit. Thus, the exemplary process flow diagrams are applicable to software, firmware, and/or hardware implementations.
Various embodiments of the present invention can be implemented in software, which may be run in the environment shown in
A general computing device, in the form of a computer 210, may include a processor 202, memory 204, removable storage 201, and non-removable storage 214. Computer 210 additionally includes a bus 205 and a storage area network interface (NI) 212.
Computer 210 may include or have access to a utility computing environment that includes one or more computing servers 240 and one or more disk arrays 260, a SAN 250 and one or more communication connections 220 such as a network interface card or a USB connection. The computer 210 may operate in a networked environment using the communication connection 220 to connect to the one or more computing servers 240. A remote server may include a personal computer, server, router, network PC, a peer device or other network node, and/or the like. The communication connection may include a Local Area Network (LAN), a Wide Area Network (WAN), and/or other networks.
The memory 204 may include volatile memory 206 and non-volatile memory 208. A variety of computer-readable media may be stored in and accessed from the memory elements of computer 210, such as volatile memory 206 and non-volatile memory 208, removable storage 212 and non-removable storage 214. Computer memory elements can include any suitable memory device(s) for storing data and machine-readable instructions, such as read only memory (ROM), random access memory (RAM), erasable programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), hard drive, removable media drive for handling compact disks (CDs), digital video disks (DVDs), diskettes, magnetic tape cartridges, memory cards, Memory Sticks™, and the like; chemical storage; biological storage; and other types of data storage.
“Processor” as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, explicitly parallel instruction computing (EPIC) microprocessor, a graphics processor, a digital signal processor, or any other type of processor or processing circuit. The term also includes embedded controllers, such as generic or programmable logic devices or arrays, application specific integrated circuits, single-chip computers, smart cards, and the like.
Embodiments of the present invention may be implemented in conjunction with program modules, including functions, procedures, data structures, application programs, etc., for performing tasks, or defining abstract data types or low-level hardware contexts.
Machine-readable instructions stored on any of the above-mentioned storage media are executable by the processor 202 of the computer 210. For example, a computer program 225 may comprise machine-readable instructions capable of measuring perceptual noise according to the teachings and herein described embodiments of the present invention. In one embodiment, the computer program 225 may be included on a CD-ROM and loaded from the CD-ROM to a hard drive in non-volatile memory 208. The machine-readable instructions cause the computer 210 to measure perceptual noise according to the various embodiments of the present invention.
The perceptual noise estimation technique of the present invention is modular and flexible in terms of usage in the form of a “Distributed Configurable Architecture”. As a result, parts of the perceptual estimation system may be placed at different points of a network, depending on the model chosen. For example, the technique can be deployed in a server and the input and output instructions streamed over from a client to the server and back, respectively. Such flexibility allows faster deployment to provide a cost effective solution to changing business needs.
The above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those skilled in the art. The scope of the invention should therefore be determined by the appended claims, along with the full scope of equivalents to which such claims are entitled.
The above-described methods and apparatus provide various embodiments for encoding characters. It is to be understood that the above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the subject matter should, therefore, be determined with reference to the following claims, along with the full scope of equivalents to which such claims are entitled. The above-described process reduces the complexity of computing perceptual noise by about 40-50% relative to traditional quantization techniques, after accounting for the initial calculation of the noise-to-excitation ratio for each band as described above. The above-described process avoids the conventional iterative excitation computation, because the excitation values are computed only once, prior to quantization.
As shown herein, the present invention can be implemented in a number of different embodiments, including various methods, a circuit, an I/O device, a system, and an article comprising a machine-accessible medium having associated instructions.
Other embodiments will be readily apparent to those of ordinary skill in the art. The elements, algorithms, and sequence of operations can all be varied to suit particular requirements. The operations described above with respect to the method 100 illustrated in
It is emphasized that the Abstract is provided to comply with 37 C.F.R. §1.72(b) requiring an Abstract that will allow the reader to quickly ascertain the nature and gist of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
The above-described implementation is intended to be applicable, without limitation, to situations where improvement to an OFDM system is sought, considering the use of SFO estimation. The description hereinabove is intended to be illustrative, and not restrictive. The various embodiments of the method of improving the OFDM system described herein are applicable generally to any OFDM system, and the embodiments described herein are in no way intended to limit the applicability of the invention. Many other embodiments will be apparent to those skilled in the art. The scope of this invention should therefore be determined by the appended claims as supported by the text, along with the full scope of equivalents to which such claims are entitled.
Number | Date | Country | Kind |
---|---|---|---|
IN 1295/CHE/2006 | Jul 2006 | IN | national |