1. Field
Apparatuses, devices, and articles of manufacture consistent with the present disclosure relate to audio encoding and decoding, and more particularly, to a method and apparatus for processing audio signals at low complexity.
2. Description of the Related Art
In factorial pulse coding or decoding of audio signals, a value m corresponding to the number of unit magnitude pulses must be determined from the number b of bits required for each frequency band. Conventionally, an arbitrary maximum value which m can have is preset, and the value of m corresponding to b is found by iteratively increasing m by 1 from 0 up to the preset maximum value. However, such an iterative method has high complexity when the frequency band is long or the range of values of m is large.
It is an aspect to provide a method and apparatus for determining the number of unit magnitude pulses at low complexity in correspondence with the number of bits allocated to each frequency band to apply factorial pulse coding for each frequency band unit, and a multimedia device employing the same.
According to an aspect of one or more exemplary embodiments, there is provided a spectrum encoding method including: determining a number of unit magnitude pulses for factorial pulse coding based on a number of allocated bits in frequency band units for a spectrum; and performing factorial pulse coding in the frequency band units for the spectrum by using the determined number of unit magnitude pulses.
According to another aspect of one or more exemplary embodiments, there is provided an audio encoding apparatus including: a transform unit to transform an audio signal in a time domain to an audio spectrum in a frequency domain; a bit allocation unit to determine a number of allocated bits by using spectral energy in predetermined frequency band units for the audio spectrum; and an encoding unit to determine the number of unit magnitude pulses for factorial pulse coding based on the number of allocated bits for the audio spectrum and to perform factorial pulse coding in the frequency band units for the audio spectrum by using the determined number of unit magnitude pulses.
According to another aspect of one or more exemplary embodiments, there is provided a spectrum decoding method including: determining a number of unit magnitude pulses for factorial pulse coding based on a number of allocated bits in frequency band units for a spectrum; and performing factorial pulse decoding in the frequency band units for the spectrum by using the determined number of unit magnitude pulses.
According to another aspect of one or more exemplary embodiments, there is provided an audio decoding apparatus including: a bit allocation unit to determine a number of allocated bits by using spectral energy in predetermined frequency band units for an audio spectrum included in a bitstream; a decoding unit to determine a number of unit magnitude pulses for factorial pulse decoding based on the number of allocated bits in frequency band units for the audio spectrum and to perform factorial pulse decoding in the frequency band units for the audio spectrum by using the determined number of unit magnitude pulses; and an inverse transform unit to transform the audio spectrum decoded by the decoding unit to an audio signal in a time domain.
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
The present inventive concept may allow various kinds of change or modification and various changes in form, and specific exemplary embodiments will be illustrated in drawings and described in detail in the specification. However, it should be understood that the specific exemplary embodiments do not limit the present inventive concept to a specific disclosing form but include every modified, equivalent, or replaced one within the spirit and technical scope of the present inventive concept. In the following description, well-known functions or constructions are not described in detail since they would obscure the invention with unnecessary detail.
Although terms such as ‘first’ and ‘second’ can be used to describe various elements, the elements are not limited by these terms. The terms are used only to distinguish one element from another element.
The terminology used in the application is used only to describe specific exemplary embodiments and is not intended to limit the present inventive concept. The terms used in the present inventive concept are, as far as possible, general terms that are currently in wide use, selected in consideration of their functions in the present inventive concept, but they may vary according to the intention of those of ordinary skill in the art, judicial precedents, or the appearance of new technology. In addition, in specific cases, terms intentionally selected by the applicant may be used, and in this case, the meaning of the terms will be disclosed in the corresponding description of the invention. Accordingly, the terms used in the present inventive concept should be defined not by their simple names but by their meaning and the overall content of the present inventive concept.
An expression in the singular includes an expression in the plural unless they are clearly different from each other in a context. In the application, it should be understood that terms, such as ‘include’ and ‘have’, are used to indicate the existence of implemented feature, number, step, operation, element, part, or a combination of them without excluding in advance the possibility of existence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations of them.
Hereinafter, the present inventive concept will be described more fully with reference to the accompanying drawings, in which exemplary embodiments are shown. Like reference numerals in the drawings denote like elements, and thus their repetitive description will be omitted.
The audio encoding apparatus 100 shown in
Referring to
The bit allocation unit 150 may determine the number of allocated bits in each sub-band unit for the audio spectrum by using a masking threshold value, which is obtained using spectral energy or a psychoacoustic model, and the spectral energy. The term “sub-band” may indicate a unit of grouping samples of an audio spectrum and may have a uniform or non-uniform length by reflecting a critical band. In the case of non-uniform length, sub-bands may be set so that the number of samples included in each sub-band gradually increases from a first sample of one frame to the last sample of the same frame. Herein, the number of sub-bands included in one frame or the number of samples included in the sub-bands may be determined in advance. Alternatively, after one frame is divided into a predetermined number of sub-bands having a uniform length, lengths of the sub-bands may be adjusted according to a distribution of spectral coefficients. The distribution of spectral coefficients may be determined using a spectral flatness measure, a difference between a maximum value and a minimum value, a differential value of the maximum value, or the like.
According to an embodiment, the bit allocation unit 150 may estimate the number of allowable bits by using a Norm value, e.g., a mean spectral energy, obtained in each sub-band unit, allocate bits by using the mean spectral energy, and limit the number of allocated bits not to exceed the number of allowable bits. According to another embodiment, the bit allocation unit 150 may estimate the number of allowable bits in each sub-band unit by using a psychoacoustic model, allocate bits by using the mean spectral energy, and limit the number of allocated bits not to exceed the number of allowable bits.
The encoding unit 170 may generate information regarding an encoded spectrum by quantizing and lossless encoding the audio spectrum based on the finally determined number of allocated bits in each sub-band unit.
The multiplexing unit 190 may generate a bitstream by multiplexing the Norm value provided by the bit allocation unit 150 and the information regarding the encoded spectrum provided by the encoding unit 170.
The audio encoding apparatus 100 may generate a noise level for a given sub-band and provide the generated noise level to an audio decoding apparatus (700 of
The bit allocation unit 200 shown in
Referring to
The Norm encoding unit 230 may quantize and lossless encode the Norm value obtained for each sub-band unit. The Norm value quantized in each sub-band unit may be provided to the bit estimation and allocation unit 250, or a Norm value dequantized again in each sub-band unit may be provided to the bit estimation and allocation unit 250. The Norm value quantized and lossless-encoded in each sub-band unit may be provided to the multiplexing unit 190.
The bit estimation and allocation unit 250 may estimate and allocate the required number of bits by using the Norm value in each sub-band unit. Preferably, a dequantized Norm value may be used so that the same bit estimation and allocation process can be used in an encoding part and a decoding part. At this time, a Norm value adjusted by taking a masking effect into account may be used. For the adjustment of a Norm value, for example, psychoacoustic weighting applied in ITU-T G.719 may be used, but the adjustment of a Norm value is not limited thereto.
The bit estimation and allocation unit 250 may calculate a masking threshold value by using the Norm value in each sub-band unit and estimate the number of perceptually required bits by using the masking threshold value. Various well-known methods may be used for obtaining a masking threshold value by using spectral energy. That is, the masking threshold value is a value corresponding to just noticeable distortion (JND), and when quantization noise is less than the masking threshold value, perceptual noise cannot be perceived. Thus, the number of minimum bits required not to perceive perceptual noise may be calculated using the masking threshold value. According to another embodiment, the number of bits satisfying the masking threshold value may be estimated by calculating a signal-to-mask ratio (SMR) by using a ratio of the Norm value to the masking threshold value in each sub-band unit and by using a relationship of 6.025 dB≈1 bit for the SMR. Although the number of estimated bits is the number of minimum bits required not to perceive perceptual noise, since more than the number of estimated bits does not have to be used in terms of compression, the number of estimated bits may be considered as the number of maximum bits allowed in sub-band units (hereinafter, referred to as the number of allowable bits). At this time, the number of allowable bits for each sub-band may be represented in an integer unit or a decimal point unit.
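The SMR-based estimate described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that the Norm value and the masking threshold value are both expressed as linear energies (so that the ratio is converted with 10·log10), and the function name allowable_bits is hypothetical.

```python
import math

DB_PER_BIT = 6.025  # relationship quoted in the text: 6.025 dB ~ 1 bit


def allowable_bits(norm_energy, mask_threshold):
    """Estimate the allowable bits (decimal point unit) for one sub-band.

    Assumption: both arguments are linear energies, so the SMR in dB is
    10*log10(norm_energy / mask_threshold).
    """
    if norm_energy <= 0 or mask_threshold <= 0:
        return 0.0
    smr_db = 10.0 * math.log10(norm_energy / mask_threshold)
    # A signal below the masking threshold needs no bits at all.
    return max(0.0, smr_db / DB_PER_BIT)
```

For example, a sub-band whose energy is 20 dB above its masking threshold would be estimated at roughly 3.3 allowable bits under these assumptions.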
According to an embodiment, the bit estimation and allocation unit 250 may perform bit allocation of a decimal point unit by using the Norm value in each sub-band unit. At this time, bits are sequentially allocated from a sub-band having a greater Norm value, and more bits may be allocated to a perceptually important sub-band by weighting a Norm value of each sub-band according to a perceptual importance of each sub-band. The perceptual importance may be determined through, for example, psychoacoustic weighting as in ITU-T G.719.
In detail, the bit estimation and allocation unit 250 may sequentially allocate bits for each sample from a sub-band having a greater Norm value. That is, first of all, bits per sample are allocated to a sub-band having the maximum Norm value, and the Norm value of the sub-band for which bits have been allocated is decreased by a predetermined unit, and priority is changed so that bits are allocated to another sub-band. This process may be repeated until a number B of whole bits usable in a given frame is fully allocated.
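The sequential allocation described above can be sketched as a greedy loop. The sketch below is illustrative only: the function name greedy_allocate, the per-sample step size, and the amount by which the Norm value is decreased after each allocation are assumptions, since the text only refers to "a predetermined unit".

```python
def greedy_allocate(norms, band_sizes, total_bits, step=1.0, norm_decrement=1.0):
    """Allocate bits per sample, always serving the sub-band with the largest Norm.

    norms          -- Norm value of each sub-band (hypothetical log-domain values)
    band_sizes     -- number of samples in each sub-band (all > 0)
    total_bits     -- number B of bits usable in the frame
    step           -- bits per sample granted at each turn (assumed)
    norm_decrement -- amount subtracted from the Norm after a grant (assumed)
    """
    work = list(norms)
    bits_per_sample = [0.0] * len(norms)
    remaining = float(total_bits)
    active = set(range(len(norms)))
    while active:
        k = max(active, key=lambda i: work[i])  # highest-priority sub-band
        cost = step * band_sizes[k]
        if cost > remaining:
            active.discard(k)  # this sub-band can no longer be served
            continue
        bits_per_sample[k] += step
        remaining -= cost
        work[k] -= norm_decrement  # lower its priority for the next turn
    return bits_per_sample, remaining
```

With norms of 10 and 7, two sub-bands of 4 samples each, 12 usable bits, and a decrement of 2, the first sub-band receives 2 bits per sample and the second receives 1 bit per sample.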
The bit estimation and allocation unit 250 may determine the number of finally allocated bits by limiting the number of allocated bits for each sub-band not to exceed the number of estimated bits, i.e., the number of allowable bits. For all sub-bands, the number of allocated bits is compared with the number of estimated bits, and if the number of allocated bits is greater than the number of estimated bits, the number of allocated bits is limited to the number of estimated bits. As a result of the bit number limitation, if the number of bits for the whole sub-bands of the given frame is less than the number B of whole bits usable in the given frame, bits corresponding to the difference may be uniformly distributed to the whole sub-bands or non-uniformly distributed according to perceptual importance.
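The limitation and redistribution step can be sketched as follows. This is a minimal sketch that implements only the uniform redistribution option; the function name limit_and_redistribute is hypothetical, and bits are assumed to be expressed per sample.

```python
def limit_and_redistribute(allocated, allowable, band_sizes, total_bits):
    """Clamp per-sample bits to the allowable bits, then spread any surplus.

    allocated, allowable -- bits per sample for each sub-band
    band_sizes           -- number of samples in each sub-band
    total_bits           -- number B of bits usable in the frame
    """
    # Limit each sub-band to its number of allowable bits.
    limited = [min(a, cap) for a, cap in zip(allocated, allowable)]
    used = sum(b * n for b, n in zip(limited, band_sizes))
    surplus = total_bits - used
    if surplus > 0:
        # Uniform option from the text: every sample receives the same share.
        # A perceptually weighted split would replace this line.
        extra = surplus / sum(band_sizes)
        limited = [b + extra for b in limited]
    return limited
```

Note that the uniform share is added after the limitation, so a sub-band may end up above its allowable bits, as the text permits when the frame would otherwise be underused.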
Accordingly, since the number of bits of each sub-band can be determined with a decimal point unit and limited to the number of allowable bits, the number of whole bits of a given frame may be efficiently distributed.
A mathematical equation may be used to estimate and allocate the number of bits required for each sub-band. For example, the number of allocated bits per sample of each sub-band for maximizing a signal-to-noise ratio (SNR) of an input spectrum may be estimated within a range of the number B of whole bits usable in a given frame based on a solution for optimizing quantization distortion and the number of bits allocated to each sub-band. Accordingly, since the number of allocated bits in each sub-band unit can be determined at once without repeating several times, complexity can be lowered.
The bit allocation unit 300 shown in
Referring to
The bit estimation and allocation unit 330 may estimate the number of perceptually required bits by using the masking threshold value in each sub-band unit. That is, an SMR may be obtained in each sub-band unit, and the number of bits satisfying the masking threshold value may be estimated by using the relationship 6.025 dB≈1 bit for the SMR. Although the number of estimated bits is the number of minimum bits required not to perceive perceptual noise, since more bits than the number of estimated bits do not have to be used in terms of compression, the number of estimated bits may be considered as the number of maximum bits allowed in sub-band units (hereinafter, referred to as the number of allowable bits). At this time, the bit estimation and allocation unit 330 may perform bit allocation with a decimal point unit by using spectral energy in each sub-band unit.
For the whole sub-bands, the bit estimation and allocation unit 330 may compare the number of allocated bits with the number of estimated bits and limit the number of allocated bits to the number of estimated bits if the number of allocated bits is greater than the number of estimated bits. As a result of the bit number limitation, if the number of bits for the whole sub-bands of a given frame is less than the number B of whole bits usable in the given frame, bits corresponding to the difference may be uniformly distributed to the whole sub-bands or non-uniformly distributed according to perceptual importance.
The scale factor estimation unit 350 may estimate a scale factor by using the number of allocated bits that is finally determined in each sub-band unit. The scale factor estimated in each sub-band unit may be provided to the encoding unit 170.
The scale factor encoding unit 370 may quantize and lossless encode the scale factor estimated in each sub-band unit. The scale factor encoded in each sub-band unit may be provided to the multiplexing unit 190.
The bit allocation unit 400 shown in
Referring to
The bit estimation and allocation unit 430 may obtain a masking threshold value by using spectral energy and estimate the number of perceptually required bits, i.e., the number of allowable bits, by using the masking threshold value in each sub-band unit.
The bit estimation and allocation unit 430 may perform bit allocation with an integer unit or a decimal point unit by using the spectral energy in each sub-band unit.
For the whole sub-bands, the bit estimation and allocation unit 430 compares the number of allocated bits with the number of estimated bits and limits the number of allocated bits to the number of estimated bits if the number of allocated bits is greater than the number of estimated bits. As a result of the bit number limitation, if the number of allocated bits for the whole sub-bands of a given frame is less than the number B of whole bits usable in the given frame, bits corresponding to the difference may be uniformly distributed to the whole sub-bands or non-uniformly distributed according to the perceptual importance.
The scale factor estimation unit 450 may estimate a scale factor by using the number of allocated bits that is finally determined in each sub-band unit. The scale factor estimated in each sub-band unit may be provided to the encoding unit 170.
The scale factor encoding unit 470 may quantize and lossless encode the scale factor estimated in each sub-band unit. The scale factor encoded in each sub-band unit may be provided to the multiplexing unit 190.
The encoding unit 500 shown in
Referring to
The spectrum encoding unit 530 may quantize the normalized audio spectrum by using the number of allocated bits of each sub-band and lossless encode the quantized result. For example, factorial pulse coding may be used for the spectrum encoding, but the embodiment is not limited thereto. According to factorial pulse coding, information, such as a position of a pulse, a magnitude of the pulse, and a sign of the pulse, within a range of the number of allocated bits may be represented in a factorial form.
The spectral information encoded by the spectrum encoding unit 530 may be provided to the multiplexing unit 190.
The audio encoding apparatus 600 shown in
Referring to
The transform unit 630 may determine a window size to be used for a transform according to a result of the transient period detection and perform a time-frequency domain transform based on the determined window size. For example, a short window may be applied to a sub-band in which a transient period is detected, and a long window may be applied to a sub-band in which a transient period is not detected.
The bit allocation unit 650 may be implemented by using any one of the bit allocation units 200, 300, and 400 shown in
The encoding unit 670 may determine a window size to be used for encoding according to a result of the transient period detection, similar to the transform unit 630.
The audio encoding apparatus 600 may generate a noise level for a sub-band and provide the generated noise level to an audio decoding apparatus (700 of
The audio decoding apparatus 700 shown in
Referring to
The bit allocation unit 730 may obtain a dequantized Norm value from the quantized and lossless-encoded Norm value and determine the number of allocated bits by using the dequantized Norm value. The bit allocation unit 730 may operate substantially the same as the bit allocation unit 150 or 650 of the audio encoding apparatus 100 or 600. When the Norm value was adjusted through psychoacoustic weighting by the audio encoding apparatus 100 or 600, the dequantized Norm value may also be adjusted in the same manner by the audio decoding apparatus 700.
The decoding unit 750 may lossless decode and dequantize the encoded spectrum by using the information regarding the encoded spectrum, which is provided by the demultiplexing unit 710. For example, factorial pulse decoding may be used for the spectrum decoding.
The inverse transform unit 770 may generate a reconstructed audio signal by transforming the decoded spectrum into an audio signal in the time domain.
The bit allocation unit 800 shown in
Referring to
The bit estimation and allocation unit 830 may determine the number of allocated bits by using the dequantized Norm value. In detail, the bit estimation and allocation unit 830 may obtain a masking threshold value by using spectral energy, i.e., a Norm value, in each sub-band unit and predict the number of perceptually required bits, i.e., the number of allowable bits, by using the masking threshold value. The bit estimation and allocation unit 830 may perform bit allocation with an integer unit or a decimal point unit by using spectral energy, i.e., a Norm value, in each sub-band unit.
For the whole sub-bands, the bit estimation and allocation unit 830 may compare the number of allocated bits with the number of estimated bits and limit the number of allocated bits to the number of estimated bits if the number of allocated bits is greater than the number of estimated bits. As a result of the bit number limitation, if the number of allocated bits for the whole sub-bands of a given frame is less than the number B of whole bits usable in the given frame, bits corresponding to the difference may be uniformly distributed to the whole sub-bands or non-uniformly distributed according to perceptual importance.
The decoding unit 900 shown in
Referring to
The envelope shaping unit 930 may reconstruct a spectrum before normalization by envelope-shaping the normalized spectrum provided by the spectrum decoding unit 910 by using the dequantized Norm value provided by the bit allocation unit 730.
The decoding unit 1000 shown in
Referring to
The decoding unit 1100 shown in
Referring to
The envelope shaping unit 1150 may reconstruct a spectrum before normalization for the spectrum including the sub-band filled with the noise component by using the dequantized Norm value provided by the bit allocation unit 730.
The audio decoding apparatus 1200 shown in
Referring to
The scale factor decoding unit 1230 may lossless decode and dequantize the quantized and lossless-encoded scale factor in each sub-band unit.
The spectrum decoding unit 1250 may lossless decode and dequantize the encoded spectrum by using the information regarding the encoded spectrum, which is provided by the demultiplexing unit 1210, and the dequantized scale factor. The spectrum decoding unit 1250 may include the same components as the decoding unit 1000 of
The inverse transform unit 1270 may generate a restored audio signal by transforming the spectrum decoded by the spectrum decoding unit 1250 to an audio signal in the time domain.
The audio decoding apparatus 1300 shown in
In comparison with the audio decoding apparatus 700 of
Referring to
The inverse transform unit 1370 may generate a reconstructed audio signal by transforming the decoded spectrum to an audio signal in the time domain. At this time, a window size may vary according to transient signaling information.
Referring to
In operation 1420, the number m of unit magnitude pulses may be determined for each frequency band based on the number of allocated bits.
In operation 1430, a transform coefficient of an audio spectrum may be quantized by performing factorial pulse coding based on the number m of unit magnitude pulses determined for each frequency band.
In operation 1440, codewords obtained as a result of factorial pulse coding for each frequency band may be combined. The combined codewords may be provided to the multiplexing unit 190 of
The principle of factorial pulse coding will now be described.
Factorial pulse coding is a technique of efficiently coding a signal by using unit magnitude pulses and may represent the signal by using all combinations of the number of non-zero pulses, positions of the non-zero pulses, magnitudes of the non-zero pulses, and signs of the non-zero pulses. The total number N of combinations capable of representing the pulses may be expressed by Equation 1.
In Equation 1, 2^i indicates the number of sign combinations when each of the i non-zero pulses is represented as + or −, F(n, i) indicates the number of ways of selecting positions of the i non-zero pulses among n given sample positions, and D(m, i) indicates the number of ways of distributing m unit magnitude pulses over the positions of the i non-zero pulses.
In Equation 1, F(n, i) and D(m, i) may be represented by Equations 2 and 3, respectively.
The number b of bits required to represent the total number N of combinations, which is calculated by Equation 1, may be represented by Equation 4.
$$b = \log_2 N \tag{4}$$
Equation 4 may be arranged as in Equation 5.
That is, the number b of bits required to perform factorial pulse coding for an input signal vector included in an arbitrary frequency band is defined by a complex polynomial expression of n, which corresponds to a band length, and m, which corresponds to the number of unit magnitude pulses. In this case, since n is a given value, the polynomial expression can be considered as a relationship between m and b. Since the value of m cannot be calculated directly from b by using Equation 5, a method of presetting an arbitrary maximum value which m can have and finding the value of m satisfying b while increasing m by 1 from 0 up to the preset maximum value is used. Since this iteration method has high complexity when the range of values of m is large, the complexity may be reduced by applying a binary search to the iteration method. The basic principle of factorial pulse coding is disclosed in U.S. Pat. No. 6,236,960.
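The bit count and the linear iteration described above can be sketched as follows. Because Equations 1 to 3 are not reproduced in this text, the sketch assumes the usual combinatorial forms that match the verbal definitions: F(n, i) as the binomial coefficient C(n, i) and D(m, i) as C(m − 1, i − 1). The function names other than fpc_bits, and the convention that m = 0 costs 0 bits, are likewise assumptions.

```python
import math


def fpc_combinations(n, m):
    """Number N of codable vectors of length n holding m unit magnitude pulses.

    Assumed forms: F(n, i) = C(n, i) and D(m, i) = C(m - 1, i - 1).
    """
    total = 0
    for i in range(1, min(m, n) + 1):
        total += math.comb(n, i) * math.comb(m - 1, i - 1) * (1 << i)
    return total


def fpc_bits(m, n):
    """Bits b = log2(N) needed for m pulses in a band of length n."""
    if m == 0:
        return 0.0  # assumption: the all-zero vector needs no bits
    return math.log2(fpc_combinations(n, m))


def linear_search_pulses(b, n, max_pulses):
    """Largest m in [0, max_pulses] whose bit cost fits in b, found by stepping m by 1."""
    best = 0
    for m in range(1, max_pulses + 1):
        if fpc_bits(m, n) > b:
            break  # cost grows with m, so no larger m can fit
        best = m
    return best
```

Under these assumptions, a band of length 8 with 2 pulses has 128 combinations (7 bits), so a budget of 8 bits yields m = 2. The loop may run up to max_pulses times, which is the cost the text attributes to the iterative method.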
In operation 1520, it may be determined whether a difference between the minimum value Lp and the maximum value Hp is greater than 1. As a result of the determination in operation 1520, if the difference between the minimum value Lp and the maximum value Hp is equal to or less than 1, the final number num_pulse of unit magnitude pulses may be determined as a median m.
As a result of the determination in operation 1520, if the difference between the minimum value Lp and the maximum value Hp is greater than 1, a median m of the minimum value Lp and the maximum value Hp may be calculated in operation 1530.
In operation 1540, the number of bits required for factorial pulse coding with respect to the median m may be calculated and the number of calculated bits may be compared with a target value b. In operation 1540, fpc_bits(m,n) denotes a function for calculating the number of bits required for factorial pulse coding with respect to given values of m and n and corresponds to Equation 4.
As a result of the comparison in operation 1540, if the number of bits required for factorial pulse coding with respect to the median m is less than the target value b, a value greater than the median m may be required, and thus, the median m may be set as the minimum value Lp in operation 1550, and operation 1520 may be iteratively performed.
As a result of the comparison in operation 1540, if the number of bits required for factorial pulse coding with respect to the median m is greater than the target value b, a value less than the median m may be required, and thus, the median m may be set as the maximum value Hp in operation 1560, and operation 1520 may be iteratively performed.
In this case, when a maximum value in a range of values of m is MAX, the number of iterations may be ⌈log2 MAX⌉ + 1 at the most, and thus the number of iterations increases as the maximum value which m can have becomes larger.
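The binary search over m can be sketched as follows. This is a hedged illustration rather than the exact flow of operations 1520 to 1560: it returns the lower bound once the bounds differ by 1, treats a cost equal to the target as fitting, and assumes F(n, i) = C(n, i) and D(m, i) = C(m − 1, i − 1) for fpc_bits. The function name binary_search_pulses is hypothetical.

```python
import math


def fpc_bits(m, n):
    """Bits needed for m unit magnitude pulses in a band of length n (assumed forms)."""
    if m == 0:
        return 0.0
    total = sum(math.comb(n, i) * math.comb(m - 1, i - 1) * (1 << i)
                for i in range(1, min(m, n) + 1))
    return math.log2(total)


def binary_search_pulses(b, n, max_pulses):
    """Return (m, iterations): the largest m <= max_pulses with fpc_bits(m, n) <= b."""
    lo, hi = 0, max_pulses + 1  # lo always fits; hi is too costly or out of range
    iterations = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2    # median of the current bounds
        iterations += 1
        if fpc_bits(mid, n) <= b:
            lo = mid            # more pulses may still fit
        else:
            hi = mid            # too many pulses
    return lo, iterations
```

For a band of length 8, a budget of 8 bits, and MAX = 64, this sketch reaches m = 2 in 6 iterations, within the bound of ⌈log2 MAX⌉ + 1 = 7.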
First, a process of determining a maximum value of the number m of unit magnitude pulses will be described.
In detail, Equation 5 may be equivalent to Equation 6.
In Equation 6, z(m, n) may be expanded in a polynomial expression as in Equation 7.
When expansion is performed for the highest degree of m in Equation 7, Equation 6 may be represented by Equation 8.
When Equation 8 is rearranged for m, Equation 8 may be represented by Equation 9.
That is, Equation 9 gives a maximum value which the number m of unit magnitude pulses can have. The maximum value of m according to Equation 9 is closer to the final number of unit magnitude pulses than a maximum value arbitrarily determined in an initial stage, so a final value may be determined with far fewer iterations than in existing methods. Here, the minimum value may be set as 0 or 1. Alternatively, the minimum value may be determined in advance, through experiments or simulations, as a natural number close to 0.
When Equation 8 is rearranged for m, Equation 8 may also be represented by Equation 10.
In Equation 10, n denotes a length of a frequency band, b denotes the number of bits required to perform factorial pulse coding, m denotes the number of unit magnitude pulses, and F(n) denotes a function for determining a minimum value of the number of unit magnitude pulses, wherein F(n) may be determined according to a length of a frequency band, for example, determined as 2 when n is less than 9, 3 when n is less than 17, and 6 when n is less than 33. According to this, since a length of a frequency band is less than 17 in most cases, when a binary search method is used, the final number of unit magnitude pulses of each frequency band may be determined by performing a matching process once or twice.
Referring back to
$$b_1 = 1 + \log_2 n \tag{11}$$
In operation 1620, the number b of allocated bits given to the predetermined frequency band may be compared with the number b1 of bits required for coding at least one pulse. As a result of the comparison in operation 1620, if the number b of allocated bits is less than the number b1 of bits required for coding at least one pulse, the final number num_pulse of unit magnitude pulses is set as 0 without iterations.
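The value of b1 can be checked against the combination count for a single pulse. The short derivation below is a sketch that relies only on the verbal definitions of F(n, i) and D(m, i): one pulse can be placed at any of n positions, F(n, 1) = n, and can be formed from one unit magnitude pulse in exactly one way, D(1, 1) = 1.

```latex
% Single pulse (m = 1): only the i = 1 term contributes.
N\big|_{m=1} = F(n,1)\, D(1,1)\, 2^{1} = n \cdot 1 \cdot 2 = 2n
\qquad\Longrightarrow\qquad
b_1 = \log_2 (2n) = 1 + \log_2 n

% Worked example for a band of length n = 8:
b_1 = 1 + \log_2 8 = 4 \ \text{bits}
```

Thus, for a band of length 8, any allocation below 4 bits cannot code even one pulse, and the number of unit magnitude pulses is set to 0 immediately.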
In operation 1630, a maximum value of the number m of unit magnitude pulses may be acquired using Equation 9.
In operation 1640, the number of bits required for factorial pulse coding may be calculated using the maximum value of the number m of unit magnitude pulses, and a difference value diff between the number of bits required for factorial pulse coding and the number b of allocated bits may be acquired. In operation 1640, fpc_bits(m,n) denotes a function for calculating the number of bits required for factorial pulse coding with respect to given m and n.
In operation 1650, the difference value diff may be compared with 0. As a result of the comparison in operation 1650, if the difference value diff is equal to or less than 0, a corresponding value of m may be determined as the final number num_pulse of unit magnitude pulses.
As a result of the comparison in operation 1650, if the difference value diff is greater than 0, the difference value diff may be compared with a predefined threshold THR in operation 1660. The threshold THR may be determined as an optimal value through experiments or simulations.
As a result of the comparison in operation 1660, if the difference value diff is greater than the predefined threshold THR, a rough and final value of m may be determined using a binary search within a range of (min, m) in operation 1670. The method illustrated in
As a result of the comparison in operation 1660, if the difference value diff is equal to or less than the predefined threshold THR, the number of bits required for factorial pulse coding may be recalculated by decreasing a current value of m by 1, and a linear decrement process may be repeated until the number of used bits satisfies the number b of allocated bits. A fine and final value of m, which satisfies the number b of allocated bits may be determined as the number of unit magnitude pulses of a corresponding frequency band.
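Operations 1620 to 1670 and the linear decrement process can be sketched together as follows. This is an illustrative sketch under several assumptions: Equation 9 is not reproduced in this text, so the initial maximum m_max is supplied by the caller; the default threshold thr is an arbitrary placeholder rather than a tuned value; the binary search uses 1 as the minimum; and fpc_bits assumes F(n, i) = C(n, i) and D(m, i) = C(m − 1, i − 1). The function name determine_pulses is hypothetical.

```python
import math


def fpc_bits(m, n):
    """Bits needed for m unit magnitude pulses in a band of length n (assumed forms)."""
    if m == 0:
        return 0.0
    total = sum(math.comb(n, i) * math.comb(m - 1, i - 1) * (1 << i)
                for i in range(1, min(m, n) + 1))
    return math.log2(total)


def determine_pulses(b, n, m_max, thr=4.0):
    """Pick the number of pulses for a band of length n and a budget of b bits.

    m_max -- initial maximum of m (stands in for the value given by Equation 9)
    thr   -- threshold THR on the bit overshoot (placeholder value)
    """
    b1 = 1 + math.log2(n)          # bits needed to code a single pulse
    if b < b1:
        return 0                   # not even one pulse fits: no iterations
    m = m_max
    diff = fpc_bits(m, n) - b
    if diff <= 0:
        return m                   # the initial maximum already fits
    if diff > thr:
        lo, hi = 1, m              # m = 1 fits because b >= b1; m does not fit
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if fpc_bits(mid, n) <= b:
                lo = mid
            else:
                hi = mid
        return lo
    while m > 0 and fpc_bits(m, n) > b:
        m -= 1                     # small overshoot: step down by 1
    return m
```

With n = 8 and b = 8, a close initial maximum such as m_max = 3 overshoots by about 1.4 bits and is resolved by a single decrement, whereas a loose maximum such as m_max = 40 falls back to the binary search. Both paths return m = 2 in this sketch.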
Referring to
In operation 1720, the number of bits required for factorial pulse coding may be calculated using the maximum value of the number m of unit magnitude pulses, and a difference value diff between the number of calculated bits and the number b of allocated bits, for example, an absolute value of the difference value diff may be compared with a predefined threshold THR. As in
In operation 1730, when the difference value diff between the number of bits calculated using the changed value of m and the number b of allocated bits is less than the threshold THR, and the calculated number of bits is greater than the number b of allocated bits, the current value of m may be decreased by 1 until the number b of allocated bits is satisfied. A value of m that satisfies the number b of allocated bits may be determined as the number of unit magnitude pulses of the corresponding frequency band.
In operation 1740, when the difference value diff between the number of bits calculated using the changed value of m and the number b of allocated bits is less than the threshold THR, and the calculated number of bits is less than the number b of allocated bits, the current value of m may be increased by 1 until the number b of allocated bits is satisfied. A value of m that satisfies the number b of allocated bits may be determined as the number of unit magnitude pulses of the corresponding frequency band.
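Operations 1720 to 1740 can be sketched as a binary search that hands over to unit steps once the bit count is within THR of the allocation. The sketch below is illustrative only: `fpc_bits` assumes the standard factorial pulse coding count, the function name `search_pulse_count` and its parameters are hypothetical, and the code assumes that `m_min` itself fits within b bits.

```python
import math


def fpc_bits(n: int, m: int) -> int:
    """Assumed cost model: bits to index m signed unit pulses in n positions."""
    if m == 0:
        return 0
    count = sum(
        math.comb(n, d) * math.comb(m - 1, d - 1) * (1 << d)
        for d in range(1, min(n, m) + 1)
    )
    return (count - 1).bit_length()  # ceil(log2(count))


def search_pulse_count(n: int, b: int, m_min: int, m_max: int, thr: int) -> int:
    """Largest m in [m_min, m_max] whose cost fits in b bits (m_min assumed to fit)."""
    lo, hi = m_min, m_max
    m = m_max  # the first evaluation uses the maximum value, as in operation 1720
    while lo < hi:
        diff = fpc_bits(n, m) - b
        if abs(diff) < thr:
            break  # close enough: switch to unit steps
        if diff > 0:
            hi = m - 1  # too many bits: the answer lies below m
        else:
            lo = m  # m fits: the answer is m or above
        m = (lo + hi + 1) // 2
    # Operation 1730: too many bits, so decrease m by 1 until the budget is met.
    while m > m_min and fpc_bits(n, m) > b:
        m -= 1
    # Operation 1740: bits to spare, so increase m by 1 while the next value still fits.
    while m < m_max and fpc_bits(n, m + 1) <= b:
        m += 1
    return m
```

With `thr=0` the loop never breaks early and the function reduces to a plain binary search; a very large `thr` skips the binary search and walks down from `m_max`.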
In conclusion, an approximate value of m may be determined by a binary search process and then a precise value of m may be determined by a linear decrement process.
According to the embodiments of the present invention, a method and apparatus for processing audio signals at low complexity can be realized by using a mathematical equation to determine the maximum value of the number of unit magnitude pulses for each frequency band, and by selectively using a binary search between 1 and the maximum value and a linear decrement method to reduce the number of iterations. In addition, when the number of bits allocated to an arbitrary frequency band is less than the minimum number of bits required for coding one pulse, the number of unit magnitude pulses for that frequency band is set to 0, thereby reducing the processing complexity in an exceptional situation.
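The zero-pulse shortcut for under-allocated bands can be illustrated with a short Python sketch. It assumes that a single unit magnitude pulse in a band of n coefficients has n possible positions and two possible signs, so 2n codewords; that count is not stated in this passage. The names `min_bits_for_one_pulse` and `pulses_for_band`, and the `search` callback standing in for whichever search is used, are hypothetical.

```python
from typing import Callable


def min_bits_for_one_pulse(n: int) -> int:
    """Bits needed to code one signed unit pulse among n positions (2 * n codewords)."""
    return (2 * n - 1).bit_length()  # ceil(log2(2 * n)) for n >= 1


def pulses_for_band(n: int, b: int, search: Callable[[int, int], int]) -> int:
    """Return 0 pulses when b cannot code even one pulse; otherwise run the search."""
    if b < min_bits_for_one_pulse(n):
        return 0  # exceptional case: the search is skipped entirely
    return search(n, b)
```

For a band of 8 coefficients, one pulse needs 4 bits under this assumption, so any allocation of 3 bits or fewer yields m = 0 without evaluating the search.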
Table 1 illustrates a comparison of the number of iterations in a case of using a binary search method and a case of using a combination of the binary search method and a linear decrement method. It is assumed that a range of values which m can have is from 90 to 500.
The factorial pulse coding methods of
Referring to
The communication unit 1810 may receive at least one of an audio signal or an encoded bitstream provided from the outside or transmit at least one of a restored audio signal or an encoded bitstream obtained as a result of encoding by the encoding module 1830.
The communication unit 1810 is configured to transmit and receive data to and from an external multimedia device through a wireless network, such as wireless Internet, wireless intranet, a wireless telephone network, a wireless Local Area Network (LAN), Wi-Fi, Wi-Fi Direct (WFD), third generation (3G), fourth generation (4G), Bluetooth, Infrared Data Association (IrDA), Radio Frequency Identification (RFID), Ultra WideBand (UWB), Zigbee, or Near Field Communication (NFC), or a wired network, such as a wired telephone network or wired Internet.
According to an exemplary embodiment, the encoding module 1830 may generate a bitstream by transforming an audio signal in the time domain, which is provided through the communication unit 1810 or the microphone 1870, to an audio spectrum in the frequency domain, determining the number of allocated bits by using spectral energy in predetermined frequency band units for the audio spectrum, determining a number of unit magnitude pulses for factorial pulse coding based on a number of allocated bits in frequency band units for the audio spectrum, and performing factorial pulse coding in the frequency band units for the spectrum by using the determined number of unit magnitude pulses.
According to another exemplary embodiment, the encoding module 1830 may estimate a maximum value of the number of unit magnitude pulses with respect to the number of allocated bits and determine the final number of unit magnitude pulses of each frequency band by performing a binary search within a range from a minimum value to the maximum value. According to another exemplary embodiment, the encoding module 1830 may estimate a maximum value of the number of unit magnitude pulses with respect to the number of allocated bits and determine the final number of unit magnitude pulses of each frequency band by selectively performing a binary search method and a linear decrement method within a range of a minimum value and the maximum value.
The storage unit 1850 may store the encoded bitstream generated by the encoding module 1830. In addition, the storage unit 1850 may store various programs required to operate the multimedia device 1800.
The microphone 1870 may provide an audio signal from a user or the outside to the encoding module 1830.
The multimedia device 1900 of
Referring to
According to an exemplary embodiment, the decoding module 1930 may generate a restored audio signal by receiving a bitstream provided through the communication unit 1910, determining the number of allocated bits by using spectral energy in predetermined frequency band units for an audio spectrum in the bitstream, determining a number of unit magnitude pulses for factorial pulse coding based on a number of allocated bits in frequency band units for the audio spectrum, performing factorial pulse decoding in the frequency band units for the spectrum by using the determined number of unit magnitude pulses, and transforming the audio spectrum decoded by the decoding unit into an audio signal in the time domain.
According to another exemplary embodiment, the decoding module 1930 may estimate a maximum value of the number of unit magnitude pulses with respect to the number of allocated bits and determine the final number of unit magnitude pulses of each frequency band by performing a binary search within a range from a minimum value to the maximum value. According to another exemplary embodiment, the decoding module 1930 may estimate a maximum value of the number of unit magnitude pulses with respect to the number of allocated bits and determine the final number of unit magnitude pulses of each frequency band by selectively using a binary search method and a linear decrement method with respect to a range from a minimum value to the maximum value.
The storage unit 1950 may store the restored audio signal generated by the decoding module 1930. In addition, the storage unit 1950 may store various programs required to operate the multimedia device 1900.
The speaker 1970 may output the restored audio signal generated by the decoding module 1930 to the outside.
The multimedia device 2000 shown in
Since the components of the multimedia device 2000 shown in
Each of the multimedia devices 1800, 1900, and 2000 shown in
When the multimedia device 1800, 1900, or 2000 is, for example, a mobile phone, although not shown, the multimedia device 1800, 1900, or 2000 may further include a user input unit, such as a keypad, a display unit for displaying information processed by a user interface or the mobile phone, and a processor for controlling the functions of the mobile phone. In addition, the mobile phone may further include a camera unit having an image pickup function and at least one component for performing a function required for the mobile phone.
When the multimedia device 1800, 1900, or 2000 is, for example, a TV, although not shown, the multimedia device 1800, 1900, or 2000 may further include a user input unit, such as a keypad, a display unit for displaying received broadcasting information, and a processor for controlling all functions of the TV. In addition, the TV may further include at least one component for performing a function of the TV.
The methods according to the embodiments can be written as computer-executable programs and can be implemented in general-use digital computers that execute the programs by using a non-transitory computer-readable recording medium. In addition, data structures, program instructions, or data files, which can be used in the embodiments, can be recorded on a non-transitory computer-readable recording medium in various ways. The non-transitory computer-readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the non-transitory computer-readable recording medium include magnetic storage media, such as hard disks, floppy disks, and magnetic tapes, optical recording media, such as CD-ROMs and DVDs, magneto-optical media, such as optical disks, and hardware devices, such as ROM, RAM, and flash memory, specially configured to store and execute program instructions. In addition, the non-transitory computer-readable recording medium may be a transmission medium for transmitting a signal designating program instructions, data structures, or the like. Examples of the program instructions may include not only machine language codes created by a compiler but also high-level language codes executable by a computer using an interpreter or the like.
While the present invention has been particularly shown and described with reference to the exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
2012103446 | Feb 2012 | RU | national |
This application claims the benefit of Russian Patent Application No. RU2012103446, filed on Feb. 2, 2012, in the Russian Patent Office and U.S. Provisional Application No. 61/595,760, filed on Feb. 7, 2012, in the U.S. Patent Office, the disclosures of which are incorporated by reference herein in their entirety.
Number | Date | Country |
---|---|---|
61595760 | Feb 2012 | US |