This application claims the benefit and priority under 35 U.S.C. §119(a) of Korean Patent Application No. 10-2007-0094357, filed on Sep. 17, 2007, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present general inventive concept relates to a method and apparatus to encode and/or decode audio signals or video signals, and more particularly, to a method and apparatus to scalably encode and/or decode audio signals or video signals.
2. Description of the Related Art
An audio signal or a video signal can be encoded into a plurality of layers by controlling a bit rate, thereby providing scalability. When an audio signal or a video signal is encoded into a plurality of layers, the original signal can still be restored, albeit with reduced sound or image quality, by using only the part of a bitstream that corresponds to some of the layers, even when a network becomes overloaded, when a decoder cannot perform full decoding, or when a bit rate is decreased according to a user's setting.
Examples of important factors in providing scalability include transformation, quantization, bit-plane coding, data reordering, etc. Coding efficiency and sound or image quality are in a tradeoff relationship: when the coding efficiency is increased, the sound or image quality is decreased, and when the coding efficiency is decreased, the sound or image quality is increased. In order to increase coding efficiency, the overhead caused by scalability should be small; in order to increase the sound or image quality, transformation, quantization, data reordering, etc. should be optimized. Because these two goals have opposite characteristics, encoding that appropriately satisfies both the coding efficiency and the sound or image quality is needed.
In a codec that provides scalability on the basis of bit-plane coding and data reordering, the number of calculations to be executed is decreased but the quality of sound at low layers rapidly decreases. More specifically, when bit-plane coding is applied, quantization noise is increased by undecoded symbols. In addition, due to a lack of bits to be transmitted in low layers, a bandwidth that can be restored is decreased as layers become lower, and a muffled sound is generated. Therefore, the sound quality at low layers rapidly decreases.
The present general inventive concept provides a method and apparatus to scalably encode and/or decode audio signals or video signals.
Additional aspects and utilities of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
The foregoing and/or other aspects and utilities of the present general inventive concept may be achieved by providing a scalable encoding method including generating symbols by quantizing an input signal, extracting codewords corresponding to the symbols from among codewords prepared so as to have variable lengths according to a distribution of probabilities of codes and to correspond to a plurality of prepared symbols, dividing each of the extracted codewords in units of a predetermined length in an order from an upper bit to a lower bit and grouping the divided codewords, and scalably lossless-encoding the grouped codewords.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable decoding method including lossless decoding scalably encoded codes, extracting symbols by restoring codewords by arranging the lossless decoded codes, restoring symbols corresponding to codes from which symbols cannot be extracted, using a distribution of probabilities of codes that can be provided to lower bits of the arranged codes, and dequantizing the extracted or restored symbols.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a computer readable recording medium having recorded thereon a computer program to execute a scalable encoding method including generating symbols by quantizing an input signal, extracting codewords corresponding to the symbols from among codewords prepared so as to have variable lengths according to a distribution of probabilities of codes and to correspond to a plurality of prepared symbols, dividing each of the extracted codewords in units of a predetermined length in the order from an upper bit to a lower bit and grouping the divided codewords, and scalably lossless-encoding the grouped codewords.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a computer readable recording medium having recorded thereon a computer program to execute a scalable decoding method including lossless decoding scalably encoded codes, extracting symbols by restoring codewords by arranging the lossless decoded codes, restoring symbols corresponding to codes from which symbols cannot be extracted, using a distribution of probabilities of codes that can be provided to lower bits of the arranged codes, and dequantizing the extracted or restored symbols.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable encoding apparatus including a quantization unit to generate symbols by quantizing an input signal, a codeword extraction unit to extract codewords corresponding to the symbols from among codewords prepared so as to have variable lengths according to a distribution of probabilities of codes and to correspond to a plurality of prepared symbols, a grouping unit to divide each of the extracted codewords in units of a predetermined length in an order from an upper bit to a lower bit and to group the divided codewords, and a lossless encoding unit to scalably lossless-encode the grouped codewords.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable decoding apparatus including a lossless decoding unit to lossless decode scalably encoded codes, a symbol extraction unit to extract symbols by restoring codewords by arranging the lossless decoded codes, a symbol restoration unit to restore symbols corresponding to codes from which symbols cannot be extracted, using a distribution of probabilities of codes that can be provided to lower bits of the arranged codes, and a dequantization unit to dequantize the extracted or restored symbols.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable encoding method including selecting frequency components from an input signal according to a predetermined criterion, and selecting frequency components to be included in each layer from the selected frequency components and scalably encoding the selected frequency components for each layer.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable decoding method including scalably decoding frequency components that have been selected according to a predetermined criterion and encoded for each layer, and restoring a signal from the decoded frequency components.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a computer readable recording medium having recorded thereon a computer program to execute a scalable encoding method including selecting frequency components from an input signal according to a predetermined criterion, and selecting frequency components to be included in each layer from the selected frequency components and scalably encoding the selected frequency components for each layer.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a computer readable recording medium having recorded thereon a computer program to execute a scalable decoding method including scalably decoding frequency components that have been selected according to a predetermined criterion and encoded for each layer, and restoring a signal from the decoded frequency components.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable encoding apparatus including a frequency component selection unit to select frequency components from an input signal according to a predetermined criterion, and a layer encoding unit to select frequency components to be included in each layer from the selected frequency components and to scalably encode the selected frequency components for each layer.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable decoding apparatus including a frequency component decoding unit to scalably decode frequency components that have been selected according to a predetermined criterion and encoded for each layer, and a signal restoration unit to restore a signal from the decoded frequency components.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable encoding method including dividing each of components of codewords in order to form one or more groups, and scalably lossless-encoding the groups.
The components may include codes, and the dividing of each of the components of codewords may include dividing each of the codes of the codewords in units of a predetermined length in an order from an upper bit to a lower bit of the codes and grouping the divided codewords.
The scalable encoding method may further include quantizing the input signal to generate symbols, and extracting the codewords corresponding to the symbols from among codewords to have variable lengths according to a distribution of probabilities of codes and to correspond to a plurality of prepared symbols.
The components of the codewords may include a number of bits in order, and the groups may comprise a first group having first ones of the bits of the respective codewords and a second group having second ones of the bits of the respective codewords.
The first group may be a base layer, and the second group may be an enhancement layer.
The components may include frequency components, the groups may include one or more layers, and the dividing of each of the components of codewords may include selecting the frequency components from the input signal according to a predetermined criterion, and selecting the frequency components to be included in each layer from the selected frequency components and scalably encoding the selected frequency components for each layer.
Each layer may include a first layer and a second layer having the selected frequency components, the selected frequency components of the first layer may be most perceptually important frequency components, and the selected frequency components of the second layer may be less perceptually important frequency components.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable decoding method including lossless decoding an input signal of scalably encoded codes to generate one or more groups, and restoring one or more codewords by arranging each of components of the groups in order.
The scalable decoding method may further include restoring symbols according to the restored codewords, and the components may include codes.
The scalable decoding method may further include restoring symbols corresponding to codes of the codewords from which the symbols cannot be extracted, using a distribution of probabilities of codes that can be provided to lower bits of the arranged codes, and dequantizing the extracted or restored symbols.
The components of the groups may include a number of bits in order, and the codewords may include a first codeword having first ones of the bits of the respective groups and a second codeword having second ones of the bits of the respective groups.
The first codeword may be a first symbol, and the second codeword may be a second symbol.
The components may include frequency components, the groups may include one or more layers, and the lossless decoding of the input signal and the restoring of the codewords may include scalably decoding the frequency components that have been selected according to a predetermined criterion and encoded for each layer.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a method of a scalable coding system, the method including encoding an input signal, the encoding of the input signal including dividing each of components of codewords in order to form one or more groups, and scalably lossless-encoding the groups, and decoding a second input signal, the decoding of the second input signal including lossless decoding the second input signal of scalably encoded codes to generate one or more groups, and restoring one or more codewords by arranging each of components of the groups in order.
The foregoing and/or other aspects and utilities of the present general inventive concept may also be achieved by providing a scalable coding system, including an encoding apparatus to divide each of components of codewords in order to form one or more groups, and to scalably lossless-encode the groups, and a decoding apparatus to lossless decode an input signal of scalably encoded codes to generate one or more groups, and to restore one or more codewords by arranging each of components of the groups in order.
The above and other features and advantages of the present general inventive concept will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
Scalable encoding and decoding methods and apparatuses according to the present general inventive concept will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the general inventive concept are shown.
Reference will now be made in detail to the embodiments of the present general inventive concept, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present general inventive concept by referring to the figures.
In operation 100, an input signal is quantized to generate symbols.
In operation 110, codewords corresponding to the symbols generated in operation 100 are extracted from codewords that have been previously stored so as to have variable lengths (i.e., variable number of bits of 1 and/or 0) and to correspond to a plurality of prepared symbols. Here, a codeword is a collection of codes that are sequentially arranged. Examples of a code include binary numbers of 0 and 1.
The variable lengths of the previously stored codewords are determined on the basis of the statistical probabilities that symbols are generated. A codeword corresponding to a symbol that is highly likely to be quantized is short, and a codeword corresponding to a symbol that is unlikely to be quantized is long. In other words, a codeword corresponding to a symbol that is statistically frequently quantized can be expressed by using only a small number of bits, whereas a codeword corresponding to a symbol that is statistically rarely quantized should be expressed by using a large number of bits.
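As a non-limiting sketch of how such a variable-length table might be prepared, the following Python fragment uses a standard Huffman construction, which assigns shorter codewords to more probable symbols; the quantizer symbols and probabilities used here are assumptions for illustration only:

```python
import heapq

def build_variable_length_codes(symbol_probs):
    """Assign shorter codewords to more probable symbols (Huffman construction).

    symbol_probs: dict mapping quantizer symbol -> probability that the
    quantizer produces that symbol.  Returns dict symbol -> codeword string.
    """
    # Each heap entry: (subtree probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(symbol_probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # least probable subtree
        p1, _, codes1 = heapq.heappop(heap)   # next least probable subtree
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical quantizer statistics: frequent symbols receive short codewords.
print(build_variable_length_codes({0: 0.4, 2: 0.3, 5: 0.2, 8: 0.1}))
```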
In addition, the codewords to be previously stored are formed on the basis of a tree structure in which the probability that each code is prepared is allocated to each node. The probability that each of the codes of a codeword is extracted is calculated using the probability of code extraction allocated to each node in the tree structure, according to Equations 1 and 2.
where pi(x) denotes a probability that x is extracted in an i-th node. Here, a node indicates each code of the codeword, and the i-th node indicates an i-th bit from the MSB of the codeword.
where pi(x) denotes a probability that x is extracted in an i-th node.
The tree structure is illustrated in the conceptual diagram of
In operation 120, codes are extracted by dividing the extracted codewords in units of a pre-set length in an order from most significant bits (MSBs) to least significant bits (LSBs), and the extracted codes are grouped into one or more code groups. For example, it is assumed that the codewords ‘000’, ‘01’, and ‘1100’ are extracted in operation 110 and the preset length is one bit. In this case, as illustrated in
In operation 130, the code groups generated in operation 120 are scalably lossless-encoded. In operation 130, a group of codes corresponding to MSBs is set as an uppermost layer and lossless encoded, and a group of codes corresponding to bits lower than the MSBs is set as a layer lower than the uppermost layer and lossless encoded. Referring to
The lossless encoding performed in operation 130 may be arithmetic encoding. An embodiment in which arithmetic encoding is performed in a scalable encoding method according to the present general inventive concept will now be described with reference to
Probabilities P1a(0), P2a(0), and P3a(0) that the codes of the codeword ‘000’ corresponding to the symbol ‘0’ are extracted are calculated as described below by using the tree structure illustrated in
The probability P2a(0) that the second code ‘0’ of the codeword ‘000’ is extracted is calculated using Equation 3.
Next, the probability P3a(0) that the third code ‘0’ of the codeword ‘000’ is extracted is calculated using Equation 4.
Therefore, the probabilities P1a(0), P2a(0), and P3a(0) that the codes of the codeword ‘000’ are extracted are calculated as 0.6, 0.5, and ⅓. Similarly, the probabilities P1b(0) and P2b(1) that the codes of the codeword ‘01’ corresponding to the symbol ‘2’ are extracted are calculated as 0.6 and 0.5. The probabilities P1c(1), P2c(1), P3c(1), and P4c(0) that the codes of the codeword ‘1110’ corresponding to the symbol ‘5’ are extracted are calculated as 0.4, 0.5, 0.5, and 0.5. In operation 120, the codes [0, 0, 1] corresponding to the MSBs of the codewords ‘000’, ‘01’, and ‘1100’ extracted in operation 110 are grouped into the first group 300, the codes [0, 1, 1] corresponding to the second bits of the codewords ‘000’, ‘01’, and ‘1100’ are grouped into the second group 310, the codes [0, X, 0] corresponding to the third bits of the codewords ‘000’, ‘01’, and ‘1100’ are grouped into the third group 320, and the codes [X, X, 0] corresponding to the fourth bits of the codewords ‘000’, ‘01’, and ‘1100’ are grouped into the fourth group 330.
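A minimal Python sketch of this bit-position grouping (using the example codewords and a one-bit group length; 'X' marks positions at which a codeword has already ended) might look as follows:

```python
def group_codewords(codewords, group_len=1):
    """Split each codeword into group_len-bit slices, MSB first, and collect
    the n-th slice of every codeword into group n (layer n)."""
    longest = max(len(cw) for cw in codewords)
    n_groups = -(-longest // group_len)          # ceiling division
    groups = []
    for n in range(n_groups):
        start, stop = n * group_len, (n + 1) * group_len
        # 'X' marks codewords that contribute no bits at this position.
        groups.append([cw[start:stop] if start < len(cw) else "X"
                       for cw in codewords])
    return groups

for n, group in enumerate(group_codewords(["000", "01", "1100"])):
    print("base layer" if n == 0 else f"enhancement layer {n}", group)
# base layer ['0', '0', '1']
# enhancement layer 1 ['0', '1', '1']
# enhancement layer 2 ['0', 'X', '0']
# enhancement layer 3 ['X', 'X', '0']
```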
In operation 130, the base layer is encoded by arithmetic encoding by using the probabilities P1a(0), P1b(0), and P1c(1) (=0.6, 0.6, 0.4) corresponding to the codes of the first group, the first enhancement layer is encoded by arithmetic encoding by using the probabilities P2a(0), P2b(1), and P2c(1) (=0.5, 0.5, 0.5) corresponding to the codes of the second group, the second enhancement layer is encoded by arithmetic encoding by using the probabilities P3a(0), X, and P3c(0) (=⅓, X, 0.5) corresponding to the codes of the third group, and the third enhancement layer is encoded by arithmetic encoding by using the probabilities X, X, and P4c(0) (=X, X, 0.5) corresponding to the codes of the fourth group.
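A simplified floating-point sketch of such per-layer arithmetic encoding is given below; a practical coder would use integer arithmetic with renormalization, and the convention that each listed probability is the probability of the code actually observed is an assumption drawn from the example values above:

```python
def arithmetic_encode(codes, probs):
    """Encode one group (layer) of binary codes into a single value in [0, 1).

    codes: e.g. ['0', '0', '1'] (base-layer codes of the example above)
    probs: probability of each listed code, e.g. [0.6, 0.6, 0.4]
    """
    low, high = 0.0, 1.0
    for code, p in zip(codes, probs):
        p_zero = p if code == "0" else 1.0 - p   # probability mass of code '0'
        split = low + (high - low) * p_zero
        low, high = (low, split) if code == "0" else (split, high)
    return (low + high) / 2.0                    # any value inside [low, high)

# Base layer of the example: codes [0, 0, 1] with probabilities 0.6, 0.6, 0.4.
print(arithmetic_encode(["0", "0", "1"], [0.6, 0.6, 0.4]))
```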
Here, the probability may be a probability of each code of a codeword, a probability of a code of an n-th group or a layer, or a probability of the codeword corresponding to the symbol.
Referring to
The lossless decoding performed in operation 500 may be arithmetic decoding. In this case, in operation 500, values decimally encoded using a tree structure in which the probability that each code is prepared is allocated to each node are arithmetically decoded into binary codes. The tree structure is illustrated in the conceptual diagram of
In operation 510, codewords are restored by arranging the codes in each layer that have been lossless decoded. More specifically, the encoder encodes the codes of a plurality of codewords into a plurality of layers by dividing and grouping the codes of the codewords in units of a predetermined number of bits in the order from MSBs to LSBs. Accordingly, in operation 510, the original codewords are restored by extracting codes corresponding to upper bits from an upper layer and codes corresponding to lower bits from a lower layer and arranging the codes in an order from the upper bits to the lower bits.
In operation 520, codewords not completely restored are detected from the codewords restored in operation 510. The not completely restored codewords are collections of codes that cannot be dequantized due to an omission of the codes corresponding to the lower bits caused by a failure in the reception of bitstreams corresponding to predetermined layers by a decoding end. For example, if a symbol quantized in the encoder is 8, the codeword ‘1111110’ has been encoded by including only the code ‘1’, which is the MSB of the codeword ‘1111110’, in the base layer, by including the codes ‘11’, which are the second and third bits thereof, in the first enhancement layer, and by including the codes ‘1110’, which are the fourth through seventh bits thereof, in the second enhancement layer. However, when the decoding end receives only the base layer and the first enhancement layer, only the codes ‘111’ corresponding to upper bits from among the codes of the codeword ‘1111110’ are decoded. That is, the codes ‘1110’ corresponding to the other bits are not decoded. Consequently, the codeword ‘1111110’ is not completely restored.
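The reassembly and detection of operations 510 and 520 can be sketched as follows; the codebook is an assumed, illustrative subset containing only codewords mentioned in this description:

```python
# Assumed, illustrative codebook: codeword string -> symbol.
CODEBOOK = {"000": 0, "01": 2, "1111110": 8}

def reassemble(received_layers, n_codewords):
    """Rebuild per-codeword bit strings from the layers actually received.

    received_layers: one list per layer, each holding the bit slice that each
    codeword contributed to that layer ('X' = no bits at this position).
    """
    restored = ["" for _ in range(n_codewords)]
    for layer in received_layers:
        for k, piece in enumerate(layer):
            if piece != "X":
                restored[k] += piece
    return restored

def classify(restored):
    """Separate completely restored codewords from incomplete prefixes."""
    complete, incomplete = {}, []
    for bits in restored:
        if bits in CODEBOOK:
            complete[bits] = CODEBOOK[bits]   # symbol can be extracted directly
        else:
            incomplete.append(bits)           # needs probabilistic restoration
    return complete, incomplete

# Only the base layer ('1') and the first enhancement layer ('11') of the
# codeword '1111110' (symbol 8) were received:
print(classify(reassemble([["1"], ["11"]], 1)))   # -> ({}, ['111'])
```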
In operation 530, symbols corresponding to the codewords determined to be completely restored in operation 510 are extracted. In an embodiment of the present general inventive concept, the symbols corresponding to the completely restored codewords may be extracted from a prestored table formed on the basis of the tree structure illustrated in
In operation 540, symbols corresponding to the codewords determined to be not completely restored in operation 510 are calculated using a distribution of the probabilities of codes that may be prepared for lower bits of the codewords not completely restored in operation 520. Here, the lower bits of the codewords may correspond to codes of an n-th group compared to codes of an n−1 th group.
In operation 540, an average value of all of the symbols that are restorable by the codes that can be prepared for the lower bits of the not completely restored codewords is calculated using the distribution of the probabilities of the codes that may be prepared for the lower bits of the not completely restored codewords.
where ynew denotes the calculated symbols, ‘nodes’ denotes a collection of node numbers that can be prepared for the lower bits of the not completely restored codewords, ‘i’ denotes a specific node number among the node numbers included in the ‘nodes’, yi denotes a symbol of the node i, and pi denotes the probability of the node i.
For example, if the codes restored in operation 510 are ‘1110’, nodes that can complete codewords by combining lower bits correspond to symbols ‘5’, ‘6’, ‘7’, ‘8’, and ‘9’ encompassed by a dotted box 530 illustrated in
In operation 540, the symbols ‘5’, ‘6’, ‘7’, ‘8’, and ‘9’ corresponding to the codewords not completely restored in operation 510 may be calculated using Equation 5, as expressed in Equation 6.
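Since Equations 5 and 6 are not reproduced here, the following sketch assumes, consistently with the description above, that the restored symbol is the probability-weighted average (expected value) of the candidate symbols; the candidate probabilities shown are hypothetical, as the actual values depend on the codeword tree:

```python
def restore_symbol(candidates):
    """Estimate a symbol for an incompletely restored codeword.

    candidates: list of (symbol, probability) pairs for every node that could
    still complete the codeword given the bits already decoded.  The estimate
    used here is the probability-weighted average of the candidate symbols.
    """
    total = sum(p for _, p in candidates)
    return sum(s * p for s, p in candidates) / total

# Hypothetical probabilities for the candidate symbols 5..9 of the example:
print(restore_symbol([(5, 0.4), (6, 0.25), (7, 0.2), (8, 0.1), (9, 0.05)]))
```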
As described above, the symbols are restored by calculating expected values for all of the undecodable codewords whose lower bits can be combined. Thus, quantization noise is minimized, and accordingly an optimal signal to noise ratio (SNR) can be obtained.
In operation 550, the original signal is restored by dequantizing the symbols extracted in operation 530 and the symbols calculated in operation 540.
The quantization unit 600 quantizes an input signal received via an input port IN so as to generate symbols.
The storage unit 605 stores codewords that are prepared so as to have variable lengths and to correspond to a plurality of prepared symbols. Here, a codeword is a collection of codes that are sequentially arranged. Examples of a code include binary numbers of 0 and 1.
The variable lengths of the codewords previously stored in the storage unit 605 are determined on the basis of the statistical probabilities that symbols are generated. A codeword corresponding to a symbol that is highly likely to be quantized is short, and a codeword corresponding to a symbol that is unlikely to be quantized is long. In other words, a codeword corresponding to a symbol that is statistically frequently quantized can be expressed by using only a small number of bits, whereas a codeword corresponding to a symbol that is statistically rarely quantized should be expressed by using a large number of bits.
In addition, the codewords to be previously stored in the storage unit 605 are formed on the basis of a tree structure in which the probability that each code is prepared is allocated to each node. The probability that each of the codes of a codeword is extracted is calculated using the probability of code extraction allocated to each node in the tree structure, according to Equations 7 and 8.
where pi(x) denotes a probability that x is extracted in an i-th node.
where pi(x) denotes a probability that x is extracted in an i-th node.
The tree structure is illustrated in the conceptual diagram of
The codeword extraction unit 610 extracts the codewords corresponding to the symbols generated by the quantization unit 600 from the storage unit 605.
Referring to the tree structure illustrated in
The grouping unit 620 extracts codes by dividing the extracted codewords in units of a pre-set length in the order from MSBs to LSBs and groups the extracted codes. For example, it is assumed that the codewords ‘000’, ‘01’, and ‘1100’ are extracted by the codeword extraction unit 610 and the preset length is one bit. In this case, as illustrated in
The lossless encoding unit 630 scalably lossless encodes code groups generated by the grouping unit 620. The lossless encoding unit 630 sets a group of codes corresponding to MSBs as an uppermost layer and lossless encodes the uppermost layer. The lossless encoding unit 630 sets a group of codes corresponding to bits lower than the MSBs as a layer lower than the uppermost layer and lossless encodes the lower layer. Referring to
The lossless encoding performed in the lossless encoding unit 630 may be arithmetic encoding. An embodiment in which arithmetic encoding is performed in a scalable encoding apparatus according to the present general inventive concept will now be described with reference to
Probabilities P1a(0), P2a(0), and P3a(0) that the codes of the codeword ‘000’ corresponding to the symbol ‘0’ are extracted are calculated as described below by using the tree structure illustrated in
Next, the probability P3a(0) that the third code ‘0’ of the codeword ‘000’ is extracted is calculated using Equation 10.
Therefore, the probabilities P1a(0), P2a(0), and P3a(0) that the codes of the codeword ‘000’ are extracted are calculated as 0.6, 0.5, and ⅓. Similarly, the probabilities P1b(0) and P2b(1) that the codes of the codeword ‘01’ corresponding to the symbol ‘2’ are extracted are calculated as 0.6 and 0.5. The probabilities P1c(1), P2c(1), P3c(1), and P4c(0) that the codes of the codeword ‘1110’ corresponding to the symbol ‘5’ are extracted are calculated as 0.4, 0.5, 0.5, and 0.5. The grouping unit 620 groups the codes [0, 0, 1] corresponding to the MSBs of the codewords ‘000’, ‘01’, and ‘1100’ extracted by the codeword extraction unit 610 into the first group 300, the codes [0, 1, 1] corresponding to the second bits of the codewords ‘000’, ‘01’, and ‘1100’ into the second group 310, the codes [0, X, 0] corresponding to the third bits of the codewords ‘000’, ‘01’, and ‘1100’ into the third group 320, and the codes [X, X, 0] corresponding to the fourth bits of the codewords ‘000’, ‘01’, and ‘1100’ into the fourth group 330.
The lossless encoding unit 630 encodes the base layer by performing arithmetic encoding using the probabilities P1a(0), P1b(0), and P1c(1) (=0.6, 0.6, 0.4) corresponding to the codes of the first group, encodes the first enhancement layer by performing arithmetic encoding using the probabilities P2a(0), P2b(1), and P2c(1) (=0.5, 0.5, 0.5) corresponding to the codes of the second group, encodes the second enhancement layer by performing arithmetic encoding using the probabilities P3a(0), X, and P3c(0) (=⅓, X, 0.5) corresponding to the codes of the third group, and encodes the third enhancement layer by performing arithmetic encoding using the probabilities X, X, and P4c(0) (=X, X, 0.5) corresponding to the codes of the fourth group.
The storage unit 700 stores symbols corresponding to codewords prepared so as to have variable lengths. The variable lengths of the codewords previously stored in the storage unit 700 are determined on the basis of the statistical probabilities that symbols are generated. A codeword corresponding to a symbol that is highly likely to be quantized is short, and a codeword corresponding to a symbol that is unlikely to be quantized is long. In other words, a codeword corresponding to a symbol that is statistically frequently quantized can be expressed by using only a small number of bits, whereas a codeword corresponding to a symbol that is statistically rarely quantized should be expressed by using a large number of bits.
In addition, the codewords to be previously stored in the storage unit 700 are formed on the basis of a tree structure in which the probability that each code is prepared is allocated to each node. The tree structure is illustrated in the conceptual diagram of
The lossless decoding unit 705 receives a bitstream scalably encoded by an encoder via an input port IN and lossless decodes codes in each layer included in the bitstream.
The lossless decoding performed in the lossless decoding unit 705 may be arithmetic decoding. In this case, the lossless decoding unit 705 may arithmetically decode into binary codes values that have been stored in the storage unit 700 and decimally encoded using a tree structure in which the probability that each code is prepared is allocated to each node.
The codeword restoration unit 710 restores codewords by arranging the codes in each layer that have been lossless decoded by the lossless decoding unit 705. More specifically, the encoder encodes the codes of a plurality of codewords into a plurality of layers by dividing and grouping the codes of the codewords in units of a predetermined number of bits in the order from MSBs to LSBs. Accordingly, the codeword restoration unit 710 restores the original codewords by extracting codes corresponding to upper bits from an upper layer and codes corresponding to lower bits from a lower layer and arranging the codes in the order from the upper bits to the lower bits.
The codeword detection unit 720 detects not completely restored codewords from the codewords restored by the codeword restoration unit 710. The not completely restored codewords are collections of codes that cannot be dequantized due to an omission of the codes corresponding to the lower bits caused by a failure in the reception of bitstreams corresponding to predetermined layers by a decoding end. For example, if a symbol quantized in the encoder is 8, the codeword ‘1111110’ has been encoded by including only the code ‘1’, which is the MSB of the codeword ‘1111110’, in the base layer, by including the codes ‘11’, which are the second and third bits thereof, in the first enhancement layer, and by including the codes ‘1110’, which are the fourth through seventh bits thereof, in the second enhancement layer. However, when the decoding end receives only the base layer and the first enhancement layer, only the codes ‘111’ corresponding to upper bits from among the codes of the codeword ‘1111110’ are decoded. That is, the codes ‘1110’ corresponding to the other bits are not decoded. Consequently, the codeword ‘1111110’ is not completely restored.
In an embodiment of the present general inventive concept, a determination of completely restored codewords or incompletely restored codewords performed by the codeword detection unit 720 may be based on whether the codewords restored by the codeword restoration unit 710 are consistent with the codewords stored in the storage unit 700.
The symbol extraction unit 730 extracts symbols corresponding to codewords determined to be completely restored codewords by the codeword detection unit 720 from the storage unit 700.
The symbol restoration unit 740 calculates symbols corresponding to codewords determined to be not completely restored codewords by the codeword detection unit 720 by using a distribution of the probabilities of codes that may be prepared for lower bits of the codewords determined to be the not completely restored codewords. The symbol restoration unit 740 calculates an average value of all of the symbols that are restorable by the codes that can be prepared for the lower bits of the not completely restored codewords, by using the distribution of the probabilities of the codes that may be prepared for the lower bits of the not completely restored codewords.
where ynew denotes the calculated symbols, ‘nodes’ denotes a collection of node numbers that can be prepared for the lower bits of the not completely restored codewords, ‘i’ denotes a specific node number among the node numbers included in the ‘nodes’, yi denotes a symbol of the node i, and pi denotes the probability of the node i.
For example, if the codes restored by the codeword restoration unit 710 are ‘1110’, nodes that can complete codewords by combining lower bits correspond to the symbols ‘5’, ‘6’, ‘7’, ‘8’, and ‘9’ encompassed by the dotted box 530 illustrated in
As described above, the symbols are restored by calculating expected values for all of the undecodable codewords whose lower bits can be combined. Thus, quantization noise is minimized, and accordingly an optimal SNR can be obtained.
The dequantization unit 750 restores the original signal by dequantizing the symbols extracted by the symbol extraction unit 730 and the symbols calculated by the symbol restoration unit 740, and outputs the restored signal via an output port OUT.
First, in operation 800, an input signal in a time domain is transformed into a frequency domain so as to generate a spectrum.
In operation 810, frequency components are selected from the spectrum generated in operation 800 according to a preset criterion. The preset criterion denotes an important factor for human perception that has been preset according to experience or experimentation. In an embodiment of the present general inventive concept, frequency components corresponding to a signal having a signal-to-mask ratio (SMR) greater than a masking threshold may be selected. In another embodiment, frequency components may be selected by extracting a spectrum peak in consideration of a predetermined weight value. In still another embodiment, frequency components having peak values equal to or greater than a predetermined value are selected from subbands having low SNR values. Each of the aforementioned three embodiments may be performed, but a combination of at least two of the embodiments may also be performed.
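One of the selection criteria above (peak extraction in consideration of a weight value) can be sketched as follows; the neighbourhood size, weight, and toy spectrum are assumptions made for illustration:

```python
import numpy as np

def select_frequency_components(spectrum, weight=1.5):
    """Select perceptually important spectral lines: local peaks whose
    magnitude exceeds `weight` times the average magnitude of their
    neighbourhood.  Masking-threshold or low-SNR-subband criteria could be
    combined with (or substituted for) this peak criterion."""
    mag = np.abs(spectrum)
    selected = []
    for k in range(1, len(mag) - 1):
        neighbourhood = mag[max(0, k - 4):k + 5]
        is_peak = mag[k] >= mag[k - 1] and mag[k] >= mag[k + 1]
        if is_peak and mag[k] > weight * neighbourhood.mean():
            selected.append(k)            # keep the bin index (location info)
    return selected

# Toy spectrum: four strong components over a flat floor.
spec = np.full(256, 0.05)
spec[[20, 57, 130, 200]] = [3.0, 2.0, 1.5, 1.0]
print(select_frequency_components(spec))  # -> [20, 57, 130, 200]
```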
In operation 820, from among the selected frequency components that have not been encoded in layers higher than an i-th layer, frequency components to be included in the i-th layer are selected, the number of the selected frequency components corresponding to a bit rate allocated to the i-th layer.
Here, the value ‘i’ is set as an initial value of 1. Accordingly, in operation 820, the most perceptually important frequency components to be included in a base layer corresponding to a first layer are first selected. When 2 is allocated as the value ‘i’ after performing operations 830 through 860, frequency components to be included in a first enhancement layer corresponding to a second layer are selected from frequency components other than the frequency components selected for the base layer. For example, referring to
In operation 820, first, as illustrated in a block 1010, the frequency component SC0 is selected as the most perceptually important frequency component to be included in the base layer corresponding to the first layer. Next, as illustrated in a block 1020, the frequency component SC1 is selected as the most perceptually important frequency component to be included in the first enhancement layer corresponding to the second layer, from among frequency components that remain after the selection of the frequency component SC0 from the frequency components SC0, SC1, SC2, and SC3. Then, as illustrated in a block 1030, the frequency component SC2 is selected as the most perceptually important frequency component to be included in the second enhancement layer corresponding to the third layer, from among frequency components that remain after the selection of the frequency components SC0 and SC1 from the frequency components SC0, SC1, SC2, and SC3. Lastly, the frequency component SC3 is selected as the most perceptually important frequency component to be included in the third enhancement layer corresponding to the fourth layer. Consequently, in operation 820, all of the frequency components SC0, SC1, SC2, and SC3 selected in operation 810 are selected to be included in the first through fourth layers as illustrated in a block 1040.
When a frequency component to be included in the i-th layer is selected in operation 820, frequency components perceptually important in terms of human hearing are selected to be included in upper layers, whereas frequency components perceptually not important in terms of human hearing are selected to be included in lower layers. A criterion used in operation 820 may be similar to the preset criterion used in operation 810 except that the value of a threshold corresponding to the preset criterion varies according to layers. For example, a threshold corresponding to the preset criterion for an upper layer is set to be high, whereas a threshold corresponding to the preset criterion for a lower layer is set to be low. Operation 820 is not limited to the aforementioned embodiments of operation 810 but may use different criteria depending on a perceptual importance in terms of human hearing for each layer.
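A greedy per-layer assignment consistent with this description can be sketched as follows; the per-component bit cost and the importance scores are assumptions, since the actual bit accounting depends on the encoder:

```python
def assign_to_layers(components, importance, bits_per_layer, bits_per_component=10):
    """Distribute the selected frequency components over the layers.

    components:     identifiers (e.g. bin indices) of the selected components
    importance:     perceptual importance score of each component
    bits_per_layer: bit budget of each layer, base layer first
    Components are taken in decreasing importance; each layer takes as many of
    the remaining components as its budget allows, at an assumed constant cost
    per component (value plus location information).
    """
    remaining = sorted(zip(components, importance), key=lambda ci: -ci[1])
    layers = []
    for budget in bits_per_layer:
        n = budget // bits_per_component
        layers.append([c for c, _ in remaining[:n]])
        remaining = remaining[n:]
    return layers

# SC0..SC3 with assumed importance scores and one component per layer:
print(assign_to_layers(["SC0", "SC1", "SC2", "SC3"], [4, 3, 2, 1], [10, 10, 10, 10]))
# -> [['SC0'], ['SC1'], ['SC2'], ['SC3']]
```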
Alternatively, referring back to
Referring back to
In operation 840, the frequency components selected for the i-th layer in operation 820 and information about the locations of the frequency components are encoded into the i-th layer, and the parameter calculated in operation 830 is parametrically encoded into the i-th layer.
In operation 850, it is determined whether the i-th layer is the last layer to be encoded.
In operation 860, when it is determined in operation 850 that the i-th layer is not the last layer, 1 is added to the value ‘i’. The operations 820 through 850 are repeated for an (i+1)th layer.
In operation 870, when it is determined in operation 850 that there are no layers to be encoded, all of the encoded layers, from the base layer to the last layer, which are encoded in operation 840, are scalably multiplexed to generate a bitstream.
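Because the container format is not specified here, the following sketch assumes a simple length-prefixed framing; it illustrates only the scalability property that a truncated bitstream still yields the base layer and any complete enhancement layers:

```python
import struct

def multiplex_layers(encoded_layers):
    """Concatenate per-layer payloads, base layer first, each preceded by a
    4-byte length field (this framing is an assumption made for the sketch)."""
    stream = b""
    for payload in encoded_layers:
        stream += struct.pack(">I", len(payload)) + payload
    return stream

def demultiplex_layers(stream):
    """Recover as many complete layers as a (possibly truncated) stream holds."""
    layers, pos = [], 0
    while pos + 4 <= len(stream):
        (length,) = struct.unpack(">I", stream[pos:pos + 4])
        if pos + 4 + length > len(stream):
            break                         # this layer was cut off in transit
        layers.append(stream[pos + 4:pos + 4 + length])
        pos += 4 + length
    return layers

bitstream = multiplex_layers([b"base", b"enh1", b"enh2"])
print(demultiplex_layers(bitstream[:18]))  # truncated -> [b'base', b'enh1']
```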
First, in operation 900, a bitstream scalably encoded by an encoder is demultiplexed.
In operation 910, frequency components included in a demultiplexed i-th layer are decoded. Here, the value ‘i’ is set as an initial value of 1. Accordingly, in operation 910, the most perceptually important frequency components included in a base layer corresponding to a first layer are first decoded. If a decoding end receives a first enhancement layer corresponding to a second layer, frequency components that are included in the first enhancement layer and are of secondary importance to the frequency components included in the base layer are decoded.
For example, referring to
If the encoder selects and encodes frequency components for each layer in the order from an upper layer to a lower layer according to perceptual importance, decoding is performed in operation 910 as illustrated in
If the encoder selects and encodes frequency components for each layer on the basis of a frequency, that is, in the order from a low frequency to a high frequency, decoding is performed in operation 910 as illustrated in
In operation 920, it is determined whether an i-th layer decoded in operation 910 is the last layer demultiplexed in operation 900.
In operation 930, when it is determined in operation 920 that the i-th layer is not the last layer, 1 is added to the value ‘i’. Operations 910 and 920 are repeated for an (i+1)th layer.
In operation 940, the remaining frequency components corresponding to the frequency components not decoded in operation 910 are parametrically decoded. Examples of a parameter decoded in operation 940 include the energy value of the remaining frequency components included in each subband, parameters indicating the noise level and envelope of each subband, and so on. For example, in operation 940, an energy value for each subband of the frequency components that remain after excluding the frequency components decoded for the first through i-th layers in operation 910 is decoded.
In operation 950, the frequency components decoded for the first through i-th layers in operation 910 and the remaining frequency components parametrically decoded in operation 940 are mixed together.
In operation 960, a spectrum generated from the frequency components mixed in operation 950 is transformed from the frequency domain to the time domain, thereby restoring the original signal.
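The mixing of the layer-decoded frequency components with the parametrically decoded remaining components can be sketched as follows; the noise-fill realisation, subband size, and example values are assumptions, and the inverse transform back to the time domain is omitted:

```python
import numpy as np

def mix_and_restore(n_bins, decoded, subband_energies, subband_size=16, seed=0):
    """Rebuild a spectrum from layer-decoded components plus parametric fill.

    decoded:          {bin index: decoded spectral value} from received layers
    subband_energies: parametrically decoded energy per subband, used here to
                      shape random noise in the bins no layer covered
    """
    rng = np.random.default_rng(seed)
    spectrum = np.zeros(n_bins)
    for b, energy in enumerate(subband_energies):
        lo, hi = b * subband_size, min((b + 1) * subband_size, n_bins)
        missing = [k for k in range(lo, hi) if k not in decoded]
        if missing:
            noise = rng.normal(0.0, 1.0, len(missing))
            noise *= np.sqrt(energy / np.sum(noise ** 2))  # match subband energy
            spectrum[missing] = noise
    for k, value in decoded.items():
        spectrum[k] = value               # exact decoded components take priority
    return spectrum                       # an inverse transform would follow

restored = mix_and_restore(64, {5: 2.0, 20: 1.2, 33: 0.7},
                           subband_energies=[0.3, 0.2, 0.1, 0.05])
print(np.round(restored[:8], 3))
```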
The transformation unit 1300 receives an input signal via an input port IN and transforms the input signal from the time domain to the frequency domain so as to generate a spectrum.
The frequency component selection unit 1310 selects frequency components from the spectrum generated by the transformation unit 1300 according to a preset criterion. The preset criterion denotes an important factor for human perception that has been preset according to experience or experimentation. In an embodiment of the present general inventive concept, frequency components corresponding to a signal having an SMR greater than a masking threshold may be selected. In another embodiment, frequency components may be selected by extracting a spectrum peak in consideration of a predetermined weight value. In still another embodiment, frequency components having peak values equal to or greater than a predetermined value are selected from subbands having low SNR values. Each of the aforementioned three embodiments may be performed, but a combination of at least two of the embodiments may also be performed.
The layer selection unit 1320 selects, from among the selected frequency components that have not been encoded in layers higher than an i-th layer, frequency components to be included in the i-th layer, the number of the selected frequency components corresponding to a bit rate allocated to the i-th layer.
Here, the value ‘i’ is set as an initial value of 1. Accordingly, the layer selection unit 1320 first selects the most perceptually important frequency components to be included in a base layer corresponding to a first layer. When 2 is allocated as the value ‘i’, the layer selection unit 1320 selects frequency components to be included in a first enhancement layer corresponding to a second layer from among frequency components other than the frequency components selected for the base layer from among the frequency components selected by the frequency component selection unit 1310. For example, referring to
First, as illustrated in a block 1010, the layer selection unit 1320 selects the frequency component SC0 as the most perceptually important frequency component to be included in the base layer corresponding to the first layer. Next, as illustrated in a block 1020, the layer selection unit 1320 selects the frequency component SC1 as the most perceptually important frequency component to be included in the first enhancement layer corresponding to the second layer, from among frequency components that remain after the selection of the frequency component SC0 from the frequency components SC0, SC1, SC2, and SC3 selected by the frequency component selection unit 1310. Then, as illustrated in a block 1030, the layer selection unit 1320 selects the frequency component SC2 as the most perceptually important frequency component to be included in the second enhancement layer corresponding to the third layer, from among frequency components that remain after the selection of the frequency components SC0 and SC1 from the frequency components SC0, SC1, SC2, and SC3 selected by the frequency component selection unit 1310. Lastly, the layer selection unit 1320 selects the frequency component SC3 as the most perceptually important frequency component to be included in the third enhancement layer corresponding to the fourth layer. Consequently, as illustrated in a block 1040, the layer selection unit 1320 selects all of the frequency components SC0, SC1, SC2, and SC3 selected by the frequency component selection unit 1310 so as to be included in the first through fourth layers.
When the layer selection unit 1320 selects a frequency component to be included in the i-th layer, frequency components perceptually important in terms of human hearing are selected to be included in upper layers, whereas frequency components perceptually not important in terms of human hearing are selected to be included in lower layers. A criterion used in the layer selection unit 1320 may be similar to the preset criterion used in the frequency component selection unit 1310 except that the value of a threshold corresponding to the preset criterion varies according to layers. For example, a threshold corresponding to the preset criterion for an upper layer is set to be high, whereas a threshold corresponding to the preset criterion for a lower layer is set to be low. The layer selection unit 1320 is not limited to the aforementioned embodiments of the frequency component selection unit 1310 but may use different criteria depending on a perceptual importance in terms of human hearing for each layer.
Alternatively, referring back to
Referring back to
The layer encoding unit 1340 encodes the frequency components selected for the i-th layer by the layer selection unit 1320 and information about the locations of the frequency components into the i-th layer and parametrically encodes the parameter calculated by the parameter calculation unit 1330 into the i-th layer.
The layer selection unit 1320, the parameter calculation unit 1330, and the layer encoding unit 1340 repeat their operations until all of the layers, from the base layer to the last layer, are encoded.
The multiplexing unit 1360 scalably multiplexes all of the encoded layers, from the base layer to the last layer, which are encoded by the layer encoding unit 1340, so as to generate a bitstream, and outputs the bitstream via an output port OUT.
The demultiplexing unit 1400 receives a bitstream scalably encoded by an encoder via an input port IN and demultiplexes the bitstream.
The frequency component decoding unit 1410 decodes frequency components included in an i-th layer demultiplexed by the demultiplexing unit 1400. Here, the value ‘i’ is set as an initial value of 1. Accordingly, the frequency component decoding unit 1410 first decodes the most perceptually important frequency components included in a base layer corresponding to a first layer. If a decoding end receives a first enhancement layer corresponding to a second layer, the frequency component decoding unit 1410 decodes frequency components that are included in the first enhancement layer and are of secondary importance to the frequency components included in the base layer.
For example, referring to
If the encoder selects and encodes frequency components for each layer in the order from an upper layer to a lower layer according to perceptual importance, the frequency component decoding unit 1410 performs decoding as illustrated in
If the encoder selects and encodes frequency components for each layer on the basis of a frequency, that is, in the order from a low frequency to a high frequency, the frequency component decoding unit 1410 performs decoding as illustrated in
The frequency component decoding unit 1410 repeats its operation until all of the layers, from the base layer to the last layer, which are demultiplexed by the demultiplexing unit 1400, are decoded.
The remaining frequency component decoding unit 1440 parametrically decodes the remaining frequency components corresponding to the frequency components not decoded by the frequency component decoding unit 1410. Examples of a parameter decoded by the remaining frequency component decoding unit 1440 include the energy value of the remaining frequency components included in each subband, parameters indicating the noise level and envelope of each subband, and so on. For example, the remaining frequency component decoding unit 1440 decodes an energy value for each subband of the frequency components that remain after excluding the frequency components decoded for the first through i-th layers by the frequency component decoding unit 1410.
The mixing unit 1450 mixes the frequency components decoded for the first through i-th layers by the frequency component decoding unit 1410 and the remaining frequency components parametrically decoded by the remaining frequency component decoding unit 1440.
The inverse transformation unit 1460 inversely transforms a spectrum generated from the frequency components mixed by the mixing unit 1450 from the frequency domain to the time domain so as to restore the original signal, and outputs the original signal via an output port OUT.
The present general inventive concept provides a method and apparatus for scalably encoding or decoding an audio signal or a video signal.
Accordingly, the encoding can be performed so that the amount of data can be adjusted to adapt to coding environments or communications environments. The encoding can also be performed so as to approach a theoretical rate-distortion (R/D) curve, so that more efficient encoding is achieved. Thus, audio signals and video signals can be encoded and decoded with high sound and image quality even by using a small number of bits.
A signal distribution is considered after binarization is performed based on a tree structure instead of bit-plane coding. Thus, a decoding end can minimize generation of quantization errors and restore the original signal.
Moreover, only information perceptually important to human hearing is sent first, without decreasing the bandwidth that can be restored from a low layer. Therefore, the quality of sound can be increased by maintaining a relatively large frequency bandwidth and decreasing quantization noise, and the degradation of compression performance is prevented.
In addition to the above described embodiments, embodiments of the present general inventive concept can also be implemented through computer readable code/instructions in/on a medium, e.g., a computer readable medium, to control at least one processing element to implement any above described embodiment. The medium can correspond to any medium/media permitting the storing and/or transmission of the computer readable code.
The computer readable code can be recorded/transferred on a medium in a variety of ways, with examples of the medium including recording media, such as magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs, or DVDs), and transmission media such as carrier waves, as well as through the Internet, for example. Thus, the medium may further be a signal, such as a resultant signal or bitstream, according to embodiments of the present general inventive concept. The media may also be a distributed network, so that the computer readable code is stored/transferred and executed in a distributed fashion. Still further, as only an example, the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.
While aspects of the present general inventive concept have been particularly illustrated and described with reference to differing embodiments thereof, it should be understood that these exemplary embodiments should be considered in a descriptive sense only and not for purposes of limitation. Any narrowing or broadening of functionality or capability of an aspect in one embodiment should not be considered as a respective broadening or narrowing of similar features in a different embodiment, i.e., descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in the remaining embodiments.
Thus, although a few embodiments of the present general inventive concept have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined in the claims and their equivalents.