This application pertains to the field of coding and/or decoding data including, for example, images, and more particularly, to the field of coding and/or decoding data using wavelet transforms and/or matching pursuits.
Digital video services such as transmitting digital video information over wireless transmission networks, digital satellite services, streaming video over the internet, delivering video content to personal digital assistants or cellular phones, etc., are increasing in popularity. Increasingly, digital video compression and decompression techniques may be implemented that balance visual fidelity with compression levels to allow efficient transmission and storage of digital video content.
The claimed subject matter will be understood more fully from the detailed description given below and from the accompanying drawings of embodiments which should not be taken to limit the claimed subject matter to the specific embodiments described, but are for explanation and understanding only.
a is a diagram depicting an example decomposition of an image in a horizontal direction.
b is a diagram depicting an image that has been decomposed in a horizontal direction and is undergoing decomposition in a vertical direction.
c is a diagram depicting an image that has been decomposed into four frequency bands.
d is a diagram depicting an image that has been decomposed into four frequency bands where one of the bands has been decomposed into four additional bands.
a is a diagram depicting an example decomposition of an image in a horizontal direction.
b is a diagram depicting an image that has undergone decomposition in a horizontal direction yielding “m” frequency bands.
c is a diagram depicting an image that has undergone decomposition in a horizontal direction and a vertical direction yielding m*m frequency bands.
a is a diagram depicting an image that has been decomposed into four frequency bands.
b is a diagram depicting the image of
a is a diagram of an example basis function and an example signal.
b is a diagram of an example residual signal.
In the following detailed description, numerous specific details are set forth to provide a thorough understanding of claimed subject matter. However, it will be understood by those skilled in the art that claimed subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components and/or circuits have not been described in detail.
A process and/or algorithm may be generally considered to be a self-consistent sequence of acts and/or operations leading to a desired result. These include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical and/or magnetic signals capable of being stored, transferred, combined, compared, and/or otherwise manipulated. It may be convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers and/or the like. However, these and/or similar terms may be associated with the appropriate physical quantities, and are merely convenient labels applied to these quantities.
Unless specifically stated otherwise, as apparent from the following discussions, throughout the specification, discussions utilizing terms such as processing, computing, calculating, determining, and/or the like refer to the actions and/or processes of a computing platform, such as a computer and/or computing system and/or similar electronic computing device, that manipulates and/or transforms data represented as physical, such as electronic, quantities within the registers and/or memories of the computing platform into other data similarly represented as physical quantities within the memories, registers, and/or other such information storage, transmission, and/or display devices of the computing system and/or other information handling system.
Embodiments claimed may include one or more apparatuses for performing the operations herein. Such an apparatus may be specially constructed for the desired purposes, or it may comprise a general purpose computing device selectively activated and/or reconfigured by a program stored in the device. Such a program may be stored on a storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), electrically programmable read-only memories (EPROMs), electrically erasable and/or programmable read only memories (EEPROMs), flash memory, magnetic and/or optical cards, and/or any other type of media suitable for storing electronic instructions, and/or capable of being coupled to a system bus for a computing device, computing platform, and/or other information handling system.
The processes and/or displays presented herein are not inherently related to any particular computing device and/or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or a more specialized apparatus may be constructed to perform the desired method. The desired structure for a variety of these systems will appear from the description below. In addition, embodiments are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings described herein.
In the following description and/or claims, the terms coupled and/or connected, along with their derivatives, may be used. In particular embodiments, connected may be used to indicate that two or more elements are in direct physical and/or electrical contact with each other. Coupled may mean that two or more elements are in direct physical and/or electrical contact. However, coupled may also mean that two or more elements may not be in direct contact with each other, but yet may still cooperate and/or interact with each other. Furthermore the term “and/or” may mean “and”, it may mean “or”, it may mean “exclusive-or”, it may mean “one”, it may mean “some, but not all”, and/or it may mean “both.”
Matching pursuits algorithms may be used to compress digital images. A matching pursuit algorithm may include finding a full inner product between a signal to be coded and each member of a dictionary of basis functions. At the position of the maximum inner product the dictionary entry giving the maximum inner product may describe the signal locally. This may be referred to as an “atom.” The amplitude is quantized, and the position, quantized amplitude, sign, and dictionary number form a code describing the atom. For one embodiment, the quantization may be performed using a precision limited quantization method. Other embodiments may use other quantization techniques.
The atom is subtracted from the signal giving a residual. The signal may then be completely, or nearly completely, described by the atom plus the residual. The process may be repeated with new atoms successively found and subtracted from the residual. At any stage, the signal may be completely, or nearly completely, described by the codes of the atoms found and the remaining residual.
Matching pursuits may decompose any signal $f$ into a linear expansion of waveforms that may belong to a redundant dictionary $D = \{\varphi_\gamma\}$ of basis functions, such that

$$f = \sum_{n=0}^{m-1} \alpha_n \varphi_{\gamma_n} + R^m f$$

where $R^m f$ is the $m$th order residual vector after approximating $f$ by $m$ ‘atoms’ and

$$\alpha_n = \langle \varphi_{\gamma_n}, R^n f \rangle$$

is the maximum inner product at stage $n$ of the dictionary with the $n$th order residual.
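By way of illustration only, the procedure described above may be sketched for a one-dimensional signal as follows. The sketch assumes unit-norm basis functions, omits the quantization stage, and uses hypothetical names (`inner`, `matching_pursuit`) and a hypothetical atom layout of (position, amplitude, dictionary index); none of these are prescribed by the description.

```python
from typing import List, Tuple

# Hypothetical atom layout: (position, amplitude, dictionary index).
Atom = Tuple[int, float, int]


def inner(signal: List[float], basis: List[float], pos: int) -> float:
    """Inner product of one basis function with the signal at offset pos."""
    return sum(b * signal[pos + k] for k, b in enumerate(basis))


def matching_pursuit(signal: List[float],
                     dictionary: List[List[float]],
                     num_atoms: int) -> Tuple[List[Atom], List[float]]:
    """Greedy search: pick the largest-magnitude inner product, subtract, repeat.

    Assumes every dictionary entry has unit norm, so the inner product is
    itself the amplitude of the atom.
    """
    residual = list(signal)
    atoms: List[Atom] = []
    for _ in range(num_atoms):
        best = None
        for d, basis in enumerate(dictionary):
            for pos in range(len(residual) - len(basis) + 1):
                ip = inner(residual, basis, pos)
                if best is None or abs(ip) > abs(best[1]):
                    best = (pos, ip, d)
        if best is None:
            break
        pos, amp, d = best
        # Subtract the selected atom, leaving the next-order residual.
        for k, b in enumerate(dictionary[d]):
            residual[pos + k] -= amp * b
        atoms.append(best)
    return atoms, residual


s = 2 ** -0.5
example_atoms, example_residual = matching_pursuit(
    [0.0, 3.0, 3.0, 0.0], [[1.0], [s, s]], 1)
```

In this small example the two-sample basis function at position 1 gives the largest inner product, and subtracting it leaves a residual that is zero to within rounding.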
For some embodiments, the dictionary of basis functions may comprise two-dimensional bases. Other embodiments may use dictionaries comprising one-dimensional bases which may be applied separately to form two-dimensional bases. A dictionary of n basis functions in one dimension may provide a dictionary of n² basis functions in two dimensions. For one embodiment, two-dimensional data, such as image data, may be scanned to form a one dimensional signal and a one-dimensional dictionary may be applied. In other embodiments, a one-dimensional dictionary may be applied to other one-dimensional signals, such as, for example, audio signals.
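As one possible illustration of forming two-dimensional bases from one-dimensional bases, each pair of one-dimensional entries may be combined as an outer product, one entry applied vertically and the other horizontally. The function names below are hypothetical and the sketch is not the only way such a dictionary may be built.

```python
from typing import List

Basis1D = List[float]
Basis2D = List[List[float]]


def outer(vertical: Basis1D, horizontal: Basis1D) -> Basis2D:
    """Outer product: rows follow the vertical basis, columns the horizontal."""
    return [[v * h for h in horizontal] for v in vertical]


def separable_dictionary(bases_1d: List[Basis1D]) -> List[Basis2D]:
    """Every ordered pair of 1-D entries yields one 2-D entry (n * n in total)."""
    return [outer(v, h) for v in bases_1d for h in bases_1d]


bases = [[1.0], [0.5, 0.5], [0.25, 0.5, 0.25]]
dictionary_2d = separable_dictionary(bases)
```

With three one-dimensional entries the resulting dictionary holds nine two-dimensional entries, the largest covering 3 by 3 samples.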
For compression, the matching pursuits process may be terminated at some stage and the codes of a determined number of atoms are stored and/or transmitted by a further coding process. For one embodiment, the further coding process may be a lossless coding process. Other embodiments may use other coding techniques, for example lossy processes.
An image may be represented as a two-dimensional array of coefficients, each coefficient representing luminance levels at a point. Many images have smooth luminance variations, with the fine details being represented as sharp edges in between the smooth variations. The smooth variations in luminance may be termed as lower frequency components and the sharp variations as higher frequency components. The lower frequency components (smooth variations) may comprise the gross information for an image, and the higher frequency components may include information to add detail to the gross information. One technique for separating the lower frequency components from the higher frequency components may include a Discrete Wavelet Transform (DWT). Wavelet transforms may be used to decompose images. Wavelet decomposition may include the application of Finite Impulse Response (FIR) filters to separate image data into sub sampled frequency bands. The application of the FIR filters may occur in an iterative fashion, for example as described below in connection with
At block 220, a matching pursuits algorithm begins. For this example embodiment, the matching pursuits algorithm comprises blocks 220 through 250. At block 220, an appropriate atom is determined. The appropriate atom may be determined by finding the full inner product between the wavelet transformed image data and each member of a dictionary of basis functions. At the position of maximum inner product the corresponding dictionary entry describes the wavelet transformed image data locally. The dictionary entry forms part of the atom. An atom may comprise a position value, a quantized amplitude, sign, and a dictionary entry value. The quantization of the atom is shown at block 230.
At block 240, the atom determined at block 220 and quantized at block 230 is removed from the wavelet transformed image data, producing a residual. The wavelet transformed image may be described by the atom and the residual.
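For illustration, an atom and its removal from the data may be represented as sketched below. The description does not detail the quantization method here, so the sketch substitutes a plain uniform quantizer with an assumed step size; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Atom:
    position: int          # where the basis function is placed
    magnitude_index: int   # quantized amplitude magnitude, in units of step
    sign: int              # +1 or -1
    dictionary_index: int  # which dictionary entry was selected


def quantize_atom(position: int, amplitude: float,
                  dictionary_index: int, step: float) -> Atom:
    """Uniform quantizer (an assumption, standing in for the actual method)."""
    sign = -1 if amplitude < 0 else 1
    magnitude_index = int(abs(amplitude) / step + 0.5)
    return Atom(position, magnitude_index, sign, dictionary_index)


def atom_amplitude(atom: Atom, step: float) -> float:
    """Amplitude a decoder would reconstruct from the coded atom."""
    return atom.sign * atom.magnitude_index * step


def remove_atom(data: List[float], atom: Atom,
                basis: List[float], step: float) -> List[float]:
    """Subtract the quantized atom from the data, producing the residual."""
    amp = atom_amplitude(atom, step)
    residual = list(data)
    for k, b in enumerate(basis):
        residual[atom.position + k] -= amp * b
    return residual


atom = quantize_atom(position=2, amplitude=-3.3, dictionary_index=0, step=0.5)
residual = remove_atom([0.0, 0.0, -3.3, 0.0], atom, [1.0], step=0.5)
```

Subtracting the quantized amplitude, rather than the unquantized one, keeps the encoder's residual consistent with what a decoder may reconstruct.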
At block 250, a determination is made as to whether a desired number of atoms has been reached. The desired number of atoms may be based on any of a range of considerations, including, but not limited to, image quality and bit rate. If the desired number of atoms has not been reached, processing returns to block 220 where another atom is determined. The process of selecting an appropriate atom may include finding the full inner product between the residual of the wavelet transformed image after the removal of the prior atom and the members of the dictionary of basis functions. In another embodiment, rather than recalculating all of the inner products, the inner products from a region of the residual surrounding the previous atom position may be calculated. Blocks 220 through 250 may be repeated until the desired number of atoms has been reached. Once the desired number of atoms has been reached, the atoms are coded at block 260. The atoms may be coded by any of a wide range of encoding techniques. The example embodiment of
a through 4d is a diagram depicting an example wavelet decomposition of an image 400. As depicted in
c shows the results of the horizontal and vertical analyses. Image 400 is divided into four sub bands. LL sub band 422 includes data that has been low pass filtered in both the horizontal and vertical directions. HL sub band 424 includes data that has been high pass filtered in the horizontal direction and low pass filtered in the vertical direction. LH sub band 426 includes data that has been low pass filtered in the horizontal direction and high pass filtered in the vertical direction. HH sub band 428 includes data that has been high pass filtered in both the horizontal and vertical directions. LL sub band 422 may include gross image information, and bands HL 424, LH 426, and HH 428 may include high frequency information providing additional image detail.
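The horizontal-then-vertical analysis described above may be sketched as follows. The description does not name particular FIR filters, so the sketch assumes the simplest pair (Haar averaging and differencing) and an image with even dimensions; function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_1d(x: List[float]) -> Tuple[List[float], List[float]]:
    """Two-tap low pass (average) and high pass (difference), subsampled by 2."""
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def analyze_rows(m: Matrix) -> Tuple[Matrix, Matrix]:
    """Filter every row, returning the low band and the high band."""
    pairs = [haar_1d(row) for row in m]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def decompose(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One decomposition level, returning (LL, HL, LH, HH)."""
    low_h, high_h = analyze_rows(image)  # horizontal analysis
    # Vertical analysis: filter the columns by transposing, then transpose back.
    ll, lh = (transpose(b) for b in analyze_rows(transpose(low_h)))
    hl, hh = (transpose(b) for b in analyze_rows(transpose(high_h)))
    return ll, hl, lh, hh


flat = [[8.0] * 4 for _ in range(4)]
stripes = [[2.0, 0.0, 2.0, 0.0] for _ in range(4)]
flat_bands = decompose(flat)
stripe_bands = decompose(stripes)
```

A constant image places all of its energy in the LL band, while vertical stripes, which vary only horizontally, appear in the HL band and leave LH and HH empty.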
For wavelet transformation, benefits may be obtained by repeating the decomposition process one or more times. For example, LL band 422 may be further decomposed to produce another level of sub bands LL2, HL2, LH2, and HH2, as depicted in
a through 4d depict an example two band (low and high) wavelet transformation process. Other embodiments are possible using more than two bands.
Following the horizontal analysis, the analysis is performed in a vertical direction.
Although the example embodiment discussed in connection with
Another possible embodiment for wavelet transformation may be referred to as wavelet packets.
Motion residual 705 is received at a wavelet transform block 712. Wavelet transform block 712 may perform a wavelet transform on motion residual 705. The wavelet transform may be similar to one or more of the example embodiments discussed above in connection with
The output 707 of wavelet transform block 712 may be transferred to a matching pursuits block 714. Matching pursuits block 714 may perform a matching pursuits algorithm on the information 707 output from the wavelet transform block 712. The matching pursuits algorithm may be implemented in a manner similar to that discussed above in connection with
The coded atoms from block 720 and coded motion vectors from block 722 may be output as part of a bitstream 719. Bitstream 719 may be transmitted to any of a wide range of devices using any of a wide range of interconnect technologies, including wireless interconnect technologies, the Internet, local area networks, etc., although the claimed subject matter is not limited in this respect.
The various blocks and units of coding system 700 may be implemented using software, firmware, and/or hardware, or any combination of software, firmware, and hardware. Further, although
Build atoms block 812 receives coded atom parameters 803 and provides decoded atom parameters to a build wavelet transform coefficients block 814. Block 814 uses the atom parameter information and dictionary 822 to reconstruct a series of wavelet transform coefficients. The coefficients are delivered to an inverse wavelet transform block 816 where a motion residual image 805 is formed. The motion residual image may comprise a DFD image. Build motion block 818 receives motion vectors 807 and creates motion compensation data 809 that is added to motion residual 805 to form a current reconstruction image 813. Image data 813 is provided to a delay block 820 which provides a previous reconstruction image 815 to the build motion block 818 to be used in the construction of motion prediction information.
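As a rough one-dimensional illustration of the reconstruction path, decoded atoms may be accumulated into a coefficient array, and motion compensation data may be added to a residual sample by sample. The atom layout of (position, amplitude, dictionary index) and the function names are assumptions made for this sketch, and the inverse wavelet transform is omitted.

```python
from typing import List, Tuple

# Hypothetical decoded atom layout: (position, amplitude, dictionary index).
DecodedAtom = Tuple[int, float, int]


def build_coefficients(atoms: List[DecodedAtom],
                       dictionary: List[List[float]],
                       length: int) -> List[float]:
    """Sum each scaled dictionary entry into place to rebuild the coefficients."""
    coeffs = [0.0] * length
    for position, amplitude, dict_index in atoms:
        for k, b in enumerate(dictionary[dict_index]):
            coeffs[position + k] += amplitude * b
    return coeffs


def add_motion_compensation(residual: List[float],
                            compensation: List[float]) -> List[float]:
    """Current reconstruction = motion residual + motion compensation data."""
    return [r + c for r, c in zip(residual, compensation)]


decoder_dictionary = [[1.0], [0.5, 0.5]]
coeffs = build_coefficients([(1, 2.0, 0), (2, -1.0, 1)], decoder_dictionary, 5)
reconstruction = add_motion_compensation(coeffs, [1.0] * 5)
```

The decoder needs the same dictionary as the encoder, since each atom carries only an index into it.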
The various blocks and units of decoding system 800 may be implemented using software, firmware, and/or hardware, or any combination of software, firmware, and hardware. Further, although
As discussed previously, for at least some embodiments of the matching pursuits algorithm, once a basis function has been subtracted from a signal, a residual signal results. In order to more efficiently recalculate the inner products, some embodiments may recalculate the inner products for a region surrounding the location where the basis function was subtracted. The recalculation of inner products for a region surrounding a location where a basis function was subtracted may be referred to as a “repair” function.
a is a diagram of an example basis function 1020 and an example image data signal 1010. For this example, basis function 1020 has a width that is equal to the maximum width “w” for entries of an associated dictionary. For this example, a matching pursuits method determines that basis function 1020 at location “x” provides the maximum inner product. As can be seen in
b depicts a residual data signal 1030. Residual signal 1030 for this example is the result of subtracting basis function 1020 from signal 1010. For one embodiment, rather than recalculating all of the inner products for residual signal 1030, the inner products may be repaired; that is, a subset of inner products is calculated for a region surrounding the location where basis function 1020 was subtracted. For this example, inner products for a region of x−w to x+w are calculated. Thus, the computation zone for this example has a width of 2w−1.
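One way the repair function may be sketched is shown below. The sketch assumes that a table of inner products is kept for every dictionary entry and position, and that the location x is the index of the first sample of the subtracted basis function; both are assumptions of this illustration, as are the function names.

```python
from typing import List


def inner(signal: List[float], basis: List[float], pos: int) -> float:
    return sum(b * signal[pos + k] for k, b in enumerate(basis))


def all_inner_products(signal: List[float],
                       dictionary: List[List[float]]) -> List[List[float]]:
    """Full search: table[d][p] for every entry d and valid position p."""
    return [[inner(signal, basis, p)
             for p in range(len(signal) - len(basis) + 1)]
            for basis in dictionary]


def repair(table: List[List[float]], residual: List[float],
           dictionary: List[List[float]], x: int, atom_width: int) -> None:
    """Recompute only inner products whose support overlaps [x, x + atom_width)."""
    for d, basis in enumerate(dictionary):
        first = max(0, x - len(basis) + 1)
        last = min(len(residual) - len(basis), x + atom_width - 1)
        for p in range(first, last + 1):
            table[d][p] = inner(residual, basis, p)


repair_dictionary = [[1.0], [0.5, 0.5], [0.25, 0.5, 0.25]]
signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, 2.0, 1.0]
table = all_inner_products(signal, repair_dictionary)

# Subtract an atom (entry 2, amplitude 2.0) starting at x = 3, then repair.
residual = list(signal)
for k, b in enumerate(repair_dictionary[2]):
    residual[3 + k] -= 2.0 * b
repair(table, residual, repair_dictionary, x=3, atom_width=3)
```

When both the subtracted atom and the widest dictionary entry have width w, the repaired positions run from x−w+1 to x+w−1, which is the 2w−1 wide zone described above; positions outside it are unaffected by the subtraction.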
The computational costs for matching pursuits methods may be dominated by the calculation of inner products, either over an entire image (or other data type) or in the “repair” stage for some embodiments. The calculation of inner products may be much more complex and computation intensive for multi-dimensional dictionaries (or one dimensional dictionaries that are composed separately into two dimensional dictionaries) than for one dimensional dictionaries. For example, if a dictionary consists of N one dimensional basis functions of a maximum width W, then there are N² two dimensional dictionary entries and the largest entry covers W² pixels (in the case of a two-dimensional image). In order to perform a repair process, many inner products, some requiring W² multiplications followed by summation operations, may have to be computed over a range of 2W−1 pixels both horizontally and vertically to cover the range whose inner products may have changed by the subtraction of the quantized atom from the image or its residual (for example, the subtraction of basis function 1020 from signal 1010 in
For one embodiment, the computational complexity for coding a multi-dimensional signal such as an image may be reduced by the use of one dimensional basis functions in conjunction with a method for scanning the multi-dimensional image to produce a one dimensional signal. In applying a matching pursuits method to a one dimensional signal, with N one dimensional dictionary entries of a maximum width of W, a repair stage may involve inner products of W multiplications carried out over a range of 2W−1 positions of the signal. The number of multiplications for each atom found may therefore scale as 2NW², which may be much less complex than the above-described two dimensional case by virtue of a smaller dictionary and/or the smaller number of multiplications in each inner product.
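The difference in scale may be made concrete with a small worked count. The sketch below treats every dictionary entry as having the maximum width, so the figures are upper bounds under that assumption; the two-dimensional count is derived here from the entry count, entry size, and repair range given above, and the values N = 16 and W = 8 are arbitrary.

```python
def repair_multiplications_1d(n: int, w: int) -> int:
    """N entries, W multiplications each, over 2W - 1 positions (about 2NW^2)."""
    return n * w * (2 * w - 1)


def repair_multiplications_2d(n: int, w: int) -> int:
    """N^2 entries, W^2 multiplications each, over (2W - 1)^2 positions."""
    return (n * n) * (w * w) * (2 * w - 1) ** 2


count_1d = repair_multiplications_1d(16, 8)
count_2d = repair_multiplications_2d(16, 8)
ratio = count_2d // count_1d
```

For these example values the one dimensional repair needs at most 1,920 multiplications per atom against 3,686,400 for the two dimensional case, a ratio of 1,920.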
The above example describes scanning a multi-dimensional signal (in this example a two dimensional image) in order to produce a one dimensional signal to provide for reduced-complexity matching pursuits computations. A decoding device may include a reverse-scanning process to reproduce the greater-dimensional data from the one dimensional data. Thus, a decoding device such as that shown in
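A scanning process and its reverse may be as simple as the sketch below. The description does not fix a scan order in this passage, so a plain row-by-row (raster) order is assumed, and the function names are hypothetical; other scan orders may suit other embodiments.

```python
from typing import List, Tuple


def scan(image: List[List[float]]) -> Tuple[List[float], int]:
    """Raster scan: concatenate rows; also return the row width for the decoder."""
    width = len(image[0])
    return [value for row in image for value in row], width


def reverse_scan(signal: List[float], width: int) -> List[List[float]]:
    """Rebuild the two dimensional array from the one dimensional signal."""
    return [signal[i:i + width] for i in range(0, len(signal), width)]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
signal_1d, row_width = scan(image)
restored = reverse_scan(signal_1d, row_width)
```

The decoder has to know the scan order and the image width to undo the scan, so both may need to be fixed in advance or conveyed with the coded data.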
Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments. The various appearances of “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments.
In the foregoing specification claimed subject matter has been described with reference to specific example embodiments thereof. It will, however, be evident that various modifications and/or changes may be made thereto without departing from the broader spirit and/or scope of the subject matter as set forth in the appended claims. The specification and/or drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.
Number | Date | Country |
---|---|---|
20070053597 A1 | Mar 2007 | US |