1. Field
The present disclosure relates generally to the field of video processing and, more specifically, to techniques for integrating scaling and filtering in a standalone module.
2. Background
The visual appearance of a video signal may be significantly improved by emphasizing the high frequency components of an image. Sharpening is one such practice: it enhances the local contrast at boundaries and has become one of the most important features in image processing. Among the many solutions proposed in the past, “unsharp masking with linear high pass filtering” has proven to be a simple and effective method to enhance an image. The video signal may be further sharpened with a sharpening filter.
There is therefore a need in the art for techniques for integrating scaling and filtering in a standalone module with a single scaling filter.
Techniques for integrating scaling and filtering in a standalone module are described herein. In one configuration, a device comprising a single scaling filter to filter a video signal once to perform both sharpening and scaling is provided. The device includes a memory to store original scaling filter coefficients for the scaling filter. The device also includes an integrated circuit to calculate new sharpening-scaling filter coefficients derived from the original scaling filter coefficients and one of sharpening filter coefficients for a sharpening filter and a sharpening strength and to apply the new sharpening-scaling filter coefficients to the single scaling filter.
In an aspect, an integrated circuit is provided. The integrated circuit comprises a single scaling filter to filter a video signal once to perform both sharpening and scaling. The integrated circuit includes a memory to store original scaling filter coefficients for the scaling filter. The integrated circuit also includes a circuit to calculate new sharpening-scaling filter coefficients derived from the original scaling filter coefficients and one of sharpening filter coefficients for a sharpening filter and a sharpening strength and to apply the new sharpening-scaling filter coefficients to the single scaling filter.
In a still further aspect, a computer program product is provided. The computer program product includes a computer readable medium having instructions for causing a computer to filter a video signal x once to perform both sharpening and scaling according to
where di are new sharpening-scaling filter coefficients derived from at least sharpening filter coefficients and original scaling filter coefficients of a scaling filter; m denotes an even number of taps; s denotes the scaling ratio; q denotes a coordination index after scaling; and i is an index for a tap of the scaling filter.
In a still further aspect, a processor with a single scaling filter to filter a video signal once to perform both sharpening and scaling is provided. The processor includes a memory to store original scaling filter coefficients for the scaling filter. The processor also includes an integrated circuit to calculate new sharpening-scaling filter coefficients derived from the original scaling filter coefficients and one of sharpening filter coefficients for a sharpening filter and a sharpening strength and applying the new sharpening-scaling filter coefficients to the single scaling filter.
Additional aspects will become more readily apparent from the detailed description, particularly when taken together with the appended drawings.
Aspects and configurations of the disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify corresponding elements throughout.
The images in the drawings are simplified for illustrative purposes and are not depicted to scale. To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures, except that suffixes may be added, when appropriate, to differentiate such elements.
The appended drawings illustrate exemplary configurations of the invention and, as such, should not be considered as limiting the scope of the invention that may admit to other equally effective configurations. It is contemplated that features or steps of one configuration may be beneficially incorporated in other configurations without further recitation.
The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any configuration or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other configurations or designs.
where {gj, −n≦j≦n} denote the coefficients of the linear HP filter 12.
The techniques described herein may be used for wireless communications, computing, personal electronics, etc. with a built-in camera module. An exemplary use of the techniques for wireless communication is described below.
The wireless device 100 is capable of providing bi-directional communications via a receive path and a transmit path. On the receive path, signals transmitted by base stations are received by an antenna 112 and provided to a receiver (RCVR) 114. The receiver 114 conditions and digitizes the received signal and provides samples to a digital section 120 for further processing. On the transmit path, a transmitter (TMTR) 116 receives data to be transmitted from the digital section 120, processes and conditions the data, and generates a modulated signal, which is transmitted via the antenna 112 to the base stations.
The digital section 120 includes various processing, interface and memory units such as, for example, a modem processor 122, a video processor 124, a controller/processor 126, a display processor (DP) 128, an ARM/DSP 132, a graphics processing unit (GPU) 134, an internal memory 136, and an external bus interface (EBI) 138. The modem processor 122 performs processing for data transmission and reception (e.g., encoding, modulation, demodulation, and decoding). The video processor 124 performs processing on video content (e.g., still images, moving videos, and moving texts) for video applications such as camcorder, video playback, and video conferencing. The video processor 124 includes a video front end (VFE) 125. The VFE may be an MSM 8600 VFE. The video processor 124 performs processing for a camera module 150 having a lens 152 to create still images and/or moving videos.
The controller/processor 126 may direct the operation of various processing and interface units within the digital section 120. The display processor 128 performs processing to facilitate the display of videos, graphics, and texts on a display unit 130. The ARM/DSP 132 may perform various types of processing for the wireless device 100. The graphics processing unit 134 performs graphics processing of a graphics pipeline.
The techniques described herein may be used for any of the processors in the digital section 120, e.g., the video processor 124. The internal memory 136 stores data and/or instructions for various units within the digital section 120. The EBI 138 facilitates the transfer of data between the digital section 120 (e.g., internal memory 136) and a main memory 140 along a bus or data line DL.
The digital section 120 may be implemented with one or more DSPs, micro-processors, RISCs, etc. The digital section 120 may also be fabricated on one or more application specific integrated circuits (ASICs) or some other type of integrated circuits (ICs).
The techniques described herein may be implemented in various hardware units. For example, the techniques may be implemented in ASICs, DSPs, RISCs, ARMs, digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and other electronic units.
Combining Sharpening with Scaling
Combine Scaling with a Pre-Sharpening Filter
Returning again to
where {ci, −1≦i≦2} represent 4-tap finite impulse response (FIR) filter coefficients for scaling; q denotes the coordination index after scaling; s denotes the scaling ratio; and q/s is the coordination index before scaling. Note that the index of y has been scaled by the scaling ratio s since the scaling changes the coordination grid spacing.
Placing equation Eq. (2) into Eq. (3b) modifies the scaling equation into equation Eq. (4):
Since a 4-tap FIR filter is used in the DP 128 for scaling, the above equation needs to be modified in order to fit into the current scaling architecture. First, choose n=1; Eq. (4) then becomes Eq. (5):
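Second, remove the weights for the two outermost samples, x_{q/s−2} and x_{q/s+3}, which fall outside the 4-tap window of the scaling filter. To achieve this, linear predictions are used to predict these two samples from the four samples inside the window, as defined in Eq. (6) and Eq. (7):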
Placing equations Eq. (6) and Eq. (7) into equation Eq. (5), the new sharpening-scaling equation Eq. (8) is formed:
Equation Eq. (8) shows that the scaled sharpened output zq is obtained by the convolution between the original (pre-sharpened and pre-scaled) signals
and a 4-tap sharpening-scaling filter di (−1≦i≦2). The coefficients of the new sharpening-scaling filter {di, −1≦i≦2} are derived from the original scaling filter coefficients {ci, −1≦i≦2}, sharpening filter coefficients {hi, −1≦i≦1}, forward prediction coefficients {ai, 1≦i≦4} and backward prediction coefficients {bi, 1≦i≦4}, as described in equation Eq. (9)
Note that the new sharpening-scaling equation (i.e., Eq. (8)) has the same format as the original scaling equation (i.e., Eq. (3)), but with different coefficients. Therefore, both sharpening and scaling can be done in a single pass with the 32-phase polyphase 4-tap FIR scaling module.
Based on the orthogonality principle, the optimal forward prediction coefficients {âi, 1≦i≦4} are defined by Eq. (10) as
where r_k is the autocorrelation value, r_k = E(x_n x_{n−k}). A similar equation is used for the optimal backward prediction coefficients {b̂i, 1≦i≦4}.
To further simplify the calculations for di, choose
and
Furthermore, a simple first order approximation, instead of the complicated optimal solution, is used for forward and backward predictions. Then Eq. (9) becomes equation Eq. (11)
where the parameter α is called sharpening strength. One sharpening strength yields one set of coefficients {di}. In this example, a2, a3 and a4 are zero; b2, b3 and b4 are set to zero; and b1 and a1 are set to 1.
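Eq. (11) itself is not reproduced in this text. The block below is a reconstruction, not the original equation. It assumes a symmetric 3-tap sharpening filter h = (−a, 1+2a, −a), where a is a hypothetical normalized strength (a = α/256 under the fixed-point scaling used later for Eq. (17)), and it assumes that a1 and b1 weight the sample nearest to the one being predicted. Under those assumptions the result agrees with the phase-0 coefficients [−2α, 512+4α, −2α, 0] and with the adder structure described for Eq. (17).

```latex
% k = q/s is the index before scaling. Cascading the 3-tap sharpener with the
% 4-tap scaler touches six samples, x_{k-2} ... x_{k+3}:
\[
z_q = \sum_{i=-1}^{2} c_i \left[ -a\,x_{k+i-1} + (1+2a)\,x_{k+i} - a\,x_{k+i+1} \right]
\]
% First-order prediction (a_1 = b_1 = 1, all other a_i, b_i = 0) replaces the
% two outer samples: x_{k-2} \approx x_{k-1} and x_{k+3} \approx x_{k+2}.
% Collecting the weights on x_{k-1}, x_k, x_{k+1}, x_{k+2} gives
\[
\begin{aligned}
d_{-1} &= (1+a)\,c_{-1} - a\,c_{0} \\
d_{0}  &= -a\,c_{-1} + (1+2a)\,c_{0} - a\,c_{1} \\
d_{1}  &= -a\,c_{0} + (1+2a)\,c_{1} - a\,c_{2} \\
d_{2}  &= -a\,c_{1} + (1+a)\,c_{2}
\end{aligned}
\]
```

In this form the coefficients d sum to the same value as the coefficients c, so the combined filter keeps the DC gain of the original scaling filter for any strength a.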
Arbitrary scaling in the DP 128 is achieved by the use of a polyphase structure with 32 phases, and each phase has its own set of FIR filter coefficients. Let {cp,i, −1≦i≦2} represent the original scaling filter coefficients for phase p, then the new sharpening-scaling filter coefficients for phase p, denoted as {dp,i, −1≦i≦2}, are defined by equation Eq. (12) as
Combine Scaling with a Post-Sharpening Filter
Referring back to
where s represents the scaling ratio. Placing Eq. (13) into Eq. (14) achieves equation Eq. (15)
where n is the number of the sharpening filter taps.
Since sharpening is placed after scaling, the coordination grid spacing for the sharpening input has been changed. This increases the implementation difficulty. Moreover, since fk(s) is a function of the scaling ratio s, a different scaling ratio yields a different set of coefficients. Therefore, infinitely many sets of coefficients would be required in order to support arbitrary scaling, which is essentially prohibitive.
Sharpening strength is determined by the user's preference, which is set by the user, and by coding parameters.
Adjusting the sharpening strength adaptively according to coding parameters is a mechanism to prevent unwanted enhancement of coding artifacts. The sharpening strength α is reduced if Qp is greater than a threshold (since coding artifacts grow with Qp), as shown in Eq. (16):
α=max(α0−k(max(0,Qp−τ)),αmin) (16)
where αmin is the minimum sharpening strength; τ is a threshold determined by the distance to the last I frame and codec type; k is a tunable constant; Qp is a quantization step size; and α0 is a default sharpening strength. A smaller τ is set for I frames and frames closer to the I frames, while a larger τ is set for frames far away from the I frames. The threshold τ is also affected by codec type. A larger τ is set for a codec with in-loop deblocker or post deblocker/smoothing filter, while a smaller τ is set for a codec without deblocker or any other modules to remove coding artifacts.
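Eq. (16) can be sketched in a few lines of Python. The function name and the numeric values below are illustrative assumptions only; the text does not give concrete values for α0, k, τ or αmin.

```python
def adaptive_sharpening_strength(alpha0, qp, tau, k, alpha_min):
    """Eq. (16): alpha = max(alpha0 - k * max(0, Qp - tau), alpha_min).

    alpha0    -- default sharpening strength
    qp        -- quantization step size Qp of the current frame
    tau       -- threshold (depends on distance to the last I frame and codec type)
    k         -- tunable constant
    alpha_min -- minimum sharpening strength
    """
    penalty = k * max(0, qp - tau)   # no reduction while Qp stays at or below tau
    return max(alpha0 - penalty, alpha_min)


# Illustrative values only (not taken from the text).
low_qp = adaptive_sharpening_strength(alpha0=32, qp=20, tau=28, k=2, alpha_min=0)
mid_qp = adaptive_sharpening_strength(alpha0=32, qp=36, tau=28, k=2, alpha_min=0)
high_qp = adaptive_sharpening_strength(alpha0=32, qp=60, tau=28, k=2, alpha_min=0)
print(low_qp, mid_qp, high_qp)
```

With these example values the strength stays at its default below the threshold, drops linearly above it, and is held at αmin once the reduction exceeds α0 − αmin.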
To implement the sharpening function properly, the following two changes to the FIR filters are suggested for new DP designs. First, increase the number of bits for the filter coefficients from s10 to s11 (in Q9 format). A signed ten-bit resolution (s10) is sufficient for scaling alone, but not for performing both scaling and sharpening. Taking phase 0 as an example, the FIR filter coefficients are [−2α, 512+4α, −2α, 0], so an overflow occurs even at a sharpening strength of α=1. To address this problem, the number of bits for the filter coefficients needs to be increased.
The output of block 404a is sent to adder 406. The modified 4-tap FIR filter circuit 400 includes four parallel paths. The first path includes the multiplier 402a and downshifter 404a. The second path includes a multiplier 402b and downshifter 404b. The third path includes a multiplier 402c and downshifter 404c. The fourth path includes a multiplier 402d and downshifter 404d. The first through fourth paths function essentially the same. Thus, adder 406 receives the output from downshifters 404a, 404b, 404c and 404d.
The output of adder 406 is sent to block 408 where downshifting bitwise by 2 (>>2) takes place. The output of adder 406 is 15 bits (15s). When the output of adder 406 is divided by 2^2, the resultant output is 13 bits (13s). The output of downshifter 408 is sent to adder 410 where a 1 bit matrix is added. Thus, the resultant output is now 14 bits (14s). The output of the adder 410 is sent to downshifter block 412 to perform a downshift bitwise by 1, i.e., a divide by 2^1. The resultant output is now 13 bits (13s). The output of the downshifter block 412 is sent to block 414 where the output is clamped to values in the range [0, 255]. The resultant output is an eight bit signal (8u).
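The datapath of circuit 400 can be modeled in software as a rough sketch. Two points are assumptions rather than statements from the text: each multiplier is taken to multiply an 8-bit unsigned pixel by an s11 coefficient, and the per-path downshifters 404a to 404d are taken to shift by 6 bits. The shift of 6 is inferred from the stated bit widths (a 19-bit product reduced to 13 bits, four of which sum to 15 bits) and from the Q9 format (6 + 2 + 1 = 9). The value added at adder 410 is modeled as a rounding term of 1.

```python
def fir4_tap(pixels, coeffs, pre_shift=6):
    """Model of the modified 4-tap FIR datapath (circuit 400).

    pixels    -- four 8-bit unsigned input samples
    coeffs    -- four Q9 coefficients (s11)
    pre_shift -- ASSUMED shift of downshifters 404a-404d (not stated in the text)
    """
    if len(pixels) != 4 or len(coeffs) != 4:
        raise ValueError("a 4-tap filter needs four pixels and four coefficients")
    # Multipliers 402a-402d followed by downshifters 404a-404d.
    # Python's >> on negative ints is an arithmetic (floor) shift.
    partial = [(p * c) >> pre_shift for p, c in zip(pixels, coeffs)]
    acc = sum(partial)            # adder 406
    acc >>= 2                     # downshifter 408
    acc += 1                      # adder 410, modeled as a rounding term
    acc >>= 1                     # downshifter 412
    return max(0, min(255, acc))  # clamp block 414


# Phase-0 coefficients without sharpening pass the second tap through.
print(fir4_tap([10, 100, 20, 30], [0, 512, 0, 0]))
# Phase-0 coefficients at alpha = 32 leave a flat region unchanged.
print(fir4_tap([100, 100, 100, 100], [-64, 640, -64, 0]))
```

Splitting the 9-bit normalization into three stages keeps every intermediate value within the bit widths quoted above, at the cost of a small truncation error in the per-path shift.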
Separate sets of coefficients for luminance and chrominance are suggested. The original scaling design uses the same set of coefficients for both luminance and chrominance. This may not be the best choice for the sharpening-scaling module, since the sharpening enhancement should be applied only to the luminance component. In the exemplary embodiment, the luminance uses a set of coefficients different from the one used for the chrominance. For instance, at sharpening strength α=32, the coefficient set listed in Table 2 is used for luminance, while the coefficient set listed in Table 1 is used for chrominance. In one configuration, sharpening is not applied to chrominance, so the original scaling filter coefficients listed in Table 1 {C} are used for chrominance (i.e., scaling only, no sharpening). The new sharpening-scaling coefficients listed in Table 2 {D} are used for luminance (i.e., both sharpening and scaling). The value W in
In addition to the above two changes, a new sub-module to calculate the sharpening-scaling filter coefficients is introduced here.
The coefficients {cp,k, −1≦k≦2} in Eq. (12) are in Q9 format and the sharpening strength α is in the range of [−127, 127]. Thus, Eq. (12) may be rewritten as equation Eq. (17)
where the hardware implementation of above equation is illustrated in
The adder 504 receives as input the sharpening strength α and the output of adder 502. This produces second resultant output 2α+256. The sharpening strength at the output of block 506 is −α, the third resultant output. In the exemplary embodiment, the first and second resultant outputs from circuit 500 are 9 bits (9u) while the third resultant output is 8 bits (8s).
The sub-circuit 600 calculates the product of the 4-tap FIR filter coefficients for scaling denoted as {ci, −1≦i≦2} and the 4×4 matrix of equation Eq. (17). There are four paths for inputting the FIR filter coefficients, Cp,-1, Cp,0, Cp,1 and Cp,2. The FIR filter coefficients, Cp,-1, Cp,0, Cp,1 and Cp,2 are 10 bits (10s). Each path has a multiplier 602, 606, 610 and 614. The multiplier 602 receives the input coefficient Cp,-1 and the first resultant output of integrated circuit 500, α+256. Likewise, the multiplier 614 receives the input coefficient Cp,2 and the first resultant output of integrated circuit 500, α+256.
The multiplier 606 receives the input coefficient Cp,0 and the second resultant output of integrated circuit 500, 2α+256. Likewise, the multiplier 610 receives the input coefficient Cp,1 and the second resultant output of integrated circuit 500, 2α+256.
Each of the four paths for inputting the FIR filter coefficients Cp,-1, Cp,0, Cp,1 and Cp,2 has a parallel branch path. Thus, the parallel branch path for Cp,-1 has multiplier 604 which multiplies Cp,-1 and the third resultant output of the integrated circuit 500, −α. The parallel branch path for Cp,0 has multiplier 608 which multiplies Cp,0 and the third resultant output of the integrated circuit 500, −α. The parallel branch path for Cp,1 has multiplier 612 which multiplies Cp,1 and the third resultant output of the integrated circuit 500, −α. The parallel branch path for Cp,2 has multiplier 616 which multiplies Cp,2 and the third resultant output of the integrated circuit 500, −α.
The first and second resultant outputs of sub-circuit 600 include (α+256)Cp,-1 and −αCp,-1 from multipliers 602 and 604, respectively. The third and fourth resultant outputs of multipliers 606 and 608 include (2α+256)Cp,0 and −αCp,0, respectively. The fifth and sixth resultant outputs of multipliers 610 and 612 include (2α+256)Cp,1 and −αCp,1, respectively. The seventh and eighth resultant outputs of multipliers 614 and 616 include (α+256)Cp,2 and −αCp,2, respectively. These resultant outputs are inputs to sub-circuit 700 of
The sub-circuit 700 calculates the remaining operation of equation Eq. (17) using the inputs from sub-circuit 600. The sub-circuit 700 includes four adders 702a, 702b, 702c and 702d. The adder 702a adds together the first resultant output (α+256)Cp,-1 and the fourth resultant output −αCp,0 which produces an output. The output of adder 702a is sent to adder 704a. The adder 704a adds the output from adder 702a and a value of 128. The output of adder 704a is sent to a downshifter 706a where it is downshifted bitwise by 8 (>>8). The downshifter 706a thus effectively divides the signal by 2^8 and produces dp,-1.
The adder 702b adds together the second resultant output −αCp,-1, the third resultant output (2α+256)Cp,0 and the sixth resultant output −αCp,1 which produces an output. The output of adder 702b is sent to adder 704b. The adder 704b adds the output from adder 702b and a value of 128. The output of adder 704b is sent to a downshifter 706b where it is downshifted bitwise by 8 (>>8). The downshifter 706b thus effectively divides the signal by 2^8 and produces dp,0.
The adder 702c adds together the fourth resultant output −αCp,0, the fifth resultant output (2α+256)Cp,1 and the eighth resultant output −αCp,2 which produces an output. The output of adder 702c is sent to adder 704c. The adder 704c adds the output from adder 702c and a value of 128. The output of adder 704c is sent to a downshifter 706c where it is downshifted bitwise by 8 (>>8) and produces dp,1.
The adder 702d adds together the sixth resultant output −αCp,1 and the seventh resultant output (α+256)Cp,2 which produces an output. The output of adder 702d is sent to adder 704d. The adder 704d adds the output from adder 702d and a value of 128. The output of adder 704d is sent to a downshifter 706d where it is downshifted bitwise by 8 (>>8) and produces dp,2.
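The arithmetic performed by circuits 500, 600 and 700, as described in the preceding paragraphs, can be written as a short Python sketch. The function name is hypothetical, and the code follows the adder descriptions rather than the printed form of Eq. (17), which is not reproduced in this text.

```python
def sharpening_scaling_coeffs(c, alpha):
    """Integer computation of {d_p,i} from {c_p,i} for one phase p.

    c     -- [c_p,-1, c_p,0, c_p,1, c_p,2], original scaling coefficients in Q9
    alpha -- integer sharpening strength in [-127, 127]
    """
    if len(c) != 4:
        raise ValueError("expected four scaling coefficients")
    if not -127 <= alpha <= 127:
        raise ValueError("alpha must lie in [-127, 127]")
    c_m1, c_0, c_1, c_2 = c
    outer = alpha + 256       # first resultant output of circuit 500
    inner = 2 * alpha + 256   # second resultant output of circuit 500
    neg = -alpha              # third resultant output of circuit 500
    # Adders 702a-d sum the products, adders 704a-d add 128 for rounding,
    # and downshifters 706a-d apply >> 8 (an arithmetic shift in Python).
    d_m1 = (outer * c_m1 + neg * c_0 + 128) >> 8
    d_0 = (neg * c_m1 + inner * c_0 + neg * c_1 + 128) >> 8
    d_1 = (neg * c_0 + inner * c_1 + neg * c_2 + 128) >> 8
    d_2 = (neg * c_1 + outer * c_2 + 128) >> 8
    return [d_m1, d_0, d_1, d_2]


# Phase 0 of the scaling filter is the pass-through set [0, 512, 0, 0] in Q9.
print(sharpening_scaling_coeffs([0, 512, 0, 0], 1))
print(sharpening_scaling_coeffs([0, 512, 0, 0], 32))
```

For phase 0 the result reproduces the pattern [−2α, 512+4α, −2α, 0] quoted earlier, and α=0 returns the original scaling coefficients unchanged.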
The block diagram 800 also shows a sharpening strength adjuster 810 to adaptively adjust the sharpening strength α to prevent unwanted enhancement on artifacts. The adaptive adjustment employs equation Eq. (16) above.
As can be appreciated, the equations described herein may be carried out by a processor or a combination of software and hardware. Furthermore, the sharpening strength adjuster 810 is shown in a dotted line box to denote that the sharpening strength adjuster 810 may be outside of the DP 128. For example, the sharpening strength adjuster 810 may be in the video processor 124. Likewise, the coefficient memory 804 is shown in a dotted line box to denote that the coefficient memory 804 may be external to the DP 128.
Only one sharpening parameter is defined in the table below. Note that many parameters used in the sharpening-scaling filter module 808, such as scaling coefficients for a FIR, have already been defined above. Table 4 illustrates a Hardware interface table with the sharpening strength α. This can be an input into the integrated circuit 500 of
Since the hardware (HW) changes cannot be made for some existing DPs, the overflow problem can be overcome by downshifting the FIR filter coefficients Cp,-1, Cp,0, Cp,1 and Cp,2 by 1 bit and then compensating for it in the gamma correction. Downshifting the coefficients by 1 bit is equivalent to representing a Q8 value (2^8) in signed 10-bit resolution. Since the FIR filter coefficients Cp,-1, Cp,0, Cp,1 and Cp,2 are divided by two, the new RGB values, after sharpening and scaling, would be only half of the values they are supposed to be. These values by the gamma correction (gc) are mapped values stored in the column “Original Mapped Value.” Each Original Mapped Value entry in the Look-up Table (LUT) (Table 5) has a new mapped value corresponding to the original mapped value multiplied by two. An example of the new LUT is listed in Table 5.
In existing DPs, scaling is performed after CSC, so the pixels processed by the scaling module are in RGB space rather than in YCbCr space. This makes “separate sets of coefficients for luminance and chrominance” less attractive.
The calculation of the new sharpening-scaling filter coefficients needs to be done in software (SW) since there is no integrated circuit (HW) for it in the existing DPs.
Extend to m-Tap Filter
The finite impulse response (FIR) filter taps of the sharpening-scaling filter module may be increased. The sharpening-scaling filter module should be ready to adopt changes. Specifically, let m be the number of the taps for the FIR filter (suppose m is an even number), then the equations Eq. (2), (3), and (8) are modified accordingly, as equations Eqs. (18), (19) and (20)
The relationship between the new sharpening-scaling filter coefficients
and the original scaling filter coefficients
is shown in equation Eq. (21)
In one or more exemplary configurations, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the disclosed configurations is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to these configurations will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other configurations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the configurations shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.