The present principles relate to methods and apparatus for detecting artifacts in a region of an image, a picture, or a video sequence after a concealment method has been applied.
Compressed video transmitted over unreliable channels such as wireless networks or the Internet may suffer from packet loss. A packet loss leads to image impairment that may cause significant degradation in image quality. In most practical systems, packet loss is detected at the transport layer and decoder error concealment post-processing tries to mitigate the effect of lost packets. This helps to improve image quality but could still leave some noticeable impairments in the video. In some applications such as no-reference video quality evaluation, detection of concealment impairments is typically needed. If only video coding layer information is available (i.e., the bitstream is not provided), concealment artifacts are detected based on image content.
The embodiments described herein provide a scheme for artifact detection. The proposed scheme is also based on the assumption that “sharp edges” are rarely aligned with macroblock boundaries. With an efficient framework, however, the proposed scheme addresses the problems of error propagation and high false alarm rates.
The principles described herein relate to artifact detection. At least one implementation described herein relates to detection of temporal concealment artifacts. The methods and apparatus for artifact detection provided by the principles described herein lower error propagation, particularly in artifacts due to temporal error concealment, and reduce false alarm rates compared to prior approaches.
According to one aspect of the present principles, there is provided a method for artifact detection that produces a value indicative of the level of artifacts present in a region of an image and that is used to conditionally perform error concealment on an image region. The method is comprised of steps for determining an artifact level for an image region based on pixel values in the image, and conditionally performing error concealment in response to the artifact level.
According to another aspect of the present principles, there is provided a method for artifact detection that produces a value indicative of the level of artifacts present in an image and that is used to conditionally perform error concealment on the image. The method is comprised of the aforementioned steps for determining an artifact level for an image region based on pixel values in the image, performed on the regions comprising the entire image. The method is further comprised of steps for removing artifact levels for overlapping regions of the image, for evaluating the ratio of the size of the image covered by regions where artifacts have been detected to the overall size of the entire image, and for conditionally performing error concealment in response to the artifact level.
According to another aspect of the present principles, there is provided a method for artifact detection that produces a value indicative of the level of artifacts present in a video sequence and that is used to conditionally perform error concealment on images in the video sequence. The method is comprised of the aforementioned steps for determining an artifact level for an image region based on pixel values in the image, performed on the regions comprising each entire image and on the images comprising the video sequence. The method is further comprised of conditionally performing error concealment on images in the video sequence in response to artifact levels.
According to another aspect of the present principles, there is provided an apparatus for artifact detection that produces a value indicative of the level of artifacts present in a region of an image and that is used to conditionally perform error concealment on an image region. The apparatus is comprised of a processor that determines an artifact level for an image region based on pixel values in the image and a concealment module that conditionally performs error concealment on an image region.
According to another aspect of the present principles, there is provided an apparatus for artifact detection that produces a value indicative of the level of artifacts present in an image and that is used to conditionally perform error concealment on an entire image. The apparatus is comprised of the aforementioned processor that determines an artifact level for an image region based on pixel values in the image. The processor operates on the regions comprising the entire image. The apparatus is further comprised of an overlap eraser that removes artifact levels for overlapping regions of the image, a scaling circuit that evaluates the ratio of the size of the image covered by regions where artifacts have been detected to the overall size of the image, and a concealment module that conditionally performs error concealment on the image.
According to another aspect of the present principles, there is provided an apparatus for artifact detection that produces a value indicative of the level of artifacts present in a video sequence and that is used to conditionally perform error concealment on the video sequence. The apparatus is comprised of the aforementioned processor that determines an artifact level for the images in a video sequence based on pixel values in the images, and that operates on regions comprising the images and on the images comprising the sequence. The apparatus is further comprised of an overlap eraser that removes artifact levels for overlapping regions of the images, a scaling circuit that evaluates the ratio of the size of each image that is covered by regions where artifacts have been detected to the overall size of the images, and a concealment module that conditionally performs error concealment on the images of the video sequence.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which are to be read in connection with the accompanying drawings.
a and b show a limitation of certain traditional solutions: (a) error propagation; (b) false alarm.
a and b show sample values for (a) θi(x, y); (b) Φi(x, y).
a and b show (a) an exemplary embodiment of the intersample differences taken for an image region and (b) a macroblock and related notations.
a and b show overlapping of two macroblocks when (a) the overlap is only vertical and (b) the overlap is both vertical and horizontal.
The principles described herein relate to artifact detection. Particularly, an object of the principles herein is to produce a value that is indicative of the artifacts present in a region of an image, in a picture, or in a video sequence when packets have been lost and error concealment techniques have been applied. An example of an artifact, which is commonly found when temporal error concealment is used, is shown in
For temporal error concealment, missing motion vectors are interpolated and damaged video regions are filled in by applying motion compensation. Temporal error concealment typically does not work well when the video sequence contains objects with unsmooth motion or when there is a scene change.
Some traditional temporal concealment detection solutions are based on the assumption that “sharp edges” are rarely aligned with macroblock boundaries in natural images. Based on this assumption, the differences between pixels, both at the horizontal boundary of each macroblock row and inside that macroblock row, are carefully checked to detect temporal concealment artifacts. These differences are referred to as intersample differences, which can be differences between adjacent horizontal pixels, adjacent vertical pixels, or between any other specified pixels.
The performance of some traditional detection solutions is quite limited for several reasons.
First, many artifacts will be propagated when the current frame is referenced by other frames in video encoding. This is also the case for many temporal concealment artifacts. Because of this error propagation, content discontinuities will occur not only at macroblock boundaries but anywhere in the frame.
Second, some traditional detection solutions result in high false alarm rates. When a natural edge crosses a macroblock boundary without being aligned with it, the value of the average intersample difference is high, as shown in
To solve the problem of high false alarm rates, one embodiment described herein checks the number of discontinuous points along the edge. Discontinuous points are those areas of an image where there is a larger than normal difference between pixels on opposite sides of the edge. If all the pixels along the macroblock boundary are discontinuous points, the image content at the macroblock boundary has a higher likelihood of being an artifact. If only some pixels along the macroblock boundary are discontinuous points, while other pixels have a similar average intersample difference, it is more likely that the discontinuous points are caused by a natural edge crossing the macroblock boundary.
To solve the problem of error propagation, one embodiment described herein checks the intersample difference not only at a macroblock boundary, but along all horizontal and vertical lines to determine the level of artifacts present.
According to the analysis just described, the principles described herein propose a scheme for artifact detection to avoid disadvantages of some traditional solutions, that is, error propagation and high false alarm rates. In response to the detection of an artifact level, an error correction technique can conditionally be performed on an image, either instead of, or in addition to, a proposed or already performed error concealment operation.
To illustrate an example of these principles, assume a decoded video sequence V = {f1, f2, ..., fn}, where fi (1 ≤ i ≤ n) is a frame in the video sequence. The width and height of V are W and H, respectively. Suppose the macroblock size is M×M and fi(x, y) is the pixel value at position (x, y) in frame fi.
Intersample Difference
For each frame fi, it is possible to define two two-dimensional (2D) maps θi, Φi: W×H→{0, 1, 2, ..., 255} by
θi(x, y) = |fi(x, y) − fi(x−1, y)| × mask(x, y)
Φi(x, y) = |fi(x, y) − fi(x, y−1)| × mask(x, y)    (1)
For simplicity, let fi(−1, y) = fi(0, y) and fi(x, −1) = fi(x, 0). In the above equations, mask(x, y) is a value, for example between 0 and 1, that indicates a level of masking effect (for example, luminance masking, texture masking, etc.). Detailed information on the masking effect can be found in Y. T. Jia, W. Lin, A. A. Kassim, “Estimating Just-Noticeable Distortion for Video”, IEEE Transactions on Circuits and Systems for Video Technology, July 2006.
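By way of illustration only, the two maps of Equation (1) could be computed as in the following Python sketch; the function name intersample_maps, the use of NumPy, and the all-ones default mask are assumptions of this sketch rather than part of the present disclosure.

import numpy as np

def intersample_maps(frame, mask=None):
    # frame: 2D array of pixel values (H x W), e.g. a decoded luma plane.
    # mask:  optional 2D array of masking weights, e.g. in [0, 1]; defaults
    #        to 1 everywhere, i.e. masking effects are ignored.
    f = frame.astype(np.int32)
    if mask is None:
        mask = np.ones_like(f, dtype=np.float64)
    # Replicate the border so that f(-1, y) = f(0, y) and f(x, -1) = f(x, 0).
    left = np.pad(f, ((0, 0), (1, 0)), mode="edge")[:, :-1]   # f(x-1, y)
    up = np.pad(f, ((1, 0), (0, 0)), mode="edge")[:-1, :]     # f(x, y-1)
    theta = np.abs(f - left) * mask   # horizontal differences, Equation (1)
    phi = np.abs(f - up) * mask       # vertical differences, Equation (1)
    return theta, phi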
The values of θi(x, y) and Φi(x, y) for the frame in
A filter g(•), such as one defined by the following thresholding equation, is then applied to both of the two maps:

g(t) = t, if t ≥ γ; g(t) = 0, otherwise,    (2)

where γ is a constant. Another example of a possible filter g(•) is defined by

g(t) = 1, if t ≥ γ; g(t) = 0, otherwise.
The filtered, or thresholded, versions of θi(x, y) and Φi(x, y) are subsequently also referred to as θi(x, y) and Φi(x, y) in the following description.
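Purely as an illustration, the two example filters described above could be written as follows; the names g_keep and g_binary, and the use of a non-strict inequality, are assumptions of this sketch.

import numpy as np

def g_keep(t, gamma=8):
    # First example filter: keep a difference value only when it reaches gamma.
    return np.where(t >= gamma, t, 0)

def g_binary(t, gamma=8):
    # Second example filter: binarize, 1 when the difference reaches gamma.
    return np.where(t >= gamma, 1, 0)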
Artifacts in a Macroblock
Consider a block whose upper-left corner is located at (x, y). It is desired to determine the extent to which the block is affected by artifacts, such as temporal error concealment artifacts.
Define Θi(x, y) as the number of non-zero values in {θi(x, y), θi(x, y+1), ..., θi(x, y+M−1)}, and φi(x, y) as the number of non-zero values in {Φi(x, y), Φi(x+1, y), ..., Φi(x+M−1, y)}. That is, Θi(x, y) and φi(x, y) denote the number of non-zero values along a vertical line and a horizontal line of length M starting from (x, y), respectively. (Capital Θi is used here to distinguish the count from the underlying map θi.)
a) shows intersample differences for one embodiment under the present principles for a region whose upper-left corner is located at (x, y). Differences between the pixels on the edges of the image region and corresponding pixels outside the region are first found. In this example, the pixels that are outside the region are one pixel position away. Vertical differences are found across the top and bottom of the region, while horizontal differences are found for the left and right sides of the region. Each difference is then subjected to a weight, or mask, as in Equation (1) above. This is followed by filtering, or thresholding, as in Equation (2). The resulting values along each side of the region are then checked to determine how many of the values are above a threshold. If the threshold is taken to be zero, the number of non-zero values for each side, for example, is determined. A rule is then used to find a level of artifacts present in the region, as further described below.
b) indicates the notations that are used in the analysis. The four corners of the region, for example a macroblock, are located at (x, y), (x, y+M−1), (x+M−1, y), and (x+M−1, y+M−1), respectively, where M is the length of the macroblock edge.
The number of non-zero intersample differences at the upper boundary is then identified as φi(x, y), the number of non-zero intersample differences at the bottom boundary is identified as φi(x, y+M−1), the number at the left boundary is identified as Θi(x, y), and the number at the right boundary is identified as Θi(x+M−1, y).
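As an illustrative sketch, the four boundary counts could be gathered as follows, assuming the filtered maps are stored as NumPy arrays indexed as array[y, x]; the name boundary_counts is hypothetical.

import numpy as np

def boundary_counts(theta, phi, x, y, M=16):
    top = np.count_nonzero(phi[y, x:x + M])              # phi_i(x, y)
    bottom = np.count_nonzero(phi[y + M - 1, x:x + M])   # phi_i(x, y+M-1)
    left = np.count_nonzero(theta[y:y + M, x])           # Theta_i(x, y)
    right = np.count_nonzero(theta[y:y + M, x + M - 1])  # Theta_i(x+M-1, y)
    return top, bottom, left, right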
According to the previous description, higher intersample differences occur frequently at the macroblock boundary, for example, when the macroblock is affected by temporal error concealment artifacts. The rule for determining whether a macroblock is affected by artifacts can be implemented, for example, by a large lookup table, or by a logical combination of the filtered outputs.
One exemplary rule is,
if:
1. at least two of the four values φi(x, y), φi(x, y+M−1), Θi(x, y) and Θi(x+M−1, y) are larger than a threshold c1; and
2. the sum of the values φi(x, y), φi(x, y+M−1), Θi(x, y) and Θi(x+M−1, y) is larger than a threshold c2,    (3)
then:
the macroblock is deemed to be affected by artifacts.
If the conditions listed in (3) are not satisfied, the macroblock is deemed to not be affected by artifacts. This exemplary rule has particular applicability to temporal error concealment artifact detection, and the logical expression in (3) produces a binary result. However, other rules that produce an analog value can be used to determine the level of artifacts in a region of an image.
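The exemplary rule (3) admits a direct coding, sketched below; the name is_artifact_block is hypothetical, and the default thresholds anticipate the exemplary parameter values given later in this description.

def is_artifact_block(top, bottom, left, right, c1=4, c2=16):
    counts = (top, bottom, left, right)
    enough_sides = sum(c > c1 for c in counts) >= 2   # condition 1 of (3)
    enough_total = sum(counts) > c2                   # condition 2 of (3)
    return enough_sides and enough_total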
Proposed Model for Artifacts Level of a Frame
For an M×M image region, such as a macroblock, whose upper-left corner, for example, is located at (x, y), a method is proposed in the previous paragraphs to evaluate whether that region is affected by artifacts, such as those caused by temporal error concealment. Using this proposed method, it is possible to define to what extent a frame fi is affected by artifacts.
STEP 1: Initial Settings for all Image Regions
For every pixel fi(x, y), set the artifact level d(fi, x, y) = 1 if the image region whose upper-left corner is located at (x, y) satisfies the conditions in (3); otherwise, set d(fi, x, y) = 0.
STEP 2: Erase Overlapping
For two pixels fi(x1, y1) and fi(x2, y2) satisfying
x1 = x2, |y1 − y2| < M
or
y1 = y2, |x1 − x2| < M    (4)
the edges of the corresponding image regions whose upper-left corners are located at these two pixels overlap to some extent. One example of this is shown in
Decreasing the influence of an overlap can be achieved, for example, by scanning the pixels fi(x, y) in the frame from left to right and top to bottom and, whenever d(fi, x, y) = 1, setting d(fi, x+j, y) = d(fi, x, y+j) = 0 for every j = 1−M, 2−M, ..., −2, −1, 1, 2, ..., M−1. This procedure will allow at most one of any two edge-overlapping image regions to be identified as being affected by artifacts.
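STEP 1 and STEP 2 might be sketched together as follows, reusing the hypothetical helpers boundary_counts and is_artifact_block from the earlier sketches; the exhaustive scan over every pixel position is written for clarity rather than efficiency.

import numpy as np

def detect_and_erase(theta, phi, M=16, c1=4, c2=16):
    H, W = theta.shape
    d = np.zeros((H, W), dtype=np.uint8)
    # STEP 1: flag every upper-left corner whose region satisfies rule (3).
    for y in range(H - M + 1):
        for x in range(W - M + 1):
            if is_artifact_block(*boundary_counts(theta, phi, x, y, M), c1, c2):
                d[y, x] = 1
    # STEP 2: scan left to right, top to bottom; whenever d = 1, clear all
    # corners whose regions would share overlapping edges, per (4).
    for y in range(H - M + 1):
        for x in range(W - M + 1):
            if d[y, x]:
                for j in range(1 - M, M):
                    if j == 0:
                        continue
                    if 0 <= x + j < W:
                        d[y, x + j] = 0   # same row: d(f_i, x+j, y) = 0
                    if 0 <= y + j < H:
                        d[y + j, x] = 0   # same column: d(f_i, x, y+j) = 0
    return d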
STEP 3: Evaluation of Artifacts of Frame
For every pixel in the frame with value d(fi, x, y) = 1, there is a corresponding macroblock whose upper-left corner is (x, y). The ratio of the number of pixels covered by all these macroblocks to the frame size is defined to be the overall evaluation of artifacts of fi, denoted as d(fi).
It should be noted that the above-mentioned macroblocks will not have overlapping edges (as shown, for example, in
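A minimal sketch of STEP 3 follows; the boolean coverage mask guarantees that any residual area overlap between flagged regions (which (4) does not erase, e.g. diagonal overlap) is counted only once.

import numpy as np

def frame_artifact_level(d, M=16):
    H, W = d.shape
    covered = np.zeros((H, W), dtype=bool)
    for y, x in zip(*np.nonzero(d)):      # corners with d(f_i, x, y) = 1
        covered[y:y + M, x:x + M] = True  # mark the pixels of that region
    return covered.sum() / float(H * W)   # d(f_i): covered fraction of frame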
STEP 4: Evaluation of Artifacts for a Video Sequence
In order to determine the artifacts evaluation of a video sequence when the artifacts evaluation for every frame or block of the video sequence is known, a pooling problem must be solved. Since pooling strategies are well known in this field of technology, one of ordinary skill in the art can conceive of methods, using the present principles, to evaluate the level of artifacts in a video sequence, and such methods are within the scope of these principles.
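The pooling itself is left open by this disclosure; purely as one labeled assumption, the sketch below averages the per-frame levels d(fi) over time. Other strategies, such as percentile or worst-case pooling, could be substituted without departing from the present principles.

def sequence_artifact_level(frame_levels):
    # frame_levels: iterable of per-frame values d(f_i), one per frame.
    levels = list(frame_levels)
    return sum(levels) / len(levels) if levels else 0.0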
Parameter Values
In one exemplary embodiment of the present principles, the parameters mentioned in the previous paragraphs are set as follows:
mask(x, y)≡1, for simplicity, so that masking effects are not considered in this particular embodiment;
γ=8;
M=16;
c1=4, c2=16.
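Tying the earlier sketches together under these exemplary parameter values, an illustrative end-to-end usage (assuming frame already holds a decoded luma plane as a 2D array) might read:

theta, phi = intersample_maps(frame)                       # Equation (1), mask = 1
theta, phi = g_keep(theta, gamma=8), g_keep(phi, gamma=8)  # Equation (2), gamma = 8
d = detect_and_erase(theta, phi, M=16, c1=4, c2=16)        # STEPs 1 and 2
level = frame_artifact_level(d, M=16)                      # STEP 3: d(f_i)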
Concealment artifact detection for frames is easier when bitstream information is provided. However, there are scenarios in which the bitstream itself is unavailable. In these situations, concealment artifact detection must be based on the image content. The present principles provide such a detection algorithm to detect the artifact level in regions of an image, a frame, or a video sequence.
A presently preferred solution taught in this disclosure is a pixel layer channel artifact detection method, although one skilled in the art can conceive of one or more implementations for a bitstream layer embodiment using the same principles. Although many of the embodiments described relate to artifacts such as those caused by temporal error concealment, it should be understood that the described principles are not limited to temporal error concealment artifacts, and can also relate to detection of artifacts caused by other sources, for example, filtering, channel impairments, or noise.
One embodiment of the present principles is shown in
Another embodiment of the present principles is shown in
Another embodiment of the present principles is shown in
Another embodiment of the present principles is shown in
One or more implementations having particular features and aspects of the presently preferred embodiments of the invention have been provided. However, features and aspects of described implementations can also be adapted for other implementations. For example, these implementations and features can be used in the context of other video devices or systems. The implementations and features need not be used in a standard.
Reference in the specification to “one embodiment” or “an embodiment” or “one implementation” or “an implementation” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
The implementations described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed can also be implemented in other forms (for example, an apparatus or computer software program). An apparatus can be implemented in, for example, appropriate hardware, software, and firmware. The methods can be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (“PDAs”), and other devices that facilitate communication of information between end-users.
Implementations of the various processes and features described herein can be embodied in a variety of different equipment or applications. Examples of such equipment include a web server, a laptop, a personal computer, a cell phone, a PDA, and other communication devices. As should be clear, the equipment can be mobile and even installed in a mobile vehicle.
Additionally, the methods can be implemented by instructions being performed by a processor, and such instructions (and/or data values produced by an implementation) can be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact disc, a random access memory (“RAM”), or a read-only memory (“ROM”). The instructions can form an application program tangibly embodied on a processor-readable medium. Instructions can be, for example, in hardware, firmware, software, or a combination. Instructions can be found in, for example, an operating system, a separate application, or a combination of the two. A processor can be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a processor-readable medium (such as a storage device) having instructions for carrying out a process. Further, a processor-readable medium can store, in addition to or in lieu of instructions, data values produced by an implementation.
As will be evident to one of skill in the art, implementations can use all or part of the approaches described herein. The implementations can include, for example, instructions for performing a method, or data produced by one of the described embodiments.
A number of implementations have been described. Nevertheless, it will be understood that various modifications can be made. For example, elements of different implementations can be combined, supplemented, modified, or removed to produce other implementations. Additionally, one of ordinary skill will understand that other structures and processes can be substituted for those disclosed and the resulting implementations will perform at least substantially the same function(s), in at least substantially the same way(s), to achieve at least substantially the same result(s) as the implementations disclosed. Accordingly, these and other implementations are contemplated by this disclosure and are within the scope of this disclosure.