This application claims priority from GB 0703889.6 filed Feb. 28, 2007.
This invention relates to compression of video signals containing fades and flashes.
Motion-compensated predictive video compression exploits a similarity of successive pictures by making predictions from previously coded pictures. The pictures from which the predictions are taken can come from the past (forward prediction) or the future (backward prediction) or a combination of the two (bi-directional prediction) thus enabling the prediction of uncovered areas. Therefore, state-of-the-art video compression engines can cope with most types of motion and critical picture material.
However, there are two types of effects which can cause severe picture degradation and are worthy of special mitigating treatment: video sequences that include fades and video sequences that contain short, bright flashes such as those that occur when a still camera with a flash apparatus is used in the field of view. During fades a motion estimator often produces random motion vectors due to video luminosity changes. Similarly, predictions taken from pictures with flashes such as camera flashes are poor indications of video behaviour and lead to poor compression performance.
According to a first aspect of the present invention there is provided a method of compressing a video signal comprising: on detecting a flash in the video signal, inserting at least one non-referenced frame to coincide with a duration of the flash; and on detecting a fade in the video signal, changing an order in which frames are coded such that referenced frames are positioned immediately before and after the fade and a pyramidal structure of bidirectionally coded frames is used for the duration of the fade.
Preferably, quantisation is increased in a frame in which a flash is detected.
Preferably, on detection of a fade a search range of a motion estimator is reduced for the duration of the fade.
Advantageously, detecting a flash in the video signal comprises the steps of: calculating a first average luminance of fields of at least one frame immediately preceding a frame of interest and a second average luminance of fields of at least one frame immediately succeeding the frame of interest; calculating whether the luminance of a top field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by a first predetermined threshold and if so signalling that a flash occurred in the top field of the frame of interest; and calculating whether the luminance of a bottom field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by the first predetermined threshold and if so signalling that a flash occurred in the bottom field of the frame of interest.
Conveniently, detecting a flash in the video signal comprises the steps of: calculating a first average luminance of the fields of the current frame; calculating a second average luminance of the fields of a frame preceding the current frame by two frames; calculating whether the luminance of a bottom field of the frame preceding the current frame exceeds a greater of the first average luminance and the second average luminance by a first predetermined threshold and if so signalling that a flash occurred in the bottom field of the previous frame; and calculating whether the luminance of a top field of the frame preceding the current frame exceeds a greater of the first average luminance and the second average luminance by a first predetermined threshold and if so signalling that a flash occurred in the top field of the previous frame.
Advantageously, detecting a fade in the video signal comprises the steps of: calculating an average luminance for each of a first plurality of successive fields; calculating successive differences in the average luminance between each of the first plurality of successive fields; calculating a sum of the successive differences; calculating an average of a second plurality of the sum and such sums of differences of immediately preceding fields; calculating an absolute difference between the sums and their average; and signalling that a fade is detected if the average is greater than a second predetermined threshold and if each of the absolute differences is less than a third predetermined threshold.
Conveniently, detecting a fade in the video signal comprises the steps of: calculating an average luminance for each of four successive fields; calculating successive differences in the average luminance between each of the four fields; calculating a sum of the successive differences; calculating an average of the sum and two such sums for successive differences of immediately preceding sets of fields; calculating an absolute difference between the three sums and their average; and signalling that a fade is detected if the average is greater than a second predetermined threshold and if each of the absolute differences is less than a third predetermined threshold.
According to a second aspect of the invention, there is provided an encoder for a video signal comprising: compensating delay means, fade detector means and flash detector means arranged such that a video signal may be input in parallel to the compensating delay means, the fade detector means and the flash detector means; frame re-ordering means having an input connected to an output of the compensating delay means; coding mode means having inputs from the fade detector means and the flash detector means to output a control signal to the frame re-ordering means; compression coding loop means having inputs from the frame re-ordering means and the coding mode means; motion estimator means having inputs from the frame re-ordering means and the coding mode means to output a motion vector to the compression coding loop means and to input a reconstructed video signal from the compression coding loop means; and entropy coding means to receive an input from the compression coding loop and to output a compressed video signal; wherein the frame re-ordering means is arranged such that: on detection of a flash in the video signal by the flash detector means, at least one non-referenced frame is inserted to coincide with a duration of the flash; and on detection of a fade in the video signal by the fade detector means, an order in which frames are coded is changed such that referenced frames are positioned immediately before and after the fade and a pyramidal structure of bidirectionally coded frames is used for the duration of the fade.
Advantageously, the fade detector means is arranged to: calculate an average luminance for each of a first plurality of successive fields; calculate successive differences in the average luminance between each of the first plurality of fields; calculate a sum of the successive differences; calculate an average of a second plurality of the sum and such sums of differences of immediately preceding fields; calculate an absolute difference between the sums and their average; and signal that a fade is detected if the average is greater than a second predetermined threshold and if each of the absolute differences is less than a third predetermined threshold.
Conveniently, the fade detector means is arranged to: calculate an average luminance for each of four successive fields; calculate successive differences in the average luminance between each of the four fields; calculate a sum of the successive differences; calculate an average of the last three sums; calculate an absolute difference between the three sums and their average; and signal that a fade is detected if the average is greater than a second predetermined threshold and if each of the absolute differences is less than a third predetermined threshold.
Advantageously, the flash detector means is arranged to: calculate a first average luminance of fields of at least one frame immediately preceding a frame of interest and a second average luminance of fields of at least one frame immediately succeeding the frame of interest; calculate whether the luminance of a top field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by a first predetermined threshold and if so to signal that a flash occurred in the top field of the frame of interest; and calculate whether the luminance of a bottom field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by the first predetermined threshold and if so to signal that a flash occurred in the bottom field of the frame of interest.
Conveniently, the flash detector means is arranged to: calculate an average luminance of each field for five field periods; calculate a first average luminance of fields of a frame immediately preceding a frame of interest and a second average luminance of fields of a frame immediately succeeding the frame of interest; calculate whether the luminance of a top field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by a first predetermined threshold and if so to signal that a flash occurred in the top field of the frame of interest; and calculate whether the luminance of a bottom field of the frame of interest exceeds the greater of the first average luminance and the second average luminance by the first predetermined threshold and if so to signal that a flash occurred in the bottom field of the frame of interest.
Preferably, the compression coding loop is arranged to increase quantisation in a frame in which a flash is detected.
Preferably, the motion estimation means is arranged to reduce a search range on detection of a fade.
According to a third aspect of the invention, there is provided a computer program product comprising program code means arranged to perform all the steps of the method described above when that program code means is run on a computer.
According to a fourth aspect of the invention, there is provided a computer-readable medium embodying a computer program product as described above.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Throughout the description, identical reference numerals are used to identify like parts.
This disclosure describes methods for detecting video fades and flashes, such as camera flashes, and mechanisms to improve compression performance under those conditions. The detection is carried out ahead of the actual coding process. Once a video fade or camera flash has been detected, the encoding process can be controlled to code these effects more efficiently.
Compression improvements for video signals with fades or flashes are described herein on the basis of the H.264 compression standard. For the purpose of this explanation four types of picture can be distinguished in H.264:
Intra-coded pictures (I) coded independently of any other pictures;
Forward predicted pictures (P) which take predictions from previous I or P pictures;
Non-referenced bi-directionally predicted pictures (Bnr) which take predictions from past and future I or P pictures but in which no predictions are taken from Bnr pictures;
Referenced bi-directionally predicted pictures (Br) which also take predictions from past and future I or P pictures, but in which predictions are also taken from Br pictures.
The combination of Bnr and Br pictures makes it possible to generate a pyramid B coding structure such as shown in
Such coding structures are generally more efficient than structures without referenced B pictures, such as MPEG 2. However, if extended to more layers of hierarchical coding such as shown in
The coding performance of flashes can be improved by ensuring that corresponding pictures are coded as Bnr pictures, that is, pictures that are not involved in the coding of neighbouring pictures.
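The layering of Br and Bnr frames between two referenced frames can be illustrated with a minimal sketch. It assumes a simple midpoint-splitting rule, which is one common way to form a hierarchical B structure and is not necessarily the structure shown in the figures; the function name `pyramid_order` and the type labels as strings are hypothetical.

```python
def pyramid_order(first, last):
    """Return (display_index, frame_type) pairs, in coding order, for the
    frames strictly between two already-coded referenced frames.

    Assumption: each gap is split at its midpoint. A B frame is marked
    "Br" only if further B frames lie between it and one of its
    neighbours, so that something can still predict from it; otherwise
    it is a leaf of the pyramid and is marked "Bnr".
    """
    out = []

    def split(lo, hi):
        if hi - lo < 2:          # no frame left between lo and hi
            return
        mid = (lo + hi) // 2
        referenced = (mid - lo > 1) or (hi - mid > 1)
        out.append((mid, "Br" if referenced else "Bnr"))
        split(lo, mid)           # frames displayed before the midpoint
        split(mid, hi)           # frames displayed after the midpoint

    split(first, last)
    return out


if __name__ == "__main__":
    # Three B frames between referenced frames 0 and 4.
    print(pyramid_order(0, 4))
```

In this sketch the frames at the bottom layer are never used as references, which is the property the text relies on when a frame containing a flash is coded as Bnr.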
In H.264, video frames can be coded as one or two pictures, i.e. one picture for the entire frame or two pictures, one for each of two interlaced fields. Since picture coding mode decisions are usually made for an entire frame, reference will be made hereinafter to I, P, Br or Bnr frames rather than pictures.
Three types of fades can be distinguished: fade to black, fade from black and cross-fades from one video signal to another. In the first two cases and, to some extent, in the third case, a fade can be detected by measuring an average luminance of a video signal over a number of fields. Referring to
1. Calculate, step 61, an average luminance value for each field, Yav(n) and store for four field periods;
2. Calculate, step 62, a difference in average luminance between the last four fields to determine changes in luminance between neighbouring fields, although it will be understood that different numbers of fields could be used in different embodiments of the invention:
diff1=Yav(n)−Yav(n−1)
diff2=Yav(n−1)−Yav(n−2)
diff3=Yav(n−2)−Yav(n−3);
3. Calculate, step 63, a sum of the three field differences and store for three field periods, to provide a measure of the variation in luminance over a series of successive fields, although it will be understood that fewer or more differences could be summed in different embodiments of the invention:
sum(n)=diff1+diff2+diff3;
4. Calculate, step 64, an average over the last three sums, to detect any trend in the luminance, although it will be understood that fewer or more sums could be averaged in different embodiments of the invention:
av=(sum(n)+sum(n−1)+sum(n−2))/3;
5. Calculate, step 65, an absolute difference between the three sums and their average, to isolate any large variations from the average luminance indicative of a change in luminance other than a smooth transition:
d1=|av−sum(n)|
d2=|av−sum(n−1)|
d3=|av−sum(n−2)|; and
6. Determine, step 66, whether (av>threshold1 AND
d1<threshold2 AND
d2<threshold2 AND
d3<threshold2); if so, a fade is detected.
In step 66 the condition on av detects a significant change in luminance from Yav(n−5) to Yav(n): because the field differences telescope, sum(n)=Yav(n)−Yav(n−3), so the three stored sums together span the six fields from Yav(n−5) to Yav(n). The conditions on d1, d2 and d3 detect a relatively smooth transition from the luma level at the start of the fade to that at the end, that is, one without spikes in luminance which might otherwise have a disproportionate effect on the average change av.
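Steps 61 to 66 can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the class name `FadeDetector` and the parameter names `level_threshold` (standing in for the threshold applied to av) and `smooth_threshold` (standing in for the threshold applied to d1, d2 and d3) are hypothetical, and the per-field average luminance is assumed to be supplied by the caller.

```python
from collections import deque


class FadeDetector:
    """Sketch of the fade test: four field averages, three stored sums."""

    def __init__(self, level_threshold, smooth_threshold):
        self.level_threshold = level_threshold    # compared with av
        self.smooth_threshold = smooth_threshold  # compared with d1..d3
        self.yav = deque(maxlen=4)    # Yav(n-3) .. Yav(n), oldest first
        self.sums = deque(maxlen=3)   # sum(n-2) .. sum(n), oldest first

    def push(self, yav_n):
        """Feed one field's average luminance; return True if a fade is detected."""
        self.yav.append(yav_n)                    # step 61
        if len(self.yav) < 4:
            return False
        y = list(self.yav)
        diff1 = y[3] - y[2]                       # step 62
        diff2 = y[2] - y[1]
        diff3 = y[1] - y[0]
        self.sums.append(diff1 + diff2 + diff3)   # step 63
        if len(self.sums) < 3:
            return False
        av = sum(self.sums) / 3                   # step 64
        deviations = [abs(av - s) for s in self.sums]  # step 65
        # Step 66: large average change and no sum far from that average.
        return av > self.level_threshold and all(
            d < self.smooth_threshold for d in deviations
        )


if __name__ == "__main__":
    detector = FadeDetector(level_threshold=3, smooth_threshold=1)
    ramp = [16, 18, 20, 22, 24, 26]   # luminance rising by 2 per field
    print([detector.push(v) for v in ramp])
```

The comparison on av is kept signed, exactly as the condition is written in step 66, so this sketch responds to rising luminance only; the text does not state whether a magnitude is intended for a fade to black.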
Camera flashes typically last for only one to two fields. A detection algorithm for flashes is much simpler than that for fades. Referring to
1. Calculate, step 71, an average luminance value for each field, Yav(n) and store for five field periods;
2. At the end of each frame period, calculate, step 72, an average luminance value of the top and bottom fields of the current frame and those of a frame two frames earlier, although it will be understood that in a different embodiment the averages could be taken over more or fewer fields:
current_av=(Yav(n)+Yav(n−1))/2
previous_av=(Yav(n−4)+Yav(n−5))/2;
3. If (Yav(n−2)>maximum(current_av,previous_av)+threshold), that is if the luminance of the preceding bottom field is significantly greater than the larger of the average field luminances of the preceding and succeeding frames, then a flash has been detected, step 73, on a bottom field of the previous frame; and
4. If (Yav(n−3)>maximum(current_av,previous_av)+threshold), that is if the luminance of the preceding top field is significantly greater than the larger of the average field luminances of the preceding and succeeding frames, then a flash has been detected, step 74, on the top field of the previous frame.
Comparing the two fields to the maximum luminance values of the previous and next frame prevents false detection on scene cuts.
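Steps 71 to 74 can be sketched in a few lines. This is an illustrative sketch only: the function name `detect_flash` and its argument layout are hypothetical, and the six field averages Yav(n−5) to Yav(n) are assumed to be passed in oldest first, with the top field of each frame preceding its bottom field.

```python
def detect_flash(yav, threshold):
    """Return (flash_on_top_field, flash_on_bottom_field) for the previous frame.

    yav is assumed to hold six field averages, oldest first:
    [Yav(n-5), Yav(n-4), Yav(n-3), Yav(n-2), Yav(n-1), Yav(n)].
    """
    if len(yav) != 6:
        raise ValueError("expected six field averages")
    previous_av = (yav[0] + yav[1]) / 2   # frame two frames earlier
    top, bottom = yav[2], yav[3]          # fields of the previous frame
    current_av = (yav[4] + yav[5]) / 2    # fields of the current frame
    # Step 72 result: the brighter of the two neighbouring frames.
    limit = max(current_av, previous_av) + threshold
    # Steps 73 and 74: each field is tested independently.
    return top > limit, bottom > limit


if __name__ == "__main__":
    # A single bright bottom field between two ordinary frames.
    print(detect_flash([50, 50, 50, 120, 50, 50], threshold=20))
    # A scene cut to brighter material: the next frame is just as bright.
    print(detect_flash([50, 50, 120, 120, 120, 120], threshold=20))
```

Taking the maximum of the two neighbouring frame averages is what keeps the second case from being reported as a flash, since the frame after a cut stays at the new luminance level.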
In use, the uncompressed video signal is input to the fade detector 2 and the flash detector 3 as well as to the compensating delay 1. The compensating delay 1 is necessary because the algorithms take several field periods before a fade or flash is detected. It is in the nature of the fade detector and flash detector algorithms that if one property is detected the other one will not be detected.
Referring to
Referring to
Alternative embodiments of the invention can be implemented as a computer program product for use with a computer system, the computer program product being, for example, a series of computer instructions stored on a tangible data recording medium, such as a diskette, CD-ROM, ROM, or fixed disk, or embodied in a computer data signal, the signal being transmitted over a tangible medium or a wireless medium, for example microwave or infrared. The series of computer instructions can constitute all or part of the functionality described above, and can also be stored in any memory device, volatile or non-volatile, such as semiconductor, magnetic, optical or other memory device.
Although the present invention has been described with reference to preferred embodiments, workers skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the invention.
| Number | Date | Country | Kind |
|---|---|---|---|
| 0703889.6 | Feb 2007 | GB | national |