The present invention relates to a method and apparatus for detecting a transition between video segments. In particular, but not exclusively, it relates to detecting and verifying the transition (or boundary) between a program and a commercial block.
Simple commercial detection algorithms, for example those using black-frame and audio pressure features to detect the transitions between a TV program and a commercial block, are well known. It has been found that these are sometimes inaccurate; for example, new items or program intros can be mistaken for commercials and merged into the commercial block. Therefore, as well as the commercials, portions of the program may be skipped.
Many channels include a logo displayed in a corner of the screen during the TV program. These logos do not appear during commercials. Therefore, some known commercial block detectors use logo detection to establish a transition between the program and the commercial block. These have also proven to be inaccurate as logos are not always properly overlaid during live events, boundaries are not known, some channels do not use logos, transparent logos cannot be detected on a white background etc.
To improve the performance of such commercial block detectors, some utilize logo presence detection to suppress, for example, black-frame or letterbox detections with logos on them. However, these cannot deal with logos improperly overlaid by broadcasters or with misdetections of the logo itself.
The present invention seeks to provide a technique for accurately detecting and hence verifying the transition between video segments.
This is achieved according to a first aspect of the present invention by a method for detecting a transition between a first video segment and a second video segment, the method comprising the steps of: detecting a first transition between a first video segment and a second video segment by a first detection method; detecting a second transition between said first video segment and said second video segment by a second detection method, said first detection method being different from said second detection method; determining whether said second method is reliable by comparing said first transition with said second transition; and using at least said second transition to determine a final transition if said second method is determined to be reliable and not using said second transition to determine said final transition if said second method is determined to be unreliable.
This is also achieved according to a second aspect of the present invention by an apparatus for detecting a transition between a first video segment and a second video segment, the apparatus comprising: a first detector for detecting a first transition between a first video segment and a second video segment; a second detector for detecting a second transition between said first video segment and said second video segment, said first detector being different to said second detector; and a comparator for determining whether said second method is reliable by comparing said first transition with said second transition and using at least said second transition to determine a final transition if said second method is determined to be reliable and not using said second transition to determine said final transition if said second method is determined to be unreliable.
Certain methods either detect transitions in certain content very well or not well at all, e.g. logo detectors. By comparing the transitions detected with such a method with the transitions detected with another method, it can be determined whether it is advisable to use the second method to determine the final transitions or not.
In an embodiment of the present invention, the second detection method comprises a simple logo detector. By incorporating a first detection method, the system can easily correct for logos improperly overlaid by broadcasters or for misdetections of the logo detector. If the logo detections are reliable, the boundaries can be tuned using the logo information together with other information from the first detection method to obtain more accurate boundaries.
Said final transition may be based solely or predominantly on said second transition if said second method is determined to be reliable.
Said final transition may be determined by using said second transition to refine said first transition.
In an embodiment of the present invention, the second detection method is determined as reliable by comparing start and/or end times of the first and/or second video segments determined by the first and the second detection methods; determining a ratio of the differences between corresponding start and/or end times of the first and second video segments; and determining the second detection method reliable if the determined ratio of differences is below a threshold value.
Alternatively, the second detection method is determined as reliable by determining a ratio of a corrected duration of the first video segments detected by the second detection method over the total duration of the video stream of first and second video segments; and determining the second detection method reliable if the determined ratio is above a second threshold value. Alternatively, the second detection method is determined as reliable by determining a ratio of a corrected duration of the first video segments detected by the second detection method over a duration of the corresponding first video segments detected by the first detection method; and determining the second detection method reliable if the determined ratio is above a third threshold value.
Reliability of the second detection method may be determined by any one of the above ratios or any combination thereof.
For a more complete understanding of the present invention, reference is made to the following description in conjunction with the accompanying drawings, in which:
a), (b) and (c) are graphical representations of an example of the output of the detectors and comparator of the apparatus of
With reference to
With reference to
The second detector 107 receives the demultiplexed video presentation time stamps. The second detector 107 divides this into a plurality of frames. Each frame is analyzed to detect a graphical object such as a logo or recognized text or the like. The second detector 107 outputs a plurality of logo free episodes, namely an indication of the transition between appearance and/or disappearance of a graphic object (logo), step 201.
When the end of the data stream, step 203, is detected, the comparator combines the outputs of the detectors 105, 107 and generates a final list of the start and end times (transitions) of each video segment, i.e. the commercial block start and end times.
This is achieved by first estimating the reliability of the second detector 107, step 205. If the second detector 107 is determined to be reliable, step 207, the transitions found by the second detector 107 are processed, combined with the transitions detected by the first detector 105, and output, steps 209, 211, 215.
If the second detector 107 is determined to be unreliable, step 207, only the transitions found by the first detector 105 are processed, step 213, and output, step 215.
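The branching described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function name `combine_detections`, the `Interval` representation and the two callables `is_reliable` and `refine` are hypothetical, and the way reliable logo detections are combined with the first detector's transitions is deliberately left to the caller.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: (start_time, end_time) in seconds.
Interval = Tuple[float, float]


def combine_detections(
    first: List[Interval],
    second: List[Interval],
    is_reliable: Callable[[List[Interval], List[Interval]], bool],
    refine: Callable[[List[Interval], List[Interval]], List[Interval]],
) -> List[Interval]:
    """Return the final list of commercial blocks (illustrative only)."""
    # Reliability estimate and decision (cf. steps 205 and 207).
    if is_reliable(first, second):
        # Second detector trusted: combine both outputs (cf. steps 209, 211).
        return refine(first, second)
    # Second detector not trusted: keep the first detector only (cf. step 213).
    return list(first)


# Example: candidate blocks from the first detector, logo free episodes
# from the second detector, and trivial stand-in callables.
blocks = [(600.0, 900.0)]
logo_free = [(580.0, 920.0)]
final = combine_detections(blocks, logo_free, lambda f, s: False, lambda f, s: s)
```

Passing the reliability test and the refinement as parameters keeps the sketch independent of any particular reliability criterion.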
Determination of reliability of the second detector 107 will now be described in more detail with reference to
a) is a graphical representation of the output of the first detector 105;
b) is a graphical representation of the output of the second detector 107;
c) is a graphical representation of a comparison of the output shown in
As mentioned above, the comparator first checks whether the second detector 107 (logo detector) is reliable. Such a check is needed because some broadcasters forget to overlay the logo after a commercial break, especially during live events, and some channels (almost) always overlay a logo, including during commercial breaks; in either case the logo data is not usable for commercial block detection. The transitions t11, t12, t13, t14, t21, t22, t23, t24 of
The ratio between the duration of logo free episodes t21 to t22 and t23 to t24 outside the detected commercial blocks and the video duration is calculated as follows:
Ratio_A=V1LogoFreeNoOverlap*100%/VideoDuration (1)
i.e. the duration of the logo free episodes (second detector) having no overlap with the commercial blocks detected by the first detector, over the duration of the video stream.
In general this ratio is small (<5%), since the logo normally disappears only 20 seconds before the start of a commercial block and reappears 20 seconds after the end of a commercial block. However, if the logo detector fails for short periods because of static content or “invisible transparent logos on a white background”, this percentage can be slightly higher. If this ratio exceeds 15%, the broadcaster probably forgot to overlay the logos for a longer period.
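As an illustration of equation (1), the following Python sketch computes Ratio_A from two interval lists. It assumes that V1LogoFreeNoOverlap is the logo free time falling outside the commercial blocks of the first detector, and that those blocks do not overlap one another; the function names and the example timings are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_time, end_time) in seconds


def overlap(a: Interval, b: Interval) -> float:
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def ratio_a(logo_free: List[Interval],
            commercial_blocks: List[Interval],
            video_duration: float) -> float:
    """Ratio_A in percent, assuming non-overlapping commercial blocks."""
    no_overlap = 0.0
    for episode in logo_free:
        covered = sum(overlap(episode, block) for block in commercial_blocks)
        no_overlap += (episode[1] - episode[0]) - covered
    return 100.0 * no_overlap / video_duration


# One-hour recording with two candidate commercial blocks.
blocks = [(600.0, 900.0), (2000.0, 2300.0)]
# Logo disappears 20 s before and reappears 20 s after each block.
normal = [(580.0, 920.0), (1980.0, 2320.0)]
# Logo stays off for 15 minutes after the first block.
forgotten = [(580.0, 1800.0), (1980.0, 2320.0)]

ratio_normal = ratio_a(normal, blocks, 3600.0)        # 80 s outside the blocks
ratio_forgotten = ratio_a(forgotten, blocks, 3600.0)  # 960 s outside the blocks
```

In this made-up example the first case stays well below the 5% figure mentioned above, while the second exceeds the 15% figure.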
Next, the ratio between the total duration of the corrected logo free episodes against the video duration is calculated:
Ratio_B=CorrLogoFreeDuration*100%/VideoDuration (2)
wherein CorrLogoFreeDuration is the corrected logo free episode duration in which durations considered too short or too long are discarded.
If this ratio is very small (less than 3%), the logo is probably always visible, i.e. it is also visible during commercials, or the recording/broadcast does not contain any commercials.
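Equation (2) can be sketched in Python as follows. The description does not state which durations count as too short or too long, so the `min_len` and `max_len` defaults below are placeholder assumptions, as are the function names and example data.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_time, end_time) in seconds


def corrected_logo_free_duration(logo_free: List[Interval],
                                 min_len: float = 30.0,
                                 max_len: float = 900.0) -> float:
    """Sum of episode durations, discarding those outside [min_len, max_len].

    The bounds are illustrative assumptions, not values from the description.
    """
    return sum(end - start for start, end in logo_free
               if min_len <= end - start <= max_len)


def ratio_b(logo_free: List[Interval], video_duration: float) -> float:
    """Ratio_B in percent: corrected logo free duration over video duration."""
    return 100.0 * corrected_logo_free_duration(logo_free) / video_duration


# A 5 s misdetection is discarded; two 340 s episodes are kept.
episodes = [(100.0, 105.0), (580.0, 920.0), (1980.0, 2320.0)]
ratio_with_breaks = ratio_b(episodes, 3600.0)

# Channel that always shows its logo: no logo free episodes at all.
ratio_always_logo = ratio_b([], 3600.0)
```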
Finally, the total duration of the corrected logo free episodes (second detector) is compared against the total duration of the detections of the first detector:
Ratio_C=CorrLogoFreeDuration*100%/CBV1Duration (3)
wherein CorrLogoFreeDuration is the corrected logo free episode duration in which durations considered too short or too long are discarded and wherein CBV1Duration is the duration of the commercial block detected by the first detector (
If this ratio is less than 45% the logo free episodes are significantly shorter than the commercial blocks detected by the first detector. This happens when logos are overlaid on some of the commercials, or commercials are interleaved with a lot of trailers with logos on them.
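A short Python sketch of equation (3) follows. The function name and the example durations are hypothetical, and raising an error when the first detector found no commercial blocks is a design assumption of this sketch, since the description does not address that case.

```python
def ratio_c(corr_logo_free_duration: float, cbv1_duration: float) -> float:
    """Ratio_C in percent: corrected logo free duration over CBV1Duration."""
    if cbv1_duration <= 0.0:
        # Assumption: without first-detector blocks the ratio is undefined.
        raise ValueError("first detector reported no commercial block duration")
    return 100.0 * corr_logo_free_duration / cbv1_duration


# Logo free episodes slightly longer than the 600 s of detected blocks.
ratio_matching = ratio_c(680.0, 600.0)

# Trailers with logos inside the blocks: only 200 s are logo free.
ratio_trailers = ratio_c(200.0, 600.0)
```

In the second made-up case the ratio falls below the 45% figure mentioned above.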
The Logo Free Episodes are therefore considered unreliable if:
(Ratio_A>NOOVERLAPRATIO) OR (Ratio_B<VIDDURATIONRATIO) OR (Ratio_C<CBV1RATIO) (4)
where, in the particular example above, NOOVERLAPRATIO equals 15%, VIDDURATIONRATIO equals 3% and CBV1RATIO equals 45%. It can be appreciated that these threshold percentages are examples only and can be varied as appropriate. If none of these conditions holds, the detected Logo Free Episodes are assumed to be reliable and are used for fine-tuning the transitions detected by the first detector, or at least for verifying them.
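Condition (4) can be written as a small Python predicate. This is only a sketch: the function name is hypothetical, the constants mirror the example thresholds given above, and the ratio values in the usage lines are made up for illustration.

```python
# Example thresholds from the description, in percent; they may be varied.
NOOVERLAPRATIO = 15.0
VIDDURATIONRATIO = 3.0
CBV1RATIO = 45.0


def logo_free_episodes_unreliable(ratio_a: float,
                                  ratio_b: float,
                                  ratio_c: float) -> bool:
    """Evaluate condition (4) on three ratios expressed in percent."""
    return (ratio_a > NOOVERLAPRATIO
            or ratio_b < VIDDURATIONRATIO
            or ratio_c < CBV1RATIO)


# Made-up ratio triples (Ratio_A, Ratio_B, Ratio_C).
well_behaved = logo_free_episodes_unreliable(2.2, 18.9, 113.3)
logo_forgotten = logo_free_episodes_unreliable(26.7, 18.9, 113.3)
logo_always_on = logo_free_episodes_unreliable(0.0, 0.0, 0.0)
```

Because the three tests are joined by OR, a single failing ratio is enough to discard the logo data for the whole recording.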
The candidate commercial blocks detected by the first detector may be used to verify the reliability of the logo detections. For example, the total duration of the logo free episodes may be compared against the total duration of the candidate commercial blocks detected by the first detector, and/or the program duration may be compared against the total duration of the episodes where there is no overlap between a logo free episode and the candidate commercial blocks detected by the first detector.
If the logo detections of the second detector are considered not reliable the candidate commercial blocks detected by the first detector become the final transition and output of the apparatus. If the logo detections of the second detector are considered reliable, the transitions detected by the first detector are tuned using the logo detection of the second detector.
Although an embodiment of the present invention has been illustrated in the accompanying drawings and described in the foregoing detailed description, it will be understood that the invention is not limited to the embodiment disclosed, but is capable of numerous modifications without departing from the scope of the invention as set out in the following claims. The invention resides in each and every novel characteristic feature and each and every combination of characteristic features. Reference numerals in the claims do not limit their protective scope. Use of the verb “to comprise” and its conjugations does not exclude the presence of elements other than those stated in the claims. Use of the article “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
‘Means’, as will be apparent to a person skilled in the art, are meant to include any hardware (such as separate or integrated circuits or electronic elements) or software (such as programs or parts of programs) which reproduce in operation or are designed to reproduce a specified function, be it solely or in conjunction with other functions, be it in isolation or in co-operation with other elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the apparatus claim enumerating several means, several of these means can be embodied by one and the same item of hardware. ‘Computer program product’ is to be understood to mean any software product stored on a computer-readable medium, such as a floppy disk, downloadable via a network, such as the Internet, or marketable in any other manner.
| Number | Date | Country | Kind |
|---|---|---|---|
| 07107666.5 | May 2007 | EP | regional |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/IB08/51734 | 5/5/2008 | WO | 00 | 11/6/2009 |