This invention relates to a method and system of processing an image signal.
It is common for people to watch television and engage in other activities that include visual content such as watching DVDs. The user experience with respect to watching such video content will change in the future. The first signs are already visible, for example in the television products of Philips, in which lamps are added to enhance the experience of watching television. This process of adding further devices and additional functionality to augment an entertainment experience such as watching a film is growing. The venture “amBX” (see for example, www.ambx.com) is preparing the next steps to enhance an experience such as watching television even further, by playing scripts, along with the original audio/visual content, containing effect descriptions that could be offered to the user using a suitable augmentation system. Additional devices in the user's entertainment space provide augmentation to the video content.
For example, United States of America Patent Application Publication US2002169817 discloses a real-world representation system which comprises a set of devices, each device being arranged to provide one or more real-world parameters, for example audio and visual characteristics. At least one of the devices is arranged to receive a real-world description in the form of an instruction set of a markup language and the devices are operated according to the description. General terms expressed in the language are interpreted by either a local server or a distributed browser to operate the devices to render the real-world experience to the user. In this way a script is delivered that is used to control other devices alongside the television delivering the original content.
It is necessary, however, to author the scripts that will be used to create the additional effects in the additional devices. To assist the authoring process, many applications use content analysis to automate processes that would otherwise have to be carried out manually. In relation to content creation, for example amBX scripting, well-trained authors go through a movie frame by frame and choose specific frames where they wish to start or stop an additional effect, such as the display of one or more lights. These lighting effects have a color that the author adapts to something (background, explosion, object) in the video sequence.
Content analysis can offer great benefits for the scripting authors. For example, shot cuts can be detected automatically, giving the authors positions in time where the lights might be changed. Dominant colors can be extracted for each frame in a shot, or for a selection of sampled frames, from which a set of colors can be proposed that would match the colors in the specific shot or time interval. An example of the latter could be the MPEG 7 dominant color descriptor, which gives up to eight colors for a frame. Other methods for choosing colors can be used as well, for example histograms. The dominant colors give very good suggestions to the authors, especially the ones with a high occurrence rate. Often, however, the less obvious colors can be very distinguishing, and can be used to create effects that amaze the viewer. It is not possible at the present time to detect these interesting colors in order to propose them to the scripting author.
It is therefore an object of the invention to improve upon the known art.
According to a first aspect of the present invention, there is provided a method of processing an image signal comprising: receiving an image signal comprising a series of frames, calculating a plurality of dominant colors, over the series of frames, selecting a subset of frames of the image signal, calculating a plurality of dominant colors, over the subset of frames, comparing the dominant colors of the subset of frames to the dominant colors of the series of frames, and determining the dominant color in the subset of frames, with the largest difference from the closest dominant color in the series of frames.
According to a second aspect of the present invention, there is provided a system for processing an image signal comprising: a receiver arranged to receive an image signal comprising a series of frames, and a processor arranged to calculate a plurality of dominant colors, over the series of frames, to select a subset of frames of the image signal, to calculate a plurality of dominant colors, over the subset of frames, to compare the dominant colors of the subset of frames to the dominant colors of the series of frames, and to determine the dominant color in the subset of frames with the largest difference from the closest dominant color in the series of frames.
According to a third aspect of the present invention, there is provided a computer program product on a computer readable medium for processing an image signal, the product comprising instructions for: receiving an image signal comprising a series of frames, calculating a plurality of dominant colors, over the series of frames, selecting a subset of frames of the image signal, calculating a plurality of dominant colors, over the subset of frames, comparing the dominant colors of the subset of frames to the dominant colors of the series of frames, and determining the dominant color in the subset of frames with the largest difference from the closest dominant color in the series of frames.
Owing to the invention, it is possible to extract automatically, from an image signal, colors in a sequence that are of interest to an author, while going beyond the obvious selection of the most dominant color. The method and system provide, for a given time interval of a video sequence, a comparison of the colors of the frames in that interval to the colors of the whole video sequence, and find the color or colors in the time interval that differ the most from the dominant colors in the whole sequence. These colors are remarkable colors, and the more they differ from the dominant colors of the sequence, the more interesting they can be to a content author, for example to create amazing effects in amBX scripting.
In one embodiment, the image signal further comprises data comprising color information, and the steps of calculating a plurality of dominant colors include accessing the data. This provides automation of the processing of the colors by using metadata that is present within the image signal, for example in the form of MPEG 7 color information. The alternative to this is that the steps of calculating a plurality of dominant colors include performing an analysis of the color content of the frames. Various methods exist to extract the color(s) from an image frame, for example by using pixel counts of individual colors.
Advantageously, each dominant color comprises a representation in 3-dimensional color space, and the step of determining the dominant color in the subset of frames with the largest difference from the closest dominant color in the series of frames comprises resolving a Euclidean distance for each dominant color.
Preferably, the method further comprises generating a value, the value relating to the determined dominant color in the subset of frames with the largest difference in color from the closest dominant color in the series of frames, and defining the extent of the difference. In addition to identifying the remarkable color within a sequence of consecutive frames, the method and system can be configured to assign a value to the extent of the difference from the dominant color, which could be used in an automated authoring process, for example. For example, if yellow is detected as the most remarkable color in a frame sequence, then a value relating to the Euclidean distance from the nearest dominant color can be returned as how remarkable the color yellow is in the sequence.
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:—
To illustrate the question of color within image frames, an example of an image frame 10 shown in
It is necessary to determine a set of dominant colors that are representative of the whole video sequence 14. A good example would be the average of the MPEG 7 dominant color descriptor. The MPEG 7 dominant color descriptor gives up to eight colors that are representative of a frame 10, and is contained within the data 16. The average of such a set of colors over multiple frames 10 can be calculated. Other methods for representing the dominant colors in the series 14 can be used, for example histograms. The average of the video sequence 14 can be computed as the average of the histograms over time. This produces a table similar to that shown in
The table in
This is not the sole way that dominant colors can be calculated for an image frame. The methodology above can be considered as based upon building histograms of the different colors within an image frame, where each histogram represents a predefined color range. Dominant color determination could also simply return the n most numerous colors in the image frame, where n might be 8, as RGB values. This approach determines dominant colors from the actual RGB values of the pixels, and simply looks for the n most commonly occurring RGB values.
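The simpler alternative described above, returning the n most numerous RGB values in a frame, can be sketched as follows. This is a minimal illustration only: the function name most_common_colors and the representation of a frame as a flat list of (R, G, B) tuples are assumptions made for the example and are not taken from the description.

```python
from collections import Counter


def most_common_colors(pixels, n=8):
    """Return the n most frequent RGB triples together with their pixel counts.

    `pixels` is assumed to be a flat iterable of (R, G, B) tuples for one frame.
    """
    counts = Counter(pixels)
    return counts.most_common(n)


# Hypothetical nine-pixel frame: five red, three blue and one green pixel.
frame = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 3 + [(0, 255, 0)]
print(most_common_colors(frame, n=2))
```

In practice a frame rarely contains many pixels with exactly the same RGB value, so an implementation might first quantize the colors into ranges, which leads back to the histogram approach.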
Once the dominant colors have been calculated for the entire series 14 of frames 10, then a selection of a subset 18 of the frames 10 is made, as shown in
For each of the dominant colors of the specific interval 18, it is then possible to compute the distance to the closest dominant color of the whole sequence 14. This distance measure is ideally computed in a perceptually uniform color space, for example LUV, so that the distances being compared correspond to differences as perceived by a human viewer. The end result of this comparison process is that, for each dominant color in the interval, there is a distance to each color in the set of average dominant colors of the series 14. Next, it is determined which of the dominant colors in the subset 18 has the largest distance to its closest dominant color of the set of average colors for the sequence 14. This is the most remarkable color, since it is perceptually furthest from the average colors of the sequence 14. This will be explained relative to a specific example, below with reference to
The method of processing the image signal 12 to determine the most remarkable color in a frame sequence 18, relative to the overall content signal 12, is summarized in
If seq_1, . . . , seq_n are the colors representing the whole video sequence 14, and c_1, . . . , c_m are the colors representing the specific time interval, we look for the color c_index that attains the optimum in
\max_{1 \le i \le m} \; \min_{1 \le j \le n} \; \mathrm{distance}(c_i, \mathrm{seq}_j) \qquad (1)
where the distance is a perceptually uniform distance measure, for example the Euclidean distance in LUV color space. Moreover, the value of (1) also indicates how remarkable this color is. The larger the distance from c_index to the representative colors of the whole sequence, the more interesting this color could be.
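Expression (1) can be sketched in a few lines. The example below is illustrative only and rests on stated assumptions: the description does not name the RGB color space or the white point, so the sketch assumes sRGB with a D65 white point, and the function names rgb_to_luv and most_remarkable as well as the color values are hypothetical.

```python
import math

# Assumption: sRGB primaries with a D65 white point (not stated in the description).
_RGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def _linear(channel):
    """Undo the sRGB transfer curve for one 8-bit channel value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_luv(rgb):
    """Convert an 8-bit (R, G, B) triple to CIE L*u*v* coordinates."""
    r, g, b = (_linear(v) for v in rgb)
    x, y, z = (row[0] * r + row[1] * g + row[2] * b for row in _RGB_TO_XYZ)
    denom = x + 15 * y + 3 * z
    if denom == 0:  # pure black
        return (0.0, 0.0, 0.0)
    # The reference white is what the matrix gives for R = G = B = 1.
    xn, yn, zn = (sum(row) for row in _RGB_TO_XYZ)
    denom_n = xn + 15 * yn + 3 * zn
    t = y / yn
    lightness = 116 * t ** (1 / 3) - 16 if t > (6 / 29) ** 3 else (29 / 3) ** 3 * t
    u = 13 * lightness * (4 * x / denom - 4 * xn / denom_n)
    v = 13 * lightness * (9 * y / denom - 9 * yn / denom_n)
    return (lightness, u, v)


def most_remarkable(interval_colors, sequence_colors):
    """Evaluate expression (1): return (index, value) of the interval color
    whose distance to its closest sequence color is largest."""
    best_index, best_value = -1, -1.0
    for i, c in enumerate(interval_colors):
        nearest = min(math.dist(c, s) for s in sequence_colors)
        if nearest > best_value:
            best_index, best_value = i, nearest
    return best_index, best_value


# Hypothetical data: the sequence is dominated by black and white,
# while the interval contains a dark gray and a yellow.
sequence = [rgb_to_luv(c) for c in [(0, 0, 0), (255, 255, 255)]]
interval = [rgb_to_luv(c) for c in [(10, 10, 10), (255, 255, 0)]]
index, value = most_remarkable(interval, sequence)
print(index, round(value, 2))
```

The returned index identifies the remarkable color, and the returned value is the distance that indicates how remarkable it is. With this data the yellow is selected, since the dark gray lies close to the black of the sequence.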
The RGB values of table 6a are converted to LUV values and the Euclidean differences in these LUV values are shown in the table 6b. Effectively each color in the table 6a is a point in color space, and the values in table 6b represent the length of a line drawn between each pair of points. Eight dominant colors in the overall movie are compared to eight dominant colors in the shot, giving sixty-four different pairs of points. The bottom row of the table 6b shows the minimum value for each of the shot colors, that minimum representing the distance from the closest of the movie dominant colors. It can be seen that SDC8 has the largest distance from the closest movie color, the 54.73 value in the minimum row. This is the color that will be determined by the step S6 of
The methodology of the processing of the image signal 12 can also be applied to a more flexible environment, for example to a sliding window. A video sequence can have large parts that take place in a completely different environment from other parts, and the process can be configured so that the colors in a specific interval are compared to the colors of a part of the video rather than to the whole video. Another embodiment is to compare a sliding window with a larger sliding window that nevertheless contains the first window. This emphasizes colors that are remarkable on a small scale, even within a shot. With the distance measure defined, the process would return only those colors that are very significantly different. This provides an automated method of filtering out the less interesting colors and focusing only on the time instances where the most prominent color is most likely of interest.
The above description refers to the use of dominant colors. Other descriptors, such as color histograms, could also be used as a way of determining color values for the colors within one or more frames of the signal 12. In a similar way, the use of shots and shot cuts is only one example of the selection of the subset 18 of frames 10 within the signal 12. For technologies such as amBX, it is advantageous to have stable colors per shot. However, the above techniques can be used for any kind of interval, and shot cut detection is given here only as an example. As mentioned above, rather than comparing the dominant colors of a shot or interval to the dominant colors of the whole movie, it is possible to use a sliding window and compare the colors in this sliding window to those of a larger overlapping window.
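The sliding-window variant can be sketched as below. This is an illustrative sketch under several assumptions that are not taken from the description: each frame is a list of colors already expressed in a perceptually uniform space, the dominant colors of a window are simply its k most frequent colors, the smaller window is centered inside the larger one, and a fixed threshold decides which distances count as significant. The names dominant, remarkable_value and sliding_remarkable are hypothetical.

```python
import math
from collections import Counter


def dominant(frames, k):
    """Return the k most frequent colors over a list of frames (assumed method)."""
    counts = Counter(color for frame in frames for color in frame)
    return [color for color, _ in counts.most_common(k)]


def remarkable_value(small, large):
    """Largest distance from a color in `small` to its closest color in `large`."""
    return max(min(math.dist(c, s) for s in large) for c in small)


def sliding_remarkable(frames, inner, outer, k, threshold):
    """Slide an outer window over the frames, compare the inner window centered
    in it, and keep the positions whose value reaches the threshold."""
    hits = []
    for start in range(len(frames) - outer + 1):
        large = dominant(frames[start:start + outer], k)
        offset = start + (outer - inner) // 2
        small = dominant(frames[offset:offset + inner], k)
        value = remarkable_value(small, large)
        if value >= threshold:
            hits.append((offset, value))
    return hits


# Hypothetical data: five frames, where only the middle frame is dominated
# by a color that is far from the color prevailing in its surroundings.
plain = [(50.0, 0.0, 0.0)] * 4
flash = [(50.0, 0.0, 0.0)] + [(60.0, 80.0, 80.0)] * 3
frames = [plain, plain, flash, plain, plain]
hits = sliding_remarkable(frames, inner=1, outer=5, k=1, threshold=50.0)
print(hits)
```

Only positions whose value reaches the threshold are returned, which corresponds to filtering out the less interesting colors. The threshold of 50.0 is arbitrary and would need tuning for real content.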
Number | Date | Country | Kind |
---|---|---|---|
08150343.5 | Jan 2008 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB09/50108 | 1/12/2009 | WO | 00 | 7/8/2010 |