The present invention relates to a system for automatically monitoring the viewing activities of television signals.
The term “fingerprint” as used in this specification means a series of image samples, in which each sample is selected from a digitized frame of the television signals. A plurality of frames can be selected from the television signals, and one or more sample values can be selected from each video frame, so that the fingerprint can be used to uniquely identify the television signals.
In broadcast television, one of the key questions advertisers ask television programmers is how many people are watching a specific program channel. The answer, called the channel rating measure, determines the impact of a specific type of commercial on the viewer population. It largely determines the price advertisers are willing to pay for a specific TV commercial slot (called commercial avails, or simply avails) on that channel. Programmers want as many people as possible watching their channel so that they can charge as much as possible for carrying the ad. Both advertisers and TV programmers want to know the rating number as accurately as possible so that each can use the information to get the best price from its own perspective.
With the growing deployment of interactive television, advertisers and programmers alike also see the need to have the viewing patterns of specific viewers. This is often called addressable targeting. With addressable targeting, it is possible for the advertisers to deliver advertising messages specific for the viewer or viewer family. This can significantly increase the relevance of their advertising message and increase the chance that the viewers can be converted into paying customers.
Therefore, there is a need to measure the viewing activity on specific channels by specific viewers. In other words, there is a need to measure how many people are watching a specific television channel, and what specific channels a particular viewer is watching at the time.
Because it is generally impossible to measure the viewing patterns of all of the people watching television, the viewing population must be sampled down to a smaller number of people to make the measurement more tractable. The population is sampled in such a way that its demographics, i.e., age, income level, ethnic background, profession, etc., correlate closely with those of the general population. When this is the case, the sampled population can be considered a proxy for the entire population as far as measured results are concerned. Several techniques have been developed to provide this information.
In one method, each sampled viewer or viewer family is given a paper diary. The sampled viewers need to write down their viewing activities each time they turn on the television. The diaries are then collected periodically to be analyzed by the data center.
In another method, each sampled viewing family is given a small device and a special purpose remote control. The remote control records all of the viewers' channel change and on/off activities. The data is then periodically collected and sent back to the data center for further analysis. At the data center, the viewing activity is correlated to the program schedule in effect at the time of the viewing, so that the information on which channels are watched at any specific time can be obtained.
In another method, programmers modify the broadcast signal by embedding specially coded signals into an invisible portion of the broadcast signal. These signals can then be decoded by a special purpose device at the viewer's home to determine which channel the viewer is watching. The decoded information is then sent to the data center for further analysis.
In yet another method, an audio detection device is used to decode hidden audio codes within the inaudible portion of the television broadcast signal. The decoded information can then be collected and sent to the data center for further analysis.
The first method above can have serious accuracy problems, because it requires the viewers to write down, often in 15 minute intervals, what they are watching. Viewers may often forget to write in their diaries at the time of watching TV, and frequent channel changes can further complicate this problem.
The second method above can only be applied to the viewing of live television programming because it requires real-time knowledge of the program guide. Otherwise, knowing only the channel selected at any specific time is not sufficient to determine what program the viewer is actually watching. For non-real-time television content, the method cannot be used. For example, a viewer can record the broadcast video content onto a disk-based PVR and then play it back at a different time, with possible fast forward, pause and rewind operations. In these cases, the original program schedule information can no longer be used to correlate to the content being viewed, or at least doing so would require changes to the PVR hardware. In addition, the method cannot be used to track viewing activities of other media, such as DVDs and personal media players, because there are no pre-set schedules for the content being played. Therefore, the fundamental limitation of this method lies in the fact that the content being viewed must have associated play-out schedule information available for the purpose of measuring the viewing histories. This requirement cannot be met in general for content played from stored media because the play-out activity cannot be predicted ahead of time.
The third and fourth methods above both require modification to the television signals at the origination point before the signal is broadcast to the viewers. This may not always be possible given the complexity and regulatory requirement on such modifications.
It is an object of the present invention to provide a system for automatically monitoring the viewing activities of television signals, which can monitor the viewing patterns of video signals in as many different devices as possible, including television signals, PVR play-outs, DVD players, portable media players, and mobile phone video players.
It is another object of the present invention to provide a system for automatically monitoring the viewing activities of television signals, which can provide accurate measure of the number of viewers.
It is another object of the present invention to provide a system for automatically monitoring the viewing activities of television signals, which can measure the viewing activities of pre-recorded video content that has not been distributed over the television broadcast network.
It is another object of the present invention to provide a system for automatically monitoring the viewing activities of television signals, which can reduce the hardware cost of the device used to perform such measurement.
Therefore, there is provided a system for automatically monitoring the viewing activities of television signals, comprising: a measurement device, in which the television signals are adapted to be communicated to both the measurement device and the TV set, so that the measurement device receives the same signals as the TV set, and the measurement device is adapted to extract fingerprint data from the television signals displayed to the viewers, so that the measurement device measures the same video signals as those seen by the viewers; a data center to which the fingerprint data is transferred; and a fingerprint matcher to which the television signals that the viewers have selected to watch are sent, through the measurement device, to be monitored.
Preferably, each measurement device is provided in a viewer residence which is selected by demographics.
Preferably, the demographics are the household income level, the age of each household member, the geographic location of the residence, and/or the viewers' past viewing habits.
Preferably, the measurement device is connected to the internet to continuously send the fingerprint data to the data center; a local storage is integrated into the measurement device to temporarily hold the fingerprint data and upload the fingerprint data to the data center on a periodic basis; or the measurement device is connected to a removable storage onto which the fingerprint data is stored, and the viewers periodically unplug the removable storage and then send it back to the data center.
Preferably, the measurement devices are installed in different areas away from the data center.
Preferably, the television signals are those of TV programs produced specifically for public distribution, recording of live TV broadcast, movies released on DVDs and video tapes, or personal video recordings with the intention of public distribution.
Preferably, the fingerprint matcher receives the fingerprint data from a plurality of measurement devices located in a plurality of viewer residences.
Preferably, the measurement device receives actual clips of digital video content data, performs the fingerprint extraction, and passes the fingerprint data to the fingerprint matcher and a formatter.
Preferably, the measurement device, the data center, and the fingerprint matcher are situated in geographically separate locations.
Preferably, the television signals are communicated to the measurement device and the TV set through a parallel connection.
According to the present invention, the proposed system does not require any change to the other devices already in place before the measurement device is introduced into the connections.
In the invention, there is provided a system for accurately determining the video content through a measurement device so that the measurement can be used to establish the viewing patterns for specific viewers connected to the device.
The method consists of several key components. The first component is a hardware device that must be situated in the viewers' homes. The device is connected to the television set at one end and to the incoming television signal at the other end. This is shown in
The data center 104 may be co-located with the video content source 100. The content delivery device may be a network (over-the-air broadcast, cable networks, satellite broadcasting, IP networks, wireless network), or a storage medium (DVD, portable disk drives, tapes, etc.).
Next look at
In
Once a video content has been registered, its fingerprint is also available for matching operations with the collected remote content fingerprint data. Therefore, the fingerprint registration, as outlined in
Specifically, the content register, the content database and the content matcher may be situated in geographically separate locations; the content register may register only a portion of the content rather than all of it; the registered content may include at least recordings of live TV broadcasts, movies released on recorded media such as DVDs and video tapes, TV programs produced specifically for public distribution, and personal video recordings with the intention of public distribution (such as YouTube clips and mobile video clips); the viewing history contains the time, location, channel and content description for the matched content fingerprint; the frame segmentation is used to divide the frames into groups of a fixed number of frames, say, 500 frames per group; the frame segmentation may discard some frames periodically so that not all of the frames are registered, for example, by sampling 500 frames, discarding 1000 frames, sampling another 500 frames, and so forth; the FP extractor may sample differently depending on the group of frames, taking, for example, 5 samples per frame for some groups, 1 sample per frame for other groups, and 25 samples per frame for yet other groups; and the preview/player 157 may take its input directly from a compressed video content segment, bypassing 131, 152 and 153 entirely, in which case the preview/player performs the decompression, frame buffering, frame segmentation and display.
To better understand the processing flow at the data center, two cases are provided. In the first case, shown in
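The periodic frame segmentation described above can be illustrated with a minimal sketch. It assumes the frames arrive as an ordinary iterable and that a trailing partial group is kept; the function name `segment_frames` and its parameters are hypothetical and are not part of the described system. The 500/1000 figures are the example values given above.

```python
from typing import Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def segment_frames(
    frames: Iterable[Frame], keep: int = 500, discard: int = 1000
) -> Iterator[List[Frame]]:
    """Yield groups of `keep` consecutive frames, skipping `discard` frames after each group.

    Hypothetical helper: with discard=0 every frame is registered in
    fixed-size groups; with discard>0 frames are dropped periodically.
    """
    period = keep + discard
    group: List[Frame] = []
    for index, frame in enumerate(frames):
        # Frames in the first `keep` positions of each period are sampled;
        # the remaining `discard` positions are ignored.
        if index % period < keep:
            group.append(frame)
            if len(group) == keep:
                yield group
                group = []
    if group:
        # Assumption: a trailing partial group is kept rather than dropped.
        yield group


if __name__ == "__main__":
    groups = list(segment_frames(range(3200)))
    print([len(g) for g in groups])
```

Each yielded group could then be passed to the FP extractor with its own number of samples per frame; how that number is chosen per group is not specified here and is left out of the sketch.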
At a later time, the content is delivered by a content delivery device 203. At the viewer homes, fingerprint extraction is performed 204 on the delivered video content. In addition, in a preferred embodiment, the extracted fingerprint data is immediately transferred to the data center, put into a storage device, and separated from the already-registered content. In another embodiment, the extracted fingerprint data is saved in the devices installed at the viewer homes and will be transferred to the data center at a later time when requested. The data center then compares the stored fingerprint archive data with the fingerprint within the content database 202. This is accomplished by content matching 205.
In another embodiment, as shown in
At the data center, once both the extracted fingerprint data from the delivered content and the registered content information are available, the content matching 215 can be performed to produce the viewing history 216.
Comparing
Next, look at the content matching process, as shown in
The fingerprint matcher 302 then takes the output of the parser 301, retrieves the registered video content fingerprints from the content database 124, and performs the fingerprint matching operation. When a match is found, the information is formatted by the formatter 303. The formatter takes the metadata associated with the registered fingerprint data that is matched to the output of the parser 301, and creates a message that associates the metadata with the viewer home information before it is sent as viewing history 105.
Specifically, the content matcher receives incoming fingerprint streams from many viewer homes 103, and parses them out to different fingerprint matchers; and the content matcher receives actual clips of digital video content data, performs the fingerprint extraction, and passes the fingerprint data to the fingerprint matcher and the formatter.
Next, the operation of the fingerprint matcher is described, as shown in
Alternatively, they may be for non-consecutive time-sections of the original video content. For example, FP1 may be for time [1, 3] seconds (meaning 1 sec through 3 sec, inclusive), FP2 for time [6, 8] seconds, FP3 for time [11, 100] seconds, and so forth. In other words, the lengths of video content represented by the fingerprint segments may or may not be identical, and the segments may not be spaced uniformly either.
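The formatter step can be illustrated with a minimal sketch that joins the metadata of a matched registered fingerprint with the viewer home information into one viewing-history message. The class names, field names, and the JSON encoding are all assumptions made for illustration; the description above only states that the viewing history contains time, location, channel and content description.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ContentMetadata:
    """Hypothetical metadata stored with a registered fingerprint."""
    channel: str
    description: str


@dataclass
class ViewingRecord:
    """Hypothetical viewing-history message for one match."""
    home_id: str
    time: str
    location: str
    channel: str
    description: str


def format_match(home_id: str, location: str, time: str, meta: ContentMetadata) -> str:
    """Associate the matched content's metadata with the viewer home information."""
    record = ViewingRecord(
        home_id=home_id,
        time=time,
        location=location,
        channel=meta.channel,
        description=meta.description,
    )
    # JSON is an assumed wire format; the source does not specify one.
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    meta = ContentMetadata(channel="Channel 5", description="Evening news")
    print(format_match("home-0001", "Example City", "2008-05-26T19:00:00", meta))
```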
Multiple correlators 312 operate concurrently with each other. Each compares a different fingerprint segment with the incoming fingerprint data stream. The correlators generate a message indicating a match when a match is detected. The message is then sent to the formatter 303. The combiner 314 receives messages from different correlators and passes them to the formatter 303.
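The arrangement of concurrent correlators feeding a combiner can be sketched as follows. This is an illustration under stated assumptions, not the matching method of the invention: the comparison metric (mean absolute difference over a sliding window), the threshold, and the names `correlate` and `match_all` are all hypothetical. Threads are used only to mirror the concurrent structure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence, Tuple


def correlate(
    segment: Sequence[int], stream: Sequence[int], max_mean_abs_diff: float = 2.0
) -> Optional[int]:
    """Return the first stream offset where `segment` matches, or None.

    Assumed metric: mean absolute difference between the segment and an
    equally long window of the incoming fingerprint stream.
    """
    n = len(segment)
    if n == 0 or n > len(stream):
        return None
    for offset in range(len(stream) - n + 1):
        window = stream[offset:offset + n]
        total = sum(abs(a - b) for a, b in zip(segment, window))
        if total / n <= max_mean_abs_diff:
            return offset
    return None


def match_all(
    segments: Dict[str, Sequence[int]], stream: Sequence[int]
) -> List[Tuple[str, int]]:
    """Run one correlator per fingerprint segment and combine the match messages."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(correlate, segment, stream)
            for name, segment in segments.items()
        }
        results = [(name, future.result()) for name, future in futures.items()]
    # Combiner: forward only the correlators that reported a match.
    return [(name, offset) for name, offset in results if offset is not None]


if __name__ == "__main__":
    incoming = [10, 12, 50, 52, 54, 90, 91, 20]
    registered = {"FP1": [50, 52, 54], "FP2": [200, 200, 200], "FP3": [91, 91, 21]}
    print(match_all(registered, incoming))
```

In this sketch each returned pair plays the role of a match message that would be passed on to the formatter 303.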
Next consider what happens at the viewer homes, as shown in
The television signal 605 is assumed to be in an analog format, and is connected to the measurement device 601. The measurement device 601 receives the same signal as the connected television set 602. The measurement device 601 extracts fingerprint data from the video signal. The television signal is displayed to the viewers 603, which means that the measurement device 601 measures the same video signal as is seen by the viewers 603. The measurement is represented as fingerprint data streams which will be transferred to the data center 604. The viewer may have a remote control or some other device that selects the television channel that they want to watch. Whichever channel is selected will be sent through the television signal of the connected television set 602 and then measured by the measurement device 601. Therefore, the proposed method does not require any change to the other devices already in place before the measurement device 601 is introduced into the connections.
In an alternative embodiment, the measurement device 601 passes through the signal to the television 602. The resulting scheme is identical to that of
The measurement device 601 extracts the video fingerprint data. The video fingerprint data is a sub-sample of the video images so that it provides a representation of the video data information sufficient to uniquely represent the video content. Details on how to use this information to identify the video content are described by a provisional U.S. patent application No. 60/966,201 filed by the present inventor.
A preferred embodiment of the measurement device 601 is shown in
Another embodiment of the measurement device 631 is shown in
It is important to point out that in any of the above embodiments, the video input signal that the viewers see is not altered in any way by the measurement device.
In the above discussion, it is assumed that the audio signal is passed through along with the video signal and that no further processing is performed on it.
In addition, the measurement device needs to locally store the fingerprint data and send it back to the data center for further processing. There are at least three ways to send the data. In one preferred embodiment, the device is connected to the internet and continuously sends the collected data back to the data center. In another embodiment, a local storage is integrated into the device to temporarily hold the collected data and upload the data to the data center on a periodic basis. In yet another embodiment, the device is connected to a removable storage, such as a USB flash stick, and the collected video fingerprint data is stored onto the removable storage. Periodically, the viewers can unplug the removable storage, replace it with a blank one, and then send the removed storage back to the data center by mail.
Next, the operations of the fingerprint extractor are described. See
The internal operations of the fingerprint extractor are now described in greater detail, see
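The embodiment with integrated local storage and periodic upload can be pictured with a minimal sketch. The class name `FingerprintSpool`, the size-based flush threshold, and the `upload` callable are hypothetical; the description above does not say how the device decides when to upload.

```python
from typing import Callable


class FingerprintSpool:
    """Hypothetical local store that holds fingerprint data until it is uploaded."""

    def __init__(self, upload: Callable[[bytes], None], flush_threshold: int = 4096) -> None:
        self._upload = upload              # assumed transport to the data center
        self._threshold = flush_threshold  # assumed size trigger, in bytes
        self._pending = bytearray()

    def append(self, samples: bytes) -> None:
        """Store newly extracted fingerprint samples, uploading when enough have accumulated."""
        self._pending.extend(samples)
        if len(self._pending) >= self._threshold:
            self.flush()

    def flush(self) -> None:
        """Upload everything held locally; data is cleared only after the upload call returns."""
        if self._pending:
            self._upload(bytes(self._pending))
            self._pending.clear()


if __name__ == "__main__":
    sent = []
    spool = FingerprintSpool(sent.append, flush_threshold=4)
    spool.append(b"ab")
    spool.append(b"cd")
    print(sent)
```

Setting the threshold to 1 approximates the continuously connected embodiment, while calling `flush` on a timer or on request approximates the periodic one.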
In
In the preferred embodiment, each video frame is sampled in exactly the same way. In other words, the image samples are taken from the same positions in different images, and the same number of samples is taken from each image. In addition, the images are sampled consecutively.
The samples are then organized as part of the continuous streams of image samples and placed into the transfer buffer 704. The image samples from different frames are organized together in the transfer buffer 704 before they are sent out.
Specifically, the above sampling method can be extended beyond the preferred embodiment to include the following variations: the sampling position of each image may change from image to image; a different number of samples may be taken for different video images; and sampling on images may be performed non-consecutively, in other words, the number of samples taken from each image may be different.
The above discussions can be applied to other fields by those familiar with the general technical field of expertise. These include, but are not limited to, situations where the video content may be compressed in MPEG-2, MPEG-4, H.264, WMV, AVS, Real, and other future compression formats. The method can also be used in monitoring audio and sound signals. The method can also be used in monitoring video content that is re-captured by consumer or professional video camera devices. The system can also be extended to areas where there is a centralized registry of content metadata and a network-connected system of remote collection devices.
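The preferred sampling scheme can be sketched as follows, assuming each frame is a two-dimensional array of 8-bit values and that the transfer buffer is a flat byte sequence. The sample positions and the names `SAMPLE_POSITIONS`, `sample_frame` and `extract_fingerprint` are hypothetical; the actual positions and buffer layout are not given here.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical (row, column) positions; the same positions are used for every frame.
SAMPLE_POSITIONS: List[Tuple[int, int]] = [(0, 0), (1, 2), (2, 1)]


def sample_frame(
    frame: Sequence[Sequence[int]],
    positions: Sequence[Tuple[int, int]] = SAMPLE_POSITIONS,
) -> List[int]:
    """Take one sample value from each fixed position of a single frame."""
    return [frame[row][col] for row, col in positions]


def extract_fingerprint(
    frames: Iterable[Sequence[Sequence[int]]],
    positions: Sequence[Tuple[int, int]] = SAMPLE_POSITIONS,
) -> bytes:
    """Sample consecutive frames identically and pack the samples into one buffer."""
    transfer_buffer = bytearray()
    for frame in frames:
        # Samples from successive frames are appended in frame order,
        # forming a continuous stream of image samples (values must be 0..255).
        transfer_buffer.extend(sample_frame(frame, positions))
    return bytes(transfer_buffer)


if __name__ == "__main__":
    frame_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    frame_b = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
    print(list(extract_fingerprint([frame_a, frame_b])))
```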
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2008/071082 | 5/26/2008 | WO | 00 | 5/30/2008 |