The present disclosure generally relates to comparing versions of content files to determine if they are the same content item.
Copyright holders seek to identify copyright violations which occur when copyrighted content, such as a copyrighted video, is pirated. Such content, to which access has been made available in violation of copyright, may be referred to as hacked video, hacked content, rogue content, pirated content, or other similar terms.
It is often the case that pirated content will be manipulated by pirates in an attempt to frustrate automatic detection systems, so that automatic detection via simple comparison becomes difficult. Such manipulations may include, for example, but not be limited to: change of color, cropping, rotation/translation, audio mute/swap, video format transcoding, etc.
The present disclosure will be understood and appreciated more fully from the following detailed description, taken in conjunction with the drawings in which:
In one embodiment, a system, apparatus and method are described, the system having at least one storage device for storing a reference video file including a plurality of frames in which an identifiable recognizable object appears, a suspect video file including a plurality of frames in which the identifiable recognizable object appears, a computer including a processor to determine, on a per frame basis, at least one meta-feature of the identifiable recognizable object which appears in each frame of the reference video file, create a first vector for the reference video file, the first vector being a vector of the determined at least one meta-feature of the identifiable recognizable object which appears in each frame of the reference video file, determine, on a per frame basis, the at least one meta-feature of the identifiable recognizable object which appears in each frame of the suspect video file, create a second vector for the suspect video file, the second vector being a vector of the determined at least one meta-feature of the identifiable recognizable object which appears in each frame of the suspect video file, and calculate a correlation between the first vector and the second vector and thereupon apply a statistical method to determine a measure of the correlation between the first vector and the second vector, a result of the statistical method being indicative of a degree of confidence, and an interface for the processor to output a result on the basis of the degree of confidence, the result indicative of a degree of certainty that the suspect video file is a copy of the reference video file. Related systems, apparatus and methods are also described.
Reference is now made to
Reference is now made to
Other easily identified recognizable objects may appear in the various video frames in an incidental fashion, or simply as background. For example, a window is seen in the frames of the second exemplary suspect video file, and mountains appear in the exemplary reference video file and the first exemplary suspect video file. However, because background items may appear only in a limited number of particular scenes in an entire video file, background objects may not serve as useful recognizable objects for some embodiments, as described herein below. That is to say, although those objects might be easily identified recognizable objects, because of their sparsity in the entire video file, they might not be useful for embodiments described herein. By way of example, eight minutes out of a 95 minute movie may have background mountains. In such a case, mountains are not a useful easily identified recognizable object. Similarly, a bird 135, 140 appears to have been captured in an incidental fashion in frames 110B, 120B. However, because the bird's appearance in the video is incidental, the birds may not serve as a useful easily identifiable recognizable object for the purpose of content comparison, as described herein.
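By way of a non-limiting illustration, the sparsity consideration above can be expressed as a coverage fraction. The sketch below assumes a per-minute presence flag for each candidate object. The function name `coverage`, the cut-off `MIN_COVERAGE`, and the figure of 80 minutes for faces are hypothetical and are not taken from the disclosure.

```python
def coverage(presence):
    """Return the fraction of sampled units (frames or minutes) in which the object appears."""
    if not presence:
        return 0.0
    return sum(1 for p in presence if p) / len(presence)


# Mountains appear in 8 minutes of a 95 minute movie, as in the example above.
mountains = [True] * 8 + [False] * 87
# Hypothetical: faces appear in 80 of the 95 minutes.
faces = [True] * 80 + [False] * 15

MIN_COVERAGE = 0.5  # assumed cut-off for a "useful" recognizable object

for name, presence in (("mountains", mountains), ("faces", faces)):
    useful = coverage(presence) >= MIN_COVERAGE
    print(name, round(coverage(presence), 3), "useful" if useful else "too sparse")
```

Under these assumptions, mountains cover roughly 8% of the movie and would be rejected, while a frequently appearing object would be retained.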
In general, it is appreciated that different movies and other content items may have different features which are both frequent and significant. Thus, different features should be chosen for different content. Feature selection may be based on genre, for example.
At least one reason why content is valued by humans is that humans can understand and enjoy said content. However, because content is valued by humans, attempts to gain unauthorized or illegal access to the content items may be made by a hacker or pirate. Content, such as video, to which the hacker or pirate has gained unauthorized or illegal access may be referred to as hacked video, hacked content, pirated video, pirated content, rogue content, or other similar terms. Persons or organizations which have gained unauthorized or illegal access to pirated content may be referred to as “pirates” or “hackers”. Pirates may manipulate the video in order to avoid detection, but they typically avoid manipulations which would make the content hard to enjoy.
By way of a non-limiting example, pirates may use a website or other part of a rogue content distribution network to share a video of a football game. In an attempt to frustrate efforts at detection, the pirates may crop out part of the frame and change the brightness a little, or otherwise modify the video, as is known in the art. For example, as was noted above, the aspect ratio of the exemplary reference video file is 16:9. The aspect ratio of the first exemplary suspect video file is 4:3. If, for the sake of example, the first exemplary suspect video file is a pirated version of the exemplary reference video file, then it may be the case that, in order to introduce a change which may help disguise the origin of the first exemplary suspect video file, pirates changed the aspect ratio from 16:9 to 4:3 prior to releasing the first exemplary suspect video file to a content sharing network.
On the other hand, pirates do not want to change crucial properties of the game: they typically will not crop out or erase the ball, the score board, the players' faces, etc. By doing so, the pirates might harm the viewing experience, and, in the extreme case, render the video unwatchable. That is to say, any modification of the video by pirates which renders the recognizable objects unrecognizable will ruin the value of the content item.
Accordingly, content may be compared between the reference video file and the suspect video file by comparing elements appearing in the video which humans need to recognize in order to enjoy the content. It is repeated for emphasis here that in
Reference is now made to
In addition, the computing device typically comprises long term storage 430, comprising at least one storage device such as a hard (or floppy) disk drive, a flash memory device, or other appropriate storage devices, which may be used for storing the reference and suspect video files during the execution of the steps for
The device 400 may be operated by a copyright owner of the reference video file; by a broadcaster; by a law enforcement agency; or any other appropriate stakeholder.
Reference is now made to
On a per-frame basis, the number of said recognizable objects which appears in each frame of the frames of the reference video file is determined (step 330). For example, in frame 110A (
A first vector for the reference video file is created, the first vector being a vector of the determined number of said recognizable objects in each frame of the frames of the reference video file (step 340). Accordingly, the vector for the faces in each of the frames of the first row of five video frames 110 would be 3,3,3,3,3. By contrast, if birds were selected as the recognizable object, then, the vector would be 0,0,1,0,0. As noted above, a single bird 135 (
In step 350 a suspect video file is received. In step 360, similar to step 330, on a per-frame basis, the number of said recognizable objects which appears in each frame of the frames of the suspect video file is determined. A second vector for the suspect video file is created in step 370, the second vector being a vector of the determined number of said recognizable objects in each frame of the frames of the suspect video file. Step 370 is similar to step 340.
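The per-frame counting and vector creation described above may be sketched as follows. This is a minimal illustration only: each frame is modeled as a list of object labels, and `make_detector` is a hypothetical stand-in for whatever object detector an implementation actually uses. Neither is taken from the disclosure.

```python
from typing import Callable, List, Sequence


def count_vector(frames: Sequence, detect: Callable[[object], list]) -> List[int]:
    """Build the per-frame count vector: one entry per frame, in frame order."""
    return [len(detect(frame)) for frame in frames]


def make_detector(label: str) -> Callable[[list], list]:
    """Hypothetical detector: returns the objects in a frame that carry the given label."""
    return lambda frame: [obj for obj in frame if obj == label]


# Five frames, each with three faces; a single bird appears in the third frame.
reference_frames = [
    ["face", "face", "face"],
    ["face", "face", "face"],
    ["face", "face", "face", "bird"],
    ["face", "face", "face"],
    ["face", "face", "face"],
]

print(count_vector(reference_frames, make_detector("face")))  # [3, 3, 3, 3, 3]
print(count_vector(reference_frames, make_detector("bird")))  # [0, 0, 1, 0, 0]
```

The same function would be applied to the frames of the suspect video file to obtain the second vector.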
It is appreciated that the steps of
In step 380, a correlation between the first vector and the second vector is calculated. A statistical method to determine a measure of the correlation between the first vector and the second vector is applied (step 390), where a result of the statistical method is indicative of a degree of confidence that the suspect video file is a copy of the reference video file.
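As a non-limiting sketch of the correlation step, a Pearson coefficient between two equal-length vectors may be computed as shown below. The sample vectors are invented for illustration, and returning 0.0 for a constant vector (where the coefficient is undefined) is an assumption of this sketch rather than a requirement of the method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # assumption: treat an undefined coefficient as "no correlation"
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Invented per-frame face counts for illustration only.
reference = [3, 3, 2, 1, 0, 2, 3, 1]
suspect = [2, 3, 2, 1, 0, 1, 3, 1]    # e.g. a cropped copy that loses a face in some frames
unrelated = [1, 0, 2, 3, 1, 2, 0, 2]

print("reference vs suspect:  ", pearson(reference, suspect))
print("reference vs unrelated:", pearson(reference, unrelated))
```

With these invented vectors, the suspect vector correlates strongly with the reference vector while the unrelated vector does not.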
The inventors have provided several examples of use of the present systems and method as a proof of concept. In a first example, a number of faces in each frame is used as the recognizable object. Reference is now made to
By contrast, reference is now made to
A statistical method is applied to determine the correlation between the vectors/histograms of
The results which are graphically displayed in
As 0 denotes uncorrelated data and 1 denotes full positive correlation, the cropped copy of the first reference video file shows a high correlation with the reference video file. By contrast, the unrelated video file has a Pearson coefficient closer to 0, and therefore has a lower correlation.
Reference is now made to
It is appreciated that in
Reference is now made to
As with the example provided contrasting
The Pearson coefficient of the second reference video file versus the cropped, pirated version of the second reference video file was 0.43954015470225877. By contrast, the Pearson coefficient of the second reference video file versus the unrelated video was 0.0013004417195126667. As in the example provided by
It is appreciated that the Pearson coefficient of the second reference video file versus the cropped, pirated version of the second reference video file, 0.43954015470225877, seems “small”. However, by comparison to the resulting Pearson coefficient for the unrelated video (i.e., 0.0013004417195126667), the 0.4395 . . . value indicates a much greater level of correlation. As is known in the art, a Pearson calculation can yield not only a correlation coefficient, but also a measure of statistical significance (commonly reported as a p-value or a confidence interval), which indicates how unlikely it is that the correlation is coincidental. Accordingly, a threshold indicating a correlation between suspect video files and the reference video file can be manually or automatically adjusted. Although the above discussion focuses on Pearson coefficients, other methods for determining correlations which are known in the art, such as, but not limited to, Spearman's rank correlation coefficient and the Kendall rank correlation coefficient, may be used as well.
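The adjustable threshold and one of the alternative measures named above may be sketched as follows. Spearman's rank correlation coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their average rank. The threshold value of 0.1 and the function names are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def average_ranks(values):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def is_likely_copy(coefficient, threshold=0.1):
    """Flag a suspect file when the coefficient reaches the (assumed) threshold."""
    return coefficient >= threshold


# Coefficients reported above for the second reference video file.
print(is_likely_copy(0.43954015470225877))    # True
print(is_likely_copy(0.0013004417195126667))  # False
```

In practice the threshold would be tuned, manually or automatically, against known copies and known unrelated files.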
Reference is now made to
Reference is now made to
As with the above examples, Pearson coefficients were determined for the correlation of the vector of
The Pearson coefficient of the third reference video file versus the cropped, pirated version of the third reference video file was 0.11072802160715986. By contrast, the Pearson coefficient of the third reference video file versus the unrelated video was −0.0014891778178670973. As in the above prior examples, the cropped pirated version of the video used to generate the histogram of
It is noted that the absolute correlation of this specific feature (i.e. average distance between faces) is not very high, although much higher than for the unrelated movie. This may mean that in this specific movie, this feature is not common enough. For example, it may be the case that in the suspect video file the faces are scattered in the frame, as opposed to other video files where the faces are concentrated in the middle of the frame. As such, when the video is cropped, many of the frames lose some of the faces, which distorts this feature. It may also be the case that there are not many frames with two or more faces (i.e., this feature returns 0.0 for 0 or 1 faces) and, although in principle “faces” may be used as a feature for comparison in general, for this particular suspect video file, a different feature should be used.
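A minimal sketch of the average-distance feature, and of how cropping can distort it, is given below. It assumes that each detected face is represented by the (x, y) pixel coordinates of its center, and it models a crop as a simple horizontal window. The helper `crop_faces` and all coordinates are hypothetical.

```python
import math
from itertools import combinations


def mean_face_distance(centers):
    """Average pairwise distance between face centers; 0.0 for 0 or 1 faces."""
    if len(centers) < 2:
        return 0.0
    distances = [math.dist(a, b) for a, b in combinations(centers, 2)]
    return sum(distances) / len(distances)


def crop_faces(centers, x_min, x_max):
    """Hypothetical crop: keep only faces whose center lies inside the horizontal window."""
    return [(x, y) for x, y in centers if x_min <= x <= x_max]


# Invented frame: three faces scattered across a 1280 pixel wide frame.
centers = [(100, 200), (640, 210), (1180, 190)]
cropped = crop_faces(centers, 160, 1120)  # the side crop removes the two outer faces

print(mean_face_distance(centers))
print(mean_face_distance(cropped))  # only one face remains, so the feature is 0.0
```

When faces are scattered toward the edges of the frame, as in this invented example, a crop changes the feature value sharply, which is consistent with the lower correlation observed for this feature.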
Alternatively, in a case such as the third example, more than one recognizable object may be used to generate multiple vectors for comparison. Repeated positive correlations would be indicative of a match between the reference video file and the suspect video file. It is appreciated that in some cases once a suspect video file is identified as a likely candidate for being a pirated video file, other methods (whether computational or visual) may be performed to confirm the identification.
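The use of multiple vectors may be sketched as follows: one vector is built per feature, each pair of vectors is correlated separately, and the number of features whose coefficient reaches a threshold is counted. The feature names, the sample values, and the threshold of 0.3 are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def positive_correlations(reference_features, suspect_features, threshold=0.3):
    """Count the features whose reference and suspect vectors correlate at or above threshold."""
    return sum(
        1
        for name, ref_vector in reference_features.items()
        if pearson(ref_vector, suspect_features[name]) >= threshold
    )


# Invented per-frame feature vectors, keyed by hypothetical feature names.
reference = {
    "face_count": [0, 1, 2, 3, 4, 5],
    "mean_face_distance": [0.0, 10.0, 20.0, 0.0, 30.0, 10.0],
}
suspect_copy = {
    "face_count": [0, 1, 2, 3, 4, 4],
    "mean_face_distance": [0.0, 8.0, 15.0, 0.0, 25.0, 9.0],
}
unrelated = {
    "face_count": [5, 0, 4, 1, 3, 2],
    "mean_face_distance": [20.0, 0.0, 0.0, 30.0, 0.0, 10.0],
}

print(positive_correlations(reference, suspect_copy))  # 2
print(positive_correlations(reference, unrelated))     # 0
```

A file for which most or all features correlate positively would then be a candidate for further confirmation by other methods.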
It is appreciated that for longer video files, it may be desirable to perform the method described above on an excerpt of the reference video file, and then to compare the excerpt of the reference video file with a sliding window, of the same length as the excerpt, which is moved across the suspect video file.
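The sliding-window comparison may be sketched as follows. The sketch assumes a window step of one frame and reports the offset with the highest Pearson coefficient. The function name `best_window` and the sample vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def best_window(excerpt, suspect):
    """Slide a window of len(excerpt) over suspect; return (offset, coefficient) of the best match."""
    n = len(excerpt)
    if n < 2 or len(suspect) < n:
        raise ValueError("suspect vector must be at least as long as the excerpt")
    best_offset, best_r = 0, -2.0  # any real coefficient is greater than -2
    for offset in range(len(suspect) - n + 1):
        r = pearson(excerpt, suspect[offset:offset + n])
        if r > best_r:
            best_offset, best_r = offset, r
    return best_offset, best_r


# Invented data: the excerpt reappears in the suspect vector starting at frame 4.
excerpt = [3, 1, 0, 2, 3]
suspect = [1, 1, 2, 0, 3, 1, 0, 2, 3, 2, 2, 1]

offset, r = best_window(excerpt, suspect)
print(offset, r)
```

A larger step between windows would reduce the amount of computation at the cost of possibly missing the best alignment.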
The method described herein above may be executed using a general purpose computer which comprises one or more processors. One of the processors may be a special purpose processor operative to perform the content comparison method described herein. Alternatively, the content comparison method described herein may be executed by a general purpose processor running special purpose software for the execution of the content comparison method described herein. The one or more processors typically operate modules, which may be hardware or software for execution of the method described herein. For example, determining may be performed at a determining module, etc.
The following block of pseudocode provides an exemplary routine which might be used for implementing the methods described herein:
It is appreciated that software components of the present invention may, if desired, be implemented in ROM (read only memory) form. The software components may, generally, be implemented in hardware, if desired, using conventional techniques. It is further appreciated that the software components may be instantiated, for example: as a computer program product or on a tangible medium. In some cases, it may be possible to instantiate the software components as a signal interpretable by an appropriate computer, although such an instantiation may be excluded in certain embodiments of the present invention.
It is appreciated that various features of the invention which are, for clarity, described in the contexts of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for brevity, described in the context of a single embodiment may also be provided separately or in any suitable subcombination.
It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove. Rather the scope of the invention is defined by the appended claims and equivalents thereof:
Number | Date | Country |
---|---|---|
20180082121 A1 | Mar 2018 | US |