The present invention relates generally to video playback and, more specifically, to real-time scrubbing of on-line videos.
Manually searching for a specific location in a digital video file by “scrubbing,” i.e., by moving a cursor along a timeline slider associated with the video, can be performed fairly conveniently with a downloaded video. This is because the current selected frame of the downloaded video can generally be accessed and displayed almost immediately after the cursor is positioned at any point on the timeline slider. This allows a viewer to quickly determine what portion of the video corresponds to the current cursor position and whether the cursor should be repositioned to find the target location in the video. However, due to the long wait time associated with downloading longer videos before viewing of the video can begin, video streaming has become a popular alternative to video downloading.
Streaming video, in which audio-visual media are continuously received and presented to an end user while being delivered by a provider, allows viewing to begin after a relatively short period of data buffering has taken place. This is because the bulk of the downloading process takes place while the video is being presented to the end user. However, in contrast to downloaded videos, which are fully cached, searching for a specific location in a streaming video can be problematic. Specifically, when an end user searches for a specific location in a streaming video by scrubbing a cursor position forward or backward along a timeline associated with the video, the end user must wait for the data buffering process to complete before the portion of the video corresponding to the current cursor location is displayed. This is because streaming video players are generally unable to display video frames until a certain level of buffering has taken place. Unfortunately, such display latency occurs each time the cursor is repositioned, even if the cursor is only repositioned a very short distance along the video timeline. Because the cursor is generally repositioned many times in the course of navigating to a target location in a video, navigation performance is significantly degraded even when the requisite data buffering takes as little as 5 or 10 seconds. Furthermore, since video information corresponding to the cursor location is not displayed during navigation of a timeline slider, an end user can only guess where to reposition the cursor in the course of navigating the video and then must wait for video to be displayed after data buffering is completed for the latest cursor position. Thus, the scrubbing experience for the end user can be very choppy, time-consuming, and frustrating.
As the foregoing illustrates, there is a need in the art for a more effective way to navigate an on-line video.
One embodiment of the present invention sets forth a computer-implemented method for traversing a streaming video file. The method includes receiving a representative streaming video file that includes less information than a higher-resolution streaming video file and spans the entire streaming video file. Based on navigation information associated with the representative streaming video file, a playback engine navigates to a different portion of the streaming video file. The navigation information may be based on input information received from a viewer of the streaming video file.
One advantage of the disclosed method is that it enables fast and accurate navigation of a streaming video. This is because, unlike conventional video navigation, an end user does not suffer the latency during navigation caused by a video player waiting for a buffering process to complete before displaying streaming video. Such latency is inherent in prior art navigation techniques since navigation operations are performed across the actual, unbuffered video stream.
So that the manner in which the above recited features of the invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
One embodiment of the present invention sets forth a computer-implemented method for real-time scrubbing of a streaming video. The method includes downloading a low-resolution version of the video and using the low-resolution version of the streaming video for display to facilitate navigation of the video. Because the low-resolution version is fully downloaded before navigation of the video takes place, any portion of the low-resolution version is available and any appropriate frame can be immediately displayed whenever a cursor is moved along a video timeline slider. Thus, during navigation along the video timeline slider, the low-resolution version of the video is displayed instead of the streaming, high-resolution version of the video. Once navigation ends, for example when a mouse button used to control the cursor is released, the display can snap back to the high-resolution streaming video and normal viewing of the video begins.
Processor 101 may be any suitable processor implemented as a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another type of processing unit. Processor 101 is configured to execute program instructions associated with a particular instruction set architecture (ISA) specific to processor 101, including video player application 104 and a video scrubbing application 105. Processor 101 is also configured to receive data from and transmit data to I/O devices 102 and memory 103.
I/O devices 102 include devices that may be used to input data to computing device 100 or devices that output data from computing device 100. I/O devices 102 may include input devices, such as a joystick, a switch, a microphone, a video camera, a keyboard, a mouse, or a touchpad, among others. I/O devices 102 may also include one or more output devices, such as one or more display screens, a speaker, a projector, or a lamp, among others. In addition, I/O devices 102 include devices used both to input data to and to output data from computing device 100, such as an Ethernet port, a serial port, a compact disc (CD) drive, or a digital video disc (DVD) drive, among others. In some embodiments, one or more of I/O devices 102 are configured to couple computing device 100 to a network 110.
I/O devices 102 also include a display device 120. Display device 120 may be a computer monitor, a video display screen, a display apparatus incorporated into a hand held device, or any other technically feasible display screen configured to present video media to an end user. In some embodiments, display device 120 is a terminal window displayed on another display device, such as a video display window that has been opened by video player application 104.
Network 110 may be any technically feasible type of communications network that allows data to be exchanged between computing device 100 and external entities or devices, such as a video archive database or a provider of on-demand streaming media. For example, network 110 may include a wide area network (WAN), a local area network (LAN), a wireless (WiFi) network, and/or the Internet, among others.
Memory 103 may be a hard disk, a flash memory stick, a compact disc read-only memory (CD-ROM), a random access memory (RAM) module, or any other type of volatile or non-volatile memory unit capable of storing data. Memory 103 includes various software programs that can be executed by processor 101, including video player application 104 and, in some embodiments, video scrubbing application 105, as described in greater detail below. In other embodiments, video scrubbing application 105 resides on a server, and results of navigation, described below, are pushed to computing device 100, instead of being generated locally.
Video player application 104 may be any computer application or playback engine configured to play streaming video content for an end user on display device 120, and comprises program instructions that can be executed by processor 101. For example, video player application 104 may be a proprietary application provided to an end user by a provider of on-demand streaming media and installed on computing device 100. Alternatively, video player application 104 may be a non-proprietary third-party application installed on computing device 100. Video player application 104 is configured to present streaming, i.e., un-cached, video content on display device 120, such video content being received by computing device 100 via network 110. In the embodiment illustrated in
Video scrubbing application 105 is a software application configured to facilitate navigation when viewing streaming video content. In operation, embodiments of video scrubbing application 105 enhance real-time scrubbing of a streaming video by displaying a representative video 162 of streaming video 161 on display device 120 so that representative video 162 of streaming video 161 can be used for latency-free navigation to a desired location in streaming video 161. In some embodiments, representative video 162 of the streaming video is configured as a complete video file and can be cached in a suitable location in computing device 100, such as in memory 103. Furthermore, in some embodiments, video player application 104 receives data from an end user of computing device 100 via a keyboard, mouse, touch pad, and/or other suitable input devices, including a smart phone or digital tablet configured to communicate with computing device 100. Thus, a mouse, touchpad, or other input device can be used by an end user as a navigation device to traverse from one location in streaming video 161 to another while viewing streaming video 161. Such navigation can be performed by changing the position of a pointer or cursor on a timeline slider associated with streaming video 161, the position of the cursor being communicated to video scrubbing application 105 as navigation information 163.
To avoid unwanted and distracting latencies during video scrubbing of streaming video 161, representative video 162 is presented on display device 120 during the video scrubbing process. Thus, as the end user navigates timeline slider 201 associated with streaming video 161 by using cursor 202, a suitable frame is selected from representative video 162 and is displayed on display device 120. The selected frame shown on display device 120 corresponds to the location on timeline slider 201 indicated by cursor 202. Because representative video 162 of streaming video 161 includes significantly less information than streaming video 161, representative video 162 can be quickly downloaded and fully cached as a complete video file prior to or in an initial stage of viewing streaming video 161. Being fully cached, representative video 162 can be scrubbed in real time and used to quickly locate a desired scene in streaming video 161, i.e., without the latency associated with data buffering each time the cursor is repositioned during navigation. Once navigation is paused or ended, video scrubbing application 105 provides a frame number, time code, or other indexing information to video player application 104 that indicates the point in streaming video 161 selected by the end user during navigation. Video player application 104 then requests the appropriate frames of streaming video 161 from the streaming video server, buffers a suitable number of frames of streaming video 161, and snaps back to streaming video 161 to present the selected video content. Thus, an end user can freely navigate to any point in streaming video 161 and only experiences latency after navigation has been paused or completed and the desired portion of streaming video 161 is being buffered.
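The selection of a frame of representative video 162 from the position of cursor 202 may be illustrated by the following minimal sketch. The function name `frame_for_cursor`, its parameters, and the linear mapping from slider pixels to frame indices are assumptions made for illustration only; the description above does not prescribe a particular mapping.

```python
def frame_for_cursor(cursor_x: int, slider_width: int, frame_count: int) -> int:
    """Map a cursor x-offset (in pixels) on the timeline slider to a frame
    index in the fully cached representative video.

    Assumption: a simple linear mapping, with the cursor clamped to the slider.
    """
    if slider_width <= 0 or frame_count <= 0:
        raise ValueError("slider_width and frame_count must be positive")
    # Clamp so that positions outside the slider map to the first or last frame.
    x = min(max(cursor_x, 0), slider_width - 1)
    # Integer arithmetic avoids floating-point rounding at frame boundaries.
    return x * frame_count // slider_width


if __name__ == "__main__":
    # Hypothetical 400-pixel slider over an 800-frame representative video.
    print(frame_for_cursor(200, 400, 800))  # 400
```

Because the lookup is performed against a locally cached file, it requires no network request, which is consistent with the latency-free navigation described above.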
In some embodiments, during navigation, representative video 162 is overlaid on streaming video 161, i.e., frames of representative video 162 are displayed on display device 120 with substantially the same dimensions as streaming video 161. For example, streaming video 161 and representative video 162 may both be displayed as full-screen videos, even though representative video 162 has significantly lower resolution than streaming video 161. Consequently, objects in representative video 162 are located in the same positions on display device 120 as in streaming video 161. In this way, a smooth visual transition between representative video 162 and streaming video 161 is created that avoids the visual disconnect and viewer disorientation that occurs when switching between two different-sized videos of the same subject matter. This smooth transition facilitates the navigation process for an end user.
In some embodiments, representative video 162 comprises a low-resolution video, in which the x-y pixel resolution is significantly reduced compared to the resolution of streaming video 161. In such embodiments, representative video 162 may include very low-resolution frames, where the number of pixels per frame is as low as 1 percent of the number of pixels in a frame of streaming video 161. Human vision can tolerate substantial degradation in image resolution, and a degraded image can still provide enough information for image recognition. For example, images as small as 16×16 pixels have been shown in the literature to be suitable for face recognition. Furthermore, human scene recognition on images with a resolution of 32×32 has been shown to be 93% of the recognition rate achieved with a full-resolution 256×256 source, i.e., a high recognition rate despite having only about 1.6 percent of the number of pixels of the original. Thus, to facilitate short download time of and minimize storage requirements for representative video 162, the file size of representative video 162 can be greatly reduced when representative video 162 comprises a video having substantially lower x-y pixel resolution relative to streaming video 161. Even when frames in representative video 162 have fewer than 10% of the pixels of frames in streaming video 161, an end user can recognize the current location using representative video 162 and accurately navigate to a desired portion of streaming video 161.
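The pixel-count comparison above can be checked with a short calculation. The following sketch is illustrative only; the helper names `pixel_fraction` and `scaled_dimensions`, and the 1920×1080 example frame size, are assumptions and do not appear in the description.

```python
import math


def pixel_fraction(low: tuple, full: tuple) -> float:
    """Fraction of pixels in a low-resolution frame relative to a full frame."""
    return (low[0] * low[1]) / (full[0] * full[1])


def scaled_dimensions(width: int, height: int, fraction: float) -> tuple:
    """Frame dimensions holding roughly `fraction` of the original pixel count.

    Each axis is scaled by sqrt(fraction), so the aspect ratio is preserved.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    scale = math.sqrt(fraction)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # 32x32 versus 256x256: 1024 / 65536 pixels.
    print(pixel_fraction((32, 32), (256, 256)))  # 0.015625
    # A hypothetical 1920x1080 frame reduced to 1 percent of its pixels.
    print(scaled_dimensions(1920, 1080, 0.01))   # (192, 108)
```

Note that a 1 percent pixel count corresponds to only a tenfold reduction along each axis, which may help explain why such frames remain recognizable.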
In some embodiments, representative video 162 comprises a low-resolution video with a frame resolution that is significantly reduced compared to the frame resolution of streaming video 161. In other words, there is not a one-to-one correspondence between the total number of frames in representative video 162 and the total number of frames in streaming video 161. Thus, in such embodiments, representative video 162 is encoded with significantly fewer frames than streaming video 161, thereby reducing the file size of representative video 162. In one such embodiment, representative video 162 is encoded with a number of frames that is selected based on the pixel width of the timeline slider associated with streaming video 161, for example timeline slider 201 in
In some embodiments, both the frame resolution and the x-y pixel resolution of representative video 162 can be selected to be significantly less than those of streaming video 161. In this way, the file size of representative video 162 can be greatly reduced, for example to the order of 1 MB, even when streaming video 161 includes a full-length movie. In one such embodiment, representative video 162 is encoded with a frame count of 800 and an x-y pixel resolution of 100×134 pixels, so that representative video 162 has a file size of just a few MBs. Such embodiments are particularly beneficial for mobile devices that have limited download bandwidth.
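For a sense of scale, the uncompressed size of the 800-frame, 100×134-pixel example can be estimated as follows. This is a rough sketch under stated assumptions: 24-bit RGB pixels (3 bytes each) and a hypothetical 3 MB encoded target chosen only to illustrate "a few MBs". Neither the pixel format nor the codec is specified in the description.

```python
BYTES_PER_PIXEL = 3  # assumption: uncompressed 24-bit RGB


def raw_size_bytes(frame_count: int, width: int, height: int,
                   bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Uncompressed size of a video with the given frame count and frame size."""
    return frame_count * width * height * bytes_per_pixel


def required_compression_ratio(raw_bytes: int, target_bytes: int) -> float:
    """Compression ratio an encoder would need to reach the target file size."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    return raw_bytes / target_bytes


if __name__ == "__main__":
    raw = raw_size_bytes(800, 100, 134)
    print(raw)  # 32160000 bytes, i.e. about 32 MB uncompressed
    # Hypothetical 3 MB target for the encoded representative video.
    print(required_compression_ratio(raw, 3_000_000))
```

Under these assumptions, a compression ratio of roughly 11:1 would be sufficient, which is modest for typical video encoders.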
In some embodiments, representative video 162 presents metadata associated with streaming video 161 encoded as a series of frames rather than a lower-resolution version of streaming video 161. Thus, in some embodiments, representative video 162 includes metadata associated with streaming video 161 or with content of streaming video 161, said metadata changing over the duration of streaming video 161. Representative video 162 can include video frames that present the metadata associated with streaming video 161 in alpha-numeric, graphical, or other formats. For example, when the content of streaming video 161 includes a sporting event, representative video 162 can include the game time, the current score, inning, etc. Alternatively, representative video 162 may present a specific portion of the frames making up streaming video 161, such as a close-up of a sporting event scoreboard.
As previously described herein, prior to beginning method 300, representative video 162 of streaming video 161 is produced. Representative video 162 includes significantly less information than streaming video 161, while still spanning substantially all of streaming video 161. Representative video 162 may be a low-resolution version of streaming video 161, in which the x-y pixel resolution and/or frame count is significantly reduced.
As shown, the method 300 begins at step 302, where computing device 100 receives representative video 162 from an external source, such as a provider of on-demand streaming media, a video archive database, or the like. In some embodiments, representative video 162 is received in conjunction with streaming video 161. In one embodiment, representative video 162 is received immediately prior to the streaming of streaming video 161 to computing device 100, so that representative video 162 can be fully cached prior to the streaming process. In this way, navigation of streaming video 161 is enabled prior to the streaming process. Because representative video 162 is a relatively small file, delay prior to streaming is minimal. In another embodiment, representative video 162 is received concurrently with streaming video 161, so that streaming is not delayed at all. In such an embodiment, the ability to navigate to any point in streaming video 161 is slightly delayed, i.e., until all of representative video 162 has been fully cached. In another embodiment, an initial representation video associated with streaming video 161 is received prior to representative video 162, where the initial representation video includes substantially less information than representative video 162, and therefore is very quickly downloaded and cached. For example, the initial representation video may have significantly coarser x-y pixel resolution and/or frame count than representative video 162. In such an embodiment, navigation throughout streaming video 161 is quickly enabled even before representative video 162 has been fully cached, a feature particularly useful for mobile applications that have limited data bandwidth.
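The staged availability described above, in which a coarse initial representation video becomes usable before representative video 162 has been fully cached, may be sketched as follows. The class `Representation`, the function `best_cached`, and the byte counts are hypothetical names and values introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Representation:
    """Download state of one representation of the streaming video."""
    name: str
    total_bytes: int
    received_bytes: int = 0

    @property
    def fully_cached(self) -> bool:
        return self.received_bytes >= self.total_bytes


def best_cached(reps: Sequence[Representation]) -> Optional[Representation]:
    """Return the finest fully cached representation, or None if none is ready.

    Assumption: `reps` is ordered from coarsest to finest.
    """
    best = None
    for rep in reps:
        if rep.fully_cached:
            best = rep
    return best


if __name__ == "__main__":
    initial = Representation("initial representation video", 100_000)
    full = Representation("representative video 162", 2_000_000)
    initial.received_bytes = 100_000   # coarse file arrives first
    full.received_bytes = 500_000      # finer file still downloading
    print(best_cached([initial, full]).name)  # initial representation video
```

In this sketch, navigation is enabled as soon as any representation is fully cached, and the display source is upgraded once the finer file completes.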
In step 303, a user initiates navigation by moving a cursor or other selection device over a timeline slider associated with streaming video 161. In response, video scrubbing application 105 causes video player application 104 to overlay representative video 162 on display device 120 in lieu of streaming video 161. Consequently, the end user actually navigates representative video 162 rather than streaming video 161 when moving the cursor on display device 120.
In step 304, the user generates navigation information by activating the cursor or other selection device at some point on the timeline slider associated with streaming video 161. Because representative video 162 is overlaid on display device 120 during navigation, the current location in representative video 162 is immediately displayed as the cursor is moved along the timeline slider and the end user has latency-free navigation. Whenever navigation is paused or halted, for example by releasing a button on the selection device or by holding the cursor at the same location on the timeline slider, navigation information is sent to video scrubbing application 105.
In step 306, video scrubbing application 105 receives the navigation information generated in step 304 based on end user input. Such navigation information may include one or more of a frame number, a time code, or any other indexing information associated with the location in streaming video 161 at which navigation was paused or halted in step 304.
In step 308, video scrubbing application 105 sends suitable instructions to video player application 104, the instructions indicating the point in streaming video 161 selected by the end user during navigation.
In step 310, video player application 104 requests the appropriate frames of streaming video 161 from the streaming video server, buffers a suitable number of frames of streaming video 161, and snaps back to streaming video 161 to present the selected video content.
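Steps 303 through 310 may be summarized by the following simplified sketch, which tracks which video is displayed and converts the final cursor position into a time code. The class `ScrubSession`, its method names, and the use of a time code in seconds as the indexing information are assumptions for illustration; the request and buffering of frames from the streaming video server in step 310 are not modeled.

```python
class ScrubSession:
    """Tracks the displayed source during scrubbing (illustrative only)."""

    def __init__(self, duration_s: float, slider_width: int, rep_frame_count: int):
        if slider_width <= 0 or rep_frame_count <= 0:
            raise ValueError("slider_width and rep_frame_count must be positive")
        self.duration_s = duration_s
        self.slider_width = slider_width
        self.rep_frame_count = rep_frame_count
        self.showing = "streaming"

    def _clamp(self, cursor_x: int) -> int:
        return min(max(cursor_x, 0), self.slider_width - 1)

    def start_navigation(self) -> None:
        # Step 303: overlay the representative video in lieu of the stream.
        self.showing = "representative"

    def move_cursor(self, cursor_x: int) -> int:
        # Step 304: pick the cached frame for the current cursor position.
        if self.showing != "representative":
            raise RuntimeError("navigation has not been started")
        return self._clamp(cursor_x) * self.rep_frame_count // self.slider_width

    def end_navigation(self, cursor_x: int) -> float:
        # Steps 306-310: derive indexing information (here, a time code in
        # seconds) and snap back to the streaming video.
        timecode = self._clamp(cursor_x) / self.slider_width * self.duration_s
        self.showing = "streaming"
        return timecode


if __name__ == "__main__":
    session = ScrubSession(duration_s=7200.0, slider_width=800, rep_frame_count=800)
    session.start_navigation()
    print(session.move_cursor(400))     # 400
    print(session.end_navigation(400))  # 3600.0
```

In a real player, the value returned by `end_navigation` would be passed to the playback engine, which would then buffer and present the selected portion of the stream.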
Various embodiments of the invention may be implemented as a program product for use with a computer system. The program(s) of the program product define functions of the embodiments (including the methods described herein) and can be contained on a variety of computer-readable storage media. Illustrative computer-readable storage media include, but are not limited to: (i) non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive, flash memory, ROM chips or any type of solid-state non-volatile semiconductor memory) on which information is permanently stored; and (ii) writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or any type of solid-state random-access semiconductor memory) on which alterable information is stored.
The invention has been described above with reference to specific embodiments and numerous specific details are set forth to provide a more thorough understanding of the invention. Persons skilled in the art, however, will understand that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The foregoing description and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
The present application claims priority to U.S. Provisional Patent Application No. 61/547,561, filed Oct. 14, 2011, the entire contents of which are incorporated by reference herein.
Number | Date | Country | |
---|---|---|---|
20130097508 A1 | Apr 2013 | US | |
20180129407 A9 | May 2018 | US |
Number | Date | Country | |
---|---|---|---|
61547651 | Oct 2011 | US |