An increasing number of people own and use digital video recorders to make videos that capture their experiences and document events in their lives. One of the problems with consumer home video acquisition devices such as digital video cameras is that they are linear devices, and a single recording, whether digital or on tape, may contain multiple “events” (e.g., a birthday party, a soccer game, a vacation). Each event may in turn include multiple “clips” or “shots” (e.g., the sequence of contiguous video frames between the time when the camera is instructed to start recording and when it is instructed to stop recording). Moreover, each shot may consist of one or more scenes. Unfortunately, the linear nature of typical video recordings often makes it difficult to find and play back a segment of the video showing a specific event, scene, or shot.
It is usually more convenient for the user if a long video is divided into a number of shorter segments that the user can access directly. Ideally the video should be divided at the points where natural discontinuities occur. Natural discontinuities include discontinuities in time (e.g., gaps in the recorded DV time code) as well as discontinuities in content (e.g., scene changes). If the recording is continuous on a digital video (DV) tape, for example, the time code should increment by a predictable value from frame to frame. If the recording is not continuous (e.g., the user stops the recording and then records again later), there will be a gap in the time stamp that is larger than the normal frame-to-frame increment. Such gaps correspond to discontinuity points in time. Similarly, if there is no sudden motion or lighting change, the video content remains generally continuous as well. A sudden change in the video content may suggest the occurrence of some event in the video. Such sudden changes correspond to discontinuity points in content. A time-based or content-based discontinuity point in a video is sometimes referred to as a shot boundary, and the portion of a video between two consecutive shot boundaries is considered to be a shot.
Known video playback, browsing, and editing applications, such as multimedia editing applications (MEAs), bring versatility to such linear video recordings by allowing the user to capture or transfer the video onto a personal computer and then manually segment the digital video file into events of the user's choosing. Some MEAs make this easier for the user by attempting to automatically detect clip boundaries within a particular captured video file, using various detection methods. Thereafter, the MEA may segment the video file into clips that are displayed in a library, allowing the user to manually select clips and combine them to form recordings of the events of the user's choosing. However, as known to those skilled in the art, these applications are unable to capture and edit a video simultaneously.
Aspects of the invention not only detect clips in a video being captured, but also simultaneously display and edit the video during the capturing process. The invention provides a remote user-interface for allowing a user to interact with clips during the capturing process. Accordingly, a personalized home movie may be created using the remote interface during the capture process.
Computer-readable media having computer-executable instructions for segmenting videos embody further aspects of the invention. Alternatively, embodiments of the invention may comprise various other methods and apparatuses.
Other features will be in part apparent and in part pointed out hereinafter.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Corresponding reference characters indicate corresponding parts throughout the drawings.
Referring first to
At the next higher level, digital video 102 comprises multiple video shots, or clips, 106, each including one or more video frames 104. As shown by timeline 108, each video clip 106 represents a continuously recorded portion of the digital video 102 between a record operation R and a stop operation S of the recording device. Within a video clip 106, each video frame 104 after the first has a start date and time equal to the start date and time of the previous video frame 104 plus the duration D, indicated by reference character 110, of the previous video frame 104. As known to those skilled in the art, the difference between the start time of the last frame of one shot and the start time of the first frame of the next shot is always greater than the duration of a single video frame 104. This time difference may be a few seconds, or it may be several minutes, hours, days, or even months, and typically corresponds to the time between the user pressing stop on a video recording device (e.g., a camcorder) and the next time the user starts recording.
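As a rough illustration of this timing rule, the sketch below groups a sequence of frames into clips, starting a new clip whenever the gap between consecutive start times exceeds the duration of the preceding frame. The Frame record and the split_into_clips helper are hypothetical names used for illustration only, not part of the MEA described herein.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class Frame:
    start: datetime      # start date and time of the video frame
    duration: timedelta  # duration D of the video frame

def split_into_clips(frames: List[Frame]) -> List[List[Frame]]:
    """Group frames into clips; a new clip begins wherever the gap between
    consecutive frame start times exceeds the duration of the preceding frame."""
    clips: List[List[Frame]] = []
    for i, frame in enumerate(frames):
        new_clip = i == 0 or frame.start - frames[i - 1].start > frames[i - 1].duration
        if new_clip:
            clips.append([])  # the first frame, or a record/stop gap, starts a clip
        clips[-1].append(frame)
    return clips
```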
Referring now to
In one embodiment, the digital video camera 206 records a visual image or series of visual images and generates the video stream 205 representative of the visual image or series of visual images. The video stream 205 includes video data 208 specifying the start time and date of the individual video images or “video frames” included in the video stream 205.
The remote CRM 207 may be any CRM storing video data 208 that may be linked to the computer 202 for the purpose of transferring or storing video data 208. For example, the remote CRM 207 may be an optical disc in a DVD-drive, another computer, a personal video recorder (PVR), or any other video-capable device that may be linked to the computer 202 via a network (e.g. Ethernet) or direct connection (e.g. Universal Serial Bus) such that video data 208 stored on the remote CRM 207 may be transferred to the computer 202 or received from the computer 202 via electronic means such as file transfer or electronic mail.
A capture tool 211 is linked to the computer 202 and the digital video camera 206 for capturing the video stream 205. The capture tool 211 transfers the digital data directly to the MEA 204 or directly to the CRM 212 (e.g., a hard drive or random access memory (RAM)) of the computer 202 for storage as a video file 214 containing, for example, DV data. Alternatively, the capture tool 211 may convert the format of the digital video stream 205 from one digital video format to another during capture. For example, the capture tool 211 may convert the format of the video stream 205 from DV data to Windows Media Video (WMV) while preserving the date and time information about each of the series of video frames 104 included in the video data 208. The capture tool 211 may also change the timing or the number of frames present within the video stream 205. For example, the capture tool 211 may convert the frame rate of the video stream 205 to a different frame rate while preserving the start time of each new video frame 104 created and calculating a new duration for each video frame 104. The capture tool 211 may be implemented using software that writes DV/Audio Video Interleave (AVI) files together with a direct connection such as an Institute of Electrical and Electronics Engineers (IEEE) 1394 interface. The IEEE-1394 interface may be connected to an IEEE-1394 connection port on a digital camcorder and to an IEEE-1394 connection port on the computer 202 to facilitate the transfer of the video stream 205, generated by the digital video camera 206, to the computer 202 for storage. Although the capture tool 211 is described as capturing a video stream 205, it is contemplated that audio information (i.e., an audio stream) that corresponds to a particular video 102 may also be captured. Thus, as used herein, the discussion relating to video is applicable to both video and audio information.
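The frame-rate conversion described above amounts to bookkeeping over per-frame start times and durations. Below is a minimal sketch, assuming frames are available as (start_time, duration) pairs and using a simplified drop-frame resampling; the helper name and the approach are assumptions for illustration, not the capture tool's actual implementation.

```python
from datetime import timedelta

def retime_frames(frames, target_fps):
    """Resample (start_time, duration) pairs to a new frame rate, preserving the
    original start time of every retained frame and giving it a new duration
    computed from the target rate (simplified frame-dropping only)."""
    new_duration = timedelta(seconds=1.0 / target_fps)
    retimed = []
    next_emit = None  # earliest start time at which the next frame may be kept
    for start, _duration in frames:
        if next_emit is None or start >= next_emit:
            retimed.append((start, new_duration))  # original start time is preserved
            next_emit = start + new_duration
    return retimed
```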
An MEA 204 allows a user to archive video recordings from video tape in a digital format. The MEA 204 further allows a user to dynamically or selectively edit the video data 208 during the video stream 205 capturing process. That is, the user is not required to wait until the video stream 205 has been completely captured before editing the video data 208 included in the video stream. Moreover, embodiments of the MEA of the invention allow the user to remotely edit the video 102 during the capturing process. In other words, the user is not required to be located at the computer terminal, but may use a remote control (not shown) to send commands to the computer executing the MEA to initiate the editing process. As a result, the editing and capturing process may be experienced by multiple viewers.
The MEA 204 in the illustrated embodiment provides a user interface (UI) 220 for selectively defining a transition effect to insert between consecutive video clips and/or special effects to apply to one or more video clips during the video capturing process. The UI 220 also provides the user the ability to view video clips as they are being captured, the ability to preview edits made as the video is being captured, and the ability to store and transfer the edited video file after the video capturing process has been completed.
The exemplary operating environment illustrated in
Although described in connection with an exemplary computing system environment, aspects of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. The computing system environment is not intended to suggest any limitation as to the scope of use or functionality of aspects of the invention. Moreover, the computing system environment should not be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment. Examples of well known computing systems, environments, and/or configurations that may be suitable for use in embodiments of the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, mobile telephones, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Embodiments of the invention may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include, but are not limited to, routines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types. Aspects of the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Referring now to
A segmentation component 312 segments the video stream 205 into video clips 106 as a function of the determined differences between property values of successive video frames 104 as the video stream 205 is being captured. For example, the segmentation component 312 compares the distance or difference between the color histograms of adjacent frames 104 to a threshold value and defines a segment boundary between adjacent video frames 104 having a difference greater than the threshold value. The segmentation component 312 is responsive to the defined segment boundary to segment the video stream 205 into video segments, or video clips. For example, referring briefly to
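A minimal sketch of this histogram comparison appears below, assuming decoded frames are available as RGB arrays. The histogram binning, the L1 distance, and the 0.4 threshold are illustrative assumptions rather than the values used by the segmentation component 312.

```python
import numpy as np

def color_histogram(frame_rgb, bins=8):
    """Normalized joint RGB histogram of one frame (an H x W x 3 uint8 array)."""
    hist, _ = np.histogramdd(frame_rgb.reshape(-1, 3),
                             bins=(bins, bins, bins), range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def find_segment_boundaries(frames, threshold=0.4):
    """Return indices i such that a segment boundary is defined between
    frame i-1 and frame i, based on the distance between adjacent histograms."""
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist = color_histogram(frame)
        if prev_hist is not None and np.abs(hist - prev_hist).sum() > threshold:
            boundaries.append(i)  # adjacent frames differ by more than the threshold
        prev_hist = hist
    return boundaries
```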
Referring again to
The segmentation component 312 also calculates differences in time property values between successive video frames 104 to define the segment boundary 313. For example, if the time difference between a particular set of successive video frames 104 indicates a break in time, the segmentation component 312 defines a segment boundary between that particular set of video frames 104. There are at least two sets of time property data that can be used to determine whether there is a break in time: time codes and time stamps. The time code is contiguous on the tape unless the user removes and reinserts the tape. The time stamp represents when the video was actually recorded. If there is a break in either the time stamp or the time code, a segment boundary is defined.
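A small sketch of this test, assuming per-frame metadata that carries both a frame-count time code and a date/time stamp; the FrameMeta fields and the 29.97 fps frame period are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class FrameMeta:
    timecode: int        # DV time code expressed as a running frame count
    timestamp: datetime  # date and time at which the frame was recorded

def is_time_boundary(prev: FrameMeta, curr: FrameMeta,
                     frame_period: timedelta = timedelta(microseconds=33367)) -> bool:
    """True when either the time code or the time stamp jumps by more than the
    normal frame-to-frame increment, i.e. there is a break in time."""
    timecode_break = curr.timecode - prev.timecode != 1
    timestamp_break = curr.timestamp - prev.timestamp > frame_period
    return timecode_break or timestamp_break
```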
Other property value determination techniques including face recognition algorithms, GPS data, audio analysis (loud/soft, break in background music, music beat detection), accelerometer (physical movement of the recording device), detection of movement within the video may be used to assign property values to video frames 104 in order to distinguish adjacent video clips 106.
An edit component 318 is responsive to the defined segment boundary to apply a default transition effect between the successive video clips 106, separated by the defined segment boundary, while the video stream 205 is being captured from the video source 206. As known to those skilled in the art, a transition effect is added between successive video clips 106 in a video editing process to provide a desired visual transition between consecutive scenes on the video. Examples of such transitions include the following: (1) dissolve; (2) side wipe; (3) checkerboard cut; (4) fade; (5) circle effect; (6) flip; (7) wipe; and (8) lines. (See
For example, consider that the MEA 300 has captured first and second successive clips of a video. The first clip has an end time, as determined from time data included in the last video frame 104 of the first clip, and the second clip has a start time, as determined from time data included in the first video frame 104 of the second clip. If the difference between the end time of the first clip and the start time of the second clip is less than a predetermined time period (e.g., less than 5 minutes), a soft transition type such as a dissolve effect is applied between the first and second clips. As another example, if content property values of video frames included in the first and second clips, as determined by color histogram analysis, indicate that the video clips were recorded in locations having substantially different backgrounds, a more complex transition type such as “fade to color” is applied between the first and second clips.
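These rules can be summarized as a small decision function. In the sketch below, the five-minute figure follows the example above, while the histogram-distance threshold and the fallback transition are assumptions.

```python
from datetime import timedelta

def choose_default_transition(time_gap, background_distance,
                              soft_gap=timedelta(minutes=5),
                              background_threshold=0.5):
    """Pick a default transition from the time gap between two clips and the
    color-histogram distance between their content."""
    if time_gap < soft_gap:
        return "dissolve"       # clips recorded close together in time
    if background_distance > background_threshold:
        return "fade to color"  # substantially different backgrounds
    return "fade"               # assumed fallback when neither rule applies
```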
The edit component 318 is further responsive to the defined segment boundary to apply a default special effect to each video clip while the video data stream is being captured from the video source. According to aspects of the invention, the edit component 318 determines a default special effect to apply to a particular video clip as a function of the determined property values of the video frames included in that particular video clip and/or based on special effect selection rules stored in a memory. For example, a pan and zoom effect may be automatically added to still images, flash and board effects may be automatically added to very short video clips, and a picture-in-picture effect may be automatically applied to clips that overlap in time.
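A sketch of such a rule table follows, assuming a clip object exposing a few simple properties; the field names, the rule order, and the two-second cutoff for “very short” clips are assumptions.

```python
from datetime import timedelta

def choose_default_effect(clip):
    """Map simple clip properties to a default special effect, following the
    examples above."""
    if clip.is_still_image:
        return "pan and zoom"          # still images get a pan and zoom effect
    if clip.duration < timedelta(seconds=2):
        return "flash and board"       # very short clips
    if clip.overlaps_in_time:
        return "picture-in-picture"    # clips that overlap in time
    return None                        # no default special effect
```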
A user interface (UI) component 320 allows the user to view and/or edit video clips 106 while the video stream 205 is being captured. More specifically, the UI component 320 displays a graphical user interface to the user via a display (e.g., display 221) that allows the user to view video clips 106, selectively edit video clips, and preview any edits made to video clips while the video stream is being captured from the video source 206.
Referring now to
The UI includes one or more transition controls 520 for adding a transition effect (transition) or replacing an existing transition (e.g., overriding a default transition) between video clips. In this particular embodiment, a transition control 520 is located between each of the miniaturized windows (e.g., 508, 510, 512, and 514) displayed in the clip queue window 506. By selecting a transition control 520 between a particular set of miniaturized windows (e.g., 508, 510, 512, and 514), a transition selection menu displaying various transition effects that may be applied between the corresponding video clips is presented to the user via the UI 500. For example, if the user selects the transition control 520 between the mini-capture window 508 and the mini-preview window 510, a first transition menu 522 such as shown in the screen shot illustrated in
Alternatively, a second transition menu 524 such as shown in the screen shot illustrated in
Referring back to
The UI 500 displays a rating field 532 for displaying rating data that has been assigned to the video clip 106 by the analysis component 310. The rating field 532 displays rating data for the particular video clip 106 being played in the capture window 504. Alternatively, if the user has selected a particular mini-preview window (e.g., 510, 512, and 514) from the clip queue window 506, the rating field 532 displays rating data for that particular video clip 106. According to one aspect of the invention, the analysis component 310 is responsive to content included in video clips 106 being captured to assign a rating to each captured video clip 106. The rating data is presented via a rating scale that displays between one and four stars, where one star corresponds to the lowest rating and four stars correspond to the highest rating. For example, the analysis component 310 determines a rating for each video clip 106 being captured as a function of the quality of the video frames 104 included in each of the captured video clips, whether faces are detected in the video clips, or the speed at which the video clip was originally recorded. The analysis component 310 determines the quality of the video by determining whether the video frames 104 included in the video clip are dark or blurry. Video clips identified as dark or blurry are assigned a low rating (e.g., one star). The analysis component 310 determines whether faces are included in video clips by using face detection techniques such as described in commonly owned U.S. patent application Ser. No. 10/850,883 to Blake et al. If faces are detected in the video clip, a higher rating (e.g., a 2.5-star rating) is assigned to the video clip. The analysis component 310 uses motion vector analysis to analyze each video clip to identify clips that have motion above or below a certain threshold (e.g., slow motion). If slow motion is detected within a video clip, a high rating (e.g., 4 stars) is assigned to the video clip. According to one aspect of the invention, the edit component 318 is responsive to the rating assigned by the analysis component 310 to delete clips from the video having an assigned rating less than or equal to a threshold rating. For example, video clips 106 that have been assigned a rating of less than 2 stars are deleted from the video. A user may override an assigned rating by selecting the clip window from the clip queue and adjusting a star rating property. More specifically, when the clip is previewed in the main window 518, the clip's star rating is displayed. Pressing the keys 1-4, or navigating to and selecting the star, will adjust the clip's star rating. If the user changes the star rating, the auto movie is instantly updated and the current clip is replayed.
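A minimal sketch of this rating logic, assuming the analysis component has already produced boolean flags for darkness/blur, face detection, and slow motion; the rule precedence and the neutral default rating are assumptions not stated above.

```python
def rate_clip(is_dark_or_blurry, has_faces, has_slow_motion):
    """Assign a one-to-four star rating from per-clip analysis flags."""
    if has_slow_motion:
        return 4.0   # slow motion detected: highest rating
    if has_faces:
        return 2.5   # faces detected: above-average rating
    if is_dark_or_blurry:
        return 1.0   # dark or blurry frames: lowest rating
    return 2.0       # assumed neutral default for clips matching no rule

def prune_low_rated_clips(rated_clips, threshold=2.0):
    """Drop clips rated below the threshold (two stars in the example above)."""
    return [(clip, rating) for clip, rating in rated_clips if rating >= threshold]
```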
A playback control 538 controls the playback of a video clip. More specifically, the playback control 538 allows a user to stop (S), play/pause (P), rewind (R), or fast forward (F) a video clip 106 being played in the capture window 504, or being played in an active preview window 518.
After the video capturing process has completed, the UI component 314 displays a different graphical user interface to allow the user to burn the captured video to a DVD, edit the video, add a title, or share the video. Referring now to
Referring now to
The order of execution or performance of the operations in embodiments of the invention illustrated and described herein is not essential, unless otherwise specified. That is, the operations may be performed in any order, unless otherwise specified, and embodiments of the invention may include additional or fewer operations than those disclosed herein. For example, it is contemplated that executing or performing a particular operation before, contemporaneously with, or after another operation is within the scope of aspects of the invention.
When introducing elements of aspects of the invention or the embodiments thereof, the articles “a,” “an,” “the,” and “said” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.
As various changes could be made in the above constructions, products and methods without departing from the scope of embodiments of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.