1. Field of the Invention
The present invention relates to media editing and, more particularly, to compositing media from multiple takes or pieces.
2. Description of the Related Art
In the course of producing a video, such as a movie, it is common for an audio engineer (or sound engineer) to add multiple audio tracks to a video track. This task can be referred to as audio production. It takes a substantial effort to place the audio tracks in the proper position with respect to the video track. Often, the audio tracks overlap so as to provide background noise, dialog, musical instruments, sound effects, etc. There are software programs that assist audio engineers with these tasks. One example of an existing audio editing/production application is “Soundtrack Pro,” available from Apple Inc. of Cupertino, Calif.
There is often a need to consider a plurality of “takes” of the same audio material. For example, when automatic dialog replacement is being performed, dialog for a scene is often recreated in a studio. When recreating dialog, an actor or voice professional may perform several takes, whereby the dialog is repeated several times. The audio engineer can then choose the best of the takes to assign to the audio track. However, often no one take is perfect. In such a case, if the audio engineer desires to manipulate the audio, the audio engineer has to manually form a resulting audio track. This manual process imposed on the audio engineer is difficult, time consuming and offers little flexibility. Similar difficulties occur when an audio engineer needs to cut together multiple takes of a musical performance of an artist, choir, band, etc. Thus, there is a need for improved approaches to facilitate usage of multiple takes.
The invention pertains to methods, graphical user interfaces, computer apparatus and computer readable medium for producing media content. For example, a user of a computing device can utilize the methods, graphical user interfaces and computer readable medium to edit the media content. In one embodiment, the media content pertains to media tracks, such as audio or video tracks. The media content can be a plurality of individual media tracks that can be segmented and the resulting segments from different media tracks can thereafter be combined into a composite media track.
The invention can be implemented in numerous ways, including as a method, system, device, apparatus (including graphical user interface), or computer readable medium. Several embodiments of the invention are discussed below.
As a computer-implemented method for producing a composite media asset from a plurality of digital media assets, one embodiment of the invention includes at least: accessing a plurality of digital media assets; forming media segments with respect to the digital media assets; and receiving selections of the media segments from one or more of the digital media assets.
As a computer-implemented method for producing a composite audio recording from a plurality of individual recordings, one embodiment of the invention includes at least: obtaining a plurality of individual recordings; concurrently displaying the individual recordings; configuring transition points with respect to the individual recordings, thereby forming media segments with respect to the individual recordings; receiving selections of the media segments from different ones of the individual recordings; and combining the selected media segments into a composite audio recording.
As a graphical user interface for producing a composite digital media asset from a plurality of available digital media assets, one embodiment of the invention includes at least: a time line; a plurality of digital media track representations displayed with respect to said time line, the digital media track representations corresponding to and representing available digital media asset tracks; at least one region indicator that denotes different regions for the available digital media asset tracks, the at least one region indicator being user configurable; and a user control that enables a user to select one or more of the available digital media asset tracks for utilization in each of the different regions.
As a graphical user interface for producing a composite media asset from a plurality of individual assets, one embodiment of the invention includes at least: a time line; a plurality of individual media track representations displayed with respect to said time line, the individual media track representations corresponding to and representing individual media tracks; and a composite media track representation displayed with respect to said time line. The composite media track representation can correspond to and represent a composite media track formed from different portions of the individual media tracks.
As a computer readable medium including at least computer program code for producing a composite audio recording from a plurality of digital media assets, one embodiment of the invention includes at least: computer program code for obtaining a plurality of digital media assets; computer program code for concurrently displaying the digital media assets; computer program code for forming regions with respect to the digital media assets; computer program code for receiving selections of the regions from different ones of the digital media assets; and computer program code for forming a composite recording based on the region selections.
As a computer-implemented method for producing a composite audio recording from a plurality of individual recordings, one embodiment of the invention includes at least: displaying a first audio track representation including a waveform of audio content of a first audio track; displaying a second audio track representation including a waveform of audio content of a second audio track; displaying a first transition indicator across both the first audio track representation and the second audio track representation; segmenting the first audio track into a plurality of segments based on at least the first transition indicator; segmenting the second audio track into a plurality of segments based on at least the first transition indicator; receiving a first user selection of one of the segments of the first audio track or the second audio track as a first selected segment; receiving a second user selection of one of the segments of the first audio track or the second audio track as a second selected segment; and displaying a composite audio track representation. The composite audio track representation can include a waveform of audio content of a composite audio track. The composite audio track representation is a waveform that is a composite from (i) at least a portion of the waveform of audio content of the first audio track or the second audio track corresponding to the first selected segment and (ii) at least a portion of the waveform of audio content of the first audio track or the second audio track corresponding to the second selected segment.
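The segmentation and splicing behavior described above can be sketched in code. The following Python fragment is a minimal illustration only, assuming each track is a mono list of samples and each transition indicator is a sample offset shared by all tracks; the names AudioTrack, segment_ranges and build_composite are hypothetical and not drawn from any particular implementation.

```python
# Illustrative sketch: segment takes at shared transition points and splice the
# user-selected segment of each region into a composite waveform.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AudioTrack:
    name: str
    samples: List[float]          # mono PCM samples, one take of the material

def segment_ranges(length: int, transition_points: List[int]) -> List[Tuple[int, int]]:
    """Turn transition points (sample offsets) into (start, end) segment ranges."""
    edges = [0] + sorted(transition_points) + [length]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]

def build_composite(tracks: List[AudioTrack],
                    transition_points: List[int],
                    selections: List[int]) -> List[float]:
    """selections[i] is the index of the track whose samples fill segment i."""
    length = min(len(t.samples) for t in tracks)
    composite: List[float] = []
    for (start, end), track_index in zip(segment_ranges(length, transition_points),
                                         selections):
        composite.extend(tracks[track_index].samples[start:end])
    return composite
```

For instance, with a single transition indicator at sample offset 1000 and selections [0, 1], the composite waveform takes its first 1000 samples from the first track and the remainder from the second.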
As a computing apparatus, one embodiment of the invention includes at least: an audio pickup device suitable for obtaining multiple takes of a particular audio recording; a display device capable of concurrently displaying the multiple takes; a memory configured to store at least audio content pertaining to the multiple takes; and a processing device. The processing device operates to at least: cause the multiple takes to be displayed on the display device in an adjacent manner; form regions with respect to the multiple takes; receive region selections from the regions formed with respect to the multiple takes; and associate audio segments associated with the region selections to form a composite audio recording.
Other aspects and advantages of the invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings which illustrate, by way of example, the principles of the invention.
The invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:
The invention pertains to methods, graphical user interfaces, computer apparatus and computer readable medium for producing media content. For example, a user of a computing device can utilize the methods, graphical user interfaces, computer apparatus, and computer readable medium to edit the media content. In one embodiment, the media content pertains to media tracks, such as audio or video tracks. The media content can be a plurality of individual media tracks that can be segmented and the resulting segments from different media tracks can thereafter be combined into a composite media track.
Embodiments of the invention are discussed below with reference to
The audio editing process 100 can obtain 102 multiple takes of a particular audio recording. The multiple takes are typically takes of generally similar audio content. For example, if the audio is user speech, the user might say the word, sentence or phrase multiple times, each of which can be considered a “take”. As another example, if the audio is a sound effect, the sound effect could be recorded multiple times.
The multiple takes can then be displayed 104 in an adjacent manner. The computing device typically has a display device and the multiple takes can be displayed 104 on the display in an adjacent manner. As one example, the multiple takes can be displayed in a vertically stacked manner. However, the multiple takes can be concurrently displayed in various other ways. In addition, transition points can be configured 106. The transition points are established with respect to the multiple takes and serve to effectively divide up the audio content of an audio recording into a plurality of regions. In one implementation, a user can interact with user interface controls to establish, move, resize, etc. one or more transition points. By establishing and/or manipulating transition points, the position and number of transition points for the multiple takes can be established.
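One way to picture the configuration of transition points is as a small bookkeeping structure that keeps the points sorted and clamped to the length of the takes, so that they always partition the audio content into contiguous regions. The sketch below is illustrative only and assumes sample-offset positions; the class name TransitionPoints is hypothetical.

```python
# Illustrative bookkeeping for transition points (hypothetical names): points are
# kept sorted and clamped to the recording length, so they always divide the
# takes into contiguous regions.
from bisect import insort
from typing import List, Tuple

class TransitionPoints:
    def __init__(self, length: int):
        self.length = length              # length of the takes, in samples
        self.points: List[int] = []

    def add(self, position: int) -> None:
        """Establish a new transition point, clamped to the recording bounds."""
        position = max(0, min(position, self.length))
        if position not in self.points:
            insort(self.points, position)

    def move(self, index: int, new_position: int) -> None:
        """Move an existing transition point to a new position."""
        del self.points[index]
        self.add(new_position)

    def regions(self) -> List[Tuple[int, int]]:
        """The (start, end) regions implied by the current transition points."""
        edges = [0] + self.points + [self.length]
        return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]
```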
Next, region selections can be received 108. Here, the user can make selections from the different regions from the different takes. Hence, after the region selections have been received 108, a composite recording can be formed 110 based on the region selections. In one implementation, it is assumed that a user interacts with user interface controls to select desired regions from the multiple takes that are to be combined and utilized as a composite recording. The composite recording is typically also displayed on the display device. Following the block 110, the audio editing process 100 can end.
The media editing process 200 can obtain 202 a plurality of individual recordings. The individual recordings can be displayed 204 proximate to one another. In one implementation, the individual recordings are displayed 204 on a display device associated with the computing device that performs the media editing process 200.
Next, transition points can be configured 206 with respect to the individual recordings so as to form segments. For a given individual recording, the segments formed based on the transition points result in selectable segments of the associated individual recording. After the segments are formed, selections of the selectable segments from different ones of the individual recordings can be received 208. The selectable segments from the various individual recordings can then be combined 210 to form a composite recording. Following the block 210, the media editing process 200 can end.
With respect to the media editing process 200, in one embodiment, the individual recordings are audio recordings, and the composite recording is a composite audio recording. Alternatively, in another embodiment, the individual recordings are video recordings, and the composite recording is a composite video recording. More generally, the media editing process 200 operates on digital media assets. For example, in one embodiment, the media editing process 200 operates on arbitrary audio assets, where individual recordings are only one type of audio asset.
With respect to the media editing process 100 or the media editing process 200, once formed, the resulting composite recording can thereafter be used as a single object by a production application or other application. Also, should there be a need to re-visit the editing utilized to form the composite recording, the resulting composite recording can be re-opened in the media editing process so as to again be able to modify its constituent regions/segments as before.
The media edit window 300 includes a plurality of tracks, namely, media tracks, that are arranged in an adjacent manner. As illustrated in
The media edit window 300 also includes transition points 306, 308 and 310. The transition points 306, 308 and 310 can be moved or re-sized by a user interacting with transition controls 312. The first track 302-1 is divided into four distinct segments, first segment 314-1, second segment 314-2, third segment 314-3 and fourth segment 314-4. The segments for the first track 302-1 are designated by the transition points 306, 308 and 310. The second track 302-2 includes segments 316-1, 316-2, 316-3 and 316-4. The third track 302-3 includes segments 318-1, 318-2, 318-3 and 318-4. The fourth track 302-4 includes segments 320-1, 320-2, 320-3 and 320-4. At this point, a user can interact with the media edit window 300 to select different segments across the different takes to form a composite track 322, which is identified by a composite label 324. For example, the composite track 322 results from the combination of the four selected segments, each of which can be drawn from any of the tracks 302-1, 302-2, 302-3 and 302-4.
In addition, each of the tracks 302-1, 302-2, 302-3 and 302-4 can include a “solo” control 326. On selection of the “solo” control 326, the associated track can then be utilized in its entirety as the composite track 322. For example, if the solo control 326-1 were selected, the composite track 322 would update to pertain to the first track 302-1. If the solo control 326-2 were selected, the composite track 322 would update to pertain to the second track 302-2. If the solo control 326-3 were selected, the composite track 322 would update to pertain to the third track 302-3. If the solo control 326-4 were selected, the composite track 322 would update to pertain to the fourth track 302-4. In one embodiment, use of the “solo” control is temporary, and the prior selections that were previously made for the composite track 322 are retained for use once the “solo” control is deselected.
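The temporary behavior of the “solo” control can be modeled as an override that determines which take feeds each region of the composite while leaving the stored per-region selections untouched. The following sketch is a hypothetical model, not an actual implementation; the name SoloState and its methods are illustrative.

```python
# Hypothetical model of the "solo" control (326): while a take is soloed, every
# region of the composite is fed by that take; the per-region selections are
# retained and take effect again when the solo is released.
from typing import Dict, Optional

class SoloState:
    def __init__(self, num_regions: int):
        self.region_selection: Dict[int, int] = {r: 0 for r in range(num_regions)}
        self.soloed_take: Optional[int] = None

    def select(self, region: int, take: int) -> None:
        """Record a per-region selection; kept even while a take is soloed."""
        self.region_selection[region] = take

    def set_solo(self, take: Optional[int]) -> None:
        """Solo one take in its entirety, or pass None to release the solo."""
        self.soloed_take = take

    def source_for_region(self, region: int) -> int:
        """Index of the take currently feeding this region of the composite."""
        if self.soloed_take is not None:
            return self.soloed_take
        return self.region_selection[region]
```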
Still further, the media edit window 400 includes a time line 408. The time line 408 provides a time context for reference with respect to the various tracks 302 that can be presented within the media edit window 400. In addition, the media edit window 400 can include one or more tool controls 410 that allow a user of the media edit window to select an appropriate tool for usage. Although the tool controls 410 can vary with implementation, some examples include a transition tool (for forming regions) and a selection tool (for selecting regions to be active).
Additionally, in one embodiment, the audio recording, namely, the associated waveform, in a given region can be manipulated by user action. For example, a user could interact with the waveform in a region to move it left or right. As another example, a user might also interact with the waveform in a region to alter its audio characteristics, such as by affecting its volume.
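These per-region manipulations can be illustrated with two small helper functions, one that nudges a region's audio earlier or later in time and one that scales its volume. The sketch assumes mono sample lists and linear gain; the function names are hypothetical.

```python
# Illustrative per-region manipulations (hypothetical helpers): shifting a
# region's audio in time and scaling its volume. Samples are mono floats.
from typing import List

def nudge(samples: List[float], offset: int) -> List[float]:
    """Shift audio by `offset` samples (positive = later in time), keeping the
    region length fixed and filling the vacated span with silence."""
    n = len(samples)
    if abs(offset) >= n:
        return [0.0] * n
    if offset >= 0:
        return [0.0] * offset + samples[:n - offset]
    return samples[-offset:] + [0.0] * (-offset)

def apply_gain(samples: List[float], gain: float) -> List[float]:
    """Scale the region's volume by a linear factor (e.g. 0.5 halves the amplitude)."""
    return [s * gain for s in samples]
```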
The multi-take edit process 600 can begin by opening 602 a multi-take editor. The multi-take editor is, for example, the multi-take editor discussed above in
A decision 612 then determines whether the multi-take editor should close. When the decision 612 determines that the multi-take editor should not close, the multi-take edit process 600 can return to repeat the block 606 and subsequent blocks so that further editing of the multiple takes can be performed to produce the composite recording. Once the decision 612 determines that the multi-take editor can close, the multi-take editor is closed 614. The resulting composite recording can then be used elsewhere on the computing device as a single composite recording, even though the composite recording was assembled from a plurality of different regions from different ones of the takes. After the multi-take editor has closed 614, the multi-take edit process 600 ends. The multi-take editor can subsequently be re-opened to edit the composite recording, for example to change its constituent regions.
The multiple takes being edited can be accessed in various ways. One way to acquire multiple takes is through recording of multiple takes.
The multi-take recording process 650 begins with a decision 652 that determines whether a multi-take request has been made. When the decision 652 determines that a multi-take request has not been received, the multi-take recording process 650 awaits such a request. In other words, the multi-take recording process 650 is initiated when a multi-take request is received. Once the decision 652 determines that a multi-take request has been received, a take duration is set 654. The take duration is the duration of time during which a media clip is recorded during each take.
Next, a decision 656 determines whether the takes for the recording should be started. When the decision 656 determines that the takes should not be started, the multi-take recording process 650 awaits the appropriate time to start the takes. Once the decision 656 determines that the takes should be started, the various takes are recorded 658. Typically, the takes are recorded one after another. In one implementation, the takes are recorded one after another in an automatic, successive fashion. A decision 660 determines whether more takes are still to be recorded. When the decision 660 determines that more takes are to be recorded, the multi-take recording process 650 returns to repeat the block 658 so that a next take can be recorded. On the other hand, when the decision 660 determines that no more takes are to be recorded, the multi-take recording process 650 can end.
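The record-takes loop of blocks 654 through 660 can be sketched as follows. The helper record_clip is a stand-in for whatever audio-capture API the computing device provides and is assumed rather than specified by the description above; the silent stand-in shown merely produces the right number of samples.

```python
# Illustrative sketch of blocks 654-660: record a fixed number of takes, one
# after another, each with the same take duration set in block 654.
from typing import Callable, List

def record_takes(num_takes: int,
                 take_duration_s: float,
                 record_clip: Callable[[float], List[float]]) -> List[List[float]]:
    takes: List[List[float]] = []
    for _ in range(num_takes):                       # decision 660: more takes?
        takes.append(record_clip(take_duration_s))   # block 658: record one take
    return takes

# Hypothetical stand-in for a real capture API: returns silence of the right length.
def silent_record_clip(duration_s: float, sample_rate: int = 44100) -> List[float]:
    return [0.0] * int(duration_s * sample_rate)

# Example: four takes of three seconds each, recorded in automatic succession.
takes = record_takes(4, 3.0, silent_record_clip)
```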
The various aspects, features, embodiments or implementations of the invention described above can be used alone or in various combinations.
The invention is preferably implemented by software, but can also be implemented in hardware or a combination of hardware and software. The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer readable medium include read-only memory, random-access memory, CD-ROMs, DVDs, magnetic tape, optical data storage devices, and carrier waves. The computer readable medium can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The advantages of the invention are numerous. Different aspects, embodiments or implementations may yield one or more of the following advantages. One advantage of the invention is that composite recordings can be formed from constituent parts selected from a plurality of individual media tracks. As an example, the constituent parts can pertain to multiple takes, such that different portions from different takes can be easily combined to produce a composite media track that is deemed more useful than any of the individual media tracks. Alternatively, as another example, the constituent parts can be arbitrary media tracks. Another advantage of the invention is that a composite track can be re-edited after being created. Still another advantage of the invention is that transition effects can be applied when merging constituent parts.
U.S. Provisional Patent Application No. 60/911,886, filed concurrently, and entitled “MULTIPLE VERSION MERGE FOR MEDIA PRODUCTION,” is hereby incorporated herein by reference.
U.S. patent application Ser. No. 11/735,466, filed concurrently, and entitled “MULTI-FRAME VIDEO DISPLAY METHOD AND APPARATUS,” is hereby incorporated herein by reference.
U.S. Provisional Patent Application No. 60/911,884, filed concurrently, and entitled “TECHNIQUES AND TOOLS FOR MANAGING ATTRIBUTES OF MEDIA CONTENT,” is hereby incorporated herein by reference.
The many features and advantages of the present invention are apparent from the written description. Further, since numerous modifications and changes will readily occur to those skilled in the art, the invention should not be limited to the exact construction and operation as illustrated and described. Hence, all suitable modifications and equivalents may be resorted to as falling within the scope of the invention.
Number | Name | Date | Kind |
---|---|---|---|
4558302 | Welch | Dec 1985 | A |
4591928 | Bloom et al. | May 1986 | A |
5029509 | Serra et al. | Jul 1991 | A |
5237648 | Mills | Aug 1993 | A |
5365254 | Kawamoto | Nov 1994 | A |
5467288 | Fasciano et al. | Nov 1995 | A |
5536902 | Serra et al. | Jul 1996 | A |
5732184 | Chao | Mar 1998 | A |
5781188 | Amiot et al. | Jul 1998 | A |
5792971 | Timis et al. | Aug 1998 | A |
5852435 | Vigneaux et al. | Dec 1998 | A |
6204840 | Petelycky et al. | Mar 2001 | B1 |
6351765 | Pietropaolo et al. | Feb 2002 | B1 |
6400378 | Snook | Jun 2002 | B1 |
6597375 | Yawitz | Jul 2003 | B1 |
6670966 | Kusanagi | Dec 2003 | B1 |
6714826 | Curley et al. | Mar 2004 | B1 |
6851091 | Honda et al. | Feb 2005 | B1 |
7017120 | Shnier | Mar 2006 | B2 |
7073127 | Zhao et al. | Jul 2006 | B2 |
7085995 | Fukuda et al. | Aug 2006 | B2 |
7120859 | Wettach | Oct 2006 | B2 |
7208672 | Camiel | Apr 2007 | B2 |
7213036 | Apparao et al. | May 2007 | B2 |
7325199 | Reid | Jan 2008 | B1 |
7336890 | Lu et al. | Feb 2008 | B2 |
7372473 | Venolia | May 2008 | B2 |
7437682 | Reid | Oct 2008 | B1 |
7444593 | Reid | Oct 2008 | B1 |
7541534 | Schnepel et al. | Jun 2009 | B2 |
7549127 | Chasen et al. | Jun 2009 | B2 |
7594177 | Jojic et al. | Sep 2009 | B2 |
7623755 | Kuspa | Nov 2009 | B2 |
7659913 | Makela | Feb 2010 | B2 |
7754959 | Herberger et al. | Jul 2010 | B2 |
7948981 | Schnepel et al. | May 2011 | B1 |
20020026442 | Lipscomb et al. | Feb 2002 | A1 |
20020091761 | Lambert | Jul 2002 | A1 |
20020175932 | Yu et al. | Nov 2002 | A1 |
20030002851 | Hsiao et al. | Jan 2003 | A1 |
20030009485 | Turner | Jan 2003 | A1 |
20030018978 | Singal et al. | Jan 2003 | A1 |
20030122861 | Jun et al. | Jul 2003 | A1 |
20040027369 | Kellock et al. | Feb 2004 | A1 |
20040160416 | Venolia | Aug 2004 | A1 |
20040205358 | Erickson | Oct 2004 | A1 |
20040234250 | Cote et al. | Nov 2004 | A1 |
20050042591 | Bloom et al. | Feb 2005 | A1 |
20050114754 | Miller et al. | May 2005 | A1 |
20050235212 | Manousos et al. | Oct 2005 | A1 |
20050268279 | Paulsen et al. | Dec 2005 | A1 |
20060100978 | Heller et al. | May 2006 | A1 |
20060106764 | Girgensohn et al. | May 2006 | A1 |
20060120624 | Jojic et al. | Jun 2006 | A1 |
20060150072 | Salvucci | Jul 2006 | A1 |
20060156374 | Hu et al. | Jul 2006 | A1 |
20060165240 | Bloom et al. | Jul 2006 | A1 |
20060168521 | Shimizu et al. | Jul 2006 | A1 |
20060224940 | Lee | Oct 2006 | A1 |
20060236221 | McCausland et al. | Oct 2006 | A1 |
20060284976 | Girgensohn et al. | Dec 2006 | A1 |
20070118873 | Houh et al. | May 2007 | A1 |
20070185909 | Klein et al. | Aug 2007 | A1 |
20070240072 | Cunningham et al. | Oct 2007 | A1 |
20070292106 | Finkelstein et al. | Dec 2007 | A1 |
20080041220 | Foust et al. | Feb 2008 | A1 |
20080126387 | Blinnikka | May 2008 | A1 |
20080256136 | Holland | Oct 2008 | A1 |
20080256448 | Bhatt | Oct 2008 | A1 |
20080263433 | Eppolito et al. | Oct 2008 | A1 |
20080263450 | Hodges et al. | Oct 2008 | A1 |
Entry |
---|
Digidesign Inc., “Pro Tools Reference Guide Version 5.0.1 for Macintosh and Windows”, 1999, Version 5.0.1, pp. 1-432. |
Mike Thornton, “Achieving Better Vocal Sounds”, Sound on Sound, Apr. 2005, pp. 1-4. |
John Walden, “Comp Performances”, Sound on Sound, Mar. 2001, pp. 1-3. |
“Soundtrack Pro User Manual”, Apple Computer, Inc., copyright 2005, pp. 1-311. |
“What is Audio Post Production”, Motion Picture Sound Editors, http://www.mpse.org/education/whatis.html, downloaded Feb. 1, 2008. |
Office Action for U.S. Appl. No. 11/735,466 mailed Feb. 3, 2010. |
“Adobe Audition—User Guide”, Adobe Systems Inc., 2003. |
“Adobe Audition 2.0”, Adobe Systems Inc., Oct. 8, 2005. |
“Adobe Studio on Adobe Audition 1.5—High Frequency Effects”, Adobe Systems Inc., 2005. |
Final Office Action for U.S. Appl. No. 11/735,466, mailed Jul. 20, 2010. |
Office Action for U.S. Appl. No. 12/060,010, mailed Jul. 19, 2010. |
Office Action for U.S. Appl. No. 12/060,010, mailed Jan. 6, 2011. |
Office Action for U.S. Appl. No. 12/082,898, mailed Mar. 2, 2011. |
Office Action for U.S. Appl. No. 12/082,899, mailed Mar. 2, 2011. |
Office Action for U.S. Appl. No. 12/060,010, mailed May 7, 2012. |
Number | Date | Country | |
---|---|---|---|
20080255687 A1 | Oct 2008 | US |