Conventional sound amplification and mixing systems have been employed for processing a musical score from a fixed medium to a rendered audible signal perceptible to a user or audience. The advent of digitally recorded music via CDs coupled with widely available processor systems (i.e. PCs) has made digital processing of music available to even a casual home listener or audiophile. Conventional analog recordings have been replaced by audio information from a magnetic or optical recording device, often in a small personal device such as MP3 and Ipod® devices, for example. In a managed information environment, audio information is stored and rendered as a song, or score, to a user via speaker devices operable to produce the corresponding audible sound to a user.
In a similar manner, computer based applications are able to manipulate audio information stored in audio files according to complex, robust mixing and switching techniques formerly available only to professional musicians and recording studios. Novice and recreational users of so-called “multimedia” applications are able to integrate and combine various forms of data such as video, still photographs, music, and text on a conventional PC, and can generate output in the form of audible and visual images that may be played and/or shown to an audience, or transferred to a suitable device for further activity.
Digitally recorded audio has greatly enabled the ability of home or novice audiophiles to amplify and mix sound data from a musical source in a manner once only available to professionals. Conventional sound editing applications allow a user to modify perceptible aspects of sound, such as bass and treble, as well as adjust the length by performing stretching or compressing on the information relative to the time over which the conventional information is rendered.
Conventional sound applications, however, suffer from the shortcoming that modifying the duration (i.e. time length) of an audio piece changes the tempo because the compression and expansion techniques employed alter the amount of information rendered in a given time, tending to “speed up” or “slow down” the perceived audio (e.g. music). Also, it can be difficult for novice users to combine portions of audio to meet a prescribed desired time duration. Further, conventional applications cannot rearrange discrete portions of the musical score without perceptible inconsistencies or artifacts (i.e. “crackles”, “phase erasement” or “pops”) as the audio information is switched, or transitions, from one portion to another.
Accordingly, configurations herein substantially overcome the shortcomings presented by conventional audio mixing and processing applications by defining an architecture and mechanism of storing audio information in a manner operable to be rearranged, or recombined, from discrete parts of the audio information into a finished musical composition piece of a predetermined length without detectable inconsistencies between the integrated audio parts from which it is combined. The example audio rearranger presented herein rearranges an audio piece (song) by concatenating the constituent parts into a finished composition having a predetermined duration (length). The method identifies a decomposed set of audio information in a file format indicative of a time and relative position of parts of the musical score, or piece, and identifies, for each part, a function and position in the recombined finished composition. Each of the stored parts is operable to be recombined into a seamless, continuous composition of a predetermined length providing a consistent user listening experience despite variations in duration.
The disclosed configuration provides time specification and limiting while adhering to a general musical experience by using a minimization technique that selects a song structure with least repetition. The minimizing technique further deviates minimally from the structure to achieve the desired length by rearranging the parts in the same or similar structure as the original. Employing such a rearranger allows less skilled users to adjust pre-composed songs to a desired length without involving a composer and thus mitigating resource (time and money) usage in developing a time conformant rendering of a song or other musical score.
The example shown herein presents an audio editing application that employs aggregation rules applicable to the parts of a song to produce a logical sequence of musical parts based on the type of the parts. The aggregation rules identify an ordering of the parts in the recombined, finished composition. A set of song structures identifies a mapping of sequential types of song parts that indicate allowable ordering of the types. In concurrence with the aggregation rules, the recombiner selects parts of a particular length to satisfy the desired total duration. Certain parts may be replicated in succession, to produce a duration multiple (e.g. 2 times, 3 times, etc.) of a part. The parts may also have part variations including similarly renderable (i.e. sounding similar) parts with a different duration. The aggregation rules attempt to minimize repetition while maintaining musical structure (i.e. logical part progression) in the finished composition.
The disclosed recombination mechanism allows the audio editing application to manipulate and recombine segments of a musical piece such that the resulting finished composition includes parts (segments) from the decomposed piece, typically a song, adjustable for length by selectively replicating particular parts and combining with other parts such that the finished composition provides a similar audio experience in the predetermined duration. The segments define the parts with part variations of independent length, and identified as performing a function of starting, middle, (looping) or ending parts. Each of the parts provides a musical segment that is integratable with other parts in a seamless manner that avoids audible artifacts (e.g. “pops” and “clicks” or “phase erasement”) common with conventional mechanical switching and mixing. Each of the parts further includes attributes indicative of the manner in which the part may be ordered, whether the part may be replicated or “looped” and modifiers affecting melody and harmony of the rendered finished composition piece, for example.
In further detail the method of processing and rendering audio information as disclosed herein includes computing a plurality of parts of an audio piece, such that each of the parts has a function and a duration, in which the function is indicative of a recombinable order of the parts, and the duration is indicative of a time length of the part. A file repository organizes each of the parts according to length and function, and a rearranger arranges a sequence of the parts according to an aggregate duration, in which arranging further includes ordering the parts according to the function of the preceding part and the combined duration of the aggregate parts.
In an example configuration, arranging the parts further includes gathering, from an audio source, a set of parts of the audio piece, each of the parts having a duration and a function, in which the function is indicative of the ordering of the parts in a renderable audio composition. A recombiner combines the set of parts in a sequence of parts to compute a renderable audio composition of a predetermined length based on the aggregate duration. The sequence of parts may include, for example, a part of a starting function, at least one part of a looping function, and a part of an ending function. Other sequences defined by song structures may be employed.
Further, the parts may include part variations, such that each of the part variations has the same type and a particular independent duration of the audio content contained in the part. Arranging the series of parts further includes building a finished composition piece by iteratively selecting a next part for concatenation to the finished composition. Iterating through available parts includes examining the available parts for concatenation, and computing, based on aggregation rules, a type of part adapted for inclusion as the next part. The iteration computes, if the type of part is adapted for inclusion, part variations of the part, each part variation having a different duration, and selects, if a part variations having a corresponding duration is found, the part variation. The selected corresponding duration is operable to provide a predetermined duration to the finished composition from all of the aggregated parts.
In an example configuration, the recombiner employs aggregation rules for identifying a song structure, in which the song structure is indicative of a sequence of part types operable to provide an acceptable musical progression. The recombiner selects, for each iteration, a part variation having a type corresponding to the song structure. Particular arrangements determine a resizability attribute for each of the parts, and concatenate, if the part is resizable, multiple iterations of the part to achieve a desired aggregate (total) duration of the rearranged renderable part. If a part is resizable, the recombiner computes an optimal number of iterations based on the duration of available parts, the duration minimizing duplicative rendering of the rearranged parts.
Particular configurations determine a recombination mode, in which the recombination mode is operable to automatically arrange types of parts such that the part structure may be modified in the generated renderable sequence of parts.
Alternate configurations of the invention include a multiprogramming or multiprocessing computerized device such as a workstation, handheld or laptop computer or dedicated computing device or the like configured with software and/or circuitry (e.g., a processor as summarized above) to process any or all of the method operations disclosed herein as embodiments of the invention. Still other embodiments of the invention include software programs such as a Java Virtual Machine and/or an operating system that can operate alone or in conjunction with each other with a multiprocessing computerized device to perform the method embodiment steps and operations summarized above and disclosed in detail below. One such embodiment comprises a computer program product that has a computer-readable medium including computer program logic encoded thereon that, when performed in a multiprocessing computerized device having a coupling of a memory and a processor, programs the processor to perform the operations disclosed herein as embodiments of the invention to carry out data access requests. Such arrangements of the invention are typically provided as software, code and/or other data (e.g., data structures) arranged or encoded on a computer readable medium such as an optical medium (e.g., CD-ROM), floppy or hard disk or other medium such as firmware or microcode in one or more ROM or RAM or PROM chips, field programmable gate arrays (FPGAs) or as an Application Specific Integrated Circuit (ASIC). The software or firmware or other such configurations can be installed onto the computerized device (e.g., during operating system or execution environment installation) to cause the computerized device to perform the techniques explained herein as embodiments of the invention.
The foregoing and other objects, features and advantages of the invention will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
Conventional sound applications suffer from the shortcoming that modifying the duration (i.e. time length) of an audio piece tends to change the tempo because the compression and expansion techniques employed alter the amount of information rendered in a given time, tending to “speed up” or “slow down” the perceived audio (e.g. music). Further, conventional methods employing mechanical switching and mixing tend to introduce perceptible inconsistencies (i.e. “crackles” or “pops”) as the audio information is switched, or transitions, from one portion to another. Configurations discussed below substantially overcome the shortcomings presented by conventional audio mixing and processing applications by defining an architecture and mechanism of storing audio information in a manner operable to be rearranged, or recombined, from discrete parts of the audio information. The resulting finished musical composition has a predetermined length from the constituent parts, rearranged by the rearranger without detectable inconsistencies between the integrated audio parts from which it is combined. Accordingly, configurations herein identify a decomposed set of audio information in a file format indicative of a time and relative position of parts of the musical score, or piece, and identify, for each part, a function and position in the recombined finished composition. Each of the stored parts is operable to be recombined into a seamless, continuous composition of a predetermined length providing a consistent user listening experience despite variations in duration.
The rearranger 130 further includes a recombiner 132, aggregation rules 134 and song structures 136. The recombiner 130 is operable to rearrange and reorder the parts 114 into a composition 138 of reordered segments 144-1 . . . 144-4 (144 generally) corresponding to the parts 114. Each of the segments 144 is a part variation having a particular duration, discussed further below. Each part variation 144 includes tracks having one or more clips, discussed below. The aggregation rules 134 employ a function of each of the parts 114 that indicates the order in which a particular part 114 may be recombined with other parts 114. In the example shown herein, the functions include starting, ending, and looping (repeatable) elements. Alternate parts having other functions may be employed; the recombinability specified by the function is granular to the clip and need not be the same for the entire part. The function refers to the manner in which the part, clip, or loop is combinable with other segments, and may be specific to the clip, or applicable to all clips in the part. The song structures 136 specify a structure, or type-based order, of each of the parts 114 used to combine different types of parts in a sequence that meets the desired duration. In the example configuration below, the recombiner 132 computes time durations of a plurality of parts 114 to assemble a composition 138 having a specified time length, or duration, received from the GUI 164.
In such a system, it is desirable to vary the length of a musical score, yet not deviate from the sequence of verses and intervening chorus expected by the listener. The rearranged composition 138 rendered to a user maintains an expected sequence of parts 114 (based on the function and type) to meet a desired time duration without varying the tempo by “stretching” or “compressing” the audio, while also preserving the musical “structure,” or logical progression of the parts. It should be noted that the concept of a “part” as employed herein refers to a time delimited portion of the piece, not to a instrument “part” encompassing a particular single instrument.
The rearranger 130 employs the decomposed song 112, which is stored as a set of files indexed as rearrangable elements 142-1 . . . 142-N (142 generally) on a local storage device 140, such as a local disk drive. The rearrangable elements 142 collectively include parts 114, part variations 144, and tracks and clips, discussed further below in
Therefore, in an example arrangement, the rearranger 130 computes for a given song variation (time length variant of a song) the length of the song (rearranged composition) 138 by combining all parts 114 contained in this song variation 138. For each part 114 all part variations are iteratively attempted in combination with any part variation of the other parts 114 of the song variation. If the resulting song variation duration is smaller than the desired length, the repetition count for all parts is incremented part by part. The rearranger 130 iterates as long as the resulting duration is equal or larger than the desired length. During the iteration part variations 144 are marked to be removed from search if the duration keeps being under the desired length. The 138 rearranger searches for a combination which gives the minimal error towards the desired length. (149,
The decomposer 110 organizes each of the parts 114 according to length and function, as depicted at step 201, and decomposes the song into rearrangeable elements 160 typically stored as individual files of tracks and clips, although any suitable file organization may be employed. The rearrangeable elements 160 therefore form a set of files of parts, responsive to the rearranger 130 for rearranging and reordering the parts 114 into the finished composition 138 according to the aggregation rules 134 and the desired predetermined duration. The rearranger 130 arranges a sequence 112 of the parts 114 according to an aggregate duration, in which arranging further includes ordering the parts according to the function of the preceding part and the combined duration of the aggregate parts, as depicted at step 302. The function of the part 114 indicates position relative to other parts, such as parts types which may follow or precede another, also referred to as the structure, discussed further below with respect to
In
The parts 114 further include attributes 160, including a function 161-1, a type 161-2, and a resizability 161-3. The function 161-1 is indicative of the ordering of the parts in the composition 138. In the example configuration, the function indicates a starting, ending, or looping part. The type 161-2 is a musical designation of the part in a particular song, and may indicate a chorus, verse, refrain, bridge, intro, or outtro, for example. The type indicates the musical flow of one part into another, such as a chorus between verses, or a bridge leasing into a verse, for example. The resizability 161-3 indicates whether a part 114 may be replicated, or looped multiple of times, to increase the duration of the resulting aggregate parts 114. This may be related to the function 161-2 (i.e. looping), although not necessarily.
The rearranger 130 arranging a sequence of the parts according to an aggregate duration, such that arranging further includes ordering the parts according to the function of the preceding part and the combined duration of the aggregate parts, as depicted at step 310. The aggregation rules, discussed further below with respect to
The recombiner selects, if a part variation 144 having a corresponding duration D is found, the part variation 144, the corresponding duration operable to provide a predetermined duration to the finished composition 138, as shown at step 321. Using the selected part variation 144, the recombiner builds the finished composition 138 piece by iteratively selecting a next part for concatenation to the finished composition, ass depicted at step 328. Therefore, a check is performed, at step 329, to determine if the intended duration 149 is reached, and control reverts to step 311 accordingly. Otherwise, the renderer 122 combines the set of parts selected in the sequence of parts 138 to compute a renderable audio composition 166 of a predetermined length based on the aggregate duration, as shown at step 330.
Referring now to
Referring to
The recombiner determining a recombination mode, in which the recombination mode is operable to automatically arrange types of parts such that the part structure is modified in the generated renderable sequence of parts, as shown at step 314. A check is performed, at step 315, to determine if recombination is enabled, meaning that the recombination may rearrange the structure (sequence of types) in the finished composition 138. If the recombination mode is enabled, then the structure (e.g. part 114 type ordering) is preserved, for example, the sequence of parts 138 includes a part of a starting function 114-1, at least one part of a looping function 114-2, and a part of an ending function 114-3, as depicted at step 316. In this mode, the recombiner selects, for each iteration, a part variation having a type corresponding to the song structure of the input score 102, as shown at step 317.
Otherwise If the recombination mode is enabled, the aggregation rules 134 may be employed to identify permissible song structures 136, or sequences of part types 161-1. The aggregation rules 136 identify a song structure such that the song structure i136 is indicative of a sequence of part types 161-1 operable to provide an acceptable musical progression, as shown at step 318. The recombiner 132 selects, for each iteration, a part variation 144 having a type 161-1 corresponding to the song structure 136 permitted by the aggregation rules 134 (e.g. 520, 540). Other structures may be specified by the song structures 136. The corresponding types 161-1 are determinable from a mapping of types, the mapping based on a logical musical progression defined by a predetermined song structure (520, 540), as shown at step 319. The recombiner selects the next part type 161-1 by iterating through the sequence defined by the song structure 136, as shown at step 320.
Referring to
Otherwise, at step 326, the recombiner concatenates, if the part is resizable, multiple iterations of the part 114 to achieve a desired aggregate duration of the rearranged renderable piece 138. In view of minimizing repetition, the aggregation rules specify repetition of the largest part that can be accommodated. Therefore, the recombiner computes, if a part is resizable, an optimal number of iterations based on the duration of available parts 114 (i.e. part variations 144), such that the duration minimizes duplicative rendering of the rearranged parts. Thus, 2 multiples of a 10 second part variation 144 are preferred to 4 multiples of a 5 second variation, for example.
Those skilled in the art should readily appreciate that the programs and methods for representing and processing audio information as defined herein are deliverable to a processing device in many forms, including but not limited to a) information permanently stored on non-writeable storage media such as ROM devices, b) information alterably stored on writeable storage media such as floppy disks, magnetic tapes, CDs, RAM devices, and other magnetic and optical media, or c) information conveyed to a computer through communication media, for example using baseband signaling or broadband signaling techniques, as in an electronic network such as the Internet or telephone modem lines. The disclosed method may be in the form of an encoded set of processor based instructions for performing the operations and methods discussed above. Such delivery may be in the form of a computer program product having a computer readable medium operable to store computer program logic embodied in computer program code encoded thereon, for example. The operations and methods may be implemented in a software executable object or as a set of instructions embedded in a carrier wave. Alternatively, the operations and methods disclosed herein may be embodied in whole or in part using hardware components, such as Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), state machines, controllers or other hardware components or devices, or a combination of hardware, software, and firmware components.
While the system and method for representing and processing audio information has been particularly shown and described with references to embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
Number | Name | Date | Kind |
---|---|---|---|
4881440 | Kakizaki | Nov 1989 | A |
RE33739 | Takashima et al. | Nov 1991 | E |
5525749 | Aoki | Jun 1996 | A |
5728962 | Goede | Mar 1998 | A |
5753843 | Fay | May 1998 | A |
6452082 | Suzuki et al. | Sep 2002 | B1 |
6541689 | Fay et al. | Apr 2003 | B1 |
6646194 | Kikumoto | Nov 2003 | B2 |
6872877 | Suzuki et al. | Mar 2005 | B2 |
20010023635 | Taruguchi et al. | Sep 2001 | A1 |
20020005109 | Miller | Jan 2002 | A1 |
20020189430 | Mukojima | Dec 2002 | A1 |
20030004701 | Ueta et al. | Jan 2003 | A1 |
20030177892 | Akazawa et al. | Sep 2003 | A1 |
20040089141 | Georges et al. | May 2004 | A1 |
20040255759 | Gayama | Dec 2004 | A1 |
20050016364 | Kamiya | Jan 2005 | A1 |
Number | Date | Country | |
---|---|---|---|
20080092721 A1 | Apr 2008 | US |