Conventional sound amplification and mixing systems have long been employed to process a musical score from a fixed medium into a rendered audible signal perceptible to a user or audience. The advent of digitally recorded music on CDs, coupled with widely available processor systems (i.e., PCs), has made digital processing of music available to even the casual home listener or audiophile. Conventional analog recordings have been replaced by audio information stored on magnetic or optical recording devices, often in small personal devices such as MP3 players and iPod® devices, for example. In a managed information environment, audio information is stored and rendered as a song, or score, via speaker devices operable to produce the corresponding audible sound to a user.
In a similar manner, computer-based applications are able to manipulate audio information stored in audio files according to complex, robust mixing and switching techniques formerly available only to professional musicians and recording studios. Novice and recreational users of so-called “multimedia” applications are able to integrate and combine various forms of data such as video, still photographs, music, and text on a conventional PC, and can generate output in the form of audible sound and visual images that may be played and/or shown to an audience, or transferred to a suitable device for further activity.
Digitally recorded audio has greatly enhanced the ability of home or novice audiophiles to amplify and mix sound data from a musical source in a manner once available only to professionals. Conventional sound editing applications allow a user to modify perceptible aspects of sound, such as bass and treble, as well as to adjust the length of a piece by stretching or compressing the information relative to the time over which it is rendered. Typically, a musical score is created by combining, or layering, various tracks. A track may contain one particular instrument (such as a flute), a family of instruments (e.g., all the wind instruments), various vocalists (such as the soloist, backup singers, etc.), the melody of the musical score (i.e., the predominant ‘tune’ of the musical score), or a harmony track (i.e., a series of notes that complement the melody).
Conventional sound applications, however, suffer from the shortcoming that modifying the duration (i.e., time length) of an audio piece changes the tempo, because the compression and expansion techniques employed alter the amount of information rendered in a given time, tending to “speed up” or “slow down” the perceived audio (e.g., music). Further, conventional applications cannot rearrange discrete portions of the musical score without perceptible inconsistencies (i.e., “crackles” or “pops”) as the audio information is switched, or transitions, from one portion to another. Additionally, conventional sound applications do not allow for modification of the audio information (i.e., the musical score) based on mapping discrete audio segments arranged by audio type within a control system. Nor do conventional sound editing applications provide a graphical user interface that allows a user to modify the audio information based on audio type. A further deficiency of conventional applications results from the lack of an audio data format that defines the raw audio files (as used in composing, rearranging and/or modifying a musical score) in a hierarchical structure such that the audio data format is accessible to, and compatible with, a wide range of sound editing applications. Similarly, conventional sound applications do not provide an audio data format that describes song aspects and audio files to, i) enable rearranging discrete audio portions of a musical score while preserving the tempo; and/or ii) enable modification of audio information based on mapping discrete audio segments arranged by audio type within a control system.
Accordingly, configurations herein substantially overcome these shortcomings by providing an audio formatting process that defines an audio data format. The audio data format enumerates aspects of a musical score in a predetermined syntax, or scripting language, in order to provide a seamless interface between sound editing applications and the organic audio information stored as raw audio files. The audio data format defines a hierarchical object model that identifies the various elements, segments, attributes, modifiers, etc., of a musical score and the interdependencies thereof in order to provide a manner of access from a sound editing application (or rendering application) to the audio files. The hierarchical format is conducive to rendering and storing musical score variations by the temporal aspects (e.g., duration and repeatability of audio segments or parts) and by the qualitative aspects (e.g., intensity, harmony, melody, etc.) of the tracks and clips associated with the musical composition.
In accordance with embodiments disclosed herein, an audio formatting process identifies a musical score of audio information operable to be rendered by a rendering application. The audio formatting process further enumerates aspects of the score such that the aspects are operable to define renderable features of the score. In addition, the aspects further define a duration modifiable by the rendering application to a predetermined duration that preserves the tempo of the score. Furthermore, the audio formatting process enumerates at least one field associated with each aspect of the score, the fields indicative of rendering the score. With the classification of the aspects, the audio formatting process is able to store the enumerated aspects according to a predetermined syntax that is operable to indicate to the rendering application the manner of accessing each of the aspects of the score.
In an example configuration, the audio formatting process enumerates a location of an aspect in the score such that the location defines an offset time relative to a reference point in the score. Similarly, the audio formatting process enumerates a modifiable attribute associated with at least one aspect of the score. In addition, the audio formatting process enumerates a sequential assignment of an aspect relative to at least one other aspect of the score. More specifically, in accordance with example configurations, the audio formatting process enumerates a song aspect that identifies the available variations of the score. The audio formatting process also enumerates a part aspect that identifies parts of the score such that each of the parts defines a segment of the score operable as a rearrangeable element. With respect to the fields associated with aspects of the score, the audio formatting process identifies a name associated with the part. The audio formatting process also identifies a type associated with the part such that the type is indicative of a sequential ordering of the part. Additionally, the audio formatting process identifies a part variation identifier associated with the part. As such, the part variation identifier describes the content of a part length variation.
In another example embodiment, the audio formatting process enumerates an intensity aspect indicative of at least one intensity value for tracks of the score, wherein each track is operable to render audio content. In this manner, the audio formatting process identifies at least one track associated with the intensity value of the respective intensity aspect. Moreover, the audio formatting process enumerates a modifier aspect indicative of at least one modifier value for a plurality of tracks operable to render audio content. The audio formatting process also identifies a plurality of tracks associated with the modifier value of the respective modifier aspect. In a similar embodiment, the audio formatting process enumerates a melody attribute indicative of a melody value for the plurality of tracks. Likewise, the audio formatting process enumerates a harmony attribute indicative of a harmony value for the plurality of tracks. As per one example configuration, the audio formatting process identifies a preset value for each modifiable attribute such that the preset value indicates an initial value for each modifiable attribute.
In yet another embodiment, the audio formatting process enumerates a track aspect indicative of at least one track operable to render audio content. In this respect, the audio formatting process also identifies at least one clip associated with the at least one track of the score. Furthermore, the audio formatting process identifies a location associated with the at least one clip, the location defining an offset time relative to a reference point in the score. According to one embodiment disclosed herein, the audio formatting process specifies a file associated with each clip, the file location indicated by a uniform resource locator (URL). The audio formatting process also provides a manner of accessing by the rendering application via a graphical user interface. In this sense, the rendering application is responsive to the manner of accessing for determining the aspects of the score, wherein the aspects of the score are indicative of file locations and file formats. In one embodiment, the audio formatting process stores the enumerated aspects according to a scripting language operable to indicate to the rendering application the manner of accessing each of the aspects of the score. More specifically, the audio formatting process may store the enumerated aspects according to an extensible markup language (XML) format.
Other embodiments disclosed herein include any type of computerized device, workstation, handheld or laptop computer, or the like configured with software and/or circuitry (e.g., a processor) to process any or all of the method operations disclosed herein. In other words, a computerized device such as a computer or a data communications device or any type of processor that is programmed or configured to operate as explained herein is considered an embodiment disclosed herein.
Other embodiments disclosed herein include software programs to perform the steps and operations summarized above and disclosed in detail below. One such embodiment comprises a computer program product that has a computer-readable medium including computer program logic encoded thereon that, when performed in a computerized device having a coupling of a memory and a processor, programs the processor to perform the operations disclosed herein. Such arrangements are typically provided as software, code and/or other data (e.g., data structures) arranged or encoded on a computer readable medium such as an optical medium (e.g., CD-ROM), floppy or hard disk, or other medium such as firmware or microcode in one or more ROM, RAM or PROM chips, or as an Application Specific Integrated Circuit (ASIC). The software or firmware or other such configurations can be installed onto a computerized device to cause the computerized device to perform the techniques explained as embodiments disclosed herein.
It is to be understood that the system disclosed herein may be embodied strictly as a software program, as software and hardware, or as hardware alone. The embodiments disclosed herein may be employed in data communications devices and other computerized devices and software systems for such devices, such as those manufactured by Adobe Systems Incorporated of San Jose, Calif.
The foregoing and other objects, features and advantages of the invention will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
An example scripting language format (e.g., XML code) operable for use with the above configuration is shown in Table I:
Referring to Table I, the example script depicts enumerated aspects 165 for a particular score (e.g., score 160-1) in a collapsed form such that the fields 166 associated with each aspect are not shown. The aspects 165 and fields 166 in Table I are designated as tags enclosed by ‘<’ and ‘>’ symbols. To exemplify a typical script format, the “parts” aspect is expanded such that the fields 166 and subordinate aspects 165 associated with the “parts” aspect are shown in Table I. For example, the parts aspect enumerates an “id”, “name” and “type” field for defining a “part” aspect of the score 160. In the example shown in Table I, the “part” aspect is subordinate to the “parts” aspect such that a single part of the score 160 is a subset of the parts as a whole. Similarly, the “partvariation” aspect is a subset of the “part” aspect and, consequently, the “clip” aspect is a subset of the “partvariation” aspect, and so on. It should be noted that the script is not limited to the aspects 165 and fields 166 shown in Table I.
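For illustration only, a minimal sketch of such a script follows. It reflects the description of Table I above but is not the table itself; the tag and field names “parts”, “part”, “partvariation”, “clipref”, “id”, “name” and “type” are taken from that description, while the surrounding structure and values are assumptions.

    <score>
      <parts>
        <part id="1" name="Intro" type="intro">
          <partvariation id="1">
            <clipref id="1"/>
          </partvariation>
        </part>
      </parts>
    </score>

As the sketch suggests, each subordinate aspect is nested within its parent tag, so a rendering application can traverse the hierarchy from the score down to individual clips.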
The memory system 112 is any type of computer readable medium and in this example is encoded with an audio formatting application 140-1. The audio formatting application 140-1 may be embodied as software code such as data and/or logic instructions (e.g., code stored in the memory or on another computer readable medium such as a removable disk) that supports processing functionality according to different embodiments described herein. During operation of the computer system 110, the processor 113 accesses the memory system 112 via the interconnect 111 in order to launch, run, execute, interpret or otherwise perform the logic instructions of the audio formatting application 140-1. Execution of the audio formatting application 140-1 in this manner produces processing functionality in an audio formatting process 140-2. In other words, the audio formatting process 140-2 represents one or more portions of runtime instances of the audio formatting application 140-1 (or the entire application 140-1) performing or executing within or upon the processor 113 in the computerized device 110 at runtime.
Flow charts of the presently disclosed methods are depicted in the accompanying drawings.
In step 200, the audio formatting process 140-2 identifies a musical score 160-1 of audio information operable to be rendered by a rendering application 170.
In step 201, the audio formatting process 140-2 enumerates aspects 165 of the score 160-1. The aspects 165 are operable to define renderable features of the score 160-1 and further define a duration modifiable by the rendering application 170 to a predetermined duration that preserves the tempo of the score 160-1. In conventional audio editing software, the compression and expansion techniques employed alter the amount of audio information rendered in a given time (e.g., beats per minute), which tends to “speed up” or “slow down” the perceived audio (e.g., music). Conversely, the audio formatting process 140-2 disclosed herein provides aspects 165 operable to define renderable features of the score 160-1 such that the tempo remains constant (vis-à-vis the original musical composition) for the entirety of the modified resulting musical composition. The methods for varying the duration of musical compositions while preserving the tempo are augmented by techniques discussed in copending patent application Ser. No. 11/585,289, entitled “METHODS AND APPARATUS FOR REPRESENTING AUDIO DATA”, filed concurrently, incorporated herein by reference.
In step 202, the audio formatting process 140-2 enumerates at least one field 166 associated with each aspect 165 of the score 160-1, wherein the fields 166 are indicative of rendering the score 160-1. The fields 166 represent properties and/or values particular to an aspect 165 of the score 160-1 and provide context for rendering the musical content defined by the aspects 165. Referring to the example script shown in Table I, the “parts” aspect defines an “id” field, a “name” field and a “type” field. As a result, each of the subordinate “part” aspects enumerated in the script has a corresponding value for each of the “id”, “name” and “type” fields.
In step 203, the audio formatting process 140-2 enumerates a location of an aspect 165 in the score 160-1 such that the location defines an offset time relative to a reference point in the score 160-1. Stated differently, the location represents an anchor or cue point for an aspect 165 (e.g., a clip) to be inserted into the musical composition. For example, in one embodiment the location is an offset, in samples, of a clip from a reference point in the song (e.g., the beginning of the song, the end of a separate clip, etc.).
In step 204, the audio formatting process 140-2 enumerates a modifiable attribute associated with at least one aspect 165 of the score 160-1. The modifiable attributes (e.g., intensity, melody, harmony, etc.) represent qualitative values associated with the audio information and do not typically impact the duration of a musical composition (e.g., a song). The methods for modifying the qualitative attributes of musical compositions are augmented by techniques discussed in copending patent application Ser. No. 11/585,352, entitled “METHODS AND APPARATUS FOR MODIFYING AUDIO INFORMATION”, filed concurrently, incorporated herein by reference. Furthermore, as in one example configuration, the audio formatting process 140-2 identifies a preset value for each modifiable attribute, wherein the preset value is indicative of an initial value for each modifiable attribute. For example, the preset value for intensity may be 1.0 while the preset value for harmony may be 0.5 for a given score 160. As such, the values for the modifiable attributes are typically normalized to a predetermined range to provide a seamless interface and simpler interaction for a user 108 of audio editing software.
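As a hedged illustration, such preset values might be scripted as shown below; the “presets” and “preset” tags and the “name”/“value” attribute names are hypothetical, chosen only to mirror the example values above.

    <presets>
      <preset name="intensity" value="1.0"/>
      <preset name="harmony" value="0.5"/>
    </presets>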
In step 205, the audio formatting process 140-2 enumerates a sequential assignment of an aspect 165 relative to at least one other aspect 165 of the score 160-1. More specifically, the sequential assignment describes the ordering of the parts as well as the ordering of the clips within those parts. Referring to the example script in Table I, the sequential assignment is represented by the value of the “id” field for each part and clip. For example, the part aspect in Table I defines a “part id=1” and, thus, denotes that this particular part is the first in the sequence of one or more parts associated with the “parts” aspect in the hierarchy. Likewise, the clip aspect defines a “clipref id=1” denoting that this is the first clip in a sequence of one or more clips associated with the first part aspect.
In step 206, the audio formatting process 140-2 stores the enumerated aspects 165 according to a predetermined syntax (e.g., score 160-1 as shown in Table I) operable to indicate to the rendering application 170 the manner of accessing each of the aspects 165 of the score 160-1.
In step 207, the audio formatting process 140-2 stores the enumerated aspects 165 according to a scripting language operable to indicate to the rendering application 170 the manner of accessing each of the aspects 165 of the score 160-1. More specifically, in step 208, the audio formatting process 140-2 stores the enumerated aspects 165 according to an extensible markup language (XML) format. The enumerated aspects 165 may also be stored according to other scripting or markup languages generally known in the art that are suitable for describing data.
In step 209, the audio formatting process 140-2 provides a manner of accessing by the rendering application 170 via a graphical user interface 171. The rendering application 170 is responsive to the manner of accessing for determining the aspects 165 of the score 160-1, wherein the aspects 165 of the score 160-1 are indicative of file locations and file formats. For example, while the graphical user interface 171 provides an interface for the user 108 to interact with the rendering application 170, the predetermined syntax (e.g., score 160-1 as described in an XML format) provides an interface, or manner of access, for the rendering application 170 to interact with the raw audio files 152 in the database 151. In essence, the hierarchical structure of the document object model (DOM) framework enables the user 108 (via the rendering application 170 and graphical user interface 171) to modify the temporal and qualitative attributes of a musical composition drawn from a large group of raw audio files.
In step 210, the audio formatting process 140-2 enumerates a song aspect that identifies the available variations of the score 160-1. For example, in one embodiment the song aspect includes fields 166 defining the part ids, or part aspects, of the song variation and a default part variation to be used in rendering the audio information. Additionally, the song aspect fields define the minimum and/or maximum number of times the respective part should be played in the particular song variation.
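A sketch of a song aspect consistent with this description might read as follows; the “song” and “partref” tags and the “defaultvariation”, “min” and “max” attribute names are assumptions made for illustration of the fields just listed.

    <song id="1" name="Full Mix">
      <partref id="1" defaultvariation="1" min="1" max="2"/>
      <partref id="2" defaultvariation="1" min="1" max="4"/>
    </song>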
In step 211, the audio formatting process 140-2 enumerates a part aspect that identifies parts of the score 160-1 such that each of the parts defines a segment of the score operable as a rearrangeable element. Table I depicts an example script configuration that defines a part aspect that may be arranged in any desirable order with respect to one or more part aspects (e.g., by designating a corresponding value for the sequential assignment). In another embodiment, the part aspect includes a subordinate part variation aspect that defines the part variations (e.g., parts differing in length or beats).
In step 212, the audio formatting process 140-2 identifies a name associated with the part. As shown in the example script configuration of Table I, the enumerated part aspect is designated with the name “Intro”. Typically, the name is indicative of the respective ordering of the part in the sequence of the musical composition (e.g., “Intro” denotes that the part is located near the beginning of the song).
In step 213, the audio formatting process 140-2 identifies a type associated with the part, wherein the type is indicative of a sequential ordering of the part. Still referring to Table I, the enumerated part aspect is designated with the type “intro”. Similar to the name attribute, the type is also indicative of the respective ordering of the part in the sequence of the musical composition.
In step 214, the audio formatting process 140-2 identifies a part variation identifier associated with the part such that the part variation identifier describes the content of a part length variation. Generally, a part may have multiple variations containing the same content but with varying durations, as dictated by the number of clips associated with the respective part variation. Accordingly, the part variation shown in Table I includes at least one subordinate clip aspect. In one example embodiment, the clip aspect includes fields 166 defining the position in samples of the clip, the number of bars of the clip, the number of beats of the clip, and/or the metric unit of the clip (e.g., quarter, eighth, etc.).
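By way of a hedged example, a part variation and a subordinate clip might be scripted as shown below; the attribute names “position”, “bars”, “beats” and “metricunit”, as well as the values, are hypothetical renderings of the fields just described.

    <partvariation id="1" name="Intro_Short">
      <clipref id="1" position="0" bars="2" beats="8" metricunit="quarter"/>
    </partvariation>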
In step 220, the audio formatting process 140-2 enumerates an intensity aspect indicative of at least one intensity value for tracks of the score 160-1, wherein each track is operable to render audio content. An intensity aspect defines all tracks assigned to the specific intensity value of the respective intensity aspect. According to one example configuration, the intensity aspect includes fields 166 defining the intensity group identity (e.g., “group id=1” represents a low intensity level) and the name of the intensity group (e.g., “Low”). In addition, the intensity aspect includes subordinate track aspects representing the tracks assigned to the particular intensity group. As such, the track aspects define the identity of the track, the reference identity of the track and the individual gain, or volume, of the track.
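An illustrative sketch of such an intensity aspect, using the “group id=1”/“Low” example above, follows; the “intensity” wrapper and the “refid” and “gain” attribute names are assumptions mapping to the identity, reference identity and gain fields described.

    <intensity>
      <group id="1" name="Low">
        <track id="1" refid="3" gain="0.8"/>
        <track id="2" refid="5" gain="1.0"/>
      </group>
    </intensity>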
In step 221, the audio formatting process 140-2 enumerates a modifier aspect indicative of at least one modifier value for a plurality of tracks operable to render audio content. A modifier aspect defines all tracks assigned to the specific modifier value of the respective modifier aspect. As per one example configuration, the modifier aspect includes fields 166 defining the identity of the modifier group (e.g., an integer value), the name of the modifier group (e.g., harmony, melody, etc.), and the default gain, or volume, of the respective modifier aspect.
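A corresponding hedged sketch of a modifier aspect follows; the “modifier” tag, the “defaultgain” attribute name and the values are illustrative assumptions mapping to the identity, name and default gain fields described, and the subordinate track aspects anticipate step 225 below.

    <modifier id="1" name="Harmony" defaultgain="0.5">
      <track id="4"/>
      <track id="6"/>
    </modifier>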
In step 222, the audio formatting process 140-2 enumerates a melody attribute indicative of a melody value for the plurality of tracks. Similarly, in step 223, the audio formatting process 140-2 enumerates a harmony attribute indicative of a harmony value for the plurality of tracks.
In step 224, the audio formatting process 140-2 identifies at least one track associated with the intensity value of the respective intensity aspect. For example, as in one embodiment, the intensity aspect includes at least one subordinate track aspect associated with the respective intensity value of the intensity aspect. In this manner, the track aspect includes fields defining the track identity associated with the intensity group.
In step 225, the audio formatting process 140-2 identifies a plurality of tracks associated with the modifier value of the respective modifier aspect. As in one example embodiment, the modifier aspect includes at least one subordinate track aspect associated with the respective modifier value of the modifier aspect (e.g., harmony). In this manner, the track aspect includes fields defining the track identity associated with the modifier group.
In step 230, the audio formatting process 140-2 enumerates a track aspect indicative of at least one track operable to render audio content. Typically, as in one example embodiment, the track aspect includes fields defining the track identity (e.g., an integer value) and the name of the track (e.g., Drums). In an alternate embodiment, the track aspect includes at least one subordinate clip aspect as described below.
In step 231, the audio formatting process 140-2 identifies at least one clip associated with the at least one track of the score. According to an example embodiment, the clip aspect includes fields defining the clip identity (e.g., an integer value), the reference file identity (e.g., a file locator such as a Uniform Resource Locator (URL)), the name of the clip (e.g., “Drums_Special_2 Bars”), the offset in samples of the clip and the number of samples of the clip.
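As a hedged illustration of steps 230 and 231, a track aspect and a subordinate clip aspect might be scripted as follows; the “file” attribute name, the example URL and the sample counts are hypothetical.

    <track id="1" name="Drums">
      <clip id="1" name="Drums_Special_2 Bars"
            file="file:///audio/drums_special.wav"
            offset="0" samples="352800"/>
    </track>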
In step 232, the audio formatting process 140-2 identifies a location associated with the at least one clip, wherein the location defines an offset time relative to a reference point in the score 160-1. For example, in one embodiment the clip aspect provides a field defining an offset value that specifies a predetermined offset time relative to a reference point in the score 160-1 (e.g., the beginning of the song).
In step 233, the audio formatting process 140-2 specifies a file (e.g., audio file 152) associated with each clip. According to an example configuration, the file location is indicated by a uniform resource locator (URL).
In one embodiment, the score 160-1 includes a score aspect as shown in Table I. The score aspect may include, but is not limited to, fields 166 that define specific data related to the score such as the name of the song/score, the composer, the creation date, copyright information, genre (e.g., “Rock”), style (e.g., “Modern”, “sad”), the sample rate of the song, and the like. In yet another example embodiment, the score 160-1 includes a beat aspect wherein the beat aspect may include, but is not limited to, fields that define time measurements such as the beat nominator, the beat denominator, the beats per minute, and/or similar time measures related to a musical composition.
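A hedged sketch combining the score and beat aspects described above follows; the attribute names for the score metadata are illustrative, while “nominator”, “denominator” and the beats-per-minute field mirror the time measurements just listed. All values are assumptions.

    <score name="Example Song" composer="J. Doe" created="2006-10-23"
           genre="Rock" style="Modern" samplerate="44100">
      <beat nominator="4" denominator="4" bpm="120"/>
    </score>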
Those skilled in the art should readily appreciate that the programs and methods for structuring audio data as defined herein are deliverable to a processing device in many forms, including but not limited to a) information permanently stored on non-writeable storage media such as ROM devices, b) information alterably stored on writeable storage media such as floppy disks, magnetic tapes, CDs, RAM devices, and other magnetic and optical media, or c) information conveyed to a computer through communication media, for example using baseband signaling or broadband signaling techniques, as in an electronic network such as the Internet or telephone modem lines. The disclosed method may be in the form of an encoded set of processor based instructions for performing the operations and methods discussed above. Such delivery may be in the form of a computer program product having a computer readable medium operable to store computer program logic embodied in computer program code encoded thereon, for example. The operations and methods may be implemented in a software executable object or as a set of instructions embedded in a carrier wave. Alternatively, the operations and methods disclosed herein may be embodied in whole or in part using hardware components, such as Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), state machines, controllers or other hardware components or devices, or a combination of hardware, software, and firmware components.
While the system and method for representing and processing audio information has been particularly shown and described with references to embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.