The instant invention relates generally to processing music works and, more particularly, methods of increasing the energy level of songs and automated adaptation of songs for video production.
Creation of a musical work has been a goal and dream of many people for as long as music has been around. However, a lack of knowledge of the intricacies of music styles has prevented many from generating and writing music. As such, this endeavor has, for a very long time, been a privilege of people having the necessary knowledge and education.
With the advent of the personal computer and the widespread adoption of these devices in the home consumer market, software products have emerged that allow a user to create pleasing and useful musical compositions without having to know music theory or needing to understand music constructs such as measures, bars, harmonies, time signatures, key signatures, etc. These software products provide graphical user interfaces with a visual approach to song and music content that allow even novice users to focus on the creative process without being hampered by having to learn the intricacies of music generation.
In addition to increasing the accessibility of music generation, the content that is available and usable in the process of generating music has also been adapted to correspond to the directive of supplying an easy-to-use music generation approach. These sorts of programs typically provide a number of individual sound clips of compatible length, e.g., sound loops or just “loops”, which can be selected and inserted into the multiple tracks of an on-screen graphical user interface as part of the process of music creation. With these sorts of software products, the task of music or song generation has come within reach of an expanded audience of users, who happily take advantage of the more simplified approach to music or song generation as compared with note-by-note composition. These software products have evolved over the years, gotten more sophisticated and more specialized and some have even been implemented on mobile devices.
The general approach to music or song generation provided by these software products has remained virtually unchanged, even though the processing power of the computing devices has increased and the types of devices that run this software have expanded on par with the changes in device distribution. That is, the conventional approach to music creation, which has remained largely unchanged, involves requiring the user to select individual pre-generated audio loops that represent different instruments (e.g., drums, bass, guitar, synthesizer, vocals, etc.), and arrange these loops in digital tracks to generate individual song parts, typically with a length of 4 or 8 measures, the goal being the generation of a full audio clip or song. Using this approach, most users are able to generate one or two of these song parts with the help of an informative graphical user interface of a mobile or desktop-based software product according to their own taste and are therefore potentially able to generate individual verses and maybe the refrain of their own song.
The songs generated by the user manually or with the help of an automated system feature a static generated music item containing a fixed selection of audio loops stored in a specified song structure. Therefore, these songs are also fixed in terms of their content and also in terms of their features. That is, if the intent is to use the generated music work as a soundtrack for a video production, these songs feature only one fixed energy level. That becomes an issue when it is desired to produce a musical work that has a musical impact on the listener that is comparable to the action in a video. In video production, producers usually want to have at least two individual energy versions of a music item that is to be utilized for the illustration of different video scenarios and differing content in video material.
Thus, what is needed is a system and method for increasing the energy level of songs and music items in a loop-based music generation system.
Heretofore, as is well known in the media editing industry, there has been a need for an invention to address and solve the above-described problems. Accordingly, it should now be recognized, as was recognized by the present inventors, that there exists, and has existed for some time, a very real need for a system and method that would address and solve the above-described problems.
Before proceeding to a description of the present invention, however, it should be noted and remembered that the description of the invention which follows, together with the accompanying drawings, should not be construed as limiting the invention to the examples (or embodiments) shown and described. This is so because those skilled in the art to which the invention pertains will be able to devise other forms of this invention within the ambit of the appended claims.
According to a first embodiment, there is provided herein a method of increasing the energy level of a user-selected song in a loop-based music generation system. In one embodiment the algorithm is integrated into a music generation/song construction process and comprises three different approaches, with one being a hybrid version of the remaining approaches. The first approach is directed to exchanging loops that are a part of the song structure. The second approach is directed to increasing the song energy by adding loops to the song structure. The hybrid/third approach features a dynamic combination of the previously mentioned approaches, wherein the instant invention preferably automatically selects a fitting approach for a particular user-selected song.
It should be clear that an approach such as this would be a tremendous aid to the user and would additionally provide assistance in the development and the creation of professional songs with user-specified differing energy levels. The often-frustrating trial and error process of finding and generating musical material that is fitting in dynamic and impact to a particular video and its sequences is replaced with an automatic process that provides the user with at least two versions of a selected music piece. Therefore, this approach delivers a functionality to the user which enables the user to swiftly create and review different versions of a selected music piece having a differing dynamic impact without the need to manually process each piece.
The foregoing has outlined in broad terms some of the more important features of the invention disclosed herein so that the detailed description that follows may be more clearly understood, and so that the contribution of the instant inventors to the art may be better appreciated. The instant invention is not to be limited in its application to the details of the construction and to the arrangements of the components set forth in the following description or illustrated in the drawings. Rather, the invention is capable of other embodiments and of being practiced and carried out in various other ways not specifically enumerated herein. Finally, it should be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting, unless the specification specifically so limits the invention.
These and further aspects of the invention are described in detail in the following examples and accompanying drawings.
The invention will be described in connection with its preferred embodiments. However, to the extent that the following detailed description is specific to a particular embodiment or a particular use of the invention, this is intended to be illustrative only and is not construed as limiting the invention's scope. On the contrary, it is intended to cover all alternatives, modifications, and equivalents included within the invention's spirit and scope, as defined by the appended claims.
While this invention is susceptible of embodiment in many different forms, there is shown in the drawings, and will be described hereinafter in detail, some specific embodiments of the instant invention. It should be understood, however, that the present disclosure is to be considered an exemplification of the principles of the invention and is not intended to limit the invention to the specific embodiments or algorithms so described.
As is generally indicated in
Turning next to
In
In one preferred arrangement the associated audio loops are played and replayed if necessary, during the whole runtime of the part to which their parent instrument belongs. However, it is also possible that the user may select and de-select (mute) or switch/replace individual audio loops during the runtime of a particular part.
The instant invention provides and utilizes an evolving and growing database of audio loops, wherein the audio loops are categorized according to one or more particular styles, for example EDM, 50s, Drum'n Bass, Jazz, Classical, Rock, Metal, House, etc. Each style features a plurality of different instruments in the database associated with it and each instrument has a number of associated audio loops, i.e., audio loops in which the instrument sounds when the loop is played.
Also, in some cases, the loop might not contain traditional audio recordings of an acoustic instrument but might contain computer generated sounds instead that resemble (or not) traditional instruments, e.g., synth sounds. Either way, when it is said herein that an instrument is present in a recorded loop that term should be broadly construed to cover instances where there is a digital audio recording of that instrument as well as cases where the audio material in the loop is computer or otherwise generated. This database will preferably be updated on a regular basis with new styles and the associated instruments and loops being added, existing styles with the associated instruments and loops being updated or deleted, etc. Preferably these updates will be delivered over the Internet for free or in exchange for a particular payment option.
Turning next to
Turning next to
The third method, Method 3 520, is a hybrid version that is utilized if the requirements of Method 2 510 are not met. Therefore, Method 3 520 comprises a process where steps similar to Method 2 510 and Method 1 510 are implemented sequentially.
Turning next to
Coming next to
In a first preferred step the instant invention will select an individual audio loop 700 and will initiate an analysis 710 based on the openSMILE toolkit. That is, this analysis will implement the loudness directed analysis functionalities and features from openSMILE (open-source Speech and Music Interpretation by Large-space extraction). openSMILE is an open source toolkit for audio feature extraction and classification of speech and music signals and it is widely applied in automatic emotion recognition for affective computing. The features and functionality of this toolkit (e.g., https://en.wikipedia.org/wiki/OpenSMILE) are well known to those of ordinary skill in the art.
In a next preferred step, the instant invention will calculate a mean value from the gathered loudness features 720 obtained from the openSMILE analysis and, as a next preferred step, the instant invention will normalize the calculated mean to a value between 0 and 1 (or from 0 to 100, etc.) 730 so as to generate a quantifiable value that represents each audio loop. As a last preferred step, the instant invention will use this normalized value to generate the loudness tag for each audio loop 740.
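By way of illustration only, the tagging steps 720 through 740 might be sketched as follows. This is a minimal sketch and not the implementation of the instant invention: it assumes that the frame-level loudness values have already been extracted (e.g., by openSMILE), and the function names `mean_loudness` and `loudness_tags`, the sample data, and the use of min-max scaling across the database for the normalization step are all assumptions made for this example.

```python
from statistics import mean


def mean_loudness(frame_loudness):
    """Step 720: mean of the frame-level loudness features of one loop."""
    return mean(frame_loudness)


def loudness_tags(loop_features):
    """Steps 730-740: normalize each loop's mean loudness to [0, 1].

    loop_features maps a loop id to its frame-level loudness values.
    Min-max scaling over all loops is an assumed normalization scheme.
    """
    means = {loop_id: mean_loudness(v) for loop_id, v in loop_features.items()}
    lo, hi = min(means.values()), max(means.values())
    if hi == lo:  # all loops equally loud: avoid division by zero
        return {loop_id: 0.0 for loop_id in means}
    return {loop_id: (m - lo) / (hi - lo) for loop_id, m in means.items()}


# Hypothetical frame-level loudness values for three loops.
features = {
    "drums_a": [0.2, 0.4, 0.6],
    "bass_a": [1.0, 1.0, 1.0],
    "synth_a": [1.5, 1.7, 1.6],
}
tags = loudness_tags(features)
```

With this scaling the quietest loop in the database receives the tag 0 and the loudest receives the tag 1; a fixed reference range could be substituted if tags must remain stable when the database is updated.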
Coming next to
In a next preferred step, the instant invention will select an initial loop 815, wherein the instant invention will then determine whether the selected loop has associated family members 820 or if the loop has no associated family members 855. If the selected loop has associated family members 820 then in a next preferred step the instant invention will identify and determine an order of the family members 825 by their loudness tag values. Note that in some embodiments this step might sort the tags to create an ordered list. In other embodiments, the order might be determined without an actual sort taking place. Thus, whenever the term “sort” is used herein, that term should be broadly construed to include cases where an actual ordered list is prepared (i.e., the items are “sorted”) as well as instances where an order is determined without actually sorting the items.
In a next preferred step, the instant invention will calculate a value representative of the present overall energy level of the song part 830, which is preferably determined by summing the loudness values of each audio loop in the song part 830 and dividing the sum by the number of the audio loops. The calculated value can either be displayed to the user or it could be hidden.
In a next preferred step, the instant invention will automatically select a desired energy value of the song part or will give the user the option of manually selecting the desired energy value 835. The selection of the desired energy value might be communicated using, by way of example only, a numerical selection (e.g., 1, 2, or 3, 55 out of 100, 0.3 out of 1, etc.) or clicking a program button labeled “Higher”, or it might even be possible to present the user with a selection of different levels of energy associated with the different loudness tags of the family members that are being considered for inclusion in the song.
As a next preferred step, the instant invention will select a replacement loop to achieve the desired energy value or level of the song part 840 from the sorted family members. In the next preferred step, the initial loop will be exchanged with the replacement loop 845.
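For purposes of illustration, steps 825 through 840 might be sketched as shown below. The sketch assumes that the loudness tags are already available, and the names `part_energy` and `pick_replacement`, the sample values, and the rule of choosing the family member that brings the part energy closest to the desired value are assumptions of this example rather than requirements of the instant invention.

```python
def part_energy(loudness_by_loop):
    """Step 830: mean loudness tag of the loops in a song part."""
    return sum(loudness_by_loop.values()) / len(loudness_by_loop)


def pick_replacement(part, initial_loop, family, desired_energy):
    """Steps 825-840: choose the family member that brings the part
    energy closest to the desired value.

    part   : dict loop id -> loudness tag for the loops now in the part
    family : dict loop id -> loudness tag for the family of initial_loop
    """
    others = {k: v for k, v in part.items() if k != initial_loop}
    ordered = sorted(family.items(), key=lambda item: item[1])  # step 825

    def energy_with(candidate):
        loop_id, tag = candidate
        return part_energy({**others, loop_id: tag})

    best_id, _ = min(ordered, key=lambda c: abs(energy_with(c) - desired_energy))
    return best_id


# Hypothetical song part and loop family.
part = {"drums_1": 0.4, "bass_1": 0.5, "synth_1": 0.3}
family = {"synth_1": 0.3, "synth_2": 0.6, "synth_3": 0.9}
current = part_energy(part)
replacement = pick_replacement(part, "synth_1", family, desired_energy=0.6)
```

In this example the part energy is about 0.4 before the exchange, and the family member `synth_3` is chosen because it raises the part energy to about 0.6.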
In the event that the initial loop has no associated family members 855, the current embodiment continues by determining the instrument tag of the selected initial loop 860. As a next preferred step, the instant invention will determine and select from the database some number, e.g., at least the five nearest neighbor loops, of the selected initial loop 865. The determined nearest neighbor loops will then, in a next preferred step, also be sorted by their loudness tags 870. As a next preferred step, the instant invention will calculate the present energy value of the song part 875, which is preferably determined by summing the loudness values of each audio loop and dividing the sum by the number of the audio loops. The calculated value can either be displayed to the user or be hidden.
In a next preferred step for this initial loop, the instant invention will select the desired energy value of the output/modified song part or the user will be given the option to determine the desired energy value 880. The selection of the desired energy value might be by specifying a numerical value, clicking a “higher” button, or it might even be possible to present the user with a selection of different levels of energy associated with the different loudness tags of the family members. As a next preferred step, the instant invention will select a replacement loop from the identified family members to achieve, at least approximately, the desired energy value or level of the song part 885. In the next preferred step, the initial loop will be exchanged with the replacement loop 890.
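The nearest neighbor selection and ordering of steps 860 through 870 might, by way of example only, be sketched as follows. The record layout (`id`, `instrument`, `features`, `loudness`), the function name `nearest_neighbors`, the restriction to loops sharing the instrument tag, and the use of Euclidean distance over a hypothetical feature vector are all assumptions of this sketch; the disclosure does not prescribe a particular similarity measure.

```python
import math


def nearest_neighbors(initial, database, k=5):
    """Steps 860-870: the k nearest loops with the same instrument tag,
    returned in ascending order of loudness tag.

    Each loop is a dict with 'id', 'instrument', 'features', 'loudness'.
    Euclidean distance over 'features' is an assumed similarity measure.
    """
    candidates = [
        loop for loop in database
        if loop["instrument"] == initial["instrument"]
        and loop["id"] != initial["id"]
    ]
    candidates.sort(
        key=lambda loop: math.dist(loop["features"], initial["features"])
    )
    return sorted(candidates[:k], key=lambda loop: loop["loudness"])


# Hypothetical database entries.
database = [
    {"id": "g1", "instrument": "GUITAR", "features": (0.0, 0.0), "loudness": 0.3},
    {"id": "g2", "instrument": "GUITAR", "features": (0.1, 0.0), "loudness": 0.8},
    {"id": "g3", "instrument": "GUITAR", "features": (0.0, 0.2), "loudness": 0.5},
    {"id": "g4", "instrument": "GUITAR", "features": (5.0, 5.0), "loudness": 0.9},
    {"id": "b1", "instrument": "BASS", "features": (0.0, 0.1), "loudness": 0.7},
]
neighbors = nearest_neighbors(database[0], database, k=2)
```

Here the two guitar loops closest to `g1` are `g2` and `g3`; after ordering by loudness tag the list reads `g3`, `g2`, and a replacement can then be chosen from that ordered list.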
Turning next to
As a next preferred step each loop of the song part will be selected sequentially 905 and as a first preferred step the instrument type of the loop will be determined 910. If the instrument type is DRUMS 915 the instant invention will select all drum family loops 920 in the database and in the next preferred step the selected family loops will be sorted by their loudness tag 925. The instant invention will then analyze the sorted family loops and automatically classify each loop into the appropriate loop category 930. Preferably the loops with the lowest and highest energy will be in the categories “low” and “high”, respectively. If the selected loop has the highest or lowest energy in the family then it will preferably be assigned to the highest or lowest category accordingly.
For all remaining audio loops, i.e., loops not having the drum instrument tag 935, for each loop individually, the instant invention will select, for example, the five nearest neighbor loops 940. Those of ordinary skill in the art will recognize that “nearest neighbor” is an algorithm that associates or groups entities based on some measure of their similarity. Here, one approach that has proven satisfactory is to calculate distances between loops by comparing the musical properties of each loop, e.g., grouping them based on their loudness tags 945. The sorted loops will then be classified into the selected categories in the same way the audio loops with the drums instrument type were classified previously. As a result, the instant invention can provide the user three dynamic selectable versions of the song, with each of the three versions having a different energy level 955.
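One possible way of classifying a sorted group of loops into three categories is sketched below. The function name `classify_by_loudness`, the sample tags, and the rule of dividing the sorted order into three bands by relative rank are assumptions made for illustration; the only property taken from the description above is that the loops with the lowest and highest energy fall into the categories “low” and “high”, respectively.

```python
def classify_by_loudness(loops):
    """Sort loops by loudness tag and assign 'low', 'medium' or 'high'.

    loops maps loop id -> loudness tag. Banding by relative rank is an
    assumption of this sketch.
    """
    ordered = sorted(loops, key=loops.get)
    n = len(ordered)
    categories = {}
    for rank, loop_id in enumerate(ordered):
        # Relative position in the sorted order: 0.0 = quietest, 1.0 = loudest.
        position = rank / (n - 1) if n > 1 else 0.5
        if position < 1 / 3:
            categories[loop_id] = "low"
        elif position > 2 / 3:
            categories[loop_id] = "high"
        else:
            categories[loop_id] = "medium"
    return categories


# Hypothetical loudness tags of a drum family.
drums = {"d1": 0.2, "d2": 0.9, "d3": 0.5, "d4": 0.35, "d5": 0.7, "d6": 0.1}
categories = classify_by_loudness(drums)
```

Selecting one loop per category for each instrument would then yield the three selectable versions of the song part.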
Turning next to
As a first preferred step the instant invention will select all of the loudness tags of all of the loops in the database 1000. In the next preferred step, a kMeans clustering algorithm will be applied 1005 to the collection of calculated loudness tags to identify three different categories 1010. These categories preferably will be associated with low, medium or high loudness. For each loop 1015 that is a part of the song part the instant invention will then decide into which of the kMeans categories the loop belongs 1020. From this association the instant invention will select two nearest neighbors from the two remaining categories 1025. As a result, the instant invention will be able to provide the user with three dynamic selectable versions of the song, where the three versions feature three different energy levels 1030.
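The clustering of steps 1005 through 1020 might be sketched as follows using a plain k-means (Lloyd's algorithm) on the scalar loudness tags. The function names `kmeans_1d` and `category_of`, the evenly spaced initial centroids, and the sample tags are assumptions of this sketch; any standard k-means implementation could be substituted.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Plain k-means on scalar values; returns centroids in ascending order.

    k must be at least 2. Centroids start evenly spread over the value
    range so that the result is deterministic (an assumed initialization).
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        updated = [
            sum(c) / len(c) if c else centroids[i]  # empty cluster stays put
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # assignments no longer change
            break
        centroids = updated
    return sorted(centroids)


def category_of(loudness, centroids, names=("low", "medium", "high")):
    """Step 1020: name of the cluster whose centroid is closest."""
    index = min(range(len(centroids)), key=lambda i: abs(loudness - centroids[i]))
    return names[index]


# Hypothetical loudness tags of all loops in the database.
all_tags = [0.05, 0.10, 0.15, 0.45, 0.50, 0.55, 0.85, 0.90, 0.95]
centroids = kmeans_1d(all_tags)
```

Once each loop of the song part has been assigned to a category in this manner, one neighbor may be drawn from each of the two remaining categories to build the alternative energy versions.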
Turning next to
For example, suppose for purposes of illustration that the first song part is missing a Bass and Synth instrument loop, so a random Bass and Synth loop will be added to that song part. Note that for purposes of the instant disclosure, the term random in this context should be construed to mean that the instant invention will determine the, say, 30% most energetic loops from this instrument as stored in the audio loop database and then select one loop randomly from the determined 30%. To determine the energy, the instant invention utilizes the loudness tag stored with each audio loop. This process is then repeated for each song part that makes up the song. That is, each loop that is added is selected using a nearest neighbor algorithm. To continue with the current example, suppose the second song part is missing Synth and FX. Into the Synth instrument section the instant invention will then add a loop that has been selected by the nearest neighbor algorithm with the added audio loop from the Synth section from the first song part as the starter loop for the nearest neighbor selection. For the FX instrument section that is newly added, one of the, say, 30% most energetic loops from this instrument is randomly added.
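The meaning given above to the term random, i.e., a random choice among the, say, 30% most energetic loops of an instrument, might be sketched as follows. The function name `pick_energetic_loop`, the `percent` and `rng` parameters, and the sample data are assumptions introduced for this illustration.

```python
import random


def pick_energetic_loop(loops, percent=30, rng=random):
    """Pick one loop at random from the most energetic share of loops.

    loops   : dict loop id -> loudness tag for one instrument
    percent : size of the share (30 means the top 30%)
    rng     : source of randomness, injectable for reproducible runs
    """
    ordered = sorted(loops, key=loops.get, reverse=True)
    # Integer arithmetic avoids rounding surprises; at least one candidate.
    top_count = max(1, len(ordered) * percent // 100)
    return rng.choice(ordered[:top_count])


# Hypothetical Bass loops with loudness tags 0.1 through 1.0.
bass_loops = {"bass_%d" % i: i / 10 for i in range(1, 11)}
choice = pick_energetic_loop(bass_loops, rng=random.Random(0))
```

With ten loops in the example, the candidates are the three loops with the highest loudness tags, and one of them is returned.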
Coming next to
If fewer than six instruments are in this particular song part 1225, then the instant invention will add audio loops of at least two unused instruments 1235 to this part. In this embodiment, the instant invention will proceed according to this ordered list: Drums, Bass, Synth, Guitar, Brass Woodwind, Percussion, FX, Samples. As a next preferred step, the instant invention will begin to add loops to the added instrument sections 1240. The loop selection process will undergo a particular screening process 1245 wherein in a first preferred step the audio loop database will be screened to determine if there are loops with a family association stored for the added instruments 1250. If that is the case the instant invention will select the most energetic loop, i.e., the loop with the highest loudness tag 1255 for insertion.
If there are fewer than three family members 1260 stored in the database then the instant invention will also use nearest neighbor loops in addition to the family members 1265 for loop selection and from that list the instant invention will select the most energetic loop, i.e. the loop with the highest loudness tag 1270 for insertion. If no family members are stored in the database, then the instant invention will use the nearest neighbor algorithm 1280 to select the most energetic replacement loop, i.e., the loop with the highest loudness tag 1285 from the complete audio loop database, for insertion.
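The screening process 1245 through 1285 might be sketched as follows. The function name `select_loop_for_instrument` and its parameters are assumptions of this example, and the nearest neighbor search itself is taken as given (its result is passed in as `neighbors`).

```python
def select_loop_for_instrument(family, neighbors, min_family=3):
    """Steps 1250-1285: choose the loop with the highest loudness tag.

    family    : dict loop id -> loudness tag of stored family members
    neighbors : dict loop id -> loudness tag from a nearest neighbor search
    """
    if len(family) >= min_family:
        pool = dict(family)              # enough family members: use them alone
    elif family:
        pool = {**neighbors, **family}   # fewer than three: add the neighbors
    else:
        pool = dict(neighbors)           # no family members: neighbors only
    if not pool:
        return None                      # nothing available for this instrument
    return max(pool, key=pool.get)


# Hypothetical candidates for an added Guitar section.
family_full = {"f1": 0.4, "f2": 0.6, "f3": 0.7}
family_small = {"f1": 0.4}
neighbors = {"n1": 0.9, "n2": 0.2}
```

For the sample data, `select_loop_for_instrument(family_full, neighbors)` returns `f3`, whereas with `family_small` or with no family at all the louder neighbor `n1` is returned.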
It should be noted that this screening process preferably differs when selecting new loops for later parts of the selected song. If an instrument and an associated loop have been added to a previously processed part, then the selection of a new loop for that instrument depends on the contents of the previous song part. For example: suppose that a first song part is missing Guitar and Brass Woodwind, so the algorithm adds a random and most energetic Guitar loop and Brass Woodwind loop to the song part and then ends processing of this song part and proceeds to the next one. For purposes of illustration only, assume that the following song part is missing Brass Woodwind and FX. So the algorithm adds Brass Woodwind and FX instruments, but for the Brass Woodwind instrument the screening process is carried out with the loop previously added to Brass Woodwind in the previous song part as the starter loop. By way of explanation, in this case a Brass Woodwind loop has been added to the current song part (Song Part 1). Rather than add the same loop to the next song part (Song Part 2), the loop added to Song Part 1 will serve as a starter for use in determining which loop should be added to Song Part 2. This way the instant approach keeps the same loop from being added to multiple adjacent song parts, but by using the previous loop as a starter for selection of the next loop some amount of musical continuity may be obtained. For the newly added FX instrument the random most energetic loop is added.
After the loops for the added instruments have been determined and added to the instruments 1240, the instant invention will determine the total number of all loops added to the currently processed song part 1290. If that determination indicates that fewer than two audio loops 1292 have been added to this part, the algorithm will, in a next preferred step, replace all loops in this part with the most energetic loops from the family/nearest neighbor combination 1295. If two or more than two audio loops have been added, then the instant invention will proceed to the next song part.
According to one embodiment, there is provided a processing flow as follows when searching for higher energy loops. Typically, two steps are performed:
SP_energy = Sum_loop_energies / num_loops = 0.7.
A brief description of how the aforementioned embodiments can be utilized in different ways is presented below.
Remarks: The previous approach demonstrated that less effective results would be expected if we only use the MP (i.e., mixpack) used in the demo song (so there is only one mixpack), as it also happened for the nearest neighbors and variation approach. It's better to evaluate it in a larger collection of mixpacks or even genres. Note that in some cases the selected loops will be duplicates. One way to solve this issue, assuming it needs to be solved, is to select the next loop in the queue.
Energy Levels 2—increase song energy with additional loops. According to another embodiment:
Energy Levels 3 “Hybrid”—combination of method 1 and 2. Here is one general approach. If possible, the energy level should be increased by adding loops of unused instruments in each SP, e.g., see Energy Level 2. But in some cases, this will not be possible, e.g., if no unused instruments are in the used mixpack(s).
If this is the case, algorithm Energy Levels 1 should be used, replacing (some) loops with higher energy versions.
To maintain consistency, only those loops should be replaced by more energetic ones where the difference in loudness is “significant” in some sense, e.g., in an auditory or statistical sense. Implemented in one embodiment as follows:
Step 1: Check the current instruments in a song part. If more than 6 instruments are used in the part no additional loops should be added. Otherwise, add loops of two unused instruments in this part according to this ordered list: DRUMS, BASS, SYNTH, GUITAR, BRASS WOODWIND, PERCUSSION, FX, SAMPLES. If a loop was added for the same instrument in the previous part, the algorithm will add a neighbor/family member of that previously added loop.
Step 2: Check added loops to a part. If fewer than two loops are added to this part, the algorithm will revert to Step 1 and replace all loops in this part with the most energetic loop of their neighbors/families.
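The decision made in Steps 1 and 2 might be sketched as follows. The function name `plan_part`, its return convention, and the `available` parameter (the instruments offered by the mixpack(s) in use) are assumptions of this example, and the handling of a part to which only a single loop could be added is left open here, as it is in the steps above.

```python
INSTRUMENT_ORDER = [
    "DRUMS", "BASS", "SYNTH", "GUITAR",
    "BRASS WOODWIND", "PERCUSSION", "FX", "SAMPLES",
]


def plan_part(used, available, max_used=6, to_add=2):
    """Decide between adding loops and replacing loops for one song part.

    used      : instruments already present in the song part
    available : instruments offered by the mixpack(s) in use
    Returns ('add', instruments) when two unused instruments can be added,
    otherwise ('replace', instruments_that_could_be_added).
    """
    used = set(used)
    chosen = []
    if len(used) <= max_used:  # Step 1: more than 6 in use means add nothing
        unused = [
            name for name in INSTRUMENT_ORDER
            if name not in used and name in available
        ]
        chosen = unused[:to_add]
    if len(chosen) < to_add:   # Step 2: fewer than two loops could be added
        return ("replace", chosen)
    return ("add", chosen)


plan = plan_part(["DRUMS", "BASS"], set(INSTRUMENT_ORDER))
```

In the example, a part that uses only DRUMS and BASS receives SYNTH and GUITAR, the first two unused entries of the ordered list; a part for which fewer than two instruments can be added is flagged for loop replacement instead.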
Variation of the “Hybrid” version:
In this embodiment, the add/remove loops concept should still be considered for use because it allows switching between energy versions at any time without discontinuity plus there are some additional adjustments in this approach:
Of course, many modifications and extensions could be made to the instant invention by those of ordinary skill in the art. For example, in one preferred embodiment the algorithm could potentially also be utilized to remove individual audio loops or even instruments, wherein the selection parameters for the audio loops would be reversed (i.e., low energy would be selected instead of high energy).
It should be noted and understood that the invention is described herein with a certain degree of particularity. However, the invention is not limited to the embodiment(s) set forth herein for purposes of exemplification, but is limited only by the scope of the attached claims.
It is to be understood that the terms “including”, “comprising”, “consisting” and grammatical variants thereof do not preclude the addition of one or more components, features, steps, or integers or groups thereof and that the terms are to be construed as specifying components, features, steps or integers.
The singular shall include the plural and vice versa unless the context in which the term appears indicates otherwise.
If the specification or claims refer to “an additional” element, that does not preclude there being more than one of the additional elements.
It is to be understood that where the claims or specification refer to “a” or “an” element, such reference is not to be construed that there is only one of that element.
It is to be understood that where the specification states that a component, feature, structure, or characteristic “may”, “might”, “can” or “could” be included, that particular component, feature, structure, or characteristic is not required to be included.
Where applicable, although state diagrams, flow diagrams or both may be used to describe embodiments, the invention is not limited to those diagrams or to the corresponding descriptions. For example, flow need not move through each illustrated box or state, or in exactly the same order as illustrated and described.
Methods of the present invention may be implemented by performing or completing manually, automatically, or a combination thereof, selected steps or tasks.
The term “method” may refer to manners, means, techniques and procedures for accomplishing a given task including, but not limited to, those manners, means, techniques and procedures either known to, or readily developed from known manners, means, techniques and procedures by practitioners of the art to which the invention belongs.
For purposes of the instant disclosure, the term “at least” followed by a number is used herein to denote the start of a range beginning with that number (which may be a range having an upper limit or no upper limit, depending on the variable being defined). For example, “at least 1” means 1 or more than 1. The term “at most” followed by a number is used herein to denote the end of a range ending with that number (which may be a range having 1 or 0 as its lower limit, or a range having no lower limit, depending upon the variable being defined). For example, “at most 4” means 4 or less than 4, and “at most 40%” means 40% or less than 40%. Terms of approximation (e.g., “about”, “substantially”, “approximately”, etc.) should be interpreted according to their ordinary and customary meanings as used in the associated art unless indicated otherwise. Absent a specific definition and absent ordinary and customary usage in the associated art, such terms should be interpreted to be ±10% of the base value.
When, in this document, a range is given as “(a first number) to (a second number)” or “(a first number)-(a second number)”, this means a range whose lower limit is the first number and whose upper limit is the second number. For example, 25 to 100 should be interpreted to mean a range whose lower limit is 25 and whose upper limit is 100. Additionally, it should be noted that where a range is given, every possible subrange or interval within that range is also specifically intended unless the context indicates to the contrary. For example, if the specification indicates a range of 25 to 100 such range is also intended to include subranges such as 26-100, 27-100, etc., 25-99, 25-98, etc., as well as any other possible combination of lower and upper values within the stated range, e.g., 33-47, 60-97, 41-45, 28-96, etc. Note that integer range values have been used in this paragraph for purposes of illustration only and decimal and fractional values (e.g., 46.7-91.3) should also be understood to be intended as possible subrange endpoints unless specifically excluded.
It should be noted that where reference is made herein to a method comprising two or more defined steps, the defined steps can be carried out in any order or simultaneously (except where context excludes that possibility), and the method can also include one or more other steps which are carried out before any of the defined steps, between two of the defined steps, or after all of the defined steps (except where context excludes that possibility).
Further, it should be noted that terms of approximation (e.g., “about”, “substantially”, “approximately”, etc.) are to be interpreted according to their ordinary and customary meanings as used in the associated art unless indicated otherwise herein. Absent a specific definition within this disclosure, and absent ordinary and customary usage in the associated art, such terms should be interpreted to be plus or minus 10% of the base value.
Still further, additional aspects of the instant invention may be found in one or more appendices attached hereto and/or filed herewith, the disclosures of which are incorporated herein by reference as if fully set out at this point.
Thus, the present invention is well adapted to carry out the objects and attain the ends and advantages mentioned above as well as those inherent therein. While the inventive device has been described and illustrated herein by reference to certain preferred embodiments in relation to the drawings attached thereto, various changes and further modifications, apart from those shown or suggested herein, may be made therein by those of ordinary skill in the art, without departing from the spirit of the inventive concept the scope of which is to be determined by the following claims.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/287,646 filed on Dec. 9, 2021 and incorporates said provisional application by reference into this document as if fully set out at this point.
Number | Date | Country
---|---|---
63287646 | Dec 2021 | US