Music is a unique medium with a powerful ability to shape a listener's emotions. Movies, plays, and video games would all be profoundly different experiences without their musical accompaniments that enhance, alter, and transform a subject's perception of the media. Music can magnify the emotional content of a work, building suspense or amplifying triumph or defeat; it can act as an indicator, foreshadowing an event or providing information about a character or their intentions; it can even completely transform the nature of an experience, converting an otherwise dark and menacing scene into a comedic and lighthearted one, or vice versa.
These synergies between music and other media rely on the intimate connections between music, non-musical media, and the user's or audience's emotions. Creating music in this interdependent setting is a delicate task that requires in-depth knowledge of music's emotional content. The present invention uses a jazz theory approach to music composition, driven by a central “tension/release” value, to create novel musical sequences that are both musically effective and emotionally appropriate. The fluidity and adaptability of jazz theory also allows the algorithm to react to sudden or drastic changes in input. By parameterizing emotion as a single tension/release value, the present invention aims to create musical sequences that can be used in any application where a desired emotion can be identified. The algorithm can also generate music in real time, using an input that is not predetermined.
Many models attempting to produce emotionally affecting compositions rely not on music theory so much as on machine learning and neural network-based approaches. U.S. Pat. No. 6,297,439 (Browne) details a recursive artificial neural network system and method for automatically generating music of a given style or composer by inputting and analyzing an initial sequence of input notes. U.S. Pat. No. 10,657,934 (Kolen et al.) describes a different method for creating musical scores via a user interface where the user first selects a genre and artists or songs, which then drives the selection of musical constraints based on an analysis of those artists or songs. These constraints are then used to give the user feedback wherever their score deviates from the constraints. Alternatively, U.S. Pat. No. 10,679,596 (Balassanian) determines a set of composition rules by analyzing a plurality of tracks in a set of music content, where the rules include the number of tracks to overlay, the types of instruments to combine, and how to select the next key in a progression; this information then determines which tracks to overlay and play at the same time. Lastly, U.S. Patent Application Publication US 2018/0322854 trains a melody prediction model for lyrics using a corpus of songs. The method then creates new melodies from new lyrics input by a user, using probability distributions of melody features from the prediction model. Researchers have also developed composition algorithms that use neural networks and other machine learning approaches (Eck and Schmidhuber 2002; Liu and Ramakrishnan 2014). However, these kinds of models can suffer from various failure modes, such as notes or short motifs that repeat indefinitely. These issues can sometimes be mitigated by using rules from music theory to contribute structure and impose constraints, balancing the probabilities learned from training data against accepted music theory rules (Jaques et al. 2017). Probabilistic models, including some neural networks, are effective to an extent and in certain contexts, but the shortcomings of the core algorithms may lead one to look elsewhere for a more elegant solution.
Another approach is to combine existing segments, units, motifs, tracks, or series of notes that have been shown to be musically effective into a larger musical piece or composition. U.S. Pat. No. 7,696,426 (Cope) describes a method for automatically composing a new musical work from a plurality of existing musical segments using a programmed linear retrograde recombinant musical composition algorithm. It analyzes segments using pitch, duration, and on-time metrics, and combines the desired/selected segments by matching their last notes. Alternatively, U.S. Pat. No. 7,842,874 (Jehan) creates new music by listening to a plurality of musical pieces and performing concatenative synthesis based on that listening and learning. It uses a spectrogram as its main analysis method and a combination of beat matching, time scaling, and cross-synthesis for the concatenative synthesis. U.S. Pat. No. 8,581,085 (Gannon) describes generating a musical composition from one or more portions of one or more performances of one or more musical compositions included in a database. The method selects a portion of a pre-recorded composition based on its degree of similarity, using chord tones and the notes in a scale associated with those chord tones. U.S. Patent Application Publication US 2020/0188790 assigns an emotion to musical motifs and then associates each motif with the desired emotion of a video game vector. The method then generates a musical composition based on these associations. Lastly, U.S. Pat. No. 8,812,144 (Balassanian) creates music by inputting a desired energy level, determining the tempo and key based on that energy level, and combining at least one generated track and one loop track. These methods do not create new music so much as rearrange and transform preexisting music; the musical breadth and depth of the resulting compositions is therefore inherently limited by their source segments and motifs.
Overall, it is clearly very challenging to create an algorithm that learns to generate emotionally and musically effective sequences. A music theory-driven model that utilizes the perceived meanings and effects of the relationships between musical states is a novel and powerful approach that may generate new musical sequences in real time more effectively. The simplicity and versatility of the present invention's single-value “emotion” input allows for a variety of possible applications: automatically generating a soundtrack for a movie, composing music for a video game in real time according to a player's actions, or enhancing a VR experience. Generated music could also be used to amplify or transform experiences not normally accompanied by music, such as scrolling through a social media feed, watching sports, or online messaging. Because the present invention uses an emotion input that changes over time, it also allows for the real-time generation of novel music that appropriately matches storylines and emotional arcs, rather than a single target emotion across a period of time or an entire experience.
The present invention provides innovative techniques for making a musical choice based on a target level of musical tension. One can begin with a value that represents the target level of musical tension, as well as a domain of possible musical states. The amount of musical tension that would result from choosing each possible next state is calculated by independently considering the horizontal and vertical tensions; a choice is then made by comparing the calculated tensions of the possible next states to the target level of tension. Some specific embodiments of the invention are described below.
In one embodiment, the invention provides a computer-implemented method of choosing a next chord in a sequence based on an input of a target tension value as well as a domain of possible chords. The vertical and horizontal tensions that would result from choosing each chord in the domain are calculated. Considerations in calculating the vertical tension of a possible next chord include the harmonic relationships between notes in the chord, the chord quality, and the relationship between the chord and globally defined parameters. Considerations in calculating the horizontal tension of a possible next chord include comparing corresponding attributes of the current and next chords, comparing the root notes of the current and next chords, determining the notes shared between the current and next chords, determining the notes in the current chord that are one semitone above a note in the next chord, and checking for a match with specific predefined chord sequences. A final tension value is calculated from the vertical and horizontal tension values; a choice is then made by comparing the final tension value of each possible chord to the target tension value and selecting the chord that is the closest match.
In another embodiment, the invention provides a computer-implemented method of choosing a next note in a sequence based on an input of a target tension value as well as a domain of possible notes. The vertical and horizontal tensions that would result from choosing each note in the domain are calculated. Considerations in calculating the vertical tension of a possible next note include the relationship between the note and other musical elements present at the same time, and the relationship between the note and globally defined parameters. Considerations in calculating the horizontal tension of a possible next note include the harmonic interval between the current and next note, and checking for a match with specific predefined sequences of notes. A final tension value is calculated from the vertical and horizontal tension values; a choice is then made by comparing the final tension value of each possible note to the target tension value and selecting the note that is the closest match.
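By way of illustration only, the selection step shared by both embodiments could be sketched in Python as follows. The `vertical_tension` and `horizontal_tension` scoring functions, and the simple averaging used to combine them into a final tension value, are hypothetical placeholders; the invention does not prescribe particular formulas.

```python
def choose_next_state(current_state, domain, target_tension,
                      vertical_tension, horizontal_tension):
    """Select the candidate whose final tension most closely matches the
    target tension (hypothetical averaging of the two components)."""
    def final_tension(candidate):
        v = vertical_tension(candidate)                   # tension of the state itself
        h = horizontal_tension(current_state, candidate)  # tension of the movement
        return (v + h) / 2.0                              # illustrative combination only
    return min(domain, key=lambda s: abs(final_tension(s) - target_tension))
```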
Other features and advantages of the invention will become readily apparent upon review of the following description in association with the accompanying drawings, where the same or similar structures are designated with the same reference numerals.
The present invention uses a jazz approach to music theory to inform generation. Jazz music theory is not exclusive to jazz music; it is simply a flexible and powerful method of abstracting, analyzing, creating, and communicating musical structures. This theory is related to, but largely distinct from, the classical approach to music theory. Basic jazz theory can be generalized and abstracted such that a few key concepts can be used to analyze very complex structures. An additional benefit is that one state does not restrict the available choices for the next musical or emotional state, making jazz theory especially powerful when considering a wide variety of possible musical and emotional directions. This use of jazz theory gives the present invention two key advantages: it can generate completely new music, and it can create music in real time.
In the present invention, chords are represented by conventional jazz chord symbols, which are composed of two components: a “root” and a “quality.” The root is the tonal foundation of the chord; the quality determines the chord's other notes relative to the root.
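As a concrete illustration, such a chord symbol can be modeled as a (root, quality) pair, with each quality defined as a set of semitone intervals above the root. The interval sets below follow standard chord spellings; the representation itself is only a sketch, not a required implementation.

```python
# Chord qualities as semitone intervals above the root (standard spellings).
QUALITIES = {
    "major": (0, 4, 7),      # root, major 3rd, perfect 5th
    "minor": (0, 3, 7),      # root, minor 3rd, perfect 5th
    "7":     (0, 4, 7, 10),  # dominant 7th: adds a minor 7th
    "maj7":  (0, 4, 7, 11),  # major 7th chord
}
PITCH_CLASSES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                 "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def chord_tones(root, quality):
    """Return the pitch classes (0-11) of a chord given its symbol."""
    return {(PITCH_CLASSES[root] + i) % 12 for i in QUALITIES[quality]}

# Example: C7 -> {0, 4, 7, 10}, i.e. C, E, G, Bb
print(sorted(chord_tones("C", "7")))
```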
To use the concept of tension and release in the present invention, as shown in the accompanying drawings, the tension or release produced by each musical state or transition is quantified as a single numerical value, referred to herein as the TRQ.
As shown in the accompanying drawings, the TRQ of a possible next state is calculated from two components: the vertical tension created by the state itself, and the horizontal tension created by the movement from the current state to the next.
To analyze vertical tension 600, the system first evaluates the tension within the chord 601 by determining the chord quality.
The next step is to evaluate whether the chord root is in or out of the key 602. The root of the chord is the note that the chord is constructed from and, together with the chord quality, determines the notes that comprise the chord. A root note that is outside of the key will result in an increase in the calculated tension. Considering the root note's relationship to the greater musical context, independent of the rest of the chord, provides a broad measure of the entire chord's relationship to that context, as the rest of the chord is constructed from the root note.
Finally, the system analyzes whether the chord tones themselves are in or out of the key 603. This provides a more detailed analysis of the chord's relationship to the musical context, and is a secondary, higher-resolution consideration after analyzing the root note.
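A minimal Python sketch of these three steps, with purely illustrative tension weights (the invention does not specify numeric values), might look like this:

```python
# Hypothetical tension weights per chord quality (601); illustrative only.
QUALITY_TENSION = {"major": 0.0, "minor": 0.1, "7": 0.4}
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone steps of a major key

def key_pitch_classes(key_root):
    """Pitch classes (0-11) belonging to the major key on `key_root`."""
    return {(key_root + step) % 12 for step in MAJOR_SCALE}

def vertical_tension(root, quality, tones, key_root):
    """Steps 601-603: quality tension, root vs. key, chord tones vs. key."""
    key = key_pitch_classes(key_root)
    tension = QUALITY_TENSION[quality]      # 601: tension within the chord
    if root % 12 not in key:                # 602: root outside the key
        tension += 0.3                      #      illustrative penalty
    out_of_key = sum(1 for t in tones if t not in key)
    tension += 0.1 * out_of_key             # 603: each out-of-key chord tone
    return tension
```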
To analyze horizontal tension 700, the system first evaluates the distance between chord roots 701. This distance is measured in ascending semitones, and is an effective indication of harmonic movement and function. Each distance has a corresponding degree of tension or release.
Next, the method checks for a dominant V to I sequence 702. This specific chord movement is central to harmonic movement in Western music and is thus checked for explicitly.
The third step is to evaluate common chord tones 703. This measures the magnitude of harmonic movement: if many chord tones are shared between two chords, the magnitude of tension or release generated will be smaller.
Finally, the system evaluates for leading tones 704. Leading tones are defined as notes in a chord that are a semitone below a note in the previous chord, and are a common means of harmonic resolution. The existence of one or more leading tones results in a greater degree of release.
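These four steps could likewise be sketched in Python; the per-interval contributions and other numeric values below are hypothetical placeholders:

```python
# Hypothetical tension contributions per root distance (701); values are
# illustrative only, not specified by the invention.
ROOT_DISTANCE_TENSION = {0: 0.0, 5: -0.3, 7: 0.3}

def horizontal_tension(cur_root, cur_quality, cur_tones, nxt_root, nxt_tones):
    """Steps 701-704 for a candidate next chord (pitch classes 0-11)."""
    distance = (nxt_root - cur_root) % 12       # 701: ascending semitones
    tension = ROOT_DISTANCE_TENSION.get(distance, 0.1)
    if cur_quality == "7" and distance == 5:    # 702: dominant V -> I
        tension -= 0.4                          #      strong release
    shared = len(cur_tones & nxt_tones)         # 703: common chord tones
    tension /= 1 + shared                       #      damp the magnitude
    leading = sum(1 for t in nxt_tones if (t + 1) % 12 in cur_tones)
    tension -= 0.2 * leading                    # 704: leading tones add release
    return tension
```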
For instance, if the current chord is C major, the movement C major→A minor, which is diatonic, has a root note interval of a major 6th, and shares two chord tones with C major, yields a slightly released TRQ. The movement C major→A7b9, however, has the same root note interval but is a dominant chord with a chromatic alteration (b9) and two chord tones outside the key of C major, so it has a more tense TRQ.
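These counts can be verified directly with pitch-class sets (C = 0, C# = 1, ..., B = 11):

```python
C_MAJOR_KEY = {0, 2, 4, 5, 7, 9, 11}  # C D E F G A B
C_MAJOR     = {0, 4, 7}               # C  E  G
A_MINOR     = {9, 0, 4}               # A  C  E
A7_FLAT9    = {9, 1, 4, 7, 10}        # A  C# E  G  Bb

print((9 - 0) % 12)                   # 9 ascending semitones: a major 6th
print(len(C_MAJOR & A_MINOR))         # 2 chord tones shared with C major
print(len(A7_FLAT9 - C_MAJOR_KEY))    # 2 chord tones outside the key (C#, Bb)
```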
The input for the present invention is an array (for a generation of fixed length) or a continuous stream (for a real-time generation of unknown length) of TRQ values that represents the desired tension or release of the generation over time. Depending on the application, this tension/release profile can be obtained directly from the user or from another source; for instance, if the present invention is being used to generate music to accompany a video game, the events occurring in the game could be used to produce the profile.
Before generation starts, it is necessary to define the set of possible states that the generation could output. When generating chords, this is accomplished by specifying a domain of possible roots and chord qualities. For instance, a possible domain could include the root notes [C F G], and the chord qualities [major minor 7], yielding overall possible combinations of: C major, C minor, C7, F major, F minor, F7, G major, G minor, and G7.
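For illustration, this domain could be enumerated as the Cartesian product of the specified roots and qualities; the snippet below is a sketch only:

```python
from itertools import product

roots = ["C", "F", "G"]
qualities = ["major", "minor", "7"]

# Every (root, quality) combination forms the domain of possible chords:
# C major, C minor, C7, F major, F minor, F7, G major, G minor, G7.
domain = list(product(roots, qualities))
for root, quality in domain:
    print(root, quality)
```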
Whenever the present invention reaches a musical state, the TRQs of all possible next states are calculated, and the algorithm chooses the state with the TRQ that most closely matches the target profile. This state becomes the current state, and the process is repeated. The algorithm can be executed with multiple “threads,” where several of the closest matches are selected at each stage of the algorithm, creating an N-ary tree structure as shown in the accompanying drawings.
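A minimal sketch of this multi-threaded selection loop, assuming a hypothetical `trq(prev, state)` scoring function and keeping the k closest matches at each step:

```python
import heapq

def generate(profile, domain, start, trq, k=3):
    """Multi-thread generation: at each step, keep the k candidate states
    whose TRQ most closely matches the target, forming an N-ary tree of
    sequences. `trq(prev, state)` is a hypothetical scoring function."""
    threads = [(start, [start])]
    for target in profile:
        candidates = []
        for prev, path in threads:
            for state in domain:
                error = abs(trq(prev, state) - target)
                candidates.append((error, state, path))
        best = heapq.nsmallest(k, candidates, key=lambda c: c[0])
        threads = [(state, path + [state]) for _, state, path in best]
    return threads[0][1]  # sequence from the closest match at the final step
```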
The present invention's core tension/release driven algorithm could be applied effectively to the generation of any musical structure—the TRQ would be adapted to calculate the tension or release imparted by each possible musical choice. If multiple musical structures are being generated (for instance, chords and melody), the tension or release of the individual components would be calculated, as well as the tension or release created by their coexistence.
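If chords and a melody were generated together, the final value might be sketched as a weighted combination of each component's tension and the tension of their interaction; the decomposition and weights below are assumptions for illustration only:

```python
def combined_tension(chord_tension, melody_tension, interaction_tension,
                     w_chord=0.4, w_melody=0.4, w_inter=0.2):
    """Hypothetical combination of each component's own tension with the
    tension created by their coexistence (weights are illustrative)."""
    return (w_chord * chord_tension
            + w_melody * melody_tension
            + w_inter * interaction_tension)
```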
The simplicity and flexibility of the single tension/release input allows the present invention to be adapted for a large variety of applications. Creators producing games, movies, installations, or VR experiences could use the present invention to create music that conforms to the intended tone. Using sentiment analysis, the emotional content of a text source could be used to calculate a TRQ over time, so the algorithm could be used to generate musical accompaniment for an online messaging conversation, e-book, or social network feed.
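As one possible sketch of that last idea, a stream of sentiment scores in [-1, 1] (assumed here to come from any off-the-shelf sentiment analyzer) could be smoothed into a TRQ profile over time; the mapping and scale are purely illustrative:

```python
def sentiment_to_trq(sentiment_scores, smoothing=0.8):
    """Map sentiment scores in [-1, 1] (negative = tense, positive =
    released) to a smoothed TRQ profile; the mapping is illustrative."""
    profile, trq = [], 0.0
    for score in sentiment_scores:
        # Tension rises as sentiment falls; exponential smoothing keeps
        # the profile from jumping on every message.
        trq = smoothing * trq + (1 - smoothing) * (-score)
        profile.append(trq)
    return profile
```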
This application claims the benefit of U.S. Provisional Application No. 63/092,818, filed in October 2020.