This application is a U.S. National Phase of International Patent Application No. PCT/JP2019/009353 filed on Mar. 8, 2019, which claims priority benefit of U.S. Provisional Patent Application No. 62/804,450 filed in the United States Patent Office on Feb. 12, 2019. Each of the above-referenced applications is hereby incorporated herein by reference in its entirety.
The present disclosure relates to an information processing device, an information processing method, and an information processing program. Specifically, the present disclosure relates to use of music data composed on the basis of machine learning.
With the progress of artificial intelligence (AI), the use of computers in the field of art is advancing.
For example, there is known a technology of performing machine learning using existing music as learning data to generate a learning model for music generation, and having a computer compose new music (e.g., Patent Document 1). In such a technology, it is possible to imitate the features of existing music or generate a more natural melody by using a Markov model.
According to the conventional technology, since music data proposed (generated) by AI can be used for composition, the user can compose music on the basis of a wider variety of viewpoints.
However, the conventional technology described above does not always improve the convenience of the automatic composition function by AI. For example, at present, many users use a digital audio workstation (DAW) to compose, arrange, and record. However, when the user uses the conventional technology described above in combination with a DAW, the work is carried out while going back and forth between different work environments, which may reduce work efficiency. Additionally, since the automatic composition function by AI generally imposes a heavy information processing load, if the automatic composition function is executed on the terminal device at the same time as the DAW, there is a possibility that the automatic composition function is not fully exhibited or that processing on the DAW side is delayed.
Against this background, the present disclosure proposes an information processing device, an information processing method, and an information processing program that can improve convenience of an automatic composition function by AI.
In order to solve the above problems, an information processing device of one form according to the present disclosure is an information processing device that controls a first application and a second application that functions as a plug-in that extends functions of the first application, in which the first application includes a control unit that controls operation of the second application in the first application, and the second application includes a selection unit that selects setting information for controlling a composition function based on machine learning, and a transmission/reception unit that transmits the setting information to an external server that executes a composition function based on machine learning and receives music data composed by the external server through a network.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. Note that in each of the following embodiments, the same parts will be designated by the same reference numerals, thereby omitting duplicate description.
The present disclosure will be described according to the order of items shown below.
1. Embodiment
2. Modification
3. Other embodiments
4. Effect of information processing device according to the present disclosure
5. Hardware configuration
First, one example of information processing according to the present disclosure will be described with reference to
The user terminal 10 shown in
In the embodiment, the user terminal 10 includes an application (so-called DAW) that implements a comprehensive music production environment. In the following description, the application (DAW) is referred to as a first application or a host application. Another application for extending functions can be incorporated (inserted) into the first application according to the embodiment. That is, it is assumed that the first application can use a so-called plug-in, which is another application for extending functions. In this case, the first application functions as the host application for the incorporated plug-in.
Additionally, in the embodiment, the user terminal 10 includes an application having an automatic composition function by AI. In the following description, the application (application having automatic composition function by AI) is referred to as a second application or a plug-in. The second application according to the embodiment is incorporated as a plug-in of the first application described above. The plug-in can take the form of, for example, Steinberg's virtual studio technology (VST) (registered trademark), Audio Units, Avid Audio eXtension (AAX), and the like.
The processing server 100 shown in
As described above, by using the second application as a plug-in, the user terminal 10 can drag and drop the music data provided by the processing server 100 on the second application onto the first application, or perform editing on the first application. Additionally, while the automatic composition function has been dependent on the processing performance (CPU power and the like) of the terminal on which the processing is performed in the conventional technology, as shown in
As shown in
The user terminal 10 selects the setting information of music to be automatically composed in the plug-in 22 according to the user's operation. Although details will be described later, the user terminal 10 selects setting information such as chord progression of the music to be automatically composed, a subjective image of the music (dark, bright, and the like), and the composition of the music according to operations of the user. Then, the user terminal 10 transmits the selected setting information to the processing server 100 (step S3).
The processing server 100 performs predetermined learning processing on the basis of the setting information transmitted from the user terminal 10, and performs composition processing on the basis of the learning result (step S4). For such composition processing, the processing described in the conventional technology document described above may be used, for example. Then, the processing server 100 generates the composed music data.
Subsequently, the processing server 100 transmits the generated music data to the user terminal 10 (step S5). The user terminal 10 receives the music data transmitted from the processing server 100 in the plug-in 22. For example, the music data includes information such as chord progression, a melody, and bass note progression generated by the processing server 100. Note that the music data may be standard data such as musical instrument digital interface (MIDI) data, waveform data, or DAW original standard data. The user may edit the received music data on the plug-in 22, or may copy the music data to the host application 20 and use it on the host application 20.
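The exchange described in steps S3 to S5 can be pictured as a simple request/response over the network. The following is a minimal, hypothetical sketch of the data carried in each direction; all field and class names are illustrative and are not specified by the present disclosure:

```python
from dataclasses import dataclass, field

@dataclass
class SettingInfo:
    """Setting information selected in the plug-in (step S3), illustrative."""
    style_palette_id: str   # designation information for material music
    harmony: float          # 0.0 = "loose" .. 1.0 = "strict"
    note_length: float      # 0.0 = "short" .. 1.0 = "long"
    outside_material: float # 0.0 = "never" .. 1.0 = "only"

@dataclass
class MusicData:
    """Music data received from the server (step S5), illustrative."""
    chords: list = field(default_factory=list)  # chord progression
    melody: list = field(default_factory=list)  # melody notes
    bass: list = field(default_factory=list)    # bass note progression

def compose_request(setting: SettingInfo) -> dict:
    """Serialize the setting information for transmission to the server."""
    return {
        "style_palette": setting.style_palette_id,
        "harmony": setting.harmony,
        "note_length": setting.note_length,
        "outside_material": setting.outside_material,
    }
```

The received `MusicData` could equally be MIDI, waveform data, or a DAW-native format, as the text notes; the dataclass above merely names the three streams the plug-in works with.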
As described above, the user terminal 10 controls the host application 20 and the plug-in 22 that functions as a plug-in that extends the functions of the host application 20. Additionally, the host application 20 controls the operation of the plug-in 22 in the host application 20. Additionally, the plug-in 22 selects the setting information for controlling the composition function based on machine learning, transmits the setting information to the processing server 100 through the network N, and receives the music data composed by the processing server 100.
That is, the user terminal 10 uses the automatic composition function as a plug-in of the DAW. For this reason, the user can receive the support of the automatic composition function in the DAW which is a normal working environment. Additionally, the user can avoid the delay of processing in the DAW by making the processing server 100 bear the processing load of the automatic composition function. Consequently, the user terminal 10 can improve convenience of the automatic composition function by AI.
Next, the details of the automatic composition function by the plug-in 22 will be described with reference to
In the example shown in
Setting information 31 displays information regarding the style palette, which is one example of setting information in the automatic composition function. The style palette is designation information for designating material music to be learning data for machine learning.
Setting information 32 displays information regarding harmony, which is one example of setting information in the automatic composition function. The information regarding harmony is, for example, information for determining the probability that constituent notes included in a chord appear in the melody of the music data composed by the processing server 100. For example, if the user sets the information regarding harmony to “strict”, constituent notes included in a chord are more likely to appear in the melody of the automatically composed music data. On the other hand, if the user sets the information regarding harmony to “loose”, constituent notes included in a chord are less likely to appear in the melody of the automatically composed music data. The example in
Setting information 33 displays note length information, which is one example of setting information in the automatic composition function. The note length information is, for example, information for determining the note length in the music data composed by the processing server 100. For example, if the user sets the note length information to “long”, notes with a relatively long note length (e.g., whole note, half note, and the like) are more likely to appear in the automatically composed music data. On the other hand, if the user sets the note length information to “short”, notes with a relatively short note length (e.g., eighth note, sixteenth note, and the like) are more likely to appear in the automatically composed music data.
Setting information 34 displays information for determining the type and amount of material music other than the material music included in the designation information (style palette designated by user), which is one example of setting information in the automatic composition function. Such information is, for example, information for determining whether or not to perform learning strictly on the basis of the music included in the style palette designated by the user in the music data composed by the processing server 100. For example, if the user sets such information to “never”, the tendency to use music other than the music included in the style palette in automatic composition learning decreases. On the other hand, if the user sets such information to “only”, the tendency to use music other than the music included in the style palette in automatic composition learning increases.
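The harmony setting described above can be understood as a weight that biases note sampling toward chord constituent notes during generation. A minimal sketch under that assumption follows; the server-side model is not specified in the present disclosure, and the weighting scheme here is purely illustrative:

```python
import random

def sample_note(candidates, chord_tones, harmony_strictness, rng=None):
    """Pick one melody note, biasing toward chord constituent notes.

    harmony_strictness: 0.0 ("loose") .. 1.0 ("strict"). Higher values
    make chord tones more likely to appear in the melody, mirroring the
    behavior described for the harmony setting. Illustrative only.
    """
    rng = rng or random.Random()
    weights = [
        1.0 + 4.0 * harmony_strictness if note in chord_tones else 1.0
        for note in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

The note length and material-music settings could bias generation in the same way, by reweighting long versus short durations or in-palette versus out-of-palette training examples.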
Music data 35 displays specific music data transmitted from the processing server 100. In the example of
Note that while the user interface 30 shown in
As shown in
The user can select the information to be copied to the host application 20 from the displayed user interface 30, user interface 38, and user interface 39, or edit a part of the bass note, for example.
Next, the style palette, which is one example of the setting information described above, will be described with reference to
A window 40 shown in
Note that as described above, the style palette is information for designating the music used by the processing server 100 for learning. That is, each style palette contains information for identifying an existing piece of music composed in advance. For example, it is assumed that a constituent music list 50 is associated with the style palette 41. The constituent music list 50 includes multiple pieces of existing music. Additionally, it is assumed that a constituent music list 51 is associated with the style palette 42. The constituent music list 51 includes multiple pieces of existing music different from the music included in the constituent music list 50.
For this reason, a learning model generated by machine learning on the basis of the style palette 41 is different from a learning model generated by machine learning on the basis of the style palette 42. This is because the learning data in machine learning changes depending on the style palette selected by the user. That is, the style palette can also be considered as designation information for designating learning data in automatic composition.
The music included in the style palette is, for example, pre-registered by the administrator, provider, or the like of the plug-in 22. For example, the administrator of the plug-in 22 extracts multiple pieces of music that are subjectively perceived as “bright” to generate the constituent music list 50, and associates the constituent music list 50 with the style palette 41. Note that the style palette and the music corresponding to the style palette may be arbitrarily edited by the user of the plug-in 22. For example, the user may select music from a web service such as a song distribution service or a social networking service (SNS), combine the selected pieces of music, and generate a desired style palette. Specifically, the user may arbitrarily extract music included in a playlist automatically generated by a predetermined music application or a playlist provided to the user of the music application, and change the constituent music of a style palette that he/she created or create a new style palette. As a result, the user can flexibly generate a style palette of his/her liking.
Note that the user may select multiple style palettes when selecting the style palette as setting information. For example, the user may select the style palette 41 as the setting information for composing a part of a song (e.g., first eight bars), and select the style palette 42 as the setting information for composing another part of the song (e.g., middle eight bars). Such information including multiple style palettes is hereinafter referred to as a style palette sequence. In other words, the style palette sequence can be considered as combined designation information in which pieces of designation information for designating music, that is, style palettes are combined. The user can easily create various music data having multiple features in a single piece of music by setting the style palette sequence for the music composition.
Next, the relationship between the host application 20 and the plug-in 22 will be conceptually shown with reference to
A processing block 60 shown in
Thereafter, the user records the melody 61, the chord 62, and the bass note 63 used for the music in the recorder related to the DAW, and creates each track corresponding to the melody, the chord, and the bass note. For example, the user sets instrument information indicating the instrument to be used in the performance for the melody 61 generated by the plug-in 22. Specifically, the user sets instrument information such as playing the melody 61 on a guitar registered in the DAW. Then, the user records the sound played by the virtual guitar in the recorder and creates a track corresponding to the guitar. Note that since the DAW can create multiple tracks, a track based on the performance sound of the performer and a track based on the music data created by the plug-in 22 may coexist.
Thereafter, the user mixes the tracks on the DAW and creates music data by performing a mixdown and the like. Additionally, the user performs mastering on the DAW, adjusts the acoustic signal level and the like, and creates a music file 65 that can be played back on a playback device or the like.
As described above, according to the information processing according to the embodiment, the user can use data automatically composed by the plug-in 22 according to the performance data played by the performer and the created MIDI data, and create music on the DAW. For example, the user can create music on the DAW by mixing a melody automatically composed by AI with the performance data played by the performer, or by incorporating a chord progression proposed by AI into the performance data played by the performer.
Hereinabove, the outline of the overall flow of the information processing according to the present disclosure has been described. In
The user terminal 10 is one example of the information processing device according to the present disclosure, and controls the operation of the host application 20 and the plug-in 22.
The processing server 100 is one example of an external server according to the present disclosure, and performs automatic composition processing in cooperation with the plug-in 22.
The management server 200 is, for example, a server managed by a business operator or the like that provides the plug-in 22.
For example, the management server 200 manages the user authority of the user of the plug-in 22, and manages information of the style palette available in the plug-in 22. For example, the management server 200 determines whether or not a user has the authority to use the plug-in 22 on the basis of a user ID that uniquely identifies the user. Additionally, the management server 200 creates a style palette, edits music included in the style palette, and transmits information regarding the style palette to the user terminal 10 and the processing server 100. Note that the management server 200 may be integrally configured with the processing server 100.
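The authority determination described above, keyed by a user ID that uniquely identifies the user, can be sketched as a simple lookup. The record structure below is an assumption for illustration only:

```python
def has_authority(user_db: dict, user_id: str) -> bool:
    """Return True if the user ID is registered with plug-in authority.

    user_db maps user IDs to authority records; the "plugin_authorized"
    key is a hypothetical field, not specified by the disclosure.
    """
    record = user_db.get(user_id)
    return bool(record and record.get("plugin_authorized", False))
```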
Next, the configuration of the user terminal 10, which is one example of the information processing device according to the present disclosure, will be described with reference to
The communication unit 11 is implemented by, for example, a network interface card (NIC) or the like. The communication unit 11 is connected to the network N (Internet or the like) by wire or wirelessly, and transmits and receives information to and from the processing server 100, the management server 200, and the like through the network N.
The input unit 12 is an input device that accepts various operations from the user. For example, the input unit 12 is implemented by an operation key or the like included in the user terminal 10. The display unit 13 is a display device for displaying various types of information. For example, the display unit 13 is implemented by a liquid crystal display or the like. Note that when a touch panel is adopted for the user terminal 10, the input unit 12 and the display unit 13 are partially integrated.
The storage unit 15 is implemented by, for example, a semiconductor memory element such as a random access memory (RAM) or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 15 stores various data used for information processing.
As shown in
The composition setting information 151 is information used by the plug-in 22 (second application) when performing automatic composition.
As shown in
As the performance style information, information such as the performance style of the music used as the learning data of the automatic composition is stored. The performance style includes, for example, information such as the overall shuffle ratio, chord and bass note splits, and overall balance.
The composition music information 152 is information on the music used by the plug-in 22 when performing automatic composition.
As shown in
The history information 153 indicates the history of operations by the user in the host application 20 and the plug-in 22, and the history of music created by the user.
As shown in
The associated instrument information 154 indicates instrument information set for music data transmitted from the processing server 100 and multiple pieces of candidate data included in the music data.
As shown in
Returning to
As shown in
The host application control unit 161 controls the host application 20 (DAW as first application).
The plug-in control unit 162 controls the operation of various plug-ins in the host application 20. For example, the plug-in control unit 162 controls operations such as calling a plug-in in the host application 20, activating a plug-in on the host application 20, and copying data in a plug-in to the host application 20.
For example, the plug-in control unit 162 individually sets instrument information for designating a tone quality when the plug-in plays back a chord, a melody, or a bass note included in the music data received from the processing server 100. For example, the plug-in control unit 162 reads out information on virtual instruments registered on the DAW, and sets the information on the virtual instruments to play each of the chord, melody, or bass note included in the music data of the plug-in.
The playback unit 163 controls playback processing in the host application 20. The playback unit 163 has a synchronous playback function, a playback information transmission function, a sound synthesis playback function, a playback style arrangement function, and the like in the host application 20.
For example, the playback unit 163 cooperates with the synchronous playback function of the playback unit 168 of the plug-in to play back the music data held by the plug-in. For example, the playback unit 163 can pass time information indicating the position where the host application is playing back to the plug-in to acquire and play back the melody, chord, and bass note of the portion corresponding to the playback position.
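The hand-off described here, in which the host passes time information indicating its playback position and receives the corresponding portion of the plug-in's music data, can be sketched as follows. A fixed tempo and 4/4 time are assumed for illustration; the names are hypothetical:

```python
def playback_slice(music_bars, position_seconds, tempo_bpm=120, beats_per_bar=4):
    """Map the host's playback position to the plug-in's bar data.

    music_bars: list of per-bar dicts holding 'melody', 'chord', 'bass'.
    Returns the bar at the playback position, or None past the end.
    """
    seconds_per_bar = beats_per_bar * 60.0 / tempo_bpm
    index = int(position_seconds // seconds_per_bar)
    if 0 <= index < len(music_bars):
        return music_bars[index]
    return None
```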
Additionally, in a case where the performance style or the like is set in the plug-in, the playback unit 163 may process the playback data according to the performance style and play back the processed data.
The display control unit 164 controls display control processing in the host application 20. For example, the display control unit 164 has a performance information display function for displaying information on each track on a screen (display unit 13), a composition music information pasting function for copying information such as music data to a track, and the like.
Additionally, the display control unit 164 controls the plug-in to separately display each window displaying information regarding the chord, melody, or bass note included in the music data received from the processing server 100. For example, the display control unit 164 displays a user interface corresponding to each of the chord, melody, or bass note on the screen of the DAW, as shown in
Additionally, the display control unit 164 controls transmission and reception of information between each window displaying information regarding the chord, melody, or bass note and a window displaying information regarding the host application, according to the user's operation. As a result, the user can quickly perform processing such as copying the automatically composed music data to an arbitrary track or performing editing on the DAW.
Note that the display control unit 164 may control not only the exchange of information between the host application and the plug-in but also the exchange of information between the displayed plug-in windows. That is, the display control unit 164 may control transmission and reception of information between windows each displaying information regarding the chord, melody, or bass note, according to the user's operation.
The plug-in application control unit 165 controls the operation of the plug-in running on the host application 20. For example, the plug-in application control unit 165 activates the plug-in on the host application 20 according to the user's operation.
The selection unit 166 selects setting information for controlling a composition function based on machine learning. For example, the selection unit 166 selects, as setting information, designation information for designating the material music to be learning data for machine learning. Specifically, the designation information corresponds to the style palette shown in
For example, according to the user's operation, the selection unit 166 selects designation information that is stored in the storage unit 15 in advance and includes feature information indicating a feature of the designation information and multiple pieces of material music associated with the feature information. For example, the user refers to the feature information (“bright”, “dark”, or the like) of the style palette through the window 40 or the like shown in
Additionally, the selection unit 166 may select combined designation information in which first designation information corresponding to some bars of the music data composed by the external server and second designation information corresponding to some other bars thereof are combined. As described above, the combined designation information corresponds to the style palette sequence. Additionally, the first designation information corresponds to the style palette which is the setting information for composing some bars. Additionally, the second designation information corresponds to the style palette which is the setting information for composing some other bars.
Also, in addition to the style palette, the selection unit 166 may select detailed setting information regarding the music data to be composed.
For example, the selection unit 166 may select, as setting information, the length information of notes included in the music data composed by the processing server 100 on the basis of the style palette. For example, the selection unit 166 accepts the selection of the note length information from the user through the display of the slider or the like of the setting information 33 included in the user interface 30 or the like shown in
Additionally, the selection unit 166 may select, as setting information, information for determining the probability that constituent notes included in a chord appear in the melody of the music data composed by the processing server 100 on the basis of the style palette. For example, the selection unit 166 accepts the selection of the information for determining the probability that constituent notes included in a chord appear in the melody from the user, through the display of the slider or the like of the setting information 32 included in the user interface 30 or the like shown in
Additionally, the selection unit 166 may select, as setting information, information for determining the type and amount of material music other than the material music included in the style palette in the music data composed by the processing server 100 on the basis of the style palette. For example, the selection unit 166 accepts the selection of the information for determining the type and amount of material music other than the material music included in the style palette from the user, through the display of the slider or the like of the setting information 34 included in the user interface 30 or the like shown in
Additionally, the selection unit 166 may select information other than the style palette as the setting information for automatic composition. As one example, the selection unit 166 may select, as setting information, a chord progression in the composed music on the basis of the user's operation. In this case, the processing server 100 automatically generates music data on the basis of the chord progression selected by the user.
The transmission/reception unit 167 transmits the setting information selected by the selection unit 166 to the processing server 100 that executes the composition function based on machine learning through the network N, and receives music data composed by the processing server 100.
For example, the transmission/reception unit 167 transmits a style palette selected by the selection unit 166 to the processing server 100. Then, the transmission/reception unit 167 receives music data generated by the processing server 100 on the basis of the style palette.
The transmission/reception unit 167 receives a chord in a bar of a specified length, a melody in the bar, and a bass note in the bar, for example, as music data. Such information may be data such as MIDI, MusicXML, or the like, information of DAW original standard, or waveform data (WAV file or the like).
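As one concrete shape such per-bar data could take when delivered as MIDI-like information, each part may be a list of note events. This representation is an assumption for illustration; the disclosure only requires that the data be MIDI, MusicXML, DAW-native, or waveform data:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NoteEvent:
    pitch: int            # MIDI note number, e.g. 60 = middle C
    start_beat: float     # onset within the bar, in beats
    duration_beats: float
    velocity: int = 100

def bar_to_events(bar: dict) -> list:
    """Flatten one received bar (melody, chord, bass) into sorted events."""
    events = []
    for part in ("melody", "chord", "bass"):
        events.extend(bar.get(part, []))
    return sorted(events, key=lambda e: (e.start_beat, e.pitch))
```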
Additionally, the transmission/reception unit 167 may transmit a style palette sequence selected by the selection unit 166 to the processing server 100. In this case, the transmission/reception unit 167 receives music data generated by the processing server 100 on the basis of the style palette sequence.
When the transmission/reception unit 167 receives music data composed by the processing server 100 on the basis of the style palette sequence, the transmission/reception unit 167 may store the music data in association with the style palette sequence in the storage unit 15. As a result, the user can refer to what kind of music data is created by what kind of style palette sequence as a history, so that such information can be utilized for composition.
Additionally, the transmission/reception unit 167 may transmit various setting information other than the style palette and the style palette sequence to the processing server 100. For example, the transmission/reception unit 167 transmits, to the processing server 100, note length information, information for determining the probability that constituent notes included in a chord appear in the melody, information for determining the type and amount of material music other than the material music included in the style palette, and the like set by the user.
Additionally, when the user performs a playback or editing operation on music data composed by the processing server 100 after receiving the music data, the transmission/reception unit 167 may transmit information regarding the playback or editing operation to the processing server 100. As a result, the processing server 100 can acquire information such as how the composed music data is used or how much the composed music data is used. In this case, the processing server 100 may adjust the learning method and the music data to be generated on the basis of such information. For example, the processing server 100 may analyze past music data used by more users and preferentially generate music data having such characteristics.
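The adjustment described here, in which the processing server 100 favors characteristics of music data that users actually played back or edited, could be approximated server-side with simple usage counts. The following is an illustrative sketch, not the disclosed learning method:

```python
from collections import Counter

def update_usage(counter: Counter, music_id: str, operation: str) -> None:
    """Record a playback or editing operation reported by the plug-in."""
    if operation in ("playback", "edit"):
        counter[music_id] += 1

def preferred_music(counter: Counter, top_n: int = 3):
    """Return the most-used past music data, to bias future generation."""
    return [music_id for music_id, _ in counter.most_common(top_n)]
```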
The playback unit 168 controls playback processing in the plug-in. For example, the playback unit 168 plays back music data received by the transmission/reception unit 167. Specifically, the playback unit 168 sets arbitrary instrument information for each of the melody, chord, and bass note included in the music data, and plays back each piece of data. Note that the playback unit 168 may play back the melody, the chord, and the bass note in combination.
The display control unit 169 controls display processing in the plug-in. For example, the display control unit 169 displays a window such as a user interface showing plug-in information on the screen.
As shown in
Additionally, the display control unit 169 may perform control to retrieve the history of past music data composed by the processing server 100 from the storage unit 15 and display the history of the past music data according to the user's operation. As a result, the user can proceed with the composition while referring to the data composed by the processing server 100 in the past. For example, the user can determine the final candidate by comparing the latest music created by editing with the history of music edited in the past.
Additionally, the display control unit 169 may perform control to retrieve the history of editing operations performed on the past music data composed by the processing server 100 from the storage unit 15 and also display the editing operations performed on the past music data. As a result, the user can refer to the editing operation performed in the past, the music data generated by the editing operation, for example, so that the composition can be performed efficiently.
Note that while
Next, the configuration of the processing server 100 which is one example of the external server according to the present disclosure will be described.
As shown in
The communication unit 110 is implemented by, for example, a NIC or the like. The communication unit 110 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the user terminal 10, the management server 200, and the like through the network N.
The storage unit 120 is implemented by, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 120 stores various data used for information processing.
As shown in
The user information 121 indicates information of the user of the plug-in 22 (second application).
As shown in
The music information 122 indicates information on music used for automatic composition processing.
As shown in
The style palette information 123 indicates information regarding the style palette used for automatic composition processing.
As shown in
The style palette sequence information 124 indicates information regarding the style palette sequence used for automatic composition processing.
As shown in
The user composition information 125 indicates information regarding composition received from the user terminal 10.
As shown in
The history information 126 is various histories related to information processing of the processing server 100.
As shown in
Returning to
As shown in
The acceptance unit 131 accepts various information transmitted from the management server 200. For example, the acceptance unit 131 accepts information on the user of the plug-in, information regarding the style palette, information on the material music used in the automatic composition, and the like. For example, when the user purchases and activates a product (plug-in, DAW, or the like), the acceptance unit 131 performs processing of issuing a user ID to the user and accepting information regarding the user. Additionally, the acceptance unit 131 accepts registration of music to be linked to a style palette, editing of the style palette, and the like according to operations and commands from the management server 200.
The management unit 132 manages various information accepted by the acceptance unit 131. For example, the management unit 132 stores various information in the storage unit 120, and updates the stored information as appropriate.
For example, when the style palette registration processing by the management unit 132 is completed, the user can acquire and browse a list of style palette information.
The acquisition unit 133 acquires a request for automatic composition transmitted from the user terminal 10. Additionally, the acquisition unit 133 acquires setting information transmitted together with the request. For example, the acquisition unit 133 acquires a style palette that the user desires as setting information.
The composition unit 134 composes music on the basis of the setting information acquired by the acquisition unit 133. The composition unit 134 may compose music by using various existing music generation algorithms. For example, the composition unit 134 may use a music generation algorithm using a Markov chain, or may use a music generation algorithm using deep learning. As described above, the composition unit 134 generates multiple pieces of music data for a single piece of setting information transmitted from the user. As a result, the user can receive multiple proposals from the composition unit 134, and thus can proceed with the composition by using more diverse information.
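To make the Markov-chain option concrete, the following is a deliberately simplified sketch of how the composition unit 134 might learn note transitions from material music and emit multiple candidate melodies for one request; pitches are MIDI note numbers, and all function names are hypothetical.

```python
import random

def build_transitions(melodies):
    """Learn first-order note transitions from material melodies (lists of pitches)."""
    table = {}
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            table.setdefault(a, []).append(b)
    return table

def compose(table, start, length, rng):
    """Walk the Markov chain to produce one candidate melody."""
    melody = [start]
    while len(melody) < length:
        choices = table.get(melody[-1])
        if not choices:  # dead end: no observed transition from this note
            break
        melody.append(rng.choice(choices))
    return melody

material = [[60, 62, 64, 65, 67], [67, 65, 64, 62, 60]]  # two toy material melodies
table = build_transitions(material)
rng = random.Random(0)
# Multiple pieces of music data for a single piece of setting information.
candidates = [compose(table, start=60, length=5, rng=rng) for _ in range(3)]
```

A production system would of course condition on chords, bars, and the style palette rather than on the previous note alone, or use deep learning as noted above.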
When music data is generated by composition processing, the composition unit 134 associates the generated music data with the user ID of the user who transmitted the style palette and stores it in the storage unit 120 as history information.
The transmission unit 135 transmits the music data generated by the composition unit 134 to the user terminal 10.
Next, the procedure of information processing according to the embodiment will be described with reference to
As shown in
Subsequently, the user terminal 10 determines whether or not selection of a style palette or the like is accepted from the user (step S102). If selection of a style palette or the like is not accepted from the user (step S102; No), the user terminal 10 stands by until a selection is accepted.
On the other hand, if selection of a style palette or the like is accepted from the user (step S102; Yes), the user terminal 10 selects a style palette according to the user's operation (step S103). Note that the user terminal 10 may accept various setting information other than the style palette in step S102.
Thereafter, the user terminal 10 determines whether or not a composition request is accepted from the user (step S104). If a composition request is not accepted from the user (step S104; No), the user terminal 10 stands by until a request is accepted.
On the other hand, if a composition request is accepted from the user (step S104; Yes), the user terminal 10 transmits the accepted setting information together with the composition request to the processing server 100 (step S105). Thereafter, the user terminal 10 receives music data composed (generated) by the processing server 100 (step S106).
Subsequently, the user terminal 10 determines whether or not editing processing or the like has been performed by the user on the user terminal 10 (step S107). If the editing processing or the like has not been performed (step S107; No), the user terminal 10 stands by until the editing processing or the like is accepted (step S107).
On the other hand, if the editing processing or the like has been performed (step S107; Yes), the user terminal 10 reflects the editing and transmits information regarding the editing operation to the processing server 100 (step S108).
Thereafter, the user terminal 10 determines whether or not another composition request is accepted from the user (step S109). If a composition request is accepted from the user (step S109; Yes), the user terminal 10 accepts new setting information from the user.
On the other hand, if a composition request is not accepted from the user (step S109; No), the user terminal 10 determines whether or not a host application termination request is accepted (step S110). If a host application termination request is not accepted (step S110; No), the user terminal 10 continues the editing processing of the music data currently received. On the other hand, if a host application termination request is accepted (step S110; Yes), the user terminal 10 terminates the host application and the plug-in, and ends the processing.
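The procedure of steps S101 through S110 above can be summarized as a small state machine in which each user operation drives one transition; the state and event names below are illustrative only.

```python
# Hypothetical event-driven sketch of steps S101-S110: each accepted user
# operation moves the plug-in session to its next state.
def next_state(state, event):
    transitions = {
        ("selecting", "palette_selected"): "awaiting_request",   # S102-S103
        ("awaiting_request", "compose_requested"): "composing",  # S104-S105
        ("composing", "music_received"): "editing",              # S106
        ("editing", "edit_applied"): "editing",                  # S107-S108
        ("editing", "compose_requested"): "selecting",           # S109: new settings
        ("editing", "quit_requested"): "terminated",             # S110: end processing
    }
    # Unrecognized events leave the state unchanged (i.e., stand by).
    return transitions.get((state, event), state)
```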
The information processing system 1 described above may be implemented in various different forms other than the above embodiment. Hence, modifications of the embodiment will be described below.
In the above embodiment, the types of information set in the associated instrument information 154 and the like of the music data in the plug-in are assumed to be melody, chord, and bass note. However, the present invention is not limited to this. For example, the associated instrument information 154 can be applied not only to melody, chord, and bass note, but also to the performance part of each instrument of a full orchestra.
In the above embodiment, DAW is assumed as the host application. However, the present invention is not limited to this. For example, the host application may be a video editing application or the like instead of a music editing application.
In the above embodiment, an example is shown in which the user terminal 10 selects setting information on the plug-in and transmits the selected information to the processing server 100. However, the setting information and the like may be selected by the host application. That is, the user terminal 10 may transmit the setting information (e.g., chord progression) and the like selected in the host application to the processing server 100 to enable execution of the automatic composition processing. In this case, the host application may provide the plug-in with an application programming interface (API) for the plug-in to use information of the host application, and allow acquisition of information for generating a style palette from the host application, and control transmission and reception processing with the processing server 100.
For example, the user terminal 10 uses a chord generation function of the DAW, which is the host application, to generate an arbitrary chord progression. Then, the user terminal 10 may execute automatic composition on the basis of the chord progression generated by the DAW. For example, the user terminal 10 inputs the chord progression generated by the DAW into the plug-in, and transmits the chord progression to the processing server 100 through the plug-in.
That is, the host application performs control to transmit information regarding the chord progression generated in the host application to the plug-in. Then, the plug-in selects, as setting information, the information regarding the chord progression generated in the host application. Moreover, the plug-in transmits the information regarding the chord progression generated in the host application to the processing server 100, and receives music data composed on the basis of the information regarding the chord progression.
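A minimal sketch of such a host-to-plug-in API is shown below; HostApi and build_composition_request are hypothetical stand-ins, since the actual API surface a DAW exposes is not specified here.

```python
class HostApi:
    """Minimal stand-in for the API a host application might expose to a plug-in."""
    def __init__(self, chord_progression):
        self._chords = list(chord_progression)

    def get_chord_progression(self):
        """Return a copy of the chord progression generated in the host application."""
        return list(self._chords)

def build_composition_request(host, user_id):
    """The plug-in packs the host's chord progression into setting information
    to be transmitted to the processing server."""
    return {"user_id": user_id, "setting": {"chords": host.get_chord_progression()}}

request = build_composition_request(HostApi(["C", "Am", "F", "G"]), "user-1")
```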
Additionally, the user terminal 10 may automatically select a style palette to be transmitted to the processing server 100 on the basis of the chord progression generated by the DAW. For example, the user terminal 10 may select a style palette having features similar to the chord progression generated by the DAW, and transmit the style palette to the processing server 100. Additionally, the user terminal 10 may sequentially select style palettes according to the chord progression generated by the DAW, generate a style palette sequence, and transmit the generated style palette sequence to the processing server 100.
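One simple way such automatic selection could be realized is to score each registered style palette against the DAW-generated chord progression; the Jaccard similarity used below is only one hypothetical choice of metric.

```python
def chord_set_similarity(a, b):
    """Jaccard similarity between two collections of chord symbols (illustrative metric)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def pick_style_palette(progression, palettes):
    """Choose the registered palette whose chords best match the DAW progression."""
    return max(palettes, key=lambda p: chord_set_similarity(progression, p["chords"]))

palettes = [
    {"id": "pop",  "chords": ["C", "G", "Am", "F"]},
    {"id": "jazz", "chords": ["Dm7", "G7", "Cmaj7"]},
]
best = pick_style_palette(["C", "Am", "F", "G"], palettes)
```

Applying such a selection bar by bar would yield the style palette sequence mentioned above.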
Additionally, the user terminal 10 may perform settings so that the plug-in is allowed access to information of a bass track on the host application. For example, the user sets the bass track of the DAW so that the track follows automatically composed music data. In this case, the bass track is automatically complemented according to music data generated by the processing server 100 and a chord progression generated by the DAW, for example.
Additionally, the user terminal 10 may perform settings so that the plug-in is allowed access to information of a melody track on the host application. For example, the user sets the melody track of the DAW so that the track follows automatically composed music data. In this case, when the user selects a certain bar and requests automatic composition, the generated melody is automatically inserted into the track. Additionally, when the user sets the DAW to a mode (called comping mode or the like) for editing by combining multiple pieces of music data, the user can complete the melody by selecting desired parts of multiple tracks appearing on the screen.
Additionally, the user terminal 10 may perform settings so that the plug-in is allowed access to information of a melody track and MIDI input on the host application. In this case, the user can perform composition by making full use of both the automatic composition function and the MIDI input. For example, the user inputs an arbitrary chord progression in four bars and causes the DAW to loop. Then, the user performs input with the MIDI keyboard according to the loop performance. By uploading the chord progression and melody information to the processing server 100, the user terminal 10 can automatically create a personal style palette on the processing server 100 side. For example, in a newly added style palette menu on the DAW, the user can give instructions to start or stop creating, save, name, or delete the personal style palettes, for example. Such a personal style palette may be made publicly available through the style palette menu.
Additionally, the user terminal 10 may perform settings so that the plug-in is allowed access to information of an audio track on the host application. The audio track is, for example, a track on which an instrument performance sound is recorded, and is, for example, a track including a chord performance by a piano, a bass note performance by a bass guitar, a melody by a lead instrument, or the like. The plug-in accesses an audio track, analyzes audio data such as a melody, chord, and bass note of each track by signal processing, and obtains MIDI information of the melody, chord progression information, and the like. The plug-in may use, for example, 12-tone analysis technology or the like for analysis. In this case, the user terminal 10 may transmit the analyzed information to the processing server 100 to automatically infer the optimum chord progression by machine learning or the like. Then, the processing server 100 defines a style palette sequence on the basis of this chord progression information. As a result, the user can perform assisted composition based on the style palette sequence generated by the processing server 100, so that the entire composition can be recomposed or the composition can be partially recomposed and replaced, for example.
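As a greatly simplified stand-in for the signal-level analysis described above, the sketch below infers a chord per bar from already-detected pitch classes rather than from raw audio; the chord templates and names are illustrative only, and real 12-tone analysis operates on the audio signal itself.

```python
# Illustrative triad templates as sets of pitch classes (0 = C, 1 = C#, ...).
TEMPLATES = {
    "C":  {0, 4, 7},    # C E G
    "F":  {5, 9, 0},    # F A C
    "G":  {7, 11, 2},   # G B D
    "Am": {9, 0, 4},    # A C E
}

def infer_chord(pitch_classes):
    """Pick the template sharing the most pitch classes with the observed bar."""
    return max(TEMPLATES, key=lambda name: len(TEMPLATES[name] & set(pitch_classes)))

# Pitch classes observed in four consecutive bars of an audio track.
bars = ([0, 4, 7], [9, 0, 4], [5, 9, 0], [7, 11, 2])
progression = [infer_chord(bar) for bar in bars]
```

The resulting chord progression information is what the user terminal 10 would transmit to the processing server 100 as the basis for a style palette sequence.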
Additionally, the user terminal 10 may perform settings so that the plug-in is allowed access to information of an existing master track on the host application. The master track is obtained by performing mixing in the DAW and mixing down to two-channel stereo, for example. The plug-in accesses the master track, analyzes the audio data by signal processing, and obtains chord progression information and the like. The user terminal 10 may transmit the analyzed information to the processing server 100 to automatically infer the optimum chord progression by machine learning or the like. Then, the processing server 100 defines a style palette sequence on the basis of this chord progression information. As a result, the user can perform assisted composition based on the style palette sequence generated by the processing server 100, so that the entire composition can be recomposed or the composition can be partially recomposed and replaced, for example.
As described above, in a case where the host application is provided with various functions, the user terminal 10 may apply the functions to the plug-in and use them for information processing according to the present disclosure. For example, as described above, the user terminal 10 can generate a style palette sequence on the basis of a chord progression generated by the DAW, and make the style palette sequence publicly available on the network to invigorate composition processing among users.
In the embodiment, it is assumed that the processing server 100 is installed on a cloud network. However, the present invention is not limited to this example, and as long as communication with the user terminal 10 is possible, the processing server 100 and the management server 200 may be installed on a network such as a local area network (LAN).
In the embodiment, an example in which the first application and the second application are installed in the user terminal 10 is shown. However, the first application and the second application may be applications installed in different devices. For example, the user terminal 10 may have the function of only the first application, and play back a sound source, for example, by controlling the second application installed on another device such as a tablet terminal or a smartphone.
The processing according to each of the above embodiments may be carried out in various different forms other than the above embodiments.
Additionally, among the processing described in each of the above embodiments, all or part of the processing described as being automatically performed can be performed manually, or all or part of the processing described as being manually performed can be performed automatically by a known method. In addition, the processing procedure, specific name, and information including various data and parameters shown in the above document and drawings can be arbitrarily changed unless otherwise specified. For example, the various information shown in the drawings is not limited to the illustrated information.
Additionally, each component of each illustrated device is a functional concept, and does not necessarily have to be physically configured as shown in the drawing. That is, the specific form of distribution or integration of each device is not limited to that shown in the drawing, and all or part of the device can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
Additionally, the above-described embodiments and modifications can be appropriately combined as long as the processing contents do not contradict each other.
Additionally, the effect described in the present specification is merely an illustration and is not restrictive. Hence, other effects can be obtained.
As described above, the information processing device (user terminal 10 in embodiment) according to the present disclosure controls a first application (host application 20 in embodiment) and a second application (plug-in 22 in embodiment) that functions as a plug-in that extends the functions of the first application. The first application includes a control unit (host application control unit 161 in embodiment) that controls the operation of the second application in the first application. The second application includes a selection unit (selection unit 166 in embodiment) that selects setting information for controlling a composition function based on machine learning, and a transmission/reception unit (transmission/reception unit 167 in embodiment) that transmits the setting information to an external server (processing server 100 in embodiment) that executes the composition function based on machine learning and receives music data composed by the external server through a network.
As described above, the information processing device according to the present disclosure handles the second application having the automatic composition function as a plug-in, and causes the external server to execute the actual composition processing. As a result, the information processing device can provide the user with an environment with good work efficiency while curbing the processing load. That is, the information processing device can improve convenience of the automatic composition function by AI.
The transmission/reception unit receives a chord in a bar of a specified length, a melody in the bar, and a bass note in the bar as music data. As a result, the information processing device can individually refer to and edit the music data, which can improve the user's convenience.
The control unit individually sets instrument information for designating a tone quality when playing back a chord, a melody, or a bass note included in the music data. As a result, the information processing device can provide various playback environments.
The control unit performs control to separately display each window displaying information regarding the chord, melody, or bass note included in the music data. As a result, the information processing device can improve convenience of the user's editing operation.
The control unit controls transmission and reception of information between each window displaying information regarding the chord, melody or bass note and a window displaying information regarding the first application, according to the user's operation. As a result, the information processing device can exchange information between the first application and the second application by operations such as drag and drop, so that convenience of the user's editing operation can be improved.
The control unit controls transmission and reception of information between windows each displaying information regarding the chord, melody, or bass note, according to the user's operation. As a result, the information processing device can improve convenience of the user's editing operation.
The selection unit selects, as setting information, designation information (style palette in embodiment) for designating material music to be learning data for machine learning. The transmission/reception unit transmits the designation information selected by the selection unit to the external server. As a result, the information processing device can execute automatic composition by designating various features that the user desires.
According to the user's operation, the selection unit selects designation information that is stored in a storage unit (storage unit 15 in embodiment) in advance and includes feature information indicating a feature of the designation information and multiple pieces of material music associated with the feature information. As a result, the information processing device can improve the convenience when the user selects the designation information.
The selection unit selects combined designation information (style palette sequence in embodiment) in which first designation information corresponding to some bars of music data composed by the external server and second designation information corresponding to some other bars thereof are combined. As a result, the information processing device can automatically generate various kinds of music.
When the transmission/reception unit receives music data composed by the external server on the basis of the combined designation information, the transmission/reception unit stores the combined designation information in association with the music data in the storage unit. As a result, the information processing device can improve the convenience when the user refers to combined designation information or the like which is the basis of music data created in the past.
The selection unit selects, as setting information, the length information of notes included in the music data composed by the external server on the basis of the designation information. The transmission/reception unit transmits the designation information and the note length information to the external server. As a result, the information processing device can generate music data having the characteristics that the user desires.
The selection unit selects, as setting information, information for determining the probability that constituent notes included in a chord appear in the melody of the music data composed by the external server on the basis of the designation information. The transmission/reception unit transmits the designation information and the information for determining the probability that the constituent notes included in a chord appear in the melody to the external server. As a result, the information processing device can generate music data having the characteristics that the user desires.
The selection unit selects, as setting information, information for determining the type and amount of material music other than the material music included in the designation information in the music data composed by the external server on the basis of the designation information. The transmission/reception unit transmits the designation information and the information for determining the type and amount of material music other than the material music included in the designation information to the external server. As a result, the information processing device can generate music data having the characteristics that the user desires.
The second application further includes a display control unit (display control unit 169 in embodiment) that performs control to retrieve the history of past music data composed by the external server from the storage unit and display the history of the past music data according to the user's operation. As a result, the information processing device can improve the convenience when the user refers to the past operation history and the like.
The display control unit performs control to retrieve the history of editing operations performed on the past music data composed by the external server from the storage unit, and also display the editing operations performed on the past music data. As a result, the information processing device can improve the convenience when the user refers to the past operation history and the like.
When the user performs a playback or editing operation on the music data composed by the external server after receiving the music data, the transmission/reception unit transmits information regarding the playback or editing operation to the external server. As a result, the information processing device can cause the processing server 100 to perform further learning on the basis of the editing and the like performed by the user.
The selection unit selects, as setting information, a chord progression in the composed music on the basis of the user's operation. The transmission/reception unit transmits the chord progression selected by the selection unit to the external server. As a result, the information processing device can provide music data that the user desires without depending on the designation information.
The control unit performs control to transmit information regarding the chord progression generated in the first application to the second application. The selection unit selects, as setting information, the information regarding the chord progression generated in the first application. The transmission/reception unit transmits the information regarding the chord progression generated in the first application to the external server, and receives music data composed on the basis of the information regarding the chord progression. As a result, the information processing device can perform composition processing utilizing the functions of the first application such as a DAW.
The information devices such as the user terminal 10, the processing server 100, and the management server 200 according to each of the above-described embodiments are implemented by a computer 1000 having a configuration as shown in
The CPU 1100 operates on the basis of a program stored in the ROM 1300 or the HDD 1400, and controls each part. For example, the CPU 1100 expands the program stored in the ROM 1300 or the HDD 1400 into the RAM 1200 and executes processing corresponding to various programs.
The ROM 1300 stores a boot program such as the basic input output system (BIOS) executed by the CPU 1100 when the computer 1000 is started, and programs that depend on the hardware of the computer 1000.
The HDD 1400 is a computer-readable recording medium that non-transitorily records programs executed by the CPU 1100 and data used by the programs. Specifically, the HDD 1400 is a recording medium for recording an information processing program according to the present disclosure, which is one example of program data 1450.
The communication interface 1500 is an interface for the computer 1000 to connect to an external network 1550 (e.g., the Internet). For example, the CPU 1100 receives data from another device or transmits data generated by the CPU 1100 to another device through the communication interface 1500.
The input/output interface 1600 is an interface for connecting an input/output device 1650 and the computer 1000. For example, the CPU 1100 receives data from an input device such as a keyboard or a mouse through the input/output interface 1600. Additionally, the CPU 1100 transmits data to an output device such as a display, a speaker, or a printer through the input/output interface 1600. Additionally, the input/output interface 1600 may function as a media interface for reading a program or the like recorded on a predetermined recording medium. The medium is, for example, an optical recording medium such as a digital versatile disc (DVD) or a phase change rewritable disk (PD), a magneto-optical recording medium such as a magneto-optical disk (MO), a tape medium, a magnetic recording medium, or a semiconductor memory.
For example, in a case where the computer 1000 functions as the user terminal 10 according to the embodiment, the CPU 1100 of the computer 1000 implements the functions of the control unit 16 and the like by executing the information processing program loaded on the RAM 1200. Additionally, the HDD 1400 stores the information processing program according to the present disclosure and the data in the storage unit 15. Note that while the CPU 1100 reads and executes the program data 1450 from the HDD 1400, as another example, these programs may be acquired from another device through the external network 1550.
Note that the present technology can also be configured in the following manner.
(1)
An information processing device that controls a first application and a second application that functions as a plug-in that extends functions of the first application, in which
the first application includes
a control unit that controls operation of the second application in the first application, and
the second application includes
a selection unit that selects setting information for controlling a composition function based on machine learning, and
a transmission/reception unit that transmits the setting information to an external server that executes a composition function based on machine learning and receives music data composed by the external server through a network.
(2)
The information processing device according to (1) above, in which
the transmission/reception unit receives a chord in a bar of a specified length, a melody in the bar, and a bass note in the bar as music data.
(3)
The information processing device according to (2) above, in which
the control unit individually sets instrument information for designating a tone quality when playing back the chord, melody, or bass note included in the music data.
(4)
The information processing device according to (3) above, in which
the control unit performs control to separately display each window displaying information regarding the chord, melody, or bass note included in the music data.
(5)
The information processing device according to (4) above, in which
the control unit controls transmission and reception of information between each window displaying information regarding the chord, melody or bass note and a window displaying information regarding the first application, according to a user's operation.
(6)
The information processing device according to (4) or (5) above, in which
the control unit controls transmission and reception of information between windows each displaying information regarding the chord, melody, or bass note, according to the user's operation.
(7)
The information processing device according to any one of (1) to (6) above, in which
the selection unit selects, as the setting information, designation information for designating material music to be learning data for the machine learning, and
the transmission/reception unit transmits the designation information selected by the selection unit to the external server.
(8)
The information processing device according to (7) above, in which
according to a user's operation, the selection unit selects designation information that is stored in a storage unit in advance and includes feature information indicating a feature of the designation information and multiple pieces of material music associated with the feature information.
(9)
The information processing device according to (7) or (8) above, in which
the selection unit selects combined designation information in which first designation information corresponding to some bars of music data composed by the external server and second designation information corresponding to some other bars thereof are combined.
(10)
The information processing device according to (9) above, in which
when the transmission/reception unit receives music data composed by the external server on the basis of the combined designation information, the transmission/reception unit stores the combined designation information in association with the music data in a storage unit.
(11)
The information processing device according to any one of (7) to (10) above, in which
the selection unit selects, as the setting information, length information of notes included in music data composed by the external server on the basis of the designation information, and
the transmission/reception unit transmits the designation information and the note length information to the external server.
(12)
The information processing device according to any one of (7) to (11) above, in which
the selection unit selects, as the setting information, information for determining the probability that constituent notes included in a chord appear in a melody of music data composed by the external server on the basis of the designation information, and
the transmission/reception unit transmits the designation information and the information for determining the probability that constituent notes included in a chord appear in a melody to the external server.
(13)
The information processing device according to any one of (7) to (12) above, in which
the selection unit selects, as the setting information, information for determining a type and amount of material music other than material music included in the designation information in music data composed by the external server on the basis of the designation information, and
the transmission/reception unit transmits the designation information and the information for determining a type and amount of material music other than material music included in the designation information to the external server.
(14)
The information processing device according to any one of (1) to (13) above, in which
the second application further includes a display control unit that performs control to retrieve a history of past music data composed by the external server from a storage unit and display the history of the past music data according to a user's operation.
(15)
The information processing device according to (14) above, in which
the display control unit performs control to retrieve a history of editing operations performed on the past music data composed by the external server from a storage unit, and also display the editing operations performed on the past music data.
(16)
The information processing device according to any one of (1) to (15) above, in which
when the user performs a playback or editing operation on music data composed by the external server after receiving the music data, the transmission/reception unit transmits information regarding the playback or editing operation to the external server.
(17)
The information processing device according to any one of (1) to (16) above, in which
the selection unit selects, as the setting information, a chord progression in composed music on the basis of a user's operation, and
the transmission/reception unit transmits the chord progression selected by the selection unit to the external server.
(18)
The information processing device according to any one of (1) to (17) above, in which
the control unit performs control to transmit information regarding a chord progression generated in the first application to the second application,
the selection unit selects, as the setting information, the information regarding the chord progression generated in the first application, and
the transmission/reception unit transmits the information regarding the chord progression generated in the first application to the external server, and receives music data composed on the basis of the information regarding the chord progression.
(19)
An information processing method executed by an information processing device that controls a first application and a second application that functions as a plug-in that extends functions of the first application, in which
the first application controls operation of the second application in the first application, and
the second application
selects setting information for controlling a composition function based on machine learning, and
transmits the setting information to an external server that executes a composition function based on machine learning and receives music data composed by the external server through a network.
(20)
An information processing program that causes an information processing device that controls a first application and a second application that functions as a plug-in that extends functions of the first application to function so that
the first application controls operation of the second application in the first application, and
the second application
selects setting information for controlling a composition function based on machine learning, and
transmits the setting information to an external server that executes a composition function based on machine learning and receives music data composed by the external server through a network.
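The selection-and-transmission flow recited in (19) and (20) above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the server URL, the JSON field names, and the `SettingInfo` structure are all hypothetical, chosen to mirror the setting information enumerated in (7), (11), (12), and (17).

```python
import json
from dataclasses import dataclass, field, asdict
from urllib import request

# Hypothetical endpoint of the external server that executes the
# machine-learning composition function; not part of the disclosure.
COMPOSE_URL = "https://example.com/compose"

@dataclass
class SettingInfo:
    """Setting information selected by the plug-in (second application)."""
    # Designation information: material music used as learning data (7).
    designation_music_ids: list = field(default_factory=list)
    # Length information of notes in the composed music data (11).
    note_length: str = "normal"
    # Probability that chord constituent notes appear in the melody (12).
    chord_tone_probability: float = 0.5
    # Chord progression selected on the basis of a user's operation (17).
    chord_progression: list = field(default_factory=list)

def request_composition(settings: SettingInfo) -> dict:
    """Transmit the setting information to the external server and receive
    the composed music data (e.g. chord, melody, and bass parts)."""
    body = json.dumps(asdict(settings)).encode("utf-8")
    req = request.Request(COMPOSE_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)
```

The received music data could then be handed back to the host application (the DAW) through the plug-in interface, and each part displayed in its own window as in (4) above.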
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/009353 | Mar 8, 2019 | WO | |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2020/166094 | 8/20/2020 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
8779268 | Serletic | Jul 2014 | B2 |
9110817 | Pachet | Aug 2015 | B2 |
20050025320 | Barry | Feb 2005 | A1 |
20070044639 | Farbood | Mar 2007 | A1 |
20090071315 | Fortuna | Mar 2009 | A1 |
20090125799 | Kirby | May 2009 | A1 |
20100031804 | Chevreau | Feb 2010 | A1 |
20120072841 | Moricca | Mar 2012 | A1 |
20120297958 | Rassool | Nov 2012 | A1 |
20140000440 | Georges | Jan 2014 | A1 |
20140053710 | Serletic, II | Feb 2014 | A1 |
20180190250 | Hiskey | Jul 2018 | A1 |
20180374461 | Serletic | Dec 2018 | A1 |
20200168197 | Silverstein | May 2020 | A1 |
20200257723 | Kano | Aug 2020 | A1 |
20220130359 | Kishi | Apr 2022 | A1 |
20220230104 | Kishi | Jul 2022 | A1 |
20220262328 | Lerman | Aug 2022 | A1 |
20220406280 | Kishi | Dec 2022 | A1 |
20220406283 | Kishi | Dec 2022 | A1 |
20230298547 | Kishi | Sep 2023 | A1 |
Number | Date | Country |
---|---|---|
101167099 | Apr 2008 | CN |
103503015 | Jan 2014 | CN |
2002169570 | Jun 2002 | JP |
2005266320 | Sep 2005 | JP |
2008541149 | Nov 2008 | JP |
2010165160 | Jul 2010 | JP |
2017219699 | Dec 2017 | JP |
10-2018-0070340 | Jun 2018 | KR |
Entry |
---|
Briot, et al., “Music Generation by Deep Learning—Challenges and Directions”, Machine Learning, Audio and Speech Processing, XP081073413, Sep. 30, 2018, 17 pages. |
Sylvain, et al., “The Smuse: An Embodied Cognition Approach To Interactive Music Composition”, International Computer Music Association, XP055900470, 2012, pp. 365-372. |
Briot, et al., “Deep Learning Techniques for Music Generation—A Survey”, Machine Learning, XP081037564, Aug. 7, 2019, 189 pages. |
Extended European Search Report of EP Application No. 19914808.1, issued on Mar. 23, 2022, 16 pages. |
Briot, et al., “Music Generation by Deep Learning—Challenges and Directions”, Neural Computing & Applications, Springer Nature, Sep. 30, 2018, 17 pages. |
International Search Report and Written Opinion of PCT Application No. PCT/JP2019/009353, issued on May 21, 2019, 8 pages of ISRWO. |
Number | Date | Country | Kind |
---|---|---|---|
20220130359 | Apr 2022 | US | A1 |
Number | Date | Country |
---|---|---|
62804450 | Feb 2019 | US |