The present invention relates to techniques for combining sound materials and audibly generating tones or musical sounds on the basis of the combined sound materials.
Heretofore, there have been known techniques for prestoring a multiplicity of fragmentary sound materials in a database and selectively combining some of the prestored sound materials to generate tones (i.e., tone waveforms) on the basis of the combined sound materials. Individual sound materials to be used for generation of tones are selected from among the multiplicity of sound materials prestored in the database. To facilitate the selection of the sound materials, the multiplicity of sound materials stored in the database are classified into categories indicative of various musical characters or features. Japanese Patent Application Laid-open Publication No. 2010-191337 discloses a technique for extracting a plurality of sound materials from continuous sound waveforms of a multiplicity of music pieces, classifying the extracted sound materials into various categories and then storing the thus-classified sound materials into a database.
A sound generation style in which tones are to be audibly generated using some of the sound materials stored in the database is determined in advance, for example, by a user or the like defining sound materials to be used for the sound generation and sound generation timing of the sound materials. Therefore, the user has to determine as many combinations of the sound materials and sound generation timing as the number of tones to be audibly generated or sounded. Thus, the longer a music piece to be created by the user setting a multiplicity of combinations of sound materials and sound generation timing, the greater becomes the amount of operation to be performed by the user.
A long music piece may contain a portion where a particular sound generation style (or sound generation content) of a predetermined time period is to be repetitively audibly generated or sounded. In such a case, a user may sometimes simplify the necessary operation by copying combinations of sound materials and sound generation timing of that portion and applying the copied combinations to another time period of the music piece. However, applying such a mere copy may undesirably result in monotonousness of the music piece.
To avoid such an inconvenience, the user may sometimes attempt to change the impression of the copied portion of the music piece without greatly changing a progressing flow of the music piece. In such a case, the user, in effect, changes the types of the sound materials to be sounded (i.e., target sound materials) without changing the sound generation timing. However, because there is a need to change the types of all of the target sound materials, this approach would end up failing to achieve simplification of the operation.
In view of the foregoing prior art problems, it is an object of the present invention to provide an improved technique which, in a case where tones are to be generated by combining sound materials, facilitates recombination of sound materials to be used so that impression of a music piece, for example, in a partial time period of the music piece can be changed with ease.
In order to accomplish the above-mentioned object, the present invention provides an improved sound generation control apparatus, which comprises: a display control section which displays, on a display screen, an image of an icon placement region having a time axis and which displays, in the icon placement region, an icon image, with which feature amount information descriptive of a feature of material data comprising a waveform of a sound material is associated, in association with a desired time position on the time axis; a setting section which sets, in association with a desired time range on the time axis of the icon placement region, a particular database to be used, the particular database being selected from among a plurality of types of databases that store material data in association with feature amount information; and a sound generation control section which acquires, on the basis of the feature amount information associated with the icon image, the material data from the database set in association with the time range containing the time position where the icon image is placed, and which generates tone data on the basis of the acquired material data and the time position where the icon image is placed.
With the aforementioned arrangements, the present invention can change material data to be retrieved from a desired database, by changing a database, associated with or corresponding to an icon image displayed at a desired position on the time axis, over to another desired one of the plurality of types of databases without changing feature amount information, i.e. by changing only the database from one type to another (namely, changing only the database type setting). Thus, even in a case where the user does not have expert knowledge about data structures, file structures and/or the like of the feature amount information, the present invention allows the user to readily perform recombination of sound materials to be used.
The present invention may be constructed and implemented not only as the apparatus invention discussed above but also as a method invention. Also, the present invention may be arranged and implemented as a software program for execution by a processor, such as a computer or DSP, as well as a non-transitory storage medium storing such a software program. In this case, the program may be provided to a user in the storage medium and then installed into a computer of the user, or delivered from a server apparatus to a computer of a client via a communication network and then installed into the client's computer. Further, the processor used in the present invention may comprise a dedicated processor with dedicated logic built in hardware, not to mention a computer or other general-purpose processor capable of running a desired software program.
The following will describe embodiments of the present invention, but it should be appreciated that the present invention is not limited to the described embodiments and various modifications of the invention are possible without departing from the basic principles. The scope of the present invention is therefore to be determined solely by the appended claims.
Certain preferred embodiments of the present invention will hereinafter be described in detail, by way of example only, with reference to the accompanying drawings, in which:
<Overall Construction>
The information processing terminal 10 is, for example, a portable telephone, tablet terminal, or PDA (Personal Digital Assistant). As shown in
<Construction of the Server Apparatus 50>
The control section 51 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), etc. The control section 51 performs various functions by executing various programs stored in the ROM or storage section 55. In the illustrated example, the control section 51 executes a search program or extraction program in response to an instruction given via the information processing terminal 10. Through execution of the search program, the control section 51 performs a function of searching through a feature amount database (sometimes referred to also as “feature amount DB”) in response to an instruction given via the information processing terminal 10 and transmitting identified (searched-out) material data to the information processing terminal 10. The extraction program performs a function of extracting material data, which becomes a sound material, from clipped data transmitted from the information processing terminal 10 and then storing the extracted material data into the storage section 55. Details of these functions will be described later.
Under control of the control section 51, the communication section 54 is connected to the communication line 1000 to communicate information with communication devices, such as the information processing terminal 10. The control section 51 may update information, stored in the storage section 55, with information acquired via the communication section 54. The communication section 54 may include an interface connectable with external devices in a wired or wireless fashion, without being limited to performing communication via the communication line 1000.
The storage section 55, which comprises a hard disk, non-volatile memory and/or the like, includes not only a storage area for storing the feature amount DB and a clipped data database (hereinafter referred to also as “clipped data DB”) but also a storage area for storing various programs, such as the search program and extraction program.
The clipped data DB is a database for storing a multiplicity of clipped data obtained by extracting (clipping) parts of tone waveforms. Each clipped data is data a part or whole of which is used as material data indicative of a sound material.
The feature amount DB comprises a plurality of types of feature amount databases that are represented by DBa, DBb, . . . . However, the feature amount databases will be collectively referred to as “feature amount DB” when they are explained without having to be particularly distinguished from one another. The feature amount DB is prestored in the storage section 55, and any new type of feature amount DB may be acquired from an external device via the communication section 54 and then additionally stored into the storage section 55.
The feature amount information (each of the pieces of feature amount information) comprises a plurality of types of feature amounts p1, p2, . . . descriptive of or defining one material data corresponding thereto. The feature amounts descriptive of the material data are values obtained by analyzing the material data (i.e., clipped data of the tone waveform), such as intensity of individual frequency regions (e.g., high-frequency region, medium-frequency region and low-frequency region) of the sound material (tone waveform) represented by the material data, a time point when an amplitude peak is reached (e.g., time point from the head of the material data), peak amplitude intensity, degree of harmony, complicatedness, and the like. For example, the value of one feature amount p1 is indicative of intensity of the high-frequency region of the sound material. In the following description, a plurality of pieces of feature amount information P of individual material data are indicated by different labels Pa, Pb, . . . . As apparent from the foregoing, each of the plurality of different pieces of feature amount information Pa, Pb, . . . comprises a set of feature amounts p1, p2, . . . specific to the corresponding material data.
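By way of illustration only, one piece of feature amount information as described above may be modeled as a fixed-size vector of feature amounts. The following Python sketch is an assumption for explanatory purposes; the class name, the number of feature amounts and the example values are not taken from the embodiment.

```python
from dataclasses import dataclass

# Hypothetical model of one piece of feature amount information: a vector of
# feature amounts p1, p2, ... obtained by analyzing the clipped waveform.
@dataclass(frozen=True)
class FeatureAmountInfo:
    p1: float  # e.g., intensity of the high-frequency region
    p2: float  # e.g., intensity of the medium-frequency region
    p3: float  # e.g., intensity of the low-frequency region
    p4: float  # e.g., time (ms from the head) at which the amplitude peak is reached

    def as_vector(self):
        # Treating the feature amount information as a vector amount makes
        # similarity comparisons (described later) straightforward.
        return (self.p1, self.p2, self.p3, self.p4)

# Two distinct pieces of feature amount information, labeled Pa and Pb.
Pa = FeatureAmountInfo(0.8, 0.3, 0.1, 12.0)
Pb = FeatureAmountInfo(0.2, 0.4, 0.9, 250.0)
```

Under this sketch, each material data stored in a feature amount DB would carry one such vector as its key for search and classification.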
Note that, even for same feature amount information P, material data associated with the feature amount information P may differ among the plurality of types of feature amount databases DBa, DBb, . . . . For example, material data that are retrievable from the different feature amount databases DBa and DBb in response to access with the feature amount information P of same content are different from each other although they are similar to each other. Thus, switching can be made among the different material data by changing the feature amount database to be accessed with the feature amount information P, without the feature amount information P being changed in content.
As noted above, the material data are classified into categories or classes in accordance with the content of the feature amount information. More specifically, material data (sound materials) similar in auditory character are classified into a same category. Examples of the categories include a category (class A) into which material data are classified as sounds having a clear attack and a strong edge feeling (e.g., edge sounds), and a category (class B) into which material data are classified as sounds sounding as noise (e.g., texture sounds).
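A category assignment of the kind described above could, for example, be realized as a simple rule over feature amount values. The rule and thresholds below are purely illustrative assumptions, not values from the specification; they merely show how content of feature amount information can drive classification into class A (edge sounds) and class B (texture sounds).

```python
# Illustrative-only classification rule: sounds with a clear attack and a
# strong edge feeling fall into class A (edge sounds); noise-like sounds
# fall into class B (texture sounds). The thresholds are assumptions.
def classify(attack_sharpness: float, noisiness: float) -> str:
    if attack_sharpness > 0.7 and noisiness < 0.3:
        return "class A"   # clear attack, strong edge feeling
    if noisiness > 0.6:
        return "class B"   # sounds sounding as noise
    return "unclassified"
```

In practice such rules would operate on the feature amounts p1, p2, . . . stored in the feature amount DB, so that material data similar in auditory character land in the same category.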
In the feature amount DB of
<Construction of the Information Processing Terminal 10>
The control section 11 includes a CPU, RAM, ROM, etc., and performs various functions by executing various programs stored in the ROM or storage section 15. In the illustrated example, the control section 11 executes a sequence program, similar-sound replacement program or template sequence program in accordance with an instruction given by the user. Through execution of the sequence program, the control section 11 performs, in accordance with an instruction input by the user, a function of generating sequence data for combining material data to audibly generate tones on the basis of the combined material data, and acquires material data searched out and identified by the server apparatus 50 to audibly generate the material data through the speaker 161. The similar-sound replacement program performs a function of causing the server apparatus 50 to extract desired material data, which becomes a sound material, for example from music piece data prepared in advance, acquiring, from the database, material data similar in feature amount information to the extracted material data and replacing the extracted material data of the music piece data with the acquired similar material data to thereby modify the music piece data so that the modified music piece data is audibly generated through the speaker 161. The template sequence program performs a function of audibly generating material data, similar in feature amount information to the extracted material data, in accordance with a template. Details of such functions will be described later.
The operation section 12 includes a touch sensor 121 and an operation button 122 via which the user performs desired operation (i.e., which receives desired operation by the user), and it outputs, to the control section 11, operation information indicative of content of the received user's operation. Thus, the user's instruction is input to the information processing terminal 10.
The display section 13, which is a display device, such as a liquid crystal display, displays various content, corresponding to control performed by the control section 11, on a display screen 131. Namely, various content, such as a menu screen, setting screen etc., are displayed on the display screen 131 depending on the executed programs (see
Under control of the control section 11, the communication section 14 is connected to the communication line 1000 to communicate information with a communication device, such as the server apparatus 50. The control section 11 may update information stored in the storage section 15 with information acquired via the communication section 14. Further, the communication section 14 may include an interface connectable with external devices in a wired or wireless fashion, without being limited to performing communication via the communication line 1000.
The storage section 15 includes a temporary storage area in the form of a volatile memory, and a non-volatile memory. Music piece data to be used in a later-described program, a program to be executed, etc. are temporarily stored in the temporary storage area. The non-volatile memory includes storage areas storing a music piece database (hereinafter referred to also as “music piece DB”), extracted data, material database (hereinafter referred to also as “material DB”), sequence data and template data, and a storage area storing various programs, such as the above-mentioned sequence program, similar-sound replacement program and template sequence program. Although the various data stored in the non-volatile memory are prestored in the storage section 15, other data may be acquired from an external device via the communication section 14 and additionally stored into the non-volatile memory. Further, new sequence data and template data created by the user in a later-described manner may also be stored into the storage section 15.
The music piece DB is a database having stored therein music piece data (music piece data A, music piece data B, . . . ) indicative of waveforms of various music pieces. The material DB is a database having stored therein replacing material data (material data W1, material data W2, . . . ) transmitted from the server apparatus 50 as a result of the server apparatus 50 executing the search program.
The DB designating data is data designating or setting, for a given reproduction time range, a desired type of feature amount DB which should become an object of search (i.e., search-target feature amount DB) through which the server apparatus 50 searches to identify material data. More specifically, in the illustrated example of
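The DB designating data described above can be pictured as a list of time-range-to-DB-type associations. The following Python sketch is an illustrative assumption (the entry layout, time units and function name are not taken from the embodiment); it shows how every search-target feature amount DB covering a given reproduction time point would be looked up, including the case where ranges overlap.

```python
# A sketch of DB designating data: each entry associates a reproduction time
# range (here in seconds) with the type of feature amount DB to search.
# Ranges may overlap, in which case more than one DB is a search target.
db_designating_data = [
    (0.0, 2.0, "DBa"),   # time point t0 .. t2: feature amount database DBa
    (1.0, 3.0, "DBc"),   # time point t1 .. t3: feature amount database DBc
]

def search_target_dbs(t: float, designations) -> list:
    """Return every feature amount DB type whose time range contains t."""
    return [db for (start, end, db) in designations if start <= t < end]
```

With these example entries, a time point inside the overlap of the two ranges yields both DBa and DBc as search targets, mirroring the overlapping DB images described later.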
Referring back to
<Functional Arrangements>
The following paragraphs describe functions implemented by the control section 11 of the information processing terminal 10 executing the sequence program and by the control section 51 of the server apparatus 50 executing the search program in response to the execution, by the control section 11, of the sequence program. Note that one, some or all of the arrangements for implementing the following functions may be implemented by hardware.
In response to an instruction input by the user, the display control section 110 controls displayed content on the display screen 131. In this case, content as shown in
Icon images s1, s2, . . . are images with which various feature amount information is associated. With the icon images s1, s2, . . . displayed or placed in the icon placement region ST, sound generation timing of a sound based on the feature amount information corresponding to any one of the icon images is defined in accordance with a position along the time axis (i.e., time-axial position) of the left end of the icon image. Further, a sound volume is defined in accordance with a position, along the sound volume axis, of the lower end of the icon image. Types of designs of the individual icon images s1, s2, . . . are determined so as to differ depending on the categories (class A, class B, . . . ) into which the various feature amount information associated with, or corresponding to, the icon images is classified. For example, the feature amount information corresponding to the icon image s1 and the feature amount information corresponding to the icon image s2 are classified into different categories, but the feature amount information corresponding to the icon image s2 and the feature amount information corresponding to the icon image s4 are classified into a same category. Note, however, that the icon images need not necessarily differ in design depending on the categories; namely, all of the icon images may be of a same design. Alternatively, the icon images may be controlled to differ in design from one another in accordance with another parameter than the category.
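The mapping from an icon image's placement to a sound generation event could be sketched as below. The linear scales and parameter names are assumptions for illustration only; the embodiment merely requires that the left end's time-axial position define the timing and the lower end's position on the sound volume axis define the volume.

```python
# Sketch (assumed linear mappings): decode an icon image's placement in the
# icon placement region ST into a (sound generation time, volume) pair.
def icon_to_event(left_px: float, bottom_px: float,
                  px_per_second: float = 100.0,
                  px_per_volume: float = 200.0):
    time_sec = left_px / px_per_second     # left end position -> timing
    volume = bottom_px / px_per_volume     # lower end position -> volume (0..1)
    return (time_sec, volume)
```

For example, an icon whose left end sits 150 pixels from the origin and whose lower end sits 100 pixels up the volume axis would, under these assumed scales, sound at 1.5 seconds with half volume.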
DB images d1, d2, . . . are each an image indicative of a time range, designatable as desired, with which a desired type of feature amount DB can be associated. Each of such time ranges can be set at and changed to a desired position and length in response to user's operation or in accordance with sequence data or the like. Such DB images d1, d2, . . . are displayed or placed in the DB placement region DT, and a time period (time range) in which the feature amount DB corresponding to any one of the DB images is to be applied as an object of search (search-target feature amount DB) by the server apparatus 50 is defined in accordance with a time axial (left-end-to-right-end) position of the DB image. For example, a range from time point t0 to time point t2 is defined as a time range in which the feature amount database DBa is to be applied as a search-target feature amount DB, and a range from time point t1 to time point t3 is defined as a time range in which the feature amount database DBc is to be applied as a search-target feature amount DB. Namely, in the range from time point t0 to time point t2, both the feature amount database DBa and the feature amount database DBc are applied as the search target feature amount database.
Further, on the display screen 131 are displayed: tempo control buttons b1 for setting a reproduction tempo; a conversion instruction button b2 for instructing conversion from sequence data into tone data on the basis of a placement style of an icon image in the icon placement region; and a reproduction instruction button b3 for sounding or audibly generating the converted tone data. Note that a storage button for causing created sequence data to be stored into the storage section 15, and the like, may also be displayed.
Referring back to
The display control section 110 and the setting section 120 generate sequence data in accordance with an icon image placement style and a feature amount DB type setting style. Here, the display control section 110 generates feature amount designating data of the sequence data, while the setting section 120 generates DB designating data of the sequence data. Content of the sequence data may be determined each time an icon image is placed or a feature amount DB is set, or when the conversion instruction button b2 is operated. Once the above-mentioned storage button is operated by the user while the storage button is displayed on the display screen 131, the sequence data generated as above is stored into the storage section 15.
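The assembly of sequence data from the two sections' contributions could be pictured as follows. The field names and tuple layouts are hypothetical, introduced only to show that the feature amount designating data (icon placements) and the DB designating data (DB type settings) travel together as one sequence data set.

```python
# Hypothetical serialization of sequence data: the display control section
# contributes feature amount designating data (one entry per placed icon
# image), and the setting section contributes DB designating data (one entry
# per placed DB image).
def build_sequence_data(icon_events, db_settings) -> dict:
    return {
        # (sound generation time, volume, feature amount information label)
        "feature_amount_designating_data": sorted(icon_events),
        # (range start, range end, feature amount DB type)
        "db_designating_data": sorted(db_settings),
    }

seq = build_sequence_data(
    icon_events=[(1.0, 0.5, "Pc"), (0.0, 0.8, "Pa")],
    db_settings=[(0.0, 2.0, "DBa"), (1.0, 3.0, "DBc")],
)
```

Sorting by time is an illustrative choice; what matters is that each piece of feature amount information can be put into correspondence, on the time axis, with the feature amount DB type(s) set for its time range.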
Once the conversion instruction button b2 is operated by the user, the sound generation control section 130 transmits a part or whole of the generated sequence data to the server apparatus 50 via the communication section 14, so that the control section 51 of the server apparatus 50 activates and executes the search program. The “part of the sequence data” means data in which at least feature amount information and a type of feature amount DB having correspondence relationship in time axis with the feature amount information (type information) are associated with each other. Then, the sound generation control section 130 receives material data from the server apparatus 50 via the communication section 14 and outputs tone data by means of the data output section 140 on the basis of the received material data and sequence data. More specifically, the sound generation control section 130 processes, i.e. changes the level of, the material data, corresponding to one of the icon images, in accordance with a sound volume with reference to feature amount designating data of the sequence data, and causes the processed material data to be output as tone data, via the data output section 140 under control of the sound generation control section 130, at timing corresponding to a reproduction time point.
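The level-change-and-output step could be sketched as mixing the level-scaled material data into an output buffer at the sample offset of its reproduction time point. The buffer representation, sample rate and function name below are assumptions for illustration, not details of the embodiment.

```python
# Minimal sketch of the output stage: the received material data (a plain
# list of samples here) is level-scaled by the icon's sound volume and mixed
# into an output buffer at the offset of its reproduction time point.
def mix_material(out_buf, material, volume, time_sec, sample_rate=8):
    offset = int(time_sec * sample_rate)
    for i, sample in enumerate(material):
        out_buf[offset + i] += sample * volume   # level change + placement
    return out_buf

# Place a two-sample material at t = 1.0 s with half volume.
buf = mix_material([0.0] * 16, [1.0, 1.0], volume=0.5, time_sec=1.0)
```

A real implementation would of course stream audio rather than fill a small list, but the two operations shown — scaling by the volume and positioning at the reproduction time point — are the ones the sound generation control section 130 performs.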
The identification section 510 receives, via the communication section 54, information based on the sequence data transmitted from the sound generation control section 130, searches through a feature amount DB of a type (search-target type) indicated by the received information, and identifies, for each piece of feature amount information included in the sequence data, material data having feature amount information matching (or identical to or similar to) that feature amount information included in the sequence data. In the illustrated example, the identification section 510 handles the feature amount information as a vector amount composed of a plurality of feature amounts and references a feature amount DB of a search-target type to identify material data having feature amount information that has the smallest Euclidean distance.
Note that any other conventionally-known algorithm than the aforementioned may be employed for determining a similarity (matching degree). As another alternative, the identification section 510 may identify material data whose feature amount information has the second or third smallest Euclidean distance rather than the smallest Euclidean distance, i.e. whose feature amount information is the second or third closest to the feature amount information included in the sequence data. Information necessary for such identification may be set in advance by the user or the like. Further, the material data to be identified need not necessarily be similar in feature amount information to the feature amount information included in the sequence data as long as it is in a particular predetermined relationship with the feature amount information included in the sequence data. The search target may be further narrowed down by the category rather than being limited to the search-target feature amount DB. In such a case, the category that becomes a search target may be designated for example by the user, or may be a same category as, or a related category to, the feature amount information included in the sequence data. Here, the “related category” may be determined by a preset algorithm, or mutually-related categories may be set in advance.
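The nearest-neighbor identification described above, including the option of taking the second or third closest entry, admits a compact sketch. The DB layout (a mapping from material data identifier to feature vector) and the `rank` parameter are illustrative assumptions.

```python
import math

# Sketch of the identification step: treat each piece of feature amount
# information as a vector and pick the entry of the search-target feature
# amount DB at the requested rank by Euclidean distance
# (rank 1 = smallest distance, rank 2 = second smallest, and so on).
def identify(query, db, rank=1):
    """db: mapping material data id -> feature amount vector."""
    ranked = sorted(db, key=lambda mid: math.dist(query, db[mid]))
    return ranked[rank - 1]

# A toy search-target feature amount DB with three material data entries.
DBa = {"W1": (0.0, 0.0), "W2": (1.0, 1.0), "W3": (3.0, 3.0)}
```

With a query vector of (0.9, 0.9), rank 1 identifies W2 and rank 2 identifies W1 — the same mechanism by which the identification section 510 can be configured to return the closest, second-closest or third-closest material data.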
Then, the identification section 510 transmits the identified material data to the sound generation control section 130 via the communication section 54. In the illustrated example, as seen from the above, the communication section 54 functions as an acquisition means for acquiring information based on the sequence data through communication means, and as an output means for outputting the identified material data by transmitting the identified material data through the communication means.
The foregoing has been a description about the functional arrangements of the information processing terminal 10 and server apparatus 50. The following describes, with reference to
<Behavior During Execution of the Sequence Program>
A sequence for generating tones is created at step S110 in response to the user inputting an instruction for determining content of feature amount information, an instruction for displaying or placing, in the icon placement region ST, icon images of designs depending on the determined content, and an instruction for displaying or placing DB images in the DB placement region DT. As a consequence, the content shown in
Once the user inputs a conversion instruction by operating the conversion instruction button b2 at step S120, the sound generation control section 130 of the information processing terminal 10 transmits the sequence data to the server apparatus 50 at step S130. The sequence data to be transmitted here need not include all of predetermined information as long as it includes data having a portion where the feature amount information and types of feature amount DBs having predetermined correspondence relationship in time axis with the feature amount information are associated with each other.
Upon receipt of the sequence data from the information processing terminal 10, the server apparatus 50 executes the search program so that the identification section 510 searches through the feature amount DB to identify material data at step S140. For example, for the feature amount information Pc corresponding to the icon image s3, the identification section 510 searches through the feature amount database DBc of a particular type having predetermined correspondence relationship in time axis with the feature amount information Pc, retrieves material data identified as having feature amount information similar to the feature amount information Pc and transmits the retrieved material data to the information processing terminal 10 at step S150. At that time, the server apparatus 50 transmits the identified material data in such a manner as to permit identification as to which of the feature amount information the identified material data corresponds to.
Upon completion of the receipt of the material data, the information processing terminal 10 informs the user to that effect. Then, once the user inputs a reproduction instruction by operating the reproduction instruction button b3 at step S160, the sound generation control section 130 controls the data output section 140. The sound generation control section 130 adjusts the sound volume of the received material data with reference to feature amount designating data of the sequence data and causes the volume-adjusted material data to be output as tone data in accordance with a reproduction time point of the corresponding feature amount information (step S170), so that the material data is sounded or audibly generated through the speaker 161.
In the aforementioned manner, tone data are output from the information processing terminal 10 in accordance with the user-created sequence. Note that the type of feature amount DB that becomes a search target (search-target feature amount DB type) is defined by the DB designating data. Therefore, the search-target feature amount DB type can be changed to another by the user only changing the DB designating data, and thus, in this case, even when the feature amount designating data is not changed in content, the material data identified by the identification section 510 also changes; accordingly, content to be audibly generated in accordance with the user's reproduction instruction also changes. At that time, even if one material data changes to another, the feature amount information does not change, and thus, in most cases, material data can be identified, starting with material data classified into the same category as the feature amount information, without the sound of the material data changing to a completely different sound. Therefore, in the case where the types of feature amount DBs correspond to genres (jazz, rock, etc.), it is possible to change an impression of generated tones or sounds, for example, to a jazz-like or rock-like impression, by the user only changing the DB designating data while maintaining the same sound generation style or content (e.g., pattern of tones).
The following describes, with reference to
<Behavior During Execution of the Similar-Sound Replacement Program>
Once the user operates the music piece data selection button bs2, the control section 11 displays, on the display screen 131, a list of music piece data (i.e., music piece data sets) stored in the music piece DB, although not particularly shown. Then, once the user inputs an instruction for selecting one music piece data (music piece data set) from the list, the control section 11 determines the selected music piece data as tone data (see step S210 of
On the other hand, once the user operates the recording selection button bs1, the control section 11 switches the content displayed on the display screen 131 to the content shown in
Once the user operates the recording start button brs, the control section 11 switches the content displayed on the display screen 131 to the content shown in
In this state, the user inputs, via the microphone 162, sounds to be set as tone data. Once the user operates the recording stop button bre after termination of the sound input, the control section 11 terminates the accumulation of the data indicative of the sounds input via the microphone 162 and then switches the content displayed on the display screen 131 to the content shown in
Then, once the user operates again the recording start button brs, the control section 11 starts again the recording, in which case it starts accumulation of new data indicative of sounds either after discarding the so-far accumulated data or without discarding the so-far accumulated data. Once the user operates the enter button bf, on the other hand, the control section 11 determines the data, so far accumulated by the recording, as tone data (see step S210 of
The control section 11 determines, as tone data, the music piece data or the data accumulated by the recording (step S210) in the aforementioned manner and then switches the content displayed on the display screen 131 to content shown in
On the display screen 131 are also displayed range designating arrows (start designating arrow “as” and end designating arrow “ae”) for designating a data range (clipped data range) tw to be transmitted to the server apparatus 50. Once the user designates positions of the range designating arrows, a range between the designated positions is designated as the data range tw. A time display twc is indicative of a time of the data range tw. The ranges may be designated in any other suitable manners than the aforementioned; for example, the number of beats and times may be input in numerical values by some input means.
On the display screen 131 are also displayed a setting button bk for setting the designated data range tw, a return button br for receiving a user's instruction for returning to a last (immediately preceding) screen and a reproduction button by for receiving a user's instruction for reproducing the tone data of the designated data range tw so that the tone data is output through the speaker 161. The above-mentioned positions may be designated, for example, by the user touching the range designating arrows with two fingers and spreading out, narrowing and/or sliding the two fingers on the display screen 131. The range designating arrows may be displayed in a superposed relation to the waveform wd2, more specifically on or near the centerline of the waveform wd2. With the range designating arrows displayed in a superposed relation to the waveform wd2 like this, the start designating point and the end designating point can be readily identified intuitively; thus, each of the range designating arrows need not necessarily be an arrow icon and may be any desired icon that visually indicates where to touch. Further, the range designating arrows may be partly transparent or semitransparent (translucent) in such a manner that the start designating point and end designating point of the waveform can be identified with ease.
Once the user operates the reproduction button by after designating a data range tw, the control section 11 reproduces only the tone data of the designated data range tw so that the tone data is audibly output via the speaker 161. If the user operates the setting button bk after designating a data range tw, then the control section 11 sets the data range tw as an object of material data extraction by the server apparatus 50 (step S220 of
Note that the DB designating data used during execution of the sequence program may also be used in the similar-sound replacement program. In such a case, the DB designating data is also transmitted to the server apparatus 50.
Once the server apparatus 50 receives the clipped data, the control section 51 of the server apparatus 50 executes the extraction program to extract material data from the tone data of the data range tw (step S240). As one example method of extracting the material data, an On-set point where a sound volume varies by more than a predetermined amount may be detected from the clipped data, and a portion that is located within a predetermined time range from the detected On-set point and that has a feature amount satisfying a particular condition may be extracted as the material data. Although any one of the conventionally-known methods may be used for extracting the material data from the clipped data indicative of tones, it is preferable to use the method disclosed in Japanese Patent Application Laid-open Publication No. 2010-191337.
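As a non-limiting sketch of the On-set detection idea, the sound volume may be approximated by the mean absolute amplitude per frame, and an On-set point reported wherever that level rises by more than a predetermined amount. The frame size, threshold value and function name below are illustrative assumptions; as noted above, the embodiment permits any conventionally-known extraction method.

```python
def detect_onsets(samples, frame_size=512, threshold=0.1):
    """Detect On-set points as frame boundaries where the mean absolute
    amplitude (a rough volume measure) rises by more than `threshold`
    relative to the previous frame. A simplified sketch only.
    """
    onsets = []
    prev_level = None
    for i in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[i:i + frame_size]
        level = sum(abs(s) for s in frame) / frame_size
        if prev_level is not None and level - prev_level > threshold:
            onsets.append(i)  # sample index where the volume jump begins
        prev_level = level
    return onsets
```

A portion located within a predetermined time range from each reported index, whose feature amount satisfies a particular condition, would then be extracted as the material data.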
Then, the control section 51 registers, into the storage section 55, data related to the extracted material data (step S250). More specifically, the control section 51 registers the received clipped data into the clipped data DB and registers the data range, feature amount information and its class into the feature amount DB. Which one or ones of the types of feature amount DBs the data range, feature amount information and its class should be registered into may be designated in advance by the user. If the clipped data is a part of music piece data and a genre corresponding to the music piece data is acquirable, the clipped data may be associated with the genre.
Registration of material data at step S250 may be dispensed with, and whether the registration of material data should be performed or not may be designated in advance by the user.
Then, the control section 51 executes the search program to identify material data similar in feature amount information to individual material data extracted from the clipped data (step S260). More specifically, in the illustrated example, the control section 51 calculates, for each of the material data extracted from the clipped data, feature amounts and searches for and identifies five material data, similar in feature amount information to the extracted material data, from the feature amount DB. The material data identification may be performed here in the same manner as performed in the identification section 510. Namely, the control section 51 may identify, for each of the extracted material data, the five material data with the closest feature amount information to the feature amount information of the extracted material data, i.e. in ascending order of Euclidean distance from the feature amount information of the extracted material data. Note that the information registered at step S250 is excluded from the search range.
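The identification of the five closest material data can be sketched as a simple nearest-neighbor search over the feature amount DB. The data layout (a mapping from material IDs to feature vectors) and the function name are assumptions made for illustration only.

```python
import math

def find_similar(query, db, k=5):
    """Return the k entries of `db` whose feature vectors are closest to
    `query` in Euclidean distance, i.e. in ascending order of distance,
    which is descending order of similarity. `db` maps material IDs to
    feature vectors; this layout is illustrative.
    """
    def dist(vec):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, vec)))
    ranked = sorted(db.items(), key=lambda item: dist(item[1]))
    return [material_id for material_id, _ in ranked[:k]]
```

In the embodiment, entries registered at step S250 would additionally be excluded from `db` before the search.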
Once the control section 51 identifies the five material data similar in feature amount information to the extracted material data in the aforementioned manner, it transmits not only these five material data but also information for forming extracted data as shown in
If DB designating data has been received from the information processing terminal 10, the control section 51 of the server apparatus 50 determines, on the basis of the received DB designating data, a type of feature amount DB that becomes a search target at the time of identifying material data, in the same manner as in the search program executed in the server apparatus 50 in response to execution of the sequence program. At that time, such a type of feature amount DB may be determined on the basis of a data position, in the clipped data, of the extracted material data. For example, a reproduction time range in the DB designating data may be designated using information indicative of a data position of the clipped data.
Once the information processing terminal 10 receives this information, the control section 11 of the information processing terminal 10 stores the extracted data into a temporary storage area of the storage section 15 and displays, on the display screen 131, content as shown in
The icon trains bk1, bk2, . . . are each in the form of a row of images of a design corresponding to a category into which the material data is classified in accordance with its feature amount information. Namely, each of the images corresponds to an icon image with which the feature amount information is associated. Although the image designs may be other than those shown in
Further, the icon trains bk1, bk2, . . . include an original sound material row bki in which icon images indicative of extracted material data are arranged, and similar sound material rows bkr in which icon images indicative of replacing material data are arranged. In the similar sound material rows bkr of
Once the user inputs an instruction for designating a position on any one of the icon trains bk1, bk2, . . . shown in
Further, once the user instructs reproduction by operating the reproduction button by (step S290 of
Note that the user may adjust the start or end time of the extracted material data by adjusting the time-axial length of the extraction windows wk1, wk2, . . . . If the start or end time of the extracted material data has been adjusted like this, the information processing terminal 10 may transmit information indicative of the changed start or end time to the server apparatus 50, and the control section 51 of the server apparatus 50 may perform the operation of step S250 on the extracted material data as changed material data.
The foregoing has been a description about the behavior of the sound generation control system 1 during execution of the similar-sound replacement program.
<Behavior During Execution of the Template Sequence Program>
To execute the template sequence program, the user operates a shift instruction button bts that receives user's operation for instructing a shift to the template sequence program on the aforementioned screen of
Each of the tracks tb1, tb2, tb3 and tb4 indicates, in a horizontal direction of the screen, individual sound generation timing for 16 beats of one measure; namely, the sound generation timing progresses sequentially, one beat by one beat, from the left-end icon. Namely, the horizontal axis direction in the track region represents the time axis as in the above-mentioned icon placement region ST. Each sound generation timing is indicated by a rectangular icon image in
For example, in the first track tb1, the material data corresponding to the extraction window wk1 is sounded at the first beat, material data identified to be the third most similar to the material data corresponding to the extraction window wk1 is sounded at the sixth beat, and material data identified to be the first most similar to the material data corresponding to the extraction window wk1 is sounded at the tenth beat.
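The track-to-material allocation described above can be modeled, purely for illustration, as a list of (beat, rank) pairs per track, where rank 0 denotes the extracted material itself and rank n the n-th most similar material. All names below are hypothetical; the pairs mirror the example given for the first track tb1.

```python
# Track tb1: original material at the first beat, the third most similar
# material at the sixth beat, the first most similar at the tenth beat.
template_tb1 = [(1, 0), (6, 3), (10, 1)]

def render_track(template, original, similar):
    """Map each (beat, rank) pair to the material to be sounded: rank 0
    selects the original extracted material, rank n selects similar[n-1].
    """
    schedule = {}
    for beat, rank in template:
        schedule[beat] = original if rank == 0 else similar[rank - 1]
    return schedule
```

Reproduction would then step through the beats of one measure and sound whatever material the schedule assigns to each beat.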
By operating (i.e., touching) any one of the sound generation icon images tbs to change the numerical value of the icon image tbs, the user can cyclically select the type of the corresponding similar sound. Whereas the example of
Further, whereas the templates have been described above as predefined by the template sequence program, the user may create templates or modify or process existing templates. Templates created or processed like this may be stored into the storage section 15 as noted above so that they are read out and used in response to the user subsequently executing the template sequence program. Furthermore, the number of the icon images displayed in the track region TT may be increased or decreased in accordance with the total number of beats, or the icon images may be displayed in a scrolling manner. Furthermore, newly-created templates as well as templates prepared in advance may be used. In such a case, the user newly sets feature amount information and determines material data similar to the newly-set feature amount information from among the material data extracted at above-mentioned step S240 of
A slider provided in a lower left portion of the screen is slidable by the user to designate a desired performance tempo. A template button tn provided in a lower right portion of the screen is operable to select a desired one of the templates. Each time the user touches the template button tn, the template of one template number changes to the template of the next template number. Thus, by sequentially touching the template button tn, the user can select a desired type of template; in the illustrated example, the template of template number “2” is currently selected.
Types of similar sounds may be indicated by different brightness or color density of the sound generation icon images tbs instead of the numerical values indicated in the sound generation icon images tbs. Similarly, correspondence relationship between the tracks tb1, tb2, tb3 and tb4 and the extraction windows wk1, wk2, wk3 and wk4 may be indicated by different colors or the like instead of the alphabetical letters.
Allocation, by the template, of material data to the individual tracks may be effected as follows. At the time of transmission of clipped data to the server apparatus 50 during execution of the similar-sound replacement program, the information processing terminal 10 also transmits information of the individual templates to the server apparatus 50; the server apparatus 50 then allocates, to the tracks, material data similar to the feature amount information of the individual templates and transmits the allocated material data to the information processing terminal 10 together with extracted data. Then, the information processing terminal 10 allocates correspondence relationship between the tracks and the material data by use of an allocation table and stores the correspondence relationship between the material data and the tracks into the temporary storage region. In the illustrated example of
Although the sixteen templates commonly differ from one another in feature amount and sound generation timing, some of the templates may share the same feature amounts or sound generation timing with other templates so that the quantity of data communicated between the server apparatus 50 and the information processing terminal 10 and the quantity of calculations performed in the server apparatus 50 can be reduced.
Once the user instructs reproduction by operating the reproduction button by while the screen of the content shown in
Then, in response to an instruction input by the user, the control section 11 stores, into the non-volatile memory of the storage section 15, the data of the individual templates and the allocation table as a single file with a file name designated therefor, so that the thus-stored file can be read out by the information processing terminal 10 alone using the file name.
<Example Application to DAW>
The information processing terminal 10 has been described above as applied to a tablet terminal, portable telephone, PDA or the like. As another example application, the individual functions of the information processing terminal 10 may be implemented by application software called “DAW” (Digital Audio Workstation) being run on an OS (Operating System) of a PC (Personal Computer). Namely, the information processing terminal 10 can be implemented as a music processing apparatus by means of a PC where the DAW is running. Such a music processing apparatus is capable of performing a series of music processes, such as recording/reproduction, editing and mixing of audio signals and MIDI (Musical Instrument Digital Interface) events, and the above-mentioned sequence program and template sequence program are provided as functions of the music processing apparatus.
When the personal computer (PC) executes given application software of the DAW, the given application software can operate in conjunction with the above-mentioned sequence program to extract feature amounts from signals reproduced by a MIDI sequencer, which controls recording/reproduction of MIDI events, to create sequence data and record, as audio signals, material data corresponding to the extracted feature amounts. Namely, the personal computer (PC) that executes the application software can communicate data between the MIDI sequencer and the sequence program and record and edit audio signals from data created by the sequence program.
Further, when the personal computer (PC) executes given application software of the DAW, the given application software can operate in conjunction with the above-mentioned template sequence program to create MIDI tracks of the MIDI sequencer from the tracks of the template sequence program or conversely create templates of the template sequence program by use of timing information of the tracks of the MIDI sequencer and create MIDI data of one or more of the tracks of the template sequence.
Further, when the personal computer (PC) executes mixer-related application software and when the user selects or designates a track by use of a mixer screen of DAW's application software, input/output tracks of the sequence program are handled in such a manner that any of them can be selected or designated on the mixer screen similarly to other MIDI tracks and audio tracks. Thus, it is possible to mix together reproduced signals of MIDI data and the above-mentioned sequence data to output the mixed result, and mix together reproduced signals of audios and the above-mentioned sequence data to output the mixed result. Note that the personal computer (PC) may execute only the sequence program to perform reproductive output and recording based on the sequence program alone.
Furthermore, at the time of data storage and reproduction in the DAW's application, the above-mentioned sequence data and DB designating data may be provided as constituent data of a project file and organized into the single project file. Thus, in this case, the project file comprises the above-mentioned sequence data and DB designating data in addition to, for example, a header, data of audio tracks (i.e., management data and waveform data of a plurality of tracks), data of an audio mixer (parameters of the plurality of channels), data of MIDI tracks (sequence data of the plurality of tracks), data of a software tone generator (parameters of an activated software tone generator), data of a hardware tone generator (parameters of the hardware tone generator registered in a tone generator rack), data of a software effecter (parameters of an activated software effecter), data of a hardware effecter (parameters of an inserted hardware effecter), tone generator table, effecter table, data of a tone generator LAN and other data.
With such arrangements, a great quantity of audio data supplied by the DAW's application can be used as bases of the sequence data and material data, so that not only can usability of the sequence program be greatly enhanced but also the sequence program can be used as a tool for the MIDI sequencer or for audio data editing work.
<Modifications>
<Modification 1>
The icon images displayed in the icon placement region ST in the above-described preferred embodiment may be made stretchable or contractable in the time-axis direction in response to an instruction input by the user.
All of the icon images are shown in
<Modification 2>
The vertical axis of the icon placement region ST in the preferred embodiment has been described as a coordinate axis representing sound volumes (i.e., sound volume axis). However, the present invention is not so limited, and the vertical axis may be a coordinate axis representing sound pitches, lengths or the like (which will hereinafter be referred to as “designating coordinate axis”). Namely, the icon placement region ST may have a designating coordinate axis representing designation values designating processing content, other than sound volumes, of material data. If the designating coordinate axis is one representing pitches, the sound generation control section 130 may change the pitch of the material data in accordance with a position, on the designating coordinate axis, of the icon image and then output the pitch-changed material data as tone data via the data output section 140. If the designating coordinate axis is one representing sound lengths, the sound generation control section 130 may perform a time stretch process (for expanding the waveform of the material data), a loop process (for repetitively outputting the material data), etc. in accordance with a position, on the designating coordinate axis, of the icon image and then output the thus-processed material data as tone data via the data output section 140.
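A minimal sketch of how a vertical icon position might be mapped to a processing parameter follows. The particular ranges (plus or minus 12 semitones for pitch, a 0.5 to 2.0 ratio for time stretch) and the linear mapping are assumptions chosen for illustration, not values taken from the embodiment.

```python
def apply_designating_axis(y, y_max, axis_type):
    """Map a vertical icon position to a processing parameter. For a
    pitch axis the position maps to a semitone shift in [-12, +12];
    for a length axis, to a time-stretch ratio in [0.5, 2.0].
    Both ranges are illustrative assumptions.
    """
    t = y / y_max  # normalized position, 0.0 (bottom) .. 1.0 (top)
    if axis_type == "pitch":
        return round(-12 + 24 * t)      # semitones
    if axis_type == "length":
        return 0.5 + 1.5 * t            # stretch ratio
    raise ValueError("unknown designating coordinate axis")
```

The sound generation control section 130 would apply the returned parameter (a pitch shift or a time stretch) to the material data before outputting it as tone data.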
Further, processing content designated by the designating coordinate axis may pertain to a plurality of types of factors, such as sound volume and pitch, in which case the designating coordinate axis may be switched among the plurality of types in response to an instruction input by the user so that the icon image is placed at a position along the switched coordinate axis. The material data may be processed variously by placing the icon image in the icon placement region ST having such a switchable designating coordinate axis.
<Modification 3>
Whereas the vertical axis of the icon placement region ST in the preferred embodiment has been described as a designating coordinate axis designating processing content of material data, it may be a coordinate axis representing identification values for identifying material data by means of the identification section 510 (such a coordinate axis will hereinafter be referred to as “identifying coordinate axis”). In this case, the identification section 510 may identify material data in accordance with a position, on the identifying coordinate axis, of the icon image. For example, in the case where the identification values represent similarities, an arrangement may be made such that material data having a lower similarity to the extracted material data (i.e., having feature amount information of a greater Euclidean distance from the feature amount information of the extracted material data) is identified by the identification section 510 if the icon image is located more upward in the icon placement region ST.
Such similarities may be designated in accordance with an algorithm (e.g., random algorithm) in which similarities are predetermined in association with all or pre-designated ones of the icon images, in response to the user performing predetermined operation (e.g., random button operation), rather than being designated by the user.
Further, the above-mentioned identification values may pertain to a plurality of types of factors, in which case the icon image may be placed along a switchably-selected one of the identifying coordinate axis in the icon placement region ST and the designating coordinate axis of modification 2. Another example of the type of the identification values may be categories into which the feature amount information corresponding to the icon images is classified, in which case the identification value on the identifying coordinate axis may be changed to change the feature amount information and thereby change the category.
<Modification 4>
According to the above-described preferred embodiment, the content displayed on the display screen 131 is switched from the content of
The control section 51 of the server apparatus 50 transmits, as extraction result data, information indicative of material data extracted from clipped data (i.e., information indicative of a data range in the clipped data and feature amount information), via the communication section 54 (step S310).
Once the information processing terminal 10 receives the extraction result data, the control section 11 of the information processing terminal 10 switches the displayed content on the display screen 131 to the content of
In the illustrated example, the extraction windows wr1, wr2, . . . are displayed in colors corresponding to categories into which respective feature amount information is classified. Here, each of the extraction windows is filled in a translucent color corresponding to the category such that the color becomes deeper or darker in a down-to-up direction while the color becomes lighter in an up-to-down direction. The following description will be made in relation to the extraction window wr1.
A class switching region wrb is displayed in an upper end portion of the extraction window wr1. The class switching region wrb is divided into a plurality of sub regions, and these sub regions are filled with respective ones of colors corresponding to the categories. Further, vertical positions (i.e., positions in the vertical axis direction) in the extraction window wr1 are associated with similarities in such a manner that the similarity increases in the down-to-up direction; namely, the similarity and the color density are correlated to each other.
Once the user designates any position of the class switching region wrb, the control section 11 changes the color of the extraction window wr1 of the display screen 131 to the color of the sub region which the user-designated position belongs to. At that time, the color density gradation pattern, in which the color becomes deeper in the down-to-up direction while the color becomes lighter in the up-to-down direction, does not change. In this manner, the control section 11 sets the category corresponding to the changed-to color as a search-target class. If such user's designation is not made, then the original category in which the feature amount information of the material data corresponding to the extraction window is classified is set as-is as a search-target class. Once any position inside the extraction window wr1 is designated by the user, the control section 11 sets, as a search condition, a similarity corresponding to a vertical axial position of the user-designated position. In the aforementioned manner, the control section 11 sets, as search conditions, the class and similarity (step S320).
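The derivation of the search conditions (class and similarity) from a touch position inside an extraction window might look as follows. The equal-width sub regions, the 0-to-1 similarity scale and the screen coordinate convention (y = 0 at the top edge) are illustrative assumptions.

```python
def set_search_condition(x, y, window_width, window_height, categories):
    """Derive a (class, similarity) search condition from a touch inside
    an extraction window. The class switching region is modeled as equal
    horizontal sub regions; the similarity increases in the down-to-up
    direction, so a higher touch yields a value nearer 1.0.
    """
    sub_width = window_width / len(categories)
    category = categories[min(int(x / sub_width), len(categories) - 1)]
    similarity = 1.0 - (y / window_height)  # y = 0 at the top edge
    return category, similarity
```

The resulting pair would be transmitted to the server apparatus 50 as the condition data of step S330.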
Once the reproduction button by is operated by the user, the control section 11 transmits, via the communication section 14, condition data, indicative of the search conditions, in association with information identifying material data in the extraction window (step S330).
Once the server apparatus 50 receives the condition data, the control section 51 identifies material data similar to the feature amount information of the extracted material data in a similar manner to step S260 in the above-described preferred embodiment (step S340). However, in the illustrated example, unlike in the preferred embodiment, the control section 51 identifies material data on the basis of the condition data. Namely, the search target here is material data stored in the feature amount DB and having feature amount information classified into the category indicated by the condition data. Further, such material data are sequentially identified in descending order of similarity, i.e. starting with the one having the smallest Euclidean distance. Namely, material data whose feature amount information has a higher similarity, i.e. a smaller Euclidean distance, to the feature amount information of the extracted material data than the others is identified. In the case where the DB designating data is employed, as noted above, the control section 51 determines a type of feature amount DB, which becomes a search target at the time of identification of material data, on the basis of the DB designating data.
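Restricting the search to the class indicated by the condition data can be sketched as a filter followed by a distance sort, starting with the smallest Euclidean distance. The per-entry layout (feature vector plus class label) is an assumption made for illustration.

```python
import math

def search_with_condition(query, db, condition_class, k=5):
    """Identify up to k material data whose feature amount information is
    classified into `condition_class`, ranked starting with the smallest
    Euclidean distance from `query`. Each db entry maps a material ID to
    a (feature_vector, class) pair; this layout is illustrative.
    """
    def dist(vec):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, vec)))
    candidates = [(mid, vec) for mid, (vec, cls) in db.items()
                  if cls == condition_class]
    candidates.sort(key=lambda item: dist(item[1]))
    return [mid for mid, _ in candidates[:k]]
```

Where DB designating data is employed, the choice of `db` itself would additionally depend on the designated feature amount DB type.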
Once the control section 51 identifies material data on the basis of the condition data, it transmits, to the information processing terminal 10 via the communication section 54, the identified material data and information identifying replacing material data in association with each other (step S350).
Upon receipt of these data from the server apparatus 50, the control section 11 of the information processing terminal 10 replaces the extracted material data with the identified material data (step S360). Once the user instructs reproduction by operating the reproduction button by (step S370), the control section 11 reproduces the clipped data having been subjected to the material data replacement, to thereby output the clipped data as tone data (step S380), so that the tone data is audibly generated through the speaker 161.
Here, each of the extraction windows indicates a portion of material data extracted in the server apparatus 50. Further, a user-designated portion (e.g., portion wrs shown in
Whereas the foregoing has been described in relation to the case where similarities are designated by the user, such similarities may be designated in accordance with an algorithm (e.g., random algorithm) in which similarities are predetermined in association with all or pre-designated ones of the extraction windows, in response to the user performing predetermined operation (random button operation).
Further, sound volumes with which portions included in the extraction windows and the other portions are to be reproduced may be made adjustable separately from each other. Such sound volume adjustment may be controlled with a continuous amount or intermittently in an ON/OFF fashion. Also, the sound volume adjustment may be performed separately for each of the extraction windows. In this way, sounds can be audibly generated with material data portions made outstanding or non-outstanding. These modified features are also applicable while the display of
Further, as described above in relation to the preferred embodiment, the user may adjust the time-axial lengths of the extraction windows wr1, wr2, . . . so that the start and end times of extracted material data are adjustable.
<Modification 5>
In the above-described preferred embodiment, the operation of step S170 shown in
Further, in the above-described preferred embodiment, sounds based on such tone data are output through the speaker 161 of the information processing terminal 10. Alternatively, sounds based on such tone data may be output through an external speaker device connected to the information processing terminal 10 or through the server apparatus 50. Namely, the sound generation control section 130 may control not only the structural components of the information processing terminal 10 but also structural components connected to the information processing terminal 10.
<Modification 6>
In the above-described preferred embodiment, the icon images, DB images, etc. to be displayed on the display screen 131 during execution of the sequence program are generated by programs. However, the present invention is not so limited, and, in modification 6, such images may be prestored in the storage section 15, 55 or the like.
<Modification 7>
In the above-described preferred embodiment, content of the feature amount information corresponding to the icon images to be displayed in the icon placement region ST is determined in accordance with an instruction input by the user. However, the present invention is not so limited, and, in modification 7, the user may select a design of a desired icon image to thereby determine, as feature amount information, a representative value predetermined for the category corresponding to the selected design.
<Modification 8>
In the above-described preferred embodiment, content of the DB designating data is determined by the user placing DB images in the DB placement region DT. However, the present invention is not so limited, and, in modification 8, relationship between the reproduction time ranges and the feature amount DB types may be determined automatically by the control section 11.
<Modification 9>
The above-described preferred embodiment may be modified in such a manner that, if there is a reproduction time range where no type of feature amount DB is designated by the DB designating data, a predetermined type of feature amount DB (all or one or some particular ones of a plurality of types) or a type designated in an immediately preceding reproduction time range is designated as a search target for that reproduction time range.
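One of the fallback policies described here (inheriting the type designated in an immediately preceding reproduction time range, with a predetermined default otherwise) can be sketched as follows; the function name and the "all" default are illustrative assumptions.

```python
def db_type_for_range(range_index, designated):
    """Return the feature amount DB type to search for a reproduction
    time range: the designated type if any, otherwise the type of the
    nearest preceding designated range, otherwise a default covering
    all types. A sketch of one fallback policy of Modification 9.
    """
    for i in range(range_index, -1, -1):
        if designated.get(i) is not None:
            return designated[i]
    return "all"
```

Alternatively, the default could be one or some particular ones of the plurality of feature amount DB types rather than all of them.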
<Modification 10>
Whereas, in the above-described preferred embodiment, the sound generation control system 1 comprises the information processing terminal 10 and the server apparatus 50 interconnected via the communication line 1000, it may comprise the information processing terminal 10 and the server apparatus 50 constructed as an integral unit without the intervention of the communication line 1000. Further, even where the information processing terminal 10 and the server apparatus 50 are provided as separate apparatus, one or some of the structural components of the information processing terminal 10 may be included in the server apparatus 50, or conversely, one or some of the structural components of the server apparatus 50 may be included in the information processing terminal 10. Further, it is only necessary that various information described above as stored in the storage section 15 of the information processing terminal 10 and various information described as stored in the storage section 55 of the server apparatus 50 be stored in any storage section in the entire sound generation control system 1. For example, a storage device for storing all or part of the various information may be connected to the communication line 1000 rather than the information processing terminal 10 and server apparatus 50. Further, the various information may be shared with another information processing terminal 10 connectable to the communication line 1000 so that another user can use the various information.
For example, the feature amount DB may be stored in the storage section 55 of the server apparatus 50 and the clipped data DB may be stored in the storage section 15 of the information processing terminal 10 so that the functions of the identification section 510 can be implemented. Furthermore, the search program and the extraction program may be executed in the information processing terminal 10, or may be executed in the server apparatus 50 on the basis of information acquired from the information processing terminal 10.
Furthermore, the present invention is not limited to the construction where software arrangements based on the aforementioned sequence program and template sequence program of the modification are implemented by a computer or processor. The present invention may instead be constructed as hardware of a specialized sequencer. If the present invention is applied to a DAW and only cooperation with the MIDI sequencer suffices, then the aforementioned sequence program and template sequence program may be applied to the MIDI sequencer.
The following describes, with reference to
The storage section 15A is a combination of the storage section 15 and storage section 55 employed in the above-described preferred embodiment; namely, the storage section 15A stores both content described above as stored in the storage section 15 and content described above as stored in the storage section 55.
The control section 11A executes both of the programs executed separately by the control sections 11 and 51 in the above-described preferred embodiment. Programs to be executed together, such as the sequence program and search program, may be integrated and stored in the storage section 15A as a single program.
The following describes, with reference to
<Modification 11>
The number of tracks in the template sequence program is not limited to four; it may be more or fewer than four, and may even be indefinitely great (in effect, as great as the system permits). In such a case, a great multiplicity of feature amount data are placed on the time axis, and similar sounds and feature amount parameters can be changed independently of one another.
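A variable track count of this kind can be sketched as below. The class and attribute names are hypothetical; the point is only that tracks are an open-ended collection, each placing feature amount data at positions on the time axis.

```python
# Illustrative sketch of a template sequence whose track count is not
# fixed at four (names are assumptions, not from the specification).

class Track:
    def __init__(self):
        self.events = []  # list of (time, feature_amount) pairs

    def place(self, time, feature_amount):
        # Keep events ordered along the time axis.
        self.events.append((time, feature_amount))
        self.events.sort(key=lambda e: e[0])

class TemplateSequence:
    def __init__(self, num_tracks):
        # Any number of tracks, limited only by system resources.
        self.tracks = [Track() for _ in range(num_tracks)]

seq = TemplateSequence(num_tracks=16)
seq.tracks[0].place(1.5, {"intensity": 0.3})
seq.tracks[0].place(0.0, {"intensity": 0.7})
print(len(seq.tracks))              # → 16
print(seq.tracks[0].events[0][0])   # → 0.0
```

Because each track holds its own event list, the feature amount parameters placed on one track can be edited without affecting any other track.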
<Modification 12>
Whereas the above-described preferred embodiment is arranged to acquire material data as necessary from the clipped data DB in accordance with a data range of the feature amount DB, only material data of portions clipped in advance may be prestored, in which case information indicative of a data range need not be stored in the feature amount DB.
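The two storage schemes contrasted above can be sketched as follows, with all names assumed for illustration: in one, the feature amount DB records a data range and material data is clipped on demand from a continuous waveform; in the other, only the pre-clipped portions are prestored, so no range information is needed.

```python
# Scheme A: the feature amount DB stores a (start, end) data range,
# and material data is clipped from the continuous waveform as necessary.
# Scheme B: only pre-clipped portions are prestored.

continuous_waveform = list(range(100))  # stand-in for audio samples

# Scheme A: range information kept in the feature amount DB.
feature_db_with_ranges = {"mat_1": {"range": (10, 20)}}

def clip_on_demand(material_id):
    start, end = feature_db_with_ranges[material_id]["range"]
    return continuous_waveform[start:end]

# Scheme B: the clipped portion itself is prestored; no range needed.
prestored_clips = {"mat_1": continuous_waveform[10:20]}

# Both schemes yield the same material data.
print(clip_on_demand("mat_1") == prestored_clips["mat_1"])  # → True
```

Scheme B trades storage space (clips are duplicated out of the continuous waveform) for a simpler feature amount DB record.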
<Modification 13>
Whereas the above-described preferred embodiment is arranged to search through the feature amount DB to identify material data having a high similarity to feature amount information of extracted material data (see step S260 of
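The similarity search referred to above can be sketched as follows: from the feature amount DB, identify the material whose feature amounts are closest to those of the extracted material. Cosine similarity is used here purely as one plausible measure; the function and DB names are assumptions for illustration.

```python
# Minimal sketch of searching the feature amount DB for the material
# most similar to a query feature vector.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify_most_similar(query, feature_db):
    # Return the material id with the highest similarity to the query.
    return max(feature_db, key=lambda mid: cosine_similarity(query, feature_db[mid]))

feature_db = {"mat_a": [1.0, 0.0, 0.2], "mat_b": [0.1, 0.9, 0.4]}
print(identify_most_similar([0.9, 0.1, 0.3], feature_db))  # → mat_a
```

In practice a similarity threshold could also be applied so that no material is returned when nothing in the DB is sufficiently close.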
<Modification 14>
In the foregoing description of the preferred embodiment and modification 4, it has been stated that the user may be allowed to adjust the time axial lengths of the extraction windows; in addition, an arrangement may be made to permit finer time axial length adjustment.
Such time axial length adjustment on the popup window may also be performed on the display screen of
<Modification 15>
Whereas, in the above-described preferred embodiment, DB images with which types of feature amount DBs are associated are displayed in the DB placement region DT, the types of feature amount DBs and the time ranges, which become search targets for the types, may be displayed separately from each other.
In the illustrated example of
Check boxes CB may be displayed to the left of the DB type designating region DM so that the user can designate whether the search targets designated in the corresponding horizontal rows should be made valid or invalid. Such check boxes CB may also be used in the display of
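The check-box behaviour described above can be sketched as follows, where each horizontal row pairs a feature amount DB type and time range with an enabled flag, and only checked rows are treated as search targets. The row layout and DB type names are assumptions for illustration.

```python
# Illustrative sketch: check boxes CB make each row's search target
# valid (checked) or invalid (unchecked).

rows = [
    {"db_type": "drums", "time_range": (0, 8), "checked": True},
    {"db_type": "bass",  "time_range": (0, 8), "checked": False},
    {"db_type": "voice", "time_range": (4, 8), "checked": True},
]

def active_search_targets(rows):
    # Only rows whose check box is on participate in the search.
    return [(r["db_type"], r["time_range"]) for r in rows if r["checked"]]

print(active_search_targets(rows))  # → [('drums', (0, 8)), ('voice', (4, 8))]
```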
<Modification 16>
In the above-described preferred embodiment, the icon images on the display of
<Modification 17>
In the above-described preferred embodiment of the present invention, a type of feature amount DB and a time range, in which the type of feature amount DB becomes a search target, are designated by the user on the display of
<Modification 18>
On the display of
<Modification 19>
Each of the programs employed in the above-described preferred embodiment can be supplied stored in a computer-readable storage medium, such as a magnetic storage medium (like a magnetic tape or magnetic disk), an optical storage medium (like an optical disk), a magneto-optical storage medium or a semiconductor memory. Further, the information processing terminal 10 or the server apparatus 50 may download the programs via a network.
<Modification 20>
The preferred embodiment has been described above as storing a file created by the sequence program and a file created by the template sequence program into the non-volatile memory as separate files. In modification 20, the file created by the sequence program and the file created by the template sequence program are stored into the non-volatile memory in response to just one operation. At that time, these files may be either stored as separate files, for example, with different extensions, or stored combined together in a single file. Further, each of the file names may be automatically designated from information, such as a corresponding music piece name, date, etc., without being designated by the user.
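The one-operation save of modification 20 can be sketched as below. The file extensions, name format, and dictionary-backed storage are assumptions chosen for illustration; the point is that a single operation writes both files, with names derived from the music piece name and the date rather than typed in by the user.

```python
# Illustrative sketch of modification 20: one save operation stores both
# the sequence file and the template sequence file, with automatically
# designated file names.
import json
from datetime import date

def save_both(piece_name, seq_data, template_data, storage, today=None):
    today = today or date.today()
    stamp = today.strftime("%Y%m%d")
    # Separate files distinguished by extension; they could equally be
    # combined together into a single container file.
    storage[f"{piece_name}_{stamp}.seq"] = json.dumps(seq_data)
    storage[f"{piece_name}_{stamp}.tseq"] = json.dumps(template_data)
    return sorted(storage)

storage = {}
names = save_both("mypiece", {"notes": [60, 62]}, {"tracks": 4},
                  storage, today=date(2011, 11, 4))
print(names)  # → ['mypiece_20111104.seq', 'mypiece_20111104.tseq']
```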
This application is based on, and claims priority to, JP PA 2011-045708 filed on 2 Mar. 2011 and JP PA 2011-242606 filed on 4 Nov. 2011. The disclosures of the priority applications, in their entirety, including the drawings, claims, and specifications thereof, are incorporated herein by reference.
Number | Date | Country | Kind
---|---|---|---
2011-045708 | Mar 2011 | JP | national
2011-242606 | Nov 2011 | JP | national