This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2012-099643, filed Apr. 25, 2012, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a music note position detection apparatus, an electronic musical instrument, a music note position detection method, and a storage medium by which, based on a musical score image and music note data that are of the same musical piece but are independent from and not correlated with each other, the position of a music note in the musical score image corresponding to a sound represented by the music note data is detected.
2. Description of the Related Art
Users who desire to enjoy music, in particular, beginners, often do not know which sound is produced for a music note displayed on a musical score. For this reason, in recent years, various technologies have been developed which enable intuitive recognition of a correspondence between a music note displayed on a musical score and music note data representing the music note. For example, Japanese Patent No. 3980888 discloses a technology in which music note data stored in a storage section is displayed on a screen as a musical score and, when a touch operation is performed on a desired music note in the musical score by using a touch panel provided on the display screen, a musical sound of the music note at the touched point is emitted.
In the technology disclosed in Japanese Patent No. 3980888, a plurality of music note data representing respective sounds constituting a musical piece and display positions of the respective music notes displayed on the musical score are associated in advance with each other. As a result, the sound of a music note specified by a touch operation can be emitted. However, there is a problem in that the position of a music note in a musical score image corresponding to a sound represented by music note data cannot be detected based on the musical score image and the music note data that are of the same musical piece but are independent from and not correlated with each other.
The present invention has been conceived in light of the above-described problems. An object of the present invention is to provide a music note position detection apparatus, a music note position detection method, and a program by which, based on a musical score image and music note data that are of the same musical piece but are independent from and not correlated with each other, the position of a music note in the musical score image corresponding to a sound represented by the music note data can be detected.
In accordance with one aspect of the present invention, there is provided a music note position detection apparatus comprising: an obtaining section which detects bar lines in musical score image data to divide at each bar line, and obtains a layout range of music notes in each measure obtained by division; an extracting section which estimates a position of each music note in the layout range obtained by the obtaining section by using a plurality of music note data constituting a musical piece, and extracts a matching value between the music note at the estimated position and a position of the music note detected on the musical score image data, and the detected position of the music note, as position candidates; and a determining section which excludes a musically improper position candidate from among the position candidates extracted by the extracting section, and determines a detected position of a remaining position candidate as a position of the music note.
In accordance with another aspect of the present invention, there is provided a non-transitory computer-readable storage medium having stored thereon a program that is executable by a computer, the program being executable by the computer to perform functions comprising: obtainment processing for detecting bar lines in musical score image data to divide at each bar line, and obtaining a layout range of music notes in each measure obtained by division; extraction processing for estimating a position of each music note in the layout range obtained in the obtainment processing by using a plurality of music note data constituting a musical piece, and extracting a matching value between the music note at the estimated position and a position of the music note detected on the musical score image data, and the detected position of the music note, as position candidates; and determination processing for excluding a musically improper position candidate from among the position candidates extracted in the extraction processing, and determining a detected position of a remaining position candidate as a position of the music note.
In accordance with another aspect of the present invention, there is provided a music note position detection method comprising: detecting bar lines in musical score image data to divide at each bar line, and obtaining a layout range of music notes in each measure obtained by division; estimating a position of each music note in the obtained layout range by using a plurality of music note data constituting a musical piece, and extracting a matching value between the music note at the estimated position and a position of the music note detected on the musical score image data, and the detected position of the music note, as position candidates; and excluding a musically improper position candidate from among the extracted position candidates, and determining a detected position of a remaining position candidate as a position of the music note.
The above and further objects and novel features of the present invention will more fully appear from the following detailed description when the same is read in conjunction with the accompanying drawings. It is to be expressly understood, however, that the drawings are for the purpose of illustration only and are not intended as a definition of the limits of the invention.
Embodiments of the present invention are described below with reference to the drawings.
A. Structure
A ROM (Read-Only Memory) 11 has stored therein various control programs to be loaded into the CPU 10. These various control programs include a program for music note position detection processing, which will be described further below. A RAM (Random Access Memory) 12 includes a work area, a data area, and a music note position storage area. In the work area of the RAM 12, various registers and flag data that are used in processing by the CPU 10 are stored.
In the data area of the RAM 12, musical score image data (in a bit map format) to be displayed on the screen of a display section 14 and a plurality of music note data representing respective sounds constituting a musical piece are stored. The musical score image data and the music note data herein are of the same musical piece, but are independent from and not correlated with each other. The music note data is represented in a known MIDI data format. In the music note position storage area of the RAM 12, the positions of respective music notes in a musical score obtained from the music note position detection processing described further below are stored.
The operation section 13 has various operation switches arranged on an apparatus panel, and generates a switch event corresponding to the type of a switch operated by a user. The switch event generated by the operation section 13 is loaded into the CPU 10. The operation section 13 is provided with a power switch for power-on/off and, for example, a mode switch for specifying a mode for performing the music note position detection processing, which will be described further below, and a timbre selection switch for selecting a timbre of a musical sound to be emitted.
The display section 14 includes a color liquid-crystal panel and the like and, in response to a display control signal supplied from the CPU 10, performs screen display of a musical score image based on musical score image data stored in the data area of the RAM 12 and screen display of states, such as a setting state and an operation state, of the musical instrument. The touch panel 15 includes a multi-touch-type touch screen arranged on the display screen of the display section 14, and outputs an operation signal according to a touch operation performed on the touch screen. The outputted operation signal is loaded into the CPU 10. A sound source 16 is configured by a known wave memory read scheme, and generates musical sound data according to an event supplied from the CPU 10. A sound system 17 converts the musical sound data outputted from the sound source 16 to a musical sound signal in an analog format, and after amplifying the musical sound signal, emits the sound from a loudspeaker.
B. Operation
(1) Operation of Music Note Position Detection Processing
Next, the operation of music note position detection processing by the CPU 10 is described with reference to the accompanying drawings.
Subsequently, at Step SA2, the CPU 10 performs the image recognition of the musical score image data to detect a bar line in the musical score and, based on the detection result, divides the musical score image data for each measure. Next, at Step SA3, the CPU 10 performs preliminary inspection of the layout of music notes (a music note shape layout) in the display area of a specific measure obtained based on the detected bar-line information to obtain a music note layout range.
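By way of illustration only, the following is a minimal Python sketch of the kind of bar-line detection and measure division performed at Step SA2. It assumes the score bitmap is a grayscale numpy array and that a bar line appears as a nearly unbroken dark vertical stroke spanning the staff; the function names, parameters, and thresholds are hypothetical, not the embodiment's actual recognizer.

```python
import numpy as np

def find_bar_lines(score: np.ndarray, staff_top: int, staff_bottom: int,
                   dark_thresh: int = 128, fill_ratio: float = 0.9) -> list[int]:
    """Return x positions of candidate bar lines in a grayscale score bitmap.

    A column is a bar-line candidate when nearly all of its pixels between
    staff_top and staff_bottom are dark, i.e. an unbroken vertical stroke
    spanning the staff (assumed criterion; the text only states that bar
    lines are detected by image recognition)."""
    band = score[staff_top:staff_bottom, :] < dark_thresh   # True = dark pixel
    column_fill = band.mean(axis=0)                         # dark fraction per column
    return [x for x, fill in enumerate(column_fill) if fill >= fill_ratio]

def split_into_measures(score: np.ndarray, bar_xs: list[int]) -> list[np.ndarray]:
    """Slice the bitmap into one sub-image per measure, between bar lines."""
    return [score[:, x0:x1] for x0, x1 in zip(bar_xs, bar_xs[1:])]
```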
Next, the CPU 10 proceeds to Step SA4, and estimates a position where each music note is to be placed by using the music note data (MIDI data). Specifically, when the musical score is a piano's grand staff, the musical score has forty-eight position conditions, namely, two scales (the C scale and the F scale), two positions of staffs (an upper staff and a lower staff), three signs (flat, sharp, and natural) to lower and raise a note by a semitone, and four music note positions (center, upper or lower, above, and below the staff), and all combinations of these conditions are laid out to estimate the position of each music note.
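As a sketch of this exhaustive layout, the forty-eight conditions can be enumerated as the Cartesian product of the four factors above; mapping each combination to an actual vertical coordinate would additionally require the staff geometry, which is omitted here. The names below are illustrative assumptions.

```python
from itertools import product

SCALES = ("C", "F")                     # the two scales named in the text
STAFFS = ("upper", "lower")             # upper and lower staffs of the grand staff
SIGNS = ("flat", "sharp", "natural")    # signs lowering or raising a note by a semitone
POSITIONS = ("center", "upper_or_lower", "above_staff", "below_staff")

# 2 x 2 x 3 x 4 = 48 position conditions, laid out exhaustively (Step SA4).
CONDITIONS = list(product(SCALES, STAFFS, SIGNS, POSITIONS))
assert len(CONDITIONS) == 48
```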
Then at Step SA5, the CPU 10 judges whether the estimated position of each music note obtained at Step SA4 is within the music note layout range. When the estimated position is outside the music note layout range obtained by the preliminary detection, or when the layout determined by a pitch (a note number) in the MIDI data is an improper layout that is above, below, or in the middle of the staff, the judgment result is “NO”, and the CPU 10 returns to Step SA4 to exclude the position from the position candidates.
Conversely, when the estimated position of each music note is within the music note layout range, the judgment result at Step SA5 is “YES”, and the CPU 10 proceeds to Step SA6. At Step SA6, the CPU 10 sets a detection range whose height in the vertical direction with respect to the estimated position is equal to the staff height and whose width in the horizontal direction is equal to the width used in the preliminary detection. In addition, the CPU 10 performs pattern matching for music notes in this range with three music note types (a whole note, a half note, and a quarter note), and stores a matching value (a degree of coincidence) and a detected position for each music note.
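A minimal sketch of such a template-matching pass, assuming OpenCV's normalized cross-correlation and small grayscale note-head templates for the three note types; the templates dictionary and helper name are hypothetical, and the embodiment's actual matcher is not disclosed beyond the description above.

```python
import cv2
import numpy as np

def match_note(measure_img: np.ndarray, est_x: int, est_y: int,
               staff_height: int, search_width: int,
               templates: dict[str, np.ndarray]):
    """Step SA6 sketch: search a detection range whose height equals the
    staff height and whose width equals the preliminary-detection width,
    centered on the estimated position, and return the best matching value
    (degree of coincidence), note type, and detected position.

    templates maps a note type ("whole", "half", "quarter") to a small
    grayscale template bitmap supplied by the caller."""
    top = max(0, est_y - staff_height // 2)
    left = max(0, est_x - search_width // 2)
    roi = measure_img[top:top + staff_height, left:left + search_width]
    best = (0.0, None, None)
    for note_type, tmpl in templates.items():
        if roi.shape[0] < tmpl.shape[0] or roi.shape[1] < tmpl.shape[1]:
            continue  # skip templates larger than the detection range
        res = cv2.matchTemplate(roi, tmpl, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(res)
        if max_val > best[0]:
            best = (max_val, note_type, (left + max_loc[0], top + max_loc[1]))
    return best  # (matching value, note type, (x, y) detected position)
```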
Subsequently at Step SA7, the CPU 10 corrects the matching value by using a sound emission time and a measure width. A music note detected by the pattern matching lies further to the left on the musical score as its sound emission time is earlier and, conversely, further to the right as its sound emission time is later. Therefore, weighting is performed with the estimated layout position of the music note being taken as a maximum value. As a matter of course, since the music note position does not strictly coincide with the sound emission time, a gentle correction curve (a music note position curve) is provided, an example of which is depicted in the accompanying drawings.
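One way to realize such a gentle correction curve is sketched below, under the assumption of a Gaussian weight that peaks at the position expected from the sound emission time; the Gaussian shape and the sigma_ratio constant are illustrative choices, as the exact form of the curve is not specified in the text.

```python
import math

def corrected_matching_value(match_val: float, detected_x: float,
                             tick: int, measure_ticks: int,
                             measure_left: float, measure_width: float,
                             sigma_ratio: float = 0.25) -> float:
    """Step SA7 sketch: weight the raw matching value by how close the
    detected x position lies to the position expected from the note's sound
    emission time within the measure. The weight peaks at 1.0 at the
    expected position and falls off gently rather than cutting off hard."""
    expected_x = measure_left + measure_width * (tick / measure_ticks)
    sigma = measure_width * sigma_ratio          # gentleness of the curve
    weight = math.exp(-((detected_x - expected_x) ** 2) / (2.0 * sigma ** 2))
    return match_val * weight
```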
Then at Step SA8, the CPU 10 judges whether all music notes in the measure have been completely detected. When detection has not been completed, the judgment result is “NO”, and the CPU 10 returns to Step SA4. Thereafter, Step SA4 to Step SA8 are repeated until all music notes in the measure are completely detected. Then, after all music notes are completely detected, the judgment result at Step SA8 is “YES”, and therefore the CPU 10 proceeds to Step SA9.
At Step SA9, the CPU 10 performs musical grammar filtering on the position candidates obtained so far (the detection position conditions, the matching values, and the detected positions). That is, musical grammar filter processing is performed (which will be described in detail further below) in which musically improper position candidates such as those out of rules (arrangements) in musical score notation and musicology are excluded to narrow down the candidates.
Subsequently at Step SA10, the CPU 10 totalizes all combinations of the position candidates obtained by the narrowing down in the musical grammar filter processing, and outputs the position candidate with the highest evaluation value as the position of the music note. The outputted position of the music note is stored in the music note position storage area of the RAM 12. Then at Step SA11, the CPU 10 judges whether all measures have been completely processed. When all measures have not been completely processed, the judgment result is “NO”, and therefore the CPU 10 returns to Step SA2 and repeats the above-described processing for the next measure. Conversely, when all measures have been completely processed, the music note position detection processing ends.
(2) Operation of Musical Grammar Filter Processing
Next, the operation of the musical grammar filter processing is described with reference to the accompanying drawings.
Next at Step SB2, the CPU 10 assumes that the scales with a high total value are the upper-limit and lower-limit scales in the musical score, and deletes position candidates other than those of the above-described scales. This is because the musical grammar presumes that no scale change occurs within a measure. Subsequently at Step SB3, the CPU 10 counts, for each position candidate, the cases where the distance to another position candidate is within the staff width. At Step SB4, the CPU 10 deletes position candidates with a count value equal to or larger than 2. That is, a count value equal to or larger than 2 means that two music notes overlap each other; such candidates are therefore taken as adjacent music notes that are improper in a musical score, and the relevant position candidates are deleted.
Then at Step SB5, the CPU 10 searches for a music note with only one position candidate. Then, at subsequent Step SB6, the CPU 10 deletes position candidates at the same position as this single candidate. That is, when a music note has only a single position candidate, any candidate of another note having a plurality of position candidates that lies at that same position is deleted. Since all music notes on the musical score are eventually at different positions, the CPU 10 prioritizes the music note laid out at a single position.
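A minimal Python sketch of Steps SB3 to SB6, assuming each note's candidates are (x, y) tuples and using a simple box distance for "within the staff width"; the exact distance metric and the safeguard of never deleting a note's last candidate are assumptions, not stated in the text.

```python
def grammar_filter(candidates: dict, staff_width: int) -> dict:
    """candidates maps a note id to its list of (x, y) position candidates.

    Steps SB3/SB4: delete candidates lying within the staff width of two or
    more other candidates (overlapping note heads are musically improper).
    Steps SB5/SB6: a note with a single remaining candidate claims that
    position, so matching candidates of other notes are deleted."""
    def close(p, q):
        return abs(p[0] - q[0]) <= staff_width and abs(p[1] - q[1]) <= staff_width

    all_pts = [p for pts in candidates.values() for p in pts]
    for note, pts in candidates.items():
        kept = [p for p in pts
                if sum(1 for q in all_pts if q is not p and close(p, q)) < 2]
        candidates[note] = kept or pts       # assumption: keep a last candidate

    for note, pts in candidates.items():
        if len(pts) == 1:                    # Step SB5: single-candidate note
            single = pts[0]
            for other, opts in candidates.items():
                if other != note and len(opts) > 1:
                    candidates[other] = [p for p in opts if p != single] or opts
    return candidates
```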
As such, in the musical grammar filter processing, among the obtained position candidates (the detection position conditions, the matching values, and the detected positions), musically improper position candidates such as those out of rules (arrangements) in musical score notation are excluded to narrow down the candidates.
As described above, in the first embodiment, the CPU 10 detects bar lines in a musical score image and divides the image at each bar line. Subsequently, the CPU 10 obtains a layout range of music notes in each measure obtained by the division. If the position of a music note estimated in the layout range by using the music note data (MIDI data) is out of the layout range, that music note position is excluded. If the estimated position of the music note is within the layout range, a music note in a detection range corresponding to the estimated position is detected by pattern matching, the matching value and detected position obtained by the detection are stored as a position candidate, and the matching value of the position candidate is corrected with the sound emission time and the measure width. Subsequently, from among the position candidates, musically improper position candidates such as those out of rules (arrangements) in musical score notation are excluded to narrow down the candidates. Then, from among the position candidates obtained by the narrowing down, the position candidate with the highest evaluation value is outputted as the position of the music note. Therefore, based on the musical score image and music note data that are of the same musical piece but are independent from and not correlated with each other, the position of a music note in the musical score image corresponding to a sound represented by the music note data can be detected.
Next, a second embodiment is described. In the first embodiment described above, based on a musical score image and music note data that are of the same musical piece but are independent from and not correlated with each other, the position of a music note in the musical score image corresponding to a sound represented by the music note data is detected. In the second embodiment, when thus found position of the music note in the musical score image is specified by a user's touch operation, the sound of the music note at the specified position is emitted.
The structure of the second embodiment is the same as that of the first embodiment described above, and therefore is not described herein. In the following descriptions, the operation of musical performance processing according to the second embodiment is described.
When the processing starts, the CPU 10 proceeds to Step SC1 depicted in the accompanying drawings.
Next at Step SC3, the CPU 10 performs positional conversion of the touched point obtained at Step SC2 in consideration of a current musical score display magnification to calculate a bit map coordinate value in the musical score image. If the musical score display is to be at unity magnification, the bit map coordinate values represent the same position. In the case of double zoomed-in display, the bit map coordinate values represent a position obtained by adding a display offset and a half of the touched point together. Then, the CPU 10 proceeds to Step SC4, and calculates a distance between the bit map coordinate value obtained at Step SC3 and the position of each music note stored in the music note position storage area of the RAM 12.
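A minimal sketch of this positional conversion at Step SC3, generalizing the two cases given in the text (identity at unity magnification; display offset plus half the touched point at double zoom) to offset + touch / zoom; the function name and the general form are assumptions.

```python
def touch_to_bitmap(touch_x: float, touch_y: float,
                    offset_x: float, offset_y: float, zoom: float):
    """Convert a touched screen point to bitmap coordinates in the score image.

    zoom == 1.0 reproduces the unity-magnification case (same position);
    zoom == 2.0 reproduces the double zoomed-in case (display offset plus
    half of the touched point)."""
    return offset_x + touch_x / zoom, offset_y + touch_y / zoom
```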
Subsequently at Step SC5, the CPU 10 judges whether the calculated distance is within a staff width. If the distance is within the staff width, the judgment result is “YES”, and therefore the CPU 10 proceeds to Step SC6. At Step SC6, the CPU 10 instructs the sound source 16 to perform note-on based on the music note data (MIDI data) associated with the relevant music note, and then proceeds to Step SC9. In the case of a chord in which a plurality of music notes gather at one position in the staff width, the CPU 10 causes the sound source 16 to simultaneously emit a plurality of sounds based on the music note data (MIDI data) associated with each music note configuring the chord.
Conversely, if the distance is not within the staff width, the judgment result at Step SC5 is “NO”, and therefore the CPU 10 proceeds to Step SC7. At Step SC7, the CPU 10 judges whether the sound of the music note is being emitted. If the sound is not being emitted, the judgment result is “NO”, and therefore the CPU 10 proceeds to Step SC9. If the sound of the music note is being emitted, the judgment result is “YES”, and therefore the CPU 10 proceeds to Step SC8. At Step SC8, the CPU 10 instructs the sound source 16 to perform note-off based on the music note data (MIDI data) associated with the relevant music note, and then proceeds to Step SC9.
Subsequently at Step SC9, the CPU 10 judges whether distance calculation for the last music note in the musical piece has been completed. If distance calculation has not been completed, the judgment result is “NO” and the CPU 10 repeats the processing at Step SC4 and the following processing. If distance calculation for the last music note in the musical piece has been completed, the judgment result at Step SC9 is “YES”, and therefore the CPU 10 returns to Step SC2.
As such, in the musical performance processing, when the user performs a touch operation on a musical score image displayed on the screen, the CPU 10 sets a circular detection range whose radius centering on the touched point is equal to the staff width. Then, from among the positions of the respective music notes stored in the music note position storage area of the RAM 12, a music note included in the circular detection range is taken as the music note subjected to the touch operation, and the sound of the music note is emitted. When the user releases the touch, the music note whose sound is being emitted falls outside the circular detection range and is muted.
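A minimal sketch of this circular hit test, assuming the stored note positions are (x, y) bitmap coordinates keyed by a note id; returning several ids covers the chord case, in which all of the returned notes are sounded simultaneously. Names are illustrative.

```python
import math

def notes_in_touch_range(touch_pt, note_positions, staff_width):
    """Return the ids of notes whose stored position lies inside a circular
    detection range, of radius equal to the staff width, centered on the
    touched point. Several ids at once correspond to a chord."""
    tx, ty = touch_pt
    return [note_id for note_id, (nx, ny) in note_positions.items()
            if math.hypot(nx - tx, ny - ty) <= staff_width]
```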
As described above, in the second embodiment, the CPU 10 sets a circular detection range whose radius centering on the touched point is equal to the staff width. However, the present invention is not limited thereto. For example, a detection range of another shape or size may be set, as in an example depicted in the accompanying drawings.
Also, in the second embodiment, the CPU 10 detects a music note subjected to a touch operation by using only the distance from the touched point. However, the present invention is not limited thereto, and a configuration may be adopted in which a music note subjected to a touch operation is detected further considering the sound emission time (note-on timing) of the music note.
Next, a modification example of the second embodiment is described with reference to the accompanying drawings.
When musical performance processing according to the modification example starts, the CPU 10 proceeds to Step SD1 depicted in the accompanying drawings.
Next at Step SD3, the CPU 10 performs positional conversion of the touched point obtained at Step SD2 in consideration of a current musical score display magnification to calculate a bit map coordinate value in the musical score image. If the musical score display is to be at unity magnification, the bit map coordinate values represent the same position. In the case of double zoomed-in display, the bit map coordinate values represent a position obtained by adding a display offset and a half of the touched point together. Then, the CPU 10 proceeds to Step SD4, and calculates a distance between the bit map coordinate value obtained at Step SD3 and the position of each music note stored in the music note position storage area of the RAM 12.
Subsequently at Step SD5, the CPU 10 judges whether the calculated distance is within the staff width. If the distance is within the staff width, the judgment result is “YES”, and therefore the CPU 10 proceeds to Step SD6. At Step SD6, the CPU 10 instructs the sound source 16 to perform note-on based on the music note data (MIDI data) associated with the relevant music note and then proceeds to Step SD10. In the case of a chord in which a plurality of music notes gather at one position in the staff width, the CPU 10 causes the sound source 16 to simultaneously emit a plurality of sounds based on the music note data (MIDI data) associated with each music note configuring the chord.
Conversely, if the distance is not within the staff width, the judgment result at Step SD5 is “NO”, and therefore the CPU 10 proceeds to Step SD7. At Step SD7, the CPU 10 judges whether a slide operation whose difference from the previous touched point in a horizontal direction is within the staff width has been performed. If a slide operation whose difference from the previous touched point in the horizontal direction is within the staff width has been performed, or in other words, if a slide operation in a height direction has been performed, the judgment result is “YES”, and therefore the CPU 10 proceeds to Step SD10 described further below. Therefore, if the user performs a slide operation in the height direction with the sound of the music note within the staff width being emitted by the previous touch operation, the sound is continuously emitted without being muted.
Conversely, if a slide operation whose movement amount in the horizontal direction from the previous touched point exceeds the staff width has been performed, the judgment result at Step SD7 is “NO”, and therefore the CPU 10 proceeds to Step SD8. At Step SD8, the CPU 10 judges whether the sound of the music note is being emitted. If the sound is not being emitted, the judgment result is “NO”, and therefore the CPU 10 proceeds to Step SD10. Conversely, if the sound of the music note is being emitted, the judgment result is “YES”, and therefore the CPU 10 proceeds to Step SD9. At Step SD9, the CPU 10 instructs the sound source 16 to perform note-off based on the music note data (MIDI data) associated with the relevant music note and then proceeds to Step SD10.
Subsequently at Step SD10, the CPU 10 judges whether distance calculation for the last music note in the musical piece has been completed. If distance calculation has not been completed, the judgment result is “NO”, and therefore the CPU 10 repeats the processing at Step SD4 and the following processing. If distance calculation for the last music note in the musical piece has been completed, the judgment result at Step SD10 is “YES”, and therefore the CPU 10 returns to Step SD2.
As such, in the musical performance processing according to the modification example, when the user performs a touch operation on a musical score image displayed on the screen, the CPU 10 sets a circular detection range whose radius centering on the touched point is equal to the staff width. Then, from among the positions of respective music notes stored in the music note position storage area of the RAM 12, a music note included in the circular detection range is taken as the music note subjected to the touch operation, and the sound of the music note is emitted. When the user releases the touch operation, the music note whose sound is being emitted falls outside the circular detection range and is muted. Also, if the user performs a slide operation in the height direction with the sound of a music note within the staff width being emitted by the previous touch operation, the sound is continuously emitted without being muted.
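The slide judgment at Step SD7 reduces to a one-line test, sketched below under the assumption that touch points are (x, y) pairs in bitmap coordinates; the function name is illustrative.

```python
def is_height_direction_slide(prev_touch, cur_touch, staff_width):
    """Step SD7 sketch: a slide whose horizontal travel from the previous
    touched point stays within the staff width counts as a slide in the
    height direction, so the current note keeps sounding; a wider horizontal
    slide instead falls through to the note-off path (Steps SD8/SD9)."""
    return abs(cur_touch[0] - prev_touch[0]) <= staff_width
```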
In this modification example, a sound is continuously emitted by a slide operation in the height direction. However, automatic musical performance may be performed by a touch operation. Although the position of a music note on a musical score does not strictly coincide with the sound emission time of the music note data, this correspondence can be used as a rough guideline. Accordingly, a ratio between a measure width and the position of a touch operation is converted to a musical performance time length in the measure, and the music note data of music notes included in the musical performance time length obtained by the conversion are subjected to automatic musical performance. For example, when a touch operation is performed on point A in the first measure of the example of a musical score depicted in the accompanying drawings, the music notes from the head of the measure up to the time position corresponding to point A are automatically played.
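A minimal sketch of this ratio conversion, assuming the measure's left edge and width in bitmap pixels and its duration in MIDI ticks are known; the clamping to the measure and the names are illustrative assumptions.

```python
def touch_to_performance_length(touch_x: float, measure_left: float,
                                measure_width: float, measure_ticks: int) -> float:
    """Convert the horizontal touch position within a measure to a musical
    performance time length in MIDI ticks. Music notes whose emission time
    falls within the returned length are then played automatically."""
    ratio = (touch_x - measure_left) / measure_width
    return max(0.0, min(1.0, ratio)) * measure_ticks
```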
Also, in the first and second embodiments and the modification example, a music note in a musical score image displayed on a screen is specified by a touch panel operation. However, the present invention is not limited thereto, and a configuration may be adopted in which a keyboard image is displayed below the musical score, and the position of a key corresponding to a currently specified music note is displayed as a guide, as in an example depicted in the accompanying drawings.
Moreover, in the modification example, the CPU 10 may detect that the user has performed a slide operation over a bar line, as in an example depicted in the accompanying drawings.
While the present invention has been described with reference to the preferred embodiments, it is intended that the invention not be limited by any of the details of the description therein but include all embodiments which fall within the scope of the appended claims.
Number | Date | Country | Kind
---|---|---|---
2012-099643 | Apr. 25, 2012 | JP | national