The disclosure pertains to special effects devices for musical instruments. More particularly, the disclosure pertains to special effects devices producing a rhythmic effect from an electrical signal produced by a musical instrument with an electrical output, such as an electric guitar, or by an acoustic instrument or voice where the acoustic signal is transformed into an electrical signal with a microphone.
There are many effects devices currently available in the market that create rhythmic effects based on the input signal. The most common device in this category is the delay effect, which replays the input signal, with or without further processing, after a period of time and at a specified volume. The first delay effects were created using magnetic tape recorders and used the length of the tape loop and position of the tape heads to affect the type of delay sound that was created. Eventually, delay effects were created using analog bucket brigade delays, and finally with digital electronics. While the earliest tape-based delays were difficult to reconfigure without significant effort, the use of digital signal processors to create delays has led to delay effects processors with many parameters that can be manually adjusted to create the desired sound. For example, in some delay processors, many copies of the original input signal can be created and played back at various delay times by using more than one delay sub-system. Setting up the time and level parameters for all these delay sub-systems can be extremely complex and time-consuming. There exist some delay processors that can set a single time (tempo) by detecting the beat of an input signal (such as a guitar strum); however, these processors cannot set complex rhythms using this method.
In some examples, rhythmic pattern recognition systems are provided that automatically set up timing and level information corresponding to a specific rhythm pattern by analyzing an input signal from a musical instrument, such as, for example, an electric guitar. In this way, a musician does not need to program the rhythm by setting the individual delay times and levels of each beat of a complex rhythm. Instead, the musician can simply play one or more repetitions of the rhythm pattern into the system and the corresponding delay times and levels required to recreate the rhythm will be set automatically.
In some examples, methods for determining rhythms include receiving a digital audio signal from a musical instrument and analyzing the received digital audio signal to detect events in the digital audio signal. Events are selected from the detected events based on one or more associated event scores, and a rhythmic pattern is extracted based on the selected events. Typically the rhythmic pattern is communicated to a musical device that is configured to produce a corresponding rhythm audio signal based on the rhythmic pattern. Typically, the rhythmic pattern is communicated to a delay effect processor, a drum machine, or a sequencer.
In some examples, the extracted rhythmic pattern includes a pattern period, beat time locations within the rhythmic pattern, and beat levels. In some embodiments, events are detected based on energy in the digital audio signal in at least one frequency band. In some examples, events are detected by estimating an energy envelope of the received digital audio signal in one or more frequency bands and estimating a derivative of the energy envelope in one or more frequency bands. At least two peaks are identified in the derivative, and the at least two peaks are associated with corresponding events, wherein timing information is extracted based on times associated with the peaks. According to some embodiments, an event time and level are established by searching forward in the energy envelope from at least one derivative peak. In additional examples, event detection includes scoring events based on at least one of a derivative peak level and an envelope peak level, and pruning events with scores less than a minimum allowable score. In further representative examples, the rhythmic pattern is extracted by locating at least one pattern period repetition.
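As a rough illustration of the envelope-and-derivative event detection summarized above, the following Python sketch uses a single full-band energy follower. The names (Event, energy_envelope, detect_events), the one-pole smoothing coefficients, the forward search window, and the use of the derivative peak height as the event score are assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    index: int    # sample index of the envelope peak (event time)
    level: float  # envelope value at that peak (event level)
    score: float  # derivative peak height, used here as the score (assumption)


def energy_envelope(samples: List[float], attack: float = 0.5,
                    release: float = 0.01) -> List[float]:
    """One-pole follower on squared samples; coefficients are illustrative."""
    env, state = [], 0.0
    for x in samples:
        energy = x * x
        coeff = attack if energy > state else release
        state += coeff * (energy - state)
        env.append(state)
    return env


def detect_events(samples: List[float], min_score: float = 0.01,
                  search_ahead: int = 64) -> List[Event]:
    """Find derivative peaks, then search forward for the envelope peak."""
    env = energy_envelope(samples)
    # First difference as a simple estimate of the envelope derivative.
    deriv = [0.0] + [env[i] - env[i - 1] for i in range(1, len(env))]
    events = []
    for i in range(1, len(deriv) - 1):
        is_peak = deriv[i] > deriv[i - 1] and deriv[i] >= deriv[i + 1]
        # Prune candidates whose score is below the minimum allowable score.
        if is_peak and deriv[i] >= min_score:
            stop = min(len(env), i + search_ahead)
            peak = max(range(i, stop), key=lambda k: env[k])
            events.append(Event(peak, env[peak], deriv[i]))
    return events


# Two isolated impulses standing in for two strums.
signal = [0.0] * 1000
signal[100] = 1.0
signal[600] = 0.5
events = detect_events(signal)
```

A multi-band variant would run the same steps per frequency band and combine the results; how the bands are combined is not specified here and is left open.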
In some representative methods, the repeating pattern is located by grouping events into candidate repeating periods, matching events between the repeating periods, establishing a cost for each set of matched events based on a temporal distance between the matching events, increasing the cost for each set of matched events based on a number of events that are unmatched between the repeating periods, determining overall costs based on the established costs and the increased costs, and selecting at least one repeating pattern based on the determined costs. In some examples, the pattern analyzer uses dynamic programming to search for the period associated with optimal event matching. In additional examples, the beat locations and levels of the rhythmic pattern are based on the event locations and levels extracted based on the rhythm pattern period. The received audio signal can be produced by a stringed instrument such as a guitar, or by other musical instruments.
Computer-readable storage media are provided having computer-executable instructions for methods that include analyzing a received digital audio signal to detect events in the digital audio signal, selecting events from the detected events based on scores associated with the events, and extracting a rhythmic pattern based on the selected events. In some examples, the method further comprises communicating the rhythmic pattern or storing the rhythmic pattern in a memory.
Apparatus comprise an input configured to receive a digital audio signal from a musical instrument and a processor that is configured to analyze the received digital audio signal to detect events in the digital audio signal, select events from the detected events based on scores associated with the events, and extract a rhythmic pattern based on the selected events. An output is configured to deliver the rhythmic pattern. In some examples, the processor is configured to search for a repeating pattern by grouping events into candidate repeating periods, matching events between the repeating periods, establishing a cost for each set of matched events based on a temporal distance between the matching events, increasing the cost for each set of matched events based on a number of events that are unmatched between the repeating periods, determining overall costs based on the established costs and the increased costs, and selecting at least one repeating pattern based on the determined costs. In other examples, the processor is configured to detect events by estimating an energy envelope of the received digital audio signal in one or more frequency bands, estimating a derivative of the energy envelope in one or more frequency bands, identifying at least two peaks in the derivative, and associating the at least two peaks with corresponding events, wherein timing information is extracted based on times associated with the peaks.
The foregoing and other objects, features, and advantages of the technology will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.
As used in this application and in the claims, the singular forms “a,” “an,” and “the” include the plural forms unless the context clearly dictates otherwise. Additionally, the term “includes” means “comprises.”
The systems, apparatus, and methods described herein should not be construed as limiting in any way. Instead, the present disclosure is directed toward all novel and non-obvious features and aspects of the various disclosed embodiments, alone and in various combinations and sub-combinations with one another. The disclosed systems, methods, and apparatus are not limited to any specific aspect or feature or combinations thereof, nor do the disclosed systems, methods, and apparatus require that any one or more specific advantages be present or problems be solved.
Although the operations of some of the disclosed methods are described in a particular, sequential order for convenient presentation, it should be understood that this manner of description encompasses rearrangement, unless a particular ordering is required by specific language set forth below. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, the attached figures may not show the various ways in which the disclosed systems, methods, and apparatus can be used in conjunction with other systems, methods, and apparatus. Additionally, the description sometimes uses terms like “produce” and “provide” to describe the disclosed methods. These terms are high-level abstractions of the actual operations that are performed. The actual operations that correspond to these terms will vary depending on the particular implementation and are readily discernible by one of ordinary skill in the art.
Theories of operation, scientific principles, or other theoretical descriptions presented herein in reference to the apparatus or methods of this disclosure have been provided for the purposes of better understanding and are not intended to be limiting in scope. The apparatus and methods in the appended claims are not limited to those apparatus and methods which function in the manner described by such theories of operation.
In a typical embodiment, a guitar delay engine such as a multi-tap delay pedal is provided. In other examples, rhythm engines such as vocal multi-tap delay pedals, drum machines, or other devices that provide a rhythmic pattern (i.e., level and timing information) are provided. Typical guitar delay devices are configured so that a guitarist can hold down a foot pedal, play a rhythmic pattern on a guitar, and then release the foot pedal so that the rhythmic pattern is emulated in the delay pattern provided by the guitar delay device.
In some examples, values, procedures, or apparatus are referred to as “lowest,” “best,” “minimum,” or the like. It will be appreciated that such descriptions are intended to indicate that a selection among many used functional alternatives can be made, and such selections need not be better, smaller, or otherwise preferable to other selections.
As used herein, an audio signal can be represented as an analog electrical signal (typically a time-varying voltage or current), a digitization of an analog signal, or an encoded version thereof. For example, audio signals can include analog signals after processing by an analog-to-digital converter, or audio signals processed for representation in an encoded format such as advanced audio coding (AAC) or mp3 format. Encoding can be lossless or lossy.
Events can be identified, extracted, and/or characterized based on one or more scores associated with portions or features of a digital audio signal. For example, scores can be associated with magnitudes of a signal, its derivatives, or other features of the signal including its spectrum and spectral envelope.
Next, the number of events in the pruned list is counted (601). If that number is less than maxBeats+1, then it is assumed that the pattern was entered by the user only once, and therefore a direct method to compute the pattern is used. Note that the criterion for using the direct method could also be controlled by a user parameter or some other mechanism. In one example system, maxBeats is set to 6. The direct method (602) assumes that the first N+1 strums define the N beats of the pattern, and works by computing the difference between the sample index of each event and the sample index of the first event. The last event for which this difference is less than Pmax (the maximum period in samples for a pattern; in one example system this corresponds to 5 s) is labeled E_N+1, and the pattern repeat period is set equal to the difference computed for that event. For each event starting with the second event (E_2) and ending with E_N+1, a new beat B_i is added to the pattern, and the delay time for B_i is set to the difference between the corresponding event sample index and the first event sample index. The level for B_i is set to the envelope peak value for E_i+1.
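The direct method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name direct_pattern, the list-based inputs, the requirement that events be sorted by time, and the 44.1 kHz sample rate used to turn 5 s into samples are all hypothetical choices, not details given in the disclosure.

```python
from typing import List, Tuple


def direct_pattern(event_indices: List[int], event_levels: List[float],
                   p_max: int) -> Tuple[int, List[Tuple[int, float]]]:
    """Return (period, beats); each beat is (delay_in_samples, level).

    Assumes events are sorted by sample index. Beat B_i takes its delay
    and level from event E_(i+1), measured relative to the first event.
    """
    if not event_indices:
        return 0, []
    first = event_indices[0]
    period = 0
    beats = []
    for index, level in zip(event_indices[1:], event_levels[1:]):
        delay = index - first
        if delay >= p_max:
            break  # later events fall outside the maximum pattern period
        beats.append((delay, level))
        period = delay  # the last accepted event defines the repeat period
    return period, beats


SAMPLE_RATE = 44100        # assumed sample rate, for illustration only
P_MAX = 5 * SAMPLE_RATE    # 5 s expressed in samples

period, beats = direct_pattern(
    [1000, 12000, 23000, 45000, 400000],
    [0.9, 0.5, 0.7, 0.6, 0.8],
    P_MAX,
)
# period == 44000
# beats == [(11000, 0.5), (22000, 0.7), (44000, 0.6)]
```

The final event in the usage example lies more than 5 s after the first event, so it is ignored and the fourth event sets the period.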
If the number of events in the pruned list is greater than maxBeats+1, the pattern is extracted using the pattern recognition method under the assumption that it was repeated by the user two or more times (603). The pattern recognition flow chart is shown in
Next, for each candidate period, an attempt is made to match each event with its corresponding event in a subsequent repetition of the pattern. If the input, for example, consists of 6 events in which the first 3 events define the first repetition of the pattern and the second 3 events define the second repetition of the pattern, then event 1 would be matched with event 4, event 2 would be matched with event 5, and event 3 would be matched with event 6. However, to be useful the system must be robust to the various sources of error in entering the pattern, for example, errors in timing, missing beats, and extra erroneous beats that made it through the initial pruning process. A method based on dynamic programming has therefore been developed to create an optimal set of event matches by minimizing a cost function.
First, a path is defined as a set of event pairings for each event. The first event in a pairing is the source event, and the second event in a pairing is the target event. Note that not every event necessarily will be a source event or a target event. For example, some events will be considered erroneous and will not be matched with any other peaks. Other events may be target events, but may also be in the last pattern repetition and therefore they will not be source events.
Next, a cost function is defined as follows: C(E_i,E_j)=min(MaxCost, |ndx_i+Pc−ndx_j|)+SourceSkipPenalty+TargetSkipPenalty where:
The dynamic programming algorithm (701) proceeds as follows:
The cost for each candidate period is computed as follows: First, the candidate period is refined by computing the average distance between each pair of matched peaks. The cost function for each pair is also refined by re-computing the cost with the new candidate period. Finally, a total cost for each candidate period is computed (702) by computing the sum of the squared costs for each pair, and dividing that by the number of events that were selected as source events.
Once the best path for each candidate period is computed, the candidate with the lowest associated cost is selected (703).
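The period refinement, total cost, and lowest-cost selection can be sketched as follows. This is an illustrative sketch only: it assumes the matching step has already produced (source index, target index) pairs for each candidate period, that each pair has a distinct source event, and that skip penalties are supplied as a single optional value per pair (zero here). The function names and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (source sample index, target sample index)


def pair_cost(src: int, tgt: int, period: float, max_cost: float,
              skip_penalty: float = 0.0) -> float:
    """Clamped timing error plus any skip penalty (penalty is an assumption)."""
    return min(max_cost, abs(src + period - tgt)) + skip_penalty


def period_cost(pairs: List[Pair], max_cost: float) -> Tuple[float, float]:
    """Return (refined_period, total_cost) for one candidate period."""
    if not pairs:
        return 0.0, float("inf")
    # Refine the period as the average distance between matched events.
    refined = sum(t - s for s, t in pairs) / len(pairs)
    # Re-compute each pair cost with the refined period, then average squares.
    squared = sum(pair_cost(s, t, refined, max_cost) ** 2 for s, t in pairs)
    return refined, squared / len(pairs)


def select_period(matches: Dict[int, List[Pair]],
                  max_cost: float) -> Tuple[int, float, float]:
    """Pick the candidate whose matched pairs give the lowest total cost."""
    results = {pc: period_cost(pairs, max_cost) for pc, pairs in matches.items()}
    best = min(results, key=lambda pc: results[pc][1])
    return best, results[best][0], results[best][1]


matches = {
    1000: [(0, 1000), (250, 1260), (500, 1490)],  # consistent spacing
    500: [(0, 250), (250, 500), (500, 1000)],     # inconsistent spacing
}
best, refined, cost = select_period(matches, max_cost=500.0)
```

In this example the costs are expressed in samples; any normalization applied before comparing a cost against a tolerance is not covered by the sketch.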
Now that there is a set of optimal matched events, the next step is to derive the rhythm pattern delays and levels (704). This is not necessarily straightforward because the dynamic programming algorithm for peak assignment allows the flexibility of having one or more peaks missing from one or more of the pattern repetitions. The algorithm for defining the rhythm pattern proceeds as follows:
At this point, a final check is done to see whether the cost associated with the pattern derived using the pattern recognition method is greater than a threshold (PatternErrTol, which in one example system is set to 0.05). If that is the case, the error in using the complex pattern is considered to be too high, and instead the direct method described above is used to compute the rhythm pattern.
Note that there are two other steps that can be used to restrict the patterns in order to make them more musical as well as more robust to human error:
Once the rhythm pattern has been determined, the delay times and levels for each tap in the delay engine are set to match each beat in the rhythm pattern. In the stereo setup used for the delay engine, taps can be set to have the same value on the left and right channel (i.e. center pan), or the taps could be set to alternate left and right based on a system or user preference. It should be clear that the taps may be assigned to the left and right channels using alternate criteria and/or parameters.
A delay engine can be implemented in a variety of ways. The description below defines a typical delay engine, but the exact implementation is not critical to the claims of this invention. The delay engine uses a circular buffer to implement the delay effect. A circular buffer is a signal processing construct that writes audio into a fixed size buffer at a location defined by the write pointer in a circular manner such that when the buffer is full, the write pointer wraps back to the start of the buffer. There are also one or more read pointers that read audio out of the circular buffer at a specified delay from the write pointer. Each read pointer also wraps at the boundaries of the circular buffer such that audio is always read from a valid location in the fixed size audio buffer. The audio read out of the circular buffer may optionally be written back into the circular buffer at the position of the write pointer with a user or system defined gain to produce a feedback loop. There is also a user or system defined output gain associated with each read pointer such that the audio from each read pointer is multiplied by said output gain and mixed together to form the wet delay signal. The wet delay signal is then multiplied by an overall wet gain before being mixed with the dry (i.e., the input) signal to produce the output of the delay engine. In the present invention, the output gain and delay associated with each read pointer are set using the rhythm pattern extraction as described above.
This section describes various examples of methods and apparatus embodying the disclosed technology. The following descriptions include the step of setting level information from the derived rhythm pattern; however, it should be noted that setting level information is an enhancement and is not necessary to create useful rhythm information for the output rhythm engine.
The first example is a method where an audio signal is analyzed to extract timing and level information of events for use in a rhythm engine. To accomplish this, the method would locate the events in the audio signal, compute the level and timing of these events, determine the most likely rhythm pattern from the level and timing information, and finally set the timing and level information of the determined pattern in a rhythm engine.
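A delay engine of the kind described above can be sketched as a small Python class. This is a minimal mono sketch, not a definitive implementation: the class name MultiTapDelay, the sample-by-sample process method, the requirement that each tap delay be at least one sample, and the choice to feed the mixed tap signal back into the buffer are illustrative assumptions.

```python
from typing import Iterable, List, Tuple


class MultiTapDelay:
    """Circular-buffer delay with one read pointer (tap) per rhythm beat."""

    def __init__(self, max_delay: int, taps: Iterable[Tuple[int, float]],
                 feedback: float = 0.0, wet_gain: float = 1.0) -> None:
        self.buffer: List[float] = [0.0] * (max_delay + 1)
        self.write = 0
        self.taps = list(taps)  # (delay_in_samples, output_gain) per tap
        for delay, _gain in self.taps:
            if not 1 <= delay <= max_delay:
                raise ValueError("tap delay must be in 1..max_delay")
        self.feedback = feedback
        self.wet_gain = wet_gain

    def process(self, x: float) -> float:
        """Process one input sample and return one output sample."""
        size = len(self.buffer)
        wet = 0.0
        for delay, gain in self.taps:
            # Read pointer trails the write pointer and wraps at the boundary.
            wet += gain * self.buffer[(self.write - delay) % size]
        # Write the input plus optional feedback, then advance with wrap-around.
        self.buffer[self.write] = x + self.feedback * wet
        self.write = (self.write + 1) % size
        # Mix the dry input with the scaled wet signal.
        return x + self.wet_gain * wet


# Two taps, as if derived from a two-beat rhythm pattern.
engine = MultiTapDelay(max_delay=5, taps=[(3, 0.5), (5, 0.25)])
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
output = [engine.process(s) for s in impulse]
# output == [1.0, 0.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.0]
```

In this sketch, each (delay, level) beat of an extracted rhythm pattern would map to one (delay, gain) tap. A stereo version could keep separate left and right tap lists to support center or alternating panning.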
Typical input signals include electric guitar signals or acoustic guitar signals (from a mic or a pickup). Vocal signals and other acoustic signals can also be used as the input signal where the vocal signal or other acoustic signal is picked up using a microphone.
One example of a typical rhythm engine is a delay engine in which the input is played back at a delay time related to the timing information of the rhythm pattern, and at a level related to the level information of the rhythm pattern. Another example is a synthesizer or music generation system in which a different signal from the input is played back one or more times in sequence where the playback times are related to the timing information of the derived rhythm pattern and the playback levels are related to the level information of the derived pattern.
One specific embodiment of this invention is a guitar delay pedal in which an electric guitar is plugged into the pedal as an input. An analog to digital converter is used to convert the input signal into a digital signal. The rhythm pattern recognition system is implemented as machine code that runs on a DSP. When the user depresses a switch, for example on a foot pedal, the DSP analyzes the input signal and detects the rhythm pattern that the user is playing according to the described invention. When the player releases the pedal, the delays are configured according to the rhythm pattern. Another switch or foot pedal could be used to enable or disable the delay effect. Another specific embodiment of this invention is a drum machine that derives the rhythm pattern in the same way as the guitar delay pedal, but instead of setting delay times and levels, the drum machine creates a drum pattern with various drum sounds corresponding to the recognized beats. The drum pattern would be repeated, resulting in a drum pattern that matched the rhythm specified by the user. Different types of drum sounds could be selected using, for example, a knob on the user interface.
It should be noted that the automatic rhythm recognition system could be implemented entirely in software as machine code, and could be used in the form of a computer program that can run on a general purpose computer. The computer program could be a stand-alone application or a software plug-in that runs within the environment of another computer program. In these cases, the rhythm recognition system could work on input signals that are stored on disk or other computer media. The output of the system could be the creation of an output audio file, or, for example, control information such as MIDI information containing the timing and level information of the derived rhythm pattern.
With reference to
The exemplary PC 800 further includes one or more storage devices 830 such as a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a removable magnetic disk, and an optical disk drive for reading from or writing to a removable optical disk (such as a CD-ROM or other optical media). Such storage devices can be connected to the system bus 806 by a hard disk drive interface, a magnetic disk drive interface, and an optical drive interface, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules, and other data for the PC 800. Other types of computer-readable media which can store data that is accessible by a PC, such as magnetic cassettes, flash memory cards, digital video disks, CDs, DVDs, RAMs, ROMs, and the like, may also be used in the exemplary operating environment.
A number of program modules may be stored in the storage devices 830 including an operating system, one or more application programs, other program modules, and program data. A user may enter commands and information into the PC 800 through one or more input devices 840 such as a keyboard and a pointing device such as a mouse. Other input devices may include a digital camera, microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the one or more processing units 802 through a serial port interface that is coupled to the system bus 806, but may be connected by other interfaces such as a parallel port, game port, or universal serial bus (USB). A monitor 846 or other type of display device is also connected to the system bus 806 via an interface, such as a video adapter. Other peripheral output devices, such as speakers and printers (not shown), may be included.
The PC 800 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 860. In some examples, one or more network or communication connections 850 are included. The remote computer 860 may be another PC, a server, a router, a network PC, or a peer device or other common network node, and typically includes many or all of the elements described above relative to the PC 800, although only a memory storage device 862 has been illustrated in
When used in a LAN networking environment, the PC 800 is connected to the LAN through a network interface. When used in a WAN networking environment, the PC 800 typically includes a modem or other means for establishing communications over the WAN, such as the Internet. In a networked environment, program modules depicted relative to the personal computer 800, or portions thereof, may be stored in the remote memory storage device or other locations on the LAN or WAN. The network connections shown are exemplary, and other means of establishing a communications link between the computers may be used.
In view of the many possible embodiments to which the principles of the disclosed technology may be applied, it should be recognized that the illustrated embodiments are only preferred examples and should not be taken as limiting in scope.
This application claims the benefit of U.S. Provisional Application 61/186,351, filed Jun. 11, 2009, which is incorporated herein by reference.
Number | Date | Country
---|---|---
61186351 | Jun 2009 | US