The present invention relates in general to audio systems and, more particularly, to an audio system and method of using adaptive intelligence to distinguish dynamic content of an audio signal generated by a musical instrument and control a signal process function associated with the audio signal.
Audio sound systems are commonly used to amplify signals and reproduce audible sound. A sound generation source, such as a musical instrument, microphone, multi-media player, or other electronic device generates an electrical audio signal. The audio signal is routed to an audio amplifier, which controls the magnitude and performs other signal processing on the audio signal. The audio amplifier can perform filtering, modulation, distortion enhancement or reduction, sound effects, and other signal processing functions to enhance the tonal quality and frequency properties of the audio signal. The amplified audio signal is sent to a speaker to convert the electrical signal to audible sound and reproduce the sound generation source with enhancements introduced by the signal processing function.
Musical instruments have always been popular in society, providing entertainment, social interaction, self-expression, and a business and source of livelihood for many people. String instruments are especially popular because of their playability, tonal properties, and portability. String instruments are enjoyable yet challenging to play, have great sound qualities, and are easy to move from one location to another.
In one example, the sound generation source may be an electric guitar or electric bass guitar, which is a well-known musical instrument. The guitar has an audio output which is connected to an audio amplifier. The output of the audio amplifier is connected to a speaker to generate audible musical sounds. In some cases, the audio amplifier and speaker are separate units. In other systems, the units are integrated into one portable chassis.
The electric guitar typically requires an audio amplifier to function. Other guitars use the amplifier to enhance the sound. The guitar audio amplifier provides features such as amplification, filtering, tone equalization, and sound effects. The user adjusts the knobs on the front panel of the audio amplifier to dial in the desired volume, acoustics, and sound effects.
However, most if not all audio amplifiers are limited in the features that each can provide. High-end amplifiers offer higher quality sound reproduction and a variety of signal processing options, but are generally expensive and difficult to transport. The speaker is typically a separate unit from the amplifier in high-end gear. A low-end amplifier may be more affordable and portable, but has limited sound enhancement features. There are few amplifiers for the low to medium end consumer market which provide full features, easy transportability, and low cost.
In audio reproduction, it is common to use a variety of signal processing techniques depending on the music and playing style to achieve better sound quality, playability, and otherwise enhance the artist's creativity, as well as the listener's enjoyment and appreciation of the composition. For example, guitar players use a large selection of audio amplifier settings and sound effects for different music styles. Bass players use different compressors and equalization settings to enhance sound quality. Singers use different reverb and equalization settings depending on the lyrics and melody of the song. Music producers use post processing effects to enhance the composition. For home and auto sound systems, the user may choose different reverb and equalization presets to optimize the reproduction of classical or rock music.
Audio amplifiers and other signal processing equipment, e.g., dedicated amplifier, pedal board, or sound rack, are typically controlled with front panel switches and control knobs. To accommodate the processing requirements for different musical styles, the user listens and manually selects the desired functions, such as amplification, filtering, tone equalization, and sound effects, by setting the switch positions and turning the control knobs. When changing playing styles or transitioning to another melody, the user must temporarily suspend play to make adjustments to the audio amplifier or other signal processing equipment. In some digital or analog instruments, the user can configure and save preferred settings as presets and then later manually select the saved settings or factory presets for the instrument.
In professional applications, a technician can make adjustments to the audio amplifier or other signal processing equipment while the artist is performing, but the synchronization between the artist and technician is usually less than ideal. As the artist changes attack on the strings or vocal content or starts a new composition, the technician must anticipate the artist's action and make manual adjustments to the audio amplifier accordingly. In most if not all cases, the audio amplifier is rarely optimized to the musical sounds, at least not on a note-by-note basis.
A need exists to dynamically control an audio amplifier or other signal processing equipment in realtime. Accordingly, in one embodiment, the present invention is an audio system comprising a signal processor coupled for receiving an audio signal. The dynamic content of the audio signal controls operation of the signal processor.
In another embodiment, the present invention is a method of controlling an audio system comprising the steps of providing a signal processor adapted for receiving an audio signal, and controlling operation of the signal processor using dynamic content of the audio signal.
In another embodiment, the present invention is an audio system comprising a signal processor coupled for receiving an audio signal. A time domain processor receives the audio signal and generates time domain parameters of the audio signal. A frequency domain processor receives the audio signal and generates frequency domain parameters of the audio signal. A signature database includes a plurality of signature records each having time domain parameters and frequency domain parameters and control parameters. A recognition detector matches the time domain parameters and frequency domain parameters of the audio signal to a signature record of the signature database. The control parameters of the matching signature record control operation of the signal processor.
In another embodiment, the present invention is a method of controlling an audio system comprising the steps of providing a signal processor adapted for receiving an audio signal, generating time domain parameters of the audio signal, generating frequency domain parameters of the audio signal, providing a signature database including a plurality of signature records each having time domain parameters and frequency domain parameters and control parameters, matching the time domain parameters and frequency domain parameters of the audio signal to a signature record of the signature database, and controlling operation of the signal processor based on the control parameters of the matching signature record.
FIGS. 7a-7b illustrate waveform plots of the audio signal;
FIGS. 9a-9b illustrate time sequence frames of the sampled audio signal;
FIGS. 31a-31b illustrate time sequence frames of the sampled audio signal;
The present invention is described in one or more embodiments in the following description with reference to the figures, in which like numerals represent the same or similar elements. While the invention is described in terms of the best mode for achieving the invention's objectives, it will be appreciated by those skilled in the art that it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims and their equivalents as supported by the following disclosure and drawings.
Referring to
Further detail of front control panel 30 of audio system 32 is shown in
The programmable control panel 54 includes LCD 60, functional mode buttons 62, selection buttons 64, and adjustment knob or data wheel 66. The functional mode buttons 62 and selection buttons 64 are elastomeric rubber pads for soft touch and long life. Alternatively, the buttons may be hard plastic with tactile feedback micro-electronic switches. Audio system 32 is fully programmable, menu driven, and uses software to configure and control the sound reproduction features. The combination of functional mode buttons 62, selection buttons 64, and data wheel 66 provides user interface control over the different operational modes, access to menus for selecting and editing functions, and configuration of audio system 32. The programmable control panel 54 of audio system 32 may also include LEDs as indicators for sync/tap, tempo, save, record, and power functions.
In general, programmable control panel 54 is the user interface to the fully programmable, menu driven configuration and control of the electrical functions within audio system 32. LCD 60 changes with the user selections to provide many different configuration and operational menus and options. The operating modes may include startup and self-test, play, edit, utility, save, and tuner. In one operating mode, LCD 60 shows the playing mode of audio system 32. In another operating mode, LCD 60 displays the MIDI data transfer in process. In another operating mode, LCD 60 displays default settings and presets. In yet another operating mode, LCD 60 displays a tuning meter.
Turning to
In audio reproduction, it is common to use a variety of signal processing techniques depending on the content of the audio source, e.g., performance or playing style, to achieve better sound quality, playability, and otherwise enhance the artist's creativity, as well as the listener's enjoyment and appreciation of the composition. For example, bass players use different compressors and equalization settings to enhance sound quality. Singers use different reverb and equalization settings depending on the lyrics and melody of the song. Music producers use post processing effects to enhance the composition. For home and auto sound systems, the user may choose different reverb and equalization presets to optimize the reproduction of classical or rock music.
To accommodate the signal processing requirements in accordance with the dynamic content of the audio source, audio amplifier 90 employs a dynamic adaptive intelligence feature involving frequency domain analysis and time domain analysis of the audio signal on a frame-by-frame basis and automatically and adaptively controls operation of the signal processing functions and settings within the audio amplifier to achieve an optimal sound reproduction. Each frame contains a predetermined number of samples of the audio signal, e.g., 32-1024 samples per frame. Each incoming frame of the audio signal is detected and analyzed on a frame-by-frame basis to determine its time domain and frequency domain content and characteristics. The incoming frames of the audio signal are compared to a database of established or learned note signatures to determine a best match or closest correlation of the incoming frame to the database of note signatures. The note signatures from the database contain control parameters to configure the signal processing components of audio amplifier 90. The best matching note signature controls audio amplifier 90 in realtime to continuously and automatically make adjustments to the signal processing functions for an optimal sound reproduction. For example, based on the note signature, the amplification of the audio signal can be increased or decreased automatically for that particular frame of the audio signal. Presets and sound effects can be engaged or removed automatically for the note being played. The next frame in sequence may be associated with the same note which matches with the same note signature in the database, or the next frame in sequence may be associated with a different note which matches with a different corresponding note signature in the database. Each frame of the audio signal is recognized and matched to a note signature that in turn controls operation of the signal processing function within audio amplifier 90 for optimal sound reproduction. The signal processing function of audio amplifier 90 is adjusted in accordance with the best matching note signature corresponding to each individual incoming frame of the audio signal to enhance its reproduction.
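As an illustrative sketch of the framing step only, not the amplifier's actual implementation, the following Python fragment shows one way the sampled audio signal might be split into fixed-size frames for the frame-by-frame analysis described above; the function name and the 256-sample frame size are hypothetical choices from the 32-1024 range given above.

```python
import numpy as np

def split_into_frames(audio, frame_size=256):
    """Split a 1-D sampled audio signal into consecutive analysis frames.

    frame_size is a hypothetical choice from the 32-1024 samples-per-frame
    range; trailing samples that do not fill a complete frame are dropped.
    """
    n_frames = len(audio) // frame_size
    return audio[:n_frames * frame_size].reshape(n_frames, frame_size)

# Usage: one second of audio sampled at 48 kHz yields 187 frames of 256 samples
audio = np.random.randn(48000)          # stand-in for the sampled audio signal
frames = split_into_frames(audio)       # shape (187, 256)
```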
The adaptive intelligence feature of audio amplifier 90 can learn attributes of each note of the audio signal and make adjustments based on user feedback. For example, if the user desires more or less amplification or equalization, or insertion of a particular sound effect for a given note, then audio amplifier 90 builds those user preferences into the control parameters of the signal processing function to achieve the optimal sound reproduction. The database of note signatures with correlated control parameters makes realtime adjustments to the signal processing function. The user can define audio modules, effects, and settings which are integrated into the database of audio amplifier 90. With adaptive intelligence, audio amplifier 90 can detect and automatically apply tone modules and settings to the audio signal based on the present note signature. Audio amplifier 90 can interpolate between similar matching note signatures as necessary to select the best choice for the instant signal processing function.
Continuing with
The pre-filter block 92, pre-effects block 94, non-linear effects block 96, user-defined modules 98, post-effects block 100, post-filter block 102, and power amplification block 104 within audio amplifier 90 are selectable and controllable with front control panel 30 in
The audio signal can originate from a variety of audio sources, such as musical instruments or vocals. The instrument can be an electric guitar, bass guitar, violin, horn, brass, drums, wind instrument, piano, electric keyboard, percussions, or other instruments capable of generating electric signals representative of sound content. The audio signal can originate from an audio microphone handled by a male or female with voice ranges including soprano, mezzo-soprano, contralto, tenor, baritone, and bass. In the present discussion, the instrument is guitar 20, more specifically an electric bass guitar. When exciting strings 24 of bass guitar 20 with the musician's finger or guitar pick, the string begins a strong vibration or oscillation that is detected by pickup 22. The string vibration attenuates over time and returns to a stationary state, assuming the string is not excited again before the vibration ceases. The initial excitation of strings 24 is known as the attack phase. The attack phase is followed by a sustain phase during which the string vibration remains relatively strong. A decay phase follows the sustain phase as the string vibration attenuates, and finally a release phase as the string returns to a stationary state. Pickup 22 converts string oscillations during the attack phase, sustain phase, decay phase, and release phase to an electrical signal, i.e., the analog audio signal, having an initial and then decaying amplitude at a fundamental frequency and harmonics of the fundamental.
The artist can use a variety of playing styles when playing bass guitar 20. For example, the artist can place his or her hand near the neck pickup or bridge pickup and excite strings 24 with a finger pluck, known as “fingering style”, for modern pop, rhythm and blues, and avant-garde styles. The artist can slap strings 24 with the fingers or palm, known as “slap style”, for modern jazz, funk, rhythm and blues, and rock styles. The artist can excite strings 24 with the thumb, known as “thumb style”, for Motown rhythm and blues. The artist can tap strings 24 with two hands, each hand fretting notes, known as “tapping style”, for avant-garde and modern jazz styles. In other playing styles, artists are known to use fingering accessories such as a pick or stick. In each case, strings 24 vibrate with a particular amplitude and frequency and generate a unique audio signal in accordance with the string vibration phases, such as shown in
The time domain analysis block 122 of
Equation (1) provides another illustration of the operation of blocks 138-142.
g(m,n)=max(0,[E(m,n)/E(m,n−1)]−1) (1)
where: E(m,n) is the energy level of frequency band m in frame n, and g(m,n) is the onset detection function evaluated for each frequency band 1-m and each frame 1-n
The function g(m,n) has a value for each frequency band 1-m and each frame 1-n. If the ratio of E(m,n)/E(m,n−1), i.e., the energy level of band m in frame n to the energy level of band m in frame n−1, is less than one, then [E(m,n)/E(m,n−1)]−1 is negative. The energy level of band m in frame n is not greater than the energy level of band m in frame n−1. The function g(m,n) is zero indicating no initiation of the attack phase and therefore no detection of the onset of a note. If the ratio of E(m,n)/E(m,n−1), i.e., the energy level of band m in frame n to the energy level of band m in frame n−1, is greater than one (say value of two), then [E(m,n)/E(m,n−1)]−1 is positive, i.e., value of one. The energy level of band m in frame n is greater than the energy level of band m in frame n−1. The function g(m,n) is the positive value of [E(m,n)/E(m,n−1)]−1 indicating initiation of the attack phase and a possible detection of the onset of a note.
Summer 144 accumulates the difference in energy levels E(m,n) of each frequency band 1-m of frame n and frame n−1. The onset of a note will occur when the total of the differences in energy levels E(m,n) across the entire monitored frequency bands 1-m for the sampled audio signal 118 exceeds a predetermined threshold value. Comparator 146 compares the output of summer 144 to a threshold value 148. If the output of summer 144 is greater than threshold value 148, then the accumulation of differences in the energy levels E(m,n) over the entire frequency spectrum for the sampled audio signal 118 exceeds the threshold value 148 and the onset of a note is detected in the instant frame n. If the output of summer 144 is less than threshold value 148, then no onset of a note is detected.
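A minimal Python sketch of equation (1) and the summer/comparator stage might look as follows, assuming the per-band energy levels of two consecutive frames are available as arrays; the function name and threshold handling are hypothetical.

```python
import numpy as np

def onset_detected(E_prev, E_curr, threshold):
    """Onset detection per equation (1).

    E_prev, E_curr: energy levels E(m, n-1) and E(m, n) for frequency
    bands 1-m in consecutive frames. g(m, n) = max(0, E(m,n)/E(m,n-1) - 1)
    is accumulated across all bands (summer 144) and compared against the
    threshold (comparator 146 and threshold value 148).
    """
    eps = 1e-12                                    # guard against division by zero
    g = np.maximum(0.0, E_curr / (E_prev + eps) - 1.0)
    return g.sum() > threshold                     # True: onset of a note detected
```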
At the conclusion of each frame, attack detector 136 will have identified whether the instant frame contains the onset of a note, or whether the instant frame contains no onset of a note. For example, based on the summation of differences in energy levels E(m,n) of the sampled audio signal 118 over the entire spectrum of frequency bands 1-m exceeding threshold value 148, attack detector 136 may have identified frame 1 of
At the conclusion of each frame, attack detector 136 will have identified whether the instant frame contains the onset of a note, or whether the instant frame contains no onset of a note. For example, based on the summation of energy levels E(m,n) of the sampled audio signal 118 within frequency bands 1-m exceeding threshold value 164, attack detector 136 may have identified frame 1 of
Returning to
Repeat gate 168 monitors the number of onset detections occurring within a time period. If multiple onsets of a note are detected within a repeat detection time period, e.g., 50 milliseconds (ms), then only the first onset detection is recorded. That is, any subsequent onset of a note that is detected, after the first onset detection, within the repeat detection time period is rejected.
Noise gate 170 monitors the energy levels E(m,n) of the sampled audio signal 118 about the onset detection of a note. If the energy levels E(m,n) of the sampled audio signal 118 about the onset detection of a note are generally in the low noise range, e.g., the energy levels E(m,n) are −90 dB, then the onset detection is considered suspect and rejected as unreliable.
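The two gates can be sketched together in Python as a simple post-filter over raw onset detections; the list-based interface is an assumption of this sketch, while the 50 ms window and -90 dB floor follow the values given above.

```python
def gate_onsets(onset_times_ms, onset_levels_db,
                repeat_window_ms=50.0, noise_floor_db=-90.0):
    """Apply the repeat gate and noise gate to raw onset detections.

    onset_times_ms: detection times in ms; onset_levels_db: energy level
    (in dB) about each detection. Detections at or below the noise floor
    are rejected as unreliable; within the repeat window only the first
    detection is recorded.
    """
    accepted, last_time = [], None
    for t, level_db in zip(onset_times_ms, onset_levels_db):
        if level_db <= noise_floor_db:
            continue                  # noise gate: suspect detection rejected
        if last_time is not None and t - last_time < repeat_window_ms:
            continue                  # repeat gate: keep only the first onset
        accepted.append(t)
        last_time = t
    return accepted
```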
The time domain analysis block 122 of
Note peak release block 176 uses the energy function E(m,n) to determine the time from the onset detection of a note to a lower energy level during the decay phase or release phase of the note over all frequency bands 1-m, i.e., a summation of frequency bands 1-m. The onset detection of a note is determined by attack detector 136. The lower energy levels are monitored frame-by-frame in peak detectors 132a-132c. In one embodiment, the lower energy level is −3 dB from the peak energy level over all frequency bands 1-m. The note peak release is a time domain parameter or characteristic of each frame n for all frequency bands 1-m and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
Multiband peak attack block 178 uses the energy function E(m,n) to determine the time from the onset detection of a note to the peak energy level of the note during the attack phase or sustain phase of the string vibration prior to the decay of the energy levels for each specific frequency band 1-m. The onset detection of a note is determined by attack detector 136. The peak energy level is the maximum value during the attack phase or sustain phase of the string vibration prior to the decay of the energy levels in each specific frequency band 1-m. The peak energy level is monitored frame-by-frame in peak detectors 132a-132c. The peak energy level may occur in the same frame as the onset detection or in a subsequent frame. The multiband peak attack is a time domain parameter or characteristic of each frame n for each frequency band 1-m and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
Multiband peak release block 180 uses the energy function E(m,n) to determine the time from the onset detection of a note to a lower energy level during the decay phase or release phase of the note in each specific frequency band 1-m. The onset detection of a note is determined by attack detector 136. The lower energy level is monitored frame-by-frame in peak detectors 132a-132c. In one embodiment, the lower energy level is −3 dB from the peak energy level in each frequency band 1-m. The multiband peak release is a time domain parameter or characteristic of each frame n for each frequency band 1-m and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
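The peak attack and release measurements described above can be illustrated with one sketch that measures, per frequency band, the frame count from the onset to the peak energy level and from the onset to the -3 dB point; treating E(m,n) as linear power values and counting in frames rather than seconds are assumptions of this sketch. Summing E over its band axis first gives the whole-note (note peak attack and note peak release) variant.

```python
import numpy as np

def peak_attack_release(E, onset_frame, release_db=-3.0):
    """Per-band peak attack and release, counted in frames from the onset.

    E: energy levels E(m, n), shape (bands, frames), as linear power.
    Attack: offset from the onset to the per-band peak energy level.
    Release: offset from the onset to the first frame that has dropped
    release_db (-3 dB per the text) below that peak.
    """
    attack, release = [], []
    for band in E:
        seg = band[onset_frame:]
        peak = int(np.argmax(seg))                        # peak during attack/sustain
        attack.append(peak)
        target = seg[peak] * 10 ** (release_db / 10.0)    # -3 dB below peak power
        dropped = np.nonzero(seg[peak:] <= target)[0]
        release.append(peak + int(dropped[0]) if dropped.size else len(seg) - 1)
    return np.array(attack), np.array(release)
```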
Slap detector 182 monitors the energy function E(m,n) in each frame 1-n over frequency bands 1-m to determine the occurrence of a slap style event, i.e., the artist has slapped strings 24 with his or her fingers or palm. A slap event is characterized by a sharp spike in the energy level during a frame in the attack phase of the note. For example, a slap event causes a 6 dB spike in energy level over and above the energy level in the next frame in the attack phase. The 6 dB spike in energy level is interpreted as a slap event. The slap detector is a time domain parameter or characteristic of each frame n for all frequency bands 1-m and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
Tempo detector 184 monitors the energy function E(m,n) in each frame 1-n over frequency bands 1-m to determine the time interval between onset detection of adjacent notes, i.e., the duration of each note. The tempo detector is a time domain parameter or characteristic of each frame n for all frequency bands 1-m and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
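Both detectors reduce to short computations over the per-frame energy totals; a hedged Python sketch follows, assuming total energy is already expressed in dB and onsets arrive as frame indices (the names are hypothetical).

```python
import numpy as np

def detect_slap(E_total_db, onset_frame, spike_db=6.0):
    """Slap detector: the onset frame's energy exceeds the next
    attack-phase frame by spike_db (6 dB per the text)."""
    if onset_frame + 1 >= len(E_total_db):
        return False
    return E_total_db[onset_frame] - E_total_db[onset_frame + 1] >= spike_db

def note_durations_ms(onset_frames, frame_period_ms):
    """Tempo detector: time interval between onsets of adjacent notes."""
    return np.diff(onset_frames) * frame_period_ms
```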
The frequency domain analysis block 120 in
where:
The frequency domain analysis block 120 of
The energy levels E(m,n) of one frame n−1 are stored in block 191 of attack detector 192, as shown in
Equation (1) provides another illustration of the operation of blocks 191-194. The function g(m,n) has a value for each frequency bin 1-m and each frame 1-n. If the ratio of E(m,n)/E(m,n−1), i.e., the energy level of bin m in frame n to the energy level of bin m in frame n−1, is less than one, then [E(m,n)/E(m,n−1)]−1 is negative. The energy level of bin m in frame n is not greater than the energy level of bin m in frame n−1. The function g(m,n) is zero indicating no initiation of the attack phase and therefore no detection of the onset of a note. If the ratio of E(m,n)/E(m,n−1), i.e., the energy level of bin m in frame n to the energy level of bin m in frame n−1, is greater than one (say value of two), then [E(m,n)/E(m,n−1)]−1 is positive, i.e., value of one. The energy level of bin m in frame n is greater than the energy level of bin m in frame n−1. The function g(m,n) is the positive value of [E(m,n)/E(m,n−1)]−1 indicating initiation of the attack phase and a possible detection of the onset of a note.
Summer 195 accumulates the difference in energy levels E(m,n) of each frequency bin 1-m of frame n and frame n−1. The onset of a note will occur when the total of the differences in energy levels E(m,n) across the entire monitored frequency bins 1-m for the sampled audio signal 118 exceeds a predetermined threshold value. Comparator 196 compares the output of summer 195 to a threshold value 197. If the output of summer 195 is greater than threshold value 197, then the accumulation of differences in energy levels E(m,n) over the entire frequency spectrum for the sampled audio signal 118 exceeds the threshold value 197 and the onset of a note is detected in the instant frame n. If the output of summer 195 is less than threshold value 197, then no onset of a note is detected.
At the conclusion of each frame, attack detector 192 will have identified whether the instant frame contains the onset of a note, or whether the instant frame contains no onset of a note. For example, based on the summation of differences in energy levels E(m,n) of the sampled audio signal 118 over the entire spectrum of frequency bins 1-m exceeding threshold value 197, attack detector 192 may have identified frame 1 of
At the conclusion of each frame, attack detector 192 will have identified whether the instant frame contains the onset of a note, or whether the instant frame contains no onset of a note. For example, based on the summation of energy levels E(m,n) of the sampled audio signal 118 within frequency bins 1-m exceeding threshold value 200, attack detector 192 may have identified frame 1 of
Returning to
Repeat gate 202 monitors the number of onset detections occurring within a time period. If multiple onsets of a note are detected within the repeat detection time period, e.g., 50 ms, then only the first onset detection is recorded. That is, any subsequent onset of a note that is detected, after the first onset detection, within the repeat detection time period is rejected.
Noise gate 203 monitors the energy levels E(m,n) of the sampled audio signal 118 about the onset detection of a note. If the energy levels E(m,n) of the sampled audio signal 118 about the onset detection of a note are generally in the low noise range, e.g., the energy levels E(m,n) are −90 dB, then the onset detection is considered suspect and rejected as unreliable.
Returning to
Harmonic release ratio block 205 determines a ratio of the energy levels of various frequency harmonics of the frequency domain sampled audio signal 118 during the decay phase or release phase of the note on a frame-by-frame basis. Alternatively, the harmonic release ratio monitors a fundamental frequency and harmonic of the fundamental. In one embodiment, to monitor a slap style, the frequency domain energy level of the sampled audio signal 118 is measured at 200 Hz fundamental of the slap and 4000 Hz harmonic of the fundamental during the release phase of the note. The ratio of frequency domain energy levels 4000/200 Hz during the release phase of the note for each frame 1-n is the harmonic release ratio. Other frequency harmonic ratios in the release phase of the note can be monitored on a frame-by-frame basis. Block 205 determines the rate of change of the energy levels in the harmonic ratio, i.e., how rapidly the energy levels are increasing or decreasing, relative to each frame during the release phase of the note. The harmonic release ratio is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
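For the slap-style example above, the ratio can be sketched as follows; the FFT-based spectrum, bin rounding, and linear-power units are assumptions of this sketch, the frame is assumed long enough to resolve the 200 Hz fundamental, and other fundamental/harmonic pairs can be monitored by changing the two frequencies.

```python
import numpy as np

def harmonic_release_ratio(frame, sample_rate, f_fund=200.0, f_harm=4000.0):
    """Ratio of harmonic to fundamental energy for one release-phase frame.

    frame: one frame of the sampled audio signal. The 200 Hz fundamental
    and 4000 Hz harmonic follow the slap-style example in the text.
    """
    spec = np.abs(np.fft.rfft(frame)) ** 2            # per-bin energy
    n = len(frame)
    bin_fund = int(round(f_fund * n / sample_rate))   # bin of the fundamental
    bin_harm = int(round(f_harm * n / sample_rate))   # bin of the harmonic
    return spec[bin_harm] / (spec[bin_fund] + 1e-12)
```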
Open and mute factor block 206 monitors the energy levels of the frequency domain sampled audio signal 118 for occurrence of an open state or mute state of strings 24. A mute state of strings 24 occurs when the artist continuously presses his or her fingers against the strings, usually near the bridge of guitar 20. The finger pressure on strings 24 rapidly dampens or attenuates string vibration. An open state is the absence of a mute state, i.e., no finger pressure or other artificial dampening of strings 24, so the string vibration naturally decays. In the mute state, the sustain phase and decay phase of the note are significantly shorter, due to the induced dampening, than a natural decay in the open state. A lack of high frequency content and a rapid decrease in the frequency domain energy levels of the sampled audio signal 118 indicate the mute state. High frequency content and a natural decay in the frequency domain energy levels of the sampled audio signal 118 indicate the open state. The open and mute factor is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
Neck and bridge factor block 207 monitors the energy levels of the frequency domain sampled audio signal 118 for occurrence of neck play or bridge play by the artist. Neck play of strings 24 occurs when the artist excites the strings near the neck of guitar 20. Bridge play of strings 24 occurs when the artist excites the strings near the bridge of guitar 20. When playing near the neck, a first frequency notch occurs at about 100 Hz in the frequency domain response of the sampled audio signal 118. When playing near the bridge, a first frequency notch occurs at about 500 Hz in the frequency domain response of the sampled audio signal 118. The occurrence and location of a first notch in the frequency response indicates neck play or bridge play. The neck and bridge factor is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
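One hedged way to locate the first notch is to scan the low end of the magnitude spectrum for its first local minimum and label the frame by whichever of the two notch locations (about 100 Hz for neck play, about 500 Hz for bridge play) is nearer; the 800 Hz search limit and the nearest-target rule are assumptions of this sketch.

```python
import numpy as np

def neck_or_bridge(frame, sample_rate, max_notch_hz=800.0):
    """Classify neck play vs. bridge play from the first spectral notch."""
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    limit = int(np.searchsorted(freqs, max_notch_hz))
    for k in range(1, limit - 1):
        if spec[k] < spec[k - 1] and spec[k] < spec[k + 1]:   # first local minimum
            return 'neck' if abs(freqs[k] - 100.0) < abs(freqs[k] - 500.0) else 'bridge'
    return 'unknown'                                          # no notch below the limit
```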
Pitch detector block 208 monitors the energy levels of the frequency domain sampled audio signal 118 to determine the pitch of the note. Block 208 records the fundamental frequency of the pitch. The pitch detector is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 174 on a frame-by-frame basis.
Runtime matrix 174 contains the frequency domain parameters determined in frequency domain analysis block 120 and the time domain parameters determined in time domain analysis block 122. Each time domain parameter and frequency domain parameter is a numeric parameter value PVn,j stored in runtime matrix 174 on a frame-by-frame basis, where n is the frame and j is the parameter. For example, the note peak attack parameter has value PV1,1 in frame 1, value PV2,1 in frame 2, and value PVn,1 in frame n; note peak release parameter has value PV1,2 in frame 1, value PV2,2 in frame 2, and value PVn,2 in frame n; multiband peak attack parameter has value PV1,3 in frame 1, value PV2,3 in frame 2, and value PVn,3 in frame n; and so on. Table 1 shows runtime matrix 174 with the time domain and frequency domain parameter values PVn,j generated during the runtime analysis. The time domain and frequency domain parameter values PVn,j are characteristic of specific notes and therefore useful in distinguishing between notes.
Table 2 shows one frame of runtime matrix 174 with the time domain and frequency domain parameters generated by frequency domain analysis block 120 and time domain analysis block 122 assigned sample numeric values for an audio signal originating from a fingering style. Runtime matrix 174 contains time domain and frequency domain parameter values PVn,j for other frames of the audio signal originating from the fingering style, as per Table 1.
Table 3 shows one frame of runtime matrix 174 with the time domain and frequency domain parameters generated by frequency domain analysis block 120 and time domain analysis block 122 assigned sample numeric values for an audio signal originating from a slap style. Runtime matrix 174 contains time domain and frequency domain parameter values PVn,j for other frames of the audio signal originating from the slap style, as per Table 1.
Returning to
The time domain parameters and frequency domain parameters 1-j in note signature database 112 contain values preset by the manufacturer, or entered by the user, or learned over time by playing an instrument. The factory or manufacturer of audio amplifier 90 can initially preset the values of time domain and frequency domain parameters 1-j, as well as weighting factors 1-j and control parameters 1-k. The user can change time domain and frequency domain parameters 1-j, weighting factors 1-j, and control parameters 1-k for each note signature 1-i in database 112 directly using computer 209 with user interface screen or display 210, see
In another embodiment, time domain and frequency domain parameters 1-j, weighting factors 1-j, and control parameters 1-k can be learned by the artist playing guitar 20. The artist sets audio amplifier 90 to a learn mode. The artist repetitively plays the same note on guitar 20. For example, the artist fingers a particular note or slaps a particular note many times in repetition. The frequency domain analysis 120 and time domain analysis 122 of
As the note is played in repetition, the artist can make manual adjustments to audio amplifier 90 via front control panel 78. Audio amplifier 90 learns control parameters 1-k associated with the note by the settings of the signal processing blocks 92-104 as manually set by the artist. For example, the artist slaps a note on bass guitar 20. Frequency domain parameters and time domain parameters for the slap note are stored frame-by-frame in database 112. The artist manually adjusts the signal processing blocks 92-104 of audio amplifier 90 through front panel controls 78, e.g., increases the amplification of the audio signal in amplification block 104 or selects a sound effect in pre-effects block 94. The settings of signal processing blocks 92-104, as manually set by the artist, are stored as control parameters 1-k for the note signature being learned in database 112. The artist slaps the same note on bass guitar 20. Frequency domain parameters and time domain parameters for the same slap note are accumulated with the previous frequency domain and time domain parameters 1-j in database 112. The artist manually adjusts the signal processing blocks 92-104 of audio amplifier 90 through front panel controls 78, e.g., adjusts equalization of the audio signal in pre-filter block 92 or selects a sound effect in non-linear effects block 96. The settings of signal processing blocks 92-104, as manually set by the artist, are accumulated as control parameters 1-k for the note signature being learned in database 112. The process continues in learn mode with repetitive slaps of the same note and manual adjustments of the signal processing blocks 92-104 of audio amplifier 90 through front panel controls 78. When learn mode is complete, the note signature record in database 112 is defined with the note signature parameters being an average of the frequency domain parameters and time domain parameters accumulated in database 112, and an average of the control parameters 1-k taken from the manual adjustments of the signal processing blocks 92-104 of audio amplifier 90 and accumulated in database 112. In one embodiment, the average is a root mean square of the series of accumulated frequency domain and time domain parameters 1-j and accumulated control parameters 1-k in database 112.
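The averaging at the end of learn mode can be sketched as follows, using the root mean square variant mentioned above; the array layout (one row of accumulated parameters per repetition of the note) is an assumption of this sketch, and the same reduction would apply to the accumulated control parameters 1-k.

```python
import numpy as np

def learn_note_signature(parameter_takes):
    """Reduce repeated plays of the same note to one note signature.

    parameter_takes: 2-D array with one row of time domain and frequency
    domain parameters 1-j per repetition of the note. Returns the root
    mean square across repetitions, per the embodiment described above.
    """
    takes = np.asarray(parameter_takes, dtype=float)
    return np.sqrt(np.mean(takes ** 2, axis=0))

# Usage: three repetitions of a slap note, two parameters each (hypothetical values)
signature = learn_note_signature([[6.0, 33.0], [7.0, 30.0], [5.0, 36.0]])
```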
Weighting factors 1-j can be learned by monitoring the learned time domain and frequency domain parameters 1-j and increasing or decreasing the weighting factors based on the closeness or statistical correlation of the comparison. If a particular parameter exhibits a consistent statistical correlation, then the weighting factor for that parameter can be increased. If a particular parameter exhibits a diverse statistical correlation, then the weighting factor for that parameter can be decreased.
Once the parameters 1-j, weighting factors 1-j, and control parameters 1-k of note signatures 1-i are established for database 112, the time domain and frequency domain parameters 1-j in runtime matrix 174 can be compared on a frame-by-frame basis to each note signature 1-i to find a best match or closest correlation. In normal play mode, the artist plays guitar 20 to generate a sequence of notes corresponding to the melody being played. For each note, runtime matrix 174 is populated on a frame-by-frame basis with time domain parameters and frequency domain parameters determined from a runtime analysis of the audio signal, as described in
The comparison between runtime matrix 174 and note signatures 1-i in database 112 can be made in a variety of implementations. For example, the time domain and frequency domain parameters 1-j in runtime matrix 174 are compared one-by-one in time sequence to parameters 1-j for each note signature 1-i in database 112. The best match or closest correlation is determined for each frame of runtime matrix 174. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature to control operation of the signal processing blocks 92-104 of audio amplifier 90.
In another example, the time domain and frequency domain parameters 1-j in a predetermined number of the frames of a note, less than all the frames of a note, in runtime matrix 174 are compared to parameters 1-j for each note signature 1-i in database 112. In one embodiment, the time domain and frequency domain parameters 1-j in the first ten frames of each note in runtime matrix 174, as determined by the onset detection of the note, are compared to parameters 1-j for each note signature 1-i. An average of the comparisons between time domain and frequency domain parameters 1-j in each of the first ten frames of each note in runtime matrix 174 and parameters 1-j for each note signature 1-i will determine a best match or closest correlation to identify the frames in runtime matrix 174 as being a particular note associated with a note signature i. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature to control operation of the signal processing blocks 92-104 of audio amplifier 90.
In an illustrative numeric example of the parameter comparison process to determine a best match or closest correlation between the time domain and frequency domain parameters 1-j for each frame in runtime matrix 174 and parameters 1-j for each note signature 1-i, Table 4 shows time domain and frequency domain parameters 1-j with sample parameter values for note signature 1 (fingering style note) of database 112. Table 5 shows time domain and frequency domain parameters 1-j with sample parameter values for note signature 2 (slap style note) of database 112.
The time domain and frequency domain parameters 1-j for one frame in runtime matrix 174 and the parameters 1-j in each note signature 1-i are compared on a one-by-one basis and the differences are recorded. For example, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 28 (see Table 2) and the note peak attack parameter in note signature 1 has a value of 30 (see Table 4). Compare block 212 determines the difference 30−28 and stores the difference between frame 1 and note signature 1 in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 196 (see Table 2) and the note peak release parameter in note signature 1 has a value of 200 (see Table 4). Compare block 212 determines the difference 200−196 and stores the difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature 1 and stores the difference in recognition memory 213. The differences between the parameters 1-j in runtime matrix 174 for frame 1 and the parameters 1-j of note signature 1 are summed to determine a total difference value between the parameters 1-j in runtime matrix 174 for frame 1 and the parameters 1-j of note signature 1.
Next, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 28 (see Table 2) and the note peak attack parameter in note signature 2 has a value of 5 (see Table 5). Compare block 212 determines the difference 5−28 and stores the difference between frame 1 and note signature 2 in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 196 (see Table 2) and the note peak release parameter in note signature 2 has a value of 40 (see Table 5). Compare block 212 determines the difference 40−196 and stores the difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature 2 and stores the difference in recognition memory 213. The differences between the parameters 1-j in runtime matrix 174 for frame 1 and the parameters 1-j of note signature 2 are summed to determine a total difference value between the parameters 1-j in runtime matrix 174 for frame 1 and the parameters 1-j of note signature 2.
The time domain and frequency domain parameters 1-j in runtime matrix 174 for frame 1 are compared to the time domain and frequency domain parameters 1-j in the remaining note signatures 3-i in database 112, as described for note signatures 1 and 2. The minimum total difference between the parameters 1-j in runtime matrix 174 for frame 1 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. In this case, the time domain and frequency domain parameters 1-j in runtime matrix 174 for frame 1 are more closely aligned to the time domain and frequency domain parameters 1-j in note signature 1. Frame 1 of runtime matrix 174 is identified as a frame of a fingering style note.
With time domain parameters and frequency domain parameters 1-j of frame 1 in runtime matrix 174 generated from a played note matched to note signature 1, adaptive intelligence control block 114 of
Next, the time domain and frequency domain parameters 1-j for frame 2 in runtime matrix 174 and the parameters 1-j in each note signature 1-i are compared on a one-by-one basis and the differences are recorded. For each parameter 1-j of frame 2, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature i and stores the difference in recognition memory 213. The differences between the parameters 1-j of frame 2 and the parameters 1-j of note signature i are summed to determine a total difference value between the parameters 1-j of frame 2 and the parameters 1-j of note signature i. The minimum total difference between the parameters 1-j of frame 2 of runtime matrix 174 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. Frame 2 of runtime matrix 174 is identified with the note signature having the minimum total difference between corresponding parameters. In this case, the time domain and frequency domain parameters 1-j of frame 2 in runtime matrix 174 are more closely aligned to the time domain and frequency domain parameters 1-j in note signature 1. Frame 2 of runtime matrix 174 is identified as another frame for a fingering style note. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature 1 to control operation of the signal processing blocks 92-104 of audio amplifier 90. The process continues for each frame n of runtime matrix 174.
In another numeric example, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 6 (see Table 3) and the note peak attack parameter in note signature 1 has a value of 30 (see Table 4). The difference 30−6 between frame 1 and note signature 1 is stored in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 33 (see Table 3) and the note peak release parameter in note signature 1 has a value of 200 (see Table 4). Compare block 212 determines the difference 200−33 and stores the difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature 1 and stores the difference in recognition memory 213. The differences between the parameters 1-j of frame 1 in runtime matrix 174 and the parameters 1-j of note signature 1 are summed to determine a total difference value between the parameters 1-j of frame 1 and the parameters 1-j of note signature 1.
Next, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 6 (see Table 3) and the note peak attack parameter in note signature 2 has a value of 5 (see Table 5). Compare block 212 determines the difference 5−6 and stores the difference in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 33 (see Table 3) and the note peak release parameter in note signature 2 has a value of 40 (see Table 5). Compare block 212 determines the difference 40−33 and stores the difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature 2 and stores the difference in recognition memory 213. The differences between the parameters 1-j of frame 1 and the parameters 1-j of note signature 2 are summed to determine a total difference value between the parameters 1-j of frame 1 and the parameters 1-j of note signature 2.
The time domain and frequency domain parameters 1-j in runtime matrix 174 for frame 1 are compared to the time domain and frequency domain parameters 1-j in the remaining note signatures 3-i in database 112, as described for note signatures 1 and 2. The minimum total difference between the parameters 1-j of frame 1 of runtime matrix 174 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. Frame 1 of runtime matrix 174 is identified with the note signature having the minimum total difference between corresponding parameters. In this case, the time domain and frequency domain parameters 1-j of frame 1 in runtime matrix 174 are more closely aligned to the time domain and frequency domain parameters 1-j in note signature 2. Frame 1 of runtime matrix 174 is identified as a frame of a slap style note.
With time domain parameters and frequency domain parameters 1-j of frame 1 in runtime matrix 174 generated from a played note matched to note signature 2, adaptive intelligence control block 114 of
The time domain and frequency domain parameters 1-j for frame 2 in runtime matrix 174 and the parameters 1-j in each note signature 1-i are compared on a one-by-one basis and the differences are recorded. For each parameter 1-j of frame 2, compare block 212 determines the difference between the parameter value in runtime matrix 174 and the parameter value in note signature i and stores the difference in recognition memory 213. The differences between the parameters 1-j of frame 2 and the parameters 1-j of note signature i are summed to determine a total difference value between the parameters 1-j of frame 2 and the parameters 1-j of note signature i. The minimum total difference between the parameters 1-j of frame 2 of runtime matrix 174 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. Frame 2 of runtime matrix 174 is identified with the note signature having the minimum total difference between corresponding parameters. In this case, the time domain and frequency domain parameters 1-j of frame 2 in runtime matrix 174 are more closely aligned to the time domain and frequency domain parameters 1-j in note signature 2. Frame 2 of runtime matrix 174 is identified as another frame of a slap style note. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature 2 to control operation of the signal processing blocks 92-104 of audio amplifier 90. The process continues for each frame n of runtime matrix 174.
In another embodiment, the time domain and frequency domain parameters 1-j for one frame in runtime matrix 174 and the parameters 1-j in each note signature 1-i are compared on a one-by-one basis and the weighted differences are recorded. For example, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 28 (see Table 2) and the note peak attack parameter in note signature 1 has a value of 30 (see Table 4). Compare block 212 determines the weighted difference (30−28)*weight 1,1 and stores the weighted difference in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 196 (see Table 2) and the note peak release parameter in note signature 1 has a value of 200 (see Table 4). Compare block 212 determines the weighted difference (200−196)*weight 1,2 and stores the weighted difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the weighted difference between the parameter value in runtime matrix 174 and the parameter value in note signature 1 as determined by weight 1,j and stores the weighted difference in recognition memory 213. The weighted differences between the parameters 1-j of frame 1 and the parameters 1-j of note signature 1 are summed to determine a total weighted difference value between the parameters 1-j of frame 1 and the parameters 1-j of note signature 1.
Next, the note peak attack parameter of frame 1 in runtime matrix 174 has a value of 28 (see Table 2) and the note peak attack parameter in note signature 2 has a value of 5 (see Table 5). Compare block 212 determines the weighted difference (5−28)*weight 2,1 and stores the weighted difference in recognition memory 213. The note peak release parameter of frame 1 in runtime matrix 174 has a value of 196 (see Table 2) and the note peak release parameter in note signature 2 has a value of 40 (see Table 5). Compare block 212 determines the weighted difference (40−196)*weight 2,2 and stores the weighted difference in recognition memory 213. For each parameter of frame 1, compare block 212 determines the weighted difference between the parameter value in runtime matrix 174 and the parameter value in note signature 2 by weight 2,j and stores the weighted difference in recognition memory 213. The weighted differences between the parameters 1-j of frame 1 in runtime matrix 174 and the parameters 1-j of note signature 2 are summed to determine a total weighted difference value between the parameters 1-j of frame 1 and the parameters 1-j of note signature 2.
The time domain and frequency domain parameters 1-j in runtime matrix 174 for frame 1 are compared to the time domain and frequency domain parameters 1-j in the remaining note signatures 3-i in database 112, as described for note signatures 1 and 2. The minimum total weighted difference between the parameters 1-j of frame 1 of runtime matrix 174 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. Frame 1 of runtime matrix 174 is identified with the note signature having the minimum total weighted difference between corresponding parameters. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature to control operation of the signal processing blocks 92-104 of audio amplifier 90.
The time domain and frequency domain parameters 1-j for frame 2 in runtime matrix 174 and the parameters 1-j in each note signature 1-i are compared on a one-by-one basis and the weighted differences are recorded. For each parameter 1-j of frame 2, compare block 212 determines the weighted difference between the parameter value in runtime matrix 174 and the parameter value in note signature i by weight i,j and stores the weighted difference in recognition memory 213. The weighted differences between the parameters 1-j of frame 2 and the parameters 1-j of note signature i are summed to determine a total weighted difference value between the parameters 1-j of frame 2 and the parameters 1-j of note signature i. The minimum total weighted difference between the parameters 1-j of frame 2 of runtime matrix 174 and the parameters 1-j of note signatures 1-i is the best match or closest correlation. Frame 2 of runtime matrix 174 is identified with the note signature having the minimum total weighted difference between corresponding parameters. Adaptive intelligence control block 114 uses the control parameters 1-k in database 112 associated with the matching note signature to control operation of the signal processing blocks 92-104 of audio amplifier 90. The process continues for each frame n of runtime matrix 174.
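The unweighted and weighted comparisons reduce to the same Python sketch; taking absolute differences, so that positive and negative differences do not cancel when summed, is an assumption of this sketch, and the function and argument names are hypothetical.

```python
import numpy as np

def match_note_signature(frame_params, signatures, weights=None):
    """Return the index of the best matching note signature for one frame.

    frame_params: parameters 1-j for frame n of runtime matrix 174.
    signatures: shape (i, j), one row of parameters per note signature.
    weights: optional weighting factors of shape (i, j); omit for the
    unweighted comparison, include for the weighted embodiment.
    """
    diffs = np.abs(np.asarray(signatures, float) - np.asarray(frame_params, float))
    if weights is not None:
        diffs = diffs * np.asarray(weights, float)
    totals = diffs.sum(axis=1)        # total (weighted) difference per signature
    return int(np.argmin(totals))     # minimum total difference: best match

# Usage with the note peak attack/release samples of Tables 2, 4, and 5:
# frame 1 = [28, 196], signature 1 = [30, 200], signature 2 = [5, 40]
best = match_note_signature([28, 196], [[30, 200], [5, 40]])   # -> 0 (fingering style)
```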
In another embodiment, a probability of correlation between corresponding parameters in runtime matrix 174 and note signatures 1-i is determined. In other words, a probability of correlation is determined as a percentage that a given parameter in runtime matrix 174 is likely the same as the corresponding parameter in note signature i. The percentage is a likelihood of a match. As described above, the time domain parameters and frequency domain parameters in runtime matrix 174 are stored on a frame-by-frame basis. Each frame n of parameters 1-j in runtime matrix 174 is represented by Pn,j=[Pn1, Pn2, . . . Pnj].
A probability ranked list R is determined between each frame n of each parameter j in runtime matrix 174 and each parameter j of each note signature i. The probability value ri can be determined by a root mean square analysis for the Pn,j and note signature database Si,j in equation (3):
The probability value R is (1−ri)×100%. The overall ranking value for Pn,j and note database Si,j is given in equation (4).
R=[(1−r1)×100%, (1−r2)×100%, . . . (1−ri)×100%]  (4)
In some cases, the matching process identifies two or more note signatures that are close to the played note. For example, the played note may have a 52% probability that it matches to note signature 1 and a 48% probability that it matches to note signature 2. In this case, an interpolation is performed between the control parameter 1,1, control parameter 1,2, through control parameter 1,k, and control parameter 2,1, control parameter 2,2, through control parameter 2,k, weighted by the probability of the match. The net effective control parameter 1 is 0.52*control parameter 1,1+0.48*control parameter 2,1. The net effective control parameter 2 is 0.52*control parameter 1,2+0.48*control parameter 2,2. The net effective control parameter k is 0.52*control parameter 1,k+0.48*control parameter 2,k. The net effective control parameters 1-k control operation of the signal processing blocks 92-104 of audio amplifier 90. The audio signal is processed through pre-filter block 92, pre-effects block 94, non-linear effects block 96, user-defined modules 98, post-effects block 100, post-filter block 102, and power amplification block 104, each operating as set by net effective control parameters 1-k. The audio signal is routed to the speaker in enclosure 24 or speaker 82 in enclosure 72. The listener hears the reproduced audio signal enhanced in realtime with characteristics determined by the dynamic content of the audio signal.
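The probability-weighted interpolation of the 52%/48% example reduces to a weighted sum of the control parameter vectors; the normalization step and the names are assumptions of this sketch.

```python
import numpy as np

def blend_control_parameters(match_probs, control_params):
    """Interpolate control parameters between closely matching signatures.

    match_probs: match probabilities per signature, e.g., [0.52, 0.48].
    control_params: shape (signatures, k), control parameters 1-k per
    matching note signature. Returns the net effective parameters 1-k.
    """
    p = np.asarray(match_probs, dtype=float)
    p = p / p.sum()                              # normalize the probabilities
    return p @ np.asarray(control_params, dtype=float)

# Usage: two matching signatures, three hypothetical control parameters each
net = blend_control_parameters([0.52, 0.48], [[1.0, 0.2, 5.0], [0.0, 0.8, 3.0]])
```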
The adaptive intelligence control described in
The signal processing functions can be associated with equipment other than a dedicated audio amplifier.
In one embodiment, signal processing equipment 215 is a computer 218, as shown in
The pre-filter block 222, pre-effects block 224, non-linear effects block 226, user-defined modules 228, post-effects block 230, and post-filter block 232 within the signal processing function are selectable and controllable with front control panel 234, i.e., by the computer keyboard or external control signal to computer 218.
To accommodate the signal processing requirements for the dynamic content of the audio source, computer 218 employs a dynamic adaptive intelligence feature involving frequency domain analysis and time domain analysis of the audio signal on a frame-by-frame basis and automatically and adaptively controls operation of the signal processing functions and settings within the computer to achieve an optimal sound reproduction. The audio signal from musical instrument 214 is routed to frequency domain and time domain analysis block 240. The output of block 240 is routed to note signature block 242, and the output of block 242 is routed to adaptive intelligence control block 244.
The functions of blocks 240, 242, and 244 correspond to blocks 110, 112, and 114, respectively, as described in
Some embodiments of audio source 12 are better characterized on a frame-by-frame basis, i.e., no clear or reliably detectable delineation between notes. For example, the audio signal from vocal patterns may be better suited to a frame-by-frame analysis without detecting the onset of a note.
Audio amplifier 270 has a signal processing path for the audio signal, including pre-filter block 272, pre-effects block 274, non-linear effects block 276, user-defined modules 278, post-effects block 280, post-filter block 282, and power amplification block 284. Pre-filtering block 272 and post-filtering block 282 provide various filtering functions, such as low-pass filtering and bandpass filtering of the audio signal. The pre-filtering and post-filtering can include tone equalization functions over various frequency ranges to boost or attenuate the levels of specific frequencies without affecting neighboring frequencies, such as bass frequency adjustment and treble frequency adjustment. For example, the tone equalization may employ shelving equalization to boost or attenuate all frequencies above or below a target or fundamental frequency, bell equalization to boost or attenuate a narrow range of frequencies around a target or fundamental frequency, graphic equalization, or parametric equalization. Pre-effects block 274 and post-effects block 280 introduce sound effects into the audio signal, such as reverb, delays, chorus, wah, auto-volume, phase shifter, hum canceller, noise gate, vibrato, pitch-shifting, graphic equalization, tremolo, and dynamic compression. Non-linear effects block 276 introduces non-linear effects into the audio signal, such as amp-modeling, distortion, overdrive, fuzz, and modulation. User-defined module block 278 allows the user to define customized signal processing functions, such as adding accompanying instruments, vocals, and synthesizer options. Power amplification block 284 provides power amplification or attenuation of the audio signal. The post signal processing audio signal is routed to speakers 266 in enclosure 256.
The pre-filter block 272, pre-effects block 274, non-linear effects block 276, user-defined modules 278, post-effects block 280, post-filter block 282, and power amplification block 284 within audio amplifier 270 are selectable and controllable with front control panel 262. By turning knobs 260 on front control panel 262, the user can directly control operation of the signal processing functions within audio amplifier 270.
The time domain analysis block 302 determines time domain parameters or characteristics of the sampled audio signal 298 on a frame-by-frame basis.
Vibrato detector block 326 uses the energy function E(m,n) to track changes in amplitude of the energy levels over time, indicating amplitude modulation associated with the vibrato effect. The vibrato detector output is a time domain parameter or characteristic of each frame n for all frequency bands 1-m and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
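A minimal sketch of such an amplitude-modulation test follows, assuming E(m,n) has already been summed over bands 1-m into a per-frame energy history; the window handling and the modulation-depth threshold are illustrative choices, not values from the embodiment.

def vibrato_indicator(energy_history, threshold=0.1):
    # energy_history: per-frame energies summed over bands 1-m,
    # most recent frame last; returns 1 if the envelope rises and
    # falls with sufficient depth, else 0
    if len(energy_history) < 3:
        return 0
    diffs = [b - a for a, b in zip(energy_history, energy_history[1:])]
    sign_changes = sum(
        1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)
    mean = sum(energy_history) / len(energy_history)
    depth = (max(energy_history) - min(energy_history)) / (mean or 1.0)
    return 1 if sign_changes >= 2 and depth > threshold else 0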
The frequency domain analysis block 300 converts the sampled audio signal 298 to the frequency domain and determines frequency domain parameters or characteristics of each frame.
Once the sampled audio signal 298 is in frequency domain, vowel “a” formant block 340 uses the frequency domain sampled audio signal to determine an occurrence of the vowel “a” in the sampled audio signal 298. Each vowel has a frequency designation. The vowel “a” occurs in the 800-1200 Hz range and no other frequency range. The vowel “a” formant parameter is value one if the vowel is present in the sampled audio signal 298 and value zero if the vowel is not present. The vowel “a” formant is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
Vowel “e” formant block 342 uses the frequency domain sampled audio signal to determine an occurrence of the vowel “e” in the sampled audio signal 298. The vowel “e” occurs in the 400-600 Hz range and also in the 2200-2600 Hz range. The vowel “e” formant parameter is value one if the vowel is present in the sampled audio signal 298 and value zero if the vowel is not present. The vowel “e” formant is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
Vowel “i” formant block 344 uses the frequency domain sampled audio signal to determine an occurrence of the vowel “i” in the sampled audio signal 298. The vowel “i” occurs in the 200-400 Hz range and also in the 3000-3500 Hz range. The vowel “i” formant parameter is value one if the vowel is present in the sampled audio signal 298 and value zero if the vowel is not present. The vowel “i” formant is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
Vowel “o” formant block 346 uses the frequency domain sampled audio signal to determine an occurrence of the vowel “o” in the sampled audio signal 298. The vowel “o” occurs in the 400-600 Hz range and no other frequency range. The vowel “o” formant parameter is value one if the vowel is present in the sampled audio signal 298 and value zero if the vowel is not present. The vowel “o” formant is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
Vowel “u” formant block 348 uses the frequency domain sampled audio signal to determine an occurrence of the vowel “u” in the sampled audio signal 298. The vowel “u” occurs in the 200-400 Hz range and no other frequency range. The vowel “u” formant parameter is value one if the vowel is present in the sampled audio signal 298 and value zero if the vowel is not present. The vowel “u” formant is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
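A compact sketch of the formant test underlying blocks 340-348 appears below. The Hz bands follow the text; the use of a half-spectrum magnitude input and the energy threshold are assumptions for illustration.

# Formant-band presence test for the vowel blocks 340-348. The Hz
# ranges come from the description; the threshold is illustrative.
VOWEL_BANDS = {
    "a": [(800, 1200)],
    "e": [(400, 600), (2200, 2600)],
    "i": [(200, 400), (3000, 3500)],
    "o": [(400, 600)],
    "u": [(200, 400)],
}

def band_energy(spectrum, sample_rate, lo_hz, hi_hz):
    # spectrum: magnitude values for one frame, covering 0..fs/2,
    # so bin k corresponds to k * sample_rate / (2 * len(spectrum))
    n = len(spectrum)
    lo = int(lo_hz * 2 * n / sample_rate)
    hi = int(hi_hz * 2 * n / sample_rate)
    return sum(v * v for v in spectrum[lo:hi])

def vowel_formant(vowel, spectrum, sample_rate, threshold=1.0):
    # returns 1 if every formant band for the vowel carries energy
    # above the threshold, else 0, matching the one/zero parameter
    return int(all(
        band_energy(spectrum, sample_rate, lo, hi) > threshold
        for lo, hi in VOWEL_BANDS[vowel]))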
Overtone detector block 350 uses the frequency domain sampled audio signal to detect a higher harmonic resonance or overtone of the fundamental key, giving the impression of simultaneous tones. The overtone detector is a frequency domain parameter or characteristic of each frame n and is stored as a value in runtime matrix 324 on a frame-by-frame basis.
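In the same spirit, one plausible reading of overtone detector block 350 checks for spectral energy at integer multiples of the fundamental. The fundamental frequency is assumed to be available from earlier pitch analysis, and the harmonic count and threshold are illustrative.

# Overtone presence test: look for energy at harmonics 2..max_harmonic
# of the fundamental. Assumes a half-spectrum covering 0..fs/2.
def overtone_detected(spectrum, sample_rate, fundamental_hz,
                      threshold=1.0, max_harmonic=6):
    n = len(spectrum)
    for harmonic in range(2, max_harmonic + 1):
        k = int(fundamental_hz * harmonic * 2 * n / sample_rate)
        if k >= n:
            break                     # harmonic above fs/2
        if spectrum[k] > threshold:
            return 1                  # overtone present in this frame
    return 0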
Runtime matrix 324 contains the time domain parameters determined in time domain analysis block 302 and the frequency domain parameters determined in frequency domain analysis block 300. Each time domain parameter and frequency domain parameter is a numeric parameter value Pn,j stored in runtime matrix 324 on a frame-by-frame basis, where n is the frame and j is the parameter, similar to Table 1. The time domain and frequency domain parameter values Pn,j are characteristic of specific frames and therefore useful in distinguishing between frames.
Returning to frame signature database 292, the time domain parameters and frequency domain parameters in the database contain values preset by the manufacturer, entered by the user, or learned over time by playing an instrument. The factory or manufacturer of audio amplifier 270 can initially preset the values of time domain and frequency domain parameters 1-j, as well as weighting factors 1-j and control parameters 1-k. The user can change time domain and frequency domain parameters 1-j, weighting factors 1-j, and control parameters 1-k for each frame signature 1-i in database 292 directly using computer 352 with user interface screen or display 354.
In another embodiment, time domain and frequency domain parameters 1-j, weighting factors 1-j, and control parameters 1-k can be learned by the artist singing into microphone 250. The artist sets audio amplifier 270 to a learn mode and repetitively sings into microphone 250. The frequency domain analysis 300 and time domain analysis 302 determine the time domain and frequency domain parameters 1-j of each repetition, which are accumulated in database 292 on a frame-by-frame basis.
The artist can make manual adjustments to audio amplifier 270 via front control panel 262, and audio amplifier 270 learns the control parameters 1-k associated with the frame from the settings of the signal processing blocks 272-284 as manually set by the artist. When learn mode is complete, the frame signature records in database 292 are defined, with the frame signature parameters being an average of the accumulated frequency domain and time domain parameters and the control parameters 1-k being an average of the accumulated manual adjustments of the signal processing blocks 272-284. In one embodiment, the average is a root mean square of the series of accumulated frequency domain and time domain parameters 1-j and accumulated control parameters 1-k in database 292.
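A minimal sketch of that root-mean-square averaging follows, assuming the per-take parameter values have been collected into lists; names are illustrative.

import math

def rms(values):
    # root mean square of a series of accumulated values
    return math.sqrt(sum(v * v for v in values) / len(values))

def build_signature(accumulated):
    # accumulated: one parameter list per take, e.g.
    # [[take-1 parameters 1-j], [take-2 parameters 1-j], ...];
    # returns the learned signature parameters 1-j
    return [rms(per_take) for per_take in zip(*accumulated)]

# Example: three repetitions of the same passage
takes = [[0.80, 0.31], [0.70, 0.29], [0.90, 0.33]]
signature = build_signature(takes)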
Weighting factors 1-j can be learned by monitoring the learned time domain and frequency domain parameters 1-j and increasing or decreasing the weighting factors based on the closeness or statistical correlation of the comparison. If a particular parameter exhibits a consistent statistical correlation, the weighting factor for that parameter can be increased. If a particular parameter exhibits a scattered or inconsistent statistical correlation, the weighting factor for that parameter can be decreased.
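One way to realize that adjustment is sketched below, under the assumption that parameter consistency is measured by the relative spread of the accumulated values; the step size and spread limit are illustrative.

def adapt_weights(weights, accumulated, step=0.05, spread_limit=0.1):
    # accumulated: per-take parameter lists, as in learn mode
    new_weights = list(weights)
    for j, per_take in enumerate(zip(*accumulated)):
        mean = sum(per_take) / len(per_take)
        spread = max(per_take) - min(per_take)
        relative = spread / (abs(mean) or 1.0)
        if relative < spread_limit:
            new_weights[j] += step   # consistent parameter: raise weight
        else:
            new_weights[j] -= step   # scattered parameter: lower weight
        new_weights[j] = max(0.0, new_weights[j])
    return new_weights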
Once the parameters 1-j, weighting factors 1-j, and control parameters 1-k of frame signatures 1-i are established for database 292, the time domain and frequency domain parameters 1-j in runtime matrix 324 can be compared on a frame-by-frame basis to each frame signature 1-i to find a best match or closest correlation. In normal play mode, the artist sings lyrics to generate an audio signal having a time sequence of frames. For each frame, runtime matrix 324 is populated with time domain parameters and frequency domain parameters determined from a time domain analysis and frequency domain analysis of the audio signal, as described above.
The time domain and frequency domain parameters 1-j for frame 1 in runtime matrix 324 and the parameters 1-j in each frame signature 1-i are compared on a one-by-one basis and the differences are recorded. For each parameter of frame 1, compare block 358 determines the difference between the parameter value in runtime matrix 324 and the parameter value in frame signature 1 and stores the difference in recognition memory 360. The differences between the parameters 1-j of frame 1 in runtime matrix 324 and the parameters 1-j of frame signature 1 are summed to determine a total difference value between the parameters 1-j of frame 1 and the parameters 1-j of frame signature 1.
Next, for each parameter of frame 1, compare block 358 determines the difference between the parameter value in runtime matrix 324 and the parameter value in frame signature 2 and stores the difference in recognition memory 360. The differences between the parameters 1-j of frame 1 in runtime matrix 324 and the parameters 1-j of frame signature 2 are summed to determine a total difference value between the parameters 1-j of frame 1 and the parameters 1-j of frame signature 2.
The time domain parameters and frequency domain parameters 1-j in runtime matrix 324 for frame 1 are compared to the time domain and frequency domain parameters 1-j in the remaining frame signatures 3-i in database 292, as described for frame signatures 1 and 2. The minimum total difference between the parameters 1-j of frame 1 of runtime matrix 324 and the parameters 1-j of frame signatures 1-i is the best match or closest correlation, and frame 1 of runtime matrix 324 is identified with the frame signature having the minimum total difference between corresponding parameters. In this case, the time domain and frequency domain parameters 1-j of frame 1 in runtime matrix 324 are more closely aligned to the time domain and frequency domain parameters 1-j in frame signature 1.
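The search over frame signatures reduces to a nearest-neighbor test. A minimal sketch follows, assuming the per-parameter differences are taken as absolute values, which is one natural reading of the summed differences; names are illustrative.

def best_match(frame_params, signatures):
    # frame_params: parameters 1-j for one frame of runtime matrix 324
    # signatures: one parameter list per frame signature 1-i
    # returns the index of the signature with the minimum total difference
    totals = []
    for sig in signatures:
        totals.append(sum(abs(p - s) for p, s in zip(frame_params, sig)))
    return totals.index(min(totals))

# Example: frame parameters closest to the first signature
frame = [0.82, 0.30]
sigs = [[0.80, 0.31], [0.20, 0.90], [0.55, 0.60]]
match_index = best_match(frame, sigs)   # 0, i.e., frame signature 1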
With the time domain parameters and frequency domain parameters 1-j of frame 1 in runtime matrix 324 matched to frame signature 1, adaptive intelligence control block 294 uses the control parameters 1-k associated with frame signature 1 in database 292 to control operation of the signal processing blocks 272-284 of audio amplifier 270.
The time domain and frequency domain parameters 1-j for frame 2 in runtime matrix 324 and the parameters 1-j in each frame signature 1-i are compared on a one-by-one basis and the differences are recorded. For each parameter 1-j of frame 2, compare block 358 determines the difference between the parameter value in runtime matrix 324 and the parameter value in frame signature i and stores the difference in recognition memory 360. The differences between the parameters 1-j of frame 2 in runtime matrix 324 and the parameters 1-j of frame signature i are summed to determine a total difference value between the parameters 1-j of frame 2 and the parameters 1-j of frame signature i. The minimum total difference between the parameters 1-j of frame 2 of runtime matrix 324 and the parameters 1-j of frame signatures 1-i is the best match or closest correlation, and frame 2 of runtime matrix 324 is identified with the frame signature having the minimum total difference between corresponding parameters. In this case, the time domain and frequency domain parameters 1-j of frame 2 in runtime matrix 324 are more closely aligned to the time domain and frequency domain parameters 1-j in frame signature 2. Adaptive intelligence control block 294 uses the control parameters 1-k associated with the matching frame signature 2 in database 292 to control operation of the signal processing blocks 272-284 of audio amplifier 270. The process continues for each frame n of runtime matrix 324.
In another embodiment, the time domain and frequency domain parameters 1-j for frame 1 in runtime matrix 324 and the parameters 1-j in each frame signature 1-i are compared on a one-by-one basis and the weighted differences are recorded. For each parameter of frame 1, compare block 358 determines the weighted difference between the parameter value in runtime matrix 324 and the parameter value in frame signature 1, as determined by weight 1,j, and stores the weighted difference in recognition memory 360. The weighted differences between the parameters 1-j of frame 1 in runtime matrix 324 and the parameters 1-j of frame signature 1 are summed to determine a total weighted difference value between the parameters 1-j of frame 1 and the parameters 1-j of frame signature 1.
Next, for each parameter of frame 1, compare block 358 determines the weighted difference between the parameter value in runtime matrix 324 and the parameter value in frame signature 2 by weight 2,j and stores the weighted difference in recognition memory 360. The weighted differences between the parameters 1-j of frame 1 and the parameters 1-j of frame signature 2 are summed to determine a total weighted difference value between the parameters 1-j of frame 1 and the parameters 1-j of frame signature 2.
The time domain parameters and frequency domain parameters 1-j in runtime matrix 324 for frame 1 are compared to the time domain and frequency domain parameters 1-j in the remaining frame signatures 3-i in database 292, as described for frame signatures 1 and 2. The minimum total weighted difference between the parameters 1-j of frame 1 in runtime matrix 324 and the parameters 1-j of frame signatures 1-i is the best match or closest correlation and the frame associated with frame 1 of runtime matrix 324 is identified with the frame signature having the minimum total weighted difference between corresponding parameters. Adaptive intelligence control block 294 uses the control parameters 1-k in database 292 associated with the matching frame signature to control operation of the signal processing blocks 272-284 of audio amplifier 270.
The time domain and frequency domain parameters 1-j for frame 2 in runtime matrix 324 and the parameters 1-j in each frame signature 1-i are compared on a one-by-one basis and the weighted differences are recorded. For each parameter 1-j of frame 2, compare block 358 determines the weighted difference between the parameter value in runtime matrix 324 and the parameter value in frame signature i by weight i,j and stores the weighted difference in recognition memory 360. The weighted differences between the parameters 1-j of frame 2 in runtime matrix 324 and the parameters 1-j of frame signature i are summed to determine a total weighted difference value between the parameters 1-j of frame 2 and the parameters 1-j of frame signature i. The minimum total weighted difference between the parameters 1-j of frame 2 of runtime matrix 324 and the parameters 1-j of frame signatures 1-i is the best match or closest correlation, and frame 2 of runtime matrix 324 is identified with the frame signature having the minimum total weighted difference between corresponding parameters. Adaptive intelligence control block 294 uses the control parameters 1-k in database 292 associated with the matching frame signature to control operation of the signal processing blocks 272-284 of audio amplifier 270. The process continues for each frame n of runtime matrix 324.
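The weighted variant differs from the earlier sketch only in scaling each per-parameter difference by the corresponding weight i,j before summing; the sketch below makes the same illustrative assumptions as above.

def best_weighted_match(frame_params, signatures, weights):
    # weights: one weight list per signature, i.e., weight i,j
    totals = []
    for sig, w in zip(signatures, weights):
        totals.append(sum(
            wj * abs(p - s)
            for p, s, wj in zip(frame_params, sig, w)))
    return totals.index(min(totals))   # index of the closest signature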
In another embodiment, a probability of correlation between corresponding parameters in runtime matrix 324 and frame signatures 1-i is determined. In other words, a probability of correlation is determined as a percentage that a given parameter in runtime matrix 324 is likely the same as the corresponding parameter in frame signature i. The percentage is a likelihood of a match. As described above, the time domain parameters and frequency domain parameters in runtime matrix 324 are stored on a frame-by-frame basis. Each frame n of parameters 1-j in runtime matrix 324 is represented by Pn,j=[Pn,1, Pn,2, . . . Pn,j].
A probability ranked list R is determined between each frame n of parameters 1-j in runtime matrix 324 and the parameters 1-j of each frame signature i. The probability value ri can be determined by a root mean square analysis of Pn,j and the frame signature database Si,j in equation (3). The probability value R is (1−ri)×100%. The overall ranking value for Pn,j and frame signature database Si,j is given in equation (4).
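Equations (3) and (4) are not reproduced in this excerpt, so the sketch below assumes one plausible reading: ri is the root mean square of the per-parameter differences, with parameter values normalized to the 0..1 range, and the match probability is R=(1−ri)×100%; names are illustrative.

import math

def match_probability(frame_params, signature_params):
    # assumed reading of equation (3): rms of per-parameter differences
    j = len(frame_params)
    ri = math.sqrt(sum(
        (p - s) ** 2 for p, s in zip(frame_params, signature_params)) / j)
    ri = min(ri, 1.0)               # clamp so R stays within 0..100%
    return (1.0 - ri) * 100.0       # R = (1 - ri) x 100%

def ranked_list(frame_params, signatures):
    # assumed reading of equation (4): signatures ranked best-first
    ranked = [(i, match_probability(frame_params, sig))
              for i, sig in enumerate(signatures)]
    return sorted(ranked, key=lambda item: item[1], reverse=True)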
In some cases, the matching process identifies two or more frame signatures that are close to the present frame. For example, a frame in runtime matrix 324 may have a 52% probability of matching frame signature 1 and a 48% probability of matching frame signature 2. In this case, an interpolation is performed between control parameters 1,1 through 1,k and control parameters 2,1 through 2,k, weighted by the probability of the match, as in the note signature case above. The net effective control parameter 1 is 0.52*control parameter 1,1+0.48*control parameter 2,1. The net effective control parameter 2 is 0.52*control parameter 1,2+0.48*control parameter 2,2. The net effective control parameter k is 0.52*control parameter 1,k+0.48*control parameter 2,k. The net effective control parameters 1-k control operation of the signal processing blocks 272-284 of audio amplifier 270. The audio signal is processed through pre-filter block 272, pre-effects block 274, non-linear effects block 276, user-defined modules 278, post-effects block 280, post-filter block 282, and power amplification block 284, each operating as set by net effective control parameters 1-k. The audio signal is routed to speaker 266 in enclosure 256. The listener hears the reproduced audio signal enhanced in real time with characteristics determined by the dynamic content of the audio signal.
While one or more embodiments of the present invention have been illustrated in detail, the skilled artisan will appreciate that modifications and adaptations to those embodiments may be made without departing from the scope of the present invention as set forth in the following claims.