The invention relates to electronic devices, and, more particularly, to circuitry and methods for beat detection in audio streams and applications.
In recent years, methods have been developed which can track the tempo of an audio signal and identify its (musical) beats. This has enabled various beat-matching applications, including beat-matched audio editing, automatic play-list generation, and beat-matched crossfades. For example, in a beat-matched crossfade, a deejay slows down or speeds up one of the two audio tracks so that the beats of the incoming track and the outgoing track line up.
With the popularity of portable audio devices in athletic pursuits, today's exercise enthusiasts choose their individual music to motivate their workouts. They will select songs to motivate them to run/cycle at a desired target rate (e.g., running at a pace of eight minutes per mile where their steps match the musical downbeat), but the original music beat rate may not match their exact desired rate for the workout. Also, variations in the beat rate between songs can speed up or slow down the athlete. This lack of control over the exact music beat rate can cause the athlete to run/cycle/exercise faster or slower than the desired target.
Approaches that use biometric data to influence audio playback can be found in US patent publications 2005/0126370 and 2006/0112808 and in Japanese Kokai 2002-073018.
Maintenance and monitoring of machinery often involves heat and pressure sensors, which usually signal a problem only after a catastrophic failure. Some equipment and/or machinery is remotely located (e.g., cellular sites, radio repeater sites, pipeline “lift” stations), where it is far less costly to provide scheduled and preventive maintenance in good weather than to provide system-critical repairs in poor weather, when it is difficult or impossible to travel to the site. Various machinery emits consistent, repetitive beat sounds; for example: fans in environmental air handlers (for temperature, humidity, filtration, etc.); pumping stations (water, petroleum, sewer, etc.); rotating machinery; piston movement; horizontal repetitive motion; vertical repetitive motion (e.g., bottling machine, stamper); conveyor belts; bucket lifts. If these repetitive sounds change drastically in their beat rate, it can signify a problem with the machinery that may need to be fixed. If additional, extraneous sounds occur within a consistent beat signal, this can also signify a problem.
People who interface with machines (e.g., assembly line workers in factories) are often asked to work at the same pace as the machines. These factories are often looking for methods to motivate their employees to work at the machine's pace. Music can be a motivating force for these employees. However, simply playing music over a loudspeaker would not synchronize the workers to the machine's pace.
Beat detection for a digital audio stream can be performed in various ways. A simple approach just computes autocorrelations and selects the beat period as the delay corresponding to the peak autocorrelation. Alonso et al., “Tempo and Beat Estimation of Musical Signals”, Proc. Intl. Conf. Music Information Retrieval (ISMIR 2004), Barcelona, Spain, October 2004, proceeds through three steps: First an onset detector analyzes the audio signal and produces scalars that reflect the level of spectral change over time; this uses short-time Fourier transforms and differences the frequency channel magnitudes. The differences are summed and a threshold is applied through a median filter to output a detection function that shows only peaks at points in time that have large amounts of spectral change. Second, the detection function is fed to a periodicity estimator which applies spectral product methods to evaluate tempo (beat rate) hypotheses; this gives the beat rate estimate. In the third step a beat locator uses the detection function and the estimated beat rate to determine the locations of the beats in a frame.
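By way of illustration only, the simple autocorrelation approach mentioned above may be sketched as follows. The sketch assumes that an onset-strength envelope (one non-negative value per analysis hop) has already been computed; the function names, the lag search range, and the envelope rate are hypothetical and are not taken from the cited reference.

```python
def estimate_beat_period(envelope, min_lag, max_lag):
    """Return the lag (in envelope samples) with the largest autocorrelation.

    envelope: onset-strength values (assumed input, one per analysis hop).
    min_lag, max_lag: hypothetical search range for the beat period.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]  # remove the DC offset
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_bpm(lag, envelope_rate_hz):
    """Convert a beat period in envelope samples to beats per minute."""
    return 60.0 * envelope_rate_hz / lag


# Synthetic envelope: one pulse every 50 samples at a 100 Hz envelope rate.
pulses = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
period = estimate_beat_period(pulses, min_lag=20, max_lag=90)
print(period, lag_to_bpm(period, 100.0))  # prints: 50 120.0
```

Restricting the lag range is one simple way to keep such an estimator from returning a harmonic or sub-harmonic of the true beat period.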
All beat matchers must mitigate the limitations of the beat detection method which they employ. This includes the tendency of beat detectors to jump from one tempo beats-per-minute value to a harmonic or sub-harmonic thereof between analysis frames.
Another important requirement for beat matchers is to avoid excessive modification of the input music being matched to another (reference) music or beat-source track. Typically, modifications are either time-scale modifications (TSM) or sampling rate conversions (SRC).
TSM methods change the time scale of an audio signal without changing its perceptual characteristics. For example, synchronized overlap-and-add (SOLA) provides a time scale change by a factor r as follows: successive length-N frames of input samples are taken, with frame k starting at time kTanalysis; frame k is aligned to (within a range about) its target synthesis starting time kTsynthesis (where Tsynthesis=rTanalysis) in the currently synthesized output by optimizing the cross-correlation of the overlap portions; and aligned frame k is then added to extend the currently synthesized output, with averaging of the overlap portions. Various SOLA modifications lower the complexity of the computations; for example, Wong and Au, Fast SOLA-Based Time Scale Modification Using Modified Envelope Matching, IEEE ICASSP vol. III, pp. 3188-3191 (2002).
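The SOLA steps just described may be sketched, under simplifying assumptions, as follows. The frame length, hop size, and search range are hypothetical values chosen for readability; the overlap is blended with a linear cross-fade (one form of weighted averaging); and none of the complexity reductions of the cited reference are included.

```python
import math


def sola_time_scale(x, ratio, frame_len=400, analysis_hop=100, search=20):
    """Time-scale x by roughly `ratio` using a basic SOLA loop.

    Assumes ratio * analysis_hop + 2 * search < frame_len so that each new
    frame always overlaps the output synthesized so far.
    """
    synthesis_hop = int(round(ratio * analysis_hop))  # Tsynthesis = r * Tanalysis
    out = list(x[:frame_len])                         # frame 0 is copied as-is
    k = 1
    while k * analysis_hop + frame_len <= len(x):
        start = k * analysis_hop
        frame = x[start:start + frame_len]
        target = k * synthesis_hop
        best_pos, best_score = None, float("-inf")
        # Search around the target position for the best-correlated alignment.
        for pos in range(max(0, target - search), target + search + 1):
            overlap = len(out) - pos
            if not 0 < overlap <= frame_len:
                continue
            tail = out[pos:]
            head = frame[:overlap]
            num = sum(a * b for a, b in zip(tail, head))
            den = math.sqrt(sum(a * a for a in tail) * sum(b * b for b in head))
            score = num / (den or 1.0)                # normalized cross-correlation
            if score > best_score:
                best_pos, best_score = pos, score
        if best_pos is None:
            break                                     # no valid overlap remains
        overlap = len(out) - best_pos
        for i in range(overlap):                      # cross-fade the overlap
            w = i / overlap
            out[best_pos + i] = (1.0 - w) * out[best_pos + i] + w * frame[i]
        out.extend(frame[overlap:])                   # append the remainder
        k += 1
    return out
```

With these defaults the output length is approximately the ratio times the input length, apart from an edge effect at the final frame, while the local waveform period (and hence the pitch) is left essentially unchanged.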
Sampling rate conversion (which may be asynchronous) theoretically is just analog reconstruction and resampling, i.e., non-linear interpolations. Ramstad, Digital Methods for Conversion between Arbitrary Sampling Frequencies, 32 IEEE Tr. ASSP 577 (1984) presents a general theory of filtering methods for interfacing time-discrete systems with different sampling rates and includes the use of Taylor series coefficients for improved interpolation accuracy.
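As a deliberately simplified illustration of resampling by interpolation, the following sketch uses linear interpolation between neighboring input samples. This is only a stand-in for the reconstruction filters discussed in the cited reference and is not the Taylor-series method; the function name and the definition of the ratio are assumptions.

```python
def resample_linear(x, ratio):
    """Resample x so that the output has about ratio * len(x) samples.

    ratio is assumed to be output_rate / input_rate. Each output sample is
    a linear interpolation of the two nearest input samples.
    """
    if len(x) < 2:
        return list(x)
    n_out = int((len(x) - 1) * ratio) + 1
    out = []
    for m in range(n_out):
        t = m / ratio              # position of output sample m on the input grid
        i = int(t)
        frac = t - i
        if i + 1 < len(x):
            out.append((1.0 - frac) * x[i] + frac * x[i + 1])
        else:
            out.append(x[-1])      # final sample: nothing to interpolate toward
    return out


print(resample_linear([0, 1, 2, 3], 2.0))
# prints: [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3]
```

When the resampled stream is played out at the original sampling rate, both the tempo and the pitch change by the conversion ratio, in contrast to TSM.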
The present invention provides beat detection for audio play as athletic/user incentive, monitoring mechanical devices, and/or synchronization of audio play to mechanical devices.
FIGS. 1a-1c are functional block diagrams and a flowchart of a preferred embodiment beat matching on a portable audio device during a workout.
FIGS. 2a-2c show beat-matching waveforms and time-scale modification versus sampling rate conversion plus a combination.
1. Overview
Preferred embodiments provide architectures and methods for applications of beat detection including athletic/exercise workout incentive, monitoring mechanical devices, and/or beat matching of audio playout to the mechanical device as beat source.
Preferred embodiment systems implement preferred embodiment architectures and methods with any of several types of hardware: digital signal processors (DSPs), general purpose programmable processors, application specific circuits, or systems on a chip (SoC) such as combinations of a DSP and a RISC processor together with various specialized programmable accelerators such as for FFTs and variable length coding (VLC). For example, the 55x family of DSPs from Texas Instruments has sufficient power. A stored program in an onboard or external (flash EEP)ROM or FRAM could implement the signal processing. Analog-to-digital converters and digital-to-analog converters can provide coupling to the real world, modulators and demodulators (plus antennas for air interfaces) can provide coupling for transmission waveforms, and packetizers can provide formats for transmission over networks such as the Internet.
2. Portable Audio with Selected Tempo
FIG. 1a illustrates functional blocks of a first preferred embodiment portable audio/media device which can be used for athletic training. An athlete (user) carries the portable device during training to play music to accompany a workout which has a selected target level of effort (e.g., target heart rate); a digital processor in the portable device beat matches the music to play at a beat rate compatible with the target effort level. In particular, prior to the workout, the user (athlete) enters a tempo (beats per minute) and selects a source for music to play during the workout, such as songs stored on the portable device and/or wireless streaming downloads to the portable device. Then during the workout, the portable device alters the playback speed (beat matches) of the music being played in real-time to match the entered tempo. The portable device sends out the altered audio to be played by speakers or headphones.
In effect, the same music can be played with different tempos during different workouts by selecting different beat rates. Thus, an athlete can listen to favorite music which is beat-adapted to a target workout effort level.
3. Portable with Metric to Tempo Conversion
4. Portable with Selected Workout Profile
Many workout machines contain a set of “workout profiles” (e.g., hill climbing profile, fat burn profile, cardio profile, etc.) that increase/decrease speed or resistance throughout the workout.
5. Portable with Feedback Control
If the workout profile is for biometric targets (e.g., target range for heart rate), then the real-time biometric data is used to increase/decrease the music speed when the athlete is below/above the target range (see Section 10 for more details on motivational aspect of this invention). Biometric and/or performance data may be provided by individual sensors (either internal to or external to the portable device) or by an exercise machine.
In particular, prior to the workout, the athlete selects a beat source, such as a wired or wireless heart monitor, selects a performance and/or biometric target, and then selects a source for music to play during the workout. Then during the workout, the portable device analyzes sensor inputs to determine whether performance and/or biometric targets are being met and computes a beat rate. The beat matcher then adjusts the tempo (alters the speed) of the music being played to match the computed beat rate. The beat rate computation can be according to a simple algorithm. For example, let BPMinput denote the input rate from a heart monitor, BPMtarget denote the target heart rate for the workout (which can be programmed to vary in time), and BPMmusic denote the music tempo, then BPMmusic could be determined as:
BPMmusic=BPMinput+constant*(BPMtarget−BPMinput)
where the constant can be programmed and even adjusted over time. Thus with a positive constant (e.g., 0.5), when the athlete's heart rate is below target, the music tempo is computed to exceed the current heart rate by a fraction of the target miss, and similarly when the athlete's heart rate is above target, the music tempo is computed to be less than the current heart rate by a fraction of the target miss. More generally, the square of the target miss, or other non-linear function of the target miss could be used. Coincidentally, common aerobic workout heart rates are similar to many song tempos; e.g., 120-150 beats per minute; so the beat matcher typically will not distort the song beyond familiarity.
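The linear rule above may be illustrated with a short sketch; the function and parameter names are hypothetical, and the constant of 0.5 is simply the example value given in the text.

```python
def music_bpm(bpm_input, bpm_target, constant=0.5):
    """BPMmusic = BPMinput + constant * (BPMtarget - BPMinput)."""
    return bpm_input + constant * (bpm_target - bpm_input)


# Heart rate below target: the music is set faster than the current heart rate.
print(music_bpm(130, 150))  # prints: 140.0
# Heart rate above target: the music is set slower than the current heart rate.
print(music_bpm(160, 150))  # prints: 155.0
```

A constant of 0 would make the music simply follow the heart rate, while a constant of 1 would hold the music at the target rate regardless of the measured input.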
6. Portable with Profiles, Feedback Biometric Plus GPS
7. Exercise Equipment
As illustrated in
The audio output of the portable audio device is streamed into the exercise machine (with a buffer for the incoming audio provided by the exercise machine). The streaming could be done either in the analog domain (i.e., audio-out/line-in) or be done digitally. An advantage of performing this digitally is that the exercise machine can monitor its (variable) consumption of the digital audio buffer data during the streaming, and then communicate via the two-way streaming interface with the portable audio device to request the appropriate amount of audio data to fill the buffer. As the audio is streamed through the exercise machine (with some delay due to the buffering), the athlete can listen to the speed-altered (beat-matched) output on the output jack of the exercise machine.
9. Exercise Equipment Influencing Portable Audio Player
The
10. Real-Time Beat Matching for Athletic Pursuits
Note that this is not limited to a single target metric. The playback rate can be a function of multiple biometric/performance metrics, and with different weights assigned to each metric. For example, if both speed and heart rate are monitored, they could be combined during the comparisons with the target values. If HCURR and HTARGET represent the current heart rate and target heart rate, respectively, and SCURR and STARGET designate the current and target speeds, then the following decision table could be formulated:
FIG. 1b illustrates functional blocks of a preferred embodiment beat matching architecture which includes a beat detector, a beat generator, a conversion ratio computer, and both a time-scale modifier and a variable sampling rate converter. The preferred embodiment methods start with an initial alignment of the input digital audio stream to the reference stream (beats generated from the beat source input) by alignment of a beat detected near the beginning of the input stream with a beat generated for the reference, and then continue with beat-matching on a frame-by-frame basis using both the TSM and the VSRC (variable sampling rate converter) to modify the input stream to beat match the reference stream. The frames are 10-second intervals of stream samples, and adjacent frames have about a 50% overlap. Note that a 10-second interval corresponds to 441,000 samples when a stream has a 44.1 kHz sampling rate. Also, a tempo of 120 beats per minute (bpm) would yield about 20 beat locations detected in a frame. The frame size could be larger or smaller; the 10-second frame was selected as a compromise between accuracy and memory requirements. For the reference stream from a beat source such as a heart rate monitor, a pedometer, or even a software beat generator, a beat location generator would provide the beat locations; see
RTSM[n]=⌊8R[n]+1/2⌋/8
RVSRC[n]=R[n]/RTSM[n]
when |R[n]/RTSM[n]−RVSRC[n−1]|<|R[n]/RTSM[n−1]−RVSRC[n−1]|, but otherwise as
RTSM[n]=RTSM[n−1]
RVSRC[n]=R[n]/RTSM[n]
The division by 8 in defining RTSM[n] just reflects the step size of the TSM; with a different step size, the divisor and round-off would adjust.
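The selection between the two candidate ratio pairs may be sketched as follows. The sketch assumes that R[n] is the overall conversion ratio for frame n and that the TSM ratio is the overall ratio rounded to the nearest 1/8 step, limited to the range 4/8 to 16/8 given in this section; the function and variable names are hypothetical.

```python
import math


def split_conversion_ratio(r, prev_tsm, prev_vsrc, steps=8):
    """Split overall ratio r into (TSM ratio, VSRC ratio) for the current frame.

    Candidate 1 rounds r to the nearest 1/steps; candidate 2 keeps the
    previous TSM ratio. The candidate whose VSRC ratio is closer to the
    previous VSRC ratio is chosen, which limits jumps in pitch.
    """
    rounded_tsm = math.floor(steps * r + 0.5) / steps
    rounded_tsm = min(max(rounded_tsm, 4 / steps), 16 / steps)  # assumed TSM range
    vsrc_if_rounded = r / rounded_tsm
    vsrc_if_kept = r / prev_tsm
    if abs(vsrc_if_rounded - prev_vsrc) < abs(vsrc_if_kept - prev_vsrc):
        return rounded_tsm, vsrc_if_rounded
    return prev_tsm, vsrc_if_kept


# Large change: the TSM takes a coarse step and the VSRC stays near 1.
print(split_conversion_ratio(1.30, prev_tsm=1.0, prev_vsrc=1.0))
# Small change: the previous TSM ratio is kept and the VSRC absorbs it.
print(split_conversion_ratio(1.07, prev_tsm=1.0, prev_vsrc=1.06))
```

In the first call the TSM moves to 10/8 and the VSRC ratio is about 1.04; in the second call switching the TSM to 9/8 would force the VSRC ratio down to about 0.95, so the previous TSM ratio is retained.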
As previously mentioned, the TSM provides coarse time-scale modification (in ⅛ increments between 4/8 and 16/8) and the VSRC provides variable time-scale adjustments. In these formulas, two pairs of TSM and VSRC conversion ratios are computed, and the pair whose VSRC ratio is closest to the previous VSRC value is selected (in order to avoid significant jumps in pitch). The first TSM ratio is obtained by rounding the overall conversion ratio to the nearest ⅛th increment, and the first VSRC ratio is obtained simply by dividing the overall conversion ratio by the first TSM ratio (since the TSM and VSRC are connected in series). The second TSM ratio is the previous TSM ratio, and the second VSRC ratio is obtained by dividing the overall conversion ratio by the previous TSM ratio. As shown in
12. Conversion Ratio Stability
The tempo reported by beat detectors has a tendency to jump between analysis frames. These tempo jumps can be harmonics or simple ratios of the previously-detected tempos in prior analysis frames. That is, the current tempo may be a multiple such as 2×, 0.5×, 3×, 0.67×, 1.5×, 1.33×, etc. of a prior tempo. These jumps are highly disruptive to the beat matcher, as they cause large, audible jumps in the conversion ratios from frame to frame.
Likewise, heart monitors and other parameter transducers may provide erratic inputs due to poor physical contacts, wireless interference, etc.; and even the physical beat source may have erratic output, such as heart beat transients or arrhythmia.
To remedy the tempo jump problem, the preferred embodiments maintain a history of prior tempo values for the input stream and the beat source (e.g., Bi and Br for prior frames) and adjust a current tempo from the previous tempos in the history, such as by a majority voting decision.
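One possible form of such a voting decision is sketched below. It assumes that a jump is corrected by scaling the newly detected tempo by one of a small set of simple ratios and keeping the scaled value that agrees with the most entries in the history; the ratio set, the 4% agreement tolerance, and the names are hypothetical choices rather than values from this description.

```python
from collections import deque

# Correction factors tried in order; 1.0 first so ties favor the raw value.
RATIOS = (1.0, 2.0, 0.5, 3.0, 1 / 3, 1.5, 2 / 3, 4 / 3, 0.75)


def stabilize_tempo(detected_bpm, history, tolerance=0.04):
    """Return detected_bpm, rescaled if a simple ratio matches the history better."""
    best_bpm, best_votes = detected_bpm, -1
    for ratio in RATIOS:
        candidate = detected_bpm * ratio
        # One vote per past tempo that lies within the relative tolerance.
        votes = sum(1 for past in history
                    if abs(candidate - past) <= tolerance * past)
        if votes > best_votes:
            best_bpm, best_votes = candidate, votes
    return best_bpm


history = deque([120.0, 121.0, 119.0, 120.0], maxlen=8)
print(stabilize_tempo(240.5, history))  # prints: 120.25 (octave jump undone)
print(stabilize_tempo(122.0, history))  # prints: 122.0 (consistent, unchanged)
```

After each frame the stabilized value would be appended to the history, so that a genuine and sustained tempo change eventually wins the vote.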
13. Monitoring Mechanical Devices
After each beat detection analysis frame, the Analyzer/Comparator compares the current frame's data to the data in the Beat Data History. If the current frame has a significant variation from the history, or is approaching or exceeding the set limits, then the Monitoring Location(s) can be notified of this problem. This notification can occur through various transmission methods, both wired (landline, IP, etc.) and wireless (radio, WiFi, etc.). Analysis can be enabled/disabled from the Monitoring Location(s), if continuous analysis is not desired. Also, the Monitoring Location can enable the Audio Monitoring Device to send positive indications of correct operation. If the Monitoring Location can also communicate with the Machine, it could shut off the Machine if the audio sensor records a problem. If remote communication with the Machine is not possible, a repair crew can be sent before the Machine's problem becomes critical (e.g., overheating).
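As a minimal sketch of the comparison against the Beat Data History, the following assumes that the history is a list of beat rates from prior frames and that a "significant variation" means a relative departure from the median of that history; the 10% threshold and the names are hypothetical.

```python
from statistics import median


def deviates_from_history(current_bpm, history_bpm, max_relative_change=0.10):
    """Return True if the current beat rate departs from the history median."""
    if not history_bpm:
        return False                      # nothing to compare against yet
    baseline = median(history_bpm)
    return abs(current_bpm - baseline) > max_relative_change * baseline


history = [60.0, 61.0, 59.0, 60.0]
print(deviates_from_history(60.5, history))  # prints: False
print(deviates_from_history(75.0, history))  # prints: True
```

A True result would correspond to the condition under which the Monitoring Location(s) are notified.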
As illustrated in the preferred embodiment system of
In short, the preferred embodiments provide an additional diagnostic monitor for regular mechanical systems. No modification is required to the machine, motor, or apparatus being sensed. No machine (or production) downtime is required for installation. Little technical skill is required to install each sensing device. A single device can sense “within limits”, “out of limits”, and “approaching limits” operation of a system, as opposed to a component. (Most sensors can sense only a component of the system.) This diagnostic monitor provides alerts so that preventive maintenance can be performed before a system-critical problem occurs. This is a significant advantage over temperature and pressure sensors, which signal catastrophic problems like overheating and dangerous pressure levels. For remote locations, early problem detection enables preventive maintenance that can be scheduled more easily (e.g., avoiding bad weather) than the repair of catastrophic failures, which must be fixed immediately.
More particularly, the Analyzer/Comparator could have various status outputs such as “within normal limits operation”, “within safe limits operation high” (i.e., not normal, but not failed—indicating a future failure at the high limit), “within safe limits operation low” (i.e., not normal, but not failed—indicating a future failure at the low limit), “out of limits—high”, “out of limits—low”. Also multiple sets of parameters may be auto-sensed and/or adjusted to accommodate multiple sets of boundaries/multiple rates of operation (e.g., a fan that runs at high speed when heat rises, then slows when the temperature drops). The preferred embodiments have the ability to adapt and the ability to output multiple levels of performance/operation information.
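The five status outputs listed above may be illustrated with the following sketch. It assumes that each operating mode is described by a normal band nested inside a wider safe band of beat rates; the band values in the example and all names are hypothetical.

```python
from enum import Enum


class BeatStatus(Enum):
    WITHIN_NORMAL_LIMITS = "within normal limits operation"
    WITHIN_SAFE_LIMITS_HIGH = "within safe limits operation high"
    WITHIN_SAFE_LIMITS_LOW = "within safe limits operation low"
    OUT_OF_LIMITS_HIGH = "out of limits high"
    OUT_OF_LIMITS_LOW = "out of limits low"


def classify_beat_rate(bpm, normal, safe):
    """Classify a measured beat rate against (low, high) normal and safe bands."""
    normal_low, normal_high = normal
    safe_low, safe_high = safe
    if bpm > safe_high:
        return BeatStatus.OUT_OF_LIMITS_HIGH
    if bpm < safe_low:
        return BeatStatus.OUT_OF_LIMITS_LOW
    if bpm > normal_high:
        return BeatStatus.WITHIN_SAFE_LIMITS_HIGH   # drifting toward the high limit
    if bpm < normal_low:
        return BeatStatus.WITHIN_SAFE_LIMITS_LOW    # drifting toward the low limit
    return BeatStatus.WITHIN_NORMAL_LIMITS


# Hypothetical fan: normal band 55 to 65 beats per minute, safe band 45 to 75.
for rate in (60, 70, 80, 50, 40):
    print(rate, classify_beat_rate(rate, normal=(55, 65), safe=(45, 75)).value)
```

Multiple rates of operation could be accommodated by keeping one pair of bands per operating mode and selecting the pair that matches the currently sensed mode.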
14. Matching Music to Machinery and Other Beat Sources
Some types of “machines” that may require synchronization with people:
While the machine beat pattern could be used as the reference signal, other information could be used instead to control the playback rate of the music over the loudspeakers as illustrated in
Workers' Speed Metric—
Manager's Desired BPM—
This application claims priority from U.S. provisional patent Appl. No. 60/827,500, filed Sep. 29, 2006. Copending, co-assigned application Ser. Nos. 11/371,597, filed Mar. 9, 2006, and 11/469,745, filed Sep. 1, 2006, disclose related subject matter.
Number | Date | Country
---|---|---
60827500 | Sep 2006 | US