Claims
- 1. A method for determining the location of transients in a sampled audio signal, said method comprising: breaking said sampled audio signal into a series of time windows at a series of time points; determining the frequency energy characteristics of each window; determining energy curve values at time points of windows having frequency characteristics increased in magnitude from frequency energy characteristics of a preceding window; low-pass filtering the energy curve values to provide a smoothed energy curve; and selecting maxima of the smoothed energy curve as transient points of the sampled audio signal.
- 2. A method of time scaling a sampled audio signal, said method comprising: locating the transients of the sampled audio signal; protecting an interval about each transient so that time scaling is performed only on non-transient frames of the sampled audio signal located between transients; and changing the duration of the non-transient frames by repeating or deleting portions of the non-transient frame.
- 3. The method of claim 2 where said locating the transients comprises: breaking said sampled audio signal into a series of time windows at a series of time points; determining the frequency energy characteristics of each window; determining energy curve values at time points of windows having frequency characteristics increased in magnitude from frequency energy characteristics of an immediately preceding window; and selecting time points at peaks of the energy curve as transient points of the sampled audio signal, and where for a selected non-transient frame of the audio signal having a time duration of T seconds, changing the duration comprises: determining a modification factor for the selected non-transient frame with the product of T with the modification factor being the modified duration of the selected non-transient frame; and splicing segments of the selected non-transient frame into the non-transient frame to change the duration of the selected non-transient frame to the modified duration.
- 4. A method for changing the duration of an audio signal from a time T to a time T1, said method comprising: locating transient times identifying times when a transient occurs in the audio signal, with each transient time bracketed by preceding and following protected areas; and for an audio signal interval between a current and next transient: calculating the duration of the audio signal interval; calculating the duration of an ideal modified interval; determining a modified time-scale factor to compensate for the shortening of the audio signal interval due to the protected areas bracketing the transients; performing frequency domain time scaling based on the modified time-scale factor to modify the length of the interval between the protected areas to form a time-scaled interval; and overlapping the time-scaled interval with the current and next transients.
- 5. The method of claim 4 comprising: for a preceding protected area of a first duration and a following protected area of a second duration around each transient; subtracting the second duration following the current transient and the first duration preceding the next transient from the duration of the audio signal interval and the duration of an ideal modified interval to form a compensated audio signal interval and a compensated ideal modified interval respectively; and calculating a modification factor equal to the ratio of the compensated ideal modified interval to the compensated audio signal interval.
- 6. The method of claim 5 comprising: multiplying the compensated audio signal interval by the modification factor to determine the actual duration of a time-scaled audio signal to be inserted between the left protected area of the initial transient and right protected area of the next transient.
- 7. A method for determining the location of transients in a sampled audio signal having a predetermined time duration, said method comprising: breaking said sampled audio signal into a series of time windows at a series of time values; performing a fast Fourier transform (FFT) on each time window to obtain a set of frequency bins for each time window; summing the positive differences between bins of preceding and following time windows at the same frequencies to determine values of a rectified level signal; filtering the rectified level signal to form a filtered level signal; and locating transients at peaks of the filtered level signal.
- 8. A computer product comprising: a computer usable medium having computer readable program code embodied therein for directing operation of a data processor, said computer readable program code including: computer readable program code executed by said data processor to protect an interval about each transient so that time scaling is performed only on non-transient frames of the sampled audio signal located between transients; computer readable program code executed by said data processor to change the duration of the non-transient frames by repeating or deleting portions of the non-transient frame; and for a selected non-transient frame of the audio signal having a time duration of T seconds: computer readable program code executed by said data processor to determine a modification factor for the selected non-transient frame with the product of T with the modification factor being the modified duration of the selected non-transient frame; and computer readable program code executed by said data processor to splice segments of the selected non-transient frame into the non-transient frame to change the duration of the selected non-transient frame to the modified duration.
- 9. A system for time-scaling an audio signal, the system comprising: a central processing unit (CPU); a memory storing a digital representation of the audio signal and program code for execution by said CPU; with said CPU executing said program code to: locate transients of a sampled audio signal; protect an interval about each transient so that time scaling is performed only on non-transient frames of the sampled audio signal located between transients; and change the duration of the non-transient frames by repeating or deleting portions of the non-transient frame.
- 10. A method for changing the duration of an audio signal from a time T to a time T1, said method comprising: locating transient times identifying times when a transient occurs in the audio signal; and for an audio signal interval between a current and a next transient: calculating a duration of the audio signal interval; calculating a duration of an ideal modified interval; determining a duration of required splicing; providing a desired splice length; based on the desired splice length, determining the number of splices, the location of the splices, and the duration of the splices; performing splices and outputting a modified audio signal interval.
- 11. A computer product comprising: a computer usable medium having computer readable program code embodied therein for directing operation of a data processor to time scale an interval between a current and a next transient in an audio file, said computer readable program code including: computer readable program code executed by said data processor to calculate the duration of the audio signal interval; computer readable program code executed by said data processor to calculate the duration of an ideal modified interval; computer readable program code executed by said data processor to determine a modified time-scale factor to compensate for the shortening of the audio signal interval due to the protected areas bracketing the transients; computer readable program code executed by said data processor to perform frequency domain time scaling based on the modified time-scale factor to modify the length of the interval between the protected areas to form a time-scaled interval; and computer readable program code executed by said data processor to overlap the time-scaled interval with the current and next transients.
- 12. A system for time-scaling an audio signal, the system comprising: a central processing unit (CPU); a memory storing a digital representation of the audio signal and program code for execution by said CPU; with said CPU executing said program code to: locate transients of a sampled audio signal; protect an interval about each transient so that time scaling is performed only on non-transient frames of the audio signal located between transients; and for an audio signal interval between a current and a next transient: calculate a duration of the audio signal interval; calculate a duration of an ideal modified interval; determine a modified time-scale factor to compensate for shortening of the audio signal interval due to the protected areas bracketing the transients; perform frequency domain time scaling based on the modified time-scale factor to modify a length of the interval between the protected areas to form a time-scaled interval; and overlap the time-scaled interval with current and next transients.
- 13. A method of time scaling a sampled audio signal, said method comprising: locating transients of the sampled audio signal; performing time scaling on non-transient frames of the sampled audio signal located between transients; and changing a duration of the non-transient frames to time scale the sampled audio signal.
- 14. A method for determining the location of transients in a sampled audio signal, said method comprising: breaking said sampled audio signal into a series of time windows at a series of time points; determining the frequency energy characteristics of each window; determining energy curve values at time points of windows having frequency characteristics increased in magnitude from frequency energy characteristics of a preceding window; filtering the energy curve values to provide a filtered energy curve; and selecting points at peaks of the filtered energy curve as transient points of the sampled audio signal.
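By way of illustration only, the transient location steps recited in claim 7 and the compensated modification factor recited in claims 5 and 6 may be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the function names (`fft`, `moving_average`, `detect_transients`, `compensated_factor`), the Hann window, the window and hop sizes, the centered moving average used as the filter, and the relative threshold applied to the peaks are all hypothetical choices that do not appear in the claims.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def moving_average(values, width):
    """Centered moving average (assumed stand-in for the claimed filter)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        seg = values[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def detect_transients(signal, win=64, hop=32, smooth=3, rel_threshold=0.3):
    """Return window start indices (in samples) at peaks of the filtered level."""
    # Assumption: a Hann window is applied to each time window before the FFT.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / win) for n in range(win)]
    level, prev = [], None
    for start in range(0, len(signal) - win + 1, hop):
        frame = [signal[start + n] * hann[n] for n in range(win)]
        mags = [abs(c) for c in fft(frame)[: win // 2 + 1]]
        if prev is None:
            level.append(0.0)
        else:
            # Rectified level: sum only the positive bin-by-bin differences.
            level.append(sum(max(0.0, m - p) for m, p in zip(mags, prev)))
        prev = mags
    smoothed = moving_average(level, smooth)
    # Assumption: peaks below a fraction of the largest value are ignored.
    floor = rel_threshold * max(smoothed, default=0.0)
    peaks = []
    for i in range(1, len(smoothed) - 1):
        rising = smoothed[i] > smoothed[i - 1]
        not_below_next = smoothed[i] >= smoothed[i + 1]
        if smoothed[i] > floor and rising and not_below_next:
            peaks.append(i * hop)
    return peaks


def compensated_factor(interval, ideal, post_protect, pre_protect):
    """Ratio of compensated ideal interval to compensated interval (claim 5)."""
    protected = post_protect + pre_protect
    compensated = interval - protected
    if compensated <= 0:
        raise ValueError("protected areas cover the whole interval")
    return (ideal - protected) / compensated
```

Under these assumptions, `detect_transients` takes a list of samples and returns sample positions, and multiplying the compensated interval by the value returned from `compensated_factor` gives the duration of the time-scaled material to be placed between the protected areas, as in claim 6.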
CROSS-REFERENCE TO RELATED APPLICATION
This application is a continuation-in-part of application Ser. No. 08/745,929, filed Nov. 7, 1996, entitled “Time-Domain Time/Pitch Scaling of Speech or Audio Signal,” assigned to the assignee herein, the disclosure of which is incorporated herein by reference. Application Ser. No. 08/745,929 was issued as U.S. Pat. No. 6,049,766 on Apr. 11, 2000.
This application claims priority from provisional application Serial No. 60/117,154, filed Jan. 25, 1999, entitled “Beat Synchronous Audio Processing,” the disclosure of which is incorporated herein by reference.
US Referenced Citations (5)
Non-Patent Literature Citations (4)
- “Time-Frequency Analysis of Musical Signals,” Pielemeier, William et al., Proceedings of the IEEE, vol. 84, No. 9, Sep. 1996.*
- “Determination of the meter of musical scores by autocorrelation,” Brown, J. Acoust. Soc. Am. 94 (4), Oct. 1993.
- “Tempo and beat analysis of acoustic musical signals,” Scheirer, J. Acoust. Soc. Am. 103 (1), Jan. 1998.
- “Pulse Tracking with a Pitch Tracker,” Scheirer, Machine Listening Group, MIT Media Laboratory, Cambridge MA 02139, 1997.
Provisional Applications (1)
| Number | Date | Country |
| 60/117154 | Jan 1999 | US |
Continuation in Parts (1)
| | Number | Date | Country |
| Parent | 08/745929 | Nov 1996 | US |
| Child | 09/378377 | | US |