At least some embodiments of the present invention relate generally to audio signal processing, and more particularly, to correlating audio signals.
Audio signal processing, sometimes referred to as audio processing, is the processing of a representation of auditory signals, or sound. The audio signals, or sound, may be in a digital or an analog data format. The analog data format is normally electrical, wherein a voltage level represents the air pressure waveform of the sound. A digital data format expresses the air pressure waveform as a sequence of symbols, usually binary numbers. The audio signals presented in analog or in digital format may be processed for various purposes, for example, to correct timing of the audio signals.
Currently, audio signals may be generated and modified using a computer. For example, sound recordings or synthesized sounds may be combined and altered as desired to create standalone audio performances, soundtracks for movies, voiceovers, special effects, etc. To synchronize stored sounds, including music audio, with other sounds or with visual media, it is often necessary to alter the tempo (i.e., playback speed) of one or more sounds.
Generally, a loop in audio processing may refer to a finite element of sound which is repeated using, for example, technical means. Loops may be repeated through the use of tape loops, delay effects, cutting between two record players, or with the aid of computer software. Many musicians may use digital hardware and software devices to create and modify loops, often in conjunction with various electronic musical effects. Live looping generally refers to the recording and playback of looped audio samples in real time, using either hardware (magnetic tape or dedicated hardware devices) or software. A user typically determines the duration of the recorded musical piece to set the length of a loop. The speed or tempo at which the musical piece is played may define the speed of the loop. The recorded piece of music is typically played in the loop at a constant reference tempo. New musical pieces can be recorded subsequently on top of the previously recorded musical pieces played at the tempo of the reference loop.
Because the tempo and/or speed of recording of the new musical pieces may change, the loops of the newly recorded musical pieces may be non-synchronized to each other. The lack of synchronization between the musical pieces can severely impact a listening experience. Therefore, after being recorded, the tempo of the new musical pieces may be changed to the constant reference tempo of the previously recorded musical piece played in the reference loop.
Unfortunately, merely changing the tempo of all newly recorded musical pieces to a constant reference tempo may result in undesired audible side effects such as pitch variation (e.g., the “chipmunk” effect of playing a sound faster) and clicks and pops caused by skips in data as the tempo of the newly recorded pieces is changed. Currently there are no ways to dynamically adjust the tempo of the musical pieces during recording.
Exemplary embodiments of methods, apparatuses, and systems to correlate changes in one audio signal to another audio signal are described. In one embodiment, a first audio signal is outputted, and a second audio signal is received. The second audio signal may be stored in a memory buffer. The first audio signal is correlated to conform to changes in the second audio signal. The first audio signal may be dynamically correlated to match with the second audio signal while the second audio signal is received. At least in some embodiments, a size of a musical time unit of the second audio signal is determined to correlate the first audio signal. At least in some embodiments, the adjusted first audio signal is stored in another memory buffer.
At least in some embodiments, correlating the first audio signal may include time stretching the first audio signal, time compressing the first audio signal, or both. In some embodiments, correlating the first audio signal includes adjusting a tempo of the first audio signal to the tempo of the second audio signal.
At least in some embodiments, a first audio signal is outputted, and a second audio signal is received. For example, the first audio signal may be played back, generated, or both. Data of the second audio signal may be stored in a memory buffer. The data of first audio signal may be dynamically correlated to conform to the changes in the second audio signal while the second audio signal is received. Further, a third audio signal may be received. The third audio signal may be stored in another memory buffer. At least the second audio signal may be adjusted to conform to the third audio signal.
At least in some embodiments, a first audio signal is outputted while a second audio signal is received. The data of the second audio signal may be stored in a memory buffer. Further, a determination is made whether to commit data of the second audio signal to mix with the data of the first audio signal. The data of the first audio signal is dynamically correlated to match with the data of the second audio signal if the data of the second audio signal is committed to mix with the data of the first audio signal.
At least in some embodiments, a new audio signal is received. The new audio signal is stored in a memory buffer. A size of a musical unit of the new audio signal may be determined. The musical time unit may be, for example, a beat, a measure, a bar, or any other musical time unit. The size of the musical unit of a recorded audio signal is adjusted to the size of the musical unit of the new audio signal. At least in some embodiments, the new audio signal may be grouped with one or more previously recorded audio signals.
At least in some embodiments, a new audio signal is received. The new audio signal is stored in a memory buffer. A size of a musical unit of the new audio signal may be determined. The size of the musical unit may be determined based on a tempo of the new audio signal. The size of the musical unit may include a time value. The size of the musical unit of a recorded audio signal is adjusted to the size of the musical unit of the new audio signal.
At least in some embodiments, a determination is made whether to commit data of the new audio signal to mix with the data of the recorded audio signal. The size of the musical unit of a recorded audio signal is adjusted to the size of the musical unit of the new audio signal when the data of the new audio signal are committed to mix with the data of the recorded audio signal.
At least in some embodiments, adjusting data of the recorded audio signal to the data of the new audio signal comprises time stretching data of the recorded audio signal to match the size of the musical unit of the new audio signal, time compressing data of the recorded audio signal to match the size of the musical unit of the new audio signal, or both. At least in some embodiments, the recorded audio signal is faded out after being correlated to changes in the new audio signal.
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.
Various embodiments and aspects of the inventions will be described with reference to details discussed below, and the accompanying drawings will illustrate the various embodiments. The following description and drawings are illustrative of the invention and are not to be construed as limiting the invention. Numerous specific details are described to provide a thorough understanding of various embodiments of the present invention. It will be apparent, however, to one skilled in the art, that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. Copyright© Apple, 2009, All Rights Reserved.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification do not necessarily refer to the same embodiment.
Unless specifically stated otherwise, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a data processing system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Embodiments of the present invention can relate to an apparatus for performing one or more of the operations described herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a machine (e.g., computer) readable storage medium, such as, but not limited to, any type of disk, including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), erasable programmable ROMs (EPROMs), electrically erasable programmable ROMs (EEPROMs), magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required machine-implemented method operations. The required structure for a variety of these systems will appear from the description below.
In addition, embodiments of the present invention are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of embodiments of the invention as described herein.
Exemplary embodiments of methods, apparatuses, and systems to correlate changes in audio signals are described. More specifically, the embodiments are directed towards methods, apparatuses, and systems for recording new audio while playing back existing audio. The system may output, for example, generate, and/or playback a first audio signal while receiving a second (new) audio signal. The newly recorded audio signal and the first audio signal may be correlated, such that the existing first audio signal matches the tempo changes of the new audio signal. The new audio signal may be stored in a memory buffer. The first audio signal is correlated to conform to changes in the second audio signal. The first audio signal may be dynamically correlated to match with the second audio signal while the second audio signal is received.
At least in some embodiments, a size of a musical time unit of the second audio signal is determined to correlate the first audio signal. At least in some embodiments, the adjusted first audio signal is stored in another memory buffer. Embodiments of the invention operate to keep the record buffer playing back at the correct synchronization and pitch when the tempo of the newly recorded audio changes, as if a tape were speeding up and slowing down along with a master clock, as set forth in further detail below. That is, the embodiments of the invention operate to preserve the sound quality while keeping the most recent performances as free of time stretching/time compressing as possible, as described in further detail below.
As shown in
Memory 109 can be dynamic random access memory (DRAM) and can also include static RAM (SRAM). Memory 109 may include one or more memory buffers, as described in further detail below. The bus 107 couples the processor 105 to the memory 109 and also to non-volatile storage 115 and to display controller 111 and to the input/output (I/O) controller 117. The display controller 111 controls in the conventional manner a display on a display device 113 which can be a cathode ray tube (CRT) or liquid crystal display (LCD). The I/O controller 117 is coupled to one or more audio input devices 125, for example, one or more microphones, to receive audio signals.
As shown in
The display controller 111 and the I/O controller 117 can be implemented with conventional well known technology. A digital image input device 121 can be a digital camera which is coupled to an I/O controller 117 in order to allow images from the digital camera to be input into the data processing system 101. The non-volatile storage 115 is often a magnetic hard disk, an optical disk, or another form of storage for large amounts of data. Some of this data is often written, by a direct memory access process, into memory 109 during execution of software in the data processing system 101. One of skill in the art will immediately recognize that the terms “computer-readable medium” and “machine-readable medium” include any type of storage device that is accessible by the processor 105.
It will be appreciated that the data processing system 101 is one example of many possible data processing systems which have different architectures. For example, personal computers based on an Intel microprocessor often have multiple buses, one of which can be an input/output (I/O) bus for the peripherals and one that directly connects the processor 105 and the memory 109 (often referred to as a memory bus). The buses are connected together through bridge components that perform any necessary translation due to differing bus protocols.
Network computers are another type of data processing system that can be used with the embodiments of the present invention. Network computers do not usually include a hard disk or other mass storage, and the executable programs are loaded from a network connection into the memory 109 for execution by the processor 105. A Web TV system, which is known in the art, is also considered to be a data processing system according to the embodiments of the present invention, but it may lack some of the features shown in
It will be apparent from this description that aspects of the present invention may be embodied, at least in part, in software. That is, the techniques may be carried out in a computer system or other data processing system in response to its processor, such as a microprocessor, executing sequences of instructions contained in a memory, such as ROM, volatile RAM, non-volatile memory, cache, or a remote storage device.
In various embodiments, hardwired circuitry may be used in combination with software instructions to implement the present invention. Thus, the techniques are not limited to any specific combination of hardware circuitry and software nor to any particular source for the instructions executed by the data processing system. In addition, throughout this description, various functions and operations are described as being performed by or caused by software code to simplify description. However, those skilled in the art will recognize what is meant by such expressions is that the functions result from execution of the code by a processor, such as the processing unit 105.
A machine readable medium can be used to store software and data which, when executed by a data processing system, cause the system to perform various methods of the present invention. This executable software and data may be stored in various places including, for example, ROM, volatile RAM, non-volatile memory, and/or cache. Portions of this software and/or data may be stored in any one of these storage devices.
Thus, a machine readable medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a machine (e.g., a computer, network device, cellular phone, personal digital assistant, manufacturing tool, any device with a set of one or more processors, etc.). For example, a machine readable medium includes recordable/non-recordable media (e.g., read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; and the like).
The methods of the present invention can be implemented using dedicated hardware (e.g., using Field Programmable Gate Arrays or Application Specific Integrated Circuits) or shared circuitry (e.g., microprocessors or microcontrollers under control of program instructions stored in a machine readable medium). The methods of the present invention can also be implemented as computer instructions for execution on a data processing system, such as system 100 of
Many of the methods of the present invention may be performed with a digital processing system, such as a conventional, general-purpose computer system. The computer systems may be, for example, entry-level Mac mini® and consumer-level iMac® desktop models, the workstation-level Mac Pro® tower, and the MacBook® and MacBook Pro® laptop computers produced by Apple Inc., located in Cupertino, Calif. Small systems (e.g. very thin laptop computers) can benefit from the methods described herein. Special purpose computers, which are designed or programmed to perform only one function, or consumer electronic devices, such as a cellular telephone, may also perform the methods described herein.
In one embodiment, the outputting includes playing back the first audio signal in a loop. The length of the first audio signal, e.g., one or more musical measures, bars, or any other time measure, may determine the length of the loop. In another embodiment, the outputting includes generating (e.g., synthesizing) the first audio signal to play in the loop. The first audio signal may be outputted through, for example, audio output 123 depicted in
At operation 202, a second audio signal is received. In one embodiment, the second audio signal has one or more tempo variances (changes) relative to the first audio signal. The tempo variances may cause pitch changes in the second audio signal relative to the first audio signal. The second audio signal may be received through, for example, audio input 125 depicted in
At operation 204, the data of the first audio signal are correlated to conform to the changes in the second audio signal. In one embodiment, the data of the first audio signal are dynamically correlated to the data of the second audio signal while the second audio signal is received. In one embodiment, the tempo of the second audio signal changes continuously, and the first audio signal is dynamically correlated to the second audio signal to homogenize the speed at which playback is happening versus recording time and recording speed.
In one embodiment, correlating the data of the first audio signal to conform to the changes in the second audio signal includes adjusting a tempo of the first audio signal to the tempo of the second audio signal.
A portion (e.g., grain) of data of the first audio signal may be dynamically adjusted to match to the data of the second audio signal. For example, the portion of data of the first audio signal may be stretched in time (“time stretched”), compressed in time (“time compressed”), or both, to match to the data of the newly received second audio signal. That is, the data of the first audio signal are adjusted to the data of the second audio signal piecemeal based on the grains. In one embodiment, time-stretching and/or time-compressing of the portion of the data of the first audio signal to the portion of the data of the second audio signal is performed such that the first audio signal is relatively adjusted in pitch to the relative pitch changes in the second audio signal. In one embodiment, the size of the grain of data is the size of a musical time unit. The musical time unit may be, e.g., a beat, a portion of the beat, measure, bar, or any other musical time unit. The size of the grain of the audio data can be determined based on the tempo of the audio signal.
In one embodiment, the grain size of the audio data varies according to the tempo of the audio signal. The data of the first audio signal may be correlated by adjusting the size of the musical units to match to the size of the musical units associated with the second audio signal, as described in further detail below. In one embodiment, the relatively adjusted first audio signal is stored in a third memory buffer, such as yet another memory buffer of memory 109.
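The relationship between tempo and grain size described above can be illustrated with a minimal sketch. It assumes that one grain equals one beat and that audio is held as a list of samples at a fixed sample rate; the function names and the 44100 Hz default are hypothetical and do not come from the specification.

```python
def beat_duration_seconds(tempo_bpm: float) -> float:
    """Duration of one beat, in seconds, at the given tempo."""
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    return 60.0 / tempo_bpm


def grain_size_samples(tempo_bpm: float, sample_rate: int = 44100) -> int:
    """Samples in a one-beat grain. The sample rate is an assumed value."""
    return round(beat_duration_seconds(tempo_bpm) * sample_rate)


def split_into_grains(samples: list, grain_size: int) -> list:
    """Cut a buffer into consecutive grains; the last grain may be shorter."""
    if grain_size <= 0:
        raise ValueError("grain size must be positive")
    return [samples[i:i + grain_size] for i in range(0, len(samples), grain_size)]
```

Under these assumptions a slower tempo yields a larger grain, so the same buffer is cut into fewer, longer pieces.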
At operation 205 it is determined whether one or more new audio signals are received. If there are no more new audio signals received, method 200 returns to operation 201. If there are new audio signals, method 200 continues at operation 206 that involves receiving a new audio signal. The new audio signal may have one or more tempo variances (changes) relative to the one or more previously recorded audio signals. At operation 207 data of the new audio signal are stored in a new memory buffer, such as yet another memory buffer of memory 109.
At operation 208, the data of each of the one or more previously recorded audio signals are correlated to conform to the changes in the new audio signal, as described above with respect to operation 204. The correlated data of each of the previously recorded audio signals can be stored in the corresponding memory buffers. That is, instead of adapting the new audio performance to what was already in the memory buffer, the old performance already played in the loop is adjusted to the new performance, which becomes the new master tempo until the next audio performance is received.
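The idea that the newest performance becomes the master tempo, with every earlier recording adjusted to it, might be sketched as follows. The RecordedLayer class and the adopt_new_master function are hypothetical names introduced only for illustration, and each layer is reduced to its beat size rather than its audio data.

```python
from dataclasses import dataclass


@dataclass
class RecordedLayer:
    name: str
    beat_seconds: float  # current size of one beat in this layer


def adopt_new_master(layers, new_tempo_bpm):
    """Make the new recording's tempo the master for all earlier layers.

    Returns the new beat size and a mapping of layer name to the ratio
    by which that layer is adjusted: above 1.0 means time stretched,
    below 1.0 means time compressed, exactly 1.0 means unchanged.
    """
    new_beat = 60.0 / new_tempo_bpm
    ratios = {}
    for layer in layers:
        ratios[layer.name] = new_beat / layer.beat_seconds
        layer.beat_seconds = new_beat  # the layer now follows the new master
    return new_beat, ratios
```

In this sketch the new recording itself is never adjusted, which reflects the stated goal of leaving the most recent performance free of time stretching or compressing.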
The time the loop is played back is determined by the tempo and the length of the loop. For example, if the length of the loop is 1 measure (8 beats), and the rate of the first audio signal's playback (tempo) is 120 beats per minute, the time the loop is played is 4 seconds. If the length of the loop is 1 measure (8 beats), and the rate of the first audio signal's playback (tempo) is 60 beats per minute, the time the loop is played is 8 seconds. At operation 302, a second audio signal is received. The second audio signal may include one or more tempo variances whereby the tempo variances cause relative pitch changes in the second audio signal. The data of the second audio signal are stored in a second memory buffer at operation 303, as set forth above.
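The loop-duration arithmetic in the example above can be checked with a short sketch. The function name is hypothetical; the formula is simply the number of beats multiplied by the length of one beat.

```python
def loop_duration_seconds(beats_in_loop: int, tempo_bpm: float) -> float:
    """Time taken to play the loop once at a constant tempo."""
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    return beats_in_loop * 60.0 / tempo_bpm


# The two cases from the text: an 8-beat measure at 120 and at 60 beats per minute.
fast = loop_duration_seconds(8, 120)  # 4.0 seconds
slow = loop_duration_seconds(8, 60)   # 8.0 seconds
```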
At operation 304, a size of a musical unit associated with the second audio signal may be determined. The musical time unit may be a beat, a portion of the beat, measure, bar, or any other musical time unit. In one embodiment, the size of the musical unit includes time. In one embodiment, the size of the musical unit is determined based on a tempo of the audio signal. For example, if the rate of the first audio signal's playback (tempo) is 120 beats per minute, the size (“length of time”) of the beat associated with the first audio signal is 0.5 seconds. If the second audio signal is played at a tempo of 60 beats per minute, the size of the beat associated with the second audio signal is 1 second. If the loop has the length of one measure (8 beats), the loop then plays for 8 seconds.
At operation 305, the size of the musical unit of the first audio signal is adjusted to the size of the musical unit of the second audio signal. For example, the size of the beat of previously recorded audio signal is adjusted from 0.5 second to 1 second to match to the size of the beat of the newly received audio signal. Musically the tempo may be granular to the beat, so that the tempo of every beat of the previously recorded audio data can be instantaneously adjusted to the changing tempo of the newly received audio data.
That is, the size of each beat of the previously recorded audio signal is adjusted dynamically to match the size of each beat of the currently received audio signal. Then, the grains of the audio data of the previously recorded audio signal can be time stretched/compressed based on the adjusted size of each beat. The adjusted grains of audio data of the first audio signal and the audio data of the second audio signal are then mixed and output through an audio output device, as described below.
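A grain-by-grain adjustment of this kind might look like the sketch below. The specification does not name a particular stretching technique, so this example swaps in plain linear-interpolation resampling, which changes pitch along with duration in the way a tape does when it speeds up or slows down. Both function names are hypothetical.

```python
def adjusted_length(grain_len: int, old_beat_s: float, new_beat_s: float) -> int:
    """Target sample count for a grain after its beat size is adjusted."""
    return round(grain_len * new_beat_s / old_beat_s)


def stretch_grain(grain, target_len):
    """Resample one grain to target_len samples by linear interpolation.

    A longer target time stretches the grain; a shorter one compresses it.
    """
    if target_len <= 0 or not grain:
        return []
    if len(grain) == 1 or target_len == 1:
        return [float(grain[0])] * target_len
    step = (len(grain) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(grain) - 1)
        frac = pos - lo
        out.append(grain[lo] * (1.0 - frac) + grain[hi] * frac)
    return out
```

For the example in the text, a one-beat grain recorded at 0.5 seconds per beat doubles in length when the new beat size is 1 second. A production system would more likely use a dedicated time-stretching algorithm, which this sketch does not attempt.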
At operation 403, the size of a musical unit associated with the new audio signal is determined based on the tempo. In one embodiment, the musical unit of the audio signal is a beat. The size may be a time length (duration) of the musical unit, for example, the duration of a beat. At operation 404, it is determined if the size of the musical unit of the new audio signal is different from the size of the musical unit of the previously recorded audio signal. If the size of the musical unit of the new audio signal is not different from the size of the musical unit of the previously recorded audio signal, the data of the previously recorded audio signal are not adjusted at operation 405.
If the size of the musical unit of the new audio signal is different from the size of the musical unit of the previously recorded audio signal, operation 406 is performed that involves determining whether the size of the musical unit of the new audio signal is greater than the size of the musical unit of the previously recorded audio signal. If the size of the musical unit of the new audio signal is greater than the size of the musical unit of the previously recorded audio signal, then at operation 407 a portion of the data of the previously recorded audio signal is time stretched to match to the size of the musical unit of the new audio signal.
If the size of the musical unit of the new audio signal is smaller than the size of the musical unit of the previously recorded audio signal, then at operation 408 a portion of the data of the previously recorded audio signal is time compressed to match the size of the musical unit of the new audio signal. Time stretching and time compressing of the audio data may be performed using one of the techniques known to one of ordinary skill in the art of audio processing.
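The decision flow of operations 404 through 408 can be summarized in a small sketch. The function name, the returned labels, and the tolerance used to treat two sizes as equal are assumptions made for illustration.

```python
def choose_adjustment(new_unit_s: float, recorded_unit_s: float,
                      tolerance_s: float = 1e-9) -> str:
    """Pick the adjustment for the previously recorded audio data.

    Sizes are musical unit durations in seconds, for example beat lengths.
    """
    diff = new_unit_s - recorded_unit_s
    if abs(diff) <= tolerance_s:
        return "none"      # operation 405: sizes match, data left unchanged
    if diff > 0:
        return "stretch"   # operation 407: new unit is longer
    return "compress"      # operation 408: new unit is shorter
```

For instance, a recorded beat of 0.5 seconds paired with a new beat of 1 second selects time stretching, and the reverse selects time compressing.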
Referring back to
In one embodiment, the memory buffer 605 is not played back. In one embodiment, the data of the audio signal are not output from memory buffer 605 to play back the audio signal.
Referring back to
Referring back to
In one embodiment, each of the “Full Undo” memory buffers can be played back. There may be multiple speeds of playback of audio signals on each of the “Full Undo” memory buffers simultaneously. The audio data recorded into each of the “Full Undo” memory buffers may be time stretched and/or time compressed to play back at a correct synchronization and pitch when the tempo of the newly recorded audio signal changes. That is, previously recorded audio data from each of the “Full Undo” memory buffers can be time stretched and/or time compressed to playback while the most recently received audio data are kept substantially free of time stretching/time compressing.
Referring back to
As shown in
Referring back to
As shown in
In one embodiment, the audio data of the previously recorded audio signal are time-stretched to match to the size of the musical unit associated with the data of the new audio signal 601, as set forth above. In another embodiment, the audio data of the previously recorded audio signal are time compressed to match to the size of the musical unit of the new audio signal 607.
As shown in
If the audio signal data are arranged in groups, each group of the audio data may be stored into a corresponding main memory buffer, such as buffer 610. For example, a group A of the audio data adjusted, as described above with respect to
In one embodiment, audio data of the previously recorded audio signal are faded out after being adjusted to conform to the new recording's tempo. For example, the previously recorded audio signal may sound quieter and quieter as the play back in the loop proceeds further. After being adjusted and mixed, as described above, the audio data are outputted at 614, for example, through one or more speakers.
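One way to picture the fade-out is a gain that shrinks on each pass through the loop. The specification does not state a fade curve, so the geometric decay and its default factor of 0.5 below are assumptions, as are the function names.

```python
def fade_gain(pass_index: int, decay_per_pass: float = 0.5) -> float:
    """Gain applied to old audio on a given loop pass (pass 0 is unfaded)."""
    if pass_index < 0:
        raise ValueError("pass index must not be negative")
    return decay_per_pass ** pass_index


def mix_with_fade(old_samples, new_samples, pass_index, decay_per_pass=0.5):
    """Mix faded old audio with new audio, sample by sample."""
    gain = fade_gain(pass_index, decay_per_pass)
    return [old * gain + new for old, new in zip(old_samples, new_samples)]
```

With these assumed values the older material is at half level on the first repeat and quarter level on the second, so it grows quieter as playback in the loop proceeds.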
The visual representation of tape 715 moves all the way to the right. As recording of new audio data proceeds, new data appear on the tape together with the previously recorded old audio data. GUI 701 includes a “record” button 702, a “play” button 703, and a “reverse play” button 705. GUI 701 includes an indicator 706 indicating a current relative position of the recording audio along the loop. An indicator 704 indicates a total length of the loop. For example, the total length of the loop may be any number (e.g., from 1 to 8) of measures and/or bars. The total length of the loop may be set by the user. GUI 701 further includes a “clock” knob 707. At the beginning of the loop, the position of the knob 707 is at zero, and knob 707 moves all the way around and back to zero, like a little “clock”, as the audio is played back one time in the loop.
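The behavior of the “clock” knob can be sketched as a mapping from playback position to a rotation angle. The function name and the choice of degrees are assumptions; the specification only says that the knob starts at zero and returns to zero after one pass through the loop.

```python
def knob_angle_degrees(position_s: float, loop_length_s: float) -> float:
    """Rotation of the knob for a playback position within the loop."""
    if loop_length_s <= 0:
        raise ValueError("loop length must be positive")
    # Wrap the position so the knob returns to zero after each full pass.
    return (position_s % loop_length_s) / loop_length_s * 360.0
```

In a 4-second loop, for example, the knob would be a quarter of the way around after 1 second and back at zero after 4 seconds.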
GUI 701 has a ruler 716 with a time signature, and a tempo indicator 708. The tempo may be set by a user, or may come from a master tempo. The master tempo may be determined, e.g., by the most recently received audio. GUI 701 may include a “fade out” time indicator 709 and a “fade out” button 717. If the “fade out” button 717 is selected, the previously recorded audio data are faded out.
GUI 701 may include a metronome “on/off” button 711, an “ahead of time” button 712, and an “undo” button 713. A user may select these buttons for recording the audio while playing back existing audio in the loop, as discussed above. Selecting buttons on the GUI is known to one of ordinary skill in the art of audio processing. The “record” button 702 may be selected to start recording a new audio signal. For example, in response to a user's selection of the “undo” button 713, newly recorded audio data can be discarded from the “working undo” buffer 605, as described above with respect to
For example, in response to a user's selection of “fade out” button 717, the previously recorded audio that has been adjusted according to methods described above, is faded out using one of the techniques known to one of ordinary skill in the art of audio processing.
In one embodiment, GUI 701 includes a “group” button 719, to group the audio data together. The audio data of multiple audio signals selected to be in the same group are adjusted and mixed to be output from a corresponding main buffer, as described above.
In the foregoing specification, embodiments of the invention have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.
This application claims the benefit of prior U.S. Provisional Patent Application No. 61/156,128 entitled “Correlating Changes in Audio,” filed Feb. 27, 2009, which is hereby incorporated by reference.
Number | Date | Country
---|---|---
61156128 | Feb 2009 | US