The disclosure herein relates to audio signal processing methods and systems, and in particular to recording musical instruments using a microphone array included in a portable electronic device.
Audio sources such as musical instruments are sometimes recorded in a professional studio where a sound engineer has access to a range of microphones. These microphones typically have specific characteristics that make them suitable for different applications (e.g., recording different types of instruments). Depending on the type of instrument being recorded, the engineer may select a microphone with an appropriate directivity pattern and may position the microphone at a particular point in space to capture the desired sound characteristics of the instrument. (This technique may be referred to in this disclosure as a “close-mic technique”.) The engineer may also use a combination of two or more microphones to create a sensation of enhanced spatial width. (This technique may be referred to in this disclosure as a “stereo-mic technique”.)
The aspects herein are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” aspect in this disclosure are not necessarily to the same aspect, and they mean at least one. Also, in the interest of conciseness and reducing the total number of figures, a given figure may be used to illustrate the features of more than one aspect, and not all elements in the figure may be required for a given aspect.
Several aspects of the invention with reference to the appended drawings are now explained. Whenever aspects are not explicitly defined, the scope of the invention is not limited only to the parts shown, which are meant merely for the purpose of illustration. Also, while numerous details are set forth, it is understood that some aspects of the invention may be practiced without these details. In other instances, well-known circuits, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Generally, an aspect herein aims to use an array of microphones mounted on a portable electronic device (e.g., mobile phone or a tablet computer) to emulate the techniques used in a professional recording studio. The raw signals from the array of microphones are combined to define acoustic pickup beams that emulate varying directivity patterns (similar to patterns of professional recording microphones) and that have different look-directions (similar to angles of professional recording microphones). Various professional recording microphones may be emulated by the single microphone array based on the type of audio source to be recorded. Articulation by the musician of an instrument and genre of the music to be recorded may also be considered. These factors (e.g., type of audio source, articulation, genre) may be determined by the portable device by analyzing audio signals from the microphones and/or by using sensors (e.g., camera), or may be input by a user.
In one aspect, an interface is provided to instruct a user to place the device in a particular position to record the audio source, such that the user does not need the expertise of a sound engineer in order to simulate a professional studio environment. The interface may also be configured to receive input from a user, such that it is interactive. The interface may therefore be manual, automated, or semi-automated. The interface may provide instructions and feedback to the user by overlaying positioning instructions on top of a video feed of the instrument in an augmented-reality fashion, by using haptic feedback, or by using audio feedback.
Microphones 1 (or individually, microphones 1a, 1b) may be integrated within the housing 25a of the device 100, and may have a fixed geometrical relationship to each other. In the example depicted in
In one aspect, beamforming may also be applied to the microphone signals. The signals from the microphones 1 are digitized, and made available simultaneously or in parallel in time to a digital processor (e.g., processor 802 of
In particular, the signals from the microphones on the phone can be combined to yield beamformers, emulating varying directivity patterns (similar to the desired patterns of professional recording microphones) and, depending on their arrangement, with different look-directions (similar to the angles of professional recording microphones). Thus, coordination and design of the beams may include shaping the beams and directing the beams to pick up a desired audio source (e.g., musical instrument or voice) for recording. In one aspect, a subset of the microphones used to produce the beam is also identified or assigned.
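As an illustration of how microphone signals may be combined into a beam with a chosen look-direction, the following is a minimal sketch of a delay-and-sum beamformer response for a linear array. The disclosure does not name a specific beamforming algorithm, so delay-and-sum is used here only as a common, simple stand-in. The function names, the far-field plane-wave model, and the speed-of-sound constant are assumptions made for this example.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumed value)


def arrival_delays(mic_positions_m, angle_deg):
    """Far-field plane-wave delays (seconds) along a linear array.

    angle_deg is measured from the array axis (0 = end-fire, 90 = broadside).
    """
    cos_a = math.cos(math.radians(angle_deg))
    return [x * cos_a / SPEED_OF_SOUND_M_S for x in mic_positions_m]


def beam_response(mic_positions_m, look_deg, source_deg, freq_hz):
    """Normalized magnitude response of a delay-and-sum beam.

    The beam is steered toward look_deg; the plane wave arrives from source_deg.
    A value of 1.0 means the microphone signals add fully in phase.
    """
    steer = arrival_delays(mic_positions_m, look_deg)
    arrive = arrival_delays(mic_positions_m, source_deg)
    total = sum(
        cmath.exp(-2j * math.pi * freq_hz * (t_arrive - t_steer))
        for t_arrive, t_steer in zip(arrive, steer)
    )
    return abs(total) / len(mic_positions_m)


if __name__ == "__main__":
    mics = [0.0, 0.1715]  # two microphones, half a wavelength apart at 1 kHz
    for angle in (0, 45, 90, 135, 180):
        print(angle, round(beam_response(mics, 0.0, angle, 1000.0), 3))
```

With the two hypothetical microphones spaced half a wavelength apart and the beam steered along the array axis, the response is 1.0 in the look-direction and falls to a null at broadside, which is one way a single array can approximate a directional pickup pattern.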
The configuration of the beams may be based on a number of factors including the type of instrument to be recorded (e.g., guitar, clarinet, piano, etc.). In one aspect, the type of instrument being recorded may be determined using a sensor (e.g., camera). In one aspect, the type of instrument being recorded may be input by a user. Playing style or articulation by the musician of the instrument may also be considered when configuring the beams. For example, the music being recorded may be analyzed by the processor to determine whether a transition or continuity on a single sound or between multiple sounds in the music being recorded is short, long, loud, soft, etc. Genre of the music to be recorded may also be considered.
Other factors in configuration of the beams may include the sensitivities and directivities of the microphones, the positions of the microphones, the geometrical relationship between the microphones, the location of the audio source (e.g., musical instrument) relative to the positions of the microphones, the direction of the audio signal from the audio source relative to the position of the microphones, and the shape of the housing of the portable device. One or more sensors (e.g., camera) may be included in the device 100 in order to determine the position of the device 100 relative to the instrument being recorded. In one aspect, these factors are also analyzed in order to determine which microphones should be assigned to produce a beam to pick up the audio signals from the audio source.
Three example beams are depicted in
The portable device may therefore perform beamforming to produce the appropriate beams for recording a musical instrument by coordinating one or more of the following parameters as instructed by the beam analyzer: a shape of the beam (pattern), a general direction of the beam (look-direction), and which microphones in the microphone array will be assigned to produce the beam.
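The three coordinated parameters named above (pattern, look-direction, and assigned microphones) can be pictured as a small record. The sketch below is purely illustrative: the `BeamSpec` type, the `make_beam` helper, the two-dimensional coordinates, and the nearest-microphone heuristic are all assumptions of this example and are not taken from the disclosure, which does not specify how the microphone subset is chosen.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BeamSpec:
    """Hypothetical record of the three beam parameters."""
    pattern: str                # e.g. "cardioid"
    look_direction_deg: float   # direction from the array center to the source
    mic_indices: tuple          # which microphones are assigned to the beam


def make_beam(mic_positions_m, source_position_m, pattern, mic_count):
    """Build a BeamSpec aimed at the source.

    Assumed heuristic: assign the mic_count microphones closest to the source.
    """
    cx = sum(x for x, _ in mic_positions_m) / len(mic_positions_m)
    cy = sum(y for _, y in mic_positions_m) / len(mic_positions_m)
    look = math.degrees(
        math.atan2(source_position_m[1] - cy, source_position_m[0] - cx)
    )
    ranked = sorted(
        range(len(mic_positions_m)),
        key=lambda i: math.dist(mic_positions_m[i], source_position_m),
    )
    return BeamSpec(pattern, look, tuple(sorted(ranked[:mic_count])))


if __name__ == "__main__":
    # Four microphones at the corners of a phone-sized rectangle (meters).
    mics = [(0.0, 0.0), (0.07, 0.0), (0.0, 0.15), (0.07, 0.15)]
    print(make_beam(mics, (0.035, 1.0), "cardioid", 2))
```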
Turning to
Portable device 200 is communicatively coupled to processing device 220, either wirelessly or via a wire. Processing device 220 may perform some or all of the processing for generation of the virtual studio microphone (using the microphone signals to produce the acoustic pick up beams) and for generation of the interface (to instruct a user on device placement), based on factors that may be sensed by the device or input by the user. In contrast, in the aspect of
Before turning to these figures,
In contrast to close-mic techniques, in some situations the sound engineer may want to record an instrument together with the ambience (reverberation) of the room. This is particularly desirable for string and woodwind instruments, which benefit from room reverberation. In such situations, it is often desirable to conduct a stereo recording, which captures the spaciousness of the acoustic environment.
An array of microphones on a consumer electronics device, such as a phone or a tablet, facilitates a means for various spatial signal processing algorithms. In turn, using such algorithms it is possible to emulate the microphone techniques depicted in
Turning to
In the aspect of
At block 702, using the input received at block 701 as factors, the portable device determines a microphone configuration (directivity pattern, look-direction and equalization for each of the plurality of microphones) for the particular circumstances of the recording, based on pre-designed presets, and causes the microphones to emulate the determined microphone configuration. These presets may correspond to the type or family of the instrument being recorded, and/or the playing style/articulation as well as musical genre, as received at block 701. In one aspect, a user may input preference with respect to the parameters of the microphone configuration. The microphone configuration is emulated as described above, by accessing and combining one or more signals from the microphones of the array to produce acoustic pickup beams.
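The preset selection at block 702 can be sketched as a table lookup with fallbacks. The preset table, its keys, and every value in it are placeholders invented for this example; the disclosure does not list actual presets. Genre is omitted here for brevity but could be added to the lookup key in the same way.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MicConfig:
    """Hypothetical microphone configuration: pattern, look-direction, EQ."""
    pattern: str
    look_direction_deg: float
    equalization: str


# Placeholder presets keyed by (instrument, articulation); values are illustrative only.
PRESETS = {
    ("guitar", None): MicConfig("cardioid", 0.0, "flat"),
    ("guitar", "fingerpicked"): MicConfig("cardioid", 0.0, "presence_boost"),
    ("string quartet", None): MicConfig("stereo_pair", 0.0, "flat"),
}
DEFAULT_CONFIG = MicConfig("omnidirectional", 0.0, "flat")


def choose_config(instrument, articulation=None, user_pattern=None):
    """Pick the most specific preset, then apply an optional user preference."""
    config = (
        PRESETS.get((instrument, articulation))
        or PRESETS.get((instrument, None))
        or DEFAULT_CONFIG
    )
    if user_pattern is not None:
        config = replace(config, pattern=user_pattern)
    return config


if __name__ == "__main__":
    print(choose_config("guitar", "fingerpicked"))
    print(choose_config("guitar", user_pattern="figure_eight"))
```

The fallback order (instrument plus articulation, then instrument alone, then a default) is one possible design choice; it lets a preset table stay small while still honoring more specific input when it is available.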
At block 703, the portable device determines whether there is an error in its position. The portable device detects an error in its position based on, for example: an amplitude of one or more of the microphone signals; a signal to noise ratio measurement of one or more of the microphone signals; a direction of arrival (DoA) estimation of one or more of the microphone signals; and a left-right balance of one or more of the microphone signals. In one aspect, an amount of reverberation is considered when detecting the error. For example, the portable device may measure the reverberance of the recording space (e.g., room in which the audio source is located). In one aspect, the reverberant portion of an audio signal from the microphones is estimated. In one aspect, the reverberance may be characterized. For example, the reverberance may be characterized as diffusive (e.g., surrounding the portable device) or directional (e.g., from a specific location). The reverberance may also be generally characterized as “desired” or “undesired” based on the circumstances of the recording scenario (input at block 701).
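A rough sketch of the position check at block 703 follows, covering three of the listed cues (amplitude, signal to noise ratio, and left-right balance). Direction of arrival and reverberance are left out to keep the example short. The thresholds, the issue labels, and the assumption that a separate noise-only segment is available are all hypothetical.

```python
import math


def level_db(samples):
    """RMS level in dB relative to full scale (1.0), floored to avoid log(0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))


def position_issues(left, right, noise,
                    min_level_db=-40.0, min_snr_db=20.0, max_balance_db=3.0):
    """Return a list of detected issues; an empty list means no error found.

    left/right: samples from a left and a right beam or microphone.
    noise: samples captured while the source is silent (assumed available).
    All thresholds are illustrative placeholders.
    """
    left_db, right_db = level_db(left), level_db(right)
    signal_db = (left_db + right_db) / 2.0
    issues = []
    if signal_db < min_level_db:
        issues.append("too_quiet")
    if signal_db - level_db(noise) < min_snr_db:
        issues.append("low_snr")
    balance_db = left_db - right_db
    if balance_db > max_balance_db:
        issues.append("louder_on_left")
    elif balance_db < -max_balance_db:
        issues.append("louder_on_right")
    return issues


if __name__ == "__main__":
    loud = [0.5, -0.5] * 100
    soft = [0.1, -0.1] * 100
    hiss = [0.001, -0.001] * 100
    print(position_issues(loud, soft, hiss))
```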
In one aspect, at block 704, based on the configuration determined at block 702 and the error detected at block 703, an interface is generated that advises a user on whether to reposition the device and how to reposition the device, if needed. The interface is displayed on the display of the portable device. One or more sensors may be included in the portable device to provide information regarding the current position of the portable device relative to the audio source. The interface may be interactive, such that an interactive recording mode is provided where the interface aids the user in recording one or more musical instruments. In one aspect, the instructions provided by the interface are updated based on the current position of the portable device as the user is moving it. The interface may advise the user to reorient the portable device (e.g., portrait or landscape) and on how to angle the device (e.g., using arrows or text indicating a number of degrees). The instructions may advise the user to move the portable device closer or farther from the audio source, or to the left or right of the audio source, among other things. When the user reaches a position relative to the audio source that has been determined to be advantageous for recording the audio source (based on one or more factors of the circumstances of the recording scenario), the interface may advise the user to stop.
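The repositioning advice at block 704 can be sketched as a function that turns the current and target placement into short text instructions, and reports a stop message once the device is within tolerance. The function name, the tolerances, the message wording, and the convention that a positive bearing means the source lies to the right of the direction the device is facing are assumptions of this example.

```python
def placement_instructions(distance_m, target_distance_m, bearing_deg,
                           distance_tol_m=0.1, bearing_tol_deg=5.0):
    """Return text instructions for moving the device toward the target placement.

    distance_m: current distance from device to source.
    target_distance_m: distance judged advantageous for this recording.
    bearing_deg: angle of the source relative to the device's facing direction
                 (positive = source is to the right; assumed convention).
    """
    steps = []
    offset = distance_m - target_distance_m
    if offset > distance_tol_m:
        steps.append(f"Move {offset:.1f} m closer to the source")
    elif offset < -distance_tol_m:
        steps.append(f"Move {-offset:.1f} m farther from the source")
    if bearing_deg > bearing_tol_deg:
        steps.append(f"Turn {bearing_deg:.0f} degrees to the right")
    elif bearing_deg < -bearing_tol_deg:
        steps.append(f"Turn {-bearing_deg:.0f} degrees to the left")
    return steps or ["Stop: hold this position"]


if __name__ == "__main__":
    for line in placement_instructions(1.5, 1.0, 12.0):
        print(line)
```

Calling the function repeatedly with fresh sensor readings would give the updating behavior described above, with the instructions changing as the user moves the device.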
As one example, with respect to reverberance, if the audio source to be recorded is a string quartet, the portable device may determine that the reverberant portion of the audio signal is desired, and may instruct the user to move the device further from the audio source in order to increase the reverberance. If the audio source is a single musical instrument such as a saxophone, the portable device may determine that the reverberance is highly directional and may produce a beam in the direction of the audio source.
In one aspect, the interface is configured to accept updated input from the user, e.g., updating the input discussed in connection with block 701 and the microphone configuration preferences.
At block 705, the repositioned portable device records the musical instrument using the microphone configuration determined at block 702.
Although the foregoing descriptions discuss recording a single musical instrument, it will be appreciated that the aspects described herein may be applied to recording multiple musical instruments. One such example is recording a string quartet. If input is received that the instrument to be recorded is a string quartet, the portable device may determine that reverberation is desirable and may therefore instruct the user to position the portable device at some distance (e.g., 3 feet) from the string quartet, such that the microphones can be configured to do a stereo recording. On the other hand, one advantage of using the portable device described herein to record multiple audio sources is that multiple acoustic pickup beams may be produced to separately record each of the audio sources or groups of the audio sources. The separation may be based on sound source separation or beamforming. In the example of a string quartet, 4 beams may be produced by the portable device, one to record each instrument in the ensemble. The beams may have different directivity patterns and look-directions. Alternatively, 3 beams may be produced by the portable device, one for both violins, one for the viola and one for the cello. Other configurations are also possible. In this way, the audio data available for post-production (e.g., sound mixing) may be improved, since the portable device may pick up separate sound sources. In one aspect, these separate sound sources may be labeled for easy referencing and access by the user. The label may comprise the type of sound source, either as text or as an image.
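The four-beam and three-beam string quartet examples can be sketched as a grouping step that produces one labeled beam per group of sources. The `plan_beams` helper, the source angles, and the choice to aim a shared beam at the mean angle of its group are assumptions made for illustration; the disclosure does not state how a shared beam is aimed.

```python
def plan_beams(source_angles_deg, groups):
    """Return one labeled beam per group of sources.

    source_angles_deg: mapping of source name to its angle relative to the device.
    groups: list of lists of source names; each group shares one beam.
    Assumed heuristic: aim each beam at the mean angle of its group.
    """
    beams = []
    for group in groups:
        angles = [source_angles_deg[name] for name in group]
        beams.append({
            "label": " + ".join(group),
            "look_direction_deg": sum(angles) / len(angles),
        })
    return beams


if __name__ == "__main__":
    quartet = {"violin 1": -30.0, "violin 2": -10.0, "viola": 10.0, "cello": 30.0}
    # Four beams, one per instrument.
    print(plan_beams(quartet, [[name] for name in quartet]))
    # Three beams: both violins share one beam.
    print(plan_beams(quartet, [["violin 1", "violin 2"], ["viola"], ["cello"]]))
```

The text label on each beam corresponds to the labeling of separate sound sources mentioned above, which may make the resulting tracks easier to find in post-production.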
In contrast, when recording multiple sound sources in a typical professional recording studio environment, a sound engineer typically uses multiple unique microphones each having its own characteristics to record each of the different sound sources.
Thus, by virtue of the methods and arrangements described herein, it is possible to simulate the multiple unique microphones typically required in a professional studio. It is also possible to provide guidance and expertise on how to use and position the portable device to achieve the simulation. A professional recording studio may therefore be simulated without the expertise of a sound engineer and without expensive professional equipment.
The memory 806 has stored therein instructions that when executed by the processor 802 compute a configuration of the microphones, produce the acoustic pickup beams using the microphone signals, detect an error in the position of the microphones, provide an instruction on how to reposition the microphones, and record an instrument (as described above). The instructions that program the processor 802 to perform all of the processes described above are all referenced in
Similar to device 800, the components of device 900 may be integrated within a housing such as that of a mobile phone (e.g., see
The memory 906 has stored therein instructions that when executed by the processor 902 produce the acoustic pickup beams using the microphone signals (as described above). The instructions that program the processor 902 to perform the processes described above are all referenced in
Processing device 920 includes a processor 922, communications transmitter and receiver 924, and memory 926. The memory 926 has stored therein instructions that when executed by the processor 922 compute a configuration of the microphones, detect an error in the position of the microphones, provide an instruction on how to reposition the microphones, and cause the microphones 930 to record an instrument (as described above). The instructions that program the processor 922 to perform the processes described above are all referenced in
In other aspects, the instructions discussed above are performed by a combination of the portable device 900 and the processing device 920 working together. Thus, processing device 920 performs any one or more of the instructions discussed above and the remaining instructions are performed by the portable device 900.
Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as those set forth in the claims below, refer to the action and processes of an audio system, or similar electronic device, that manipulates and transforms data represented as physical (electronic) quantities within the system's registers and memories into other data similarly represented as physical quantities within the system memories or registers or other such information storage, transmission or display devices.
The processes and blocks described herein are not limited to the specific examples described and are not limited to the specific orders used as examples herein. Rather, any of the processing blocks may be re-ordered, combined or removed, performed in parallel or in serial, as necessary, to achieve the results set forth above. The processing blocks associated with implementing the audio system may be performed by one or more programmable processors executing one or more computer programs stored on a non-transitory computer readable storage medium to perform the functions of the system. All or part of the audio system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)). All or part of the audio system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate. Further, processes can be implemented in any combination of hardware devices and software components.
While certain aspects have been described and shown in the accompanying drawings, it is to be understood that such aspects are merely illustrative of and not restrictive on the broad invention, and the invention is not limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those of ordinary skill in the art. For example, it will be appreciated that features of the various aspects may be practiced in combination with features of other aspects. The description is thus to be regarded as illustrative instead of limiting.