This application claims the benefit of Australian Provisional Patent Application No. 2013903503 filed 12 Sep. 2013, which is incorporated herein by reference.
The present invention relates to the digital processing of signals from microphones or other such transducers, and in particular relates to a device and method for mapping a plurality of such signals to produce a multi-channel recording, such as a stereo recording, in a manner which is responsive to an orientation in which a device bearing the microphones is held or positioned by a user.
Recording of multi-channel audio is widely used, for example in music and video recordings, in order to retain spatial cues for subsequent listeners. For example, the multi-channel audio may be in stereo with left and right channels, or may have a greater number of channels such as a surround sound “5.1” multi-channel audio recording. Recording multi-channel audio requires that a plurality of microphones be positioned in a particular orientation relative to the audio source(s) being recorded. In the simplest case of two channels for a stereo recording, two microphones must be positioned laterally apart by a sufficient distance that the audio signal captured at each microphone, when played back through respective stereo speakers, will retain the spatial cues that allow a listener to perceive left/right directionality in the resulting audio playback. Similar microphone position requirements apply when capturing audio recordings with a greater number of channels, such as when including front/rear channels and/or above/below channels.
However, a large number of consumer devices now contain multiple microphones for taking an audio recording, often captured together with a video recording. Users of such devices can hold the device in any one of a number of orientations, as there is no single “correct way up” to use the device. Smart phones and point-and-shoot cameras are examples of such devices which can be held in any one of a number of orientations during audio recording. A user might choose to hold the device in a landscape orientation for some recordings, but in a portrait orientation for other recordings, or might even use both orientations within a single recording. When using a touchscreen device, a user may hold a smartphone in a first landscape orientation or in a second landscape orientation rotated 180 degrees from the first landscape orientation, for example depending on whether the user is right-handed or left-handed. Accordingly it is not possible to preconfigure such devices with knowledge of the relative position of each microphone to the audio source being recorded. That is, it is not possible to “hard wire” one microphone to be connected to a left recording channel, for example, because in use the device might be rotated so that that microphone is in fact capturing right side audio when the device is in a reversed landscape orientation, or is capturing top-centre audio when the device is in a portrait orientation.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
In this specification, a statement that an element may be “at least one of” a list of options is to be understood to mean that the element may be any one of the listed options, or may be any combination of two or more of the listed options.
According to a first aspect the present invention provides a method of adaptively mapping a plurality of microphone signals to a multi-channel audio signal, the method comprising:
obtaining at least first and second microphone signals from respective first and second spaced apart microphones;
obtaining a device orientation signal from a device orientation sensor;
adaptively mapping the first and second microphone signals to produce a first audio signal channel of a multi-channel audio signal, based on the device orientation signal; and
adaptively mapping the first and second microphone signals to produce a second audio signal channel of a multi-channel audio signal, based on the device orientation signal.
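By way of non-limiting illustration only, the steps of the first aspect may be sketched as follows. The sketch assumes that each output channel is formed as a weighted sum of the two microphone signals, with the weights selected by the device orientation signal. The orientation labels, the table name `CHANNEL_GAINS`, the function name `map_to_stereo` and the gain values are hypothetical and are not taken from this specification.

```python
# Hypothetical orientation labels and gain values, chosen for illustration only.
# Each entry holds (gains for the left channel, gains for the right channel),
# and each gain pair weights (first microphone, second microphone).
CHANNEL_GAINS = {
    "landscape": ((1.0, 0.0), (0.0, 1.0)),
    "landscape_reversed": ((0.0, 1.0), (1.0, 0.0)),
}


def map_to_stereo(mic1, mic2, orientation):
    """Return (left, right) sample lists for the given orientation label."""
    left_gains, right_gains = CHANNEL_GAINS[orientation]
    # Both output channels are derived from both microphone signals;
    # only the orientation-dependent weights differ.
    left = [left_gains[0] * a + left_gains[1] * b for a, b in zip(mic1, mic2)]
    right = [right_gains[0] * a + right_gains[1] * b for a, b in zip(mic1, mic2)]
    return left, right
```

With these assumed gains, reversing the landscape orientation simply swaps which microphone feeds which channel. Intermediate gain values could equally be used where a blend of the two microphones is wanted.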
The device orientation sensor may in some embodiments be onboard the device bearing the microphones, and may in some embodiments comprise a gyroscopic motion sensor or micro electro-mechanical system (MEMS) accelerometer.
The microphone signals may in some embodiments be obtained directly from the microphones, or may in some embodiments be obtained indirectly via an intermediate signal path such as via a digital signal processor and/or a digital signal storage device. When the microphone signals are obtained indirectly, a device orientation signal obtained contemporaneously with the microphone signals is preferably similarly stored and/or processed in order to provide a temporally appropriate device orientation signal.
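As one possible illustration of providing a temporally appropriate device orientation signal for stored microphone signals, the sketch below assumes that orientation readings were stored alongside the audio as a time-sorted list of timestamps with one orientation label per timestamp. The storage layout, the function name `orientation_at` and the labels are assumptions made for the purpose of the example.

```python
from bisect import bisect_right


def orientation_at(timestamps, orientations, t):
    """Return the stored orientation in effect at audio time t (seconds).

    timestamps must be sorted in ascending order, one per orientation entry.
    Times before the first reading fall back to the first reading.
    """
    if not timestamps or len(timestamps) != len(orientations):
        raise ValueError("need one timestamp per stored orientation reading")
    # Index of the last reading taken at or before time t.
    index = bisect_right(timestamps, t) - 1
    return orientations[max(index, 0)]
```

For example, with readings `[0.0, 2.5]` labelled `["landscape", "portrait"]`, audio at 1.0 s would be processed with the landscape mapping and audio at 3.0 s with the portrait mapping.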
The method may be performed substantially at the time that the microphone signals are sensed, and the multi-channel audio signal may be output to a recording medium, so as to record a multi-channel audio signal from the microphones. Alternatively, the method may be performed at a time after the microphone signals are sensed, such as by being performed upon a stored copy of the microphone signals, for example in order to produce a multi-channel audio signal at a later time such as at a time of signal playback.
In some embodiments of the invention, the adaptive mapping is based upon information or parameters which reflect the position of the microphones upon the specific device being used. The information or parameters which reflect the position of the microphones may in some embodiments specify or reflect a spacing between respective microphones of the device. For example, in the case of a smartphone, a lateral spacing between microphones in a landscape orientation may be about 122 mm whereas a lateral spacing between microphones in a portrait orientation may be about 64 mm. Such physical parameters are fixed at a time of product design and manufacture, and may therefore be known in advance and provided for use by software implementing the present invention. It is to be understood that a lateral spacing between microphones defines the path length difference in signals arriving at each microphone, and that providing knowledge of the microphone spacing to the adaptive mapping process thus permits the adaptive mapping to operate in a manner appropriate for the microphone spacing being experienced as a result of the orientation in which the user chooses to hold the device. For example, in some embodiments a “stereo widening” process may be applied more aggressively when the device is in a portrait orientation in order to improve stereo effects which may otherwise be captured less effectively by closer-spaced microphones. Orientation of the device may similarly adaptively control parameters that control the operation of any spatial processing algorithm, such as beamforming or adaptive noise cancellation, to provide the most natural reproduction of the captured environment.
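The relationship between microphone spacing and inter-microphone delay, and an orientation-dependent stereo widening, may be sketched as follows. This is an illustrative sketch only. It assumes a speed of sound in air of approximately 343 m/s, a mid/side formulation of stereo widening (which this specification does not prescribe), and hypothetical width factors in `WIDTH_BY_ORIENTATION`. All function and table names are likewise hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at about 20 degrees C

# Hypothetical width factors: 1.0 leaves the stereo image unchanged,
# larger values widen it. Portrait is widened more because the
# microphones are laterally closer together in that orientation.
WIDTH_BY_ORIENTATION = {"landscape": 1.0, "portrait": 1.5}


def max_inter_mic_delay_samples(spacing_m, sample_rate_hz):
    """Largest arrival-time difference, in samples, for sound travelling
    along the line joining two microphones spacing_m metres apart."""
    return spacing_m / SPEED_OF_SOUND_M_PER_S * sample_rate_hz


def widen(left, right, width):
    """Mid/side widening: scale the side (difference) signal by width."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r)
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right
```

Under these assumptions, at a 48 kHz sample rate the 122 mm landscape spacing corresponds to a maximum delay of roughly 17 samples and the 64 mm portrait spacing to roughly 9 samples, which illustrates why the closer-spaced pair captures weaker inter-channel differences.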
In some embodiments, the information or parameters which reflect the position of the microphones may additionally or alternatively specify or reflect a position of one or more microphones upon the device. For example, a smartphone bearing three microphones may have first and second microphones near adjacent corners of the device at each end of a first short side of the device, and a third microphone substantially in the middle of a second short side of the device at an opposite end to the first short side. When the device orientation sensor indicates that the device is in a landscape mode, either or both of the first and second microphones may be adaptively mapped to one of a left-side and right-side stereo audio channel, with the third microphone being mapped to the other of the left-side and right-side stereo audio channel. However, when the device orientation sensor indicates that the device is in a portrait orientation, the information or parameters which reflect the position of the microphones will indicate that the third microphone is in a central position and of less value for capturing stereo than the first and second microphones, so that the adaptive mapping process can optimise stereo audio capture by adaptively mapping the first microphone to one of a left-side and right-side stereo audio channel and adaptively mapping the second microphone to the other of the left-side and right-side stereo audio channel.
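The three-microphone example above may be sketched as follows, purely for illustration. The sketch assumes that, in landscape, the first and second microphones are averaged to form one channel, and it assumes a particular assignment of left and right for each orientation label. In practice that assignment depends on the device layout and direction of rotation. The function name `map_three_mics` and the orientation labels are hypothetical.

```python
def map_three_mics(mic1, mic2, mic3, orientation):
    """Return (left, right) sample lists for a three-microphone device.

    mic1 and mic2 sit at the two corners of one short side;
    mic3 sits in the middle of the opposite short side.
    """
    if orientation in ("landscape", "landscape_reversed"):
        # mic1 and mic2 are on the same lateral side, so combine them.
        same_side = [0.5 * (a + b) for a, b in zip(mic1, mic2)]
        if orientation == "landscape":
            return same_side, list(mic3)
        return list(mic3), same_side
    if orientation == "portrait":
        # mic3 is laterally central and so is not used for stereo here.
        return list(mic1), list(mic2)
    raise ValueError(f"unknown orientation: {orientation!r}")
```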
In some embodiments, the information or parameters which reflect the position of the microphones may additionally or alternatively specify or indicate a surface within which the respective microphone is positioned, as a direction of sensitivity of the microphone may for example differ by 90 degrees or 180 degrees from other microphones and may be taken into account by the adaptive mapping process. It is noted that in many devices the microphone itself may be an omnidirectional microphone and that a direction of sensitivity of the microphone may be defined primarily by an associated port in the body of the device and in particular whether the port is in the front, side or back of the device. It is further noted that in such configurations the microphone may be substantially equally sensitive throughout a wide range of angles of arrival of sound, such as 180 degrees or more, and that in such embodiments the nominal direction of sensitivity of the microphone as used herein relates to a centre-point of such a range of arrival angles. In such embodiments, where the direction of sensitivity of a first microphone differs by 180 degrees from a second microphone, a spacing between the first and second microphones may be defined in the information or parameters which reflect the position of the microphones, and may in turn be used by the adaptive mapping process to produce signal channels which convey some sense of front/rear directionality and/or to adaptively control a beam steering algorithm, directional microphone or other noise reduction scheme. The spacing between the first and second microphones in such embodiments may be about the same as the thickness of the device, or may differ from the thickness of the device depending on the acoustic path around the device between the first and second microphones. 
In addition to, or alternatively to, the spacing between the first and second microphones, the information or parameters which reflect the position of the microphones may specify or indicate a predetermined relative time-of-arrival or inter-microphone acoustic delay for one or more device orientations.
Moreover, some such embodiments may define one of the first and second microphones as being a primary microphone based on which microphone is orientated toward the field of view. Where the device has both forward looking and rearward looking (user-facing) cameras, a parameter indicating which camera is recording at the time is preferably utilised by the adaptive mapping process in order to define that the primary microphone of a front/rear microphone pair is whichever microphone is orientated in the same direction as the camera in use.
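A minimal sketch of the primary microphone selection for a front/rear microphone pair is given below for illustration. It assumes the camera-in-use parameter takes one of two hypothetical string values, `"forward"` or `"rearward"`, and that each microphone of the pair is identified by the direction in which it is orientated. The function name is likewise hypothetical.

```python
def order_front_rear_pair(forward_mic, rearward_mic, active_camera):
    """Return (primary, secondary) for a front/rear microphone pair.

    The primary microphone is the one orientated in the same direction
    as the camera currently recording.
    """
    if active_camera == "forward":
        return forward_mic, rearward_mic
    if active_camera == "rearward":
        return rearward_mic, forward_mic
    raise ValueError(f"unknown camera: {active_camera!r}")
```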
The adaptive mapping of the present invention may in some embodiments be performed once only for each recording, in order to define a fixed microphone mapping for that recording based on the device orientation at or prior to the commencement of the recording. Such embodiments assume that the device is likely to be held in a single orientation for the entire duration of the recording.
Alternatively, the adaptive mapping of the present invention may in some embodiments be performed repeatedly or continuously throughout a recording, to permit the microphone mapping to change within the recording should the device orientation change. In such embodiments, changes in microphone mapping are preferably smoothed over a suitable transition period in order to avoid inappropriate listener perceptions which may arise from step changes or rapid changes in microphone mapping.
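One way of smoothing a change in microphone mapping is sketched below, for illustration only. It assumes the mapping is represented as a small matrix of gains (one row per output channel, one column per microphone) and that a linear ramp between the old and new gains is acceptable. Neither the representation nor the ramp shape is specified herein, and the function name `crossfade_gains` is hypothetical.

```python
def crossfade_gains(old_gains, new_gains, n_steps):
    """Return n_steps gain matrices moving linearly from old to new.

    The final matrix equals new_gains, so the mapping never jumps
    abruptly from one configuration to the other.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    steps = []
    for k in range(1, n_steps + 1):
        t = k / n_steps  # fraction of the transition completed
        steps.append(tuple(
            tuple((1.0 - t) * o + t * n for o, n in zip(old_row, new_row))
            for old_row, new_row in zip(old_gains, new_gains)
        ))
    return steps
```

Each returned matrix would be applied to successive samples or blocks of samples, so that a longer transition period corresponds to a larger `n_steps`.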
According to a second aspect the present invention provides a device configured to adaptively map a plurality of microphone signals to a multi-channel audio signal, the device comprising:
first and second spaced apart microphones for sensing sounds and producing respective first and second microphone signals;
a device orientation sensor for producing a device orientation signal;
an audio signal processor for adaptively mapping the first and second microphone signals to produce a first audio signal channel of a multi-channel audio signal, based on the device orientation signal, and for adaptively mapping the first and second microphone signals to produce a second audio signal channel of a multi-channel audio signal, based on the device orientation signal.
In some embodiments of the second aspect of the invention, the audio signal processor is a dedicated audio processing chip, or audio hub, separate to a general processor controlling other functions of the device. Such embodiments are advantageous in removing audio signal processing overhead from a main device processor.
According to another aspect the present invention provides computer software for carrying out the method of the first aspect.
According to another aspect the present invention provides a computer program product comprising computer program code means to make a computer execute a procedure for adaptively mapping a plurality of microphone signals to a multi-channel audio signal, the computer program product comprising computer program code means for carrying out the method of the first aspect.
Where the device provides functions other than multi-channel audio capture, such as a smartphone which is also able to provide mono-channel telephony for example, the present invention may be applied alongside such other functions in order to provide multi-channel microphone mapping in relation to those particular device functions which require retention of spatial cues in the audio signal being processed.
Where the device is equipped with multi-microphone processing, such as being equipped with a beamforming function or adaptive noise cancelling function, the adaptive mapping may in some embodiments be configured to hierarchically order the device's microphones depending on the device's orientation to produce appropriate inputs for the multi-microphone processing. Alternatively, if the device is not equipped with multi-microphone processing and a simple stereo output is required, the adaptive mapping may in some embodiments be configured to select, out of all of the device's microphone signals, the two signals which maximise spatial cues.
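The selection of two microphone signals for a simple stereo output may be sketched as follows. This sketch assumes that maximising spatial cues can be approximated by choosing the pair of microphones with the greatest lateral separation in the current orientation. It further assumes that microphone positions are known as hypothetical (x, y) coordinates in millimetres in the device frame, with x along the short side and y along the long side. The names and coordinate convention are assumptions for illustration.

```python
from itertools import combinations


def select_stereo_pair(mic_positions, orientation):
    """Return (left_name, right_name) of the most laterally separated pair.

    mic_positions maps a microphone name to (x_mm, y_mm) in the device frame.
    In portrait the lateral axis is x; in landscape it is y. Which end of
    the axis counts as "left" is an assumption of this sketch.
    """
    axis = {"portrait": 0, "landscape": 1}[orientation]

    def lateral(name):
        return mic_positions[name][axis]

    # Sorting the names makes the choice deterministic when pairs tie.
    a, b = max(
        combinations(sorted(mic_positions), 2),
        key=lambda pair: abs(lateral(pair[0]) - lateral(pair[1])),
    )
    return (a, b) if lateral(a) <= lateral(b) else (b, a)
```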
The device may in some embodiments comprise 2 microphones, 3 microphones, or 4 microphones, or indeed any practical number of microphones. The microphones may in some embodiments be located in any suitable positions on the device.
An example of the invention will now be described with reference to the accompanying drawings, in which:
[left, right primary, right auxiliary]=F(mic 1, mic 2, mic 3, DO),
where DO is ‘device orientation’.
Similarly, in other embodiments relating to a 4-microphone device, the Microphone Mapping Block function may be expressed as follows:
[left primary, left auxiliary, right primary, right auxiliary]=F(mic 1, mic 2, mic 3, mic 4, DO).
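For illustration only, a mapping function F of the above form for a 4-microphone device could be realised as a lookup table keyed by device orientation. The role assignments in `ROLE_TABLE`, the orientation labels and the function name `map_four_mics` are hypothetical. The actual assignment for a given device would follow from its microphone positions.

```python
# Hypothetical role assignments per device orientation (DO).
ROLE_TABLE = {
    "landscape": {
        "left_primary": "mic1", "left_auxiliary": "mic2",
        "right_primary": "mic3", "right_auxiliary": "mic4",
    },
    "landscape_reversed": {
        "left_primary": "mic3", "left_auxiliary": "mic4",
        "right_primary": "mic1", "right_auxiliary": "mic2",
    },
}


def map_four_mics(mic_signals, device_orientation):
    """Return a dict of role name -> signal, i.e. F(mic 1..4, DO).

    mic_signals maps "mic1".."mic4" to their sample sequences.
    """
    roles = ROLE_TABLE[device_orientation]
    return {role: mic_signals[mic] for role, mic in roles.items()}
```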
The Multi-Microphone Processing Block 206 produces two output signals: left and right, which may undergo noise reduction in the Noise Reduction Block 208. Often, in the landscape orientation mode, modern devices have sufficient left and right microphone separation that no further enhancement of the spatial cue is required. Therefore in this embodiment in the landscape orientation mode 110 the Stereo Enhancement Block 210 is bypassed by sending it a ‘disable’ signal 212. In contrast, in
In another embodiment,
The Multi-Microphone Mapping Block 302 produces two output signals: left and right. In the landscape mode 110 reflected in
On the other hand, when device 100 is held in portrait orientation mode 120, mic 2 is mapped to the “left” signal and mic 3 is mapped to the “right” signal by block 302 as shown in
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2013903503 | Sep 2013 | AU | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/AU2014/000890 | 9/10/2014 | WO | 00 |