The present application is a national phase entry under 35 U.S.C. § 371 of International Application No. PCT/AU2015/050492 filed Aug. 26, 2015, which claims priority from Australian Application No. 2014903381 filed Aug. 26, 2014, both of which are hereby incorporated herein by reference.
The invention relates to methods and systems for positioning and controlling sound images in three-dimensional space. The invention is generally applicable to the field of media production, including audio, video, film and multi-media production.
Media production involving the positioning and control of sound images in three-dimensional space is becoming increasingly sophisticated, involving a vast array of features and functions. However, the actual static and dynamic positioning and control of sound images in three-dimensional space is typically driven by a mouse, keyboard and other peripherals. Such traditional controllers provide an operator with only a limited sense of depth when positioning and controlling sound images in three-dimensional space.
Accordingly, with increasing functionality, particularly in complex and high-throughput situations, there is a continued need to provide improved systems and methods for positioning and controlling sound images in three-dimensional space.
It is an object of the present invention to substantially overcome or at least ameliorate one or more of the disadvantages of the prior art.
In an aspect, there is provided a system for positioning and controlling sound images in three-dimensional space. Such a system may comprise:
An alternate system according to this aspect may comprise:
In such an alternate system, there may be five (5) or more additional computer systems network connected to the master computer system by communication links.
In another aspect, there is provided a method for positioning sound images in three-dimensional space. Such a method may comprise:
Preferred embodiments of the invention will now be described with reference to the accompanying drawings wherein:
A preferred embodiment involves the use of a three-dimensional motion sensing input device, for example, the Leap Motion Controller (see https://www.leapmotion.com/, last accessed 21 Aug. 2015) or Microsoft's Kinect (see http://www.microsoft.com/en-us/kinectforwindows/, last accessed 21 Aug. 2015), as a controller connected to a computer system running a digital audio workstation application for positioning and controlling sound images in three-dimensional space.
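By way of a non-limiting illustration, the following sketch (in Python) shows one way in which palm-position data from such a controller could drive a three-dimensional pan position. The functions read_palm_position and set_pan_position, and the interaction-box bounds, are hypothetical placeholders rather than the API of any particular device or digital audio workstation.

```python
import time

# Minimal sketch only. read_palm_position() and set_pan_position() are
# hypothetical placeholders standing in for a device SDK and a DAW
# automation API; the interaction-box bounds are assumed values.

def normalize(value, lo, hi):
    """Clamp a raw sensor coordinate and scale it into -1.0..1.0."""
    value = max(lo, min(hi, value))
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def track_sound_image(read_palm_position, set_pan_position, rate_hz=60):
    """Poll the sensor and drive the DAW panner on three axes."""
    while True:
        x_mm, y_mm, z_mm = read_palm_position()  # millimetres, sensor frame
        set_pan_position(
            lr=normalize(x_mm, -200.0, 200.0),   # left/right
            fb=normalize(z_mm, -200.0, 200.0),   # front/back
            ud=normalize(y_mm, 50.0, 400.0),     # up/down (height above sensor)
        )
        time.sleep(1.0 / rate_hz)
```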
A computer system running a digital audio workstation application typically may comprise or be connected to a display. As seen in the example depicted in
Modifiers may be implemented, for example, via a keyboard, that modify the behaviour of the controller to provide other functions (a minimal code sketch follows this list), such as:
“Lock to POI”, restricting the movement to points of interest, for example to loudspeaker positions, or to specific planes;
“Rotate”, switching the sensing to control of the sound image rather than its position, enabling rotation of the sound image by responding to twisting motions of the hand;
“Tilt”, switching the sensing to control of the sound image rather than its position, enabling tilting of the sound image by responding to tilting motions of the hand;
“Spread”, allowing control of sound image size by responding to the hand/fingers opening and closing;
“Divergence”, allowing control of sound image spill by responding to the hand/fingers opening and closing.
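By way of a non-limiting illustration, the following sketch shows how such keyboard modifiers could route a frame of hand data to different built-in panner functions. The HandState fields and the methods on the daw object are hypothetical placeholders, not the API of any particular digital audio workstation.

```python
from dataclasses import dataclass

@dataclass
class HandState:
    position: tuple        # (x, y, z) palm position
    roll: float            # twist of the hand, radians
    pitch: float           # tilt of the hand, radians
    openness: float        # 0.0 = closed fist, 1.0 = fingers spread

def apply_gesture(daw, modifier, hand):
    """Route one frame of hand data to the DAW feature selected by the
    currently held keyboard modifier. `daw` is a hypothetical object
    exposing the built-in panner functions."""
    if modifier == "lock_to_poi":
        daw.snap_to_nearest_poi(hand.position)   # e.g. loudspeaker positions
    elif modifier == "rotate":
        daw.rotate_image(hand.roll)              # twisting motion rotates image
    elif modifier == "tilt":
        daw.tilt_image(hand.pitch)               # tilting motion tilts image
    elif modifier == "spread":
        daw.set_spread(hand.openness)            # sound image size
    elif modifier == "divergence":
        daw.set_divergence(hand.openness)        # sound image spill
    else:
        daw.move_image(hand.position)            # plain 3D positioning
```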
Referring to
The computer system 13 incorporates a digital audio workstation application 14, processor 15, memory 16 and a display 17. The processor 15 may execute instructions which are stored in memory 16 to provide audio-video output signals to the display 17 and to achieve other functionality. The digital audio workstation application 14 may take the form of an audio production software platform such as, for example, Fairlight Dream II, Nuendo or ProTools. The digital audio workstation application may comprise a gesture library 141 and one or more built-in features 142 for audio production, for example, a panning function. As seen in
The digital audio workstation application may include a gesture library 141, such as a collection of gesture filters, each having information concerning a gesture that may be performed (as the user moves). For example, a gesture filter can be provided for various hand gestures, such as swiping or flinging of the hands. By comparing a detected motion to each filter, a specified gesture or movement performed by a person may be identified, and the extent to which the movement is performed may also be determined. Gestures pre-defined in the gesture library 141 may thus be recognized from information captured by the three-dimensional motion sensing input device 11 and provided to the computer system 13 via the communication link 12, and used to control one or more built-in features 142 of the digital audio workstation application.
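A non-limiting sketch of such a gesture filter follows. The matching strategy (comparing overall direction and path length against a stored template) and the threshold value are assumptions for illustration; a production gesture library would typically use a more robust comparison.

```python
import math

# Minimal sketch only: a gesture "filter" compares a recorded motion
# trail against a template and reports whether the gesture matched and
# to what extent. Names and thresholds are illustrative assumptions.

class GestureFilter:
    def __init__(self, name, template, threshold=0.8):
        self.name = name
        self.template = template      # list of (x, y, z) samples
        self.threshold = threshold

    def match(self, trail):
        """Return (matched, extent). `extent` grows with how far the
        movement was carried through relative to the template."""
        if len(trail) < 2 or len(self.template) < 2:
            return False, 0.0
        similarity = _direction_similarity(trail, self.template)
        extent = _path_length(trail) / _path_length(self.template)
        return similarity >= self.threshold, min(extent, 1.0)

def _path_length(points):
    """Total length of the motion trail."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def _direction_similarity(a, b):
    """Cosine similarity of the overall direction of two trails."""
    va = [a[-1][i] - a[0][i] for i in range(3)]
    vb = [b[-1][i] - b[0][i] for i in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    na = math.sqrt(sum(x * x for x in va)) or 1.0
    nb = math.sqrt(sum(x * x for x in vb)) or 1.0
    return dot / (na * nb)
```

A gesture library in this sense could then hold a collection of such filters and report, for each captured trail, the best-matching gesture and its extent.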
Referring to
The three dimensional motion sensing input device of preferred embodiments of the present invention may be any conventional three dimensional motion sensing input device capable of detecting an intuitive or a predefined gesture of a user and of recognizing the gesture as being, for example, a selecting gesture, a grabbing gesture, a throwing gesture, or the like. Examples of suitable three dimensional motion sensing input devices currently available are the Microsoft Kinect and the Leap Motion devices.
In preferred embodiments, the three dimensional motion sensing input device may be incorporated into another network component, such as a mobile device or a personal computer, or may be a stand-alone device in the network, such as a wall-mounted, desktop or free-standing device. Additionally, the three dimensional motion sensing input device may be any suitable distance from, and may have any orientation to, a user, a user's gesture, or to any network component, including any virtual component or cloud resource. A suitable distance may include a small distance, such as millimetres, or a large distance, such as any distance over which the three dimensional motion sensing input device remains capable of accurately obtaining sufficient gesture information. A suitable orientation may include any orientation, such as an orthogonal orientation, a perpendicular orientation, an aerial orientation, or any other orientation.
In preferred embodiments, the three dimensional motion sensing input device may be configured to recognize a gesture that is a multi-part gesture or a gesture that is partially delayed in time. For example, the three dimensional motion sensing input device may recognize a grab and throw gesture even when the grab gesture is performed some time prior to, and separate in time from, the throw gesture.
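By way of illustration, the following sketch recognizes such a two-part grab-and-throw gesture even when the parts are separated in time. The event strings and the five-second window are assumptions made for the sketch, not values prescribed by the specification.

```python
import time

# Minimal sketch only: recognizes a two-part "grab ... throw" gesture
# whose parts may be separated in time. The 'grab'/'throw' event
# strings are assumed outputs of a lower-level recognizer.

class GrabThrowRecognizer:
    def __init__(self, max_gap_seconds=5.0):
        self.max_gap = max_gap_seconds
        self.grab_time = None

    def on_event(self, event, now=None):
        """Feed in 'grab' or 'throw' events; returns True when the
        complete multi-part gesture has been recognized."""
        now = time.monotonic() if now is None else now
        if event == "grab":
            self.grab_time = now          # first part: remember the grab
            return False
        if event == "throw" and self.grab_time is not None:
            if now - self.grab_time <= self.max_gap:
                self.grab_time = None     # second part completes the gesture
                return True
        return False
```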
In preferred embodiments, the system may be calibrated prior to use in order for the system to be capable of accurately detecting a particular gesture or a particular user. System calibration may also aid the system in extrapolating the orientation, distance and/or direction of network devices and components from one another and/or the user.
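One possible calibration approach, sketched below, fits an affine transform from sensor coordinates to room coordinates using a handful of reference points (for example, the user holding a hand at each loudspeaker position in turn). The least-squares affine fit is an illustrative assumption, not a method prescribed by the specification.

```python
import numpy as np

# Minimal sketch only: fit room = A @ sensor + b from paired reference
# points, so subsequent sensor readings can be mapped into room space.

def fit_calibration(sensor_pts, room_pts):
    """Least-squares fit of an affine transform (A, b)."""
    sensor = np.asarray(sensor_pts, dtype=float)          # shape (n, 3)
    room = np.asarray(room_pts, dtype=float)              # shape (n, 3)
    # Augment with a column of ones so the offset b is fitted as well.
    augmented = np.hstack([sensor, np.ones((len(sensor), 1))])
    coeffs, *_ = np.linalg.lstsq(augmented, room, rcond=None)
    A, b = coeffs[:3].T, coeffs[3]
    return A, b

def apply_calibration(A, b, point):
    """Map one sensor-space point into room space."""
    return A @ np.asarray(point, dtype=float) + b
```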
This specification is written to a person of ordinary skill in the art of media processing, computer architecture, and programming.
Unless specifically stated otherwise, throughout the specification discussions utilizing terms such as “processing”, “computing”, “calculating”, “determining” or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulates, and/or transforms data represented as physical, such as electronic, quantities into other data similarly represented as physical quantities.
In a similar manner, the term “processor” may refer to any device or portion of a device that processes electronic data, e.g., from registers and/or memory to transform that electronic data into other electronic data that, e.g., may be stored in registers and/or memory. A “computer” or a “computing machine” or a “computing platform” may include one or more processors.
Each processor may include one or more CPUs, a graphics processing unit, and a programmable DSP unit. The processing system further may include a memory subsystem including main RAM and/or a static RAM, and/or ROM. A bus subsystem may be included for communicating between the components. If the processing system requires a display, such a display may be included, e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT) display. If manual data entry is required, the processing system also includes an input device such as one or more of an alphanumeric input unit such as a keyboard, a pointing control device such as a mouse, and so forth. The term memory unit as used herein also encompasses a storage system such as a disk drive unit. The processing system in some configurations may include a sound output device and a network interface device. The memory subsystem thus includes a carrier medium that carries computer-readable instructions, e.g., software, for performing, when executed by the processing system, one or more of the methods described herein. Note that when the method includes several elements, e.g., several steps, no ordering of such elements is implied, unless specifically stated. The software may reside in the hard disk, or may also reside, completely or at least partially, within the RAM and/or within the processor during execution thereof by the computer system. Thus, the memory and the processor also constitute a carrier medium carrying computer-readable instructions.
Note that while some diagram(s) only show(s) a single processor and a single memory that carries the computer-readable instructions, those in the art will understand that many of the components described above are included, but not explicitly shown or described in order not to obscure the inventive aspect.
It will be understood that the steps of methods discussed are performed in one embodiment by an appropriate processor (or processors) of a processing (i.e., computer) system executing instructions (code segments) stored in storage. It will also be understood that the invention is not limited to any particular implementation or programming technique and that the invention may be implemented using any appropriate techniques for implementing the functionality described herein. The invention is not limited to any particular programming language or operating system.
Although preferred forms of the present invention have been described with particular reference to applications in relation to audio production, it will be apparent to persons skilled in the art that modifications can be made to the preferred embodiments described above or that the invention can be embodied in other forms and used in alternative applications.
Throughout this specification and the claims which follow, unless the context requires otherwise, the words “incorporate” and “comprise”, and variations such as “incorporates”, “incorporating”, “comprises” and “comprising”, will be understood to imply the inclusion of a stated integer or step or group of integers or steps, but not the exclusion of any other integer or step or group of integers or steps.
The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as, an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
| Number | Date | Country | Kind |
| --- | --- | --- | --- |
| 2014903381 | Aug 2014 | AU | national |

| Filing Document | Filing Date | Country | Kind |
| --- | --- | --- | --- |
| PCT/AU2015/050492 | 8/26/2015 | WO | 00 |

| Publishing Document | Publishing Date | Country | Kind |
| --- | --- | --- | --- |
| WO2016/029264 | 3/3/2016 | WO | A |
| Number | Name | Date | Kind |
| --- | --- | --- | --- |
| 8068105 | Classen | Nov 2011 | B1 |
| 8255069 | Evans et al. | Aug 2012 | B2 |
| 8448083 | Migos | May 2013 | B1 |
| 8638989 | Holz | Jan 2014 | B2 |
| 20090303231 | Robinet | Dec 2009 | A1 |
| 20130167026 | Shafer | Jun 2013 | A1 |
| 20140201666 | Bedikian et al. | Jul 2014 | A1 |
| 20140240231 | Minnen | Aug 2014 | A1 |
| 20140355789 | Bohrarper | Dec 2014 | A1 |
| 20150116200 | Kurosawa | Apr 2015 | A1 |
| 20150149929 | Shepherd | May 2015 | A1 |
| 20150179186 | Swierk | Jun 2015 | A1 |
| 20160357262 | Ansari | Dec 2016 | A1 |
| 20170161014 | Kikugawa | Jun 2017 | A1 |
| Number | Date | Country |
| --- | --- | --- |
| H0990963 | Apr 1997 | JP |
| Entry |
| --- |
| “3D Audio Workspace explained,” Resolution, vol. 13.7, Nov./Dec. 2014, ISSN 1477-4216, pp. 48-49. [retrieved from internet on Nov. 2 & 5, 2015] URL: http://www.resolutionmag.com/back-issues-content/81_content.pdf; URL: http://www.fairlight.com.au/wp-content/uploads/2014/06/3DAW-inResolution_web.pdf. |
| International Search Report for PCT/AU2015/050492, 7 pages, dated Nov. 10, 2015. |
| Written Opinion of the International Searching Authority, PCT/AU2015/050492, 6 pages, dated Nov. 15, 2015. |
| Leap Motion Controller, [retrieved from internet, Feb. 23, 2017], https://www.leapmotion.com/, 6 pages. |
| Microsoft's Kinect, [retrieved from internet, Feb. 23, 2017], http://www.microsoft.com/en-US/kinectforwindows/, 7 pages. |
| Churnside, et al., “Musical Movements—Gesture Based Audio Interfaces,” Audio Engineering Society Convention Paper 8496, Presented at the 131st Convention, Oct. 20-23, 2011, New York, NY, pp. 1-10. |
| Fan, et al., “Move That Sound There: Exploring Sound in Space with a Markerless Gestural Interface,” Leonardo Music Journal, Dec. 31, 2013, pp. 31-32, vol. 23. |
| Fohl, et al., “A Gesture Control Interface for a Wave Field Synthesis System,” In NIME, May 28, 2013, pp. 341-346. |
| Marshall, et al., “On the Development of a System for Gesture Control of Spatialization,” Proceedings of the 2006 International Computer Music Conference, Nov. 6-11, 2006, New Orleans, USA. |
| Okamoto, et al., “Sound Image Rendering System for Headphones,” IEEE Transactions on Consumer Electronics, Aug. 3, 1997, pp. 689-693, vol. 43, No. 3. |
| Number | Date | Country |
| --- | --- | --- |
| 20170262075 A1 | Sep 2017 | US |