The disclosure of Japanese Patent Application No. 2017-088857 filed on Apr. 27, 2017 including the specification, claims, drawings, and abstract is incorporated herein by reference in its entirety.
The present disclosure relates to a target position setting apparatus and a sound image localization apparatus.
There have been proposed sound image localization apparatuses for processing a sound source signal to localize a sound image at a target position.
JP 2004-193877 A discloses a structure for setting a sound image position, the structure including an X position setting section for setting a longitudinal position on a horizontal plane, a Y position setting section for setting a lateral position on the horizontal plane, a Z position setting section for setting a height position on a vertical plane, a θ position setting section for setting an angle of the horizontal plane, and a φ position setting section for setting an angle of the vertical plane. The listener can set the positions by clicking each item of these setting sections displayed on a graphical user interface (GUI) application screen and sliding a slider.
JP 2008-211834 A discloses a sound image localization apparatus in which a head-related transfer function is implemented.
Jens Blauert, Masayuki Morimoto, and Toshiyuki Goto, “Spatial Hearing”, Kajima Institute Publishing Co., Ltd., Jul. 10, 1986, discloses a technique for localizing a sound image at a desired position by reproducing a head-related transfer function, convolving it with a sound source signal, and presenting the resulting signal to the listener.
However, the above-described approach, which involves operating the X position setting section, the Y position setting section, the Z position setting section, the θ position setting section, and the φ position setting section, has a drawback in that it is difficult for the user to envision the actual three-dimensional position of the stereoscopic sound image.
As an alternative, binaural recording has also been developed, in which sound is collected using high-sensitivity microphones attached at the eardrum positions of a model of a human head (a “dummy head microphone”); however, this method is generally costly.
The present disclosure provides a technique that allows easy setting of a target position by using a GUI screen.
According to an aspect of the present disclosure, there is provided a target position setting apparatus comprising a display configured to display a first representation that represents a listener, a second representation that is obtained by projecting a hemispherical dome having a radius R with the listener at the center on a horizontal plane, and a mark that represents a target position in association with the second representation; a first operation element configured to set the radius R; a second operation element configured to freely move the mark within the second representation; and a controller configured to output three-dimensional position data of the mark with respect to the listener as target position data for sound image localization.
In one embodiment of the present disclosure, the display is further configured to display a third representation that is obtained by projecting the hemispherical dome having the radius R on a vertical plane and to display the mark in association with the third representation.
In another embodiment of the present disclosure, the second representation has a size that represents a distance from a sound image with respect to the listener, and the mark is located at a position that represents a position of the sound image with respect to the listener in a left-to-right direction, in a front-to-rear direction, and in a height direction.
In still another embodiment of the present disclosure, the first operation element and the second operation element comprise a mouse, the radius R is set in response to a click operation of the mouse, and the mark is moved in response to a drag operation of the mouse.
In still another embodiment of the present disclosure, the first operation element and the second operation element comprise a touch screen, the radius R is set in response to a touch operation on the touch screen, and the mark is moved in response to movement of a finger on the touch screen.
In still another embodiment of the present disclosure, each of the first operation element and the second operation element comprises a slide bar displayed on the display.
According to another aspect of the present disclosure, there is provided a sound image localization apparatus comprising the above-described target position setting apparatus; and a sound source signal processing apparatus configured to process a sound source signal using the target position data output from the target position setting apparatus to output a sound image localization signal.
The present disclosure allows easy setting of a target position for sound image localization using a GUI screen. Particularly, the present disclosure allows a user to easily identify a three-dimensional position of a sound image with respect to a listener.
Embodiments of the present disclosure will be described below with reference to the accompanying drawings.
GUI Screen as Precondition
A GUI screen that serves as a precondition in the illustrated embodiments will be described below.
A listener is located at the midpoint P of the hemispherical dome 52, and a sound image is localized at a desired position on the surface of the dome 52.
The slide bar 54 is a slide bar for localizing a position in the left-to-right direction with respect to the listener, and a sound image is moved in the left-to-right direction by moving a slider 55 using, for example, a mouse. The slide bar 54 corresponds to a slide bar for changing an azimuth angle θ in a horizontal plane. The slide bar 56 is a slide bar for localizing a position in the front-to-rear direction with respect to the listener, and a sound image is moved in the front-to-rear direction by moving a slider 57 using, for example, a mouse. The slide bar 56 corresponds to a slide bar for changing an elevation angle φ in a vertical plane. The target position of a sound image on the surface of the hemispherical dome 52 is represented by, for example, a dot O, and the target position of the sound image with respect to the listener can be easily envisioned because the dot O moves as the slider 55 or 57 is moved.
It is, however, difficult to identify the X position, the Y position, and the Z position of a sound image, because such a GUI basically performs localization by setting the azimuth angle θ and the elevation angle φ. Consider, for example, setting a movement path along which a sound image moves successively from near the right ear of the listener, around behind the listener, to near the left ear of the listener. This would be easy if the X position, the Y position, and the Z position could be identified, but it is difficult to identify them with the hemispherical dome 52 alone.
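For reference, the correspondence between the angular coordinates (θ, φ) set by such a GUI and the X, Y, and Z positions can be sketched as follows. This is a minimal sketch; the axis convention (X to the listener's right, Y to the listener's front, Z upward) is an assumption, not something stated above.

```python
import math

def angles_to_xyz(r, theta_deg, phi_deg):
    """Convert a dome position given by radius r, azimuth theta
    (degrees, 0 = straight ahead, positive to the right) and
    elevation phi (degrees, 0 = ear level, 90 = overhead) into
    assumed X (right), Y (front), and Z (up) positions."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    x = r * math.cos(phi) * math.sin(theta)
    y = r * math.cos(phi) * math.cos(theta)
    z = r * math.sin(phi)
    return x, y, z
```

Because each X, Y, and Z value depends nonlinearly on both angles, a path that is simple in Cartesian terms (right ear, behind, left ear) corresponds to a non-obvious sequence of (θ, φ) settings, which is the difficulty noted above.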
In the illustrated embodiments, while such a hemispherical dome 52 is included as a precondition, an improved GUI is provided.
GUI of Embodiments
The top view 60 further includes an icon or a representation 60a that schematically represents the listener, an icon or a representation 60b that schematically represents the hemispherical dome 52, and a dot O that represents the target position of a sound image. The representation 60a that represents the listener is illustrated in this view in a state in which the listener faces upward. The representation 60a serves as the first representation, and the representation 60b serves as the second representation.
The side view 62 further includes a representation 60a that represents the listener, a representation 60b that represents the surface of the hemispherical dome 52, and a dot O that represents the sound image target position. The representation 60a that represents the listener is illustrated in this view in a state in which the listener faces to the left. The representation 60b in the side view 62 serves as the third representation.
The slide bar 64 is a slide bar for localizing a position in the left-to-right direction with respect to the listener, and a sound image is moved in the left-to-right direction with respect to the listener by moving a slider 65 using, for example, a mouse. The slide bar 64 corresponds to a slide bar for changing an azimuth angle θ in a horizontal plane with the midpoint P of the representation 60a at the center.
The slide bar 66 is a slide bar for localizing a position in the front-to-rear direction with respect to the listener, and a sound image is moved in the front-to-rear direction with respect to the listener by moving a slider 67 using, for example, a mouse. The slide bar 66 corresponds to a slide bar for changing an elevation angle φ with the midpoint P of the representation 60a at the center.
The slide bar 68 is a slide bar for adjusting the radius R of the hemispherical dome 52 with respect to the listener (the midpoint P of the representation 60a), and the distance R between the listener (midpoint P) and a sound image is increased or reduced by moving a slider 69 using, for example, a mouse. The slide bar 68 corresponds to a slide bar for changing the distance R from the listener to the sound image. The slide bar 68 serves as the first operation element, and the slide bars 64 and 66 serve as the second operation element.
By viewing the top view 60 and the side view 62, the user can set the distance to the sound image and the position of the sound image in the left-to-right direction and in the front-to-rear direction. More specifically, moving the slider 65 moves the dot O in the top view 60 in the left-to-right direction, and moving the slider 67 moves the dot O in the top view 60 and the dot O in the side view 62 in the front-to-rear direction. Moving the slider 69 changes the radial position of the dot O in the side view 62 with respect to the midpoint P. Therefore, the user can easily identify, from the top view 60 and the side view 62, the distance between the listener and the sound image and the position in the left-to-right direction and in the front-to-rear direction.
Alternatively, rather than changing the position of the dot O in the side view 62 when the user moves the slider 69, the radius of the representation 60b may be changed so that it is scaled up or down. By viewing the representation 60b, the user can visually recognize the distance between the listener and the sound image. The X position and the Y position can be identified from the top view 60, and the Z position can be identified from the side view 62.
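The two views described above amount to two orthogonal projections of the same three-dimensional point, which can be sketched as follows (the axis convention, X right, Y front, Z up, is an assumption; the view orientations follow the description of the representations 60a above):

```python
def project_to_views(x, y, z):
    """Project a 3D sound image position onto the two 2D views.
    Top view (horizontal plane): shows X and Y, so height is lost.
    Side view (vertical plane): shows Y and Z, so left-right is lost."""
    top_view = (x, y)    # position of the dot O in the top view 60
    side_view = (y, z)   # position of the dot O in the side view 62
    return top_view, side_view
```

Because each projection discards one coordinate, both views together are needed to identify all of the X position, the Y position, and the Z position, which is why they are displayed side by side.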
First, the user clicks a desired position with respect to the representation 60a of the listener as illustrated in
Then, the user drags the mouse to a desired position to move the dot O toward the center of the representation 60a as illustrated in
First, the user clicks a desired point (position) on the GUI screen in
Then, the user moves the dot O through a drag operation toward the center portion to set the front-to-rear position, the left-to-right position, and the height (S102). The CPU of the computer receives a mouse drag operation input and detects a drag end position. A circle having the distance R from the center portion to the first click position as the radius is set and displayed as the representation 60b, and the drag end position is displayed in the form of a dot representing the sound image position (S103). The CPU of the computer sets the hemispherical dome 52 having the distance from the center portion to the first click position as the radius R, and localizes the drag end position on the surface of the dome 52 to output it as the sound image position (S104). The sound image position may include an azimuth angle θ and an elevation angle φ, or the X position, the Y position, and the Z position, or may be in any other form.
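The flow of steps S101 to S104 can be sketched as follows. Points are 2D screen coordinates in the top view; the mapping from the drag end position to the elevation assumes that a dome point at elevation φ projects onto the horizontal plane at planar distance R·cos φ from the center, consistent with the projection of the hemispherical dome 52 described above. The function and variable names are illustrative, not from the source.

```python
import math

def click_and_drag_to_target(center, click_pos, drag_end):
    """Turn one click-and-drag gesture in the top view into a
    sound image target position (R, azimuth theta, elevation phi).

    center    : (x, y) screen position of the listener (midpoint P)
    click_pos : (x, y) of the first click -- sets the radius R (S101)
    drag_end  : (x, y) where the drag ends -- sets the direction (S102-S104)
    """
    # S101: the distance from the center to the first click sets the radius R.
    r = math.hypot(click_pos[0] - center[0], click_pos[1] - center[1])

    # S102/S103: the drag end position gives the planar offset from the
    # listener, clamped onto the projected circle of radius R.
    dx = drag_end[0] - center[0]
    dy = drag_end[1] - center[1]
    d = min(math.hypot(dx, dy), r)

    # S104: azimuth from the planar direction; elevation from how far
    # inside the circle the drag ended (d = R*cos(phi) on the dome surface,
    # so the center of the circle corresponds to directly overhead).
    theta = math.degrees(math.atan2(dx, dy))       # 0 deg = straight ahead
    phi = math.degrees(math.acos(d / r)) if r else 0.0
    return r, theta, phi
```

Dragging all the way to the center thus yields φ = 90° (overhead), while ending the drag on the circle itself yields φ = 0° (ear level), matching the single click-and-drag operation described in the flow.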
As described above, the target position of a sound image can be set with only a single series of click and drag operations, and the user can easily identify the X position, the Y position, and the Z position of the sound image by viewing a top view (the hemispherical dome 52 as projected on a horizontal plane) as illustrated in
The computer 10 includes a CPU 12, a ROM 14, a RAM 16, a display screen 18, an HDD 20, an input/output interface (I/F) 22, a keyboard 24, and a mouse 26.
The CPU 12 causes the display screen 18, which serves as the display 50, to display a GUI screen in accordance with a processing program stored in the ROM 14 or the HDD 20 using the RAM 16 as working memory. The CPU 12 causes the representation 60a of the listener to be displayed substantially at the center of the GUI screen. In response to the user's click operation of the mouse 26, a corresponding operation signal is supplied to the CPU 12 through the input/output I/F 22. The CPU 12 detects the click position, sets the radius R of the hemispherical dome 52, and stores it in, for example, the RAM 16. Further, in response to the user's drag operation of the mouse 26, a corresponding operation signal is supplied to the CPU 12 through the input/output I/F 22. The CPU 12 detects the drag end position, sets the X position, the Y position, and the Z position of a sound image, and stores them in, for example, the RAM 16. The CPU 12 reads the radius R stored in, for example, the RAM 16, and causes the display screen 18 to display the representation 60b of the hemispherical dome 52 and to display a dot O that represents the position of the sound image at the drag end position. The X position, the Y position, and the Z position stored in the RAM 16 are output to the sound source signal processing apparatus 30 as the target position data of the sound image.
The sound source signal processing apparatus 30 processes a sound source signal using, for example, a head-related transfer function HRTF, and outputs a sound image localization signal to, for example, a loudspeaker or headphones. The processing of a sound source signal using a head-related transfer function HRTF is known and disclosed in, for example, JP 2008-211834 A.
Specifically, in response to input of the target position data from the computer 10, the sound source signal processing apparatus 30 separately sets parameters corresponding to structural characteristics for the right ear and for the left ear in accordance with the input target position data. It applies those parameters to an IIR filter for the right ear and an IIR filter for the left ear, and processes a sound source signal through these IIR filters to output sound image localization signals for the R channel and the L channel.
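As a sketch of this per-ear processing, a second-order IIR (biquad) section can be applied to the source signal once per ear. The coefficient sets below are arbitrary placeholders, not actual HRTF-derived parameters; a real implementation would derive the coefficients from the input target position data.

```python
def biquad(signal, b, a):
    """Direct-form I second-order IIR filter.
    b = (b0, b1, b2) feedforward, a = (a1, a2) feedback coefficients."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x1, x2 = x, x1
        y1, y2 = y, y1
        out.append(y)
    return out

def localize(source, params_r, params_l):
    """Process one source signal through separate right-ear and
    left-ear IIR filters to produce the Rch and Lch output signals."""
    rch = biquad(source, *params_r)
    lch = biquad(source, *params_l)
    return rch, lch

# Placeholder parameter sets for one hypothetical target position.
PARAMS_R = ((0.8, 0.2, 0.0), (-0.1, 0.0))
PARAMS_L = ((0.5, 0.4, 0.1), (-0.3, 0.0))
rch, lch = localize([1.0, 0.0, 0.0, 0.0], PARAMS_R, PARAMS_L)
```

Because the two ears receive differently filtered versions of the same source, the interaural differences encoded in the parameters produce the perceived localization.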
Although embodiments of the present disclosure are described above, the present disclosure is not limited to these embodiments and various modifications are possible. Modification examples will be described below.
Although, in the embodiments, the target position of a sound image is set using a mouse, the display 50 may be implemented by a touch screen and the target position may be set by a touch operation. Specifically, referring to
In the embodiments, the sound image localization apparatus includes the computer 10 and the sound source signal processing apparatus 30 as illustrated in
Although, in the embodiments, sound image paths as illustrated in
Although, in the embodiments, the dot O representing the target position of a sound image on a GUI screen is a black round dot, the shape and the color may be freely chosen. Any mark that can be visually recognized by the user may be displayed. The mark may be movable by an operation element such as a slide bar or a mouse, or may be movable through a touch operation.
In the embodiments, a circle with the listener at the center is provided as the representation 60b of the hemispherical dome 52 with the listener at the center as projected on a horizontal plane. Gradation may be added to an area within this circle to visually indicate the height. Any presentation method for two-dimensionally rendering the three-dimensional height may be used.
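One simple way to realize such a gradation is to map the height Z of the sound image to a brightness value drawn inside the circle. This is a sketch only; the linear mapping and the 8-bit grayscale range are assumptions.

```python
def height_to_gray(z, r):
    """Map a height z in [0, r] to an 8-bit grayscale value,
    so that a higher sound image is drawn brighter."""
    if r <= 0:
        return 0
    z = min(max(z, 0.0), r)   # clamp to the dome's height range
    return round(255 * z / r)
```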
Although, in the embodiments, a circle with the listener at the center is provided as the representation 60b of the hemispherical dome 52 with the listener at the center as projected on a horizontal plane, the circle may be replaced with an ellipse as illustrated in
Number | Date | Country | Kind |
---|---|---|---|
2017-088857 | Apr 2017 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
5440639 | Suzuki | Aug 1995 | A |
5521981 | Gehring | May 1996 | A |
8190438 | Nelissen | May 2012 | B1 |
20020150257 | Wilcock | Oct 2002 | A1 |
20080219454 | Iida et al. | Sep 2008 | A1 |
20090034772 | Iida et al. | Feb 2009 | A1 |
20100080396 | Aoyagi | Apr 2010 | A1 |
20120057715 | Johnston | Mar 2012 | A1 |
20120210223 | Eppolito | Aug 2012 | A1 |
20150023524 | Shigenaga et al. | Jan 2015 | A1 |
20150326988 | Zielinsky | Nov 2015 | A1 |
20160227342 | Yuyama | Aug 2016 | A1 |
20170215018 | Rosset | Jul 2017 | A1 |
20180139565 | Norris | May 2018 | A1 |
Number | Date | Country |
---|---|---|
101065990 | Oct 2007 | CN |
101175343 | May 2008 | CN |
104301664 | Jan 2015 | CN |
2004-193877 | Jul 2004 | JP |
2008-211834 | Sep 2008 | JP |
Number | Date | Country
---|---|---
20180314488 A1 | Nov 2018 | US |