The present application claims priority to European Patent Application No. EP17150264.4 entitled “ARRANGEMENTS AND METHODS FOR GENERATING NATURAL DIRECTIONAL PINNA CUES”, and filed on Jan. 4, 2017. The entire contents of the above-listed application are hereby incorporated by reference for all purposes.
The disclosure relates to arrangements and methods for 3D audio generation, in particular for 3D audio generation for virtual and augmented reality applications.
Virtual reality (VR) and augmented reality (AR) applications have become more and more popular. Virtual reality typically refers to computer technologies that use software to generate realistic images, sounds and other sensations that replicate a real environment, or create an imaginary setting, and simulate a user's physical presence in this environment, by enabling the user to interact with this space and any objects depicted therein using specialized display screens or projectors and other devices. Virtual reality equipment usually includes a headset that may be arranged on the user's head. The headset holds a display in position in front of the user's eyes and in some cases provides loudspeakers for generating a suitable sound experience. Often, VR headsets are combined with standard headphones. Most headphones available on the market today produce an in-head sound image when driven by a conventionally mixed stereo signal. “In-head sound image” in this context means that the predominant part of the sound image is perceived as being originated inside the user's head, usually on an axis between the ears. If sound is externalized by suitable signal processing methods (externalizing in this context means the manipulation of the spatial representation in a way such that the predominant part of the sound image is perceived as being originated outside the user's head), the center image tends to move mainly upwards instead of moving towards the front of the user. While especially binaural techniques based on HRTF filtering are very effective in externalizing the sound image and even positioning virtual sound sources on most positions around the user's head, such techniques usually fail to position virtual sources correctly on a frontal part of the median plane (in front of the user).
This means that acoustic events from the front, which is arguably the most important direction for VR environments and AR applications, currently cannot be reliably reproduced at the correct position when played over commercially available headphones. Generally, the visual content of VR or AR applications may help to improve frontal localization. However, visible sound sources for all sounds in front of the user are not necessarily present in VR and AR applications. In some embodiments of the present disclosure the localization of sound sources in front of the user may be improved if combined with suitable signal processing. Besides the optimization of spatial sound aspects for VR and AR applications, ease of use and wearing comfort are further important factors for VR and AR headsets. Loudspeakers that are integrated into VR and AR headsets generally help to prevent the clutter that may result when two devices are worn on top of each other (VR/AR headset and headphones). Current arrangements that try to integrate loudspeakers into the VR/AR headsets suffer from a degradation of spatial sound aspects, especially perceived source direction, and from limited low frequency output. In order to avoid the degradation of localization performance, an individual compensation of the transfer functions between the loudspeakers and the ears may be used for each user. The proposed sound source arrangements do not require individual transfer function compensation and, therefore, can avoid the corresponding measurement procedure as well as measurement hardware.
A headset arrangement for virtual reality, augmented reality or mixed reality applications is configured to induce natural directional pinna cues. The arrangement comprises a support structure configured to be arranged on a user's head and to hold a display in front of the user's eyes. For each ear, the support structure comprises at least a first sound source and a second sound source, wherein, when the support structure is arranged on a user's head, the first sound source and the second sound source are arranged such that at the concha of the user a primary sound incidence direction of sound emitted by the first sound source is essentially opposing to a primary sound incidence direction of sound emitted by the second sound source. The primary sound incidence direction is the direction from which the sound emitted by a sound source reaches the concha for the first time.
Other systems, methods, features and advantages will be or will become apparent to one with skill in the art upon examination of the following detailed description and figures. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the disclosure and be protected by the following claims.
The method may be better understood with reference to the following description and drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the disclosure. Moreover, in the figures, like reference numerals designate corresponding parts throughout the different views.
Many virtual reality (VR) and augmented reality (AR) headsets today rely on additional conventional headphones to generate sound for VR and AR applications. Only few VR and AR headsets have loudspeakers directly integrated into the support structure of the headset that is worn on the head to hold the display in place in front of the user's eyes. Usually, an additional headphone has to be worn by the user.
Sound source positions in the space surrounding the user can be described by means of an azimuth angle φ (position left to right), an elevation angle υ (position up and down) and a distance measure (distance of the sound source from the user). The azimuth and the elevation angle are usually sufficient to describe the direction of a sound source. The human auditory system uses several cues for sound source localization, including interaural time difference (ITD), interaural level difference (ILD), and pinna resonance and cancellation effects, that are all combined within the head related transfer function (HRTF).
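The description of a source position by azimuth angle, elevation angle and distance may be illustrated by a short conversion into Cartesian coordinates. The following is only a minimal sketch. The axis convention (x towards the front, y towards the left, z upwards, azimuth measured from the front towards the left) and the function name `direction_to_cartesian` are assumptions made for illustration and are not defined by the disclosure.

```python
import math

def direction_to_cartesian(azimuth_deg, elevation_deg, distance_m=1.0):
    """Convert (azimuth, elevation, distance) to Cartesian coordinates.

    Assumed convention (illustrative only): x points to the front,
    y to the left, z upwards; azimuth 0 deg lies on the median plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance_m * math.cos(el) * math.cos(az)
    y = distance_m * math.cos(el) * math.sin(az)
    z = distance_m * math.sin(el)
    return (x, y, z)

# A source straight ahead at 2 m lies on the x axis.
front = direction_to_cartesian(0.0, 0.0, 2.0)
# A source at 90 deg azimuth lies to the side, on the y axis.
side = direction_to_cartesian(90.0, 0.0, 1.0)
```

With this convention, every source with an azimuth angle of 0° has y = 0, that is, it lies on the median plane.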
If sound in conventional headphone arrangements is externalized by suitable signal processing methods (externalizing in this context means that at least the predominant part of the sound image is perceived as being originated outside the user's head), the center channel image tends to move mainly upwards instead of to the front. This is exemplarily illustrated in
Sound sources that are arranged on the median plane (azimuth angle φ=0°) lack interaural differences in time (ITD) and level (ILD) which could be used to position virtual sources. If a sound source is located on the median plane, the distance between the sound source and the ear as well as the shading of the ear through the head are the same to both the right ear and the left ear. Therefore, the time the sound needs to travel from the sound source to the right ear is the same as the time the sound needs to travel from the sound source to the left ear and the amplitude response alteration caused by the shading of the ear through parts of the head is also equal for both ears. The human auditory system analyzes cancellation and resonance magnification effects that are produced by the pinnae, referred to as pinna resonances in the following, to determine the elevation angle on the median plane. Each source elevation angle and each pinna generally provokes very specific and distinct pinna resonances.
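The absence of an interaural time difference for sources on the median plane may be checked with a simple geometric sketch. The model below is a strong simplification: the ears are treated as two points in free field, head shading and diffraction are ignored, and the ear offset of 0.0875 m and the speed of sound of 343 m/s are assumed values chosen only for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air
EAR_OFFSET_M = 0.0875        # assumed distance of each ear from the median plane

def interaural_time_difference(source_xyz):
    """Return the ITD in seconds for a point source (x front, y left, z up).

    Positive values mean the sound reaches the left ear first.
    Free-field two-point model; head shading is not modelled.
    """
    left_ear = (0.0, EAR_OFFSET_M, 0.0)
    right_ear = (0.0, -EAR_OFFSET_M, 0.0)
    d_left = math.dist(source_xyz, left_ear)
    d_right = math.dist(source_xyz, right_ear)
    return (d_right - d_left) / SPEED_OF_SOUND_M_S

# Source on the median plane (y = 0), in front of and above the user.
itd_median = interaural_time_difference((1.0, 0.0, 0.5))
# Source directly to the left of the user.
itd_left = interaural_time_difference((0.0, 1.0, 0.0))
```

For any source with y = 0 the two path lengths are equal and the ITD vanishes regardless of elevation, which is why only pinna resonances remain as elevation cues on the median plane.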
Pinna resonances may be applied to a signal by means of filters derived from HRTF measurements. However, attempts to apply foreign (e.g., from another human individual), generalized (e.g., averaged over a representative group of individuals), or simplified HRTF filters usually fail to deliver a stable location of the source in the front, due to strong deviations between the individual pinnae. Only individual HRTF filters are usually able to generate stable frontal images on the median plane if applied in combination with individual headphone equalizing. However, such a degree of individualization of signal processing is almost impossible for the consumer mass market.
The present disclosure includes VR and AR headset arrangements that are capable of individually generating directional pinna cues associated with at least two approximately opposing directions. Some of the proposed headset arrangements support the generation of an improved centered frontal sound image and embodiments of the disclosure are further capable of positioning virtual sound sources all around the user's head 2 if combined with appropriate signal processing. This is exemplarily illustrated in
Some of the VR headsets available today provide integrated solutions for audio playback. One example of such a VR headset 100 is schematically illustrated in
A third example of a prior art headset 100 is schematically illustrated in
Most VR headsets today, however, do not have any integrated audio sources, but have to be combined with standard headphones. The spatial characteristics of typical headphones are usually less important than general sound quality attributes such as tonal balance, a wide working frequency range and low distortion. If the general sound quality is inferior to typical headphone standards, spatial effects are usually rejected by users, especially for stereo playback. Embodiments of the proposed headset arrangement may not be substantially worse in general sound quality aspects than typical headphones that are available today. Especially the playback of low frequencies usually requires physical structures of considerable size to be positioned around the user's ear. The reduction of negative effects of such structures on the controlled induction of natural directional pinna cues is one aspect of the proposed headset arrangement. Controlled induction of natural directional pinna cues can serve multiple purposes. As has been described before, the localization accuracy of virtual sources on the median plane can be improved by inducing suitable directional pinna cues. Another advantage over conventional binaural synthesis based on generalized HRTFs is the improved tonality, because the user is presented with his own spectral shape cues which are, in contrast to foreign spectral shape cues, not perceived as disturbing tonality alterations. On the other hand, directional pinna cues may also be suppressed in a controlled way by superposition of multiple essentially contradicting directional cues as provided by the proposed headset arrangements. This provides an ideal basis for conventional binaural synthesis based on generalized or individual HRTFs, because no disturbing directional pinna cues are generated by the headset arrangement.
Conventional binaural synthesis that is based on generalized or individual HRTFs is currently the de facto standard for virtual and augmented reality applications which often only provide a binaural (2 channel) signal. Finally, even normal stereo playback without any spatial processing may benefit from headset arrangements that do not produce uncontrolled comb filtering effects which may result from reflections inside a headphone structure and disturb the tonality of reproduced sound. In some of the proposed headset arrangements, which include measures for reducing reflections within the headset structure, the natural sound field may reach the ear of the user virtually unaltered. Furthermore, the proposed headset arrangement solves problems of conventional headphones such as unwanted pressure on the ears or heat build-up inside the ear cups, for example.
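The comb filtering effect mentioned above may be illustrated with a simple model in which the direct sound is superimposed with a single, non-inverted reflection that travels an additional path length. Under this assumption, cancellation occurs at odd multiples of half the inverse delay. The following sketch is not part of the disclosed arrangements; the function name, the speed of sound of 343 m/s and the single-reflection model are assumptions for illustration.

```python
def comb_notch_frequencies(extra_path_m, max_freq_hz=20000.0,
                           speed_of_sound_m_s=343.0):
    """List the notch frequencies (Hz) caused by one delayed reflection.

    Assumes a single non-inverted reflection with an extra path length
    of extra_path_m; notches lie at (2k + 1) / (2 * delay).
    """
    if extra_path_m <= 0.0:
        raise ValueError("extra_path_m must be positive")
    delay_s = extra_path_m / speed_of_sound_m_s
    notches = []
    k = 0
    while True:
        freq_hz = (2 * k + 1) / (2.0 * delay_s)
        if freq_hz > max_freq_hz:
            break
        notches.append(freq_hz)
        k += 1
    return notches

# An extra path of 3.43 cm corresponds to a delay of about 0.1 ms.
notches = comb_notch_frequencies(0.0343)
```

In this example the notches fall at approximately 5 kHz and 15 kHz, which is within the frequency range in which pinna resonances are evaluated.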
Within this document, the terms pinna cues and pinna resonances are used to denominate the frequency and phase response alterations imposed by the pinna and possibly also the ear canal in response to the direction of arrival of sound. The terms directional pinna cues and directional pinna resonances within this document have the same meaning as the terms pinna cues and pinna resonances, but are used to emphasize the directional aspect of the frequency and phase response alterations produced by the pinna. Furthermore, the terms natural pinna cues, natural directional pinna cues and natural pinna resonances are used to point out that these resonances are actually generated by the user's pinna in response to a sound field in contrast to signal processing that emulates the effects of the pinna. Generally, pinna resonances that carry distinct directional cues are excited if the pinna is subjected to a direct, approximately unidirectional sound field from the desired direction. This means that sound waves emanating from a source from a certain direction hit the pinna without the addition of very early reflected sounds of the same sound source from different directions. While humans are generally able to determine the direction of a sound source in the presence of typical early room reflections, reflections that arrive within a too short time window after the direct sound will alter the perceived sound direction. Therefore, some embodiments of the headset arrangement according to the present disclosure send direct sound to the pinna while suppressing, or at least reducing, reflections from surfaces close to the pinna and, therefore, are able to induce strong directional cues.
Known stereo headphones generally can be grouped into in-ear, over-ear and around-ear types. Around-ear types are commonly available as so-called closed-back headphones with a closed back-chamber behind the loudspeaker or as so-called open-back headphones with an open back-chamber behind the loudspeaker. Headphones may have a single or multiple drivers (loudspeakers). Besides high quality in-ear headphones, specific multi-way surround sound headphones exist that utilize multiple loudspeakers aimed at generating directional effects.
In-ear headphones are generally not able to generate natural pinna cues, due to the fact that the sound does not pass the pinna at all and is directly emitted into the ear canal. Within a fairly large frequency range, on-ear and around-ear headphones having a closed back produce a pressure chamber around the ear that usually either completely avoids pinna resonances or at least alters them in an unnatural way. In addition, this pressure chamber is directly coupled to the ear canal which alters ear canal resonances as compared to an open sound-field, thereby further obscuring natural directional cues. At higher frequencies, elements of the ear cups reflect sound, whereby a diffuse sound field is produced that cannot induce pinna resonances associated with a single direction. The headset according to the present disclosure includes an open sound structure and, therefore, avoids such drawbacks.
Typical open-back headphones as well as most closed-back around-ear and on-ear headphones that are available on the market today utilize large diameter loudspeakers. Such large diameter loudspeakers are often almost as big as the pinna itself, thereby producing a large plane sound wave from the side of the head that is not appropriate to generate consistent pinna resonances as would result from a directional sound field from the front. Additionally, the relatively large size of such loudspeakers as compared to the pinna, as well as the close distance between the loudspeaker and the pinna and the large reflective surface of such loudspeakers result in an acoustic situation which resembles a pressure chamber for low to medium frequencies and a reflective environment for high frequencies. Both situations are detrimental to the induction of natural directional pinna cues associated with a single direction.
Surround sound headphones with multiple loudspeakers usually combine loudspeaker positions on the side of the pinna with a pressure chamber effect and reflective environments. Such headphones are usually not able to generate consistent directional pinna cues, especially not for the frontal hemisphere.
Generally, all kinds of objects that cover the pinna, such as back covers of headphones or large loudspeakers themselves, may cause multiple reflections within the chamber around the ear. These reflections generate a diffuse sound field that is detrimental to the natural pinna effects caused by directional sound fields.
Therefore, the present disclosure provides an optimized headset arrangement that allows direct sound to be sent towards the pinna from all desired directions while minimizing reflections, in particular reflections from the headset arrangement itself into the region of the pinna or the concha of the user. While pinna resonances are widely accepted to be effective above frequencies of about 2 kHz, real world loudspeakers usually produce various kinds of noise and distortion that will allow the localization of the loudspeaker even for substantially lower frequencies. The user may also notice differences in distortion, temporal characteristics (e.g., decay time) and directivity between different speakers used within the frequency spectrum of the human voice. Therefore, a lower frequency limit on the order of about 200 Hz or lower may be chosen for the loudspeakers that are used to induce directional cues with natural pinna resonances, while reflections may be controlled at least for higher frequencies (e.g., above 2-4 kHz).
Generating a stable frontal image on the median plane is presumably a greater challenge than generating a stable image from other directions. Generally, the generation of individual directional pinna cues is more important for the frontal hemisphere (in front of the user) than for the rear hemisphere (behind the user). Effective natural directional pinna cues are easier to induce for the rear hemisphere for which the replacement with generalized cues is generally possible with good effects at least for standard headphones which place loudspeakers at the side of the pinna. Therefore, some of the proposed headset arrangements focus on optimization of frontal hemisphere cues while providing weaker, but still adequate, directional cues for the rear hemisphere. Other arrangements may provide equally good directional cues for each of the front and rear direction. To achieve strong natural directional pinna cues, the headset arrangements are configured such that the sound waves emanated by one or more sound sources mainly pass the pinna, or at least the concha, once from the desired direction with reduced energy in reflections that may occur from other directions. Some arrangements focus on the reduction of reflections for sound sources in the frontal part of the sound structure, while other arrangements minimize reflections independently of the position of the sound source. The sound structure of a VR or AR headset according to the present disclosure may comprise those parts of the headset which contribute to the generation or control of sound. Such parts may, for example, comprise sound sources, waveguides, sound tubes, reflectors, and any support structure for any of these components. The sound structure may be partly or completely integrated into a larger support structure of the headset. The sound structure may encircle the ear of the user partly or completely.
The present disclosure generally avoids putting the ear into a pressure chamber, at least above 2 kHz, and in some embodiments reduces reflections into the pinnae which tend to cause a diffuse sound field. To avoid reflections, the at least two sound sources may be positioned on the headset so as to produce the desired directions of the respective sound fields. The support structure is arranged such that reflections are avoided or minimized.
Most VR and AR headsets today include solid structures that are arranged almost all around the user's head to comfortably support the weight of the display that is arranged in front of the user's eyes. The display usually forms a mass center that is arranged at a comparably large distance in front of the user's head. In many cases such solid structures generally allow an integration of loudspeakers or, more generally speaking, sound sources. An integration of sound sources usually only causes a moderate increase of the external dimensions of the headset. In any case, most of the headset structures today are strong enough to carry additional sound sources. Most headset structures also allow the sound sources to be placed at clearly defined positions with respect to the user's ears. Some headset structures already offer a design that allows the sound sources to be placed at positions which are advantageous for the generation of natural directional pinna cues associated with the preferred directions for improvement of virtual sound source positioning (e.g., front and back). Furthermore, an uneven mass distribution caused by the display arranged at the front of the headset structure allows for the addition of a certain weight along the middle and rear parts of the headset structure.
Therefore, according to some embodiments of the present disclosure, loudspeakers or sound sources are integrated into headset structures that are similar to known VR headset designs. These embodiments illustrate the principles of sound source integration into VR headsets, although sound sources generally may be integrated into any VR headset design. Generally, loudspeakers may be arranged anywhere on the headset structure. In some examples, the loudspeakers radiate sound directly in a desired direction. In other examples, however, one or more loudspeakers radiate sound into a sound control unit such as a sound canal, sound tube, wave guide, reflector or the like. The sound control unit may be configured to control the direction of the sound field that arrives at the ear of the user or, in particular at the pinna of the user's ear. For example, a loudspeaker may be arranged at a first end of a sound canal and the sound outlet at the other end of this sound canal may be arranged such that sound is emitted in a desired direction and/or from a desired position with respect to the pinna when exiting the sound canal. The respective loudspeakers, however, do not necessarily have to be arranged in proximity to the user's ear and/or emit sound in a desired direction. For example, a loudspeaker may be arranged within a sound canal, sound tube or wave guide of which separate sections attach to the front and respectively back of the loudspeaker, guiding sound from one side of the loudspeaker towards a pinna of the user while guiding sound from the other side of the loudspeaker away from the pinna or towards the second pinna. The Figures exemplarily illustrate loudspeakers and loudspeaker arrangements. However, it should be noted that the loudspeakers illustrated in the Figures merely represent sound sources, e.g., sound outlets of sound control units, and the sound may be generated at different locations within the headset structure. 
In those examples where the loudspeakers are arranged at or close to the positions illustrated in the Figures, they should not necessarily be understood as a single loudspeaker. One of the exemplarily illustrated sound sources may include more than one loudspeaker or more than one other sound generating device. In any case, it may be assumed that sound sources direct at least a part of their radiated sound towards the pinna. Furthermore, most of the Figures illustrate a headset structure only for the right side of a user's head. It should be noted that the same applies for the other ear (e.g., left ear) which is not illustrated in the Figures.
One example of a headset 100 is illustrated in
The frame of the sound structure 14 may be at least partially hollow inside. One or more walls may separate one or more cavities inside the frame from the surrounding air on the outside. At least one of the sound sources 20, 30, 40 may be a loudspeaker, wherein a first side of the loudspeaker faces the outside and a second side of the loudspeaker faces one of the at least one cavities inside the frame. In this way the one or more cavities provide a back volume for at least one loudspeaker. The at least two sound sources 20, 30, 40 are configured to emit sound to the ear from a desired direction (e.g., from the front, rear or top). One of the at least two sound sources 20, 30, 40 may be positioned on the frontal half of the sound structure 14 to support the induction of natural directional cues as associated with the frontal hemisphere. At least one sound source 30 may be arranged behind the ear on the rear half of the sound structure 14 to support the induction of natural directional cues as associated with the rear hemisphere. When arranging the at least one sound source 20 on the frontal half of the sound structure 14, the sound source position with respect to the horizontal plane through the ear canal does not necessarily have to match the elevation angle υ of the resulting sound image. An optional sound source 40 above the user's ear, or user's pinna, may improve sound source locations above the user 2.
A similar arrangement is illustrated by means of
As is illustrated in
A third sound source 40 may be arranged on the support structure 120 essentially above the user's ear. The main direction of sound propagation of the third sound source 40 may be directed essentially towards the user's ear canal. However, the main direction of sound propagation of the third sound source 40 does not necessarily have to be perpendicular to the horizontal plane (sound source 40 arranged directly above the ear canal of the user 2). The third sound source 40 may be arranged such that its main direction of sound propagation is at an angle between about 45° and about 90°, between about 60° and 90° or between 75° and 90° with respect to the horizontal plane.
The second extension 300 may be an essentially straight extension passing behind the user's ear. This is, however, only an example. The second extension 300 may include an appendix which passes below the user's ear. In one example, the second extension 300 is essentially L-shaped. A fourth sound source 50 may be arranged on the appendix of the second extension 300 such that it emits sound from essentially below the user's ear (main direction of sound propagation perpendicular to the horizontal plane from below). It is also possible that the first extension 200 is an essentially L-shaped extension and includes a sound source which emits sound from essentially below the user's ear, for example.
Referring to
Generally, sound sources that are arranged essentially at ear level in front of the user's ear are suited particularly well for generating virtual sound sources in front of the user 2. However, there is a wide range of locations at which sound sources may be positioned around the ear or, in particular, around the pinna of the user 2. As has already been described above, the term “sound source” as used herein, may refer to a loudspeaker or to a sound outlet of a sound control unit which directs sound of a remote loudspeaker or any other remote sound generation unit in a desired direction. The general principle of the present disclosure is described in more detail referring to
The examples of
Examples a) and b) of
For natural pinna resonance stimulation above about 2 kHz a sound source that is arranged approximately in front of the pinna can be used to improve stability and accuracy of virtual sound sources in front of the user. A definition of directions with respect to the pinna, e.g., front, rear, left, right, is given by means of
Therefore, the proposed headset aims at essentially neutralizing natural directional cues in form of pinna resonances for those cases in which the desired virtual sound source direction does not match the available directional cue from any individual or combined sound sources. Therefore, sound fields from opposing directions are superimposed in the area of the pinna. This requires respective sound sources arranged at largely opposing directions with respect to the pinna or concha region. If a sound source is arranged in front of the pinna, another sound source behind the pinna may be added to complement the sound source in front of the pinna with a sound field from an opposing direction. This is exemplarily illustrated in
In example b) of
Instead of arranging a first sound source 21 in front of the user's pinna or concha and a second sound source 22 behind the user's pinna or concha, it is also possible, for example, to arrange one sound source above the user's pinna or concha and one sound source below the user's pinna or concha. In the second case, the relevant directions of sound propagation of the sound sources are essentially perpendicular to the horizontal plane. These are, however, only examples. Any other angles between the relevant direction of sound propagation of a sound source and the horizontal plane are possible, the relevant directions of the sound sources being essentially opposing with respect to the sound radiated towards the pinna or concha area. Possible angles Ψ between the relevant directions of sound propagations of two essentially opposing sound sources have already been described above with respect to examples a) and b) of
Example c) of
The three sound sources 21, 22, 23 may be arranged at the corners of an isosceles triangle, wherein the symmetry axis S1 of the triangle runs across the pinna or concha, or the ear canal. In example c), the second sound source 22 is arranged behind the pinna such that its relevant direction of sound propagation is essentially parallel to the horizontal plane. The first and third sound sources 21, 23 are arranged in front of the pinna, with the first sound source 21 being arranged below the horizontal plane and the third sound source 23 arranged above the horizontal plane. The relevant directions of sound propagation of the first and third sound sources 21, 23 arranged in front of the pinna are directed upwards or downwards, respectively, towards the horizontal plane and, in particular, towards the concha. The symmetry axis S1 in example c) is essentially parallel to the horizontal plane. This is, however, only an example. The symmetry axis S1 may be arranged at any angle with regard to the horizontal plane. In order to provide a signal to the user that is essentially neutral with regard to directional pinna cues induced at the user's ear, the first sound source 21, the second sound source 22 and the third sound source 23 may emit the same signal towards the concha of the user's ear, at least for frequencies between about 4 and about 15 kHz, whereas the signal level of the first sound source 21 and the third sound source 23 may be reduced by approximately 6 dB as compared to the signal of the second sound source 22, because the total SPL of the first and third sound source 21, 23 adds up and, therefore, needs to be reduced for an equal weighting of frontal and rear directional pinna cues as induced by the frontal and rear sound sources, respectively.
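The level reduction of approximately 6 dB may be reproduced with a short calculation. The sketch below assumes an idealized case in which the identical signals of the two frontal sound sources add up in phase at the concha, so that their amplitudes sum. This coherent summation and the function names are assumptions made for illustration; the actual summation in a real arrangement depends on frequency and geometry.

```python
import math

def per_source_gain_db(num_coherent_sources):
    """Gain per source (dB) so that the in-phase sum matches one source.

    Assumes ideal coherent (amplitude) summation of identical signals.
    """
    return -20.0 * math.log10(num_coherent_sources)

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

# Two frontal sources (21 and 23) versus one rear source (22).
frontal_gain_db = per_source_gain_db(2)           # about -6.02 dB
frontal_sum = 2 * db_to_linear(frontal_gain_db)   # combined frontal amplitude
```

Under this assumption each frontal sound source is attenuated by about 6.02 dB, and the combined frontal amplitude equals that of the single rear sound source, which corresponds to an equal weighting of frontal and rear directional pinna cues.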
Other arrangements, such as the arrangements that are illustrated by means of
The same applies for a rear sound source which is arranged behind the frontal plane, rear direction R. A top sound source is arranged above the horizontal plane, which divides the user's head in an upper part and a lower part, top direction T, and a bottom sound source is arranged below the horizontal plane, bottom direction B. Top and bottom sound sources may be arranged on a plane which runs essentially parallel to the median plane such that their relevant direction of sound propagation towards the concha is essentially parallel to the median plane. It is, however, also possible that top and bottom sound sources are arranged such that their relevant direction of sound propagation towards the concha is at an angle α with respect to the median plane. The relevant direction of sound propagation towards the concha may be directed towards or away from the median plane. A sound source that is arranged on the side of the user's head, side direction S, may be arranged on the horizontal plane such that its relevant direction of sound propagation towards the concha is essentially parallel to the horizontal plane and the frontal plane and essentially perpendicular to the median plane.
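The angle α between a relevant direction of sound propagation and the median plane may be computed from a direction vector. The following is a minimal sketch under an assumed coordinate convention (x towards the front, y towards the left and therefore normal to the median plane, z upwards); the function name is hypothetical.

```python
import math

def angle_to_median_plane_deg(direction_xyz):
    """Angle (degrees) between a direction vector and the median plane.

    Assumed convention: the y axis is the normal of the median plane.
    0 deg means parallel to the median plane, 90 deg perpendicular to it.
    """
    x, y, z = direction_xyz
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("direction vector must not be zero")
    # Clamp to guard against rounding slightly above 1.0.
    ratio = min(1.0, abs(y) / length)
    return math.degrees(math.asin(ratio))

alpha_front = angle_to_median_plane_deg((-1.0, 0.0, 0.0))   # frontal source, sound travelling rearwards
alpha_top = angle_to_median_plane_deg((0.0, 0.0, -1.0))     # top source, sound travelling downwards
alpha_side = angle_to_median_plane_deg((0.0, 1.0, 0.0))     # side source
```

In this sketch, front, rear, top and bottom directions that run parallel to the median plane yield α = 0°, whereas the side direction S, which is perpendicular to the median plane, yields α = 90°.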
Besides the above-mentioned directions (front direction F, rear direction R, top direction T and bottom direction B), sound sources may be placed all around the ear with an angle α between their respective relevant direction of sound propagation towards the concha and a plane through the ear parallel to the median plane. Generally, there are no restrictions for the angle α. However, it should be considered that especially virtual sound sources on the median plane, in particular sound sources in front of the user, are often subject to false localization due to the lack of interaural differences, as has already been mentioned before. Sound source positions that closely mimic the direction of incidence of sound from sound sources arranged on the median plane are often very well suited for the induction of natural pinna resonances supporting specific directions on the median plane. Therefore, deviations of the angle α from the plane parallel to the median plane, as illustrated in
As has already been described above, very early reflections of sound that is emitted by a sound source that is used for generating directional pinna resonances may be caused by objects close to the pinna. Such very early reflections are detrimental to the introduction of strong natural directional pinna cues if they reach the pinna from considerably different directions than the direct sound. Therefore, such reflections should be avoided or at least reduced as far as possible. Measures that may be taken in order to reduce reflections that are directed towards the pinna include the avoidance of surface area orientations around the pinna that re-direct sound from any sound source towards the pinna, concealing any mechanical structures that are arranged behind the user's ear behind the pinna to shade them against direct sound, application of sound absorbing or low reflective material to structures that are prone to directing reflections at the pinna, and controlling sound source radiation patterns, thereby reducing sound radiation towards obstacles that would reflect sound towards the pinna. If reflections which cannot be avoided result in a small shift of the direction associated with the generated pinna cues from the intended direction, the position of the sound source may be shifted in order to compensate for the deviation from the desired direction associated with the pinna cues. If, for example, the elevation angle of a source direction associated with pinna cues induced by a frontal sound source is higher than desired, the position of the physical sound source may be shifted to a lower elevation angle to compensate for the deviation.
There are several parameters that can alter directional pinna cues. These parameters include the individual perception characteristics of the user which may lead to variations of the perceived image elevation angle, and reflections on parts of the headset arrangement. Generally, individual directional pinna resonance cues from the front support and improve the generation of sound images in the frontal hemisphere of the user and thereby also the generation of sound images at a centered position in front of the user, even if the incidence angle at which the sound source is positioned does not exactly match the elevation angle of the desired sound image.
The frame of the sound structure may have an essentially rounded or essentially oval shape. The rounded or oval shape, however, is only an example. Generally, the sound structure may have any suitable form, e.g., circular, rectangular or any other regular or irregular form. The form of the sound structure in combination with the sound source arrangement may be chosen such that reflections of the sound on the sides of the sound structure opposite to the sound sources are reduced. The form of the sound structure may be chosen such that the pinna is kept essentially open and such that it allows the sound sources to be positioned at effective angles with respect to the horizontal plane to obtain the desired sound direction. However, there are usually constraints when choosing an optimum shape of the sound structures. Such constraints may be given by the shape of the support structure. The desired target sound field is unidirectional, meaning that reflections into the pinna or at least the concha region are altogether avoided. If direct sound emanating from the frontal part of the sound structure reaches the concha region and is accompanied by a reflection into the concha region from above or behind the pinna, a directional cue may be weakened or destroyed altogether. The more numerous or the stronger the reflections, the less clear the remaining directional pinna cues will be. Therefore, reflections may be reduced in order to be able to provide strong directional pinna cues.
One possibility to reduce reflections into regions of the pinna, or especially the concha, is to direct the reflections away from the pinna or concha. The external surface of sound structures or support structures may comprise a plurality of external surface sections. These external surface sections may, for example, be so small that their surface area is approximately planar (e.g., less than 1° variation in the direction of the vertical, i.e., the surface normal, on any part of the surface area). External surface sections of a sound structure or support structure arranged around the ear may be angled either such that the verticals of these surface sections point in a direction towards the pinna or concha or such that they point in a direction that is not towards the pinna or concha. In order to minimize reflections into the concha region, external surface sections that point towards the pinna or concha may be avoided or their surface area minimized. This is of particular importance for surface sections with a direct line of sight towards the pinna or especially the concha, or, in other words, from which a straight line can be drawn towards a part of the pinna or concha without intersecting other objects in between. External surface sections around the pinna may, for example, be angled at an angle <90°, <70° or <50° with respect to the median plane in order to direct reflections away from the pinna. For example, more than 30%, more than 50% or more than 70% of the surface sections with a direct line of sight towards the pinna or concha may be angled at an angle <90°, <70° or <50° to the median plane such that their vertical does not point towards the pinna or concha. Generally, the more distant the surface section that directs a reflection towards the concha, the lower the sound intensity of that reflection when it reaches the concha.
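The dependence of reflection strength on distance can be illustrated with a short calculation. The following is a minimal sketch, assuming a point-like sound source with spherical spreading (sound pressure proportional to the inverse of the path length) and a perfectly reflecting surface section; the path lengths are hypothetical values chosen for illustration only.

```python
import math


def reflection_level_re_direct_db(direct_path_m: float, reflected_path_m: float) -> float:
    """Level of a reflection relative to the direct sound, in dB.

    Assumptions: spherical spreading from a point source and a lossless
    reflection. The reflected path is the total length from the source
    via the surface section to the concha.
    """
    if direct_path_m <= 0.0 or reflected_path_m <= 0.0:
        raise ValueError("path lengths must be positive")
    return 20.0 * math.log10(direct_path_m / reflected_path_m)


# Hypothetical geometry: direct path of 4 cm, reflections via a nearby
# and a more distant surface section.
near_db = reflection_level_re_direct_db(0.04, 0.08)
far_db = reflection_level_re_direct_db(0.04, 0.20)

print(f"nearby section: {near_db:.1f} dB, distant section: {far_db:.1f} dB")
```

In this idealized model, each doubling of the total reflected path length lowers the reflection by about 6 dB relative to the direct sound, which is consistent with the observation that more distant surface sections are less critical.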
In another example, more than 30%, more than 50% or more than 70% of the surface sections with a direct line of sight towards the pinna or concha may be angled at an angle <90°, <70° or <50° to the median plane such that their vertical does not point towards the pinna or concha only if these surface sections fall into a radius of, e.g., 10 cm or 15 cm around the concha.
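The orientation criterion can be evaluated numerically for a discretized frame geometry. The following is a minimal sketch under several assumptions that are not taken from the disclosure: the concha is placed at the coordinate origin, each surface section is described by its center position and its vertical (surface normal), a vertical is treated as "pointing towards the concha" if it deviates from the direction to the concha by less than a chosen tolerance angle, and line-of-sight occlusion is not modeled. All names and numeric values are hypothetical.

```python
import math
from typing import NamedTuple, Sequence, Tuple

Vec = Tuple[float, float, float]


class SurfaceSection(NamedTuple):
    position: Vec  # center of the section in meters, concha at the origin
    normal: Vec    # vertical (surface normal) of the section


def _length(v: Vec) -> float:
    return math.sqrt(sum(c * c for c in v))


def points_toward_concha(section: SurfaceSection, tolerance_deg: float = 30.0) -> bool:
    """True if the vertical lies within tolerance_deg of the direction to the concha."""
    to_concha = tuple(-c for c in section.position)
    dot = sum(a * b for a, b in zip(section.normal, to_concha))
    cos_angle = dot / (_length(section.normal) * _length(to_concha))
    return cos_angle > math.cos(math.radians(tolerance_deg))


def fraction_facing_away(sections: Sequence[SurfaceSection],
                         radius_m: float = 0.10,
                         tolerance_deg: float = 30.0) -> float:
    """Fraction of sections within radius_m whose vertical does not point at the concha."""
    nearby = [s for s in sections if _length(s.position) <= radius_m]
    if not nearby:
        return 1.0
    away = sum(1 for s in nearby if not points_toward_concha(s, tolerance_deg))
    return away / len(nearby)


sections = [
    SurfaceSection((0.0, 0.04, 0.0), (0.0, -1.0, 0.0)),  # faces the concha
    SurfaceSection((0.0, 0.04, 0.0), (1.0, 0.0, 0.0)),   # faces sideways
    SurfaceSection((0.0, -0.05, 0.0), (1.0, 0.0, 0.2)),  # faces sideways
    SurfaceSection((0.0, 0.20, 0.0), (0.0, -1.0, 0.0)),  # outside the radius
]
fraction = fraction_facing_away(sections)
print(f"{fraction:.0%} of nearby sections face away from the concha")
```

A design could then be compared against thresholds such as the 30%, 50% or 70% mentioned above; the tolerance angle and the 10 cm radius are parameters that would need to be chosen for the actual geometry.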
A further possibility is to arrange at least one sound source that comprises surface sections with a direct line of sight towards the pinna or concha such that these surface sections face away from the pinna or concha. If, for example, the sound source is a loudspeaker with a membrane for sound radiation, the loudspeaker may be oriented such that the loudspeaker membrane and/or the main sound radiating lobe of the loudspeaker are tilted away from the pinna or concha. Loudspeakers may be arranged such that the loudspeaker membrane is arranged at an angle 90° with respect to the median plane. Loudspeakers generally radiate sound essentially uniformly at low frequencies and merely focus sound into a main radiation lobe at high frequencies. This may result in an amplitude response at the pinna, with falling levels towards high frequencies, which may simply be compensated by suitable equalizing filters that boost high frequencies for which loudspeakers usually provide enough headroom in the available sound pressure level.
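The equalization step can be sketched as a per-band gain calculation. The following is a minimal sketch, assuming that the amplitude response at the pinna has been measured in a few frequency bands relative to a reference level and that the available headroom limits the boost; the measured values and the 6 dB limit are hypothetical and serve only as an illustration.

```python
from typing import Dict


def compensation_gains_db(measured_db: Dict[int, float],
                          max_boost_db: float = 6.0) -> Dict[int, float]:
    """Per-band boost that flattens a falling high-frequency response.

    measured_db maps a band center frequency in Hz to the measured level
    relative to the reference, in dB. Bands at or above the reference
    are left unchanged; the boost is capped at max_boost_db.
    """
    gains = {}
    for freq_hz, level_db in measured_db.items():
        boost = max(0.0, -level_db)  # boost only bands below the reference
        gains[freq_hz] = min(boost, max_boost_db)
    return gains


# Hypothetical response of a loudspeaker tilted away from the concha.
measured = {1000: 0.0, 4000: -2.0, 8000: -5.0, 15000: -9.0}
gains = compensation_gains_db(measured)
print(gains)
```

In practice such gains would be realized by an equalizing filter, for example a shelving filter, and the cap reflects the point made above that the boost is only possible where the loudspeaker provides sufficient headroom.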
An additional or alternative possibility for reducing reflections is the use of sound damping or sound absorbing materials. For example, highly sound absorbing foam materials exist that may be applied to any surface on the sound structure or support structure, most effectively on any surfaces facing the pinna. For example, sound absorbing materials based on glass mineral wool or cotton may be used. The so-called sound absorption coefficient, which describes the fraction of sound energy absorbed by a material, is known as a performance metric for sound absorbing materials. The sound absorption coefficient generally ranges between 0 (no absorption) and 1 (full absorption), although some measurement methods for determining the sound absorption coefficient may result in values >1. Usually the sound absorption coefficient is frequency-dependent and often tends to increase from low to high frequencies. For the application of sound absorbing materials within the proposed headset arrangements the sound absorption coefficient may be greater than 0.5 for frequencies between 2 kHz and 15 kHz or greater than 0.3 for frequencies between 4 kHz and 10 kHz. However, it should be noted that the absorption coefficient generally depends on the thickness of the sound absorbing material, the incident and reflection angles as well as the measurement method that is used to determine the absorption coefficient. For some materials the maximum sound absorption is reached at an intermediate frequency, while sound absorption decreases for lower and higher frequencies. Therefore, the sound absorption may vary over the surface of the headphone arrangement that is covered with sound absorbing material as well as with the frequency content of the sound.
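The effect of a given sound absorption coefficient on a reflection can be estimated as follows. This is a minimal sketch, assuming that the fraction of sound energy that is not absorbed is reflected (no transmission through the material) and that the coefficient applies to the actual angle of incidence; the function name is illustrative.

```python
import math


def reflection_attenuation_db(absorption_coefficient: float) -> float:
    """Level change of a reflection caused by absorption, in dB.

    Assumption: reflected energy fraction = 1 - absorption coefficient.
    """
    if not 0.0 <= absorption_coefficient < 1.0:
        raise ValueError("absorption coefficient must be in [0, 1)")
    return 10.0 * math.log10(1.0 - absorption_coefficient)


for alpha in (0.3, 0.5, 0.9):
    print(f"alpha = {alpha}: reflection changed by {reflection_attenuation_db(alpha):.1f} dB")
```

Under this assumption, a coefficient of 0.5 lowers a reflection by about 3 dB and a coefficient of 0.3 by about 1.5 dB, so higher coefficients in the 4 kHz to 15 kHz range are needed when reflections are to be suppressed substantially.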
A single loudspeaker or sound source generally resembles a point source, as is schematically illustrated in
Remaining reflections may still adversely bias the perceived source localization, especially the elevation angle of the sound image. An additional or alternative possibility is to shift the sound source position along the opposing boundaries of the sound structure to compensate for the elevation bias. Users generally tend to locate frontal sound sources above the head or in front of the forehead when headphone playback with HRTF-based filtering is implemented. A comparable effect can be observed with normal stereo loudspeaker playback where the phantom image between the loudspeakers is often perceived above the physical loudspeaker position. One possibility to compensate for such phantom image or virtual source elevation effects for playback over the proposed headphone arrangements is to position the sound sources that are intended for generating frontal directional pinna cues associated with an elevation angle of 0°, below the horizontal plane through the ear canal to compensate for the tendency of increased elevation angle perception.
For example, one or more sound sources may be arranged below the horizontal plane on a frontal part of the sound structure such that they provide sound to the ear of the user from a lower frontal direction. If only one sound source is arranged below the horizontal plane on a frontal part of the sound structure, its relevant direction of sound propagation towards the concha may be angled with respect to the horizontal plane. In one example, its relevant direction of sound propagation towards the concha may be angled at an angle of about 10° to about 40° with respect to the horizontal plane. If two or more sound sources are arranged on the frontal part of the frame below the horizontal plane, the relevant direction of sound propagation towards the concha of each individual sound source may be angled with respect to the horizontal plane, and an average angle of the respective relevant directions of sound propagation may be between about 10° and about 40°.
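The angle conditions above can be checked with a few lines of code. The following is a minimal sketch, assuming that the average angle is the arithmetic mean of the individual angles below the horizontal plane and that the perceived elevation bias is simply subtracted from the desired elevation; both are simplifying assumptions, and the numeric values are hypothetical.

```python
from typing import Sequence


def average_angle_deg(angles_deg: Sequence[float]) -> float:
    """Arithmetic mean of the propagation angles below the horizontal plane."""
    return sum(angles_deg) / len(angles_deg)


def in_suggested_range(angles_deg: Sequence[float],
                       low_deg: float = 10.0,
                       high_deg: float = 40.0) -> bool:
    """True if the average angle lies between about 10 and about 40 degrees."""
    return low_deg <= average_angle_deg(angles_deg) <= high_deg


def compensated_source_elevation_deg(desired_deg: float, perceived_bias_deg: float) -> float:
    """Physical source elevation that offsets a perceived upward bias.

    Positive values are above the horizontal plane, negative values below.
    """
    return desired_deg - perceived_bias_deg


print(in_suggested_range([15.0, 35.0]))             # average of 25 degrees
print(in_suggested_range([5.0, 10.0]))              # average of 7.5 degrees
print(compensated_source_elevation_deg(0.0, 20.0))  # source placed below the plane
```

For a desired image elevation of 0° and a perceived upward bias of 20°, the sketch places the sound source 20° below the horizontal plane, which falls within the range of about 10° to about 40° given above.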
It should be noted that the proposed headset arrangements may include multiple sound sources that may be individually controlled by individual electrical sound signals. Furthermore, the voice coil impedance and/or efficiency of loudspeakers of the sound sources may not be compatible with standard headphone amplifiers, such as the headphone amplifiers provided in many smartphones today. Therefore, the headset arrangement may include at least one electronic driving unit that is configured to receive an input signal, to condition the input signal, and to apply the conditioned input signal as a driving signal to one or more loudspeakers. Furthermore, processing of the electrical sound signals may be required in some applications in order to achieve certain sound quality or sound spatiality characteristics. Therefore, the headset arrangement may include at least one signal processing unit that is configured to receive at least one input signal, to process the at least one input signal and to emit at least one processed input signal to at least one electronic driving unit.
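The signal chain described above can be sketched as two stages. The following is a minimal sketch, assuming that the signal processing unit only applies a per-source gain and that the electronic driving unit only applies a fixed amplifier gain; real units would typically include filtering and impedance matching. The function names, the source labels and the gain values are hypothetical.

```python
from typing import Dict, List

Signal = List[float]


def signal_processing_unit(input_signal: Signal,
                           source_gains_db: Dict[str, float]) -> Dict[str, Signal]:
    """Derive one processed signal per sound source from a single input signal."""
    return {
        name: [sample * 10.0 ** (gain_db / 20.0) for sample in input_signal]
        for name, gain_db in source_gains_db.items()
    }


def electronic_driving_unit(processed_signal: Signal, amplifier_gain: float) -> Signal:
    """Condition a processed signal into a driving signal for a loudspeaker."""
    return [amplifier_gain * sample for sample in processed_signal]


# Three sound sources fed from one input signal; the frontal sources 21 and 23
# are attenuated by 6 dB relative to the rear source 22.
processed = signal_processing_unit([1.0, -0.5], {"21": -6.0, "22": 0.0, "23": -6.0})
driving = {name: electronic_driving_unit(sig, 2.0) for name, sig in processed.items()}
print(driving)
```

Separating the two stages in this way mirrors the description: spatial and level processing is done once per sound source, while the driving stage is concerned only with delivering a suitable signal to the respective loudspeaker.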
According to one example, a headset arrangement for virtual reality or augmented reality applications is configured to generate natural directional pinna cues. The arrangement comprises a support structure configured to be arranged on a user's head and to hold a display in front of the user's eyes. The support structure comprises at least one ear cup comprising a frame that is configured to be arranged to at least partially encircle the ear of the user, thereby defining an open volume about the ear of the user, at least a first sound source and a second sound source arranged within the frame of the ear cup, wherein the first and the second sound source are arranged such that their main directions of sound propagation are directed in essentially opposing directions.
According to a further example, the first sound source and the second sound source emit the same content for frequencies between about 4 and about 15 kHz.
According to a further example, an angle Ψ between the main direction of sound propagation of the first sound source and the main direction of propagation of the second sound source is between about 0° and about 10°, between about 0° and about 30°, between about 0° and about 50°, or between about 0° and about 90°.
According to a further example, the arrangement further comprises a third sound source arranged within the frame of the ear cup, wherein the first, second and third sound sources are arranged at the corners of an isosceles triangle, and wherein a symmetry axis of the isosceles triangle runs across the pinna or the concha of the user.
According to a further example, the at least one ear cup is integrated into the support structure.
According to a further example, the ear cup comprises at least one extension that is connected to the support structure, wherein the at least one extension and at least a section of the support structure form the frame of the ear cup.
According to a further example, the first sound source and the second sound source comprise at least one of a loudspeaker, a sound canal, a sound tube, a wave guide and a reflector.
According to a further example, at least one of the first sound source and the second sound source comprises a loudspeaker that is arranged at a first end of a sound canal, and wherein a sound outlet at a second end of the sound canal is configured to emit sound into the open volume about the ear of the user.
According to a further example, the ear cup comprises surfaces that are oriented essentially towards the pinna and surfaces that are oriented essentially away from the pinna, wherein at least parts of the surfaces oriented essentially towards the pinna comprise a sound absorbing material, the sound absorbing material being configured to reduce the intensity of sound that is emitted by the sound sources and reflected towards the pinna of the user.
According to a further example, the frame comprises a plurality of sections, and wherein at least one section is arranged behind the pinna such that it is shaded from direct sound emitted by a sound source arranged on the frontal part of the ear cup.
According to a further example, the inner walls of the frame comprise a plurality of sections, wherein the inner walls of the frame are walls that are essentially facing the open volume within the frame, and at least sections that are arranged opposite to a sound source are at least partially beveled at an angle >20° and <90° with respect to a median plane to direct reflections away from the user's head, wherein the median plane crosses the user's head midway between the user's ears, thereby dividing the head into a left side and a right side.
According to a further example, at least two sound sources are arranged adjacent to each other to form an extended sound source that is configured to emit an approximately plane sound wave.
The description of embodiments has been presented for purposes of illustration and description. Suitable modifications and variations to the embodiments may be performed in light of the above description or may be acquired from practicing the methods. For example, unless otherwise noted, one or more of the described methods may be performed by a suitable device and/or combination of devices, such as signal processing components with one or more of the sound sources discussed above. The methods may be performed by executing stored instructions with one or more logic devices (e.g., processors) in combination with one or more additional hardware elements, such as storage devices, memory, hardware network interfaces/antennas, switches, actuators, clock circuits, etc. The described methods and associated actions may also be performed in various orders in addition to the order described in this application, in parallel, and/or simultaneously. The described systems are exemplary in nature, and may include additional elements and/or omit elements. The subject matter of the present disclosure includes all novel and non-obvious combinations and sub-combinations of the various systems and configurations, and other features, functions, and/or properties disclosed.
As used in this application, an element or step recited in the singular and preceded by the word “a” or “an” should be understood as not excluding plural of said elements or steps, unless such exclusion is stated. Furthermore, references to “one embodiment” or “one example” of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. The terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to impose numerical requirements or a particular positional order on their objects. The following claims particularly point out subject matter from the above disclosure that is regarded as novel and non-obvious.
While various embodiments have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the disclosure. Accordingly, the disclosure is not to be restricted except in light of the attached claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
17150264 | Jan 2017 | EP | regional |
Number | Date | Country |
---|---|---|
20180192227 A1 | Jul 2018 | US |