1. Field of the Invention
The present invention relates to virtual environments and, more particularly, to a method and apparatus for implementing realistic audio communications in a three dimensional computer-generated virtual environment.
2. Description of the Related Art
Virtual environments simulate actual or fantasy 3-D environments and allow for many participants to interact with each other and with constructs in the environment via remotely-located clients. One context in which a virtual environment may be used is in connection with gaming, although other uses for virtual environments are also being developed.
In a virtual environment, an actual or fantasy universe is simulated within a computer processor/memory. Multiple people may participate in the virtual environment through a computer network, such as a local area network or a wide area network such as the Internet. Each player selects an “Avatar” which is often a three-dimensional representation of a person or other object to represent them in the virtual environment. Participants send commands to a virtual environment server that controls the virtual environment to cause their Avatars to move within the virtual environment. In this way, the participants are able to cause their Avatars to interact with other Avatars and other objects in the virtual environment.
A virtual environment often takes the form of a virtual-reality three dimensional map, and may include rooms, outdoor areas, and other representations of environments commonly experienced in the physical world. The virtual environment may also include multiple objects, people, animals, robots, Avatars, robot Avatars, spatial elements, and objects/environments that allow Avatars to participate in activities. Participants establish a presence in the virtual environment via a virtual environment client on their computer, through which they can create an Avatar and then cause the Avatar to “live” within the virtual environment.
As the Avatar moves within the virtual environment, the view experienced by the Avatar changes according to where the Avatar is located within the virtual environment. The views may be displayed to the participant so that the participant controlling the Avatar may see what the Avatar is seeing. Additionally, many virtual environments enable the participant to toggle to a different point of view, such as from a vantage point outside of the Avatar, to see where the Avatar is in the virtual environment.
The participant may control the Avatar using conventional input devices, such as a computer mouse and keyboard. The inputs are sent to the virtual environment client, which forwards the commands to one or more virtual environment servers that are controlling the virtual environment and providing a representation of the virtual environment to the participant via a display associated with the participant's computer.
Depending on how the virtual environment is set up, an Avatar may be able to observe the environment and optionally also interact with other Avatars, modeled objects within the virtual environment, robotic objects within the virtual environment, or the environment itself (i.e. an Avatar may be allowed to go for a swim in a lake or river in the virtual environment). In these cases, client control input may be permitted to cause changes in the modeled objects, such as moving other objects, opening doors, and so forth, which optionally may then be experienced by other Avatars within the virtual environment.
“Interaction” by an Avatar with another modeled object in a virtual environment means that the virtual environment server simulates an interaction in the modeled environment, in response to receiving client control input for the Avatar. Interactions by one Avatar with any other Avatar, object, the environment or automated or robotic Avatars may, in some cases, result in outcomes that may affect or otherwise be observed or experienced by other Avatars, objects, the environment, and automated or robotic Avatars within the virtual environment.
A virtual environment may be created for the user, but more commonly the virtual environment may be persistent, in which case it continues to exist and be supported by the virtual environment server even when the user is not interacting with the virtual environment. Thus, where there is more than one user of a virtual environment, the environment may continue to evolve when a user is not logged in, such that the next time the user enters the virtual environment it may have changed from what it looked like the previous time.
Virtual environments are commonly used in on-line gaming, such as for example in online role playing games where users assume the role of a character and take control over most of that character's actions. In addition to games, virtual environments are also being used to simulate real life environments to provide an interface for users that will enable on-line education, training, shopping, and other types of interactions between groups of users and between businesses and users.
As Avatars encounter other Avatars within the virtual environment, the participants represented by the Avatars may elect to communicate with each other. For example, the participants may communicate with each other by typing messages to each other or audio may be transmitted between the users to enable the participants to talk with each other.
Although great advances have been made in connection with visual rendering of Avatars and animation, the audio implementation has lagged behind, and often the audio characteristics of a virtual environment are not very realistic. Accordingly, it would be advantageous to be able to provide a method and apparatus for implementing more realistic audio communications in a three dimensional computer-generated virtual environment.
A method and apparatus for implementing realistic audio communications in a three dimensional computer-generated virtual environment is provided. In one embodiment, a participant in a three dimensional computer-generated virtual environment is able to control a dispersion pattern of his Avatar's voice such that the Avatar's voice may be directionally enhanced using simple controls. In one embodiment, an audio dispersion envelope is designed to extend further in a direction in front of the Avatar and a shorter distance to the sides and rear of the Avatar. The shape of the audio dispersion envelope may be affected by other aspects of the virtual environment such as ceilings, floors, walls and other logical barriers. The audio dispersion envelope may be static or optionally controllable by the participant to enable the Avatar's voice to be extended outward in front of the Avatar. This enables the Avatar to “shout” in the virtual environment such that other Avatars normally outside of the Avatar's hearing range can still hear the user. Similarly, the volume level of the audio may be reduced to allow the Avatars to whisper, or adjusted based on the relative position of the Avatars and the directions in which the Avatars are facing. Individual audio streams may be mixed for each user of the virtual environment depending on the position and orientation of the user's Avatar in the virtual environment, the shape of the user's dispersion envelope, and which other Avatars are located within the user's dispersion envelope.
Aspects of the present invention are pointed out with particularity in the appended claims. The present invention is illustrated by way of example in the following drawings in which like references indicate similar elements. The following drawings disclose various embodiments of the present invention for purposes of illustration only and are not intended to limit the scope of the invention. For purposes of clarity, not every component may be labeled in every figure. In the figures:
The following detailed description sets forth numerous specific details to provide a thorough understanding of the invention. However, those skilled in the art will appreciate that the invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, protocols, algorithms, and circuits have not been described in detail so as not to obscure the invention.
The virtual environment may be implemented using one or more instances, each of which may be hosted by one or more virtual environment servers. Where there are multiple instances, the Avatars in one instance are generally unaware of Avatars in the other instance; however, a user may have a presence in multiple worlds simultaneously through several virtual environment clients. Conventionally, each instance of the virtual environment may be referred to as a separate World. In the following description, it will be assumed that the Avatars are instantiated in the same world and hence can see and communicate with each other. A world may be implemented by one virtual environment server 18, or may be implemented by multiple virtual environment servers. The virtual environment is designed as a visual representation of a real-world environment that enables humans to interact with each other and communicate with each other in near-real time.
Generally, a virtual environment will have its own distinct three dimensional coordinate space. Avatars representing users may move within the three dimensional coordinate space and interact with objects and other Avatars within the three dimensional coordinate space. The virtual environment servers maintain the virtual environment and generate a visual presentation for each user based on the location of the user's Avatar within the virtual environment. The view may also depend on the direction in which the Avatar is facing and the selected viewing option, such as whether the user has opted to have the view appear as if the user was looking through the eyes of the Avatar, or whether the user has opted to pan back from the Avatar to see a three dimensional view of where the Avatar is located and what the Avatar is doing in the three dimensional computer-generated virtual environment.
Each user 12 has a computer 22 that may be used to access the three-dimensional computer-generated virtual environment. The computer 22 will run a virtual environment client 24 and a user interface 26 to the virtual environment. The user interface 26 may be part of the virtual environment client 24 or implemented as a separate process. A separate virtual environment client may be required for each virtual environment that the user would like to access, although a particular virtual environment client may be designed to interface with multiple virtual environment servers. A communication client 28 is provided to enable the user to communicate with other users who are also participating in the three dimensional computer-generated virtual environment. The communication client may be part of the virtual environment client 24, the user interface 26, or may be a separate process running on the computer 22.
The user may see a representation of a portion of the three dimensional computer-generated virtual environment on a display/audio 30 and input commands via a user input device 32 such as a mouse, touch pad, or keyboard. The display/audio 30 may be used by the user to transmit/receive audio information while engaged in the virtual environment. For example, the display/audio 30 may be a display screen having a speaker and a microphone. The user interface generates the output shown on the display under the control of the virtual environment client, and receives the input from the user and passes the user input to the virtual environment client. The virtual environment client enables the user's Avatar 34 or other object under the control of the user to execute the desired action in the virtual environment. In this way the user may control a portion of the virtual environment, such as the person's Avatar or other objects in contact with the Avatar, to change the virtual environment for the other users of the virtual environment.
Typically, an Avatar is a three dimensional rendering of a person or other creature that represents the user in the virtual environment. The user selects the way that their Avatar looks when creating a profile for the virtual environment and then can control the movement of the Avatar in the virtual environment such as by causing the Avatar to walk, run, wave, talk, or make other similar movements. Thus, the block 34 representing the Avatar in the virtual environment 14 is not intended to show how an Avatar would be expected to appear in a virtual environment. Rather, the actual appearance of the Avatar is immaterial since the actual appearance of each user's Avatar may be expected to be somewhat different and customized according to the preferences of that user. Since the actual appearance of the Avatars in the three dimensional computer-generated virtual environment is not important to the concepts discussed herein, Avatars have generally been represented herein using simple geometric shapes such as cubes and diamonds, rather than complex three dimensional shapes such as people and animals.
For example,
As shown in
In the example shown in
If users are closer together, the users will be able to hear each other more clearly, and as the users get farther apart the volume of the audio tapers off until, at the edge of the dispersion envelope, the contribution of a user's audio is reduced to zero. In one embodiment, the volume of a user's contribution is determined on a linear basis. Thus, looking at the example shown in
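The linear taper described above can be sketched as a simple gain function. This is an illustrative sketch, not language from the specification: the function name and the `envelope_range` parameter (the maximum hearing distance in the relevant direction) are assumptions made for this example.

```python
def linear_gain(distance, envelope_range):
    """Compute a speaker's volume contribution on a linear basis:
    full volume at zero distance, tapering to zero at the edge of
    the dispersion envelope. `envelope_range` is a hypothetical
    parameter giving the maximum hearing distance."""
    if distance >= envelope_range:
        return 0.0  # outside the envelope: no contribution
    return 1.0 - (distance / envelope_range)
```

A speaker halfway to the edge of the envelope would thus be mixed at half volume.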
Audio is mixed individually for each user of the virtual environment. The particular mix of audio will depend on which other users are within the dispersion envelope for the user, and their location within the dispersion envelope. The location within the dispersion envelope affects the volume with which that user's audio will be presented to the user associated with the dispersion envelope. Since the audio is mixed individually for each user, an audio bridge is not required per user, but rather an audio stream may be created individually for each user based on which other users are proximate that user in the virtual environment.
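The per-user mixing process might be sketched as follows. All names here are illustrative assumptions: `speakers` maps a speaker identifier to a (distance, samples) pair, and `gain_fn` is whatever distance-to-volume rule the envelope imposes.

```python
def mix_for_listener(speakers, gain_fn):
    """Build one audio stream for a single listener by summing the
    streams of every speaker inside the listener's dispersion
    envelope, each weighted by its position-dependent gain."""
    mix = None
    for speaker_id, (distance, samples) in speakers.items():
        gain = gain_fn(distance)
        if gain <= 0.0:
            continue  # speaker is outside the dispersion envelope
        scaled = [gain * s for s in samples]
        mix = scaled if mix is None else [a + b for a, b in zip(mix, scaled)]
    return mix if mix is not None else []
```

Because each listener gets a separately weighted sum, no shared audio bridge is needed; the server produces one such mix per user.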
In the example shown in
In the embodiment shown in
The shape of the audio dispersion envelope may depend on the preferences of the user as well as the preferences of the virtual environment provider. For example, the virtual environment provider may provide the user with an option to select an audio dispersion shape to be associated with their Avatar when they enter the virtual environment. The audio dispersion shape may be persistent until adjusted by the user. For example, the user may select a voice for their Avatar such that some users will have robust loud voices while others will have more meek and quiet voices. Alternatively, the virtual environment provider may provide particular audio dispersion profiles for different types of Avatars, for example police Avatars may be able to be heard at a greater distance than other types of Avatars. Additionally, the shape of the audio dispersion envelope may depend on other environmental factors, such as the location of the Avatar within the virtual environment and the presence or absence of ambient noise in the virtual environment.
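Per-Avatar dispersion profiles of the kind described above could be represented as a simple lookup table. The specific range values and profile names below are assumptions for illustration only; the text specifies only that some profiles (e.g. a police Avatar's) carry farther than others.

```python
# Hypothetical dispersion profiles; distances are in virtual-environment
# units and are illustrative values, not values from the specification.
DISPERSION_PROFILES = {
    "default": {"front": 10.0, "side": 5.0, "rear": 3.0},
    "police":  {"front": 25.0, "side": 12.0, "rear": 8.0},  # heard farther
    "meek":    {"front": 6.0,  "side": 3.0,  "rear": 2.0},  # quiet voice
}

def hearing_range(profile_name, sector):
    """Look up how far an Avatar's voice carries in a given sector
    (in front of, to the side of, or behind the Avatar)."""
    return DISPERSION_PROFILES[profile_name][sector]
```

A persistent profile would simply be stored with the user's account and applied each time the Avatar enters the virtual environment.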
According to an embodiment of the invention, user A may “Shout” toward Avatar B to cause the audio dispersion profile to extend further in the direction of B. The user's intention to shout at B may be indicated by the user through the manipulation of simple controls. For example, on a wheeled mouse, the mouse wheel may be a shout control that the user uses to extend the audio dispersion profile of their Avatar. In this embodiment, the user may simply scroll the mouse wheel away by contacting the mouse wheel and pushing forward with their finger. This is commonly used to scroll up using the mouse wheel in most common computer user interfaces. Conversely, if the user no longer wants to shout, the user may pull back on the top of the mouse wheel in a motion similar to scrolling down with the mouse wheel on most common user interfaces.
There are times where the user may want to communicate with every user within a given volume of the virtual environment. For example a person may wish to make a presentation to a room full of Avatars. To do this, the user may cause their audio dispersion envelope to increase in all directions to expand to fill the entire volume. This is shown in
A user may use explicit controls to invoke OmniVoice or, preferably, OmniVoice may be invoked intrinsically based on the location of the Avatar within the virtual environment. For example, the user may walk up to a podium on a stage, and the user's presence on the stage may cause audio provided by the user to be included in the mixed audio stream of every other Avatar within a particular volume of the virtual environment.
The mouse wheel may have multiple uses in the virtual environment, depending on the particular virtual environment and the other activities available to the Avatar. If this is the case, then a combination of inputs may be used to control the audio dispersion profile. For example, left clicking on the mouse combined with scrolling of the mouse wheel, or depressing a key on the keyboard along with scrolling of the mouse wheel may be used to signal that the mouse scrolling action is associated with the Avatar's voice rather than another action. Alternatively, the mouse wheel may not be used and a key stroke or combination of keystrokes on the keyboard may be used to extend the audio dispersion envelope and cause the audio dispersion envelope to return to normal. Thus, in addition to implementing directional audio dispersion envelopes based on the orientation of the Avatar within the three dimensional virtual environment, the user may also warp the audio dispersion envelope in real time to control the Avatar's audio dispersion envelope in the virtual environment.
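The disambiguation of the mouse wheel could be handled by a small input-mapping function. The particular convention below (left click plus scroll controls the voice envelope) is one of the combinations the text suggests; the function and return-value names are assumptions of this sketch.

```python
def interpret_scroll(wheel_delta, left_button_down):
    """Map a mouse-wheel event to an envelope adjustment. Scrolling
    alone keeps its ordinary meaning; combined with a held left
    button it warps the Avatar's audio dispersion envelope."""
    if not left_button_down:
        return "default_scroll"   # wheel keeps its normal function
    if wheel_delta > 0:
        return "extend_envelope"  # scroll forward: shout
    if wheel_delta < 0:
        return "shrink_envelope"  # scroll back: return toward normal
    return "no_change"
```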
When the user elects to shout toward another user in the virtual environment, the facial expression of the Avatar may change or another visual indication may be provided as to who is shouting. This enables the user to know that they are shouting as well as enabling other users of the virtual environment to understand why the physics have changed. For example, the Avatar may cup their hands around their mouth to provide a visual clue that they are shouting in a particular direction. A larger extension of the Avatar's voice may be indicated visually, for example by providing a ghost of the Avatar that moves closer to the new center of the voice range, so that other users can determine who is yelling. Other visual indications may be provided as well.
The user-controlled audio dispersion envelope warping may toggle, as shown in
In another embodiment, the selection of a person to shout to may be implemented when the user mouses over another Avatar and depresses a button, such as left clicking or right clicking on the other Avatar. If the person is within normal talking distance of the other Avatar, audio from that other Avatar will be mixed into the audio stream presented to the user. Audio from other Avatars within listening distance will similarly be included in the mixed audio stream presented to the user. If the other Avatar is not within listening distance, i.e. is not within the normal audio dispersion envelope, the user may be provided with an option to shout to the other Avatar. In this embodiment the user would be provided with an instruction to double click on the other Avatar to shout to them. Many different ways of implementing the ability to shout may be possible depending on the preferences of the user interface designer.
In one embodiment, the proximity of the Avatars may adjust the volume of their audio when their audio is mixed into the audio stream for the user, so that the user is presented with an audio stream that more closely resembles normal realistic audio. Other environmental factors may similarly affect the communication between Avatars in the virtual environment.
Although the previous description has focused on enabling the Avatar to increase the size of the dispersion envelope to shout in the virtual environment, the same controls may optionally also be used to reduce the size of the dispersion envelope so that the user can whisper in the virtual environment. In this embodiment, the user may control their voice in the opposite direction to reduce the size of the dispersion envelope so that users must be closer to the user's Avatar to communicate with the user. The mouse controls or other controls may be used in this manner to reduce the size of the audio dispersion envelope as desired.
According to an embodiment of the invention, the audio dispersion profile may be adjusted to account for obstacles in the virtual environment. Obstacles may be thought of as creating shadows on the profile to reduce the distance of the profile in a particular direction. This prevents communication between Avatars where they would otherwise be able to communicate if not for the imposition of the obstacle. In
The shadow objects may be wholly opaque to transmission of sound or may simply be attenuating objects. For example, a concrete wall may attenuate sound 100%, a normal wall may attenuate 90% of sound while allowing some sound to pass through, and a curtain may attenuate sound only modestly such as 10%. The level of attenuation may be specified when the object is placed in the virtual environment.
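Using the attenuation figures given in the text, the effect of shadow objects on a speaker's gain could be sketched as follows. The dictionary keys and function name are illustrative; the percentages (100%, 90%, 10%) are the examples from the text.

```python
# Fraction of sound blocked by each object type, specified when the
# object is placed in the virtual environment (values from the text).
ATTENUATION = {"concrete_wall": 1.0, "normal_wall": 0.9, "curtain": 0.1}

def shadowed_gain(base_gain, obstacles):
    """Reduce a speaker's gain by each obstacle the sound path
    crosses. A wholly opaque object (attenuation 1.0) silences the
    speaker; partially attenuating objects let some sound through."""
    gain = base_gain
    for obstacle in obstacles:
        gain *= 1.0 - ATTENUATION[obstacle]
    return gain
```

A voice passing through a curtain would thus arrive at 90% of its original level, while a concrete wall would block it entirely.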
Audio may be implemented using a communication server that is configured to mix audio individually for each user of the virtual environment. The communication server will receive audio from all the users of the virtual environment and create an audio stream for a particular user by determining which of the other users have an Avatar within the user's dispersion envelope. To enable participants to be selected, a notion of directionality needs to be included in the selection process such that the selection process does not simply look at the relative distance of the participants, but also looks to see what direction the participants are facing within the virtual environment. This may be done by associating a vector with each participant and determining whether the vector extends sufficiently close to the other Avatar to warrant inclusion of the audio from that user in the audio stream. Additionally, if shadows are to be included in the determination, the process may look to determine whether the vector traverses any shadow objects in the virtual environment. If so, the extent of the shadow may be calculated to determine whether the audio should be included. Other ways of implementing the audio connection determination process may be used as well and the invention is not limited to this particular example implementation.
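One way the directional selection process could work is with a dot product between the speaker's facing vector and the vector toward the listener. This is a sketch under assumptions: the two-range model (a longer `front_range`, a shorter `rear_range`) and all parameter names are illustrative, since the specification does not fix a particular formula.

```python
import math

def within_directional_envelope(speaker_pos, speaker_facing,
                                listener_pos, front_range, rear_range):
    """Directional inclusion test: the voice carries `front_range`
    units in the direction the speaker faces, but only `rear_range`
    units behind. The sign of the dot product of the facing vector
    with the vector to the listener decides which range applies."""
    dx = listener_pos[0] - speaker_pos[0]
    dy = listener_pos[1] - speaker_pos[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return True  # co-located Avatars always hear each other
    # Positive dot product: the listener is in front of the speaker.
    facing_listener = (speaker_facing[0] * dx + speaker_facing[1] * dy) > 0
    return distance <= (front_range if facing_listener else rear_range)
```

A shout would temporarily increase `front_range` in the direction of the shout; a finer-grained model might also interpolate between the two ranges by angle.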
In another embodiment, whether audio is to be transmitted between two Avatars may be determined by integrating attenuation along a vector between the Avatars. In this embodiment, the normal “air” or empty space in the virtual environment may be provided with an attenuation factor such as 5% per unit distance. Other objects within the environment may be provided with other attenuation factors depending on their intended material. Transmission of audio between Avatars, in this embodiment, may depend on the distance between the Avatars, and hence the amount of air the sound must pass through, and the objects the vector passes through. The strength of the vector, and hence the attenuation able to be accommodated while still enabling communication, may depend on the direction the Avatar is facing. Additionally, the user may temporarily increase the strength of the vector by causing the Avatar to shout.
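The attenuation-integration embodiment could be sketched as an accumulated loss along the straight path between the two Avatars. The 5%-per-unit air figure is the example from the text; the additive loss model, the `segments` representation of objects the path crosses, and the `budget` parameter (the total attenuation a voice can absorb, raised by shouting) are assumptions of this sketch.

```python
def audio_reaches(distance, segments, air_loss=0.05, budget=1.0):
    """Integrate attenuation along the vector between two Avatars.
    Empty space loses `air_loss` of the signal per unit distance;
    `segments` lists (length, loss_per_unit) pairs for objects the
    path passes through. Audio is transmitted only if the total
    attenuation stays within the speaker's budget."""
    total = distance * air_loss  # loss through "air" alone
    for length, loss_per_unit in segments:
        total += length * loss_per_unit  # loss through each object
    return total < budget
```

Shouting would correspond to calling this with a larger `budget`, and a direction-dependent budget would model the Avatar's facing.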
In the preceding description, it was assumed that the user could elect to shout within the virtual environment. Optionally, that privilege may be reserved for Avatars possessing particular items within the virtual environment. For example, the Avatar may need to find or purchase a particular item such as a virtual bull horn to enable the Avatar to shout. Other embodiments are possible as well.
The virtual environment servers will also define an audio dispersion envelope for Avatar A which specifies how the Avatar will be able to communicate within the virtual environment (104). Each Avatar may have a pre-defined audio dispersion envelope which is a characteristic of all Avatars within the virtual environment, or the virtual environment servers may define custom audio dispersion envelopes for each user. Thus, the step of defining audio dispersion envelopes may be satisfied by specifying that the Avatar is able to communicate with other Avatars that are located a greater distance in front of the Avatar than other Avatars located in other directions relative to the Avatar.
Automatically, or upon initiation of the user controlling Avatar A, the virtual environment server will determine whether Avatar B is within the audio dispersion envelope for Avatar A (106). This may be implemented, for example, by looking to see whether Avatar A is facing Avatar B, and then determining how far Avatar B is from Avatar A in the virtual environment. If Avatar B is within the audio dispersion envelope of Avatar A, the virtual environment server will enable audio from Avatar B to be included in the audio stream transmitted to the user associated with Avatar A.
If Avatar B is not within the audio dispersion envelope of Avatar A, the user may be provided with an opportunity to control the shape of the audio dispersion envelope such as by enabling the user to shout toward Avatar B. In particular, user A may manipulate their user interface to cause Avatar A to shout toward Avatar B (110). If user A properly signals via their user interface that he would like to shout toward Avatar B, the virtual environment server will enlarge the audio dispersion envelope for the Avatar in the virtual environment in the direction of the shout (112).
The virtual environment server will then similarly determine whether Avatar B is within the enlarged audio dispersion envelope for Avatar A (114). If so, the virtual environment server will enable audio to be transmitted between the users associated with the Avatars (116). If not, the two Avatars will need to move closer toward each other in the virtual environment to enable audio to be transmitted between the users associated with Avatars A and B (118).
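The decision flow in steps 106 through 118 can be condensed into a short sketch. The boolean inputs stand in for the virtual-environment-server determinations described above and are assumptions of this example, not part of the specification.

```python
def connect_audio(in_envelope, shout_requested, in_enlarged_envelope):
    """Sketch of steps 106-118: enable audio if B is already inside
    A's envelope; otherwise, if the user signals a shout (110), the
    envelope is enlarged (112) and retested (114); failing that, the
    Avatars must move closer together (118)."""
    if in_envelope:                  # step 106: B inside A's envelope
        return "audio_enabled"
    if shout_requested:              # step 110: user shouts toward B
        if in_enlarged_envelope:     # step 114: retest after step 112
            return "audio_enabled"   # step 116
    return "move_closer"             # step 118
```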
Users 12A, 12B are represented by avatars 34A, 34B within the virtual environment 14. When the users are proximate each other and facing each other, an audio position and direction detection subsystem 64 will determine that audio should be transmitted between the users associated with the Avatars. Audio will be mixed by mixing function 78 to provide individually determined audio streams to each of the Avatars.
In the embodiment shown in
The functions described above may be implemented as one or more sets of program instructions that are stored in a computer readable memory and executed on one or more processors within one or more computers. However, it will be apparent to a skilled artisan that all logic described herein can be embodied using discrete components, integrated circuitry such as an Application Specific Integrated Circuit (ASIC), programmable logic used in conjunction with a programmable logic device such as a Field Programmable Gate Array (FPGA) or microprocessor, a state machine, or any other device including any combination thereof. Programmable logic can be fixed temporarily or permanently in a tangible medium such as a read-only memory chip, a computer memory, a disk, or other storage medium. All such embodiments are intended to fall within the scope of the present invention.
It should be understood that various changes and modifications of the embodiments shown in the drawings and described in the specification may be made within the spirit and scope of the present invention. Accordingly, it is intended that all matter contained in the above description and shown in the accompanying drawings be interpreted in an illustrative and not in a limiting sense. The invention is limited only as defined in the following claims and the equivalents thereto.
This application claims priority to U.S. Provisional Patent Application No. 61/037,447, filed Mar. 18, 2008, entitled “Method and Apparatus For Providing 3 Dimensional Audio on a Conference Bridge”, the content of which is hereby incorporated herein by reference.