The present technology particularly relates to an information processing apparatus, an information processing method, and a program capable of appropriately reproducing a sense of distance from a user to a virtual sound source and an apparent size of the virtual sound source in spatial sound representation.
As a method of making a user recognize a space using sound, a method of representing the direction, distance, movement, and the like of a virtual sound source by computation using a head-related transfer function (HRTF) is known.
Representation of the direction and distance of a virtual sound source is important to make the user recognize the space using sound. Although the direction of the virtual sound source can be represented by computation using HRTF, it is difficult to sufficiently represent the sense of distance from the user to the virtual sound source by conventional methods.
The present technology has been made in view of such circumstances, and is intended to appropriately reproduce the sense of distance from the user to the virtual sound source and the apparent size of the virtual sound source.
An information processing apparatus according to one aspect of the present technology includes a sound source setting unit that sets a first sound source, and a plurality of second sound sources at positions corresponding to a size of a sound image of a first sound that is a sound of the first sound source; and an output control unit that outputs first sound data obtained by convolution processing using HRTF information corresponding to a position of the first sound source and a plurality of pieces of second sound data obtained by convolution processing using HRTF information corresponding to the positions of the second sound sources, wherein the second sound sources are set to be positioned around the first sound source.
In one aspect of the present technology, a first sound source and a plurality of second sound sources are set, the second sound sources being set at positions corresponding to a size of a sound image of a first sound that is a sound of the first sound source, and first sound data obtained by convolution processing using HRTF information corresponding to a position of the first sound source and a plurality of pieces of second sound data obtained by convolution processing using HRTF information corresponding to the positions of the second sound sources are output. The second sound sources are set to be positioned around the first sound source.
An embodiment for implementing the present technology will be described below. The description will be made in the following order.
In
The way the user, who is a listener, perceives the sound changes according to the distance from the car.
In the example of
On the other hand, in the example of B of
In this way, the user perceives the sense of distance to the sound source by perceiving the size of the sound image.
In the present technology, the distance from the user to an object serving as a virtual sound source is represented by controlling the size of the sound image. By changing the size of the sound image that the user hears, it is possible to make the user perceive the sense of distance from the user to the virtual sound source.
As shown in
In the example of
In the present technology, sound is presented by, for example, converting the sound from each sound source generated by computation using the head-related transfer functions (HRTF) corresponding to the positions of the central sound source and the ambient sound sources into L/R 2-channel sound and outputting the same from the headphones 1.
The sound from the central sound source represents the sound of the object serving as the virtual sound source, and is called the central sound in the present specification. The sound from the ambient sound sources represents the size of the sound image of the central sound, and is called the ambient sound in the present specification.
As shown in
In the example of
According to the present technology, it is possible to represent an object around the user as if it is a sound source. In addition, according to the present technology, it is possible to represent sounds as if they are coming from an empty space around the user.
By listening to the central sound and a plurality of ambient sounds, the user feels that the sound image of the central sound representing the sound from the virtual sound source has a size as indicated by a colored circle #11. As described with reference to
In this way, the user can perceive a sense of distance from the user to the object serving as the virtual sound source in the spatial sound, and can experience the spatial sound with a sense of reality.
As shown in
The central sound, which is the sound of the central sound source C, represents the sound of the object that is the virtual sound source. Further, the central sound is used as a reference sound for making the user perceive the sense of distance from the user to the virtual sound source.
A plurality of ambient sound sources are set around the central sound source C set in this way. For example, the plurality of ambient sound sources are arranged at regular intervals on a circle around the central sound source C.
As shown in
The ambient sounds, which are the sounds of the ambient sound sources LU, RU, LD, and RD, are sounds for representing the size of the sound image of the central sound. By listening to the central sound and the ambient sounds, the user feels that the sound image of the central sound has a size. This allows the user to perceive the sense of distance to the object, which is the virtual sound source.
For example, the ambient sound source RU is arranged at a position P11 which is a horizontal angle rAzim(d) and a vertical angle rElev(d) away from the position P1 where the central sound source C is arranged with respect to the user U. Similarly, the remaining ambient sound sources LU, RD, and LD are arranged at positions P12, P13, and P14, which are set with reference to the position P1.
A position P12 where the ambient sound source LU is arranged is a position which is a horizontal angle −rAzim(d) and a vertical angle rElev(d) away from the position P1. A position P13 where the ambient sound source RD is arranged is a position which is a horizontal angle rAzim(d) and a vertical angle −rElev(d) away from the position P1. A position P14 where the ambient sound source LD is arranged is a position which is a horizontal angle −rAzim(d) and a vertical angle −rElev(d) away from the position P1.
For example, the distance from the central sound source C to each ambient sound source is the same. In this way, the four ambient sound sources LU, RU, LD, and RD are arranged radially with respect to the central sound source C.
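As a concrete illustration of this geometry, the following Python sketch computes the four ambient sound source positions from the position of the central sound source. The functions r_azim and r_elev are hypothetical stand-ins for the rAzim(d) and rElev(d) offsets named above, whose exact form the description does not specify; the inverse-distance mapping and all constants are assumptions.

```python
# A minimal sketch, assuming angles in degrees and a hypothetical
# inverse-distance mapping for the angular offsets rAzim(d), rElev(d).

def r_azim(d: float) -> float:
    """Horizontal offset: wider spread for nearer virtual sound sources."""
    return min(30.0, 10.0 / max(d, 0.1))

def r_elev(d: float) -> float:
    """Vertical offset: wider spread for nearer virtual sound sources."""
    return min(20.0, 8.0 / max(d, 0.1))

def ambient_source_positions(center_azim: float, center_elev: float, d: float):
    """Return (azimuth, elevation) for RU, LU, RD, LD around position P1."""
    da, de = r_azim(d), r_elev(d)
    return {
        "RU": (center_azim + da, center_elev + de),
        "LU": (center_azim - da, center_elev + de),
        "RD": (center_azim + da, center_elev - de),
        "LD": (center_azim - da, center_elev - de),
    }

# Example: central sound source 45 degrees to the right at ear height,
# representing a virtual sound source 2 m from the user.
print(ambient_source_positions(45.0, 0.0, d=2.0))
```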
For example, when the central sound source and the ambient sound sources are viewed obliquely from above, the positional relationship between the central sound source and the ambient sound sources is the relationship shown in
The positions of the plurality of ambient sound sources set around the central sound source C as described above are different depending on the size of the sound image of the central sound to be perceived by the user.
Although an example in which four ambient sound sources are set has been described as a representative example, the number of ambient sound sources is not limited to this.
According to the present technology, by controlling the positions of the ambient sound sources arranged around the central sound source, the user can perceive different distances to the virtual sound sources.
In this way, by setting the positions of the ambient sound sources to arbitrary positions, it is possible to represent the distance even for a virtual sound source having a characteristic shape, such as a vertically or horizontally long shape.
Next, configurations of a sound reproducing system and an information processing apparatus to which the present technology is applied will be described.
In the present technology, for example, a user wears the headphones 1 and carries the information processing apparatus 10. A user can experience the spatial sound of the present technology by listening to the sound corresponding to the sound data processed by the information processing apparatus 10 through the headphones 1 connected to the information processing apparatus 10.
The information processing apparatus 10 is, for example, a smartphone, a mobile phone, a PC, a television, a tablet, or the like possessed by the user.
The headphones 1 are an example of a reproducing device; earphones or the like are also assumed as the reproducing device. The headphones 1 are worn on the user's head, more specifically on the user's ears, and are connected to the information processing apparatus 10 by wire or wirelessly.
As illustrated in
The information processing apparatus 10 also includes an input/output interface 15, an input unit 16 configured with various buttons and a touch panel, and an output unit 17 configured with a display, a speaker, and the like. The bus 14 is connected to the input/output interface 15 to which the input unit 16 and the output unit 17 are connected.
The information processing apparatus 10 further includes a storage unit 18 such as a hard disk or nonvolatile memory, a communication unit 19 such as a network interface, and a drive 20 for driving a removable medium 21. The storage unit 18, the communication unit 19, and the drive 20 are connected to the input/output interface 15.
The information processing apparatus 10 functions as an information processing apparatus that processes sound data reproduced by a reproducing device such as the headphones 1 worn by the user.
The communication unit 19 functions as an output unit that supplies audio data when the information processing apparatus 10 and the reproducing device are wirelessly connected.
The communication unit 19 may also function as an acquisition unit that acquires virtual sound source data and HRTF information via a network.
As shown in
The sound source setting unit 31 sets a virtual sound source for representing a sense of distance at a predetermined position. Further, the sound source setting unit 31 sets a central sound source according to the position of the virtual sound source, and sets ambient sound sources at positions according to the distance to the virtual sound source.
The spatial sound generation unit 32 generates sound data of sounds from the central sound source and ambient sound sources set by the sound source setting unit 31.
For example, the spatial sound generation unit 32 performs convolution processing on the virtual sound source data based on HRTF information corresponding to the position of the central sound source to generate sound data of the central sound. The spatial sound generation unit 32 also performs convolution processing on the virtual sound source data based on HRTF information corresponding to the position of each ambient sound source to generate sound data of each ambient sound.
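As a rough sketch of this step, the following assumes the HRTF information is available as time-domain HRIR pairs in a hypothetical lookup table hrir_db keyed by source position; the binaural output is the sum of the source signal convolved with the HRIR pair for each of the central and ambient sound source positions. This is an illustrative rendering under those assumptions, not the apparatus's exact implementation.

```python
import numpy as np

def render_binaural(source_signal, positions, hrir_db, gains=None):
    """Convolve the virtual sound source signal with the HRIR pair for
    each sound source position (central and ambient) and sum the
    results into an L/R 2-channel signal.

    positions : position keys for the central and ambient sound sources
    hrir_db   : hypothetical dict mapping a position key to a pair of
                1-D numpy arrays (hrir_left, hrir_right)
    gains     : optional per-source volume values
    """
    if gains is None:
        gains = [1.0] * len(positions)
    max_len = max(max(len(hl), len(hr)) for hl, hr in (hrir_db[p] for p in positions))
    out = np.zeros((2, len(source_signal) + max_len - 1))
    for pos, gain in zip(positions, gains):
        hrir_l, hrir_r = hrir_db[pos]
        yl = np.convolve(source_signal, hrir_l)  # left-ear rendering
        yr = np.convolve(source_signal, hrir_r)  # right-ear rendering
        out[0, :len(yl)] += gain * yl
        out[1, :len(yr)] += gain * yr
    return out
```

The gains parameter anticipates the volume adjustment described below: lowering the ambient-source gains shrinks the perceived sound image, and raising them enlarges it.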
The virtual sound source data subjected to convolution processing based on the HRTF information corresponding to the position of the central sound source and the virtual sound source data subjected to convolution processing based on the HRTF information corresponding to the positions of the ambient sound sources may be the same data or may be different data.
The output control unit 33 converts the sound data of the central sound and the sound data of each ambient sound generated by the spatial sound generation unit 32 into L/R sound data. The output control unit 33 controls the output unit 17 or the communication unit 19 to output the converted sound data from the reproducing device worn by the user.
In addition, the output control unit 33 appropriately adjusts the volume of the central sound and the volume of each ambient sound. For example, it is possible to decrease the volume of the ambient sound to decrease the size of the sound image of the central sound, or increase the volume of the ambient sound to increase the size of the central sound image. Further, the volume values of the respective ambient sounds can be set to either the same value or different values.
In this manner, the information processing unit 30 sets the virtual sound source and also sets the central sound source and the ambient sound sources. Further, the information processing unit 30 performs convolution processing based on HRTF information corresponding to the positions of the central sound source and the ambient sound sources, thereby generating sound data of the central sound and the ambient sounds, and outputting them to the reproducing device.
HRTF data corresponding to the position of the central sound source and HRTF data corresponding to the positions of the ambient sound sources may be synthesized by, for example, multiplying them on the frequency axis, and processing equivalent to the above-described processing may be realized using the synthesized HRTF data. The synthesized HRTF data becomes HRTF data for representing the area, which is the apparent size of the virtual sound source.
If the virtual sound source data used for the central sound source and the ambient sound sources are the same, this synthesis has the effect of reducing the amount of computation.
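Taken literally, the synthesis step described above can be sketched as an elementwise product of frequency-domain HRTF data; the use of numpy complex spectra of equal length is an assumption, and this shows only the combining operation the text names.

```python
import numpy as np

def synthesize_hrtf(hrtf_center, hrtf_ambients):
    """Multiply HRTF data on the frequency axis, as described above, to
    obtain a single synthesized HRTF for one ear that represents the
    area (apparent size) of the virtual sound source. Inputs are
    equal-length complex spectra."""
    combined = np.asarray(hrtf_center, dtype=complex).copy()
    for h in hrtf_ambients:
        combined *= np.asarray(h, dtype=complex)
    return combined
```

With the same virtual sound source data feeding every sound source, a single convolution with the synthesized HRTF then stands in for the per-source convolutions, which is the computation saving noted above.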
The processing of the information processing apparatus 10 will be described with reference to the flowchart of
In step S101, the sound source setting unit 31 sets a virtual sound source at a predetermined position.
In step S102, the sound source setting unit 31 sets the central sound source according to the position of the virtual sound source.
In step S103, the sound source setting unit 31 sets the ambient sound sources according to the distance from the user to the virtual sound source. In steps S101 to S103, the sound volume of each sound source is appropriately set.
In step S104, the spatial sound generation unit 32 performs convolution processing based on the HRTF information to generate sound data of the central sound, which is the sound of the central sound source, and of the ambient sounds, which are the sounds of the ambient sound sources. The sound data of the central sound and the sound data of the ambient sounds generated by the convolution processing based on the HRTF information are supplied to the reproducing device and used for outputting the central sound and the ambient sounds.
In step S105, the sound source setting unit 31 determines whether the distance from the user to the virtual sound source changes.
If it is determined in step S105 that the distance from the virtual sound source to the user changes, the sound source setting unit 31 controls the positions of the ambient sound sources according to the changed distance in step S106. For example, when representing that a virtual sound source approaches, the sound source setting unit 31 controls the position of each ambient sound source to move away from the central sound source. Further, when representing that the virtual sound source moves away, the sound source setting unit 31 controls the position of each ambient sound source to approach the central sound source.
In step S107, the spatial sound generation unit 32 performs convolution processing based on the HRTF information to generate sound data of the central sound and ambient sounds from the sound sources set again according to the distance to the virtual sound source. After the central sound and ambient sounds are output using the sound data generated by the convolution processing based on the HRTF information, the processing ends.
On the other hand, if it is determined in step S105 that the distance from the user to the virtual sound source does not change, the processing ends similarly. The above-described processing is repeated while the user listens to the sound of the virtual sound source.
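The flow of steps S101 to S107 might be skeletonized as follows. Every method called on the apparatus object is a hypothetical stand-in for the corresponding processing unit; the sketch only mirrors the control flow of the flowchart.

```python
def spatial_sound_loop(apparatus, virtual_source_position):
    # S101: set the virtual sound source at a predetermined position.
    vs = apparatus.set_virtual_source(virtual_source_position)
    # S102: set the central sound source according to that position.
    center = apparatus.set_central_source(vs)
    # S103: set the ambient sound sources according to the distance
    # from the user to the virtual sound source.
    d = apparatus.distance_to_user(vs)
    ambients = apparatus.set_ambient_sources(center, d)
    while apparatus.listening():
        # S104 / S107: HRTF convolution and output of the sound data.
        apparatus.render_and_output(center, ambients)
        # S105: does the distance to the virtual sound source change?
        new_d = apparatus.distance_to_user(vs)
        if new_d != d:
            # S106: approaching -> move the ambient sources away from
            # the central source; receding -> move them closer to it.
            ambients = apparatus.set_ambient_sources(center, new_d)
            d = new_d
```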
Through the above-described processing, the information processing apparatus 10 can appropriately represent the sense of distance from the user to the virtual sound source.
The user can perceive the distance to the virtual sound source through a realistic spatial sound experience.
As shown in
As shown in
For example, the information processing apparatus 10 communicates with the virtual sound source data providing server 60 to acquire virtual sound source data provided from the virtual sound source data providing server 60.
The information processing apparatus 10 also communicates with the HRTF server 70 and acquires HRTF information provided from the HRTF server 70. The HRTF information is data for imparting the transfer characteristics from the virtual sound source to the user's ear (eardrum). That is, the HRTF information is data in which the head-related transfer function for localizing the sound image at the position of the virtual sound source is recorded for each direction of the virtual sound source viewed from the user.
The HRTF information acquired from the HRTF server 70 may be recorded in the information processing apparatus 10, or may be acquired from the HRTF server 70 each time the sound of the virtual sound source is output.
As the head-related transfer function, information recorded in the form of head-related impulse response (HRIR), which is information in the time domain, may be used, or information recorded in the form of HRTF, which is information in the frequency domain, may be used. In the present specification, description is given assuming that HRTF information is handled.
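Since the two formats are related by a Fourier transform, converting between them is mechanical; a minimal sketch assuming numpy and a real-valued impulse response:

```python
import numpy as np

def hrir_to_hrtf(hrir, n_fft=512):
    """Time-domain HRIR -> frequency-domain HRTF (complex spectrum)."""
    return np.fft.rfft(hrir, n=n_fft)

def hrtf_to_hrir(hrtf, n_fft=512):
    """Frequency-domain HRTF -> time-domain HRIR."""
    return np.fft.irfft(hrtf, n=n_fft)
```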
Further, the HRTF information may be personalized according to the physical characteristics of the individual user, or may be commonly used by a plurality of users.
For example, the personalized HRTF information may be information obtained by placing the subject in a test environment and performing actual measurements, or may be information calculated from the ear image of the subject. Information calculated based on the size information of the head and ear of the subject may be used as the personalized HRTF information.
The HRTF information used in common may be information obtained by measurement using a dummy head, or may be information obtained by averaging HRTF information of a plurality of persons. A user may compare reproduced sounds using a plurality of pieces of HRTF information, and the HRTF information that the user determines to be the most suitable may be used as the HRTF information used in common.
The reproducing device 50 in
In
Further, the virtual sound source data providing server 60 and the HRTF server 70 may be realized by one device.
Notification of Obstacles Using Spatial Sound when Visually Impaired People Walk
The white cane W also includes a processing control unit that controls the output of ultrasonic waves from the ultrasonic speaker unit and processes sounds detected by the microphone unit. These configurations are provided in a housing formed at the upper end of the white cane W, for example.
The ultrasonic speaker unit and the microphone unit provided on the white cane W function as sensors, and the user U is notified of information about surrounding obstacles. Notification to the user U is performed using the sound of a virtual sound source that gives a sense of distance based on the size of the sound image.
As shown in
When the processing control unit of the white cane W detects the distance to the wall X and the direction of the wall X, the processing control unit sets the wall X which is an obstacle as an object corresponding to a virtual sound source.
The processing control unit also sets a central sound source and an ambient sound source that represent the distance to the wall X and the direction of the wall X. For example, the central sound source is set in the direction of the wall X, and the ambient sound sources are set at positions corresponding to the size of the sound image representing the distance to the wall X.
The processing control unit uses data such as notification sounds as virtual sound source data, and performs convolution processing on the virtual sound source data based on HRTF information corresponding to the respective positions of the central sound source and the ambient sound sources to generate the sound data of the central sound and the ambient sound. The processing control unit transmits the sound data obtained by performing the convolution processing to the headphones 1 worn by the user U, and outputs the central sound and the ambient sound.
When walking with a normal white cane (a white cane without an ultrasonic speaker unit and a microphone unit), for example, a visually impaired user can only obtain information within about 1 meter of themselves, and cannot obtain information about obstacles such as walls, steps, and cars several meters ahead, which poses a danger.
In this way, by representing the distance and direction of the obstacle detected by the white cane W with the spatial sound, the user U can perceive not only the direction of surrounding obstacles but also the distance to them by sound alone. In addition to information on obstacles, the presence of a space ahead and below, such as the edge of a platform, is also acquired as spatial information.
In this application example, the white cane W acquires distance information to surrounding obstacles by using the ultrasonic speaker unit and the microphone unit as sensors and represents the distance to the obstacle based on the acquired distance information using spatial sound.
For example, by repeating such processing at short intervals such as 50 ms, the user can immediately know information such as surrounding obstacles even while walking.
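Schematically, the sensing-and-presentation cycle might look like the following; the sensor and renderer interfaces are hypothetical, and only the time-of-flight conversion and the 50 ms period come from the description.

```python
import time

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air

def obstacle_notification_loop(ultrasonic, renderer, period_s=0.05):
    """Repeatedly estimate the obstacle distance and direction, then
    present them as a spatial sound whose sound-image size conveys the
    distance (larger image = nearer obstacle)."""
    while True:
        echo = ultrasonic.ping()  # hypothetical: returns echo delay and direction, or None
        if echo is not None:
            distance = echo.delay_s * SPEED_OF_SOUND_M_S / 2.0  # round trip
            # Central sound source in the obstacle's direction; ambient
            # sound sources spread according to the distance to convey.
            renderer.present(direction=echo.direction, distance=distance)
        time.sleep(period_s)  # repeat at short intervals such as 50 ms
```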
In
In addition, there are individual differences in how people perceive a sense of distance due to sound. The relationship between how the user perceives the distance and the size of the sound image may be learned in advance, and the size of the sound image may be adjusted according to the user's recognition pattern.
Furthermore, by adjusting the size of the sound image according to whether the user is walking or standing still, a representation that allows the user to easily perceive the sense of distance may be provided.
Presentation of Map Information Using Sound
In
The information processing apparatus 10 possessed by the user U includes a position detection unit that detects the current position of the user U and a surrounding information acquisition unit that acquires information such as surrounding stations.
In this application example, the information processing apparatus 10 acquires the position of the user U by the position detection unit, and acquires the surrounding information by the surrounding information acquisition unit. Further, the information processing apparatus 10 controls the size of the sound image presented to the user U according to the distance to the destination D, thereby allowing the user U to immediately perceive the sense of distance to the destination D.
For example, the information processing apparatus 10 increases the size of the sound image of the sound representing the destination D as the user U approaches the destination D. This allows the user U to perceive that the distance to the destination D is short.
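One plausible monotone mapping from remaining distance to sound-image size is sketched below; the description states only that the image grows as the destination nears, so the functional form and the 50 m scale constant are assumptions.

```python
def destination_image_size(distance_m: float, max_size: float = 1.0,
                           scale_m: float = 50.0) -> float:
    """Sound-image size for the sound representing the destination D:
    approaches max_size as the user closes in, tends to 0 far away."""
    return max_size * scale_m / (scale_m + distance_m)

# Example: the image grows as the user approaches the destination.
for d in (500.0, 100.0, 10.0, 0.0):
    print(d, round(destination_image_size(d), 2))
```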
In this way, it is possible to present map information using sound for the user to go to a destination in an easy-to-understand manner using spatial sound.
Further, by changing the size of the sound image according to the amount of noise in the surroundings, it is possible to make the representation easier to understand.
Example of Notification Sound
The information processing apparatus 10 possessed by the user U includes a detection unit that detects the degree of urgency and importance of the contents of the notification in cooperation with other devices such as household electric appliances (home appliances).
In this application example, the information processing apparatus 10 changes the size of the sound image of the notification sound of the home appliance according to the degree of urgency and importance detected by the detection unit, thereby immediately informing the user U of the degree of urgency and importance of the notification sound.
According to this application example, even if the user U does not notice the monotonous buzzer sound from the speaker installed in the home appliance, the notification sound of the home appliance is presented with an increased sound image size. This makes it possible for the user U to notice the notification sound of the home appliance.
The degree of urgency and importance of the notification sound of the home appliance is set according to the danger, for example. When the water boils, it is dangerous to leave it as it is without noticing the notification sound. A high level is set as the degree of urgency and importance for notification in this case.
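A minimal sketch of the mapping from urgency and importance to sound-image size follows; the levels and size values are illustrative assumptions.

```python
# Hypothetical urgency/importance levels for home-appliance notifications
# and the sound-image sizes used to present them.
URGENCY_TO_IMAGE_SIZE = {
    "low": 0.2,     # routine notification: small sound image
    "medium": 0.5,
    "high": 1.0,    # e.g. water left boiling: large, hard-to-miss image
}

def notification_image_size(urgency: str) -> float:
    return URGENCY_TO_IMAGE_SIZE.get(urgency, 0.5)
```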
Although a kettle has been described as the home appliance, the present technology can also be applied to the presentation of notification sounds of other home appliances. Applicable home appliances include refrigerators, microwave ovens, rice cookers, dishwashers, washing machines, water heaters, and vacuum cleaners. Moreover, these examples are merely illustrative, and applicable appliances are not limited to those listed.
Further, when it is desired to draw the user's attention to a specific part of a device, it is possible to guide the user's line of sight by gradually reducing the area of the caution sound. The specific parts of the device are, for example, switches, buttons, touch panels, and the like provided in the device.
In this way, according to the present technology, it is possible to allow the user to perceive a sense of distance to the virtual sound source, present the user with the importance and urgency of the notification sound of the device, and guide the user's line of sight.
Example of Teleconference System
The communication management server 100 controls transmission and reception of voice data between users. Voice data transmitted from the information processing apparatus 10 used by each user is mixed in the communication management server 100 and distributed to all the information processing apparatuses 10.
The communication management server 100 also manages the position of each user on the space map, and outputs each user's voice as sound having a sound image whose size corresponds to the distance between the users on the space map. The communication management server 100 has functions similar to those of the information processing apparatus 10 described above.
The users A to D wear the headphones 1 and participate in the teleconference using the information processing apparatuses 10A to 10D, respectively. Each information processing apparatus 10 has microphones built therein or connected thereto, and is installed with a program for using the teleconference system.
The example of
User D can set the distance to a desired user by moving the position of the icon and controlling the position of each user on the space map. In the example of
As indicated by a colored circle #61, the voice of user B, who is set at a close position on the space map, is output as sound with a large sound image according to the distance. As indicated by circles #62 and #63, the voices of users A and C are output as sounds with sound images whose sizes correspond to their respective distances.
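Per-speaker sizes can be derived from distances on the space map; in this sketch the coordinates and the scale constant are hypothetical, and nearer speakers simply receive larger sound images.

```python
import math

def speaker_image_sizes(listener_xy, speaker_positions, scale=3.0):
    """Map each speaker's distance from the listener on the space map
    to a sound-image size (nearer speaker -> larger image)."""
    lx, ly = listener_xy
    return {
        name: scale / (scale + math.hypot(sx - lx, sy - ly))
        for name, (sx, sy) in speaker_positions.items()
    }

# Example layout: user B set nearest to user D, as in the description.
print(speaker_image_sizes((0.0, 0.0),
                          {"A": (4.0, 1.0), "B": (1.0, 0.5), "C": (6.0, -2.0)}))
```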
If the voices of all users are mixed as monaural voices and output from the headphones 1, the positions of the speakers are aggregated at one point, so that the cocktail party effect is unlikely to occur and it is difficult for the user to pick out and attend to the voice of a specific speaker. In addition, it becomes difficult to have group discussions among a plurality of groups.
In this way, by controlling the size of the sound image of the voice of each speaker according to the position of each speaker, it is possible to represent the sense of distance between the user and each speaker.
By representing the distance to each speaker who is present at the conference, the user can have a conversation while feeling a sense of perspective.
The voices of speakers belonging to the same group may be output as voices with large sound images, as if localized at positions close to the user's ears. This makes it possible to represent the sense that the speakers form a group.
Each information processing apparatus 10 may have an HMD, a camera, or the like built therein or connected thereto. By detecting the direction of the user's face using the HMD or camera, and increasing the size of the sound image of the voice of a specific speaker when it is detected that the user is paying attention to that speaker, it is possible to make the user feel as if the specific speaker is speaking close to the user.
In this example, each user can control the positions of other users (speakers), but the present technology is not limited to this. For example, it is conceivable that each participant in the conference controls their own position or other participants' positions on the space map, and that the positions set by one participant are shared among all the participants.
Example of Simulated Car Engine Sound
Pedestrians are thought to recognize traveling cars mainly based on visual and auditory information, but the engine sound of recent electric cars is quiet, making such cars difficult for pedestrians to notice. Moreover, even when the sound of a car is audible, surrounding noise can make it difficult to notice that the car is approaching.
In this application example, the simulated engine sound emitted by a car 110 is made to be heard by a user U, who is a pedestrian, so that the traveling car 110 is noticed. The car 110 is equipped with a device having functions similar to those of the information processing apparatus 10. The user U walking while wearing the headphones 1 hears the simulated engine sound output from the headphones 1 under the control of the car 110.
In this application example, the car 110 includes a camera for detecting the user U who is a pedestrian, and a communication unit for transmitting a simulated engine sound as approach information to the user U walking nearby.
When the car 110 detects the user U, the car 110 generates a simulated engine sound having a sound image whose size corresponds to the distance to the user U. The simulated engine sound generated based on the central sound and the ambient sound is transmitted to the headphones 1 and presented to the user U.
The simulated engine sound based on the central sound and the ambient sound may be generated in the information processing apparatus 10 possessed by the user U instead of in the car 110.
According to the present technology, it is possible to allow the user U to perceive the sense of distance to the car 110 as well as the direction of arrival of the car 110, and to improve the accuracy of danger avoidance.
Notification using the simulated engine sound as described above can be applied not only to cars with low engine sound, but also to conventional cars. By exaggerating the sense of distance by causing the user to hear a simulated engine sound with a sound image whose size corresponds to the distance, it is possible to make the user perceive that the car is approaching and improve the accuracy of danger avoidance.
Example of Obstacle Warning Sound of Car
Although there are already systems that give audible warnings when a car comes close to a wall, such as during parking, the user may not feel the sense of distance between the car and the wall.
In this application example, the car is equipped with a camera for detecting approaching walls. Also in this case, the car is equipped with a device having the same function as the information processing apparatus 10.
The device mounted on the car detects the distance between the car body and the wall based on the image captured by the camera, and controls the size of the sound image of the warning sound. The closer the car body is to the wall, the larger the sound image of the warning sound that is output. By perceiving the sense of distance to the wall from the size of the sound image of the warning sound, the user can improve the accuracy of danger avoidance.
Example of Predictive Fish School Detection
The present technology can also be applied to the presentation of schools of fish by a predictive fish school detection device. For example, the larger the area of the school of fish, the larger the sound image of the presented warning sound. This allows the user to immediately grasp the predicted size of the school of fish.
Example of Sound Space Representation
The present technology allows the user to perceive a sense of distance to the virtual sound source. In addition, by changing the area of the reverberant sound (the size of its sound image) relative to the direct sound, it is possible to represent the expanse of the space. That is, by applying the present technology to reverberant sound, it is possible to represent a sense of depth.
In addition, by reducing the amount of change in the area of the reverberant sound according to the user's familiarity, it is possible to reduce the stimulation burden on the user.
The perception of sound differs depending on whether the sound is coming from the front, the side, or the back of the face. By providing parameters suitable for each direction as parameters related to area representation, representation appropriate for the presentation direction of the sound can be provided.
Examples of Video Content and Movies
The present technology can be applied to sound presentation for various types of content, such as video content including movies, audio content, and game content. By setting an object in the content as a virtual sound source and controlling the central sound and ambient sounds, it is possible to realize an experience in which the virtual sound source seems to approach or move away from the user.
Configuration of Reproducing Device
Closed headphones (over-ear headphones) as shown in
The reproducing device shown in
The open-type earphones shown in
The left unit 120L has the same structure as the right unit 120R. The left unit 120L and the right unit 120R are connected by wire or wirelessly.
The driver unit 121 of the right unit 120R receives an audio signal transmitted from the information processing apparatus 10, generates sound according to the audio signal, and outputs the sound from the tip of the sound conduit 122 as indicated by the arrow A1. A hole for outputting sound toward the outer earhole is formed at the junction of the sound conduit 122 and the mounting part 123.
The mounting part 123 is shaped like a ring. Along with the sound output from the tip of the sound conduit 122, surrounding environmental sound also reaches the outer earhole as indicated by the arrow A2.
In this way, it is possible to use open earphones that do not seal the ear canal.
These reproducing devices may be provided with a detection unit that detects the direction of the user's head. When such a detection unit is provided, the HRTF information used in the convolution processing is adjusted so that the position of the virtual sound source remains fixed even if the direction of the user's head changes.
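For a yaw-only head tracker, the adjustment amounts to looking up the HRTF information at the source direction expressed relative to the head; a minimal sketch follows (angles in degrees, yaw-only compensation assumed).

```python
def compensated_azimuth(source_azimuth_world: float, head_yaw: float) -> float:
    """Azimuth at which to look up HRTF information so that the virtual
    sound source stays fixed in the world while the head turns.
    Result is wrapped to [-180, 180)."""
    return (source_azimuth_world - head_yaw + 180.0) % 360.0 - 180.0
```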
Program
The above-described series of processing can be executed by hardware or by software. When the series of processing is executed by software, a program constituting the software is installed from a program recording medium onto a computer embedded in dedicated hardware, a general-purpose personal computer, or the like.
The installed program is provided by being recorded in a removable medium configured as an optical disc (a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), or the like), a semiconductor memory, or the like. In addition, the program may be provided through a wired or wireless transmission medium such as a local area network, the Internet or digital broadcasting. The program can be installed in a ROM or a storage unit in advance.
The program executed by the computer may be a program that performs a plurality of steps of processing in time series in the order described in the present specification or may be a program that performs a plurality of steps of processing in parallel or at a necessary timing such as when a call is made.
Meanwhile, in the present specification, a system is a collection of a plurality of constituent elements (devices, modules (components), or the like), regardless of whether all the constituent elements are located in the same casing. Thus, a plurality of devices housed in separate housings and connected via a network, and one device in which a plurality of modules are housed in one housing, are both systems.
The effects described in the present specification are merely examples and are not limited, and other effects may be obtained.
The embodiments of the present technology are not limited to the aforementioned embodiments, and various changes can be made without departing from the gist of the present technology.
For example, the present technology may be configured as cloud computing in which one function is shared and processed cooperatively by a plurality of devices via a network.
In addition, each step described in the above flowchart can be executed by one device or executed in a shared manner by a plurality of devices.
Furthermore, in a case in which one step includes a plurality of processes, the plurality of processes included in the one step can be executed by one device or executed in a shared manner by a plurality of devices.
Combination Examples of Configurations
The present technology can be configured as follows.
(1)
An information processing apparatus including:
(2)
The information processing apparatus according to (1), wherein the sound source setting unit sets the second sound sources around the first sound source.
(3)
The information processing apparatus according to (1) or (2), wherein the sound source setting unit sets the second sound sources to positions further away from the first sound source as the size of the sound image of the first sound increases.
(4)
The information processing apparatus according to any one of (1) to (3), wherein
(5)
The information processing apparatus according to any one of (1) to (4), wherein
(6)
The information processing apparatus according to any one of (1) to (5), wherein
(7)
The information processing apparatus according to (6), wherein the output control unit adjusts a volume of each of the first sound and the second sound according to the size of the sound image of the first sound.
(8)
The information processing apparatus according to any one of (2) to (7), wherein
(9)
The information processing apparatus according to any one of (2) to (5), wherein
(10)
The information processing apparatus according to any one of (2) to (9), further including:
(11)
An information processing method for causing an information processing apparatus to execute processing including:
(12)
A program for causing a computer to execute processing including:
Foreign application priority data: JP 2021-035102, filed March 2021 (national).
International filing: PCT/JP2022/000832, filed January 13, 2022 (WO).
US publication: US 2024/0137724 A1, April 2024.