The present technology relates to an information processing device, an information processing method, and a program, and more particularly, to an information processing device, an information processing method, and a program that enable a person with hearing loss to suitably hear a sound having sound image localization.
According to Non-Patent Document 1, a human is said to perceive the direction of a sound by using, as cues, peaks and notches on the frequency axis of a transfer characteristic that changes with each sound arrival direction, and it is known that individually optimizing the head-related transfer function (HRTF) yields high sound image localization with headphones or the like.
A person with hearing loss hears sound through a hearing aid that corrects sound data according to the user's auditory characteristic; however, when sound data having sound image localization is corrected by a hearing aid, the sound image localization cannot be perceived in some cases.
The present technology has been made in view of such a situation, and enables a person with hearing loss to suitably hear a sound having sound image localization.
An information processing device or a program of the present technology is an information processing device including: a rendering processing unit that generates stereophonic sound data having sound image localization on the basis of a direction of a sound source arranged in a virtual space; and a signal processing unit that performs data conversion processing corresponding to an auditory characteristic of a user on the stereophonic sound data generated by the rendering processing unit and generates output sound data to be heard by the user, or a program for causing a computer to function as such an information processing device.
An information processing method of the present technology is an information processing method for an information processing device including a rendering processing unit and a signal processing unit, the information processing method including: by the rendering processing unit, generating stereophonic sound data having sound image localization on the basis of a direction of a sound source arranged in a virtual space; and by the signal processing unit, performing data conversion processing corresponding to an auditory characteristic of a user on the stereophonic sound data generated by the rendering processing unit and generating output sound data to be heard by the user.
In the information processing device, the information processing method, and the program of the present technology, stereophonic sound data having sound image localization is generated on the basis of a direction of a sound source arranged in a virtual space, data conversion processing corresponding to an auditory characteristic of a user is performed on the stereophonic sound data, and output sound data to be heard by the user is generated.
Hereinafter, embodiments of the present technology will be described with reference to the drawings.
The external cooperation device 11 is any signal processing device such as a smartphone, a smartwatch, a personal computer (PC), or a head mounted display (HMD). The external cooperation device 11 supplies left sound data (for the left ear) and right sound data (for the right ear), that is, stereophonic sound data in 3D audio having sound image localization, to the hearing aid 12. 3D audio (stereophonic audio) refers to a method of reproducing the three-dimensional direction, distance, spread, and the like of a sound at the time of reproduction.
The hearing aid 12 includes a left ear hearing aid 12L that is worn on the left ear of a person with hearing loss and outputs a sound (output sound data) to be heard by the left ear, and a right ear hearing aid 12R that is worn on the right ear and outputs a sound (output sound data) to be heard by the right ear. In the hearing aid 12, for example, a multiband compressor is used that compresses the input and output characteristic of sound in frequency bands that are difficult for each of the left ear and the right ear of the person with hearing loss to hear. The left ear hearing aid 12L and the right ear hearing aid 12R apply their multiband compressors to the left sound data and the right sound data supplied from the external cooperation device 11, respectively, and output the processed sound data as sound waves from their sound output units.
The 3D rendering processing unit 31 is arranged, for example, in the external cooperation device 11. The 3D rendering processing unit 31 performs 3D rendering processing on the basis of sound source data included in content with 3D metadata, and generates sound data (stereophonic sound data) in 3D audio. Content with 3D metadata is, for example, information of a virtual object, a virtual sound source (hereinafter simply referred to as a sound source), or the like in a virtual space in which a virtual world such as virtual reality (VR) or augmented reality (AR) is formed. 3D metadata includes data related to the arrangement of objects, such as the position and posture of a virtual object arranged in the virtual space, or the position or direction of a sound source. In the present embodiment, as content with 3D metadata, only a sound source to which data of the direction of the sound source in the virtual space is added is considered, and the sound data generated from the sound source is referred to as sound source data. The direction of the sound source is also referred to as the angle of the sound source with the front direction of the user as a reference (0 degrees). In the description of the present embodiment, it is assumed that the sound source is arranged in a direction within a two-dimensional plane; however, the present technology can be applied similarly in a case where the sound source is arranged in a three-dimensional direction not limited to a two-dimensional plane.
The 3D rendering processing unit 31 acquires sound source data that is content with 3D metadata stored in advance in a storage unit (not illustrated) of the external cooperation device 11. However, the sound source data may be supplied to the external cooperation device 11 (3D rendering processing unit 31) via a communication line such as the Internet, and the path through which the sound source data is supplied to the 3D rendering processing unit 31 may have any form.
The 3D rendering processing unit 31 acquires a head related transfer function (HRTF) corresponding to the angle of the sound source from an individually-optimized HRTF data set on the basis of the data of the direction (angle) of the sound source added to the acquired sound source data. The individually-optimized HRTF data set is stored in advance in the storage unit (not illustrated) of the external cooperation device 11. The head related transfer function represents the transfer characteristic from the sound source to each of the left ear and the right ear of the user. The head related transfer function changes according to the direction of the sound source with respect to the user's head (the arrival direction in which the sound wave arrives at the user's head), and also differs between the left ear and the right ear. Furthermore, the head related transfer function differs from user to user, and it is assumed that a user-specific left head related transfer function (for the left ear) and a user-specific right head related transfer function (for the right ear) are created in advance for each direction of the sound source as the individually-optimized HRTF data set and stored in the storage unit. Note that, as the head related transfer function, an average function common to all users may be used instead of the head related transfer function optimized for each user. As is well known, the head related transfer function is the frequency domain representation, that is, the Fourier transform, of the head-related impulse response (HRIR), the HRIR representing the sound wave heard by each of the left ear and the right ear in a case where one impulse is generated at the position of the sound source.
The 3D rendering processing unit 31 generates left sound data and right sound data from the sound source data from the storage unit and the left and right head related transfer functions corresponding to the direction (angle) of the sound source added to the sound source data. Specifically, the 3D rendering processing unit 31 generates the left sound data by convolution of the sound source data and the left head-related impulse response, on the basis of the sound source data and the left head related transfer function. In this convolution, the sound source data is converted from the time domain representation to the frequency domain representation, the sound source data in the frequency domain and the left head related transfer function are multiplied component by component at the same frequencies, and the resulting frequency components are subjected to inverse Fourier transform to generate the left sound data. The same applies to generation of the right sound data. Hereinafter, when the head related transfer function, the head-related impulse response, or the sound data is referred to without specifying left or right, it represents each of the left and right head related transfer functions, head-related impulse responses, or sound data, respectively. The convolution of the sound source data and the head-related impulse response is also referred to as convolution of the sound source data and the head related transfer function. The sound source data and the sound data generated by the 3D rendering processing unit 31 may be represented in either the time domain or the frequency domain, and in the following, the two representations are not distinguished.
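The rendering step just described can be expressed compactly in code. The following is a minimal Python sketch, assuming NumPy and assuming the individually-optimized left and right HRIRs for the desired direction have already been loaded as arrays; the function names are illustrative and not part of the system described here.

```python
import numpy as np

def render_binaural(source, hrir_left, hrir_right):
    """3D rendering for one source direction: convolve the mono source
    data with the left/right head-related impulse responses (HRIRs)."""
    left = np.convolve(source, hrir_left)     # left sound data
    right = np.convolve(source, hrir_right)   # right sound data
    return left, right

def render_binaural_fft(source, hrir_left, hrir_right):
    """Equivalent frequency-domain form: Fourier-transform the source,
    multiply component by component with the HRTF (the transform of the
    HRIR), and inverse-transform back to the time domain."""
    n = len(source) + max(len(hrir_left), len(hrir_right)) - 1
    src_f = np.fft.rfft(source, n)
    left = np.fft.irfft(src_f * np.fft.rfft(hrir_left, n), n)
    right = np.fft.irfft(src_f * np.fft.rfft(hrir_right, n), n)
    return left, right
```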
The 3D rendering processing unit 31 supplies the generated left sound data and right sound data to the signal processing units 41L and 41R for a person with hearing loss, respectively.
The signal processing units 41L and 41R for a person with hearing loss are arranged in, for example, the left ear hearing aid 12L and the right ear hearing aid 12R, respectively. The signal processing unit 41L for a person with hearing loss executes processing (compression processing) of the multiband compressor on the left sound data from the 3D rendering processing unit 31. The signal processing unit 41R for a person with hearing loss executes processing (compression processing) of the multiband compressor on the right sound data from the 3D rendering processing unit 31. The processing of the multiband compressor is processing of dividing the entire frequency domain (for example, the entire audible range) of the sound data into a plurality of frequency bands, converting the input level (amplitude level) of the input sound data according to the input and output characteristic for each frequency band, and outputting the converted sound data.
Each of the signal processing units 41L and 41R for a person with hearing loss executes processing of the multiband compressor as described above. The auditory characteristic is different for each user and for each frequency (for each frequency band). The auditory characteristic is also different between the left ear and the right ear. Therefore, the input and output characteristics of the multiband compressors in the signal processing units 41L and 41R for a person with hearing loss are set to be input and output characteristics adapted to the auditory characteristics of the left ear and the right ear for each user and for each frequency band. In the present embodiment, the setting or change of the input and output characteristic of the multiband compressor in each of the signal processing units 41L and 41R for a person with hearing loss is performed by adjusting the value of the parameter for signal processing for a person with hearing loss specifying (determining) the input and output characteristic. However, processing of the signal processing units 41L and 41R for a person with hearing loss is not limited to processing of the multiband compressor, and may be any processing that performs data conversion processing of converting input sound data into sound data for a person with hearing loss. Also in this case, it is assumed that the processing characteristic of the signal processing units 41L and 41R for a person with hearing loss is set or changed by adjusting the value of the parameter for signal processing for a person with hearing loss, and the signal processing units 41L and 41R for a person with hearing loss perform the data conversion processing of the characteristic corresponding to the auditory characteristic of the user. The parameter for signal processing for a person with hearing loss is also simply referred to as a parameter.
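As one concrete, much-simplified reading of such a multiband compressor, the sketch below splits the input into frequency bands and applies a per-band input and output level mapping. The band edges, thresholds, compression ratios, and gains stand in for the parameter for signal processing for a person with hearing loss; they, and the omission of attack/release smoothing, are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def band_filters(fs, edges):
    """One 4th-order band-pass filter per band [edges[i], edges[i+1])."""
    return [butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            for lo, hi in zip(edges[:-1], edges[1:])]

def multiband_compressor(x, fs, edges, thresholds_db, ratios, gains_db):
    """Split x into bands; in each band, map input level to output level.

    The per-band triple (threshold, ratio, makeup gain) plays the role of
    the parameter that specifies the input and output characteristic.
    Instantaneous levels are used here for brevity; a real hearing aid
    smooths the level estimate (attack/release) before applying gain.
    """
    y = np.zeros_like(x, dtype=float)
    for sos, t_db, ratio, g_db in zip(band_filters(fs, edges),
                                      thresholds_db, ratios, gains_db):
        band = sosfilt(sos, x)
        level_db = 20.0 * np.log10(np.maximum(np.abs(band), 1e-9))
        over_db = np.maximum(level_db - t_db, 0.0)      # amount above threshold
        gain_db = g_db - over_db * (1.0 - 1.0 / ratio)  # compress above threshold
        y += band * 10.0 ** (gain_db / 20.0)
    return y
```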
The sound output units 42L and 42R are arranged in the left ear hearing aid 12L and the right ear hearing aid 12R, respectively. In the left ear hearing aid 12L worn on the left ear of the user, the sound output unit 42L outputs the sound data from the signal processing unit 41L for a person with hearing loss to the left ear of the user as a sound wave. In the right ear hearing aid 12R worn on the right ear of the user, the sound output unit 42R outputs the sound data from the signal processing unit 41R for a person with hearing loss to the right ear of the user as a sound wave.
Note that the 3D rendering processing unit 31 and the signal processing units 41L and 41R for a person with hearing loss may all be arranged in the external cooperation device 11, or may all be arranged in the hearing aid 12.
The user interface unit 51 is arranged, for example, in the external cooperation device 11. The user interface unit 51 is an operation input unit that receives the user's operations when the parameter of the signal processing units 41L and 41R for a person with hearing loss is adjusted. In the adjustment of the parameter, for example, as described in detail later, the 3D rendering processing unit 31 generates left sound data and right sound data in 3D audio from test sound source data generated by a test sound source (for adjustment). The left sound data and right sound data generated by the 3D rendering processing unit 31 are converted into left sound data and right sound data for a person with hearing loss by the signal processing units 41L and 41R for a person with hearing loss, respectively, and are output from the sound output units 42L and 42R, respectively. The user hears the sounds output from the sound output units 42L and 42R and inputs (specifies) the perceived direction (sound arrival direction) of the sound source (sound image) through the user interface unit 51. The parameter of the signal processing units 41L and 41R for a person with hearing loss is then adjusted such that the direction of the sound source of the sound data generated by the 3D rendering processing unit 31 coincides with the direction of the sound source input by the user from the user interface unit 51.
The parameter controller 52 is arranged in the external cooperation device 11, for example. The parameter controller 52 adjusts the value of the parameter of the signal processing units 41L and 41R for a person with hearing loss on the basis of information input from the user interface unit 51 by the user, or the like.
<Configuration Example of Information Processing System in Case where Person with Normal Hearing Hears Sound in 3D Audio>
In a case where a person with normal hearing hears sound source data of content with 3D metadata in 3D audio, a sound output device used by the person with normal hearing, such as general earphones or headphones, is connected to the external cooperation device 11 instead of the hearing aid 12.
<Configuration Example of Information Processing System in Case where Person with Hearing Loss Hears Sound in 3D Audio>
In a case where a person with hearing loss hears sound source data of content with 3D metadata in 3D audio, the left ear hearing aid 12L and the right ear hearing aid 12R are connected to the external cooperation device 11 in a wired or wireless manner.
Therefore, in the information processing system 1, the parameter of the signal processing units 41L and 41R for a person with hearing loss is adjusted for 3D audio such that the person with hearing loss can suitably perceive sound image localization.
Note that in the following, the value of the parameter (input and output characteristic of the multiband compressor) of the signal processing units 41L and 41R for a person with hearing loss in a case where the parameter is not adjusted for 3D audio is referred to as the value of the parameter of a hearing aid normally used by the user.
<Procedure for Adjusting Parameter of Signal Processing Units 41L and 41R for Person with Hearing Loss>
In step S12, the parameter controller 52 sets the frequency band f to be focused on as the first frequency band. Here, it is assumed that the parameter controller 52 divides the entire frequency range (for example, the entire audible range) allowed as an input signal (sound data) input to the multiband compressors of the signal processing units 41L and 41R for a person with hearing loss into a plurality of frequency bands and adjusts the parameter for each frequency band. The frequency band to be a parameter adjustment target may be some of the plurality of divided frequency bands. It is assumed that the order (turn) is given to each frequency band, for example, in descending order or ascending order of frequency. At this time, the frequency band f to be focused on represents the frequency band of the parameter to be adjusted, and the first frequency band represents the frequency band to which the first order is given among the orders (turns) given to the respective frequency bands. The processing proceeds from step S12 to step S13.
In step S13, the 3D rendering processing unit 31 generates left sound data and right sound data in 3D audio for the test sound source data generated from a test sound source arranged with respect to the user's head in the virtual space. The test sound source data may be sound data including frequency components in all the frequency bands that are parameter adjustment targets, or may be sound data including only frequency components in the focused-on frequency band f that is currently the adjustment target. The signal processing units 41L and 41R for a person with hearing loss apply the processing of the multiband compressors to the left sound data and the right sound data generated by the 3D rendering processing unit 31, respectively. The processing proceeds from step S13 to step S14.
In step S14, the parameter controller 52 causes the left sound data and right sound data for a person with hearing loss generated by the processing of the multiband compressors of the signal processing units 41L and 41R for a person with hearing loss to be output from the sound output units 42L and 42R, respectively, and presented to the user. The processing proceeds from step S14 to step S15.
In step S15, the parameter controller 52 judges whether or not a sound can be heard on the basis of input information from the user interface unit 51. For example, in a case where the user does not specify the direction (angle) of the sound source (sound image) by the user interface unit 51, it is judged that the sound cannot be heard, and in a case where the user specifies the direction of the sound source by the user interface unit 51, it is judged that the sound can be heard.
In a case where it is judged in step S15 that the sound cannot be heard, the processing proceeds to step S16, and the parameter controller 52 increases the value of the parameter in the focused-on frequency band f of each of the signal processing units 41L and 41R for a person with hearing loss by one. The value of the parameter of the signal processing units 41L and 41R for a person with hearing loss represents, for example, a parameter that determines the relationship between the amplitude level of an input signal and the amplitude level of an output signal in the input and output characteristic of the multiband compressor. In the present embodiment, it is assumed that the input and output characteristic of the multiband compressor is set such that the greater the value of the parameter, the greater the amplitude level of an output signal with respect to the amplitude level of an input signal. For example, in a case where the sound data input to the signal processing units 41L and 41R for a person with hearing loss is fixed, the amplitude of the sound data output by the signal processing units 41L and 41R for a person with hearing loss increases as the value of the parameter increases. The processing proceeds from step S16 to step S19.
In a case where it is judged in step S15 that the sound can be heard, the processing proceeds to step S17, and the parameter controller 52 judges whether or not the direction of the sound source (sound image localization) perceived by the user is appropriate on the basis of the input information from the user interface unit 51. Specifically, in a case where the angle difference between the direction (angle) in which the test sound source is arranged with respect to the user's head in the virtual space and the direction (angle) of the sound source input by the user from the user interface unit 51 is equal to or smaller than a predetermined threshold, the parameter controller 52 judges that the sound image localization is appropriate, and in a case where the angle difference is larger than the threshold, the parameter controller 52 judges that the sound image localization is not appropriate.
In a case where it is judged in step S17 that the sound image localization is not appropriate, the processing proceeds to step S18, and the parameter controller 52 decreases the value of the parameter in the focused-on frequency band f of each of the signal processing units 41L and 41R for a person with hearing loss by one. The processing returns from step S18 to step S13.
In a case where it is judged in step S17 that the sound image localization is appropriate, the parameter controller 52 sets (determines) the value of the parameter (input and output characteristic of the multiband compressor) in the focused-on frequency band f of each of the signal processing units 41L and 41R for a person with hearing loss to the current value. The processing proceeds to step S19.
In step S19, the parameter controller 52 updates the frequency band f to be focused on to a frequency band given the next turn with respect to the order of the current frequency band. The processing proceeds from step S19 to step S20.
In step S20, the parameter controller 52 judges whether or not adjustment of the parameter (adjustment of the input and output characteristic of the multiband compressor) in all frequency bands (frequency bands that are adjustment targets) has been terminated. That is, in a case where the order of the frequency band f to be focused on updated in step S19 exceeds the final order, the parameter controller 52 judges that parameter adjustment in all the frequency bands that are adjustment targets has been terminated. In a case where the order of the frequency band f to be focused on does not exceed the final order, the parameter controller 52 judges that parameter adjustment in all the frequency bands that are adjustment targets has not been terminated.
In a case where it is judged in step S20 that parameter adjustment in all the frequency bands that are adjustment targets has not been terminated, the processing returns to step S13, and steps S13 to S20 are repeated.
In a case where it is judged in step S20 that parameter adjustment in all the frequency bands that are adjustment targets has been terminated, the process flow of this flowchart is terminated.
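The loop of steps S12 to S20 can be summarized in code. The sketch below is a straightforward transcription under stated assumptions: present_test_sound and ask_user_direction are hypothetical stand-ins for the rendering/output path and the user interface unit 51, and the 15-degree default is an assumed value for the predetermined threshold of step S17.

```python
def present_test_sound(band, params):
    """Hypothetical stand-in for steps S13-S14: render the test source in 3D
    audio, apply the multiband compressors with the current parameter values,
    and output the result from the sound output units."""

def ask_user_direction():
    """Hypothetical stand-in for the user interface unit 51: returns the
    angle (degrees) the user perceives, or None if no sound was heard."""
    return None

def angle_error(a, b):
    """Circular difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def adjust_parameters(bands, params, source_angle, threshold_deg=15.0):
    """Steps S12-S20: adjust the parameter value band by band."""
    for f in bands:                               # steps S12, S19, S20
        while True:
            present_test_sound(f, params)         # steps S13, S14
            heard = ask_user_direction()          # step S15
            if heard is None:
                params[f] += 1                    # step S16: raise output level
                break                             # proceed to the next band
            if angle_error(heard, source_angle) <= threshold_deg:
                break                             # step S17: localization OK
            params[f] -= 1                        # step S18: lower, then retest
    return params
```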
Adjustment of the parameter for signal processing for a person with hearing loss of the signal processing units 41L and 41R for a person with hearing loss may be repeatedly executed while changing the direction in which the test sound source is arranged to a plurality of different directions, and adjustment of the parameter for signal processing for a person with hearing loss may be ended in a case where the values of the parameters for signal processing for a person with hearing loss converge.
According to adjustment of the parameter for signal processing for a person with hearing loss as described above, it is possible to provide sound in 3D audio having sound image localization suitable for the user (person with hearing loss).
Note that, since a brute-force search for the input and output characteristic of the multiband compressor places a heavy burden on the user, an A/B test may be presented to the user instead and the adjustment may be performed by reinforcement learning. In that case, sound data generated with the initial values of the parameter for signal processing for a person with hearing loss, before adjustment is started, is presented to the user as A, sound data generated with the parameter being adjusted is presented as B, and the user selects whichever has sound image localization that can be heard more appropriately.
The direction of the sound source specified by the user through the user interface unit 51 may vary between presentations, and the direction may also be specified from the motion of the user's head by using a head mounted display 63.
<Processing of Reproducing Plurality of Pieces of Sound Source Data for Person with Normal Hearing Corresponding to Plurality of Sound Sources>
It is assumed that at the time of reproducing the sound source data of content with 3D metadata, a plurality of sound sources 1 to N is arranged at a plurality of locations (directions) in the virtual space, and a person with normal hearing hears sound source data (sound waves) generated by the sound sources 1 to N in 3D audio. Directions (angles) of the sound sources 1 to N with respect to the user's head in the virtual space are defined as angles θ1 to θN, respectively. In this case, the 3D rendering processing unit 31 individually performs 3D rendering processing on pieces of sound source data of the sound sources 1 to N on the basis of the pieces of sound source data of the sound sources 1 to N, and generates pieces of sound data in 3D audio. That is, the 3D rendering processing unit 31 performs 3D rendering processing P1-1 to P1-N in the directions θ1 to θN on the pieces of sound source data of the sound sources 1 to N, and generates pieces of left sound data and pieces of right sound data. At this time, the 3D rendering processing unit 31 acquires the head related transfer function corresponding to each of the angles θ1 to θN of the sound sources from the individually-optimized HRTF data set and uses the head-related transfer function for generation of sound data.
The 3D rendering processing unit 31 adds (sums up) the pieces of left sound data generated by the 3D rendering processing P1-1 to P1-N in the directions θ1 to θN by addition processing P2-L to generate one piece of left sound data (for one channel). The sound data generated by the addition processing P2-L is output from the left sound output unit 71L, such as an earphone or a headphone, used by a person with normal hearing. Similarly, the 3D rendering processing unit 31 adds the pieces of right sound data generated by the 3D rendering processing P1-1 to P1-N in the directions θ1 to θN by addition processing P2-R to generate one piece of right sound data. The sound data generated by the addition processing P2-R is output from the right sound output unit 71R used by the person with normal hearing.
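A sketch of this normal-hearing reproduction path, reusing render_binaural from the earlier sketch; hrir_set is an assumed mapping from source angle to the pair of individually-optimized HRIRs for that direction.

```python
import numpy as np

def mix(a, b):
    """Sum two signals that may differ in length (zero-padding the shorter)."""
    out = np.zeros(max(len(a), len(b)))
    out[:len(a)] += a
    out[:len(b)] += b
    return out

def render_scene(sources, angles, hrir_set):
    """3D rendering P1-1 ... P1-N followed by addition processing P2-L / P2-R."""
    left_mix, right_mix = np.zeros(1), np.zeros(1)
    for src, theta in zip(sources, angles):
        l, r = render_binaural(src, *hrir_set[theta])  # rendering in direction theta
        left_mix, right_mix = mix(left_mix, l), mix(right_mix, r)
    return left_mix, right_mix                         # one channel per ear
```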
<First Form of Processing of Reproducing Plurality of Pieces of Sound Source Data for Person with Hearing Loss Corresponding to Plurality of Sound Sources>
Similarly to the case described above, it is assumed that a plurality of sound sources 1 to N is arranged at angles θ1 to θN with respect to the user's head in the virtual space, and that a person with hearing loss hears the pieces of sound source data generated by the sound sources 1 to N in 3D audio.
Similarly to the case described above, the 3D rendering processing unit 31 performs the 3D rendering processing P1-1 to P1-N in the directions θ1 to θN on the pieces of sound source data of the sound sources 1 to N, and generates one piece of left sound data and one piece of right sound data by the addition processing P2-L and P2-R, respectively.
The signal processing unit 41L for a person with hearing loss executes processing of the multiband compressor by signal processing P3-L for a person with hearing loss on the left sound data from the addition processing P2-L to generate left sound data for a person with hearing loss. Similarly, the signal processing unit 41R for a person with hearing loss executes processing of the multiband compressor by signal processing P3-R for a person with hearing loss on the right sound data from the addition processing P2-R to generate right sound data for a person with hearing loss. In the signal processing P3-L and P3-R for a person with hearing loss at this time, the value of the parameter adjusted (set) in advance by the method described above is set as the parameter of the signal processing units 41L and 41R for a person with hearing loss.
The signal processing units 41L and 41R for a person with hearing loss output the pieces of sound data generated by the signal processing P3-L and P3-R for a person with hearing loss from the sound output units 42L and 42R, respectively.
Note that, in a case where pieces of sound data in 3D audio of N pieces of sound source data are generated for a person with normal hearing on the basis of sound source data of content with 3D metadata as described above, this first form merely adds the signal processing P3-L and P3-R for a person with hearing loss to that processing for a person with normal hearing.
<Second Form of Processing of Reproducing Plurality of Pieces of Sound Source Data for Person with Hearing Loss Corresponding to Plurality of Sound Sources>
Similarly to the case described above, it is assumed that a plurality of sound sources 1 to N is arranged at angles θ1 to θN with respect to the user's head in the virtual space, and that a person with hearing loss hears the pieces of sound source data generated by the sound sources 1 to N in 3D audio.
In this case, the 3D rendering processing unit 31 and the signal processing units 41L and 41R for a person with hearing loss perform 3D rendering processing P4-1 to P4-N for a person with hearing loss in the directions θ1 to θN on pieces of the sound source data of the sound sources 1 to N, respectively.
The 3D rendering processing P4-1 to P4-N for a person with hearing loss in the directions θ1 to θN will be described focusing on the 3D rendering processing P4-n for a person with hearing loss in the direction θn (n is any one of 1 to N). In the 3D rendering processing P4-n for a person with hearing loss in the direction θn, similarly to the processing described above, the 3D rendering processing unit 31 performs 3D rendering processing in the direction θn on the sound source data of the sound source n by using the head related transfer function corresponding to the angle θn, and generates left sound data and right sound data.
In the 3D rendering processing P4-n for a person with hearing loss in the direction θn, the signal processing units 41L and 41R for a person with hearing loss further execute the processing of the multiband compressor on the left sound data and the right sound data generated by the 3D rendering processing in the direction θn to generate left sound data and right sound data for a person with hearing loss, respectively. At this time, the value of the parameter adjusted (set) in advance is set as the parameter of the signal processing units 41L and 41R for a person with hearing loss. However, since it can be assumed that the appropriate parameter of the signal processing units 41L and 41R for a person with hearing loss differs according to the angle θn of the sound source, the value of the parameter adjusted by a method to be described later is set; the value of the parameter adjusted by the method described above or the like may also be used.
The 3D rendering processing P4-1 to P4-N for a person with hearing loss in the directions θ1 to θN generate left sound data and right sound data for a person with hearing loss for the sound sources at the angles θ1 to θN.
The signal processing unit 41L for a person with hearing loss, or a processing unit (not illustrated) at a subsequent stage, adds the pieces of left sound data for a person with hearing loss generated by the 3D rendering processing P4-1 to P4-N for a person with hearing loss in the directions θ1 to θN by addition processing P5-L to generate one piece of left sound data. Similarly, the signal processing unit 41R for a person with hearing loss, or a processing unit (not illustrated) at a subsequent stage, adds the pieces of right sound data for a person with hearing loss by addition processing P5-R to generate one piece of right sound data. The signal processing units 41L and 41R for a person with hearing loss, or the processing units at the subsequent stage, output the left sound data and the right sound data generated by the addition processing P5-L and P5-R from the sound output units 42L and 42R, respectively.
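For contrast with the first form, the sketch below applies the hearing-loss signal processing per source, with compressor parameters looked up for each source's angle, and only then sums the per-source outputs (P5-L / P5-R). It reuses render_binaural, multiband_compressor, and mix from the earlier sketches; param_set, a mapping from angle to per-ear parameter tuples, is an assumed structure.

```python
import numpy as np

def render_scene_for_hearing_loss(sources, angles, hrir_set, param_set,
                                  fs, edges):
    """Second form: per-source 3D rendering P4-1 ... P4-N with
    angle-dependent hearing-loss processing, then addition P5-L / P5-R."""
    left_mix, right_mix = np.zeros(1), np.zeros(1)
    for src, theta in zip(sources, angles):
        l, r = render_binaural(src, *hrir_set[theta])
        p_left, p_right = param_set[theta]      # parameters for this angle
        l = multiband_compressor(l, fs, edges, *p_left)
        r = multiband_compressor(r, fs, edges, *p_right)
        left_mix, right_mix = mix(left_mix, l), mix(right_mix, r)
    return left_mix, right_mix
```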
<Description of Adjustment of Parameter for Signal Processing for Person with Hearing Loss in Second Form of Processing of Reproducing Plurality of Pieces of Sound Data>
For example, in the signal processing for a person with hearing loss when executing the 3D rendering processing P4-n for a person with hearing loss in the direction θn (n is any one of 1 to N), the signal processing units 41L and 41R for a person with hearing loss acquire the value of the parameter (parameter θn for signal processing for a person with hearing loss) corresponding to the angle θn from the parameter set for signal processing for a person with hearing loss, and execute the signal processing for a person with hearing loss by the multiband compressor having an input and output characteristic corresponding to the acquired value of the parameter.
It is assumed that the directions (angles) of the plurality of sound sources corresponding to the values of the plurality of parameters for signal processing for a person with hearing loss included in the parameter set for signal processing for a person with hearing loss are set at 30-degree intervals from 0 degrees to 330 degrees with respect to the front direction of the user.
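Since parameter values are stored only at these discrete angles, a source at an arbitrary angle needs a selection rule. One simple assumption, sketched below, is to use the circularly nearest stored angle; interpolating between neighboring angles would be an alternative design.

```python
def params_for_angle(param_set, theta):
    """Pick the stored parameter value whose angle is circularly nearest
    to theta (param_set maps angles in degrees to parameter values)."""
    def circular_distance(a):
        d = abs(a - theta) % 360.0
        return min(d, 360.0 - d)
    return param_set[min(param_set, key=circular_distance)]
```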
The parameter controller 52 determines an appropriate value of the parameter for signal processing for a person with hearing loss corresponding to the angle θ of the sound source when generating the parameter set for signal processing for a person with hearing loss. At this time, it is assumed that a test sound source is arranged as a test object sound source S in the direction of an angle θ with respect to the user's head in the virtual space, and test sound source data is generated from the sound source. The 3D rendering processing unit 31 executes 3D rendering processing on the sound source data of the test object sound source S by using the head related transfer function corresponding to the angle θ, and generates left sound data and right sound data in 3D audio. The left sound data and right sound data generated by the 3D rendering processing unit 31 are supplied to the signal processing units 41L and 41R for a person with hearing loss, respectively.
The value of the parameter specified from the parameter controller 52 is set for each of the signal processing units 41L and 41R for a person with hearing loss. The signal processing units 41L and 41R for a person with hearing loss execute signal processing (processing of the multiband compressor) for a person with hearing loss and generate left sound data and right sound data for a person with hearing loss, respectively. The generated left sound data and right sound data are output as sound waves from the sound output units 42L and 42R, respectively.
On the basis of input information from the user interface unit 51, the parameter controller 52 judges whether or not the value of the parameter (input and output characteristic of the multiband compressor) corresponding to the angle θ currently set for the signal processing units 41L and 41R for a person with hearing loss is appropriate, and adjusts the value until it is. In a case where an appropriate value of the parameter is obtained, the parameter controller 52 stores the value in the storage unit (not illustrated) as the value of the parameter corresponding to the angle θ. By changing the angle θ, acquiring an appropriate value of the parameter for each angle θ, and storing the values in the storage unit, the parameter controller 52 generates the parameter set for signal processing for a person with hearing loss.
In a case where a plurality of pieces of sound source data of content with 3D metadata is reproduced, 3D rendering processing for a person with hearing loss in the direction θ is executed correspondingly to the angle θ of each of the sound sources. In the 3D rendering processing for a person with hearing loss in the direction θ, the head related transfer function corresponding to the angle θ is supplied from the individually-optimized HRTF data set to the 3D rendering processing unit 31, and 3D rendering processing is executed on the sound source data of the sound source at the angle θ. In the 3D rendering processing for a person with hearing loss in the direction θ, the value of the parameter corresponding to the angle θ is supplied from the parameter set for signal processing for a person with hearing loss to the signal processing units 41L and 41R for a person with hearing loss, and signal processing for a person with hearing loss is executed.
In step S41, the parameter controller 52 sets the angle θ of the sound source to 0 degrees as an initial value. The processing proceeds from step S41 to step S42.
In step S42, the parameter controller 52 causes the 3D rendering processing unit 31 and the signal processing units 41L and 41R for a person with hearing loss to execute 3D rendering processing for a person with hearing loss in the direction θ on the test sound source (test object sound source S) at the angle θ. As a result, left sound data and right sound data for a person with hearing loss are generated. Note that, in the 3D rendering processing for a person with hearing loss in the direction θ, the 3D rendering processing unit 31 uses the head related transfer function corresponding to the angle θ in the individually-optimized HRTF data set. The signal processing units 41L and 41R for a person with hearing loss use the initial value of the parameter for signal processing for a person with hearing loss corresponding to the angle θ in the parameter set for signal processing for a person with hearing loss. The initial value of the parameter for signal processing for a person with hearing loss may be the value of the parameter of the hearing aid usually used by the user, may be the value of the parameter adjusted for another user, or may be another value. Left sound data and right sound data generated by the 3D rendering processing for a person with hearing loss in the direction θ are output from the sound output units 42L and 42R for a person with hearing loss, respectively, and are presented to the user. The processing proceeds from step S42 to step S43.
In step S43, the parameter controller 52 judges whether or not the angle (sound image localization) of the sound source perceived by the user is appropriate on the basis of input information from the user interface unit 51. Specifically, in a case where the angle difference between the angle at which the sound source is arranged with respect to the user's head in the virtual space and the angle of the sound source input by the user from the user interface unit 51 is equal to or smaller than a predetermined threshold, the parameter controller 52 judges that the sound image localization is appropriate, and in a case where the angle difference is larger than the threshold, the parameter controller 52 judges that the sound image localization is not appropriate.
In a case where it is judged in step S43 that the sound image localization is not appropriate, the processing proceeds to step S44, and the parameter controller 52 adjusts the parameter for signal processing for a person with hearing loss by the band-wise method described above. The processing proceeds from step S44 to step S45.
In step S45, the parameter controller 52 judges whether or not to perform readjustment on the basis of input information from the user interface unit 51. Note that whether or not to perform readjustment may be specified by the user using the user interface unit 51 or may be forcibly performed by the parameter controller 52.
In a case where it is judged in step S45 that readjustment is to be performed, the processing returns to step S42 and repeats from step S42. In a case where it is judged not to perform readjustment in step S45, the processing proceeds to step S46.
In a case where it is judged in step S43 that the sound image localization is appropriate, the processing proceeds to step S46. In step S46, the parameter controller 52 updates the angle θ of the sound source to a value obtained by adding 30 degrees to the current value. The processing proceeds from step S46 to step S47. In step S47, the parameter controller 52 judges whether or not the angle θ is less than 360 degrees.
In a case where it is judged in step S47 that the angle θ is less than 360 degrees, the processing returns to step S42 and repeats from step S42. In a case where it is judged in step S47 that the angle θ is not less than 360 degrees, the processing terminates the process flow of the present flowchart.
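Putting steps S41 to S47 together: the sketch below sweeps the source angle in 30-degree steps and, for each angle at which localization is judged inappropriate, runs the band-wise adjustment from the earlier sketch (angle_error and adjust_parameters). present_and_ask and user_wants_readjustment are hypothetical interaction stubs, and the 15-degree default remains an assumed threshold.

```python
def present_and_ask(theta, params):
    """Hypothetical step S42: render/output the test source at angle theta
    with the current parameters and return the user's perceived angle
    (None if nothing was heard)."""
    return theta   # placeholder so the sketch terminates when run as-is

def user_wants_readjustment():
    return False   # placeholder for the user's choice in step S45

def build_param_set(bands, initial_params, threshold_deg=15.0):
    param_set = {}
    theta = 0.0                                          # step S41
    while theta < 360.0:                                 # step S47
        params = dict(initial_params)
        while True:
            heard = present_and_ask(theta, params)       # steps S42, S43
            if heard is not None and angle_error(heard, theta) <= threshold_deg:
                break                                    # localization appropriate
            adjust_parameters(bands, params, theta, threshold_deg)  # step S44
            if not user_wants_readjustment():            # step S45
                break
        param_set[theta] = params
        theta += 30.0                                    # step S46
    return param_set
```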
<Description of Procedure of Second Form of Parameter Adjustment in Second Form of Processing of Reproducing Plurality of Pieces of Sound Data>
In step S61, the parameter controller 52 sets, as the initial angle set S, the angles θ of the sound source at 30-degree intervals from 0 degrees to 330 degrees. The parameter controller 52 then selects, from the angle set S, any one angle θ for which the parameter for signal processing for a person with hearing loss is unadjusted. The processing proceeds from step S61 to step S62.
In step S62, the parameter controller 52 causes the 3D rendering processing unit 31 and the signal processing units 41L and 41R for a person with hearing loss to execute 3D rendering processing for a person with hearing loss in the direction θ on the test sound source data generated from a sound source arranged at the angle θ selected in step S61. The generated left sound data and right sound data for a person with hearing loss are output from the sound output units 42L and 42R, respectively, and are presented to the user. The processing proceeds from step S62 to step S63.
In step S63, the parameter controller 52 judges whether or not the angle (sound image localization) of the sound source perceived by the user is appropriate on the basis of input information from the user interface unit 51.
In a case where it is judged in step S63 that the sound image localization is not appropriate, the processing proceeds to step S64, and the parameter controller 52 adjusts the parameter for signal processing for a person with hearing loss corresponding to the angle θ by the band-wise method described above. The processing proceeds from step S64 to step S65.
In step S65, the parameter controller 52 judges whether or not to perform readjustment on the basis of input information from the user interface unit 51. Note that whether or not to perform readjustment may be specified by the user using the user interface unit 51 or may be forcibly performed by the parameter controller 52.
In a case where it is judged in step S65 that readjustment is to be performed, the processing returns to step S62 and repeats from step S62. In a case where it is judged in step S65 not to perform readjustment, the processing proceeds to step S67.
In a case where it is judged in step S63 that the sound image localization is appropriate, the processing proceeds to step S66, and the parameter controller 52 removes the angle θ from the angle set S. The processing proceeds from step S66 to step S67.
In step S67, the parameter controller 52 judges whether or not to terminate the processing. That is, the parameter controller 52 judges not to terminate the processing in a case where there is an angle for which the parameter for signal processing for a person with hearing loss is not adjusted in the angle set S, and judges to terminate the processing in a case where there is no angle for which the parameter for signal processing for a person with hearing loss is not adjusted.
In a case where it is judged in step S67 not to terminate the processing, the processing returns to step S61 and repeats from step S61. In a case where it is judged in step S67 to terminate the processing, the processing terminates the process flow of the present flowchart.
In step S82, the parameter controller 52 executes the processing illustrated in the flowchart described above (steps S61 to S67). The processing proceeds from step S82 to step S83.
In step S83, the parameter controller 52 stores the angle set S in the storage unit (not illustrated). After step S83 is executed, the process flow of this flowchart is terminated.
In step S101, the parameter controller 52 reads the angle set S stored in step S83 described above. The processing proceeds from step S101 to step S102.
In step S102, the parameter controller 52 executes the procedure of steps S61 to S67 described above on the read angle set S. The processing proceeds from step S102 to step S103.
In step S103, the parameter controller 52 stores the angle set S in the storage unit (not illustrated). After step S103 is executed, the process flow of this flowchart is terminated.
The above-described series of processing in the information processing system 1 can be executed by hardware or can be executed by software. In a case where the series of processing is executed by the software, a program constituting the software is installed on a computer. Here, examples of the computer include a computer incorporated in dedicated hardware, and a general-purpose personal computer capable of executing various functions by installing various programs, for example.
In the computer, a central processing unit (CPU) 201, a read only memory (ROM) 202, and a random access memory (RAM) 203 are mutually connected by a bus 204.
An input/output interface 205 is further connected to the bus 204. The input/output interface 205 is connected to an input unit 206, an output unit 207, a storage unit 208, a communication unit 209, and a drive 210.
The input unit 206 includes a keyboard, a mouse, a microphone, and the like. The output unit 207 includes a display, a speaker, and the like. The storage unit 208 includes a hard disk, a non-volatile memory and the like. The communication unit 209 includes a network interface, and the like. The drive 210 drives a removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
In the computer configured as described above, for example, the CPU 201 loads the program stored in the storage unit 208 into the RAM 203 via the input/output interface 205 and the bus 204 and executes the program, thereby performing the above-described series of processing.
The program executed by the computer (CPU 201) can be provided by being recorded on the removable medium 211 as a package medium or the like, for example. Furthermore, the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital broadcasting.
In the computer, the program can be installed in the storage unit 208 via the input/output interface 205 by loading the removable medium 211 in the drive 210. Furthermore, the program can be received by the communication unit 209 via a wired or wireless transmission medium and installed on the storage unit 208. Additionally, the program may be installed in advance on the ROM 202 and the storage unit 208.
Note that the program executed by the computer may be a program that performs processing in a time-series manner in the order described in the present description, or may be a program that performs processing in parallel or at necessary timing such as when a call is made.
The present technology can also have the following configurations.
(1)
An information processing device including:
The information processing device according to (1), in which
The information processing device according to (2), in which
The information processing device according to any one of (1) to (3), in which
The information processing device according to (4), in which
The information processing device according to (4) or (5), in which
The information processing device according to any one of (1) to (6) further including
The information processing device according to (7), in which
The information processing device according to (7) or (8), in which
The information processing device according to any one of (1) to (9), in which
The information processing device according to any one of (1) to (9), in which
The information processing device according to any one of (8) to (11) further including
The information processing device according to any one of (1) to (12), in which
An information processing method for an information processing device including a rendering processing unit and a signal processing unit, the information processing method including:
by the rendering processing unit, generating stereophonic sound data having sound image localization on the basis of a direction of a sound source arranged in a virtual space; and
A program causing a computer to function as:
Number | Date | Country | Kind |
---|---|---|---|
2021-152892 | Sep 2021 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2022/011325 | 3/14/2022 | WO |