1. Technical Field
The present invention relates to an audio data processing apparatus, an audio apparatus, an audio data processing method, a program, and a recording medium recording this program.
2. Description of Related Art
In recent years, research on audio systems employing the basic principles of wave field synthesis (WFS) has been actively carried out in Europe and other regions (for example, see Non-patent Document 1: A. J. Berkhout, D. de Vries, and P. Vogel, "Acoustic control by wave field synthesis," The Journal of the Acoustical Society of America, Vol. 93, Issue 5, May 1993, pp. 2764-2778). WFS is a technique in which the wavefront of sound emitted from a plurality of speakers arranged in the shape of an array (referred to as a "speaker array" hereinafter) is synthesized on the basis of Huygens' principle.
A listener who listens to sound in front of a speaker array in the sound space provided by WFS perceives the sound actually emitted from the speaker array as if it were emitted from a sound source virtually located behind the speaker array (referred to as a "virtual sound source" hereinafter) (for example, see Non-patent Document 1).
Systems to which WFS is applicable include movies, audio systems, televisions, AV racks, video conference systems, and video games. For example, in a case where the digital contents are a movie, the voice of each actor is recorded on a medium in the form of a virtual sound source. Thus, when an actor who is speaking moves within the screen space, the virtual sound source can be located left, right, backward, forward, or in an arbitrary direction within the screen space in accordance with the direction of the actor's movement. For example, Patent Document 1 (Japanese Unexamined Patent Application Publication No. 2007-502590) describes a system achieving such movement of a virtual sound source.
In a physical phenomenon known as the Doppler effect, the frequency of sound waves is observed at different values depending on the relative velocity between the sound source generating the waves and a listener. According to the Doppler effect, when the sound source approaches the listener, the sound waves are compressed and hence the observed frequency becomes higher. On the contrary, when the sound source departs from the listener, the sound waves are stretched and hence the observed frequency becomes lower. In either case, the total number of waves reaching the listener from the sound source does not change even when the sound source moves.
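As a concrete numerical illustration of the relationship described above, the following Python sketch (not part of the original disclosure; the speed of sound and the test values are assumptions) computes the frequency observed by a stationary listener when the source moves directly toward or away from the listener.

```python
# Minimal sketch of the Doppler effect for a sound source moving along
# the line to a stationary listener. c = 340 m/s and the test values
# below are assumptions for illustration only.
def observed_frequency(f_source, v_source, c=340.0):
    """v_source > 0: source approaching the listener;
    v_source < 0: source departing from the listener."""
    return f_source * c / (c - v_source)

print(observed_frequency(1000.0, 10.0))   # approaching: ~1030.3 Hz (higher)
print(observed_frequency(1000.0, -10.0))  # departing:   ~971.4 Hz (lower)
```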
Nevertheless, the technique described in Non-patent Document 1 is premised on a virtual sound source that is fixed and not moving, so the Doppler effect occurring in association with movement of the virtual sound source is not taken into consideration. When the virtual sound source moves in a direction departing from or approaching the speaker, the number of waves in the audio signal providing the basis of the sound generated by the speaker changes, and this change causes distortion in the waveform. The listener perceives such distortion as noise, so means for resolving the waveform distortion needs to be provided. Details of the waveform distortion are described later.
On the other hand, in the method described in Patent Document 1, taking into consideration the Doppler effect generated in association with the movement of the virtual sound source, a weight coefficient is varied over the audio data in a range from suitable sample data within a particular segment of the audio data providing the basis of the audio signal to suitable sample data in the next segment, so that the audio data in that range is corrected. Here, a "segment" is the unit of processing of audio data. When the audio data is corrected in this manner, extreme distortion in the audio signal waveform is resolved to some extent and hence noise caused by the waveform distortion is reduced.
Nevertheless, in the method described in Patent Document 1, in order to correct the audio data of the present segment, the sound wave propagation time for the audio data of the next segment needs to be calculated in advance. That is, until the calculation of the sound wave propagation time for the next segment is completed, correction of the audio data of the present segment is not achievable. Thus, a problem arises in that a delay corresponding to one segment occurs in the output of the audio data of the present segment.
The present invention has been devised in view of this problem. An object of the present invention is to provide an audio data processing apparatus and the like that identify a distorted part in audio data and then correct the identified waveform distortion, outputting the audio data without the occurrence of the above-mentioned delay.
The audio data processing apparatus according to the present invention is an audio data processing apparatus that receives audio data corresponding to sound generated by a moving virtual sound source, a position of the virtual sound source, and a position of a speaker emitting sound on the basis of the audio data and that corrects the audio data on the basis of the position of the virtual sound source and the position of the speaker, the apparatus comprising: calculating means calculating first and second distances measured at two time points from the position of the speaker to the position of the virtual sound source; identifying means, when the first and the second distances are different from each other, identifying a distorted part in the audio data at the two time points; and correcting means correcting the audio data of the identified part by interpolation using a function.
In the audio data processing apparatus according to the present invention, the audio data contains sample data, the identifying means identifies a repeated part and a lost part of the sample data caused by departing and approaching of the virtual sound source relative to the speaker, and the correcting means corrects the repeated part and the lost part having been identified, by interpolation using a function.
In the audio data processing apparatus according to the present invention, the interpolation using a function is linear interpolation.
In the audio data processing apparatus according to the present invention, the part to be processed by the correction has a time width equal to the difference between the sound wave propagation times over the first and the second distances, or a time width proportional to that difference.
The audio apparatus according to the present invention is an audio apparatus that uses audio data corresponding to sound generated by a moving virtual sound source, a position of the virtual sound source, and a position of a speaker emitting sound on the basis of the audio data and that thereby corrects the audio data on the basis of the position of the virtual sound source and the position of the speaker, the apparatus comprising: a digital contents input part receiving digital contents containing the audio data and the position of the virtual sound source; a contents information separating part analyzing the digital contents received by the digital contents input part and separating audio data and position data of the virtual sound source contained in the digital contents; an audio data processing part, on the basis of the position data of the virtual sound source separated by the contents information separating part and position data of the speaker, correcting the audio data separated by the contents information separating part; and an audio signal generating part converting the corrected audio data into an audio signal and then outputting the obtained signal to the speaker, wherein the audio data processing part includes: calculating means calculating first and second distances measured at two time points from the position of the speaker to the position of the virtual sound source; identifying means, when the first and the second distances are different from each other, identifying a distorted part in the audio data at the two time points; and correcting means correcting the audio data of the identified part by interpolation using a function.
In the audio apparatus according to the present invention, the digital contents input part receives digital contents from a recording medium storing digital contents, a server distributing digital contents through a network, or a broadcasting station broadcasting digital contents.
The audio data processing method according to the present invention is an audio data processing method employed in an audio data processing apparatus that receives audio data corresponding to sound generated by a moving virtual sound source, a position of the virtual sound source, and a position of a speaker emitting sound on the basis of the audio data and that corrects the audio data on the basis of the position of the virtual sound source and the position of the speaker, the method comprising: a step of calculating first and second distances measured at two time points from the position of the speaker to the position of the virtual sound source; a step of, when the first and the second distances are different from each other, identifying a distorted part in the audio data at the two time points; and a step of correcting the audio data of the identified part by interpolation using a function.
The program according to the present invention is a program, on the basis of a position of a virtual sound source formed by sound emitted from a speaker receiving an audio signal corresponding to audio data and on the basis of a position of the speaker, correcting the audio data corresponding to sound emitted from the moving virtual sound source, the program causing a computer to execute: a step of calculating first and second distances measured at two time points from the position of the speaker to the position of the virtual sound source; a step of, when the first and the second distances are different from each other, identifying a distorted part in the audio data at the two time points; and a step of correcting the audio data of the identified part by interpolation using a function.
The recording medium according to the present invention records the above-mentioned program.
In the audio data processing apparatus according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
In the audio data processing apparatus according to the present invention, a repeated part and a lost part of the sample data caused by the departing and approaching of the virtual sound source relative to the speaker are identified. Then, the correcting means corrects the identified repeated part and lost part by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
In the audio data processing apparatus according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by linear interpolation. Thus, the audio data is corrected and outputted without delay.
In the audio apparatus according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
In the audio data processing method according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
In the program according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
In the recording medium recording the program according to the present invention, the part of waveform distortion is identified according to whether the virtual sound source is approaching or departing relative to the speaker. Then, the identified waveform distortion is corrected by interpolation using a function. Thus, the audio data is corrected and outputted without delay.
According to the audio data processing apparatus and the like of the present invention, distortion of the audio data caused by the approaching or departing of the virtual sound source relative to the speaker can be corrected without delay, and the corrected audio data can then be outputted.
The above and further objects and features will more fully be apparent from the following detailed description with accompanying drawings.
First, description is given for: a calculation model assuming that the virtual sound source does not move in the sound space provided by WFS; and a calculation model taking into consideration the movement of the virtual sound source. Then, an embodiment is described.
FIG. 2 shows explanatory diagrams generally describing audio signals. When an audio signal is treated theoretically, it is in general expressed as a continuous signal S(t).
The calculation model that does not take movement of the virtual sound source 101 into consideration is as follows. In this calculation model, the audio signal provided to the speaker array 103 is generated by using the following Equations (1) to (4).
In the present calculation model, sample data at discrete time t is generated for the audio signal provided to the m-th speaker (referred to as the "speaker 103_m" hereinafter) contained in the speaker array 103. Here, as illustrated in
Here,
qn(t) is sample data at discrete time t of the sound wave emitted from the n-th virtual sound source (referred to as the "virtual sound source 101_n" hereinafter) among the N virtual sound sources 101 and then having reached the speaker 103_m, and
lm(t) is sample data at discrete time t of the audio signal provided to the speaker 103_m.
qn(t)=Gn·sn(t−τmn)  (2)
Here,
Gn is a gain coefficient for the virtual sound source 101_n,
sn(t) is sample data at discrete time t of the audio signal provided to the virtual sound source 101_n, and
τmn is the number of samples corresponding to the sound wave propagation time over the distance between the position of the virtual sound source 101_n and the position of the speaker 103_m.
Here,
w is a weight constant,
rn is the position vector (a fixed value) of the virtual sound source 101_n, and
rm is the position vector (a fixed value) of the speaker 103_m.
└ ┘ is a floor symbol,
R is the sampling rate, and
c is the speed of sound in air.
Here, the floor symbol denotes the maximum integer not exceeding a given value.
As seen from Equations (3) and (4), in the present calculation model, the gain coefficient Gn for the virtual sound source 101_n is inversely proportional to the square root of the distance from the virtual sound source 101_n to the speaker 103_m. This is because the set of speakers 103_m is modeled as a line sound source. On the other hand, the sound wave propagation time τmn is proportional to the distance from the virtual sound source 101_n to the speaker 103_m.
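Equations (3) and (4) themselves are not reproduced in the text above, but the relationships they express are stated: the gain falls off with the square root of the source-to-speaker distance, and the propagation time in samples is the floored product of the sampling rate and the distance over the speed of sound. A minimal Python sketch consistent with that description (the exact form of the equations, and the constants R = 48000 Hz and c = 340 m/s, are assumptions) might read:

```python
import math

def gain_coefficient(r_n, r_m, w=1.0):
    # Gn = w / sqrt(|rn - rm|): inversely proportional to the square root
    # of the source-to-speaker distance, with w the weight constant.
    return w / math.sqrt(math.dist(r_n, r_m))

def propagation_samples(r_n, r_m, R=48000, c=340.0):
    # τmn = floor(R * |rn - rm| / c): the number of samples corresponding
    # to the sound wave propagation time over the distance |rn - rm|.
    return math.floor(R * math.dist(r_n, r_m) / c)

print(gain_coefficient((0.0, 2.0), (0.0, 0.0)))    # ~0.707 for a 2 m distance
print(propagation_samples((0.0, 2.0), (0.0, 0.0))) # floor(48000*2/340) = 282
```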
Equations (1) to (4) are premised on the virtual sound source 101_n standing still at a particular position and not moving. Nevertheless, in the real world, persons speak while walking and automobiles run while generating engine sound. That is, a real sound source stands still in some cases and moves in others. Thus, in order to treat such cases, a new calculation model (the calculation model according to Embodiment 1) is introduced which takes into consideration a moving sound source. This new calculation model is described below.
When a situation in which the virtual sound source 101_n moves is taken into consideration, Equations (2) to (4) are replaced by Equations (5) to (7) given below.
qn(t)=Gn,t·sn(t−τmn,t)  (5)
Here,
Gn,t is a gain coefficient for the virtual sound source 101_n at discrete time t, and
τmn,t is the number of samples corresponding to the sound wave propagation time corresponding to the distance between the virtual sound source 101_n and the speaker 103_m at discrete time t.
Here,
rn,t is the position vector of the virtual sound source 101_n at discrete time t.
Since the virtual sound source 101_n moves, as seen from Equations (5) to (7), the gain coefficient for the virtual sound source 101_n, the position of the virtual sound source 101_n, and the sound wave propagation time all vary as functions of discrete time t.
In general, signal processing on audio data is performed segment by segment. A "segment" is the unit of processing of audio data and is also referred to as a "frame". For example, one segment is composed of 256 or 512 pieces of sample data. Thus, lm(t) (the sample data at discrete time t of the audio signal provided to the speaker 103_m) in Equation (1) is calculated in units of segments. In the present calculation model, the segment of audio data calculated at discrete time t and used for generating the audio signal provided to the speaker 103_m is expressed by a vector Lm,t. Here, Lm,t is vector data constructed from the "a" pieces of sample data (such as 256 or 512 pieces) contained in one segment extending from discrete time t−a+1 to discrete time t. Lm,t is expressed by Equation (8).
Lm,t=(lm(t−a+1), lm(t−a+2), . . . , lm(t))  (8)
Thus, for example, Lm,t0 at discrete time t0 is expressed by
Lm,t0=(lm(t0−a+1), lm(t0−a+2), . . . , lm(t0))
When this Lm,t0 is obtained, Lm,(t0+a) is then calculated.
Lm,(t0+a) is expressed by
Lm,(t0+a)=(lm(t0+1), lm(t0+2), . . . , lm(t0+a))
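As a sketch of the segment notation of Equation (8), the following shows how Lm,t0 and the succeeding segment Lm,(t0+a) would be sliced out of a sample stream; the segment length a = 256 and the stand-in stream are assumptions for illustration.

```python
a = 256  # samples per segment (256 or 512 in the text)

def segment(l_m, t):
    # Lm,t = (lm(t-a+1), lm(t-a+2), ..., lm(t)):
    # the a samples ending at discrete time t.
    return l_m[t - a + 1 : t + 1]

l_m = list(range(1024))             # stand-in for the sample stream lm(t)
t0 = 511
current_segment = segment(l_m, t0)       # lm(256) ... lm(511) = Lm,t0
next_segment = segment(l_m, t0 + a)      # lm(512) ... lm(767) = Lm,(t0+a)
```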
Since the audio data is processed segment by segment, it is practical that rn,t also be calculated segment by segment. However, the update frequency of rn need not necessarily agree with the segment unit. Comparison between the virtual sound source position rn,t0 at discrete time t0 and the virtual sound source position rn,(t0−a) at discrete time (t0−a) shows that the virtual sound source position varies by the distance that the virtual sound source 101_n has moved relative to the speaker 103_m between discrete time (t0−a) and discrete time t0. The following description treats: a case in which the virtual sound source 101_n moves in a direction departing from the speaker 103_m; and a case in which the virtual sound source 101_n moves in a direction approaching the speaker 103_m.
Gn,t and τmn,t also vary in correspondence with the distance that the virtual sound source 101_n moves between discrete time (t0−a) and discrete time t0. Equations (9) and (10) express the resulting amount of variation in the gain coefficient and in the number of samples corresponding to the sound wave propagation time. For example, ΔGn,t0 expresses the amount of variation of the gain coefficient at discrete time t0, and Δτmn,t0 expresses the amount of variation (also referred to as a "time width") of the number of samples corresponding to the sound wave propagation time at discrete time t0 relative to that at discrete time (t0−a). When the virtual sound source moves between discrete time (t0−a) and discrete time t0, these amounts of variation take either a positive or a negative value depending on the direction of movement of the virtual sound source 101_n.
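Equations (9) and (10) are referenced but not reproduced above; from the description they amount to differences of the per-segment values. A hedged sketch follows, in which the difference form is an assumption and the sign convention is inferred from the surrounding text (a departing source, whose propagation time grows, yields a positive Δτ).

```python
def variation_amounts(G, tau, t0, a):
    # G and tau are mappings from discrete time to the gain coefficient
    # Gn,t and the propagation-time sample count τmn,t, respectively.
    delta_G = G[t0] - G[t0 - a]        # ΔGn,t0, assumed form of Eq. (9)
    delta_tau = tau[t0] - tau[t0 - a]  # Δτmn,t0, assumed form of Eq. (10)
    return delta_G, delta_tau          # delta_tau > 0: departing; < 0: approaching
```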
When the virtual sound source 101_n is departing from or approaching the speaker 103_m, ΔGn,t0 and the time width Δτmn,t0 arise and hence waveform distortion occurs at discrete time t0. Here, "waveform distortion" indicates a state in which the audio signal waveform does not vary continuously but varies discontinuously, to an extent that the listener perceives the part as noise.
For example, when the virtual sound source 101_n moves in a direction departing from the speaker 103_m so that the sound wave propagation time increases, that is, when the time width Δτmn,t0 is positive, the audio data of the final part of the preceding segment appears again, for the time width Δτmn,t0, in the beginning part of the segment starting at discrete time t0. In the following description, the segment preceding the segment starting at discrete time t0 is referred to as the first segment, and the segment starting at discrete time t0 is referred to as the second segment. When audio data appears repeatedly in this manner, distortion occurs in the waveform.
On the other hand, when the virtual sound source 101_n moves in a direction approaching the speaker 103_m so that the sound wave propagation time decreases, that is, when the time width Δτmn,t0 is negative, a loss of the time width Δτmn,t0 is generated between the audio data of the final part of the first segment and the audio data of the beginning part of the second segment. As a result, a discontinuity arises in the audio signal waveform, which is also waveform distortion. Detailed examples of waveform distortion are described below with reference to the drawings.
First, description is given for a case in which the virtual sound source 101_n moves in a direction departing from the speaker 103_m so that the sound wave propagation time corresponding to the distance between the position of the virtual sound source 101_n and the position of the speaker 103_m increases, that is, a case in which the time width Δτmn,t0 is positive.
Description is given next for the contrary case in which the virtual sound source 101_n moves in a direction approaching the speaker 103_m so that the sound wave propagation time decreases, that is, a case in which the time width Δτmn,t0 is negative.
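To make the two cases concrete, the following sketch classifies the distorted part at the boundary of the first and second segments from Δτ; the function name and return convention are hypothetical.

```python
def identify_distorted_part(delta_tau):
    # Classify the boundary distortion at the start of the second segment.
    if delta_tau > 0:
        # Departing source: the last Δτ samples of the first segment
        # reappear at the head of the second segment (repeated part).
        return ("repeated", delta_tau)
    if delta_tau < 0:
        # Approaching source: |Δτ| samples are skipped between the two
        # segments (lost part), leaving a discontinuity.
        return ("lost", -delta_tau)
    return ("none", 0)  # source did not move relative to the speaker
```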
The reason why waveform distortion is generated when the virtual sound source 101_n moves has been described above. Next, Embodiment 1, in which audio data is corrected so that the waveform distortion is resolved, is described in detail with reference to the drawings.
From a recording medium 1117 storing digital contents (such as movies, computer games, and music videos), the reproducing part 1109 reads the appropriate digital contents and then outputs the contents to the contents information separating part 1102. The recording medium 1117 is a CD-R (Compact Disc Recordable), a DVD (Digital Versatile Disc), a Blu-ray Disc (registered trademark), or the like. In the digital contents, a plurality of audio data files respectively corresponding to the virtual sound sources 101_1 to 101_N and virtual sound source position data corresponding to the virtual sound sources 101_1 to 101_N are recorded in correspondence with each other.
The communication interface part 1110 acquires digital contents from a server 1115 distributing digital contents via a communication network such as the Internet 1114, and then outputs the acquired contents to the contents information separating part 1102. Further, the communication interface part 1110 is provided with devices (not illustrated) such as an antenna and a tuner, receives a program broadcast from a broadcasting station 1116, and then outputs the received program as digital contents to the contents information separating part 1102.
The contents information separating part 1102 acquires digital contents from the reproducing part 1109 or the communication interface part 1110, and then analyzes the digital contents so as to separate the audio data and the virtual sound source position data from the digital contents. Then, the contents information separating part 1102 outputs the audio data and the virtual sound source position data obtained by the separation to the audio data storing part 1103 and the virtual sound source position data storing part 1104, respectively. For example, when the digital contents are a music video, the virtual sound source position data is position data corresponding to the relative positions of a singer and a plurality of musical instruments displayed on the video screen. The virtual sound source position data is stored in the digital contents together with the audio data.
The audio data storing part 1103 stores the audio data acquired from the contents information separating part 1102, and the virtual sound source position data storing part 1104 stores the virtual sound source position data acquired from the contents information separating part 1102. The speaker position data storing part 1106 acquires from the speaker position data input part 1105 the speaker position data specifying the positions within the sound space of the speakers 103_1 to 103_M of the speaker array 103, and then stores the acquired data. The speaker position data is information set up by the user on the basis of the positions of the speakers 103_1 to 103_M constituting the speaker array 103. For example, this information is expressed with reference to coordinates in one plane (an X-Y coordinate system) fixed to the audio apparatus 1100 within the sound space. The user operates the speaker position data input part 1105 so as to store the speaker position data into the speaker position data storing part 1106. In a case that the arrangement of the speaker array 103 is determined in advance by a constraint on the practical mounting, the speaker position data is set up as fixed values. On the other hand, in a case that the user is allowed to determine the arrangement of the speaker array 103 arbitrarily to a certain extent, the speaker position data is set up as variable values.
The audio data processing part 1101 reads from the audio data storing part 1103 the audio data files corresponding to the virtual sound sources 101_1 to 101_N. Further, the audio data processing part 1101 reads from the virtual sound source position data storing part 1104 the virtual sound source position data corresponding to the virtual sound sources 101_1 to 101_N, and reads from the speaker position data storing part 1106 the speaker position data corresponding to the speakers 103_1 to 103_M of the speaker array 103. On the basis of the virtual sound source position data and the speaker position data having been read, the audio data processing part 1101 performs the processing according to the present embodiment on the read-out audio data. That is, the audio data processing part 1101 performs arithmetic processing on the basis of the above-mentioned calculation model in which the movement of the virtual sound sources 101_1 to 101_N is taken into consideration, so as to generate the audio data used for forming the audio signals to be provided to the speakers 103_1 to 103_M. The audio data generated by the audio data processing part 1101 is outputted as audio signals through the D/A conversion part 1107, and then outputted through the amplifiers 1108_1 to 1108_M to the speakers 103_1 to 103_M. On the basis of these audio signals, the speaker array 103 generates sound and emits it into the sound space.
The distance data calculating part 1201 acquires the virtual sound source position data and the speaker position data from the virtual sound source position data storing part 1104 and the speaker position data storing part 1106, respectively. On the basis of these data, it calculates the distance data (|rn,t−rm|) between the virtual sound source 101_n and each of the speakers 103_1 to 103_M, and outputs the calculated data to the sound wave propagation time data calculating part 1202 and the gain coefficient data calculating part 1204. On the basis of the distance data (|rn,t−rm|) acquired from the distance data calculating part 1201, the sound wave propagation time data calculating part 1202 calculates the sound wave propagation time data (the number of samples corresponding to the sound wave propagation time) τmn,t (see Equation (7)). The sound wave propagation time data buffer 1203 acquires the sound wave propagation time data τmn,t from the sound wave propagation time data calculating part 1202, and then temporarily stores the sound wave propagation time data corresponding to plural segments. On the basis of the distance data (|rn,t−rm|) acquired from the distance data calculating part 1201, the gain coefficient data calculating part 1204 calculates the gain coefficient data Gn,t (see Equation (6)).
The input audio data buffer 1206 acquires from the audio data storing part 1103 the input audio data corresponding to the virtual sound source 101_n, and then temporarily stores the input audio data corresponding to plural segments. For example, one segment is composed of 256 or 512 pieces of audio data. Using the sound wave propagation time data τmn,t calculated by the sound wave propagation time data calculating part 1202 and the gain coefficient data Gn,t calculated by the gain coefficient data calculating part 1204, the output audio data generating part 1207 generates output audio data corresponding to the input audio data temporarily stored in the input audio data buffer 1206. The output audio data superposing part 1208 synthesizes the output audio data generated by the output audio data generating part 1207, in accordance with the number of virtual sound sources 101_n.
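As a sketch of what the output audio data generating part computes for one segment per Equation (5), qn(t) = Gn,t·sn(t−τmn,t); buffering and boundary handling are omitted, the function name is hypothetical, and the indices into the input stream are assumed valid.

```python
def generate_output_segment(s_n, G, tau, t_start, a):
    # s_n: input audio samples sn(t) for the virtual sound source;
    # G, tau: per-time gain Gn,t and propagation-time samples τmn,t.
    # Returns the a output samples qn(t) for t in [t_start, t_start + a).
    return [G[t] * s_n[t - tau[t]] for t in range(t_start, t_start + a)]
```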
With reference to
The distance data calculating part 1201 calculates the distance data (|r1,t1−r1|) expressing the distance at discrete time t1 between the first virtual sound source (referred to as the "virtual sound source 101_1" hereinafter) and the first speaker (referred to as the "speaker 103_1" hereinafter), and then outputs the calculated data to the sound wave propagation time data calculating part 1202 and the gain coefficient data calculating part 1204.
Using Equation (7), on the basis of the distance data (|r1,t1−r1|) acquired from the distance data calculating part 1201, the sound wave propagation time data calculating part 1202 calculates the sound wave propagation time data τ11,t1 and then outputs the calculated data to the sound wave propagation time data buffer 1203.
The sound wave propagation time data buffer 1203 stores the sound wave propagation time data τ11,t1 acquired from the sound wave propagation time data calculating part 1202. With reference to
Using Equation (6), on the basis of the distance data (|r1,t1−r1|) acquired from the distance data calculating part 1201, the gain coefficient data calculating part 1204 calculates the gain coefficient data G1,t1.
Using the newer sound wave propagation time data stored in the sound wave propagation time data buffer 1203 and the gain coefficient data calculated by the gain coefficient data calculating part 1204, the output audio data generating part 1207 generates output audio data.
In a case that the virtual sound source 101_n is departing from the speaker 103_m between discrete time (t1−a) and discrete time (t1−1), waveform distortion as illustrated in
First, the correction interval width is set to 5, which is equal to the time width Δτmn,t1. The output audio data buffer 1209 already stores the sample data 312 at the last discrete time (t1−1) of the preceding segment. In Embodiment 1, for the purpose of resolving the waveform distortion illustrated in
At the time of correcting the waveform distortion near discrete time t1, it is sufficient that the sound wave propagation time of the segment starting at discrete time (t1−a) and that of the segment starting at discrete time t1 have been calculated. That is, when correcting distortion in the audio data near the starting point of the present segment, the sound wave propagation time of the audio data of the next segment, that is, the segment starting at discrete time (t1+a), need not have been calculated. Thus, in a case that the virtual sound source 101_n is departing from the speaker 103_m, a delay of one segment does not occur. Even in a case that the virtual sound source position changes in real time, the audio data is corrected without delay.
Next, in a case that the virtual sound source 101_n is approaching the speaker 103_m between discrete time (t1−a) and discrete time t1, the sound wave propagation time data τmn,t1 becomes smaller than the sound wave propagation time data τmn,(t1−a). Thus, since Δτmn,t1=τmn,t1−τmn,(t1−a), the time width Δτmn,t1 becomes negative. In this case, audio data is lost between the segment starting at discrete time (t1−a) and the segment starting at discrete time t1.
The output audio data buffer 1209 already stores the sample data 312 at the last discrete time (t1−1) of the preceding segment. In Embodiment 1, for the purpose of resolving the waveform distortion illustrated in
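A minimal sketch of the correction by linear interpolation near the segment boundary is given below. It assumes a correction interval of width |Δτmn,t1| at the head of the second segment, anchored between the buffered last sample of the preceding segment (sample data 312 in the example above) and the first sample beyond the interval; all names and numeric values are illustrative, and the same routine serves both the repeated part (departing) and the lost part (approaching).

```python
def correct_boundary(prev_last_sample, second_segment, width):
    # Linearly interpolate the first `width` samples of the second
    # segment between the buffered last sample of the preceding segment
    # and the first undistorted sample after the correction interval.
    out = list(second_segment)
    target = out[width]                # first sample beyond the interval
    for i in range(width):
        frac = (i + 1) / (width + 1)   # linear ramp across the interval
        out[i] = prev_last_sample + frac * (target - prev_last_sample)
    return out

# Correction interval width of 5, as in the departing-source example:
corrected = correct_boundary(0.50, [0.9, 0.9, 0.9, 0.9, 0.9, 0.6, 0.55], 5)
```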
When it is judged at step S15 that the first and the second distance data differ from each other (S15: YES), that is, that the virtual sound source 101_n has moved relative to the speaker 103_m, the audio data processing part 1101 proceeds to step S16. In contrast, when it is judged at step S15 that the first and the second distance data are the same (S15: NO), that is, that the virtual sound source 101_n stands still, the audio data processing part 1101 proceeds to step S19. On the basis of the judgment result obtained at step S15, the audio data processing part 1101 identifies a repeated part or a lost part of the sample data caused by the departing or approaching of the virtual sound source relative to the speaker (S16), and then performs the linear interpolation described above on the distorted part of the waveform so as to correct the waveform (S17).
Then, the audio data processing part 1101 performs gain control on the virtual sound source 101_n (S18), adds 1 to the number n of the virtual sound source 101_n (S19), and judges whether the number n is equal to the maximum value N (S20). When it is judged at step S20 that the number n is equal to the maximum value N (S20: YES), the audio data is synthesized (S21). On the other hand, when it is judged that the number n is not equal to the maximum value N (S20: NO), the audio data processing part 1101 returns to step S11 and performs the processing of steps S11 to S18 on the next virtual sound source, for example the second virtual sound source 101_2, and the first speaker 103_1.
After the synthesis of the audio data at step S21, the audio data processing part 1101 substitutes 1 into the number n of the virtual sound source 101_n (S22) and adds 1 to the number m of the speaker 103_m (S23). Then, the audio data processing part 1101 judges whether the number m of the speaker 103_m is equal to the maximum value M (S24). When it is judged that the number m is equal to the maximum value M (S24: YES), the audio data processing part 1101 terminates the processing. In contrast, when it is judged that the number m is not equal to the maximum value M (S24: NO), the audio data processing part 1101 returns to the processing of step S11.
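The flow of steps S11 to S24 amounts to a doubly nested loop; a condensed sketch follows, in which process_pair and synthesize stand for the per-pair processing (S11 to S18) and the synthesis (S21) and are hypothetical callables.

```python
def process_all(N, M, process_pair, synthesize):
    # Outer loop over the M speakers (S22-S24), inner loop over the
    # N virtual sound sources (S19, S20).
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            process_pair(n, m)   # distance calculation, distortion check,
                                 # correction, and gain control (S11-S18)
        synthesize(m)            # synthesize this speaker's audio data (S21)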
The program 231 is not limited to one read from the recording medium 230 and then stored into the EEPROM 24 or the internal storage device 25. That is, the program 231 may be stored in an external memory such as a memory card. In this case, the program 231 is read from an external memory (not illustrated) connected to the CPU 17, and then stored into the EEPROM 24 or the internal storage device 25. Alternatively, communication may be established between a communication part (not illustrated) connected to the CPU 17 and an external computer, and then the program 231 may be downloaded onto the EEPROM 24 or the internal storage device 25.
As this invention may be embodied in several forms without departing from the spirit of essential characteristics thereof, the present embodiments are therefore illustrative and not restrictive, since the scope of the invention is defined by the appended claims rather than by the description preceding them, and all changes that fall within metes and bounds of the claims, or equivalence of such metes and bounds thereof are therefore intended to be embraced by the claims.
This application is the national phase under 35 U.S.C. §371 of PCT International Application No. PCT/JP2010/071490 filed on Dec. 1, 2010, which claims priority under 35 U.S.C. §119(a) to Patent Application No. 2009-279793 filed in Japan on Dec. 9, 2009, all of which are hereby expressly incorporated by reference into the present application.