This application claims the priority of Korean Patent Application No. 2004-73367, filed on Sep. 14, 2004 in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present general inventive concept relates to a method of controlling a sound field, and more specifically, to a method of embedding sound field factors and sound field information into a sound source and a method of processing the sound field factors and the sound field information.
2. Description of the Related Art
Conventionally, transmitting sound field information for sound field processing requires a user to directly designate the sound field information. Additionally, the sound field information is typically inserted into a header of a packet having a compressed sound source. The sound field information may also be extracted from a sound source itself.
The user designates the sound field information through an input unit of an audio device having a sound field processor. This conventional method has a drawback in that the user is required to designate the sound field information manually according to the characteristics of the sound source. In an attempt to overcome this drawback, a method of matching information about a medium and the audio tracks stored thereon with already-input sound field information has been disclosed.
The method of controlling the sound field includes an operation S21 of setting and storing sound field information on a CD number or track, an operation S22 of determining whether the CD is playing, an operation S23 of inputting a currently playing CD number and track information, an operation S24 of determining whether the sound field information has already been stored, an operation S25 of controlling the sound field based on the sound field information on the given CD and track when that sound field information has already been stored, an operation S26 of storing sound field information selected by a user when sound field information on the given CD and track has not been stored, and an operation S27 of controlling the sound field based on the sound field information selected by the user.
According to the conventional method of controlling the sound field illustrated in
However, the method of controlling the sound field illustrated in
Further, when the sound field information is inserted into the header of an audio packet having a compressed sound source (e.g., an MPEG compressed sound source), the sound field information may be corrupted whenever the header is corrupted by a transformation such as a format conversion and/or a transmission. In addition, when the sound field information is extracted from the sound source itself, there are problems in that accuracy is not guaranteed, real-time processing may not be achieved, and the characteristics of the sound field differ significantly among most types of media. Therefore, this method is difficult to implement.
The present general inventive concept provides a method of embedding sound field control (SFC) factors representing characteristics of a sound source, together with sound field information representing a scene of a program, a genre of the program, a sound field mode, and the like, into an uncompressed sound source.
The present general inventive concept also provides a method of processing a sound field according to the method of embedding the SFC factors.
Additional aspects and advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
The foregoing and/or other aspects and advantages of the present general inventive concept are achieved by providing a method of embedding sound field control (SFC) factors, the method comprising: coding sound field factors and sound field information to obtain the SFC factors for a sound source in a binary data type, wherein the sound field factors represent an acoustic characteristic of the sound source and the sound field information represents an environment under which the sound source is decoded, and watermarking the SFC factors into the sound source without compressing the sound source.
The SFC factors, which refer to the sound field factors and the sound field information, may be embedded into the uncompressed sound source using watermarking. The uncompressed sound source may be segmented into a plurality of frames in units of frames, and the SFC factors may be included in each frame. In addition, the frame segmentation may be initiated at a position where the characteristics of the sound field change significantly.
The SFC factors that represent characteristics of the sound source may be embedded into the sound source itself using a digital watermarking technology. Therefore, the user need not manually set the SFC factors one by one. In addition, the SFC factors can be reliably transmitted, irrespective of header corruption caused by format conversion of a compressed sound source and transmission.
The foregoing and/or other aspects and advantages of the present general inventive concept are also achieved by providing a method of processing a sound field, the method comprising: receiving a sound source having watermarked SFC factors, decoding the watermarked SFC factors from the sound source and performing a sound field processing on the sound source based on the decoded SFC factors.
A transitional processing, such as fade-in and fade-out processing, can be performed based on SFC factors in a present frame and other SFC factors in a next frame. Therefore, a sound field processing can be performed with presence.
These and/or other aspects and advantages of the present general inventive concept will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to the embodiments of the present general inventive concept, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present general inventive concept while referring to the figures.
The present general inventive concept provides a method of embedding sound field control factors (hereinafter, referred to as ‘SFC factors’) that represent sound field characteristics of an uncompressed sound source using watermarking. The watermarked sound source is able to maintain sound properties thereof even though the SFC factors are embedded therein. In addition, the SFC factors, which are decoded by an extracting method that corresponds to the embedding method, are used to process the sound field.
The SF factor, the SF mode, the program scene, and the program genre are embedded in the sound source So and stored in the SFC factor database 204. The SF factor may be directly extracted from the sound source So signal. The user may designate the SF mode, the program scene, and the program genre at the time that the sound source So is recorded.
The sound source So is segmented into a plurality of frames. The SFC factors are embedded in the sound source So for each frame. The plurality of frames may be segmented based on a position where the characteristics of the sound field of the sound source So can be clearly distinguished. For example, the plurality of frames may be obtained based on a position where the SF mode, the program scene, or the program genre changes, or where the SF factor can be noticeably distinguished.
The sound source So is segmented into the plurality of frames including f0, f1, f2, . . . , and fN-1. For each of the plurality of frames f0, f1, f2, . . . , and fN-1, corresponding SFC factors SFCF0, SFCF1, SFCF2, . . . , and SFCFN-1 are embedded in respective frames of the sound source So.
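For illustration only, the following Python sketch shows one way the segmentation and per-frame embedding could be organized. The function names, the externally supplied list of change points, and the embed_fn callback are assumptions made for the sketch; they are not prescribed by the present disclosure.

```python
import numpy as np

def segment_at_change_points(source: np.ndarray, change_points: list[int]) -> list[np.ndarray]:
    """Split an uncompressed sound source into frames f0..fN-1 at the given
    sample positions (e.g., where the SF mode, scene, or genre changes)."""
    bounds = [0] + sorted(change_points) + [len(source)]
    return [source[a:b] for a, b in zip(bounds[:-1], bounds[1:])]

def embed_per_frame(frames, sfc_factors, embed_fn):
    """Embed SFC factors SFCF0..SFCFN-1 into their corresponding frames using
    the supplied watermark-embedding function (one codeword per frame)."""
    return [embed_fn(frame, sfcf) for frame, sfcf in zip(frames, sfc_factors)]
```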
The SFC factors SFCF, which comprise coded digital information, include corresponding SF factors, such as the reverberation time (RT), the clearness (C80), and the pattern of early reflections (PER), together with other sound field information.
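As a purely illustrative sketch, quantized SF factors and sound field information might be packed into a binary codeword as follows; the field widths, the value ranges, and the assumption that the inputs are already quantized integers are all hypothetical, since the disclosure does not specify a bit layout.

```python
def code_sfc_factors(rt: int, c80: int, per: int,
                     sf_mode: int, scene: int, genre: int) -> str:
    """Pack quantized SF factors (RT, C80, PER) and sound field information
    (SF mode, program scene, program genre) into one binary codeword.
    Each input is assumed to fit in its (illustrative) field width."""
    fields = [(rt, 12), (c80, 6), (per, 6), (sf_mode, 4), (scene, 4), (genre, 4)]
    return ''.join(format(value, f'0{bits}b') for value, bits in fields)

# Example: a 36-bit codeword to be watermarked into one frame.
codeword = code_sfc_factors(rt=450, c80=12, per=3, sf_mode=2, scene=1, genre=5)
```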
As a result of the encoding of the sound source So with the SFC factors SFCF using the watermark encoder 202, the embedded results including f′0, f′1, f′2, . . . , and f′N-1 are obtained.
A kernel of the time-spread echo method can be represented by the following equation.
k(n)=δ(n)+α·p(n−Δ)
where δ(n) is a Dirac delta function, p(n) is a pseudo-noise (PN) sequence, α is an amplitude, and Δ is a time delay. The time-spread echo method adds different information (binary data) to the sound source by using different time delays Δ or different PN sequences p(n).
In addition, p(n) serves as a secret key or an open key with which the embedded information can be extracted. Therefore, either the secret-key type or the open-key type can be used according to the system specification. For example, the key type may depend on how access to the embedded information is controlled.
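To make the kernel concrete, the following sketch builds k(n) = δ(n) + α·p(n−Δ) and a bipolar PN sequence whose seed plays the role of the key; the amplitude, the two candidate delays, and the sequence length are illustrative values chosen for the sketch, not parameters taken from the disclosure.

```python
import numpy as np

def pn_sequence(length: int = 1023, seed: int = 42) -> np.ndarray:
    """Bipolar pseudo-noise sequence p(n); the seed acts as the secret (or open) key."""
    rng = np.random.default_rng(seed)
    return np.where(rng.random(length) < 0.5, -1.0, 1.0)

def make_kernel(pn: np.ndarray, alpha: float, delay: int) -> np.ndarray:
    """Time-spread echo kernel k(n) = delta(n) + alpha * p(n - delay)."""
    k = np.zeros(delay + len(pn))
    k[0] = 1.0                                # delta(n)
    k[delay:delay + len(pn)] = alpha * pn     # attenuated PN sequence, delayed by 'delay' samples
    return k

def kernel_for_bit(bit: int, pn: np.ndarray, alpha: float = 0.05,
                   delay0: int = 150, delay1: int = 200) -> np.ndarray:
    """Carry one bit of the SFC codeword by choosing between two delays
    (different PN sequences could be used instead)."""
    return make_kernel(pn, alpha, delay1 if bit else delay0)
```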
Referring to the corresponding figure, the watermarked sound source W(n) is obtained by linearly convolving the sound source s(n) with the kernel k(n):

W(n)=s(n)*k(n), where * refers to a linear convolution.
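A minimal embedding sketch follows, reusing kernel_for_bit from the kernel sketch above and SciPy's FFT-based convolution; trimming the convolution output back to the frame length is an implementation choice made here, not a step prescribed by the disclosure.

```python
import numpy as np
from scipy.signal import fftconvolve

def embed_bit(frame: np.ndarray, bit: int, pn: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """W(n) = s(n) * k(n): linearly convolve one frame of the uncompressed
    sound source with the echo kernel selected for the bit value."""
    k = kernel_for_bit(bit, pn, alpha)              # defined in the kernel sketch above
    watermarked = fftconvolve(frame, k, mode='full')
    return watermarked[:len(frame)]                 # keep the original frame length
```

A full SFC codeword would be embedded by applying this per-bit step to successive sub-blocks of a frame, but that partitioning is likewise an assumption of the sketch.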
A present frame fpresent and a next frame fnext are decoded through independent decoding processes. Thus, an SFC factor of the present frame SFCFpresent and an SFC factor of the next frame SFCFnext are decoded. The sound field processor references the decoded SFC factors.
In the sound field processing, the SFC factors in the present frame are referenced for the processing of the next frame. For example, when the SF mode of the present frame is a cave mode and the SF mode of the next frame is a plain (i.e., an extensive area of land without trees) mode, a fade-out processing is performed to prevent a reverberation sound adapted to the cave SF mode from affecting a reverberation sound adapted to the plain SF mode.
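A sketch of such a transition is shown below; it assumes that each frame's decoded SFC factors are available as a dictionary containing an 'sf_mode' entry and that a linear ramp of a few thousand samples is an acceptable fade length, both of which are assumptions of the sketch rather than details of the disclosure.

```python
import numpy as np

def transition(present_out: np.ndarray, next_out: np.ndarray,
               sfcf_present: dict, sfcf_next: dict, fade_len: int = 4096):
    """When the SF mode changes between frames (e.g., cave -> plain), fade out
    the tail of the present frame's processed output and fade in the head of
    the next frame's, so that the previous reverberation does not bleed over."""
    if sfcf_present['sf_mode'] != sfcf_next['sf_mode']:
        ramp = np.linspace(1.0, 0.0, fade_len)
        present_out = present_out.copy()
        next_out = next_out.copy()
        present_out[-fade_len:] *= ramp        # fade-out of the present frame
        next_out[:fade_len] *= ramp[::-1]      # fade-in of the next frame
    return present_out, next_out
```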
According to the present general inventive concept, the SFC factors, encoded as described above, are decoded from the watermarked sound source by a corresponding extraction process.
The decoded sound source d(n), obtained from the decoding operation, can be represented by the following equation:

d(n)=F−1[log[F[W(n)]]] ⊗ LPN

where F[ ] and F−1[ ] represent a Fourier transform and an inverse Fourier transform, respectively, log[ ] refers to a logarithmic function, ⊗ refers to a cross-correlation operation, and LPN refers to the PN sequence.
The SFC factors are detected by checking for a clear peak in d(n) at the corresponding delay Δ. The cross-correlation ⊗ performs a despreading function between the pseudo-noise sequence and the rest of the cepstrum-analyzed signal.
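A simplified decoder sketch follows. For brevity it uses the log-magnitude (real) cepstrum rather than the complex cepstrum written in the equation above, and it assumes the same PN sequence and candidate delays that were used at embedding; the variable names and the decision by peak comparison are choices made for the sketch.

```python
import numpy as np

def decode_bit(watermarked: np.ndarray, pn: np.ndarray,
               delay0: int = 150, delay1: int = 200) -> int:
    """d(n) = IFFT(log|FFT(W(n))|) cross-correlated with the PN sequence; the
    bit is read from whichever candidate delay shows the stronger peak."""
    spectrum = np.fft.rfft(watermarked)
    cepstrum = np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))
    # Cross-correlation with p(n) despreads the echo energy into a single peak.
    corr = np.correlate(cepstrum, pn, mode='full')[len(pn) - 1:]
    return 0 if corr[delay0] > corr[delay1] else 1
```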
At operation S804, the SFC factors are decoded from the watermarked sound source. The operation S804 of decoding the SFC factors from the watermarked sound source is performed as described above.
At operation S806, it is determined whether the SFC factors are extracted. If the SFC factors are extracted, at operation S808, the sound field factor and the sound field information that correspond to the embedded SFC factors are obtained by referring to the SFC factor database 204.
At operation S810, the sound field processing is performed by referring to the sound field factor and the sound field information obtained in the operation S808. In performing the sound field processing at the operation S810, sound field processing of the next frame is controlled by referring to the SFC factors of the present frame and the next frame. For example, fade-in and fade-out processing and other transitional processing are performed by referring to the sound field information of the present frame and the next frame. Thus, the sound field processing can be performed with presence.
Further, for the convenience of the user, at the operation S808, both the sound field factor and the sound field information input by the user and the sound field factor and the sound field information obtained from the extraction can be referred to.
At the operation S806, if the SFC factors are not extracted, the process proceeds to operation S812. At the operation S812, the sound field processing is performed by referring to the sound field factor and the sound field information input by the user.
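The flow of operations S804 through S812 can be summarized in the sketch below, in which decoder, database, user_settings, and processor are placeholders for components that the disclosure describes only functionally.

```python
def process_sound_field(frame, decoder, database, user_settings, processor):
    """Sketch of operations S804-S812 for one frame of the received sound source."""
    sfcf = decoder(frame)                              # S804: decode the watermarked SFC factors
    if sfcf is not None:                               # S806: were SFC factors extracted?
        settings = database.get(sfcf, user_settings)   # S808: look up the SFC factor database
    else:
        settings = user_settings                       # S812: fall back to user-designated settings
    return processor(frame, settings)                  # S810: perform the sound field processing
```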
According to the method of embedding SFC factors of the present general inventive concept, the SFC factors representing characteristics of the sound source are embedded into the sound source itself by using a digital watermarking technology. As a result, the user is not required to designate each of the SFC factors of the sound source.
In addition, according to the method of embedding the SFC factors of the present general inventive concept, the SFC factors are not transmitted in a header of a packet having a compressed sound source. Rather, the SFC factors are embedded and transmitted within the sound content of the uncompressed sound source itself using the digital watermark technology. Therefore, even when a header is corrupted by format conversion or transmission of a compressed sound source, the SFC factors can be reliably transmitted.
In addition, according to the method of embedding SFC factors of the present general inventive concept, an uncompressed sound source is segmented into frames. Further, the SFC factors are embedded into each frame of the sound source. Thus, the SFC factors are adapted to the characteristic of the segmented sound source and can be transmitted in real time. In other words, since the sound source may be transmitted in an uncompressed form, the sound source and the SFC factors embedded therein may be processed in real time as the sound source is received by a sound processor. Moreover, the frame segmentation is performed at a position in the sound source where the characteristic of the sound field control is clearly distinguishable. Therefore, the SFC factors can be transmitted more efficiently.
In addition, according to the method of processing the sound field of the present general inventive concept, a transitional processing, such as fade-in and fade-out processing, can be performed based on sound field control (SFC) factors in the present and the next frames. Therefore, the sound field processing can be performed with presence.
As described above, according to the method of embedding SFC factors of the present general inventive concept, the SFC factors representing characteristics of the sound source can be embedded into the sound source itself without degradation in the sound quality, using the digital watermarking technology. In addition, at the time of reproducing the sound source, the SFC factors are extracted and used so that the sound field processing can be reliably performed and the characteristics of sound source can be maintained.
Although a few embodiments of the present general inventive concept have been shown and described, it will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined in the appended claims and their equivalents.
Number | Date | Country | Kind
2004-73367 | Sep. 2004 | KR | national