In a digital audio transmission scheme, data is organized in frames and blocks. A frame represents a set of M data bits, corresponding to one instance of time. A block represents a set of N consecutive frames, corresponding to N consecutive instances of time. The duration of one instance of time is defined by the audio sampling frequency fs. For example, if fs=100 kHz, one hundred thousand frames would be transmitted during one second. Therefore, one instance of time would be 10 microseconds. The data contained in one frame consists of (1) a sample of audio data (K bits), and (2) a set of non-audio data (M-K bits).
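The timing and frame-size relations above can be checked with a short sketch. The function names and the example values M=32 and K=24 are assumptions chosen for illustration only; they are not taken from any particular standard.

```python
def frame_period_us(fs_hz: float) -> float:
    """Duration of one frame (one instance of time) in microseconds."""
    return 1e6 / fs_hz


def non_audio_bits(m_bits: int, k_bits: int) -> int:
    """Number of non-audio bits in a frame of M bits carrying K audio bits."""
    if not 0 <= k_bits <= m_bits:
        raise ValueError("K must lie between 0 and M")
    return m_bits - k_bits


# fs = 100 kHz gives 100,000 frames per second, i.e. 10 microseconds each.
print(frame_period_us(100_000))   # 10.0
# Hypothetical frame layout: M = 32 bits, of which K = 24 are audio.
print(non_audio_bits(32, 24))     # 8
```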
The purpose of the non-audio data is to control, synchronize, or otherwise support the rendering of the audio data in a receiver. In general, the non-audio data needs to be considered in the context of an entire block, not only in the context of a particular frame. The semantics of the data depend on the position of the data within a frame and on the position of the frame within a block. The detailed format of data organization for a digital audio transmission scheme may be defined in a standard, such as IEC 60958-1. However, the invention presented herein is not restricted by a particular standard.
It is often desirable that the rate of audio data be changed in the digital domain before rendering the received data in the analog domain. For example, a device for digital audio processing may have insufficient memory allocated for the storage of incoming data. Therefore, the amount of data should be reduced, i.e., decimated. On the other hand, a device for digital audio processing may expect a higher amount of data than the actually incoming data. Therefore, the amount of data should be increased, i.e., interpolated.
Decimation and replication of audio data (or of any other band-limited real-time data, such as video data or digitized measurement data) do not pose a problem, as long as proper filtering is applied. The theory of digital signal processing teaches which filtering methods are applicable. However, no such method can be applied to the non-audio data. Any bit omitted, inserted, or otherwise altered will render this data meaningless.
Digital receivers rely on the integrity of the non-audio data. Such data conveys information necessary for properly rendering the audio data. For example, IEC 60958 defines a category code (the kind of equipment used, e.g., compact disc, digital tape recorder, digital broadcast receiver), the channel number (2 channels in stereo mode, up to 8 channels in surround sound mode), clock accuracy, and other relevant information. Therefore, data subjected to decimation and/or replication is no longer suitable for rendering when using a digital receiver. A method is proposed in the present application to preserve the integrity of the non-audio data in the context of digital audio transmission wherein the audio data is subject to decimation and replication.
One way of dealing with the non-audio data is simply to discard it. Tests have been conducted regarding the behavior of a commercial device supporting two audio output formats, namely SPDIF and I2S. The SPDIF output carries along the non-audio data, whereas the I2S output does not. When the audio data was intentionally altered (i.e., decimated or interpolated), the sound on the SPDIF output was frequently interrupted, whereas the sound on the I2S was uninterrupted. This simple method excludes the usage of state-of-the-art equipment supporting SPDIF.
Another way of dealing with the non-audio data is to redefine it before rendering, using software. The builder of the audio system has to program suitable data into the audio processor. This method works only in a closed system, i.e., knowledge of suitable data is somehow available through other means than the transmitted data.
Another way of dealing with the non-audio data is to limit the usage of audio processing steps involving decimation and/or replication only to parts of the system where non-audio data is simply not present, i.e., remote from the transmission stage.
If the throughput through the HDMI receiver 210 is not balanced, i.e., the input data arrives at a higher or lower rate, respectively, than the output data, the FIFO 260 will get full or empty, respectively. Thus the continuity of the output audio data will be disrupted, resulting in an audible interruption of sound.
To prevent this, flow adjustment is implemented in the control logic of the FIFO 260. In normal operation, the write and read pointers advance by 1 per write and read operation, and the average distance between the write and read pointers (also known as the fill level) is constant, typically around "half full". When the fill level of the FIFO 260 is above an "almost full" threshold, the read pointer advances by 2 per read operation, effectively decimating the output data by a factor of 2, until the fill level goes back below the threshold. When the fill level of the FIFO 260 is below an "almost empty" threshold, the read pointer advances only on every other read operation, effectively oversampling the output data by a factor of 2, until the fill level goes back above the threshold.
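The flow adjustment described above can be modeled in software as follows. This is a minimal behavioral sketch, not the actual control logic of the FIFO 260: the class name, the depth, and the threshold values are assumptions, and a deque stands in for the pointer-addressed memory.

```python
from collections import deque


class FlowAdjustedFifo:
    """Behavioral model of a FIFO whose read side skips or repeats samples."""

    def __init__(self, depth=16, almost_full=12, almost_empty=4):
        self.buf = deque()
        self.depth = depth
        self.almost_full = almost_full
        self.almost_empty = almost_empty
        self._hold = False  # True while the read pointer is held for a repeat

    def write(self, sample):
        if len(self.buf) >= self.depth:
            raise OverflowError("FIFO full")
        self.buf.append(sample)

    def read(self):
        level = len(self.buf)
        if level == 0:
            raise IndexError("FIFO empty")
        sample = self.buf[0]  # data at the current read pointer
        if level > self.almost_full:
            step = 2          # decimate: skip the following sample
            self._hold = False
        elif level < self.almost_empty:
            self._hold = not self._hold
            step = 0 if self._hold else 1  # advance only every other read
        else:
            step = 1          # normal operation
            self._hold = False
        for _ in range(min(step, len(self.buf))):
            self.buf.popleft()
        return sample


fifo = FlowAdjustedFifo()
for n in range(14):                       # fill above "almost full"
    fifo.write(n)
print([fifo.read() for _ in range(3)])    # [0, 2, 3]: sample 1 was skipped
```

Note that in this model every stored word is skipped or repeated as a whole, which is why any non-audio bits stored alongside the audio sample are skipped or repeated with it.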
Experimental data shows that the flow adjustment produces no audible side effects through the I2S output formatter 230, as long as the throughput misbalance is small (e.g., 44.107 kHz versus 44.103 kHz). However, experimental data shows that the flow adjustment produces a problem for the output of the SPDIF output formatter 240 (SPDIF output). The SPDIF output is organized in blocks of 192 frames each. Each audio sample has an associated channel status bit (see IEC 60958-1 section 4 and IEC 60958-3 table 1). The 192 channel status bits of a block essentially serve as an audio mode descriptor, wherein bits at specific positions (ranging from 0 to 191) have specific meanings. By decimating or duplicating audio samples, the associated channel status bits are decimated or duplicated as well, thus destroying the integrity of a block. This results in an audible interruption of the audio stream through SPDIF, possibly without recovery.
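The effect on the channel status bits can be illustrated with a small sketch. The bit pattern below is arbitrary and hypothetical (it does not encode a real audio mode), and the helper names are assumptions; the point is only that dropping or repeating a single sample shifts every later channel status bit out of its position.

```python
BLOCK_LEN = 192

# Arbitrary, hypothetical channel status pattern for one block.
pattern = [1 if i in (0, 2, 24, 27) else 0 for i in range(BLOCK_LEN)]

# Two blocks of (sample index, channel status bit) pairs.
stream = [(n, pattern[n % BLOCK_LEN]) for n in range(2 * BLOCK_LEN)]


def drop_frame(frames, idx):
    """Decimation: the sample at idx is lost together with its status bit."""
    return frames[:idx] + frames[idx + 1:]


def repeat_frame(frames, idx):
    """Replication: the sample at idx is sent twice, status bit included."""
    return frames[:idx + 1] + frames[idx:]


def first_block_bits(frames):
    """Channel status bits as a receiver would collect them for one block."""
    return [bit for _, bit in frames[:BLOCK_LEN]]


print(first_block_bits(stream) == pattern)                    # True
print(first_block_bits(drop_frame(stream, 10)) == pattern)    # False
print(first_block_bits(repeat_frame(stream, 10)) == pattern)  # False
```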
Accordingly, it is desirable to provide a scheme that preserves control information embedded in digital data, and that does so in real time and transparently, without requiring user intervention.
Control information embedded in digital data is preserved by inputting digital data into a data processor, wherein the digital data includes real-time samples of recorded data and control information, the control information being organized in a format within the digital data, separating at least some of the control information from the recorded data, and storing the separated control information in a memory so that it is preserved.
The following drawings provide examples of the invention. However, the invention is not limited to the precise arrangements, instrumentalities, scales, and dimensions shown in these examples, which are provided for illustration purposes only. In the drawings:
The present invention enables the deployment of audio processing steps involving decimation and/or replication anywhere in the system without imposing constraints, such as restricting the use of SPDIF output or restricting the use of decimation/replication between transmitter and receiver.
If systematic data decimation/replication is a requirement, the present invention is more convenient and easier to implement than software-controlled restitution of the non-audio data, because (1) no software has to be written, and (2) no redundancy is required, i.e., the data need not be present in any medium other than the transmitted data.
Other than systematic data decimation/replication (e.g., for format conversions), dynamic data decimation/replication may also be required to handle data overflow/underflow. In such a scenario, there is a temporary misbalance between incoming and outgoing data, and a flow regulation mechanism needs to be used. The flow regulation mechanism must perform the following functions:
An HDMI receiver according to the present invention separates the data path for the audio samples and the channel status bits, relying on the fact that the channel status bits carry information associated with the audio mode only, which is independent of a particular audio sample. Therefore, the channel status bits can be extracted upstream of the FIFO and then re-inserted downstream of the FIFO.
To further explain the present invention, the relations between IEC 60958 and HDMI format definitions for digital audio data are defined in Table 1 below.
An Audio Sample Packet (ASP) consists of 4 subpackets. At the write and read data ports of the Audio FIFO, the data is arranged as a 56-bit word according to Table 2 below.
Each subpacket contains data for both the 1st and the 2nd IEC60958 sub-frame. The 1st and 2nd sub-frames are associated with the left and right speaker in stereo mode.
In frame regeneration mode, bits 55, 54, 27, and 26 are extracted and re-inserted. The frame regeneration mode can be enabled by a software-programmable register. When frame regeneration is disabled, all bits are taken from the FIFO.
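The extraction and re-insertion of these four bit positions can be sketched with plain bit masks. The function names and the enable argument are assumptions made for illustration; the sketch assigns no meaning to the individual bits beyond their positions in the 56-bit word.

```python
REGEN_BITS = (55, 54, 27, 26)            # positions handled in regeneration mode
REGEN_MASK = sum(1 << b for b in REGEN_BITS)
WORD_MASK = (1 << 56) - 1                # the FIFO word is 56 bits wide


def extract_regen_bits(word: int) -> dict:
    """Return the value of each regenerated bit position of a 56-bit word."""
    return {b: (word >> b) & 1 for b in REGEN_BITS}


def reinsert_regen_bits(fifo_word: int, bits: dict, enable: bool = True) -> int:
    """Overwrite the regenerated positions of a word read from the FIFO.

    With enable=False, all bits are taken from the FIFO word unchanged.
    """
    if not enable:
        return fifo_word & WORD_MASK
    out = fifo_word & ~REGEN_MASK & WORD_MASK   # clear the four positions
    for position, value in bits.items():
        out |= (value & 1) << position
    return out


written = (1 << 54) | (1 << 26) | 0x123
saved = extract_regen_bits(written)      # bits 54 and 26 are set
read_back = (1 << 55) | 0xABC            # some other word leaving the FIFO
print(hex(reinsert_regen_bits(read_back, saved)))
```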
The following algorithms show an implementation of one preferred embodiment of portions of the above invention. On the FIFO write port, an address counter (channel_status_addr) is implemented. The address counter is synchronized using the block start indicator bits from the ASP header (see HDMI spec 1.2, section 5.3.4, table 5-12). These bits (b_0, b_1, b_2, b_3) indicate whether any of the 4 subpackets contains the 1st frame in an IEC 60958 block. The synchronization mechanism also uses the existing write control signals (wr, aspf_inc) of the FIFO.
For proper alignment between control and data signals, the channel status bits at the FIFO data port (spm[26], spm[54]) are delayed by one clock cycle.
The extracted channel status bits (cl_bit, cr_bit) are then stored in two 192-bit-wide memories, one associated with the 1st subframe (channel_status_left), the other associated with the 2nd subframe (channel_status_right).
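The write-side behavior can be modeled as follows. This is a simplified sketch under stated assumptions: it processes one frame per call, it collapses the block start indicators and write control signals into a single hypothetical block_start flag, and it omits the one-clock-cycle alignment delay. Only the names channel_status_addr, channel_status_left, channel_status_right, cl_bit, and cr_bit come from the description.

```python
BLOCK_LEN = 192


class ChannelStatusStore:
    """Collects one block of channel status bits per subframe."""

    def __init__(self):
        self.channel_status_left = [0] * BLOCK_LEN
        self.channel_status_right = [0] * BLOCK_LEN
        self.channel_status_addr = 0

    def write_frame(self, cl_bit: int, cr_bit: int, block_start: bool) -> None:
        """Store the status bits of one frame written into the FIFO.

        block_start is an assumed flag, true when this frame is the 1st
        frame of a block; it resynchronizes the address counter.
        """
        if block_start:
            self.channel_status_addr = 0
        addr = self.channel_status_addr
        self.channel_status_left[addr] = cl_bit & 1
        self.channel_status_right[addr] = cr_bit & 1
        self.channel_status_addr = (addr + 1) % BLOCK_LEN


store = ChannelStatusStore()
store.write_frame(1, 0, block_start=True)    # lands at address 0
store.write_frame(0, 1, block_start=False)   # lands at address 1
print(store.channel_status_left[:2], store.channel_status_right[:2])
```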
On the FIFO read port, a free-running modulo 192 counter (rd_count) is implemented. The counter is controlled by the existing read enable signal (rd) of the FIFO.
The block start indicator bit (b_rd) and the parity bit (p_rd) are regenerated and re-inserted into the data bus (rd_datao) together with the channel status bits (cl_rd, cr_rd). This function is enabled or disabled by a programmable register (flow_dt_en).
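The read-side regeneration can be sketched in the same style. Several details here are assumptions rather than part of the description: the class and method names, the dictionary return value, and the parity rule, which is modeled as even parity over the audio bits and the channel status bit only (validity and user bits are ignored). The signal names rd_count, b_rd, cl_rd, and cr_rd follow the text.

```python
BLOCK_LEN = 192


def parity_bit(value: int) -> int:
    """Bit that makes the total number of ones (value plus parity) even."""
    return bin(value).count("1") & 1


class FrameRegenerator:
    """Re-creates block start, channel status, and parity on the read side."""

    def __init__(self, channel_status_left, channel_status_right):
        self.left = list(channel_status_left)
        self.right = list(channel_status_right)
        self.rd_count = 0                     # free-running modulo 192 counter

    def read(self, audio_left: int, audio_right: int) -> dict:
        """Regenerate the control bits for one FIFO read operation."""
        cl_rd = self.left[self.rd_count]
        cr_rd = self.right[self.rd_count]
        out = {
            "b_rd": 1 if self.rd_count == 0 else 0,   # 1st frame of a block
            "cl_rd": cl_rd,
            "cr_rd": cr_rd,
            "p_left": parity_bit(audio_left) ^ cl_rd,
            "p_right": parity_bit(audio_right) ^ cr_rd,
        }
        # The counter advances once per read, regardless of whether the
        # audio sample itself was skipped or repeated by flow adjustment.
        self.rd_count = (self.rd_count + 1) % BLOCK_LEN
        return out


regen = FrameRegenerator([1] + [0] * 191, [0] * 192)
print(regen.read(0b101, 0b1))
```

Because rd_count advances by exactly one per read, the regenerated channel status sequence stays aligned to 192-frame blocks even while the audio samples coming out of the FIFO are being decimated or replicated.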
It will be appreciated by those skilled in the art that changes could be made to the embodiments described above without departing from the broad inventive concept thereof. It is understood, therefore, that this invention is not limited to the particular examples disclosed, but is intended to cover modifications within the spirit and scope of the present invention as defined by the appended claims.