The technical field of this invention is real time audio/video data processing.
Media files are often delivered for streaming or stored for consumption using compressed data formats. Playing this media data requires decoding/decompressing. This decoding/decompressing should be done in real time while the user is listening to the recovered audio data or watching the recovered video data. Once playing begins, any interruption of flow is generally considered unacceptable by the user.
There are two separate aspects of data processing that might cause problems. The first problem is computational latency. The decoding/decompressing process requires data processing time. Thus there is a delay in playing the audio/video file. The second problem is variability. Most compression techniques provide variable compression dependent upon the nature of the audio/video file being compressed. In addition, some parts of the compression technique require varying amounts of data processing for decompression. The variable data rate and the variable data processing required for decompression cause variable latency.
There is another artifact of the data processing in decoding/decompressing. The required data processing is generally very serial for each input data sample. Performing the entire serial chain processing for a single sample at a time is considered disadvantageous. Generally these data processing operations operate upon a block of samples. This process generally requires buffering data samples before and after data processing each block of samples.
The problem of this invention is providing suitable latency and robust response to the inherent variability of the latency.
These and other aspects of this invention are illustrated in the drawings, in which:
Input/Output Latency is a key performance characteristic of audio/video products. Properly matching the audio latency with the video latency is needed to achieve the required lip sync in the output. Algorithms for such products are typically implemented on embedded processors. These embedded processors typically operate on blocks of samples rather than on a sample-by-sample basis. Sample-by-sample operation would provide minimal latency but results in poor processor utilization. The data processing in these systems involves a serial chain of operations. Processing on a sample-by-sample basis would thus include task switching many times. Each task switch typically includes a task switch penalty. Processing upon batches of samples minimizes this task switch penalty at the expense of increased latency. This invention reduces the latency associated with block processing while maintaining the performance benefits associated with block processing.
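The trade-off between task switch penalty and block size can be illustrated with a minimal sketch. The function name, the three-stage chain and all cost figures below are hypothetical values chosen only for illustration; they are not taken from this description.

```python
def cpu_load(block_size, stages, per_sample_cost_us, switch_penalty_us, sample_rate_hz):
    """Fraction of one block period the CPU spends on a serial chain of stages.

    Assumes (hypothetically) that every stage costs per_sample_cost_us per sample
    and that one task switch penalty is paid per stage per block.
    """
    block_period_us = block_size / sample_rate_hz * 1e6
    work_us = block_size * stages * per_sample_cost_us
    overhead_us = stages * switch_penalty_us
    return (work_us + overhead_us) / block_period_us


# Sample-by-sample operation (N=1) versus two block sizes, all with assumed costs.
for n in (1, 16, 256):
    load = cpu_load(n, stages=3, per_sample_cost_us=2.0,
                    switch_penalty_us=10.0, sample_rate_hz=48_000)
    print(f"N={n:4d}  load={load:.3f}")
```

Under these assumed costs the load for N=1 exceeds 1.0, meaning the processor could not keep up, while larger blocks amortize the switch penalty at the price of a longer block period and therefore more latency.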
Audio/Video systems require deterministic input-to-output latency. This is typically achieved by starting transmission of output samples with a fixed timing relationship relative to the arrival of input samples. Human perception of audio/video synchronization limits this latency to less than a prescribed threshold. Sufficient latency must be provided in these systems for processing to be completed without output starvation.
Prior to this invention, this latency between input and output was required to be an integral number of sample blocks. The invention employs a fractional block of samples. This controls the input/output timing relationship with a finer granularity than previous methods. This finer granularity allows intermediate choices for latency which can satisfy both competing constraints. The invention allows latency to be reduced to more perceptually acceptable levels, while still meeting real-time processing deadlines. This invention involves framework architecture changes to reduce processing latency. This is especially important for systems having multi-processor cascades.
System bus 110 serves as the backbone of digital audio/video system 100. Major data movement within digital audio/video system 100 occurs via system bus 110.
Mass memory 106 moves data to system bus 110 under control of CPU 101. This data movement would enable recall of digital audio/video data from mass memory 106 for presentation to the user.
Touch screen interface 112 mediates user input from touch screen 122. Touch screen 122 typically overlays the display and includes touch sensors for user input. Touch screen interface 112 senses these screen touches from touch screen 122 and signals CPU 101 of the user input. Touch screen interface 112 typically encodes the screen touch in a code that can be read by CPU 101. Touch screen interface 112 may signal a user input by transmitting an interrupt to CPU 101 via an interrupt line (not shown). CPU 101 can then read the input key code and take appropriate action. Touch screen interface 112 and touch screen 122 could be replaced by any suitable device for inputting user data such as buttons, a keyboard, a joystick or a touch pad.
Digital to analog (D/A) converter and analog output 113 receives the digital audio/video data from mass memory 106. Digital to analog (D/A) converter and analog output 113 provides an analog signal to speakers 123 for listening by the user. Speakers 123 may be any suitable electrical to sound transducer, including loudspeakers, headphones and earbuds.
Display controller 115 controls the display shown to the user via display 125. Display controller 115 receives data from CPU 101 via system bus 110 to control the display. Display 125 is typically a multiline liquid crystal display (LCD). This display typically may also be used to facilitate input of user commands by outlining touch areas for the touch screen input. In a portable system, display 125 would typically be located in a front panel of the device.
I/O controller 117 enables digital audio/video system 100 to exchange messages and data with network 127. As an example, I/O controller 117 could permit digital audio/video system 100 to log on to an Internet site, request a file or data stream and receive delivery via network 127.
DRAM 105 provides the major volatile data storage for the system. This may include the machine state as controlled by CPU 101. Typically data is recalled from mass memory 106 or received from network 127 via I/O controller 117, and buffered in DRAM 105 before decompression by CPU 101. DRAM 105 may also be used to store intermediate results of the decompression.
In operation the user specifies an action to be taken by digital audio/video system 100 via inputs on touch screen 122.
The decoding processing begins during time block 201. During time block 201, a buffer 2111 receives and stores input data. Buffer 2111 is typically a designated portion of DRAM 105. As previously noted this input data could be from mass memory 106 or from network 127 via I/O controller 117. CPU 101 is idle during time block 201 as indicated by idle block 231.
During time block 202, CPU 101 initiates the process via reset (RST) block 2200. This occurs only once at the start of the process. CPU 101 next executes decode (DEC) block 2211. This involves reading data from buffer 2111 as illustrated in
During time block 203, CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 2231. PCE block 2231 produces digital signals for supply to D/A converter and analog output 113 to supply an analog signal to speaker 123. As illustrated in
During time block 204, data stored in buffer 2121 is output to D/A converter and analog output 113. CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 2232 on the second block of input data. CPU 101 next executes DEC block 2213 followed by ASP block 2223 on the third block of input data. CPU 101 is idle during idle block 234. Buffer 2114 is being filled with a fourth block of input data during time block 204.
During time block 205, data stored in buffer 2122 is output to D/A converter and analog output 113. CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 2233 on the third block of input data.
Upon reaching a steady state such as illustrated in time block 204, the process includes the following. Data is loaded into one buffer as input buffer. Data is output from one buffer as output buffer. CPU 101 performs the needed data processing tasks.
ΔIO=3N/fs
where: N is number of samples per block; and fs is sampling rate. This is referred to as the system latency.
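As a worked instance of this formula, the following short sketch evaluates the latency for one example configuration. The function name and the chosen values of N and fs are illustrative assumptions and do not come from this description.

```python
def io_latency_seconds(blocks, samples_per_block, sample_rate_hz):
    """Input-to-output latency for a delay of `blocks` blocks of N samples at rate fs."""
    return blocks * samples_per_block / sample_rate_hz


# Assumed example: N = 256 samples per block, fs = 48 kHz, three-block delay.
latency = io_latency_seconds(3, 256, 48_000)
print(f"{latency * 1e3:.1f} ms")  # 3 * 256 / 48000 s = 16.0 ms
```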
Two factors establish this delay. The first factor is that no processing is performed or output initiated until one block of input samples has arrived. The second factor is that the output is started with two blocks of zeros in the output buffer immediately after the first input block arrives. The input and output buffers employed each have a size to hold two N-sample blocks.
There are three signal dependencies shown in
During time block 401, a buffer 4111 receives and stores input data. CPU 101 is idle during time block 401 as indicated by idle block 431.
During time block 402, CPU 101 initiates the process via reset (RST) block 4200. This occurs only once at the start of the process. CPU 101 next executes decode (DEC) block 4211. This involves reading data from buffer 4111 as illustrated in
During time block 403, CPU 101 continues and completes audio stream processing (ASP) block 4221. CPU 101 then performs pulse code modulation (PCM) encode at PCE block 4231. This differs from
During time block 404, data stored in buffer 4121 is output to D/A converter and analog output 113. CPU 101 performs audio stream processing (ASP) block 4222 and pulse code modulation (PCM) encode (PCE) at PCE block 4232 on the second block of input data. CPU 101 next executes DEC block 4213 followed by ASP block 4223 on the third block of input data. Buffer 4114 is being filled with a fourth block of input data during time block 404.
During time block 405, data stored in buffer 4122 is output to D/A converter and analog output 113. CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 4233 on the third block of input data.
During time block 501, a buffer 5111 receives and stores input data. CPU 101 performs the reset function at block 5200 and then is idle during remainder of time block 501 as indicated by idle block 531.
During time block 502, CPU 101 executes decode (DEC) block 5211. This involves reading data from buffer 5111 as illustrated in
During time block 503, CPU 101 continues and completes audio stream processing (ASP) block 5221. CPU 101 then performs pulse code modulation (PCM) encode at PCE block 5231. This differs from
During time block 504, data stored in buffer 5121 is output to D/A converter and analog output 113. CPU 101 performs audio stream processing (ASP) block 5222 and pulse code modulation (PCM) encode (PCE) at PCE block 5232 on the second block of input data. CPU 101 next executes DEC block 5213 followed by ASP block 5223 on the third block of input data. Buffer 5114 is being filled with a fourth block of input data during time block 504.
During time block 505, data stored in buffer 5122 is output to D/A converter and analog output 113. CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 5233 on the third block of input data.
As seen in
During time block 601, a buffer 6111 receives and stores input data. CPU 101 performs the reset function at block 6200 and then is idle during remainder of time block 601 as indicated by idle block 631.
During time block 602, CPU 101 executes decode (DEC) block 6211. This involves reading data from buffer 6111 as illustrated in
During time block 603, CPU 101 performs pulse code modulation (PCM) encode at PCE block 6231. PCE block 6231 must complete before the beginning of time block 604. Data from PCE block 6231 is stored in buffer 6121. CPU 101 next executes DEC block 6212 on the second block of input data. Once this completes CPU 101 begins audio stream processing (ASP) block 6222 on the second block of data. As shown in
During time block 604, data stored in buffer 6121 is output to D/A converter and analog output 113. CPU 101 completes audio stream processing (ASP) block 6222 and then performs pulse code modulation (PCM) encode (PCE) at PCE block 6232 on the second block of input data. CPU 101 next executes DEC block 6213 on the third block of input data. Buffer 6114 is being filled with a fourth block of input data during time block 604.
During time block 605, data stored in buffer 6122 is output to D/A converter and analog output 113. CPU 101 performs audio stream processing (ASP) 6223 on the third block of data. CPU 101 next performs pulse code modulation (PCM) encode (PCE) at PCE block 6233 on the third block of input data. The data from this operation is stored in buffer 6123. CPU 101 then executes decode (DEC) block 6214 followed by audio stream processing (ASP) block 6224. Buffer 6115 is being filled with the fifth block of input data.
During time block 606, data stored in buffer 6123 is output to D/A converter and analog output 113. CPU 101 performs pulse code modulation (PCM) encode (PCE) at PCE block 6234 on the fourth block of input data.
During time block 701, a buffer 7111 receives and stores input data. CPU 101 performs the reset function at block 7200 and then is idle during remainder of time block 701 as indicated by idle block 731.
During time block 702, CPU 101 executes decode (DEC) block 7211 on the first block of data. CPU 101 next performs audio stream processing (ASP) block 7221 on the first block of data. This involves further processing on data from decode (DEC) block 7211. CPU 101 then performs pulse code modulation (PCM) encode (PCE) at PCE block 7231 on the first block of input data. CPU 101 is then idle the remainder of time block 702 as shown at idle block 732. Buffer 7112 is being filled with a second block of input data during time block 702.
During time block 703, CPU 101 executes DEC block 7212 on the second block of input data. Once this completes CPU 101 begins audio stream processing (ASP) block 7222 on the second block of data. As shown in
During time block 704, data stored in buffer 7121 is output to D/A converter and analog output 113. CPU 101 completes audio stream processing (ASP) block 7222 and then performs pulse code modulation (PCM) encode (PCE) at PCE block 7232 on the second block of input data. Buffer 7114 is being filled with a fourth block of input data during time block 704.
During time block 705, data stored in buffer 7122 is output to D/A converter and analog output 113. CPU 101 performs decode (DEC) block 7213 on the third block of data, then performs audio stream processing (ASP) 7223 on the third block of data. CPU 101 next performs pulse code modulation (PCM) encode (PCE) at PCE block 7233 on the third block of input data. The data from this operation is stored in output buffer 7123. CPU 101 then executes decode (DEC) block 7214 on the fourth block of data. Buffer 7115 is being filled with the fifth block of input data.
During time block 706, data stored in buffer 7123 is output to D/A converter and analog output 113. CPU 101 performs audio stream processing (ASP) 7224 and then performs pulse code modulation (PCM) encode (PCE) at PCE block 7234 on the fourth block of input data.
The preceding examples of
ΔIO=2N/fs
During time block 801, a first buffer 8111 receives and stores input data. CPU 101 performs reset (RST) at block 8200 and is idle during the remainder of time block 801 as indicated by idle block 831.
During time block 802, CPU 101 executes decode (DEC) block 8211. This involves reading data from buffer 8111 as illustrated in
During time block 803, data stored in buffer 8121 is output to D/A converter and analog output 113. CPU 101 executes DEC block 8212 followed by ASP block 8222 on the second block of input data. CPU 101 then executes PCE block 8232 on this second block of input data. Buffer 8113 is being filled with a third block of input data during time block 803.
During time block 804, data stored in buffer 8122 is output to D/A converter and analog output 113. CPU 101 performs DEC block 8213 followed by ASP block 8223 on the third block of input data. CPU 101 then executes PCE block 8233 on this third block of input data. CPU 101 is idle during idle block 834. Buffer 8114 is being filled with a fourth block of input data during time block 804.
The input and output buffers employed are again assumed to hold two N-sample blocks, since no improvement can be achieved with larger buffers. The output buffer can no longer be implemented in a simple ping-pong fashion where the output blocks alternate. Some embodiments use a circular buffer. Other embodiments have the first/partial block as a subset of a full block.
The example shown in
During time block 1001, CPU 101 initiates the process via reset (RST) block 10200. CPU 101 is idle for the remainder of time block 1001 as shown by idle block 1031. During time block 1001 input buffer 10111 is filled as previously described.
During time block 1002, CPU 101 executes decode (DEC) block 10211, followed by audio stream processing (ASP) block 10221 and then pulse code modulation encode (PCE) block 10231 upon data from the first block. CPU 101 is idle for the remainder of time block 1002 as shown by idle block 1032. The process requires that decode (DEC) block 10211 completes before the beginning of time block 1003 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 10231 complete to provide the data for output to buffer 10121 before the beginning of this output. This output begins 2.25 blocks after the start of input data (beginning of block 1001) within time block 1003. This delay results in a latency of 2.25 time blocks. During time block 1002 input buffer 10112 is being filled.
During time block 1003, CPU 101 executes decode (DEC) block 10212, followed by audio stream processing (ASP) block 10222 and then pulse code modulation encode (PCE) block 10232 upon data from the second input block. CPU 101 is idle for the remainder of time block 1003 as shown by idle block 1033. The process requires that decode (DEC) block 10212 complete before the beginning of time block 1004 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 10232 complete to provide the data for output to buffer 10122 before the beginning of this output. This output begins 3.25 blocks after the start of input data (beginning of block 1001) within time block 1004. During time block 1003 input buffer 10113 is being filled.
During time block 1004, CPU 101 executes decode (DEC) block 10213, followed by audio stream processing (ASP) block 10223 and then pulse code modulation encode (PCE) block 10233 upon data from the third input block. CPU 101 is idle for the remainder of time block 1004 as shown by idle block 1034. The process requires that decode (DEC) block 10213 complete before the beginning of time block 1005 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 10233 complete to provide the data for output by buffer 10123 before the beginning of this output. This output begins 4.25 blocks after the start of input data (beginning of block 1001) within time block 1005. Data output from buffer 10121 to D/A converter and analog output 113 begins in block 1003. This output completes during block 1004. During time block 1004 input buffer 10114 is being filled.
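The output deadlines in this walkthrough can be checked with a small simulation. The sketch below is a simplified model under stated assumptions: time is measured in block periods, the DEC, ASP and PCE work for a block is lumped into a single duration, processing of a block starts only after the block has fully arrived, and the separate requirement that the input buffer be free for new data is ignored. The function name and the processing times are hypothetical.

```python
def first_missed_deadline(processing_times, latency_blocks):
    """Return the 1-based index of the first block whose output would starve, else None.

    Block k arrives during [k - 1, k]; its output is scheduled to begin at
    (k - 1) + latency_blocks, measured from the start of input.
    """
    cpu_free = 0.0
    for k, duration in enumerate(processing_times, start=1):
        start = max(float(k), cpu_free)       # wait for the block and for the CPU
        cpu_free = start + duration           # DEC + ASP + PCE lumped together
        deadline = (k - 1) + latency_blocks   # output of block k begins here
        if cpu_free > deadline:
            return k
    return None


# Hypothetical per-block processing times with one peak of 1.3 block periods.
loads = [0.9, 0.9, 1.3, 0.8, 0.9]
for latency in (2.0, 2.25, 2.75, 3.0):
    print(latency, first_missed_deadline(loads, latency))
```

With these assumed loads, latencies of 2.0 and 2.25 blocks miss the deadline at the third block, while 2.75 blocks already suffices. This illustrates how a fractional latency can absorb a processing peak without rounding up to a full three blocks.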
The example shown in
During time block 1101, CPU 101 initiates the process via reset (RST) block 11200. CPU 101 is idle for the remainder of time block 1101 as shown by idle block 1131. During time block 1101 buffer 11111 is filled as previously described.
During time block 1102, CPU 101 executes decode (DEC) block 11211, followed by audio stream processing (ASP) block 11221 and then pulse code modulation encode (PCE) block 11231 upon data from the first block. CPU 101 is idle for the remainder of time block 1102 as shown by idle block 1132. The process requires that decode (DEC) block 11211 completes before the beginning of time block 1103 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 11231 completes to provide the data for output by buffer 11121 before the beginning of this output. This output begins 2.25 blocks after the start of input data (beginning of block 1101) within time block 1103. This delay results in a latency of 2.25 time blocks. During time block 1102 input buffer 11112 is being filled.
During time block 1103, CPU 101 executes decode (DEC) block 11212, followed by audio stream processing (ASP) block 11222 upon data from the second block.
During time block 1104, CPU 101 executes pulse code modulation encode (PCE) block 11232 filling output buffer 11122. The process requires that pulse code modulation encode (PCE) block 11232 complete to provide the data for output from buffer 11122 before the beginning of this output. This output begins 3.25 blocks after the start of input data (beginning of block 1101) within time block 1104. CPU 101 then executes decode (DEC) block 11213, followed by audio stream processing (ASP) block 11223 and then pulse code modulation encode (PCE) block 11233 upon data from the third input block. The process requires that decode (DEC) block 11213 complete before the beginning of time block 1105 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 11233 complete to provide the data for output at 11123 before the beginning of this output. This output begins 4.25 blocks after the start of input data (beginning of block 1101) within time block 1105. During time block 1104 input buffer 11114 is being filled.
During time block 1201, CPU 101 initiates the process via reset (RST) block 12200. CPU 101 is idle for the remainder of time block 1201 as shown by idle block 1231. During time block 1201 input buffer 12111 is filled as previously described.
During time block 1202, CPU 101 executes decode (DEC) block 12211, followed by audio stream processing (ASP) block 12221 and then pulse code modulation encode (PCE) block 12231 upon data from the first block. CPU 101 is idle for the remainder of time block 1202 between the end of ASP block 12221 and the beginning of PCE block 12231 as shown by idle block 1232. The process requires that decode (DEC) block 12211 completes before the beginning of time block 1203 when the buffer must be clear to receive new data. The process also requires that pulse code modulation encode (PCE) block 12231 completes within time block 1203 to provide the data for output at buffer 12121 before the beginning of this output. This output begins 2.75 blocks after the start of input data (beginning of block 1201) within time block 1203. This delay results in a latency of 2.75 time blocks. During time block 1202 input buffer 12112 is being filled.
During time block 1203, CPU 101 executes decode (DEC) block 12212, then begins audio stream processing (ASP) block 12222 upon data from the second input block.
During time block 1204, CPU 101 completes audio stream processing (ASP) 12222 then executes pulse code modulation encode (PCE) block 12232 on the second block of data filling buffer 12122. The process requires that pulse code modulation encode (PCE) block 12232 complete to provide the data for output by buffer 12122 before the beginning of this output. This output begins 3.75 blocks after the start of input data (beginning of block 1201) within time block 1204. CPU 101 then executes decode (DEC) block 12213 on the third block of data. The process requires that decode (DEC) block 12213 is completed before the end of time block 1204 so that a buffer is available for input data. CPU 101 is completely used during time block 1204 leaving no idle block. During time block 1204 input buffer 12114 is being filled.
Circular buffer memory 1301 is circular in storage address. Continued incrementing of input pointer 1311 eventually reaches and exceeds the last memory address in circular buffer memory 1301. Upon reaching the end of the memory addresses, input pointer 1311 wraps around to the beginning addresses of circular buffer memory 1301.
A similar process occurs for output. Upon an output command circular buffer memory 1301 supplies the data stored at the address of the output address port (supplied from output pointer 1321) to the output port. Output pointer 1321 is incremented by an amount corresponding to the output data width. Output pointer 1321 circularly wraps around from the end of the circular buffer memory 1301 addresses to the beginning addresses as previously described for input pointer 1311. The storage capacity of circular buffer memory 1301 must be large enough to accommodate the 2+x delay from input to output illustrated in
The method selects the buffer size in block 1403. According to this invention the buffer size corresponds to a delay of 2+x blocks, where 0<x<1. The particular x selected provides a desired combination of total delay and adaptability to peak processing requirements.
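A circular buffer with wrapping input and output pointers, sized for a delay of 2+x blocks, might be sketched as follows. This is an illustrative software model only, not the disclosed hardware. The class and attribute names are hypothetical, and rounding the capacity up to a whole number of samples is an assumption.

```python
import math


class CircularBuffer:
    """Software model of a circular buffer with separate input and output pointers."""

    def __init__(self, block_size, x):
        if not 0.0 < x < 1.0:
            raise ValueError("x must satisfy 0 < x < 1")
        # Assumed sizing: enough samples to cover a delay of (2 + x) blocks.
        self.capacity = math.ceil((2 + x) * block_size)
        self.data = [0] * self.capacity
        self.in_ptr = 0    # models the input pointer
        self.out_ptr = 0   # models the output pointer
        self.count = 0     # samples currently stored

    def write(self, samples):
        samples = list(samples)
        if self.count + len(samples) > self.capacity:
            raise OverflowError("write would overwrite unread samples")
        for s in samples:
            self.data[self.in_ptr] = s
            self.in_ptr = (self.in_ptr + 1) % self.capacity  # wrap to the start
        self.count += len(samples)

    def read(self, n):
        if n > self.count:
            raise ValueError("output underrun: not enough samples stored")
        out = []
        for _ in range(n):
            out.append(self.data[self.out_ptr])
            self.out_ptr = (self.out_ptr + 1) % self.capacity  # wrap to the start
        self.count -= n
        return out


buf = CircularBuffer(block_size=4, x=0.25)   # capacity of 9 samples
buf.write(range(6))
print(buf.read(4))
buf.write([6, 7, 8, 9, 10])                  # input pointer wraps past the end
print(buf.read(7))
```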
Block 1404 inputs the next block of data into an input buffer. For the first iteration of this loop the next data is the first data. This is described above in conjunction with
Block 1405 performs the required data processing of the method on one block of data. As described above in conjunction with
Decision block 1408 tests to determine if the block of data of the current iteration of the loop is the last block of data. If this is not the last block of data (No at decision block 1408), then the method returns to block 1404 to input the next block of data. If this is the last block of data (Yes at decision block 1408), then the method is complete and terminates at end block 1409.
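The loop formed by blocks 1404 through 1408 can be outlined in code. The sketch below is a schematic rendering under assumptions: `process_block` and `write_output` are hypothetical callables standing in for the data processing and output steps, and the timing of the output relative to the input is not modeled.

```python
def run_blocks(source, process_block, write_output, block_size, x):
    """Process blocks until the last one; returns (buffer size in samples, block count)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must satisfy 0 < x < 1")
    buffer_size = (2 + x) * block_size       # block 1403: select the buffer size
    blocks = iter(source)
    current = next(blocks, None)             # block 1404: input the first block
    count = 0
    while current is not None:
        upcoming = next(blocks, None)        # look ahead to detect the last block
        write_output(process_block(current)) # block 1405, then the output step
        count += 1
        if upcoming is None:                 # decision block 1408: last block?
            break                            # end block 1409
        current = upcoming                   # return to block 1404
    return buffer_size, count


# Usage with stand-in processing: each "block" is a list of samples doubled in value.
collected = []
size, n_blocks = run_blocks([[1, 2], [3, 4], [5, 6]],
                            lambda blk: [2 * s for s in blk],
                            collected.append, block_size=2, x=0.5)
print(size, n_blocks, collected)
```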
As shown in
This application claims priority under 35 U.S.C. 119(e)(1) to U.S. Provisional Application No. 61/774,301 filed Mar. 7, 2013.
Number | Date | Country | |
---|---|---|---|
20140257823 A1 | Sep 2014 | US |
Number | Date | Country | |
---|---|---|---|
61774301 | Mar 2013 | US |