Example embodiments described in this document relate to a high-speed interface for ancillary data for serial digital interface applications.
Serial Digital Interface (“SDI”) refers to a family of video interfaces standardized by the Society of Motion Picture and Television Engineers (“SMPTE”). SDI is commonly used for the serial transmission of digital video data within a broadcast environment. Ancillary data (such as digital audio and control data) can be embedded in inactive (i.e. non-video) regions of an SDI stream, such as the horizontal blanking region for example.
High-speed interfaces can be used to extract ancillary data from a SDI stream for processing or transmission or to supply ancillary data for embedding into an SDI stream, or both.
This disclosure relates to the field of high speed data interfaces.
Example embodiments of the invention relate to a high-speed serial data interface (“HSI”) transmitter that can be used to extract ancillary data (such as digital audio and control data) from the inactive regions (i.e. non-video) of a Serial Digital Interface (“SDI”) data stream for further processing and an HSI receiver for supplying ancillary data to be embedded into the inactive regions of an SDI data stream.
As previously noted, SDI is commonly used for the serial transmission of digital video data within a broadcast environment. As shown in
As SDI data rates increase, the bandwidth available for carrying ancillary data such as digital audio also increases. For example, 3G-SDI formats can accommodate 32 channels of audio data embedded within the horizontal blanking region 102. In some large router/switcher designs, routing the 16 channel pairs that make up the 32 channels of audio data to “audio breakaway” processing hardware can be impractical, especially as the density of SDI channels within these components continues to increase. Additionally, other broadcast components such as audio embedder/de-embedders and monitoring equipment may use Field-Programmable Gate Arrays (“FPGA”) for audio processing, and pins are at a significant premium on most FPGA designs.
Example embodiments described herein include an HSI transmitter and HSI receiver that in some applications can provide a convenient uni-directional single-wire point-to-point connection for transmitting audio data, notably for “audio breakaway” processing where the audio data in an SDI data stream is separated from the video data for processing and routing. In some applications, for example in router and switch designs, a single-wire point-to-point connection may ease routing congestion. In some applications, for example when providing audio data to an FPGA, a single-wire interface for transferring digital audio may reduce both routing complexity and pin count.
By way of example,
In large production routers such as video router 200 and master controllers such as controller 300, if the received audio data is formatted as per AES3, each audio channel pair would require its own serial link, and the routing overhead for 16 channel pairs creates challenges for board designs. However, according to example embodiments described herein, each audio channel pair is multiplexed onto a high-speed serial link along with supplemental ancillary data such as audio control packets and audio sample clock information, such that all audio information from the video stream can be routed on one serial link output by HSI transmitter 204, and demultiplexed by the audio processing hardware (for example audio processor 302 of master controller 300).
In example embodiments, audio sample clock information embedded on the HSI data stream produced by HSI transmitter 204 can be extracted by the audio processing hardware (for example audio processor 302), to re-generate an audio sample reference clock if necessary. After audio processing, the audio processing hardware (for example audio processor 302) can re-multiplex the audio data and clock information onto the HSI data stream as per a predefined HSI protocol. The processed ancillary data (e.g. the audio payload data and clock information) in the HSI data stream can be embedded by HSI receiver 212 into a new SDI link at the output board 208.
Another example application of the HSI is illustrated in
A further example application of HSI is shown in
Accordingly, it will be appreciated that there are a number of possible applications for an HSI that comprises a single-wire interface used for point-to-point transmission of ancillary data (such as digital audio and control data) for use in SDI systems.
In at least some examples the usage models of the HSI within the SDI application space include HSI transmitter 204 and HSI receiver 212. In some examples, the HSI transmitter 204 is used to extract ancillary data from standard definition (“SD”), high definition (“HD”) and 3G SDI sources, and transmit it serially. An HSI transmitter 204 may for example be embedded within an SDI Receiver 201 or an input reclocker. In some examples, the HSI receiver 212 is used to supply ancillary data to be embedded into the inactive (i.e. non-video) regions of an SDI stream. An HSI receiver 212 may for example be embedded within an SDI Transmitter 211 or an output reclocker.
In
An example embodiment of an HSI transmitter 204 as shown in
As noted above, in addition to the embedded HSI transmitter 204, the SDI receiver 201 of
In example embodiments, the HSI transmitter 204 operates at a user-selectable clock rate (HSI_CLK 852) to allow users to manage the overall latency of the audio processing path: a higher data rate results in a shorter latency for audio data extraction and insertion. At higher data rates, noise immunity is improved by transmitting the data differentially. The user-selectable clock rate HSI_CLK 852 may be a multiple of the video clock rate PCLK to simplify the FIFO architecture used to transfer data from the video clock domain. However, the HSI transmitter 204 can alternatively run at a multiple of the audio sample rate so that it can be easily divided down to generate a frequency useful for audio processing/monitoring. Due to the way sample clock information is embedded in the HSI stream (as explained below), the HSI clock is entirely decoupled from the audio sample clock and the video clock rate, and any clock may be used for HSI_CLK 852.
Within the HSI transmitter 204, the video line data PDATA and the extracted timing signal are received as inputs by the ancillary data detection module or block 614, which performs ancillary data detection by searching for ancillary data packet headers within the horizontal blanking region 102 of the video line data PDATA. In an example embodiment, each ancillary data packet is formatted as per SMPTE ST 291. An example of an SMPTE ancillary data packet (specifically an SD-SDI packet 1600) is shown in
The ancillary data detection block 614 searches for DIDs corresponding to audio data packets (1600 or 1000—see
Audio Data FIFO buffers 616 are provided for storing the audio data extracted at detection block 614. Audio data is extracted during the horizontal blanking period 102 and each audio data packet 1000, 1600 is sorted by audio group and stored in a corresponding audio group FIFO buffer 616 where the audio data packet 1000, 1600 is tagged with a sample number to track the order in which packets are received. In an example, for HD-SDI and 3G-SDI data rates, each audio data packet 1000 contains 6 ECC words (see
After the first audio data packet 1000, 1600 on a line is received, the HSI transmitter 204 can begin transmitting serial audio data for that line.
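By way of illustration only, the following C sketch outlines the kind of scan performed by the ancillary data detection block 614 and the sorting of detected packets into per-group Audio Data FIFO buffers 616. The DID values, the audio_fifo_push hook and the running sample tag are assumptions made for the sketch; the packet layouts actually used are those of SMPTE ST 291 and the figures referenced above.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch of ancillary data detection (block 614) and sorting
 * into per-group Audio Data FIFO buffers 616.  The scan looks for the
 * ancillary data flag (000h, 3FFh, 3FFh) in the horizontal blanking region
 * and then compares the DID against a table of audio data DIDs. */

#define NUM_AUDIO_GROUPS 4u

/* Hypothetical hook that writes one packet into the FIFO for its group,
 * tagged with a running sample number so downstream logic can track order. */
void audio_fifo_push(unsigned group, const uint16_t *pkt, size_t n_words,
                     uint32_t sample_tag);

static const uint8_t audio_did[NUM_AUDIO_GROUPS] = {
    0xE7, 0xE6, 0xE5, 0xE4          /* illustrative DIDs, audio groups 1-4 */
};

void anc_detect(const uint16_t *line, size_t len, uint32_t *sample_tag)
{
    for (size_t i = 0; i + 6u < len; i++) {
        /* ancillary data flag: 000h, 3FFh, 3FFh */
        if (line[i] != 0x000u || line[i+1] != 0x3FFu || line[i+2] != 0x3FFu)
            continue;

        uint8_t did = (uint8_t)(line[i+3] & 0xFFu);   /* ignore parity bits */
        uint8_t dc  = (uint8_t)(line[i+5] & 0xFFu);   /* data count */
        if (i + 7u + dc > len)
            break;                                    /* truncated packet */

        for (unsigned g = 0; g < NUM_AUDIO_GROUPS; g++) {
            if (did == audio_did[g]) {
                /* copy DID..checksum into the group FIFO */
                audio_fifo_push(g, &line[i+3], (size_t)dc + 4u, (*sample_tag)++);
                break;
            }
        }
        i += 6u + dc;               /* resume the scan after this packet */
    }
}
```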
Audio Control FIFO buffers 618 are provided for storing audio control packets 1702, 1802 (examples of a high definition audio control packet 1702 can be seen in
A TFM FIFO buffer 620 is used to store the Two-Frame Marker packet 1902 (an example of TFM Packet 1902 can be seen in
Collectively, the serial to parallel conversion module 604, word alignment module 608, timing and reference signal detection module 610, SDI processing module 612, ancillary Data Detection module 614, Audio Data FIFO buffers 616, Audio Control FIFO buffers 618, and TFM FIFO buffer 620 operate as serial data extraction modules 650 of the SDI receiver 201.
A FIFO Read Controller 622 manages the clock domain transition from the video pixel clock (PCLK) to the HSI clock (HSI clock requirements are discussed below). As soon as audio/control/TFM data is written into the corresponding FIFO buffers 616, 618, 620, the data can be read out to be formatted and serialized. Ancillary data is serviced on a first-come, first-served basis.
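The first-come, first-served servicing policy of the FIFO Read Controller 622 could, for example, be modelled as a small ready queue, as in the sketch below. The source count and the fifo_has_data and format_and_serialize_one_packet helpers are hypothetical placeholders for the real buffer logic.

```c
#include <stdbool.h>

/* Sketch of first-come, first-served servicing: each source FIFO (audio
 * groups, audio control, TFM) is entered into a ready queue the moment data
 * is written into it, and the controller drains the queues strictly in that
 * arrival order. */

enum { N_SOURCES = 10 };                 /* e.g. 8 audio groups + control + TFM */

static unsigned ready_q[N_SOURCES + 1];  /* small ring buffer of ready sources */
static unsigned q_head, q_tail;
static bool     queued[N_SOURCES];

bool fifo_has_data(unsigned src);                     /* hypothetical */
void format_and_serialize_one_packet(unsigned src);   /* hypothetical */

/* Called (after the clock domain crossing) whenever data is written into
 * the FIFO for source 'src'. */
void on_fifo_write(unsigned src)
{
    if (!queued[src]) {                  /* enqueue each source once per burst */
        queued[src] = true;
        ready_q[q_tail] = src;
        q_tail = (q_tail + 1u) % (N_SOURCES + 1u);
    }
}

/* Servicing loop: drain sources strictly in the order they became ready. */
void service_fifos(void)
{
    while (q_head != q_tail) {
        unsigned src = ready_q[q_head];
        q_head = (q_head + 1u) % (N_SOURCES + 1u);
        queued[src] = false;
        while (fifo_has_data(src))
            format_and_serialize_one_packet(src);
    }
}
```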
Audio Clock Phase Extraction block 624 extracts audio clock phase information. In example embodiments, audio clock phase information is extracted in one of two ways: for HD-SDI and 3G-SDI data, it is decoded from the clock phase words in the SMPTE audio data packet; for SD-SDI, it is generated internally by the HSI transmitter 204 and synchronized to the video frame rate.
Reference clock information is inserted into the HSI data stream by reference clock SYNC insertion block 626. In one example, the reference clock is transmitted over the HSI via unique synchronization words 1210 (see
Audio Cadence Detection block 628 uses audio clock phase information from audio clock phase extraction block 624 to re-generate the audio sample reference clock 1212, and this reference clock is used to count the number of audio samples per frame to determine the audio frame cadence. For certain combinations of audio sample rates and video frame rates, the number of audio samples per frame can vary over a specified number of frames (for example, up to five frames) with a repeatable cadence. In some applications, it is important for downstream audio equipment to be aware of the audio frame cadence, so that the number of audio samples in a given frame is known.
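As an illustration of cadence detection, the sketch below counts recovered audio sample clock edges per video frame and looks for a repeating pattern of per-frame counts over a window of up to five frames, consistent with the description above. A familiar example is 48 kHz audio against 29.97 Hz video, which yields 8008 samples every five frames in the pattern 1602, 1601, 1602, 1601, 1602.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

/* Sketch of audio cadence detection: count sample clock edges between video
 * frame boundaries and look for a repeating pattern of per-frame counts. */

#define CADENCE_LEN 5                    /* "up to five frames" per the text */

static uint32_t frame_counts[2 * CADENCE_LEN];
static unsigned n_frames;
static uint32_t samples_this_frame;

/* Call once per recovered audio sample clock edge. */
void on_audio_sample_edge(void) { samples_this_frame++; }

/* Call at each video frame boundary; returns true once the last CADENCE_LEN
 * frame counts repeat the previous ones with some period 'len'. */
bool on_frame_boundary(unsigned *cadence_len_out)
{
    /* shift in the newest per-frame count */
    memmove(&frame_counts[0], &frame_counts[1],
            sizeof(frame_counts) - sizeof(frame_counts[0]));
    frame_counts[2 * CADENCE_LEN - 1] = samples_this_frame;
    samples_this_frame = 0;
    if (++n_frames < 2 * CADENCE_LEN)
        return false;                    /* not enough history yet */

    for (unsigned len = 1; len <= CADENCE_LEN; len++) {
        bool repeats = true;
        for (unsigned i = 0; i < CADENCE_LEN; i++) {
            if (frame_counts[CADENCE_LEN + i] !=
                frame_counts[CADENCE_LEN + i - len]) {
                repeats = false;
                break;
            }
        }
        if (repeats) {
            *cadence_len_out = len;      /* e.g. 5 for 48 kHz at 29.97 Hz */
            return true;
        }
    }
    return false;
}
```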
In some example embodiments, Dolby E Detection block 630 is provided to determine if the audio packets contain non-PCM data formatted as per Dolby E. Using Dolby E, up to eight channels of broadcast-quality audio, plus related metadata, can be distributed via any stereo (AES/EBU) pair. Dolby E is non-PCM encoded, but maps readily into the AES3 serial audio format. Typically, any PCM audio processor must be disabled in the presence of Dolby E. In an example, Dolby E is embedded as per SMPTE 337M, which defines the format for Non-PCM Audio and Data in an AES3 Serial Digital Audio Interface. Non-PCM packets contain a Burst Preamble (shown in
Key Length Value (“KLV”) Formatting block 632 operates as follows. As audio/control/TFM data is read from the corresponding FIFO buffers 616, 618 and 620 (one byte per clock cycle), it is respectively encapsulated in a KLV packet 1112 (KLV audio data packet), 1704 (KLV audio control-HD packet), 1804 (KLV audio control-SD packet) or 1904 (KLV TFM packet), where KLV = Key + Length + Value, as shown in
After KLV formatting, a number of encoding modules 660 perform further processing. The HSI data stream is 4b/5b encoded at 4b/5b encoding block 638 such that each 4-bit nibble 1402 is mapped to a unique 5-bit identifier 1404 (as shown in
After 4b/5b encoding, packet Framing block 634 is used to add start-of-packet 1202 and end-of-packet 1204 delimiters to each KLV packet 1112, 1704, 1804, 1904 in the form of unique 8-bit sync words, prior to serialization (see
Sync Stuffing block 636 operates as follows. In one example of HSI transmitter 204, if less than eight groups of audio are extracted, then only these groups are transmitted on the HSI—the HSI stream is not padded with “null” packets for the remaining groups. Since each KLV packet 1112, 1704, 1804, 1904 contains a unique key 1110, 1710, 1810, 1910, the downstream logic only needs to decode audio groups as they are received. To maintain a constant data rate, the HSI stream is stuffed with sync (electrical idle) words 1206 that can be discarded by the receiving logic.
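Before turning to the NRZI step, the 4b/5b mapping applied at block 638 can be sketched as follows. The nibble-to-code table used here is the widely known FDDI/100BASE-X style code set and is only an assumption for illustration; the actual mapping is the one defined in the figures referenced above.

```c
#include <stdint.h>

/* Sketch of the 4b/5b step: each 4-bit nibble is replaced by a 5-bit code
 * group, so an 8-bit byte becomes a 10-bit word.  The table is the common
 * FDDI/100BASE-X code set, used here only as an illustrative stand-in. */

static const uint8_t enc_4b5b[16] = {
    0x1E, 0x09, 0x14, 0x15, 0x0A, 0x0B, 0x0E, 0x0F,
    0x12, 0x13, 0x16, 0x17, 0x1A, 0x1B, 0x1C, 0x1D
};

/* Encode one byte into a 10-bit word (high nibble in the upper 5 bits). */
uint16_t encode_4b5b(uint8_t byte)
{
    return (uint16_t)((enc_4b5b[byte >> 4] << 5) | enc_4b5b[byte & 0x0F]);
}

/* Decode one 10-bit word back to a byte; returns -1 for code groups that
 * do not carry data (for example sync or idle words). */
int decode_4b5b(uint16_t word)
{
    static int8_t dec[32];
    static int init;
    if (!init) {
        for (int i = 0; i < 32; i++) dec[i] = -1;
        for (int i = 0; i < 16; i++) dec[enc_4b5b[i]] = (int8_t)i;
        init = 1;
    }
    int hi = dec[(word >> 5) & 0x1F];
    int lo = dec[word & 0x1F];
    if (hi < 0 || lo < 0)
        return -1;
    return (hi << 4) | lo;
}
```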
After sync stuffing, each 10-bit data word is non-return to zero inverted (“NRZI”) encoded by NRZI Encoding block 640 for DC balance, and serialized at Serialization block 642 for transmission. The combination of 4b/5b encoding and NRZI encoding provides sufficient DC balance with enough transitions for reliable clock and data recovery.
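NRZI coding itself is simple to express: a logical '1' is transmitted as a change of line level and a '0' as no change, so the 4b/5b coded stream yields frequent edges for clock recovery. A short sketch follows; the bit order on the wire is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of the NRZI step (block 640): '1' toggles the line level, '0'
 * holds it.  Bits are handled one per byte, LSB first, for clarity. */

/* Encode n bits from 'in' (one bit per byte) into line levels in 'out'. */
void nrzi_encode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t level = 0;                 /* current line level */
    for (size_t i = 0; i < n; i++) {
        if (in[i] & 1u)
            level ^= 1u;               /* '1' toggles the line */
        out[i] = level;                /* '0' holds the previous level */
    }
}

/* Decode line levels back to bits: a change of level is a '1'. */
void nrzi_decode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(in[i] ^ prev);
        prev = in[i];
    }
}
```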
Referring again to
The operation of the different functional blocks of the HSI receiver 212 will now be described according to one example embodiment. In one example of the HSI receiver 212, several decoding modules 860 process the incoming HSI data. HSI data (HSI-IN) 850 is deserialized by deserialization block 802 into a 10-bit wide parallel data bus. The parallel data is NRZI decoded by NRZI decoding block 804 so that the unique sync words that identify the start/end of KLV packets 1112, 1704, 1804 and audio sample clock edges 1210 can be detected in parallel.
The Packet Sync Detection block 808 performs a process wherein the incoming 10-bit data words are searched for sync words corresponding to the start-of-packet 1202 and end-of-packet 1204 delimiters. If a KLV packet 1112, 1704, 1804, 1904 is detected, the start/end delimiters 1202, 1204 are stripped to prepare the KLV packet 1112, 1704, 1804, 1904 for decoding.
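The delimiter search performed by the Packet Sync Detection block 808 might be sketched as below. The start-of-packet, end-of-packet and idle word values are purely hypothetical placeholders for the sync words 1202, 1204 and 1206 defined in the figures.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of Packet Sync Detection (block 808): watch the NRZI-decoded
 * 10-bit word stream for start/end delimiters, strip them, and hand the
 * enclosed words on for 4b/5b and KLV decoding. */

#define SYNC_SOP  0x3A5u   /* hypothetical start-of-packet word */
#define SYNC_EOP  0x35Au   /* hypothetical end-of-packet word   */
#define SYNC_IDLE 0x2AAu   /* hypothetical stuffing/idle word   */

void klv_words_ready(const uint16_t *words, size_t n);   /* next stage */

void packet_sync_scan(const uint16_t *stream, size_t len)
{
    size_t start = 0;
    int in_packet = 0;

    for (size_t i = 0; i < len; i++) {
        uint16_t w = stream[i];
        if (w == SYNC_IDLE)
            continue;                  /* stuffing words are discarded */
        if (!in_packet && w == SYNC_SOP) {
            in_packet = 1;
            start = i + 1;             /* payload begins after the SOP word */
        } else if (in_packet && w == SYNC_EOP) {
            klv_words_ready(&stream[start], i - start);  /* delimiters stripped */
            in_packet = 0;
        }
    }
}
```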
After detecting the packet synchronization words, the HSI data is 4b/5b decoded by 4b/5b decoding block 806 such that each 5-bit identifier 1404 is mapped to a 4-bit nibble 1402 (each 10-bit word 1408 becomes 8-bits 1406, as shown in
After stripping the start/end delimiters 1202, 1204 from the parallel data stream and performing 4b/5b decoding, the KLV audio payload packets 1112 (or other KLV packets 1704, 1804, 1904) can be decoded by the KLV decode block 810 per the 8-bit Key value and assigned to a corresponding audio data FIFO buffer 812 (or control 814 or TFM 816 FIFO buffer). In particular, audio data 1114 is stripped from its KLV audio payload packet 1112 and stored in its corresponding group FIFO buffer 812. In an example, each audio group 904 is written to its designated FIFO buffer 812 in sample order so that samples received out-of-order are re-ordered prior to insertion into the SDI stream 900.
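The key-based dispatch performed by the KLV decode block 810 for audio, audio control and TFM data can be summarized as in the sketch below. The key assignments and the single-byte length field are assumptions for illustration; the actual keys are the values 1110, 1710, 1810 and 1910 shown in the figures.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of KLV decoding (block 810): the packet body is a Key byte, a
 * Length byte and Length bytes of Value; each Value is routed to the FIFO
 * matching its Key.  Unknown keys are skipped, keeping the format
 * extensible. */

enum {                                    /* hypothetical key assignments */
    KEY_AUDIO_GROUP_1 = 0x01,             /* ...through audio group 8 = 0x08 */
    KEY_AUDIO_CTRL    = 0x10,
    KEY_TFM           = 0x20
};

void audio_group_store(unsigned group, const uint8_t *val, size_t len);
void audio_ctrl_store(const uint8_t *val, size_t len);
void tfm_store(const uint8_t *val, size_t len);

void klv_decode(const uint8_t *body, size_t body_len)
{
    size_t i = 0;
    while (i + 2 <= body_len) {
        uint8_t key = body[i];
        uint8_t len = body[i + 1];
        if (i + 2 + len > body_len)
            break;                        /* truncated packet */
        const uint8_t *val = &body[i + 2];

        if (key >= KEY_AUDIO_GROUP_1 && key < KEY_AUDIO_GROUP_1 + 8)
            audio_group_store(key - KEY_AUDIO_GROUP_1, val, len);
        else if (key == KEY_AUDIO_CTRL)
            audio_ctrl_store(val, len);
        else if (key == KEY_TFM)
            tfm_store(val, len);
        /* unknown keys fall through and are simply skipped */

        i += 2u + len;
    }
}
```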
Audio control data is stripped from its KLV packet 1704, 1804 by KLV decode block 810 and stored in its corresponding group audio control data FIFO buffer 814. Each audio group control packet 1704, 1804 is written to separate respective audio control data FIFO buffers 814 for insertion on specified line numbers.
TFM data 1906 is also stripped from its KLV packet 1904 by KLV decode block 810 and stored in a TFM FIFO buffer 816 for insertion on a user-specified line number. The TFM FIFO buffer 816 is used to store the Two-Frame Marker packet 1902. This packet is transmitted once per frame, and is used to provide 2-frame granularity for downstream video switches. This prevents video switching events from “cutting across” ancillary data that is encoded based on a two-frame period (non-PCM audio data is in this category).
Collectively, the Audio Data FIFO buffer 812, Audio Control FIFO buffer 814, and TFM FIFO buffer 816 operate as buffering modules 856.
The data from the buffering modules 856 is then processed into the SDI stream by a number of serial data insertion modules 858.
A FIFO read controller 818 manages the clock domain transition from the HSI clock HSI_CLK 852 to the video pixel clock PCLK 854. Audio/control/TFM data written into its corresponding FIFO buffer 812/814/816 is read out, formatted as per SMPTE 291M, and inserted into the horizontal blanking area 102 of the SDI stream 900 by an ancillary data insertion block 836 that is part of the SDI transmitter circuit 800. The read controller 818 determines the correct insertion point for the ancillary data based on the timing reference signals present in the video stream PDATA, and the type of ancillary data being inserted.
A Reference Clock Sync Detection block 824 performs a process that searches the incoming 10-bit data words output from NRZI and 4b/5b decoding blocks 804,806 for sync words 1210 that correspond to the rising edge of the embedded sample clock reference 1212. The audio sample reference clock 1212 is re-generated by directly decoding these sync words 1210.
The recovered audio reference clock 1212 is used by audio cadence detection block 826 to count the number of audio samples per frame to determine the audio frame cadence. The audio frame cadence information 1116 is embedded into the audio control packets 1702, 1802 as channel status information at channel status formatting block 830.
A Dolby E detection block 828 detects KLV audio payload packets 1112 tagged as “Dolby E” (tagging is described in further detail below). This tag 1128 indicates the presence of non-PCM audio data which is embedded into SMPTE audio packets 1000, 1600 as channel status information at channel status formatting block 830.
The audio sample rate is extracted by audio sample rate detection block 842 from the KLV “SR” tag. Alternatively it can be derived by measuring the incoming period of the audio sample reference clock 1212.
The channel status formatting block 830 extracts information embedded within KLV audio payload packets 1112 for insertion into SMPTE audio packets 1000, 1600 as channel status information. Channel status information is transmitted over a period of multiple audio packets using the “C” bit 1134 (2nd most significant bit of the sub-frame 1502) as shown in
A clock sync generation block 822 generates clock phase words for embedding in the SMPTE audio packets 1000 for HD-SDI and 3G-SDI data, as per SMPTE 299M. Note that for SD-SDI video, clock phase words are not embedded into the audio data packets 1600.
All audio payload data and audio control data, plus channel status information and clock phase information 1132, are embedded within SMPTE audio packets 1000 that are formatted as per SMPTE 291M by SMPTE ancillary data formatter 820 prior to insertion in the SDI stream 900. Channel status information is encoded along with the audio payload data for that channel within the audio channel data segments of the SMPTE audio packet 1000, 1600.
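For context, the AES3 channel status block is 192 bits long and is delivered one "C" bit per sub-frame, with the "Z" bit marking the start of a block. The sketch below shows how such a block might be accumulated before the formatter sets fields such as sample rate, frame cadence and non-PCM (Dolby E) flags; the positions of those fields within the block are not shown here.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

/* Sketch of channel status accumulation: collect one C bit per sub-frame
 * into a 24-byte (192-bit) block, resynchronizing whenever the Z bit marks
 * the start of a new block. */

typedef struct {
    uint8_t  block[24];      /* 192 channel status bits */
    unsigned bit_index;
    bool     synced;         /* true once a Z bit has been seen */
} chan_status_t;

/* Feed one sub-frame's Z and C bits; returns true when a full block is ready. */
bool chan_status_push(chan_status_t *cs, bool z_bit, bool c_bit)
{
    if (z_bit) {                           /* Z marks bit 0 of a new block */
        memset(cs->block, 0, sizeof(cs->block));
        cs->bit_index = 0;
        cs->synced = true;
    }
    if (!cs->synced)
        return false;

    if (c_bit)
        cs->block[cs->bit_index >> 3] |= (uint8_t)(1u << (cs->bit_index & 7u));
    cs->bit_index++;

    if (cs->bit_index == 192u) {           /* complete 24-byte block assembled */
        cs->synced = false;                /* wait for the next Z bit */
        return true;
    }
    return false;
}
```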
The SDI transmitter circuit 800 of
KLV mapping requirements will now be described in greater detail. In at least some applications, the HSI described herein may provide a serial interface with a simple formatting method that retains all audio information as per AES3, as shown in
In some but not all example applications, functional requirements of the HSI may include:
To address the functional requirements above, the audio sample data is formatted as per the KLV protocol (Key+Length+Value) as shown in
In example embodiments, the Z, V, U, C, and P bits as defined by the AES3 standard are passed through from the SMPTE audio data packet 1000 to the KLV audio payload packet 1112.
With reference to the DS bit 1120 noted above, if the incoming 3G-SDI stream is identified as dual-link by the video receiver, each 1.5 Gb/s link (Link A and Link B) may contain audio data. If the audio data packets in each link contain DIDs corresponding to audio groups 1 to 4 (audio channels 1 to 16) this indicates two unique 1.5 Gb/s links are being transmitted (2×HD-SDI, or dual stream) and the DS bit 1120 in the KLV audio payload packet 1112 is asserted. To distinguish the audio groups from each link, the HSI transmitter 204 maps the audio from Link B to audio groups 5 to 8. When the HSI receiver 212 receives this data with the DS bit 1120 set, this indicates the DIDs in Link B must be remapped back to audio groups 1 to 4 for ancillary data insertion. Conversely if the incoming 3G-SDI dual-link video contains DIDs corresponding to audio groups 1-4 on Link A, and audio groups 5-8 on Link B, the DS bit 1120 remains low, and the DIDs do not need to be re-mapped.
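The dual-stream rule described above can be summarized in a few lines of C. The field layout of the KLV audio payload packet 1112 used in the sketch is an assumption for illustration; only the remapping rule itself is taken from the description.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of dual-stream (DS) handling: when the DS bit is set, audio that
 * the transmitter mapped from Link B onto groups 5-8 is remapped back to
 * groups 1-4 (on Link B) before ancillary data insertion; when DS is low,
 * the groups are used as received.  The struct layout is assumed. */

typedef struct {
    bool     ds;            /* dual-stream flag (DS bit 1120)                 */
    bool     dolby_e;       /* non-PCM / Dolby E tag 1128                     */
    uint8_t  group;         /* audio group 1..8 as carried on the HSI         */
    bool     z, v, u, c, p; /* passed through from the SMPTE audio packet     */
    uint32_t sample[4];     /* one sample per channel in the group (assumed)  */
} klv_audio_payload_t;

/* Returns the audio group number to use for ancillary data insertion;
 * *link_b_out is set when the data belongs to the second 1.5 Gb/s stream. */
unsigned resolve_insertion_group(const klv_audio_payload_t *pkt, bool *link_b_out)
{
    if (pkt->ds && pkt->group >= 5u) {
        *link_b_out = true;            /* Link B of a 2x HD-SDI (dual-stream) pair */
        return pkt->group - 4u;        /* remap groups 5-8 back to 1-4 */
    }
    *link_b_out = false;               /* DS low: DIDs are used as received */
    return pkt->group;
}
```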
The KLV mapping approach shown in
As noted above, the audio sample reference clock 1212 is embedded into the HSI stream 1200 at the HSI transmitter 204. In some examples, the audio phase information may be derived in one of two ways: (a) For HD-SDI and 3G-SDI data, it is decoded from clock phase words 1132 in the SMPTE audio data packet; or (b) For SD-SDI, it is generated internally by the HSI transmitter 204 and synchronized to the video frame rate. The reference frequency will typically be 48 kHz or 96 kHz but is not limited to these sample rates. A unique sync pattern is used to identify the leading edge of the reference clock 1212. As shown in
Decoding of these sync words 1210 at the HSI receiver 212 enables re-generation of the audio clock 1212, and halts the read/write of the KLV packet 1112 from its corresponding extraction/insertion FIFO.
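A sketch of this clock regeneration is given below. The sync word value is a hypothetical placeholder for the sync word 1210, and a real design would stretch the single-cycle pulse produced here into a recovered clock of the desired duty cycle.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of reference clock regeneration: the transmitter inserts a unique
 * sync word at each rising edge of the audio sample clock, even in the
 * middle of a KLV packet; the receiver turns each such word back into a
 * clock edge while pausing the KLV FIFO read/write for that cycle. */

#define SYNC_AUDIO_CLK 0x2D2u          /* hypothetical clock sync word 1210 */

typedef struct {
    bool audio_ref_clk;                /* regenerated reference clock pulse */
    bool pause_klv;                    /* hold KLV FIFO read/write this cycle */
} clk_sync_out_t;

/* Process one decoded 10-bit word per HSI clock cycle. */
clk_sync_out_t reference_clock_sync(uint16_t word)
{
    clk_sync_out_t out;
    if (word == SYNC_AUDIO_CLK) {
        out.audio_ref_clk = true;      /* rising edge of the sample clock */
        out.pause_klv     = true;      /* this word carries no KLV data   */
    } else {
        out.audio_ref_clk = false;     /* clock low between sync words (assumed) */
        out.pause_klv     = false;
    }
    return out;
}
```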
In some example applications, the High-Speed Interface (HSI) described herein may allow a single-wire point-to-point connection (either single-ended or differential) for transferring multiple channels of digital audio data, which may ease routing congestion in broadcast applications where a large number of audio channels must be supported (including, for example, router/switcher designs, audio embedder/de-embedders and master controller boards). The single-ended or differential interface for transferring digital audio may provide meaningful cost savings, both in terms of routing complexity and pin count, for interfacing devices such as FPGAs. In some embodiments, the signal sent on the HSI is self-clocking. In some example configurations, the implementation of the 4b/5b and NRZI encoding/decoding for the serial link may be simple and efficient and may provide robust DC balance and clock recovery. KLV packet formatting can, in some applications, provide an efficient method of presenting audio data, tagging important audio status information (including the presence of Dolby E data), plus tracking of frame cadences and sample ordering. In some examples, the use of the KLV protocol may allow for the transmission of different types of ancillary data and be easily extensible.
In some examples, the HSI operates at a multiple of the video or audio clock rate that is user-selectable so users can manage the overall audio processing latency. Furthermore, in some examples the KLV encapsulation of the ancillary data packet allows up to 255 unique keys to distinguish ancillary data types. This extensibility allows for the transmission of different types of ancillary data beyond digital audio, and future-proofs the implementation to allow more than 32 channels of digital audio to be transmitted.
Unique SYNC words 1210 may be used to embed the fundamental audio reference clock onto the HSI stream 1200. This provides a mechanism for recovering the fundamental sampling clock 1212, and allows support for synchronous and asynchronous audio.
In some example embodiments, tagging each packet as Dolby E data or PCM data allows audio monitoring equipment to quickly determine whether it needs to enable or disable PCM processing.
The integrated circuit chips 202, 210 can be distributed by the fabricator in raw wafer form (that is, as a single wafer that has multiple unpackaged chips), as a bare die, or in a packaged form. In the latter case the chip is mounted in a single chip package (such as a plastic carrier, with leads that are affixed to a motherboard or other higher level carrier) or in a multichip package (such as a ceramic carrier that has either or both surface interconnections or buried interconnections). In any case the chip is then integrated with other chips, discrete circuit elements, and/or other signal processing devices as part of either (a) an intermediate product, such as a motherboard, or (b) an end product.
The present disclosure may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects as being only illustrative and not restrictive. The present disclosure intends to cover and embrace all suitable changes in technology.
This application claims the benefit of and priority to U.S. Patent Application Ser. No. 61/357,246 filed Jun. 22, 2010, the contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/CA11/50381 | 6/22/2011 | WO | 00 | 4/26/2013

Number | Date | Country
---|---|---
61357246 | Jun 2010 | US