The present invention relates to a method for displaying pixels on display devices, comprising the steps of
The present invention further relates to a system for displaying pixels on display devices, comprising one or more pixel sources for generating pixels to be displayed and one or more display devices for displaying the pixels, wherein the one or more pixel sources are connected to the one or more display devices for transmitting the generated pixels to the one or more display devices for displaying the pixels via a network based on a network transmission protocol.
For presentations, a plurality of individual display devices or monitors is usually combined into a so-called tiled display wall to present information to users on a large scale. To present information on such a display wall, specialized hardware and specialized connections between the hardware and the display wall, or the individual displays, are conventionally used to enable joint displaying of content and hence to form a combined stream, since conventional display interfaces such as the digital visual interface (DVI) are designed for a one-to-one interconnection of a single computer to a single monitor.
Using the conventional display interfaces, the digital video data is transmitted in an uncompressed manner, so that when a plurality of displays forms a display wall the specified bandwidth limit of a single connection is exceeded. For example, on an n×1080p video wall comprising n individual full high-definition displays, displaying content in multi-high-definition resolution would not be possible without compromising high refresh rates.
Conventional methods such as daisy-chaining in which one video signal is used to feed a plurality of display devices are either limited in the number of display devices or achieve a compromise of reduced resolution in either the spatial domain, the time domain or both. Another conventional solution is the use of dedicated graphics hardware providing multiple display interface ports in order to multiply the available bandwidth. This is known under the term “multi-head”. However, the number of ports respectively heads is limited to a very small number.
Another conventional system is based on multiple personal computers each equipped with a high performance graphics processing unit which—in synchronization with the other personal computers—has separate wiring between the graphics processing units. When combined with “multi-head” this is known under the term “multi-node”.
One of the disadvantages of conventional methods is that they are cost-intensive, e.g. due to dedicated specialized hardware and wiring as mentioned above. Another disadvantage is that synchronization between the graphics processing units is complicated and requires special hardware, and is thus inflexible. A further disadvantage is that for increasingly large display walls one-to-one connections become necessary to overcome bandwidth limitations.
It is therefore an objective of the present invention to provide a method and a system for displaying pixels on display devices which are more flexible.
It is a further objective of the present invention to provide a method and a system for displaying pixels on display devices which enable an easier synchronization for displaying content in particular enabling a precise synchronization for a large number of displays.
It is an even further objective of the present invention to provide a method and a system for displaying pixels on display devices enabling simultaneous behavior of the display devices across a display wall comprising the display devices.
It is an even further objective of the present invention to provide a method and a system for displaying pixels on display devices which is easy-to-implement and cost-effective.
The aforementioned objectives are accomplished by a method of claim 1 and a system of claim 24.
According to claim 1 the method for displaying pixels on display devices, comprising the steps of
According to claim 1 the method is characterized in that a plurality of the display devices are synchronized with each other for displaying the pixels and that, preferably a plurality of the pixel sources are synchronized with the plurality of synchronized display devices via the network for synchronous displaying the pixels on the plurality of the display devices.
According to claim 24 the system for displaying pixels on display devices, comprising one or more pixel sources for generating pixels to be displayed, one or more display devices for displaying the pixels, wherein the one or more pixel sources are connected to the one or more displays for transmitting the generated pixels to the one or more display devices for displaying the pixels via a network based on a network transmission protocol.
According to claim 24 the system is characterized in that a plurality of the display devices are operable to be synchronized with each other for displaying the pixels and that, preferably based on synchronized display devices, a plurality of the pixel sources are operable to be synchronized with the plurality of synchronized display devices via the network for synchronous displaying the pixels on the plurality of the display devices.
According to the invention it has been recognized that when for example a display clock signal is used for both inter-display synchronization as well as for pixel source synchronization in an easy way a multi-node synchronous composite display of the display devices can be formed.
According to the invention it has been further recognized that an end-to-end synchronization between image generators and the display devices is enabled.
According to the invention it has been further recognized that the number of display devices is not limited anymore by the video bandwidth enabling a higher number of display devices connectable together to form a display wall.
According to the invention it has further been recognized that a tight synchronization of content to be displayed by the plurality of display devices is enabled since the display devices with each other as well as the display devices with the corresponding content generating sources may be easily synchronized. This enhances the quality of the presentation for a spectator in contrast to conventional display walls.
In other words, the present invention enables synchronization of the display devices with each other, and the pixel source side is then synchronized with the already synchronized display devices. For example, graphics processing units of computers may use synchronization information of the synchronized display devices for synchronizing the graphics processing units to the display devices.
Further features, advantages and preferred embodiments are described in the following subclaims.
According to a preferred embodiment display device information representing at least one characteristic parameter of one or more of the display devices is announced via the network. By announcing at least one characteristic parameter of the one or more display devices the pixel sources can adapt pixel generation according to this characteristic parameter. For example if the characteristic parameters include size and resolution as well as position of the display devices relative to other display devices the pixel sources can generate pixel data for pixels to be displayed by the display devices without having to issue a corresponding time consuming request for synchronization information. Further changes in the characteristic parameters like dead pixels of a display, etc. may then also be used by the pixel sources for example to generate compensation data or relocate and/or scale the pixels respectively the content to be displayed to another region of a display wall.
According to a further preferred embodiment the plurality of display devices are synchronized with each other with respect to their refresh rates and/or their phases. This enables a tighter synchronization between the display devices. Content respectively pixels are then presented to a user on a very high quality level with respect to synchronous displaying of the content across multiple synchronized display devices.
According to a further preferred embodiment a sync reference device is selected, preferably out of the display devices, for providing synchronization information. This enables in an easy way to synchronize the display devices with each other by using the synchronization information of the sync reference device. If the sync reference device is one of the display devices, further entities for providing synchronization information are not necessary, thus additional costs are saved.
According to a further preferred embodiment a clock signal is determined by a clock master device and the clock signal is included into the synchronization information. This enables to measure and propagate the clock signal by the clock master device. For example the clock master device may be a dedicated entity which preferably converts the clock signal into synchronization information. By using a clock master device an easy implementation and an enhanced flexibility is provided, since the clock master device could be easily added to an existing system or an entity already present in the displaying system could be used as clock master device.
According to a further preferred embodiment an internal clock of the clock master device is determined, preferably based on observing interrupts, and used as the clock signal. This avoids a time consuming and bandwidth consuming coordination between the sync reference device and the clock master device when the sync reference device is identical to the clock master device.
According to a further preferred embodiment vertical display refresh interrupts are observed, preferably via an OpenGL buffer swap and/or via the SGI video sync extension API. This enables a reliable yet easy determination of the internal clock signal. Of course the internal clock signal can also be determined directly at the clock signal generating source. API is the abbreviation for application programming interface.
According to a further preferred embodiment synchronization information, preferably including the internal clock signal, is announced in display clock reference messages via multicast and/or broadcast in the network. Other devices, preferably display devices or pixel generating devices, are thus enabled to easily synchronize on the provided synchronization information when they are connected to the network. If further display devices are connected to the network, for instance in case further users are connected or connect with their mobile devices to the network, content to be displayed by the display devices is still synchronized when the mobile devices synchronize with the other display devices based on the synchronization information. By using broadcasting every device connected in the network may receive the synchronization information, thus enabling further and fast synchronization. Even further, the synchronization information, preferably a synchronization signal or message, can be used for additional purposes other than synchronizing with the pixel generating sources and/or the display devices with each other. By using multicast, dedicated receivers for synchronization information are provided with the synchronization information, enabling an optimized use of network resources for synchronization. Display clock reference messages may also be called refresh messages.
According to a further preferred embodiment synchronization information is announced to the network with an update frequency less than or equal to the frequency of the clock signal, preferably via periodic Internet Protocol packets. By using these periodic messages for sharing the synchronization information, timing jitter introduced by the network can be compensated and, for example, the visual refresh rate of the clock master and sync reference device can be determined from those periodic messages at any other entity connected to the network. For example this may be supported by IEEE 1588 hardware timestamps.
According to a further preferred embodiment virtual timestamps, preferably an increasing tick counter, are included in the synchronization information, preferably in the display clock reference messages. By using virtual timestamps a sporadic loss of transmitted synchronization information packets may be compensated. Further the clock master's internal clock signal can be determined by observing arrival timestamps of the synchronization information packets, for example relative to the local system clock of a receiver of the synchronization information.
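The following is a minimal sketch of how a receiver might use the tick counter together with its own arrival timestamps. It assumes (this is not stated in the description) that the receiver fits a straight line through the (tick, arrival time) pairs by least squares, so that lost packets only leave gaps in the tick sequence instead of distorting the estimate. The function name and the sample data are hypothetical.

```python
def estimate_refresh_period(samples):
    """Estimate the master's refresh period in seconds per tick.

    samples: list of (tick, arrival_time_s) pairs; ticks may have gaps
    where packets were lost. Uses the least-squares slope of arrival
    time over tick number (an assumed estimator, for illustration only).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_tick = sum(t for t, _ in samples) / n
    mean_time = sum(a for _, a in samples) / n
    num = sum((t - mean_tick) * (a - mean_time) for t, a in samples)
    den = sum((t - mean_tick) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples carry the same tick")
    return num / den


# Hypothetical data: a 60 Hz master, packets with ticks 3 and 4 were lost.
period = 1 / 60
samples = [(k, 10.0 + k * period) for k in (0, 1, 2, 5, 6, 7)]
estimated_hz = 1 / estimate_refresh_period(samples)
```

In practice the arrival timestamps carry network jitter, so a longer observation window would be needed than in this noise-free example.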
According to a further preferred embodiment the displaying rates of the display devices other than the sync reference device are adapted according to the announced synchronization information for synchronization. This provides synchronized presentations of content via the display devices in a reliable way.
According to a further preferred embodiment the displaying rates of the display devices other than the sync reference device, preferably of all display devices, are determined by a phase-locked-loop procedure. For example a vertical sync frequency of a display device is determined by the phase-locked-loop procedure, which may be implemented as a software application. It may be specifically parameterized for the approximate refresh rate or displaying rate of the composite display. By comparing ticks of the clock master device to the local vertical sync ticks of another display device, the phase error may be constantly estimated via the phase-locked-loop procedure and hence the frequency deviation of both clocks may be determined. The output of the phase-locked-loop procedure is then a frequency differing from both if compared using an independent clock. Specifically, the output frequency is greater than the local vertical sync frequency if the frequency of the clock master device is greater than the local vertical sync frequency, and it is smaller otherwise. The difference between the phase-locked-loop output frequency and the local vertical sync frequency can hence be used to control, i.e. to increase or decrease, the local vertical sync frequency if possible. The phase-locked-loop is “locked” once the difference between the phase-locked-loop output frequency and the local vertical sync frequency is 0, which means in practice as close to 0 as possible.
According to a further preferred embodiment a round-trip-time of the announced synchronization information is estimated, preferably by using network-based protocols, in particular by using ICMP echo messages. This enables determining the propagation time of the synchronization information packets or display clock reference messages from the clock master device over the network. The propagation time may be precisely estimated and, once reliably determined, can be incorporated in the phase-locked-loop. This is assumed to work anywhere and requires no additional implementation at the clock master device. The round-trip-time may be estimated by using ICMP echo messages, commonly known as “ping”, according to the Internet Control Message Protocol, RFC 792. However, due to asymmetric network traffic at the display devices of a display wall (from the viewpoint of the display wall, downlink traffic from video or pixel sources to the display wall may be much higher than uplink traffic), simply taking half of the round-trip-time, RTT/2, as the one-way propagation time of the synchronization information packets may be imprecise. In this case a more complex algorithm for precise determination of the propagation time, i.e. the propagation delay, is used.
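As an illustration of such a software phase-locked-loop, the sketch below simulates a loop that is evaluated once per received master tick. The proportional-integral loop filter, the gain values and all names are assumptions made for this example; the description does not specify the loop filter.

```python
def run_software_pll(master_hz, local_hz, steps=2000, kp=6.0, ki=0.3):
    """Simulate a PI-type software PLL evaluated once per master tick.

    Returns the PLL output frequency in Hz and the last phase error in
    refresh cycles. kp and ki are illustrative gains, not values taken
    from the description.
    """
    dt = 1.0 / master_hz      # time between two master ticks
    master_phase = 0.0        # phases counted in refresh cycles
    local_phase = 0.0
    integrator = 0.0          # accumulated frequency correction in Hz
    out_hz = local_hz
    error = 0.0
    for _ in range(steps):
        error = master_phase - local_phase   # > 0: local clock lags
        integrator += ki * error
        out_hz = local_hz + kp * error + integrator
        master_phase += 1.0                  # exactly one master tick
        local_phase += out_hz * dt           # local clock follows the output
    return out_hz, error


# Hypothetical case: the master runs 10 ppm faster than the local clock.
locked_hz, residual = run_software_pll(master_hz=60.0006, local_hz=60.0)
```

With a faster master the output frequency settles above the free-running local frequency, and the remaining phase error approaches zero, which corresponds to the "locked" state described above.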
According to a further preferred embodiment for synchronizing display devices with the sync reference device a frequency synthesizer used to generate a video clock signal, preferably a DVI and/or HDMI clock signal, is parameterized and modified. This enables adopting the frequency deviation without clearly visible artifacts on the display devices at least most of the time.
According to a further preferred embodiment, for synchronizing a display device with a sync reference device a voltage of a voltage controlled crystal oscillator is adapted. Voltage controlled crystal oscillators are commonly used in set-top boxes, as channel switching time constraints impose strong requirements on generator-synchronous playback of received content. Therefore, by adapting the voltage of a voltage controlled oscillator, synchronization can also be provided in an easy way with set-top boxes of televisions in particular.
According to a further preferred embodiment the internal displaying rates of all display devices are determined and, based on the determined displaying rates, the sync reference device is selected out of all display devices based on the minimal difference between a middle frequency of the internal displaying rates and the corresponding displaying rates. Selecting the sync reference device in this way enhances flexibility: for example, because the frequency tuning range of a voltage controlled crystal oscillator at the display nodes is limited, such a middle or center frequency allows the display devices to stay within the limited tuning range and thus enables use of the method also with voltage controlled crystal oscillators. Further, for example in an initialization phase, each display device may act as a digital clock reference transmitter at some point in time so that all the others can determine this display's vertical sync frequency relative to their own vertical sync frequency. Each display may then decide whether it is close to the corresponding center or middle frequency or not.
According to a further preferred embodiment a distributed coordination mechanism for selecting the sync reference device out of the display devices is performed. This enables the use of already existing and reliable mechanisms to select the sync reference device. For example a distributed coordination may be performed similar to CSMA/CA in wireless local area networks: the first display device can notify the others of being closest to the center or middle frequency, therefore becoming the sync reference device.
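The selection rule can be sketched as follows. The example assumes that the "middle frequency" is the midpoint between the lowest and the highest measured rate; the description does not fix this definition, and the display identifiers and rates are hypothetical.

```python
def select_sync_reference(rates_hz):
    """Pick the display whose rate is closest to the middle frequency.

    rates_hz: dict mapping a display identifier to its measured internal
    displaying rate in Hz. The middle frequency is taken as the midpoint
    of the observed range (an assumption for this sketch).
    """
    if not rates_hz:
        raise ValueError("no display rates given")
    center = (min(rates_hz.values()) + max(rates_hz.values())) / 2
    reference = min(rates_hz, key=lambda d: abs(rates_hz[d] - center))
    return reference, center


# Hypothetical measured rates of four displays.
rates = {
    "display-a": 59.9990,
    "display-b": 60.0004,
    "display-c": 60.0012,
    "display-d": 59.9995,
}
reference, center = select_sync_reference(rates)
```

Choosing the display nearest the midpoint keeps the largest required frequency correction of any other display as small as possible, which matters when the tuning range of the oscillators is limited.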
According to a further preferred embodiment based on the transmitted synchronization information each of the pixel sources estimates the internal displaying rates of the display devices relative to its own pixel source system clock. This enables an even tighter synchronization of the pixel sources with the display devices.
According to a further preferred embodiment, in case of a stream of frames generated by the pixel sources, the time when to send a corresponding frame to the display devices for a synchronized stream displayed on the display devices is estimated, preferably wherein estimating includes the determination of a deadline for each of the frames based on a frame propagation time and/or inter-process communication latency. This enables an even more precise synchronization of the pixel sources with the display devices in case of a stream of frames by predicting the point in time at which a completely processed video frame is scheduled to be put on the network and then transmitted to the one or more display devices. The time may be determined by the pixel sources. For example the pixel generating sources may determine the time remaining until the deadline of a specific frame number that has not yet been displayed synchronously at the display devices. This deadline may be compensated for a propagation time by a round-trip time estimation to a clock master device as well as for inter-process communication latency, and thus the point in time at which a video frame is to be forwarded to the network stack of the local host is determined.
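A minimal sketch of this deadline calculation is given below. It assumes that the pixel source already knows, on its own clock, the time of refresh number 0 and the refresh rate of the display wall; all parameter names and values are hypothetical.

```python
def frame_send_time(frame_number, first_refresh_s, refresh_hz,
                    propagation_s, ipc_latency_s):
    """Latest time (source clock, seconds) to hand a frame to the network.

    The deadline is the refresh instant at which the frame is to be
    shown; it is moved earlier by the network propagation time and the
    inter-process communication latency.
    """
    deadline = first_refresh_s + frame_number / refresh_hz
    return deadline - propagation_s - ipc_latency_s


# Hypothetical values: 60 Hz wall, 2 ms propagation, 1 ms IPC latency.
send_at = frame_send_time(
    frame_number=120,
    first_refresh_s=100.0,
    refresh_hz=60.0,
    propagation_s=0.002,
    ipc_latency_s=0.001,
)
```

A real implementation would also have to budget for rendering and decoding time at the display devices, which this sketch leaves out.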
According to a further preferred embodiment the transmission of the generated pixels is encoded. Encoding reduces the amount of data to be transmitted from the pixel sources to the display devices. Thus network resources are saved.
According to a further preferred embodiment each display device is assigned its own unique identifier in the network, preferably in form of an IP-address. This enables an easy and reliable addressing of the display devices for transmitting or exchanging synchronization information, the pixel data or any other information via the network to or from the display devices.
According to a further preferred embodiment the internal clock signal is transmitted within the network via the NTP and/or PTP protocol and/or MPEG program clock reference timestamps. This enables to use conventional already existing protocols for clock synchronization which can therefore be easily implemented.
According to a further preferred embodiment before displaying the pixels the visualization of the pixels is adapted based on quality level data for visualizing one or more pixels. A quality level is defined as a visual representation of a pixel data set, which can be independently generated on a dedicated entity. User-perceived quality increases with the level number. For example the quality levels are requested in order when rendering to progressively refine the visualization of pixel data for a user. Therefore, flexibility with regard to visualization of pixels is enhanced.
There are several ways how to design and further develop the teaching of the present invention in an advantageous way. To this end it is to be referred to the patent claims subordinate to patent claim 1 and patent claim 24 on the one hand and to the following explanation of preferred embodiments of the invention by way of example, illustrated by the figure on the other hand. In connection with the explanation of the preferred embodiments of the invention by the aid of the figure, generally preferred embodiments and further developments of the teaching will be explained.
In the drawings
a, b shows part of a method according to a fourth embodiment of the present invention;
In
To display the streaming content in spatial and temporal alignment across the composite display 1 with its twenty display devices 2, multiple layers of synchronization at the composite display 1 end are employed. First, to enable displaying active-stereo content, the refresh frequency and phase of the physical display devices 2 have to be synchronized with each other. This ensures that all display devices 2 of the multi-display wall 1, for example, switch in sync with the respective stereo glasses of a spectator or user, without which stereo display would not be possible. The internal clock of a display master selected out of the display devices 2 is used and its internal clock signal is propagated across a network to all other participating display devices 2 of the display wall. Synchronizing the displays 2 with each other is called “displaylock” throughout the description.
When for example on the pixel source side corresponding virtual pixel storage means in form of frame buffers across multiple hosts are synchronized, i.e. synchronized display refreshes are provided and vertical synchronization in a graphics driver within each display device is employed, then this is called “swaplock” throughout the description. If a video stream in a local frame buffer needs to be synchronized such that the same frame of a video is shown on all participating displays 2, this is called “framelock” and may be achieved by taking into account presentation timestamps inserted in all video streams and sorting the frames of the video stream into the correct time slots between frame buffer swaps at the display devices 2.
The term “reverse genlock” means in the description that, when the signal of a swap lock of a composite display is made available to the pixel producing pixel sources or applications, the pixel generation is synchronized to a frame buffer refresh rate of the receiving display devices. This may correspond to the corresponding signal of a display lock. In other words “reverse genlock” means that a pixel source locks to the clock of its pixel sink, inverting the concept of conventional generator locking (“genlock”), in which a pixel sink locks to the clock of its pixel source.
In
In
For active stereoscopy, a missing frame lock, i.e. missing synchronization of visual refresh at the displays SDN, disturbs the spatial perception of a composite image by a user on multiple display devices 2. For example, when using active stereo shutter glasses which are connected and synchronized to one of a set of display devices 2, for example via infrared or Bluetooth, and this display device 2 is selected to be the sync reference display of a display wall 1, then without further measures left-right stereo separation will suffer on all of the display devices 2 except this sync reference device or sync master display device. In particular, for example, a phase difference of 90° between any display device and the master device results in 100% cross talk, while a 180° phase difference results in left-right permutation. If continuous frequency and phase equalization is not performed then, for a common refresh rate of approximately 120 Hz, a 180° phase offset will occur after about seven minutes when the two oscillators deviate by, for example, 10 ppm in base frequency. Furthermore, swap lock, i.e. synchronization of the content of the currently displayed virtual frame buffers, is preferably enabled for a composite display. Swap lock ensures that all video decoder outputs provide the correct video frame at any display refresh instant.
In general two independently running graphic devices including display devices 2 are out of phase in their display refresh rates. Even if parameterized to the same refresh rate of, for example, 50 Hz, each derives this frequency of the display device from its local quartz crystal oscillator of some base frequency. For example, a 27 MHz base signal is common in television broadcast; however, personal computers for example may use quartz crystal oscillators of other base frequencies or even multiple different local quartz crystal oscillators. Further, due to production tolerances of several ten parts per million and the temperature and aging dependence of the oscillator frequency, it is highly likely that two running local quartz crystal oscillators differ in frequency and phase at some point in time. Having the corresponding oscillator frequencies f1(t) and f2(t) frequency and phase difference are given as
where Δφ(t), φ1(t), φ2(t) ∈ [0, 2π] are the phase difference and the phases, respectively.
In order to synchronize multiple independently running clocks for the display devices 2 in a display wall 1, first a wall clock and its propagation have to be defined, and second an adaptation of the local clocks in frequency and phase to the defined wall clock has to be performed.
A wall clock may for example be defined by electing one of the display devices as the clock master display, which continuously broadcasts a message, i.e. synchronization information, at each visual refresh time instant within display clock reference UDP/IP packets. These packets may be exchanged according to NTP and PTP system time clock synchronization mechanisms: a client may request one or more servers to send timestamps via IP packets while the client itself may be a server to others. This synchronization may then be based on calculations with four timestamps, which are exchanged in IP packets on a predefined network port, potentially with operating system or Ethernet hardware support. The synchronization accuracy is increased when the most stable and least distant clock servers are selected.
When slave display nodes, i.e. display devices other than the clock master device, receive the stream of messages, the messages may be assumed to be spaced equidistantly on the time axis of the clock master device's display clock.
To adapt local clocks in frequency and phase to the wall clock, all slave display devices first need to determine the frequency and phase of the wall clock as well as of their own display clock. To determine frequency and phase of the wall clock, the clock master display device's exact refresh rate is deduced from arrival timestamps, while the device's own display clock rate is deduced from refresh interrupt timestamps. Further, the propagation time of the display clock reference/refresh packets, i.e. the synchronization information, must be determined. In a final step the local display clock has to be adjusted without visible artifacts to the frequency and phase of the wall clock.
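The seven-minute figure can be checked with a short calculation. The sketch assumes that the relative oscillator deviation carries over unchanged to the refresh rate, since the refresh rate is derived from the oscillator; the function name is hypothetical.

```python
def seconds_until_phase_offset(refresh_hz, deviation_ppm, offset_deg):
    """Time until two free-running displays drift apart by offset_deg.

    refresh_hz: nominal refresh rate of both displays.
    deviation_ppm: relative frequency deviation between the oscillators.
    """
    delta_hz = refresh_hz * deviation_ppm * 1e-6   # refresh rate difference
    return (offset_deg / 360.0) / delta_hz          # fraction of a cycle / drift


drift_s = seconds_until_phase_offset(120.0, 10.0, 180.0)
drift_min = drift_s / 60.0
```

For 120 Hz and 10 ppm the refresh rates differ by 0.0012 Hz, so half a cycle of offset accumulates after roughly 417 seconds, i.e. about seven minutes.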
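The calculation with four timestamps mentioned above can be sketched with the standard NTP formulas for clock offset and round-trip delay. The variable names and the example values are chosen for illustration only.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Standard NTP offset and round-trip delay from four timestamps.

    t0: request sent (client clock)
    t1: request received (server clock)
    t2: reply sent (server clock)
    t3: reply received (client clock)
    Returns (offset of the server clock relative to the client, delay).
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical exchange: server clock 0.5 s ahead, 10 ms each way,
# 1 ms processing time at the server.
offset, delay = ntp_offset_and_delay(100.000, 100.510, 100.511, 100.021)
```

The offset formula is exact only when the two network directions have the same delay, which is the same symmetry assumption that limits the RTT/2 estimate discussed elsewhere in the description.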
In
Therefore in case of symmetric traffic
With proper synchronization of any two display nodes in display frequency and phase, swaplock is achieved by provision of timeline information in DCR packets/refresh messages. By comparison of DCR timeline information with video frame timestamps during playback, synchronous video playback may be achieved across the display wall. In the same manner, content sources may synchronize their content generation speed with the composite screen. The sum of all rendering, buffering and processing times will result in a delay between source(s) and sink(s) that is to be determined for any discrete setup. Automatic detection and optimization of this delay sum may be provided additionally. The network forwarding delay due to symmetry may be expressed as
fw = RTT/2 (1)
In case of asymmetric traffic the RTT measurement is clearly asymmetric as there is an imbalance in video streaming traffic. Specifically, RTT is asymmetric during periods in which the slave display nodes receive video traffic. When there is a period of time over which no video streaming traffic is present on the network (e.g. in an initialization phase), a minimum RTT should be determined on the otherwise idle network and the forwarding delay is thus found as
fw=min(RTT)/2+[RTT−min(RTT)] (2)
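Equations (1) and (2) can be written as one small helper. The sketch reads equation (2) as attributing the whole RTT excess over the idle-network minimum to the forwarding direction; the function and parameter names are hypothetical.

```python
def forwarding_delay(rtt_s, min_rtt_s=None):
    """One-way forwarding delay of the synchronization packets in seconds.

    Without an idle-network minimum the symmetric estimate (1) is used.
    With a minimum RTT, equation (2) adds the full excess over that
    minimum to half of the minimum.
    """
    if min_rtt_s is None:
        return rtt_s / 2                              # equation (1)
    return min_rtt_s / 2 + (rtt_s - min_rtt_s)        # equation (2)


symmetric = forwarding_delay(0.004)                    # idle network, 4 ms RTT
loaded = forwarding_delay(0.010, min_rtt_s=0.004)      # 10 ms RTT under load
```

With a 4 ms idle RTT and a 10 ms RTT under video load, the estimate rises from 2 ms to 8 ms instead of the 5 ms that RTT/2 would give.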
In
Daisy-chain uses a sequential connection from a pixel source 5 to a first screen 2, from a first screen to a second screen and so on. Multi-head uses one pixel source 5 and corresponding one-to-one connections between the pixel source 5 and each corresponding screen 2.
Multi-node with multi-head is shown on the bottom of
In other words, in the case of daisy-chaining and multi-head there is one single local quartz crystal oscillator involved, serving as the wall clock. In daisy-chaining a single video signal, for example an HDMI signal, is repeated by the displays, possibly introducing some delay. In the same way a multi-head based system is clocked from a single local quartz crystal oscillator, for example on a multi-head graphics processing unit. In case of multi-node systems a dedicated wiring is necessary for frame lock of the independent systems. A synchronization signal from the master clock device 5a to a number of slave display systems 5b is used.
If for example pixel sources are provided in form of personal computers, images drawn on the display devices are usually generated at some rate by the directly connected graphics device. Upon reception of pixel or image data the display device refreshes the currently shown image with this data. Although the physical creation of a visible image on today's liquid crystal displays differs from that on the formerly predominant cathode-ray tubes which they have replaced, the format of the pixel data transmitted to the display still follows the VGA standard.
In
Starting from a VGA signal in a cathode-ray tube CRT an electron beam is moved over the screen, drawing pixels serially. At the end of each line the beam needs to be steered back to the beginning of the next line. During this retrace the beam is switched off (horizontal blanking). At the end of the last active line, the electron beam must travel back to the upper left corner. Therefore the vertical blanking interval (VBLANK) is appended containing a vertical sync pulse (VSYNC). To allow the voltage on the display cable to stabilize, front and back porch are inserted.
Periodic refresh of the image is necessary to display images without apparent flickering. Fluorescent material within the CRT screen has a specific afterglow period. If a pixel is not refreshed within this period, it will darken and vanish. CRTs were built supporting relatively high refresh rates of 75-100 Hz to combat flickering. As the electron beam requires some time in the order of microseconds to retrace, the length of the blanking periods had to be sufficient. For CRTs these are specified by the Video Electronics Standards Association (VESA) by the general timing formula (GTF). Blanking is a display device specific requirement, hence on aforementioned digital video links, reduced blanking intervals are an option to reduce bandwidth overhead.
Consequently, modulation of frame duration may increase or decrease the current refresh rate. For example, increasing the duration of two out of four frames by 1% increases the average frame duration over these four frames by 0.5% and thus decreases the average refresh rate by about 0.5%.
Due to the similarity of analog VGA signaling and the two predominant digital variants DVI and HDMI, an evaluation of the possibility of adjusting the refresh rate of commodity PC graphics hardware (Intel Linux Graphics http://intellinuxgraphics.org/) by modulation of frame duration was performed. The video refresh rate is rv=fp/ptot, i.e. the pixel clock frequency fp divided by the total number of pixels ptot per frame. Consequently, the frame duration is Tframe=1/rv. With fixed fp, changing the total number of pixels modifies the frame length (in pixels) and its duration. For this purpose, the number of non-active pixel lines and columns within the VSYNC interval is modified. The change in refresh rate is given as
With the image parameters as given in Table I below one obtains from (1), within a reasonable refresh rate range, a granularity of refresh rate variation while changing only the number of lines by Δrv≈0.16 Hz, and while changing only the number of columns by Δrv≈0.06 Hz. Combining horizontal increase with vertical decrease and vice versa results in a granularity of about 0.015-0.02 Hz.
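The relation rv=fp/ptot can be illustrated with a short Python sketch. The timing used here (a 148.5 MHz pixel clock with 2200 × 1125 total pixels, a common 1080p timing at 60 Hz) is an assumption for illustration only and is not taken from Table I, so the resulting step sizes differ from the figures quoted above. The function names are hypothetical.

```python
def refresh_rate(pixel_clock_hz, h_total, v_total):
    """Refresh rate r_v = f_p / p_tot, with p_tot = h_total * v_total."""
    return pixel_clock_hz / (h_total * v_total)


def refresh_step(pixel_clock_hz, h_total, v_total, d_cols=0, d_lines=0):
    """Change in refresh rate when blanking grows by d_cols columns
    and d_lines lines while the pixel clock stays fixed."""
    base = refresh_rate(pixel_clock_hz, h_total, v_total)
    changed = refresh_rate(pixel_clock_hz, h_total + d_cols, v_total + d_lines)
    return changed - base


if __name__ == "__main__":
    # Assumed example timing, not the parameters of Table I.
    fp, h, v = 148.5e6, 2200, 1125
    print("base rate      :", refresh_rate(fp, h, v))
    print("one more line  :", refresh_step(fp, h, v, d_lines=1))
    print("one more column:", refresh_step(fp, h, v, d_cols=1))
```

Adding blanking lengthens the frame and therefore lowers the refresh rate; in this assumed timing one extra line changes the rate roughly twice as much as one extra column.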
From equation (1) it can be seen that a modification of fp, the pixel clock, also results in a different frame duration. The pixel clock is typically produced by a frequency synthesizer on the graphics device. It is generated by applying multipliers and divisors to a reference frequency that is provided by a quartz oscillator. Eq. (2) gives an example (http://intellinuxgraphics.org/):
where fref is the reference XO frequency (e.g. 96 MHz for some Intel devices) and M1,2, N and P1,2 are integer parameters that can be adjusted within predetermined limits. The granularity of refresh rate modification using the pixel clock was around 0.14-0.36 Hz for the parameters in Table I.
It can be seen that with access to fp, some combination of parameters may enable even more fine grained frame length modulation. However, the above mentioned results are only bound to specific display hardware and cannot be generalized to other display devices.
Many displays do not tolerate frequent modifications of image parameters without clearly visible artifacts. This applies when modifying the image dimensions within any part of the total pixel area as well as when modifying the pixel clock within the VBLANK pause, e.g. on Intel integrated graphics platforms using the Intel Linux graphics drivers (Intel Linux Graphics http://intellinuxgraphics.org/) without resetting the whole display pipeline.
Hence, the modulation of frame duration by the above mentioned methods related to
a, b shows part of a method according to a fourth embodiment of the present invention.
In
In
Using “reverse Genlock” according to an embodiment of the present invention, the display refresh rates of the display devices are synchronized in frequency and in time across the display devices, i.e. a frame lock is achieved, using for example an IP-based clock signal. Using the same signal, pixel sources are synchronized to the synchronized display devices. For synchronization, clock signal packets are exchanged, which may be provided with virtual timestamps. When pixel or video sources provide timestamped IP video streams to the display side, a closed loop enabling swap lock is created. The phase may be estimated by using an ICMP ping, which is usually available at any IP host entity.
In
A reference time base is directly derived from the CMD's digital display signal and is provided to other entities in a network via short IP packets, which are called display clock reference (DCR) packets in the following. DCR packets are generated periodically with a period greater than (e.g. an integer multiple of) or equal to the CMD's visual refresh period. Since those packets are assumed to be sent equidistantly, timing jitter introduced by the network can be compensated, and the visual refresh rate of the CMD can be determined from those packets at another network entity. This may be supported by IEEE 1588 hardware timestamps.
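The use of the equidistance assumption to average out network jitter can be sketched as follows. This is a minimal illustration, not the procedure of the embodiment: it assumes that each DCR packet carries a sequence number (so that lost packets leave gaps rather than shifting the estimate) and fits a straight line through the arrival times by least squares. The function name and its parameters are hypothetical.

```python
def estimate_refresh_rate(seq_numbers, arrival_times, frames_per_packet=1):
    """Estimate the sender's refresh rate from jittered arrival times.

    seq_numbers       -- DCR packet sequence numbers (gaps = lost packets)
    arrival_times     -- local arrival timestamps in seconds
    frames_per_packet -- display frames per DCR period (assumed known)
    """
    n = len(seq_numbers)
    if n < 2 or n != len(arrival_times):
        raise ValueError("need at least two (sequence, time) pairs")
    mean_s = sum(seq_numbers) / n
    mean_t = sum(arrival_times) / n
    cov = sum((s - mean_s) * (t - mean_t)
              for s, t in zip(seq_numbers, arrival_times))
    var = sum((s - mean_s) ** 2 for s in seq_numbers)
    if var == 0:
        raise ValueError("sequence numbers must not all be equal")
    period_s = cov / var  # least-squares slope: seconds per DCR packet
    return frames_per_packet / period_s


if __name__ == "__main__":
    # Synthetic 120 Hz sender, bounded jitter, every tenth packet lost.
    seqs = [s for s in range(1000) if s % 10 != 0]
    times = [s / 120.0 + 0.0002 * (((s * 7919) % 13) - 6) / 6.0 for s in seqs]
    print(estimate_refresh_rate(seqs, times))
```

Because the slope is fitted over many packets, individual delays of a few hundred microseconds have little influence on the estimate, and missing packets only reduce the number of samples.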
Non-clock-master displays, i.e. the slave display nodes SDN, measure their own visual refresh rate and compare it with the reference clock received via IP, whereas packet loss is compensated for. Refresh rate equalization is done by VCXO control (Genlock) based on the DCR signal. Phase differences are compensated for by round trip time (RTT) estimation using the standard IP RTT measurement method based on ICMP (RFC 792: Internet Control Message Protocol) ping.
As mentioned above, the arrival times of DCR packets are used to determine both frequency and phase. However, the phase obtained is subject to a phase shift composed of the forwarding delay of the DCR packets as well as operating system scheduling.
In addition, the DCR packets may also be received by the pixel sources that shall synchronize content generation with the composite display or, if direct reception is not possible, e.g. due to subnet boundaries, are forwarded to these pixel sources. In case of forwarded DCR packets, the current cumulative RTT for each forwarding hop is signaled in each DCR packet. Such a DCR packet structure is for example shown in
Any participating pixel source can synchronize its content generation rate, or its playout rate for stored content, to the composite display refresh rate. In order to do so, a so-called frame deadline predictor may provide linear extrapolation of the point in time at which a frame with some number b>a, where a is the last displayed frame, is due for transmission. By this, a joint generation (or playout) phase can be achieved across multiple content generating nodes, which may be subject to a constant phase offset due to content generation management. Furthermore, audio-visual frames may be timestamped to achieve swap lock. Therefore, a broadcast compatible program clock reference (PCR) timestamp is included in the DCR packets, as shown in
When using a phase-locked-loop procedure for VBLANK and DCR frequency and phase estimation, e.g. implemented in software on i686 32-bit architecture this compensates for undetected VBLANK interrupts and lost DCR packets by linear interpolation. In
The PLL loop filter is of first order with cutoff frequency 10⁻² Hz (reference sign F3), a reference frequency F4 and further cutoff frequencies F1, F2. and in
The PLL procedure used takes less than about 75 s to acquire lock. This is deemed sufficient, since e.g. a display wall is thus powered up and running after less than two minutes, including boot time. As a result, a residual phase error of around ±100 μs and long-term frequency identity were achieved with the set-top-boxes used. The absolute phase offset due to ICMP LAN round-trip time estimation error may be negligible with proper filtering. Said magnitude of phase error corresponds to only ±1.2% of Tframe at 120 Hz (reference sign F4), enabling a good time-interleaved stereo separation. Left/right discrimination is provided by HDMI 1.4a.
For determining the frequency and phase stability of the system synchronized via an IP network, two tests have been performed. During the tests, the video signals' VSYNC pulses are extracted from the DVI/HDMI signal and measured using an oscilloscope. Tests are carried out with two set-top-boxes connected to a gigabit Ethernet network. Via this Ethernet, both receive a single DCR signal. In the first test, there is no background traffic on the network. In
The second test is performed while transmitting 100 Mbps UDP streams destined to each of the set-top-boxes' IP addresses from another gigabit Ethernet host in the network. This test is more reliable and is closer to reality, since the network cannot be assumed idle in practice because the visual content is streamed via the same network, too. Since the Ethernet backbone is a gigabit switch and the set-top-boxes are equipped with 100 Mbps Ethernet, the set-top-boxes' links are assumed to be saturated. Again, a result is presented in
In this test, the phase difference increases to slightly above 100 μs on average with increased standard deviation due to the unsymmetrical network traffic with respect to the display devices. In both tests, RTT estimation has been switched off, thus the results show the reliable use of the above described PLL procedure. In
In
In detail in
A composite display 1 may
The clock master CM of a composite display 1 may
The display master DM of a composite display 1 may
An application may
Distributed applications may request more than one to spread out the pixel-generating workload to multiple physical hosts.
A virtual frame buffer (VFB) may
Therefore a virtual frame buffer may in particular be a content source that is able to be reverse-synchronized and to provide content presentation at correct speed and with constant delay.
The display manager M may
The synchronization architecture is responsible for synchronizing a subset of display devices 2 of a composite display 1 to form a comprehensive stereo display wall. Furthermore, a clock reference for applications to lock onto is generated.
In
In
In addition, the display clock reference signal can also be transmitted, e.g. via IP encapsulation (cf. IEEE 802.1AS), to pixel sources that want to synchronize with the composite display 1, i.e. reverse Genlock is applied in order to reduce playback judder, wherein judder denotes sampling point inaccuracy on the time axis (jitter) for visual content. The pixel sources 5 can then synchronize the content generation and/or its playout rate to the composite display output signal, wherein playout denotes playback at the playback rate, but without visual display, e.g. for streaming to a display.
When locking to the clock of the composite display's clock master display device CM, the pixel sources 5 could miss one or more clock cycles, e.g. when their computational power is insufficient and frame generation takes longer than one visual refresh period. If clock cycles are missed, the skipping may be coordinated in order to avoid different temporal resolutions within a scene/video when displayed at the composite display. Additionally or alternatively, temporal upsampling, i.e. frame interpolation, may be performed at the display devices 2.
In detail in
In an MPEG transport stream for example, the program clock reference PCR value provides the time elapsed at the video source and is used to compute the exact time of frame display, e.g. with a constant positive phase offset according to the constant term e in ISO/IEC 13818-1: 2000 (E) (ITU-T Rec. H.222.0) on pp. 118, while in the embodiment, the reception of the display clock reference denotes the exact time of frame display minus the forwarding delay.
As the PTS is defined by the pixel sink, i.e. the display devices 2, and is propagated via the display clock reference signal, the video source 5 may ensure that each frame containing visual content from the exact same point in time is tagged in the MPEG-TS with the same PTS value. The video or pixel sink 2 may display those frames at approximately the same (but later) point in time and may expect an H.264 MPEG-TS for each view.
From the DCR arrival timestamps, each non clock reference display device 2 determines the frequency of display refresh at the clock or sync reference display device 2. At said display device this frequency shall be compared to the local display refresh frequency. As both are measured using the individual local clock of the display device 2 and are thus relative to this clock, the measurement inaccuracy and drift of the local clock is compensated. Thus, based on the comparison of display clock reference DCR and local refresh, the local refresh rate can be synchronized to the refresh rate estimated from the display clock reference DCR.
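Why the inaccuracy of the local clock drops out of this comparison can be shown with a small numeric sketch. It assumes a simple model in which the local clock runs too fast or too slow by a constant factor; the function names and the example frequencies are hypothetical.

```python
def measured_rate(true_rate_hz, local_clock_skew):
    """Rate observed with a local clock that runs `local_clock_skew`
    times as fast as true time (1.0 = perfect clock)."""
    return true_rate_hz / local_clock_skew


def relative_deviation_ppm(reference_rate_hz, local_rate_hz):
    """Deviation of the reference refresh rate from the local one in ppm.
    Only the ratio of the two measured rates enters the result."""
    return (reference_rate_hz / local_rate_hz - 1.0) * 1e6


if __name__ == "__main__":
    true_reference, true_local = 120.000, 119.994  # assumed example values
    for skew in (1.0, 1.0001):  # perfect clock vs. clock 100 ppm too fast
        ref = measured_rate(true_reference, skew)
        loc = measured_rate(true_local, skew)
        print(skew, relative_deviation_ppm(ref, loc))
```

Both measured rates are scaled by the same factor, so their ratio, and with it the correction to be applied to the local refresh rate, is the same for either clock.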
In order to determine the exact phase of the clock reference display, bi-directional communication measuring the round-trip-time (RTT) is employed. In a first approximation, the forwarding delay is assumed to be the propagation delay of the display clock reference, which is a time-dependent variable, assumed to vary on large time scales and may thus be updated at a low, yet to be determined frequency.
With both phase and frequency estimates, each non-clock reference display device 2 may achieve frame synchronous display with the clock reference display device.
Content synchronization is therefore achieved by comparing the presentation timestamps (PTS) contained in the DCR with the PTS values obtained from the content itself, while an overall PTS offset, including processing and propagation delay at the video or pixel source plus decoding at the video or pixel sink, may be configured or estimated by the clock reference display device.
It may be more feasible to provide different PTS values to pixel sources and pixel sinks in order to account for the PTS offset or to provide the offset to the pixel sinks via separate signaling.
In
In a stereoscopic case, the left and right views at any time instant are composed of those two available frames that have the same PTS value at one display.
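The pairing of views by PTS value can be sketched as follows. This is an illustrative assumption about how a display might match frames, with each frame represented as a hypothetical (PTS, payload) tuple; frames without a counterpart in the other view are simply skipped here.

```python
def pair_stereo_frames(left_frames, right_frames):
    """Return (pts, left, right) tuples for frames sharing a PTS value.

    left_frames, right_frames -- iterables of (pts, payload) tuples,
    in any order. Frames whose PTS has no match are left out.
    """
    right_by_pts = {pts: payload for pts, payload in right_frames}
    pairs = []
    for pts, left in sorted(left_frames, key=lambda frame: frame[0]):
        if pts in right_by_pts:
            pairs.append((pts, left, right_by_pts[pts]))
    return pairs


if __name__ == "__main__":
    # PTS values in 90 kHz ticks, 750 ticks apart (assumed example).
    left = [(3000, "L0"), (3750, "L1"), (4500, "L2")]
    right = [(3750, "R1"), (3000, "R0"), (5250, "R3")]
    print(pair_stereo_frames(left, right))
```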
In addition the display clock reference signal can also be transmitted to video and pixel sources that want to synchronize with the composite display 1, e.g. performing reverse Genlock in order to reduce playback judder, wherein—as already mentioned before—judder denotes sampling point inaccuracy on the time axis (jitter) for visual content. The pixel sources can then synchronize the content generation and/or its playout rate to the composite display output signal, wherein playout denotes playback at the playback rate, but without visual display, e.g. for streaming to a display, which may be done as follows:
From the arrival timestamps of the display clock reference DCR IP packets, each DCR receiving entity (video or pixel source or sink) may determine the exact frequency of display refresh at the master display device, relative to its own clock. Due to non-deterministic delay of DCR propagation in IP networks, varying for each DCR packet (jitter), this will take a certain amount of time depending on the “distance to the source” in terms of network hops. This time is to be determined. Furthermore, with the PTS included in the DCR, each frame generated at one of the source devices can be tagged with the PTS by setting the MPEG encapsulation PTS for this frame to the DCR PTS. In the MPEG-standard, a PTS is defined relative to a program clock reference (PCR). The PCR in the transport stream TS shall be computed from the DCR PTS. Thus, it is possible to playback pre-generated video streams with identically computed PCR/PTS as well as real-time generated video streams with identically computed PCR from the DCR PTS, since the PTS is in both cases relative to the PCR and both are identical across streams for each frame.
A master video or pixel source managing slave video source or pixel source devices to provide their respective frame tiles with a common PTS=x at some identical time instant t, which may be supported by clock synchronization protocols such as NTP or PTP, may be used to synchronize content generation. Alternatively, round-trip-time (RTT) estimation from-and-to the clock master device CMD via bi-directional communication, whereas the forwarding delay is the DCR propagation delay from the clock master device, together with a fixed and pre-configured processing delay may be used to achieve synchronous content generation. In both of the above cases, frequency as well as phase are uniquely determined at each source device.
Based on the setup of
The vsync frequency of a display device 2 may be estimated by a Phase-Locked-Loop (PLL) procedure, which may be implemented in software. It is specifically parameterized for the approximate refresh rate of the composite display (e.g. 100 Hz or 120/60 Hz). After comparing the CMD's ticks to the local vsync ticks, the PLL procedure constantly estimates the phase error and hence determines the frequency deviation of both clocks. The output of the PLL procedure is a frequency that differs from both, if compared using an independent clock. Specifically, it is greater than the local vsync frequency if the clock master display CMD frequency is greater than the local vsync frequency, and it is smaller otherwise. The difference between PLL output frequency and local vsync frequency can hence be used to control, i.e. to increase or decrease, the local vsync frequency. The PLL is locked once the difference between the PLL output and the local vsync frequency is zero, i.e. in practice, as close to zero as possible.
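A software PLL of this kind can be sketched in a few lines of Python. The sketch below is not the procedure of the embodiment: it uses a proportional-integral loop with assumed gains kp and ki, tracks only the reference ticks, and locks within a few seconds, much faster than a loop with a very low cutoff frequency would. The class name, its parameters and the gain values are hypothetical.

```python
class SoftwarePll:
    """Minimal PI-controlled software PLL tracking a tick frequency."""

    def __init__(self, nominal_hz, kp=10.0, ki=25.0):
        self.nominal_hz = nominal_hz  # approximate refresh rate
        self.kp = kp                  # proportional gain (assumed value)
        self.ki = ki                  # integral gain (assumed value)
        self.freq_hz = nominal_hz     # current frequency estimate
        self.nco_phase = 0.0          # oscillator phase in cycles
        self.ref_phase = 0.0          # reference phase in cycles
        self.integral = 0.0           # integrated phase error
        self.last_tick = None

    def on_reference_tick(self, t):
        """Feed the local arrival time (seconds) of one reference tick."""
        if self.last_tick is None:
            self.last_tick = t
            return self.freq_hz
        dt = t - self.last_tick
        self.last_tick = t
        self.nco_phase += self.freq_hz * dt  # advance local oscillator
        self.ref_phase += 1.0                # one tick = one cycle
        error = self.ref_phase - self.nco_phase
        self.integral += error * dt
        self.freq_hz = self.nominal_hz + self.kp * error + self.ki * self.integral
        return self.freq_hz


if __name__ == "__main__":
    pll = SoftwarePll(nominal_hz=120.0)
    reference_hz = 120.058  # assumed true CMD refresh rate
    for k in range(2400):   # 20 seconds of ticks
        pll.on_reference_tick(k / reference_hz)
    print(pll.freq_hz)
```

The difference between `freq_hz` and the locally measured vsync frequency would then drive the hardware adjustment; how that adjustment is applied is outside the scope of this sketch.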
What is unknown in the phase locked frequency compensation loop is the propagation time of the display clock reference DCR messages from the clock master device CMD over the network 7. This time needs to be estimated and, once reliably determined, can be incorporated into the phase locked loop. For this purpose, the round-trip time RTT estimation mechanism ICMP echo (commonly known as “ping”) may be used. This makes it possible to estimate the RTT and hence the forwarding delay, i.e. the one-way propagation time of small IP packets, using an existing and widely supported mechanism that is assumed to work anywhere and requires no additional implementation effort at the clock master device CMD. For instance, the Linux ICMP implementation makes use of Ethernet hardware timestamps, if provided, but at least uses timing information from the kernel's network stack.
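How individual RTT samples might be smoothed and turned into a one-way delay can be sketched as follows. Two assumptions are made that the text does not state: the samples are filtered with an exponentially weighted moving average in which the coefficient weights the previous estimate, and the forward and return paths are symmetric, so that the one-way delay is half the RTT. Obtaining the samples themselves (e.g. from ICMP echo) is not shown, and the class name is hypothetical.

```python
class RttEstimator:
    """Exponentially weighted moving average over RTT samples."""

    def __init__(self, alpha=0.8):
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must be in [0, 1)")
        self.alpha = alpha  # weight of the previous estimate (assumed convention)
        self.rtt_s = None   # smoothed RTT in seconds

    def update(self, sample_s):
        """Add one RTT sample in seconds and return the smoothed RTT."""
        if self.rtt_s is None:
            self.rtt_s = sample_s
        else:
            self.rtt_s = self.alpha * self.rtt_s + (1.0 - self.alpha) * sample_s
        return self.rtt_s

    def one_way_delay_s(self):
        """Half the smoothed RTT, assuming a symmetric network path."""
        if self.rtt_s is None:
            raise ValueError("no RTT sample received yet")
        return self.rtt_s / 2.0


if __name__ == "__main__":
    estimator = RttEstimator(alpha=0.8)
    for sample in (0.0004, 0.0002, 0.0003):
        estimator.update(sample)
    print(estimator.one_way_delay_s())
```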
Hardware support is required to compensate for the estimated frequency deviation at the non-CMDs, and may also be required at the CMD, depending on the frequency compensation range at the non-CMDs.
Three mechanisms for controlling hardware parameters that may determine the vsync frequency of a GPU display pipeline may be used:
With regard to the first option the granularity in frequency compensation is in this case determined by the pixel dimension of the image output, whereas reducing or increasing the blanking by one vertical line is the minimum granularity for landscape displays. However, visible artifacts may appear when adapting to a frequency deviation by blanking interval modification.
The second option was found to work at a granularity of a few tenths of one hertz, in particular when using Intel integrated graphics (IGP) of the fourth and fifth generation. Display devices are capable of adapting to the frequency deviation without clearly visible artifacts most of the time, but the display sometimes goes blank, e.g. for re-synchronization.
The third option, software parameterization of a control voltage of a voltage controlled crystal oscillator VCXO is typically unavailable on consumer PC hardware. However, it is found on Set-Top-Boxes for Television, as the channel switching time constraints impose a strong requirement on generator-synchronous playback of received content, e.g. an Intel consumer electronics (CE) set-top-box based on Intel PC hardware (CE4100), featuring a VCXO as an example for a set-top-box respectively a consumer television broadcast equipment in general. According to the Intel CE datasheets, the required accuracy of the 27 MHz oscillators is within ±50 ppm (parts per million), whereas the tunable frequency range of the VCXO is specified to be at least within ±125 ppm.
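Whether a given frequency deviation can be compensated by a VCXO with the quoted tuning range can be checked with a short calculation. The sketch below takes the ±125 ppm range and the ±50 ppm oscillator accuracy from the text; the function names are hypothetical, and the worst case shown (two oscillators at opposite ends of their tolerance) is an illustrative assumption.

```python
VCXO_TUNING_RANGE_PPM = 125.0  # minimum tunable range quoted in the text


def deviation_ppm(target_hz, local_hz):
    """Relative deviation of the target frequency from the local one in ppm."""
    return (target_hz - local_hz) / local_hz * 1e6


def within_vcxo_range(target_hz, local_hz, tuning_range_ppm=VCXO_TUNING_RANGE_PPM):
    """True if the local oscillator can be pulled onto the target frequency."""
    return abs(deviation_ppm(target_hz, local_hz)) <= tuning_range_ppm


if __name__ == "__main__":
    nominal = 27e6
    fast = nominal * (1 + 50e-6)  # oscillator at +50 ppm
    slow = nominal * (1 - 50e-6)  # oscillator at -50 ppm
    print(deviation_ppm(fast, slow), within_vcxo_range(fast, slow))
```

Two oscillators that each meet the ±50 ppm accuracy differ by at most about 100 ppm, which lies inside the ±125 ppm tuning range.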
Due to the limited range of VCXO frequency tuning at the display devices 2, the individual frequencies of all display devices 2 in an initialization stage may be determined. The CMD should then have a vsync frequency in the middle of the set of frequencies of all display nodes (center frequency), in order to minimize the individual tuning requirements. Then, two mechanisms may be used.
In an initialization phase, each display acts as a DCR transmitter at some point in time so that all the others can determine this display's vsync frequency relative to their own vsync frequency. Hence, each display is capable of deciding whether it is close to the center frequency or not. For mechanism two, by distributed coordination (cf. CSMA/CA in wireless LANs), the first display to notify the others of being closest to the center frequency becomes the CMD.
In order to do reverse Genlock, a form of generator clock recovery and synchronization in which the roles of generator and consumer are switched (with respect to clock provision), the DCR signal of the CMD is transmitted to the content generating nodes. The frame display frequency of the CMD, and hence that of all display devices once they are synchronized, is estimated independently by each content generating device, relative to the system clock at each generating node, as depicted in the following figure.
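The choice of the display closest to the center frequency can be sketched as a simple calculation over the measured vsync frequencies. The sketch is centralized and therefore does not model the distributed coordination itself. It also assumes that "center frequency" means the midpoint between the lowest and highest measured frequency; the median would be another possible reading. Function names and example values are hypothetical.

```python
def select_clock_master(vsync_hz_by_display):
    """Return the display whose vsync frequency is closest to the midpoint
    between the lowest and the highest measured frequency."""
    if not vsync_hz_by_display:
        raise ValueError("no displays given")
    values = vsync_hz_by_display.values()
    center = (min(values) + max(values)) / 2.0
    # Sorting the names makes the result deterministic in case of ties.
    return min(sorted(vsync_hz_by_display),
               key=lambda name: abs(vsync_hz_by_display[name] - center))


def max_tuning_ppm(vsync_hz_by_display, master):
    """Largest frequency pull (in ppm) any display needs to reach the master."""
    f_master = vsync_hz_by_display[master]
    return max(abs(f_master - f) / f * 1e6 for f in vsync_hz_by_display.values())


if __name__ == "__main__":
    measured = {"d0": 119.990, "d1": 120.002, "d2": 120.010, "d3": 119.999}
    master = select_clock_master(measured)
    print(master, max_tuning_ppm(measured, master))
```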
In
In more detail this is shown in
The content generating nodes, i.e. pixel or video sources, need not only determine the exact frequency of the display devices; they also need to predict the point in time at which a completely processed video frame is scheduled to be put on the network and transmitted to one or more of the display devices. For this purpose, generating nodes may utilize a so-called frame deadline predictor, running on a network entity, which returns the time remaining until the deadline of a specific frame number that has not yet been displayed synchronously at the display devices. The deadline is compensated for propagation time by RTT estimation to the clock master device as well as for inter-process communication (IPC) latency, and thus determines the point in time at which a video frame should be forwarded to the network stack of the local host. This process of deadline prediction is shown in
The frame deadline predictor is additionally capable of returning a program clock reference (PCR) value, according to MPEG-2 system Transport Stream (TS) as an example for a multiplex operation on audio-visual content with synchronization information, at any arbitrary point in time. The PCR value is compensated for twice the local host IPC latency, and thus is valid at the point in time of stamping a transport stream with a PCR value.
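The deadline calculation described above can be sketched as a linear extrapolation from the last displayed frame. The sketch assumes that the display time of the last frame is known on the local clock, that the one-way propagation time is half the estimated RTT, and that the IPC latency is subtracted once; the function name and all parameter names are hypothetical.

```python
def time_until_deadline(frame_no, last_frame_no, last_frame_time_s,
                        refresh_hz, rtt_s, ipc_latency_s, now_s):
    """Seconds remaining until `frame_no` should be handed to the network stack.

    last_frame_no / last_frame_time_s -- last displayed frame and its local time
    refresh_hz                        -- estimated display refresh rate
    rtt_s                             -- estimated round-trip time to the displays
    ipc_latency_s                     -- local inter-process communication latency
    now_s                             -- current local time
    """
    if frame_no <= last_frame_no:
        raise ValueError("frame has already been displayed")
    # Linear extrapolation of the display time of the requested frame.
    display_time_s = last_frame_time_s + (frame_no - last_frame_no) / refresh_hz
    # Send early enough to cover network propagation and local IPC latency.
    send_time_s = display_time_s - rtt_s / 2.0 - ipc_latency_s
    return send_time_s - now_s


if __name__ == "__main__":
    remaining = time_until_deadline(
        frame_no=9462083, last_frame_no=9461963, last_frame_time_s=100.0,
        refresh_hz=120.0, rtt_s=0.0004, ipc_latency_s=0.000022, now_s=100.5)
    print(remaining)
```

A negative return value would indicate that the deadline has already passed and the frame can no longer be displayed in time.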
Due to the limited support of multicast across IP networks, a frame deadline predictor may also serve as a unicast-to-multicast/broadcast proxy. This way, only one generator node, i.e. pixel or video source, receives the original DCR messages from the CMD via unicast, while the rest of the subnet is fed with a relayed version of the DCR signal via broadcast. The delay due to relaying is compensated, as it is indicated in the relayed DCR messages by an additional RTT value.
A local instance of the PLL procedure at the to be reverse Genlock'd pixel or video source needs to be properly parameterized. Parameters may be pre-computed and the PLL is configured via a text file.
An example configuration text file is given below, parameterized for ˜120 Hz and 5 s RTT estimation update interval. The gain parameter controls the duration after which the PLL becomes stable.
# DPLL NCO gain
gain=20
# approximate CMD refresh rate
frequency=120.0
# DPLL loop filter coefficients a0, a1 and b1
a0=0.00496
a1=-0.0012
b1=0.99
# port for receiving the DCR sync packets
sync port=14401
# interval between sending ICMP echo request
# for RTT estimation in seconds
rtt interval=5
# coefficient alpha for ewma filter for RTT estimation
ewma coefficient=0.8
# enable relaying
relay=0
relay destination=134.96.86.21
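Reading such a configuration file can be sketched as follows. The parser is an illustrative assumption, not the one used by the embodiment: it treats lines starting with "#" as comments, splits each remaining line at the first "=", and converts values to integers or floating-point numbers where possible. It also accepts a typographic minus sign in numeric values.

```python
def parse_pll_config(text):
    """Parse 'key=value' lines with '#' comments into a dictionary."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a key=value line: {raw_line!r}")
        # Tolerate a typographic minus sign (U+2212) in numeric values.
        value = value.strip().replace("\u2212", "-")
        try:
            parsed = int(value)
        except ValueError:
            try:
                parsed = float(value)
            except ValueError:
                parsed = value  # keep as string, e.g. an IP address
        config[key.strip()] = parsed
    return config


if __name__ == "__main__":
    sample = """# DPLL NCO gain
gain=20
frequency=120.0
a1=-0.0012
rtt interval=5
relay destination=134.96.86.21
"""
    print(parse_pll_config(sample))
```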
The deadline predictor may be provided as a statically linked library; its functions may be defined in the following C program header file.
A daemon running an instance of and interfacing with a software-based PLL procedure may also be provided. It provides inter-process communication IPC via localhost UDP communication, specifically timing information relative to a local system clock. This information is useless on another system, hence the daemon does not reply to queries other than from the IP address 127.0.0.1, i.e. from the same host.
The daemon accepts queries if received in the following format:
The daemon responds to the above queries as follows:
A simple application making use of the local frame deadline predictor daemon may be provided, and the main part of its source code is shown below.
The following three functions are defined:
An example output of the sample application is given below.
Local one-way system latency (12 trials) is 0.022000 ms
Returned PCR at this time instant is 9999999
Accordingly, the daemon output has been as follows, due to the IPC latency being estimated in 12 trials, one every 10 ms. The currently estimated frequency, CMD network RTT as well as the last seen frame number is also displayed.
est_freq: 120.058479, est_rtt: 0.324633, frame: 9461539
est_freq: 120.058479, est_rtt: 0.324633, frame: 9461539
est_freq: 120.058509, est_rtt: 0.316927, frame: 9461952
est_freq: 120.058508, est_rtt: 0.316927, frame: 9461953
est_freq: 120.058510, est_rtt: 0.316927, frame: 9461954
est_freq: 120.058510, est_rtt: 0.316927, frame: 9461956
est_freq: 120.058512, est_rtt: 0.316927, frame: 9461957
est_freq: 120.058512, est_rtt: 0.316927, frame: 9461958
est_freq: 120.058512, est_rtt: 0.316927, frame: 9461959
est_freq: 120.058513, est_rtt: 0.316927, frame: 9461961
est_freq: 120.058513, est_rtt: 0.316927, frame: 9461962
est_freq: 120.058517, est_rtt: 0.316927, frame: 9461963
Req. for deadline of frame #9512112, sending time 417.695998 s
In summary of
In summary, the present invention enables a flexible, easy-to-implement intra- and inter-display synchronization as well as pixel source or generator synchronization, so that a spectator, in particular of content displayed on a display wall, receives a very high quality of experience in terms of the presented content due to a very tight synchronization among the pixel sources, among the display devices as pixel sinks, as well as between the pixel sources and the pixel sinks.
Many modifications and other embodiments of the invention set forth herein will come to mind to one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Number | Date | Country | Kind |
---|---|---|---|
12173206.9 | Jun 2012 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2013/063155 | 6/24/2013 | WO | 00 |