METHOD AND SYSTEM FOR DISPLAYING PIXELS ON DISPLAY DEVICES

The present invention relates to a method for displaying pixels on display devices, comprising the steps of

a) Generating one or more pixels by one or more pixel sources,
b) Providing one or more display devices for displaying pixels,
c) Transmitting the one or more generated pixels to one or more display devices for displaying the pixels via a network based on a network transmission protocol,
d) Displaying the pixels on the display devices.

The present invention further relates to a system for displaying pixels on display devices, comprising one or more pixel sources for generating pixels to be displayed, one or more display devices for displaying the pixels, wherein the one or more pixel sources are connected to the one or more displays for transmitting the generated pixels to the one or more display devices for displaying the pixels via a network based on a network transmission protocol.

For presentations, a plurality of individual display devices or monitors is usually combined into a so-called tiled display wall to present information to users on a large scale. To present information on such a display wall, specialized hardware and specialized connections between the hardware and the display wall or the individual displays are conventionally used to enable joint displaying of content and hence to form a combined stream, since the conventional display interfaces such as the digital visual interface (DVI) are designed for a one-to-one interconnection of a single computer to a single monitor.

Using the conventional display interfaces, the digital video data is transmitted in an uncompressed manner, so that when a plurality of displays forming a display wall is used the specified bandwidth limit of a single connection is exceeded. For example, on an n×1080p video wall comprising n individual full high-definition displays, displaying content in multi-high-definition resolution would not be possible without compromising high refresh rates.

Conventional methods such as daisy-chaining in which one video signal is used to feed a plurality of display devices are either limited in the number of display devices or achieve a compromise of reduced resolution in either the spatial domain, the time domain or both. Another conventional solution is the use of dedicated graphics hardware providing multiple display interface ports in order to multiply the available bandwidth. This is known under the term “multi-head”. However, the number of ports respectively heads is limited to a very small number.

Another conventional system is based on multiple personal computers each equipped with a high performance graphics processing unit which—in synchronization with the other personal computers—has separate wiring between the graphics processing units. When combined with “multi-head” this is known under the term “multi-node”.

One of the disadvantages of conventional methods is that they are cost-intensive, e.g. due to dedicated specialized hardware and wiring as mentioned above. Another disadvantage is that synchronization between the graphics processing units is complicated and requires special hardware, thus being inflexible. A further disadvantage is that for increasingly large display walls one-to-one connections become necessary to overcome bandwidth limitations.

It is therefore an objective of the present invention to provide a method and a system for displaying pixels on display devices which are more flexible.

It is a further objective of the present invention to provide a method and a system for displaying pixels on display devices which enable an easier synchronization for displaying content in particular enabling a precise synchronization for a large number of displays.

It is an even further objective of the present invention to provide a method and a system for displaying pixels on display devices enabling simultaneous behavior of the display devices across a display wall comprising the display devices.

It is an even further objective of the present invention to provide a method and a system for displaying pixels on display devices which is easy-to-implement and cost-effective.

The aforementioned objectives are accomplished by a method of claim 1 and a system of claim 24.

According to claim 1 the method for displaying pixels on display devices, comprising the steps of

a) Generating one or more pixels by one or more pixel sources,
b) Providing one or more display devices for displaying pixels,
c) Transmitting the one or more generated pixels to one or more display devices for displaying the pixels via a network based on a network transmission protocol,
d) Displaying the pixels on the display devices.

According to claim 1 the method is characterized in that a plurality of the display devices are synchronized with each other for displaying the pixels and that, preferably a plurality of the pixel sources are synchronized with the plurality of synchronized display devices via the network for synchronous displaying the pixels on the plurality of the display devices.

According to claim 24 the system for displaying pixels on display devices, comprising one or more pixel sources for generating pixels to be displayed, one or more display devices for displaying the pixels, wherein the one or more pixel sources are connected to the one or more displays for transmitting the generated pixels to the one or more display devices for displaying the pixels via a network based on a network transmission protocol.

According to claim 24 the system is characterized in that a plurality of the display devices are operable to be synchronized with each other for displaying the pixels and that, preferably based on synchronized display devices, a plurality of the pixel sources are operable to be synchronized with the plurality of synchronized display devices via the network for synchronous displaying the pixels on the plurality of the display devices.

According to the invention it has been recognized that when, for example, a display clock signal is used for both inter-display synchronization and pixel source synchronization, a multi-node synchronous composite display of the display devices can be formed in an easy way.

According to the invention it has been further recognized that an end-to-end synchronization between image generators and the display devices is enabled.

According to the invention it has been further recognized that the number of display devices is not limited anymore by the video bandwidth enabling a higher number of display devices connectable together to form a display wall.

According to the invention it has further been recognized that a tight synchronization of content to be displayed by the plurality of display devices is enabled since the display devices with each other as well as the display devices with the corresponding content generating sources may be easily synchronized. This enhances the quality of the presentation for a spectator in contrast to conventional display walls.

In other words, the present invention enables synchronization of the display devices with each other, and the pixel source side is synchronized with the already synchronized displays. For example, graphics processing units of computers may use synchronization information of the synchronized display devices for synchronizing the graphics processing units to the display devices.

Further features, advantages and preferred embodiments are described in the following subclaims.

According to a preferred embodiment display device information representing at least one characteristic parameter of one or more of the display devices is announced via the network. By announcing at least one characteristic parameter of the one or more display devices the pixel sources can adapt pixel generation according to this characteristic parameter. For example if the characteristic parameters include size and resolution as well as position of the display devices relative to other display devices the pixel sources can generate pixel data for pixels to be displayed by the display devices without having to issue a corresponding time consuming request for synchronization information. Further changes in the characteristic parameters like dead pixels of a display, etc. may then also be used by the pixel sources for example to generate compensation data or relocate and/or scale the pixels respectively the content to be displayed to another region of a display wall.

According to a further preferred embodiment the plurality of display devices are synchronized with each other with respect to their refresh rates and/or their phases. This enables a tighter synchronization between the display devices. Content respectively pixels are then presented to a user on a very high quality level with respect to synchronous displaying of the content across multiple synchronized display devices.

According to a further preferred embodiment a sync reference device is selected, preferably out of the display devices, for providing synchronization information. This enables in an easy way to synchronize the display devices with each other by using the synchronization information of the sync reference device. If the sync reference device is one of the display devices, further entities for providing synchronization information are not necessary, thus additional costs are saved.

According to a further preferred embodiment a clock signal is determined by a clock master device and the clock signal is included into the synchronization information. This enables to measure and propagate the clock signal by the clock master device. For example the clock master device may be a dedicated entity which preferably converts the clock signal into synchronization information. By using a clock master device an easy implementation and an enhanced flexibility is provided, since the clock master device could be easily added to an existing system or an entity already present in the displaying system could be used as clock master device.

According to a further preferred embodiment an internal clock of the clock master device is determined, preferably based on observing interrupts, and used as the clock signal. This avoids a time consuming and bandwidth consuming coordination between the sync reference device and the clock master device when the sync reference device is identical to the clock master device.

According to a further preferred embodiment vertical display refresh interrupts are observed, preferably via an OpenGL swap buffer and/or by an SGI video sync extension API. This enables a reliable yet easy determination of the internal clock signal. Of course the internal clock signal can also be directly determined at the clock signal generating source. API is the abbreviation for application programming interface.

According to a further preferred embodiment synchronization information, preferably including the internal clock signal, is announced in display clock reference messages via multicast and/or broadcast in the network. Other devices, preferably display devices or pixel generating devices, are thus enabled to easily synchronize on the provided synchronization information when they are connected to the network. If further display devices are connected to the network, for instance in case further users are connected or connect with their mobile devices to the network, content to be displayed by the display devices is still synchronized when the mobile devices synchronize with the other display devices based on the synchronization information. By using broadcasting every device connected in the network may receive the synchronization information, thus enabling further and fast synchronization. Even further, the synchronization information, preferably a synchronization signal or message, can be used for additional purposes other than synchronizing with the pixel generating sources and/or the display devices with each other. By using multicast, dedicated receivers for synchronization information are provided with the synchronization information, enabling an optimized use of network resources for synchronization. Display clock reference messages may also be called refresh messages.

According to a further preferred embodiment synchronization information is announced to the network with a frequency of updates smaller than or equal to the frequency of the clock signal, preferably via periodic Internet Protocol packets. By using these periodic messages for sharing the synchronization information, timing jitter introduced by the network can be compensated and for example the visual refresh rate of the clock master and sync reference device can be determined from those periodic messages at any other entity connected to the network. For example this may be supported by IEEE 1588 hardware timestamps.

According to a further preferred embodiment virtual timestamps, preferably an increasing tick counter, are included in the synchronization information, preferably in the display clock reference messages. By using virtual timestamps a sporadic loss of transmitted synchronization information packets may be compensated. Further the clock master's internal clock signal can be determined by observing arrival timestamps of the synchronization information packets, for example relative to the local system clock of a receiver of the synchronization information.
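By way of a non-limiting illustration, a display clock reference message carrying an increasing tick counter might be packed and parsed as sketched below. The wire layout (a four-byte tag followed by a 64-bit counter), the tag value and all function names are assumptions made for this sketch and are not taken from the description.

```python
import struct

# Hypothetical wire layout: 4-byte tag, then a 64-bit tick counter,
# both in network byte order. The description does not prescribe this.
DCR_FORMAT = "!4sQ"
DCR_MAGIC = b"DCR0"


def pack_dcr(tick: int) -> bytes:
    """Build one display clock reference message for refresh number `tick`."""
    return struct.pack(DCR_FORMAT, DCR_MAGIC, tick)


def unpack_dcr(payload: bytes) -> int:
    """Return the tick counter carried by a display clock reference message."""
    magic, tick = struct.unpack(DCR_FORMAT, payload)
    if magic != DCR_MAGIC:
        raise ValueError("not a display clock reference message")
    return tick


def lost_between(previous_tick: int, tick: int) -> int:
    """Number of reference messages missing between two received ticks."""
    if tick <= previous_tick:
        raise ValueError("tick counter must increase")
    return tick - previous_tick - 1
```

Because the receiver knows how many refresh instants separate two received messages, a sporadic packet loss only widens the interval between samples instead of corrupting the estimate of the master's refresh period.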

According to a further preferred embodiment the displaying rates of the display devices other than the sync reference device are adapted according to the announced synchronization information for synchronization. This provides synchronized presentations of content via the display devices in a reliable way.

According to a further preferred embodiment the displaying rates of the display devices other than the sync reference device, preferably of all display devices, are determined by a phase-locked-loop procedure. For example a vertical sync frequency of a display device is determined by the phase-locked-loop procedure, which may be implemented as a software application. It may be specifically parameterized for the approximate refresh rate or displaying rate of the composite display. By comparing ticks of the clock master device to the local vertical sync ticks of another display device the phase error may be constantly estimated via the phase-locked-loop procedure and hence the frequency deviation of both clocks may be determined. The output of the phase-locked-loop procedure is then a frequency differing from both if compared using an independent clock. Specifically the output frequency is greater than the local vertical sync frequency if the frequency of the clock master device is greater than the local vertical sync frequency and it is smaller otherwise. The difference between the phase-locked-loop output frequency and the local vertical sync frequency can hence be used to control, i.e. to decrease the local vertical sync frequency if possible. The phase-locked-loop is “locked” once the difference between the phase-locked-loop output frequency and the local vertical sync frequency is 0, which means in practice as close to 0 as possible.
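A software phase-locked-loop of this kind might, purely as a sketch, be realized as a proportional-integral loop. The class name, the gain values and the choice of a proportional-integral loop filter are assumptions of this example; the description does not fix a particular loop filter at this point.

```python
class SoftwarePll:
    """Proportional-integral loop tracking the tick stream of a clock master."""

    def __init__(self, nominal_hz: float, kp: float = 0.1, ki: float = 0.01):
        self.nominal_hz = nominal_hz  # approximate refresh rate of the wall
        self.kp = kp                  # proportional gain (assumed value)
        self.ki = ki                  # integral gain (assumed value)
        self.integral = 0.0
        self.output_hz = nominal_hz

    def update(self, master_tick_time: float, local_tick_time: float) -> float:
        """Feed one pair of tick times in seconds, return the output frequency."""
        period = 1.0 / self.nominal_hz
        # Phase error in refresh cycles, wrapped to [-0.5, 0.5). It is
        # positive when the local vertical sync lags behind the master tick.
        error = (local_tick_time - master_tick_time) / period
        error = (error + 0.5) % 1.0 - 0.5
        self.integral += error
        correction = self.kp * error + self.ki * self.integral
        self.output_hz = self.nominal_hz * (1.0 + correction)
        return self.output_hz
```

In this sketch a lagging local display yields an output frequency above the nominal rate and a leading one yields a lower frequency; the loop is considered locked once the output stays as close as possible to the local vertical sync frequency.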

According to a further preferred embodiment a round-trip-time of the announced synchronization information is estimated, preferably by using network-based protocols, in particular by using ICMP echo messages. This enables the propagation time of the synchronization information packets or display clock reference messages from the clock master device over the network to be determined. The propagation time may be precisely estimated and, once reliably determined, can be incorporated in the phase-locked-loop. This is assumed to work anywhere and requires no additional implementation at the clock master device. The round-trip-time may be estimated by using ICMP echo messages, commonly known as “ping”, according to the Internet Control Message Protocol, RFC 792. However, due to asymmetrical network traffic between the display devices of a display wall (from the viewpoint of a display wall, downlink traffic from video or pixel sources to the display wall may be much higher than uplink traffic), simply taking half of the round-trip-time, RTT/2, as the one-way propagation time of the synchronization information packets may be imprecise. In this case a more complex algorithm for precise determination of the propagation time respectively the propagation delay is used.
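The simple RTT/2 rule can be sketched as follows. Taking the smallest of several round-trip samples and the hypothetical parameter `forward_share` (the assumed fraction of the round trip spent on the path towards the display) are choices of this example only; they do not represent the more complex algorithm mentioned above.

```python
def one_way_delay(rtt_samples, forward_share=0.5):
    """Estimate the master-to-display propagation time from round-trip samples.

    rtt_samples   -- round-trip times in seconds, e.g. from ICMP echo replies
    forward_share -- assumed fraction of the round trip spent on the path
                     towards the display; 0.5 corresponds to the RTT/2 rule
    """
    if not rtt_samples:
        raise ValueError("at least one round-trip sample is required")
    if not 0.0 <= forward_share <= 1.0:
        raise ValueError("forward_share must lie in [0, 1]")
    # The smallest sample is the one least inflated by queueing delay.
    return min(rtt_samples) * forward_share
```

With heavily loaded downlinks a value of `forward_share` other than 0.5 would have to be obtained by other means, which is exactly the imprecision the paragraph above points out.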

According to a further preferred embodiment for synchronizing display devices with the sync reference device a frequency synthesizer used to generate a video clock signal, preferably a DVI and/or HDMI clock signal, is parameterized and modified. This enables adopting the frequency deviation without clearly visible artifacts on the display devices at least most of the time.

According to a further preferred embodiment for synchronizing a display device with a sync reference device a voltage of a voltage controlled crystal oscillator is adapted. Voltage controlled crystal oscillators are commonly used in set-top boxes, as channel switching time constraints impose strong requirements on generator-synchronous playback of received content. Therefore, by adapting the voltage of a voltage controlled crystal oscillator, synchronization can also be provided in an easy way with set-top boxes of televisions in particular.

According to a further preferred embodiment the internal displaying rates of all display devices are determined and, based on the determined displaying rates, the sync reference device is selected out of all display devices based on the minimal difference between a middle frequency of the internal displaying rates and the corresponding displaying rates. By using a minimal difference between the middle frequency of the internal displaying rates and the corresponding internal displaying rates for selection of the sync reference device, flexibility is enhanced: for example, due to the limited range of voltage controlled crystal oscillator frequency tuning at the display nodes, such a middle or center frequency allows the display nodes to stay within the limited frequency tuning range and thus enables use of the method also with voltage controlled crystal oscillators. Further, for example in an initialization phase, each display device may act as a digital clock reference transmitter at some point in time for all the others to determine this display's vertical sync frequency relative to their own vertical sync frequency. Each display may then decide whether it is close to the corresponding center or middle frequency or not.

According to a further preferred embodiment a distributed coordination mechanism for selecting the sync reference device out of the display devices is performed. This enables already existing and reliable mechanisms to be used to select the sync reference device. For example a distributed coordination may be performed similar to CSMA/CA in wireless local area networks: the first display device can notify the others of being closest to the center or middle frequency, therefore becoming the sync reference device.
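The selection rule may be illustrated by the following sketch. It assumes that the "middle frequency" is the centre of the observed range of displaying rates (a mean or median would be an alternative reading), and the function name and the tie-breaking rule are hypothetical.

```python
def select_sync_reference(rates_hz):
    """Return the id of the display whose rate is closest to the middle frequency.

    rates_hz -- mapping of display id to measured internal displaying rate in Hz
    """
    if not rates_hz:
        raise ValueError("no display devices given")
    # Assumption: the middle frequency is the centre of the observed range,
    # which keeps every device within the smallest possible tuning distance.
    middle = (min(rates_hz.values()) + max(rates_hz.values())) / 2.0
    # Ties are broken by the smaller id so every node reaches the same result.
    return min(rates_hz, key=lambda dev: (abs(rates_hz[dev] - middle), dev))
```

For example, with measured rates of 59.98 Hz, 60.00 Hz and 60.03 Hz the middle frequency is 60.005 Hz and the 60.00 Hz display would be selected, so that no other display has to be tuned by more than about 0.03 Hz.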

According to a further preferred embodiment based on the transmitted synchronization information each of the pixel sources estimates the internal displaying rates of the display devices relative to its own pixel source system clock. This enables an even tighter synchronization of the pixel sources with the display devices.

According to a further preferred embodiment, in case of a stream of frames generated by the pixel sources, the time when to send a corresponding frame to the display devices for a synchronized stream displayed on the display devices is estimated, preferably wherein estimating includes the determination of a deadline for each of the frames based on a frame propagation time and/or inter-process communication latency. This enables an even more precise synchronization of the pixel sources with the display devices in case of a stream of frames by predicting the point in time at which a completely processed video frame is scheduled to be put on the network and then transmitted to the one or more display devices. The time may be determined by the pixel sources. For example the pixel generating sources may determine the time remaining until the deadline of a specific frame number that has not yet been displayed synchronously at the display devices. This deadline may be compensated for a propagation time by a round-trip time estimation to a clock master device as well as for inter-process communication latency and thus the point in time at which a video frame is to be forwarded to the network stack of the local host is determined.
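The deadline compensation may be sketched as follows, assuming that frame number 0 is displayed at a known wall-clock instant `epoch` and that all times are given in seconds on the same clock. All names and parameters are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def send_time(frame_number, epoch, refresh_period, propagation, ipc_latency):
    """Latest time at which a frame should be handed to the local network stack."""
    deadline = epoch + frame_number * refresh_period  # display instant of the frame
    return deadline - propagation - ipc_latency


def next_frame_number(now, epoch, refresh_period, propagation, ipc_latency):
    """Smallest frame number whose send time has not yet passed at time `now`."""
    lead = propagation + ipc_latency
    return max(0, math.ceil((now + lead - epoch) / refresh_period))
```

A pixel source would first call `next_frame_number` to find the earliest frame it can still deliver in time and then schedule its transmission for the instant returned by `send_time`.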

According to a further preferred embodiment the transmission of the generated pixels is encoded. Encoding reduces the amount of data to be transmitted from the pixel sources to the display devices. Thus network resources are saved.

According to a further preferred embodiment each display device is assigned its own unique identifier in the network, preferably in form of an IP-address. This enables an easy and reliable addressing of the display devices for transmitting or exchanging synchronization information, the pixel data or any other information via the network to or from the display devices.

According to a further preferred embodiment the internal clock signal is transmitted within the network via the NTP and/or PTP protocol and/or MPEG program clock reference timestamps. This enables to use conventional already existing protocols for clock synchronization which can therefore be easily implemented.

According to a further preferred embodiment before displaying the pixels the visualization of the pixels is adapted based on quality level data for visualizing one or more pixels. A quality level is defined as a visual representation of a pixel data set, which can be independently generated on a dedicated entity. User-perceived quality increases with the level number. For example the quality levels are requested in order when rendering to progressively refine the visualization of pixel data for a user. Therefore, flexibility with regard to visualization of pixels is enhanced.

There are several ways how to design and further develop the teaching of the present invention in an advantageous way. To this end it is to be referred to the patent claims subordinate to patent claim 1 and patent claim 24 on the one hand and to the following explanation of preferred embodiments of the invention by way of example, illustrated by the figure on the other hand. In connection with the explanation of the preferred embodiments of the invention by the aid of the figure, generally preferred embodiments and further developments of the teaching will be explained.

In the drawings

FIG. 1 shows part of a system according to a first embodiment of the present invention;

FIG. 2 shows schematically a system according to a second embodiment of the present invention;

FIG. 3 shows part of a method according to a third embodiment of the present invention;

FIG. 4 shows conventional systems for displaying pixels;

FIG. 5 shows a signal layout on digital video interfaces for synchronization;

FIGS. 6a, b show part of a method according to a fourth embodiment of the present invention;

FIG. 7 shows a frame structure for periodic display clock reference frames of a method according to a fifth embodiment of the present invention;

FIG. 8 shows a phase-locked-loop procedure of a method according to a sixth embodiment of the present invention;

FIG. 9 shows an external measurement of the phase-locked-loop with loop filter implementation of a method according to a seventh embodiment of the present invention;

FIG. 10 shows measured Vsync pulses on DVI/HDMI cables using a method according to an eighth embodiment of the present invention;

FIG. 11 shows measured Vsync pulses on DVI/HDMI cables using a method according to a ninth embodiment of the present invention;

FIG. 12 shows schematically a system according to a tenth embodiment of the present invention;

FIG. 13 shows part of a method according to an eleventh embodiment of the present invention;

FIG. 14 shows part of a method according to a twelfth embodiment of the present invention;

FIG. 15 shows part of a method according to a thirteenth embodiment of the present invention;

FIG. 16 shows part of a method according to a fourteenth embodiment of the present invention;

FIG. 17 shows part of a method according to a fifteenth embodiment of the present invention; and

FIG. 18 shows part of a method according to a sixteenth embodiment of the present invention.

FIG. 1 shows part of a system according to a first embodiment of the present invention.

In FIG. 1 a so-called composite display 1 is shown comprising 4×5=20 individual displays 2 or display devices. Further one full-size video stream 3 for the composite display of an interactive scientific visualization is shown and a second overlay stream 4 of a running video. Bezels of the different display devices 2 are compensated.

To display the streaming content in spatial and temporal alignment across the composite display 1 with its twenty display devices 2, multiple layers of synchronization at the composite display 1 end are employed. First, to enable displaying active-stereo content, the refresh frequency and phase of the physical display devices 2 have to be synchronized with each other. This ensures that all display devices 2 of the multi-display wall 1 for example switch in sync with active-stereo glasses of a spectator or user, without which stereo display would not be possible. The internal clock of a display master selected out of the display devices 2 is used and its internal clock signal is propagated across a network to all other participating display devices 2 of the display wall. Synchronizing the displays 2 with each other is referred to throughout the description as “displaylock”.

When, for example, on the pixel source side corresponding virtual pixel storage means in form of frame buffers across multiple hosts are synchronized, i.e. synchronized display refreshes are provided and vertical synchronization in a graphics driver within each display device is employed, this is referred to throughout the description as “swaplock”. If a video stream in a local frame buffer needs to be synchronized such that the same frame of a video is shown on all participating displays 2, this is called “framelock” and may be achieved by taking into account presentation timestamps inserted in all video streams and sorting the frames of the video stream into the correct time slots between frame buffer swaps at the display devices 2.

The term “reverse genlock” means in the description that, when the signal of a swap lock of a composite display is made available to the pixel producing pixel sources or applications, the pixel generation is synchronized to a frame buffer refresh rate of the receiving display devices. This may correspond to the corresponding signal of a display lock. In other words, “reverse genlock” means that a pixel sink locks to the clock of its pixel source, inverting the concept of a conventional generator locking “genlock”.
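Sorting frames into time slots by their presentation timestamps might look as follows. The sketch assumes that the presentation timestamps are already expressed on the common wall clock and that slot k begins at the k-th frame buffer swap; the function name and the rule that the latest frame wins within a slot are assumptions of this example.

```python
import math


def sort_into_slots(frames, first_swap_time, refresh_period):
    """Map frame buffer swap slots to the frame that should be shown in them.

    frames -- iterable of (presentation_timestamp, frame) pairs in seconds
    Returns a dict {slot_index: frame}; slot k starts at
    first_swap_time + k * refresh_period.
    """
    slots = {}
    for pts, frame in sorted(frames, key=lambda item: item[0]):
        index = math.floor((pts - first_swap_time) / refresh_period)
        if index >= 0:             # frames due before the first swap are dropped
            slots[index] = frame   # a later timestamp in the same slot wins
    return slots
```

Since every display device evaluates the same rule against the same synchronized swap instants, all participating displays would pick the same frame for a given slot.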

FIG. 2 shows schematically a system according to a second embodiment of the present invention.

In FIG. 2 four render servers 5 are shown which write into two virtual frame buffers 6a and 6b. A virtual frame buffer 6a, 6b is requested by, for example, a pixel generating application as a resource to write pixels into. To provide the pixel data of the virtual frame buffers 6a, 6b to a display wall 1 and to a mobile device 8 for presentation, the render servers 5 respectively the virtual frame buffers 6a, 6b hosted on one or more of the render servers 5 are connected via a network 7 based on the IP-protocol to display devices 2 of the display wall 1 and via a WLAN access point 9 to the mobile device 8. The virtual frame buffers 6a and 6b may be shared among several processes or network locations so that an application may attach to the virtual frame buffers 6a and 6b and for example just specify the quadrangular region within the whole virtual frame buffer 6a, 6b it wants to be responsible for. When the composite display 1 respectively the display wall 1 is connected to the virtual frame buffers 6a and 6b and is used to stream the video of the render servers 5 directly to the displays 2 of the display wall 1, the virtual frame buffers 6a and 6b preferably perform the necessary splitting up and scaling of input pixel frames, performing a visual compensation along the way according to the configuration of the displays 1, 8 connected to the virtual frame buffers 6a and 6b. In particular, downscaling of input pixels is performed at the virtual frame buffers 6a, 6b and scaling up is performed at the display devices 2 respectively the mobile device 8, reducing pixel data traffic within the network 7.
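The splitting up of a virtual frame buffer into per-display regions can be sketched as below. The sketch assumes equally sized displays arranged in a regular grid and models bezel compensation by a hypothetical parameter `bezel_px`, the number of frame buffer pixels hidden between two adjacent displays; neither the function nor the parameter is part of the description.

```python
def tile_regions(fb_width, fb_height, rows, cols, bezel_px=0):
    """Return {(row, col): (x, y, width, height)} source rectangles per display.

    bezel_px -- frame buffer pixels skipped between adjacent displays so that
                content appears to continue behind the bezels (assumed model)
    """
    # Integer division: any remainder pixels at the right and bottom edges
    # are simply left unused in this sketch.
    tile_w = (fb_width - (cols - 1) * bezel_px) // cols
    tile_h = (fb_height - (rows - 1) * bezel_px) // rows
    regions = {}
    for r in range(rows):
        for c in range(cols):
            x = c * (tile_w + bezel_px)
            y = r * (tile_h + bezel_px)
            regions[(r, c)] = (x, y, tile_w, tile_h)
    return regions
```

Each display device would then receive only the pixels of its own rectangle, scaled as required, rather than the complete frame.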

FIG. 3 shows part of a method according to a third embodiment of the present invention.

FIG. 3 shows steps by which the frequency and phase of a slave display device are synchronized to those of a master display device.

For active stereoscopy a missing frame lock, i.e. missing synchronization of visual refresh at the displays SDN, disturbs the spatial perception of a composite image by a user on multiple display devices 2. For example, when viewing through active stereo shutter glasses which are connected and synchronized to one of a set of display devices 2, for example via infrared or Bluetooth, and this display device 2 is selected to be the sync reference display of a display wall 1, then without further measures left-right stereo separation will suffer on all of the display devices 2 except this sync reference device or sync master display device. In particular, for example, a phase difference of 90° between any display device and the master device results in 100% cross talk, while a 180° phase difference results in left-right permutation. If continuous frequency and phase equalization is not maintained then, for a common refresh rate of approximately 120 Hz, a 180° phase offset will occur after about seven minutes when the two oscillators deviate by for example 10 ppm in base frequency. Even further, swap lock, i.e. synchronization in which the contents of two currently displayed virtual frame buffers are synchronized, is preferably enabled for a composite display. Swap lock ensures that all video decoder outputs provide the correct video frame at any display refresh instant.

In general two independently running graphic devices including display devices 2 are out of phase in their display refresh rates. Even if parameterized to the same refresh rate of for example 50 Hz, each derives this frequency of the display device from its local quartz crystal oscillator of some base frequency. For example, a 27 MHz base signal is common in television broadcast; however, personal computers for example may use other base frequency quartz crystal oscillators or even multiple different local quartz crystal oscillators. Further, due to production tolerances of several tens of parts per million and the temperature and aging dependence of the oscillator frequency, it is highly likely that two running local quartz crystal oscillators differ in frequency and phase at some point in time. Having the corresponding oscillator frequencies f₁(t) and f₂(t), frequency and phase difference are given as
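As a check of the seven-minute figure, and assuming that the 10 ppm deviation applies directly to the 120 Hz refresh rate, the time until a 180° offset (half a refresh cycle) has accumulated can be worked out as follows:

$\Delta f = 120\,\mathrm{Hz} \cdot 10 \cdot 10^{-6} = 1.2 \cdot 10^{-3}\,\mathrm{Hz}, \qquad t_{180^{\circ}} = \frac{0.5}{\Delta f} \approx 417\,\mathrm{s} \approx 6.9\,\mathrm{min}$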

$\left( f_{1}(t) - f_{2}(t) \right) = \Delta f(t) = \frac{\partial}{\partial t} \Delta\varphi(t) = \frac{\partial}{\partial t} \left( \varphi_{1}(t) - \varphi_{2}(t) \right)$

where Δφ(t), φ₁(t), φ₂(t) ∈ [0, 2π] are the phase difference and the phases, respectively.

In order to synchronize multiple independently running clocks for the display devices 2 in a display wall 1 it has to be defined a wall clock and its propagation and second an adaption of the local clocks in frequency and phase to the defined wall clock has to be performed.

A wall clock may for example be defined by one of the display devices elected as the clock master display, which continuously broadcasts a message, i.e. synchronization information, at each visual refresh time instant within display clock reference UDP/IP packets. These packets may be exchanged according to the NTP and PTP system time clock synchronization mechanisms: a client may request one or more servers to send timestamps via IP packets, while the client itself may be a server to others. This synchronization may then be based on calculations with four timestamps, which are exchanged in IP packets on a predefined network port, potentially with operating system or Ethernet hardware support. The synchronization accuracy is increased when the most stable and least distant clock servers are selected.
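The four-timestamp calculation mentioned above can be sketched as follows. The formulas are the standard NTP offset and round-trip delay expressions; the function name and the example values are illustrative assumptions and do not describe the DCR mechanism itself.

```python
from typing import Tuple


def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Standard NTP calculation from four timestamps.

    t1: request sent (client clock)    t2: request received (server clock)
    t3: reply sent (server clock)      t4: reply received (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)            # round trip without server processing
    return offset, delay


# Example values (assumed): server is 140 ms ahead, 20 ms round trip.
offset, delay = ntp_offset_and_delay(100.000, 100.150, 100.160, 100.030)
print(f"offset = {offset * 1e3:.1f} ms, delay = {delay * 1e3:.1f} ms")
```

The offset estimate is exact only if the forward and return paths take equal time, which is the symmetry assumption that the RTT discussion further below relaxes.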

When the slave display nodes, i.e. the display devices other than the clock master device, receive the stream of messages, the messages may be assumed to be spaced equidistantly on the time axis of the clock master device's display clock.

To adapt the local clocks in frequency and phase to the wall clock, all slave display devices first need to determine the frequency and phase of the wall clock as well as of their own display clock. To determine the frequency and phase of the wall clock, the clock master display device's exact refresh rate is deduced from arrival timestamps, while the slave's own display clock rate is deduced from refresh interrupt timestamps. Further, the propagation time of the display clock reference/refresh packets, i.e. of the synchronization information, must be determined. In a final step the local display clock has to be adjusted to the frequency and phase of the wall clock without visible artifacts.

In FIG. 3 the frequency and phase of the clock master display are shown on the left side and the frequency and phase of a slave display device on the right side. Beginning from the top, the clock master device sends a refresh message to the slave display device SDN. The travelling time of the refresh message theoretically equals half of the round-trip-time. However, network traffic is asymmetric from the viewpoint of a display wall: downlink traffic from video or pixel sources to the display wall may be much higher than uplink traffic, so that simply taking half of the round-trip-time as the delay may be inaccurate. Assuming equidistant refresh messages, the arrival timestamps of the messages at the slave display device are used to determine the exact refresh rate of the clock master display device. After determining the refresh rate of the clock master device based on the arrival timestamps and the round-trip-time of the refresh message, the phase and frequency of the slave display device are adapted continuously by reducing the corresponding phase shift until they are equal, within measuring uncertainty, to the phase and frequency of the clock master device.

In the case of symmetric traffic, FIG. 3 therefore illustrates the synchronization by example: a clock master display CMD periodically transmits UDP/IP messages at each of its own refresh time instants. Based on those packets, herein called display clock reference (DCR) IP packets or reference messages, each slave display adjusts its refresh rate to match that of the CMD. The CMD's exact visual refresh frequency can be estimated by evaluating the arrival timestamps of said packets, as they are assumed to arrive at equidistant time intervals. On a non-realtime scheduling operating system and in combination with best-effort IP transport, noise may be added to the arrival timestamps. Even with frequency identity of display refresh, the phase offset between a slave display node and the CMD remains unknown. To estimate it, each slave display (and content sources as well, if need be) may periodically estimate the network round trip time (RTT) to the CMD to evaluate the forwarding delay of the display refresh UDP/IP messages. A perfectly accurate forwarding delay estimate together with frequency identity enables the slaves to achieve phase identity with the CMD at all times.

With proper synchronization of any two display nodes in display frequency and phase, swap lock is achieved by provision of timeline information in DCR packets/refresh messages. By comparison of DCR timeline information with video frame timestamps during playback, synchronous video playback may be achieved across the display wall. In the same manner, content sources may synchronize their content generation speed with the composite screen. The sum of all rendering, buffering and processing times will result in a delay between source(s) and sink(s) that is to be determined for any discrete setup. Automatic detection and optimization of this delay sum may be provided additionally. For symmetric traffic, the network forwarding delay may be expressed as

$\begin{matrix} Δ t_{fw} = \frac{RTT}{2} & (1) \end{matrix}$

In the case of asymmetric traffic, the RTT measurement is clearly asymmetric as there is an imbalance in video streaming traffic. Specifically, the RTT is asymmetric during periods in which the slave display nodes receive video traffic. When there is a period of time over which no video streaming traffic is present on the network (e.g. in an initialization phase), a minimum RTT should be determined on the otherwise idle network, and the forwarding delay is then found as

$\begin{matrix} Δ t_{fw} = \frac{\min(RTT)}{2} + [RTT - \min(RTT)] & (2) \end{matrix}$
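The two forwarding delay expressions (1) and (2) can be sketched as follows. This is a minimal illustration; the function names and the sample RTT values are assumptions, and times are given in milliseconds only for readability.

```python
from typing import Sequence


def forwarding_delay_symmetric(rtt: float) -> float:
    """Equation (1): half of the round trip time."""
    return rtt / 2.0


def forwarding_delay_asymmetric(rtt: float, idle_rtts: Sequence[float]) -> float:
    """Equation (2): half of the idle-network minimum RTT plus the whole
    excess of the current RTT over that minimum."""
    rtt_min = min(idle_rtts)
    return rtt_min / 2.0 + (rtt - rtt_min)


# RTT samples taken on the otherwise idle network (assumed values, in ms).
idle = [0.40, 0.36, 0.38]
# RTT measured while video traffic is being received (assumed value, in ms).
loaded = 0.90

print(forwarding_delay_symmetric(min(idle)))        # idle network
print(forwarding_delay_asymmetric(loaded, idle))    # loaded network
```

Written this way, it becomes visible that expression (2) attributes the entire increase of the RTT over its idle minimum to the forwarding direction towards the slave display node.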

FIG. 4 shows conventional systems for displaying pixels.

In FIG. 4 hardware wall clock solutions differing in complexity and video bandwidth are shown. On the top a so-called daisy-chain video interface, in the middle a multi-head video interface and on the bottom a multi-node with multi-head video interface is shown.

Daisy-chain uses a sequential connection from a pixel source 5 to a first screen 2, from a first screen to a second screen and so on. Multi-head uses one pixel source 5 and corresponding one-to-one connections between the pixel source 5 and each corresponding screen 2.

Multi-node with multi-head is shown on the bottom of FIG. 4. A server 5a synchronizes with clients 5b for displaying pixels on the screens 2. Each of the clients 5b has multi-head connections to corresponding screens 2. Further the server 5a has also one or more one-to-one connections for displaying pixels to corresponding screens 2.

In other words, in the case of daisy-chaining and multi-head there is one single local quartz crystal oscillator involved, serving as the wall clock. In daisy-chaining a single video signal, for example an HDMI signal, is repeated by the displays, possibly introducing some delay. In the same way a multi-head based system is clocked from a single local quartz crystal oscillator, for example on a multi-head graphic processing unit. In the case of multi-node systems a dedicated wiring is necessary for frame lock of the independent systems. A synchronization signal from the master clock device 5a to a number of slave display systems 5b is used.

If, for example, pixel sources are provided in the form of personal computers, the images drawn on the display devices are usually generated at some rate by the directly connected graphics device. Upon reception of pixel or image data the display device refreshes the currently shown image with this data. Although the visible image is nowadays physically created on liquid crystal displays, which have replaced the formerly predominant cathode-ray tube displays, the format of the pixel data transmitted to the display still follows the VGA standard.

FIG. 5 shows a signal layout as signaled on the digital interfaces DVI and HDMI for synchronization.

In FIG. 5 a signal layout as signaled on digital interfaces like DVI and HDMI is shown resembling the structure of a VGA signal.

With a VGA signal driving a cathode-ray tube (CRT), an electron beam is moved over the screen, drawing pixels serially. At the end of each line the beam needs to be steered back to the beginning of the next line. During this retrace the beam is switched off (horizontal blanking). At the end of the last active line, the electron beam must travel back to the upper left corner. Therefore the vertical blanking interval (VBLANK) is appended, containing a vertical sync pulse (VSYNC). To allow the voltage on the display cable to stabilize, a front porch and a back porch are inserted.

Periodic refresh of the image is necessary to display images without apparent flickering. Fluorescent material within the CRT screen has a specific afterglow period. If a pixel is not refreshed within this period, it will darken and vanish. CRTs were built supporting relatively high refresh rates of 75-100 Hz to combat flickering. As the electron beam requires some time in the order of microseconds to retrace, the length of the blanking periods had to be sufficient. For CRTs these are specified by the Video Electronics Standards Association (VESA) by the general timing formula (GTF). Blanking is a display device specific requirement, hence on aforementioned digital video links, reduced blanking intervals are an option to reduce bandwidth overhead.

Consequently, modulation of frame duration may increase or decrease the current refresh rate. For example, increasing the duration of two out of four frames by 1% would decrease the refresh rate on average over these four frames by about 0.5%.

Due to the similarity of analog VGA signaling and the two predominant digital variants DVI and HDMI, an evaluation of the possibility of adjusting the refresh rate of commodity PC graphics hardware (Intel Linux Graphics http://intellinuxgraphics.org/) by modulation of frame duration was performed. The video refresh rate is r_v = f_p/p_tot, with f_p being the pixel clock frequency and p_tot the total number of pixels per frame. Consequently, the frame duration is T_frame = 1/r_v. With fixed f_p, changing the total number of pixels will modify the frame length (in pixels) and its duration. Further, the number of non-active pixel lines and columns within the VSYNC interval is modified. The change in refresh rate is given as

$\begin{matrix} Δ r_{v} = f_{p} \cdot \left(\frac{1}{h_{{tot}_{1}} \cdot v_{{tot}_{1}}} - \frac{1}{h_{{tot}_{2}} \cdot v_{{tot}_{2}}}\right) & (1) \end{matrix}$

With the image parameters as given in Table I below one obtains from (1), within a reasonable refresh rate range, a granularity of refresh rate variation while changing only the number of lines by Δr_v≈0.16 Hz, and while changing only the number of columns by Δr_v≈0.06 Hz. Combining horizontal increase with vertical decrease and vice versa results in a granularity of about 0.015-0.02 Hz.

TABLE I

EXAMPLE VESA CVT PARAMETERS FOR 720P AT 120 HZ

| Quantity | Value | Direction | Active | Blanking: front | Blanking: SYNC | Blanking: back |
| pixelclock f_p | 162 MHz | horizontal | 1280 | 96 | 136 | 232 |
| refresh rate r_v | 119.86 Hz | vertical | 720 | 3 | 5 | 47 |
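The relation r_v = f_p/p_tot and the refresh rate change of equation (1) can be evaluated for the parameters of Table I as sketched below. The sketch assumes that the total line length and the total number of lines are the sums of the active and blanking values of the table; the function names and the chosen step sizes are illustrative.

```python
F_P = 162e6                        # pixelclock in Hz (Table I)
H_TOT = 1280 + 96 + 136 + 232      # total pixels per line: active + blanking
V_TOT = 720 + 3 + 5 + 47           # total lines per frame: active + blanking


def refresh_rate(f_p: float, h_tot: int, v_tot: int) -> float:
    """r_v = f_p / p_tot with p_tot = h_tot * v_tot."""
    return f_p / (h_tot * v_tot)


def delta_refresh(f_p: float, h1: int, v1: int, h2: int, v2: int) -> float:
    """Equation (1): refresh rate change between two frame geometries."""
    return f_p * (1.0 / (h1 * v1) - 1.0 / (h2 * v2))


r_v = refresh_rate(F_P, H_TOT, V_TOT)
step_line = delta_refresh(F_P, H_TOT, V_TOT, H_TOT, V_TOT + 1)       # one more line
step_col = delta_refresh(F_P, H_TOT, V_TOT, H_TOT + 1, V_TOT)        # one more column
step_mixed = delta_refresh(F_P, H_TOT, V_TOT, H_TOT + 2, V_TOT - 1)  # combined change

print(f"r_v = {r_v:.2f} Hz")
print(f"one line: {step_line:.3f} Hz, one column: {step_col:.3f} Hz, "
      f"mixed: {abs(step_mixed):.3f} Hz")
```

The nominal refresh rate of 119.86 Hz is reproduced, and the single-step changes come out in the same range as the granularities stated above; the exact figures depend on the operating point and on which totals are changed.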

From equation (1) we see that also a modification of f_p, the pixel clock, results in a different frame duration. The pixelclock is typically produced by a frequency synthesizer on the graphics device. It is generated by applying multipliers and divisors to a reference frequency that is provided by a quartz oscillator. Eq. (2) gives an example (http://intellinuxgraphics.org/):

$\begin{matrix} f_{p} = f_{ref} \frac{(5 \cdot (M_{1} + 2) + (M_{2} + 2))}{(N + 2) \cdot (P_{1} \cdot P_{2})} & (2) \end{matrix}$

where f_ref is a reference XO frequency (e.g. 96 MHz for some Intel devices). M_1,2, N and P_1,2 are integer parameters that can be adjusted within predetermined limits. The granularity of refresh rate modification using the pixelclock has been around 0.14-0.36 Hz for the parameters in Table I.
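Eq. (2) can be evaluated directly, as in the following minimal sketch. The parameter set shown is a hypothetical example chosen only because it yields the 162 MHz pixelclock of Table I; it has not been checked against the permitted parameter limits of any real device.

```python
def pixel_clock(f_ref: float, m1: int, m2: int, n: int, p1: int, p2: int) -> float:
    """Eq. (2): pixelclock synthesized from the reference frequency f_ref."""
    return f_ref * (5 * (m1 + 2) + (m2 + 2)) / ((n + 2) * (p1 * p2))


# Hypothetical parameter set (assumption, not taken from hardware documentation).
f_p = pixel_clock(96e6, m1=18, m2=6, n=2, p1=8, p2=2)
r_v = f_p / (1744 * 775)   # total pixels per frame from Table I
print(f"f_p = {f_p / 1e6:.1f} MHz, r_v = {r_v:.2f} Hz")
```

Because all parameters are integers, the reachable pixelclock values form a discrete grid, which is what limits the granularity of refresh rate modification via the pixelclock.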

It can be seen that with access to f_p, some combination of parameters may enable even more fine grained frame length modulation. However, the above mentioned results are only bound to specific display hardware and cannot be generalized to other display devices.

When the image dimensions within any part of the total pixel area, or the pixelclock, are modified within the VBLANK pause, e.g. on Intel integrated graphics platforms using the Intel Linux graphics drivers (Intel Linux Graphics http://intellinuxgraphics.org/) without resetting the whole display pipeline, many displays do not tolerate frequent modifications of image parameters without clearly visible artifacts.

Hence, the modulation of frame duration by the above mentioned methods related to FIG. 5 is unusable for synchronization of digital displays, as the tolerance is highly dependent on the specific display hardware.

FIGS. 6a, b show part of a method according to a fourth embodiment of the present invention.

In FIG. 6a slave display devices SDN and pixel sources are synchronized to a wall clock provided by one of the display devices or display nodes.

In FIG. 6a the roles of generator and consumer with respect to synchronization are reversed according to an embodiment of the present invention. This has the advantage that it is easier to synchronize display clocks in frequency and phase to a signal generated by another display clock, whereas any source synchronizes to a clock that is common across the whole set of displays. Normal or “forward Genlock” reduces the accuracy of the inter-display phase synchronization since the round-trip-time measurement accuracy from each display to the pixel source(s) has to be respected.

Using “reverse Genlock” according to an embodiment of the present invention, the display refresh rates of the display devices are synchronized together with the times across the display devices, i.e. a frame lock is achieved, by using for example an IP-based clock signal. Using the same signal, the pixel sources are synchronized to the synchronized display devices. For synchronization, clock signal packets are exchanged which may be provided with virtual timestamps. When pixel or video sources provide timestamped IP video streams to the display side, a closed loop enabling swap lock is created. The phase may be estimated by using an ICMP ping, which is usually available at any IP host entity.

In FIG. 6a, therefore, a system according to the invention based on the reverse Genlock is shown. The system comprises three types of entities: one sync reference device, i.e. a clock master display device; a plurality of slave display nodes SDN, i.e. slave display devices, which are all display devices other than the clock master display device; and a so-called reverse Genlock frame deadline predictor. In FIG. 6 one of the display devices 2 serves or is selected as the clock master device CMD providing a wall clock reference signal for the composite display comprising all display devices. The clock master display device may be pre-defined or selected on a peer-to-peer basis from all the display devices forming the composite display.

A reference time base is directly derived from the CMD's digital display signal and is provided to other entities in a network via short IP packets, which are called display clock reference (DCR) packets in the following. DCR packets are generated periodically with a period equal to or larger than (e.g. an integer multiple of) the CMD's visual refresh period. Due to the equidistance assumption on those packets, timing jitter as introduced by the network can be compensated, and the visual refresh rate of the CMD can be determined from those packets at another network entity. This may be supported by IEEE 1588 hardware timestamps.
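How the equidistance assumption allows jitter to be averaged out can be sketched as follows. The sketch assumes that each DCR packet carries a sequence number and uses an ordinary least-squares line fit of arrival time over sequence number; the fit is an illustrative choice and is not stated to be the estimator actually used.

```python
import random
from typing import Sequence, Tuple


def estimate_period(seq: Sequence[int], arrival: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit arrival ~ offset + period * seq.

    Gaps in seq (lost packets) are handled implicitly, because the fit
    uses the sequence numbers rather than the packet count.
    """
    n = len(seq)
    if n < 2 or n != len(arrival):
        raise ValueError("need at least two (seq, arrival) pairs")
    mean_s = sum(seq) / n
    mean_t = sum(arrival) / n
    var = sum((s - mean_s) ** 2 for s in seq)
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(seq, arrival))
    period = cov / var
    return period, mean_t - period * mean_s


# Simulated arrivals (assumed values): 120 Hz master, +/-50 us jitter, one lost packet.
rng = random.Random(0)
true_period = 1.0 / 120.0
seq = [k for k in range(120) if k != 37]
arrival = [0.002 + k * true_period + rng.uniform(-50e-6, 50e-6) for k in seq]

period, offset = estimate_period(seq, arrival)
print(f"estimated refresh rate: {1.0 / period:.3f} Hz")
```

With one second of packets the period error stays well below the per-packet jitter, since the jitter of individual arrivals is averaged over the whole observation window.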

Non-clock-master displays, i.e. the slave display nodes SDN, measure their own visual refresh rate and compare it with the reference clock received via IP, whereas packet loss is compensated for. Refresh rate equalization is done by VCXO control (Genlock) based on the DCR signal. Phase differences are compensated for by round trip time (RTT) estimation using the standard IP RTT measurement method based on ICMP (RFC 792: Internet Control Message Protocol) ping.

As mentioned above, the arrival times of DCR packets are used to determine both frequency and phase. However, the phase obtained is subject to a phase shift composed of the forwarding delay of the DCR packets as well as operating system scheduling. FIG. 6b depicts possible timestamps involved in the IP based synchronization. The CMD issues a DCR message per video interrupt (IRQ). The forwarding delay of DCR messages on the IP networks is unknown when arrival timestamps are taken in the network interface card (NIC) IRQ. Hence, a slave display node issues RTT measurement packets that are answered by the master immediately. For frequency synchronization the timestamps t₁ and t₂ are used, while for phase equalization Δt is compensated additionally. In order to compensate for different scheduling latencies at different operating system levels (cf. kernelspace vs. userspace), timestamps are taken within one operating system level at each node. Although the CMD may take timestamps in the kernelspace, this is not mandatory for slave devices. Furthermore, a single clock source may be used for time measurement at each node. This can be the NIC's clock or the system clock, since time differences in the order of milliseconds are to be considered and the whole system is free of any absolute time.

In addition, the DCR packets may also be received by the pixel sources that shall synchronize content generation with the composite display; if they cannot be received directly, e.g. due to subnet boundaries, they are forwarded to these pixel sources. In case of forwarded DCR packets, the current cumulative RTT for each forwarding hop is signaled in each DCR packet. Such a DCR packet structure is for example shown in FIG. 7.

Any participating pixel source can synchronize its content generation rate, or a playout rate for stored content, to the composite display refresh rate. In order to do so, a so-called frame deadline predictor may provide a linear extrapolation of the point in time at which a frame with some number b>a, where a is the last displayed frame, is due for transmission. By this, a joint generation (or playout) phase can be achieved across multiple content generating nodes, which may be subject to a constant phase offset due to content generation management. Furthermore, audio-visual frames may be timestamped to achieve swap lock. Therefore, a broadcast compatible program clock reference timestamp, abbreviated PCR, is included in the DCR packets, as shown in FIG. 7. This creates a closed loop, as each display device can compare the PCR received via the DCR directly (current time of the master) with the PCR values as seen from the incoming audio-visual bitstream(s), and consequently adjust a constant delay. If this constant delay is chosen identically on all display devices, swap lock is achieved and the accuracy of any system time clock in the whole architecture is irrelevant, as system time clocks are hidden.
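The linear extrapolation performed by the frame deadline predictor can be sketched as follows. The function name, the parameter names and the optional lead time are illustrative assumptions; all times are taken on the local clock of the pixel source.

```python
def predict_deadline(t_a: float, a: int, b: int, refresh_period: float,
                     lead_time: float = 0.0) -> float:
    """Linearly extrapolate when frame b (> a) is due for transmission.

    t_a:            local time at which frame a, the last displayed frame, was shown
    refresh_period: estimated refresh period of the composite display
    lead_time:      assumed constant allowance for forwarding and processing
    """
    if b <= a:
        raise ValueError("b must be greater than a")
    return t_a + (b - a) * refresh_period - lead_time


# Frame 100 was displayed at t = 10.0 s; when is frame 103 due at 120 Hz?
print(predict_deadline(10.0, a=100, b=103, refresh_period=1.0 / 120.0))
```

Since every content generating node extrapolates from the same reference timeline, all nodes arrive at the same deadline for frame b up to their individual constant offsets.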

FIG. 8 shows a phase-locked-loop procedure of a method according to a sixth embodiment of the present invention.

A phase-locked-loop procedure for VBLANK and DCR frequency and phase estimation, e.g. implemented in software on an i686 32-bit architecture, compensates for undetected VBLANK interrupts and lost DCR packets by linear interpolation. In FIG. 8 the following variables are used:

- The variable “z” is a frequency variable. It is named “z” since it results from a z-transformation, which is based on the Laplace transformation. The variable “z” “includes” the frequency “f” known from a Fourier transformation.
- H_LF(z) is the transfer function of the Loop Filter LF
- The term DCO means Digitally Controlled Oscillator, i.e. an oscillator which is digitally controlled
- P(z) is the transfer function of the predictor P. The transfer function P is used to generate, based on the frequency and the last phase, the predicted phase for the next comparison.
- PCR means Program Clock Reference. The Program Clock Reference is the name of the timestamps when embedded into an MPEG-2 Transport Stream and
- RTT means Round Trip Time.

The PLL loop filter is of first order with a cutoff frequency of 10⁻² Hz (reference sign F3), a reference frequency F4 and further cutoff frequencies F1, F2. FIG. 9 shows a simulation of such a loop filter.
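The interplay of phase comparison, loop filter and digitally controlled oscillator can be illustrated by a small simulation. This is a generic sketch only: it assumes a proportional-integral loop filter with illustrative gains kp and ki and does not reproduce the filter, the cutoff frequencies or the predictor of FIG. 8; phases are counted in refresh cycles and are not wrapped.

```python
def run_pll(f_ref: float, f_free: float, phase0: float, kp: float = 0.5,
            ki: float = 0.05, dt: float = 1.0 / 120.0, duration: float = 75.0) -> float:
    """Simulate a simple software PLL and return the residual phase error in cycles.

    f_ref:  refresh rate of the reference (clock master) in Hz
    f_free: free-running refresh rate of the local oscillator in Hz
    phase0: initial lead of the reference over the local phase, in cycles
    """
    theta_ref = phase0
    theta_loc = 0.0
    integ = 0.0
    for _ in range(int(round(duration / dt))):
        err = theta_ref - theta_loc          # phase comparison
        integ += ki * err * dt               # integral branch of the loop filter
        f_loc = f_free + kp * err + integ    # digitally controlled oscillator
        theta_ref += f_ref * dt              # reference advances
        theta_loc += f_loc * dt              # local oscillator advances
    return theta_ref - theta_loc


# Local oscillator 10 ppm slow, starting a quarter cycle behind the reference.
residual = run_pll(f_ref=120.0 * (1 + 10e-6), f_free=120.0, phase0=0.25)
print(f"residual phase error after 75 s: {residual:.2e} cycles")
```

The integral branch absorbs the constant frequency deviation, so the phase error decays towards zero instead of settling at a constant offset; the gains determine how quickly lock is acquired.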

The PLL procedure used takes roughly less than 75 s to acquire lock. This is deemed sufficient, since e.g. a display wall is thus powered up and running after less than two minutes, including boot time. As a result, a residual phase error of around ±100 μs with the set-top-boxes used and long term frequency identity were achieved. The absolute phase offset due to ICMP LAN round-trip time estimation error may be negligible with proper filtering. Said magnitude of phase error corresponds to only ±1.2% of T_frame at 120 Hz (reference sign F4), enabling a good time-interleaved stereo separation. Left/right discrimination is provided by HDMI 1.4a.

For determining the frequency and phase stability of the system synchronized via an IP network, two tests have been performed. During the tests, the video signals' VSYNC pulses are extracted from the DVI/HDMI signal and measured using an oscilloscope. The tests are carried out with two set-top-boxes connected to a gigabit Ethernet network. Via this Ethernet, both receive a single DCR signal. In the first test, there is no background traffic on the network. In FIG. 10 oscilloscope measurements of VSYNC pulses K1, K2 on DVI/HDMI cables, each connected to a slave display node, are shown. The software phase locked loops are locked and no background traffic is present in the network. The scale SC shown in FIG. 10 is 50 μs/div. The mean phase difference is |Δ T(⋄)|=29.5 μs with std. dev. 6.7 μs, i.e. the phase difference |Δ T(⋄)| is below 30 μs on average.

The second test is performed while transmitting 100 Mbps UDP streams destined to each of the set-top-boxes' IP addresses from another gigabit Ethernet host in the network. This test is more reliable and is closer to reality, since the network cannot be assumed idle in practice due to the visual content being streamed via the same network, too. Due to the Ethernet backbone being a gigabit switch and the set-top-boxes being equipped with 100 Mbps Ethernet, the set-top-boxes' links are assumed to be saturated. Again, a result is presented in FIG. 11: oscilloscope measurements K3, K4 of VSYNC pulses on DVI/HDMI cables, each connected to a slave display node, are shown. The software phase locked loops are locked and very high background traffic is present in the network. The scale SC shown in FIG. 11 is 50 μs/div. The mean phase difference is |Δ T(⋄)|=105.8 μs with std. dev. 20.6 μs.

In this test, the phase difference increases to slightly above 100 μs on average, with an increased standard deviation, due to the asymmetric network traffic with respect to the display devices. In both tests, RTT estimation has been switched off; the results thus show the reliable use of the above described PLL procedure. In FIGS. 10 and 11, the grid position relative to the oscilloscope trigger, which is connected to the DCR source (master clock signal), is unchanged.

FIG. 12 shows schematically a system according to a tenth embodiment of the present invention.

In FIG. 12 a system comprises a number of pixel sources 5, for example distributed rendering sources, a playout synchronous source and a presentation asynchronous source, each having one or more virtual frame buffers 6. The virtual frame buffers 6 are connected via internet 7 to a composite display 1 comprising a plurality of physical display devices 2. One of the displays 2 is defined as clock master display CM transmitting synchronization information to all other displays 2 and to the pixel sources 5 for synchronization. Further, one of the display devices 2 is selected as display master DM. The display master DM provides information about the composite display 1 to the network 7 and further to entities connected to the network 7, for example which other displays besides itself form the composite display 1, what the overall pixel size of the composite display 1 available to paint on is, etc.

In detail in FIG. 12 and preferably in the whole description a display may

- be a single physical display device 2, such as an LCD screen or a projector,
- be the only actual sink for visual content (pixels) in the system,
- include some processor, which is connected to the network 7 and is able to perform pixel operations on incoming pixel buffers. Furthermore, processing services may be provided to specific network ports by the display itself or by a connected computer,
- provide information about its capabilities (e.g., pixel size, physical size, 3D location, color depth, refresh rate, stereo capabilities) as a service to the network 7,
- have a unique identifier, preferably an IP address, which identifies it uniquely in the network 7.

A composite display 1 may

- comprise one or more displays 2, which together form a meta-screen 1 for pixels to be displayed on. An example of such a composite display 1 is a display wall,
- comprise exactly one display device 2 that acts as the clock master CM,
- comprise exactly one display device 2 that acts as the display master DM.
  - It may be the same display as the clock master CM.

The clock master CM of a composite display 1 may

- be a display,
- provide a clock source to synchronize switching times of all the other displays 2 that are part of its composite display 1 to its own clock,
- provide a reverse Genlock signal to the network 7, which pixel sources 5 may synchronize themselves to. The signal may be transmitted IP-based via the network 7.

The display master DM of a composite display 1 may

- be a display, too,
- provide information about the composite display 1 to the network 7, in particular which other displays 2 besides itself form the composite display 1, what is the overall pixel size of the composite display 1 available to paint on, etc.

An application may

- be a provider of pixels to be displayed on a composite display 1, i.e., a pixel source. Applications may be the only pixel sources 5 in the overall system,
- request one or more virtual frame buffers 6 from a display manager M.

Distributed applications may request more than one to spread out the pixel-generating workload to multiple physical hosts.

A virtual frame buffer (VFB) may

- be a virtual 2D array of pixels for applications to write into,
- perform color conversion and encoding on pixels that are written to it. If necessary, it scales the pixel array, too,
- send a video stream it creates from incoming pixels in a peer-to-peer fashion to all hosts that need it. It receives the information, which displays 2 to send the stream to, from the display manager M,
- be requested in an arbitrary pixel size from the display manager M. The pixels input to the system in any size can be scaled to a size fitting the pixel real estate on the output end in the course of processing by the VFB 6,
- use the reverse Genlock clock signal from a clock master CM to synchronize its encoding and sending of video frames to a specific output composite display 1.

Therefore a virtual frame buffer may in particular be a content source able to be reverse-synchronized and able to provide content presentation with correct speed and constant delay.

The display manager M may

- be implemented as a separate software component, which may run on any host reachable through the display network 7. This host may be a display device 2, but does not have to be,
- be the location where all information about composite displays 1 is collected and stored,
- be the location where all information about pixel sources 5 having requested virtual frame buffers 6 is stored,
- provide an API enabling the direction of pixel flow from all registered pixel sources 5 to all registered pixel sinks 1, 2,
- negotiate if and, if yes, where scaling of the video data coming from a specific pixel source 5 is performed. This way, for a magnification of pixels the scaling may take place at the sink 1, 2, whereas for a minification of the pixels, scaling may take place at the pixel source 5 already. This minimizes network traffic in the network 7.
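The scaling negotiation described in the last item above can be sketched as a simple decision rule. The function name, the use of the total pixel count as the criterion and the returned labels are illustrative assumptions, not an interface of the display manager M.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def scaling_location(source_size: Size, sink_size: Size) -> str:
    """Decide where scaling should take place so that the smaller pixel
    array is the one that travels over the network."""
    src_pixels = source_size[0] * source_size[1]
    sink_pixels = sink_size[0] * sink_size[1]
    if sink_pixels > src_pixels:
        return "sink"      # magnification: send the small frame, enlarge at the display
    if sink_pixels < src_pixels:
        return "source"    # minification: shrink before sending
    return "none"          # sizes match, no scaling required


print(scaling_location((1280, 720), (3840, 2160)))   # sink
print(scaling_location((3840, 2160), (1280, 720)))   # source
```

In both directions the video stream is transmitted at the smaller of the two resolutions, which is what minimizes the traffic in the network 7.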

The synchronization architecture is responsible for synchronizing a subset of display devices 2 of a composite display 1 to form a comprehensive stereo display wall. Furthermore, a clock reference for applications to lock onto is generated.

FIG. 13 shows part of a method according to an eleventh embodiment of the present invention.

In FIG. 13 an inter-display synchronization using a master display device CMD as clock reference and with manual 120 Hz phase correction is shown.

In FIG. 13 one of the display devices 2 serves as the clock master device CM by providing the clock reference for the composite display 1. It may be selected on a peer-to-peer basis from all the display devices 2 that make up a composite display 1. This time base is directly derived from the clock master's digital display signal and may be delivered in small equidistant Ethernet frames as display clock reference frames at a frequency smaller than or equal to the clock master's visual refresh rate. Due to the equidistance property, the timing jitter as introduced by the network 7 can be compensated, and the period and thus the visual refresh rate of the clock master device CM can be determined from those frames. A dedicated subset of non-clock-master display devices 2 measure their own visual refresh rate and compare it with the reference clock signal received via Ethernet. The offset and drift of the different display devices 2 are then compensated, e.g. supported by IEEE 1588, to achieve a very tight synchronization between the playout clocks of the display devices 2. This enables active stereo and multi-view composite displays 1. Both measuring and locking the timing of the display signal may require access to the driver of the display hardware.

In addition, the display clock reference signal can also be transmitted, e.g. via IP encapsulation (cf. IEEE 802.1AS), to pixel sources that want to synchronize with the composite display 1, i.e. reverse Genlock is applied in order to reduce playback judder, wherein judder denotes sampling point inaccuracy on the time axis (jitter) for visual content. The pixel sources 5 can then synchronize the content generation and/or its playout rate to the composite display output signal, wherein playout denotes playback at the playback rate, but without visual display, e.g. for streaming to a display.

When locking to the clock of the composite display's clock master display device CM, the pixel sources 5 could miss one or more clock cycles, e.g. when their computational power is insufficient and frame generation takes more than one visual refresh period. If clock cycles are missed, the misses may be coordinated in order to avoid different temporal resolutions within a scene/video when displayed at the composite display. Additionally or alternatively, temporal upsampling, i.e. frame interpolation, may be performed at the display devices 2.

FIG. 14 shows part of a method according to a twelfth embodiment of the present invention.

In detail, FIG. 14 shows a timeline of a synchronization. Per frame to be displayed, each display device 2 obtains two presentation timestamps PTS, one from the clock master device CMD and one from the pixel source 5. The individual time lines are aligned using the presentation timestamps PTS and the round-trip-times RTT to the clock master device CMD. The display clock reference signal may in particular be a periodic signal in the form of equidistant Ethernet frames delivered with a frequency equal to or lower than the clock reference device's refresh rate. It provides presentation timestamps PTS, which may be encoded as a monotonically increasing frame sequence number. The presentation timestamps may be embedded into an IP packet, for example some proprietary UDP/IP signaling packets. The UDP frames may be multicast addressed in networks supporting multicast, while they may be simulcast via IP unicast otherwise.
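How such signaling information could be serialized into a UDP payload is sketched below. The field layout is purely hypothetical (a 32-bit frame sequence number as PTS, a 64-bit PCR value and a 32-bit cumulative RTT in microseconds) and does not reproduce the packet structure of FIG. 7.

```python
import struct
from typing import NamedTuple

# Hypothetical layout, network byte order: uint32 PTS, uint64 PCR, uint32 RTT in us.
DCR_FORMAT = "!IQI"


class DcrPacket(NamedTuple):
    pts: int                 # monotonically increasing frame sequence number
    pcr: int                 # program clock reference value of the clock master
    cumulative_rtt_us: int   # cumulative RTT over forwarding hops, in microseconds


def pack_dcr(packet: DcrPacket) -> bytes:
    """Serialize a DCR packet into a fixed-size UDP payload."""
    return struct.pack(DCR_FORMAT, *packet)


def unpack_dcr(payload: bytes) -> DcrPacket:
    """Parse a UDP payload back into a DCR packet."""
    return DcrPacket(*struct.unpack(DCR_FORMAT, payload))


payload = pack_dcr(DcrPacket(pts=4711, pcr=123456789, cumulative_rtt_us=360))
print(len(payload), unpack_dcr(payload))
```

A small fixed-size payload of this kind fits into a single short Ethernet frame and can be sent either to a multicast group or to each receiver via unicast.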

In an MPEG transport stream, for example, the program clock reference PCR value provides the time elapsed at the video source and is used to compute the exact time of frame display, e.g. with a constant positive phase offset according to the constant term e in ISO/IEC 13818-1: 2000 (E) (ITU-T Rec. H.222.0) on p. 118, while in the embodiment the reception of the display clock reference denotes the exact time of frame display minus the forwarding delay.

As the PTS is defined by the pixel sink, i.e. the display devices 2, and is propagated via the display clock reference signal, the video source 5 may ensure that each frame containing visual content from the exact same point in time is tagged in the MPEG-TS with the same PTS value. The video or pixel sink 2 may display those frames at approximately the same (but later) point in time and may expect an H.264 MPEG-TS for each view.

From the DCR arrival timestamps, each non-clock-reference display device 2 determines the frequency of display refresh at the clock or sync reference display device 2. At said display device, this frequency is compared to the local display refresh frequency. As both are measured using the individual local clock of the display device 2 and are thus relative to this clock, the measurement inaccuracy and drift of the local clock are compensated. Thus, based on the comparison of display clock reference DCR and local refresh, the local refresh rate can be synchronized to the refresh rate estimated from the display clock reference DCR.
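
The comparison described above can be illustrated by a minimal sketch. It assumes that each DCR packet carries a frame sequence number and that arrival times are recorded in seconds with the local clock; the function names are hypothetical. The naive endpoint estimate shown here ignores network jitter, which a real implementation would have to filter.

```c
#include <stddef.h>

/* Estimate a tick rate in Hz from n (timestamp, sequence number) pairs.
 * All timestamps are taken with the same local clock. Returns 0.0 if the
 * rate cannot be determined from the given samples. */
double estimate_rate_hz(const double *t_s, const unsigned *seq, size_t n)
{
    if (n < 2 || t_s[n - 1] <= t_s[0])
        return 0.0;
    /* Unsigned subtraction also covers a wrapped sequence counter. */
    return (double)(seq[n - 1] - seq[0]) / (t_s[n - 1] - t_s[0]);
}

/* Ratio of the rate seen in the DCR to the local refresh rate. A value
 * above 1 means the reference display refreshes faster than the local one.
 * Because both rates use the same local clock, its error cancels out. */
double rate_ratio(double dcr_rate_hz, double local_rate_hz)
{
    return local_rate_hz > 0.0 ? dcr_rate_hz / local_rate_hz : 0.0;
}
```

A display device could feed the DCR arrival times and its own vsync times through `estimate_rate_hz` and speed up or slow down its refresh depending on whether `rate_ratio` is above or below 1.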

In order to determine the exact phase of the clock reference display, bi-directional communication measuring the round-trip-time (RTT) is employed. In a first approximation, the forwarding delay is assumed to be the propagation delay of the display clock reference, which is a time-dependent variable, assumed to vary on large time scales and may thus be updated at a low, yet to be determined frequency.

With both phase and frequency estimates, each non-clock reference display device 2 may achieve frame synchronous display with the clock reference display device.

Content synchronization is therefore achieved by comparison of presentation timestamps PTS as contained in the DCR with PTS's as obtained from the content itself, while an overall PTS offset including processing and propagation delay at the video or pixel source plus decoding at the video or pixel sink may be configured or estimated by the clock reference display device.

It may be more feasible to provide different PTS values to pixel sources and pixel sinks in order to account for the PTS offset or to provide the offset to the pixel sinks via separate signaling.

In FIG. 14 the clock master device CMD provides the presentation timestamp PTS k to a display device 2, which has previously determined the round-trip time RTT between itself and the clock master device CMD. The resulting phase error is compensated (see arrow to the left) by a time shift of the forwarding delay. This display device 2 has previously received the frame with PTS k′ and displays it at the time of the PTS k. If k′=k is assumed, then the clock master device CMD has provided the pixel source with k′ earlier. Otherwise k′≠k and the clock master device CMD has provided the display device 2 with a PTS offset, which may be negative in case of pre-rendered playout content. The PTS offset is compensated (arrow to the right) by a time shift of (PTS offset)/FPS, where FPS is the deduced master display refresh rate in frames per second and the PTS offset is the difference of two PTS values. In practice, the PTS offset must in any case be larger than the forwarding delay to maintain causality.
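
As a worked example of the time shift (PTS offset)/FPS, the following sketch converts a PTS offset counted in frames into seconds and checks the causality condition against the forwarding delay. The function names are illustrative assumptions. For instance, an offset of 3 frames at 120 frames per second corresponds to a shift of 25 ms.

```c
/* Time shift in seconds that corresponds to a PTS offset given in frames. */
double pts_offset_shift_s(int pts_offset, double fps)
{
    return (double)pts_offset / fps;
}

/* Causality check: the shift caused by the PTS offset has to exceed the
 * forwarding delay, otherwise a frame would be due before it can arrive. */
int pts_offset_is_causal(int pts_offset, double fps, double forwarding_delay_s)
{
    return pts_offset_shift_s(pts_offset, fps) > forwarding_delay_s;
}
```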

In a stereoscopic case, the left and right views at any time instant are composed of those two available frames that have the same PTS value at one display.

FIG. 15 shows part of a method according to a thirteenth embodiment of the present invention.

In addition, the display clock reference signal can also be transmitted to video and pixel sources that want to synchronize with the composite display 1, e.g. by performing reverse Genlock in order to reduce playback judder, where, as already mentioned before, judder denotes sampling point inaccuracy on the time axis (jitter) for visual content. The pixel sources can then synchronize the content generation and/or its playout rate to the composite display output signal, where playout denotes playback at the playback rate, but without visual display, e.g. for streaming to a display. This may be done as follows:

From the arrival timestamps of the display clock reference DCR IP packets, each DCR receiving entity (video or pixel source or sink) may determine the exact frequency of display refresh at the master display device, relative to its own clock. Due to non-deterministic delay of DCR propagation in IP networks, varying for each DCR packet (jitter), this will take a certain amount of time depending on the “distance to the source” in terms of network hops. This time is to be determined. Furthermore, with the PTS included in the DCR, each frame generated at one of the source devices can be tagged with the PTS by setting the MPEG encapsulation PTS for this frame to the DCR PTS. In the MPEG-standard, a PTS is defined relative to a program clock reference (PCR). The PCR in the transport stream TS shall be computed from the DCR PTS. Thus, it is possible to playback pre-generated video streams with identically computed PCR/PTS as well as real-time generated video streams with identically computed PCR from the DCR PTS, since the PTS is in both cases relative to the PCR and both are identical across streams for each frame.
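
The relation between a frame-based DCR PTS and the MPEG clock units can be sketched as follows. MPEG-2 Systems expresses the PTS in units of a 90 kHz clock and the PCR as a 27 MHz count that is split into a 90 kHz base and an extension in the range 0 to 299. The mapping from a frame sequence number to ticks (frame 0 at tick 0, known refresh rate) and the function names are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: map a DCR frame sequence number to an MPEG PTS in
 * 90 kHz ticks, assuming frame 0 corresponds to tick 0 and a refresh rate
 * of fps frames per second. MPEG PTS values are 33 bits wide and wrap. */
uint64_t frame_to_pts90k(uint64_t frame, double fps)
{
    uint64_t ticks = (uint64_t)((double)frame * 90000.0 / fps + 0.5);
    return ticks & ((UINT64_C(1) << 33) - 1);
}

/* Split a 27 MHz PCR count into the 33-bit 90 kHz base and the 0..299
 * extension carried in an MPEG-2 transport stream. */
void pcr_split(uint64_t pcr27, uint64_t *base, unsigned *ext)
{
    *base = (pcr27 / 300) & ((UINT64_C(1) << 33) - 1);
    *ext = (unsigned)(pcr27 % 300);
}
```

At 120 frames per second, consecutive frames are 750 ticks apart on the 90 kHz PTS scale, so every source that derives its PTS from the same DCR sequence number arrives at the same value for the same frame.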

A master video or pixel source managing slave video source or pixel source devices to provide their respective frame tiles with a common PTS=x at some identical time instant t, which may be supported by clock synchronization protocols such as NTP or PTP, may be used to synchronize content generation. Alternatively, round-trip-time (RTT) estimation from-and-to the clock master device CMD via bi-directional communication, whereas the forwarding delay is the DCR propagation delay from the clock master device, together with a fixed and pre-configured processing delay may be used to achieve synchronous content generation. In both of the above cases, frequency as well as phase are uniquely determined at each source device.

Based on the setup of FIG. 15, FIG. 16 shows a vertical sync frequency estimation, comparison and compensation according to a master clock reference.

The vsync frequency of a display device 2 may be estimated by a Phase-Locked-Loop (PLL) procedure, which may be implemented in software. It is specifically parameterized for the approximate refresh rate of the composite display (e.g. 100 Hz or 120/60 Hz). After comparing the CMD's ticks to the local vsync ticks, the PLL procedure constantly estimates the phase error and hence determines the frequency deviation of both clocks. The output of the PLL procedure is a frequency that differs from both when compared using an independent clock. Specifically, it is greater than the local vsync frequency if the clock master display CMD frequency is greater than the local vsync frequency, and it is smaller otherwise. The difference between the PLL output frequency and the local vsync frequency can hence be used to control, i.e. to increase or decrease, the local vsync frequency. The PLL is locked once the difference between the PLL output and the local vsync frequency is zero, i.e. in practice, as close to zero as possible.
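
One possible shape of such a software PLL update step is sketched below. The description does not specify the loop filter structure; the sketch assumes a first-order IIR filter y[n] = a0·x[n] + a1·x[n-1] + b1·y[n-1] on the phase error x, followed by a numerically controlled oscillator (NCO) whose output is the nominal frequency plus gain times y. The struct and function names are hypothetical.

```c
/* Hypothetical software PLL state. */
struct soft_pll {
    double nominal_hz; /* approximate refresh rate of the composite display */
    double gain;       /* NCO gain */
    double a0, a1, b1; /* loop filter coefficients (assumed filter form) */
    double prev_err;   /* previous phase error x[n-1] */
    double prev_out;   /* previous filter output y[n-1] */
};

void soft_pll_init(struct soft_pll *p, double nominal_hz, double gain,
                   double a0, double a1, double b1)
{
    p->nominal_hz = nominal_hz;
    p->gain = gain;
    p->a0 = a0;
    p->a1 = a1;
    p->b1 = b1;
    p->prev_err = 0.0;
    p->prev_out = 0.0;
}

/* Feed one phase error sample (master tick phase minus local tick phase,
 * in cycles) and return the updated output frequency in Hz. */
double soft_pll_update(struct soft_pll *p, double phase_err)
{
    double y = p->a0 * phase_err + p->a1 * p->prev_err + p->b1 * p->prev_out;

    p->prev_err = phase_err;
    p->prev_out = y;
    return p->nominal_hz + p->gain * y;
}
```

With this sign convention, a master that runs ahead of the local display yields a positive phase error and an output frequency above the nominal value, so the difference to the local vsync frequency indicates in which direction the local frequency is to be corrected.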

What is unknown in the phase locked frequency compensation loop is the propagation time of the display clock reference DCR messages from the clock master device CMD over the network 7. This time needs to be estimated and, once reliably determined, can be incorporated into the phase locked loop. For this purpose, the round-trip time RTT estimation mechanism ICMP echo (commonly known as “ping”) may be used. This makes it possible to estimate the RTT and hence, via the forwarding delay, the one-way propagation time of small IP packets using an existing and widely supported mechanism that is assumed to work anywhere and requires no additional implementation effort at the clock master device CMD. For instance, the Linux ICMP implementation makes use of Ethernet hardware timestamps, if provided, but at least uses timing information from the kernel's network stack.
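
A minimal sketch of smoothing successive RTT samples with an exponentially weighted moving average (EWMA) is given below. Two assumptions are made that the description does not state: the coefficient alpha weights the previous estimate (as opposed to the new sample), and the one-way propagation time is taken as half the RTT, which presumes a symmetric network path. The function names are hypothetical.

```c
/* Update a smoothed RTT estimate with a new sample, both in milliseconds.
 * Assumption: alpha is the weight of the previous estimate. */
double rtt_ewma_update(double est_ms, double sample_ms, double alpha)
{
    return alpha * est_ms + (1.0 - alpha) * sample_ms;
}

/* One-way delay under the assumption of a symmetric path. */
double one_way_delay_ms(double rtt_ms)
{
    return rtt_ms / 2.0;
}
```

With alpha = 0.8, a previous estimate of 0.4 ms and a new sample of 0.2 ms give a smoothed RTT of 0.36 ms and thus an assumed one-way delay of 0.18 ms.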

Hardware support is required to compensate for the estimated frequency deviation at the non-CMDs, and may also be required at the CMD, depending on the frequency compensation range at the non-CMDs.

Three mechanisms for controlling hardware parameters that may determine the vsync frequency of a GPU display pipeline may be used:

- 1. Horizontal and vertical blanking interval modification
- 2. Parameterization of the frequency synthesizer used to generate the DVI/HDMI clock signal
- 3. Control of a Voltage Controlled Crystal Oscillator (VCXO), if available

With regard to the first option, the granularity of frequency compensation is determined by the pixel dimensions of the image output, where reducing or increasing the blanking by one vertical line is the minimum granularity for landscape displays. However, visible artifacts may appear when adapting to a frequency deviation by blanking interval modification.
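
The granularity of the first option follows from the relation refresh rate = pixel clock / (total pixels per line × total lines per frame), where the totals include the blanking intervals. The sketch below evaluates this relation; the function names are illustrative. As an example not taken from the embodiment, the common 1080p60 timing with a 148.5 MHz pixel clock, 2200 total pixels per line and 1125 total lines yields 60 Hz, and one additional blanking line lowers the refresh rate by roughly 0.053 Hz.

```c
/* Refresh rate in Hz for a given pixel clock and total frame dimensions
 * (active area plus blanking). */
double refresh_hz(double pixel_clock_hz, unsigned htotal, unsigned vtotal)
{
    return pixel_clock_hz / ((double)htotal * (double)vtotal);
}

/* Reduction of the refresh rate in Hz when one blanking line is added. */
double one_line_step_hz(double pixel_clock_hz, unsigned htotal, unsigned vtotal)
{
    return refresh_hz(pixel_clock_hz, htotal, vtotal)
         - refresh_hz(pixel_clock_hz, htotal, vtotal + 1);
}
```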

The second option was determined to work, in particular at a granularity of roughly a tenth of one hertz, when using Intel integrated graphics (IGP) of the fourth and fifth generation. Display devices are capable of adapting to the frequency deviation without clearly visible artifacts most of the time, but the display occasionally goes blank, e.g. for re-synchronization.

The third option, software parameterization of the control voltage of a voltage controlled crystal oscillator VCXO, is typically unavailable on consumer PC hardware. However, it is found on set-top boxes for television, as the channel switching time constraints impose a strong requirement on generator-synchronous playback of received content. An example of such a set-top box, or of consumer television broadcast equipment in general, is an Intel consumer electronics (CE) set-top box based on Intel PC hardware (CE4100), which features a VCXO. According to the Intel CE datasheets, the required accuracy of the 27 MHz oscillators is within ±50 ppm (parts per million), whereas the tunable frequency range of the VCXO is specified to be at least within ±125 ppm.

Due to the limited range of VCXO frequency tuning at the display devices 2, the individual frequencies of all display devices 2 in an initialization stage may be determined. The CMD should then have a vsync frequency in the middle of the set of frequencies of all display nodes (center frequency), in order to minimize the individual tuning requirements. Then, two mechanisms may be used.

- 1. The clock master is a dedicated display node that can be parameterized to the exact center frequency.
- 2. The CMD is selected by distributed coordination among the set of display nodes as the one that is eventually closest to the center frequency.

In an initialization phase, each display is a DCR transmitter at some point in time, so that all the others can determine this display's vsync frequency relative to their own vsync frequency. Hence, each display is capable of deciding whether it is close to the center frequency or not. For mechanism two, by distributed coordination (cf. CSMA/CA in wireless LANs), the first to notify the others of being closest to the center frequency becomes the CMD.
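
The selection of a display close to the center frequency can be sketched as follows. The sketch reads "the middle of the set of frequencies" as the midpoint between the lowest and the highest measured vsync frequency, which is an assumption (a median would be another reading), and it performs the selection centrally rather than by distributed coordination. The function names are hypothetical.

```c
#include <stddef.h>

/* Midpoint between the lowest and highest measured vsync frequency. */
double center_frequency(const double *f_hz, size_t n)
{
    double lo, hi;
    size_t i;

    if (n == 0)
        return 0.0;
    lo = hi = f_hz[0];
    for (i = 1; i < n; i++) {
        if (f_hz[i] < lo) lo = f_hz[i];
        if (f_hz[i] > hi) hi = f_hz[i];
    }
    return (lo + hi) / 2.0;
}

/* Index of the display whose frequency is closest to the center frequency. */
size_t select_cmd(const double *f_hz, size_t n)
{
    double c = center_frequency(f_hz, n);
    double best = 0.0;
    size_t i, idx = 0;

    for (i = 0; i < n; i++) {
        double d = f_hz[i] - c;
        if (d < 0.0) d = -d;
        if (i == 0 || d < best) {
            best = d;
            idx = i;
        }
    }
    return idx;
}

/* Deviation of f_hz from ref_hz in parts per million, for comparison
 * with the tuning range of the oscillator. */
double deviation_ppm(double f_hz, double ref_hz)
{
    return (f_hz - ref_hz) / ref_hz * 1e6;
}
```

For measured frequencies of 59.998 Hz, 60.000 Hz and 60.004 Hz, the center is 60.001 Hz and the second display is selected. The display at 60.004 Hz then has to be tuned by about 67 ppm, which lies within a ±125 ppm range.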

In order to do reverse Genlock, a form of generator clock recovery and synchronization in which the roles of generator and consumer are switched (with respect to clock provision), the DCR signal of the CMD is transmitted to the content generating nodes. The frame display frequency of the CMD and hence that of all display devices, once they are synchronized is estimated independently by each content generating device, relative to the system clock at each generating node as depicted in the following figure.

FIG. 17 shows part of a method according to a fifteenth embodiment of the present invention.

In FIG. 17 a vertical sync frequency estimation for reverse Genlock is shown in principle. As mentioned before, to do reverse Genlock in the form of generator clock recovery and synchronization, the roles of generator and consumer are switched with respect to clock provision: the display clock reference signal of the clock master display device is transmitted to content generating sources for synchronization. The frame display frequency of the clock master device, and hence that of all display devices once they are synchronized, is estimated independently by each content generating device, i.e. a pixel or video source, relative to the system clock at each content generating device, as depicted in FIG. 17.

In more detail this is shown in FIG. 18 with a video frame deadline predictor flow diagram.

The content generating nodes, i.e. pixel or video sources, need not only determine the exact frequency of the display devices; they also need to predict the point in time at which a completely processed video frame is scheduled to be put on the network and transmitted to one or more of the display devices. For this purpose, generating nodes may utilize a so-called frame deadline predictor, running on a network entity, which returns the time remaining until the deadline of a specific frame number that has not yet been displayed synchronously at the display devices. The deadline is compensated for propagation time by RTT estimation to the clock master device as well as for inter-process communication (IPC) latency, and thus determines the point in time at which a video frame should be forwarded to the network stack of the local host. This process of deadline prediction is shown in FIG. 18.
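
The arithmetic behind such a deadline prediction can be sketched as follows. The sketch assumes that the predictor knows the last frame number displayed by the clock master device, the time elapsed since that frame, and the estimated master frequency, and that the one-way propagation time is half the RTT. These inputs and the function name are assumptions for illustration; FIG. 18 may structure the computation differently.

```c
/* Time in seconds until a frame has to be handed to the local network
 * stack. A negative result means the deadline has already passed. */
double time_until_deadline_s(unsigned target_frame, unsigned last_frame,
                             double since_last_s, double freq_hz,
                             double rtt_s, double ipc_latency_s)
{
    /* Frames still to go until the target frame is displayed. */
    double frames_ahead = (double)target_frame - (double)last_frame;
    /* Time until the master displays the target frame. */
    double display_in_s = frames_ahead / freq_hz - since_last_s;

    /* Compensate for one-way propagation (assumed RTT / 2) and IPC latency. */
    return display_in_s - rtt_s / 2.0 - ipc_latency_s;
}
```

For example, a frame that lies 120 frames ahead at 120 Hz, with an RTT of 2 ms and an IPC latency of 0.5 ms, is due in 1 s - 1 ms - 0.5 ms = 0.9985 s.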

The frame deadline predictor is additionally capable of returning a program clock reference (PCR) value, according to MPEG-2 system Transport Stream (TS) as an example for a multiplex operation on audio-visual content with synchronization information, at any arbitrary point in time. The PCR value is compensated for twice the local host IPC latency, and thus is valid at the point in time of stamping a transport stream with a PCR value.

Due to the limited support of multicast across IP networks, a frame deadline predictor may also serve as a unicast-to-multicast/broadcast proxy. This way, only one generator node, i.e. one pixel or video source, receives the original DCR messages from the CMD via unicast, while the rest of the subnet is fed with a relayed version of the DCR signal via broadcast. The delay due to relaying is compensated, as it is indicated in the relayed DCR messages by an additional RTT value.

A local instance of the PLL procedure at the to be reverse Genlock'd pixel or video source needs to be properly parameterized. Parameters may be pre-computed and the PLL is configured via a text file.

An example configuration text file is given below, parameterized for ~120 Hz and a 5 s RTT estimation update interval. The gain parameter controls the duration after which the PLL becomes stable.

# DPLL NCO gain
gain=20

# approximate CMD refresh rate
frequency=120.0

# DPLL loop filter coefficients a0, a1 and b1
a0=0.00496
a1=-0.0012
b1=0.99

# port for receiving the DCR sync packets
sync port=14401

# interval between sending ICMP echo request
# for RTT estimation in seconds
rtt interval=5

# coefficient alpha for ewma filter for RTT estimation
ewma coefficient=0.8

# enable relaying
relay=0
relay destination=134.96.86.21
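
A configuration file of this form could be read line by line with a small parser such as the following sketch. The function name and the handling of malformed lines are assumptions; the sketch only splits a line at the first "=" and trims blanks, so that keys containing a space, such as "rtt interval", are preserved.

```c
#include <string.h>

/* Parse one "key=value" line. Returns 1 and fills key and value (trimmed
 * of surrounding blanks) on success, 0 for comments, blank lines,
 * malformed lines or values that do not fit into the buffers. */
int parse_config_line(const char *line, char *key, size_t key_sz,
                      char *value, size_t value_sz)
{
    const char *eq;
    size_t klen, vlen;

    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\0' || *line == '\r' || *line == '\n')
        return 0;
    eq = strchr(line, '=');
    if (eq == NULL || eq == line)
        return 0;

    /* Key: everything before '=', without trailing blanks. */
    klen = (size_t)(eq - line);
    while (klen > 0 && (line[klen - 1] == ' ' || line[klen - 1] == '\t'))
        klen--;

    /* Value: everything after '=', without surrounding blanks or newline. */
    eq++;
    while (*eq == ' ' || *eq == '\t')
        eq++;
    vlen = strcspn(eq, "\r\n");
    while (vlen > 0 && (eq[vlen - 1] == ' ' || eq[vlen - 1] == '\t'))
        vlen--;

    if (klen == 0 || klen >= key_sz || vlen >= value_sz)
        return 0;
    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(value, eq, vlen);
    value[vlen] = '\0';
    return 1;
}
```

The caller would convert the returned value string according to the key, e.g. with `strtod` for the filter coefficients and `strtol` for the port.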

The deadline predictor may be provided as a statically linked library. Its functions may be defined in the following C program header file.

#ifndef NET_PLL_LIB_H_
#define NET_PLL_LIB_H_

#include <sys/time.h> /* struct timeval */

struct net_pll_status {
    double est_freq;          /**< estimated master frequency */
    double est_rtt;           /**< estimated rtt to master node in ms */
    unsigned int framenumber; /**< last frame displayed by master */
    unsigned int pcr;         /**< 27MHz PCR value from master */
};

/**
 * @brief creates configuration from configuration file
 * @param[in] configuration file
 */
extern int init(char* configfilename);

/**
 * @brief set configuration parameter
 * @param[in] parameter name, possible parameters are:
 *            "a0", "a1", "b1", "gain", "vrate", "RTT interval", "port",
 *            "ewma" and "relay"
 * @param[in] parameter value
 */
extern int set_param(char* paramname, void* value);

/**
 * @brief read configuration parameter
 * @param[out] parameter value
 *             should be '(void*) double*' for "a0", "a1", "b1", "gain",
 *             "vrate" and "ewma"
 *             should be '(void*) ushort*' for "RTT interval", "port", "relay"
 */
extern int get_param(char* paramname, void* value);

/**
 * @brief start frequency synchronization
 */
extern int start(void);

/**
 * @brief retrieve internal status
 * @param[out] internal values
 */
extern int status(struct net_pll_status* stat);

/**
 * @brief calculate timestamp for framenumber
 * @param[in] framenumber
 * @param[out] timestamp
 */
extern int frame_timestamp(int framenumber, struct timeval* tv);

#endif /* NET_PLL_LIB_H_ */

A daemon running an instance of and interfacing with a software-based PLL procedure may also be provided. It provides inter-process communication IPC via localhost UDP communication, specifically timing information relative to a local system clock. This information is useless on another system, hence the daemon does not reply to queries other than from the IP address 127.0.0.1, i.e. from the same host.

The daemon accepts queries if received in the following format:

| Query type | UDP payload (header part) | UDP payload (message part) |
|---|---|---|
| Deadline | uint8_t MAGIC=0x47, uint8_t TYPE=0 | uint32_t frame |
| PCR | uint8_t MAGIC=0x47, uint8_t TYPE=1 | (none) |
| IPC Latency | uint8_t MAGIC=0x47, uint8_t TYPE=4 | (none) |

The daemon responds to the above queries as follows:

| Query type | Response | UDP payload (header part) | UDP payload (message part) |
|---|---|---|---|
| Deadline | Time until deadline | uint8_t MAGIC=0x47, uint8_t TYPE=2 | int32_t seconds, uint32_t usecs |
| PCR | PCR (now) | uint8_t MAGIC=0x47, uint8_t TYPE=3 | uint64_t pcr |
| IPC Latency | Time (now) | uint8_t MAGIC=0x47, uint8_t TYPE=2 | int32_t seconds, uint32_t usecs |

Note that all UDP packets according to this protocol have a 12-byte payload and the default port used is 12345.
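
Building a query payload for this protocol can be sketched as follows. The description fixes the MAGIC and TYPE values and the 12-byte payload size, but not the byte offsets, padding or byte order. The layout used here (MAGIC at offset 0, TYPE at offset 1, two zero padding bytes, the frame number at offsets 4 to 7 in big-endian order, remaining bytes zero) is therefore purely an assumption, as are all identifiers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDP_MAGIC       0x47
#define FDP_PAYLOAD_LEN 12

/* Query types as listed for the daemon. */
enum fdp_query_type {
    FDP_QUERY_DEADLINE = 0,
    FDP_QUERY_PCR = 1,
    FDP_QUERY_IPC_LATENCY = 4
};

/* Fill a 12-byte query payload. Assumed layout: [0] MAGIC, [1] TYPE,
 * [2..3] zero padding, [4..7] frame number (big-endian, deadline queries
 * only), remaining bytes zero. */
void fdp_build_query(uint8_t buf[FDP_PAYLOAD_LEN], enum fdp_query_type type,
                     uint32_t frame)
{
    memset(buf, 0, FDP_PAYLOAD_LEN);
    buf[0] = FDP_MAGIC;
    buf[1] = (uint8_t)type;
    if (type == FDP_QUERY_DEADLINE) {
        buf[4] = (uint8_t)(frame >> 24);
        buf[5] = (uint8_t)(frame >> 16);
        buf[6] = (uint8_t)(frame >> 8);
        buf[7] = (uint8_t)frame;
    }
}

/* Basic sanity check for a received payload: length and magic byte. */
int fdp_is_valid(const uint8_t *buf, size_t len)
{
    return len == FDP_PAYLOAD_LEN && buf[0] == FDP_MAGIC;
}
```

Since the daemon only answers queries from the same host, an implementation might equally exchange the fields in host byte order; the explicit byte order above merely keeps the sketch independent of the platform.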

A simple application making use of the local frame deadline predictor daemon may be provided, and the main part of its source code is shown below.

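/* open a UDP socket to the local frame deadline predictor daemon */
sockfd = localhost_socket(port);

/* estimate the local one-way IPC latency over lat_trials trials */
localhost_estimate_latency(sockfd, lat_trials, &tlat);
fprintf(stderr, "Local one-way system latency (%d trials) is %f ms\n",
        lat_trials, tlat.tv_usec / 1000.0);

/* query the time remaining until the deadline of the requested frame */
localhost_query_deadline(sockfd, frame_requested, &sec, &usec);
fprintf(stderr, "Frame #%d is due in %lf s \n", frame_requested,
        sec + (usec) / 1000000.0);

/* query the current PCR value (PRIu64 requires <inttypes.h>) */
localhost_query_pcr(sockfd, &pcr);
fprintf(stderr, "Returned PCR at this time instant is %" PRIu64 "\n", pcr);

close(sockfd);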

The following three functions are defined:

- 1. Estimation of the local host IPC latency. This function estimates the latency on average over a number of trials. It may be called at any time.
- 2. A deadline may be queried for a certain frame number. The result is compensated for one-way IPC latency (if function 1 has been called beforehand) and network round-trip time RTT.
- 3. A PCR value may be queried at any time. It is compensated for two-way IPC latency and network RTT.

An example output of the sample application is given below.

Local one-way system latency (12 trials) is 0.022000 ms

Frame #9512112 is due in 417.696020 s

Returned PCR at this time instant is 9999999

Accordingly, the daemon output was as follows, the IPC latency having been estimated in 12 trials, one every 10 ms. The currently estimated frequency, the CMD network RTT and the last seen frame number are also displayed.

est_freq: 120.058479, est_rtt: 0.324633, frame: 9461539

est_freq: 120.058479, est_rtt: 0.324633, frame: 9461539

est_freq: 120.058509, est_rtt: 0.316927, frame: 9461952

est_freq: 120.058508, est_rtt: 0.316927, frame: 9461953

est_freq: 120.058510, est_rtt: 0.316927, frame: 9461954

est_freq: 120.058510, est_rtt: 0.316927, frame: 9461956

est_freq: 120.058512, est_rtt: 0.316927, frame: 9461957

est_freq: 120.058512, est_rtt: 0.316927, frame: 9461958

est_freq: 120.058512, est_rtt: 0.316927, frame: 9461959

est_freq: 120.058513, est_rtt: 0.316927, frame: 9461961

est_freq: 120.058513, est_rtt: 0.316927, frame: 9461962

est_freq: 120.058517, est_rtt: 0.316927, frame: 9461963

Req. for deadline of frame #9512112, sending time 417.695998 s

In summary of FIGS. 16 and 17 a synchronization architecture or system has been provided and parameterized. Generating devices like pixel sources can now both generate and stream with knowledge of individual frame deadlines as well as apply timestamps to a PCR stream for compatibility with MPEG2-Systems.

In summary, the present invention enables a flexible, easy-to-implement intra- and inter-display synchronization as well as pixel source or generator synchronization, so that a spectator, in particular of content displayed on a display wall, receives a very high quality of experience in terms of the presented content due to a very tight synchronization between the pixel sources, between the display devices as pixel sinks, as well as between the pixel sources and the pixel sinks.

Many modifications and other embodiments of the invention set forth herein will come to mind to one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
