The present invention relates to a method and devices for transmitting video data, the video data comprising picture elements being positionable in a picture and having picture element values, from a central unit to a mobile terminal via a mobile radio network. Particularly, the present invention relates to a method for transmitting video data, the gaze direction of a user being determined by means of a gaze direction determination module of a display unit of the terminal, and the gaze direction being transmitted by the terminal to the central unit via the mobile radio network. Specifically, the present invention also relates to a computer-based central unit, a mobile terminal, and a computer program product suited for executing the method.
Patent document EP 1 186 148 describes a system for transmitting video data from a central unit to a terminal via a telecommunications network. According to EP 1 186 148, the terminal comprises a virtual retinal display device projecting picture signals corresponding to the video data directly onto the user's retina. Moreover, the display device comprises a gaze direction determination module determining the current eye position (position of the pupil) by means of a so-called eye tracker as an indicator for the user's current gaze direction. Such a virtual retinal display device is described, for example, in patent application WO 94/09472. The central unit according to EP 1 186 148 comprises a filter module which filters the video data based on the current gaze direction, prior to their transmission, such that outer picture areas, corresponding to video data projected by the virtual retinal display outside of the fovea, have a lower resolution than inner picture areas, corresponding to video data projected onto the fovea. The system according to EP 1 186 148 exploits the property of the human eye that a small area of the retina, denoted as the fovea and having an angle of vision of approximately 2°, provides the most exact vision, so that the data volume to be transmitted can be reduced by reducing the resolution in outer areas of the picture. Particularly for transmitting video data via mobile radio networks for mobile telephony, which have a significantly lower bandwidth than fixed broadband networks, a further reduction of the data volume to be transmitted is necessary.
It is an object of this invention to provide a method and devices for transmitting video data from a central unit via a mobile radio network to a mobile terminal, which make possible a reduction of the data volume to be transmitted.
According to the present invention, these objects are achieved particularly through the features of the independent claims. In addition, further advantageous embodiments follow from the dependent claims and the description.
According to the present invention, the above-mentioned objects are particularly achieved in that, for transmitting video data from a central unit via a mobile radio network to a mobile terminal having a display unit, the video data comprising picture elements that are positionable in a picture and have picture element values, a gaze direction of a user of the display unit is determined by means of a gaze direction determination module of the display unit. The gaze direction is transmitted by the terminal to the central unit via the mobile radio network. Correlation threshold values are determined in the central unit based on the gaze direction, the correlation threshold values being position-dependent with respect to the picture. Bit matrices are generated in the central unit that identify correlating picture elements having correlating picture element values, the correlating picture elements being determined dependent on the correlation threshold values. The bit matrices are transmitted together with the video data, one respective common data element, having a common picture element value, being transmitted for correlating picture elements. The picture signals are rendered by the display unit based on the video data and the bit matrices. Transmission of the video data occurs as a continuous flow, particularly as so-called video streaming. Particularly, for positions in the picture, the central unit determines the correlation threshold values depending on a distance of a respective position in the picture to a viewing position corresponding to the gaze direction. For example, the display unit projects the picture signals directly onto at least one retina of the user. The picture element values comprise gray values and/or color values.
The advantage of determining the correlation threshold values depending on the user's gaze direction is that stricter conditions on the correlation of the picture element values can be applied to picture elements located in the gaze direction of the user than to picture elements located outside the gaze direction. Thereby, it is possible to combine in a common data element the picture element values of picture elements located outside the user's gaze direction, even for large differences of the picture element values, and thus to compress the data volume of the video data to be transmitted without significantly impairing the user's subjective perception of the rendered video data. Particularly for virtual retinal display devices, which project picture signals directly onto the retina, the data volume can be reduced significantly, because picture elements located outside the gaze direction are projected onto retinal areas that are located outside the fovea and have a lower sensitivity than the fovea.
Preferably, generating the bit matrices in the central unit includes identification of picture elements adjoining in the picture and having correlating picture element values. As shown in patent application WO 03/084205, the data volume necessary for coding picture elements can be reduced, when picture elements, adjoining in the picture and having correlating picture element values, are indicated in a bit matrix and, for the correlating picture elements, the picture element value is coded only once in a common data element. If the correlating picture elements have different values, the common picture element value is calculated as an average value of the correlating picture element values, for example.
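By way of non-limiting illustration, the grouping of adjoining correlating picture elements within a picture row into a bit-matrix row and one common data element per group might be sketched as follows (a minimal Python sketch; the function name, the left-to-right scan, the relative-tolerance convention with an absolute fallback for zero values, and averaging as the common value are illustrative assumptions, not part of the specification):

```python
def correlate_row(values, threshold):
    """Scan a row of picture element values left to right; emit a bit-matrix
    row (0 = element starts a new group, 1 = element correlates with the
    current group) and one common data element (the average value) per
    group of adjoining correlating picture elements."""
    bits = [0]                   # the first element always starts a group
    common_values = []
    group = [values[0]]
    for v in values[1:]:
        ref = group[0]
        tol = threshold * ref if ref else threshold
        if abs(v - ref) <= tol:
            bits.append(1)       # correlates: no separate data element
            group.append(v)
        else:
            bits.append(0)       # new group begins
            common_values.append(sum(group) / len(group))
            group = [v]
    common_values.append(sum(group) / len(group))
    return bits, common_values
```

For instance, with a 10% threshold, the row `[100, 105, 200]` yields the bit-matrix row `[0, 1, 0]` and the two common data elements `102.5` and `200.0`, so only two values need to be transmitted for three picture elements.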
Preferably, generating the bit matrices in the central unit includes identification of picture elements being positioned equally in (temporally) successive pictures and having correlating picture element values. As the rendering of moving pictures corresponds essentially to rendering a sequence of pictures (so-called full pictures or frames, herein referred to as pictures), the data volume needed for transmitting video data can be reduced, when picture elements, being positioned equally in successive pictures and having correlating picture element values, are indicated in a bit matrix and their picture element value is transmitted only once. The bit matrices indicate correlation of picture elements of two or more successive pictures.
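The temporal case may likewise be illustrated by a minimal sketch (function name and tolerance convention are assumptions): for equally positioned picture elements of two successive pictures, a set bit marks an element whose value correlates with the previous picture and therefore need not be retransmitted.

```python
def temporal_bits(prev_frame, cur_frame, threshold):
    """Compare equally positioned picture elements of two successive
    pictures; bit 1 means the value correlates with the previous picture
    (not retransmitted), bit 0 means a new value must be transmitted."""
    bits, new_values = [], []
    for p, c in zip(prev_frame, cur_frame):
        tol = threshold * p if p else threshold
        if abs(c - p) <= tol:
            bits.append(1)
        else:
            bits.append(0)
            new_values.append(c)
    return bits, new_values
```

With a 10% threshold, `temporal_bits([100, 50], [101, 80], 0.10)` returns the bits `[1, 0]` and the single new value `[80]`: only the picture element that changed beyond its tolerance is retransmitted.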
In an embodiment, picture element values of picture elements, having in the picture a defined distance to a viewing position corresponding to the gaze direction, are represented by the central unit with a lower number of bits than picture element values of picture elements at the viewing position. By reducing the number of bits used for coding the picture element values of picture elements located outside the user's gaze direction, the data volume of the video data to be transmitted can be compressed without significantly impairing the user's subjective perception of the rendered video data.
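One simple way to represent a value with fewer bits is to drop least significant bits; the specification does not prescribe a particular quantization scheme, so the following is only an illustrative sketch for 8-bit picture element values:

```python
def quantize(value, bits):
    """Requantize an 8-bit picture element value (0..255) to the given
    number of bits by discarding the least significant bits, so fewer
    bits suffice to encode picture elements in peripheral picture areas."""
    shift = 8 - bits
    return (value >> shift) << shift
```

For example, `quantize(255, 4)` yields `240`: the value is representable with 4 bits of precision, at the cost of coarser gray or color steps that are barely perceptible outside the fovea.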
In an embodiment, multiple adjoining picture elements, having in the picture a defined distance to a viewing position corresponding to the gaze direction, are represented by the central unit as a common picture element in a common data element. By merging adjoining picture elements located outside the user's gaze direction, the geometric extension (size) of the picture elements is increased; that is, the local resolution of picture areas outside the user's gaze direction is reduced, such that the data volume of the video data to be transmitted is compressed without significantly impairing the user's subjective perception of the rendered video data.
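Merging adjoining picture elements amounts to block downsampling; a minimal sketch (function name, row-major pixel layout, square blocks, and averaging are illustrative assumptions) could look as follows:

```python
def merge_blocks(pixels, width, block):
    """Represent each block x block group of adjoining picture elements by
    one common picture element (the integer average), reducing the local
    resolution of the picture area. Assumes a row-major pixel list whose
    width and height are multiples of `block`."""
    height = len(pixels) // width
    merged = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            vals = [pixels[(by + y) * width + bx + x]
                    for y in range(block) for x in range(block)]
            merged.append(sum(vals) // len(vals))
    return merged
```

A 4x2 picture merged with 2x2 blocks is thus reduced from eight data elements to two, i.e. by a factor of `block * block`.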
In an embodiment, picture elements, having in the picture a defined distance to a viewing position corresponding to the gaze direction, are transmitted by the central unit to the mobile terminal with a reduced refresh frequency. By reducing the refresh frequency of picture elements located outside the user's gaze direction, the data volume of the video data to be transmitted can be compressed without significantly impairing the user's subjective perception of the rendered video data.
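A reduced refresh frequency can be realized, for example, by retransmitting peripheral picture areas only in every n-th picture. The following sketch assumes the compression areas A1 to A4 introduced later in the description; the refresh divisors per area are purely illustrative values:

```python
def due_for_refresh(frame_index, area):
    """Decide whether the picture elements of a compression area are
    retransmitted in the picture with the given index. Areas farther
    from the viewing position are refreshed less often (assumed periods:
    every picture for A1, every 2nd for A2, every 4th for A3, every 8th
    for A4)."""
    period = {"A1": 1, "A2": 2, "A3": 4, "A4": 8}[area]
    return frame_index % period == 0
```

In this sketch, area A1 is updated in every picture, while area A4 is updated only in every eighth picture, reducing the transmitted data volume for the periphery accordingly.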
The present invention also relates to a computer program product comprising computer program code means for controlling one or more processors of a computer configured to transmit video data via a mobile radio network to a mobile terminal having a display unit, the video data comprising picture elements that are positionable in a picture and have picture element values, and to receive a gaze direction of a user of the display unit via the mobile radio network from the terminal. The computer program code means are configured to control the processors of the computer such that the computer determines correlation threshold values based on the gaze direction, the correlation threshold values being position-dependent with respect to the picture; generates bit matrices that identify correlating picture elements having correlating picture element values, the correlating picture elements being determined dependent on the correlation threshold values; and transmits the bit matrices together with the video data to the terminal for rendering on the display unit, one respective common data element, having a common picture element value, being transmitted for correlating picture elements. Particularly, the computer program product comprises a computer readable medium containing the computer program code means.
The present invention will be explained in more detail, by way of example, with reference to the drawings in which:
The mobile radio network is, for example, a GSM network (Global System for Mobile Communication), a UMTS network (Universal Mobile Telecommunications System), a WLAN network (Wireless Local Area Network), a UMA network (Unlicensed Mobile Access), or another mobile radio system, e.g. a satellite-based system. One skilled in the art will understand that the proposed method can also be used via other telecommunications networks, particularly via fixed networks.
The mobile terminal 3 comprises a display unit 32 connected to the communication module 31 and implemented, for example, in the form of a set of viewing glasses wearable on the user's head, or in another form wearable on the head. The communication module 31 and the display unit 32 are arranged, for example, in a common housing, or in separate housings connected to each other via a wireless or contact-based communication link. If the communication module 31 is implemented with its own separate housing, the communication module 31 is implemented, for example, as a mobile radio phone, as a PDA (Personal Digital Assistant), as a game console, or as a laptop computer.
The display unit 32 comprises a display device 321 as well as a gaze direction determination module 322. For example, the display device 321 is implemented as a virtual retinal display device, projecting picture signals directly onto the retina 41 of the user's eye 4. The gaze direction determination module 322 comprises a so-called eye tracker that determines the position of the pupil 42 as an indicator for the user's gaze direction. A virtual retinal display having an eye tracker is described, for example, in patent application WO 94/09472. In an embodiment, the display device 321 is implemented as an LCD (Liquid Crystal Display), the gaze direction determination module 322 determining the gaze direction on the basis of a light reference mark projected onto the cornea 43 and the respective relative position of the pupil 42.
In the central unit 1, the video data are retrieved from the database 11, compressed by the data compression module 120, and transmitted to the communication module 31 of the mobile terminal 3 via the mobile radio network 2 by means of the communication module 121 of the central unit 1. The received compressed video data are decompressed by the data decompression module 324 and rendered for the user as visible picture signals by the display device 321. As described in the following paragraphs, the data compression is performed on the basis of information about the user's gaze direction. The gaze direction is determined by the gaze direction determination module 322 and transmitted to the central unit 1 via the mobile radio network 2 by the gaze direction feedback reporting module 323 using the communication module 31.
On the basis of the received current gaze direction of the user, the current viewing position in the picture defined by the video data is determined in the data compression module 120.
Depending on the current viewing position D, the correlation value determination module 122 determines different (position-dependent) correlation threshold values for the picture elements. Essentially, small correlation threshold values (i.e. small tolerance) are provided for picture elements, located near the current viewing position D, whereas greater correlation threshold values (i.e. greater tolerance) are provided for picture elements, located further away from the current viewing position D. For example, depending on the distance to the current viewing position D, the correlation value determination module 122 determines different compression areas A1, A2, A3, A4 having a greater correlation threshold value for greater distance to the viewing position D. The correlation threshold values are given in absolute or relative numeric values. For example, picture elements in compression area A1 are assigned a correlation threshold value of zero (zero tolerance), for the compression area A2 provided is a correlation threshold value of 10%, for the compression area A3 20%, and for the compression area A4 40%. In this example, the difference of picture element values of picture elements in compression area A4 could be up to 40% and the picture elements would still be considered correlating picture elements.
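The example above, in which the compression areas A1 to A4 carry correlation threshold values of 0%, 10%, 20%, and 40%, may be sketched as follows (the function name, the Euclidean distance measure, and the area radii are illustrative assumptions; the specification leaves the shape and size of the compression areas open):

```python
import math

def correlation_threshold(x, y, dx, dy, r1=10, r2=20, r3=40):
    """Return the relative correlation threshold for the picture element
    at (x, y), given the current viewing position D at (dx, dy).
    Assumed radii r1..r3 delimit the compression areas A1..A4."""
    dist = math.hypot(x - dx, y - dy)
    if dist <= r1:
        return 0.0    # area A1: zero tolerance near the viewing position
    if dist <= r2:
        return 0.10   # area A2
    if dist <= r3:
        return 0.20   # area A3
    return 0.40       # area A4: greatest tolerance in the periphery
```

A picture element at the viewing position thus receives threshold 0 (zero tolerance), while a distant peripheral element receives 0.40, so that its value may differ by up to 40% from a reference value and still count as correlating.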
Based on the current correlation threshold values determined, the bit matrix generating module 123 generates bit matrices identifying correlating picture elements having correlating picture element values.
One skilled in the art will understand that generating bit matrices based on correlation threshold values that depend on a user's gaze direction is applicable to picture element values in the form of a gray value as well as in the form of a color value; for RGB video data (red, green, blue), each color value is treated as a separate picture element value.
For determining correlating picture elements in (temporally) successive pictures (according to FIGS. 4 or 5), other correlation threshold values can be determined and applied than the ones used for determining correlating picture elements adjoining in a picture.
The resolution reducing module 124 encodes picture elements with varying (position-dependent) resolution, depending on the viewing position D. Essentially, a high resolution (i.e. small sizes of picture elements) is provided for picture elements near the current viewing position D, whereas a small resolution (i.e. larger sizes of picture elements) is provided for picture elements located further away from the current viewing position D. In other words, from a defined distance to the viewing position D, multiple adjoining small picture elements are represented as common picture elements in a common data element.
For encoding picture elements, the picture element value reducing module 125 determines a different (position-dependent) number of bits depending on the current viewing position D. Essentially, a greater number of bits is provided for picture element values of picture elements, located near the current viewing position D, than for picture element values of picture elements, located farther away from the current viewing position D.
Depending on the current viewing position D, the refresh frequency-reducing module 126 determines a different (position-dependent) refresh frequency for transmitting picture elements. Essentially, a greater refresh frequency is provided for picture element values of picture elements, located near the current viewing position D, than for picture element values of picture elements, located farther away from the current viewing position D.
For example, the refresh frequency for transmitting picture elements, the number of bits for encoding picture element values, and/or the resolution of picture elements are selected depending on the compression areas A1, A2, A3, A4 mentioned above.
In the mobile terminal 3, the compressed video data, together with the bit matrices and the data elements containing common picture element values of correlating picture elements, are received and stored in the data buffer module 325.
Based on the associated bit matrices, the data decompression module 324 decompresses the received compressed video data into a sequence of presentable pictures, rendered for the user as picture signals by the display device 321. For example, picture elements of different sizes are mapped onto the presentable picture on the basis of size information. For assigning picture element values to picture elements positioned equally in (temporally) successive pictures, at least the video data needed for determining the current presentable picture are stored in the data buffer module 325. In successive pictures, correlating picture elements are determined based on the associated bit matrices, and the respective picture element values are retrieved from the stored video data. For bit matrices relating to multiple (temporally) successive pictures, the received video data are stored in the data buffer module 325 at least for the time interval T.
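The decompression of one row can be illustrated as the inverse of the grouping step: each 0 bit in the bit-matrix row starts the next common data element, and each 1 bit repeats the current one (a minimal sketch; the function name and row-wise processing are illustrative assumptions):

```python
def expand_row(bits, data_elements):
    """Reconstruct a row of picture element values from its bit-matrix row
    and the transmitted common data elements: a 0 bit advances to the next
    common data element, a 1 bit repeats the current group's value."""
    out, idx = [], -1
    for b in bits:
        if b == 0:
            idx += 1            # next common data element begins a group
        out.append(data_elements[idx])
    return out
```

For example, the bit-matrix row `[0, 1, 0]` with the common data elements `[102.5, 200.0]` expands to the three picture element values `[102.5, 102.5, 200.0]`.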
The foregoing disclosure of the embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many variations and modifications of the embodiments described herein will be apparent to one of ordinary skill in the art in light of the above disclosure. The scope of the invention is to be defined only by the claims appended hereto, and by their equivalents. Specifically, in the description, the computer program code has been associated with specific software modules; one skilled in the art will understand, however, that the computer program code may be structured differently, without deviating from the scope of the invention. Furthermore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. One skilled in the art will understand that different sequences of steps are possible without deviating from the scope of the invention.
| Number | Date | Country | Kind |
|---|---|---|---|
| 05 405 336.8 | May 2005 | EP | regional |