This invention relates to a display processor device and a method for processing display image data by overlaying a multitude of image layers.
U.S. Pat. No. 5,469,541 describes an example of overlaying by window specific control of overlay planes in a graphics display system. By a graphics environment window the characteristics of an overlay common to multiple windows are controlled while operating within the context of a conventional RAMDAC overlay control architecture. Window specific overlay control is accomplished by concatenating the window, masking and overlay data as an address to a mapping memory. The bit content of the mapping memory is controlled directly by the general purpose processor to selectively refine the relationship between the concatenated input as an address and the mapping memory output as the state conveyed to the overlay control of the RAMDAC. A common overlay is thus selectively modifiable by window.
Overlaying planes or windows requires respective image data to be fetched from memory. The known system requires many memory fetches to retrieve the pixel values for the layers to be overlaid.
The present invention provides a display processor device, and a method, as described in the accompanying claims.
Specific embodiments of the invention are set forth in the dependent claims. Aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Further details, aspects and embodiments of the invention will be described, by way of example only, with reference to the drawings.
Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. In the Figures, elements which correspond to elements already described may have the same reference numerals.
Examples of the present invention will now be described with reference to an example of a display processor for display image data. It is noted that such a processor may be part of a larger graphical processing unit (GPU) or dedicated display controller, or any other image processing system, such as a sprite based display processor. It will therefore be appreciated that the present invention is not limited to the specific processing architecture herein described with reference to the accompanying drawings, and may equally be applied to alternative architectures.
In computer graphics, when a given image is intended to be placed over a background, the transparent pixels can be identified by a specific pixel value, or an additional value may be stored per pixel indicative of the transparency. Also, transparent areas can be specified through a binary mask. This way, for each intended image there may actually be two bitmaps: the actual image, in which the unused areas are given a pixel value with all bits set to a specific value, e.g. 0s, and an additional mask, in which the corresponding image areas are given a pixel value indicative of the transparency or the non-transparency, e.g. transparent pixel mask bits set to 0s and the surrounding areas a value of all bits set to 1s.
At run time, to put the image on a layer over a background layer, an overlaying unit may operate as follows. First the layer's pixels are masked with the image mask at the desired coordinates, e.g. using a bitwise AND operation. This preserves the background pixels of the transparent areas while resetting to zero the bits of the pixels that will be obscured by the overlapping image.
Then, the overlaying unit renders the image pixel's bits by blending them with the background pixel's bits using the bitwise OR operation. This way, the image pixels are appropriately placed while keeping the background surrounding pixels preserved. The result is a perfect compound of the image over the background.
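The two-step AND/OR overlay described above can be sketched as follows. This is a minimal illustration only, not the circuit of the embodiments: the function name `composite`, the 8-bit pixel values and the mask polarity (all bits 1 where the background must be preserved, all bits 0 where the image covers it) are assumptions chosen so that the AND step behaves as described.

```python
def composite(background, image, mask):
    """Overlay `image` on `background`; each list holds one integer per pixel.

    Assumed polarity: mask is all ones where the background shows through
    and all zeros where the image covers it. The image is assumed to be
    zero in its unused (transparent) areas.
    """
    out = []
    for bg, img, m in zip(background, image, mask):
        kept = bg & m           # step 1: AND clears the pixels to be obscured
        out.append(kept | img)  # step 2: OR places the image pixel
    return out


background = [0x11, 0x22, 0x33, 0x44]
image = [0x00, 0xAB, 0xCD, 0x00]   # zero in the unused areas
mask = [0xFF, 0x00, 0x00, 0xFF]    # ones keep the background
result = composite(background, image, mask)
```

In this sketch the two outer background pixels survive unchanged and the two inner pixels are replaced by the image values.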
The above overlaying technique may be used for painting pointing device cursors, in typical 2-D videogames for characters, bullets and so on (so called sprites), for graphical user interface (GUI) icons, and for video titling and other image mixing applications.
The system proposed below reduces the memory bandwidth consumed by any sprite based display system. For example, for embedded systems that read display graphics directly out of an externally connected Flash memory device, it is important to keep the bandwidth as low as possible.
The fetch unit 110 is arranged for fetching one or more image layers to be overlaid, shown as a multitude of image layers 114. Pixel values of the respective layers may be stored in local memory, e.g. so called layer buffers, may be generated locally or may be retrieved from the external memory 120. An example of a case where the graphic data is stored locally in the display controller is a cursor. Another example is a background color, i.e. the entire plane has a constant color. Usually the image data is fetched from an external source. To be less sensitive to latencies introduced by external memories there may be pre-fetch FIFOs which store the next pixel data required for each plane.
Layers of image data may have substantial areas of pixel values having a single predefined value, e.g. a background color around an object, or a transparent area. Transparent means that the display output signal in such areas is formed during overlaying by the pixel values of lower layers, whereas non-transparent pixels are used from the current layer, i.e. assuming that the current layer is in front.
The fetching unit is provided with a fetch control unit 112 for selectively fetching stored pixel values from the memory by skipping stored pixel values having the single predefined value according to a fetch mask from a mask unit 101.
In the fetch mask a mask value, e.g. a bit, corresponds to a pixel at the corresponding location in the image layer, and indicates whether such pixel has a pixel value to be fetched, or that fetching may be skipped because the respective pixel has said predefined pixel value. Hence the mask values are indicative of pixel values having the single predefined value. If so indicated, the predefined value may be entered in the respective layer buffer for the location of the corresponding pixel internally in the display processor. Alternatively the overlaying function may be controlled according to the mask values, e.g. by skipping the overlaying for pixels that are indicated to be transparent in the layer that is to be overlaid. The fetch mask obviates the need for a memory access cycle to the external memory for pixels that are indicated to have said predefined value.
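As an illustration of the selective fetching described above, the following sketch fills a layer buffer while accessing memory only where the mask requires it. The class `CountingMemory`, the function `fetch_layer` and the convention that a mask bit of 1 means "fetch" are assumptions made for this example; the description does not prescribe a polarity.

```python
class CountingMemory:
    """Hypothetical stand-in for the external memory; counts read accesses."""

    def __init__(self, pixels):
        self._pixels = list(pixels)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self._pixels[address]


def fetch_layer(memory, fetch_mask, predefined_value):
    """Fill a layer buffer, reading memory only where the mask bit is 1."""
    layer_buffer = []
    for address, must_fetch in enumerate(fetch_mask):
        if must_fetch:
            layer_buffer.append(memory.read(address))
        else:
            # Skipped: the predefined value is entered internally.
            layer_buffer.append(predefined_value)
    return layer_buffer


TRANSPARENT = 0
stored = [0, 0, 7, 9, 0, 3, 0, 0]
mask = [0 if p == TRANSPARENT else 1 for p in stored]
mem = CountingMemory(stored)
buf = fetch_layer(mem, mask, TRANSPARENT)
```

Here the buffer ends up identical to the stored layer, while only three of the eight pixel locations cause a memory access.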
A respective fetch mask may be provided for each respective layer, or for some layers only, e.g. for layers that are to be fetched from external memory, in particular if such memory is relatively slow. When pixel values of a respective layer have to be fetched the corresponding fetch mask may be activated and retrieved from the mask unit as indicated by respective arrows in the Figure.
In the display processor, the mask is provided in a memory that contains masking data indicating which pixels of a specific layer are not used. Storing the mask requires a rather limited amount of memory, and such memory could be placed on the same chip as the display processor. The mask may contain long runs of bits having the same value, and is therefore suitable for run length encoding (RLE compression) in a compressed storage format. The mask data then has to be decompressed, i.e. decoded, before use.
Optionally, the fetch mask may be stored in a compressed form in the mask unit 101. The display processor may be provided with a decompressor 116 for decompressing a fetch mask. If so, after retrieving the fetch mask, the decompressor 116 is activated for regenerating the original fetch mask.
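A minimal sketch of run length encoding a bit mask, and of the regeneration step performed by a decompressor, is given below. The storage format (the first bit value followed by a list of run lengths) and the function names `rle_encode` and `rle_decode` are assumptions for illustration; the description does not fix a particular compressed format.

```python
def rle_encode(bits):
    """Encode a list of 0/1 values as (first_bit, [run lengths])."""
    if not bits:
        return (0, [])
    runs = []
    current = bits[0]
    length = 0
    for b in bits:
        if b == current:
            length += 1
        else:
            runs.append(length)  # close the finished run
            current = b
            length = 1
    runs.append(length)          # close the last run
    return (bits[0], runs)


def rle_decode(encoded):
    """Regenerate the original bit list from (first_bit, [run lengths])."""
    value, runs = encoded
    bits = []
    for length in runs:
        bits.extend([value] * length)
        value ^= 1               # runs alternate between 0 and 1
    return bits


row_mask = [0] * 10 + [1] * 4 + [0] * 6
packed = rle_encode(row_mask)
restored = rle_decode(packed)
```

For this 20-bit row the encoded form holds three run lengths, and decoding restores the original mask exactly.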
It is noted that the example shows an image layer where the active image area touches all boundaries. Other examples may have an active image area surrounded by transparent background. The corresponding mask data then indicates that such areas contain only background.
Optionally, the fetch mask is a bit mask having bit values, each bit value indicating whether a corresponding pixel has the single predefined value. In the examples of
Optionally, the fetch mask is a bit mask having bit values, each bit value indicating whether a corresponding set of pixels has the single predefined value. The set of pixels may, for example, be 4, 8 or 16 pixels all having said single predefined value. Hence the resolution of the fetch mask is lower than the resolution of the image data of the layer. Nevertheless a substantial reduction of memory access cycles is achieved when the image has large areas having said predefined value. The size of the fetch mask is reduced by a factor corresponding to the set size. In a further embodiment, the set size, or the memory data amount of the pixel values of the set of pixels, is designed to correspond to a retrieval unit having multiple bytes retrievable from the memory by a single memory access operation. Memory systems often have a memory access mode in which a number of bytes are retrieved by a single memory access cycle, e.g. 16 consecutively stored bytes. The mask resolution can be set to match the granularity at which pixel data bytes are retrieved from the memory.
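The lower-resolution mask can be sketched as follows, with one mask bit per set of pixels and one simulated burst access per set that has to be fetched. The names `build_block_mask` and `fetch_with_block_mask`, the set size of 4 and the polarity (1 means "fetch this set") are assumptions for this example.

```python
def build_block_mask(pixels, set_size, predefined_value):
    """One bit per set: 0 only if every pixel in the set has the predefined value."""
    mask = []
    for start in range(0, len(pixels), set_size):
        block = pixels[start:start + set_size]
        mask.append(0 if all(p == predefined_value for p in block) else 1)
    return mask


def fetch_with_block_mask(pixels, mask, set_size, predefined_value):
    """Return (layer buffer, number of burst accesses)."""
    buffer, bursts = [], 0
    for i, bit in enumerate(mask):
        start = i * set_size
        size = min(set_size, len(pixels) - start)
        if bit:
            buffer.extend(pixels[start:start + size])  # one burst access
            bursts += 1
        else:
            buffer.extend([predefined_value] * size)   # filled internally
    return buffer, bursts


stored = [0] * 8 + [5, 0, 0, 0] + [0] * 4
block_mask = build_block_mask(stored, 4, 0)
buffer, bursts = fetch_with_block_mask(stored, block_mask, 4, 0)
```

The 16 pixels are covered by a 4-bit mask, and only the one set containing a non-predefined pixel is read.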
Optionally, in the device as described above, each layer of the multitude of image layers has a corresponding fetch mask. The device may have a set of registers or mask buffers to store the number of masks corresponding to the multitude of layers to be overlaid. The overlaying, and the required fetching, is then not interrupted by (re-)loading masks.
Optionally, in the device as described above, the fetch unit is arranged for retrieving the fetch mask from the memory, e.g. an external memory such as flash memory, static memory or DRAM, or (less commonly) a remote graphics memory, e.g. connected via a USB interface. In practice, such external memories may be relatively slow, so selectively fetching only the data bytes that are necessary according to the fetch mask enables the use of slower external memories, and/or may reduce bandwidth and thereby reduce power consumption and possibly noise.
Optionally, in the device as described above, the fetch unit is arranged for generating the fetch mask for an image layer based on fetching said layer initially in full from the memory. The fetch unit may detect said predefined value when reading the layer in full in an initial or preparatory cycle. Such a cycle may be slower, but once the fetch mask has been generated, subsequent operational cycles are faster. Alternatively, the mask may be generated in the first pass while the output frequency of the display controller is defined by the display, i.e. at the nominal rate. In that case the first pass requires a higher memory bandwidth, while subsequent accesses benefit from the bandwidth reduction.
Also, the mask may be generated on the fly. The image may be rendered by a GPU, which subsequently generates a mask for the rendered image. The mask may also be constructed by searching for transparent or black pixels with a CPU or a dedicated hardware circuit.
Optionally, in the device as described above, the fetch unit is arranged for preloading the fetch mask for an image layer before fetching said layer from the memory. The fetch mask may be stored with the respective image data in the memory, or in a different memory. The fetch unit may be programmed or instructed to retrieve the respective fetch mask or masks before the overlaying is performed.
Optionally, in the device as described above, the fetch mask is available in a compressed form and the fetch unit is arranged for decompressing the fetch mask. Such compressed fetch masks require less storage space.
Optionally, in the device as described above, the single predefined value is indicative of a transparency of the pixel in the overlaying. Alternatively, the single predefined value is indicative of a single color of the pixel. The predefined value may also be determined on the fly, e.g. by analyzing the image data and detecting if substantial areas have a single color. Also, the predefined value may be set according to the first pixel of the respective layer, or the last pixel value actually retrieved. Hence the color of the last populated pixel is repeated until the mask indicates that a new pixel value must be fetched.
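The variant in which the last retrieved pixel value is repeated until the mask indicates a new fetch can be sketched as below. The helper names `build_change_mask` and `fetch_repeat_last`, and the initial value used before the first fetch, are assumptions for illustration.

```python
def build_change_mask(pixels, initial_value=0):
    """Mask bit 1 where the pixel differs from the previous one (fetch needed)."""
    mask = []
    last = initial_value
    for p in pixels:
        mask.append(1 if p != last else 0)
        last = p
    return mask


def fetch_repeat_last(stored, fetch_mask, initial_value=0):
    """Return (layer buffer, number of reads), repeating the last fetched value."""
    buffer, reads = [], 0
    last = initial_value
    for address, bit in enumerate(fetch_mask):
        if bit:
            last = stored[address]  # new value fetched from memory
            reads += 1
        buffer.append(last)         # otherwise the last value is repeated
    return buffer, reads


stored = [4, 4, 4, 9, 9, 2, 2, 2]
change_mask = build_change_mask(stored)
buffer, reads = fetch_repeat_last(stored, change_mask)
```

With this convention the predefined value is not fixed per layer but follows the most recently populated pixel, so any run of equal pixel values costs a single read.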
Optionally, the masking may be applied to a scaled layer. As such, scaling is a well known function, and may be implemented in the hardware of the display controller. The mask is scaled corresponding to the data of the image. The pixel data to be retrieved may be further reduced due to downscaling before actually retrieving the pixel data from memory.
Furthermore, the fetch mask system may be applied to any element on an image plane that has surrounding transparent pixels. For example, a large picture (800*600 pixels in size) uses only a small piece (e.g. 400*300) for active image data, and everything else is transparent. The active area then covers only 120,000 of the 480,000 pixels, so roughly three quarters of the pixel fetch bandwidth can be saved using the fetch mask. Hence the fetch mask may be used both for rectangular areas, windows, and, in particular, for any non-rectangular shape surrounded by transparent or single color background.
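The saving in the 800*600 example can be checked with a few lines of arithmetic. The function name `fetch_savings` is hypothetical, and the mask size estimate assumes one mask bit per pixel, uncompressed, which is only one of the options described.

```python
def fetch_savings(layer_w, layer_h, active_w, active_h):
    """Fraction of pixel fetches avoided when only the active area is read."""
    total = layer_w * layer_h
    active = active_w * active_h
    return 1 - active / total


saved = fetch_savings(800, 600, 400, 300)  # fraction of pixel fetches skipped
mask_bytes = (800 * 600) // 8              # assumed: 1 bit per pixel, uncompressed
```

Under these assumptions three quarters of the pixel fetches are skipped, at the cost of a 60,000 byte uncompressed mask; a coarser or run length encoded mask would be smaller.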
In a practical application, e.g. for a car display system, an integrated circuit may contain the above described electronic device, or a multitude of such image display processors for multiple displays.
In a next step CONST 425 the method determines whether to selectively fetch stored pixel values from the memory by skipping stored pixel values having the single predefined value according to the fetch mask. If the mask indicates a constant value, the corresponding pixel or set of pixels is skipped as indicated by arrow 426. If the mask value indicates a pixel value to be fetched, in the step FETCH 430 the memory is accessed to fetch the image values of the corresponding pixel or set of pixels. The fetching continues until, in test RDY 440, all non-transparent pixel values of the layer have been read from the memory to a layer buffer. Buffer locations that are not filled by actual pixel values are, for example, set to zero, or to another predefined value. Alternatively, the fetch mask may be used again during the next step OVERLAY 450 to skip any pixels that are indicated to be transparent in the current layer being overlaid.
Finally, in a test NXT_LYR 460, it is determined whether all layers are overlaid as required. If not, the next layer and next fetch mask are retrieved, restarting the process at step MASK 420. If overlaying is ready, the output display signal is made available for the display (not shown), until a next image needs to be generated, and the method is reiterated at START 410.
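The layer-by-layer flow of the method may be sketched in software as follows. This is a simplified model under stated assumptions, not the claimed method itself: layers are given back to front as flat lists, a mask bit of 1 means "fetch", and the name `generate_display_image` is hypothetical. The comments map the code onto the named steps.

```python
def generate_display_image(layers, masks, background, transparent=0):
    """Overlay `layers` (back to front) on `background`; return (image, reads)."""
    output = list(background)                   # START
    reads = 0
    for stored, mask in zip(layers, masks):     # MASK: mask of the current layer
        layer_buffer = [transparent] * len(stored)
        for i, bit in enumerate(mask):          # repeat until RDY
            if bit:                             # CONST: fetch or skip?
                layer_buffer[i] = stored[i]     # FETCH
                reads += 1
        for i, bit in enumerate(mask):          # OVERLAY, mask reused to skip
            if bit:
                output[i] = layer_buffer[i]
    return output, reads                        # NXT_LYR exhausted: image ready


background = [1, 1, 1, 1]
layers = [[0, 5, 5, 0], [0, 0, 8, 0]]
masks = [[0, 1, 1, 0], [0, 0, 1, 0]]
image, reads = generate_display_image(layers, masks, background)
```

In this example three memory reads suffice for two layers of four pixels each.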
It is noted that fetching a full layer, and applying the mask per layer, could be used when performing the processing in a GPU having sufficient internal memory. Alternatively, the processing for multiple layers may actually run in parallel, e.g. in a display controller. The output of a display controller now is, for example, a fully blended result pixel per clock cycle. Expressing such parallel processing in pseudo-code would be:
For each destination pixel do:
Such process could be illustrated by repeatedly, for each pixel, applying the diagram of
Optionally, in the method, the fetch mask is a bit mask having bit values, each bit value indicating whether a corresponding pixel has the single predefined value. Such a bit mask provides the masking data in an efficient form.
Optionally, the mask value is indicative of whether the respective pixel is transparent or opaque. In the event that a pixel is opaque any pixel values of lower layers need not be fetched, i.e. more backward laying pixel values need not be retrieved at all, because such backward laying pixels would be superseded by the overlaying. Now, the combined set of masks may also be used to directly determine from which layer a pixel value must be retrieved to provide the final pixel value for a particular location. By starting at the front, skipping all layers that are transparent on that pixel location, the foremost, opaque pixel is determined. So, by first logically combining the corresponding mask values of all layers for a particular pixel location, only said foremost pixel value needs to be retrieved. The required bandwidth is now effectively reduced to retrieving only one pixel value from memory for each pixel location. In fact, the overlaying process is replaced by directly reading the foremost, opaque pixel value from external memory.
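The front-to-back selection described above can be illustrated with the following sketch, which combines the masks of all layers per pixel location and then reads only the foremost opaque pixel. The function names, the front-to-back ordering of the lists and the polarity (1 means opaque) are assumptions for this example.

```python
def foremost_opaque_layer(masks, position):
    """Index of the first layer (front to back) that is opaque at `position`."""
    for index, mask in enumerate(masks):
        if mask[position]:
            return index
    return None


def compose_by_mask(layers, masks, default_value=0):
    """Return (image, reads); at most one pixel read per location."""
    image, reads = [], 0
    for position in range(len(masks[0])):
        index = foremost_opaque_layer(masks, position)
        if index is None:
            image.append(default_value)            # no opaque layer here
        else:
            image.append(layers[index][position])  # the single memory read
            reads += 1
    return image, reads


# Front to back; the last layer is a fully opaque background.
layers = [[0, 0, 8, 0], [0, 5, 5, 0], [1, 1, 1, 1]]
masks = [[0, 0, 1, 0], [0, 1, 1, 0], [1, 1, 1, 1]]
image, reads = compose_by_mask(layers, masks)
```

Only four reads are needed for four pixel locations, whereas fetching every opaque pixel of every layer would take seven reads in this example.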
Optionally, in the method, each layer of the multitude of image layers has a corresponding fetch mask. In such a setup the system is less complex in that the same fetching process is used for each layer. Alternatively, only a subset of the multitude of image layers has a corresponding fetch mask. It is noted that some layers may contain hardly any or no pixels that may be skipped. For such layers, no mask needs to be provided, or a control header in the mask of that layer may indicate that no pixel values are masked transparent at all.
Optionally, in the method, the fetch mask for an image layer is generated based on processing said layer in full. So, during an initial processing step, the respective layer is retrieved from the memory in full. The method then determines which pixels do have said single predefined value, e.g. indicating transparency. Subsequently, the fetch mask is generated corresponding to the actual contents of the layer. The mask may be used for subsequent occurrences of overlaying the respective layer. Thereto, the fetch mask may be stored in an internal mask memory in a display processor or processing system, or in the external memory with the original full layer data. An identification of the respective layer, e.g. the starting address in the memory, may be stored with the fetch mask.
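Generating a mask on first use and keeping it, identified by the layer's starting address, might look as follows. The class `MaskStore`, its method `mask_for` and the callback used to read a full layer are hypothetical names introduced only for this sketch; a mask bit of 1 is assumed to mean "fetch".

```python
class MaskStore:
    """Hypothetical internal mask memory, keyed by layer starting address."""

    def __init__(self, predefined_value=0):
        self.predefined_value = predefined_value
        self._masks = {}
        self.full_reads = 0

    def mask_for(self, start_address, read_full_layer):
        if start_address not in self._masks:
            # Initial pass: retrieve the whole layer once and derive the mask.
            pixels = read_full_layer(start_address)
            self.full_reads += 1
            self._masks[start_address] = [
                0 if p == self.predefined_value else 1 for p in pixels
            ]
        return self._masks[start_address]


memory = {0x1000: [0, 0, 7, 9, 0, 3, 0, 0]}
store = MaskStore()
first = store.mask_for(0x1000, memory.__getitem__)
second = store.mask_for(0x1000, memory.__getitem__)
```

The second request is served from the stored mask, so the full layer is read only once.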
Optionally, in the method, said generating the fetch mask is performed off-line before operationally generating the display image data. The fetch mask may be generated before the actual overlaying process is used, e.g. during manufacture or design of the display processing system, so called off line preparation.
Optionally, in the method, the fetch mask for an image layer is preloaded before fetching said layer from the memory. So each time a new overlaying process is started, the respective layer is identified and the corresponding fetch mask is retrieved.
Optionally, in the method, the fetch mask is available in a compressed form and the method comprises decompressing the fetch mask.
In a practical system, the method may be implemented in a processor system, or in a software program for a display processor. Such a computer program product has the instructions for causing a processor system to perform a method of generating display image data as described above.
In summary, the enhancement resides in providing a fetch mask for a display processor device for overlaying a multitude of image layers. Pixel values of at least one of the image layers are stored in a memory and may comprise pixel values having a single predefined value, such as transparency. The fetch mask enables reducing the number of memory reads when fetching a layer to be overlaid, based on it being partially transparent. The display processor has a fetch unit for selectively fetching stored pixel values from the memory by skipping stored pixel values having the single predefined value according to a fetch mask indicative of pixel values having the single predefined value. Advantageously the bandwidth for accessing the memory is reduced, because fewer pixel data values need to be retrieved. Power consumption may be reduced, and slower memories may be applied.
In the foregoing specification, the invention has been described with reference to specific examples of embodiments of the invention. It will, however, be evident that various modifications and changes may be made therein without departing from the broader spirit and scope of the invention as set forth in the appended claims. For example, the connections may be a type of connection suitable to transfer signals from or to the respective nodes, units or devices, for example via intermediate devices. Accordingly, unless implied or stated otherwise the connections may for example be direct connections or indirect connections.
Because the apparatus implementing the present invention is, for the most part, composed of electronic components and circuits known to those skilled in the art, circuit details will not be explained in any greater extent than that considered necessary as illustrated above, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.
Although the invention has been described with respect to specific conductivity types or polarity of potentials, skilled artisans will appreciate that conductivity types and polarities of potentials may be reversed.
Also, the invention is not limited to physical devices or units implemented in non-programmable hardware but can also be applied in programmable devices or units able to perform the desired device functions by operating in accordance with suitable program code. Furthermore, the devices may be physically distributed over a number of apparatuses, while functionally operating as a single device.
Furthermore, the units and circuits may be suitably combined in one or more semiconductor devices.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles. Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2013/000431 | 2/12/2013 | WO | 00 |