In 3D graphical processing, there are a number of display data processing steps which are conventionally run on a computing device which has generated display data, commonly known as a host device, prior to the finished display data being transmitted to a connected display device for display. Because the host device has a limited capacity for such processing, the number of varied display devices that can be connected and supplied with data is necessarily limited.
Shaders are sets of pre-generated standard instructions which are conventionally run entirely on the host device at each step in the display data processing pipeline. Each application or other graphics producer on a host device may run its own instance of a standard shader simultaneously in a multiplexed system controlled by the Graphics Processing Unit (GPU) of the host device.
Since the shaders may handle different parts of the display data processing pipeline, they have different purposes. Vertex shaders take vertices as input and perform transformations on them such as rotation and scaling. Geometry shaders take primitives such as shapes and lines as input and modify them prior to rasterization, which converts vertices to pixels. Pixel shaders, sometimes also known as fragment shaders, take pixel data as input and produce pixel output which has been blended, lit, etc.
Because all such processes are carried out on the host device, there is a processing bottleneck on the host device which makes it more difficult to connect multiple display devices or especially large display devices. The invention aims to mitigate this problem.
Therefore, according to a first aspect, the invention provides a host device for use in a display system for displaying display data, the display system comprising a host device, a plurality of display control devices and a plurality of display devices, at least two of the display devices having different characteristics, each display control device of the plurality of display control devices being connected to the host device and to a respective display device, wherein the display data is processed in a display data processing pipeline receiving an initial input of coded display data at the host device and generating a final output of rendered display data at each display control device for transmittal to a respective display device for display, the display data processing pipeline including a plurality of serially connected display data processing steps performed on the display data, wherein the host device is configured to:
determine the characteristics of each of the display devices connected to each of the display control devices to determine characteristics of the rendered display data to be transmitted from the final output of the display data processing pipeline to the particular display device;
determine processing capabilities of each of the display control devices connected to the host device;
determine, based on at least the determined characteristics of the rendered display data required for a particular display device and on the processing capabilities of a particular display control device connected to the particular display device, a subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device, the subset ending at the final output;
perform all the display data processing steps from the initial input to a display data processing step performed prior to the subset of the plurality of serially connected display data processing steps to generate part-processed display data; and
output the part-processed display data to the particular display control device to enable the particular display control device to perform the subset of the plurality of serially connected display data processing steps ending at the final output.
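The split determination listed above can be illustrated with a minimal sketch. It considers only the processing capabilities of the display control device, whereas the host described here also takes the display characteristics into account. The stage names and the function `split_pipeline` are hypothetical and are not taken from the specification.

```python
# Hypothetical stage names; real pipelines may include more, fewer, or different steps.
PIPELINE = ["vertex_shading", "geometry_shading", "pixel_shading", "rendering"]


def split_pipeline(pipeline, device_supported):
    """Return (host_steps, device_steps).

    device_steps is the longest run of steps ending at the final output
    that the display control device is able to perform; the host keeps
    every step before that run.
    """
    split = len(pipeline)
    # Walk backwards from the final output while the device supports each step.
    while split > 0 and pipeline[split - 1] in device_supported:
        split -= 1
    return pipeline[:split], pipeline[split:]


host_steps, device_steps = split_pipeline(PIPELINE, {"pixel_shading", "rendering"})
```

Because the subset must end at the final output, a step the device supports is still kept on the host if a later step is not supported by the device.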
Preferably, each display data processing step is performed on a corresponding display data processing engine executing a selected shader program.
In one embodiment, the host device comprises all the display data processing engines required for the display data processing pipeline.
The host device may, preferably, be configured to determine a number of display data processing engines located on the particular display control device based on the processing capabilities of the particular display control device to determine the subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device.
Preferably, the host device is configured to select a shader program for each of the display data processing engines in the subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device and to transmit instructions to the particular display control device to control the particular display control device to execute the selected shader program on the corresponding display data. In one embodiment, the host device is configured to transmit the selected shader program to the particular display control device if the particular display control device does not already have the selected shader program stored in memory.
According to a second aspect, the invention provides a display control device for use in a display system, the display system comprising a host device, a plurality of display control devices and a plurality of display devices, at least two of the display devices having different characteristics, each display control device of the plurality of display control devices being connected to the host device and to a respective display device, wherein the display data is processed in a display data processing pipeline receiving an initial input of coded display data at the host device and generating a final output of rendered display data at each display control device for transmittal to a respective display device for display, the display data processing pipeline including a plurality of serially connected display data processing steps performed on the display data, wherein the display control device is configured to:
transmit to the host device the characteristics of the display device connected to the display control device to enable the host device to determine characteristics of the rendered display data to be transmitted from the final output of the display data processing pipeline to the particular display device;
transmit to the host device processing capabilities of the display control device;
receive, from the host device, information regarding a subset of the plurality of serially connected display data processing steps that are to be performed by the display control device, the subset ending at the final output;
receive, from the host device, part-processed display data to enable the display control device to perform the subset of the plurality of serially connected display data processing steps;
perform the display data processing steps in the subset of the plurality of serially connected display data processing steps on the part-processed display data to generate the rendered display data; and
transmit the rendered display data to the display device connected to the display control device.
Preferably, each display data processing step is performed on a corresponding display data processing engine executing a selected shader program.
In one embodiment, the display control device comprises fewer than all the display data processing engines required for the display data processing pipeline.
The display control device may, preferably, be configured to receive instructions from the host device to execute a selected shader program on a corresponding display data processing engine to process the display data, the selected shader program being selected by the host device for each of the display data processing engines in the subset of the plurality of serially connected display data processing steps that are to be performed by the display control device.
Preferably, the display control device is configured to receive the selected shader program from the host device if the display control device does not already have the selected shader program stored in memory.
According to a third aspect, the invention provides a display system comprising a host device as described above, a plurality of display control devices and a plurality of display devices, at least two of the display devices having different characteristics, each display control device of the plurality of display control devices being connected to the host device and to a respective display device, each display control device of the plurality of display control devices being as described above.
According to a fourth aspect, the invention provides a method of displaying display data on a display system comprising a host device, a plurality of display control devices and a plurality of display devices, at least two of the display devices having different characteristics, each display control device of the plurality of display control devices being connected to the host device and to a respective display device, wherein the display data is processed in a display data processing pipeline receiving an initial input of coded display data at the host device and generating a final output of rendered display data at each display control device for transmittal to a respective display device for display, the display data processing pipeline including a plurality of serially connected display data processing steps performed on the display data, the method comprising:
determining, at the host device, the characteristics of each of the display devices connected to each of the display control devices to determine characteristics of the rendered display data to be transmitted from the final output of the display data processing pipeline to the particular display device;
determining, at the host device, processing capabilities of each of the display control devices connected to the host device;
determining, at the host device, based on at least the determined characteristics of the rendered display data required for a particular display device and on the processing capabilities of a particular display control device connected to the particular display device, a subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device, the subset ending at the final output;
performing, at the host device, all the display data processing steps from the initial input to a display data processing step performed prior to the subset of the plurality of serially connected display data processing steps to generate part-processed display data; and
outputting, from the host device, the part-processed display data to the particular display control device to enable the particular display control device to perform the subset of the plurality of serially connected display data processing steps ending at the final output.
Preferably, each display data processing step is performed on a corresponding display data processing engine executing a selected shader program.
In one embodiment, the method further comprises:
determining, by the host device, a number of display data processing engines located on the particular display control device based on the processing capabilities of the particular display control device to determine the subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device.
Preferably, the method further comprises:
selecting, by the host device, a shader program for each of the display data processing engines in the subset of the plurality of serially connected display data processing steps that are to be performed by the particular display control device; and
transmitting, by the host device, instructions to the particular display control device to control the particular display control device to execute the selected shader program on the corresponding display data processing engine to process the display data.
The method may further comprise:
transmitting, by the host device, the selected shader program to the particular display control device if the particular display control device does not already have the selected shader program stored in memory.
Preferably, the method further comprises:
transmitting, by the display control device to the host device, the characteristics of the display device connected to the display control device to enable the host device to determine characteristics of the rendered display data to be transmitted from the final output of the display data processing pipeline to the particular display device;
transmitting, by the display control device to the host device, the processing capabilities of the display control device;
receiving, by the display control device from the host device, the information regarding the subset of the plurality of serially connected display data processing steps that are to be performed by the display control device;
receiving, by the display control device from the host device, the part-processed display data;
performing, by the display control device, the display data processing steps in the subset of the plurality of serially connected display data processing steps on the part-processed display data to generate the rendered display data; and
transmitting, by the display control device, the rendered display data to the display device connected to the display control device.
In an embodiment, the method further comprises:
receiving, at the display control device, the instructions from the host device to execute the selected shader program on the corresponding display data processing engine to process the display data.
Preferably, the method further comprises:
receiving, at the display control device, the selected shader program from the host device if the display control device does not already have the selected shader program stored in memory.
In another aspect, the invention provides a computer readable medium including executable instructions which, when executed in a processing system, cause the processing system to perform all the steps of a method as described above.
According to a further aspect, there is provided a method of co-ordinating a split display pipeline in which incompletely processed display data may be sent from a host to a display control device with a programmable processor and the display control device may then perform the remaining processing to prepare the display data for display. The method comprises:
The capabilities of the display control device may include the processing power of the programmable processor, the memory available for shaders, the presence of any specialised hardware, the bandwidth available between the display control device and the display device, or any similar measure of ability to carry out additional processing.
The commands sent to the display control device by the host may include instructions to use a particular shader for particular data, as well as any appropriate flags and parameters, such as which of multiple display devices is or are to be used for displaying the finished image data.
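As an illustration only, the capabilities and commands described above could be carried in records such as the following. All class and field names are assumptions made for this sketch; the specification does not define a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCapabilities:
    """Capabilities a display control device might report (hypothetical fields)."""
    processor_ops_per_second: int              # processing power of the programmable processor
    shader_memory_bytes: int                   # memory available for shaders
    specialised_hardware: frozenset = frozenset()  # names of any dedicated hardware
    display_link_bandwidth_bps: int = 0        # bandwidth between device and display

    def can_store(self, shader_size_bytes: int) -> bool:
        """True if a shader of the given size fits in the reported shader memory."""
        return shader_size_bytes <= self.shader_memory_bytes


@dataclass(frozen=True)
class ShaderCommand:
    """A command from the host: use a particular shader on particular data."""
    shader_id: int                 # identifier of a shader already on the device
    data_ref: int                  # hypothetical reference to the data to process
    target_displays: tuple = ()    # which display device(s) show the finished image
    flags: frozenset = frozenset() # any further flags or parameters
```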
The data sent to the display control device may be display data at any appropriate point in generation and processing. It may include, for example, vertices, primitives, and textures.
As is suggested by the transmission of data and commands to the display control device once the shaders have been transmitted, each shader is uploaded to the display control device once and then used repeatedly. This may be done at the beginning of the link, or repeated throughout the life of the link if, for example, more capability becomes available on the display control device and it therefore becomes possible to upload further shaders in addition to those already present.
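The upload-once behaviour can be sketched as a small host-side helper that remembers which shaders have already been sent. The class `ShaderUploader` and the `send` callback are hypothetical stand-ins for whatever transport the link provides.

```python
class ShaderUploader:
    """Host-side helper that transmits each shader to a device at most once."""

    def __init__(self, send):
        # send(shader_id, program) is an assumed transport callback.
        self._send = send
        self._uploaded = set()

    def ensure_uploaded(self, shader_id, program):
        """Upload the shader if the device does not hold it yet.

        Returns True if a transmission took place, False if it was skipped.
        """
        if shader_id in self._uploaded:
            return False
        self._send(shader_id, program)
        self._uploaded.add(shader_id)
        return True
```

Calling `ensure_uploaded` again later in the life of the link, for example when more capability becomes available, sends only shaders that are not already present.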
An example embodiment could comprise the following steps:
The method described above may reduce the amount of processing carried out on the host, as well as providing a potential compression advantage depending on the stage in the pipeline where data is sent to the display control device. Because less processing may be carried out on the host, the host is more likely to be able to supply multiple display control devices with data, especially if they are showing similar images, as is likely in a desktop setting. Furthermore, if the host device were a battery-powered device, such as a mobile device, but the display control device were mains-powered, or at least had a greater degree of power available than the host device, then off-loading more of the pipeline to the display control device would result in power savings for the host device and therefore in longer battery life. Indeed, the amount of battery life remaining on the mobile device may even be considered by the host device when determining how much of the pipeline to offload to the display control device.
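One possible way of taking remaining battery life into account is sketched below. The linear policy is purely an assumption for illustration; the specification does not prescribe how battery level should influence the split, and the function name is hypothetical.

```python
import math


def steps_to_offload(max_offloadable: int, battery_fraction: float) -> int:
    """Return how many trailing pipeline steps to offload.

    max_offloadable  -- the most steps the display control device could take
    battery_fraction -- remaining host battery, from 0.0 (empty) to 1.0 (full)

    Assumed policy: the lower the battery, the more steps are offloaded.
    """
    if not 0.0 <= battery_fraction <= 1.0:
        raise ValueError("battery_fraction must be between 0.0 and 1.0")
    return math.ceil((1.0 - battery_fraction) * max_offloadable)
```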
As an extension of the above method, multiple hosts may be connected to a single display control device. Provided the hosts are aware of one another and are able to reference the appropriate shaders without confusion, the same process can be followed for each connection between a host and the display control device. The display control device need not be aware that the data and accompanying instructions and requests are sent by multiple hosts, as this case is the same as if they were sent by multiple programs on a single host.
Embodiments of the invention will now be more fully described, by reference to the drawings, of which:
Each display control device [12] contains its own internal processor [15], as well as a memory [16], in which instructions and data are stored. These are transmitted by the host [11] across the connection and directed to memory [16] for storage. Commands are also transmitted across the connection, but these are sent directly to the processor [15], which then fetches the required data and instructions from the memory [16]. Naturally, especially if multiple display data processing steps are being carried out on the display control device [12] then data can be returned to memory [16] in between instructions, but the direction of data flow shown in
The processor [15] is likely to consist of multiple processing cores so that instructions can be run in parallel, and may in fact consist of multiple separate processors. However, for simplicity, the processor [15] is described herein as a single processor [15] regardless of its actual arrangement. As suggested by the fact that instructions may be run in parallel, when additional shaders are present on the display control device [12], different processor cores may simultaneously be running the shaders associated with different display data processing steps, but the processes described herein will refer to a single serially-arranged pipeline.
Furthermore, the processor [15] may consist of or include a collection of display data processing engines, each of which is specifically designed to carry out a particular purpose and run a particular corresponding type of shader. Such engines will be described again in more detail with reference to later Figures, but are collectively described as the processor [15].
Each display control device [12] is then connected to a display device [13]. The display devices [13] are likely to have different characteristics, such as size, resolution, frame update speed, colour depth, etc., meaning that display data destined for the two displays will require different processing. When processing of data is complete, the resulting image data is transmitted along this connection for display on the display device [13]. This connection will carry display-specific data, and it may be wired or wireless. Any one or more of the devices [11, 12, 13] may be co-located with another one or more, where two co-located devices share a single casing and appear to be one device.
At the end of the pipeline, the pixel data is rendered and output to the display device.
This pipeline is an example for convenience only. Other pipelines may include more, fewer, or different steps.
Conventionally, the majority of display data processing is carried out on the host [11], as shown in
Vertices [21] and primitives [22] are generated and stored in memory. Vertices [21] are collections of points stored as co-ordinates, and these are passed to the vertex shader [25], which transforms the vertices as appropriate. Primitives [22] are parts of display data with dimensionality, such as simple shapes which can then be tessellated in order to produce patterns. This is done by the geometry shader [26], with input from the vertex shader [25], as shown in
The GPU also generates textured surfaces [23], which are transformed textures that are then stored in memory. These are then passed to the texture sampler [24] for sampling, and the resulting textures are combined with the output of the geometry shader [26] in the pixel processing unit [27]. This then produces display data, which is transmitted to the display control device [12] and rendered in its internal processor [15] by a pixel rendering service [28]. This produces image data, which is sent to the display device [13] for display.
The host [11] shown in all other Figures could have the same capabilities as that shown in
After being generated on the host [11] the textured surfaces [31] have been sent to the display control device [12] and are stored in memory [16]. This means that when the geometry shading [26] is complete, the shaded data can be transmitted to the display control device [12], where pixel shading [27] is carried out, since the textures [31] are also sampled [24] on the display control device [12]. The display data can then be rendered into image data and transmitted to the display device [13] as if it had been generated and processed in the conventional way.
This reduces the amount of processing that must be carried out on the host [11] and may result in a smaller volume of data being transmitted to the display control device [12] if, for example, textures have not yet been applied at the time the data is transmitted. This may be beneficial where the bandwidth of the connection between the host [11] and the display control device [12] is limited. Compression may be applied to the data as it is transmitted from the host [11] to the display control device [12], which is not done at this stage in the pipeline in the current art as conventionally the whole pipeline is contained on a single device. In many cases, however, there will be greater benefits from compression partway through the pipeline, meaning that a system arranged according to the invention may make more efficient use of a limited-bandwidth connection.
In each of
The host [61] will include other components, but those relevant to the invention are a memory [64] which contains shaders and a processor [65], which may, as previously mentioned, actually comprise a collection of one or more display data processing engines: possibly all those required for the full pipeline as shown in
The display control device [62] is also likely to include other components, but only three are shown in
The display control device processor [68] runs the instructions comprised in shaders on data received by the input engine [66] and stored in memory [69] or passed directly to the processor [68]. The processor may, as previously mentioned, actually comprise a collection of one or more display data processing engines. Unlike the host [61], the display control device [62] does not produce primitives and is highly unlikely to have a full collection of display data processing engines, so it is likely to only be able to perform some of the display data processing steps.
This then produces processed data, which the processor [68] either transmits to the display device [63] for display in the conventional way, or stores in memory [69] so that it can fetch it and run further instructions on it as appropriate. Transmitting finished image data to the display device [63] may involve an output engine to apply formatting and timing appropriate to the display device [63].
At Step S71, the host [61] generates the primitive image data [22] it will require. As previously mentioned, this may consist of lines, simple shapes such as triangles, and colours. These are stored locally on the host [61].
The host [61] also has at least one shader [64], consisting of instructions to be carried out on display data during a particular display data processing step. It registers the shaders at Step S72, assigning each one a reference number. Examples are shown in
At Step S73, the host [61] queries the display control device [62] to see what resources it has available for running shaders, as well as to determine the characteristics of the connected display device or devices [63] in order to determine the characteristics of the data that will be sent to them. The available resources on the display control device [62] would include memory for the instructions [610] and the processing power required for an additional program, and also memory [69] available for the inputs required by the program. If appropriate, it could also include availability of display data processing engines. For example, even in the embodiment shown in
Alternatively, the display control device [62] could transmit its capabilities to the host [61] without querying, for example upon connection. This would mean that the appropriate data would already be stored on the host [61] and it could consult this at Step S73.
In the example shown in
The processor [65] on the host [61] fetches the shader [64D] from memory [64] and transmits it to the input engine [66] on the display control device [62], which immediately directs it into the memory [67] on the display control device [62], where it is stored in the area of memory dedicated to program instructions [610]. The host [61] also sends the unique identifier associated with Shader D [64D]—‘4’, in this example—and this is stored in memory [610] with the shader. It may also be stored with a reference to the shader such as a memory address in a look-up table to provide easy access to each shader using its unique identifier.
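The storage of a shader together with its unique identifier and a look-up table entry can be sketched as follows. The class `ShaderStore` is a hypothetical model of the program-instruction area of memory; it is not the device's actual memory layout.

```python
class ShaderStore:
    """Program memory plus a look-up table from identifier to (offset, length)."""

    def __init__(self):
        self._program_memory = bytearray()  # area dedicated to program instructions
        self._lookup = {}                   # shader identifier -> (offset, length)

    def store(self, shader_id, program: bytes) -> int:
        """Append the shader to program memory and record where it lives."""
        offset = len(self._program_memory)
        self._program_memory += program
        self._lookup[shader_id] = (offset, len(program))
        return offset

    def fetch(self, shader_id) -> bytes:
        """Fetch a shader by its unique identifier via the look-up table."""
        offset, length = self._lookup[shader_id]
        return bytes(self._program_memory[offset:offset + length])
```

With this arrangement a later command needs to carry only the identifier, such as '4' for Shader D in the example above, rather than the shader itself.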
Decisions regarding the display data processing steps to be offloaded could be dictated by the characteristics of all the connected display devices [63] such that if they all have particular attributes in common then relevant steps could be most efficiently carried out centrally on the host [61] even if all the display control devices [62] would in fact be capable of performing them. Furthermore, if there are three connected display devices [63] and one is different to the others—for example, it is much larger—the host [61] may determine that a larger subset of the display data processing steps should be carried out on that display device's [63] display control device [62], so that scaling common to the remaining two display devices [63] is performed on the host [61] and the display control device [62] connected to the large display device [63] performs its scaling. In this way, the system is highly flexible for greatest efficiency.
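The three-display example above can be sketched with a simple rule: scaling that is shared by more than one display is done once on the host, while a display with a unique resolution has its scaling done on its own display control device. The rule and the function name are assumptions for illustration, and resolution stands in for the wider set of display characteristics.

```python
from collections import Counter


def assign_scaling(resolutions):
    """Decide where scaling is performed for each display.

    resolutions -- dict mapping a display name to its (width, height)
    Returns a dict mapping each display name to "host" or "device".
    """
    counts = Counter(resolutions.values())
    return {
        name: ("host" if counts[res] > 1 else "device")
        for name, res in resolutions.items()
    }
```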
Furthermore, by splitting the pipeline in different ways, different compression methods may be used according to which compression method may be particularly appropriate for the particular partially-processed display data being transferred across from the host to the display control device. Indeed, in some circumstances the host may determine that, although the pipeline could be split in one way so that some processing is offloaded to the display control device as described above, by splitting the pipeline in a different place, better compression may be obtained depending on the partially processed data that is then to be sent, and that such better compression may be preferable to the offloading that would otherwise have been made. Thus, the host could use knowledge of the various compression methods available for different partially processed data and capabilities of the transmission pipeline, such as bandwidth and available resources, when determining where to split the pipeline.
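A minimal sketch of weighing offloading against compression is given below. Each candidate split is described by the number of steps kept on the host, the size of the part-processed data at that point, and the compression ratio assumed to be achievable for that kind of data. The figures, the selection rule and the function name are all hypothetical.

```python
def choose_split(candidates, link_budget_bytes):
    """Pick a split point for the pipeline.

    candidates        -- list of (host_steps, raw_bytes, compression_ratio)
    link_budget_bytes -- data the connection can carry per frame

    Among splits whose compressed data fits the link, prefer the one that
    keeps the fewest steps on the host. If none fits, fall back to the
    split that sends the least compressed data.
    """
    def compressed_size(candidate):
        _, raw_bytes, ratio = candidate
        return raw_bytes / ratio

    fitting = [c for c in candidates if compressed_size(c) <= link_budget_bytes]
    if fitting:
        return min(fitting, key=lambda c: c[0])[0]
    return min(candidates, key=compressed_size)[0]
```

In the fallback case better compression is preferred over the offloading that would otherwise have been made, which mirrors the trade-off described above.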
The example being outlined in
In any case, the host [61] will perform any display data processing steps it has determined should not be offloaded to the display control device [62], and will transmit the part-processed display data to the display control device [62] as described above for the primitive data.
At Step S77, the host [61] transmits the identifiers of the shaders to be applied to the data, preferably in the order in which they are to be applied, but a different ordering scheme may be used. For example, if Shaders A and B are vertex shaders with slightly different functions but which occupy the same display data processing step, Shader C is a geometry shader, and Shader D is a pixel shader, the host [61] may only transmit ‘4’ for a pipeline such as that shown in
At Step S78, the processor [68] on the display control device [62] fetches the referenced shader [610] from memory [67] and carries out the instructions contained in it on the primitives or incomplete data transmitted to it, having also fetched these from memory [69]. If there are multiple shaders to be applied, it will repeat this step for all the shaders in the sequence and may move the data in and out of memory [69] between iterations. It may also send feedback to the host [61] on its progress in processing the data, which may allow the host [61] to balance workload, for example between itself and the display control device [62]. This step may involve sending the shaders and data to appropriate display data processing engines within the processor [68], or carrying out the instructions in each shader using a programmable processor [68].
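Step S78 can be sketched as a loop over the referenced shaders. Shaders are modelled here as plain Python callables held in a dictionary keyed by identifier, which is an assumption made so that the sketch is runnable; the optional `report` callback stands in for the progress feedback to the host.

```python
def run_shader_sequence(shader_memory, shader_ids, data, report=None):
    """Apply the referenced shaders to the data in the order given.

    shader_memory -- dict mapping shader identifier to a callable (assumed model)
    shader_ids    -- identifiers received from the host, in application order
    data          -- primitives or part-processed display data
    report        -- optional callback(done, total) for progress feedback
    """
    total = len(shader_ids)
    for done, shader_id in enumerate(shader_ids, start=1):
        shader = shader_memory[shader_id]   # fetch the referenced shader
        data = shader(data)                 # carry out its instructions on the data
        if report is not None:
            report(done, total)
    return data
```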
At Step S79, the processing is complete and the display control device [62] transmits the data to the display device [63] for display in the conventional way.
Although particular embodiments have been described in detail above, it will be appreciated that various changes, modifications and improvements can be made by a person skilled in the art without departing from the scope of the present invention as defined in the claims. For example, hardware aspects may be implemented as software where appropriate and vice versa, and engines/modules which are described as separate may be combined into single engines/modules and vice versa. Functionality of the engines or other modules may be embodied in one or more hardware processing device(s) e.g. processors and/or in one or more software modules, or in any appropriate combination of hardware devices and software modules. Furthermore, software instructions to implement the described methods may be provided on a computer readable medium.
Number | Date | Country | Kind
---|---|---|---
1609605.9 | Jun 2016 | GB | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/GB2017/051509 | 5/26/2017 | WO | 00