The same reference number represents the same element on all drawings.
As is known in the art, each compute node 106 as well as the head node 102 may be a general purpose or specialized computing device (including one or more processors). Thus, as used herein, the head node and each of the compute nodes may also simply be referred to as “computers”, “processors”, or “nodes”. The specific packaging and integration of the computers as one or more printed circuits, in a single enclosure or multiple enclosures, and the particular means of coupling the various computers are well known matters of design choice.
Head Node
Attached host systems and/or print server devices (not shown in
Head node 102 may include a main functional element, sheetside dispatcher 120 (“SSD”). SSD 120 retrieves sheetside description files and distributes or dispatches them across the compute nodes 106 by executing a certain mapping (i.e., resource management) heuristic discussed further herein below. It is assumed that the estimated time required to produce a bitmap out of each sheetside description file (e.g., the RIP time) is known for each of the sheetsides. Those of ordinary skill in the art would readily recognize well known heuristics to estimate the RIP time for each sheetside description file. These estimates, among other dynamic factors discussed further herein below, may then be used by the mapping heuristic to make decisions about which sheetside to send to which compute node. The RIP time estimates are only estimates of RIP times and thus may differ from the actual RIP times.
For modeling of the operation of system 100 by the mapping heuristics, it may be assumed that all compute nodes provide the same computational power, i.e., it is a homogeneous system. Features and aspects hereof for modeling the system 100 can readily be extended for the case where compute nodes can differ in performance, i.e., a heterogeneous system. In the heterogeneous case, there must be a mechanism for estimating the RIP time of each sheetside on each type of compute node.
Compute Nodes
Compute nodes 106 can be represented as a homogeneous collection of “B” independent compute nodes (e.g., “compute nodes”, “processors”, “computers”, “nodes”, etc.). The main relevant use of each compute node is to convert sheetside description files received from the head node 102 to corresponding bitmap files. Sheetside description files assigned to a compute node 106 arrive dynamically from the head node 102 at an input queue associated with each compute node (e.g., a compute node input queue or “BIQ”). Each compute node 106 also has an output queue for storing completed, RIPped sheetsides (“BOQ”). The compute node retrieves the sheetside files in its input queue in FIFO order for rasterization as soon as the compute node's output buffer has enough space to accommodate a complete generated bitmap. The total amount of buffer memory in each compute node is divided between the compute node's input and output buffers at system initialization time. The size of each generated bitmap is constant, determined by the resolution and dimensions of the bitmap to be generated.
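The buffering rules just described (FIFO input queue, rasterization gated on output buffer space) can be sketched as follows. This is a minimal illustrative model, not the controller's implementation; all class and attribute names are hypothetical.

```python
from collections import deque

class ComputeNode:
    """Minimal sketch of one compute node's queueing rules (names hypothetical)."""

    def __init__(self, out_capacity_bitmaps, bitmap_size):
        self.biq = deque()              # input queue of sheetside description files (FIFO)
        self.boq = []                   # output queue of completed, RIPped bitmaps
        self.bitmap_size = bitmap_size  # uncompressed bitmaps are a fixed size
        self.out_capacity = out_capacity_bitmaps * bitmap_size

    def can_start_rip(self):
        # Rasterization may begin only when the output buffer can hold
        # one complete uncompressed bitmap.
        used = len(self.boq) * self.bitmap_size
        return bool(self.biq) and (self.out_capacity - used) >= self.bitmap_size

    def rip_next(self):
        # Dequeue in FIFO order (preserving sheetside sequencing) and
        # store the generated bitmap in the BOQ.
        assert self.can_start_rip()
        sheetside = self.biq.popleft()
        self.boq.append(sheetside)
        return sheetside
```

With an output buffer sized for two bitmaps, a third RIP cannot start until the printhead drains one bitmap from the BOQ.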
For the exemplary model and dispatch heuristics discussed herein below, it may be assumed that no bitmap compression will be used. Features and aspects hereof can readily be extended to handle compression for the case where the RIP times are extended to include time for performing compression. Further, the model and heuristics may be easily extended to account for variability in the size of generated bitmaps due to compression. Such extensions are readily apparent to those of ordinary skill in the art.
Before a sheetside can be RIPped there must be space in the compute node output buffer sufficient to accommodate the uncompressed bitmap. When compression is used, the size of the compressed bitmap is unknown until compression completes. Therefore, even utilizing compression, where the final compressed bitmap size may be less than the uncompressed bitmap size, sufficient space must be reserved to accommodate the entire uncompressed bitmap. After the sheetside is RIPped, the actual compressed bitmap size will be known and can be used to determine what space remains available in the given compute node's output buffer.
Two control event messages may be originated at the compute node 106 for use in the model and heuristics discussed further herein below. An event message may be generated indicating when rasterization for a given sheetside is completed. One control event message is sent to the head node 102 carrying the sheetside number of the bitmap, its size, and its creation time. Another control message is forwarded to the corresponding printhead (110 or 112) indicating that the bitmap for the given sheetside number is now available on the compute node 106.
Printheads
Two identical printheads may be employed in a monochrome, duplex print capable embodiment of features and aspects hereof. A first printhead 110 is responsible for printing odd numbered sheetsides, while printhead 112 is responsible for printing even numbered sheetsides. Sheet sides are printed in order according to sheetside numbers. For purposes of the model and heuristics discussed herein below, printing speed is presumed constant and known. A typical printhead interface card has sufficient memory to store some fixed number of RIPped bitmaps or a fraction thereof. In the discussion below, an exemplary buffer size associated with the printheads may be presumed to be equal to two (2) uncompressed bitmaps. Persons skilled in the art will readily see how the data transfer method could be modified to handle a buffer which is less than 2 bitmaps in size.
Bitmaps are requested sequentially by the printheads 110 and 112 from the compute nodes 106 based on information about which bitmaps are in each compute node's output buffer. This information is acquired by the printheads upon receiving control messages from the compute nodes as noted above. When the printhead interface card's buffer memory is full, the next bitmap will be requested from the compute node at the time when the printhead completes printing one of the stored bitmaps.
In this exemplary two printhead monochrome system, printhead 0 (112) will print the even numbered sheetsides, and printhead 1 (110) will print the odd numbered sheetsides. The sheetsides will be printed on both sides of a sheet of paper of the continuous form paper medium. For simplicity of this discussion, it may be presumed that the print job begins with sheetside 1 printed on printhead 1, and printhead 0 must print sheetside 2 on the other side of the sheet, at some time later. The time difference between when sheetside 1 and sheetside 2 are printed depends on the physical distance between the two printheads, the speed at which the paper moves, etc. This time difference defines the order in which sheetsides are needed by the printheads, e.g., the time when sheetside 15 is needed by printhead 1 may be the same time that sheetside 8 is needed by printhead 0 (in this example an offset of 15−8=7 will be a constant offset between odd and even numbered sheetsides that are needed simultaneously). Without loss of generality, this discussion will assume an offset of 0 to simplify the description. The incorporation of offsets greater than 0 is discussed further herein below.
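The odd/even assignment and constant-offset relationship above amount to simple arithmetic, sketched below. The function names are illustrative, not from the source.

```python
def printhead_for(sheetside):
    """Odd numbered sheetsides go to printhead 1 (110); even to printhead 0 (112)."""
    return 1 if sheetside % 2 == 1 else 0

def simultaneous_even(odd_sheetside, offset):
    """Even numbered sheetside needed at the same instant as the given odd
    sheetside, for a constant odd/even offset (7 in the example above)."""
    return odd_sheetside - offset
```

For the example in the text, sheetside 15 on printhead 1 pairs with sheetside 15 − 7 = 8 on printhead 0.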
Communication Links
As shown in exemplary system 100 of
There may be a 4 GB Fibre Channel network (154 and 156 of
Those of ordinary skill in the art will readily recognize that these exemplary communication channel types and speeds may vary in accordance with the performance requirements and even the particular data of a particular application. Thus, system 100 of
Mathematical Model
In general the dispatch mapping heuristics in accordance with features and aspects hereof help assure that each bitmap (RIPped sheetside) required by each printhead will be available when needed by the printhead. In achieving this goal, features and aspects hereof account for the following issues in modeling operation of the system:
In accordance with features and aspects hereof, assignments to compute nodes are made by the SSD for individual sheetsides sequentially in order of sheetside numbers. In one aspect, the SSD distributes sheetsides across the compute nodes based on the principle that a sheetside is mapped to the compute node that minimizes the estimated RIP completion time for that sheetside. In other words, each sheetside is assigned to its Minimum RIP Completion Time (MRCT) compute node. A mathematical model for estimating the completion time of a sheetside is presented herein below. The mathematical model forms the basis for the heuristic mapping methods and structures operable in accordance with features and aspects hereof.
The mathematical model discussed herein below presumes an exemplary queuing structure in the communications between the various components. Some constraints and parameters of the model depend on aspects of these queues and the communication time and latencies associated therewith.
Compute node processor 106 eventually processes and then subsequently dequeues each sheetside description from its input queue 202 (in FIFO order to retain proper sequencing of sheetsides). Each sheetside description is dequeued by the compute node 106 from its input queue 202, processed to generate a corresponding bitmap or RIPped sheetside, and the resulting RIPped sheetside is stored in the compute node output queue 204 associated with this selected compute node 106. As above with respect to input queue 202, the output queue 204 of compute node 106 is constrained only by its total storage capacity. Where bitmaps are uncompressed, and hence all of equal fixed size, the number of bitmaps that may be stored in output queue 204 is also fixed. Where bitmap compression is employed, the maximum number of bitmaps in the output queue 204 may vary.
Eventually, printhead 110 will determine that another bitmap may be received in its input queue 206 and requests the next expected RIPped sheetside from the appropriate output queue for the compute node 106 that generated the next sheetside (in sheetside number order). As noted above, the buffer space associated with printhead 110 is typically sufficient to store two sheetsides, such that a first sheetside is being scanned by the printhead while a second RIPped sheetside is loaded into the buffer memory. Such “double-buffering” is well known to those of ordinary skill in the art.
The mathematical model discussed further herein below presumes the following:
Mathematical Model—Sheet Side Deadline
As regards the start times of the printheads, let t0 be the start time of printhead 0 (e.g., printhead 112 of
if i is odd, and at time
if i is even. Let ttranbitmap be the bitmap transfer time from the compute nodes to a printhead. Then, SSi's deadline, td[SSi], indicates the latest wall-clock time for a compute node to produce SSi's bitmap:
The deadline calculation will be used to determine the time delay to begin processing a sheetside on a compute node. For this purpose, the deadline equation needs to be expressed in terms of the ordering of sheetsides on a given compute node. Let BQij be the ith sheetside to have entered compute node j's input queue for a given job. Define the operator num[BQij] that evaluates to the actual sheetside number. Then, (1) can be rewritten as follows:
Mathematical Model—Estimated Departure Time
Let HNi be the ith sheetside to enter the head node input queue (HNIQ) for a given print job. HNi is the same as SSi when 0 paper offset is assumed between the printheads responsible for printing odd and even sheetsides. The case when the paper offset is non-zero is discussed further herein below. Let HNi−1 be the sheetside ahead of HNi in the head node input queue. To evaluate estimated departure time for HNi to compute node j, the input buffer capacity of compute node j must be considered. The space in the compute node input buffer is limited by two factors: the maximum number of sheetside description files (Q) allowed by the mapping algorithm, and the total number of bytes of memory allocated to the input buffer. The calculation of the estimated RIP completion time of HNi on compute node j includes summing the estimated times to RIP the sheetsides assigned to that compute node but not yet RIPped. The result of this calculation is subject to accumulated estimation error, which may increase as the number of sheetsides in a compute node input queue increases; the first factor helps to reduce this accumulated error. If the size of sheetside HNi is less than or equal to the available input buffer capacity of compute node j, then HNi can be immediately sent to the input buffer of compute node j following the transfer of HNi−1. Otherwise, HNi will be delayed at the head node for the amount of time needed for a certain number of sheetsides previously assigned to compute node j to be rasterized, to create input buffer capacity sufficient to accommodate HNi.
Let the estimated RIP completion time of HNi on compute node j be tcompj[HNi]. To calculate the available input buffer capacity at compute node j, form the sequence J of all sheetsides mapped to compute node j. Sheetsides in sequence J are ordered as they were mapped to compute node j, i.e., oldest first. Let sequence K be formed of elements of J that have not yet been RIPped at the time when the transmitter at the head node is ready to start transmitting HNi to the compute nodes. The transmitter becomes ready for HNi when it is finished with HNi−1. Let tdeptx[HNi−1] be the departure time of HNi−1 to its minimum completion time compute node x, and let ttranxdf[HNi−1] be the time required to transfer HNi−1's sheetside description file to the selected compute node. Mathematically, sequence K is defined for HNi by the following equation:
K={HNkεJ: tcompj[HNk]>tdeptx[HNi−1]+ttranxdf[HNi−1]}
Let the operator size[HNk] give the size of the HNk sheetside description file and let CAPinj be the total input buffer byte capacity of compute node j, both in bytes. Then, the available capacity in the input buffer of compute node j, ACinj, is given by,
If size[HNi]≦ACinj and |K|<Q, HNi can depart at time
tdeptj[HNi]=tdeptx[HNi−1]+ttranxdf[HNi−1].
Otherwise, HNi must wait until enough sheetsides have been processed from the input buffer of compute node j, so that these two conditions hold. If after the processing of some BQmjεK these conditions hold, then tdeptj[HNi]=tcompj[BQmj]. The exemplary pseudo code below suggests an exemplary approach for finding the estimated departure time for sheetside HNi if assigned to compute node j, denoted tdeptj[HNi]. If i=1, i.e., HNi is the first sheetside to be assigned by the SSD, HNi can depart immediately.
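The departure-time procedure just described can be sketched in Python. This is only an illustrative rendering of the rule above (the original pseudo code is not reproduced here); all function and parameter names are hypothetical.

```python
def sequence_K(J, t_comp, t_ready):
    """Sheetsides mapped to node j (oldest first) not yet RIPped when the
    head node transmitter becomes ready, i.e., at tdeptx[HNi-1] + ttranxdf[HNi-1]."""
    return [hn for hn in J if t_comp[hn] > t_ready]

def estimated_departure(i, J, t_comp, t_ready, size, cap_in, Q, size_hni):
    """Sketch of tdeptj[HNi]; names are illustrative, not from the source."""
    if i == 1:
        return 0.0                    # the first sheetside can depart immediately
    K = sequence_K(J, t_comp, t_ready)
    ac_in = cap_in - sum(size[hn] for hn in K)    # available input capacity ACinj
    if size_hni <= ac_in and len(K) < Q:
        return t_ready                # depart right after HNi-1's transfer completes
    # Otherwise wait until enough queued sheetsides in K have been RIPped so
    # that both the byte-capacity and count (Q) conditions hold.
    for m, hn in enumerate(K):
        ac_in += size[hn]             # RIP of BQmj frees its input-buffer space
        if size_hni <= ac_in and (len(K) - (m + 1)) < Q:
            return t_comp[hn]         # tdeptj[HNi] = tcompj[BQmj]
    return t_comp[K[-1]]
```

For instance, if the node's queue holds two unRIPped 10-byte sheetsides against a 30-byte input buffer, a 15-byte sheetside must wait for the first of them to finish RIPping.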
Mathematical Model—Delay before Processing
Let the RIP completion time of BQij on compute node j be tcomp[BQij]. If BQij has been RIPped then tcomp[BQij] is actual, otherwise it is estimated. Consider compute node j with output buffer capacity CAPoutj measured in bytes. Because bitmaps are all assumed to be the same size (unless the method is adapted to permit bitmap compression), the number of bitmaps that could be placed in the output buffer of any compute node is constant. Assume N bitmaps can be placed in the compute node's output buffer. Define the delay to begin processing sheetside BQij, Δout[BQij], as a waiting period from the time when BQij reaches the head of compute node's input buffer to the time when the compute node's processor is ready to retrieve it for rasterization. If BQij is at the head of compute node j's input buffer, BQi−1j must have completed processing. To determine Δout[BQij], three cases are considered:
Mathematically, the delay for BQij to begin processing is given by:
An example calculation of sheetside delay Δout[BQij] is presented in
Evaluating the three cases for Δout[BQ33j] reveals that Case 1 does not apply because i is greater than N. The other two cases apply as follows (Case 2 applies when BQ30j is not in the output buffer in
Mathematical Model—Estimated RIP Completion Time
The estimated RIP start time of sheetside BQij, denoted tstart[BQij], occurs when two conditions are satisfied: BQij is present at the head of the input buffer of compute node j, and compute node j's output buffer has space sufficient to accommodate it. If these conditions are not satisfied then tstart[BQij] will be defined as follows:
Let the estimated RIP execution time, ERET[BQij], be the estimated time required to rasterize sheetside BQij. Then, tcomp[BQij] can be calculated by adding the ERET[BQij] to the start time for BQij:
tcomp[BQij]=tstart[BQij]+ERET[BQij]
Let tdept[BQij] be the estimated departure time for sheetside BQij to compute node j (as discussed above for tdeptj[HNi]), and let ttranxdf[BQij] be the time required to transfer BQij's sheetside description file to the selected compute node. Then, tstart[BQij] can be calculated using the following equation:
tstart[BQij]=max{(tcomp[BQi−1j]+Δout[BQij]), (tdept[BQij]+ttranxdf[BQij])}
An example calculation for tcomp[BQij] is shown in
tcomp[BQij]=max{(7+1), (6+0.1)}+2=8+2=10.
Note that calculation of tcomp[BQij] is based on recursion because it depends on tstart[BQij], which in turn depends on tcomp[BQi−1j]. The recursion basis is formed with BQ1j, whose tcomp[BQ1j] is found as follows:
tcomp[BQ1j]=ERET[BQ1j]+tdept[BQ1j]+ttranxdf[BQ1j]
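The recursion above can be rendered directly in Python. This is a sketch for illustration; the dictionaries of per-sheetside times are hypothetical inputs.

```python
def t_comp(i, ERET, delta_out, t_dept, t_tran):
    """Recursive RIP completion time estimate for BQij (illustrative names).

    Base case: BQ1j completes after its transfer and RIP execution time.
    Otherwise tstart[BQij] is the later of (a) the prior sheetside's completion
    plus the output-buffer delay and (b) this sheetside's arrival at the node.
    """
    if i == 1:
        return ERET[1] + t_dept[1] + t_tran[1]
    t_start = max(t_comp(i - 1, ERET, delta_out, t_dept, t_tran) + delta_out[i],
                  t_dept[i] + t_tran[i])
    return t_start + ERET[i]
```

Plugging in the worked example from the text, tcomp[BQi−1j]=7, Δout=1, tdept=6, ttran=0.1, and ERET=2, reproduces the result of 10.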
Mathematical Model—Summary
Summarizing the mathematical model, for any sheetside BQij, its RIP completion time estimate can be computed based on:
Head Node Model and Mapping Heuristic—Overview
The mapping heuristic described in this section assumes the system is in a steady state, i.e., some sheetsides have already been RIPped and the printhead start times t0 and t1 are known. A mapping of a sheetside is made to the MRCT compute node, which is found based on the mathematical model described above. Upon feedback that a compute node has completed a bitmap, the RIP completion time estimates of the sheetsides assigned but not completed at this compute node are recalculated. The sheetside at the head of the head node input queue is placed in the transfer queue to be sent to its MRCT compute node when the compute node has enough room in the input buffer to accommodate that sheetside.
The transfer queue (TQ) as discussed above is a queue on the head node that is used to pass sheetsides to a transmitter for transfer to the compute nodes from the head node. Once a sheetside is in the transfer queue, the mapping for that sheetside can no longer be changed. The transfer queue is limited to two sheetsides to postpone finalizing mapping decisions as long as possible. This allows the SSD to obtain the latest feedback information from the compute nodes to correct errors in the RIP completion time estimates. The earliest expected feedback time (EEFTj) of a compute node j is defined as the time that the sheetside being currently rasterized on the compute node is expected to be completed.
When sheetsides arrive at the head node input queue from an attached host or server via the datastream parser, they are considered for assignment in the order of sheetside numbers. For example, sheetside 43 (HNi) will be mapped to a compute node before sheetside 44 (HNi+1) is considered. By mapping sheetsides in order, certain deadlock scenarios can be avoided. Deadlock may occur due to the finite output buffer capacity of individual compute nodes. When sheetsides that have a later deadline occupy the output buffer of a compute node, a sheetside with an earlier deadline might be stuck in the input buffer of the same compute node.
Due to errors in the estimated completion times, if an opening at the input buffer of any compute node (possibly on the MRCT compute node for HNi+1) happens before there is an opening at the MRCT compute node of HNi, the MRCT calculation for HNi is performed again to check if HNi could be sent to the compute node that produced the opening. However, if it turns out that the compute node having produced the opening is still not the MRCT compute node for HNi, HNi+1 is still not considered for the following reasons:
Head Node Model and Mapping Heuristic—Procedure
For the sheetside considered, a compute node lookup table is first formed. Note that only one lookup table need be maintained at any given point in time. The lookup table contains the following information:
The entire table is sorted (ranked) in ascending order of the estimated RIP completion time of the sheetside on the compute nodes, and the table is dynamically updated upon receiving feedback from a compute node. Compute node j's status is said to be invalid, indicating that the compute node is no longer considered for mapping a given sheetside, when the following condition is satisfied:
current time>(EEFTj+(tcompk[HNi]−tcompj[HNi])),
where k is the compute node next ranked in the table. The right-hand side of the above inequality is called the invalidation time (INVTj). A compute node is said to be valid until its invalidation time has passed. If there is no other valid compute node in the sorted table after the current MRCT compute node j, then INVTj is the same as EEFTj.
The invalidation time INVTj defines the maximum wall-clock time by which compute node j can be considered for HNi mapping. Once the current time reaches INVTj, the estimated RIP completion time on compute node j is no better than the estimated RIP completion time on the compute node ranked next in the table. However, that next-ranked compute node must have all of the required conditions hold to be assigned the considered sheetside (i.e., space in the input buffer and be valid). Furthermore, the fact that the expected feedback has not arrived from compute node j since EEFTj indicates that the estimated tcompj[HNi] will significantly deviate from its actual value. Therefore, it is reasonable to stop considering compute node j for the HNi mapping. An example compute node lookup table is shown in Table 1.
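The invalidation-time rule can be sketched as follows, assuming a table already sorted in ascending order of estimated RIP completion time. Names and data shapes are illustrative only.

```python
def invalidation_times(table, eeft):
    """Compute INVTj for each entry of a compute node lookup table.

    table: list of (node, tcomp) pairs sorted ascending by tcomp
    eeft:  mapping node -> earliest expected feedback time (EEFTj)
    """
    invt = {}
    for rank, (node, tc) in enumerate(table):
        if rank + 1 < len(table):
            next_tc = table[rank + 1][1]
            # INVTj = EEFTj + (tcompk[HNi] - tcompj[HNi]) for next-ranked node k
            invt[node] = eeft[node] + (next_tc - tc)
        else:
            invt[node] = eeft[node]   # no next-ranked node: INVTj = EEFTj
    return invt

def is_valid(node, invt, now):
    # A compute node remains valid until its invalidation time has passed.
    return now <= invt[node]
```

For example, with tcompj=10, tcompk=13, and EEFTj=9, node j is invalidated at wall-clock time 9 + (13 − 10) = 12.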
Applying the Model Using Heuristic Rules
Element 500 is first operable to retrieve the next raw sheetside from a buffer or queue associated with the head node. The head node input queue is used for storing all received raw sheetsides in sheetside order as received from the datastream parser. In general, all received raw sheetside data may be stored in a queue structure such that each raw sheetside comprises an identifiable group or file identified by the sheetside number. As noted above, for simplicity of this description, it may be presumed that the system operates on a single print job having multiple raw sheetsides numbered 1 through N. Simple extensions readily understood by those of ordinary skill in the art may adapt the method of
Element 502 is operable to apply the mathematical model estimating the current operating parameters and processing capacity of each processor of the multiple processors/compute nodes. Element 502 applies heuristic rules based on the above discussed mathematical model to determine a minimum RIP completion time (MRCT) processor/compute node for processing/ripping this next raw sheetside. Element 504 is then operable to dispatch this raw sheetside to the selected MRCT processor to be RIPped and eventually forwarded to the printhead in proper order. Processing then loops back to element 500 to continue processing other raw sheetsides received at the head node.
Substantially concurrently with the operation of elements 500 through 504, element 506 is operable to continuously update the parameters used in the mathematical model describing current operating status and capacity of the plurality of processors/compute nodes. This present operating status changes as each raw sheetside is completely RIPped by its assigned processor and as new raw sheetside files are received. In like manner, as each completed, RIPped sheetside is transferred to a corresponding printhead, other operating parameters and status of the plurality of processors may be updated by element 506. The dashed line coupling element 506 to element 502 represents the retrieval of current operating status information by operation element 502 when computing the mathematical model to select an MRCT processor for the current raw sheetside.
Element 600 of
Element 602 is next operable to determine whether there are raw sheetsides in the spool or queue associated with the head node. If not, processing returns to element 600 to await receipt of additional raw sheetsides to be processed. If there is a raw sheetside in the spool or input queue for the head node, element 604 is then operable to estimate the processing capacity of each compute node of the plurality of compute nodes for ripping the spooled raw sheetside at the front of the queue. The performance information used in determining the processing capacity of each node may include a variety of parameters such as: storage capacity of the compute node/processor to receive the raw sheetside file, an estimated RIP completion time to complete ripping of this raw sheetside (including estimated RIP times of all earlier sheetsides already queued within each compute node processor and not yet RIPped). Those of ordinary skill in the art will recognize a wide variety of other factors and parameters that may be useful in determining the processing capacity of each node.
Element 606 is then operable to determine from the performance information generated by element 604 whether each compute node is valid or invalid with respect to processing of this raw sheetside. If the performance information for a compute node processor indicates that it is incapable of processing the current raw sheetside for any of various reasons, a compute node will be invalidated. The performance information for each compute node (including the “valid” or “invalid” status) is stored in a table structure generated within the head node. The table is constructed with performance information for each of the multiple, clustered compute node processors of the printer controller regarding their respective capacity to RIP this next raw sheetside.
Processing continues at element 608 to sort the generated table from earliest to latest estimated RIP completion time for this raw sheetside. Element 610 then verifies that at least one valid compute node exists in the table. Element 612 then uses the generated table, sorted by element 608, to select the first compute node indicating that it is valid and has sufficient storage capacity to receive and RIP this raw sheetside. Since the table is sorted in order of lowest estimated RIP completion time, the first valid entry having sufficient storage capacity to receive this raw sheetside will represent the compute node having the minimum RIP completion time for this sheetside given the current performance information for all processors. If no compute node is presently capable of processing this raw sheetside, processing continues at element 604 (label “B”) to continue evaluating performance information for each compute node until this raw sheetside is successfully processed by the SSD and placed in the transfer queue, where it will be dispatched to a selected compute node by another computational process. The dispatch method exemplified by
The evaluation of performance information by elements 604 and 606 is therefore dynamic in that the current performance information is re-evaluated until such time as the SSD successfully places this raw sheetside in the transfer queue for dispatch to a selected compute node processor representing the minimum RIP completion time for this raw sheetside in the current state of operation of the system.
If element 612 selected some valid compute node representing the current minimum RIP completion time for this raw sheetside and indicating sufficient storage capacity to receive it, element 614 is next operable to verify that there is room in the transfer queue of the head node to permit forwarding of this raw sheetside from the head node to the selected compute node's input queue. As noted above, the transfer queue may preferably have a limited capacity measured in a pre-determined number of raw sheetside files. This pre-determined threshold limit assures that the head node will only make a valid selection of the MRCT compute node at the last possible opportunity so as to assure that the most current performance information is used in the selection process. If no room is presently available in the transfer queue, processing continues at element 604 (label “B”) to continue evaluating performance information of each compute node until this raw sheetside is successfully dispatched from the head node to a selected compute node processor.
If element 614 determines that the transfer queue has sufficient capacity to allow transfer of this raw sheetside, element 616 is then operable to remove the raw sheetside from the head node input queue or spool and place the sheetside in the transfer queue for dispatch to the selected compute node. (through the head node's transfer queue mechanism). Processing then continues looping back to element 602 (label “A”) to process further raw sheetsides utilizing current performance information regarding each of the plurality of compute node processors in the print controller.
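One pass of the dispatch check described above (select the MRCT node, confirm input-buffer room, confirm transfer-queue room, then enqueue) can be sketched as below. This is an illustrative skeleton of the flow, not the controller's code; the node dictionary keys are hypothetical.

```python
def try_dispatch(sheetside, nodes, transfer_queue, tq_limit=2):
    """One SSD dispatch attempt for the head-of-queue sheetside.

    nodes: list of dicts with illustrative keys 'id', 'tcomp' (estimated RIP
    completion time), 'valid' (not past its invalidation time), and
    'has_room' (input buffer can accept this sheetside).
    Returns the chosen node id, or None if the dispatch must be deferred.
    """
    ranked = sorted(nodes, key=lambda n: n["tcomp"])   # earliest completion first
    for node in ranked:
        if node["valid"] and node["has_room"]:
            # First valid entry with room is the MRCT node for this sheetside.
            if len(transfer_queue) < tq_limit:
                transfer_queue.append((sheetside, node["id"]))
                return node["id"]
            return None   # defer: transfer queue full; re-evaluate later
    return None           # defer: no valid compute node with room yet
```

Deferral (returning None) corresponds to looping back through elements 604-612 with refreshed performance information rather than committing a stale mapping.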
Upon detection of any new input event, the idle state (700) is exited and processing commences at element 702 to determine the type of event and to appropriately process the event. Element 702 determines whether the event was receipt of a new raw sheetside from the datastream parser. If so, this new raw sheetside is added at the tail of the head node's input queue (HNIQ) by element 704. If the queue was not empty before the insertion, as determined by element 706 (i.e., after the insertion |HNIQ|>1), then no further actions will be taken and the system returns to the idle state at element 700. Otherwise, at element 708, the sheetside will be immediately considered for mapping in that the compute node lookup table will be created to determine the MRCT compute node. Three conditions must hold for a mapping or dispatch to a compute node to be made for a given sheetside: (a) the selected compute node j is the MRCT compute node for the sheetside, (b) the input buffer of compute node j has enough room to hold the sheetside, and (c) the transfer queue at the head node has space sufficient to accept the sheetside. If all the conditions are satisfied, the considered sheetside will be mapped or dispatched to its MRCT compute node, placed in the transfer queue, and the SSD returns to its idle state. If any of the required conditions does not hold, the SSD returns to its idle state, and a mapping for this sheetside is postponed.
In particular, element 722 sorts the just created/updated table with performance information for each compute node/processor to process this first raw sheetside in the head node input queue. The table is sorted in order of estimated RIP completion time for this raw sheetside for each of the compute nodes/processors. Element 724 then adds the compute node invalidation times to each table entry. As regards the invalidation time of a compute node for a particular sheetside, assume that the current wall-clock time matches INVTj scheduled for compute node j. In this case, compute node j's status will be changed to invalid, the compute node lookup table will be resorted, and the compute node invalidation times will be recalculated. The MRCT compute node's entry is then located based on the sorted order of the valid candidate compute nodes/processors in the table. Element 726 then determines if the MRCT compute node's table entry indicates sufficient storage capacity to receive the new raw sheetside. If not, the system returns to idle (element 700) to await another change of status to dispatch this new raw sheetside. If element 726 determines that the sheetside's MRCT compute node has sufficient capacity to receive the raw sheetside, element 728 is operable to determine whether the transfer queue of the head node has sufficient space to hold another raw sheetside file.
As noted above, the transfer queue is preferably limited to a pre-determined fixed number of sheetsides—in a preferred embodiment, two sheetsides. This limit helps assure that the head node defers all dispatch/mapping decisions for any sheetside to the latest possible time to utilize the most current estimates of compute node/processor performance information.
If element 728 determines that the transfer queue has insufficient capacity, the system returns to idle (element 700) to defer dispatch of this sheetside. If element 728 determines that the transfer queue has sufficient capacity to store this sheetside, element 730 moves the new sheetside from the head node's input queue to the transfer queue. Element 732 then determines if yet another sheetside may fit in the transfer queue. If so, processing continues at element 710 as discussed below. Otherwise, the system returns to the idle state (element 700) to await another state change causing the head node to re-evaluate sheetside dispatch.
The system may also come out of the idle state (element 700) when a compute node completes RIPping of a dispatched sheetside or when other status messages indicate another completion within the system (e.g., completion of a transfer of a RIPped bitmap to the printhead, etc.). Element 702 will determine that the idle state was exited due to some reason other than a new sheetside arrival. Element 710 then verifies that there is at least one raw sheetside presently queued in the head node input queue. If not, the system simply returns to the idle state (element 700). Otherwise, elements 712 through 720 update the performance information lookup table for the next queued raw sheetside (or create a new table at element 708 if needed).
More specifically, element 712 determines if a table already exists for the next queued sheetside in the head node. If not, element 708 (et seq.) as discussed above creates a new table, sorts it, and uses it to locate a compute node to which this sheetside may be dispatched. If element 712 determines that the table already exists, elements 714 through 718 are operable to update that table, if needed, to reflect current performance information regarding the compute nodes/processors of the cluster controller. Some previously invalid processors may become valid and vice versa. Following creation or update of the table, elements 722 through 732 are operable as above to attempt to dispatch the sheetside to its MRCT compute node/processor.
For example, when a bitmap RIP complete notification arrives from compute node j, the corresponding row of the compute node lookup table for the sheetside will be updated (e.g., element 716). If the RIP complete notification was sent from a compute node whose entry in the lookup table is invalid, then after updating the sheetside's completion time on this compute node, the compute node will be marked as valid again and the other table fields updated as needed. This includes recalculation of tcompj[HNi], EEFTj, and ACinf. It is important to note that because the computation of tcompj[HNi] is recursive, the estimated RIP completion times for all the sheetsides assigned to compute node j but not yet RIPped must be updated. The invalidation times are recalculated across the entire table after the new compute node ranks are determined. Further SSD actions depend on whether the conditions required to map the currently considered sheetside hold.
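The recursive character of tcompj[HNi] can be illustrated with a minimal sketch. The sketch assumes, as a simplification, that each sheetside still queued on a compute node is estimated to begin RIPping when its predecessor on that node finishes; the function name and signature are hypothetical:

```python
def recompute_completion_times(now, pending_rip_times):
    """Recompute estimated RIP completion times for the sheetsides still
    queued on one compute node, in dispatch order.

    `now` is the current wall-clock time; `pending_rip_times` holds the
    estimated RIP time of each not-yet-RIPped sheetside.  Because each
    estimate builds on the previous one, a change at the head of the
    queue (e.g., a RIP complete notification) forces the whole chain to
    be recomputed."""
    completion_times = []
    t = now
    for rip_time in pending_rip_times:
        t += rip_time  # this sheetside starts when the prior one ends
        completion_times.append(t)
    return completion_times
```

When a RIP complete notification arrives from compute node j, a routine of this shape would be rerun over node j's remaining queue before the table is resorted.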
Or, for example, consider a transfer complete input generated by the head node transmitter. This input indicates that an additional slot became available in the TQ. As a result, the mapping for the currently considered sheetside will be finalized if this was the only unsatisfied condition blocking the mapping before. No table updates are invoked with this input. In addition, the table for this sheetside will be deleted as this sheetside has now been assigned.
Paper Offset Extension
As mentioned above, sheetsides are printed on both sides of the paper by two separate marking engines separated by some distance measured in sheets of paper. This implies that a certain fixed amount of time (referred to as a paper offset time) is required to pull the paper from one printhead to the other to achieve proper alignment between consecutive odd and even numbered sheetsides. For purposes of simplification, the discussions above presumed this offset to be zero. A non-zero paper offset modifies the systems and methods above in only minor ways readily observed and understood by those of ordinary skill in the art. A non-zero paper offset has two implications for the features and aspects discussed herein above:
1. The start time of the printhead 0 (e.g. 112 of
2. Sheetsides have to be rearranged in the head node input queue, because the sheetside mapping order must match the order in which generated bitmaps are fetched by the printheads. Such a reordering is illustrated in
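One possible form of this reordering can be sketched under stated assumptions. The sketch below assumes printhead 0 prints the even-numbered sheetside of each sheet, the trailing printhead prints the odd-numbered sheetside a fixed offset of whole sheets later, and sheetsides are numbered from zero; the parity and offset conventions are illustrative only and do not reflect the actual reordering of the figure referenced above:

```python
def fetch_order(num_sheets, offset):
    """Return an interleaved fetch order for duplex sheetsides when the
    second marking engine trails the first by `offset` sheets.

    Sheet s carries front sheetside 2*s and back sheetside 2*s+1; the
    back side of sheet s is fetched `offset` sheet-steps after its
    front side."""
    order = []
    for step in range(num_sheets + offset):
        if step < num_sheets:
            order.append(2 * step)                  # front side of sheet `step`
        if step >= offset:
            order.append(2 * (step - offset) + 1)   # delayed back side
    return order
```

With a zero offset this degenerates to the natural sequential order, consistent with the simplified discussion earlier in which the offset was presumed to be zero.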
Color Extensions
The compute nodes/processors used in a color printer application of features and aspects hereof are structurally identical to those used in the monochrome printer. However, the color version must send bitmaps to a larger population of printheads, and multiple bitmaps will be created for each sheetside. Odd and even numbered bitmaps are stored in a single output buffer of the compute node and transferred to the printheads in FIFO fashion. It is preferable that the four bitmaps corresponding to the four color planes be created by the same compute node from a single sheetside description file (at the same time) in the color printer application of features and aspects hereof.
Color Extension—Print Groups
As shown in
Color Extension—Communication Networks
A 1 Gb Ethernet network with 50% payload efficiency may be used between the head node (not shown in
Ignoring the optional trunk link between the switches, each switch is assumed to function as a C×H non-blocking crossbar switch where C is related to the number of compute nodes and H is related to the number of printheads. Thus, multiple compute nodes 106 can communicate with unique printheads 910 (1-4) or 912 (1-4) simultaneously.
The multicast option is assumed to be enabled on the switches. This allows a switch to make four copies of the control message that is sent when a bitmap is created, notifying every printhead in the corresponding print group (920 and 922). Another possible approach is to forward four control messages originating from the compute node. However, this would result in a slightly higher load on the network between the compute nodes 106 and a switch 108A or 108B.
Color Extensions—Communication Conflict Resolution Scheme
Because four bitmaps are generated from a single sheetside description file in the color printer, the network traffic between the compute nodes 106 and the printheads (910 (1-4) and 912 (1-4)) is four times more intensive than in the monochrome printer. As a result, a situation may arise in which a bitmap cannot be delivered on time to its destination printhead because the compute node's outgoing communication channel is busy transmitting another bitmap to the same print group (920 or 922). To provide insight into such a situation and a method for resolving the problem, consider the example depicted in
Illustrated in the timing diagram of
Assume now that all color plane bitmaps for sheetsides 5, 7, 9, and 11 are stored in the same compute node's output buffer because their sheetside description files were assigned for rasterization to the same compute node. Let ttranbitmap be the time required to transfer a bitmap from a compute node output buffer to a printhead input buffer. For the sake of simplicity, assume ttranbitmap=0.05 sec. and that cut-through routing mode is activated on the fiber switches 108A and 108B. Recall that when the printhead interface card's memory is full, the next bitmap is requested from the compute node at the time the printhead completes printing one of the stored bitmaps. The time required to deliver a bitmap to the corresponding color printhead after the request is received at the compute node, tdeliver, can be computed for each of the aforementioned bitmaps as the delay ta until the communication channel becomes available, plus ttranbitmap. Specifically, for bitmap 11[0], ta(11[0])=0. As demonstrated in
tdeliver(11[0])=ttranbitmap=0.05 sec;
tdeliver(9[1])=ttranbitmap−0.01+ttranbitmap=2×ttranbitmap−0.01=0.09 sec;
tdeliver(7[2])=2×ttranbitmap−2×0.01+ttranbitmap=3×ttranbitmap−2×0.01=0.13 sec;
tdeliver(5[3])=3×ttranbitmap−3×0.01+ttranbitmap=4×ttranbitmap−3×0.01=0.17 sec.
This set of equations must be adjusted if a different forwarding mode is used on the switches.
In the considered system, if a given bitmap's tdeliver is greater than tprint (recall that tprint is 0.11 sec. in this example), then it will not be delivered by the time it is needed for printing. According to this rule, sheetsides 7 and 5 will not be delivered on time in the example discussed. If the SSD does not consider compute nodes that have already been assigned two sheetsides whose print times overlap with the considered sheetside, then this unacceptable situation will be avoided. Those skilled in the art will be able to adjust this set of equations to various communication environments and derive a "banned" sequence of sheetside assignments to the same compute node.
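The delivery-time analysis above can be reproduced numerically. The sketch assumes, as in the example, strictly sequential transfers over one channel, a transfer time of 0.05 sec., a print time of 0.11 sec., and successive printhead requests staggered by 0.01 sec.; these constants and names are taken from or assumed for the example only:

```python
T_TRAN = 0.05    # ttranbitmap: time to transfer one bitmap (sec.)
T_PRINT = 0.11   # tprint: time to print one bitmap (sec.)
STAGGER = 0.01   # assumed spacing between successive printhead requests (sec.)

def delivery_times(num_bitmaps):
    """Compute tdeliver for bitmaps queued behind one another on a
    single outgoing channel: the wait ta for the k earlier transfers
    (less the accumulated request stagger), plus one transfer time."""
    times = []
    for k in range(num_bitmaps):
        ta = k * T_TRAN - k * STAGGER  # channel busy with earlier bitmaps
        times.append(ta + T_TRAN)
    return times

# Bitmaps whose delivery time exceeds the print time arrive too late.
late = [t > T_PRINT for t in delivery_times(4)]
```

Evaluating this reproduces the four tdeliver values above (0.05, 0.09, 0.13, and 0.17 sec.) and flags the last two bitmaps, corresponding to sheetsides 7 and 5, as late.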
In the described example, it was assumed that requested bitmaps are transmitted to the printheads sequentially; this allows us to determine that bitmaps 11[0] and 9[1] will be delivered in time, as opposed to bitmaps 7[2] and 5[3]. In practice, many production network protocols allow concurrent data transfers over the same communication channel. Nevertheless, the analysis provided and the derived restriction on the SSD's assignment process hold for that case as well, or else some sheetsides will not be delivered by the time they are needed. The only difference is that which bitmaps fail to be delivered in time depends on the details of the protocol used.
Bitmap Compression Extensions
Features and aspects hereof can readily be extended so that bitmap compression can be applied to reduce the file size of the generated bitmaps. Bitmap compression has the following benefits for the intended system:
The obvious drawback of bitmap compression is the extra CPU work required to generate the compressed version of a bitmap. This extra CPU work delays the creation of a bitmap, which is equivalent to a longer estimated RIP execution time for the sheetsides.
To extend features and aspects hereof to include bitmap compression, examples of the aspects that should be taken into account are as follows. Because the result of a compression attempt is not known a priori, sufficient space must be reserved to accommodate the entire uncompressed bitmap when a CPU retrieves a sheetside for RIPping. Also, a control message has to be sent to the head node specifying the actual file size of the completed compressed bitmap.
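The two accounting rules just stated can be sketched as follows. All names here are hypothetical, and the compressor is supplied by the caller, since the actual compression scheme is not specified above:

```python
def rip_with_compression(sheetside, output_buffer_free, uncompressed_size, compress):
    """RIP a sheetside with bitmap compression applied.

    Returns (actual_size, control_message) on success, or None when the
    output buffer cannot guarantee room for a fully uncompressed
    result."""
    # Rule 1: the outcome of a compression attempt is not known a
    # priori, so space for the entire uncompressed bitmap must be
    # reserved before RIPping begins.
    if output_buffer_free < uncompressed_size:
        return None
    compressed = compress(sheetside)
    actual_size = len(compressed)
    # Rule 2: a control message reports the actual file size of the
    # completed compressed bitmap back to the head node.
    control_message = {"sheetside": sheetside["id"], "size": actual_size}
    return actual_size, control_message
```

Reserving the worst-case size up front means a highly compressible bitmap still consumes an uncompressed-size reservation during RIPping; the head node learns the true footprint only from the control message.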
Although specific embodiments were described herein, the scope of the invention is not limited to those specific embodiments. The scope of the invention is defined by the following claims and any equivalents thereof.