While data has increasingly been consumed electronically, there remain many scenarios in which there is no substitute for printing electronically generated data. A source electronic document has to undergo processing so that a printing device, such as a laser printer, can actually form images on print media to print the document. The source document may be in a high-level page-description language (PDL), such as PostScript, PDF, XPS, or another PDL. The source document undergoes raster-image processing (RIP) to convert the document to a raster image, or bitmap, that the printing device directly uses to print the converted document onto print media.
As noted in the background section, a source electronic document in a page-description language (PDL) like PDF or another PDL undergoes raster-image processing (RIP) before a printing device like a laser printer can print the document. Performing RIP on a document can be computationally intensive, which in turn means that performing RIP can be time consuming. While a printing device may have a relatively fast print engine (e.g., a laser printing mechanism), the device's ability to print quickly can depend on how fast the pages of a document can undergo RIP and the converted pages can be delivered to the print engine.
One technique to decrease RIP time is to identify RIP-reusable elements among the pages of a source electronic document to be printed. As such, rather than performing RIP each time an element is encountered on any page of a document, the element just has to undergo RIP one time. Subsequent occurrences of the element can then reuse the prior RIP of the element, instead of having to again perform RIP.
However, scanning a document to identify RIP-reusable elements itself takes time. If no—or relatively few—such elements are found in a source document to be printed, the resulting scanning-plus-RIP time ends up being longer than if RIP were performed without any initial reuse scanning being performed. One technique to resolve this issue is to abort a reuse scan if the scan is not identifying a sufficient number of document pages that include RIP-reusable elements.
More specifically, one technique compares the percentage of document pages scanned thus far that do not contain RIP-reusable elements, of the total number of document pages, to a threshold. For example, a document may include 1,000 pages. After 200 pages have been scanned, if all 200 pages do not contain RIP-reusable elements, then the percentage is 200/1,000=20%. If half of the 200 pages do not contain RIP-reusable elements, then the percentage is 100/1,000=10%. If the percentage is greater than the threshold, reuse scanning may be aborted.
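The arithmetic of this percentage-of-total comparison can be written out as a minimal sketch. The function names are hypothetical, and the strict greater-than comparison is taken from the wording "greater than the threshold" above.

```python
def unique_page_percentage(unique_pages_found: int, total_pages: int) -> float:
    """Unique pages found so far, as a percentage of ALL pages in the document."""
    return 100.0 * unique_pages_found / total_pages


def should_abort_scan(unique_pages_found: int, total_pages: int,
                      threshold_percent: float) -> bool:
    """Abort once the percentage strictly exceeds the threshold."""
    return unique_page_percentage(unique_pages_found, total_pages) > threshold_percent


# The 1,000-page example from the text, after 200 pages have been scanned.
print(unique_page_percentage(200, 1000))  # all 200 scanned pages are unique
print(unique_page_percentage(100, 1000))  # half of the 200 scanned pages are unique
```

Note that the denominator is the total page count, not the number of pages scanned so far, which is why the percentage can only grow as scanning proceeds.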
However, setting the threshold can be difficult. For example, if all the pages of a document are unique (i.e., do not contain RIP-reusable elements), or all the pages include RIP-reusable elements, then the threshold may be set relatively low so that the reuse scan can abort quickly. The percentage of unique document pages, of the total number of document pages, will remain zero as pages of a document of which every page includes a RIP-reusable element are scanned. By comparison, the percentage of unique document pages will linearly increase towards 100% as pages of a document that contains no RIP-reusable elements are scanned.
While setting such a low threshold provides for timely termination of the reuse scanning process in these two opposing cases, such a low threshold can prematurely terminate reuse scanning in other cases that may frequently occur. For example, a document may include copies of a two-sided label, where the first side is unique for each copy (e.g., including a different delivery address, a different product bar code, and so on), and the second side is the same for each copy (e.g., including information regarding a vendor or manufacturer). If the document includes 1,000 pages, or 500 labels, then this means that the threshold has to be set over 50% to ensure that reuse scanning does not prematurely terminate. However, such a high threshold results in over 500 pages of a 1,000-page document that contains no RIP-reusable elements having to be scanned before reuse scanning properly terminates.
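The trade-off can be illustrated with a small simulation. This is a sketch only: each page is reduced to a hypothetical boolean flag (True when the page has no RIP-reusable elements), and the threshold values of 20% and 51% are illustrative choices, not values from the text.

```python
from typing import List, Optional


def abort_page(unique_flags: List[bool], threshold_percent: float) -> Optional[int]:
    """Return the 1-based page at which a percentage-of-total scan aborts.

    unique_flags[i] is True when page i+1 has no RIP-reusable elements.
    Returns None when the scan reaches the end without aborting.
    """
    total = len(unique_flags)
    unique_found = 0
    for page, is_unique in enumerate(unique_flags, start=1):
        if is_unique:
            unique_found += 1
        if 100.0 * unique_found / total > threshold_percent:
            return page
    return None


labels = [True, False] * 500   # 500 two-sided labels: unique front, shared back
all_unique = [True] * 1000     # no RIP-reusable elements anywhere

print(abort_page(labels, 20.0))      # low threshold: the label job is cut off
print(abort_page(labels, 51.0))      # threshold above 50%: the label job survives
print(abort_page(all_unique, 51.0))  # same threshold on an all-unique document
```

With the low threshold the label document is abandoned partway through, while the threshold above 50% lets it finish but forces more than 500 pages of the all-unique document to be scanned before the abort.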
Furthermore, documents themselves may be amalgamations. For example, multiple print jobs may be merged into a single source electronic document, or the content of a document may otherwise periodically change from the first page of the document to the last page of the document. As to RIP-reusable element scanning, this means that parts of the document may include pages with RIP-reusable elements, but other parts of the document may include pages without RIP-reusable elements. Reuse scanning may thus terminate due to an initial part of a document not containing sufficient RIP-reusable elements, even though a subsequent document part includes many RIP-reusable elements.
Techniques described herein ameliorate these shortcomings. A threshold can be set in correspondence with the rate of identifying unique pages within a document, as opposed to in correspondence with the percentage of unique pages identified so far of the total number of document pages. For example, a threshold may be set so that for every N pages that are scanned, a maximum of M pages that contain no RIP-reusable elements are permitted. If, of N pages, reuse scanning identifies more than M unique pages, then the scan is aborted. Otherwise, scanning continues with the next N pages, where again if more than M pages that contain no RIP-reusable elements are identified within these next N pages, then scanning is aborted.
This approach can quickly terminate reuse scanning of a document that contains few or no RIP-reusable elements, while not prematurely terminating the scan of a document that has periodic reuse behavior, such as the two-sided label example that has been described. If documents with random reuse behavior are expected (i.e., documents in which there is no periodicity as to the presence of RIP-reusable elements), then N and M may be proportionally increased so that more pages of a document are scanned prior to termination. For example, if periodicity is expected, M and N may be set to three and five, respectively, whereas if periodicity is not expected, but the same base percentage M/N is desired, then M and N may both be increased by a factor of, say, four, to twelve and twenty, respectively.
Furthermore, even when reuse scanning has been aborted, reuse scanning may be restarted, based on a parameter. As one example, a less computationally intensive scan may continue once reuse scanning has been aborted, to identify a change in content in the document that warrants restarting reuse scanning. For instance, within the first N=10 pages of a document, more than, say, M=6 unique pages may be identified, resulting in termination of the reuse scanning. A less computationally taxing, and thus faster, scan of subsequent pages of the document may continue. If at, say, page 152 of the document a change in content is identified via this continued scan, then reuse scanning begins again at page 152. Therefore, the benefits of RIP element reuse accrue even with source electronic documents that are amalgamations of different print jobs or otherwise have different types of content.
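A compact sketch of the rate-based check follows, using the M=3, N=5 and M=12, N=20 settings mentioned above. The boolean page flags and the function name are illustrative assumptions.

```python
from typing import List, Optional


def rate_abort_page(unique_flags: List[bool], n: int, m: int) -> Optional[int]:
    """Return the 1-based page at which a rate-based scan aborts, else None.

    Pages are examined in consecutive intervals of n pages; the scan aborts
    as soon as one interval contains more than m unique pages.
    """
    for start in range(0, len(unique_flags), n):
        unique_in_interval = 0
        for offset, is_unique in enumerate(unique_flags[start:start + n]):
            if is_unique:
                unique_in_interval += 1
                if unique_in_interval > m:
                    return start + offset + 1
    return None


labels = [True, False] * 500   # periodic reuse: every second page is reusable
all_unique = [True] * 1000     # no reuse at all

print(rate_abort_page(labels, n=5, m=3))        # periodic job is never aborted
print(rate_abort_page(all_unique, n=5, m=3))    # aborts on the 4th unique page
print(rate_abort_page(all_unique, n=20, m=12))  # scaled by four: aborts later
```

Scaling both values keeps the ratio M/N at 60% but widens the window, so a document with irregular reuse gets more pages to show a reusable element before the scan gives up.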
The method 100 is performed in relation to a source electronic document that is to undergo RIP so that a printing device can print the resulting converted document. The source document may be in a high-level PDL, such as PDF, PostScript, XPS, or another PDL. The PDL may specify different types of elements, such that each page of the document includes a number of such elements. Each element on a document page has to undergo RIP before the document can be printed.
A RIP-reusable element, which can also be referred to as a reusable element, is an element having more than one occurrence and whose RIP results are not affected by other elements that vary each time the element is used (such as, for example, transparency blending between the repeated reusable element and a non-repeated element). Therefore, such an element just has to undergo RIP once. Subsequent occurrences of the element within the document—on the same page or on different pages—can reuse the converted element. That is, when RIP is performed on pages including other occurrences of the element, instead of RIP having to be re-performed on the element, the prior RIP of the element can be (re)used.
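The reuse of a prior RIP result can be pictured as a simple cache. The sketch below is a simplification under stated assumptions: elements are identified by hypothetical string IDs, the `rasterize` callable stands in for whatever RIP routine is actually used, and every element passed in is assumed to be unaffected by neighboring elements (so the transparency-blending caveat above does not arise).

```python
from typing import Callable, Dict, List


def rip_with_reuse(element_ids: List[str],
                   rasterize: Callable[[str], bytes]) -> List[bytes]:
    """Rasterize each distinct element once and reuse the result afterwards."""
    cache: Dict[str, bytes] = {}
    output = []
    for element_id in element_ids:
        if element_id not in cache:
            # First occurrence: the element actually undergoes RIP.
            cache[element_id] = rasterize(element_id)
        # Every occurrence, first or later, uses the cached result.
        output.append(cache[element_id])
    return output


calls = []


def fake_rasterize(element_id: str) -> bytes:
    """Stand-in rasterizer (hypothetical) that records how often it runs."""
    calls.append(element_id)
    return element_id.encode()


rip_with_reuse(["logo", "addr-1", "logo", "addr-2", "logo"], fake_rasterize)
print(calls)  # the repeated "logo" element is rasterized only once
```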
Pages of the document that include any RIP-reusable element may be referred to as reusable pages. Pages of the document that do not include any RIP-reusable elements may be referred to as unique pages. The process of scanning a document for RIP-reusable elements can be referred to as reuse scanning.
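These definitions can be made concrete with a short sketch. It assumes each page is reduced to a list of hypothetical element IDs and treats any element occurring more than once in the document as RIP-reusable; the additional condition that an element's RIP result not be affected by varying neighbors is deliberately left out.

```python
from collections import Counter
from typing import List, Set


def find_reusable_elements(pages: List[List[str]]) -> Set[str]:
    """Elements with more than one occurrence anywhere in the document."""
    counts = Counter(element for page in pages for element in page)
    return {element for element, count in counts.items() if count > 1}


def classify_pages(pages: List[List[str]]) -> List[str]:
    """Label each page "reusable" or "unique" per the definitions above."""
    reusable = find_reusable_elements(pages)
    return ["reusable" if any(e in reusable for e in page) else "unique"
            for page in pages]


# Two two-sided labels: unique address side, shared vendor side.
document = [["addr-1"], ["vendor"], ["addr-2"], ["vendor"]]
print(classify_pages(document))
```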
In the technique of the example method 100, the document is scanned for RIP-reusable elements in intervals of N pages. If in a given interval of N pages there are more than M unique pages, then reuse scanning is aborted. For example, if the first interval of document pages 1 through N includes no more than M unique pages, then the next interval of pages N+1 through 2N is scanned. If this interval does not include more than M unique pages, then the following interval of document pages 2N+1 through 3N is scanned, and so on.
The method 100 sets a current page interval I to 1 (102). The interval I denotes the first page of the current interval. Therefore, when subsequent intervals of N pages are scanned, I is set to N+1, 2N+1, and so on.
The method 100 sets a current interval page J within the current interval I to 1 as well (104). The page J identifies the pages of the current interval, from the first page 1 through the last page N of the interval. As an example, in the third interval beginning with page I=2N+1 of the document and ending at page 3N of the document, the page J still counts from 1 through N. That is, the page J is a relative counter within the current interval, and not an absolute counter within the document as a whole.
The method 100 sets a unique page counter K to zero (106). The unique page counter K denotes the number of pages within the current interval of the document that do not include any RIP-reusable elements. Therefore, at the beginning of every interval of N pages, the counter K is reset back to zero, and is incremented as unique pages within the current interval are identified.
Page (I−1)+J of the document is scanned for RIP-reusable elements (108). For example, when the first interval I=1 is being processed, the first page (i.e., page 1) of the document is first scanned, since (I−1)+J=(1−1)+1=1. When the second interval I=N+1 is processed, page N+1 of the document is first scanned, since (I−1)+J=(N+1−1)+1=N+1. Scanning a page of the document for RIP-reusable elements is not limited to any particular technique, and may correspond to the type of the source electronic document (e.g., PDF, PostScript, and so on).
If the page that has just been scanned contains any RIP-reusable elements (110)—i.e., the page is reusable and is not unique—then the current interval page J is incremented by 1 (112). However, if the page is unique (110)—i.e., it does not include any RIP-reusable element—then the unique page counter K is incremented by 1 (114). If the number of unique pages K that have been identified in the current interval is not greater than M (116)—that is, no more than M unique pages have been counted thus far in the current interval—then the current page J is incremented by 1 (112).
However, if the number of unique pages K that have been identified in the current interval exceeds the number of permitted unique pages M within the current interval (116), then the reuse scan process is aborted (118). The method 100 continues by initiating a process to determine whether (and at what document page) to restart reuse scanning (119). Different techniques for determining whether to restart the reuse scan process are described later in the detailed description.
Once the current interval page J has been incremented (112), the method 100 checks to determine if all the pages of the source electronic document have been scanned for RIP-reusable elements. That is, the method 100 determines whether the document page most recently scanned is the last page of the document. The document has L pages, from one through L.
Therefore, if (I−1)+J is greater than the total number of pages L within the document (120), there are no more pages to scan for RIP-reusable elements. The method 100 continues by performing RIP and printing the converted document (122). The same device that performs the reuse scan process of the method 100 may also perform RIP on the document, or another device may perform RIP, with information as to which pages include RIP-reusable elements, and the identity of these elements, provided by the device that performed reuse scanning. The RIP of the document thus leverages the identified RIP-reusable elements to perform RIP more quickly. The same device that performs RIP may be the same device that actually prints the document (i.e., the printing device), or a device other than the printing device may perform RIP.
If there are more pages to scan for RIP-reusable elements, however, then the method 100 checks to determine if all the pages of the current interval of N pages have been scanned. That is, the method 100 determines whether the document page most recently scanned is the last page of the current interval of N pages. If there are more pages within the current interval to scan (124), i.e., if J is not greater than N, then the method 100 is repeated at part 108 with the next document page.
However, if the page most recently scanned is the last page of the current page interval (124)—i.e., if J is greater than N—then the method 100 increases the current page interval I by N (126). That is, the current page interval is advanced to (the first page of) the next page interval of N pages in the document. The method 100 is repeated at part 104, with the resetting of the current interval page J back to 1, to continue the reuse scan process at the first page of this next interval.
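The flow of the method 100 described above can be sketched as follows, with comments keyed to the part numbers. This is an illustrative reading rather than a definitive implementation: the `is_reusable` callable is a hypothetical stand-in for the per-page scan of part 108, and the `start` argument and the returned tuple are assumptions added so that a restart can re-enter with a new value of I.

```python
from typing import Callable, Tuple


def reuse_scan(total_pages: int, n: int, m: int,
               is_reusable: Callable[[int], bool],
               start: int = 1) -> Tuple[str, int, int]:
    """Return (status, I, J), where status is "aborted" or "complete"."""
    i = start                          # part 102 (or a restart value of I)
    while True:
        j = 1                          # part 104: relative page in interval
        k = 0                          # part 106: unique pages in interval
        while True:
            page = (i - 1) + j         # part 108: absolute page to scan
            if not is_reusable(page):  # part 110: page is unique
                k += 1                 # part 114
                if k > m:              # part 116: too many unique pages
                    return ("aborted", i, j)   # part 118
            j += 1                     # part 112
            if (i - 1) + j > total_pages:      # part 120: no pages left
                return ("complete", i, j)      # part 122: RIP and print
            if j > n:                  # part 124: interval finished
                i += n                 # part 126: advance to next interval
                break                  # repeat from part 104


# Two-sided labels: even pages share content; L=1,000, N=10, M=6.
print(reuse_scan(1000, 10, 6, lambda page: page % 2 == 0))
# A document with no reusable elements aborts on its 7th page.
print(reuse_scan(1000, 10, 6, lambda page: False))
```

On an abort, the returned I and J identify the last scanned page as I+J−1, which is the quantity the restart techniques work from.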
In the technique of the example method 200, reuse scanning is restarted every P pages. The parameter P may be preconfigured. The parameter P is greater than the interval N. For instance, the document may have L=1,000 pages, and if there are more than M=6 unique pages in an interval of N=10 pages, reuse scanning is aborted. In the example method 200 of
Specifically, reuse scanning begins with the first interval of P=200 pages within the document. If reuse scanning is aborted during any of these first P=200 pages of the document, then reuse scanning is restarted with the next P=200 pages of the document (i.e., at page P=200+1=page 201 of the document). However, if, when reuse scanning began with the first interval of P=200 pages within the document, reuse scanning was aborted at page 295—i.e., within the second interval of P=200 document pages—then reuse scanning is restarted with the third interval of P=200 document pages (i.e., at page 2*(P=200)+1=page 401 of the document).
When a document is first scanned for reuse (i.e., when the method 100 of
As such, if I+J is greater than H*P+1 (204), then the counter H is incremented again at part 202 of the method 200. The most recently scanned page is page I+J−1 of the document, and the next document page that has not been scanned is page I+J. The first page of the next interval of P pages is page H*P+1 of the document. This means that if the next document page to be scanned is past the first page of the next interval of P pages (i.e., I+J>H*P+1), then this interval is skipped (i.e., H is again incremented by 1), and the following interval of P pages is considered, until an interval is identified having a first page beyond the most recently scanned page.
Once the interval of P pages beyond the last scanned page has been identified (204), then I is set to the first page of this interval (206). That is, I is set to H*P+1. The process of incrementing the counter H until an interval of P pages is identified beyond the last scanned page can result in H*P+1 exceeding the number of pages L in the document. If I is greater than L (208), then reuse scanning cannot be restarted, since there is no remaining interval of P pages within the document that includes a page that has not yet been scanned for RIP-reusable elements. In this instance, the document can undergo RIP and the converted document printed (210), as in part 122 of the method 100. If I is not greater than L, though, the reuse scan process is restarted at part 104 of the method 100 (212), with the resetting of J to the first page of the interval of N pages identified by I.
In the technique of the example method 220, reuse scanning is restarted at user-specified page ranges. For example, when generating a source electronic document to be printed, the user may explicitly identify different page ranges that each should be scanned for RIP-reusable elements. The user may, within a document of L=1,000 pages, explicitly tag that reuse scanning should occur within the range of pages 1 through 400, within the range of pages 401 through 700, and within the range of pages 701 through 1,000.
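The search for the next interval of P pages can be sketched as below. The function name is hypothetical, and starting the counter H at zero on every call is an assumption made here for simplicity; because H is advanced until H*P+1 is no longer behind page I+J, the result is the same whether or not H persists between calls.

```python
from typing import Optional


def next_restart_page(i: int, j: int, p: int, total_pages: int) -> Optional[int]:
    """Return the new I at which to restart reuse scanning, or None.

    i and j are the values of I and J when reuse scanning was aborted, so
    the last scanned page is i + j - 1 and the next unscanned page is i + j.
    """
    h = 0                       # assumption: counter H starts from zero
    while True:
        h += 1                  # part 202
        if i + j > h * p + 1:   # part 204: interval start already passed
            continue            # skip this interval of P pages
        break
    new_i = h * p + 1           # part 206
    if new_i > total_pages:     # part 208: no interval left in the document
        return None             # part 210: go on to RIP and print
    return new_i                # part 212: restart at part 104 of method 100


# P=200, L=1,000, N=10, as in the example above.
print(next_restart_page(i=1, j=7, p=200, total_pages=1000))    # aborted at page 7
print(next_restart_page(i=291, j=5, p=200, total_pages=1000))  # aborted at page 295
print(next_restart_page(i=991, j=5, p=200, total_pages=1000))  # aborted at page 995
```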
The method 220 thus determines whether there is any user-specified page range within the document beginning at a page beyond the most recently scanned page of the document. If there is not any user-specified page range beginning after this last scanned page (222), which is page I+J−1, then reuse scanning is not restarted. As such, RIP can be performed and the resulting converted document then printed (228), as in part 122 of the method 100. However, if there is a user-specified page range within the document beginning at a page beyond page I+J−1 (222), then I is set to the first page of this page range (224). The reuse scan process is restarted at part 104 of the method 100 (226).
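A minimal sketch of this lookup follows. It assumes the user-specified ranges are available as (first page, last page) tuples, which is a representation chosen here for illustration; the text does not say how the ranges are encoded in the document.

```python
from typing import List, Optional, Tuple


def next_user_range_start(i: int, j: int,
                          ranges: List[Tuple[int, int]]) -> Optional[int]:
    """Return the first page of the next user-specified range, or None.

    i and j are the values of I and J when reuse scanning was aborted, so
    the most recently scanned page is i + j - 1.
    """
    last_scanned = i + j - 1
    # Part 222: look for ranges that begin beyond the last scanned page.
    later_starts = [first for first, _last in ranges if first > last_scanned]
    if not later_starts:
        return None             # part 228: go on to RIP and print
    return min(later_starts)    # part 224: new value of I


ranges = [(1, 400), (401, 700), (701, 1000)]
print(next_user_range_start(1, 7, ranges))     # aborted at page 7
print(next_user_range_start(441, 10, ranges))  # aborted at page 450
print(next_user_range_start(701, 5, ranges))   # aborted at page 705
```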
In the technique of the example method 240, the page interval I can be increased by the parameter N (242), so that determining whether to restart reuse scanning begins at the next page interval within the document. In another implementation, the interval I may instead be set to the next page after the current page at which reuse scanning was aborted. That is, the interval I may be set to I+J (i.e., the interval I may be increased by J), where reuse scanning was aborted at page I+J−1.
If I is beyond the last page within the document, then reuse scanning is not restarted. Therefore, if I is greater than the total number of pages L within the document (244), RIP can be performed and the resulting converted document then printed (246), as in part 122 of the method 100. However, if I is not greater than L (244), then the method 240 continues by scanning page I to determine whether to restart reuse scanning (248).
In general, the scanning performed in part 248 is less processor intensive than the scanning of a page for RIP-reusable elements in part 108 of the method 100. Therefore, the scanning of part 248 occurs more quickly than the scanning of part 108. Scanning a page of a document to determine whether to restart reuse scanning can be performed in a number of different ways.
For example, the source electronic document may include a PDL-specified tag that denotes pages that are related to one another. Such a tag may specify a print job number, or another identifier of a print job, such as a globally unique identifier (GUID). Therefore, determining whether to restart scanning at a page is achieved by determining whether the value of this tag changes from a prior page to the current page. When a print job is generated, the computer program that generates the print job includes the tag within at least the first page of the print job. When the print job is merged with other print jobs to form the source electronic document, the tag thus demarcates the boundary between each pair of adjacent print jobs, so that reuse scanning can be performed for each print job included within the document.
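The tag comparison can be sketched as follows. The per-page tag values are given as a plain list, which is an assumed representation, and a page with no tag (None) is assumed to belong to the same job as the page before it, since the tag may appear only on the first page of a job.

```python
from typing import List, Optional


def job_boundary_pages(job_tags: List[Optional[str]]) -> List[int]:
    """Return the 1-based pages at which the job tag value changes.

    job_tags[i] is the tag value found on page i+1, or None when that page
    carries no tag (assumed here to mean "same job as the previous page").
    """
    boundaries = []
    current_job = None
    for page, tag in enumerate(job_tags, start=1):
        if tag is None:
            continue
        # The first tagged page opens the first job and is not a change.
        if current_job is not None and tag != current_job:
            boundaries.append(page)
        current_job = tag
    return boundaries


tags = ["job-A", None, None, "job-B", "job-B", None, "job-C"]
print(job_boundary_pages(tags))
```

Each reported page is a candidate point at which reuse scanning could be restarted.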
As a second example, a page of the source electronic document may be scanned to determine whether a change in content of the document occurs at that page. As such, the RIP-reuse scan process is restarted at the page at which a change in content of the document occurs. For instance, the source electronic document may include high-level PDL objects, such as XObjects in the case of PDF-formatted documents. The scanning process of
In the former example, the software that generates the print job includes within each page of a print job—or at least within the first page of the print job—a tag with a value that identifies the print job. However, not all print job-generating software may include such a tag. By comparison, in the latter example, generation of the print job itself involves the generation of objects formatted in a given PDL, which can then be scanned to determine whether there is a significant change in content of a document including the print job. The change in content may occur between print jobs included in the same document, or may occur within a particular print job in the document.
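The page-by-page loop of the method 240 can be sketched as below. The `signals_restart` callable is a hypothetical stand-in for the cheaper check of part 248 (a tag comparison or a content-change test), and the caller is assumed to have already advanced I as in part 242.

```python
from typing import Callable, Optional


def find_restart_page(first_candidate: int, total_pages: int,
                      signals_restart: Callable[[int], bool]) -> Optional[int]:
    """Return the page at which to restart reuse scanning, or None."""
    i = first_candidate          # part 242 has already advanced I
    while i <= total_pages:      # part 244: stop once I exceeds L
        if signals_restart(i):   # part 248: the cheaper per-page check
            return i             # restart reuse scanning at this page
        i += 1                   # otherwise move on to the next page
    return None                  # part 246: go on to RIP and print


# Reuse scanning was aborted in the first interval (I=1, N=10), so the
# check starts at page I+N=11; the content changes at page 152.
print(find_restart_page(11, 1000, lambda page: page == 152))
```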
If reuse scanning is to be restarted (250), then the reuse scan process is restarted at part 104 of the method 100 (252). If reuse scanning is not to be restarted (250), then I is incremented (254), and the method 240 is repeated at part 244 for this new value of I. While three different examples of determining when to restart reuse scanning have been described in relation to
As such, the method 300 can include determining whether to abort reuse scanning. For instance, the number of pages of the document that have already been scanned and that contain no RIP-reusable elements may be compared against a threshold. The threshold may correspond to the maximum number of pages M within a given interval of pages N that are permitted to contain no RIP-reusable elements, as has been described in relation to the method 100 of
The method 300 determines whether to restart reuse scanning, based on a parameter (306). The parameter may correspond to a specified number of pages of the document, as has been described in relation to the method 200 of
The memory 504 stores instructions 506, such as program code. The processor 502 can execute the instructions 506 to, after RIP-reusable element scanning has already been aborted, initiate scanning of the document that is to undergo RIP and then be printed, to detect a change in content within the document (508), as has been described in relation to the method 240 of
The techniques described herein thus improve the capabilities of a computing device by improving the performance of such a device in readying a source electronic document for printing. That is, the techniques reduce the length of time needed to perform RIP, by identifying RIP-reusable elements within the document. More specifically, the techniques improve the RIP-reusable element identification process, by restarting this process after it has been aborted. As such, printing performance itself is improved, because the length of time between when printing of a source electronic document is initiated and when printing of the document actually starts is decreased. Such improvement in the capabilities of a computing device, which can be a printing device itself, provides a concrete improvement in printing technology.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2018/066464 | 12/19/2018 | WO | 00 |