1. Field of the Invention
The present invention relates to a web page printing method, and more particularly, to a method for printing a captured screen of a web page, which scrolls a web browser screen to capture the total region of a web page into a plurality of segments and combines the captured web page screen segments to print or store the same as one web page screen.
2. Description of the Related Art
When a web user accesses a web server by means of a web browser and loads a web page, the web browser displays the web page in a specific screen size according to the monitor resolution of the web user. In general, only a portion of the total web page is displayed on the screen, and a user interface such as a scroll bar is provided to enable the user to view the other portions of the web page.
For example, when the monitor resolution of the web user is set to 600×400 and a web page with a total size of 1000×1300 is loaded on the web browser as illustrated in
In this manner, when the web browser displays only a portion of the total web page, the web user may print the current web page through a print function of the web browser or other print program.
However, if a print function of the web browser or other print program is used to print the web page displayed on a web browser screen, a screen captured through a screen capture function is converted into a bitmap format prior to printing, thus degrading the quality of the printed material. Also, because the maximum size of bitmap generation is restricted by the memory capacity of a web user computer system, if the web page is long, there is a high possibility that the web page may be abnormally printed. For example, if the print function of the web browser is used to print a web page as illustrated in
In contrast, a technique of converting a captured screen into an Enhanced Meta File (EMF) format prior to printing has been proposed as a scheme for improving the quality of the printed material of the web page displayed on the web browser screen. In this case, if the total size of the web page is much larger than the current monitor resolution of the web user, the contents exceeding the monitor screen of the web user are not printed as illustrated in
For reference, a metafile such as an EMF is a format for storing a raster-type graphic element (e.g., a bitmap) and a vector-type graphic element (e.g., a line, a figure and a letter) together. In this context, the use of a web screen rendering interface of a web browser makes it possible to divide the HTML elements of a web page into a vector-type graphic element and a raster-type graphic element prior to extraction. That is, the bitmap of the web page is stored in a raster format and the letter, line and figure of the web page are stored in a vector format.
An object of the present invention is to provide a method for printing a captured screen of a web page, which, if the total size of a web page displayed on a web browser screen is greater than the monitor resolution of a web user, automatically scrolls the web browser screen in the horizontal/vertical direction to capture the total region of the web page into a plurality of segments and combines the captured web page screen segments to print or store the same as one web page screen.
According to an aspect of the present invention, there is provided a method for printing a captured screen of a web page, including: a first process of loading, by a web user, a capture printing agent for a capture screen print service on a web browser through a web browser loading module included in the capture printing agent; a second process of obtaining, by a web browser control module included in the capture printing agent, the total size of the web page and the size of a web page of the current web browser screen display region to calculate the necessary number of automatic horizontal/vertical scrolls of the web browser screen, in the event of a printing request of the web user in the state where the capture printing agent is loaded on the web browser; a third process of automatically scrolling, by a scroll capturing module included in the capture printing agent, the web browser screen in the horizontal/vertical direction according to the necessary number of the scrolls calculated by the web browser control module and using a metafile Device Context (DC) among the Graphic Device Interface (GDI) subsystem functions of a Windows operating system to capture the total region of the web page into a plurality of metafile segments; a fourth process of removing, by a clipping module included in the capture printing agent, unnecessary graphic elements from the captured metafile segments to generate secondary metafiles including only pure web page contents; and a fifth process of combining, by a capture screen printing module included in the capture printing agent, the secondary metafiles generated by the clipping module to print or store the same as one web page screen.
In order to provide a function for improving the printing quality while printing a web page, viewed through a web browser by a web user, in the same format as a display screen, the inventor has developed a Layered Meta File (LMF) technique for printing/storing captured web page screen segments as one web page screen by combining the captured web page screen segments in a partially layered manner.
The LMF technique according to the present invention is an application of the conventional Enhanced Meta File (EMF) technique. The LMF technique according to the present invention captures, stores and prints a web page, viewed through a web browser by a web user, not in a bitmap format but in a format that stores graphic commands and graphic data together, thus making it possible to store the graphic data of the web page without degrading the printing quality of the web page.
A Windows Operating System (OS) processes a graphic task through a subsystem such as a Graphic Device Interface (GDI). The GDI manages all of the output-related information such as font, color, thickness, pattern and output format by using a data structure such as a Device Context (DC). Examples of the DC include a display DC for screen display, a printer DC for printing, a memory DC used for bitmap output, and a metafile DC for acquisition of graphic information. Among them, the metafile DC may be used to acquire information about the graphic commands that an application executes through the GDI for display. The present invention uses the metafile DC and a web screen rendering interface of a web browser to capture the contents of a web page that is being displayed by the web browser. Compared to the bitmap-based technique, the technique of the present invention can store more graphic information while using much less memory.
In the metafile, each graphic command executed by the application is represented by a unique ID value, and the related data and parameters necessary for executing the graphic command are stored in a data structure format. For example, if a line-drawing task in the application is stored in the metafile, a constant value “EMR_LINETO” is used to indicate that the graphic command executed by the application is a line-drawing command, and the end position of the line (Horizontal, Vertical) is stored as a parameter.
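The record layout described above can be illustrated with a minimal sketch. It assumes the publicly documented EMF layout for a line-to record (a 4-byte record type, a 4-byte record size, then the horizontal and vertical coordinates as 32-bit signed integers, all little-endian). The helper names `pack_lineto` and `unpack_lineto` are hypothetical and are not part of the GDI or of the invention.

```python
import struct
from typing import Tuple

# Record type identifier for a line-to command in the EMF format (0x36 = 54).
EMR_LINETO = 0x00000036

# Assumed layout: type (uint32), size (uint32), x (int32), y (int32).
_LINETO_FORMAT = "<IIii"
_LINETO_SIZE = struct.calcsize(_LINETO_FORMAT)  # 16 bytes


def pack_lineto(x: int, y: int) -> bytes:
    """Serialize a line-to command as an ID value followed by its parameters."""
    return struct.pack(_LINETO_FORMAT, EMR_LINETO, _LINETO_SIZE, x, y)


def unpack_lineto(record: bytes) -> Tuple[int, int]:
    """Read back the (Horizontal, Vertical) position stored in the record."""
    record_type, size, x, y = struct.unpack(_LINETO_FORMAT, record)
    if record_type != EMR_LINETO or size != _LINETO_SIZE:
        raise ValueError("not a line-to record")
    return x, y


record = pack_lineto(120, 45)
```

The point of the sketch is only that a drawing command is kept as a compact identifier plus parameters rather than as rendered pixels, which is why a metafile can be replayed at printer resolution without loss.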
The present invention loads an agent-type module on a web browser in order to capture the contents of a web page displayed by the web browser through the LMF technique. As described above, the EMF technique and a web screen rendering interface of the web browser are used to capture a web screen displayed by the web browser, wherein the web screen rendering interface provides a function of drawing the current screen of the web browser into a DC supplied as input. At this point, the present invention captures the current screen of the web browser by supplying a metafile DC among the various DCs provided by the Windows OS. However, due to the characteristics of the Windows OS, each running process operates in its own independent address space. Therefore, the present invention uses a web browser extension module such as an ActiveX control, a Browser Helper Object (BHO), a toolbar, a toolbar button or a hooking module to load the agent module into the address space of the web browser. If the agent module is not loaded on the web browser, the metafile DC and the web screen rendering interface of the web browser cannot be used due to the characteristics of the Windows OS described above, and the web screen cannot be correctly captured.
When the agent module is loaded on the web browser, the agent module acquires information about a Document Object Model (DOM) tree structure for the web page and the web screen rendering interface of the web browser. The web browser accesses a web server to download an HTML file type web page and interprets the HTML file to display the web page, wherein the DOM tree structure is a data structure corresponding to an object type of each tag of the HTML file.
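As an illustration of the DOM tree structure described above, the following minimal sketch builds a parent/child tree with one node per HTML tag, using the Python standard-library `html.parser` module. The `Node` and `TreeBuilder` classes and the `build_tree` and `outline` helpers are hypothetical names introduced here; a real web browser exposes its own DOM interfaces, and this sketch ignores void tags such as `<br>` as well as text content.

```python
from html.parser import HTMLParser


class Node:
    """One DOM-like node: a tag name, a parent, and an ordered list of children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree in which each start tag becomes a child of the open tag."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node  # descend into the newly opened tag

    def handle_endtag(self, tag):
        # Assumes well-formed input: every end tag closes the current node.
        if self.current.parent is not None:
            self.current = self.current.parent


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented tag names, one per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


tree = build_tree("<html><body><h1>Title</h1><p>Text</p></body></html>")
```

Calling `outline(tree)` on this input lists `html`, `body`, `h1` and `p` at increasing indentation, mirroring the tag nesting of the HTML file.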
As illustrated in
Thus, in order to capture a screen of the total web page, the present invention acquires web screen segments while continuing to forcibly scroll a web screen to certain positions by means of the function of a web browser, and combines the web screen segments to acquire the total web page. In order to detect the number of scrolls of the web browser screen necessary for acquiring the screen segments of the total web page, the present invention uses the acquired DOM information to obtain the actual size of the web page loaded by the web browser on a pixel basis. Thereafter, the present invention obtains the current screen size of the web browser and uses the two values to calculate the number of horizontal/vertical scrolls necessary for displaying the total web page on the web browser screen.
Thereafter, a clipping module is used to remove graphic elements irrelevant to the web page (a portion of the web browser screen) for the primary metafile to generate a secondary metafile. Upon completion of the above processes, the web page is captured into a plurality of segments. The segments are stored in a data structure format together with the coordinates on the actual web page obtained through the DOM tree structure. The LMF module combines position information and captured web page screen segments so that they can be stored and printed as one web page. The captured web page screen may be stored in a file for later use.
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principle of the invention. In the drawings:
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
The capture printing agent 10 includes a web browser loading module 11, a web browser control module 12, a scroll capturing module 13, a clipping module 14, and a capture screen printing module 15.
The web browser loading module 11 loads the capture printing agent 10 on the web browser, and may be implemented using a web browser extension module such as an ActiveX control, a Browser Helper Object (BHO), a toolbar, a toolbar button or a hooking module.
For example, the web developer may use the ActiveX technique to have the capture printing agent 10 loaded through the HTML of a web page. Also, the web user may use one of the BHO, toolbar and toolbar button of the web browser to load the capture printing agent 10 on the web browser through an installation program.
The web browser control module 12 obtains the total size of the web page and the size of a web page of the current web browser screen display region to calculate the necessary number of automatic horizontal/vertical scrolls of the web browser screen, in the event of a printing request of the web user in the state where the capture printing agent 10 is loaded on the web browser.
The scroll capturing module 13 automatically scrolls the web browser screen in the horizontal/vertical direction according to the necessary number of the scrolls calculated by the web browser control module 12 to capture the total region of the web page into a plurality of metafile segments.
The clipping module 14 removes unnecessary graphic elements from the captured web page screen segments, i.e., the captured metafile segments, to generate secondary metafiles including only pure web page contents.
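The description does not specify how the clipping module 14 decides which graphic elements are unnecessary. One possible approach, shown here purely as a sketch, is to keep only those records whose bounding rectangle intersects the web page display region. The `Rect` and `Record` types, the `clip_records` function and the sample records are all hypothetical and serve only to illustrate the idea.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


class Record(NamedTuple):
    name: str     # hypothetical label for a captured graphic element
    bounds: Rect  # bounding rectangle of the element in screen coordinates


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share at least one pixel."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def clip_records(primary: List[Record], content: Rect) -> List[Record]:
    """Keep only records that fall at least partly inside the content region."""
    return [record for record in primary if intersects(record.bounds, content)]


# Assumed sample: a 600x400 display region with a scroll bar drawn beside it.
content_region = Rect(0, 0, 600, 400)
primary_metafile = [
    Record("scroll bar", Rect(600, 0, 617, 400)),
    Record("text line", Rect(10, 10, 200, 30)),
]
secondary_metafile = clip_records(primary_metafile, content_region)
```

In this sample the scroll bar lies entirely outside the display region and is dropped, while the text line is kept.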
The capture screen printing module 15 combines the secondary metafiles generated by the clipping module 14 to print or store the same as one web page screen.
A metafile such as the secondary metafile stores graphic elements such as lines, figures and letters included in the web page in a vector format. Therefore, if the secondary metafiles are simply placed side by side prior to output, a gap may appear between the secondary metafile segments in a printing or storing operation. In order to compensate for this, the capture screen printing module 15 combines the secondary metafile segments in a partially layered manner, for example, with an overlap of about 1 to 3 pixels.
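One possible reading of the partially layered combination is that each segment is drawn into a destination rectangle that extends a few pixels past its right and bottom edges, so that neighboring segments overlap instead of leaving a gap. The following sketch computes such rectangles. The function name `layered_rects`, the `(x, y, width, height)` tuple layout and the default overlap of 2 pixels are assumptions made for illustration only.

```python
def layered_rects(segments, page_w, page_h, overlap=2):
    """Grow each segment's destination rectangle so that neighbors overlap.

    segments: list of (x, y, width, height) tuples in page coordinates.
    Returns a list of (x, y, width, height) tuples, clamped to the page.
    """
    if overlap < 0:
        raise ValueError("overlap must not be negative")
    result = []
    for x, y, w, h in segments:
        # Extend toward the right and bottom neighbors, but never past the page.
        right = min(x + w + overlap, page_w)
        bottom = min(y + h + overlap, page_h)
        result.append((x, y, right - x, bottom - y))
    return result


# Two segments side by side on a 1000x400 page.
placed = layered_rects([(0, 0, 600, 400), (600, 0, 400, 400)], 1000, 400)
```

Here the first segment is widened to 602 pixels so that it overlaps the second by 2 pixels, while the second segment, which already touches the page edge, is left unchanged.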
As described above, the capture screen printing module 15 combines the captured web page screen segments including only pure web page contents, i.e., the secondary metafile segments in a partially layered manner to print or store the same as one web page screen. Thus, the capture screen printing module 15 is referred to as a Layered Meta File (LMF) module.
Hereinafter, a detailed description will be given of a process of printing a captured screen of a web page according to the present invention, which is performed according to an operation of the capture printing agent 10.
When the web user accesses a web server to load a web page to a web browser, the web browser loading module 11 of the capture printing agent 10 loads the capture printing agent 10 on the web browser (S10).
At this point, an ActiveX technique may be used so that the capture printing agent 10 is loaded through the HTML of the web page when the web user accesses the web page. Alternatively, one of the BHO, toolbar and toolbar button of the web browser may be used to enable the web user to load the capture printing agent 10 on the web browser through an installation program.
The web browser control module 12 obtains the total size of the web page and the size of a web page of the current web browser screen display region to calculate the necessary number of automatic horizontal/vertical scrolls of the web browser screen, in the event of a printing request of the web user in the state where the capture printing agent 10 is loaded on the web browser (S12).
Hereinafter, a detailed description will be given of the process (S12) for calculating the necessary number of scrolls by the web browser control module 12.
The web browser accesses a web server to download an HTML file type web page, and interprets the HTML file for screen display. At this point, each tag defined in the HTML file is internally managed as one object. Also, because the objects have a parent/child relationship, the web browser manages the objects in a tree format, which is called a Document Object Model (DOM) tree. In the DOM tree, each object corresponds to one tag in the HTML file and is called a DOM node. The DOM node provides various information such as the style information of a tag set in a CSS (cascading style sheets) format in the HTML file, the pixel-based position value of the node as rendered by the web browser, and the HTML source information and the tag name corresponding to the node. For reference,
The web browser control module 12 obtains a DOM tree from the web browser, obtains the DOM node representing the web page (the HTML tag or the BODY tag of the HTML file) among the nodes of the DOM tree, and obtains the total size of the node and the size of the region displayed on the web browser screen on a pixel basis.
When the total size of the node is obtained, the actual total size of the web page can be obtained on a pixel basis. Also, when the size of a region displayed on the web browser screen is obtained, the size of a region among the web page contents displayed in the web browser can be obtained.
Herein, the obtained value is not the window size of the web browser but the size of a web page display region in the window of the web browser.
For example, as illustrated in
As illustrated in
Because the region of the web page displayed on the web browser screen is 600 pixels wide, 2 horizontal scrolls (1000/600, rounded up) are necessary to display the total web page. In general, the necessary number of scrolls in each direction is obtained by dividing the size of the total web page by the size of the region displayed on the web browser and rounding the result up to a whole number.
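The division described above, rounded up to a whole number, can be sketched as follows. The function name `scroll_counts` and its parameter names are hypothetical; the sizes are those of the 1000×1300 web page and 600×400 display region used in this description.

```python
def scroll_counts(page_w, page_h, view_w, view_h):
    """Return (horizontal, vertical) scroll counts needed to cover the page.

    Each count is the page size divided by the displayed region size,
    rounded up, computed with integer arithmetic to avoid float rounding.
    """
    if view_w <= 0 or view_h <= 0:
        raise ValueError("display region must have a positive size")
    horizontal = max(1, (page_w + view_w - 1) // view_w)
    vertical = max(1, (page_h + view_h - 1) // view_h)
    return horizontal, vertical


horizontal, vertical = scroll_counts(1000, 1300, 600, 400)
```

For these sizes the sketch yields 2 horizontal and 4 vertical scrolls, that is, 8 screen segments in total.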
That is, 2 horizontal scrolls (1000/600, rounded up) and 4 vertical scrolls (1300/400, rounded up) are necessary. Thus, as illustrated in
After the necessary number of automatic horizontal/vertical scrolls of the web browser screen is calculated by the web browser control module 12 (S12), the scroll capturing module 13 automatically scrolls the web browser screen in the horizontal/vertical direction according to the calculated scroll number to capture the total region of the web page into a plurality of segments, i.e., a plurality of metafile segments (S14).
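The sequence of scroll positions visited in step S14 can be sketched as a grid of offsets. This sketch assumes that the web browser cannot scroll past the end of the web page, so the last offset in each direction is clamped to the page size minus the display region size; that clamping, the row-by-row order, and the function name `scroll_offsets` are assumptions not stated in the description.

```python
def scroll_offsets(page_w, page_h, view_w, view_h):
    """Return the (x, y) scroll offset for each segment, row by row."""
    if view_w <= 0 or view_h <= 0:
        raise ValueError("display region must have a positive size")
    cols = max(1, (page_w + view_w - 1) // view_w)
    rows = max(1, (page_h + view_h - 1) // view_h)
    # Assumed browser behavior: offsets cannot exceed page size minus view size.
    max_x = max(0, page_w - view_w)
    max_y = max(0, page_h - view_h)
    offsets = []
    for row in range(rows):
        for col in range(cols):
            offsets.append((min(col * view_w, max_x), min(row * view_h, max_y)))
    return offsets


offsets = scroll_offsets(1000, 1300, 600, 400)
```

For the 1000×1300 web page and the 600×400 display region, the sketch produces 8 offsets, starting at (0, 0) and ending at (400, 900). Under the clamping assumption, the last column and the last row overlap their neighbors, so the overlapping part would have to be trimmed before the segments are combined.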
At this point, as illustrated in
Also, as illustrated in
After the scroll capturing module 13 automatically scrolls the web browser screen in the horizontal/vertical direction to capture the total region of the web page into a plurality of segments, i.e., a plurality of metafile segments as described above (S14), the clipping module 14 removes unnecessary graphic elements from the captured web page screen segments (i.e., a plurality of metafile segments) to generate secondary metafiles including only pure web page contents (S16).
The secondary metafile segments are stored in a data structure format together with their coordinates on the actual web page, which are obtained from a DOM node of the DOM tree that the web browser control module 12 obtains from the web browser. Thereafter, the capture screen printing module 15 combines the position information and the captured web page screen segments so that they can be stored and printed as one web page, and the result may be stored in a file for later use.
After the secondary metafiles are generated by the clipping module 14 as described above, the capture screen printing module 15 combines the secondary metafiles generated by the clipping module 14 to print or store the same as one web page screen as illustrated in
At this point, a metafile such as the secondary metafile stores graphic elements such as lines, figures and letters included in the web page in a vector format. Therefore, if the secondary metafiles are simply placed side by side prior to output, a gap may appear between the secondary metafile segments in a printing or storing operation. In order to compensate for this, the capture screen printing module 15 combines the secondary metafile segments in a partially layered manner, for example, with an overlap of about 1 to 3 pixels. In the case of
As described above, the present invention can overcome the following problems: first, the quality of a printed material degrades when a captured screen of a web page displayed on a web browser screen is converted into a bitmap format prior to printing; second, a web page viewed by a web user through a web browser screen is not printed identically, due to the restricted maximum size of a bitmap, if the web page is long; and third, the web page contents exceeding the monitor screen of a web user are not printed, if the total size of the web page is greater than the monitor resolution of the web user, when a captured screen is converted into an Enhanced Meta File (EMF) prior to printing in order to improve the quality of a printed material.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention. Thus, it is intended that the present invention covers the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.