The World Wide Web (Web) has grown and expanded rapidly since its inception. Additionally, since personal computers came into widespread household use, the Web has gained popularity among consumers and casual users alike. Thus, it is no surprise that the Web has become an enormous repository of data, containing valuable information and various kinds of interactive resources. For example, Web sites often provide up-to-date news and reporting as well as interactive applications that may change dynamically. Web sites are usually implemented with Hypertext Markup Language (HTML) and JavaScript. Cascading Style Sheets (CSS) are also often used in modern Web pages for their flexibility in specifying various visual effects. Displaying a Web site may call for formatting the style and calculating the layout of a Web page file. Unfortunately, many redundant calculations may be performed in order to display such potentially dynamic content, particularly when subsequent requests for the same page are made.
Over time, advances in network technology and hardware infrastructures have significantly increased network speed and decreased overall Internet download times. Additionally, with the advent of multi-core processors, computing devices have become extremely fast and efficient at processing digital content. In many cases, however, a bottleneck may occur at the computing device because a browser may process Web content in an essentially single-threaded manner and may not exploit the multi-core processors of a modern client device. Additionally, local Web content processing by a browser may include both style formatting and layout calculations. Eliminating redundant operations in style formatting and layout calculation can speed up local Web content processing. Unfortunately, adequate tools do not exist for effectively caching Web style formats and/or layout calculations. Existing caching tools merely cache an entire HTML page and do not help reduce redundant operations in either style formatting or layout calculation.
This summary is provided to introduce simplified concepts for style and layout caching of Web content, which are further described below in the Detailed Description. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter. Generally, the style and layout caching of Web content described herein involves using document object model (DOM) trees constructed from Web page files to create style caching trees which can be used to cache the style formatting of Web pages at a DOM element granularity and/or cache layout calculations performed based at least in part on render trees constructed from the same, or different, Web page files.
The detailed description is set forth with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
This disclosure describes style and layout caching for Web content. In particular, systems and recursive methods are presented for creating DOM trees from received Web page files, constructing and caching style caching trees from the DOM trees, performing layout calculations based on render trees, caching the layout calculations, and receiving new Web page files to process. The recursive process may repeat when new Web pages are requested by a Web browser or when changes are found among the style or layout properties of newly received Web page files.
In one aspect, Web page style and layout caching methods may be configured to receive a Web page file, parse the file to create a DOM tree, and construct a style caching tree based at least in part on the DOM tree. In this context, the Web page file may be received from a local memory or from a network storage device or server. Additionally, each DOM tree may contain DOM nodes with parent nodes and/or child nodes. In constructing the style caching tree, style properties may be calculated for each DOM node. Additionally, the style caching tree may be constructed by recursively parsing the DOM tree until each DOM node is represented as a style element. The construction may further include merging sibling style elements that share the same selectors of the CSS rules to be cached. An example of CSS rules to be cached may include rules with the selectors of identifier, class, and tag name as well as the basic descendant and child relationship in the DOM tree. These selectors may be referred to as “normal” selectors. Further, a CSS rule set for the Web page file, the style properties, and a matched CSS rule list for each DOM node, along with the style caching tree, may be stored in a local cache. Based at least in part on the DOM tree, a render tree may be constructed, and layout calculations may be performed for each render object in the render tree. The resulting layout data may also be stored in a local cache with the corresponding elements in the style caching tree. In this way, successive style formatting and layout calculations, for example those performed when the Web page files and their CSS rule sets are received at a successive visit to the Web pages, can be checked against the cached content to determine whether any redundant calculations may be avoided.
In some instances, only style caching elements with normal selectors will be cached. In other instances, when current DOM nodes match previous DOM nodes, the cached content may be used. Additionally, if current CSS rule sets match previous CSS rule sets, the cached content may be used. In some instances, render boxes, render blocks, render buttons, render text controls, render texts, render images, and inline render objects may be cached. In other instances, when current render tree elements match previous render tree elements, the cached content may be used. When style rules for a current DOM element are different from cached ones, the style of the DOM node may be recalculated. The result may then be cached. When layout features for a current render tree element differ from previous layout features, the layouts for the render tree element may be recalculated. The new results may then be cached.
In another aspect, style caching of Web content may be effectuated by receiving a Web page file, creating a DOM tree from the page file, constructing a smart style caching (SSC) tree based on the DOM tree, and caching the SSC tree. Similarly, in this context, the Web page file may be received from local memory or over a network. Additionally, the SSC tree may be constructed by calculating style properties for each element of the DOM tree, parsing the DOM tree and representing each parsed element as an SSC element in the SSC tree, and merging sibling SSC elements that share the same CSS selectors to be cached. In one instance, CSS rules with normal selectors are cached. The cached SSC tree may be used to avoid recalculating style properties for DOM nodes by determining if rules of current DOM nodes match cached rules of the DOM nodes.
In yet another aspect, a layout caching system may be configured to iteratively receive Web page files, create DOM trees from the Web page files, calculate layout information for nodes of the DOM tree, store the layout information in a local cache, and validate the cached layout information. The system may invalidate cached layout information based on determining that global information of a Web browser has changed, determining that a parent node of a given DOM node has changed, determining that the style information of any DOM node in the DOM tree has changed, or determining that layout-related content of the given DOM node has changed.
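The four invalidation conditions listed above can be sketched as a single validity test. The following is a minimal illustration only; the `NodeSnapshot` record, its field names, and the use of simple fingerprints are assumptions made for this sketch and are not part of the described system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeSnapshot:
    """Hypothetical record of the inputs a node's cached layout depended on."""
    global_info: tuple       # assumed: browser-wide values, e.g. (viewport width, font size)
    parent_signature: str    # assumed: fingerprint of the parent DOM node
    style_signature: str     # assumed: fingerprint of the style information in the DOM tree
    layout_content: str      # assumed: layout-related content of the node itself


def cached_layout_is_valid(cached: NodeSnapshot, current: NodeSnapshot) -> bool:
    """Return False as soon as any of the four invalidation conditions holds."""
    if cached.global_info != current.global_info:
        return False  # global information of the Web browser has changed
    if cached.parent_signature != current.parent_signature:
        return False  # the parent node of the DOM node has changed
    if cached.style_signature != current.style_signature:
        return False  # style information in the DOM tree has changed
    if cached.layout_content != current.layout_content:
        return False  # layout-related content of the node has changed
    return True


# Illustrative values only.
cached = NodeSnapshot((1024, 16), "ul#menu", "sheet-v1", "text:Home")
resized = replace(cached, global_info=(800, 16))
```

In this sketch a node whose snapshot is unchanged keeps its cached layout, while the `resized` snapshot fails the first check and would trigger a new layout calculation.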
As discussed above, many redundant style and formatting calculations are performed when new Web pages are requested by a Web browser or a user. Even worse, traditionally, these redundant recalculations may create a bottleneck at the processing stage for displaying Web content. These problems, and the desire for faster style and layout calculations of Web content, are compounded by the ever increasing number of interactive Web applications found on the Internet.
The techniques described in this disclosure may be used for effectively solving the foregoing problems by caching Web page style and/or layout properties found in associated Web page files. Additionally, the techniques may cache style properties, layout properties, or both style and layout properties together.
A Web browser, or other application stored in a memory of a computing device, may request a Web page from a Web server, or other device for storing Web pages, and may receive a Web page file in response to the request. The Web browser may also receive Web content style specifications in the form of CSS files provided by the creator of the Web page. Generally, a CSS file may be associated with a requested Web page. Alternatively, CSS may be embedded within a Web page file.
A Web browser may also form a DOM tree to represent each element of a Web page file in a tree structure. As such, each element, or tag, in a Web page file may be represented as a single node in the DOM tree. As discussed above, a style caching tree may be formed based, at least in part, on a DOM tree and an associated CSS file. This style caching tree may then be cached to avoid redundant calculations and may be referenced when the same URLs are requested.
Additionally, or alternatively, a render tree may be formed for displaying the Web page on a display device. The render tree may be based on a DOM tree. The Web browser may perform layout calculations on DOM tree nodes for a render tree. Calculated layout results may be cached with corresponding nodes in a style caching tree. Much like the style caching, the calculated layout information may be referenced to avoid redundant layout calculations.
In one aspect, the Web content parser 102 may be configured to parse the Web page file 104 to determine individual Web elements. Web elements may be identified based on HTML tags or descriptors found within the Web page file 104. The Web browser may utilize the parsed Web page information from the Web content parser 102 to build a DOM tree 116 for each individual Web page file 104. In this way, the DOM tree 116 can be built based on the parsed elements of the Web page file 104. Additionally, in one aspect, if scripting language code, such as JavaScript™ code, is found within the Web page file 104, the Web browser may serve the script code to a script engine 118. The script engine 118 may be configured to execute the script code, interact with a user of the Web browser, and/or modify the DOM tree 116 based at least in part on the user's interactions and/or the executed code. However, if the Web content parser 102 does not detect any script code within the Web page file 104, the Web browser may build the DOM tree 116 without modification by the script engine 118.
The Web browser may build a render tree 124 based at least in part on the DOM tree 116. In this way, the Web browser may prepare the data for appropriate layout calculations prior to rendering the Web page on a display device 126. In building a render tree 124, style formatting may be applied to the DOM tree to find out style properties for the DOM nodes. In one aspect, a style caching tree 120 may be constructed from the DOM tree 116 to record style formatting results. As noted above, a style caching tree 120 may be used to represent DOM nodes and associated style properties in tree form. However, other data structures such as linked lists, graphs, etc., may be used to represent the DOM nodes and associated style properties. Further, in one example, the Web browser may cache the style caching tree 120 by storing it in a local memory 122. Local memory 122 may be a cache or other local memory location with relatively short read and/or write latencies. In some instances, successive style formatting 130 operations may retrieve the cached style caching tree 120 to eliminate redundant calculations and/or update the cached style caching tree 120.
In one aspect, the Web browser may perform layout calculations for render objects of the render tree 124. By way of example only, the resulting layout information may contain render data for visible elements of the DOM tree. Additionally, in one aspect, the Web browser may cache the layout information at block 128 by storing it in a local memory 122. In one example, the cached layout information 128 may be stored together with the style caching tree 120 to reduce the data to be cached. However, in other examples, the cached layout information 128 may be stored in a different local memory, such as a different cache or a different computer-readable storage device. In some instances, successive layout calculation 132 may retrieve the cached layout information 128 to eliminate redundant calculations and/or update the cached layout information 128.
Additionally, as shown in
At block 214, the Web browser may construct a style caching tree which may contain the calculated style properties for each DOM node and may have merged elements. As noted above, merged elements may be elements that represent more than one sibling DOM node that share the same ID, Class, TagName triple of CSS rule selectors. The Web browser may then store the style caching tree in a cache at block 216. At blocks 218, 220, and 222, the Web browser may also store the CSS rule set for the Web page, the calculated style properties, and a list of matched CSS rules. At block 224, the Web browser may then construct a render tree based at least in part on the information stored from blocks 216-222. Additionally, at block 226, the Web browser may perform layout calculations for each render object in the render tree. At block 228, the Web browser may cache the layout calculations. The method may then terminate at block 230, where the Web browser may display the Web content to a user.
In some aspects, there may be only one CSS file for each Web page. A CSS file may consist of a set of rules, each rule consisting of at least two parts: a selector and a declaration. The selector of a CSS rule may determine which kind of elements may match the rule. The selector may be simple, such as ID selectors or class selectors, or it may be complex, such as ones that refer to attributes of a DOM node. Thus, developers may define a scope of elements via a selector and then assign specific style values to them. The declaration of a CSS rule, on the other hand, may be a set of values of pre-set style properties, which may determine how selected elements may be displayed. For example, in the CSS rule “p em {color: red},” the selector may be “p em,” which may indicate that <em> elements which are descendants of <p> elements may be selected as the target elements of this particular rule. Additionally, in this example, the declaration part may be “{color: red},” which may define the color property of selected elements as red.
In some aspects, a Web browser may attempt to determine the style of a newly created or modified element. First, the Web browser may check each CSS rule against each DOM node. The selector of a rule may determine whether the rule is a match to the DOM node. Second, matched rules may be applied to the DOM element in a particular order defined in the CSS specification, to generate the style properties of each DOM node.
In one example, the following Web page file:
may be received by a Web browser for displaying content on a display device.
In this example, three rules are bracketed by the <style> and </style> tags. In order to render the <em> element, a Web browser may determine the style of the element. The Web browser may first check each CSS rule provided in the Web page file against the <em> element. In one example, the Web browser may determine that both the first and third rules are a match. According to the CSS specification, these two rules may then be merged. In this case, only the color property is specified by the page author, while other style properties are set as defaults. Additionally, in this example, both matched rules specify the color property and, thus, according to the CSS specification, the value declared in the first rule may be used because it may have a higher priority. Thus, in this example, the text “The second part” may be displayed in red rather than in blue.
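The matching and priority steps described above can be sketched as follows. This is a deliberately simplified illustration: it supports only tag-name selectors joined by the descendant combinator, it approximates priority by counting tag names in the selector, and the three rules and the small DOM fragment are hypothetical stand-ins, not the Web page file of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    tag: str
    parent: Optional["Node"] = None


@dataclass
class Rule:
    selector: str  # tag names separated by spaces (descendant combinator only)
    color: str     # the single declared property in this sketch


def matches(rule: Rule, node: Node) -> bool:
    """Check a descendant-only selector against a node by walking up its ancestors."""
    parts = rule.selector.split()
    if parts[-1] != node.tag:
        return False
    ancestor = node.parent
    for tag in reversed(parts[:-1]):
        while ancestor is not None and ancestor.tag != tag:
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


def resolve_color(rules: list, node: Node, default: str = "black") -> str:
    """Pick the color from the matched rule with the highest simplified priority.

    Priority here is (number of tag names, source order), a rough stand-in
    for the specificity and ordering rules of the CSS specification.
    """
    matched = [(len(r.selector.split()), i, r)
               for i, r in enumerate(rules) if matches(r, node)]
    if not matched:
        return default
    return max(matched, key=lambda m: (m[0], m[1]))[2].color


# Hypothetical rule set: the first and third rules both match the <em> node.
rules = [Rule("p em", "red"), Rule("li", "green"), Rule("em", "blue")]
body = Node("body")
em = Node("em", parent=Node("p", parent=body))
```

With these assumed rules, `resolve_color(rules, em)` selects the value from the first rule because its selector is more specific than that of the third rule.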
Additionally, style caching methods may consider all types of CSS rule selectors or they may only consider normal selectors. In one example, normal selectors may be defined as those selectors involving ID, Class, TagName attributes of a single element, and basic descendant and child relationship information of DOM nodes.
In some aspects, the style caching tree 304 may be similar to a DOM tree 302. In other aspects, the style caching tree 304 may only store structure information of the DOM tree 302. Additionally, each DOM node may have a corresponding style caching element and each style caching element may contain a list of matched rules with normal selectors for each corresponding DOM node. In some instances, the list may be empty if the DOM node has no matched rules with normal selectors. Further, as noted above, sibling DOM nodes with the same <ID, class, TagName> triple may be merged into one node.
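One way to picture the merging of sibling nodes is the sketch below, which builds a style caching tree from a DOM tree and keys each child by its <ID, class, TagName> triple. The class names, the field names, and the choice of a dictionary for the children are assumptions made for illustration; the matched rule list is left empty because rule matching is outside the scope of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    id: str = ""
    cls: str = ""
    children: list = field(default_factory=list)


@dataclass
class StyleElement:
    key: tuple                                         # (ID, class, TagName) triple
    matched_rules: list = field(default_factory=list)  # would be filled by style formatting
    children: dict = field(default_factory=dict)       # triple -> StyleElement


def build_style_tree(dom: DomNode) -> StyleElement:
    """Create a style caching element for the DOM root and recurse into its children."""
    element = StyleElement(key=(dom.id, dom.cls, dom.tag))
    _add_children(element, dom)
    return element


def _add_children(element: StyleElement, dom: DomNode) -> None:
    for child in dom.children:
        key = (child.id, child.cls, child.tag)
        # Siblings with the same triple share one style caching element.
        child_element = element.children.setdefault(key, StyleElement(key=key))
        _add_children(child_element, child)


# Illustrative DOM fragment: three plain <li> siblings and one with a class.
ul = DomNode("ul", children=[
    DomNode("li"), DomNode("li"), DomNode("li"), DomNode("li", cls="last"),
])
style_tree = build_style_tree(ul)
```

In this sketch the three plain `li` siblings collapse into one style caching element, while the `li` with a different class keeps its own element.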
Specifically,
As noted above, in some examples, the style caching methods may cache the style caching tree 304 in a local memory in order to avoid redundant style calculations. As such, any element in the style caching tree 304 for which the matched style rules remain the same after the next Web page request may be retrieved from the style cache without any style computation. On the other hand, when the Web browser determines that matched style rules have changed, the style properties may be re-calculated. The style caching tree 304 may be updated with the new matched style rules and the calculated style properties.
As previously discussed, each DOM element may correspond to exactly one style caching element; however, the relationship may not be one-to-one. For example, as discussed regarding the “li” nodes 312, 314, 316, 324, and 326, each style caching element may correspond to one or more DOM nodes. Thus, for each style caching element, the Web browser may cache its style rule selectors (i.e., ID, Class, and TagName) that are used to find the matched DOM elements. The Web browser may cache matched rules for each style caching element. Additionally, the Web browser may also cache the style properties that may be retrieved and applied to the unchanged DOM nodes in subsequent visits to the same Web page.
Therefore, by way of example only, given a DOM node “E,” the corresponding style caching element may be located or created based on the following:
Check if E is the root of the DOM tree.
Additionally, once the Web browser has identified the corresponding style caching element for E, the style properties may be retrieved from the style caching element. In one aspect, if it is a newly created style caching element, then E's style properties may be calculated and recorded into the new style caching element. In this way, the Web browser may ensure that the style properties of an element E1 that has been calculated during a visit to a Web page may be retrieved in subsequent visits if E1 appears in the Web page again.
In some aspects, the style and layout caching methods may be able to tolerate changes to a DOM tree. For example, if the path of a DOM node to the root of the DOM tree does not change, then the path of its corresponding style caching element to the root of the style caching tree may stay the same as well. However, if the path of a DOM node to the root of the DOM tree has changed, then the DOM node as well as its descendant nodes in the DOM tree may no longer be matched in the cached style caching tree.
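The path-based behavior described above can be illustrated with a small lookup sketch. For brevity it replaces the style caching tree with a flat dictionary keyed by the sequence of <ID, class, TagName> triples from the root to the node; this substitution, the `StyleCache` class, and the `compute_style` callback are all assumptions of the sketch rather than the described implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DomNode:
    tag: str
    id: str = ""
    cls: str = ""
    parent: Optional["DomNode"] = None


def path_to_root(node: DomNode) -> tuple:
    """Return the (ID, class, TagName) triples from the root down to the node."""
    path = []
    current = node
    while current is not None:
        path.append((current.id, current.cls, current.tag))
        current = current.parent
    return tuple(reversed(path))


class StyleCache:
    def __init__(self, compute_style: Callable[[DomNode], dict]):
        self._by_path = {}                   # path of triples -> style properties
        self._compute_style = compute_style  # hypothetical style calculation
        self.misses = 0                      # number of style calculations performed

    def style_for(self, node: DomNode) -> dict:
        path = path_to_root(node)
        if path not in self._by_path:
            # Newly created entry: calculate and record the style properties.
            self.misses += 1
            self._by_path[path] = self._compute_style(node)
        return self._by_path[path]


# Assumed style calculation used only for illustration.
cache = StyleCache(lambda n: {"display": "block" if n.tag in ("ul", "ol", "li") else "inline"})

# First visit, second visit with the same path, and a node whose path changed.
li = DomNode("li", parent=DomNode("ul", parent=DomNode("body")))
li2 = DomNode("li", parent=DomNode("ul", parent=DomNode("body")))
moved = DomNode("li", parent=DomNode("ol", parent=DomNode("body")))
```

Looking up `li` and then `li2` performs one style calculation in this sketch, whereas `moved` no longer matches a cached path and is calculated again.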
For example, suppose that the Web page file 300 shown in
In this example, since the “p” node 406 and the two “em” nodes 408 and 410 are new DOM nodes in the DOM tree 402, new style caching elements may be created for them. Here, the “p” node 412 and the “em” node 414 are inserted into the style caching tree 404 to correspond to the new DOM nodes. Additionally, although the two new “em” nodes 408 and 410 are not siblings, they may correspond to the merged “em” node 414 because they may share the same style rules, and their parents, the two “li” nodes 312 and 314 of the DOM tree 402, as noted above with respect to
In other aspects, the style and layout caching methods may be able to tolerate changes in CSS rule sets. For example, a style caching tree may record the style properties for each element and also the list of matched rules for each element. Thus, both the style properties and the list of matched rules for each DOM node may be stored in each corresponding style caching element. In some aspects, new CSS rule sets may be identified while they are being received by a Web browser and the following process may be executed for each element:
Additionally, once the Web browser has completed loading a new Web page or when the current result is to be displayed in an incremental display mode, the Web browser may be able to identify which rules are missed, i.e., which rules are in Rcache but not in Rcur. As such, the Web browser may process the elements affected by those rules based on the following:
When styles for DOM nodes are needed, those DOM nodes which have new matched rules which are not in the cached style caching tree, or have some rules deleted from the stored matched rules for the corresponding nodes, may re-calculate their style properties. Once a node has to re-calculate its style properties, all its descendant nodes may also need to re-calculate their style properties. Other DOM nodes may apply the style properties retrieved from the cached style caching tree. The style caching tree may then be updated with the newly calculated style properties.
In this way, the Web browser may be able to identify the same rules that appear in both the current visit to a Web page and the last visit to the same Web page. This may allow the Web browser to avoid duplicating calculations for the elements of which the matched rule list has not changed. Furthermore, the new CSS rules for the current visit may be stored in the style cache to be retrieved for future visits to the same Web page.
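The comparison between the cached rule set (Rcache) and the current rule set (Rcur) can be sketched with set operations. The function names, the string node identifiers, and the dictionary-based tree below are hypothetical; the sketch only shows how nodes with added or deleted matched rules, together with their descendants, might be marked for re-calculation.

```python
def changed_rules(r_cache: set, r_cur: set) -> tuple:
    """Return (new rules, missed rules): rules only in Rcur and rules only in Rcache."""
    return r_cur - r_cache, r_cache - r_cur


def nodes_to_restyle(cached_matches: dict, current_matches: dict, children: dict) -> set:
    """Collect nodes whose matched rule list changed, plus all of their descendants.

    cached_matches / current_matches map a node identifier to its set of matched rules;
    children maps a node identifier to the identifiers of its child nodes.
    """
    dirty = set()

    def mark(node):
        if node in dirty:
            return
        dirty.add(node)
        for child in children.get(node, []):
            mark(child)

    for node, rules in current_matches.items():
        # A node missing from the cache, or with added or deleted rules, is re-calculated.
        if cached_matches.get(node) != rules:
            mark(node)
    return dirty


# Illustrative data: the "ul" node gains one matched rule on the current visit.
children = {"body": ["ul", "p"], "ul": ["li"]}
cached_matches = {
    "body": set(), "ul": {"ul{margin:0}"}, "li": {"li{color:red}"}, "p": {"p{color:blue}"},
}
current_matches = {
    "body": set(), "ul": {"ul{margin:0}", "ul{padding:0}"},
    "li": {"li{color:red}"}, "p": {"p{color:blue}"},
}
```

With this data only `ul` and its descendant `li` would be re-calculated; `body` and `p` could apply the style properties retrieved from the cache.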
At block 510, the Web browser may merge sibling style caching elements that share the same ID, Class, TagName triple. The Web browser may then cache the style caching tree and calculated style properties in local memory at block 512. At block 514, the Web browser may receive a new Web page file as a user makes requests for additional Web pages. At block 516, the Web browser may create a new DOM tree based at least in part on the new Web page file. At decision block 518, the Web browser may determine whether new DOM nodes in the new DOM tree match previous DOM nodes of a previous DOM tree. In one instance, this may be implemented by checking if they have identical paths to their respective roots. If a new DOM node does not match, the Web browser may recalculate individual style properties for the DOM node as well as for its descendant nodes at block 520. Otherwise, for every node in the new DOM tree that matches the old DOM tree, the Web browser may access the style caching tree at block 522.
At decision block 524, the Web browser may determine if the newly received Web page file is accompanied by a new CSS rule set. If so, the Web browser may recalculate the style properties for each element that is affected by the new rule set and also its descendant nodes at block 520. In some instances, only some elements may be affected by such a change; however, in other instances, all elements may be affected. At the end of a page download or when display of the current processed data is requested, deleted CSS rules may be checked at block 526. For rules that appear in cached data but not in the current Web page's CSS rule set, all the matched DOM nodes may be detected. The style properties for these nodes and also their descendant nodes may be re-calculated at block 520. On the other hand, for nodes whose styles are not re-calculated at block 520 after the checks at blocks 524 and 526 (e.g., for the nodes that do not match any new or deleted CSS rules, and where none of their ancestors in the DOM tree have matched any new or deleted CSS rules), the cached style properties may be retrieved at block 528 from the cached style caching tree and may be applied without re-calculation. At block 530, the Web browser may construct a render tree based on the DOM tree and apply the style properties, whether re-calculated or retrieved from the cached style caching tree. The Web browser may perform layout calculations for each element of the render tree at block 532. Finally, at block 534, the method may terminate when the Web browser renders a Web page on a display device based at least in part on the layout calculations and the render tree.
In one aspect, Web page layout calculations are performed based at least in part on a render tree. As discussed above with reference to
In one example, the Web browser may identify unchanged render objects in the render tree in order to reuse the cached layout results. By way of example and not limitation, one way to identify unchanged render objects is to build an accompanying tree for the render objects. Alternatively, another way to identify unchanged render objects is to utilize an existing style caching tree. In this example, each render object may be associated with one DOM node from which it is generated, each DOM node may be associated with one style caching element and, thus, a render object may also be associated with one style caching element. Therefore, the Web browser may record the render object along with its layout result in its associated style caching element. As noted above regarding determining which style caching elements are associated with which DOM nodes, the Web browser may similarly identify a render object in the layout cache by finding its associated DOM node.
In one aspect, layout caching may cache layout calculations for all possible render objects. In another aspect, however, the Web browser may only cache layout results for render boxes, render blocks, render buttons, render text controls, render texts, render images, inline render objects, any combination of the foregoing, or the like.
Additionally, in order to determine the validity of the cached layout results, a Web browser may perform up to four different validation checks:
Additionally, and by way of example only, layout caching may not tolerate changes in the CSS rules of a Web page. Therefore, a Web browser may request a layout re-calculation when changes occur to the CSS rules of a Web page. Alternatively, in some examples, the Web browser may be able to access and use cached layout calculations for the nodes that are not affected by the changed CSS rules of a Web page.
The Web browser may then receive a new Web page file based on a user's request for a new or updated Web page at block 608. Subsequently, the Web browser may create a new DOM tree based on the newly received HTML file and a new render tree based on the newly created DOM tree at block 610. At decision block 612, the Web browser may determine whether any layout features have changed from the previous render tree to the new render tree. For the nodes for which no change has occurred, the Web browser may access the cached layout calculations at block 614 and apply them without re-calculation. For the nodes for which layout features have changed, the Web browser may perform new layout calculations at block 618. The method may terminate at block 616 by rendering the Web page on a display device based at least in part on the retrieved form of the cached layout properties or newly calculated layout properties and the render tree.
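The reuse-or-recalculate decision in this flow can be sketched as a small cache that stores each layout result next to the layout features it was computed from. The `LayoutCache` class, the string keys standing in for style caching elements, and the toy layout function are assumptions made for this sketch only.

```python
from typing import Callable


class LayoutCache:
    def __init__(self, compute_layout: Callable[[tuple], tuple]):
        self._entries = {}                     # element key -> (layout features, layout result)
        self._compute_layout = compute_layout  # hypothetical layout calculation
        self.recalculated = []                 # keys for which a new calculation was performed

    def layout_for(self, element_key: str, features: tuple) -> tuple:
        entry = self._entries.get(element_key)
        if entry is not None and entry[0] == features:
            # Layout features unchanged: apply the cached result without re-calculation.
            return entry[1]
        # Layout features changed or not yet cached: calculate and cache the new result.
        result = self._compute_layout(features)
        self._entries[element_key] = (features, result)
        self.recalculated.append(element_key)
        return result


# Toy layout function, assumed for illustration: features are (width, line count)
# and the result is a (width, height) box with 20 units of height per line.
layout_cache = LayoutCache(lambda features: (features[0], features[1] * 20))
```

A second request for the same key with the same features returns the stored box in this sketch, while changed features cause a new calculation that replaces the cached entry.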
In one illustrative configuration, the computing environment 700 comprises at least a memory 702 and one or more processing units (or processor(s)) 704. The processor(s) 704 may be implemented as appropriate in hardware, software, firmware, or combinations thereof. Software or firmware implementations of the processor(s) 704 may include computer-executable or machine-executable instructions written in any suitable programming language to perform the various functions described.
Memory 702 may store program instructions that are loadable and executable on the processor(s) 704, as well as data generated during the execution of these programs. Depending on the configuration and type of computing device, memory 702 may be volatile (such as random access memory (RAM)) and/or non-volatile (such as read-only memory (ROM), flash memory, etc.). The computing device or server may also include additional removable storage 706 and/or non-removable storage 708 including, but not limited to, magnetic storage, optical disks, and/or tape storage. The disk drives and their associated computer-readable media may provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for the computing devices. In some implementations, the memory 702 may include multiple different types of memory, such as static random access memory (SRAM), dynamic random access memory (DRAM), or ROM.
Memory 702, removable storage 706, and non-removable storage 708 are all examples of computer-readable storage media. Computer-readable storage media include, but are not limited to, volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Additional types of computer storage media that may be present include, but are not limited to, phase change memory (PRAM), SRAM, DRAM, other types of RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the server or other computing device. Combinations of any of the above may also be included within the scope of computer-readable storage media.
The computing environment 700 may also contain communications connection(s) 710 that allow the computing environment 700 to communicate with a stored database, another computing device or server, user terminals, and/or other devices on a network. The computing environment 700 may also include input device(s) 712 such as a keyboard, mouse, pen, voice input device, touch input device, etc., and output device(s) 714, such as a display, speakers, printer, etc.
Turning to the contents of the memory 702 in more detail, the memory 702 may include an operating system 716 and one or more application programs or services for implementing Web page style and layout caching including a Web content receiving module 718. The Web content receiving module 718 may be configured to receive Web page files for processing.
The memory 702 may also include a DOM tree creation module 720. The DOM tree creation module 720 may be configured to create a DOM tree by parsing the received Web page file. As discussed above, the DOM tree may represent each element in the Web page file as a node in a hierarchical tree structure or other type of data structure.
The memory 702 may further include a style tree creation module 722. As discussed above, the style tree creation module 722 may be configured to parse each element of the DOM tree, create associated style caching elements, record style properties for each element, and merge elements that share an ID, Class, TagName triple. Additionally, as noted above, the style tree creation module 722 may be configured to embed additional information within the created style tree. Such information may include CSS rule sets, other calculated style properties, matched CSS rule lists, combinations of the foregoing, or the like.
The memory 702 may also include a layout calculation module 724. The layout calculation module 724 may be configured to calculate layout properties for each element in a render tree, along with properties for validation. As discussed above, a Web browser (or any computing device, such as, but not limited to, computing environment 700) may create a render tree based on a DOM tree. Based at least in part on the render tree, the layout calculation module 724 may calculate layout properties for each element of the render tree. In one aspect, these layout properties may be stored back into the render tree.
The memory 702 may further include a caching module 726. The caching module 726 may be configured to store calculation results in a local cache. In one example, the caching module 726 may cache style properties by caching a style tree. In this example, the computing environment 700 may cache the style tree that was previously created by the style tree creation module 722. In other aspects, however, the caching module 726 may cache a style tree provided by a user, a Web server, or the like. In another example, the caching module 726 may cache layout calculations by caching a render tree that includes layout properties for each render object. In this example, the computing environment 700 may cache the layout calculations that were calculated by the layout calculation module 724. Yet, in other aspects, the caching module 726 may cache calculations created and/or served by another entity or module. In yet another example, the caching module 726 may cache both style and layout calculations. As such, in this example, the style and layout calculations may have been performed by the style tree creation module 722 and/or the layout calculation module 724, respectively.
Additionally, the memory 702 may also include a layout validation module 728. The layout validation module 728 may be configured to validate the results of the layout caching. In one aspect, the layout validation module 728 may apply one or more of the four validation checks detailed above. More specifically, the validation module 728 may determine if new layout calculations are to be calculated when a new Web page is requested.
Illustrative methods and systems of style and layout caching of Web content are described above. Some or all of these systems and methods may, but need not, be implemented at least partially by an architecture such as that shown in
Although embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that the disclosure is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as illustrative forms of implementing the embodiments.