The present disclosure relates to the field of mobile internet, and more particularly to a method for rearranging web pages.
In the field of mobile internet, there is extensive research on how to present rich content of the Internet on mobile devices in a user-friendly manner. One crucial topic is how to display traditional Internet web pages designed for high-resolution monitors on the relatively low-resolution screens of mobile devices without compromising browsing of and interaction with the original web pages.
Current mainstream mobile browsers on the market have made some efforts in this direction. For example, the early IE mobile browser for Microsoft's Windows Mobile OS arranges all elements of a web page in a vertical order for users' convenience. The browser in Google's Android OS adopts word wrap technology: when a web page is scaled, paragraphs of text in the web page are re-wrapped according to the relationship between the current scaling ratio and the width of the screen, so that horizontal scrolling is not required while users are reading. The browsers on Apple's iPhone and Microsoft's Windows Phone 7 system adopt text scaling, which adjusts font sizes for the different containers of a web page when the web page is first rendered. This ensures that when a container is scaled to the middle of the screen, the font size in the container is suitable for reading without scrolling the screen left and right, and it avoids rearranging the web page layout on every scaling operation.
However, the main disadvantage of these technologies is that they improve the reading experience on mobile devices only for paragraphs of text, and not for other web elements such as pictures and videos. Moreover, such technologies change only part of the layout of a web page, which may lead to a disordered overall layout, repeated content, or large blank areas.
Another research direction is server rearranging technology, represented by the server cache acceleration technology developed by UCWEB. This technology rearranges web pages by adapting their fonts and width to the lower screen resolutions of mobile devices, and caches the rearranged web pages so that connections to website servers are made less frequently.
However, due to the variety of mobile devices with different resolutions, the web pages rearranged by cache servers are not optimized for a particular user's mobile device screen.
Some websites involve users' private information (e.g., e-commerce websites and online forums). The server rearranging technology requires a client to establish a direct connection with a cache server, so the users' private information is stored in the cache server, increasing the risk of leakage.
Due to the diversity of websites, the rearrangement results may not guarantee ease-of-use and aesthetics.
The server rearranging technology requires an enormous amount of server resources, which increases cost.
Since rearranged web pages are cached, web pages with high real-time requirements (e.g., live webcasts) may be delayed in processing, so that real-time updates are lost.
The purpose of the present disclosure is to provide a method for rearranging web pages that is well suited to the screen resolution of the device and thereby provides a good browsing experience. The method also preserves the information and interaction of the original web pages to the greatest extent. Meanwhile, non-essential elements in the web pages can be filtered out to increase loading speed and save network bandwidth.
To this end, the present disclosure adopts the following technical scheme:
A method for rearranging web pages, including:
The selection rules include a web address rule, a special element rule and a web format rule. The web address rule is defined by a regular expression. The special element rule determines whether to select the web page by searching for specific elements in the web page. The web format rule determines whether to select the web page based on an overall hierarchical structure of the web page elements.
The special element rule determines if an identifier (ID) of a body element in the web page matches a specific ID. The web format rule determines if the body of the web page includes two div elements.
The content extraction rule is implemented in the XPath language.
The content extraction rule includes content extraction rules for news websites, serial story websites and online forum websites.
The actual content includes internal HTML source code and hyperlinks.
Step E also includes the following steps:
The characteristics of the mobile phone browser include a resolution and display properties.
With the adoption of the technical scheme in the present disclosure, the following technical advantages can be achieved:
Embodiments of the present disclosure are further described in detail with reference to the accompanying figure.
Step 101. The mobile phone browser receives a web address to access.
Step 102. The mobile phone browser determines whether a web page corresponding to the web address matches the selection rules. If yes, go to Step 104. Otherwise, go to Step 103.
The selection rules are stored in the mobile phone browser client and include a web address rule, a special element rule and a web format rule.
The web address rule is defined by a regular expression.
The special element rule determines whether to select the web page by searching for specific elements in the web page. For example, a special element rule determines whether an identifier (ID) of a body element in the web page matches a specific ID.
The web format rule determines whether to select the web page based on an overall hierarchical structure of the elements in the web page. For example, a web format rule determines whether a body of the web page includes two div elements.
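By way of non-limiting illustration, the three selection rules might be sketched in Python as follows. The URL pattern, the body ID `article-page`, and all function and class names are hypothetical and do not appear in the disclosure. The sketch further assumes that "includes two div elements" means exactly two div elements that are direct children of the body, and that a match on any one rule is sufficient to select the page; the disclosure does not specify how the rules are combined.

```python
import re
from html.parser import HTMLParser

# Hypothetical rule values, chosen only for illustration.
URL_RULE = re.compile(r"^https?://news\.example\.com/article/\d+$")
SPECIAL_BODY_ID = "article-page"

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BodyInspector(HTMLParser):
    """Records the body element's id and counts div children of body."""

    def __init__(self):
        super().__init__()
        self.body_id = None
        self.body_div_children = 0
        self._open_tags = []  # names of currently open elements

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.body_id = dict(attrs).get("id")
        elif tag == "div" and self._open_tags[-1:] == ["body"]:
            self.body_div_children += 1
        if tag not in VOID_TAGS:
            self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the element, along with any unclosed elements inside it.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass


def inspect_body(html):
    inspector = BodyInspector()
    inspector.feed(html)
    inspector.close()
    return inspector


def matches_address_rule(url):
    """Web address rule: the URL matches a regular expression."""
    return URL_RULE.match(url) is not None


def matches_special_element_rule(html):
    """Special element rule: the body element carries a specific ID."""
    return inspect_body(html).body_id == SPECIAL_BODY_ID


def matches_format_rule(html):
    """Web format rule (assumed reading): body has exactly two div children."""
    return inspect_body(html).body_div_children == 2


def page_is_selected(url, html):
    """Assumption: a match on any single rule selects the page."""
    return (matches_address_rule(url)
            or matches_special_element_rule(html)
            or matches_format_rule(html))
```

In this sketch the address rule can be evaluated from the web address alone, whereas the other two rules require the page markup to be available.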
Step 103. The mobile phone browser loads the web page and displays the content of the web page.
Step 104. The mobile phone browser retrieves the HTML source code of the web page.
Step 105. Based on a content extraction rule, the mobile phone browser extracts elements containing actual content from the HTML source code of the web page, and extracts actual content from these elements. The actual content includes internal HTML source code and hyperlinks.
The content extraction rules are stored in the mobile phone browser client and include content extraction rules for news websites, serial story websites and online forum websites. Different content extraction rules are defined for different types of web pages. Since content extraction rules target individual HTML elements or groups of HTML elements, they are often implemented in the XPath language.
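By way of non-limiting illustration, XPath-based content extraction might be sketched in Python as follows. The XPath expressions, the page type keys and the function names are hypothetical. The sketch uses the limited XPath subset of the standard `xml.etree.ElementTree` module and therefore assumes that the page source is well-formed (XHTML-like) markup; real-world HTML would generally require a tolerant HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical XPath rules, one per page type named in the text.
EXTRACTION_RULES = {
    "news": ".//div[@id='article']",
    "serial_story": ".//div[@class='chapter']",
    "forum": ".//div[@class='post']",
}


def inner_html(element):
    """Serialize the children of an element (its internal source code)."""
    parts = [element.text or ""]
    for child in element:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode", method="html"))
    return "".join(parts).strip()


def extract_content(xhtml, page_type):
    """Return the internal HTML and hyperlinks of each matching element."""
    root = ET.fromstring(xhtml)  # assumes well-formed markup
    results = []
    for element in root.findall(EXTRACTION_RULES[page_type]):
        links = [a.get("href") for a in element.iter("a") if a.get("href")]
        results.append({"html": inner_html(element), "links": links})
    return results
```

Keeping one expression per page type mirrors the statement that different content extraction rules are defined for different types of web pages; elements that the expression does not match, such as navigation bars, are simply left out of the result.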
Step 106. The mobile phone browser inserts the actual content of the web page into a predefined web page template to generate a new web page. The web page template includes a layout format for generating the new web page based on predefined cascading style sheets (CSS) and characteristics of the mobile phone browser. The characteristics of the mobile phone browser include a resolution and display properties.
Step 107. The mobile phone browser loads the new web page and displays the content of the new web page. The web page template and the layout format it includes differ for different types of web pages, whereas for web pages of the same type, the same web page template and layout format are applied to ensure consistency in the layout of the rearranged web pages.
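By way of non-limiting illustration, the insertion of extracted content into a per-type template (Step 106) might be sketched in Python as follows. The template markup, the CSS, the function name and the font-size heuristic are all hypothetical; the disclosure states only that the layout format is based on predefined CSS and on characteristics of the browser such as its resolution.

```python
from string import Template

# Hypothetical templates, one per page type, so that pages of the same
# type always receive the same layout.
PAGE_TEMPLATES = {
    "news": Template(
        "<html><head><style>"
        "body{width:${width}px;font-size:${font_px}px;margin:0}"
        "</style></head><body><div class='news'>${content}</div></body></html>"
    ),
    "forum": Template(
        "<html><head><style>"
        "body{width:${width}px;font-size:${font_px}px;margin:0}"
        ".post{border-bottom:1px solid #ccc}"
        "</style></head><body><div class='post'>${content}</div></body></html>"
    ),
}


def build_page(page_type, content_html, screen_width_px):
    """Fill the template for page_type with content sized to the screen."""
    template = PAGE_TEMPLATES[page_type]
    # Hypothetical heuristic: a slightly smaller font on narrow screens.
    font_px = 16 if screen_width_px >= 480 else 14
    return template.substitute(
        width=screen_width_px, font_px=font_px, content=content_html
    )
```

For example, `build_page("news", "<p>Hello</p>", 320)` would produce a page whose body is 320 pixels wide, which the browser would then load and display as in Step 107.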
The above is a detailed description of the technical features of the present disclosure based on a preferred embodiment. However, it should be appreciated that the present disclosure is capable of a variety of embodiments and various modifications by those skilled in the art, and all such variations or changes shall be embraced within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
201110060342.9 | Mar 2011 | CN | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN12/72285 | 3/13/2012 | WO | 00 | 9/10/2013 |