SYSTEM AND METHOD FOR CONTENT NAVIGATION

FIELD

The present specification relates generally to telecommunication and more specifically relates to a system and method for content navigation.

BACKGROUND

Computing devices are becoming smaller and increasingly utilize wireless connectivity. Examples of such computing devices include portable computing devices that include wireless network browsing capability as well as telephony and personal information management capabilities. The smaller size of such client devices necessarily limits their display capabilities. Furthermore, the wireless connections to such devices typically have less bandwidth than corresponding wired connections. The Wireless Application Protocol (“WAP”) was designed to address such issues, but WAP can still provide a very unsatisfactory or even completely ineffective experience, particularly where the small client device needs to effect a connection with web-sites that host web-pages that are optimized for full traditional desktop browsers.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic representation of a system for content navigation.

FIG. 2 is a schematic representation of a wireless communication device from FIG. 1.

FIG. 3 is a schematic representation of a display and a portion of a keyboard of the device of FIG. 1, wherein the display is showing a screen from a contact manager application.

FIG. 4 is a schematic representation of the display and the portion of a keyboard of FIG. 3, wherein the display is showing the screen from the contact manager application including a menu from a menu application with menu selections that are contextual to the contact manager application.

FIG. 5 shows an exemplary web-page available from the web-server in FIG. 1.

FIG. 6 shows an exemplary web-page available from the web-server in FIG. 1.

FIG. 7 shows an exemplary web-page available from the web-server in FIG. 1.

FIG. 8 shows an exemplary web-page available from the web-server in FIG. 1.

FIG. 9 shows a flowchart depicting a method for content navigation.

FIG. 10 shows the system of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 11 shows the system of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 12 shows the display of the client machine of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 13 shows the display of the client machine of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 14 shows the display of the client machine of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 15 shows the display of the client machine of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 16 shows the display of the client machine of FIG. 1 during exemplary performance of part of the method of FIG. 9.

FIG. 17 is a schematic representation of a system for content navigation in accordance with another embodiment.

FIG. 18 shows a flowchart depicting a method for content navigation in accordance with another embodiment.

FIG. 19 shows the display of the client machine of FIG. 17 during exemplary performance of part of the method of FIG. 18.

FIG. 20 is a schematic representation of a system for content navigation.

FIG. 21 is a schematic representation of a wireless communication device from FIG. 20.

FIG. 22 shows the process of converting webpages to a hierarchical structure.

FIG. 23 shows a flowchart depicting a method to obtain and satisfy a web page request.

FIG. 24 shows the system of FIG. 20 during exemplary performance of part of the method of FIG. 23.

FIG. 25 shows exemplary perspectives of browsing sessions.

FIG. 26 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 27 shows exemplary output from the Schema Engine.

FIG. 28 shows an exemplary rendering of a mobile application.

FIG. 29 shows an exemplary menu rendered on the mobile device.

FIG. 30 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 31 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 32 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 33 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 34 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 35 shows action 1 of FIG. 44.

FIG. 36 shows action 3 of FIG. 44.

FIG. 37 shows action 5 of FIG. 44.

FIG. 38 shows action 7 of FIG. 44.

FIG. 39 shows action 9 of FIG. 44.

FIG. 40 shows an exemplary web-page available from the web-server in FIG. 20.

FIG. 41 shows an exemplary result of the process of assisted capturing of the web-page in FIG. 40.

FIG. 42 shows an exemplary rich bookmarks list.

FIG. 43 shows an exemplary rich bookmarks list.

FIG. 44 shows an exemplary action of browsing through ABC ComTech Corp. to purchase an item.

FIG. 45 shows an exemplary request/response for a page of the website in FIG. 20.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The present specification provides, amongst other things, a method and system for navigating content. In an embodiment a portable electronic device is provided having a browser application and a native menu application. The embodiment also includes a network that interconnects a web-server and said portable electronic device. The web-server hosts web pages that include menus and content. The portable electronic device is configured to obtain a schema respective to the web-pages whereby the web-page menus can be generated on the portable electronic device using the native menu application rather than the browser application, thereby permitting navigation of content on the portable electronic device via the native menu application.

Referring now to FIG. 1, a system for content navigation in a computing device is indicated generally at 50. In a present embodiment system 50 comprises a first computing device in the form of a client machine 54 and at least one additional computing device implemented as a second computing device in the form of a web-server 58 and a third computing device in the form of a schema server 62. A network 66 interconnects each of the foregoing components.

Each client machine 54 is typically any type of computing or electronic device that can be used to interact with content available on network 66. Each client machine 54 is operated by a user U. Interaction includes displaying information on client machine 54 as well as receiving input at client machine 54 that is in turn sent back over network 66. In a present embodiment, client machine 54 is a mobile electronic device with the combined functionality of a personal digital assistant, cell phone, email paging device, and a web-browser. Such a mobile electronic device thus includes a keyboard (or other input device(s)), a display, a speaker (or other output device(s)) and a chassis within which the keyboard, display and speaker are housed. The chassis also houses one or more central processing units, volatile memory (e.g. random access memory), persistent memory (e.g. Flash read only memory) and network interfaces to allow machine 54 to communicate over network 66.

Referring now to FIG. 2, a schematic block diagram shows client machine 54 in greater detail. It should be emphasized that the structure in FIG. 2 is purely exemplary, and contemplates a device that can be used for both wireless voice (e.g. telephony) and wireless data (e.g. email, web browsing, text) communications. Client machine 54 includes a plurality of input devices which in a present embodiment includes a keyboard 200 and a microphone 204. Other input devices, such as a touch screen and a camera lens, are also contemplated. Input from keyboard 200 and microphone 204 is received at a processor 208, which in turn communicates with a non-volatile storage unit 212 (e.g. read only memory (“ROM”), Electrically Erasable Programmable Read Only Memory (“EEPROM”), Flash Memory) and a volatile storage unit 216 (e.g. random access memory (“RAM”)).

Programming instructions that implement the functional teachings of client machine 54 as described herein are typically maintained, persistently, in non-volatile storage unit 212 and used by processor 208 which makes appropriate utilization of volatile storage 216 during the execution of such programming instructions. Of particular note is that non-volatile storage unit 212 persistently maintains a native menu application 82 and a web-browser application 86, each of which can be executed on processor 208 making use of volatile storage 216 as appropriate. Various other applications (not shown) are maintained in non-volatile storage unit 212 according to the desired configuration and functioning of client machine 54, one specific non-limiting example of which is a contact manager application 90 which stores a list of contacts, addresses and phone numbers of interest to user U and allows user U to view, update and delete those contacts, as well as providing user U an option to initiate telecommunications (e.g. telephone, email, instant message, short message service) directly from that contact manager application.

Native menu application 82 is configured to provide menu choices to user U according to the particular application (or other context) that is being accessed. By way of example, while user U is activating contact manager application 90, user U can activate menu application 82 to access a plurality of menu choices available that are respective to contact manager application 90. This example is shown in greater detail in FIG. 3. In FIG. 3, a non-limiting exemplary portion of keyboard 200 is shown, which comprises a menu key 232, a pointing device in the form of a trackball 236, and a select key 240. FIG. 3 also shows a non-limiting example of how contact manager application 90 can be rendered on display 224 when being accessed. In FIG. 3, contact manager application 90 is shown displaying two contacts and a telephone number for each, namely, Bill Smith at 555-555-1212 and Sally Struthers at 555-555-1313. Note that Sally Struthers is highlighted using a colour scheme that is inverse to the colour scheme used to display “Bill Smith” and the words “Contact Manager Application”, indicating that Sally Struthers is currently being selected. User U can operate trackball 236 to scroll between Bill Smith and Sally Struthers, causing one or the other to be highlighted.

While accessing contact manager application 90 as shown in FIG. 3, user U can also depress menu key 232 which will invoke menu application 82, an exemplary result of which is shown in FIG. 4. Menu application 82 provides a contextual menu M-90 comprised of a plurality of menu choices that are reflective of the context in which menu key 232 was selected. (Contextual menu M-90 in FIG. 4 is respective to contact manager application 90 and hence the suffix “-90” in M-90. Generically, however, contextual menus will be referred to herein as contextual menus M.) In the example in FIG. 4, contextual menu M-90 provides the choices of: “Help” to obtain context sensitive help about what options are available to user U within the contact manager application 90; “View” to allow user U to see more contact information (e.g. address, additional phone numbers, photographs) of the highlighted contact; “Edit” to allow user U to edit the same information that can be viewed using “View”; “Delete” to allow user U to delete the particular contact from non-volatile storage memory; “Call” to allow user U to invoke a telephony application to initiate a telephone call to the highlighted contact; “Email” to allow user U to invoke an email application to compose an email to the highlighted contact; “Close” to allow user U to close contact manager application 90 altogether and return to an application selection screen (not shown). While, for example “Call Sally Struthers” is highlighted, user U can depress the select key 240 in order to cause client machine 54 to invoke a telephony application (not shown) and dial the telephone number for Sally Struthers. User U can also depress menu key 232 while menu application 82 is open to cause menu application 82 to close and return control to the contact manager application 90 in accordance with the discussion relative to FIG. 3.

Note that the options in contextual menu M-90 are stored within non-volatile storage 212 as being specifically associated with contact manager application 90. Menu application 82 is therefore configured to generate a plurality of different contextual menus M that are reflective of the particular context in which the menu application 82 is invoked. For example, in an email application where an email is being composed, invoking menu application 82 would generate a contextual menu M that included the options of sending the email, cancelling the email, adding addresses to the email, adding attachments, and the like. The contents for such a contextual menu M would also be maintained in non-volatile storage 212. Other examples of contextual menus M will now occur to those of skill in the art. Menu application 82 and contextual menus M will be discussed in greater detail below.
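By way of non-limiting illustration only, the association between a context and its stored menu choices can be sketched as a simple lookup. The sketch below is an assumption for explanatory purposes and is not the implementation of menu application 82: the names `CONTEXT_MENUS` and `contextual_menu`, the context keys, and the labels of the email menu choices are all hypothetical.

```python
from typing import Dict, List

# Hypothetical store of menu choices keyed by context, standing in for the
# per-application menu contents held in non-volatile storage 212.
CONTEXT_MENUS: Dict[str, List[str]] = {
    # Choices of contextual menu M-90 as described for FIG. 4.
    "contact_manager": ["Help", "View", "Edit", "Delete", "Call", "Email", "Close"],
    # Illustrative labels only; the text describes these options in prose.
    "email_compose": ["Send", "Cancel", "Add Address", "Add Attachment"],
}


def contextual_menu(context: str) -> List[str]:
    """Return a copy of the menu choices stored for `context`.

    An unknown context yields an empty menu rather than an error.
    """
    return list(CONTEXT_MENUS.get(context, []))
```

Invoking `contextual_menu("contact_manager")` returns the seven choices listed for contextual menu M-90, while a different context key returns a different set of choices.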

Returning now to FIG. 1, web-server 58 and schema server 62 (which can, if desired, be implemented on a single server) can be based on any well-known server environment including a module that houses one or more central processing units, volatile memory (e.g. random access memory), persistent memory (e.g. hard disk devices) and network interfaces to allow servers 58 and 62 to communicate over network 66. For example, server 58 or server 62 or both can be a Sun Fire V480 running a UNIX operating system, from Sun Microsystems, Inc. of Palo Alto Calif., and having four central processing units each operating at about nine-hundred megahertz and having about sixteen gigabytes of random access memory. However, it is to be emphasized that this particular server is merely exemplary, and a vast array of other types of computing environments for servers 58 and 62 are contemplated.

It should now be understood that the nature of network 66 and the links 70, 74 and 78 associated therewith is not particularly limited and is, in general, based on any combination of architectures that will support interactions between client machine 54 and servers 58 and 62. In a present embodiment network 66 itself includes the Internet as well as appropriate gateways and backhauls to links 70, 74 and 78. Accordingly, the links 70, 74 and 78 between network 66 and the interconnected components are complementary to functional requirements of those components.

More specifically, system 50 includes link 70 between client machine 54 and network 66, link 70 being based in a present embodiment on core mobile network infrastructure (e.g. Global System for Mobile communications (“GSM”), Code Division Multiple Access (“CDMA”), Enhanced Data rates for GSM Evolution (“EDGE”), Evolution Data-Optimized (“EV-DO”), High Speed Downlink Packet Access (“HSDPA”)) or on wireless local area network (“WLAN”) infrastructures such as the Institute of Electrical and Electronics Engineers (“IEEE”) 802.11 Standard (and its variants) or Bluetooth or the like or hybrids thereof. Note that in an exemplary variation of system 50 it is contemplated that client machine 54 could be other types of client machines, including a full desktop computer or a “thin-client”.

System 50 also includes link 74 which can be based on a T1, T3, O3 or any other suitable wired or wireless connection between server 58 and network 66. System 50 also includes link 78 which can be based on a T1, T3, O3 or any other suitable wired or wireless connection between server 62 and network 66.

As previously stated in relation to FIGS. 1 and 2, client machine 54 is configured to interact with content available over network 66, including web content on web-server 58. In a present embodiment, client machine 54 effects such interaction via web-browser application 86 that is configured to execute on client machine 54. As will be explained further below, web-browser application 86 is a mini-browser in the sense that it is configured to render web-pages on the relatively small display 224 of client machine 54, and during such rendering attempt to render those pages in a format that is different from how those pages would be rendered on a traditional desktop browser, but still conveys, as much as possible, substantially the same information as if those web-pages had been rendered on a full browser such as Internet Explorer or Firefox on a traditional desktop or laptop computer. Web-server 58 is configured to host a web-site 100 that includes a plurality of web-pages.

FIGS. 5-8 show exemplary representations of four different pages from web-site 100, labeled 100-1, 100-2, 100-3 and 100-4 respectively. The representation in FIGS. 5-8 shows how web-pages 100-1, 100-2, 100-3 and 100-4 would be rendered on a traditional desk-top computer such as a Windows-based computer running the Internet Explorer or Firefox Web-browser as an HTTP web-page. In the example, web-site 100 is an e-commerce web-site belonging to a fictional computer equipment retailer named ABC ComTech Corp. Web-site 100 can be browsed to select various computer equipment items for purchase, culminating in the selection of a secure checkout screen that can be used to complete the final order for the selected computer equipment and to provide payment and shipping information therefor. FIGS. 5-8 show exemplary navigation using a traditional desk-top browser through the “Home”; “Computers”; “Laptops” and “17.0 inch” menu options as found in the menu-panes indicated at 104-1, 104-2, 104-3 and 104-4 respectively on FIGS. 5, 6, 7, and 8. FIGS. 5-8 also show content panes indicated at 108-1, 108-2, 108-3 and 108-4 respectively on FIGS. 5, 6, 7, and 8. In the exemplary pages on FIGS. 5-8, it will be noted that content panes 108-1, 108-2, 108-3, 108-4 comprise promotional content that corresponds to the level of the menu in its respective menu-panes 104-1, 104-2, 104-3, 104-4. More particularly, in FIG. 5, which corresponds to the “Home” menu-pane 104-1, there are promotional items (e.g. a camera, a computer, a personal navigation device, and a television) presented in content pane 108-1 that reflect more than one of the options in the “Home” menu-pane 104-1. In FIG. 6, which corresponds to the “Computers” menu-pane 104-2, there are promotional items (e.g. a laptop computer and a desktop computer) presented in content pane 108-2 that reflect more than one of the options in the “Computers” menu-pane 104-2. In FIG. 7, which corresponds to the “Laptops” menu-pane 104-3, there are promotional items (e.g. various laptop computers) presented in content pane 108-3 that reflect more than one of the options in the “Laptops” menu-pane 104-3. In FIG. 8, which corresponds to the 17.0″ laptops page, menu-pane 104-4 is substantially the same as menu-pane 104-3, while content pane 108-4 includes a list of 17″ laptops that are available for purchase. (Note that the fact that menu-pane 104-4 and menu-pane 104-3 are the same is purely exemplary and that in general menu-panes can contain any desired menu selections.) If desired, a specific laptop listed on content pane 108-4 can be selected for further information and/or selected for purchase.

Those skilled in the art will now recognize that menu-panes 104-1, 104-2, 104-3 and 104-4 represent at least one set of hyper-text markup language (“HTML”) programming instructions possibly incorporating a scripting language such as JavaScript. Likewise those skilled in the art will now recognize that content-panes 108-1, 108-2, 108-3 and 108-4 represent at least one other set of HTML programming instructions possibly incorporating a scripting language such as JavaScript. The programming instructions for menu-panes 104-1, 104-2, 104-3 and 104-4 are discrete from the programming instructions for content-panes 108-1, 108-2, 108-3 and 108-4. It will also now be apparent that web-server 58 is configured to provide each web-page 100-1, 100-2, 100-3 and 100-4 in its entirety in response to a request from a web-browser, so that it is not generally possible to view, for example, web-page 100-4 directly from web-page 100-1 or web-page 100-2.

Referring again to FIGS. 1 and 2, in a present embodiment, web-browser application 86 is also configured to interact with schema server 62 in order to obtain a schema 102. In general, a schema such as schema 102 comprises a file corresponding to content on web-site 100. Such a schema file can be generated in any desired format, such as eXtensible Markup Language (“XML”) or a text file. A schema can contain instructions to identify each page family on the website as well as instructions to extract desired objects and elements for each page family. A schema can additionally specify the relationship between the objects and attributes. In a present embodiment schema 102 includes information relative to menu-panes 104-1, 104-2, 104-3 and 104-4 that is usable to menu application 82 and web-browser application 86 in the presentation of web-site 100 on client machine 54. Table I shows an exemplary representation of a schema 102 that corresponds to web-site 100.

TABLE I

Exemplary content of schema 102 corresponding to exemplary web-site 100

| Root Menu Item | Level 1 Menu Item | Level 2 Menu Item | Level 3 Menu Item | Web-page Link within web-site 100 containing content 108 to be shown corresponding to Menu |
|---|---|---|---|---|
| Home |  | N/A | N/A | Address 0 (Corresponds to Web-page 100-1) |
|  | Computers | N/A | N/A | Address 1 (Corresponds to Web-page 100-2) |
|  | Computers | Laptops | N/A | Address 2 (Corresponds to Web-page 100-3) |
|  | Computers | Laptops | 13.3″ and smaller | Address 3 (Corresponding web-page not shown) |
|  | Computers | Laptops | 14.1″ | Address 4 (Corresponding web-page not shown) |
|  | Computers | Laptops | 15.4″ | Address 5 (Corresponding web-page not shown) |
|  | Computers | Laptops | Refurbished laptops | Address 6 (Corresponding web-page not shown) |
|  | Computers | Laptops | Tablet and speciality | Address 7 (Corresponding web-page not shown) |
|  | Computers | Laptops | 17.0″ | Address 8 (Corresponds to Web-page 100-4) |
|  | Computers | Desktop Computers | N/A | Address 9 (Corresponding web-page not shown) |
|  | Computers | Desktop Computers | (Model Line 1) | Address 10 (Corresponding web-page not shown) |
|  | Computers | Desktop Computers | (Model Line 2) | Address 11 (Corresponding web-page not shown) |
|  | Computers | Desktop Computers | (Model Line 3) | Address 12 (Corresponding web-page not shown) |
|  | Computers | Monitors | N/A | Address 13 (Corresponding web-page not shown) |
|  | Computers | Monitors | (Model Line 1) | Address 14 (Corresponding web-page not shown) |
|  | Computers | Monitors | (Model Line 2) | Address 15 (Corresponding web-page not shown) |
|  | Computers | Monitors | (Model Line 3) | Address 16 (Corresponding web-page not shown) |
|  | Computers | Computer Packages | N/A | Address 17 (Corresponding web-page not shown) |
|  | Computers | Computer Packages | (Model Line 1) | Address 18 (Corresponding web-page not shown) |
|  | Computers | Computer Packages | (Model Line 2) | Address 19 (Corresponding web-page not shown) |
|  | Computers | Computer Packages | (Model Line 3) | Address 20 (Corresponding web-page not shown) |
|  | Computers | Apple Computers | N/A | Address 21 (Corresponding web-page not shown) |
|  | Computers | Apple Computers | (Model Line 1) | Address 22 (Corresponding web-page not shown) |
|  | Computers | Apple Computers | (Model Line 2) | Address 23 (Corresponding web-page not shown) |
|  | Computers | Printers and Fax Machines | N/A | Address 24 (Corresponding web-page not shown) |
|  | Computers | Printers and Fax Machines | (Model Line 1) | Address 25 (Corresponding web-page not shown) |
|  | Computers | Printers and Fax Machines | (Model Line 2) | Address 26 (Corresponding web-page not shown) |
|  | Computers | Scanners | N/A | Address 27 (Corresponding web-page not shown) |
|  | Computers | Scanners | (Model Line 1) | Address 28 (Corresponding web-page not shown) |
|  | Computers | Scanners | (Model Line 2) | Address 29 (Corresponding web-page not shown) |
|  | Computer Add-ons | N/A | N/A | Address [n] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for Computer Add-ons Per above structure] |  |  |
|  | Software | N/A | N/A | Address [n1] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for Software Per above structure] |  |  |
|  | Photo | N/A | N/A | Address [n2] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for Photo Add-ons Per above structure] |  |  |
|  | Photo-Finishing | N/A | N/A | Address [n3] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for Photo-Finishing Add-ons Per above structure] |  |  |
|  | TV & Video | N/A | N/A | Address [n4] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for TV& Video Add-ons Per above structure] |  |  |
|  | Audio | N/A | N/A | Address [n5] (Corresponding web-page not shown) |
|  | . . . | [Sub-levels for Computer Add-ons Per above structure] |  |  |

Explaining Table I in greater detail, the first four columns of Table I (“Root Menu Item”; “Level 1 Menu Item”; “Level 2 Menu Item”; “Level 3 Menu Item”) correspond to the menu structure found in menu-panes 104-1, 104-2, 104-3, 104-4. The last column of Table I (“Web-page Link within web-site 100”) corresponds to the specific address associated with a particular web-page within website 100, including web-pages 100-1, 100-2, 100-3, 100-4 and other web-pages that are not actually shown in the Figures and points to the respective content (including 108-1, 108-2, 108-3, 108-4 and other content not actually shown in the Figures) that is associated with the menu-panes reflected in the associated first four columns. Thus the first four columns can be used by native menu application 82 to create a plurality of contextual menus M that have substantially the same content as menu-panes 104-1, 104-2, 104-3 and 104-4. Likewise, the last column of Table I can be used to extract web-content corresponding to the web-site address indicated in the relevant entry of that last column, as found within web-site 100, including web-content 108-1, 108-2, 108-3, 108-4 and other web-content from other web-pages in web-site 100 that are not actually shown in the Figures. Web-browser application 86 and native menu application 82 are therefore configured to co-operate using schema 102 in order to present web-content within the web-browser application 86, while using native menu application 82 to permit user U to navigate through web-site 100.
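As a non-limiting sketch of how the two roles of Table I could be separated in software, the following Python fragment holds a few rows of Table I as pairs of a menu path and an address. It derives the menu choices for a given level from the menu columns and looks up the address from the last column. The data layout and the names `SCHEMA_ROWS`, `menu_choices` and `address_for` are assumptions made for illustration; the specification does not prescribe them, and the address strings are the same placeholders used in Table I.

```python
from typing import List, Optional, Sequence, Tuple

# A few rows of Table I. Each path lists the Level 1..3 menu items beneath
# the root "Home"; the empty path is the "Home" row itself.
SCHEMA_ROWS: List[Tuple[Tuple[str, ...], str]] = [
    ((), "Address 0"),
    (("Computers",), "Address 1"),
    (("Computers", "Laptops"), "Address 2"),
    (("Computers", "Laptops", "17.0 inch"), "Address 8"),
    (("Computers", "Monitors"), "Address 13"),
    (("Software",), "Address [n1]"),
]


def menu_choices(rows: Sequence[Tuple[Tuple[str, ...], str]],
                 path: Sequence[str]) -> List[str]:
    """Return the distinct items directly beneath `path`, in table order."""
    prefix = tuple(path)
    choices: List[str] = []
    for row_path, _address in rows:
        is_child = len(row_path) == len(prefix) + 1 and row_path[:len(prefix)] == prefix
        if is_child and row_path[-1] not in choices:
            choices.append(row_path[-1])
    return choices


def address_for(rows: Sequence[Tuple[Tuple[str, ...], str]],
                path: Sequence[str]) -> Optional[str]:
    """Return the web-page address recorded for `path`, or None if absent."""
    for row_path, address in rows:
        if row_path == tuple(path):
            return address
    return None
```

In this sketch, `menu_choices(SCHEMA_ROWS, ())` yields the choices for a contextual menu at the “Home” level, and `address_for(SCHEMA_ROWS, ("Computers", "Laptops"))` yields the address whose content would be shown after “Laptops” is selected.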

Referring now to FIG. 9, a method for content navigation is represented in the form of a flow-chart as indicated generally at 900. Method 900 can be performed using system 50, though it is to be understood that method 900 can be performed on variations of system 50, and likewise it is to be understood that method 900 can be varied to accommodate variations on system 50.

At block 910 a schema is requested. Block 910 is performed by web-browser application 86 (or a separate plug-in or other application configured to execute in conjunction with web-browser, such as a transcoding engine, not shown) which establishes a connection with schema server 62 in order to retrieve schema 102. At block 915 the schema is validated and returned. The validation of block 915 (which, it will be appreciated, like certain other aspects of method 900, will be understood to be optional) can be effected by server 62 which can perform a validation operation to confirm that schema 102 matches web-site 100 and is otherwise up-to-date. If validation is not achieved then an exception (e.g. an error) can be generated. Assuming validation is achieved, then schema 102 is returned to web-browser application 86 where it is loaded into web-browser application 86. Blocks 910 through 915 are represented in FIG. 10, as a connection between web-browser application 86 of client machine 54 and schema 102 of server 62 is indicated at reference 216 such that schema 102 is now loaded onto client machine 54 and available to web-browser application 86.

Also note that the means by which web-browser application 86 requests schema 102 is not particularly limited. In one particular embodiment, however, it is contemplated that web-browser application 86 will be configured to automatically make network requests over network 66 to request a schema that corresponds to website 100. For example, schema server 62 can have a predefined network address on network 66 that is preprogrammed into client machine 54. The type of network address is not particularly limited, and can be, for example, any type of network identifier such as an Internet Protocol (“IP”) address or a Uniform Resource Locator (“URL”). Any other suitable type of network address is contemplated. Client machine 54 can therefore be programmed to send a request to the address for schema server 62 and request that schema server 62 provide, if available, a schema (e.g. schema 102) that corresponds to web-site 100. (Note of course that in other embodiments, a separate schema can be provided for each web-page within web-site 100). The request at block 910 provided by client machine 54 can be formed with any unique identifier for each web-page, but in the context of the Internet the request would most typically be, or derived from, the URL associated with each web-page. In turn, that unique identifier can be used to index schema 102 on schema server 62.
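One possible way of forming such a request, offered only as a sketch, is to derive an identifier from the URL of the web-page and append it as a query parameter to the preprogrammed address of the schema server. The server address `https://schema-server.example/schemas`, the `site` parameter name, the example web-site host name, and the choice of the host name as the identifier are all assumptions for illustration and are not taken from the specification.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical preprogrammed network address of the schema server.
SCHEMA_SERVER_URL = "https://schema-server.example/schemas"


def site_identifier(page_url: str) -> str:
    """Derive a site-level identifier (the lower-cased host name) from a page URL."""
    host = urlsplit(page_url).hostname
    if host is None:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    return host


def schema_request_url(page_url: str) -> str:
    """Build the URL used to ask the schema server for the matching schema."""
    query = urlencode({"site": site_identifier(page_url)})
    return f"{SCHEMA_SERVER_URL}?{query}"
```

A per-page variant would pass the full page URL instead of the host name, so that the schema server could index a separate schema for each web-page.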

As well, authentication can be made through connection 216 to validate the origin of schema 102. For example, private and public key based authentication can verify that schema 102 originates from a trusted source.

Those skilled in the art will now recognize that system 50 can be implemented so that a plurality of web-sites (like web-site 100) are hosted over network 66 (either alone by server 58 or by a plurality of web-servers like web-server 58), and that a corresponding plurality of schemas for each of those web-sites (or each of the web-pages therein, or both) can be maintained on schema server 62. Those skilled in the art will now recognize that there can in fact be a plurality of schema servers (like schema server 62) and that client machine 54 can be configured to search for corresponding schema files on one or more of those schema servers. Those skilled in the art will now further recognize that schema servers can be hosted by a variety of different parties, including, for example: a) a manufacturer of client machine 54; b) a service provider that provides access to network 66 via link 70 on behalf of user U of client machine 54; or c) the entity that hosts web-site 100. In the latter example it can even be desired to simply host schema 102 directly on web-server 58 and thereby obviate the need for schema server 62.

Referring again to FIG. 9, at block 920 a web-page is selected. In this case home web-page 100-1 for web-site 100 is selected. Such a selection will typically have been made as part of web-browsing performed by user U, and indeed will have been done prior to invocation of method 900. In this embodiment, web-browser application 86 makes a request for home web-page 100-1. Such a request can be made directly bypassing server 62 altogether. (In other embodiments, discussed below in relation to FIG. 17, such a request can be made via server 62 or another server, with intermediate transcoding (e.g. transcoding of web-content 108-1, 108-2, 108-3 and 108-4) from the format of that content on web-server 58 into another format that is optimized for generation on display 224). At block 925, the selected web-page is requested, and at block 930 the selected web-page is returned. More particularly, web-server 58 returns web-page 100-1 to web-browser application 86. Blocks 920 through 930 are represented in FIG. 11 as a connection between web-browser application 86 and web-server 58 is indicated at 220 such that web-page 100-1 is now loaded onto client machine 54 and available to web-browser application 86.

Referring again to FIG. 9, at block 935 the web-page is generated using the schema within the web-browser. Block 935 is represented in FIG. 12. In this example, web-page 100-1 is generated on display 224 using the last column of Table I representing the aspect of schema 102 that corresponds with the home web-page 100-1 of web-site 100. As a result of performance of block 935, only content 108-1 is actually shown on display 224 while menu-pane 104-1 is removed from display within web-browser application 86.

At block 940, a determination is made as to whether native menu application 82 has been selected for activation. In a present embodiment, and referring again to FIG. 12, such a determination would be made by determining whether menu key 232 had been depressed. A yes determination would be made at block 940 if key 232 was depressed, whereupon method 900 would advance to block 945. If key 232 is not depressed, then a no determination would be made and method 900 would cycle back to block 935.

Within block 935, user U can perform the usual functions of web browsing, including scrolling through the page and selecting any individual links which may be active within content 108-1. Thus, user U could browse and otherwise interact with content 108-1 as if user U were operating a traditional desktop browser. It will now be understood that such interaction could lead to a selection of a different web-page which would otherwise interrupt performance of method 900. Such interaction is not expressly contemplated by method 900, for convenience and simplicity, but that is not to say that such interaction is excluded.

Assuming, however, that a “yes” determination is made at block 940 and method 900 advances to block 945, then at block 945 a contextual menu would be generated. FIG. 13 represents performance of block 945, as invocation of menu application 82 has caused contextual menu M-104-1 to be rendered in conjunction with content 108-1 on display 212. As described above, contextual menu M-104-1 is generated using native menu application 82. Native menu application 82 interacts with web-browser application 86 in order to obtain the relevant contents of menu-pane 104-1 in order to ultimately generate contextual menu M-104-1.

At block 950, a determination is made as to whether a web-page has been selected. User U can thus scroll through the various options presented on contextual menu M-104-1 in much the same manner that user U could scroll through the options presented on contextual menu M-90 as discussed above. Thus, at block 950, a determination would be made as to whether user U interacting with contextual menu M-104-1 using menu application 82 made a selection corresponding to one of “Computers”; “Computer Add-ons”; “Software”; “Photo-finishing”; “TV & Video”; or “Audio”.

If the determination at block 950 is “no” then at block 955 a determination is made as to whether a selection was made to close the menu application. Continuing with the present example, it would be determined whether user U interacting with contextual menu M-104-1 using menu application 82 made a selection corresponding to “Close Menu”. If the determination at block 955 is “yes”, then method 900 returns to block 935, contextual menu M-104-1 would close, and display 224 would return to the appearance shown in FIG. 12.

If the determination at block 955 is “no” then method 900 advances to block 960 where a determination is made as to whether a control item was selected. Continuing with the present example, in FIG. 13 an exemplary control option entitled “close browser” is provided. (It should be noted that other control options can be provided, such as “switch application”. Other control options will now occur to those skilled in the art.) A “yes” determination would therefore be made at block 960 if user U interacting with contextual menu M-104-1 using menu application 82 made the selection corresponding to “Close Browser”. Such a “yes” determination could lead to the termination of method 900 as web-browser application 86 is closed altogether and operation of client machine 54 directed to execution of another application, such as a main menu application (not shown).

If the determination at block 960 is “no”, (i.e., user U interacting with contextual menu M-104-1 using menu application 82 made a selection corresponding to “Home”), then method 900 cycles back to block 950.

Referring again to block 950 of FIG. 9, if the determination at block 950 was “yes”, (i.e. that user U interacting with contextual menu M-104-1 using menu application 82 made a selection corresponding to one of “Computers”; “Computer Add-ons”; “Software”; “Photo-finishing”; “TV & Video”; or “Audio”), then method 900 cycles back to block 925 where another web-page corresponding to the selection is requested. Method 900 continues to perform thereafter in substantially the same manner as previously described, except that the newly selected web-page is now generated and the corresponding contextual menu for that page is loaded accordingly.
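
The decisions made at blocks 950, 955 and 960 once the contextual menu is shown can be summarized in a short sketch. The function name `handle_menu_selection`, the constant names and the returned labels are hypothetical; only the block numbers and the example menu entries are taken from the description above.

```python
# Menu entries that correspond to web-pages (block 950) and to control
# items (block 960) in the present example.
PAGE_SELECTIONS = {"Computers", "Computer Add-ons", "Software",
                   "Photo-finishing", "TV & Video", "Audio"}
CONTROL_ITEMS = {"Close Browser"}


def handle_menu_selection(selection: str) -> str:
    """Return a label for the step of method 900 that follows a selection."""
    if selection in PAGE_SELECTIONS:        # block 950: "yes"
        return "block 925: request page for " + selection
    if selection == "Close Menu":           # block 955: "yes"
        return "block 935: redisplay page"
    if selection in CONTROL_ITEMS:          # block 960: "yes"
        return "end: close browser"
    return "block 950: wait for another selection"  # block 960: "no"


for choice in ("Software", "Close Menu", "Close Browser", "Home"):
    print(choice, "->", handle_menu_selection(choice))
```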

FIGS. 14, 15 and 16 show further exemplary screen shots of how navigation would be effected using method 900 proceeding from the screen shot in FIG. 13. It will now be appreciated that FIG. 13 corresponds to FIG. 5; FIG. 14 corresponds to FIG. 6; FIG. 15 corresponds to FIG. 7; and that FIG. 16 corresponds to FIG. 8, except that FIGS. 5-8 show how various pages of web-site 100 would be rendered using a traditional desktop browser, whereas FIGS. 13-16 show how those same pages would be rendered using method 900. Note that on FIG. 16, corresponding to FIG. 8, user U can, if desired, select a specific laptop listed on content pane 108-4 using browser application 86 in order to obtain further information about it and/or select it for purchase.

Referring now to FIG. 17, a system for content navigation in accordance with another embodiment is indicated generally at 50a. System 50a is a variation of system 50 and therefore like elements in system 50a bear like references to counterpart references in system 50, except followed by the suffix “a”. Of note, however, is that in system 50a, server 62a includes a transcoding engine 103a that is configured to, on behalf of device 54a, transcode web-content 108a-1, 108a-2, 108a-3 and 108a-4 from the format of that content as maintained by web-server 58a into another format that is optimized for generation on the display of device 54a. Thus, in system 50a, web-content 108a-1, 108a-2, 108a-3 or 108a-4 destined for device 54a is retrieved from server 58a via server 62a, whereby server 62a transcodes that content prior to sending that content to device 54a. As a non-limiting example, the transcoded version of web-content 108a-1 is identified as web-content 108a′-1 in an oval associated with web-browser 86a. In this way, a transcoding engine (or other transcoding functions) need not be placed on (or at least implemented by) device 54a, thereby freeing up resources on device 54a. Those skilled in the art will now recognize that block 935 of method 900 would be modified for system 50a, whereby the webpage would be generated using transcoding engine 103a instead of doing any transcoding using the schema within browser 86.

Referring now to FIG. 18, a method for content navigation in accordance with another embodiment is represented in the form of a flow-chart as indicated generally at 900b. Method 900b can be performed using system 50, though it is to be understood that method 900b can be performed on variations of system 50, and likewise it is to be understood that method 900b can be varied to accommodate variations on system 50. Method 900b is a variation on method 900 and therefore like blocks in method 900b bear like references to counterpart blocks in method 900, except followed by the suffix “b”. While methods 900 and 900b are substantially the same, of note is that in method 900b, block 945 is omitted and substituted with block 946b. Block 946b itself is a variation on block 945, except that the menu pane M-104 that is generated includes additional depth beyond the depth provided in the original menu pane 104. Additional depth is meant to indicate, for example, that when generating menu pane 104-1 on device 54, menu pane M-104 may be modified to include at least a portion of the menu selections found in one or more of menu panes 104-1, 104-2, 104-3, 104-4. (An exemplary modified menu pane generated by block 946b is shown in FIG. 19 as menu pane M-104b-1, which will be discussed in further detail below.) In one exemplary extreme, not shown in the Figures, menu pane M-104 may be modified to include all of the selections in 104-1, 104-2, 104-3, 104-4.

Method 900b addresses one problem of browsing between web-pages on mobile electronic devices, whereby browsing through multiple pages can be time consuming, resource (e.g. bandwidth, processor, memory) intensive and, depending on the rate plan available to user U, financially expensive for user U. Method 900b can allow users to navigate through multiple levels of web page menus. Turning now to FIG. 19, in menu pane M-104b-1, the selections of menu pane M-104-3 are combined into the selections of menu pane M-104-1, thereby allowing user U to navigate directly to the contents of menu pane M-104-3 and bypass the contents of menu pane M-104-2. As a practical example, if user U wishes to view 17.0″ laptops (i.e. the content on web-page 108-4 in FIG. 16), then rather than having to navigate through web-pages 108-1, 108-2, and 108-3, one page at a time, user U can be permitted to reach web-page 108-4 in FIG. 16 directly from web-page 108-1 in FIG. 19, directly selecting the ultimate target without having to go through each intermediary web page 108-1, 108-2, and 108-3. Implementing method 900b can be effected by examining the full contents of Table I and generating a modified menu pane M-104 that reflects the desired combinations of one or more of menu panes M-104-1, M-104-2 and M-104-3.
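
As a non-limiting sketch, a modified menu pane with additional depth could be generated by flattening a menu hierarchy down to a chosen number of levels. The menu hierarchy below, the function name `flatten_menu` and the “ > ” separator are hypothetical and serve only to illustrate the idea; they are not the contents of Table I.

```python
from typing import Dict, List

# Hypothetical menu hierarchy: each key is a menu selection and each value
# holds the selections of the menu pane reached by choosing that selection.
MenuTree = Dict[str, "MenuTree"]

site_menu: MenuTree = {
    "Computers": {
        "Laptops": {"15.4 inch": {}, "17.0 inch": {}},
        "Desktops": {},
    },
    "Software": {},
}


def flatten_menu(tree: MenuTree, depth: int, prefix: str = "") -> List[str]:
    """List menu selections down to `depth` levels, parents before children."""
    entries: List[str] = []
    for label, children in tree.items():
        path = prefix + label
        entries.append(path)
        if depth > 1:
            # Pull the selections of the deeper menu pane into this one.
            entries.extend(flatten_menu(children, depth - 1, path + " > "))
    return entries


print(flatten_menu(site_menu, depth=1))  # original depth only
print(flatten_menu(site_menu, depth=3))  # additional depth combined in
```

With a depth of one, only the selections of the top-level menu pane are listed; with a depth of three, the deepest selections become directly reachable from the first menu.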

The determination of which portions of menu panes M-104-1, M-104-2 or M-104-3 are to be combined is not particularly limited. For example, a record can be kept of the most popular selections by all users of web-site 100, and direct links to those selections can be included. Alternatively, specific promotions can be chosen to be combined into the modified menu pane M-104 (e.g. where the operator of server 58 wishes to promote the sale of 17.0″ laptops in FIG. 16). Alternatively, a browsing history by user U of device 54 can be maintained, so that the first time user U browses web-site 100, method 900 is invoked so that user U is presented with the screens shown in FIGS. 13, 14, 15 and 16, but upon returning to web-site 100, method 900b is invoked and user U will be initially presented with the screen shown in FIG. 19 in anticipation of user U's desire to browse directly to the screen shown in FIG. 16.
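
One of the options above, keeping a record of the most popular selections, could be sketched as follows. The function name `popular_shortcuts` and the sample selection log are assumptions made for this illustration.

```python
from collections import Counter
from typing import Iterable, List


def popular_shortcuts(selection_log: Iterable[str], limit: int) -> List[str]:
    """Return up to `limit` of the most frequently selected pages."""
    counts = Counter(selection_log)
    return [page for page, _ in counts.most_common(limit)]


# Hypothetical record of selections made by users of the web-site.
log = ["17.0 inch laptops", "Audio", "17.0 inch laptops",
       "Software", "Audio", "17.0 inch laptops"]

print(popular_shortcuts(log, limit=2))
```

The returned selections could then be added as direct links to the modified menu pane.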

The foregoing presents certain exemplary embodiments, but variations or combinations or subsets thereof are contemplated. For example, other functions can be added to each contextual menu M as those menus are presented within browser application 86, such as the common “back” or “forward” commands as found in traditional desktop browsers. Also, the types of web-sites 100 are not intended to be limited to e-commerce web-sites.

Another embodiment provides a communications environment 10D. Referring to FIG. 20, a communications environment 10D has one or more Web sites 20D that have a collection of Web files 52D on a particular subject that includes a beginning file called a home page, which is reached by a computing device 101D over a communications network 11D via a network address (e.g. URL). From the home page, or through direct access to any page without going through the respective home page, a user, using a web site browser (via the device 101D), can access both content 50D and related navigation 54D of all the other pages on the Web site. It is recognized that the Web site is typically hosted on one or more Web servers. A server in this context is a computer device 101D that holds the files for one or more Web sites. For example, a large Web site may be hosted on a number of servers located in many different geographic places.

Access to the Web sites over the network 11D can be done directly, in terms of desktop devices 26D, and through a proxy gateway 22DD, further described below. Accordingly, one or more mobile devices 24D (e.g. PDAs, mobile phones, etc.) and one or more desktops 26D can use the gateway to access the pages (both content 50D and navigational 54D aspects). The gateway can be used to format or otherwise monitor the interaction of the user of the devices 24D, 26D with the content 50D and navigational 54D aspects of the Web pages.

Overview of the Environment 10D

Specifically, the environment 10D can take an unstructured web page (e.g. HTML) and convert it into a structured database, for example. The aim is not to simplify HTML for any page, but to understand the data in a page and the relationships (between data content and between data content and navigational items tied to that page content) that govern the data in the page. Accordingly, knowledge of the data contained in the page content (e.g. data type, such as navigation versus published content, as well as which of the published content is related to each other and which of the navigation data is related to each other and to which published content on the page) can be used (for example via a signature file) to extract data from the web site (for example on a page by page basis or by another defined collection of information, such as file by file) for consumption by the mobile/desktop device 101D. Therefore, it is the gateway that acts as the proxy between the desktop/mobile and the web site, accommodating requests for web site data from the mobile/desktop and the corresponding web site data sent from the web site in response to the request. It is recognized that the data (e.g. web page 60D) obtained by the gateway from the web site could be any structured document (e.g. HTML, XML, etc.), optionally in the form of a web page, for which the signature file has predefined knowledge about the contents of the document (e.g. meaning of data contained within tags/delimiters as well as the interrelationships between the data in the document). One example of this is a web page described in HTML, which can be referred to as unstructured content.

It is recognized that the extraction process of the gateway for extracting data from the web page of the web site can be used to obtain only that data (e.g. published content and/or navigational data) that is pertinent to a simplified display on the screen of the user device 101D. The reason for generation of the simplified display of the data obtained from the original web site content (e.g. a web page) can be such as but not limited to: limited display space for the generated simplified data display on the user device 101D (e.g. physical space restrictions such as for a mobile screen, or user/system defined space restrictions such as for only a portion of the theoretically available desktop screen space); and user preference pertaining to continuity of browsing/transactional/session experience. An example of user preference is where the user starts the interaction with the web site and resultant displayed data (published content and navigational data) on the mobile (i.e. mobile formatted data display) and then wishes to retain the formatting of the mobile when continuing to view on the desktop screen. For example, the user on the desktop can continue to browse the published content and navigational data of the web site as previously experienced on the mobile, using only a portion of the desktop screen (for example) for data display.

The remaining description will refer to the document obtained from the web site as a web page, for exemplary purposes only. Large data-driven sites don't maintain thousands of pages. They have a few page templates and populate them from a database of information, news, shopping etc. Each template represents a family of pages. And a family of pages has objects and attributes.

Example 1
News Site

Family: List Page

Objects: lists a selection of news stories

Attributes: Title, abstract and date

Family: Detail page

Objects: lists a single news story (and maybe other related stories)

Attributes: Journalist, City, Date, Title, Full Story, Image

Example 2
E-Commerce Site

Family: List Page

Objects: lists a selection of products

Attributes: Image, Item Name, Price, Sale Price

Family: Search Page (a specific kind of list page)

Objects: same as list page ± a few

Attributes: same as list page ± a few

There are a few families of pages that can be managed to get an entire website accessible via a signature file, further described below:

List Pages—browse by category, by search, featured products

Detail Pages: details of a specific object, together with other information on the page

Search: to enter search information

Input: To do things like enter billing information (these are typically individual pages)
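
For illustration, the families of pages from Examples 1 and 2 could be recorded in a simple data structure such as the one below. The class name `PageFamily` and its field names are hypothetical; the family names, objects and attributes are those listed in the examples above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PageFamily:
    site: str                     # kind of web site the family belongs to
    name: str                     # family of pages sharing one template
    objects: str                  # what the pages of this family list
    attributes: Tuple[str, ...]   # attribute fields of each object


FAMILIES = (
    PageFamily("News Site", "List Page", "a selection of news stories",
               ("Title", "Abstract", "Date")),
    PageFamily("News Site", "Detail Page", "a single news story",
               ("Journalist", "City", "Date", "Title", "Full Story", "Image")),
    PageFamily("E-Commerce Site", "List Page", "a selection of products",
               ("Image", "Item Name", "Price", "Sale Price")),
)

# Look up the attribute fields expected on an e-commerce list page.
ecommerce_list = next(f for f in FAMILIES
                      if f.site == "E-Commerce Site" and f.name == "List Page")
print(ecommerce_list.attributes)
```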

Signature Files

We identify the signature for each family of pages (the family template) that 1) can automatically identify a given page on a website as part of the family and 2) differentiates that family from another family of pages. Similarly, each object and attribute field can have a unique signature within a family of pages that we need to identify only once for the family.

A signature file can contain numerous pieces of information, for example:

1) identifying the page family

2) identifying the objects and attributes in the page

3) Specifying the relationship between the objects and attributes.

In the case of a document received as a file, the signature file can contain knowledge about the type of file, the objects/attributes of the file, and the relationships between the objects and attributes in the file. A further example of the web site data can be such as but not limited to news articles and RSS feeds or other information feeds (stock tickers, etc.).
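
A minimal sketch of how a signature could carry the three kinds of information listed above is given below. The format of an actual signature file is not specified here, so the class name `Signature`, its fields, the use of regular expressions and the sample HTML are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Signature:
    family: str                         # 1) the page family being identified
    page_marker: str                    # regex that marks a page of this family
    object_pattern: str                 # 2) regex capturing each object
    attribute_patterns: Dict[str, str]  # 2)/3) attribute name -> regex applied
                                        #       inside each captured object

    def matches(self, html: str) -> bool:
        """Decide whether a page belongs to this family."""
        return re.search(self.page_marker, html) is not None

    def extract(self, html: str) -> List[Dict[str, Optional[str]]]:
        """Return one record per object, keyed by attribute name."""
        records = []
        for obj in re.findall(self.object_pattern, html, flags=re.S):
            record = {}
            for name, pattern in self.attribute_patterns.items():
                found = re.search(pattern, obj, flags=re.S)
                record[name] = found.group(1) if found else None
            records.append(record)
        return records


# Hypothetical list page and a signature for its family.
page = (
    '<ul class="products">'
    '<li><span class="name">Laptop A</span><span class="price">$999</span></li>'
    '<li><span class="name">Laptop B</span><span class="price">$1299</span></li>'
    '</ul>'
)
list_page = Signature(
    family="List Page",
    page_marker=r'class="products"',
    object_pattern=r"<li>(.*?)</li>",
    attribute_patterns={
        "Item Name": r'class="name">(.*?)<',
        "Price": r'class="price">(.*?)<',
    },
)

if list_page.matches(page):
    print(list_page.extract(page))
```

Because each attribute pattern is applied only within a captured object, the relationship between an object and its attributes is preserved in the extracted records.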

Schema Engine

This component uses the signature file for a website to create content data in response to the web page request, from the mobile/desktop, efficiently on the fly and send the data to the client. The data can include web page content data and navigational data obtained from the web page as requested. Alternatively the information can be stored to start building a database of the site, optionally. The construction of this database can be saved locally to the gateway, otherwise cached to the local storage of the user device, and/or cached/stored at the web site or third party (e.g. a search engine service used for comparison of data from different web sites).

Separation of Navigation & Content

Navigation items are on the same page as content, but it may not make sense (in situations with limited screen real estate available) to display the page in the original web page format as obtained from the website by the gateway. Schema extracts the navigational items separately to create a navigational portion of the web page. The environment can do interesting things with the separated navigational items, such as feeding them to an application in the background to help improve the browsing experience, or otherwise reformatting the presentation of the navigational items on the display of the mobile/desktop, in order to help with navigation and maintaining navigation context in situations with limited display space available for presentation of the web page.
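
The separation of navigational items from content could be sketched as follows, using the HTML parser from the Python standard library. The sketch assumes, purely for illustration, that navigation links are enclosed in a `<nav>` element; in the environment described, the signature file would indicate where the navigational items are located. The class name `NavContentSplitter` and the sample page are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class NavContentSplitter(HTMLParser):
    """Collect links found inside <nav> separately from the remaining text."""

    def __init__(self) -> None:
        super().__init__()
        self.nav_depth = 0
        self.current_href: Optional[str] = None
        self.navigation: List[Tuple[str, str]] = []  # (label, href) pairs
        self.content: List[str] = []                 # text outside navigation

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif tag == "a" and self.nav_depth:
            self.current_href = dict(attrs).get("href") or ""

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth:
            self.nav_depth -= 1
        elif tag == "a":
            self.current_href = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.nav_depth:
            # Inside navigation: keep only the text of links.
            if self.current_href is not None:
                self.navigation.append((text, self.current_href))
        else:
            self.content.append(text)


page = ('<nav><a href="/computers">Computers</a>'
        '<a href="/audio">Audio</a></nav>'
        '<p>Welcome to the store.</p>')

splitter = NavContentSplitter()
splitter.feed(page)
print(splitter.navigation)
print(splitter.content)
```

The navigation list could then be handed to a menu application while only the content list is shown on the display.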

Continuance of Sessions

In the environment 10D, the user can start browsing from the PC or mobile device 101D and complete a purchase on either (or otherwise continue the sessions). We can continue the session to realize benefits, such as revenue share, that could be lost if continuance of sessions was not enabled. Continuance of sessions can also give users seamless flexibility to use their PC and mobile to buy/browse things from websites and to replicate the buy/browse information.

The continuance of sessions can be facilitated by the use of rich bookmarking that is generated from the desktop tagging tool discussed below, such that a rich bookmark is created that has bookmark components (e.g. a displayable link) such as but not limited to: a URL (e.g. network address of the web site data); and identified portions of the web site data located with respect to that URL (e.g. item image, item title, description of item, text body related to item such as an article, etc.). The portions of web site data associated with the URL (e.g. page/file name) can be considered key or otherwise memorable data preferred by the user with respect to item(s) on the URL (for example product name/price/image).
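
As an illustrative sketch only, a rich bookmark could be represented as a URL together with the identified portions of the web site data. The class name `RichBookmark`, the field names, the `label` helper and the example URL are assumptions and not part of the description.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RichBookmark:
    url: str                                  # network address of the data
    portions: Dict[str, str] = field(default_factory=dict)  # e.g. title, price

    def label(self) -> str:
        """Build a displayable link text from the memorable portions."""
        title = self.portions.get("title", self.url)
        price = self.portions.get("price")
        return f"{title} ({price})" if price else title


bookmark = RichBookmark(
    url="http://www.example.com/laptops/17",
    portions={"title": "17.0 inch laptop", "price": "$1299"},
)
print(bookmark.label())
```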

Desktop Tagging Tool and Automatically Creating Signature Files

This uses artificial intelligence to analyze any page in one or more ways, such as but not limited to:

1) delimiter (e.g. HTML tag) structure and properties; and

2) Spatial analysis of objects located on a rendered page.

Generally main content is closer to the centre of the page, is bigger and is meant to stand out more to the user. Properties in the HTML mark up can be used to accomplish this and we have AI that can identify these properties. One embodiment is where we use the rendered page, in combination with tag analysis. One benefit is that this feature could be used to generate the signature files automatically by guessing, at least significantly speeding up the creation of signature files if not completely automating it. Another use of the desktop tagging tool is to create a list of rich bookmarks for later use by the user and/or for publishing or otherwise sharing with other users. One example of this would be a list of rich bookmarks provided by one user to another user, such that the list of rich bookmarks contains URLs and associated data from one or more web sites.
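
The observation that main content tends to be bigger and closer to the centre of the rendered page could be expressed as a simple scoring heuristic, sketched below. This is not the analysis performed by the tagging tool; the scoring formula, the names `Box`, `main_content_score` and `guess_main_content`, and the sample layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class Box:
    name: str
    x: float       # left edge of the rendered object
    y: float       # top edge of the rendered object
    width: float
    height: float


def main_content_score(box: Box, page_w: float, page_h: float) -> float:
    """Score is higher for boxes that are larger and nearer the page centre."""
    area_fraction = (box.width * box.height) / (page_w * page_h)
    cx, cy = box.x + box.width / 2, box.y + box.height / 2
    distance = hypot(cx - page_w / 2, cy - page_h / 2)
    max_distance = hypot(page_w / 2, page_h / 2)
    centrality = 1 - distance / max_distance
    return area_fraction * centrality


def guess_main_content(boxes: List[Box], page_w: float, page_h: float) -> Box:
    """Pick the box with the highest score as the likely main content."""
    return max(boxes, key=lambda b: main_content_score(b, page_w, page_h))


# Hypothetical layout of a rendered 1000 x 800 page.
layout = [
    Box("menu", 0, 0, 200, 800),
    Box("banner", 200, 0, 800, 100),
    Box("article", 200, 100, 600, 600),
]
print(guess_main_content(layout, 1000, 800).name)
```

Such a guess could serve as a starting point for a signature that is then confirmed or corrected by a person.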

Conducting a Transaction

With regard to billing and completing a transaction, a user goes through a number of pages to navigate to an item they want to buy and then must continue browsing through a number of further pages to complete a checkout or transaction. The provided description of the environment 10D includes detailed explanations of the analysis and output of a requested web page. The same process can be extended to all web pages browsed from start to end to complete a transaction. It is recognized that the transaction can be such as but not limited to: browsing for and subsequent purchase of item(s); and/or browsing and subsequent saving of published content (e.g. news article), as desired.

For example, actions one through ten in FIG. 44 represent a user browsing through ABC ComTech Corp. to purchase an item. The web pages representing actions one through ten are shown in FIGS. 35-39 respectively.

Web Sites 20D

Referring again to FIG. 20, Web sites 20D have the plurality of pages 52D (e.g. defined using HTML, XML, XHTML, JavaScript and other structured definition web programming languages—e.g. based on W3C standards—e.g. WSDL). A Web site can be provided as a Web service, which can be a software system, designed to support interoperable device 101D to device 101D interactions over the network 11D. Web services can be Web APIs that can be accessed over the network 11D, such as the Internet, and executed on the remote device 101D hosting the requested services.

For example, a Web service definition encompasses many different systems, but can refer to clients and servers that communicate XML messages that follow the SOAP-standard, via a description of the operations supported by the server e.g. in WSDL.

The composition of the Web pages can include displayed content and navigation features.

Web pages typically have both of these features on each page and will display content in the main content areas and have navigation options through menus, as shown by example in FIG. 26. This web page layout is structured for access by desktop 26D browsers, where the screens are large enough to display the entire page. However, most mobile 24D browsers may not have the width and height of a typical PC monitor, and therefore they can be unable to display pages as they would appear on a PC browser. One approach to deal with this is to re-organize the page and wrap content around the screen. A second approach, used by the WAP standard, is to spatially divide a page (usually vertically) into a number of pages and allow users to navigate between each page section to view a page. A third approach (through the gateway 22DD) is to functionally divide website features into separated content 50D and navigation 54D components, further described below. One example of content 50D is published content (e.g. articles, stories, news, product information, etc.), which is data that is meant to be read/listened to by the user.

Content 50D

The content can include computer files, image media, audio files, electronic documents, which are either located on/in the Web page or are otherwise accessible through navigation/requests from a particular Web page and/or Web service. For example, Web content can be referred to as textual, visual or aural content that is encountered as part of the user experience through interaction with Web sites/services. Web content may include, among other things: text; images; sounds; videos; animations; and feeds (video, audio, and/or textual). For example, the pages can present content predominantly composed of HTML, or some variation; in addition, data, applications, e-services, images (graphics), audio and video files, personal Web pages, archived e-mail messages, and many more forms of file and data systems can belong to Web sites and Web pages.

Examples of content can be as follows:

1) Tables for presenting information displayed in a grid, such as a calendar, or in a spreadsheet, such as financial data. Tables can be used to have greater control over page layout. For example, a table can help ensure that text and graphics are displayed in their correct location. A table can also encompass an entire page, with nested tables (including content and/or navigation features) within the main table for even more layout control;

2) video and audio files;

3) text, e.g. articles, which for most web pages is one of the most important features. Text can be used to present ideas, instructions, and/or educational/recreational content; and

4) Images (e.g. GIF, JPEG) can be used in web pages to support the theme of the web page and to provide a visual impression. Images can be separate image files and may not reside in the HTML document itself, but can be stored in the same location as the web page. Images can be scanned photographs or pictures, may be created in a draw program, or may be downloaded from another web site.

Navigation 54D

The mobile 24D and desktop 26D devices coordinate user events 109D of the respective users, through operation of the browsers (or other applications) 207D in interaction with the supporting navigation features of the Web pages/sites. The navigation features can include visual based controls, text controls, and/or a combination thereof. For example, there can be three basic types of navigation: Hierarchical that applies to Web sites that are information-rich and are organized as a large tree, much like a library; Global that applies to Web sites where the user can logically jump among all points (e.g. content and/or other navigation controls); and Local that applies when the user wants to access a depth of information/content within broader areas/content of the Web site.

Examples of navigation 54D mechanisms with respect to the content of a Web site can include such as but not limited to: embedded links (e.g. anyplace where one links content within the body of the page); and navigation buttons, graphic and text-based. As well, text entry fields can be used to navigationally access content and other navigation features of Web sites.

Further examples of navigation 54D mechanisms can be such as but not limited to:

1) Buttons can be images with text on them that provide a means to navigate from one location to another. Buttons may be created in a draw program or downloaded from other web sites.

2) Menu Bar can be features on a web page that provide links to other pages for easy navigation between the pages or other Web sites. Menu bars may contain buttons (e.g. text/images), they may be created as a table, or they may be text-based with divider lines; and

3) Links provide “branching capabilities”, i.e. the ability to go to another site/page. Links are “jump starts” to other web pages/sites. A link may take the user to another page or it may take them to another site.

Devices 101D

Referring to FIG. 21, a computing device 101D of the system 10D can include a network connection interface 200D, such as a network interface card or a modem, coupled via connection 218D to a device infrastructure 204D. The connection interface 200D is connectable during operation of the devices 101D to the network 11D (e.g. an Intranet and/or an extranet such as the Internet), which enables the devices 101D to communicate with each other (e.g. that of the mobile 24D, gateway 22DD, desktop 26D and Web site 20D) as appropriate. The network 11D can support the communication of the web pages as requested.

Referring again to FIG. 21, the device 101D can also have a user interface 202D, coupled to the device infrastructure 204D by connection 222D, to interact with a user (e.g. mobile user, gateway administrator, website administrator, desktop user; not shown). The user interface 202D can include one or more user input devices such as but not limited to a QWERTY keyboard, a keypad, a stylus, a mouse and a microphone, and one or more user output devices such as an LCD screen display and/or a speaker. If the screen is touch sensitive, then the display can also be used as the user input device as controlled by the device infrastructure 204D.

Referring again to FIG. 21, operation of the device 101D is facilitated by the device infrastructure 204D. The device infrastructure 204D includes one or more computer processors 208D and can include an associated memory 22D (e.g. a random access memory). The computer processor 208D facilitates performance of the device 101D configured for the intended task (e.g. of the respective module(s) of the host system 14D) through operation of the network interface 200D, the user interface 202D and other application programs/hardware 207D (e.g. browser or other device application on the mobile/desktop, web page—content and/or navigation—server of the gateway, and Web server of the Web site) of the device 101D by executing task related instructions. These task related instructions can be provided by an operating system, and/or software applications 207D located in the memory 22D, and/or by operability that is configured into the electronic/digital circuitry of the processor(s) 208D designed to perform the specific task(s). Further, it is recognized that the device infrastructure 204D can include a computer readable storage medium 212D coupled to the processor 208D for providing instructions to the processor 208D and/or to load/update the instructions 207D. The computer readable medium 212D can include hardware and/or software such as, by way of example only, magnetic disks, magnetic tape, optically readable medium such as CD/DVD ROMS, and memory cards. In each case, the computer readable medium 212D may take the form of a small disk, floppy diskette, cassette, hard disk drive, solid-state memory card, or RAM provided in the memory module 22D. It should be noted that the above listed example computer readable mediums 212D can be used either alone or in combination.

Further, it is recognized that the computing device 101D can include the executable applications 207D comprising code or machine readable instructions for implementing predetermined functions/operations including those of an operating system and the host system 14D modules, for example. The executable instructions 207D can be an application hosted on the user mobile/desktop for interacting with the gateway, the engine and other related components for when acting as a data proxy between the mobile/desktop and the web site, or a web service (e.g. search engine crawling tool) for use by the web site, as configured by the respective device 101D when operating within the environment 10D. The processor 208D as used herein is a configured device and/or set of machine-readable instructions for performing operations as described by example above. As used herein, the processor 208D may comprise any one or combination of, hardware, firmware, and/or software. The processor 208D acts upon information by manipulating, analyzing, modifying, converting or transmitting information for use by an executable procedure or an information device, and/or by routing the information with respect to an output device. The processor 208D may use or comprise the capabilities of a controller or microprocessor, for example. Accordingly, any of the functionality of the executable instructions 207D (e.g. through modules associated with selected tasks) may be implemented in hardware, software or a combination of both. Accordingly, the use of a processor 208D as a device and/or as a set of machine-readable instructions is hereafter referred to generically as a processor/module for sake of simplicity.

The memory 22D is used to store data locally as well as to facilitate access to remote data stored on other devices 101D connected to the network 11D. This data can be related to data/user events of the mobile/desktop, data used by the gateway in obtaining and satisfying requests for web pages and associated content/navigation features, and/or actual Web site data, as appropriate for the use of the device 101D in the environment 10D.

The data can be stored in a table, which can be generically referred to as a physical/logical representation of a data structure for providing a specialized format for organizing and storing the data. General data structure types can include types such as but not limited to an array, a file, a record, a table, a tree, and so on. In general, any data structure is designed to organize data to suit a specific purpose so that the data can be accessed and worked with in appropriate ways. In the context of the present network environment 10D, the data structure may be selected or otherwise designed to store data for the purpose of working on the data with various algorithms executed by components of the executable instructions, depending upon the application thereof for the respective device 101D. It is recognized that the terminology of a table is interchangeable with that of a data structure with reference to the components of the network environment 10D.

Example Operation of Response 60D to Web Page Request by Gateway 22DD

Large data-driven sites 20D may not create and maintain thousands of pages. Instead, they use multiple page templates 62D and populate the templates from the database of content information. Examples would be online stores, news sites, sports information and weather. The association of the data in the database with the templates 62D is used to construct the web page(s) sent to the gateway 22DD.

See FIG. 45 for an example of a request/response for a page on the web site 20D:

1. A client makes a request to the ABC ComTech Corp.ca web server (2)

2. The web server calls the respective page family (3) depending on the page requested

3. The page family (3) retrieves data (all navigation and content) from the database (1) to populate data fields of the page 60D

4. The web server (2) transmits a completed webpage 60D to the gateway 22DD at (4).

It is recognized that a page family serves a specific function for use by the gateway 22DD via the signature file 64D (see below). For example, ABC ComTech Corp.ca has the following families of pages:

- List Family: Displays a list of products on a page (FIG. 31)
- Item Family: Displays details for a specific product (FIGS. 30 and 32)
- Search Family: Displays a list of product matches for a search keyword
- Other Families: Miscellaneous pages such as checkout/payment
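By way of example only, the request/response flow above can be sketched as follows. This is a minimal illustration and not the web site's actual implementation: the in-memory database, the template markup, the function name serve_page and the product values are all hypothetical placeholders.

```python
from string import Template

# Hypothetical stand-in for the content database (1); values are placeholders.
DATABASE = {
    "10086374": {"title": "Example Laptop", "price": "$100.00"},
}

# Hypothetical page templates 62D, one per page family (3).
TEMPLATES = {
    "item": Template(
        '<td class="product-details-prd-title">$title</td>'
        '<td class="price">$price</td>'
    ),
}

def serve_page(family: str, product_id: str) -> str:
    """Populate the template of the requested page family with database fields."""
    record = DATABASE[product_id]
    return TEMPLATES[family].substitute(record)

# The completed page is what the web server (2) would transmit at (4).
page = serve_page("item", "10086374")
```

Because every page of a family is produced from the same template, the markup around each data field is stable from page to page, which is the property the signature file relies upon.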

Accordingly, the environment 10D can take advantage of the fact that each web page 50D of the website 20D can follow a recognizable pattern of content data/location and navigational items related to the content data and content location. For example, in the ABC ComTech example template 62D, the text on the web page in red, located above a description of a computer product, is always the price of the computer product.

It is recognized that the web page could also be referred to as a document (e.g. file) that is analysed by the engine through use of the signature file in order to extract (or to insert in the case of passing information from the mobile/desktop back to the web site) information subset(s) of the document.

Signature File 64D

A signature file can be created once for a website and can then be used to efficiently analyze and extract data from pages 60D of the website. One advantage is that, as the signature file can be implemented as an application by the gateway 22DD, the gateway 22DD may not have to store any data from the website, and can instead fetch the data in real time upon request, as the webpage 60D matching the content/navigation request of the mobile 24D and/or desktop 26D. Another advantage of signature files is that they can be non-intrusive to the existing website infrastructure and may not require that a vendor or merchant make any changes to their website configuration/infrastructure. One preferable characteristic of the websites is the use of page families for representing the website data from their databases 22D, as further described below.

Using signature files, rich mobile applications can be created from large websites. Each page is “optimized” on the fly by extracting the data from the page and sending the data to a mobile device, through use of the signature file, thereby helping to significantly speed up loading time and save bandwidth. It is recognized that the Schema Engine is also able to format the content and navigational items obtained from the web pages 60D for efficient display on the display of the mobile/desktop devices 101D. Turning unstructured page content into relational data can also be significant, and can help to enable rich features for users such as custom alerts and price comparison. A user could save an item while browsing and the Schema Engine could automatically go back every day (or other selected time period) and check to see if the item was on sale or in stock. A user could also ask to find similar items at other stores, via appropriate URL requests to the gateway, which could be a supportable feature if product information was stored, for example.

With regard to search engines, today there is no way to automatically index the web and understand the specific details about unstructured information (e.g. embedded in a page format) that is resident on the web pages as embedded content and navigational items. For example, a search engine index of the product page of camera A would contain all the keywords for that page and would contain the prices $200, $100 and $50, but the relationship of the prices with respect to the camera is unknown to the search engine. A product page can contain information about a specific product, including its name, description and price, but may also have other products being recommended to the user on the same page, with their own prices and names. A search engine may know of all the names and prices in the page, but not know which sets of information belong to which product specifically; the index would not know which one of those prices is the actual price of camera A. Applying signature files 64D (e.g. having knowledge of which of the content is related to other portions of the content, as well as navigational aspects of the content) to search indexes can allow search engines to unlock the value of precise content that exists in their indexes and can significantly improve search results. It also enables new kinds of searches such as “Find the lowest price for this product” or “Show me all articles actually written by this author”, which would not show web pages that simply had the author's name in them. Accordingly, it is understood that signature files can be used with web pages that are unstructured (e.g. with little to no use of metadata for defining the content and navigational items resident in the document (e.g. page)), such as unstructured HTML.

With regard to price comparison or product recommendation sites, price comparison sites depend on vendor submissions for their offerings, and it can be very painful for vendors to prepare these data feeds. Crawling the sites like a search engine does not work for the reasons stated above, unless a signature file is applied. Using the signature file, accurate product information could be ascertained by applying signature files to a crawled cache or index, or by collecting price and product information as it flows through the Schema Engine. In this case, a template of the cached/indexed information would be used to create the respective signature therefor. The information could be much richer and cover a significantly larger portion of the web more reliably and easily. A price comparison engine could automatically crawl using signature files to build a complete database of an ecommerce site, using search criteria facilitated through the signature file to implement complex searches of the web site content on a page per page basis (e.g. find all cameras with prices through the use of the signature file for respective web sites, and then apply filters to the extracted data, such as identifying those cameras with a price under $200).
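As a minimal sketch of the filtering step mentioned above, and assuming the signature file has already turned each product page into a record, the extracted data could be filtered as follows. The records, field names and helper functions are hypothetical and are shown for illustration only.

```python
import re

# Hypothetical records, as they might look after signature-file extraction.
extracted = [
    {"title": "Camera A", "price": "$249.99"},
    {"title": "Camera B", "price": "$179.00"},
    {"title": "Camera C", "price": "$1,099.00"},
]

def price_value(text: str) -> float:
    """Convert a price string such as "$1,099.00" into a number."""
    return float(re.sub(r"[^0-9.]", "", text))

def under(records, limit):
    """Keep only the records whose price is below the given limit."""
    return [r for r in records if price_value(r["price"]) < limit]

cheap = under(extracted, 200)
```

The filter operates on the price field of each record, which is only meaningful because the extraction step has already associated each price with its own product.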

With respect to construction of an appropriate signature file, some terminology is explained using FIG. 34 as an example:

Page family: Item Page

Object: Product (A camera)

Object Elements: Picture (1), Title (2), Price (3), Description (not shown)

A complete signature file 64D for a website 20D can, for example and without limitation:

- Contain instructions to identify each page family on the website (list pages, search pages etc.);
- Contain instructions to extract desired objects and elements for each page family above;
- Specify the relationship between the objects and attributes (camera has a title, picture and price);
- Capture web page navigation including navigational items and paging; and
- Enable special functionality for the website including searching, logging in a user, purchasing items etc.

The following provides an overview of constructing the various components of the signature file 64D, for use in interpreting web pages 60D obtained from the web site and for reformatting the content and navigational items of the requested web pages for use by the mobile 24D and/or desktop 26D as reformatted pages 66D. The recognition of various elements of the web pages for use in defining the signature file 64D can be obtained through manual/automated/semi-automated analysis of the web pages (content and navigation), as desired.

Identifying a Page Family

An identifier for a page family can meet 2 criteria:

1) It is present in all pages belonging to the family

2) It is NOT present in pages belonging to any other family

In one embodiment, a string identifier is used that meets the above criteria as shown in the example below.

Code Snippet from Webpage Shown in FIG. 30

. . .

<tr>
<td colspan="2" class="product-details-prd-title"><span class="tx-heading3-dgrey">Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) - French - FS Exclusive</span></td>
<td class="product-details-r-bdr"> </td>
</tr>

. . .

Code Snippet from Webpage Shown in FIG. 32

. . .

<tr>
<td colspan="2" class="product-details-prd-title"><span class="tx-heading3-dgrey">Sony 7.2MP Digital Camera (DSCW55B) - Black</span></td>
<td class="product-details-r-bdr"> </td>
</tr>

. . .

The pages in FIGS. 30 and 32 are both from the item family. The text “product-details-r-bdr” occurs in both pages and in fact in all pages from the item family in ABC ComTech Corp. The text does not occur in the page shown in FIG. 27 belonging to the list family and in fact does not occur in any family of pages other than the item family. Since this satisfies both conditions of being a page family identifier, the text “product-details-r-bdr” is chosen as the identifier for the item page family.
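The two criteria for a page family identifier can be sketched as simple string tests. The following is an illustrative sketch only: the function names are hypothetical, and the “product-list-row” marker for the list family is an assumed value that does not come from the example web site.

```python
# Only "product-details-r-bdr" comes from the example above.
FAMILY_IDENTIFIERS = {
    "item": "product-details-r-bdr",
    "list": "product-list-row",  # assumed marker, for illustration only
}

def identify_family(html: str):
    """Return the single family whose identifier occurs in the page, else None."""
    matches = [name for name, marker in FAMILY_IDENTIFIERS.items() if marker in html]
    return matches[0] if len(matches) == 1 else None

def is_valid_identifier(marker, family_pages, other_pages):
    """Check both criteria: present in every family page, absent from all other pages."""
    in_all = all(marker in page for page in family_pages)
    in_none = not any(marker in page for page in other_pages)
    return in_all and in_none
```

A candidate such as “<td” would fail the second criterion, since it also occurs in pages outside the item family.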

Setting a Limit

A unique reference can be defined by setting a limit on a portion of a webpage.

Code Snippet A from Webpage Shown in FIG. 30.

function ImageFoundHtml( )
{
document.getElementById('largeImageRef').style.display="inline";
document.getElementById('largeImageNoRef').style.display="none";
}
//..>
</script>
<title >Future Shop: Computers: Laptops: Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) - French - FS Exclusive</title>
</head>

Code Snippet B from Webpage Shown in FIG. 30

<DIV id='largeImageRef' style="display:none">
<a href="#" onClick="openWindowAdv('http://www.futureshop.ca/popup/largeimagepopup.asp?logon=&langid=EN&file=/multimedia/products/large/10086374.jpg&title=Acer+Aspire+AMD+Turion+64+X2+Dual+Core+TL%2D52+1%2E60GHz+Laptop+%28AS9300%2D5383F%29+%2D+French+%2D+FS+Exclusive',550,524,0,0,0,0,0,0);" title="Click here for larger view"><img src="/multimedia/products/regular/10086374.gif" WIDTH="150" HEIGHT="150" BORDER="0" alt="Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) - French - FS Exclusive"></a>
<a href="#" onClick="openWindowAdv('http://www.futureshop.ca/popup/largeimagepopup.asp?logon=&langid=EN&file=/multimedia/products/large/10086374.jpg&title=Acer+Aspire+AMD+Turion+64+X2+Dual+Core+TL%2D52+1%2E60GHz+Laptop+%28AS9300%2D5383F%29+%2D+French+%2D+FS+Exclusive',550,524,0,0,0,0,0,0);">Click here for larger view</a>
</DIV>

The string “largeImageRef” is the string identifier used to identify and extract the product image for the page shown in FIG. 30. Code snippets A and B above illustrate a common problem that can occur: the string identifier needed (in snippet B) also occurs earlier in the document (in snippet A) and is therefore an ambiguous identifier on the page. One solution to this problem is constraining the scope of the Schema Engine to the appropriate part of the page in order to effectively use an identifier. This method allows the definition of seemingly unique identifiers even if they appear elsewhere on the page.
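A minimal sketch of such a limit is given below, assuming the scope is expressed as a start marker and an optional end marker. The function name find_in_scope and its parameters are hypothetical, and the Schema Engine may express the limit differently.

```python
def find_in_scope(html: str, ref: str, limit_start: str, limit_end=None) -> int:
    """Return the index of ref inside the limited region, or -1 if it is absent."""
    begin = html.find(limit_start)
    if begin == -1:
        return -1
    begin += len(limit_start)
    # Without an end marker, the region runs to the end of the document.
    end = html.find(limit_end, begin) if limit_end else -1
    if end == -1:
        end = len(html)
    return html.find(ref, begin, end)

# Shortened stand-in for snippets A and B: the identifier occurs in the
# script of the head and again in the DIV of the body.
page = (
    "<head><script>document.getElementById('largeImageRef')</script></head>"
    "<body><DIV id='largeImageRef'><img src=\"/multimedia/products/regular/10086374.gif\"></DIV></body>"
)

# Limiting the search to the part of the page after "</head>" skips the script occurrence.
idx = find_in_scope(page, "largeImageRef", "</head>")
```

With the limit in place, the first match is the occurrence in the DIV element, which is the one needed to extract the product image.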

Extracting Objects and Elements in a Page Family

The example page in FIG. 34 contains a camera object with the elements Picture (1), Title (2), Price (3), Description (not shown). The signature file therefore can have instructions to identify and extract the elements above as part of the product object for all pages in the item page family.

As an example let us try to construct an instruction to identify and extract the title from any item page such as the pages shown in FIGS. 33 and 34. We know that the output of the instruction should be the title of the product in FIG. 33: “Acer Aspire . . . Exclusive”. Below is a code snippet around the title code of the page:

<tr>
<td colspan="2" class="product-details-prd-title"><span class="tx-heading3-dgrey">Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) - French - FS Exclusive</span></td>
<td class="product-details-r-bdr"> </td>
</tr>

The following instructions will result in the output of the title:

1. Locate the string “product-details-prd-title”

2. Extract the value after the string in (1) and in between the strings “<span” and “</span>”

3. Strip all mark up tags—“class=“tx-heading3-dgrey”>”

4. The resulting string is the product's title

The code snippet for the page shown in FIG. 34 is below. Since both pages belong to the same family, we should be able to follow the instructions above and extract the title of this product. Notice that the instructions work and produce the string “Sony 7.2MP . . . Black”, which is the title of the product. A signature file specifies the instructions above as a single command with parameters.

<tr>
<td colspan="2" class="product-details-prd-title"><span class="tx-heading3-dgrey">Sony 7.2MP Digital Camera (DSCW55B) - Black</span></td>
<td class="product-details-r-bdr"> </td>
</tr>

The command representing instructions 1-4 above is shown below in a query language (e.g. the individual file entries) used in signature files developed for the purpose of data extraction, with some relevant parameters highlighted:

<lookup type="pex" action="get_string" name="title" ref="product-details-prd-title" location="after" start="<span" end="</span>" include_sz="1" strip_tags="1"/>

The signature file and processing of signature files by the Schema Engine are discussed in more detail later.
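For illustration only, instructions 1-4 above can be approximated by the following sketch. The function is a hypothetical stand-in and not the Schema Engine's implementation; in particular, keeping the delimiters and then removing them as tags is an assumption, since the semantics of the include_sz parameter are not defined in this section.

```python
import re

def get_string(html, ref, start, end, strip_tags=True):
    """Locate ref, then return the text between start and end that follows it."""
    pos = html.find(ref)
    if pos == -1:
        return None
    s = html.find(start, pos + len(ref))
    if s == -1:
        return None
    e = html.find(end, s + len(start))
    if e == -1:
        return None
    value = html[s:e + len(end)]  # delimiters kept here, removed below as tags
    if strip_tags:
        value = re.sub(r"<[^>]*>", "", value)
    return " ".join(value.split())  # collapse whitespace left by line breaks

snippet = (
    '<tr><td colspan="2" class="product-details-prd-title">'
    '<span class="tx-heading3-dgrey">Sony 7.2MP Digital Camera (DSCW55B) - Black</span></td>'
    '<td class="product-details-r-bdr"> </td></tr>'
)
title = get_string(snippet, "product-details-prd-title", "<span", "</span>")
```

The same call, with the same parameters, applies to any page of the item family, since only the text between the delimiters changes from page to page.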

Identifying Object and Element Relationships

The object and element relationships can be implicitly or explicitly specified. For example in the ABC ComTech Corp. list page shown in FIG. 31, the instructions are to first identify and extract the picture, second to identify and extract the title, third the link and fourth the price. The instructions then repeat for the rest of the products on the page. The specific ordering and grouping of the instructions above implicitly define objects that consist of those elements.
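The implicit grouping described above can be sketched as an ordered list of instructions that is applied repeatedly while a cursor advances through the page. The markup, class names and regular expressions below are hypothetical; in this assumed markup the link precedes the title text, so the instructions are ordered picture, link, title, price.

```python
import re

# Hypothetical list-page markup, for illustration only.
LIST_PAGE = (
    '<div class="row"><img src="a.gif"><a href="/item/1">Camera A</a>'
    '<span class="price">$200</span></div>'
    '<div class="row"><img src="b.gif"><a href="/item/2">Camera B</a>'
    '<span class="price">$100</span></div>'
)

# Ordered instructions: (element name, pattern with one capture group).
INSTRUCTIONS = [
    ("picture", r'<img src="([^"]*)"'),
    ("link", r'<a href="([^"]*)"'),
    ("title", r'>([^<]+)</a>'),
    ("price", r'class="price">([^<]*)<'),
]

def extract_objects(html):
    """Apply the instructions in order, repeatedly; each full pass yields one object."""
    objects, pos = [], 0
    while True:
        obj = {}
        for name, pattern in INSTRUCTIONS:
            match = re.compile(pattern).search(html, pos)
            if match is None:
                return objects  # no further complete object on the page
            obj[name] = match.group(1)
            pos = match.end()
        objects.append(obj)

products = extract_objects(LIST_PAGE)
```

No explicit object definition is needed: the elements found in one pass through the instructions are taken to belong to the same product.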

Other Aspects

The example and information above demonstrate how to capture data and relationships of objects and elements within a page of a web site 20D. The platform can also capture relevant attributes of an object across pages. For example, a user of the mobile 24D may click through a number of pages in the following categories in ABC ComTech Corp. to get to a specific TV, SONY456: TV & Video >19″-21″ TVs >LCD TVs >SONY456.

Another aspect is the ability to capture information about the product across the navigation of pages. In doing that, one can capture the categorization of the TV, “TV & Video >19″-21″ TVs >LCD TVs >”, and add that as another attribute of the object. This example shows how capturing of navigation metadata or information across pages can be a source of valuable information.

Although this example covers only displaying content, the same concepts apply for a page that requires input. The key input fields and values (e.g. the ability to enter search strings) can be identified in the same way and presented to the user of the mobile 24D, and the values captured and sent back to the website 20D via the gateway 22DD. The signature file 64D can be written in an XML based query language syntax (or other structured definition language and/or script language, for example) to specify the above identifiers and actions such as traversing backwards, forwards and extracting values. The language can be an SQL type query language and can be built on top of regular expressions.

Automatic Generation of Signature Files 64D

Described is a method of creating signature files that identify and extract specific contents from a webpage. It is recognized that, in view of FIG. 21, a device 101D for implementing the automated method of signature file generation can host a corresponding tool with associated modules, including a graphical user interface module to make selecting contents on a page easier for a user, for example.

The contents may be navigational items, lists, specific items from a list, and other content, for example. The reason that this is useful is that signature files can be manually created, which can be time consuming, and subject to human error. Therefore by automating this process, the turn around time for interpreting a website as a database through the gateway 22DD can be substantially faster and more accurate.

The automated generation method is to break down the html document (or other format of the web pages) into a hierarchy of tags (delimiters pertaining to a schema of the definition of the pages). The resulting structure can be a tree, which defines the parent, siblings and children of each object. The process (described in the following section) can identify the key objects that contain the data required for the signature file. Once an object is identified as being a required field within the database, the object would then identify its uniqueness by examining its properties (for example class, style, id). If the object is a text node of the tree (or other hierarchical structure), the object will use the properties of its parent. If the properties of the object are not unique, then the object would expand its uniqueness to its parent, siblings and children. The process would expand in all directions uniformly (i.e. examine parent, then previous sibling, then next sibling, then first child). The properties of each of these items would also merge with the required object. This process would then be repeated on the parent, then the previous sibling, etc., until a unique identifier was found. Once a unique identifier was found, an expression would be created for the signature. Note that at least two pages of the same family can be used to create the expression.
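The following is a simplified sketch of the tag hierarchy and the uniqueness search, using the HTMLParser class of the Python standard library. It is not the described tool: the class and function names are hypothetical, the input is assumed to be well formed, and for brevity the search expands only towards the parent, whereas the method described above also examines siblings and children.

```python
from html.parser import HTMLParser

VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}  # tags without content

class Node:
    def __init__(self, tag, attrs, parent=None):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children = []

class TreeBuilder(HTMLParser):
    """Build a simple parent/children hierarchy of tags."""
    def __init__(self):
        super().__init__()
        self.root = Node("#root", [])
        self.current = self.root
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.nodes.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.current.parent is not None:
            self.current = self.current.parent

def unique_key(node, all_nodes):
    """Walk up from node until an attribute is found that is unique in the page."""
    candidate = node
    while candidate is not None and candidate.tag != "#root":
        for item in candidate.attrs.items():
            owners = [n for n in all_nodes if item in n.attrs.items()]
            if len(owners) == 1:
                return candidate.tag, item
        candidate = candidate.parent
    return None

html = ('<html><body><img src="company_logo.gif" class="image" />'
        '<div class="product"><h1>Product title</h1><img src="product_image.gif" />'
        '<br><strong> $99.99 </strong></div></body></html>')
builder = TreeBuilder()
builder.feed(html)
strong = [n for n in builder.nodes if n.tag == "strong"][0]
key = unique_key(strong, builder.nodes)
```

Here the strong element has no properties of its own, so the search moves to its parent and settles on the class="product" attribute of the enclosing div element.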

The user will enter the required fields to be extracted from the page. These fields can be specified by a user using a corresponding graphical user interface of the device 101D to select fields. Alternatively, a tool similar to the Desktop tool (see below) could be used to automatically guess at the fields on a page. Automatic generation of the signature file assumes that one knows where the key information (e.g. price, image, description, etc.) resides on the page (i.e. its location within the document). For example, knowledge of where the key information is located in the web page (e.g. the image is located between these tags) can be obtained using a number of methods, such as but not limited to: looking at the code of the page by hand to identify the tags used to indicate content type (e.g. navigation, navigation of which content, title, price, image, item description, etc.); semi-automated use of a graphical tool to highlight portions on the page and therefore visually select which content data corresponds to what meaning and other content data; and/or user assisted identification with confirmation/correction by the user (further described below with the use of assisted generation, which is applicable also to generation of rich bookmarks).

Example HTML document of Web page

Item1.html

<html>
<head></head>
<body>
<img src="company_logo.gif" class="image" />
<div class="product">
<h1>Product title</h1>
<h2>Product Manufacturer</h2>
<img src="product_image.gif" />
<br>
List Price: <strong> $99.99 </strong>
<br />
Our Price: <strong> $79.99 </strong>
<br />
<p>
This is a Description for Product Title Made by Product Manufacturer
</p>
</div>
</body>
</html>

Item2.html

<html>
<head></head>
<body>
<img src="company_logo.gif" class="image" />
<p>
disclaimer
</p>
<div class="product">
<h1>Sample title</h1>
<h2>Sample Manufacturer</h2>
<img src="sample_image.gif" />
<br>
List Price: <strong> $109.33 </strong>
<br />
Our Price: <strong> $99.99 </strong>
<br />
<p>
This is a Description for Sample Title Made by Sample Manufacturer
</p>
</div>
</body>
</html>

Assumptions: The required fields are identified prior to this process either by the user or using an automated tool (such as the schema desktop tool). They can be as follows:

Item1

Image: product_image.gif
Title: Product title
Price: $79.99
List Price: $99.99
Description: This is a Description for Product Title Made by Product Manufacturer

Item2

Image: sample_image.gif
Title: Sample title
Price: $99.99
List Price: $109.33
Description: This is a Description for Sample Title Made by Sample Manufacturer

It is recognized that different modules of the automated generation process can implement the following steps (embodied as executable instructions 207D—see FIG. 21).

Step 1—Identify the Image

From Item1, the object <img src="product_image.gif"/> is selected. The process identifies src as an attribute and scans the source of Item1 for src="product_image.gif". It finds no other match, so it then scans Item2. If a match were found, and the matching object contained the image identified for Item2, the attribute would be used to create a signature file image property. However, the item is not found in Item2, so no match has been made. Next, the process looks at “<img” within Item1. It determines that the selected object is the second match. When looking at Item2, the second image also provides the object that contains the image. Now that we have the matching object, we apply a similar heuristic to locate the result from within the object. If the object is a text node, the process is complete. Otherwise, the start and end of the object need to be located. Using pattern recognition techniques, we find that ‘src="’ starts the string and that ‘"’ ends the string. Therefore the following entry would be added to the signature file: <lookup type="pex" action="get_string" name="image" ref="<img" repeat_ref="1" start=" src=&quot;" end="&quot;"/>

Step 2—Identify the Title

From Item1, the object <h1>Product title</h1> is selected. The process identifies that it is a text node, and uses its parent to identify uniqueness. There are no attributes for the parent <h1>. Next, the process looks at “<h1” within Item1. It determines that it is the only match. When looking at Item2, there is only one match, and the matching element contains the title. Now that we have the matching object, we apply a similar heuristic to locate the result from within the object. Since the object is a text node, the process is complete. Therefore the following entry would be added to the signature file: <lookup type="pex" action="get_string" name="title" ref="<h1" start=">" end="<"/>

Step 3—Identify the Price

From Item1, the object <strong> $79.99 </strong> is selected. There are no attributes to be checked for this element. Next, the process looks at “<strong” within Item1. It determines that it is the second match. When looking at Item2, the second strong tag also provides the object that contains the price. Since the object is a text node, the process is complete. Therefore a corresponding entry would be added to the signature file.

Step 4—Identify the List Price

From Item1, the object <strong> $99.99 </strong> is selected. There are no attributes to be checked for this element. Next, the process looks at “<strong” within Item1. It determines that it is the first match. When looking at Item2, the first strong tag also provides the object that contains the list price. Since the object is a text node, the process is complete. Therefore a corresponding entry would be added to the signature file.

Step 5—Identify the Description

From Item1, the object <p>This is a Description for Product Title Made by Product Manufacturer</p> is selected. There are no attributes to be checked for this element. Next, the process looks at “<p” within Item1. It determines that it is the first match. When looking at Item2, the first p tag does not provide the object that contains the description. The parent object <div class="product"> is selected next. The process identifies the attribute class="product", scans Item1, and determines that it is the only match. The <p tag is processed again, limiting its search to the parent. The <p tag is identified as the first instance within the parent. Next, the same process is performed on Item2. First the attribute class="product" is located. The first <p tag that is a child of the object containing class="product" is found. The <p object also contains the description. Since the object is a text node, the process is complete. Therefore a corresponding entry would be added to the signature file.
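The occurrence matching used in Steps 3 to 5 above can be sketched as follows, using shortened stand-ins for Item1 and Item2. The function names are hypothetical, the occurrence index is zero-based, and limiting the scope by searching from the position of a marker string is a simplification of limiting the search to the parent object.

```python
import re

def nth_contents(html, tag, scope=None):
    """Return the text content of each <tag> occurrence, optionally after a scope marker."""
    if scope is not None:
        start = html.find(scope)
        html = html[start:] if start != -1 else ""
    pattern = rf"<{tag}\b[^>]*>(.*?)</{tag}>"
    return [m.strip() for m in re.findall(pattern, html, flags=re.S)]

def common_index(pages, expected, tag, scope=None):
    """Find the occurrence index whose content matches the expected value on every page."""
    per_page = [nth_contents(p, tag, scope) for p in pages]
    for i in range(min(len(c) for c in per_page)):
        if all(c[i] == e for c, e in zip(per_page, expected)):
            return i
    return None

ITEM1 = ('<body><div class="product"><strong> $99.99 </strong>'
         '<strong> $79.99 </strong><p>Description one</p></div></body>')
ITEM2 = ('<body><p>disclaimer</p><div class="product"><strong> $109.33 </strong>'
         '<strong> $99.99 </strong><p>Description two</p></div></body>')
pages = [ITEM1, ITEM2]

price_index = common_index(pages, ["$79.99", "$99.99"], "strong")
unscoped = common_index(pages, ["Description one", "Description two"], "p")
scoped = common_index(pages, ["Description one", "Description two"], "p",
                      scope='class="product"')
```

The unscoped search for the description fails because of the disclaimer paragraph in Item2, while the search limited to the product object succeeds, mirroring Step 5.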

Referring to FIG. 22, accordingly, in view of the above, the automated generation methodology implemented on the comparison tool (e.g. device 101D) for the signature file 70D compares two or more delimiters (pertaining to a common schema of the definition of the pages) from each of the pages 52D in order to identify common uses of the delimiters (and their contents). Once identified as a match, the corresponding object, for example, is placed in the hierarchical structure 74D (or other ordered list, etc.).

It is recognized that the hierarchy 74D can link entities 76D either directly or indirectly, and either vertically or horizontally. The only direct links in a hierarchy, insofar as they are hierarchical, can be to the entities' immediate superior or to the entities' subordinates, although a system that is largely hierarchical can also incorporate other organizational patterns. Indirect hierarchical links can extend “vertically” upwards or downwards via multiple links in the same direction. Traveling up the hierarchy to find a common direct or indirect superior, and then down again, can nevertheless “horizontally” link all parts of the hierarchy that are not vertically linked to one another. Further, the structure 74D can also be a list implemented using arrays or linked/indexed lists of some sort. The structure 74D can have certain properties associated with arrays and linked lists. A sequence can be another name for the structure 74D, emphasizing ordering of the entities 76D.

Further, it is recognized that the structure 74D would be represented in the signature file as the entries as noted above. It is recognized that a user of the device 101D could manually 78D amend or otherwise review the automatically generated signature file 64D, as desired.

User Assisted Generation of Signature Files 64D (Desktop tagging)

Described is a method of assisted recognition of web page contents that identifies and extracts specific contents from a web page, which could be applied in creating signature files. It is recognized that, in view of FIG. 21, a device 101D for implementing the assisted method of web page content identification and extraction can host a corresponding tool with associated modules. The recognized content could be used to provide the required fields in the signature files or could be used to create the rich bookmarks.

The web page contents may be navigational items, lists, specific items from a list, and other content, for example. The reason that this is useful is that signature files can be manually created, which can be time consuming, and subject to human error. Therefore by helping to automate the recognition of web page contents, the turn around time for interpreting a website as a database through the gateway 22DD can be substantially faster and more accurate.

The following is an embodiment of the process of assisted capturing of web page contents, such as but not limited to the image, title, description, and price of a product page as shown in FIG. 40. The result of the process is shown in FIG. 41. The process is performed on the client side, with a call to the server (e.g. gateway 22DD and/or web site 20D). In one embodiment, the call consists of requesting a javascript. The javascript is generated dynamically on the server side. The dynamic part of the script can perform a number of functions. A first function consists of checking the user's cookies for a username and password, so that the user is not prompted with a login upon saving the item. A second function uses a referrer site to load confidence intervals that have been generated on previously saved items from the same site.

The javascript can have no other knowledge of a web-page, other than confidence intervals to determine the specific fields (image, title, description, and price) of a product, for example. The confidence intervals, further explained below, contain the location on the page (width and height) of each field, and other properties (stated below) that are used to guess a field (i.e. what is the significance/meaning of the field with respect to the content/navigation items contained on the web page). Therefore, confidence levels can be set on a per site basis, but the process used to derive the fields can be the same for every site. This can be done because most ecommerce web sites display products in a similar fashion (e.g. the title is bold, the image is near the middle and large, the description has the most text and is black, and the price is highlighted and, when rendered, is within close proximity to the image). Any differences between web sites can be accommodated based on the assisted (e.g. user) nature of the capturing of web page contents. For example, after the initial guess by the javascript, incorrect matches can be altered by the user clicking on the field that was matched incorrectly, and then locating the correct match on the page, and clicking on that. Once the item is submitted, the confidence intervals are updated based on the fields submitted.

Accordingly, referring to FIG. 24, the user device 101D first connects to the gateway 22DD and then requests the desired web page 60D (see FIG. 40) that is obtained from the web site 20D. Upon receipt of the javascript 95D (or other executable instructions for facilitating the capture of the web page contents), predefined criteria 96D are used to search the rendered page 60D for the desired object(s) (e.g. product including image, title, description, and price). The criteria 96D are compared with the objects located on the web page and, if the objects pass the analysis, they are considered as matching candidates for final approval by the user. All candidates are then displayed 97D to the user (see FIG. 41). The user can accept 98D the candidate(s) and/or suggest alternative matches (e.g. by clicking on objects displayed on the screen, such as a different title or price for the correctly identified image and description) based on a visual inspection of the web page displayed on the screen of the device 101D. Acceptance and/or amendment of the candidates can be used to update the pertinent parts of the predefined criteria 96D for subsequent use in matching other candidates. Further, in the case of multiple candidates, the redefined criteria 96D could be used to revise the remaining candidates, before final review by the user.
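By way of a hedged illustration, the matching of candidates against the criteria 96D and the subsequent update can be sketched as follows. The sketch models only the rendered position of a field, as one of the properties mentioned above; the class names, the scoring formula and the running-average update are assumptions and are not taken from the described javascript 95D.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    x: float  # rendered position in pixels (assumed units)
    y: float

@dataclass
class FieldCriteria:
    x: float  # expected position learned from previously saved items
    y: float
    samples: int = 1

    def score(self, c: Candidate) -> float:
        """A smaller distance from the expected position gives a higher score."""
        distance = ((c.x - self.x) ** 2 + (c.y - self.y) ** 2) ** 0.5
        return 1.0 / (1.0 + distance)

    def update(self, accepted: Candidate) -> None:
        """Fold the user-accepted candidate into the running average position."""
        self.samples += 1
        self.x += (accepted.x - self.x) / self.samples
        self.y += (accepted.y - self.y) / self.samples

def best_guess(criteria: FieldCriteria, candidates):
    """Return the candidate that best matches the criteria, for display to the user."""
    return max(candidates, key=criteria.score)

price = FieldCriteria(x=300, y=200)
candidates = [Candidate("$79.99", 310, 205), Candidate("Free shipping", 50, 600)]
guess = best_guess(price, candidates)

# If the user instead selects an object elsewhere on the page, the criteria shift towards it.
price.update(Candidate("$79.99", 400, 300))
```

Each accepted or corrected field moves the stored expectation for that site, so later guesses on pages of the same site start from the corrected position.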

It is recognized that the assisted recognition of web page contents could also be used to locate any navigational items that are related to the web page content (e.g. a buy button located adjacent to a product, a bid now button located next to an auction item, etc.).

Further, this method of web page recognition can be tuned to capture the key information on a webpage for different genres of sites, for example e-commerce websites, news sites, sports sites, etc. The method can capture the product image, title, price and description from a page and then post the information with the URL of the webpage to a server to store the information for the user for later retrieval and use, e.g. as a rich bookmark. This allows the user to store rich bookmarks that contain more than just the URL of the website. Examples of lists of rich bookmarks 99D are shown in FIGS. 42 and 43, which show the bookmarks 99D in the context of URL links accessible from a browser menu.

Example Operation

Field Attributes

Image:

Title:

Description:

Price:

Example

Site: http://www.bestbuy.com

Link: http://www.bestbuy.com/site/olspage.jsp?skuld=7731564&type=product&productCategoryId=pcmcat95100050005&id=140392418573

Source: web page of FIG. 40.

Referring to FIG. 42, the following steps can be implemented in user assisted capture of web page contents, namely:

1) User navigates to item page

2) User clicks FatFreeMobile (activation of desire to connect to gateway 22DD)—Save

3) A request is made to fatfreemobile.com (i.e. the gateway 22DD) for the product javascript 95D

4) The FatFree server receives the request

a) The server checks to see if the user is already logged in; if the user is not logged in, the server checks for cookies with the user credentials

b) The server extracts the requesting site from the referrer section of the http request

c) The server attempts to retrieve the confidence intervals for the site (based on predefined identification criteria 96D).

d) The server dynamically creates the javascript based on the information from steps (a) and (c).

e) The server returns the javascript to the client

5) The client receives the javascript, which initiates variables required to start the engine, and then launches the engine. Code snippet: watPM.watStart(window);

6) The function watPM.watStart(window) performs the following tasks (e.g. based on the identification criteria 96D)

a) Initializes the objects variables

b) Locates the largest rendered frame

c) From the largest frame, all <head> and <body> tags are extracted. Code snippet: getElementsByTagName(‘body’);

d) The remaining tags (i.e. <a>, <td>) are then extracted. Code snippet: getElementsByTagName(‘body’);

e) A style sheet from FatFreeMobile is then injected into the head of the document

f) Special characters such as &nbsp; and &quot; are replaced with their respective rendered characters, i.e. &quot; becomes ”

g) The gui for FatFreeMobile is injected into the body, as the first element

i. API call document.element.insertBefore(new_element);

h) Step 0 is then called setTimeout(“top.watPM.watStage(0)”, 20);

7) The function setTimeout(“top.watPM.watStage(0)”, 20); performs the following tasks by calling watScriptX( )

a) All script tags that are embedded within the page are removed

i. API call document.removeElement(element);

b) Step 1 is then called setTimeout(“top.watPM.watStage(1)”, 10);

8) The function setTimeout(“top.watPM.watStage(1)”, 10); performs the following tasks by calling watParselt(0). This function looks at all of the tags. However, it only processes 1000 at a time, for example, to help avoid the warning message “The javascript is not responding” that a browser may otherwise prompt with. For each tag, the function performs the following (e.g. based on the identification criteria 96D)

a) Extract the tag name (i.e. <A> <BR> <TABLE>)

b) Ensure the current tag is visible. If the tag is not visible (i.e. one of the following styles implies that it is hidden: visibility=hidden or display=none), the tag is ignored.

c) The position of the tag (absolute, relative, etc.) is extracted from its style property

d) If the tag is one of the following it is ignored (‘LINK’, ‘STYLE’, ‘HEAD’, ‘TITLE’)

i. For example <title>Hewlett-Packard—42″ Plasma HDTV—PL4260N</title> is ignored

e) If the position (c) is absolute, and the x coordinate <0 and/or the y coordinate is <0 the element is ignored

i. For example <div id=“kioskMessage” style=“display:none;”> and all of its children are ignored

f) All javascript actions from the given object are cleared (i.e. object.onclick will be set to return false).

i. For example <script language=“JavaScript”>if(isKiosk){var kioskwarning=document.getElementById(“kioskMessage”);kioskwarning.style.display=“block”;strAdHeight2=kioskwarning.offsetHeight;}</script> is removed

g) If the object's tag=IMG or (tag=INPUT and type=image), the object is saved as a candidate for the product's image.

i. For example <img src=“http://images.bestbuy.com:80/BestBuy_US/images/products/7731/7731564_rc.jpg” alt=“ ” border=“0” align=“top”> the product image

ii. For example <img src=“http://images.bestbuy.com:80/BestBuy_US/images/products/7426/7426458_s.gif” alt=“7426458 Front Thumbnail” border=“0” height=“45.0” width=“54.0” align=“center”> not the correct product image, but still an image.

h) If the object's tag is in the following (‘TD’, ‘UL’, ‘P’, ‘DIV’, ‘SPAN’, ‘B’, ‘H1’, ‘H2’, ‘H3’, ‘H4’, ‘H5’, ‘H6’, ‘STRONG’, ‘FONT’, ‘BIG’) and the object's innerHTML code length is <1024 (for example), the object is stored as a possible candidate for the product's title, price, and description.

i. For example <td class=“Body-Headline” colspan=2>Hewlett-Packard42″ Plasma HDTV<br></td> the correct title

ii. <b> More Options </b> an incorrect title

iii. <td class=“Body”> Watch all of your favorite high-definition quality broadcasts on this 42″ plasma TV that features SRS . . . </td> the correct description

iv. <td class=“Body” valign=“top”> 16:9 widescreen aspect ratio delivers a cinema-style entertainment experience; 3-2 pulldown for accurate reproduction of film-based sources </td> an incorrect description

v. <div class=“priceblock”>Our Price: $1,199.99<br></div> the correct price

vi. <div class=“priceblock”>Our Price: $99.99<br></div> an incorrect price

i) Step 2 is then called setTimeout(“top.watPM.watStage(2)”, 10);
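The per-tag filtering of step 8 can be sketched as follows. This is a minimal illustration that operates on plain objects rather than live DOM nodes; the function names classifyTag and processInBatches, and the object shape (tagName, style, x, y, type, innerHTML), are assumptions made for this example and are not the names used by the engine.

```javascript
// Hypothetical sketch of the step 8 tag filter; names and object shape are assumptions.
const IGNORED_TAGS = ['LINK', 'STYLE', 'HEAD', 'TITLE'];
const TEXT_TAGS = ['TD', 'UL', 'P', 'DIV', 'SPAN', 'B', 'H1', 'H2', 'H3',
                   'H4', 'H5', 'H6', 'STRONG', 'FONT', 'BIG'];
const MAX_HTML_LENGTH = 1024; // example limit from step 8(h)

// Returns 'image', 'text', or null (ignored) for one tag description.
function classifyTag(tag) {
  const name = tag.tagName.toUpperCase();
  const style = tag.style || {};
  if (style.visibility === 'hidden' || style.display === 'none') return null; // 8(b)
  if (IGNORED_TAGS.includes(name)) return null;                               // 8(d)
  if (style.position === 'absolute' && (tag.x < 0 || tag.y < 0)) return null; // 8(e)
  if (name === 'IMG' || (name === 'INPUT' && tag.type === 'image')) return 'image'; // 8(g)
  if (TEXT_TAGS.includes(name) && (tag.innerHTML || '').length < MAX_HTML_LENGTH) {
    return 'text';                                                            // 8(h)
  }
  return null;
}

// Processes tags in batches (1000 in the example above) and yields to the
// browser between batches so that the page stays responsive.
function processInBatches(tags, batchSize, onDone, start = 0,
                          found = { image: [], text: [] }) {
  const end = Math.min(start + batchSize, tags.length);
  for (let i = start; i < end; i++) {
    const kind = classifyTag(tags[i]);
    if (kind) found[kind].push(tags[i]);
  }
  if (end < tags.length) {
    setTimeout(() => processInBatches(tags, batchSize, onDone, end, found), 10);
  } else {
    onDone(found);
  }
}
```

Yielding with setTimeout between batches is one way to keep a long scan from triggering an unresponsive-script warning; the batch size and delay are tunable.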

9) The function setTimeout(“top.watPM.watStage(2)”, 10); performs the following tasks by calling watSetTitles( ), which calls watAttrib(hcc,lcc,tcc), (e.g. based on the identification criteria 96D);

i. var hcc=[2,1]; //initial requirements

ii. var tcc=[2]; //post location requirements

iii. var lcc=this.ltitle;

a) All candidates for titles from step 8 are compared with each other. The top 5 (for example) are selected from the following:

i. First, each object's weight is assigned a numeric value based on its rendered weight. The objects' weights are then compared.

1. not defined, normal, and 400=400

2. bold, bolder and >400=700

3. <400=300

ii. Any ties are broken by the objects rendered size. The size is assigned a numeric value based on its rendered size.

1. x pixels=x

2. x pt=4/3*x

3. HN=

a. Tag=H1=2

b. Tag=H2=3/2

c. Tag=H3=9/8

d. Tag=H4=1

e. Tag=H5=13/16

f. Tag=H6=5/8

g. Tag=ELSE=1

4. x %=x*(16/100)*HN

5. x em=x*16*HN

6. xx-small=10

7. x-small=12

8. small=16

9. medium=18

10. large=24

11. x-large=32

12. xx-large=48

13. 1 or −2=10

14. 2 or −1=13

15. 3=16

16. 4 or +1=19

17. 5 or +2=24

18. 6=32

19. 7=48

20. ELSE=12

b) The candidates are then arranged in order based on their distance from the center of the page. The closest to the center would be the first choice, and so on. The center of the page is defined by the confidence intervals.

c) Finally the winning candidate is selected by comparing the confidence interval of the most common winner, the confidence interval of the location, and the weight of each object.

d) For example, comparing the correct title and the incorrect title above: both would evaluate to a weight=700. The size of the correct item is larger, so it would be ranked ahead. Next the locality of each object would be compared. Since the correct title is closer to the center, it would remain ranked higher. The items would then be re-ranked based on their weight. Since their weights are equal, the winner is the correct title.

e) Step 3 is then called setTimeout(“top.watPM.watStage(3)”, 10);
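The weight and size mappings of step 9(a) can be sketched as two small normalization helpers plus a comparison. The function names, the candidate object shape (weight, size, tagName) and the handling of unrecognized weight keywords are assumptions for illustration; the numeric values are those listed above.

```javascript
// Hypothetical helpers; the numeric mappings follow the lists in step 9(a).
function normalizeWeight(weight) {
  if (weight === undefined || weight === null || weight === '' || weight === 'normal') return 400;
  if (weight === 'bold' || weight === 'bolder') return 700;
  const n = Number(weight);
  if (Number.isNaN(n)) return 400; // assumption: unknown keywords count as normal
  if (n > 400) return 700;
  if (n < 400) return 300;
  return 400;
}

const HEADING_FACTOR = { H1: 2, H2: 3 / 2, H3: 9 / 8, H4: 1, H5: 13 / 16, H6: 5 / 8 };
const KEYWORD_SIZE = new Map([['xx-small', 10], ['x-small', 12], ['small', 16],
  ['medium', 18], ['large', 24], ['x-large', 32], ['xx-large', 48]]);
const FONT_TAG_SIZE = new Map([['1', 10], ['-2', 10], ['2', 13], ['-1', 13], ['3', 16],
  ['4', 19], ['+1', 19], ['5', 24], ['+2', 24], ['6', 32], ['7', 48]]);

function normalizeSize(size, tagName) {
  const hn = HEADING_FACTOR[(tagName || '').toUpperCase()] || 1; // HN, ELSE=1
  const s = String(size).trim().toLowerCase();
  let m;
  if ((m = /^([\d.]+)px$/.exec(s))) return Number(m[1]);
  if ((m = /^([\d.]+)pt$/.exec(s))) return (4 * Number(m[1])) / 3;
  if ((m = /^([\d.]+)%$/.exec(s))) return Number(m[1]) * (16 / 100) * hn;
  if ((m = /^([\d.]+)em$/.exec(s))) return Number(m[1]) * 16 * hn;
  if (KEYWORD_SIZE.has(s)) return KEYWORD_SIZE.get(s);
  if (FONT_TAG_SIZE.has(s)) return FONT_TAG_SIZE.get(s);
  return 12; // ELSE
}

// Orders candidates by weight, breaking ties by size, and keeps the top few.
function topTitleCandidates(candidates, limit = 5) {
  return candidates
    .map(c => ({ ...c, weightValue: normalizeWeight(c.weight),
                 sizeValue: normalizeSize(c.size, c.tagName) }))
    .sort((a, b) => (b.weightValue - a.weightValue) || (b.sizeValue - a.sizeValue))
    .slice(0, limit);
}
```

The location ordering of step 9(b) and the confidence-interval comparison of step 9(c) are not shown, since they depend on per-site data.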

10) The function setTimeout(“top.watPM.watStage(3)”, 10); performs the following tasks by calling watSetDescription( ), which calls watAttrib(hcc,lcc,tcc), (e.g. based on the identification criteria 96D);

i. var hcc=[5,−1]; //initial requirements

ii. var tcc=[ ]; //post location requirements

iii. var lcc=this.ldesc;

a) All candidates for descriptions from step 8 are compared with each other. The top 5 (for example) are selected from the following:

i. First, the length of the object's innerHTML (the length of the source html code the object contains) is compared. The longer the length, the more likely it is a description.

ii. Second, the weight of the object is compared. A detailed explanation was provided in step (9). The −1 signifies that a candidate's weight counts as a negative attribute. Therefore, text that is not bold/italic etc. is more likely to be a description.

b) The candidates are then arranged in order based on their distance from the center of the page. The closest to the center would be the first choice, and so on. The center of the page is defined by the confidence intervals.

c) Finally the winning candidate is selected by comparing the confidence interval of the most common winner and the confidence interval of the location.

d) For example, comparing the correct description and the incorrect description above: the length of the correct item is larger, so it would be ranked ahead. Next the locality of each object would be compared. Since the correct description is closer to the center, it would remain ranked higher. The items would then be re-ranked based on their weight, where a stronger weight counts against the item. Since their weights are equal, the winner is the correct description.

e) Step 4 is then called setTimeout(“top.watPM.watStage(4)”, 10);

11) The function setTimeout(“top.watPM.watStage(4)”, 10); performs the following tasks by calling watSetPrice ( ), which calls watAttrib(hcc,lcc,tcc), (e.g. based on the identification criteria 96D);

i. var hcc=[6,9,8,2,1]; //initial requirements

ii. var tcc=[6,9]; //post location requirements

iii. var lcc=this.ldesc;

f) All candidates for prices from step 8 are compared with each other. The top 5 (could change later) are selected from the following:

iii. First, the object's text is searched for a dollar sign ($). Objects that have a dollar sign will be ranked higher.

iv. Second, the object's text is cast to a decimal. If the cast is successful, i.e. the text is a number, the element is ranked higher.

v. Third, the object's text is scanned to determine if any numbers exist. If a number is found, the object is ranked higher.

vi. Fourth, the objects' weights are compared. Objects that are bold/italic will rank higher.

vii. Fifth, the objects' sizes are compared. The larger the font of the price, the more likely it is the product's price.

g) The candidates are then arranged in order based on their distance from the center of the page. The closest to the center would be the first choice, and so on. The center of the page is defined by the confidence intervals.

h) Finally the winning candidate is selected by comparing the confidence interval of the most common winner, the confidence interval of the location, whether or not a $ sign exists, and whether the text is numeric.

i) For example, comparing the correct price and the incorrect price above: both would evaluate to true when searching for a dollar sign. Neither item is a decimal, as they both contain text. Both would evaluate to true when searched for numbers. Both weights would evaluate to 700. Finally, the sizes of both items are equal. So the items are essentially tied and, since html is a top-down language, the first item is ranked higher, which in our case is the incorrect item. Next the locality of each object would be compared. Since the correct price is closer to the center, it would now be ranked higher. The items would then be re-ranked based on the dollar sign and decimal tests. Since both items evaluate to be equal, the winner is the correct price.

j) Step 5 is then called setTimeout(“top.watPM.watStage(5)”, 10);
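The five price tests of step 11 can be sketched as an ordered feature comparison. Treating the tests as a strict priority order is an assumption made for this illustration, as are the function names and the candidate shape (text, weightValue, sizeValue, with the last two taken to be already normalized as described in step 9).

```javascript
// Hypothetical sketch; the priority ordering of the tests is an assumption.
function priceFeatures(candidate) {
  const text = candidate.text.trim();
  return [
    text.includes('$') ? 1 : 0,                               // dollar sign present
    text !== '' && Number.isFinite(Number(text)) ? 1 : 0,     // whole text casts to a number
    /\d/.test(text) ? 1 : 0,                                  // contains at least one digit
    candidate.weightValue,                                    // heavier text ranks higher
    candidate.sizeValue,                                      // larger text ranks higher
  ];
}

function comparePriceCandidates(a, b) {
  const fa = priceFeatures(a);
  const fb = priceFeatures(b);
  for (let i = 0; i < fa.length; i++) {
    if (fa[i] !== fb[i]) return fb[i] - fa[i];
  }
  return 0; // tie: a stable sort keeps document (top-down) order
}

function rankPriceCandidates(candidates, limit = 5) {
  return [...candidates].sort(comparePriceCandidates).slice(0, limit);
}
```

With the two price blocks from step 8 given equal weight and size, this comparison reports a tie and leaves them in document order, which matches the worked example in (i); the location ordering is what then separates them.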

12) The function setTimeout(“top.watPM.watStage(5)”, 10); performs the following tasks by calling watSetGraphics ( ), which calls watAttrib(hcc,lcc,tcc), (e.g. based on the identification criteria 96D);

a) All candidates for images from step 8 are compared with each other. The top 5 (could change later) are selected from the following:

i. First find the rendered width and height of the image.

ii. Determine the distance from the center of the page.

iii. Compare objects by taking, for each, its area−distance to the center. The object that results in the larger number is more likely to be the image.

iv. For example, comparing the correct image and the incorrect image above: the area of the correct image is visibly larger than that of the incorrect image, and the correct image is also visibly closer to the center. Denoting the correct image CA and the incorrect image IA, this gives: area of CA−distance to center of CA>area of IA−distance to center of IA. Hence the correct image is chosen.

b) Step 6 is then called setTimeout(“top.watPM.watStage(6)”, 10);
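The image comparison of step 12 (area minus distance to the center) can be sketched as below. Measuring the distance from the midpoint of the image, using a straight-line pixel distance, and the names imageScore and pickProductImage are assumptions for this example.

```javascript
// Hypothetical sketch of the "area - distance to center" comparison.
function imageScore(img, center) {
  const area = img.width * img.height;
  const midX = img.x + img.width / 2;   // assumption: distance is taken from
  const midY = img.y + img.height / 2;  // the midpoint of the rendered image
  const distance = Math.hypot(midX - center.x, midY - center.y);
  return area - distance;
}

// Returns the candidate with the highest score, or null for an empty list.
function pickProductImage(images, center) {
  let best = null;
  for (const img of images) {
    if (best === null || imageScore(img, center) > imageScore(best, center)) {
      best = img;
    }
  }
  return best;
}
```

Because area grows with the square of the image dimensions while distance grows linearly, a large image will usually dominate this score; the distance term mainly separates images of similar size.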

13) The function wataddItem takes the guesses for image, title, description, and price and displays them to the user, as shown in FIG. 41. The user now has the ability to change a selection by first clicking on the field that was guessed incorrectly, which will be highlighted in yellow. The user then locates the correct item on the page; when the correct item is highlighted in yellow, clicking on that item will update the guess.

14) The user clicks Save

15) A form is posted to FatFreeMobile with the product's image, price, title, and description. In addition, for each field, the x,y location of the field and the guess number are sent to FatFreeMobile

16) The server receives the request and updates the database accordingly. The server also downloads the selected image, to help avoid hot linking when displaying products.
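Steps 6 to 12 above chain their stages through setTimeout so that control returns to the browser between stages. A minimal sketch of that chaining pattern is given below; the function runStages and its injectable schedule parameter are assumptions for illustration, and the 20 and 10 millisecond delays are the example values used in the steps.

```javascript
// Hypothetical sketch of the staged execution used in steps 6 to 12.
// `schedule` defaults to setTimeout; it is injectable so the chain can be
// exercised without timers.
function runStages(stages, schedule = (fn, ms) => setTimeout(fn, ms)) {
  function runStage(index) {
    if (index >= stages.length) return;          // all stages done
    stages[index]();                             // e.g. remove scripts, parse tags, ...
    schedule(() => runStage(index + 1), 10);     // later stages use a 10 ms delay
  }
  schedule(() => runStage(0), 20);               // stage 0 is started after 20 ms
}
```

Each stage runs to completion before the next is scheduled, so the order of stages is preserved even though the browser can repaint and handle input in between.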

It is recognized that the above assisted capture method can be used as a method to have one or more distributed users help or otherwise be employed to create portions of a signature file for a web site. For example, a number of users could be assigned different pages from a web site in order to assemble a corresponding signature file for the complete web site, as desired.

Schema Communication Flow

The following description provides an example operation of the interaction between the gateway 22DD, the mobile 24D and desktops 26D, and the web pages 60D obtained from the website 20D, based on the requests for content/navigation from the mobile 24D and desktops 26D (see FIG. 20).

Referring to FIG. 45 and FIG. 20, the following steps can be performed:

1. A client makes a request to the Schema Engine 23D, acting as a proxy 20D, for a specific webpage 60D from a specific domain (e.g. web site 20D);

2. The engine receives the request and makes a request to the web site 20D for the specified page and retrieves the web page code into memory. This may not include objects on the page such as pictures that are inserted at the time of rendering;

3. The engine in parallel makes a request to the signature repository to acquire the signature file 64D for the domain using the domain in the URL as the key to retrieve the signature file, for example;

4. The engine does not render the page but instead uses the code in the signature file as instructions to extract the desired data from the web page, such that the desired data is defined for a particular request type received by the gateway from the mobile 24D/desktop 26D and/or for a predefined mobile 24D/desktop 26D platform (e.g. having knowledge of device display capabilities such as screen size, resolution, and other parameters useful in determining the way in which the data is capable of being displayed on the device 101D);

5. The data can optionally be stored in a local data repository;

6. The engine transmits the data to the client that requested the page; and

7. The client could be a browser application that displays the data or could be an application that renders the data (e.g. see navigational menu 300D example described below).
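The seven steps above can be sketched as a single request handler. This is a simplified, sequential illustration: the function names handlePageRequest and domainOf, and the injected fetchPage, loadSignature and extract callbacks, are hypothetical stand-ins for the engine's own components, and steps 2 and 3 are shown one after the other although the text describes them as running in parallel.

```javascript
// Hypothetical sketch of the proxy flow; all names are assumptions.
function domainOf(url) {
  return new URL(url).hostname; // the domain in the URL is the signature key
}

function handlePageRequest(url, deps, cache = new Map()) {
  const html = deps.fetchPage(url);                    // step 2: page code, not rendered
  const signature = deps.loadSignature(domainOf(url)); // step 3: signature file for domain
  const data = deps.extract(html, signature);          // step 4: extract via signature
  cache.set(url, data);                                // step 5: optional local storage
  return data;                                         // step 6: returned to the client
}
```

Passing the three operations in as callbacks keeps the sketch independent of any particular HTTP client or signature repository.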

A detailed explanation of how the Schema Engine understands the signature file syntax and processes a webpage, as proxied between the mobile/desktop and the web site, is provided further below.

It is noted that an example embodiment of the engine 23D and the signature file 64D used to interpret the web page 60D and subsequently send revised/reformatted web page content/navigation data to the screen with limited real estate requirements (e.g. mobile) is provided in Appendix A.

Referring to FIGS. 20 and 23, shown is an example operation of steps implemented to obtain and satisfy a web page request from the user device 101D (e.g. mobile) by the gateway 22DD. Step 200—the client makes a request for the ABC ComTech Corp. page shown in FIG. 26 and after steps 1, 2 and 3 above take place, the Schema Engine has the ABC ComTech Corp. webpage code and signature file loaded. The remaining steps are a detailed description of the above step 4, where the engine does not render the page but instead uses the code in the signature file as instructions to extract the desired data from the web page. Accordingly, step 201—Schema Engine confirms that input HTML is from ABC ComTech Corp.ca and this signature file is that of ABC ComTech Corp.ca, step 202—Schema Engine sets a global variable to append “&test%5Fcookie=1” to all requests and sets main index to http://www.ABC ComTech Corp.ca/home.asp?newlang=EN&logon=&langid=EN, step 203—Schema Engine then tries to determine the page type by checking existence of string identifiers for each page family, step 204—Schema Engine then jumps to the “item_elements” section of the signature file that contains instructions for extracting the object elements for the page, step 205—Schema Engine trims HTML scope, step 206—Schema Engine extracts image, step 207—Schema Engine extracts title, step 208—Schema Engine extracts price, step 209—Schema Engine extracts sale price, step 210—Schema Engine extracts description, and step 211—Schema Engine assembles and returns all extracted data for display on the mobile, dependent on the format as predefined suitable for the mobile display capabilities (e.g. screen size, resolution, etc.). At step 212, the engine transmits the data to the client that requested the page (as per step 6 above).
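Steps 203 to 210 can be illustrated with a simplified stand-in for the signature file. The actual signature file syntax is the one given in Appendix A; the object layout used below (a list of page families with string identifiers, and an item_elements map of start/end markers) and the function names are purely hypothetical and are chosen only to show the idea of extracting fields from page code without rendering it.

```javascript
// Hypothetical signature layout and helpers; not the Appendix A syntax.
function detectPageType(html, pageFamilies) {
  // Step 203: a family matches when all of its string identifiers are present.
  for (const family of pageFamilies) {
    if (family.identifiers.every(id => html.includes(id))) return family.type;
  }
  return null;
}

function extractBetween(html, start, end) {
  const from = html.indexOf(start);
  if (from === -1) return null;
  const contentStart = from + start.length;
  const to = html.indexOf(end, contentStart);
  return to === -1 ? null : html.slice(contentStart, to).trim();
}

function extractItem(html, itemElements) {
  // Steps 206-210: one extraction rule per field (image, title, price, ...).
  const result = {};
  for (const [field, rule] of Object.entries(itemElements)) {
    result[field] = extractBetween(html, rule.start, rule.end);
  }
  return result;
}
```

A field whose markers are not found is returned as null, so the caller can decide whether to omit it or fall back to another rule.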

It is recognized that the above described steps 206-210 are an example of the extraction of web page content/navigation items that are obtained by the engine from the web page 60D, using the signature file as a guide for the extraction. It is recognized that the engine can also have a series of formatting rules, not shown, for use with the extracted data in generating a page with the extracted data that is suitable for display on the target device 101D (e.g. desktop, mobile). It is recognized that the formatting rules can be system and/or user defined and can include parameters such as but not limited to: object positioning, object colour, object size, object shape, object font/image characteristics, background style, and navigational item display (e.g. in menu 300D or embedded along with the content in the generated page for display on the target device).

User Interface Optimization by Separating Web Page Content and Navigation

Schema Solution—Gateway 22DD

Although a Schema Engine 23D of the gateway can automatically determine whether to send back menus or content for a given web page, a Schema client (e.g. mobile 24D and/or desktop 26D) has the ability to explicitly request either the navigation menu or the content for a page and the Schema Engine provides each output accordingly. On each screen of the user device 101D (e.g. mobile or desktop) the user either sees the navigation menus or the page content for the respective page. This method is accomplished using the Schema Engine and signature file, further described below. FIG. 26 shows the output the client receives from the Schema Engine when it extracts the navigation from a typical web page (that has both navigation and content). The Schema Engine takes in the ABC ComTech Corp. page marked “1” and outputs the page marked “2” to the Schema client. FIG. 27 shows the output the client receives from the Schema Engine when it extracts the content (list items) from a typical webpage. The Schema Engine takes in the ABC ComTech Corp. page marked “3” and outputs the page marked “4” to the Schema client.

This demonstrates how the navigation and content from web pages can effectively be separated and transmitted by the Schema Engine to the schema client.

Packaging Page Content and Navigation Menu Data into a Mobile Application

Page Content

From the above example it is apparent that the Schema Engine is able to output navigation and content data independently as the result of a given web page input. Packaging content into a mobile application (e.g. application hosted by the mobile device or desktop device) entails rendering the data output of the Schema Engine in a client mobile application instead of the web browser. In the current embodiment, as an example, a web browser makes a request for a page and receives the content data as the response that it renders. The mobile application can similarly make a request for the page, as the browser could, and render the data received from the Schema Engine. See FIG. 28 as an example of the rendering of a mobile client application.

To fully package a website into an application, the navigation functions of the website (browsing) and special features (buy item, check availability, and other buttons/links) can be inserted into a menu 300D of the application, see FIG. 29. One advantage of doing so is that the user can invoke the application menu for navigation rather than loading a new page or screen refresh to view the navigational items displayed in the web page. In another embodiment, navigation options can also be inserted in pop-up menus and dialogue boxes of the applications using a similar approach as described below. It is recognized that the gateway can be used as a data flowthrough mechanism, with respect to data on navigational items extracted from the webpage 60D. Accordingly, the navigational item data would be tagged by the gateway for subsequent interpretation by the client application (e.g. mobile), such that the client application would recognize the navigational item data for insertion into the navigation menus 300D, insertion as embedded into the published content displayed on the screen, or a combination thereof. It would be up to the mobile application, for example, to determine the manner in which the navigational item data is to be used in conjunction with the published content data for any respective web page, as desired.

The pages marked “2” and “4” in FIGS. 26 and 27 respectively show how the navigational features and content features are divided into 2 separate pages/files. A user has to switch pages to toggle between navigation and content.

FIG. 28 shows an example client application's content area. Note that the navigational features of the web page can be selected through the menu 300D that is linked to the actual navigational instructions of the respective Web page(s). The menu 300D is invoked in a single click (FIG. 23) and has the website navigation options displayed to the user as an application menu 300D rather than as navigational features displayed on the web page.

Navigation Menus 300D

A menu item can be statically created at compile time and its function is known at compile time for web pages. The Schema Engine can dynamically create menu items at run time. Assume that a navigational item is meant to be processed on a client that is capable of inserting menu items dynamically. If the mobile application was passed the extracted navigational information from the engine, the application would insert the items into the application via MenuItem(name, URL), for example. In this case, it is the engine that would pass the data (name, URL) that indicates the data as a potential menu name and corresponding URL as parameters. The application would insert the data as a menu item into the application menu 300D, such that these parameters would then be linked to the corresponding respective menu item selection. The method described above dynamically inserts navigational items of the web page into the application menu of the application used to interact with the web site contents. The implementation of this method can differ depending on the application and platform of the device 101D. It should be recognized that the menu items are related to navigation of the content in the web pages rather than only between the web pages themselves.

The following steps outline an example process for dynamically inserting menu items into a mobile application menu:

1. The client application makes a request for the navigation items of a web page

2. The Schema Engine receives the web page (marked “1” in FIG. 26)

3. The Schema Engine extracts the navigational items and sends the data set (menu name, URL) to the client. Table 1 shows the output for the first 5 navigational items from the web page of FIG. 26.

4. The client receives the navigational items and calls a createMenuItem( ) method with each [menu name, URL] set received, thus displaying the navigational items as menu items. It is recognized that the menu items can be displayed overtop of the content displayed on the screen of the device 101D, where the navigational items are no longer displayed adjacent to the content (as formatted in the original web page) and rather assembled/combined and displayed in a separate navigational menu for navigating the content of the web site.

At this point, the navigation items for the page are loaded into the application menu 300D. A user can click a menu item, which will result in the application invoking the URL associated with the menu name and thus facilitating the display of the web site content associated with the menu item (representing the original navigational item). For example, the menu item can be used to invoke one of the navigational items of the web site (e.g. “buy item”), rather than just navigate between pages.
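The dynamic menu insertion described above can be sketched as follows. The class name ApplicationMenu, its methods other than createMenuItem, and the navigate callback are assumptions for illustration; a real client would use the menu API of its own platform.

```javascript
// Hypothetical sketch of an application menu populated at run time.
class ApplicationMenu {
  constructor(navigate) {
    this.items = [];          // current menu items for the displayed page
    this.navigate = navigate; // callback that requests a URL through the engine
  }

  // Adds one [menu name, URL] pair received from the engine.
  createMenuItem(name, url) {
    this.items.push({ name, url });
  }

  // Rebuilds the menu for a newly loaded page.
  replaceItems(navItems) {
    this.items = [];
    for (const [name, url] of navItems) this.createMenuItem(name, url);
  }

  // Invokes the URL linked to the selected menu name.
  select(name) {
    const item = this.items.find(i => i.name === name);
    if (!item) return false;
    this.navigate(item.url);
    return true;
  }
}
```

Rebuilding the item list on every page load reflects the point that each menu depends on the navigational items of the page currently displayed.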

Using this method, both content and navigation features can be simultaneously retrieved for a given page. For example, in the diagram, if the user selects the navigational name “computers”, the URL page request will be sent to the Schema Engine, which will respond to the client with the content for that page as well as the navigation items (in menu format, for example) for that page.

The content is rendered in the application as previously described and the web page navigational items are inserted into the application menu as described above. Accordingly, the contents of the navigational menu 300D for any particular web page are dependent upon the navigational items that are contained or are otherwise associated with that web page as configured by the web site. In this case, both traditional content 50D and the navigational features 54D can be treated as components for each of the web pages. Hence, the web pages of the web site (through use of the signature file described below) can be represented as having web page contents that include both the content 50D and the navigational items 54D. In this sense, each menu 300D for each page is dynamically created based on the navigational items resident/associated with that page (and page content 50D).

Further, it is recognized that some navigational items 54D can remain on the web page as displayed (e.g. embedded with the displayed content 50D), can be represented as separate menu 300D items, or a combination thereof.

Maintaining a Transactional Session Across Devices 101D

Referring to FIG. 25, in general the state of a user's browsing session can have a number of different perspectives, such as but not limited to: 1) a browser/application 207D perspective including navigation history 82D across the current browsing session, such that the browser/application 207D can keep track of the address (e.g. URL) of pages previously visited (back and forward buttons) as well as the current page the user is on. The current page can also be captured through a bookmark, as is known in the art; and 2) a web site perspective such that session information relating to the specific website can include website state information 80D like a session ID, site preferences, and shopping cart items, for example. This website state information (particular to the user's interaction with the web site—e.g. tracked through the user's ID such as login ID or device 101D ID) can either be stored on the client (user's machine 101D) in a browsing history file (e.g. a cookie) or on the server device 101D associated with a user's session ID (e.g. website 20D and/or the gateway 22DD). In the case of the gateway 22DD, a table/memory 92D can be used to store or otherwise monitor the history 82D and/or the state information 80D with respect to each user transaction that is in a pending/unfinished state (e.g. user has browsed/interacted with a number of web pages to a certain stage for product purchase, but has not yet progressed to the point of transaction completion—e.g. confirmation of payment and shipping information of a selected product).

It is noted that a cookie can be referred to as a small text file of information 80D that certain Web sites can attach to a user's hard drive (of their device 101D) while the user is browsing the Web site. The cookie can contain information such as user ID, user preferences, archived shopping cart information, etc. Since the web sites can be inherently stateless, these cookies or other session history equivalents can also be a good way to create and maintain state from a website's perspective, as implemented by the environment 10D as further described below. Further, a bookmark can be referred to as a process of saving a URL (e.g. network 11D address) in the web browser/application 207D. The bookmark 82D allows the user to return to a particular web site or web page by making a record of the corresponding network address. A bookmark however may not capture the state (data entered/requested in the process of transaction completion) of a user's browsing session; rather, the bookmark serves as a reference point for the location of the web page/web site last visited by the user. One can appreciate that a bookmark may capture only a fragment of a user's browsing session, for example only the address of the current page that the user was on.

Accordingly, saving and restoring a user's session can have one or more different components, such as but not limited to: saving and restoring the current page and navigation history 82D; and/or saving and restoring the specific website's transactional state 80D pertaining to the user (e.g. using the respective cookie for the transaction).

Saving a User's Browsing Session

Saving navigation history can be accomplished by saving the current page (saving the URL such as a bookmark would do) and optionally gathering the browser's navigation history. For example on a mobile client, all pages that a user requests can be saved on the client or on a remote server.

For example, when a client browser or application (mobile or desktop) makes an http request, a response comes back including 2 parts, an http header and the http content. One of the instructions in the http header is a “set cookie” command. A browser or client application uses that command to create and maintain the cookie on the client. When the browser or client application makes a web page request, it can pass all the cookies back to the website to maintain state. Because cookie information can be in plain text in a header, it can readily be extracted by a mobile client application. One embodiment by which the browser/application 207D can collect cookies on the desktop/mobile is to use a browser plug-in or state application 88D to retrieve cookies from the “temporary internet folder” of the device 101D, where cookies are typically stored, and transmit them to the remote server or database. Saving cookies is a way to save the user's state from the website perspective.
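Since the “set cookie” instruction travels as plain text in the http header, a client application can extract it with simple string handling. The sketch below shows one way to do so; the function names parseSetCookie and cookieHeader and the returned object shape are assumptions for illustration, and the parsing is deliberately minimal (it does not validate attribute values).

```javascript
// Hypothetical sketch: parse one Set-Cookie header value into an object.
function parseSetCookie(headerValue) {
  const [pair, ...attrs] = headerValue.split(';').map(p => p.trim());
  const eq = pair.indexOf('=');
  if (eq === -1) return null; // not a name=value pair
  const cookie = { name: pair.slice(0, eq), value: pair.slice(eq + 1), attributes: {} };
  for (const attr of attrs) {
    if (!attr) continue;
    const i = attr.indexOf('=');
    const key = (i === -1 ? attr : attr.slice(0, i)).toLowerCase();
    cookie.attributes[key] = i === -1 ? true : attr.slice(i + 1);
  }
  return cookie;
}

// Builds the Cookie request header value that passes saved cookies back.
function cookieHeader(cookies) {
  return cookies.map(c => `${c.name}=${c.value}`).join('; ');
}
```

The parsed objects can then be transmitted to the remote server or database, and later turned back into a Cookie request header when the session is resumed.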

Accordingly, the user's transaction can be saved through use of the history 82D and/or information 80D. For example, if the user wishes to save a particular transaction-in-progress, the user can notify the gateway 22DD of the intention and the gateway can save the history 82D and information 80D in the memory 92D, for later use in reactivating the particular transaction-in-progress. It is recognized that the data captured as a rich bookmark could also be used in the data 80D, 82D as desired.

Restoring a User's Browsing Session

Restoring the current page can be accomplished by the user making a request for the current page on the client application/browser (mobile or desktop). The gateway is then responsible for sending to the current device 101D (either the same device on which the transaction was last conducted or a different device) a transaction continuance package 84D that is related to the saved particular transaction-in-progress from the memory 92D, which would contain data such as but not limited to: the saved navigation history for use in populating the navigation history of the user device; and/or all saved cookies for use in restoring website state information by placing the cookies into the appropriate location that the browser or client application uses to create and manage cookies, e.g. the “temporary internet folder”.
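The save and restore steps can be sketched as follows. This is only an illustration under assumed names: `save_session`, `restore_session` and the JSON layout are hypothetical, and the specification does not state how the transaction continuance package 84D is encoded.

```python
import json

def save_session(history, cookies):
    """Serialize navigation history and cookies into one package (JSON text)."""
    return json.dumps({"history": list(history), "cookies": dict(cookies)})

def restore_session(package):
    """Rebuild the history list, current page and cookie store from a package."""
    data = json.loads(package)
    history = data["history"]
    # The current page is taken to be the last entry of the saved history.
    current_page = history[-1] if history else None
    return history, current_page, data["cookies"]

# Hypothetical session data used only for illustration.
package = save_session(
    ["http://www.example.ca/home.asp", "http://www.example.ca/cart.asp"],
    {"session_id": "abc123"},
)
history, current_page, cookies = restore_session(package)
```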

One aspect is that the application 88D could synchronize all cookies from the desktop to the mobile device or vice versa. This way, user preferences for all web sites (including remembered login ids, for example) could be always synchronized between a mobile device and the desktop. Further, it is recognized that the memory 92D could be used to remember the device on which the transaction-in-progress was last implemented and therefore to try to maintain the formatting of web pages 86D as displayed previously for the user activity with respect to the transaction-in-progress. One example of this is to keep the simplified formatting of the web pages done for the mobile display the same for display of similar pages on the desktop, even though sufficient desktop screen space is available to display the original content and format of the web pages. Similarly, for transactions started on the desktop, the continuance of the web pages on the mobile, with respect to desktop formatted webpages, could be retained (e.g. through re-organization of the pages and wrapping of content around the screen, or use of the WAP standard to spatially divide a page (usually vertically) into a number of pages and allow the user to navigate between each page section to view a page). Maintaining the look and feel of the particular web page content could be useful in keeping the user from becoming confused by format changes of the web pages. Further, for example, the user could select a certain web page format for display through the gateway (e.g. original or otherwise simplified format), in the event that the user anticipates changing devices (e.g. desktop to mobile) to continue and complete the transaction, as desired.

Another extension of the concept of saving the transaction-in-progress is that variables such as an affiliate revenue sharing code can be included in the URL. That way, the user can start browsing from a PC or mobile device and save their session based on the code. When they restore the session on another PC or mobile device, the revenue share would be received by appropriate entity based on the code usage.

Caching

There can be cache points on the engine as well as the client. The cache can consist of the actual webpage or the data output of the webpage. Caches can be built upon request, or output can be pre-cached to optimize the user experience. A combination of the above on different kinds of pages can be used to develop caching schemes for usage. Another aspect of the engine is that it can be used to crawl an entire website with the corresponding signature file and build a complete database of product information from the website automatically.
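The two caching modes mentioned above (built upon request, or pre-cached) can be illustrated with a small sketch. The class `PageCache` and its methods are hypothetical names chosen for illustration; the `build` callback stands in for whatever produces the webpage or its data output.

```python
class PageCache:
    """Cache that is filled on first request and can also be filled ahead of time."""

    def __init__(self, build):
        self._build = build   # callable that produces the data for a URL
        self._store = {}

    def get(self, url):
        # Build upon request: only the first request for a URL triggers a build.
        if url not in self._store:
            self._store[url] = self._build(url)
        return self._store[url]

    def precache(self, urls):
        # Pre-cache: build entries before any client asks for them.
        for url in urls:
            self.get(url)

calls = []
cache = PageCache(lambda url: calls.append(url) or f"data for {url}")
cache.get("/a")
cache.get("/a")                # served from the cache, no second build
cache.precache(["/a", "/b"])   # only "/b" still needs to be built
```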

Pre-Caching (Offline Synchronization) of Website Content to a Mobile Client

Another aspect is the ability for a user to load website content (pre-caching) from the Schema Engine to the client in larger segments instead of page by page. This could either be done through an application on an internet enabled PC when the mobile device is connected to the PC or directly from the mobile device when the user has a wireless data connection available. Once the desired content is on the mobile device, the user could browse the content without a wireless connection.

In the current examples, when a user selects a menu item from a menu page, for example “computers” in FIG. 35, the Schema Engine fetches the page corresponding to the menu item, which could be a sub-menu or a list page. Another embodiment of the invention would allow the user to select the category “computers” in FIG. 35 and the Schema Engine would automatically traverse all sub menus in the category and cache all information including sub menus, list pages and item pages in the category. Other examples are sections of a news site including “headlines”, “business”, “sports” etc. Using this method a user could select categories that could be pre-cached on the device for offline browsing at a later time. The same techniques could be applied to cache an entire website for data mining or searching (for example a price comparison website that wished to have indexes of multiple e-commerce sites). Further, it is recognized that no network calls need be required at browsing time when content has been precached. For example, precaching could happen at the beginning of the day via the user desktop and the content then synched to the mobile via a wired connection (for example), so that the user can surf precached information offline. For example, a signature file can be used to grab content based on precaching criteria, and this content could be used to generate the precache database. Examples of wanting to build a local precached database on the device could include: 1) browsing situations with no connection or a potentially interrupted connection; or 2) fast browsing. Further, price comparison websites can crawl the web and build comparison pricing information to make available to the public or other subscribers.
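The category traversal described above can be sketched as a breadth-first walk. The function `precache_category`, the `fetch_links` callback and the sample site map are all hypothetical; in practice the links of each page would come from the Schema Engine's extraction rather than from a dictionary.

```python
from collections import deque

def precache_category(start, fetch_links):
    """Breadth-first walk from a category page, caching every reachable page once."""
    cache = {}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page in cache:
            continue                      # already cached via another path
        links = fetch_links(page)
        cache[page] = links
        queue.extend(link for link in links if link not in cache)
    return cache

# Hypothetical site map: a category, two sub menus and three item pages.
SITE = {
    "computers": ["laptops", "desktops"],
    "laptops": ["item-1", "item-2"],
    "desktops": ["item-3"],
    "item-1": [], "item-2": [], "item-3": [],
}
cached = precache_category("computers", lambda page: SITE.get(page, []))
```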

APPENDIX A
Platform Overview

Referring to FIG. 45, the following steps can be effected:

1. A client (1) makes a request to the Schema Engine (4), acting as a proxy, for a specific webpage (2) from a specific domain.

2. The engine (4) receives the web page code (2) into memory. This typically does not include objects on the page such as pictures that are inserted at the time of rendering.

3. The engine (4) in parallel makes a request to the signature repository (3) to acquire the signature file for the domain using the domain in the URL as the key to retrieve the signature file

4. The engine does not render the page but instead uses the code in the signature file as instructions to extract the desired data from the web page (6).

5. The data can optionally be stored in a data repository (5)

6. The engine (4) transmits the data to the client (1) that requested the page

7. The client (1) could be a browser application that displays the data or could be an application that renders the data
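The steps above can be sketched as follows. All names here (`signature_key`, `handle_request`, the callbacks and the sample repository) are hypothetical. The sketch assumes the repository key is the URL's host without a leading “www.”, and it performs the signature lookup sequentially rather than in parallel as step 3 describes.

```python
from urllib.parse import urlparse

def signature_key(url):
    """Derive the repository key from the URL's domain (assumed: host minus 'www.')."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def handle_request(url, fetch_page, repository, extract):
    """Fetch the page code, look up the signature for its domain, extract the data."""
    page_code = fetch_page(url)                       # step 2
    signature = repository.get(signature_key(url))    # step 3
    if signature is None:
        return None                                   # no signature for this domain
    return extract(page_code, signature)              # steps 4 and 6

# Hypothetical repository and callbacks used only for illustration.
repository = {"example.ca": {"title_ref": "<title>"}}
fetch = lambda url: "<html><title>Home</title></html>"
check = lambda page, sig: {"has_title": sig["title_ref"] in page}
result = handle_request("http://www.example.ca/home.asp?langid=EN", fetch, repository, check)
```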

Schema Engine Detailed Walk Through

Assume that a client makes a request for the ABC ComTech Corp. page shown in FIG. 26. After steps 1, 2 and 3 take place, the Schema Engine has the ABC ComTech Corp. webpage code and signature file loaded as shown below. The next few pages are a detailed explanation of step 4.

Code Snippet of ABC ComTech Corp. Page Shown in FIG. 26

<!DOCTYPE HTML PUBLIC “-//W3C//DTD HTML 4.01 Transitional//EN”

“http://www.w3.org/TR/html4/loose.dtd”>

<meta http-equiv=“Content-Type” content=“text/html; charset=iso-8859-1” />

<script language=“JavaScript” src=“/javaScript/productdetail.js”></script>

<html>

<head>

<meta NAME=“GENERATOR” Content=“Microsoft Visual Studio 6.0”>

<link rel=“SHORTCUT ICON” href=“http://www.ABC ComTech

Corp..ca/favicon.ico”>

<title >ABC ComTech Corp.: Computers: Laptops: Acer Aspire AMD Turion 64 X2

Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) - French - FS Exclusive</title>

</head>

<body onLoad=“onLoadAction( ); return false;” onResize=“setDDOnResize( ); return

false;” leftmargin=“10” rightmargin=“10” topmargin=“10” bottommargin=“10”>



<script language=“JavaScript” src=“/javaScript/search.js”></script>

<table align=“center” cellpadding=“0” cellspacing=“0” class=“table-outer-width”>

<tr>

<td>

<table width=“100%” border=“0” cellpadding=“0” cellspacing=“0”>

<tr valign=“top”>

<td colspan=“3” class=“bg-header-border”></td>

</tr>

<tr>

<td class=“bg-header-border”></td>

...

Code Snippet of ABC ComTech Corp. Signature File Retrieved by Schema Engine

<?xml version=“1.0” encoding=“ISO-8859-1” ?>

<site>

<version major=“1” minor=“2”/>

<url location=“http://www.ABC ComTech Corp..ca” key=“ABC ComTech Corp..ca”

name=“ABC ComTech Corp ” />

<advanced>

<append_link value=“&test%5Fcookie=1” />

<index_link

value=“http://www.ABC ComTech

Corp..ca/home.asp?newlang=EN&logon=&langid=EN”

/>

</advanced>

<page_type>

<lookup type=“pex” action=“locate_string” name=“list_elements”

id=“mylist_1” ref=“Sort or compare products” alt1=“Sort products” />

<lookup type=“pex” action=“locate_string” name=“item_elements”

id=“myitem_1” ref=“"product-details-prd-title"” />

<lookup type=“pex” action=“locate_string” name=“menu_elements”

id=“mymenu_2” ref=“anc-lhsnav-subItem” />

<lookup type=“pex” action=“locate_string” name=“menu_elements”

id=“mymenu_1” ref=“product-table” />

<lookup type=“pex” action=“locate_string” name=“item_elements”

id=“myitem_1” ref=“*” />

</page_type>

<list_elements id=“mylist_1”>

<paging>

<page_variable value=“page” />

<page_start value=“0” />

<lookup type=“pex” action=“get_string” name=“link” ref=“Next&nbsp”

location=“before” start=“<a class=” end=“</a>” include_sz=“1” strip_tags=“1” />

</paging>

<actions>

<lookup type=“pex” action=“move_ptr” ref=“Sort or compare

products” alt1=“Sort products” />

...

Step 1: Schema Engine Confirms that Input HTML is from ABC ComTech Corp.ca and this Signature File is that of ABC ComTech Corp.ca

Step 2: Schema Engine Sets a Required Global Variable to Append “&test%5Fcookie=1” to all requests and sets main index to http://www.ABC ComTech Corp. .ca/home.asp?newlang=EN&logon=&langid=EN
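A minimal sketch of applying the append_link value to an outgoing request URL is shown below. The helper name `with_append_link` and the choice of separator handling are assumptions; the specification only states that the value is appended to all requests.

```python
def with_append_link(url, append_value):
    """Append the signature file's append_link value to a request URL."""
    suffix = append_value.lstrip("&?")
    if not suffix:
        return url
    # Use "&" when the URL already carries a query string, otherwise start one.
    separator = "&" if "?" in url else "?"
    return url + separator + suffix

# Hypothetical URLs used only for illustration.
with_query = with_append_link("http://www.example.ca/home.asp?langid=EN", "&test%5Fcookie=1")
without_query = with_append_link("http://www.example.ca/home.asp", "&test%5Fcookie=1")
```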

Step 3: Schema Engine then Tries to Determine the Page Type by Checking Existence of String Identifiers for Each Page Family

<page_type>

<lookup type=“pex” action=“locate_string” name=“list_elements” id=“mylist_1”

ref=“Sort or compare products” alt1=“Sort products” />

<lookup type=“pex” action=“locate_string” name=“item_elements” id=“myitem_1”

ref=“"product-details-prd-title"” />

<lookup type=“pex” action=“locate_string” name=“menu_elements”

id=“mymenu_2” ref=“anc-lhsnav-subItem” />

<lookup type=“pex” action=“locate_string” name=“menu_elements”

id=“mymenu_1” ref=“product-table” />

<lookup type=“pex” action=“locate_string” name=“item_elements” id=“myitem_1”

ref=“*” />

</page_type>

ABC ComTech Corp. web page code snippet:

...

<td valign=“top” width=“430”>

<table width=“100%” border=“0” cellpadding=“0” cellspacing=“0”

class=“product-details-table”>

<tr><td><img

src=“/images/prices/savebanner/EN/large/Save%2D%2460.gif” width=“125”

height=“15”></td><td align=“right”><a

href=“http://www.ABCComTechCorp.ca/informationcentre/EN/content_feedback.as

p?sku_id=0665000FS10086374&title=Acer+Aspire+AMD+Turion+64+X2+Dual+Core+TL%2D

52+1%2E60GHz+Laptop+%28AS9300%2D5383F%29+%2D+French+%2D+FS+Exclusive&man

=Acer&logon=&langid=EN”>Feedback</a></td></tr>

<tr>

<td colspan=“2” class=“product-details-price”><img

src=“/images/prices/pricepill/large/999%2E99.gif” ></td>

<td class=“product-details-price-trc”></td>

</tr>

<tr>

<td colspan=“2” class=“product-details-prd-title”><span class=“tx-heading3-dgrey”>Acer Aspire

AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop (AS9300-5383F) -

French - FS Exclusive</span></td>

<td class=“product-details-r-bdr”> </td>

</tr>

...

Schema Engine does not find “Sort or compare products” or “Sort products” in the web page so this page is not from the List family. The engine continues to check the next string.

Schema Engine finds “"product-details-prd-title"” and identifies the page as part of the Item family (item_elements).
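The page type check can be sketched as follows. The function `detect_page_type` and the dictionary layout are hypothetical; the lookup values mirror the page_type section shown above. Case-insensitive matching is an assumption, inferred from the walk-through, where the ref “our price:” is matched against “Our price:” on the page.

```python
def detect_page_type(html, lookups):
    """Return (name, id) of the first lookup whose ref or alternate occurs in the page."""
    page = html.lower()
    for lookup in lookups:
        for ref in lookup["refs"]:
            # "*" is treated as a catch-all that matches any page.
            if ref == "*" or ref.lower() in page:
                return lookup["name"], lookup["id"]
    return None

LOOKUPS = [
    {"name": "list_elements", "id": "mylist_1",
     "refs": ["Sort or compare products", "Sort products"]},
    {"name": "item_elements", "id": "myitem_1",
     "refs": ['"product-details-prd-title"']},
    {"name": "menu_elements", "id": "mymenu_2", "refs": ["anc-lhsnav-subItem"]},
    {"name": "menu_elements", "id": "mymenu_1", "refs": ["product-table"]},
    {"name": "item_elements", "id": "myitem_1", "refs": ["*"]},
]

page = '<td colspan="2" class="product-details-prd-title"><span>Acer</span></td>'
family = detect_page_type(page, LOOKUPS)
```

Because the lookups are checked in order, the wildcard entry at the end acts as a default family when no other identifier is found.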

Step 4: Schema Engine then jumps to the “item_elements” section of the signature file that contains instructions for extracting the object elements for the page

<item_elements id=“myitem_1”>

<actions>

<lookup type=“pex” action=“move_ptr” ref=“&lt;/head>” />

</actions>

<element>

<lookup type=“pex” action=“get_string” name=“image” ref=“largeimageref”

location=“after” start=“&lt;img src="” end=“"” />

<lookup type=“pex” action=“get_string” name=“title” ref=“product-details-prd-title”

location=“after” start=“&lt;span” end=“&lt;/span>” include_sz=“1” strip_tags=“1” />

<lookup type=“pex” action=“get_string” name=“price” ref=“our price:”

location=“after” start=“&lt;td” end=“&lt;/td>” include_sz=“1” strip_tags=“1” />

<lookup type=“pex” action=“get_string” name=“sale_price” ref=“sale price:”

location=“after” start=“&lt;td” end=“&lt;/td>” include_sz=“1” strip_tags=“1” tolerance=“1” />

<lookup type=“pex” action=“get_string” name=“description” ref=“detailbox-text”

location=“middle” start=“<p” end=“</p>” include_sz=“1” strip_tags=“1” />

</element>

</item_elements>

Step 5: Schema Engine Trims HTML Scope

<actions>

<lookup type=“pex” action=“move_ptr” ref=“</head>” />

</actions>

ABC ComTech Corp. web page code snippet:

<html>

<title>

...

</head>

<body onLoad=“onLoadAction( ); return false;”

onResize=“setDDOnResize( ); return

false;” leftmargin=“10” rightmargin=“10”

topmargin=“10” bottommargin=“10”>

...

</html>

The engine discards all code before “</head>” setting the upper limit.
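A minimal sketch of the move_ptr action follows. The function name and return convention are hypothetical, and the sketch assumes the pointer is placed just past the reference string; the specification does not say whether the reference string itself remains in scope.

```python
def move_ptr(html, ref, pos=0):
    """Return the index just past the next occurrence of ref at or after pos, or -1."""
    found = html.lower().find(ref.lower(), pos)
    return -1 if found == -1 else found + len(ref)

# Hypothetical page used only for illustration.
page = "<html><head><title>x</title></head><body>y</body></html>"
ptr = move_ptr(page, "</head>")
remaining = page[ptr:]   # everything before the pointer is out of scope
```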

Step 6: Schema Engine Extracts Image

<element>

<lookup type=“pex” action=“get_string” name=“image”

ref=“largeimageref” location=“after” start=“<img src="”

end=“"” />

ABC ComTech Corp. web page code snippet:

<DIV id=“largeImageRef” style=“display:none”>

<a href=“#”

onClick=“openWindowAdv(‘http://www.ABC ComTech

Corp..ca/popup/largeimagepopup.asp?logon=&langid=EN&title=Acer+

Aspire+AMD+Turion+64+X2+Dual+Core+TL%2D52+1%2E60GHz+

Laptop+%28AS9300%550, 524, 0, 0, 0, 0, 0, 0);” title=“Click here

for larger view”><img src=“/multimedia/products/regular/10086374.gif” BORDER=

“0” alt=“Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz

Laptop (AS9300-5383F)

The Schema Engine returns the string between the first “<img src="” and “"” that appear after the next occurrence of “largeimageref”. The string returned is the path to the product image.
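The get_string lookup with location=“after” can be sketched as follows. The function `get_string_after` is a hypothetical name, the page fragment is a shortened stand-in for the snippet above, and case-insensitive matching is an assumption based on “largeimageref” matching “largeImageRef”.

```python
def get_string_after(html, ref, start, end, pos=0):
    """Return the text between the first start and end strings found after ref."""
    lowered = html.lower()
    ref_at = lowered.find(ref.lower(), pos)
    if ref_at == -1:
        return None
    start_at = lowered.find(start.lower(), ref_at + len(ref))
    if start_at == -1:
        return None
    value_from = start_at + len(start)
    end_at = lowered.find(end.lower(), value_from)
    if end_at == -1:
        return None
    return html[value_from:end_at]

# Shortened, hypothetical version of the page fragment.
snippet = (
    '<DIV id="largeImageRef" style="display:none">'
    '<a href="#" title="Click here for larger view">'
    '<img src="/multimedia/products/regular/10086374.gif" BORDER="0" alt="Acer"></a></DIV>'
)
image = get_string_after(snippet, "largeimageref", '<img src="', '"')
```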

Step 7: Schema Engine Extracts Title

<lookup type=“pex” action=“get_string”

name=“title” ref=“product-details-prd-title” location=

“after” start=“<span” end=“</span>”

include_sz=“1” strip_tags=“1” />

ABC ComTech Corp. web page code snippet:

<tr>

<td colspan=“2” class=“product-details-prd-

title”><span class=“tx-heading3-dgrey”>Acer

Aspire AMD Turion 64 X2 Dual Core TL-52 1.60GHz Laptop

(AS9300-5383F) - French - FS Exclusive</span></td>

<td class=“product-details-r-bdr”> </td>

</tr>

Then Schema Engine returns the string between the first “<span” and “</span>”, including the first and last elements, that appear after the next occurrence of “product-details-prd-title”, excluding any markup language. The string returned is the title.
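The effect of include_sz and strip_tags can be illustrated with a short sketch. The function `get_string` below is a hypothetical reading of those two flags: include_sz keeps the start and end strings in the captured span, and strip_tags then removes all markup, which is why the two are used together when the start string is an unclosed “<span”. Tag removal uses a simple regular expression, which is a simplification.

```python
import re

def get_string(html, ref, start, end, include_sz=False, strip_tags=False):
    """Extract the first start..end span after ref, applying the two flags."""
    lowered = html.lower()
    ref_at = lowered.find(ref.lower())
    if ref_at == -1:
        return None
    start_at = lowered.find(start.lower(), ref_at + len(ref))
    if start_at == -1:
        return None
    end_at = lowered.find(end.lower(), start_at + len(start))
    if end_at == -1:
        return None
    if include_sz:
        value = html[start_at:end_at + len(end)]      # keep start and end strings
    else:
        value = html[start_at + len(start):end_at]    # only the text in between
    if strip_tags:
        value = re.sub(r"<[^>]*>", "", value)         # drop anything that looks like a tag
    return value.strip()

# Shortened, hypothetical version of the page fragment.
row = ('<td colspan="2" class="product-details-prd-title">'
       '<span class="tx-heading3-dgrey">Acer Aspire Laptop</span></td>')
title = get_string(row, "product-details-prd-title", "<span", "</span>",
                   include_sz=True, strip_tags=True)
raw = get_string(row, "product-details-prd-title", "<span", "</span>")
```

Without the flags, the captured text still carries the remainder of the opening tag, which shows why the signature file sets both.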

Step 8: Schema Engine Extracts Price

<lookup type=“pex” action=“get_string” name=

“price” ref=“our price:” location=“after”

start=“<td” end=“</td>” include_sz=“1”

strip_tags=“1”/>

ABC ComTech Corp. web page code snippet:

<tr>

<td class=“tx-strong-grey”>Our price:</td>

<td class=“tx-normal-grey” style=“text-align: right”>$1,059.99</td>

</tr>

Then Schema Engine returns the string in between first “<td” and “</td>” including first and last element that appears after next appearance of “our price:”, excluding any mark up language. The string returned is the price.

Step 9: Schema Engine Extracts Sale Price

<lookup type=“pex” action=“get_string” name=“sale_price”

ref=“sale price:” location=“after” start=“<td” end=“</td>”

include_sz=“1” strip_tags=“1” tolerance=“1” />

ABC ComTech Corp. Web Page Code Snippet:

<tr>

<td class=“tx-strong-red”>Sale Price:</td>

<td class=“tx-normal-red” style=“text-align: right”>$999.99</td>

</tr>

Then Schema Engine returns the string between the first “<td” and “</td>”, including the first and last elements, that appear after the next occurrence of “sale price:”, excluding any markup language. The string returned is the sale price.
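The sale price lookup carries tolerance=“1”, which is read here as meaning that a failed lookup is tolerated rather than causing the whole object to be dropped. The sketch below illustrates that reading; `find_value`, `extract_item` and the dictionary layout are hypothetical.

```python
import re

def find_value(html, ref, start, end):
    """Return the tag-stripped text of the first start..end span after ref, or None."""
    lowered = html.lower()
    ref_at = lowered.find(ref.lower())
    if ref_at == -1:
        return None
    start_at = lowered.find(start.lower(), ref_at + len(ref))
    if start_at == -1:
        return None
    end_at = lowered.find(end.lower(), start_at + len(start))
    if end_at == -1:
        return None
    return re.sub(r"<[^>]*>", "", html[start_at:end_at + len(end)]).strip()

def extract_item(html, lookups):
    """Run every lookup; a failed lookup without tolerance drops the whole object."""
    item = {}
    for lookup in lookups:
        value = find_value(html, lookup["ref"], lookup["start"], lookup["end"])
        if value is None and not lookup.get("tolerance", False):
            return None
        item[lookup["name"]] = value
    return item

LOOKUPS = [
    {"name": "price", "ref": "our price:", "start": "<td", "end": "</td>"},
    {"name": "sale_price", "ref": "sale price:", "start": "<td", "end": "</td>",
     "tolerance": True},
]
# Hypothetical page with a regular price but no sale price.
regular = ('<tr><td class="tx-strong-grey">Our price:</td>'
           '<td class="tx-normal-grey">$1,059.99</td></tr>')
item = extract_item(regular, LOOKUPS)
```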

Step 10: Schema Engine Extracts Description

<lookup type=“pex” action=“get_string” name=“description”

ref=“detailbox-text” location=“middle” start=“<p” end=“</p>”

include_sz=“1” strip_tags=“1” />

</element>

</item_elements>

HTML:

<p class=“detailbox-text”>

Decked out with an impressive 17″ Acer CrystalBrite widescreen display the Aspire 9300 enhances multitasking productivity and gaming pleasure. <a HREF=“#MoreInfo”>More Info</a>

</p>

Then Schema Engine returns the string in the middle of “<p” and “</p>” including first and last element on the occurrence of “detailbox-text”, excluding any mark up language. The string returned is the description.
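For location=“middle”, the target value contains the reference string, so the start string is searched backwards from the reference and the end string forwards. The sketch below assumes that reading; `get_string_middle` is a hypothetical name and the page fragment is shortened. Note that stripping tags with a simple regular expression leaves the text of nested links in place.

```python
import re

def get_string_middle(html, ref, start, end):
    """Return the tag-stripped start..end span that surrounds the first ref."""
    lowered = html.lower()
    ref_at = lowered.find(ref.lower())
    if ref_at == -1:
        return None
    start_at = lowered.rfind(start.lower(), 0, ref_at)   # nearest start before ref
    end_at = lowered.find(end.lower(), ref_at)           # nearest end after ref
    if start_at == -1 or end_at == -1:
        return None
    span = html[start_at:end_at + len(end)]              # include_sz="1"
    return re.sub(r"<[^>]*>", "", span).strip()          # strip_tags="1"

# Shortened, hypothetical version of the page fragment.
html = ('<p>Intro</p><p class="detailbox-text">\n'
        'Decked out. <a HREF="#MoreInfo">More Info</a>\n</p>')
description = get_string_middle(html, "detailbox-text", "<p", "</p>")
```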

Step 11: Schema Engine Assembles and Returns all Extracted Data

Image: /multimedia/products/regular/10086374.gif

Title: Acer Aspire AMD Turion 64 X2 Dual Core TL-52 1.60 GHz Laptop (AS9300-5383F) - French - FS Exclusive

Price: $1,059.99

Sale Price: $999.99

Description: Decked out with an impressive 17″ Acer CrystalBrite widescreen display, the Aspire 9300 enhances multitasking productivity and gaming pleasure.

Signature File

Example Source (*.ffs)

Genre
Title
File

e-Commerce
ABC ComTech Corp . . . ca.ffs

ABC ComTech Corp. Page Family Signature Explanation

1 <page_type>

2 <lookup type=“pex” action=“locate_string”

name=“list_elements” id=“mylist_1”

ref=“Sort or compare products” ref_alt_1=

“Sort products” />

3 <lookup type=“pex” action=“locate_string”

name=“item_elements” id=“myitem_1”

ref=“"product-details-prd-title"” />

4 <lookup type=“pex” action=“locate_string”

name=“menu_elements” id=“mymenu_2”

ref=“anc-lhsnav-subItem”

/>

5 <lookup type=“pex” action=“locate_string”

name=“menu_elements” id=“mymenu_1”

ref=“product-table” />

6 <lookup type=“pex” action=“locate_string”

name=“item_elements” id=“myitem_1”

ref=“*” />

7 </page_type>

The Schema Engine processes the <page_type> tag by registering the identification strings for each page family. By doing that, when a webpage is sent to the engine as input, the engine is able to identify the page family by its unique string.

action=“locate_string”: command to check for the existence of a string

name: identifies the type of page family for each identified family

id: assigns an id to the page family that is used across the signature file

When the Schema Engine is passed a web page and the signature file, the first step is to identify the page type, which then directs the engine to the corresponding elements tag for the page family (for example, the list_elements tag for a list page).

ABC ComTech Corp. List Family Signature Explanation

1 <list_elements id=“mylist_1”>

2 <paging>

3 <page_variable value=“page” />

4 <page_start value=“0” />

5 <lookup type=“pex” action=“get_string” name=“link” ref=“Next&nbsp”

location=“before” start=“<a class=” end=“</a>” include_sz=“1” strip_tags=“1” />

6 </paging>

7 <actions>

8 <lookup type=“pex” action=“move_ptr” ref=“Sort or compare products”

ref_alt_1=“Sort products” />

9 </actions>

10 <element>

11 <lookup type=“pex” action=“get_string” name=“link” ref=“thumbnail”

location=“before” start=“<a href="” end=“">” />

12 <lookup type=“pex” action=“get_string” name=“image” ref=“thumbnail”

location=“middle” start=“"” end=“"” />

13 <lookup type=“pex” action=“get_string” name=“title” ref=“class="tx-strong-

dgrey&quot;” location=“after” start=“<a href=” end=“</a>” include_sz=“1”

strip_tags=“1” />

14 <lookup type=“pex” action=“get_string” name=“price” ref=“pricepill/”

location=“after” start=“/” repeat_start=“1” end=“.gif” tolerance=“1” />

15 <lookup type=“pex” action=“move_ptr” ref=“pricepill/” />

16 </element>

17 </list_elements>

Once the engine has identified that the page is of the “mylist_1” family, the engine finds the sections in the signature file that contain the signature for the objects and elements of the family.

The <paging> tag (lines 2 to 6) contains the paging attributes of the mylist_1 family. Its tags contain instructions to find the number of pages on the list page and to generate the link for each page.

The <actions> tag (lines 7 to 9) instructs the engine to move the scan pointer to the section on the page right before the main list content of the page. This allows the engine to scan only the relevant area, discarding all the code preceding it. This can be important because it can eliminate ambiguity and repetition by instructing the engine on precisely which parts of the page to scan.

Explanation of the Lookup Command:

Lookup type=“pex”: string lookup

Action=“get_string”: this action type returns a value, which is the desired element of the object.

Name=“link”: the object element, in this case the link to the product page

ref=“thumbnail”: the reference string that identifies where to find the value of the link

location=“before”: The value of the link is before the ref string

start=“<a href="”: the string that marks the beginning of the value to extract

end=“">”: the string that marks the end of the value to extract

Line 11 for example instructs the engine to look for a reference of the string “thumbnail”, then locate the value between the start and end strings specified to the left of reference point. The element, which is the link to the product page in this case, is before the reference string and its value is to be extracted and returned.

The last lookup with action=“move_ptr” in the element tag instructs the engine to move the pointer past the first object to get ready to repeat the instructions to scan in the element of the second object on the list page.
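The repeat-per-object behaviour described above can be sketched as a loop. The helper names, the sample page and the exact pointer arithmetic are assumptions; only the link lookup (line 11) and the closing move_ptr (line 15) of the list signature are modelled, and matching is case-sensitive here for brevity.

```python
def get_before(html, ref, start, end, pos):
    """Value of the last start..end pair that precedes the next ref after pos."""
    ref_at = html.find(ref, pos)
    if ref_at == -1:
        return None
    start_at = html.rfind(start, pos, ref_at)
    if start_at == -1:
        return None
    value_from = start_at + len(start)
    end_at = html.find(end, value_from, ref_at)
    if end_at == -1:
        return None
    return html[value_from:end_at]

def extract_links(html, ref="thumbnail", advance="pricepill/"):
    """Collect one link per list object, moving the pointer past each object."""
    links, pos = [], 0
    while True:
        link = get_before(html, ref, '<a href="', '">', pos)
        if link is None:
            break
        links.append(link)
        next_at = html.find(advance, pos)    # move_ptr to get ready for the next object
        if next_at == -1:
            break
        pos = next_at + len(advance)
    return links

# Hypothetical list page with two products.
page = (
    '<a href="/p/1"><img src="/thumbnail/1.gif"></a><img src="/pricepill/10.gif">'
    '<a href="/p/2"><img src="/thumbnail/2.gif"></a><img src="/pricepill/20.gif">'
)
links = extract_links(page)
```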

Note: If you attach “advance_ptr” to a lookup, this will also advance the pointer (this can be used if ordering in list page exists)

ABC ComTech Corp. Search Family Signature Explanation

1 <search_elements id=“mysearch_1”>

2 <settings>

3 <search_path value=“http://www.ABC ComTech

Corp..ca/search/searchresult.asp?logon=&langid=EN&search=KWS” />

4 <search_variable value=“keyword” />

5 </settings>

6 <paging>

7 <page_variable value=“page” />

8 <page_start value=“0” />

9 <lookup type=“pex” action=“get_string” name=“link” ref=“Next&nbsp”

location=“before” start=“<a href=” repeat_start=“1” end=“</a>” include_sz=“1”

strip_tags=“1” />

10 </paging>

11 <actions>

12 <lookup type=“pex” action=“move_ptr” ref=“bg-compare-hero” />

13 </actions>

14 <element>

15 <lookup type=“pex” action=“get_string” name=“link” ref=“>” location=“after”

start=“<a href="” end=“">” />

16 <lookup type=“pex” action=“get_string” name=“image” ref=“<a href”

location=“after” start=“<img src="” end=“"” />

17 <lookup type=“pex” action=“get_string” name=“title” ref=“class="tx-strong-

dgrey&quot;” location=“after” start=“<a href=” end=“</a>” include_sz=“1”

strip_tags=“1” />

18 <lookup type=“pex” action=“move_ptr” ref=“bg-compare-hero” />

19 </element>

20 </search_elements>

Once the engine has identified that the page is of the “mysearch_1” family, the engine finds the sections in the signature file that contain the signature for the objects and elements of the family, shown above.

The <settings> tag (lines 2 to 5) contains any page-specific manual overrides, such as excluding certain menu items or any customization or modification of a menu that may need to be done. In this example, the value of the form variable “keyword” will be posted to “http://www.ABC ComTech Corp..ca/search/searchresult.asp?logon=&langid=EN&search=KWS”.

The <paging> tag (lines 6 to 10) manages paging for the search pages.

The <actions> tag (lines 11 to 13) instructs the engine to move the scan pointer to the string “bg-compare-hero” and start looking for elements from there.

The <element> tag (lines 14 to 19) contains lookup instructions for each object element as previously described.
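One possible reading of the search settings is that “KWS” in the search_path is a placeholder that is replaced by the value of the form variable named by search_variable. The sketch below illustrates that reading only; the placeholder behaviour, the function `build_search_url` and the sample host are assumptions, and the specification may instead intend the value to be sent in the body of a POST.

```python
from urllib.parse import quote_plus

def build_search_url(search_path, form_values, search_variable, placeholder="KWS"):
    """Substitute the URL-encoded search keyword for the placeholder in search_path."""
    keyword = form_values.get(search_variable, "")
    return search_path.replace(placeholder, quote_plus(keyword))

# Hypothetical host and form values used only for illustration.
url = build_search_url(
    "http://www.example.ca/search/searchresult.asp?logon=&langid=EN&search=KWS",
    {"keyword": "acer laptop"},
    "keyword",
)
```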

ABC ComTech Corp. Menu Family Signature Explanation

1 <menu_elements id=“mymenu_1”>

2 <settings>

3 <black_list value=“Site Index##ReClaim&#8482;

Insurance Replacement” />

4 </settings>

5 <actions>

6 <lookup type=“pex” action=“move_ptr” ref=

“bg-lhsnav-title” />

7 <lookup type=“pex” action=“end_ptr” ref=

“</table>” />

8 </actions>

9 <element>

10 <lookup type=“pex” action=“get_string” name=

“link” ref=“<li>” location=“after” start=“<a href="”

end=“"” />

11 <lookup type=“pex” action=“get_string” name=

“title” ref=“<li>” location=“after” start=“<a href="”

end=“</a>” include_sz=“1” strip_tags=“1” />

12 <lookup type=“pex” action=“move_ptr” ref=“</li>” />

13 </element>

14 </menu_elements>

Once the engine has identified that it is looking for a menu on a page that contains the menu style of the “mymenu_1” family, the engine finds the sections in the signature file that contain the signature for the objects and elements of the family, shown above.

The <settings> tag (lines 2 to 4) contains any page-specific manual overrides such as an exclude list, customization, modification, personalization, etc. In this example, any result that matches “Site Index” or “ReClaim™ Insurance Replacement” is excluded; partial matches are also possible by using wild card strings.

Lines 6 and 7 set the start limit and end limit to instruct the engine on where to look for menu items.

The <element> tag (lines 9 to 13) contains lookup instructions for each object element as previously described. In this example, an element in ‘mymenu_1’ (each individual menu entry of the webpage) contains link and title as its properties. Line 12 instructs the engine to move the pointer to “</li>” to get ready to loop through and extract the next menu item with the same elements.
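The menu family can be sketched as a bounded scan followed by filtering. Everything here is illustrative: the function `extract_menu` is hypothetical, a regular expression is swapped in for the per-element pointer lookups, “##” is assumed to separate black list entries, and only exact title matches are excluded (wild cards are not modelled).

```python
import re

def extract_menu(html, start_ref, end_ref, black_list_value):
    """Extract link/title pairs between start_ref and end_ref, minus black-listed titles."""
    begin = html.find(start_ref)                       # move_ptr: start limit
    if begin == -1:
        return []
    stop = html.find(end_ref, begin)                   # end_ptr: end limit
    window = html[begin:stop if stop != -1 else len(html)]
    blocked = {name.strip() for name in black_list_value.split("##")}
    entries = []
    for href, label in re.findall(r'<li>\s*<a href="([^"]*)"[^>]*>(.*?)</a>', window, re.S):
        title = re.sub(r"<[^>]*>", "", label).strip()
        if title not in blocked:
            entries.append({"link": href, "title": title})
    return entries

# Hypothetical page: two menu entries inside the limits, one entry outside them.
page = (
    '<td class="bg-lhsnav-title">Shop</td>'
    '<li><a href="/computers">Computers</a></li>'
    '<li><a href="/site-index">Site Index</a></li>'
    '</table>'
    '<li><a href="/footer">Footer</a></li>'
)
menu = extract_menu(page, "bg-lhsnav-title", "</table>",
                    "Site Index##ReClaim\u2122 Insurance Replacement")
```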

ABC ComTech Corp. Content/Item Family Signature Explanation

1 <item_elements id=“myitem_1”>

2 <actions>

3 <lookup type=“pex” action=“move_ptr” ref=“</head>” />

4 </actions>

5 <element>

6 <lookup type=“pex” action=“get_string” name=“image” ref=“largeimageref”

location=“after” start=“<img src="” end=“"” />

7 <lookup type=“pex” action=“get_string” name=“title” ref=“product-details-prd-title”

location=“after” start=“<span” end=“</span>” include_sz=“1” strip_tags=“1” />

8 <lookup type=“pex” action=“get_string” name=“price” ref=“our price:”

location=“after” start=“<td” end=“</td>” include_sz=“1” strip_tags=“1” />

9 <lookup type=“pex” action=“get_string” name=“sale_price” ref=“sale price:”

location=“after” start=“<td” end=“</td>” include_sz=“1” strip_tags=“1” tolerance=“1” />

10 <lookup type=“pex” action=“get_string” name=“description” ref=“detailbox-text”

location=“middle” start=“<p” end=“</p>” include_sz=“1” strip_tags=“1” />

11 </element>

12 </item_elements>

Once the engine has identified that the page is of the “myitem_1” family, the engine finds the sections in the signature file that contain the signature for the objects and elements of the family, shown above.

The <actions> tag (lines 2 to 4) instructs the engine to move the scan pointer to the appropriate spot to get ready to scan and output the product elements.

The <element> tag (lines 5 to 11) contains lookup instructions for each of the defined fields in a product. In this example, an object in ‘myitem_1’ (an item) contains the elements image, title, price, sale price and description. Note that the pointer does not need to be moved after scanning in the elements, since the product detail page contains only one object, unlike the list page.

A detailed walk-through of this family is given above.

Appendix: Signature Engine Syntax

Lookup Syntax

Lookup is the query which the Signature engine runs against the website to obtain one or more result sets.

Type

Defines data type of reference. One lookup can contain multiple references and specific type of each reference, represented by ‘Type_n’ for each nth reference.

Type    Description
Pex     String expression
Iex     Numeric expression
Rex     Regular expression
Dex     Date/Time expression
Bex     Binary

Action

Action            SQL Equivalent
Locate_string     SELECT COUNT ALL
Get_string        SELECT
Move_ptr          Beginning_of_LIMIT
End_ptr           End_of_LIMIT
Remove_string     DELETE
Replace_string    UPDATE

Name

An element of an object, for example price if the object is a product

Id

Id is a named relation for an identified family of web pages. It is the SQL equivalent of a table name.

Ref

Ref is the string reference being matched to identify a page, object or element. It is equivalent to a WHERE condition in SQL. There could be multiple Ref values where default Ref is represented by ‘Ref’ and subsequent Refs are presented by ‘Ref_n’. This is equivalent to having multiple conditions in an SQL WHERE clause.

Alt

Alt is an extension of Ref object with SQL equivalent of ‘OR’ clause presented by ‘Ref_Alt_n’.

Location

Extends SQL equivalent of WHERE clause to give directional containment of data set.

Location    Description
Before      Target value is located before ‘Ref’ value
Middle      Target value contains ‘Ref’ value
After       Target value is located after ‘Ref’ value

Start

Start is used to specify the beginning of a given data element.

End

End is used to specify end of a given data element.

Include_sz

Boolean to include ‘Start’ and ‘End’ value.

Tolerance

Boolean to define whether failure of given lookup excludes the entire query for an object.

Strip_tags

Boolean to define whether to strip all HTML tags out of the target value being extracted.

Notrim

Values will not be trimmed (leading and trailing spaces will not be removed).

Upper

Value is converted to all upper case.

Lower

Value is converted to all lower case.

Uppercase_word

First character of each word in a value is converted to upper case.

Uppercase_first

First character of a value is converted to upper case.
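The value modifiers above can be sketched as a single post-processing function. The function `apply_modifiers`, the order in which the modifiers are applied and the use of a single space as the word separator are assumptions made for illustration.

```python
def apply_modifiers(value, notrim=False, upper=False, lower=False,
                    uppercase_word=False, uppercase_first=False):
    """Apply the trimming and case modifiers to an extracted value."""
    if not notrim:
        value = value.strip()                 # default: trim leading/trailing spaces
    if upper:
        value = value.upper()
    if lower:
        value = value.lower()
    if uppercase_word:
        # Capitalize the first character of each space-separated word.
        value = " ".join(word[:1].upper() + word[1:] for word in value.split(" "))
    if uppercase_first:
        value = value[:1].upper() + value[1:]
    return value

trimmed = apply_modifiers("  acer aspire  ")
shouted = apply_modifiers("acer aspire", upper=True)
titled = apply_modifiers("acer aspire", uppercase_word=True)
```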

Page Syntax

Make a page family's paging feature functional.

Page_variable

Defines unique key that defines a family's paging feature

Page_start

Defines value of first page in a family's paging feature.

Page_post

Path where paging variable(s) must be transmitted to.

Search Syntax

Make a website family's search feature functional.

Search_path

Search path where search variable must be transmitted to

Search_variable

Name of search variable which a website's search feature is looking to read, request, post, etc.

General Syntax

Any variable name and value can be defined.
