BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to web portals and more particularly to a method for constructing pageflows by analyzing multiple clickstreams traversed by a user.
2. Description of Background
FIG. 1 is a schematic system view of an example of a portal server implementing an existing art portal. A prior art portal such as WebSphere™ portal by IBM™ is built from complex functionality implemented on a network server, such as application server 100 illustrated in FIG. 1. The most important elements of such a server are logic components for user authentication 105, state handling 110, aggregation of fragments 115, a plurality of portlets 120 provided in respective pages 125 with a respective plurality of APIs 130 to a respective portlet container software 135 for setting the portlets 120 into the common page context, and portal storage resources 140. The logic components are operatively connected such that data can be exchanged between single components as required, as represented in FIG. 1.
The existing art portal realizes a request/response communication pattern, i.e., it waits for client requests and responds to those requests. A client request message includes a URL/URI which addresses the requested portal page and/or other portal resources.
More specifically, an existing art portal such as illustrated in FIG. 1 implements an aggregation of portlets 120 based on the underlying portal model 150 comprising a hierarchy of portal pages that may include portlets and portal information such as security settings, user roles, customization settings, and device capabilities. Within the rendered page, the portal automatically generates the appropriate set of navigation elements based on the portal model. The portal engine invokes portlets during the aggregation as and when required and uses caching to reduce the number of requests made to portlets. The existing art WebSphere™ portal by IBM™ employs open standards such as the Java™ portlet application programming interface (API). It also supports the use of a remote portlet via the Web Services for Remote Portlets (WSRP) standard.
Referring again to FIG. 1, the portlet container 135 is a single control component responsible for all portlets 120, which may control the execution of code residing in each of these portlets. It provides the runtime environment for the portlets and facilities for event handling, inter-portlet messaging, and access to portlet instance and configuration data, among others. The portal resources 140 are in particular the portlets 120 themselves and the pages 125 on which they are aggregated in the form of an aggregation of fragments and the navigation model. A portal database 128 stores the portlet description, which comprises attributes such as portlet name, portlet description, portlet title, portlet short title, and keywords. The portal database 128 also stores the content model 150, which defines the portal content structure, i.e., the structure of pages, and comprises page definitions. A page definition describes a portal page and references the components (e.g., portlets) that are contained in the page. This data is stored in the database 128 in an adequate representation based on existing art techniques such as relational tables.
Referring further to FIG. 1, some existing art portals contain a navigation component 165 which provides the possibility to nest elements and to create a navigation hierarchy, which is stored in the portal model.
Referring once more to FIG. 1, an important activity in existing art rendering and aggregation 115 processes is the generation of URLs that address portal resources, e.g., pages 125. A URL is generated by the aggregation logic and includes coded state information. The aggregation state as well as the portlet state is managed by the portal. The aggregation state can include information such as the current selection including the path to the selected page in the portal model, the portlets modes and states, the portlet render and action parameters, etc. By including the aggregation state in a URL, the portal ensures that it is later able to establish the navigation and presentation context when the client sends a request for the particular URL. A portlet can request the creation of a URL through the portlet API and provide parameters, i.e., the portlet render and action parameters to be included in the URL.
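By way of a non-limiting illustration, the inclusion of aggregation state in a URL can be sketched as follows. The sketch assumes, purely for illustration, that the state is serialized as JSON and carried in a single query parameter named "state"; the function names, the parameter name, and the serialization are hypothetical and are not taken from any particular portal product.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode, urlparse


def encode_state_url(base_url, aggregation_state):
    """Serialize the aggregation state and append it to the URL as one parameter."""
    raw = json.dumps(aggregation_state, sort_keys=True).encode("utf-8")
    token = base64.urlsafe_b64encode(raw).decode("ascii")
    # "state" is a hypothetical parameter name chosen for this sketch.
    return base_url + "?" + urlencode({"state": token})


def decode_state_url(url):
    """Recover the aggregation state from a URL produced by encode_state_url."""
    token = parse_qs(urlparse(url).query)["state"][0]
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
```

When a client later requests such a URL, decoding the parameter yields the selection path and the portlet modes and parameters needed to re-establish the navigation and presentation context.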
Referring again to FIG. 1, the user repository 129 contains user information and authentication information for each portal user. The user repository may be implemented in a database or a prior art Lightweight Directory Access Protocol (LDAP) directory. The user repository 129 supports various retrieval operations to query information about one user, multiple users or all portal users.
FIG. 2 is a diagram that illustrates an example of existing art interactions in a portal during render request processing. Referring to FIG. 2, a client 220 is depicted at the left side of the diagram with the portlet markup A, B, and C of respective portlets in the client browser. The portlet container 135 is depicted in the central portion of the diagram, and the diverse portlets A, B, and C are depicted at the right side of the diagram. The communication is based on requests, which are represented by the depicted arrows.
Referring further to FIG. 2, in particular, the client 220 issues a render request 260, e.g., for a new page, by clicking on a link displayed in its browser window. The link contains a URL, and in reaction to the user action, the client 220 issues the render request 260 containing the URL. To render the new page, the portal 135 (after receiving the render request 260) invokes state handling, passing the URL. State handling then determines the aggregation state and the portlet state that is encoded in the URL or that is associated with the URL. Typically, the aggregation state contains an identification of the requested page. Aggregation 115 checks if a derived page exists for this user. Aggregation 115 loads the corresponding page definition from the portal database 128 and determines the portlets that are referenced in the page definition, i.e., that are contained on the page. Aggregation 115 sends its own render request 270 to each portlet through the portlet container 135. In the existing art, each portlet A, B and C creates its own markup independently and returns the markup fragment with the respective request response 280. The portal aggregates the markup fragments and returns the new page to the client 220 in a respective response 290.
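The render request processing described above can be sketched, in strongly simplified form, as follows. The sketch assumes that page definitions are available as a mapping from a page identifier to the identifiers of the referenced portlets and that each portlet is a callable returning its markup fragment; these data shapes and the function name are hypothetical and serve only to illustrate the aggregation step.

```python
def handle_render_request(page_id, page_definitions, portlets):
    """Aggregate the markup fragments of all portlets referenced by a page.

    page_definitions: dict mapping a page id to a list of portlet ids (assumed shape).
    portlets: dict mapping a portlet id to a callable that returns markup (assumed shape).
    """
    fragments = []
    for portlet_id in page_definitions[page_id]:
        render = portlets[portlet_id]
        # Each portlet creates its own markup independently of the others.
        fragments.append(render())
    # The aggregated page is returned to the client as a single response.
    return "\n".join(fragments)
```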
Referring back to FIG. 1, a graphical user interface component 160 is provided for manually controlling the layout of the plurality of rendered pages. By that interface 160, a portal administrator or user is enabled to control the visual appearance of the portal pages (e.g., by creating new pages and/or by adding or removing portlets on pages). In particular, the administrator or user can decide which portlet is included at a given portal web page by adding portlets to pages or by removing portlets from pages. The manual layout interface 160 invokes the model management 161 which comprises the functionality for performing persistent content model changes and offers an API for invoking this functionality.
Some existing art portals support the concept of page derivation. This concept allows for a stepwise specialization of a page. In the first step, an administrator A creates a page, defines a base layout, and adds content (i.e., portlets) to the page. Thereafter, the administrator grants appropriate rights to other administrators or users, who themselves can derive the page and edit the layout and content of a page, but not any locked elements. When an administrator or a user modifies the page, model management 161 creates a derivation of the page and stores it into the portal database 128. It also stores an association between the implicit derivation and the user that performed the page modification.
For example, assume administrator A creates a page X that comprises portlet A, and administrator B adds portlet B to page X, which results in the creation of the derived page X′. Assume further that user C is authorized to view the page X (and thus X′). In this case, when issuing a request for page X, administrator A will see portlet A (corresponding to page X), administrator B will see portlets A and B (corresponding to page X′), and user C will also see portlets A and B (corresponding to page X′). Aggregation 115 automatically selects the corresponding page during request processing based on the aggregation state and the ID of the user issuing the request. Now, assume user C modifies the page to include portlet C. The portal thus creates a new derived page X″ and stores it into the database 128. The derived page is associated with user C. When a request for page X is now issued, administrator A will see portlet A, administrator B will see portlets A and B (corresponding to page X′), and user C will see portlets A, B and C (corresponding to page X″).
There are numerous disadvantages associated with the foregoing existing art portal systems. In such existing art portal systems, users are often searching for information with respect to a certain topic. For example, a user might search for information regarding a certain technology X. There might be several places where information about technology X can be retrieved, which makes it necessary for the user to travel many different paths to find the best information sources and to collect what is of interest to the user from those sources. However, it is very difficult to remember all the information sources that were found during the traversal process and even more difficult to remember the routes to those sources.
SUMMARY OF THE INVENTION
The shortcomings of the prior art are overcome and additional advantages are provided through embodiments of the invention proposing a method for constructing pageflows by analyzing multiple clickstreams traversed by a user that involves, for example, initiating a clickstream session in response to a user log-in and intercepting and storing all navigation interactions of the user during the clickstream session by a clickstream recorder component. In response to the user's request for a visualization of the user's navigation interactions during the session, the stored navigation interactions of the user for the clickstream session are analyzed by a clickstream analyzer to identify segments comprising interconnected nodes sequentially traversed by the user in a single navigation path during the session and to distinguish segments comprising nodes unrelated to other nodes traversed during the session. A graphic depiction of the identified segments comprising the interconnected nodes sequentially traversed by the user in a single navigation path during the session is presented to the user by a clickstream visualizer.
Embodiments of the invention further propose generating and storing the pageflow comprising a list of semantically related nodes sequentially traversed by the user at least a pre-determined number of times in a single navigation path during the session based on an analysis of the stored navigation interactions of the user for the clickstream session. In response to a request by the user, the stored pageflow is displayed for the user by a pageflow navigator. Embodiments of the invention also propose prompting the user by the pageflow navigator with an option to select and recall sequences of nodes from the pageflow and/or prompting the user by an XML importer/exporter with an option to transform the pageflow into an XML structure for export.
TECHNICAL EFFECTS
As a result of the summarized invention, technically we have achieved a solution for implementing a method for automatic generation of pageflows (i.e., a list of semantically interconnected/related nodes (pages)) by analyzing clickstreams describing the user's previous navigation behavior.
BRIEF DESCRIPTION OF THE DRAWINGS
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic system view of an example of a portal server implementing an existing art portal;
FIG. 2 is a diagram that illustrates an example of existing art interactions in a portal during render request processing;
FIG. 3 is a schematic system view of an example of a portal server for embodiments of the invention;
FIG. 4 is a diagram that illustrates an example of a possible visualization presented by the clickstream visualizer to a user;
FIG. 5 illustrates an example of the XML structure used to describe navigation interaction sequences for embodiments of the invention; and
FIG. 6 is a diagram that illustrates an example of a general flow for embodiments of the invention.
The detailed description explains the preferred embodiments of the invention, together with advantages and features, by way of example with reference to the drawings.
DETAILED DESCRIPTION OF THE INVENTION
A focus of embodiments of the invention lies on the automatic generation of pageflows (i.e., a list of semantically interconnected/related nodes (pages)) by analyzing clickstreams describing the user's previous navigation behavior. Pageflows represent meaningful sets of nodes (pages) that are semantically related and traversed often by users in the same sequence (order). Thus, the construction of pageflows makes it easier for users to recall sequences of nodes (pages) that are traversed often. Moreover, it makes navigating along them easier, as only clicks on next and previous links are needed. Pageflows can be constructed either automatically by the system by observing user behavior or manually by the user, e.g., by selecting nodes presented by a clickstream visualizer.
FIG. 3 is a schematic system view of an example of a portal server for embodiments of the invention. Referring to FIG. 3, in embodiments of the invention, the portal 300 is extended by a clickstream recorder component 310. This component 310 tracks each single navigation interaction, such as clicks on pages and portlets, which the user performs. A single clickstream sequence comprises all navigation interactions that are part of a single session. The entire clickstream sequences are stored in a clickstream storage 313 for later retrieval.
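A clickstream recorder of the kind described above might, in one possible and strongly simplified form, look as follows. The class and field names are hypothetical, an in-memory dictionary stands in for the clickstream storage 313, and the session identifier is assumed to be supplied by the portal.

```python
import time
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interaction:
    node: str         # identifier of the page or portlet that was clicked
    kind: str         # e.g. "page" or "portlet"
    timestamp: float  # time of the interaction, in seconds


class ClickstreamRecorder:
    """Tracks every navigation interaction, grouped by session (illustrative only)."""

    def __init__(self):
        # In-memory stand-in for the clickstream storage: session id -> interactions.
        self._storage = defaultdict(list)

    def record(self, session_id, node, kind="page", timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        self._storage[session_id].append(Interaction(node, kind, ts))

    def clickstream(self, session_id):
        """Return the clickstream sequence of one session in recording order."""
        return list(self._storage.get(session_id, []))
```

In this sketch, one clickstream sequence corresponds to all interactions recorded under one session identifier.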
Referring further to FIG. 3, a clickstream analyzer 311 analyzes the clickstreams. The clickstream analyzer 311 distinguishes between segments that comprise interconnected nodes and segments that are not related to other nodes already traversed. In addition, the clickstream analyzer 311 distinguishes between nodes with which users actually interacted and nodes that were merely visited.
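One conceivable way of separating a clickstream into segments of interconnected nodes is sketched below. The sketch assumes that each recorded click carries the node from which it was reached (a referrer) and treats a click as continuing the current segment only if its referrer is the node visited immediately before; this referrer-based criterion and the function name are assumptions made for illustration, not a criterion prescribed herein.

```python
def split_into_segments(clicks):
    """Split a clickstream into segments of interconnected nodes.

    clicks: list of (node, referrer) tuples in traversal order, where referrer
    is None if the node was not reached via a link (assumed data shape).
    """
    segments = []
    previous = None
    for node, referrer in clicks:
        if segments and referrer is not None and referrer == previous:
            # Reached from the node visited just before: same navigation path.
            segments[-1].append(node)
        else:
            # Unrelated to the preceding node: a new segment starts here.
            segments.append([node])
        previous = node
    return segments
```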
Referring again to FIG. 3, with the help of a clickstream visualizer 312, the system is at any point in time able to visualize what has been traversed so far in a graph-like structure. Different segments of interconnected nodes are visualized in parallel, and nodes themselves are represented by thumbnails. The nodes representing real information sources might usually be the dead ends of each single segment. Whether or not they actually are can be determined by observing users' interaction behavior (e.g., copy and paste, etc.).
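The determination of real information sources can be illustrated by the following sketch, which assumes that segments are already available as lists of node identifiers and that the set of nodes with which the user actually interacted (e.g., by copy and paste) has been observed separately. The function name and data shapes are hypothetical.

```python
def find_target_pages(segments, interacted_nodes):
    """Return the dead end of each segment that the user actually interacted with.

    segments: list of lists of node ids in traversal order (assumed shape).
    interacted_nodes: set of node ids on which interaction was observed.
    """
    targets = []
    for segment in segments:
        if not segment:
            continue
        dead_end = segment[-1]
        # A dead end only counts as a real information source if the observed
        # interaction behavior confirms it.
        if dead_end in interacted_nodes:
            targets.append(dead_end)
    return targets
```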
Referring further to FIG. 3, the pageflow generator 314 automatically constructs pageflows based on various metrics, e.g., by combining the target pages or the pages that are part of frequently traversed segments. Pageflows can alternatively be constructed manually by the user by selecting thumbnails displayed as part of the tree representing the prior navigation behavior. Pageflows are stored in the pageflow storage 316 for later retrieval.
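As one example of such a metric, the following sketch combines the end pages of segments that were traversed at least a pre-determined number of times. The threshold parameter, the choice of the last node of a segment as its target page, and the function name are assumptions made for this illustration; other metrics are equally possible.

```python
from collections import Counter


def generate_pageflow(segments, min_count=2):
    """Build a pageflow from the end pages of frequently traversed segments.

    segments: list of lists of node ids, one entry per traversal (assumed shape).
    min_count: hypothetical threshold for "traversed often".
    """
    counts = Counter(tuple(segment) for segment in segments if segment)
    pageflow = []
    for segment in segments:
        if not segment:
            continue
        target = segment[-1]
        # Keep the order of first occurrence and avoid duplicate entries.
        if counts[tuple(segment)] >= min_count and target not in pageflow:
            pageflow.append(target)
    return pageflow
```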
Referring again to FIG. 3, using the pageflow navigator 318, users can recall and traverse recorded or retrieved pageflows simply by clicking next-style and previous-style buttons. Alternatively, pageflows can be exchanged with colleagues by transforming them into an Extensible Markup Language (XML) structure 317 describing the flow as shown in FIG. 5. Thus, experts can generate flows for less experienced users. XML structures can be exported and imported by the XML importer/exporter 315, and imported data can be handed over to the pageflow storage 316 or the pageflow navigator 318.
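The next/previous traversal of a stored pageflow can be sketched as follows. The class name is hypothetical, and the choice to remain on the first or last page when the end of the pageflow is reached is an assumption of this sketch.

```python
class PageflowNavigator:
    """Steps forwards and backwards through a pageflow (illustrative only)."""

    def __init__(self, pageflow):
        if not pageflow:
            raise ValueError("a pageflow must contain at least one page")
        self._pages = list(pageflow)
        self._index = 0

    @property
    def current(self):
        return self._pages[self._index]

    def next(self):
        # Stay on the last page when there is no further page.
        if self._index < len(self._pages) - 1:
            self._index += 1
        return self.current

    def previous(self):
        # Stay on the first page when there is no earlier page.
        if self._index > 0:
            self._index -= 1
        return self.current
```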
The clickstream visualizer 312 can be invoked by the user on demand. A click on a special link that is part of the theme redirects the user to a special page on which the clickstream visualizer portlet resides.
Referring once more to FIG. 3, the clickstream recorder 310 can similarly be invoked by the user on demand. A click on a special link that is part of the theme redirects the user to a special page on which the clickstream recorder portlet resides. The portlet presents a list of clickstreams that have already been recorded. Options for recalling them and navigating along them are provided. Automatically and manually recorded clickstreams can be visually distinguished. The portlet also offers to create new clickstreams (manually) and offers options for managing existing ones (deletion, renaming, etc.).
FIG. 4 is a diagram that illustrates an example of a possible visualization presented by the clickstream visualizer 312 to a user. Referring to FIG. 4, three segments 410, 420, and 430 are displayed which represent navigation sequences that belong together as determined by analyzing timing and navigation patterns. Each segment comprises several pages, each of which is represented by a thumbnail allowing the user to easily remember what the page in question was about. The thumbnails are clickable, and a click on a thumbnail redirects the user to the underlying page. Thumbnails 440 correspond to real target pages that have previously been determined by the clickstream analyzer 311.
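As a simple illustration of the timing aspect only, page visits can be grouped into segments wherever the pause between two consecutive visits exceeds a threshold. The threshold value, the function name, and the data shape are assumptions of this sketch; an actual analysis may combine timing with navigation patterns in other ways.

```python
def split_by_time_gap(visits, max_gap_seconds=300.0):
    """Group visits into segments separated by long pauses.

    visits: list of (node, timestamp) tuples sorted by timestamp (assumed shape).
    max_gap_seconds: hypothetical pause length that starts a new segment.
    """
    segments = []
    last_timestamp = None
    for node, timestamp in visits:
        if last_timestamp is None or timestamp - last_timestamp > max_gap_seconds:
            segments.append([node])
        else:
            segments[-1].append(node)
        last_timestamp = timestamp
    return segments
```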
By way of example, an automatically generated pageflow 450, which in this case comprises target pages 440 only, is depicted at the bottom of FIG. 4. This pageflow can be transformed into XML data and exchanged as described earlier.
FIG. 5 illustrates an example of the XML structure used to describe navigation interaction sequences for embodiments of the invention. For each user, all flows that have ever been traversed are stored. A session describes all flows that have been traversed during a particular session. Each flow describes a set of segments, and each segment describes a set of pages that have been traversed.
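The nesting of user, session, flow, segment, and page described above can be illustrated by the following sketch, which serializes such a hierarchy to XML. The element names "user", "session", "flow", "segment", and "page" and the attribute name "id" are hypothetical choices made for this sketch and are not asserted to match the structure shown in FIG. 5.

```python
import xml.etree.ElementTree as ET


def pageflows_to_xml(user_id, sessions):
    """Serialize the flows of one user into an XML string.

    sessions: dict mapping a session id to a list of flows, where a flow is a
    list of segments and a segment is a list of page ids (assumed shape).
    """
    user_el = ET.Element("user", {"id": user_id})
    for session_id, flows in sessions.items():
        session_el = ET.SubElement(user_el, "session", {"id": session_id})
        for flow in flows:
            flow_el = ET.SubElement(session_el, "flow")
            for segment in flow:
                segment_el = ET.SubElement(flow_el, "segment")
                for page_id in segment:
                    ET.SubElement(segment_el, "page", {"id": page_id})
    return ET.tostring(user_el, encoding="unicode")
```

A string produced in this manner can be parsed again on the importing side, e.g., with ET.fromstring, before the data is handed over for storage or navigation.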
FIG. 6 is a diagram that illustrates an example of a general flow for embodiments of the invention. Referring to FIG. 6, after a user logs in, a new clickstream session is started at 610. Every single navigation interaction is recorded at 620 and stored at 630. Upon receiving the user's request for a visualization of the user's previous navigation behavior, at 640, the clickstream analyzer 311 analyzes the clickstreams to determine segments 410, 420, and 430, and real targets 440, and at 650, the visualizer 312 presents the clickstream to the user.
Using the pageflow navigator, at 671, users can recall and traverse recorded or retrieved pageflows simply by clicking next-style and previous-style buttons. Alternatively, at 681, pageflows can be exchanged with colleagues by transforming them into an XML structure describing the flow as shown in FIG. 5. XML structures can be exported and imported by the XML importer/exporter 315 shown in FIG. 3.
An important aspect of embodiments of the invention is the recording of every navigation step which a user performs. Embodiments of the invention distinguish between segments that comprise interconnected nodes and segments that are not related to other nodes already traversed. The nodes representing real information sources might usually be the dead ends of each single segment, which can be confirmed or refuted by observing users' interaction behavior.
Embodiments of the invention are capable of constructing flows of pages comprising the nodes that have previously been determined to be real information sources. These flows can be associated with a topic X to be stored and recalled later. They can be described in XML structures and exchanged with colleagues, and embodiments of the invention can also automatically store paths that are traveled often. Users have the option to manipulate the dynamically generated flows by selecting and deselecting single nodes as part of the visual representation of breadcrumbs that have been recorded.
The flow diagrams depicted herein are only examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For example, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.
While the preferred embodiment of the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.