Embodiments relate generally to computer hardware, computer software and methods for session data management. More specifically, disclosed are approaches that provide for functionality such as session capture, processing, search, and replay, among other such actions.
Application sessions generally involve information that is transferred between a user and an application, such as a Web application, during a period of time. Generally, application sessions generate and accumulate a substantial amount of data, including data generated from the application and data entered by the user, as well as metadata about the user and the application. An application session can further include information relating to how the user is interacting with the application and the historical performance of the application.
There is a need to fully analyze and understand application sessions between a user and an application. For example, when a user of a web application reports a problem with the web application usage, the detailed application session data can help the web application provider to find out what happened at the client side and thus solve the issue.
Existing technologies only partially record or analyze application sessions. Examples of such technologies include web crawler technology, web archive technology, desktop recording technology, network session-based capture technology, and client-side web widgets, which include web analytics widgets primarily for tracking user actions and web performance widgets primarily for tracking performance metrics.
Thus, there is a need to manage the application sessions in a dynamic and holistic approach to analyze and understand application sessions.
Various embodiments or examples (“examples”) of the invention are disclosed in the following detailed description and the accompanying drawings:
Various embodiments or examples may be implemented in numerous ways, including as a system, a process, an apparatus, a user interface, or a series of program instructions on a computer readable medium such as a computer readable storage medium or a computer network where the program instructions are sent over optical, electronic, or wireless communication links. In general, operations of disclosed processes may be performed in an arbitrary order, unless otherwise provided in the claims.
A detailed description of one or more examples is provided below along with accompanying figures. The detailed description is provided in connection with such examples, but is not limited to any particular example. The scope is limited only by the claims and numerous alternatives, modifications, and equivalents are encompassed. Numerous specific details are set forth in the following description in order to provide a thorough understanding. These details are provided for the purpose of example and the described techniques may be practiced according to the claims without some or all of these specific details. For clarity, technical material that is known in the technical fields related to the examples has not been described in detail to avoid unnecessarily obscuring the description.
In one embodiment, an application session involves data regarding the use of an application by a user in a period, or multiple periods, during which the user is accessing or interacting with the application, such as may include a sequence of interactions between the user and a client computing device, and between the client computing device and other computing devices.
An application session can include an input time series which describes input of a user in a selected period of time. Examples of an input time series include a search term 118 input by the user, a selected website 112 chosen by the user, or any other input events, such as keyboard events, mouse movements, mouse clicks, stylus movements, or touch gestures.
An application session can also include an output time series which describes output of an application in response to the user's input events in a selected period of time. Examples of an output time series include a search result 116, or any other output events displayed by user agent 108. In this example, the application session is related to a user X (not shown).
Referring to
In one embodiment, session capture agent 204 can capture session data from an application session and generate a session capture. In one embodiment, the session capture can include an input time series which describes input of a user in a selected period of time, e.g. a mouse click at a selected web link, and text input for a web search. Moreover, the session capture can include an output time series which illustrates the application's response for the user's input in a selected period of time, e.g. loading of a web link, and displaying of a search result. The session capture can also include session metadata which describes general information of the user and the application. In addition, the input and output time series do not always have the same length as their time stamps are generally different.
The input time series in at least one embodiment is a sequence of tuples I=(u0, I0), (u1, I1) . . . (un, In), where the un are timestamps, and the In are records of user input events, such as keyboard events, mouse movements, mouse clicks, stylus movements, or touch gestures. For example, the In can be represented using a structured format such as JSON or XML, or may be a translation of a binary representation of input events used by the operating system, GUI framework, or user agent 206.
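As a purely illustrative sketch, and not the claimed implementation, the input time series I can be modeled in Python as a list of (timestamp, record) tuples whose records are JSON-serializable. The record fields ("type", "key", "x", "y") and the helper names are assumptions introduced here for illustration only.

```python
import json

# Hypothetical input time series I = (u0, I0), (u1, I1), ... (un, In).
# Timestamps are seconds since session start; the record fields are assumed.
input_series = [
    (0.00, {"type": "mouse_click", "x": 120, "y": 48}),
    (0.85, {"type": "key", "key": "c"}),
    (0.97, {"type": "key", "key": "a"}),
    (1.10, {"type": "key", "key": "t"}),
]

def serialize_input_series(series):
    """Serialize each (timestamp, record) tuple as one JSON object."""
    return json.dumps([{"u": u, "I": rec} for u, rec in series])

def events_between(series, start, end):
    """Return the records whose timestamps fall in the interval [start, end)."""
    return [rec for u, rec in series if start <= u < end]
```

A structured format of this kind keeps each In self-describing, so later processing stages can filter by event type without knowing the originating GUI framework.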
According to some embodiments, session capture agent 204 can employ different approaches to capture input time series. In one embodiment, session capture agent 204 can intercept certain user agent API calls related to receiving or handling user input events through a shim library. In another embodiment, session capture agent 204 can register event handlers that are notified of user input events through user agent API calls. Still in another embodiment, session capture agent 204 can use generic instrumentation facilities of user agent 206 to gather information about user input events.
The output time series in this example is a sequence of tuples O=(t0, O0), (t1, O1) . . . (tm, Om), where the tm are timestamps, and each Om is the structured output document displayed by user agent 206 at the corresponding time tm. For example, if a user agent is using a UI toolkit such as Cocoa Touch, then Om is a structured document representation of the UI display contents, derived from an abstract representation of the UI state in terms of structured UI constructs and their content.
In one embodiment, the structured output document of a user agent can be expressed as:
Ok=(Dk, rk1, rk2, . . . , rkn)
where Dk is the structured output document and the rkj are external resources. In such an environment, the output document time series can be referred to as D=(t0, D0), (t1, D1), . . . (tm, Dm). For example, user agent 206 can use a Document Object Model (DOM) as a runtime representation of an output document including a markup language such as XML or HTML. In this example, Dk is a serialization of the DOM at time tk, and the external resources include images, style sheets and other resources. The external resources are identified by identifiers such as a Uniform Resource Locator (URL), and each resource is included in Dk by a link including the resource's URL.
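One way to picture the relationship between a serialized document Dk and its external resources is the following Python sketch, which scans an HTML serialization for resource links and builds a tuple of the form (Dk, rk1, . . . , rkn). The class and function names are hypothetical, and only img, script and link elements are considered, which is an assumption made to keep the example short.

```python
from html.parser import HTMLParser

class ResourceLinkParser(HTMLParser):
    """Collect URLs of external resources linked from a serialized document."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            # Assumed subset: images and scripts via src, style sheets via href.
            if tag in ("img", "script") and name == "src":
                self.urls.append(value)
            elif tag == "link" and name == "href":
                self.urls.append(value)

def make_output(document):
    """Build Ok = (Dk, rk1, ..., rkn), with resources represented by their URLs."""
    parser = ResourceLinkParser()
    parser.feed(document)
    parser.close()
    return (document, *parser.urls)
```

Here each resource is represented only by its URL; an actual capture could store the resource content alongside or retrieve it later.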
According to some embodiments, session capture agent 204 can employ different approaches to capture output time series, partially depending on the environment in which session capture agent 204 operates, e.g. the APIs provided by user agent 206 and the APIs provided by the host operating system associated with user agent 206.
In one embodiment, session capture agent 204 can call user agent 206 to retrieve the current output document (i.e. full-pulling approach). In another embodiment, session capture agent 204 can register with user agent 206's APIs in order to receive notifications when the output document changes (i.e. full-change notification approach). Still in another embodiment, if user agent 206 provides appropriate APIs, session capture agent 204 can retrieve change notifications listing the changes in the output document (i.e. delta change notification approach). Yet in another embodiment, session capture agent 204 can intercept certain user agent API calls related to updating or modifying the output document (i.e. delta intercept approach).
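The full-pulling approach can be sketched in Python as a loop that asks the user agent for its current output document at each sampling time. The callable get_document stands in for whatever user agent API is available and is hypothetical; recording a tuple only when the document differs from the previous one is an added assumption intended to keep the output time series compact.

```python
def capture_by_polling(get_document, timestamps):
    """Build an output time series by pulling the full document at each timestamp.

    get_document: hypothetical callable returning the current serialized document.
    timestamps:   increasing sampling times.
    """
    series = []
    last = None
    for t in timestamps:
        doc = get_document(t)
        # Assumption: skip samples that are identical to the previous document.
        if doc != last:
            series.append((t, doc))
            last = doc
    return series
```

A notification-based approach would replace the loop with a callback registered with the user agent, appending a tuple each time the callback fires.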
Session metadata can be stored in session metadata records using a structured format such as JSON or XML. Metadata records can contain any auxiliary information that is related to the session but is not contained in the output or input time series. Session metadata record types include temporal or non-temporal. For example, session metadata about the application name and user name are probably non-temporal, whereas metadata about network traffic are likely to be temporal.
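For illustration, session metadata records could be laid out as in the following Python sketch, which separates non-temporal attributes from timestamped temporal records and serializes both as JSON. All field names and values are assumptions and are not drawn from any particular embodiment.

```python
import json

# Hypothetical metadata layout: non-temporal attributes plus temporal records.
session_metadata = {
    "non_temporal": {"application_name": "ExampleApp", "user_name": "user-x"},
    "temporal": [
        {"t": 0.0, "network_bytes": 512},
        {"t": 2.5, "network_bytes": 2048},
    ],
}

def temporal_value_at(metadata, key, t):
    """Return the most recent value of a temporal attribute at time t, if any.

    Assumes the temporal records are stored in increasing timestamp order.
    """
    value = None
    for record in metadata["temporal"]:
        if record["t"] <= t and key in record:
            value = record[key]
    return value

# Structured serialization of the metadata records.
encoded = json.dumps(session_metadata)
```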
According to some embodiments, session capture agent 204 can employ different approaches to capture session metadata. In one embodiment, session capture agent 204 can query user agent 206 to gather session metadata through user agent 206's APIs. In another embodiment, session capture agent 204 can query the client computing device to gather session metadata through the host operating system's APIs. Still in another embodiment, session capture agent 204 can register with user agent 206's APIs and/or with the host operating system's APIs to receive notification when state associated with session metadata changes.
In addition, some session metadata can be extracted from the session input series and session output series. For example, in certain environments, it may only be possible to determine session metadata such as the application name of a session or the login name of a session user by inspection of the input series or output series.
Still referring to
In one embodiment, after a session capture is generated, session processor 208, associated with session capture agent 204 or session search engine 212, can slice and segment the session capture, including the input time series, output time series and session metadata. Both session capture slicing and segmentation can separate a session capture into contiguous portions. However, session capture slicing can operate over long timescales and is driven by long timers and/or infrequent session events, whereas session capture segmentation can operate over short timescales and is driven by short timers and/or frequent session events.
In one embodiment, session processor 208 can slice a long session capture into smaller individual session captures. The result of this slicing is a number of shorter, separate captures that are then processed and handled separately.
In another embodiment, session processor 208 can segment the input/output series into contiguous segments. For example, given a document time series D=(t0, D0), (t1, D1), . . . (tm, Dm), a possible segmentation consists of the three segments Sa=(t0, D0), (t1, D1), Sb=(t2, D2), and Sc=(t3, D3), . . . (tm, Dm).
According to some embodiments, session processor 208 can employ different approaches to slice or segment the input/output time series. In one embodiment, a segmentation process is driven by the time elapsed since the last segment was segmented or emitted. In another embodiment, a segmentation process is driven by the amount of data in the segment. Still in another embodiment, a segmentation process is driven by the output time series, the input time series, and session metadata, using these data to place segment boundaries that reflect aspects of user activities and semantics of a session, i.e. session-driven segmentation.
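The time-driven and data-driven segmentation approaches can be sketched in Python as follows. The function names and the thresholds max_span and max_bytes are hypothetical parameters chosen for illustration; session-driven segmentation is not shown because its boundary rules depend on application semantics.

```python
def segment_by_time(series, max_span):
    """Start a new segment once max_span has elapsed since the segment began."""
    segments, current = [], []
    for t, doc in series:
        if current and t - current[0][0] >= max_span:
            segments.append(current)
            current = []
        current.append((t, doc))
    if current:
        segments.append(current)
    return segments

def segment_by_size(series, max_bytes):
    """Start a new segment when adding a document would exceed max_bytes."""
    segments, current, size = [], [], 0
    for t, doc in series:
        n = len(doc.encode("utf-8"))
        if current and size + n > max_bytes:
            segments.append(current)
            current, size = [], 0
        current.append((t, doc))
        size += n
    if current:
        segments.append(current)
    return segments
```

Both functions keep segments contiguous and preserve the original ordering, so concatenating the segments reproduces the input series.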
Referring to
In one embodiment, the compression process can include a two-stage process: a lossy compression process and a lossless compression process. For example, a lossy compression process can be a process that removes comments present in the markup of the document in the input segment. After the lossy compression process, a lossless compression process can include a delta encoding process, followed by an inter-segment compression process and an intra-segment compression process. In another embodiment, the compression process can include only a lossless compression process.
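A minimal Python sketch of a two-stage compression of one output segment is given below, under stated assumptions: the lossy stage strips markup comments with a regular expression, the delta encoding stores each document as a (shared prefix length, shared suffix length, changed middle) triple relative to its predecessor, and zlib stands in for a generic lossless compressor. None of these specific choices is taken from the description; they only illustrate the overall shape of the pipeline.

```python
import json
import re
import zlib

def strip_comments(doc):
    """Lossy stage: drop markup comments such as <!-- ... -->."""
    return re.sub(r"<!--.*?-->", "", doc, flags=re.DOTALL)

def delta_encode(docs):
    """Encode each document as [prefix_len, suffix_len, middle] vs. its predecessor."""
    encoded, prev = [], ""
    for doc in docs:
        limit = min(len(prev), len(doc))
        p = 0
        while p < limit and prev[p] == doc[p]:
            p += 1
        s = 0
        while s < limit - p and prev[-1 - s] == doc[-1 - s]:
            s += 1
        encoded.append([p, s, doc[p:len(doc) - s]])
        prev = doc
    return encoded

def delta_decode(encoded):
    """Invert delta_encode."""
    docs, prev = [], ""
    for p, s, middle in encoded:
        doc = prev[:p] + middle + prev[len(prev) - s:]
        docs.append(doc)
        prev = doc
    return docs

def compress_segment(docs):
    """Lossy comment removal, then delta encoding, then zlib (assumed compressor)."""
    cleaned = [strip_comments(d) for d in docs]
    return zlib.compress(json.dumps(delta_encode(cleaned)).encode("utf-8"))

def decompress_segment(blob):
    """Recover the comment-stripped documents from a compressed segment."""
    return delta_decode(json.loads(zlib.decompress(blob).decode("utf-8")))
```

Because the first stage is lossy, decompression returns the comment-stripped documents rather than the originals.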
After the compression process, session capture data is transferred from session capture agent 204 to session search engine 212 using an underlying connection-oriented, ordered-delivery transport protocol that is provided by the host operating system associated with session capture agent 204 and session search engine 212. In one embodiment, the protocol can include one of TCP, IP, FTP, or a web-based protocol.
In another embodiment, each of the session capture, session processing including session slicing and/or segmentation, or session compression can be non-sequential to each other. For example, the session capture and the session slicing/segmentation can be concurrent processes.
Still referring to
After the session decompression process, each session capture is assigned a unique session identifier (SID) in a session identification process. In one embodiment, the session identification is preceded by a slicing process that breaks long session captures into shorter sessions to improve the index and search efficiency of session search engine 212. In another embodiment, session capture agent 204 can assign the SID at the generation of each session capture. Yet in another embodiment, session search engine 212 can assign the SID to each session capture after the processes described herein.
After the session identification process, session index manager 216 can process the resulting session captures including session time series, or input/output time series components, in a session indexing process. The session indexing process can generate and maintain session indices for an efficient session search. In one embodiment, the session indexing process can include at least one of a term extraction process, a global index update process and a session index update process, as explained herein. The different processes are described separately for clarity, but practitioners skilled in the art should understand these processes can be implemented jointly or independently as needed.
A term extraction process, for example, can create a list of terms present in a session capture. Terms can include words and other constructs such as numbers or identifiers, e.g. social security numbers. For example, to extract terms from the output time series components, the term extraction process can parse through each document in an output session segment. To extract terms from the input time series components, the term extraction process can extract keystroke events, accumulate the result of those keystrokes, and parse the resulting text.
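Term extraction from both kinds of time series components might look like the following Python sketch. The tokenization rule (alphanumeric runs that may contain hyphens, lowercased), the keystroke record fields, and the handling of the "Backspace" key are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

TERM_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9\-]*")

class _TextCollector(HTMLParser):
    """Collect the text content of a markup document, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def terms_from_document(doc):
    """Extract terms from one output document."""
    collector = _TextCollector()
    collector.feed(doc)
    collector.close()
    return [t.lower() for t in TERM_RE.findall(" ".join(collector.chunks))]

def terms_from_keystrokes(input_series):
    """Accumulate keystroke events into text, then extract terms from it."""
    text = []
    for _, rec in input_series:
        if rec.get("type") != "key":
            continue  # non-textual events are not part of the typed text
        key = rec["key"]
        if key == "Backspace":
            if text:
                text.pop()
        elif len(key) == 1:
            text.append(key)
        else:
            text.append(" ")  # assumed: other named keys act as separators
    return [t.lower() for t in TERM_RE.findall("".join(text))]
```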
A global index update process, for example, can store and update the list of terms generated by the term extraction process. A global index is a dictionary that maps each term appearing in the input/output time series components to a list of SIDs containing the corresponding terms. In one embodiment, the global index can include an additional global metadata index for frequently used metadata attributes, e.g. the application name and the user name, thus allowing faster query processing.
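As a sketch of the dictionary described above, a global index can be built in Python by mapping each term to the sorted list of SIDs whose captures contain it. The function name and the input shape (a mapping from SID to that session's extracted terms) are assumptions.

```python
from collections import defaultdict

def build_global_index(session_terms):
    """Map each term to the sorted list of SIDs whose captures contain it.

    session_terms: hypothetical mapping {sid: iterable of terms}.
    """
    index = defaultdict(set)
    for sid, terms in session_terms.items():
        for term in terms:
            index[term].add(sid)
    return {term: sorted(sids) for term, sids in index.items()}
```

A global metadata index for attributes such as the application name could use the same structure, keyed by attribute value instead of term.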
A session index update process, for example, can store and update session indices associated with each session capture. A session index is a dictionary that can map terms or events to a list of temporal locations where the term (or event) appears in a session capture.
In one embodiment, a session capture can have at least an output index and an input index. In an output index, each temporal location can be a discrete time or an interval, depending on whether the term appears in a single document or remains present in multiple sequential documents. In an input index, each temporal location can be a discrete time. In addition, an input index can map both terms and non-textual UI input events (e.g. mouse clicks or touch gestures) to their temporal locations. In another embodiment, a session capture can have a metadata index.
After the session indexing process, the resulting sessions can be stored in a session store using a compressed or uncompressed format. In one embodiment, the session store can use tiering and caching mechanisms to improve access speed for retrieval of frequently used/searched sessions. In another embodiment, the session store can be distributed across multiple nodes for increased performance, scalability and robustness.
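The distinction between discrete times and intervals in an output index can be illustrated with the following Python sketch, which merges consecutive documents containing the same term into one (start, end) interval; a term seen in a single document yields an interval whose start equals its end. The function name and the input shape are assumptions.

```python
def build_output_index(doc_terms_series):
    """Map each term to a list of (start, end) temporal locations.

    doc_terms_series: hypothetical list of (timestamp, set_of_terms), in time order.
    A location with start == end represents a discrete time.
    """
    index = {}
    open_runs = {}  # term -> [start, last_seen] for runs still in progress
    for t, terms in doc_terms_series:
        # Close runs for terms that are absent from the current document.
        for term in list(open_runs):
            if term not in terms:
                start, last = open_runs.pop(term)
                index.setdefault(term, []).append((start, last))
        # Extend or open runs for terms present in the current document.
        for term in terms:
            if term in open_runs:
                open_runs[term][1] = t
            else:
                open_runs[term] = [t, t]
    for term, (start, last) in open_runs.items():
        index.setdefault(term, []).append((start, last))
    return index
```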
Still referring to
For the session search process, different search entry interfaces can be employed independently or jointly. In one embodiment, a search entry interface can use search queries expressed in a text-based syntax. In another embodiment, a search entry interface can use a combination of GUI elements, e.g. drop-down menus, and text boxes.
According to some embodiments, the session search process can include a multi-stage query process using the indices described herein to locate the matching sessions.
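A two-stage query of this kind might be sketched in Python as follows: the first stage intersects global index entries to find candidate SIDs, and the second stage consults each candidate's session index for temporal locations. The function name, the conjunctive (all terms must match) semantics, and the index shapes are assumptions made for illustration.

```python
def search(query_terms, global_index, session_indices):
    """Return {sid: {term: temporal locations}} for sessions matching all terms.

    global_index:    {term: [sid, ...]}
    session_indices: {sid: {term: [location, ...]}}
    """
    # Stage 1: intersect the SID lists of all query terms.
    candidates = None
    for term in query_terms:
        sids = set(global_index.get(term, []))
        candidates = sids if candidates is None else candidates & sids
    # Stage 2: look up where each term occurs within each matching session.
    results = {}
    for sid in sorted(candidates or ()):
        per_session = session_indices.get(sid, {})
        results[sid] = {term: per_session.get(term, []) for term in query_terms}
    return results
```

Restricting the second stage to the candidates from the first stage avoids loading session indices for sessions that cannot match.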
After the session search process, the system transmits the results of the search, e.g. the SIDs for the matching sessions, to a display process, which may use different display methods to present the session search outcome.
In one embodiment, the system can display a list of abbreviated session descriptions to describe the results of the search. In another embodiment, the system can display a list of session timelines including a horizontal strip representing the session's temporal axis. Still in another embodiment, the system can use a full session replay widget to allow replay of a session, including playback controls, e.g. “play”, “rewind” and “fast forward”.
In addition, the system can display additional information for aiding the user to navigate the search results. For example, when a search query for “click: search” generates multiple identified input events in a session, the system can display a timeline to show the user where the matches, e.g. “search”, occurred.
According to some embodiments, the system can apply session analytics to the result of a session search. For example, a mathematical or statistical operator can extract distributions or other statistics of identified sessions from a session search. In another example, a graphical visualization can visualize the identified sessions. Yet in another example, a data extraction operator, e.g. an XPath query, can extract data in a search result.
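As a hedged illustration of a data extraction operator followed by a statistical operator, the Python sketch below applies an XPath-style query to well-formed documents from matching sessions and averages the extracted values. It relies on the limited XPath subset supported by the standard library's xml.etree.ElementTree; the element name, the class attribute, and the function names are hypothetical.

```python
import statistics
import xml.etree.ElementTree as ET

def extract_numbers(document, path):
    """Apply an XPath-style query to a well-formed document; parse matches as floats."""
    root = ET.fromstring(document)
    return [float(el.text) for el in root.findall(path) if el.text]

def mean_over_sessions(documents, path):
    """Average all values extracted by the query across the given documents."""
    values = [v for doc in documents for v in extract_numbers(doc, path)]
    return statistics.mean(values) if values else None
```

Documents that are not well-formed XML, which is common for HTML, would need a more tolerant parser than the one assumed here.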
According to some embodiments, besides all processes described herein, the session management system can conduct an additional process to retrieve external resources that are linked to session documents. In one embodiment, the external resource retrieval process can be part of the decompression process or be independent of the decompression process. In another embodiment, session search engine 212 can conduct the external resource retrieval process.
According to some embodiments, the use of proxy server 304 can eliminate the need to install a user agent customization plugin, e.g. session capture agent 306, on the client computing device. In particular, the use of proxy server 304 can be useful when a client computing device employs a closed and incompatible operating platform.
Referring to
According to some examples, computing platform 600 performs specific operations by processor 604 executing one or more sequences of one or more instructions stored in system memory 606, and computing platform 600 can be implemented in a client-server arrangement, peer-to-peer arrangement, or as any mobile computing device, including smart phones and the like. Such instructions or data may be read into system memory 606 from another computer readable medium, such as storage device 608. In some examples, hard-wired circuitry may be used in place of or in combination with software instructions for implementation. Instructions may be embedded in software or firmware. The term “computer readable medium” refers to any tangible medium that participates in providing instructions to processor 604 for execution. Such a medium may take many forms, including but not limited to, non-volatile media and volatile media. Non-volatile media includes, for example, optical or magnetic disks and the like. Volatile media includes dynamic memory, such as system memory 606.
Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, or any other medium from which a computer can read. Instructions may further be transmitted or received using a transmission medium. The term “transmission medium” may include any tangible or intangible medium that is capable of storing, encoding or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such instructions. Transmission media include coaxial cables, copper wire, and fiber optics, including wires that comprise bus 602 for transmitting a computer data signal.
In some examples, execution of the sequences of instructions may be performed by computing platform 600. According to some examples, computing platform 600 can be coupled by communication link 621 (e.g., a wired network, such as LAN, PSTN, or any wireless network) to any other processor to perform the sequence of instructions in coordination with (or asynchronous to) one another. Computing platform 600 may transmit and receive messages, data, and instructions, including program code (e.g., application code) through communication link 621 and communication interface 613. Received program code may be executed by processor 604 as it is received, and/or stored in memory 606 or other non-volatile storage for later execution.
In the example shown, system memory 606 can include various modules that include executable instructions to implement functionalities described herein. In the example shown, system memory 606 includes a session capture agent 610, a session search agent 612 and a user agent 616, each of which can be configured to provide one or more functions described herein.
Although the foregoing examples have been described in some detail for purposes of clarity of understanding, the above-described inventive techniques are not limited to the details provided. There are many alternative ways of implementing the above-described inventive techniques. The disclosed examples are illustrative and not restrictive.
This application claims priority to U.S. provisional application 61/815,119, filed Apr. 23, 2013, and entitled “Managing Session Data”, the disclosure of which is hereby incorporated herein by reference in its entirety for all purposes.
Number | Date | Country | |
---|---|---|---|
61815119 | Apr 2013 | US |