1. Field
The present specification generally relates to ranking documents and, more particularly, to systems and methods for ranking a plurality of documents based on user activity.
2. Technical Background
A user of a computing system may wish for the computing system to identify and present data content that is relevant to the user. For example, when a user utilizes the computing system to perform research (e.g., legal research, factual research, etc.), the user may engage in a research session that generally involves a sequence of user activities (e.g., searching, viewing documents, interacting with presented content, etc.) that the user undertakes in order to locate and view relevant documents pertinent to the user's research objective. A computing system may identify and present documents in a ranked order, such that a user may view documents likely to be more relevant to the user earlier in the research session, which may result in quick identification of the information the user is seeking. Accordingly, a need exists for systems and methods for ranking a plurality of documents.
In one embodiment, a method for ranking a plurality of documents based on user activity includes receiving, automatically by a computer, first user activity data indicative of a first user activity and second user activity data indicative of a second user activity. A first user activity point value is associated with the first user activity and a second user activity point value is associated with the second user activity. The method further includes identifying a first data item based on the first user activity data, identifying a second data item based on the second user activity data, updating a first score of the first data item based on the first user activity point value, updating a second score of the second data item based on the second user activity point value, identifying the plurality of documents based on the first data item and the second data item, and ranking the plurality of documents based on the first score and the second score.
In another embodiment, a method for ranking a plurality of documents based on user activity includes receiving, automatically by a computer, first user activity data indicative of a first user activity and second user activity data indicative of a second user activity. A first user activity point value is associated with the first user activity and a second user activity point value is associated with the second user activity. The method further includes identifying a first data item based on the first user activity data, identifying a second data item based on the second user activity data, updating a first score of the first data item based on the first user activity point value, updating a second score of the second data item based on the second user activity point value, identifying a user objective based on the first user activity data, identifying the plurality of documents based on the first data item, the second data item, and the identified user objective, and ranking the plurality of documents based on the first score and the second score.
In yet another embodiment, a system for ranking a plurality of documents based on user activity includes a computing device that includes a non-transitory memory component that stores a set of executable instructions that causes the computing device to receive first user activity data indicative of a first user activity and second user activity data indicative of a second user activity. A first user activity point value is associated with the first user activity and a second user activity point value is associated with the second user activity. The set of executable instructions further causes the computing device to identify a first data item based on the first user activity data, identify a second data item based on the second user activity data, update a first score of the first data item based on the first user activity point value, update a second score of the second data item based on the second user activity point value, identify the plurality of documents based on the first data item and the second data item, and rank the plurality of documents based on the first score and the second score.
These and additional features provided by the embodiments described herein will be more fully understood in view of the following detailed description, in conjunction with the drawings.
The embodiments set forth in the drawings are illustrative and exemplary in nature and not intended to limit the subject matter defined by the claims. The following detailed description of the illustrative embodiments can be understood when read in conjunction with the following drawings, wherein like structure is indicated with like reference numerals and in which:
Referring generally to the figures, particularly
Referring now to the drawings,
The user computing device 12a may be used by a user to perform user activities. The user computing device 12a may also be utilized to perform other user functions, such as to provide a graphical user interface for interacting with the computing network and to display, or otherwise communicate, information to the user. Additionally, included in
It should be understood that while the user computing device 12a and the administrator computing device 12c are depicted as personal computers and the server computing device 12b is depicted as a server, these are non-limiting examples. More specifically, in some embodiments any type of computing device (e.g., mobile computing device, personal computer, server, etc.) may be utilized for any of these components. Additionally, while each of these computing devices is illustrated in
As also illustrated in
The processor 30 may include any processing component configured to receive and execute instructions (such as from the data storage component 36 and/or memory component 40). The input/output hardware 32 may include a monitor, keyboard, mouse, printer, camera, microphone, speaker, touch-screen, and/or other device for receiving, sending, and/or presenting data. The network interface hardware 34 may include any wired or wireless networking hardware, such as a modem, LAN port, wireless fidelity (Wi-Fi) card, WiMax card, mobile communications hardware, and/or other hardware for communicating with other networks and/or devices.
It should be understood that the data storage component 36 may reside local to and/or remote from the server computing device 12b and may be configured to store one or more pieces of data for access by the server computing device 12b and/or other components. As illustrated in
User activity data 38a is indicative of the activities performed by a user of the user computing device 12a. In some embodiments, the user activity data 38a is indicative of user activities, such as information input into the user computing device 12a (e.g., text input via a keyboard or microphone), user manipulation of presented data (e.g., user clicking of a mouse or touching a touch-screen, etc.), and the like. For example, in the context of a research session during which a user performs research in order to identify relevant documents, a user may perform a search, view a document, view a related document, download a document, print a document, e-mail a document, fax a document, flag a document, copy text from a document, or click a hyperlink within a document. As another example, in the context of a legal research session performed utilizing the research tools available from LexisNexis, user activity data 38a may be indicative of any of the following user activities: viewing a legal document, viewing a related document, viewing a Shepard's® report, performing a legal search, performing a Shepard's® search, viewing a legal issue trail, downloading a document, printing a document, e-mailing a document, faxing a document, flagging a document, copying text from a document, or clicking a hyperlink within a document. In some embodiments, the user activity data 38a may be associated with a particular research session, such that the user activity data 38a is indicative of user activity throughout the research session. In some embodiments, the user activity data 38a may include additional data, such as the duration of a user activity (e.g., how long a document was viewed, how long a user spent performing searches, etc.) or the frequency of a user activity (e.g., a number of times a document was viewed, a number of searches performed, etc.). It should be understood that user activity data 38a may also be indicative of other user activities.
User activity data 38a also includes a user activity point value associated with each user activity. In some embodiments, the user activity point value of a user activity is based on a probativeness of the user activity, such that a more probative user activity has a higher user activity point value than a less probative user activity. As used herein, “probativeness” refers to a likelihood that user activity data associated with the user activity may be utilized to identify relevant documents. As a first non-limiting example, in the context of a legal research session, performing a legal search may be less probative than viewing a document identified as a result of a legal search. Thus, in the first example, the user activity point value of the less probative legal search may be lower than the user activity point value of the more probative document view. As a second non-limiting example in the same context, viewing a legal issue trail may be more probative than viewing a document. Thus, in the second example, the user activity point value of the more probative legal issue trail view may be higher than the user activity point value of the less probative document view. As a third non-limiting example in the same context, viewing a document from within displayed search results may be less probative (and thus have a lower user activity point value) than viewing a document from within another document (e.g., clicking a link to a legal case referenced within a document being viewed by the user). In some embodiments, a document view has a user activity point value of 10, a related document view has a user activity point value of 20, a legal issue trail view has a user activity point value of 30, a legal search has a user activity point value of 5, and a Shepard's® search has a user activity point value of 40. It should be understood that in other embodiments, the user activity point values associated with the user activities may differ from those explicitly set forth herein.
For example, in some embodiments, more than one user activity may be associated with the same user activity point value. In some embodiments, a user activity that is not probative may have a user activity point value of 0. In some embodiments, the user activity point value associated with a user activity may be fixed such that the user activity point value is the same for all user activities of the same type (e.g., the user activity point value for all document views may be the same). In other embodiments, the user activity point value associated with a user activity may depend on the nature of the specific user activity (e.g., a first search that returns many results may have a lower probativeness (and consequently a lower user activity point value) than a second search that returns fewer results), as will be described in further detail below.
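By way of non-limiting illustration, the association between user activities and user activity point values may be sketched as a simple lookup. The point values below are those of the example embodiment described above; the key names, the function name, the result-count threshold, and the halving rule for broad searches are hypothetical and are assumed only for purposes of illustration.

```python
# Point values from the example embodiment; the key names are illustrative.
ACTIVITY_POINTS = {
    "document_view": 10,
    "related_document_view": 20,
    "legal_issue_trail_view": 30,
    "legal_search": 5,
    "shepards_search": 40,
}

# Hypothetical threshold: a search returning more results than this is
# treated as less probative. The value 100 is an assumption.
BROAD_SEARCH_THRESHOLD = 100


def activity_point_value(activity_type, result_count=None):
    """Return the point value for a user activity.

    Activity types that are not listed are treated as non-probative
    and receive a point value of 0.
    """
    base = ACTIVITY_POINTS.get(activity_type, 0)
    is_broad_search = (
        activity_type == "legal_search"
        and result_count is not None
        and result_count > BROAD_SEARCH_THRESHOLD
    )
    if is_broad_search:
        # Assumed rule: halve the value of a broad search (integer division).
        return base // 2
    return base
```

In such a sketch, a fixed point value per activity type and a point value that depends on the nature of the specific activity coexist: only searches consult the optional result count, while all other activity types use the fixed table.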
Included in the memory component 40 are the operating logic 42 and the document identification and ranking logic 44. The operating logic 42 may include an operating system and/or other software for managing components of the server computing device 12b. Similarly, the document identification and ranking logic 44 may reside in the memory component 40 and may be configured to facilitate the identification and ranking of a plurality of documents based on user activity, as will be described in detail below with reference to
It should be understood that the components illustrated in
Referring now to
Still referring to
At block 304, the server computing device 12b identifies a first data item based on the first user activity data and a second data item based on the second user activity data. In some embodiments in which the user activity is a search including one or more search terms, the identified data item may include: at least one of the one or more search terms; a headnote that contains at least one of the one or more search terms; a reason for citation (“RFC”) that contains at least one of the one or more search terms (e.g., text including at least one of the one or more search terms that indicates the reason why a particular document was cited); another document cited in such a headnote or RFC (i.e., a headnote or RFC that contains at least one of the one or more search terms); a core term present in such a headnote or RFC; a legal taxonomy topic associated with such a headnote or RFC; and the like.
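As a non-limiting sketch of the identification performed at block 304 for a search, the following example returns the search terms themselves together with any headnote containing at least one of the terms. The `Headnote` type, the function name, and the whitespace-based word matching are hypothetical simplifications; RFCs, core terms, and legal taxonomy topics are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Headnote:
    """Hypothetical headnote record: an identifier and its text."""
    headnote_id: str
    text: str


def data_items_for_search(search_terms, headnotes):
    """Identify data items for a search activity.

    Returns (kind, value) pairs: each search term, followed by each
    headnote whose text contains at least one of the search terms.
    Matching is a naive, case-insensitive whole-word comparison.
    """
    terms = [term.lower() for term in search_terms]
    items = [("term", term) for term in terms]
    for headnote in headnotes:
        words = set(headnote.text.lower().split())
        if any(term in words for term in terms):
            items.append(("headnote", headnote.headnote_id))
    return items


headnotes = [
    Headnote("HN1", "Negligence requires a duty of care"),
    Headnote("HN2", "Contract formation requires consideration"),
]
items = data_items_for_search(["Duty"], headnotes)
```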
Still referring to block 304 of
Still referring to
The first score of the first data item may be continually updated based on user activity so that the first score aggregates user activity associated with the first data item as the user performs various activities. In such embodiments, the first data item may already have a first score as a result of prior user activity. For example, in some embodiments, the server computing device 12b may also receive third user activity data indicative of a third user activity. The third user activity may be associated with a third user activity point value. In such embodiments, after the server computing device 12b receives the third user activity data, the first score of the first data item may be updated based on the third user activity point value. By way of example: the first data item may be a headnote containing one or more search terms; the first user activity may be viewing a first document (associated with a first user activity point value); and the third user activity may be viewing a second document (associated with a third user activity point value). In such an example, a first score of the first data item may be updated based on the first user activity when the headnote containing the one or more search terms is associated with the first document, such as by adding the first user activity point value to the first score. Then, the first score of the first data item may be updated based on the third user activity (e.g., by adding the third user activity point value to the first score) when the headnote containing the one or more search terms is associated with the second document. By continually updating the score associated with a particular data item as a user performs a sequence of activities, data items that recur among activities may be tracked, such that a data item with a higher score is likely to be more relevant to the user than a data item with a lower score and may be used to identify and rank relevant data content.
The second score of the second data item may also be updated in a similar manner.
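The additive score update described above may be sketched, in a non-limiting manner, as follows. The class name, the method names, and the headnote identifiers are hypothetical; the point values of 10 and 5 correspond to the document view and legal search values of the example embodiment.

```python
from collections import defaultdict


class DataItemScores:
    """Running scores for data items, aggregated across user activities."""

    def __init__(self):
        # A data item that has not yet been scored starts at 0.
        self._scores = defaultdict(int)

    def record(self, data_item, point_value):
        """Add an activity's point value to a data item's score."""
        self._scores[data_item] += point_value
        return self._scores[data_item]

    def score(self, data_item):
        """Return the current score without creating a new entry."""
        return self._scores.get(data_item, 0)


scores = DataItemScores()
scores.record("HN1", 10)  # first activity: view of a first document
scores.record("HN2", 5)   # second activity: a legal search
scores.record("HN1", 10)  # third activity: view of a second document
```

In this sketch, the headnote that recurs across the first and third activities accumulates a score of 20, whereas the headnote associated only with the search remains at 5.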
Once the scores of the data items are updated at block 306, the server computing device 12b identifies the plurality of documents based on the first data item and the second data item at block 308. In some embodiments, the plurality of documents are a plurality of legal documents. However, it should be understood that in other embodiments, the documents may not be legal documents, such as when the documents are news documents, factual documents, articles, webpages, and the like. In some embodiments, each of the plurality of identified documents includes or is associated with at least one of the first data item and the second data item. In some embodiments, each of the plurality of identified documents includes or is associated with both the first data item and the second data item. For example, if the first data item is a first headnote and the second data item is a second headnote, the plurality of documents may be identified as the documents including or associated with either the first headnote or the second headnote. In other embodiments in which the first data item is a first headnote and the second data item is a second headnote, the plurality of documents may be identified as the documents including or associated with one of the first headnote or the second headnote.
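The two identification modes described above (documents associated with at least one of the data items, and documents associated with both data items) may be illustrated by the following non-limiting sketch. The function name, the `require_all` flag, and the mapping from document identifiers to associated data items are hypothetical.

```python
def identify_documents(doc_items, data_items, require_all=False):
    """Select documents by their associated data items.

    doc_items maps a document identifier to the set of data items the
    document includes or is associated with. With require_all=False a
    document qualifies if it shares at least one data item; with
    require_all=True it must be associated with every data item.
    """
    wanted = set(data_items)
    selected = []
    for doc_id, items in doc_items.items():
        shared = wanted & set(items)
        if require_all:
            if shared == wanted:
                selected.append(doc_id)
        elif shared:
            selected.append(doc_id)
    return selected


doc_items = {
    "doc_a": {"HN1"},
    "doc_b": {"HN1", "HN2"},
    "doc_c": {"HN3"},
}
either = identify_documents(doc_items, ["HN1", "HN2"])
both = identify_documents(doc_items, ["HN1", "HN2"], require_all=True)
```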
Still referring to block 308, in some embodiments, the server computing device 12b may identify a user objective based on the first user activity data and identify the plurality of documents based on the identified user objective. For example, when the first user activity data is indicative of a user viewing a legal brief, the user objective may be identified as drafting a brief. When the user objective is identified as drafting a brief, other briefs may be identified as relevant documents because the user may be looking for additional briefs that may be helpful to the user in drafting a brief.
Still referring to block 308, in some embodiments, the server computing device 12b may identify metadata associated with the first user activity and identify the plurality of documents based on the identified metadata. For example, when the first user activity is viewing a court decision document, metadata may be associated with the viewed court decision document, such as a date of the decision, a jurisdiction, a court that issued the decision, a citation of the court decision document, and the like. In one embodiment in which the first user activity is viewing a court decision document from a particular jurisdiction, the particular jurisdiction may be identified from the metadata associated with the court decision document, and another court decision document associated with the same particular jurisdiction may be identified as one of the plurality of documents.
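As a non-limiting sketch of the metadata-based identification described above, the following example selects other court decision documents that share the jurisdiction of the viewed court decision document. The dictionary keys `doc_id` and `jurisdiction`, the function name, and the sample jurisdictions are hypothetical.

```python
def same_jurisdiction_documents(viewed, candidates):
    """Return identifiers of candidates sharing the viewed document's jurisdiction.

    The viewed document itself is excluded. If the viewed document has
    no jurisdiction metadata, nothing is identified.
    """
    jurisdiction = viewed.get("jurisdiction")
    if jurisdiction is None:
        return []
    return [
        doc["doc_id"]
        for doc in candidates
        if doc.get("jurisdiction") == jurisdiction
        and doc["doc_id"] != viewed["doc_id"]
    ]


viewed = {"doc_id": "D1", "jurisdiction": "NY"}
candidates = [
    viewed,
    {"doc_id": "D2", "jurisdiction": "NY"},
    {"doc_id": "D3", "jurisdiction": "CA"},
]
related = same_jurisdiction_documents(viewed, candidates)
```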
Still referring to
In some embodiments, the method 300 may be employed by a research assistant tool that tracks user activity data and identifies and ranks documents based on the user activity data, as described in detail above with reference to
After the plurality of documents are identified at block 308 and ranked at block 310, the ranked plurality of documents may be presented to the user of the user computing device 12a in a number of ways. In some embodiments, the ranked plurality of documents may be presented to the user in the context of a research assistant tool that alerts the user to the existence of identified relevant data content and allows the user to access the ranked plurality of documents in a variety of ways, as will be explained below.
Referring now to
Still referring to
The related headnotes tool 406b identifies and ranks relevant documents based on data items identified from user activities and scores of the data items, according to the method 300 described above with reference to
The next 25 tool 406c identifies and ranks the next 25 most relevant documents based on data items identified from user activities and scores of the data items, according to the method 300 described above with reference to
The common documents tool 406d identifies and ranks documents identified in response to more than one user activity based on data items identified from user activities and scores of the data items, according to the method 300 described above with reference to
The recommended documents tool 406e identifies and ranks recommended documents based on data items identified from user activities and scores of the data items, according to the method 300 described above with reference to
In the embodiment depicted in
Still referring to
It should be understood that embodiments described herein provide for systems and methods for ranking a plurality of documents based on user activity. By identifying data items based on user activity, assigning scores to the identified data items, identifying relevant documents based on the data items and ranking the documents based on the scores, relevant documents can be identified that share an identified data item with a relatively high score, and are likely to be relevant to the user. Furthermore, by continually updating the score associated with a particular data item as a user performs a sequence of activities, data items that recur among activities may be tracked, such that a data item with a higher score is presumably more relevant to the user than a data item with a low score and may be utilized to identify and rank documents in an order of likely relevance to the user.
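By way of non-limiting illustration, the ranking of identified documents based on data item scores may be sketched as follows. The particular ranking rule is an assumption made only for this sketch: each document is ranked by the sum of the scores of the data items with which it is associated, with ties broken by document identifier. The function and variable names are likewise hypothetical.

```python
def rank_documents(doc_items, item_scores):
    """Rank documents by the summed scores of their associated data items.

    doc_items maps a document identifier to its associated data items;
    item_scores maps a data item to its current score. Documents that
    share no scored data item are not identified and are left out.
    """
    def doc_score(doc_id):
        return sum(item_scores.get(item, 0) for item in doc_items[doc_id])

    identified = [
        doc_id
        for doc_id, items in doc_items.items()
        if any(item in item_scores for item in items)
    ]
    # Highest summed score first; ties are broken by document identifier.
    return sorted(identified, key=lambda doc_id: (-doc_score(doc_id), doc_id))


doc_items = {
    "doc_a": {"HN1"},
    "doc_b": {"HN1", "HN2"},
    "doc_c": {"HN3"},
}
item_scores = {"HN1": 20, "HN2": 5}
ranking = rank_documents(doc_items, item_scores)
```

In this sketch, the document associated with both scored headnotes (summed score of 25) is ranked ahead of the document associated with only the first headnote (score of 20), and the document sharing no scored data item is not identified.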
While particular embodiments have been illustrated and described herein, it should be understood that various other changes and modifications may be made without departing from the spirit and scope of the claimed subject matter. Moreover, although various aspects of the claimed subject matter have been described herein, such aspects need not be utilized in combination. It is therefore intended that the appended claims cover all such changes and modifications that are within the scope of the claimed subject matter.