The present invention relates to an on-line method and system for recommending articles to users, based on user input.
A recommender system recommends articles to a user. In this patent application, “article” means any content, data or material that can be delivered on-line, and includes but is not limited to text, such as newspaper or magazine articles, books and book chapters, advertisements, videos, PowerPoints, audio files, podcasts, images, blogs, tweets, or products or services which could be provided or purchased.
Current recommender systems for on-line articles have a number of disadvantages and present a number of problems.
One disadvantage may be that (a) current recommender systems do not relate user input to recommendations in a visible and real-time (or near real-time) way. Currently available systems do little to promote engagement by the user. Typically the user is asked to provide user input in relation to an article, but there is no immediate connection between that input and the resulting recommendations. Also, users typically have no other choices to specify the kinds of content that they wish to have recommended. The user doesn't have fun in interacting with the system and receiving recommendations from it. As well, the user often has only a limited understanding about why particular recommendations are being made. Because the user cannot see how his or her input immediately influences the recommendation or selection of articles, the user may have reduced acceptance of, and confidence in, the recommender system. As well, many current systems are relatively impersonal—they simply tell a visitor that “people who read this article also read ______”, or “people who read this article bought ______”. They do not appear to be personalized to a great extent.
In many previous recommender systems, recommendations may only be generated between on-line sessions, and presented the next time the user logs on to the system. This decreases user engagement, fun and confidence in the system.
Another disadvantage may be (b) sparsely rated content. Where the numbers of articles and users are increasing or changing quickly, there may be relatively few articles rated, and relatively few users providing ratings for any particular article. It can be challenging to provide effective and reliable recommendations for such sparsely rated content.
A goal of the present application may be to address one or more of the above-noted disadvantages and weaknesses of current recommender systems.
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. It is not intended to identify key/critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
The present invention is directed to a computer-implemented system and method of recommending articles, based on input from a user.
In one embodiment of the present invention, there is provided a computer-implemented system for providing recommendations for articles comprising: a display including at least one view showing a plurality of data items regarding one or more articles; an input device; a receiver module for receiving information regarding one or more articles; and a processor module, for determining replacement information to be displayed, based on the user input. The system may comprise a plurality of views and a changer module to switch between the views. The plurality of views may include a text view where the data items regarding one or more articles are presented in a list, and where the plurality of data items includes a title and a date. The plurality of views may include a grid view where the data items regarding one or more articles are presented in a grid, and where the plurality of data items includes a title, a date, and an image.
In a further embodiment of the present invention, the changer module includes a default view chosen from the plurality of views.
In a further embodiment of the present invention, the changer module may include a memory of the user's preferred view.
In a further embodiment of the present invention, the system further comprises a storage device for storing an article identifier for identifying an article, a user identifier for identifying a user, and a rating of the user for the article.
In a further embodiment of the present invention, the processor module determines a similarity between information presented in relation to the articles, and then determines the replacement information based on this similarity.
In a further embodiment, the invention provides a computer-implemented method of providing recommendations for articles, comprising the steps: receiving information regarding one or more articles; displaying a first subset of data items relating to said one or more articles, according to a first view, on a display device; responsive to a selection from the user, displaying a second subset of data items relating to the one or more articles, according to a second view, on the display device; receiving input from a user relating to the one or more articles, from an input device; and displaying a set of data items relating to more new articles based on the user input.
In a further embodiment of the invention, input received from the user is a rating of an article. In a further embodiment of the invention, the displaying the set of data items relating to more new articles is determined by the further following steps: determining an article rated favourably by the user; determining an article similar to an article rated favourably by the user on a computer processor; and, displaying a set of data items relating to the similar article.
In a further embodiment of the present invention, the step of determining an article similar to the article rated favourably by the user comprises the steps of: determining the frequency of words found in the article; determining the frequency of words found in a second article; determining with a computer processor a similarity metric based on the frequency of words found in the article and the second article; and selecting a second article which meets a criteria to be the article similar to the article rated favourably by the user.
In a further embodiment of the present invention, the similarity metric is a cosine similarity metric.
In a further embodiment of the present invention, the criteria is the greatest value of the similarity metric.
In a further embodiment of the present invention, the criteria is exceeding a threshold level.
In a further embodiment of the present invention, stop words in the article are not considered.
In a further embodiment of the present invention, the words in the article are stemmed.
In a further embodiment of the present invention, the method further comprises the steps: receiving input from the user indicating that the user wishes to see a different article; and removing the set of data items about an article.
In a further embodiment of the present invention, the first view is a list. In a further embodiment, the second view is a grid. In a further embodiment, the first subset of data items includes a title and a date. In a further embodiment, the second subset of data items includes a title, a date, and an image.
The present invention will be more readily understood from the following detailed description when read in conjunction with the accompanying drawings, in which:
It is a goal of the present invention to provide one or more of the following features or benefits:
As used in this application, the terms “step”, “module”, “component”, “model”, “system”, and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a module may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a module. One or more modules may reside within a process and/or thread of execution and a module may be localized on one computer and/or distributed between two or more computers. Also, these modules can execute from various computer readable media having various data structures stored thereon. The modules may communicate via local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one module interacting with another module in a local system, distributed system, and/or across a network such as the Internet with other systems via the signal).
The present invention is directed to a computer-implemented system and method for interacting with users, and more specifically, for recommending on-line articles to users.
The system and method for recommending on-line articles or documents is suited for any computation environment. It may run in the background of a general purpose computer. In one aspect, it has a CLI (command line interface); however, it could also be implemented with a GUI (graphical user interface) or together with the operation of a web browser.
In an embodiment of the present invention, as is shown in
It has been discovered that user engagement may be increased if more articles are presented to the user. As such, there is a need to allow the user to view more articles 140a . . . 140n on the display 110 via the user recommendation widget 130. However, the display 110 or user recommendation widget 130 (or both) typically has size restrictions, and as such, there is a trade-off between the number of articles 140a . . . 140n that may be presented and the amount of information about articles 140a . . . 140n that may be presented to the user.
It has been discovered that some content is more suited for either a “grid view” presentation (as shown in
According to one aspect of the present invention, and as shown in
One or more maximize buttons 194 may be provided to increase the working area of the user recommendation widget 130 (the expanded and unexpanded views are shown in
Optionally, a grid view button 196 and a text view button 198 may be provided to permit the user to select from either a “grid view” presentation or a “text view” presentation, which is described in more detail below.
According to another aspect of the present invention, and as shown in
Although two types of presentations have been described in considerable detail, namely the “grid view” and “text view” presentations, the skilled reader will appreciate that other types of views fall within the scope of this patent. For example, additional views could include, but are not limited to, showing just thumbnail images (image view), showing additional details such as an article summary (detailed view), showing a flip style view (coverflow view), or showing article titles with varying sizes depending on which ones are recommended the most (cloud view). If additional presentation views are available, then the skilled reader will appreciate that user interface elements such as toggles, sliders, or buttons may be used to select a current view or cycle between the views.
Still with reference to
The information about articles 140a . . . 140n is stored in a database. A subset (e.g. selected portions) of the information is displayed according to default view, or the view selected (e.g. as selected by the buttons 196 or 198). The selected or default view is stored as a variable (not shown). The variable determines which view mode the current displayed articles (or items) have, and renders each article according to that view mode. When the user recommendation widget 130 first loads, the variable is populated by the default view mode that is set for all articles. If the user changes the view mode, then the variable may be overwritten by the newly selected mode.
The user recommendation widget 130 determines the selected view (e.g. whether it is a “grid view” according to button 196 or a “text view” according to button 198), queries the database for the portion of information that should be displayed, and then sends the query result to be displayed in the selected view.
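By way of non-limiting illustration only, the view-mode variable and the per-view selection of data items might be sketched as follows. The names `VIEW_FIELDS`, `WidgetState` and `render_items`, the field name `image_url`, and the choice of the text view as the default are assumptions made for this sketch and do not appear in the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Data items shown per view: text view = title and date,
# grid view = title, date and image (field names are hypothetical).
VIEW_FIELDS = {
    "text": ("title", "date"),
    "grid": ("title", "date", "image_url"),
}


@dataclass
class WidgetState:
    default_view: str = "text"          # assumed default view mode
    current_view: Optional[str] = None  # overwritten when the user picks a view

    def view(self) -> str:
        # Fall back to the default until the user has made a selection.
        return self.current_view or self.default_view

    def select_view(self, view: str) -> None:
        if view not in VIEW_FIELDS:
            raise ValueError(f"unknown view: {view}")
        self.current_view = view


def render_items(articles, state):
    """Return only the data items that the active view displays."""
    fields = VIEW_FIELDS[state.view()]
    return [{f: article.get(f) for f in fields} for article in articles]
```

In this sketch the stored article record may hold more information than any one view shows; the view mode alone decides which subset is rendered.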
Articles 140a . . . 140n may comprise articles that are frequently viewed, listened to or read. They may also comprise articles that are new or more recent. In an embodiment, the user may apply one or more filters (via a user interface which is not shown). These filters could select categories of articles a user is interested in, for example, only sports-related articles or no sports-related articles.
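As a hedged sketch of such filters, and assuming each article record carries a `category` field (the function name `apply_category_filters` and its parameters are hypothetical), both “only sports-related” and “no sports-related” selections might be expressed as:

```python
def apply_category_filters(articles, include=None, exclude=None):
    """Keep articles whose category passes the user's filters.

    include: if non-empty, only these categories are kept.
    exclude: these categories are always dropped.
    """
    include = set(include or [])
    exclude = set(exclude or [])
    kept = []
    for article in articles:
        category = article.get("category")
        if include and category not in include:
            continue
        if category in exclude:
            continue
        kept.append(article)
    return kept
```

For example, `apply_category_filters(articles, include=["sports"])` would keep only sports-related articles, while `apply_category_filters(articles, exclude=["sports"])` would drop them.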
An important aspect of the present invention is that, upon receiving input from the user on one or more of articles 140a to 140n, the system and method, in the same session, provides one or more new (refreshed or replacement) articles to the user in place of one or more articles 140a to 140n. For example, in an embodiment, where a user gives a thumbs-up to one or more of articles 140a to 140n, the system and method will replace one or more of articles 140a to 140n with a new article based on this user input. Similarly, in an embodiment, where a user gives a thumbs-down to one or more of articles 140a to 140n, the system and method will replace one or more of articles 140a to 140n with a new article based on this user input. In an embodiment, where the user gives a thumbs-up, one or more replacement articles are provided which are similar to the article given the thumbs-up.
Where an article is given a thumbs-down, one or more replacement articles are provided which are similar to an article previously given a thumbs-up. In an embodiment, after an article is rated (given a thumbs-up or thumbs-down), it remains displayed until the user clicks on a related button or icon containing text such as “show another article”.
In step 205, information is received regarding articles of possible interest.
In step 210, information on articles of possible interest is displayed to a user.
In step 220, input is received from the user on one or more of the displayed articles. In an embodiment of the present invention, this input is a click (via a mouse or other input device) on a thumbs-up or thumbs-down icon.
In step 230, one or more of the displayed articles (or information about them) is replaced, based on the user input. Typically, a new article or articles would be provided.
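Steps 205 to 230 might be sketched, purely for illustration, as the following in-session loop. The function name `recommend_session`, the `slots` parameter (the number of articles the widget shows at once), and the pluggable `pick_replacement` callback are assumptions of this sketch; the callback stands in for whatever similarity-based selection an embodiment uses.

```python
def recommend_session(received, slots, inputs, pick_replacement):
    """Illustrative walk through steps 205-230.

    received: list of article dicts (step 205, information received).
    slots: how many articles are displayed at once.
    inputs: iterable of (article_id, rating) pairs from the user (step 220).
    pick_replacement: callable(article, rating, pool) returning an item of pool.
    """
    displayed = received[:slots]   # step 210: display articles of possible interest
    pool = received[slots:]        # remaining candidates not yet shown
    for article_id, rating in inputs:              # step 220: user input
        for i, article in enumerate(displayed):
            if article["id"] == article_id and pool:
                new = pick_replacement(article, rating, pool)
                pool.remove(new)
                displayed[i] = new                 # step 230: replace in place
                break
    return displayed
```

Because the replacement happens inside the same loop that receives the input, the user sees the effect of a rating within the session rather than at the next log-on.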
As mentioned above, in an embodiment, when a user provides a thumbs-up, one or more similar articles are provided in user recommendation widget 130. These replace articles originally displayed in widget 130. In an embodiment, a portion of articles 140a to 140n are used for this purpose.
The document receiving the thumbs-up may optionally be pre-processed in step 221. The data pre-processing 221 may comprise stop-word deletion, stemming, and title and link extraction, which transforms or presents each article as a document vector in a bag-of-words data structure. With stop-word deletion, selected “stop” words (i.e. words such as “an”, “the”, “they” that are very frequent and do not have discriminating power) are excluded. The list of stop-words can be customized. Stemming converts words to their root form, so that words used in the same context are represented by the same term and dimensionality is consequently reduced. Such words may be stemmed by using Porter's Stemming Algorithm, but other stemming algorithms could also be used. Text in links and titles from web pages can also be extracted and included in a document vector.
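A minimal sketch of pre-processing step 221 is given below, under stated assumptions. The stop-word list shown is a small illustrative sample, and `naive_stem` is a crude suffix stripper used only as a stand-in; it is not Porter's Stemming Algorithm, which an embodiment might use instead. All names are hypothetical.

```python
import re

# Small illustrative stop-word list; the description notes it can be customized.
STOP_WORDS = {"a", "an", "the", "they", "and", "of", "to", "in", "is"}


def naive_stem(word):
    """Crude suffix stripping, a stand-in and NOT Porter's algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, stop_words=STOP_WORDS):
    """Lower-case, tokenize, drop stop words, then stem each remaining word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in stop_words]
```

The output of `preprocess` is a bag of stemmed words from which a frequency vector can then be built.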
For each document, in step 225 of the invention, a vector is created setting out the frequency of occurrence of each of the words found in the article. In other words, for each article of interest a vector {F1, F2, . . . FX} is created, where F1 represents the frequency in the document of the word W1. Where a word is not found in the article, the frequency is zero.
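The vector of step 225 might be built as in the following sketch, which assumes the article has already been reduced to a list of words. The names `build_vocabulary` and `frequency_vector` are hypothetical; the shared, ordered vocabulary plays the role of the words W1 . . . WX.

```python
from collections import Counter


def build_vocabulary(*token_lists):
    """Ordered list of the distinct words across all given articles."""
    words = set()
    for tokens in token_lists:
        words.update(tokens)
    return sorted(words)


def frequency_vector(tokens, vocabulary):
    """Return [F1, ..., FX]; a word absent from the article gets frequency 0."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]
```

Using one vocabulary for every article keeps the vectors the same length, so that position i always refers to the same word.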
In another embodiment, the vector may only be created for a portion of the article, such as the title and first paragraph, or for a brief description or abstract of it.
Vectors are then created using the same words, to represent other potentially similar articles. Then the vectors are compared in step 228 to determine those most similar. In another embodiment, cosine similarity may be used to compare the two article vectors.
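For example, for two article vectors A = {A1, A2, . . . AX} and B = {B1, B2, . . . BX} created using the same X words, the cosine similarity is:

$$\text{similarity}(A, B) = \frac{\sum_{i=1}^{X} A_i B_i}{\sqrt{\sum_{i=1}^{X} A_i^{2}}\;\sqrt{\sum_{i=1}^{X} B_i^{2}}}$$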
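The comparison of step 228 using cosine similarity might be sketched as follows. The function name is hypothetical, and returning 0.0 for an all-zero vector is a convention chosen for this sketch rather than something stated in the description.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must be built over the same words")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention: an empty article is similar to nothing
    return dot / (norm_u * norm_v)
```

Since word frequencies are never negative, the value lies between 0 (no words in common) and 1 (identical word proportions).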
Other measures of similarity are also possible for example:
(a) Sørensen's quotient of similarity
(b) Mountford's index of similarity
(c) Hamming distance
(d) Correlation
(e) Dice's coefficient
(f) Jaccard index
(g) SimRank
(h) Information retrieval
(i) Weighted cosine measure
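Two of the listed alternatives might be sketched as set-based measures over the distinct words of each article. The function names are hypothetical, and returning 0.0 when both articles are empty is an assumed convention. It may be noted that Sørensen's quotient and Dice's coefficient are commonly given by the same formula.

```python
def jaccard_index(words_a, words_b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0  # assumed convention for two empty articles
    return len(a & b) / len(a | b)


def dice_coefficient(words_a, words_b):
    """Twice the intersection size divided by the sum of the set sizes."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0  # assumed convention for two empty articles
    return 2 * len(a & b) / (len(a) + len(b))
```

Unlike cosine similarity over frequency vectors, these two measures ignore how often each word occurs and consider only whether it occurs.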
In another embodiment, the publisher of articles, such as a newspaper publisher, provides the information which is received in step 205. In another embodiment, this is provided via an extension to the RSS feed version 2.0. For each article, the publisher may provide the following information:
(a) article title;
(b) article URL;
(c) article text;
(d) article category;
(e) the URL of a thumbnail image;
(f) article ID; and,
(g) a final date of publication.
In another embodiment, articles (or information about them) are not displayed after the final date of publication received from the publisher.
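A sketch of reading such a feed and suppressing expired articles is given below. The standard RSS 2.0 item elements `title`, `link`, `description`, `category` and `guid` are used for the title, URL, text, category and ID. The extension namespace `http://example.com/rec-ext` and the element names `rec:thumbnail` and `rec:finalDate` are hypothetical placeholders, since the description does not name the extension elements, and an ISO-format date is assumed.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical namespace for the extension elements (not from the description).
NS = {"rec": "http://example.com/rec-ext"}


def parse_feed(xml_text):
    """Extract the per-article information from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        final = item.findtext("rec:finalDate", namespaces=NS)
        items.append({
            "title": item.findtext("title"),
            "url": item.findtext("link"),
            "text": item.findtext("description"),
            "category": item.findtext("category"),
            "id": item.findtext("guid"),
            "thumbnail_url": item.findtext("rec:thumbnail", namespaces=NS),
            # Assumes an ISO date such as 2009-09-30; None if not provided.
            "final_date": date.fromisoformat(final) if final else None,
        })
    return items


def still_displayable(items, today):
    """Drop articles whose final date of publication has passed."""
    return [a for a in items
            if a["final_date"] is None or today <= a["final_date"]]
```

In this sketch an article remains displayable on its final date and is dropped from the following day onward.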
Further information on the RSS specification can be found at http://cyber.law.harvard.edu/rss/rss.html. In another embodiment, the information from this RSS feed is stored on table 340 as partially shown in
In another embodiment, related to each article is a table, stored in a database, which stores stemmed words and the associated word count for each article. This is shown in
In another embodiment, each user is given a unique user ID, which is stored as a cookie on the user's computer system. Database 330 also contains a table 370, which sets out information such as the user ID, article ID, and the input or rating received on the article.
In another embodiment, database 330 also contains a table which stores the IDs for first and second articles and the associated similarity score.
The formats of the tables described as occurring in database 330 are exemplary only; other formats are possible and within the scope of the present invention.
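One possible layout of the tables described above is sketched below using SQLite. All table names, column names and types are assumptions of this sketch, as is the encoding of a rating as an integer (for example, 1 for a thumbs-up and -1 for a thumbs-down).

```python
import sqlite3

# Hypothetical schema: article information, stemmed word counts,
# user ratings, and pairwise similarity scores.
SCHEMA = """
CREATE TABLE articles (
    article_id TEXT PRIMARY KEY, title TEXT, url TEXT,
    category TEXT, thumbnail_url TEXT, final_date TEXT);
CREATE TABLE word_counts (
    article_id TEXT, stem TEXT, count INTEGER,
    PRIMARY KEY (article_id, stem));
CREATE TABLE ratings (
    user_id TEXT, article_id TEXT, rating INTEGER,
    PRIMARY KEY (user_id, article_id));
CREATE TABLE similarities (
    first_id TEXT, second_id TEXT, score REAL,
    PRIMARY KEY (first_id, second_id));
"""


def open_database(path=":memory:"):
    """Create the tables in a new database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_rating(conn, user_id, article_id, rating):
    """Store a rating; a later rating of the same article overwrites it."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings VALUES (?, ?, ?)",
        (user_id, article_id, rating),
    )
```

Storing similarity scores in their own table lets them be computed ahead of time, so that a replacement article can be looked up quickly when the user provides input.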
Recommender system 300 also contains a CPU 370 for calculating similarity scores and for carrying out other tasks.
When a user gives one or more of articles 140a . . . 140n a less favourable rating, for example, a thumbs-down, the system then checks table 370 and determines a previous article given a more favourable rating. One or more articles (or information about them) similar to a previously favourably rated article are then displayed to the user. The displayed articles will be ones meeting a specified criteria. The most similar article or articles may be displayed as replacement articles. Alternatively, articles exceeding a threshold level of the similarity metric may be displayed.
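The selection of replacement articles under either criteria might be sketched as follows. The in-memory shapes used here (a chronological list of rating tuples and a dictionary of pairwise scores), the function names, and the treatment of a positive rating as favourable are all assumptions of this sketch.

```python
def last_favourable(ratings, user_id):
    """Most recent article the user rated favourably, or None.

    ratings: list of (user_id, article_id, rating) in chronological order,
    where a rating greater than zero is treated as favourable.
    """
    for uid, article_id, rating in reversed(ratings):
        if uid == user_id and rating > 0:
            return article_id
    return None


def select_replacements(similarities, liked_id, shown, threshold=None, top_n=1):
    """Pick replacement article IDs similar to liked_id.

    similarities: dict mapping (first_id, second_id) to a similarity score.
    shown: IDs already displayed, which are never offered again.
    threshold: if given, return every candidate scoring above it;
               otherwise return the top_n most similar candidates.
    """
    scored = [
        (score, second)
        for (first, second), score in similarities.items()
        if first == liked_id and second not in shown
    ]
    scored.sort(reverse=True)  # highest similarity first
    if threshold is not None:
        return [article_id for score, article_id in scored if score > threshold]
    return [article_id for score, article_id in scored[:top_n]]
```

On a thumbs-down, `last_favourable` would supply the `liked_id`; on a thumbs-up, the ID of the article just rated could be passed directly.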
In another embodiment, the computer system will include a receiver module for receiving information regarding one or more articles. The system will also include a processor module, for determining replacement information to be displayed, based on the user input. The system will also include a changer module, for switching between views to be displayed.
What has been described above includes examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art may recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
It will be understood that the above description of the present invention is susceptible to various modifications, changes and adaptations, and the same are intended to be comprehended within the meaning and range of equivalents of the appended claims.
This application is a continuation-in-part of U.S. patent application Ser. No. 12/501,221, filed Jul. 10, 2009, now pending, entitled “METHOD AND SYSTEM FOR RECOMMENDING ARTICLES,” the subject matter of which is incorporated herein by reference in its entirety.
Relation | Number | Date | Country
---|---|---|---
Parent | 12501221 | Jul 2009 | US
Child | 12558132 | | US