This invention relates to a computer-implemented system and method for interacting with users, and more specifically, for recommending on-line articles and documents to users.
Proliferation of On-Line Content. The internet is the source of extensive content. The amount and diversity of content is quickly growing. Some estimates suggest that more than 60 billion pages of content are now available on-line, and the amount of content grows continuously. It is a challenge for internet users to separate the relevant from the irrelevant. Increasingly, finding appropriate and desired on-line content is like finding a needle in a haystack, particularly when the article or document desired to be found is not known in advance. If a person has to rely on finding or locating articles or documents known to them in advance, they may miss out on accessing useful, relevant and valuable articles and documents.
Proliferation of On-Line Newspapers, Magazines, Information Portals and Other Information Sources. It is becoming quite common for metropolitan and national newspapers and magazines to have an on-line, or internet, version. Readership of such on-line versions is large and growing. For example, between January and October 2008, online newspapers enjoyed an average of 67 million unique visitors per month, logging an average of 3.2 billion page views per month (source: Newspaper Association of America & Nielsen Online: http://www.naa.org/TrendsandNumbers/Newspaper-Websites.aspx). As well, such online versions have extensive content of all kinds, and typically provide access to archival or past content of such Newspapers and Magazines. In addition to the online versions of newspapers and magazines, there are also online versions of major news cable outlets (CNN and MSNBC and Fox) and large information portals (such as Yahoo, Reuters, Bloomberg and others) as well as general information sites and blogs that provide short essays on various topics (such as About.com and Suite101.com and blogs generally). Although such content may be of interest to users, they may not want to spend their time searching for it. As well, searches may return irrelevant or excessive results. One approach to this proliferation of content of On-Line Newspapers and Magazines is to use some type of recommender system to suggest to the reader new articles and documents that they might be interested in. A recommender system generates ratings or some other indication of relevance or interest for articles or documents that have not yet been seen by the user.
Weaknesses of Current Recommender Systems for On-Line Newspapers and Magazines. Current recommendation systems for on-line newspapers and magazines have a number of disadvantages and present a number of problems:
(a) Current Recommender Systems don't focus on the User Interface. There has been relatively little work carried out on understanding what type of user interface and user experience will best contribute to the use and efficiency of a recommendation engine. The prior art has shown little interest in the type of interface or the properties of the user's interaction with the recommender system. This is a significant problem since the user interface forms an important aspect of the effectiveness and user acceptance of a recommender system.
(b) Current Recommender Systems do not relate ratings to recommendations in a visible and real-time (or near-real-time) way. Currently available systems do little to promote engagement by the user. Typically the user is asked to provide his or her ratings for an article or other content, but there is no immediate connection between those ratings and the resulting recommendations. Also, users typically have no other choices to specify the kinds of content that they wish to have recommended; while this kind of specification is common in search engines, it is absent in recommendation systems, certainly on large information portals and news sites. The user doesn't have fun in interacting with the system and receiving recommendations from it. As well, the user often has only a limited understanding about why particular recommendations are being made. Because the user cannot see how his or her choices and ratings immediately influence the recommendation or selection of articles, the user may have reduced acceptance of, and confidence in, the Recommender System. As well, many current systems are relatively impersonal—they simply tell a visitor that “people who read this article also read ______”, or “people who read this article bought ______”. They do not appear to be personalized to a great extent.
(c) It is a challenge to manage User Input. Users do not want to spend a lot of time interacting with the recommender system. Some recommender systems solve this problem by not having any explicit entry of information, such as ratings, by an individual user. Such systems may recommend only the most frequently viewed, emailed or commented upon articles. This type of system does not personalize or customize recommendations for a user—a significant disadvantage. At the same time, many Users will not enter ratings or preferences. Another approach to this problem is that some recommender systems collect data about user preferences implicitly. Such information might include pages visited, time spent on the page, whether the page was printed or shared by email. Although it may be less obtrusive to obtain information this way, such information may be quite unreliable or inaccurate as a basis for making recommendations.
(d) New User Problem. Many recommender systems operate, at least in part, by determining that a user is similar to one or more other users and may be interested in the same things, and then recommending to that user articles or documents that the similar users read or rated highly. Recommender systems are challenged by new users, since there is little or no basis on which to understand how a new user might be similar to existing users. This problem is heightened when a new recommender system is introduced or implemented, since all or many of the users may be fairly characterized as new users. Some systems collect demographic data on users, such as their occupation, age or income, but users may be reluctant to spend the time to provide extensive amounts of such information. These problems are compounded for idiosyncratic users or users with unusual or unique tastes or interests. For such users, there may not be any (or there may be relatively few) users with similar tastes and interests, and as such, it can be difficult to provide them with effective recommendations.
(e) New Article Problem. Many current recommender systems involve some type of rating of an item (e.g. an article or document), correlate this rating with other user attributes or behaviours, and use such ratings and correlations as at least part of a basis to make recommendations. This leads to a problem when new articles are introduced to the recommender system, namely, that the new articles have not been rated and so that there is no basis upon which to recommend such a new article to other users.
(f) Cold Start Problem. The Recommender System may initially have a limited number of users and ratings. In order for users to receive good recommendations, the Recommender System needs to have substantial input of information upon which it will determine similarity between users, articles to be recommended, or other information on which similarity will be determined. When the Recommender System is first put into use, there is a limited amount of such information because of the limited number of users and user preferences.
(g) Lack of Diversity. Many current Recommender Systems do not recommend complementary and related products or services, or a diverse mix of articles and other media.
(h) Sparsely Rated Content. Where the numbers of articles and users are increasing quickly, there may be relatively few articles rated, and relatively few users providing ratings for any particular article. It can be challenging to provide effective and reliable recommendations for such sparsely rated content.
It is a goal of the present invention to address one or more of the above-noted disadvantages and weaknesses of current recommender systems.
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. It is not intended to identify key/critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
The present invention is directed to a computer-implemented system and method for interacting with users, and more specifically, for recommending on-line articles and complementary products and services to users.
The present invention may provide one or more of the following benefits or advantages: it may allow the user to see how his or her choices and ratings immediately influence the recommendation or selection of articles; it may increase user engagement; it may increase the number of happy surprises the user experiences, in other words, situations where an item is recommended to the user and he or she is pleased to have received this recommendation; it may increase the number of pages that a user views; it may increase the time that a user spends on a publisher's site; it may increase the frequency and number of the user's visits to a site; and, it may attract more unique visitors to a publisher's site.
An embodiment of the invention provides a system which is the combination of recommendation with a concurrent user interface, with the user interface being adjustable by the user through the manipulation of on-line controls. An important aspect of the present invention is that the user's actions through the user interface may visibly affect the recommendations which are presented. Another important aspect of the present invention is that it permits the operation of the recommender system to be more visibly personalized for each user. Another important aspect of the present invention is that it facilitates faster and more accurate learning about a user's preferences in a way that is not obtrusive.
A computer-implemented method of providing recommendations for articles is provided comprising the steps of: providing a user with an initial list of articles by displaying the initial list on a computer display device; receiving input from the user by receiving or monitoring input from at least one input device, said input comprising one or more of: i) an explicit rating for one of said initial list of articles; ii) user data in relation to the user; and iii) an indication the user has changed or set a filter; generating in a microprocessor at least one new recommended article from a list of possible articles, based on the input received from the user; refreshing the initial list of articles with said at least one new recommended article to produce a refreshed list; and providing the user with the refreshed list by displaying the refreshed list on the computer display device.
Furthermore, a computer-implemented method of recommending articles is provided comprising the steps of: storing a set of possible articles in a database; receiving information from, or in relation to, a first user by receiving or monitoring input from at least one input device, said information including at least one of: i) demographic data about the first user; ii) rating data about one of the set of possible articles from the first user; iii) user data in relation to the first user; iv) transaction data in relation to the first user; and v) information relating to content of an article of interest to the first user; determining in a microprocessor a similarity between the received information and at least one of: i) demographic data about a second user; ii) rating data about one of the set of possible articles from the second user; iii) user data in relation to the second user; iv) transaction data in relation to the second user; and v) information relating to content of an article of interest to the second user; and recommending to the first user information about a second article from the set of possible articles based on the determined similarity by displaying the information about the second article on a computer display device, where the recommendation is generated by MWinnow.
In another aspect of the invention, a computer program product is provided comprising: a memory having computer readable code embodied therein, for execution by a CPU for recommending documents, said code comprising: code means for providing a user with an initial list of articles by displaying the initial list on a computer display device; code means for receiving input from the user by receiving or monitoring input from at least one input device, said input comprising one or more of: i) an explicit rating for one of said initial list of articles; ii) user data in relation to the user; and iii) an indication the user has changed or set a filter; code means for generating at least one new recommended article from a list of possible articles, based on the input received from the user; code means for refreshing the initial list of articles with said at least one new recommended article to produce a refreshed list; and code means for providing the user with the refreshed list.
Furthermore, a computer program product is provided comprising: a memory having computer readable code embodied therein, for execution by a CPU for recommending articles, said code comprising: code means for storing a set of possible articles in a database; code means for receiving information from, or in relation to, a first user by receiving or monitoring input from at least one input device, said information including at least one of: i) demographic data about the first user; ii) rating data about one of the set of possible articles from the first user; iii) user data in relation to the first user; iv) transaction data in relation to the first user; and v) information relating to content of an article of interest to the first user; code means for determining in a microprocessor a similarity between the received information and at least one of: i) demographic data about a second user; ii) rating data about one of the set of possible articles from the second user; iii) user data in relation to the second user; iv) transaction data in relation to the second user; and v) information relating to content of an article of interest to the second user; and code means for recommending to the first user information about a second article from the set of possible articles based on the determined similarity by displaying the information about the second article on a computer display device, where the recommendation is generated by MWinnow.
Moreover, a computer system is provided comprising the following elements: an interface for receiving input from a user by monitoring or receiving input from at least one input device, said input comprising one or more of: i) an explicit rating for one of an initial list of articles presented to the user; ii) user data in relation to the user; iii) an indication the user has changed or set a filter; a user data collection module, for collecting the input from the user and for transmitting information to the user regarding articles; a database, for storing the input, a list of possible article ratings table, and article and user table; a recommender module, for recommending to the user information about one of the list of possible documents, said recommendation based on said input.
As used in this application, the terms “approach”, “module”, “component,” “model,” “system,” and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a module may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a module. One or more modules may reside within a process and/or thread of execution and a module may be localized on one computer and/or distributed between two or more computers. Also, these modules can execute from various computer readable media having various data structures stored thereon. The modules may communicate via local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one module interacting with another module in a local system, distributed system, and/or across a network such as the Internet with other systems via the signal).
The present invention is directed to a computer-implemented system and method for interacting with users, and more specifically, for recommending on-line articles and documents to users. In this description, the words article, item and document are used synonymously. In this description, the word article includes advertisements, and may also include other types or categories of media, such as videos, audio files, images and podcasts. The word article also includes products or services which could be provided or purchased.
The system and method for recommending on-line articles or documents is suited for any computation environment. It may run in the background of a general purpose computer. In one aspect, it has a CLI (command line interface); however, it could also be implemented with a GUI (graphical user interface) or together with the operation of a web browser.
Referring to
Optionally, the system and method of an aspect of the present invention may allow for the previewing and recommendation of more than just text articles, especially given that many news sites are often including large amounts of multimedia content online. With no change in the recommender system or only modest changes in the article widget user interface 100, the system may be configured to show and recommend items including but not limited to text articles, documents, videos, movies, podcasts, audio clips, PDF documents, eBooks or slide presentations, among others. As will be apparent, the meaning of some controls, which will be described later, varies as the type of item varies. For example, if the item was a video or an audio clip, then the control dealing with the length of the item no longer deals with word count, but instead, deals with the duration or playing time of the item. In some cases, just a preview of the recommended article is shown. In another embodiment of the present invention, the system is configured to allow the user to filter out recommendations based on media type: “show video only”, “show PDFs only”. As such, use of the term “article” could encompass any media object including text articles, documents, videos, movies, podcasts, audio clips, PDF documents, eBooks or slide presentations, among others.
Still with reference to
Moreover, with reference to
Where a user dismisses an article it will, in a preferred embodiment, be automatically given the lowest possible rating. In a further embodiment, articles found by recommendation module 320 to have similar content, or to reflect a similar user preference, to the dismissed article are less likely to be presented or recommended to the user for a period of time.
Still with reference to
Still with reference to
Moreover, with reference to
Turning now to
Still with reference to
Moreover, with reference to
An embodiment of the invention provides a system which is the combination of real-time recommendation with a user interface (implemented for example in AJAX or Flash), with the user interface being adjustable by the user through the manipulation of on-line controls. An important aspect of one embodiment of the present invention is that the user's actions through the user interface may be reflected in refreshing the articles recommended to the user.
The set of controls 220 may also comprise the following additional controls (not shown) and many of these elements are also suitable for the article widget user interface 100:
In accordance with the present invention, certain of these controls 220 may not be appropriate when used with non-text articles. For example, when operating upon or with a non-text article, the method and system of the present invention may, in one embodiment, omit one or more of the following:
With reference now to
In a preferred embodiment, when a user rates an article by interacting with the user interface 100/200, two requests (or messages) may be sent to the public API interface 350: the first, to add or update that user's rating of that (current) article to the database 360; the second, to request a list of recommended articles, along with optional parameters, discussed below.
With regard to the first request, the public API interface module 350 will store or update the rating in the database 360, and this may trigger certain stored procedures in the database 360 or in the memory of a computer handling this function. These stored procedures may include, for example, preparing the database 360 for recommendation requests, such as by calculating Slope One item deviation values or Co-Visitation item correlation values to improve the performance of the Slope One and Co-Visitation algorithms, discussed below.
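The precomputation of Slope One item deviation values mentioned above can be illustrated with a minimal sketch. It assumes the widely published weighted Slope One scheme and an in-memory dictionary in place of stored procedures in database 360; the function names and the sample ratings are hypothetical and are not taken from this description.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Precompute average rating differences between article pairs.

    ratings: {user_id: {article_id: rating}} (hypothetical layout).
    Returns (dev, count), where dev[(j, i)] is the mean of r_j - r_i over
    users who rated both articles, and count[(j, i)] is how many did.
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[(j, i)] += r_j - r_i
                    count[(j, i)] += 1
    dev = {pair: diff_sum[pair] / count[pair] for pair in diff_sum}
    return dev, dict(count)


def predict_rating(user_ratings, target, dev, count):
    """Weighted Slope One prediction of the user's rating for `target`."""
    numerator = 0.0
    denominator = 0
    for i, r_i in user_ratings.items():
        if (target, i) in dev:
            numerator += (dev[(target, i)] + r_i) * count[(target, i)]
            denominator += count[(target, i)]
    return numerator / denominator if denominator else None


# Hypothetical sample data: three users, three articles.
ratings = {
    "alice": {"a1": 5, "a2": 3, "a3": 2},
    "bob": {"a1": 3, "a2": 4},
    "carol": {"a2": 2, "a3": 5},
}
dev, count = slope_one_deviations(ratings)
prediction = predict_rating(ratings["carol"], "a1", dev, count)
```

Because the deviations depend only on stored ratings, they can be refreshed whenever a rating is added or updated, so that a later recommendation request only needs the cheap prediction step.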
With regard to the second request, the public API interface module 350 may query the database 360 using the optional parameters, and then generate a candidate item list (a set of articles from which to make recommendations), and send the candidate item list to the recommendation module 320. The recommendation module 320 will generate a recommendation result selected from the candidate item list, and this recommendation result will be passed to the public API interface module 350, and in turn, to the user interface 100/200. The public API interface module 350 may accept optional predefined parameters such as maximum returned number of items, or a candidate item list. Alternatively, the user interface 100/200 may just send a unique identifier for a user to the public API interface module 350, and request a list of recommended items, without any optional parameters.
With regard to both the first and second requests noted above, the unique identifier for a user preferably accompanies the request to the public API interface module 350. Moreover, the first request is optional and need not occur for the second request, that is, a request for a list of recommended articles, to be made or fulfilled, such as, for example, when the user launches the full-screen article widget user interface 200 (i.e. not providing a rating).
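The two-request flow described above can be sketched as follows. This is only an illustrative, in-memory stand-in: the class name `PublicApiSketch`, its method names, and the simple mean-rating recommender are hypothetical, and a dictionary takes the place of database 360.

```python
class PublicApiSketch:
    """Hypothetical stand-in for the public API interface module 350."""

    def __init__(self, recommender):
        self.ratings = {}        # stands in for the ratings stored in database 360
        self.articles = set()    # known article identifiers
        self.recommender = recommender

    def put_rating(self, user_id, article_id, rating):
        """First request: add or update the user's rating of an article."""
        self.ratings.setdefault(user_id, {})[article_id] = rating
        self.articles.add(article_id)

    def get_recommendations(self, user_id, max_items=None, candidates=None):
        """Second request: return recommended articles for the user.

        `max_items` and `candidates` are the optional parameters; when no
        candidate list is supplied, unrated articles are used.
        """
        rated = self.ratings.get(user_id, {})
        if candidates is None:
            candidates = sorted(a for a in self.articles if a not in rated)
        ranked = self.recommender(user_id, list(candidates), self.ratings)
        return ranked if max_items is None else ranked[:max_items]


def mean_rating_recommender(user_id, candidates, ratings):
    """Placeholder recommender: rank candidates by other users' mean rating."""
    def mean(article):
        values = [r[article] for u, r in ratings.items()
                  if u != user_id and article in r]
        return sum(values) / len(values) if values else 0.0
    return sorted(candidates, key=lambda a: (-mean(a), a))


api = PublicApiSketch(mean_rating_recommender)
api.put_rating("u1", "a1", 5)
api.put_rating("u2", "a1", 4)
api.put_rating("u2", "a2", 5)
api.put_rating("u2", "a3", 2)
for_u1 = api.get_recommendations("u1")
top_for_u1 = api.get_recommendations("u1", max_items=1)
for_new_user = api.get_recommendations("u3")  # second request without any first request
```

Keeping the rating update and the recommendation request as separate calls mirrors the point that the second request can be made on its own, with only the user identifier supplied.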
An embodiment of the present invention provides a method of recommending articles to a user, as is further illustrated in
A further embodiment of the present invention has the further following steps:
Not all of the listed types of input information need be received. Further, in a preferred embodiment, combinations of these types of data are received and used to generate one or more recommended articles. For example, a user may provide an explicit rating, and nothing more. Or, a user may provide an explicit rating for an article, together with a category filter (i.e. the fact that the user has accessed the article by use of the filter). Alternatively, a user may provide no explicit rating, but provide an implicit rating based on the fact of reading or previewing an article, for example. In each of these cases, the rating (implicit or explicit), together with the filter or control information, is used by aspects of the computer system 300, including the recommendation module 320 and the database 360, to generate one or more recommended articles.
A further embodiment of the present invention has the following further steps (in addition to those set out in steps 410-450, above):
The group could include:
The system and method of an aspect of the present invention further comprises a computer system 300, as shown in
Computer system 300 further comprises a User Data Analysis Module (UDAM) 30. The UDAM 30 performs pattern analysis based on user data. For example, it can perform probability analysis to guess the user's gender, age, and profession. For example, a user who has looked at more than two football articles could be predicted to be male, according to a rule-based algorithm. UDAM 30 may also perform user clustering, article clustering, or user-article co-clustering, as is described in more detail below.
The precise functions performed in the modules and components of computer system 300 may, as will be appreciated by those of skill in the art, be performed by other modules within computer system 300 and still be within the scope of the present invention.
Computer system 300 further comprises a User Data Preprocess Module (UDPM) 40. The UDPM 40 operates on user data to generate implicit rating data based on a set of conversion rules. In a preferred embodiment, the UDPM 40 may be part of the Recommender Module 300. In an embodiment of the present invention, user data is stored in database 360, in association with a unique identifier of the user. User data used to generate an implicit rating includes:
Data regarding the user's preferences is also collected implicitly by recording click-throughs, mouse-overs, and analyzing the types of recommended articles (shown in
With reference to
Explicit rating data collected via the user interface 100/200 is stored in the User-Article Ratings table in database 360. The UDPM 40 generates implicit ratings as described in the previous paragraphs, and these implicit ratings are also stored in User-Article Ratings table in database 360. User-Article Ratings table in database 360 provides a ratings matrix according to the following chart:
(The blank cells in the chart may or may not have implicit or explicit ratings stored within them, provided that at least some of the cells of the table have ratings stored in them).
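A minimal sketch of how explicit and implicit ratings might share one User-Article Ratings table is given here. The conversion rules, the 1 to 5 scale, and the choice that an explicit rating is never overwritten by an implicit one are all assumptions made for illustration; none of them is specified in this description, and the class and function names are hypothetical.

```python
# Hypothetical conversion rules: user event -> implicit rating on a 1-5 scale.
CONVERSION_RULES = {"previewed": 3, "read": 4}
LOWEST_RATING = 1


def implicit_rating(events):
    """Convert a list of user events into an implicit rating, or None."""
    if "dismissed" in events:
        # A dismissed article is given the lowest possible rating.
        return LOWEST_RATING
    values = [CONVERSION_RULES[e] for e in events if e in CONVERSION_RULES]
    return max(values) if values else None


class RatingsTable:
    """In-memory stand-in for the User-Article Ratings table."""

    def __init__(self):
        self.cells = {}  # (user_id, article_id) -> (rating, source)

    def store_explicit(self, user_id, article_id, rating):
        self.cells[(user_id, article_id)] = (rating, "explicit")

    def store_implicit(self, user_id, article_id, events):
        rating = implicit_rating(events)
        existing = self.cells.get((user_id, article_id))
        # Assumption: an explicit rating takes precedence over an implicit one.
        if rating is not None and (existing is None or existing[1] == "implicit"):
            self.cells[(user_id, article_id)] = (rating, "implicit")

    def matrix(self, users, articles):
        """Return the ratings matrix; None marks a blank cell."""
        return [[self.cells.get((u, a), (None, None))[0] for a in articles]
                for u in users]


table = RatingsTable()
table.store_implicit("u1", "a1", ["previewed", "read"])
table.store_explicit("u1", "a2", 5)
table.store_implicit("u1", "a2", ["dismissed"])
grid = table.matrix(["u1", "u2"], ["a1", "a2"])
```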
For user or article clustering (i.e. determining an initial evaluation amongst users or articles), the most similar users or articles may be clustered together by a clustering module of UDAM 30. To facilitate clustering (i.e. to determine similarity), in a preferred embodiment the k-means algorithm is employed. The steps of the algorithm are as follows (for the example of clustering users together):
Other algorithms can also be used to carry out clustering, including:
There are numerous benefits to clustering including:
More sophisticated algorithms can be used to carry out co-clustering. In co-clustering, articles and users are clustered to create article-user clustered niches. In a preferred embodiment, the algorithm used is a co-clustering algorithm, licensed by the National Research Council of Canada. The steps of the algorithm are as follows:
$C_X: \{x_1, x_2, \ldots, x_m\} \to \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k\}$
$C_Y: \{y_1, y_2, \ldots, y_n\} \to \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_l\}$
$\hat{X} = C_X(X)$
$\hat{Y} = C_Y(Y)$
$q(x, y, \hat{x}, \hat{y}) = p(\hat{x}, \hat{y})\, p(x \mid \hat{x})\, p(y \mid \hat{y})$, where $x \in \hat{x}$, $y \in \hat{y}$
t=0,
After clustering, the output is a series of groups clustering together articles or users which is stored in database 360, e.g.
(and similarly for articles clustering). For co-clustering, the groups contain both users and articles, e.g.
A group could include one or more articles that a specific user (and member of that group) had not rated. Co-clustering can itself be used (without further steps of the recommender algorithm) to generate recommendations. For example, within a co-clustered group, articles rated above a threshold are recommended to a user in the group, where the user has not rated the article.
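The threshold rule for recommending within a co-clustered group can be sketched as below. The sketch assumes that "rated above a threshold" means the mean rating given by the other users in the group exceeds the threshold; that reading, the function name, and the sample data are hypothetical.

```python
def recommend_within_group(group_users, group_articles, ratings, user, threshold=4.0):
    """Recommend unrated articles from the user's own co-clustered group.

    ratings: {user_id: {article_id: rating}} (hypothetical layout).
    An article qualifies when the mean rating from the other group members
    is above `threshold` and the user has not rated it.
    """
    if user not in group_users:
        return []
    own = ratings.get(user, {})
    recommended = []
    for article in sorted(group_articles):
        if article in own:
            continue
        peer_ratings = [ratings[u][article] for u in group_users
                        if u != user and article in ratings.get(u, {})]
        if peer_ratings and sum(peer_ratings) / len(peer_ratings) > threshold:
            recommended.append(article)
    return recommended


# Hypothetical co-clustered group containing both users and articles.
group_users = {"u1", "u2", "u3"}
group_articles = {"a1", "a2", "a3"}
ratings = {
    "u1": {"a1": 5},
    "u2": {"a1": 4, "a2": 5},
    "u3": {"a2": 4, "a3": 2},
}
for_u1 = recommend_within_group(group_users, group_articles, ratings, "u1")
```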
Clustering or co-clustering is optional. In a preferred embodiment, clustering and co-clustering are run in the background since they are very processor-intensive. Clustering or co-clustering works best when a large number of users, articles and ratings is available.
The clustered results may be used as input to recommendation module 320 according to the MWinnow, Slope One or Co-Visitation algorithms, which operate on a clustered or co-clustered group. This improves reliability and speed of processing.
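As an illustration of the user clustering discussed above, a minimal k-means sketch follows. It is a generic textbook form of k-means rather than the specific steps of this description, and it assumes that each user is represented by a fixed-length vector of ratings, that the first k users seed the centroids, and that squared Euclidean distance measures similarity; all names and data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100):
    """Cluster `points` into k groups; returns (labels, centroids)."""
    # Assumption: the first k points are used as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical rating vectors for four users over four articles.
user_vectors = [
    [5, 5, 1, 1],
    [4, 5, 1, 2],
    [1, 1, 5, 5],
    [2, 1, 4, 5],
]
labels, centroids = k_means(user_vectors, k=2)
```

The resulting labels give one group per user, which is the kind of clustered output that could then be stored and passed to the recommendation algorithms.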
Referring to
Moreover, the system and method of an embodiment of the present invention further comprises a recommendation module 320, preferably based on a hybrid approach which combines collaborative filtering, rule-based, and content-based techniques. The recommendation results are generated by weighting the results from each of the recommendation algorithms. Content-based and rule-based techniques are helpful to alleviate the well-known cold-start, new item, new user and sparse rating problems. Furthermore, the engine can recommend complementary articles based on rule-based techniques.
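The weighting of results from several recommendation algorithms can be sketched as a weighted average of per-article scores. The algorithm names, weights, and scores below are hypothetical, and normalizing by the weights of only those algorithms that scored an article is an assumption of this sketch.

```python
def hybrid_scores(per_algorithm_scores, weights):
    """Combine per-algorithm article scores into one weighted average.

    per_algorithm_scores: {algorithm_name: {article_id: score}}
    weights: {algorithm_name: weight}
    An article scored by only some algorithms is averaged over those alone.
    """
    totals = {}
    weight_sums = {}
    for name, scores in per_algorithm_scores.items():
        weight = weights.get(name, 0.0)
        for article, score in scores.items():
            totals[article] = totals.get(article, 0.0) + weight * score
            weight_sums[article] = weight_sums.get(article, 0.0) + weight
    return {a: totals[a] / weight_sums[a] for a in totals if weight_sums[a] > 0}


def rank(scores):
    """Order article ids from highest to lowest combined score."""
    return sorted(scores, key=lambda a: (-scores[a], a))


# Hypothetical outputs of three techniques for one user.
scores = {
    "collaborative": {"a1": 4.0, "a2": 2.0},
    "content": {"a1": 2.0, "a3": 5.0},
    "rule": {"a2": 4.0},
}
weights = {"collaborative": 0.5, "content": 0.25, "rule": 0.25}
combined = hybrid_scores(scores, weights)
ranking = rank(combined)
```

In this sketch a new article with no collaborative score (such as `a3`) can still be ranked from its content-based score, which is one way a hybrid can ease the new article problem.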
The following table indicates how different approaches may be combined in a hybrid approach employed by recommendation module 320:
Hybridization can alleviate some of the problems associated with collaborative filtering and other recommendation techniques.
Generally, Recommender Module 300 may take input of one or more of the following data types:
Rule-based. The recommendation engine may further implement rule-based recommendations. Here are two examples:
Although the above paragraph refers to recommending similar articles to a user, complementary products or services could also be recommended to the user. This may be implemented by a Rule-based approach. For example, if a user has been recommended and has read five articles from the New Yorker™ magazine, then propose an offer to subscribe to the magazine.
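The subscription rule in the example above can be written as a small predicate. The record layout (`publication`, `was_recommended`) and the function name are hypothetical; only the threshold of five articles comes from the example.

```python
def should_offer_subscription(read_articles, publication, threshold=5):
    """Rule: offer a subscription once enough recommended articles were read.

    read_articles: list of dicts with hypothetical keys
    "publication" and "was_recommended".
    """
    matching = [
        a for a in read_articles
        if a["publication"] == publication and a["was_recommended"]
    ]
    return len(matching) >= threshold


# Hypothetical reading history for one user.
history = [{"publication": "New Yorker", "was_recommended": True} for _ in range(5)]
history.append({"publication": "Other Magazine", "was_recommended": True})
offer = should_offer_subscription(history, "New Yorker")
```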
Content-based. The recommender system may also make recommendations based on the content of articles. An article may be parsed to determine tags, or to determine the frequency of keywords. A list of keywords (not shown) is stored in Database 360 of
In a preferred embodiment, the collaborative filtering (“CF”) approaches described in more detail below, such as MWinnow, Co-Visitation or Slope One, are also used to recommend an article, and may be used in combination with a content-based approach. For example, after a user has given a good rating (either explicitly or implicitly) to “Article 3”, the CF approaches may determine that “Article 1” is similar to “Article 3”, and the content approach may further determine that “Article 5” is similar to “Article 1”. Articles 1 and 5 would then be marked to be recommended to the user, added to a list to be recommended to the user, or immediately recommended to the user.
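One way to compare articles by keyword frequency, as in the content-based approach, is cosine similarity between keyword-count vectors. The description does not name a similarity measure, so cosine similarity is an assumption of this sketch, as are the keyword list, the sample texts, and the function names.

```python
import math
import re
from collections import Counter


def keyword_frequencies(text, keywords):
    """Count how often each listed keyword occurs in the article text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w in keywords)


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-count mappings."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical keyword list and article texts.
keywords = {"football", "goal", "election"}
article_1 = keyword_frequencies("Football goal football", keywords)
article_5 = keyword_frequencies("A late goal decided the football match", keywords)
article_9 = keyword_frequencies("Election results", keywords)
similar = cosine_similarity(article_1, article_5)
unrelated = cosine_similarity(article_1, article_9)
```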
In a preferred embodiment, articles are filtered based on factors such as word length or frequency. In an embodiment of the present invention, a system and method of the present invention may also be configured to exclude documents labelled as “non-articles” by a machine learning approach. Once a document has been labelled as a non-article, it would not be presented in response to a query given to a search engine, or would not be presented by a recommendation engine. It may be desirable to label a document as an article or non-article in accordance with the following steps:
In one aspect, recommendation module 320 of the present invention will take a weighted average vote from three recommendation approaches:
These approaches are now described in greater detail below.
MWinnow. The MWinnow (multi-level Winnow) scheme may handle multiclass classification problems directly by extending the Balanced Winnow scheme. Given K possible classes $c_1, c_2, \ldots, c_K$ with corresponding class labels $l_1, l_2, \ldots, l_K$, we define K linear functions which correspond to the K classes respectively as follows:
$f^{(k)}(x) = w_0^{(k)} + x^T \mathbf{w}_1^{(k)}, \quad k = 1, 2, \ldots, K$ (3.1)
where instance $x = (x_1, x_2, \ldots, x_n)^T$, and weight vector $\mathbf{w}_1^{(k)} = (w_1^{(k)}, w_2^{(k)}, \ldots, w_n^{(k)})^T$, $k = 1, 2, \ldots, K$. Let weight vector $\mathbf{w}^{(k)} = (w_0^{(k)}, w_1^{(k)}, w_2^{(k)}, \ldots, w_n^{(k)})^T$.
We can rewrite (3.1) as follows for simplicity:
$f^{(k)}(x) = (1, x^T)\, \mathbf{w}^{(k)}, \quad k = 1, 2, \ldots, K$ (3.2)
Given a new instance x, we calculate the K output values of the K linear functions $f^{(1)}(x), f^{(2)}(x), \ldots, f^{(K)}(x)$. In fact, $f^{(k)}(x)$ is a measure of the distance from x to the hyperplane $(1, x^T)\, \mathbf{w}^{(k)} = 0$, $k = 1, 2, \ldots, K$. The classifier c(x) is defined as follows: $c(x) = l_k$ such that
We introduce the promotion parameter α and the demotion parameter β such that α>1 and 0<β<1, which determine the change rate of the weights. The model is not updated if the prediction is correct. If the algorithm makes a mistake such that it misclassifies x with true label lp to lq, then we update the weight vectors w(k)=(w0(k), w1(k), w2(k), . . . , wn(k))T (k=1,2, . . . , K) as follows.
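$$w_j^{(q)} \leftarrow \beta^{x_j}\, w_j^{(q)}, \qquad w_j^{(p)} \leftarrow \alpha^{x_j}\, w_j^{(p)}, \qquad j = 0, 1, 2, \ldots, n,$$
with $x_0 = 1$; the weight vector $w^{(k)}$ of every other class k such that $f^{(p)}(x) < f^{(k)}(x) \le f^{(q)}(x)$ is demoted by a penalty function, for $j = 0, 1, 2, \ldots, n$; and $w^{(k)} = (w_0^{(k)}, w_1^{(k)}, w_2^{(k)}, \ldots, w_n^{(k)})^T$ is not updated if $f^{(k)}(x) \le f^{(p)}(x)$. (4)
Here the penalty function is a monotonically increasing function with the parameter k>0. It outputs a value between β and 1.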
Other more sophisticated penalty functions are also available. If K=2, this is the Balanced Winnow algorithm.
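The prediction and update steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the multiplicative form (each weight scaled by α or β raised to the attribute value) is assumed from the Balanced Winnow scheme that MWinnow extends, the predicted class is assumed to be the one with the largest output, and the constant penalty `sqrt(beta)` is a hypothetical stand-in for the penalty function referred to in the text. The names `MWinnow`, `scores`, `predict` and `update` are illustrative only.

```python
import math


class MWinnow:
    """Illustrative multi-class Winnow-style learner (a sketch, not the patented code)."""

    def __init__(self, n_features, n_classes, alpha=1.1, beta=0.9):
        if not (alpha > 1.0 and 0.0 < beta < 1.0):
            raise ValueError("require alpha > 1 and 0 < beta < 1")
        self.alpha = alpha
        self.beta = beta
        # One weight vector (w0, w1, ..., wn) per class, all initialised to 1.
        self.w = [[1.0] * (n_features + 1) for _ in range(n_classes)]

    def scores(self, x):
        # f_k(x) = (1, x^T) w^(k) for every class k.
        ext = [1.0] + list(x)
        return [sum(wj * xj for wj, xj in zip(wk, ext)) for wk in self.w]

    def predict(self, x):
        # Assumption: choose the class whose linear function is largest.
        f = self.scores(x)
        return max(range(len(f)), key=lambda k: f[k])

    def update(self, x, true_k):
        """Train on one example; return True if a mistake was made."""
        f = self.scores(x)
        pred_k = max(range(len(f)), key=lambda k: f[k])
        if pred_k == true_k:
            return False  # correct prediction: the model is not updated
        ext = [1.0] + list(x)
        gamma = math.sqrt(self.beta)  # hypothetical penalty, lies in (beta, 1)
        for k, wk in enumerate(self.w):
            if k == true_k:
                factor = self.alpha          # promote the true class p
            elif k == pred_k:
                factor = self.beta           # demote the wrongly chosen class q
            elif f[true_k] < f[k] <= f[pred_k]:
                factor = gamma               # milder demotion for classes in between
            else:
                continue                     # f_k(x) <= f_p(x): leave unchanged
            for j, xj in enumerate(ext):
                wk[j] *= factor ** xj
        return True
```

With two classes only the promotion and demotion branches are ever taken, which matches the statement that the scheme reduces to Balanced Winnow when K=2.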
The MWinnow algorithm is summarized in
The computational complexity of the MWinnow algorithm, both in theory and in implementation, is rather low. To better understand MWinnow, consider the following example. Suppose a set of instances is given and the training process makes many passes through the instance set. In each pass, every instance in the set is used to train the classifier exactly once. In the first few passes, the number of accumulated mistakes increases rapidly. The number of mistakes made by the classifier in each pass will gradually decrease because the classifier gets more and more training. If the algorithm eventually converges, the number of accumulated mistakes will increase more and more slowly until reaching a maximum value. Otherwise, the number of accumulated mistakes will increase continuously.
The MWinnow algorithm is strongly convergent on a set of instances if and only if the number of accumulated mistakes eventually stays unchanged while presenting the instances cyclically to it.
Hundreds or thousands of iterations may be required before the MWinnow algorithm strongly converges, depending on the given dataset of instances. In practice, we can relax the strong convergence condition to reduce the number of iterations. In each trial t, we compute the maximum change of the weights, i.e.,
We can define the weak convergence condition as follows.
The MWinnow algorithm is weakly convergent on a set of instances if and only if the maximum change of the weights is less than a small value ε>0 while presenting the instances cyclically to it.
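The two stopping rules can be sketched as a training loop. This is an illustrative sketch only: a pass with no mistakes is used as a practical proxy for strong convergence, the maximum weight change is taken over a whole pass rather than per trial, and the callbacks `update` and `weights` as well as the defaults `eps` and `max_passes` are hypothetical names and values.

```python
def train_until_converged(examples, update, weights, eps=1e-3, max_passes=1000):
    """Present the examples cyclically until a convergence condition holds.

    update(x, label) -> True if the learner made a mistake (and changed weights).
    weights()        -> a flat list copy of the learner's current weights.
    Returns (number_of_passes, reason).
    """
    for n_pass in range(1, max_passes + 1):
        mistakes = 0
        max_change = 0.0
        for x, label in examples:
            before = weights()
            if update(x, label):
                mistakes += 1
                after = weights()
                change = max(abs(a - b) for a, b in zip(after, before))
                max_change = max(max_change, change)
        if mistakes == 0:
            # The accumulated mistake count did not grow during this pass.
            return n_pass, "strong"
        if max_change < eps:
            # Weights still move, but by less than the tolerance.
            return n_pass, "weak"
    return max_passes, "not converged"
```

Choosing a larger `eps` trades accuracy for fewer passes, which is the motivation given above for the weak condition.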
The mistake bound of the MWinnow algorithm is an upper bound of the maximum number of accumulated mistakes that it makes in the worst scenario before it eventually converges.
We ran some simple experiments on artificial data to test the convergence of the algorithm. They showed that the algorithm converges if the noise-free training examples are presented to it cyclically and α and β are tuned close to 1. In one of our experiments, we generated training data using a model with three relevant attributes and two irrelevant attributes. There were four class labels, encoded consecutively from one to four. The NMAE (Normalized Mean Absolute Error) value was 0.069 when MWinnow converged. In another experiment, training data were generated from a model with six relevant attributes and four irrelevant attributes. There were seven class labels, encoded consecutively from one to seven. The NMAE value was 0.007 when the algorithm converged. When we changed the order of training examples randomly, the NMAE values did not change significantly.
Online learning schemes are very easy to implement.
Given an instance as the input, the prediction component will output a predicted class label. We need to train the learner after we get the true label. An instance and its true label form an example. Given an example, the update component will calculate the predicted label by invoking methods in the prediction component. If the predicted label is different from the true label, the update component will update the weight vectors.
In an online recommender system, the data given are a matrix of user-article ratings. Two approaches are proposed to apply the MWinnow scheme to a model-based recommender system: pure MWinnow approach and hybrid approach. The prototype recommender system based on the pure MWinnow scheme is demonstrated in
Where MWinnow is used alone (i.e. MWinnow is given a weighting of 1.0), we train in advance an MWinnow classifier for each article by treating this article as the class attribute and all other articles as the input attributes. When the online behaviours (i.e., article ratings) of a new user are observed, his/her rating for any unrated article can be predicted using the corresponding classifier, and the articles with the highest predicted ratings will be recommended to him/her. The observed behaviour data are also used to train the classifiers.
Here is a simplified example of the use of MWinnow in accordance with an embodiment of the present invention. In the example, Users 1-3 have given items 1-3 the following rating:
(Predicted Rating)=W0+W1X1+W2X2
Each item is associated with an MWinnow classifier. For example, the MWinnow classifier c2(x)=w0+w1x1+w2x2 is associated with item 2. The weighting factors W0, W1, W2 are initially set to one. Then further instances of ratings are used to update the weighting factors. The first instance of item ratings is used to test the initial weighting factors to see if the predicted value is equal to the true value. Taking the example of the initial predicted value for User1 and Item2,
Predicted value=1+1*1+1*3=5
This is not the same as the true value, which is 2, so the weighting factors are updated as described above. The MWinnow classifier c2(x)=w0+w1x1+w2x2 is trained in this way using all the instances of item ratings in the given rating matrix.
Thus, if predictions are being created for Item2, the MWinnow classifier c2(x)=w0+w1x1+w2x2 can output a predicted rating value. If a new user, User4, enters the system and provides ratings (implicitly or explicitly) for Items 1 and 3, a prediction for the rating of Item2 for User4 can be generated as follows:
Predicted Rating for Item2 by User4=W0+W1*(User4 Rating for Item1)+W2*(User4 Rating for Item3)
This predicted rating can then be used to determine if Item2 should be recommended to User4. For example, if the predicted rating exceeds a threshold, Item2 will be recommended to User4.
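The prediction and threshold test can be sketched in a few lines of Python. The function names and the threshold are hypothetical, and the weights shown are the initial values of one from the example rather than trained values.

```python
def predict_rating(weights, ratings):
    """Return W0 + W1*X1 + W2*X2 + ... for one item's classifier."""
    w0, *rest = weights
    if len(rest) != len(ratings):
        raise ValueError("one weight is needed per input rating")
    return w0 + sum(w * x for w, x in zip(rest, ratings))


def should_recommend(weights, ratings, threshold):
    """Recommend the item when the predicted rating exceeds the threshold."""
    return predict_rating(weights, ratings) > threshold


# Initial weights of one, with the two input ratings 1 and 3 from the example.
initial_weights = [1, 1, 1]
user1_inputs = [1, 3]
print(predict_rating(initial_weights, user1_inputs))  # 5
```

After training, the same call with the updated weights and User4's ratings for Items 1 and 3 yields the predicted rating for Item2.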
Slope One. The slope one schemes take into account both information from other users who rated the same article and from the other articles rated by the same user. However, Slope One also relies on data points that fall neither in the user array nor in the article array (e.g. user A's rating of article 1), but are nevertheless important information for rating prediction. Much of the strength of the approach comes from data that is not factored in. Specifically, only those ratings by users who have rated some common article with the predictee user and only those ratings of articles that the predictee user has also rated enter into the prediction of ratings under slope one schemes.
Formally, given two evaluation arrays $v_i$ and $w_i$ with $i = 1, \ldots, n$, we search for the best predictor of the form $f(x) = x + b$ to predict w from v by minimizing $\sum_i (v_i + b - w_i)^2$.
Differentiating with respect to b and setting the derivative to zero, we get
$$b = \frac{\sum_i (w_i - v_i)}{n}.$$
In other words, the constant b must be chosen to be the average difference between the two arrays. This result motivates the following scheme.
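Given a training set χ and any two articles j and i with ratings $u_j$ and $u_i$ respectively in some user evaluation u (annotated as $u \in S_{j,i}(\chi)$), we consider the average deviation of article i with respect to article j as:
$$\mathrm{dev}_{j,i} = \sum_{u \in S_{j,i}(\chi)} \frac{u_j - u_i}{\mathrm{card}(S_{j,i}(\chi))}.$$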
Note that any user evaluation u not containing both uj and ui is not included in the summation. The symmetric matrix defined by devj,i can be computed once and updated quickly when new data is entered.
Given that $\mathrm{dev}_{j,i} + u_i$ is a prediction for $u_j$ given $u_i$, a reasonable predictor might be the average of all such predictions:
$$P(u)_j = \frac{1}{\mathrm{card}(R_j)} \sum_{i \in R_j} \left( \mathrm{dev}_{j,i} + u_i \right),$$
where $R_j = \{\, i \mid i \in S(u),\ i \ne j,\ \mathrm{card}(S_{j,i}(\chi)) > 0 \,\}$ is the set of all relevant articles. There is an approximation that can simplify the calculation of this prediction. For a dense enough data set where almost all pairs of articles have ratings, that is, where $\mathrm{card}(S_{j,i}(\chi)) > 0$ for almost all i, j, most of the time $R_j = S(u)$. Since
An implementation of Slope One does not depend on how the user rated individual articles, but only on the user's average rating and crucially on which articles the user has rated.
The use of Slope One in accordance with an embodiment of the present invention will be further described with reference to the following example:
In the example, the following ratings have been provided:
For User1 the difference or deviation between Item2 and Item1=−1
We apply this deviation to User2's rating of Item1 to predict a rating for Item2.
In a preferred embodiment, if the predicted value goes outside the rating range, the out-of-range value is still used for calculating the weighted average and thus for determining whether the article should be recommended to the user.
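The Slope One calculation can be sketched in Python as follows. This is a minimal sketch, not the system's implementation: the function names are illustrative, and the ratings in the usage lines are hypothetical values chosen only so that the deviation between Item2 and Item1 for User1 is −1, as in the example. In line with the preferred embodiment, the predicted value is not clamped to the rating range.

```python
from collections import defaultdict


def compute_deviations(ratings):
    """ratings: {user: {article: rating}} -> {(j, i): average of (u_j - u_i)}.

    Only users who rated both j and i contribute to the pair (j, i).
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for j, uj in user_ratings.items():
            for i, ui in user_ratings.items():
                if i != j:
                    total[(j, i)] += uj - ui
                    count[(j, i)] += 1
    return {pair: total[pair] / count[pair] for pair in total}


def predict(user_ratings, j, dev):
    """Average of dev[j, i] + u_i over the relevant articles i the user has rated."""
    relevant = [i for i in user_ratings if i != j and (j, i) in dev]
    if not relevant:
        return None  # no co-rated article: Slope One cannot predict
    return sum(dev[(j, i)] + user_ratings[i] for i in relevant) / len(relevant)


# Hypothetical ratings: User1 rated both items, User2 rated only Item1.
ratings = {"User1": {"Item1": 3, "Item2": 2}, "User2": {"Item1": 4}}
dev = compute_deviations(ratings)
print(dev[("Item2", "Item1")])                   # -1.0
print(predict(ratings["User2"], "Item2", dev))   # 3.0
```

Because the deviation table depends only on pairs of articles, it can be computed once and updated incrementally as new ratings arrive.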
Co-Visitation. Co-visitation helps provide recommendations where a user visits or views articles but does not provide ratings. An article-based technique for generating recommendations may make use of co-visitation instances, where co-visitation is defined as an event in which two articles (stories) are clicked by the same user within a certain time interval (typically set to a few hours). Imagine a graph whose nodes represent articles (news stories) and whose weighted edges represent the time-discounted number of co-visitation instances. The edges could be directional to capture the fact that one story was clicked after the other, or undirected if the order does not matter. This graph may be maintained as an adjacency list that is keyed by the article id. When a click by user ui on article sk is received, the user's recent click history Cui may be retrieved and the articles in it iterated over. For all such articles sj ∈ Cui, the adjacency lists for both sj and sk are modified to add an entry corresponding to the current click. If an entry for this pair already exists, an age-discounted count is updated. Given an article s, its near neighbours are effectively the set of articles that have been co-visited with it, weighted by the age-discounted count of how often they were co-visited. This captures the following simple intuition: “users who viewed this article also viewed the following articles”.
For a user ui, the co-visitation based recommendation score is generated for a candidate article s as follows: the user's recent click history Cui is fetched, limited to the past few hours or days. For every article si in the user's click history, the entry for the pair (si, s) is looked up in the adjacency list for si. The value stored in this entry, normalized by the sum of all entries of si, is added to the recommendation score. Finally, all the co-visitation scores are normalized to a value between 0 and 1 by linear scaling.
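The graph maintenance and scoring steps can be sketched in Python as follows. This is an illustrative sketch under assumptions: the exponential age discount with a half-life is a hypothetical choice (the text does not specify the discount), stored counts are discounted only when an entry is updated and not when it is read, and the class and method names are not taken from the system.

```python
from collections import defaultdict


class CoVisitationGraph:
    """Undirected co-visitation graph kept as an adjacency list keyed by article id."""

    def __init__(self, half_life_hours=24.0):
        self.half_life = half_life_hours
        # adjacency[a][b] = (age-discounted count, time of last update in hours)
        self.adjacency = defaultdict(dict)

    def _bump(self, a, b, now):
        count, last = self.adjacency[a].get(b, (0.0, now))
        decayed = count * 0.5 ** ((now - last) / self.half_life)
        self.adjacency[a][b] = (decayed + 1.0, now)

    def record_click(self, history, clicked, now):
        """Update the entries for every pair (s_j, clicked) with s_j in the history."""
        for prev in history:
            if prev != clicked:
                self._bump(prev, clicked, now)
                self._bump(clicked, prev, now)

    def score(self, history, candidate):
        """Sum over s_i in the history of entry(s_i, candidate) / sum of s_i's entries."""
        total = 0.0
        for si in history:
            neighbours = self.adjacency.get(si, {})
            denom = sum(count for count, _ in neighbours.values())
            if denom > 0 and candidate in neighbours:
                total += neighbours[candidate][0] / denom
        return total


def scale_scores(scores):
    """Linearly rescale a {article: score} mapping to the range 0 to 1."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {article: 0.0 for article in scores}
    return {article: (value - lo) / (hi - lo) for article, value in scores.items()}
```

For a directed variant, `record_click` would update only the entry from the earlier article to the later one.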
If the results from these algorithms are not satisfactory, then the recommender module 300 will default to one or more “most popular articles” (according to views, comments, or email or some other measure of popularity.)
The recommendation module 320 will insert a proportion of new articles randomly into the stream of articles being recommended, to encourage rankings for these new articles.
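How the weighted vote, the fallback to popular articles and the random insertion of new articles might fit together can be sketched as follows. This is an assumption-laden sketch rather than the module's actual logic: it assumes each approach returns a score on a common 0 to 1 scale (or `None` when it cannot score an article), and the weights, `min_score`, `new_fraction` and all function names are hypothetical.

```python
import random


def weighted_vote(scores, weights):
    """Weighted average over the approaches that produced a score."""
    num = den = 0.0
    for name, value in scores.items():
        if value is not None:
            num += weights[name] * value
            den += weights[name]
    return num / den if den > 0 else None


def recommend(candidate_scores, weights, popular, new_articles,
              k=5, min_score=0.5, new_fraction=0.2, rng=None):
    """Return up to k ranked articles plus a few randomly placed new articles."""
    rng = rng or random.Random()
    ranked = []
    for article, scores in candidate_scores.items():
        vote = weighted_vote(scores, weights)
        if vote is not None and vote >= min_score:
            ranked.append((vote, article))
    ranked.sort(reverse=True)
    picks = [article for _, article in ranked[:k]]
    if not picks:
        # Results not satisfactory: default to the most popular articles.
        picks = list(popular[:k])
    # Insert a proportion of new articles at random positions.
    n_new = min(len(new_articles), int(round(new_fraction * k)))
    for article in rng.sample(list(new_articles), n_new):
        if article not in picks:
            picks.insert(rng.randrange(len(picks) + 1), article)
    return picks
```

Setting one approach's weight to 1.0 and the others to 0 reproduces the case described earlier in which MWinnow is used alone.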
The general computer system of the foregoing paragraphs may be configured to allow the user to operate the user interface 100/200300 of
What has been described above includes examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art may recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
Number | Date | Country | Kind |
---|---|---|---|
2,634,020 | May 2008 | CA | national |
This application claims priority from and incorporates by reference the subject matter of the application entitled SYSTEM AND METHOD FOR MULTI-LEVEL ONLINE LEARNING filed with the C.I.P.O. on May 30, 2008 and assigned U.S. Pat. No. 2,634,020.