1. Field of the Invention
The present invention relates to a system and method of providing at least one recommended item and usage for at least one item to an on-line user based on similarities of usage behaviors of the on-line user and other users.
2. Description of Related Art
The enjoyment of particular items is a subjective judgment made by individuals based on any number of criteria, not least of which is the manner in which the item is used. The ability to make acceptable recommendations to a particular person about a given item such as, for example, a book, can be helpful. Such information would enable a person to, e.g., quickly skim a book that is not enjoyable to read in order to extract the key facts, while leisurely perusing an enjoyable book, e.g., rereading some chapters multiple times in order to savor the choice of expressions. There are many critics who rate books. Therefore, an individual can try to identify a critic with somewhat similar preferences as a source for selecting books to read and for suggestions on how best to enjoy a book. However, a critic is not a consistently reliable guide, as the critic may not have the same particular likes and dislikes as the reader, and typically the critic provides little indication of the manner in which he read a book.
Prior art systems have attempted to provide recommendations to a user based on, e.g., buying patterns of items or explicit ratings of items provided by the user as compared with other, similar, users. For example, collaborative filtering systems operate generally by asking many users to rate an item that the user is familiar with, and storing these ratings within user-specific rating profiles. To identify items that may be of interest to a particular user, a service correlates the user's rating profile to the profiles of other users to identify users with similar tastes. When applied over large databases of user rated data, this type of analysis can produce recommendations that are valuable to both users and merchants.
However, while collaborative filtering utilizes information obtained through collecting and analyzing an individual's buying preferences, information on an individual's behaviors in using an item is typically not captured.
Thus one problem with current collaborative filtering is that such information does not provide a means to determine a reader's objectives in reading a particular book. For instance, some books, such as textbooks or reference books, are read primarily to determine certain facts, while novels are typically read from cover to cover to enjoy the story. Such objectives are important in determining other related books to recommend, as well as in determining a suggested reading pattern for a recommended book. Thus current collaborative filtering techniques fail to capture key aspects of how an item is used, which reduces the effectiveness of such techniques in providing accurate recommendations.
The present invention addresses these and other problems that are inherent with existing collaborative filtering systems to enable improved on-line recommendations.
In an embodiment there is disclosed a method of providing online recommendations, comprising:
capturing, for one or more users, at a respective client device, usage characteristics of each user's navigation to and use of one or more items, from among a plurality of items of an item set, on-line, via a respective user interface;
obtaining corresponding profile information for each respective user, said profile information including user attributes;
storing said usage characteristics and corresponding profile of each one or more users; and, for a current user navigating online to said set of items:
deriving an item recommendation and associated usage of said item for said current online user based on items of said item set navigated to and used by other online users having similar profiles; and,
recommending for said current user, via that current user's user interface, usage of an on-line item from among said set of items, wherein a programmed processing unit performs one or more said capturing, obtaining and deriving.
In another embodiment there is disclosed a system for providing online recommendations comprising:
a memory;
a processor in communications with the memory, wherein the system performs a method comprising:
capturing, for one or more users at a respective client device, usage characteristics of each user's navigation to and use of one or more items, from among a plurality of items of an item set, on-line, via a respective user interface;
obtaining corresponding profile information for each respective user, said profile information including user attributes;
storing said usage characteristics and corresponding profile information of each of one or more users; and, for a current user navigating online to said set of items:
deriving an item usage recommendation for said current online user based on items of said item set navigated to and used by other online users having similar profiles; and,
recommending for said current user, via that current user's user interface, an on-line item and its suggested usage from among said set of items, wherein a programmed processing unit performs one or more said capturing, obtaining and deriving.
The foregoing has outlined, rather broadly, the preferred features of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art should appreciate that they can readily use the conception and specific embodiment as a base for designing or modifying the structures for carrying out the same purposes of the present invention and that such other features do not depart from the spirit and scope of the invention in its broadest form.
Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which similar elements are given similar reference numerals.
The following description sets forth an exemplary embodiment, in accordance with the principles of the present invention, of a collaborative filtering system or similar “social” filtering or media recommending system that utilizes attributes of an online user for providing a recommendation of an item within a group of items to the online user based on that user's profile information and similarity of use preferences with use preferences of a similar subgroup of users of the system. For purposes of illustration, the filtering system and associated methods are described herein in the context of recommending a book. It should be understood that the collaborative filtering system is not limited to recommending books. Rather, the collaborative filtering system of the present invention can be used to recommend any product or service whose use can be captured and tracked online, for instance movies or music.
The recommendation system and method of the present invention can be implemented on any suitable internet-connected computer and associated components, peripherals, keyboards, and the like, known to one of ordinary skill in the art. Profile information for a user is provided to the system in a suitable fashion. For example, each user can enter data into a database by keyboard, touch screen, voice, or other means. Usage information is captured via the internet through collection of attributes such as time spent on a page, total time spent with book open, etc.
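By way of a non-limiting illustration, the capture of usage attributes such as time spent on a page and total time spent with a book open might be sketched as follows. The record type PageEvent, its field names, and the function summarize_usage are hypothetical names introduced only for this sketch; the description above does not prescribe any particular data layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PageEvent:
    """One captured observation: a page stayed open for some seconds.

    All field names are assumptions made for this sketch.
    """
    user_id: str
    book_id: str
    page: int
    seconds: float


def summarize_usage(events):
    """Aggregate raw page events into per-(user, book) usage characteristics."""
    per_page = defaultdict(lambda: defaultdict(float))
    for e in events:
        # Repeated visits to the same page accumulate.
        per_page[(e.user_id, e.book_id)][e.page] += e.seconds

    summary = {}
    for key, pages in per_page.items():
        summary[key] = {
            "total_seconds": sum(pages.values()),  # total time with book open
            "pages_visited": len(pages),           # distinct pages seen
            "seconds_per_page": dict(pages),       # time spent on each page
        }
    return summary


events = [
    PageEvent("u1", "b1", 1, 30.0),
    PageEvent("u1", "b1", 2, 45.0),
    PageEvent("u1", "b1", 1, 15.0),  # the reader returned to page 1
]
usage = summarize_usage(events)
```

In this sketch, a summary of this kind would be stored alongside the user's profile information and later used when comparing readers.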
Referring to
Such information can be used to provide more insightful recommendations about other services, products or media items, e.g., books, that may be of potential interest to a current on-line user, and to other subsequent on-line users accessing the web site. For example, such use information may make an online book purchaser more willing to purchase additional books recommended by the online book seller.
Referring to
Then, continuing to block 116, a first mathematical algorithm is applied to identify a similar group(s) of readers based on the prior profiles obtained and stored in step 110 above in addition to information including reading speed, reading frequency, reading style, etc. using historical data for previous books that were read. Possible mathematical algorithms can include but are not limited to cluster analysis or collaborative filtering.
The term cluster analysis encompasses a number of different algorithms and methods for grouping objects of a similar kind into respective categories. Cluster analysis is an exploratory data analysis tool which aims at sorting different objects into groups in a way that the degree of association between two objects is maximal if they belong to the same group and minimal if they do not belong to the same group. Cluster analysis can be used to discover structures in data without providing an explanation and/or an interpretation. Thus, cluster analysis simply discovers structures in data without explaining why they exist.
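As one possible illustration of cluster analysis applied to readers, the following minimal sketch groups readers with a basic k-means procedure. The choice of k-means, the two features (pages per hour and sessions per week), and the sample values are assumptions made for this example only; the description does not limit the first mathematical algorithm to this method.

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means: returns (labels, centroids).

    Initialization uses the first k points so the result is deterministic;
    this is a simplification for the sketch, not a recommended practice.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical reader features: (pages per hour, reading sessions per week).
# Features are left unscaled here; a real system would likely normalize them.
readers = [(30.0, 1.0), (200.0, 6.0), (32.0, 1.5), (210.0, 5.0), (28.0, 2.0)]
labels, centroids = kmeans(readers, k=2)
```

With these sample values the slow readers and the fast readers fall into separate groups, without the procedure offering any explanation of why the groups exist.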
Collaborative Filtering (CF) is a method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future. For example, collaborative filtering for music tastes can make predictions about which music a user should like given a partial list of that user's tastes (likes or dislikes). Note that these predictions are specific to the user, but use information gleaned from many users.
Collaborative filtering is useful when the number of items in only one category (such as books) becomes so large that a single person cannot possibly view them all in order to select relevant books. Relying on a scoring or rating system which is averaged across all users ignores specific demands of a user, and is particularly poor in tasks where there is a large variation in interest, for example, in the recommendation of books. The paper of Breese, J. S. et al (1998) Empirical analysis of predictive algorithms for collaborative filtering. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, v 461, San Francisco, Calif. discusses a number of different predictive algorithms used for collaborative filtering.
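For illustration only, a simple user-based form of collaborative filtering can be sketched as below: the current user's predicted score for an unseen item is that user's mean score adjusted by the correlation-weighted deviations of other users. This is one of many possible weighting schemes and is not asserted to be the specific algorithm of any cited work; the user names, items, and scores are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation over the items both users have scored."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(
        sum((a[i] - mean_a) ** 2 for i in common)
        * sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(scores, user, item):
    """Predict `user`'s score for `item` from correlated users' deviations."""
    own = scores[user]
    mean_user = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in scores.items():
        if other == user or item not in theirs:
            continue
        w = pearson(own, theirs)
        mean_other = sum(theirs.values()) / len(theirs)
        num += w * (theirs[item] - mean_other)
        den += abs(w)
    return mean_user + num / den if den else mean_user


# Hypothetical scores on a 1-5 scale for books A-D.
scores = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 5, "B": 3, "C": 4, "D": 5},
    "cat": {"A": 1, "B": 5, "C": 2, "D": 1},
}
predicted = predict(scores, "ann", "D")
```

Here "ann" agrees with "bob" and disagrees with "cat", so both neighbors push the prediction for book D above her average, which reflects the assumption that those who agreed in the past tend to agree again.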
Proceeding to block 118, usage data obtained from the previously identified similar group of readers is used to derive a book usage recommendation for a selected on-line book for the online reader. Such usage recommendation might be based, for example, on averaging usage data from books read by other readers having similar profiles and book reading characteristics. At block 120, a second mathematical algorithm is then used to identify a reason or an objective as to why the online reader is looking for a book based on the online reader's profile, reading speed, reading frequency, reading style, etc., using historical data obtained from previous on-line books read by others that are similar to the online reader's profile. Preferred second mathematical algorithms for determining the online reader's use include classification trees and support vector machines.
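As a minimal sketch of the averaging mentioned for block 118, the example below averages, chapter by chapter, the minutes spent by readers in the similar group. The per-chapter granularity, the unit of minutes, and the function name recommend_usage are assumptions for illustration; the description does not fix which usage quantity is averaged.

```python
from statistics import mean


def recommend_usage(group_usage):
    """Average per-chapter minutes over the readers who opened each chapter.

    group_usage: list of {chapter_number: minutes} dicts, one per reader.
    """
    chapters = sorted({ch for usage in group_usage for ch in usage})
    return {
        ch: mean(usage[ch] for usage in group_usage if ch in usage)
        for ch in chapters
    }


# Hypothetical usage of one book by three readers in the similar group.
similar_readers = [
    {1: 20.0, 2: 5.0, 3: 40.0},
    {1: 30.0, 3: 50.0},          # this reader skipped chapter 2
    {1: 25.0, 2: 7.0, 3: 45.0},
]
suggested = recommend_usage(similar_readers)
```

A result of this kind could be presented to the online reader as a suggested way of using the book, for example skimming chapter 2 and spending more time on chapter 3.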
A classification tree (also known as decision tree) method is used when a data mining task is classification or prediction of outcomes and the goal is to generate rules that can be easily understood, explained, and translated into a natural query language. Classification tree labels are assigned to discrete classes. A classification tree is built through a process known as binary recursive partitioning. This is an iterative process of splitting data into partitions, and then splitting it up further on each of the branches. Initially, it starts with a training set in which the classification label (“purchaser” or “non-purchaser”) is known (pre-classified) for each record. All of the records in the training set are together in one group or part. The algorithm then systematically tries breaking up the records into two parts, examining one variable at a time and splitting the records on the basis of a dividing line in that variable (income>$55,000 or income <=$55,000). The object is to obtain a homogeneous set of labels (“purchaser” or “non-purchaser”) in each partition. The splitting or partitioning is then applied to each of the new partitions and the process continues until no more useful splits can be found.
The classification tree process starts with a training set consisting of pre-classified records. Pre-classified means that the target field, or dependent variable, has a known class or label, for example “purchaser” or “non-purchaser”. The goal is to build a tree that distinguishes among the classes. For simplicity, it is initially assumed that there are only two target classes and that each split is a binary partitioning. The splitting criterion easily generalizes to multiple classes, and any multi-way partitioning can be achieved through repeated binary splits. To choose the best splitter at a node, the algorithm considers each input field in turn. In essence, each field is sorted. Then, every possible split is tried and considered, and the best split is the one which produces the largest decrease in diversity of the classification label within each partition (thus, the largest increase in homogeneity). This is repeated for all fields and the winner is chosen as the best splitter for that node. The process is continued at the next node and, in this manner, a full tree is generated.
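The search for the best splitter on a single field can be sketched as follows. Gini impurity is used here as the diversity measure, which is an assumption of this sketch since the description does not name a specific measure; the income values are hypothetical and chosen to echo the income example given above.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure partition, larger for mixed partitions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Sort one field, try every boundary, keep the lowest weighted impurity.

    Returns (threshold, impurity); threshold is None if no split improves
    on the impurity of the unsplit records.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(list(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no dividing line between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


incomes = [30000, 42000, 50000, 60000, 75000, 90000]
labels = ["non-purchaser"] * 3 + ["purchaser"] * 3
threshold, impurity = best_split(incomes, labels)
```

A full tree builder would run this search over every input field, pick the field with the lowest resulting impurity, and then recurse into each of the two partitions.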
Support vector machines assign labels to instances, where the labels are drawn from a finite set of several elements. A common approach reduces the single multiclass problem into multiple binary problems. Each binary problem yields a binary classifier, which is assumed to produce an output function that gives relatively large values for examples from the positive class and relatively small values for examples belonging to the negative class. Two common methods to build such binary classifiers are where each classifier distinguishes (A) between one of the labels and the rest (one-versus-all) or (B) between every pair of classes (one-versus-one). Classification of new instances in the one-versus-all case is done by a winner-takes-all strategy, in which the classifier with the highest output function assigns the class. Classification in the one-versus-one case is done by a max-wins voting strategy, in which every classifier assigns the instance to one of its two classes and the vote for the assigned class is increased by one. Finally, the class with the most votes determines the instance classification.
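The two combination strategies described above can be illustrated independently of how the binary classifiers are trained. In the sketch below the binary classifiers are stand-in functions rather than trained support vector machines, and the class names ("reference", "study", "leisure") and the single input feature (fraction of pages read) are hypothetical.

```python
from collections import Counter


def one_vs_all(scorers, x):
    """Winner-takes-all: the label whose classifier outputs the highest value.

    scorers: {label: function(x) -> float}, one classifier per label.
    """
    return max(scorers, key=lambda label: scorers[label](x))


def one_vs_one(pairwise, x):
    """Max-wins voting: each pairwise classifier casts one vote.

    pairwise: {(a, b): function(x) -> float}; a positive output votes for
    a, otherwise the vote goes to b. Ties are not specially handled here.
    """
    votes = Counter()
    for (a, b), classifier in pairwise.items():
        votes[a if classifier(x) > 0 else b] += 1
    return votes.most_common(1)[0][0]


# Stand-in output functions over x = fraction of the book's pages read.
scorers = {
    "reference": lambda x: 0.3 - x,
    "study": lambda x: 0.2 - abs(x - 0.5),
    "leisure": lambda x: x - 0.7,
}
pairwise = {
    ("reference", "leisure"): lambda x: 0.5 - x,
    ("reference", "study"): lambda x: 0.3 - x,
    ("study", "leisure"): lambda x: 0.7 - x,
}

objective_ova = one_vs_all(scorers, 0.9)
objective_ovo = one_vs_one(pairwise, 0.9)
```

For a reader who has read 90% of the pages, both strategies assign the "leisure" objective in this example; with real classifiers the two strategies may disagree.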
Returning to
Particularly, a reading style recommendation is derived by using a third mathematical algorithm to determine a characteristic reading pattern for the previously identified similar book readers having a similar reading objective. The reading pattern is represented through mapping of the reading pattern to the structure of said book tree. In one embodiment, the third mathematical algorithm includes sequence cluster analysis or Hidden Markov modeling. For example, sequence clustering refers to the grouping of strings of characters based on some criteria, usually similarity in their sequence. Sequence clustering is a first step in several complex string-related computations, such as the construction of a search table. The procedure of sequence clustering includes the following steps:
The result includes several clusters of sequences, which are groupings of sequences that are very similar to one another.
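One simple way such clusters of similar sequences could be formed is sketched below. The greedy, single-pass procedure, the similarity threshold of 0.7, and the encoding of a reading pattern as a string of chapter letters are all assumptions of this sketch and are not a statement of the specific steps contemplated above. Similarity is measured with SequenceMatcher from Python's standard difflib module.

```python
from difflib import SequenceMatcher


def cluster_sequences(sequences, threshold=0.7):
    """Greedy clustering: join the first cluster whose representative is
    similar enough, otherwise start a new cluster.

    The first sequence placed in a cluster serves as its representative.
    """
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            similarity = SequenceMatcher(None, seq, cluster[0]).ratio()
            if similarity >= threshold:
                cluster.append(seq)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([seq])
    return clusters


# Hypothetical reading patterns: each letter is a chapter, in visit order.
patterns = ["ABCDEFGH", "ABCDEFG", "AHAHCH", "ABCDEFHG", "AHAHCHB"]
clusters = cluster_sequences(patterns)
```

In this example the cover-to-cover patterns form one cluster and the patterns that jump repeatedly between chapter A and chapter H form another, each of which could then be mapped onto the structure of the book.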
A computer-based system 200 is depicted in
The computer program product comprises all the respective features enabling the implementation of the methods described herein, and which—when loaded in a computer system—is able to carry out these methods. Computer program, software program, program, or software, in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
The computer program product may be stored on hard disk drives within processing unit (as mentioned) or may be located on a remote system such as a server (not shown), coupled to processing unit, via a network interface such as an Ethernet interface. Monitor, mouse and keyboard are coupled to the processing unit, to provide user interaction. Printer is shown coupled to the processing unit via a network connection, but may be coupled directly to the processing unit.
More specifically, as shown in
The computing system 200 additionally includes: computer readable media, including a variety of types of volatile and non-volatile media, each of which can be removable or non-removable. For example, system memory 250 includes computer readable media in the form of volatile memory, such as random access memory (RAM), and non-volatile memory, such as read only memory (ROM). The ROM may include a basic input/output system (BIOS) that contains the basic routines that help to transfer information between elements within computer device 200, such as during start-up. The RAM component typically contains data and/or program modules in a form that can be quickly accessed by processing unit. Other kinds of computer storage media include a hard disk drive (not shown) for reading from and writing to a non-removable, non-volatile magnetic medium, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a “floppy disk”), and an optical disk drive for reading from and/or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media. Any hard disk drive, magnetic disk drive, and optical disk drive would be connected to the system bus 201 by one or more data media interfaces (not shown). Alternatively, the hard disk drive, magnetic disk drive, and optical disk drive can be connected to the system bus 201 by a SCSI interface (not shown), or other coupling mechanism. Although not shown, the computer 200 can include other types of computer readable media. Generally, the above-identified computer readable media provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for use by computer 200.
For instance, the readable media can store an operating system (O/S), one or more application programs, such as video editing client software applications, and/or other program modules and program data for enabling video editing operations via a Graphical User Interface (GUI). Input/output interfaces 245 are provided that couple the input devices to the processing unit 210. More generally, input devices can be coupled to the computer 200 through any kind of interface and bus structures, such as a parallel port, serial port, universal serial bus (USB) port, etc. The computer environment 200 also includes the display device 19 and a video adapter card 235 that couples the display device 19 to the bus 201. In addition to the display device 19, the computer environment 200 can include other output peripheral devices, such as speakers (not shown), a printer, etc. I/O interfaces 245 are used to couple these other output devices to the computer 200.
As mentioned, computer system 200 is adapted to operate in a networked environment using logical connections to one or more computers, such as a server device that may include all of the features discussed above with respect to computer device 200, or some subset thereof. It is understood that any type of network can be used to couple the computer system 200 with the server device, such as a local area network (LAN), or a wide area network (WAN) (such as the Internet). When implemented in a LAN networking environment, the computer 200 connects to the local network via a network interface or adapter 29. When implemented in a WAN networking environment, the computer 200 connects to a WAN via a high speed cable/DSL modem 280 or some other connection means. The cable/DSL modem 280 can be located internal or external to computer 200, and can be connected to the bus 201 via the I/O interfaces 245 or other appropriate coupling mechanism. Although not illustrated, the computing environment 200 can provide wireless communication functionality for connecting computer 200 with a remote computing device, e.g., an application server (e.g., via modulated radio signals, modulated infrared signals, etc.).
Although an example of the present invention has been shown and described, it would be appreciated by those skilled in the art that changes might be made in the embodiment without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.