Claims
- 1. A method in a computer system containing a recommendation system for a set of data including items, content descriptors for the items, user profiles about transactions, prior searches, user ratings or user actions, to generate a recommendation list of desired items, comprising the following steps:
receiving into the recommendation system a set of statistical latent class models along with appropriate model combination weights, each possible combination of items, content descriptors, users, object or user attributes, and preferences being assigned a probability indicating the likelihood of that particular combination; receiving into the recommendation system at least one of: an actual user profile; a user query; and a request to generate at least one recommendation list, items in the recommendation list being ranked by their likelihood of being the desired items; computing a probability of relevance for each item in the set of data utilizing the received set of models and data; returning at least one recommendation list, each recommendation list having a variable length and consisting of a ranked list of desired items, the items being ranked based on the computed probability of relevance.
- 2. The method according to claim 1, further comprising:
a first step of generating the set of statistical latent class models by probabilistic latent semantic indexing.
- 3. The method according to claim 1, wherein the received actual user profile includes a prior history.
- 4. The method according to claim 1, wherein the returned recommendation list further includes a list of attributes, the attributes being determined by the following steps:
for each recommendation list, computing for each attribute, the probability of the attribute occurring in one of the items in the recommendation list based on the received set of models, weighted with their respective probabilities of relevance, and weighted with prior attribute weights; sorting the attributes in each recommendation list; and returning at least a higher probability portion of a sorted list of attributes with the recommendation list.
- 5. The method according to claim 4, wherein the attributes include keywords.
- 6. The method according to claim 1, further comprising:
recognizing that the query is ambiguous based on the set of models; and when the query is ambiguous, distributing the items over multiple recommendation lists.
- 7. The method according to claim 1, wherein the set of received models are learned over a set of data which does not include at least one of: user profiles; user ratings; user actions; user prior searches; item content descriptors; and item attributes.
- 8. The method according to claim 1, further comprising:
computing predictions for unobserved variables.
- 9. The method according to claim 8, wherein the unobserved variables include at least one of user ratings on items and query terms.
- 10. The method according to claim 1, wherein prior to the step of receiving a set of statistical latent class models, learning model parameters for the user.
- 11. The method according to claim 1, wherein prior to the step of computing a probability of relevance, setting preference probabilities below a predetermined threshold to zero.
- 12. The method according to claim 1, wherein the step of computing probability of relevance includes accounting for the temporal structure of transactions, ratings and user actions; and
computing the probabilities of relevance of items for a user for a particular point in time.
- 13. The method according to claim 1, wherein the set of received models form a hierarchy that is used for successive refinement of the user query.
- 14. The method according to claim 1, wherein the step of computing a probability of relevance includes computing the expected utility for the user using the probability of relevance for each of the items.
- 15. A method in a computer system for training a latent class model comprising the steps:
receiving data in the form of a list of tuples of entities; receiving a list of parameters, including a number of dimensions to be used in the model training, a predetermined termination condition, and a predetermined fraction of hold out data; splitting the dataset into training data and hold out data according to the predetermined fraction of hold out data; applying Tempered Expectation Maximization to the data to train a plurality of latent class models according to the following steps:
computing tempered posterior probabilities for each tuple and each possible state of a corresponding latent class variable; using these posterior probabilities, updating class conditional probabilities for items, descriptors and attributes, and users; iterating the steps of computing tempered posterior probabilities and updating class conditional probabilities until the predictive performance on the hold-out data degrades; and adjusting the temperature parameter and continuing at the step of computing tempered posterior probabilities until the predetermined termination condition is met; and combining the trained models of different dimensionality into a single model by linearly combining their estimated probabilities.
- 16. The method according to claim 15, wherein the entities include at least one of: items; users; content descriptors; attributes; and preferences.
- 17. The method according to claim 15, further comprising combining the updated class conditional probabilities with a preference value.
- 18. The method according to claim 15, wherein the step of combining the trained models includes computing the weights for the trained models being combined to maximize the predictive model performance on the hold-out data.
- 19. The method according to claim 15, further comprising:
iteratively retraining all models based on both the training data and the holdout data.
- 20. The method according to claim 15, wherein the step of adjusting the temperature parameter is omitted.
- 21. The method according to claim 15, further comprising:
splitting training data into a plurality of blocks and updating the tempered posterior probabilities after the posterior probabilities have been computed for all observations in one block.
- 22. The method according to claim 15, wherein the received data consists only of items characterized by text.
- 23. The method according to claim 15, wherein the received data consists only of pairs of users and items.
- 24. The method according to claim 15, wherein the received data consists only of triplets of users, items, and ratings.
- 25. The method according to claim 15, wherein the received data consists only of: pairs of users and items; and triplets of users, items and ratings.
- 26. The method according to claim 15, further comprising:
extracting hierarchical relationships between groups of data.
- 27. The method according to claim 15, wherein the step of receiving data includes receiving similarity matrices for the similarity of at least one of: items; and users;
integrating the similarity matrices into the step of updating the tempered posterior probabilities by transforming similarities into probabilities; and smoothing the estimates of the class conditional probabilities using the transformed similarities.
- 28. A method in a computer system containing a data mining system to analyze a set of data including items, content descriptors for the items, user profiles about transactions, prior searches, user ratings or user actions, comprising the following steps:
receiving into the data mining system at least one of: a set of statistical latent class models along with appropriate model combination weights; a request to identify groups of desired objects based on the class conditional probabilities provided in the latent class models; a request to describe groups of desired objects by content attributes inferred from the received latent class models; and a request to determine a list of users that are most likely to have the desired preference with respect to a pre-selected object; and determining at least one of: a group of users; items; a list of descriptors and attributes in accordance with the received request by computing the required probabilities from the latent class models and ranking at least one of users; items; descriptors; and attributes.
- 29. A personalized search engine system for creating a recommendation list for a user based on the user's query, past profile, ratings and actions, comprising:
a set of statistical latent class models along with appropriate model combination weights, each possible combination of items, content descriptors, users, object attributes, user attributes, and preferences being assigned a probability indicating the likelihood of that particular combination; a means for receiving the actual user profile, ratings and actions the user has performed in the past; a means for receiving a user query; a means for generating at least one recommendation list, items in the recommendation list being ranked by their likelihood of being the desired items; a means for computing the likelihood of relevance for each item in the database utilizing the statistical latent class models and data; a means for outputting at least one recommendation list, each recommendation list having variable length and consisting of a ranked list of items, the ranking based on the probability of relevance as determined by the search engine.
- 30. The personalized search engine according to claim 29, further comprising:
a means for recognizing that the user query is ambiguous; and a means for distributing the results over multiple result lists, when the query is ambiguous.
- 31. A method in a computer system for generating a recommendation list of desired items from a set of data including at least one of: items, content descriptors for the items, user profiles about transactions, prior searches, user ratings and user actions, to generate a recommendation list of desired items, comprising the following steps:
receiving into the recommendation system a set of data models; receiving into the recommendation system a user query; computing a probability of relevance for each item in the set of data utilizing the received set of models and data; returning at least one recommendation list, each recommendation list having a variable length and consisting of a ranked list of desired items, the items being ranked based on the computed probability of relevance; updating the set of data models based upon an assessment by the user of the quality of selected items in the recommendation list.
- 32. A method in a computer system for generating a recommendation list of desired items from a set of data including at least one of: items, content descriptors for the items, user profiles about transactions, prior searches, user ratings and user actions, to generate a recommendation list of desired items, comprising the following steps:
generating a set of statistical latent class models by probabilistic latent semantic indexing of the set of data; receiving into the recommendation system the set of data models; receiving into the recommendation system a user query; computing a probability of relevance for each item in the set of data utilizing the received set of models and data; returning at least one recommendation list, each recommendation list having a variable length and consisting of a ranked list of desired items, the items being ranked based on the computed probability of relevance.
- 33. A method in a computer system for generating a recommendation list of desired items from a set of data including at least one of: items, content descriptors for the items, user profiles about transactions, prior searches, user ratings and user actions, to generate a recommendation list of desired items, comprising the steps of:
statistically analyzing the set of data to learn semantic associations between words within specific items of the set of data; computing probabilities for each learned semantic association; receiving into the recommendation system at least one of: an actual user profile; a user query; and a request to generate at least one recommendation list, items in the recommendation list being ranked by their likelihood of being the desired items; computing a probability of relevance of each item in the set of data to the at least one of: an actual user profile; a user query; and a request to generate at least one recommendation list; returning at least one recommendation list.
- 34. The method according to claim 33, further comprising:
receiving a user profile, the step of computing a probability of relevance including combining the learned semantic associations and the user profile.
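By way of illustration only, the training steps recited in claim 15 might be sketched as follows for the case of claim 23, in which the received data consists only of pairs of users and items. The sketch is written in Python and is not taken from the specification. The function names (`random_dist`, `em_step`, `log_likelihood`, `train_tempered_em`), the parameters `beta_decay`, `min_beta` and `max_steps`, the use of hold-out log-likelihood as the measure of predictive performance, and the form of the tempered posterior (the likelihood term raised to the power `beta`) are all assumptions. The sketch trains a single model and omits the final step of combining models of different dimensionality.

```python
import math
import random


def random_dist(n, rng):
    """Return a random probability vector of length n."""
    weights = [rng.random() + 0.1 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def em_step(pairs, params, beta):
    """One tempered EM pass over (user, item) pairs; returns new parameters."""
    p_z, p_u, p_i = params  # P(z), P(user | z), P(item | z)
    k = len(p_z)
    mass = [0.0] * k
    cnt_u = [[0.0] * len(p_u[0]) for _ in range(k)]
    cnt_i = [[0.0] * len(p_i[0]) for _ in range(k)]
    for u, i in pairs:
        # Tempered E-step (assumed form): likelihood term raised to beta.
        post = [p_z[z] * (p_u[z][u] * p_i[z][i]) ** beta for z in range(k)]
        norm = sum(post) or 1.0
        for z in range(k):
            r = post[z] / norm
            mass[z] += r
            cnt_u[z][u] += r
            cnt_i[z][i] += r
    # M-step: renormalise the expected counts into class-conditional tables.
    new_z = [m / len(pairs) for m in mass]
    new_u = [[c / max(mass[z], 1e-12) for c in cnt_u[z]] for z in range(k)]
    new_i = [[c / max(mass[z], 1e-12) for c in cnt_i[z]] for z in range(k)]
    return new_z, new_u, new_i


def log_likelihood(pairs, params):
    """Log-likelihood of (user, item) pairs under the mixture model."""
    p_z, p_u, p_i = params
    total = 0.0
    for u, i in pairs:
        p = sum(p_z[z] * p_u[z][u] * p_i[z][i] for z in range(len(p_z)))
        total += math.log(max(p, 1e-300))  # floor avoids log(0)
    return total


def train_tempered_em(pairs, n_users, n_items, n_classes, holdout_fraction=0.2,
                      beta_decay=0.9, min_beta=0.5, max_steps=200, seed=0):
    """Split the data, then run tempered EM, lowering beta on degradation."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_hold = max(1, int(len(shuffled) * holdout_fraction))
    holdout, training = shuffled[:n_hold], shuffled[n_hold:]
    params = ([1.0 / n_classes] * n_classes,
              [random_dist(n_users, rng) for _ in range(n_classes)],
              [random_dist(n_items, rng) for _ in range(n_classes)])
    beta = 1.0
    best = log_likelihood(holdout, params)
    for _ in range(max_steps):
        candidate = em_step(training, params, beta)
        score = log_likelihood(holdout, candidate)
        if score > best:
            params, best = candidate, score  # hold-out performance improved
        else:
            beta *= beta_decay  # performance degraded: adjust the temperature
            if beta < min_beta:  # assumed termination condition
                break
    return params


# Toy data: users 0-2 choose items 0-2, users 3-5 choose items 3-5.
pairs = [(u, i) for u in range(3) for i in range(3)]
pairs += [(u, i) for u in range(3, 6) for i in range(3, 6)]
p_z, p_u, p_i = train_tempered_em(pairs * 2, n_users=6, n_items=6, n_classes=2)
```

In this sketch a candidate update is kept only when it improves the hold-out log-likelihood; otherwise the previous parameters are retained and `beta` is reduced before the next pass. Other schedules and stopping rules would be equally consistent with the claim language.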
Parent Case Info
[0001] This application claims the benefit of U.S. Provisional Application No. 60/220,926, filed Jul. 26, 2000. Application Serial No. 60/220,926 is hereby incorporated by reference.
Provisional Applications (1)
Number | Date | Country
60220926 | Jul 2000 | US