One type of information that people commonly seek on the Internet is a review of a product or service. There are some web sites whose main function is to allow consumers to review products. In other cases, web sites provide reviews as part of some other service. For example, web sites of large commercial retailers often allow customers to write reviews of the products that are sold on the sites. Sites that facilitate the selling of products by small sellers (e.g., eBay, Amazon marketplace, etc.) often allow users to review the experience they have had with particular sellers.
While some sites employ professional experts to perform formal, technical reviews of products and services, many reviews are provided by ordinary consumers. While consumer feedback can be valuable, it is often difficult to interpret. Different people may have different expectations. Thus, when reading a review, it is often difficult to know what the words in the review mean. For example, two people who review a television may both describe the picture quality of the television as “good”, but “good” might mean different things to these two people. Moreover, reviewers are often asked to rate a product or service numerically on one or more dimensions (e.g., “rate the picture quality of this television on a scale of one to five”), but people often do not agree on how the numbers are to be assigned. Two people might be equally impressed by the picture quality of a television, but one person might rate the picture a three while the other rates it a four.
If one reads many ratings of the same or similar products, one might gain a comprehensive picture of the product space and how the various products differ from each other. But reading a large enough number of reviews to get such a comprehensive picture is time consuming.
Reviews may be analyzed to determine the relationship between reviews of a product and facts that are known about the product. Using this analysis, statements can be made about how a given product compares with other products that share the same factual features.
For example, suppose that the favorability of a narrative review of a television can be measured numerically (e.g., reviews that say “okay” get a five on a scale of one to ten, while reviews that say “horrible” get a one). Once such numerical values are assigned to reviews, it is possible to find the average favorability rating of a particular product or class of products. So, suppose that there are three brands of televisions (A, B, and C) in the $1400-1500 price range, and the average favorability of a review of any of these brands is four on a scale of one to ten. Suppose further that the average favorability of reviews for brand A is six. Then, it is possible to make the statement that brand A is viewed more favorably than other brands of television in the same price range. This statement may be of interest to a consumer when making a purchase decision, since it summarizes what reviews say about brand A's televisions, and how those reviews compare with reviews of other televisions in the same price range. Techniques described herein may be used to generate this kind of statement.
In order to provide such an analysis, textual reviews are analyzed to determine what sentiments they express about a product. Information may be extracted in the form of numerical ratings. For example, reviews might be analyzed to determine what they say about three different aspects of a television: picture, sound, and construction quality. By looking for certain key words and phrases (e.g., “picture is good/amazing/terrific/bad/horrible/barely visible”), it is possible to assess on a numerical scale what a reviewer is saying about various aspects of a television. For example, if a review describes the picture as “good”, the review may be interpreted as rating the picture quality a six, while a review that describes a picture as “amazing” might be interpreted as rating the picture quality an eight. Moreover, textual analysis can be performed on a manufacturer's specifications of a television, which contain basic factual information such as the suggested retail price, the screen size, the screen resolution, etc., and each type of fact can be assigned a number. The result of this analysis is a set of variables. These variables can be analyzed statistically to determine relationships between the variables. For example, one can analyze the average picture quality for 46-inch televisions, or the average sound quality for televisions in the $1400-1500 price range.
Once the relationship between two variables is known, it is possible to make statements about how a specific product fares against other products in the same class. For example, one can say, “The brand-A 46-inch television has a higher picture quality, but a lower sound quality, than other 46-inch televisions,” or “the brand-B television has high sound quality compared with televisions of the same price.” In this sense, a statement that compares the reviews of a specific class of product or service (e.g., a specific model number of television) to some more general class of product or service (e.g., all televisions of a specific screen size) may serve as a kind of auto-generated summary of an existing set of reviews.
In the description herein, products are used as an example of things that can be reviewed, although the techniques described herein can apply to anything that can be reviewed—e.g., products, services, etc.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
People often look to consumer reviews when they want to investigate a product or service. The Internet has made it very easy to write and read reviews. Thus, reviews can be found at various places online. For example, commercial retail web sites often allow users to write reviews of products they have purchased. These web sites often display the consumer reviews with the product so that consumers who are considering buying the same product can find out what others think of the product. Online marketplaces (eBay, Amazon marketplace, etc.) often give buyers the chance to write reviews of sellers.
While consumer reviews are readily available for a wide variety of products and services, these reviews are often difficult to interpret. Traditionally, product and service reviews were created by professional experts. A consumer magazine can employ a team of engineers to put a product through rigorous technical tests. Auto clubs can engage experienced travelers to stay at hotels and rate the service that they receive. These types of reviews are reliable and convey much information because they subject the product or service that is being reviewed to uniform standards that can be well-publicized. By contrast, a typical consumer rates only a few products, and different consumers may have very different personal standards when they review products. For example, two different consumers may have the same subjective impression of the picture quality of a television, but one consumer might have higher expectations than the other. Thus, one consumer might describe the picture quality as “average,” while the other might describe the picture quality as “amazing.” Moreover, consumers tend to encounter fewer products than professional reviewers, so the fact that one particular consumer thinks the television he bought has “fantastic” sound quality may not be particularly informative or reliable, since that consumer may not know much about the general level of quality that one can expect from televisions.
While an individual consumer review may provide information that is difficult to interpret, examining a large number of consumer reviews tends to provide a reliable picture of what consumers think of a product or service. The fact that one consumer thinks the brand-A 46-inch television has a great picture does not, in itself, provide much information. However, the fact that one thousand consumers have given the brand-A 46-inch television reviews that range from good to excellent suggests that the television may be a high-quality television. And if there are an additional one thousand reviews that rate the brand-B and brand-C 46-inch televisions “poor”, then the high quality ratings of brand-A look all the more impressive by comparison. In other words, when reviews are provided by consumers who apply a wide range of standards and have relatively little experience with the types of products they are rating, the reliability of these reviews comes from two sources: large numbers, and a reference point against which the statements of the consumers can be compared. Considering a large number of reviews decreases the chance that one's impression will be influenced by an aberrational review. And comparing a large number of reviews of brand-A's product with a large number of reviews of similar products allows the similar products to serve as a reference point against which the reviews of brand A's product can be interpreted.
However, most consumers do not have time to canvass a large number of reviews. Thus, the problem of interpreting consumer reviews amounts to marshalling and modeling a large amount of information, much of which is contained in free-form, narrative textual reviews. The subject matter described herein provides a way to marshal and interpret reviews.
In order to analyze reviews, two types of information are mined: first, basic facts about the product or service being reviewed, and, second, reviewers' impressions of the product or service as expressed in the narrative part of the review. First, basic facts about the products and services are mined from information that is made available by the manufacturer of a product or the provider of a service. For example, if company A makes televisions, it will likely provide basic information about each model of television—e.g., the suggested retail price, the screen size, the screen resolution, the display technology (e.g., plasma or liquid crystal), the number of input connectors, etc. As another example, a hotel company will likely provide basic information about its hotel rooms—e.g., location of the hotel, the price range for different types of room, the sizes of the rooms, the number of restaurants in the hotel, etc. This type of information can be mined from online or print material using text analysis techniques, such as entity extraction.
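By way of a non-limiting sketch, mining basic facts from provider material might look like the following. The example uses simple regular expressions as a stand-in for the entity extraction mentioned above; the function name `extract_provider_facts`, the dictionary keys, and the sample specification text are assumptions made for illustration only.

```python
import re

def extract_provider_facts(spec_text):
    """Pull a suggested retail price and a diagonal screen size out of spec text."""
    facts = {}
    # Price: a dollar sign followed by digits, with optional thousands separators.
    price = re.search(r"\$\s*(\d[\d,]*(?:\.\d+)?)", spec_text)
    if price:
        facts["price"] = float(price.group(1).replace(",", ""))
    # Screen size: a number followed by "inch", "-inch", "inches", or a double-quote mark.
    size = re.search(r"(\d+(?:\.\d+)?)\s*(?:-?\s*inch(?:es)?\b|\")", spec_text, re.IGNORECASE)
    if size:
        facts["screen_size"] = float(size.group(1))
    return facts

# Hypothetical data-sheet text, invented for this example.
spec = "Minisonic 46-inch 1080p HDTV. Suggested retail price: $1,499."
facts = extract_provider_facts(spec)
```

A production extractor would likely handle many more fact types (resolution, display technology, number of inputs) and many more ways of writing them; the patterns here cover only the two facts shown.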
Second, the reviews themselves are mined to identify what reviewers have said about the product or service they are reviewing. That is, the narrative part of the review may be analyzed to determine what sentiments it appears to express about particular aspects of the product or service being reviewed. A television review that says “picture quality is poor” is expressing the reviewer's sentiment about a product or service, and this sentiment can be extracted from the narrative part of a review.
These two types of information—the basic facts about a product, and the reviews of that product—are used in the following manner. The basic facts about products and services are used to create categories that can be meaningfully compared. For example, it makes sense to compare two 46-inch televisions with 1080p displays. But it makes little sense to compare a 20-inch standard definition cathode ray television with a 65-inch high definition plasma television. In some cases it makes sense to compare any two televisions of the same size and screen resolution; in other cases, it makes sense to compare televisions that have similar prices. Similarly, it makes sense to compare two luxury hotels in midtown Manhattan, but it makes little sense to compare a boutique hotel in Seattle with a roadside motel in Winnemucca, Nev. What type of product or service is being offered can be determined from the basic information that the manufacturer or service provider makes available. This information can be used to create categories of products or services, so that products or services in those categories can be meaningfully compared. That is, if one wants to compare televisions of similar price, then one can determine which televisions are in the same price category using the suggested retail price information provided by the manufacturer.
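One possible way to form such categories from provider facts is sketched below. It assumes, purely for illustration, that a category is defined by screen size together with a $100-wide price band; the field names, the band width, and the product names are hypothetical.

```python
from collections import defaultdict

def category_key(product, price_band=100):
    """Category = (screen size, price band), e.g. (46, (1400, 1500))."""
    low = (product["price"] // price_band) * price_band
    return (product["screen_size"], (low, low + price_band))

def group_by_category(products, price_band=100):
    """Group product names by category so that only like products are compared."""
    groups = defaultdict(list)
    for product in products:
        groups[category_key(product, price_band)].append(product["name"])
    return dict(groups)

# Hypothetical products, invented for this example.
products = [
    {"name": "A-46", "screen_size": 46, "price": 1499},
    {"name": "B-46", "screen_size": 46, "price": 1420},
    {"name": "C-20", "screen_size": 20, "price": 180},
]
groups = group_by_category(products)
```

Here the two 46-inch televisions fall into the same category, while the 20-inch television falls into a different one. Other definitions of a category (price alone, or size and resolution) could be substituted by changing `category_key`.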
The reviews themselves are mined to convert free-form narrative statements about a product into a set of metrics. For example, suppose that ratings of televisions come down to ratings of three attributes: picture quality, sound quality, and construction quality. One can examine a narrative review of a particular television to see what the reviewer has said about these three attributes, and can assign a numerical rating to each attribute. Thus, if a reviewer says, “The Minisonic 46-inch 1080p television has a stupendous picture,” one might interpret this statement as saying that the picture quality rates nine on a scale of one to ten. If the review later says that the television “had a very flat sound,” one might interpret this statement as saying that the sound quality rates three on a scale of one to ten. Various techniques can be used to perform this type of textual analysis. In one example, an analyzer can maintain a list of descriptive words and phrases with point values assigned, and can look for these words and phrases in proximity to other words that indicate what feature of the television is being described. For example, if the word “flat” appears adjacent to “sound”, then it is likely that the person is saying the sound is flat. If the list of words indicates that “flat” is associated with poor sound quality, then the sentiment that the review is expressing about sound quality can be assigned a low numerical value (e.g., three on a scale of one to ten), indicating an unfavorable review.
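The word-list technique described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the lexicon contents, the point values, the three-token proximity window, and the function name `score_review` are all assumptions, and the sketch does not handle negation (e.g., “not good”).

```python
import re

# Hypothetical lexicon: descriptive word -> score on a one-to-ten scale.
DESCRIPTOR_SCORES = {"stupendous": 9, "amazing": 9, "good": 6, "flat": 3, "poor": 2, "awful": 1}
# Hypothetical feature words -> attribute being described.
FEATURE_WORDS = {"picture": "picture", "sound": "sound", "construction": "construction"}

def score_review(text, window=3):
    """Assign each attribute the score of the nearest descriptor within `window` tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {}
    for i, token in enumerate(tokens):
        attribute = FEATURE_WORDS.get(token)
        if attribute is None:
            continue
        best = None  # (distance, score) of the closest descriptor seen so far
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if tokens[j] in DESCRIPTOR_SCORES:
                distance = abs(i - j)
                if best is None or distance < best[0]:
                    best = (distance, DESCRIPTOR_SCORES[tokens[j]])
        if best is not None:
            scores[attribute] = best[1]
    return scores

review = ("The Minisonic 46-inch 1080p television has a stupendous picture, "
          "but it had a very flat sound.")
scores = score_review(review)
```

With the lexicon above, the sample review yields a picture score of nine and a sound score of three, matching the interpretation given in the text.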
Once information has been mined from the reviews, it is possible to calculate statistics about the reviews. For example, one could calculate the average picture quality of all 46-inch televisions, or the average sound quality of all 46-inch televisions in the $1400-1500 price range. Or, one could plot the relationship between picture quality and price. Additionally, once this type of information has been calculated for a meaningful class of televisions, it is possible to compare a specific television with all televisions in that class. Thus, if the average picture rating for 46-inch televisions in the $1400-1500 price range is four, but the average rating for the Minisonic 46-inch plasma screen television is a seven, then it is possible to make a statement such as, “The Minisonic 46-inch plasma screen television has a high picture quality compared with other televisions of its size and price.” This statement brings together a large amount of information from reviews. It quantifies what people have said about televisions of a particular size and price in general, and it also distinguishes what people say about one particular 46-inch television in the $1400-1500 price range from what people have said generally about other televisions of that size and price. This type of statement may be viewed by consumers as being more authoritative than one reviewer's isolated opinion. Additionally, this type of statement can be produced for less money than a professional expert review of a product, thereby making it economically feasible for online information aggregation services to provide this type of statement.
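The comparison in this example can be worked through in a short sketch. The individual review scores below are invented so that the averages come out to the four and seven used in the text; the record layout and the function name `compare_to_class` are likewise assumptions, and the class average here includes the product's own reviews.

```python
from statistics import mean

# Hypothetical mined records: one per review, all from the same size/price class.
reviews = [
    {"product": "Minisonic", "picture": 7},
    {"product": "Minisonic", "picture": 7},
    {"product": "Brand B", "picture": 3},
    {"product": "Brand B", "picture": 3},
    {"product": "Brand C", "picture": 2},
    {"product": "Brand C", "picture": 2},
]

def compare_to_class(records, product, attribute):
    """Return (product average, class average) for one attribute."""
    class_scores = [r[attribute] for r in records]
    product_scores = [r[attribute] for r in records if r["product"] == product]
    return mean(product_scores), mean(class_scores)

product_avg, class_avg = compare_to_class(reviews, "Minisonic", "picture")
direction = "higher than" if product_avg > class_avg else "not higher than"
statement = (f"The Minisonic averages {product_avg} for picture, "
             f"which is {direction} the class average of {class_avg}.")
```

Note that `mean` raises an error if a product has no reviews in the class, so a fuller implementation would check for that case before comparing.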
Turning now to the drawings,
Extractor 108 may maintain a list of words that it associates with positive or negative statements. That list may also quantify the magnitude of how positive or negative particular words are. For example, “amazing” and “stupendous” may be considered words that indicate a very high level of satisfaction, while “good” might indicate a sentiment that is positive, but not as strongly positive as the words “amazing” and “stupendous.” The word “bad” might be interpreted as a mildly negative sentiment, and the word “awful” might be interpreted as a strongly negative sentiment. Numerical values could be assigned to these words accordingly (e.g., one for “awful” and nine for “amazing”).
The depth of the text analysis may depend on the underlying data about what the words and phrases in a review mean. For example, extractor 108 might maintain a database that contains the meanings of general adjectival characterizations like “amazing” and “bad”, but could also include very specific phrases. For example, the writer of narrative 106 has indicated that the television “fell apart” (box 120), and extractor 108 might have data indicating that the phrase “fell apart”, when appearing in a television review, is associated with very poor construction quality.
Extractor 108 may comprise, or otherwise make use of, a numerical converter 122. Numerical converter 122 quantifies the sentiment that has been detected in narrative 106, by assigning numbers to that sentiment. In the example of
Another type of information that may be analyzed is provider data 104, which is analyzed in order to mine basic facts about the products and/or services that are the subject of reviews. Provider data 104 may be supplied by the provider of a product or service (e.g., the manufacturer of a product). In the example of
Provider data 104 may be analyzed by extractor 130. Extractor 130 may work similarly to extractor 108, but may be configured to extract the type of information that would be contained in a product data sheet rather than the type of information that would be contained in a narrative review. Extractor 130, in this example, determines the values of two variables 132 and 134, which represent the price and diagonal screen size of a television and are labeled R and D, respectively. Thus, extractor 130 might set the variables to the values R=1499 and D=46. In the example of
It is noted that the example in
One result of the scenario in
Graph 202 plots the values of the price variable (R) against the sound sentiment variable (S). The example of graph 202 shows seven data points, which may have been collected across various different types of televisions. Typically, there may be hundreds or thousands of data points, but for simplicity of illustration, only seven data points are shown. Each data point (shown with a solid circle) represents a specific review of a specific television. For example, data point 204 indicates that a person reviewed a television that has a $1000 suggested retail price. That person used some words to express his or her sentiment about the sound quality of that television, and that sentiment has been given a numerical value of four on a scale of one to ten (i.e., below average sound quality). The position of data point 204 on graph 202 represents the pair of values (sound sentiment, price) after the extractors and/or numerical converters have mined this information from the underlying data. Similarly, data point 206 indicates that a person reviewed a $1200 television, and that the sentiment expressed about sound quality in that review was assigned the value one on a scale of one to ten (i.e., very poor sound quality). The other data points indicated by solid circles represent the sound quality sentiments for various televisions having various prices.
Given a set of data such as the data points shown in graph 202, it is possible to perform various types of statistical analyses on these data. One such example is shown in
Returning to the example of
Based on analyses such as the one shown in
User interface 300 might be the web page of a review web site. The product being reviewed, in this example, is the Minisonic 46-inch 1080p HDTV television. In the example user interface 300, a graphic 302 of the television is shown. Additionally, various statements 304, 306, and 308, concerning the television are shown as part of user interface 300. For example, a web site may collect reviews of televisions, and may provide user interface 300 in order to summarize the reviews.
Concerning the Minisonic 46-inch 1080p HDTV television, statement 304 states that “This television has very good sound for its price.” That statement may be made based on the statistical analysis shown in
Statement 306 states that “This television has somewhat poor construction quality for its price.” As described in
Statement 308 states that “This television has average picture quality for its screen size.” As noted above, any type of category of product or service may be defined. In statements 304 and 306, the price of the television defines the category against which specific televisions are compared. That is, in statements 304 and 306, the Minisonic television is being compared with other televisions of the same price. However, in statement 308, the Minisonic television is being compared with other televisions that share a particular physical feature (e.g., the same screen size). For example, the average picture sentiment (variable P, in the examples above) might be a six for televisions having a 46-inch screen size, and the Minisonic might also have an average picture rating of six. In that case, statement 308 accurately describes the reviews of the Minisonic relative to other reviews of 46-inch televisions: the average sentiment about the Minisonic's picture quality is the same as the average sentiment about 46-inch televisions overall.
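As one hedged illustration of how statements such as 304, 306, and 308 might be worded automatically, the sketch below maps the gap between a product's average and its category's average to a phrase. The cut-off values, the phrase list, and the function names are assumptions chosen for this example; only the picture averages of six and six come from the text, and the sound and construction averages are invented.

```python
def describe(difference):
    """Map (product average - category average) to a phrase; cut-offs are illustrative."""
    if difference >= 2:
        return "very good"
    if difference >= 0.5:
        return "good"
    if difference > -0.5:
        return "average"
    if difference > -2:
        return "somewhat poor"
    return "very poor"

def make_statement(attribute, product_avg, category_avg, category_label):
    """Build a summary sentence relative to the named category (e.g. price, screen size)."""
    phrase = describe(product_avg - category_avg)
    return f"This television has {phrase} {attribute} for its {category_label}."

# Picture averages (six and six) follow the text; the other numbers are hypothetical.
statement_304 = make_statement("sound", 8, 5, "price")
statement_306 = make_statement("construction quality", 4, 5, "price")
statement_308 = make_statement("picture quality", 6, 6, "screen size")
```

The `category_label` argument reflects the point made above that the category may be defined by price in one statement and by a physical feature in another.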
In the process of
At 402, a text analysis is performed on a review. For example, the narrative portion of the review may be evaluated to determine what phrases the review uses with respect to attributes of the product. The particular types of words and phrases that the analysis looks for may depend on the product. For example, if the product being reviewed is a television, one may look for words such as “picture,” “sound,” “screen,” “cabinet,” etc., and may look for specific adjectives or phrases near those words (e.g., “crystal clear,” “murky,” “poor,” etc.).
At 404, a numerical score is assigned to one or more variables based on the text analysis. For example, if the product being rated is a television and one variable represents the reviewer's sentiment about the picture quality, then a numerical score may be assigned to represent that sentiment. So, if a user says, “this television has a very good picture,” this verbally-expressed sentiment might be represented by assigning the picture quality variable a value of seven on a scale of one to ten (where “very good” might be a seven, while “outstanding” might be a nine or ten).
The actions performed at 402 and 404 may be performed for each review to be analyzed.
At 406, a text analysis is performed for the provider data associated with each product or service to be evaluated. As described above in connection with
At 410, a statistical relationship is identified between one (or more) of the variables derived from reviews and one (or more) of the variables derived from provider data.
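One example of such a statistical relationship is a least-squares line relating a provider-derived variable (such as price) to a review-derived variable (such as sound sentiment). The sketch below assumes a linear fit purely for illustration; other relationships could equally be used. The first two data points correspond to data points 204 and 206 described above, and the remaining points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# ($1000, 4) and ($1200, 1) are data points 204 and 206; the rest are made up.
prices = [1000, 1200, 1500, 1800, 2000]
sound = [4, 1, 5, 6, 8]
slope, intercept = least_squares_line(prices, sound)

# Sound sentiment the fitted line would predict for a $1499 television.
expected_at_1499 = slope * 1499 + intercept
```

A product whose average sound sentiment lies well above `expected_at_1499` could then be described as having good sound for its price, and one well below as having poor sound for its price.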
Computer 500 includes one or more processors 502 and one or more data remembrance components 504. Processor(s) 502 are typically microprocessors, such as those found in a personal desktop or laptop computer, a server, a handheld computer, or another kind of computing device. Data remembrance component(s) 504 are components that are capable of storing data for either the short or long term. Examples of data remembrance component(s) 504 include hard disks, removable disks (including optical and magnetic disks), volatile and non-volatile random-access memory (RAM), read-only memory (ROM), flash memory, magnetic tape, etc. Data remembrance component(s) are examples of computer-readable storage media. Computer 500 may comprise, or be associated with, display 512, which may be a cathode ray tube (CRT) monitor, a liquid crystal display (LCD) monitor, or any other type of monitor.
Software may be stored in the data remembrance component(s) 504, and may execute on the one or more processor(s) 502. An example of such software is review analysis software 506, which may implement some or all of the functionality described above in connection with
The subject matter described herein can be implemented as software that is stored in one or more of the data remembrance component(s) 504 and that executes on one or more of the processor(s) 502. As another example, the subject matter can be implemented as instructions that are stored on one or more computer-readable storage media. Tangible media, such as optical disks or magnetic disks, are examples of storage media. The instructions may exist on non-transitory media. Such instructions, when executed by a computer or other machine, may cause the computer or other machine to perform one or more acts of a method. The instructions to perform the acts could be stored on one medium, or could be spread out across plural media, so that the instructions might appear collectively on the one or more computer-readable storage media, regardless of whether all of the instructions happen to be on the same medium.
Additionally, any acts described herein (whether or not shown in a diagram) may be performed by a processor (e.g., one or more of processors 502) as part of a method. Thus, if the acts A, B, and C are described herein, then a method may be performed that comprises the acts of A, B, and C. Moreover, if the acts of A, B, and C are described herein, then a method may be performed that comprises using a processor to perform the acts of A, B, and C.
In one example environment, computer 500 may be communicatively connected to one or more other devices through network 508. Computer 510, which may be similar in structure to computer 500, is an example of a device that can be connected to computer 500, although other types of devices may also be so connected.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.