Web users perform many activities on the Web and contribute a large amount of content such as user reviews for various products and services, which can be found on shopping sites, weblogs, forums, etc. These review data reflect Web users' sentiment toward products and are very helpful for consumers, manufacturers, and retailers. Unfortunately, most of these reviews are not well organized. Sentiment classification is one way to address this problem. But it takes effort to classify product reviews into different sentiment categories.
Nonetheless, opinion mining and sentiment classification of online product reviews have been drawing increasing attention. Typical sentiment categories include, for example, positive, negative, mixed, and none. Mixed means that a review contains both positive and negative opinions. None means that no user opinion is conveyed in the user review. Sentiment classification can be applied to classifying product features, review sentences, an entire review document, or other writing.
Conventional sentiment classification, however, is limited to text mining; that is, full-text information of the user reviews is widely adopted as the exclusive means for sentiment classification. Conventionally, an understanding of the sentiment is typically derived by mining the text for patterns and trends to find terms, through means such as statistical pattern learning. Such text mining usually involves the process of parsing and structuring the input text, deriving patterns within the structured data, and finally evaluating the output. The focus of such text mining is generally the sequence of terms in the text and the term frequency. What is needed for improved sentiment classification is analysis of numerous other features of a received text that are ignored by conventional sentiment classification techniques.
A sentiment classifier is described. In one implementation, a system applies both full text and complex feature analyses to sentences of a product review. Each analysis is weighted prior to linear combination into a final sentiment prediction. A full text model and a complex features model can be trained separately offline to support online full text analysis and complex features analysis. Complex features include opinion indicators, negation patterns, sentiment-specific sections of the product review, user ratings, sequence of text chunks, and sentence types and lengths. A Conditional Random Field (CRF) framework provides enhanced sentiment classification by incorporating the information for each segment of a complex sentence to enhance sentiment prediction.
This summary is provided to introduce the subject matter of smart sentiment classification, which is further described below in the Detailed Description. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.
Overview
This disclosure describes smart sentiment classification for product reviews. It should be noted that the “product” can be a variety of goods or services. Thus, an exemplary Smart Sentiment Classifier (“sentiment classifier” or “SSC”) described herein can classify a wide variety of reviews and critiques, based on the sentences, including sentence structure and linguistics, used in such critiques. For example, the exemplary sentiment classifier can classify the sentiment of an automobile review article from a newspaper or a consumer information forum, or can also be adapted to classify the opinion sentiment of a written evaluation, e.g., of a person's public speaking performance, a movie, opera, book, play, etc. The exemplary sentiment classifier can be trained for different types of subject matter depending on the type of review or critique that will be processed. The exemplary sentiment classifier analyzes language and other complex features in order to classify sentiment.
This complex-feature-based sentiment classification is weighted and combined by linear combination with a full-text-based sentiment classification that has also been weighted, in order to provide an ensemble approach that improves sentiment classification. Some of the complex features investigated in order to enhance the sentiment classification include opinion features (e.g., words/phrases), negation words and patterns, the section of the review from which a given sentence is taken (i.e., its context), user review ratings, the type of sentence being used to express the reviewing user's opinion, the sequence of text chunks found in a review sentence and their respective sentiments, sentence lengths, etc.
In one implementation, as mentioned, the language analyzed is from product reviews, and the sentiment classifier handles sentiment classification at a sentence level. That is, the sentiment classifier's task is to classify each review sentence, or parts of a sentence, into different sentiment categories.
A conditional random field (CRF) is a type of discriminative probabilistic model often used for parsing sequential data, such as natural language text. In one implementation, the exemplary sentiment classifier uses a Conditional Random Field (CRF) framework to induce dependency in complex sentences and model the text chunks of a sentence for classifying opinion/sentiment orientation.
An exemplary system has several important features:
The unified framework includes phrase-level feature extraction. Sentiment word/phrase extraction is crucial for sentiment-classification-related tasks. Its goal is to identify the words or phrases that can strongly indicate opinion orientation. Most conventional work focuses on adjective opinion words and usually ignores opinion phrases. However, not all types of phrases are important clues for sentiment analysis. After a series of experiments, it was discovered that two types of phrases can benefit sentiment classification: verb phrases (e.g., “buy it again”, “stay away”) and noun phrases (e.g., “high quality”, “low price”).
Comparative study for feature selection. Feature selection has been widely applied in text categorization and clustering. Compared to unsupervised selection, supervised feature selection is more successful in filtering out noise in most cases.
Sentence pattern mining. An analysis of conventional classification results finds that some typical sentences are incorrectly classified by bag-of-words methods. These kinds of sentences are difficult to classify if the context of the opinion word or phrase is not considered. Important sentence structures are incorporated into the sentence pattern mining: negation patterns, conditional structures, transitional structures, and subjunctive mood constructions. After mining such sentence patterns, the features are incorporated into a unified framework based on CRF (Conditional Random Fields).
A unified framework for sentiment classification using CRF. CRF is a recently-introduced formalism for representing a conditional model Pr(y|x), which has been demonstrated to work well for sequence labeling problems. Rather than using sentences' sentiment as an input sequential flow, sentences are split into chunks according to sentence structure and selected features for sentence-level sentiment classification.
The exemplary sentiment classifier provides significant improvement over conventional sentiment classification techniques because the sentiment classifier adopts an ensemble approach. That is, the exemplary sentiment classifier combines multiple different analyses to reach a sentiment classification, including full text analysis combined with complex features analysis.
Exemplary System
In one implementation, the exemplary sentiment classifier 104 receives product reviews 106 input at the computing device 102. The sentiment classifier 104 classifies the sentiment expressed by the sentences, language, linguistics, etc., of the product reviews 106 and determines an overall sentence classification for each review 106. From this classification 108, other derivative analyses can be obtained, such as product ratings 110.
The sentiment classification provided by the sentiment classifier 104 is more powerful in accurately finding a reviewer's sentiment toward a product or service than conventional techniques, because the sentiment classifier 104 is trained on language data that is likely similar to that used by a particular type of reviewer, and because the sentiment classifier 104 considers multiple aspects of the reviewer's language when making a sentiment assessment and classification 108.
Exemplary Engine
The exemplary sentiment classifier 104 includes a model trainer 202 that uses training information, such as training data 204, to develop a full text model 206 and a complex features model 208 that support sentiment classification. In one implementation, the model trainer 202 operates offline, so that the full text model 206 and complex features model 208 are trained and fully ready for service to support online sentiment classification.
The sentiment classifier 104 also includes a sentence processor 210 that receives sentences 212 of the review being processed, and produces an ensemble classification 214. The sentence processor 210 typically operates online, and includes an ensemble classifier 216. In one implementation, the ensemble classifier 216 includes a full text analyzer 218 that uses the full text model 206 developed by the model trainer 202, and a complex features analyzer 220 that uses the complex features model 208 developed by the model trainer 202. A weight assignment engine 222 in the ensemble classifier 216 balances the full text analysis and the complex features analysis for combination at the linear combination engine 224, which combines the weighted analyses into the ensemble classification 214.
The online sentence processor 210 may also include a sentence preprocessor 320 to receive the sentences 212 or other text data to be processed by the full text analyzer 218 and the complex features analyzer 220 of the ensemble classifier 216.
A full-text-based model loader 408 and a complex feature-based model loader 410 separately load the two component models 206 and 208 of the SSC models 318. A load success tester 412 determines whether the loading is successful, and if not, returns an error code 414. An initializer (not shown) may also load model parameters associated with the SSC models 318. In one implementation, the full text analyzer 218 and the complex features analyzer 220, supported by a configuration file 416 and the sentence section & rating 305, produce the ensemble classification 214, which can be returned as a high confidence classification result 420.
In one implementation, the full text model 206 and the complex features model 208 that make up the SSC models 318 are Naive Bayesian (NB) models, which will be explained in greater detail further below. The full text analyzer 218 and the complex features analyzer 220 use the SSC models 318 to predict a sentiment category, inputting tokens, which can be a single word, a word N-gram, a rating score, a section identifier, etc.
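The following non-limiting sketch illustrates token-based Naive Bayesian category prediction of the kind described above. The priors, per-token likelihoods, and smoothing floor shown here are hypothetical placeholder values, not values from the trained SSC models 318:

```python
import math

# Hypothetical log-priors and token log-likelihoods; in practice these
# would come from the trained full text model 206 / complex features model 208.
LOG_PRIORS = {"positive": math.log(0.4), "negative": math.log(0.4),
              "mixed": math.log(0.1), "none": math.log(0.1)}
LOG_LIKELIHOODS = {
    "positive": {"great": math.log(0.05), "battery": math.log(0.01)},
    "negative": {"great": math.log(0.001), "battery": math.log(0.02)},
    "mixed":    {"great": math.log(0.01),  "battery": math.log(0.01)},
    "none":     {"great": math.log(0.001), "battery": math.log(0.01)},
}
UNSEEN = math.log(1e-6)  # smoothing floor for tokens unseen in training


def predict_category(tokens):
    """Pick the category c maximizing log P(c) + sum of log P(token | c)."""
    def score(c):
        return LOG_PRIORS[c] + sum(
            LOG_LIKELIHOODS[c].get(t, UNSEEN) for t in tokens)
    return max(LOG_PRIORS, key=score)
```

A token here can stand in for a single word, a word N-gram, a rating score, or a section identifier, as noted above.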
Sentence Segmentation
Pr(Y|X)=(1/ZX)exp(Σi Σk λk fk(yi-1,yi,X)+Σi Σl μl gl(yi,X))
where ZX is the normalization factor over all label sequences; fk(yi-1,yi,X) and gl(yi,X) are arbitrary feature functions over the labels and the entire observation sequence; and λk and μl are the learned weights for the feature functions fk and gl respectively, which reflect the confidences of the feature functions.
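This conditional probability can be illustrated with a toy two-label chain. The feature functions, weights, and cue words below are illustrative assumptions, and the normalization factor ZX is computed here by brute-force enumeration of label sequences rather than the dynamic programming a practical CRF implementation would use:

```python
import math
from itertools import product

LABELS = ["pos", "neg"]


def f_same(y_prev, y, X, i):
    """Transition feature f_k: fires when adjacent labels agree."""
    return 1.0 if y_prev == y else 0.0


def g_match(y, X, i):
    """State feature g_l: fires when the label matches a word cue."""
    cues = {"good": "pos", "bad": "neg"}  # illustrative cue words
    return 1.0 if cues.get(X[i]) == y else 0.0


WEIGHTS_F = {f_same: 0.5}   # lambda_k weights
WEIGHTS_G = {g_match: 2.0}  # mu_l weights


def unnorm_score(ys, X):
    """exp of the weighted sum of feature functions over the sequence."""
    s = 0.0
    for i in range(len(X)):
        if i > 0:
            s += sum(lam * f(ys[i - 1], ys[i], X, i)
                     for f, lam in WEIGHTS_F.items())
        s += sum(mu * g(ys[i], X, i) for g, mu in WEIGHTS_G.items())
    return math.exp(s)


def crf_prob(ys, X):
    """Pr(Y|X) with Z_X computed by enumerating all label sequences."""
    Zx = sum(unnorm_score(list(cand), X)
             for cand in product(LABELS, repeat=len(X)))
    return unnorm_score(ys, X) / Zx
```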
The chunk CRF framework 500 splits a sentence 212 into a sequence of text chunks and indicator words for greatly improved sentiment classification. Each text chunk is assigned a sentiment category using opinion words/phrases and negation words/phrases. The chunk CRF framework 500 can be integrated into the sentiment classifier 104 and segments a review sentence 212 into several chunks and constructs opinion classification features using both sentence type information and sequential information of the sentence chunks.
In one implementation, if a sentence 212 contains at least one indicator word, it is regarded as a complex sentence. The complex sentence is then split into several text chunks connected by indicator words. Each text chunk may also have one sentiment orientation (“SO”) tag.
The exemplary chunk CRF framework 500 includes both offline training and online classification components.
In an online sentiment classification, e.g., of a product review, the sentence segment generator 512 receives sentences 212 and, for each sentence, creates sentence chunks or “processing units.” The sentence chunks are fed to the opinion features extractor 504 and the full text classifier 508, which produce output that is passed to a CRF feature space generator 514. The CRF feature space generator 514 creates a CRF model 516 that is used by a CRF-based classifier 518 to produce the opinion orientation 520.
Operation of the Exemplary Engines and Frameworks
A supervised learning approach may be used to train the sentiment classification (SSC) models 318. In one implementation, the exemplary sentiment classifier 104 has the following major characteristics:
Supervised learning: the sentiment classifier 104 can use a set of sentences 204 for model training purposes. Each training sentence 204 can be pre-labeled as one of the four sentiment categories introduced above: “positive,” “negative,” “mixed,” and “none.” The model trainer 202 extracts features from the training examples 204 and trains the full text model 206 or other classification model 506 with the extracted features. The classification model 506 is used to predict a sentiment category for an input sentence 212.
Ensemble classification: The sentiment classifier 104 includes an ensemble classifier 216. Compared with conventional sentiment classification, the exemplary sentiment classifier 104 utilizes both full text information and complex features of the user review sentences 212. Full-text information refers to the sequence of terms in a review sentence 212. Complex features include, for example, opinion-carrying words, and section rating information (to be described more fully below). In one implementation, based on the above-described two kinds of information, two sentiment classification models 318 can be trained separately: the full-text based model 206 and the complex-feature-based model 208. The ensemble classification 214 is derived from a linear combination of the influence of the two models 206 and 208. The weight assignment engine 222 assigns different weights to the two models, after which the linear combination engine 224 combines the outputs of both models to arrive at the final decision, the ensemble classification 214.
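The weighted linear combination described above can be sketched as follows. The weight values and per-category scores are illustrative placeholders; in practice the ensemble parameters would be tuned during offline training:

```python
def ensemble_scores(full_text_scores, complex_scores,
                    w_full=0.6, w_complex=0.4):
    """Linearly combine per-category scores from the two component models."""
    return {c: w_full * full_text_scores[c] + w_complex * complex_scores[c]
            for c in full_text_scores}


def classify(full_text_scores, complex_scores, **weights):
    """Return the category with the highest combined score."""
    combined = ensemble_scores(full_text_scores, complex_scores, **weights)
    return max(combined, key=combined.get)
```

Changing the weights shifts how much each model influences the final ensemble classification 214.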
Complex feature-based model training: In conventional sentiment classification, full-text information of user reviews is widely adopted as the exclusive means for sentiment classification. The exemplary sentiment classifier 104, on the other hand, also investigates complex features which enhance the sentiment classification. Some complex features include:
The exemplary sentiment classifier 104 trains the sentiment classification models 318 with full-text information and complex features separately and utilizes this information in its ensemble approach. In conventional sentiment classification, complex features, where used, are processed in the same manner as full-text features. Thus, in a conventional sentiment classification problem, since text features have very high dimensionality and many of the text terms are irrelevant to predicting a sentiment category, the contribution of non-text features is typically overwhelmed. Experimental results indicate that the exemplary sentiment classifier 104 avoids this imbalance and provides flexibility for tuning parameters to better leverage both full-text information and non-textual features.
In one implementation, the exemplary sentiment classifier 104 segments a review sentence 212 into several chunks and constructs opinion classification features using both sentence type information and sequential information of the sentence chunks. For example, if a sentence 212 contains at least one indicator word, the sentence type identifier 304 regards the sentence as a complex sentence. The chunk sequence builder 306 then splits the sentence 212 into several text chunks connected by the indicator words. In one implementation, besides the entire sentence 212, each text chunk is also assigned one sentiment orientation (SO) tag.
In this example, “but” is detected as an indicator word 602 of a transitional type sentence. This complex sentence 212 is converted to a sequence of three text chunks 604, 606, and 608 and the one indicator word 602. In one implementation, a sentiment orientation (SO) tag 608 for the entire sentence 212 is added and is counted as one of the text chunks 608. Such chunk sequences improve sentiment classification accuracy.
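The conversion of a transitional sentence into a chunk sequence can be sketched as follows. The indicator-word list is a small illustrative sample, and the virtual SO chunk is appended at the end of the sequence as described above:

```python
INDICATOR_WORDS = {"but", "if", "however"}  # illustrative sample


def build_chunk_sequence(sentence):
    """Split a sentence at indicator words; append a virtual SO chunk."""
    chunks, current = [], []
    for token in sentence.split():
        word = token.lower().strip(",.")
        if word in INDICATOR_WORDS:
            if current:
                chunks.append(("CHUNK", " ".join(current)))
                current = []
            chunks.append(("INDICATOR", word))
        else:
            current.append(token)
    if current:
        chunks.append(("CHUNK", " ".join(current)))
    chunks.append(("SO", ""))  # virtual chunk for whole-sentence orientation
    return chunks
```

A sentence with no indicator word yields a single text chunk plus the SO chunk.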
Offline and Online Processing
Offline Processing
In one implementation, the input for the offline part 202 is a set of training sentences 204. For example, each training sentence 204 may be extracted from product reviews. Each training sentence 204 is associated with one category, which may be assigned by human labelers. The categories can include positive, negative, mixed or none. The output is a model 318.
The offline part 202 typically includes the following components:
Spell-check dictionary (not shown): If spell-checking is used in the online prediction phase, the classification speed may be quite slow. Thus, a dictionary containing words that are frequently misspelled may be built during the offline phase 202. In one implementation, the spell-check dictionary can be a hash table, where the key is a misspelling and the value is the correct spelling.
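A minimal sketch of such a spell-check hash table follows; the entries are illustrative examples only:

```python
# Hash table: key is a frequent misspelling, value is the correct spelling.
SPELL_DICT = {"recieve": "receive", "teh": "the", "qulaity": "quality"}


def correct_tokens(tokens):
    """Replace known misspellings; pass other tokens through unchanged."""
    return [SPELL_DICT.get(t.lower(), t) for t in tokens]
```

Because lookup is a constant-time hash probe, this avoids the cost of general spell-checking during online prediction.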
The training preprocessor 302 receives the training data 204, parses it, and derives patterns within the structured data.
The negation pattern detector 310 inputs training data 204 and a dictionary 308 containing a small group of positive/negative opinion words. Output is typically negation words, such as “not”, “no”, “nothing”, etc. This component constructs two categories: one category includes the sentences 212 that have a sentiment that is the same as their detected opinion words. The second category includes those sentences 212 that have a sentiment that is the reverse of their opinion words. The negation pattern detector 310 extracts the terms that are near the opinion words in the sentence 212, from both categories respectively, under the assumption that such terms reverse the sentiment polarity. For example, “good” is a positive opinion word, but the category for a sentence such as “ . . . not good . . . ” is negative. In this case, “not” is regarded as a negation word/phrase. Then the terms from both categories are ranked according to their CHI score. The terms ranked at top are manually selected and kept as negation words.
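The first stage of this detection, collecting candidate terms that appear near opinion words in sentences whose labeled sentiment is reversed relative to those opinion words, might be sketched as follows. The seed word lists and window size are illustrative assumptions, and the subsequent CHI ranking and manual selection are omitted:

```python
POS_OPINION = {"good", "great"}  # illustrative positive opinion words
NEG_OPINION = {"bad", "poor"}    # illustrative negative opinion words
WINDOW = 3  # look this many tokens before each opinion word


def candidate_negation_terms(labeled_sentences):
    """Count terms near opinion words in sentences whose sentiment is the
    reverse of those opinion words' polarity (the 'reversed' category)."""
    candidates = {}
    for tokens, label in labeled_sentences:
        for i, t in enumerate(tokens):
            polarity = ("positive" if t in POS_OPINION
                        else "negative" if t in NEG_OPINION else None)
            if polarity and polarity != label:  # sentiment reversed
                for near in tokens[max(0, i - WINDOW):i]:
                    candidates[near] = candidates.get(near, 0) + 1
    return candidates
```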
The opinion word/phrase identifier 312 inputs training data and negation words and outputs two ranked lists of opinion words: one list is positive and the other is negative.
In one implementation, the sentiment classifier 104 uses unigrams, bigrams, and trigrams that have a high probability of expressing opinions of the positive and negative categories respectively. For example, “good” occurs frequently in the positive category, but not in the negative category. Such words are ranked according to their frequency and their ability to discriminate between the positive and negative categories. Part-of-speech tag information can be used to filter out noisy opinion words/phrases in both positive and negative categories.
The negation word identifier and the opinion word/phrase identifier 312 can help each other. For example, when “not good” is found in the negative category, if it is already known that “not” is a negation word, then “good” likely belongs to the positive category, and vice versa. So, in one implementation, the sentiment classifier 104 runs the above two steps in an iterative manner. Generally, one or two rounds of iteration are enough to find the negation and opinion words.
The complex feature-based model trainer 316: Complex features include opinion features, section-rating features, sentence type features, etc. Compared to text-based features, one difference is that the values of complex features are numbers or types, instead of term frequencies. After the opinion words/phrases and negation words/phrases are identified from training sentences 204, the sentiment classifier 104 rebuilds a feature vector for them. If an opinion word/phrase and a negation word/phrase are close enough (for example, less than a six-word distance), then in one implementation the sentiment classifier 104 combines the negation word and opinion word into one new expression and replaces the original words with it. For example, “not_good” may be used to replace “not good”.
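The negation/opinion merging step can be sketched as follows, assuming small illustrative word lists and the six-word distance mentioned above:

```python
NEGATION_WORDS = {"not", "no", "never"}   # illustrative negation words
OPINION_WORDS = {"good", "bad", "great"}  # illustrative opinion words
MAX_DISTANCE = 6  # merge only if the pair is within this token distance


def merge_negations(tokens):
    """Replace a nearby negation/opinion pair with one combined feature,
    e.g. 'not' ... 'good' becomes 'not_good'."""
    out, i = [], 0
    while i < len(tokens):
        t = tokens[i]
        merged = False
        if t.lower() in NEGATION_WORDS:
            for j in range(i + 1, min(i + 1 + MAX_DISTANCE, len(tokens))):
                if tokens[j].lower() in OPINION_WORDS:
                    out.append(t.lower() + "_" + tokens[j].lower())
                    out.extend(tokens[i + 1:j])  # keep the in-between words
                    i = j + 1
                    merged = True
                    break
        if not merged:
            out.append(t)
            i += 1
    return out
```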
The sentence type identifier 304 inputs training review sentences 204 with category information and outputs a list of indicator words. The sentence type identifier 304 may construct two categories, one category to contain sentences that can be correctly classified by full-text 206 and opinion words-based 208 models 318. The second category contains those sentences that cannot be correctly classified by such models 318. Then the sentence type identifier 304 extracts terms from both categories respectively according to their distributions in the two categories. All extracted terms from both categories are ranked according to their CHI score. The terms ranked at top are selected and kept as sentence type indicator words. The words or phrases like “if”, “but”, “however”, “but if” etc. can be automatically extracted. The part-of-speech tagger 404 can also provide information to filter out noisy indicator words.
The sentence chunk sequence builder 306 inputs a sentence 212 that may have one or more indicator words, and outputs a sequence of text chunks. Thus, the sentence chunk sequence builder 306 splits a complex sentence (a sentence that includes at least one indicator word) into several text chunks connected by the indicator words.
The full-text-based trainer 314 inputs review sentences 212 with assigned category information and in one implementation, outputs a trigram-based classification model 206. In one implementation, the full-text-based trainer 314 trains a trigram-based Naïve Bayesian model. An Information Gain (IG) feature selection method may be adopted to filter out noisy features before model training.
In one implementation, feature selection uses Information Gain (IG) and χ2 statistics (CHI). Information gain measures the number of bits of information obtained for category prediction by the presence or absence of a feature in a document. Let l be the number of categories {c1, . . . , cl}. The information gain of a feature f is defined as:
IG(f)=−Σi=1..l P(ci)log P(ci)+P(f)Σi=1..l P(ci|f)log P(ci|f)+P(f̄)Σi=1..l P(ci|f̄)log P(ci|f̄)
where f denotes the presence of the feature and f̄ its absence.
An χ2 statistic measures the association between a term t and a category c. Let A be the number of documents in c containing t, B the number of documents outside c containing t, C the number of documents in c without t, D the number of documents outside c without t, and N=A+B+C+D. It is defined to be:
χ2(t,c)=N·(AD−CB)2/((A+C)(B+D)(A+B)(C+D))
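Both feature-selection scores can be computed from a two-by-two contingency table of document counts, as in the following sketch. The binary-category form shown here is a simplification of the general multi-category definitions:

```python
import math


def chi_square(A, B, C, D):
    """chi^2(t, c) from a 2x2 contingency table:
    A = docs in c with term t, B = docs outside c with t,
    C = docs in c without t,  D = docs outside c without t."""
    N = A + B + C + D
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return N * (A * D - C * B) ** 2 / denom if denom else 0.0


def info_gain(A, B, C, D):
    """Information gain (in bits) of term t for categories {c, not-c}."""
    N = A + B + C + D

    def H(probs):  # entropy of a distribution, ignoring zero terms
        return -sum(p * math.log2(p) for p in probs if p > 0)

    prior = H([(A + C) / N, (B + D) / N])
    p_t = (A + B) / N
    post = 0.0
    if A + B:
        post += p_t * H([A / (A + B), B / (A + B)])
    if C + D:
        post += (1 - p_t) * H([C / (C + D), D / (C + D)])
    return prior - post
```

A term distributed identically across categories scores zero under both measures; a perfectly discriminating term scores the maximum.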
The complex feature-based trainer 316 inputs negation words, opinion words, rating/section information, and training data 204. Output is the complex feature-based model 208.
Online Prediction
The input for the online part 210 can be a set of sentences 212, e.g., from a product review. The output is a sentiment category predicted by the sentiment classifier 104. In one implementation, the sentiment categories can be labeled positive, negative or neutral; or, positive, negative, mixed, and none.
The sentence preprocessor 320 shown in
The full-text-based model loader 408 and the complex feature-based model loader 410 load the SSC models 318. Then, the ensemble classifier 216, using the two models 206 and 208, obtains two prediction scores for each sentence 212. Ensemble parameters can be loaded from the model directory. The ensemble parameters can also be tuned in the offline training part 202. After that, the linear combination engine 224 obtains the final score, based on which the categorization decision 214 is made.
Design Detail
One major function of the sentiment classifier 104 is to classify a user review sentence according to its sentiment orientation, so that an online search provides the most relevant and useful answers for product queries. But besides providing this major function and attaining basic performance criteria, the structure of the exemplary sentiment classifier 104 can be optimized to make it reliable, scalable, maintainable, and adaptable for other functions.
In one implementation, components (and characteristics) of the sentiment classifier 104 include:
Further Detail and Alternative Implementations
In one implementation, the sentiment classifier 104 classifies a review sentence 212 into one of the sentiment categories: positive, negative, mixed and none. A mixed review sentence contains both positive and negative user opinions. None means no opinion exists in a sentence. Though the description above focuses on sentence-level sentiment classification, the sentiment classifier 104 can also process paragraph level or review level sentiment classification, and can be easily extended to attribute or sub-topic level sentiment classification.
Based on experiment and observation, accurate classification results are more difficult to achieve for negative and mixed reviews than for positive reviews. This is because reviewers tend to adopt explicitly positive words when they write positive reviews. In contrast, when reviewers express negative or mixed opinions, they are more likely to use euphemistic or indirect expressions, and negative sentences usually contain more complex structure than positive review sentences. For example, users may express opinions with conditions (e.g., “It will be nice if it can work”), using subjunctive moods (e.g., “Manuals could have better organization”), or with transitions (e.g., “Had a Hot Sync problem moving over but Palm Support was great in fixing it.”). Based on analysis of manually labeled sentences, these three types of sentences (conditional, subjunctive, and transitional) are common in negative and mixed reviews. In one study, the percentages of the above three types of sentences in the positive, negative, and mixed categories were 19.9%, 46.7%, and 96.6%, respectively. This indicates euphemistic expressions are much more common in sentences with negative and mixed opinions, which are thus more difficult to classify. This problem is referred to herein as the biased sentiment classification problem.
In order to deal with the biased sentiment classification problem, the sentiment classifier 104 improves the classification of complex sentences, including transition sentences, condition sentences and sentences containing subjunctive moods. The words that determine the complex sentence type are referred to herein as indicator words, such as but, if, and could, etc. They are learned from training data 204 with the supervised learning approach. Human editors can make further changes on the list of indicator words, which are automatically learned.
Operation of the Chunk Conditional Random Field (CRF) Framework
The sentiment orientation of a sentence 212 depends on the sequence consisting of both text chunks and indicator words. In one implementation, the sentiment classifier 104 uses the chunk CRF framework 500, or “Chunk CRF,” to deal with complex sentences. Exemplary Chunk CRF determines the sentiment orientation based on both word features and sentence structure information, so that the accuracy of sentiment classification is improved. Experiments on human-labeled review sentences indicate Chunk CRF is promising and can alleviate the biased sentiment classification problem.
Chunk CRF treats the sentence-level sentiment classification problem as a supervised sequence labeling problem and uses Conditional Random Field techniques to model the sequential information within a sentence. When CRF is applied to sentence-level sentiment classification, the sentence segment generator 512 builds a text chunk sequence for each sentence 212. Given a sentence 212, the framework 500 first detects whether the sentence 212 contains complex sentence indicator 510 words such as “but,” which is determined by the method introduced in the following section. If a sentence 212 contains at least one indicator word, the CRF framework 500 regards the sentence 212 as complex. The sentence 212 is then split into several text chunks connected by indicator words. If a sentence 212 does not contain any indicator word, it is regarded as a simple sentence and corresponds to only one text chunk. As one goal is to predict the sentiment orientation (“SO”) of a sentence, the CRF framework 500 adds a virtual text chunk denoted by SO at the end of each sentence 212. The tag of SO corresponds to the sentiment orientation of the whole sentence 212.
Intuitively, the sentiment orientation SO chunk 708 depends on the orientations of all other text chunks 702, 704, 706 and the sentence type (e.g., transitional, conditional, subjunctive) which is reflected by the indicator words 710, 712. Each text chunk and indicator word is assigned a set of features. With the sentiment orientation tags of each text chunk (not shown), indicator word, and SO 708, the framework 500 can train a CRF model 516 to predict the category of SO 708 on a set of training sentences 204. The SO chunk 708 can be assigned with a tag of positive, negative, mixed or none. Based on the tag sequence and the features constructed for a sentence, the CRF framework 500 can train the CRF classifier 518 to predict the sentiment orientations 708 of new sentences 212. Another implementation conducts cross-domain studies, that is, trains Chunk CRF with one domain of review data and applies it on other domains.
In the exemplary Chunk CRF framework 500, each text chunk (e.g., 704) or indicator word (e.g., 710) can be represented by a vector of features. Conventional document classification algorithms can also be used to generate features for text chunks. The following features may be used:
Feature 1: Opinion-carrying words of the text chunk if available.
Feature 2: Negation word of the text chunk if available.
Feature 3: Sentiment orientation predicted by opinion-carrying words contained in the text chunk. Negation is also considered to be determinative of the text chunk orientation.
Feature 4: Indicator words if available.
Feature 5: Sentence type. For example, a value of “0” denotes a condition sentence; a value of “1” denotes a sentence with a subjunctive mood; a value of “2” denotes a transition sentence; a value of “3” denotes a simple sentence.
Feature 6: Sentiment orientation predicted by text analysis/classification algorithms.
By incorporating the above features, the Chunk CRF framework 500 is able to leverage various algorithms in a unified manner. Both opinion-carrying words features and sequential information of a sentence are utilized. Within the Chunk CRF framework 500, the label for the entire sequence is conditioned on the sequence of text chunks and indicator words. By capturing the sentence structure information, the Chunk CRF framework 500 is able to maximize both the likelihood of the label sequences and the consistency among them.
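Construction of a per-chunk feature vector covering Features 1-6 can be sketched as follows. The word lists and the orientation-flipping rule are illustrative simplifications, and the sentence-type codes follow the values given for Feature 5 above:

```python
OPINION_WORDS = {"good", "great", "bad"}  # illustrative opinion words
NEGATION_WORDS = {"not", "no", "never"}   # illustrative negation words
POSITIVE = {"good", "great"}


def chunk_features(chunk_tokens, indicator, sentence_type, text_cls_so):
    """Build the feature vector for one text chunk or indicator word."""
    opinion = [t for t in chunk_tokens if t in OPINION_WORDS]
    negation = [t for t in chunk_tokens if t in NEGATION_WORDS]
    # Feature 3: orientation from opinion words, flipped by negation
    so = None
    if opinion:
        so = "pos" if opinion[0] in POSITIVE else "neg"
        if negation:
            so = "neg" if so == "pos" else "pos"
    return {
        "opinion_words": opinion,           # Feature 1
        "negation_words": negation,         # Feature 2
        "lexicon_so": so,                   # Feature 3
        "indicator": indicator,             # Feature 4
        "sentence_type": sentence_type,     # Feature 5 (0/1/2/3)
        "text_classifier_so": text_cls_so,  # Feature 6
    }
```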
Feature Extraction for Sentiment Classification
Extraction of Opinion-Carrying Word Features
For extraction of opinion-carrying word features, various conventional feature selection methods have been proposed and applied to document classification. In one implementation, the exemplary sentiment classifier 104 adopts two popular feature selection methods from the art of text classification to extract opinion-carrying words: cross entropy and CHI. Moreover, part-of-speech (POS) tagging information can be used to filter noise, and WORDNET, primed with a set of manually selected seed opinion-carrying words, can be used to improve both accuracy and coverage of the extraction results (WORDNET, Princeton University, Princeton, N.J.). The sentiment classifier 104 may use Spos and Sneg to denote the positive and negative seed opinion-carrying word sets respectively. WORDNET is a semantic lexicon for the English language that groups words into sets of synonyms, provides short, general definitions, and records the various semantic relations between the synonym sets. WORDNET provides a combination of dictionary and thesaurus that is organized intuitively, and supports automatic text analysis and artificial intelligence applications.
In one implementation, the sentiment classifier 104 executes the following five steps:
Step 1: Sentences with positive and negative sentiments are tagged with part-of-speech (POS) information. All n-grams (1 ≤ n < 5) are extracted.
Step 2: All the unigrams are filtered by their part-of-speech (POS) information. Only those with adjective, verb, adverb, or noun tags are considered to be opinion-carrying word candidates. Unlike conventional work, the sentiment classifier 104 also considers nouns, because some nouns such as “problem”, “noise”, and “ease” are widely used to express user opinions.
Step 3: Within either the positive or the negative category, each candidate opinion-carrying word is assigned a cross entropy and Chi-square score, denoted by fs_c(w_i), c ∈ {pos, neg}. In this step, the sentiment classifier 104 also considers negative opinion-carrying words embedded within positive negation expressions. For example, if the negation “not expensive” appears in the positive category, the sentiment classifier 104 may select “expensive” as a negative candidate word.
Step 4: WORDNET may be used to calculate the similarity of each candidate word and the pre-selected seed opinion words, as in Equation (2):
dist(w_i, S_c) = max{sim(w_i, p), p ∈ S_c}, c ∈ {pos, neg}  (2)
Step 5: In this implementation, both the scores calculated by the feature selection method and by WORDNET are used to determine a final score for each candidate word. The scores of all candidate words are ranked to determine a final set of opinion-carrying words, as in Equation (3):
G_c(w_i) = α·fs_c(w_i) + (1 − α)·dist(w_i, S_c), c ∈ {pos, neg}  (3)
In Equations (2) and (3), the similarity between a candidate opinion-carrying word w_i and a seed word p is calculated over the WORDNET synonym graph as in Equation (4):
sim(w_i, p) = 1/dist(w_i, p)  (4)
where the distance dist(w_i, p) is the minimal number of hops between the nodes corresponding to the words w_i and p, respectively. Both fs_c(w_i) and sim(w_i, p) are normalized to the range [0, 1].
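The scoring in Steps 3 through 5 can be sketched as follows. The contingency-table form of the CHI score and the default α value are illustrative assumptions; a real system would also normalize the raw CHI scores to [0, 1] before combining, as the description requires.

```python
def chi_square(n11, n10, n01, n00):
    """CHI (Chi-square) score of a word w for a category c from a 2x2
    contingency table: n11 = examples in c containing w, n10 = examples
    outside c containing w, n01 = examples in c without w, n00 =
    examples outside c without w."""
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return num / den if den else 0.0

def final_score(fs, wordnet_sim, alpha=0.6):
    """Step 5: G_c(w) = alpha * fs_c(w) + (1 - alpha) * dist(w, S_c),
    where both inputs are assumed already normalized to [0, 1] and
    alpha is an illustrative weight."""
    return alpha * fs + (1 - alpha) * wordnet_sim

def rank_candidates(scores):
    """Rank candidate opinion-carrying words by final score (Step 5)."""
    return sorted(scores, key=scores.get, reverse=True)
```

A word occurring equally often inside and outside a category scores zero under CHI and would be dropped early; the WORDNET term then boosts candidates that sit close to a seed word even when corpus statistics are sparse.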
The exemplary sentiment classifier 104 has the advantage of adopting feature selection and WORDNET to achieve better accuracy and coverage of opinion-carrying word extraction than previous conventional approaches. Also, negation expressions are considered in Step 3 above, which is essential for determining the sentiment orientation of opinion-carrying words; in most previous conventional research work, negation expressions are usually ignored. Besides word-level features, the next section describes how to use sentence structure features to improve sentiment classification accuracy.
Extraction of Sentence Structure Features
In order to identify the factors that cause low accuracy of sentiment classification on negative and mixed sentences, empirical studies were conducted on human-labeled review data to investigate what kinds of sentences are often used to express negative or mixed opinions. In one study, 50% of the sentences in the training set 204 were selected to train sentiment classification models 318, which were then applied to predict the remaining 50% of the training sentences 204. In order to discover which kinds of opinion-bearing sentences are difficult to classify, the held-out 50% of the sentences were divided into two categories: those correctly classified by the classifier and those incorrectly classified. Feature selection methods such as CHI were then applied to identify the words that discriminate between the two categories. Words with part-of-speech tags of “CC” (coordinating conjunction), “IN” (preposition or subordinating conjunction), “MD” (modal verb), and “VB” (verb) were retained, because such words are usually indicative of complex sentence types.
From the feature selection results, the sentences most frequently misclassified fall into three types, already introduced above:
Transitional Sentences: These are sentences that contain indicator words with part-of-speech (POS) tags of CC such as “but”, and “however”. For example, “ . . . which is fine but sometimes a bit hard to reach when the drawer is open and I need to reach it to close”.
Subjunctive Mood Sentences: These are sentences with indicator words with part-of-speech (POS) tags of MD and VB such as “should”, “could”, “wish”, “expect”. For example, “It sure would have been nice if they provided a free carrying case with a belt clip.” Or, “I wish it had an erase lock on it.”
Conditional Sentences: These are sentences with indicator words with part-of-speech (POS) tags of IN such as “if”, “although”. For example, “If your hobby were ‘headache’, buy this one!”
The above three types of sentences are regarded as complex sentences. Such sentences are usually quite euphemistic or subtle when used to express opinions. Thus, in order to increase coverage, WORDNET was also used, starting from the above indicator words, to find additional indicator words such as “however” for the three types of complex sentences. Such indicator words are extracted and used as structure features 510 for sentiment classification.
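As a sketch, the three complex sentence types can be detected from small indicator-word lexicons. The word lists below are seeded only from the examples given in this description; a real system would expand them with WORDNET synonyms as described above.

```python
# Illustrative indicator lexicons seeded from the examples above; a real
# system would expand these with WORDNET (e.g. adding "however" synonyms).
TRANSITIONAL = {"but", "however"}
SUBJUNCTIVE = {"should", "could", "would", "wish", "expect"}
CONDITIONAL = {"if", "although"}

def sentence_type(sentence):
    """Return the complex-sentence type suggested by indicator words,
    or "simple" if no indicator word is present."""
    words = {w.strip(".,!?\"'").lower() for w in sentence.split()}
    if words & TRANSITIONAL:
        return "transitional"
    if words & SUBJUNCTIVE:
        return "subjunctive"
    if words & CONDITIONAL:
        return "conditional"
    return "simple"
```

The precedence among the three lexicons here is arbitrary; in the description the detected type is carried forward as a structure feature 510 rather than used as a hard decision.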
Exemplary Methods
At block 802, a full-text analysis is applied to a received text to determine a first sentiment classification for the received text. The method 800 uses a supervised learning approach to train a smart sentiment classification model. Thus, the method 800 and/or associated methods have certain characteristics:
In supervised learning, exemplary methods 800 use a set of sentences for model training. Each sentence is already labeled as one of multiple sentiment categories. Exemplary training extracts features from the training examples and trains a classification model with them. The classification model predicts a sentiment category for any input sentence.
The method 800 implements ensemble classification. Compared with conventional work on sentiment classification, the exemplary method 800 utilizes both full-text information and complex features of received sentences. Full-text information typically refers to the sequence of terms in a review sentence.
At block 804, a complex features analysis is applied to the received text to determine a second sentiment classification for the received text. Complex features include opinion-carrying words, section sentiment, rating information, etc. Based on the two kinds of information, two sentiment classification models can be trained separately: a full-text based model and a complex-feature based model.
The complex features can include opinion indicators, negation patterns, sentiment-specific sections of the product review, user ratings, the sequence of text chunks, and sentence types and lengths.
At block 806, the first sentiment classification and the second sentiment classification are combined to achieve a sentiment prediction for the received text. In one implementation, the method assigns different weights to the two models and linearly combines their outputs to make a final decision.
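The weighted linear combination at block 806 can be sketched as follows. The weight value and the category set are illustrative assumptions; in practice the weights would be tuned on held-out data.

```python
# Typical sentiment categories from this description; "mixed" means both
# positive and negative opinions, "none" means no opinion conveyed.
CATEGORIES = ("positive", "negative", "mixed", "none")

def combine(full_text_probs, complex_probs, w_full=0.5):
    """Linearly combine the two models' category scores, with weight
    w_full on the full-text model and (1 - w_full) on the
    complex-feature model; return the top category and all scores."""
    scores = {c: w_full * full_text_probs.get(c, 0.0)
                 + (1 - w_full) * complex_probs.get(c, 0.0)
              for c in CATEGORIES}
    return max(scores, key=scores.get), scores
```

With equal weights, a confident full-text prediction can still be overturned when the complex-feature model disagrees strongly, which is the point of the ensemble.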
At block 902, words (indicators) are found that indicate a sentence type for some or all of a received sentence. For example, in one implementation of the exemplary method 900, three types of sentences are frequently used: transitional sentences (containing words like “but”, “however”, etc.), conditional sentences (“if”, “although”) and sentences with subjunctive moods (“would be better”, “could be nicer”). Words such as “but” and “if”, etc., can be called sentence type indicators, or indicator words.
At block 904, the sentence is divided into segments at the indicator words. Each segment or text chunk may have its own sentiment orientation. The indicator words, moreover, also imply a sentence type for the segment they introduce.
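Block 904 can be sketched as a split at indicator words, where each chunk carries the indicator that introduced it. The indicator set below is illustrative and smaller than a real lexicon would be.

```python
# Illustrative indicator set; a deployed system would use the expanded
# lexicons described for structure features 510.
INDICATORS = {"but", "however", "if", "although", "should", "could", "wish"}

def split_at_indicators(sentence):
    """Split a sentence into (indicator, chunk) pairs at indicator
    words. The indicator that opens a chunk implies its sentence type;
    the first chunk has indicator None."""
    segments, current, indicator = [], [], None
    for token in sentence.split():
        if token.strip(".,!?").lower() in INDICATORS:
            if current:
                segments.append((indicator, " ".join(current)))
            current, indicator = [token], token.strip(".,!?").lower()
        else:
            current.append(token)
    if current:
        segments.append((indicator, " ".join(current)))
    return segments
```

Each resulting chunk can then be classified on its own, so a transitional sentence like the example at block 902 yields one positive chunk and one negative chunk rather than a single conflicting label.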
At block 906, an ensemble of sentiment classification analyses is applied to each segment. For example, full-text analysis and complex features analysis are applied to each segment.
At block 908, a Conditional Random Fields (CRF) feature space is created for the output of the sentiment classification results. The sentiment classification of each of the multiple segments may have some components derived from the full-text analysis and others from the complex features-based analysis.
At block 910, a CRF model is used to produce a sentiment prediction for the received sentence. That is, the method 900 uses a CRF model for the various segments and their various sentiment orientations and executes a CRF-based classification of the modeled sentiments to achieve a final, overall sentiment orientation for the received sentence.
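As an illustration of the decoding step at block 910, the sketch below runs Viterbi search over per-segment sentiment scores with a pairwise transition table. The scores and transition weights stand in for trained CRF potentials and are invented for the example; a real CRF would learn them from the feature space built at block 908.

```python
def viterbi(emissions, transitions, labels=("pos", "neg")):
    """Find the best label sequence for a list of per-segment score
    dicts (emissions) under pairwise transition scores. Higher score is
    better; a stand-in for trained CRF potentials."""
    # best[i][label] = (best score ending in label at segment i, path)
    best = [{l: (emissions[0].get(l, 0.0), [l]) for l in labels}]
    for em in emissions[1:]:
        step = {}
        for l in labels:
            # Pick the best previous label to transition from.
            prev_l, (score, path) = max(
                best[-1].items(),
                key=lambda kv: kv[1][0] + transitions.get((kv[0], l), 0.0))
            step[l] = (score + transitions.get((prev_l, l), 0.0)
                       + em.get(l, 0.0), path + [l])
        best.append(step)
    return max(best[-1].values(), key=lambda sp: sp[0])[1]
```

The transition table is where sequence consistency enters: a strong first segment can pull an ambiguous second segment toward the same orientation, which per-segment classification alone cannot do.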
Although exemplary systems and methods have been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed methods, devices, systems, etc.
This patent application claims priority to U.S. Provisional Patent Application No. 60/892,527 to Huang et al., entitled, “Unified Framework for Sentiment Classification,” filed Mar. 1, 2007 and incorporated herein by reference; and U.S. Provisional Patent Application No. 60/956,053 to Huang et al., entitled, “Smart Sentiment Classifier for Product Reviews,” filed Aug. 15, 2007 and incorporated herein by reference.