Various embodiments are directed generally to data analysis and specifically to methods and systems that allow structured data tools to operate on unstructured data.
Studies correlate higher overall customer satisfaction levels with improved profitability for business organizations. This correlation may be explained by two observations: 1) a satisfied customer is more likely to solicit future business from an organization; and 2) a satisfied customer is more likely to recommend an organization's offerings to acquaintances, which provides opportunities for acquiring new business.
Today, a large number of business organizations constantly survey a sample of their customers in order to quantitatively project an overall customer satisfaction level. This metric can be thought of as a “customer pulse.” By being sensitive to variations and trending patterns in the value of such a metric over time, an organization can react quickly to address areas of customer pain or adjust faster to shifting customer expectations.
In order for an organization to apply appropriate remediative adjustments, it is critical to be able to associate and explain a specific variation (e.g. an unexpected drop in overall customer satisfaction) against tangible causal factors.
An important resource for evaluating the meaningful causes behind shifts in overall customer satisfaction is direct customer feedback (e.g. solicited customer surveys and direct customer complaints) and indirect customer feedback (e.g. feedback garnered from social media channels). Such feedback is typically collected as unstructured text.
Conventional approaches to evaluating causal cues from unstructured text require human resources to physically read all feedback associated with the variation, and to then make inferences on which specific issues may have caused the variation. Such an approach is time-consuming, and any delay in identifying issues may translate to loss of potential revenue. Conventional approaches are also labor intensive, inconsistent, error-prone, and tend to be influenced by subjective judgment.
Various embodiments include systems and methods for automating causal analysis.
In some embodiments, a system may comprise a report generation component configured to generate a report; a report presentation component configured to allow an operator to select an observation from the report; a root cause component configured to determine one or more causal factors associated with the observation; a memory configured to store the report generation component, the report presentation component, and the root cause component; and at least one processor to implement the report generation component, the report presentation component, and the root cause component.
In some embodiments, a method of determining one or more causal factors for an observation may comprise: receiving an instruction to execute a report from a user; receiving an instruction to determine the one or more causal factors associated with an observation selected by the user; determining, by a processor, the one or more causal factors associated with the selected observation; ranking, by the processor, the one or more causal factors based on a measure of statistical association to the selected observation; and presenting results to the user.
In some embodiments, a computer readable storage medium may comprise instructions that, if executed, enable a computing system to: receive an instruction to execute a report from a user; receive an instruction to determine the one or more causal factors associated with an observation selected by the user; determine the one or more causal factors associated with the selected observation; rank the one or more causal factors based on a measure of statistical association to the selected observation; and present results to the user.
Additional features, advantages, and embodiments are set forth or apparent from consideration of the following detailed description, drawings and claims. Moreover, it is to be understood that both the foregoing summary and the following detailed description are exemplary and intended to provide further explanation without limiting the scope of the invention as claimed.
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate preferred embodiments of the invention and together with the detailed description serve to explain the principles of the invention.
Various embodiments are directed to systems and methods for performing root cause analysis using unstructured data. Such a capability is not possible with existing unstructured and structured data analysis tools.
In some embodiments, an apparatus may be provided to process unstructured text to determine, evaluate and rank causal factors associated with the magnitude and/or timing of a measured observation.
Observations that can be measured and analyzed may include, but are not limited to, average customer sentiment, a measure of customer satisfaction, and volume of customer comments.
Observations may either be measured over an entire set of customer feedback (i.e. overall measures) or may be restricted to cover only a specific topic of discussion, only a specific segmented set of customers (e.g. men in the age group 30-45), or constrained by defined criteria (e.g. comments received during Black Friday).
Unstructured text may refer to human language in written form. Unstructured text may be acquired from a variety of sources such as surveys, e-mails, call center notes, audio conversation transcripts, chat data, word processing documents such as Excel or Word documents, social media such as Facebook or Twitter, review websites, or news content.
Trended data may refer to data that is being analyzed over time, e.g. a weekly trend report or a daily trend report.
Untrended data may refer to data that is being analyzed without consideration of a time component.
A satisfaction measure may refer to an aggregated computed measure of overall customer satisfaction.
An observed anomaly may refer to an observation in a data report which stands out when compared to its peers because of a variation in some quantitative measure such as, but not limited to, volume, sentiment or satisfaction score.
The causal factors that are presented as output by the present invention may include 1) discussion topics, 2) lexical patterns, 3) semantic patterns, 4) customer groups, and any combination thereof.
System 100 may include enterprise server 110, database server 120, one or more external sources 130, one or more internal sources 140, navigator device 150, administrator device 160, business intelligence server 170, and business intelligence report device 180.
Enterprise server 110, database server 120, one or more external sources 130, one or more internal sources 140, navigator device 150, administrator device 160, business intelligence server 170, and business intelligence report device 180 may be connected through one or more networks. The one or more networks may provide network access, data transport and other services to the devices coupled to it. In general, one or more networks may include and implement any commonly defined network architectures including those defined by standards bodies, such as the Global System for Mobile communication (GSM) Association, the Internet Engineering Task Force (IETF), and the Worldwide Interoperability for Microwave Access (WiMAX) forum. For example, one or more networks may implement one or more of a GSM architecture, a General Packet Radio Service (GPRS) architecture, a Universal Mobile Telecommunications System (UMTS) architecture, and an evolution of UMTS referred to as Long Term Evolution (LTE). The one or more networks may, again as an alternative or in conjunction with one or more of the above, implement a WiMAX architecture defined by the WiMAX forum. The one or more networks may also comprise, for instance, a local area network (LAN), a wide area network (WAN), the Internet, a virtual LAN (VLAN), an enterprise LAN, a virtual private network (VPN), an enterprise IP network, or any combination thereof.
Enterprise server 110, database server 120, and business intelligence server 170 may be any type of computing device, including but not limited to a personal computer, a server computer, a series of server computers, a mini computer, and a mainframe computer, or combinations thereof. Enterprise server 110, database server 120, and business intelligence server 170 may each be a web server (or a series of servers) running a network operating system, examples of which may include but are not limited to Microsoft Windows Server, Novell NetWare, or Linux.
Enterprise server 110 may include natural language processing engine 111, sentiment scoring engine 112, classification engine 113, and reporting engine 114.
Natural language processing engine 111 may include subsystems to process unstructured text, including, but not limited to, language detection, sentence parsing, clause detection, tokenization, stemming, part of speech tagging, chunking, and named entity recognition. In some embodiments, natural language processing engine 111 may perform any or all portions of the exemplary process depicted in
Sentiment scoring engine 112 may identify a value representing the general feeling, attitude or opinion that an author of a section of unstructured text is expressing towards a situation or event. In some embodiments, the sentiment scoring engine may classify sentiment as either positive, negative or neutral. In some embodiments, the sentiment scoring engine may assign a numeric sentiment score on a numeric scale ranging from a minimum value representing the lowest possible sentiment to a maximum value representing the highest possible sentiment. In some embodiments, a dictionary of words is included, in which selected words are pre-assigned a sentiment tuning value. In some embodiments, the presence or absence of language features such as negation (e.g. NOT GOOD) or modifiers (e.g. VERY GOOD, SOMEWHAT GOOD, etc.) when modifying certain words (e.g. GOOD) influence the computation of sentiment for that sentence or clause.
In some embodiments, if a sentence has a single sentiment word with no negators or modifiers, the sentence sentiment score may be equal to the sentiment of that word (For example, a single word with a sentiment value of +3 will result in a sentence sentiment score of +3). In some embodiments, for sentences with multiple sentiment words, the following calculation may be applied. Consider the sentence below as an example:
1. Find the highest sentiment word value in the sentence. This will be used as a base for the sentence sentiment. In the example sentence this is +3.
2. Add +0.5 for every additional word with the same sentiment. In the example, there is one more word with +3, so add +0.5 to +3, which equals +3.5.
3. Add +0.25 for every word one level lower in sentiment. In the example, there is just one token with +2, so (3.5+0.25)=+3.75
4. The same approach applies to each subsequent level. For sentiment level n−1, take the value added per token on level n, divide it by 2, and multiply by the number of tokens with sentiment level n−1. So, in the example, to calculate the effect of the +1 token, you have to add +0.25/2 to the sentence sentiment: (3.75+0.25/2)=3.875. The only exception is that word sentiment level 0.25 (multiple decreasing modifiers attached to a word with a +1 or −1 value) is handled the same way as 0.5; the net effect on the sentence sentiment is the same for both levels, as there is no meaningful difference between the two cases.
5. Total sentence sentiment=+3.875
The same calculation model may be used for a sentence with negative words: adding a negative value equals subtraction of this value. When a sentence contains both positive and negative words, the calculations are done separately for positive and negative parts and then summed up.
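By way of a non-limiting illustration, the sentence-level calculation described above may be sketched in Python as follows. The function names and the `_rank` helper are hypothetical and are introduced only for this example. The sketch assumes a level scale in which word magnitudes of 0.25 and 0.5 share the lowest level and integer magnitudes (1, 2, 3, and so on) each form one level, and it assumes that the increment for a word is halved once per level of distance from the strongest word. The word values +3, +3, +2 and +1 are taken from the worked steps.

```python
def _rank(magnitude):
    # Assumed level scale: 0.25 and 0.5 share the lowest level (rank 0);
    # integer magnitudes 1, 2, 3, ... are ranked by their value.
    return 0 if magnitude < 1 else int(magnitude)


def _one_sided(magnitudes):
    """Combine word magnitudes that all carry the same sign."""
    if not magnitudes:
        return 0.0
    ordered = sorted(magnitudes, reverse=True)
    top = ordered[0]          # step 1: strongest word is the base
    total = float(top)
    for magnitude in ordered[1:]:
        distance = _rank(top) - _rank(magnitude)
        # steps 2-4: +0.5 at the same level, halved for each level below
        total += 0.5 / (2 ** distance)
    return total


def sentence_sentiment(word_values):
    """Score positive and negative words separately, then sum the two parts."""
    positives = [v for v in word_values if v > 0]
    negatives = [-v for v in word_values if v < 0]
    return _one_sided(positives) - _one_sided(negatives)


print(sentence_sentiment([3, 3, 2, 1]))  # 3.875
```

Negative words are handled by scoring their magnitudes in the same way and subtracting the result, which mirrors the statement that adding a negative value equals subtraction of that value.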
Classification engine 113 may identify whether a particular classification category applies to a portion of unstructured text. In some embodiments each classification category is represented by one or many rules. In some embodiments, the rules may be expressed in Boolean logic. In some embodiments, the rules may be represented by a trained machine learning model.
Reporting engine 114 may report against categories and sentiment expressed in a collection of documents. In some embodiments, the categories used in reporting may include theme-detected topics. Reporting engine 114 may include a charting module 116, an alerting module, a dashboard module, root cause module 115, a comparative analysis module, and any combination thereof.
In some embodiments, theme detection may be performed with or by any one or more of the embodiments disclosed in co-pending U.S. patent application Ser. No. 13/783,063 filed Mar. 1, 2013, entitled “Apparatus for Automatic Theme Detection from Unstructured Data,” which is hereby incorporated herein by reference.
The root cause module 115 may perform any or all of the exemplary processes depicted by
In some embodiments, a user or a business may use an analysis tool provided by navigator device 150 to visualize data from charting module 116 or the dashboard module. The analysis tool may allow the user to select specific observations within a specific analysis. The user may instruct a module of the system, such as root cause module 115, to determine root cause factors that can explain the selected observation.
In block 220, criteria for isolating the selected observation 215 are identified. Criteria may include any filters applied in the analysis or any report elements that define the selected observation 215. For example, for a monthly trend report where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria to isolate the selected observation may include MONTH=[FEB 2012] AND STORE=[XYZ].
An observation may be chosen by the system or by using a reporting or analytic tool where a user can select a data point on a graphic or tabular report. The observation may be defined by filters applied to restrict data covered by the report, categorical values associated with the selected data point, a time or time period associated with the selected data point, any numerical value associated with the data point, and a definition of a metric used for the numerical value. A metric may be of two types: purely volume based (numerical) or a customer behavior measure, e.g. measures of customer satisfaction, average sentiment, and average feedback rating. A report may be of two types: a trended report, in which time is one dimension being reported, or a non-trend report, in which the report does not have a time component. For trended reports, an observation is considered to be either a spike or a dip in the metric being measured when compared to the previous time period. For a non-trend report, the criteria for an observation set may correspond to data elements that have similar customer behavior metrics.
A filter may be used to constrain the unstructured data selected for a specific analysis. A filter may include one or several criteria—an example of a filter is the criteria STORE=[XYZ] which when applied to metadata for a collection of unstructured documents would return only those documents for which attribute STORE has value XYZ.
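As a non-limiting sketch, a filter of this kind may be applied to document metadata as shown below. The document layout (a dictionary with `text` and `metadata` keys), the function names, and the sample documents are assumptions made for illustration only. Each criterion maps an attribute to its set of allowed values, and a document qualifies only when every criterion is satisfied.

```python
def matches(metadata, criteria):
    """Return True when every attribute in criteria has an allowed value."""
    return all(metadata.get(attr) in allowed for attr, allowed in criteria.items())


def apply_filter(documents, criteria):
    """Keep only the documents whose metadata satisfies all criteria."""
    return [doc for doc in documents if matches(doc["metadata"], criteria)]


# Hypothetical sample collection of unstructured documents with metadata.
documents = [
    {"text": "Checkout was slow.", "metadata": {"STORE": "XYZ", "MONTH": "FEB 2012"}},
    {"text": "Friendly staff.", "metadata": {"STORE": "PQR", "MONTH": "FEB 2012"}},
    {"text": "Great prices.", "metadata": {"STORE": "XYZ", "MONTH": "JAN 2012"}},
]

# STORE=[XYZ] returns only documents whose STORE attribute is XYZ.
store_xyz = apply_filter(documents, {"STORE": {"XYZ"}})
```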
Using the criteria identified in block 220, a query may be performed in block 235 to retrieve unstructured text associated with the specified criteria from the entire set of available text documents 230.
From this set of text documents 235, qualifying features may be identified, aggregated and quantified in terms of volume, sentiment, customer satisfaction and any other metric used for analysis in block 250.
In block 225, a comparative observation may be determined to use as a baseline, and isolating and quantifying factors may be determined for the comparative observation. As a first step, criteria required to isolate the comparative observation may be identified. Criteria include any filters applied in the analysis or any report elements that define the selected data point. For example, for a monthly trend report where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria to isolate the comparison observation may include MONTH=[JAN 2012] AND STORE=[XYZ]. Details on the steps required to identify a comparison observation are described in subsequent embodiments below.
The baseline may be a system-identified comparison set. The baseline may be defined by filters, a set of one or more categorical values, a time period, and a numerical value associated with the baseline when applying the metric used for measuring the observation. For a trended report, the baseline may be the time period prior to the time period of the selected observation. For a non-trended report, the baseline may be based on all data elements other than the selected observation or on customers that scored high or low on a customer behavior metric.
Using criteria identified in block 225, a query may be performed to retrieve all unstructured text associated with the comparison criteria, 240, from the entire set of available text documents 230.
From this set of text documents, 240, qualifying features may be identified, aggregated and quantified in terms of volume, sentiment, customer satisfaction and any other metric used for analysis in block 255. The features identified may include words, word relationships (e.g. a pair of syntactically linked words), topics of discussion, and structured data associated with each document, including locations, products, and document categories. The features may be identified by a natural language processing engine that supports sentence, clause, and word parsing, syntactic parsing to determine word relationships, named entity recognition, and topic categorization.
Block 260 describes a comparison step between features, and associated aggregated measures, present in the user selected observation and the features, and associated aggregated measures, present in the comparison observation.
Block 265 may rank features from the comparison step 260. In one embodiment of step 265 wherein the selected observation 215 is based on a volume measure, the ranking mechanism may use the following computation for calculating the impact of a certain feature Fx:
where:
For volume based metrics, irrelevant features may be removed by comparing the ratio of the volume of a feature in the baseline to the volume of the baseline to the ratio of volume of a feature in the observation to the volume of the observation as shown below:
In another embodiment of step 265, if the metric used for the observation is a customer behavior based metric, such as sentiment or customer satisfaction score, and the analysis is on a trended report, then the impact of any one feature Fx can be calculated using the formula below which calculates a significance score:
where:
For customer behavior based metrics in trend reports, irrelevant features may be discarded by comparing a metric value of a feature in the baseline to a metric value of a feature in the observation as shown below:
In another embodiment of step 265, if the metric used for the observation is a customer behavior based metric, such as sentiment or customer satisfaction score, and the analysis is on a non-trended report, then the impact of any one feature Fx can be calculated using the formula below which calculates a significance score:
where:
T1 represents the observation
T0 represents the baseline
For customer behavior based metrics in non-trend reports, irrelevant features may be discarded:
Features 350 represent a set of features that may be used as possible causal factors in explaining an anomaly. These may include document features 352, word features 354, linguistic relationships 356 and sentence categories 358.
From an unstructured text document 310, document metadata or structured data features 352 may be derived.
From an unstructured text document 310, natural language processing steps 315, namely sentence boundary disambiguation and clause detection, may be performed to break down a document into sentence details and, where applicable, into clause details.
Sentence and clause data from step 315 may be further resolved into words using a tokenization step 320. Words may be assigned a part of speech (i.e. grammatical role in the sentence or clause).
From the word data of step 320, word features 354 may be derived. These may be key words performing grammatical roles such as nouns, verbs, adjectives or adverbs.
Between words extracted from step 320, grammatical relationships may be established using grammatical parsing 325.
From grammatical relationships from step 325, relevant relationships are selected as linguistic relationship features in step 356. Relevant relationships include but are not limited to adjective-noun relationships, verb-noun relationships and noun-noun relationships.
Using a predefined categorization model 330, sentences from step 315 may be mapped to a category topic in step 335. The mapping may be applied using predefined mapping rules that are part of model 330 or may be applied using a trained machine learning model. The sentence categories may be extracted as features in step 358.
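The feature extraction flow described above may be illustrated, in a deliberately simplified and non-limiting form, by the following Python sketch. It uses only the standard library and substitutes simple stand-ins for the linguistic steps: a regular expression replaces sentence boundary disambiguation, a small stop-word list replaces part-of-speech selection of key words, adjacent content-word pairs replace grammatical parsing, and keyword rules replace categorization model 330. The names `STOPWORDS`, `CATEGORY_RULES` and `extract_features` are hypothetical.

```python
import re

# Assumed stand-ins for part-of-speech filtering and for categorization model 330.
STOPWORDS = {"the", "a", "an", "was", "is", "and", "but", "were", "to", "of"}
CATEGORY_RULES = {
    "level of service": {"staff", "service"},
    "pricing": {"price", "prices"},
}


def extract_features(text, metadata):
    """Derive document, word, relationship and category features from one document."""
    features = {
        "document": {f"{key}={value}" for key, value in metadata.items()},  # cf. 352
        "word": set(),          # cf. 354
        "relationship": set(),  # cf. 356
        "category": set(),      # cf. 358
    }
    # Simplified sentence splitting (cf. 315).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())  # tokenization (cf. 320)
        content = [t for t in tokens if t not in STOPWORDS]
        features["word"].update(content)
        # Adjacent content words approximate linguistic relationships (cf. 325).
        features["relationship"].update(zip(content, content[1:]))
        # Keyword rules map the sentence to category topics (cf. 335).
        for category, keywords in CATEGORY_RULES.items():
            if keywords & set(tokens):
                features["category"].add(category)
    return features
```

A production system would be expected to replace each stand-in with the corresponding natural language processing component; the sketch only shows how the four feature types may be collected side by side.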
In some embodiments of the apparatus, a user may initiate discovery of the root cause behind a specific observation in a time-trended report.
In some embodiments of the apparatus, a user may initiate discovery of the root cause behind the normalized volume of an observation in a non-trended report.
In some embodiments of the apparatus, a user may initiate discovery of the root cause behind the sentiment of an observation in a non-trended report.
In some embodiments of the apparatus, a user may initiate discovery of the root cause behind the satisfaction score of an observation in a non-trended report.
In block 420, criteria relating to any entities that pertain to observation 415 may be identified. For example, for a monthly trend report of a sentiment measure on “level of service” where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria to isolate the selected entity for the observation may include STORE=[XYZ].
In block 425, criteria relating to time factors of observation 415 may be identified. For example, for a monthly trend report of a sentiment measure on “level of service” where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria to isolate the selected time window for the observation may include MONTH=[FEB 2012].
In block 435, criteria relating to a suitable comparison baseline against observation 415 may be identified. In some embodiments, this may be the immediate previous time window prior to the selected observation. For example, for a monthly trend report of a sentiment measure on “level of service” where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria to isolate the baseline time window may include MONTH=[JAN 2012].
In block 430, criteria relating to filters applied to report 410 may be identified. For example, for a monthly trend report of a sentiment measure on “level of service” where a user observes that sentiment for Store XYZ has dropped in February 2012, criteria reflecting the report filter may include CATEGORY=[“level of service”].
In order to identify sentences that are impacted by the selected observation 455, the apparatus may create a conjunction of criteria from 420, 425 and 430.
In order to identify sentences that are impacted by the comparison observation 460, the apparatus may create a conjunction of criteria from 420, 430 and 435.
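For illustration only, the two conjunctions for a trended report may be sketched in Python as shown below. The dictionary representation of criteria and the helper names `conjunction` and `select` are assumptions. The sketch further assumes that the criteria sets being combined constrain different attributes, so that merging them is equivalent to a logical AND.

```python
def conjunction(*criteria_sets):
    """Merge criteria sets; assumes each set constrains different attributes."""
    combined = {}
    for criteria in criteria_sets:
        combined.update(criteria)
    return combined


def select(sentences, criteria):
    """Return the sentences whose attributes satisfy every criterion."""
    return [s for s in sentences if all(s.get(k) == v for k, v in criteria.items())]


entity_criteria = {"STORE": "XYZ"}                   # block 420
time_criteria = {"MONTH": "FEB 2012"}                # block 425
filter_criteria = {"CATEGORY": "level of service"}   # block 430
baseline_time_criteria = {"MONTH": "JAN 2012"}       # block 435

# Selected observation: criteria from 420, 425 and 430.
observation_criteria = conjunction(entity_criteria, time_criteria, filter_criteria)
# Comparison observation: criteria from 420, 430 and 435.
baseline_criteria = conjunction(entity_criteria, filter_criteria, baseline_time_criteria)
```

The entity and filter criteria are shared by both conjunctions, so the two sentence sets differ only in the time window.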
In the case of a volume based trend report, in which the observed anomaly is an upward spike in volume, the criteria identified from the comparison may be those factors that demonstrate a statistical tendency to be more prevalent in the anomaly when compared with the prior observation.
In the case of a volume based trend report, in which the observed anomaly is a downward drop in volume, the criteria identified from the comparison may be those factors that demonstrate a statistical tendency to be more prevalent in the prior observation when compared against the anomaly.
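The direction-dependent selection described for volume based trend reports may be sketched as follows. This is a simplified, non-limiting example: a plain difference between each feature's share of the observation volume and its share of the baseline volume stands in for the statistical measure, and the function names and sample counts are hypothetical.

```python
def prevalence(feature_counts, total):
    """Share of the total volume in which each feature appears."""
    if not total:
        return {}
    return {feature: count / total for feature, count in feature_counts.items()}


def volume_candidates(obs_counts, obs_total, base_counts, base_total, direction):
    """Rank features that are more prevalent on the side implied by the anomaly.

    direction is "spike" (favor the anomaly) or "drop" (favor the prior observation).
    """
    obs = prevalence(obs_counts, obs_total)
    base = prevalence(base_counts, base_total)
    candidates = {}
    for feature in set(obs) | set(base):
        diff = obs.get(feature, 0.0) - base.get(feature, 0.0)
        if direction == "drop":
            diff = -diff
        if diff > 0:  # keep only features leaning the relevant way
            candidates[feature] = diff
    return sorted(candidates, key=candidates.get, reverse=True)


# Hypothetical counts: 100 comments in each period.
observation = {"long wait": 30, "friendly staff": 10}
baseline = {"long wait": 5, "friendly staff": 20}
spike_factors = volume_candidates(observation, 100, baseline, 100, "spike")
drop_factors = volume_candidates(observation, 100, baseline, 100, "drop")
```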
In the case of a sentiment based trend report, in which the observed anomaly is an increase in sentiment, the criteria identified from the comparison may be those factors that by themselves show an increase in sentiment from the prior observation to the selected observation.
In the case of a sentiment based trend report, in which the observed anomaly is a decrease in sentiment, the criteria identified from the comparison may be those factors that by themselves show a decrease in sentiment from the prior observation to the selected observation.
In the case of a satisfaction measure based trend report, in which the observed anomaly is an increase in satisfaction, the criteria identified from the comparison may be those factors that by themselves show an increase in satisfaction from the prior observation to the selected observation.
In the case of a satisfaction measure based trend report, in which the observed anomaly is a decrease in satisfaction, the criteria identified from the comparison may be those factors that by themselves show a decrease in satisfaction from the prior observation to the selected observation.
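For sentiment and satisfaction based trend reports, the selection of factors that move in the same direction as the anomaly may be sketched as shown below. The function name, the per-feature average metric dictionaries, and the ordering by magnitude of change are assumptions made for illustration and do not represent the significance score of any particular embodiment.

```python
def behavior_candidates(obs_metric, base_metric, direction):
    """Return features whose own metric moved in the direction of the anomaly.

    obs_metric and base_metric map each feature to its average metric value
    (e.g. average sentiment); direction is "increase" or "decrease".
    """
    sign = 1 if direction == "increase" else -1
    changes = {}
    # Only features present in both periods can show a change.
    for feature in set(obs_metric) & set(base_metric):
        change = obs_metric[feature] - base_metric[feature]
        if sign * change > 0:
            changes[feature] = change
    # Assumed ordering: largest absolute change first.
    return sorted(changes, key=lambda f: abs(changes[f]), reverse=True)


# Hypothetical average sentiment per feature in each period.
selected = {"checkout": -1.5, "parking": 0.5, "staff": 1.0}
prior = {"checkout": 0.5, "parking": 0.25, "staff": 1.5}
decrease_factors = behavior_candidates(selected, prior, "decrease")
increase_factors = behavior_candidates(selected, prior, "increase")
```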
In block 520, criteria relating to any entities that pertain to observation 515 are identified. For example, for a volume report of all stores in region ABC in which a user selects an observation about volume for store XYZ, criteria to isolate the selected entity for the observation may include STORE=[XYZ].
In block 530, criteria relating to a suitable comparison baseline against observation 515 are identified. In some embodiments, this may be all other observations on the report other than the selected observation. For example, for a volume report of all stores in region ABC in which a user selects an observation about volume for store XYZ, criteria to isolate the comparison baseline may include STORE=[PQR, STU], where PQR and STU are other stores in region ABC.
In block 525, criteria relating to filters applied to report 510 are identified. For example, for a volume report of all stores in region ABC in which a user selects an observation about volume for store XYZ, criteria reflecting the report filter may include REGION=[ABC].
In order to identify sentences that are impacted by the selected observation 550, the apparatus creates a conjunction of criteria from 520 and 525.
In order to identify sentences that are impacted by the comparison observation 555, the apparatus creates a conjunction of criteria from 525 and 530.
In block 635, criteria relating to any entities that pertain to observation 610 are identified. For example, for a sentiment report of all stores in region ABC in which a user selects an observation about sentiment for store XYZ, criteria to isolate the selected entity for the observation may include STORE=[XYZ].
In block 640, criteria relating to filters applied to report 610 are identified. For example, for a sentiment report of all stores in region ABC in which a user selects an observation about sentiment for store XYZ, criteria reflecting the report filter may include REGION=[ABC].
In block 620, criteria relating to any entities that pertain to observation 610 are identified where the sentiment of the observation is negative (e.g. bad). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about negative sentiment for store XYZ, criteria to isolate the selected entity for the observation may include SENTIMENT=[negative].
In block 620, criteria relating to a suitable comparison baseline against observation 610 are also identified where the sentiment of the observation is negative (e.g. bad). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about negative sentiment for store XYZ, criteria to isolate the comparison baseline may include SENTIMENT=[positive].
In block 625, criteria relating to any entities that pertain to observation 610 are identified where the sentiment of the observation is positive (e.g. good). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about positive sentiment for store PQR, criteria to isolate the selected entity for the observation may include SENTIMENT=[positive].
In block 625, criteria relating to a suitable comparison baseline against observation 610 are also identified where the sentiment of the observation is positive (e.g. good). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about positive sentiment for store PQR, criteria to isolate the comparison baseline may include SENTIMENT=[negative].
In block 630, criteria relating to any entities that pertain to observation 610 are identified where the sentiment of the observation is neutral (i.e. neither good nor bad). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about neutral sentiment for store STU, criteria to isolate the selected entity for the observation may include SENTIMENT=[positive or negative].
In block 630, criteria relating to a suitable comparison baseline against observation 610 are also identified where the sentiment of the observation is neutral (i.e. neither good nor bad). For example, for a sentiment report of all stores in region ABC in which a user selects an observation about neutral sentiment for store STU, criteria to isolate the comparison baseline may include SENTIMENT=[neutral].
In order to identify sentences that are impacted by the selected observation 660, the apparatus creates a conjunction of criteria from 635, 640 and criteria 3 from one of 620, 625 or 630, depending on whether the selected observation pertains to negative, positive or neutral sentiment, respectively.
In order to identify sentences that are impacted by the comparison or baseline observation 665, the apparatus creates a conjunction of criteria from 635, 640 and criteria 4 from one of 620, 625 or 630, depending on whether the selected observation pertains to negative, positive or neutral sentiment, respectively.
In the case of the observed anomaly being an overall negative sentiment expressed for a concept, then first all expressions of negative sentiment may be isolated, all other criteria remaining the same, then all expressions of positive sentiment may be isolated, all other criteria remaining the same. The aggregated features from the negative expressions may be statistically compared with the aggregated features from the positive expressions across a set of comparison criteria.
In the case of the observed anomaly being an overall positive sentiment expressed for a concept, then first all expressions of positive sentiment may be isolated, all other criteria remaining the same, then all expressions of negative sentiment may be isolated, all other criteria remaining the same. The aggregated features from the positive expressions may be statistically compared with the aggregated features from the negative expressions across a set of comparison criteria.
In the case of the observed anomaly being an overall neutral sentiment expressed for a concept, then first all expressions of positive or negative sentiment may be isolated, all other criteria remaining the same, then all expressions of neutral sentiment may be isolated, all other criteria remaining the same. The aggregated features from the negative or positive expressions may be statistically compared with the aggregated features from the neutral expressions across a set of comparison criteria.
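The three sentiment cases above may be illustrated by the following non-limiting Python sketch, which isolates the two groups of expressions and aggregates their features for comparison. The sentence layout (a dictionary with `sentiment` and `features` keys), the table `POLARITY_GROUPS`, and the function names are assumptions introduced for the example. The statistical comparison itself is not shown.

```python
from collections import Counter

# For each anomaly polarity: (polarities isolated first, polarities isolated second),
# following the order in which the three cases are described.
POLARITY_GROUPS = {
    "negative": ({"negative"}, {"positive"}),
    "positive": ({"positive"}, {"negative"}),
    "neutral": ({"positive", "negative"}, {"neutral"}),
}


def split_by_polarity(sentences, anomaly_polarity):
    """Isolate the two groups of expressions to be compared."""
    first, second = POLARITY_GROUPS[anomaly_polarity]
    first_group = [s for s in sentences if s["sentiment"] in first]
    second_group = [s for s in sentences if s["sentiment"] in second]
    return first_group, second_group


def feature_counts(sentences):
    """Aggregate how often each feature occurs in a group of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence["features"])
    return counts
```

The two resulting counters can then be passed to whichever comparison and ranking measure a given embodiment uses.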
In block 735, criteria relating to any entities that pertain to observation 710 are identified. For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about satisfaction score for store XYZ, criteria to isolate the selected entity for the observation may include STORE=[XYZ].
In block 740, criteria relating to filters applied to report 710 are identified. For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about satisfaction score for store XYZ, criteria reflecting the filter applied to the report may include REGION=[ABC].
In block 720, criteria relating to any entities that pertain to observation 710 are identified where the satisfaction score of the observation is poor (i.e. bad). For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about poor satisfaction score for store XYZ, criteria to isolate the selected entity for the observation may include SATISFACTION=[poor].
In block 720, criteria relating to a suitable comparison baseline against observation 710 are identified where the satisfaction score of the observation is poor (i.e. bad). For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about poor satisfaction score for store XYZ, criteria to isolate the comparison baseline for the observation may include SATISFACTION=[high].
In block 725, criteria relating to any entities that pertain to observation 710 are identified where the satisfaction of the observation is high (i.e. good). For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about high satisfaction for store PQR, criteria to isolate the selected entity for the observation may include SATISFACTION=[high].
In block 725, criteria relating to a suitable comparison baseline against observation 710 are identified where the satisfaction of the observation is high (i.e. good). For example, for a satisfaction score report of all stores in region ABC in which a user selects an observation about high satisfaction for store PQR, criteria to isolate the comparison baseline for the observation may include SATISFACTION=[poor].
To identify sentences that are impacted by the selected observation 760, the apparatus creates a conjunction of the criteria from 735 and 740 and criteria 3 from one of 720 or 725, depending on whether the selected observation pertains to poor or high satisfaction, respectively.
To identify sentences that are impacted by the comparison or baseline observation 765, the apparatus creates a conjunction of the criteria from 735 and 740 and criteria 4 from one of 720 or 725, depending on whether the selected observation pertains to poor or high satisfaction, respectively.
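By way of non-limiting illustration, the store example above may be sketched as two sets of criteria that share the entity and report-filter criteria and differ only in the satisfaction criterion. The function names and the assumption that the baseline is the opposite satisfaction level are hypothetical and introduced only for this sketch.

```python
from typing import Dict, Tuple

Criteria = Dict[str, str]

# Assumed pairing of a selected satisfaction level with its baseline level.
OPPOSITE = {"poor": "high", "high": "poor"}


def build_criteria(store: str, region: str, satisfaction: str) -> Tuple[Criteria, Criteria]:
    """Return (selected, baseline) criteria for a store observation in a region report."""
    shared = {"STORE": store, "REGION": region}
    selected = {**shared, "SATISFACTION": satisfaction}
    baseline = {**shared, "SATISFACTION": OPPOSITE[satisfaction]}
    return selected, baseline


def matches(record: Dict[str, str], criteria: Criteria) -> bool:
    """A record satisfies the conjunction when every criterion matches."""
    return all(record.get(field) == value for field, value in criteria.items())


def render(criteria: Criteria) -> str:
    """Render criteria in the FIELD=[value] notation used in the description."""
    return " AND ".join(f"{field}=[{value}]" for field, value in criteria.items())


selected, baseline = build_criteria("XYZ", "ABC", "poor")
print(render(selected))  # STORE=[XYZ] AND REGION=[ABC] AND SATISFACTION=[poor]
print(render(baseline))  # STORE=[XYZ] AND REGION=[ABC] AND SATISFACTION=[high]
```

Sentences whose records satisfy the first conjunction correspond to the selected observation, and those satisfying the second correspond to the comparison baseline.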
Where the observed anomaly is an overall negative satisfaction expressed for a concept, all expressions of negative satisfaction may first be isolated, all other criteria remaining the same, and then all expressions of positive satisfaction may be isolated, all other criteria remaining the same. The aggregated features from the negative expressions may then be statistically compared with the aggregated features from the positive expressions across a set of comparison criteria.
Where the observed anomaly is an overall positive satisfaction expressed for a concept, all expressions of positive satisfaction may first be isolated, all other criteria remaining the same, and then all expressions of negative satisfaction may be isolated, all other criteria remaining the same. The aggregated features from the positive expressions may then be statistically compared with the aggregated features from the negative expressions across a set of comparison criteria.
Bus 1310 may include one or more interconnects that permit communication among the components of computing device 1300. Processor 1320 may include any type of processor, microprocessor, or processing logic that may interpret and execute instructions (e.g., a field programmable gate array (FPGA)). Processor 1320 may include a single device (e.g., a single core) and/or a group of devices (e.g., multi-core). Memory 1330 may include a random access memory (RAM) or another type of dynamic storage device that may store information and instructions for execution by processor 1320. Memory 1330 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 1320.
ROM 1340 may include a ROM device and/or another type of static storage device that may store static information and instructions for processor 1320. Storage device 1350 may include a magnetic disk and/or optical disk and its corresponding drive for storing information and/or instructions. Storage device 1350 may include a single storage device or multiple storage devices, such as multiple storage devices operating in parallel. Moreover, storage device 1350 may reside locally on the computing device 1300 and/or may be remote with respect to a server and connected thereto via network and/or another type of connection, such as a dedicated link or channel.
Input device 1360 may include any mechanism or combination of mechanisms that permit an operator to input information to computing device 1300, such as a keyboard, a mouse, a touch sensitive display device, a microphone, a pen-based pointing device, and/or a biometric input device, such as a voice recognition device and/or a fingerprint scanning device. Output device 1370 may include any mechanism or combination of mechanisms that outputs information to the operator, including a display, a printer, a speaker, etc.
Communication interface 1380 may include any transceiver-like mechanism that enables computing device 1300 to communicate with other devices and/or systems, such as a client, a server, a license manager, a vendor, etc. For example, communication interface 1380 may include one or more interfaces, such as a first interface coupled to a network and/or a second interface coupled to a license manager. Alternatively, communication interface 1380 may include other mechanisms (e.g., a wireless interface) for communicating via a network, such as a wireless network. In one implementation, communication interface 1380 may include logic to send code to a destination device, such as a target device that can include general purpose hardware (e.g., a personal computer form factor), dedicated hardware (e.g., a digital signal processing (DSP) device adapted to execute a compiled version of a model or a part of a model), etc.
Computing device 1300 may perform certain functions in response to processor 1320 executing software instructions contained in a computer-readable medium, such as memory 1330. In alternative embodiments, hardwired circuitry may be used in place of or in combination with software instructions to implement features consistent with principles of the invention. Thus, implementations consistent with principles of the invention are not limited to any specific combination of hardware circuitry and software.
Exemplary embodiments may be embodied in many different ways as a software component. For example, an embodiment may be a stand-alone software package, a combination of software packages, or a software package incorporated as a “tool” in a larger software product. It may be downloadable from a network, for example, a website, as a stand-alone product or as an add-in package for installation in an existing software application. It may also be available as a client-server software application, or as a web-enabled software application. It may also be embodied as a software package installed on a hardware device.
Numerous specific details have been set forth to provide a thorough understanding of the embodiments. It will be understood, however, that the embodiments may be practiced without these specific details. In other instances, well-known operations, components and circuits have not been described in detail so as not to obscure the embodiments. It can be appreciated that the specific structural and functional details are representative and do not necessarily limit the scope of the embodiments.
It is worth noting that any reference to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in the specification are not necessarily all referring to the same embodiment.
Although some embodiments may be illustrated and described as comprising exemplary functional components or modules performing various operations, it can be appreciated that such components or modules may be implemented by one or more hardware components, software components, and/or combination thereof. The functional components and/or modules may be implemented, for example, by logic (e.g., instructions, data, and/or code) to be executed by a logic device (e.g., processor). Such logic may be stored internally or externally to a logic device on one or more types of computer-readable storage media.
Some embodiments may comprise an article of manufacture. An article of manufacture may comprise a storage medium to store logic. Examples of a storage medium may include one or more types of computer-readable storage media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of storage media include hard drives, disk drives, solid state drives, and any other tangible storage media.
As will be appreciated by one of skill in the art, aspects of the present invention may be embodied as a method, data processing system, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment or an embodiment combining software and hardware aspects, all generally referred to herein as a system. Furthermore, elements of the present invention may take the form of a computer program product on a computer-usable storage medium having computer-usable program code embodied in the medium. Any suitable computer readable medium may be utilized, including hard disks, CD-ROMs, optical storage devices, flash RAM, transmission media such as those supporting the Internet or an intranet, or magnetic storage devices.
Computer program code for carrying out operations of the present invention may be written in an object oriented programming language such as JAVA, C#, Smalltalk or C++, or in conventional procedural programming languages, such as the Visual Basic or “C” programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer, or partially or entirely on a cloud environment. In the latter scenarios, the remote computer or cloud environments may be connected to the user's computer through a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, systems and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, server, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, server or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks, and may operate alone or in conjunction with additional hardware apparatus.
It also is to be appreciated that the described embodiments illustrate exemplary implementations, and that the functional components and/or modules may be implemented in various other ways which are consistent with the described embodiments. Furthermore, the operations performed by such components or modules may be combined and/or separated for a given implementation and may be performed by a greater number or fewer number of components or modules.
Some of the figures may include a flow diagram. Although such figures may include a particular logic flow, it can be appreciated that the logic flow merely provides an exemplary implementation of the general functionality. Further, the logic flow does not necessarily have to be executed in the order presented unless otherwise indicated. In addition, the logic flow may be implemented by a hardware element, a software element executed by a processor, or any combination thereof.
Although the foregoing description is directed to the preferred embodiments of the invention, it is noted that other variations and modifications will be apparent to those skilled in the art, and may be made without departing from the spirit or scope of the invention. Moreover, features described in connection with one embodiment of the invention may be used in conjunction with other embodiments, even if not explicitly stated above.
This application claims the benefit of U.S. Provisional Patent Application No. 61/606,025, filed Mar. 2, 2012, and also claims the benefit of U.S. Provisional Patent Application No. 61/606,021, filed Mar. 2, 2012, the contents of each of which are hereby incorporated by reference herein in their entirety.
| Number | Date | Country |
|---|---|---|---|
| 20130231920 A1 | Sep 2013 | US |
| Number | Date | Country |
|---|---|---|---|
| 61606025 | Mar 2012 | US | |
| 61606021 | Mar 2012 | US |