The present disclosure relates to automated question answering (QA) systems and methods, and more specifically, to such QA methods and systems that score evidence sources used to support candidate answers.
Most existing factoid question answering systems adopt search strategies and scoring methods with the assumption that a short passage exists in the reference corpus which contains sufficient information to answer each question. This assumption largely holds true for short and focused factoid questions such as “When did Hawaii become a state?” and “What strait separates North America from Asia?” However, some more complex factoid questions contain facts encompassing multiple facets of the answer, which often cannot be found together in a short text passage. Consider the following examples:
(1) Born in 1956, this Swedish tennis player won 6 French Opens & 5 straight Wimbledons (A: Björn Borg); and
(2) A small- and medium-vessel vasculitis characterized by allergic rhinitis, asthma, and peripheral eosinophilia (A: Churg-Strauss Syndrome).
In both examples, information presented in the question can reasonably be expected to be in documents that describe the respective answer entities. However, it is quite unlikely that all the information will be present in one or two adjacent sentences in the document. More specifically, in example (1), birth year and nationality information is in the basic biographic section of documents about Björn Borg, while statistics about his tennis record can generally be found in a section about Borg's career. Similarly for example (2), being a small/medium-vessel vasculitis can generally be found in an initial definitional sentence about Churg-Strauss Syndrome, while the rest of the question naturally falls under a section that describes typical symptoms of the syndrome. As a result, a typical passage retrieved from most reference corpora would cover only a portion of the facts given in the question.
These multi-faceted factoid questions present a challenge for existing question answering systems which make the aforementioned assumption. Consider the following short passages relevant to the question in example (2):
(2.1a) Churg-Strauss syndrome is a medium and small vessel autoimmune vasculitis, leading to necrosis.
(2.1b) The Churg-Strauss syndrome is a multisystem disorder characterized by chronic rhinosinusitis, asthma, and prominent peripheral blood eosinophilia.
(2.2a) Wegener's granulomatosis is an incurable form of vasculitis that affects small and medium-size blood vessels.
(2.2b) Wegener granulomatosis is a rare multisystem autoimmune disease of unknown etiology. Its hallmark features include necrotizing granulomatous inflammation and pauci-immune vasculitis in small- and medium-sized blood vessels.
Existing systems that evaluate each passage separately against the question would view each passage as having a similar degree of support for either Churg-Strauss syndrome or Wegener's granulomatosis as the answer to the question. However, these systems lose sight of the fact that even though each passage covers half of the facts in the question, (2.1a) and (2.1b) cover disjoint subsets of the facts, while (2.2a) and (2.2b) address the same set of facts.
Exemplary embodiments herein include methods and/or systems that receive a question (comprising question terms) into a user interface of a computerized device. Such methods/systems automatically search sources of data containing passages using a processor of the computerized device to produce candidate answers to the question. The searching is based on the question terms, and the searching identifies passages that support each of the candidate answers based on scoring features that indicate whether the candidate answers are correct answers to the question.
These methods/systems also automatically create a scoring feature-specific matrix for each scoring feature using the processor. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of text passage and question term supports the candidate answers as being a correct answer to the question. Thus, each different combination of text passage and question term makes up a vector having a vector value equal to that maintained in the score field. Next, such methods/systems automatically combine multiple such vectors using the processor to produce a combined vector score for each of the candidate answers, and then rank the candidate answers based on the combined scores using the processor. Other exemplary methods/systems herein similarly receive a question comprising question terms into a user interface of a computerized device. Such methods/systems also automatically search sources of data containing passages (e.g., text passages) using a processor of the computerized device to produce candidate answers to the question. The searching is based on hypotheses generated from the question terms, and the searching identifies sources of evidence that support each of the candidate answers based on scoring features that indicate whether the candidate answers are correct answers to the question. Thus, the scoring features define the types of analysis that are used to determine whether the candidate answers are correct. The results of the search can be transformed by using a transformation function that takes an input value and information that provides context for the input value, and produces a transformed score value.
Such methods/systems automatically create a scoring feature-specific matrix for each scoring feature using the processor. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of passage and question term supports the candidate answers as being a correct answer to the question (with respect to the corresponding scoring feature of the matrix). The different combinations of passages and question terms thereby form vectors, and each vector has a vector value equal to the corresponding score value.
The systems/methods herein automatically combine (collapse) the vectors using the processor by calculating a statistical measure of the vectors (e.g., a collapsing function, such as calculating a maximum, a minimum, a sum, a mean, a median, and/or a standard-deviation of the vectors) to produce a collapsed score for each of the question terms. Further, the systems/methods herein automatically combine the collapsed score for each question term using the processor to produce a combined score for each of the candidate answers. Then the systems and methods herein can automatically rank the candidate answers based on the combined score for each candidate answer using the processor.
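The collapse-then-combine step described above can be sketched as follows. This is a minimal illustration, assuming a score matrix stored as a list of rows (one row per passage, one column per question term). The names `collapse_per_term`, `combine`, and `rank_candidates`, the default choice of maximum followed by sum, and the sample scores are all hypothetical and are not taken from the disclosure.

```python
from statistics import mean, median, pstdev

# Collapsing functions named in the text; each maps a vector to one number.
# Using the population standard deviation is an assumption of this sketch.
COLLAPSERS = {"max": max, "min": min, "sum": sum,
              "mean": mean, "median": median, "std": pstdev}

def collapse_per_term(matrix, collapser):
    """Collapse each question-term column (across passages) to one score."""
    return [collapser(column) for column in zip(*matrix)]

def combine(term_scores, combiner=sum):
    """Combine the collapsed per-term scores into one score for a candidate."""
    return combiner(term_scores)

def rank_candidates(matrices, collapser=max, combiner=sum):
    """Return (candidate, combined score) pairs, best candidate first."""
    scored = {name: combine(collapse_per_term(m, collapser), combiner)
              for name, m in matrices.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: 2 passages x 3 question terms per candidate answer.
matrices = {
    "answer_a": [[0.9, 0.0, 0.1], [0.0, 0.8, 0.2]],  # passages cover different terms
    "answer_b": [[0.9, 0.0, 0.1], [0.8, 0.0, 0.0]],  # passages cover the same term
}
ranking = rank_candidates(matrices)
print(ranking)
```

In this sketch the candidate whose passages cover different question terms ranks first, which is the behavior the per-term collapse is intended to enable.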
The embodiments herein will be better understood from the following detailed description with reference to the drawings, which are not necessarily drawn to scale and in which:
As mentioned above, existing question answering systems evaluate each passage separately against a question. In view of this, the systems/methods herein provide the notion of multi-dimensional feature merger (or MDM features), which allows passage scoring results to be combined across different dimensions, such as question segments and different passage scoring methods. In this example, MDM features that combine results across question segments would capture the broader coverage of passages (2.1a) and (2.1b) mentioned in the Background section above, and thus enable the system to recognize Churg-Strauss Syndrome as a better answer for the question than Wegener's granulomatosis.
The systems and methods herein provide a general-purpose MDM feature-merging framework that can be adopted in question answering systems that evaluate candidate answers by matching candidate-bearing passages against the question. The systems and methods herein merge passage scoring results across various dimensions, resulting in 1) new features that are more highly correlated with correct answers than the base features from which they were derived, 2) significant component level performance improvement and 3) end-to-end performance improvement.
For example, given a question, Q, each of its candidate answers, CA, has a set of supporting passages. In one type of question-answering system, support of each passage for a candidate answer is quantified. Then a merging strategy is used to combine the support of all passages for a particular candidate answer. The systems and methods herein provide a general framework for merging support from supporting passages.
The methodology of calculating the support of a passage for a candidate answer is called passage scoring. At an abstract level, a passage scorer is responsible for quantifying how well a passage matches a question. The systems and methods herein represent a question and a passage as ordered sets of terms, Q={q1, q2, . . . , qn} and P={p1, p2, . . . , pm}, respectively. Passage scorers align question terms to passage terms and assign a score to how well the terms align. For example, a passage scorer will take as input Q and P and output a vector of scores that represents how well the passage matches the question. The systems and methods herein denote this vector for P as $\vec{f}$, such that $f_i$ is the score of how well one of the passage terms matches the ith term in the question. Note that the length of this vector is fixed per question but may vary across questions.
The systems and methods herein collect all these vectors per question, per candidate answer into a matrix, M. For example, CA1 may be represented as a matrix where row i corresponds to the passage scoring vector for passage Pi. An element of this matrix, mi,j, is the score assigned by one of the passage scorers for how well passage Pi aligns with term j in the question Q.
This matrix has variable dimensions across candidate answers. The number of rows can differ because the number of supporting passages can differ among the candidate answers for a question. Because different questions have different numbers of question terms, the number of columns can differ for candidate answers across questions. Therefore, the systems and methods herein cannot capture the distribution of this matrix simply by linearizing the matrix.
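As an illustration of how such a matrix might be assembled, the sketch below uses a deliberately simple term-match scorer (1.0 when a question term occurs in the passage, 0.0 otherwise). The scorer, the tokenizer, the function names, and the question terms are assumptions made for this example; the two passages are the African Elephant passages quoted later in this description.

```python
import re

def tokenize(text):
    """Lowercase a passage and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_match_vector(question_terms, passage):
    """One score per question term: 1.0 if the term occurs in the passage."""
    tokens = set(tokenize(passage))
    return [1.0 if term.lower() in tokens else 0.0 for term in question_terms]

def build_matrix(question_terms, passages):
    """Row i is the scoring vector for passage i; column j is question term j."""
    return [term_match_vector(question_terms, p) for p in passages]

# Hypothetical question terms (the question itself is not reproduced here).
question_terms = ["large", "land", "animal", "ears"]
passages = [
    "The African Elephant is a very large land animal.",
    "African elephants have large ears.",
]
M = build_matrix(question_terms, passages)
print(M)
```

The number of rows follows the number of passages and the number of columns follows the number of question terms, which is why the matrix shape varies between candidate answers and between questions.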
The systems and methods herein define a function $f: M \rightarrow \mathbb{R}^{N}$ that maps each matrix into a feature vector of fixed length. This function is defined as follows:
$$f(M) = \langle g(M),\ g(M') \rangle$$
where $M'$ is the transpose of matrix $M$ and $g$ is a function $g: M \rightarrow \mathbb{R}^{N/2}$ that maps a matrix into a feature vector of fixed length, defined as follows:
$$g(M) = \langle \mathrm{sum}(S),\ \mathrm{avg}(S),\ \mathrm{std}(S),\ \mathrm{max}(S),\ \mathrm{min}(S),\ \mathrm{dim}(S),\ \text{non-zero}(S) \rangle$$
where $S$ is a vector of dimensionality equal to the number of columns in $M$, and each value in $S$ is computed using some function $h(v)$ applied to the corresponding column $v$ in $M$, where $h(v)$ may include sum(v), avg(v), std(v), max(v), min(v), dim(v), and non-zero(v).
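The definitions above can be sketched in a few lines. This is an illustrative reading, assuming a single column function h per call (sum by default), a population standard deviation for std, and N = 14; the function names mirror the symbols f, g, and h but the code itself is not taken from the disclosure.

```python
from statistics import mean, pstdev

def transpose(m):
    """Return the transpose M' of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def g(m, h=sum):
    """Map a matrix of any shape to a fixed-length vector of 7 statistics."""
    s = [h(col) for col in transpose(m)]  # S: one value per column of M
    return [sum(s), mean(s), pstdev(s), max(s), min(s),
            len(s),                       # dim(S)
            sum(1 for x in s if x != 0)]  # non-zero(S)

def f(m, h=sum):
    """f(M) = <g(M), g(M')>: 14 values regardless of the shape of M."""
    return g(m, h) + g(transpose(m), h)

# Hypothetical 2-passage x 4-term matrix and a 3-passage x 2-term matrix.
small = [[1, 1, 1, 0], [1, 0, 0, 1]]
other = [[0.5, 0.0], [0.2, 0.9], [0.0, 0.1]]
print(len(f(small)), len(f(other)))
```

Because both calls return vectors of the same length, matrices of different shapes can be fed to the same downstream classifier.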
Examples of formulae are given in Table 1 (shown in
Consider an example shown in
1. Candidate answer 1: African Elephant
(a) P1.1: The African Elephant is a very large land animal. (b) P1.2: African elephants have large ears.
2. Candidate answer 2: Hippo
(a) P2.1: A hippo is a large land animal. (b) P2.2: Hippos have relatively small ears.
Table 2 (shown in
Using one merging strategy, say SUM-THEN-MAX (in which f(M)=g(M), g(v)=MAX(v), and h(v)=SUM(v)), candidate answer 1, African Elephant, is assigned a feature value equal to SUM-THEN-MAX{(x1+x2+x3+x4+x5+x6), (x7+x8+x9+x10+x11+x12)}. Thus either passage P1.1 or passage P1.2 is selected as the optimal passage. This merger strategy does not attempt to leverage the complementary information in the two passages. In contrast, the merging strategy of the systems and methods herein captures the distribution of alignment across passages.
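The limitation of SUM-THEN-MAX can be seen in a small sketch. The function `max_then_sum`, which collapses across passages per question term before summing, is a hypothetical counterpart chosen for illustration, and the binary score matrices are invented for this example.

```python
def sum_then_max(matrix):
    """Sum the scores within each passage, then keep the best passage."""
    return max(sum(row) for row in matrix)

def max_then_sum(matrix):
    """Keep the best score per question term across passages, then sum."""
    return sum(max(column) for column in zip(*matrix))

# Rows are passages, columns are question terms (hypothetical scores).
complementary = [[1, 1, 0, 0],   # first passage covers terms 1-2
                 [0, 0, 1, 1]]   # second passage covers terms 3-4
redundant = [[1, 1, 0, 0],       # both passages cover terms 1-2 only
             [1, 1, 0, 0]]

print(sum_then_max(complementary), sum_then_max(redundant))
print(max_then_sum(complementary), max_then_sum(redundant))
```

SUM-THEN-MAX gives both candidates the same value, whereas the per-term merger separates the candidate whose passages jointly cover all four terms.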
In the systems and methods herein, the framework is general in three ways: 1) it is independent of the type of passage scorer; 2) more matrix operations (such as rank(M)) may easily be added to the definitions of functions g(M) and h(v); and 3) there is value in considering how all candidates match the question. Using the framework of the systems and methods herein, all that is required is adding a new dimension, depth, to the matrix M, thus giving rise to a 3-D matrix, say M3D. Each two-dimensional M in M3D belongs to a candidate answer. Therefore, the depth of M3D is the number of candidate answers per question.
Question/answer (QA) systems analyze a question and generate multiple possible candidate answers. Then QA systems apply many different answer-scoring methods, each of which produces features that are used to evaluate whether the answer is correct. One way in which question/answer systems evaluate candidate answers is to first retrieve passages of text that contain the candidate answer. Each passage is then scored using a variety of methods called passage scorers. All of the features are sent to a final merging and ranking component, which uses machine-learning techniques to weigh and combine features to produce a single confidence value estimating the probability that the candidate answer is correct. The features the systems and methods herein introduce are extracted and made available to the machine learning model in the final merging and ranking component, where the scores assigned by different passage scorers are available.
The systems and methods herein find candidate answers, employing a variety of methods to compute feature values relating to those answers, and then use a statistical classifier to determine which answers are correct. For a given question Q, search components find a set of candidate answers {CA1, CA2, . . . , CAn}. The task of the classifier is to decide which of the candidate answers is the correct answer. Each candidate answer is associated with one or more passages that contain the answer and that the search component has determined may be relevant to the question. A subset of the methods that compute feature values are the passage scoring components. These components evaluate the evidence that a single passage provides relating to how well the answer satisfies the requirements of the question. Thus among the feature values associated with a candidate answer will be some passage scoring features.
The systems and methods herein have a variety of passage scoring methods that use different strategies for determining which parts of a question to attempt to match to each part of a passage and for determining whether two parts of a passage match. Some attempt to align question terms to passage terms using syntactic structure and/or semantic relations, while others use word order or ignore the relationship among terms completely (e.g., simply counting how many question terms appear in the passage, regardless of whether those terms are similarly arranged).
Many scorers score a passage by comparing it to the question. The scorers leverage available annotation components developed for the question/answer systems framework, such as dependency parsing, named entity recognition, coreference resolution, relation detection, etc. The question and the passage are decomposed into sets of terms, where a term can be a single token, a multiword term, or a proper name. All of these scorers try to determine the amount of overlap between the passage and the question by looking at which terms match. The individual scorers put different restrictions on when a term is considered to match. The passage scores are composed using the inverse document frequencies of the matching terms. The inverse document frequency (idf) is considered to be a measure of a term's information content using the assumption that terms that occur less often are more important. It is computed as
where N is the number of documents of a large corpus and nterm is the number of documents that contain the term.
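The idf formula itself is not reproduced in the text above. The sketch below assumes the common form log(N / nterm), which may differ from the intended formula (for example, in smoothing), and it assumes that a passage score is the sum of the idf values of the matching question terms. The function names and the document counts are hypothetical.

```python
import math

def idf(n_docs, n_docs_with_term):
    """Assumed form: log(N / n_term). Requires n_docs_with_term > 0."""
    return math.log(n_docs / n_docs_with_term)

def idf_weighted_match(question_terms, passage_terms, doc_freq, n_docs):
    """Sum the idf of every question term that also occurs in the passage.

    Summation is an assumption; the text only says the score is 'composed
    using' the idf values of the matching terms.
    """
    passage = set(passage_terms)
    return sum(idf(n_docs, doc_freq[t]) for t in question_terms if t in passage)

# Hypothetical corpus statistics: 1000 documents in total.
doc_freq = {"hawaii": 10, "state": 100, "when": 1000}
score = idf_weighted_match(["when", "hawaii", "state"],
                           ["hawaii", "became", "a", "state"],
                           doc_freq, n_docs=1000)
print(score)
```

Under this assumed form, a term that appears in every document contributes nothing, while rarer terms contribute more, consistent with the stated assumption that rarer terms are more informative.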
Some exemplary scorers used by systems and methods herein include a passage term match scorer that assigns a score based on which question terms are included in the passage, regardless of word order or grammatical relationship; a scorer that assigns a score based on whether pairs of terms that are connected or nearly connected in the syntactic-semantic structure of the question match corresponding pairs of terms in the passage; a textual alignment scorer that assigns a score based on how well the word order of the passage aligns with that of the question, when the focus is replaced with the candidate answer; a logical form answer candidate scorer that targets high-precision matching between the syntactic structures of passages and questions, etc.
Each passage-scoring component produces a fixed number of features for each passage. Some of these features range from 0 to 1, where a high score indicates that the passage matches the question well; other passage scorers have other ranges. The passage scorers assign a score to passages for a candidate answer. The answer classifier is intended to classify candidate answers that generally have multiple passages, in numbers that differ from answer to answer. Consequently, the systems and methods herein use a merger to combine passage scores for <candidate answer, passage> pairs into a fixed set of features. For example, if a candidate answer has three passages and a passage scorer assigns values of 0.5, 0.6, and 0.7 to the respective passages, these scores may be merged using a merger strategy like SUM-THEN-MAX. Using this merger strategy, the feature added to the learning model for the candidate answer under consideration will be SUM-THEN-MAX (0.5, 0.6, 0.7)=0.7.
The systems and methods herein use distinct methods that are used to merge features across passages including, for example:
1. Maximum: The final score for the candidate answer is the maximum score for that answer in any passages found for that answer.
2. Sum: The final score for the candidate answer is the sum of the scores for that answer in each of the passages found for that answer.
3. Decaying sum: The final score for the candidate answer is computed to be
where p0, p1, . . . , pm are the scores of the passages that contain the answers, sorted in descending order.
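The three mergers listed above can be sketched as follows. The decaying-sum formula is not reproduced in the text, so this sketch assumes that the i-th score in descending order is divided by 2 to the power i; that decay factor is an assumption of this example, as are the function names. The sample scores are the 0.5, 0.6, and 0.7 used earlier in this description.

```python
def merge_maximum(scores):
    """Final score is the best score in any passage."""
    return max(scores)

def merge_sum(scores):
    """Final score is the sum of the scores over all passages."""
    return sum(scores)

def merge_decaying_sum(scores, base=2.0):
    """Assumed form: sum of p_i / base**i with scores sorted in descending order."""
    ordered = sorted(scores, reverse=True)
    return sum(p / base ** i for i, p in enumerate(ordered))

passage_scores = [0.5, 0.6, 0.7]
print(merge_maximum(passage_scores))
print(merge_sum(passage_scores))
print(merge_decaying_sum(passage_scores))
```

With this assumed decay, additional passages always add some support, but each lower-ranked passage counts for less than the one before it.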
Passage scorers capture limited complementary information that the passages have to offer. For example, in
For each of the scoring features, the systems and methods herein derive two other features: a feature for whether a feature is missing and a standardized version of a feature. When the value of a feature is missing, the systems and methods herein assert a value of 0 for the feature and a value of 1 for the corresponding derived missing feature; this allows the learner to distinguish between cases where the feature actually has 0 value versus cases where it simply did not apply at all. The standardized version of a feature is computed by subtracting the mean value of that feature and dividing by the standard deviation for that feature. Both mean and standard deviation are computed across all answers to a single question, not across all answers to all questions in the test set. The purpose of the standardized feature is to encode how much the base feature differs from a typical value of that feature for a single question.
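The two derived features can be sketched as below. The sketch assumes that a missing value is represented as `None`, that the mean and standard deviation are computed over the values after missing entries have been set to 0, that the population standard deviation is used, and that a zero standard deviation yields a standardized value of 0; none of these details is specified in the text, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def derive_features(values):
    """Derive missing-indicator and standardized features for one base feature.

    `values` holds one entry per candidate answer to a single question;
    None marks a missing value.
    """
    filled = [0.0 if v is None else float(v) for v in values]
    missing = [1 if v is None else 0 for v in values]
    mu = mean(filled)
    sigma = pstdev(filled)
    standardized = [(v - mu) / sigma if sigma else 0.0 for v in filled]
    return filled, missing, standardized

# Hypothetical base-feature values for three candidate answers to one question.
filled, missing, standardized = derive_features([2.0, None, 4.0])
print(filled, missing, standardized)
```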
In Table 3 (shown in
The foregoing description presents a more narrow, specific introduction and the following presents a more general, broadly applicable elaboration. As shown above, methods and systems herein analyze an input question and generate and evaluate candidate answers using a variety of techniques. These features are then combined based on their learned potential for predicting the right answer. The final result of this process is a ranked list of candidate answers, each with a confidence score indicating the degree to which the answer is believed correct, along with links back to the evidence.
A question answering system may divide up the task of answering questions in a variety of ways. For example, for any single answer it might have multiple pieces of evidence regarding that answer. Some pieces of evidence may relate to the entire question while other pieces of evidence may only provide information relevant to part of the question. A deep evidence scoring component can provide a score indicating how well that piece of evidence relates to some part of the question. A question answering system that uses statistical machine learning to classify candidate answers as correct or incorrect needs to ultimately conclude how well all of the evidence relates to all of the question. The systems and methods herein bridge this gap. The systems and methods herein take as input feature values that relate portions or aspects of a question to portions or aspects of evidence, and combine those feature values using a variety of functions to produce feature values that are relevant to whether the candidate answer correctly answers the question.
In the example shown in
The final merger of a question/answer system aggregates information across multiple pieces of evidence, each analyzed using different analytics (features). However, the final merger process requires that each of these features be reduced first to a single number and only then combined into a final score for a candidate. Thus, the combination into a single score for a candidate is limited to combining the total amount of evidence that each passage provides, without regard to what portions of the question each piece of evidence addressed. For example, a conventional final merger will not distinguish between an answer that has two passages that provide very strong evidence for one important part of a question and an answer that has one passage that provides very strong evidence for one very important part of the question and another passage that provides very strong evidence for another very important part of the question. In addition, since the systems and methods herein combine results from different features across these dimensions, the systems and methods herein can accommodate interactions among features that only make sense within a given dimension.
For example, a feature indicating whether a passage supports the answer and a feature indicating whether a passage comes from a reliable source are only meaningfully combined at the level of passages, and the systems and methods herein make this possible. In contrast, conventional final merger processing reduces the reliability feature to a single value (thus determining whether the passages, as a whole, are reliable) and each answering feature to a single value (thus determining whether the passages, as a whole, support this answer). This rigid collapsing mechanism leaves conventional final merger processing unable to access information about whether the passages that support the answer are the same ones that are reliable.
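The difference between the two orders of operation can be sketched with two per-passage features. Multiplying support by reliability and taking the maximum are illustrative choices for this example only, and the function names and scores are hypothetical.

```python
def collapse_then_combine(support, reliability):
    """Conventional order: reduce each feature to one number, then combine."""
    return max(support) * max(reliability)

def combine_then_collapse(support, reliability):
    """Passage-level order: combine the features per passage, then reduce."""
    return max(s * r for s, r in zip(support, reliability))

# Hypothetical scores for two passages of each candidate answer.
# Candidate X: the supporting passage is NOT the reliable one.
x_support, x_reliability = [0.9, 0.1], [0.1, 0.9]
# Candidate Y: the supporting passage IS the reliable one.
y_support, y_reliability = [0.9, 0.1], [0.9, 0.1]

print(collapse_then_combine(x_support, x_reliability),
      collapse_then_combine(y_support, y_reliability))
print(combine_then_collapse(x_support, x_reliability),
      combine_then_collapse(y_support, y_reliability))
```

Collapsing first yields the same value for both candidates, while combining at the passage level separates them.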
Further, the approach of the systems and methods herein is domain independent, and the systems and methods herein combine multiple pieces of evidence (possibly assessed using different analytics) for the same fact within a decomposed question.
For each question-answer pair, the systems and methods herein take as input a multi-dimensional feature value matrix. One dimension of the input matrix is labels corresponding to answer scoring features (i.e., labels indicating types of analysis that may be useful for determining whether an answer is correct). There is at least one other dimension for this matrix. Other dimensions may include (but are not limited to) portions of the question and pieces of evidence (e.g., passages) that bear on whether the answer is correct. The values in the matrix are feature values for the features specified in the feature label dimension. Those input values are computed by an evidence-scoring component, which uses evidence to determine how well a candidate answer satisfies requirements established in a question. The output of the systems and methods herein can be a single dimensional vector where the one dimension is answer scoring feature labels and the values in the vector are computed by combining values from across other dimensions.
Because the output of the systems and methods herein can be a single dimensional vector of features and values, it satisfies the input requirements for a traditional machine learning classifier. Since there is one such vector for each question-answer pair, such a classifier can be used to classify answers as correct or incorrect.
Because the systems and methods herein can have more output features than input features, this can result in a system that has more features than are immediately useful (e.g., can result in models that are overfit to the test set). In that case, a statistical method can select only a subset of the features (using machine learning technology). Providing many output features is useful to the effectiveness of the systems and methods herein, even if only a subset of those features is selected. It is difficult for a human to predict what matrix operations will be most effective at collapsing a complex combination of evidence relating to different parts or aspects of a question down into a single numerical score. The automated feature selection mechanism of systems and methods herein can do this job effectively if it is given results from a broad selection of such operations and is allowed to select among them empirically.
An exemplary process is set forth below. In the following, shorthand terms are defined as follows. SET_T is a predefined set of transformation functions, which take a single value along with any amount of additional information that provides context for that value, and produce a single value. SET_C is a predefined set of collapsing functions, which take a vector and produce a single value. MAP_C is a map from matrix dimensions to subsets of SET_C. SET_D is a set of matrix dimensions. SET_P is a set of permutations of subsets of SET_D. The run-time inputs include Minput, which is a matrix of candidate answers with multi-dimensional feature matrices (obtained using enabling art such as Watson deep evidence scoring components).
The exemplary process is as follows:
Matrix Mstart = Expand Minput using SET_T;
For each P in SET_P:
    M(P,0) = Mstart;
    For i = 1 to the length of P:
        Dimension D(P,i) = entry i in P;
        Set SUBSET_C(P,i) = the MAP_C entry for D(P,i);
        Matrix M(P,i) = a new matrix with all the dimensions of M(P,i−1) and the indexes of D(P,i) replaced by new indexes corresponding to the elements in SUBSET_C(P,i);
        For each C in SUBSET_C(P,i):
            For each Tuple T(P,i−1) of index values for M(P,i−1) for all of its dimensions other than D(P,i):
                Tuple T(P,i) = T(P,i−1) with the value of the D(P,i) index replaced by a new feature index unique to C and D(P,i);
                The entry in M(P,i) whose indexes are T(P,i) is set to the value computed by applying C to the vector of values of M(P,i−1) whose indexes are T(P,i−1) with all index values for D(P,i);
Vfinal = flatten the final M(P,i) matrices for each P and concatenate all of them.
The output of this process, Vfinal is a feature vector that can be used for classifying candidate answers as right or wrong (using established ML technology). The transformation of the input can use a predefined set of transformation functions. One example of a transformation function is IDENTITY, which ignores the context and returns its input value. Another is STANDARDIZE, which requires as context all other values for the same feature label for all answers to the question; it transforms a single value by subtracting the mean of all of the context values and dividing by the standard deviation of the context values.
The steps above also can use predefined sets of collapsing functions. A collapsing function is a function that takes a one-dimensional vector of numbers and returns a single number. Examples of collapsing functions include MAXIMUM (the maximum value of any number in the vector), MINIMUM, SUM, MEAN, MEDIAN, and STANDARD DEVIATION. Another useful example is the number of non-zero entries. A more complex but also useful example is a machine learning classifier.
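The exemplary process can be sketched in runnable form as follows. This is one possible reading, not the implementation of the disclosure: it assumes the matrix for one candidate answer is a dictionary keyed by tuples of index values, that the first dimension is always the feature label, that each collapse folds the function and dimension names into the feature label, and that the only transformation function is IDENTITY (so the expansion step is omitted). The function names are hypothetical. The values 3 and 4 for the term "animal" come from the worked example in this description; the values for "ears" are invented.

```python
from collections import defaultdict

def collapse_dimension(matrix, dims, dim, collapsers):
    """Collapse one dimension, producing one new feature label per function."""
    pos = dims.index(dim)
    if pos == 0:
        raise ValueError("the feature-label dimension cannot be collapsed")
    groups = defaultdict(list)
    for key, value in matrix.items():
        # Group values that agree on every index except the collapsed one.
        groups[key[:pos] + key[pos + 1:]].append(value)
    collapsed = {}
    for rest, values in groups.items():
        for name, fn in collapsers.items():
            label = f"{rest[0]}_{name}({dim})"
            collapsed[(label,) + rest[1:]] = fn(values)
    return collapsed, dims[:pos] + dims[pos + 1:]

def merge_features(m_input, dims, set_p, map_c):
    """Apply every permutation in set_p and flatten the results into Vfinal."""
    v_final = {}
    for perm in set_p:
        matrix, current_dims = dict(m_input), tuple(dims)
        for dim in perm:
            matrix, current_dims = collapse_dimension(
                matrix, current_dims, dim, map_c[dim])
        for key, value in matrix.items():
            v_final["_".join(map(str, key))] = value
    return v_final

DIMS = ("Feature", "Passage", "Question Term")
FUNCS = {"SUM": sum, "MAXIMUM": max, "MINIMUM": min}
MAP_C = {"Passage": FUNCS, "Question Term": FUNCS}
m_input = {
    ("feature1", "P1.1", "animal"): 3,
    ("feature1", "P1.2", "animal"): 4,
    ("feature1", "P1.1", "ears"): 0,   # hypothetical value
    ("feature1", "P1.2", "ears"): 2,   # hypothetical value
}
v_final = merge_features(m_input, DIMS, [("Passage", "Question Term")], MAP_C)
for label, value in v_final.items():
    print(label, value)
```

One input feature with three collapsing functions per dimension yields nine output features here, which illustrates why the output can be much larger than the input.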
Operations of the systems and methods herein are shown in the following example, based on
Here is an example of how matrix Minput (from the method, above) would look for the African Elephant
Here is how Minput would look for Hippo:
As an example, consider P=[Passage,Question Term] and that the only transformation function is IDENTITY (i.e., Mstart=Minput=M(P,0)). D(P,1) is the Passage dimension. The values for T(P,0) are thus tuples of features and question terms, e.g., (feature1,animal). The values for T(P,1) are the same tuples with features replaced by new features specific to the Passage dimension and one of the collapsing functions. Assume that the collapsing functions are SUM, MAXIMUM, and MINIMUM. For the purpose of this example, the new feature derived from feature1, the Passage dimension, and the SUM function is labeled feature1_SUM(Passage). Thus, the T(P,1) tuples corresponding to value (feature1,animal) for T(P,0) would be the tuples (feature1_SUM(Passage),animal), (feature1_MAXIMUM(Passage),animal), and (feature1_MINIMUM(Passage),animal). The entry for the first of these is computed as the sum across all values of Passage (i.e., P1.1 and P1.2) of all entries in M(P,0) such that the feature is feature1 and the question term is “animal”. The sum of 3 and 4 is 7, so the entry in M(P,1) for indexes feature1_SUM(Passage) and animal is set to 7. Similarly, the maximum of 3 and 4 is 4, so the entry in M(P,1) for indexes feature1_MAXIMUM(Passage) and animal is set to 4. The complete results for M(P,1) for African Elephant are:
Similarly, the complete results for M(P,1) for Hippo are:
D(P,2) is the Question Term dimension. Since the only dimensions in M(P,1) are now feature labels and question terms, the new values for T(P,1) are thus atomic feature labels, e.g., feature1_SUM(Passage). The T(P,2) tuples are thus also atomic feature labels derived from those feature labels, the question term dimension, and the collapsing functions. For example, one instance of T(P,2) is labeled as feature1_SUM(Passage)_SUM(Question Term). The value in M(P,2) for the feature1_SUM(Passage)_SUM(Question Term) index is the sum across all question terms of all values in M(P,1) for which the feature is feature1_SUM(Passage), i.e., 2+2+7+5+6+7, which equals 29. Applying this logic to M(P,2) for both answers produces:
Since P has length 2, M(P,2) is flattened. The elements in the flattened vector are labeled with the combined labels of all of the remaining dimensions, e.g.:
African Elephant:
feature1_SUM(Passage)_SUM(Question Term): 29
feature1_SUM(Passage)_MAXIMUM(Question Term): 7
feature1_SUM(Passage)_MINIMUM(Question Term): 2
feature1_MAXIMUM(Passage)_SUM(Question Term): 25
feature1_MAXIMUM(Passage)_MAXIMUM(Question Term): 7
feature1_MAXIMUM(Passage)_MINIMUM(Question Term): 1
feature1_MINIMUM(Passage)_SUM(Question Term): 3
feature1_MINIMUM(Passage)_MAXIMUM(Question Term): 3
feature1_MINIMUM(Passage)_MINIMUM(Question Term): 0
feature2_SUM(Passage)_SUM(Question Term): 42
feature2_SUM(Passage)_MAXIMUM(Question Term): 13
feature2_SUM(Passage)_MINIMUM(Question Term): 3
feature2_MAXIMUM(Passage)_SUM(Question Term): 36
feature2_MAXIMUM(Passage)_MAXIMUM(Question Term): 9
feature2_MAXIMUM(Passage)_MINIMUM(Question Term): 3
feature2_MINIMUM(Passage)_SUM(Question Term): 6
feature2_MINIMUM(Passage)_MAXIMUM(Question Term): 6
feature2_MINIMUM(Passage)_MINIMUM(Question Term): 0
Hippo:
feature1_SUM(Passage)_SUM(Question Term): 22
feature1_SUM(Passage)_MAXIMUM(Question Term): 7
feature1_SUM(Passage)_MINIMUM(Question Term): 0
feature1_MAXIMUM(Passage)_SUM(Question Term): 19
feature1_MAXIMUM(Passage)_MAXIMUM(Question Term): 7
feature1_MAXIMUM(Passage)_MINIMUM(Question Term): 0
feature1_MINIMUM(Passage)_SUM(Question Term): 3
feature1_MINIMUM(Passage)_MAXIMUM(Question Term): 3
feature1_MINIMUM(Passage)_MINIMUM(Question Term): 0
feature2_SUM(Passage)_SUM(Question Term): 28
feature2_SUM(Passage)_MAXIMUM(Question Term): 13
feature2_SUM(Passage)_MINIMUM(Question Term): −10
feature2_MAXIMUM(Passage)_SUM(Question Term): 38
feature2_MAXIMUM(Passage)_MAXIMUM(Question Term): 9
feature2_MAXIMUM(Passage)_MINIMUM(Question Term): 0
feature2_MINIMUM(Passage)_SUM(Question Term): 6
feature2_MINIMUM(Passage)_MAXIMUM(Question Term): 6
feature2_MINIMUM(Passage)_MINIMUM(Question Term): −10
These features are then used by a classifier to judge which of the two answers is correct, using answer classification mechanisms. Since the number of output features is much larger than the number of input features, feature selection methods can choose a subset of the features.
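The collapse of the Question Term dimension described above can be illustrated with a minimal sketch. The function name collapse_dimension and the dictionary layout are assumptions made for illustration only; the six input values are the per-question-term values of feature1_SUM(Passage) for African Elephant quoted in the sum above.

```python
from typing import Callable, Dict, List

# Collapsing functions, named as in the feature labels above.
COLLAPSERS: Dict[str, Callable[[List[float]], float]] = {
    "SUM": sum,
    "MAXIMUM": max,
    "MINIMUM": min,
}


def collapse_dimension(matrix: Dict[str, List[float]], dim_name: str) -> Dict[str, float]:
    """Collapse one dimension of a labeled matrix (hypothetical helper).

    `matrix` maps a feature label to the list of values along the
    dimension being collapsed. Each collapsing function yields one new
    atomic feature labeled <label>_<FUNCTION>(<dim_name>).
    """
    collapsed: Dict[str, float] = {}
    for label, values in matrix.items():
        for func_name, func in COLLAPSERS.items():
            collapsed[f"{label}_{func_name}({dim_name})"] = func(values)
    return collapsed


# M(P,1) row for feature1_SUM(Passage), African Elephant (values from the text).
m_p1 = {"feature1_SUM(Passage)": [2, 2, 7, 5, 6, 7]}

# Collapsing the Question Term dimension yields the flattened M(P,2) entries.
m_p2 = collapse_dimension(m_p1, "Question Term")
for label, value in m_p2.items():
    print(f"{label}: {value}")
```

Applying the same helper first over the Passage dimension and then over the Question Term dimension would multiply the feature count by the number of collapsing functions at each stage, which is why feature selection becomes useful.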
An exemplary embodiment herein is shown as a method in the flowchart in
Such methods/systems thus analyze an input question to determine precisely what it is asking for and generate many possible candidate answers through a broad search of large volumes of content. For each of these candidate answers, a hypothesis is formed based on considering the candidate in the context of the original question and topic. For each hypothesis, methods and systems herein spawn an independent thread that attempts to prove the hypothesis. The systems/methods search the content sources for evidence that supports or refutes each hypothesis. For each evidence-hypothesis pair, such systems/methods apply hundreds of methods that dissect and analyze the evidence along different scoring features such as type classification, time, geography, popularity, passage support, source reliability, and semantic relatedness. This analysis produces hundreds of such scoring features (which are sometimes referred to herein simply as “features”). Thus, the scoring features define the types of analysis that are used to determine whether the candidate answers are correct. The results of the search can be transformed by using a transformation function that takes an input value and information that provides context for the input value, and produces a transformed score value 408.
Such methods/systems automatically create a scoring feature-specific matrix for each scoring feature using the processor 410. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of passage and question term supports the candidate answers as being a correct answer to the question (with respect to the corresponding scoring feature of the matrix). The different combinations of passages and question terms thereby form vectors, and each vector has a vector value equal to the corresponding score value.
The systems/methods herein automatically combine (collapse) the vectors in item 412 using the processor by calculating a statistical measure of the vectors (e.g., a collapsing function, such as calculating a maximum, a minimum, a sum, a mean, a median, and/or a standard-deviation of the vectors) to produce a collapsed score for each of the question terms. Further, the systems/methods herein automatically combine the collapsed score for each question term using the processor to produce a combined score for each of the candidate answers. Then the systems and methods herein can automatically rank the candidate answers in item 414 based on the combined score for each candidate answer using the processor. In item 416, a single answer can be output from the user interface, or many answers (with their scores and evidence sources) can be output.
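The collapse-and-rank steps of items 412 and 414 can be sketched as follows for a single scoring feature. This is an illustrative sketch only: the function names, the choice of maximum as the collapsing function and sum as the combining function, and the score values for the two placeholder candidates are all assumptions, not values from this disclosure.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Matrix = List[List[float]]  # matrix[passage][question_term] -> score value


def combined_score(
    matrix: Matrix,
    collapse: Callable[[Iterable[float]], float] = max,
    combine: Callable[[Iterable[float]], float] = sum,
) -> float:
    """Collapse the passage dimension per question term, then combine terms."""
    num_terms = len(matrix[0])
    # One collapsed score per question term (column of the matrix).
    per_term = [collapse(row[t] for row in matrix) for t in range(num_terms)]
    return combine(per_term)


def rank_candidates(matrices: Dict[str, Matrix]) -> List[Tuple[str, float]]:
    """Rank candidate answers by combined score, best first."""
    scores = {answer: combined_score(m) for answer, m in matrices.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical score matrices: 2 passages x 3 question terms per candidate.
matrices = {
    "candidate_a": [[1, 0, 3], [0, 2, 1]],
    "candidate_b": [[0, 0, 1], [1, 0, 0]],
}
ranking = rank_candidates(matrices)
print(ranking)
```

Because each question term takes its best score from any passage, a candidate can be fully supported even when no single passage covers every term, which is the situation the multi-faceted questions create.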
As will be appreciated by one skilled in the art, aspects of the systems and methods herein may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable non-transitory medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The non-transitory computer storage medium stores instructions, and a processor executes the instructions to perform the methods described herein. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present disclosure are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments herein. It will be understood that each block of the flowchart illustrations and/or two-dimensional block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
A representative hardware environment for practicing the embodiments herein is depicted in
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments herein. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
While it is understood that the process software may be deployed by manually loading directly in the client, server and proxy computers via loading a storage medium such as a CD, DVD, etc., the process software may also be automatically or semi-automatically deployed into a computer system by sending the process software to a central server or a group of central servers. The process software is then downloaded into the client computers that will execute the process software. Alternatively, the process software is sent directly to the client system via e-mail. The process software is then either detached to a directory or loaded into a directory by a button on the e-mail that executes a program that detaches the process software into a directory. Another alternative is to send the process software directly to a directory on the client computer hard drive. When there are proxy servers, the process will select the proxy server code, determine on which computers to place the proxy servers' code, transmit the proxy server code, and then install the proxy server code on the proxy computer. The process software will be transmitted to the proxy server and then stored on the proxy server.
In
A determination is made if a proxy server is to be built 200 to store the process software. A proxy server is a server that sits between a client application, such as a Web browser, and a real server. It intercepts all requests to the real server to see if it can fulfill the requests itself. If not, it forwards the request to the real server. The two primary benefits of a proxy server are to improve performance and to filter requests. If a proxy server is required then the proxy server is installed 201. The process software is sent to the servers either via a protocol such as FTP or it is copied directly from the source files to the server files via file sharing 202.
Another embodiment would be to send a transaction to the servers that contained the process software and have the server process the transaction, then receive and copy the process software to the server's file system. Once the process software is stored at the servers, the users, via their client computers, then access the process software on the servers and copy it to their client computers' file systems 203. Another embodiment is to have the servers automatically copy the process software to each client and then run the installation program for the process software at each client computer. The user executes the program that installs the process software on his client computer 212 then exits the process 108.
In step 104 a determination is made whether the process software is to be deployed by sending the process software to users via e-mail. The set of users where the process software will be deployed are identified together with the addresses of the user client computers 105. The process software is sent via e-mail 204 to each of the users' client computers. The users then receive the e-mail 205 and then detach the process software from the e-mail to a directory on their client computers 206. The user executes the program that installs the process software on his client computer 212 then exits the process 108.
Lastly, a determination is made on whether the process software will be sent directly to user directories on their client computers 106. If so, the user directories are identified 107. The process software is transferred directly to the user's client computer directory 207. This can be done in several ways such as, but not limited to, sharing of the file system directories and then copying from the sender's file system to the recipient user's file system or alternatively using a transfer protocol such as File Transfer Protocol (FTP). The users access the directories on their client file systems in preparation for installing the process software 208. The user executes the program that installs the process software on his client computer 212 then exits the process 108.
The process software is integrated into a client, server and network environment by providing for the process software to coexist with applications, operating systems and network operating systems software and then installing the process software on the clients and servers in the environment where the process software will function.
The first step is to identify any software on the clients and servers, including the network operating system, where the process software will be deployed, that is required by the process software or that works in conjunction with the process software. This includes the network operating system, which is software that enhances a basic operating system by adding networking features.
Next, the software applications and version numbers will be identified and compared to the list of software applications and version numbers that have been tested to work with the process software. Those software applications that are missing or that do not match the correct version will be upgraded with the correct version numbers. Program instructions that pass parameters from the process software to the software applications will be checked to ensure the parameter lists match the parameter lists required by the process software. Conversely, parameters passed by the software applications to the process software will be checked to ensure the parameters match the parameters required by the process software. The client and server operating systems including the network operating systems will be identified and compared to the list of operating systems, version numbers and network software that have been tested to work with the process software. Those operating systems, version numbers and network software that do not match the list of tested operating systems and version numbers will be upgraded on the clients and servers to the required level.
After ensuring that the software, where the process software is to be deployed, is at the correct version level that has been tested to work with the process software, the integration is completed by installing the process software on the clients and servers.
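The version comparison described above can be sketched as a simple lookup. The function name find_upgrades, the dictionary representation, and the software names and version strings are hypothetical and serve only to illustrate the check for missing or mismatched software.

```python
from typing import Dict


def find_upgrades(installed: Dict[str, str], tested: Dict[str, str]) -> Dict[str, str]:
    """Return the software that is missing or not at the tested version.

    `installed` maps software name -> version found on a client or server.
    `tested` maps software name -> version tested with the process software.
    The result maps each item needing action to its required version.
    """
    upgrades: Dict[str, str] = {}
    for name, required_version in tested.items():
        # installed.get() returns None for missing software, which never matches.
        if installed.get(name) != required_version:
            upgrades[name] = required_version
    return upgrades


# Hypothetical inventory of one server and the tested configuration.
installed = {"os": "1.0", "app": "2.1"}
tested = {"os": "1.0", "app": "2.2", "nos": "3.0"}

pending = find_upgrades(installed, tested)
print(pending)  # software to upgrade before installing the process software
```

An empty result corresponds to the case in which all versions match and no required software is missing, so installation can proceed directly.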
In
A determination is made if the version numbers match the version numbers of OS, applications and NOS that have been tested with the process software 224. If all of the versions match and there is no missing required software the integration continues in 227.
If one or more of the version numbers do not match, then the unmatched versions are updated on the server or servers with the correct versions 225. Additionally, if there is missing required software, then it is updated on the server or servers 225. The server integration is completed by installing the process software 226.
Step 227, which follows either 221, 224 or 226, determines if there are any programs of the process software that will execute on the clients. If no process software programs execute on the clients, the integration proceeds to 230 and exits. If this is not the case, then the client addresses are identified 228.
The clients are checked to see if they contain software that includes the operating system (OS), applications, and network operating systems (NOS), together with their version numbers, that have been tested with the process software 229. The clients are also checked to determine if there is any missing software that is required by the process software 229.
A determination is made as to whether the version numbers match the version numbers of OS, applications and NOS that have been tested with the process software 231. If all of the versions match and there is no missing required software, then the integration proceeds to 230 and exits.
If one or more of the version numbers do not match, then the unmatched versions are updated on the clients with the correct versions 232. In addition, if there is missing required software then it is updated on the clients 232. The client integration is completed by installing the process software on the clients 233. The integration proceeds to 230 and exits.
The process software can be stored on a shared file system accessible from one or more servers. The process software is executed via transactions that contain data and server processing requests that use CPU units on the accessed server. CPU units are units of time such as minutes, seconds, hours on the central processor of the server. Additionally, the accessed server may make requests of other servers that require CPU units. CPU units are an example that represents but one measurement of use. Other measurements of use include but are not limited to network bandwidth, memory usage, storage usage, packet transfers, complete transactions etc. When multiple customers use the same process software application, their transactions are differentiated by the parameters included in the transactions that identify the unique customer and the type of service for that customer. All of the CPU units and other measurements of use that are used for the services for each customer are recorded. When the number of transactions to any one server reaches a number that begins to affect the performance of that server, other servers are accessed to increase the capacity and to share the workload. Likewise, when other measurements of use, such as network bandwidth, memory usage, storage usage, etc., approach a capacity so as to affect performance, additional network bandwidth, memory usage, storage etc. are added to share the workload. The measurements of use used for each service and customer are sent to a collecting server that sums the measurements of use for each customer for each service that was processed anywhere in the network of servers that provide the shared execution of the process software.
The summed measurements of use units are periodically multiplied by unit costs and the resulting total process software application service costs are alternatively sent to the customer and/or indicated on a web site accessed by the customer, which then remits payment to the service provider. In another embodiment, the service provider requests payment directly from a customer account at a banking or financial institution. In another embodiment, if the service provider is also a customer of the customer that uses the process software application, the payment owed to the service provider is reconciled to the payment owed by the service provider to minimize the transfer of payments.
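The summing and costing performed by the collecting server can be sketched as below. The record layout, function names, customer names, measurement names, and unit costs (taken here as integer cents to avoid rounding) are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

UsageRecord = Tuple[str, str, str, int]  # (customer, service, measurement, amount)
UsageKey = Tuple[str, str, str]


def summarize_usage(records: Iterable[UsageRecord]) -> Dict[UsageKey, int]:
    """Sum each measurement of use per customer and per service."""
    totals: Dict[UsageKey, int] = defaultdict(int)
    for customer, service, measurement, amount in records:
        totals[(customer, service, measurement)] += amount
    return dict(totals)


def compute_charges(totals: Dict[UsageKey, int], unit_costs: Dict[str, int]) -> Dict[str, int]:
    """Multiply summed measurements by unit costs and total them per customer."""
    charges: Dict[str, int] = defaultdict(int)
    for (customer, _service, measurement), amount in totals.items():
        charges[customer] += amount * unit_costs[measurement]
    return dict(charges)


# Hypothetical usage reported by several servers for two customers.
records = [
    ("acme", "qa", "cpu_seconds", 10),
    ("acme", "qa", "cpu_seconds", 5),
    ("acme", "qa", "storage_mb", 100),
    ("globex", "qa", "cpu_seconds", 2),
]
unit_costs = {"cpu_seconds": 3, "storage_mb": 1}  # assumed cents per unit

totals = summarize_usage(records)
charges = compute_charges(totals, unit_costs)
print(charges)
```

Keying the totals by customer and service keeps transactions from different customers of the same application separate, as the parameters in each transaction are described as doing.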
The process software is shared, simultaneously serving multiple customers in a flexible, automated fashion. It is standardized, requiring little customization and it is scalable, providing capacity on demand in a pay-as-you-go model.
In
The server central processing unit (CPU) capacities in the On Demand environment are queried 243. The CPU requirement of the transaction is estimated, then the servers' available CPU capacity in the On Demand environment is compared to the transaction CPU requirement to see if there is sufficient available CPU capacity in any server to process the transaction 244. If there is not sufficient available server CPU capacity, then additional server CPU capacity is allocated to process the transaction 248. If there was already sufficient available CPU capacity, then the transaction is sent to a selected server 245.
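The capacity check of steps 243 through 248 can be sketched as follows. The text does not say how a server is selected or where additional capacity is allocated, so the policy here (first server with enough capacity, otherwise top up the server with the most available capacity) is an assumption, as are the function name and the capacity figures.

```python
from typing import Dict, Tuple


def choose_server(capacities: Dict[str, int], required: int) -> Tuple[str, int]:
    """Pick a server for a transaction and report any capacity allocated.

    `capacities` maps server name -> available CPU units and is updated in
    place when extra capacity is allocated. Returns (server, allocated_units).
    """
    # Sufficient capacity already exists: send the transaction there.
    for server, available in capacities.items():
        if available >= required:
            return server, 0

    # Assumed policy: allocate the shortfall on the least-loaded server.
    server = max(capacities, key=lambda name: capacities[name])
    shortfall = required - capacities[server]
    capacities[server] += shortfall
    return server, shortfall


# Hypothetical available CPU units per server.
capacities = {"server_1": 4, "server_2": 10}

first = choose_server(capacities, required=8)    # fits on server_2 as is
second = choose_server(capacities, required=12)  # needs 2 more units
print(first, second, capacities)
```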
Before executing the transaction, a check is made of the remaining On Demand environment to determine if the environment has sufficient available capacity for processing the transaction. This environment capacity consists of such things as but not limited to network bandwidth, processor memory, storage etc. 246. If there is not sufficient available capacity, then capacity will be added to the On Demand environment 247. Next the required software to process the transaction is accessed, loaded into memory, then the transaction is executed 249.
The usage measurements are recorded 250. The usage measurements consist of the portions of those functions in the On Demand environment that are used to process the transaction. The usage of such functions as, but not limited to, network bandwidth, processor memory, storage and CPU cycles are what is recorded. The usage measurements are summed, multiplied by unit costs and then recorded as a charge to the requesting customer 251. If the customer has requested that the On Demand costs be posted to a web site 252 then they are posted 253.
If the customer has requested that the On Demand costs be sent via e-mail to a customer address 254 then they are sent 255. If the customer has requested that the On Demand costs be paid directly from a customer account 256 then payment is received directly from the customer account 257. The last step is to exit the On Demand process 258.
The process software may be deployed, accessed and executed through the use of a virtual private network (VPN), which is any combination of technologies that can be used to secure a connection through an otherwise unsecured or untrusted network. VPNs are used to improve security and to reduce operational costs. The VPN makes use of a public network, usually the Internet, to connect remote sites or users together. Instead of using a dedicated, real-world connection such as a leased line, the VPN uses “virtual” connections routed through the Internet from the company's private network to the remote site or employee.
The process software may be deployed, accessed and executed through either a remote-access or a site-to-site VPN. When using the remote-access VPNs the process software is deployed, accessed and executed via the secure, encrypted connections between a company's private network and remote users through a third-party service provider. The enterprise service provider (ESP) sets up a network access server (NAS) and provides the remote users with desktop client software for their computers. The telecommuters can then dial a toll-free number or attach directly via a cable or DSL modem to reach the NAS and use their VPN client software to access the corporate network and to access, download and execute the process software.
When using the site-to-site VPN, the process software is deployed, accessed and executed through the use of dedicated equipment and large-scale encryption that are used to connect a company's multiple fixed sites over a public network such as the Internet.
The process software is transported over the VPN via tunneling which is the process of placing an entire packet within another packet and sending it over a network. The protocol of the outer packet is understood by the network and both points, called tunnel interfaces, where the packet enters and exits the network.
In
If it does exist, then proceed to 265. Otherwise identify the third party provider that will provide the secure, encrypted connections between the company's private network and the company's remote users 276. The company's remote users are identified 277. The third party provider then sets up a network access server (NAS) 278 that allows the remote users to dial a toll free number or attach directly via a cable or DSL modem to access, download and install the desktop client software for the remote-access VPN 279.
After the remote access VPN has been built or if it has been previously installed, the remote users can then access the process software by dialing into the NAS or attaching directly via a cable or DSL modem into the NAS 265. This allows entry into the corporate network where the process software is accessed 266. The process software is transported to the remote user's desktop over the network via tunneling 273. That is, the process software is divided into packets and each packet including the data and protocol is placed within another packet 267. When the process software arrives at the remote user's desktop, it is removed from the packets, reconstituted and then is executed on the remote user's desktop 268.
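The tunneling steps 267 and 268 can be illustrated with a toy sketch in which each inner packet is placed whole inside an outer packet and later unwrapped and reassembled. The dictionary-based packet format, the protocol labels, and the function names are hypothetical, and no encryption or real network protocol is modeled.

```python
from typing import Dict, List


def tunnel(payload: bytes, packet_size: int) -> List[Dict]:
    """Divide the payload into packets and wrap each in an outer packet."""
    outer_packets = []
    for seq, start in enumerate(range(0, len(payload), packet_size)):
        # Inner packet: data plus its own protocol information.
        inner = {
            "protocol": "inner",
            "seq": seq,
            "data": payload[start:start + packet_size],
        }
        # Outer packet: the only layer the public network needs to understand.
        outer_packets.append({"protocol": "outer", "payload": inner})
    return outer_packets


def reconstitute(outer_packets: List[Dict]) -> bytes:
    """Remove inner packets from outer packets and reassemble the payload."""
    inner_packets = sorted(
        (packet["payload"] for packet in outer_packets),
        key=lambda inner: inner["seq"],
    )
    return b"".join(inner["data"] for inner in inner_packets)


software = b"process software"
packets = tunnel(software, packet_size=5)

# Packets may arrive out of order; sequence numbers restore the original.
restored = reconstitute(list(reversed(packets)))
print(len(packets), restored)
```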
A determination is made to see if a VPN for site to site access is required 262. If it is not required, then proceed to exit the process 263. Otherwise, determine if the site to site VPN exists 269. If it does exist, then proceed to 272. Otherwise, install the dedicated equipment required to establish a site to site VPN 270. Then build the large scale encryption into the VPN 271.
After the site to site VPN has been built or if it had been previously established, the users access the process software via the VPN 272. The process software is transported to the site users over the network via tunneling. That is, the process software is divided into packets and each packet including the data and protocol is placed within another packet 274. When the process software arrives at the site user's desktop, it is removed from the packets, reconstituted and is executed on the site user's desktop 275. Proceed to exit the process 263.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of this disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
This application is a Continuation of U.S. application Ser. No. 13/679,169 filed Nov. 16, 2012, the entire teachings of which are incorporated herein by reference, in their entirety.
Number | Date | Country | |
---|---|---|---|
20140141401 A1 | May 2014 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13679169 | Nov 2012 | US |
Child | 13689866 | | US |