The present invention generally relates to patent analytics, and particularly relates to a system and method for assessing patent quality.
Patent analytics uses computer technologies and software algorithms to evaluate the quality of a patent both qualitatively and quantitatively. A well-designed patent analytics system generates a patent assessment report that is as close as possible to what a human expert would produce using heuristics, know-how and related information. Such a patent assessment report may help a professional evaluate a patent for a variety of tasks such as patent filing, patent prosecution, patent sales, patent licensing, patent landscaping, patent strategy and patent-related business decisions. In addition, a computer-based patent analytics system assesses patent quality in a reasonably objective manner, combining real-world information, empirical data or other information from heterogeneous data sources to produce a realistic and meaningful assessment. The present invention attempts to achieve this goal.
A patent assessment system according to one aspect of the present invention comprises a patent history locator, a document comparator, a key term analyzer, and a patent analyzer. The patent history locator locates related patent records and data for a given granted patent. For example, when the system or user locates a granted patent for assessment, the patent history locator may find the original patent application corresponding to the granted patent, the patent family (i.e. patents originating from the same or overlapping inventors on the same or similar inventions), foreign counterparts (i.e. patents filed in a foreign country based on the same invention from the same inventor), the patent prosecution history (e.g. documents communicated between the Patent Office and the applicant, claim amendments, office actions, patent search reports, information disclosure statements etc.), or other related information. The document comparator compares the patent for assessment with its related patent documents located by the patent history locator and finds differences (such as text added to or deleted from the original claim etc.) and correlations between the two, to be used as one or more factors in patent assessment. The key term analyzer extracts key terms in a given patent document. The patent analyzer analyzes a given patent by combining information and data from other functional modules or information sources. Depending on the specific task, the output of the patent analyzer can be at least one of the metrics used in patent evaluation, such as patent citation information, patent enforcement information, patent technical strength, the market applicable to the patent, the scope of the patent claims, the claim breadth, claim diversity, and the strength of a patent with respect to the integrity of the specification etc.
The present invention is advantageous in combining rich information external and pertinent to a patent under assessment thus giving a more accurate and meaningful assessment of the patent.
For a more thorough understanding of the invention, its objectives and advantages, refer to the following specification and to the accompanying drawings.
The present invention will become more fully understood from the detailed description and the accompanying drawings, wherein:
With reference to
Patent History Locator 104 locates related patent records and data for a given patent. For example, when the system or user locates a granted patent for assessment 109, Patent History Locator 104 may find the original patent application corresponding to the granted patent 109, the patent family (i.e. patents originating from the same or overlapping inventors on the same or similar inventions), foreign counterparts (i.e. patents filed in a foreign country based on the same invention from the same inventor), the patent prosecution history (e.g. documents communicated between the Patent Office and the applicant, claim amendments, office actions, patent search reports, information disclosure statements etc.), or other related information.
Document Comparator 102 compares the patent for assessment 109 with its related patent documents located by Patent History Locator 104 and finds differences (such as text added to or deleted from the original claim etc.) and correlations between the two.
Patent Key Term Analyzer 105 extracts key terms in a given patent document.
Patent Analyzer 103 analyzes a patent by combining information and data from other functional modules or information sources. Depending on the specific task, the output of Patent Analyzer 103 can be at least one of the metrics used in patent evaluation, such as patent citation information, patent enforcement information, patent technical strength, the market applicable to the patent, the scope of the patent claims, the claim breadth, claim diversity, and the strength of a patent with respect to the integrity of the specification etc.
An exemplary scenario is provided below to further illustrate various components of the present inventive system. However, it is by no means indicative of limiting the scope of the present invention. By way of example and with reference to
With reference to
An exemplary way to build a document comparator is to locate the text of the target document and of the original document and compare the two strings. One way to achieve this is to use the well-known edit distance, also called Levenshtein distance. It is generally used to measure the similarity of two strings via three basic operations: insert, delete and replace. As an example, let s and t denote the source string and the target string; we define the distance between two strings s[0 . . . n] and t[0 . . . m] as d[m, n], where m and n are the lengths of strings t and s, respectively.
As an example, Table 3 shows how the edit distance is computed when the source string is “GUMBO” and the target string is “GAMBOL”. From the table, we can see that the distance between the source string and the target string is 2. The shortest path is underlined in the table.
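[Table 3: edit distance matrix for source string “GUMBO” and target string “GAMBOL”. Only the underlined cells along the shortest path are recoverable; they read 0, 1, 1, 1, 1, 2.]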
According to one aspect of the present invention, we calculate the cost of a claim amendment by first associating a claim from the original patent application file with its corresponding claim from the published granted patent file, and treating the original claim and the granted claim as the source and target strings, respectively. We then compute the edit distance. We first create a two-dimensional array d[m, n], where n and m are the lengths of the source string and the target string, respectively. The matrix can be initialized and calculated as shown in the following pseudo code
According to one aspect of the present invention, we set the cost of each insert and delete operation to 1. For a replace operation, cost=0 if s[i]=t[j]; otherwise cost=1. The costs can be set to other values without limiting the scope of the present invention.
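By way of illustration only, the matrix computation described above can be sketched in Python as follows. The function name `edit_distance` and the parameters `ins_cost` and `del_cost` are assumptions introduced for this sketch and do not appear in the original disclosure.

```python
def edit_distance(source: str, target: str, ins_cost: int = 1, del_cost: int = 1) -> int:
    """Levenshtein distance between source and target (illustrative sketch)."""
    n, m = len(source), len(target)
    # d[j][i] holds the cost of turning source[:i] into target[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, n + 1):
        d[0][i] = d[0][i - 1] + del_cost   # delete every source character
    for j in range(1, m + 1):
        d[j][0] = d[j - 1][0] + ins_cost   # insert every target character
    for j in range(1, m + 1):
        for i in range(1, n + 1):
            # Replace cost: 0 when the characters match, otherwise 1.
            cost = 0 if source[i - 1] == target[j - 1] else 1
            d[j][i] = min(
                d[j - 1][i] + ins_cost,    # insert target[j-1]
                d[j][i - 1] + del_cost,    # delete source[i-1]
                d[j - 1][i - 1] + cost,    # replace or keep
            )
    return d[m][n]


print(edit_distance("GUMBO", "GAMBOL"))  # 2
```

With the unit costs stated above, the sketch gives a distance of 2 for “GUMBO” and “GAMBOL”, in agreement with the example of Table 3.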
In order to mark the inserted and deleted strings, we trace back through the matrix to find the shortest path at each step, save the corresponding operation and string contents, and then merge the results. An example is shown in
Generally, the more changes (insert or delete) there are in the claim amendment, the more limitations the new claim may have. Consequently, a claim amendment cost can be assigned to reflect the claim breadth as shown in Table 1. By way of example, the claim amendment cost can be calculated based only on the number of inserted letters, or deleted letters, or a combination thereof, for example, by a weighted sum.
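A minimal sketch of the trace-back and of a weighted-sum amendment cost is given below, under stated assumptions. The names `diff_ops` and `amendment_cost`, the operation labels, and the choice to record a replacement as a delete followed by an insert are all assumptions of this sketch; the actual mapping from cost to claim breadth is defined in Table 1 and is not reproduced here.

```python
def diff_ops(source: str, target: str):
    """Trace back through the edit distance matrix and return merged
    (operation, text) pairs, where operation is 'keep', 'insert' or 'delete'."""
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + sub)

    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] and d[i][j] == d[i - 1][j - 1]:
            ops.append(("keep", source[i - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            # Assumption: a replacement is recorded as delete + insert.
            ops.append(("insert", target[j - 1]))
            ops.append(("delete", source[i - 1]))
            i, j = i - 1, j - 1
        elif j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops.append(("insert", target[j - 1]))
            j -= 1
        else:
            ops.append(("delete", source[i - 1]))
            i -= 1
    ops.reverse()

    merged = []  # merge runs of the same operation into one text segment
    for op, ch in ops:
        if merged and merged[-1][0] == op:
            merged[-1] = (op, merged[-1][1] + ch)
        else:
            merged.append((op, ch))
    return merged


def amendment_cost(ops, w_insert: float = 1.0, w_delete: float = 1.0) -> float:
    """Weighted sum of the numbers of inserted and deleted letters."""
    inserted = sum(len(text) for op, text in ops if op == "insert")
    deleted = sum(len(text) for op, text in ops if op == "delete")
    return w_insert * inserted + w_delete * deleted


ops = diff_ops("GUMBO", "GAMBOL")
print(ops)
print(amendment_cost(ops))
```

Because a replacement is counted here as one deleted letter plus one inserted letter, the weighted sum can exceed the edit distance; the weights `w_insert` and `w_delete` are placeholders that could be tuned.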
According to another aspect of the present invention, the claim text can be segmented into words or phrases. Consequently, a cost can be assigned based on the length of each word or phrase. For example, the cost for adding “word” would be 4, the cost for adding “field” would be 5 etc.
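The word-level variant can be sketched as follows. This sketch is an assumption in two respects: it segments the claim text on whitespace, and it aligns the two word sequences with `difflib.SequenceMatcher` from the Python standard library rather than with the edit distance matrix described earlier. The function name `word_level_cost` is hypothetical.

```python
import difflib


def word_level_cost(original: str, amended: str) -> int:
    """Sum the lengths of the words added to or removed from a claim."""
    a, b = original.split(), amended.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    cost = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            cost += sum(len(word) for word in a[i1:i2])   # removed words
        if tag in ("insert", "replace"):
            cost += sum(len(word) for word in b[j1:j2])   # added words
    return cost


print(word_level_cost("a data", "a data field"))  # 5
```

Adding the single word “field” yields a cost of 5, matching the example above.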
According to another aspect of the invention, the costs for different operations can be different, and the cost as a function of the length of a text segment can be nonlinear. A training system can be designed to derive an optimal set of costs.
Variations of the Document Comparator can be implemented. According to one aspect of the present invention, instead of comparing source and target text strings, the Document Comparator can directly look for claim amendments in the patent prosecution history. The patent prosecution history can be directly text searchable (e.g. searchable PDF), or if not, can be converted to searchable text via optical character recognition (OCR) techniques. The Document Comparator can then search for claim amendments in each of the applicant's responses to an Office action.
According to another aspect of the present invention, for the purpose of patent assessment, the Document Comparator can also look for other information that changes between any two Office actions during the prosecution of a patent. For example, the reason for rejection in each Office action can be located and analyzed, and the patent assessment result can be based on a number of factors, such as the number of §101, §102, §103, and §112 sixth paragraph issues etc. that pertain to U.S. Patent Law (or the corresponding issues in the patent law of other countries), the number of references the Examiner cited in each rejection, the number of new references the Examiner used in rejecting the same claim as in the previous Office rejection etc. The higher the number of references an Examiner cited in an Office rejection, the more crowded the prior art is around the claim of concern. Similarly, the higher the number of new references an Examiner used in rejecting the same claim as in the previous Office rejection, the stronger the rejection may be; this reflects the Examiner's strong belief that the claim sought is not patentable.
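As a hedged illustration, the reference counts mentioned above could be derived as follows once the cited references have been extracted from each Office action. The input layout (one dictionary per Office action, mapping a claim number to the set of references cited against it), the function name `reference_counts` and the reference labels are all hypothetical; how the references are extracted from the prosecution history is outside this sketch.

```python
def reference_counts(office_actions, claim):
    """For each Office action, in chronological order, return a pair
    (references cited against the claim, references not cited before)."""
    counts = []
    seen = set()
    for refs_by_claim in office_actions:
        refs = set(refs_by_claim.get(claim, ()))
        counts.append((len(refs), len(refs - seen)))
        seen |= refs
    return counts


# Hypothetical data: two Office actions rejecting claim 1.
actions = [
    {1: {"REF-A", "REF-B"}},
    {1: {"REF-A", "REF-C"}},
]
print(reference_counts(actions, claim=1))  # [(2, 2), (2, 1)]
```

In this sketch a reference counts as new if it was not cited against the same claim in any earlier Office action; comparing only against the immediately preceding action would be an equally plausible reading.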
The Key Term Analyzer identifies essential terminologies or key terms from one or more parts of a patent, such as the topic, abstract, specification and/or patent claims. As described above, these essential terminologies can be used in other processes of the system, for example, in the Patent Analyzer, to be described later. The Key Term Analyzer employs common natural language processing techniques incorporating POS (Part of Speech) tagging and keyword extraction. In one embodiment of the present invention, the topic and abstract sections of a patent are used to extract essential terminologies or key terms. With reference to
The part of speech of a word in a key phrase can be any one element in the set of all parts of speech, i.e. noun, verb, adjective etc. Examples of POS that can be extracted are illustrated in Table 4.
According to one aspect of the present invention, we assume that nouns, verbs and adjectives appear much more commonly in a key phrase. Thus we choose consecutive nouns (NN), consecutive verbs (VB) or consecutive adjectives (JJ) as the possible values of the parts of speech of the words in a key phrase. With reference to
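The grouping of consecutive nouns, verbs or adjectives into candidate key phrases can be sketched as follows. The sketch assumes that the text has already been tagged by some POS tagger; the tagged tokens in the usage example are written by hand, and the names `tag_class` and `candidate_phrases` are hypothetical.

```python
from itertools import groupby

KEY_CLASSES = ("NN", "VB", "JJ")  # noun, verb and adjective tag prefixes


def tag_class(tag: str):
    """Map a tag such as NNS or VBZ to its class prefix, or None."""
    for prefix in KEY_CLASSES:
        if tag.startswith(prefix):
            return prefix
    return None


def candidate_phrases(tagged_tokens):
    """Join each run of consecutive words of the same class into a phrase."""
    phrases = []
    for cls, run in groupby(tagged_tokens, key=lambda wt: tag_class(wt[1])):
        if cls is not None:
            phrases.append(" ".join(word for word, _ in run))
    return phrases


# Hand-tagged tokens, for illustration only.
tagged = [("a", "DT"), ("touch", "NN"), ("screen", "NN"), ("detects", "VBZ"),
          ("initial", "JJ"), ("movement", "NN")]
print(candidate_phrases(tagged))
# ['touch screen', 'detects', 'initial', 'movement']
```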
While we assume that key phrases are more likely to comprise consecutive words of the noun, verb or adjective word classes, variations should not depart from the spirit of the present invention. For example, in another embodiment of the present invention, we could extract key phrases comprising more than one word class, such as “heuristic_JJ rule_NN” or “liquid_NN emitting_JJ display_VV”, using a framework similar to that shown in
In another embodiment of the present invention, key terms can be extracted using statistical approaches such as TF-IDF (term frequency-inverse document frequency), mutual information, entropy, etc. For example, TF-IDF is calculated from the term frequency and the inverse document frequency to determine the importance of a term. If the TF-IDF value of a term is very high, the term is extracted as a key term.
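A minimal TF-IDF sketch is shown below. It assumes that each document has already been tokenized into a list of terms, uses the common form tf × ln(N/df), and treats “very high” as exceeding a caller-supplied threshold; the function names, the weighting variant and the threshold value are assumptions, not part of the disclosure.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dictionary per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


def key_terms(doc_scores, threshold):
    """Terms whose TF-IDF value reaches the threshold, in sorted order."""
    return sorted(term for term, score in doc_scores.items() if score >= threshold)


docs = [["touch", "screen", "touch"], ["screen", "display"]]
scores = tf_idf(docs)
print(key_terms(scores[0], threshold=0.1))  # ['touch']
```

A term such as “screen” that occurs in every document receives a score of zero and is therefore never selected.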
According to another aspect of the present invention, additional features of keyword extraction are based on “phrases” that appear in the patent specification frequently, excluding stop words. For example, “initial movement” or “initial movement of finger” in
According to another aspect of the present invention, the features for keyword extraction are based on “legal” key terms, especially the transitional words or limiting language in a patent claim. For example, the words “consisting”, “consists”, “consist”, “comprising”, “whereby” etc. are frequently used in patent claim language. Text that correlates to those terms is particularly interesting (for claim construction and interpretation purposes) and can contribute more influence or weight to the keyword extraction.
A patent analyzer generates patent assessment metrics. According to one aspect of the present invention, a patent analyzer could generate a patent claim breadth indicator. The result of the claim breadth indicator can be integrated into an overall assessment of the patent as one assessment factor, or used to identify essential claim language that should prompt the user for more scrutiny. In one embodiment of the present invention, the claim breadth indicator can be based on one or more of the metrics below.
(1) New text added in claim amendment. For example, all or part of added text in Document Comparator can be highlighted to alert the user of potential claim limitation in a patent under evaluation. In the example as shown in
(2) The number of changes (added or deleted text) in the claim amendment. For example, a counter is initialized to zero. When any text is added or deleted, the counter increments by a predefined number or by a dynamically changing number (for example, based on the length of the changed text). This indicator reflects the extent of the claim amendment, or the significance of the changes in the claim relative to the original claim, and can be used to calculate claim breadth as shown in Table 1.
In another embodiment of the present invention, the essential terminologies or key phrases or key terms extracted can have several uses in a patent analytics task. According to one aspect of the present invention, key phrases extracted from every patent can be merged together to keep only unique key phrases, or grouped to provide a summarization on the patent being analyzed, or summarization of a particular claim of interest.
In another embodiment of the present invention, the Key Term Analyzer for a patent claim could also be used to identify “enabling” elements of a claim and to compare these elements with the specification to make sure they are described and supported in the specification. With reference to
Further variations of checking “enabling” elements could employ approximate matching (if two words have the same meaning but do not exactly match, e.g. one is in singular form and the other in plural form) or spelling correction (if there are spelling errors in the specification or claim due to typos or to errors resulting from document scanning or OCR) etc. Approximate matching can employ well-known natural language processing techniques, for example, stemming the words and then comparing them, using edit distance to compare two strings of text, or using an ontology or thesaurus to find words with similar meanings.
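The following sketch illustrates approximate matching under simplifying assumptions: the stemmer only strips a trailing “s” (a stand-in for a real stemming algorithm), and spelling tolerance uses the similarity ratio of `difflib.SequenceMatcher` from the Python standard library in place of a raw edit distance. The function names and the 0.85 threshold are hypothetical.

```python
import difflib


def crude_stem(word: str) -> str:
    """Very rough singular/plural normalization (not a real stemmer)."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w


def approx_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """True if the words share a stem or are nearly identical in spelling."""
    sa, sb = crude_stem(a), crude_stem(b)
    if sa == sb:
        return True
    return difflib.SequenceMatcher(None, sa, sb).ratio() >= threshold


print(approx_match("finger", "fingers"))     # True (same stem)
print(approx_match("movement", "movment"))   # True (likely typo or OCR error)
print(approx_match("finger", "display"))     # False
```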
According to another aspect of the present invention, key phrases can be used to assess the integrity of a claim, as described in Table 1. As an example, key phrases are identified from the claim section and checked against the specification of the patent. Because a claim needs to have support in the specification, each key phrase in the claim should presumably find its counterpart in the specification. We count the occurrences in the specification of each key phrase identified in the claim. With reference to Table 1, these occurrence counts are used to calculate claim integrity.
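The occurrence count can be sketched as below. Exact, case-insensitive phrase matching on word boundaries is an assumption of this sketch, as are the function names; the formula that turns the counts into a claim integrity value is the one defined in Table 1 and is not reproduced here.

```python
import re


def phrase_occurrences(key_phrases, specification: str):
    """Count how often each claim key phrase occurs in the specification."""
    spec = specification.lower()
    return {
        phrase: len(re.findall(r"\b" + re.escape(phrase.lower()) + r"\b", spec))
        for phrase in key_phrases
    }


def unsupported_phrases(counts):
    """Key phrases that never occur in the specification."""
    return [phrase for phrase, count in counts.items() if count == 0]


spec = "An initial movement of the finger is detected. The initial movement selects a command."
counts = phrase_occurrences(["initial movement", "finger", "heuristic rule"], spec)
print(counts)                       # {'initial movement': 2, 'finger': 1, 'heuristic rule': 0}
print(unsupported_phrases(counts))  # ['heuristic rule']
```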
According to another aspect of the present invention, key terms extracted by the Key Term Analyzer could also have an impact on the claim breadth indicator as described in Table 1. For example, if the changed text (between the claim in the original application and the claim in the granted patent) involves key terms, a heavier penalty will be imposed on the indicator, whereas changed text that does not include key terms will incur a lighter penalty or no penalty on the claim breadth indicator.
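One possible way to weight changed text by whether it involves key terms is sketched below. The penalty values `heavy` and `light`, the substring test and the function name `breadth_penalty` are illustrative assumptions only; the disclosure does not fix particular values.

```python
def breadth_penalty(changed_segments, key_terms, heavy: float = 2.0, light: float = 0.5) -> float:
    """Sum a heavier penalty for changed segments that contain a key term
    and a lighter one for segments that do not."""
    lowered = {term.lower() for term in key_terms}
    total = 0.0
    for segment in changed_segments:
        text = segment.lower()
        total += heavy if any(term in text for term in lowered) else light
    return total


# Hypothetical changed segments from a claim amendment.
print(breadth_penalty(["heuristic rule", "the"], {"heuristic"}))  # 2.5
```

Setting `light` to 0.0 corresponds to the case in which changed text without key terms incurs no penalty.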
According to another aspect of the present invention, changed text from the claim amendment history can be further examined. For example, if the word “comprise” in a claim is replaced with “consist”, a penalty can be added to the claim breadth indicator because the terms “comprise” and “consist” are essential in construing the scope of a patent claim, in that “consist” dramatically limits the scope of the claim. In yet another example, if the changes had involved introducing a new essential technical term, such as “heuristic” in the example in
The description here is not intended to limit the scope of the invention. The extensions that can be inferred, comprehended and understood by one of ordinary skill in the art are not exhaustively listed. For example, the system and method as described in the present invention require a computing device with a microprocessor to execute computer commands. The computing device requires a memory and an I/O device (such as a display, printer, USB port, serial/parallel port, or wired or wireless internet connection) in order to present the patent assessment results to a user. Similarly, a report generator is also needed in order to show the assessment results to a user.
Further, when various components are needed, relevant databases may be utilized. Key Term Analyzer may function purely based on the patent under evaluation itself or may incorporate additional information from other databases (DB) or knowledge bases (KB). For example, Key Term Analyzer may use the information in the patent and find related information in the technology knowledgebase (e.g. a published paper from the same inventor on the same invention, where key words are explicitly listed) and use that related information to guide the key term extraction. This can be implemented by assigning new features or giving more favorable weights in extracting keywords.
Still further, some common-knowledge key terms in a specific field can be used to build a custom-made stop word bank in the Key Term Analyzer. The field of the invention can first be identified from the patent classification (e.g. U.S. Class or WIPO class) assigned to each patent. For example, a patent may belong to computer architecture according to its patent classification, in which case the terms “CPU”, “memory”, “storage” etc. need not be explicitly explained in the specification of the patent before one of ordinary skill in the art could understand the scope and meaning of these terms. Consequently, once these terms are identified, they can be removed from the Key Term Analyzer result or assigned a lighter weight in assessing the claim breadth or integrity.
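A field-specific stop word bank can be sketched as a simple lookup keyed by the field identified from the patent classification. The field label used as the key, the contents of the bank and the function name `filter_key_terms` are hypothetical; only the example terms “CPU”, “memory” and “storage” come from the description above.

```python
# Hypothetical stop word bank keyed by technical field.
FIELD_STOP_WORDS = {
    "computer architecture": {"cpu", "memory", "storage"},
}


def filter_key_terms(key_terms, field: str, bank=FIELD_STOP_WORDS):
    """Drop key terms that are common knowledge in the given field."""
    stop_words = bank.get(field, set())
    return [term for term in key_terms if term.lower() not in stop_words]


print(filter_key_terms(["CPU", "heuristic rule", "memory"], "computer architecture"))
# ['heuristic rule']
```

Instead of removing such terms, a variation could keep them and assign them a lighter weight.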
Still further, according to another aspect of the present invention, key terms can be extracted not only from the title and/or abstract, but also from other parts of the patent document, e.g. the specification, the description of the drawings, the claims etc. Still further, the output from the Patent Analyzer can be based on the duration of the prosecution (i.e. how long it takes for the patent to be granted from its filing date), the number of Office actions that occurred during the prosecution period of the patent, the number of unique references used in all Office action rejections, and the average number of references used in each Office action rejection, as criteria in patent assessment.
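The prosecution-based criteria listed above can be computed as in the following sketch, assuming the filing date, grant date and the references cited in each Office action are already available in structured form. The `OfficeAction` record, its field names, the reference labels and the dates in the usage example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OfficeAction:
    mailed: date
    references: frozenset  # identifiers of the references cited in the rejection


def prosecution_metrics(filing: date, grant: date, actions):
    """Summarize prosecution duration and reference usage."""
    unique_refs = set()
    for action in actions:
        unique_refs |= action.references
    total_refs = sum(len(action.references) for action in actions)
    return {
        "duration_days": (grant - filing).days,
        "office_actions": len(actions),
        "unique_references": len(unique_refs),
        "avg_references_per_action": total_refs / len(actions) if actions else 0.0,
    }


# Hypothetical prosecution record, for illustration only.
actions = [
    OfficeAction(date(2013, 1, 15), frozenset({"REF-1", "REF-2"})),
    OfficeAction(date(2013, 9, 3), frozenset({"REF-2", "REF-3", "REF-4"})),
]
print(prosecution_metrics(date(2012, 4, 20), date(2014, 4, 20), actions))
```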
Still further, the components of the patent assessment can be arranged differently than that in
Still further, various functional components can be built in a feedback network in order to derive optimal parameters for each component. For example, the result from patent claim breadth indicator in Patent Analyzer can be fed back to Key Term Analyzer to fine tune the keyword extraction, or be fed back to Document Comparator to fine tune the edit distance costs.
Still further variations, including combinations and/or alternative implementations, of the embodiments described herein can be readily obtained by one skilled in the art without burdensome and/or undue experimentation. Such variations are not to be regarded as a departure from the spirit and scope of the invention.
This application claims the benefit of U.S. Provisional Application Ser. No. 61/635,896, filed on Apr. 20, 2012. The disclosure of the above application is incorporated herein by reference in its entirety for any purpose.
Number | Date | Country
---|---|---
61635896 | Apr 2012 | US