The present application relates to use of thesaurus databases to develop groups of conceptually related keywords for use in research.
Researchers, and in particular patent researchers, require tools for quickly and accurately locating words having a relationship to a concept sought in a search project. As an example, if a researcher were searching for multiple concepts simultaneously, and a first concept relates to a “package”, the researcher might desire to use words like “box”, “container” or “receptacle.” The typical method for locating such synonyms is to use an online or paper-based thesaurus. Several drawbacks exist in these traditional approaches. First, each word will have multiple meanings, and each meaning will have its own set of related words, requiring the researcher to have knowledge prior to hunting down his keywords. Second, this approach assumes that the first word sought is the primary word, in that it best represents the concept. However, in most cases, the researcher will discover words that better represent each concept, prompting him to again query the thesaurus with the new word. While the traditional approach can be effective, it is also time consuming.
It is an object of the present invention to provide the researcher with a method of rapidly and accurately processing multiple queries of a thesaurus database.
It is a second object of the present invention to provide the researcher with options that he recognizes upon seeing them, rather than requiring the researcher to know the words before seeing them.
In the preferred embodiment of the present invention, a method of compiling a list of words with common relationships to a search concept comprises the first step of providing a system for compiling a list of words with common relationships. The system comprises an interactive client device having constituents including a display, a programmable thesaurus analysis module, a programmable interface module and a data storage element, said constituents digitally interconnected through a processor. The system further comprises a first program operable with the programmable thesaurus analysis module and a second program operable with the programmable interface module. The system further comprises both a user input/output interface and a network signally connected to the interactive client device. Lastly, the system comprises at least one thesaurus database signally connected to the network.
Operationally, when the first program instructs the programmable thesaurus analysis module to collect and manipulate data from the at least one thesaurus database through the network and store said data in the data storage element, and the second program displays selected data from input and storage in the display and receives instructions for the manipulation of data, a list of words may be selected, sorted and stored based on iterative incidences of the words.
The second step of the method comprises inputting seed words numbering n, n greater than or equal to one, into a first box in a user GUI through the input/output user interface in communication with the interface module. The third step comprises commanding, by means of said user interface, the analysis module to conduct a loop. In the loop, the at least one thesaurus database is consulted by means of the network to collect words with meanings similar to each of the n seed words, including their synonyms, to form a first virtual array of candidate words.
The fourth step of the method comprises instructing the analysis module, through the input/output interface, to conduct a while loop. In the while loop, frequency of incidence data is collected and stored for each of the candidate words in the first virtual array. Any duplications of the n words and all words with a non-zero incidence count are eliminated. A second virtual array of candidate words is formed from the residual and displayed in a second box in the user GUI.
The fifth step of the method comprises selecting preferred words from the second box in the user GUI on the basis of incidence count and posting said selected words to the first box. The sixth step comprises repeating all of the five steps above for each entry in the first box until the seed list is sufficiently populated and validated with incidence frequency. The seventh and last step comprises transferring the resulting list of words in the first box to a third box for registration as an inquiry string.
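By way of a non-limiting illustration, the collection and incidence counting of the third and fourth steps may be sketched in Python as follows. The names `TOY_THESAURUS` and `suggest_words` are hypothetical and do not appear in the application, and the in-memory dictionary merely stands in for a thesaurus database that would be consulted through the network. The sketch assumes that candidates are ranked by how many times they are returned across all seed words, and that the seed words themselves are removed from the result.

```python
from collections import Counter

# Hypothetical stand-in for the thesaurus database; an actual system
# would consult one or more databases through the network.
TOY_THESAURUS = {
    "package": ["box", "container", "parcel"],
    "box": ["container", "carton", "package"],
    "container": ["receptacle", "box", "vessel"],
}


def suggest_words(seed_words, thesaurus):
    """Return (word, incidence) pairs for candidates related to the seeds."""
    # Loop over the n seed words and gather every related word returned.
    candidates = []
    for seed in seed_words:
        candidates.extend(thesaurus.get(seed, []))

    # Count how often each candidate was returned (its incidence).
    counts = Counter(candidates)

    # Assumption: duplicates of the seed words are dropped from the result.
    for seed in seed_words:
        counts.pop(seed, None)

    # Highest incidence first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(suggest_words(["package", "box"], TOY_THESAURUS))
```

With the seeds "package" and "box", the word "container" is returned by both lookups and therefore ranks first. A researcher would promote it to the seed list and run the function again, which corresponds to the repetition of the sixth step.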
a-c are diagrams illustrating arrays manipulated by system shown in
Referring to
Referring to
Referring to
Step 300: The user 50 manually enters one or more user words 106 into the user words box 105. The user 50 then presses the run button 115.
Step 310: The thesaurus analysis module 142 then enters all user words into the user words array 200, which is depicted in
Step 320: The thesaurus analysis module 142 then executes a loop with the number of cycles equal to the user words array size 201. The loop is described as follows:
Step 330: The thesaurus analysis module 142 then executes a while loop, with the condition of related words array size 211>0. The while loop is described as follows:
Sort the suggested words array 220 high to low according to the frequency column. Finally, remove any entries in the suggested words array 220 that are also entered in the user words array 200.
Step 340: The thesaurus analysis module 142 then displays the suggested words array 220 in the suggested words box 110.
Step 350: The user 50 then scans the suggested words box 110 and picks one or more suggested words 111 and adds them to the user words box 105 by double clicking.
Step 360: The user 50 then decides whether to reload the suggested words box 110 according to the user words box 105. If yes, then return to step 310.
Step 370: The user 50 then moves the user words 106 out of the user words box and into a user group 121 in a user word groups box 120.
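As a hedged illustration only, the while loop of step 330 and the subsequent sort and removal may be sketched in Python as follows. The function name `build_suggested_words` and the two-column row layout are assumptions made for this example. The body of the while loop is likewise an assumption: the sketch supposes that each pass removes one entry from the related words array and either increments the frequency of an existing row in the suggested words array or appends a new row.

```python
def build_suggested_words(user_words, related_words):
    """Drain the related words into [word, frequency] rows, then sort and filter."""
    related = list(related_words)  # copy so the caller's list is left intact
    suggested = []                 # each row is [word, frequency]

    # While loop with the condition "related words array size > 0".
    while len(related) > 0:
        word = related.pop(0)
        for row in suggested:
            if row[0] == word:
                row[1] += 1        # word already listed: raise its frequency
                break
        else:
            suggested.append([word, 1])  # first occurrence of this word

    # Sort high to low according to the frequency column (stable sort).
    suggested.sort(key=lambda row: row[1], reverse=True)

    # Remove entries that are also entered in the user words array.
    return [row for row in suggested if row[0] not in user_words]


rows = build_suggested_words(
    ["package"],
    ["box", "container", "box", "package", "carton"],
)
print(rows)
```

In this example "box" occurs twice among the related words and is therefore listed first, while "package" is dropped because the user has already entered it.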
Number | Date | Country
---|---|---
61644261 | May 2012 | US