The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:
Like reference numerals are used to designate like parts in the accompanying drawings.
The detailed description provided below in connection with the appended drawings is intended as a description of the present examples and is not intended to represent the only forms in which the present example may be constructed or utilized. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.
Although dictionaries of proverbs and idioms exist in paper and electronic form, it is hard for a non-native speaker to determine the context in which a particular idiom should be used. Furthermore, if a non-native speaker inputs one or two keywords into an online dictionary, they are presented with a list of several potential idioms/proverbs and no assistance is provided to identify which of the displayed phrases is the one that the non-native speaker is most likely to want to use.
The term ‘string’ is used herein to refer to a linear sequence of alpha-numeric characters, which may include spaces and/or punctuation, such as one or more words, numbers, acronyms, abbreviations or phrases.
The method as shown in
Although in
Having identified the keywords, these keywords are analyzed (step 302) to identify the root of the word, different forms of the word (e.g. alternative conjugations of verbs) etc. In the example given above, the root of “shooting” may be identified as “shoot” and alternative conjugations may include “shot”, “shoots” etc. The root of “hip” may be identified as “hip” and alternative forms may include “hips” (the plural form). An example method of identifying the different forms of a word is described at http://www.phon.ucl.ac.uk/home/dick/enc/morphology.htm which is incorporated herein by reference. Where the method is implemented within an application which contains a spelling and/or grammar function, the spelling and/or grammar engine may be used in this analysis. The analysis of the keywords may also include identification of alternative spellings (e.g. “colour” and “color”) or common misspellings of words. The result of this analysis may therefore be a number of words related to each of the identified keywords, for example:
The words identified in the analysis (in step 302) are then used in identifying potential matching strings within the database (step 303). This identification process may be performed using look-up tables or any means for searching the database of strings to identify those strings containing one or more of the words identified in the analysis. Potential matches may be identified as those strings containing at least one of the identified words (or search terms) relating to each of the keywords identified, e.g. strings containing one of “shooting”, “shoot”, “shot” and “shoots” and also one of “hip” and “hips” in the example given above. In some situations, this step will only identify one potential match; however, where fewer keywords are identified (in step 301) more matches may be identified. In another example, where n keywords are identified (in step 301), potential matches may first be sought which contain at least one of the identified words relating to each of the n keywords (as described above). If no potential matches are identified, the search may be repeated to look for potential matches which contain at least one of the identified words relating to m1 keywords from the set of n identified keywords (where m1<n, e.g. m1=n−1). If this still does not identify any potential matches, the process may be repeated again to look for potential matches which contain at least one of the identified words relating to m2 keywords from the set of n identified keywords (where m2<m1<n, e.g. m2=m1−1=n−2), and so on until a potential match is identified or the routine stops (e.g. after a predefined number of iterations or where mx=0).
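The progressive relaxation of the search described above can be sketched as follows. This is a minimal illustration only, not the described implementation: the `WORD_FORMS` table stands in for the output of the analysis in step 302, and the `STRINGS` list, the third keyword “quickly” and all function names are assumptions introduced for the example.

```python
from itertools import combinations

# Hypothetical output of the keyword analysis (step 302): each keyword
# maps to the set of related word forms that are used as search terms.
WORD_FORMS = {
    "shooting": {"shooting", "shoot", "shot", "shoots"},
    "hip": {"hip", "hips"},
    "quickly": {"quickly", "quick"},
}

# Hypothetical database of strings.
STRINGS = [
    "shoot from the hip",
    "a shot in the dark",
    "joined at the hip",
]


def tokens(text):
    """Lower-case a string and split it into a set of words."""
    return set(text.lower().split())


def matches_all(string, keywords):
    """True if the string contains at least one form of every keyword."""
    words = tokens(string)
    return all(words & WORD_FORMS[k] for k in keywords)


def find_potential_matches(keywords, strings=STRINGS):
    """Search with all n keywords, then n-1, n-2, ... until a match is found."""
    n = len(keywords)
    for m in range(n, 0, -1):  # m = n, n-1, ..., 1
        found = []
        for subset in combinations(keywords, m):
            for s in strings:
                if matches_all(s, subset) and s not in found:
                    found.append(s)
        if found:
            return found
    return []  # the routine stops once m reaches 0


print(find_potential_matches(["shooting", "hip"]))
print(find_potential_matches(["shooting", "hip", "quickly"]))
```

In this sketch the second call finds nothing for all three keywords, so the search is repeated with every pair of keywords before a match is returned.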
The potential matching strings are then filtered by domain (step 304). The word ‘domain’ (also referred to herein as a ‘classification’) is used herein to refer to a particular sphere (or field) of use of a string, such as “business”, “slang”, “popular use” etc. The domains (or classifications) may in some examples be more specific, for example by being limited to a particular type of business such as “marketing”, “legal”, “sales”, “communications”, “banking”, “media” etc. Each string in the database is categorized by one or more domains and the applicable domains for each string within the database are recorded in the database of strings, for example:
or:
It will be appreciated that these represent only two possible ways in which domains may be associated with strings within the database. As shown above, a string may be associated with one or more domains.
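One possible way of recording domains against strings, and of filtering potential matches by domain (step 304), is sketched below. The `Entry` class, the `filter_by_domain` function and the particular domain assignments are assumptions made purely for illustration; they are not taken from the database described herein.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A string together with the one or more domains it is categorized by."""
    text: str
    domains: set = field(default_factory=set)


# Hypothetical database contents; the domain assignments are invented.
DATABASE = [
    Entry("shoot from the hip", {"business", "popular use"}),
    Entry("joined at the hip", {"popular use"}),
    Entry("a shot in the dark", {"popular use", "slang"}),
]


def filter_by_domain(entries, domain):
    """Keep only the entries that are categorized under the given domain."""
    return [e for e in entries if domain in e.domains]


for entry in filter_by_domain(DATABASE, "business"):
    print(entry.text)
```

Storing the domains as a set per string allows a string to belong to several domains without duplicating the string itself.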
In a first example, as shown in
In a second example, as shown in
Domain=Business
Domain=Popular Use
Although
Although the step of filtering the potential matches is described above as being part of the data processing and comparison step (step 102), the filtering step may alternatively be performed at other points within the method of
Once the matching strings have been displayed to the user (in step 105), the user can then choose whether to use any of the strings. The user may also, in some examples, be given an option to view further information relating to one or more of the strings (as described below). The user may be presented with a window enabling him to insert a phrase into the document (or other file) that he is working on or alternatively the user may be able to cut/copy a string from the display window and paste it into a file as required.
The database of strings 205 may also include further information relating to each of the strings or such further information may be stored in a separate data store (not shown in
In the above description, prepositions and other parts of speech are filtered out in order to identify the keywords (step 301). However, in some examples, some or all of these filtered out parts of speech may be used to filter the potential matches (either before or after the filtering by domain, step 304), for example where a very large number of potential matches are identified (in step 303).
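The use of the filtered out parts of speech to narrow a large set of potential matches might be sketched as follows. The `STOP_WORDS` list and both function names are hypothetical, and the fallback to the unfiltered candidates when nothing survives is an assumption of this sketch rather than a feature of the method described.

```python
# Hypothetical list of words filtered out when identifying keywords (step 301).
STOP_WORDS = {"from", "the", "at", "in", "a", "of", "to"}


def split_input(text):
    """Separate the input words into keywords and filtered out words."""
    words = text.lower().split()
    keywords = [w for w in words if w not in STOP_WORDS]
    filtered_out = [w for w in words if w in STOP_WORDS]
    return keywords, filtered_out


def narrow_by_filtered_words(candidates, filtered_out):
    """Keep candidates containing every filtered out word.

    Assumption: if no candidate survives, the original list is returned
    so that this extra filter never discards every potential match.
    """
    kept = [
        c for c in candidates
        if all(w in c.lower().split() for w in filtered_out)
    ]
    return kept or candidates


keywords, filtered_out = split_input("hip from")
candidates = ["shoot from the hip", "joined at the hip"]
print(narrow_by_filtered_words(candidates, filtered_out))
```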
In the above description, the user inputs words contained within a string that he is trying to identify. In another example, the user may input an acronym or abbreviation (e.g. a common abbreviation, an abbreviation used in text messaging etc). In such an example, the processing and comparison step (step 102) may comprise, as shown in
Domain=Banking
Domain=Commonly used phrases
Domain=Communications
Domain=Diving
The method described above may be integrated within a software application such as a Microsoft Office (trade mark) application, an instant messenger application, an email application etc. In such an example, the input of text (in step 101) may be performed by typing into the application (e.g. within a document or an email). The method may be triggered via a control within the application (e.g. a button, an item on a menu bar, a hotkey etc) and may either search the whole document (e.g. on a sentence by sentence basis or identifying acronyms and/or abbreviations) or only the highlighted (or otherwise selected or identified) text (e.g. a phrase, expression, sentence, acronym, abbreviation etc). This functionality may be incorporated within an existing spelling/grammar function and may be checked at the same time as the spelling/grammar or independently.
In the above description, the running of the method is initiated by the user (e.g. by clicking on a button or other control). However, the method may alternatively run automatically when triggered by a software application. For example the method may be triggered by pressing the ‘send’ button within an email application such that the email is searched for keywords (in the same way as searching a whole document, as described above). In another example, the method may be triggered by pressing the ‘send’ (or equivalent) button within an instant messenger application. In such examples, the user may have used acronyms, common abbreviations etc when writing their message and these may be automatically translated prior to the sending of a message such that the recipient receives the full text alternative to any acronyms or abbreviations used by the sender. In such an example, the database of strings may comprise a database of acronyms and/or abbreviations.
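The automatic translation of acronyms and abbreviations when a message is sent could, for instance, be sketched as below. The `ABBREVIATIONS` table, the `expand_abbreviations` and `send` functions and the `transport` callback are all hypothetical names chosen for this illustration; a real messenger or email application would supply its own sending mechanism and database.

```python
import re

# Hypothetical database of acronyms and abbreviations.
ABBREVIATIONS = {
    "asap": "as soon as possible",
    "fyi": "for your information",
    "brb": "be right back",
}


def expand_abbreviations(message, table=ABBREVIATIONS):
    """Replace each known abbreviation in the message with its full text."""
    def replace(match):
        word = match.group(0)
        # Look the word up case-insensitively; unknown words are kept as-is.
        return table.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", replace, message)


def send(message, transport):
    """Expand abbreviations, then hand the message to the transport callback."""
    transport(expand_abbreviations(message))


# Usage: 'print' stands in for whatever actually delivers the message.
send("FYI, I will reply ASAP.", print)
```

Matching on whole alphabetic words means that an abbreviation embedded inside a longer word is left untouched.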
Although the above description relates to use of the methods described within a single language, the methods may also be used to identify corresponding idioms/expressions in different languages. For example, this information may be offered to a user as part of the further information relating to each of the strings. In this example, the database of strings 205 may further comprise corresponding strings in different languages or alternatively may comprise references to another data store where the corresponding strings in different languages may be stored. A user may be presented with an option to select the languages of interest.
Although the above introduction relates to the use of the methods described herein by a non-native speaker (e.g. a non-native English speaker for strings in English, or a non-native Spanish speaker for strings in Spanish etc), this is described by way of example only and does not provide any limitation to the applicability of the methods. The methods are also applicable for users who are native speakers for the main language of the database.
Although the present examples are described and illustrated herein as being implemented in a system as shown in
The term ‘computer’ is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the term ‘computer’ includes PCs, servers, mobile telephones, personal digital assistants and many other devices.
Those skilled in the art will realize that storage devices utilized to store program instructions and data can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that, by utilizing conventional techniques known to those skilled in the art, all or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.
The methods described herein may be performed by software in machine readable form on a storage medium. The software may be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.
This acknowledges that software can be a valuable, separately tradable commodity. It is intended to encompass software which runs on or controls “dumb” or standard hardware to carry out the desired functions. It is also intended to encompass software which “describes” or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.
Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.
The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate.
It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art. The above specification, examples and data provide a complete description of the structure and use of exemplary embodiments of the invention. Although various embodiments of the invention have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this invention.