In abugida or alphasyllabary writing systems such as Devanagari, which is part of the large Brahmic family of scripts, consonant-vowel sequences may be written as a unit, where each unit is based on a base consonant letter, and vowel notations may be indicated by diacritical marks or modifiers. In addition, for example, in Devanagari, vowels may also be written in base independent form when occurring at the beginning of a word, or when following another vowel.
Further, in writing systems such as Devanagari and other Indic scripts, consonant clusters may also be combined and written using conjuncts, which are more formally called typographic ligatures or ligatures. Ligatures are formed when a plurality of graphemes in a script are joined into a single symbol or glyph in the script. A grapheme or a base character may be viewed as the smallest semantically distinguishing unit in a script. As used herein, the term conjunct is used to refer to consonant-vowel combinations, consonant-consonant combinations, and consonant-consonant-vowel combinations. Consonant-consonant-vowel combinations are conjuncts which have been further modified by diacritical marks.
As a consequence of the possible combinations of graphemes with diacritical marks and conjuncts, the number of distinct units or characters in a script can grow rapidly. For example, in Devanagari, with approximately 12 vowels, 34 consonants and 16 diacritical marks, the consonant-consonant sequences yield over 1156 conjunct characters (or conjuncts). In addition, there are also a multitude of characters formed as a consequence of unique consonant-diacritical mark combinations that correspond to each consonant-vowel sequence.
The input of such a large number of characters on modern electronic devices, such as tablets, handheld devices, smartphones, etc. in an intuitive and efficient manner can be a challenge for users, device manufacturers and application developers.
For example, many implementations for the input of Indic text or other abugida writing systems with conjuncts may use three or more virtual keyboards, where a user alternates between keyboards for vowels, consonants, diacritical marks, and conjuncts. The use of several virtual keyboards makes texting, e-mail and various other applications impractical and unwieldy. Moreover, even with multiple virtual keyboards, in conventional schemes, conjuncts often require multi-key combinations.
Therefore, there is a need for a system to facilitate efficient and intuitive user text input for abugida writing systems with conjuncts such as Indic text.
In some embodiments, a processor-implemented method for input of text for abugida writing systems on a Mobile Station (MS) may comprise obtaining a base character, the base character being obtained by performing Optical Character Recognition (OCR) on written user-input on the MS; applying one or more functional operators to the base character to obtain a conjunct character, the functional operators comprising at least one of a diacritical operator or a conjunct operator; and displaying the conjunct character.
In another aspect, an MS may comprise: a memory to store a plurality of base characters in an abugida writing system, a touchscreen to receive written user input comprising text for the abugida writing system, and a processor coupled to the memory and the touchscreen. The processor may be configured to: obtain a base character from the plurality of stored base characters, the base character being obtained by performing Optical Character Recognition (OCR) on the written user-input, and apply one or more functional operators to the base character to obtain a conjunct character, the functional operators comprising at least one of a diacritical operator or a conjunct operator. In some embodiments, the MS may further comprise a display coupled to the processor, the display for displaying the conjunct character.
In a further aspect, an MS may comprise storage means to store a plurality of base characters in an abugida writing system, input means to receive written user input comprising text for the abugida writing systems, and processing means coupled to the storage means and the input means. In some embodiments, the processing means may comprise: means for obtaining a base character, the base character being obtained using Optical Character Recognition (OCR) means to perform OCR on the written user-input, and means for applying one or more functional operators to the base character to obtain a conjunct character, the functional operators comprising at least one of a diacritical operator or a conjunct operator. In some embodiments, the MS may further comprise display means coupled to the processing means, the display means to display the conjunct character.
In some embodiments, a computer-readable medium may comprise instructions, which when executed by a processor, may perform steps in a method on a Mobile Station (MS) for input of text for abugida writing systems. The steps may comprise: obtaining a base character, the base character being obtained by performing Optical Character Recognition (OCR) on written user-input on the MS; applying one or more functional operators to the base character to obtain a conjunct character, the functional operators comprising at least one of a diacritical operator or a conjunct operator; and displaying the conjunct character.
Disclosed embodiments also pertain to apparatuses, systems, program code including firmware, and computer-readable media embodying instructions to perform the above methods.
Mobile Station (MS) 100 may, for example, include: one or more processors 102, memory 104, removable media drive 120, display 170, touchscreen 172, and, as applicable, various sensors 136, which may be operatively coupled using one or more connections 106 (e.g., buses, lines, fibers, links, etc.). As used herein, mobile device or mobile station (MS) 100 may take the form of a cellular phone, mobile phone, or other wireless communication device, a personal communication system (PCS) device, personal navigation device (PND), Personal Information Manager (PIM), or a Personal Digital Assistant (PDA), a laptop, tablet, notebook and/or handheld computer. The terms mobile device and mobile station are used interchangeably herein. In some embodiments, MS 100 may be capable of receiving wireless communication and/or navigation signals.
Further, the term “mobile station” is also intended to include devices which communicate with a personal navigation device (PND), such as by short-range wireless, infrared, wireline connection, or other connections, regardless of whether position-related processing occurs at the device or at the PND. Also, “mobile station” is intended to include all devices, including various wireless communication devices, which are capable of communication with a server, regardless of whether wireless signal reception, assistance data reception, and/or related processing occurs at the device, at a server, or at another device associated with the network. Any operable combination of the above is also considered a “mobile station.” In some embodiments, MS 100 may also include one or more ports for communicating over wired networks.
In some embodiments, display 170 (shown in
Further, exemplary MS 100 may be modified in various ways in a manner consistent with the disclosure, such as, by combining (or omitting) one or more of the functional blocks shown. For example, in some embodiments, MS 100 may comprise one or more of speakers, microphones, transceivers (e.g., wireless network interfaces), Satellite Positioning System (SPS) receivers and one or more Cameras 130. Further, in certain example implementations, portions of MS 100 may take the form of one or more chipsets, and/or the like.
Processors 102 may be implemented using a combination of hardware, firmware, and software. In some embodiments, processors 102 may include Text Input Module 116, which may facilitate the processing of input in abugida writing systems with conjuncts. For example, Text Input Module 116 may facilitate the input of Indic scripts, such as the Brahmic family of scripts, which includes the Devanagari script.
In one embodiment, Text Input Module 116 may process user input received using touchscreen 172, which may capture the coordinates of the points of contact, time(s) or time period(s) associated with each point of contact, the sequence in which the points of contact occurred, and/or other parameters associated with each point of contact. The points of contact and parameters associated with each point of contact and/or a set of points of contact may be relayed to Text Input Module 116, which may use the points of contact and parameters to interpret user gestures, recognize strokes of a script, perform Optical Character Recognition, and/or identify other context-dependent input. In some embodiments, input which has been captured by touchscreen 172 and processed by Text Input Module 116 may be displayed as Indic Text 175 using Display 170. In one embodiment, Text Input Module 116 may use a combination of parameters such as user indication, the current location of the stylus, the displacement between two consecutive contact points, the use of a space bar, the duration of a pause between strokes, context-sensitive techniques, etc., to determine when the entry of a character has been completed.
Processors 102 may also be capable of processing other information either directly or in conjunction with one or more other functional blocks shown in
The methodologies described herein may be implemented by various means depending upon the application. For example, these methodologies may be implemented in hardware, firmware, software, or any combination thereof. For a hardware implementation, processors 102 may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, embedded processor cores, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof.
For a firmware and/or software implementation, the methodologies may be implemented using procedures, functions, and so on that perform the functions described herein. Any non-transitory machine-readable medium tangibly embodying instructions may be used in implementing the methodologies described herein. Non-transitory computer-readable media may include physical computer storage media. A storage medium may be any available medium that can be accessed by a computer. In one embodiment, software code pertaining to the processing of text input for abugida writing systems with conjuncts may be stored in a non-transitory computer-readable medium and read using removable media drive 120 and executed by at least one of processors 102. For example, the methods and/or apparatuses presented herein may take the form in whole or part of a computer-readable medium that may include program code to support Text Input Module 116 in a manner consistent with disclosed embodiments.
Non-transitory computer-readable media may include a variety of physical computer storage media. By way of example, and not limitation, such non-transitory computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer; disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Memory 104 may be implemented within processors 102 and/or external to processors 102. As used herein the term “memory” refers to any type of long term, short term, volatile, nonvolatile, or other memory and is not to be limited to any particular type of memory or number of memories, or type of media upon which memory is stored. In general, memory 104 may represent any data storage mechanism. Memory 104 may include, for example, a primary memory and/or a secondary memory. Primary memory may include, for example, a random access memory, read only memory, etc. While illustrated in
Secondary memory may include, for example, the same or similar type of memory as primary memory and/or one or more data storage devices or systems, such as, for example, flash/USB memory drives, memory card drives, disk drives, optical disc drives, tape drives, solid state memory drives, etc. In certain implementations, secondary memory may be operatively receptive of, or otherwise configurable to couple to a non-transitory computer-readable medium in removable drive 120. In some embodiments, non-transitory computer readable medium may form part of memory 104.
In addition to storage on computer readable medium, instructions and/or data may be provided as signals on transmission media included in a communication apparatus. For example, a communication apparatus may include a transceiver having signals indicative of instructions and data. The instructions and data are configured to cause one or more processors to implement the functions outlined in the claims. That is, the communication apparatus includes transmission media with signals indicative of information to perform disclosed functions.
As shown in
In some embodiments, when text is written on touchscreen 172, information pertaining to a temporal sequence of stylus contact points on touchscreen 172 may be captured. For example, the information captured may comprise the (X, Y) coordinates of an ordered set of contact points C={C1, C2 . . . Cn} relative to a touchscreen coordinate system or frame of reference during some time period and the points C1-Cn may be ordered in sequence based on a time of contact associated with the respective contact point. As an example, contact point C1 may occur prior to contact point C2.
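By way of illustration only, the captured information may be sketched in Python as shown below. The ContactPoint type, its field names, and the order_contact_points helper are hypothetical names introduced for this example and are not elements of the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ContactPoint:
    """One stylus contact sample (hypothetical structure for illustration)."""
    x: float  # X coordinate in the touchscreen frame of reference
    y: float  # Y coordinate in the touchscreen frame of reference
    t: float  # time of contact, in arbitrary units


def order_contact_points(points: Iterable[ContactPoint]) -> List[ContactPoint]:
    """Return contact points C1..Cn ordered by their time of contact."""
    return sorted(points, key=lambda p: p.t)
```

With this ordering, the first element of the returned list corresponds to contact point C1 and the last to Cn.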
As can be seen from Table 1 above, there is relatively little variation in the X coordinate of the contact points initially, while the Y-coordinates vary. Thus, the initial sequence of contact points 251 for the character “” 250 corresponds to the set of substantially vertically oriented line segments (shown within the dashed box) 252 in
In one embodiment, a first feature vector Vi−1, i may be constructed by connecting one contact point in the sequence of contact points to the next. For example, consecutive contact points Ci−1=(xi−1, yi−1) and Ci=(xi, yi) may be connected. In one embodiment, the angular displacement of feature vector Vi−1, i between two contact points can be determined relative to an axis in a frame of reference associated with touchscreen 172. Based on the value of the angular displacement of feature vector Vi−1, i relative to the designated axis, the feature vector may be labeled with one of a plurality of identifiers or labels. In some embodiments, the process above may be repeated by connecting the next two contact points Ci=(xi, yi) and Ci+1=(xi+1, yi+1) to obtain vector Vi, i+1 and a label for vector Vi, i+1 based on its angular displacement relative to the designated axis. In some embodiments, each character in a script may be uniquely characterized by a sequence of feature vectors.
As an illustrative example, four feature vector labels may be used to describe characters, with labels given by N, S, E, and W. For example, the label “N” may be used to describe vectors that are oriented (along a Y-axis) toward the top of the page, “S” to describe vectors that are oriented (along the Y-axis) toward the bottom of the page, “E” to describe vectors that are oriented (along the X-axis) toward the right of the page, and “W” to describe vectors that are oriented (along the X-axis) toward the left of the page. Thus, for character “” 250, the initial feature vector sequence (or sequence of labels) based on consecutive contact points represented by the sets of line segments 252 and 253 may take the form SSSSWW 255, which may be normalized to the initial vector sequence shown graphically as SW 257 by using a single label to represent a sequence of consecutive repeated labels. For example, in initial feature vector sequence 255, the sequence “SSSS” may be normalized to “S” and the sequence “WW” may be normalized to “W” to yield vector sequence “SW”.
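A minimal sketch of one possible labeling scheme is given below in Python. It assumes, purely for illustration, that the touchscreen Y coordinate increases toward the bottom of the screen, that the designated axis is the positive X-axis, and that four labels are assigned using 90-degree sectors. The function names label_vector and raw_label_sequence are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def label_vector(p: Point, q: Point) -> str:
    """Label the vector from p to q as N, S, E or W by its angle to the X-axis."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    # Assumption: screen Y grows downward, so negate dy to make "up" positive.
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0
    if 45.0 <= angle < 135.0:
        return "N"
    if 135.0 <= angle < 225.0:
        return "W"
    if 225.0 <= angle < 315.0:
        return "S"
    return "E"


def raw_label_sequence(points: Sequence[Point]) -> str:
    """Label each pair of consecutive, distinct contact points."""
    labels: List[str] = []
    for p, q in zip(points, points[1:]):
        if p != q:  # a zero-length vector has no direction, so skip it
            labels.append(label_vector(p, q))
    return "".join(labels)
```

For instance, a stroke that moves down the screen and then to the left, such as (10, 0), (10, 5), (10, 10), (5, 10), (0, 10), would be labeled "SSWW" under these assumptions. A finer partition of the angle range would yield more labels and greater sensitivity.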
In general, normalization may be performed on a raw vector sequence by replacing a sequence of consecutive repeated labels (such as “EEEE”) with a single instance of the repeated label (e.g. “E”), or equivalently, by retaining the first label in a sequence of consecutive repeated labels (e.g. “E”) in a raw feature vector sequence (e.g. “EEEE”) and eliminating subsequent consecutive duplicated labels (e.g. “EEE”) in the sequence.
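The normalization described above amounts to collapsing each run of consecutive repeated labels into a single label. A short Python sketch follows, in which the function name normalize is a hypothetical choice and label sequences are assumed to be represented as strings with one letter per label.

```python
from itertools import groupby


def normalize(raw_sequence: str) -> str:
    """Keep the first label of every run of consecutive repeated labels."""
    # groupby yields one (label, run) pair per run of identical neighbors.
    return "".join(label for label, _run in groupby(raw_sequence))
```

Under this sketch, normalize("SSSSWW") evaluates to "SW" and normalize("EEEE") evaluates to "E". A label that reappears after a different label is retained, so normalize("SSWWSS") evaluates to "SWS".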
A raw or initially captured feature vector sequence 258-1 for the entire character “” 250 is also shown in
The example above with four feature vectors is for illustrative purposes only, and, in general, the number of feature vectors used to characterize each symbol in a script may be based on the degree of sensitivity required, the speed of the processor, the resolution of the touchscreen, complexity of the script, and various other system parameters. In some embodiments, input from sensors 136 on MS 100 may be used to determine an orientation of MS 100 and appropriate adjustments may be made when calculating angular displacements or determining labels.
In some embodiments, each character in a script may be associated with a unique feature vector sequence. In some embodiments, the feature vector sequence corresponding to some subset of characters in the script and/or one or more other scripts may be stored in databases, Look-Up Tables (LUTs), font tables and/or character tables, linked lists, or other suitable data structures in memory 104 or MS 100. Therefore, by comparing the normalized feature vector sequence of the written user-input with the feature vector sequences stored in memory 104, characters 210 input by the user may be recognized.
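As a simple illustration of such a comparison, the following Python sketch uses a dictionary as the look-up table. The table contents are placeholders invented for this example (they are not feature vector sequences of any actual script), and the names FEATURE_TABLE and recognize_exact are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical table: keys are normalized feature vector sequences and values
# are placeholder character names, not real data for any script.
FEATURE_TABLE: Dict[str, str] = {
    "SWES": "CHARACTER_1",
    "ESW": "CHARACTER_2",
}


def recognize_exact(normalized_sequence: str,
                    table: Dict[str, str] = FEATURE_TABLE) -> Optional[str]:
    """Return the stored character whose sequence matches exactly, else None."""
    return table.get(normalized_sequence)
```

An exact-match lookup of this kind only succeeds when the written input reproduces a stored sequence precisely, so a practical recognizer would likely also tolerate near matches.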
Referring to
OCR is typically a computationally intensive process and the computational cost may be higher for abugida languages with large numbers of conjuncts or ligatures. Therefore, in some embodiments, an OCR process may be configured to recognize graphemes, or base characters, which may then be configured to yield symbols in the script in a manner consistent with disclosed embodiments. As noted above, graphemes represent the smallest semantically distinguishing unit in a script.
As shown in
Further, in some embodiments, GUI 200 may also include other functional operator icons 230, such as conjunct operator 232-1, correction operator 239 etc. In some embodiments, conjunct operator 232-1 may be used to generate consonant-consonant conjuncts or ligatures. Note that the locations of the set of diacritical mark icons 235 and functional operator icons 230 in GUI 200 may be changed. For example, the locations of the icons may be user-configurable. In one embodiment, the icons may be transparent or semi-transparent and may be placed within and/or close to the edges of window 205 to facilitate quick user-access. In some embodiments, a conjunct character may be further modified by adding diacritical marks using one of diacritical operators 235.
Further, in some embodiments, GUI 200 may comprise a simple user-configurable single keyboard layout, which may include graphemes and independent vowels, along with functional operators 230 and 235 to permit text entry in a manner consistent with disclosed embodiments.
In one embodiment, when Join operator 232-1 is dragged and dropped into window 220, for example, at a location between characters “” and “”, then, as shown in
Further, as shown in
In some embodiments, the set of potential characters for correction 248 may be populated with characters whose feature vector sequences are within some Levenshtein distance of the normalized feature vector sequence associated with the entered/recognized character. The “Levenshtein distance” or “edit distance” measures the difference between two string sequences. The Levenshtein distance between two words can be viewed as the minimum number of single-character edits (such as insertions, deletions, or substitutions) to change one word into the other. In some embodiments, the Levenshtein distance between the feature vector sequences or feature vector strings for two characters may be used as a measure of similarity between the two characters. For example, a set of potential correction characters may be determined based on the Levenshtein distance between the feature vector sequence of each character in the set of potential correction characters and the normalized feature vector sequence that is associated with an entered character.
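For illustration, the Levenshtein distance between two label strings may be computed with the standard dynamic-programming recurrence, as in the Python sketch below. The helper correction_candidates and the table passed to it are hypothetical, and unit cost is assumed for each insertion, deletion, and substitution.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-label insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, label_a in enumerate(a, start=1):
        curr = [i]
        for j, label_b in enumerate(b, start=1):
            cost = 0 if label_a == label_b else 1
            curr.append(min(prev[j] + 1,          # delete label_a
                            curr[j - 1] + 1,      # insert label_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correction_candidates(entered: str, table: Dict[str, str],
                          max_distance: int) -> List[Tuple[str, int]]:
    """Return (character, distance) pairs within max_distance, nearest first."""
    scored = sorted((levenshtein(entered, seq), ch) for seq, ch in table.items())
    return [(ch, d) for d, ch in scored if d <= max_distance]
```

For example, levenshtein("SW", "SWE") is 1, because a single label is appended, while levenshtein("SW", "NESW") is 2.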
In some embodiments, a database or table or LUT in memory that holds the feature vector sequence for a character may also hold characters that are within some Levenshtein distance of the feature vector for that character. For example, in one embodiment, a database or LUT may be indexed by a “feature vector sequence” or a “feature vector sequence key”, and a record in the database may hold the character to be displayed, the feature vector sequence for the character, as well as characters that are within various Levenshtein distances of that character. Accordingly, in some embodiments, the set of potential characters for correction 248 displayed may be efficiently determined during the OCR process for the entered character.
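One way such a record might be organized is sketched below in Python. The CharacterRecord type, the LUT contents, and the corrections_for helper are hypothetical and use placeholder character names. The neighbor lists are assumed to have been computed ahead of time, so no distance computation is needed at look-up time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CharacterRecord:
    character: str  # character to be displayed (placeholder names here)
    sequence: str   # feature vector sequence for the character
    # Precomputed neighbors, keyed by Levenshtein distance from this character.
    neighbors: Dict[int, List[str]] = field(default_factory=dict)


# Hypothetical LUT indexed by feature vector sequence key.
LUT: Dict[str, CharacterRecord] = {
    "SW": CharacterRecord("CHARACTER_1", "SW",
                          {1: ["CHARACTER_2"], 2: ["CHARACTER_3"]}),
    "SWE": CharacterRecord("CHARACTER_2", "SWE",
                           {1: ["CHARACTER_1", "CHARACTER_3"]}),
    "SWES": CharacterRecord("CHARACTER_3", "SWES",
                            {1: ["CHARACTER_2"], 2: ["CHARACTER_1"]}),
}


def corrections_for(key: str, max_distance: int,
                    lut: Dict[str, CharacterRecord] = LUT) -> List[str]:
    """Return precomputed neighbors of the keyed character, nearest first."""
    record = lut.get(key)
    if record is None:
        return []
    result: List[str] = []
    for distance in sorted(record.neighbors):
        if distance <= max_distance:
            result.extend(record.neighbors[distance])
    return result
```

Storing neighbors in the record trades additional memory for a faster response when the correction list is requested.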
Table 270 shows a set of recognized characters 275-1, 277-1 and 279-1 along with corresponding potential correction characters (shown by dashed ovals) 275-2, 277-2 and 279-2, respectively. In some embodiments, potential correction characters 275-2, 277-2 and 279-2 may be selected based on a Levenshtein distance of their respective feature vector sequences to the feature vector sequence associated with the corresponding recognized character and by adding information (e.g. adding one or more labels) to the feature vector sequence for the corresponding recognized character.
Table 280 shows a set of recognized characters 285-1, 287-1 and 289-1 along with potential correction characters 285-2, 287-2 and 289-2, respectively. In some embodiments, potential correction characters 285-2, 287-2 and 289-2 may be selected based on Levenshtein distance of their respective feature vector sequences to the feature vector sequence associated with the corresponding recognized character and by removing information (e.g. removing one or more labels) from the feature vector sequence for the corresponding recognized character.
In general, in some embodiments, potential correction characters may be determined by adding, removing and/or substituting labels from the feature vector sequence associated with a recognized character. Each addition, deletion or substitution operation may be given a weight and a Levenshtein distance may be computed between two feature vector sequences by adding the weights.
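A weighted variant of the distance computation may be sketched in Python as follows. The parameter names w_ins, w_del, and w_sub, and their default values, are assumptions made for this example, since the disclosure does not specify particular weights.

```python
def weighted_edit_distance(a: str, b: str, w_ins: float = 1.0,
                           w_del: float = 1.0, w_sub: float = 1.0) -> float:
    """Least total weight of insertions, deletions and substitutions
    needed to change label sequence a into label sequence b."""
    prev = [j * w_ins for j in range(len(b) + 1)]
    for i, label_a in enumerate(a, start=1):
        curr = [i * w_del]
        for j, label_b in enumerate(b, start=1):
            substitute = prev[j - 1] + (0.0 if label_a == label_b else w_sub)
            curr.append(min(prev[j] + w_del,      # delete label_a
                            curr[j - 1] + w_ins,  # insert label_b
                            substitute))          # substitute or match
        prev = curr
    return prev[-1]
```

With all weights equal to one, this reduces to the ordinary Levenshtein distance. If a substitution is weighted more heavily than a deletion followed by an insertion, the cheaper pair of operations determines the distance.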
In
In step 305, text input, such as Indic text input, may be received and processed. For example, if Indic text input is input in written form, then, in some embodiments, the written input may be processed to obtain a normalized feature vector sequence for the input character. In one embodiment, the graphemes or base characters may be entered using a simple keyboard, such as a virtual keyboard or a Bluetooth keyboard. For example, graphemes may be entered using keys for some subset of base characters in the script such as independent vowels or single consonants.
In step 310, if the entered character was written, then, the obtained normalized feature vector sequence of the entered character may be matched with stored feature vector sequences of characters stored in memory 104. For example, the normalized feature vector sequence corresponding to the entered character may be compared with the feature vector sequences of one or more characters stored in a database or lookup table. In some embodiments, the Levenshtein distance between the normalized feature vector sequence corresponding to the entered character and the feature vector sequence corresponding to a stored character may be obtained. The stored characters, which are potential matches to the entered character, may be sorted in ascending order of Levenshtein distance. A zero Levenshtein distance represents an exact match. In general, a shorter Levenshtein distance between an entered character and a stored character may be indicative of more similarity between the two characters and point to a greater likelihood of a match. In embodiments, or instances, where the base character was entered using a virtual keyboard, step 340 may be directly invoked after step 305 for the current iteration bypassing steps 320, 325 and 330.
In step 320, if there is a match or a high confidence that one of the characters is a match (“Y”, in step 320), then, in step 330, the character that is the best match (lowest Levenshtein distance) may be selected as corresponding to the entered character. For example, in one embodiment, if only one character is within some predetermined threshold Levenshtein distance, then, that character may be selected as corresponding to the entered character.
In step 320, if there is no exact match or the confidence that a match has been obtained is below some threshold (“N”, in step 320), then, in step 325, one of the characters in a set of likely matches may be selected. For example, in one embodiment, the feature vector sequences of several characters may be within some Levenshtein distance of the entered character. Accordingly, in step 325, other criteria such as context and/or dictionaries, may be used to select a character from the set of likely matches. For example, one of the characters within the Levenshtein distance may be selected based on the frequency with which the current and an immediately preceding character occur together, previously entered/recognized characters, spellings of words that may be formed using prior characters, etc.
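The selection logic of steps 320 through 330 might be sketched in Python as shown below. The function select_character, its parameters, and the pair_counts table of character-pair frequencies are hypothetical. The sketch assumes that Levenshtein distances to the stored characters have already been computed and that pair frequency is the only contextual criterion applied.

```python
from typing import Dict, List, Optional, Tuple


def select_character(scored: List[Tuple[str, float]], threshold: float,
                     previous: Optional[str] = None,
                     pair_counts: Optional[Dict[Tuple[str, str], int]] = None
                     ) -> Optional[str]:
    """Pick one character from (character, distance) candidates."""
    ranked = sorted(scored, key=lambda item: item[1])  # nearest first
    if not ranked:
        return None
    within = [ch for ch, distance in ranked if distance <= threshold]
    if len(within) == 1:
        return within[0]  # single confident match
    # Otherwise choose among the likely matches using context when available.
    pool = within or [ch for ch, _distance in ranked]
    if previous is not None and pair_counts:
        # max() keeps the earliest (nearest) candidate when counts are tied.
        return max(pool, key=lambda ch: pair_counts.get((previous, ch), 0))
    return pool[0]
```

When no context is supplied, the sketch falls back to the candidate with the lowest distance.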
In step 340, if there is additional input (“Y” in step 340) then, in step 345, the method may determine if the input is functional. For example, if the input is functional (“Y” in step 345) and pertains to a diacritical function or a conjunct (join) function, then in step 350, the function may be applied. For example, in one instance, if a “join” or conjunct function invocation is registered then the previous two characters may be combined to form a conjunct. In another instance, if the conjunct function invocation is through a drag and drop operation, then the coordinates of the drop may be used to determine the characters to be combined and a conjunct character may be obtained by combining the characters in step 350. In some embodiments, the conjunct character to be displayed may be obtained by using the two consonant characters in sequence as an index to search a lookup table for a corresponding entry, which may comprise the conjunct character to be displayed.
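As one hedged illustration of the look-up described above, the Python sketch below indexes a table by a pair of consonants. It assumes, beyond what the disclosure states, that characters are represented as Unicode strings, in which a Devanagari conjunct is encoded as consonant, virama (U+094D), consonant and is shaped by the rendering engine. The names CONJUNCT_LUT and apply_conjunct, and the drop_index convention, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
KA, SSA, TA, RA = "\u0915", "\u0937", "\u0924", "\u0930"

# Hypothetical table indexed by the two consonants in sequence.
CONJUNCT_LUT: Dict[Tuple[str, str], str] = {
    (KA, SSA): KA + VIRAMA + SSA,
    (TA, RA): TA + VIRAMA + RA,
}


def apply_conjunct(chars: List[str], drop_index: Optional[int] = None,
                   lut: Dict[Tuple[str, str], str] = CONJUNCT_LUT) -> List[str]:
    """Combine the characters on either side of the join position.

    drop_index is the index of the second character of the pair; when it is
    omitted, the previous two characters (the last two) are combined.
    """
    chars = list(chars)
    if drop_index is None:
        drop_index = len(chars) - 1
    if not 1 <= drop_index < len(chars):
        return chars  # nothing to combine at this position
    pair = (chars[drop_index - 1], chars[drop_index])
    conjunct = lut.get(pair)
    if conjunct is None:
        return chars  # no table entry, so leave the text unchanged
    chars[drop_index - 1:drop_index + 1] = [conjunct]
    return chars
```

In a drag and drop invocation, the drop coordinates would first be mapped to a position between two displayed characters, and that position would be passed as drop_index.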
In a further instance, if the input (“Y” in step 345) pertains to a diacritical function, then the appropriate diacritical mark may be added, in step 350, to the prior character. If the diacritical mark invocation is through a drag and drop operation, then the coordinates of the drop may be used to determine the character, which may then be modified with the diacritical mark in step 350.
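A diacritical operator may be sketched in the same spirit. The Python example below assumes a Unicode text representation, in which a dependent vowel sign is stored after its consonant and its visual position is determined when the text is rendered. The function name apply_diacritic and its index parameter are hypothetical.

```python
import unicodedata
from typing import List


def apply_diacritic(chars: List[str], mark: str, index: int = -1) -> List[str]:
    """Append a single combining mark to the character at index.

    By default the prior (last) character is modified; for a drag and drop
    invocation, index would be derived from the drop coordinates.
    """
    # Mn and Mc are the Unicode categories for non-spacing and spacing marks.
    if unicodedata.category(mark) not in ("Mn", "Mc"):
        raise ValueError("mark must be a combining mark")
    chars = list(chars)
    if not chars:
        return chars
    chars[index] = chars[index] + mark
    return chars
```

For example, applying the vowel sign I (U+093F) to the consonant KA (U+0915) yields the two-code-point string U+0915 U+093F, which a Devanagari-capable renderer draws with the vowel sign to the left of the consonant.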
In another instance, if the input (“Y” in step 345) pertains to a correction function, then a list of potential characters for correcting the prior input, such as list 248, may be displayed. In some instances, the list of potential characters for correcting the prior input may be based on the Levenshtein distance of the vector sequence to the entered character or the recognized character.
If the input in step 345 is non-functional and non-diacritical (“N” in step 345), then the process returns to step 305 to begin another iteration.
In step 355, the recognized and/or selected character corresponding to the entered character after modification in step 350 may be displayed and the process returns to step 340 to begin the next iteration.
In step 340, if there is no additional input (“N” in step 340) then, in step 355, the character selected in steps 325 or 330 as corresponding to the entered character may be displayed and the process ends.
In step 380, a base character in the abugida writing system may be obtained by performing Optical Character Recognition (OCR) on written user-input, which may occur, for example, on MS 100. In some embodiments, OCR may be performed by obtaining a normalized feature vector sequence corresponding to the written user-input, where the normalized feature vector sequence is based on a set of contact points associated with the written user input. Further, the base character may be identified based, at least in part, on a comparison of the normalized feature vector sequence corresponding to the written user-input with a set of stored feature vector sequences, where each feature vector sequence in the set corresponds to a distinct base character. For example, the base character may be identified by determining Levenshtein distances between the normalized feature vector sequence corresponding to the written user-input and a plurality of feature vector sequences in the set. One of the plurality of feature vector sequences may then be selected, if the Levenshtein distance for the selected feature vector sequence is below some predetermined threshold; and the base character may be identified by determining the base character that is associated with the selected feature vector sequence.
Next, in step 382, a functional operator may be applied to the base character to obtain a conjunct character, wherein the functional operator may comprise at least one of a diacritical operator or a conjunct operator. In some embodiments, the conjunct character may be obtained by adding a diacritical mark to the base character at an appropriate location. The diacritical mark may correspond to the diacritical operator and its position may be determined based on the base character and the diacritical operator. Further, in some embodiments, the conjunct character may be obtained by using at least one immediately preceding character and the base character as an index to search a Look-Up Table for a corresponding entry, which may comprise the conjunct character. In some embodiments, the functional operators may be displayed as icons on a virtual keyboard associated with MS 100, and/or invoked by user gestures on touchscreen 172 on MS 100.
In step 384, the conjunct character obtained in step 382 may be displayed. For example, the conjunct character may be displayed on display 170. In some embodiments, the functional operator may comprise a compound functional operator, which may act to apply a plurality of operations, which may be some combination of diacritical and/or conjunct operations to the base character.
In some embodiments, the diacritical and/or conjunct operators may be repeatedly applied to form complex conjuncts. For example, a triple conjunct character may be formed either by using a compound functional operator or by applying a conjunct operator repeatedly. As one example, the triple conjunct may be formed by combining three consonants. First, a conjunct operator may be applied to a currently entered recognized character and an immediately preceding recognized user-input character to obtain a first conjunct. Next, the conjunct operator may be applied to the first conjunct and another recognized user-input character that immediately precedes the first conjunct to obtain a second conjunct, which is the triple conjunct. Thus, in the example above, the triple conjunct may be formed by applying the conjunct operator twice. In some embodiments, the triple (or multiple) conjunct may be obtained by using a compound conjunct operator, which, in some embodiments, may be displayed as an icon on display 170. The compound conjunct operator may perform the operations described above when invoked. For example, the compound conjunct operator may combine the currently entered character with two immediately preceding characters to obtain the triple conjunct.
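The repeated application described above can be illustrated with the following Python sketch. It assumes a Unicode text representation in which consonants are joined with the Devanagari virama (U+094D), which is an assumption of this example rather than a requirement of the disclosure. The names join and compound_conjunct are hypothetical.

```python
from typing import List

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def join(first: str, second: str) -> str:
    """Conjunct operator: combine two characters (or conjuncts) into one."""
    return first + VIRAMA + second


def compound_conjunct(chars: List[str], count: int = 3) -> List[str]:
    """Combine the last `count` characters by applying join count-1 times."""
    chars = list(chars)
    if count < 2 or len(chars) < count:
        return chars
    for _ in range(count - 1):
        # Join the current (last) character or conjunct with its predecessor.
        second = chars.pop()
        first = chars.pop()
        chars.append(join(first, second))
    return chars
```

For the consonants SA (U+0938), TA (U+0924), and RA (U+0930), the first application joins TA and RA, and the second joins SA with that result, giving the single string SA, virama, TA, virama, RA.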
In steps 410-1 and 410-2 character input may be received. For example, in step 410-1 the character “” 415-1 may be entered. For example, in one embodiment touch-based user written input may be used to enter character “” 415-1. Character “” 415-2 may be entered in step 410-2.
In step 420-1 and 420-2, characters “” 415-1 and “” 415-2 may be processed using OCR to recognize the received character input. In some embodiments, the first and second input characters “” 415-1 and “” 415-2, respectively, may be recognized using a stroke-based recognition approach. For example, normalized feature vector sequences may be derived for characters “” 415-1 and “” 415-2, respectively, based on user input and the normalized feature vector sequences may be used, at least in part, to recognize and/or determine potential matches for the entered characters.
In some embodiments, the sets of potential characters recognized 425-1 and 425-2 may be populated with characters whose feature vector sequences are within some Levenshtein distance of the normalized feature vector sequence associated with the entered/recognized characters. For example, set 425-1 shows the set of characters “”, “”, and “” with Levenshtein distances d11, d12, and d13, respectively, as potential matches for input character “” 415-1. Similarly, set 425-2 shows characters “” and “” with Levenshtein distances d21 and d22, respectively, as potential matches for input character “” 415-2.
In step 430-1, the Levenshtein distance d11 to the nearest neighbor is compared to a threshold. If the Levenshtein distance d11 to the nearest neighbor is below the threshold (“Y” in step 430-1), then, in step 440-1, the closest neighbor “” is displayed. If the Levenshtein distance d11 to the nearest neighbor is not below the threshold (“N” in step 430-1), then, in step 450-1, the characters in set 425-1 may be displayed.
Similarly, in step 430-2, the Levenshtein distance d21 to the nearest neighbor is compared to a threshold. If the Levenshtein distance d21 to the nearest neighbor is below the threshold (“Y” in step 430-2), then, in step 440-2, the closest neighbor “” is displayed. If the Levenshtein distance d21 to the nearest neighbor is not below the threshold (“N” in step 430-2), then, in step 450-2, the characters in set 425-2 may be displayed.
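The threshold test of steps 430-1 and 430-2 may be sketched in Python as follows. The function characters_to_display is a hypothetical name, and candidates are assumed to be supplied as (character, distance) pairs such as those in sets 425-1 and 425-2.

```python
from typing import List, Tuple


def characters_to_display(candidates: List[Tuple[str, float]],
                          threshold: float) -> List[str]:
    """Return only the nearest neighbor if it is below the threshold;
    otherwise return every candidate so that the user may choose."""
    ranked = sorted(candidates, key=lambda item: item[1])  # nearest first
    if not ranked:
        return []
    nearest, distance = ranked[0]
    if distance < threshold:
        return [nearest]
    return [ch for ch, _distance in ranked]
```

A one-element result corresponds to displaying the closest neighbor directly, while a longer result corresponds to displaying the candidate set for user selection.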
For example, if Levenshtein distances d11 and d21 are both above the threshold, then sets 425-1 and 425-2 may be displayed to the user in window 220 on MS 100. In step 455, one of the characters from set 425-1 may be selected as corresponding to input character 415-1 and/or one of the characters from set 425-2 may be selected as corresponding to input character 415-2.
In step 460, for example, the user may use a drag and drop operation or a pinch gesture, and/or may use one or more of the set of functional operator icons 230 displayed in GUI 200, on the characters in window 220 to form a conjunct. In the example shown in
Although the present disclosure is described in relation to the drawings depicting specific embodiments for instructional purposes, the disclosure is not limited thereto. Various adaptations and modifications may be made without departing from the scope. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description.