Example embodiments of the present disclosure relate generally to fuzzy term searching, and more particularly, to using fuzzy searching for misspelled and/or incomplete search queries.
Text searching is an important part of efficient computer usage including the identification of relevant resources. For example, search engines, including those supported by the internet, generally rely upon search terms to direct users to different websites or to identify resources that are responsive to the search terms. However, users often misspell search terms or do not provide enough information to perform an effective search for their desired target.
In order to efficiently identify the most relevant resources in response to a search query, the input from a user must be analyzed and, if necessary, adjusted to permit the query to be executed relative to the vast amount of information available without undue delay. Traditional databases do not allow for incomplete or inaccurate input or, if allowed, do not provide search results that are responsive to the intended query. While some databases provide for some adjustments to the search query to allow for some mistakes, these adjustments are often incomplete or otherwise still result in inefficient searches.
Accordingly, a method, apparatus, and computer program product are provided for offline fuzzy term searching. In an example embodiment, a method of offline term searching is provided. The method includes receiving one or more characters of a search query. The method also includes generating one or more search indicator values based on the one or more characters of a search query. Each search indicator value of the one or more search indicator values includes a digest of the one or more characters of a search query inputted into a minhash function with a distinct salt value input. The method further includes comparing the one or more search indicator values with one or more sets of database indicator values. Each database indicator value in the one or more sets of database indicator values corresponds to a database value.
In some embodiments, the method also includes generating one or more confidence levels of the one or more characters of a search query with the one or more database values based on the comparison of the one or more search indicator values with the one or more sets of database indicator values. In some embodiments, the method also includes providing one or more database values to a user as candidates to represent the search query with the one or more database values that are provided being based on the confidence levels that have been created.
In some embodiments, the method also includes causing the transmission of one or more of the database values based on the one or more confidence levels generated. In some embodiments, the digest of the one or more characters is based on the minhash function with the distinct salt value input being performed on one or more hash windows of the one or more characters of the search query. In some embodiments, each minhash function with the distinct salt value input includes a distinct salt value input to compare to a window of the one or more characters of the search query and the one or more salt values are pseudo-random values not related to the one or more characters of the search query.
In some embodiments, the method also includes generating a set of database indicator values. In such an embodiment, each database indicator value of the set of database indicator values includes a digest of the database value inputted into the minhash function. In some embodiments, comparing the set of search indicator values with one or more sets of database indicator values includes comparing search indicator values and database indicator values that were digests of the same minhash function. In some embodiments, the method also includes predicting the search query based on the one or more confidence levels generated.
In another example embodiment, an apparatus is provided for offline term searching. The apparatus includes at least one processor and at least one non-transitory memory including computer program code instructions, the computer program code instructions configured to, when executed, cause the apparatus to receive one or more characters of a search query. The computer program instructions are also configured to, when executed, cause the apparatus to generate one or more search indicator values based on the one or more characters of a search query. Each search indicator value of the one or more search indicator values includes a digest of the one or more characters of a search query inputted into a minhash function with a distinct salt value input. The computer program instructions are further configured to, when executed, cause the apparatus to compare the one or more search indicator values with one or more sets of database indicator values. Each database indicator value in the one or more sets of database indicator values corresponds to a database value.
In some embodiments, the computer program code instructions are further configured to, when executed, cause the apparatus to generate one or more confidence levels of the one or more characters of a search query with the one or more database values based on the comparison of the one or more search indicator values with the one or more sets of database indicator values. In some embodiments, the computer program code instructions are further configured to, when executed, cause the apparatus to cause the transmission of at least one database value to a user based on the one or more confidence levels generated.
In some embodiments, the computer program code instructions are further configured to, when executed, cause the apparatus to cause the transmission of one or more of the database values based on the one or more confidence levels generated. In some embodiments, the digest of the one or more characters is based on the minhash function with the distinct salt value input being performed on one or more hash windows of the one or more characters of the search query.
In some embodiments, each minhash function with the distinct salt value input includes a distinct salt value input to compare to a window of the one or more characters of the search query and the one or more salt values are pseudo-random values not related to the one or more characters of the search query. In some embodiments, the computer program code instructions are further configured to, when executed, cause the apparatus to generate a set of database indicator values. In such an embodiment, each database indicator value of the set of database indicator values includes a digest of the database value inputted into the minhash function. In some embodiments, comparing the set of search indicator values with one or more sets of database indicator values includes comparing search indicator values and database indicator values that were digests of the same minhash function. In some embodiments, the computer program code instructions are further configured to, when executed, cause the apparatus to predict the search query based on the one or more confidence levels generated.
In yet another example embodiment, a computer program product is provided that includes at least one non-transitory computer-readable storage medium having computer-executable program code portions stored therein with the computer-executable program code portions including program code instructions configured to receive one or more characters of a search query. The computer-executable program code portions also include program code instructions configured to generate one or more search indicator values based on the one or more characters of a search query. Each search indicator value of the one or more search indicator values includes a digest of the one or more characters of a search query inputted into a minhash function with a distinct salt value input. The computer-executable program code portions further include program code instructions configured to compare the one or more search indicator values with one or more sets of database indicator values. Each database indicator value in the one or more sets of database indicator values corresponds to a database value.
In some embodiments, the program code instructions are further configured to generate one or more confidence levels of the one or more characters of a search query with the one or more database values based on the comparison of the one or more search indicator values with the one or more sets of database indicator values. In some embodiments, the program code instructions are further configured to cause the transmission of at least one database value to a user based on the one or more confidence levels generated.
In some embodiments, the program code instructions are further configured to cause the transmission of one or more of the database values based on the one or more confidence levels generated. In some embodiments, the digest of the one or more characters is based on the minhash function with the distinct salt value input being performed on one or more hash windows of the one or more characters of the search query.
In some embodiments, each minhash function with the distinct salt value input includes a distinct salt value input to compare to a window of the one or more characters of the search query and the one or more salt values are pseudo-random values not related to the one or more characters of the search query. In some embodiments, the program code instructions are further configured to generate a set of database indicator values. In such an embodiment, each database indicator value of the set of database indicator values includes a digest of the database value inputted into the minhash function. In some embodiments, the program code instructions to compare the set of search indicator values with one or more sets of database indicator values include program code instructions to compare search indicator values and database indicator values that were digests of the same minhash function.
In still another example embodiment, an apparatus is provided including means for offline term searching. The apparatus includes means for receiving one or more characters of a search query. The apparatus also includes means for generating one or more search indicator values based on the one or more characters of a search query. Each search indicator value of the one or more search indicator values includes a digest of the one or more characters of a search query inputted into a minhash function with a distinct salt value input. The apparatus further includes means for comparing the one or more search indicator values with one or more sets of database indicator values. Each database indicator value in the one or more sets of database indicator values corresponds to a database value.
In some embodiments, the apparatus also includes means for generating one or more confidence levels of the one or more characters of a search query with the one or more database values based on the comparison of the one or more search indicator values with the one or more sets of database indicator values. In some embodiments, the apparatus also includes means for providing one or more database values to a user as candidates to represent the search query with the one or more database values that are provided being based on the confidence levels that have been created.
In some embodiments, the apparatus also includes means for causing the transmission of one or more of the database values based on the one or more confidence levels generated. In some embodiments, the digest of the one or more characters is based on the minhash function with the distinct salt value input being performed on one or more hash windows of the one or more characters of the search query. In some embodiments, each minhash function with the distinct salt value input includes a distinct salt value input to compare to a window of the one or more characters of the search query and the one or more salt values are pseudo-random values not related to the one or more characters of the search query.
In some embodiments, the apparatus also includes means for generating a set of database indicator values. In such an embodiment, each database indicator value of the set of database indicator values includes a digest of the database value inputted into the minhash function. In some embodiments, comparing the set of search indicator values with one or more sets of database indicator values includes comparing search indicator values and database indicator values that were digests of the same minhash function. In some embodiments, the apparatus also includes means for predicting the search query based on the one or more confidence levels generated.
The above summary is provided merely for purposes of summarizing some example embodiments to provide a basic understanding of some aspects of the invention. Accordingly, it will be appreciated that the above-described embodiments are merely examples and should not be construed to narrow the scope or spirit of the invention in any way. It will be appreciated that the scope of the invention encompasses many potential embodiments in addition to those here summarized, some of which will be further described below.
Having thus described certain example embodiments of the present disclosure in general terms, reference will hereinafter be made to the accompanying drawings which are not necessarily drawn to scale, and wherein:
Some embodiments of the present disclosure will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Indeed, various embodiments of the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like reference numerals refer to like elements throughout. As used herein, the terms “data,” “content,” “information,” and similar terms may be used interchangeably to refer to data capable of being transmitted, received and/or stored in accordance with embodiments of the present disclosure. Thus, use of any such terms should not be taken to limit the spirit and scope of embodiments of the present disclosure.
Various methods, apparatuses, and computer program products are provided in accordance with example embodiments of the present disclosure for improving fuzzy term searching, both online and offline. Conventional techniques may allow for fuzzy searching, but are not specifically designed, from a data structure perspective, to facilitate fuzzy searching such that responsive results are obtained efficiently and with improved performance and quality. Various embodiments of the present disclosure are designed for fault-tolerant fuzzy text queries, with some embodiments allowing for scaling across multiple servers and other embodiments configured to be performed locally within a device. In some embodiments, the operations described herein allow search queries to be processed in parallel, allowing for near-instantaneous determinations.
The processor 14 may be embodied in a number of different ways. For example, the processor 14 may be embodied as one or more of various hardware processing means such as a coprocessor, a microprocessor, a controller, a graphics processing unit (GPU), a digital signal processor (DSP), a processing element with or without an accompanying DSP, or various other processing circuitry including integrated circuits such as, for example, an ASIC (application specific integrated circuit), an FPGA (field programmable gate array), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. As such, in some embodiments, the processor 14 may include one or more processing cores configured to perform independently. A multi-core processor may enable multiprocessing within a single physical package. Additionally or alternatively, the processor 14 may include one or more processors configured in tandem via the bus to enable independent execution of instructions, pipelining and/or multithreading.
In an example embodiment, the processor 14 may be configured to execute instructions stored in the memory device 16 or otherwise accessible to the processor. Alternatively or additionally, the processor 14 may be configured to execute hard coded functionality. As such, whether configured by hardware or software methods, or by a combination thereof, the processor 14 may represent an entity (for example, physically embodied in circuitry) capable of performing operations according to an embodiment of the present disclosure while configured accordingly. Thus, for example, when the processor 14 is embodied as an ASIC, FPGA or the like, the processor 14 may be specifically configured hardware for conducting the operations described herein. Alternatively, as another example, when the processor 14 is embodied as an executor of software instructions, the instructions may specifically configure the processor to perform the algorithms and/or operations described herein when the instructions are executed. However, in some cases, the processor 14 may be a processor of a specific device (for example, the computing device) configured to employ an embodiment of the present disclosure by further configuration of the processor by instructions for performing the algorithms and/or operations described herein. The processor may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor. In some embodiments, the processor 14 may be configured to use machine learning or other operations described herein.
The apparatus 10 of an example embodiment may also include a communication interface 20 that may be any means such as a device or circuitry embodied in either hardware or a combination of hardware and software that is configured to receive and/or transmit data from/to other electronic devices in communication with the apparatus, such as by near field communication (NFC). Additionally or alternatively, the communication interface 20 may be configured to communicate over Global System for Mobile Communications (GSM), such as but not limited to Long Term Evolution (LTE). In this regard, the communication interface 20 may include, for example, an antenna (or multiple antennas) and supporting hardware and/or software for enabling communications with a wireless communication network. Additionally or alternatively, the communication interface 20 may include the circuitry for interacting with the antenna(s) to cause transmission of signals via the antenna(s) or to handle receipt of signals received via the antenna(s). In some environments, the communication interface 20 may alternatively or also support wired communication and/or infrastructure wireless links. The communication interface 20 may be configured to communicate, through various methods described herein, with one or more client devices (e.g., mobile devices, computers, or the like), and/or the like.
The apparatus 10 of an example embodiment may also optionally include or otherwise be in communication with a user interface 22. The user interface 22 may include a touch screen display, a speaker, physical buttons, and/or other input/output mechanisms. In an example embodiment, the processing circuitry 12 may comprise user interface circuitry configured to control at least some functions of one or more input/output mechanisms. The processing circuitry and/or user interface circuitry comprising the processing circuitry may be configured to control one or more functions of one or more input/output mechanisms through computer program instructions (for example, software and/or firmware) stored on a memory accessible to the processor (for example, memory device 16, and/or the like).
In some embodiments, the user interface 22 may be in communication with the processing circuitry 12 to receive an indication of a user input and/or to cause presentation of the video output generated by execution of computer software. As such, the user interface may include, for example, a keyboard, a mouse, a joystick, a display, a touch screen(s), touch areas, soft keys, a microphone, a speaker, or other input/output mechanisms. Alternatively or additionally, the processing circuitry 12 may comprise user interface circuitry configured to control at least some functions of one or more user interface elements such as, for example, a speaker, ringer, microphone, display, and/or the like. The processing circuitry and/or user interface circuitry comprising the processing circuitry may be configured to control one or more functions of one or more user interface elements through computer program instructions (e.g., software and/or firmware) stored on a memory accessible to the processor (e.g., memory device 16, and/or the like). The user interface may include one or more user equipment.
Referring now to Block 200 of
Referring now to Block 210 of
As discussed below in reference to Block 220, the salt value input used to create a search indicator value may be the same as one used to create a set of database indicator values stored in the memory 16. In some embodiments, a plurality of search indicator values may be generated based on a plurality of salt value inputs. In some embodiments, each salt value may be a pseudo-random value. In some embodiments, the salt value may not be related to the one or more characters of the search query. For example, each of the salt values may be an internet protocol (IP) address or a port number of the apparatus 10 or the like. In some embodiments, the salt value(s) may be unique to the index server and/or database. For example, each index server and/or database may have a distinct salt value input (e.g., one or more salt values). In various embodiments, the one or more salt values may be fed along with the characters into a minhash function, such that the digest of the same query may be different in an instance in which the salt value inputs are different.
In some embodiments, the search indicator values may be based on a character window size used by the minhash function, along with the salt value input. In an example embodiment, the minhash function may be configured to determine a respective value, such as the minimum value, from among the characters in the character window and the salt value inputs. In order to permit the determination of a minimum value, the characters, such as the alphabetic characters, that may comprise a search query may each be assigned a value, such as in accordance with a predefined translation table. For example, an A may be worth 7, a B worth 15, a C worth 2, and so on.
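By way of non-limiting illustration, the windowed selection described above may be sketched in Python as follows. The sketch rests on assumptions the disclosure leaves open: the names char_value and minhash_digest are hypothetical, the predefined translation table is stood in for by a salted SHA-256 hash, the window slides one character at a time, and each selected position is emitted only once (a selection scheme that resembles winnowing). It is not asserted to reproduce the example indicator values given elsewhere herein.

```python
import hashlib
from typing import Tuple

Salt = Tuple[int, ...]


def char_value(ch: str, salt: Salt) -> int:
    """Assumed stand-in for the translation table: a salted value per character."""
    data = repr(salt).encode("utf-8") + ch.encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_digest(text: str, salt: Salt, window: int = 3) -> str:
    """Keep the minimum-valued character of each sliding window of the text."""
    if not text:
        return ""
    picked = []
    last = -1  # position of the most recently emitted character
    for start in range(max(1, len(text) - window + 1)):
        span = range(start, min(start + window, len(text)))
        # Ties resolve to the leftmost position, so positions never move backward.
        best = min(span, key=lambda i: (char_value(text[i], salt), i))
        if best != last:
            picked.append(text[best])
            last = best
    return "".join(picked)


if __name__ == "__main__":
    for salt in [(0, 0), (1, 0)]:
        print(salt, repr(minhash_digest("INVALIDENSTRASSE BERLIN", salt)))
```

Because the value assigned to each character depends on the salt, the same text may yield a different digest for each salt value input, while an identical text and salt always yield the same digest.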
As described below, a plurality of minhash functions may be defined to evaluate the same search query with each minhash function utilizing a different salt value input. In some embodiments, the character window size may be consistent through all minhash functions performed by the apparatus 10. In such an embodiment, each minhash function may have a distinct salt value input, but the same window size. For example, the window size may be 3. In various embodiments, the distinct salt value inputs may randomize the digests of a minhash function such that different minhash functions (e.g., each index server having a different salt value input) may be effective in identifying different misspellings. In some embodiments, the character window size may be based on the tolerance to errors of an input desired. For example, the size of the window may be based on how often a typographical error may be expected to occur (e.g., in an instance a typographical error happens every 3-5 characters, then the character window may be 3-5 to account for such errors). In various embodiments, the digest of a minhash function may be calculated based on the number of characters defined by the character window (e.g., three characters) along with the salt value input (e.g., two salt values). In some embodiments, the window size may be different for different minhash functions, thereby likely resulting in different search indicator values. For example, even with the same salt value input, a different character window size may result in a minhash function generating a different digest in the form of a search indicator value. In an example embodiment, the different sets of database indicator values generated by minhash functions using different salt value inputs may be stored in the database, such as in different indexes maintained by the database, one of which stores the database indicator values generated by the minhash function using a respective salt value input. 
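As a hedged sketch of the per-salt indexes described above, the following Python fragment builds one index per salt value input, each mapping a database indicator value to the database value(s) that produced it. The function names build_indexes and minhash_digest, the SHA-256 based character ordering, and the in-memory dictionary layout are all assumptions made for illustration and are not taken from the disclosure.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Salt = Tuple[int, ...]


def minhash_digest(text: str, salt: Salt, window: int = 3) -> str:
    """Assumed stand-in minhash: lowest-valued character of each sliding window."""
    if not text:
        return ""

    def value(i: int):
        h = hashlib.sha256(repr(salt).encode("utf-8") + text[i].encode("utf-8"))
        return (h.digest(), i)  # ties resolve to the leftmost position

    picked, last = [], -1
    for start in range(max(1, len(text) - window + 1)):
        best = min(range(start, min(start + window, len(text))), key=value)
        if best != last:
            picked.append(text[best])
            last = best
    return "".join(picked)


def build_indexes(
    database_values: Iterable[str], salts: Iterable[Salt], window: int = 3
) -> Dict[Salt, Dict[str, List[str]]]:
    """Build one index per salt: indicator value -> database values."""
    values = list(database_values)
    indexes: Dict[Salt, Dict[str, List[str]]] = {}
    for salt in salts:
        index = defaultdict(list)
        for value in values:
            index[minhash_digest(value, salt, window)].append(value)
        indexes[salt] = dict(index)
    return indexes


if __name__ == "__main__":
    idx = build_indexes(["INVALIDENSTRASSE BERLIN", "FOOD NEAR ME"], [(0, 0), (1, 0)])
    for salt, index in idx.items():
        print(salt, index)
```

Keeping a separate index per salt means a search indicator value only ever needs to be looked up in the index generated with the same salt value input.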
In various embodiments, the salt value may be predetermined (e.g., determined once for each server). In some embodiments, the salt values may not be modified after being determined.
By way of example, in an instance there are two instances of a minhash function with different salt value inputs, e.g., (0, 0) and (1, 0), and window size of 3, a database value (e.g., “INVALIDENSTRASSE BERLIN”) may be processed by each instance of the minhash function and result in two different database indicator values (e.g., one database indicator value may be “IAIESAEIN” for salt values of (0, 0) and the other database indicator value may be “NLDNRS BLN” for salt values of (1, 0)). In such an example, in an instance the search query is spelled the same as the database value (e.g., “INVALIDENSTRASSE BERLIN”), then the resultant search indicator values generated by the same instances of the minhash function will be the same as for the database value (e.g., one search indicator value may be “IAIESAEIN” for salt values of (0, 0) and the other search indicator value may be “NLDNRS BLN” for salt values of (1, 0)). Such an example may produce a confidence level of 1. In an instance the search query is not spelled the same as the database value due to, for example, a misspelling (e.g., “INVALIDNESTRASSE BERLIN”), then one or more of the resultant search indicator values generated by the same instances of the minhash function may not match the database indicator values. For example, one search indicator value (e.g., “IAIESAEIN” for salt values of (0, 0)) may be the same as the corresponding database indicator value, while the other search indicator value (e.g., the value generated for salt values of (1, 0)) may differ from the corresponding database indicator value “NLDNRS BLN”. Such an example may produce a confidence level of 0.50. Alternatively, a different misspelling (e.g., “INVLAIDENSTRASSE BERLIN”) may result in search indicator values that match the opposite database indicator value from the previous misspelling (e.g., “NLDNRS BLN” may match the database indicator value “NLDNRS BLN”, but “IVAESAEIN” may not match the other database indicator value “IAIESAEIN”).
In such an example, the confidence level may also be 0.50. In an example embodiment with more instances of the minhash functions having distinct salt value input, the confidence level may be more precise.
Referring now to Block 220 of
In some embodiments, the one or more sets of database indicator values may be stored by the apparatus 10, such as in the memory 16. In some embodiments, the apparatus 10, such as the processing circuitry 12, may be configured to generate one or more of the database indicator values. In some embodiments, the generation of the database indicator values may be performed before entry of the search query, while in other embodiments, the database indicator values may be generated in response to entry of the search query. While the database indicator values may be generated by the apparatus 10 as described above, in some embodiments, the database indicator values may, instead, be generated externally to the apparatus 10, such as by the index server(s) 310 shown in
In some embodiments, each search indicator value may be compared by the processing circuitry 12 to a set of database indicator values that also were digests generated by the minhash function with the same salt value inputs. In some embodiments, the comparison may result in one or more matches of a search indicator value with a database indicator value from the set of database indicator values, such as in an instance in which the search indicator value and the database indicator value are the same. In some embodiments, the matching of the search indicator value with a database indicator value indicates that the search indicator value may be the same as a portion, but not all, of the database indicator value. For example, in an instance the one or more characters of a search query (e.g., “food near”) match fewer than all of the words of a multiword database value (e.g., “food near me”), the search indicator value may match a portion of the database indicator value. In various embodiments, the search indicator value may be matched with more than one of the database indicator values within a set of database indicator values.
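The comparison of a search indicator value against the set of database indicator values generated with the same salt value input may be sketched as follows. The function name match_indicator is hypothetical, and treating "a portion" of a database indicator value as a leading portion (a prefix) is an assumption made for this sketch. The indicator values in the usage lines are the example values given herein for salt values of (0, 0), and the shortened indicator "IAIES" is made up for illustration.

```python
from typing import Dict, List


def match_indicator(
    search_indicator: str,
    index: Dict[str, List[str]],
    allow_partial: bool = True,
) -> List[str]:
    """Return database values whose indicator equals the search indicator or,
    when allow_partial is set, begins with it (assumed meaning of 'a portion')."""
    if not search_indicator:
        return []
    matches: List[str] = []
    for db_indicator, db_values in index.items():
        exact = db_indicator == search_indicator
        partial = allow_partial and db_indicator.startswith(search_indicator)
        if exact or partial:
            matches.extend(db_values)
    return matches


if __name__ == "__main__":
    # Index for the minhash function with salt values (0, 0).
    index_00 = {"IAIESAEIN": ["INVALIDENSTRASSE BERLIN"]}
    print(match_indicator("IAIESAEIN", index_00))  # exact match
    print(match_indicator("IVAESAEIN", index_00))  # no match
    print(match_indicator("IAIES", index_00))      # partial (prefix) match
```

A linear scan is used only for brevity; a sorted structure or trie over the indicator values could serve the same purpose in a larger index.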
In an example embodiment, in an instance in which an entire search query is entered with the correct spelling, the comparison may only match the search indicator value with the database indicator value corresponding to the database value for the correctly spelled search query. Alternatively, in an instance in which the search query is incomplete and/or misspelled, the search indicator value may be matched with no database indicator values, the correct database indicator value corresponding to the search query if the search query were complete and spelled correctly, an incorrect database indicator value corresponding to a search query different than the complete and correctly spelled version of the search query, or multiple database indicator values, correct and/or otherwise. As discussed above in reference to an example, the more search indicator values that match with a database indicator value corresponding with the same database value, the higher the likelihood the database value is the intended search query. For example, as discussed above, in an instance in which the search query is spelled incorrectly by only a few letters (e.g., reversing two letters), then the search indicator values may match with some, but not all, of the database indicator values corresponding to the database value (e.g., intended search query). In various embodiments, the number of minhash functions with distinct salt value inputs that are used may be based on a desired level of precision, available computing power, desired computing power, and/or the like.
Referring now to Block 230 of
In an example embodiment, the confidence level that the one or more characters of a search query correspond to a database value may range from 0 to 1, with 0 representing no confidence in the match and 1 representing complete confidence in the match. In various embodiments, the accuracy of the confidence level may be based on the number of individual sets of database indicator values used to determine the confidence level, with the analysis of more sets of database indicator values resulting in increased accuracy in the confidence level and, conversely, the analysis of fewer sets of database indicator values resulting in decreased accuracy in the confidence level. In an example embodiment, based on the comparison of a search indicator value with a set of database indicator values, the apparatus 10 includes means, such as the processing circuitry 12, for determining whether one or more database indicator values match a portion or all of the search indicator value. In some embodiments, the confidence level may be based on the number of times a database value is matched with a search indicator value divided by the number of sets of database indicator values used in the determination.
For example, in an instance in which there are three sets of database indicator values, if the one or more characters of the search query represent the entire, correctly spelled intended search query (e.g., “food near me”), then the comparison with the three sets of database indicator values may result in a database indicator value from each of the three sets matching the corresponding search indicator value, resulting in a confidence level of 1. However, in an example in which the search query is incomplete and/or misspelled (e.g., “food near”), database indicator values from fewer than all of the sets of database indicator values (e.g., two out of three sets) may match the corresponding search indicator value, resulting in a lower confidence level (e.g., a confidence level of 0.667 for matching with a database value in two out of three sets of database indicator values).
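The confidence computation described above may be sketched as follows. The sketch is illustrative only: the function name confidence_levels, the representation of each set of database indicator values as a dictionary, and the small integer indicator values are assumptions made for readability and are not taken from the present disclosure.

```python
from collections import Counter


def confidence_levels(search_values, db_sets):
    """Return a confidence level in [0, 1] for every matched database value.

    search_values[i] is compared against db_sets[i], where each set is modeled
    (as an assumption) as a dict mapping a database indicator value to the
    database values that produced it.
    """
    if len(search_values) != len(db_sets):
        raise ValueError("expected one search indicator value per set")
    matches = Counter()
    for value, index in zip(search_values, db_sets):
        for db_value in index.get(value, ()):
            matches[db_value] += 1
    # Confidence = number of sets in which the database value matched,
    # divided by the number of sets used in the determination.
    return {db_value: count / len(db_sets) for db_value, count in matches.items()}


if __name__ == "__main__":
    # Three hypothetical sets of database indicator values.
    db_sets = [
        {11: {"food near me"}},
        {22: {"food near me", "food near the mall"}},
        {33: {"food near me"}},
    ]
    print(confidence_levels([11, 22, 33], db_sets))  # complete, correct query
    print(confidence_levels([11, 22, 99], db_sets))  # incomplete or misspelled query
```

With these hypothetical inputs, the first call yields a confidence level of 1 for “food near me”, and the second yields two out of three, or approximately 0.667, consistent with the example above.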
Additionally, a search query may result in matches with multiple database values, such as database values for other, similar search queries, and therefore a confidence level may be created by the processing circuitry 12 for each database value matched. For example, in an instance in which the search query is incomplete (e.g., “food near”), the comparison of the search indicator value may also result in matches with database indicator values representative of other search queries (e.g., “food near the mall”). In some embodiments, the apparatus 10, such as the processing circuitry 12, may be configured to determine a confidence level for each different match, such as for each database value that is determined to potentially match the search query. In some embodiments, the multiple confidence levels may be ranked and/or compared by the processing circuitry 12 to determine the most likely candidate(s) for the intended search query.
Referring now to Block 240 of
In various embodiments, based on the confidence level of one or more database values, the apparatus 10 may include means, such as the processing circuitry 12, for conducting a search (e.g., via a search engine). In some embodiments, the apparatus may receive, such as by the user interface 22, a selection of one or more database values from among the one or more database values presented via the user interface 22. For example, the apparatus 10, such as the processing circuitry 12, may receive a selection of one or more of the database values with the top five confidence levels presented via the user interface 22, and subsequently search for the presence of the selected database values in a repository, such as within a database, within online resources, etc. In some embodiments, the apparatus 10, such as the processing circuitry 12, may provide the user interface with at least one of a suggested correction in spelling, a corrected search query, or a suggested completion to the search query and may, in turn, conduct the search based on user input accepting or rejecting the suggested correction in spelling, the corrected search query, or the suggested completion to the search query.
In some embodiments, the apparatus 10, such as the processing circuitry 12, may use the confidence level of one or more database values to provide a suggested correction and/or a suggestion for an auto fill. For example, in an instance in which a user misspells the intended search query, the apparatus 10, such as the processing circuitry 12, may provide a suggestion to correct the search query to the intended search query based on the matching database value(s) with the greatest confidence levels. In some embodiments, the operations may be completed during or in conjunction with the input of the search query, and the confidence level of the database value may be used to predict the intended search query.
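The selection of suggested corrections or completions based on confidence levels may be sketched as follows. This is an illustrative sketch only; the function name top_suggestions, the default of five suggestions, the minimum-confidence threshold, and the alphabetical tie-breaking rule are assumptions that are not specified in the present disclosure.

```python
def top_suggestions(confidences, k=5, threshold=0.0):
    """Return up to k database values ordered by descending confidence level.

    confidences maps each matched database value to its confidence level.
    Values at or below the (hypothetical) threshold are not suggested, and
    ties are broken alphabetically so that the output is deterministic.
    """
    ranked = sorted(confidences.items(), key=lambda item: (-item[1], item[0]))
    return [db_value for db_value, level in ranked if level > threshold][:k]


if __name__ == "__main__":
    confidences = {
        "food near me": 2 / 3,
        "food near the mall": 1 / 3,
        "foot wear": 0.0,
    }
    # The highest-ranked value could be offered as a correction or auto fill.
    print(top_suggestions(confidences, k=2))
```

The resulting list could be presented via the user interface 22 so that the search is conducted on whichever suggestion, if any, the user accepts.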
Although the foregoing fuzzy search process may be conducted by the apparatus 10 without communication with a network or other network resources, such as in an offline manner, the apparatus 10 of an example embodiment may be employed in a network configuration utilizing various network resources, such as in conjunction with a client server architecture. In this regard,
In some embodiments, the apparatus 10 includes means, such as the processing circuitry 12, the processor 14, or the like, for monitoring the databases during operation. In some embodiments, in an instance in which the search indicator value does not match any database indicator values in the database, the search indicator value and the corresponding one or more characters of the search query may be added to the database. In some embodiments, the apparatus 10 includes means, such as the processing circuitry 12, the processor 14, or the like, for removing database values and corresponding database indicator values from one or more databases. For example, in an instance in which a database value is to be removed from a database, the apparatus 10 may input the database value into the corresponding minhash function for each set of database indicator values in order to identify the database indicator value to be removed from that set. In some embodiments, for offline operations as discussed in reference to
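The addition and removal of database values discussed above may be sketched as follows. The sketch is hypothetical: the class name FuzzyIndex, the trigram-based salted_minhash helper, the SHA-256 digest, and the dictionary layout of each set are assumptions introduced for illustration and are not taken from the present disclosure.

```python
import hashlib


def salted_minhash(text, salt, n=3):
    """Smallest salted SHA-256 hex digest over the text's character n-grams (assumed scheme)."""
    grams = {text[i:i + n] for i in range(max(1, len(text) - n + 1))}
    return min(
        hashlib.sha256((salt + "|" + gram).encode("utf-8")).hexdigest()
        for gram in grams
    )


class FuzzyIndex:
    """Hypothetical container holding one set of database indicator values per salt."""

    def __init__(self, salts):
        self.salts = list(salts)
        # Each set maps a database indicator value to the database values behind it.
        self.sets = [dict() for _ in self.salts]

    def add(self, db_value):
        """Insert the database value into every set under its indicator value."""
        for salt, index in zip(self.salts, self.sets):
            index.setdefault(salted_minhash(db_value, salt), set()).add(db_value)

    def remove(self, db_value):
        """Recompute each indicator value to locate and delete the database value."""
        for salt, index in zip(self.salts, self.sets):
            key = salted_minhash(db_value, salt)
            bucket = index.get(key)
            if bucket is None:
                continue
            bucket.discard(db_value)
            if not bucket:
                del index[key]  # drop indicator values that no longer map to anything


if __name__ == "__main__":
    index = FuzzyIndex(["salt-a", "salt-b", "salt-c"])
    index.add("food near me")
    index.remove("food near me")
    print(index.sets)
```

Recomputing the indicator value at removal time avoids having to store a reverse mapping from each database value to its indicator values, at the cost of one hash computation per set.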
Referring now to
Referring now to Block 400 of
Referring now to Block 410 of
In some embodiments, the client device 305, such as the processing circuitry 12, the communication interface 20 or the like, may cause the transmission of the request for one or more salt value inputs to one or more of the index server(s) 310 through the network 300. In some embodiments, the client device 305 may request the salt value input from one or more of the index server(s) 310 in advance of the operations discussed herein (e.g., during an initial setup period). Additionally or alternatively, the client device 305 may request the salt value inputs from one or more of the index server(s) 310 after receiving one or more characters of a search query. In some embodiments, the client device 305 may receive one or more salt value inputs without requesting the salt value input. For example, the index server(s) 310 may transmit the corresponding salt value inputs upon connecting to the network 300 and, subsequently, to a client device 305 when that client device connects to the network 300. In some embodiments, the index server(s) 310 may provide salt value inputs to connected client devices 305 at regular intervals. The salt value input(s) provided by an index server generally represent the salt value input(s) utilized to generate the set(s) of database indicator values stored by the index server. In various embodiments, the client device 305 may receive a definition of a minhash function that corresponds to the set of database indicator values held by a given index server 310 (e.g., in place of a salt value input).
Referring now to Block 420 of
Referring now to Block 430 of
Referring now to Block 440 of
Referring now to Block 450 of
Referring now to Block 460 of
Referring now to
Referring now to Block 500 of
Referring now to Block 510 of
Referring now to Block 520 of
Referring now to Block 530 of
In some embodiments, each of the one or more sets of database indicator values may be stored on an individual index server 310. In some embodiments, each index server 310 may be configured to generate one or more of the database indicator values. In some embodiments, the generation of the database indicator values may be performed before entry of the search query, while in other embodiments, the database indicator values may be generated in response to entry of the search query.
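An arrangement in which each set of database indicator values is held by a separate index server may be sketched as follows. The sketch runs entirely in memory and is illustrative only: the IndexServer class is a hypothetical stand-in for an index server 310, no actual network communication is performed, and the trigram-based salted_minhash helper, the SHA-256 digest, and the thread pool are assumptions not taken from the present disclosure.

```python
import hashlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def salted_minhash(text, salt, n=3):
    """Smallest salted SHA-256 hex digest over the text's character n-grams (assumed scheme)."""
    grams = {text[i:i + n] for i in range(max(1, len(text) - n + 1))}
    return min(
        hashlib.sha256((salt + "|" + gram).encode("utf-8")).hexdigest()
        for gram in grams
    )


class IndexServer:
    """Hypothetical in-memory stand-in for one index server holding one set."""

    def __init__(self, salt, db_values):
        self.salt = salt  # the salt value input a client would obtain from this server
        self.index = {}
        # Here the database indicator values are generated before any query is entered.
        for db_value in db_values:
            self.index.setdefault(salted_minhash(db_value, salt), set()).add(db_value)

    def lookup(self, indicator):
        """Return the database values whose indicator value equals the given one."""
        return set(self.index.get(indicator, ()))


def fuzzy_search(query, servers):
    """Query every server in parallel and combine the matches into confidence levels."""
    def ask(server):
        return server.lookup(salted_minhash(query, server.salt))

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(ask, servers))
    matches = Counter(db_value for result in results for db_value in result)
    return {db_value: count / len(servers) for db_value, count in matches.items()}


if __name__ == "__main__":
    values = ["food near me", "shoe stores"]
    servers = [IndexServer(f"salt-{i}", values) for i in range(4)]
    print(fuzzy_search("food near me", servers))
```

Because each lookup depends only on a single server's salt and set, the lookups are independent of one another and may be issued concurrently, which is one way the scaling across multiple servers could be realized.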
Referring now to Block 540 of
In various embodiments, at least some example embodiments of the present disclosure allow for online and/or offline fuzzy term searching. In some embodiments discussed herein, the methods, apparatuses, and computer program products allow for efficient fuzzy searching with minimal time delays and near-real time updates, thereby providing improved performance and quality. Various embodiments of the present disclosure are designed for fault tolerant fuzzy text queries, with some embodiments allowing for scaling across multiple servers and other embodiments configured to be performed locally within a device. In some embodiments, the operations described herein allow search queries to be processed in parallel, allowing for near instantaneous determinations.
Accordingly, blocks of the flowcharts support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will also be understood that one or more blocks of the flowcharts, and combinations of blocks in the flowcharts, can be implemented by special purpose hardware-based computer systems which perform the specified functions, or combinations of special purpose hardware and computer instructions.
In some embodiments, certain ones of the operations above may be modified or further amplified. Furthermore, in some embodiments, additional optional operations may be included, some of which have been described above. Modifications, additions, or amplifications to the operations above may be performed in any order and in any combination.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Moreover, although the foregoing descriptions and the associated drawings describe example embodiments in the context of certain example combinations of elements and/or functions, it should be appreciated that different combinations of elements and/or functions may be provided by alternative embodiments without departing from the scope of the appended claims. In this regard, for example, different combinations of elements and/or functions than those explicitly described above are also contemplated as may be set forth in some of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.