One embodiment of the present invention is a computer-implemented method for adaptive construction of a search system for searching objects by their string key. The search system can include objects that reside in the object store 108, and the tree 102 that can be constructed using the objects with their string key structure.
The tree 102 can be based on a trie, or prefix tree, which is an ordered tree data structure used to index objects whose keys are strings, and which accommodates a specific search method. Given a partial string key as input, the trie facilitates retrieval of the possible next characters. A description of tries is given in Donald Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition, Addison-Wesley, 1997, ISBN 0-201-89685-0, Section 6.3: Digital Searching, pp. 492-512.
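As an illustration only, the next-character retrieval described above can be sketched with a plain, uncompressed trie. The names `TrieNode`, `build_trie` and `next_characters`, and the sample street names, are hypothetical and do not come from the embodiments.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> child node
        self.is_key = False  # True when a complete key ends at this node


def build_trie(keys):
    """Insert every key character by character into an uncompressed trie."""
    root = TrieNode()
    for key in keys:
        node = root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_key = True
    return root


def next_characters(root, partial_key):
    """Return the sorted characters that can validly follow partial_key."""
    node = root
    for ch in partial_key:
        node = node.children.get(ch)
        if node is None:
            return []  # no stored key starts with partial_key
    return sorted(node.children)


# Hypothetical street names used only for this sketch.
trie = build_trie(["MAIN ST", "MAPLE AVE", "MARKET ST", "OAK ST"])
print(next_characters(trie, "MA"))
```

A full trie of this kind stores one node per key character, which is the storage cost that the compression described in the following paragraphs is meant to reduce.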
The trie can be reconstructed to be adaptable to restrictive storage requirements through a series of steps that manipulate key prefixes and minimize the number of nodes and leaves. In one embodiment, most leaf nodes of the tree 102 can be associated with multiple objects in the object store, which can mean that only a portion of the full key is searched in the tree before obtaining a group of objects from the object store. For example, the tree 102 can be a variable-scale compression of a full trie.
Tree storage can be minimized based on given compression criteria. An adaptable search method can be used to retrieve objects via the search tree and said object store. The search can adapt to the tree structure resulting from compression and to a given user interface.
In one embodiment, the object store 108 models spatial objects in the real world that can be searched using a string key. The real world spatial objects can include cities, streets in the city, intersections, points of interest (POIs), or another type of object that can be associated with a string key.
In one embodiment, objects can be stored in the leaves of the trie. In another embodiment, to accommodate multiple search methods on the same set of objects, a separate object store can be constructed for a given type of object as either fixed length or variable length storage. Storage for variable length objects can be constructed with a fixed length object offset directory. Such an object store can accommodate search and scrolling of objects.
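One possible layout for variable length storage with a fixed length offset directory is sketched below. The class name `VariableLengthStore`, the 4-byte offset width, the UTF-8 encoding and the sentinel entry are all assumptions made for this sketch; the embodiments do not specify them.

```python
import struct


class VariableLengthStore:
    """Variable-length records plus a directory of fixed-width offsets."""

    ENTRY = struct.Struct("<I")  # assumed: one 4-byte little-endian offset per record

    def __init__(self, records):
        self.data = bytearray()       # concatenated variable-length records
        self.directory = bytearray()  # fixed-length offset entries
        for record in records:
            self.directory += self.ENTRY.pack(len(self.data))
            self.data += record.encode("utf-8")
        # Assumed sentinel entry: marks where the last record ends.
        self.directory += self.ENTRY.pack(len(self.data))

    def __len__(self):
        return len(self.directory) // self.ENTRY.size - 1

    def _offset(self, index):
        return self.ENTRY.unpack_from(self.directory, index * self.ENTRY.size)[0]

    def get(self, index):
        """Fetch record `index` with two directory lookups and one slice."""
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self.data[self._offset(index):self._offset(index + 1)].decode("utf-8")

    def scroll(self, first, count):
        """Return up to `count` consecutive records starting at `first`."""
        last = min(first + count, len(self))
        return [self.get(i) for i in range(first, last)]


store = VariableLengthStore(["OAK ST", "MAIN ST", "EL CAMINO REAL"])
print(store.get(1), store.scroll(1, 5))
```

Because every directory entry has the same width, the position of the entry for record `i` is computed directly, which is what makes both random access and scrolling inexpensive in this sketch.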
The object store entries can be determined by the unique key of each object, wherein the order of objects in the store can be determined by each object's sorting key. The object store can distinguish between components of the search key.
In one embodiment, once the object store is set up, the search tree 102 can be constructed such that the reference to each object in the store can be found in a tree leaf using the object's search key.
The search key structure of the tree can indicate spatial objects, such as streets, street intersections, Points of Interest (POIs) or other elements or attributes of objects. In one embodiment, the order in which components of the key for a given class of objects are concatenated indicates a specific search method for this class of objects. The order of concatenation of the key components can be a simple mechanism by which the system designer can rapidly prototype and experiment with various user interfaces embodied in a multitude of string key definitions for a given class of objects.
An API can be implemented and used in an application to aid in the construction of the search key and thus the search method, to allow designers to produce and experiment with a variety of user interfaces for the search of an object. In one embodiment, a GUI can be built using such an API to define and select one or more key structures for the search methods on a given class of objects, thereby imposing an appropriate order on the object store and trees. The API can give system designers the flexibility to change a search system's interface with the ease typically associated with RDBMS technology, without relying on a relational database management system, which may have impractical storage requirements for some environments.
Thus by selecting the string key components and the component order for a composition of the key structure, a designer can assess a variety of user interfaces and underlying search methods.
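The effect of the component order can be illustrated with a small sketch. The functions `make_key` and `build_store`, the `|` separator, the upper-casing and the sample points of interest are hypothetical choices made for the example, not part of the described API.

```python
def make_key(obj, component_order, separator="|"):
    """Concatenate the chosen key components in the chosen order."""
    return separator.join(obj[name].upper() for name in component_order)


def build_store(objects, component_order):
    """Order the object store by the key that the component order defines."""
    return sorted(objects, key=lambda obj: make_key(obj, component_order))


# Hypothetical points of interest used only for this sketch.
pois = [
    {"name": "Harbor Cafe", "city": "Oakland"},
    {"name": "City Bakery", "city": "San Jose"},
    {"name": "City Bakery", "city": "Fremont"},
]

by_name = build_store(pois, ["name", "city"])  # search by name, then city
by_city = build_store(pois, ["city", "name"])  # search by city, then name

print([make_key(p, ["name", "city"]) for p in by_name])
print([make_key(p, ["city", "name"]) for p in by_city])
```

Under these assumptions, swapping two entries in `component_order` is enough to obtain a differently ordered store, and therefore a different search tree and user interface, from the same set of objects.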
For a given user interface profile and chosen compression criteria, a score can be generated for search tree size, object store size, memory requirements, and best and worst case search performance. This can give the system designer a tool to balance various requirements by comparing scores of different implementations.
For example, the amount of compression can affect the performance of the system. High levels of compression can mean that more objects need to be obtained from the object store and analyzed. Low levels of compression can result in large storage requirements for the tree. The variable compression criteria, which regulate tree construction and can maximize the number of objects referenced by a leaf node, can be tuned to reasonably balance performance, memory and storage use for the ultimate application. In one embodiment, compression criteria regulate a minimum number of objects under any branch of the tree.
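The trade-off can be made concrete with a minimal sketch. It assumes one possible reading of the criterion: a range of sorted keys is split into branches only while it holds more than `min_objects` objects, and any smaller range becomes a single leaf that records the index of its first object and a count. The function names and the dictionary representation of nodes are hypothetical.

```python
def build(keys, lo, hi, depth, min_objects):
    """Build a node for sorted keys[lo:hi], which share their first `depth` characters."""
    count = hi - lo
    exhausted = all(len(keys[i]) <= depth for i in range(lo, hi))
    if count <= min_objects or exhausted:
        return {"leaf": (lo, count)}  # first object index and number of objects
    children = {}
    start = lo
    while start < hi:
        ch = keys[start][depth:depth + 1]  # "" when the key ends at this depth
        end = start
        while end < hi and keys[end][depth:depth + 1] == ch:
            end += 1
        children[ch] = build(keys, start, end, depth + 1, min_objects)
        start = end
    return {"children": children}


def count_nodes(node):
    if "leaf" in node:
        return 1
    return 1 + sum(count_nodes(child) for child in node["children"].values())


# Hypothetical street names; the sorted list stands in for the object store.
keys = sorted(["MAIN ST", "MAPLE AVE", "MAPLE CT", "MARKET ST", "OAK ST", "OAKLAND AVE"])

full = build(keys, 0, len(keys), 0, min_objects=1)        # no grouping of objects
compressed = build(keys, 0, len(keys), 0, min_objects=2)  # up to two objects per leaf

print(count_nodes(full), count_nodes(compressed))
```

With the larger threshold the tree has fewer nodes, but a search that ends at a leaf must then read and compare every object in that leaf's range, which is the performance cost described above.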
Looking again at
The leaf nodes can be distinguished as short leaf nodes or long leaf nodes. A short leaf can reference the first object in a contiguous list of objects, together with the number of objects referenced. A long leaf can reference an arbitrary list of objects by storing a count of references, and a direct reference for each object in the list. The search can include finding a leaf node 110 based on a search key and locating a set of matches among the objects referenced by the leaf node.
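The two leaf forms can be sketched as follows. The class names, the `make_leaf` helper that picks the compact form for contiguous references, and the prefix test used in `matches` are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShortLeaf:
    first: int  # index of the first object in a contiguous run
    count: int  # number of objects in the run

    def references(self):
        return list(range(self.first, self.first + self.count))


@dataclass
class LongLeaf:
    refs: List[int]  # one direct reference per object; the count is len(refs)

    def references(self):
        return list(self.refs)


def make_leaf(refs):
    """Use the compact short form when refs form a contiguous ascending run."""
    if refs and refs == list(range(refs[0], refs[0] + len(refs))):
        return ShortLeaf(refs[0], len(refs))
    return LongLeaf(list(refs))


def matches(leaf, store, search_key):
    """Check the objects referenced by the leaf against the search key."""
    return [store[i] for i in leaf.references() if store[i].startswith(search_key)]


# Hypothetical object store, ordered by key.
store = ["MAIN ST", "MAPLE AVE", "MAPLE CT", "MARKET ST"]
leaf = make_leaf([1, 2, 3])
print(leaf, matches(leaf, store, "MAPLE"))
```

In this sketch a short leaf occupies two integers regardless of how many objects it covers, while a long leaf grows with its reference list.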
In one embodiment, a user inputs a search string character by character and the application searches the tree 102 to indicate a set of valid next input characters, until the search string is complete or the user requests a set of objects that match a partial key. The application can provide a display indicating the valid next characters, or otherwise output the valid next characters. In another embodiment, a user inputs an entire or a partial search string. The tree that supports such searches can store the key prefix string at each tree node, with the shortest at the root and the full search key at the leaf. In one embodiment of the present invention, the search tree is compressed by reducing the node's key prefix to store only its own extension of the parent's prefix, such that the actual key prefix of a node is obtained by concatenating all key prefix strings on the path from the root to this node with the node's stored key prefix. In one embodiment of the adaptive index, the search tree is further compressed by collapsing nodes with a single child.
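The two compression steps named above, storing only each node's own extension of its parent's prefix and collapsing nodes with a single child, can be sketched as follows. The `Node` class, the function names and the sample keys are hypothetical; the sketch covers only the case where the search key is a plain string.

```python
class Node:
    def __init__(self, ext=""):
        self.ext = ext       # only this node's extension of the parent's prefix
        self.children = []   # child nodes in key order
        self.is_key = False  # True when a complete key ends at this node


def build_full(keys):
    """Build an uncompressed trie in which every extension is one character."""
    root = Node()
    for key in sorted(keys):
        node = root
        for ch in key:
            child = next((c for c in node.children if c.ext == ch), None)
            if child is None:
                child = Node(ch)
                node.children.append(child)
            node = child
        node.is_key = True
    return root


def collapse(node):
    """Merge each non-key, single-child node into its child (the root is kept)."""
    node.children = [collapse(child) for child in node.children]
    if node.ext and not node.is_key and len(node.children) == 1:
        child = node.children[0]
        child.ext = node.ext + child.ext
        return child
    return node


def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.children)


def next_characters(root, partial):
    """Valid next characters after `partial`, walking multi-character extensions."""
    node, rest = root, partial
    while rest:
        for child in node.children:
            common = min(len(child.ext), len(rest))
            if child.ext[:common] == rest[:common]:
                if len(rest) < len(child.ext):
                    return [child.ext[len(rest)]]  # input ends inside this extension
                node, rest = child, rest[len(child.ext):]
                break
        else:
            return []  # no key starts with `partial`
    return sorted(child.ext[0] for child in node.children)


KEYS = ["MAIN ST", "MAPLE AVE", "MARKET ST", "OAK ST"]  # hypothetical keys
full = build_full(KEYS)
full_nodes = count_nodes(full)
compressed = collapse(full)  # collapses the tree in place and returns the root
print(full_nodes, count_nodes(compressed), next_characters(compressed, "MA"))
```

The actual key prefix of any node in the collapsed tree is recovered by concatenating the `ext` strings on the path from the root, so no prefix text is stored twice.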
One embodiment of the present invention is a system comprising an application 104 with a map display 106 and a search system including a tree 102 and the object store 108. The tree 102 can be constructed with nodes associated with a key structure. The tree 102 can be compressed by reducing each node's prefix. The tree 102 can include leaf nodes that store objects, when the class of objects is intended to be accessed via a single search method. The tree 102 can include leaf nodes that contain references to objects in the object store, when that class of objects is intended to be accessed via more than one search method. The tree 102 can include leaf nodes that reference multiple objects in an object store 108, as the result of variable compression. The search can include searching to find a leaf node 110 based on a search key and checking the objects indicated by the leaf node.
The system 100 can have a user interface 110 that can receive input from the user and produce an output. One exemplary output can be a next character indication that shows the valid next characters. The set of available next characters can be determined from a search of the tree 102 and/or object store 108 as discussed below.
In both cases, object data can be obtained corresponding to the number of counts. For example, if the count is 50, the next 50 objects can be obtained from the object store and analyzed as appropriate. Short leaf nodes reduce storage requirements for the tree. This can be valuable for mobile geographic applications implemented on resource-constrained systems.
In one embodiment, consecutive objects can be stored in a short leaf as shown in
In one embodiment the leaf nodes need not have associated key information. This can mean that the leaf nodes will have the same key prefix as their parent node. This can allow the objects or object references to be easily combined into leaf nodes for most efficient packing.
In one embodiment, the objects referenced by the leaf node can be accessed then analyzed to determine the next character and to implement scrolling. The tree can have the leaf nodes at different levels of the tree.
One embodiment of the present invention is a computer-implemented method of constructing a tree, comprising determining a list of keys following a key structure, constructing a full tree structure, and then pruning it by combining nodes such that most leaf nodes are associated with multiple objects.
Compression techniques can include maximizing leaf node references to objects, based on given criteria, to minimize the storage overhead required for each node.
In step 504, a list of keys for the objects based on the key structure can be determined. The key structure can also determine the order of the objects in the object store.
In step 506, a full node structure can be created based on the list of keys. This full node structure can be compressed as shown in steps 508, 510 and 512 to reduce the size of the tree by reducing the number of nodes and leaves. Exemplary steps are also shown in
In
The above example shows the steps as distinct. In another embodiment, the compression steps can be combined into a single step producing the same results.
In one embodiment, tree nodes can store indications of other search criteria. A search or other operation on the tree can use the indications to determine whether the node and its offspring nodes or associated objects need to be further analyzed. In one embodiment, the indications can be used to implement an n-dimensional search.
In one embodiment, the searches can be filtered by object attributes such as a category or a city. The indications can include indications of object categories that are not found among the node's offspring and/or indications of object categories that are included in at least one of its offspring. Similarly, if indications include city id, the searches can be filtered by a city. In one embodiment, a user can search a point of interest by name, refined by a specified object category, and further refined by the name of the city where it resides.
For example, if the presence or absence of points of interest categories is indicated on a tree node, a character search for a point of interest could eliminate from the search path the nodes that exclude a category, such as fast food.
In one embodiment, the nodes can store category exclusion or inclusion information to simplify and speed up a search for a specific category. The exclusion information can indicate that no object associated with the node is in the category. The inclusion information can indicate that there is an object associated with the node in the category.
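A minimal sketch of such node-level indications follows, assuming that each node keeps a bitmask of the categories present among its offspring, so that a cleared bit acts as the exclusion information. The category names, bit assignments, class and function names, and sample points of interest are hypothetical.

```python
CATEGORY_BITS = {"fast food": 1, "fuel": 2, "lodging": 4}  # hypothetical categories


class Node:
    def __init__(self):
        self.children = {}  # next character -> child node
        self.objects = []   # (name, category) pairs whose key ends here
        self.included = 0   # OR of the category bits found in this subtree


def insert(root, name, category):
    bit = CATEGORY_BITS[category]
    node = root
    node.included |= bit
    for ch in name:
        node = node.children.setdefault(ch, Node())
        node.included |= bit  # inclusion information on every node of the path
    node.objects.append((name, category))


def search(root, prefix, category):
    """Names under `prefix` in `category`, skipping subtrees that exclude it."""
    bit = CATEGORY_BITS[category]
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    results = []

    def visit(current):
        if not current.included & bit:
            return  # exclusion: no object below this node is in the category
        results.extend(name for name, cat in current.objects if cat == category)
        for ch in sorted(current.children):
            visit(current.children[ch])

    visit(node)
    return results


root = Node()
for poi in [("BURGER BARN", "fast food"), ("BUDGET INN", "lodging"),
            ("BUS STOP FUEL", "fuel"), ("BURRITO HUT", "fast food")]:
    insert(root, *poi)

print(search(root, "BU", "fast food"))
```

In this sketch the search for fast food under "BU" never descends into the "BUD" or "BUS" subtrees, because their bitmasks show that the category is absent there.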
The tree of
In one example, the user interface can include checkboxes or the like to receive user input for additional search criteria indicated on the tree, for example object categories. The search can use the category information to determine which nodes to examine in the search. In the example of
The search criteria can be a code associated with certain nodes to indicate the categories not found among the node's offspring or the like. The objects in the object store can also have associated category information so the two dimensional search can involve both the nodes of the tree and the objects in the object store.
The API used to select the key structure can be used to add additional search criteria to the tree to enable the multi-dimensional search.
As described in the co-pending U.S. Patent Application, NEAREST SEARCH ON ADAPTIVE INDEX WITH VARIABLE COMPRESSION, Ser. No. 60/806,367, (corresponding to attorney docket number TELA-07781US0), the search system can be used to do a nearest search to a specific position.
One embodiment may be implemented using a conventional general purpose or a specialized digital computer or microprocessor(s) programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art. Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The invention may also be implemented by the preparation of integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
One embodiment includes a computer program product which is a storage medium (media) having instructions stored thereon/in which can be used to program a computer to perform any of the features presented herein. The storage medium can include, but is not limited to, any type of disk including floppy disks, optical discs, DVDs, CD-ROMs, micro drives, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, flash memory, or any type of media or device suitable for storing instructions and/or data. Stored on any one of the computer readable medium (media), the present invention includes software for controlling both the hardware of the general purpose/specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human user or other mechanism utilizing the results of the present invention. Such software may include, but is not limited to, device drivers, operating systems, execution environments/containers, and user applications.
The foregoing description of preferred embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations will be apparent to one of ordinary skill in the relevant arts. For example, steps performed in the embodiments of the invention disclosed can be performed in alternate orders, certain steps can be omitted, and additional steps can be added. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various embodiments and with various modifications that are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.
This application claims priority from the following co-pending applications, which are hereby incorporated in their entirety: U.S. Provisional Application No. 60/806,366 entitled: “ADAPTIVE INDEX WITH VARIABLE COMPRESSION”, by Tsia Kuznetsov, et al., filed Jun. 30, 2006, (Attorney Docket No. TELA-07780US0) and U.S. Provisional Application No. 60/806,367 entitled: “NEAREST SEARCH ON ADAPTIVE INDEX WITH VARIABLE COMPRESSION”, by Tsia Kuznetsov, filed Jun. 30, 2006, (Attorney Docket No. TELA-07781US0).
Number | Date | Country
---|---|---
60806366 | Jun 2006 | US
60806367 | Jun 2006 | US