The invention relates generally to computer systems and networks, and more particularly to locating information.
There are many types of computing resources (e.g., services, devices, data and so forth) that computer client users and client applications (or simply clients) need to manage and otherwise access, such as services and data maintained on corporate networks and other remotely accessible sites, including intranets and the internet. As there are many different computing platforms, various platform-independent mechanisms and protocols that facilitate the exchange of network information are becoming commonplace, including HTTP (HyperText Transfer Protocol), XML (extensible Markup Language), XML Schema, and SOAP (Simple Object Access Protocol). The concept of Web services, in which businesses, organizations, and other providers offer services to clients, is based on these standards. Web services are services that connect applications across an intranet, extranet, or across the Internet, so that these applications can share resources and information. Web services can be offered by any individual or organization that has the tools to create them and make them available to other individuals or organizations online.
To be of value, web services need to enable such clients to locate them, and exchange the information needed to execute them. To this end, UDDI (Universal Description Discovery & Integration) provides a set of defined services (e.g., in a UDDI Business Registry) that help clients discover such businesses, organizations, and other Web services providers, along with a description of their available web services and the technical interfaces needed to access those services. UDDI thus facilitates the connection between the providers and the consumers of Web services. Although such services may be provided over the internet, services also may be provided in an enterprise environment or other intranet, where the services and their usage may be more controlled. Thus, not just UDDI, but other service registries (such as one based on Microsoft Corporation's Active Directory®) may provide a way of locating a distributed service or other information.
Regardless of the service registry and the type of information and/or service being sought, taxonomies such as those available as categorization schemes within UDDI may be used to group sets of related values for use typically to categorize entities such as Web services or Web service providers. These values make up the “nodes” of a taxonomy. The nodes typically offer a hierarchical breakdown of a domain (such as the series of hierarchically arranged nodes in a geographic taxonomy path “World/Europe/UK/Scotland”). Taxonomies may also cover domains where there is no established hierarchy, such as by placing all nodes as peers at the top, or root level.
Using taxonomy-related services such as UDDI Services, clients are able to query for entities or other data that are categorized by a value in a given taxonomy. However, because the client is only able to query by value, the client needs to have a certain level of a-priori knowledge of the taxonomy to widen the scope of the search when needed. For example, consider a taxonomy that categorizes the states in the United States of America. A client can query for entities that are categorized with the value ‘California’, but in order to query for nearby entities, the client would have to know that ‘Nevada’ and ‘Oregon’ are nearby states so that these values could be used in further queries as desired. The service that categorized the entities with such a taxonomy likely would know this information, but the client querying the service often will not.
As a result, a client is required to know something about the taxonomy, otherwise the client is unable to widen the search scope to locate possibly valuable information. While this may appear simple with well-known entities such as states, in actuality there are many kinds of taxonomies having various kinds of data, and many varieties of clients. Requiring clients to have pre-existing knowledge of these many, possibly diverse taxonomies is an unreasonable requirement, and in general is a deficiency of taxonomy searching technologies. Consequently, finding relevant categorized results can be difficult. What is needed is a better way for clients that do not have prior knowledge of a taxonomy's structure to locate related data within that taxonomy.
Briefly, the present invention provides a system and method for use with a taxonomy-based search service that locates expanded information based on a query the client has proposed, by performing an automatically expanded branching of the query based on the genealogical branch specified by the client. The system and method thereby provide wider and valuable search results while isolating the user from the contextual knowledge of a given taxonomy.
In general, the taxonomy branch matching technology of the present invention allows the client to specify a starting point in the taxonomy, referred to as an origin node, along with data specifying one or more genealogical directions to expand the search. Thereafter the client automatically receives a result set that contains expanded results from nodes in the taxonomy that are related to the origin node, (as well as possibly including any relevant data corresponding to the origin node).
Genealogical directions for which a search may be automatically expanded in a taxonomy include ancestors (higher parent nodes) of a specified origin node, descendants (lower children nodes) of the origin node, and/or sibling nodes of the origin node. A family search may be specified that expands the search to return ancestors, descendants and siblings. When specifying ancestors (or family) as an expansion direction, a variable may be used to specify how many generations of parents the upward branch should be included in the expanded search. Similarly, when specifying descendants (or family) as an expansion direction, another variable may be used to specify how many generations of children downward should be included in the expanded search.
In general, the present invention may be implemented in a middle tier between a client and a taxonomy service that returns search results from a database. Note that the middle tier may be incorporated into the service. The middle tier recognizes requests for an expanded search, and invokes expansion logic to expand the search in the requested direction or directions relative to the origin node, by converting the expansion request into requests that the service understands, e.g., UDDI find requests in a UDDI environment.
Other advantages will become apparent from the following detailed description when taken in conjunction with the drawings, in which:
Exemplary Operating Environment
The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to: personal computers, server computers, hand-held or laptop devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in local and/or remote computer storage media including memory storage devices.
With reference to
The computer 110 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the computer 110 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can accessed by the computer 110. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of the any of the above should also be included within the scope of computer-readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation,
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media, discussed above and illustrated in
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in
When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
Taxonomy Branch Matching
The present invention is in part, generally directed towards distributed network resources (e.g., services and devices), in which a client running on essentially any platform may use a defined protocol such as SOAP (Simple Object Access Protocol) to access network resources over UDDI. However, the present invention is not limited to UDDI, but applies to any technology that handles requests related to value set information that may be arranged as nodes in a taxonomy. Thus, while the present invention will primarily be described with reference to UDDI and UDDI services, it is understood that the present invention may apply to locating related information in general. Further, while the present invention will be primarily described with respect to SOAP, XML, UDDI, and/or Windows®/.NET, it is understood that the present invention is not limited to any particular implementation, protocols, components, APIs, and so forth, but also encompasses similar mechanisms for interacting over a network. Thus, although the examples herein are related to the UDDI standards, it is understood that the actual invention may be abstracted to provide generic capabilities for performing searches based on hierarchical categorizations on alternate systems.
As generally represented in
In
In general, a search begins relative to a starting point (node) in the taxonomy referred to as an origin, identified by a starting keyed reference. In
In accordance with an aspect of the present invention, the present invention facilitates a way to expand a query to include data corresponding to one or more nodes having genealogical relationships (ancestors, descendants and/or siblings) relative to an origin node. Thus, a client that knows a value to provide in a query is able to receive a result set that can return expanded data that is genealogically close along the branch specified to that specifically requested value. For example, a client seeking a business in a particular area might know one nearby city, but does not want the search limited only to that city. Similarly, having knowledge of a nearby street will not find a location that is nearby but not exactly on that street, unless the search is expanded beyond that exact value. For practical reasons, most taxonomies cannot include such expanded data within each node that may be specified. Instead, the present invention expands queries, so as to obtain results that are not limited to a specified node in a taxonomy, but rather by locating related nodes on one or more appropriate branch or branches in that taxonomy.
To this end, the method and system of the present invention provide a feature, such as implemented in a middle tier between the client and a server that provides the taxonomy access, which allows the user to express expanded (branched out) requests. In general, and as described below, the middle tier includes expansion logic that can convert a request having certain information, e.g., within UDDI keyedReferences and keyedReferenceGroups, into a set of requests that the taxonomy service can understand. Note that it is alternatively feasible to extend existing schemas, however using keyedReferences and keyedReferenceGroups enables support of clients from multiple platforms.
The following is an XML-based example of how this feature may be expressed in an example UDDI find message:
In UDDI-based environment, this feature will be modeled as a tModel, wherein the tModel key used is ‘microsoft:uddi.genealogicalExpansion’ in this example implementation.
To recognize a request seeking expanded data in this example implementation, the presence of a keyedReferenceGroup that specifies this ‘genealogicalExpansion’ tModel key is detected at the middle tier, and in essence, tells the middle tier that this particular keyedReferenceGroup should be processed as a request for expanded data. Other ways of determining that a search should be expanded are feasible, such as scanning the request for any tModel key related to genealogical expansion (e.g., in the example request above, three keyed references are present that each specify expansion criteria via their tModel keys, and one or more these may be detected). Requests without this information may be passed to the service directly, or otherwise processed at the middle tier, e.g., to provide other non-conventional request handling services. Note that in an alternative implementation, expanded searches may be the default operation, e.g., expanded searching will occur unless an indicator or the like is used to turn off expanded searching to get an exact search.
The following XML element is what identifies this message as requiring genealogical expansion:
In the above example, the presence of the “microsoft:uddi.genealogicalExpansion” key on the keyedReferenceGroup element identifies this UDDI find message as one that requires genealogical expansion. This message still adheres to the UDDI specification, so if an implementation did not support genealogical expansion it could still process this UDDI find message; abeit with different results.
In this example, the genealogical expansion will be done for 1 level of descendants and 2 levels of ancestors, starting from the origin specified as ‘Seattle’.
is translated into a search that could be implemented in SQL as follows:
The results of this SQL would be returned as keyedReferences; suppose that the following results are returned.
is translated into a search that could be implemented in SQL as follows:
This SQL is actually run recursively, depending on the number of levels specified; in this example, the SQL would recurse two times in order to obtain two levels of ancestors.
The results of this SQL would be returned as keyedReferences; suppose that the following results are returned:
The results of the descendant and ascendant searches replace the genealogical keyedReference group. Thus, the find message after genealogical expansion looks as follows:
This find message is now a ‘regular’ UDDI find message and can be processed as any other UDDI find message. This example implementation allows a UDDI implementation to be retro fitted with genealogical expansion with a minimal amount of effort.
Thus, once the expansion logic 404 receives the keyedReferenceGroup, the keyedReferenceGroup is converted to the set of keyedReferences and provided to the service as a message. From the perspective of the service, this resulting message will look as though the client had individually specified the keyed references for each related node in an appropriate expansion direction, whereby the message can be processed like any other find message. Note that this feature leverages the current find infrastructure by expanding the keyed references into regular find message. This allows the middle tier to reuse other aspects of the infrastructure, including current querying and sorting implementations. In other words, this type of search is not treated any differently from a regular find request from the perspective of the service.
Returning to the example XML request, the next set of XML specifies the keyedReference at which the expansion is to start, that is, the origin parameter is modeled as a tModel:
This feature requires that this keyedReference be present within the keyedReferenceGroup.
The following comprises example code that instructs the expansion logic on how (i.e., what direction or directions to follow) to expand the query:
Valid values for keyvalue parameters corresponding to an expansion direction include family as shown above, parent (e.g., <keyedReference keyvalue=“parent” . . . />), children (e.g., <keyedReference keyvalue=“children” . . . />, and siblings, e.g., <keyedReference keyvalue=“siblings” . . . />. As is understood, other terms are equivalent (e.g., ancestors, descendants, up, down, sideways and so forth could be used). Specifying “Family” returns ancestors, descendants and siblings. Note that other genealogical relationships are possible, such as compound relationships (e.g., expand the search up to a parent and all the parent's siblings).
As represented in the above XML example, when expanding a search to branch upward in the hierarchy, e.g., by specifying parent or family, a maxAncestors value may be specified that indicates how far upward to expand the search, e.g., a maximum parameter value of one requests the immediate parent, a value of two requests the immediate parent and its parent (the grandparent), and so on. Similarly, for downward branch expansion, e.g., by specifying children or family, a maxDescendents value may be specified, e.g., in which a maximum parameter value of one requests the immediate children, two requests the children and grandchildren, and so on.
Continuing with the XML example, the following thus allows the user to determine how many generations (one in this example) of descendants to expand:
Note that an entire generation is retrieved in one query/round trip. As can be appreciated, the maximum descendants value is only relevant if the expansion direction is family or children. If this value is not specified, a default setting is used, which may be client-configurable. Further, this value could be constrained by a client-specified value, e.g., not greater than three.
As is understood, ancestors (parents) are similarly handled:
This optional keyedReference allows the client to determine how many generations of ancestors (two in this example) to expand. An entire generation is retrieved in one query/round trip, and the value is only relevant if the expansionDirection is family or parent. If this value is not specified, a default setting is used, which may be client configurable. Also, this value could be constrained by a client specified value.
Note that in one implementation, the number of siblings cannot be specified. Thus, in this implementation, if family or sibling is specified for expansionDirection, then all of the direct siblings of the origin keyed reference are returned. Note however that in alternative implementations it is feasible to place a limit on the number of siblings returned, although the sibling selection mechanism may be as simple as cutting off the result set at whatever point the limit is reached.
By way of example of an expanded search, returning to
Specifying “Seattle” as the origin and child (or family) with a maxDescendants value of one returns the neighborhood/districts nodes' data, including those shown in
As described above, specifying “family” as in the example XML code returns parents up to the maximum ancestors value, children up to the maximum descendants value and siblings.
In this manner, a straightforward request such as formatted in XML may be used to return an expanded result set. As can be readily appreciated, instead of relying on the client to have intimate knowledge of the taxonomy structure (which would be impractical), or having each node contain the data of its family (which would defeat the purpose of a structured taxonomy), the information specified in the request expands the result set as desired.
Note that in an alternative implementation, rather than modeling parameters as a tModel, a keyName can be used for the parameter value. For example, when specifying the origin, the following alternative structure may be implemented:
This may be somewhat less advantageous, because a keyName expression is required to specify the parameter name. However, this alternative modeling approach is semantically identical, and would not result in significant implementation differences.
Turning to an explanation of the operation of the present invention, the general request operation is represented in
A request handling mechanism 510 of the taxonomy service's server 512 attempts to locate the appropriate taxonomy, e.g., 516, in an appropriate database 518 (comprising any suitable data structure). If located, the request handling mechanism 514 queries the database 518 for each find request, as generally represented in
As can be readily appreciated, the middle tier 508 is shown as separate from the server 512 in
Note that the present invention may be combined with other taxonomy-based mechanisms. For example, U.S. patent application Ser. No. 10/607,812 entitled “System and Method for Enabling Client Applications to Interactively Obtain and Present Taxonomy Information,” (assigned to the assignee of the present invention and hereby incorporated by reference) describes a system and method in which client applications may interactively obtain taxonomy information from a UDDI server and thereby present that information to a user, such as to enable navigation through the taxonomy. An application programming interface is provided by which a client application sends a unique taxonomy identifier and a relationship qualifier (e.g., root, parent and/or child) to a server. The client may also identify a reference node within the taxonomy. The server receives the (e.g., XML) request message, and extracts the data to query a database based on the relationship qualifier (or qualifiers) and the taxonomy/reference node. Based on the query results, the server returns a response that provides relationship information to the client, such as information on root, parent and/or child nodes that satisfy the request. The client interprets the response to present the taxonomy, such as for user navigation through the taxonomy.
As can be readily appreciated, when such navigation is combined with the present invention, the client may, for example, navigate to a node and select that node as the origin node from which a branching search in accordance with an aspect of the present invention is desired. Alternatively, a client could receive expanded results from a node via branch matching and use the navigation mechanism to see the relationship between the nodes.
As can be seen from the foregoing detailed description, there is provided a method and system by which clients can obtain expanded results from a taxonomy (e.g., any categorizations and identifier based taxonomies) based on data obtained from nodes having a genealogical relationship with an origin node. The method and system thus expand searches to specified directions (branches) in a taxonomy, in a manner that is straightforward for clients to use and is flexible, thereby providing access to relevant information without prior knowledge of the taxonomy structure, and without requiring modification of the way in which existing find handling mechanisms operate. The method and system thus provide significant advantages and benefits needed in contemporary computing.
While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.