The invention is described in detail with reference to the figures, where
A basic scenario is illustrated by
The scenario continues with an administrator deploying an additional service 1105 on the service infrastructure 1500.
To do so, the administrator provides the WSDL interface of the service as well as the package corresponding to the service implementation to an administration tool 1107. The administration tool 1107 sends a request 1400 to the discovery service 1106. The discovery service parses the WSDL interface and extracts a data model from this document.
The data model consists of data structures corresponding to the methods defined on the WSDL interface as well as the method argument data structures, described as XML schema in the WSDL document. This data model is stored in the metadata repository 1207.
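The following sketch illustrates, purely as an assumption and with illustrative class and method names, how a discovery service such as 1106 could extract a rudimentary data model (operation names and XML Schema type names) from a WSDL document before storing it in a metadata repository such as 1207; it is a minimal sketch based on plain DOM parsing, not the actual implementation of the invention.

```java
import java.util.*;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;

// Hypothetical sketch: extract a simple data model (operations and the
// XML Schema data structures used as method arguments) from a WSDL document.
public class WsdlDataModelExtractor {

    private static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    private static final String XSD_NS  = "http://www.w3.org/2001/XMLSchema";

    public Map<String, List<String>> extract(String wsdlFile) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(wsdlFile);

        // Operation names defined on the port type.
        List<String> operations = new ArrayList<>();
        NodeList ops = doc.getElementsByTagNameNS(WSDL_NS, "operation");
        for (int i = 0; i < ops.getLength(); i++) {
            operations.add(((Element) ops.item(i)).getAttribute("name"));
        }

        // Data structures declared as XML Schema types in the <types> section.
        List<String> dataTypes = new ArrayList<>();
        NodeList types = doc.getElementsByTagNameNS(XSD_NS, "complexType");
        for (int i = 0; i < types.getLength(); i++) {
            dataTypes.add(((Element) types.item(i)).getAttribute("name"));
        }

        Map<String, List<String>> model = new LinkedHashMap<>();
        model.put("operations", operations);
        model.put("dataTypes", dataTypes);
        return model;  // to be persisted in the metadata repository
    }
}
```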
The discovery service 1106 consults its knowledge base 1206, which contains an ontology and/or semantic definitions of these data structures, or of similar ones inserted into the model during previous service deployments, i.e. when service 1103 or 1104 was deployed. It tries to resolve any dependencies and relationships between the new service data model and what it had already discovered previously.
When new data structures or particular fields in those data structures remain unresolved, i.e. cannot be related to any existing ontology, the operator is requested 1401, through the administration tool 1107, to provide additional ontology descriptions for them.
The administration tool replies 1402 with the new associations. They are stored by the discovery service 1106 in the knowledge base 1206.
When all data structures and fields have been classified, a reasoner 1205 searches for relationships between the data structures; this is a kind of type-inference mechanism.
For each such relationship, the system tries to construct the corresponding mapping function automatically, based on previously discovered relationships between individual fields of the composite data structures.
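A minimal sketch of how such a composite mapping function could be assembled from previously discovered field-level relationships follows; the class and method names are hypothetical assumptions, and field values are simplified to strings.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical sketch: compose a mapping function for a composite data
// structure out of previously discovered field-level mappings.
public class MappingComposer {

    // A known field-level relationship: source field, target field, and the
    // value conversion discovered earlier (often simply the identity).
    public static final class FieldMapping {
        final String sourceField, targetField;
        final Function<String, String> convert;
        FieldMapping(String s, String t, Function<String, String> c) {
            this.sourceField = s; this.targetField = t; this.convert = c;
        }
    }

    /** Builds a structure-level mapping from the field-level mappings. */
    public Function<Map<String, String>, Map<String, String>>
            compose(List<FieldMapping> fieldMappings) {
        return source -> {
            Map<String, String> target = new LinkedHashMap<>();
            for (FieldMapping fm : fieldMappings) {
                String value = source.get(fm.sourceField);
                if (value != null) {
                    target.put(fm.targetField, fm.convert.apply(value));
                }
            }
            return target;  // fields with no known relationship stay unresolved
        };
    }
}
```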
A manual verification step may be required to make sure that the automatically generated mappings are accurate. Additionally, manual intervention may be required for complex mapping scenarios that cannot easily be handled by an XSLT script or that require additional information to be retrieved from external systems, such as attribute providers.
When relationships cannot be fully resolved automatically, the operator could again be asked 1401 to provide a mapping. This mapping is stored in the discovery service 1106 knowledge base 1206.
The associated mapping function is deployed 1403 in the transformation engine 1102, so that it becomes available as a service 1201 in the service infrastructure; a message that has to be transformed accordingly is routed through this service.
As more and more relationships are found between individual data structure fields, future service deployments will benefit from this information, so that the process becomes increasingly automatic.
All services 2103, 2104 and 2105 are connected to a service infrastructure 2500. This could be an enterprise service bus or an equivalent message broker. The service infrastructure contains a content-based router 2101 by which all requests destined for services 2103, 2104, and 2105 deployed on the service infrastructure 2500 are intercepted and routed.
Upon receiving message 2400 from client 2100, the content-based router 2101 first consults the discovery service 2106 to find out whether other services are impacted by the update operation associated with the message 2400, before routing the message 2400 to its intended destination (service 2103), as indicated by arrow 2402 in the figure.
In this example scenario, the discovery service 2106 responds with two routes: one route via a first transformation function 2200 towards target service 2104, and one route via a second transformation function 2201 towards target service 2105. Each transformation function transforms the original message 2400 into an equivalent message, i.e. a message that causes the same updates to the shared data in the databases 2203 and 2204 of the impacted services 2104 and 2105 and that complies with the interface exposed by each impacted service 2104 and 2105, as indicated by arrows 2404 and 2406 respectively.
The content-based router 2101, having received the routes from the discovery service 2106, first forwards the original message 2400 to its originally intended target service 2103, as indicated by arrow 2402. Then, the content-based router 2101 processes the first route by first sending the message 2400 to the first transformation function 2200, as indicated by arrow 2403, and next sending the resulting, i.e. transformed, message to service B 2104, as indicated by arrow 2404. Finally, the content-based router 2101 processes the second route by first sending the message 2400 to the second transformation function 2201, as indicated by arrow 2405, and next sending the resulting, i.e. transformed, message to service C 2105.
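The routing behaviour just described could be sketched as follows; all interfaces and names are illustrative assumptions and not the actual router implementation of the invention.

```java
import java.util.List;

// Hypothetical sketch of the routing behaviour described above: the router
// first delivers the original message to its intended target, then follows
// every route returned by the discovery service (transformation + target).
public class ContentBasedRouter {

    interface Endpoint { void send(String message); }                 // e.g. 2103..2105
    interface Transformation { String transform(String message); }    // e.g. 2200, 2201

    static final class Route {
        final Transformation transformation;
        final Endpoint target;
        Route(Transformation t, Endpoint e) { this.transformation = t; this.target = e; }
    }

    interface DiscoveryService {                                       // e.g. 2106
        List<Route> impactedRoutes(String message, Endpoint intendedTarget);
    }

    private final DiscoveryService discovery;
    ContentBasedRouter(DiscoveryService discovery) { this.discovery = discovery; }

    void route(String message, Endpoint intendedTarget) {
        // Consult the discovery service before delivering the message.
        List<Route> routes = discovery.impactedRoutes(message, intendedTarget);

        // Forward the original message to its intended destination.
        intendedTarget.send(message);

        // Process each route: transform, then deliver to the impacted service.
        for (Route r : routes) {
            String transformed = r.transformation.transform(message);
            r.target.send(transformed);
        }
    }
}
```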
Both services 2104 and 2105 perform the logic associated with messages 2404 and 2406 respectively, i.e. they update their data stores 2203 and 2204, respectively.
Another area where this invention is of importance is in an SCA-compliant (Service Component Architecture) service environment, see
At least one transformation function (possibly the identity) 3500, 3501, and 3502 is associated with a link/binding 3400, 3401, and 3402.
In the context of the ESB environment, a dedicated Federated Data Manager (FDM) can significantly help to realize this data federation model. Conceptually, an FDM can be thought of as consisting of a discovery service, a retrieval service, and a provisioning service.
Discovery means locating the data available on the bus and maintaining a model that represents this data; retrieval, or query, means supporting integrated queries that search across the different services and data models; and provisioning means providing data for newly registered services based on data already available on the bus.
An FDM could also be used for synchronization, that is, for keeping similar data in a consistent state.
Traditional service discovery, as provided by UDDI, enables businesses to publish service listings and discover services from other businesses. The meta data available in the registry is suited to describing and searching for services, but it is rather limited and mainly concerns businesses, protocols and standard classifications, even when enriched with semantic denotations.
In the light of data-driven services, this discovery functionality is not sufficient. A contribution of this invention is the analysis of the requirements of data-driven service discovery and the presentation of a general model of such an advanced discovery service.
An FDM Reg is illustrated in
For data-driven service discovery, it is necessary to define relationships between data types in order to support the integration of the data models of the different services. Whenever a new service is registered with the discovery service, the discovery service will update the data model and discover and instantiate new relationships.
As an extension, meta data could be used for these data types and relationships to support a classification model, leading to more semantic data discovery, i.e. reasoning on a meta level, e.g. locating a service that deals with multimedia content rather than just looking for content like movies or books.
Regarding FDM service mediation, it is necessary for the discovery service to know the semantic differences between related data-types. For instance, the format of address information used by an address book service might differ from an instant messaging service by the order in which data fields are stored, or by information that is represented as separate data fields in one type versus aggregated fields in the other type.
Hence, in addition to the relations between different data types, the discovery service should preferably incorporate knowledge of how to convert or transform these data types. This can be achieved by associating every data relationship with (knowledge on how to make use of) a transformation service, which is able to convert one data type in the relationship into the other, and vice versa if the relationship is bidirectional rather than unidirectional.
The discovery service is able to navigate through the resulting data model and to deduce, using these transformations, how to map one service onto another via their data models. In this context the term route is also used for such mappings. A primary use of these routes is the autonomous synchronization Sy of data between the incorporated services.
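A minimal sketch of how such a data-type relationship, together with its associated transformation service and directionality, might be recorded by the discovery service follows; the class and field names are assumptions made for illustration only.

```java
// Hypothetical sketch of a stored data-type relationship: it names the two
// related types, the kind of relationship, the transformation service able
// to realize the conversion, and whether the inverse conversion also exists.
public final class DataRelationship {

    final String sourceType;             // e.g. "Address"
    final String targetType;             // e.g. "VCard"
    final String relationshipKind;       // e.g. "subtype of", "part of"
    final String transformationService;  // endpoint converting source -> target
    final boolean bidirectional;         // true if the inverse conversion exists

    DataRelationship(String sourceType, String targetType, String relationshipKind,
                     String transformationService, boolean bidirectional) {
        this.sourceType = sourceType;
        this.targetType = targetType;
        this.relationshipKind = relationshipKind;
        this.transformationService = transformationService;
        this.bidirectional = bidirectional;
    }
}
```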
In summary, such data discovery consists of three major activities:
When a new service is registered at the discovery service, the interface of the service will be inspected and a data model will be extracted. A number of situations are possible depending on the nature of the interface and the significance of the data part on the interface.
The most difficult case—and currently also the most frequent case since such a data federation is not applied—is the extraction of the data model from a service that is unaware of data federation. The significance of the data part on the interface will be small and the information the discovery service will be able to extract will be rather limited.
For instance, a WSDL description usually contains only a basic description of the data types used on the input or output of the operations of a service. More appropriate for data-driven services is an interface with a separate data interface, describing the data types in more detail and how the different data types can be read or written, i.e. manipulated by using the public access operations.
Getters and setters for properties of JavaBeans components are a good example of such access operations. In the most ideal case, the data types are also described semantically, e.g. using in-lined Web Ontology Language (OWL) constructs or using a separate OWL file, relating the types to other, known, types or integrating them in a common or standard ontology.
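As a small illustration, a JavaBeans-style data type with such public access operations might look as follows; the choice of fields is purely illustrative.

```java
// A minimal JavaBean: the property accessors below are the public access
// operations through which this data type can be read or written.
public class Address {
    private String street;
    private String city;
    private String postalCode;

    public String getStreet()             { return street; }
    public void   setStreet(String s)     { this.street = s; }
    public String getCity()               { return city; }
    public void   setCity(String c)       { this.city = c; }
    public String getPostalCode()         { return postalCode; }
    public void   setPostalCode(String p) { this.postalCode = p; }
}
```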
The integration of the service's data model in the currently stored data model boils down to distinguishing between new and already existing data types and identifying relationships between new data types and previously known data types.
The more detailed the information extracted from the interface, the more meaningfully the new data types can be integrated into the currently stored data model. A deciding factor here is the use of explicit types. If, for example, all data of some service is modeled using strings, the discovery service will not be able to infer many meaningful relationships with the data models of other services. The higher the degree of semantics in the interface, the more autonomously the integration can occur. If the new data types are defined independently, without a reference or relation to other types, it is next to impossible to integrate these types fully autonomously. In this case, relating the new types to the stored data model requires world knowledge, provided e.g. by a discovery administrator.
If, however, semantic information is present in the interface, the integration can happen by reasoning over the semantic information present in the registry and the interface. Most likely, this semantic information will come in the form of a reference to a standard or common ontology. In this case, the discovery service can directly extract the correct relationships from this ontology.
For the discovery service to be able to search for related services through their data models, it needs some rules to define which relations at the level of the data model can introduce relations at the level of services.
It can for instance define a set of semantically related operations of a particular operation S as the (transitive) closure of a relation R between operations. An operation X is related to an operation Y if the inputs of X and Y overlap, for instance in the sense that the input type of one is a subtype, or a part, of the input type of the other.
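This definition could be formalized roughly as follows; the notation in(X) for the set of input data types of an operation X is an assumption introduced only for this sketch.

```latex
% in(X): the set of input data types of operation X (assumed notation)
R(X, Y) \;\Longleftrightarrow\; \exists\, t_X \in \mathrm{in}(X),\; t_Y \in \mathrm{in}(Y):\;
    t_X = t_Y \;\lor\; t_X \text{ is a subtype of } t_Y \;\lor\; t_X \text{ is a part of } t_Y
\qquad
\mathrm{related}(S) \;=\; \{\, X \mid (S, X) \in R^{+} \,\}
```

Here R⁺ denotes the transitive closure of the relation R between operations.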
A more practical approach could introduce a relationship isTransformableTo, which only means that there exists a transformation from one data type to the other. For each of the relationships subtype of, part of, and isTransformableTo, there is an association with a transformation service.
The above definition of related operations then specifies a sequence of transformations to go from one data type or operation to another data type or operation. This sequence of operations is actually the route that is used for the automatic synchronization between services in a data federation manager.
For the example in the case of the address book and the instant messenger, there could be a route from an UpdateAddress operation to an UpdateVCard operation via the transformations that map UpdateAddress to the Address data type, the Address data type to the address type as it is used in the VCard data type, and from there, via VCard, to UpdateVCard.
For a concrete implementation, one needs both a data description and a data discovery technology. One can use for instance both WSDL and OWL, without any need for further integration. That is, OWL can be used as such within a WSDL specification, or it can be used as a separate specification file. Regarding the data discovery technology, one can choose for instance ebXML over UDDI, since it offers a much more expressive data model and query application programmer interface.
ebXML could be used as a set of specifications for electronic business collaboration, of which discovery is one part. The registry used by ebXML consists of both a registry and a repository. The repository is capable of storing any type of electronic content, while the registry is capable of storing meta data that describes that content. The content within the repository is referred to as “repository items” while the meta data within the registry is referred to as “registry objects”.
The ebXML registry defines a registry information model (RIM) which specifies the standard meta data that may be submitted to the registry. The main features of the information model include:
The ebXML query service makes full use of the data model. All information can be used to search for items in the registry, e.g. all RegistryObjects that are associated with a certain item or all Service items that are classified with a certain ClassificationNode. To enhance the data classification model in the ebXML registry with semantic relationships, the constructs available in ebXML can be used. The ebXML registry information model can be used to simulate an OWL description of data classes.
An architecture has been defined for the data discovery service prototype using ebXML as a backbone component.
A system administrator will use it for maintenance, especially on the data models and the relationships between them. A query interface Q is used for searching the information stored in the registry. It offers one specific operation, mainly used by the synchronization service to find routes to related services, and one generic operation for structured query language (SQL) like queries as defined in the ebXML standard. An ebXML component EB is a fully ebXML-standard-compliant registry and discovery service. It will be used by both the discovery component D and third-party clients: the former will use it as a registry that stores the available services together with their data models, including relationships between these models and associated transformations, while the latter can use it as a traditional discovery service. A QueryFacade component QF could handle recursive queries, for example to search through transitive relations. This component is necessary because the ebXML standard specification does not include this functionality.
The interfaces of the discovery component Q, A, and LC mainly use WSDL and OWL formats as input and output, but internally, the discovery registry is based on the ebXML format. Extraction of the data model will thus come down to transforming WSDL and OWL to the ebRIM and ebRS publication format.
Services could be represented with a Service class and the rest of the information from the WSDL comes in the ServiceBinding and SpecificationLink classes. The data model used by the service is mapped to a ClassificationScheme, where each ClassificationNode represents one type in the data model and is associated with the service using a Classification.
For example, the above-mentioned address book service could be stored in the ebXML registry. The service is classified with two data types, one for changing address information and another for adding new entries to the address book. Let these types consist of an address type, a person type, and strings.
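Purely as an illustration of this mapping, the example could be mirrored with simplified stand-in classes named after the RIM classes mentioned above; this sketch is not the JAXR or ebXML registry API itself, and the service name, URI and type names are invented for the example.

```java
import java.util.*;

// Hypothetical, simplified mirror of the RIM classes named above, showing
// how a WSDL service and its data model map onto the registry model.
public class RimMappingSketch {

    static class ClassificationNode { final String typeName; ClassificationNode(String n) { typeName = n; } }
    static class ClassificationScheme { final List<ClassificationNode> nodes = new ArrayList<>(); }
    static class Classification { final ClassificationNode node; Classification(ClassificationNode n) { node = n; } }
    static class SpecificationLink { final String wsdlLocation; SpecificationLink(String w) { wsdlLocation = w; } }
    static class ServiceBinding {
        final String accessUri; final SpecificationLink link;
        ServiceBinding(String uri, SpecificationLink l) { accessUri = uri; link = l; }
    }
    static class Service {
        final String name;
        final List<ServiceBinding> bindings = new ArrayList<>();
        final List<Classification> classifications = new ArrayList<>();
        Service(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        // The data model of the address book service as a classification scheme.
        ClassificationScheme dataModel = new ClassificationScheme();
        ClassificationNode address = new ClassificationNode("Address");
        ClassificationNode person  = new ClassificationNode("Person");
        dataModel.nodes.add(address);
        dataModel.nodes.add(person);

        // The service itself, its binding/specification link, and the
        // classifications that associate it with its data types.
        Service addressBook = new Service("AddressBookService");
        addressBook.bindings.add(new ServiceBinding(
                "http://example.org/addressbook",        // illustrative URI
                new SpecificationLink("AddressBook.wsdl")));
        addressBook.classifications.add(new Classification(address));
        addressBook.classifications.add(new Classification(person));
    }
}
```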
When a new service is published in the registry, the new data model elements should be inserted into the registry and the service's data model should be associated with the data types already stored in the registry. The discovery service might not be able to accomplish the latter fully autonomously; it could then deduce a set of suggested data type relationships, to be finalized e.g. by a system administrator.
Some simplifications w.r.t. the associations could be based on the full equivalence between data types, e.g. when a type is already available in the registry, its service-specific relations will have to be added to the registry as well. To make this deduction sound and complete, the system administrator could extend the service description with semantic data information by embedding OWL constructs in the WSDL publication.
To search through the model for routes between operations of different services, one can use Floyd-Warshall-like algorithms, or single-pair shortest-path searches, i.e. algorithms of the Dijkstra type.
The services can be concatenated in the category of arrows. A sequence of concatenated invocations corresponds to a path in the graph (bold) having a start S and an end E. The constraint is that the data types need to be consistent, i.e. the Nth arrow ends at the bullet where the (N+1)th arrow begins. The path corresponds to a (virtual) service having input type S and output type E (dashed).
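A minimal sketch of such a route search is given below, under the assumption that data types are graph nodes and registered transformations are directed edges; an unweighted breadth-first search is used here, while Dijkstra or Floyd-Warshall apply equally when transformations carry costs. All names are illustrative.

```java
import java.util.*;

// Hypothetical sketch: find a sequence of transformation services leading
// from an input data type S to an output data type E.
public class RouteFinder {

    // adjacency: source type -> (target type -> transformation service id)
    private final Map<String, Map<String, String>> edges = new HashMap<>();

    public void addTransformation(String from, String to, String serviceId) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(to, serviceId);
    }

    /** Returns the transformation services to invoke in order, or null if none. */
    public List<String> findRoute(String start, String end) {
        Map<String, String> cameFrom = new HashMap<>();   // type -> previous type
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        cameFrom.put(start, null);

        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(end)) {
                return reconstruct(cameFrom, end);
            }
            for (String next : edges.getOrDefault(current,
                    Collections.<String, String>emptyMap()).keySet()) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return null;  // no sequence of transformations connects start and end
    }

    private List<String> reconstruct(Map<String, String> cameFrom, String end) {
        LinkedList<String> route = new LinkedList<>();
        for (String t = end; cameFrom.get(t) != null; t = cameFrom.get(t)) {
            route.addFirst(edges.get(cameFrom.get(t)).get(t));
        }
        return route;
    }
}
```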
The discovery service according to the invention is aware of the service network shown in
In
That enables the client to invoke the service chain defined by the path, as shown in
To summarize: a client C that seeks a service with the input data type S and the output data type E can ask the dedicated service X for a sequence of service invocations providing the sought service. The dedicated service X could look up the data types in its memory and can calculate a path, e.g. via Dijkstra's algorithm or by means of a transitive closure via the Floyd-Warshall algorithm. That enables the client to invoke the services in a concatenated way.
This technique enables an arrangement for the adaptation of a message exchanged between a consumer service and multiple provider services in e.g. a service-oriented architecture, where the arrangement includes a discovery service comprising storage means for distinct service data models, each service data model being associated with a provider service, and storage means for relationships between said service data models. That enables the discovery of message routes, one message route per provider service, each message route being defined as a sequence of zero or more service invocations, optionally with a transformation that is associated with a data model relationship. The discovery means is able to adapt the message of said consumer service into a message route intended for a designated provider service. Preferably, the discovery means is able to include the message of said consumer service as the last provider service in said sequence of the message route.
The discovery means might further comprise a reasoner (2205) adapted to support the automatic deduction of new relationships between service data models based on previously established relationships. The discovery means might be adapted to automatically determine at least one additional target provider service based on the impact that said message, exchanged between said consumer service and said designated provider service, has on the service data models of other provider services, in order to support the synchronization of data shared between provider services.