The invention is described in detail with reference to the figures, where
A basic scenario is illustrated by
The scenario continues with an administrator deploying an additional service 1105 on the service infrastructure 1500.
To do so, the administrator provides the WSDL interface of the service as well as the package corresponding to the service implementation to an administration tool 1107. The administration tool 1107 sends a request 1400 to the discovery service 1106. The discovery service parses the WSDL interface and extracts a data model from this document.
The data model consists of data structures corresponding to the methods defined on the WSDL interface as well as the method argument data structures, described as XML schema in the WSDL document. This data model is stored in the metadata repository 1207.
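The following sketch illustrates, purely as an assumption and with illustrative class and method names, how a discovery service such as 1106 could extract a rudimentary data model (operation names and XML Schema type names) from a WSDL document before storing it in a metadata repository such as 1207; it is a minimal sketch based on plain DOM parsing, not the actual implementation of the invention.

```java
import java.util.*;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;

// Hypothetical sketch: extract a simple data model (operations and the
// XML Schema data structures used as method arguments) from a WSDL document.
public class WsdlDataModelExtractor {

    private static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    private static final String XSD_NS  = "http://www.w3.org/2001/XMLSchema";

    public Map<String, List<String>> extract(String wsdlFile) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(wsdlFile);

        // Operation names defined on the port type.
        List<String> operations = new ArrayList<>();
        NodeList ops = doc.getElementsByTagNameNS(WSDL_NS, "operation");
        for (int i = 0; i < ops.getLength(); i++) {
            operations.add(((Element) ops.item(i)).getAttribute("name"));
        }

        // Data structures declared as XML Schema types in the <types> section.
        List<String> dataTypes = new ArrayList<>();
        NodeList types = doc.getElementsByTagNameNS(XSD_NS, "complexType");
        for (int i = 0; i < types.getLength(); i++) {
            dataTypes.add(((Element) types.item(i)).getAttribute("name"));
        }

        Map<String, List<String>> model = new LinkedHashMap<>();
        model.put("operations", operations);
        model.put("dataTypes", dataTypes);
        return model;  // to be persisted in the metadata repository
    }
}
```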
The discovery service 1106 consults its knowledge base 1206, which contains an ontology and/or semantic definitions of these data structures, or of similar ones inserted into the model during previous service deployments, i.e. when service 1103 or 1104 was deployed. It tries to resolve any dependencies and relationships between the new service data model and what it had already discovered previously.
When new data structures or particular fields in those data structures remain unresolved, i.e. cannot be related to any existing ontology, the operator is requested 1401, through the administration tool 1107, to provide additional ontology descriptions for them.
The administration tool replies 1402 with the new associations. They are stored by the discovery service 1106 in the knowledge base 1206.
When all data structures and fields have been classified, a reasoner 1205 searches for relationships between the data structures; this is a kind of type-inference mechanism.
For each such relationship, the system tries to construct the corresponding mapping function automatically, based on previously discovered relationships between individual fields of the composite data structures.
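A minimal sketch of how such a composite mapping function could be assembled from previously discovered field-level relationships follows; the class and method names are hypothetical assumptions, and field values are simplified to strings.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical sketch: compose a mapping function for a composite data
// structure out of previously discovered field-level mappings.
public class MappingComposer {

    // A known field-level relationship: source field, target field, and the
    // value conversion discovered earlier (often simply the identity).
    public static final class FieldMapping {
        final String sourceField, targetField;
        final Function<String, String> convert;
        FieldMapping(String s, String t, Function<String, String> c) {
            this.sourceField = s; this.targetField = t; this.convert = c;
        }
    }

    /** Builds a structure-level mapping from the field-level mappings. */
    public Function<Map<String, String>, Map<String, String>>
            compose(List<FieldMapping> fieldMappings) {
        return source -> {
            Map<String, String> target = new LinkedHashMap<>();
            for (FieldMapping fm : fieldMappings) {
                String value = source.get(fm.sourceField);
                if (value != null) {
                    target.put(fm.targetField, fm.convert.apply(value));
                }
            }
            return target;  // fields with no known relationship stay unresolved
        };
    }
}
```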
A manual verification step may be required to make sure that the automatically generated mappings are accurate. Additionally, manual intervention may be required for complex mapping scenarios that cannot easily be handled by an XSLT script or that require additional information to be retrieved from external systems, such as attribute providers.
When relationships cannot be fully resolved automatically, the operator could again be asked 1401 to provide a mapping. This mapping is stored in the discovery service 1106 knowledge base 1206.
The associated mapping function is deployed 1403 in the transformation engine 1102, so that it becomes available as a service 1201 in the service infrastructure; a message that has to be transformed accordingly is routed through this service.
As more and more relationships are found between individual data structure fields, future service deployments will benefit from this information, so that the process becomes increasingly automatic.
All services 2103, 2104 and 2105 are connected to a service infrastructure 2500. This could be an enterprise service bus or an equivalent message broker. The service infrastructure contains a content-based router 2101 by which all requests destined for services 2103, 2104, and 2105 deployed on the service infrastructure 2500 are intercepted and routed.
Upon receiving message 2400 from client 2100, the content-based router 2101 first consults the discovery service 2106 to find out whether other services are impacted by the update operation associated with the message 2400, before routing the message 2400 to its intended destination (service 2103), as indicated by arrow 2402 in the figure.
In this example scenario, the discovery service 2106 responds with two routes: one route via a first transformation function 2200 towards target service 2104, and one route via a second transformation function 2201 towards target service 2105. Each transformation function transforms the original message 2400 into an equivalent message, i.e. a message that causes the same updates to the shared data in the databases 2203 and 2204 of the impacted services 2104 and 2105 and that complies with the interface exposed by each impacted service 2104 and 2105, as indicated by arrows 2404 and 2406 respectively.
The content-based router 2101, having received the routes from the discovery service 2106, first forwards the original message 2400 to its originally intended target service 2103, as indicated by arrow 2402. Then, the content-based router 2101 processes the first route by first sending the message 2400 to the first transformation function 2200, as indicated by arrow 2403, and next sending the resulting, i.e. transformed, message to service B 2104, as indicated by arrow 2404. Finally, the content-based router 2101 processes the second route by first sending the message 2400 to the second transformation function 2201, as indicated by arrow 2405, and next sending the resulting, i.e. transformed, message to service C 2105.
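The routing behaviour just described could be sketched as follows; all interfaces and names are illustrative assumptions and not the actual router implementation of the invention.

```java
import java.util.List;

// Hypothetical sketch of the routing behaviour described above: the router
// first delivers the original message to its intended target, then follows
// every route returned by the discovery service (transformation + target).
public class ContentBasedRouter {

    interface Endpoint { void send(String message); }                 // e.g. 2103..2105
    interface Transformation { String transform(String message); }    // e.g. 2200, 2201

    static final class Route {
        final Transformation transformation;
        final Endpoint target;
        Route(Transformation t, Endpoint e) { this.transformation = t; this.target = e; }
    }

    interface DiscoveryService {                                       // e.g. 2106
        List<Route> impactedRoutes(String message, Endpoint intendedTarget);
    }

    private final DiscoveryService discovery;
    ContentBasedRouter(DiscoveryService discovery) { this.discovery = discovery; }

    void route(String message, Endpoint intendedTarget) {
        // Consult the discovery service before delivering the message.
        List<Route> routes = discovery.impactedRoutes(message, intendedTarget);

        // Forward the original message to its intended destination.
        intendedTarget.send(message);

        // Process each route: transform, then deliver to the impacted service.
        for (Route r : routes) {
            String transformed = r.transformation.transform(message);
            r.target.send(transformed);
        }
    }
}
```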
Both services 2104 and 2105 perform the logic associated with messages 2404 and 2406 respectively, i.e. they update their data stores 2203 and 2204, respectively.
Another area where this invention is of importance is in an SCA-compliant (Service Component Architecture) service environment, see
At least one transformation function (possibly the identity) 3500, 3501, and 3502 is associated with a link/binding 3400, 3401, and 3402.
In the context of the ESB environment, a dedicated Federated Data Manager (FDM) can significantly help to realize this data federation model. Conceptually, an FDM can be thought of as consisting of a discovery service, a retrieval service, and a provisioning service.
Discovery means locating the data available on the bus and maintaining a model that represents this data; retrieval, or query, means supporting integrated queries that search across the different services and data models; and provisioning means providing data for newly registered services based on data already available on the bus.
An FDM could also be used for synchronization, that is, for keeping similar data in a consistent state.
Traditional service discovery, as provided by UDDI, enables businesses to publish service listings and discover services from other businesses. The meta data available in the registry is suited to describing and searching for services, but it is rather limited and mainly concerns businesses, protocols and standard classifications, even when enriched with semantic denotations.
In the light of data-driven services, this discovery functionality is not sufficient. A contribution of this invention is the analysis of the requirements of data-driven service discovery and the presentation of a general model of such an advanced discovery service.
An FDM Reg is illustrated in
For data-driven service discovery, it is necessary to define relationships between data types in order to support the integration of the data models of the different services. Whenever a new service is registered with the discovery service, the discovery service will update the data model and discover and instantiate new relationships.
As an extension, meta data could be used for these data types and relationships to support a classification model, leading to more semantic data discovery, i.e. reasoning on a meta level, e.g. locating a service that deals with multimedia content rather than just looking for content like movies or books.
Regarding FDM service mediation, it is necessary for the discovery service to know the semantic differences between related data-types. For instance, the format of address information used by an address book service might differ from an instant messaging service by the order in which data fields are stored, or by information that is represented as separate data fields in one type versus aggregated fields in the other type.
Hence, in addition to the relations between different data types, the discovery service should preferably incorporate knowledge of how to convert or transform these data types. This can be achieved by associating every data relationship with (knowledge on how to make use of) a transformation service, which is able to convert one data type in the relationship into the other, and vice versa if the relationship is bidirectional rather than unidirectional.
The discovery service is able to navigate through the resulting data model and to deduce, using these transformations, how to map one service onto another via their data models. In this context the term route is also used for such mappings. A primary use of these routes is the autonomous synchronization Sy of data between the incorporated services.
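A minimal sketch of how such a data-type relationship, together with its associated transformation service and directionality, might be recorded by the discovery service follows; the class and field names are assumptions made for illustration only.

```java
// Hypothetical sketch of a stored data-type relationship: it names the two
// related types, the kind of relationship, the transformation service able
// to realize the conversion, and whether the inverse conversion also exists.
public final class DataRelationship {

    final String sourceType;             // e.g. "Address"
    final String targetType;             // e.g. "VCard"
    final String relationshipKind;       // e.g. "subtype of", "part of"
    final String transformationService;  // endpoint converting source -> target
    final boolean bidirectional;         // true if the inverse conversion exists

    DataRelationship(String sourceType, String targetType, String relationshipKind,
                     String transformationService, boolean bidirectional) {
        this.sourceType = sourceType;
        this.targetType = targetType;
        this.relationshipKind = relationshipKind;
        this.transformationService = transformationService;
        this.bidirectional = bidirectional;
    }
}
```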
In summary, such data discovery consists of three major activities:
When a new service is registered at the discovery service, the interface of the service will be inspected and a data model will be extracted. A number of situations are possible depending on the nature of the interface and the significance of the data part on the interface.
The most difficult case—and currently also the most frequent case since such a data federation is not applied—is the extraction of the data model from a service that is unaware of data federation. The significance of the data part on the interface will be small and the information the discovery service will be able to extract will be rather limited.
For instance, a WSDL description usually contains only a basic description of the data types used on the input or output of the operations of a service. More appropriate for data-driven services is an interface with a separate data interface, describing the data types in more detail and how the different data types can be read or written, i.e. manipulated by using the public access operations.
Getters and setters for properties of JavaBeans components are a good example of such access operations. In the most ideal case, the data types are also described semantically, e.g. using in-lined Web Ontology Language (OWL) constructs or using a separate OWL file, relating the types to other, known, types or integrating them in a common or standard ontology.
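As a small illustration, a JavaBeans-style data type with such public access operations might look as follows; the choice of fields is purely illustrative.

```java
// A minimal JavaBean: the property accessors below are the public access
// operations through which this data type can be read or written.
public class Address {
    private String street;
    private String city;
    private String postalCode;

    public String getStreet()             { return street; }
    public void   setStreet(String s)     { this.street = s; }
    public String getCity()               { return city; }
    public void   setCity(String c)       { this.city = c; }
    public String getPostalCode()         { return postalCode; }
    public void   setPostalCode(String p) { this.postalCode = p; }
}
```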
The integration of the service's data model in the currently stored data model boils down to distinguishing between new and already existing data types and identifying relationships between new data types and previously known data types.
The more detailed the information extracted from the interface, the more meaningfully the new data types can be integrated into the currently stored data model. A deciding factor here is the use of explicit types. If, for example, all data of some service is modeled using strings, the discovery service will not be able to infer many meaningful relationships with the data models of other services. The higher the degree of semantics in the interface, the more autonomously the integration can occur. If the new data types are defined independently, without a reference or relation to other types, it is next to impossible to integrate these types fully autonomously. In this case, relating the new types to the stored data model requires world knowledge, provided e.g. by a discovery administrator.
If, however, semantic information is present in the interface, the integration can happen by reasoning over the semantic information present in the registry and the interface. Most likely, this semantic information will come in the form of a reference to a standard or common ontology. In this case, the discovery service can directly extract the correct relationships from this ontology.
For the discovery service to be able to search for related services through their data models, it needs some rules to define which relations at the level of the data model can introduce relations at the level of services.
It can for instance define a set of semantically related operations of a particular operation S as the (transitive) closure of a relation R between operations. An operation X is related to an operation Y if the inputs of X and Y overlap, for instance in the sense that the input type of one is a subtype, or a part, of the input type of the other.
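This definition could be formalized roughly as follows; the notation in(X) for the set of input data types of an operation X is an assumption introduced only for this sketch.

```latex
% in(X): the set of input data types of operation X (assumed notation)
R(X, Y) \;\Longleftrightarrow\; \exists\, t_X \in \mathrm{in}(X),\; t_Y \in \mathrm{in}(Y):\;
    t_X = t_Y \;\lor\; t_X \text{ is a subtype of } t_Y \;\lor\; t_X \text{ is a part of } t_Y
\qquad
\mathrm{related}(S) \;=\; \{\, X \mid (S, X) \in R^{+} \,\}
```

Here R⁺ denotes the transitive closure of the relation R between operations.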
A more practical approach could introduce a relationship isTransformableTo, which only means that there exists a transformation from one data type to the other. For each of the relationships subtype of, part of, and isTransformableTo, there is an association with a transformation service.
The above definition of related operations then specifies a sequence of transformations to go from one data type or operation to another data type or operation. This sequence of operations is actually the route that is used for the automatic synchronization between services in a data federation manager.
For the example in the case of the address book and the instant messenger, there could be a route from an UpdateAddress operation to an UpdateVCard operation via the transformations that map UpdateAddress to the Address data type, the Address data type to the address type as it is used in the VCard data type, and from there, via VCard, to UpdateVCard.
For a concrete implementation, one needs both a data description and a data discovery technology. One can use for instance both WSDL and OWL, without any need for further integration. That is, OWL can be used as such within a WSDL specification, or it can be used as a separate specification file. Regarding the data discovery technology, one can choose for instance ebXML over UDDI, since it offers a much more expressive data model and query application programmer interface.
ebXML could be used as a set of specifications for electronic business collaboration, of which discovery is one part. The registry used by ebXML consists of both a registry and a repository. The repository is capable of storing any type of electronic content, while the registry is capable of storing meta data that describes that content. The content within the repository is referred to as “repository items” while the meta data within the registry is referred to as “registry objects”.
The ebXML registry defines a registry information model (RIM) which specifies the standard meta data that may be submitted to the registry. The main features of the information model include:
The ebXML query service makes full use of the data model. All information can be used to search for items in the registry, e.g. all RegistryObjects that are associated with a certain item or all Service items that are classified with a certain ClassificationNode. To enhance the data classification model in the ebXML registry with semantic relationships, the constructs available in ebXML can be used. The ebXML registry information model can be used to simulate an OWL description of data classes.
An architecture has been defined for the data discovery service prototype using ebXML as a backbone component.
A system administrator will use it for maintenance, especially on the data models and the relationships between them. A query interface Q is used for searching the information stored in the registry. It offers one specific operation, mainly used by the synchronization service to find routes to related services, and one generic operation for structured query language (SQL) like queries as defined in the ebXML standard. An ebXML component EB is a fully ebXML-standard-compliant registry and discovery service. It will be used by both the discovery component D and third-party clients: the former will use it as a registry that stores the available services together with their data models, including relationships between these models and associated transformations, while the latter can use it as a traditional discovery service. A QueryFacade component QF could handle recursive queries, for example to search through transitive relations. This component is necessary because the ebXML standard specification does not include this functionality.
The interfaces of the discovery component Q, A, and LC mainly use WSDL and OWL formats as input and output, but internally, the discovery registry is based on the ebXML format. Extraction of the data model will thus come down to transforming WSDL and OWL to the ebRIM and ebRS publication format.
Services could be represented with a Service class and the rest of the information from the WSDL comes in the ServiceBinding and SpecificationLink classes. The data model used by the service is mapped to a ClassificationScheme, where each ClassificationNode represents one type in the data model and is associated with the service using a Classification.
For example, the above-mentioned address book service could be stored in the ebXML registry. The service is classified with two data types, one for changing address information and another for adding new entries to the address book. Let these types consist of an address type, a person type, and strings.
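Purely as an illustration of this mapping, the example could be mirrored with simplified stand-in classes named after the RIM classes mentioned above; this sketch is not the JAXR or ebXML registry API itself, and the service name, URI and type names are invented for the example.

```java
import java.util.*;

// Hypothetical, simplified mirror of the RIM classes named above, showing
// how a WSDL service and its data model map onto the registry model.
public class RimMappingSketch {

    static class ClassificationNode { final String typeName; ClassificationNode(String n) { typeName = n; } }
    static class ClassificationScheme { final List<ClassificationNode> nodes = new ArrayList<>(); }
    static class Classification { final ClassificationNode node; Classification(ClassificationNode n) { node = n; } }
    static class SpecificationLink { final String wsdlLocation; SpecificationLink(String w) { wsdlLocation = w; } }
    static class ServiceBinding {
        final String accessUri; final SpecificationLink link;
        ServiceBinding(String uri, SpecificationLink l) { accessUri = uri; link = l; }
    }
    static class Service {
        final String name;
        final List<ServiceBinding> bindings = new ArrayList<>();
        final List<Classification> classifications = new ArrayList<>();
        Service(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        // The data model of the address book service as a classification scheme.
        ClassificationScheme dataModel = new ClassificationScheme();
        ClassificationNode address = new ClassificationNode("Address");
        ClassificationNode person  = new ClassificationNode("Person");
        dataModel.nodes.add(address);
        dataModel.nodes.add(person);

        // The service itself, its binding/specification link, and the
        // classifications that associate it with its data types.
        Service addressBook = new Service("AddressBookService");
        addressBook.bindings.add(new ServiceBinding(
                "http://example.org/addressbook",        // illustrative URI
                new SpecificationLink("AddressBook.wsdl")));
        addressBook.classifications.add(new Classification(address));
        addressBook.classifications.add(new Classification(person));
    }
}
```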
When a new service is published in the registry, the new data model elements should be inserted into the registry and the service's data model should be associated with the data types already stored in the registry. The discovery service might not be able to accomplish the latter fully autonomously; it could then deduce a set of suggested data type relationships, to be finalized e.g. by a system administrator.
Some simplifications w.r.t. the associations could be based on the full equivalence between data types, e.g. when a type is already available in the registry, its service-specific relations will have to be added to the registry as well. To make this deduction sound and complete, the system administrator could extend the service description with semantic data information by embedding OWL constructs in the WSDL publication.
To search through the model for routes between operations of different services, one can use Floyd-Warshall-like algorithms, or single-pair shortest-path searches, i.e. algorithms of the Dijkstra type.
The services can be concatenated in the category of arrows. A sequence of concatenated invocations corresponds to a path in the graph (bold) having a start S and an end E. The constraint is that the data types need to be consistent, i.e. the Nth arrow ends at the bullet where the (N+1)th arrow begins. The path corresponds to a (virtual) service having input type S and output type E (dashed).
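A minimal sketch of such a route search is given below, under the assumption that data types are graph nodes and registered transformations are directed edges; an unweighted breadth-first search is used here, while Dijkstra or Floyd-Warshall apply equally when transformations carry costs. All names are illustrative.

```java
import java.util.*;

// Hypothetical sketch: find a sequence of transformation services leading
// from an input data type S to an output data type E.
public class RouteFinder {

    // adjacency: source type -> (target type -> transformation service id)
    private final Map<String, Map<String, String>> edges = new HashMap<>();

    public void addTransformation(String from, String to, String serviceId) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(to, serviceId);
    }

    /** Returns the transformation services to invoke in order, or null if none. */
    public List<String> findRoute(String start, String end) {
        Map<String, String> cameFrom = new HashMap<>();   // type -> previous type
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        cameFrom.put(start, null);

        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(end)) {
                return reconstruct(cameFrom, end);
            }
            for (String next : edges.getOrDefault(current,
                    Collections.<String, String>emptyMap()).keySet()) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return null;  // no sequence of transformations connects start and end
    }

    private List<String> reconstruct(Map<String, String> cameFrom, String end) {
        LinkedList<String> route = new LinkedList<>();
        for (String t = end; cameFrom.get(t) != null; t = cameFrom.get(t)) {
            route.addFirst(edges.get(cameFrom.get(t)).get(t));
        }
        return route;
    }
}
```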
The discovery service according to the invention is aware of the service network shown in
In
That enables the client to invoke the service chain defined by the path, as shown in
To summarize: a client C that seeks a service with the input data type S and the output data type E can ask the dedicated service X for a sequence of service invocations providing the sought service. The dedicated service X could look up the data types in its memory and can calculate a path, e.g. via Dijkstra's algorithm or by means of a transitive closure via the Floyd-Warshall algorithm. That enables the client to invoke the services in a concatenated way.
This technique enables an arrangement for the adaptation of a message exchanged between a consumer service and multiple provider services in e.g. a service-oriented architecture, where the arrangement includes a discovery service comprising storage means for distinct service data models, each service data model being associated with a provider service, and storage means for relationships between said service data models. That enables the discovery of message routes, one message route per provider service, each message route being defined as a sequence of zero or more service invocations, optionally with a transformation that is associated with a data model relationship. The discovery means is able to adapt the message of said consumer service into a message route intended for a designated provider service. Preferably, the discovery means is able to include the message of said consumer service as the last provider service in said sequence of the message route.
The discovery means might further comprise a reasoner (2205) adapted to support the automatic deduction of new relationships between service data models based on previously established relationships. The discovery means might be adapted to automatically determine at least one additional target provider service based on the impact that said message, exchanged between said consumer service and said designated provider service, has on the service data models of other provider services, in order to support the synchronization of data shared between provider services.