Faceted, tag-based approach for the design and composition of components and applications in component-based systems

Information

  • Patent Grant
  • 8490049
  • Patent Number
    8,490,049
  • Date Filed
    Wednesday, October 15, 2008
  • Date Issued
    Tuesday, July 16, 2013
Abstract
A method, including: receiving a software requirement; and constructing a workflow template that can satisfy the software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets.
Description
RELATED APPLICATION

This application is related to commonly assigned U.S. application entitled “DESCRIBING FORMAL END-USER REQUIREMENTS IN INFORMATION PROCESSING SYSTEMS USING A FACETED, TAG-BASED MODEL”, having Ser. No. 12/252,132, filed Oct. 15, 2008, the disclosure of which is incorporated by reference herein in its entirety.


BACKGROUND OF THE INVENTION

1. Technical Field


The present invention relates to service composition.


2. Discussion of the Related Art


The Web Services research community has proposed a number of approaches for service composition, ranging from manual to semi-automatic to completely automatic. However, it is often difficult to take independently developed services and compose them, since they may not work together correctly.


The conventional service-oriented architecture (SOA) lifecycle is essentially top-down, consisting of four phases: Model, Assemble, Deploy and Manage. Modeling is the process of capturing the business design from an understanding of business requirements and objectives. Business requirements are translated into a specification of business processes, goals, and assumptions for creating a model of the business. During the Assemble phase, the IT organization takes the business design and assembles information system artifacts that implement the design. In this phase, existing artifacts and applications may be reused to meet the needs of the design, and new artifacts may be created as well. The Deploy and Manage phases include hosting the applications and monitoring the production runtime environment.


While the SOA lifecycle does emphasize flexibility and reuse, in practice, it is difficult to respond rapidly to changing user requirements and processing needs. Typically, new user requirements are again addressed top-down by going through the four phases. However, it may be possible to address some new user requirements by assembling new workflows from available services in a bottom-up fashion. For this to happen, however, the services should be modeled and developed keeping in mind the needs of spontaneous composition.


There are a number of challenges in combining the top-down and bottom-up approaches to service engineering. Firstly, the requirements should be captured appropriately, at the right level of abstraction and formality. Next, one should ensure that an appropriate set of services is developed, which can be combined into workflows that satisfy the requirements. Another challenge is in enabling a high degree of reuse of both individual services and of workflows in different contexts and in different application domains. Finally, there should be mechanisms for end-users to have appropriate workflows rapidly assembled for them in response to their processing needs.


Accordingly, there is a need for services to be designed and developed in a manner that facilitates their composition.


SUMMARY OF THE INVENTION

In an exemplary embodiment of the present invention, a method, comprises: receiving a software requirement; and constructing a workflow template that can satisfy the software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets.


An input and an output of a component class are each described by a variable processing goal pattern that includes tags, facets and variables, and an input and an output of a component in the component class are each described by a set of tags and variables.


The method further comprises, prior to constructing the workflow template, representing the software requirement as a plurality of goal instances in a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets.


The method further comprises, after constructing the workflow template, for each goal instance, developing at least one workflow instance that can satisfy the goal instance, wherein the workflow instance follows or belongs to the workflow template.


A workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.


A tag is a keyword associated with an available resource.


A facet is a category that includes at least one tag.


A variable is associated with a set of tags, and wherein a variable is bound to a tag if the tag is a sub-tag of all tags in the set of tags.


In an exemplary embodiment of the present invention, a method, comprises: receiving a high-level software requirement; representing the high-level software requirement as a plurality of processing goals described by a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets; constructing a workflow template that can produce information to satisfy the high-level software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets; and for each of the plurality of processing goals, developing at least one workflow instance that can satisfy the processing goal, wherein the workflow instance follows or belongs to the workflow template, and wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.


The method further comprises: receiving at least one of the plurality of processing goals from a user, wherein the user processing goal includes at least one tag; producing information that satisfies the user processing goal by executing one of the workflow instances that belongs to the workflow template or by generating and executing a new workflow instance that does not belong to the workflow template; and providing the information to the user.


In an exemplary embodiment of the present invention, a computer readable storage medium stores instructions that, when executed by a computer, cause the computer to perform a method, the method comprising: receiving a software requirement; and constructing a workflow template that can satisfy the software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets.


An input and an output of a component class are each described by a variable processing goal pattern that includes tags, facets and variables, and an input and an output of a component in the component class are each described by a set of tags and variables.


The method further comprises, prior to constructing the workflow template, representing the software requirement as a plurality of goal instances in a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets.


The method further comprises, after constructing the workflow template, for each goal instance, developing at least one workflow instance that can satisfy the goal instance, wherein the workflow instance follows or belongs to the workflow template.


A workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.


A tag is a keyword associated with an available resource.


A facet is a category that includes at least one tag.


A variable is associated with a set of tags, and wherein a variable is bound to a tag if the tag is a sub-tag of all tags in the set of tags.


In an exemplary embodiment of the present invention, a computer readable storage medium stores instructions that, when executed by a computer, cause the computer to perform a method, the method comprising: receiving a high-level software requirement; representing the high-level software requirement as a plurality of processing goals described by a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets; constructing a workflow template that can produce information to satisfy the high-level software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets; and for each of the plurality of processing goals, developing at least one workflow instance that can satisfy the processing goal, wherein the workflow instance follows or belongs to the workflow template, and wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.


The method further comprises: receiving at least one of the plurality of processing goals from a user, wherein the user processing goal includes at least one tag; producing information that satisfies the user processing goal by executing one of the workflow instances that belongs to the workflow template or by generating and executing a new workflow instance that does not belong to the workflow template; and providing the information to the user.


The foregoing features are of representative embodiments and are presented to assist in understanding the invention. It should be understood that they are not intended to be considered limitations on the invention as defined by the claims, or limitations on equivalents to the claims. Therefore, this summary of features should not be considered dispositive in determining equivalents. Additional features of the invention will become apparent in the following description, from the drawings and from the claims.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows a faceted navigation menu and a user-selected, tag-based goal, according to an exemplary embodiment of the present invention;



FIG. 2 shows a flow for a user-selected tag-based goal, according to an exemplary embodiment of the present invention;



FIG. 3 shows a service development lifecycle, according to an exemplary embodiment of the present invention;



FIG. 4 shows a workflow template with different processing stages, according to an exemplary embodiment of the present invention;



FIG. 5 shows a processing stage for weather forecast extraction, according to an exemplary embodiment of the present invention;



FIG. 6 shows an instantiation of a weather forecast extraction processing stage, according to an exemplary embodiment of the present invention;



FIG. 7 shows a service class and a service, according to an exemplary embodiment of the present invention; and



FIG. 8 shows a block diagram of a system in which exemplary embodiments of the present invention may be implemented.





DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

This disclosure incorporates by reference herein in its entirety, Bouillet et al. A tag-based approach for the design and composition of information processing applications. Object-Oriented Programming, Systems, Languages and Applications (OOPSLA) '08, to be published Oct. 19-23, 2008.


In this disclosure, we provide a novel methodology for designing, developing and composing services that incorporate both top-down and bottom-up elements. In an exemplary embodiment, the methodology is driven by faceted, tag-based functional requirements that are elicited from end-users. The facets represent different dimensions of both data and processing, where each facet is modeled as a finite set of tags that are defined in a controlled folksonomy. The faceted, tag-based functional requirements are the starting point of a top-down lifecycle where workflows and individual services are designed, explicitly keeping in mind the needs of the composition. The requirements are taken by enterprise architects who design workflow templates that are also associated with faceted, tag-based descriptions. These workflow templates can either reuse existing services or workflows, or they can be used to generate new service requirements, which are also described in terms of facets and tags. These new services are then developed by the developers, and are tested individually and in conjunction with other services as per the workflow templates.


When an end-user submits a processing goal, also expressed in terms of facets and tags, a workflow can be composed, either automatically or manually, using the developed services. Our system uses an AI planner for the automatic composition of workflows using the goal specification from the end-user and the tag-based descriptions of different individual services. During the automatic or manual composition of workflows based on end-user processing goals, new workflows that were not explicitly designed earlier by the enterprise architect may also be created. Hence, different services can be composed in a bottom-up fashion to create new workflows that satisfy new processing goals.


A notable aspect of this methodology is the pervasive use of faceted, tag-based descriptions of functional requirements, of service capabilities, of structural workflow templates and of end-user processing goals. These faceted, tag-based descriptions guide the overall workflow design and the service development lifecycle.


In this disclosure, we focus on information processing workflows, which are workflows that retrieve and process information as desired by end-users. However, exemplary embodiments of the present invention are not limited thereto. These workflows make available unified information, obtained or extracted from multiple data sources, in response to end-users' information inquiries. Examples of such workflows are those that obtain business intelligence for analysts and those that perform information integration and content management. The key drivers for these workflows are to facilitate better decision making by end-users and better information sharing between business operations.


In [E. Bouillet et al. A folksonomy-based model of web services for discovery and automatic composition. In IEEE Services Computing Conference (SCC), 2008], the disclosure of which is incorporated by reference herein in its entirety, we introduced the use of tag-based descriptions for describing individual services. In the present disclosure, we expand on this model to facilitate the design and development of services that are composable. Some contributions of our methodology are:

    • 1. A faceted, tag-based model for describing high-level end-user information processing requirements.
    • 2. A service design and development lifecycle that results in the development of services that can be composed into workflows satisfying the end-user requirements.
    • 3. An approach for the bottom-up composition of workflows in response to dynamic end-user processing goals.


Mixing Top-Down Structure with Bottom-Up Serendipity in Service Engineering


Developing services that can be composed together into diverse workflows requires a holistic service engineering methodology, where the services are developed keeping in mind the needs of the composition. A purely bottom-up service engineering process, where we attempt to compose services that are developed independently, generally does not succeed in practice, since these services are not likely to work correctly when composed. In addition, a purely top-down approach, where a workflow (or a set of workflows) is designed in advance and services are developed to fit into it, is often not flexible enough to deal with new situations and new processing goals that require different workflows. Hence, a combination of top-down design with bottom-up reuse can achieve correct composition that can work in different situations.


As mentioned above, in this disclosure, we focus on information processing workflows that extract data from one or more sources, process the data using one or more services, and produce useful information or knowledge. The key end-users of information processing workflows are analysts and decision makers in various enterprises. These end-users need to quickly obtain and update the business intelligence that guides their decisions. For this, they need to collect the needed information from a potentially huge number of diverse sources, adapt and integrate that data, and apply a variety of analytic models, updating the results as the data changes. When new sources are discovered and/or new analytic models are developed, or simply when new ways of applying existing models are desired, users of information systems cannot and should not wait the days or months needed for development cycles to complete to get the analysis results they urgently need. These users require the serendipitous assembly of new workflows from the available services to satisfy their dynamic and changing information processing goals.


Pervasive Use of Tags


Our methodology combines top-down structure with bottom-up serendipity through the pervasive use of faceted, tag-based descriptions. In this methodology, we use tags, associated with organizing facets, to describe:

    • functional requirements elicited from end-users
    • all data, and messages exchanged between services
    • high-level workflow templates that describe the structure of families of related workflows
    • individual services
    • workflow instances
    • dynamic end-user information processing goals


Tags and Tag Hierarchies


The word “tag” comes from various collaborative tagging applications that have arisen in Web 2.0 (such as del.icio.us and Flickr) where users annotate different kinds of resources (like bookmarks and images) with tags. These tags aid search and retrieval of resources. A key aspect of the tagging model is that it is relatively simple, in comparison to more expressive models such as those based on Semantic Web ontologies and other formal logics. Hence, it offers a lower barrier to entry for different kinds of users to describe resources. In our case, the resources are different kinds of data artifacts, like files, input and output messages to services, etc.


Let T={t1, t2, . . . , tk} be the set of tags in our system. In most social tagging applications, the set of tags, T, is completely unstructured, i.e., there is no relation between individual tags. Introducing a hierarchy structure in T, however, enhances the expressivity by allowing additional tags to be inferred for resources. A tag hierarchy, H, is a directed acyclic graph (DAG) where the vertices are the tags, and the edges represent “sub-tag” relationships. It is defined as H=(T,S), where T is the set of tags and S ⊆ T×T is the set of sub-tag relationships. If a tag t1 ∈ T is a sub-tag of t2 ∈ T, denoted t1 ⊑ t2, then all resources annotated by t1 can also be annotated by t2. For convenience, we assume that ∀t ∈ T, t ⊑ t.
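
The sub-tag relation can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the class name TagHierarchy, its method names, and the parent tag WeatherMetric are assumptions made for the example, and the sketch assumes that the sub-tag relation is applied transitively along the edges of the DAG.

```python
from collections import defaultdict


class TagHierarchy:
    """Tag hierarchy H = (T, S): tags are vertices, sub-tag pairs are edges."""

    def __init__(self):
        self.tags = set()
        # tag -> set of tags it is a direct sub-tag of (hypothetical storage layout)
        self._parents = defaultdict(set)

    def add_subtag(self, sub, sup):
        """Record that `sub` is a sub-tag of `sup`."""
        self.tags.update((sub, sup))
        self._parents[sub].add(sup)

    def is_subtag(self, t1, t2):
        """True if t1 is a sub-tag of t2; every tag is a sub-tag of itself."""
        seen, stack = set(), [t1]
        while stack:
            t = stack.pop()
            if t == t2:
                return True
            if t not in seen:
                seen.add(t)
                stack.extend(self._parents[t])
        return False

    def inferred_tags(self, annotations):
        """All tags that can also annotate a resource annotated with `annotations`."""
        return {t for t in self.tags
                if any(self.is_subtag(a, t) for a in annotations)}


h = TagHierarchy()
h.add_subtag("Dewpoint", "WeatherMetric")
h.add_subtag("Temperature", "WeatherMetric")
print(h.is_subtag("Dewpoint", "WeatherMetric"))
print(sorted(h.inferred_tags({"Dewpoint"})))
```

In this sketch, a resource annotated only with Dewpoint is also treated as annotated with WeatherMetric, but not with Temperature.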


Facets


Facets represent dimensions for characterizing resources (data artifacts). Let F={fi} be the set of facets. Each facet is a set of tags, i.e., fi ⊆ T. Tags may be shared across facets.



FIG. 1 shows an example of a faceted tag cloud interface for the weather and energy trading services domain. In this domain, end-users can specify different kinds of weather forecast processing goals. Some of the facets are Sources, Weather Forecast Model, Weather Metric, etc. Each facet includes a number of tags, e.g., the Weather Metric facet includes tags like Dewpoint, Temperature, etc. It is noted that some tags are larger, indicating that they are relevant to a larger number of user-specifiable goals. End-users can select one or more tags to formulate the processing goal; our interface also provides a natural language interpretation of the goal from the set of tags, to provide feedback to the end-user on how the system interprets the goal.


Dynamic End-User Processing Goals Expressed Using Tags


As shown in FIG. 1, end-user processing goals are specified as a set of tags. For example, a commodities broker might want to watch for predicted extremes in relative humidity that might indicate a drought, signaling an opportunity to trade corn futures. He would express this as the goal Global Forecast System (GFS), Eta, RelativeHumidity, IA, WeightedAverage, ContourMapView, which represents a request for a workflow that delivers the weighted average of two relative humidity forecasts (produced using the GFS and Eta forecast models obtained from NOAA, the National Oceanic and Atmospheric Administration) for the state of Iowa presented on a contour map.


Each data artifact, a, in our system is characterized by a set of tags d(a) ⊆ T. The data artifacts include the input and output messages of web services, RSS feeds, web pages, files, etc. The tags only describe the semantics of the data artifacts, and not the actual syntax.


End-user goals describe the semantics of the desired data artifacts that may be produced by an information processing workflow. A goal, q ⊆ T, is satisfied by a data artifact, a, iff ∀t ∈ q ∃t′ ∈ d(a), t′ ⊑ t, i.e., every tag in the goal is matched by some tag of the artifact that is a sub-tag of it.
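
The satisfaction condition can be checked directly. The following Python sketch is illustrative only: the function names and the sub-tag edges (for example, GFS as a sub-tag of WeatherForecastModel, and IA as a sub-tag of a hypothetical USState tag) are assumptions, and the sub-tag relation is assumed to be reflexive and transitive.

```python
def is_subtag(t1, t2, parents):
    """True if t1 is a sub-tag of t2 (reflexive, transitive) under `parents`."""
    seen, stack = set(), [t1]
    while stack:
        t = stack.pop()
        if t == t2:
            return True
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return False


def satisfies(artifact_tags, goal, parents):
    """Goal q is satisfied by artifact a iff every t in q has some
    t' in d(a) such that t' is a sub-tag of t."""
    return all(any(is_subtag(tp, t, parents) for tp in artifact_tags)
               for t in goal)


# Hypothetical sub-tag edges: tag -> set of direct super-tags.
parents = {"GFS": {"WeatherForecastModel"}, "IA": {"USState"}}
# Hypothetical description d(a) of one data artifact.
d_a = {"GFS", "RelativeHumidity", "IA", "ContourMapView"}

print(satisfies(d_a, {"WeatherForecastModel", "IA"}, parents))
print(satisfies(d_a, {"Eta"}, parents))
```

The first goal is satisfied because GFS is a sub-tag of WeatherForecastModel and IA matches itself; the second is not, because no tag of the artifact is a sub-tag of Eta.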


When a user selects a goal, a workflow is composed in a bottom-up manner from the available services. In our system, this bottom-up composition occurs through an AI planner, such as, for example, the planner described in [A. Riabov and Z. Liu. Planning for stream processing systems. In American Association for Artificial Intelligence (AAAI), 2005], the disclosure of which is incorporated by reference herein in its entirety, that uses tag-based descriptions of individual services to come up with a workflow satisfying the goal. FIG. 2 shows an example of such a workflow, composed for the “IA RelativeHumidity GFS Eta WeightedAverage ContourMapView” goal. The final Contour Map View service in the workflow is a REST service that the end-user can access for real-time result information. Some services like NOAA GFS Forecast Data are instantiated with specific configuration parameters like Current Forecast. In other words, the boxes in FIG. 2 represent components of an application.


We model a workflow as a graph G(V,E), where G is a directed acyclic graph (DAG). Each vertex v ∈ V is a service instance. Each edge (u,v) represents a logical flow of messages from u to v. If a vertex, v, has multiple incoming edges of the form (u1, v), (u2, v), . . . , then the output messages produced by u1, u2, . . . are used together to create an input message to v. The message corresponding to each edge, (u,v), can be described by a set of tags, d((u,v)). In this disclosure, we restrict the workflows to acyclic graphs since capturing the semantics of messages where there are loops is difficult. However, exemplary embodiments of the present invention are not limited thereto.
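
A minimal Python sketch of this workflow model is shown below. The class name Workflow, its methods, and the service names and edge tags in the example are hypothetical and are only loosely modeled on the goal discussed above; the acyclicity check uses the standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


class Workflow:
    """Workflow G(V, E): vertices are service instances, edges carry tagged messages."""

    def __init__(self):
        self.edge_tags = {}  # (u, v) -> set of tags d((u, v))

    def add_edge(self, u, v, tags):
        self.edge_tags[(u, v)] = set(tags)

    def inputs_of(self, v):
        """Messages from all predecessors of v, which are used together as v's input."""
        return {u: tags for (u, w), tags in self.edge_tags.items() if w == v}

    def execution_order(self):
        """Order services so that producers precede consumers; reject cycles."""
        preds = {}
        for (u, v) in self.edge_tags:
            preds.setdefault(u, set())
            preds.setdefault(v, set()).add(u)
        try:
            return list(TopologicalSorter(preds).static_order())
        except CycleError as exc:
            raise ValueError("workflow must be acyclic") from exc


wf = Workflow()
wf.add_edge("GFS Forecast", "Weighted Average", {"GFS", "RelativeHumidity", "IA"})
wf.add_edge("Eta Forecast", "Weighted Average", {"Eta", "RelativeHumidity", "IA"})
wf.add_edge("Weighted Average", "Contour Map View",
            {"GFS", "Eta", "RelativeHumidity", "IA", "WeightedAverage"})
order = wf.execution_order()
print(order[-1])
print(sorted(wf.inputs_of("Weighted Average")))
```

Both forecast services feed the Weighted Average service, so their output messages are combined into its input, and the Contour Map View service is always scheduled last.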


Overview of Lifecycle


For a flow such as the one in FIG. 2 to be assembled, the individual services must be designed, described and developed appropriately. For this purpose, we provide a service engineering lifecycle (see FIG. 3) that is driven by high-level faceted, tag-based functional requirements. In information processing systems, the functional requirements describe the general kinds of information the end-user desires. In our approach, these functional requirements are expressed as patterns of goals that the user would like to submit. Note that this disclosure focuses on functional requirements and not non-functional requirements like security, performance and cost. However, the exemplary embodiments of the present invention are applicable to both sets of requirements.


The functional requirements are taken by an enterprise architect who comes up with a high-level design of the overall workflow(s) and of individual services. The architect first constructs one or more workflow templates that satisfy the requirements. A workflow template is a high-level description of the flow structure and is modeled as a graph of processing stages, where each stage performs a certain segment of the overall required information processing. Each stage in turn consists of a graph of service classes, where a service class is an equivalence class of services that share similar properties and are substitutable in certain contexts. The modular and substitutable nature of services enables such composition. In addition, the decomposition of the workflow into processing stages allows reuse of both services and entire sub-flows.


The architect can reuse existing services (and service classes) in designing the workflow. In some cases, new services may need to be developed, or existing services modified, to satisfy new end-user requirements. The architect defines the semantic requirements of the new services in terms of tags describing the input and output data. In addition, the architect defines the syntactic interfaces (e.g., using WSDL) that enable each new service to interact with other services in the processing stage and in the workflow in general. These semantic and syntactic service requirements are passed to a developer, who develops the service and tests it both individually and in conjunction with other services. The new services are then made available for composition and deployment. This may also result in changes to the end-user interface to include the new tags describing the outputs of workflows that contain the new service. As shown in FIG. 3, the different stages of the lifecycle are iterative, and proceed in a spiral refinement manner to converge towards the required system.


Although the methodology as presented has a top-down emphasis, it does support the bottom-up construction of flows. First, in the workflow template construction stage, it is possible to reuse existing services or sub-flows in defining the template. Second, after deployment, our composition approach is not constrained by the pre-defined workflow templates. Instead, the planner can construct new flows to satisfy user goals using the available services. The planner is not aware of the workflow templates; instead, it creates flows anew from the goal specification. This allows for the spontaneous generation of new flows from existing services that were not necessarily designed by the architect.


In summary, some aspects of our approach are:

    • 1. The top-down approach guarantees that the services developed can be composed to create workflows that meet the initial end-user requirements.
    • 2. The tag-based descriptions of all services facilitate their recombination in new ways to create new workflows that satisfy new end-user goals, which may or may not have been part of the initial requirements.
    • 3. The common, yet extensible, facets and tag hierarchies establish a simple, shared vocabulary that is used by architects, developers and end-users.
    • 4. End-user requirements are captured in a formal manner. This enables us to verify that the requirements are actually satisfied by a set of composable services.


Faceted, Tag-Based Requirements for Driving Composition


Workflow composition requires careful design of the services. The first need is to make sure that, at a minimum, the flows that meet the business requirements explicitly specified by the end-users can be composed. If the services also satisfy new requirements through serendipitous composition, that is a bonus.


Hence, in our approach, high-level end-user requirements drive the service engineering process. In any large-scale information processing system, there may be a large number of different kinds of information, and a large number of different ways of processing this information. Hence, requirements are not specified in terms of single goals but as whole classes of goals that are described by goal patterns.


A goal pattern is described as a set of tags and facets. Each facet is associated with a cardinality constraint. The cardinality constraint specifies how many tags in the facet should be part of the goal.


We first define the set of cardinality constraints, CC, as the set of all ranges of positive integers. Then a goal pattern, QP, is a set of facet-constraint pairs and tags, QP ⊆ {(x,c) | x ∈ F, c ∈ CC} ∪ {t | t ∈ T}. A goal pattern requirement means that end-users are interested in all data artifacts that can be described by a combination of tags that are drawn from the facets in the goal pattern, according to the cardinality constraints.


An example of a goal pattern is {Source[≧1], WeatherForecastModel[≧2], MultipleModelAnalysis[1], BasicWeatherMetric[≧1], Visualization[1]}.


This represents the class of all data artifacts that can be described by one or more tags that belong to the Source facet, two or more tags in the WeatherForecastModel facet, one tag in the MultipleModelAnalysis facet, one or more tags in the BasicWeatherMetric facet, and one tag in the Visualization facet.
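
Membership of a goal in a goal pattern can be sketched in Python as follows. The facet names come from the example pattern; the facet contents (including the tags NOAA and TableView), the function name, and the encoding of cardinality constraints as inclusive (min, max) ranges are assumptions. For simplicity, the sketch ignores the tag hierarchy and the plain tags that a goal pattern may also contain.

```python
import math

# Hypothetical facet contents; only the facet names come from the example pattern.
FACETS = {
    "Source": {"NOAA"},
    "WeatherForecastModel": {"GFS", "Eta"},
    "MultipleModelAnalysis": {"WeightedAverage"},
    "BasicWeatherMetric": {"RelativeHumidity", "Temperature", "Dewpoint"},
    "Visualization": {"ContourMapView", "TableView"},
}

# Cardinality constraints as inclusive (min, max) ranges; math.inf means unbounded.
PATTERN = {
    "Source": (1, math.inf),
    "WeatherForecastModel": (2, math.inf),
    "MultipleModelAnalysis": (1, 1),
    "BasicWeatherMetric": (1, math.inf),
    "Visualization": (1, 1),
}


def matches(goal, pattern, facets):
    """True if the number of goal tags drawn from each facet is within its
    range and every goal tag belongs to some facet of the pattern."""
    covered = set()
    for facet, (lo, hi) in pattern.items():
        chosen = goal & facets[facet]
        if not lo <= len(chosen) <= hi:
            return False
        covered |= chosen
    return covered == goal


good = {"NOAA", "GFS", "Eta", "WeightedAverage", "RelativeHumidity", "ContourMapView"}
print(matches(good, PATTERN, FACETS))
print(matches(good - {"Eta"}, PATTERN, FACETS))
```

The second goal is rejected because it names only one forecast model, while the pattern requires at least two.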


A point to note is that the goal pattern can refer to a large number of possible goals. For example, if there are five tags in the Source facet, 50 tags in the WeatherForecastModel facet, five in the MultipleModelAnalysis facet, 10 in the BasicWeatherMetric facet, and 10 in the Visualization facet, there are up to 2^5×2^50×5×2^10×10 possible kinds of data that may be producible by the information processing system. The goal pattern helps in succinctly expressing the combinatorial number of possible goals that can be submitted to the system.
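
The size of this goal space can be worked out with a short Python calculation. The upper bound 2^5×2^50×5×2^10×10 counts every subset of the facets with an open-ended constraint; the exact count below additionally applies the minimum cardinalities, under the assumption that a goal is exactly one allowed combination of tags per facet. The variable and function names are illustrative.

```python
from math import comb, prod

# Facet sizes from the example in the text.
facet_sizes = {
    "Source": 5,
    "WeatherForecastModel": 50,
    "MultipleModelAnalysis": 5,
    "BasicWeatherMetric": 10,
    "Visualization": 10,
}

# (min, max) cardinality per facet; None means no upper limit.
constraints = {
    "Source": (1, None),
    "WeatherForecastModel": (2, None),
    "MultipleModelAnalysis": (1, 1),
    "BasicWeatherMetric": (1, None),
    "Visualization": (1, 1),
}


def choices(n, lo, hi):
    """Number of ways to pick between lo and hi tags out of n."""
    hi = n if hi is None else min(hi, n)
    return sum(comb(n, k) for k in range(lo, hi + 1))


upper_bound = 2**5 * 2**50 * 5 * 2**10 * 10
exact = prod(choices(facet_sizes[f], lo, hi) for f, (lo, hi) in constraints.items())

print(upper_bound)
print(exact)
print(exact <= upper_bound)
```

The upper bound equals 50×2^65, which is on the order of 10^21 goals, all described by a single five-facet pattern.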


Workflow Templates


An architect takes a requirement, in the form of a goal pattern and constructs one or more workflow templates that can satisfy all the goal instances belonging to the goal pattern. A workflow template is a high-level description of the workflow structure, consisting of abstract processing stages and services. Each goal instance belonging to the goal pattern can be satisfied by a workflow instance that follows the workflow template.


The workflow templates are intended to guide the goal answering process. It is important to note that they are not the only solutions, though. It is possible to assemble a different flow, that is not part of the template, and that uses potentially different services to satisfy the same goal.


A workflow template is a directed acyclic graph, where the vertices are processing stages and edges represent transfer of messages between services in the different stages. FIG. 4 shows an example of a workflow template, with FIG. 2 being one example instantiation of the template. Each processing stage, itself, can be described by a directed acyclic graph, where the vertices are service classes and edges represent the transfer of messages between different service classes. Each processing stage in the template is associated with a goal pattern that it can satisfy.


Formally, a workflow template is defined as a directed acyclic graph W(V, E, p, λ), where V ⊆ S and E ⊆ V × V. S is the set of all processing stages. The function p associates sub-graphs (or sub-flows) with a parallelism constraint, p: g → CC, where g is a subgraph of W. In the example above, one of the subgraphs is associated with a constraint that at least two instances of the processing stages in the subgraph run in parallel. By default, a sub-graph is associated with a cardinality of one.


Each processing stage is associated with a goal pattern that describes the kinds of goals that the sub-flow formed by this processing stage and all preceding processing stages in the flow can answer. λ is a function that associates a processing stage with the goal pattern it produces as output, λ: V → QP, where QP is the set of all possible goal patterns.
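The four parts of the template definition (stages, edges, the parallelism function p, and the output function λ) map naturally onto a small data structure. The following is a minimal sketch, not the disclosed implementation: the class name WorkflowTemplate, the stage names, and the string encoding of cardinality constraints are hypothetical, and a parallelism constraint is simplified to a minimum instance count.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class WorkflowTemplate:
    stages: set    # V: processing stages used by this template
    edges: set     # E: (upstream, downstream) pairs drawn from V x V
    parallelism: dict = field(default_factory=dict)   # p: subgraph -> min instances
    goal_pattern: dict = field(default_factory=dict)  # lambda: stage -> goal pattern

    def cardinality(self, subgraph):
        """Parallelism constraint of a subgraph; defaults to one."""
        return self.parallelism.get(frozenset(subgraph), 1)

    def stage_order(self):
        """Return the stages in topological order, or raise ValueError."""
        predecessors = {stage: set() for stage in self.stages}
        for upstream, downstream in self.edges:
            if upstream not in self.stages or downstream not in self.stages:
                raise ValueError(f"edge ({upstream}, {downstream}) leaves V")
            predecessors[downstream].add(upstream)
        try:
            return list(TopologicalSorter(predecessors).static_order())
        except CycleError as exc:
            raise ValueError("a workflow template must be acyclic") from exc


# Hypothetical three-stage template; stage names are illustrative only.
template = WorkflowTemplate(
    stages={"FetchAndParse", "Analyze", "Visualize"},
    edges={("FetchAndParse", "Analyze"), ("Analyze", "Visualize")},
    parallelism={frozenset({"FetchAndParse"}): 2},
    goal_pattern={"Visualize": {"Source": ">=1", "Visualization": "1"}},
)
print(template.stage_order())
```

Checking acyclicity when the order is computed keeps the structure honest: a template whose edges form a cycle is rejected rather than silently accepted.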


Processing Stage


A processing stage is a directed acyclic graph (DAG) S(V_S, E_S). Each vertex v ∈ V_S is a service class (defined later). Each edge (u, v) ∈ E_S represents a logical flow of messages from a service in the class u to a service in the class v. Each stage can in fact be viewed as a high-level service with input requirements and output capabilities.


An example of a stage is shown in FIG. 5. It consists of two services: the first fetches a file given a Uniform Resource Locator (URL), and the second parses a weather forecast.



FIG. 6 shows a concrete instance of the processing stage, where the service classes have been instantiated with specific services. The first service class is instantiated with a service that fetches NOAA GFS forecasts and is configured to fetch the current forecast. The second service class is instantiated with a service called MOSParser that parses Model Output Statistics (MOS) forecasts from NOAA to extract temperature and dewpoint predictions for stations in the U.S. MOS is a class of forecasts that includes GFS and Eta.


Service Class and Service Requirements


Services that perform similar tasks and have similar input constraints can be grouped together into a class. For example, all services that take a set of weather forecasts from different sources and aggregate them in some fashion (e.g., performing an average, or coming up with a probability distribution, or finding the minimum or maximum or clustering or detecting outliers) may be grouped together into a class.


The key intuition behind a service class is that all the members of a service class are substitutable in a certain context. That is, in any given flow, a service can be replaced by another service in the same class without any syntactic or semantic mismatch. Hence, the definition of a service class is specific to a certain flow (or a certain class of flows).


This notion of substitutability of services enables our approach to automated composition. Our composition approach starts with a high-level workflow template definition that is made up of a flow of substitutable services. Different substitutions of services result in different instances of the templates that can satisfy specific goals.
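Substitution can be illustrated by enumerating every instance of a small template. The following is a minimal sketch, assuming a linear template in which each position holds one service class; apart from MOSParser, the class and service names are hypothetical, and a real template may be an arbitrary directed acyclic graph.

```python
from itertools import product

# Hypothetical template: an ordered list of (service class, member services).
TEMPLATE = [
    ("FetchForecast", ["FetchGFS", "FetchEta"]),
    ("ParseForecast", ["MOSParser"]),
    ("Visualize", ["ContourMap", "TimeSeriesPlot"]),
]


def instances(template):
    """Yield every workflow instance obtained by picking one member per class."""
    names = [name for name, _ in template]
    for combo in product(*(members for _, members in template)):
        yield dict(zip(names, combo))


all_instances = list(instances(TEMPLATE))
print(len(all_instances))
for instance in all_instances:
    print(instance)
```

The number of instances is the product of the class sizes, so a template with a handful of classes can stand for a large family of concrete workflows.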


Let C = {c} be the set of all services in the system. Then the set of all service classes is 𝒞 ⊆ 2^C. In addition, a service class, X ∈ 𝒞, is specific to a certain position in a flow, or set of flows. If a ∈ X appears in this position, then it can be substituted by any b ∈ X.


Service classes are defined in terms of their inputs and outputs, which are defined using variable goal patterns. A variable, v, is a member of the set V, where V is infinite and disjoint from T. A variable is represented with a preceding “?”. Each variable is associated with one or more types (which are also tags). Let τ: V → 2^T be a function that maps a variable to a set of types. A variable, v, can be bound to a tag, t, if the tag is a sub-tag of all the types of the variable, i.e., canbind(v, t) iff ∀x ∈ τ(v), t ⊑ x, where t ⊑ x denotes that t is a sub-tag of x.
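The binding rule can be sketched in a few lines. This is an illustrative sketch only: the PARENTS table is a hypothetical fragment of a tag taxonomy built from tags mentioned in the text, the sub-tag relation is assumed to be reflexive and transitive, and the helper names is_subtag and canbind follow the prose rather than any disclosed code.

```python
# Hypothetical taxonomy fragment: each tag maps to its direct parent tags.
PARENTS = {
    "NOAA": {"WthrSource"},
    "GFS": {"MOS"},
    "Eta": {"MOS"},
}


def is_subtag(t, x, parents=PARENTS):
    """True if t equals x or x is reachable from t by following parent links."""
    seen, stack = set(), [t]
    while stack:
        current = stack.pop()
        if current == x:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return False


def canbind(variable_types, tag):
    """A variable can bind to a tag that is a sub-tag of all of its types."""
    return all(is_subtag(tag, x) for x in variable_types)


# ?source has the single type WthrSource, as in the FIG. 7 example.
source_types = {"WthrSource"}
print(canbind(source_types, "NOAA"))
print(canbind(source_types, "GFS"))
```

Here NOAA can be bound to ?source because it is a sub-tag of WthrSource, while GFS cannot.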


The inputs and outputs of a service class, X, can be described by goal patterns that include variables. We define the set of all variable goal patterns as VQP = {(x, c) | x ∈ F ∪ V, c ∈ CC} ∪ {t | t ∈ T ∪ V}. Then a service class, X, can be defined as the pair (I_X, O_X), where:

    • 1. I_X is a variable goal pattern that describes a class of input message constraints.
    • 2. O_X is a variable goal pattern that describes a class of output message constraints.
    • 3. The set of variables in O_X is a subset of the set of variables in I_X. This constraint ensures that no free variables exist in the output description.
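The third condition, that the output introduces no free variables, is easy to enforce when a service class is constructed. The following is a minimal sketch under simplifying assumptions: patterns are flattened to sets of strings, the (facet, cardinality) pairs are omitted, variables are recognized by their leading "?", and the tags ForecastFile and ParsedForecast are hypothetical.

```python
from dataclasses import dataclass


def variables(pattern):
    """Variables are written with a preceding '?', as in the text."""
    return {item for item in pattern if item.startswith("?")}


@dataclass(frozen=True)
class ServiceClass:
    inputs: frozenset   # simplified input pattern: tags and variables
    outputs: frozenset  # simplified output pattern: tags and variables

    def __post_init__(self):
        # Every variable used in the output must also appear in the input.
        free = variables(self.outputs) - variables(self.inputs)
        if free:
            raise ValueError(f"free variables in output: {sorted(free)}")


# Both descriptions share ?source, so the class is well formed.
parser_class = ServiceClass(
    inputs=frozenset({"ForecastFile", "?source"}),
    outputs=frozenset({"ParsedForecast", "?source"}),
)
print(sorted(variables(parser_class.outputs)))
```

Rejecting an ill-formed class at construction time means later matching code can assume every output variable has a binding.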


We assume that each service belongs to a trivial service class, which is a singleton set. FIG. 7 shows an example service class on the left. The input and output descriptions include the variable ?source whose type is WthrSource. This means that both the input and output include the same tag, which is a sub-tag of WthrSource, such as NOAA.


Service Model


A service class can also act as a requirement specification for a new service, or a set of services. This brings us to the model for describing a single service (or a service operation). Services are described in terms of input and output message constraints that include variables and tags. The variables help in propagating semantic information from the input to the output, since whatever value the variable is bound to in the input of a service is propagated to the output. FIG. 7 shows, on the right, an example of a service that parses MOS forecasts from NOAA.


Let C be the set of all services in the system. A service, o ∈ C, is defined as the pair (I_o, O_o) where:

    • 1. I_o ⊆ (T ∪ V) is an input message constraint.
    • 2. O_o ⊆ (T ∪ V) is an output message constraint.
    • 3. The set of variables in O_o is a subset of the set of variables in I_o.


Note that some services (and service classes) may have no input message constraints, which means that they produce outputs without requiring any input message (e.g., periodically or in response to an event). Our model also includes other information such as binding (i.e., how exactly to instantiate or invoke a service) and other documentation on the service. Further details are available in [E. Bouillet et al. A folksonomy-based model of web services for discovery and automatic composition. In SCC, 2008].


A part of composing workflows is determining whether a message, produced by some service, can be given as input to another service. In a valid workflow, all messages sent as input to a web service must satisfy both the syntactic and semantic input constraints of the service. The syntactic constraints are based on the interface description (e.g., in WSDL). The semantic constraints are based on the tag descriptions of a message and the input descriptions of the web service. The semantics of a message, a, can be described by the set of tags, d(a). We define that d(a) matches an input constraint, I_o (denoted by d(a) ⊑ I_o), iff:

    • 1. For each tag in I_o, there exists a sub-tag that appears in d(a).
    •  Formally, ∀y ∈ (I_o ∩ T), (∃x ∈ d(a), x ⊑ y).
    • 2. For each variable in I_o, there exists a tag in d(a) to which the variable can be bound. Formally,
    •  ∀y ∈ (I_o ∩ V), (∃x ∈ d(a), canbind(y, x)).
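The two matching conditions can be expressed as a single predicate over a message description and an input constraint. The following is a minimal sketch, not the disclosed implementation: the taxonomy fragment, the TYPES table, and the function names are hypothetical, the sub-tag relation is assumed reflexive and transitive, and variables are recognized by their leading "?".

```python
# Hypothetical taxonomy fragment: each tag maps to its direct parent tags.
PARENTS = {"NOAA": {"WthrSource"}, "GFS": {"MOS"}, "Eta": {"MOS"}}

# Hypothetical variable typing: variable -> set of type tags.
TYPES = {"?source": {"WthrSource"}}


def is_subtag(t, x):
    """True if t equals x or x is an ancestor of t in PARENTS."""
    seen, stack = set(), [t]
    while stack:
        current = stack.pop()
        if current == x:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return False


def canbind(variable, tag):
    """The tag must be a sub-tag of every type of the variable."""
    return all(is_subtag(tag, x) for x in TYPES[variable])


def matches(message_tags, input_constraint):
    """Check both conditions: tags need a sub-tag, variables need a binding."""
    for y in input_constraint:
        if y.startswith("?"):
            satisfied = any(canbind(y, x) for x in message_tags)
        else:
            satisfied = any(is_subtag(x, y) for x in message_tags)
        if not satisfied:
            return False
    return True


constraint = {"MOS", "?source"}
print(matches({"GFS", "NOAA"}, constraint))
print(matches({"GFS"}, constraint))
```

A message tagged GFS and NOAA matches, since GFS is a sub-tag of MOS and NOAA can be bound to ?source; a message tagged only GFS does not, because nothing binds ?source.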


Bottom-Up, Goal-Driven Workflow Composition


Once new services are developed and tested, they can be used in new workflows. The problem of goal-driven composition can be described as constructing workflows that produce a message satisfying the goal. Given a composition problem P(T, C, g), where T is a tag taxonomy, C is a set of services, and g ⊆ T is a composition goal, the set of solutions is the set of all valid workflows, 𝒢, such that for each workflow graph G(V, E) ∈ 𝒢, the message corresponding to at least one edge in E must satisfy the goal.


Our system includes an AI planner, such as, for example, the planner described in [A. Riabov and Z. Liu. Planning for stream processing systems. In AAAI, 2005] that composes workflows from the available services given the goal. The planner is used in the serendipitous assembly of new workflows. It is not aware of the workflow templates; hence, it can compose flows that follow the templates and also possibly new flows, which do not fall into any of the explicitly designed templates.
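Goal-driven composition can be illustrated with a naive breadth-first search over message descriptions. This is not the planner of Riabov and Liu and makes strong simplifying assumptions: every service has at most one input, workflows are linear chains, tags are matched exactly (no taxonomy and no variables), and each service's output set lists all tags of its output message. The service names and most tags are hypothetical.

```python
from collections import deque

# Hypothetical services: name -> (required input tags, tags of the output message).
SERVICES = {
    "FetchGFS": (frozenset(), frozenset({"GFS", "ForecastFile"})),
    "ParseForecast": (frozenset({"ForecastFile"}), frozenset({"GFS", "Temperature"})),
    "PlotMap": (frozenset({"Temperature"}), frozenset({"GFS", "Temperature", "MapView"})),
    "StoreInDatabase": (
        frozenset({"Temperature"}),
        frozenset({"GFS", "Temperature", "DatabaseStorage"}),
    ),
}


def compose(services, goal):
    """Return a shortest chain of services whose last message satisfies the goal."""
    goal = frozenset(goal)
    queue, seen = deque(), set()
    # Services without input constraints can start a workflow.
    for name, (required, produced) in sorted(services.items()):
        if not required and produced not in seen:
            seen.add(produced)
            queue.append((produced, [name]))
    while queue:
        tags, path = queue.popleft()
        if goal <= tags:
            return path
        for name, (required, produced) in sorted(services.items()):
            if required and required <= tags and produced not in seen:
                seen.add(produced)
                queue.append((produced, path + [name]))
    return None


print(compose(SERVICES, {"GFS", "DatabaseStorage"}))
print(compose(SERVICES, {"Radar"}))
```

Because the search only looks at service descriptions, it finds the chain ending in StoreInDatabase even if that service was developed for a different workflow, which is the kind of serendipitous reuse described above.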


As an example, assume that there is a service, developed in a different context, that takes weather data and stores it as tables in a database. Then this service can potentially replace any of the visualization services deployed as part of the workflow template in FIG. 4. Hence, a dynamic user goal such as GFS, Eta, RelativeHumidity, IA, WeightedAverage, DatabaseStorage may be satisfiable even though it was not part of the original user requirements.


In this disclosure, we described the use of faceted, tag-based descriptions as a means of specifying high-level end-user requirements. The requirements kick off a top-down service development lifecycle, where enterprise architects and service developers design abstract workflow templates, generate requirements for new services, develop and test the new services and workflows, and finally make available the services for manual or automatic composition in response to dynamic user goals. At different stages of this lifecycle, it is possible to reuse individual services or compose flows in different contexts, and also compose new flows in response to user requests in a serendipitous, bottom-up manner.


We have used our service design and development methodology in a financial services deployment that included a total of 135 services. The development and annotation of the services was undertaken by a team of five people, including one person serving as a requirements engineer and application architect. Some of the services ran on IBM's Project Zero platform, which allows the development of REST-based services, while others were components in IBM's System S stream processing system. The workflow sizes ranged from five to 150 services. Preliminary experiences have shown the usefulness of our approach for developing composable services.


A system in which exemplary embodiments of the present invention may be implemented is shown in FIG. 8. As shown in FIG. 8, the system includes a computer system 100, which can represent any type of computer system capable of carrying out the teachings of the present invention. For example, the computer system 100 can be a laptop computer, a desktop computer, a workstation, a hand-held device, a server, a cluster of computers, etc. End-user(s) 140, architect(s) 125, or developer(s) 130 can access the computer system 100 directly, or can operate a computer system that communicates with computer system 100 over a network 165 (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.).


Computer system 100 is shown including a processing unit 105, a memory 115, a bus 155, and input/output (I/O) interfaces 110. Further, computer system 100 is shown in communication with external devices/resources 145 and one or more storage systems 150. In general, processing unit 105 executes computer program code, such as AI planner 120 or an application 160, that is stored in memory 115 and/or storage system 150. While executing computer program code, processing unit 105 can read and/or write data to/from memory 115, storage system 150, and/or I/O interfaces 110. Bus 155 provides a communications link between each of the components in computer system 100. External devices/resources 145 can comprise any devices (e.g., keyboard, pointing device, display 135, printer, etc.) that enable a user to interact with computer system 100 and/or any devices (e.g., network card, modem, etc.) that enable computer system 100 to communicate with one or more other computing devices.


Storage system 150 can be any type of system (e.g., database) that is capable of providing storage for information for use with exemplary embodiments of the present invention. Such information can include workflow templates, services and service classes, semantic and syntactic requirements, test results, etc. Shown in memory 115 (e.g., as a computer program product) is the AI planner 120, which is used to develop workflows consisting of components configured to satisfy a user goal, and one or more application(s) 160, which represent the developed workflows, that can be executed by the end-user(s) 140, for example. The application(s) 160 can also be stored in the storage system 150.


It should be understood that the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. In one embodiment, the present invention may be implemented in software as an application program tangibly embodied on a program storage device (e.g., magnetic floppy disk, Random Access Memory (RAM), Compact Disk (CD) Read Only Memory (ROM), Digital Video Disk (DVD), ROM, and flash memory). The application program may be uploaded to, and executed by, a machine comprising any suitable architecture.


It is to be further understood that because some of the constituent system components and method steps depicted in the accompanying figures may be implemented in software, the actual connections between the system components (or the process steps) may differ depending on the manner in which the present invention is programmed. Given the teachings of the present invention provided herein, one of ordinary skill in the art will be able to contemplate these and similar implementations or configurations of the present invention.


It should also be understood that the above description is only representative of illustrative embodiments. For the convenience of the reader, the above description has focused on a representative sample of possible embodiments, a sample that is illustrative of the principles of the invention. The description has not attempted to exhaustively enumerate all possible variations. That alternative embodiments may not have been presented for a specific portion of the invention, or that further undescribed alternatives may be available for a portion, is not to be considered a disclaimer of those alternate embodiments. Other applications and embodiments can be implemented without departing from the spirit and scope of the present invention.


It is therefore intended, that the invention not be limited to the specifically described embodiments, because numerous permutations and combinations of the above and implementations involving non-inventive substitutions for the above can be created, but the invention is to be defined in accordance with the claims that follow. It can be appreciated that many of those undescribed embodiments are within the literal scope of the following claims, and that others are equivalent.

Claims
  • 1. A computer-implemented method, comprising: receiving a software requirement; and constructing a workflow template that can satisfy the software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets, wherein each facet is associated with a cardinality constraint that specifies how many tags in the facet are part of the goal, wherein the processing goal pattern is represented as a semantic description of a first facet and a first numerical value indicating how many tags in the first facet are part of a first processing goal and a semantic description of a second facet and a second numerical value indicating how many tags in the second facet are part of a second processing goal, the method further comprising: prior to constructing the workflow template, representing the software requirement as a plurality of goal instances in a requirements goal pattern; after constructing the workflow template, for each goal instance, developing at least one workflow instance that can satisfy the goal instance, wherein the workflow instance follows or belongs to the workflow template, wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.
  • 2. The method of claim 1, wherein an input and an output of a component class are each described by a variable processing goal pattern that includes tags, facets and variables, and an input and an output of a component in the component class are each described by a set of tags and variables.
  • 3. The method of claim 2, wherein the requirements goal pattern is described by a set of tags and facets.
  • 4. The method of claim 1, wherein a tag is a keyword associated with an available resource.
  • 5. The method of claim 1, wherein a facet is a category that includes at least one tag.
  • 6. The method of claim 1, wherein a variable is associated with a set of tags, and wherein a variable is bound to a tag if the tag is a sub-tag of all tags in the set of tags.
  • 7. A computer-implemented method, comprising: receiving a high-level software requirement; representing the high-level software requirement as a plurality of processing goals described by a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets, wherein each facet is associated with a cardinality constraint that specifies how many tags in the facet are part of the goal, wherein the requirements goal pattern is represented as a semantic description of a first facet and a first numerical value indicating how many tags in the first facet are part of a first processing goal and a semantic description of a second facet and a second numerical value indicating how many tags in the second facet are part of a second processing goal; constructing a workflow template that can produce information to satisfy the high-level software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets; and for each of the plurality of processing goals, developing at least one workflow instance that can satisfy the processing goal, wherein the workflow instance follows or belongs to the workflow template, and wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.
  • 8. The method of claim 7, further comprising: receiving at least one of the plurality of processing goals from a user, wherein the user processing goal includes at least one tag; producing information that satisfies the user processing goal by executing one of the workflow instances that belongs to the workflow template or by generating and executing a new workflow instance that does not belong to the workflow template; and providing the information to the user.
  • 9. A computer readable storage memory storing instructions that, when executed by a computer, cause the computer to perform a method, the method comprising: receiving a software requirement; and constructing a workflow template that can satisfy the software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets, wherein each facet is associated with a cardinality constraint that specifies how many tags in the facet are part of the goal, wherein the processing goal pattern is represented as a semantic description of a first facet and a first numerical value indicating how many tags in the first facet are part of a first processing goal and a semantic description of a second facet and a second numerical value indicating how many tags in the second facet are part of a second processing goal, the method further comprising: prior to constructing the workflow template, representing the software requirement as a plurality of goal instances in a requirements goal pattern; after constructing the workflow template, for each goal instance, developing at least one workflow instance that can satisfy the goal instance, wherein the workflow instance follows or belongs to the workflow template, wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.
  • 10. The computer readable storage memory of claim 9, wherein an input and an output of a component class are each described by a variable processing goal pattern that includes tags, facets and variables, and an input and an output of a component in the component class are each described by a set of tags and variables.
  • 11. The computer readable storage memory of claim 10, wherein the requirements goal pattern is described by a set of tags and facets.
  • 12. The computer readable storage memory of claim 9, wherein a tag is a keyword associated with an available resource.
  • 13. The computer readable storage memory of claim 9, wherein a facet is a category that includes at least one tag.
  • 14. The computer readable storage memory of claim 9, wherein a variable is associated with a set of tags, and wherein a variable is bound to a tag if the tag is a sub-tag of all tags in the set of tags.
  • 15. A computer readable storage memory storing instructions that, when executed by a computer, cause the computer to perform a method, the method comprising: receiving a high-level software requirement; representing the high-level software requirement as a plurality of processing goals described by a requirements goal pattern, wherein the requirements goal pattern is described by a set of tags and facets, wherein each facet is associated with a cardinality constraint that specifies how many tags in the facet are part of the goal, wherein the requirements goal pattern is represented as a semantic description of a first facet and a first numerical value indicating how many tags in the first facet are part of a first processing goal and a semantic description of a second facet and a second numerical value indicating how many tags in the second facet are part of a second processing goal; constructing a workflow template that can produce information to satisfy the high-level software requirement, wherein the workflow template comprises a plurality of processing stages, wherein each processing stage includes at least one component class and each component class includes at least one component, and wherein an output of each processing stage is described by a processing goal pattern that is described by a set of tags and facets; and for each of the plurality of processing goals, developing at least one workflow instance that can satisfy the processing goal, wherein the workflow instance follows or belongs to the workflow template, and wherein a workflow instance is a directed acyclic graph and comprises at least one of the components arranged in a processing graph to produce information that satisfies the goal instance.
  • 16. The computer readable storage memory of claim 15, the method further comprising: receiving at least one of the plurality of processing goals from a user, wherein the user processing goal includes at least one tag; producing information that satisfies the user processing goal by executing one of the workflow instances that belongs to the workflow template or by generating and executing a new workflow instance that does not belong to the workflow template; and providing the information to the user.
GOVERNMENT INTERESTS

This invention was made with Government support under Contract No.: H98230-07-C-0383 awarded by the U.S. Department of Defense. The Government has certain rights in this invention.

US Referenced Citations (67)
Number Name Date Kind
4821211 Torres Apr 1989 A
6401096 Zellweger Jun 2002 B1
7334216 Molina-Moreno et al. Feb 2008 B2
7360175 Gardner et al. Apr 2008 B2
7502799 Ohmori et al. Mar 2009 B2
7548917 Nelson Jun 2009 B2
7600188 Good et al. Oct 2009 B2
7730447 Ringseth et al. Jun 2010 B2
7792836 Taswell Sep 2010 B2
7793268 Wassel et al. Sep 2010 B2
7802230 Mendicino et al. Sep 2010 B1
7908584 Singh et al. Mar 2011 B2
7933914 Ramsey et al. Apr 2011 B2
8161036 Tankovich et al. Apr 2012 B2
8175936 Ronen et al. May 2012 B2
8185892 Lucas et al. May 2012 B2
8225282 Massoudi et al. Jul 2012 B1
8239820 White et al. Aug 2012 B1
8327351 Paladino et al. Dec 2012 B2
20030058282 Sato Mar 2003 A1
20040059436 Anderson et al. Mar 2004 A1
20040254949 Amirthalingam Dec 2004 A1
20050108001 Aarskog May 2005 A1
20050166193 Smith et al. Jul 2005 A1
20050203764 Sundararajan et al. Sep 2005 A1
20050204337 Diesel et al. Sep 2005 A1
20050228855 Kawato Oct 2005 A1
20060242101 Akkiraju et al. Oct 2006 A1
20060248467 Elvanoglu et al. Nov 2006 A1
20060271565 Acevedo-Aviles et al. Nov 2006 A1
20070011155 Sarkar Jan 2007 A1
20070033590 Masuouka et al. Feb 2007 A1
20070061776 Ryan et al. Mar 2007 A1
20070073570 Montagut Mar 2007 A1
20070174247 Xu et al. Jul 2007 A1
20070214111 Jin et al. Sep 2007 A1
20080016072 Frieden et al. Jan 2008 A1
20080104032 Sarkar May 2008 A1
20080168420 Sabbouh Jul 2008 A1
20080189675 Aupperle et al. Aug 2008 A1
20080228851 Angelov et al. Sep 2008 A1
20080229217 Kembel et al. Sep 2008 A1
20080229278 Liu et al. Sep 2008 A1
20080313229 Taswell Dec 2008 A1
20090037268 Zaid et al. Feb 2009 A1
20090077124 Spivack et al. Mar 2009 A1
20090094189 Stephens Apr 2009 A1
20090100407 Bouillet et al. Apr 2009 A1
20090106080 Carrier et al. Apr 2009 A1
20090125977 Chander et al. May 2009 A1
20090144296 Agrawal et al. Jun 2009 A1
20090150425 Bedingfield, Sr. Jun 2009 A1
20090171708 Bobak et al. Jul 2009 A1
20090177955 Liu et al. Jul 2009 A1
20090177957 Bouillet et al. Jul 2009 A1
20090198668 Jean Bolf et al. Aug 2009 A1
20090198675 Mihalik et al. Aug 2009 A1
20090199158 Bolf et al. Aug 2009 A1
20090241015 Bender et al. Sep 2009 A1
20090249370 Liu et al. Oct 2009 A1
20090276753 Bouillet et al. Nov 2009 A1
20100077386 Akkiraju et al. Mar 2010 A1
20100095267 Bouillet et al. Apr 2010 A1
20100106546 Sproule Apr 2010 A1
20100281458 Paladino et al. Nov 2010 A1
20110107273 Ranganathan et al. May 2011 A1
20110314439 Colgrave et al. Dec 2011 A1
Non-Patent Literature Citations (26)
Entry
Rajasekaran et al., “Enhancing Web Services Description and Discovery to Facilitate Composition”, 2005, Springer-Verlag, SWSWPC 2004, LNCS 3387, pp. 55-68.
Battle et al., “Semantic Web Services Language (SWSL)”, W3C Member Submission, Sep. 9, 2005, W3C, pp. 1-41; <http://www.w3.org/Submission/SWSF-SWSL/>.
Liu et al., “A Planning Approach for Message-Oriented Semantic Web Service Composition”, 2007, Association for the Advancement of Artificial Intelligence, pp. 1389-1394; <http://www.aaai.org/Papers/AAAI/2007/AAAI07-220.pdf>.
Qiu et al., “Semantic Web Services Composition Using AI planning of Description Logics”, 2006 IEEE, APSCC'06, pp. 1-8; <http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=4041195>.
Zhen Liu, “Zhen Liu—Nokia Research Center”, pp. 1-5; downloaded Nov. 20, 2012; <research.nokia.com/people/zhen—liu>.
Bouillet et al., “A Faceted Requirements-Driven Approach to Service Design and Composition”, 2008 IEEE, pp. 369-376; <http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4670197>.
Sohrabi et al., “Composition of Flow-Based Applications with HTN Planning”, 2012, the 6th International Scheduling and Planning Applications Workshop (SPARK), 2012, pp. 58-64; <http://www.aaai.org/ocs/index.php/WS/AAAIW12/paper/view/5303>.
Bouillet et al., “MARIO: Middleware for Assembly and Deployment of Multi-platform Flow-Based Applications”, ACM, 2009 Springer-Verlag, pp. 1-7; <http://dl.acm.org/results.cfm?h=1&source_query=&&cfid=265350054&cftoken=49979371>.
The Amulet Environment: New Models for Effective User Interface Software Development, Myers, et al., IEEE Transactions on Software Engineering, vol. 23, No. 6, Jun. 1997.
Directed acyclic graph, Computing Dictionary, Dec. 7, 1994.
D. Berardi, D. Calvanese, G.D. Giacomo, R. Hull, and M. Mecella, “Automatic composition of transition-based semantic web services with messaging”, In VLDB, 2005.
F. Lecue and A. Leger, “A formal model for semantic web service composition”, In ISWC '06, 2006.
S. Narayanan and S. McIlraith, “Simulation, verification and automated composition of web services”, In WWW, 2002.
X.T. Nguyen, R. Kowalczyk, and M.T. Phan, “Modelling and solving QoS composition problem using Fuzzy DisCSP”, In ICWS, 2006.
J. Pathak, S. Basu, and V. Honavar, “Modeling web services by iterative reformulation of functional and non-functional requirements”, In ICSOC, 2006.
M. Pistore et al., “Automated synthesis of composite BPEL4WS web service”, In ICWS, 2005.
R. Akkiraju et al., “Semaplan: Combining planning with semantic matching to achieve web service composition”, In ICWS, 2006.
R. Berbner et al., “Heuristics for QoS-aware web service composition”, In ICWS, 2006.
A. Riabov and Z. Liu, “Planning for stream processing systems”, In AAAI, 2005.
M. Sheshagiri, M. desJardins, and T. Finin, “A planner for composing services described in DAML-S”, In Web Services and Agent-based Engineering—AAMAS, 2003.
K. Sivashanmugam, J. Miller, A. Sheth, and K. Verma, “Framework for semantic web process composition”, Special Issue of the International Journal of Electronic Commerce, 2003.
Daniel H. Pink, “Folksonomy”, The New York Times, Published: Dec. 11, 2005.
“Folksonomy” From Wikipedia, Oct. 5, 2007.
P. Traverso and M. Pistore, “Automated composition of semantic web services into executable processes”, In ISWC'04.
Eric Bouillet, Mark Feblowitz, Hanhua Feng, Zhen Liu, Anand Ranganathan, Anton Riabov, “A Folksonomy-Based Model of Web Services for Discovery and Automatic Composition”, In SCC, '08.
Eric Bouillet, Mark Feblowitz, Zhen Liu, Anand Ranganathan, Anton Riabov, “A Tag-Based Approach for the Design and Composition of Information Processing Applications”, OOPSLA'08.
Related Publications (1)
Number Date Country
20100095269 A1 Apr 2010 US