Service providers, such as backend-as-a-service and software-as-a-service providers, typically offer their users services that are performed in a logical sequence. For example, a user may submit a business process that includes service types to a cloud service provider. A service cloud broker selects concrete services for each service type to instantiate the business process into a workflow. However, the selected services may not align with a user's preferences, and it is often difficult for users to articulate their preferences.
This disclosure is not limited to the particular systems, methodologies or protocols described, as these may vary. The terminology used in this description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope.
As used in this document, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art. All publications mentioned in this document are incorporated by reference. All sizes recited in this document are by way of example only, and the invention is not limited to structures having the specific sizes or dimensions recited below. Nothing in this document is to be construed as an admission that the embodiments described in this document are not entitled to antedate such disclosure by virtue of prior invention. As used herein, the term “comprising” means “including, but not limited to.”
In an embodiment, a method of evaluating a workflow may include identifying a plurality of workflows. Each workflow may be associated with one or more users, and each workflow may represent a flow of data between a plurality of services via one or more execution paths. The method may include clustering, by a computing device, the execution paths associated with the plurality of workflows into a plurality of groups. The clustering may be based on the associated services. The method may include creating, by the computing device, a feature tree for each group, clustering, by the computing device, at least a portion of the users into a plurality of interest groups based on at least one of the feature trees, and for at least one of the interest groups, predicting, by the computing device, one or more preferences for one or more users in the interest group.
In an embodiment, a system of evaluating a workflow may include a computing device and a computer-readable storage medium in communication with the computing device. The computer-readable storage medium may include one or more programming instructions that, when executed, cause the computing device to identify a plurality of workflows. Each workflow may be associated with one or more users, and each workflow may represent a flow of data between a plurality of services via one or more execution paths. The computer-readable storage medium may include one or more programming instructions that, when executed, cause the computing device to cluster the execution paths associated with the plurality of workflows into a plurality of groups, where the clustering may be based on the associated services, create a feature tree for each group, cluster at least a portion of the users into a plurality of interest groups based on at least one of the feature trees, and for at least one of the interest groups, predict one or more preferences for one or more users in the interest group.
The following terms shall have, for purposes of this application, the respective meanings set forth below:
A “computing device” refers to a device that includes a processor and tangible, computer-readable memory. The memory may contain programming instructions that, when executed by the processor, cause the computing device to perform one or more operations according to the programming instructions. Examples of computing devices include personal computers, servers, mainframes, gaming systems, televisions, and portable electronic devices such as smartphones, personal digital assistants, cameras, tablet computers, laptop computers, media players and the like.
An “execution path” refers to at least a portion of a workflow.
A “feature tree” refers to a representation of one or more sub-execution paths in one or more workflows. Each node in a feature tree may represent a sub-execution path. A feature tree may include one or more parent nodes and one or more child nodes. A parent node may represent a super-sequence of its child node(s), and a child node may represent a sub-sequence of its parent node.
A “workflow” refers to a plurality of services that are performable in a sequence. For example, in a print production environment, a workflow may include a sequence of services to be performed to process a print job. Such services may include, for example, printing, binding, collating, cutting and/or the like.
In an embodiment, a user may request that a service provider perform a business process on behalf of the user. A business process may include one or more workflows. For example, a business process may require performing four distinct services in a certain order. Additional and/or alternate numbers of services may be used within the scope of this disclosure.
In an embodiment, a workflow may be associated with one or more different execution paths. An execution path may be attributable to options amongst services to be provided, the presence of one or more loops and/or the like. Table 1 illustrates example execution paths associated with
In an embodiment, an execution path may be associated with a rating. A rating may represent a user's preference for an execution path. In an embodiment, a rating may be a binary value as illustrated by Table 1. For example, “1” may represent a good rating, while “−1” may represent a poor rating. Additional and/or alternate binary and/or non-binary ratings may be used within the scope of this disclosure.
In an embodiment, a rating may be assigned to an execution path by a user. For example, after a business process requested by a user is completed, a user may be asked to rate the execution path used to complete the business process. The rating may be based on timeliness of completion, thoroughness, throughput, availability, cost, quality and/or the like.
In an embodiment, if an execution path, E_i, is rated good, then every sub-sequence of the execution path, E_{i,sub}, may have a good rating. In an embodiment, if an execution path, E_i, is rated bad, then every super-sequence of the execution path, E_{i,super}, may have a bad rating. In an embodiment, if ratings associated with an execution path are contradictory, then the most recent rating may be used. For example, if a user rates E_i as good but E_{i,sub} as bad, the most recent rating may be used. Similarly, if a user rates E_i as bad but E_{i,super} as good, the most recent rating may be used.
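By way of illustration only, the rating rules described above might be sketched in Python as follows. The function names `is_subsequence` and `infer_rating`, the representation of an execution path as a tuple of service names, and the reading of "sub-sequence" as an order-preserving but not necessarily contiguous selection of services are assumptions made for this sketch and are not taken from the disclosure.

```python
from typing import List, Sequence, Tuple


def is_subsequence(sub: Sequence[str], full: Sequence[str]) -> bool:
    """Return True if `sub` can be obtained from `full` by deleting services."""
    remaining = iter(full)
    # `service in remaining` advances the iterator, so order is respected.
    return all(service in remaining for service in sub)


def infer_rating(path: Sequence[str],
                 history: List[Tuple[Sequence[str], int]]) -> int:
    """Infer a rating for `path` from ratings given in chronological order.

    A good rating (1) is inherited by every sub-sequence of the rated path,
    a bad rating (-1) by every super-sequence. Later entries overwrite
    earlier ones, so the most recent applicable rating is returned.
    A result of 0 means that no rating applies.
    """
    inferred = 0
    for rated_path, rating in history:
        if rating == 1 and is_subsequence(path, rated_path):
            inferred = 1
        elif rating == -1 and is_subsequence(rated_path, path):
            inferred = -1
    return inferred


# Hypothetical history: s1-s2-s3 was rated good, then s1-s2 was rated bad.
history = [(("s1", "s2", "s3"), 1), (("s1", "s2"), -1)]
rating_s1s2 = infer_rating(("s1", "s2"), history)
rating_s1 = infer_rating(("s1",), history)
```

In this sketch the later bad rating of s1-s2 overrides the good rating it would otherwise inherit from s1-s2-s3, while s1 alone keeps the inherited good rating.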
In an embodiment, information associated with workflows may be identified 300 by retrieving information from a list, database or other storage media. For example, information associated with historical workflows performed by one or more users may be stored in a database.
In an embodiment, execution paths in the identified information may be clustered 302 into one or more groups. Execution paths may be clustered according to any clustering algorithm, such as, for example, fuzzy C-means. Execution paths that share one or more common services may be clustered into the same group. Clustering execution paths may help extract services, which may be represented as shared common sub-execution paths.
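As a simplified illustration of grouping execution paths that share common services, the following Python sketch places two paths in the same group whenever they are linked, directly or through other paths, by at least one shared service. This is a deliberately simpler stand-in for a clustering algorithm such as fuzzy C-means, not an implementation of it; the function name `group_by_shared_services` and the tuple representation of paths are assumptions.

```python
from typing import List, Sequence, Set, Tuple

Path = Tuple[str, ...]


def group_by_shared_services(paths: Sequence[Path]) -> List[List[Path]]:
    """Group paths so that paths sharing any service end up together."""
    groups: List[Tuple[Set[str], List[Path]]] = []
    for path in paths:
        services = set(path)
        merged_paths = [path]
        untouched = []
        for group_services, group_paths in groups:
            if group_services & services:
                # The new path bridges into this group: absorb it.
                services |= group_services
                merged_paths = group_paths + merged_paths
            else:
                untouched.append((group_services, group_paths))
        untouched.append((services, merged_paths))
        groups = untouched
    return [group_paths for _, group_paths in groups]


# Hypothetical execution paths.
example_groups = group_by_shared_services(
    [("s1", "s2"), ("s3", "s4"), ("s2", "s5")]
)
```

Unlike fuzzy C-means, this sketch assigns each path to exactly one group and has no notion of partial membership.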
In an embodiment, one or more feature trees may be created 304 based on the clustering. A feature tree may be a graphical representation of one or more workflows. A feature tree may include one or more nodes that each represents a sub-execution path.
In an embodiment, each execution path in a group may be identified. Each execution path may be compared to one or more other execution paths in the group to determine a greatest common denominator between the two execution paths. For example, a first execution path may be compared to a second execution path to determine a sub-execution path that includes one or more services that is the greatest common denominator. The second execution path may be compared to a third execution path to determine a sub-execution path that includes one or more services that is the greatest common denominator, and so on. In an embodiment, a determined greatest common denominator service may be inserted into a feature tree where each feature is a shared common sub-execution path. Each parent node in the feature tree may represent a super-sequence of a child node.
The following pseudo code illustrates an example method of extracting services and creating a feature tree according to an embodiment:
In an embodiment, a child node in a feature tree may represent a sub-sequence of its parent node(s). Similarly, a parent node may be a super-sequence of all of its child nodes. As such, new nodes must be inserted into a feature tree in a manner that preserves this structure. The following pseudo code illustrates an example method of inserting one or more nodes into a feature tree according to an embodiment.
In an embodiment, a feature tree may be created for every group. Each node may be a sub-execution path that is shared by at least two execution paths.
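The extraction and insertion steps described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the method of the disclosure: the "greatest common denominator" of two execution paths is interpreted here as their longest common subsequence, each node is given a single parent, an empty virtual root holds the top-level nodes, and all names (`FeatureNode`, `insert`, `build_feature_tree`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Path = Tuple[str, ...]


def is_subsequence(sub: Sequence[str], full: Sequence[str]) -> bool:
    remaining = iter(full)
    return all(service in remaining for service in sub)


def longest_common_subsequence(a: Sequence[str], b: Sequence[str]) -> Path:
    """Dynamic-programming LCS; each cell stores the subsequence itself."""
    table = [[()] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, service_a in enumerate(a, start=1):
        for j, service_b in enumerate(b, start=1):
            if service_a == service_b:
                table[i][j] = table[i - 1][j - 1] + (service_a,)
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1], key=len)
    return table[-1][-1]


class FeatureNode:
    def __init__(self, path: Sequence[str] = ()):
        self.path: Path = tuple(path)
        self.children: List["FeatureNode"] = []


def insert(parent: FeatureNode, path: Sequence[str]) -> FeatureNode:
    """Insert `path` so that every parent stays a super-sequence of its children."""
    path = tuple(path)
    for child in parent.children:
        if child.path == path:
            return child  # already present
        if is_subsequence(path, child.path):
            return insert(child, path)  # belongs somewhere below this child
    node = FeatureNode(path)
    # Existing siblings that are sub-sequences of the new path move beneath it.
    node.children = [c for c in parent.children if is_subsequence(c.path, path)]
    parent.children = [c for c in parent.children if c not in node.children]
    parent.children.append(node)
    return node


def build_feature_tree(paths: Sequence[Path]) -> FeatureNode:
    """Compare consecutive paths in a group and insert each common sub-path."""
    root = FeatureNode()  # virtual root, not itself a sub-execution path
    for first, second in zip(paths, paths[1:]):
        common = longest_common_subsequence(first, second)
        if common:
            insert(root, common)
    return root


# Hypothetical group of three execution paths.
tree = build_feature_tree([
    ("s1", "s2", "s3", "s4"),
    ("s1", "s3", "s4", "s5"),
    ("s3", "s4", "s6"),
])
```

Because a node here has only one parent, a sub-execution path that is a sub-sequence of several unrelated nodes is attached beneath the first matching one; supporting multiple parents would require a graph rather than a tree.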
In an embodiment, each sub-execution path may have an associated weight. A weight may represent a sub-execution path's popularity. In an embodiment, popularity may indicate a relative number of execution paths that share a sub-execution path. For example, if five execution paths share a sub-execution path, then the weight associated with that sub-execution path may be ‘5’. Additional and/or alternate indications of popularity may be used within the scope of this disclosure.
In an embodiment, a feature tree may be pruned by deleting one or more sub-execution paths associated with weights that are less than a threshold value. A threshold value may be dynamically determined based on the distribution of weights associated with a feature tree. In an embodiment, pruning a feature tree may help remove relatively unshared sub-execution paths, and therefore reduce the data space occupied by the feature tree.
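The weighting and pruning described above might be sketched as follows. For brevity the sketch operates on a flat mapping of sub-execution paths to weights rather than on a tree. Using the median of the weights as the dynamically determined threshold is purely an assumption made for illustration, as are the function names.

```python
from statistics import median
from typing import Dict, Optional, Sequence, Tuple

Path = Tuple[str, ...]


def is_subsequence(sub: Sequence[str], full: Sequence[str]) -> bool:
    remaining = iter(full)
    return all(service in remaining for service in sub)


def weight(sub_path: Path, execution_paths: Sequence[Path]) -> int:
    """Popularity: the number of execution paths that share `sub_path`."""
    return sum(1 for path in execution_paths if is_subsequence(sub_path, path))


def prune(weights: Dict[Path, int],
          threshold: Optional[float] = None) -> Dict[Path, int]:
    """Drop sub-execution paths whose weight is less than the threshold."""
    if threshold is None:
        # Assumed heuristic: derive the threshold from the weight distribution.
        threshold = median(weights.values())
    return {sp: w for sp, w in weights.items() if w >= threshold}


# Hypothetical execution paths and candidate sub-execution paths.
execution_paths = [("s1", "s2", "s3"), ("s1", "s3"), ("s1", "s2"), ("s2", "s3")]
candidates = [("s1",), ("s1", "s3"), ("s2", "s3"), ("s1", "s2", "s3")]
weights = {sp: weight(sp, execution_paths) for sp in candidates}
kept = prune(weights)
```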
In an embodiment, users may be clustered 306. For example, users who have rated execution paths may be clustered 306. Users may be clustered 306 based on the ratings they assigned to execution paths, workflows and/or the like. In an embodiment, a matrix of users and sub-execution paths may be used to cluster users. Each column of the matrix may represent a user, and each row of the matrix may represent a sub-execution path. Table 3 illustrates an example matrix according to an embodiment. As illustrated by Table 3, a value in the matrix, v_{i,j}, represents user j's rating of sub-execution path i. For example, User 1's rating of Sub-execution Path 3 is ‘−1’. As discussed above, ‘1’ may indicate a positive rating and ‘−1’ may indicate a negative rating. If a user did not use a workflow that includes a sub-execution path, or if a user has not rated a particular sub-execution path, the sub-execution path may be associated with a rating of ‘0’.
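A matrix of this kind might be assembled as in the following Python sketch, in which rows correspond to sub-execution paths, columns correspond to users, and missing ratings default to 0. The function name `build_rating_matrix`, the dictionary layout of the ratings and the sample values are hypothetical and do not reproduce Table 3.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]


def build_rating_matrix(sub_paths: Sequence[Path],
                        users: Sequence[str],
                        ratings: Dict[Tuple[str, Path], int]) -> List[List[int]]:
    """Return matrix[i][j] = rating of sub-path i by user j (0 if unrated)."""
    return [[ratings.get((user, sub_path), 0) for user in users]
            for sub_path in sub_paths]


# Hypothetical data: two users, three sub-execution paths.
users = ["User 1", "User 2"]
sub_paths = [("s1", "s2"), ("s2", "s3"), ("s1", "s3")]
ratings = {
    ("User 1", ("s1", "s3")): -1,
    ("User 2", ("s1", "s2")): 1,
}
matrix = build_rating_matrix(sub_paths, users, ratings)
```

Reading one column of the matrix yields the rating vector of a single user, which is the form used when users are compared with one another.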
In an embodiment, one or more users may be clustered 306 into one or more interest groups. A similarity value may be determined for one or more pairs of groups. For example, a similarity value between groups g_i and g_j may be computed by the following:
$\mathrm{sim}(g_i, g_j) = \sum_{k=1}^{n} s_k$, where
if $g_{i,k} = g_{j,k} = 1$ or $g_{i,k} = g_{j,k} = -1$, then $s_k = 1$
In an embodiment, a difference value between one or more pairs of groups may be determined. For example, a difference value between groups g_i and g_j may be computed by the following:
$\mathrm{diff}(g_i, g_j) = \sum_{k=1}^{n} d_k$, where
if $g_{i,k} \cdot g_{j,k} = -1$, then $d_k = 1$
In an embodiment, the two groups having the highest similarity value may be merged into another group, g_m. In an embodiment, if users in group m rate sub-execution path k as ‘1’ or ‘0’, with at least one user rating sub-execution path k as ‘1’, then g_{m,k} = 1. In an embodiment, if users in group m rate sub-execution path k as ‘−1’ or ‘0’, with at least one user rating sub-execution path k as ‘−1’, then g_{m,k} = −1. Otherwise, g_{m,k} may equal ‘0’.
In an embodiment, merging of two groups may be stopped when a ratio of the similarity value of the two groups to the difference value of the two groups is less than a threshold value. For example, merging of two groups (g_i and g_j) may be stopped when:
$\mathrm{sim}(g_i, g_j) / \mathrm{diff}(g_i, g_j) < \text{threshold}$
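The similarity value, difference value, merge rule and stopping condition might be combined as in the following Python sketch. It assumes that each user starts in a group of their own, that terms s_k and d_k not covered by the stated conditions contribute 0, and that a pair with a difference value of 0 is treated as having an unbounded ratio when its similarity is non-zero and a ratio of 0 otherwise. These choices, and all function names, are assumptions rather than part of the disclosure.

```python
from itertools import combinations
from typing import List, Sequence


def similarity(g_i: Sequence[int], g_j: Sequence[int]) -> int:
    """Count sub-execution paths on which both groups agree with 1 or -1."""
    return sum(1 for a, b in zip(g_i, g_j) if a == b and a != 0)


def difference(g_i: Sequence[int], g_j: Sequence[int]) -> int:
    """Count sub-execution paths rated 1 by one group and -1 by the other."""
    return sum(1 for a, b in zip(g_i, g_j) if a * b == -1)


def group_vector(user_vectors: Sequence[Sequence[int]]) -> List[int]:
    """Merge rule: 1 if only 1s and 0s with at least one 1, -1 likewise, else 0."""
    merged = []
    for ratings in zip(*user_vectors):
        if 1 in ratings and -1 not in ratings:
            merged.append(1)
        elif -1 in ratings and 1 not in ratings:
            merged.append(-1)
        else:
            merged.append(0)
    return merged


def cluster_users(user_vectors: Sequence[Sequence[int]],
                  threshold: float) -> List[List[int]]:
    """Repeatedly merge the most similar pair of groups of user indices."""
    groups = [[u] for u in range(len(user_vectors))]
    while len(groups) > 1:
        vectors = [group_vector([user_vectors[u] for u in g]) for g in groups]
        a, b = max(combinations(range(len(groups)), 2),
                   key=lambda pair: similarity(vectors[pair[0]], vectors[pair[1]]))
        sim = similarity(vectors[a], vectors[b])
        diff = difference(vectors[a], vectors[b])
        if diff:
            ratio = sim / diff
        else:
            ratio = float("inf") if sim else 0.0  # assumed handling of diff == 0
        if ratio < threshold:
            break  # stop merging: the best remaining pair is too dissimilar
        groups[a] = groups[a] + groups[b]
        del groups[b]  # safe because a < b
    return groups


# Hypothetical rating vectors, one per user, indexed by sub-execution path.
user_vectors = [
    [1, 1, -1, 0],
    [1, 1, 0, 0],
    [-1, -1, 1, 0],
    [-1, 0, 1, 0],
]
interest_groups = cluster_users(user_vectors, threshold=1.0)
```

With these sample vectors the first two users and the last two users form separate interest groups, and merging stops because the two resulting groups disagree on three sub-execution paths and agree on none.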
In an embodiment, one or more user preferences may be predicted 308 for gm. For example, execution paths and their ratings may be used to predict user preferences in terms of sub-execution paths. However, it is often difficult to understand sub-execution paths as fragments of execution paths. As such, one or more quality of service (QoS) attributes may be used to predict preferences at a higher level. QoS attributes may include service QoS attributes and/or link QoS attributes.
Service QoS attributes may refer to one or more performance metrics associated with a service. For example, metrics may include, without limitation, response time, cost, reliability, availability and/or the like.
Link QoS attributes may refer to the quality-of-service of the link between two services in a workflow. If a link exists between two services, then one service may provide data to the other service. The data may be transferred over a network from one service to another. Link QoS attributes may refer to one or more metrics associated with the transfer of data between two services. Example Link QoS attributes may include, without limitation, network speed, throughput, reliability, availability and/or the like.
In an embodiment, QoS attribute data may be accessed via a monitoring service or other type of service. For example, a monitoring service may track QoS attribute data for one or more services, and may provide this information in response to a request for such information.
In an embodiment, predicting 308 one or more preferences may involve identifying 310 an execution path having a good rating and identifying 312 an execution path having a poor rating for one or more users in an interest group. For illustrative purposes, s1s2 may be an execution path having a good rating and s1s3 may be an execution path having a poor rating. In an embodiment, the execution paths that are identified 310, 312 may be of the same length. For example, s1s2 and s1s3 both include two services and one link between services. As such, they are the same length.
In an embodiment, the execution paths that are identified 310, 312 may share one or more common services. In an embodiment, the identified execution paths may include the greatest number of common services amongst available execution paths. In an embodiment, the identified execution paths may include at least a threshold number of common services. For instance, the example execution paths above, s1s2 and s1s3, share 50% common services since both execution paths include s1.
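One way of selecting such a pair is sketched below in Python: among all pairs consisting of a well-rated path and a poorly rated path of the same length, the pair with the greatest number of common services is chosen. The function name `pick_comparison_pair` and the sample ratings are hypothetical.

```python
from typing import Dict, Optional, Tuple

Path = Tuple[str, ...]


def pick_comparison_pair(rated_paths: Dict[Path, int]) -> Optional[Tuple[Path, Path]]:
    """Return (good_path, bad_path) of equal length sharing the most services."""
    good = [path for path, rating in rated_paths.items() if rating == 1]
    bad = [path for path, rating in rated_paths.items() if rating == -1]
    candidates = [(g, b) for g in good for b in bad if len(g) == len(b)]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: len(set(pair[0]) & set(pair[1])))


# Hypothetical ratings for one user in an interest group.
rated_paths = {
    ("s1", "s2"): 1,
    ("s1", "s3"): -1,
    ("s4", "s5"): -1,
    ("s1", "s2", "s6"): -1,
}
pair = pick_comparison_pair(rated_paths)
```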
In an embodiment, one or more QoS attribute values associated with the execution path having a good rating may be determined 314. In an embodiment, the way in which a QoS attribute value is determined may depend on the attribute. Example techniques for determining a QoS attribute value may include, without limitation, determining a linear sum, multiplication, determining a minimum value, determining a maximum value and/or the like.
For instance, an execution time associated with an execution path may be a linear sum of the execution times of each service in the execution path, and the time it takes to transmit data between services. For example, referring to execution path s1s2, s1 may execute for three minutes, transmission of data from s1 to s2 may take ten seconds, and s2 may execute for one minute. As such, the execution time of this execution path may be the linear sum of the execution and transmission times (i.e., four minutes and ten seconds).
As another example, an availability associated with an execution path may be determined through multiplication. For instance, using the example above, the availabilities associated with the execution path s1s2 may be 90% for s1, 80% for the link between s1 and s2, and 95% for s2. The availability for the execution path may be determined by multiplying the availabilities. For example, the availability for this execution path may be 68.4% (i.e., 90%*80%*95%).
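The two aggregation examples above can be checked with a short Python sketch. Execution time is summed over services and links, and availability is multiplied over services and links, using the figures given for s1s2 (three minutes, ten seconds and one minute; 90%, 80% and 95%). The function names and the choice of seconds as the time unit are assumptions.

```python
from math import prod
from typing import Sequence


def path_execution_time(service_seconds: Sequence[float],
                        link_seconds: Sequence[float]) -> float:
    """Linear sum of service execution times and link transmission times."""
    return sum(service_seconds) + sum(link_seconds)


def path_availability(service_availability: Sequence[float],
                      link_availability: Sequence[float]) -> float:
    """Product of the availabilities of every service and link on the path."""
    return prod(service_availability) * prod(link_availability)


# s1 runs 180 s, the s1-to-s2 link takes 10 s, s2 runs 60 s.
total_seconds = path_execution_time([180, 60], [10])

# s1 is 90% available, the link 80%, s2 95%.
availability = path_availability([0.90, 0.95], [0.80])
```

Here `total_seconds` is 250 seconds, that is, four minutes and ten seconds, and `availability` is approximately 0.684.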
In an embodiment, one or more QoS values associated with one or more execution paths having a bad rating may be determined 316. For example, s1s2 may be rated as good by a user, but another execution path, s1s3, may be rated as bad by a user. The execution time associated with s1s3 may be four minutes, and the availability associated with s1s3 may be 30%. Table 4 illustrates example QoS attribute values for these execution paths according to an embodiment.
In an embodiment, one or more QoS attribute values may be evaluated 318. One or more attribute values of the execution path rated as good may be compared to a corresponding attribute value of the execution path rated as bad in an effort to predict user preferences. In an embodiment, a comparison of values may yield a probability that the attribute is responsible for the bad rating associated with one of the execution paths. In an embodiment, the probability may be based on the similarity or difference between compared values. For example, if two values are relatively similar or are within a certain value or percentage of one another, the probability that the attribute is responsible for the bad rating may be relatively small. However, if there is a great difference between two values, or if the difference between the two values exceeds a threshold amount, then the probability that the corresponding attribute is responsible for the bad rating may be relatively high.
For example, the execution time for s1s2 (4 minutes 10 seconds) may be compared to the execution time for s1s3 (4 minutes). In this situation, the execution times are relatively similar, so the probability that execution time is responsible for the bad rating of s1s3 is low.
However, comparing the availability of s1s2 (68.4%) to the availability of s1s3 (30%) shows a large difference between the values. As such, the probability that the availability QoS attribute is responsible for the bad rating associated with s1s3 may be relatively high.
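The comparison described above might be approximated as in the following Python sketch, which flags an attribute as a likely cause of the bad rating when the relative difference between the two values exceeds a threshold. The disclosure does not specify how the probability is computed, so the relative-difference measure, the threshold of 0.25 and the function names are assumptions made only for illustration.

```python
from typing import Dict, List


def relative_difference(good_value: float, bad_value: float) -> float:
    """Absolute difference scaled by the larger magnitude (0.0 if both are 0)."""
    largest = max(abs(good_value), abs(bad_value))
    return abs(good_value - bad_value) / largest if largest else 0.0


def suspect_attributes(good_qos: Dict[str, float],
                       bad_qos: Dict[str, float],
                       threshold: float = 0.25) -> List[str]:
    """Return the attributes whose values differ by more than the threshold."""
    return [name for name in good_qos
            if relative_difference(good_qos[name], bad_qos[name]) > threshold]


# QoS attribute values from the example: s1s2 (good) versus s1s3 (bad).
good_qos = {"execution_time_s": 250, "availability": 0.684}
bad_qos = {"execution_time_s": 240, "availability": 0.30}
suspects = suspect_attributes(good_qos, bad_qos)
```

With these values the execution times differ by about 4% and the availabilities by about 56%, so only availability is flagged.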
In an embodiment, the QoS attributes having high probabilities of being responsible for a bad rating and/or the QoS attributes having high probabilities of being a user preference may be identified 320. In an embodiment, a QoS attribute may have a high probability of being responsible for a bad rating if it is associated with a probability that falls below a threshold value. In an embodiment, a QoS attribute may have a high probability of being a user preference if it is associated with a probability that equals or exceeds a threshold value. One or more user preference predictions may be made based on the identified QoS attribute. For example, referring to the above example, the system may predict that the user prefers availability for workflows.
For instance, the probability of availability being a user preference may be 90%, the probability of response time being a user preference may be 60% and the probability of reliability being a user preference may be 10%. A threshold value may be 50%, meaning that a QoS attribute having a probability that falls below 50% may be identified as being responsible for a bad rating, and a QoS attribute having a probability equal to or exceeding 50% may be identified as a user-preferred QoS attribute. Three execution paths may exist. Path 1 may have high availability, medium response time and low reliability. Path 2 may have high availability, low response time, and high reliability. Path 3 may have low availability, medium response time, and high reliability. The system may recommend Path 1 followed by Path 2 because these paths have QoS attributes (i.e., availability and response time) that have high probabilities of being user preferences. The system may not recommend Path 3 because the associated QoS attribute that has the highest rating is reliability, which is the QoS attribute that has the lowest probability of being a user preference. Additional and/or alternate ratings, probabilities and selections may be used within the scope of this disclosure.
In an embodiment, a profile associated with a user may be updated 322 to reflect the identified predictions. For example, an indication that a user prefers or does not prefer one or more QoS attributes may be added to the user's profile. For instance, using the above example, an indication that the user prefers availability may be added to a profile associated with the user.
In an embodiment, the system may provide 324 one or more subsequent workflow recommendations to a user. The subsequent workflow recommendations may be based on one or more user preferences from the user's profile. For instance, using the above example, the system may suggest to the user only workflows that have high availability.
A controller 420 interfaces one or more optional non-transitory computer-readable storage media 425 to the system bus 400. These storage media 425 may include, for example, an external or internal DVD drive, a CD ROM drive, a hard drive, flash memory, a USB drive or the like. As indicated previously, these various drives and controllers are optional devices.
Program instructions, software or interactive modules for providing the interface and performing any querying or analysis associated with one or more data sets may be stored in the ROM 410 and/or the RAM 415. Optionally, the program instructions may be stored on a tangible non-transitory computer-readable medium such as a compact disk, a digital disk, flash memory, a memory card, a USB drive, an optical disc storage medium, such as a Blu-ray™ disc, and/or other recording medium.
An optional display interface 430 may permit information from the bus 400 to be displayed on the display 435 in audio, visual, graphic or alphanumeric format. Communication with external devices, such as a printing device, may occur using various communication ports 440. A communication port 440 may be attached to a communications network, such as the Internet or an intranet.
The hardware may also include an interface 445 which allows for receipt of data from input devices such as a keyboard 450 or other input device 455 such as a mouse, a joystick, a touch screen, a remote control, a pointing device, a video input device and/or an audio input device.
It will be appreciated that various of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications or combinations of systems and applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.