Various embodiments described herein relate to computer systems, methods and program products and, more particularly, to virtualized computer systems, methods and computer program products.
Modern enterprise software environments may integrate a large number of software systems to facilitate complex business processes. Many of these software systems may interact with and/or rely on services provided by other systems (e.g., third-party systems or services) in order to perform their functionalities or otherwise fulfill their responsibilities, and thus, can be referred to as “systems of systems.”
Assuring the quality of such software systems (including the functionality which interacts with third-party systems or services) before deployment into actual production environments (i.e., “live” deployment) may present challenges, for example, where the systems interoperate across heterogeneous services provided by large scale environments. For example, physical replication and provisioning of real-world deployment environments can become difficult to effectively manage or even achieve, as recreating the heterogeneity and massive scale of typical production environments (often with thousands of real client and server hardware platforms, suitably configured networks, and appropriately configured software applications for the system under test to communicate with) may be difficult given the resources of a quality assurance (QA) team. Accessing these environments may also involve difficulty and/or expense, and the different environment configurations may affect the operational behavior of such software systems. For example, access to real third party services during testing may be restricted, expensive, and/or unavailable at a scale that is representative of the production environment. Thus, due to the complex interaction between a software system and its operating environment, traditional standalone-system-oriented testing techniques may be inadequate for quality assurance.
Enterprise software environment emulation may be used as an alternative approach to providing interactive representations of operating environments. Software service emulation (or “service virtualization”) may refer to emulation of the behavior of specific components in heterogeneous component-based environments or applications, such as API-driven applications, cloud-based applications and/or service-oriented architectures. Service virtualization allows the communication between a client and software service to be virtualized, such that the virtual service can respond to requests from the client system with generated responses. With the behavior of the components or endpoints simulated by a model or “virtual asset” (which stands in for a component by listening for requests and returning an appropriate response), testing and development can proceed without accessing the actual live components of the system under test. For instance, instead of virtualizing an entire database (and performing all associated test data management as well as setting up the database for every test session), the interaction of an application with the database may be monitored, and the related database behavior may be emulated (e.g., SQL queries that are passed to the database may be monitored, and the associated result sets may be returned, and so forth). For a web service, this might involve listening for extensible markup language (XML) messages over hypertext transfer protocol (HTTP), Java® message service (JMS), or IBM® WebSphere MQ, then returning another XML message. Thus, the virtual asset's functionality and performance may reflect the functionality/performance of the actual component, and/or may simulate conditions (such as extreme loads or error conditions) to determine how an application or system under test responds under those circumstances.
By modeling the interaction behavior of individual systems in an environment and subsequently simultaneously executing a number of those models, an enterprise software environment emulator can provide an interactive representation of an environment which, from the perspective of an external software system, appears to be a real or actual operating environment. Manually defining interaction models may offer advantages in defining complex sequences of request/response patterns between elements of the system including suitable parameter values. However, in some cases, such an approach may not be feasible due to the time required or lack of required expertise. In particular, manually defining interaction models (including complex sequences of request/response patterns and suitable parameter values) may require knowledge of the underlying interaction protocol(s) and system behavior(s). Such information may often be unavailable at the required level of detail (if at all), for instance, when third-party, legacy, and/or mainframe systems are involved. Additionally, the large number of components and component interactions in such systems may make manual approaches time-consuming and/or error-prone. Also, due to lack of control over the environment, if an environment changes with new enterprise elements or communication between elements, these manual protocol specifications must be further updated.
According to some embodiments, in a method of service emulation, a transaction subset including ones of a plurality of message transactions previously communicated between a system under test and a target system for emulation is defined. The message transactions include requests and responses thereto that are stored in a computer-readable memory. Variable sections of the requests and variable sections of the responses of the transaction subset are identified, for example, based on respective message structures thereof. Substitution rules, which indicate a correspondence between respective ones of the variable sections of the requests and respective ones of the variable sections of the responses, are determined for the transaction subset based on commonalities therebetween. Responsive to receiving an incoming request from the system under test, a response to the incoming request is generated according to the substitution rules. The defining of the transaction subset, the identifying of the variable sections of the requests and responses, the determining of the substitution rules, and the generating of the response are performed by a processor.
In some embodiments, the commonalities between the variable sections of the requests and the variable sections of the responses may include common substrings. For respective pairs including one of the requests and a corresponding one of the responses thereto, the common substrings included in the respective ones of the variable sections thereof may be identified, and symmetric field rules correlating the respective ones of the variable sections thereof having the common substrings may be defined. The symmetric field rules for the respective request-response pairs may be merged to define the substitution rules for the transaction subset.
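By way of illustration only, the identification of common substrings and the definition of symmetric field rules for a single request-response pair might be sketched as follows. The sketch assumes that the variable sections have already been extracted into lists of strings; the names `MIN_LEN`, `longest_common_substring`, and `symmetric_field_rules`, and the representation of a rule as a pair of section indices, are hypothetical and are not taken from the disclosure.

```python
from difflib import SequenceMatcher

MIN_LEN = 3  # hypothetical threshold on the length of a shared substring


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b ('' if none)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]


def symmetric_field_rules(req_sections, resp_sections, min_len=MIN_LEN):
    """Correlate variable sections that share a substring of at least min_len characters.

    Each rule is a pair (request_section_index, response_section_index).
    """
    rules = set()
    for i, req in enumerate(req_sections):
        for j, resp in enumerate(resp_sections):
            if len(longest_common_substring(req, resp)) >= min_len:
                rules.add((i, j))
    return rules
```

Applying `symmetric_field_rules` to each request-response pair of a transaction subset yields one rule set per pair; those per-pair sets are the inputs that a subsequent merging step would combine.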
In some embodiments, the symmetric field rules for the respective request-response pairs may be merged based on a frequency of occurrence of the symmetric field rules.
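As a minimal sketch of frequency-based merging, the example below keeps only those rules that recur in a sufficient fraction of the request-response pairs, which tends to discard rules arising from coincidental similarity in a single pair. The rule representation (pairs of section indices), the function name `merge_by_frequency`, and the `min_support` parameter are assumptions made for illustration.

```python
from collections import Counter


def merge_by_frequency(per_pair_rules, min_support=0.5):
    """Keep rules that occur in at least min_support of the request-response pairs.

    per_pair_rules is a list with one collection of rules per pair.
    """
    total = len(per_pair_rules)
    if total == 0:
        return set()
    # Count each rule at most once per pair.
    counts = Counter(rule for rules in per_pair_rules for rule in set(rules))
    return {rule for rule, count in counts.items() if count / total >= min_support}
```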
In some embodiments, the symmetric field rules for the respective request-response pairs may be merged by clustering ones of the symmetric field rules into respective groups based on similarities therebetween calculated using a distance function, and selecting a representative one of the symmetric field rules from the respective groups as one of the substitution rules. For example, a centroid or modal one of the symmetric field rules may be selected as one of the substitution rules.
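One possible sketch of clustering-based merging is given below. It assumes, purely for illustration, that each symmetric field rule is a triple (i, j, L) giving a request offset, a response offset, and a substring length, that the distance function is the Euclidean distance between such triples, and that a simple greedy grouping is acceptable. The names `cluster_rules`, `modal_rule`, and `max_dist` are hypothetical.

```python
import math
from collections import Counter


def cluster_rules(rules, max_dist=2.0):
    """Greedy grouping: a rule joins the first group whose first member is within max_dist."""
    groups = []
    for rule in rules:
        for group in groups:
            if math.dist(rule, group[0]) <= max_dist:
                group.append(rule)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([rule])
    return groups


def modal_rule(group):
    """Return the most frequently occurring rule of a group as its representative."""
    return Counter(group).most_common(1)[0][0]


def merge_by_clustering(rules, max_dist=2.0):
    """Select one representative (modal) rule from each group."""
    return [modal_rule(group) for group in cluster_rules(rules, max_dist)]
```

A centroid could be selected instead of the mode by averaging the triples of each group component-wise; the mode is used here only because it always corresponds to a rule that was actually observed.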
In some embodiments, the variable sections of the requests and the variable sections of the responses of the transaction subset may be identified by decoding the requests and the responses using a message parser based on predetermined information indicative of respective message structures thereof.
In some embodiments, the variable sections of the requests and the variable sections of the responses of the transaction subset may be identified independent of predetermined information indicative of respective message structures thereof by generating, for the transaction subset, a request prototype including common characters that are present at respective positions of ones of the requests thereof, and a response prototype including common characters that are present at respective positions of ones of the responses thereof. The variable sections of the requests of the transaction subset may be determined based on an absence of the common characters at corresponding positions of the request prototype, and the variable sections of the responses of the transaction subset may be determined based on an absence of the common characters at corresponding positions of the response prototype.
In some embodiments, in generating the request prototype and the response prototype, the requests of the transaction subset may be aligned according to the respective positions thereof, and the common characters for the request prototype may be selected based on a frequency of occurrence thereof at the respective positions of the requests of the transaction subset as indicated by the aligning of the requests. Similarly, the responses of the transaction subset may be aligned according to the respective positions thereof, and the common characters for the response prototype may be selected based on a frequency of occurrence thereof at the respective positions of the responses of the transaction subset as indicated by the aligning of the responses.
In some embodiments, in generating the response to the incoming request, variable sections of the incoming request may be identified based on a comparison of the incoming request with the corresponding positions of the request prototype. Ones of the corresponding positions of the response prototype may be populated with data from ones of the variable sections of the incoming request as specified by the substitution rules to generate the response thereto.
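The following sketch illustrates one way such response generation could proceed. It assumes that a prototype is a string in which runs of a wildcard character (here `?`, assumed not to occur in the constant text) mark the variable sections, and that the substitution rules map the index of a variable section in the response prototype to the index of a variable section in the request prototype. The function names and this rule format are hypothetical.

```python
import re

WILD = "?"  # assumed wildcard marking variable positions in a prototype


def prototype_to_regex(prototype):
    """Turn each wildcard run into a capture group and escape the constant text."""
    parts = re.split(r"(\?+)", prototype)
    pattern = "".join("(.+?)" if p.startswith(WILD) else re.escape(p) for p in parts if p)
    return re.compile(pattern, re.DOTALL)


def generate_response(request_proto, response_proto, rules, incoming):
    """Populate the response prototype with data from the incoming request.

    rules maps a response variable-section index to a request variable-section index.
    Returns None when the incoming request does not fit the request prototype.
    """
    match = prototype_to_regex(request_proto).fullmatch(incoming)
    if match is None:
        return None
    sections = match.groups()
    out, k = [], 0
    for part in re.split(r"(\?+)", response_proto):
        if part.startswith(WILD):
            src = rules.get(k)
            # Leave the wildcard run untouched when no rule covers this section.
            out.append(sections[src] if src is not None else part)
            k += 1
        else:
            out.append(part)
    return "".join(out)
```

For example, with the request prototype `SEARCH uid=????,ou=??????`, the response prototype `RESULT uid=????,status=??`, and the rule `{0: 0}`, an incoming request `SEARCH uid=jsmith,ou=sales` yields `RESULT uid=jsmith,status=??`; the second response section stays as a wildcard because no rule covers it, and a practical emulator would fill it by other means.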
In some embodiments, in generating the response to the incoming request, the transaction subset may be identified, among a plurality of transaction subsets, as corresponding to the incoming request based on a comparison of the incoming request with the request prototype of the transaction subset. The plurality of transaction subsets may respectively include different ones of the message transactions, where the ones of the message transactions included in the transaction subset are of a same operation type.
In some embodiments, the transaction subset may be defined by clustering the ones of the message transactions based on similarities therebetween calculated using a distance function.
In some embodiments, the transaction subset may be defined responsive to user selection of the ones of the message transactions.
According to further embodiments, a computer system includes a processor, and a memory coupled to the processor. The memory includes computer readable program code embodied therein that, when executed by the processor, causes the processor to define a transaction subset that includes ones of a plurality of message transactions previously communicated between a system under test and a target system for emulation. The message transactions include requests and responses thereto stored in a computer-readable memory. The memory further includes computer readable program code embodied therein that, when executed by the processor, causes the processor to identify variable sections of the requests and variable sections of the responses of the transaction subset, and determine substitution rules for the transaction subset. The substitution rules indicate a correspondence between respective ones of the variable sections of the requests and respective ones of the variable sections of the responses based on commonalities therebetween. Responsive to receiving an incoming request from the system under test, the computer readable program code, when executed by the processor, further causes the processor to generate a response to the incoming request according to the substitution rules.
According to still further embodiments, a computer program product includes a computer readable storage medium having computer readable program code embodied in the medium. The computer readable program code includes computer readable code to define a transaction subset that includes ones of a plurality of message transactions previously communicated between a system under test and a target system for emulation. The message transactions include requests and responses thereto that are stored in a computer-readable memory. The computer readable program code further includes computer readable code to identify variable sections of the requests and variable sections of the responses of the transaction subset, and computer readable code to determine substitution rules for the transaction subset. The substitution rules indicate a correspondence between respective ones of the variable sections of the requests and respective ones of the variable sections of the responses based on commonalities therebetween. The computer readable program code further includes computer readable code to generate a response to an incoming request from the system under test according to the substitution rules.
It is noted that aspects described herein with respect to one embodiment may be incorporated in different embodiments although not specifically described relative thereto. That is, all embodiments and/or features of any embodiments can be combined in any way and/or combination. Moreover, other systems, methods, and/or computer program products according to embodiments will be or become apparent to one with skill in the art upon review of the following drawings and detailed description. It is intended that all such additional systems, methods, and/or computer program products be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
Aspects of the present disclosure are illustrated by way of example and are not limited by the accompanying figures with like references indicating like elements.
Various embodiments will be described more fully hereinafter with reference to the accompanying drawings. Other embodiments may take many different forms and should not be construed as limited to the embodiments set forth herein. Like numbers refer to like elements throughout.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combined software and hardware implementation, any of which may generally be referred to herein as a “circuit,” “module,” “component,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
Any combination of one or more computer readable media may be utilized. The computer readable media may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an appropriate optical fiber with a repeater, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
As described herein, a computing system or environment may include one or more hosts, operating systems, peripherals, and/or applications. Machines in a same computing system or environment may have shared memory or resources, may be associated with the same or different hardware platforms, and/or may be located in the same or different physical locations. Computing systems/environments described herein may refer to a virtualized environment (such as a cloud environment) and/or a physical environment.
In assuring quality of a system under test (for example, a large enterprise system), physical replication of real-world deployment environments may be difficult or impossible to achieve. Thus, an emulation environment where realistic interactive models of the third party services are executed may be useful for purposes of quality assurance and/or development and operations (DevOps). In particular, such “virtual” deployment environments may be used to provision representations of diverse components, as shown by way of example in the environment 600 of
However, in some instances, scaling of the environment 615 to handle the number of likely endpoints 611 in the deployment scenario may require pre-existing knowledge of (i) a likely maximum number of endpoints; (ii) the likely maximum number of messages between endpoint and system; (iii) the likely frequency of message sends/receives needed for the system to respond in an acceptable timeframe; (iv) the likely size of message payloads given deployment network latency and bandwidth; and/or (v) the system's robustness in the presence of invalid messages, too-slow responses from endpoints, or no response from endpoints. Also, messages being exchanged between the system under test 605 and the endpoints 611 should adhere to various protocols; for example, a Lightweight Directory Access Protocol (LDAP) message sent by the system under test 605 to an endpoint 611 should be answered with a suitable response message, in an acceptable timeframe and with an acceptable message payload. Subsequent messages sent by the system under test 605 to the endpoint using the LDAP response message payload may also need to utilize the previous response information. As such, the creation of such executable endpoint models 611 may require the availability of a precise specification and/or prior detailed knowledge of the interaction protocols 619 used, may be relatively time consuming and/or error-prone, and may be subject to considerable implementation and/or maintenance effort in heterogeneous deployment environments.
Software service emulation or virtualization can create realistic executable models of server-side behavior, thereby replicating production-like conditions for large-scale enterprise software systems, for instance, by generating responses to requests from a system under test using symmetric field substitution. Symmetric field substitution (or “magic strings”) may refer to methods in service virtualization for generating and modifying a played back response to a system under test by substituting substrings from a new incoming request from the system under test into the generated response. The substrings may be common character sequences, of a length greater than a given threshold, which occur in both the request and the associated response of a selected message transaction.
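A deliberately simplified, token-level sketch of symmetric field substitution on a single recorded transaction is shown below. It assumes whitespace-delimited messages in which the incoming request has the same token layout as the recorded request, which is an assumption made for brevity and not a requirement of the technique; the function name `substitute` and the `min_len` threshold are hypothetical.

```python
def substitute(recorded_req, recorded_resp, incoming_req, min_len=4):
    """Replay recorded_resp, swapping in tokens from incoming_req.

    A token of the recorded request is treated as a symmetric field when it is at
    least min_len characters long and also occurs in the recorded response.
    """
    out = recorded_resp
    for old, new in zip(recorded_req.split(), incoming_req.split()):
        if len(old) >= min_len and old != new and old in recorded_resp:
            out = out.replace(old, new)
    return out
```

Because the rule is derived from one request-response pair only, any sufficiently long token that happens to appear in both messages is treated as a symmetric field, whether or not the correspondence is meaningful.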
Some embodiments of the present disclosure may arise from the realization that some methods for symmetric field substitution may be based on a comparison of a single request and response, and thus, may be prone to errors in response generation due to coincidental similarity between requests and responses. Accordingly, embodiments of the present disclosure provide systems and methods for defining more robust substitution rules that correlate sections of requests to sections of responses, in particular, by calculating symmetric fields from commonalities among groupings or subsets of similar requests and responses thereto. For example, some embodiments as described in detail herein may improve accuracy and/or efficiency in response generation by generating respective prototypes or templates that capture the common features of the range of requests, as well as the common features of the range of responses, in each subset. Embodiments of the present disclosure can be applied to opaque service virtualization (i.e., agnostic of or without pre-existing knowledge of protocols or message structures) or to standard service virtualization (i.e., with pre-existing knowledge of message structures or protocols).
In particular, embodiments of the present disclosure are directed to a service emulation or virtualization approach that simulates enterprise system element interaction behavior by grouping message transactions (including requests and responses thereto) that were previously communicated between a system under test and endpoints or elements/components in its deployment environment into transaction subsets of the same operation type, and determining (for each transaction subset) substitution rules that indicate a correlation between variable sections of the previously recorded request(s) and variable sections of the previously recorded response(s) corresponding thereto. Responsive to receiving an incoming request from a system under test, embodiments of the present disclosure (i) identify variable section(s) of the incoming request based on commonalities and differences between the incoming request and one(s) of the requests from the group (or a request prototype representative of the group), and (ii) generate a response using data from the variable section(s) of the incoming request and substitution rules that indicate a correlation between variable sections of the previously recorded request(s) and variable sections of the previously recorded response(s) corresponding thereto.
An approach for generating responses using more robust request prototypes is described below with reference to the block diagram of
As shown in
In the pre-processing stage, the transaction monitor 125 is configured to record message transactions (including requests and responses thereto) communicated with (i.e., to and/or from) the system under test 105 or other client. In particular, as shown in
Also in the pre-processing stage, the subset analyzer 128 is configured to partition the message transactions 130A of the transaction library 130 into transaction subsets or “clusters” 128A. The message transactions may be included in a particular transaction subset responsive to user direction, or automatically using a data clustering method. As shown in
Still referring to the pre-processing stage, the subset analyzer 128 is configured to identify variable sections of the requests and variable sections of the responses of one or more of the transaction subsets 128A, for example, based on respective message structures thereof. When the message structures of the requests and responses are already known (for example, in standard service virtualization), the subset analyzer 128 may employ a message parser to decode the requests and responses into fields that indicate the variable sections thereof. When the message structures of the requests and responses are unknown (for example, in opaque service virtualization), the subset analyzer 128 may be configured to generate request and response prototypes that include the common features of the requests and responses, and are indicative of the variable sections of the requests and responses, respectively, of the corresponding transaction subset 128A. The request and response prototypes may be generated using methods described in U.S. patent application Ser. No. 14/535,950 entitled “SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING MESSAGE PROTOTYPES FOR ACCURATE AND EFFICIENT OPAQUE SERVICE EMULATION,” filed Nov. 7, 2014, the disclosure of which is incorporated by reference herein in its entirety.
In particular, the request and response prototypes for a transaction subset 128A may include one or more commonalities among the request messages and the response messages, respectively, of that transaction subset 128A. In some embodiments, for one or more of the transaction subsets 128A, the subset analyzer 128 may align the request messages to identify common features or characters at respective positions thereof, and may likewise align the response messages to identify common features or characters at respective positions thereof. Gap characters may be inserted among the request messages and/or response messages of each subset 128A to align the positions of the messages. The subset analyzer 128 may select the common character(s) for inclusion in the request and response prototypes based on a frequency of occurrence at respective positions of the aligned requests and responses. The subset analyzer 128 may also determine the variable sections of the requests and responses for a transaction subset 128A from the absence of common characters at corresponding positions of the request and response prototypes for that subset 128A. The request and response prototypes generated for each transaction subset 128A may also be used at the run-time stage to increase the efficiency and accuracy of response generation.
In the pre-processing stage of
At the runtime stage, the emulation environment 115 receives a request message Reqin from the system under test 105 at the request analyzer 150 via a network 120B. The request message Reqin may be transmitted from the system under test 105 to request a particular service provided by one or more of the endpoints 111A, 111B, . . . 111N of the deployment environment 110. The request analyzer 150 is configured to indirectly identify at least one of the transaction subsets 128A that corresponds to the operation type requested by the request message Reqin, for instance, by comparing the incoming request message Reqin with the request prototypes for each of the transaction subsets 128A, in some embodiments without knowledge or determination of the structure or protocol of the received request message Reqin. For example, the request analyzer 150 may use a matching distance calculation technique to compare the current request Reqin received from the system under test 105 to the respective request prototypes for each of the transaction subsets 128A to identify one of the transaction subsets 128A as corresponding to the received request Reqin. Results of the analysis by the request analyzer 150 (for example, indicating the closest-matching transaction subset 128A, request prototype, and/or associated response prototype) are provided to the response generator 160. As used herein, a “matching” or “corresponding” prototype, cluster, message, request, and/or response, as determined for example by the request analyzer 150, may refer to similar (but not necessarily identical) prototypes/messages/requests/responses.
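As a rough sketch of the matching step, the example below selects the transaction subset whose request prototype is closest to the incoming request. The similarity ratio of Python's `difflib.SequenceMatcher` is used here merely as a stand-in for whatever matching distance an implementation adopts; the function name `closest_subset` and the dictionary of prototypes keyed by subset name are assumptions.

```python
from difflib import SequenceMatcher


def closest_subset(incoming, request_prototypes):
    """Return the key of the request prototype most similar to the incoming request.

    request_prototypes maps a subset identifier to its request prototype string.
    """
    def distance(a, b):
        # 0.0 for identical strings, approaching 1.0 as they share less.
        return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()

    return min(request_prototypes,
               key=lambda name: distance(incoming, request_prototypes[name]))
```

The selection relies only on character-level similarity, so no knowledge of the protocol or message structure of the incoming request is needed.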
Still referring to the runtime stage, the response generator 160 is configured to synthesize or otherwise generate a response message Resout to the incoming request Reqin according to the substitution rules 121A for the particular transaction subset 128A that was indicated by the request analyzer 150. For example, the response generator 160 may identify one or more variable sections of the incoming request Reqin based on a comparison of the incoming request Reqin with the corresponding position(s) of the request prototype for the transaction subset 128A indicated by the request analyzer 150, and may populate the corresponding position(s) of the associated response prototype (for that same transaction subset 128A) with data from ones of the variable sections of the incoming request based on the substitution rules 121A for the transaction subset 128A that was indicated by the request analyzer 150. Thus, the response Resout is automatically generated using the received request Reqin from the system under test 105 using one of the request-response pairs of a selected transaction subset 128A and the corresponding substitution rules 121A, and is returned to the system under test 105 via the network 120B.
In some embodiments of the present disclosure, a distance function may be used for calculations in multiple operations described herein. For example, respective distance functions may be used in subset definition, prototype generation, and symmetric field rule merging operations in the pre-processing stage, as well as in request analysis at the runtime stage. Example distance functions may include the Cartesian distance of the (i, j, L) vector, the Cosine distance, the Manhattan distance, the Needleman-Wunsch algorithm, and/or other distance functions.
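For reference, a compact sketch of a Needleman-Wunsch global alignment score, and of one way to turn it into a distance, is given below. The scoring values (+1 for a match, -1 for a mismatch or gap) and the normalization into the range 0 to 1 are illustrative choices and are not specified by the disclosure.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b (higher means more similar)."""
    # prev holds the previous row of the dynamic-programming table.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def nw_distance(a, b):
    """Map the default-scored alignment into [0, 1]; 0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return (longest - needleman_wunsch_score(a, b)) / (2 * longest)
```

With the default scoring, the score lies between minus the longer length and plus the longer length, which is why dividing by twice the longer length keeps the distance within the range 0 to 1.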
It will be appreciated that in accordance with various embodiments of the present disclosure, the emulation environment 115 may be implemented as a single server, separate servers, or a network of servers (physical and/or virtual), which may be co-located in a server farm or located in different geographic regions. In particular, as shown in the example of
As shown in
The storage system 225 may include removable and/or fixed non-volatile memory devices (such as but not limited to a hard disk drive, flash memory, and/or like devices that may store computer program instructions and data on computer-readable media), volatile memory devices (such as but not limited to random access memory), as well as virtual storage (such as but not limited to a RAM disk). The storage system 225 may include a transaction library 230 storing data (including but not limited to requests and associated responses) communicated between a system under test and a target system for emulation. Although illustrated in separate blocks, the memory 212 and the storage system 225 may be implemented by a same storage medium in some embodiments. The input/output (I/O) data port(s) 235 may include a communication interface and may be used to transfer information in the form of signals between the computing device 200 and another computer system or a network (e.g., the Internet). The communication interface may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like. These components may be conventional components, such as those used in many conventional computing devices, and their functionality, with respect to conventional operations, is generally known to those skilled in the art. Communication infrastructure between the components of
As mentioned above, in an enterprise system emulation environment (such as the environment 100 of
The observable interaction request messages and response messages communicated between a system under test and a target system contain two types of information: (i) protocol structure information (such as the operation name, field names and field delimiters), used to describe the type and format of a message; and (ii) payload information, which includes attribute values of records or objects and metadata. In general, given a collection of message interactions conforming to a specific interaction protocol, the repeated occurrence of protocol structure information may be common, as only a limited number of operations are defined in the protocol specification. In contrast, payload information is typically quite diverse according to the various objects and records exposed by the service. Message transaction analysis may thus be used to group similar observed messages into subsets, distinguish constant sections of messages (which may include protocol-related information) from variable sections of messages (which may include record or object related information) by comparing messages within a same subset, and determine substitution rules for response message generation by identifying common substrings among the variable sections of request-response message pairs, even without prior knowledge of the particular protocol used in the message transactions.
As shown in
The service emulation block 315 is configured to carry out some or all of the functionality of the subset analyzer 128, the rule generator 121, the request analyzer 150, and/or the response generator 160 of
For example, responsive to accessing a transaction library including a set of messages (including requests and associated responses) communicated between a client (such as the system under test 105 of
The subset analysis/prototype function 328 further identifies variable sections of the requests and variable sections of the responses of one or more of the transaction subsets, for example, based on respective message structures thereof. For example, if the message structures of the requests and responses are known, the subset analysis/prototype function 328 may be configured to apply a message parser to decode the requests and responses into fields that indicate the variable sections. If the message structures are unknown, the subset analysis/prototype function 328 may be configured to infer the constant and variable sections of the requests and responses, for example, by generating request and response prototypes for each of the transaction subsets.
The request and response prototypes function as representatives for the requests and responses, respectively, of the corresponding transaction subset. For example, responsive to a multiple sequence alignment of the requests and a multiple sequence alignment of the responses of a subset, a consensus request prototype and a consensus response prototype may be calculated by selecting, at each byte (or character) position, the most commonly occurring byte or character at that position, provided the byte/character has a relative frequency above a predetermined threshold. In other words, the request or response prototype may include a particular byte/character at a particular position when there is a consensus (among the requests or responses of the cluster) as to the commonality of the byte/character at that position. Positions for which there is no consensus may be populated with one or more wildcard characters in the request or response prototype.
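The consensus calculation described above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the function name `consensus_prototype`, the `WILDCARD` constant, and the toy aligned messages are assumptions introduced here, the comparison uses "at or above" the threshold, and any special treatment of alignment gap characters is left out.

```python
from collections import Counter

WILDCARD = "?"  # assumed wildcard symbol, chosen to match the example prototypes


def consensus_prototype(aligned, threshold=0.8):
    """Build a consensus prototype from equal-length, already aligned messages."""
    if len({len(message) for message in aligned}) != 1:
        raise ValueError("aligned messages must be non-empty and of equal length")
    prototype = []
    for column in zip(*aligned):
        # Most common character in this column and how often it occurs.
        char, count = Counter(column).most_common(1)[0]
        # Keep it only when its relative frequency meets the threshold.
        prototype.append(char if count / len(aligned) >= threshold else WILDCARD)
    return "".join(prototype)


# Hypothetical, shortened aligned requests ("-" marks an alignment gap).
aligned_requests = [
    "{Id:1-,Op:AddRq}",
    "{Id:2-,Op:AddRq}",
    "{Id:25,Op:AddRq}",
    "{Id:13,Op:AddRq}",
    "{Id:24,Op:AddRq}",
]
prototype = consensus_prototype(aligned_requests)
print(prototype)
```

In this toy input only the two Id columns fall below the 0.8 threshold, so they are the only positions that become wildcards.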
The subset analysis/prototype function 328 may thus identify the variable sections of the requests and responses for a transaction subset based on the presence of wildcard characters at corresponding positions of the request and response prototypes generated for that subset. For instance, for each request-response pair of a transaction subset, the subset analysis/prototype function 328 may compare the request to the request prototype and the response to the response prototype to determine the variable sections based on alignment of such sections with the wildcard characters of the prototypes.
It will be understood that these and/or other operations of the subset analysis/prototype function 328 may be performed as a pre-processing step, prior to any response generation. Also, the pre-processing transaction subset generation operations performed by the subset analysis/prototype function 328 may utilize the same distance function utilized by the request analysis function 350 in run-time response generation operations (as discussed below), or a different distance function may be used.
Still referring to
At run-time, the request analysis function 350 compares an unknown, incoming request with the request prototype for each of the transaction subsets generated by the subset analysis/prototype function 328, in order to select a particular one of the subsets that includes requests most similar to the incoming message. For example, the subset corresponding to the request prototype having the minimum distance to the incoming request may be selected as the matching subset, which will typically include message transactions having the same operation type as the incoming request. Thus, rather than comparing the incoming request to all of the requests in the transaction library, the request analysis function 350 compares the incoming request only with the request prototypes, reducing the processing burden and allowing for increased speed and efficiency.
The response generation function 360 performs response generation using the response prototype from the transaction subset selected by the request analysis function 350, by applying the corresponding substitution rules determined by the substitution rule generation function 321. In particular, the response generation function 360 identifies variable sections of the incoming request from comparison with corresponding positions of the request prototype, and populates the corresponding variable sections of the paired response prototype (as indicated by the substitution rules) with values from the variable sections of the incoming request. The response generation function 360 may thereby generate the response independent of receiving data or other knowledge indicating structural information (including the protocol, operation type, and/or header information) of the incoming request, by substituting fields from the variable sections of the request into corresponding sections of the response prototype according to substitution rules that relate sections of the requests to sections of the responses for each operation type.
Although
Computer program code for carrying out the operations described above and/or illustrated in
Operations for providing service emulation in accordance with some embodiments of the present disclosure will now be described with reference to the flowcharts of
Referring now to
At block 405, variable sections of the requests and variable sections of the responses of a particular transaction subset are identified, for example, based on respective message structures thereof. The message structures of the requests and responses may be based on pre-programmed information (e.g., in standard service virtualization), or may be determined from message prototypes (e.g., in opaque service virtualization). In particular, a request prototype (including common characters that are present at respective positions of the requests of the transaction subset) and a response prototype (including common characters that are present at respective positions of the responses of the transaction subset) may be generated (for example, based on sequence alignment), and the variable sections of the requests and responses may be determined from the absence of common characters at corresponding positions thereof.
Substitution rules for the transaction subset are determined at block 410. The substitution rules indicate a correspondence between variable section(s) of the requests and variable section(s) of the responses based on commonalities therebetween. For example, for one or more request-response pairs of the transaction subset, substrings that are common to variable section(s) of the request and response in each pair may be identified, and symmetric field rules may be defined to correlate the variable section(s) of the request to the variable section(s) of the response in each request-response pair. The symmetric field rules for a number of request-response pairs may be merged (for instance, based on frequency of occurrence of the symmetric field rules or based on representative one(s) of the symmetric field rules) to define the substitution rules for the transaction subset.
At block 415, a response to an incoming request from the system under test is generated according to the substitution rules. For instance, variable sections of the incoming requests may be identified based on a comparison of the incoming request with the request prototype for the transaction subset, and positions of the response prototype corresponding to the identified ones of the variable sections of the incoming request may be populated as specified by the substitution rules to generate the response.
In particular, at block 501, the request-response pairs stored in the transaction library are pre-processed to group similar ones of the request-response pairs into respective transaction subsets. A transaction subset may include a group of requests/responses, typically of the same operation type. The requests/responses to be included in each transaction subset can be selected manually (that is, responsive to a user indication) or automatically, for example, using message clustering. Examples of such message clustering are described in U.S. patent application Ser. No. 14/305,322 entitled “SYSTEMS AND METHODS FOR CLUSTERING TRACE MESSAGES FOR EFFICIENT OPAQUE RESPONSE GENERATION,” filed Jun. 16, 2014, the disclosure of which is incorporated by reference herein in its entirety. For instance, relative distances between respective requests and/or respective responses may be calculated based on a clustering distance measure, and the transaction library may be partitioned based on the relative distances such that the transaction subsets respectively include message transactions having similar relative distances. A data clustering method or algorithm (such as VAT, BEA, K-Means, a hierarchical clustering algorithm, etc.) may be used to group message transactions into subsets of similar messages. For instance, a distance matrix may be generated to include the relative distances for the respective requests and responses, and the clustering algorithm may be applied to the distance matrix to group request-response pairs having similar relative distances into a particular transaction subset. As such, each transaction subset may include message transactions of a same operation type, based on the computed similarities therebetween.
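One way to picture the distance-matrix step of block 501 is the sketch below. It is an assumption-laden stand-in rather than the clustering method of the referenced application: it uses a length-normalized edit distance and groups messages whose distance is at or below a cutoff, which corresponds to single-linkage hierarchical clustering cut at that height. The function names, the `cutoff` value, and the "DelRq" requests are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance scaled into [0, 1] by the longer message length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def cluster_requests(requests, cutoff=0.4):
    """Group request indices whose pairwise distance chain stays within cutoff."""
    n = len(requests)
    matrix = [[normalized_distance(requests[i], requests[j]) for j in range(n)]
              for i in range(n)]
    parent = list(range(n))  # union-find forest over request indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] <= cutoff:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


requests = [
    "{Id:1,Op:AddRq,Lastname:Du,Firstname:Miao}",
    "{Id:2,Op:AddRq,Lastname:Versteeg,Firstname:Steve}",
    "{Id:7,Op:DelRq,Key:3}",  # hypothetical second operation type
    "{Id:9,Op:DelRq,Key:5}",
]
subsets = cluster_requests(requests)
print(subsets)
```

The two "Add" requests and the two hypothetical "Del" requests end up in separate subsets because messages of the same operation type share most of their protocol structure.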
Table 1 below illustrates an example transaction subset, which will be used hereinafter to illustrate processing operations in accordance with some embodiments of the present disclosure. Note that the transaction subset of Table 1 includes request-response pairs for an “Add” operation.
At block 503, a request prototype and a response prototype are generated for each transaction subset. The operations of block 503 are performed to identify variable message sections/fields of the requests and responses of a transaction subset for an opaque (i.e., unknown structure) communications protocol. However, if the message structure/communications protocol is known (for example, in standard service virtualization or otherwise based on pre-programmed information), the operations of block 503 may be omitted, and operations may continue at block 505.
To generate the request and response prototypes at block 503, a multiple sequence alignment of the requests and responses for the transaction subset may be performed. In particular, for the example transaction subset of Table 1, the requests align as:
{Id:1-,Op:AddRq,Lastname:---------Du,Firstname:---Miao}
{Id:2-,Op:AddRq,Lastname:V-er--steeg,Firstname:--Steve}
{Id:25,Op:AddRq,Lastname:S-ve--tlova,Firstname:Grushka}
{Id:13,Op:AddRq,Lastname:-Karamazov-,Firstname:Fyodor-}
{Id:24,Op:AddRq,Lastname:Smerdyakov-,Firstname:Pavel--},
and the responses align as:
{Id:1-,Op:AddRsp,Result:Ok,-Lastname:---------Du,Key:0}
{Id:2-,Op:AddRsp,Result:Ok, Lastname:V-er--steeg,Key:2}
{Id:25,Op:AddRsp,Result:Ok,-Lastname:S-ve--tlova,Key:5}
{Id:13,Op:AddRsp,Result:Ok,-Lastname:-Karamazov-,Key:3}
{Id:24,Op:AddRsp,Result:Ok,-Lastname:Smerdyakov-,Key:4}
Still referring to block 503, a respective consensus prototype may be generated for the aligned requests, as well as for the aligned responses. The request and response prototypes may be generated using methods described in U.S. patent application Ser. No. 14/535,950 entitled “SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING MESSAGE PROTOTYPES FOR ACCURATE AND EFFICIENT OPAQUE SERVICE EMULATION,” filed Nov. 7, 2014, the disclosure of which is incorporated by reference herein in its entirety. For the example transaction subset of Table 1, a threshold of 0.8 was used to calculate the following request and response consensus prototypes. That is, common characters occurring at respective positions of ones of the requests and responses with a frequency at or above the 0.8 threshold (as indicated by the above alignments) were included in corresponding positions of the request and response prototypes, respectively. A wildcard character “?” was included in positions of the request and response prototypes for corresponding positions of the requests and responses that did not meet the 0.8 threshold. In particular, the consensus prototype for the requests of the example transaction subset of Table 1 is:
{Id:??,Op:AddRq,Lastname:????????,Firstname:???????}
The consensus prototype for the responses of the example transaction subset of Table 1 is:
{Id:??,Op:AddRsp,Result:Ok,Lastname:????????,Key:?}
At block 505, variable sections of the requests and variable sections of the responses of the transaction subset are determined. For example, if the message structure/communications protocol is known (e.g., in standard service virtualization, or otherwise based on pre-programmed information), a message parser can be applied to decode the raw messages into fields, where the decoded fields are equivalent to the variable message sections. If the message structure is unknown (e.g., in opaque service virtualization), the request and response prototypes generated at block 503 may be used to deduce the constant and variable sections of the requests and responses, respectively. For instance, one or more common characters at positions of the request and response prototypes may indicate a constant section, while one or more wildcards (included due to an absence of common characters among the requests or responses) at positions of the request and response prototypes may indicate a variable section. The request and response prototypes for each transaction subset may thus be divided into constant and variable sections, and labeled or numbered accordingly.
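The division of a prototype into numbered constant and variable sections can be sketched as below. The helper name `split_sections` and the zero-based numbering are assumptions made for illustration, and the sketch treats every "?" as a wildcard, so it would not distinguish a literal question mark in a message.

```python
from itertools import groupby


def split_sections(prototype, wildcard="?"):
    """Return (index, kind, text) for each maximal constant or variable run."""
    sections = []
    runs = groupby(prototype, key=lambda char: char == wildcard)
    for index, (is_variable, run) in enumerate(runs):
        kind = "variable" if is_variable else "constant"
        sections.append((index, kind, "".join(run)))
    return sections


response_prototype = "{Id:??,Op:AddRsp,Result:Ok,Lastname:????????,Key:?}"
sections = split_sections(response_prototype)
for index, kind, text in sections:
    print(index, kind, repr(text))
```

With zero-based numbering, the variable sections of this response prototype fall at indices 1, 3 and 5.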
For the example transaction subset of Table 1, the variable and constant sections for the request prototype are shown below in Table 2a:
The variable and constant sections for the response prototype are shown below in Table 2b:
At block 507, for one or more request-response pairs in the transaction subset, common substrings that occur in both a variable section of a request and in a variable section of its corresponding response are identified. For example, for the transaction subset, each request-response pair may be aligned with the request prototype and the response prototype, respectively, and substrings in the variable request sections which also occur in the variable response sections may be identified. Gaps inserted during the alignment may be ignored during this matching process. Table 3a illustrates alignment of the request prototype and first request of the example transaction subset of Table 1:
Table 3b illustrates alignment of the response prototype and first response of the example transaction subset of Table 1:
As shown above in Tables 3a and 3b, for the first request-response pair of Table 1, section q1 of the first request and section p1 of the first response both contain the common character “1”. Also section q3 of the first request and section p3 of the first response contain the common substring “Du”.
At block 509, for one or more of the request-response pairs of the transaction subset, symmetric field rules are defined. The symmetric field rules correlate variable section(s) of the request to variable section(s) of the response for a given request-response pair. For instance, for each symmetric field that occurs in both a variable section of a request and a variable section of a response for a given pair, a symmetric field rule may be defined in the form:
$$q_a \rightarrow p_b \qquad (1)$$
Where qa is a section of the request, and pb is a matching section of the response for the request-response pair. Note qa may map to multiple sections of the response, so it may be possible to have multiple rules with the same value of qa, but different values for pb. If the variable sections in the request and response match completely (i.e., an entire section of the request matches an entire section of the response, ignoring any gaps), the symmetric field rule may be defined in the form above. If there is a partial match (i.e., a whole or subsection of the request matches a whole or subsection of the response), the symmetric field rule may be defined in the form:
$$q_a[i \ldots (i+L)] \rightarrow p_b[j \ldots (j+L)] \qquad (2)$$
Where L is the length of the substring, i and (i+L) are indices in the request section denoting the start and end (exclusive), respectively, of the matching substring, and j and (j+L) are the start and ending (exclusive) indices of the matching substring in a response section.
For example, as shown in Tables 3a and 3b above, for the first request-response pair of Table 1, section q1 of the request is equal to section p1 of the response, while section q3 of the request is equal to section p3 of the response. Applying the symmetric field search to all of the requests/responses in the example transaction subset of Table 1 yields the following symmetric rules for each transaction, illustrated in Table 4:
Note that the example symmetric field rules of Table 4 do not include partial symmetric field matches, which may also be identified when partial matching is used. For example, in the request-response pair of transaction 3, note that there is a partial match between sections q1 and p5, which may be defined using the notation of equation (2) as: q1[5,6]->p5.
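A rough sketch of the symmetric field search of blocks 507 and 509 for a single request-response pair is given below. Several choices are assumptions made for illustration: the function name, the tuple encodings of rule forms (1) and (2), the use of the longest common substring (via Python's `difflib`) as the partial match, and indices counted from the start of each section, which is only one possible reading of the notation. The variable section values are those of the pair with Id 25 in the alignments above, with gaps removed.

```python
from difflib import SequenceMatcher


def symmetric_field_rules(request_vars, response_vars):
    """Return (whole_rules, partial_rules) for one request-response pair.

    Both arguments map a section number to that section's text, with any
    alignment gaps already removed.
    """
    whole, partial = [], []
    for a, q in request_vars.items():
        for b, p in response_vars.items():
            if not q or not p:
                continue
            if q == p:
                whole.append((a, b))  # form (1): qa -> pb
                continue
            matcher = SequenceMatcher(None, q, p, autojunk=False)
            m = matcher.find_longest_match(0, len(q), 0, len(p))
            if m.size > 0:
                # form (2): qa[i..i+L] -> pb[j..j+L], end indices exclusive
                partial.append((a, m.a, m.a + m.size, b, m.b, m.b + m.size))
    return whole, partial


request_vars = {1: "25", 3: "Svetlova", 5: "Grushka"}
response_vars = {1: "25", 3: "Svetlova", 5: "5"}
whole_rules, partial_rules = symmetric_field_rules(request_vars, response_vars)
print(whole_rules)
print(partial_rules)
```

Very short partial matches can arise by coincidence (for example, a single letter shared by two unrelated names), which is one reason a later merging step over many pairs is useful.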
At block 510, the symmetric field rules defined in block 509 for the request-response pairs of a transaction subset are merged to define substitution rules for each transaction subset. For example, in some embodiments, the symmetric field rules may be merged based on a relative frequency of occurrence of the rules in the transaction subset. Symmetric field rules which occur with a relative frequency at or above a defined threshold (e.g., 0.8) may be included, while symmetric field rules below the threshold may be discarded. As shown above in Table 4, for the example transaction subset of Table 1, three different symmetric field rules (q1->p1, q3->p3, q1->p5) occur. Table 5 below illustrates the relative frequency of each of the three symmetric field rules q1->p1, q3->p3, q1->p5. Using a threshold of 0.8, the first two symmetric field rules (q1->p1, q3->p3) can be included in the substitution rules for the transaction subset, while the last symmetric field rule (q1->p5) can be discarded.
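The frequency-based merge can be sketched as follows. The function name and the per-transaction rule lists are assumptions: the lists are inferred here from the Id and Key values visible in the example alignments (only the pair with Id 2 and Key 2 yields a whole-section q1->p5 match), not copied from a table of the disclosure.

```python
from collections import Counter


def merge_by_frequency(rules_per_transaction, threshold=0.8):
    """Keep rules whose relative frequency across transactions meets the threshold."""
    total = len(rules_per_transaction)
    # Count each rule at most once per transaction.
    counts = Counter(rule for rules in rules_per_transaction for rule in set(rules))
    return sorted(rule for rule, n in counts.items() if n / total >= threshold)


# Whole-section symmetric field rules assumed for the five example pairs.
rules_per_transaction = [
    [("q1", "p1"), ("q3", "p3")],
    [("q1", "p1"), ("q3", "p3"), ("q1", "p5")],
    [("q1", "p1"), ("q3", "p3")],
    [("q1", "p1"), ("q3", "p3")],
    [("q1", "p1"), ("q3", "p3")],
]
substitution_rules = merge_by_frequency(rules_per_transaction)
print(substitution_rules)
```

Under these assumed inputs, q1->p1 and q3->p3 each occur in all five pairs, while q1->p5 occurs in one of five (0.2) and is discarded at the 0.8 threshold.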
Still referring to block 510, in other embodiments, the symmetric field rules may be merged by clustering ones of the symmetric field rules into respective groups based on calculated similarities, and selecting a representative one of the symmetric field rules from each group (such as the centroid or modal element) as the merged representation to define a substitution rule. Any of a plurality of clustering algorithms (e.g. K-means, VAT, hierarchical clustering, density based clustering, etc.) may be used. Clustering may be particularly helpful for partial matching rules, where the i, j and L values may be more prone to imprecision as compared to the whole section rules. For the clustering, a distance function may be used to calculate the similarities of the symmetric field rules of a transaction subset. Example distance functions may include the Cartesian distance of the (i, j, L) vector, the Cosine distance, the Manhattan distance, or other distance functions.
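For the clustering-based merge, the distance between two partial matching rules and the choice of a modal representative can be sketched as below. The sketch assumes each rule for a given pair of sections is encoded as an (i, j, L) tuple, reads the "Cartesian distance" as the ordinary Euclidean distance, and uses hypothetical rule values; the clustering step itself is omitted.

```python
import math
from collections import Counter


def euclidean(rule_a, rule_b):
    """Euclidean distance between two (i, j, L) vectors."""
    return math.dist(rule_a, rule_b)


def manhattan(rule_a, rule_b):
    """Manhattan (city block) distance between two (i, j, L) vectors."""
    return sum(abs(x - y) for x, y in zip(rule_a, rule_b))


def modal_rule(rules):
    """Most frequently occurring rule of a group, used as its representative."""
    return Counter(rules).most_common(1)[0][0]


# Hypothetical (i, j, L) vectors for rules relating the same two sections.
partial_rules = [(1, 0, 1), (1, 0, 1), (1, 0, 2)]
print(euclidean(partial_rules[0], partial_rules[2]))
print(manhattan(partial_rules[0], partial_rules[2]))
print(modal_rule(partial_rules))
```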
At runtime, an unknown request Reqin is received from a system under test at block 511. The unknown request Reqin may be directed to an endpoint and/or environment for which service emulation is desired, such as the deployment environment 110 of
At block 512, responsive to receiving the unknown request Reqin, one of the transaction subsets including message transactions of an operation type corresponding to the unknown request is selected. The transaction subset may be identified by comparing the unknown request to the respective request prototypes (or other representative requests) of each of the transaction subsets. For example, in opaque service virtualization, a similarity of the received request to each of the request prototypes may be determined using a distance function, and one of the transaction subsets corresponding to the closest-matching one of the request prototypes may be identified, as similarly described in the above-referenced U.S. patent application Ser. No. 14/535,950 incorporated by reference herein. The distance function used in transaction subset identification at block 512 may be the same as or different than the distance function(s) used in transaction subset generation at block 501, prototype generation at block 503, and/or symmetric field rule merging at block 510, and may be independent of a message structure (which may indicate protocol, operation type, and/or header information) of the unknown request, such that a closest matching one of the request prototypes may be indirectly identified based on similarity, rather than based on the contents thereof. In some embodiments, a maximum distance threshold may be used such that, if no transaction subsets are identified as having a distance to the unknown request Reqin less than the maximum distance threshold, then a default response (such as an error message) may be generated and transmitted to the system under test.
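The prototype matching of block 512, including the maximum distance threshold, might look like the sketch below. The distance used here (one minus the similarity ratio of Python's `difflib.SequenceMatcher`) is a stand-in chosen for brevity and is not the distance function of the referenced applications; the subset names, the "DelRq" prototype, and the 0.5 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def distance(a, b):
    """Dissimilarity in [0, 1]; 0 means the two strings are identical."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def select_subset(request, prototypes, max_distance=0.5):
    """Return the name of the closest request prototype, or None if none is close enough."""
    if not prototypes:
        return None
    name, best = min(
        ((name, distance(request, proto)) for name, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    # None signals that a default response (such as an error) should be sent.
    return name if best < max_distance else None


prototypes = {
    "add": "{Id:??,Op:AddRq,Lastname:????????,Firstname:???????}",
    "delete": "{Id:??,Op:DelRq,Key:?}",  # hypothetical second subset
}
incoming = "{Id:133,Op:AddRq,Lastname:Verkhovtseva,Firstname:Katya}"
print(select_subset(incoming, prototypes))
print(select_subset("PING", prototypes))
```

Only one comparison per transaction subset is made, rather than one per recorded request, which is the source of the run-time saving described above.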
At block 513, variable section(s) of the unknown request are determined from a comparison with the request prototype of the transaction subset that was identified at block 512. For example, with reference to the request prototype of the example transaction subset of Table 1, upon receiving a request:
{Id:133,Op:AddRq,Lastname:Verkhovtseva,Firstname:Katya},
the received request is aligned with the request prototype, and the variable section(s) of the incoming/unknown request may be identified from the alignment, as shown below in Table 6:
At block 515, the substitution rules for the transaction subset, which were defined at block 510, are applied to the response prototype of the transaction subset that was identified at block 512 to generate a response to the system under test. In particular, symmetric field substitution may be performed to fill in values for the appropriate variable sections of the response prototype, from the corresponding variable sections of the incoming request as determined at block 513, based on the substitution rules. For example, for the response prototype of the example transaction subset of Table 1:
The two substitution rules (q1->p1, q3->p3) defined for the example transaction subset of Table 1 are applied to populate the variable sections p1 and p3 in the response prototype with the values of variable sections q1 (133) and q3 (Verkhovtseva) to generate a response:
The remaining unpopulated sections of the response prototype (if any) are then populated. For example, a closest matching request in the transaction subset to the incoming request may be determined (using a distance function), and sections from the corresponding/paired response may be copied to the generated response. In the example transaction subset of Table 1, if the closest matching request in the transaction subset was the request-response pair of transaction 4, the missing section p5 would be populated with the value “4”. Alternatively, a response may be selected at random from the transaction subset, and sections from the selected response may be copied to the missing sections of the generated response. For example, the request-response pair of transaction 3 may be selected at random from the example transaction subset of Table 1, and the section p5 would be populated with “3”. Similarly, a response may be selected at random for each missing section of the generated response, and the appropriate section may be copied from the randomly selected response. As such, a generated response including multiple missing sections may have sections copied from multiple responses of the transaction subset. As another alternative, for each missing section of the generated response, a string may be randomly generated to populate the section. The length of the string may be restricted to be within the range of lengths observed for the section in the transaction subset. The alphabet used may also be restricted, for example, to only use characters observed to occur in the transaction subset for the section.
Still referring to block 515, using the first of the missing field approaches discussed above, the final generated response would be:
{Id:133,Op:AddRsp,Result:Ok,Lastname:Verkhovtseva,Key:4}
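The run-time steps of blocks 513 and 515 can be tied together in a short sketch that reproduces the response above. It is illustrative only: in place of the sequence alignment described in the text, it turns the request prototype into a regular expression whose wildcard runs become capture groups, the prototypes are written without embedded spaces, and the function names, zero-based section numbering, and `fallback` mapping (standing in for sections copied from the closest recorded response) are assumptions.

```python
import re


def split_sections(prototype):
    """Return (text, is_variable) for each maximal run of the prototype."""
    return [(m.group(0), m.group(0)[0] == "?")
            for m in re.finditer(r"\?+|[^?]+", prototype)]


def extract_variables(request, request_prototype):
    """Map section index -> text for each variable section of the request."""
    sections = split_sections(request_prototype)
    # Constant sections must match literally; wildcard runs capture any text.
    pattern = "".join("(.*?)" if is_var else re.escape(text)
                      for text, is_var in sections)
    match = re.fullmatch(pattern, request, flags=re.DOTALL)
    if match is None:
        return None
    indices = [i for i, (_, is_var) in enumerate(sections) if is_var]
    return dict(zip(indices, match.groups()))


def generate_response(request, request_prototype, response_prototype, rules, fallback):
    """Fill the response prototype from the request, then from fallback values."""
    values = extract_variables(request, request_prototype)
    if values is None:
        raise ValueError("request does not fit the request prototype")
    filled = {p: values[q] for q, p in rules}  # apply each rule q -> p
    parts = []
    for index, (text, is_var) in enumerate(split_sections(response_prototype)):
        parts.append(filled.get(index, fallback.get(index, "")) if is_var else text)
    return "".join(parts)


request_prototype = "{Id:??,Op:AddRq,Lastname:????????,Firstname:???????}"
response_prototype = "{Id:??,Op:AddRsp,Result:Ok,Lastname:????????,Key:?}"
incoming = "{Id:133,Op:AddRq,Lastname:Verkhovtseva,Firstname:Katya}"
rules = [(1, 1), (3, 3)]   # q1 -> p1 and q3 -> p3
fallback = {5: "4"}        # section p5 taken from the closest recorded response
response = generate_response(incoming, request_prototype, response_prototype,
                             rules, fallback)
print(response)
```

Because the capture groups accept any length, the three-digit Id and the twelve-character last name are handled even though the prototype shows fewer wildcard positions.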
At block 520, the generated response is transmitted to the system under test, in response to the unknown request received therefrom.
Aspects of the present disclosure have been described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. As used herein, “a processor” may refer to one or more processors.
These computer program instructions may also be stored in a computer readable medium that when executed can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions when stored in the computer readable medium produce an article of manufacture including instructions which when executed, cause a computer to implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable instruction execution apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatuses or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various aspects of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS).
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting to other embodiments. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including”, “have” and/or “having” (and variants thereof) when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. In contrast, the term “consisting of” (and variants thereof) when used in this specification, specifies the stated features, integers, steps, operations, elements, and/or components, and precludes additional features, integers, steps, operations, elements and/or components. Elements described as being “to” perform functions, acts and/or operations may be configured to or otherwise structured to do so. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items and may be abbreviated as “/”.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the various embodiments described herein.
Many different embodiments have been disclosed herein, in connection with the above description and the drawings. It will be understood that it would be unduly repetitious and obfuscating to literally describe and illustrate every combination and subcombination of these embodiments. Accordingly, all embodiments can be combined in any way and/or combination, and the present specification, including the drawings, shall support claims to any such combination or subcombination.
In the drawings and specification, there have been disclosed typical embodiments and, although specific terms are employed, they are used in a generic and descriptive sense only and not for purposes of limitation, the scope of the disclosure being set forth in the following claims.