The disclosures made herein relate generally to computer networks and computer-implemented methodologies configured for improving response time and, more particularly, to facilitating data compression to improve response time.
In the context of data transmission between networked data processing systems, response time is the duration of time between a first data processing system (e.g., a first server) providing a request for information to a second data processing system (e.g., a second server) and data constituting the requested information being received in its entirety by the first data processing system from the second data processing system. The response time corresponds to the latency, or ‘wait-time’, of the first data processing system with respect to requesting information and waiting for receipt of a corresponding reply. Accordingly, it can be seen that optimizing response time (e.g., reducing response time and/or maintaining response time at an acceptable or specified level) is desirable as it directly influences the overall quality-of-service experienced by clients of a data processing system.
Round-trip time (RTT) is a common metric used for quantifying response time. Conventional means for measuring RTT on a connection between two data processing systems include suitably configured network utilities (e.g., PING utility, TRACEROUTE utility, etc.), various configurations of echo utilities, and/or passively monitoring the response time of active connections. In one specific example, RTT is determined by measuring the time it takes a given network packet (i.e., reference data) to travel from a source data processing system to a destination data processing system and back. Examples of factors that affect RTT include, but are not limited to, time for compressing data, time required for sending (i.e., transferring) data to a protocol stack, request sending time, network delay, network congestion, network loss percentage, and decompression time. Because RTT is affected by network congestion, RTT varies over time and is typically calculated on a per-partner basis.
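As an illustration of the per-connection measurement described above, the following Python sketch times a single request/reply exchange over a loopback socket. It is a minimal, hypothetical example: the payload size, helper names, and loopback setup are assumptions for demonstration and not part of the original disclosure.

```python
import socket
import threading
import time

def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo the received packet back."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))

def measure_rtt(payload: bytes = b"x" * 64) -> float:
    """Return the round-trip time (seconds) for one payload over loopback."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))          # ephemeral port
    server.listen(1)
    threading.Thread(target=echo_once, args=(server,), daemon=True).start()

    with socket.create_connection(server.getsockname()) as client:
        start = time.perf_counter()
        client.sendall(payload)            # request
        client.recv(1024)                  # wait for the echoed reply
        elapsed = time.perf_counter() - start
    server.close()
    return elapsed

rtt = measure_rtt()
```

In practice the per-partner RTT mentioned above would be tracked separately for each remote system rather than measured over loopback.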
Approaches for reducing response time in computer networks are known (i.e., conventional approaches for reducing response time). The underlying goal of such conventional approaches is to modify data being transmitted by a data processing system (e.g., via data compression, data omission, etc.) and/or modifying operating parameters of the data processing system in a manner that results in a reduction in response time for all or a portion of data being transmitted by the data processing system. However, such conventional approaches for reducing response time are known to have drawbacks that adversely affect their effectiveness, desirability and/or practicality.
One example of such conventional approaches for reducing response time includes requiring that an administrator or an application (i.e., a controlling entity) decide whether the use of data compression is or is not desirable for reducing response time. But, because the administrators and/or applications upon which these conventional approaches rely are limited in their ability to readily provide complete and accurate decision-making information, these conventional approaches routinely result in non-optimal decisions being made regarding compression. Examples of such non-optimal decisions include, but are not limited to, implementing too much compression, implementing too little compression, and implementing a less than preferred compression technique. In some instances, these non-optimal decisions include simply ignoring the issue of compression altogether and tolerating less than optimal response times.
Another example of such conventional approaches for reducing response time includes analyzing subject data and determining which portions of the subject data can be omitted from being transmitted, whether in a compressed or uncompressed format. To this end, it is typically necessary to have a fairly detailed understanding of the subject data such that only non-essential information comprised by the subject data (e.g., certain background information in images) is omitted. A drawback of this type of conventional approach is that it is generally not a practical solution in instances where the content and configuration of data cannot be readily and rapidly determined and/or is not predefined. Another drawback is that analyzing the subject data can be time-consuming and processor intensive.
Yet another example of such conventional approaches for reducing response time includes deploying and activating client and server components of a data compression algorithm (i.e., network middleware) on networked computer systems. In such conventional approaches, the client and server components comprise respective portions of the data compression algorithm that jointly facilitate determination of whether to compress subject data and, in instances where compression is deemed appropriate, facilitate respective portions of compression/decompression functionality. Due to the client-server processing requirements of such a conventional approach, response time optimization functionality afforded by the data compression cannot be carried out in conjunction with a computer system not configured with one or both components of the data compression algorithm (i.e., the client component and/or the server component). This is a drawback in that it limits usefulness, effectiveness and practicality. Another drawback of this type of conventional approach is that extra burden is placed on the CPU and storage means of the client system for maintaining information required for facilitating functionality of the data compression algorithm. Still another drawback is that deployment of both the client and server components of this type of data compression algorithm is mandatory.
Therefore, a system and/or method that overcomes drawbacks associated with conventional approaches for reducing response time would be useful, advantageous and novel.
The inventive disclosures made herein relate to facilitating adaptive implementations of data compression for optimizing response time performance in a data processing system. Such implementations rely on a determination of whether or not adjusting request and/or reply sizes at the data processing system by applying a compression factor (i.e., to facilitate compression) has a desirable influence on response time performance. Such determination is based on a wide variety of decision criteria. Examples of the decision criteria include, but are not limited to, network protocol performance, CPU utilization, bandwidth utilization, and estimates of the CPU time and network time costs of sending compressed versus uncompressed data.
Through experimentation, it has been found that improvement in response time and throughput more than offsets costs associated with facilitating compression. Conversely, it has also been found that facilitating compression can degrade performance in instances where its facilitation results in the use of additional CPU time. Accordingly, systems and methods in accordance with embodiments of the inventive disclosures made herein have an underlying intent of determining how bandwidth and processor utilization can be leveraged to advantageously influence (e.g., optimize) response time performance. The objective of such leveraging is to optimize (e.g., maximize) transaction throughput (e.g., requests per second) between a pair of servers when the servers are connected through less than optimal networks and/or network connections. An edge server and an application server are an example of such pair of servers.
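The trade-off described above — compression spends CPU time to save network time — can be sketched with a simple cost model. The function name and all constants (bandwidth, compression ratio, CPU seconds per megabyte) are illustrative assumptions only, not values from the disclosure.

```python
def response_time_estimate(size_bytes: int, bandwidth_bps: float,
                           compress_ratio: float = 1.0,
                           cpu_secs_per_mb: float = 0.0) -> float:
    """Estimate send time as optional CPU cost plus time on the wire."""
    cpu_cost = (size_bytes / 1e6) * cpu_secs_per_mb          # compression work
    wire_cost = (size_bytes / compress_ratio) * 8 / bandwidth_bps
    return cpu_cost + wire_cost

# 1 MB payload over a 10 Mbit/s link, illustrative numbers only.
uncompressed = response_time_estimate(1_000_000, 10e6)
compressed = response_time_estimate(1_000_000, 10e6,
                                    compress_ratio=3.0, cpu_secs_per_mb=0.05)
# Here compression helps: wire time shrinks more than CPU time grows.
```

With a fast network or a heavily loaded CPU, the same model shows compression degrading performance, matching the experimental observation above.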
In a first embodiment of a method for facilitating optimization of resource utilization in accordance with the inventive disclosures made herein, operating parameter levels exhibited by a data processing system are determined. At least a portion of the operating parameter levels influence response time performance for the data processing system. After the operating parameter levels are determined, a resource optimization mode is determined dependent upon one or more of the operating parameter levels. Thereafter, a data compression influence on the response time performance is determined dependent upon the determined resource optimization mode.
In a second embodiment of a method for facilitating optimization of resource utilization in accordance with the inventive disclosures made herein, a resource optimization mode for a data processing system is determined dependent upon one or more of a plurality of operating parameter levels exhibited by the data processing system. A resource optimization strategy is then implemented dependent upon resource optimization modes, the operating parameter levels, and/or reference responsiveness parameters. Information utilized in determining the resource optimization strategy is modified dependent upon information derived from implementation of the resource optimization strategy, thereby enabling resource optimization functionality to be adaptively implemented based on historic and current information.
In a third embodiment of a method for facilitating optimization of resource utilization in accordance with the inventive disclosures made herein, operating parameter levels exhibited by a data processing system are determined and at least a portion of the operating parameter levels influence response time performance exhibited by the data processing system. Uncompressed data transmission or a first data compression method is implemented in response to processor utilization exhibited by the data processing system exceeding a respective specified threshold. A second data compression method is implemented in response to bandwidth utilization exhibited by the data processing system exceeding a respective specified threshold. Round-trip time optimization is implemented in response to the processor utilization and the bandwidth utilization being below the respective specified thresholds.
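The threshold logic of this third embodiment can be sketched as follows. The 0.90 thresholds and the `Mode`/`select_mode` names are hypothetical placeholders; the disclosure does not prescribe specific values or identifiers.

```python
from enum import Enum

class Mode(Enum):
    PROCESSOR = "processor"      # CPU-bound: send uncompressed or compress cheaply
    BANDWIDTH = "bandwidth"      # bandwidth-bound: compress aggressively
    ROUND_TRIP = "round_trip"    # unbound: optimize round-trip time directly

def select_mode(cpu_util: float, bw_util: float,
                cpu_threshold: float = 0.90,
                bw_threshold: float = 0.90) -> Mode:
    """Pick a resource optimization mode from utilization levels in [0.0, 1.0]."""
    if cpu_util >= cpu_threshold:
        return Mode.PROCESSOR    # processor utilization exceeds its threshold
    if bw_util >= bw_threshold:
        return Mode.BANDWIDTH    # bandwidth utilization exceeds its threshold
    return Mode.ROUND_TRIP       # both below threshold: RTT optimization
```

Note the ordering mirrors the embodiment: the processor check takes precedence, since compressing on a saturated CPU is counter-productive regardless of bandwidth.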
Method 100 begins with operation 105 for determining operating parameter levels for a data processing system (e.g., a server). In one example, determining operating parameter levels includes monitoring, measuring, estimating and/or storing all or a portion of such operating parameter levels. In the context of the inventive disclosures presented herein, the term "operating parameter levels" includes operating parameter levels related to one or more associated network connections of the data processing system in addition to operating parameter levels of resources of the data processing system. Accordingly, examples of determining such operating parameter levels include, but are not limited to, monitoring processor utilization, monitoring aggregate bandwidth utilization, measuring network parameters (e.g., round trip time, latency, etc.) and estimating compressibility of outbound data.
After determining the operating parameter levels, operation 110 is performed for determining a resource optimization mode. Embodiments of resource optimization modes in accordance with the inventive disclosures made herein include a mode in which processor cycles are optimized (i.e., processor optimization mode), a mode in which aggregate bandwidth is optimized (i.e., a bandwidth optimization mode), and a mode in which round trip time is optimized (i.e., a round-trip time optimization mode). Determination of the resource optimization mode is performed dependent upon one or more of the operating parameter levels exhibited by the data processing system. In one embodiment, determining the resource optimization mode preferably includes selecting the processor optimization mode in response to determining that response time performance is bound by processor utilization (i.e., processor cycles), selecting bandwidth optimization mode in response to determining that the response time performance is bound by bandwidth utilization (e.g., aggregate bandwidth utilization), and selecting round-trip time optimization mode in response to determining that the response time performance is unbound by processor utilization and bandwidth utilization.
It will be understood by a skilled person that the term ‘optimization’ as used herein is a non-absolute term. For example, optimized response time performance for a data processing system may not correspond to absolute optimization of response time performance assuming infinite availability of information, knowledge and time, but rather the best response time performance achievable based on availability and/or practical allocation of information, knowledge and time. In effect, the preferred intent is to pursue absolute optimization to the degree possible in view of factors such as available and/or practical allocation of information, knowledge and time.
Operation 115 is performed for implementing a resource optimization strategy after determining the resource optimization mode. Implementation of the resource optimization strategy is performed dependent upon the determined resource optimization mode, the operating parameter levels, and/or reference responsiveness parameters. Examples of such responsiveness parameters include, but are not limited to, reference round-trip times, reference latencies and reference response times.
In conjunction with performing implementation of the resource optimization strategy, operation 120 is performed for updating optimization strategy information. Such updating of optimization strategy information includes, but is not limited to, adding new information, deleting existing information, replacing existing information and/or modifying existing information. In one embodiment, updating of optimization strategy information is preferably performed dependent upon information derived from implementing the resource optimization strategy. Examples of optimization strategy information include, but are not limited to, information related to processor utilization, information related to aggregate bandwidth utilization, information related to network parameters (e.g., round trip time, latency, etc.), and information related to compressibility of reference outbound data.
By updating optimization strategy information in an integrated manner with implementing resource optimization strategies, resource optimization functionality in accordance with the inventive disclosures made herein may be implemented in an adaptive manner. For example, on-going implementation of resource optimization functionality results in new, deleted, replaced and/or modified optimization strategy information. Accordingly, on-going implementation of the resource optimization functionality serves to enhance the breadth, context, content and resolution of the optimization strategy information in an automated manner and, thereby, enables resource optimization functionality to be implemented in an adaptive (e.g., self-regulating) manner.
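One plausible way to realize such adaptive updating is to blend each new observation into running averages, so current and historic information both influence future decisions. The `StrategyInfo` name and the exponential-smoothing scheme below are assumptions for illustration, not the disclosed mechanism.

```python
class StrategyInfo:
    """Rolling record of observed outcomes used to adapt future decisions."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha           # weight given to the newest sample
        self.avg_rtt = None          # smoothed round-trip time (seconds)
        self.avg_ratio = None        # smoothed compression ratio

    def update(self, observed_rtt: float, observed_ratio: float) -> None:
        """Blend a new observation into the running averages."""
        if self.avg_rtt is None:     # first sample seeds the averages
            self.avg_rtt, self.avg_ratio = observed_rtt, observed_ratio
        else:
            a = self.alpha
            self.avg_rtt = a * observed_rtt + (1 - a) * self.avg_rtt
            self.avg_ratio = a * observed_ratio + (1 - a) * self.avg_ratio

info = StrategyInfo(alpha=0.5)
info.update(0.1, 2.0)                # seed with the first observation
info.update(0.3, 4.0)                # averages blend toward the new sample
```

The smoothing constant controls how quickly the strategy information adapts to changing network and processor conditions.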
In response to analysis of the resource utilization determining that response time performance is bound by processor utilization, operation 210 is performed for selecting a processor optimization mode. In response to analysis of the resource utilization determining that response time performance is bound by bandwidth utilization rather than processor utilization, operation 215 is performed for selecting a bandwidth optimization mode. In response to analysis of the resource utilization determining that response time performance is unbound by bandwidth utilization and processor utilization, operation 220 is performed for selecting round-trip time optimization mode. Presented below is an example of a modeling approach used for determining resource optimization mode applicability.
The goal of this experiment was to determine how bandwidth and processor utilization influenced whether or not outbound data should be compressed in an effort to optimize response time performance. The results predict which type of resource optimization mode (e.g., which type of resource utilization leveraging) best applies to different types of network operating scenarios.
A 5-system network with one network switch was used to facilitate this experiment. A first pair of systems was configured as partner systems on the network and was used to conduct the test. A second pair of systems was configured to interject network overhead on the switch. The fifth system was configured as a proxy server that could be tuned to be a network bottleneck. Five cases depict the overall results of the experiment.
CASE 1: Two Systems, both CPU constrained by using background activity. Perfect network; Light network usage. Chosen compression approach did not yield an advantageous effect on response time performance.
CASE 2: Two Systems, light system usage but network constrained using a "Proxy Server" on an intermediate box. Compression could be implemented in a manner that yielded an advantageous effect on response time performance.
CASE 3: Two Systems, light system usage, plenty of network capacity but network noise interjected by the other two systems. Chosen compression approach did not yield an advantageous effect on response time performance.
CASE 4: Two systems, heavy CPU usage, network bottleneck using a "Proxy Server". Chosen compression approach did not yield an advantageous effect on response time performance.
CASE 5: Two systems, heavy CPU usage, busy network (effectively, the same as CASE 4.) Chosen compression approach did not yield an advantageous effect on response time performance.
In summary, the detailed information gathered in this experiment found that:
(1) If a server is CPU-bound, optimizing processor utilization (i.e., processor cycles) is typically advantageous. Accordingly, a comparison would dictate the preference of sending outbound data in an uncompressed form or sending outbound data after being compressed using a lossier and/or less expensive compression method (e.g., a lossy-type compression method).
(2) If a server is bandwidth bound, optimizing network bandwidth is typically advantageous. Accordingly, data compression would be used to its maximum benefit in view of bandwidth utilization.
(3) If a server is unbound by processor and bandwidth utilization, optimizing round-trip time is typically advantageous. Accordingly, because incurring extra processor overhead for very little return benefit becomes counter-productive, a comparison would dictate the preference of sending outbound data in uncompressed form or sending outbound data after being compressed using one of any number of compression methods.
After optimizing the data compression influence, operation 310 is performed for setting a transmission mode that is dependent upon optimization of data compression. A first transmission mode includes sending outbound data in a compressed form in response to compressing outbound data in accordance with a preferred compression factor determined during optimization of the data compression influence on outbound data (i.e., during operation 305). Examples of such a determined compression factor include, but are not limited to, a compression factor calculated dependent upon a suitable formula, a compression factor selected from a collection of pre-defined compression factors, and a compression factor selected from a collection of previously utilized compression factors. A second transmission mode includes sending outbound data in an uncompressed form.
In response to setting the transmission mode for sending outbound data in an uncompressed form, operation 315 is performed for sending outbound data in an uncompressed form. In response to setting the transmission mode for sending outbound data in a compressed form, operation 320 is performed for applying the preferred compression factor to outbound data and operation 325 is performed for sending the outbound data in a corresponding compressed form. Accordingly, implementing the resource optimization strategy results in data being sent in the form that provides optimized response time performance.
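The two transmission modes can be sketched minimally as follows, using zlib's compression level as a stand-in for the preferred compression factor. This is an assumption for illustration; the disclosure does not name a specific compression algorithm or factor representation.

```python
import zlib

def prepare_outbound(data: bytes, compress: bool, level: int = 6) -> bytes:
    """Return the bytes to transmit for the chosen transmission mode."""
    if not compress:
        return data                       # second mode: send uncompressed
    return zlib.compress(data, level)     # first mode: apply the factor

payload = b"abc" * 10_000
wire = prepare_outbound(payload, compress=True, level=9)
assert zlib.decompress(wire) == payload   # the receiver recovers the original
```

Here a higher level trades more processor cycles for fewer bytes on the wire, which is exactly the trade the selected resource optimization mode arbitrates.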
Generally speaking, the operation of determining what resource(s) should be optimized is implemented in any manner that accomplishes the overall objective of optimizing aggregate response time performance for a server (e.g., what particular resource a server administrator should optimize). As discussed above, an example of an approach for determining the manner in which data compression should be applied includes utilizing experimentation for determining criteria and parameters upon which to base a compression factor to apply. Also discussed above is the effect of optimization strategy information being maintained in a manner that enables resource optimization functionality to be implemented adaptively. Accordingly, the compression factor and its specific means of generation will typically vary on a case-by-case basis.
After determining the baseline compression factor, operation 410 is performed for modeling uncompressed data transmission and compressed data transmission using the baseline compression factor. Operation 415 is performed for analyzing results of the modeling in response to performing the modeling. In one embodiment, the modeling includes sending reference outbound data in an uncompressed form and in a compressed form as generated using the baseline compression factor, and the analysis includes comparing response time performance in view of one or more operating parameter levels for the uncompressed and compressed data.
In one embodiment, the comparison is preferably based on resource utilization parameters. For example, in response to determining that response time performance is bound by processor utilization, processor cycles required for compressing outbound data and sending the compressed outbound data are compared with processor cycles required for sending the outbound data in uncompressed form. In response to determining that the response time performance is bound by bandwidth utilization, bandwidth utilization associated with sending the outbound data in compressed form is compared with bandwidth utilization associated with sending the outbound data in uncompressed form. In response to determining that the response time performance is unbound by processor utilization and bandwidth utilization, round-trip time for the outbound data in compressed form is compared with round-trip time for the outbound data in uncompressed form.
If the data compression influence associated with the compression factor and the utilized compression method is not acceptable (e.g., above or below a respective threshold value), the analysis of operation 415 may optionally determine a revised compression factor, if so practical and/or useful. An example of the compression factor and utilized compression method yielding unacceptable influence is when resource utilization and/or response time performance dictate sending outbound data in the uncompressed form is preferred over the corresponding compressed form. In one embodiment, the revised compression factor is derived as a scaling of the previously determined compression factor. In another embodiment, the revised compression factor is calculated using the same or different approach as used in operation 405 with revised assumptions and/or variable information (e.g., adaptively based on updated resource optimization information).
In response to a revised compression factor being determined, operations 410 and 415 are repeated. In response to a revised compression factor not being determined, the method continues at operation 310. In one embodiment, the method continuing at the operation 310 serves as a trigger for performing operation 120 for updating optimization strategy information.
Referring now to computer readable medium, it will be understood by the skilled person that methods, processes and/or operations adapted for carrying out resource optimization functionality in accordance with the inventive disclosures made herein are tangibly embodied by computer readable medium having instructions thereon for carrying out such functionality. In one specific embodiment, the instructions are tangibly embodied for carrying out the method 100 disclosed above to facilitate resource optimization functionality (e.g., as a resource optimization utility running on a data processing system). The instructions may be accessible by one or more data processors (e.g., a logic circuit of a data processing system providing server functionality) from a memory apparatus (e.g., RAM, ROM, virtual memory, hard drive memory, etc.), from an apparatus readable by a drive unit of the data processing system (e.g., a diskette, a compact disk, a tape cartridge, etc.) or both. Accordingly, embodiments of computer readable medium in accordance with the inventive disclosures made herein include a compact disk, a hard drive, RAM or other type of storage apparatus that has imaged thereon a computer program (i.e., a set of instructions) adapted for carrying out resource optimization functionality in accordance with the inventive disclosures made herein.
Edge server 515 includes inbound optimization layer 535 and application server 520 includes outbound optimization layer 540. Inbound optimization layer 535 is preferably, but not necessarily, implemented at an inbound point from edge server 515 to application server 520 for supporting request flow. Outbound optimization layer 540 is preferably, but not necessarily, implemented at an outbound point from application server 520 to Edge Server 515 for supporting return or response flow.
Inbound optimization layer 535 and outbound optimization layer 540 are each configured for enabling resource optimization functionality to be carried out in accordance with the inventive disclosures made herein. In one specific embodiment, inbound optimization layer 535 and outbound optimization layer 540 each preferably includes instructions for carrying out all or a portion of method 100.
Inbound optimization layer 535 tracks CPU utilization of edge server 515 and outbound optimization layer 540 tracks CPU utilization of application server 520. If edge server 515 or application server 520 is operating at or above a prescribed processing level (e.g., 90%), the respective optimization layer (i.e., inbound optimization layer 535 or outbound optimization layer 540, respectively) does not implement compression and communicates such decision to the other server through, for example, a custom HTTP header. Inbound optimization layer 535 and outbound optimization layer 540 continually monitor respective CPU utilization and the respective compression decision is revisited iteratively.
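The decision exchange described above might look like the following sketch. The header names are invented placeholders; the disclosure states only that a custom HTTP header communicates the decision, without specifying its name or format.

```python
CPU_CEILING = 0.90   # prescribed processing level from the description above

def compression_headers(local_cpu_util: float) -> dict:
    """Build hypothetical custom headers announcing the compression decision."""
    compress = local_cpu_util < CPU_CEILING   # at/above the ceiling: no compression
    return {
        "X-Compression-Enabled": "1" if compress else "0",  # assumed header name
        "X-CPU-Utilization": f"{local_cpu_util:.2f}",       # assumed header name
    }
```

Each optimization layer would regenerate these headers as it re-samples CPU utilization, so the decision is revisited on every iteration as described.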
At initialization, edge server 515 reads a set of parameters that define system goals associated with implementing resource optimization (e.g., server throughput optimization) in accordance with the inventive disclosures made herein. Additionally, at initialization, edge server 515 performs a TRACEROUTE (or equivalent) operation for determining the number of hops and delays. In initiating a request to the application server 520, edge server 515 examines CPU utilization and makes a determination of whether or not to compress the inbound message to application server 520. If edge server 515 is operating below a predefined level (e.g., at less than 90% busy) and request/responses are operating within a predefined level (e.g., 90% of the system goals for response time based on tracked history over the last 30 seconds, the last five minutes, and last 30 minutes), edge server 515 will compress the message.
Edge server 515, in sending the message, monitors request/response times and maintains a profile of response times over a predefined or system-defined duration (e.g., the last 30 seconds, the last five minutes and last 30 minutes). Preferably, this profile is hardened so as to be persistent across restarts. Application server 520, in processing the message, sends a reply. Based on CPU utilization tracking, if application server 520 determines (e.g., based on the last 30 seconds, the last five minutes and the last 30 minutes) that CPU utilization of both edge server 515 and application server 520 is less than a predefined level (e.g., 90%), then the reply is compressed.
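The sliding-window profile described above might be maintained as in this sketch, with one instance per window (30 seconds, five minutes, 30 minutes). The class name and eviction scheme are assumptions; persistence across restarts is omitted.

```python
import time
from collections import deque

class ResponseProfile:
    """Response-time samples kept for a sliding window (e.g., the last 30 s)."""

    def __init__(self, window_secs: float):
        self.window_secs = window_secs
        self.samples = deque()            # (timestamp, response_time) pairs

    def record(self, response_time, now=None):
        """Add a sample and evict anything older than the window."""
        now = time.monotonic() if now is None else now
        self.samples.append((now, response_time))
        cutoff = now - self.window_secs
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def average(self) -> float:
        """Mean response time over the current window (0.0 when empty)."""
        if not self.samples:
            return 0.0
        return sum(rt for _, rt in self.samples) / len(self.samples)
```

Comparing `average()` against the system goals read at initialization yields the "operating within a predefined level" test used in the compression decision.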
In one embodiment, edge server 515 preferably uses an architected message format (i.e., custom configured in accordance with the inventive disclosures made herein) for facilitating compression. The architected message format provides for a first compression header that indicates whether or not compression is being used and that includes the uncompressed length of the message. A second HTTP header includes the CPU utilization of edge server 515 over a predefined or system-defined duration (e.g., the last 30 seconds, the last five minutes, and last 30 minutes).
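A hypothetical byte layout for such a compression header is sketched below, assuming a 1-byte compression flag plus a 4-byte uncompressed length in network byte order. The disclosure does not specify field sizes or ordering, so this layout is an assumption.

```python
import struct
import zlib

# Assumed layout: 1-byte compression flag + 4-byte uncompressed length.
HEADER = struct.Struct("!BI")

def frame(message: bytes, compress: bool) -> bytes:
    """Prepend the compression header to an optionally compressed body."""
    body = zlib.compress(message) if compress else message
    return HEADER.pack(1 if compress else 0, len(message)) + body

def unframe(wire: bytes) -> bytes:
    """Parse the header, then recover the original message bytes."""
    flag, length = HEADER.unpack_from(wire)
    body = wire[HEADER.size:]
    out = zlib.decompress(body) if flag else body
    assert len(out) == length             # header length cross-checks the body
    return out
```

Carrying the uncompressed length lets the receiver pre-allocate buffers and validate the inflated result, which is one plausible motivation for including it in the header.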
A novel and advantageous aspect of resource utilization in accordance with the inventive disclosures made herein for compression optimization is that edge server 515 caches and re-uses inflator/deflator objects. The savings contributed by such caching and re-use of inflator/deflator objects toward compression optimization are very significant. Deflator objects provide compression of data using a variety of parameters, which define the type and extent of the compression (e.g., the optimization strategy information disclosed above).
Application server 520 includes hardware and software that handles all application operations between user data processing systems (e.g., user data processing system 530) and backend applications and/or databases (e.g., a database residing on database server 525). Application servers such as application server 520 are typically used for complex, transaction-based applications. Edge server 515 includes hardware and software that serves the function of distributing application processing of application server 520 to the edge of the enterprise intranet 505, preferably using centralized administrative and application control.
In one embodiment, edge server 515 is a specialized type of application server that performs application front end processing. Caching is an example of such front end processing functionality. Various configurations of edge servers, application servers and database servers are commercially available from numerous vendors. WebSphere® Edge Server and WebSphere® Application Server, both commercially available from IBM Corporation, are specific examples of edge server 515 and application server 520, respectively.
Embodiments of systems and methods in accordance with the inventive disclosures made herein are applicable to a variety of types of network communications and architectures. In one specific embodiment, the target network communications are those between edge servers and application servers. Generally speaking, however, embodiments of such systems and methods may be implemented in conjunction with most types of network communication protocols.
In accordance with at least one embodiment of the inventive disclosures made herein, it is required that compressed data be provided in a message that is formatted in a manner indicating that compression is in use and what form of compression is in use. HyperText Transfer Protocol (HTTP) is an example of a communication protocol that provides the use of header information configured for indicating that compression is in use and what form of compression is in use. Accordingly, HTTP is one example of a communication protocol configured in a manner that allows compression information (e.g., presence and type of compressed data) to be provided to sending and receiving parties in a communication and is thus one example of a communication protocol compatible with embodiments of methods and systems in accordance with the inventive disclosures made herein.
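For reference, standard HTTP/1.1 headers already convey exactly this information: a request advertises which encodings the sender accepts, and a reply declares the encoding actually applied. The dictionaries below simply illustrate those standard fields; the length value is an arbitrary placeholder.

```python
# Standard HTTP/1.1 headers that negotiate and announce compression.
request_headers = {
    "Accept-Encoding": "gzip, deflate",   # sender advertises supported forms
}
response_headers = {
    "Content-Encoding": "gzip",           # reply indicates compression in use
    "Content-Length": "1234",             # length of the compressed body
}
```

This built-in negotiation is what makes HTTP compatible with the embodiments described here without requiring any protocol extension for the basic compressed/uncompressed indication.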
In the preceding detailed description, reference has been made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments, and certain variants thereof, have been described in sufficient detail to enable those skilled in the art to practice the invention. It is to be understood that other suitable embodiments may be utilized and that logical, mechanical and electrical changes may be made without departing from the spirit or scope of the invention. For example, functional blocks shown in the figures could be further combined or divided in any manner without departing from the spirit or scope of the invention. To avoid unnecessary detail, the description omits certain information known to those skilled in the art. The preceding detailed description is, therefore, not intended to be limited to the specific forms set forth herein, but on the contrary, it is intended to cover such alternatives, modifications, and equivalents, as can be reasonably included within the spirit and scope of the appended claims.