A Web Service is software that supports interoperable interaction of computing devices over a network. As Web Services become more common, both the complexity of the tasks they perform and the amount of data they process are likely to increase. How the transfer of data is handled can affect the usability of the Web Service. For example, a Monitoring and Reporting Service may need to transfer large amounts of data, even gigabytes, repeatedly.
SOAP (Simple Object Access Protocol) is a standard protocol for exchanging XML-based messages over a computer network. In Web Services based applications, SOAP messages are carried over a transport protocol such as HTTP; this is currently the standard approach in the Web Services paradigm. SOAP engines create and transport the SOAP messages that carry the data to be transferred. Existing SOAP engines can handle small SOAP messages efficiently, but when they encounter large amounts of data bundled into a SOAP message they can experience performance and reliability problems.
In particular, data sent via a SOAP message is encoded, so as the amount of data increases the encoding work also increases, leading to performance problems. When SOAP engines process unexpectedly large amounts of data, access latency increases unpredictably and sessions can time out. This makes the approach unreliable for transferring large quantities of data. Furthermore, security is introduced at the message level using WS-Security (Web Services Security), but encrypting large amounts of data increases the overall processing time taken by the SOAP engine.
SOAP with Attachments is used to transfer binary data files. SOAP messages can be transmitted with attachments of various types (such as images, drawings and text documents). Such data is often in a particular binary format and uses the “MIME or DIME multipart/related” content type and URI (Uniform Resource Identifier) schemes for referencing the MIME or DIME parts.
However, this scenario also leads to low performance owing to the additional processing time required to encode the data into a standard encoding (such as base64) and decode it at the client. In addition, interoperability using SOAP attachments is a problem, as each SOAP engine vendor has a proprietary protocol to interpret binary data objects. For example, Java based SOAP engines and Microsoft .NET or C++ SOAP engines define binary data objects differently: Java based SOAP engines support MIME whereas .NET based SOAP engines support the Microsoft proprietary protocol DIME.
Furthermore, the WSDL (Web Services Description Language, an XML format for describing network services) has to be modified to accommodate different MIME content types, which reduces the flexibility to send different MIME content types via the same Web Service. For example, an image and a PDF document have different MIME content type descriptions in WSDL.
Alternatively, large quantities of data may be broken into smaller chunks of a fixed size for transfer. The number of method calls is then proportional to the number of data chunks. These data chunks must be managed efficiently so that the correct set of chunks is transferred and subsequently reassembled correctly. Such techniques have poor performance owing to the additional processing time required by serialization and by the establishment of a connection to the server for every chunk. They also have poor usability and manageability, as the consumer of the Web Service must add functionality to the client code to manage the received chunks in a reliable manner. Indeed, the size of a data chunk cannot be predicted by the client-side code, so the server-side code must be adapted to pass information on the optimal chunk size to the client, that is, the chunk size that both server and client are expected to be able to handle. Finally, very large transfers can suffer from unpredictable delays and may result in loss of data.
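To give a rough sense of the encoding cost described above, the following Python sketch measures how much base64 inflates a binary payload. The function name and the payload size are assumptions chosen for illustration; the sketch does not model any particular SOAP engine.

```python
import base64


def base64_overhead(num_bytes: int) -> float:
    """Return the ratio of base64-encoded size to raw size for num_bytes of data."""
    raw = bytes(num_bytes)             # zero-filled stand-in for binary data
    encoded = base64.b64encode(raw)    # standard base64, no line breaks
    return len(encoded) / len(raw)


# Base64 maps every 3 input bytes to 4 output characters,
# so the encoded payload is about one third larger than the raw data.
ratio = base64_overhead(3_000_000)
print(f"encoded size is {ratio:.2f}x the raw size")
```

The size increase comes on top of the CPU time spent encoding at the sender and decoding at the receiver.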
In order that the invention may be more clearly ascertained, embodiments will now be described, by way of example, with reference to the accompanying drawing, in which:
There will be provided a method and apparatus for transferring data in a computing environment.
In one embodiment, the method comprises sending a request for the data to a web service, the request comprising web service information; the web service responding by fetching the data from a data storage; storing the data in at least one file; generating a Uniform Resource Identifier for the file; receiving the Uniform Resource Identifier; and receiving the data as a data stream from the file.
In one embodiment, the computing system comprises a client having a client module and a server having a web service. In this embodiment, the client is configured to respond to a request comprising web service information from a consumer of the web service for the data to forward the request to the web service, the web service is configured to respond to the request by fetching the data from a data storage and passing the data to the server, the server is configured to store the data in at least one file, generate a Uniform Resource Identifier for the file and return the Uniform Resource Identifier to the web service, the web service is further configured to return the Uniform Resource Identifier to the client, and the client is further configured to respond by opening a data stream with the Uniform Resource Identifier to receive the data.
Computing system 100 includes communications infrastructure 106 to which server 102 and client 104 are connected, so that server 102 and client 104 are in data communication. Communications infrastructure 106, in this embodiment, comprises an intranet but in other embodiments may comprise the internet or other computer network.
Server 102 has a web container 108, a Web Service 110 and a server library 112 (a software component discussed below). Client 104 has a client module 114 (viz. a software component that is the effective client for Web Service 110) and a client library 116 (a software component, also discussed below).
The features of computing system 100 are most clearly illustrated by reference to
Thus, referring to
At step 208, Web Service 110 fetches the requested data from data storage 118, then at step 210 passes the data either as an object containing data rows or as a stream of data to server library 112. In this embodiment, if the data is essentially in rows and columns (which will commonly be the case), at step 212 server library 112 wraps the data into XML (Extensible Markup Language) elements, and at step 214 stores the data as a temporary XML file 120 on server 102 in web container 108.
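Steps 212 and 214 might look roughly like the following Python sketch. It is an illustration only: the function name, the element names `rows` and `row`, and the assumption that column names are valid XML element names are all hypothetical and not taken from the embodiment.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_rows_as_xml(columns, rows, directory):
    """Wrap tabular data in XML elements and store it as a temporary XML file.

    The element names ("rows", "row") are hypothetical, and each column name
    is assumed to be a valid XML element name.
    """
    root = ET.Element("rows")
    for row in rows:
        row_element = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(row_element, name).text = str(value)

    # mkstemp creates a uniquely named file and returns an open descriptor.
    descriptor, path = tempfile.mkstemp(suffix=".xml", dir=directory)
    with os.fdopen(descriptor, "wb") as handle:
        ET.ElementTree(root).write(handle, encoding="utf-8", xml_declaration=True)
    return path
```

In a deployment along the lines described, `directory` would be a location that the web container serves, so that the file can later be addressed by a URL.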
At step 216, server library 112 generates an HTTP URI (Uniform Resource Identifier) in the form of an HTTPS URL (Uniform Resource Locator) for temporary XML file 120 and at step 218 returns this URL to the Web Service 110. At step 220 Web Service 110 returns the URL as a SOAP response to client library 116.
For reasons of security, client library 116 generates a client SSL (Secure Sockets Layer) certificate at step 222. At step 224, client library 116 attempts to open a data stream with the URL by accessing temporary XML file 120 (which has an HTTPS URL). At step 226, web container 108 receives the client access request in the form of an HTTPS “Open( )” call with two parameters, the URL and the client certificate. Web container 108 is configured to validate client certificates that are received along with a client request. Hence, at step 228, web container 108 responds by validating (i.e. checking the validity of) the client certificate. If the client certificate is not validated (i.e. the request is not from a trusted client), processing continues at step 230 where web container 108 returns an error message and processing ends.
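As a minimal sketch of step 216, the following Python function derives an HTTPS URL from the path of a stored file. The function name, the `base_url` parameter and the example host are assumptions; the embodiment does not specify how the URL is composed.

```python
import os
from urllib.parse import quote


def url_for_temp_file(file_path, base_url):
    """Build an HTTPS URL for a file served from the web container.

    base_url is a hypothetical prefix under which the container is assumed
    to expose the directory that holds the temporary files.
    """
    if not base_url.startswith("https://"):
        raise ValueError("base_url must use the https scheme")
    file_name = os.path.basename(file_path)
    # Percent-encode the file name so that spaces and similar characters are URL-safe.
    return base_url.rstrip("/") + "/" + quote(file_name)


print(url_for_temp_file("/var/tmp/report 1.xml", "https://server.example/tmp/"))
```

Only this short URL needs to travel in the SOAP response, which keeps the SOAP message small regardless of the size of the data.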
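The server-side requirement that a client certificate be presented and validated can be sketched with Python's standard `ssl` module. This is an assumption-laden illustration rather than the configuration of any particular web container: the function name and the `trusted_ca_file` parameter are hypothetical, and a real server would additionally load its own certificate and key with `load_cert_chain`.

```python
import ssl


def make_server_context(trusted_ca_file=None):
    """Create a server-side TLS context that rejects clients without a valid certificate."""
    # Purpose.CLIENT_AUTH yields a context suitable for server-side sockets.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # CERT_REQUIRED makes the handshake fail unless the client presents
    # a certificate that chains to a trusted authority.
    context.verify_mode = ssl.CERT_REQUIRED
    if trusted_ca_file is not None:
        # Hypothetical path to the CA bundle used to validate client certificates.
        context.load_verify_locations(cafile=trusted_ca_file)
    return context
```

On the client side, the corresponding step would be to load the client certificate into a client context with `load_cert_chain` before opening the HTTPS connection.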
Otherwise (i.e. the request is found to be from a trusted client) processing continues at step 232, where web container 108 allows client library 116 to open a stream to the temporary XML file 120 over HTTPS. At step 234 client library 116 passes this stream to the consumer of Web Service 110 via client module 114. At step 236 the consumer binds to the URL and streams the data, while having control over the data transfer.
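The consumer's control over the transfer in steps 232 to 236 can be illustrated by reading the stream in bounded chunks instead of loading it all into memory. The Python sketch below works on any binary file-like object; the function name and the default chunk size are assumptions.

```python
import io


def stream_to_file(source, destination, chunk_size=64 * 1024):
    """Copy a binary stream to destination in fixed-size chunks; return bytes copied."""
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:          # an empty read signals the end of the stream
            break
        destination.write(chunk)
        copied += len(chunk)
    return copied


# Stand-in for an HTTPS response body; with a real URL the source could be the
# object returned by urllib.request.urlopen(url, context=ssl_context).
source = io.BytesIO(b"<rows/>" * 1000)
destination = io.BytesIO()
print(stream_to_file(source, destination, chunk_size=4096))
```

Because the caller drives the loop, it can pause, report progress or abort between chunks, and memory use stays bounded by `chunk_size`.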
At this point processing essentially ends. However, temporary XML files created in web container 108 consume disk space so, to manage these files, server 102 includes a singleton thread utility 122 that is initiated by server library 112 when the first such temporary XML file is created in web container 108. Once started, the thread 122 remains alive and running for the lifetime of Web Service 110 or web container 108. Thread 122 polls periodically, checks the timestamp of each temporary XML file 120 and deletes files that are older than the file retention period. The polling interval and the file retention period are configurable by accessing server library 112 with Web Service 110. The consumer has access to temporary XML file 120 within the file retention period; after that, the consumer will have to make another call to Web Service 110 to create a new temporary XML file 120.
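A background cleanup of this kind might be sketched in Python as follows. The function names and parameters are assumptions, file modification time is used as the timestamp, and the sketch does not itself enforce singleton behaviour; a caller would start the thread once, for instance when the first temporary file is created.

```python
import os
import threading
import time


def delete_expired_files(directory, retention_seconds, now=None):
    """Delete files in directory whose modification time exceeds the retention period."""
    now = time.time() if now is None else now
    with os.scandir(directory) as entries:
        candidates = [entry for entry in entries if entry.is_file()]
    removed = []
    for entry in candidates:
        if now - entry.stat().st_mtime > retention_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return removed


def start_cleanup_thread(directory, retention_seconds, poll_seconds, stop_event):
    """Start a daemon thread that polls the directory until stop_event is set."""
    def run():
        # Event.wait returns False on timeout and True once the event is set.
        while not stop_event.wait(poll_seconds):
            delete_expired_files(directory, retention_seconds)

    thread = threading.Thread(target=run, name="temp-file-cleanup", daemon=True)
    thread.start()
    return thread
```

Both the polling interval and the retention period are plain parameters here, which mirrors the configurability described above.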
According to this embodiment, it is also possible to upload data.
Thus, referring to
At step 308, Web Service 110 passes the request to server library 112. At step 310, server library 112 creates a new empty file 120 on server 102 in web container 108.
At step 312, server library 112 generates an HTTPS URL for the new file 120 and at step 314 returns this URL to the Web Service 110. At step 316 Web Service 110 returns the URL as a SOAP response to client library 116.
For security, client library 116 generates a client SSL certificate at step 318. At step 320, client library 116 attempts to open a data stream with the URL to access new file 120 (which has an HTTPS URL). At step 322, web container 108 receives the client access request in the form of an HTTPS “Open( )” call with two parameters, the URL and the client certificate. Web container 108 is configured to validate client certificates that are received along with a client request. Hence, at step 324, web container 108 responds by validating the client certificate. If the client certificate is not validated, processing continues at step 326 where web container 108 returns an error message and processing ends.
Otherwise (i.e. the request is found to be from a trusted client) processing continues at step 328, where web container 108 allows client library 116 to open a stream to new file 120 over HTTPS. At step 330 client library 116 uploads the local data file to new file 120 on server 102 using, in this embodiment, the PUT method, while having control over the data transfer. The PUT method requests that an enclosed entity be stored at the supplied Request-URI (in this case the HTTPS URL). Processing then ends.
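The upload at step 330 can be sketched with Python's standard `urllib.request` module. The function name, the example URL and the default content type are assumptions; the sketch only constructs the PUT request and does not send it.

```python
import urllib.request


def build_put_request(url, data, content_type="application/octet-stream"):
    """Build an HTTP PUT request that asks the server to store data at url."""
    request = urllib.request.Request(url, data=data, method="PUT")
    request.add_header("Content-Type", content_type)
    return request


# Hypothetical URL of the kind returned in the SOAP response.
request = build_put_request("https://server.example/tmp/new-file.bin", b"example payload")
print(request.get_method(), request.full_url)
```

Sending the request would be a call such as `urllib.request.urlopen(request, context=ssl_context)`, where `ssl_context` is a client context that carries the client certificate.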
A computing system was constructed according to the embodiment of
A Monitoring Web Service was employed to collect monitoring information on various resources from an OPENVIEW Reporter database on a monthly basis. The consumer of this Monitoring Web Service accesses this data for performance analysis of resources for a particular month. This monitoring information can amount to several megabytes or gigabytes of data. The consumer of this Web Service was integrated with the Client Library and the Monitoring Web Service was integrated with the Server Library.
The application was tested with three approaches for transferring a large volume of data, the first and second according to the background art. In the first or “Standard Approach”, the entire data was passed within a SOAP message; the second approach was the SOAP with Attachments approach. The third approach was according to the embodiment of the present invention described above by reference to
It was found that, for all data sizes, the Standard Approach timed out or ‘crashed’ resulting in a loss of data. The SOAP with Attachments approach and the approach of the present embodiment both succeeded in transferring the data, but the time taken to transfer the 28 MB, 120 MB, 218 MB, 420 MB, 530 MB and 955 MB of data with the approach of the present embodiment was, respectively, 14%, 13%, 13%, 12%, 12% and 12% of the time required using SOAP with Attachments.
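The relative times quoted above can be restated as speed-up factors. The short Python sketch below does this arithmetic using only the percentages reported in the preceding paragraph; the variable and function names are illustrative.

```python
# Time taken by the present embodiment as a percentage of the time taken by
# SOAP with Attachments, keyed by data size in MB (figures from the text above).
relative_time_percent = {28: 14, 120: 13, 218: 13, 420: 12, 530: 12, 955: 12}


def speedup(percent_of_baseline):
    """Convert 'takes p% of the baseline time' into a speed-up factor."""
    return 100 / percent_of_baseline


for size_mb, percent in relative_time_percent.items():
    print(f"{size_mb} MB: about {speedup(percent):.1f}x faster")
```

On these figures the present embodiment was roughly seven to eight times faster than SOAP with Attachments across the tested sizes.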
Thus, with the method of the present embodiment, performance of a SOAP engine improves significantly because the method processes only small SOAP messages whose payload is an HTTPS URL. A large volume of data can be transferred by data streaming over HTTPS. Data encryption is left to HTTPS and hence there is no additional overhead to encrypt the data with another security mechanism (such as WS-Security). Only two server calls are required to transfer a large volume of data: the first is to the target Web Service and the second is an HTTPS call to transfer data from the temporary XML file.
As a large volume of data is not sent within a SOAP message, delays that might otherwise arise when the consumer accesses the data are avoided; this also avoids probable session timeouts. The use of a data streaming mechanism over HTTP (a standard and reliable transport protocol) improves reliability and hence data integrity.
Security is provided via a transport-level security mechanism, that is, HTTP over SSL (HTTPS). Because no data is passed within SOAP messages, HTTP and SSL can be leveraged to encrypt the data while it is on the wire. Hence, a message-level security mechanism is not required, and security is provided by an industry-standard secure transport protocol.
Interoperability can also be achieved because approaches such as SOAP with Attachments that use proprietary protocols (such as DIME) are avoided. Rather, the industry standard HTTPS transport protocol is used to transfer the data.
System 100 may be configured to employ asynchronous message calls, as accessing large volumes of data synchronously may compel consumers to wait until the whole process of transferring data has been completed. In addition, the method implemented in system 100 may equivalently be used in reverse, that is, to upload data from consumer to the Web Service. In that case, the roles of client and server are effectively reversed.
The foregoing description of the exemplary embodiments is provided to enable any person skilled in the art to make or use the present invention. While the invention has been described with respect to particular illustrated embodiments, various modifications to these embodiments will readily be apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. It is therefore desired that the present embodiments be considered in all respects as illustrative and not restrictive. Accordingly, the present invention is not intended to be limited to the embodiments described above but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Number | Date | Country | Kind |
---|---|---|---|
1521/CHE/2007 | Jul 2007 | IN | national |