1. Field of the Invention
The invention relates to transactions provided over a network, and more particularly to reliable storage of those transactions provided over a network.
2. Description of the Related Art
Transactions over the Internet are rapidly increasing. Not only do shopping sites utilize transactions, but many other sites do as well to provide and maintain data. However, one problem with transactions that are accomplished over networks such as the Internet is the reliability of the transaction process itself. In many cases it is not acceptable to allow a transaction to be posted twice, which can occur if the transaction is actually posted but the client or originator never receives the posted response and repeats the transaction. Because of this problem, sophisticated techniques have been developed to prevent the double posting and often expensive and sophisticated computer hardware is required.
Generally it has been considered required that full state tracking be performed for each transaction, so that should any loss of responses or other communication occur, the exact state of the transaction can be determined. However, this requires that state be maintained by both the client and server ends.
One alternative would be a stateless environment where clients resubmit any transaction after error detection. However, a stateless environment increases the double posting problem discussed above. In those cases, to be more reliable it is preferred that the transactions and the database be located on the same logical unit, either an individual unit or a cluster. Exemplary systems include various mainframes, Hewlett-Packard NonStop servers and Oracle clusters. The problem with this is that those systems are very expensive. This is exacerbated in larger systems. Some cost reductions can be obtained by separating the server into two portions, a transaction front end and a database back end. But both of these portions must still be clustered or redundant systems as listed above to have the needed reliability, so the cost reduction is not necessarily very large.
This is in contrast to commodity servers, such as those built using Intel architectures and running unreliable or non-fault tolerant operating systems such as Linux or Windows. But the Linux and Windows commodity hardware systems running the less scalable and non-fault tolerant databases such as MySQL, Postgres or SQL Server, simply cannot provide the type of data integrity needed to handle the high reliability transaction systems. Therefore the only practical alternative has been either to forgo the stateless architecture of transactional reliability and use other techniques which are not as acceptable, or to utilize an expensive hardware environment.
It would be desirable to be able to perform the reliable commitment of transactions in a stateless architecture using less expensive hardware to enable higher throughput for a lower cost, while maintaining the high reliability and eliminating the duplication of transactions.
In a system according to the present invention, commodity hardware can be utilized to act as a front-end to a database system, while maintaining transaction commitment reliability in a stateless architecture. Systems according to the present invention utilize a separate table to track and determine if a particular transaction has been previously committed to the primary transaction database. Preferably this separate table, the stateless transaction protocol (STP) table, utilizes indices relating to both the user and to the particular request to determine if the particular transaction has been previously committed and if a response has been provided for that transaction. By inspecting this table prior to actually starting any transaction to the primary transaction database, a determination can be made whether the transaction has been previously committed to the primary transaction database. If so, the response, which is also stored in the STP table, is simply provided and the original transaction is no longer necessary. However, if the STP table does not indicate that the transaction has been previously committed, then the transaction is committed and an entry is made in the STP table to indicate the commitment. In the preferred embodiment the primary transaction database table entries and the entry into the STP table are protected by the same transaction, thus alleviating potential race conditions.
By utilizing this separate table to track prior commitments of transactions, less reliable and yet significantly cheaper, commodity server hardware can be utilized at least as a front-end connected to the clients to reduce overall cost of the computer system. In certain embodiments the database server itself can be a commodity server with a commodity database instead of a mainframe or similar as in the prior art.
In normal operation it is possible that the response from the transaction front-end 104 to the client 100 after commitment of a transaction, i.e., after the write operation has actually occurred in the database 106, can be lost. In most cases where the response is not received, the client 100 will retry the request, which would then commit one more write operation to the database 106, thus producing duplicate entries. This is the condition which is to be avoided.
Part of the problem with this prior art stateful mainframe architecture is that as the number of clients, and thus transactions increases, the number of modules necessary in the mainframe 102 increases rather dramatically. As much of the capacity is being utilized to perform the communication operations by the transaction front-end module 104, this is not considered to be the most efficient use of the mainframe 102. Thus, to handle very large volumes of operations, very large costs must be incurred to maintain the stateful architecture. The question that arises is why not simply take the transaction front-end 104 out of the mainframe 102 to help reduce cost? According to the stateful protocols of the prior art, that does nothing more than provide one more potential for a response failure, i.e., between the database code module 105 and the transaction front-end module 104 if it is separated into a different unit. Thus it would only potentially exacerbate the problem as more states would need to be tracked, not solve the problem.
This scalability and reliability can be partially addressed by moving the transaction front end module 104 to a separate front end cluster 105 as shown in
A system according to the present invention is shown in
As described above, one of the major problems in a transaction system is the potential for double commitment of write transactions. In a system according to the present invention as shown in
If it is a write operation, control proceeds to step 508 where the request information in the transaction is hashed to provide a unique value. Preferably the request information includes the actual data which is to be placed into the database. This is preferably hashed into a 64 or 128 bit value to save space and provide a unique value representing the data. Control then proceeds to step 509, where the user information is similarly hashed. In the preferred embodiment the user information includes the user identification to allow user tracking, the table name or table names for which the operation or operations are being requested, and the particular columns in the table or tables which are being affected. If there are multiple tables or columns, each is provided as part of the task operation to provide a simple hash value. Similarly the request hash will be developed from each of the request values for each table and column. Again this is preferably hashed using various hashing techniques as desired into a 64 or 128 bit value. It is understood that other values could be utilized if desired, such that both uniqueness is maintained and storage values are optimized. After the hashing is performed in step 509, control proceeds to step 510 to inspect the STP table 113 to determine if the user hash value is already present in the STP table 113. This is done by the database code module 111 providing a query to the database 106 or 116. This type of operation is performed in all similar cases and is hereafter omitted for clarity. If the user hash value is present, control proceeds to step 512 to determine if the request hash is also present in the STP table 113. If the relevant hash is not present in step 510 or 512, control proceeds to step 514 where a test value is set to a false value.
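By way of illustration only, the hashing of steps 508 and 509 might be sketched as follows. The specification does not name a hash algorithm; BLAKE2b truncated to 128 bits is an assumption made here, as are the function names `hash_fields`, `user_hash` and `request_hash` and the sample values. A hash of this size makes collisions improbable rather than impossible.

```python
import hashlib

def hash_fields(fields, digest_bytes=16):
    """Hash a sequence of strings into one digest (16 bytes = 128 bits)."""
    h = hashlib.blake2b(digest_size=digest_bytes)
    for field in fields:
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()

def user_hash(user_id, tables, columns):
    # User identification plus the affected table and column names.
    # The table count keeps the table/column boundary unambiguous.
    return hash_fields([user_id, str(len(tables)), *tables, *columns])

def request_hash(values):
    # The actual data values to be written, one per affected column.
    return hash_fields(values)

# Hypothetical request: user "alice" writes item and qty into "orders".
u = user_hash("alice", ["orders"], ["item", "qty"])
r = request_hash(["widget", "3"])
```

Passing `digest_bytes=8` would give the 64 bit variant mentioned above, at a higher collision probability.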
If the request hash is present in step 512, where the user hash has previously been determined to be present, this is an indication that the transaction which is attempting to be committed has actually already been committed and should not be recommitted, i.e., it is a duplicate transaction request. Control proceeds to step 515 to retrieve the response from the prior committed operation from the STP table 113. In step 516 the test value is set to true.
After steps 514 or 516, control proceeds to step 518 to determine if a duplicate transaction has been determined. If so, control proceeds to step 520 where the transaction start bit which has been set is cleared and the response, which in this case has been retrieved from the STP table 113, is returned to the transaction front-end module 104, which then returns it to the client 100.
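The duplicate test of steps 510 through 518 can be sketched, under stated assumptions, as a single lookup against the STP table. The SQLite database, the `stp` table layout with one row per user hash, and the function name `check_duplicate` are hypothetical choices made for this sketch and are not taken from the specification.

```python
import sqlite3

def check_duplicate(conn, user_hash, request_hash):
    """Return (is_duplicate, stored_response) for a write request."""
    row = conn.execute(
        "SELECT request_hash, response FROM stp WHERE user_hash = ?",
        (user_hash,),
    ).fetchone()
    if row is None:
        return False, None      # user hash absent (step 510 -> 514)
    stored_request, response = row
    if stored_request != request_hash:
        return False, None      # request hash absent (step 512 -> 514)
    return True, response       # duplicate found (steps 515 and 516)

# Hypothetical STP table holding one previously committed request.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stp ("
    "user_hash BLOB PRIMARY KEY, request_hash BLOB, response TEXT)"
)
conn.execute(
    "INSERT INTO stp VALUES (?, ?, ?)", (b"u1", b"r1", "OK: order 17")
)

is_dup, response = check_duplicate(conn, b"u1", b"r1")
```

When `is_dup` is true, the stored response would be returned to the client in place of re-executing the write.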
If a duplicate transaction was not determined in step 518, control proceeds to step 522, where the transaction is actually provided to the database 106 or 116 and a response is received to indicate whether the operation by the database 106 or 116 has been successful. Control proceeds to step 524 to determine if the database 106 or 116 operation was successful. If so, then control proceeds to step 526 to determine if the user hash value is already present in the STP table 113. If so, control proceeds to step 528 where the request hash and response, which has been received from the database 106 or 116, are simply updated in the STP table 113. As the user hash is already present, the values stored there for the prior request hash and response need only be updated. However, if the user hash is not present, control proceeds to step 530 where the user hash value, the request hash value and the response are inserted into the STP table 113. After step 528 or 530 is completed, the response to the update or insert operation is evaluated in step 532. If the operation of providing the values to the STP table 113 was not successful, control proceeds to step 534 where the STP operation is rolled back. Control proceeds to step 536, which is also where control proceeds after an unsuccessful database operation in step 524. In step 536 the database operation itself is rolled back so that, in this case, neither the STP operation nor the database operation itself is ever committed. Control proceeds to step 538, where an error is returned through the transaction front-end module 104 to the client 100. Normally the transaction would then be retried. If it is retried, there is no entry in the STP table 113 and no duplicate because neither operation was committed, and therefore it would be a normal retry response situation.
If the providing of the information to the STP table was successful in step 532, control proceeds to step 540 where a commit request is provided for both the transaction value itself and for the STP table 113 values. These are encapsulated in a single transaction to the database 106 or 116 so that a race condition will not develop. Thus in step 540 the database 106 or 116 actually commits the transaction request values and the STP table 113 values to their respective tables. Control proceeds to step 520 where the transaction is completed and the start bit cleared and a positive response is returned back.
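As a minimal sketch of steps 518 through 540, and not a definitive implementation, the primary write and the STP entry can be placed in one database transaction so that both commit or both roll back. SQLite, the `orders` and `stp` table layouts, the `CHECK` constraint used to force a failure, and the names `make_db` and `stp_write` are all assumptions introduced for this example.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, "
        "qty INTEGER CHECK (qty > 0))"
    )
    conn.execute(
        "CREATE TABLE stp ("
        "user_hash BLOB PRIMARY KEY, request_hash BLOB, response TEXT)"
    )
    conn.commit()
    return conn

def stp_write(conn, user_hash, request_hash, item, qty):
    row = conn.execute(
        "SELECT request_hash, response FROM stp WHERE user_hash = ?",
        (user_hash,),
    ).fetchone()
    if row is not None and row[0] == request_hash:
        return row[1]                    # duplicate: reuse stored response
    try:
        # Step 522: the primary write (opens an implicit transaction).
        cur = conn.execute(
            "INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty)
        )
        response = f"OK: order {cur.lastrowid}"
        if row is not None:              # step 528: user hash present
            conn.execute(
                "UPDATE stp SET request_hash = ?, response = ? "
                "WHERE user_hash = ?",
                (request_hash, response, user_hash),
            )
        else:                            # step 530: user hash not present
            conn.execute(
                "INSERT INTO stp VALUES (?, ?, ?)",
                (user_hash, request_hash, response),
            )
        conn.commit()                    # step 540: both writes commit together
        return response
    except sqlite3.Error:
        conn.rollback()                  # steps 534 and 536: neither is committed
        raise                            # step 538: error goes back to the caller

conn = make_db()
first = stp_write(conn, b"user-1", b"req-1", "widget", 3)
retry = stp_write(conn, b"user-1", b"req-1", "widget", 3)  # response was "lost"
```

In this sketch the retried call returns the stored response and leaves a single row in `orders`, while a failed write leaves no STP entry behind, so a later retry is treated as a new request.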
Therefore, the STP table 113 is utilized to track the values of the last write transaction which was attempted by the particular user so that a double commitment operation cannot occur. It is considered adequate for most circumstances to track only a single transaction from a given user in the STP table 113, as generally two transactions will not be outstanding from a single client. However, if desired, a multiple entry table can be used, with least recently used replacement techniques or the like used to update the table values for a given user.
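The multiple entry alternative can be illustrated with a small in-memory model that keeps a bounded number of request hashes per user and evicts the least recently used one. The class name `MultiEntrySTP` and the `entries_per_user` parameter are hypothetical, and this model is held in memory purely for illustration; in the described system the entries would reside in the database and be covered by the same transaction as the primary write.

```python
from collections import OrderedDict

class MultiEntrySTP:
    """Per-user table of recent (request hash -> response) entries."""

    def __init__(self, entries_per_user=4):
        self.entries_per_user = entries_per_user
        self._table = {}  # user_hash -> OrderedDict of request_hash -> response

    def lookup(self, user_hash, request_hash):
        entries = self._table.get(user_hash)
        if entries is None or request_hash not in entries:
            return None
        entries.move_to_end(request_hash)    # mark as most recently used
        return entries[request_hash]

    def record(self, user_hash, request_hash, response):
        entries = self._table.setdefault(user_hash, OrderedDict())
        entries[request_hash] = response
        entries.move_to_end(request_hash)
        if len(entries) > self.entries_per_user:
            entries.popitem(last=False)      # evict least recently used

stp = MultiEntrySTP(entries_per_user=2)
stp.record(b"u1", b"r1", "OK: 1")
stp.record(b"u1", b"r2", "OK: 2")
stp.lookup(b"u1", b"r1")            # touch r1, so r2 is now the oldest
stp.record(b"u1", b"r3", "OK: 3")   # exceeds the limit and evicts r2
```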
Thus it can be seen that utilizing this process allows each transaction to be committed only once, with no double commit capability. Should the transaction actually be committed and a response then be lost anywhere in the system on the path back to the client, and the client then immediately retries it, the duplicate commitment is detected and the response is simply provided again without actually performing the full operation. This allows the transaction front-end, i.e., the component with the most scalability requirements, to be moved to commodity hardware without the need for clustering. Depending on other requirements, a mainframe or commodity server and related databases can be used in conjunction with the commodity hardware for the transaction front-end.
It will be understood from the foregoing description that modifications and changes may be made in various embodiments of the present invention without departing from its true spirit. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present invention is limited only by the language of the following claims.
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Ser. No. 60/706,334, entitled “Transaction Protection Using Commodity Servers” by Daniel B. Gray and Paul Busch, filed Aug. 8, 2005, which is hereby incorporated by reference.
Number | Date | Country
---|---|---
60706334 | Aug 2005 | US