The present invention relates to a method and a computer system for storing data, and more particularly, to a method for determining information integrity and a computer system using the same.
Traceability systems have been widely used in various industries to allow consumers to trace detailed information about products in a supply chain, a marketing system, a logistics system, etc. However, most traceability systems lack visualization of production information, so that consumers' concerns cannot be addressed and the authenticity of product information cannot be guaranteed.
Therefore, it is necessary to improve the prior art.
It is therefore a primary objective of the present application to provide a method and computer system for determining integrity of information of an object, to improve over disadvantages of the prior art.
An embodiment of the present invention discloses a method for determining integrity of information of an object, which comprises generating a first data corresponding to the information of the object and generating a first representative data corresponding to the first data; storing the first data to a database, and adding the first representative data into a first block of a blockchain; reading a second data from the database in response to a command for retrieving the first data from the database; using the first block to indicate whether the second data is identical to the first data.
An embodiment of the present invention further discloses a computer system, which comprises a processing unit; and a storage unit, storing information of an object and a program code, wherein the program code instructs the processing unit to execute the following steps: generating a first data corresponding to the information of the object and generating a first representative data corresponding to the first data; storing the first data to a database, and adding the first representative data into a first block of a blockchain; reading a second data from the database in response to a command for retrieving the first data from the database; using the first block to indicate whether the second data is identical to the first data.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
Certain terms are used throughout the description and following claims to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”.
In order to enhance reliability, “blockchain” is widely utilized in different applications in the prior art. A blockchain is typically managed by a peer-to-peer network collectively adhering to a protocol for inter-node communication and validating new blocks. Once recorded, the data in any given block cannot be altered retroactively without the alteration of all subsequent blocks, which requires consensus of the network majority.
However, the block size of the blockchain 10 is limited by the blockchain protocol, for example, 32 MB for Bitcoin Cash (BCH). If blockchain technology is applied to a traceability system to record information, the storage capacity of the traceability system may be limited, and the system may be unable to record multimedia data such as photos, audio, and videos. On the other hand, if multimedia data is split and stored across a plurality of blocks of a blockchain, the efficiency of accessing the multimedia data may be lowered.
To record large-size data as well as keep reliability and integrity of the recorded data, the present invention uses both a database and a blockchain. Note that, data to be recorded in the present invention may be information of an object, wherein the object may be a tangible article or goods, such as a used car in a used-car marketing system, an agricultural product in a traceability system, a cargo in an import/export cargo declaration system, and so on. Alternatively, the object may be an intangible or virtual estate, such as an ownership right of a property or a real estate.
Because objects differ, the information of the object may represent different statuses of the object. For example, if the object is the used car, the information thereof may vary when the used car is examined, sold, etc. If the object is the agricultural product, the information thereof may vary when the agricultural product undergoes different processes such as fertilization, pesticide application, packaging, and transportation in the agricultural product traceability system. If the object is the cargo in the import/export cargo declaration system, the information thereof may vary when the cargo is in different stages of the import/export cargo declaration system. If the object is the ownership right, the information thereof may vary when the ownership right is transferred, inherited, or repealed. To record the information of the object, especially when the information varies, large-size multimedia data may be required. For example, for the used car, the information of the object may be recorded in terms of photos of the used car, audio recordings of the car engine sound, etc. For the agricultural products, the information of the object may be recorded in terms of videos of the fertilization process, or documents from the government.
To prevent data in the database from being tampered with, the present invention further coordinates the database and the blockchain to record data corresponding to the information of the object and keep the information reliable.
In detail,
For example, if the system 20 is applied to a used-car marketing system and a used car therein is examined, meaning that the information of the object in the system 20 varies, the database 22 may store one or more photos corresponding to the examination result (which may be inputted by an operator), and the blockchain 24 may record representative data corresponding to the photos stored in the database 22. Therefore, the system 20 may record historical or transaction data of the used car via the database 22 and detect whether the recorded data has been modified via the blockchain 24, so as to enhance the credibility of the marketing system. In another example, if the system 20 is applied to an import/export cargo declaration system, the system 20 may store the changed statuses of a cargo therein, such as passing a certification or having its tax and duty paid. In yet another example, the system 20 may cooperate with IoT (Internet of Things)-based equipment to perform product monitoring. For example, the system 20, applied to an agricultural product traceability system, may store information of each process: when agricultural products are under fertilization, the system 20 may trigger IoT-based equipment such as a camera to capture videos or photos automatically, store the videos or photos in the database 22, and generate corresponding representative data for the blockchain 24.
As can be seen, the system 20 may store large-size data in the database 22 and store representative data corresponding to the data of the database 22 in the blockchain 24, such that the system 20 has enough space to store multimedia data while keeping reliability of the stored data based on the blockchain technology.
On the other hand, when the system 20 is applied to a DNA report certification system, an unauthorized user may not access the database 22 because the DNA reports are highly private and should be kept strictly confidential. However, the user may verify the integrity of a DNA report by comparing the representative data instead of touching the original data. For example, if a DNA report is stored in the data field 220 and the representative data is correspondingly stored in the block 240, then the user may retrieve only the representative data in the block 240, so that the data stored in the database 22 can be kept private.
Note that the representative data may be utilized for detecting whether the database 22 has been tampered with. In an embodiment, the representative data is generated when data is generated in or inputted to the database 22. For example, the representative data in the blocks 240, 242 and 244 of the blockchain 24 may be generated by duplicating data in the data fields 220, 222 and 224 of the database 22 (i.e., the blockchain 24 is taken as a backup of the database 22), or by performing compression on data in the data fields 220, 222 and 224 (i.e., the blockchain 24 records compression results of the database 22).
In another example, the representative data in the blocks 240, 242 and 244 of the blockchain 24 may be generated by calculating digests of data in the data fields 220, 222 and 224 through a mapping table or a public algorithm, so as to obtain hash values or the checksums corresponding to data in the data fields 220, 222 and 224.
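As an illustration of the digest-based representative data described above, the following is a minimal sketch only. It assumes SHA-256 as the public algorithm, and the helper name representative_digest and the sample field contents are hypothetical, not part of the described system.

```python
import hashlib


def representative_digest(field_data: bytes) -> str:
    """Return the SHA-256 hex digest of one data field (hypothetical helper)."""
    return hashlib.sha256(field_data).hexdigest()


# Hypothetical contents of the data field 220, e.g. the bytes of a photo.
field_220 = b"photo bytes of the examined used car"

# The block only needs to carry the fixed-size digest, not the large data.
block_240 = {"representative_data": representative_digest(field_220)}
```

The digest has a fixed length of 32 bytes (64 hexadecimal characters) regardless of the size of the data field, which is why it fits within a block even when the data field holds multimedia data.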
In addition, generating each of the representative data of the blocks 240, 242 and 244 may not be limited to calculation from a single data field in the database 22. For example, the representative data in the block 240 may include a Secure Hash Algorithm 256 (SHA-256) digest of the data in the data field 220; the representative data in the block 242 may include a SHA-256 digest of both the representative data in the inserted block 240 and data in the data field 222; and the representative data in the block 244 may include a SHA-256 digest of the representative data in the inserted blocks 240, 242 and data in the data field 224. In other words, the representative data may be calculated according to all the stored data in the database 22 and all the stored representative data in the blockchain 24.
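The cumulative calculation described above may be sketched as follows. This is an assumption-laden illustration: the function name cumulative_digests is hypothetical, and simple byte concatenation of the earlier representative data followed by the current data field is assumed as the way of combining the inputs, which the description does not specify.

```python
import hashlib
from typing import List


def cumulative_digests(fields: List[bytes]) -> List[bytes]:
    """For each data field, hash all earlier representative data plus the field.

    Assumption: inputs are combined by plain concatenation in block order.
    """
    digests: List[bytes] = []
    for data in fields:
        h = hashlib.sha256()
        for previous in digests:  # representative data of all inserted blocks
            h.update(previous)
        h.update(data)            # data of the current data field
        digests.append(h.digest())
    return digests
```

With three data fields, the first digest covers only the first field, the second covers the first digest and the second field, and the third covers the first two digests and the third field, mirroring the blocks 240, 242 and 244.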
On the other hand, in an embodiment, the representative data may be calculated according to one of the stored data and one of the stored digests. For example, the representative data in the block 240 may include the SHA-256 digest of the field 220; the representative data in the block 242 may include the SHA-256 digest of the representative data in the inserted block 240 and data in the data field 222; the representative data in the block 244 may include the SHA-256 digest of the representative data in the inserted block 242 and data in the data field 224.
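The variant in which each representative data depends only on the immediately preceding representative data and the current data field may be sketched as below. The function name chained_digests is hypothetical, and concatenation of the previous digest with the field data is an assumption made for illustration.

```python
import hashlib
from typing import List


def chained_digests(fields: List[bytes]) -> List[bytes]:
    """Hash each data field together with the previous representative data.

    Assumption: the previous digest is prepended to the field data; the first
    block has no predecessor, so it is the plain SHA-256 of the first field.
    """
    digests: List[bytes] = []
    previous = b""
    for data in fields:
        previous = hashlib.sha256(previous + data).digest()
        digests.append(previous)
    return digests
```

Because each digest feeds into the next, a change to one data field alters the recomputed digest of that field and of every later field, while earlier digests are unaffected.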
As can be seen, the representative data may be calculated according to at least one stored representative data in at least one confirmed block in the blockchain and/or at least one stored data in the database. Those skilled in the art may make modifications and alterations to the decision rule accordingly, which is not limited herein.
Due to the collision resistance of the hash function used for the blockchain 24, it is hard for an attacker of the system 20 to generate fake data with the same digest, and due to the avalanche effect thereof, the hash value of a data field may change dramatically even when the attacker modifies a single bit of the original data; that is, the attacker may not modify the data in the system 20 easily, and the data integrity can be guaranteed. On the other hand, the integrity of the system 20 may be verified easily by comparing the representative data of the last data field in the database 22 with that of the last block in the blockchain 24. By doing so, any attack may be detected even if an attacker adds fake data to the database and adds fake representative data corresponding to the fake data, because the relation exists not only between one data field and one block, but also among the combination of multiple data fields and blocks.
In short, the present invention stores data in the database, generates representative data corresponding to the stored data, and adds the representative data to the blocks of the blockchain. Therefore, once the blocks are confirmed, any user may read the data from the database and also retrieve the block containing the representative data corresponding to the data, so as to determine the data integrity.
For example, if the representative data in the blocks 240, 242 and 244 of the blockchain 24 are duplicated from the data fields 220, 222 and 224 respectively in the database 22, the user may detect the integrity of the data fields 220, 222 and 224 in the database 22 by retrieving the duplicated data from the confirmed blocks 240, 242 and 244, which cannot be altered. More specifically, since the representative data in the blockchain 24 is calculated from data in the database 22, the integrity of the data in the database 22 may be detected by comparing the representative data in the blockchain 24 with the data in the database 22. In other words, when a user reads data from the database 22, the user may also retrieve a block containing representative data corresponding to the data to be read. Accordingly, the user may determine whether the data being read is identical to the data stored before according to the relation between the representative data and the data.
Moreover, if the representative data in the block 240 of the blockchain 24 is the SHA-256 hash value of the data field 220 of the database 22, then due to the collision resistance and the avalanche effect, it is difficult for the attacker to tamper with the data field 220 in the database 22 without being detected. Therefore, the integrity of the data in the database 22 can be ensured.
In an embodiment, if the representative data is calculated according to the data field and at least one previously stored data field in the database 22, the database 22 may be verified, and the tampered data field may be located, by a binary search. More specifically, due to the avalanche effect, the representative data calculated from the database 22 will differ from the representative data in the blockchain 24 from the tampered data field onward; therefore, the integrity of the middle data field ensures the integrity of one half of the database 22, and the integrity of the middle element of the remaining half, which still needs to be verified, ensures another quarter of the database 22; by doing so, the data field tampered with by the attacker can be found easily.
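The binary search described above may be sketched as follows, under stated assumptions: the representative data are chained SHA-256 digests (each computed from the previous digest concatenated with the current data field), the database and the blockchain hold the same number of entries, and the names chained_digests and first_tampered_index are hypothetical. Recomputing the chain from the database is still linear in this sketch; the binary search only reduces the number of comparisons against blocks retrieved from the blockchain.

```python
import hashlib
from typing import List, Optional


def chained_digests(fields: List[bytes]) -> List[bytes]:
    """Chain each data field with the previous digest (assumed construction)."""
    digests: List[bytes] = []
    previous = b""
    for data in fields:
        previous = hashlib.sha256(previous + data).digest()
        digests.append(previous)
    return digests


def first_tampered_index(db_fields: List[bytes],
                         chain_digests: List[bytes]) -> Optional[int]:
    """Return the index of the first tampered data field, or None if intact.

    Relies on the mismatch being monotonic: every recomputed digest from the
    first tampered field onward differs from the one recorded in the chain.
    """
    recomputed = chained_digests(db_fields)
    lo, hi = 0, len(recomputed)
    while lo < hi:
        mid = (lo + hi) // 2
        if recomputed[mid] == chain_digests[mid]:
            lo = mid + 1   # fields up to and including mid are intact
        else:
            hi = mid       # first mismatch is at mid or earlier
    return lo if lo < len(recomputed) else None
```

For instance, with eight data fields of which the sixth has been altered, the search compares only three blocks before reporting index 5.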
The above operations can be summarized in an integrity detecting process 30, as shown in
Step 300: Start.
Step 302: Generating a first data corresponding to the information of the object and generating a first representative data corresponding to the first data.
Step 304: Storing the first data to the database 22, and adding the first representative data into a first block of the blockchain 24.
Step 306: Reading a second data from the database 22 in response to a command for retrieving the first data from the database 22.
Step 308: Using the first block to indicate whether the second data is identical to the first data.
Step 310: End.
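Steps 302 to 308 may be sketched as follows. This is a minimal illustration only: an in-memory dictionary stands in for the database 22, an append-only list stands in for the blockchain 24, SHA-256 is assumed as the representative data, and the names record and read_and_verify are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

database: Dict[str, bytes] = {}   # stand-in for the database 22
blockchain: List[dict] = []       # stand-in for the blockchain 24 (append-only)


def record(key: str, first_data: bytes) -> int:
    """Steps 302 and 304: generate representative data, store data and block."""
    representative = hashlib.sha256(first_data).hexdigest()
    database[key] = first_data
    blockchain.append({"key": key, "representative_data": representative})
    return len(blockchain) - 1    # index of the first block


def read_and_verify(key: str, block_index: int) -> Tuple[bytes, bool]:
    """Steps 306 and 308: read the second data and check it against the block."""
    second_data = database[key]
    block = blockchain[block_index]
    identical = (hashlib.sha256(second_data).hexdigest()
                 == block["representative_data"])
    return second_data, identical
```

If the stored data is later overwritten in the database, the second data no longer matches the digest in the first block, and the returned flag indicates that it is not identical to the first data.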
With the integrity detecting process 30, the information of the object is recorded as data in the data fields 220-224 of the database 22, and the system 20 accordingly generates representative data to form the blocks 240-244 of the blockchain 24. When a user tries to read specified data from the database 22, the user may send a command for retrieving the specified data. Since the blockchain 24 records representative data corresponding to the data stored in the database 22, and the recorded representative data cannot be altered, the user may check the relation between the retrieved data and the corresponding representative data to determine whether the retrieved data is identical to the specified data intended to be read, so as to enhance data credibility and prevent the user from receiving tampered information. Furthermore, if the representative data corresponding to the data field is lossless, e.g., generated by duplicating data of the database 22 or by performing lossless compression on data of the database 22, the system 20 may perfectly reconstruct the original data according to the lossless representative data, or recover the database 22 when the integrity thereof is damaged.
For example, if the representative data in the blocks 240-244 of the blockchain 24 are generated by duplicating data in the data fields 220-224 of the database 22, the database 22 may be recovered from the blockchain 24 by copying the representative data from the blockchain 24 to the database 22. By the same token, if the representative data in the blocks 240-244 of the blockchain 24 are generated by performing compression to data in the data fields 220-224 of the database 22, the database 22 may be recovered from the blockchain 24 by decompression for the blocks 240, 242 and 244 when the integrity of the database is damaged.
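Recovery from compressed representative data may be sketched as below. The sketch assumes zlib (a lossless compressor from the Python standard library) as the compression method, which the description does not specify, and the names make_block and recover_database are hypothetical.

```python
import zlib
from typing import List


def make_block(field_data: bytes) -> dict:
    """Build a block whose representative data is the compressed data field."""
    return {"representative_data": zlib.compress(field_data)}


def recover_database(blocks: List[dict]) -> List[bytes]:
    """Rebuild the data fields by decompressing each block in order."""
    return [zlib.decompress(block["representative_data"]) for block in blocks]
```

Because the compression is lossless, the recovered data fields are byte-for-byte identical to the data originally stored.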
Notably, the system 20 is an embodiment of the present invention, and those skilled in the art may make modifications and alterations. For example, the database 22 may include any number of data fields, which is not limited to three, and similarly, the number of blocks in the blockchain 24 may be any number and is not limited to three. In addition, the data fields may be stored in multiple computers located in the same physical location, or may be dispersed over a network of interconnected computers. In an embodiment, the data fields may be generated by manual input of an operator, generated automatically by a machine, or triggered by a smart contract in a blockchain.
In addition, the blockchain 24 may contain a plurality of side chains, which may expand the original main chain, add new functions such as transaction privacy protection technology and smart contracts, and place some high-frequency transactions or customized transactions outside the main chain. Moreover, the Plasma protocol may be applied so that the calculation is performed on the side chains (plasma chains) while the result is stored on the main chain.
Furthermore, in order to improve the privacy protection of both parties to a transaction, the ring signature confidential transaction, usually referred to as RingCT, may be applied to hide transactions. Those skilled in the art may make modifications and alterations accordingly, which are not limited thereto. For example, to avoid data being invalidated by being stored in orphan blocks, the confirmation number may be set to 20.
Notably, the number of representative data in a block corresponding to a data field is not limited to one. For example, the block 240 may contain the digest of the data field 220 and the compression result of the data field 220, or other information corresponding to the data field 220, such as the serial number, the storing time, and the submitter. Therefore, the integrity of the database may be guaranteed by the digest stored in the block 240, and the database may be recovered from the compression result stored in the block 240 when the integrity of the database is damaged.
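A block carrying several items for one data field may be sketched as follows. The layout of the Block record, the use of SHA-256 and zlib, and the names build_block and verify_or_recover are assumptions for illustration rather than the described implementation.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical block layout: digest, compression result, and metadata."""
    digest: str
    compressed: bytes
    serial_number: int
    storing_time: float
    submitter: str


def build_block(data: bytes, serial_number: int,
                storing_time: float, submitter: str) -> Block:
    return Block(hashlib.sha256(data).hexdigest(), zlib.compress(data),
                 serial_number, storing_time, submitter)


def verify_or_recover(stored: bytes, block: Block) -> bytes:
    """Return the stored data if its digest matches, else the recovered copy."""
    if hashlib.sha256(stored).hexdigest() == block.digest:
        return stored
    return zlib.decompress(block.compressed)
```

Here the digest serves the integrity check and the compression result serves the recovery, so a single block covers both purposes for its data field.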
In addition, methods for calculating the digest are well known in the art. In an embodiment, the digest may be calculated according to the recorder's identity. For example, if the system 20 is applied to a traceability system, each undertaker in each step of the import declaration process should record corresponding statuses into the system 20. To verify the undertaker's identity, the representative data may be calculated according to the data field 224 by a mapping table, a public algorithm, or a public algorithm protected with a private key, such as the message authentication code function HMAC-SHA256. Therefore, the representative data of the same data would vary with the undertaker who inputs it. Methods of identity verification, such as a hash function, a message authentication code function, a key derivation function, a keyed-hash message authentication code function, or a symmetric-key block cipher in an authenticated encryption mode, are known to those skilled in the art and are not narrated herein for brevity.
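Identity-dependent representative data may be sketched with HMAC-SHA256 from the Python standard library. The key values and the names keyed_representative and verify_undertaker are hypothetical; key distribution and storage are outside the scope of this sketch.

```python
import hashlib
import hmac


def keyed_representative(private_key: bytes, field_data: bytes) -> str:
    """HMAC-SHA256 of a data field under the undertaker's private key."""
    return hmac.new(private_key, field_data, hashlib.sha256).hexdigest()


def verify_undertaker(private_key: bytes, field_data: bytes,
                      recorded: str) -> bool:
    """Check that the recorded value was produced with this key and data."""
    expected = keyed_representative(private_key, field_data)
    return hmac.compare_digest(expected, recorded)  # constant-time comparison
```

The same data field yields different representative data under different keys, so a recorded value can be matched only by the undertaker whose key produced it.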
In an embodiment, if the system 20 is applied to a marketing system, the representative data may further comprise a pointer ID of products in the marketing system. For example, the statistics of the products, such as the dimension, weight, height, and price, may be stored in the database 22, and the blocks 240 and 242 may correspond to the products with pointer IDs 0000 and 0002 respectively. If the seller wants to modify the statistics of the product with the pointer ID 0000, the seller may insert new data with the pointer ID 0000, because the confirmed block 240 is unalterable. For example, in a used-car marketing system, when a blue used car with the pointer ID 0000 is wrongly recorded as a red one in the data field 220 because of a manual input error, and the representative data is correspondingly stored in the block 240, the undertaker may establish a version history by recording the correct data with the same pointer ID in the data field 222; the representative data is then generated correspondingly to indicate that the stored representative data needs to be modified, replaced, or appended to, since the block 240 is unalterable once confirmed. In addition, any user may analyze the classified information about the whole marketing system by the pointer IDs of the products; for example, the user may use a certain pointer ID to retrieve the latest representative data, or the user may retrieve the differences among the versions of the information of the object recorded in the database with the same pointer ID.
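The version history keyed by pointer ID may be sketched as follows. The in-memory lists standing in for the database 22 and the blockchain 24, the record layout, and the names append_version, versions and latest are all hypothetical, and SHA-256 is assumed as the representative data.

```python
import hashlib
from typing import List, Optional

database: List[bytes] = []   # stand-in for the database 22 (append-only here)
blockchain: List[dict] = []  # stand-in for the blockchain 24


def append_version(pointer_id: str, data: bytes) -> None:
    """Store a new version; earlier blocks are never modified."""
    database.append(data)
    blockchain.append({
        "pointer_id": pointer_id,
        "field_index": len(database) - 1,
        "representative_data": hashlib.sha256(data).hexdigest(),
    })


def versions(pointer_id: str) -> List[bytes]:
    """All recorded versions for a pointer ID, oldest first."""
    return [database[b["field_index"]] for b in blockchain
            if b["pointer_id"] == pointer_id]


def latest(pointer_id: str) -> Optional[bytes]:
    """The most recent version for a pointer ID, or None if unknown."""
    history = versions(pointer_id)
    return history[-1] if history else None


# The wrongly recorded colour is corrected by appending, not by overwriting.
append_version("0000", b"color=red")
append_version("0002", b"color=green")
append_version("0000", b"color=blue")
```

After these three insertions, the history for the pointer ID 0000 holds the erroneous record followed by the correction, and the latest version is the corrected one.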
Note that, the system 20 and/or the integrity detecting process 30 may be implemented and/or executed by a computer system. For example,
Notably, the embodiments stated above are utilized for illustrating the concept of the present application. There may be a plurality of data fields corresponding to the same block; for example, the information of the object may be recorded in the data fields 220 and 222 of the database 22, and the system 20 accordingly generates representative data and adds it to two transactions in the same block 240 of the blockchain 24. Those skilled in the art may make modifications and alterations accordingly to fit the practical scenario, which is not limited herein. For example, the system 20 may be applied to a DNA report storage system, a supply chain finance system, a real estate marketing system, and a traceability system with/without IoT-based product monitoring. Therefore, as long as a database cooperates with a blockchain to store, back up, and recover the data thereof with data integrity guaranteed, the requirements of the present application are satisfied and the arrangement is within the scope of the present application.
In summary, the present invention provides a method for determining integrity of information of an object, applied to a database and a blockchain by establishing a relation between the database and the blockchain. By storing the data in the blockchain corresponding to the data stored in the database, the data integrity may be ensured, the version history may be easily retrieved, the data may be backed up and restored, and the integrity of the database may be verified.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
The present application claims the benefit of U.S. provisional application No. 62/850,961, filed on May 21, 2019, which is incorporated herein by reference.
Number | Date | Country
---|---|---
62850961 | May 2019 | US