Embodiments of the present principles generally relate to smart contracts on blockchains and more specifically to methods, apparatuses and systems for verifying smart contracts on blockchains.
A blockchain is a distributed ledger in which each entry is cryptographically linked to the previous entry. A blockchain (BC) deployment consists of several servers running the BC software stack. The use of a distributed Byzantine-fault-tolerant consensus ensures integrity, authenticity, and resilience of the blockchain and the data stored on it.
A smart contract is a computer protocol intended to digitally facilitate, verify, or enforce the negotiation or performance of predefined transactions, such as the execution of the terms of a contract. Smart contracts enable the performance of credible transactions without the need for third parties. Although the term “smart contract” implies the self-execution of a contract, the term is used more specifically in the sense of general-purpose computation, that is, any kind of computer program, that takes place on a blockchain.
Because smart contracts are self-executing, there is a need to be able to automatically verify smart contracts that execute on blockchains.
Embodiments of methods, apparatuses and systems for automated verification of a smart contract on a blockchain are disclosed herein.
In some embodiments a method for automated verification of a smart contract on a blockchain includes translating operating properties of a smart contract annotated with contract specifications at a source code level into verification conditions in an intermediate verification language, discharging the verification conditions using an SMT solver, and reporting results of the discharged verification conditions. In some embodiments, the translating can include mapping statements of the smart contract to statements of the intermediate verification language and mapping expressions of the smart contract to expressions of the intermediate verification language. In some specific embodiments, the translating can include mapping state variables of the smart contract to global variables of the intermediate verification language and mapping functions of the smart contract to procedures of the intermediate verification language.
In some embodiments, an apparatus for automated verification of a smart contract on a blockchain includes a processor and a memory coupled to the processor. The memory of the processor includes stored therein at least one of programs or instructions executable by the processor to configure the apparatus to translate operating properties of a smart contract annotated with contract specifications at a source code level into verification conditions in an intermediate verification language, discharge the verification conditions using an SMT solver, and report results of the discharged verification conditions. In some embodiments, the translating can include mapping statements of the smart contract to statements of the intermediate verification language and mapping expressions of the smart contract to expressions of the intermediate verification language. In some specific embodiments, the translating can include mapping state variables of the smart contract to global variables of the intermediate verification language and mapping functions of the smart contract to procedures of the intermediate verification language.
In some embodiments, a system for automated verification of a smart contract on a blockchain includes a plurality of servers connected via a permissioned blockchain, a local network to provide a blockchain operating protocol for the plurality of servers, and an apparatus including a processor and a memory coupled to the processor. The memory of the processor includes stored therein at least one of programs or instructions executable by the processor to configure the apparatus to translate operating properties of a smart contract annotated with contract specifications at a source code level into verification conditions in an intermediate verification language, discharge the verification conditions using an SMT solver, and report results of the discharged verification conditions. In some embodiments, the translating can include mapping statements of the smart contract to statements of the intermediate verification language and mapping expressions of the smart contract to expressions of the intermediate verification language. In some specific embodiments, the translating can include mapping state variables of the smart contract to global variables of the intermediate verification language and mapping functions of the smart contract to procedures of the intermediate verification language.
Other and further embodiments of the present principles are described below.
Embodiments of the present principles, briefly summarized above and discussed in greater detail below, can be understood by reference to the illustrative embodiments of the principles depicted in the appended drawings. However, the appended drawings illustrate only typical embodiments of the present principles and are therefore not to be considered limiting of scope, for the present principles may admit to other equally effective embodiments.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. The figures are not drawn to scale and may be simplified for clarity. Elements and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of exemplary embodiments or other examples described herein. However, these embodiments and examples may be practiced without the specific details. In other instances, well-known methods, procedures, components, and/or circuits have not been described in detail, so as not to obscure the following description. Further, the embodiments disclosed are for exemplary purposes only and other embodiments may be employed in lieu of, or in combination with, the embodiments disclosed. For example, although embodiments of the present principles are described with respect to specific programming and verification languages, embodiments of the present principles can be implemented using other programming and verification languages.
Embodiments in accordance with the present principles provide methods, apparatuses and systems for automated verification of a smart contract on a blockchain. In various embodiments in accordance with the present principles, the operating properties of a smart contract annotated with contract specifications are translated into verification conditions at a source code level in an intermediate verification language. The verification conditions can then be discharged using an SMT solver. In some embodiments, the operating properties of the smart contract are translated by mapping statements of the smart contract to statements of the intermediate verification language and mapping expressions of the smart contract to expressions of the intermediate verification language. In some specific embodiments, the operating properties of the smart contract are translated by mapping state variables of the smart contract to global variables of the intermediate verification language and by mapping functions of the smart contract to procedures of the intermediate verification language. Results of the discharged verification conditions, such as successes and failures of the discharged verification conditions, can be reported.
In one embodiment in accordance with the present principles, an Ethereum computing platform is implemented in the distributed computing environment 100 of
For the purposes of illustration,
The example SimpleBank contract illustratively defines two public functions, deposit and withdraw. The deposit function is marked as public and payable, meaning that the function can be called by anyone and is allowed to receive Ether as part of the call. This function reads the amount of Ether received from msg.value and adds the amount to the balance of the caller, whose address is available in msg.sender. The withdraw function allows users to withdraw a part of their bank balance. The function first checks that the sender's balance in the bank is sufficient using a require statement. If the condition of require fails, the transaction is reverted with no effect. Otherwise the function sends the required amount of Ether funds by using a call on the caller address with no arguments (i.e., denoted by an empty string). The amount to be transferred is set with the value function. The recipient of the call can be another contract that can perform an arbitrary action on its own (within the gas limits) and can also fail (indicating the failure in the return value). If the call fails, the entire transaction is reverted with an explicit revert, otherwise the balance of the caller is decreased in the mapping as well.
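By way of a non-limiting illustration, the behavior described above can be approximated by the following minimal Python sketch. It is a model for explanation only, not the Solidity source of the contract; the names SimpleBankModel, Revert, ether and recipient are hypothetical, and the external call is modeled as a callable that returns whether the call succeeded.

```python
class Revert(Exception):
    """Models a reverted transaction: all state changes are undone."""


class SimpleBankModel:
    """Hypothetical Python model of the SimpleBank behavior described above."""

    def __init__(self):
        self.balances = {}  # user address -> balance recorded by the bank
        self.ether = 0      # Ether held by the contract itself

    def deposit(self, sender, value):
        # payable: the Ether sent with the call is credited to the caller
        self.ether += value
        self.balances[sender] = self.balances.get(sender, 0) + value

    def withdraw(self, sender, amount, recipient=lambda amount: True):
        snapshot = (dict(self.balances), self.ether)
        try:
            if self.balances.get(sender, 0) < amount:  # require(...)
                raise Revert("insufficient balance")
            self.ether -= amount                       # value sent with the call
            if not recipient(amount):                  # external call may fail
                raise Revert("call failed")            # explicit revert
            self.balances[sender] = self.balances.get(sender, 0) - amount
        except Revert:
            self.balances, self.ether = snapshot       # undo all changes
            raise
```

In this sketch, as in the description, the recorded balance of the caller is decreased only after the external call returns successfully.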
In the example, SimpleBank, depicted in
The inventors propose herein an automatic, precise and scalable approach for formal verification of smart contracts, such as Ethereum smart contracts that can identify failures and vulnerabilities of smart contracts, such as the SimpleBank contract illustrated in
In some embodiments in accordance with the present principles, a modular software verification approach (e.g., VCC, HAVOC, and ESC/Java) is applied to Solidity smart contracts. That is, in some embodiments in accordance with the present principles, modular program verification is implemented for efficient reasoning about composite programs built up from smaller modules, such as classes, interfaces, objects and procedures. Modular verification usually includes a specification language and a program logic. The specification language enables formal specification of the modules with various kinds of annotations, including class and object invariants, loop invariants, and function pre- and post-conditions. A purpose of the program logic is to check, for each module independently, whether the module satisfies its specification, assuming that the specifications of the related modules hold.
Modular verification in the domain of smart contracts, however, brings domain-specific challenges, such as that the semantics of the Solidity language include Ethereum-specific constructs such as the blockchain state, transactions, and data-types not common in general programming languages. In general, such challenges are addressed by developing a general SMT-friendly encoding of Solidity into Boogie that is expressive enough to capture the properties of interest, and takes advantage of SMT solving to enable effective reasoning about those properties. Specific solutions to such challenges are described in detail below. Embodiments of the present principles enable the identification of non-trivial bugs in contracts. After the identified bugs are corrected, the correctness of the contracts can be verified using embodiments of the present principles.
Embodiments of the present principles work at the level of the source code, which enables the extension of the specification language with domain-specific properties that are crucial for describing the contract functionality but otherwise not possible to express. For example, a large portion of Ethereum smart contracts manage balances of users with respect to some asset. It is often natural and desirable to express (as a contract-level invariant) that the sum of the individual balances should be equal to the total supply. One example is the contract invariant of SimpleBank in
In some embodiments in accordance with the present principles, Solidity contracts can be translated into an intermediate verification language, for example, the Boogie IVL or another language such as Why3. For example, in some embodiments in accordance with the present principles, a collection of contracts to be verified can be transformed into Boogie, and the output is a single Boogie program including all of the contracts. It should be noted that, although Solidity allows a form of inheritance, the result of inheritance is always a single “flattened” contract. As the flattening and virtual-call disambiguation are done by the Solidity compiler, the contract is assumed, without loss of generality, to have no inheritance, and the focus is on the case of a single contract. In such an embodiment, the basic idea of the translation is to map contract state variables to Boogie global variables and contract functions to Boogie procedures.
The Solidity language offers a variety of types, most of them common in programming languages, which are easily translated to Boogie types. In some embodiments, Booleans are simply mapped to the Boolean type of Boogie. Solidity integers can be either signed or unsigned and can be of different bit-widths (8, 16, 24, . . . , 256 bits). In contrast, Boogie has mathematical (unbounded, signed) integers. In one embodiment, a simple encoding includes mapping any Solidity integer to the mathematical integer of Boogie. This might lead to imprecise analysis, so more precise encodings are also provided: an encoding that relies on SMT bitvectors, and a pure arithmetic encoding that relies on modular arithmetic (described in greater detail below).
Addresses in Solidity are represented with 160-bit integers. However, as there is no arithmetic or comparison (beyond equality) allowed, in some embodiments in accordance with the present principles, the addresses are mapped to a predefined, un-interpreted address type. Solidity map types are modeled directly as SMT arrays. Boogie does not have a native array type so, in some embodiments, Solidity array types are translated to a pair of an integer length and an SMT array from integers to their element type. Contract reference types are simply represented by addresses. Type checking is already performed by the compiler so only compatible types can be passed around (e.g., as arguments).
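As a non-limiting illustration of the array representation described above, the following Python sketch models an array as a pair of an integer length and a map from integer indices to elements. The class name EncodedArray and its methods are hypothetical; the bounds check reflects the fact that Solidity raises an exception on out-of-range array access, and how such a check is emitted in a particular embodiment is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class EncodedArray:
    """Hypothetical model: an array as a length plus an integer-indexed map."""
    length: int = 0
    elems: dict = field(default_factory=dict)  # index -> element

    def push(self, value):
        self.elems[self.length] = value
        self.length += 1

    def get(self, index):
        # Out-of-range access fails, as it does in Solidity.
        if not 0 <= index < self.length:
            raise IndexError("index out of range")
        return self.elems[index]
```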
In some embodiments, state variables of a Solidity contract are mapped to global variables in Boogie. However, multiple instances of a contract can be deployed to the blockchain at different addresses. Since aliasing is not possible, each state variable is modeled as a one-dimensional global mapping from contract addresses to their respective type (i.e., similar to treating the blockchain as a heap in a Burstall-Bornat model). Visibility specifiers (e.g., public, private) are enforced by the compiler so there is no need to treat them in any special way for translating.
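The per-instance modeling of state variables can be pictured with the following minimal Python sketch, in which a single global dictionary stands in for one state variable across all deployed instances. The variable stored_data and the functions set_value and get_value belong to a hypothetical contract introduced only for illustration; the address of the instance is passed explicitly as this_addr.

```python
# One global map per state variable, indexed by the address of the contract instance.
# `stored_data` models a state variable of a hypothetical contract.
stored_data = {}


def set_value(this_addr, x):
    """Models a contract function; `this_addr` selects the instance being called."""
    stored_data[this_addr] = x


def get_value(this_addr):
    # Uninitialized storage reads as zero in Solidity.
    return stored_data.get(this_addr, 0)
```

Two instances deployed at different addresses therefore read and write disjoint entries of the same global map.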
In some embodiments in accordance with the present principles, a function in Solidity is translated to a procedure in Boogie with the same parameters and return value, and an additional implicit receiver parameter called _this, which identifies the address of the contract instance. As an example, consider the set function of the Solidity contract in
In some embodiments, functions can be associated with a visibility (e.g., public, private) and can be declared view (cannot write state) or pure (cannot read or write state). These restrictions are checked by the compiler so they do not need to be treated in a transformation. Additional user-defined function modifiers are a language feature of Solidity to alter or extend the behavior of functions. In practice, modifiers are commonly used to weave in extra checks and instructions to functions (similarly to aspect-oriented programming). For example,
In various embodiments in accordance with the present principles, most of the Solidity statements and expressions are directly mapped to a corresponding statement or expression in Boogie with the same semantics, including variable declarations, conditionals, while loops, calls, returns, indexing, unary/binary operations and literals. There are also some statements and expressions that require a simple transformation, such as mapping “for loops” to “while loops” or extracting nested calls and assignments within expressions to separate statements using fresh temporary variables. The availability of some arithmetic expressions depends on the expressiveness of the underlying domain (e.g., bitwise operations).
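One of the simple transformations mentioned above, the mapping of a “for loop” to a “while loop”, can be sketched in Python over a toy statement representation. The tuple-based representation and the function name desugar_for are assumptions made for this illustration only, and the sketch assumes that the loop body contains no continue statement, which would otherwise skip the step.

```python
def desugar_for(stmt):
    """Rewrite ("for", init, cond, step, body) as [init, ("while", cond, body + [step])].

    Assumes the body contains no `continue`, which would otherwise skip `step`.
    """
    kind, init, cond, step, body = stmt
    if kind != "for":
        raise ValueError("expected a for statement")
    return [init, ("while", cond, list(body) + [step])]
```

For example, a loop with initializer `i = 0`, condition `i < n` and step `i = i + 1` becomes the initializer followed by a while loop whose body ends with the step.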
Solidity includes domain-specific functions and variables to query and manipulate Ethereum balances and transactions. Some examples can be seen in
Solidity functions marked with the payable keyword, as depicted line 8 of
The functions send and transfer are dedicated functions to transfer Ether between addresses. The subtle difference between the two is that if transfer fails, the failure is propagated, whereas send indicates it with its return value. In some embodiments, these functions are inlined by manipulating the global balances mapping directly. For example, the transfer in line 12 of
More specifically, similar to object and class invariants, a contract invariant is a constraint over the state variables of the contract that expresses the consistency of the contract state. These constraints must hold at any point after the contract has been deployed and can be called. In order to ensure this, a contract invariant must hold after the contract constructor, after any public function, and before any call to external contracts. In some embodiments, a contract invariant can be any side-effect-free Boolean expression having the same scope as the contract in question (e.g., state variables and this.balance can be referenced). Contract invariants are written with specific top-level annotations in the contract's code. During verification, each contract-level invariant can be checked as a post-condition of the constructor, as a pre- and post-condition of every public function, and as an assertion before every external call.
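The points at which a contract invariant is checked can be illustrated with a runtime analogue in Python. The sketch below is not a static verifier: it merely evaluates the invariant after the constructor and on entry to and exit from each public function. The Ledger class, its fields, the public decorator and the invariant itself (recorded balances sum to a stored total) are hypothetical.

```python
import functools


def invariant(contract):
    # Hypothetical contract-level invariant: recorded balances sum to the total.
    return sum(contract.balances.values()) == contract.total


def public(fn):
    """Check the invariant on entry to and on exit from a public function."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        assert invariant(self), "invariant does not hold on entry"
        result = fn(self, *args, **kwargs)
        assert invariant(self), "invariant does not hold on exit"
        return result
    return wrapper


class Ledger:
    def __init__(self):
        self.balances = {}
        self.total = 0
        assert invariant(self), "invariant does not hold after constructor"

    @public
    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total += amount

    @public
    def buggy_deposit(self, user, amount):
        # Forgets to update `total`, so the exit check fails for nonzero amounts.
        self.balances[user] = self.balances.get(user, 0) + amount
```

A verifier establishes the same conditions for all possible inputs, whereas this sketch only checks the executions that are actually run.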
In order to be able to prove properties of contracts that include loops, loops can be annotated with invariants. Similarly to contract-level invariants, annotations are provided to express invariants over for or while loops. These annotations can access the contract state, variables and parameters of their enclosing function, and the loop counter. In general loop invariants can be complex and difficult to write by developers. However, due to the Ethereum execution fees, loops in Solidity contracts tend to be simple and to have a constant bound. For such loops, developers can specify invariants easily.
In smart contracts, Solidity exceptions undo all changes made to the global state by the current call (and all of its sub-calls) and flag an error to the caller. In such instances, a distinction is made between expected and unexpected failures. Unexpected failures, such as those raised by assert, are mapped to assertions in Boogie, which are checked by the verifier. In contrast, expected failures, such as those raised by require, revert and throw, are mapped to assumptions, causing the verifier to stop exploring the failing path without reporting an error.
More specifically, functional correctness of Solidity contracts is targeted with respect to completed transactions and different types of failures. Ethereum transactions can either complete successfully or fail due to a runtime exception. For the purposes of verification, two categories of transaction failures are distinguished. An expected failure is a failure due to an exception deliberately thrown by the developer to guard against invalid use. An unexpected failure is any other failure. Examples of expected failures include exceptions thrown with the require statement (used commonly to check function arguments), manually thrown exceptions, and exceptions resulting from running out of gas. Unexpected failures are, for example, exceptions triggered by assert statements. A contract is considered to be correct if all contract transactions (public function calls) that do not fail due to an expected failure also do not fail due to an unexpected failure and satisfy the explicit contract specification. Note that, although it is common to distinguish between partial and total correctness of programs, in the context of Ethereum, the execution fee (gas) ensures termination, making the two concepts equivalent.
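The distinction between expected and unexpected failures can be sketched in Python as follows. The sketch enumerates a finite set of inputs instead of reasoning symbolically, so it only approximates what a verifier does; the names ExpectedFailure, require, check, halve and faulty are hypothetical.

```python
class ExpectedFailure(Exception):
    """Models require/revert: the transaction is rolled back by design."""


def require(condition):
    if not condition:
        raise ExpectedFailure()


def check(function, inputs):
    """Return the inputs on which `function` fails unexpectedly."""
    violations = []
    for x in inputs:
        try:
            function(x)
        except ExpectedFailure:
            continue              # treated like a failed assumption: not reported
        except AssertionError:
            violations.append(x)  # treated like a failed assertion: reported
    return violations


def halve(x):
    require(x % 2 == 0)
    y = x // 2
    assert 2 * y == x


def faulty(x):
    require(x >= 0)
    assert x != 3
```

Under the correctness notion above, halve is correct on the inputs examined because every execution that passes its require also passes its assert, whereas faulty is not, because the input 3 passes the guard and then violates the assertion.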
Solidity provides only a few error handling constructs (e.g., assert, require) for the programmer to specify expected behavior. Therefore, in various embodiments, in-code annotations are used to specify contract properties. With the exception of domain-specific extensions, these annotations follow Solidity expression syntax and typing, making it easy for developers to write and understand the specification.
As previously alluded to, integers in Solidity can be signed or unsigned and can be 8, 16, 24, . . . , 256 bits long. Operations over integers in typical contracts are mostly mathematical (addition, subtraction, etc.), but Solidity also supports bit-wise operations that are used in real-world contracts to a lesser extent. Depending on the complexity of the operations a contract uses, reasoning about integers of such large bit-widths can be challenging. By default, Boogie treats the integer type as unbounded mathematical integers. This representation allows scalable reasoning with SMT solvers, especially in the case where the constraints are linear, with much progress being made in recent years on also solving non-linear constraints. As such, in some embodiments, all Solidity integer types can be encoded as unbounded integers. A caveat of this encoding is that it does not support bit-precise operations, and that the types and operations are not sound for representing the semantics of Solidity integers (e.g., operations do not overflow). Therefore, verification results should be treated with extreme caution in this case, as the encoding can produce both false alarms and unsound proofs. For example, unsigned integers are guaranteed to be non-negative in Solidity, but the mathematical integers can be negative, causing a false alarm. However, if the contract does not include any bit-wise operations, and the programmer is confident that no arithmetic operation goes out of range (e.g., by manually checking ranges or using a library), this encoding can provide good results.
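The risk of an unsound proof under the unbounded encoding can be seen in a small Python sketch. The property that x + 1 is greater than x holds for mathematical integers but fails for an 8-bit unsigned integer at its maximum value. The function names are hypothetical, and the 8-bit width is chosen only to keep the example small.

```python
def add_unbounded(a, b):
    return a + b          # mathematical integers: never overflows


def add_uint8(a, b):
    return (a + b) % 256  # 8-bit unsigned wraparound


def holds_everywhere(add):
    # Property under test: x + 1 > x for every 8-bit unsigned x.
    return all(add(x, 1) > x for x in range(256))
```

Here holds_everywhere(add_unbounded) evaluates to true while holds_everywhere(add_uint8) evaluates to false, so a proof carried out over unbounded integers would not transfer to the wraparound semantics.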
In order to support exact semantics for Solidity arithmetic, in some embodiments, an encoding that uses the SMT theory of bitvectors can be used to model the integer types and operations over them. Such encoding enables the translation of almost all Solidity operations to SMT in a fairly straightforward manner.
To strike a balance between precision and scalability, the inventors developed an integer encoding scheme that models Solidity integers as unbounded integers in Boogie, but adds constraints to model the precise semantics, namely the allowed value ranges and the wraparound behavior. In some embodiments, to track ranges, a type condition (TC) denoting the exact range is associated with each integer variable. Every operation over integer variables can then be performed by first assuming the TCs and then performing the corresponding operation in arithmetic modulo the TC range (with additional constraints to adjust the results for special cases and signed integers). This approach is sound across all arithmetic expressions since, if the inputs are assumed to be in the correct range, the results of the operations are again values in the correct ranges. An advantage of this approach is that the scalability of reasoning is less dependent on the bit-width, with efficient reasoning also available for, for example, nonlinear operations over 256-bit integers.
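A minimal Python sketch of the modular-arithmetic idea follows. It computes each result over unbounded integers and then reduces it into the range of the type, with an adjustment for signed types. The function names are hypothetical, and the type conditions, which the encoding assumes, are written here as Python assertions purely so that misuse of the sketch is detected.

```python
def value_range(bits, signed):
    """Inclusive range of a Solidity integer type with the given bit-width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def type_condition(x, bits, signed):
    low, high = value_range(bits, signed)
    return low <= x <= high


def wrap(x, bits, signed):
    """Reduce an unbounded result into the type's range (wraparound)."""
    r = x % (1 << bits)          # non-negative remainder in Python
    if signed and r >= (1 << (bits - 1)):
        r -= 1 << bits           # adjust for two's-complement signed types
    return r


def add(a, b, bits=256, signed=False):
    # The encoding assumes the type conditions; the sketch asserts them.
    assert type_condition(a, bits, signed) and type_condition(b, bits, signed)
    return wrap(a + b, bits, signed)


def mul(a, b, bits=256, signed=False):
    assert type_condition(a, bits, signed) and type_condition(b, bits, signed)
    return wrap(a * b, bits, signed)
```

Because every result is reduced into the range of its type, the type condition holds again for the output, which is the property that makes the scheme compose across nested expressions.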
With respect to smart contracts, neither the Ethereum Virtual Machine nor Solidity performs any checking of the results of arithmetic operations by default. Due to the wraparound semantics of integers, unexpected overflows and underflows can occur undetected. In some embodiments in accordance with the present principles, to remedy such a deficiency, the inventors propose formal overflow detection for Solidity that is scalable and produces minimal false reports. For example, overflows can be detected by checking the result of every operation for a potential overflow. This can be accomplished, for example, by checking whether the output of the operation is the same as if computed over unbounded integers. However, reporting every such overflow would result in an overwhelming number of false alarms. For example, it is common practice for Solidity developers to perform arithmetic operations first, and then check for overflows manually after the fact. This practice of overflow detection is used in almost all deployed contracts on the Ethereum blockchain and is part of Solidity best practices. Reporting such potential overflows would be a nuisance to the programmer who has already put effort into guarding against them. To reduce the number of false overflow reports, in some embodiments in accordance with the present principles, whenever an arithmetic computation is performed, the overflow condition that captures whether an overflow has occurred (i.e., whether the result of the computation is different from the result over unbounded integers) is computed. However, instead of immediately checking this condition, the results can be accumulated in a dedicated Boolean overflow-detection variable. Overflow is then checked at the end of every basic block with an assertion. 
This “delayed checking” gives space to a developer to perform manual checking for the overflow (in which case the assertion will not trigger) and will avoid the false alarms.
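The delayed checking can be sketched in Python as follows. The sketch uses 8-bit unsigned addition to keep the numbers small, and it places the delayed assertion after the developer's manual guard as a simplified stand-in for the end of a basic block. The names OverflowTracker, guarded_add and unguarded_add are hypothetical.

```python
class Revert(Exception):
    """Models a transaction reverted by a developer-written guard."""


class OverflowTracker:
    def __init__(self, bits=256):
        self.modulus = 1 << bits
        self.overflow = False  # accumulated overflow condition

    def add(self, a, b):
        exact = a + b
        wrapped = exact % self.modulus
        self.overflow = self.overflow or wrapped != exact
        return wrapped

    def check(self):
        # Delayed check, placed after any manual guard.
        assert not self.overflow, "possible overflow"


def guarded_add(a, b, bits=8):
    t = OverflowTracker(bits)
    c = t.add(a, b)
    if c < a:            # manual check after the fact, e.g. require(c >= a)
        raise Revert()
    t.check()
    return c


def unguarded_add(a, b, bits=8):
    t = OverflowTracker(bits)
    c = t.add(a, b)
    t.check()
    return c
```

With the guard in place, an overflowing addition ends in an expected failure before the delayed assertion is reached, so nothing is reported; without the guard, the same addition reaches the assertion and is reported as a possible overflow.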
The Boogie verifier 510 reports results of the discharged verification conditions such as violated pre and post-conditions and failing assertions in the Boogie program. Illustratively in
In various embodiments, the herein described processes of the present principles can be executed in a processor associated with at least one of servers 101 of the distributed computing environment 100 of
The server 101 or controller 110 can communicate with other computing devices based on various computer communication protocols such as Wi-Fi, Bluetooth™ (and/or other standards for exchanging data over short distances, including protocols using short-wavelength radio transmissions), USB, Ethernet, cellular, an ultrasonic local area communication protocol, etc. The server 101 or controller 110 can further include a web browser.
Although the server 101 or controller 110 of
At 704, the verification conditions are discharged using an SMT solver. The method 700 can proceed to 706.
At 706, results of the discharged verification conditions are reported, for example in one embodiment, to a programmer. The method 700 can be exited.
In some embodiments, a non-transitory computer-readable storage device includes stored thereon a plurality of instructions, the plurality of instructions including instructions which, when executed by a processor, cause the processor to perform a method in accordance with embodiments of the present principles. For example, in one embodiment in accordance with the present principles, a non-transitory computer-readable storage device includes stored thereon a plurality of instructions, the plurality of instructions including instructions which, when executed by a processor, cause the processor to translate operating properties of a smart contract annotated with contract specifications into verification conditions at a source code level in an intermediate verification language, and discharge the verification conditions using an SMT solver. In various embodiments in accordance with the present principles, the processor can be further configured to report results of the discharged verification conditions. In some embodiments, such results can include failures/errors in the discharged verification conditions.
While the foregoing is directed to embodiments of the present principles, other and further embodiments may be devised without departing from the basic scope thereof. For example, the various devices, modules, etc. described herein can be enabled and operated using hardware circuitry, firmware, software or any combination of hardware, firmware, and software (e.g., embodied in a machine-readable medium).
In addition, it can be appreciated that the various operations, processes, and methods disclosed herein can be embodied in a machine-readable medium and/or a machine accessible medium compatible with a data processing system (e.g., a computer system), and can be performed in any order (e.g., including using means for achieving the various operations). Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. In some embodiments, the machine-readable medium can be a non-transitory form of machine-readable medium.
In the foregoing description, numerous specific details, examples, and scenarios are set forth in order to provide a more thorough understanding of the present principles. It will be appreciated, however, that embodiments of the principles can be practiced without such specific details. Further, such examples and scenarios are provided for illustration, and are not intended to limit the teachings in any way. Those of ordinary skill in the art, with the included descriptions, should be able to implement appropriate functionality without undue experimentation.
References in the specification to “an embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is believed to be within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly indicated.
Modules, data structures, blocks, and the like are referred to as such for ease of discussion, and are not intended to imply that any specific implementation details are required. For example, any of the described modules and/or data structures may be combined or divided into sub-modules, sub-processes or other units of computer code or data as may be required by a particular design or implementation of the servers 101 and/or the optional controller 110.