Smart contract security

How did Qubit Finance lose 80 million dollars in one day to hackers? The answer: bad input validation in smart contracts.

The story of smart contracts starts in October 2008, when Satoshi Nakamoto released the original Bitcoin whitepaper. Though this name is widely known, it is still not known who is behind it. A few months later, in January 2009, Satoshi Nakamoto launched the Bitcoin network, released the reference implementation and generated the genesis block.

With this a new technology was born: the blockchain. Bitcoin merged its predecessors’ approaches into a successful technology that is able to solve the Byzantine Generals Problem in a distributed system. In this problem, all participants have to reach the same conclusion without the help of a central party. Furthermore, in the specific case of blockchain, participants may be malicious, so they may lie when communicating with the others.

Note that we did not mention money or banking at all in connection with the Byzantine Generals Problem. That is because the original problem was about information, not money. But in today’s banking, money is just a number in a ledger, so it is all about information now. As a consequence, tampering with the ledger is a tangible threat. And this is where decentralization and the Byzantine Generals Problem come into the picture. Anyone who wants to create a ledger without a central party first has to solve the Byzantine Generals Problem.

Bitcoin is an example of decentralized digital money, generally called a cryptocurrency. In this specific case the name of the currency is also the name of the technology: Bitcoin (BTC). The underlying network handles money, and therefore it is in the interest of everyone who has money on its ledger to keep the network running without errors. Otherwise, their money would become worthless. Bitcoin also motivates participants to maintain the network by rewarding them whenever one of them creates a new block for the chain. To create a new block, a participant first has to solve a computationally expensive puzzle. The solver then presents the solution as proof of the effort spent (hence the name Proof of Work). The other participants can verify it cheaply, and once they accept the block, the system rewards the solver’s account with BTC.
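The asymmetry behind Proof of Work (hard to find, cheap to check) can be illustrated with a toy puzzle. The following is only a sketch under simplifying assumptions: it searches for a number (nonce) that makes a SHA-256 digest start with a given number of zero hex digits, whereas the real Bitcoin network hashes a binary block header twice and compares the result against a numeric target. The function names mine and verify are made up for this illustration.

```python
import hashlib
from typing import Tuple


def mine(block_data: str, difficulty: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed solution takes one hash, no matter how long the search took."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# Toy usage: the block content and the difficulty are arbitrary illustration values.
nonce, digest = mine("toy block", 3)
accepted = verify("toy block", nonce, 3)
```

Each additional required zero digit multiplies the expected search effort by 16, while verification stays a single hash. This is what lets every participant check a proposed block without repeating the work.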

More bits, more coins

After its release, people did not use Bitcoin extensively. But after a while it became extremely popular, above all as an investment. And naturally, because it became profitable, lots of others copied (or at least tried to copy) its success and started their own networks. These networks used the same idea (i.e., blockchain with Proof of Work), or an altered version of it (e.g. blockchain with Proof of Stake). There are also many other networks that use a non-blockchain-based idea (e.g. Byzantine Paxos from 1999, or Directed Acyclic Graph) for solving the Byzantine Generals Problem. Together, all these technologies are called Distributed Ledger Technologies (DLT).

In the years after the release of Bitcoin, many other cryptocurrencies emerged. Generally, we refer to them as altcoins (alternative coins). Nowadays there are thousands of altcoins and their number is still growing. Ethereum, announced in 2014 and launched in 2015, was the first DLT to implement a Turing-complete smart contract platform. The phrase smart contract was not a new term: it was first used by Nick Szabo in 1994, but it became widely known only after Ethereum was released.

Are smart contracts smart enough?

So, what is a smart contract? It is basically a program written in some programming language that executes automatically when pre-defined conditions are met. This program expresses the terms agreed between the parties (for example a buyer and a seller), as well as the consequences of executing the contract (usually triggering a money transaction). Additionally, if we store this program on a DLT, there is no need for a central authority: everything is self-executing. It is perfectly safe, isn’t it?

Bugs. It is always about bugs. Smart contracts are still just programs – and programs are (usually still) written by humans who, unfortunately, make mistakes. And hackers (wearing white, grey, black or whatever hats) are continuously searching for these weaknesses.

There are plenty of cases where bugs in a DLT ecosystem caused the loss of a huge amount of money. Take, for example, the DAO Hack, the story behind Ethereum Classic. A decentralized autonomous organization (hence the name DAO) is a DLT-based cooperative whose rules are expressed and enforced by a smart contract. So, what happens if a non-modifiable smart contract contains a bug in such an environment? In that case, someone can use it to drain money from its rightful owners. In this specific case, attackers stole ether worth $60 million by exploiting a vulnerability called the recursive call bug (also known as reentrancy). In the associated attack scenario, the attacker deposits some amount into a vulnerable contract instance and then withdraws it right away. During this withdrawal, the contract sends the funds to the attacker’s contract, which invokes a maliciously defined fallback function there. That function calls the withdrawal again, and the second withdrawal succeeds due to a logical flaw: the original withdrawing function decreases the balance only after the call that triggers the fallback function, not before it. Such invocations can even be nested, hence the “recursive” in the name of the bug.

As a response, a large part of the Ethereum community decided to roll back to a state before the malicious transactions and continue on a separate network. By doing this, they made a hard fork in the system. The old network was renamed Ethereum Classic, and the forked one continued under the name Ethereum.
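The recursive call bug behind the DAO Hack can be reproduced in a few lines. The following Python model is a sketch, not the DAO code: the classes Account, Attacker and Bank, the amounts, and the update_first switch are all invented for illustration. The receive() method plays the role of the fallback function, and the only difference between the two runs is whether the balance is cleared before or after the external call.

```python
class Account:
    """An ordinary participant that just accepts incoming funds."""

    def __init__(self, name):
        self.name = name
        self.received = 0

    def receive(self, bank, amount):
        self.received += amount


class Attacker(Account):
    """Stands in for the malicious fallback function: re-enters on every payout."""

    def receive(self, bank, amount):
        self.received += amount
        bank.withdraw(self)  # call back in before the bank has finished


class Bank:
    """Toy contract; update_first selects the order of bookkeeping and payout."""

    def __init__(self, update_first):
        self.update_first = update_first
        self.balances = {}
        self.vault = 0  # funds actually held for all depositors

    def deposit(self, account, amount):
        self.balances[account.name] = self.balances.get(account.name, 0) + amount
        self.vault += amount

    def withdraw(self, account):
        amount = self.balances.get(account.name, 0)
        if amount == 0 or amount > self.vault:
            return  # nothing owed, or nothing left to pay out
        if self.update_first:
            self.balances[account.name] = 0  # bookkeeping before the external call
        self.vault -= amount
        account.receive(self, amount)  # external call: the recipient's code runs here
        if not self.update_first:
            self.balances[account.name] = 0  # flawed order: re-entry already happened


def run(update_first):
    """Return (amount the attacker received, funds left in the vault)."""
    bank = Bank(update_first)
    bank.deposit(Account("honest"), 90)
    attacker = Attacker("attacker")
    bank.deposit(attacker, 10)
    bank.withdraw(attacker)
    return attacker.received, bank.vault
```

With the flawed order, run(False) returns (100, 0): the attacker deposited 10 units but drained the whole vault, including the honest user’s 90. With the balance cleared first, run(True) returns (10, 90), because the nested call finds a zero balance and stops.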

Qubit Finance

Unfortunately, the DAO Hack was not the only hack since 2009. There were many cases where vulnerable smart contracts allowed stealing cryptocurrencies. Take, for example, the case of a money market platform – Qubit Finance – that was hacked on January 27, 2022. As a consequence of the incident, they lost Binance Coin (BNB) worth $80 million.

Qubit Finance is a decentralized money market platform that connects lenders and borrowers in an effective and, as they claimed, secure way. Among other things, it provides a bridge between the Ethereum network and the Binance Smart Chain (BSC), with which users can move their money between the two networks. However, due to a bug, one could get an arbitrary amount of ether (ETH) at their disposal on Binance Smart Chain without depositing any real ETH on the Ethereum network. Let’s see how.

Essentially, due to faulty validation, one could deposit zero ether by calling the deposit() function and still gain an arbitrary amount of it on BSC, because this function did not handle wrong input correctly. Since the call still signaled a successful deposit, the desired amount appeared on BSC. The steps were as follows:

    1. The user attaches some ETH when calling the deposit() function on the Ethereum network. Parameters include the resourceID (from which the token contract address is derived), the depositer (the attacker in this case), and the amount of ETH one wants to transfer (encoded in the data parameter).
    2. The smart contract code – specifically the deposit() function – then validates the given address (derived from resourceID) by an allowlist, and calls the safeTransferFrom() method on it if everything is OK.
    3. An event is emitted that the transfer succeeded.

The problematic deposit() function is shown in the following code snippet (the source code is taken from a writeup of the incident):

122 function deposit(bytes32 resourceID, address depositer, bytes calldata data)
             external override onlyBridge { 
123     uint option; 
124     uint amount; 
125     (option, amount) = abi.decode(data, (uint, uint)); 
126
127     address tokenAddress = resourceIDToTokenContractAddress[resourceID]; 
128     require(contractWhitelist[tokenAddress], "provided tokenAddress is not whitelisted"); 
129
130     if (burnList[tokenAddress]) { 
131         require(amount >= withdrawalFees[resourceID], "less than withdrawal fee"); 
132         QBridgeToken(tokenAddress).burnFrom(depositer, amount);
133     } else {
134         require(amount >= minAmounts[resourceID][option], "less than minimum amount"); 
135         tokenAddress.safeTransferFrom(depositer, address(this), amount); 
136     }
137 }

The problem: an attacker could forge malicious data for the deposit() function that made the contract emit the success event without any ETH being attached to the initial call. All they had to do was provide a resourceID that maps to a tokenAddress of 0 (that is, address(0), see line 127 above). Because this zero address was unfortunately on the allowlist (contractWhitelist, line 128), and the amount (extracted from the data parameter in line 125) was above the minAmounts threshold (line 134), all validations passed, and execution reached the safeTransferFrom() call (line 135). That call targeted address(0), but it did not revert: address(0) holds no contract code, just like an externally owned account (EOA), so the call returned successfully without transferring anything. In the end, the desired amount was not deposited on Ethereum, but it still “appeared” on Binance Smart Chain.
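The chain of events can be replayed in a small Python model. This is a simplified sketch, not the Qubit code: the Token and Bridge classes, the string addresses, the resource ids "weth" and "eth", and the amounts are all assumptions made for illustration, and the burnList branch is left out. The key assumption mirrors the behavior described above: a transfer call aimed at an address without code returns successfully and does nothing.

```python
ZERO_ADDRESS = "0x0000000000000000000000000000000000000000"


class Token:
    """Minimal token ledger; transfer_from raises if the sender lacks funds."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer_from(self, sender, recipient, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class Bridge:
    def __init__(self, contracts, resource_to_token, whitelist, min_amount):
        self.contracts = contracts  # address -> deployed code (hypothetical registry)
        self.resource_to_token = resource_to_token
        self.whitelist = whitelist
        self.min_amount = min_amount
        self.events = []

    def _safe_transfer_from(self, token_address, sender, amount):
        token = self.contracts.get(token_address)
        if token is None:
            return  # no code at the address: the call "succeeds" and does nothing
        token.transfer_from(sender, "bridge", amount)

    def deposit(self, resource_id, depositor, amount):
        token_address = self.resource_to_token[resource_id]
        if token_address not in self.whitelist:
            raise ValueError("provided tokenAddress is not whitelisted")
        if amount < self.min_amount:
            raise ValueError("less than minimum amount")
        self._safe_transfer_from(token_address, depositor, amount)
        self.events.append(("Deposit", depositor, amount))  # picked up by the other chain


weth = Token({"alice": 5})
bridge = Bridge(
    contracts={"0xWETH": weth},
    resource_to_token={"weth": "0xWETH", "eth": ZERO_ADDRESS},
    whitelist={"0xWETH", ZERO_ADDRESS},  # the zero address is allowlisted, as in the incident
    min_amount=1,
)
bridge.deposit("eth", "attacker", 1_000_000)  # attacker holds no tokens at all
```

After the last line, bridge.events contains a Deposit event for 1,000,000 units although no balance moved anywhere. The same request through the "weth" resource id raises an error, because a real token ledger refuses the transfer.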

Lesson learnt again: input validation is essential!

In the 17th century, General Raimondo Montecuccoli said that you need three things for war: 1) money, 2) money, 3) money. We have basically the same saying when it comes to software security. You need three things: 1) validation, 2) validation and 3) even more validation.

So first of all, you need to do validation. But – as the above example shows clearly – it is not enough to have it. It also has to be correct! It has to stop bad input and should only let through those values that your algorithms are prepared for. What is good input and what is bad? Well, that depends on the logic of the algorithm.

The principle of complete mediation (and another one about defense in depth) says that data should not only be validated as it enters the system. You should validate it any time you’re about to touch (use) it, because that’s the best place for the code to know what to validate for. In line with this, you should not be afraid of even redundant validations: more validation is always better than no validation at all. But, again, as the example shows, a clever attacker can get through multiple layers of protection if they are all faulty.
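Validation at the point of use could look like the following Python sketch. It is an illustration under stated assumptions, not a fix taken from any real bridge: the function checked_transfer, the ValidationError type and the dictionary-based ledgers are hypothetical. The function does not trust an earlier allowlist check. It re-checks everything it is about to rely on and then confirms that the balance really changed. In Solidity, comparable checks would be rejecting address(0) as a token address and requiring that the target address actually contains code.

```python
ZERO_ADDRESS = "0x0000000000000000000000000000000000000000"


class ValidationError(Exception):
    """Raised when input is not something the transfer logic is prepared for."""


def checked_transfer(ledgers, token_address, sender, recipient, amount):
    """Move `amount` between accounts of one token, validating right before use."""
    if token_address == ZERO_ADDRESS:
        raise ValidationError("zero address is not a token")
    ledger = ledgers.get(token_address)
    if ledger is None:
        raise ValidationError("no token contract at this address")
    if not isinstance(amount, int) or amount <= 0:
        raise ValidationError("amount must be a positive integer")
    if sender == recipient:
        raise ValidationError("sender and recipient must differ")
    if ledger.get(sender, 0) < amount:
        raise ValidationError("insufficient balance")

    before = ledger.get(recipient, 0)
    ledger[sender] -= amount
    ledger[recipient] = before + amount

    # Redundant on purpose: confirm the effect instead of assuming it happened.
    if ledger[recipient] - before != amount:
        raise ValidationError("transfer had no effect")


# Toy usage with a made-up token address and accounts.
ledgers = {"0xTOKEN": {"alice": 5}}
checked_transfer(ledgers, "0xTOKEN", "alice", "bridge", 3)
```

After the usage lines, the ledger holds 2 units for alice and 3 for the bridge. A request that names the zero address, an unknown address, a non-positive amount or an underfunded sender raises ValidationError before any balance is touched.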

Correct validation is even more crucial in the case of smart contracts. After all, we store them on the blockchain and therefore they are “immutable”. Once they get there, no one can change them. A weakness stays a weakness forever.

The bottom line is that programmers could have avoided this theft if the smart contract code had just checked the validity of the data correctly (in at least one of the checks), in line with input validation best practices and principles. Since input validation is at the heart of secure coding, you can learn how to do it in literally all of our courses.