A Deeper Look Into Bitcoin Internals

Prabath Siriwardena
Published in FACILELOGIN · Jun 10, 2017

The Internet is one of the key breakthroughs in human history, and it set the foundation for a plethora of inventions. To use the Internet today, you do not need to understand how its bits and pieces work together. Many technologists and futurists call bitcoin the next breakthrough after the Internet: the Internet revolutionized the flow of information, while bitcoin revolutionized the flow of money. Since its early days in 2009, bitcoin has matured to the point where you do not need to understand how it works in order to use it. Unless you’ve been living at the bottom of a very deep well, you have surely heard of bitcoin. This article explores how the bits and pieces work together in bitcoin. The more you learn about it, the more you appreciate what it does!

Pay with bitcoin

When you book your travel via Expedia, you can pay with bitcoin. You can buy a 6-inch veggie-patty sub from Subway with bitcoin. This video shows a guy full of excitement after making his very first bitcoin payment to buy an ice-cream sandwich.

Bitcoin is a cryptocurrency and a digital payment system. It is neither the first nor the last cryptocurrency, but it is the very first one to address the double-spending problem in a completely decentralized manner.

Double-spending

Double-spending is the result of successfully spending some money more than once. I sell you two fidget spinners for ten dollars. You pay me in cash — with a ten dollar note. That’s it. You cannot spend the same ten dollar note again and again. This is an important property in all fiat currencies. You cannot create duplicate copies of fiat currencies without specialized machinery and materials. You may get close — but you cannot match fully. Still in doubt? Watch this video!

In the digital world making a copy of something is as easy as pie. You pay me 0.00376 bitcoins for two fidget spinners. I record that in my digital ledger. But that does not prevent you from using the same 0.00376 bitcoins to buy another two fidget spinners from Amazon. Amazon has no clue you are playing a trick.

A fidget spinner

The easiest way to fix this is to introduce a centralized server to validate all the bitcoin transactions. When you pay me 0.00376 bitcoins, I validate the payment against this centralized server, which takes the amount out of your account and adds it to mine. Now, if you try to spend the same 0.00376 bitcoins again at Amazon, the centralized server will not approve the payment unless you have enough bitcoins left in your account. This works, but it is exactly what bitcoin wants to avoid. The bitcoin protocol solves the double-spending problem in a completely decentralized manner, with no centralized server. Let’s see how it does it.

My First Bitcoin

To initiate a bitcoin transaction, you first need to have some bitcoins. If I sell fidget spinners only for bitcoin, you cannot buy them without bitcoins. How do you find or earn bitcoin?

  1. Sell some goods or services to someone who owns bitcoin, and get paid in bitcoin. For example, the currency of my home country is the Sri Lankan Rupee (LKR). Saudi Arabia won’t accept LKR for crude oil, so we sell tea in exchange for US Dollars (USD) and pay for crude oil in USD.
  2. Buy bitcoins from someone who owns them, paying in a currency you already have, say USD. You can buy bitcoins from any bitcoin exchange. Coinbase is one such popular exchange. It allows you to buy bitcoin with your credit card. If you use this link to create an account with Coinbase, you’ll get $10 worth of bitcoins free.
  3. Bitcoin mining. Fiat currencies are added to circulation by printing them. When and how much to print is controlled by the government. Bitcoin has a limited supply: the total number of bitcoins that will ever be available is 21 million. Bitcoin mining is the only way bitcoins are added into circulation. If you mine, you can earn bitcoin. We’ll talk more about mining as we move forward.

Now you have bitcoins. Don’t worry about how you got them or how you store them. We’ll discuss that later. For now just assume you have some bitcoins and are ready to pay me 0.00376 bitcoins to buy two fidget spinners. To initiate a bitcoin payment, you need to construct a bitcoin transaction.

Oh.. wait, you do not need to worry about constructing these transactions by hand. Bitcoin wallet applications keep track of all the bitcoins you own and create bitcoin transactions whenever you want to spend them (hint: wallet applications never store bitcoins, as you will learn later). If you are a fresh bitcoin enthusiast, I recommend trying out Coinbase. Once you create an account with Coinbase, it automatically creates a bitcoin wallet for you, and whenever you buy bitcoin, those bitcoins will be stored there (not really stored; the wallet just keeps track of them). Also, if you want to send some bitcoins to someone else, Coinbase (acting as your online wallet application) will do it for you, by constructing the bitcoin transaction and sending it across to the recipient.

Bitcoin Address

To keep or accept bitcoin, you need to have a bitcoin address. This is created by the bitcoin wallet application during the bootstrap process. If you use Coinbase, you may not even notice this happening, but it will show you the wallet address. My bitcoin address is 1BrVwEq4zY9HfvhHZkrp7qrvDDpQQTmpxt. If you find this blog interesting, I won’t mind you sending me some bitcoins :-). Just kidding, forget it ;-)

Before we move forward let’s see how the bitcoin address is created. The entire bitcoin payment system is heavily dependent on cryptography. It’s a no-brainer: that’s why we call it a cryptocurrency! Anyone who needs to keep bitcoin must have a public/private key pair. The bitcoin address represents the owner of the public/private key pair (it can represent some other things as well, which we will discuss later towards the end of the article). It is derived from the public key using cryptographic hash functions.

The algorithms used to make a bitcoin address from a public key are the Secure Hash Algorithm (SHA) and the RACE Integrity Primitives Evaluation Message Digest (RIPEMD), or more specifically SHA256 and RIPEMD160. Hashing algorithms are one-way and produce a fixed-length output for inputs of any size. SHA256 takes the public key and derives an output that is 256 bits long. This output goes as the input to RIPEMD160, which derives an output that is 160 bits long. A version byte is then added in front of this hash, a 4-byte checksum (the first four bytes of a double SHA256 over the version byte and the hash) is appended, and the result is Base58 encoded to produce the bitcoin address. (Yes, you read it correctly! It’s not a typo: it’s Base58, not Base64. Base58 encoding is quite similar to Base64; it only drops a few characters from the Base64 character set that could cause confusion.)

O1 = SHA256(PUB_KEY)

O2 = RIPEMD160(O1)

O3 = 0x00 || O2 || first 4 bytes of SHA256(SHA256(0x00 || O2))

ADDRESS = Base58Encode(O3)

The version byte for an address that points to a public key is 0x00, and Base58 renders a leading zero byte as the character 1. That is why any bitcoin address that points to a public key starts with 1. The resulting address is usually 33 or 34 characters long.

In practice the whole process above is transparent to the user and executed behind the scenes by the wallet application. Having a bitcoin address lets you send and accept bitcoin.
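The last step of the derivation can be sketched in a few lines of Python. This is a minimal illustration rather than wallet code: the function names are made up for this example, and it follows the Base58Check scheme described on the wiki page referenced below, where a version byte (0x00) goes in front of the 160-bit hash and a 4-byte double-SHA256 checksum is appended before encoding. The `hash160` helper relies on RIPEMD-160 being available in the local OpenSSL build, which is not guaranteed, so the demonstration starts from a ready-made 20-byte hash.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, used here for the Base58Check checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58_encode(raw: bytes) -> str:
    """Encode bytes as Base58; each leading zero byte becomes a '1'."""
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def address_from_hash(pubkey_hash: bytes, version: bytes = b"\x00") -> str:
    """Build an address from the 160-bit hash of a public key."""
    if len(pubkey_hash) != 20:
        raise ValueError("expected a 20-byte (160-bit) public key hash")
    payload = version + pubkey_hash
    return base58_encode(payload + sha256d(payload)[:4])


def hash160(pubkey: bytes) -> bytes:
    """RIPEMD160(SHA256(pubkey)); needs RIPEMD-160 support in hashlib."""
    return hashlib.new("ripemd160", hashlib.sha256(pubkey).digest()).digest()


# The public key hash used in the locking-script example later in the article.
example_hash = bytes.fromhex("6f7fe7974d94b494d19a0c4d08c0b786f10ab864")
print(address_from_hash(example_hash))
```

Note that the leading 1 is not glued on afterwards: it falls out of encoding the zero version byte.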

Why base-58 instead of standard base-64 encoding?

1. Don’t want 0OIl characters that look the same in some fonts and could be used to create visually identical looking account numbers.

2. A string with non-alphanumeric characters is not as easily accepted as an account number.

3. E-mail usually won’t line-break if there’s no punctuation to break at.

4. Double clicking selects the whole number as one word if it’s all alphanumeric

Ref: https://en.bitcoin.it/wiki/Base58Check_encoding

Bitcoin Transactions

The wallet application builds the bitcoin transaction for you when you specify the recipient(s) and the amount. There are two key elements in a bitcoin transaction: inputs and outputs. Outputs represent one or more recipients of the bitcoins that you are about to send. The inputs represent previous bitcoin transactions sent to your address, which can be aggregated together to build the amount of bitcoins specified as outputs in this transaction.

Let’s say you first bought bitcoin for USD from Coinbase. To facilitate this, Coinbase has to build a bitcoin transaction with you as the recipient. The amount of bitcoin specified in that transaction, against your bitcoin address, is available for you to spend in another transaction. This output is also known as an unspent transaction output (UTXO). In the same manner you may have collected more bitcoins from other transactions, and all of them are now available for you to construct your new transaction to pay for two fidget spinners. The bitcoin wallet application finds all such UTXOs to construct the input elements of your bitcoin transaction.

One mistake many fresh bitcoin learners make is to think that all their bitcoins are aggregated together. That’s not right. Everything is recorded as transactions. Now you know why I mentioned before that your wallet never stores bitcoins, but rather keeps track of them. Let’s say transaction T1 paid you 0.0020 bitcoins and T2 paid you 0.0025 bitcoins. To pay 0.00376 bitcoins for the fidget spinners, you can’t simply take 0.0020 from T1 and get the rest (0.00176) from T2. In other words, you can never use an unspent transaction output partially. So in this case we need to use both T1 and T2 as inputs to the new transaction (let’s say T3). That means we have 0.0045 bitcoins as the inputs (T1+T2). What happens to the excess 0.00074 bitcoins? You can include it as another output in the same bitcoin transaction (T3) and point that output to your own bitcoin address.

Bitcoin transactions are not reversible. If you send some bitcoins to someone, there is no way to revert it back, unless the recipient of that payment decides to pay you back in another transaction.

Transaction Fees

All the bitcoin transactions are processed by bitcoin miners. We’ll talk about bitcoin mining later in this blog. Until then, just as you use a bitcoin wallet application to initiate bitcoin transactions, bitcoin miners run bitcoin mining applications to validate bitcoin transactions. Miners put a lot of computational power into the bitcoin network and they deserve a transaction fee for processing each bitcoin transaction. The transaction fee is derived from the gap between the input values and output values in a bitcoin transaction. In other words, you do not state the fee explicitly; it is implied. Taking the previous example, if you set 0.00070 bitcoins as the output against your own address, then the difference between inputs and outputs is 0.00004 bitcoins, and that goes to the miner as the transaction fee.
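The arithmetic of the T1/T2/T3 example can be written out as a short Python sketch. The function and label names are hypothetical, and amounts are kept in satoshis (1 bitcoin = 100,000,000 satoshis) so that integer arithmetic stays exact.

```python
SATOSHI_PER_BTC = 100_000_000


def build_outputs(input_values, payment, fee):
    """Spend the inputs in full: pay the recipient, return change to self."""
    total_in = sum(input_values)
    change = total_in - payment - fee
    if change < 0:
        raise ValueError("inputs do not cover the payment plus the fee")
    outputs = [("recipient", payment)]
    if change > 0:
        outputs.append(("change_to_self", change))
    return outputs


def implied_fee(input_values, outputs):
    """The fee is never stated; it is whatever the outputs leave behind."""
    return sum(input_values) - sum(value for _, value in outputs)


t1, t2 = 200_000, 250_000  # 0.0020 and 0.0025 bitcoins
t3_outputs = build_outputs([t1, t2], payment=376_000, fee=4_000)

print(t3_outputs)
print(implied_fee([t1, t2], t3_outputs) / SATOSHI_PER_BTC)
```

With a change output of 70,000 satoshis (0.00070 bitcoins), the miner is left with 4,000 satoshis (0.00004 bitcoins).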

Unspent Transaction Output (UTXO)

How does your bitcoin wallet application find all the UTXO related to your bitcoin address? It must know that to construct a new transaction.

Let me take one step forward. We already briefly talked about miners. Miners are responsible for validating bitcoin transactions. They group the validated transactions they receive within a period of roughly 10 minutes, or as many as fit into a block that does not exceed 1 MB in size, and write that block to a persistent storage called the blockchain. This is not a 100% accurate explanation of how it happens, but let’s stick with it for now and dig into the details later.

Here you can see one such block. This block has 2415 transactions grouped into it and its size is 998.17 KB. If you look at one transaction (shown in the figure below) included in that block, you will find it has 0.07548645 bitcoins as the total inputs and 0.07535085 bitcoins as the total outputs. The difference between the two is listed as the transaction fee, which is 0.0001356 bitcoins.

Every valid bitcoin transaction ultimately gets grouped into a block and written into the blockchain. Once again, for the time being think of the blockchain as a persistent storage under the mining application (or node). Each block in the blockchain has a reference to the previous block. Once you know the topmost block, you can traverse through all the blocks to the very first one. Once you know someone’s bitcoin address, you can traverse through all the blocks and find all the transactions carrying unspent outputs against that bitcoin address. Each transaction has a transaction id. If an output paid to that address is referenced, by transaction id and output index, as an input of a later transaction, then that output is spent; if not, it is unspent.

In a given transaction there can be multiple outputs targeting different recipients. When you refer to such a transaction as an input to a new transaction, you refer to it by the transaction id and also by the index of the output. Each output in a transaction has an index.
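A rough Python sketch of that scan follows. The dictionary layout of a transaction is an assumption made for this example (real transactions are a binary format), but the rule is the one described above: an output is identified by transaction id plus output index, and it stops being unspent once a later input references it.

```python
def find_utxos(transactions, address):
    """Walk transactions in chain order and return the unspent outputs
    of one address as {(txid, output_index): value}."""
    unspent = {}
    for tx in transactions:
        # Any input that references a tracked output spends it.
        for prev_txid, prev_index in tx["inputs"]:
            unspent.pop((prev_txid, prev_index), None)
        # Any output paid to the address becomes a new candidate.
        for index, (out_address, value) in enumerate(tx["outputs"]):
            if out_address == address:
                unspent[(tx["txid"], index)] = value
    return unspent


# Values in satoshis; addresses and transaction ids are placeholders.
chain = [
    {"txid": "T1", "inputs": [], "outputs": [("you", 200_000)]},
    {"txid": "T2", "inputs": [], "outputs": [("you", 250_000)]},
    {
        "txid": "T3",
        "inputs": [("T1", 0), ("T2", 0)],
        "outputs": [("me", 376_000), ("you", 70_000)],
    },
]

print(find_utxos(chain, "you"))
print(find_utxos(chain, "me"))
```

After T3, the only thing left for "you" to spend is the change output at index 1 of T3.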

Input/Output Scripts

Input/output scripts are another two fundamental elements in a bitcoin transaction. The scripts related to inputs are known as unlocking scripts (or scriptSig) and the scripts related to outputs are known as locking scripts (or scriptPubKey). As we already know, the unspent outputs (UTXOs) of earlier transactions become the inputs of a new transaction. So, the unlocking scripts kept at the inputs of the new transaction must correspond to the locking scripts kept at the outputs of the earlier transactions.

In other words, if you want to consume bitcoins sent to you, you need to provide an unlocking script (for each input in the new transaction). These scripts are validated by miners during the process of validating the transaction. The miner has to check whether the unlocking script provided in the new transaction can unlock the locking script provided in the output of the unspent transaction. A script is essentially a list of instructions recorded with each transaction that describes how the next person wanting to spend the bitcoins being transferred can gain access to them.

What do these scripts look like? Let’s have a look at a locking script first.

The bitcoin transaction script language is called Script. It is stack-based, which means that every piece of data, whether an input or a result, is pushed onto a stack and consumed from there.

Here is an example of a locking script — associated with an output:

OP_DUP OP_HASH160 <pub_key_hash> OP_EQUALVERIFY OP_CHECKSIG

E.g:

OP_DUP OP_HASH160 6f7fe7974d94b494d19a0c4d08c0b786f10ab864 OP_EQUALVERIFY OP_CHECKSIG

OP_DUP pushes a copy of the top most stack item on to the stack.

OP_HASH160 consumes the topmost item on the stack, computes the RIPEMD160(SHA256()) hash of that item, and pushes that hash onto the stack.

OP_EQUALVERIFY runs OP_EQUAL and then OP_VERIFY in sequence. OP_EQUAL consumes the top two items on the stack, compares them, and pushes true onto the stack if they are the same, false if not. OP_VERIFY consumes the topmost item on the stack. If that item is zero (false) it terminates the script in failure.

OP_CHECKSIG consumes a signature and a full public key, and pushes true onto the stack if the transaction data specified by the SIGHASH flag was converted into the signature using the same ECDSA private key that generated the public key. Otherwise, it pushes false onto the stack.

In plain English, the above says that the value of pub_key_hash is the double-hashed (first with SHA-256 and then with RIPEMD-160) value of the recipient’s public key, and that the corresponding public key should be used to verify the signature of the complete transaction in order to unlock it.

The unlocking script corresponding to the above is generated by the wallet application. It just contains the public key associated with the recipient and the signature of the transaction produced with the corresponding private key. The mining software that validates the transaction executes the unlocking script concatenated with the locking script, which looks like the following.

<signature> <pub_key> OP_DUP OP_HASH160 <pub_key_hash> OP_EQUALVERIFY OP_CHECKSIG

The first two instructions in the above script are data instructions, which are simply pushed onto the stack. Then we have OP_DUP, which duplicates the top item on the stack, the public key. The next instruction, OP_HASH160, pops the top stack value (the public key), computes its cryptographic hash, and pushes the result onto the top of the stack. When this instruction finishes executing, the public key on the top of the stack has been replaced with its hash. Next is another data instruction, the hash of the public key (set by the locking script), which is simply pushed onto the stack. Now the OP_EQUALVERIFY instruction compares the two values at the top of the stack and, if they are equal, consumes both of them; if they are not equal, the script fails. The stack now holds only the signature and the public key. OP_CHECKSIG pops those two values off the stack and verifies the signature with the provided public key. This unlocks the transaction!
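The walk-through above can be mimicked with a toy stack machine in Python. This is only a sketch of the control flow: the hash function and the signature check are passed in as parameters, and the two stand-ins used here (`toy_hash160`, a truncated SHA-256, and `toy_check_sig`, a string comparison) are assumptions made so the example runs with the standard library alone. They are not RIPEMD160(SHA256()) and not ECDSA.

```python
import hashlib


def run_script(unlocking, locking, hash160, check_sig):
    """Execute unlocking + locking script; str items are opcodes,
    bytes items are data pushes. Returns True if the output unlocks."""
    stack = []
    try:
        for item in list(unlocking) + list(locking):
            if item == "OP_DUP":
                stack.append(stack[-1])
            elif item == "OP_HASH160":
                stack.append(hash160(stack.pop()))
            elif item == "OP_EQUALVERIFY":
                if stack.pop() != stack.pop():
                    return False  # hashes differ: script fails
            elif item == "OP_CHECKSIG":
                pubkey = stack.pop()
                signature = stack.pop()
                stack.append(bool(check_sig(signature, pubkey)))
            else:
                stack.append(item)  # data instruction
    except IndexError:
        return False  # an opcode found too few items on the stack
    return len(stack) > 0 and stack[-1] is True


def toy_hash160(data: bytes) -> bytes:
    """Stand-in for RIPEMD160(SHA256(data)); NOT the real function."""
    return hashlib.sha256(data).digest()[:20]


def toy_check_sig(signature: bytes, pubkey: bytes) -> bool:
    """Stand-in for ECDSA verification; NOT a real signature check."""
    return signature == b"signed-with:" + pubkey


pub_key = b"recipient-public-key"
locking = ["OP_DUP", "OP_HASH160", toy_hash160(pub_key),
           "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocking = [b"signed-with:" + pub_key, pub_key]

print(run_script(unlocking, locking, toy_hash160, toy_check_sig))
```

Swapping in a different public key fails at OP_EQUALVERIFY, and a wrong signature fails at OP_CHECKSIG.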

Bitcoin Network

Once a block is mined or verified by a mining node, the node makes it available to all the other mining nodes in the bitcoin network, and each node verifies it independently. This happens in three steps. The node that mined (or verified) the block does not send it immediately to the rest. Instead, once the block has been completely verified (or mined), its availability is announced to the neighbors by sending them an inv message. The inv message contains a set of block hashes that have been received by the sender and are now available to be requested. A node that receives an inv message for a block it does not yet have locally issues a getdata message to the sender of the inv message, containing the hashes of the information it needs. The actual transfer of the block is done via individual block messages.

Each mining node runs mining software. In the bitcoin network all the nodes are equal. There is no hierarchy, and there are no special nodes or master nodes. The network runs over TCP and has a random topology, where each node peers with other random nodes. A new node can join the network at any time. It first connects to an active node that it already knows and then discovers other nodes in the network. This known active node is also called a seed node, and there are various mechanisms to find one. For example, the mining software knows about a set of DNS seeds (seed.bitcoin.sipa.be, dnsseed.bluematt.me, dnsseed.bitcoin.dashjr.org, seed.bitcoinstats.com, seed.bitcoin.jonasschnelli.ch, seed.btc.petertodd.org), and doing an nslookup on one of them returns a set of IP addresses of available seed nodes.

Once a mining node hears about a bitcoin transaction, it validates it and then publishes it to all the nodes it’s aware of. Once again this does not happen directly, but in three steps, just as in the case of broadcasting a block. The node that verified the transaction does not send it immediately to the rest. Instead, once the transaction has been completely verified, its availability is announced to the neighbors by sending them an inv message. The inv message contains a set of transaction hashes that have been received by the sender and are now available to be requested. A node that receives an inv message for a transaction it does not yet have locally issues a getdata message to the sender of the inv message, containing the hashes of the information it needs. The actual transfer of the transaction is done via individual tx messages.

The same thing will be repeated by all the mining nodes in the bitcoin network and ultimately all the nodes in the bitcoin network will be aware of this transaction. This happens through a simple flooding algorithm, sometimes called a gossip protocol.
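The three-step exchange and the flooding behaviour can be simulated in a few lines of Python. Everything here is a simplification made for illustration: the `Node` class, the in-memory "network", and the message log are hypothetical, validation is skipped, and a real node would not announce a transaction back to the peer it just received it from.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.mempool = {}  # transaction hash -> transaction


def connect(a, b):
    """Peer two nodes with each other."""
    a.peers.append(b)
    b.peers.append(a)


def relay(origin, tx_hash, tx):
    """Flood a transaction through the network using inv/getdata/tx."""
    origin.mempool[tx_hash] = tx
    log = []
    pending = deque([origin])
    while pending:
        sender = pending.popleft()
        for peer in sender.peers:
            log.append(("inv", sender.name, peer.name))
            if tx_hash in peer.mempool:
                continue  # already seen: no getdata, no further relay
            log.append(("getdata", peer.name, sender.name))
            log.append(("tx", sender.name, peer.name))
            peer.mempool[tx_hash] = tx
            pending.append(peer)
    return log


# Four nodes peered in a ring: A-B, B-C, C-D, D-A.
nodes = [Node(name) for name in "ABCD"]
for left, right in zip(nodes, nodes[1:] + nodes[:1]):
    connect(left, right)

log = relay(nodes[0], "tx-hash-1", "pay 0.00376 bitcoins")
for message in log:
    print(message)
```

Every node ends up with the transaction, yet the full payload travels to each node only once; the repeated inv messages are the cheap part.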

When a mining node hears about a transaction, it carries out the following checks to make sure it’s a valid transaction.

  1. The transaction must be valid against the current blockchain. We have not yet covered the blockchain in detail. For the time being think of it as a repository that carries all the valid transactions, and each mining node has its own copy of it. Nodes run the unlocking script against the locking script of each previous output being redeemed and ensure that the scripts return true.
  2. Check that the outputs being redeemed have not been spent already.
  3. Check whether the transaction has been seen before. If it has, the node will not relay it to other nodes.

There are about 5,000 to 10,000 nodes permanently connected to the bitcoin network and fully validating every transaction. We call these fully validating nodes. Each fully validating node must have a copy of all the bitcoin transactions that have happened since bitcoin’s inception. At the time of this writing there are more than 229 million bitcoin transactions stored in the blockchain, and the total size is around 125 GB.

https://blockchain.info/charts/n-transactions-total?timespan=all

Does this mean your bitcoin wallet application has to download the entire 125 GB of blockchain so it can construct a transaction by looking at your unspent outputs? Not really. In contrast to fully validating nodes, there are lightweight nodes, also called thin clients or Simplified Payment Verification (SPV) nodes. Most of the nodes in the bitcoin network are SPV clients.

Simplified Payment Verification (SPV) Node

To differentiate an SPV node from a fully validating node, we need to introduce the concept of a block. What we already know about a block is that it’s a grouping of a set of valid transactions, and that mining nodes are responsible for creating blocks and adding them to the blockchain. A given block has a header in addition to its transactions. A block header has a fixed size of 80 bytes, which is roughly 1/1000 the size of a block or less. While the fully validating nodes keep a copy of all the blocks, the SPV nodes only keep a copy of the block headers. If the total size of the blockchain is 125 GB, then the total size of the block headers would be no more than around 125 MB. That’s a totally affordable size even for a smartphone. Most wallet applications act as SPV nodes. There are also API-based wallet apps, which connect to a fully validating node or an SPV node via an API.

How does an SPV node know about a user’s unspent transaction outputs?

An SPV node, or a wallet application, needs to know about its user’s unspent transaction outputs before creating a new transaction. The block headers stored in an SPV node do not help in finding unspent transaction outputs. The block header does not contain the transactions, only the root of the merkle tree built from all the transactions. Do not panic if you do not know what a merkle tree is or how it’s built from bitcoin transactions. You are just a few minutes away from finding out!

SPV nodes use bloom filters to query other nodes in the network for the transactions related to the bitcoin addresses they are interested in. A bloom filter is a search filter, a way to describe a desired pattern without specifying it exactly: it may match more than was asked for, but it never misses an item that was added to it. Once it knows all the transactions related to a given bitcoin address, the wallet application can filter out the unspent transaction outputs.
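To make the idea concrete, here is a minimal bloom filter in Python. It is a sketch of the data structure only: the class, its default sizes, and the use of SHA-256 with a seed byte to pick bit positions are choices made for this example, and the filters exchanged between real bitcoin nodes use a different hash function and wire format.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=256, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit array, stored as one big integer

    def _positions(self, item: bytes):
        """Derive num_hashes bit positions for an item."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:4], "big") % self.size_bits

    def add(self, item: bytes):
        for position in self._positions(item):
            self.bits |= 1 << position

    def might_contain(self, item: bytes) -> bool:
        """False means definitely not added; True means possibly added."""
        return all((self.bits >> position) & 1
                   for position in self._positions(item))


wallet_filter = BloomFilter()
wallet_filter.add(b"1BrVwEq4zY9HfvhHZkrp7qrvDDpQQTmpxt")

print(wallet_filter.might_contain(b"1BrVwEq4zY9HfvhHZkrp7qrvDDpQQTmpxt"))
```

A peer holding this filter can tell which transactions might concern the wallet without being handed the exact address list, at the cost of occasionally sending transactions that turn out to be irrelevant.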

Mining

Mining gold produces gold; mining graphite produces graphite. Mining bitcoin produces bitcoin. Mining is the only process that produces bitcoins and adds new bitcoins into circulation. Unlike fiat currencies, the bitcoin supply is limited. There can never be more than 21 million bitcoins.

All the transactions generated by wallet applications ultimately reach all the mining nodes in the bitcoin network. To be a bitcoin miner you have to join the bitcoin network and connect to other nodes.

Once you are connected, you need to listen for transactions on the network and validate them by checking that the signatures are correct and that the inputs have not been spent before. The sender signs the entire transaction (but without any of the signature scripts). Each input in the transaction has a signature script (or unlocking script). The signature covers all the outputs (including the locking scripts) in this transaction. Anyone who tries to modify an output of the transaction to redirect bitcoins into their own account will fail, as such a change invalidates the signature.

Orphan Transactions

While validating the unspent outputs included as inputs to a transaction, the mining node may fail to find a referenced transaction, either in the pending transaction pool or in the blockchain (the blockchain stores all the valid transactions once they are grouped into a block). In such a case, the transaction is moved into the orphaned transactions pool. This can be due to an ordering issue: the mining node may see the referenced transaction later, and then the original transaction is moved out of the orphaned transactions pool.

Block

Miner nodes see not just this particular transaction, but many more. Once a mining node has collected valid transactions with a total size of around 1 MB, it groups them together and creates a block. A block in the blockchain can be identified either by the hash of the block or by the block height. The block height is the number of blocks preceding a particular block on the blockchain. For example, the very first block in the blockchain (the genesis block) has a height of zero because zero blocks preceded it.

The process of creating a block is the most computationally intensive task in the bitcoin protocol. A block consists of a header and a group of transactions. There are a few key parameters in the block header worth mentioning. One parameter keeps track of the hash of the previous block header. Each block has to be linked to its previous block, and given a block one can traverse all the way back to the very first block in the blockchain, famously known as the genesis block. Another parameter is called the nonce. This is the trickiest of all, and the one that eats all the computational power. The value of the nonce is found through a process called proof of work.

Proof of Work

No miner can add a block to the blockchain without a proper proof of work. The work here is to find a nonce such that, once it is added into the block header, the hash of the complete header falls under a given number (the target). So the nonce is the proof. Finding a nonce that satisfies the target is computationally very expensive. There is no shortcut: you need to guess a nonce value, calculate the hash of the complete block header (with the nonce), and repeat with different nonce values until you get a hash below the target. How quickly you can find this magic number depends on how many hashes your computer can generate per second. Also note that the target, and therefore the difficulty, changes with time.

A bitcoin mining facility in China

The challenge is to find a nonce that makes the hash of the block header fall below the provided value. The hash of the block header is a 256-bit value, and it has to be less than the 256-bit target (a number). In other words, this complete process is a brute-force search. Once the nonce is found, the miner can add it to the block header and add the block to the blockchain. The blockchain is a repository for all the blocks (and yes, a block is a group of transactions). Each miner has a copy of the complete blockchain.
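The guess-and-check loop looks roughly like this in Python. It is a toy: the "header" is an arbitrary byte string rather than a real 80-byte block header, and the target is set very high (an easy difficulty) so that the search finishes in a fraction of a second. The double SHA-256 and the comparison of the hash, read as a number, against a target follow what bitcoin does.

```python
import hashlib


def header_hash(header_without_nonce: bytes, nonce: int) -> int:
    """Double SHA-256 of header + nonce, interpreted as a 256-bit number."""
    data = header_without_nonce + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "little")


def mine(header_without_nonce: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces one by one until the header hash falls below the target."""
    for nonce in range(max_nonce):
        if header_hash(header_without_nonce, nonce) < target:
            return nonce
    return None  # nonce space exhausted without success


toy_header = b"previous block hash + merkle root + timestamp"
easy_target = 2**244  # about 1 in 4096 hashes qualifies

nonce = mine(toy_header, easy_target)
print(nonce, hex(header_hash(toy_header, nonce)))
```

Finding the nonce takes thousands of attempts even at this toy difficulty, while checking it takes a single call to `header_hash`. Lowering the target makes the search proportionally harder.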

Distributed Consensus

In the bitcoin network there is no master mining node; each node carries the same level of responsibility. Once your wallet software dispatches the bitcoin transaction to the bitcoin network, it reaches all the mining nodes. Each mining node carries out the mining process explained before, including the proof of work. As soon as one node solves the challenge (meets the target), it writes the block to the blockchain and sends the block out to all the nodes in the bitcoin network. As soon as a node receives a new block, it verifies the hash of the block with the provided nonce value. Even though finding a nonce that meets the given target is computationally expensive, verifying that a given nonce is the right one is straightforward: you just need to compute the hash of the block and confirm its compliance with the expected difficulty level. Each node also verifies all the transactions added to the block and, if all is good, updates its own copy of the blockchain and starts mining the next block.

Each block added on top of a previously mined block (say the foo block) is known as a confirmation. In other words, if six more blocks are mined on top of the foo block, the foo block is said to have six confirmations. The more confirmations, the more we trust the legitimacy of the corresponding block. It’s recommended to wait for at least six confirmations before accepting a transaction, once it’s written to a block and then to the blockchain. In other words, when you pay me in bitcoin for a fidget spinner, I won’t ship it to you as soon as I see the corresponding transaction in a block in the blockchain, but will wait for at least six confirmations. Six confirmations means six more blocks need to be mined, which on average takes one hour. In practice it’s impossible to wait an hour to buy an ice cream with bitcoin. In such cases people do not wait for six confirmations, but rather for one. Then again, financially expensive transactions, like buying a Ferrari with bitcoin, should wait for at least six confirmations to minimize risk.

Coinbase Transaction

What makes bitcoin miners invest a lot of time and money in bitcoin mining? There is an incentive model built into bitcoin. Whichever miner first solves the hashing puzzle and adds the block to the blockchain gets n bitcoins. The value of n changes with time, or more precisely with the number of blocks in the blockchain. It started at 50 bitcoins and today the value is 12.5 bitcoins. After every 210,000 blocks (roughly every 4 years) this reward is halved. The block reward and the transaction fees are the incentives for bitcoin miners.

The transaction that gives the miner the block reward is a special transaction, called the coinbase transaction. It’s generated by the mining software itself and added to the block as the very first transaction. In addition to the block reward, the coinbase transaction also includes the total transaction fees the miner collects from all the transactions included in the corresponding block. The following lists how a coinbase transaction differs from a normal transaction.
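The halving schedule is easy to express in code. The sketch below, with a hypothetical function name, computes the block reward in satoshis (1 bitcoin = 100,000,000 satoshis) for a given block height, and then adds up the rewards of every block to show where the 21 million limit comes from.

```python
SATOSHI_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000  # blocks between halvings


def block_reward(height: int) -> int:
    """Block reward in satoshis at a given block height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    # Integer right shift halves the reward and rounds down.
    return (50 * SATOSHI_PER_BTC) >> halvings


print(block_reward(0) / SATOSHI_PER_BTC)        # first blocks
print(block_reward(469_000) / SATOSHI_PER_BTC)  # after two halvings

# Sum the reward over every halving period until it reaches zero.
total_supply = sum(
    HALVING_INTERVAL * block_reward(period * HALVING_INTERVAL)
    for period in range(64)
)
print(total_supply / SATOSHI_PER_BTC)
```

Because each period pays half of the previous one, the sum converges: the total creeps up towards 21 million bitcoins but never exceeds it.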

  1. It always has a single input. (It may have more than one output, for example when the reward is split between several addresses.)
  2. It has no references to unspent transaction outputs.
  3. It has a special ‘coinbase’ parameter, and miners can put whatever they want in it. In the first block ever mined in bitcoin, the coinbase parameter referenced a story from The Times of London about the chancellor bailing out banks: “The Times 03/Jan/2009 Chancellor on brink of second bailout for banks”.

The Genesis Block

The genesis block is the very first block in the bitcoin blockchain. This block is hardcoded into the mining software and has only the coinbase transaction. The outputs from this transaction cannot be spent. All bitcoin mining software ignores this transaction when building an unspent transaction database from the blockchain.

The Longest Blockchain

We have already discussed how miners add a block to the current blockchain. Something to notice is that it’s not just one miner working on a block at a given time; there can be many. Whoever solves the hashing puzzle first wins and writes to the blockchain. The other miners then confirm the block’s existence and start mining the very next block.

Since bitcoin is a globally distributed network, two miners working on the same copy of the blockchain can solve the hashing puzzle within a short time of each other and start transmitting their own blocks to the network. This is a race condition, and in bitcoin terminology it results in a fork. Both blocks reference the same previous block hash, but the transactions included in each block can differ. Once again, since the bitcoin network is globally distributed, some nodes will first get the block from the first miner and others will first get the block from the second miner. As soon as a miner gets a block, it validates the block, adds it to its own copy of the blockchain and starts mining the very next one. Now we have two branches of the blockchain, and each will start growing independently.

Two different blocks generated with the same reference to the previous block in the blockchain
Node-1 sees block-LL and adds it to its blockchain, Node-2 sees block-PP and adds it to its blockchain

Let’s say we get a similar situation for the next block too: two blocks are mined almost simultaneously. Call the two branches foo and bar. The first block is generated by a miner on the foo branch, and the second by a miner on the bar branch. When the first block reaches a miner on the bar branch, that miner refuses to accept it, because the previous block hash referenced in the block does not match the hash of the latest block in its own blockchain. The same happens when the second block reaches a miner on the foo branch. This can go on for some time, until the branch with more computational power behind it starts to grow faster than the other. Then all the nodes on the other branch notice that their blockchain is shorter and quickly move to the longer one. All the blocks added to the shorter branch are discarded, and whoever mined them won’t get the reward. This is a basic principle in bitcoin: all the miners work on the longest branch of the blockchain.

Node-1 sees block-PP but rejects it as it does not fit into its blockchain, and the same happens for Node-2
The blockchain behind the Node-1 is the longest blockchain now
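The fork behaviour described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the `Block` and `Node` classes are hypothetical, block names stand in for block hashes, and there is no proof of work or validation, only the "does it extend my latest block" check and the "move to the longer branch" rule.

```python
class Block:
    def __init__(self, name, prev_name):
        self.name = name            # stands in for the block hash
        self.prev_name = prev_name  # stands in for the previous block hash


class Node:
    def __init__(self, genesis):
        self.chain = [genesis]

    def receive_block(self, block):
        """Append the block only if it references this node's latest block."""
        if block.prev_name != self.chain[-1].name:
            return False
        self.chain.append(block)
        return True

    def consider_branch(self, other_chain):
        """Switch to another branch if it is strictly longer than ours."""
        if len(other_chain) > len(self.chain):
            self.chain = list(other_chain)
            return True
        return False


if __name__ == "__main__":
    genesis = Block("block-0", None)
    node_1, node_2 = Node(genesis), Node(genesis)
    node_1.receive_block(Block("block-LL", "block-0"))
    node_2.receive_block(Block("block-PP", "block-0"))
    # Node-1's branch grows first, so Node-2 abandons block-PP.
    node_1.receive_block(Block("block-next", "block-LL"))
    node_2.consider_branch(node_1.chain)
    print([b.name for b in node_2.chain])
```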

In practice, a fork happens in the bitcoin network approximately every day. A two-block fork (the race condition happening twice in a row) may happen weekly or monthly. A three-block fork is quite rare. In March 2013 bitcoin experienced a two-block fork. Eight minutes later it experienced a three-block fork. Ten minutes after that a four-block fork happened, followed by a five-block fork, a six-block fork and finally a seven-block fork. This was due to an update of the bitcoin software to use Google’s LevelDB instead of Berkeley DB to store the blocks. Some nodes were still running the old version, which made them reject blocks that were accepted by the upgraded nodes.

Orphan Blocks

An orphan block is a block that doesn’t have a known parent in the longest blockchain. In other words, these blocks were mined on a different branch that later turned out not to be the longest. Since they were mined on a different branch, they do not have valid references to any of the blocks in the longest blockchain.

What happens to the transactions included in the orphan blocks on the shorter branch? None of them get lost. Most probably they are already included in blocks on the longest blockchain, since all the transactions are visible to all nodes on all the branches. Any transaction that has not yet been added to a block on the longest blockchain will get added soon, as it should still be in the pending transaction pool.

Merkle Tree

A merkle tree is a binary tree data structure with hash pointers. It’s named after its inventor, Ralph Merkle. The hashes of the transactions (L1, L2, L3, L4) grouped into a single block make up the leaves of the tree. Those hashes are then paired, and each pair is hashed together to produce a node in the next level. This goes on, level by level, until a single root node is left. This root node is known as the merkle root. If a level has an odd number of hashes, the hash without a partner is paired with a copy of itself. Also keep in mind that the hash of a transaction is the transaction id itself.
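The construction above can be sketched as follows. This is a simplified illustration: the function names are made up for this example, the leaves are assumed to be transaction hashes already in raw byte form, and pairs are combined with double SHA-256, which is the hash bitcoin uses for this purpose. Byte-order details of real block data are ignored.

```python
import hashlib


def double_sha256(data):
    """Hash the data twice with SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes):
    """Reduce a list of transaction hashes (bytes) to a single root hash."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # A hash without a partner is paired with a copy of itself.
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


if __name__ == "__main__":
    leaves = [double_sha256(name) for name in (b"L1", b"L2", b"L3", b"L4")]
    print(merkle_root(leaves).hex())
```

Changing any single leaf changes every hash on the path from that leaf to the root, which is what ties the block header to the exact set of transactions.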

Once the full merkle tree is constructed during the mining process, the value of the merkle root is included in the block header. This protects the transactions in the block from being modified. If someone tries to add a transaction or update any of the existing ones, the hash of that particular transaction changes, and hence the value of the merkle root.

Let’s say a miner in the bitcoin network wants to change the coinbase transaction in a block he sees. Once he changes the coinbase transaction, he has to build the complete merkle tree again and update the merkle root in the block header. Since the block header has changed, he has to recalculate its hash. That is, he has to carry out the same computationally expensive operation again to find a nonce value that meets the difficulty level. So it’s like mining a new block, and there is no incentive in trying to modify an existing block. Even if someone desperately wants to do that, the modified block will get rejected by the other nodes, because a block at that level (pointing to the same previous hash) has already been added to the blockchain while the attacker was busy recalculating everything from scratch.

A merkle tree also provides an efficient way to find out whether a given transaction is in a given block. You may recall we discussed Simplified Payment Verification (SPV) nodes, which do not store the entire blockchain but just a copy of all the block headers. Most wallet applications are SPV nodes. SPV nodes learn about transactions by requesting the set of transactions corresponding to a given bitcoin address from their peers. Once you send me the bitcoins to buy the fidget spinner, my wallet application will find the corresponding transaction. Now it has to check whether it’s included in a block, to confirm that everything is okay and legitimate. Let’s say we need to find whether the transaction L1 (see the above image) is in block N. Since we already have the merkle root of block N in its header, we only need to know the hashes 0–1 and 1. With those we can independently calculate the merkle root and compare it with what we have in the block header.
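The check an SPV node performs can be sketched like this. It is a minimal illustration, not how any particular wallet does it: the function names and the shape of the proof (a list of sibling hashes with a flag saying on which side each sibling sits) are assumptions made for this example.

```python
import hashlib


def double_sha256(data):
    """Hash the data twice with SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_merkle_proof(tx_hash, proof, expected_root):
    """Recompute the root from a transaction hash and its sibling hashes.

    proof is a list of (sibling_hash, sibling_is_left) pairs, ordered from
    the leaf level up to the level just below the root.
    """
    current = tx_hash
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = double_sha256(sibling + current)
        else:
            current = double_sha256(current + sibling)
    return current == expected_root


if __name__ == "__main__":
    # Build the four-leaf tree from the text by hand.
    h00, h01, h10, h11 = (double_sha256(n) for n in (b"L1", b"L2", b"L3", b"L4"))
    h0 = double_sha256(h00 + h01)
    h1 = double_sha256(h10 + h11)
    root = double_sha256(h0 + h1)
    # To prove L1 we only need hash 0-1 and hash 1, both on the right.
    print(verify_merkle_proof(h00, [(h01, False), (h1, False)], root))
```

For a block with n transactions the proof holds about log2(n) hashes, so the wallet never needs the other transactions themselves.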

Level of Difficulty

We briefly discussed the level of difficulty when we introduced proof of work. It varies with time as per the following formula.

new_difficulty = old_difficulty * (2016 blocks * 10 minutes) / (time taken in minutes to mine the last 2016 blocks)

This formula evaluates the speed of the mining network and finds out how much it deviates from the expected level. The expectation is to mine a block every 10 minutes. If the average time to mine each of the last 2016 blocks is 8 minutes, then the above factor is greater than 1, so the difficulty level is increased. If the average is above 10 minutes, then the factor is less than 1 and the difficulty level is decreased for the next 2016 blocks. The difficulty level is reevaluated after every 2016 blocks, which is roughly every 2 weeks.
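The formula above translates directly into code. This is a sketch of the formula as written here, with hypothetical names; the real software also limits how far the difficulty can move in a single adjustment, which is not modelled.

```python
BLOCKS_PER_PERIOD = 2016
EXPECTED_MINUTES_PER_BLOCK = 10


def next_difficulty(old_difficulty, actual_minutes):
    """Scale the difficulty by how fast the last 2016 blocks were mined.

    actual_minutes is the total time taken to mine the last 2016 blocks.
    """
    expected_minutes = BLOCKS_PER_PERIOD * EXPECTED_MINUTES_PER_BLOCK
    return old_difficulty * expected_minutes / actual_minutes


if __name__ == "__main__":
    # Blocks arriving every 8 minutes on average: difficulty goes up.
    print(next_difficulty(1, 8 * BLOCKS_PER_PERIOD))
    # Blocks arriving every 12 minutes on average: difficulty goes down.
    print(next_difficulty(1, 12 * BLOCKS_PER_PERIOD))
```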

The following figure shows how the difficulty level has changed over time since the inception of bitcoin. The difficulty level reflects how hard the proof of work calculation is relative to the difficulty value set at the beginning, which is 1. For example, the current difficulty is 678,760,110,083, which means that at the hash rate available when the first block was mined, it would take more than 678 billion times as long to mine a block at the current difficulty. In practice, since the computational power thrown at bitcoin mining has improved vastly, the time it takes to mine a block is kept roughly constant (at 10 minutes) by increasing the level of difficulty. During the first five years of bitcoin, the difficulty level increased from 1 to 50 billion.

In the header of each block there is a parameter called bits. In the genesis block the value of bits is 486604799, which in hexadecimal is 1D00FFFF. This is a compact format that encodes the target hash value for the block: the hash of the block must be less than or equal to the target. The target is recalculated only after every 2016 blocks, along with the difficulty level, and once calculated, the next 2016 blocks carry the same value in the bits parameter of their headers.

target = coefficient * 2^(8 * (exponent - 3))

The first two digits of the above hexadecimal value are known as the exponent, which is 1D, and the next six digits (00FFFF) are known as the coefficient. If we apply these values to the above formula, it looks like the following.

target = 00FFFF * 2^(8 * (1D - 3))
target = 00FFFF * 2^(8 * 1A)
target = 00FFFF * 2^D0

Now if we do the hexadecimal arithmetic for the above formula, we’ll find the value of the target in hexadecimal. Converting that into decimal gives approximately 2.69 * 10 ^ 67, and in binary it is:

11111111111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

To make it clearer, since the hash of the block is 256 bits long, let’s also represent the target in 256 bits by adding leading zeros.

0000000000000000000000000000000011111111111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Now the hash of the genesis block must be less than or equal to the above, which is:

0000000000000000000000000000000000000000000110011101011001101000100111000000100001011010111000010110010110000011000111101001001101001111111101110110001110101110010001101010001010100110110000010111001010110011111100011011011000001010100011001110001001101111
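The decoding of the bits parameter walked through above can be sketched as follows. The function name is made up for this example, and the sketch assumes an exponent of at least 3 and ignores the sign handling of the real compact format. The genesis block hash is written in its commonly published hexadecimal form, which is an assumption in the sense that the article itself does not give it in hexadecimal.

```python
def bits_to_target(bits):
    """Expand the compact bits value into the full target integer."""
    exponent = bits >> 24          # first two hexadecimal digits
    coefficient = bits & 0xFFFFFF  # remaining six hexadecimal digits
    return coefficient * 2 ** (8 * (exponent - 3))


GENESIS_BITS = 486604799  # 1D00FFFF in hexadecimal
GENESIS_HASH = int(
    "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f", 16
)

if __name__ == "__main__":
    target = bits_to_target(GENESIS_BITS)
    # Print both values as 64 hexadecimal digits (256 bits) for comparison.
    print(format(target, "064x"))
    print(format(GENESIS_HASH, "064x"))
    print(GENESIS_HASH <= target)
```

Treating both the hash and the target as plain integers is what makes the "less than or equal to" comparison meaningful.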

As we explained before, the difficulty of the genesis block is 1. After every 2016 blocks, the target is recalculated in the following way.

new_target = initial_target / new_difficulty, where the initial target is the genesis block target shown above.

The difficulty is calculated in the following way:

new_difficulty = old_difficulty * (2016 blocks * 10 minutes) / (time taken in minutes to mine the last 2016 blocks)

For example, the new level of difficulty for the 2017th block is calculated in the following manner:

Let’s assume the average block time over the first 2016 blocks is 8 minutes. This is just an assumption. In reality the level of difficulty of the bitcoin blockchain remained the same until the 32255th block, and only changed from 1 to 1.18 at block 32256 (2016 * 16).

new_difficulty = 1 * (2016 * 10) / (8 * 2016) = 1.25

new_target = (2.69 * 10 ^ 67) / 1.25 = 2.15 * 10 ^ 67

In binary, the new target is:

11001100110011000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Now, if we add leading zeros to make the above a 256-bit number, then the target will be:

0000000000000000000000000000000011001100110011000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

The new target is less than the previous one, so the mining software has to find a nonce value that makes the hash of the block less than or equal to the above. When the computational power increases, the average block time decreases, and then the difficulty level is increased by lowering the target value. The next 2016 blocks carry the new target in the block header as the bits parameter. The difficulty level itself is not stored in the header; it is derived from the target.

How the difficulty level changes and how it’s calculated are written into the mining software itself. If a miner wants to cheat the system by changing the rules in his/her own copy, of course he/she can do it. But that will leave him/her isolated. Once you cheat and send the mined block to others, the legitimate miners who validate the block will find that it’s not done properly and reject it. The cheating miner is then left alone on his/her own copy of the blockchain. Unless he/she owns more than 50% of the total computational power of the network, he/she will not be able to compete with the others to build the longest blockchain, and will fail.
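The relationship between target and difficulty can be sketched with exact arithmetic. The names below are hypothetical, and the difficulty-1 target is taken to be the genesis block target (FFFF followed by 208 zero bits); `Fraction` is used so that no precision is lost on such large numbers.

```python
from fractions import Fraction

# Target of the genesis block, which corresponds to a difficulty of 1.
INITIAL_TARGET = 0xFFFF * 2 ** 208


def target_for_difficulty(difficulty):
    """Return the target (as an integer) for a given difficulty level."""
    return int(Fraction(INITIAL_TARGET) / Fraction(difficulty))


def difficulty_for_target(target):
    """Return the difficulty level implied by a given target."""
    return INITIAL_TARGET / target


if __name__ == "__main__":
    new_target = target_for_difficulty(Fraction(5, 4))  # difficulty 1.25
    print(format(new_target, "064x"))
    print(difficulty_for_target(new_target))
```

A higher difficulty always means a smaller target, so fewer hash values qualify and more attempts are needed on average.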

51% Attack

What will happen if most of the miners in the bitcoin network are cheaters, the bad guys? That does not really matter as long as those bad guys do not work together. But if they do work together, or in other words, if at least 51% (more than 50%) of the total computational power of the bitcoin network is owned by a single miner or a single group of miners, that’s not a healthy situation for bitcoin. In practice, gaining 51% of the total computational power of the bitcoin network takes a huge investment. Let’s say someone still does it. What would happen next?

Before answering the above question, it’s worth looking at what is actually meant by owning 51% of the total computational power of the bitcoin network. In the above diagram, let’s assume the total bitcoin network, with all its miners (4 in this case), can generate 100,000 hashes per second. That’s the total computational power of the network. For simplicity, let’s say all the miners start mining at the same time, and each miner has its own computational power. The first miner can generate 25,000 hashes per second, while the second miner can generate 40,000 hashes per second. Now all these miners are in a race to find the magic number that solves the difficulty puzzle. What chance does each miner have? The first miner has a 25% probability of winning, while the second miner has 40%. To make it simpler, think of a lottery draw: the more tickets you own, the higher your probability of winning the prize. But it’s just a probability, and even a person who has bought just one ticket can win.
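The lottery analogy can be illustrated with a small simulation. This is a rough sketch: the function names are made up, and the hash rates of the third and fourth miners (20,000 and 15,000 hashes per second) are hypothetical values chosen only so that the total comes to 100,000.

```python
import random


def win_probabilities(hash_rates):
    """Each miner's chance of finding the next block is its share of the total."""
    total = sum(hash_rates.values())
    return {miner: rate / total for miner, rate in hash_rates.items()}


def simulate_blocks(hash_rates, rounds, seed=0):
    """Draw a winner for each block, weighted by hash rate, and count wins."""
    rng = random.Random(seed)
    miners = list(hash_rates)
    weights = [hash_rates[m] for m in miners]
    wins = {m: 0 for m in miners}
    for winner in rng.choices(miners, weights=weights, k=rounds):
        wins[winner] += 1
    return wins


if __name__ == "__main__":
    rates = {
        "miner-1": 25_000,
        "miner-2": 40_000,
        "miner-3": 20_000,  # hypothetical
        "miner-4": 15_000,  # hypothetical
    }
    print(win_probabilities(rates))
    print(simulate_blocks(rates, rounds=10_000))
```

Over many blocks the win counts approach the hash rate shares, but for any single block even the smallest miner can come first.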

The total computational power of the bitcoin network is expressed as the number of hashes generated by all the nodes per second. At the time of this writing it was around 5.6 million tera-hashes per second. One tera-hash is 1,000,000,000,000 hashes. The figure below shows how the total computational power of the bitcoin network has increased over time. Owning 51% of the total computational power means that a single mining node (or group) is able to generate more than 2.8 million tera-hashes per second. In other words, this miner can mine blocks faster than all the others in the network together, so he/she has a higher chance of producing the longest blockchain.

What kind of an impact will this have on the bitcoin network?

If the owner of 51% of the total computational power decides to cheat, can he/she change the value of the block reward (say from 12.5 to 1000 bitcoins) and add it to his/her own account? As we discussed before, the value of the block reward and how it changes with time are defined in the bitcoin mining software, and each miner runs a copy of it. The attacker can change his/her copy to generate more bitcoins as the reward and add such a block to his/her own blockchain. But once that block is sent to the other nodes of the network, the miners who run the legitimate software and follow the right rules will refuse to accept it. So the blockchain behind those nodes is not updated with this block, which results in a fork.

When there is a fork, all the nodes on each branch are keen to know whether they are on the longest branch. Since the bad guy controls most of the computational power, he/she can generate more blocks and would possibly own the longest blockchain. Now, would the good guys attempt to switch to the bad guy’s longest blockchain?

How does a miner find out that there is another branch of the blockchain that is longer than the one he/she is currently working on? Remember we discussed the block height? The block height is the number of blocks preceding a particular block on the blockchain. It is not a field in the block header: every node tracks the height of the blocks it knows, and in version 2 blocks and later the height is also recorded in the coinbase transaction. Once a mining node receives a block, it validates it, and if the block conforms to the accepted rules of the bitcoin network, the node looks at the block height. If that height is greater than the height of the latest block known to this node, then there is another branch of the blockchain that is longer than its own. So it can traverse back through the blockchain (via the references to the previous block) and update its own copy by requesting the latest blocks from its peers. Keep in mind that this complete process only happens if the block it receives is valid.

In the case of this 51% attack, since the block generated by the bad guy is invalid, the other legitimate nodes in the bitcoin network will not shift to his/her blockchain even though it is longer. This leaves the attacker isolated on his/her own branch. He/she can accumulate more and more bitcoins, but no one outside that branch will accept them.

But there are other things an attacker can do with 51% of the total computational power. Double spending is one option. For example, the attacker buys something for 1 bitcoin and posts the transaction to the bitcoin network. The attacker mines a block with that transaction and updates the blockchain. After the merchant accepts the transaction as confirmed, the attacker re-mines that block with a new transaction that uses the same inputs but sends the amount back to the attacker’s own bitcoin address. That block is a valid block, and with his/her dominance in computational power the attacker can mine more blocks on top of it to make it the longest blockchain.

The attacker can also block some transactions from being added to the blockchain. The attacker can have his/her own preferences and keep on mining blocks with only the set of transactions he/she wants. This delays certain transactions, even though they were submitted much earlier.

At the time of this writing, nearly all miners mine through pools; very few mine solo any more. A mining pool lets miners from different parts of the world contribute their computational power together, and pays each miner based on the hash rate he/she contributes. The Stratum mining protocol is used for communication between the mining pool and its participants, and there are alternative protocols too. In June 2014, GHash.IO, one of the largest mining pools, got so big that it actually had more than 50% of the entire capacity of the bitcoin network. This is something the community had feared for a long time, and it led to a backlash against GHash. By August, GHash’s market share had gone down, as the pool stopped accepting new participants. The following figure shows the percentage of the hash rate generated by popular mining pools at the time of this writing.

Despite its name, the 51% attack scenario doesn’t actually require 51% of the hashing power. In fact, such an attack can be attempted with a smaller percentage of the hashing power. The 51% threshold is simply the level at which such an attack is almost guaranteed to succeed. A consensus attack is essentially a tug-of-war for the next block and the “stronger” group is more likely to win. With less hashing power, the probability of success is reduced, because other miners control the generation of some blocks with their “honest” mining power. One way to look at it is that the more hashing power an attacker has, the longer the fork he can deliberately create, the more blocks in the recent past he can invalidate, or the more blocks in the future he can control. Security research groups have used statistical modeling to claim that various types of consensus attacks are possible with as little as 30% of the hashing power [ref].
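How quickly the odds fall off with less hashing power can be illustrated with the catch-up probability from the Nakamoto paper listed in the references: an attacker holding a share q of the hashing power who is z blocks behind catches up with probability (q/p)^z, where p = 1 - q, and with certainty once q is at least p. The sketch below implements only that term, not the paper’s full calculation, and the function name is made up for this example.

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Probability that an attacker ever catches up from a given deficit.

    attacker_share is the attacker's fraction of the total hashing power
    (between 0 and 1); blocks_behind is how many blocks the attacker trails.
    """
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        # With half or more of the hashing power, catching up is certain.
        return 1.0
    return (q / p) ** blocks_behind


if __name__ == "__main__":
    for share in (0.1, 0.3, 0.45, 0.51):
        print(share, catch_up_probability(share, blocks_behind=6))
```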

Transaction Fees (contd.)

We discussed before how the transaction fee is calculated for each transaction: it’s the difference between the total inputs and the total outputs. Since there is a limited supply of bitcoins, 21 million, from around 2140 onwards there won’t be any block rewards for the miners. Miners are the most critical component of the bitcoin network and their existence is extremely important. The only way for miners to be profitable after 2140 is through transaction fees.

Based on the fee attached to a transaction, miners may decide when to include it in a block they mine. There is no agreed policy on how to prioritize transactions based on transaction fees; it’s totally up to the individual miner to decide. At the moment transaction fees do not contribute a lot to the miners’ revenue: roughly around 1 percent.

According to the default policy in the bitcoin reference implementation released in 2015 (version 0.10.0), no fee is expected if a transaction meets all of the following three conditions.

  1. The size is less than 1000 bytes.
  2. All the outputs are 0.01 bitcoin or larger.
  3. The priority is high enough. The priority is defined as a function of input value, input age and the size of the transaction. To calculate it, first calculate (input value * input age) for each input, sum the results, and divide the sum by the size of the transaction. With this, unspent transaction outputs that have been around for a long time earn a higher priority when used in a new transaction.

If the above criteria are not met, then a fee is expected. The fee is about 0.0001 bitcoins per 1000 bytes. If you make a transaction that does not meet the fee requirements, it will probably still find its way into the blockchain, but getting your transaction recorded more quickly and reliably generally requires paying the standard transaction fee.
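The three conditions and the fee rule above can be sketched as follows. Several things here are assumptions that the text does not state: values are in satoshis, input age is counted in blocks, the "high enough" threshold is taken to be the priority of a 1-bitcoin input that is 144 blocks (about a day) old in a 250-byte transaction, and the fee is rounded up to whole units of 1000 bytes. The function names are made up for this example.

```python
import math

COIN = 100_000_000  # satoshis per bitcoin

# Assumed threshold (not given in the text): 1 bitcoin, 144 blocks old,
# spent in a 250-byte transaction.
PRIORITY_THRESHOLD = COIN * 144 / 250


def priority(inputs, tx_size_bytes):
    """inputs is a list of (value_in_satoshis, age_in_blocks) pairs."""
    return sum(value * age for value, age in inputs) / tx_size_bytes


def expected_fee(inputs, output_values, tx_size_bytes):
    """Return the expected fee in satoshis under the policy sketched above."""
    free = (
        tx_size_bytes < 1000
        and all(value >= COIN // 100 for value in output_values)
        and priority(inputs, tx_size_bytes) > PRIORITY_THRESHOLD
    )
    if free:
        return 0
    # About 0.0001 bitcoin per 1000 bytes; rounding up is an assumption.
    return math.ceil(tx_size_bytes / 1000) * (COIN // 10_000)


if __name__ == "__main__":
    old_input = [(COIN, 200)]  # 1 bitcoin that has sat unspent for 200 blocks
    print(expected_fee(old_input, [COIN // 2], 250))  # old, small, no dust
    print(expected_fee(old_input, [1_000], 250))      # tiny output
```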

Redeem with Multiple Signatures

In bitcoin there can be multiple types of transactions, based on how the recipient of a transaction claims or redeems it. The type is defined by the locking script you pick for the outputs while constructing the transaction. What we discussed so far is one type of transaction, where the owner of the public key corresponding to a given key hash proves possession of the private key by signing the transaction to redeem it.

Here is the script, which we used before to lock the outputs.

OP_DUP OP_HASH160 <pub_key_hash> OP_EQUALVERIFY OP_CHECKSIG

This type of a transaction is known as Pay to Public Key Hash (p2pkh) — and it’s the most common bitcoin transaction type.

Another one is Pay to Public Key, which is a simplified form of p2pkh. It is not commonly used in new transactions any more, because p2pkh scripts are more secure: they do not reveal the public key until the output is spent.

One issue with the way bitcoin locking scripts work is that the sender has to specify the script exactly the way it’s expected by the recipient. If a recipient has a policy that redeeming bitcoins has to be signed (approved) by multiple parties, then the sender needs to know who they are and the complete script corresponding to the policy. Bitcoin addresses this problem with the Pay to Script Hash (p2sh) transaction type. In that case the recipient simply asks the sender to send the bitcoins to the hash of a script. The purpose of p2sh is to move the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to the redeemer.

Let’s take a concrete example. I can spend the bitcoins you sent me to buy two fidget spinners only if it’s approved by two out of three of my managers. If we do not follow the p2sh mechanism, this is how you would create the locking script for the outputs. It says that signatures from any two of the three owners of the public keys mentioned in the script can unlock the transaction.

2 <pub_key_1> <pub_key_2> <pub_key_3> 3 OP_CHECKMULTISIG

To redeem the bitcoins sent to me, I can provide any of the following unlocking scripts. Each script has two signatures.

Unlocking script 1: signature_with_key_1, signature_with_key_2
Unlocking script 2: signature_with_key_2, signature_with_key_3
Unlocking script 3: signature_with_key_1, signature_with_key_3
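The two-out-of-three rule can be sketched with a toy check. Nothing here is real cryptography: `toy_verify` is a hypothetical stand-in for signature verification that just matches names, and `check_multisig` is an illustrative function, not a script interpreter. It does mirror one real property of the multi-signature check: signatures have to appear in the same order as their public keys.

```python
def toy_verify(signature, pub_key):
    """Stand-in for signature verification: match names only."""
    return signature == "signature_with_" + pub_key[len("pub_"):]


def check_multisig(required, pub_keys, signatures, verify=toy_verify):
    """Return True if exactly `required` signatures match distinct keys in order."""
    if len(signatures) != required:
        return False
    key_index = 0
    for signature in signatures:
        # Walk forward through the keys until one matches this signature.
        while key_index < len(pub_keys) and not verify(signature, pub_keys[key_index]):
            key_index += 1
        if key_index == len(pub_keys):
            return False  # ran out of keys without a match
        key_index += 1    # each key can be used only once
    return True


if __name__ == "__main__":
    keys = ["pub_key_1", "pub_key_2", "pub_key_3"]
    print(check_multisig(2, keys, ["signature_with_key_1", "signature_with_key_2"]))
    print(check_multisig(2, keys, ["signature_with_key_2"]))
```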

We can make the above simpler with p2sh. Then the locking script would look like this.

OP_HASH160 <20_byte hash of the redeem script> OP_EQUAL

The unlocking script would be: signature_with_key_1, signature_with_key_2 redeem_script

And the value of the redeem script is:

2 <pub_key_1> <pub_key_2> <pub_key_3> 3 OP_CHECKMULTISIG

But, unlike in the previous case, with p2sh, the redeem script is maintained at the recipient’s side.

Before we wind up this lengthy article (congrats if you made it this far!), let me clarify one last thing, which we discussed at the beginning.

The bitcoin address represents the owner of the public/private key pair (it can represent some other things as well, which we will discuss later)

Now it should be clear that a bitcoin address can also represent a script, not just the owner of a public key. Also, you may recall that a bitcoin address corresponding to a public key starts with 1; in the same way, a bitcoin address corresponding to a script starts with 3.
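Where the leading 1 and 3 come from can be sketched with a small Base58Check encoder. This is a simplified illustration with a made-up function name: it assumes the usual version bytes (0 for a public key hash, 5 for a script hash) and takes the 20-byte hash as given, using placeholder bytes rather than a real key or script hash.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version, payload):
    """Encode a version byte plus payload with a 4-byte checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the first alphabet character, '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


if __name__ == "__main__":
    placeholder_hash = bytes(range(20))  # stands in for a real 20-byte hash
    print(base58check(0x00, placeholder_hash))  # public key hash address
    print(base58check(0x05, placeholder_hash))  # script hash address
```

The version byte sits at the front of the encoded number, so it determines the first character of every address of that type.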

Satoshi

As we discussed before, there is only a limited supply of bitcoins: 21 million. But one bitcoin can be divided into 100 million satoshis. The satoshi is the smallest unit of bitcoin. It’s named after the inventor(s) of bitcoin, Satoshi Nakamoto.

Bitcoin, bitcoin and BTC

Bitcoin, with an uppercase B, is the name of the protocol. The word bitcoin (all lowercase) is used to refer to bitcoin as a currency (5 bitcoins). BTC is the currency symbol for bitcoin, just like USD is used for the US dollar, so 5 bitcoins can also be written as 5 BTC. In this article I used the lowercase word bitcoin in all cases, so it does not necessarily refer to the currency.

References

  1. The bitcoin research paper by Satoshi Nakamoto: https://bitcoin.org/bitcoin.pdf
  2. Mastering Bitcoin by Andreas M. Antonopoulos
  3. The Book Of Satoshi: The Collected Writings of Bitcoin Creator Satoshi Nakamoto
  4. Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction
  5. Bitcoin Developer Guide: https://bitcoin.org/en/developer-guide
