This is a continuation of the previous installment on “proof of work”.
Obviously, Bitcoin “miners” do not actually know everything. What they do know is the same thing every Bitcoin client knows: what they hear from the Bitcoin network. Such peer-to-peer (P2P) networks are nothing new; if you ever used Napster or BitTorrent, you have the basic idea. If not, go read the Wikipedia article.
Definition: A running copy of the Bitcoin software is called a Bitcoin client.
To use Bitcoin, you must access a system that is part of the Bitcoin network. Any system, including the one on your desk, may join the Bitcoin network simply by running a Bitcoin client, whose initial action is always to locate and connect to a few neighbors (aka. peers) in the network. A system running a Bitcoin client is called a Bitcoin node.
The Bitcoin network’s function is to relay two types of messages: transactions and blocks.
A transaction is a digitally signed instruction to transfer money between addresses, as described earlier.
A block is:
- A set of transactions
- A timestamp
- The 256-bit hash of the preceding block
- A nonce sufficient to make the hash of the block not exceed the current target
By including the 256-bit hash of some other block, each block asserts its position in the block chain.
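To make the linking concrete, here is a minimal Python sketch of a block that points at its predecessor by hash. The `Block` class, its field names, and the byte layout in `serialize` are assumptions made up for illustration; the real Bitcoin wire format is different. The only thing the sketch is meant to show is that each block commits to the hash of the block before it.

```python
import hashlib
from dataclasses import dataclass, field


def sha256d(data: bytes) -> bytes:
    """Apply SHA-256 twice, yielding a 256-bit (32-byte) digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass
class Block:
    prev_hash: bytes   # 256-bit hash of the preceding block
    timestamp: int     # seconds since the Unix epoch
    transactions: list = field(default_factory=list)  # raw transaction bytes
    nonce: int = 0

    def serialize(self) -> bytes:
        # Toy layout (an assumption for this sketch, not the real format).
        tx_digest = sha256d(b"".join(sha256d(tx) for tx in self.transactions))
        return (self.prev_hash
                + self.timestamp.to_bytes(8, "little")
                + tx_digest
                + self.nonce.to_bytes(8, "little"))

    def block_hash(self) -> bytes:
        return sha256d(self.serialize())


# A two-block chain: the child names its parent by hash.
genesis = Block(prev_hash=bytes(32), timestamp=0, transactions=[b"first tx"])
child = Block(prev_hash=genesis.block_hash(), timestamp=600,
              transactions=[b"second tx"])
```

Because the parent's hash is part of the child's contents, changing anything in the parent changes its hash and breaks the link.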
The Bitcoin network relays every valid transaction and every valid block to every node. When presented with a transaction or a block, a node will validate it before relaying it to the node’s neighbors. This prevents the network from being choked with garbage data. To validate a block, a node checks (among other things) that the block’s hash does not exceed the current target, that the timestamp inside the block is not too far in the future or the past, and that all transactions inside the block are valid. To validate a transaction, a node checks (among other things) that the signature is valid and that the input(s) to the transaction have not already been consumed by some earlier transaction.
To perform these validations, every node must maintain a complete copy of all transactions and all blocks, all the way back to Block #0 (the “genesis block”). This is a slight exaggeration, but not much… And yes, it does make Bitcoin’s scalability a serious concern. More about this in a later installment.
Bitcoin “miners” are clients that attempt to create new valid blocks. They do this by putting some transactions in a candidate block, picking a nonce, computing the hash of the resulting block, and repeating with different nonces until they find a block whose hash does not exceed the current target. Then they broadcast that block to the network, thus appending it to the block chain that every client sees.
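The mining loop just described can be sketched in a few lines of Python. This is a toy, not the real client: the candidate block is an arbitrary byte string, the nonce encoding and the byte order used to read the hash as an integer are assumptions, and the target is deliberately easy so the loop finishes quickly.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Apply SHA-256 twice, yielding a 256-bit digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(candidate: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the hash, read as an integer, does not exceed target."""
    for nonce in range(max_nonce):
        digest = sha256d(candidate + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") <= target:
            return nonce, digest
    return None  # nonce space exhausted without success


# A deliberately easy target: roughly one hash in 2**12 qualifies.
easy_target = 2**244 - 1
result = mine(b"prev-hash|timestamp|transactions", easy_target)
```

Lowering `easy_target` makes the loop take proportionally longer on average, which is the whole point of the scheme.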
Miners have a financial incentive: They can embed one coinbase transaction in each block they mine. A coinbase transaction has no input address and has an output address of the miner’s choice. The coinbase transaction includes new bitcoins (hence the term “mining”) and also any transaction fees associated with the transactions in the block. This incentive structure is an important feature of Bitcoin, and I hope to say more about it later.
The current target for the block chain is defined by a calculation, so any two clients looking at the block chain will calculate the same target. This calculation aims to adjust the target such that, on average, one block will be mined every ten minutes, no matter how much total computing power is devoted to mining. The target changes every 2016 blocks based on the timestamps within those blocks. Why 2016? Because the Bitcoin designer(s) decided two weeks was a good interval, and at 10 minutes per block, 2016 blocks will be mined every two weeks:
$$\frac{60\frac{\mathrm{min}}{\mathrm{hr}}*24\frac{\mathrm{hr}}{\mathrm{day}}*7\frac{\mathrm{day}}{\mathrm{week}}}{10\frac{\mathrm{min}}{\mathrm{block}}}
=\frac{2016}{2}\frac{\mathrm{block}}{\mathrm{week}}$$
(I admit it; I love MathJax. If the above looks like nonsense, you probably just need to click through to the post.)
When 2016 blocks take more than two weeks to mine, the target goes up to make mining easier; when they take less than two weeks, the target goes down to make mining harder. In symbols, if \(T_{prior}\) is the previous target and \(t_{prior}\) is the time it took to mine 2016 blocks using that target, then the updated target \(T_{current}\) is just:
$$T_{current}=T_{prior}*\frac{t_{prior}}{2\:\mathrm{weeks}}$$
(Aside: I am not well-versed in control theory, but this looks like an extraordinarily simple feedback loop for the internals of a major world currency. Did I mention that Bitcoin is still a bit of an experiment? Then again, what currency isn’t these days…)
(Aside #2: This actually is the formula used in the Bitcoin source code. But did you notice that, strictly speaking, there should be a “+1” and a “-1” in there somewhere? Because, for example, there are 11 numbers from 0 to 10. Fortunately, these values are on a scale where it does not matter. Still, this surprised me a bit, since most of the code is mathematically precise.)
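For the arithmetic-minded, the retargeting formula can be written as a short Python sketch. The names `retarget`, `TWO_WEEKS`, and `BLOCKS_PER_PERIOD` are mine, and the sketch implements only the formula as given above; it makes no attempt to mirror any other details of the real source code.

```python
TWO_WEEKS = 14 * 24 * 60 * 60            # seconds in two weeks
BLOCKS_PER_PERIOD = TWO_WEEKS // (10 * 60)  # blocks in two weeks at 10 minutes each


def retarget(prior_target: int, prior_seconds: int) -> int:
    """Scale the target by (time actually taken) / (two weeks)."""
    return prior_target * prior_seconds // TWO_WEEKS


# Blocks arrived twice as fast as intended, so the target is halved
# (mining becomes twice as hard).
faster = retarget(2**200, TWO_WEEKS // 2)

# Blocks arrived half as fast as intended, so the target is doubled.
slower = retarget(2**200, 2 * TWO_WEEKS)
```

Python integers are arbitrary precision, so the huge target values need no special handling here.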
The target is typically a huge number in excess of \(2^{200}\), and it goes down as the total hashing power of the miners goes up. Consequently, interested humans usually think in terms of the difficulty instead. Definition: The Bitcoin difficulty is the average number of nonces you have to try to find a valid block (a.k.a. the work) divided by \(2^{32}\) (roughly 4 billion). Mathematically:
$$D=\frac{work}{2^{32}}=\frac{2^{256}}{(T_{current}+1)*2^{32}}$$
Note that \(D\) is just a number for human consumption. It scales in direct proportion to the computational effort required for mining, so twice the difficulty means twice the effort.
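As a quick check of the definition, here is the difficulty formula as a Python sketch. The function name `difficulty` is my own; the body is just the equation above, using integer division for the work.

```python
def difficulty(target: int) -> float:
    """Difficulty as defined above: work / 2**32, with work = 2**256 / (target + 1)."""
    work = 2**256 // (target + 1)   # average number of hashes per valid block
    return work / 2**32


# With target + 1 == 2**224, the work is exactly 2**32 hashes, so D == 1.
d_one = difficulty(2**224 - 1)

# Halving (target + 1) doubles the work, and therefore doubles D.
d_two = difficulty(2**223 - 1)
```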
If we know the current difficulty \(D\), we can estimate how fast all miners are hashing in aggregate. On average, it takes \(D*2^{32}\) hashes to find a nonce that works, and the target is selected to make that take 10 minutes, so:
$$\frac{\mathrm{hashes}}{\mathrm{second}}\approx\frac{D*2^{32}}{600}$$
You should find that the “difficulty” and “hash rate” of the network, as reported by various Bitcoin sites, obey this formula.
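If you want to try this yourself, the estimate is a one-liner. In the Python sketch below, the function name and the example difficulty of one million are hypothetical; plug in whatever figure your favorite Bitcoin site reports.

```python
def network_hash_rate(difficulty: float, block_seconds: float = 600) -> float:
    """Estimate aggregate hashes per second from the difficulty D."""
    return difficulty * 2**32 / block_seconds


# Hypothetical example value, not a real measurement.
example_difficulty = 1_000_000
rate = network_hash_rate(example_difficulty)
print(f"{rate:.3e} hashes/second")
```

The result is only an estimate, since the actual time between blocks drifts away from 10 minutes between target adjustments.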
Well, that pretty much wraps up my introduction to Bitcoin. There is quite a bit more to say; just off the top of my head:
- Economics and incentives
- The security (or lack thereof) of Bitcoin’s hash
- The threat (or lack thereof) from quantum computers
- Addresses, transactions, and scripts
- Scalability
…and so on. But there is no obvious order in which to cover them. So I think I will pause here and ask:
Any questions?
Thank you for this great series.
How do the nodes decide which miner’s block to include as the next in the chain ?
Thnx,
@Leibniz0 —
Thank you for the kind words.
First, search for “Prime Directive” in the previous post.
The nodes keep track of multiple chains at the same time. The chain with the highest total work is considered the “real” chain. Most documentation calls this the “longest” chain, which is true if you define “length” as “sum of work”.
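Here is a minimal Python sketch of that rule, assuming each chain is represented simply as a list of the targets its blocks were mined against, and that the work of a block is \(2^{256}/(T+1)\) as in the difficulty formula from the post. The function names are mine, and this is not how the real client stores or compares chains.

```python
def block_work(target: int) -> int:
    """Average number of hashes needed to find a block at this target."""
    return 2**256 // (target + 1)


def best_chain(chains):
    """Pick the chain whose blocks represent the most total work."""
    return max(chains, key=lambda chain: sum(block_work(t) for t in chain))


# Three easy blocks versus two harder ones: the shorter chain wins,
# because it represents more total work.
easy_chain = [2**240, 2**240, 2**240]
hard_chain = [2**238, 2**238]
winner = best_chain([easy_chain, hard_chain])
```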
In principle, this means every node must keep a copy of every block it ever sees. In practice, a node only keeps blocks that contribute to a chain with a reasonable chance of becoming the longest. At least, I think that is how it works; I have not studied that part of the source code yet…
Also, I believe each release of the Bitcoin software includes a few hard-coded earlier blocks as “anchors”, including the genesis block. So the Prime Directive is not absolute, since only chains that start from the most recent anchor can possibly be considered valid. Again, this is my current understanding; I have not read the source code in detail.
This incomplete answer is the best I can do for now. If and when I do learn the details, I will probably do a post on the management of the block chain, block chain “forks”, etc.
We already saw one hiccup with a software release that caused a fork of the block chain. Can you think of other glitches that might arise in the network?
With so little for sale in BTC, it seems like a minor risk for government. What will be the tipping point that governments might declare BTC software illegal and have ISPs start sniffing and dropping bitcoin network traffic (e.g. napster)?
Who is buying BTC at $250? This seems like a tulip bubble in full flower (pun intended) and is stunning to watch.
Isn’t the software monoculture a bit dodgy? It would be less fragile if no single implementation had more than say 10% of the network. In effect with one dominant implementation you could argue the committers are the equivalent of a Central Bank policy committee: they control the rules as long as they can keep the community upgrading…
One interesting development would be to see what happens if (when?) illegal content is stored in the block chain: for modest size content, you can use steganography to encode stuff in the blockchain by having specifically crafted transactions encode your content, and publish a “decode” algorithm for the world to see it. It costs the publisher only transaction fees (all the encoding transactions can be made between addresses they own so no actual expense of coins) and for that you get perpetual distributed storage… What if child porn or wikileak-style material, or anything that usually gets taken down, is published that way?
it’s the best explanation of bitcoin i have read. spent a good chunk of my sunday reading and rereading thru all the 10 parts and i feel a fog has been lifted. you should publish it as an ebook or something. much gratitude as we approach the $200 mark.