I started 2022 with a mini batch at the Recurse Center. I did a full batch in Fall 2020, and hanging out with people working on their side projects full time seemed like a fun way to kick off the year.

My plan was to port my toy blockchain from Python to Rust (repo here, write-up here). The original project went through multiple iterations, starting with proof-of-work and later adding Merkle trees and elliptic curve cryptography, which made it a suitable candidate for a one-week rewrite.

As with a lot of RC projects (for me anyway), things didn't play out as planned. I had moments of confusion with the Rust compiler, moved past them to contemplate working on speedups, and ended up compiling to WebAssembly to enable mining in the browser.

What made for a nice ending, I thought, was being able to connect the mini batch project with what I had previously worked on at RC.

Blockchain 101

For our purposes, a blockchain is a distributed database where records are divided into 'blocks' and the blocks form a 'chain'.

Each block is made up of a header (where block attributes live) and a body (where records live).

block.jpg

The hash of the header is used to identify each block (commonly referred to as the block hash) and is included in the header of the next block (hence the chain).

chain.jpg
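The linking can be sketched in a few lines. This is a simplified illustration, not the header layout used in the project: the `make_header` and `block_hash` helpers are made up for the example, each header is just the previous block hash followed by some placeholder attribute bytes, and a single SHA-256 stands in for whatever hash the real chain uses.

```python
import hashlib


def make_header(previous_hash: bytes, attributes: bytes) -> bytes:
    """Hypothetical simplified header: previous block hash + other attributes."""
    return previous_hash + attributes


def block_hash(header: bytes) -> bytes:
    """Identify a block by the SHA-256 hash of its header."""
    return hashlib.sha256(header).digest()


# The first block has no predecessor, so we use 32 zero bytes (an assumption).
genesis = make_header(bytes(32), b"block 0")
second = make_header(block_hash(genesis), b"block 1")
third = make_header(block_hash(second), b"block 2")

# Each header starts with the hash of the header before it.
assert second[:32] == block_hash(genesis)
assert third[:32] == block_hash(second)

# Editing an earlier block changes its hash, which breaks the link.
tampered = make_header(bytes(32), b"block 0 (edited)")
assert second[:32] != block_hash(tampered)
```

Because every header commits to the one before it, changing any block means recomputing every block that follows.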

The Python function to initialize the header is relatively straightforward - a simple conversion from ints to bytes followed by a concatenation. Note that all the attributes are pre-determined except for the nonce.

VERSION: int = 0

def init_header(previous_hash: bytes, timestamp: int, nonce: int) -> bytes:
    """Initialize header."""
    return (
        VERSION.to_bytes(1, byteorder="big")
        + previous_hash
        + timestamp.to_bytes(4, byteorder="big")
        + nonce.to_bytes(32, byteorder="big")
    )
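To make the layout concrete, here is a small check of the header size. The function is restated so the snippet runs on its own, and the all-zero previous hash, the timestamp and the nonce are placeholder values chosen for illustration.

```python
VERSION: int = 0


def init_header(previous_hash: bytes, timestamp: int, nonce: int) -> bytes:
    """Same layout as above: version, previous hash, timestamp, nonce."""
    return (
        VERSION.to_bytes(1, byteorder="big")
        + previous_hash
        + timestamp.to_bytes(4, byteorder="big")
        + nonce.to_bytes(32, byteorder="big")
    )


# Placeholder inputs for illustration only.
previous_hash = bytes(32)
header = init_header(previous_hash, timestamp=1634700000, nonce=7)

# 1 (version) + 32 (previous hash) + 4 (timestamp) + 32 (nonce) bytes
print(len(header))  # 69

# Only the trailing 32 bytes change when the nonce changes.
print(header[-32:].hex())
```

Since everything but the nonce is fixed, searching for a valid block amounts to varying those last 32 bytes.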

The nonce is chosen such that the leading bytes of the block hash are zero. The more zeroes we want, the more difficult it is to find the nonce. In the example below we require the two leading bytes to be zero, or four leading zeroes in hex. The first ten integers don’t get us there.

import hashlib

previous_hash = (0).to_bytes(32, byteorder="big")
timestamp = 1634700000

def sha256_2x(header: bytes) -> str:
    """Apply SHA-256 twice and return the result as a hex string."""
    return hashlib.sha256(hashlib.sha256(header).digest()).hexdigest()

for nonce in range(10):
    header = init_header(previous_hash, timestamp, nonce)
    print(f"nonce: {nonce}, hash: {sha256_2x(header)}")
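Extending the loop until it succeeds gives a basic miner. The sketch below is one way to write it, not necessarily how the original project does it: `mine` is a hypothetical helper, it tries nonces in increasing order from zero, and this version of `sha256_2x` returns raw bytes rather than a hex string so the leading bytes can be compared directly.

```python
import hashlib
from typing import Tuple

VERSION: int = 0


def init_header(previous_hash: bytes, timestamp: int, nonce: int) -> bytes:
    """Same layout as above: version, previous hash, timestamp, nonce."""
    return (
        VERSION.to_bytes(1, byteorder="big")
        + previous_hash
        + timestamp.to_bytes(4, byteorder="big")
        + nonce.to_bytes(32, byteorder="big")
    )


def sha256_2x(header: bytes) -> bytes:
    """Apply SHA-256 twice and return the raw 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def mine(previous_hash: bytes, timestamp: int, zero_bytes: int = 2) -> Tuple[int, bytes]:
    """Hypothetical helper: try nonces in order until the block hash
    starts with `zero_bytes` zero bytes."""
    target_prefix = bytes(zero_bytes)
    nonce = 0
    while True:
        block_hash = sha256_2x(init_header(previous_hash, timestamp, nonce))
        if block_hash.startswith(target_prefix):
            return nonce, block_hash
        nonce += 1


nonce, block_hash = mine(bytes(32), 1634700000)
print(f"nonce: {nonce}, hash: {block_hash.hex()}")
```

If the hash output is treated as uniformly random, each nonce has a 1 in 256^2 = 65,536 chance of producing two leading zero bytes, and every extra zero byte multiplies the expected work by 256.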