the blockchain for beginners

Glossary

cryptography

Secret codes and ciphers have been used to hide information throughout history, even before computers existed. The study of these techniques is called “cryptography” (from Greek “kryptos”, meaning hidden). In the internet era, cryptography is used to protect computer information — the massive volume of data flowing over networks and routed through millions of computers every second. Without cryptography, information exchange and commerce at the speed of the internet would be impossible.

If two people want to communicate securely with each other, cryptography lets them:

encode and exchange messages with each other, so that no one else who might intercept the messages can read them (“privacy”);

ensure the messages they receive have not been tampered with in transit (“integrity”);

know that the messages they receive are actually from the other, and not from another interloper (“authenticity”).

Cryptography is synonymous with secure communication, and the term “crypto” can refer to cryptography in the context of information security. More recently, however, “crypto” has become the informal industry term for cryptocurrencies and cryptonetworks.

cryptographic hash function

Cryptographic hash functions are special programs that ensure the integrity of data in digital applications. The outputs of a hash function are called “cryptographic digests” and are the foundational data structures upon which blockchains are built.

A hash function takes input data of any length and returns a fixed-length value. This output value (sometimes called a “digest” or “tag”) is computed in a way that is:

deterministic: the same input always results in the same output;

non-invertible: it is computationally infeasible to recover the input from the output;

collision-resistant: it is computationally infeasible to find two inputs that result in the same output.

These properties make hash functions efficient for looking up and comparing data, yet difficult to reverse-engineer. These functions, often considered the workhorses of modern cryptography and blockchains, are used to check for data tampering, which makes them well-suited to decentralized, permissionless applications.
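The properties above can be observed directly with a small sketch. It assumes SHA-256 from Python's standard `hashlib` module as the hash function; the helper name `digest` is chosen here for illustration only.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


short = digest(b"hello")
large = digest(b"hello" * 100_000)

# Fixed length: 256 bits = 64 hex characters, whatever the input size.
print(len(short), len(large))      # 64 64

# Deterministic: hashing the same input again gives the same digest.
print(digest(b"hello") == short)   # True

# Changing a single character produces a different digest.
print(digest(b"hellp") == short)   # False
```

The last comparison is the tamper-detection property in miniature: any change to the input, however small, shows up as a different digest.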

cryptographic digest

The output of a hash function is called a “cryptographic digest”. A cryptographic digest is a unique, fixed-length tag representing a single piece of data. It is used to detect tampering, since even a small change to the input data results in a totally different output. For example, a cryptographic digest of the Tolstoy novel, War and Peace, will differ from the cryptographic digest of the same novel with a single misspelled word.

Because of this property, cryptographic digests are well-suited for immutable record keeping. In a blockchain, these digests are linked together to create a ledger of transactions that no one can remove, modify, or otherwise tamper with. Therefore, anyone can reconstruct the blockchain from any point and verify its correctness.

tamper-proof ledger

A ledger is a list of transactions. Those transactions don’t necessarily have to be payments; they can represent transfers of any asset, such as real estate deeds or an interest-bearing security.

A blockchain is fundamentally a tamper-proof ledger. Because each ledger transaction is represented by a cryptographic digest, an entry cannot be altered without detection. Furthermore, by hashing and turning the entire ledger into a cryptographic digest, as blockchains do, any addition, alteration, or omission of any transaction will change the cryptographic digest of the entire ledger. Thus, blockchains enable participants to audit one another in a decentralized manner.
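As a rough sketch of that idea, the snippet below hashes an entire toy ledger and shows that an alteration, an omission, and an addition each change the digest. The transaction format and the `ledger_digest` helper are assumptions made for illustration; real blockchains use more elaborate structures than a single hash over a JSON string.

```python
import hashlib
import json


def ledger_digest(transactions: list[dict]) -> str:
    """Hash the whole ledger after serializing it deterministically."""
    encoded = json.dumps(transactions, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


ledger = [
    {"from": "alice", "to": "bob", "amount": 5},
    {"from": "bob", "to": "carol", "amount": 2},
]
original = ledger_digest(ledger)

altered = [dict(tx) for tx in ledger]
altered[0]["amount"] = 50                                            # alteration
shortened = ledger[:1]                                               # omission
extended = ledger + [{"from": "carol", "to": "alice", "amount": 1}]  # addition

for variant in (altered, shortened, extended):
    print(ledger_digest(variant) == original)   # False each time
```

Two participants who hold the same ledger compute the same digest, so comparing digests is enough to tell whether their copies agree.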

public key / private key

In cryptography, a private key is a secret number or code. A special mathematical function is then applied to this private key in order to derive a second value, a public key. This value does not have to be kept secret because the public key reveals nothing about the private key.

By analogy, a public key is the address of your house, but the private key is the physical key that unlocks your front door. Simply knowing the address of a house does not help you to unlock the front door.

Why does this matter? In blockchain networks, a public key is the address to which assets can be transferred. Knowledge of the corresponding secret private key is the only way to spend those assets, just like a PIN code is required to withdraw from a checking account. But with public/private keys, you don’t need to trust a bank: You only need to trust the underlying math of a well-proven cryptographic system, the same system that already protects trillions of dollars-worth of payments over the internet today.

digital signatures

Just like fingerprints, digital signatures are unique to a single person or entity. These signatures are mathematically derived from a special pair of numbers called a public/private key pair. A signature that verifies against a public key can only be created by the holder of the corresponding private key. Just like a real signature, a digital signature should convince the recipient that the message is authentic.
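The sign-with-private-key, verify-with-public-key relationship can be sketched with textbook RSA, which fits in a few lines of standard-library Python. This is a toy under loud assumptions: the primes are small, publicly known Mersenne primes, there is no padding, and the scheme is not secure. Blockchains such as Bitcoin use elliptic-curve signatures rather than RSA; RSA is swapped in here only because it needs no external library. All names (`sign`, `verify`, `P`, `Q`, and so on) are illustrative.

```python
import hashlib

# Toy key generation from two known Mersenne primes. Real keys use much
# larger, randomly chosen secret primes; these are for illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def _hash_to_int(message: bytes) -> int:
    """Reduce the SHA-256 digest of the message to an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Computing this requires the private exponent D."""
    return pow(_hash_to_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone can check a signature using only the public values (N, E)."""
    return pow(signature, E, N) == _hash_to_int(message)


sig = sign(b"pay bob 5 coins")
print(verify(b"pay bob 5 coins", sig))     # True
print(verify(b"pay bob 500 coins", sig))   # False
```

The second check fails because the signature is bound to the digest of the original message, which is how a signature provides both authenticity and integrity.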

GENERAL BLOCKCHAIN CONCEPTS

state

The “state” of a system is a snapshot of that system at a given point in time. For instance, “state” might refer to an individual’s checking account balance; after they spend $20, the state of the account should reflect the new, reduced balance. The state of the system is usually maintained by a trusted third party, like a bank or a company web server.

Blockchains enable decentralized networks to maintain a shared state among nodes. They allow each individual node to maintain a global state, or shared “truth”, with other network nodes without relying on a centralized party.

protocol

A protocol is a set of rules or procedures that govern a system — whether that system is a computer network, a town hall meeting, or a board game. For instance, in chess, individual players may have their own strategies — but the way in which each chess piece moves on the board is dictated by the rules (or protocol) of chess.

In networking, a protocol is a common program executed by multiple computers on the same network. These networking protocols govern the transmission and handling of information as well as execution of programs between interconnected but independent devices. For example, TCP (transmission control protocol) represents one of the foundational protocols for managing packets of information as they travel across the internet, powering applications like the world wide web, email, media streaming, and more.

In cryptonetworks, the most important protocol is the consensus protocol. This is the protocol followed by each network participant (or node) to create a single, shared state of the blockchain. In this context, consensus protocols replace a centralized record keeper or counterparty, enabling trustless, peer-to-peer interactions.

peer-to-peer (p2p) network

In a classic, centralized client-server network, data is requested by one class of computers known as “clients” (PCs or mobile phones, for example) and is “served” by another class of computers called servers. Facebook is an example of the client-server network model: Facebook profile data lives on Facebook servers, and is sent to the user when they open the app on their phone.

This hub-and-spoke model is a highly efficient but brittle system since the server is a “chokepoint” and centralized point of failure. Contrast this with a peer-to-peer network, where the connections resemble more of a “spiderweb”. In a peer-to-peer network, each node operates under a single communication protocol to transfer data between them; this model is often less efficient, but much more resilient because there is no single point of failure.

Perhaps the most famous example of a peer-to-peer network is the internet itself. The original internet, known as ARPANET, was invented by the U.S. Department of Defense as a way to ensure defense communication would never go down, even in the event of nuclear war. Disabling individual ARPANET nodes would not stop message traffic; messages are simply rerouted along different paths to the same destination. Similarly, shutting down a single node, or even multiple nodes, on a blockchain network does not stop transactions from being processed.

node

A node is a device that participates in a network by following the network protocol. Individual nodes can perform a variety of roles, such as caching data, validating information, or forwarding messages to other nodes.

Depending on the network, each node can have a unique role or multiple nodes can share a single role. This architectural design choice reflects a fundamental tradeoff between network redundancy (coverage in case one node goes down) and efficiency.

Byzantine Generals’ Problem

One of the fundamental challenges in any distributed computing system is coordination among a group of machines where any one of them could be malicious or malfunctioning.

For example, imagine a Byzantine army separated into divisions led by generals camped around an enemy city. How can these generals, communicating with each other only by messenger, agree on a plan when one or more of them may be “traitors” who will try to confuse the others? Similarly, how do participants in a decentralized network communicate and coordinate with each other towards some action without relying on a trusted third party? This is the Byzantine Generals’ Problem.

Because blockchain networks assume no trust between participants, their underlying consensus protocols must all somehow address this problem to overcome faulty or malicious adversaries who try to subvert the system.

consensus

The consensus protocol is akin to the operating system of a blockchain. But blockchain consensus algorithms are special, because they define how to resolve disputes between nodes that received conflicting data.

Think of a consensus algorithm as a digital, impartial judge that hears both sides of an argument to arrive at the “truth” of what actually occurred. This judge then determines how to proceed according to a set of predetermined laws or rules. These rules must account for three key properties:

liveness, which ensures that data can always be added and the network never gets “stuck”;

agreement, where all nodes in the network eventually agree on the same value; and

safety, which ensures that an agreed-upon value does not violate the protocol.

Research has shown it is impossible for truly distributed, permissionless networks to achieve all three of these properties. This means that blockchain designers face tradeoffs about what to prioritize. Consensus algorithms aiming for speed often limit the number of network participants, making the network less decentralized. On the other hand, protocols that prioritize decentralization and failure-prevention tend to be slower and less performant.

decentralization

Decentralization is the degree to which control — power, resource allocation, etc. — over a given network is distributed across a large, representative base of independent actors.

In most systems, there is a tradeoff between efficiency and decentralization, because coordination costs increase with the number of participants. However, decentralization also provides redundancy and fault-tolerance that a centralized system cannot match. Take the analogy of a democracy, which could be considered a “decentralized” political system. Even though the American democratic system can be inefficient and messy at times, it has proven remarkably resilient. Similarly, blockchain networks are strictly less efficient than centralized databases, yet offer the unique properties of redundancy and censorship-resistance.

double-spending problem

Previous efforts to create a decentralized digital currency failed because there is no “scarcity” in the digital world. Bits can be infinitely reproduced, just as easily as an image or line of text can be copy/pasted. In contrast, a dollar bill or bar of gold has physical scarcity because the owner no longer physically possesses it after paying for a good or service.

Prior to Bitcoin, the only known way to do digital payments was to use a centralized record-keeper (such as a bank or credit-card company) to keep track of account balances and transactions for every individual. This entity also ensures that no one spends the same balance twice, which replicates the physical scarcity of real currency.

In blockchains, there is no central record-keeper, so the double-spending problem must be solved through the rules of the network. The original and most famous solution to the problem of digital scarcity is Bitcoin, which uses a system of economic incentives to reward honest participants who correctly perform a “proof-of-work”, thereby preventing double spending.

proof-of-work

A proof-of-work demonstrates the use of a specific resource. In the physical world, the ultimate scarce resource is time, so a proof-of-work could be a simple timecard of hours spent at the office. In the digital world, it is trivial for a computer to “forge” a simple timecard. So, we need some other proof that some computational resources were expended.

In the 1800s, gold miners in California were paid by weight for the gold they physically extracted from the ground. Because gold is scarce, the amount of gold a miner brought for payment was proportional to the amount of effort they spent to get it. It was also easy to verify that the gold was real; a manager could simply weigh and inspect it in a fraction of the time it took to mine. How could we replicate this in a decentralized, digital world? One solution is to have a computer solve a puzzle with the following requirements:

Each instance of the puzzle should be unique; solving a previous puzzle doesn’t help solve a future one (like how a real miner cannot “re-extract” the same vein of gold).

The puzzle should be relatively hard to solve, but easy to verify.

This is similar to how blockchains replicate the physical scarcity of the real world. However, instead of mining gold, computers must solve a special type of math problem (or millions of them) that takes at least a certain amount of time.

Because of its similarity to the analogy above, this process is called “mining”. When submitting a block to the blockchain, miners must present the solution to this math problem, along with the transactions that they want to include in the proposed block. Invalid solutions are rejected by the other miners in the network, since a solution is hard to compute but simple to check. This mechanism encourages rational miners to only submit valid blocks, or else they will have wasted time and effort.
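A minimal sketch of such a puzzle, assuming SHA-256 and a difficulty expressed as a number of leading zero bits: the miner searches for a nonce whose hash falls below a target, and anyone can check the result with a single hash. The function names, the block encoding, and the difficulty value are illustrative assumptions, not Bitcoin's actual block format.

```python
import hashlib


def pow_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with a candidate nonce."""
    payload = block_data + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while pow_hash(block_data, nonce) >= target:
        nonce += 1
    return nonce


def verify(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a claimed solution costs a single hash."""
    return pow_hash(block_data, nonce) < (1 << (256 - difficulty_bits))


data = b"previous-digest|alice pays bob 5"
nonce = mine(data, difficulty_bits=16)   # about 65,000 attempts on average
print(verify(data, nonce, 16))            # True
```

Because the hash covers the block data, a nonce found for one block is of no use for a different block, which matches the requirement that each puzzle instance be unique. Raising `difficulty_bits` by one doubles the expected mining work while verification stays at one hash.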

blockchain

A blockchain is a tamper-proof ledger organized into a series of linked “blocks” containing data. These blocks are added according to a set of special rules (known as a consensus algorithm). This enables networks of physical computers, working together in concert, to form a single virtual computer.

Blockchains are distinct from other computer networks because they are permissionless. Any computer, anywhere can become part of this larger virtual computer as long as they follow the consensus algorithm.

In a blockchain, the blocks themselves can be thought of as the computer hard drive. The consensus algorithm is like the operating system (such as Windows or macOS). And the peer-to-peer network is like the silicon semiconductor circuits that carry data between different parts of a computer.

Unlike a traditional computer, a blockchain computer can offer strong trust guarantees, rooted in the cryptographic and game-theoretic properties of the system. For example, a user or developer can trust that a piece of code running on a blockchain computer will continue to behave as designed, even if individual computers in the network try to subvert the system. Thus, a blockchain computer enables disintermediated, peer-to-peer interactions and digital services that are owned and operated by communities instead of by corporations.

block

A block is like a folder that contains “files”. The contents of this folder are the transactions that occur over a given time interval (hashed to a cryptographic digest). Each block contains a reference linking it to the previous block — hence the term “blockchain”.

Blocks are added by the miners or validators on a cryptonetwork according to a consensus protocol; they check:

to ensure balances are not spent twice;

that each digital signature matches the public key of the message; and

that the included reference matches the hash of the previously-added block.

Because blocks are made up of cryptographic digests, they cannot be changed after the fact without detection. So blocks are effectively immutable once added to the blockchain.
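The reference check in the list above can be sketched as follows. The `Block` class, its fields, and the string transactions are simplifications assumed for illustration; signature and balance checks are left out.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    prev_hash: str        # digest of the previous block
    transactions: tuple   # simplified here to plain strings

    def digest(self) -> str:
        body = json.dumps([self.prev_hash, list(self.transactions)])
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


def chain_is_valid(chain: list[Block]) -> bool:
    """Check that each block references the digest of the block before it."""
    return all(
        block.prev_hash == previous.digest()
        for previous, block in zip(chain, chain[1:])
    )


genesis = Block(prev_hash="0" * 64, transactions=("reward -> alice 50",))
second = Block(prev_hash=genesis.digest(), transactions=("alice -> bob 5",))
third = Block(prev_hash=second.digest(), transactions=("bob -> carol 2",))
print(chain_is_valid([genesis, second, third]))   # True

# Rewriting an old block changes its digest, so the next block's
# reference no longer matches and the tampering is detected.
forged = Block(prev_hash=genesis.digest(), transactions=("alice -> bob 500",))
print(chain_is_valid([genesis, forged, third]))   # False
```

To make the forged history pass, an attacker would have to rebuild every later block as well, which is what the consensus protocol is designed to make expensive.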

miners

People have tried to build decentralized payment networks many times. They never worked, because before 2008 (when the Bitcoin whitepaper came out) there wasn’t a known solution to the double-spending problem. The innovation of blockchain networks was introducing an economic participant into the system. This entity, known as the “miner”, is assumed to be purely profit-seeking and self-interested. Yet the sum of the individual actions of all miners enables truly decentralized networks of value: blockchains.

Miners are special nodes on a blockchain network who perform two key roles:

They validate transactions according to the network protocol, and make sure that balances aren’t spent twice, or “replayed”.

They compete with one another to find a solution to a random proof-of-work puzzle, in exchange for a network reward paid in the currency unit of the ledger they maintain (e.g., Bitcoin, Ether, etc.).

Creating a competition between miners was a key breakthrough that allowed Bitcoin to succeed where previous decentralized, peer-to-peer payment networks failed. Because they are rewarded for following the protocol, miners are incentivized to provide computational resources for securing the overall system. If they fail to do so, they pay the “cost” of foregone rewards.

cryptocurrencies

Cryptocurrencies are more than just a digital form of value. For permissionless networks such as Bitcoin, they are a critical part of the game theory and incentive mechanism that keeps the network secure.

Like traditional money, cryptocurrencies can be considered as a unit of account, store of value, and medium of exchange within the system. Taking Bitcoin as an example:

The services of “miners” or validators in the network are denominated and paid in bitcoin.

For the system to remain secure, these miners must value the bitcoin they receive more than the value they would gain by exploiting the network.

Bitcoin can be natively exchanged between parties on the network in a peer-to-peer manner.

The critical innovation of cryptocurrencies versus traditional payment systems is in that last part: peer-to-peer. This means the transfer occurs without a trusted third party, just like “cash”, only digitally.

Cryptocurrencies take a variety of forms and serve a variety of roles on a blockchain network. Some are mutually interchangeable or “fungible”, while others represent a unique, non-fungible asset. Some are interest-bearing investment assets, while others are “work tokens” that grant rights to perform a specific service. These are the ultimate flexible financial assets, which unlock tremendous value and enable applications that would otherwise be impossible in traditional finance.

cryptonetwork

Cryptonetworks are a fundamentally new way to design and incentivize internet-based networks. They arise from cryptocurrency movements, but the fundamental shift between these and previous internet-based economies is the creation of open, decentralized networks and protocols. An example of a past such protocol is SMTP, which enables email; even though Microsoft owns Hotmail and Google owns Gmail, no one company owns the email-enabling protocol itself. Numerous companies can therefore build on top of it without being proprietarily blocked by Microsoft and Google. This is the defining feature of open networks.

However, a classic challenge with decentralized networks is that they are public goods. Thus, incentivizing their maintenance and development is challenging. If there is no central entity (like a Google or Microsoft) supporting it all, who will build, coordinate, manage, and maintain these networks? This is where blockchains and cryptocurrencies come in: the former enables decentralized coordination, and the latter provides incentives for development.

Technically, a cryptonetwork is a public blockchain, maintained by nodes, on a peer-to-peer network. It is distinct from a private blockchain or distributed ledger because it is permissionless: Participation in the network is open to anyone and not limited to a single or pre-defined group.

Cryptonetworks use consensus mechanisms to create an interlocking system of economic incentives to secure the network and prevent double-spending. It is these economic incentives, along with some of the fundamental cryptography and computer science concepts defined above, that create a redundant, fault-tolerant system that strongly guarantees the persistence of data and execution of programs on the distributed network.

hash rate

Proof-of-work puzzles are based on hash functions, and are at the foundation of Bitcoin’s security model. Since the “work” consists of repeated hash computations, the combined number of hashes that all miners on the network compute per second is called the hash rate. In general, a higher network hash rate corresponds to a greater level of security for a given blockchain.

The security guarantees of Bitcoin assume that no miner controls a majority of the hash rate. If one did, they could execute a 51% attack. A 51% attack is like a “hostile takeover” of a blockchain. Because they have a higher hash rate than the rest of the network combined, an attacker can effectively rewrite recent transaction history and double-spend their own prior transactions. But even in this case, they cannot spend other users’ funds, since these are protected by cryptography.
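To see why a majority of the hash rate is the tipping point, consider the simplified catch-up estimate given in the Bitcoin whitepaper: if an attacker holds a fraction q of the hash rate and honest miners hold p = 1 - q, the chance the attacker ever catches up from z blocks behind is (q/p)^z when q < p, and 1 otherwise. The function name below is an illustrative choice, and the model ignores refinements such as the whitepaper's fuller Poisson-based calculation.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that an attacker ever erases a deficit of `blocks_behind` blocks."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0   # with half the hash rate or more, success is certain
    return (q / p) ** blocks_behind


for share in (0.10, 0.30, 0.51):
    print(share, catch_up_probability(share, blocks_behind=6))
```

With six blocks to make up, a 10% attacker succeeds with a probability of roughly 0.000002 and a 30% attacker with roughly 0.006, while a 51% attacker always succeeds eventually.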

application-specific integrated circuit (ASIC)

Most integrated circuits, like the CPUs in smartphones and laptops, can do a lot of different types of computations. For example, they render webpages or process user input during a game. ASICs (application-specific integrated circuits), on the other hand, can only do one computation. Yet they are engineered to perform that computation thousands or even millions of times faster than a PC or smartphone.

In crypto, ASICs optimized to compute hash functions now dominate proof-of-work mining. This activity has become increasingly concentrated among a handful of large, specialized firms. Some members of the crypto community resent this as a source of centralization. Others argue that because ASICs can’t be repurposed for other tasks, they add “skin in the game” for miners and incentivize them to act honestly (even if they could do otherwise).

fork

In software development, a fork is a new branch of code that goes off in its own direction. Often, it also represents (in open-source software developed outside of a company) a disagreement in the community which built and maintained the original code.

In crypto, a fork is a disagreement between nodes. The disagreement can be about what code is being run, or which blocks are included in the blockchain. Such a disagreement causes the blockchain to split into two parallel chains. There are two types of forks:

A soft fork often occurs during software upgrades to the protocol. This type of “soft” fork does not result in a permanent split of the network and is more akin to a network migration or upgrade. In other words, nodes on the network remain compatible with one another.

A hard fork happens when nodes in the network fail to reach consensus. In this case, the blockchain splits into two or more branches at the last point of agreement, and new valid blocks accepted on one fork will be rejected by the other.

Users with balances on the original blockchain prior to a hard fork will have the exact same balance on both “branches” afterward. Over time, the relative value of each fork determines who was “right” in the original argument. The market price of the native cryptocurrency of each fork is an economic “vote” on its respective utility. Value should flow to the branch users prefer.

genesis block

The genesis block is the first block created on a blockchain. For Bitcoin, the genesis block was mined on January 3, 2009 by its pseudonymous creator, Satoshi Nakamoto. Fun fact: the Bitcoin genesis block contains the phrase “The Times 03/Jan/2009 Chancellor on brink of second bailout for banks”, suggesting that Nakamoto was motivated by the global financial crisis.

The genesis block parameters set the rules for a given blockchain network going forward. Even if the network later forks, the genesis block is still included in the history for all branches.

SMART CONTRACTS TERMINOLOGY

smart contract

A smart contract is a persistent computer program that runs on a blockchain network. Like legal contracts, smart contracts are agreements between two or more parties written in code that executes autonomously. Smart contracts are different from regular computer programs because the execution of the program is guaranteed, no matter who initializes it. Furthermore, these contracts persist (perhaps indefinitely) because data is effectively permanent once stored on a blockchain.

Second-generation blockchains (such as Ethereum) were designed for executing smart contracts. This was a step beyond the simple financial transactions enabled by Bitcoin, enabling a general-purpose platform for decentralized computing. A blockchain network that allows for general smart contracts can therefore be thought of as a “world computer”.

Turing complete

This is a property of modern computer systems that enables universality, meaning that any program that could be conceived can be run on that machine. It is named for the famous British computer scientist, Alan Turing, whose work in breaking the German Enigma encryption system during World War II laid the foundation for modern computers.

Most modern programming languages are Turing complete. Ethereum is an example of a Turing-complete blockchain, along with most other smart contract protocols.

gas

“Gas” is the fee paid to the miners of a blockchain to execute the code of a smart contract.

Imagine paying per character for a social media post. You are incentivized to keep it short and sweet! Likewise, a smart contract that has more functionality (and therefore executes more code) will generally cost more in fees than a shorter, simpler contract.
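The pay-per-unit idea can be sketched as a fee calculation: each operation a contract executes consumes some gas, and the fee is the gas used multiplied by a price per unit of gas. The operation names and costs in the table below are hypothetical values chosen for illustration, not any network's actual fee schedule.

```python
# Hypothetical per-operation gas costs, for illustration only.
GAS_COST = {"ADD": 3, "MUL": 5, "STORE": 20_000}


def gas_used(operations: list[str]) -> int:
    """Total gas is the sum of the cost of every operation executed."""
    return sum(GAS_COST[op] for op in operations)


def fee(operations: list[str], gas_price: int) -> int:
    """Fee = gas used x price per unit of gas (smallest currency unit)."""
    return gas_used(operations) * gas_price


simple = ["ADD", "MUL"]
complex_call = ["ADD", "MUL", "STORE", "STORE"]
print(fee(simple, gas_price=2))         # 16
print(fee(complex_call, gas_price=2))   # 80016
```

In this sketch, writing to storage dominates the cost, so what a contract does matters more than how long its code is.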

composability

Just as LEGO-like blocks can be combined in any number of ways to build something new, composability enables the various components of a system to be mixed and matched to create novel systems and applications.

Blockchains are exponentially composable, since they are both permissionless and permanent. Each additional smart contract or application added to the network is open and accessible to developers looking to build upon and extend its functionality. By enabling and incentivizing a true open-source ecosystem, the possible applications are limited only by our imagination.

interoperability

Interoperability is about systems talking to each other — whether devices, networks, or applications. It is a way of enabling compatibility between systems.

For instance, if a user wants to directly transfer assets/value across different blockchains, e.g., from Bitcoin to Ethereum, interoperability protocols create the “bridge” to enable this exchange.

tokens

Tokens are a digital representation of an asset. This could be either a native digital asset (like a digital baseball card) or represent a credit for some type of “work” or service (like gigabytes of files stored). Tokens are not cryptocurrencies themselves, but rather are issued from smart contracts built on top of other cryptonetworks.

The two most common token types on the Ethereum cryptonetwork are ERC-20 and ERC-721: ERC-20 is the standard implementation for fungible smart-contract tokens, while ERC-721 is the standard for non-fungible tokens. Both ERC-20 and ERC-721 tokens can be used in different ways, or even combined within a single smart contract, to extend the functionality and flexibility of the blockchain economy as a whole.
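The difference between the two standards comes down to what the contract keeps track of: a fungible token records an amount per account, while a non-fungible token records an owner per unique token id. The Python classes below are a loose sketch of that bookkeeping. They are not the ERC-20 or ERC-721 interfaces themselves, which are defined for smart contracts and include further functions such as approvals and events.

```python
class FungibleToken:
    """ERC-20-style bookkeeping: an interchangeable amount per account."""

    def __init__(self, initial_balances: dict[str, int]):
        self.balances = dict(initial_balances)

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class NonFungibleToken:
    """ERC-721-style bookkeeping: exactly one owner per unique token id."""

    def __init__(self, owners: dict[int, str]):
        self.owners = dict(owners)

    def transfer(self, sender: str, recipient: str, token_id: int) -> None:
        if self.owners.get(token_id) != sender:
            raise ValueError("sender does not own this token")
        self.owners[token_id] = recipient


coins = FungibleToken({"alice": 10})
coins.transfer("alice", "bob", 4)
print(coins.balances)   # {'alice': 6, 'bob': 4}

cards = NonFungibleToken({1: "alice", 2: "alice"})
cards.transfer("alice", "bob", 2)
print(cards.owners)     # {1: 'alice', 2: 'bob'}
```

Any 4 of Alice's coins would have done for the first transfer, whereas the second transfer moves one specific item, token 2, and no other.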

non-fungible tokens (NFTs)

“Fungibility” means that units of a currency or commodity are alike and indistinguishable. Examples of fungible currencies are $1 bills, each of which is alike and represents the same value.

On the other hand, non-fungible tokens represent unique assets whose values are independent of one another. For example, an NFT might represent a piece of unique digital artwork, a Mickey Mantle baseball card, or a share of physical North Carolina real estate. Despite this difference, NFTs can be exchanged in the same manner as any other token on a cryptonetwork.

The ability to represent unique assets greatly enhances the composability and functionality of cryptonetworks, since many real-world assets are non-fungible. In turn, this enables blockchains to support more flexible economies.

proof-of-stake

In proof-of-work consensus systems, miners expend energy to solve a puzzle in return for a reward. In a proof-of-stake system, “validators” post a bond or “stake” to a smart contract, earning rewards or “interest” for properly validating the state of the blockchain.

By requiring validators to deposit tokens to participate, proof-of-stake systems not only align incentives for validating transactions correctly (as with proof-of-work), but go a step further by punishing bad behavior. If a dishonest validator violates the protocol, their deposit is “slashed” or confiscated and distributed to the remaining honest validators on the network.
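A rough sketch of these incentives, under simplifying assumptions: rewards are split in proportion to stake, and a slashed deposit is divided equally among the remaining validators. The function names, the equal split, and the handling of any remainder are illustrative choices; real protocols differ in how much is slashed and where slashed funds go.

```python
def distribute_rewards(stakes: dict[str, int], reward: int) -> dict[str, float]:
    """Split a reward among validators in proportion to their stake."""
    total = sum(stakes.values())
    return {name: reward * stake / total for name, stake in stakes.items()}


def slash(stakes: dict[str, int], offender: str) -> dict[str, int]:
    """Confiscate the offender's stake and share it among the others."""
    penalty = stakes[offender]
    honest = {name: s for name, s in stakes.items() if name != offender}
    if not honest:
        return {}
    share, remainder = divmod(penalty, len(honest))
    updated = {name: s + share for name, s in honest.items()}
    # Any indivisible remainder goes to the first remaining validator.
    first = next(iter(updated))
    updated[first] += remainder
    return updated


stakes = {"ann": 100, "ben": 300, "cat": 600}
print(distribute_rewards(stakes, reward=10))   # {'ann': 1.0, 'ben': 3.0, 'cat': 6.0}
print(slash(stakes, offender="cat"))           # {'ann': 400, 'ben': 600}
```

The validator with the most at stake earns the most for honest work and also has the most to lose by cheating, which is the alignment the paragraph above describes.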

One advantage of proof-of-stake over proof-of-work is that it does not “waste” energy. However, proof-of-stake consensus protocols are often more complex, and have their own unique vulnerabilities. A particularly hard problem is preventing “deep” reorganizations of the blockchain to double-spend prior transactions. In proof-of-work, it is infeasible to present a false “history” of transactions because of the computation that went into producing the chain. But in proof-of-stake, malicious validators can easily “simulate” a blockchain that appears valid but in fact is not. The most advanced proof-of-stake networks address this by separating the chain into “epochs” as well as encouraging honest behavior through the slashing mechanism described above.

delegated proof-of-stake (DPoS)

DPoS is a type of consensus that limits the number of validators who can add blocks to the blockchain. These validators are selected through some type of network governance mechanism, for example by a token-weighted vote per user account. Because it is not truly permissionless, this type of consensus is more centralized than proof-of-work (e.g. Bitcoin). Even though DPoS networks can process more transactions than proof-of-work cryptonetworks, this centralization makes them less versatile and more prone to bribery or censorship.

validators

Validators are the miners of a proof-of-stake network. Like miners, the validators’ role is to collect transactions into blocks to add to the blockchain. For adding valid blocks, validators are rewarded in proportion to the amount of currency they post (“stake”) as collateral.

scalability

Cryptonetworks represent a major innovation in terms of decentralization and security. However, early cryptonetworks were highly inefficient (from a user standpoint) compared to modern payment networks. For example, the Bitcoin network processes about 5 transactions per second, while Visa can handle up to 50,000 transactions per second. This disparity has led to efforts for “scaling up” transaction throughput, hence the term “scalability”.

sharding

Sharding is a classic technique in distributed systems that reduces the load on the nodes participating in a network by eliminating the requirement that each node process every transaction. With sharding, each node instead processes only a subset of all transactions. This enables a much greater network throughput, though at the cost of some redundancy.
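One common way to split the load, sketched below, is to assign each transaction to a shard by hashing a key (here, the sender's account) and taking the result modulo the number of shards. The transaction format, the choice of key, and the function names are assumptions for illustration; the sketch also ignores the harder problem of transactions that touch accounts on different shards.

```python
import hashlib


def shard_for(account: str, num_shards: int) -> int:
    """Map an account to a shard via a hash, so the assignment is stable."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def partition(transactions: list[dict], num_shards: int) -> dict[int, list[dict]]:
    """Group transactions by the shard of their sender."""
    shards: dict[int, list[dict]] = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_for(tx["from"], num_shards)].append(tx)
    return shards


txs = [{"from": f"user{i}", "to": "shop", "amount": i} for i in range(1000)]
shards = partition(txs, num_shards=4)
print({shard: len(batch) for shard, batch in shards.items()})
```

Each node assigned to a shard now processes roughly a quarter of the transactions instead of all of them, at the price of fewer nodes checking any given transaction.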

Layer 1 / Layer 2

One way to think about blockchains is to imagine them as skyscrapers. Structurally, a skyscraper can be divided into two layers: a foundation and a superstructure. Of course, the superstructure (where we live and work) can only be as tall as the foundation is strong.

In computer science, infrastructure and applications are often built using a similarly layered approach. This is at the heart of many blockchain scalability proposals.
