In this post, we will give a brief introduction to reliable broadcast and how we use it in Topos.
The first thing to mention about reliable broadcast is that it can be implemented in the asynchronous model, meaning that the correctness of the algorithm is independent of any timing assumptions. Under asynchrony, given that the adversary is bounded by a threshold (typically, in a system of processes, we assume an adversary who can only control processes such that ), the protocol is not vulnerable to the adversary delaying messages.
What is reliable broadcast?
The properties of a reliable broadcast are the following:
- Validity: If a correct process broadcasts a message , then every correct process eventually delivers .
- Consistency: If a correct process delivers a message and another correct process delivers a message , then .
- Totality: If a correct process delivers a message, then every correct process eventually delivers a message.
The combination of the Consistency and Totality properties makes up for the agreement property.
Given these properties, let’s build a reliable broadcast!
We will show a simple and efficient reliable broadcast which tolerates a fraction of Byzantine processes in a system of a total of processes.
The broadcast primitive is composed of steps during which processes wait until they received enough messages before sending certain messages themselves:
- Step 1 (Broadcast): a process broadcasts a message to the rest of the system.
- Step 2 (Echo): upon reception of the message for the first time, a process echoes the message to all the processes with an message.
- Step 3 (Ready): upon reception of more than messages or messages, a correct process sends a message to everyone.
- Step 4 (Delivery): upon receiving messages, a correct process delivers the message .
Performance wise, this primitive has quadratic message complexity so it does not scale well.
But more on this later!
Now that we have our reliable broadcast, let’s prove its correctness:
Proof (Validity): If the sender of message is correct, then all correct processes will send an message. All correct processes receive at least messages and since , it triggers the emission of messages, which in turn will trigger delivery of the message by those processes.
Proof (Consistency): For a correct process to deliver message , it needs to receive messages. A set of cardinality is a Byzantine quorum and two such quorums intersect in at least one correct process because from , it follows that . Now consider another correct process which similarly received messages for message . Since the quorums of and intersect in a correct process, this correct process sent the same message to both and , hence .
Proof (Totality): Again, to deliver a message , a correct process must have received messages. This means that at least of such messages came from correct processes meaning that these processes have already received more than messages or messages (as per Step 3 above). By the Consistency property, these processes did not send messages for a different message so it is not possible that message could have messages nor messages.
We have shown that this simple reliable broadcast is correct!
Can we do better?
As we have seen throughout this post, the reliable broadcast primitive has quadratic message complexity. Thus, if the number of processes increases, the number of messages explodes and processes become overwhelmed. A process can only handle so much, so these algorithms are not scalable. Direct communication between participants incur very large communication cost. A natural way to reduce communication cost between processes is to have hierarchical communication, such that each process only communicates with a relatively small number of other processes (compared to the size of the system). The per-process load is reduced, at the cost of communication latency. For instance, if the processes are organized in a tree structure, the communication latency is logarithmic in the size of the system.
In Topos, the Transmission Control Engine (TCE) uses gossip to propagate messages, hence processes are organized in a tree to minimize communication overhead. Furthermore, the Topos Reliable Broadcast trades deterministic guarantees for probabilistic ones, while preserving good reliability guarantees, as they allow to further reduce the overhead which makes it particularly well suited for large systems. By making the protocol probabilistic, we need to revise the properties of the reliable broadcast.
We need weaker properties to account for this transition from a deterministic to a probabilistic algorithm. Given a failure probability (e.g., ), our three properties become:
- Validity: If a correct process broadcasts a message , then every correct process eventually delivers with probability at least .
- Consistency: If a correct process delivers a message and another correct process delivers a message , then with probability at least .
- Totality: If a correct process delivers a message, then every correct process eventually delivers a message with probability at least .
Probabilistic broadcast has weaker guarantees than its deterministic counterparts but the failure probability can be made arbitrarily small.
Furthermore, it allows the TCE to scale very easily as the per-process communication is only logarithmic and the total communication complexity is quasilinear in the size of the system!
In a future post, we will dive in the core details of the Topos Reliable Broadcast.
We highly recommend checking out Introduction to Reliable and Secure Distributed Programming!
In this post, we will give a brief introduction to reliable broadcast and how we use it in Topos. The first thing to mention about reliable broadcast is that it can be…
Cryptography is at the heart of any blockchain project. A good deal of their useful properties (immutability, security, public verifiability...) comes from cryptographic…
We're extremely excited to announce the release of a bunch of innovations we've been working on for the past two years! The problem More than a decade after Satoshi's…