Brief introduction to the Topos Reliable Broadcast protocol


In this post, we will give a brief introduction to reliable broadcast and how we use it in Topos.

The first thing to mention about reliable broadcast is that it can be implemented in the asynchronous model, meaning that the correctness of the algorithm is independent of any timing assumptions. Under asynchrony, given that the adversary is bounded by a threshold (typically, in a system of n processes, we assume an adversary who can only control f processes such that 3f ≤ n-1), the protocol is not vulnerable to the adversary delaying messages.

What is reliable broadcast?

The properties of a reliable broadcast are the following:

Validity: If a correct process p broadcasts a message m, then every correct process eventually delivers m.

Consistency: If a correct process p delivers a message m and another correct process q delivers a message m’, then m = m’.

Totality: If a correct process delivers a message, then every correct process eventually delivers a message.

The combination of the Consistency and Totality properties makes up for the agreement property.

Given these properties, let’s build a reliable broadcast!We will show a simple and efficient reliable broadcast which tolerates a 0 ≤ f < n/3 fraction of Byzantine processes in a system of a total of n processes.

The broadcast primitive is composed of steps during which processes wait until they received enough messages before sending certain messages themselves:

Step 1 (Broadcast): a process broadcasts a message m to the rest of the system.

Step 2 (Echo): upon reception of the message m for the first time, a process echoes the message to all the processes with an (Echo,m) message.

Step 3 (Ready): upon reception of more than (n+f)/2 (Echo,m) messages or f+1 (Ready, m) messages, a correct process sends a (Ready,m) message to everyone.

Step 4 (Delivery): upon receiving 2f+1 (Ready,m) messages, a correct process delivers the message m.

Performance wise, this primitive has quadratic message complexity so it does not scale well.

But more on this later!

Now that we have our reliable broadcast, let’s prove its correctness:

Proof (Validity): If the sender of message m is correct, then all correct processes will send an (Echo,m) message. All correct processes receive at least n-f (Echo,m) messages and since n-f > (n+f)/2, it triggers the emission of n-f (Ready,m) messages, which in turn will trigger delivery of the message m by those processes.

Proof (Consistency): For a correct process p to deliver message m, it needs to receive 2f+1 (Ready,m) messages. A set Q_1 of cardinality |Q_1| = 2f+1 is a Byzantine quorum and two such quorums intersect in at least one correct process because from |Q_1| = |Q_2| = 2f+1, it follows that |Q_1 ∩ Q_2| ≥ f+1. Now consider another correct process q which similarly received 2f+1 (Ready,m’) messages for message m’. Since the quorums of p and q intersect in a correct process, this correct process sent the same Ready message to both p and q, hence m = m’.

Proof (Totality): Again, to deliver a message m, a correct process p must have received 2f+1 (Ready,m) messages. This means that at least f+1 of such Ready messages came from correct processes meaning that these processes have already received more than (n+f)/2 (Echo,m) messages or f+1 (Ready, m) messages (as per Step 3 above). By the Consistency property, these processes did not send Ready messages for a different message m’ so it is not possible that message m’ could have (n+f)/2 (Echo,m’) messages nor f+1 (Ready,m’) messages.

We have shown that this simple reliable broadcast is correct!

Can we do better?

As we have seen throughout this post, the reliable broadcast primitive has quadratic message complexity. Thus, if the number of processes increases, the number of messages explodes and processes become overwhelmed. A process can only handle so much, so these algorithms are not scalable. Direct communication between participants incur very large communication cost. A natural way to reduce communication cost between processes is to have hierarchical communication, such that each process only communicates with a relatively small number of other processes (compared to the size of the system). The per-process load is reduced, at the cost of communication latency. For instance, if the processes are organized in a tree structure, the communication latency is logarithmic in the size of the system.

In Topos, the Transmission Control Engine (TCE) uses gossip to propagate messages, hence processes are organized in a tree to minimize communication overhead. Furthermore, the Topos Reliable Broadcast trades deterministic guarantees for probabilistic ones, while preserving good reliability guarantees, as they allow to further reduce the overhead which makes it particularly well suited for large systems. By making the protocol probabilistic, we need to revise the properties of the reliable broadcast.

We need weaker properties to account for this transition from a deterministic to a probabilistic algorithm. Given a failure probability ε (e.g., ε < 10e-12), our three properties become:

Validity: If a correct process p broadcasts a message m, then every correct process eventually delivers m with probability at least (1 − ε).

Consistency: If a correct process delivers a message m and another correct process delivers a message m’, then m = m’ with probability at least (1 − ε).

Totality: If a correct process delivers a message, then every correct process eventually delivers a message with probability at least (1 − ε).

Probabilistic broadcast has weaker guarantees than its deterministic counterparts but the failure probability ε can be made arbitrarily small.

Furthermore, it allows the TCE to scale very easily as the per-process communication is only logarithmic and the total communication complexity is quasilinear in the size of the system!

In a future post, we will dive in the core details of the Topos Reliable Broadcast.

We highly recommend checking out Introduction to Reliable and Secure Distributed Programming!


Latest Articles

Cheetah: A STARK-friendly elliptic curve for fast native and in-circuit computations
Cheetah: A STARK-friendly elliptic curve for fast native and in-circuit computations

March 16th, 2022

For Topos’s trustless and privacy-preserving cross-subnet communication, Toposware requires subnets to prove the validity of...

ICE-FROST: Identifiable Cheating Entity FROST Signature Protocol
ICE-FROST: Identifiable Cheating Entity FROST Signature Protocol

December 21st, 2021

Cryptography is at the heart of any blockchain project. A good deal of their useful properties (immutability, security, public verifiability...)…

Topos: A consensusless, trustless, privacy-enhancing interoperability protocol
Topos: A consensusless, trustless, privacy-enhancing interoperability protocol

December 21st, 2021

We're extremely excited to announce the release of a bunch of innovations we've been working on for the past two years!

© Toposware,
2022 All Rights Reserved

Privacy Policy