# Article

## Cheetah: A STARK-friendly elliptic curve for fast native and in-circuit computations

For Topos’s trustless and privacy-preserving cross-subnet communication, Toposware requires subnets to prove the validity of their internal state transitions to the rest of the ecosystem. For this purpose, and to maintain privacy of the subnets, we rely on a FRI-based zk-STARK proving system to enforce correct state transitions.

General-purpose state transition systems (the so-called zk-VMs / zk-EVMs projects emerging in the blockchain ecosystem) usually require, among other checks, to verify a digital signature (Schnorr, ECDSA, ...) ensuring the valid origin of funds. This requirement originally oriented us to base our construction on the curve over the prime field $\mathbb{F}_p$ with $p = 2^{251} + 17\times2^{192} + 1$ designed by STARKWARE. $^{(1)}$ However, the Topos state-transition STARK statement design was missing a crucial, fundamental difference between regular zk-SNARKs and the FRI-based zk-STARK we were built on. The latter construction does not rely on any algebraic group consideration. This property specifically allows one to build a proving system over a particularly small$^{(2)}$ and performant field, unlike existing SNARKs. Those usually rely on the hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP) on transparent or pairing-friendly curves, and hence inherently work over much larger fields.

Only the Schnorr signature portion of our state-transition program was relying on this notion of an algebraic group, which led to using this large prime field $\mathbb{F}_p$. However, it had the heavy cost of imposing every single value handled (even bits!) to be internally represented as a large 252-bit integer, and to rely on the associated complex modular arithmetic, which was an unacceptable burden for future development. To put this into perspective, internal representation of $\mathbb{F}_p$ elements was consisting of four u64 limbs in little-endian order, representing a given element in Montgomery form. A single Schoolbook multiplication between two elements was then requiring 16 u128 multiplications and 32 u128 additions before Montgomery reduction, which was also requiring 16 u128 multiplications, 56 u128 additions$^{(3)}$ and 4 u64 multiplications. Multiplying two arbitrary field elements inside our AIR program (even binary values!) was involving that many arithmetic operations, a prohibitive overhead!

We then came up with what was going to bring us nice speed-ups both natively, outside our STARK AIR program, as well as inside our prover computations, by changing the algebraic group used in Schnorr to work over an extension of a small prime field.

The result of this work is a new curve, named Cheetah, whose construction approach and design are discussed below. A more thorough analysis of the construction is publicly available at https://eprint.iacr.org/2022/277.

There were two major considerations to evaluate before moving on to the next steps:

• a prime field size, which had to be small enough for efficient field arithmetic and STARK computations;
• an extension degree, large enough to prevent attacks targeting field-extension based curves, and small enough to have efficient group operations.

For the first point, a natural size choice was around $64$ bits as we can easily find primes $p$ of this size with large 2-adicity for efficient FFTs$^{(4)}$, and still inherit fast native arithmetic, unlike operations over much larger prime fields. While we were originally going for a 63-bit prime with extremely large 2-adicity (55 to be precise!), we decided in the end to go for the so-called Goldilocks prime $p = 2^{64} - 2^{32} +1$ used in various projects, for its size and efficient modular reduction. In addition, the prime being greater than $2^{63}$, this allows compatibility with the Cairo framework. The reduction of 2-adicity on the other hand was not considered critical, as execution traces in practice would have less than $2^{32}$ steps for most applications.

As for determining the optimal extension degree $k$ over which we would build our new elliptic curve, the decision was not so straightforward. We already knew that 4 would be a lower bound on our extension degree, as that meant having a group size of $\approx 256$ bits, similar to regular curve constructions, yielding a Pollard-Rho security level of around $128$ bits (give or take some bits). However, in the context of field-extension based curves, other attacks leveraging the specific structure of the field construction to require fewer operations than Pollard-Rho have to be considered and analyzed. Quintic extensions $(\mathbb{F}_{p^5}$) had received limited attention, which was one of our concerns for security confidence and adoption. We then decided to go for a degree 6 extension, yielding a base field $\mathbb{F}_{p^6}$ of a size equivalent to widely-used pairing-friendly constructions (like BLS12-381), and having received more consequent scrutiny from the academic world.

Having these two criteria in mind, we could start to extensively look for curves resistant to sextic-extension based attacks, in addition to the regular elliptic curve attacks on ECDLP (Pollard-Rho, MOV attack, ...) until we found a curve over $\mathbb{F}_{p^6}$ with $p = 2^{64} - 2^{32} + 1$, $\mathbb{F}_{p^6}$ being defined as $\mathbb{F}_p[X] / (X^6 - 7)$. The curve equation is $E: y^2 = x^3 + x + (u + 395)$, with $u^6 = 7$. It contains a 255-bit subgroup of prime order

$q$ = 55610362957290864006699123731285679659474893560816383126640993521607086746831. We named this curve Cheetah 😼 and implemented it here. The search algorithm, to reproduce the search or generate new curves on sextic extensions is available here.

The transition to the new Cheetah curve yielded an appreciable improvement on the Schnorr signing and verification algorithms, mostly due to the performance difference$^{(5)}$ of the new instantiation of the Rescue-Prime hash over the small new field $\mathbb{F}_p$.

In addition, AIR programs are now running much faster, thanks to the fast new STARK field arithmetic! In particular, the verifier is now running under 10ms even for complex programs, a crucial milestone for the scalability of the Topos protocol.

The Cheetah upgrade granted us a very welcome speed-up both for native and in-circuit computations. But more importantly, removing the performance bottleneck of the underlying STARK field allows us to leverage the full power of our FRI-based zk-STARK, for current and future applications on the Topos ecosystem. And even more optimizations are going to be integrated into the Cheetah implementation pretty soon, stay tuned!

$^{(1)}$ For STARKs, we require the underlying field to have a large 2-adicity (i.e. $p = 2^nt$) such that we have a primitive $n$-th root of unity for efficient multi-point evaluation/interpolation over the domain generated by this root of unity through FFT algorithms. We usually aim to have $n \geq 32$.

$^{(2)}$ With a cryptographer's eye, a small prime here means $p \ll 2^{256}$.

$^{(3)}$ We make no distinction between addition and subtraction. A more accurate description would be 52 u128 additions and 4 u128 wrapping subtractions.

$^{(4)}$ Fast-Fourier Transforms can actually be evaluated over any field, even without a root of unity of order $2^n$ for some $n$, following the approach detailed in this paper, at the cost of an extra logarithmic overhead, which has been judged impractical in our use-case, and hence is not approached here.

$^{(5)}$ Recall that, while being extremely efficient inside a SNARK circuit / STARK AIR program, algebraic hash functions are less efficient natively than their usual binary counterparts like SHA3 or Blake3, hence Rescue-Prime is taking a non-negligible portion of the signature algorithms running times.