How does Nym improve on traditional mixnet designs?

Nym and the history of mixnets

February 24, 2020, 15 mins read

The goal of anonymous communication networks is to make any packet of data sent across the internet indistinguishable from other packets. Realising this goal requires two components to defend against an adversary that can view the entire network:

All packets must be unlinkable in terms of their bits, as well as size. For Nym, this is accomplished by the Sphinx packet format described in the earlier technical introduction.
The packets must also be unlinkable in terms of when they were sent. This is where the concept of “mixing” packets steps in.

There’s lots of interest — but also confusion — about what a “mix network” actually is. A mix network actually mixes packets to destroy information about the time they were sent, which can be used by attackers to determine who sent the packets and even what data they contain.

A lot of projects that build anonymous overlay networks on top of the internet, such as Loki, are well-intentioned but claim to be mixnets when they are not because they do not “mix” packets in any meaningful way that prevents attacks. To make matters more confusing, projects that claim to be “mix networks” like HOPR are using the Sphinx packet format, but HOPR does not currently mix packets, so a powerful adversary could easily de-anonymise packets based on timing even if the bits are unlinkable. Moreover, HOPR’s implementation of Sphinx has incorrect padding, and is at the moment vulnerable to the attack by Kuhn outlined in our previous article. The Lightning Network also uses Sphinx, but in a way that breaks its privacy properties.

There’s a lot of broken and incorrect usage of the term “mixnet,” so it’s time to straighten it out. Recently, Nym’s CTO published a non-technical introduction to mix networks, which is a great starting point if you want to understand what this technology is about. In this article, we will dive a bit deeper into the technical functionality of mix networks and how the Nym mixnet differs from traditional mix network designs.

A brief introduction to Chaumian mix networks

In 1981, David Chaum pioneered the breakthrough idea of a mix network [1], a decentralised network of relays that ensures unlinkability of internet communications by hiding their distinctive characteristics, the so-called metadata. Network metadata contains a vast amount of sensitive information, which can be used to paint a detailed profile of a user’s online activities and associations. The leakage of this metadata undermines the confidentiality provided by even sophisticated encryption techniques, including zero-knowledge proof technologies. Mix networks protect the metadata even against sophisticated traffic analysis techniques.

“Metadata absolutely tells you everything about somebody’s life. If you have enough metadata you don’t really need content.” — Former NSA General Counsel Stewart Baker

Traditional mix network design

In the traditional mix network, the nodes are ordered in a fixed cascade, and each packet is routed via each node from the first to the last one. By using multiple mixes we distribute the trust among them, hence as long as a single mix in the cascade is honest the anonymity is preserved.

Each packet relayed via a mix network is layer-encrypted (or in other words onion-encrypted) using public-key cryptography. Upon receiving each packet, a mix strips a single layer of encryption. This ensures that whoever is observing the network cannot trace packets by looking at their binary representation.

Despite changing the binary patterns, the eavesdropper could still correlate the timing of the encrypted packets on different links if the mix forwards them following the first-in, first-out order. In fact, this is the exact kind of attack that Tor is vulnerable to. Therefore, in order to prevent that, each mix collects a certain number of packets, decrypts and simultaneously reorders (mixes!) the packets following a secret permutation. It’s a bit like shuffling a deck of cards, so the term “shuffling” is sometimes used instead of “mixing.”

Once the packets are shuffled, the mix sends them to the next node in the cascade. Packets that were not accepted into a batch are sent in subsequent rounds. This is the so-called batch-and-reorder mixing technique.
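The batch-and-reorder idea can be sketched in a few lines of Python. This is a toy illustration only, not code from any real mixnet: the class name `ThresholdMix`, the `threshold` parameter, and the string packets are assumptions made for the example, and the per-hop decryption step is left out.

```python
import random


class ThresholdMix:
    """Toy batch-and-reorder mix: collect `threshold` packets, shuffle, flush."""

    def __init__(self, threshold, rng=None):
        self.threshold = threshold
        # Assumption: the permutation must be unpredictable to an observer,
        # so this sketch defaults to the operating system's entropy source.
        self.rng = rng or random.SystemRandom()
        self.pool = []

    def receive(self, packet):
        """Buffer a packet; return the shuffled batch once the threshold is hit."""
        self.pool.append(packet)
        if len(self.pool) < self.threshold:
            return []  # still collecting, nothing leaves the mix yet
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)  # the secret permutation
        return batch


mix = ThresholdMix(threshold=4)
outputs = [mix.receive(p) for p in ["a", "b", "c", "d"]]
flushed = outputs[-1]  # the four packets, in an order the observer cannot predict
```

Note that nothing leaves the mix until the fourth packet arrives, which is exactly why the latency of this technique depends on how quickly a batch fills up.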

Limitations of traditional mix network designs

The original mix network design was studied by the research community for many years. And as it turns out, it has several limitations:

The fixed-cascade topology scales poorly.
Onion encryption requires time-consuming public-key operations by each client and mix node.
The size of the anonymity set is limited to the size of a batch.
The end-to-end latency of packets is unbounded since you don’t know in which batch your packet will be accepted.
The traditional mixes are susceptible to various sophisticated attacks like traffic confirmation attacks or active attacks [3,4].

Due to those limitations, Nym does not adopt traditional mix network designs but instead builds on the modern low-latency anonymous communication system called Loopix [2]. So how does our design tackle the above problems of traditional mix networks? And how does it compare to other designs based on Chaumian mix networks like cMix, the design used by Elixxir, or more traditional peer-to-peer designs like HOPR that claim to be mixnets?

Nym Mixnet

Network topology

Anonymity loves company, hence scalability is one of the key properties of a mix network. The fixed-cascade topology is not a friend of scalability. Once the cascade reaches its maximum capacity you cannot support more traffic, and this often introduces reliability issues.

One solution which significantly improves scalability is to introduce multiple cascades which work in parallel, which is precisely what Elixxir does. However, this approach has one limitation — the traffic is partitioned among disjoint cascades, hence you don’t gain any advantage in terms of anonymity from large volumes of traffic.

On the other hand, HOPR establishes a peer-to-peer connection between the nodes, hence it scales very well. However, a peer-to-peer topology offers weaker anonymity than cascades. In cascades the packets meeting in the same mix are always mixed, because the node is in the same path hop for all packets. In peer-to-peer networks, packets may pass by the same node and not be mixed if the node is at a different hop in the packets’ routes [5].

Therefore, the Nym mixnet deploys a stratified topology, which as research has shown, is optimal for anonymity and scalability [5]. In the stratified topology mix nodes are arranged into layers. Each node in layer i is connected with each node in the previous and next layer.

Stratified topology

The packets are source-routed, meaning that the sender selects the entire path to the destination. Each packet is sent via an independent path, composed by picking a single node from each layer at random. Although the packets are sent via independent routes, the stratified topology ensures that they interleave with each other at some point in the network, hence the entire traffic is eventually mixed together.
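Route selection in a stratified topology can be sketched as follows. The three-layer layout and the node names are hypothetical, and picking uniformly within a layer is an assumption of this sketch (the text only says "at random").

```python
import random

# Hypothetical three-layer topology; node names are placeholders.
LAYERS = [
    ["mix-1a", "mix-1b", "mix-1c"],
    ["mix-2a", "mix-2b", "mix-2c"],
    ["mix-3a", "mix-3b", "mix-3c"],
]


def pick_route(layers, rng=random):
    """Source routing: the sender picks one node per layer, independently."""
    return [rng.choice(layer) for layer in layers]


# Every packet gets its own independently chosen route.
routes = [pick_route(LAYERS) for _ in range(5)]
for route in routes:
    print(" -> ".join(route))
```

Because every route crosses every layer, two packets with different routes can still meet at a node that sits at the same hop position for both of them.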

Most importantly, aggregating mixes into layers ensures a sparse topology that concentrates traffic on a few links and allows the Nym mixnet to scale horizontally, meaning that the overall capacity of the network can be increased by incrementally adding more servers.

Hence, the topology of Nym mixnet offers both strong anonymity and scalability.

Onion encryption

The time-consuming public-key operations required by traditional mix network designs impose a high latency overhead, which is impractical for most applications. One possible solution is the precomputation phase used by Elixxir’s cMix design. This idea avoids computationally intensive cryptographic operations during real-time communication. However, the time needed to perform the precomputation phase grows linearly with the size of the anonymity set (see the cMix paper [6]), and it has to be repeated before each real-time phase.

To overcome the problem of time-consuming public-key operations, Nym instead uses the Sphinx packet format (yes, that’s the same one used in the Lightning Network!) to onion-encrypt the packets. I wrote a detailed explanation of how Sphinx works and what security properties it has earlier. In a nutshell, Sphinx is a provably secure cryptographic packet format. It is also the most compact, with just a few hundred bytes of overhead, and the most computationally efficient packet format proposed so far. The benchmarks of our own Rust Sphinx implementation show that processing a single packet takes on average 0.157 milliseconds, while creating a single packet takes on average 0.386 milliseconds! And this is still without any optimisation tricks! Moreover, our implementation is resistant to a vulnerability discovered recently by Kuhn et al. (see here for more details).

Sphinx is also used by HOPR. However, in contrast to the Nym implementation, HOPR’s Sphinx implementation is vulnerable to the attack shown by Kuhn et al., which leaks information about the length of the path the packet is traversing.

Mixing

But the real game-changer is the reordering technique used by the Nym mixnet. Instead of the batch-and-reorder technique used in traditional mix networks or systems like cMix, we use the stop-and-go technique (also known as continuous-time mixes). This means that instead of batching a certain threshold of packets, a mix delays each packet before forwarding it to the next hop. The amount of time a packet waits in each mix is chosen by the sender, who picks each delay at random from an exponential distribution. Hence the sender can estimate the end-to-end latency of the packet.
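A minimal sketch of the sender-side delay selection described above, in Python. The function names, the three-hop route, and the 50 ms mean delay are illustrative assumptions, not Nym’s actual parameters. Network transmission time is ignored.

```python
import random


def sample_hop_delays(num_hops, mean_delay_s, rng=random):
    """Draw one independent exponential delay per mix on the route."""
    rate = 1.0 / mean_delay_s  # expovariate takes the rate, i.e. 1 / mean
    return [rng.expovariate(rate) for _ in range(num_hops)]


def expected_mixing_latency(num_hops, mean_delay_s):
    """Expected total mixing delay: the sum of the per-hop means."""
    return num_hops * mean_delay_s


delays = sample_hop_delays(num_hops=3, mean_delay_s=0.05)
total = sum(delays)                           # delay this packet will see
expected = expected_mixing_latency(3, 0.05)   # 0.15 s on average
```

Since the sender draws the delays itself, it knows the total mixing delay of each packet in advance, and the average is simply the number of hops times the mean per-hop delay.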

And why did we decide to use this technique? Anonymity!

The exponential distribution has a great property called the memoryless property. If you have no idea what it is, you can think about it intuitively with the example of a lottery: imagine you’re a regular lottery player. So far, you’ve played and lost 100 times. But you’re thinking “Hey, I lost so many times, probably my chances of winning now are much higher!” Well, if the lottery is memoryless, this means that after losing 100 times, your probability of winning is the same as it was when you tried the lottery for the first time. The same if you tried 1,000, 10,000, or 100,000 times (sorry!). But how does this translate to the mixing of packets?

As mentioned earlier, in the mix networks using the batch-and-reorder technique the batch size is the size of your anonymity set. If the mix collects a certain number of packets, let’s say 100, then your anonymity set is only 100, even if overall thousands of packets are processed by the mix network.

The Nym mixnet gives you a much bigger anonymity set. How? Imagine the packets arriving one after another at a mix. This time, there is no batching of packets into groups; instead, each packet entering a mix dwells there for a random amount of time, unknown to the adversary. But how does delaying work as mixing? You might say that, intuitively, although there is some additional random delay added, the packets which enter earlier should also be leaving earlier, right? Well, no, and this is the magic of the memoryless property: the probability that a packet still in the mix leaves at a certain time is independent of its arrival time. This means that when the adversary sees a packet leaving a mix, with a certain probability it can be any of the packets previously observed entering the mix. As a result, compared to the batch mixes, the continuous-time mixes used in Nym have a larger anonymity set. So you’re not anonymous just among a smaller group of other packets, but among all packets which enter the network. Imagine how confused the adversary is now!
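The memoryless property can be checked numerically. The sketch below uses the closed-form survival function of the exponential distribution, P(delay > t) = exp(-rate * t). The rate of 20 per second (a 50 ms mean delay) and the function names are assumptions for illustration. The last helper gives the probability that a packet arriving `gap` seconds after another one still leaves first, assuming both delays are drawn independently with the same rate.

```python
import math


def survival(t, rate):
    """P(delay > t) for an exponential delay with the given rate."""
    return math.exp(-rate * t)


def conditional_survival(waited, extra, rate):
    """P(delay > waited + extra | delay > waited)."""
    return survival(waited + extra, rate) / survival(waited, rate)


def prob_overtake(gap, rate):
    """P(a packet arriving `gap` seconds later leaves first), i.i.d. delays."""
    return 0.5 * math.exp(-rate * gap)


rate = 20.0  # assumed: mean delay of 1 / 20 s = 50 ms
fresh = survival(0.1, rate)  # packet that has just arrived
old = conditional_survival(waited=0.5, extra=0.1, rate=rate)  # waited 0.5 s already
print(math.isclose(fresh, old))  # memoryless: both probabilities agree
print(prob_overtake(0.05, rate))  # later packet overtakes with non-zero probability
```

A packet that has already waited half a second is exactly as likely to stay another 100 ms as one that has just arrived, so the time already spent in the mix tells the adversary nothing about which packet is about to leave.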

HOPR, unlike Elixxir and Nym, does not implement any mixing technique, hence this design is not resistant to traffic analysis and as a result does not give you anonymity. However, they are currently working on adding a primitive mixing technique in a future version.

Cover traffic

Online users have persistent patterns of activity, which over time might disclose information about their communication. For example, if the adversary sees a sudden burst of packets sent by Alice, and at the same time a sudden burst of packets received by another user or service, he might infer the destination of Alice’s communication (traffic confirmation attacks).

Therefore, in order to hide when the users are actually using the mixnet and how many packets they send or receive, the Nym client follows a Poisson process (a kind of randomisation) to schedule packets to send. If the user doesn’t schedule any packets to forward into the mix network, the client sends instead a loop cover packet, i.e., a packet with dummy payload which has as a final destination its sender. The use of loop cover traffic ensures that from the perspective of the adversary the user is active all the time, i.e., the adversary cannot distinguish when the user is actually sending or receiving real communication packets. This is often referred to as unobservability, and of existing implemented mixnets, only the Nym mixnet guarantees this property.
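The sending schedule described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the Nym client: the function name `schedule_sends`, the rate of 5 packets per second, and the idea that all real messages are queued at time zero are illustrative choices.

```python
import random
from collections import deque


def schedule_sends(real_packets, rate_per_s, duration_s, rng=random):
    """Emit packets as a Poisson process; fill idle slots with loop cover."""
    queue = deque(real_packets)
    timeline = []
    # In a Poisson process the gaps between sends are exponentially distributed.
    now = rng.expovariate(rate_per_s)
    while now < duration_s:
        if queue:
            timeline.append((now, "real", queue.popleft()))
        else:
            # Nothing to send: emit a dummy packet addressed back to the sender.
            timeline.append((now, "loop-cover", None))
        now += rng.expovariate(rate_per_s)
    return timeline


timeline = schedule_sends(["m1", "m2"], rate_per_s=5.0, duration_s=10.0)
for when, kind, payload in timeline[:5]:
    print(f"{when:6.3f}s {kind} {payload}")
```

The send times are drawn without looking at the queue, so the timing pattern an observer sees is the same whether the user has real messages to send or not.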

Mixes also send loop cover traffic, which is indistinguishable from clients’ traffic, in order to detect active attacks, for example, flooding attacks [5]. Moreover, such loop cover packets allow checking the quality of service in the network and detecting malicious behaviour (more about that in the next post).

Last but not least, thanks to the combination of continuous-time mixes and tuneable cover traffic, the Nym mixnet ensures that there is always sufficient traffic in the network to guarantee strong anonymity, and hence can support applications with various latency and bandwidth constraints.

Conclusion: Does Nym solve the Anonymity trilemma?

Previous research in anonymous communication networks showed that there seems to be a natural trade-off between strong anonymity, low latency, and low overhead costs [7]. Of course, anonymity can’t come for free: there will always be some trade-offs. In terms of mixnet designs, the question is what trade-offs are chosen and whether the design scales. We believe that Nym, thanks to its cutting-edge design, is the most scalable and anonymous mixnet design possible.

To summarise, Nym is the first anonymous communication system which offers scalability, strong anonymity, and low latency at the same time: Nym hits the sweet spot in the anonymity trilemma [7]. It combines a number of best-of-breed techniques studied over the four decades since the introduction of the first mix network design by Chaum.

Elixxir is the best Chaumian batch mixnet, with strong provable properties around anonymity and the elimination of metadata. This should come as no surprise as Elixxir was designed by the legendary Chaum himself! However, Elixxir is based on closed-source code with its own quantum-resistant cryptocurrency, and requires Chaum’s new breakthrough precomputation phase to obtain high speeds, while its scaling hits the limits of its batch mixnet cascade design in terms of anonymity.

So while any Elixxir-based system wouldn’t need Nym, there’s a world of software that already exists. More like Tor than Elixxir, Nym, as a continuous-time mix, chooses a generic and open-source path to allow it to interoperate with any blockchain and anonymise any internet traffic via advanced statistical techniques. The more traffic that comes into Nym, the better it scales and the more anonymity it provides.

Nym uses a stratified rather than peer-to-peer topology, as peer-to-peer topologies offer poor anonymity by not evenly “mixing” packets, while Nym wants to guarantee privacy for all packets entering the Nym network. Reach out over the Telegram channel if you want to build an application or integrate an existing one on top of Nym.

Given Nym’s focus on scalable anonymity for internet traffic, the next obvious question is how Nym compares to Tor and the emerging world of “decentralised VPNs.” This controversial and complex question will be explored next, but in the meantime, keep questions coming over Twitter and Telegram. Mixnets may be complex, but mixnets are the best chance we have for anonymity against powerful adversaries like the NSA.

References

[1] “Untraceable electronic mail, return addresses, and digital pseudonyms”, D. L. Chaum; Communications of the ACM.

[2] “The Loopix Anonymity System”, A. M. Piotrowska, J. Hayes, T. Elahi, S. Meiser and G. Danezis; USENIX Security Symposium.

[3] “Statistical disclosure or intersection attacks on anonymity systems”, G. Danezis, A. Serjantov; International Workshop on Information Hiding.

[4] “From a trickle to a flood: Active attacks on several mix types”, A. Serjantov, R. Dingledine, P. Syverson; International Workshop on Information Hiding.

[5] “Impact of network topology on anonymity and overhead in low-latency anonymity networks”, C. Diaz, S. J. Murdoch and C. Troncoso; Privacy Enhancing Technologies Symposium.

[6] “cMix: Mixing with Minimal Real-Time Asymmetric Cryptographic Operations”, D. Chaum, D. Das, F. Javani, A. Kate, A. Krasnova, J. De Ruiter, A. T. Sherman; Conference on Applied Cryptography and Network Security.

[7] “Anonymity trilemma: Strong anonymity, low bandwidth overhead, low latency - choose two”, D. Das, S. Meiser, E. Mohammadi, and A. Kate; IEEE Symposium on Security and Privacy (SP).