A quick intro to Merkle Tree Certificates

Max
Software engineer
Many of us have heard about quantum computers and that they are a threat to our internet security and privacy. Nobody can tell, though, how long it will take before quantum computers are actually powerful enough to pose a serious threat. And even though the internet is often associated with rapid change, its fundamentals tend to take decades to evolve. Therefore, we must act now, before it's too late.

In this blog, I'll take a look at a proposal called Merkle Tree Certificates, meant to improve the security and efficiency of the present Public Key Infrastructure for the web (WebPKI). I implemented a prior version of that proposal in rustls as part of my master's thesis earlier this year. Meanwhile, the proposal has evolved to the point that Cloudflare and Chrome Security are working on an experimental deployment.

Quantum Computers

I bet almost everyone reading this has heard of quantum computers, but far from everyone has a clear idea of what they actually are. Unlike the computers we have today, quantum computers don't just operate on bits that are either zero or one; they exploit physical quantum effects such as superposition, interference, and entanglement. In theory, this allows quantum computers to outperform classical computers at very specific tasks, such as simulating chemistry, developing new medicines, but also breaking some of the cryptography our society relies on. I assume only a few people understand these working principles in detail, and honestly, I'm certainly not one of them. Luckily, the exact details of how quantum computers work don't matter for today's blog. What we should bear in mind is that today's quantum computers are not yet powerful enough to pose an immediate threat, but their continuous improvement makes it important to act now.

Isn’t this a solved problem?

Transport Layer Security (TLS) is a very widespread protocol used to securely transfer data over an untrusted network. It relies on the WebPKI to authenticate the server to the client. Every one of you has used this, for example by reading this blog, as the communication between your device and the server hosting our website uses TLS. In that setup, the WebPKI is used to prove that your browser is actually talking to https://tweedegolf.nl. About 50 % of Cloudflare's TLS connections already use post-quantum secure encryption. And Cloudflare should know, given that about 20 % of all web traffic runs through their network.

Sounds pretty good, doesn't it? It does! But it's important to be precise about what this means exactly. The 50 % refers to post-quantum secure key agreement as described in the IETF Internet-Draft "Hybrid key exchange in TLS 1.3". PQ-secure key exchange ensures the confidentiality of the transmitted data even if an attacker stores the traffic today and tries to decrypt it with a future quantum computer, a so-called "harvest now, decrypt later" attack.

This blog, on the other hand, is about post-quantum secure authentication, not encryption. Servers use certificates to cryptographically prove to a browser that they indeed serve the website you typed into your address bar, i.e., they authenticate the server to the browser. Note that this proof happens at the time you visit the website and consequently cannot be broken retroactively. So one could ask why we should bother with this problem long before Q-Day, the day a quantum computer capable of cracking classical cryptography actually exists. For one, we can't reliably predict when Q-Day will happen: we have to consider not just improvements in hardware, but also in algorithms. For example, in June 2025, Craig Gidney published a paper describing optimizations that allow breaking RSA-2048 with fewer than one million qubits. Before that, we thought this task would need about 20 million qubits. The number of qubits is, next to the error rate, one of several important performance metrics for a quantum computer.

Figure: Landscape of Quantum Computing, by Samuel Jaques

Additionally, history has shown that such fundamental changes to internet infrastructure take a long time to gain widespread adoption. For example, it took almost 10 years from the first experiments at Google to the 50 % of post-quantum TLS key exchanges we see today. Therefore, it's crucial to start the post-quantum transition in time.

What is the problem then?

One of the big challenges we face when replacing classical cryptography with its post-quantum counterparts is size. To understand exactly which parts are affected, let's do a short recap of the two main categories of cryptography. First, there is symmetric encryption, where both sides share the same secret key, which they use to encrypt and decrypt messages. Fortunately, quantum computers do not meaningfully affect symmetric cryptography. On the other hand, we have asymmetric cryptography, where one party holds a secret key and everyone may know the corresponding public key. It's this type of cryptography that is threatened by quantum computers. For this blog, we will focus on the sub-category of signatures, as they are the building blocks of certificates.

Signatures allow A to prove to B that A produced a message or document (authenticity) and that it has not been tampered with in transit (integrity). Roughly, it works like this: A combines their secret key with the message to produce a signature, which B can verify against A's public key. For B to check that A's message is authentic and untampered, B must therefore know A's public key. In theory, this is a great system, but in the context of websites it would mean that every browser needs to know the public key of every website that exists. That's clearly impossible given the sheer mass of existing sites, new websites constantly coming online, and existing ones rotating their keys. In other words, we need a way to reliably distribute public keys. For that, we rely on a very complex and security-critical system: the Web Public-Key Infrastructure (WebPKI).
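
To make the sign-and-verify flow concrete, here is a minimal sketch using the ed25519-dalek crate (assuming version 2.x with its rand_core feature enabled). Ed25519 is one of the classical schemes that quantum computers threaten; it's used here only to keep the example short:

```rust
use ed25519_dalek::{Signature, Signer, SigningKey, Verifier};
use rand::rngs::OsRng;

fn main() {
    // A generates a key pair; the verifying (public) key may be shared freely.
    let signing_key = SigningKey::generate(&mut OsRng);
    let verifying_key = signing_key.verifying_key();

    // A combines the secret key with the message to produce a signature...
    let message = b"this blog post has not been tampered with";
    let signature: Signature = signing_key.sign(message);

    // ...and B verifies the signature against A's public key.
    assert!(verifying_key.verify(message, &signature).is_ok());

    // Any modification of the message makes verification fail.
    assert!(verifying_key.verify(b"tampered message", &signature).is_err());
}
```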

In simplified terms, today's WebPKI consists of roughly 100 Certification Authorities (CAs), which issue certificates to website owners after validating that they control their domain. The certificate tells the browser which public key to use to validate the signature of the website in question. The certificate itself is signed by the issuing CA[1]. This means each browser only needs to know the roughly 100 public keys of the CAs and can validate a website's identity by checking whether the certificate it presents is signed by one of its trusted CAs. In addition, Certificate Transparency (CT) was added later to avoid, or at least detect, CA misbehavior such as the DigiNotar hack in 2011. In total, that means each TLS handshake (the start of every TLS connection) typically exchanges something like five signatures and two public keys between client and server.
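
As a mental model, the browser's trust decision boils down to the sketch below. The types and the verify_signature helper are hypothetical stand-ins, and real validation does far more (chain building, expiry, hostname checks, revocation, CT):

```rust
/// Hypothetical, heavily simplified certificate: just the signed bytes
/// and the CA's signature over them.
struct Certificate {
    tbs_bytes: Vec<u8>, // the "to-be-signed" part of the certificate
    signature: Vec<u8>, // signature by the issuing CA
}

struct CaPublicKey(Vec<u8>);

/// Stand-in for an actual cryptographic signature check.
fn verify_signature(_key: &CaPublicKey, _msg: &[u8], _sig: &[u8]) -> bool {
    unimplemented!("replace with a real signature verification")
}

/// The browser accepts a certificate if any CA in its trust store signed it.
fn is_trusted(cert: &Certificate, trust_store: &[CaPublicKey]) -> bool {
    trust_store
        .iter()
        .any(|ca_key| verify_signature(ca_key, &cert.tbs_bytes, &cert.signature))
}
```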

The National Institute of Standards and Technology (NIST) ran an elaborate competition between 2016 and 2024 to find the best signature schemes that can resist quantum computers, and selected three winners. Each of them has its strengths and weaknesses, but they all have one thing in common: their keys and signatures are significantly bigger than those of the non-quantum-secure signature schemes we use today. Naively replacing all five signatures and two public keys exchanged in a handshake would increase the transmitted size dramatically. To give some numbers: ECDSA-256 uses as few as 64 bytes for a signature and 32 bytes for a public key; the most promising post-quantum signature scheme, ML-DSA-65, requires 3,309 and 1,952 bytes per signature and public key, respectively.

Algorithm  | Signatures (bytes) | Public keys (bytes) | Sum (bytes) | PQ-secure
ECDSA-256  | 320                | 64                  | 384         | no
RSA-2048   | 1,280              | 512                 | 1,792       | no
ML-DSA-65  | 16,545             | 3,904               | 20,449      | yes

(Totals per TLS handshake: five signatures and two public keys.)
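
A tiny sanity check of those sums in code:

```rust
// Bytes of cryptographic material per TLS handshake, assuming the
// five signatures and two public keys described above.
fn handshake_crypto_bytes(sig_len: usize, pk_len: usize) -> usize {
    5 * sig_len + 2 * pk_len
}

fn main() {
    assert_eq!(handshake_crypto_bytes(64, 32), 384); // ECDSA-256
    assert_eq!(handshake_crypto_bytes(256, 256), 1_792); // RSA-2048
    assert_eq!(handshake_crypto_bytes(3_309, 1_952), 20_449); // ML-DSA-65
}
```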

You might argue that certificates with 20 kB of cryptographic data aren't a problem, given the average size of websites nowadays. But keep in mind that not every TLS connection serves a bloated website. Many are simple API calls that don't transport a lot of data. Additionally, in practice there are middleboxes like load balancers and proxies that break for certificates bigger than 10 kB, and the TCP congestion window will cause a significant slowdown. As Cloudflare said in one of their blog posts:

[...] when adding 9KB, we're looking at a slowdown of about 15%. Crossing the 10kB boundary, we are likely to incur an extra roundtrip, which could well lead to a slowdown of more than 60% [in TLS handshake time].

A possible solution: Merkle Tree Certificates

In short, replacing the signatures in certificates with post-quantum variants is not the preferred solution, for performance reasons. So let's take a look at the proposed improvement: Merkle Tree Certificates (MTCs). MTCs rely heavily on so-called hash functions. A hash function is a cryptographic one-way function with a fixed-length output, typically 256 bits. One-way here means that if you only know the output of the hash function, the hash, it's computationally infeasible to find an input that produces this hash. Similarly, you will have a hard time finding two different inputs that produce the same hash[2]. The benefit of hash functions is that they are resistant against quantum computers, while a hash output is only 32 bytes and the hash function itself is computationally lightweight.
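
As a quick illustration with the sha2 crate: the output is always 32 bytes, and even a one-character change in the input produces a completely unrelated hash:

```rust
use sha2::{Digest, Sha256};

fn main() {
    let hash1 = Sha256::digest(b"merkle tree certificates");
    let hash2 = Sha256::digest(b"merkle tree certificatez");

    // Fixed-length output: 32 bytes (256 bits), whatever the input size.
    assert_eq!(hash1.len(), 32);

    // A single changed character yields an unrelated hash.
    assert_ne!(hash1, hash2);

    println!("{:x}", hash1);
}
```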

By using hash functions in smart ways, MTCs need only a single signature and public key plus a Merkle Tree inclusion proof per TLS handshake, provided the client is sufficiently up to date. But what is this inclusion proof? And what is a Merkle Tree in the first place?

Merkle Tree

Basic Merkle Tree with inclusion proof (in red) for content 2 (in yellow)

The image above illustrates a generic Merkle Tree with four items. First, every piece of content that should be included in the Merkle Tree is passed to a hash function. In the next layer, every two of those hashes are hashed together to produce one new hash, building a tree up to the root node. If one wants to prove that a certain content is included under a given root hash, they only have to provide the hashes required to calculate the path up to the root. In the image above, if you want to prove that content 2 (in yellow) is included in the root node, you have to provide h1 and h6 (in red). Because of the properties of hash functions, it's infeasible to alter the contents in a way that leaves the root node unchanged: every single bit flip in any of the contents will produce a completely different hash at the root node.
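
The following minimal sketch builds that four-leaf tree with the sha2 crate and verifies the inclusion proof for content 2. Real deployments (e.g., RFC 6962-style trees) additionally domain-separate leaf and interior hashes, which this sketch omits for brevity:

```rust
use sha2::{Digest, Sha256};

type Hash = [u8; 32];

fn leaf_hash(content: &[u8]) -> Hash {
    Sha256::digest(content).into()
}

fn node_hash(left: &Hash, right: &Hash) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(left);
    hasher.update(right);
    hasher.finalize().into()
}

fn main() {
    // Leaf hashes h1..h4, one per content, as in the figure.
    let h1 = leaf_hash(b"content 1");
    let h2 = leaf_hash(b"content 2");
    let h3 = leaf_hash(b"content 3");
    let h4 = leaf_hash(b"content 4");

    // Interior nodes: h5 = H(h1 || h2), h6 = H(h3 || h4), root = H(h5 || h6).
    let h5 = node_hash(&h1, &h2);
    let h6 = node_hash(&h3, &h4);
    let root = node_hash(&h5, &h6);

    // The inclusion proof for content 2 is the pair of siblings (h1, h6).
    // A verifier knowing only the content and the root recomputes the path:
    let recomputed = node_hash(&node_hash(&h1, &leaf_hash(b"content 2")), &h6);
    assert_eq!(recomputed, root);
}
```

Note that the proof size grows only logarithmically with the number of leaves: even for a tree with a billion entries, a proof consists of about 30 hashes, i.e., under 1 kB.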

The basic idea of Merkle Tree Certificates is that Certification Authorities continuously build a Merkle Tree that includes all assertions the CA has certified, i.e., statements that a public key belongs to a domain. Every so often, they mark one of the nodes as a landmark. Those landmarks are then distributed to relying parties, e.g., browsers, as trusted landmarks. As a result, signatureless certificates only need to include the public key of the authenticating party, e.g., the web server, and an inclusion proof up to a landmark, which typically makes them even smaller and quicker to verify than traditional certificates with non-PQ-secure algorithms. This is remarkable, as it allows us to introduce post-quantum security and a performance improvement at the same time.
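
Schematically, a signatureless certificate then carries little more than the assertion and the proof. The field names below are purely illustrative, not the wire format defined in the draft:

```rust
type Hash = [u8; 32];

/// Illustrative shape of a signatureless MTC certificate; not the actual
/// encoding from the Merkle Tree Certificates draft.
struct SignaturelessCert {
    /// The claims being certified, e.g., the DNS names the key is bound to.
    dns_names: Vec<String>,
    /// The authenticating party's (post-quantum) public key.
    public_key: Vec<u8>,
    /// Identifies which trusted landmark the inclusion proof leads to.
    landmark_id: u64,
    /// Sibling hashes from the assertion's leaf up to the landmark.
    inclusion_proof: Vec<Hash>,
}
```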

The main drawback of this approach is that there is a significant delay between the certificate request and the moment the certificate is usable, partly because issuance has to wait until the next landmark is produced, and mainly because that landmark then has to be distributed to the relying parties. Luckily, most certificate issuance is automated nowadays, and if you request a new certificate early enough before the old one expires, there is a good chance that most clients are already up to date with the required landmark. Nevertheless, there are some less common cases that require immediate issuance. For example, this applies to newly registered domains, but also to unplanned moves of a domain, e.g., to counter a DDoS attack. For these cases, the proposal includes so-called full certificates. To achieve the required level of credibility, a full certificate carries not just the signature of the CA itself, but also cosignatures from so-called co-signers. These co-signers serve a similar purpose as the transparency logs in the CT infrastructure, namely detecting CA misbehavior and other certificate misissuance. The exact number of signatures required in a full certificate depends on the policy of the relying parties. In any case, full certificates are significantly bigger than signatureless certificates, but the expectation is that they are only needed in rare cases.

What's Next?

The Merkle Tree approach sounds very promising in theory: it reduces certificate size while transitioning to post-quantum secure signature algorithms, and it makes certificate transparency a first-class citizen in the WebPKI ecosystem. What more could you want? Well, we're not there yet. First, MTCs will have to prove themselves in practice.

About a year ago, I wrote my master's thesis about MTCs, including a prototype implementation in Rustls, a modern and widespread TLS library written in Rust. Since then, the specification has evolved significantly (the current iteration is unofficially called "photosynthesis"), an IETF working group has formed, and in October 2025, Cloudflare announced that they are deploying MTCs on an experimental basis in collaboration with Chrome Security. I'm excited to see MTCs in action, and I'm curious about their real-world performance!

[1]: The certificate is not directly signed by the root CA itself. Instead, the root CA signs an intermediate certificate, whose key in turn signs the so-called end-entity certificate. The reasons are mainly operational practicalities, and the fact that the private key of the root certificate should be stored offline to reduce the risk of it leaking.
[2]: For those of you who really want to know: a good hash function has three crucial properties: preimage resistance, second-preimage resistance, and collision resistance. The first says it's computationally infeasible to find any input that produces a given output; the second says that, given one input, it's computationally infeasible to find a different input that produces the same output; and collision resistance says it's computationally infeasible to find any two different inputs that produce the same output.
