21.5.3 Brief history of compression in TLS
Both TLS version 1.0, released in 1999 [47], and TLS version 1.1, released in 2006 [48], specify compression for TLS records.
One of the parameters Alice and Bob agree upon during a TLS 1.0 or TLS 1.1 handshake is the compression method – the compression algorithm Alice and Bob use to compress data before it is encrypted.
Technically, the compression algorithms translate the TLSPlaintext data structure into the TLSCompressed data structure shown in Listing 21.1.
Here, length is the byte length of the fragment variable, and it should not be larger than 2^14 + 1,024 bytes. The fragment variable, in turn, holds the compressed form of TLSPlaintext.fragment:
Listing 21.1: The TLSCompressed data structure
struct {
    ContentType type;        /* same as TLSPlaintext.type */
    ProtocolVersion version; /* same as TLSPlaintext.version */
    uint16 length;
    opaque fragment[TLSCompressed.length];
} TLSCompressed;
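To make the layout of Listing 21.1 more concrete, the following Python sketch serializes and parses such a record. It assumes the usual TLS 1.0 wire encoding: a one-byte ContentType, a two-byte ProtocolVersion (major, minor), and a big-endian uint16 length. The helper names pack_tls_compressed and unpack_tls_compressed are hypothetical and chosen only for this illustration; they are not part of any TLS library.

```python
import struct
from typing import Tuple

# Assumed wire layout: type (1 byte), version.major (1), version.minor (1),
# length (2 bytes, big-endian). "!" selects network byte order without padding.
HEADER = struct.Struct("!BBBH")


def pack_tls_compressed(content_type: int, version: Tuple[int, int],
                        fragment: bytes) -> bytes:
    """Serialize a TLSCompressed record (hypothetical helper)."""
    major, minor = version
    return HEADER.pack(content_type, major, minor, len(fragment)) + fragment


def unpack_tls_compressed(record: bytes):
    """Parse a TLSCompressed record back into its three fields."""
    content_type, major, minor, length = HEADER.unpack_from(record)
    fragment = record[HEADER.size:HEADER.size + length]
    if len(fragment) != length:
        raise ValueError("truncated record")
    return content_type, (major, minor), fragment


# 23 is the application_data content type; (3, 1) is the version of TLS 1.0.
record = pack_tls_compressed(23, (3, 1), b"compressed bytes")
print(record[:5].hex())  # prints 1703010010
```

The five header bytes stay unencrypted on the wire, which is why an eavesdropper can always read the length of each record.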
In TLS 1.0 and 1.1, Alice and Bob compress all TLS records using the compression algorithm defined in their current session state. There is always an active compression algorithm. At a minimum, all TLS 1.0 and 1.1 implementations must support CompressionMethod.null, the identity operation, effectively corresponding to the compression algorithm being deactivated.
Importantly, compression algorithms that are allowed by the TLS 1.0 and 1.1 standards must be lossless – for instance, algorithms such as DEFLATE – and they are not allowed to increase the content length by more than 1,024 bytes.
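The two requirements above, losslessness and the 1,024-byte expansion limit, can be checked with a short Python sketch. It uses the standard zlib module as a stand-in for DEFLATE and, as a simplifying assumption, compresses every fragment independently, whereas a real TLS implementation would keep compression state across records. The function names and constants are hypothetical.

```python
import os
import zlib

MAX_PLAINTEXT = 2**14                            # largest TLSPlaintext.fragment
MAX_EXPANSION = 1024                             # growth allowed by the standard
MAX_COMPRESSED = MAX_PLAINTEXT + MAX_EXPANSION   # bound on TLSCompressed.length


def compress_fragment(fragment: bytes) -> bytes:
    """Compress one fragment and enforce the expansion limit."""
    if len(fragment) > MAX_PLAINTEXT:
        raise ValueError("fragment must not exceed 2^14 bytes")
    compressed = zlib.compress(fragment)
    if len(compressed) > len(fragment) + MAX_EXPANSION:
        raise ValueError("compression expanded the fragment by more than 1,024 bytes")
    return compressed


def decompress_fragment(compressed: bytes) -> bytes:
    """Reject oversized input, then undo the compression."""
    if len(compressed) > MAX_COMPRESSED:
        raise ValueError("TLSCompressed.length exceeds 2^14 + 1,024 bytes")
    return zlib.decompress(compressed)


repetitive = b"Cookie: session=abc123\r\n" * 100   # highly redundant data
random_data = os.urandom(MAX_PLAINTEXT)            # essentially incompressible

for name, data in (("repetitive", repetitive), ("random", random_data)):
    compressed = compress_fragment(data)
    assert decompress_fragment(compressed) == data  # lossless round trip
    print(name, len(data), "->", len(compressed))
```

Redundant input shrinks considerably, while random input grows by only a few bytes, well within the allowed margin.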
Because of several attacks that we are going to study next, TLS 1.2 has limited compression, and TLS 1.3 has removed compression from the protocol design altogether. Notably, RFC 8996 [124] formally deprecates TLS versions 1.0 and 1.1 by moving them to Historic status.
21.5.4 CRIME
In 2012, security researchers Juliano Rizzo and Thai Duong published a practical attack exploiting the compression side channel [153]. They named their attack Compression Ratio Info-leak Made Easy or CRIME, for short.
CRIME has even received a dedicated Common Vulnerabilities and Exposures (CVE) identifier, namely CVE-2012-4929. CVE is a system that provides information about publicly known security vulnerabilities and exposures to enable vulnerability management automation. Each CVE identifier is unique, and the corresponding entry contains information about the vulnerability as well as pointers, for example, to security advisories of the affected vendors. One of the best-known public sources where CVEs can be looked up is the National Vulnerability Database (NVD), operated by the United States government.
If Alice and Bob use a vulnerable TLS version up to and including TLS 1.2, Eve can employ CRIME to extract the plaintext from encrypted HTTP headers. She does this by making a series of guesses for a string in the HTTP request – that is, by manipulating that string – and observing the length of the corresponding encrypted traffic. If the value of the string in the HTTP request matches the value of the unknown string in the encrypted HTTP header, lossless compression kicks in and the size of the encrypted traffic is reduced.
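The length side channel described above can be illustrated with a small Python sketch. It is a simplified model, not the original exploit: zlib stands in for TLS-level DEFLATE, encryption is left out because a length-preserving cipher would not change the comparison, and the cookie value, host name, and the function observed_length are hypothetical. The sketch only compares a fully correct guess with a completely wrong one; the actual attack refines its guess piece by piece and has to cope with cases where two guesses yield the same length.

```python
import zlib

# Hypothetical secret cookie value that the browser attaches to every request.
SECRET_VALUE = "d8e8fca2dc0f896fd7cb4cb0031ba249"


def observed_length(guess: str) -> int:
    """Length Eve would observe for a request that contains her guess."""
    request = (
        f"GET /?session={guess} HTTP/1.1\r\n"      # part controlled by Eve
        "Host: alice.example\r\n"
        f"Cookie: session={SECRET_VALUE}\r\n"      # part unknown to Eve
        "\r\n"
    ).encode("ascii")
    # Assumption: ciphertext length tracks compressed length, so the
    # compressed size is used in place of the encrypted record size.
    return len(zlib.compress(request))


right = observed_length(SECRET_VALUE)
wrong = observed_length("GHIJKLMNOPQRSTUVWXYZghijklmnopqr")
print("correct guess:", right, "bytes; wrong guess:", wrong, "bytes")
```

With the correct guess, the compressor can replace the whole cookie with a short back-reference to the earlier occurrence, so the request becomes noticeably shorter than with a guess that shares nothing with the secret.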
Why is this a problem for Alice and Bob? Well, among other things, an HTTP header contains cookies. Cookies are commonly used for user authentication in web applications. As a result, if Eve can extract a cookie transmitted in Bob’s HTTP header, she can impersonate Bob to Alice. In web security, this is known as session hijacking.
To launch the CRIME attack, Eve has to trick Bob into downloading and executing malicious JavaScript code that she can use to craft HTTP requests containing her guess of the secret string in the encrypted HTTP header. In addition, Eve must be capable of assuming the man-in-the-middle role to observe the encrypted network traffic from Bob to Alice and, in particular, the length of the individual network messages.
From the cryptanalytic perspective, CRIME is a combination of a chosen plaintext attack and the information an attacker can obtain through the compression side channel. That, in turn, means IND-CPA (see Section 15.1.1 in Chapter 15, Authenticated Encryption) is broken: Eve can distinguish the ciphertext based on whether it contains the string she has guessed (in the HTTP request) or not.
Exploitation of the compression side channel in CRIME comes as no surprise given that Rizzo and Duong are also the authors of the BEAST attack we previously discussed. To protect against CRIME, data compression was removed from TLS 1.3.