MWoS 2015: Let’s Encrypt Automation Tooling

winterOfSecurity_logo_dark_vertical2The Mozilla Winter of Security of 2015 has ended, and the participating teams of students are completing their projects.

The Certificate Automation tooling for Let’s Encrypt project wrapped up this month, having produced an experimental proof-of-concept patch for the Nginx webserver to tightly integrate the ACME automated certificate management protocol into the server operation.

The MWoS team, my co-mentor Richard Barnes, and I would like to thank Klaus Krapfenbauer, his advisor Martin Schmiedecker, and the Technical University of Vienna for all the excellent research and work on this project.

Below is Klaus’ end-of-project presentation on AirMozilla, as well as further details on the project.

MWoS Let’s Encrypt Certificate Automation presentation on AirMozilla

Developing an ACME Module for Nginx

Author: Klaus Krapfenbauer

Note: The module is an incomplete proof-of-concept, available at https://github.com/mozilla/mwos-letsencrypt-2015

The 2015-2016 Mozilla Winter of Security included a project to implement an ACME client within a well-known web server, to show the value of automated HTTPS configuration when used with Let’s Encrypt. Projects like Caddy Server showed the tremendous ease-of-use that could be attained, so for this project we sought to patch-in such automation to a mainstay web server: Nginx.

The goal of the project is to build a module for a web server to make securing your web site even easier. Instead of configuring the web server, getting the certificate (e.g. with the Let’s Encrypt Certbot) and installing the certificate on the web server, you just need to configure your web server. The rest of the work is done by the built-in ACME module in the web server.

Nginx

This project didn’t specify which particular web server we should develop on. We evaluated several, including Apache, Nginx, and Stunnel. Since the goal is to help as many people as possible in securing their web sites we narrowed to the two most widely-used: Nginx and Apache. Ultimately, we decided to work with Nginx since it has a younger code base to develop with.

Nginx has a module system with different types of modules for different purposes. There are load-balancer modules which pass the traffic to multiple backend servers, filter modules which convert the data of a website (for example encrypt it like the SSL/TLS module) and handler modules which create the content of a web request (e.g. the http handler loads the html file from the disk and serves it). In addition to their purpose the module types also differ in how they hook into the server core, which makes the choice crucial when you start to implement a Nginx module. In our case none of the types were suitable, which introduced some difficulties, discussed later.

The ACME module

The Nginx module should be a replacement of the traditional workflow involving the ACME Certbot. Therefore the features of the module should resemble the features of the Certbot. This includes:

  • Generate and store a key-pair
  • Register an account on an ACME server
  • Create an authorization for a domain
  • Solve the HTTP challenge for the domain authorization
    • At a later date, support the other challenge types
  • Retrieve the certificate from the ACME server
  • Renew a certificate
  • Configure the Nginx SSL/TLS module to use the certificate

To provide the necessary information for all the steps in the ACME protocol, we introduced new Nginx configuration directives:

  • A directive to activate the module
  • Directive(s) for the recovery contact of the ACME account (optional)
    • An array of URIs like “mailto:” or “tel:”

Everything else is gathered from the default Nginx directives. For example, the domain for which the certificate is issued is taken from the Nginx configuration directive “server_name”.

An architecture diagram showing the different resources available to Nginx, and their relationships with the ACME module developed, as well as the ACME server: Let's Encrypt.

Architecture of the ACME Module for Nginx

As the ACME module is an extension of the Nginx server itself, it’s a part of the software and therefore uses the Nginx config file for its own configuration and stores the certificates in the Nginx config directory. The ACME module communicates with the ACME server (e.g. Let’s Encrypt, but it could be any other server speaking the ACME protocol) for gathering the certificate, then configures the SSL/TLS module to use this certificate. The SSL/TLS module then does the encryption work for the website’s communication to the end user’s browser.

Let’s look at the workflow of setting up a secure website. In a world without ACME, anyone who wanted to setup an encrypted website had to:

  1. Create a CSR (certificate signing request) with all the information needed
  2. Send the CSR over to a CA (certificate authority)
  3. Pay the CA for getting a signed certificate
  4. Wait for their reply containing the certificate (this could take hours)
  5. Download the certificate and put it in the right place on the server
  6. Configure the server to use the certificate

With the ACME protocol and the Let’s Encrypt CA you just have to:

  1. Install an ACME client
  2. Use the ACME client to:
    1. Enter all your information for the certificate
    2. Get a certificate
    3. Automatically configure the server

That’s already a huge improvement, but with the ACME module for Nginx it’s even more simple. You just have to:

  1. Activate the ACME module in the server’s configuration

Pretty much everything else is handled by the ACME module. So it does all the steps the Let’s Encrypt client does, but fully automated during the server startup. This is how easy it can and should be to encourage website admins to secure their sites.

The minimal configuration work for the ACME module is to just add the “acme” directive to the server context in the Nginx config for which you would like to activate it. For example:

…
http {
…
  server {
    listen 443 ssl;
    server_name example.com;
    acme;
    …
    <recommended SSL hardening config directives>
    …
    location / {
      …
    }
  }
}
…

Experienced challenges

Designing and developing the ACME module was quite challenging.

As mentioned earlier, there are different types of modules which enhance different portions of the Nginx core server. The default Nginx module types are: handler modules (which create content on their own), filter modules (which convert website data – like the SSL/TLS module does) and load-balancer modules (which route requests to backend servers). Unfortunately, the ACME module and its inherent workflow does not fit any of these types. Our module breaks these conventions: it has its own configuration directives, and requires hooks into both the core and other modules. Nginx’s module system was not designed to accommodate our module’s needs, therefore we had a very limited choice on when we could perform the ACME protocol communication.

The ACME module serves to configure the existing SSL/TLS module, which performs the actual encryption of the website. Our module needs to control the SSL/TLS module to some degree in order to provide the ACME-retrieved encryption certificates. Unfortunately, the SSL/TLS module does a check for the existence and the validity of the certificates during the Nginx configuration parsing phase while the server is starting. This means the ACME module must complete its tasks before the configuration is parsed. Our decision, due to those limitations, was to handle all the certificate gathering at the time when the “acme” configuration directive is parsed in the configuration during server startup. After getting the certificates, the ACME module then updates the in-memory configuration of the SSL/TLS module to use those new certificates.

Another architectural problem arose when implementing the ACME HTTP challenge-response.  To authorize a domain using the ACME HTTP challenge, the server needs to respond with a particular token at a well known URL path in its domain. Basically, it must publish this token like  a web server publishes any other site. Unfortunately, at the time the ACME module is processing, Nginx has not yet started: There’s no web server. If the ACME module exits, permitting web server functions to begin (and keeping in mind the SSL/TLS module certificate limitations from before), there’s no simple mechanism to resume the ACME functions later. Architecturally, this makes sense for Nginx, but it is inconvenient for this project. Faced with this dilemma, for the purposes of this proof-of-concept, we decided to launch an independent, tiny web server to service the ACME challenge before Nginx itself properly starts.

Conclusion

As discussed, the limitations of a Nginx module prompted some suboptimal architectural design decisions. As in many software projects, the problem is that we want something from the underlying framework which it wasn’t designed to do. The current architectural design of the ACME module should be considered a proof-of-concept.

There are potential changes that would improve the architecture of the module and the communication between the Nginx core, the SSL/TLS module and the ACME module. These changes, of course, have pros and cons which merit discussion.

One change would be deferring the retrieval of the certificate to a time after the configuration is parsed. This would require spoofing the SSL/TLS module with a temporary certificate until the newly retrieved certificate is ready. This is a corner-case issue that arises just for the first server start when there is no previously retrieved certificate already stored.

Another change is the challenge-response: A web server inside a web server (whether with a third party library or not) is not clean. Therefore perhaps the TLS-SNI or another challenge type in the ACME protocol could be more suitable, or perhaps there is some method to start Nginx while still permitting the module to continue work.

Finally, the communication to the SSL/TLS module is very hacky.

Current status of the project & future plans

The current status of the module can be roughly described as a proof-of-concept in a late development stage. The module creates an ephemeral key-pair, registers with the ACME server, requests the authentication challenge for the domain and starts to answer the challenge. As the proof of concept isn’t finished yet, we intend to carry on with the project.

Many thanks

This project was an exciting opportunity to help realize the vision of securing the whole web. Personally, I’d like to give special thanks to J.C. Jones and Richard Barnes from the Mozilla Security Engineering Team who accompanied and supported me during the project. Also special thanks to Martin Schmiedecker, my professor and mentor at SBA Research at the Vienna University of Technology, Austria. Of course I also want to thank the whole Mozilla organization for holding the Mozilla Winter of Security and enabling students around the world to participate in some great IT projects. Last but not least, many thanks to the Let’s Encrypt project for allowing me to participate and play a tiny part in such a great security project.