<Marc Qualie/>

Using nginx as a node package proxy cache


It's possible to dramatically improve yarn v1 performance locally by caching registry content in a local nginx proxy. I'm part of teams maintaining a lot of legacy yarn projects and it can be painful when running commands like yarn outdated or yarn install after changes are made to dependencies.

The registry itself is very performant since it sits behind Cloudflare, but that's still a bunch of servers not local to your machine. No matter how performant their systems are, you can't avoid the round-trip times when checking the status of thousands (yes, thousands!) of dependencies.

Docker Compose config (Optional)

The easiest way to run nginx is inside a Docker container. Everything in this guide will still work with nginx running natively on your machine or even as part of a Kubernetes cluster in your network.

Depending on how your network is configured to override DNS (detailed later), you will need nginx 1.27.3 or later, or the paid NGINX Plus, to get the upstream resolve feature used in the config below.

services:
  gateway:
    image: nginx:1.27
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "./nginx/nginx.docker.conf:/etc/nginx/nginx.conf"
      - "./nginx/certs:/etc/nginx/certs"
      - "./nginx/conf:/etc/nginx/conf"
      - "./nginx/logs:/var/log/nginx"
      - "./nginx/cache:/cache/nginx"
    restart: unless-stopped
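
The volume mounts above assume a particular directory layout next to the compose file. Here is a minimal sketch for scaffolding it, assuming you run it from the directory that contains the compose file. The warning is there because Docker creates a directory at a bind-mount path that doesn't exist on the host, and nginx can't read a directory as its config.

```shell
# Create the host directories referenced by the compose volumes.
mkdir -p \
  nginx/certs/registry.yarnpkg.com \
  nginx/conf \
  nginx/logs \
  nginx/cache

# nginx.docker.conf is mounted as a single file, so it has to exist
# before the first `docker compose up`.
if [ ! -f nginx/nginx.docker.conf ]; then
  echo "warning: nginx/nginx.docker.conf is missing, create it before starting the container" >&2
fi
```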

Nginx config

The config can be used on its own or dropped in as part of an existing setup. I've only included the parts required for this to work but by no means is this a fully complete configuration.

http {
  proxy_cache_path /cache/nginx levels=1:2 keys_zone=yarnpkg:50m inactive=1d max_size=10g;

  resolver 1.1.1.1 valid=5m ipv6=off;

  upstream yarnpkg {
    # zone is required to keep entries from resolve in shared memory
    zone upstream_yarnpkg 64k;
    server registry.yarnpkg.com:443 resolve;
  }

  server {
    listen 80;
    listen 443 ssl;
    http2 on;
    server_name registry.yarnpkg.com;
    # details on how to generate these certificates later
    ssl_certificate certs/registry.yarnpkg.com/cert.pem;
    ssl_certificate_key certs/registry.yarnpkg.com/key.pem;
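    # "proxy" must be a log_format defined elsewhere in your config; drop the name to use the default format
    access_log /var/log/nginx/access.yarnpkg.log proxy;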
    error_log /var/log/nginx/error.yarnpkg.log debug;

    # customise this config to fit your own retention needs
    proxy_cache yarnpkg;
    proxy_cache_key $scheme$host$request_uri;
    proxy_cache_min_uses 1;
    proxy_cache_methods GET HEAD;
    proxy_cache_valid 200 4h;
    proxy_cache_valid 404 60m;
    proxy_ignore_headers Cache-Control Set-Cookie;

    location / {
      proxy_pass https://yarnpkg;

      # the custom cache headers help with debugging
      add_header X-Cache $upstream_cache_status;
      add_header X-Upstream-Addr $upstream_addr;
      proxy_set_header Host $host;
      proxy_redirect off;
      proxy_read_timeout 15s;
      proxy_http_version 1.1;
      proxy_ssl_server_name on;
      proxy_ssl_protocols TLSv1.2 TLSv1.3;
      proxy_ssl_name $host;
      proxy_set_header Upgrade "";
      proxy_set_header Connection "keep-alive";
    }
  }
}
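
When debugging, it can help to find the file nginx wrote for a given request. nginx names each cache file after the MD5 hash of the cache key. With levels=1:2 it nests that file under a directory named after the last character of the hash, then one named after the two characters before it. The sketch below assumes GNU md5sum is available (on macOS, md5 -q is the rough equivalent) and that the key is the literal concatenation from proxy_cache_key above. cache_path is a hypothetical helper name, not part of nginx.

```shell
# cache_path KEY [CACHE_DIR]
# Print where nginx would store the cached response for KEY,
# given levels=1:2. CACHE_DIR defaults to the path from the config.
cache_path() {
  key=$1
  dir=${2:-/cache/nginx}
  hash=$(printf '%s' "$key" | md5sum | cut -d' ' -f1)
  level1=$(printf '%s' "$hash" | cut -c32)      # last character of the hash
  level2=$(printf '%s' "$hash" | cut -c30-31)   # the two characters before it
  printf '%s/%s/%s/%s\n' "$dir" "$level1" "$level2" "$hash"
}

# The key is $scheme$host$request_uri, e.g. for https://registry.yarnpkg.com/react
cache_path "httpsregistry.yarnpkg.com/react"
```

With the compose setup from earlier, the same file should show up on the host if you pass ./nginx/cache as the second argument.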

A few key points to note:

Creating a certificate

Unless you want to update your registry to http://localhost for every item in your yarn.lock, you'll need to proxy the real hostname through your local nginx instance. Since it's not possible to get a publicly trusted certificate for registry.yarnpkg.com, we're going to issue one from a local CA using mkcert.

It's super easy to set up this tool if you don't already use it for other things. Never share any certificates you make with it; they are only for local testing on your own machine.

brew install mkcert
mkcert -install

Once it's installed you can use it to generate locally trusted certificates.

mkdir -p nginx/certs/registry.yarnpkg.com
mkcert -cert-file nginx/certs/registry.yarnpkg.com/cert.pem -key-file nginx/certs/registry.yarnpkg.com/key.pem registry.yarnpkg.com

By default yarn commands will still fail even if you have the mkcert CA trusted at the system level. You can fix this by telling yarn exactly where to look when verifying SSL connections.

yarn config set cafile "$(mkcert -CAROOT)/rootCA.pem"

Updating DNS

The easiest way to change the DNS for registry.yarnpkg.com is to update your hosts file and point it to 127.0.0.1 (or wherever you have the nginx proxy running). If you're just using a single machine and want the simplest config then this is it.
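
For the hosts file route, here is a minimal sketch of an idempotent helper. add_hosts_entry is a hypothetical name, and the optional third argument exists so you can try it on a scratch file before touching the real one.

```shell
# add_hosts_entry IP HOST [FILE]
# Append "IP HOST" to FILE (default /etc/hosts) unless HOST is already
# listed on a non-comment line. Writing to /etc/hosts needs root.
add_hosts_entry() {
  ip=$1
  host=$2
  file=${3:-/etc/hosts}
  if awk -v h="$host" '
      /^[[:space:]]*#/ { next }
      { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }
    ' "$file"; then
    echo "$host is already present in $file" >&2
    return 0
  fi
  printf '%s %s\n' "$ip" "$host" >> "$file"
}
```

Something like add_hosts_entry 127.0.0.1 registry.yarnpkg.com ./hosts.test lets you check the result first. To change the real file, run the script as root and leave off the third argument.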

I'm personally running custom DNS on my internal network, so I can add an entry there that overrides the registry hostname and points it to the nginx proxy on my homelab. The benefit is that any machine on my network will now magically use the proxy instead of going directly to the registry.

Verifying your setup

Now that you have an nginx config, docker compose and a valid SSL certificate it's time to try it out.

docker compose up -d
curl -I https://registry.yarnpkg.com/react
> HTTP/2 200
> ...
> x-cache: MISS
> x-upstream-addr: 104.16.25.34:443

Sending the request again should update to x-cache: HIT.

If you don't see the x-cache header then something isn't configured right. Revisit the steps and ensure every configuration option is set, as even just one missing could cause the cache to be skipped, which would defeat the point of this proxy.
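
If you'd rather script the check than read headers by eye, here is a minimal sketch. cache_status is a hypothetical helper that reads response headers on stdin, so it works with any tool that prints them.

```shell
# cache_status: read HTTP response headers on stdin and print the value
# of the X-Cache header. Exits non-zero if the header is absent.
cache_status() {
  tr -d '\r' | awk -F': *' '
    tolower($1) == "x-cache" { print $2; found = 1 }
    END { exit !found }
  '
}

# Usage (needs the proxy running):
#   curl -sI https://registry.yarnpkg.com/react | cache_status
```

One thing to keep in mind: without the always parameter, add_header only applies to a fixed set of success and redirect status codes, so a 404 response won't carry the header even when the cache is working. Test with a package that exists.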

Testing speeds

The easiest way to test this is with the yarn outdated command, since it pulls down the metadata for every dependency in package.json, causing a network round trip for each one.

Here are the results on an actual project. First is with an empty cache and the second is using the cache it just created.

yarn outdated # ✨  Done in 12.52s.
yarn outdated # ✨  Done in 1.48s.
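
To put those two timings in perspective, a tiny sketch that turns them into a ratio. speedup is a hypothetical helper name and the inputs are just the numbers above.

```shell
# speedup COLD WARM: print how many times faster the warm run was.
speedup() {
  awk -v cold="$1" -v warm="$2" 'BEGIN { printf "%.1fx\n", cold / warm }'
}

speedup 12.52 1.48   # prints 8.5x
```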

Conclusion

While this won't make a huge difference for most people, it can certainly help reduce the network burden for people still using multiple large projects on yarn v1. I personally use it when doing batches of dependency upgrades; the ability to run yarn outdated again after modifying package.json and get an instant result is such a time saver.

While this is being used for the yarn registry, there's no reason the same approach can't also be applied to other registries to get the same performance gains.

If you have any questions about this post, or anything else, you can get in touch on Bluesky or browse my code on GitHub.