log

mostly post-mortems




it's always dns

After a routine update of Docker images that only have the latest tag, Searxng broke after a Docker restart. Every single search timed out. The container had internet access. An initial dig showed that requests took 4 seconds. Running a default Searxng configuration with a vanilla docker run instead of Ansible worked, so I thought the problem was Searxng. I copied the new configuration onto the container started by Ansible, and it still didn’t work. It must be DNS then. But dig was fast on the host, so what was wrong?

Ansible led me astray, since the resolv.conf for a normal docker run is different from the one made by Ansible. I thought the problem must be Ansible then. Upon closer scrutiny, there were two gateways: 192.168.1.1 and 192.168.19.1. Huh?

It turned out I had changed the router gateway to 192.168.19.1 a few days earlier and forgotten to update the relevant systemd units. After removing 192.168.1.1, everything is okay again.
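For next time, a minimal sketch of a check that would have caught this sooner. It assumes the stale address shows up as a `nameserver` line in the container's resolv.conf, which is my reading of the symptoms rather than something I verified at the time. The function names `parse_nameservers` and `unexpected_nameservers` are made up for this example.

```python
def parse_nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # resolv.conf comments start with '#' or ';', so cut the line there.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def unexpected_nameservers(resolv_conf_text: str, expected: set[str]) -> list[str]:
    """Return the listed nameservers that are not in the expected set."""
    return [s for s in parse_nameservers(resolv_conf_text) if s not in expected]


if __name__ == "__main__":
    # Hypothetical sample resembling what the broken container might have had.
    sample = "nameserver 192.168.1.1\nnameserver 192.168.19.1\n"
    # Assumption: 192.168.19.1 is the only resolver that should be present.
    print(unexpected_nameservers(sample, {"192.168.19.1"}))
```

To run it against a real container, feed it the text of that container's /etc/resolv.conf instead of the sample string. Anything it prints is a resolver that probably should not be there.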




it's always dns - electric boogaloo

Today, our certs expired, and the renewal failed. At 10:00pm, everything was down. Suspecting the LE API tokens, we rolled them over. Only after rolling them over and restarting Traefik did we realise the old token’s expiry date was set sometime in the far future. The next suspicion was DNS, but you can kindly ask Traefik to resolve your domains with another DNS resolver, like this:

certificatesResolvers:
  letsencrypt:
    acme:
      storage: /etc/traefik/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 0
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

So we were like: “d’oh, it was not the DNS”… except it was. A few restarts and random guesses later, we found that our router had the option “override DNS for all clients” checked (why was it checked? Who knows. Note to self: don’t change random things without resetting them afterwards). We unticked that and life was back to normal.