log
mostly post-mortems
After a routine update of Docker images that only have the latest tag,
SearXNG broke after a Docker restart. Every single search timed out, even
though the container had internet access. An initial dig showed that requests
took 4 seconds. Running a default SearXNG configuration with vanilla docker run
instead of Ansible worked, so I thought the problem was SearXNG. I copied the
new configuration onto the container run by Ansible, and it still didn’t work.
It must be DNS then. But dig is fast on the host, so what’s wrong?
Ansible led me astray, since the resolv.conf for a normal docker run is
different from the one made by Ansible. I thought the problem must be Ansible
then. Upon closer scrutiny, there were two gateways: 192.168.1.1 and
192.168.19.1. Huh?
It turns out I changed the router gateway to 192.168.19.1 a few days ago and
forgot to update the relevant systemd units. After removing 192.168.1.1,
everything is okay again.
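
A check like the one below might have caught this sooner. It is only a sketch, assuming the two addresses showed up as nameserver lines in the container's resolv.conf. The function names nameservers and stale_nameservers are made up for illustration.

```python
def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses on `nameserver` lines, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (resolv.conf uses '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def stale_nameservers(resolv_conf_text: str, expected: set[str]) -> list[str]:
    """Return configured nameservers that are not in the expected set."""
    return [s for s in nameservers(resolv_conf_text) if s not in expected]
```

Feeding it the contents of /etc/resolv.conf from inside the container, with {"192.168.19.1"} as the expected set, would list any leftover address such as 192.168.1.1.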
Today our certs were due to expire, and today the renewal failed. At 10:00pm, everything was down. Suspecting the LE API token, we rolled it over. Only after rolling it over and restarting Traefik did we realise the old token’s expiry date was set far in the future. The next suspicion was DNS, but you can kindly ask Traefik to resolve your domains with another DNS resolver, like this:
certificatesResolvers:
  letsencrypt:
    acme:
      storage: /etc/traefik/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 0
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"
So we were like: “d’oh, it was not the DNS”… except it was. A few restarts and random guesses later, we found that our router had the option “override DNS for all clients” checked (why was it checked? Who knows. Note to self: don’t change random things without resetting them afterwards). We unticked that and life was back to normal.
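
A small expiry check, run on a schedule, might have flagged the failing renewal before everything went down. This is a sketch using only the Python standard library. The helper names days_left and fetch_not_after are hypothetical, and example.org stands in for the real domain.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_left(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter string.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT". Negative means already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read notAfter from the certificate a server presents.

    The default context verifies the certificate, so an already expired
    one raises ssl.SSLCertVerificationError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example (needs network access; example.org is a placeholder):
# remaining = days_left(fetch_not_after("example.org"), datetime.now(timezone.utc))
# if remaining < 14:
#     print(f"certificate expires in {remaining:.1f} days")
```

The 14-day threshold is an arbitrary choice for illustration; anything comfortably shorter than the renewal window and longer than the time it takes to notice an alert would do.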