caddy issues

this sounds a whole lot like something else

# context

Recently, with about 70 containers running, we started seeing a lot of 502 timeout errors, with Caddy reporting I/O timeout errors in its logs. It started with Whoogle Search and Linkding. We initially put that down to their Python back ends, which might be slower than other services, but it would occasionally happen to the rest as well, such as Portainer and Flame (a pretty lightweight homepage). A lot of these I/O entries in the logs are justified, like WebSocket errors or bugs within the services themselves (for example, the weird issue with Netdata reporting missing JS libraries when loaded behind a reverse proxy, which seems to be a bandwidth problem in the end).
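One way to tell whether timeouts like these come from the proxy or from the services themselves is to time requests against a service directly and again through Caddy, then compare the outcomes. The following is a minimal sketch of that idea using only the Python standard library. It is not something taken from our setup: the function names `probe` and `tally`, the 5-second timeout, and the number of attempts are all assumptions chosen for illustration.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Time one GET request.

    Returns (status, seconds), where status is the HTTP status code,
    "timeout" if the request timed out, or "error" for other failures.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # e.g. a 502 answered by the proxy
    except urllib.error.URLError as exc:
        # Connection-phase timeouts arrive wrapped in URLError.
        timed_out = isinstance(exc.reason, (socket.timeout, TimeoutError))
        status = "timeout" if timed_out else "error"
    except (socket.timeout, TimeoutError):
        # Read-phase timeouts are raised directly.
        status = "timeout"
    except OSError:
        status = "error"
    return status, time.monotonic() - start


def tally(urls, attempts=5, **probe_kwargs):
    """Probe each URL several times and count the outcomes per URL."""
    report = {}
    for url in urls:
        counts = {}
        for _ in range(attempts):
            status, _elapsed = probe(url, **probe_kwargs)
            counts[status] = counts.get(status, 0) + 1
        report[url] = counts
    return report
```

Calling `tally` with a container's direct address and its proxied hostname (for instance the hypothetical pair `http://localhost:9000` and `https://portainer.example.com`) gives a per-URL count of status codes, timeouts, and other errors. If only the proxied URL shows timeouts or 502s, that points at the proxy path rather than the service.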

# troubleshooting

We did some troubleshooting by doing the following:

# results

In the end, we couldn’t find out why Caddy’s I/O performance was being bottlenecked. We could have tried installing it on bare metal, but I’m not sure that would have helped much, as the problem appeared all of a sudden after weeks of normal, expected performance.