Mercury and other applications down for 1h37m due to cloud provider incident
Resolved
Mar 19 at 04:16pm CET
The application became unavailable at 1:59 PM when load balancing failed because of an incident at our cloud provider, Hetzner, in which config changes degraded LoadBalancer performance. Our investigation indicates that an autoscaling event in the cluster at 1:59 PM caused a config change to be propagated to the LoadBalancer, which then failed.
https://status.hetzner.com/incident/b8246859-8fdd-4c44-b35c-d3c9a6cad3a7
We then redeployed the LoadBalancer and deactivated autoscaling in the cluster to prevent further problems related to the Hetzner incident.
We will keep autoscaling deactivated until we have verified in a test setup that Hetzner's mitigation is reliable.
Affected services