PyPI Outage 2020-03-26
Incident Report for Python Infrastructure
Resolved
From approximately 06:30 UTC to 12:00 UTC on 2020-03-26, PyPI's backends were unavailable causing new uploads, actions on the web ui, and xmlrpc access to fail.

This outage began when a deployment failed due to a core service in PyPI's hosting platform having crashed. The service provides internal TLS certificates and encrypted secret storage for projects that run in the cluster which are necessary as our applications deploy. As the deployment rolled back, the downed service caused that to fail also, leaving PyPI unavailable.

Misconfigurations in our monitoring and alerting caused this outage to go on longer than anticipated and fail to update our status page automatically. Although pages were correctly sent they did not reach the correct administrators to address the outage. We're reviewing and updating our automation for the status page as well as routing for pages to reduce the length of outages like these in the future.
Posted Mar 26, 2020 - 06:30 UTC