Between 04:30 and 09:15 UTC, PyPI experienced elevated errors when performing search operations via the website and XMLRPC search function, affecting users of the website, as well as clients such as pip that call the XMLRPC search function.
Logs indicate that the Elasticsearch cluster powering PyPI experienced an internal issue causing latency for requests to elevate. This latency was higher than PyPI is configured to tolerate. As a result, rather than allowing increased latency to overrun our backend capacity PyPI search endpoints returned an error.
When the Elasticsearch cluster's health resolved, search functionality returned without intervention.
This elevation in error rates was just under our threshold to page a PyPI administrator.
We've opened a ticket with our Elasticsearch hosting vendor to determine if there's something that needs to be addressed and will double check our monitoring and alerting to determine if our thresholds for paging need to be updated.