Python Infrastructure
All Systems Operational
python.org Operational
pypi.python.org Operational
docs.python.org Operational
hg.python.org Operational
bugs.python.org Operational
wiki.python.org Operational
mail.python.org Operational
Mailing Lists and Archives Operational
Message Handling Services Operational
pypy.org Operational
speed.pypy.org Operational
Content Delivery Network Operational
Fastly US East (JFK) Operational
Fastly Asia/Pacific (HK) Operational
Fastly US East (IAD) Operational
Fastly US East (ATL) Operational
Fastly US East (MIA) Operational
Fastly US Central (DEN) Operational
Fastly US Central (DFW) Operational
Fastly US Central (ORD) Operational
Fastly US West (LAX) Operational
Fastly US West (SEA) Operational
Fastly US West (SJC) Operational
Fastly Europe (FRA) Operational
Fastly Europe (AMS) Operational
Fastly Europe (LCY) Operational
Fastly Europe (LHR) Operational
Fastly Asia/Pacific (SYD) Operational
Fastly Asia/Pacific (SIN) Operational
Fastly Asia/Pacific (TYO) Operational
Fastly Asia/Pacific (NZ) Operational
Past Incidents
Apr 24, 2017

No incidents reported today.

Apr 23, 2017

No incidents reported.

Apr 22, 2017

No incidents reported.

Apr 21, 2017

No incidents reported.

Apr 20, 2017

No incidents reported.

Apr 19, 2017

No incidents reported.

Apr 18, 2017

No incidents reported.

Apr 17, 2017

No incidents reported.

Apr 16, 2017

No incidents reported.

Apr 15, 2017
PyPI experienced increased error rates overnight, beginning at 00:35 UTC and ending at 00:43 UTC.

The immediate cause was increased latency in calls to the backing Elasticsearch cluster that serves PyPI's search. This latency caused our backend servers to block on search requests until the default 60s timeout expired, which exhausted our backend worker pools. The Elasticsearch latency was in turn caused by a failing job that temporarily filled the disk on the Elasticsearch server. This job's work is not critical to PyPI's operation, and the volume of data it was processing had grown unchecked.

Search on its own might not seem likely to generate enough traffic to overload our backends to this extent, but the legacy PyPI XML-RPC interface also exposes the search method and is used heavily by automation tools.
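
As a rough illustration of why a modest request rate can exhaust a worker pool, the sketch below applies Little's law (requests in flight = arrival rate × time per request). The request rate is an illustrative assumption, not a measurement from this incident; only the 60s timeout comes from the report above.

```python
def workers_held(requests_per_second, seconds_per_request):
    """Little's law: average number of requests in flight at steady state."""
    return requests_per_second * seconds_per_request


# Illustrative rate only; not a figure from the incident.
search_rps = 5

# Healthy search backend: each call returns quickly.
print(workers_held(search_rps, 0.5))

# Degraded search backend: every call waits for the 60s default timeout.
print(workers_held(search_rps, 60))
```

Under these assumed numbers, the same traffic that normally occupies a handful of workers would hold hundreds once each call waits for the full timeout.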

Mitigation:

- Disabled the failing non-critical job.
- Implemented metrics on calls to Elasticsearch.
- Implemented strict timeouts on calls to Elasticsearch to guard PyPI's availability.

We'll be monitoring the effect of these timeouts via the new metrics and will adjust them as needed to ensure search availability is not adversely affected during regular operation.
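
The combination of a strict timeout and per-call metrics might look something like the sketch below. It is not PyPI's actual code: `timed_search`, `SearchUnavailable`, the in-process `metrics` counters, the 2-second value, and the two fake backends are all hypothetical names and numbers chosen for illustration. The backend is assumed to be any callable that accepts a `timeout` argument and raises `TimeoutError` when it cannot answer in time.

```python
import time
from collections import Counter

SEARCH_TIMEOUT_SECONDS = 2.0  # assumed value, not PyPI's actual setting

metrics = Counter()  # hypothetical in-process counters
latencies = []       # observed call durations, in seconds


class SearchUnavailable(Exception):
    """Raised when the search backend does not answer in time."""


def timed_search(backend, query, timeout=SEARCH_TIMEOUT_SECONDS):
    """Call backend(query, timeout=...) and record outcome and latency."""
    start = time.monotonic()
    try:
        result = backend(query, timeout=timeout)
    except TimeoutError as exc:
        # Fail fast so the worker is released instead of waiting on search.
        metrics["search.timeout"] += 1
        raise SearchUnavailable(query) from exc
    else:
        metrics["search.ok"] += 1
        return result
    finally:
        # Runs for both outcomes, so slow and failed calls are measured too.
        latencies.append(time.monotonic() - start)


def fake_fast_backend(query, timeout):
    """Stand-in for a healthy search client."""
    return [f"result for {query}"]


def fake_slow_backend(query, timeout):
    """Stand-in for a search client that cannot answer within the timeout."""
    raise TimeoutError(f"no response within {timeout}s")


if __name__ == "__main__":
    print(timed_search(fake_fast_backend, "requests"))
    try:
        timed_search(fake_slow_backend, "requests")
    except SearchUnavailable:
        print("search unavailable; worker released")
    print(dict(metrics))
```

Recording latency in the `finally` clause means timed-out calls are included in the measurements, which is what makes it possible to tune the timeout later against real traffic.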
Apr 15, 14:14 UTC
Apr 14, 2017

No incidents reported.

Apr 13, 2017

No incidents reported.

Apr 12, 2017

No incidents reported.

Apr 11, 2017

No incidents reported.

Apr 10, 2017

No incidents reported.