Incident Summary
| Field | Details |
|---|---|
| Status | Resolved |
| Severity | Minor |
| Duration | 45 minutes |
| Impact | Intermittent slow page loads for some users |
| Root Cause | Upstream DNS provider caching stale records |
Timeline
| Time (UTC) | Event |
|---|---|
| 14:15 | Monitoring alert: Elevated DNS resolution times |
| 14:20 | Investigating — Confirmed intermittent delays via external probes |
| 14:30 | Identified — Traced to upstream DNS provider serving stale cached records after a TTL configuration change |
| 14:35 | Monitoring — Forced cache flush at DNS provider level |
| 14:50 | DNS resolution times returned to normal |
| 15:00 | Resolved — Confirmed normal operation across all regions |
Root Cause
A TTL (Time to Live) configuration change made during routine DNS optimization resulted in the upstream DNS provider caching records beyond the intended duration. When the CDN edge IP addresses rotated as part of normal operations, some users were directed to stale endpoints.
Resolution
- Forced cache flush at the upstream DNS provider
- Verified TTL values propagated correctly
- Confirmed all regions resolving to current edge IPs
Prevention
- Added DNS resolution monitoring to automated health checks
- Set minimum TTL floor of 300 seconds to prevent aggressive caching
- Added secondary DNS provider for redundancy