Incident Summary
Between 19:46 UTC on February 2, 2026 and 06:05 UTC on February 3, 2026, Microsoft Azure experienced a critical multi-service outage affecting compute and identity services across multiple regions.
| Field | Details |
|---|---|
| Duration | 10 hours 19 minutes |
| Root Cause | Configuration change restricting storage access |
| Impact | Global, multiple regions |
| Services Affected | Azure Virtual Machines, VMSS, AKS, Azure DevOps, GitHub Actions, Managed Identities |
Timeline
Phase 1: VM and Compute Failure (Feb 2, 19:46 - Feb 3, 00:30 UTC)
| Time (UTC) | Event |
|---|---|
| Feb 2, 19:46 | Customer reports of Azure VM connectivity issues begin to surface |
| Feb 2, 21:30 | Microsoft confirms "a subset of customers" affected |
| Feb 3, 00:00 | Scope expands to include AKS, VMSS, and identity services |
| Feb 3, 00:30 | Recovery begins as configuration rollback is initiated |
Root Cause: A configuration change inadvertently restricted public access to Microsoft-managed storage accounts used to host virtual machine extension packages.
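For customers, one way to detect this failure mode is to inspect each VM's extension provisioning status. The snippet below is a minimal sketch, assuming the Azure Python SDK packages azure-identity and azure-mgmt-compute; the subscription ID, resource group, and VM name are placeholders.

```python
# Minimal sketch (not Microsoft's tooling): poll a VM's extension provisioning
# status to surface Phase 1-style extension installation failures.
# The subscription ID and resource names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RESOURCE_GROUP = "my-resource-group"                       # placeholder
VM_NAME = "my-vm"                                          # placeholder


def check_extension_health() -> list[str]:
    """Return the names of VM extensions that are not in a succeeded state."""
    client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
    # expand="instanceView" includes per-extension runtime status in the response.
    vm = client.virtual_machines.get(RESOURCE_GROUP, VM_NAME, expand="instanceView")
    unhealthy = []
    for ext in (vm.instance_view.extensions or []):
        codes = [s.code or "" for s in (ext.statuses or [])]
        if not any("succeeded" in code.lower() for code in codes):
            unhealthy.append(ext.name)
    return unhealthy


if __name__ == "__main__":
    failing = check_extension_health()
    if failing:
        print(f"Extensions not healthy: {failing}")
```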
Phase 2: Managed Identities Overload (Feb 3, 00:10 - 06:05 UTC)
| Time (UTC) | Event |
|---|---|
| Feb 3, 00:10 | Surge of queued operations overwhelms Managed Identities platform |
| Feb 3, 02:00 | Identity services in East US and West US experience elevated failures |
| Feb 3, 04:30 | Identity platform capacity scaled up |
| Feb 3, 06:05 | Full recovery confirmed across all regions |
Root Cause: When VM services recovered, a backlog of queued identity operations created a surge that overwhelmed the Managed Identities platform in East US and West US.
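Aggressive client retries can add to this kind of post-recovery pressure. The snippet below is a minimal sketch of one mitigation pattern, not an official recommendation: acquire managed identity tokens with capped exponential backoff plus jitter so that many clients retrying at once do not retry in lockstep. The scope shown (Azure Resource Manager) is illustrative.

```python
# Minimal sketch: token acquisition with capped exponential backoff and full
# jitter, spreading retries from many clients over time instead of in bursts.
import random
import time

from azure.identity import ManagedIdentityCredential


def get_token_with_backoff(scope: str = "https://management.azure.com/.default",
                           max_attempts: int = 6):
    credential = ManagedIdentityCredential()
    for attempt in range(max_attempts):
        try:
            return credential.get_token(scope)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff capped at 60 seconds, multiplied by a random
            # factor (full jitter) so retry waves stay decorrelated.
            delay = min(60, 2 ** attempt) * random.random()
            time.sleep(delay)
```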
Services Impacted
Azure Virtual Machines
- Failed to start or provision
- Lost connectivity to storage
- Extension installations failed
Virtual Machine Scale Sets (VMSS)
- Auto-scaling operations failed
- New instance provisioning blocked
- Health probes failed
Azure Kubernetes Service (AKS)
- Pod scheduling disrupted
- Node provisioning failed
- Cluster operations timed out
Azure DevOps & GitHub Actions
- Pipeline runs failed
- Azure-hosted runners unavailable
- Build and deployment operations blocked
Azure Managed Identities
- Authentication requests failed
- Token acquisition timed out (a token-caching sketch follows this list)
- Identity-dependent operations failed
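A related customer-side mitigation is to cache acquired tokens and keep serving them until close to expiry, so a brief identity-platform disruption does not immediately fail every dependent call. The Azure Identity credentials already cache tokens internally; the sketch below only illustrates the pattern, and the refresh margin is an assumed parameter.

```python
# Minimal sketch of a token cache that tolerates short identity outages by
# reusing a still-valid token when a refresh attempt fails.
import time
from typing import Optional

from azure.core.credentials import AccessToken
from azure.identity import ManagedIdentityCredential


class CachingTokenSource:
    def __init__(self, scope: str, refresh_margin_s: int = 300):
        self._credential = ManagedIdentityCredential()
        self._scope = scope
        self._margin = refresh_margin_s
        self._cached: Optional[AccessToken] = None

    def get(self) -> AccessToken:
        now = time.time()
        # Reuse the cached token while it is comfortably within its lifetime.
        if self._cached and self._cached.expires_on - now > self._margin:
            return self._cached
        try:
            self._cached = self._credential.get_token(self._scope)
        except Exception:
            # If refresh fails but the old token has not yet expired, keep it.
            if self._cached and self._cached.expires_on > now:
                return self._cached
            raise
        return self._cached
```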
Impact Assessment
Global Impact: Customers across all Azure regions experienced service disruptions, though East US and West US were most severely affected during the identity overload phase.
Business Impact:
- CI/CD pipelines halted
- Auto-scaling failed during peak traffic
- New deployments blocked for 10+ hours
- Identity-based authentication unavailable
Microsoft's Response
Microsoft's incident response included:
- Immediate Rollback — Configuration change reverted within 5 hours of detection
- Capacity Scaling — Managed Identities platform scaled up to handle surge
- Customer Communication — Status updates via Azure Service Health Dashboard
- RCA Pending — Full Root Cause Analysis to be published
Lessons Learned
For Azure Customers
- Multi-Region Deployments — Avoid concentrating critical workloads in a single region
- Failover Planning — Have documented failover procedures for cloud outages
- Status Monitoring — Implement automated monitoring of Azure Service Health (see the sketch after this list)
- SLA Review — Check Azure SLA for service credit eligibility
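As a sketch of the status-monitoring recommendation, the snippet below polls the public Azure status RSS feed. The feed URL is an assumption on my part; Azure Service Health alerts scoped to your subscriptions remain the supported mechanism for outage notifications.

```python
# Minimal sketch: poll the public Azure status feed and list current items.
# The feed URL is an assumption; prefer Azure Service Health alerts in practice.
import urllib.request
import xml.etree.ElementTree as ET

STATUS_FEED_URL = "https://status.azure.com/en-us/status/feed/"  # assumed URL


def fetch_incident_titles() -> list[str]:
    """Return the titles of items currently published on the status feed."""
    with urllib.request.urlopen(STATUS_FEED_URL, timeout=10) as resp:
        root = ET.fromstring(resp.read())
    return [item.findtext("title", default="") for item in root.iter("item")]


if __name__ == "__main__":
    for title in fetch_incident_titles():
        print(title)
```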
For Cloud Providers
This incident highlights the cascading-failure risk that arises when core infrastructure (storage access) disrupts compute services, whose recovery then generates surge load on identity services.
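One general mitigation for this class of failure is to drain recovery backlogs at a capped rate rather than all at once, so downstream services see a controlled ramp instead of a surge. The sketch below is illustrative only and is not Microsoft's implementation; the rate cap and handler are placeholders.

```python
# Illustrative sketch: drain a backlog of queued operations no faster than a
# fixed rate, giving downstream services (e.g. an identity platform) a
# controlled ramp after recovery instead of a thundering herd.
import time
from collections import deque
from typing import Callable


def drain_backlog(backlog: deque, handle: Callable[[object], None],
                  max_ops_per_second: float = 50.0) -> None:
    """Process queued items while keeping throughput under max_ops_per_second."""
    interval = 1.0 / max_ops_per_second
    while backlog:
        started = time.monotonic()
        handle(backlog.popleft())
        # Sleep off any remaining per-item budget to stay under the rate cap.
        elapsed = time.monotonic() - started
        if elapsed < interval:
            time.sleep(interval - elapsed)
```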
Current Status
Resolved: All services fully recovered as of 06:05 UTC on February 3, 2026.
Microsoft is conducting a full Root Cause Analysis and will publish detailed findings. Customers who experienced SLA violations should file service credit requests through the Azure Portal.