The most recent problems began around 9 a.m. UTC on Thursday and were still ongoing shortly after midday UTC, Microsoft reported on its Azure status page.
However, they first showed up in Visual Studio Team Services on Wednesday, between 9.44 p.m. and 11.44 p.m. UTC, Microsoft said. Customers using its West Europe, South Central U.S., North Central U.S., and Australia East data centres may have run into HTTP 500 Internal Server Error responses during that window, the company said.
Staff traced those errors back to a recent configuration change in Azure Active Directory, but rolling back the change did not eliminate them.
"Some of the roles in the farm across our Scale Units hit a caching bug that was triggered by the earlier outage. At this moment, we do not understand root cause of the caching bug, however we have taken the required dumps to do final root cause analysis and get to the bottom of the issue," Microsoft staff explained shortly after midnight UTC.
The problems on Thursday morning affected a wider range of services that depend on Azure Active Directory, including Stream Analytics, the Azure management portals, Azure Data Catalogue, Operational Insights, Remote App and SQL databases.
Some Office 365 customers were also unable to log in or access the service.
While preparing a failover to working servers, "the Azure Active Directory team identified an issue with the failover mitigation path, which would have blocked the mitigation," Microsoft reported.
With that path ruled out, the team was forced to take a more laborious one: updating the Azure Active Directory front ends to call a known-good configuration in the hope that this would improve performance.
Microsoft promised another status update around 1.10 p.m. UTC (8.10 a.m. ET).