Microsoft 365 outage was caused by a faulty ECS deployment

Microsoft has identified a faulty Enterprise Configuration Service (ECS) deployment as the root cause of the outage, which affected several of the company’s services.

ECS is an internal central configuration repository that enables Microsoft services to make far-reaching dynamic changes across multiple services and functions. ECS also makes changes to specific configurations per tenant or user.
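As a rough illustration of what per-tenant and per-user configuration can look like, here is a minimal Python sketch. It is not Microsoft's implementation: the function name, the dictionary layout, and the precedence order (user over tenant over service-wide defaults) are all assumptions made for the example.

```python
from typing import Dict

Config = Dict[str, object]


def resolve_config(
    defaults: Config,
    tenant_overrides: Dict[str, Config],
    user_overrides: Dict[str, Config],
    tenant_id: str,
    user_id: str,
) -> Config:
    """Hypothetical layered lookup: later layers win over earlier ones."""
    resolved = dict(defaults)                             # service-wide defaults
    resolved.update(tenant_overrides.get(tenant_id, {}))  # tenant-specific values
    resolved.update(user_overrides.get(user_id, {}))      # user-specific values
    return resolved


# Example data, invented for illustration only.
defaults = {"new_ui": False, "max_upload_mb": 100}
tenant_overrides = {"contoso": {"max_upload_mb": 250}}
user_overrides = {"alice": {"new_ui": True}}

config = resolve_config(defaults, tenant_overrides, user_overrides, "contoso", "alice")
```

In a setup like this, a single bad value in the shared defaults reaches every consumer at once, which is one way to picture how a faulty central deployment can have wide effects.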

“A deployment in the ECS service contained a code defect that affected backward compatibility with services that leverage ECS. The net result was that for services that utilize ECS it would return incorrect configurations to all its partners,” the company said.

Microsoft explained that the extent of the impact depended on how “individual Microsoft services utilize the malformed configuration provided by ECS. Impact ranged from services crashing such as Teams while other services experienced limited to no impact.”

Microsoft said it is working to improve the resilience of the Microsoft Teams service by using a cached ECS configuration version in case of a future ECS failure.
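The idea of falling back to a cached configuration can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Teams or ECS actually work: the `CachedConfigClient` class, the `fetch` and `validate` callables, and the validation rule are hypothetical.

```python
from typing import Callable, Dict, Optional

Config = Dict[str, object]


class CachedConfigClient:
    """Hypothetical client that remembers the last configuration that passed validation."""

    def __init__(
        self,
        fetch: Callable[[], Config],
        validate: Callable[[Config], bool],
    ) -> None:
        self._fetch = fetch
        self._validate = validate
        self._last_good: Optional[Config] = None

    def get(self) -> Config:
        try:
            candidate = self._fetch()
            if not self._validate(candidate):
                raise ValueError("configuration failed validation")
        except Exception:
            # Fetch failed or returned a malformed payload: serve the cached
            # copy if one exists, otherwise let the error surface.
            if self._last_good is None:
                raise
            return self._last_good
        self._last_good = candidate
        return candidate


def make_fetch(responses):
    """Test helper: returns each canned response in turn."""
    iterator = iter(responses)
    return lambda: next(iterator)


# First response is well formed, the second is malformed (assumed schema:
# a valid configuration must contain the key "feature_flags").
client = CachedConfigClient(
    fetch=make_fetch([{"feature_flags": {"chat": True}}, {"unexpected": None}]),
    validate=lambda cfg: "feature_flags" in cfg,
)
first = client.get()
second = client.get()  # malformed response, so the cached copy is served
```

The trade-off in a design like this is that a service may run on stale settings for a while, which is usually preferable to crashing on a malformed configuration.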

The company is also investing in additional fault isolation to limit the impact of an ECS failure, and says it needs to update monitoring thresholds to better detect substandard errors.

The sources for this piece include an article in BleepingComputer.

IT World Canada Staff