Microsoft Azure Outage
Incident Report for Medocity
Resolved
Latest update from Microsoft Azure: Preliminary PIR - Azure Front Door - Connectivity Issues (Tracking ID YV8C-DT0)
This is our "Preliminary" PIR to share what we know so far.

After our internal retrospective is completed (generally within 14 days) we will publish a "Final" PIR with additional details/learnings.

What happened?

Between 16:10 and 19:55 UTC on 07 Sep 2022, a subset of customers using Azure Front Door (AFD) might have experienced connectivity issues. This could also have impacted customers’ ability to access other Azure services that leverage Azure Front Door, including the Azure Management Portal and Azure CDN.

What went wrong, and why?

The AFD platform automatically balances traffic across our global network of edge sites. When there is a failure in any of our edge sites or an edge site becomes overloaded, traffic is moved to other healthy edge sites in other regions. This way customers and end users don’t experience any issues in case of regional impacts.
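
The failover behavior described above can be illustrated with a minimal sketch. It is not AFD's actual implementation: the `EdgeSite` fields, the `pick_edge_site` function, and the 0.8 load threshold are all hypothetical, chosen only to show traffic moving away from failed or overloaded sites toward the lowest-latency healthy one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EdgeSite:
    # All fields are hypothetical; they are not taken from AFD.
    name: str
    healthy: bool
    load: float        # fraction of capacity in use, 0.0 to 1.0
    latency_ms: float  # measured latency from the client to this site


def pick_edge_site(
    sites: Sequence[EdgeSite], max_load: float = 0.8
) -> Optional[EdgeSite]:
    """Return the lowest-latency site that is healthy and not overloaded.

    Returns None when no site qualifies, which is the situation a
    build-up of traffic can produce when many sites are offline at once.
    """
    candidates = [s for s in sites if s.healthy and s.load < max_load]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.latency_ms)
```

In this sketch, a site that is closest to the user but unhealthy or above the load threshold is skipped, and the request goes to the next-best site in another region.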

Between 16:10 and 16:45 UTC we observed an unusual spike in traffic, during which the AFD service attempted to load balance traffic for optimal use and minimal latency for customers. In this instance, the load balancing that occurred during the traffic spike caused multiple environments managing this traffic to go offline. We have auto-mitigations that cause our environments to recover in such an event. By design, these environments recover and, once they are in a healthy state, resume managing traffic. In this instance, as users and our systems retried requests, the retries exacerbated the situation: requests built up, and this build-up did not allow time for the environments to fully recover.
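
A common client-side way to keep retries from piling up on a recovering service is exponential backoff with jitter. The report does not say that Microsoft or Medocity used this technique; the sketch below is only an illustration, and the names `backoff_delay` and `call_with_backoff`, along with the default values, are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return a randomized delay in seconds for a zero-based attempt number.

    The upper bound doubles with each attempt up to `cap`; the random
    factor ("full jitter") spreads clients out so they do not retry in step.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on ConnectionError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ConnectionError:
            # Give up after the final attempt instead of sleeping again.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
    raise ValueError("max_attempts must be at least 1")
```

Immediate, synchronized retries add to the request build-up the report describes, while spaced-out retries give the service more time to return to a healthy state.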

How did Microsoft respond?

We manually intervened in the AFD load balancing process by expediting the auto-recovery system and performing more efficient load distributions in regions where there was a large build-up of traffic. Once the environments recovered, we began to gradually bring them back online to resume normal traffic management.
Posted Sep 07, 2022 - 22:30 EDT
Monitoring
A temporary fix has been made by Medocity to correct the issue. Medocity will continue to monitor status with Microsoft Azure.
Posted Sep 07, 2022 - 15:55 EDT
Update
Medocity has bypassed the Azure Front Door load balancers as an interim solution by rerouting the DNS. Unit testing of services has been conducted and all services are now operational. Medocity will continue to monitor the Azure outage and will reroute the DNS back to Front Door once it is stable.
Posted Sep 07, 2022 - 15:45 EDT
Update
Per Microsoft - Azure Front Door - Connectivity issues

Starting at 16:10 UTC on 07 Sep 2022, customers using Azure Front Door could be experiencing connectivity issues. This could also be impacting customers’ ability to access the Azure Management Portal. We are investigating a spike in traffic as a potential cause. While we are not currently observing any traffic spikes, we are working on remediating the residual impact. We are recovering a number of nodes that are showing intermittent connectivity issues. For customers who are experiencing connectivity issues, retries are likely to be successful. Most customers should be seeing recovery at this stage.
Posted Sep 07, 2022 - 13:35 EDT
Investigating
We are currently investigating this issue.
Posted Sep 07, 2022 - 12:32 EDT