Updated Technical errors with the US-EAST-1 region of Amazon Web Services have caused widespread woes for customers, including difficulty accessing the management console and some other service problems.
The issues appear to be centred on US-EAST-1, the oldest AWS region, which is located in Northern Virginia. Problems there can have a global impact, as AWS noted in its status report:
“This issue is affecting the global console landing page, which is also hosted in US-EAST-1.”
Customers may be able to access region-specific consoles, the company said, by going directly to the URL for that region.
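As a rough illustration of that workaround, the sketch below builds a region-specific console address. The URL pattern (the region name as a subdomain of `console.aws.amazon.com`) and the helper name `regional_console_url` are assumptions made for this example, not something quoted from the AWS status report.

```python
def regional_console_url(region: str) -> str:
    """Return a region-specific console URL.

    Assumption: regional consoles are served from a host of the form
    <region>.console.aws.amazon.com. The status report only says to go
    "directly to the URL for that region" without spelling it out.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must be a non-empty string")
    return f"https://{region}.console.aws.amazon.com/"


# Example: candidate URLs for two regions outside US-EAST-1.
for name in ("us-west-2", "eu-west-1"):
    print(regional_console_url(name))
```

Bookmarking a regional address of this kind avoids the global landing page, which is the component AWS said was hosted in US-EAST-1.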
Within US-EAST-1, though, the affected services are not just the console but also EC2 (Elastic Compute Cloud), DynamoDB and Amazon Connect. In reality, if EC2 is not working correctly, hundreds of other services can be impacted, since they run on EC2 behind the scenes.
The AWS status report for North America showing problems with key services
Twitter filled with frustrated customers, as well as suppliers apologising to their customers for the outage. Even vendors of cloud services were impacted, as many of them also run on AWS. Elastic Cloud, for example, reported: “We are experiencing issues with capacity scaling related to the elevated errors rates within us-east-1 (AWS N. Virginia) and are monitoring the situation.”
Big companies believed to be affected include Amazon’s own Alexa, Music and Ring, as well as Netflix, Disney, Discourse (which reported problems with “AWS Route 53, one of our DNS providers”), Tinder and Roku.
One developer said “AWS goes down and I spend 2 hour trying to debug why my code is not working,” illustrating the extent to which public cloud services are assumed to be up and running.
While it is a serious outage, other regions generally seem to be unaffected, the management console aside. Hyperscale services share a common weakness, though: while resilience is generally very good, inter-dependencies between services create the possibility of cascading failures.
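A toy model shows how such a cascade spreads. The dependency map below is entirely hypothetical (the service names and the edges between them are invented for illustration and do not describe AWS's real architecture); the function simply walks the graph to find everything that directly or indirectly relies on a failed service.

```python
from collections import deque

# Hypothetical dependency map: each service lists the services it relies on.
# These names and edges are invented for this sketch.
DEPENDS_ON = {
    "database": ["compute"],
    "contact_centre": ["compute", "database"],
    "customer_app": ["database"],
    "static_site": [],
}


def impacted_by(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the map so we can look up "who depends on X?".
    dependents: dict[str, list[str]] = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first walk outward from the failed service.
    impacted: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


print(sorted(impacted_by("compute", DEPENDS_ON)))
# → ['contact_centre', 'customer_app', 'database']
```

In this sketch `customer_app` never talks to `compute` directly, yet it still fails because `database` does, while `static_site`, which has no dependencies, stays up.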
AWS in its status report for the console and for EC2 said that “we have identified the root cause and we are actively working towards recovery,” giving hope that the outage will not be long-lived. ®
Updated to add
The outage has been very bad news for the RISC-V team, which is currently hosting a virtual summit.
“We are aware and working closely with the technical team to get this resolved, and will update everyone once it is fully functioning again,” a spokesperson told The Register.
“For those already in a session, we recommend not refreshing the screen as this may disconnect your stream. All sessions are recorded and will be available to you on-demand shortly after the virtual event platform is live again.”
Smartish vacuum maker iRobot is also reporting that services in its app are affected.