Amazon’s cloud infrastructure suffered a power outage last week that created problems for several big-name clients. Downtime reportedly exceeded 24 hours. Sites including Foursquare, Quora, Moby, Reddit and Pinterest were all affected. Many large EC2 users lost valuable business data; Chartbeat reported that it lost 11 hours of historical data, which has been described as “unrecoverable”.
Amazon has attributed the problems to a power failure, but it has provided no further details about what actually caused the blackout.
The outage did, however, coincide with severe thunderstorms in Virginia. The incident has highlighted one particular issue with cloud computing, although ultimately the advantages outweigh the drawbacks: by leasing software and computing power from a cloud hosting provider, businesses can save a tremendous amount of time and effort and focus more on their core operations.
The issue that has now arisen is the apparent lack of communication between Amazon and its cloud customers. Jeremy Farber, CTO of Car Domains, said: “There is no way to know if you are looking at a 30-minute problem, or a 10-hour problem.” He went on to suggest that Amazon’s very limited communication with customers during downtime has always been frustrating.
This isn’t the first time Amazon Web Services (AWS) has crashed: in April 2011 a major outage affected customers in Virginia, and many people are now questioning the reliability of AWS. Among them is Ken Peck, director of IT at Davita, who has said that the problems are “a big concern”. He went on to state that the allure of the cloud is that it works 24/7 and that it should be “more fault tolerant” than any alternative.
It seems that AWS has some questions to answer, and it also needs to rebuild the confidence of both existing and potential customers in the service.