Recently, Mark Bowker (a colleague of mine who works upstairs near the ESG Research eggheads) wrote a blog post entitled “The Cloud Busy Signal.” ESG continues to see reports of degraded service and outages at cloud providers. Mark’s post is well worth the read, and he makes a good point, not to mention a good segue into the inaugural episode of Backfires, Misfires, and Duds…
Despite all the benefits of cloud, widespread adoption won’t really take off until providers get their collective acts together and start minimizing these “busy signal” events.
—————–
Misfire – Amazon cloud accused of network slowdown:
“Amazon’s sky-high EC2 service has experienced a significant increase in network latency in recent days, according to data from two separate companies running widely-used management tools in tandem with the service. Cloudkick – one of the many outfits that offer a service for overseeing the use of Amazon EC2 and other so-called compute clouds – first noticed an Amazon latency spike around Christmas time, and the problem has grown steadily over the past few weeks.”
—————–
Misfire –
“Rackspace reports that its cloud computing service is ‘degraded,’ with many customers reporting their sites are unreachable. The company attributed the problem to an unusual load spike in the storage system supporting its cloud platform. The outage came several hours after the Rackspace Cloud disabled CRON, a command commonly used to automate tasks on Unix and Linux systems. By early evening, the company said performance had improved.”
—————–
Backfire – Power Problems at Rackspace London Facility:
“The power interruption started at 9:19 a.m. local time when a module failed on an uninterruptible power supply (UPS) and the unit failed to transfer the load properly, Rackspace said in its status update. Power was restored for most customers by 11:30 a.m., but a subset of servers failed to restart properly… DCOps have had to manually intervene to bring the servers back online. In numerous instances they had to replace power supplies in servers, replace firewalls, reconfigure switches and to log on to servers to get them to boot properly. We can only apologise for this incident”
Read more of the ESG IT Team’s blog entries at IT Artillery.





