Many people have asked for the source code behind our recent post on EC2 I/O performance. After some minimal cleanup, we have now posted the source code on Github: https://github.com/scalyr/iobench. We’ve also created a discussion group for this work: https://groups.google.com/forum/#!forum/scalyr-cloud-benchmarks.Read More
At Scalyr, we’re building a large-scale storage system for timeseries and log data. To make good design decisions, we need hard data about EC2 I/O performance.
Plenty of data has been published on this topic, but we couldn’t really find the answers we needed. Most published data is specific to a particular application or EC2 configuration, or was collected from a small number of instances and hence is statistically suspect. (More on this below.)
Since the data we wanted wasn’t readily available, we decided to collect it ourselves. For the benefit of the community, we’re presenting our results here. These tests involved over 1000 EC2 instances, $1000 in AWS charges, and billions of I/O operations.Read More
On the Scalyr blog, we sometimes post on topics relating to cloud computing in general. This is such a post.
Microsoft’s Azure service suffered a widely publicized outage on February 28th / 29th. Microsoft recently published an excellent postmortem. For anyone trying to run a high-availability service, this incident can teach several important lessons.Read More
37signals recently launched public “Uptime Reports” for their applications (announcement). The reaction on Hacker News was rather tepid, but I think it’s a positive development, and I applaud 37signals for stepping forward. Reliability of cloud applications is a real concern, and there’s not nearly enough hard data out there. Not all products are equally reliable; even within 37signals, the new reports show a 3:1 variation in downtime across apps.Read More