EC2 Benchmark Followup (Source + Data)

Many people have asked for the source code behind our recent post on EC2 I/O performance. After some minimal cleanup, we have now posted the source code on GitHub. We’ve also created a discussion group for this work: !forum/scalyr-cloud-benchmarks.

There were also a few requests for the raw data. We have now posted it as two separate archives, corresponding to the two rounds of benchmarks described in the previous post (see the “Methodology” section). The first round measured performance for different thread counts; the second round measured only the “optimal” thread count for each configuration, over a longer period of time. The remainder of this post describes the format of these data archives.

Each archive contains eight subdirectories (“trial1”, “trial2”, etc.), corresponding to the eight tested configurations, in order: small/ephemeral, small/ebs1, small/ebs4, medium/ephemeral, large/ephemeral, large/ebs, large/ebs4, and xlarge/ephemeral. Within each subdirectory is an “output” directory containing many numbered files; the numbers identify the EC2 instances being benchmarked.
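As a small sketch of working with this layout (assuming the archive has been extracted locally; the function name is our own, not part of the benchmark tool), the per-instance result files can be enumerated like so:

```python
import glob
import os

def result_files(archive_root):
    """Enumerate the per-instance JSON result files in an extracted
    archive, keyed by (trial directory, instance number)."""
    files = {}
    pattern = os.path.join(archive_root, "trial*", "output", "json.*")
    for path in glob.glob(pattern):
        # e.g. ".../trial3/output/json.2" -> ("trial3", 2)
        trial = os.path.basename(os.path.dirname(os.path.dirname(path)))
        instance = int(path.rsplit(".", 1)[1])
        files[(trial, instance)] = path
    return files
```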

Of primary interest are the JSON files (json.1, json.2, etc.). These summarize the results of the benchmark runs. Each line corresponds to a single benchmark, and is in JSON format with the following structure:

{
  "fileSize": 85899345920,     // size of data file
  "launchTime": 2367,
  "runtime": 120,              // runtime for this benchmark (secs)
  "bucketDuration": 30,        // duration of a time bucket (secs)
  "operations": [
    {
      "signature": "read,4K,4K", // operation tested (here, 4K reads)
      "threadCount": 8,          // number of I/O threads used
      "total": {HISTOGRAM},      // summarizes all operations
      "timeBuckets": [
        {HISTOGRAM},             // see below
        ...
      ]
    },
    ...
  ]
}

We divide the benchmark execution period into buckets. In this example, the benchmark ran for 120 seconds, with 30-second buckets. The timeBuckets array contains a histogram per bucket, reporting on the runtime of all operations completed during that bucket. The “total” field contains a histogram for all operations in the entire benchmark (i.e. summing across time buckets). Note that the timeBuckets array generally contains one extra entry, reflecting straggler operations that completed just after the nominal benchmark runtime.
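As a minimal sketch, one way to load such a file, assuming each line of the actual data files is a standalone JSON object without the annotation comments shown above (the function name and path are illustrative):

```python
import json

def load_benchmarks(path):
    """Parse a json.N results file: one JSON object per line,
    each describing a single benchmark run."""
    benchmarks = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                benchmarks.append(json.loads(line))
    return benchmarks

# Illustrative usage:
# for b in load_benchmarks("trial1/output/json.1"):
#     for op in b["operations"]:
#         print(op["signature"], op["threadCount"])
```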

Each histogram has the following structure:

{
  "count": 19516,             // total number of operations reported here
  "errorCount": 0,            // number of failed operations
  "minValue": 6966,           // minimum runtime (nanos) for any operation
  "maxValue": 581407940,      // maximum runtime (nanos) for any operation
  "totalValue": 959992052130, // total runtime (nanos) for all ops
  "bucketRatio": 1.1,         // ratio between bucket boundaries (see below)
  "firstBucketStart": 6727.4999493256, // start (nanos) of the first bucket
  "buckets": [...],           // operation counts per runtime bucket
  "pinMinimum": 1000,         // runtimes are pinned to this minimum (nanos)
  "pinMaximum": 10000000000   // ... and to this maximum (nanos)
}

Operation runtimes are measured in nanoseconds. Each runtime is pinned to the range [pinMinimum … pinMaximum] and then placed in a bucket. Each entry in the buckets array gives the number of operations whose runtime fell in a particular range: the range for buckets[k] is [B * r^k … B * r^(k+1)], where B is firstBucketStart and r is bucketRatio (1.1 in this example). In other words, the largest value falling into a bucket is 1.1 times the smallest value, and the smallest value for the first bucket is firstBucketStart. The code behind all this is in the repository linked above.
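As a rough sketch of using these fields, here is how bucket boundaries can be reconstructed and an approximate percentile read off a histogram. The helper names are our own illustrations, not part of the benchmark tool, and the percentile is only as precise as the bucket width:

```python
def bucket_range(k, first_bucket_start, bucket_ratio=1.1):
    """Runtime range (nanos) covered by buckets[k]:
    [B * r^k ... B * r^(k+1)]."""
    low = first_bucket_start * bucket_ratio ** k
    return (low, low * bucket_ratio)

def approx_percentile(hist, pct):
    """Approximate the pct-th percentile runtime (nanos): walk the
    buckets until the cumulative count reaches the target, then
    report that bucket's upper boundary."""
    target = hist["count"] * pct / 100.0
    seen = 0
    for k, n in enumerate(hist["buckets"]):
        seen += n
        if seen >= target:
            low, high = bucket_range(
                k, hist["firstBucketStart"], hist["bucketRatio"])
            return high
    return hist["pinMaximum"]  # all operations were pinned at the top
```

For example, with firstBucketStart = 100 and bucketRatio = 1.1, buckets[0] covers runtimes from 100 to 110 nanoseconds.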

Also of possible interest are the files run.out.1, run.out.2, etc., which contain the raw stdout from the benchmark tool. The contents are essentially the same as the JSON files, with some additional logging noise.

If you have questions, please post on the discussion group.