One API for All Your Server Logs

Our goal at Scalyr is to provide sysadmins and DevOps engineers with a single log monitoring tool that replaces the hodgepodge of tools they were previously using. We’ve come a long way in doing that. Today, Scalyr is a unified, cloud-based tool that lets you aggregate multiple server logs, monitor and analyze them, set custom log alerts, and create custom dashboards. Still, we work hard to continue improving and making it an even more useful tool for you, and we listen closely to users’ feedback.

Cloud Cost Calculator

Editor’s Note: While many of you may find this somewhat dated post interesting, the calculator itself has been retired for now. We’ve removed links to the Cloud Calculator below.

There are many, many options for cloud server hosting nowadays. EC2 pricing alone is so complex that quite a few pages have been built to help sort it out. Even so, while comparing costs for various scenarios (on demand vs. reserved instances, “light utilization” vs. “heavy utilization” reservations, EC2 vs. other cloud providers), we here at Scalyr recently found ourselves building spreadsheets and looking up net-present-value formulas. That seemed a bit silly, so we decided to do something about it. And so we now present, without further ado: the Cloud Cost Calculator [Link Removed – Content out of date!].
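For readers curious what such a net-present-value comparison looks like, here is a minimal sketch. It is not the retired calculator's code; the function name, the prices, and the 5% discount rate are all hypothetical and chosen only for illustration.

```python
def present_value(upfront, monthly_cost, months, annual_discount_rate):
    """Present value of an upfront fee plus a stream of monthly payments.

    Hypothetical helper for illustration; not taken from the calculator.
    """
    # Convert the annual discount rate to an equivalent monthly rate.
    monthly_rate = (1 + annual_discount_rate) ** (1 / 12) - 1
    pv = upfront
    for t in range(1, months + 1):
        # A payment made t months from now is worth less today.
        pv += monthly_cost / (1 + monthly_rate) ** t
    return pv


# Made-up prices: on demand has no upfront fee but a higher monthly bill;
# the reservation trades an upfront fee for a lower monthly bill.
on_demand = present_value(upfront=0, monthly_cost=70, months=36,
                          annual_discount_rate=0.05)
reserved = present_value(upfront=1000, monthly_cost=30, months=36,
                         annual_discount_rate=0.05)
cheaper = "reserved" if reserved < on_demand else "on demand"
```

Discounting matters because an upfront fee is paid in today's dollars while monthly charges are spread over the term, so a simple sum of payments overstates the cost of the pay-as-you-go option.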

Optimizing AngularJS: 1200ms to 35ms

Edit: Due to the level of interest, we’ve released the source code to the work described here: https://github.com/scalyr/angular.

Here at Scalyr, we recently embarked on a full rewrite of our web client. Our application is a broad-spectrum monitoring and log analysis tool. Our home-grown log database executes most queries in tens of milliseconds, but each interaction required a page load, taking several seconds for the user.

Exploring the Github Events Firehose

Here at Scalyr, we’ve been having a lot of fun building out a high-speed query engine for log data, and a snappy UI using AngularJS. However, we haven’t had a good way to show it off: a data exploration tool is useless without data to explore. This has been a challenge when it comes to giving people a way to play with Scalyr Logs before signing up. We recently learned that Github provides a feed of all actions on public repositories. That sounded like a fun basis for a demo, so we began importing the feed. (To explore the data yourself, see the last paragraph.)

Good News: Your Monitoring Is All Wrong

This is the first in a series of articles on server monitoring techniques. If you’re responsible for a production service, this series is for you.

In this post, I’ll describe a technique for writing alerting rules. The idea is deceptively simple: alert when normal, desirable user actions suddenly drop. This may sound mundane, but it’s a surprisingly effective way to increase your alerting coverage (the percentage of problems for which you’re promptly notified), while minimizing false alarms.
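As a rough illustration of the idea, here is a minimal sketch of such a rule. It is one possible reading, not the rule from the full article: the function name, the median baseline, and every threshold below are assumptions made for the example.

```python
from statistics import median


def activity_dropped(counts, baseline_window=12, drop_ratio=0.5,
                     min_baseline=10):
    """Return True if the latest count fell well below the recent baseline.

    counts: per-interval counts of a desirable user action (for example,
    successful logins per five minutes), oldest first. All parameters are
    hypothetical defaults for illustration.
    """
    if len(counts) < baseline_window + 1:
        return False  # not enough history to form a baseline
    # Baseline is the median of the intervals just before the latest one.
    baseline = median(counts[-(baseline_window + 1):-1])
    if baseline < min_baseline:
        return False  # too little traffic for a drop to be meaningful
    return counts[-1] < baseline * drop_ratio


normal = [100, 104, 98, 101, 97, 103, 99, 102, 100, 96, 105, 101, 99]
outage = normal[:-1] + [20]
quiet_hours = [3] * 12 + [0]
```

With these sample series, `activity_dropped(outage)` fires while `activity_dropped(normal)` and `activity_dropped(quiet_hours)` do not. The minimum-baseline guard is one simple way to keep quiet periods from producing false alarms.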

Announcing Scalyr Logs

“Holy crap. You guys are awesome… I’m already finding issues I wasn’t aware of. The ability to click on a piece of the log and find similar items is fantastic.”

Eighteen months ago, we began developing Scalyr, which combines server monitoring, log collection and analysis, alerts, dashboards, and other functions into a practical, comprehensive DevOps tool. Last fall, we began real-world deployments in a closed beta program. The quote above was a comment – unsolicited – from one of our beta customers. Today, we’re excited to announce that we have exited beta and the service is available to all.

“Benchmarking in the Cloud” talk online

Amazon has posted the talks from re:Invent on YouTube. The video from the EBS session is here. My brief presentation on “Benchmarking in the Cloud” starts at the 30:16 mark (direct link). You can download my slides here.

It was a terrific conference. The pace of development, and the plain enthusiasm and energy, around cloud services in general and AWS in particular are amazing. I do recommend checking out some of the talks if you have time.

Server Monitoring Talk Now Online

The video of my talk on server monitoring (“Famous Outages, and How To Not Have Them”) is now available:

Thanks to Box for providing the venue and a good crowd, and thanks to the crowd for a great response. The talk is aimed at anyone who is running a production system, large or small. The focus is on how to get good monitoring coverage for a reasonable investment in effort, spiced up with plenty of stories about real-world production outages.