Log Management: What Is It and Why You Need It

To understand log management, you first need to understand what problem it solves. Once you see that, you’ll know both what it is and why you need it.

Software these days involves a lot of complexity that didn’t exist once upon a time. We’ve moved things into the cloud, created software/platforms/infrastructure as services, and embraced distributed computing.

That’s a sea change from the good ol’ days of the 1990s. Back then, you’d write a bunch of code, build it, put it on CDs or floppy disks, and mail it to people. It’s even a sea change from the 2000s, when the web application took over. Instead of CDs, you’d set up a web server, deploy your software to that, and let users and their browsers have at it.

But today, we have containers and microservices. We have software intelligence distributed around the globe, spinning up and down on demand, collaborating and orchestrating. We’ve traded the simplicity of the historical monolith for the flexibility and complexity of distributed intelligence.

Log Files in a Distributed World

Think about the change I’ve just described. And now imagine what that means for the existence of a log file.

In the 1990s, you’d add code to your application that dumped information to a single log file. If your users had problems, they could zip up that log file, along with an OS log file for good measure, and send those to you for troubleshooting. With 2000s web applications, that same application log file, along with the web server log file and the database log file, did the trick.

But now? Good luck. Your production operations include six RESTful microservices on six different servers, a bunch of on-demand containers, a few miscellaneous web apps, a service bus, and who knows what else? Each of those concerns is contained, isolated, simple, and useful.

But troubleshooting across those concerns, when the issue happens in the gaps, can be a mess. And gathering 20 different log files that you attempt to reassemble into some facsimile of order doesn’t help matters at all.

Log Management to the Rescue

That is where the idea of log management as a first-class need enters the picture. If you have a desktop app or a simple web app, you can probably get by with grep, text editors, and elbow grease. But as soon as you grow beyond that, you’re going to need a better approach.

Log management is that better approach. Instead of regarding your applications’ logs as separate, unrelated entities, you conceive of them as parts of a whole. You weave them together and then use them to paint a dynamic, intelligent, and visual picture of the health of all your systems.
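To make “weaving them together” a little more concrete, here is a minimal sketch in Python. It assumes each service writes lines that start with an ISO 8601 timestamp followed by a space and a message; the service names, sample lines, and function names are all hypothetical, and a real log management tool does far more than this.

```python
import heapq
from datetime import datetime


def parse_line(source, line):
    """Split 'ISO-timestamp message' into (timestamp, source, message)."""
    stamp, _, message = line.partition(" ")
    return datetime.fromisoformat(stamp), source, message


def merge_logs(logs_by_source):
    """Interleave per-service logs into one timeline ordered by timestamp."""
    streams = [
        [parse_line(source, line) for line in lines]
        for source, lines in logs_by_source.items()
    ]
    # heapq.merge expects each input to be sorted already, which is
    # assumed here because each service appends to its log in time order.
    return list(heapq.merge(*streams))


# Hypothetical log lines from two services.
sample = {
    "orders": [
        "2017-04-30T13:54:12 received order 17",
        "2017-04-30T13:54:15 payment call timed out",
    ],
    "payments": [
        "2017-04-30T13:54:13 charge requested for order 17",
        "2017-04-30T13:54:14 upstream bank connection reset",
    ],
}

timeline = merge_logs(sample)
for stamp, source, message in timeline:
    print(stamp.isoformat(), f"[{source}]", message)
```

Read as a single timeline, the timeout in the orders service now sits right after the connection reset in the payments service, which is the kind of cross-service story that separate files hide.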

If that sounds daunting, don’t worry. You don’t need to implement all of this yourself. In fact, you definitely shouldn’t do it yourself any more than you should write your own source control. A lot of talented toolmakers have invested significant effort in helping you with your log management.

But rather than focus on specific tools, let’s take a look at log management as a function of its components. What does a good log management scheme involve, and what should you expect out of it?

Choosing Among Log Management Tools

When you google log management tools, an interesting thing happens. At the time of this writing, you see no fewer than four paid ads, followed by a series of posts. These include, and this is not a joke, a post that lists the top 47. As a software developer and tools consumer, I find that this drives me insane. It probably does the same for you.

An author named Barry Schwartz coined a term (along with an eponymous book) for this frustration. He called it “the paradox of choice,” and it describes how, while we like to have some choice and autonomy, too much paralyzes us. To understand this in simple terms, imagine selecting music for a dinner party. If offered two albums from which to choose, you’d make a pretty quick choice. If offered hundreds, you might thumb through them for a long time, trying to consider the likely tastes of all of your guests. And you might eventually just give up and opt for conversation with no background music at all.

The Paradox of Choice Among Log Management Tools

Back in the DevOps world, you face a similar plight when trying to pick among log management tools. You understand that you need a better way to aggregate and mine your logs than “by hand, using Sublime Text,” so you start to do some research. And then, about two searches in, you find yourself staring at a post entitled, “The Top 47 Log Management Tools.” And, if you’re anything like me, you rub your temples and say to yourself, “Ugh, never mind, I’ll figure this out tomorrow.”

That, of course, lines up with Schwartz’s findings about human behavior. Beyond having a few options, each additional option presented to a group of people causes fewer people to participate. The higher the number of log management tools in those posts, the fewer people will actually pick any of them at all.

Luckily, there’s a path back to joy. And it’s not even terribly complicated. You just need to dramatically narrow the field.

So today, I’m not going to add to the pile of “pros/cons/features” posts out there comparing dozens of tools. Instead, I’ll speak to heuristics you can employ to help you choose among log management tools. I’m going to help you narrow the field from a paralyzing number of choices that makes you unhappy to a manageable number that empowers you.

Five Reasons You Need Log Monitoring

You probably regard application logging the way you think of buying auto insurance. You sigh, do it, and hope you never need it. And aren’t you kind of required to do it anyway, or something? Not exactly the scintillating stuff that makes you jump out of bed in the morning.

It feels this way because of how we’ve historically used log files. You dutifully instrument database calls and controller route handlers with information about what’s going on. Maybe you do this by hand, or maybe you use a mature existing tool. Or maybe you even use something fancy, like aspect-oriented programming (AOP). Whatever your decision, you probably make it early, and then further instrumentation becomes rote and obligatory. You forget about it.
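For a sense of what that instrumentation can look like, here is a small sketch using a Python decorator, which is one lightweight way to get an AOP-like effect without a dedicated framework. The logger name, the logged decorator, and the load_user function are hypothetical stand-ins, not part of any particular tool.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
logger = logging.getLogger("app")  # hypothetical logger name


def logged(func):
    """Log entry, exit, and failures of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("calling %s", func.__name__)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback along with the message.
            logger.exception("%s failed", func.__name__)
            raise
        logger.info("%s returned", func.__name__)
        return result
    return wrapper


@logged
def load_user(user_id):
    # Hypothetical stand-in for a database call.
    return {"id": user_id, "name": "example"}


load_user(42)
```

The appeal of this style is that the logging concern lives in one place, so adding it to another function costs a single line. That is also why it is so easy to set up once and then forget.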

At least, you forget about it until, weeks, months, or years later, something happens. Something in production blows up. Hopefully, it’s something innocuous and easily fixed, like your log file getting too big. But more likely some critical and maddeningly intractable production issue has cropped up. And there you sit, scrolling through screens filled with “called WriteEntry() at 2017-04-30 13:54:12,” hoping to pluck the needle of your issue from that haystack.

This represents the iconic use of the log file, dating back decades. And yet it’s an utterly missed opportunity. Your log file can be so much more than just an afterthought and a Hail Mary for addressing production defects. You just need the right tooling.

Log Monitoring to the Rescue

I’ve talked in the past about one form of upgrade from this logging paradigm: log aggregation. A log aggregation tool brings your log files into one central place, parses them, and allows you to search them rapidly. But you can do even more than that, making use of log monitoring via dashboards.
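As a rough illustration of those three steps (centralize, parse, search) plus one dashboard-style number, here is a minimal sketch. It assumes every line has the form "timestamp LEVEL message"; the sources, sample lines, and function names are hypothetical, and a real aggregation tool handles many formats, far larger volumes, and live updates.

```python
import re
from collections import Counter

# Assumed line format: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def aggregate(logs_by_source):
    """Parse every source's lines into one list of structured records."""
    records = []
    for source, lines in logs_by_source.items():
        for line in lines:
            match = LINE_PATTERN.match(line)
            if match:  # lines that don't fit the assumed format are skipped
                records.append({"source": source, **match.groupdict()})
    return records


def search(records, term):
    """Return records whose message contains the term, ignoring case."""
    return [r for r in records if term.lower() in r["message"].lower()]


def errors_per_source(records):
    """Count ERROR records per source, the kind of figure a dashboard charts."""
    return Counter(r["source"] for r in records if r["level"] == "ERROR")


# Hypothetical log lines from two sources.
sample = {
    "web": [
        "2017-04-30T13:54:12 INFO GET /orders/17",
        "2017-04-30T13:54:15 ERROR Timeout calling payments",
    ],
    "payments": [
        "2017-04-30T13:54:14 ERROR Bank connection reset",
        "2017-04-30T13:54:16 INFO retry scheduled",
    ],
}

records = aggregate(sample)
print(search(records, "timeout"))
print(errors_per_source(records))
```

Once lines become structured records in one place, searching is a filter and monitoring is a count over the same data, which is why aggregation tends to come first and dashboards follow.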

The DevOps Job Market

DevOps as a profession and discipline just keeps growing in demand. This year alone, more than 100 conferences around the globe are dedicated to DevOps, and even the “best of” lists are staggering. Hundreds of companies (including Scalyr) are developing tools specifically for the DevOps field, and blogs abound that cover everything from current news and trends to making light of daily DevOps frustrations (RIP DevOps Reactions). Even a casual look at how searches for “DevOps” have trended upward on Google over the past five years makes it pretty clear that demand for professionals in this space will only continue to rise.

Point. Made. (Chart of “DevOps” search interest, via Google Trends)

