What is the Apache access log? Well, at the broadest level, it’s a source of information about who is accessing your website and how.
But as you might expect, a lot more goes into it than just that. After all, people visiting your website aren’t like guests at your wedding, politely signing a registry to record their presence. They’ll visit for a whole host of reasons, stay for seconds or hours, and do all sorts of interesting and improbable things. And some of them will passively (or even actively) thwart information capture.
So, the Apache access log has a bit of nuance to it. And it’s also a little…complicated at first glance.
But don’t worry — demystifying it is the purpose of this post.
Apache Access Log: the Why
I remember starting my first blog years and years ago. I paid for hosting and then installed a (much younger) version of WordPress on it.
For a while, I blogged into the void with nobody really paying attention. Then I started to get some comments: a trickle at first, and then a flood. I was excited until I realized that they were all suspiciously vague and often non sequiturs. “Super pro info site you have here, oPPS, I HITTED THE CAPSLOCK KEY.” And these comments tended to link back to what I’ll gently say weren’t the finest sites the internet had to offer.
Yep. Comment spam.
Somewhere between manually deleting these comments and eventually installing a WordPress plugin to help, I started to wonder where these comments were all coming from. They all seemed to magically appear in the middle of the night and they were spammy, but I was interested in patterns beyond that.
This is a perfect use case for the Apache access log, which keeps a detailed record of the requests made to your website. For each request, it can capture the visitor’s IP address, their browser (reported as the user agent), the actual HTTP request itself, the status of the response, and plenty more.
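To make those fields a little more concrete, here is a minimal Python sketch that pulls them out of a single log line. It assumes your server writes the widely used "combined" log format; your own LogFormat directive may differ. The sample line, the IP address, and the bot name are made up for illustration, and the regular expression is a simplification that does not handle escaped quotes inside a field.

```python
import re

# Assumption: the log uses Apache's "combined" format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical entry resembling a late-night comment-spam request.
sample = (
    '203.0.113.7 - - [10/Oct/2023:03:14:07 +0000] '
    '"POST /wp-comments-post.php HTTP/1.1" 200 512 '
    '"http://example.com/post" "Mozilla/5.0 (compatible; ExampleBot/1.0)"'
)

entry = parse_line(sample)
if entry:
    print(entry["ip"], entry["time"], entry["request"], entry["status"], entry["agent"])
```

With entries parsed this way, spotting patterns such as the same IP address posting comments at 3 a.m. becomes a matter of grouping and counting.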