Five To-Dos When Monitoring Your Kubernetes Environment

If you’re on the DevOps front line, Kubernetes is fast becoming an essential element of your production cloud environment. Since container orchestration is critical to deploying, scaling, and managing your containerized applications, monitoring Kubernetes needs to be a big part of your monitoring strategy.

Container environments don’t operate like traditional ones. So, if you are monitoring your applications and infrastructure, you need to be thoughtful about how you monitor the container environment in which they are running. Here are five best practices to inform your strategy:

  1. Centralize your logs and metrics. Orchestrating your containerized services and workloads through Kubernetes brings order to the chaos, but remember that your environment is still decentralized. You will give yourself a fighting chance if you centralize your logs and metrics.
  2. Account for ephemeral containers. The beauty of container orchestration is that it’s easy to start, stop, kill, and clean up your containers in short order. However, monitoring them may not be so easy. You still need to debug problems and monitor cluster activity, even when services are coming and going. The trick is to grab the logs and metrics before they’re gone. If you don’t, your metrics will look more like the graph on the left than the one on the right.
    log files examples for transient containers
  3. Simplify, simplify, simplify. With all of the moving pieces in your container environment (services, APIs, containers, orchestration tool), you need to monitor without introducing unneeded complexity. Rather than bloating your container with various monitoring agents, each requiring updates on unique schedules, abstract your monitoring and management tools from what you’re monitoring and managing. This will also help your engineers focus on building and delivering software, not operating the delivery platform.
  4. Monitor each layer explicitly. You will need to collect logs and monitor for errors, failures, and performance issues at each layer – the pod, the container, and the controller manager – of your environment. For example, you’ll need to be able to troubleshoot pod issues, ensure the container is working, and collect runtime metrics in the controller manager.
  5. Ensure data consistency across layers. For fast, accurate debugging, you need to ensure data consistency across all the layers in your container environment. Things like accurate timestamps, consistent units of measurement (such as milliseconds vs. seconds), and collecting a common set of metrics and logs across applications and components will help you troubleshoot and debug quickly and accurately across all of your layers.
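
As a rough illustration of the fifth point, the sketch below normalizes records from different layers so that every timestamp is a UTC ISO 8601 string and every duration is in milliseconds. The function name, the record fields, and the unit labels are hypothetical, chosen only for this example; they do not come from any particular agent.

```python
from datetime import datetime, timezone

# Multipliers that convert a duration in the given unit into milliseconds.
_TO_MILLISECONDS = {"s": 1000.0, "ms": 1.0, "us": 0.001}


def normalize_record(source, epoch_seconds, duration, unit):
    """Return a record with a UTC ISO 8601 timestamp and a duration in ms.

    All field names here are illustrative assumptions, not a real schema.
    """
    if unit not in _TO_MILLISECONDS:
        raise ValueError(f"unknown duration unit: {unit!r}")
    # Always attach an explicit UTC offset so layers cannot disagree on zone.
    timestamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "source": source,
        "timestamp": timestamp.isoformat(),
        "duration_ms": duration * _TO_MILLISECONDS[unit],
    }


if __name__ == "__main__":
    # A pod reporting seconds and a container reporting milliseconds
    # end up directly comparable after normalization.
    print(normalize_record("pod/web-1", 0, 1.5, "s"))
    print(normalize_record("container/web", 0, 250, "ms"))
```

Converting at collection time, rather than at query time, means nobody has to remember which layer reports in which unit while debugging.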

One best practice for accomplishing these to-dos in a simple, straightforward manner is to monitor the containers in your Kubernetes environment without touching your application containers. Do this by introducing a DaemonSet, or alternatively a sidecar, into your Kubernetes environment(s) that sits alongside your containerized services and includes your logging and metrics collection agent. Deploying this way will ensure consistent data collection, minimize the changes required to your application containers, and most importantly, eliminate the possibility of selective blindness in your production environment.
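
To make the DaemonSet idea concrete, here is a minimal sketch that builds a DaemonSet manifest as a Python dictionary and prints it as JSON. The agent name, the image `registry.example.com/log-agent:1.0`, and the choice to mount the node's `/var/log` directory read-only are all assumptions made for illustration. A real agent's published manifest will likely need more mounts, credentials, and resource limits.

```python
import json


def build_agent_daemonset(name, image, namespace="kube-system"):
    """Build a DaemonSet manifest that runs one agent pod on every node.

    The name, image, and mounted path are placeholders for this sketch.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Read node-level logs without touching app containers.
                            "volumeMounts": [
                                {
                                    "name": "varlog",
                                    "mountPath": "/var/log",
                                    "readOnly": True,
                                }
                            ],
                        }
                    ],
                    "volumes": [
                        {"name": "varlog", "hostPath": {"path": "/var/log"}}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = build_agent_daemonset(
        "log-agent", "registry.example.com/log-agent:1.0"  # hypothetical image
    )
    print(json.dumps(manifest, indent=2))
```

Because the agent runs in its own pod on each node, the application images stay unchanged, which is the point of the approach described above. The printed JSON could be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON as well as YAML.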

A few ways to implement this include:

  • Introduce a DaemonSet with the Fluentd logging agent (this will give you logging but not metrics). If you already have an ELK cluster configured, this is probably the option for you. Learn more here.
  • Introduce a DaemonSet or sidecar with the Prometheus metrics agent (CoreOS has done an excellent job of integrating Prometheus and Kubernetes). Running Prometheus on your Kubernetes cluster will give you metrics instrumentation, querying, and alerting. Learn more here.
  • A variety of metrics and performance monitoring tools, including Heapster, DataDog, cAdvisor, New Relic, Weave/VMware, and several others, also offer DaemonSet or sidecar options for Kubernetes monitoring.
  • Scalyr, log management for the DevOps front line, has a preconfigured DaemonSet containing the open source Scalyr agent available for download and use. The Scalyr DaemonSet natively supports both Kubernetes logging and metrics. You can download the YAML file for deploying the containerized Scalyr agent from GitHub here. Note that you also can download the full open-source Scalyr agent from GitHub here.

 

Network Traffic Monitoring: The 7 Best Tools Available To You

Sales, pre-sales, human resources, the company cafeteria: they’re all online. If the network is down, employees are angry and customers have gone elsewhere. That’s why network traffic monitoring is a critical part of maintaining a healthy enterprise.

Fixing network problems when they happen isn’t good enough. IT managers have to proactively watch systems and head off potential issues before they occur. This means observing network traffic and measuring utilization, availability, and performance.

A useful monitoring tool offers these features:

  • real-time network monitoring
  • an ability to detect outages in real time
  • a mechanism for sending alerts
  • integrations for network hardware, such as SNMP and NetFlow monitoring
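
As a rough sketch of what measuring utilization and sending alerts can look like, the example below computes average link utilization from two readings of an interface octet counter (such as the 32-bit counters exposed over SNMP) and flags a sustained breach. The function names, the threshold, and the "three consecutive samples" rule are assumptions for illustration, not the behavior of any tool in the list.

```python
COUNTER32_MODULUS = 2 ** 32  # 32-bit SNMP counters wrap around at this value


def octet_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets transferred between two counter readings, allowing one wrap."""
    return (current - previous) % modulus


def utilization_percent(previous, current, interval_seconds, link_bits_per_second):
    """Average link utilization over the polling interval, as a percentage."""
    if interval_seconds <= 0 or link_bits_per_second <= 0:
        raise ValueError("interval and link speed must be positive")
    bits = octet_delta(previous, current) * 8
    return 100.0 * bits / (interval_seconds * link_bits_per_second)


def should_alert(samples, threshold_percent, consecutive=3):
    """True when the last `consecutive` samples all exceed the threshold."""
    recent = samples[-consecutive:]
    return len(recent) == consecutive and all(
        sample > threshold_percent for sample in recent
    )


if __name__ == "__main__":
    # 750,000 octets in 60 seconds on a 1 Mbit/s link.
    print(utilization_percent(1000, 751000, 60, 1_000_000))
    print(should_alert([50.0, 91.0, 95.0, 99.0], threshold_percent=90.0))
```

Requiring several consecutive samples above the threshold is one simple way to avoid alerting on a single short spike.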

This is a list of the best tools available for monitoring your network traffic. Several of them are sold as SaaS, others for running on-premises, and a couple are open-source with optional commercial versions. All of these tools offer more than just network monitoring. They also offer varying degrees of application, system, and web monitoring too.


Scalyr in the dark

Sometimes you want your dashboards to be dark.

And by dark, we mean dark backgrounds with light text and discernible colors. For many people who watch dashboards, looking for alerts and concerns, a darkened screen is more restful. And while research may have concluded that dark letters on a white background are easier to process, it also points out that white backgrounds in a darkened environment may disrupt low-light vision adjustment.

Scalyr dashboards are white, with dark text and colors in use to indicate the desired metrics. These dashboards are easy to read and understand, and they clearly detail the tracked data.

Scalyr dashboard for Linux Processes with white background

However, it is reasonably easy to change this view by making use of the accessibility features in Chrome and Firefox. This reversed-colors hack is not perfect, as it affects every session in the respective browser, but it is quite suitable for a long-lived view or a stable operations center.

Scalyr dashboard for Linux processes with black background

This hack also does not use the system-wide accessibility features, so only the browser is affected. In short, the rest of your applications and your desktop background are viewable as usual, but the view in your browser has changed. Just be aware that other activity in your tabs may have rather wild results.

Chrome offers several accessibility extensions that allow some control over the look of a page presented in the browser. These extensions are available in the Chrome web store and are easily found via the Preferences (or Settings) page. The extension allows easy toggling of visual impact.

Firefox has built-in preferences that allow you to create the dark scheme. While this is easier than Chrome, requiring no extension installation, it does not provide an easy way to toggle the scheme off and on. It does have additional personalization features and does not change the data representation colors. (Note: an add-on that allowed toggling exists, but it is not supported in Firefox Quantum at this time.)

A similar capability exists on Safari, Windows Internet Explorer, and Edge but requires a CSS stylesheet insertion. I will cover those in a later blog.

The following directions are from my MacBook; however, the same workflow exists for all Firefox or Chrome browsers regardless of operating system. If you are having difficulty with your version, drop me a comment and I will try to help out.

Come to the Dashboard Dark Side. 

Step by step for Chrome:
  1. Open your Chrome browser.
  2. Click on the menu button.
  3. Choose the Settings or Preferences menu item (alternatively, you could enter “chrome://settings” directly into the address bar).
  4. Scroll down the page and select Advanced.
  5. Scroll down to Accessibility and click on Add Accessibility Features. This click will open a new tab (or window).
  6. In the Chrome store, find High Contrast.
  7. Click on Add to Chrome. This click will open a pop-up window. 
  8. Click Add Extension.
    A small icon will appear in Chrome in the upper right corner.
  9. Now the fun stuff. Click on the icon and Enable the extension. There are some cases where installing the extension automatically enables it. If so, just click the icon to bring up the selections.
  10. Now that the extension is enabled, click on Inverted Colors.
  11. To toggle back and forth, either make use of the keyboard accelerators or click the appropriate choice in the extension.
Step by step for Firefox:
  1. Open the Firefox browser.
  2. Click the Menu button.
  3. Select Preferences or Settings (alternatively, you could enter “about:preferences” directly into the address bar).
  4. Scroll to Fonts & Colors (under Language and Appearance).
  5. Click on Colors.
  6. In the Colors menu, change the associated colors.
    1. Text to White.
    2. Background to Black.
    3. Unvisited Links to Yellow (Optional).
    4. Visited Links to Light Blue (Optional).
      Please note that in Firefox, you can use whatever colors you like. The above choices are offered as a starting point.
  7. In the Override the Colors box select Always.
  8. Click OK.

Reversing this choice requires entering preferences again and setting the Override the Colors choice to Never. As noted, the extension that offered a toggle button is not supported in Firefox Quantum.

So there you have it, a quick and dirty approach to getting your dashboards in the dark.

However, note that these changes will impact all tabs and windows of the browser, not just the dashboard views. For that reason, you may want to limit this use to a long-lived display or make use of the toggle capability of Chrome.
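
To see why an inverted-colors mode also changes chart colors, while a text and background override does not, consider the simplest form of inversion: replacing each color channel value v with 255 - v. The sketch below assumes that simple per-channel model. How the Chrome extension actually computes its inverted colors is not covered here, so treat this as an illustration only.

```python
def invert_hex_color(color):
    """Invert a '#rrggbb' color by replacing each channel v with 255 - v."""
    if len(color) != 7 or not color.startswith("#"):
        raise ValueError(f"expected '#rrggbb', got {color!r}")
    # Parse the red, green, and blue channels from their hex pairs.
    channels = [int(color[i:i + 2], 16) for i in (1, 3, 5)]
    return "#" + "".join(f"{255 - channel:02x}" for channel in channels)


if __name__ == "__main__":
    print(invert_hex_color("#ffffff"))  # white background
    print(invert_hex_color("#ff0000"))  # a red series in a chart
```

Under this model a white background becomes black, which is the goal, but a red series becomes cyan, so anyone used to reading a dashboard by color should expect the palette to shift.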

To learn more about accessibility in Chrome, check out Use Chrome with accessibility extensions – Google Chrome Help. For information about accessibility in Firefox, take a look at Accessibility features in Firefox. And to learn more about dashboards in Scalyr, check out how log analysis can lead you to needed actions.

If you try it out, tell me about your experiences in the comments.

 

Get Started Quickly With JavaScript Logging

Let’s continue our series on getting started with logging. We’ve already covered C#, Java, Python, Ruby, and Node.js. This time, we’re going to look at JavaScript logging. If you’re wondering how this differs from the Node.js article, this one will look at pure client-side JavaScript logging. Once again, we’ll get into:

  • Logging in a very basic way
  • What to log
  • Why you should log on the client side
  • How to log using a client-side JavaScript logging framework


How are you doing “observability”?

In today’s world of complex code and deployment, people on the DevOps front line face challenges in monitoring, alerting, tracing distributed systems and log aggregation/analytics. Jointly, these are often called “observability” (reference: Twitter blog).


These are challenges faced by multiple groups like DevOps, core engineering teams and Web 2.0 developers. We see concerns in web applications and traditional enterprise applications. We foresee even more issues in emerging spaces like IoT, event-driven design and microservices.

And as is usual with most complexity-bound problems, there are a lot of ways to solve these challenges. These include the use of discrete tools and procedural methods. Such approaches often cause gaps and leave edges uncovered. After all, the matrix of product types and use cases is large and growing, and both the need and the scale are increasing.

So, what do you think about observability? Scalyr is hosting a short survey to find out what the current state of observability is. The survey looks at several areas, including tools and related issues, and we invite you to chime in with your thoughts. While the survey is the best place to express your views, feel free to leave us a comment on this topic.

Please take the survey (at https://bit.ly/2JKezDb), and you can opt-in to get a copy of the results.

Scalyr Platform: Kubernetes Monitoring, Performance, and Usability

Our Scalyr platform releases over the past month have focused on Kubernetes monitoring, query performance, and making improvements to usability.

Kubernetes Monitoring

We have added Kubernetes monitoring to our agent. We recommend running it as a DaemonSet on your cluster for efficiency and minimal disruption. Find the new Scalyr agent on GitHub, and don’t forget to download our Kubernetes monitoring best practices document.

Query Performance Hits New Benchmark – 1.5 TB/second

We have continued to optimize for performance, leading to a new throughput query performance benchmark. Our streamlined database architecture, combined with the brute force technique of applying every core in our cluster to every user query, helped us surpass the 1.5 TB/second benchmark – up from 1 TB/second late last year. Last month, we made a number of improvements, including how we load data from disk, manage concurrent queries, and map data to RAM cache pools.

User and Group APIs

We have added APIs to manage granular user and group permissions. These include adding, listing, editing, revoking access, and providing permissions to users, groups, and users within groups. Learn more in our API documentation.

Billing and Usage Page

We made a number of usability improvements last month, the most notable of which is our revamped billing and usage page, providing at-a-glance information for cost management. Learn more in our Billing and Usage page at the top right dropdown (company@scalyr.com > Billing Plan).

Going Forward

We are developing Scalyr with the DevOps front line in mind, and with a focus on our three value pillars – fast, simple, and shareable. The next several releases will focus on the simple part of that equation and include such improvements as making export to Amazon S3 buckets easy and revamping our alerting capability.

Feedback

Your product (or any) feedback is always welcome. Please reach out to us at support@scalyr.com.

The Build vs Buy Decision Tree

Chocolate or vanilla? Pancakes or waffles? Coke or Pepsi? We decide between similar choices every day. Sometimes we have preferences, and other times it’s just a feeling in the moment. A common decision in the IT world is the “build vs buy” decision. Sometimes this decision is not so cut and dried.

Can the decision to build or buy paralyze us with fear? Certainly. Do some have preferences? Definitely. However, all is not lost. There can be a logical system to decide whether to build or buy when it comes to software.

 


 


Understanding the Apache Error Log in Detail

Today features another post about the nuts and bolts of logging. This time, I’ll be talking about the Apache error log in some detail.

Originally, I had a different plan and outline for this post. But then I started googling for good reference material. And what I found were way more questions than answers about the topic, many on various Stack Exchange sites. It seems that information about the Apache error log is so scarce that people can’t agree on where to ask questions, let alone get answers to them.

So let’s change that. I’m going to phrase this as a Q&A-based outline, hopefully answering all of the questions you might have come looking for, if you googled the term, while also providing a broad narrative for regular readers.


DevOps Security Means Moving Fast, Securely

In this world of lightning-fast development cycles, MVPs, and DevOps, it may intuitively feel like security gets left behind. You might be thinking, “Aren’t the security guys the ones who want to stop everything and look at our code to tell us how broken it is right before we try to deliver it?” Many feel that DevOps security is a pipe dream.

Is it possible to be fast and secure? Lately, I’ve been drooling over a sports car—namely, the Alfa Romeo Giulia Quadrifoglio. Long name, fast car. It holds some impressive racing records and sports 505 horsepower but also is a Motor Trend Car of the Year and an IIHS Top Safety Pick. These awards are due to automatic braking technology, forward-collision warning, lane-keeping assistance, blind-spot monitoring, and rear cross-traffic alert. It is possible to be fast and safe.

The key to DevOps security is to move forward with development. Security teams need to understand why DevOps practices are so effective and learn to adopt them.
