Introducing deeper support for Kubernetes, collaboration features, and stack tracing for easier troubleshooting

As the move to the cloud and containers continues to make software delivery faster and easier, environments are getting more complex. At Scalyr, we believe that observability solutions need to help engineering and operations teams get access to all the data they need, fast. Along those lines, we are announcing new features to help teams support the latest container and orchestration technologies, improve collaboration, and streamline workflows for faster and easier issue identification and resolution.

Kubernetes Cluster-Level Logging

Our new Kubernetes cluster-level logging enables engineering teams to effectively monitor and troubleshoot Kubernetes environments by centralizing and visualizing logs by deployment. Regardless of source, Scalyr intelligently ingests, parses and organizes logs to give developers an application-level view for faster issue resolution in complex container environments.

As shown in the screenshot above, users are presented with log summaries by deployment rather than by each individual container. Automatically grouping the logs that belong to a deployment gives developers a more holistic view of each application or service, which may be running across multiple containers and pods. Insight into individual pods and nodes is still available, but that level of detail is abstracted away by default so developers can focus on their application right away.
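
This deployment-level grouping leans on Kubernetes' own metadata: every pod created by a Deployment carries the labels defined in its pod template, so a log collector that records those labels can roll container logs up to the owning deployment. Below is a minimal sketch of that metadata; the name, label value, and image are hypothetical placeholders, and this is a generic Kubernetes example rather than Scalyr-specific configuration.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: checkout                     # hypothetical deployment name
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: checkout
      template:
        metadata:
          labels:
            app: checkout                # every pod in this deployment carries this label,
                                         # so its container logs can be grouped under "checkout"
        spec:
          containers:
            - name: checkout
              image: example.com/checkout:1.4   # hypothetical image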


Five To-Dos When Monitoring Your Kubernetes Environment

If you’re on the DevOps front line, Kubernetes is fast becoming an essential element of your production cloud environment. Since container orchestration is critical to deploying, scaling, and managing your containerized applications, monitoring Kubernetes needs to be a big part of your monitoring strategy.

Container environments don't operate like traditional ones, so if you are already monitoring your applications and infrastructure, you need to be thoughtful about how you monitor the container environment in which they run. Here are five best practices to inform your strategy:

  1. Centralize your logs and metrics. Orchestrating your containerized services and workloads through Kubernetes brings order to the chaos, but remember that your environment is still decentralized. You will give yourself a fighting chance if you centralize your logs and metrics.
  2. Account for ephemeral containers. The beauty of container orchestration is that it's easy to start, stop, kill, and clean up your containers in short order. However, monitoring them may not be so easy. You still need to debug problems and monitor cluster activity, even when services are coming and going. The trick is to grab the logs and metrics before they're gone. If you don't, your metrics will look more like the graph on the left than the one on the right.
    (Figure: log file examples for transient containers)
  3. Simplify, simplify, simplify. With all of the moving pieces in your container environment (services, APIs, containers, orchestration tool), you need to monitor without introducing unneeded complexity. Rather than bloating your containers with various monitoring agents, each requiring updates on its own schedule, abstract your monitoring and management tools from what you're monitoring and managing. This will also help your engineers focus on building and delivering software, not operating the delivery platform.
  4. Monitor each layer explicitly. You will need to collect logs and monitor for errors, failures, and performance issues at each layer of your environment – the pod, the container, and the controller manager. For example, you'll need to be able to troubleshoot pod issues, ensure the container is working (a minimal probe sketch follows this list), and collect runtime metrics from the controller manager.
  5. Ensure data consistency across layers. For fast, accurate debugging, you need consistent data across all the layers in your container environment. Accurate timestamps, consistent units of measurement (milliseconds vs. seconds, for example), and a common set of metrics and logs collected across applications and components will help you troubleshoot and debug quickly and accurately at every layer.
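
For the "ensure the container is working" part of to-do #4, Kubernetes' built-in liveness and readiness probes are a natural starting point: probe failures surface as pod events and restarts that your monitoring can alert on. Below is a minimal sketch; the image, port, and endpoint paths are hypothetical placeholders for whatever health checks your service exposes.

    apiVersion: v1
    kind: Pod
    metadata:
      name: checkout
    spec:
      containers:
        - name: checkout
          image: example.com/checkout:1.4   # hypothetical image
          ports:
            - containerPort: 8080
          livenessProbe:                    # restart the container if this check keeps failing
            httpGet:
              path: /healthz                # hypothetical health endpoint
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:                   # hold traffic until this check passes
            httpGet:
              path: /ready                  # hypothetical readiness endpoint
              port: 8080
            periodSeconds: 5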

One best practice for accomplishing these to-dos in a simple, straightforward manner is to monitor the containers in your Kubernetes environment without touching your application containers. Do this by introducing into your Kubernetes environment(s) a DaemonSet, or alternatively a sidecar, that sits alongside your containerized services and includes your logging and metrics collection agent. Deploying this way ensures consistent data collection, minimizes the changes required to your application containers, and, most importantly, eliminates the possibility of selective blindness in your production environment.
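
Here is a minimal sketch of the DaemonSet approach. It schedules one collection-agent pod on every node and mounts the node's container log directories, so application containers need no changes. The namespace, agent image, and secret name are hypothetical placeholders for whichever logging agent you run.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: log-agent
      namespace: monitoring                     # hypothetical namespace
    spec:
      selector:
        matchLabels:
          app: log-agent
      template:
        metadata:
          labels:
            app: log-agent
        spec:
          containers:
            - name: log-agent
              image: example.com/log-agent:1.0  # hypothetical agent image
              env:
                - name: API_KEY                 # hypothetical credential for your logging backend
                  valueFrom:
                    secretKeyRef:
                      name: log-agent-secret
                      key: api-key
              volumeMounts:
                - name: varlog                  # node-level container logs live under /var/log/containers
                  mountPath: /var/log
                - name: dockerlogs
                  mountPath: /var/lib/docker/containers
                  readOnly: true
          volumes:
            - name: varlog
              hostPath:
                path: /var/log
            - name: dockerlogs
              hostPath:
                path: /var/lib/docker/containers

Because it is a DaemonSet, the scheduler places exactly one agent pod on every node, including nodes added later, which keeps collection consistent as the cluster scales up and down.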

A few ways to implement this include:

  • Introduce a DaemonSet with the Fluentd logging agent (this will give you logging but not metrics). If you already have an ELK cluster configured, this is probably the option for you. Learn more here.
  • Introduce a DaemonSet or sidecar with the Prometheus metrics agent (CoreOS has done an excellent job of integrating Prometheus and Kubernetes). Running Prometheus on your Kubernetes cluster will give you metrics instrumentation, querying, and alerting; a minimal scrape configuration is sketched after this list. Learn more here.
  • A variety of metrics and performance monitoring tools, including Heapster, DataDog, cAdvisor, New Relic, Weave/VMware, and several others, also offer DaemonSet or sidecar options for Kubernetes monitoring.
  • Scalyr, log management for the DevOps front line, has a preconfigured DaemonSet containing the open-source Scalyr agent available for download and use. The Scalyr DaemonSet natively supports both Kubernetes logging and metrics. You can download the YAML file for deploying the containerized Scalyr agent from GitHub here. Note that you can also download the full open-source Scalyr agent from GitHub here.
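
To give a taste of the Prometheus option mentioned above, here is a minimal sketch of a scrape configuration that discovers pods through the Kubernetes API. The job name is arbitrary, and filtering on the prometheus.io/scrape annotation is a common convention rather than a requirement.

    # prometheus.yml fragment: discover and scrape pods via the Kubernetes API
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod                   # enumerate pods from the API server
        relabel_configs:
          # keep only pods annotated prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true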