Zalando Engineering Team Standardizes on Scalyr for Log Management   

Overview 

Zalando, Europe’s leading online fashion platform, made the transition to the cloud two years ago. As part of the move to AWS, they were looking for a log management tool that was flexible enough to fit their agile engineering culture, powerful enough to scale, and fast enough to allow them to investigate incidents. After evaluating several solutions, they standardized on Scalyr as their log management solution across their entire engineering team.

About Zalando

Zalando is Europe’s leading online fashion platform for women, men and children. They offer their customers a one-stop, convenient shopping experience with an extensive selection of fashion articles including shoes, apparel, and accessories, with free delivery and returns. Their assortment of almost 2,000 international brands ranges from popular global brands to fast fashion and local brands, and is complemented by their private label products. Their localized offering addresses the distinct preferences of their customers in each of the 15 European markets they serve: Austria, Belgium, Denmark, Finland, France, Germany, Italy, Luxembourg, the Netherlands, Norway, Spain, Sweden, Switzerland, Poland and the United Kingdom.

Customer Challenges

Zalando transitioned to the cloud two years ago. They went from a monolith code base to microservices in the cloud, which changed their log management needs. They evaluated Scalyr along with three other solutions.

Their evaluation criteria were:

  • An agent that can collect all the logs on every service
  • A UI where engineers can search logs
  • The ability to search specific applications
  • The ability to see every single log in the UI
  • The ability to scale
  • A fit with the engineering culture of Radical Agility

After evaluating the four solutions, they narrowed it down to two to let the teams decide. They liked that with Scalyr it was easy to implement the agent and roll it out onto EC2 instances. They were able to define custom parsers for log lines.

The engineering culture at Zalando is built on Radical Agility. In order to empower their teams with autonomy, they need to automate everything around how they provision machines. This includes giving people the tools they need to do everything in a compliant way in their accounts. They found that the custom parsers were particularly important in giving each team flexibility to do things in their own way, which is a key pillar of the success of the engineering team.

Results of Using Scalyr

Scalyr is now deployed across the entire engineering team at Zalando. The main ways the team uses Scalyr are:

  • Incident response and mitigation
  • Analysis of what’s happening on the service
  • Metrics for monitoring
  • Proactive investigations

They were able to get Scalyr up and running very fast. Once it was set up, their teams had access to their logs. They didn’t need to configure the agent and were able to instantly see their logs.

Given the number of autonomous services Zalando runs, they needed a coherent solution for how to get to the logs.

When asked how Scalyr has helped them, Tim Kröger, Head of Engineering – Visibility, and Andreas Pfeiffer, Cloud and Network Architect, responded that “it feels like asking how breathing helped you with your life.”

Before Scalyr, when an application crashed, the developer had to go to the log server, grab all the logs and find the host where the app was running. This would take at least 10 minutes. With Scalyr, developers can deploy an application and, when an error is reported, log into Scalyr, enter the app ID and immediately see all the logs from the deployment. They went from 10 minutes of work to 13 seconds (which includes logging into Scalyr!).

Overall, Scalyr has helped Zalando make the transition to the cloud and mitigated the risk of increasing errors while moving to AWS.

Wistia’s Engineering and Customer Support Teams Solve Customer Issues Faster With Scalyr

I recently caught up with Ryan Artecona, Infrastructure Tech Lead at Wistia, to learn more about how their engineering and customer support teams use Scalyr. 

Last summer, the engineering team at Wistia took a step back from their day-to-day work to evaluate the tools and infrastructure they were using. The team realized they needed a log management solution. After evaluating several products, they decided to move forward with Scalyr. Scalyr gave the engineering team more visibility into operations. They were able to identify issues faster and, as a result, have fewer disgruntled customers.

About Wistia

Wistia is a professional video hosting and analytics platform designed to help businesses communicate more creatively. Founded in Cambridge, Massachusetts in 2006, Wistia offers businesses the resources to host, organize, customize and measure the impact of video. In addition to video hosting, analytics, and marketing tools, Wistia has a library of educational resources to help you learn the ins-and-outs of creating great video content.

Customer Challenges

Last summer, the engineering team at Wistia took a step back from their normal feature releases to assess their infrastructure. At the time, they had no log aggregation or observability tools beyond New Relic. They had log files, but the only way engineers could use them was to ssh into a specific server and hope the log they were looking for was there. They hadn’t used a log aggregator, but they knew such tools existed and wanted to find the right one.

The team evaluated Scalyr along with Splunk, SumoLogic, PaperTrail, and LogDNA. The criteria for their evaluation included:

  • Support for live streaming
  • Speed of queries
  • Ability to add indexes to logs in a flexible way
  • Reasonably priced solution

After evaluating all the products, the team decided on Scalyr because it matched their evaluation criteria the best. Some of the alternatives didn’t support live streaming, some were too slow and inflexible, and others were too expensive. In the evaluation, the team found Scalyr was easy to use, had powerful capabilities, was fast and was the most competitively priced.

Results of Using Scalyr

It took about a week to get up and running on Scalyr. Now, the product is used pervasively across the engineering team. The product is simple to start with and can get more complex as needed, which helps everyone on the team get value out of it. For example, when the team writes a feature and wants to log a few events, whatever they put into the code is what goes into Scalyr. They aren’t required to create an opaque translation layer. If they want to treat a log entry as a plain string, they can. If they want to do something fancier or more complicated, they can format it so that the parser picks it up.

According to Ryan Artecona, Infrastructure Engineering Lead at Wistia,

“it is easier and faster to diagnose the bugs that frustrate our customers as a result of using Scalyr.”

Beyond the engineering team, the technical support team (Support Engineers) at Wistia uses Scalyr to help track down the root causes of problems that customers encounter. They use the search functionality in Scalyr extensively to pinpoint the exact requests in the logs that correspond to the problems customers encounter in the browser or when using their APIs. Scalyr has made the customer support team more self-sufficient in resolving customers’ problems, and when they do need to escalate problems to the engineering team, they’re able to give them a great head start on fixing the issue. The handoff from support to engineering is much smoother as a result of using Scalyr.

The engineering team is able to respond to incidents much faster. Rolling out Scalyr has helped the engineering team commit to security best practices, including removing root access to all production servers. Now engineers can write code that is observable from the outside and more conducive to debugging. Using Scalyr has been the team’s first big step toward making debugging software in production more collaborative, something they didn’t have before. Individuals are more empowered to own their code by using Scalyr. Before Scalyr, if things became too time-consuming, issues would just go undiagnosed and be marked as “too hard” to figure out.

Flyclops Accelerates Development by Using Scalyr

We are thrilled to have Flyclops as a customer at Scalyr. I recently chatted with co-owner Dave Martorana to learn more about how they chose Scalyr and how it is helping the engineering team be more effective.

Flyclops was evaluating log management tools when they came across Scalyr in a newsletter. After signing up for a trial, the team was blown away by the speed of the product. Searches went from minutes to seconds, which allowed them to save valuable development time. Using Scalyr has improved their ability to provide support and rapidly resolve issues, which has led to increased player happiness and let the team sleep at night knowing that they had the right tool in place to help them monitor activity on their servers.

About Flyclops

Flyclops is an independent mobile games studio located in Philadelphia, PA, specializing in casual multi-player games, both asynchronous turn-based and real-time. Flyclops’s games have been played by millions across the globe.

Evaluation Process

When Flyclops was looking for a logging solution, they evaluated several tools. Other tools they were experimenting with made searching logs a painful task. Flyclops had a few things they were looking for in a log management tool, including:

  • Ease of logging
  • Speed of collection
  • Ability to pull metrics out of log data
  • Ability to parse custom log formats
  • Ability to diagnose issues quickly

Dave Martorana, co-owner of Flyclops, discovered Scalyr via a Google Go newsletter as a featured log management tool. Given that a lot of the Flyclops backend was written in Go, they decided to give it a try.

Results of using Scalyr

When the Flyclops team started using Scalyr, they immediately took notice of the speed and performance of the tool. Searches went from tens of seconds and minutes in other tools to almost instantaneous with Scalyr. 

According to Dave,

“Scalyr has been the single best tool I’ve added to our stack in years”.

Flyclops has 500,000 unique players per month. By using Scalyr, they are able to save significant time investigating issues, which gives more time for development. Scalyr allowed them to diagnose most problems substantially faster than with other tools they had tried. They were able to replace whole suites of monitoring tools with something that can answer questions they don’t know they’re going to have in the future.

The team liked that they had the ability to write their own parsers and didn’t have to conform to a certain pattern when writing data to logs. On the client side of things, they’ve gone through a number of third-party crash reporting tools. They started logging client exceptions and wrote some custom parsers in order to parse through the stack traces and proactively look at what’s unique to their products. They were able to turn Scalyr into the best stack analysis tool they had used.
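To make the idea of parsing stack traces concrete, here is a minimal sketch in plain Python. It is not Scalyr's parser syntax and not Flyclops's actual code; the frame format, the function names and the choice of keeping the top three frames are all assumptions made for illustration. It reduces each logged exception to a signature so that traces differing only in message or line number are counted together.

```python
import re
from collections import Counter

# Hypothetical frame format (an assumption): "  at pkg.Type.func (file.ext:123)"
FRAME_RE = re.compile(
    r"^\s*at\s+(?P<func>[\w.$<>]+)\s+\((?P<file>[^:()]+):(?P<line>\d+)\)\s*$"
)

def signature(trace: str, depth: int = 3) -> tuple:
    """Reduce a trace to its exception type plus the top `depth` frames.

    Line numbers and the exception message are dropped so that the same
    crash seen in different builds maps to the same signature.
    """
    lines = trace.strip().splitlines()
    exc_type = lines[0].split(":", 1)[0].strip()
    frames = []
    for line in lines[1:]:
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group("func"), match.group("file")))
    return (exc_type, tuple(frames[:depth]))

def count_unique(traces) -> Counter:
    """Count how often each distinct signature occurs."""
    return Counter(signature(t) for t in traces)
```

Calling `count_unique` on a batch of logged exceptions and sorting by count would surface the crashes that occur most often, which is one way to look at what is unique to a product.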

The team sleeps a lot better knowing that Scalyr is watching their servers. The ability to answer questions they didn’t anticipate allows them to be more proactive. They are able to define custom variables and query them. When launching a new feature with a staged rollout, they are able to use Scalyr to validate that they are rolling out at the speed they expected with just a little bit of graphing. All of this allows them to better support their players, and be sure their customers are having a high-quality experience at all times.
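As an illustration of validating a staged rollout from custom log fields, the sketch below uses plain Python rather than Scalyr's query language. The `key=value` log format and the field names `hour` and `new_feature` are hypothetical. It computes, per time bucket, the share of events that had the new feature enabled, which is the kind of number one would graph to check the rollout speed.

```python
from collections import defaultdict

def parse_fields(line: str) -> dict:
    """Parse 'key=value' tokens from a log line; other tokens are ignored."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields

def rollout_share(lines, flag: str = "new_feature", bucket: str = "hour") -> dict:
    """Return {bucket value: fraction of events where `flag` is 'true'}.

    Lines missing either the bucket field or the flag field are skipped.
    """
    totals = defaultdict(int)
    enabled = defaultdict(int)
    for line in lines:
        fields = parse_fields(line)
        if bucket not in fields or flag not in fields:
            continue
        totals[fields[bucket]] += 1
        if fields[flag] == "true":
            enabled[fields[bucket]] += 1
    return {b: enabled[b] / totals[b] for b in totals}
```

If the share per bucket rises in line with the planned rollout percentage, the feature is reaching players at the expected pace.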

Company Culture: Actions, not just words

I recently joined the marketing team at Scalyr. I left my previous role last fall and took some time off. After a bit of travel, I spent the last few months exploring what I wanted to do next. In my next opportunity, I wanted a product-first company and a culture that aligned with my values. Throughout my search, I met with several dozen people and companies, some in casual catch-ups and others in more formal interviews. I wanted to share some of the lessons I learned in my process that ultimately made me believe Scalyr was the right place for me.

CareerBuilder Resolves Customer Issues 5x Faster with Scalyr

We are excited to have CareerBuilder as a customer here at Scalyr. I recently sat down with Leon Chapman, Director of Cloud Operations at CareerBuilder, to learn more about their decision to use Scalyr and the impact the product is having on their teams and customer experience.

CareerBuilder chose Scalyr as their log management tool. After moving to the cloud two years ago, the team was looking to consolidate tools across their 250-person engineering organization. Because their product is customer-facing, being able to identify issues quickly helps CareerBuilder deliver a better customer experience for the millions of people who use their products and services each day.

In particular, CareerBuilder found that when comparing Scalyr to other products, Scalyr beat the competition in:

  • Speed
  • Performance
  • Ability to scale without needing to manage infrastructure
