Docker changed the way developers build software. It solved many issues, but bugs can still occur. When this happens, the first step in the debugging process is usually to read logs. However, when using Docker, this isn’t as straightforward as you may think. You can simply execute docker logs [container_id], but it’s not always possible to use this command, and it’s not an ideal solution for bigger applications. In this post, you’ll learn the pros and cons of the different logging options and what to consider when choosing a logging strategy in Docker.
What’s So Complicated About Docker Logging?
Docker provides a few different ways to approach logging. It has some so-called “logging drivers” built in, which allow you to customize logging for different purposes. At the same time, however, you need to choose which logging driver will be the best for you.
If you choose the wrong one, it can hurt your application's performance, or you may periodically lose some logs. This is why it's important to understand the purpose of each method and know which one suits you best. Additionally, when using Docker, we usually work with more than one container. And the more containers you need to manage, the more important it becomes to centralize log management.
Choosing Wisely
When it comes to optimizing logging for debugging, two configuration options play the biggest role: the logging driver and the delivery method. You can configure both. They are, in theory, independent of each other, but you should consider both when changing either one. We'll explain why later, but first, let's walk through what these two parameters do:
- Logging driver decides what to do with logs. They can be saved in a file in the container, saved on the host where a container is running, shipped to a locally running service such as a syslog server or Journald, forwarded to another component like Fluentd, or sent directly to a remote log management service.
- Delivery method determines whether the container waits for each log message to be delivered or buffers messages in memory and keeps running. When choosing a delivery method, you should also consider the chosen logging driver.
It’s best to think about both of these together because some logging drivers work better with one delivery method, and others work better with another. We’ll explain this in more detail in the next section.
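If you're not sure how your setup is currently configured, you can check both the daemon default and a specific container with the standard Docker CLI formatting flags. The first command prints the daemon's default logging driver; the second prints the driver a given container uses:

docker info --format '{{.LoggingDriver}}'
docker inspect --format '{{.HostConfig.LogConfig.Type}}' [container_id]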
Tip 1: Choose the Right Delivery Method
No matter which logging driver you end up with, you can configure it in blocking or non-blocking delivery mode. The delivery method defines what’s more important: delivering logs or running the application.
Blocking
Docker defaults to the blocking delivery mode. Blocking delivery guarantees all log messages are saved to the destination because every time a container needs to write a log entry, Docker blocks the application's execution until the log delivery is completed. In most cases, it's not as bad as it sounds; you likely won't notice any interruptions. But in some high-throughput cases, the interruption may be noticeable.
Non-Blocking
As you may have guessed, non-blocking delivery works in the opposite way. This delivery method won’t block the application from executing to deliver logs. Instead of waiting until logs are saved to their destination, the container will write logs into a buffer in the container’s memory. From there, logs will be delivered to wherever they need to be saved only when a logging driver is ready to process them. Though the non-blocking delivery method sounds like a better approach, it also creates a risk of some log entries being lost. The memory buffer the logs are stored in has a limited capacity, so it can fill up. Also, if a container crashes, logs might simply get lost before being released from the buffer.
Which One to Choose
Choosing between these two delivery methods depends on what’s more important to you: performance or reliability.
For most use cases, the blocking mode is recommended, unless your logging driver sends logs over the network to a remote server. A network can be slow or even periodically unavailable, which can cause issues for the application, since it would be blocked until the logs are saved.
On the other hand, logging drivers designed to save logs to local files are usually fast. Therefore, in this use case, it’s safe to use the blocking mode.
If your application produces many logs or is sensitive to disruptions (or if for any other reason performance is the top priority), then you should use non-blocking mode. You just need to make sure the in-memory buffer size is adequate for your needs to avoid filling up the buffer.
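Docker exposes the buffer size through the max-buffer-size log option (it defaults to 1 MB if unset). For example, the following runs a container in non-blocking mode with a 4 MB buffer:

docker run --log-opt mode=non-blocking --log-opt max-buffer-size=4m busybox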
How to Set Up Your Delivery Method
You can set up the desired delivery method per container by passing the mode option via the --log-opt parameter when running a container. Here's an example of this:
docker run --log-opt mode=non-blocking busybox
You can also change the global Docker delivery method by adding the following to the daemon.json file:
{
  "log-opts": {
    "mode": "non-blocking"
  }
}
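Keep in mind that changes to daemon.json require a restart of the Docker daemon and only apply to containers created afterward; existing containers keep the logging configuration they were started with. On a systemd-based host, for example:

sudo systemctl restart docker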
Tip 2: Choose the Right Logging Driver
By default, Docker uses the json-file driver, which simply writes logs in JSON format to a local file. It’s fast and efficient, and it works well in most cases. In small environments, it’s best to either keep the default json-file driver or use the syslog or journald driver. Using the docker logs command is only possible with local, json-file, and journald drivers (unless you’re using Docker Enterprise, in which case the docker logs command can be used for any logging driver).
On a bigger scale, however, it's more efficient to centralize logs. The more containers you have, the more difficult it becomes to find the root cause of a problem, and centralizing logs makes this much easier. Some logging drivers, such as fluentd, awslogs, gcplogs, and gelf, can do this for you by gathering logs from all containers and shipping them to a common destination.
You can set the logging driver by passing the --log-driver parameter to the docker run command or by setting the log-driver key in the daemon.json file, similar to what we saw with the delivery method.
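For example, to ship a single container's logs to a Fluentd instance (assuming one is listening on localhost:24224), you could run:

docker run --log-driver fluentd --log-opt fluentd-address=localhost:24224 busybox

And to make it the default for all new containers, set the log-driver key in daemon.json:

{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "localhost:24224"
  }
}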
Tip 3: Assign Useful Tags and Labels
If you decide to centralize logs from all containers, it’s important to assign useful tags and labels to log messages. Otherwise, you’ll end up with a difficult-to-understand stream of messages. Tags and labels can help you easily find and filter the messages you’re looking for. It’s wise to tag logs with information like the environment and zone. Using well-defined tags can help you answer questions like “Are there any errors from my back-end services in a production environment?”
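For drivers that support it (such as fluentd and gelf), the tag log option accepts Go template markup, and the labels option forwards selected container labels along with each message. As an illustration, building on the Fluentd example above, the env and zone label values here are hypothetical:

docker run --log-driver fluentd --log-opt fluentd-address=localhost:24224 \
  --label env=production --label zone=eu-west \
  --log-opt tag="{{.ImageName}}/{{.Name}}" \
  --log-opt labels=env,zone \
  busybox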
Tip 4: Secure Log Shipping
Another important thing to remember is to secure log shipping when using syslog or other remote destination drivers. Logs can contain sensitive data, so shipping them in plain text shouldn’t be an option. You should configure a logging driver to use a TLS/SSL connection to the logs’ destinations.
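For example, the syslog driver supports TLS through the tcp+tls address scheme together with certificate options such as syslog-tls-ca-cert. The hostname and certificate path below are placeholders:

docker run --log-driver syslog \
  --log-opt syslog-address=tcp+tls://logs.example.com:6514 \
  --log-opt syslog-tls-ca-cert=/etc/docker/certs/ca.pem \
  busybox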
Tip 5: Manage the Destination
When using a log driver other than json-file or local, you have to deal with other components or services in the log delivery pipeline. That could be syslog or journald on the host machine, or a remote destination such as Fluentd, Logstash, or one of the other remote destinations Docker supports.
No matter which one you choose, it's important to keep that component available, too. For example, if you choose Fluentd, you can install it on each host where Docker is running, on a separate dedicated machine, or on one host per X machines/clusters. However, this seemingly simple decision can affect your application.
If you run a single Fluentd instance for multiple servers, it can become overloaded. If you're also using the blocking delivery method, that can cause issues for your application. The same applies if you use logging drivers designed to ship logs directly to a remote server, such as Amazon CloudWatch Logs or Cloud Logging from Google Cloud. To deliver these logs, Docker must send them over the network. Therefore, it's crucial to either use non-blocking mode or to provision a dedicated network link that other services can't saturate.
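For example, when shipping logs straight to Amazon CloudWatch Logs via the awslogs driver, you could pair it with non-blocking mode so a slow network doesn't stall the application (the region and log group names here are placeholders, and the driver also needs valid AWS credentials):

docker run --log-driver awslogs \
  --log-opt mode=non-blocking \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=my-app-logs \
  busybox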
In other words, when you decide to use a logging driver, it’s important to not only change Docker’s configuration but to make sure the other side of the link is available.
How to Get the Most Out of Centralized Log Management
In this post, we’ve run through five tips for selecting the optimal logging strategy for Docker container logs. Unless you’re running a small stack, centralized log management is the best option. It’s easier to find the root cause of any issue when you can see sorted and aggregated messages from all containers in one place. But centralizing logs is only half the battle. If you don’t tag your containers properly, you’ll end up with a centralized mess.
Tools like SolarWinds® Papertrail™ can simplify debugging with fast search and features like a log velocity graph to visualize error volumes. On top of this, you don’t have to worry about scalability, security, and reliability. And if you’re used to livestreaming logs (e.g., via tail -f or docker logs --follow), you can still do this with the tail feature in Papertrail. Creating alerts is also possible, so you get a debugging tool capable of removing all the complexity involved with Docker logging.
This post was written by Dawid Ziolkowski. Dawid has 10 years of experience as a network/system engineer, has experience with DevOps, and has recently worked as a cloud-native engineer. He’s worked for an IT outsourcing company, a research institute, a telco, a hosting company, and a consultancy company, so he’s gathered a lot of knowledge from different perspectives. Nowadays, he’s helping companies move to cloud and/or redesign their infrastructure for a more cloud-native approach.