Last updated: February 5, 2024
After the introduction of Docker, the life of a developer became much easier. Containers solved many problems and offloaded the task of setting up all the necessary runtimes, libraries, and servers from developers. For small projects, that was enough. However, managing multiple containers for bigger applications becomes difficult. Fortunately, Kubernetes solves that problem, making complex deployments manageable. Using Kubernetes, however, introduces a few new problems of its own, such as memory usage issues. In this post, you will learn how to use logging efficiently to find and avoid common issues with misconfigured resource allocation in Kubernetes.
Why Kubernetes Is Great for Developers
The short answer? You have fewer things to worry about.
Kubernetes can:
- Make sure that all your containers are running.
- Reschedule containers if one of the nodes becomes saturated.
- Take care of deploying a new version of your containers with rolling updates.
As a developer, you won’t have to worry about doing all of that. You’ll be able to focus on the application itself. You can also instruct Kubernetes on how it should manage and distribute resources. This is a very useful feature since, usually, different parts of your application (different containers) have different resource needs.
Scalability and Flexibility
While saving the developer from managing multiple containers is an improvement on its own, there is another important advantage of using Kubernetes: scalability. With Kubernetes, it doesn’t really matter whether your application has fewer than ten containers or hundreds of them. Kubernetes can manage a cluster of five servers just as easily as a cluster of 500-plus servers. One Kubernetes cluster can even consist of different pools of machines in different places.
Built-In Load Balancing
As a developer, sometimes you need to implement extra logic in the code to make distributed applications bulletproof. Again, the bigger the application (i.e., the more containers it consists of), the more effort must be put into that extra coding.
But don’t worry; Kubernetes can help here, too. It has built-in load-balancing features. Not only can it automatically balance requests between a specified set of containers, but it can also take containers that can’t handle any more load or aren’t working properly out of rotation.
What Is Kubernetes Logging?
Kubernetes logging refers to capturing and managing logs generated by applications and services within a cluster. Papertrail™ gives you several efficient mechanisms for collecting and storing log data, providing flexibility in selecting and configuring various logging options and allowing you to customize your log management according to your specific needs.
For example, if you want to collect logs from the standard Docker streams, you can use Logspout, a lightweight log router, or you can collect all the logs within a cluster using Fluentd. You can also configure centralized logging using remote_syslog2.
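As one hedged example, a Logspout DaemonSet along these lines forwards the Docker log streams from every node to a syslog destination. It's only a sketch: it assumes your nodes run the Docker runtime, and the Papertrail host and port are placeholders you'd replace with your own log destination.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logspout
spec:
  selector:
    matchLabels:
      app: logspout
  template:
    metadata:
      labels:
        app: logspout
    spec:
      containers:
      - name: logspout
        image: gliderlabs/logspout:latest
        args: ["syslog+tls://logsN.papertrailapp.com:XXXXX"]   # placeholder destination
        volumeMounts:
        - name: docker-socket
          mountPath: /var/run/docker.sock
      volumes:
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
```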
Managing Resource Usage
Abstraction layers created by Kubernetes and all of its features are very helpful, but they also complicate troubleshooting, especially when it comes to resource management and allocation. Kubernetes is designed to distribute containers across multiple nodes in the most effective way possible. But to do it really well, it needs your help. It lets you specify how much of each resource a container needs to function properly and how much is too much. You can do that by setting resource requests and resource limits. While they are optional, it’s a best practice to set both. We will explain why later, but since it’s also easy to misconfigure them, we need to understand them first.
- Resource requests: a guaranteed amount of resources reserved for the container. If there is more CPU or RAM available on the host, a container can use more resources than specified in requests.
- Resource limits: the maximum amount of resources that the container is allowed to use. If a container tries to allocate more than its limits, Kubernetes will throttle it down or terminate it (see the example below).
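To make this concrete, here's a minimal sketch of a pod spec with both requests and limits set. The pod name, image, and values are only examples; adjust them to your own workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp            # example name
spec:
  containers:
  - name: webapp
    image: nginx:1.25     # example image
    resources:
      requests:
        memory: "512Mi"   # guaranteed amount reserved for the container
        cpu: "250m"       # a quarter of a CPU core
      limits:
        memory: "1Gi"     # allocating more than this gets the container killed
        cpu: "500m"       # CPU usage above this is throttled, not killed
```

You can apply a manifest like this with kubectl apply -f and verify the assigned values with kubectl describe pod.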
Identifying and Avoiding Common Resource Management Misconfigurations
Setting requests and limits may not sound like rocket science, but misconfiguring them has real consequences. Let’s discuss the most common problems.
Memory Usage and Allocation
How does Kubernetes assign memory to a container? It depends. A pod can run in one of the following scenarios:
- No resource requests or resource limits set (default).
- Only resource requests set.
- Only resource limits set.
- Both resource requests and limits set.
Without requests and limits set, pods will simply be managed on a first-come, first-served basis. Kubernetes will try to distribute RAM between all running pods equally, but if one pod tries to allocate more and more memory, Kubernetes may kick out other pods from the node to meet the demand. There is nothing stopping pods from consuming all the free memory on the node. Trust me, you don’t want to have a memory leak in this situation.
Setting Only Requests or Limits
You might be thinking, “I will set those requests to guarantee the amount that my pod needs to run properly, but I don’t think I need limits.” Doing this will definitely solve some problems.
If you set resource requests, Kubernetes will make sure to schedule a particular pod on a node with at least that amount of RAM available, so, in theory, you’re safe. But in practice, there is still nothing that protects you from a memory-leaking application.
That means if you have a pod that needs only 512 MB of RAM to run properly and you set its memory request to 600 MB, then on a node with 8 GB of RAM, you should be able to fit more than ten such pods. But if one of these pods has a memory leak, it can consume the node’s free memory, and Kubernetes may not schedule any other pod on that node.
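For instance, a pod spec along these lines reserves the 600 MB from the example above but puts no cap on actual usage. The name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: requests-only-demo    # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25         # placeholder image
    resources:
      requests:
        memory: "600Mi"       # the scheduler reserves this much on the node
      # no limits set: a memory leak can keep growing until the node runs out of RAM
```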
On the other hand, if you only set limits, there is nothing that guarantees a minimum amount of memory for the pod. So, depending on the overall system usage, your application simply may not perform properly.
Troubleshooting Memory-Related Errors
Setting both memory resource requests and limits for a pod helps Kubernetes manage RAM usage more efficiently. But doing so doesn’t solve all problems, of course.
OOMKilled – Container Limit Reached
If an application has a memory leak or, for any other reason, tries to use more memory than a set limit, Kubernetes will terminate it with an “OOMKilled – Container limit reached” event and Exit Code 137.
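You can confirm this from the pod’s status. Below is a trimmed, illustrative excerpt of what kubectl get pod <pod-name> -o yaml might show after such a termination; the container name and restart count are placeholders.

```yaml
status:
  containerStatuses:
  - name: webapp
    restartCount: 3
    lastState:
      terminated:
        reason: OOMKilled   # the container exceeded its memory limit and was killed
        exitCode: 137       # 128 + signal 9 (SIGKILL)
```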
So, whenever you see such a message, you either have to increase the limit for the pod if such memory usage was expected (maybe simply due to increased load on the website) or debug the application if such usage was sudden and unexpected. You also need to keep in mind that Kubernetes killing a pod like that is a good thing: it protects all the other pods running on the same node.
OOMKilled – Limit Overcommit
Kubernetes uses memory requests to determine on which node to schedule the pod. For example, on a node with 8 GB of free RAM, Kubernetes will schedule ten pods with memory requests of 800 MB each, five pods with requests of 1,600 MB each, one pod with a request of 8 GB, and so on. However, limits can (and should) be higher than requests, and they are not taken into account for scheduling.
So, for example, you can schedule ten pods on the same node with memory requests of 800 MB and memory limits of 1 GB each. This can lead to a situation where the pods collectively try to use more memory than the node has.
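Here's a hedged sketch of that scenario: a Deployment whose replicas each request 800 MB of memory but are allowed to grow to 1 GB. The name and image are placeholders, and in practice the scheduler may spread the replicas across several nodes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp              # placeholder name
spec:
  replicas: 10
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:1.25   # placeholder image
        resources:
          requests:
            memory: "800Mi" # scheduling decisions are based on this value
          limits:
            memory: "1Gi"   # ten replicas could together use up to 10 GB
```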
In this case, Kubernetes may terminate some pods, as explained above. What’s important to understand here is how Kubernetes decides which pods to kill. This is the hierarchy it uses:
- First, it will terminate pods that have neither requests nor limits set.
- Second, it will terminate those that don’t have limits set.
- Next, it terminates pods that use more memory than requested (but are within their limit).
- Only after the first three types of pods listed have been terminated will Kubernetes terminate pods that have both limits and requests set and are currently using less than requested memory.
This hierarchy is why it’s recommended to set appropriate values for both requests and limits for each workload.
CPU Requests/Limits
Setting CPU requests and limits isn’t as straightforward as memory, where you can simply define a specific number of bytes. Kubernetes defines CPU resources in “CPU units.” One CPU unit equals one vCPU/core on cloud providers and one hyperthread on bare-metal machines. In theory, a CPU request of “1” will allow a container to use one vCPU/core (regardless of whether it’s running on a single-core or 24-core machine). Fractional values are also possible, so a value of “0.5” will allow a container to use half of a core.
However, Kubernetes translates these values under the hood into a proportion of CPU cycles, which means that, in the case of high CPU usage on the node, there is no guarantee that the container will get as much CPU as it requested. So, it’s more of a priority setting. For that reason, unlike with memory limits, Kubernetes will not kill a container that tries to use more CPU than its limit. Instead, Kubernetes will only throttle the process, assigning less CPU time to it. Since CPU requests and limits aren’t absolute values but a share of the available CPU time, it can be difficult to troubleshoot CPU performance-related issues.
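As an illustration, here's a minimal pod spec using CPU units; the name and image are placeholders, and "500m" (500 millicores) is just another way to write 0.5.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo          # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25     # placeholder image
    resources:
      requests:
        cpu: "500m"       # half a core, used for scheduling and as a relative share
      limits:
        cpu: "1"          # usage above one core is throttled, the container is not killed
```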
Monitoring to the Rescue
If set to the wrong values, resource requests and limits will cause more harm than good. Unfortunately, it’s not easy to guess the right values in the first place. You should have at least a general idea of how many resources your application needs. Then, your best bet is to start from reasonably guessed numbers and gradually adjust them to the optimal values. You can use the Kubernetes Vertical Pod Autoscaler (VPA) to automatically adjust resource requests and limits for the containers running in a deployment’s pods.
VPA can improve resource utilization by automatically setting requests based on usage, maintaining proportions between limits and requests, and scaling up pods that are requesting insufficient resources based on usage over time.
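As a rough sketch, a VerticalPodAutoscaler object targeting a hypothetical Deployment named webapp could look like this. It assumes the VPA components are installed in your cluster, which doesn't happen by default.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp          # hypothetical Deployment to autoscale
  updatePolicy:
    updateMode: "Auto"    # use "Off" to only get recommendations instead of automatic updates
```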
Even if you don’t let VPA update pods automatically, it can help by observing CPU and memory usage and suggesting values for your deployment. However, due to the complexity of Kubernetes, your best bet is to have a monitoring system (and if you think you don’t need one, read this blog post).
Aggregating and monitoring logs can be very useful in the process of finding appropriate values for requests and limits. Most of the events related to requests and limits are emitted as logs, and issues related to Kubernetes memory usage, like “OOMKilled – Container limit reached,” are pretty straightforward to spot.
You can read more about troubleshooting with Kubernetes logging here. Fortunately, it’s easy to stream all the logs from Kubernetes into one place. Tools like Fluentd can take care of that very well. From there, you only need a system that can aggregate the logs and surface the most important messages.
Papertrail can manage logs not only from containers but also from Kubernetes components and nodes, which gives you an even better overview of what’s happening in your cluster. It’s very handy, especially in bigger clusters. The powerful yet simple search syntax offered by Papertrail can drastically reduce debugging time. Moreover, it can show you events in context and pinpoint issues. Even if you prefer real-time troubleshooting, Papertrail has you covered with its Live Tail feature. If you want to see it for yourself, sign up for a trial or request a demo.
This post was written by Dawid Ziolkowski. Dawid has 10 years of experience: he started as a network/system engineer, worked in DevOps in between, and has recently focused on cloud-native engineering. He’s worked for an IT outsourcing company, a research institute, a telco, a hosting company, and a consultancy, so he’s gathered a lot of knowledge from different perspectives. Nowadays he’s helping companies move to the cloud and/or redesign their infrastructure for a more cloud-native approach.