The JSON format is ubiquitous, used everywhere from logging in web apps to message passing for microcontrollers. Because it is compact and easy for humans to read, JSON has become the de facto standard for sharing structured data.
To illustrate how easy JSON is to read, here’s an example JSON message containing an array of people where each entry describes a person’s name, age, and address:
{
  "people": [
    {
      "name": "Jane Smith",
      "age": 42,
      "address": "123 Park Road"
    },
    {
      "name": "John Barnes",
      "age": 33,
      "address": "55"
    }
  ]
}
But JSON isn’t just easy for humans to read. Because the data is structured, it’s also easy for machines to parse and convert to other formats, and JSON parsers are available for virtually every programming language and runtime environment.
Given all these benefits, it’s no surprise the JSON format is used extensively for logging. JSON logs enable you to quickly understand the state of your app and troubleshoot issues when things go wrong. When formatted correctly, they can become a vital part of your diagnosis and analysis toolbox.
To make sure your JSON logs are the best they can be, here’s a list of eight best practices to follow.
Validate Your JSON
Since JSON is so easy to write by hand, it’s also easy to get wrong, so you need to ensure you’re working with valid JSON data. If you create invalid JSON, tools and apps won’t be able to parse it correctly. Validation checks that the syntax is correct, but it doesn’t attribute any meaning to the data, since that meaning is specific to what you’re communicating. Many online tools, such as https://jsonlint.com or https://codebeautify.org/jsonvalidator, let you paste in your JSON data and have it validated. Once you’ve verified your JSON is syntactically valid, it’s time to make sure it means what you intended it to mean. And for that, you need a schema.
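Syntax validation can also be done locally rather than in an online tool. The following is a minimal sketch using Python’s standard json module; the helper name is_valid_json and the sample strings are made up for illustration. Note that Python’s parser is slightly lenient by default (it accepts NaN and Infinity, which strict JSON does not allow).

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


good = '{"name": "Jane Smith", "age": 42}'
bad = '{"name": "Jane Smith", "age": 42'  # missing closing brace

print(is_valid_json(good))
print(is_valid_json(bad))
```

A check like this only confirms the syntax. It says nothing about whether the fields are the ones you expect.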
Create a Standard Schema
Once you’ve determined your log data is valid JSON, it’s a good idea to attach meaning to each field, so you know where to look when analyzing JSON logs. Schemas are the perfect tool for the job. They allow you to describe the expected format of JSON logs, so the semantics of each field are written down for every user to see. Projects such as JSON Schema help you create descriptions of your JSON data, which you can use to validate the JSON data you send and receive.
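To show the idea of checking a record against an agreed format, here is a deliberately simplified sketch. It is not JSON Schema itself, only a standard-library stand-in that checks required fields and their types. The field names in LOG_SCHEMA and the helper schema_errors are assumptions chosen for illustration.

```python
import json

# Hypothetical schema: field name -> expected Python type after parsing.
LOG_SCHEMA = {
    "timestamp": str,
    "level": str,
    "message": str,
}


def schema_errors(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return errors


record = json.loads('{"timestamp": "2020-03-04T21:16:20+00:00", "level": "ERROR"}')
print(schema_errors(record, LOG_SCHEMA))
```

A real JSON Schema validator covers far more (nested objects, value ranges, patterns), but the workflow is the same: describe the format once, then check every message against it.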
Include a Timestamp in Every Message
One of the first things you need to do when troubleshooting complex issues is establish the order of messages, so you can figure out the chain of events leading up to the problem you’re investigating. Including a timestamp in each JSON message allows you to order them, so you can see which events led to later ones. Be sure to include time zone information in the timestamp if you’re working across geographies. Here’s an example JSON message with a timestamp field:
{
  "type": "error-msg",
  "timestamp": "2020-03-04T21:16:20+00:00"
}
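Timestamps that carry their UTC offset can be generated and sorted with the standard library alone. The sketch below assumes Python 3.7 or later; the helper names make_message and sort_by_timestamp are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_message(msg_type: str, when: datetime) -> str:
    """Serialize a message with an ISO 8601 timestamp that includes its UTC offset."""
    return json.dumps({"type": msg_type, "timestamp": when.isoformat()})


def sort_by_timestamp(lines: list) -> list:
    """Order JSON messages chronologically by parsing their timestamp field."""
    return sorted(
        lines,
        key=lambda line: datetime.fromisoformat(json.loads(line)["timestamp"]),
    )


later = make_message("error-msg", datetime(2020, 3, 4, 21, 16, 20, tzinfo=timezone.utc))
earlier = make_message("info-msg", datetime(2020, 3, 4, 21, 16, 5, tzinfo=timezone.utc))

ordered = sort_by_timestamp([later, earlier])
for line in ordered:
    print(line)
```

Because the timestamps are parsed into timezone-aware values before comparison, messages recorded with different offsets still sort into the correct order.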
Include Information for Future Debugging
Choosing exactly which information to include in your logs is highly dependent on your app and workload, but capturing the right information today is critical for troubleshooting issues tomorrow. When you’re in the middle of analyzing an issue, it’s not always possible to generate new logs, and often you can only look at historical data to work out which events led to an issue. You need to include enough context in your JSON logs to derive the state of your application without actually being able to inspect it.
The key to creating usable log messages is to include enough contextual information to accurately reconstruct the state of your app at the moment the log message was generated.
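One way to make context a habit is to pass it alongside every message. The following sketch is only an illustration: the function log_event and the context fields (request_id, user_id, retry_count, upstream_status) are hypothetical, and the right fields depend entirely on your app.

```python
import json
from datetime import datetime, timezone


def log_event(message: str, **context) -> str:
    """Build one JSON log line that carries the message plus its context."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "context": context,  # whatever state is needed to understand the event later
    }
    return json.dumps(record)


line = log_event(
    "payment failed",
    request_id="req-123",
    user_id=42,
    retry_count=3,
    upstream_status=503,
)
print(line)
```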
Keep Identifier Data at the Start
JSON logs come in all sizes depending on what information you need to record. If you place tags and other identifying data at the start of each JSON message, a reader can tell what the message contains early on and skip the data it doesn’t need, which saves parsing time. For small messages, this best practice is unlikely to make much of a difference, but some JSON messages can be hundreds of bytes in size. In this case, skipping data you don’t need during parsing can significantly improve parsing performance. Plus, being able to skip unwanted JSON messages also makes it easier for fellow developers to quickly find the messages they need when analyzing logs.
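The sketch below shows one way to apply this in Python, where dictionaries keep insertion order and json.dumps writes keys in that order. The helpers encode and wanted are hypothetical. The prefix check is an assumption-laden shortcut: it only works for lines produced by this same writer with its default formatting, since JSON itself does not guarantee key order.

```python
import json


def encode(msg_type: str, payload: dict) -> str:
    """Serialize with the identifying "type" field first (dicts keep insertion order)."""
    return json.dumps({"type": msg_type, **payload})


def wanted(line: str, msg_type: str) -> bool:
    """Cheap pre-filter: inspect the start of the line before paying for a full parse."""
    return line.startswith('{"type": ' + json.dumps(msg_type))


lines = [
    encode("error-msg", {"detail": "disk full"}),
    encode("heartbeat", {"uptime": 12}),
]

# Only fully parse the messages that passed the pre-filter.
errors = [json.loads(line) for line in lines if wanted(line, "error-msg")]
print(errors)
```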
Use Logging Levels to Distinguish Message Types
Most developers are familiar with the logging levels used to describe the severity of a log message, such as INFO, WARN, and CRITICAL. With careful use of these levels in your JSON messages, you can make it easier for existing tools to search through log data and find the most relevant information when you need it.
Using log levels encourages you to record both errors and behavioral information in your logs. Both are important in different ways, especially when diagnosing issues: errors tell you when something went wrong; behavioral messages show why.
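As an illustration, Python’s standard logging module can emit the level as a field in each JSON line. This is a minimal sketch: the JsonFormatter class and the logger name example-app are made up here, and Python names its warning level WARNING rather than WARN.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line, level first."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


stream = io.StringIO()  # stand-in for a log file or stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("example-app")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("cache warmed")             # behavioral message
logger.warning("disk usage high")       # early warning
logger.critical("database unreachable") # error

records = [json.loads(line) for line in stream.getvalue().splitlines()]
urgent = [r for r in records if r["level"] in ("WARNING", "CRITICAL")]
print(urgent)
```

With the level stored as its own field, filtering for the urgent messages is a simple comparison rather than a text search.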
Compress Your JSON
Even though the textual representation of JSON is small, you might be able to send and receive JSON messages faster by compressing the data. Using a compression algorithm such as gzip or bzip2 could reduce the size of your JSON data by up to 90%. That’s the same as shrinking a 130MB JSON payload down to 13MB.
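The effect is easy to measure with Python’s standard gzip module. The sample records below are invented and highly repetitive, so they compress better than typical real-world logs would; treat the printed ratio as an illustration, not a benchmark.

```python
import gzip
import json

# Hypothetical, very repetitive log records used only to demonstrate the idea.
records = [
    {"level": "INFO", "message": "request served", "status": 200, "path": "/api/items"}
    for _ in range(1000)
]

raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")
compressed = gzip.compress(raw)
restored = gzip.decompress(compressed)  # lossless: the original bytes come back

saved = 1 - len(compressed) / len(raw)
print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes, saved: {saved:.0%}")
```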
Use the Same Logs in Development and Production
Using different JSON schemas in development and production can lead to all kinds of inconsistencies during development and testing. By using the same format of JSON messages in the two environments, you can eliminate duplicate efforts and reuse existing dashboards and queries to analyze how your app is behaving.
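One way to keep the two environments aligned is to share a single formatting function and let the environment change only how much is logged, never the shape of a message. The sketch below assumes hypothetical environment names and a hypothetical emit helper.

```python
import json
from typing import Optional

# Hypothetical per-environment settings: only the minimum level differs,
# never the field names or their order.
MIN_LEVEL = {"development": "DEBUG", "production": "INFO"}
LEVEL_ORDER = ["DEBUG", "INFO", "WARN", "CRITICAL"]


def emit(env: str, level: str, message: str) -> Optional[str]:
    """Return the JSON line for this environment, or None if the level is filtered out."""
    if LEVEL_ORDER.index(level) < LEVEL_ORDER.index(MIN_LEVEL[env]):
        return None
    return json.dumps({"level": level, "message": message})


print(emit("development", "WARN", "slow query"))
print(emit("production", "WARN", "slow query"))
print(emit("production", "DEBUG", "cache lookup"))
```

Because both environments produce identical lines for the same event, a query or dashboard built against development logs keeps working in production.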
Conclusion
When choosing from the many different log formats available, you could do a lot worse than selecting JSON. JSON is both human- and machine-readable, and there are many tools available for generating and consuming JSON messages. Libraries exist for virtually every programming language and runtime environment, and you can even produce JSON messages with microcontrollers.
When it’s time to use your JSON logs, there are several best practices you should follow to ensure your logs are usable when you need them most. Validating your JSON data against a schema means you can guarantee the fields you expect to be in your data are there, and including timestamps gives you extra confidence you’ll be able to quickly order events when troubleshooting.
You can also tune your JSON data to make it faster to parse. Using log levels and putting identifying data at the start of JSON messages are both useful tricks for skipping messages you’re not interested in. Compressing your data allows it to be transported much more quickly, which is very handy for transferring large JSON logs.