Papertrail automatically uploads log messages and metadata to Amazon’s cloud storage service, S3. Papertrail stores one copy in our S3 bucket, and optionally, also stores a copy in a bucket that you provide. You have full control of the optional archive in your own bucket, since it’s tied to your AWS account.
Want to set up S3? Jump to Automatic S3 Archive Export.
Each line contains one message. The fields are ordered:
id
generated_at
received_at
source_id
source_name
source_ip
facility_name
severity_name
program
message
For a longer description of each column, see Log Search API: Responses.
Here are the fields for an example message:
50342052
2011-02-10 00:19:36 -0800
2011-02-10 00:19:36 -0800
42424
mysystem
208.122.34.202
User
Info
testprogram
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
Archives are in tab-separated values (.tsv) format, so a line actually looks like this:
50342052\t2011-02-10 00:19:36 -0800\t2011-02-10 00:19:36 -0800\t42424\tmysystem\t208.122.34.202\tUser\tInfo\ttestprogram\tLorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
The TSV files are gzip-compressed (.gz) to reduce size. Gzip compression is compatible with UNIX-based zip tools and third-party Windows zip tools such as 7-Zip and WinZip.
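For example, gzip itself can decompress an archive to a plain .tsv file on the command line:
$ gzip -cd 2018-11-12-00.tsv.gz > 2018-11-12-00.tsv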
Here’s how to extract the message (field 10) from the archive file 2018-11-12-00.tsv.gz, then show the messages sorted by the number of identical occurrences (duplicates):
$ gzip -cd 2018-11-12-00.tsv.gz | cut -f10 | sort | uniq -c | sort -n
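Similarly, as a quick sanity check, this prints one record with its ten fields numbered in the order listed above (tr splits the fields onto separate lines and nl numbers them):
$ gzip -cd 2018-11-12-00.tsv.gz | head -1 | tr '\t' '\n' | nl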
Windows PowerShell can do the same thing, with 7-Zip’s help. In this example, [9] still selects the message (field 10), due to zero-based indexing.
$ 7z x -so 2018-11-12-00.tsv.gz | %{($_ -split '\t')[9]} | group | sort count,name | ft count,name -wrap
The most common messages often differ only by a random number, IP address, or message suffix. These near-duplicates can be discovered with a bit more work.
Here’s how to extract the sender, program, and message (fields 5, 9, and 10) from all archive files, squeeze whitespace and digits, truncate after eight words, and sort the result by the number of identical occurrences (duplicates).
$ gzip -cd *.tsv.gz | # extract all archives
cut -f 5,9- | # sender, program, message
tr -s '\t' ' ' | # squeeze whitespace
tr -s 0-9 0 | # squeeze digits
cut -d' ' -f 1-8 | # truncate after eight words
sort | uniq -c | sort -n
or, as a one-liner:
$ gzip -cd *.tsv.gz | cut -f 5,9- | tr -s '\t' ' ' | tr -s 0-9 0 | cut -d' ' -f 1-8 | sort | uniq -c | sort -n
Once again, Windows PowerShell can do the same thing, with 7-Zip’s help.
> 7z x -so *.tsv.gz | # extract all archives
%{($_ -split '\t')[4,8,9] -join ' '} | # sender, program, message
%{$_ -replace ' +',' '} | # squeeze whitespace
%{$_ -replace '[0-9]+','0'} | # squeeze digits
%{($_ -split ' ')[0..7] -join ' '} | # truncate after eight words
group | sort count,name | ft count,name -wrap
or, as a one-liner:
> 7z x -so *.tsv.gz | %{($_ -split '\t')[4,8,9] -join ' '} | %{$_ -replace ' +',' '} | %{$_ -replace '[0-9]+','0'} | %{($_ -split ' ')[0..7] -join ' '} | group | sort count,name | ft count,name -wrap
In addition to being downloadable from Archives, you can retrieve archive files using your Papertrail HTTP API key, as part of the HTTP API. The URL format is simple and predictable:
https://papertrailapp.com/api/v1/archives/YYYY-MM-DD-HH/download
Reminder: before November 14, 2018 at 00:00 UTC, Papertrail generated either daily or hourly archives based on the amount of log data transfer included in your plan. All archive files generated after that date are hourly.
Download the archive for 2018-11-14 at 14:00 UTC with:
$ curl --no-include -o 2018-11-14-14.tsv.gz -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
https://papertrailapp.com/api/v1/archives/2018-11-14-14/download
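An archive that hasn’t been generated yet returns a 404 (see below). Adding curl’s -f flag makes the request exit nonzero instead of saving the error body to the output file; a sketch:
$ curl -f --no-include -o 2018-11-14-14.tsv.gz -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
  https://papertrailapp.com/api/v1/archives/2018-11-14-14/download || echo "archive not available yet"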
Occasionally, it may be useful to pull down a large number of archives without needing to check the archive frequency. This cURL script will run on a variety of UNIX platforms and download archives between two dates:
curl -sH 'X-Papertrail-Token: YOUR-HTTP-API-KEY' https://papertrailapp.com/api/v1/archives.json |
grep -o '"filename":"[^"]*"' | egrep -o '[0-9-]+' |
awk '$0 >= "YYYY-MM-DD" && $0 < "YYYY-MM-DD" {
print "output " $0 ".tsv.gz"
print "url https://papertrailapp.com/api/v1/archives/" $0 "/download"
}' | curl --progress-bar -fLH 'X-Papertrail-Token: YOUR-HTTP-API-KEY' -K-
Enter the start and end dates in the third line of the script. For example, to download archives for April and May 2019, that line would read awk '$0 >= "2019-04-01" && $0 < "2019-06-01" {.
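To preview which intervals fall inside the range without downloading anything, run just the listing-and-filtering half of the script (shown here with the April and May 2019 dates):
curl -sH 'X-Papertrail-Token: YOUR-HTTP-API-KEY' https://papertrailapp.com/api/v1/archives.json |
grep -o '"filename":"[^"]*"' | egrep -o '[0-9-]+' |
awk '$0 >= "2019-04-01" && $0 < "2019-06-01"'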
date
It's also possible to use the date tool to run regular automated requests or one-off bulk downloads from a short time period.
Download the archive for 16 hours ago with:
$ curl --silent --no-include -o `date -u --date='16 hours ago' +%Y-%m-%d-%H`.tsv.gz -L \
-H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
https://papertrailapp.com/api/v1/archives/`date -u --date='16 hours ago' +%Y-%m-%d-%H`/download
As you can see, there's a lot going on in those cURL one-liners. The main parts are:
-o `date -u --date='16 hours ago' +%Y-%m-%d-%H`.tsv.gz: Downloads the archive to a file named for the UTC date and hour from 16 hours ago, in the format YYYY-MM-DD-HH.tsv.gz.
-H "X-Papertrail-Token: YOUR-HTTP-API-KEY": Authenticates the request via your API token, found under your profile.
To download multiple archives in one command, use:
$ seq 1 X | xargs -I {} date -u --date='{} hours ago' +%Y-%m-%d-%H | \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download
where X is the number of hours + 1 that you want to download. For example, to guarantee 8 hours, change X to 9.
To specify a start date, for example 1 November 2019, combine the {} hours ago specification with the start date:
$ seq 1 X | xargs -I {} date -u --date='2019-11-01 {} hours ago' +%Y-%m-%d-%H | \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download
The seq 1 X command is being used to generate date or hour offsets, starting with 1 (1 day or hour ago) because the current day or hour will not yet have an archive. Since archive processing takes time, near the beginning of the hour or UTC day the interval also may not have an archive yet (and will return 404 when requested). Thus, to guarantee that you get at least X days/hours, replace X with the number of days/hours + 1.
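To sanity-check the generated intervals before adding the download step, run only the date-generating half (GNU date; here with X as 3 and the 1 November 2019 start):
$ seq 1 3 | xargs -I {} date -u --date='2019-11-01 {} hours ago' +%Y-%m-%d-%H
2019-10-31-23
2019-10-31-22
2019-10-31-21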
Using macOS and seeing date: illegal option -- -? In the examples above, change --date='{} hours ago' to -v-{}H.
This option format doesn't offer the standard date arguments' simple YYYY-MM-DD start date, but more complex uses of date can achieve the same result without requiring too much time math:
seq 1 X | xargs -I {} date -v-Nd -v-{}H +%Y-%m-%d-%H | \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download
Replace X with the number of hours to go back, and N with the number of days ago to start.
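For example, with N as 2 and X as 9, this downloads nine hourly archives counting back from two days ago:
seq 1 9 | xargs -I {} date -v-2d -v-{}H +%Y-%m-%d-%H | \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download
To start from an explicit date and time instead of a relative day offset, parse the start time explicitly: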
seq 1 X | xargs -I {} date -ur `date -ju MMDDHHmm +%s` -v-{}H +%Y-%m-%d-%H| \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download
Replace X with the number of hours to go back, and MMDDHHmm with the desired start date (mm will always be 00).
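For example, to start at November 1 at 00:00 UTC (of the current year, since the MMDDHHmm form carries no year) and work back nine hours, MMDDHHmm is 11010000:
seq 1 9 | xargs -I {} date -ur `date -ju 11010000 +%s` -v-{}H +%Y-%m-%d-%H | \
xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
-L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download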
To find an entry in a particular archive, use commands such as:
$ gzip -cd 2019-02-25-00.tsv.gz | grep Something
$ gzip -cd 2019-02-25-00.tsv.gz | grep Something | cut -f5,9,10 | tr '\t' ' '
The files are generic gzipped TSV files, so after un-gzipping them, anything capable of working with a text file can work with them.
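For example, to keep only records whose severity (field 8 in the order listed above) is Error, something like:
$ gzip -cd 2019-02-25-00.tsv.gz | awk -F'\t' '$8 == "Error"'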
If the downloaded files have file names such as 2019-08-18-00.tsv.gz (the default), multiple archives can be searched using:
$ gzip -cd 2019-08-* | grep SEARCH_TERM
To transfer multiple archives from Papertrail's S3 bucket to a custom bucket, use the relevant download command mentioned above, and then upload them to another bucket using:
$ s3cmd put --recursive path/to/archives/ s3://bucket.name/the/path/
where path/to/archives/ is the local directory where all the archives are stored, and bucket.name/the/path/ is the bucket and path of the target S3 storage location.
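If you prefer the AWS CLI over s3cmd, an equivalent upload (a sketch, using the same local path and bucket placeholders) is:
$ aws s3 cp path/to/archives/ s3://bucket.name/the/path/ --recursive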
See Automatic S3 Archive Export.
The scripts are not supported under any SolarWinds support program or service. The scripts are provided AS IS without warranty of any kind. SolarWinds further disclaims all warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The risk arising out of the use or performance of the scripts and documentation stays with you. In no event shall SolarWinds or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the scripts or documentation.