Request.log Performance Analysis Tutorial

Patrick Russell
2020-08-11 15:22

Request.log Performance Analysis Tutorial

Relevant versions: Artifactory 4.X – 6.X

Artifactory's request.log may appear to be simple upon first inspection, but the fields it records are extremely useful to find performance issues. There are ways to analyze these logs to build a picture of where a performance problem is occurring.

The log format is powerful because it is able to measure the time it takes Artifactory to process a request outside of networking. This can rule out networking as a factor in an issue, depending on the context.

 

Request.log Format

This is the format of a standard request.log entry:

20140508154145|2632|REQUEST|86:12:14:192|admin|GET|/jcenter/org/iostreams/iostreams/0.2/iostreams-0.2.jar|HTTP/1.1|200|8296

 

Date and time stamp

The date and time the request was completed and entered into the log file. Format is [YYYYMMDDHHMMSS]

Request time

The time in ms taken for the request to be processed

Request type

DOWNLOAD for a download request

UPLOAD for an upload request

REQUEST for any other request

IP

The requesting user's IP address

User name

The requesting user's user name or "non_authenticated_user" when accessed anonymously

Request method

The HTTP request method. e.g. GET, PUT etc.

Requested resource path

Relative path to the requested resource

Protocol version

The HTTP protocol version

Response code

The HTTP response code

Size (bytes) of request or response

If request method is GET: Size of response

If request method is PUT or POST: Size of request

The most important fields to analyze performance problems are the Request time and Size of Response. This can be demonstrated using some bash terminal commands to search through the file.

Checking for Performance Problems

A common use case is to try and figure out why a "docker login" times out. This is a good way to review how to troubleshoot the process.

First, a "docker login" requests a token by using the Artifactory API. You can get the exact request by running a "docker login" against Artifactory to find the request format:

GET /api/docker/docker-repo/v2/token

curl -u admin:password http://artifactory.com/artifactory/api/docker-local/v2/token

With the URL to search for, a set of "grep" searches can be crafted to parse the request log.

[Identify when a "docker login" request times out]
#In grep, a single wildcard character is the period '.'

#This search finds any request that takes between 10,000 and 99,999 milliseconds (10-99 seconds) 

grep "|…..|REQUEST" request.log | grep "api/docker" | grep "v2/token" 

#Alternative "awk" search
awk 'BEGIN { FS="|" } ($2 > requestTime) && ($6 == "GET") { print $1 " response_time: " $2 " " $7}' requestTime=10000 request.log
#Example long-running docker login
20190726113810| 39070 [39 seconds] |REQUEST| 127.0.0.1 |Jenkins-LDAP|GET|/api/docker/docker-virtual/v2/token|HTTP/1.1|200|0
#Adding an additional dot searches for 100 – 999 second requests
20190726114126 | 126000 [126 seconds] |REQUEST| 127.0.0.1 |Jenkins-LDAP|GET|/api/docker/docker-virtual/v2/token|HTTP/1.1|200|0

This technique allows you to identify if Artifactory is causing a slow request or not. There may be cause for concern if a long running request is for a small file, depending on the situation. 

For example, if a file is pulled from a remote endpoint for the first time, it might take longer versus when it has been cached cached:

[2.2 seconds to stream a 53 KB file the first time – Pulled from a remote repository]
20181228204422|2218|REQUEST|127.0.0.1|admin|GET|/api/npm/npm/sshpk/-/sshpk-1.16.0.tgz|HTTP/1.1|200|53405

20190814003406|681|REQUEST|127.0.0.1|admin|GET|/api/npm/npm/sshpk/-/sshpk-1.16.0.tgz|HTTP/1.1|200|53405

Please note that there are some limitations to this approach. For example, slow downloads or uploads might be taking time because the client must stream a lot of data. 

If there is a long running request transferring a lot of binary data, it is likely an expected behavior:

[137 milliseconds to upload a 2.5 MB file]
20181204003658|137|REQUEST|127.0.0.1|admin|PUT|/libs-snapshot-local/org/jfrog/test/multi3/3.7-SNAPSHOT/multi3-3.7-SNAPSHOT.war;build.timestamp=1543883800178;build.name=maven-pipeline;build.number=23|HTTP/1.1|201|2533340

To solve these problems, it's best to consider how Artifactory serves requests and narrow down the bottleneck:

[Standard process to handle the download of a binary]

1. Artifactory receives the request – The "request time" begins ticking
2. Artifactory authenticates the user using its embedded Access Server 
    2a. Access searches for the user in the database
3. Artifactory finds the file checksum in the database
4. Artifactory fetches the file from the file system
5. Artifactory streams the file to the user
6. The request closes, the "request time" stops, the request.log entry is recorded

Each point in the chain can be eliminated using other tests. For example, you can remove steps 1 and 5 using the request log analysis (It's then only slow inside the application). 

Checking Request Rates

If Artifactory is using a lot of processing power, one way to see what is causing this high usage is to review the request rates.

This can be done in the terminal, and relies on the standardized nature of the request log timestamp. To get a count of all the requests that occurred in a particular hour, you can look for request log timestamps ending at the hour mark:

#Starting the count at 2019-07-29 09:00 AM
patrickr-mac:logs patrickr$ grep "2019072909" request*.log | wc -l
  162848
patrickr-mac:logs patrickr$ grep "2019072910" request*.log | wc -l
  301432
patrickr-mac:logs patrickr$ grep "2019072911" request*.log | wc -l
  275613
patrickr-mac:logs patrickr$ grep "2019072912" request*.log | wc -l
  359222
patrickr-mac:logs patrickr$ grep "2019072913" request*.log | wc -l
  219759

Each of the numbers returned is the requests served in that hour according to the logs. These numbers are very useful when scaling Artifactory.

For tips on how to improve performance at scale, you can refer to the How to Tune Artifactory for Heavy Loads Knowledgebase article.