Introduction
Frequent restarts of an Artifactory pod can be perplexing. Reviewing the logs alone may not be enough to identify the root cause, and an effective monitoring tool is essential for a comprehensive analysis of the problem.
To gain deeper insight, examine the thread status within the pod just before it crashes. This includes the number of threads as well as their states (e.g., hanging, runnable, or blocked).
Timing matters when capturing thread dumps. Ideally, take them at the moment the issue occurs, not after a restart or once the issue appears to be resolved.
Otherwise, you risk obtaining a snapshot of the instance while it is functioning normally, which may not provide the needed insights.
Starting with version 107.46.3 of the Artifactory and Artifactory HA charts, lifecycle hooks are supported. This feature allows thread dump files to be collected automatically.
If you store the thread dump files in the Artifactory log folder, make sure they follow the naming format described below so that the Support Bundle includes them.
Resolution
The Support Bundle will automatically collect log files that conform to a specific naming convention. To ensure your thread dump files are included, they must be named with the artifactory- prefix and the .log suffix. Therefore, the required format for your thread dump files is:
artifactory-thread.xxxxx.td.log
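As a quick illustration of the naming rule, the following shell sketch checks whether a file name carries the artifactory- prefix and the .log suffix. The function name matches_bundle_convention is hypothetical, and the check reflects only the prefix and suffix rule stated above; the Support Bundle's actual matching logic may be stricter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) when the file name starts with
# "artifactory-" and ends with ".log", as described in the naming rule above.
matches_bundle_convention() {
  case "$1" in
    artifactory-*.log) return 0 ;;
    *) return 1 ;;
  esac
}

# Try the check on one conforming and one non-conforming example name.
for name in "artifactory-thread.20240101120000.td.log" "threads.20240101120000.td.log"; do
  if matches_bundle_convention "$name"; then
    echo "$name: matches the convention"
  else
    echo "$name: does not match the convention"
  fi
done
```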
Pod Lifecycle Configuration:
You can configure the lifecycle of your Artifactory pod to take thread dumps before it shuts down. Add a preStop hook to your Helm chart values (for example, values.yaml) as shown below:
artifactory:
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/bash
          - -c
          - |
            echo "Taking thread dump 1 with a 5-second interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /var/opt/jfrog/artifactory/log/"artifactory-thread.$(date +%Y%m%d%H%M%S).td.log";
            sleep 5;
            echo "Taking thread dump 2 with a 5-second interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /var/opt/jfrog/artifactory/log/"artifactory-thread.$(date +%Y%m%d%H%M%S).td.log";
            sleep 5;
            echo "Taking thread dump 3 with a 5-second interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /var/opt/jfrog/artifactory/log/"artifactory-thread.$(date +%Y%m%d%H%M%S).td.log";
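The three near-identical steps in the hook could also be written as a loop. The following is a minimal sketch and not part of the chart: the function name take_thread_dumps and the JCMD and LOG_DIR variables are hypothetical, with defaults taken from the paths used in the hook above.

```shell
#!/usr/bin/env bash
# Sketch: take COUNT thread dumps, INTERVAL seconds apart, for the JVM with the given PID.
# JCMD and LOG_DIR are hypothetical variables; their defaults are the paths from the hook above.
JCMD="${JCMD:-/opt/jfrog/artifactory/app/third-party/java/bin/jcmd}"
LOG_DIR="${LOG_DIR:-/var/opt/jfrog/artifactory/log}"

take_thread_dumps() {
  td_count="$1"     # number of dumps to take
  td_interval="$2"  # seconds to wait between dumps
  td_pid="$3"       # PID of the Java process
  td_i=1
  while [ "$td_i" -le "$td_count" ]; do
    echo "Taking thread dump $td_i of $td_count"
    # Same file name pattern as the hook: artifactory-thread.<timestamp>.td.log
    "$JCMD" "$td_pid" Thread.print > "$LOG_DIR/artifactory-thread.$(date +%Y%m%d%H%M%S).td.log"
    # Sleep only between dumps, not after the last one.
    if [ "$td_i" -lt "$td_count" ]; then
      sleep "$td_interval"
    fi
    td_i=$((td_i + 1))
  done
}

# Example invocation (commented out because it needs a running JVM):
# take_thread_dumps 3 5 "$(pidof java)"
```

Because the timestamp has one-second resolution, an interval of at least one second is needed to keep the file names distinct.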
Post-Restart or Crash Observations:
After a pod restart or crash, you should see new files named artifactory-thread.xxx.td.log in the Artifactory log folder. These thread dumps will be included in the next Support Bundle, providing crucial information about the pod's state either at the time of the crash or immediately before the restart.
Please forward this support bundle to the Support team at support@jfrog.com for further analysis.
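For a first look at a collected dump, the thread states mentioned in the introduction can be counted with standard shell tools. This is a minimal sketch under stated assumptions: the function name summarize_thread_states is hypothetical, and the dump is assumed to contain HotSpot-style lines of the form "java.lang.Thread.State: RUNNABLE".

```shell
#!/usr/bin/env bash
# Sketch: count threads per state in a single thread dump file.
# Assumes each thread has a line like "   java.lang.Thread.State: BLOCKED (on object monitor)".
summarize_thread_states() {
  # Field 2 of a matching line is the state name; tally it and print "<count> <state>".
  awk '/java.lang.Thread.State:/ { n[$2]++ } END { for (s in n) print n[s], s }' "$1" \
    | sort -rn
}

# Example invocation (the timestamp is a placeholder):
# summarize_thread_states /var/opt/jfrog/artifactory/log/artifactory-thread.<timestamp>.td.log
```

A large share of BLOCKED threads, or a thread count that grows across the three dumps, is often a useful starting point for the analysis.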