Introduction
Sometimes an Artifactory pod restarts repeatedly, and the cause is not obvious.
The logs alone may not be enough to find the root cause.
Without a proper monitoring tool, the issue is even harder to analyze.
In such cases, it helps to look inside the pod at the state of the threads right before the pod restarts.
For example, you may want to see how many threads exist and what state they are in (e.g. hung, runnable, or blocked).
Thread dumps are only useful when they are taken at the right time.
You must take them while the issue is occurring, not after a restart or after the issue is gone. Otherwise, they capture the instance while it is functioning normally, which is of little help in most cases.
You may wonder how to do this in a Kubernetes environment.
The good news is that the Artifactory and Artifactory HA charts, version 107.46.3 and above, support lifecycle hooks (see https://github.com/jfrog/charts/blob/c4a2c6c671d20e8672db24572b965a655381ee7e/stable/artifactory-ha/values.yaml#L628), which you can use to take thread dumps before the container is stopped.
Resolution
Below is an example of a preStop hook that takes 3 thread dumps at 5-second intervals:
artifactory:
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/bash
          - -c
          - |
            echo "taking thread dump 1 with 5 seconds interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /tmp/"artifactory.$(date +%Y%m%d%H%M%S).td";
            sleep 5;
            echo "taking thread dump 2 with 5 seconds interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /tmp/"artifactory.$(date +%Y%m%d%H%M%S).td";
            sleep 5;
            echo "taking thread dump 3 with 5 seconds interval";
            /opt/jfrog/artifactory/app/third-party/java/bin/jcmd $(pidof java) Thread.print > /tmp/"artifactory.$(date +%Y%m%d%H%M%S).td";
The destination of the thread dumps (/tmp in this example) should be changed to a path on your persistent storage or a backup location, so that the files remain available after the pod is gone.
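The repeated commands in the hook can also be written as a loop with a configurable count, interval, and destination. The following is only a sketch under stated assumptions: the function name take_thread_dumps and the JCMD variable are illustrative and not part of the chart, the jcmd path is the one used in the example above, and a single Java process is assumed to be running in the container.

```shell
#!/bin/bash
# Sketch only: take_thread_dumps and JCMD are illustrative names, not chart settings.
# The default jcmd path is the one used in the example above.
JCMD="${JCMD:-/opt/jfrog/artifactory/app/third-party/java/bin/jcmd}"

# take_thread_dumps COUNT INTERVAL_SECONDS DEST_DIR
take_thread_dumps() {
  local count="$1" interval="$2" dest="$3"
  local pid i

  # Assumption: exactly one Java process is running in the container.
  pid="$(pidof java)" || { echo "no java process found" >&2; return 1; }
  mkdir -p "$dest" || return 1

  for ((i = 1; i <= count; i++)); do
    echo "taking thread dump ${i} of ${count}"
    "$JCMD" "$pid" Thread.print > "${dest}/artifactory.$(date +%Y%m%d%H%M%S).td"
    # Wait between dumps, but not after the last one.
    if ((i < count)); then sleep "$interval"; fi
  done
}
```

A call such as take_thread_dumps 3 5 /path/on/persistent/volume (a placeholder path) would reproduce the behavior of the example above. The total run time, roughly the count multiplied by the interval plus the time jcmd needs, has to fit within the pod's termination grace period, because Kubernetes stops the container once that period expires even if the preStop hook is still running.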
How to Verify That Thread Dumps Are Being Created
1. Access the Pod:
Enter the pod by running the following command:
kubectl exec -it <artifactory-pod-name> -- bash
2. Delete the Pod to trigger thread dump creation (run this from a second terminal, so that the session from step 1 stays open):
kubectl delete pod <artifactory-pod-name>
3. Monitor Thread Dumps:
In the session from step 1, while still inside the pod, run the following command repeatedly to check for new thread dump files:
ls -al /tmp
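Instead of re-running ls by hand, the check can be put in a small polling loop, and a captured dump can be summarized by thread state. Both helpers below are a sketch: the names watch_dumps and summarize_states are hypothetical, the .td suffix follows the example above, and awk and sort are assumed to be available wherever the summary is run (for instance on a workstation after the files have been copied off the persistent storage).

```shell
#!/bin/bash
# Sketch only: watch_dumps and summarize_states are hypothetical helper names.

# watch_dumps DIR ROUNDS INTERVAL_SECONDS
# Lists the *.td files in DIR every INTERVAL_SECONDS, ROUNDS times.
watch_dumps() {
  local dir="$1" rounds="$2" interval="$3" i
  for ((i = 1; i <= rounds; i++)); do
    # Suppress the error that ls prints while no dump exists yet.
    ls -l "$dir"/*.td 2>/dev/null
    echo "--- check ${i}/${rounds} ---"
    if ((i < rounds)); then sleep "$interval"; fi
  done
}

# summarize_states FILE
# Counts threads per state, based on the "java.lang.Thread.State:" lines
# that appear in the Thread.print output.
summarize_states() {
  awk '/java.lang.Thread.State:/ { count[$2]++ }
       END { for (s in count) print s, count[s] }' "$1" | sort
}
```

For example, watch_dumps /tmp 30 1 checks /tmp once per second for about 30 seconds, and summarize_states run on one of the resulting .td files prints one line per thread state with the number of threads in that state. Comparing these counts across the consecutive dumps gives a quick view of whether threads are piling up in a particular state.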