ARTIFACTORY: Artifactory HA Nodes Out of Sync After Upgrade: Troubleshooting Access and Entitlement Errors

Products
JFrog_Artifactory
Content Type
Upgrade
AuthorFullName__c
Karthik Pandi
articleNumber
000006638
FirstPublishedDate
2025-09-25T08:41:38Z
lastModifiedDate
2025-09-25
VersionNumber
1
Introduction 

After upgrading Artifactory in a High Availability (HA) setup, the cluster may initially appear healthy but later show HTTP 500 errors when users attempt to fetch artifacts. In some cases, one node remains responsive while another becomes unhealthy, preventing UI access and blocking the ability to generate support bundles.

The underlying issue is often linked to stale Access service caches (specifically JFConnect entitlement configurations) that fail to sync properly between HA nodes. This causes inconsistent configuration states, lock contention, HA sync failures, and eventually service instability.


This article walks you through identifying and resolving issues in an Artifactory HA environment caused by stale Access service caches and JFConnect entitlement configuration mismatches. It explains the common error patterns that appear after an upgrade, why they lead to node instability and HTTP 500 responses, and provides step-by-step instructions for restoring cluster health, along with best practices to prevent the issue from recurring in future upgrades.

Resolution 

1. Identify the errors in logs
You may notice one or more of the following recurring error patterns in the console.log or artifactory-service.log:
  • Entitlements cache update error
    [ERROR] [titlementsAutoUpdatingCache:62] - Can't update entitlements
    java.lang.NullPointerException: Cannot invoke "org.jfrog.jfconnect.client.model.EntitlementsFullModel.getEntitlements()"
    because "updatedFullEntitlementsModel" is null
  • Lock acquisition failures between nodes
    [INFO ] [AcquiredExceptionGrpcMapper:14] - Attempt to acquire lock - jfconnect.entitlements.lock failed,
    with owner artifactory-1. Lock is held by - artifactory-0, for 1445(ms)
  • Access router errors / HA sync timeouts
    [ERROR] - Last retry failed: Access router is unhealthy, marking server as not available.
    HA sync will not work with this node until the node is healthy, 'artifactory-0'.
    Status code: UNAVAILABLE. HTTP status code 504
    invalid content-type: text/plain; charset=utf-8
    DATA-----------------------------
    Gateway Timeout. Not trying again (exceeded number of attempts (3))
  • Outdated configuration revision errors
    [INFO ] - Set config failed for: jfconnect.entitlements.config,
    error: rpc error: code = InvalidArgument desc = org.jfrog.common.ExecutionFailed:
    Last retry failed: INVALID_ARGUMENT: Outdated configuration for key `jfconnect.entitlements.config`;
    current config revision: 94086, incoming config revision: 94070.


These errors confirm that one or more HA nodes are running with outdated Access cache values compared to the database.
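In Kubernetes, one quick way to scan for these patterns is to grep the pod logs. This is a minimal sketch; the pod names and search patterns are examples and should be adjusted to your environment:
 kubectl logs artifactory-0 -n <namespace> | grep -iE "entitlements|acquire lock|HA sync|Outdated configuration"
 kubectl logs artifactory-1 -n <namespace> | grep -iE "entitlements|acquire lock|HA sync|Outdated configuration"
Alternatively, search artifactory-service.log under $JFROG_HOME/artifactory/var/log/ on each node.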


2. Upgrade the Helm chart

 Upgrade to the latest stable Artifactory Helm chart version (for example, 11.2.1) to pick up the latest stability improvements and bug fixes.
helm upgrade artifactory jfrog/artifactory --version 11.2.1 -n <namespace>
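Before running the upgrade, you can refresh the local chart index and confirm the target version is available (this assumes the JFrog chart repository was added under the name jfrog):
 helm repo update
 helm search repo jfrog/artifactory --versions | head
If your installation uses a custom values file, pass it to helm upgrade with -f <values-file> so existing overrides are preserved.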

3. Stop all running nodes
 Scale down the StatefulSet to 0 replicas to stop all Artifactory pods. This clears stale cache states across nodes.
kubectl scale statefulset artifactory --replicas=0 -n <namespace>
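Before scaling back up, confirm that all pods have fully terminated. The label selector below assumes the chart's default app=artifactory label; verify yours with kubectl get pods --show-labels:
 kubectl wait --for=delete pod -l app=artifactory -n <namespace> --timeout=300s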

4. Restart nodes in sequence
 Scale the StatefulSet back to the desired number of replicas (e.g., 3).
 With the StatefulSet's default OrderedReady pod management policy, each pod starts only after the previous one is Ready, so the pods come up in order:
artifactory-0 → artifactory-1 → artifactory-2.
kubectl scale statefulset artifactory --replicas=3 -n <namespace>
This ensures clean configuration propagation and resets HA sync properly.
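To watch the ordered startup, you can follow the StatefulSet rollout and pod readiness:
 kubectl rollout status statefulset artifactory -n <namespace>
 kubectl get pods -n <namespace> -w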

5. Verify cluster health
Confirm pod readiness:
 kubectl get pods -n <namespace>
Check the logs for the absence of entitlement or HA sync errors:
 kubectl logs artifactory-0 -n <namespace>
Validate that UI and API requests succeed without 500 errors.
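As an additional API-level check, the system ping endpoint should return OK on a healthy node (replace <artifactory-url> with your instance's base URL):
 curl -s http://<artifactory-url>/artifactory/api/system/ping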

6. Monitor for stability

 Keep monitoring logs and user activity for several hours to confirm that the cluster remains stable and that no new entitlement-related or HA sync errors appear.
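One way to do this is to follow each pod's log and filter for the error patterns from step 1 (a sketch; adjust the pattern as needed):
 kubectl logs -f artifactory-0 -n <namespace> | grep -iE "entitlements|HA sync|ERROR"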


Conclusion

The issue occurs when HA nodes become misaligned due to stale Access service caches (e.g., jfconnect.entitlements.config). This misalignment can lead to lock contention, failed HA sync, and HTTP 500 responses for users.
Restarting all nodes after upgrading the Helm chart forces caches to reset and ensures clean configuration propagation across nodes, restoring cluster stability.

Best Practices to Prevent Recurrence
  • Always upgrade using the latest Helm chart and Artifactory version.
  • When upgrading an HA cluster, restart all nodes together (scale down to zero, then back up) rather than one node at a time, to avoid partial cache states.
  • After upgrades, monitor Access logs for entitlement and sync errors.