Important
This feature requires all JPDs in the Federation to run Artifactory release 7.71.1 or later.
In certain cases, it may not be possible to maintain near real-time synchronization of all artifact events (create, update, delete) and binary tasks among Federation members. Examples include short-term networking issues between the JPDs, Artifactory upgrades, and a user-initiated synchronization pause. If synchronization still fails after the maximum number of retry attempts has been reached, event sync is paused and the Federation moves into an error state.
One way to recover the Federation is to perform a full sync. However, a full sync amounts to restarting the Federation, so it can be time-consuming when the Federated repositories contain a large number of artifacts.
Artifactory features an auto-healing mechanism that checks Federated repositories at regular intervals for exhausted queues (queues that have exceeded the maximum number of attempts to send events to other Federation members). This mechanism resets the failed events automatically and tries again to sync with the target mirror.
Note
If events have accumulated over a period of days, the event cleanup mechanism might remove events that have not yet been propagated, causing the queue to move to an out-of-sync state. In such cases, a full sync is required.
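The auto-healing cycle described above can be pictured with a small model. This is only an illustrative sketch, not Artifactory's implementation: the `MirrorQueue` class, the `auto_heal` function, and the target names are hypothetical, and the sketch assumes a queue counts as exhausted once its failed attempts reach the configured maximum.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MirrorQueue:
    """Hypothetical model of the event queue for one target mirror."""
    target: str
    max_attempts: int          # attempts allowed before the queue is exhausted
    failed_attempts: int = 0

    @property
    def exhausted(self) -> bool:
        # Assumption: "exhausted" means the attempt limit has been reached.
        return self.failed_attempts >= self.max_attempts

    def record_failure(self) -> None:
        # An exhausted queue is paused, so it stops counting further failures.
        if not self.exhausted:
            self.failed_attempts += 1

    def reset(self) -> None:
        # Resetting makes the failed events eligible to be sent again.
        self.failed_attempts = 0


def auto_heal(queues: List[MirrorQueue]) -> List[str]:
    """Run one auto-healing pass and return the targets that were reset."""
    healed = []
    for queue in queues:
        if queue.exhausted:
            queue.reset()
            healed.append(queue.target)
    return healed


if __name__ == "__main__":
    queues = [MirrorQueue("jpd-b", max_attempts=3), MirrorQueue("jpd-c", max_attempts=3)]
    for _ in range(3):
        queues[0].record_failure()
    print(auto_heal(queues))
```

In this model, only the queue for `jpd-b` is reset, because it is the only one that used up its attempts. A real deployment would run such a pass on a timer at the configured interval.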
Email Notifications
All administrators who are registered for Artifactory's mail service will receive notifications similar to the one shown below when auto-healing takes place:
2023-09-21T10:39:24.696Z [jfrt ] [INFO ] [29cb8b34b3ec63e4] [atedRepositoryRecoveryTest:247] [TestNG_1 ] [rt_229036255 ] [rt_229036255] - Mail notification subject: [JFrog] Mirror Recovery in Progress
2023-09-21T10:39:24.696Z [jfrt ] [INFO ] [29cb8b34b3ec63e4] [atedRepositoryRecoveryTest:249] [TestNG_1 ] [rt_229036255 ] [rt_229036255] - Mail notification content: ------=_Part_0_2088885210.1695292764495
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit
"Auto Healing" task has recognized a mirror in an exhausted state: 'http://localhost:11552/artifactory/generic-fed-9ac0b669-5760-4ff3-a209-c6b8c7826928' -> 'http://localhost:55295/artifactory/generic-fed-9ac0b669-5760-4ff3-a209-c6b8c7826928'. Recovery attempt is now in progress...
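If you want to route these notifications into monitoring, the affected mirror pair can be extracted from the message text. The following is a minimal sketch: the function name `parse_exhausted_mirror` is hypothetical, and the pattern is inferred from the single sample notification above, so the wording may differ in other Artifactory versions.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the one sample notification; treat it as an assumption.
_MIRROR_PATTERN = re.compile(r"exhausted state: '([^']+)' -> '([^']+)'")


def parse_exhausted_mirror(body: str) -> Optional[Tuple[str, str]]:
    """Return (source_url, target_url) from a notification body, or None."""
    match = _MIRROR_PATTERN.search(body)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        "\"Auto Healing\" task has recognized a mirror in an exhausted state: "
        "'http://localhost:11552/artifactory/generic-fed' -> "
        "'http://localhost:55295/artifactory/generic-fed'. "
        "Recovery attempt is now in progress..."
    )
    print(parse_exhausted_mirror(sample))
```

The function returns `None` for text that does not match, so unrelated mail can be skipped safely.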
System Properties
Federation recovery and auto-healing are controlled using the following properties in the artifactory.system.properties file:
Property | Description |
---|---|
| Defines the interval (in seconds) at which the auto-healing feature checks for exhausted queues. The default value is |
| Defines the buffer that works in conjunction with the The default value is |
| Defines the number of attempts to send a queued Federated event before the queue becomes exhausted and therefore eligible for auto-healing. The default value is |
| Defines the delay interval (in minutes) between attempts to trigger the queue. The default value is |
| Defines the interval (in minutes) for an async task that resets the status of a Full Sync operation that has become "stuck", enabling the Full Sync to restart. This property is useful, for example, if the Artifactory instance is restarted while a Full Sync operation is running. After the restart, this async task will reset the operation and restart it. The default value is |
| Defines the initial delay (in minutes) before running the async task that resets the status of a stuck Full Sync operation. The default value is |
Manual Recovery using a REST API
Use the Federation Recovery REST API to perform recovery manually. This API is useful when auto-healing has been disabled or when you want to perform recovery immediately instead of waiting for the next auto-healing check.
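A manual recovery call could be prepared from a script along the following lines. This sketch makes several assumptions that this section does not confirm: that the request uses the POST method, and that authentication uses a bearer access token. The endpoint path is deliberately left as a required argument because it is not stated here; take the real path from the Federation Recovery REST API reference. The function name `build_recovery_request` is hypothetical.

```python
import urllib.request


def build_recovery_request(base_url: str, recovery_path: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a manual Federation recovery request.

    recovery_path is a placeholder supplied by the caller; this sketch
    does not know the real endpoint path.
    """
    url = base_url.rstrip("/") + "/" + recovery_path.lstrip("/")
    return urllib.request.Request(
        url,
        method="POST",  # assumption: recovery is triggered with POST
        headers={"Authorization": f"Bearer {token}"},  # assumption: bearer token auth
    )


if __name__ == "__main__":
    request = build_recovery_request(
        "https://jpd.example.com/artifactory",  # hypothetical JPD base URL
        "/api/placeholder-recovery-path",       # placeholder, not the real path
        "MY_ACCESS_TOKEN",
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before the call reaches a production JPD.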