How do I monitor replication?

Author: Aaron Rhodes
Article number: 000004738
First published: 2020-02-06T23:36:53Z
Last modified: 2024-03-10T07:45:55Z
Version: 6

A replication failure can have one of three causes:

  1. A storage problem caused an IOException, which halts the task (and usually the whole server). This is something to prevent rather than react to, and you can prevent it by setting your disk thresholds.
  2. A network problem halted the replication task between the servers. You can spot these by searching the logs for SocketException and similar errors. Given the nature of networks, this is generally something you react to rather than prevent.
  3. Replication took too long and the job is stuck. Bugs have caused this in the past, so keep your Artifactory version as up to date as possible to prevent it. To detect it, periodically query the replication status REST API.
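
The periodic status check from item 3 could be sketched as below. This is a minimal sketch under assumptions, not a definitive implementation: it assumes the status is served at `GET <base_url>/api/replication/<repo_key>`, that the JSON response carries a `status` field, and that the values `error`, `warn`, `inconsistent` and `incomplete` exist as shown. The function names and the three-poll rule for "stuck" are hypothetical, so check them against the REST API documentation for your Artifactory version.

```python
import base64
import json
import urllib.request

# Assumed status values that this sketch treats as a failed replication.
FAILED_STATUSES = {"error", "warn", "inconsistent"}


def fetch_replication_status(base_url, repo_key, user, password):
    """Fetch the replication status document for one repository."""
    # Assumed endpoint: GET <base_url>/api/replication/<repo_key>
    url = f"{base_url.rstrip('/')}/api/replication/{repo_key}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def needs_attention(status_doc):
    """Return True when the reported status is one of the failed values."""
    status = str(status_doc.get("status", "")).lower()
    return status in FAILED_STATUSES


def looks_stuck(recent_statuses, max_incomplete_polls=3):
    """Return True when the last N polls all reported 'incomplete'.

    The three-poll default is an arbitrary illustration; choose it from
    your polling interval and how long a normal replication run takes.
    """
    tail = recent_statuses[-max_incomplete_polls:]
    return len(tail) == max_incomplete_polls and all(s == "incomplete" for s in tail)
```

A scheduler such as cron could call `fetch_replication_status` at a fixed interval, append each `status` value to a short history, and alert when `needs_attention` or `looks_stuck` returns `True`.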

You can also monitor the replication delta by querying the storage results of each server with a REST call and comparing them. If the delta exceeds your threshold, trigger an alert. Another option is to capture the replication job results from artifactory.log and check that they match your expectations.
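
The delta check could look roughly like the sketch below. It assumes each server's storage results have already been fetched and parsed from JSON (for example from a storage summary call such as `GET /api/storageinfo`), and that the document holds a `repositoriesSummaryList` whose entries have `repoKey` and `filesCount` fields. Those field names, the function names and the threshold are assumptions for illustration only.

```python
def file_counts(storage_info):
    """Map each repoKey to its filesCount from one storage summary document."""
    # Assumed layout: {"repositoriesSummaryList": [{"repoKey": ..., "filesCount": ...}]}
    return {
        entry["repoKey"]: int(entry.get("filesCount", 0))
        for entry in storage_info.get("repositoriesSummaryList", [])
        if "repoKey" in entry
    }


def replication_delta(source_info, target_info, repo_keys):
    """Return {repo_key: source file count - target file count}."""
    source = file_counts(source_info)
    target = file_counts(target_info)
    # A repository missing on one side counts as zero files there.
    return {key: source.get(key, 0) - target.get(key, 0) for key in repo_keys}


def over_threshold(deltas, threshold):
    """Return only the repositories whose absolute delta exceeds the threshold."""
    return {key: delta for key, delta in deltas.items() if abs(delta) > threshold}
```

Comparing file counts is only one possible measure. A repository that is mid-replication will show a temporary delta, so the threshold should leave room for normal lag.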