ARTIFACTORY: How to tune federated repository binary sync configuration

Matthew Wang
2023-01-22 11:07

Description

Federated repository sync works by syncing the metadata for artifacts first. The source instance pushes the artifact metadata to the target instance. Because of this, the artifact appears in the target instance's UI and is resolvable as soon as the metadata arrives.

The corresponding binary, on the other hand, is pulled by the target instance from the source instance. However, the binary is not pulled over immediately when the metadata is pushed. Instead, when the metadata is pushed, a reference to the corresponding binary is added to a table called “binary_tasks” in the target instance’s database.

If the artifact is requested on the target instance before its binary has been pulled over, the binary is pulled on demand. Otherwise, a job on the target instance periodically queries its “binary_tasks” table and pulls the pending binaries from the source instance.
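
The flow described above can be illustrated with a small toy model. This is only a conceptual sketch, not Artifactory code: the class name TargetInstance and its methods are hypothetical, the deque merely stands in for the “binary_tasks” table, and leaving an already-served task in the queue for the job to skip is an assumption of the model rather than documented behavior.

```python
from collections import deque


class TargetInstance:
    """Toy model of the target side of a federated repository (hypothetical)."""

    def __init__(self, source_binaries):
        # Stands in for the binaries stored on the source instance.
        self.source_binaries = source_binaries
        self.metadata = set()        # artifacts visible and resolvable on the target
        self.binary_tasks = deque()  # stands in for the binary_tasks table
        self.local_binaries = {}     # binaries already pulled to the target

    def receive_metadata(self, path):
        """The source pushes metadata; the binary pull is only queued."""
        self.metadata.add(path)
        self.binary_tasks.append(path)

    def _pull(self, path):
        # The target pulls the binary from the source.
        self.local_binaries[path] = self.source_binaries[path]

    def download(self, path):
        """A client request pulls the binary on demand if it is not local yet."""
        if path not in self.metadata:
            raise KeyError(path)
        if path not in self.local_binaries:
            self._pull(path)
        return self.local_binaries[path]

    def run_sync_job(self):
        """The periodic job drains pending tasks and returns how many it pulled."""
        pulled = 0
        while self.binary_tasks:
            path = self.binary_tasks.popleft()
            if path not in self.local_binaries:  # skip binaries served on demand
                self._pull(path)
                pulled += 1
        return pulled
```

In this model, an artifact is resolvable as soon as receive_metadata has run, while its binary arrives either through download (on demand) or through run_sync_job (the periodic pull), whichever happens first.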

You may need to tune how the target instance pulls binaries.

Resolution

You can add the “federated-repo” provider section shown below to your $JFROG_HOME/var/etc/artifactory/binarystore.xml and tune its properties.

<?xml version="1.0" encoding="UTF-8"?>
<config version="1">
    <chain template="file-system"/>
    <provider id="federated-repo" type="federated-repo">
        <numberOfRemoteImporters>8</numberOfRemoteImporters>
        <numberOfLocalImporters>12</numberOfLocalImporters>
        <maxRetry>20</maxRetry>
        <maximumIdleTimeMs>60000</maximumIdleTimeMs>
        <errorRecoveryInterval>30000</errorRecoveryInterval>
        <maximumExecTimeMs>3600000</maximumExecTimeMs>
    </provider>
</config>

The properties and their explanations are listed below:

numberOfRemoteImporters:
Number of workers that download binaries during synchronization. Increasing the number of workers can help when syncing many small files. Default: 6.

numberOfLocalImporters:
Number of workers that download binaries on demand. Default: 6.

maxRetry:
Number of attempts to sync a binary before giving up. Can be increased so that failed binaries continue to be reprocessed. Default: 10.

maximumIdleTimeMs:
Wait time between taking tasks, in milliseconds. Default: 60000 (1 minute).

errorRecoveryInterval:
Interval after which another node can take over a task. Default: 30 seconds.

maximumExecTimeMs:
Time a single node can hold a task, in milliseconds. Default: 3600000 (60 minutes).
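
To review which values are in effect before or after editing the file, a short script can read the provider section and fill in the defaults listed above. This is a minimal sketch using only the Python standard library. The function name federated_repo_settings is hypothetical. The sketch assumes that an omitted property falls back to its listed default, and that errorRecoveryInterval is expressed in milliseconds, as the example value 30000 suggests.

```python
import xml.etree.ElementTree as ET

# Defaults as listed in this article, expressed in the units used by the XML example.
DEFAULTS = {
    "numberOfRemoteImporters": 6,
    "numberOfLocalImporters": 6,
    "maxRetry": 10,
    "maximumIdleTimeMs": 60_000,       # 1 minute
    "errorRecoveryInterval": 30_000,   # 30 seconds (milliseconds assumed)
    "maximumExecTimeMs": 3_600_000,    # 60 minutes
}


def federated_repo_settings(xml_text):
    """Return the configured value for each property, or its default if absent."""
    root = ET.fromstring(xml_text)
    # The provider is a direct child of <config> in the example above.
    provider = root.find("./provider[@type='federated-repo']")
    settings = dict(DEFAULTS)
    if provider is not None:
        for name in DEFAULTS:
            value = provider.findtext(name)
            if value is not None:
                settings[name] = int(value.strip())
    return settings


if __name__ == "__main__":
    import sys

    # Usage: python check_binarystore.py /path/to/binarystore.xml
    with open(sys.argv[1], "rb") as handle:
        for key, value in federated_repo_settings(handle.read()).items():
            print(f"{key} = {value}")
```

The script only reads the file; it does not verify that Artifactory accepted the configuration.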