What is the difference between “s3” and “cluster-s3” filestore chains?

Ariel Kabov
2019-08-18 08:59

Relevant Versions: Artifactory 5 & 6.

When integrating Artifactory with a cloud binary provider such as S3, the documentation offers two chain templates to choose from: “s3” and “cluster-s3”. Here we will focus on the differences between them, and on how to choose the right configuration.

*This article uses S3 as an example; the same information applies to the differences between “google-storage” & “cluster-google-storage”, and between “azure-blob-storage” & “cluster-azure-blob-storage”.

Running a standalone Artifactory server?

The benefits of the “cluster” chains apply only when running Artifactory in High Availability.
Therefore, standalone setups that work with cloud storage providers should use the non-cluster chain templates.

System Requirements

The “cluster” chains do not require anything beyond running in an HA cluster (and an Enterprise Artifactory license).
The non-cluster chains, when used in an HA setup, require a shared mount (NFS/NAS/SAN) between all Artifactory nodes.

The "non-cluster" template and how it works

<chain template="s3"/>

The "s3" chain stands for the following configuration:    <chain template="s3">
        <provider id="cache-fs" type="cache-fs">
            <provider id="eventual" type="eventual">
                <provider id="retry" type="retry">
                    <provider id="s3" type="s3"/>
                </provider>
            </provider>
        </provider>
    </chain>

As mentioned above, when the non-cluster template is used in an HA cluster, a shared mount is required between all HA nodes to share the Artifactory “data” directory.
The shared “data” directory has to be configured in $ARTIFACTORY_HOME/etc/ha-node.properties:

    artifactory.ha.data.dir=/mnt/shared/artifactory/ha-data

The template above configures 4 layers of storage providers.

  1. Cache-FS – Should not be shared between nodes, to ensure good performance for frequently requested artifacts.
  2. Eventual – Created under the configured “artifactory.ha.data.dir”, this is the shared layer between all cluster nodes. The “eventual” directory contains 3 subdirectories: “_pre”, “_add” and “_delete”. These folders are effectively queues of events waiting to be transmitted to the next provider, the cloud storage provider.
  3. Retry – Responsible for retrying operations against the next provider in case of failures.
  4. S3 – The cloud storage provider.

When a non-cluster template is used, the primary node is responsible for dispatching the events from the “eventual” folder to the S3 bucket.
Its uptime is therefore important to ensure the “eventual” folder does not grow out of control.

Upon Downloading

  1. The node which received the download request will first check if the file is available in the Cache-FS layer. 
  2. If it is not available there, the node will check if the file exists in the “eventual” folder. 
  3. Otherwise, it will download the file from the cloud provider and then serve it to the client (see the sketch below). 
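
To make this lookup order concrete, here is a minimal Python sketch of the read cascade; the paths, on-disk layout, and function names are illustrative assumptions, not Artifactory’s actual implementation:

    import os

    CACHE_DIR = "/var/opt/jfrog/artifactory/data/cache"             # cache-fs layer (assumed path)
    EVENTUAL_ADD = "/mnt/shared/artifactory/ha-data/eventual/_add"  # shared eventual queue (assumed path)

    def download_from_s3(sha1):
        raise NotImplementedError("stub: GET the binary from the S3 bucket")

    def read_binary(sha1):
        """Resolve a binary by checksum, walking the provider chain top-down."""
        cached = os.path.join(CACHE_DIR, sha1)
        if os.path.exists(cached):         # 1. Cache-FS: fastest, local to the node
            return open(cached, "rb")
        queued = os.path.join(EVENTUAL_ADD, sha1)
        if os.path.exists(queued):         # 2. Eventual: file may still be queued for S3
            return open(queued, "rb")
        return download_from_s3(sha1)      # 3. S3: fall through to the cloud provider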

The Direct Cloud Storage Download feature allows you to skip the step in which Artifactory downloads the file from the cloud storage and then serves it to the client.

Upon Uploading

  1. The node which received the upload request will stream the file to the eventual “_pre” folder. 
  2. Once the node has fully received the file, it will move it from “_pre” to “_add”, and only then close the upload request. As soon as the file reaches the “_add” folder, it is available for download by all other HA members (a sketch of this two-phase write follows the list). 
  3. The primary node periodically checks for new files in the “eventual” folder and uploads them to S3.
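
A minimal sketch of this two-phase write, assuming hypothetical paths (Artifactory’s internal API differs):

    import os

    EVENTUAL = "/mnt/shared/artifactory/ha-data/eventual"  # shared between all HA nodes (assumed path)

    def receive_upload(stream, sha1):
        """Stage the incoming file in _pre, then publish it to _add."""
        pre_path = os.path.join(EVENTUAL, "_pre", sha1)
        add_path = os.path.join(EVENTUAL, "_add", sha1)
        with open(pre_path, "wb") as f:
            for chunk in stream:
                f.write(chunk)           # 1. stream the full file into _pre
        os.rename(pre_path, add_path)    # 2. atomic move: now visible to all HA members
        # 3. the primary node's dispatcher later uploads _add entries to S3

Because “_pre” and “_add” live on the same mount, the move in step 2 is an atomic rename rather than a copy, so other nodes never see a half-written file.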

Best Practices:

  • A large local Cache-FS partition is important to ensure frequently requested artifacts are served as quickly as possible.
  • Primary node uptime is required for artifacts to be uploaded to S3.
  • Monitor the number of files in the “eventual” subfolders, so that in case of a failure you, as an administrator, are notified in time (see the example check below). 
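
For example, a simple check along these lines (the path and threshold are assumptions chosen for illustration) can be run by your monitoring system to alert when a queue starts growing:

    import os

    EVENTUAL = "/mnt/shared/artifactory/ha-data/eventual"  # assumed shared data dir
    THRESHOLD = 1000  # number of pending files that should trigger an alert

    for sub in ("_pre", "_add", "_delete"):
        folder = os.path.join(EVENTUAL, sub)
        # a steadily growing "_add" usually means the primary node is down
        # or cannot reach the S3 bucket
        count = sum(len(files) for _, _, files in os.walk(folder))
        print(f"{sub}: {count} pending files")
        if count > THRESHOLD:
            print(f"WARNING: {sub} exceeds {THRESHOLD} files; check the primary node and S3 connectivity")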

binarystore.xml example:

<config version="v1">
    <chain template="s3">
        <provider id="cache-fs" type="cache-fs">
            <provider id="eventual" type="eventual">
                <provider id="retry" type="retry">
                    <provider id="s3" type="s3"/>
                </provider>
            </provider>
        </provider>
    </chain>

    <provider id="cache-fs" type="cache-fs">
        <cacheProviderDir>/var/opt/jfrog/artifactory/data/cache</cacheProviderDir>
        <maxCacheSize>100000000000</maxCacheSize>
    </provider>

<provider id="eventual" type="eventual">
        <numberOfThreads>10</numberOfThreads>  
        <timeout>180000</timeout>
        <dispatcherInterval>5000</dispatcherInterval>
    </provider>

    <provider id="retry" type="retry">
        <maxTrys>10</maxTrys>
        <interval>1000</interval>
    </provider>

    <provider id="s3" type="s3">
       <endpoint>http://s3.amazonaws.com</endpoint>
       <identity>[ENTER IDENTITY HERE]</identity>
       <credential>[ENTER CREDENTIALS HERE]</credential>
       <path>[ENTER PATH HERE]</path>
       <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>

</config>

 

The "cluster" template and how it works

<chain template="cluster-s3"/>

The "cluster-s3" chain stands for the following configuration:    <chain> <!--template="cluster-s3"-->
        <provider id="cache-fs-eventual-s3" type="cache-fs">
            <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
                <sub-provider id="eventual-cluster-s3" type="eventual-cluster">
                    <provider id="retry-s3" type="retry">
                        <provider id="s3" type="s3"/>
                    </provider>
                </sub-provider>
                <dynamic-provider id="remote-s3" type="remote"/>
            </provider>
        </provider>
    </chain>

The main difference between the “cluster” chain configuration and the non-cluster one is that with the “cluster” chain, each Artifactory node uses its own local storage for the “eventual” provider, so a shared mount is not needed.
Here each Artifactory node manages its own queue and dispatches its own events to the cloud storage.

This chain configures the following provider layers:

  1. Cache-FS
  2. Sharding-Cluster – Clusters the “eventual” layer so that Artifactory can recognize files which are persisted on other nodes.
  3. Eventual-Cluster – A clustered version of the “eventual” provider. Located at $ARTIFACTORY_HOME/data/eventual, it has 2 subfolders: “_pre” & “_queue”.
  4. Retry – Responsible for retrying operations against the next provider in case of failures.
  5. S3 – The cloud storage provider.
  6. Remote – Responsible for communication with the other HA member nodes.

Since in this configuration binaries are stored locally rather than on a shared mount, they are accessible only while Artifactory is up and running (it is not enough for the host to be up). 
To ensure data is always available, it is important to familiarize yourself with the following 2 parameters of the “sharding-cluster” provider (see the worked example below):

redundancy – Default: 2. The number of copies that should be stored for each binary. Although the “eventual” is only a transient directory, we usually want at least 2 copies of each file in the “eventual” to ensure all files are always available. For example, during a rolling restart we may take down a node that still has files in its “eventual”; if the same file exists on another node, it will be served from there.
lenientLimit – Default: 1. The minimum number of copies that must be stored for the upload to be successful. If set to 0, the full configured redundancy must be met. 
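
As a worked example of how these two parameters interact, here is a hypothetical acknowledgement check (an illustration of the semantics, not Artifactory’s code):

    def upload_ok(copies_written, redundancy=2, lenient_limit=1):
        """Return True if enough 'eventual' copies exist to acknowledge an upload.

        lenient_limit == 0 means strict mode: the full redundancy is required.
        """
        required = redundancy if lenient_limit == 0 else lenient_limit
        return copies_written >= required

    # redundancy=2, lenientLimit=1: one copy is enough to acknowledge the upload;
    # the second copy can be completed in the background.
    assert upload_ok(1, redundancy=2, lenient_limit=1)
    # lenientLimit=0: both copies must exist before the upload succeeds.
    assert not upload_ok(1, redundancy=2, lenient_limit=0)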

Upon Downloading

  1. The node which received the download request will first check if the file is available in the Cache-FS layer. 
  2. If it is not available there, the node will check if the file exists in its local “eventual” folder. 
  3. If not, it will use the “Remote” provider to check whether the file is available on one of the other member nodes. 
  4. Otherwise, it will download the file from the cloud provider and then serve it to the client (see the sketch below). 
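
The cascade is the same as in the non-cluster case, with one extra step between the local queue and the cloud provider. A hypothetical sketch (stubbed helpers, assumed paths):

    import os

    LOCAL_QUEUE = "/var/opt/jfrog/artifactory/data/eventual/_queue"  # this node's own queue (assumed path)

    def check_cache(sha1):
        return None  # stub: look up the local cache-fs layer

    def fetch_from_peer(peer, sha1):
        return None  # stub: ask another node's "remote" endpoint for the binary

    def download_from_s3(sha1):
        raise NotImplementedError("stub: GET the binary from the S3 bucket")

    def read_binary_cluster(sha1, peers):
        """Resolve a binary in a cluster-s3 chain: cache, local queue, peers, then S3."""
        hit = check_cache(sha1)                  # 1. Cache-FS
        if hit:
            return hit
        local = os.path.join(LOCAL_QUEUE, sha1)
        if os.path.exists(local):                # 2. this node's own eventual queue
            return open(local, "rb")
        for peer in peers:                       # 3. "remote" provider: other HA nodes
            hit = fetch_from_peer(peer, sha1)
            if hit:
                return hit
        return download_from_s3(sha1)            # 4. finally, the cloud provider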

The Direct Cloud Storage Download feature allows you to skip the step in which Artifactory downloads the file from the cloud storage and then serves it to the client.

Upon Uploading

  1. The node which received the upload request will stream the file to the eventual “_pre” folder. 
  2. Once the node has fully received the file, it will move it from “_pre” to “_queue”, and only then close the upload request. If “lenientLimit” and “redundancy” are set to more than 1, it will first ensure that the binary exists in the “eventual” folders of at least “n” nodes of the HA cluster (“n” being the configured “lenientLimit”). 
  3. Once the file is in the “_queue” folder, it is available for download by all other HA members. 
  4. Each node checks every 1 second (by default; configurable) whether there are any new files to be handled in its “eventual” folder, and uploads them to S3 (see the dispatcher sketch below).
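
A minimal sketch of such a per-node dispatcher loop, combining the “eventual-cluster” and “retry” behaviors (the intervals mirror the example configuration below; everything else, including the path and helper, is an illustrative assumption):

    import os
    import time

    QUEUE = "/var/opt/jfrog/artifactory/data/eventual/_queue"  # assumed path
    DISPATCHER_INTERVAL = 1.0  # seconds, cf. <dispatcherInterval>1000</dispatcherInterval>
    MAX_TRYS = 10              # cf. <maxTrys> in the retry provider
    RETRY_INTERVAL = 1.0       # seconds, cf. <interval>1000</interval>

    def upload_to_s3(path):
        print("uploading", path)  # stub: replace with a real PUT to the S3 bucket

    def dispatch_forever():
        """Each node drains its own queue; no primary node is involved."""
        while True:
            names = os.listdir(QUEUE) if os.path.isdir(QUEUE) else []
            for name in names:
                path = os.path.join(QUEUE, name)
                for attempt in range(MAX_TRYS):       # the retry provider's job
                    try:
                        upload_to_s3(path)
                        os.remove(path)               # uploaded: dequeue the file
                        break
                    except OSError:
                        time.sleep(RETRY_INTERVAL)    # wait before the next attempt
            time.sleep(DISPATCHER_INTERVAL)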

Best Practices:

  1. A large local Cache-FS partition is important to ensure frequently requested artifacts are served as quickly as possible.
  2. Monitor the number of files in the “eventual” subfolders, so that in case of a failure you, as an administrator, are notified in time. 
  3. If “lenientLimit” is set to 1, take an Artifactory node out of the Load-Balancer for at least a minute before restarting it. This ensures that the redundancy of recently deployed artifacts is maintained.

binarystore.xml example:

<config version="2">
    <chain> <!--template="cluster-s3"-->
        <provider id="cache-fs-eventual-s3" type="cache-fs">
            <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
                <sub-provider id="eventual-cluster-s3" type="eventual-cluster">
                    <provider id="retry-s3" type="retry">
                        <provider id="s3" type="s3"/>
                    </provider>
                </sub-provider>
                <dynamic-provider id="remote-s3" type="remote"/>
            </provider>
        </provider>
    </chain> 

    <provider id="cache-fs-eventual-s3" type="cache-fs">
        <maxCacheSize>100000000000</maxCacheSize>
    </provider>
  
    <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
        <readBehavior>crossNetworkStrategy</readBehavior>
        <writeBehavior>crossNetworkStrategy</writeBehavior>
        <redundancy>2</redundancy>
        <lenientLimit>1</lenientLimit>
        <property name="zones" value="local,remote"/>
    </provider>
 
    <provider id="eventual-cluster-s3" type="eventual-cluster">
        <maxWorkers>10</maxWorkers>
        <dispatcherInterval>1000</dispatcherInterval>
        <checkPeriod>15000</checkPeriod>
        <addStalePeriod>300000</addStalePeriod>
        <zone>local</zone>
    </provider>

    <provider id="remote-s3" type="remote">
        <checkPeriod>15000</checkPeriod>
        <connectionTimeout>5000</connectionTimeout>
        <socketTimeout>15000</socketTimeout>
        <maxConnections>200</maxConnections>
        <connectionRetry>2</connectionRetry>
        <zone>remote</zone>
    </provider>

    <provider id="s3" type="s3">
       <endpoint>http://s3.amazonaws.com</endpoint>
       <identity>[ENTER IDENTITY HERE]</identity>
       <credential>[ENTER CREDENTIALS HERE]</credential>
       <path>[ENTER PATH HERE]</path>
       <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>

</config>