How to Onboard to a Federated Repository

Scaling up your development organization typically involves spreading development across multiple locations around the globe. One of the key challenges with multisite development is ensuring reliable access to required software packages and artifacts for teams collaborating across time zones. The JFrog Software Supply Chain Platform solves this challenge with federated repositories in JFrog Artifactory.

What are federated repositories?

Federated repositories are created when two or more repositories of the same package type are connected via federation to enable automatic, full bi-directional mirroring across different JFrog platform deployments (JPDs) or JFrog Artifactory instances (Figure 1). The same synchronization also supports geo-synchronized environments and active-active disaster recovery (DR) environments.

Figure 1. Bidirectional synchronization with federated repositories.

Benefits of continuously synced repositories

Federated repositories make it easy for distributed teams to work together by giving them access to the same set of artifacts, builds, and packages. This configuration eliminates the need for complex replication setups and for rules about when artifacts are pushed or pulled from one repository to another. All in all, it makes sharing components across multiple JPDs or development sites much easier to manage and maintain.

Before getting started with federated repositories

There are certain infrastructure elements to consider before starting to use federated repositories. They include:

  • Network speed: Federation relies heavily on the network infrastructure, so consider the network speed between the two JPDs when configuring it.
  • Disk size: While federation is running, temporary files are created on the root disk, so make sure enough disk space is available.
  • Managing load: Federation adds extra load on Artifactory, so admins must consider what needs to be federated, the size of the artifacts, and the number of artifacts to be federated. Regular monitoring of DB resources (e.g., CPU, connections, long-running queries) and infrastructure resources (e.g., CPU, memory, storage, JVM parameters) is recommended.

There are also a handful of prerequisites before being able to connect repositories in a federation:

  • The appropriate subscription/license level (i.e., Enterprise X and above).
  • Artifactory versions earlier than 7.49.3 must be identical between federated members.
  • Before creating federated repositories, it’s mandatory to configure a custom base URL for Artifactory.
  • A circle of trust must be set between the Artifactory instances. There are two ways to establish a circle of trust:

Once you’ve reviewed your infrastructure and confirmed that your JPDs meet the prerequisites above, you’re ready to start setting up your repository federation.

Best practices for setting up federated repositories

Let’s take a look at onboarding best practices for federated repositories in three scenarios:

  1. Newly created federated repositories
  2. Moving from push/pull replication to federated repositories
  3. Federating a large repository to an existing member site

Newly created federated repositories

In a large-scale enterprise, start slow by configuring the federation for small repositories first, to validate the federation speed and to watch for any system performance anomalies.

After confirming that the limitations are acceptable and the resources are sufficient, start the federation for the other repositories in batches.

The federation can be scheduled based on the technologies, or based on the team structure. For example, the federation can be performed batch-wise first on generic repositories, moving to Maven, Docker and so on. Alternatively, the federation can be based on individual teams. For example, start performing federation for Team 1 (i.e., copy/sync all of the repositories for Team 1) before moving to Team 2, and then Team 3.
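As a rough illustration of the batch-by-technology approach, the sketch below orders a repository inventory into rollout batches. The inventory format, the repository names, and the `plan_batches` helper are all hypothetical; it only prints a plan and does not call Artifactory.

```shell
# Hypothetical inventory: one "<repo-key> <package-type>" pair per line.
# plan_batches prints the repositories grouped in the order the types are given.
plan_batches() {
  pb_order=$1          # rollout order, e.g. "generic maven docker"
  pb_inventory=$(cat)  # inventory is read from stdin
  pb_batch=1
  for pb_type in $pb_order; do
    printf '%s\n' "$pb_inventory" |
      awk -v t="$pb_type" -v b="$pb_batch" '$2 == t { print "batch " b ": " $1 }'
    pb_batch=$((pb_batch + 1))
  done
}

# Example: generic repositories first, then Maven, then Docker.
plan_batches "generic maven docker" <<'EOF'
team1-docker-local docker
team1-generic-local generic
team2-maven-local maven
EOF
# prints:
#   batch 1: team1-generic-local
#   batch 2: team2-maven-local
#   batch 3: team1-docker-local
```

To batch by team instead, the same idea applies with a team name in the second column of the inventory.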

If time is constrained and the aim is to have an active sync of all repositories from JPD1 to JPD2, the following steps can be followed:

  1. Copy the filestore of JPD1 to JPD2.
  2. Perform the migration of the configurations from JPD1 to JPD2 by following the methods in this KB article.
  3. This migration will create duplicate service IDs for Artifactory and Access in JPD2. Therefore, reach out to JFrog support for assistance changing the service IDs.

Once the migration is completed and the service IDs are changed, the local repositories can be converted to federated repositories, and the federation can be established to synchronize the remaining delta.
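The conversion step can be scripted. The sketch below assumes the local-to-federated conversion endpoint exposed by the Artifactory REST API as `POST api/federation/migrate/{repoName}`; confirm the path against the REST API documentation for your version before using it. The repository names and the `ART_USER` and `ART_TOKEN` variables are placeholders. As written, it only prints the requests it would send.

```shell
# Base URL of the JPD whose local repositories are being converted (placeholder).
ART_URL=${ART_URL:-https://myartifactory.jfrog.com/artifactory}

# Build the conversion URL for one repository.
# Assumption: the endpoint is api/federation/migrate/<repoName>.
migrate_url() {
  printf '%s/api/federation/migrate/%s\n' "$ART_URL" "$1"
}

# Send the conversion request; ART_USER and ART_TOKEN are hypothetical
# environment variables holding admin credentials.
convert_to_federated() {
  curl -fsS -u "$ART_USER:$ART_TOKEN" -X POST "$(migrate_url "$1")"
}

# Dry run: list what would be converted, starting with the smallest repositories.
for repo in team1-generic-local team1-maven-local; do
  echo "would POST $(migrate_url "$repo")"
  # convert_to_federated "$repo"   # uncomment once the plan looks right
done
```

Keeping the dry run separate from the real call makes it easier to review each batch before converting it.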

Moving from push/pull replication to federated repositories

In scenarios where push or pull replication is already configured to keep the repositories in JPD1 and JPD2 in sync, converting to federation is relatively easy because the data should already be synced.

Again, start slow by converting the least-used local repositories to federated repositories and monitor the above-mentioned infrastructure resources.

If a global DNS is used between JPD1 and JPD2 and both JPDs share the same custom base URL, configure the federated base URL, rather than relying on the custom base URL, in the config descriptor as mentioned here in this wiki.

Federating a large repository to an existing member site

In this use case, a single large repository in JPD1 needs to be federated to JPD2, and no federation exists yet between the two.

Start by setting up a push replication of this large repository to a local repository in JPD2, so that the push replication copies all of the repository binaries from JPD1 to JPD2.

Push replication is suggested because it is unidirectional and will not add as much load as a federated repository, whose sync is bi-directional.

Once the push replication has copied all of the data for this large repository from JPD1 to JPD2, convert the large local repository to a federated repository and monitor the above-mentioned infrastructure resources.

Items to be aware of during the federation process and tuning parameters

When leveraging federated repositories, there are some potential concerns to be aware of:

  • Metadata files such as maven-metadata.xml are not federated. Therefore, the number of files will differ between JPD1 and JPD2.
  • For Docker repositories, the files under the _uploads folder are not federated, so the number of artifacts will not match between JPD1 and JPD2.
  • When the federation of a large repository fails and a read timeout is seen in the logs, increase the timeout as mentioned here in this wiki by adding the artifactory.mirror.http.client.socket.timeout.mili=200000 parameter to the artifactory.system.properties file.
  • Similarly, increase the timeouts on the reverse proxy/load balancer, since federation traffic is routed through them.
  • As federation adds some load to Artifactory, it’s recommended to tune your Artifactory instance as indicated in this KB article.

After tuning Artifactory, the federated repository settings can also be tuned as follows:

  • Increase the federated repository settings in binarystore.xml:
    <provider id="federated-repo" type="federated-repo">
        <numberOfRemoteImporters>40</numberOfRemoteImporters>
        <numberOfLocalImporters>10</numberOfLocalImporters>
        <errorRecoveryInterval>35000</errorRecoveryInterval>
        <maxRetry>10</maxRetry>
    </provider>
  • Tune the federated repository parameters in artifactory.system.properties:
    # default is 10000
    artifactory.federated.repo.executor.poolMaxQueueSize=20000
    # default is 5
    artifactory.federated.max.config.threads=20
    # default is 50
    artifactory.federated.repo.max.total.http.connections=70
    # default is 10
    artifactory.federated.repo.max.threads.percent=20
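When these properties are applied across several JPDs, a small idempotent helper avoids duplicate entries in artifactory.system.properties. The sketch below is one possible approach, not a JFrog-provided tool: `set_property` is a hypothetical helper, and the example runs against a temporary stand-in file rather than a real installation.

```shell
# set_property FILE KEY VALUE
# Replaces an existing "KEY=..." line or appends one, so reruns are safe.
set_property() {
  sp_file=$1; sp_key=$2; sp_value=$3
  sp_tmp=$(mktemp) || return 1
  # Keep every existing line except a previous assignment of the same key.
  if [ -f "$sp_file" ]; then
    awk -v k="$sp_key=" 'index($0, k) != 1' "$sp_file" > "$sp_tmp"
  fi
  printf '%s=%s\n' "$sp_key" "$sp_value" >> "$sp_tmp"
  # Overwrite in place so an existing file keeps its owner and permissions.
  cat "$sp_tmp" > "$sp_file" && rm -f "$sp_tmp"
}

# Example against a temporary stand-in for artifactory.system.properties.
PROPS=$(mktemp)
set_property "$PROPS" artifactory.mirror.http.client.socket.timeout.mili 200000
set_property "$PROPS" artifactory.federated.max.config.threads 20
set_property "$PROPS" artifactory.federated.max.config.threads 20   # rerun: no duplicate
cat "$PROPS"
rm -f "$PROPS"
```

Back up the real properties file before pointing the helper at it, and apply the same values on every federation member so the sites behave consistently.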

Monitoring the federated repository sync

Federated repository synchronization can be monitored by querying the status of the federation with the Federated Repository sync status REST API:

$ curl -u admin -XGET "https://myartifactory.jfrog.com/artifactory/api/federation/status/repo/<example-repo-local>"

Monitor the synchronization lag between federation members using this REST API:

$ curl -u admin -XGET "https://myartifactory.jfrog.com/artifactory/api/federation/status/mirrorsLag"
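For ongoing monitoring, the two status calls can be wrapped in a simple polling loop. The sketch below assumes both endpoints live under the lowercase `api/federation/status` path. The `ART_USER` and `ART_TOKEN` variables, the helper names, and the 60-second default interval are placeholders. It prints the raw responses without assuming anything about their JSON structure.

```shell
# Base URL of the JPD to monitor (placeholder).
ART_URL=${ART_URL:-https://myartifactory.jfrog.com/artifactory}

# URL builders for the two status endpoints.
status_url() { printf '%s/api/federation/status/repo/%s\n' "$ART_URL" "$1"; }
lag_url()    { printf '%s/api/federation/status/mirrorsLag\n' "$ART_URL"; }

# poll_federation REPO [INTERVAL_SECONDS]
# Prints a UTC timestamp followed by the raw responses, until interrupted.
poll_federation() {
  pf_repo=$1
  pf_interval=${2:-60}
  while :; do
    date -u +%Y-%m-%dT%H:%M:%SZ
    # ART_USER and ART_TOKEN are hypothetical environment variables.
    curl -fsS -u "$ART_USER:$ART_TOKEN" "$(status_url "$pf_repo")"; echo
    curl -fsS -u "$ART_USER:$ART_TOKEN" "$(lag_url)"; echo
    sleep "$pf_interval"
  done
}

# Show the URLs that would be polled; start the loop manually with:
#   poll_federation example-repo-local 60
status_url example-repo-local
lag_url
```

Redirecting the loop's output to a log file gives a rough timeline of sync status to compare against the infrastructure metrics discussed earlier.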

Get started with federated repositories today

If you’re looking for a way to keep multiple instances of JFrog Artifactory in sync as part of the JFrog Platform, consider using Artifactory’s Federated Repository functionality. These tips will help you get started, and if you need more guidance, our support team is here to help.

If you don’t use Artifactory today, you can give it a try for free.