How smart is your remote repository?

Note: This blog post was updated on June 14, 2021.

The ability to proxy remote repositories and cache external artifacts from them is crucial, whether they are Docker images, NuGet packages, npm tar.gz files or any of the other dependencies we use to create our own products. It speeds up our builds, ensures reliable access, gives control over the bill of materials and offers many more benefits, making it a practice you cannot live without in today’s CI/CD domain.

When working with Artifactory we will normally have one remote repo pointing to JCenter and others pointing to additional relevant, public repositories such as Docker Hub, NuGet Gallery, npm registry, PyPI and more. Complemented by the ability to configure virtual repositories, we provide our build tools, users and different clients a single endpoint from which to resolve all required artifacts, first looking through the local and cached items, and then searching remotely.
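
To make the resolution order concrete, here is a minimal sketch of a virtual repository that checks local repositories first, then remote caches, and only then goes upstream. This is a toy in-memory model, not Artifactory's implementation: the class names (`LocalRepo`, `RemoteRepo`, `VirtualRepo`) and the `upstream_hits` counter are made up for illustration.

```python
from typing import Dict, List, Optional


class LocalRepo:
    """Toy local repository: an in-memory map of path -> content."""

    def __init__(self, name: str, artifacts: Optional[Dict[str, bytes]] = None) -> None:
        self.name = name
        self.artifacts = dict(artifacts or {})

    def get(self, path: str) -> Optional[bytes]:
        return self.artifacts.get(path)


class RemoteRepo:
    """Toy remote repository: proxies an upstream store and caches what it fetches."""

    def __init__(self, name: str, upstream: Dict[str, bytes]) -> None:
        self.name = name
        self.upstream = upstream          # stands in for the external registry
        self.cache: Dict[str, bytes] = {}
        self.upstream_hits = 0            # counts simulated network round trips

    def get_cached(self, path: str) -> Optional[bytes]:
        return self.cache.get(path)

    def fetch(self, path: str) -> Optional[bytes]:
        self.upstream_hits += 1
        data = self.upstream.get(path)
        if data is not None:
            self.cache[path] = data       # later requests are served from the cache
        return data


class VirtualRepo:
    """Single endpoint that aggregates local and remote repositories."""

    def __init__(self, locals_: List[LocalRepo], remotes: List[RemoteRepo]) -> None:
        self.locals = locals_
        self.remotes = remotes

    def resolve(self, path: str) -> Optional[bytes]:
        # 1. Local repositories.
        for local in self.locals:
            data = local.get(path)
            if data is not None:
                return data
        # 2. Items already cached by a remote repository.
        for remote in self.remotes:
            data = remote.get_cached(path)
            if data is not None:
                return data
        # 3. Only now search remotely.
        for remote in self.remotes:
            data = remote.fetch(path)
            if data is not None:
                return data
        return None
```

In this model, the second request for the same dependency never leaves the building: it is answered from the remote repository's cache, which is where the build speed-up comes from.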

This works very well when our dependencies are all specific release versions. Things get more complicated when you have geographically distributed teams working on the same project, or when you have different co-dependent projects that are constantly modified and need to stay up-to-date with each other’s snapshot versions.

Replication is the answer… or is it?

One solution is to use Artifactory’s ability to replicate repositories (pull for remote, push and multi-push for local), which is perfect if you need to stay in full sync. The sync can be scheduled with a cron expression or triggered by events. This will take care of actively downloading new artifacts, deleting the ones that have been removed remotely, and making sure the properties are always in sync.
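
As a rough mental model of what one full-sync pass has to work out, the sketch below compares a source and a target repository and produces the three kinds of work described above. It is a simplification under stated assumptions: repositories are plain dictionaries of path to properties, artifact content and checksums are ignored, and `SyncPlan` and `plan_pull_replication` are hypothetical names, not Artifactory APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Props = Dict[str, str]


@dataclass
class SyncPlan:
    download: List[str] = field(default_factory=list)           # new at the source
    delete: List[str] = field(default_factory=list)             # removed at the source
    update_properties: List[str] = field(default_factory=list)  # properties differ


def plan_pull_replication(source: Dict[str, Props], target: Dict[str, Props]) -> SyncPlan:
    """Work out what a full sync of `target` against `source` would have to do."""
    plan = SyncPlan()
    for path in sorted(source):
        if path not in target:
            plan.download.append(path)
        elif source[path] != target[path]:
            plan.update_properties.append(path)
    for path in sorted(target):
        if path not in source:
            plan.delete.append(path)
    return plan
```

Note that every path in both repositories gets visited on every pass, whether or not anyone will ever request it.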

The problem with replication is that it can get load- and bandwidth-intensive when repositories are laden with many artifacts. Wouldn’t it be great if you could get only those artifacts that you need for your teams to sync up, and avoid unnecessary load and network traffic? Well, guess what. You can. Let me introduce you to one of Artifactory’s latest features…

Smart remote repositories

In Artifactory, a remote repository is represented by the URL of the remote resource from which you download and cache artifacts. But what if that URL happens to point to a repository in another instance of Artifactory? This kind of kinship begs to be utilized. If my Artifactory is proxying a repository in another Artifactory, there’s no reason why these cousin instances shouldn’t talk to each other and do some smart things.

  1. Automatic detection
    So the first thing is that Artifactory recognizes its own kind, and if your remote repository’s URL points at another instance of Artifactory, you will be presented with a dialog on which you can configure how these two instances will interact.
  2. Sync properties
    Normally, once you have cached an artifact from a remote repository, you will not be aware if any of the properties annotating the artifact at the remote resource are changed. But with smart remote repositories you can. If you check this box in your remote repository configuration, every time there is a request to get an artifact’s properties, Artifactory will validate their values against the corresponding properties on the original artifact in the remote instance. Any changes to properties on the remote item (update, add, remove) will be automatically synced to your locally cached copy without you having to download the artifact again. So, for example, if you download an artifact whose status is “Release Candidate”, and the remote team building it later changes the status to “Integration Test Failed”, the status on your locally cached copy will be automatically updated next time you check if you’re good to go with that artifact. You don’t want to release with an artifact that fails integration testing, now, do you?
  3. Remote list browsing where you never thought possible
    Many of the package types supported in Artifactory do not offer list browsing for a variety of reasons. However, since smart remote repositories keep things in the family, Artifactory knows how to overcome this limitation and lets you browse remotely in places you always wanted to, but couldn’t, such as Docker, NuGet, npm, Bower, PyPI and many more.
  4. Delete indication
    This comes out of the box. No need to configure it. If an item you have downloaded and cached gets deleted from the remote Artifactory instance, your Artifactory UI will indicate it in the different places where you view your locally cached copy. This is something you need to know about because if you depend on this artifact, you don’t want to lose it next time you do some clean-up on your cache. You won’t be able to download the artifact from the remote Artifactory instance any more, so you might want to move it to a local repository for safekeeping.
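
If you prefer scripting to clicking through the dialog, the options above can also be expressed as a repository configuration document. The sketch below only builds and prints such a document; it does not talk to a server. The field names (`rclass`, `listRemoteFolderItems`, `contentSynchronisation`) follow Artifactory's repository configuration JSON as I understand it, so treat them as assumptions and check them against the documentation for your version. The helper name, the repository key and the example URL are all hypothetical.

```python
import json
from typing import Any, Dict


def smart_remote_config(
    key: str,
    url: str,
    package_type: str,
    sync_properties: bool = True,
    list_remote_items: bool = True,
) -> Dict[str, Any]:
    """Build a remote repository definition whose URL points at another Artifactory.

    Field names are assumed from Artifactory's repository configuration JSON;
    verify them against your version before relying on them.
    """
    return {
        "key": key,
        "rclass": "remote",
        "packageType": package_type,
        "url": url,                                   # repository in the other instance
        "listRemoteFolderItems": list_remote_items,   # remote list browsing
        "contentSynchronisation": {
            "enabled": True,                          # treat the upstream as a smart remote
            "properties": {"enabled": sync_properties},
        },
    }


if __name__ == "__main__":
    config = smart_remote_config(
        key="libs-snapshot-smart",                    # hypothetical repository key
        url="https://artifactory.example.com/artifactory/libs-snapshot-local",
        package_type="maven",
    )
    print(json.dumps(config, indent=2))
```

A document like this would typically be submitted through the repository configuration REST API or pasted into whatever provisioning tool you already use. Delete indication is left out on purpose, since it needs no configuration.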

This is just the beginning. You can look forward to features like synchronization of download stats, transitivity sync when chaining multiple Artifactory instances, executing AQL searches on remote repositories, pushing artifacts remotely and much more as smart remote repositories just keep getting smarter.