ARTIFACTORY: Uncovering Virtual Repository Resolution

Patrick Russell
2022-10-18 09:30

Sometimes you'll get an error when trying to download an artifact from a Virtual Repository, but the same artifact downloads fine when you request it directly from the underlying Local or Remote repository. If bypassing the Virtual Repository avoids the error, the problem most likely lies in the Virtual's resolution logic.

In these cases, it's helpful to run a special API to diagnose the problem.

The "?trace" API

Ever since Artifactory introduced Virtual repositories, it has included a special debug API that prints their artifact resolution logic. The API lets you see the logic behind a download, including the outgoing requests made to remote sources like Maven Central.

To see the debug information, append the "?trace" token to the end of a download URL that goes through a Virtual:

curl -u <USER> "https://artifactory.com/libs-snapshot/jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom?trace"

Note: You cannot use the Trace API on package manager paths. For example, "api/pypi/pypi-virtual/[…]?trace" won't work. Try using the file path instead.

When you use this API, you receive a debug printout instead of the file. The printout shows the steps Artifactory took to find the file. This is very useful when troubleshooting a complex Virtual repository, because a very long resolution order can lead to unexpected results.

Make sure to authenticate when using this API. The ?trace API checks your credentials just like the Virtual would, so you can even see which repositories are skipped because your account doesn't have access to them.
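
If you want to script the request instead of using curl, a minimal Python sketch might look like the following. It assumes HTTP Basic authentication and the placeholder host used in this article; build_trace_url and fetch_trace are hypothetical helper names, not part of any Artifactory client library.

```python
import base64
import urllib.request


def build_trace_url(base_url: str, repo: str, path: str) -> str:
    """Join the base URL, repository key and file path, then append ?trace."""
    return f"{base_url.rstrip('/')}/{repo}/{path.lstrip('/')}?trace"


def fetch_trace(url: str, user: str, password: str) -> str:
    """Send an authenticated GET and return the plain-text trace printout.

    urllib raises urllib.error.HTTPError if the server answers with an
    error status, for example when the credentials are rejected.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Example (not executed here because it needs a reachable server):
#   url = build_trace_url("https://artifactory.com/artifactory", "libs-snapshot",
#                         "com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom")
#   print(fetch_trace(url, "admin", "<PASSWORD>"))
```

Keeping the URL construction separate from the network call makes it easy to check that "?trace" lands on a file path rather than on a package manager path.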

Much of the printout is low-level debug information that isn't very helpful. Below are some key messages to look for. The examples cover three common results when resolving a file from a Virtual: finding a local file, a remote download, and a 404 Not Found event.

The examples were run against a Virtual (libs-snapshot) that contains two repositories: a small Local (libs-snapshot-local) and a Remote (maven-central) pointed at Maven Central.

Finding a local POM file in libs-snapshot-local

Artifactory's Virtual Repository system searches all the Local repositories first. If it finds the file, it will not query Maven Central.
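
The "first hit wins" behaviour can be pictured with a toy model. This is only an illustration that follows the order visible in the trace examples in this article (Local, then the Remote's cache, then the Remote itself). The resolve function and the in-memory file sets are hypothetical and are not Artifactory's implementation; for instance, the model does not cache a remote hit.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[str], bool]]


def resolve(path: str, steps: List[Step]) -> Tuple[Optional[str], List[str]]:
    """Check each repository in order and stop at the first one holding path."""
    searched: List[str] = []
    for name, contains in steps:
        searched.append(name)
        if contains(path):
            return name, searched
    return None, searched


# Hypothetical contents, named after the repositories used in this article.
local_files = {"jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom"}
cache_files: set = set()
upstream_files = {"com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom"}

STEPS: List[Step] = [
    ("libs-snapshot-local", lambda p: p in local_files),
    ("maven-central-cache", lambda p: p in cache_files),
    ("maven-central", lambda p: p in upstream_files),
]

hit, searched = resolve(
    "jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom", STEPS
)
print(hit, searched)
```

Because the Local holds the file, the loop returns before the cache or the upstream set is ever consulted.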

[Sample command usage]
curl -u admin "https://artifactory.com/libs-snapshot/jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom?trace"

Output:

2022-10-17T14:11:10.187-07:00 Executing any BeforeDownloadRequest user plugins that may exist

  • Start of the resolution process
  • The output prints any User Plugins that may block or modify the download request
  • There aren't any, so Artifactory continues the search

2022-10-17T14:11:10.188-07:00 Unable to find resource in libs-snapshot:jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom

  • Note: The file wasn't found directly in the virtual
  • Some virtuals, such as a Debian Virtual, actually keep metadata files within themselves
  • Debian Virtuals do this because the metadata is merged from Locals and Remotes during indexing

2022-10-17T14:11:10.189-07:00 Returning found resource from libs-snapshot-local:jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom

  • This search found the artifact in a Local so that file is returned
  • Maven Central is not queried, and the Remote Repo cache is not checked 

2022-10-17T14:11:10.190-07:00 Creating a resource handle from 'libs-snapshot-local:jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom'

  • The "resource handle" is the path to the direct file in the Local / Remote
  • It is passed to the parent Virtual classes so the download functions know what file to return

2022-10-17T14:11:10.216-07:00 Request succeeded

  • This means the request succeeded! 
  • Requesting the same path without "?trace" on the end should return the file found in libs-snapshot-local
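
If you save a printout to a file, a short script can pull out the repository that actually served the artifact. The sketch below assumes the line layout shown in the sample output above (a timestamp, a space, then the message); real printouts may contain other lines, which are skipped. parse_trace and resolved_from are hypothetical helper names.

```python
import re
from typing import List, Optional, Tuple

# A trace line is assumed to start with an ISO-style timestamp.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\S+)\s+(.*)$")
# The repository key is the text before the first colon inside the quotes.
HANDLE_RE = re.compile(r"Creating a resource handle from '([^:']+):([^']+)'")


def parse_trace(text: str) -> List[Tuple[str, str]]:
    """Split a printout into (timestamp, message) pairs, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


def resolved_from(text: str) -> Optional[Tuple[str, str]]:
    """Return (repository, path) from the 'resource handle' line, if present."""
    for _, message in parse_trace(text):
        match = HANDLE_RE.search(message)
        if match:
            return match.group(1), match.group(2)
    return None


SAMPLE = """\
2022-10-17T14:11:10.189-07:00 Returning found resource from libs-snapshot-local:jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom
2022-10-17T14:11:10.190-07:00 Creating a resource handle from 'libs-snapshot-local:jfrog/hello/1.0.8-SNAPSHOT/hello-1.0.8-20220526.215602-2.pom'
2022-10-17T14:11:10.216-07:00 Request succeeded
"""

print(resolved_from(SAMPLE))
```

Looking for the "resource handle" line works for both Local and cached Remote hits, since the handle always names the repository that holds the file.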

Finding a remote POM file in Maven-Central

Next, let's download a file that is available only in Maven Central and has not been downloaded by Artifactory yet. To serve it, Artifactory has to search the Locals first, then the Remote's cache, and then go online.

[Sample command usage]
curl -u admin -v "https://artifactory.com/artifactory/libs-snapshot/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom?trace"

Output:

2022-10-17T14:26:40.535-07:00 Searching for the resource within libs-snapshot-local
2022-10-17T14:26:40.535-07:00 Unable to find resource in libs-snapshot-local:com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom

  • Looks like the package wasn't in libs-snapshot-local…

2022-10-17T14:26:40.535-07:00 Searching for the resource within maven-central-cache
2022-10-17T14:26:40.536-07:00 Unable to find resource in maven-central-cache:com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom

  • Looks like it wasn't in the cache either…

2022-10-17T14:26:40.536-07:00 Searching for the resource within maven-central
2022-10-17T14:26:40.537-07:00 Using remote request URL – https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom
2022-10-17T14:26:40.537-07:00 Executing HEAD request to https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom
2022-10-17T14:26:40.863-07:00 Found remote resource with content length – 1678

  • First, Artifactory sends a HEAD request, and it finds the file
  • The HEAD check saves network bandwidth and helps in case the remote site throttles frequent GET requests

2022-10-17T14:26:40.868-07:00 Resource was found in maven-central
2022-10-17T14:26:40.868-07:00 Resource is an exact match – returning
2022-10-17T14:26:40.868-07:00 Returning resource as found in the aggregated repositories

  • Artifactory has found the file and its logic says it's ok to download it

2022-10-17T14:26:40.880-07:00 Executing GET request to https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom
2022-10-17T14:26:40.933-07:00 Downloading content   #Start of download
2022-10-17T14:26:40.933-07:00 Saving resource to maven-central-cache
2022-10-17T14:26:41.019-07:00 Downloaded content  #Download finished – Took 0.1 second

  • This is the file download event
  • You can use the timestamps to see how long the download took

2022-10-17T14:26:41.020-07:00 Creating a resource handle from 'maven-central-cache:com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom'

  • Again Artifactory sends a Resource Handle pointer up to the Virtual so the correct file is downloaded
  • Note the cache is used here as the file has been downloaded and cached
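
To turn the timestamps into durations, you can subtract them. The sketch below assumes each line begins with an ISO 8601 timestamp followed by a space, as in the sample output; seconds_between is a hypothetical helper name. The lines are copied from the printout above, without the inline annotations.

```python
from datetime import datetime


def seconds_between(start_line: str, end_line: str) -> float:
    """Elapsed seconds between the leading timestamps of two trace lines."""
    start = datetime.fromisoformat(start_line.split(" ", 1)[0])
    end = datetime.fromisoformat(end_line.split(" ", 1)[0])
    return (end - start).total_seconds()


head_sent = "2022-10-17T14:26:40.537-07:00 Executing HEAD request to https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-1.pom"
head_done = "2022-10-17T14:26:40.863-07:00 Found remote resource with content length – 1678"
get_start = "2022-10-17T14:26:40.933-07:00 Downloading content"
get_done = "2022-10-17T14:26:41.019-07:00 Downloaded content"

print(f"HEAD round trip: {seconds_between(head_sent, head_done):.3f} s")
print(f"Download:        {seconds_between(get_start, get_done):.3f} s")
```

For this printout the HEAD round trip (0.326 s) takes longer than the download itself (0.086 s), which is the kind of detail that helps when a Remote feels slow.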

 

A 404 Not Found Example

In this example, a nonexistent package is requested. The file name in the "pkg-java" POM path has been changed to an incorrect value ("alpha-d" instead of "alpha-1").

[Sample command usage]
curl -u admin -v "https://artifactory.com/artifactory/libs-snapshot/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-d.pom?trace"

Output:

2022-10-17T14:46:33.906-07:00 Searching for the resource within libs-snapshot-local
2022-10-17T14:46:33.907-07:00 Unable to find resource in libs-snapshot-local:com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-d.pom

2022-10-17T14:46:33.907-07:00 Searching for the resource within maven-central-cache
2022-10-17T14:46:33.908-07:00 Unable to find resource in maven-central-cache:com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-d.pom

  • Not found in the Local or the cache, so next Artifactory queries the remote…

2022-10-17T14:46:33.908-07:00 Using remote request URL – https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-d.pom
2022-10-17T14:46:33.908-07:00 Executing HEAD request to https://repo.maven.apache.org/maven2/com/sun/pkg/pkg-java/1.0.0-alpha-1/pkg-java-1.0.0-alpha-d.pom
2022-10-17T14:46:34.242-07:00 Received status 404 (message: Not Found) on remote info request – returning unfound resource

  • The resource was not found in Maven Central either
  • Note that Artifactory only did a HEAD request, which failed. It didn't try a GET afterward 
  • You can bypass HEAD requests in the Remote Repository settings to make GET requests instead

2022-10-17T14:46:34.242-07:00 Configured to hide real status of un-authorized resources = false for repo libs-snapshot
2022-10-17T14:46:34.242-07:00 Original response status is auth related = false
2022-10-17T14:46:34.242-07:00 Using the original response status of '404' and message 'Could not find resource'
2022-10-17T14:46:34.242-07:00 Sending a response with the status '404' and the message 'Could not find resource'
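
When a printout is long, it can help to reduce it to two facts: which repositories were searched, in order, and what status was finally sent. The sketch below relies only on message wording that appears in the sample output above and may need adjusting for other versions; summarize is a hypothetical helper name.

```python
import re
from typing import List, Optional, Tuple

SEARCH_RE = re.compile(r"Searching for the resource within (\S+)")
STATUS_RE = re.compile(r"Sending a response with the status '(\d+)'")


def summarize(trace: str) -> Tuple[List[str], Optional[int]]:
    """Return (repositories searched in order, final status or None)."""
    searched = SEARCH_RE.findall(trace)
    status_match = STATUS_RE.search(trace)
    status = int(status_match.group(1)) if status_match else None
    return searched, status


SAMPLE_404 = """\
2022-10-17T14:46:33.906-07:00 Searching for the resource within libs-snapshot-local
2022-10-17T14:46:33.907-07:00 Searching for the resource within maven-central-cache
2022-10-17T14:46:34.242-07:00 Received status 404 (message: Not Found) on remote info request – returning unfound resource
2022-10-17T14:46:34.242-07:00 Sending a response with the status '404' and the message 'Could not find resource'
"""

print(summarize(SAMPLE_404))
```

A status of None means the printout contains no "Sending a response" line, as in the two successful examples earlier.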