Resolve Hugging Face Datasets via API

Artifactory supports resolving an entire dataset repository in a single call, using the snapshot_download function from the huggingface_hub Python library.

To resolve an entire dataset repository:

Run the following Python code:

from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="<DATASET_NAME>", revision="<REVISION_ID>", repo_type="dataset", etag_timeout=86400
)

Where:

  • <DATASET_NAME>: The name of the dataset to resolve, in the Hugging Face repository naming format organization/name

  • <REVISION_ID>: The revision of the dataset to resolve, such as a full commit hash

    Tip

    To find the revision ID, navigate to the dataset card on Hugging Face Hub, open the Files and versions tab, and click History to view all commits. Click the copy icon to copy the full commit hash.
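
As a minimal sketch, the two placeholders can be checked before any download is attempted. The helper names build_snapshot_kwargs and resolve_dataset and the validation patterns are illustrative assumptions, not part of Artifactory or huggingface_hub. The sketch also assumes the revision is a full 40-character commit hash, as copied in the tip above.

```python
import re

# Illustrative patterns (assumptions): "organization/name" and a full commit hash.
_REPO_ID = re.compile(r"[\w.\-]+/[\w.\-]+")
_COMMIT_HASH = re.compile(r"[0-9a-f]{40}")


def build_snapshot_kwargs(dataset_name: str, revision_id: str) -> dict:
    """Validate the placeholders and return keyword arguments for snapshot_download."""
    if not _REPO_ID.fullmatch(dataset_name):
        raise ValueError(f"expected organization/name, got {dataset_name!r}")
    if not _COMMIT_HASH.fullmatch(revision_id):
        raise ValueError(f"expected a 40-character commit hash, got {revision_id!r}")
    return {
        "repo_id": dataset_name,
        "revision": revision_id,
        "repo_type": "dataset",  # required for dataset repositories
        "etag_timeout": 86400,   # same value as in the snippet above
    }


def resolve_dataset(dataset_name: str, revision_id: str) -> str:
    """Download the whole dataset repository and return its local folder path."""
    # Imported lazily so the validation helper works without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_snapshot_kwargs(dataset_name, revision_id))
```

Calling resolve_dataset with a malformed name or a shortened hash raises ValueError before any request is sent.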

For example:

from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="nousresearch/hermes-3-dataset", revision="b1fddbdcae4e6714889365d1e6ce266a45289cc9", repo_type="dataset", etag_timeout=86400
)
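
snapshot_download returns the path of the local folder that holds the resolved files. The following sketch shows one way to inspect that folder after the call. The helper name summarize_snapshot is hypothetical and is not part of huggingface_hub or Artifactory.

```python
from pathlib import Path


def summarize_snapshot(snapshot_dir: str) -> dict:
    """Map each file path (relative to the snapshot root) to its size in bytes."""
    root = Path(snapshot_dir)
    return {
        path.relative_to(root).as_posix(): path.stat().st_size
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

For instance, passing the value returned by snapshot_download to summarize_snapshot gives a quick check that the expected files were resolved.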

Note

You can also use JFrog Set Me Up to copy the snippet with your token and environment already filled in. For more information, see Use Artifactory Set Me Up for Configuring Package Manager Clients.