Resolve Hugging Face Datasets using Libraries

JFrog Artifactory Documentation

Requirements:

  • Artifactory version 7.77 and above

  • Hugging Face client version 0.19.0 and above

  • HF_HUB_ETAG_TIMEOUT environment variable set (a timeout value in seconds)
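
The client-side requirements above can be checked before you try to resolve a dataset. The following is a minimal sketch, not an official JFrog tool. It assumes that the "Hugging Face client" is the huggingface_hub Python package, and the helper names (parse_version, client_is_new_enough, check_requirements) are hypothetical. It does not check the Artifactory version, which must be verified on the server side.

```python
import os
from importlib import metadata

MIN_CLIENT_VERSION = (0, 19, 0)  # minimum client version from the requirements list


def parse_version(text):
    """Turn '0.19.4' or '0.20.0.dev0' into a 3-part tuple of integers.

    Pre-release suffixes are ignored, so '0.19.0rc1' counts as 0.19.0.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '0.20' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def client_is_new_enough(installed, minimum=MIN_CLIENT_VERSION):
    """Compare an installed version string against the minimum version tuple."""
    return parse_version(installed) >= minimum


def check_requirements():
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    try:
        # Assumption: the client named in the requirements is huggingface_hub.
        installed = metadata.version("huggingface_hub")
        if not client_is_new_enough(installed):
            problems.append(f"huggingface_hub {installed} is older than 0.19.0")
    except metadata.PackageNotFoundError:
        problems.append("huggingface_hub is not installed")
    if not os.environ.get("HF_HUB_ETAG_TIMEOUT"):
        problems.append("HF_HUB_ETAG_TIMEOUT is not set")
    return problems
```

Calling check_requirements() and printing each returned item gives a quick report of what still needs to be fixed in the local environment.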

If these requirements are not met, you can resolve Hugging Face datasets by resolving whole repositories via API. For more information, see Resolve Hugging Face Datasets via API.

To resolve Hugging Face datasets using libraries:

Run the following Python code:

from datasets import load_dataset
dataset = load_dataset("<DATASET_NAME>")

Where:

  • <DATASET_NAME>: The name of the dataset you want to resolve, formatted according to the Hugging Face repository naming structure: organization/name

For example:

from datasets import load_dataset
dataset = load_dataset("wikimedia/wikipedia")
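
For the call above to go through Artifactory rather than the public Hub, the client has to be pointed at your repository before the Hugging Face libraries are imported, because huggingface_hub reads its environment variables at import time. The following is a minimal sketch under stated assumptions: HF_ENDPOINT and HF_HUB_ETAG_TIMEOUT are real huggingface_hub environment variables, but the endpoint URL, the repository key, the timeout value, and the helper names are hypothetical placeholders to replace with the values for your own instance. The name check is a loose test of the organization/name shape, not the Hub's full naming rules.

```python
import os
import re

# Loose check for the organization/name shape; the Hub's real rules are stricter.
DATASET_NAME = re.compile(r"^[A-Za-z0-9][\w.\-]*/[A-Za-z0-9][\w.\-]*$")


def is_valid_dataset_name(name):
    """Return True if name looks like organization/name."""
    return bool(DATASET_NAME.match(name))


def configure_client(endpoint, etag_timeout_seconds):
    """Set the client environment variables; call before importing datasets."""
    os.environ["HF_ENDPOINT"] = endpoint
    os.environ["HF_HUB_ETAG_TIMEOUT"] = str(etag_timeout_seconds)


def resolve_dataset(name, config=None):
    """Validate the name, then load the dataset through the configured endpoint."""
    if not is_valid_dataset_name(name):
        raise ValueError(f"expected organization/name, got {name!r}")
    # Imported here so that the environment variables are already in place.
    from datasets import load_dataset
    # Some datasets define several configurations and need one passed as `name`.
    return load_dataset(name, name=config)


if __name__ == "__main__":
    # Hypothetical URL and repository key; substitute your own values.
    configure_client(
        "https://artifactory.example.com/artifactory/api/huggingfaceml/my-hf-remote",
        86400,
    )
    dataset = resolve_dataset("<DATASET_NAME>")
```

Keeping the import of datasets inside resolve_dataset is a deliberate choice in this sketch: it ensures that the environment is configured first, and it lets the name check run even on a machine where the library is not installed.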