Hugging Face Repository Structure

Hugging Face repositories are structured using two types of versions:

  • Integration versions: these versions are similar to branches in a Git repository. In local repositories, the only integration version is main; in remote repositories, the integration version is created according to the requested branch.

  • Release versions: these versions can be tags, commits, or revisions.

The Hugging Face repository structure for local repositories is as follows:

├── hf-local
    ├── models
        ├── org (optional)
            ├── model-name
                ├── main
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                    ├── timestamp (server)
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                ├── v1
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
    ├── datasets
        ├── org (optional)
            ├── dataset-name 
                ├── main     
                    ├── timestamp
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1 
                        ├── file2
                    ├── timestamp 
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp (server)
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                ├── v1
                    ├── timestamp
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
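
As an illustration of the layout above, the following minimal Python sketch assembles the repository path of a single file. The function name artifact_path, the example repository key hf-local, and the timestamp value are hypothetical; the actual format of the timestamp folder name is not specified here and is determined by Artifactory.

```python
from typing import Optional


def artifact_path(
    repo: str,
    kind: str,
    name: str,
    revision: str,
    timestamp: str,
    filename: str,
    org: Optional[str] = None,
) -> str:
    """Build a path that follows the documented layout:
    repo/kind/[org]/name/revision/timestamp/filename
    """
    if kind not in ("models", "datasets"):
        raise ValueError("kind must be 'models' or 'datasets'")
    parts = [repo, kind]
    if org:  # the org level is optional in the layout
        parts.append(org)
    parts += [name, revision, timestamp, filename]
    return "/".join(parts)


# Hypothetical values, for illustration only.
print(artifact_path("hf-local", "models", "my-model", "main",
                    "20240101120000", "file1", org="my-org"))
```

The org argument is optional because the org folder itself is optional in the structure.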

Note the following:

  • When resolving (downloading) a model or dataset without specifying a revision, the latest model or dataset version is saved with a timestamp under the ‘main’ branch in the model's or dataset's name folder. Previous versions are retrievable using the SHA1 value, which is stored as the property huggingfaceml.generated.revision.sha1 on the .jfrog_huggingface_model_info.json file.

  • When deploying (uploading) a model or dataset with a specific version, it is saved with a timestamp under that version name in the model's or dataset's name folder. If the same model or dataset has already been deployed with the same version, the new deployment overwrites it, and the previous deployment is no longer retrievable using the Artifactory-generated SHA1 value.

    Note

    For Artifactory to overwrite the model, go to Administration > User Management > Permissions, and verify that the Delete/Overwrite permission is enabled. Otherwise, when you try to deploy a model that has already been deployed with the same version, Artifactory returns a 409 (Conflict) error.
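
One possible way to locate a previous version by its SHA1 value is to query the huggingfaceml.generated.revision.sha1 property. The sketch below only builds the request URL. It assumes that Artifactory's property search REST endpoint (api/search/prop) can be used for this property; the base URL, repository key, and SHA1 value are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Property name taken from the note above.
SHA1_PROPERTY = "huggingfaceml.generated.revision.sha1"


def sha1_search_url(base_url: str, repo: str, sha1: str) -> str:
    """Build a property-search URL for the given revision SHA1.

    Assumption: the api/search/prop endpoint is available and the
    property is searchable in the given repository.
    """
    query = urlencode({SHA1_PROPERTY: sha1, "repos": repo})
    return f"{base_url.rstrip('/')}/api/search/prop?{query}"


# Hypothetical server, repository, and SHA1 value.
print(sha1_search_url("https://example.jfrog.io/artifactory", "hf-local", "abc123"))
```

The resulting URL can then be requested with any HTTP client, using the credentials that apply to your Artifactory instance.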

Starting from Artifactory version 7.77, the repository structure for remote repositories is as follows:

├── hf-remote
    ├── models
        ├── org (optional)
            ├── model-name
                ├── main 
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp 3
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
                ├── v1 
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1  
                        ├── file2
                ├── original hash
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
    ├── datasets
        ├── org (optional)
            ├── dataset-name
                ├── main 
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp 3
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                ├── v1 
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1  
                        ├── file2
                ├── original hash
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2

Note the following:

  • When resolving (downloading) a model or dataset without specifying a revision, the latest model or dataset version is saved with a timestamp under the ‘main’ branch in the model's or dataset's name folder. Previous versions are retrievable using the SHA1 value, which is stored as the property huggingfaceml.generated.revision.sha1 on the .jfrog_huggingface_model_info.json file.

  • When resolving (downloading) a model or dataset with a specific version using a commit hash or tag, it is saved with a timestamp under the requested revision in the model's or dataset's name folder.
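
The two resolution rules above can be sketched as a small Python helper that picks the revision folder (main when no revision is given, otherwise the requested revision) and returns the path of the .latest_huggingface_*_info.json file shown in the remote structure. The function name latest_marker_path and the example values are hypothetical, and the sketch makes no assumption about the contents of that file.

```python
from typing import Optional

# Singular form used in the marker file name for each top-level folder.
_SINGULAR = {"models": "model", "datasets": "dataset"}


def latest_marker_path(
    repo: str,
    kind: str,
    name: str,
    revision: Optional[str] = None,
    org: Optional[str] = None,
) -> str:
    """Return repo/kind/[org]/name/<revision>/.latest_huggingface_<kind>_info.json.

    When no revision is requested, the 'main' folder is used.
    """
    if kind not in _SINGULAR:
        raise ValueError("kind must be 'models' or 'datasets'")
    folder = revision or "main"
    parts = [repo, kind]
    if org:  # the org level is optional in the layout
        parts.append(org)
    parts += [name, folder, f".latest_huggingface_{_SINGULAR[kind]}_info.json"]
    return "/".join(parts)


# Hypothetical values, for illustration only.
print(latest_marker_path("hf-remote", "datasets", "my-data"))
print(latest_marker_path("hf-remote", "models", "my-model", revision="v1", org="my-org"))
```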