Hugging Face Repository Structure (Legacy)

Legacy Notice

As of Artifactory 7.111.1, all new Hugging Face repositories use the Machine Learning repository structure. The Machine Learning structure is more flexible and scalable, and is optimized for a wide range of machine learning use cases. This structure provides consistent behavior across all MLOps repositories and enables more advanced JFrog ML features in the future.

Virtual Hugging Face repositories only support local and remote repositories that use the new Machine Learning structure.

For more information, see Migrate Legacy Hugging Face Repositories.

Hugging Face repositories are structured using two types of versions:

  • Integration versions: These versions are similar to branches in a Git repository. In local repositories, the only integration version is ‘main’. In remote repositories, the integration version is created according to the requested branch.

  • Release versions: These versions can be tags, commits, or revisions.

Local Repository Structure

├── hf-local
    ├── models
        ├── org (optional)
            ├── model-name
                ├── main
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                    ├── timestamp (server)
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
                ├── v1
                    ├── timestamp
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1
                        ├── file2
    ├── datasets
        ├── org (optional)
            ├── dataset-name 
                ├── main     
                    ├── timestamp
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1 
                        ├── file2
                    ├── timestamp 
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp (server)
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                ├── v1 	
                    ├── timestamp
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
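
The layout above can be read as a path template. The following is a minimal sketch in Python, not part of Artifactory: the function name legacy_hf_path and the sample timestamp value are hypothetical, and the actual timestamp format that Artifactory generates is not specified here.

```python
from typing import Optional


def legacy_hf_path(
    repo: str,
    kind: str,
    name: str,
    revision: str,
    timestamp: str,
    filename: str,
    org: Optional[str] = None,
) -> str:
    """Join the folder levels of the legacy layout into one repository path.

    Hypothetical helper for illustration only. The timestamp is passed in
    as an opaque string because its real format is an unknown here.
    """
    if kind not in ("models", "datasets"):
        raise ValueError("kind must be 'models' or 'datasets'")
    parts = [repo, kind]
    if org:  # the org level is optional in the layout
        parts.append(org)
    parts += [name, revision, timestamp, filename]
    return "/".join(parts)


# Placeholder values; "20240101120000" is not a real Artifactory timestamp.
print(legacy_hf_path("hf-local", "models", "model-name", "main",
                     "20240101120000", "file1"))
print(legacy_hf_path("hf-local", "datasets", "dataset-name", "v1",
                     "20240101120000", "file2", org="org"))
```

The same template applies to remote repositories, where the revision level can also be a tag or the original commit hash.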

Note the following:

  • When resolving (downloading) a model or dataset without specifying a revision, the latest model or dataset version is saved with a timestamp under the ‘main’ branch in the model or dataset's name folder. Previous versions are retrievable using the SHA1 value, which is stored as the property huggingfaceml.generated.revision.sha1 on the .jfrog_huggingface_model_info.json file.

  • When deploying (uploading) a model or dataset with a specific version, it is saved with a timestamp under that version name in the model or dataset's name folder. If the same model or dataset has already been deployed with the same version, the new deployment overwrites it, and the previous deployment is no longer retrievable using the Artifactory-generated SHA1 value.

    Note

    For Artifactory to overwrite the model, go to Administration > User Management > Permissions and verify that the Delete/Overwrite permission is enabled. Otherwise, an attempt to deploy a model that was previously deployed with the same version returns a 409 error.
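
The overwrite rule above can be illustrated with a toy in-memory model. This is only a sketch of the behavior as described, not Artifactory code: the class names LegacyLocalRepoSketch and DeployConflict are hypothetical, and the model ignores timestamps and permissions beyond a single allow_overwrite flag.

```python
class DeployConflict(Exception):
    """Stands in for the 409 response described in the note above."""
    status = 409


class LegacyLocalRepoSketch:
    """Toy model of same-version deployment in a legacy local repository."""

    def __init__(self, allow_overwrite):
        # Mirrors whether the Delete/Overwrite permission is enabled.
        self.allow_overwrite = allow_overwrite
        self._deployed = {}  # (name, version) -> {filename: content}

    def deploy(self, name, version, files):
        key = (name, version)
        if key in self._deployed and not self.allow_overwrite:
            raise DeployConflict(f"{name} version {version} already deployed")
        # With overwrite allowed, the previous deployment is replaced
        # and is no longer retrievable.
        self._deployed[key] = dict(files)

    def files(self, name, version):
        return self._deployed[(name, version)]


repo = LegacyLocalRepoSketch(allow_overwrite=True)
repo.deploy("model-name", "v1", {"file1": b"old"})
repo.deploy("model-name", "v1", {"file1": b"new"})
print(repo.files("model-name", "v1"))
```

With allow_overwrite=False, the second deploy call raises DeployConflict instead of replacing the stored files.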

Remote Repository Structure

Starting from Artifactory version 7.77, remote repositories use the structure shown in the tree that follows the note.

Note

As of Artifactory 7.117, the .latest_huggingface_model_info.json file no longer appears.

├── hf-remote
    ├── models
        ├── org (optional)
            ├── model-name
                ├── main 
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp 3
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
                ├── v1 
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1  
                        ├── file2
                ├── original hash
                    ├── .latest_huggingface_model_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_model_info.json
                        ├── file1    
                        ├── file2
    ├── datasets
        ├── org (optional)
            ├── dataset-name
                ├── main 
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                    ├── timestamp 3
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2
                ├── v1 
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1   
                        ├── file2
                    ├── timestamp 2
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1  
                        ├── file2
                ├── original hash
                    ├── .latest_huggingface_dataset_info.json
                    ├── timestamp 1
                        ├── .jfrog_huggingface_dataset_info.json
                        ├── file1    
                        ├── file2

Note the following:

  • When resolving (downloading) a model or dataset without specifying a revision, the latest model or dataset version is saved with a timestamp under the ‘main’ branch in the model or dataset's name folder. Previous versions are retrievable using the SHA1 value, which is stored as the property huggingfaceml.generated.revision.sha1 on the .jfrog_huggingface_model_info.json file.

  • When resolving (downloading) a model or dataset with a specific version using a commit hash or tag, it is saved with a timestamp under the requested revision in the model or dataset's name folder.
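
One way to inspect the SHA1 property mentioned above is to request the properties of the info file in a given timestamp folder. The following sketch only builds the request URL and sends nothing. It assumes the Artifactory item properties endpoint (api/storage/{repoKey}/{itemPath}?properties=...); the base URL, the function name info_file_properties_url, and the folder values are hypothetical placeholders, and the use of the dataset info file for datasets is also an assumption.

```python
from typing import Optional
from urllib.parse import quote, urlencode

SHA1_PROPERTY = "huggingfaceml.generated.revision.sha1"


def info_file_properties_url(
    base_url: str,
    repo: str,
    kind: str,
    name: str,
    revision: str,
    timestamp: str,
    org: Optional[str] = None,
) -> str:
    """Build a URL that asks for the revision SHA1 property of an info file.

    Hypothetical helper. Assumption: datasets carry the property on
    .jfrog_huggingface_dataset_info.json, mirroring the model info file.
    """
    if kind == "models":
        info_file = ".jfrog_huggingface_model_info.json"
    elif kind == "datasets":
        info_file = ".jfrog_huggingface_dataset_info.json"
    else:
        raise ValueError("kind must be 'models' or 'datasets'")
    parts = [repo, kind]
    if org:
        parts.append(org)
    parts += [name, revision, timestamp, info_file]
    # Percent-encode each segment so unusual names do not break the path.
    path = "/".join(quote(part, safe="") for part in parts)
    query = urlencode({"properties": SHA1_PROPERTY})
    return f"{base_url.rstrip('/')}/api/storage/{path}?{query}"


# Placeholder host and folder names, for illustration only.
print(info_file_properties_url(
    "https://artifactory.example.com/artifactory",
    "hf-remote", "models", "model-name", "main", "timestamp-1", org="org",
))
```

Authentication and the exact response shape depend on the Artifactory instance and are left out of this sketch.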