When working with PyTorch on GPU instances, it's crucial that your library installation is compatible with the CUDA drivers installed on JFrog ML instances, which ensures optimal performance and full access to GPU resources.
JFrog ML GPU instances are currently provisioned with CUDA version 12.1. Below you will find instructions for installing the latest versions of Torch compatible with that CUDA release.
Installing Compatible PyTorch
To align PyTorch with the CUDA version on your instance, use the following index URL when adding the PyTorch libraries to your dependency configuration file, whether it's Conda, Pip (requirements.txt), or Poetry.
In Workspaces
Use this command in your workspace environment:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
In Model Builds
For requirements.txt, your file should look like this:
requirements.txt
scipy
scikit-learn
pandas
--extra-index-url https://download.pytorch.org/whl/cu121
torch
torchvision
torchaudio
For Conda environments, here's an example configuration:
conda.yaml
name: your-conda-environment
channels:
- defaults
- conda-forge
- huggingface
dependencies:
  - python=3.11
  - pip:
      - --extra-index-url https://download.pytorch.org/whl/cu121
      - torch
      - torchvision
      - torchaudio
      - transformers
      - accelerate
      - scikit-learn
      - pandas
Please note that the conda.yaml above is just an example; not all of the dependencies are required.
Verifying the Installation
After installation, confirm that PyTorch is utilizing the GPU. Add the following code snippet to your FrogMlModel. For training models, insert it at the start of the build() method. If loading a pre-trained model, place it in the initialize_model() method.
import torch
print("Torch version:", torch.__version__)
# Automatically use CUDA if available, else fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"The PyTorch device used by the model is {device}\n")

This should output cuda as the device in your FrogML model build logs, indicating that PyTorch is correctly set up to use the GPU.
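Once the device is selected, a quick way to confirm the GPU is actually engaged is to move a tensor onto it. The snippet below is a minimal sketch; the tensor shape is illustrative only:

```python
import torch

# Select the GPU when available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Moving tensors (and models, via .to(device)) is what actually engages the GPU;
# selecting a device alone allocates nothing on it.
x = torch.randn(2, 3).to(device)
print(x.device)  # cuda:0 on a GPU instance, cpu otherwise
```

The same `.to(device)` call works for an `nn.Module`, so the selected device can be reused for both your model and its input batches.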
Troubleshooting
If you see cpu instead of cuda in your logs, check the Code tab within your Build page. Ensure that the dependency file is correctly recognized by the model and that the requirements.lock file reflects the appropriate versions of the Torch libraries.
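A common cause of a cpu device is that a CPU-only wheel of Torch was resolved instead of the cu121 build. The sketch below (using only standard torch attributes) prints the CUDA version the installed wheel was compiled against, which you can compare with the instance's CUDA 12.1:

```python
import torch

# The CUDA version this PyTorch build was compiled against;
# None indicates a CPU-only wheel, which would explain a cpu device in the logs.
print("Built with CUDA:", torch.version.cuda)

# Runtime check: is a GPU actually visible to this process?
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; running on CPU")
```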