This procedure describes how to deploy approved AI models with secure connections to external providers, streamlining your AI integration process. It is part of the AI Catalog capabilities.
Deploying a model means provisioning the serving infrastructure (often GPU instances) and making the model available live in your systems. Before connecting and deploying, you must select a model and allow its use.
Key Actions
One-Click Deployment: Quickly deploy allowed models with a single click.
Secure Connections: Establish and manage secure links to external model providers.
Benefits
Simplified Integration: Accelerate the process of bringing AI applications into production.
Ongoing Monitoring: Keep track of model performance and usage post-deployment.
To deploy a model package:
Note
If you are deploying a Hugging Face gated model, see the Deploying Gated Models section below.
Verify that the model name at the top of the Deploy model pane is the model you want to deploy, and that the deployment is associated with the correct project.
Select the Instance type from the dropdown menu. Refer to Instance Sizes & ML Credits for detailed information on the available sizes and credits.
Select the Scaling policy and number of replicas:
Autoscaling - Coming soon (this option will allow the replicas to scale according to demand).
Fixed replicas - Select this to maintain a fixed number of replicas. Either select the number on the Replicas bar, or select the Custom replica count checkbox and enter a value.
Click Deploy model. The model Overview page shows the deployment status.
Note
While the deployment is in process, a Cancel deployment button appears on the right-hand side of the page.
After the deployment has completed successfully, you can see the model dashboard and can integrate the model in your code using the instructions below:
In the Configure tab, enter your JFrog account password and click Generate Token & create Instructions.
Click Deploy. The deployment status is shown.
Once deployment is successful, click Use Model again; the frameworks (Python, JavaScript, cURL) are now enabled.
Click Generate a token.
The Set Up A Generic Client pane is displayed.
Note
Keep the default repository.
Click Copy.
Click Done.
Select the correct framework and note that the token has been inserted into the api_key. Copy the code snippet into your code editor.
On the model Overview page, you can now see the model's usage metrics.
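The copied snippet typically authenticates requests to the deployed model with the generated token. The sketch below illustrates the general pattern only; the endpoint URL, payload shape, and placeholder token are assumptions for illustration, not the actual snippet generated in the Configure tab (which fills in the real values, including api_key, for you):

```python
import json
import urllib.request

# Illustrative values only -- the generated snippet provides the real ones,
# with your token already inserted as api_key.
api_key = "<your-generated-token>"
endpoint = "https://models.example.com/v1/predict"  # hypothetical URL

def build_request(prompt: str) -> urllib.request.Request:
    """Build an authenticated JSON POST request for the deployed model."""
    body = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Hello, model!")
# Sending is left out here; in practice you would call
# urllib.request.urlopen(req) and read the JSON response.
print(req.get_header("Authorization"))
```

The key point is that the generated token travels in the request's Authorization header, so any client (Python, JavaScript, or cURL) follows the same pattern.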
Deploying Gated Models
Deploying gated models requires obtaining access approval from Hugging Face before deployment.
To deploy a gated model:
Open the Deploy model pane for the required project (as described at the top of this page). For gated models, this pane is slightly different.
Follow the instructions at the top of the pane for getting access approval from Hugging Face.
Fill in the other fields as described above, and click Deploy model.