Automating model build and deployment helps keep models accurate in production.
This action streamlines the build and deployment workflows. It keeps your model accurate by automatically re-training and deploying it based on a cron expression, a defined time interval, or metric-based triggers.
You can also define deployment conditions to verify that a new build passes the acceptance criteria before it replaces the currently deployed model.
Automation Example
Warning
Before Setting Up Automation
Prior to configuring automation, your model's code must be stored in a Git repository. It's recommended to confirm that all necessary Git repository access is correctly configured by running a CLI model build. Ensure the JFrog ML model builds successfully from Git before proceeding with automation.
For additional details on building models from Git, refer to our Build Configurations page.
The automation fetches the model's code during the training process. If you are using a private repository, you must generate a Git access token and store it securely in the Secret Manager.
from frogml.core.automations import (
    Automation,
    ScheduledTrigger,
    FrogmlBuildDeploy,
    BuildSpecifications,
    BuildMetric,
    ThresholdDirection,
    DeploymentSpecifications,
)

test_automation = Automation(
    name="retrain_my_model",
    model_id="my-model-id",
    trigger=ScheduledTrigger(cron="0 0 * * 0"),
    action=FrogmlBuildDeploy(
        build_spec=BuildSpecifications(
            git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
            git_access_token_secret="token_secret_name",
            git_branch="main",
            main_dir="main",
            tags=["prod"],
            env_vars=["key1=val1", "key2=val2", "key3=val3"],
        ),
        deployment_condition=BuildMetric(
            metric_name="f1_score",
            direction=ThresholdDirection.ABOVE,
            threshold="0.65",
        ),
        deployment_spec=DeploymentSpecifications(
            number_of_pods=1,
            cpu_fraction=2.0,
            memory="2Gi",
            variation_name="B",
        ),
    ),
)
Note
Scheduler Timezone
The default timezone for the cron scheduler is UTC.
Build & Deploy Configuration
The FrogmlBuildDeploy action has three configuration parameters:
build_spec: defines the location of the model code that will be built in the JFrog ML platform.
deployment_condition: defines the metrics used to determine whether to deploy the model after training.
deployment_spec: specifies the runtime environment parameters for the model deployment.
Warning
Metrics used to trigger build or deploy automations must be logged during the model build phase.
BuildSpecifications
To configure the automation build specification, we need a link to the Git repository.
Note that the link consists of two parts delimited by a hash sign (#):
The repository URL
The path within the repository
For example, consider this link: https://github.com/org_id/repository_name.git#dir_1/dir_2.
The platform will clone the https://github.com/org_id/repository_name.git repository and change the working directory to dir_1/dir_2 before starting the build.
In this example, dir_1/dir_2 should be the directory containing the main and tests folders.
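The split described above can be sketched in plain Python (this is an illustration of how the link is interpreted, not a JFrog ML API call):

```python
# Split the build link at the first "#" into its two parts:
# the clone URL and the working directory inside the repository.
git_uri = "https://github.com/org_id/repository_name.git#dir_1/dir_2"
repo_url, work_dir = git_uri.split("#", 1)

print(repo_url)  # https://github.com/org_id/repository_name.git
print(work_dir)  # dir_1/dir_2
```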
Using Private Repositories
When using private repositories, we must also specify the access token or private key.
As the JFrog ML platform doesn't allow plain-text tokens, we must store the access tokens in the JFrog ML Secret Manager and specify only the secret name.
When not using the default folder structure, in which main is the model's folder, we must also specify the Git branch and the directory containing the ML model.
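Putting these pieces together, a build specification for a private repository might look like the following sketch. The secret name and branch are placeholders; the fields are the same ones used in the full automation example above:

```python
from frogml.core.automations import BuildSpecifications

# Private repository: the access token is stored in the Secret Manager
# and referenced by its secret name only, never as plain text.
# The branch and model directory are set explicitly because this repo
# does not use the default folder structure.
build_spec = BuildSpecifications(
    git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
    git_access_token_secret="token_secret_name",  # secret name, not the token itself
    git_branch="main",
    main_dir="main",
)
```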
Custom Resources
In the build specification, you can control the number of CPUs and the amount of memory, or use GPU instances.
Defining CPU resources:
resources=CpuResources(cpu_fraction=2, memory="2Gi")
Defining GPU resources:
resources=GpuResources(gpu_type="NVIDIA_K80", gpu_amount=1)
Alternatively, you can specify the instance type as opposed to fractions of resources. For example:
resources=ClientResources(instance='gpu.a10.8xl')  # GPU
# OR
resources=ClientResources(instance='medium')  # CPU
It is also possible to specify the IAM role used in production (assumed_iam_role) or a custom Docker image (base_image).
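A build specification combining these options might look like the following sketch. The role ARN and image tag are illustrative placeholders, and the import path for CpuResources is assumed to follow the pattern of the other automation classes:

```python
from frogml.core.automations import BuildSpecifications, CpuResources

# Sketch: custom CPU resources plus the optional IAM role and base image.
# The ARN and image name below are placeholders, not documented values.
build_spec = BuildSpecifications(
    git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
    resources=CpuResources(cpu_fraction=2, memory="2Gi"),
    assumed_iam_role="arn:aws:iam::123456789012:role/my-build-role",  # placeholder
    base_image="my-registry/my-base-image:latest",                    # placeholder
)
```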
Environment Variables
Additionally, we can specify environment variables to configure in the build environment.
The environment variables should be specified in the env_vars field (a list), with each entry in the form key=value.
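The key=value entries can be assembled from an ordinary dictionary. This small sketch is plain Python (no JFrog ML API) and produces a list suitable for the env_vars field:

```python
# Build "key=value" strings for env_vars from a dictionary.
build_env = {"key1": "val1", "key2": "val2", "key3": "val3"}
env_vars = [f"{key}={value}" for key, value in build_env.items()]

print(env_vars)  # ['key1=val1', 'key2=val2', 'key3=val3']
```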
The model's code must log the metric that describes the model's performance; this metric will be used in the deployment condition. If you are unsure how to log metrics, see our Logging and Monitoring Guide.
Disable Push Image
It is possible to disable the push-image phase in cases where you don't want the final build saved to the Docker repository. You can do that by adding push_image=False to the BuildSpecifications.
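For example, a minimal fragment (the other fields are unchanged from the example above):

```python
from frogml.core.automations import BuildSpecifications

# Build the model but skip pushing the final image to the Docker repository.
build_spec = BuildSpecifications(
    git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
    push_image=False,
)
```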
BuildMetric
During the build process, it is common to log metrics such as accuracy, F1 score, or loss. When executing the automation, these logged values may be compared against a specified threshold.
For each metric, it is possible to define whether the value should be above or below the threshold. Once this condition is met, the JFrog ML platform will proceed to deploy the model.
The BuildMetric object has three parameters:
metric_name: The metric name logged during the build phase.
direction: Whether the value should be below or above the threshold; the valid values are ThresholdDirection.ABOVE and ThresholdDirection.BELOW.
threshold: The threshold used for comparison.
Warning
The threshold must always be a string, where threshold="0.65" is a valid threshold and threshold=0.65 is invalid!
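For example, a condition that deploys only when the logged f1_score exceeds 0.65, matching the full automation example above (note the threshold passed as a string):

```python
from frogml.core.automations import BuildMetric, ThresholdDirection

deployment_condition = BuildMetric(
    metric_name="f1_score",              # must be logged during the build phase
    direction=ThresholdDirection.ABOVE,  # deploy when the metric is above...
    threshold="0.65",                    # ...this threshold (a string, not a float)
)
```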
Dynamic Threshold
To use a dynamic threshold, we can use a SQL expression as the threshold value.
In this case, the JFrog ML platform will run the SQL query in JFrog ML Model Analytics and compare the model's metric with the threshold produced by the SQL query.
The query must return a single row containing only one column.
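As a sketch, a dynamic condition passes a SQL query instead of a literal value. The table and column names below are illustrative, not a documented schema:

```python
from frogml.core.automations import BuildMetric, ThresholdDirection

# The query must return a single row with a single column; its result
# becomes the threshold the new build's metric is compared against.
# Table and column names here are placeholders for illustration.
deployment_condition = BuildMetric(
    metric_name="f1_score",
    direction=ThresholdDirection.ABOVE,
    threshold="SELECT max(f1_score) FROM builds WHERE model_id = 'my-model-id'",
)
```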
DeploymentSpecifications
After the model is built, its performance is compared with the threshold; if the model passes, the platform uses the deployment specification to configure the model's runtime environment.
We may specify:
| Parameter | Details |
|---|---|
| number_of_http_server_workers | The number of threads used by the HTTP server. |
| http_request_timeout_ms | The request timeout, in milliseconds. |
| daemon_mode | Whether the gunicorn process should be daemonized, making the workers run in the background. |
| custom_iam_role_arn | The IAM role used in production. |
| max_batch_size | The maximum number of records in a batch. |
| deployment_process_timeout_limit | The timeout for the deployment (in seconds). |
| number_of_pods | The number of instances to be deployed. |
| cpu_fraction | The number of CPU cores requested from Kubernetes. |
| memory | The amount of RAM. |
| variation_name | The variant name when running an A/B test. |
| auto_scale_config | The autoscaling configuration for Kubernetes. |
| min_replica_count | The minimum number of replicas the resource will be scaled down to. |
| max_replica_count | The maximum number of replicas of the target resource. |
| polling_interval | The interval at which each trigger is checked. By default, every 30 seconds. |
| cool_down_period | The period to wait after the last trigger reported active before scaling the resource back to 0. By default, 5 minutes (300 seconds). |
| prometheus_trigger | metric_type: the type of the metric (cpu/gpu/memory/latency); aggregation_type: the type of the aggregation (min/max/avg/sum); time_period: the period to run the query over; threshold: the value at which to start scaling. |
| environments | The list of environment names to deploy to. |
Defining Auto-Scaling
When we want to define an auto-scaling policy for our deployment, we have to use the following pattern:
auto_scale_config = AutoScalingConfig(
    min_replica_count=1,
    max_replica_count=10,
    polling_interval=30,
    cool_down_period=300,
    triggers=[
        AutoScalingPrometheusTrigger(
            query_spec=AutoScaleQuerySpec(
                aggregation_type="max",
                metric_type="latency",
                time_period=4,
            ),
            threshold=60,
        )
    ],
)
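The auto-scaling policy then plugs into the deployment specification via the auto_scale_config parameter from the table above. A self-contained sketch, assuming the auto-scaling classes are importable from the same module as the other automation classes:

```python
from frogml.core.automations import (
    AutoScaleQuerySpec,
    AutoScalingConfig,
    AutoScalingPrometheusTrigger,
    DeploymentSpecifications,
)

# Scale between 1 and 10 pods when the max latency over the time period
# crosses the threshold; triggers are polled every 30 seconds and the
# deployment cools down for 5 minutes before scaling back.
auto_scale_config = AutoScalingConfig(
    min_replica_count=1,
    max_replica_count=10,
    polling_interval=30,
    cool_down_period=300,
    triggers=[
        AutoScalingPrometheusTrigger(
            query_spec=AutoScaleQuerySpec(
                aggregation_type="max",
                metric_type="latency",
                time_period=4,
            ),
            threshold=60,
        )
    ],
)

deployment_spec = DeploymentSpecifications(
    number_of_pods=1,
    auto_scale_config=auto_scale_config,
)
```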