Hyperparameter Optimization (HPO)

This advanced build pattern enables you to specify parameters or parameter ranges for hyperparameter tuning jobs, helping you to enhance model performance by identifying the optimal settings.

Currently, JFrog ML supports training only on a single instance, whether CPU or GPU. As a result, all options described will apply to single-instance training.

For efficient model development, especially when training sessions are lengthy, consider using this guide in conjunction with Build Configurations. This approach allows for frequent iteration on model code without the need to retrain from scratch each time.

Key Concepts

  • Hyperparameters: These are configuration variables that control the learning process of a model. In the JFrog ML context, these are the parameters you'll be adjusting and testing in your Build jobs.

  • Hyperparameter Tuning: The process of finding the optimal set of hyperparameters for a model. In JFrog ML, this is done through Build jobs, where different combinations of hyperparameters are tested.

  • Build Jobs: The JFrog ML mechanism for training and building models. These jobs provide the environment in which hyperparameter tuning takes place.

Passing Hyperparameters to Build Jobs

In JFrog ML, there are two primary methods for passing hyperparameters to your Build jobs:

  1. Via Configuration File

    This method involves defining your hyperparameters in a JSON file within your project structure.

    Project Structure

    .
    ├── README.md
    ├── main
    │   ├── __init__.py
    │   ├── conda.yml
    │   ├── hyperparameters.json
    │   └── model.py
    └── tests
        └── it
            └── test_something.py

    The hyperparameters.json file is placed in the main directory, which ensures it will be automatically uploaded to the JFrog ML Build environment.

    Example JSON Configuration

    hyperparameters.json

    {
        "n_estimators": [50, 100, 200],
        "max_depth": [10, 20, 30],
        "learning_rate": [0.01, 0.1, 0.2]
    }

    This JSON structure defines a list of candidate values for each hyperparameter. Your Build job code reads these lists and tests different combinations of them.

    Reading the Configuration in Python

    In your model.py file, you can access these hyperparameters as follows:

    model.py

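    import json
    from pathlib import Path

    from frogml import FrogMlModel


    class SampleModel(FrogMlModel):
        def __init__(self):
            # Resolve the file relative to model.py so the current working
            # directory of the Build job does not matter
            config_path = Path(__file__).parent / 'hyperparameters.json'
            with open(config_path) as f:
                # self.params maps each hyperparameter name to its candidate values
                self.params = json.load(f)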

    This method allows you to keep your hyperparameters separate from your code, making it easier to version and modify them.

  2. Via Environment Variables or Build Parameters

    This method involves passing hyperparameters directly through the command line interface (CLI) when initiating a build job.

    Using Environment Variables

    frogml models builds --model-id sample_model \
    -E N_ESTIMATORS="50,100,200" \
    -E MAX_DEPTH="10,20,30" \
    -E LEARNING_RATE="0.01,0.1,0.2" \
    .

    Here, the -E flag sets environment variables that will be available in your Build job.

    Using Build Parameters

    frogml models builds --model-id sample_model \
    -P N_ESTIMATORS="50,100,200" \
    -P MAX_DEPTH="10,20,30" \
    -P LEARNING_RATE="0.01,0.1,0.2" \
    .

    The -P flag sets JFrog ML Build parameters. These are logged to the JFrog ML Platform and can be compared between Builds, offering better traceability.

    Reading Parameters in Python

    In your model.py, you can access these parameters:

    model.py

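    import os

    from frogml import FrogMlModel


    class SampleModel(FrogMlModel):

        def parse_param_list(self, list_as_str, cast=float):
            # Turn a comma-separated string such as "50,100,200" into a list of numbers
            return [cast(item) for item in list_as_str.split(',')]

        def __init__(self):
            # os.environ[...] raises KeyError if a variable was not passed to the build.
            # n_estimators and max_depth must be integers; learning_rate is a float.
            self.params = {
                'n_estimators': self.parse_param_list(os.environ['N_ESTIMATORS'], int),
                'max_depth': self.parse_param_list(os.environ['MAX_DEPTH'], int),
                'learning_rate': self.parse_param_list(os.environ['LEARNING_RATE'])
            }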
    

    This method allows for more dynamic parameter setting and is useful for automated pipelines or when you need to change parameters frequently without modifying files.
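
The optimization examples in the next section call a `read_hyperparameters()` helper that this guide does not define. A minimal sketch of such a helper is shown here, assuming that values passed as environment variables should override those in `hyperparameters.json`. The function name, the `PARAM_TYPES` table, and the override rule are illustrative assumptions, not part of the frogml SDK.

```python
import json
import os
from pathlib import Path

# Hypothetical table: which type each known hyperparameter should be parsed as
PARAM_TYPES = {"n_estimators": int, "max_depth": int, "learning_rate": float}


def read_hyperparameters(config_path="hyperparameters.json", environ=None):
    """Load candidate values from a JSON file, then apply env-var overrides."""
    environ = os.environ if environ is None else environ
    path = Path(config_path)

    # Start from the JSON file when it exists, otherwise from an empty dict
    params = json.loads(path.read_text()) if path.exists() else {}

    # An upper-cased variable such as MAX_DEPTH="10,20" replaces the file's list
    for name, cast in PARAM_TYPES.items():
        raw = environ.get(name.upper())
        if raw:
            params[name] = [cast(item) for item in raw.split(",")]

    return params
```

In a model class, this could be wrapped in a method so that `self.read_hyperparameters()` returns the same dictionary regardless of how the values were supplied.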

Implementing Hyperparameter Optimization

Once you have your hyperparameters set up, you can implement various optimization techniques within your JFrog ML Build job. Here are examples of three common methods:

  1. Grid Search Example

    Grid Search exhaustively searches through a specified subset of the hyperparameter space.

    model.py

    import xgboost as xgb
    from sklearn.model_selection import GridSearchCV
    from frogml import FrogMlModel
    import frogml
    
    
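    class XGBoostModel(FrogMlModel):

        def __init__(self):
            # read_hyperparameters() is your own helper that returns a dict of
            # candidate values, loaded with either method described above
            self.params = self.read_hyperparameters()
            self.model = xgb.XGBClassifier()

        def build(self):
            # fetch_and_preprocess_data() is your own data-loading helper
            X_train, y_train = self.fetch_and_preprocess_data()

            # Evaluate every combination in self.params with 5-fold cross-validation
            grid_search = GridSearchCV(estimator=self.model,
                                       param_grid=self.params,
                                       cv=5,
                                       scoring='accuracy')

            grid_search.fit(X_train, y_train)

            # Log the winning combination and its mean cross-validated accuracy
            frogml.log_param(grid_search.best_params_)
            frogml.log_metric({'best_accuracy': grid_search.best_score_})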

    In this example, GridSearchCV tries all possible combinations of the specified hyperparameters.
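
To estimate how long an exhaustive search will run on a single instance, it helps to count the model fits up front. The sketch below uses scikit-learn's `ParameterGrid` with the candidate values from the example `hyperparameters.json`; the variable names are illustrative.

```python
from sklearn.model_selection import ParameterGrid

# Candidate values from the example hyperparameters.json
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [10, 20, 30],
    "learning_rate": [0.01, 0.1, 0.2],
}
cv_folds = 5

# Every combination is trained once per cross-validation fold
n_combinations = len(ParameterGrid(param_grid))  # 3 * 3 * 3
n_fits = n_combinations * cv_folds

print(n_combinations, n_fits)  # 27 135
```

With `refit=True` (the scikit-learn default), `GridSearchCV` performs one additional fit of the best combination on the full training set.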

  2. Random Search Example

    Random Search samples random combinations of hyperparameters, which can be more efficient than Grid Search for high-dimensional spaces.

    model.py

    import xgboost as xgb
    from sklearn.model_selection import RandomizedSearchCV
    from frogml import FrogMlModel
    import frogml
    
    
    class XGBoostRandomModel(FrogMlModel):
        def __init__(self):
            self.model = xgb.XGBClassifier()
            self.param_distributions = self.read_hyperparameters()
    
        def build(self):
            X_train, y_train = self.fetch_and_preprocess_data()
    
            random_search = RandomizedSearchCV(
                estimator=self.model,
                param_distributions={
                    'n_estimators': self.param_distributions['n_estimators'],
                    'max_depth': [int(x) for x in self.param_distributions['max_depth']],
                    'learning_rate': self.param_distributions['learning_rate']
                },
                n_iter=100,  # number of parameter settings sampled
                cv=5,
                scoring='accuracy',
                random_state=42
            )
    
            random_search.fit(X_train, y_train)
    
            frogml.log_param(random_search.best_params_)
            frogml.log_metric({'best_accuracy': random_search.best_score_})

    • n_estimators and learning_rate are used as-is, allowing RandomizedSearchCV to sample from the provided lists. max_depth values are converted to integers, as this parameter requires integer values.

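    • n_iter=100 requests 100 sampled settings, but the example lists define only 3 × 3 × 3 = 27 combinations. When every parameter is given as a list and n_iter exceeds the number of combinations, scikit-learn issues a warning and evaluates each combination once, which is equivalent to a grid search. Lower n_iter or widen the search space to suit your needs and time constraints.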

    • The best parameters and score are logged using JFrog ML logging functions.
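
Random search is most useful when parameters are drawn from distributions rather than short lists. The following sketch, which is not part of the original example, uses `scipy.stats` distributions bounded by the smallest and largest values from the example configuration, and `ParameterSampler` to preview what would be sampled. The same dictionary can be passed to `RandomizedSearchCV` as `param_distributions`.

```python
from scipy.stats import loguniform, randint
from sklearn.model_selection import ParameterSampler

# Bounds taken from the min and max of each list in the example configuration
param_distributions = {
    "n_estimators": randint(50, 201),        # integers in [50, 200]; upper bound is exclusive
    "max_depth": randint(10, 31),            # integers in [10, 30]
    "learning_rate": loguniform(0.01, 0.2),  # log-uniform over [0.01, 0.2]
}

# Preview five settings; random_state makes the draw reproducible
samples = list(ParameterSampler(param_distributions, n_iter=5, random_state=42))
for sample in samples:
    print(sample)
```

The log-uniform distribution for learning_rate spreads samples evenly across orders of magnitude, a common choice for parameters whose useful values span several scales.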

  3. Bayesian Optimization Example with Optuna

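    Optuna's default sampler, the Tree-structured Parzen Estimator (TPE), is a Bayesian optimization method: it uses the results of earlier trials to decide which hyperparameter values to try next, so it typically needs fewer trials than grid or random search to find good settings.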

    model.py

    import optuna
    import xgboost as xgb
    from sklearn.model_selection import cross_val_score
    from frogml import FrogMlModel
    import frogml
    
    
    class XGBoostOptunaModel(FrogMlModel):
        def __init__(self):
            self.best_params = None
            self.best_score = None
            self.param_ranges = self.read_hyperparameters()
    
        def objective(self, trial):
            param = {
                'n_estimators': trial.suggest_int('n_estimators',
                                                  min(self.param_ranges['n_estimators']),
                                                  max(self.param_ranges['n_estimators'])),
                'max_depth': trial.suggest_int('max_depth',
                                               min(self.param_ranges['max_depth']),
                                               max(self.param_ranges['max_depth'])),
                # suggest_float(..., log=True) replaces the deprecated suggest_loguniform
                'learning_rate': trial.suggest_float('learning_rate',
                                                     min(self.param_ranges['learning_rate']),
                                                     max(self.param_ranges['learning_rate']),
                                                     log=True)
            }
    
            model = xgb.XGBClassifier(**param)
            score = cross_val_score(model, self.X, self.y, cv=5, scoring='accuracy')
            return score.mean()
    
        def build(self):
            self.X, self.y = self.fetch_and_preprocess_data()
    
            study = optuna.create_study(direction='maximize')
            study.optimize(self.objective, n_trials=100)
    
            self.best_params = study.best_params
            self.best_score = study.best_value
    
            frogml.log_param(self.best_params)
            frogml.log_metric({'best_accuracy': self.best_score})

    • We define an objective function that Optuna will optimize. This function creates an XGBoost model with hyperparameters suggested by Optuna, then evaluates it using cross-validation.

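    • The hyperparameter ranges are loaded by read_hyperparameters(), using either method described earlier (configuration file or environment variables). Only the smallest and largest value of each list are used, as the lower and upper bounds of the search range.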

    • In the build method, we create an Optuna study and run the optimization for 100 trials.

    • The best parameters and score are logged using JFrog ML logging functions for easier comparisons later on.

Considerations

  • Single Instance Training: Currently, JFrog ML supports training only on a single instance and does not offer distributed training capabilities. Because every trial runs on that one instance, the duration of a hyperparameter optimization (HPO) task grows with the number of trials multiplied by the number of cross-validation folds. Budget build time and resources accordingly, especially when hyperparameters are explored sequentially.

  • Resource Management: Monitor memory requirements to avoid running out of memory (OOM) during later stages of hyperparameter optimization. Implement checkpointing where appropriate. Use the Resources tab in JFrog ML to track instance resource consumption.

  • Logging: Use JFrog ML logging capabilities (frogml.log_param() and frogml.log_metric()) to track your optimization process and compare results between different builds.
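
The checkpointing advice above can be sketched with the standard library alone. The example below is a minimal illustration, not a JFrog ML feature: `run_search`, its arguments, and the JSON checkpoint format are hypothetical, and it assumes the checkpoint path points to storage that is still available when the search is restarted.

```python
import itertools
import json
from pathlib import Path


def run_search(param_grid, evaluate, checkpoint_path):
    """Evaluate every combination, saving results after each one."""
    path = Path(checkpoint_path)

    # Resume from earlier results if a checkpoint file already exists
    results = json.loads(path.read_text()) if path.exists() else {}

    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        key = json.dumps(params, sort_keys=True)
        if key in results:
            continue  # already evaluated in a previous run
        results[key] = evaluate(params)
        path.write_text(json.dumps(results))  # checkpoint after every trial

    # Return the best-scoring combination and its score (higher is better)
    best_key = max(results, key=results.get)
    return json.loads(best_key), results[best_key]
```

Here `evaluate` is any function that takes a parameter dictionary and returns a score, for example the mean cross-validated accuracy. If the process stops partway through, calling `run_search` again with the same checkpoint path skips the combinations that were already scored.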