Optuna: Hyperparameter Optimization

1 Introduction

Optuna is an automatic hyperparameter optimization software framework designed for machine learning.

  • Lightweight, versatile, and cross-platform architecture; few dependencies, simple installation
  • Pythonic search spaces (conditionals and loops are written in plain Python)
  • Efficient optimization algorithms: state-of-the-art hyperparameter samplers plus support for pruning
  • Easy parallelization: scale to multiple servers with only small code changes
  • Quick visualization: a variety of plotting functions for inspecting optimization histories

Project page
Official documentation
Chinese documentation: not recommended, as it has long been unmaintained (as of 2023-03-31)

As of 2023-03-31, Optuna supports 7 hyperparameter sampling methods (grid search, random search, TPE, CMA-ES, partially fixed parameters, NSGA-II, and quasi-Monte Carlo (QMC)) and 6 pruning algorithms (median pruning, patient pruning, percentile pruning, ASHA, Hyperband, and threshold pruning).

More details on these algorithms are covered in the notes on automatic hyperparameter tuning.

2 Getting Started

Matching the five features listed in the previous section, the official docs provide five basic examples.

2.1 Finding the Minimum of a Function

import optuna
def objective(trial):
    x = trial.suggest_float("x", -10, 10) # define the parameter to tune and its range
    return (x - 2) ** 2 # define the objective function

study = optuna.create_study()
study.optimize(objective, n_trials=100)
best_params = study.best_params
found_x = best_params["x"]
print("Found x: {}, (x - 2)^2: {}".format(found_x, (found_x - 2) ** 2))
# Result: Found x: 1.9918889682901693, (x - 2)^2: 6.578883539787978e-05

2.2 Flexible Search Spaces

  • optuna.trial.Trial.suggest_categorical() for categorical parameters
  • optuna.trial.Trial.suggest_int() for integer parameters
  • optuna.trial.Trial.suggest_float() for floating-point parameters
import optuna

# Search space: a mix of parameter types
def objective(trial):
    # Categorical parameter 
    optimizer = trial.suggest_categorical("optimizer", ["MomentumSGD", "Adam"])
    # Integer parameter
    num_layers = trial.suggest_int("num_layers", 1, 3)
    # Integer parameter (log)
    num_channels = trial.suggest_int("num_channels", 32, 512, log=True)
    # Integer parameter (discretized)
    num_units = trial.suggest_int("num_units", 10, 100, step=5)
    # Floating point parameter
    dropout_rate = trial.suggest_float("dropout_rate", 0.0, 1.0)
    # Floating point parameter (log)
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    # Floating point parameter (discretized)
    drop_path_rate = trial.suggest_float("drop_path_rate", 0.0, 1.0, step=0.1)

import sklearn.ensemble
import sklearn.svm

# Search space: a conditional selects per-model hyperparameters
def objective(trial):
    classifier_name = trial.suggest_categorical("classifier", ["SVC", "RandomForest"])
    if classifier_name == "SVC":
        svc_c = trial.suggest_float("svc_c", 1e-10, 1e10, log=True)
        classifier_obj = sklearn.svm.SVC(C=svc_c)
    else:
        rf_max_depth = trial.suggest_int("rf_max_depth", 2, 32, log=True)
        classifier_obj = sklearn.ensemble.RandomForestClassifier(max_depth=rf_max_depth)

import torch
import torch.nn as nn

# Search space: a loop defines per-layer hyperparameters of the network
def create_model(trial, in_size):
    n_layers = trial.suggest_int("n_layers", 1, 3)

    layers = []
    for i in range(n_layers):
        n_units = trial.suggest_int("n_units_l{}".format(i), 4, 128, log=True)
        layers.append(nn.Linear(in_size, n_units))
        layers.append(nn.ReLU())
        in_size = n_units
    layers.append(nn.Linear(in_size, 10))
    return nn.Sequential(*layers)

2.3 Samplers and Pruners

For non-deep-learning tasks, consider either of the following combinations:

  • RandomSampler + MedianPruner
  • TPESampler + HyperbandPruner

For deep learning tasks, choose according to the situation (see the referenced paper):

  • With limited compute and a low-dimensional, continuous search space, consider GP-EI
  • With limited compute and a search space that is not low-dimensional and continuous, consider TPE
  • With ample compute and no categorical/conditional hyperparameters, consider CMA-ES or Random Search
  • With ample compute and categorical/conditional hyperparameters, consider a Genetic Algorithm or Random Search

A code example combining sampling with pruning:

import logging
import sys

import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection


def objective(trial):
    iris = sklearn.datasets.load_iris()
    classes = list(set(iris.target))
    train_x, valid_x, train_y, valid_y = sklearn.model_selection.train_test_split(
        iris.data, iris.target, test_size=0.25, random_state=0
    )

    alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
    clf = sklearn.linear_model.SGDClassifier(alpha=alpha)

    for step in range(100):
        clf.partial_fit(train_x, train_y, classes=classes)

        # Report intermediate objective value.
        intermediate_value = 1.0 - clf.score(valid_x, valid_y)
        trial.report(intermediate_value, step)

        # Handle pruning based on the intermediate value.
        if trial.should_prune(): # consult the pruning algorithm
            raise optuna.TrialPruned()

    return 1.0 - clf.score(valid_x, valid_y)

# Add stream handler of stdout to show the messages
optuna.logging.get_logger("optuna").addHandler(logging.StreamHandler(sys.stdout))
study = optuna.create_study(pruner=optuna.pruners.MedianPruner(), # specify the pruning method
                            sampler=optuna.samplers.CmaEsSampler()) # and the sampler
study.optimize(objective, n_trials=20)

2.4 Parallel Optimization

Parallelization relies on a relational database (e.g. MySQL) to share progress across nodes.

Option 1: specify MySQL via terminal commands

mysql -u root -e "CREATE DATABASE IF NOT EXISTS example"
optuna create-study --study-name "distributed-example" --storage "mysql://root@localhost/example"

Option 2: specify MySQL in the script

import optuna

def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2

if __name__ == "__main__":
    study = optuna.load_study(
        study_name="distributed-example", storage="mysql://root@localhost/example"
    )
    study.optimize(objective, n_trials=100)

2.5 Visualization

Option 1: launch a dashboard service directly

pip install optuna-dashboard
optuna-dashboard sqlite:///example-study.db

Option 2: draw the plots manually in a script

import lightgbm as lgb
import numpy as np
import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split
import optuna

# You can use Matplotlib instead of Plotly for visualization by simply replacing `optuna.visualization` with
# `optuna.visualization.matplotlib` in the following examples.
from optuna.visualization import plot_contour
from optuna.visualization import plot_edf
from optuna.visualization import plot_intermediate_values
from optuna.visualization import plot_optimization_history
from optuna.visualization import plot_parallel_coordinate
from optuna.visualization import plot_param_importances
from optuna.visualization import plot_slice

SEED = 42
np.random.seed(SEED)

def objective(trial):
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, valid_x, train_y, valid_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    dvalid = lgb.Dataset(valid_x, label=valid_y)

    param = {
        "objective": "binary",
        "metric": "auc",
        "verbosity": -1,
        "boosting_type": "gbdt",
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }

    # Add a callback for pruning.
    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, "auc")
    gbm = lgb.train(param, dtrain, valid_sets=[dvalid], callbacks=[pruning_callback])

    preds = gbm.predict(valid_x)
    pred_labels = np.rint(preds)
    accuracy = sklearn.metrics.accuracy_score(valid_y, pred_labels)
    return accuracy

study = optuna.create_study(
    direction="maximize",
    sampler=optuna.samplers.TPESampler(seed=SEED),
    pruner=optuna.pruners.MedianPruner(n_warmup_steps=10),
)
study.optimize(objective, n_trials=100, timeout=600)

plot_optimization_history(study) # optimization history (best value so far after each trial)
plot_intermediate_values(study) # learning curves: the intermediate values of every trial
plot_parallel_coordinate(study) # parallel-coordinate plot of high-dimensional parameters
plot_contour(study) # contour plot of pairwise parameter relationships
plot_slice(study) # slice plot per parameter (objective value vs. parameter value)
plot_param_importances(study) # hyperparameter importances
optuna.visualization.plot_param_importances(
    study, target=lambda t: t.duration.total_seconds(), target_name="duration"
) # parameter importances with respect to trial duration
plot_edf(study) # empirical distribution function (EDF) of the objective

3 Advanced Usage

Official documentation for the advanced topics:

  • Saving and restoring study records in a relational database
  • Multi-objective optimization with Optuna
  • User attributes; the command-line interface
  • Custom samplers and pruners
  • Callbacks during optimization; manually specifying hyperparameters
  • The ask-and-tell interface; re-running the best trial's hyperparameters

More usage examples are collected in the official examples repository.
