Search Space#

The search space is a central concept in parameter optimization. Grid search and random search are the most common tuning methods, and they may seem mutually exclusive. In fact, with a well-defined Space concept, both can be expressed in the same framework.

We have not seen a satisfying space design in popular tuning frameworks, so we refined this concept and created a space language, keeping it as intuitive and minimal as possible. This search space definition is used in conjunction with hyperparameter optimization frameworks such as Optuna and Hyperopt.

Core Classes#

The core classes include the Space class itself, plus Grid and the stochastic expressions. In the examples below, we import only the most commonly used ones.

from tune import Space, Grid, Rand, RandInt, Choice

A Space can be converted to a list of independent configurations (parameter combinations) that can be executed anywhere.

Static Space#

Here we define a search space containing only fixed parameters; this is the most basic case. To see the configurations defined in a Space object, simply call list() on it, as shown below.

space = Space(a=1, b=2)

list(space)
[{'a': 1, 'b': 2}]

Grid Search Space#

A Grid expression means every listed value must be represented in the configurations. If there are multiple Grid expressions, we take their cross product. In the Space below, b and c are cross-multiplied.

space = Space(a=1, b=Grid(2,3), c=Grid("a","b"))

list(space)
[{'a': 1, 'b': 2, 'c': 'a'},
 {'a': 1, 'b': 2, 'c': 'b'},
 {'a': 1, 'b': 3, 'c': 'a'},
 {'a': 1, 'b': 3, 'c': 'b'}]
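
As an illustrative check (not part of the original example), the number of configurations equals the product of the Grid lengths, which is 2 * 2 = 4 here:

assert len(list(space)) == 4  # two values for b times two values for c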

Random Search Space#

Stochastic expressions such as Rand and Choice draw a random value from their range or collection each time they are called, so there is no guarantee that the final sample space contains every value. However, you can control the total number of samples, and therefore the compute load.

space = Space(a=1, b=Rand(0,1), c=Choice("a","b"))

list(space.sample(3, seed=10))
[{'a': 1, 'b': 0.771320643266746, 'c': 'b'},
 {'a': 1, 'b': 0.0207519493594015, 'c': 'a'},
 {'a': 1, 'b': 0.6336482349262754, 'c': 'b'}]

Without calling sample(), the stochastic expressions do not produce any values by themselves; you must be explicit about how many samples you want. Setting the seed makes the result reproducible.
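
As a minimal illustration of the reproducibility claim, re-sampling the space defined above with the same seed returns identical configurations:

# same seed -> same configurations; a different seed generally gives different ones
assert list(space.sample(3, seed=10)) == list(space.sample(3, seed=10))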

Random Search Space without Sampling#

So far, we have seen pre-determined search spaces. Although grid search and random search are different approaches, both can be sampled before the tuning process to produce a list of configurations that can be run independently. Some tuning frameworks such as Optuna sample random variables during each trial, whereas this approach performs the sampling at “compile time”.
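
For contrast, here is a minimal sketch of run-time sampling in Optuna (the objective function and parameter names are purely illustrative, and this assumes the optuna package is installed):

import optuna

def objective(trial):
    b = trial.suggest_float("b", 0, 1)  # sampled when the trial runs ("run time")
    return (b - 0.5) ** 2               # toy objective for illustration

study = optuna.create_study()
study.optimize(objective, n_trials=3)

# With Space, sampling happens before tuning starts ("compile time"),
# producing a plain list of configurations that can be shipped anywhere.
configs = list(Space(b=Rand(0, 1)).sample(3, seed=10))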

However, there are also cases where we don’t want to sample ahead of time. In the following code, Rand() is left as an expression inside the configuration.

space = Space(a=1, b=Rand(0,1), c=Choice("a","b"))

list(space)
[{'a': 1, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': Choice('a', 'b')}]

Space provides this flexibility to stay compatible with Bayesian Optimization, where the algorithm decides which values to try at each iteration of the tuning process, using the history of previous iterations to determine the next best guess.

The iterations are sequential, but it takes far fewer guesses than random search to achieve comparable results. The total compute is much lower, yet the execution time can be longer, because random search can be fully parallelized while Bayesian Optimization cannot.

In summary, every search algorithm has pros and cons, so do not stick with just one. That is why we combine them in the next step.

Space Operations#

In practice, it is often easiest to define multiple search spaces and then take their union to form the whole search space. Space objects can be added together.

from lightgbm import LGBMRegressor
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

xgb_grid = Space(model=XGBRegressor, n_estimators=Grid(50,150))                    # Grid search
lgbm_random = Space(model=LGBMRegressor, n_estimators=RandInt(100,200)).sample(3)  # Random search
catboost_bo = Space(model=CatBoostRegressor, n_estimators=RandInt(100,200))        # Bayesian Optimization

union_space = xgb_grid + lgbm_random + catboost_bo # "+" takes the union of spaces

list(union_space)
[{'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': 50},
 {'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': 150},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 115},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 169},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 182},
 {'model': <class 'catboost.core.CatBoostRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)}]
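
As an illustrative sanity check, the sizes simply add up under union: 2 grid configurations + 3 random samples + 1 un-sampled BO configuration = 6.

assert len(list(union_space)) == 6  # 2 (grid) + 3 (sampled) + 1 (BO expression)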

The Space objects can also be multiplied together to form a cross product. In the example below, we want to apply Bayesian Optimization to the learning rate for all of the configurations.

non_bo_space = Space(
    model=LGBMRegressor, 
    n_estimators=100,
    boosting=Grid("dart", "gbdt"),    # Grid search
    feature_fraction=Rand(0.5, 1)     # Random search
).sample(2, seed=0) 

bo_space = Space(
    learning_rate=Rand(1e-8, 10, log=True)  # Bayesian Optimization
) 

product_space = non_bo_space * bo_space # "*" takes cross product of spaces

list(product_space)
[{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'dart', 'feature_fraction': 0.7744067519636624, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'gbdt', 'feature_fraction': 0.7744067519636624, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'dart', 'feature_fraction': 0.8575946831862098, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'gbdt', 'feature_fraction': 0.8575946831862098, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)}]
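
Similarly, the cross product multiplies the sizes: 2 Grid values times 2 random samples times 1 un-sampled BO configuration gives 4 configurations (an illustrative check, not part of the original example).

assert len(list(product_space)) == 4  # (2 grid x 2 samples) x 1 BO configuration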

Hybrid Search Space#

We can put everything together in the example below, where we tune a common parameter across multiple search spaces.

# use case: any kind of hybrid search spaces
# e.g. tuning a common parameter on 3 modeling algorithms

xgb_static = Space(model=XGBRegressor)
lgb_static = Space(model=LGBMRegressor)
catboost_static = Space(model=CatBoostRegressor)

bo_space = Space(n_estimators=RandInt(100,200)) # Bayesian Optimization on a common parameter

hybrid_space = (xgb_static + lgb_static + catboost_static) * bo_space

list(hybrid_space)
[{'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)},
 {'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)},
 {'model': <class 'catboost.core.CatBoostRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)}]
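
As a final illustrative check (assuming the configurations behave like the plain dicts shown in the output), each static model space contributes one configuration, and every configuration carries the shared RandInt expression for Bayesian Optimization:

confs = list(hybrid_space)
assert len(confs) == 3
assert all(isinstance(c["n_estimators"], RandInt) for c in confs)  # shared BO expression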

Conclusion#

The building blocks for defining a search space for hyperparameter tuning are minimal, yet expressive enough to form a wide range of configurations. Notice that the search space is also decoupled from any hyperparameter tuning framework. This is intentional, so that users can focus purely on defining the space without worrying about implementation details.