Search Space#
Search space is a core concept in parameter optimization. Grid search and random search are the most common tuning methods, and they may seem mutually exclusive, but with a well-defined Space concept both can be expressed in one framework.
We did not find a satisfying space design in popular tuning frameworks, so we polished this concept and created a small space language, keeping it as intuitive and minimal as possible. This search space definition is used in conjunction with hyperparameter optimization frameworks such as Optuna and Hyperopt.
Core Classes#
The core classes are the Space class itself, Grid, and the stochastic expressions. In the following example, we import only the most commonly used ones.
from tune import Space, Grid, Rand, RandInt, Choice
A Space can be converted to a list of independent configurations (parameter combinations) that can be executed anywhere.
Static Space#
Here, we define a search space that has only fixed parameters in it. This is the most basic search space. To see the configurations defined in a Space object, we just have to call list() on it, as shown below.
space = Space(a=1, b=2)
list(space)
[{'a': 1, 'b': 2}]
Grid Search Space#
A Grid parameter means every listed value must be represented in the configurations. If there are multiple Grid expressions, we take their cross product. In the Space below, b and c are cross-multiplied.
space = Space(a=1, b=Grid(2,3), c=Grid("a","b"))
list(space)
[{'a': 1, 'b': 2, 'c': 'a'},
{'a': 1, 'b': 2, 'c': 'b'},
{'a': 1, 'b': 3, 'c': 'a'},
{'a': 1, 'b': 3, 'c': 'b'}]
Random Search Space#
Stochastic expressions such as Rand and Choice draw a random value from their range or collection every time they are called, so there is no guarantee that the final sample space contains every possible value. However, you can control the total number of samples, and therefore the compute load.
space = Space(a=1, b=Rand(0,1), c=Choice("a","b"))
list(space.sample(3, seed=10))
[{'a': 1, 'b': 0.771320643266746, 'c': 'b'},
{'a': 1, 'b': 0.0207519493594015, 'c': 'a'},
{'a': 1, 'b': 0.6336482349262754, 'c': 'b'}]
Without calling sample(), the stochastic expressions do not produce any values by themselves; you must be explicit about how many samples you want. Setting the seed makes the results reproducible.
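As a quick check of the reproducibility claim, here is a minimal sketch (assuming sample() yields a fresh iterable of plain configuration dicts on each call, as the outputs above suggest):

from tune import Space, Rand, Choice

space = Space(a=1, b=Rand(0, 1), c=Choice("a", "b"))

# Sampling twice with the same seed should yield identical configurations.
assert list(space.sample(3, seed=10)) == list(space.sample(3, seed=10))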
Random Search Space without Sampling#
So far, we have seen pre-determined search spaces. Even though grid search and random search are different approaches, we can sample before the tuning process to create a list of configurations that can be run independently. Some tuning frameworks, such as Optuna, let you sample random variables during a trial, but this approach lets you perform the sampling at “compile time”.
However, there are also cases where we don’t want to sample ahead of time. In the following code, Rand() is left as an expression.
space = Space(a=1, b=Rand(0,1), c=Choice("a","b"))
list(space)
[{'a': 1, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': Choice('a', 'b')}]
Space provides this flexibility to be compatible with Bayesian Optimization, where the algorithm decides what values to try at each iteration of the tuning process, using the history of previous iterations to determine the best next guess.
The iterations are sequential, but Bayesian Optimization typically needs far fewer guesses than random search to achieve comparable results. The total compute is much lower, but the execution time can be longer because random search can be fully parallelized while Bayesian Optimization cannot.
In summary, every search algorithm has pros and cons, so don't stick with just one. That is why we combine them in the next step.
Grid + Random Search#
It is common to want grid search on some parameters and random or Bayesian Optimization search on others.
space = Space(a=1, b=Grid(1,2), c=Rand(0,1))
list(space)
[{'a': 1, 'b': 1, 'c': Rand(low=0, high=1, q=None, log=False, include_high=True)},
{'a': 1, 'b': 2, 'c': Rand(low=0, high=1, q=None, log=False, include_high=True)}]
The above space will run two Bayesian Optimization runs, one per Grid value. If you want grid search + random search instead, call sample() on the Space as shown next. Remember that all items in Grid will be represented in the resulting configurations.
space = Space(a=1, b=Grid(1,2), c=Rand(0,1)).sample(3)
list(space)
[{'a': 1, 'b': 1, 'c': 0.5833217369377363},
{'a': 1, 'b': 2, 'c': 0.5833217369377363},
{'a': 1, 'b': 1, 'c': 0.02517172841774562},
{'a': 1, 'b': 2, 'c': 0.02517172841774562},
{'a': 1, 'b': 1, 'c': 0.709208009843012},
{'a': 1, 'b': 2, 'c': 0.709208009843012}]
The stochastic expressions are sampled first and then cross-multiplied with the Grid points, which is why you see 2*3=6 configurations. Grid guarantees that every value is present, but sample() does not, so our design choice is to always satisfy the Grid requirement and apply sample() only to the stochastic expressions.
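As a small illustrative check of this guarantee (a sketch, separate from the example above), every Grid value appears in the sampled configurations no matter how few samples are drawn:

from tune import Space, Grid, Rand

confs = list(Space(a=Grid(1, 2), b=Rand(0, 1)).sample(3, seed=0))

# Both Grid values are guaranteed to appear; Rand contributes only sampled values.
assert {conf["a"] for conf in confs} == {1, 2}
assert len(confs) == 2 * 3  # grid values cross-multiplied with samples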
Space Operations#
In practice, it is often easier to define multiple search spaces and then take their union to form the whole search space. Space objects can be added together.
from lightgbm import LGBMRegressor
from xgboost import XGBRegressor
from catboost import CatBoostRegressor
xgb_grid = Space(model=XGBRegressor, n_estimators=Grid(50,150))
lgbm_random = Space(model=LGBMRegressor, n_estimators=RandInt(100,200)).sample(3)
catboost_bo = Space(model=CatBoostRegressor, n_estimators=RandInt(100,200))
union_space = xgb_grid + lgbm_random + catboost_bo # "+" takes the union of spaces
list(union_space)
[{'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': 50},
{'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': 150},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 115},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 169},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 182},
{'model': <class 'catboost.core.CatBoostRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)}]
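Since "+" simply concatenates the configurations of its operands, the union size is the sum of the operand sizes (2 grid points + 3 samples + 1 unsampled expression = 6 above). A self-contained sketch of that arithmetic:

from tune import Space, Grid, Rand

s1 = Space(a=Grid(1, 2))             # 2 configurations
s2 = Space(b=Rand(0, 1)).sample(3)   # 3 configurations

# "+" takes the union, so the combined space has 2 + 3 = 5 configurations.
assert len(list(s1 + s2)) == 5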
Space objects can also be multiplied together to form a cross product. In the example below, we want to apply Bayesian Optimization to the learning rate for all of the configurations.
non_bo_space = Space(
model=LGBMRegressor,
n_estimators=100,
boosting=Grid("dart", "gbdt"), # Grid search
feature_fraction=Rand(0.5, 1) # Random search
).sample(2, seed=0)
bo_space = Space(
learning_rate=Rand(1e-8, 10, log=True) # Bayesian Optimization
)
product_space = non_bo_space * bo_space # "*" takes cross product of spaces
list(product_space)
[{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'dart', 'feature_fraction': 0.7744067519636624, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'gbdt', 'feature_fraction': 0.7744067519636624, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'dart', 'feature_fraction': 0.8575946831862098, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': 100, 'boosting': 'gbdt', 'feature_fraction': 0.8575946831862098, 'learning_rate': Rand(low=1e-08, high=10, q=None, log=True, include_high=True)}]
Hybrid Search Space#
We can put everything together in the example below, where we tune a common parameter across multiple search spaces.
# use case: any kind of hybrid search spaces
# e.g. tuning a common parameter on 3 modeling algorithms
xgb_static = Space(model=XGBRegressor)
lgb_static = Space(model=LGBMRegressor)
catboost_static = Space(model=CatBoostRegressor)
bo_space = Space(n_estimators=RandInt(100,200)) # Bayesian Optimization on a common parameter
hybrid_space = (xgb_static + lgb_static + catboost_static) * bo_space
list(hybrid_space)
[{'model': <class 'xgboost.sklearn.XGBRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)},
{'model': <class 'lightgbm.sklearn.LGBMRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)},
{'model': <class 'catboost.core.CatBoostRegressor'>, 'n_estimators': RandInt(low=100, high=200, q=1, log=False, include_high=True)}]
Conclusion#
The building blocks for defining a search space for hyperparameter tuning are minimal, yet expressive enough to form a wide range of configurations. Notice that the search space is also decoupled from any hyperparameter tuning framework. This is intentional, so that users can focus purely on defining the space without worrying about implementation.
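Because each sampled configuration is just a plain dictionary, it can be fed to any evaluation function you like. Below is a minimal sketch of that idea; the objective function is hypothetical and stands in for real model training and scoring.

from tune import Space, Grid, Rand

# Hypothetical objective: in real use this would train a model and return a metric.
def objective(conf: dict) -> float:
    return (conf["b"] - 0.5) ** 2 + conf["c"]

space = Space(a=1, b=Rand(0, 1), c=Grid(0, 1)).sample(3, seed=0)
best = min(list(space), key=objective)  # configurations are plain dicts
print(best)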