Write a benchmark#
A benchmark is composed of three elements: an objective function, a list of datasets, and a list of solvers.
A benchmark is defined in a folder that should respect a certain structure. For example
my_benchmark/
├── objective.py  # contains the definition of the objective
├── datasets/
│   ├── dataset1.py  # some dataset
│   └── dataset2.py  # some dataset
└── solvers/
    ├── solver1.py  # some solver
    └── solver2.py  # some solver
Examples of actual benchmarks are available in the benchopt organisation, such as those for Ordinary Least Squares (OLS), Lasso, or L1-regularized logistic regression.
Note
The simplest way to create a benchmark is to copy an existing folder and to adapt its content. A benchmark template is provided as a GitHub template repository here.
1. Objective#
The objective function is defined through a Python class, Objective, defined in objective.py. This class makes it possible to monitor the quantities of interest along the iterations of the solvers. Typically, it evaluates the objective function to be minimized by the solvers.
An objective class should define four methods:

- set_data(**data): specifies the data. See the data as a dictionary of Python variables without any constraint. In the following example, the data contains only one variable, X. This data is provided by the Dataset.get_data method of a dataset.
- get_objective(): returns the information that each solver will need to provide a result. The information is also passed as a dictionary, which serves as input for the Solver.set_objective method of the solvers.
- evaluate_result(X_hat): evaluates the output of the different methods, here called X_hat. This method takes as input a dictionary, which is provided by the Solver.get_result method. All other parameters should be stored in the class with the set_data method. evaluate_result should return a float (understood as the objective value) or a dictionary. If a dictionary is returned, it should contain a key called value (the objective value), and all other keys should have float values, so that more than one quantity of interest can be tracked (e.g. train and test errors).
- get_one_result(): returns one solution that could be returned by a solver. This defines the shape of the solution and is used to test that the benchmark works properly.
An objective class needs to inherit from the base class benchopt.BaseObjective.
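To see how these four methods fit together, here is a schematic sketch of the call sequence. The benchmark_flow helper is hypothetical (benchopt's actual runner also handles timing, repetitions, and stopping); it only illustrates how the dictionaries flow between the Dataset, the Objective, and the Solver.

def benchmark_flow(dataset, objective, solver, stop_condition):
    """Hypothetical helper sketching how benchopt wires the classes."""
    # Dataset -> Objective: the data dictionary.
    objective.set_data(**dataset.get_data())
    # Objective -> Solver: the problem description.
    solver.set_objective(**objective.get_objective())
    # Run the method until the stopping condition is met.
    solver.run(stop_condition)
    # Solver -> Objective: evaluate the returned solution.
    return objective.evaluate_result(**solver.get_result())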
Note
Multiple metrics can be returned by Objective.evaluate_result, as long as they are stored in a dictionary with the key value corresponding to the main metric to track.
Example#
from benchopt import BaseObjective
import numpy as np


class Objective(BaseObjective):
    # Name of the Objective function
    name = 'Quadratic'

    # The three methods below define the links between the Dataset,
    # the Objective and the Solver.

    def set_data(self, X):
        """Set the data from a Dataset to compute the objective.

        The arguments are the keys of the dictionary returned by
        ``Dataset.get_data``.
        """
        self.X = X

    def get_objective(self):
        """Return a dict passed to the ``Solver.set_objective`` method."""
        return dict(X=self.X)

    def evaluate_result(self, X_hat):
        """Compute the objective value(s) given the output of a solver.

        The arguments are the keys in the dictionary returned
        by ``Solver.get_result``.
        """
        return dict(value=np.linalg.norm(self.X - X_hat))

    def get_one_result(self):
        """Return one solution for which the objective can be evaluated.

        This function is mostly used for testing and debugging purposes.
        """
        return dict(X_hat=1)
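As mentioned in the note above, Objective.evaluate_result can also track several metrics at once. Here is a hedged variant of the method above, as a drop-in replacement in the same class; the extra key squared_error is an illustrative name, not part of the original example.

    def evaluate_result(self, X_hat):
        """Variant returning several metrics in a dictionary.

        The key ``value`` is the main metric tracked by benchopt; all
        other keys must map to float values.
        """
        error = np.linalg.norm(self.X - X_hat)
        return dict(value=error, squared_error=error ** 2)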
2. Datasets#
A dataset defines what can be passed to an objective. More specifically, a dataset should implement one method:

- get_data(): a method which outputs a dictionary that is passed as keyword arguments **data to the Objective.set_data method of the objective.
A dataset class also needs to inherit from the base class benchopt.BaseDataset.
Example#
from benchopt import BaseDataset
import numpy as np


class Dataset(BaseDataset):
    # Name of the Dataset, used to select it in the CLI
    name = 'simulated'

    # ``get_data()`` is the only method a dataset should implement.
    def get_data(self):
        """Load the data for this Dataset.

        Usually, the data are either loaded from disk as arrays or Tensors,
        or a dataset/dataloader object is used to allow the models to load
        the data in more flexible forms (e.g. with mini-batches).

        The dictionary's keys are the kwargs passed to ``Objective.set_data``.
        """
        return dict(X=np.random.randn(10, 2))
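Like a solver, a dataset can also expose parameters to generate several variants of the data. This is a minimal sketch assuming the same parameters mechanism shown for solvers below; the names 'simulated-sized' and n_samples are illustrative choices.

from benchopt import BaseDataset
import numpy as np


class Dataset(BaseDataset):
    # Illustrative name for this parametrized variant
    name = 'simulated-sized'

    # Each combination of parameter values yields a separate dataset;
    # each parameter is accessible as ``self.<param_name>``.
    parameters = {'n_samples': [10, 100]}

    def get_data(self):
        # The keys are the kwargs passed to ``Objective.set_data``.
        return dict(X=np.random.randn(self.n_samples, 2))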
3. Solvers#
A solver must define three methods:

- set_objective(**objective_dict): stores information about the data and the objective, and initializes required quantities. This method is called with the dictionary returned by the Objective.get_objective method.
- run(stop_condition): runs the actual method to benchmark. This is where the important part of the solver goes. This method takes one parameter controlling the stopping condition of the solver: either a number of iterations n_iter, a tolerance parameter tol, or a callback function that will be called at each iteration. See the note below for more information on this parameter.
- get_result(): formats the output of the method so it can be evaluated by the Objective. This method returns a dictionary that is passed to Objective.evaluate_result.
Example#
from benchopt import BaseSolver
import numpy as np


class Solver(BaseSolver):
    # Name of the Solver, used to select it in the CLI
    name = 'gd'

    # By default, benchopt evaluates the result of a method after various
    # numbers of iterations. Setting the sampling_strategy controls how
    # this is done. Here, we use a callback function that is called at
    # each iteration.
    sampling_strategy = 'callback'

    # Parameters of the method, that will be tested by the benchmark.
    # Each parameter ``param_name`` will be accessible as ``self.param_name``.
    parameters = {'lr': [1e-3, 1e-2]}

    # The three methods below define the necessary methods for the Solver:
    # to get the info from the Objective, to run the method, and to return
    # a result that can be evaluated by the Objective.
    def set_objective(self, X):
        """Set the info from an Objective, to run the method.

        This method is also typically used to adapt the solver's parameters
        to the data (e.g. scaling) or to initialize the algorithm.

        The kwargs are the keys of the dictionary returned by
        ``Objective.get_objective``.
        """
        self.X = X
        self.X_hat = np.zeros_like(X)

    def run(self, cb):
        """Run the actual method to benchmark.

        Here, as we use a "callback", we need to call it at each iteration
        to evaluate the result as the procedure progresses.

        The callback implements a stopping mechanism, based on the number
        of iterations, the time and the evolution of the performance.
        """
        while cb():
            self.X_hat = self.X_hat - self.lr * (self.X_hat - self.X)

    def get_result(self):
        """Format the output of the method to be evaluated in the Objective.

        Returns a dict which is passed to the ``Objective.evaluate_result``
        method.
        """
        return {'X_hat': self.X_hat}
Note
Sampling strategy:
A solver should also define a sampling_strategy as a class attribute. This sampling_strategy can be:

- 'iteration': the run method of the solver is parametrized by the number of iterations computed. The parameter is called n_iter and should be an integer (a sketch follows this note).
- 'tolerance': the run method of the solver is parametrized by a tolerance that should decrease with the running time. The parameter is called tol and should be a positive float.
- 'callback': the run method of the solver should call the provided callback function at each iteration. The callback computes and stores the objective, and returns False once the computations should stop.
- 'run_once': the run method of the solver is run only once during the benchmark.
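For comparison with the callback-based solver above, here is a hedged sketch of a solver using the 'iteration' strategy. Under this strategy, run receives an integer n_iter and is called with increasing values to sample the convergence curve, so the method restarts from scratch each time. The name 'gd-iter' is illustrative.

from benchopt import BaseSolver
import numpy as np


class Solver(BaseSolver):
    # Illustrative name for an iteration-based variant of 'gd'
    name = 'gd-iter'

    # With 'iteration', ``run`` is parametrized by an integer ``n_iter``.
    sampling_strategy = 'iteration'

    parameters = {'lr': [1e-3, 1e-2]}

    def set_objective(self, X):
        self.X = X

    def run(self, n_iter):
        # Restart from zero and perform ``n_iter`` gradient steps.
        X_hat = np.zeros_like(self.X)
        for _ in range(n_iter):
            X_hat -= self.lr * (X_hat - self.X)
        self.X_hat = X_hat

    def get_result(self):
        # Dict passed to ``Objective.evaluate_result``.
        return {'X_hat': self.X_hat}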