Get started
Installation
Benchopt can be installed directly with pip:
    pip install benchopt
To gain access to certain features, in particular to install the requirements of community benchmarks, benchopt needs to be installed in a conda environment.
Run an existing benchmark
When running a benchmark, benchopt automatically detects all solvers and datasets defined in the repository, with their grid of parameters, and runs all combinations. This produces a result file that can be visualized with the interactive dashboard.
The core of benchopt is thus a simple interface to the part that is duplicated across all benchmarks: the loop that runs all methods on all datasets. Extra features come on top of this: caching, parallelism, reproducibility, and more.
To run an existing benchmark, clone the repository, install its requirements, and run it:
    git clone https://github.com/benchopt/template_benchmark_ml.git
    benchopt install template_benchmark_ml  # only works in conda env
    benchopt run template_benchmark_ml
See Install and run a benchmark for a full walkthrough including CLI options, configuration files, and caching.
Create your own benchmark
A benchmark is a folder with three types of components, each a Python file:
    my_benchmark/
    ├── objective.py
    ├── datasets/
    │   └── my_dataset.py
    └── solvers/
        └── my_solver.py
Each file defines one component of the benchmark; benchopt automatically discovers these components and links them together:
Dataset: loads or generates data.
Objective: defines what is measured; evaluate_result() computes your metrics (accuracy, loss, throughput, …).
Solver: the method under evaluation, which trains a model or solves an optimization problem.
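The way the three roles fit together can be sketched in plain Python. This is only a schematic under stated assumptions: the classes below do not use benchopt's base classes, and every name except evaluate_result() (ToyDataset, ToyObjective, get_data, set_data, run, run_all, the two toy solvers) is chosen for illustration. It shows a dataset producing data, solvers producing a result, and the objective scoring that result for every solver/dataset pair.

```python
# Plain-Python stand-ins for the three roles. These are NOT benchopt classes;
# all names here are illustrative assumptions.


class ToyDataset:
    """Dataset role: loads or generates data."""

    def __init__(self, n_samples):
        self.n_samples = n_samples

    def get_data(self):
        # Points on the line y = 2 * x, so the best slope is exactly 2.
        xs = [float(i) for i in range(1, self.n_samples + 1)]
        ys = [2.0 * x for x in xs]
        return {"xs": xs, "ys": ys}


class ToyObjective:
    """Objective role: defines what is measured."""

    def set_data(self, xs, ys):
        self.xs, self.ys = xs, ys

    def evaluate_result(self, slope):
        # Mean squared error of the proposed slope on the data.
        errors = [(slope * x - y) ** 2 for x, y in zip(self.xs, self.ys)]
        return {"mse": sum(errors) / len(errors)}


class LeastSquaresSolver:
    """Solver role: the method under evaluation."""

    name = "least_squares"

    def run(self, xs, ys):
        # Closed-form slope for a line through the origin.
        return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


class ConstantSolver:
    """A deliberately poor baseline that always answers 1.0."""

    name = "constant"

    def run(self, xs, ys):
        return 1.0


def run_all(datasets, solvers):
    """Run every solver on every dataset and collect the metrics."""
    results = []
    for dataset in datasets:
        data = dataset.get_data()
        objective = ToyObjective()
        objective.set_data(**data)
        for solver in solvers:
            slope = solver.run(**data)
            metrics = objective.evaluate_result(slope)
            results.append((dataset.n_samples, solver.name, metrics["mse"]))
    return results


results = run_all(
    [ToyDataset(3), ToyDataset(5)],
    [LeastSquaresSolver(), ConstantSolver()],
)
for n_samples, solver_name, mse in results:
    print(f"n_samples={n_samples} solver={solver_name} mse={mse:.3f}")
```

In a real benchmark the loop in run_all is the part benchopt provides, so only the three component files need to be written.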
This structure is intentionally general: creating a benchmark is mostly a matter of deciding which concept in your problem maps to which class — what counts as “data”, what counts as “performance”, and what counts as “a method”. Benchopt then handles the rest, including running all combinations, aggregating results, and providing a dashboard to visualize them, with enhanced features to make your life easier!
See Write a benchmark for a complete guide on how to write a benchmark, or use one of our templates to get started quickly: ML benchmarks | Optimization benchmarks.
Key features
Caching — Each run is cached (via joblib) so that re-running
benchopt run skips combinations whose results are already stored.
The cache is invalidated automatically when the solver, objective, or dataset
code changes. To force re-running specific solvers, pass
-f SOLVER_NAME; to disable caching entirely, pass --no-cache.
See Caching solver runs.
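The idea of a cache that is invalidated when code changes can be illustrated with a small standard-library sketch. It is not how joblib or benchopt implement caching: cache_key, cached_run and the in-memory _cache dictionary are hypothetical helpers, and the source code is passed in as a plain string to keep the example self-contained. The key combines the code and the arguments, so editing either one produces a cache miss.

```python
import hashlib
import json

# In-memory stand-in for an on-disk cache (illustrative assumption).
_cache = {}


def cache_key(source, kwargs):
    """Build a key from the code text and the call arguments."""
    payload = json.dumps({"source": source, "kwargs": kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_run(func, source, **kwargs):
    """Return (result, was_cached) for func(**kwargs)."""
    key = cache_key(source, kwargs)
    if key in _cache:
        return _cache[key], True
    result = func(**kwargs)
    _cache[key] = result
    return result, False


def solve(x):
    return x * 2


# Two versions of the "source code" of the same solver.
SOURCE_V1 = "def solve(x): return x * 2"
SOURCE_V2 = "def solve(x): return x * 2  # edited"

first = cached_run(solve, SOURCE_V1, x=3)   # computed
second = cached_run(solve, SOURCE_V1, x=3)  # served from the cache
third = cached_run(solve, SOURCE_V2, x=3)   # code changed, so recomputed
print(first, second, third)
```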
Parallelism — Solver/dataset combinations run sequentially by default.
Pass -j N to use N local workers, or --parallel-config slurm.yml to
dispatch jobs on a cluster (SLURM, Dask, …).
See Distributed run with Benchopt.
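The difference between a sequential run and a run with N local workers can be sketched with the standard library. This is an analogy only: it uses concurrent.futures rather than whatever backend benchopt relies on, and run_one, run_all and the solver and dataset names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_one(combination):
    """Stand-in for running one solver on one dataset."""
    solver_name, dataset_name = combination
    return f"{solver_name} on {dataset_name}"


def run_all(solvers, datasets, n_workers=1):
    """Run all combinations, sequentially or with a pool of workers."""
    combinations = list(product(solvers, datasets))
    if n_workers == 1:
        return [run_one(c) for c in combinations]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(run_one, combinations))


sequential = run_all(["gd", "sgd"], ["small", "large"])
parallel = run_all(["gd", "sgd"], ["small", "large"], n_workers=2)
print(parallel)
```

Because each combination is independent of the others, the parallel run gives the same results as the sequential one, only faster when individual runs are expensive.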
Reproducibility — Call self.get_seed() in any component to obtain a
deterministic integer seed that changes with the repetition index. Run with
--n-rep N to get N independent repetitions with different seeds.
See Controlling randomness in Benchopt.
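A seed that is deterministic yet differs between repetitions can be obtained by hashing a base seed together with the repetition index. The sketch below illustrates that general idea with the standard library; derive_seed and noisy_measurement are hypothetical helpers and do not reproduce the logic behind get_seed().

```python
import hashlib
import random


def derive_seed(base_seed, repetition):
    """Map (base_seed, repetition) to a reproducible 32-bit integer."""
    text = f"{base_seed}-{repetition}".encode("utf-8")
    digest = hashlib.sha256(text).digest()
    return int.from_bytes(digest[:4], "big")


def noisy_measurement(base_seed, repetition):
    """Draw one random number from a generator seeded per repetition."""
    rng = random.Random(derive_seed(base_seed, repetition))
    return rng.random()


seeds = [derive_seed(0, repetition) for repetition in range(3)]
print(seeds)
```

Calling noisy_measurement twice with the same arguments returns the same value, while changing the repetition index gives an independent draw.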
Parametrization — Set parameters = {"lr": [1e-3, 1e-2]} on any class
to sweep over values. Benchopt runs the full Cartesian product automatically
and labels each curve in the dashboard.
See Parametrization.
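The Cartesian product over a parameter dictionary can be sketched with itertools.product. The helpers expand_grid and label and the batch_size parameter are assumptions added for illustration; the label format is not the one used by the dashboard.

```python
from itertools import product


def expand_grid(parameters):
    """Return one dict per combination of parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def label(config):
    """Build a short human-readable label for one configuration."""
    return ",".join(f"{name}={value}" for name, value in config.items())


grid = expand_grid({"lr": [1e-3, 1e-2], "batch_size": [32, 64]})
for config in grid:
    print(label(config))
```

Two values for each of two parameters yield four configurations, which is why grids grow quickly as parameters are added.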
Result management — Results are .parquet files stored in
./outputs. Use benchopt merge to combine runs from different machines
or users, and benchopt publish to share them on GitHub or Hugging Face.
See Manage and Visualize benchmark results.
Convergence tracking — For iterative solvers, set
sampling_strategy = "callback" (or "iteration" / "tolerance") to
record how the objective evolves with compute budget. Benchopt handles the
timing and stopping logic.
See Evaluating an iterative method.
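The callback pattern can be illustrated with a self-contained toy: the solver calls a callback at every iteration, and the callback records the objective value and decides whether to continue. This is a simplified sketch, not benchopt's implementation: RecordingCallback and gradient_descent are hypothetical names, the current iterate is passed to the callback directly, and the recorded time includes the cost of evaluating the objective.

```python
import time


class RecordingCallback:
    """Record the objective at each call and stop after max_iter updates."""

    def __init__(self, evaluate, max_iter):
        self.evaluate = evaluate
        self.max_iter = max_iter
        self.history = []  # (iteration, elapsed seconds, objective value)
        self.n_calls = 0
        self.start = time.perf_counter()

    def __call__(self, result):
        elapsed = time.perf_counter() - self.start
        self.history.append((self.n_calls, elapsed, self.evaluate(result)))
        self.n_calls += 1
        # Keep going until max_iter updates have been allowed.
        return self.n_calls <= self.max_iter


def gradient_descent(callback, lr=0.25):
    """Minimize f(x) = x**2, letting the callback drive the loop."""
    x = 4.0
    while callback(x):
        x -= lr * 2 * x  # the gradient of x**2 is 2 * x
    return x


callback = RecordingCallback(evaluate=lambda x: x ** 2, max_iter=3)
final_x = gradient_descent(callback)
values = [value for _, _, value in callback.history]
print(final_x, values)
```

The solver contains no timing or stopping code of its own, which is the point of the pattern: the same loop can be sampled under different budgets without being rewritten.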