benchopt.datasets.simulated.make_correlated_data
- benchopt.datasets.simulated.make_correlated_data(n_samples=100, n_features=50, n_tasks=1, rho=0.6, snr=3.0, w_true=None, density=0.2, X_density=1.0, random_state=None)
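Generate a linear regression problem with decaying correlation $\rho^{|i-j|}$ for the design matrix.
The data are generated according to:
$$y = X w^* + \epsilon$$
where $w^*$ is the true coefficient vector (`w_true`) and $\epsilon$ is the noise, such that the signal-to-noise ratio is $\mathrm{snr} = \frac{\|X w^*\|}{\|\epsilon\|}$.
The generated features have mean 0, variance 1 and the expected correlation structure:
$$\mathbb{E}[x_i] = 0, \quad \mathbb{E}[x_i^2] = 1 \quad \text{and} \quad \mathbb{E}[x_i x_j] = \rho^{|i-j|}$$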
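The generative model described here can be illustrated with a minimal NumPy sketch. It is not benchopt's implementation: it assumes a linear model `y = X @ w_star + eps`, a feature correlation of `rho ** |i - j|`, and a signal-to-noise ratio defined as the ratio of the norms of signal and noise. The names `w_star`, `corr`, and `eps`, and the chosen coefficient values, are hypothetical and used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, rho, snr = 2000, 5, 0.6, 3.0

# Toeplitz correlation matrix with entries rho ** |i - j|.
idx = np.arange(n_features)
corr = rho ** np.abs(idx[:, None] - idx[None, :])

# Draw the rows of X from a centred Gaussian with that covariance,
# so each feature has mean 0 and variance 1 in expectation.
X = rng.multivariate_normal(np.zeros(n_features), corr, size=n_samples)

# Hypothetical sparse coefficient vector, for illustration only.
w_star = np.zeros(n_features)
w_star[[0, 3]] = [1.0, -2.0]

# Rescale Gaussian noise so that ||X w*|| / ||eps|| equals snr.
signal = X @ w_star
eps = rng.standard_normal(n_samples)
eps *= np.linalg.norm(signal) / (snr * np.linalg.norm(eps))
y = signal + eps

# Quantities to compare against the targets rho and snr.
empirical_snr = np.linalg.norm(signal) / np.linalg.norm(eps)
empirical_corr = np.corrcoef(X, rowvar=False)
```

With enough samples, `empirical_corr[i, j]` should be close to `rho ** abs(i - j)`, while `empirical_snr` matches `snr` up to floating-point error because the noise is rescaled explicitly.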
- Parameters:
- n_samples: int
Number of samples in the design matrix.
- n_features: int
Number of features in the design matrix.
- n_tasks: int
Number of tasks.
- rho: float
Correlation $\rho$ between successive features. The cross-correlation $C_{i,j}$ between feature i and feature j will be $\rho^{|i-j|}$. This parameter should be selected in $[0, 1[$.
- snr: float or np.inf
Signal-to-noise ratio.
- w_true: np.array, shape (n_features,) or (n_features, n_tasks) | None
True regression coefficients. If None, a sparse array with standard Gaussian non-zero entries is simulated.
- density: float
Proportion of non-zero elements in w_true if the latter is simulated.
- random_state: int | RandomState instance | None (default)
Determines random number generation for data generation. Use an int to make the randomness deterministic.
- X_density: float in ]0, 1]
Proportion of elements of X which are non-zero.
- Returns:
- X: ndarray or CSC matrix, shape (n_samples, n_features)
A design matrix with Toeplitz covariance.
- y: ndarray, shape (n_samples,) or (n_samples, n_tasks)
Observation vector/matrix.
- w_true: ndarray, shape (n_features,) or (n_features, n_tasks)
True regression vector/matrix of the model.