himalaya.utils.generate_multikernel_dataset

himalaya.utils.generate_multikernel_dataset(n_kernels=4, n_targets=500, n_samples_train=1000, n_samples_test=400, noise=0.1, kernel_weights=None, n_features_list=None, random_state=None)

Utility to generate synthetic multiple-kernel datasets for the gallery of examples.

Parameters
n_kernels : int

Number of kernels.

n_targets : int

Number of targets.

n_samples_train : int

Number of samples in the training set.

n_samples_test : int

Number of samples in the testing set.

noise : float > 0

Scale of the Gaussian white noise added to the targets.

kernel_weights : array of shape (n_targets, n_kernels) or None

Kernel weights used in the prediction of the targets. If None, generate random kernel weights from a Dirichlet distribution.

n_features_list : list of int of length (n_kernels,) or None

Number of features in each kernel. If None, use 1000 features for each.

random_state : int or None

Random generator seed used to generate the true kernel weights.

Returns
X_train : array of shape (n_samples_train, n_features)

Training features.

X_test : array of shape (n_samples_test, n_features)

Testing features.

Y_train : array of shape (n_samples_train, n_targets)

Training targets.

Y_test : array of shape (n_samples_test, n_targets)

Testing targets.

kernel_weights : array of shape (n_targets, n_kernels)

Kernel weights used in the prediction of the targets.

n_features_list : list of int of length (n_kernels,)

Number of features in each kernel.