Multiple-kernel ridge fit from fixed hyper-parameters

This example demonstrates how to fit a multiple-kernel ridge model with fixed hyper-parameters. Here are three different usecases:

  • If the kernel weights hyper-parameters are known and identical across targets, the kernels can be scaled and summed, and a simple KernelRidgeCV can be used to fit the model.

  • If the kernel weights hyper-parameters are unknown and different across targets, a MultipleKernelRidgeCV can be use to search the best hyper-parameters per target.

  • If the kernel weights hyper-parameters are known and different across targets, a WeightedKernelRidge model can be used to fit the ridge models on each target independently.

This method can be used for example in the following workflow:

  • fit a MultipleKernelRidgeCV to learn the kernel weights hyper-parameter,

  • save the hyper-parameters, but not the ridge weights to save disk space,

  • fit a WeightedKernelRidge from the saved hyper-parameters, for further use of the model (prediction, interpretation, etc.).

import numpy as np

from himalaya.backend import set_backend
from himalaya.kernel_ridge import WeightedKernelRidge
from himalaya.kernel_ridge import Kernelizer
from himalaya.kernel_ridge import ColumnKernelizer
from himalaya.utils import generate_multikernel_dataset

from sklearn.pipeline import make_pipeline
from sklearn import set_config
set_config(display='diagram')

In this example, we use the torch_cuda backend (GPU).

backend = set_backend("torch_cuda", on_error="warn")
/home/runner/work/himalaya/himalaya/himalaya/backend/_utils.py:66: UserWarning: Setting backend to torch_cuda failed: PyTorch with CUDA is not available..Falling back to numpy backend.
  warnings.warn(f"Setting backend to {backend} failed: {str(error)}."

Generate a random dataset

  • X_train : array of shape (n_samples_train, n_features)

  • X_test : array of shape (n_samples_test, n_features)

  • Y_train : array of shape (n_samples_train, n_targets)

  • Y_test : array of shape (n_samples_test, n_targets)

(X_train, X_test, Y_train, Y_test, kernel_weights,
 n_features_list) = generate_multikernel_dataset(n_kernels=4, n_targets=500,
                                                 n_samples_train=1000,
                                                 n_samples_test=400,
                                                 random_state=42)

Prepare the pipeline

# Find the start and end of each feature space X in Xs
start_and_end = np.concatenate([[0], np.cumsum(n_features_list)])
slices = [
    slice(start, end)
    for start, end in zip(start_and_end[:-1], start_and_end[1:])
]

# Create a different ``Kernelizer`` for each feature space.
kernelizers = [("space %d" % ii, Kernelizer(), slice_)
               for ii, slice_ in enumerate(slices)]
column_kernelizer = ColumnKernelizer(kernelizers)

Define the weighted kernel ridge model

Here we use the ground truth kernel weights for each target (deltas), but it can be typically used with deltas obtained from a MultipleKernelRidgeCV fit.

deltas = backend.log(backend.asarray(kernel_weights.T))

model_1 = WeightedKernelRidge(alpha=1, deltas=deltas, kernels="precomputed")
pipe_1 = make_pipeline(column_kernelizer, model_1)

# Fit the model on all targets
pipe_1.fit(X_train, Y_train)
Pipeline(steps=[('columnkernelizer',
                 ColumnKernelizer(transformers=[('space 0', Kernelizer(),
                                                 slice(np.int64(0), np.int64(1000), None)),
                                                ('space 1', Kernelizer(),
                                                 slice(np.int64(1000), np.int64(2000), None)),
                                                ('space 2', Kernelizer(),
                                                 slice(np.int64(2000), np.int64(3000), None)),
                                                ('space 3', Kernelizer(),
                                                 slice(np.int64(3000), np.int64(4000), None))])),
                ('weightedkernelrid...
                 WeightedKernelRidge(alpha=1,
                                     deltas=array([[ -2.348278  ,  -1.5064142 ,  -2.191093  , ...,  -0.54280853,
         -1.9162471 ,  -4.464685  ],
       [ -0.10634118, -12.753908  ,  -1.022846  , ...,  -1.0741493 ,
         -0.749257  , -11.534051  ],
       [ -5.267356  ,  -0.25595754,  -2.3263435 , ...,  -6.4705176 ,
         -5.0901437 ,  -0.0359191 ],
       [ -8.560996  ,  -5.4922204 ,  -0.8416861 , ...,  -2.5803237 ,
         -0.9835877 ,  -3.7396026 ]], dtype=float32),
                                     kernels='precomputed'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.


compute test score

test_scores_1 = pipe_1.score(X_test, Y_test)
test_scores_1 = backend.to_numpy(test_scores_1)

We can compare this model to a baseline model where the kernel weights are all equal and not learnt.

model_2 = WeightedKernelRidge(alpha=1, deltas="zeros", kernels="precomputed")
pipe_2 = make_pipeline(column_kernelizer, model_2)

# Fit the model on all targets
pipe_2.fit(X_train, Y_train)
Pipeline(steps=[('columnkernelizer',
                 ColumnKernelizer(transformers=[('space 0', Kernelizer(),
                                                 slice(np.int64(0), np.int64(1000), None)),
                                                ('space 1', Kernelizer(),
                                                 slice(np.int64(1000), np.int64(2000), None)),
                                                ('space 2', Kernelizer(),
                                                 slice(np.int64(2000), np.int64(3000), None)),
                                                ('space 3', Kernelizer(),
                                                 slice(np.int64(3000), np.int64(4000), None))])),
                ('weightedkernelridge',
                 WeightedKernelRidge(alpha=1, kernels='precomputed'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.


compute test score

test_scores_2 = pipe_2.score(X_test, Y_test)
test_scores_2 = backend.to_numpy(test_scores_2)

Compare the predictions on a test set

import matplotlib.pyplot as plt

plt.figure(figsize=(4, 3))
plt.hist(test_scores_2, np.linspace(0, 1, 30), alpha=0.7,
         label="Default deltas")
plt.hist(test_scores_1, np.linspace(0, 1, 30), alpha=0.7,
         label="Ground truth deltas")
plt.xlabel("$R^2$ generalization score")
plt.ylabel("Number of voxels")
plt.legend()
plt.tight_layout()
plt.show()
plot mkr 4 refit from deltas

Total running time of the script: (0 minutes 3.586 seconds)

Gallery generated by Sphinx-Gallery