himalaya.kernel_ridge.solve_multiple_kernel_ridge_hyper_gradient

himalaya.kernel_ridge.solve_multiple_kernel_ridge_hyper_gradient(Ks, Y, score_func=<function l2_neg_loss>, cv=5, fit_intercept=False, return_weights=None, Xs=None, initial_deltas=0, max_iter=10, tol=0.01, max_iter_inner_dual=1, max_iter_inner_hyper=1, cg_tol=0.001, n_targets_batch=None, hyper_gradient_method='conjugate_gradient', kernel_ridge_method='gradient_descent', random_state=None, progress_bar=True, Y_in_cpu=False)

Solve bilinear kernel ridge regression with cross-validation.

The hyper-parameters deltas correspond to:

log(kernel_weights / ridge_regularization)
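
For intuition, this parameterization folds the kernel weights and the ridge penalty into a single set of hyper-parameters: with kernel weights gamma and regularization alpha, the ridge system (sum_i gamma_i K_i + alpha I) is proportional to (sum_i exp(deltas_i) K_i + I) when deltas_i = log(gamma_i / alpha). The small NumPy check below illustrates this equivalence; it is a sketch only, and the variable names are not part of the API.

    import numpy as np

    # Illustrative only: deltas_i = log(gamma_i / alpha), for a single target.
    n_kernels, n_samples = 3, 20
    rng = np.random.RandomState(0)
    Xs = [rng.randn(n_samples, 5) for _ in range(n_kernels)]
    Ks = np.stack([X @ X.T for X in Xs])      # one linear kernel per feature space

    gamma = np.array([0.5, 1.0, 2.0])         # kernel weights
    alpha = 10.0                              # ridge regularization
    deltas = np.log(gamma / alpha)

    # (sum_i gamma_i K_i + alpha I) equals alpha * (sum_i exp(deltas_i) K_i + I),
    # so both systems give the same dual weights up to the scale absorbed in alpha.
    lhs_1 = np.tensordot(gamma, Ks, axes=1) + alpha * np.eye(n_samples)
    lhs_2 = alpha * (np.tensordot(np.exp(deltas), Ks, axes=1) + np.eye(n_samples))
    assert np.allclose(lhs_1, lhs_2)
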
Parameters
Ks : array of shape (n_kernels, n_samples, n_samples)

Training kernel for each feature space.

Y : array of shape (n_samples, n_targets)

Training target data.

score_func : callable

Function used to compute the score of predictions.

cv : int or scikit-learn splitter

Cross-validation splitter. If an int, KFold is used.

fit_intercept : boolean

Whether to fit an intercept. If False, Ks should be centered (see KernelCenterer) and Y must be zero-mean over samples; a preprocessing sketch is given after this parameter list. Only available if return_weights == ‘dual’.

return_weights : None, ‘primal’, or ‘dual’

Whether to refit on the entire dataset and return the weights.

Xs : array of shape (n_kernels, n_samples, n_features) or None

Necessary if return_weights == ‘primal’.

initial_deltas : str, float, or array of shape (n_kernels, n_targets)

Initial log kernel weights for each target. If a float, initialize the deltas with this value. If a str, initialize the deltas with one of the following strategies:

- ‘ridgecv’ : fit a RidgeCV model over the average kernel.

max_iter : int

Maximum number of iterations for the outer loop.

tol : float > 0, or None

Tolerance for the stopping criterion.

max_iter_inner_dual : int

Maximum number of iterations for the dual weights conjugate gradient.

max_iter_inner_hyper : int

Maximum number of iterations for the deltas gradient descent.

cg_tol : float, or array of shape (max_iter)

Tolerance for the conjugate gradients.

n_targets_batch : int or None

Size of the batch for computing predictions. Used for memory reasons. If None, uses all n_targets at once.

hyper_gradient_method : str, “conjugate_gradient”, “neumann”, or “direct”

Method to compute the hypergradient.

kernel_ridge_method : str, “conjugate_gradient” or “gradient_descent”

Algorithm used for the inner step.

random_state : int, or None

Random generator seed. Use an int for deterministic search.

progress_bar : bool

If True, display a progress bar over batches and iterations.

Y_in_cpu : bool

If True, keep the target values Y in CPU memory (slower).
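
As noted in the fit_intercept entry above, with the default fit_intercept=False the kernels should be centered and Y should be zero-mean over samples. Below is a minimal preprocessing sketch using scikit-learn's KernelCenterer; the random feature spaces and variable names are purely illustrative, not part of the API.

    import numpy as np
    from sklearn.preprocessing import KernelCenterer

    rng = np.random.RandomState(0)
    Xs = [rng.randn(50, 10), rng.randn(50, 20)]   # two feature spaces (illustrative)
    Ks = np.stack([X @ X.T for X in Xs])          # one linear kernel per feature space
    Y = rng.randn(50, 4)

    # Center each training kernel and make Y zero-mean over samples.
    Ks = np.stack([KernelCenterer().fit_transform(K) for K in Ks])
    Y = Y - Y.mean(0)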

Returns
deltas : array of shape (n_kernels, n_targets)

Best log kernel weights for each target.

refit_weights : array or None

Regression weights refit on the entire dataset, using the selected best hyperparameters. Refit weights are always returned on CPU memory. If return_weights == ‘primal’, the shape is (n_features, n_targets); if return_weights == ‘dual’, the shape is (n_samples, n_targets); otherwise None.

cv_scores : array of shape (max_iter * max_iter_inner_hyper, n_targets)

Cross-validation scores per iteration, averaged over splits. Cross-validation scores will always be on CPU memory.

intercept : array of shape (n_targets,)

Intercept. Only returned when fit_intercept is True.
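
A minimal end-to-end sketch of a call on random data with two linear kernels (illustrative only; the unpacking follows the order of the Returns section above, with the default fit_intercept=False):

    import numpy as np
    from sklearn.preprocessing import KernelCenterer
    from himalaya.kernel_ridge import solve_multiple_kernel_ridge_hyper_gradient

    rng = np.random.RandomState(0)
    Xs = [rng.randn(50, 10), rng.randn(50, 20)]   # two feature spaces (random, illustrative)
    Ks = np.stack([KernelCenterer().fit_transform(X @ X.T) for X in Xs])
    Y = rng.randn(50, 4)
    Y -= Y.mean(0)                                # zero-mean targets, since fit_intercept=False

    deltas, dual_weights, cv_scores = solve_multiple_kernel_ridge_hyper_gradient(
        Ks, Y, cv=5, max_iter=10, return_weights="dual", progress_bar=False)

    print(deltas.shape)        # (n_kernels, n_targets) = (2, 4)
    print(dual_weights.shape)  # (n_samples, n_targets) = (50, 4), since return_weights="dual"
    print(cv_scores.shape)     # cross-validation scores per iteration, averaged over splits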