himalaya.kernel_ridge.solve_multiple_kernel_ridge_hyper_gradient
- himalaya.kernel_ridge.solve_multiple_kernel_ridge_hyper_gradient(Ks, Y, score_func=<function l2_neg_loss>, cv=5, fit_intercept=False, return_weights=None, Xs=None, initial_deltas=0, max_iter=10, tol=0.01, max_iter_inner_dual=1, max_iter_inner_hyper=1, cg_tol=0.001, n_targets_batch=None, hyper_gradient_method='conjugate_gradient', kernel_ridge_method='gradient_descent', random_state=None, progress_bar=True, Y_in_cpu=False)
Solve bilinear kernel ridge regression with cross-validation.
The hyper-parameters deltas correspond to:
deltas = log(kernel_weights / ridge_regularization)
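For example, with a ridge regularization of 1, the kernel weight of each kernel is simply exp(delta). A minimal sketch (the numbers below are purely illustrative, not part of the API):

    import numpy as np

    # Hypothetical deltas for 3 kernels and 2 targets.
    deltas = np.array([[0.0, -1.0],
                       [1.0, 0.0],
                       [-2.0, 2.0]])

    # deltas = log(kernel_weights / ridge_regularization), so with a
    # ridge regularization of 1, kernel_weights = exp(deltas).
    kernel_weights = np.exp(deltas)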
- Parameters
- Ks : array of shape (n_kernels, n_samples, n_samples)
Training kernel for each feature space.
- Y : array of shape (n_samples, n_targets)
Training target data.
- score_func : callable
Function used to compute the score of predictions.
- cv : int or scikit-learn splitter
Cross-validation splitter. If an int, KFold is used.
- fit_intercept : boolean
Whether to fit an intercept. If False, Ks should be centered (see KernelCenterer), and Y must be zero-mean over samples. Only available if return_weights == ‘dual’.
- return_weights : None, ‘primal’, or ‘dual’
Whether to refit on the entire dataset and return the weights.
- Xs : array of shape (n_kernels, n_samples, n_features) or None
Necessary if return_weights == ‘primal’.
- initial_deltas : str, float, or array of shape (n_kernels, n_targets)
Initial log kernel weights for each target. If a float, initialize the deltas with this value. If a str, initialize the deltas with the corresponding strategy:
- ‘ridgecv’ : fit a RidgeCV model over the average kernel.
- max_iter : int
Maximum number of iterations for the outer loop.
- tol : float > 0, or None
Tolerance for the stopping criterion.
- max_iter_inner_dual : int
Maximum number of iterations for the dual weights conjugate gradient.
- max_iter_inner_hyper : int
Maximum number of iterations for the deltas gradient descent.
- cg_tol : float, or array of shape (max_iter)
Tolerance for the conjugate gradients.
- n_targets_batch : int or None
Size of the batch for computing predictions. Used for memory reasons. If None, uses all n_targets at once.
- hyper_gradient_method : str, “conjugate_gradient”, “neumann”, or “direct”
Method to compute the hypergradient.
- kernel_ridge_method : str, “conjugate_gradient” or “gradient_descent”
Algorithm used for the inner step.
- random_state : int, or None
Random generator seed. Use an int for deterministic search.
- progress_bar : bool
If True, display a progress bar over batches and iterations.
- Y_in_cpu : bool
If True, keep the target values Y in CPU memory (slower).
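A minimal usage sketch, assuming two random linear feature spaces and the default numpy backend. The data shapes and variable names are illustrative only; the unpacking follows the Returns section below (three values when fit_intercept is False):

    import numpy as np
    from himalaya.kernel_ridge import solve_multiple_kernel_ridge_hyper_gradient

    rng = np.random.RandomState(0)
    n_samples, n_targets = 100, 4

    # Two feature spaces, each giving one linear kernel
    # of shape (n_samples, n_samples).
    X1 = rng.randn(n_samples, 10)
    X2 = rng.randn(n_samples, 20)
    Ks = np.stack([X1 @ X1.T, X2 @ X2.T])

    # Zero-mean targets, as required when fit_intercept=False.
    Y = rng.randn(n_samples, n_targets)
    Y -= Y.mean(axis=0)

    deltas, dual_weights, cv_scores = solve_multiple_kernel_ridge_hyper_gradient(
        Ks, Y, cv=3, max_iter=5, return_weights='dual', progress_bar=False)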
- Returns
- deltas : array of shape (n_kernels, n_targets)
Best log kernel weights for each target.
- refit_weights : array or None
Refit regression weights on the entire dataset, using the selected best hyperparameters. Refit weights will always be on CPU memory. If return_weights == ‘primal’, the shape is (n_features, n_targets); if return_weights == ‘dual’, the shape is (n_samples, n_targets); else, None.
- cv_scores : array of shape (max_iter * max_iter_inner_hyper, n_targets)
Cross-validation scores per iteration, averaged over splits. Cross-validation scores will always be on CPU memory.
- intercept : array of shape (n_targets,)
Intercept. Only returned when fit_intercept is True.
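The returned deltas and dual weights are enough to form predictions on held-out samples: each test kernel is scaled per target by exp(deltas) and multiplied by the dual weights. A self-contained sketch under the same assumptions as above (illustrative data, default numpy backend, three return values since fit_intercept is False); the explicit loop just spells out the weighted kernel ridge prediction formula:

    import numpy as np
    from himalaya.kernel_ridge import solve_multiple_kernel_ridge_hyper_gradient

    rng = np.random.RandomState(0)
    n_train, n_test, n_targets = 80, 20, 4

    # Two feature spaces, split into train and test samples.
    X1 = rng.randn(n_train + n_test, 10)
    X2 = rng.randn(n_train + n_test, 20)
    Xs_train = [X1[:n_train], X2[:n_train]]
    Xs_test = [X1[n_train:], X2[n_train:]]

    # Training kernels (n_kernels, n_train, n_train) and
    # test kernels (n_kernels, n_test, n_train).
    Ks_train = np.stack([X @ X.T for X in Xs_train])
    Ks_test = np.stack([Xt @ X.T for Xt, X in zip(Xs_test, Xs_train)])

    Y_train = rng.randn(n_train, n_targets)
    Y_train -= Y_train.mean(axis=0)

    deltas, dual_weights, cv_scores = solve_multiple_kernel_ridge_hyper_gradient(
        Ks_train, Y_train, cv=3, max_iter=5, return_weights='dual',
        progress_bar=False)

    # Y_pred[:, t] = sum_i exp(deltas[i, t]) * Ks_test[i] @ dual_weights[:, t]
    Y_pred = np.zeros((n_test, n_targets))
    for i in range(Ks_test.shape[0]):
        Y_pred += np.exp(deltas[i]) * (Ks_test[i] @ dual_weights)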