Troubleshooting
We detail here common issues encountered with himalaya, and how to fix them.
CUDA out of memory
The GPU memory is often smaller than the CPU memory, so it requires more attention to avoid running out of memory. Himalaya implements a series of options to limit GPU memory usage, often at the cost of computational speed:

- Some solvers implement computations over batches, to limit the size of intermediate arrays. See for instance n_targets_batch or n_alphas_batch in KernelRidgeCV.
- Some solvers implement an option to keep the input kernels or the targets in CPU memory. See for instance Y_in_cpu in MultipleKernelRidgeCV.
- Some estimators can also be forced to use CPU, ignoring the current backend, using the parameter force_cpu=True. To limit GPU memory, some estimators in the same pipeline can use force_cpu=True and others force_cpu=False. In particular, it is possible to precompute kernels on CPU, using Kernelizer or ColumnKernelizer with the parameter force_cpu=True, before fitting a KernelRidgeCV or a MultipleKernelRidgeCV on GPU (see the sketch after this list).
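For example, the following sketch combines these options: it precomputes two linear kernels on CPU with Kernelizer(force_cpu=True) inside a ColumnKernelizer, then fits a MultipleKernelRidgeCV on GPU while keeping the targets in CPU memory. The array sizes, the split into two feature spaces, and the solver batch size are illustrative assumptions, not prescribed values.

```python
# Minimal sketch, assuming a CUDA-capable GPU is available; data sizes and
# the split into two feature spaces are illustrative only.
import numpy as np
from sklearn.pipeline import make_pipeline

from himalaya.backend import set_backend
from himalaya.kernel_ridge import Kernelizer, ColumnKernelizer, MultipleKernelRidgeCV

set_backend("torch_cuda", on_error="warn")  # run computations on GPU when available

X = np.random.randn(100, 2000).astype("float32")  # two feature spaces of 1000 features each
Y = np.random.randn(100, 50).astype("float32")

# Precompute the kernels on CPU to spare GPU memory (force_cpu=True).
column_kernelizer = ColumnKernelizer([
    ("space_1", Kernelizer(kernel="linear", force_cpu=True), slice(0, 1000)),
    ("space_2", Kernelizer(kernel="linear", force_cpu=True), slice(1000, 2000)),
])

# Fit the model on GPU, keeping targets in CPU memory and batching over targets.
model = MultipleKernelRidgeCV(kernels="precomputed", Y_in_cpu=True,
                              solver_params=dict(n_targets_batch=10))

pipeline = make_pipeline(column_kernelizer, model)
pipeline.fit(X, Y)
```

With this setup, only the precomputed kernels are moved to the GPU, while the large feature matrices and the targets stay in CPU memory, at the cost of slower kernel computations on CPU.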
A CUDA out of memory error can also arise with pytorch < 1.9, for example with KernelRidge, where a solver requires an excessively high peak memory during a broadcasting matmul operation. This issue can be fixed by updating to pytorch 1.9 or a newer version.
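If in doubt, the installed version can be checked directly (a trivial sketch):

```python
# Print the installed pytorch version; 1.9 or newer avoids the peak-memory
# issue described above.
import torch
print(torch.__version__)
```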
Slow check_array
In himalaya, the scikit-learn compatible estimators validate the input data,
checking the absence of NaN or infinite values. For large datasets, this check
can take significant computational time. To skip this check, simply call
sklearn.set_config(assume_finite=True)
before fitting your models.
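For instance (a minimal sketch with random data; the alpha grid is an arbitrary choice), the flag is set once before fitting:

```python
# Skip the NaN/inf validation performed by scikit-learn's check_array,
# which can be slow on large datasets.
import numpy as np
import sklearn
from himalaya.kernel_ridge import KernelRidgeCV

sklearn.set_config(assume_finite=True)  # global flag, set once per session

X = np.random.randn(100, 200).astype("float32")
Y = np.random.randn(100, 10).astype("float32")

model = KernelRidgeCV(alphas=np.logspace(-2, 2, 5))
model.fit(X, Y)
```

Note that with assume_finite=True, NaN or infinite values in the data are no longer caught before fitting, so make sure your data is clean beforehand.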