Troubleshooting¶
This page details common issues encountered with himalaya, and how to fix them.
CUDA out of memory¶
GPU memory is often smaller than CPU memory, so more care is needed to avoid running out of it. Himalaya implements a series of options to limit GPU memory use, often at the cost of computational speed:
- Some solvers implement computations over batches, to limit the size of intermediate arrays. See for instance n_targets_batch or n_alphas_batch in KernelRidgeCV.
- Some solvers implement an option to keep the input kernels or the targets in CPU memory. See for instance Y_in_cpu in MultipleKernelRidgeCV.
- Some estimators can also be forced to use CPU, ignoring the current backend, using the parameter force_cpu=True. To limit GPU memory, some estimators in the same pipeline can use force_cpu=True and others force_cpu=False. In particular, it is possible to precompute kernels on CPU, using Kernelizer or ColumnKernelizer with the parameter force_cpu=True, before fitting a KernelRidgeCV or a MultipleKernelRidgeCV on GPU, as in the sketch below.
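The following is a minimal sketch combining these options: kernels are precomputed on CPU with Kernelizer(force_cpu=True), then KernelRidgeCV is fit on the GPU backend. Passing the batching parameters through solver_params, as well as the data shapes and parameter values, are illustrative assumptions; adapt them to your installed version and dataset.

    import numpy as np
    from sklearn.pipeline import make_pipeline

    from himalaya.backend import set_backend
    from himalaya.kernel_ridge import Kernelizer, KernelRidgeCV

    set_backend("torch_cuda")  # run the solvers on GPU

    # illustrative random data
    X = np.random.randn(1000, 500).astype("float32")
    Y = np.random.randn(1000, 10000).astype("float32")

    pipeline = make_pipeline(
        # compute the linear kernel on CPU, to save GPU memory
        Kernelizer(kernel="linear", force_cpu=True),
        # fit on GPU, batching over targets and alphas to limit
        # the size of intermediate arrays (parameter placement assumed)
        KernelRidgeCV(
            alphas=np.logspace(-5, 5, 11),
            kernel="precomputed",
            solver_params=dict(n_targets_batch=500, n_alphas_batch=5),
        ),
    )
    pipeline.fit(X, Y)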
A CUDA out of memory error can also arise with pytorch versions older than 1.9, for example with KernelRidge, where a solver requires an unreasonably high peak memory during a broadcasting matmul operation. This issue can be fixed by updating to pytorch 1.9 or a newer version.
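If you are unsure which pytorch version is installed, a quick check using the standard torch API:

    import torch
    print(torch.__version__)  # should be 1.9 or newer to avoid this issue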
Slow check_array¶
In himalaya, the scikit-learn compatible estimators validate the input data, checking that it contains no NaN or infinite values. For large datasets, this check
can take significant computational time. To skip this check, simply call
sklearn.set_config(assume_finite=True) before fitting your models.
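A minimal sketch of this setting, using random data for illustration:

    import numpy as np
    import sklearn

    from himalaya.kernel_ridge import KernelRidgeCV

    # skip the NaN / infinity checks performed during input validation
    sklearn.set_config(assume_finite=True)

    # illustrative random data
    X = np.random.randn(500, 100).astype("float32")
    Y = np.random.randn(500, 1000).astype("float32")

    model = KernelRidgeCV(alphas=np.logspace(-3, 3, 7))
    model.fit(X, Y)

Only skip this check if you are confident the data is finite, since NaN or infinite values may otherwise propagate silently into the fitted models.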