himalaya.kernel_ridge.make_column_kernelizer

himalaya.kernel_ridge.make_column_kernelizer(*transformers, **kwargs)

Construct a ColumnKernelizer from the given transformers.

This is a shorthand for the ColumnKernelizer constructor; it does not require, and does not permit, naming the transformers. Instead, they will be given names automatically based on their types. It also does not allow weighting with transformer_weights.

Parameters
*transformers : tuples

Tuples of the form (transformer, columns) specifying the transformer objects to be applied to subsets of the data.

transformer : {‘drop’, ‘passthrough’} or estimator

Estimator must support fit and transform. Special-cased strings ‘drop’ and ‘passthrough’ are accepted as well, to indicate to drop the columns or to pass them through untransformed, respectively. If the transformer does not return a kernel (as informed by the attribute kernelizer=True), a linear kernelizer is applied after the transformer.

columns : str, array-like of str, int, array-like of int, slice, array-like of bool or callable

Indexes the data on its second axis. Integers are interpreted as positional columns, while strings can reference DataFrame columns by name. A scalar string or int should be used where transformer expects X to be a 1d array-like (vector), otherwise a 2d array will be passed to the transformer. A callable is passed the input data X and can return any of the above. To select multiple columns by name or dtype, you can use make_column_selector.

remainder : {‘drop’, ‘passthrough’} or estimator, default=’drop’

By default (remainder='drop'), only the columns specified in transformers are transformed and combined in the output; the non-specified columns are dropped. With remainder='passthrough', all remaining columns that were not specified in transformers are automatically passed through, and this subset of columns is concatenated with the output of the transformers. If remainder is an estimator, it is applied to the remaining non-specified columns. The estimator must support fit and transform.

n_jobs : int, default=None

Number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. n_jobs does not work with GPU backends.

verbose : bool, default=False

If True, the time elapsed while fitting each transformer will be printed as it is completed.

force_cpu : bool

If True, computations will be performed on CPU, ignoring the current backend. If False, use the current backend.
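The different kinds of columns specifier described above can be illustrated with plain NumPy indexing. This is only an analogy for the selection semantics (scalar versus list, slice, boolean mask, callable), not himalaya's own selection code; the array X and the helper last_two_columns are hypothetical names introduced for the sketch.

```python
import numpy as np

# Toy data: 4 samples, 5 columns (illustrative values only).
X = np.arange(20.0).reshape(4, 5)

# A scalar int selects a single column as a 1d vector.
as_vector = X[:, 0]

# A list of ints (even of length one) keeps the result 2d.
as_matrix = X[:, [0]]

# A slice selects a contiguous block of columns.
by_slice = X[:, slice(3, 5)]

# A boolean mask selects the columns flagged True.
mask = np.array([True, False, False, True, False])
by_mask = X[:, mask]


def last_two_columns(data):
    """Hypothetical callable: receives the data, returns a specifier."""
    return slice(data.shape[1] - 2, None)


by_callable = X[:, last_two_columns(X)]

for name, selected in [("scalar int", as_vector), ("list", as_matrix),
                       ("slice", by_slice), ("mask", by_mask),
                       ("callable", by_callable)]:
    print(name, selected.shape)
```

The scalar case is the one to watch: it yields a vector of shape (n_samples,), which only suits transformers that expect 1d input.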

Returns
column_kernelizer : ColumnKernelizer

See also

himalaya.kernel_ridge.ColumnKernelizer

Class that allows combining the outputs of multiple transformer objects used on column subsets of the data into a single feature space.

Examples

>>> import numpy as np
>>> from himalaya.kernel_ridge import make_column_kernelizer
>>> from himalaya.kernel_ridge import Kernelizer
>>> ck = make_column_kernelizer(
...     (Kernelizer(kernel="linear"), [0, 1, 2]),
...     (Kernelizer(kernel="polynomial"), slice(3, 5)))
>>> X = np.array([[0., 1., 2., 2., 3.],
...               [0., 2., 0., 0., 3.],
...               [0., 0., 1., 0., 3.],
...               [1., 1., 0., 1., 2.]])
>>> # Kernelize separately the first three columns and the last two
>>> # columns, creating two kernels of shape (n_samples, n_samples).
>>> ck.fit_transform(X).shape
(2, 4, 4)
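To make the (2, 4, 4) output shape concrete, the following NumPy-only sketch computes one kernel per column group by hand and stacks them. It is a conceptual illustration, not himalaya's implementation: the helper functions are hypothetical, and the polynomial kernel parameters (degree, gamma, coef0) are illustrative assumptions that may differ from the defaults used by Kernelizer.

```python
import numpy as np


def linear_kernel(A):
    """Linear kernel between all pairs of samples (rows) of A."""
    return A @ A.T


def polynomial_kernel(A, degree=3, gamma=None, coef0=1.0):
    """Polynomial kernel; parameter values here are assumptions."""
    if gamma is None:
        gamma = 1.0 / A.shape[1]
    return (gamma * (A @ A.T) + coef0) ** degree


X = np.array([[0., 1., 2., 2., 3.],
              [0., 2., 0., 0., 3.],
              [0., 0., 1., 0., 3.],
              [1., 1., 0., 1., 2.]])

# One kernel per column group, each of shape (n_samples, n_samples),
# stacked along a new leading axis of length n_kernels.
kernels = np.stack([
    linear_kernel(X[:, [0, 1, 2]]),
    polynomial_kernel(X[:, slice(3, 5)]),
])
print(kernels.shape)  # (2, 4, 4)
```

Each kernel depends only on the number of samples, not on how many columns its group contains, which is why groups of three and two columns both produce 4 x 4 matrices.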