Note
Go to the end to download the full example code.
Using different backends (NumPy, PyTorch, GPU)
This example shows how to use the different computational backends available in
pymoten to extract motion energy features. Backends let you run the exact
same pyramid projection on the CPU with NumPy (the default) or on the GPU with
PyTorch, which can be substantially faster for large stimuli.
The available backends are:
"numpy": CPU backend using NumPy (the default)."torch": CPU backend using PyTorch."torch_cuda": GPU backend using PyTorch with CUDA (NVIDIA GPUs)."torch_mps": GPU backend using PyTorch with Metal (Apple Silicon GPUs). Note that the MPS backend runs infloat32precision only, so results are slightly less precise than the other backends.
The numpy backend is always available. The torch backends additionally require PyTorch to be installed, and the GPU backends require a compatible GPU.
Selecting a backend
Backends are managed with moten.backend.set_backend() and
moten.backend.get_backend(). The default backend is "numpy".
import numpy as np
import moten
from moten.backend import set_backend, get_backend, ALL_BACKENDS
print("Available backends:", ALL_BACKENDS)
print("Current backend:", get_backend().name)
Available backends: ['numpy', 'torch', 'torch_cuda', 'torch_mps']
Current backend: numpy
Computing features with the NumPy backend
Let us create a small synthetic stimulus and a motion energy pyramid. We use
random noise here so the example runs quickly and reproducibly, but in
practice you would load a video with moten.io.video2luminance().
nimages, vdim, hdim = (100, 72, 128)
stimulus_fps = 24
rng = np.random.RandomState(0)
luminance_images = rng.randn(nimages, vdim, hdim)
pyramid = moten.pyramids.MotionEnergyPyramid(stimulus_vhsize=(vdim, hdim),
stimulus_fps=stimulus_fps)
print(pyramid)
<moten.pyramids.MotionEnergyPyramid [#2530 filters (ntfq=3, nsfq=5, ndirs=8) aspect=1.778]>
With the numpy backend, project_stimulus works exactly as in the other
examples and returns a NumPy array of shape (nimages, nfilters).
set_backend("numpy")
features_numpy = pyramid.project_stimulus(luminance_images)
print(type(features_numpy), features_numpy.shape)
<class 'numpy.ndarray'> (100, 2530)
Per-filter vs. batched projection
Each backend exposes two projection methods. project_stimulus loops over
the filters one at a time (lower memory use), while
project_stimulus_batched groups filters into batches and computes them with
large matrix multiplications. The batched version is what unlocks most of the
GPU speed-up, and it computes the same quantity as the per-filter version.
The two methods are mathematically equivalent, but they sum the filter
responses in a different order (a single large matrix multiply per batch
versus one dot product per filter). In the default float32 precision this
reordering produces tiny rounding differences, so the results agree to about
single-precision accuracy rather than bit-for-bit. We therefore compare them
with a float32-appropriate tolerance instead of the (much stricter)
numpy.allclose defaults.
features_batched = pyramid.project_stimulus_batched(luminance_images,
batch_size=128)
max_abs_diff = np.max(np.abs(features_numpy - features_batched))
print(f"Max |per-filter - batched|: {max_abs_diff:.2e}")
print("Per-filter and batched results match (float32 tolerance):",
np.allclose(features_numpy, features_batched, atol=1e-3, rtol=1e-4))
Max |per-filter - batched|: 9.79e-04
Per-filter and batched results match (float32 tolerance): True
The small discrepancy really is just floating-point rounding: running both
methods in float64 makes them agree to roughly machine precision.
features_numpy_f64 = pyramid.project_stimulus(luminance_images, dtype="float64")
features_batched_f64 = pyramid.project_stimulus_batched(luminance_images,
batch_size=128,
dtype="float64")
print("Per-filter and batched match in float64:",
np.allclose(features_numpy_f64, features_batched_f64))
Per-filter and batched match in float64: True
Using a GPU backend
Switching to a GPU backend is a two-step process:
Call
moten.backend.set_backend()with the backend name. This returns the backend module.Move the stimulus onto the device with
backend.asarray(...)before projecting. The result lives on the GPU, so convert it back to a NumPy array withbackend.to_numpy(...).
Here we try the GPU backends in turn. We pass on_error="warn" so that, on
machines without a GPU (such as the documentation build server), the call
simply warns and keeps the current backend instead of raising. This lets the
example run everywhere while still demonstrating the GPU API.
gpu_backend = None
for name in ["torch_cuda", "torch_mps"]:
backend = set_backend(name, on_error="warn")
if backend.name == name:
gpu_backend = name
break
if gpu_backend is None:
print("No GPU backend available; skipping the GPU comparison.")
else:
print(f"Using GPU backend: {gpu_backend}")
# Move the stimulus to the GPU.
stimulus_gpu = backend.asarray(luminance_images)
# Project on the GPU (batched is recommended for speed).
features_gpu = pyramid.project_stimulus_batched(stimulus_gpu,
batch_size=128)
# Bring the result back to the CPU as a NumPy array.
features_gpu = backend.to_numpy(features_gpu)
# Compare against the numpy reference. The GPU result is computed in
# float32, so we only expect agreement up to single precision.
max_abs_diff = np.max(np.abs(features_numpy.astype(np.float32)
- features_gpu.astype(np.float32)))
print(f"Max |numpy - {gpu_backend}|: {max_abs_diff:.2e}")
/home/runner/work/pymoten/pymoten/moten/backend/_utils.py:59: UserWarning: Setting backend to torch_cuda failed: PyTorch not installed.. Falling back to numpy backend.
warnings.warn(f"Setting backend to {backend} failed: {str(error)}. "
/home/runner/work/pymoten/pymoten/moten/backend/_utils.py:59: UserWarning: Setting backend to torch_mps failed: No module named 'torch'. Falling back to numpy backend.
warnings.warn(f"Setting backend to {backend} failed: {str(error)}. "
No GPU backend available; skipping the GPU comparison.
Always reset the backend to numpy when you are done, so that other code is not affected by the global backend setting.
set_backend("numpy")
<module 'moten.backend.numpy' from '/home/runner/work/pymoten/pymoten/moten/backend/numpy.py'>
Benchmarking backends
Finally, pymoten ships a small helper,
moten.backend.benchmark(), that times the per-filter and batched
projections on one or all backends. It is handy for checking how much speed-up
a GPU gives on your own hardware.
from moten.backend import benchmark
results = benchmark("numpy", nimages=50, vdim=72, hdim=128)
numpy_result = results["numpy"]
print(f"numpy per-filter: {numpy_result['duration_seconds']:.3f}s")
print(f"numpy batched: {numpy_result['duration_batched_seconds']:.3f}s")
numpy per-filter: 1.868s
numpy batched: 1.545s
Calling benchmark() with no arguments times every available backend, so on
a machine with a GPU you can directly compare CPU and GPU timings.
Total running time of the script: (0 minutes 14.673 seconds)