Gallant Lab

A laboratory of cognitive, systems & computational neuroscience

Disentangling Superpositions: Interpretable Brain Encoding Model with Sparse Concept Atoms (Zeng and Gallant, NeurIPS, 2025)

November 13, 2025

Dense ANN word embeddings entangle multiple concepts within each feature, making the resulting encoding model weight maps difficult to interpret. We use a Sparse Concept Encoding Model to produce a feature space in which each dimension corresponds to a single interpretable concept. The resulting model matches the prediction performance of dense models while substantially improving interpretability.
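To make the idea concrete, here is a minimal sketch of the two-stage recipe the summary describes, using synthetic data and off-the-shelf scikit-learn components. All shapes, hyperparameters, and the choice of dictionary learning for the sparse concept step are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 200 stimuli with 32-dim dense word
# embeddings, and recorded responses for 10 voxels.
X_dense = rng.standard_normal((200, 32))
Y = rng.standard_normal((200, 10))

# Stage 1: re-express each dense embedding as a sparse code over a
# learned dictionary of "concept atoms" (48 atoms here). The lasso
# transform drives most atom activations to exactly zero.
dl = DictionaryLearning(n_components=48, alpha=1.0,
                        transform_algorithm="lasso_lars",
                        max_iter=20, random_state=0)
X_sparse = dl.fit_transform(X_dense)

# Stage 2: fit a standard linear (ridge) encoding model from the
# sparse concept features to the voxel responses.
enc = Ridge(alpha=10.0).fit(X_sparse, Y)

# Each voxel's weight vector is now indexed by individual atoms,
# so large weights can be read off concept by concept.
print(X_sparse.shape, enc.coef_.shape)
```

Because the ridge weights live over sparse, individually nameable atoms rather than entangled dense features, inspecting a voxel's largest weights directly yields a per-concept interpretation.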
