Abstract: Why do biological and artificial neurons sometimes modularise, each encoding a single meaningful variable, and sometimes entangle their representation of many variables? In this work, we develop a theory of when biologically inspired representations -- those that are nonnegative and energy efficient -- modularise with respect to source variables (sources). We derive necessary and sufficient conditions on a sample of sources that determine whether the neurons in an optimal biologically inspired linear autoencoder modularise. Our theory applies to any dataset, extending far beyond the case of statistical independence studied in previous work. Rather, we show that sources modularise if their support is "sufficiently spread". From this theory, we extract and validate predictions in a variety of empirical studies on how data distribution affects modularisation in nonlinear feedforward and recurrent neural networks trained on supervised and unsupervised tasks. Furthermore, we apply these ideas to neuroscience data. First, we explain why two studies that recorded prefrontal activity in working memory tasks conflict on whether memories are encoded in orthogonal subspaces: the support of the sources differed due to a critical discrepancy in experimental protocol. Second, we use similar arguments to understand why preparatory and potent subspaces in RNN models of motor cortex are only sometimes orthogonal. Third, we study the mixing of spatial and reward information in entorhinal recordings, and show that our theory matches the data better than previous work. Fourth, we suggest a suite of surprising settings in which neurons can be (or appear) mixed selective, without requiring complex nonlinear readouts as in traditional theories. In sum, our theory prescribes precise conditions on when neural activities modularise, providing tools for inducing and elucidating modular representations in brains and machines.