Abstract: Most of the existing machine-learning schemes applied to atomic-scale simulations rely on a local description of the geometry of a structure, and struggle to model effects that are driven by long-range physical interactions. Efforts to overcome these limitations have focused on the direct incorporation of electrostatics, which is the most prominent effect, often relying on architectures that mirror the functional form of explicit physical models. Including other forms of non-bonded interactions, or predicting properties other than the interatomic potential, requires ad hoc modifications. We propose an alternative approach that extends the long-distance equivariant (LODE) framework to generate local descriptors of an atomic environment that resemble non-bonded potentials with arbitrary asymptotic behaviors, ranging from point-charge electrostatics to dispersion forces. We show that the LODE formalism is amenable to a direct physical interpretation in terms of a generalized multipole expansion, which simplifies its implementation and reduces the number of descriptors needed to capture a given asymptotic behavior. These generalized LODE features provide improved extrapolation capabilities when trained on structures dominated by a given asymptotic behavior, but do not help in capturing the wildly different energy scales that are relevant for a more heterogeneous data set. This approach provides a practical scheme to incorporate different types of non-bonded interactions, and a framework to investigate the interplay of physical and data-related considerations that underlie this challenging modeling problem.