Recent neural news recommenders (NNRs) extend content-based recommendation by (1) aligning additional aspects such as topic or sentiment between the candidate news and the user history or (2) diversifying recommendations w.r.t. these aspects. This customization is achieved by ``hardcoding'' additional constraints into the NNR's architecture and/or training objectives. Any change in the desired recommendation behavior thus requires retraining the model with a modified objective, which impedes wide adoption of multi-aspect news recommenders. In this work, we introduce MANNeR, a modular framework for flexible multi-aspect (neural) news recommendation that supports ad-hoc customization over individual aspects at inference time. With metric-based learning at its core, MANNeR obtains aspect-specialized news encoders and then flexibly combines aspect-specific similarity scores for the final ranking. Evaluation on two standard news recommendation benchmarks (one in English, one in Norwegian) shows that MANNeR consistently outperforms state-of-the-art NNRs on both standard content-based recommendation and single- and multi-aspect customization. Moreover, with MANNeR we can trivially scale the importance of individual aspects and find the optimal trade-off between content-based recommendation performance and aspect-based diversity of recommendations. Finally, we show that both MANNeR's content-based recommendation and its aspect customization are robust to domain and language transfer.
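
To make the inference-time combination of aspect-specific similarity scores concrete, the following is a minimal sketch rather than the actual implementation. It assumes that each encoder has already produced one similarity score per candidate, that scores are z-score normalized before being combined, and that the combination is a weighted sum; none of these details is stated above. The names `zscore`, `rank_candidates`, `content_scores`, `aspect_scores`, and `aspect_weights` are hypothetical.

```python
import numpy as np


def zscore(scores):
    """Standardize scores so that different encoders are comparable (assumption)."""
    scores = np.asarray(scores, dtype=float)
    std = scores.std()
    if std == 0:
        return np.zeros_like(scores)
    return (scores - scores.mean()) / std


def rank_candidates(content_scores, aspect_scores, aspect_weights):
    """Return candidate indices ordered from best to worst.

    content_scores: content-based similarity of each candidate to the user history.
    aspect_scores:  dict mapping an aspect name to per-candidate similarity scores.
    aspect_weights: dict mapping an aspect name to a weight chosen at inference time;
                    a positive weight rewards aspect similarity, a negative weight
                    penalizes it (diversification), and a missing aspect is ignored.
    """
    final = zscore(content_scores)
    for aspect, weight in aspect_weights.items():
        final = final + weight * zscore(aspect_scores[aspect])
    # Sort by descending final score; a stable sort keeps ties in input order.
    return np.argsort(-final, kind="stable")


if __name__ == "__main__":
    content = [0.9, 0.8, 0.1]         # toy content-based similarities
    aspects = {"topic": [0.9, 0.1, 0.5]}  # toy topic similarities

    print(rank_candidates(content, aspects, {}))               # content only
    print(rank_candidates(content, aspects, {"topic": 1.0}))   # align on topic
    print(rank_candidates(content, aspects, {"topic": -1.0}))  # diversify on topic
```

Under these assumptions, changing the recommendation behavior only requires changing `aspect_weights`; no encoder is retrained. In the toy data, a negative topic weight moves the second candidate, whose topic differs most from the user history, ahead of the first.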