We consider the problem of dynamic channel allocation (DCA) in cognitive communication networks with the goal of maximizing a global signal-to-interference-plus-noise ratio (SINR) measure under a specified target quality of service (QoS)-SINR for each network. The shared bandwidth is partitioned into K channels with frequency separation. In contrast to the majority of existing studies that assume perfect orthogonality or a one- to-one user-channel allocation mapping, this paper focuses on real-world systems experiencing inter-carrier interference (ICI) and channel reuse by multiple large-scale networks. This realistic scenario significantly increases the problem dimension, rendering existing algorithms inefficient. We propose a novel multi-agent reinforcement learning (RL) framework for distributed DCA, named Channel Allocation RL To Overlapped Networks (CARLTON). The CARLTON framework is based on the Centralized Training with Decentralized Execution (CTDE) paradigm, utilizing the DeepMellow value-based RL algorithm. To ensure robust performance in the interference-laden environment we address, CARLTON employs a low-dimensional representation of observations, generating a QoS-type measure while maximizing a global SINR measure and ensuring the target QoS-SINR for each network. Our results demonstrate exceptional performance and robust generalization, showcasing superior efficiency compared to alternative state-of-the-art methods, while achieving a marginally diminished performance relative to a fully centralized approach.