We propose an autoencoder (AE)-based transceiver for a wavelength-division multiplexing (WDM) system impaired by hardware imperfections. We design our AE following the architecture of conventional communication systems. This allows the AE-based transceiver to be initialized with performance similar to that of its conventional counterpart prior to training, and it improves the training convergence rate. We first train the AE in a single-channel system and show that it achieves performance improvements by putting energy outside the desired bandwidth; the resulting transceiver therefore cannot be used in a WDM system. We then train the AE in a WDM setup. Simulation results show that the proposed AE significantly outperforms the conventional approach. More specifically, it increases the spectral efficiency of the considered system by reducing the guard band by 37\% and 50\% for a root-raised-cosine filter-based matched filter with 10\% and 1\% roll-off, respectively. An ablation study indicates that the performance gain can be ascribed to the optimization of the symbol mapper, the pulse-shaping filter, and the symbol demapper. Finally, we use reinforcement learning to learn the pulse-shaping filter under the assumption that the channel model is unknown. Simulation results show that the reinforcement-learning-based algorithm achieves performance similar to that of the standard supervised end-to-end learning approach, which assumes perfect channel knowledge.