This paper introduces a cross adversarial source separation (CASS) framework via autoencoder, a new model that aims at separating an input signal consisting of a mixture of multiple components into individual components defined via adversarial learning and autoencoder fitting. CASS unifies popular generative networks like auto-encoders (AEs) and generative adversarial networks (GANs) in a single framework. The basic building block that filters the input signal and reconstructs the $i$-th target component is a pair of deep neural networks $\mathcal{EN}_i$ and $\mathcal{DE}_i$ as an encoder for dimension reduction and a decoder for component reconstruction, respectively. The decoder $\mathcal{DE}_i$ as a generator is enhanced by a discriminator network $\mathcal{D}_i$ that favors signal structures of the $i$-th component in the $i$-th given dataset as guidance through adversarial learning. In contrast with existing practices in AEs which trains each Auto-Encoder independently, or in GANs that share the same generator, we introduce cross adversarial training that emphasizes adversarial relation between any arbitrary network pairs $(\mathcal{DE}_i,\mathcal{D}_j)$, achieving state-of-the-art performance especially when target components share similar data structures.