With the wide adoption of mobile devices, today's location tracking systems such as satellites, cellular base stations and wireless access points are continuously producing tremendous amounts of location data of moving objects. The ability to discover moving objects that travel together, i.e., traveling companions, from their trajectories is desired by many applications such as intelligent transportation systems and location-based services. Existing algorithms are either based on pattern mining methods that define a particular pattern of traveling companions or based on representation learning methods that learn similar representations for similar trajectories. The former methods suffer from the pairwise point-matching problem and the latter often ignore the temporal proximity between trajectories. In this work, we propose a generic deep representation learning model using autoencoders, namely, ATTN-MEAN, for the discovery of traveling companions. ATTN-MEAN collectively injects spatial and temporal information into its input embeddings using skip-gram, positional encoding techniques, respectively. Besides, our model further encourages trajectories to learn from their neighbours by leveraging the Sort-Tile-Recursive algorithm, mean operation and global attention mechanism. After obtaining the representations from the encoders, we run DBSCAN to cluster the representations to find travelling companion. The corresponding trajectories in the same cluster are considered as traveling companions. Experimental results suggest that ATTN-MEAN performs better than the state-of-the-art algorithms on finding traveling companions.