Drug discovery is a vital early stage of the drug development pipeline. Following their success in other domains, machine learning methods are increasingly used to model drug-target interactions for rational drug discovery. In these approaches, the numerical representation of molecules is critical to model performance. Significant progress in molecular representation engineering has yielded several descriptors for both targets and compounds. In addition, the interpretability of model predictions is a valuable feature that could have several pharmacological applications. In this study, we propose a self-attention-based, multi-view representation learning approach for modeling drug-target interactions. We evaluated our approach on three large-scale kinase datasets and compared six variants of our method against 16 baselines. Our experimental results demonstrate that our method achieves high accuracy and offers biologically plausible interpretations through neural attention.