Social norms are shared rules that govern and facilitate social interaction. Violating such social norms via teasing and insults may serve to upend power imbalances or, on the contrary reinforce solidarity and rapport in conversation, rapport which is highly situated and context-dependent. In this work, we investigate the task of automatically identifying the phenomena of social norm violation in discourse. Towards this goal, we leverage the power of recurrent neural networks and multimodal information present in the interaction, and propose a predictive model to recognize social norm violation. Using long-term temporal and contextual information, our model achieves an F1 score of 0.705. Implications of our work regarding developing a social-aware agent are discussed.