Online conversations are particularly susceptible to derailment, which can manifest as toxic communication patterns such as disrespectful comments and abuse. Conversation derailment forecasting aims to detect signs of derailment in advance, enabling proactive moderation. State-of-the-art approaches to conversation derailment forecasting encode conversations sequentially and use graph neural networks to model dialogue user dynamics. However, existing graph models are unable to capture complex conversational characteristics such as context propagation and emotional shifts. Commonsense knowledge enables a model to capture such characteristics, thus improving performance. Following this approach, we derive commonsense statements from a knowledge base of dialogue contextual information to enrich a graph neural network classification architecture. We fuse the multi-source information on each utterance into capsules, which a transformer-based forecaster uses to predict conversation derailment. Our model captures conversation dynamics and context propagation, outperforming state-of-the-art models on the CGA and CMV benchmark datasets.