Recent technological advancements in artificial intelligence and computer vision have enabled gait analysis on portable devices such as cell phones. However, most state-of-the-art vision-based systems still impose numerous constraints for capturing a patient's video, such as using a static camera and maintaining a specific distance from it. While these constraints are manageable under professional observation, they pose challenges in home settings. Another issue with most vision-based systems is their output, typically a classification label and confidence value, whose reliability is often questioned by medical professionals. This paper addresses these challenges by presenting a novel system for gait analysis robust to camera movements and providing explanations for its output. The study utilizes a dataset comprising videos of subjects wearing two types of Knee Ankle Foot Orthosis (KAFO), namely "Locked Knee" and "Semi-flexion," for mobility, along with metadata and ground truth for explanations. The ground truth highlights the statistical significance of seven features captured using motion capture systems to differentiate between the two gaits. To address camera movement challenges, the proposed system employs super-resolution and pose estimation during pre-processing. It then identifies the seven features - Stride Length, Step Length and Duration of single support of orthotic and non-orthotic leg, Cadence, and Speed - using the skeletal output of pose estimation. These features train a multi-layer perceptron, with its output explained by highlighting the features' contribution to classification. While most state-of-the-art systems struggle with processing the video or training on the proposed dataset, our system achieves an average accuracy of 94%. The model's explainability is validated using ground truth and can be considered reliable.