Diverse fault types, fast re-closures and complicated transient states after a fault event make real-time fault location in power grids challenging. Existing localization techniques in this area rely on simplistic assumptions, such as static loads, or require much higher sampling rates or total measurement availability. This paper proposes a data-driven localization method based on a Convolutional Neural Network (CNN) classifier using bus voltages. Unlike prior data-driven methods, the proposed classifier is based on features with physical interpretations that are described in details. The accuracy of our CNN based localization tool is demonstrably superior to other machine learning classifiers in the literature. To further improve the location performance, a novel phasor measurement units (PMU) placement strategy is proposed and validated against other methods. A significant aspect of our methodology is that under very low observability (7% of buses), the algorithm is still able to localize the faulted line to a small neighborhood with high probability. The performance of our scheme is validated through simulations of faults of various types in the IEEE 68-bus power system under varying load conditions, system observability and measurement quality.