We present an efficient and realistic geometric sound simulation approach for generating and augmenting training data in speech-related machine learning tasks. Our physically based acoustic simulation method is capable of modeling occlusion, specular and diffuse reflections of sound in complicated acoustic environments, whereas the classical image method can only model specular reflections in simple room settings. We show that by using our synthetic training data, the same models gain significant performance improvement on real test sets in both speech recognition and keyword spotting tasks, without fine tuning using any real data.