Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as "where has the dataset used to train this model come from?"; the introduction of several new data protection legislation; and, the need for data governance requirements, have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle to build ML models to discover sweet-spots for shale oil and gas production, a major application in the Oil and Gas O&G Industry.