Remis Balaniuk

Rastro-DM: data mining with a trail

Jan 08, 2024

Marcus Vinicius Borela de Castro, Remis Balaniuk

Abstract: This paper proposes a methodology for documenting data mining (DM) projects, Rastro-DM (Trail Data Mining), with a focus not on the model that is generated, but on the processes behind its construction, in order to leave a trail (Rastro in Portuguese) of planned actions, training completed, results obtained, and lessons learned. The proposed practices are complementary to structuring methodologies of DM, such as CRISP-DM, which establish a methodological and paradigmatic framework for the DM process. The application of best practices and their benefits is illustrated in a project called 'Cladop' that was created for the classification of PDF documents associated with the investigative process of damages to the Brazilian Federal Public Treasury. Building the Rastro-DM kit in the context of a project is a small step that can lead to an institutional leap to be achieved by sharing and using the trail across the enterprise.

* Revista do TCU (Brazilian Federal Court of Accounts), 145 (2021): 79-106
* Published in the Brazilian Federal Court of Accounts Journal, n. 145, in 2021 (https://revista.tcu.gov.br/ojs/index.php/RTCU/article/view/1733)

Mining and Tailings Dam Detection In Satellite Imagery Using Deep Learning

Jul 02, 2020

Remis Balaniuk, Olga Isupova, Steven Reece

Abstract: This work explores the combination of free cloud computing, free open-source software, and deep learning methods to analyse a real, large-scale problem: the automatic country-wide identification and classification of surface mines and mining tailings dams in Brazil. Locations of officially registered mines and dams were obtained from the Brazilian government open data resource. Multispectral Sentinel-2 satellite imagery, obtained and processed on the Google Earth Engine platform, was used to train and test deep neural networks using the TensorFlow 2 API and the Google Colab platform. Fully Convolutional Neural Networks were used in an innovative way to search for unregistered ore mines and tailings dams in large areas of the Brazilian territory. The efficacy of the approach is demonstrated by the discovery of 263 mines that do not have an official mining concession. This exploratory work highlights the potential of a set of new, freely available technologies for the construction of low-cost data science tools that have high social impact. At the same time, it discusses and seeks to suggest practical solutions for the complex and serious problem of illegal mining and the proliferation of tailings dams, which pose high risks to the population and the environment, especially in developing countries. Code is made publicly available at: https://github.com/remis/mining-discovery-with-deep-learning.

* Preprint submitted to Remote Sensing of Environment

BrazilDAM: A Benchmark dataset for Tailings Dam Detection

Mar 17, 2020

Edemir Ferreira, Matheus Brito, Remis Balaniuk, Mário S. Alvim, Jefersson A. dos Santos

Abstract: In this work we present BrazilDAM, a novel public dataset based on Sentinel-2 and Landsat-8 satellite images covering all tailings dams cataloged by the Brazilian National Mining Agency (ANM). The dataset was built using georeferenced images from 769 dams, recorded between 2016 and 2019. The time series were processed in order to produce cloud-free images. The dams contain mining waste from different ore categories and have highly varying shapes, areas and volumes, making BrazilDAM particularly interesting and challenging for machine learning benchmarks. Besides the dam coordinates, the original catalog contains information about the main ore, construction method, risk category, and associated potential damage. To evaluate BrazilDAM's predictive potential we performed classification experiments using state-of-the-art deep Convolutional Neural Networks (CNNs). In the experiments, we achieved an average classification accuracy of 94.11% in the tailings dam binary classification task. In addition, four other experimental setups were run using the complementary information from the original catalog, exhaustively exploiting the capacity of the proposed dataset.
