P. K. Mishra

RDD-Eclat: Approaches to Parallelize Eclat Algorithm on Spark RDD Framework

Dec 13, 2019

Pankaj Singh, Sudhakar Singh, P. K. Mishra, Rakhi Garg

Abstract: Initially, a number of frequent itemset mining (FIM) algorithms were designed on Hadoop MapReduce, a distributed big data processing framework. Because of its heavy disk I/O, however, MapReduce proved inefficient for such highly iterative algorithms. Spark, a more efficient distributed data processing framework, was therefore developed with in-memory computation and resilient distributed datasets (RDDs) to support iterative algorithms. Apriori- and FP-Growth-based FIM algorithms have been designed on the Spark RDD framework, but an Eclat-based algorithm has not yet been explored. This paper proposes RDD-Eclat, a parallel Eclat algorithm on the Spark RDD framework, along with its five variants. The proposed algorithms are evaluated on various benchmark datasets, and the results show that RDD-Eclat outperforms Spark-based Apriori by many times. The experimental results also show that the proposed algorithms scale as the number of cores and the size of the dataset increase.

* ICCNCT 2019, LNDECT 44
* 16 pages, 6 figures, ICCNCT 2019
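
For readers unfamiliar with Eclat, the core idea named in the abstract above can be sketched in a few lines. This is a minimal sequential sketch in plain Python, not the paper's RDD-Eclat or any of its five variants, and it does not use Spark. It assumes the standard Eclat formulation: store the data vertically (each item mapped to the set of transaction IDs that contain it) and grow itemsets depth-first by intersecting those ID sets. The function name `eclat`, its parameters, and the sample transactions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Illustrative sequential Eclat; `min_support` is an absolute count.
    """
    # Vertical layout: item -> set of transaction ids (tidset) containing it.
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs that are frequent
        # when combined with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the tidsets of the remaining items to find
            # frequent extensions of the current itemset.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    next_candidates.append((other, common))
            if next_candidates:
                extend(itemset, next_candidates)

    # Start from the frequent single items, in a fixed order.
    initial = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent


if __name__ == "__main__":
    # Hypothetical toy dataset, not one of the paper's benchmarks.
    sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in eclat(sample, min_support=3).items():
        print(sorted(itemset), count)
```

Because each recursive call depends only on the tidsets under its own prefix, the subtrees are independent of one another, which is one reason Eclat is a natural candidate for parallel execution; how RDD-Eclat actually partitions this work is described in the paper itself.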

Mining Association Rules in Various Computing Environments: A Survey

Jun 30, 2019

Sudhakar Singh, Pankaj Singh, Rakhi Garg, P. K. Mishra

Abstract: Association rule mining (ARM) is one of the best-known and most researched techniques in data mining. So many ARM algorithms have been designed that they are difficult to count. This paper surveys ARM algorithms in four computing environments: sequential computing, parallel and distributed computing, grid computing, and cloud computing. With the emergence of each new computing paradigm, researchers have designed ARM algorithms that use it to improve efficiency. The paper traces the journey of ARM algorithms from sequential algorithms, through parallel and distributed and grid-based algorithms, to the current state of the art, along with the motives for adopting new machinery.

* International Journal of Applied Engineering Research 2016; 11(8): 5629-5640
* 14 pages
