Adam Gudyś

RuleKit 2: Faster and simpler rule learning

Apr 29, 2025

Adam Gudyś, Cezary Maszczyk, Joanna Badura, Adam Grzelak, Marek Sikora, Łukasz Wróbel

Abstract: Rules offer an invaluable combination of predictive and descriptive capabilities. Our package for rule-based data analysis, RuleKit, has proven its effectiveness in classification, regression, and survival problems. Here we present its second version. New algorithms and optimized implementations of the previously included ones have significantly improved the computational performance of the suite, reducing the analysis time for some data sets by two orders of magnitude. Two new components improve the usability of RuleKit 2: a Python package and a browser application with a graphical user interface. The former is compatible with scikit-learn, the most popular data mining library for Python, allowing RuleKit 2 to be straightforwardly integrated into existing data analysis pipelines. RuleKit 2 is available on GitHub under the GNU AGPL 3 license (https://github.com/adaa-polsl/RuleKit).

* 10 pages, 3 figures, 2 tables

Separate and conquer heuristic allows robust mining of contrast sets from various types of data

Apr 01, 2022

Adam Gudyś, Marek Sikora, Łukasz Wróbel

Abstract: Identifying differences between groups is one of the most important knowledge discovery problems. The procedure, also known as contrast set mining, is applied in a wide range of areas such as medicine, industry, and economics. In this paper we present RuleKit-CS, an algorithm for contrast set mining based on sequential covering, a well-established heuristic for decision rule induction. Unlike standard sequential covering, RuleKit-CS makes multiple passes accompanied by an attribute penalization scheme, which allows it to generate contrast sets that describe the same examples with different attributes. The ability to identify contrast sets in regression and survival data sets, a feature not provided by existing algorithms, further extends the usability of RuleKit-CS. Experiments on a wide range of data sets confirmed RuleKit-CS to be a useful tool for discovering differences between defined groups. The algorithm is part of the RuleKit suite, available on GitHub under the GNU AGPL 3 licence (https://github.com/adaa-polsl/RuleKit). Keywords: Contrast sets, Sequential covering, Rule induction, Regression, Survival, Knowledge discovery

RuleKit: A Comprehensive Suite for Rule-Based Learning

Aug 02, 2019

Adam Gudyś, Marek Sikora, Łukasz Wróbel

Abstract: Rule-based models are often used for data analysis as they combine interpretability with predictive power. We present RuleKit, a versatile tool for rule learning. Based on a sequential covering induction algorithm, it is suitable for classification, regression, and survival problems. User-guided induction facilitates verifying hypotheses about data dependencies that are expected or of interest. The powerful and flexible experimental environment allows straightforward investigation of different induction schemes. The analysis can be performed in batch mode, through a RapidMiner plug-in, or through an R package. A documented Java API is also provided for convenience. The software is publicly available on GitHub under the GNU AGPL-3.0 license.

* 5 pages, 3 figures
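
The sequential covering (separate-and-conquer) idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not RuleKit's implementation or API: the function names (`covers`, `precision`, `grow_rule`, `sequential_covering`), the toy data, the restriction to categorical attributes, and the use of plain precision as the rule quality measure are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Example = Dict[str, str]       # attribute name -> categorical value
Condition = Tuple[str, str]    # (attribute, value), read as attribute == value
Rule = List[Condition]         # a rule is a conjunction of conditions


def covers(rule: Rule, example: Example) -> bool:
    """A rule covers an example when all of its conditions hold."""
    return all(example[attr] == value for attr, value in rule)


def precision(rule: Rule, examples: List[Example], labels: List[str], target: str) -> float:
    """Fraction of covered examples that belong to the target class."""
    covered = [y for x, y in zip(examples, labels) if covers(rule, x)]
    if not covered:
        return 0.0
    return sum(y == target for y in covered) / len(covered)


def grow_rule(examples: List[Example], labels: List[str], target: str) -> Rule:
    """'Conquer' step: greedily add the condition that most improves precision."""
    rule: Rule = []
    best = precision(rule, examples, labels, target)
    while best < 1.0:
        # Candidate conditions are taken from examples the rule still covers.
        candidates = sorted({
            (attr, value)
            for x in examples if covers(rule, x)
            for attr, value in x.items()
            if (attr, value) not in rule
        })
        if not candidates:
            break
        # Ties in precision are broken by the lexicographically largest condition.
        score, cond = max(
            (precision(rule + [c], examples, labels, target), c) for c in candidates
        )
        if score <= best:
            break  # no condition improves the rule any further
        rule.append(cond)
        best = score
    return rule


def sequential_covering(examples: List[Example], labels: List[str], target: str) -> List[Rule]:
    """Learn rules for one class until all of its examples are covered."""
    remaining = list(zip(examples, labels))
    rules: List[Rule] = []
    while any(y == target for _, y in remaining):
        xs = [x for x, _ in remaining]
        ys = [y for _, y in remaining]
        rule = grow_rule(xs, ys, target)
        if not rule:
            break  # nothing beats the base rate; stop rather than loop forever
        rules.append(rule)
        # 'Separate' step: drop the target-class examples the new rule covers.
        remaining = [(x, y) for x, y in remaining
                     if not (y == target and covers(rule, x))]
    return rules


# Hypothetical toy data set, invented for this sketch.
examples = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["play", "play", "stay", "play", "stay"]

rules = sequential_covering(examples, labels, target="play")
for rule in rules:
    print(" AND ".join(f"{a} = {v}" for a, v in rule), "=> play")
```

A practical rule learner would also handle numerical attributes, pruning, and more robust quality measures; the sketch only shows the alternation between growing one rule and removing the examples it covers.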

GuideR: a guided separate-and-conquer rule learning in classification, regression, and survival settings

Jun 05, 2018

Marek Sikora, Łukasz Wróbel, Adam Gudyś

Abstract: This article presents GuideR, a user-guided rule induction algorithm that overcomes the largest limitation of existing methods: the inability to introduce user preferences or domain knowledge into the rule learning process. Automatic selection of attributes and attribute ranges often produces rules that contain no interesting information. We propose an induction algorithm that takes the user's requirements into account. Our method uses the sequential covering approach and is suitable for classification, regression, and survival analysis problems. The effectiveness of the algorithm in all these tasks has been verified experimentally, confirming guided rule induction to be a powerful data analysis tool.
