A Guided FP-growth algorithm for fast mining of frequent itemsets from big data

03/18/2018
by Lior Shabtay, et al.

In this paper we present the GFP-growth (Guided FP-growth) algorithm, a novel method for finding the counts of a given list of itemsets in large data. Unlike FP-growth, our algorithm focuses only on the specific itemsets of interest, and hence its time and memory costs are lower. We prove that the GFP-growth algorithm yields the exact frequency counts for the required itemsets. We show that, for a number of different problems, a solution can be devised that takes advantage of an efficient implementation of multi-targeted mining to boost performance. In particular, we study in detail the problem of generating minority-class rules from imbalanced data, a scenario that appears in many real-life domains such as medical applications, failure prediction, network and cyber security, and maintenance. We develop the Minority-Report Algorithm, which uses GFP-growth to boost performance. We prove some theoretical properties of the Minority-Report Algorithm and demonstrate its superior performance using simulations and real data.
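To make the problem statement concrete, the sketch below counts the exact frequency of each itemset in a given target list, and then computes the confidence of a rule that points at a minority class. It is a brute-force baseline written for illustration only, not the GFP-growth or Minority-Report algorithms, which the abstract does not specify in detail. The function names, the toy transactions, and the class labels are all assumptions introduced here.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set


def count_target_itemsets(
    transactions: Iterable[Set[str]],
    targets: Iterable[Set[str]],
) -> Dict[FrozenSet[str], int]:
    """Count how many transactions contain each target itemset.

    Brute-force baseline (hypothetical helper): every transaction is
    checked against every target. A guided FP-tree method aims to
    return the same exact counts at lower cost.
    """
    counts: Dict[FrozenSet[str], int] = {frozenset(t): 0 for t in targets}
    for transaction in transactions:
        for target in counts:
            if target <= transaction:  # target is a subset of the transaction
                counts[target] += 1
    return counts


def minority_rule_confidence(
    transactions: List[Set[str]],
    labels: List[Hashable],
    itemset: Set[str],
    minority_label: Hashable,
) -> float:
    """Confidence of the rule itemset -> minority_label (hypothetical helper).

    Confidence = (minority-class transactions containing the itemset)
                 / (all transactions containing the itemset).
    """
    minority = [t for t, y in zip(transactions, labels) if y == minority_label]
    in_minority = count_target_itemsets(minority, [itemset])[frozenset(itemset)]
    in_all = count_target_itemsets(transactions, [itemset])[frozenset(itemset)]
    return in_minority / in_all if in_all else 0.0


# Toy data, invented for this example.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
labels = ["fail", "ok", "ok", "fail"]  # "fail" plays the minority-class role

targets = [{"a", "c"}, {"b"}, {"a", "b", "c"}]
counts = count_target_itemsets(transactions, targets)
confidence = minority_rule_confidence(transactions, labels, {"a", "b"}, "fail")
```

The baseline scans every transaction once per target itemset, which is the cost that a targeted, tree-based approach is meant to reduce while still returning exact counts.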


