A Guided FP-growth algorithm for fast mining of frequent itemsets from big data

03/18/2018
by Lior Shabtay, et al.

In this paper we present the GFP-growth (Guided FP-growth) algorithm, a novel method for finding the counts of a given list of itemsets in large data. Unlike FP-growth, our algorithm focuses only on the specific itemsets of interest, and hence its time and memory costs are lower. We prove that the GFP-growth algorithm yields the exact frequency counts for the required itemsets. We show that, for a number of different problems, a solution can be devised that takes advantage of this efficient multi-targeted mining to boost performance. In particular, we study in detail the problem of generating minority-class rules from imbalanced data, a scenario that appears in many real-life domains such as medical applications, failure prediction, network and cyber security, and maintenance. We develop the Minority-Report Algorithm, which uses GFP-growth to boost performance. We prove some theoretical properties of the Minority-Report Algorithm and demonstrate its superior performance using simulations and real data.
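To make the task concrete, the following minimal sketch shows what "finding the counts of a given list of itemsets" means. It is a naive baseline that scans every transaction against every target, not the GFP-growth algorithm itself, which the abstract does not specify in detail. The function name `count_target_itemsets` and the toy data are assumptions introduced only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set


def count_target_itemsets(
    transactions: Iterable[Set[str]],
    targets: Iterable[Set[str]],
) -> Dict[FrozenSet[str], int]:
    """Count how many transactions contain each target itemset.

    Naive baseline (hypothetical helper, not GFP-growth): one pass over
    the data, testing every target against every transaction.
    """
    counts: Dict[FrozenSet[str], int] = {frozenset(t): 0 for t in targets}
    for transaction in transactions:
        items = set(transaction)
        for target in counts:
            # A transaction supports a target if it contains all of its items.
            if target <= items:
                counts[target] += 1
    return counts


# Toy data, invented for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "d"},
    {"a", "b", "c", "d"},
]
targets = [{"a", "c"}, {"b", "d"}, {"a", "d"}]

counts = count_target_itemsets(transactions, targets)
for itemset, count in counts.items():
    print(sorted(itemset), count)
```

The baseline costs time proportional to the number of transactions times the number of targets. The abstract's claim is that GFP-growth produces the same exact counts at lower time and memory cost by restricting the mining to the itemsets of interest.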
