A comparative study of top-k high utility itemset mining methods

09/04/2018
by Srikumar Krishnamoorthy, et al.

The High Utility Itemset (HUI) mining problem is one of the important problems in the data mining literature. It offers a decision maker greater flexibility to incorporate her/his notion of utility into the pattern mining process. The problem, however, requires the decision maker to choose a minimum utility threshold value for discovering interesting patterns. Choosing this value is quite challenging because itemset characteristics and their utility distributions are so disparate. To address this issue, the Top-K High Utility Itemset (THUI) mining problem was introduced in the literature. The THUI mining problem is primarily a variant of the HUI mining problem that allows a decision maker to specify the desired number of HUIs rather than the minimum utility threshold value. Several algorithms have been introduced in the literature to efficiently mine top-k HUIs. This paper systematically analyses the top-k HUI mining methods in the literature, describes the methods, and performs a comparative analysis. The data structures, threshold raising strategies, and pruning strategies adopted for efficient top-k HUI mining are also presented and analysed. Furthermore, the paper reviews several extensions of the top-k HUI mining problem, such as data stream mining, sequential pattern mining, and on-shelf utility mining. The paper is likely to help researchers examine the key methods in top-k HUI mining, evaluate the gaps in the literature, explore new research opportunities, and enhance the state of the art in high utility pattern mining.
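To make the difference between the two problem statements concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms the paper surveys: the toy database, the `unit_profit` table, and the function names are all hypothetical, and the code enumerates every itemset with none of the data structures or pruning strategies the abstract mentions. The only idea it borrows is that the smallest utility held in a size-k min-heap behaves like an internal threshold that rises as better itemsets are found.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, database, profit):
    """Sum the itemset's utility over every transaction containing all its items."""
    total = 0
    for tx in database:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def all_itemsets(database):
    """Yield every non-empty itemset over the items in the database (exponential)."""
    items = sorted({item for tx in database for item in tx})
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def hui_by_threshold(database, profit, min_util):
    """Classic HUI formulation: the user supplies a minimum utility threshold."""
    result = []
    for itemset in all_itemsets(database):
        u = utility(itemset, database, profit)
        if u >= min_util:
            result.append((u, itemset))
    return sorted(result, key=lambda e: (-e[0], e[1]))


def top_k_hui(database, profit, k):
    """Top-k formulation: the user supplies k; the threshold is discovered."""
    heap = []  # min-heap of (utility, itemset); heap[0] is the current k-th best
    for itemset in all_itemsets(database):
        u = utility(itemset, database, profit)
        if u == 0:
            continue  # itemset never occurs in the database
        if len(heap) < k:
            heapq.heappush(heap, (u, itemset))
        elif u > heap[0][0]:
            # The internal threshold heap[0][0] rises with each replacement.
            heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, key=lambda e: (-e[0], e[1]))


print(top_k_hui(transactions, unit_profit, 2))
print(hui_by_threshold(transactions, unit_profit, 16))
```

With the threshold formulation the user has to guess a value such as 16 without knowing how many itemsets it will return; with the top-k formulation the user asks for k itemsets directly. Ties at the k-th position are broken arbitrarily in this sketch.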


research
12/25/2019

Utility Mining Across Multi-Sequences with Individualized Thresholds

Utility-oriented pattern mining has become an emerging topic since it ca...
research
07/06/2023

Finding Favourite Tuples on Data Streams with Provably Few Comparisons

One of the most fundamental tasks in data science is to assist a user wi...
research
03/25/2023

Targeted Mining of Top-k High Utility Itemsets

Finding high-importance patterns in data is an emerging data mining task...
research
08/27/2022

A Generic Algorithm for Top-K On-Shelf Utility Mining

On-shelf utility mining (OSUM) is an emerging research direction in data...
research
12/24/2019

High Utility Interval-Based Sequences

Sequential pattern mining is an interesting research area with broad ran...
research
03/23/2023

Extended High Utility Pattern Mining: An Answer Set Programming Based Framework and Applications

Detecting sets of relevant patterns from a given dataset is an important...
research
09/27/2022

Contrast Pattern Mining: A Survey

Contrast pattern mining (CPM) is an important and popular subfield of da...
