Rare correlated patterns: Characterization and new concise representations
Rare pattern mining has recently proved valuable in a range of data mining applications, since rare patterns convey knowledge about infrequent and unexpected events. However, the extraction of rare patterns suffers from two main limitations: the large number of patterns mined in real-life applications, and the low informativeness of many of them. To address this, we propose integrating the bond correlation measure into the mining process, so as to retain only those rare patterns whose items exhibit a certain degree of correlation. We then characterize the resulting set of rare correlated patterns by studying the constraints of distinct types induced by rarity and correlation. In addition, based on the equivalence classes associated with a closure operator dedicated to the bond measure, we propose concise representations of rare correlated patterns. We design a new algorithm, CRP_Miner, dedicated to extracting the whole set of rare correlated patterns, and introduce the CRPR_Miner algorithm, which efficiently extracts the proposed concise representations. We also design two further algorithms which allow querying and regenerating the whole set of rare correlated patterns. Experimental studies show the effectiveness of CRPR_Miner and demonstrate the compactness rates offered by the proposed concise representations.
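To make the two constraints concrete, the following is a minimal brute-force sketch, not the CRP_Miner or CRPR_Miner algorithms themselves. It assumes the usual definition of bond as the ratio of a pattern's conjunctive support (transactions containing all of its items) to its disjunctive support (transactions containing at least one of its items), treats a pattern as rare when its conjunctive support is strictly below a threshold, and as correlated when its bond is at least a second threshold. The function names, the toy dataset, and the threshold values are illustrative assumptions.

```python
from itertools import combinations


def conjunctive_support(pattern, transactions):
    """Number of transactions containing every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def disjunctive_support(pattern, transactions):
    """Number of transactions containing at least one item of the pattern."""
    return sum(1 for t in transactions if pattern & t)


def bond(pattern, transactions):
    """Bond = conjunctive support / disjunctive support (0.0 if undefined)."""
    disj = disjunctive_support(pattern, transactions)
    return conjunctive_support(pattern, transactions) / disj if disj else 0.0


def rare_correlated_patterns(transactions, min_supp, min_bond):
    """Enumerate every non-empty itemset and keep the rare correlated ones.

    Assumed thresholds: rare means support < min_supp (absolute count),
    correlated means bond >= min_bond. Exhaustive enumeration is exponential
    in the number of items and only suitable for toy data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            supp = conjunctive_support(pattern, transactions)
            b = bond(pattern, transactions)
            if supp < min_supp and b >= min_bond:
                result[pattern] = (supp, b)
    return result


# Hypothetical toy dataset: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"d", "e"},
    {"a"},
]

result = rare_correlated_patterns(transactions, min_supp=2, min_bond=0.5)
for pattern, (supp, b) in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pattern), supp, b)
```

On this toy data, the pattern {b, c} is rare (it occurs in one transaction) but is filtered out because its bond is only 1/3, whereas {d, e} is rare and fully correlated. This illustrates how the correlation constraint can prune rare patterns whose items seldom occur together relative to how often any of them occurs.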