Data Curves Clustering Using Common Patterns Detection

Over the past decades, the volume of data that humanity produces has grown enormously. Every day, a vast number of smart devices, usually interconnected over the internet, produce large real-valued datasets. Time series from completely unrelated domains, such as finance, weather, medical applications and traffic control, are becoming increasingly crucial in everyday life. Analyzing and clustering these time series, or more generally any kind of curve, can be critical for many human activities. This paper introduces the Curves Clustering Using Common Patterns (3CP) methodology, which applies a repeated pattern detection algorithm to cluster sequences according to their shape and the similarity of the patterns shared among time series, data curves and, ultimately, any kind of discrete sequence. For this purpose, the Longest Expected Repeated Pattern Reduced Suffix Array (LERP-RSA) data structure is used in combination with the All Repeated Patterns Detection (ARPaD) algorithm to detect similarities among data curves accurately and efficiently. These similarities can be used for clustering, and the approach also provides additional flexibility and features.
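The general idea of clustering curves by shared patterns can be illustrated with a minimal sketch. This is not the 3CP implementation, and it uses neither LERP-RSA nor ARPaD: it assumes a naive up/down/flat discretization, brute-force substring enumeration in place of suffix-array-based detection, Jaccard similarity, and single-link grouping above a threshold. All function names and parameters (`discretize`, `patterns`, `similarity`, `cluster`, `tol`, `min_len`, `max_len`, `threshold`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def discretize(curve, tol=0.0):
    """Encode a numeric curve as a string over {U, D, F} (up, down, flat steps)."""
    symbols = []
    for prev, curr in zip(curve, curve[1:]):
        delta = curr - prev
        if delta > tol:
            symbols.append("U")
        elif delta < -tol:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def patterns(s, min_len=2, max_len=4):
    """Return every substring of s whose length lies in [min_len, max_len]."""
    return {
        s[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(s) - n + 1)
    }


def similarity(a, b):
    """Jaccard similarity between the pattern sets of two symbol strings."""
    pa, pb = patterns(a), patterns(b)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0


def cluster(curves, threshold=0.5):
    """Group curve indices whose pattern similarity reaches the threshold."""
    strings = [discretize(c) for c in curves]
    parent = list(range(len(curves)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(curves)), 2):
        if similarity(strings[i], strings[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(curves)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    curves = [
        [1, 2, 3, 2, 1, 2, 3, 2, 1],            # oscillating
        [10, 20, 30, 20, 10, 20, 30, 20, 10],   # same shape, different scale
        [5, 4, 3, 2, 1, 0, -1, -2, -3],         # steadily decreasing
    ]
    print(cluster(curves))
```

Because the discretization keeps only the direction of each step, the first two curves map to the same symbol string despite their different scales and end up in one group, while the monotone curve stays separate. Brute-force enumeration is quadratic in sequence length for each pattern length, which is the kind of cost a suffix-array-based structure is meant to avoid.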


research
01/09/2022

OPP-Miner: Order-preserving sequential pattern mining

A time series is a collection of measurements in chronological order. Di...
research
10/24/2022

Applications of Machine Learning in Pharmacogenomics: Clustering Plasma Concentration-Time Curves

Pharmaceutical researchers are continually searching for techniques to i...
research
10/14/2021

Time Series Clustering for Human Behavior Pattern Mining

Human behavior modeling deals with learning and understanding of behavio...
research
08/14/2015

Fuzzy Longest Common Subsequence Matching With FCM Using R

Capturing the interdependencies between real valued time series can be a...
research
04/24/2023

Analyzing categorical time series with the R package ctsfeatures

Time series data are ubiquitous nowadays. Whereas most of the literature...
research
07/02/2021

Depth-based Outlier Detection for Grouped Smart Meters: a Functional Data Analysis Toolbox

Smart metering infrastructures collect data almost continuously in the f...
research
02/02/2010

Detecting Motifs in System Call Sequences

The search for patterns or motifs in data represents an area of key inte...
