Classification non supervisée des données hétérogènes à large échelle (Unsupervised Classification of Large-Scale Heterogeneous Data)

07/02/2017
by Mohamed Ali Zoghlami, et al.

When clustering massive data, response time, disk access, and the quality of the resulting clusters become major issues for companies. In this context, we define a clustering framework for large-scale heterogeneous data that helps address these issues. The proposed framework combines descriptive analysis based on multiple correspondence analysis (MCA) with the MapReduce paradigm in a large-scale environment. The results are encouraging and show that the hybrid deployment is efficient in terms of both result quality and response time, on qualitative as well as quantitative data.
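The abstract gives no implementation details, so the following is only a minimal sketch of how an MCA-style encoding and a MapReduce-style clustering step could fit together; it is not the authors' framework. It assumes purely categorical input, uses k-means as a stand-in clustering algorithm, and runs the map and reduce steps in a single process rather than on a cluster. Instead of computing the MCA axes, it scales the indicator-matrix row profiles so that Euclidean distance equals the chi-square distance on which correspondence analysis is built. All function names (`chi2_profiles`, `map_chunk`, `reduce_stats`, `kmeans_mapreduce`) are hypothetical.

```python
from collections import Counter
from functools import reduce
from math import sqrt


def chi2_profiles(rows):
    """Encode categorical rows so that Euclidean distance between two encoded
    rows equals the chi-square distance between their indicator-matrix profiles."""
    n, q = len(rows), len(rows[0])
    # Count each (variable index, level) pair: the indicator-matrix column totals.
    counts = Counter((j, v) for row in rows for j, v in enumerate(row))
    columns = sorted(counts)
    # Row profile entry is 1/q when the level is present; column mass is count/(n*q).
    scale = {col: (1.0 / q) / sqrt(counts[col] / (n * q)) for col in columns}
    return [[scale[col] if row[col[0]] == col[1] else 0.0 for col in columns]
            for row in rows]


def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def map_chunk(chunk, centroids):
    """Map step: per-cluster partial sums and counts for one chunk of points."""
    stats = {}
    for point in chunk:
        k = nearest(point, centroids)
        total, count = stats.get(k, ([0.0] * len(point), 0))
        stats[k] = ([t + x for t, x in zip(total, point)], count + 1)
    return stats


def reduce_stats(left, right):
    """Reduce step: merge the partial sums and counts of two chunks."""
    merged = dict(left)
    for k, (total, count) in right.items():
        if k in merged:
            m_total, m_count = merged[k]
            merged[k] = ([a + b for a, b in zip(m_total, total)], m_count + count)
        else:
            merged[k] = (total, count)
    return merged


def kmeans_mapreduce(points, centroids, n_chunks=2, n_iter=10):
    """Iterate map and reduce over fixed chunks; assumes a non-empty point list."""
    size = -(-len(points) // n_chunks)  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    for _ in range(n_iter):
        stats = reduce(reduce_stats, (map_chunk(c, centroids) for c in chunks))
        # Clusters that received no points keep their previous centroid.
        centroids = [
            [t / stats[k][1] for t in stats[k][0]] if k in stats else centroids[k]
            for k in range(len(centroids))
        ]
    return centroids


# Toy categorical data, invented for illustration only.
rows = [("red", "small"), ("red", "small"), ("red", "medium"),
        ("blue", "large"), ("blue", "large"), ("blue", "medium")]
points = chi2_profiles(rows)
centroids = kmeans_mapreduce(points, [points[0], points[3]])
labels = [nearest(p, centroids) for p in points]
```

Because each map call only emits per-cluster sums and counts, the reduce step moves far less data than the raw points, which is the usual motivation for expressing k-means in this form. Handling quantitative variables alongside categorical ones, as the abstract mentions, would need an additional encoding step that is not shown here.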

