An Analysis of Gene Expression Data using Penalized Fuzzy C-Means Approach

01/08/2013
by   P. K. Nizar Banu, et al.
0

With the rapid advances of microarray technologies, large amounts of high-dimensional gene expression data are being generated, which poses significant computational challenges. A first step towards addressing this challenge is the use of clustering techniques, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. A robust gene expression clustering approach to minimize undesirable clustering is proposed. In this paper, Penalized Fuzzy C-Means (PFCM) Clustering algorithm is described and compared with the most representative off-line clustering techniques: K-Means Clustering, Rough K-Means Clustering and Fuzzy C-Means clustering. These techniques are implemented and tested for a Brain Tumor gene expression Dataset. Analysis of the performance of the proposed approach is presented through qualitative validation experiments. From experimental results, it can be observed that Penalized Fuzzy C-Means algorithm shows a much higher usability than the other projected clustering algorithms used in our comparison study. Significant and promising clustering results are presented using Brain Tumor Gene expression dataset. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. In these clustering results, we find that Penalized Fuzzy C-Means algorithm provides useful information as an aid to diagnosis in oncology.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset