Abstract
A REVIEW ON DIMENSIONALITY REDUCTION USING COPULA APPROACH IN DATA MINING
Sumaiya Maryam* and Sriram Yadav
The copula approach is a sampling-based dimensionality reduction technique. It removes linearly redundant combined dimensions and provides a convenient way to generate correlated multivariate random variables, preserving the integrity of the original information while reducing the dimension of the data space without losing valuable information. The recent trend of collecting very large and diverse datasets has created a great challenge in data analysis. One characteristic of these very large datasets is that they often contain significant amounts of redundancy. Using very large multi-dimensional data introduces more noise, redundant data, and the possibility of unconnected data entities. To efficiently manage data represented in a high-dimensional space and to address the impact of redundant dimensions on the final results, a new technique has been proposed for dimensionality reduction using copulas and the LU-decomposition (forward substitution) method. The proposed method compares favorably with existing approaches on real-world datasets (Diabetes, Waveform, two versions of Human Activity Recognition based on Smartphone, and Thyroid, taken from a machine learning repository) in terms of dimensionality reduction and efficiency, as evaluated with statistical and classification measures.
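To make two of the ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation, since the abstract does not spell out the algorithm. It assumes a Gaussian copula and uses a Cholesky factor (a swapped-in stand-in for the LU factorization mentioned above) to generate correlated uniform variables, and it shows forward substitution on a lower-triangular system. All function names (`cholesky`, `forward_substitution`, `gaussian_copula_sample`) and the example correlation matrix are hypothetical choices made for illustration.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Assumes `a` is a symmetric positive-definite matrix (list of lists).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def forward_substitution(L, b):
    """Solve L y = b for y, where L is lower triangular with nonzero diagonal."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][k] * y[k] for k in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_copula_sample(corr, n_samples, seed=0):
    """Draw samples with uniform marginals whose dependence follows a
    Gaussian copula with correlation matrix `corr` (an assumed model)."""
    rng = random.Random(seed)
    L = cholesky(corr)
    d = len(corr)
    samples = []
    for _ in range(n_samples):
        # Independent standard normals ...
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # ... correlated by multiplying with the lower-triangular factor ...
        x = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        # ... then mapped to [0, 1] through the normal CDF.
        samples.append([normal_cdf(v) for v in x])
    return samples


if __name__ == "__main__":
    corr = [[1.0, 0.8], [0.8, 1.0]]  # illustrative correlation matrix
    for row in gaussian_copula_sample(corr, 5, seed=42):
        print(row)
    print(forward_substitution([[2.0, 0.0], [1.0, 3.0]], [4.0, 11.0]))
```

In this sketch the triangular factor is what introduces the dependence between dimensions, and forward substitution solves systems in that same factor, for example to map correlated values back to independent ones.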