Abstract
A REVIEW ON DIMENSIONALITY REDUCTION TECHNIQUES IN DATA MINING
Wasim Akram* and Sriram Yadav
ABSTRACT
Data mining is a form of knowledge discovery essential for solving problems in a specific domain. Classification is a technique for predicting the classes of previously unseen data. Various classification methods exist, such as Bayesian classifiers, decision trees, and rule-based neural networks. Before any mining technique is applied, irrelevant attributes need to be filtered out. This filtering is done with feature selection techniques, which fall into wrapper, filter, and embedded approaches. Feature selection plays an important role in data mining and machine learning: it reduces the dimensionality of the data and improves the performance of classification algorithms. A variety of feature selection methods have been presented in the literature to address feature selection problems such as the large search space of high-dimensional datasets, for example microarray data. However, identifying the feature selection method that best suits a specific scenario remains a challenging task. Dimensionality reduction in data mining focuses on representing data with the minimum number of dimensions such that its properties are not lost, thereby reducing the underlying complexity of processing the data. Principal Component Analysis (PCA) is one of the prominent dimensionality reduction techniques and is widely used in network traffic analysis.
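As an illustration of the filter approach to feature selection mentioned above, the following is a minimal sketch rather than a method taken from the reviewed literature. It assumes that each feature is scored by its absolute Pearson correlation with the class label and that only the k highest-scoring features are kept. The function names (pearson, select_top_k, project) and the toy data are hypothetical and chosen only for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information about the label
    return cov / math.sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Filter-style selection: indices of the k features most correlated with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


def project(rows, keep):
    """Return the data restricted to the selected feature indices."""
    return [[row[j] for j in keep] for row in rows]


# Toy data, made up for illustration: feature 0 tracks the label,
# feature 1 is constant, feature 2 is close to noise.
rows = [
    [1.0, 5.0, 0.2],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.8],
]
labels = [0, 0, 1, 1]

keep = select_top_k(rows, labels, k=1)
reduced = project(rows, keep)
```

Because a filter method scores each feature independently of any classifier, it is cheap to compute on high-dimensional data, but it can miss features that are informative only in combination. Wrapper and embedded methods trade additional computation for that sensitivity.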