Abstract

Hadoop Based Parallel Framework for Feature Subset Selection in Big Data

Revathi.L, A.Appandiraj

It is the era of Big Data. Since the scale of data increases every minute, handling massive data has become essential. Massive data poses a great challenge for classification, and the high dimensionality of modern massive datasets presents a considerable challenge to clustering approaches: first, the curse of dimensionality can make clustering very slow, and second, the presence of many irrelevant features may prevent identification of the relevant underlying structure in the data. Feature selection is an important part of the clustering process; it involves identifying a subset of features that produces results as accurate and consistent as those obtained with the original feature set. Redesigning traditional machine learning and data mining algorithms with MapReduce programming is necessary for dealing with massive datasets. MapReduce is a parallel processing framework for large datasets, and Hadoop is its open-source implementation. The objective of this paper is to implement the FAST clustering algorithm with MapReduce programming to remove irrelevant and redundant features. Following preprocessing, a cluster-based MapReduce feature selection approach is applied to obtain an effective feature subset.
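To make the approach concrete, the following is a minimal sketch (not the authors' implementation) of the first step of FAST, the relevance filter, expressed as a single Hadoop MapReduce job in Java: the mapper emits (feature index, "value,label") pairs, and the reducer computes the symmetric uncertainty SU(F, C) = 2·IG(F, C) / (H(F) + H(C)) between each feature and the class, discarding features below a cutoff. The CSV input layout (class label in the last column), discrete feature values, and the threshold THETA are all illustrative assumptions.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RelevanceFilter {

  // Mapper: for each CSV record, emit (featureIndex, "value,label") so that
  // one reducer sees the full joint distribution of a feature and the class.
  public static class FeatureMapper
      extends Mapper<Object, Text, IntWritable, Text> {
    @Override
    protected void map(Object key, Text row, Context ctx)
        throws IOException, InterruptedException {
      String[] cols = row.toString().split(",");
      String label = cols[cols.length - 1]; // assumes label in last column
      for (int i = 0; i < cols.length - 1; i++) {
        ctx.write(new IntWritable(i), new Text(cols[i] + "," + label));
      }
    }
  }

  // Reducer: compute SU(F,C) = 2*IG(F,C) / (H(F)+H(C)) from the
  // (value,label) pairs and keep only features above the cutoff.
  public static class SUReducer
      extends Reducer<IntWritable, Text, IntWritable, Text> {
    private static final double THETA = 0.1; // hypothetical relevance cutoff

    @Override
    protected void reduce(IntWritable feature, Iterable<Text> pairs, Context ctx)
        throws IOException, InterruptedException {
      Map<String, Integer> fCounts = new HashMap<>();
      Map<String, Integer> cCounts = new HashMap<>();
      Map<String, Integer> jointCounts = new HashMap<>();
      long n = 0;
      for (Text p : pairs) {
        String[] vl = p.toString().split(",");
        fCounts.merge(vl[0], 1, Integer::sum);
        cCounts.merge(vl[1], 1, Integer::sum);
        jointCounts.merge(p.toString(), 1, Integer::sum);
        n++;
      }
      double hF = entropy(fCounts, n);
      double hC = entropy(cCounts, n);
      double ig = hF + hC - entropy(jointCounts, n); // mutual information
      double su = (hF + hC) == 0 ? 0 : 2.0 * ig / (hF + hC);
      if (su >= THETA) { // feature is relevant to the class
        ctx.write(feature, new Text(String.format("SU=%.4f", su)));
      }
    }

    private static double entropy(Map<String, Integer> counts, long n) {
      double h = 0;
      for (int c : counts.values()) {
        double p = (double) c / n;
        h -= p * Math.log(p) / Math.log(2); // entropy in bits
      }
      return h;
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "FAST relevance filter");
    job.setJarByClass(RelevanceFilter.class);
    job.setMapperClass(FeatureMapper.class);
    job.setReducerClass(SUReducer.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

A job like this would typically be packaged into a jar and launched with hadoop jar; the redundancy-removal step of FAST (clustering the surviving features by pairwise symmetric uncertainty and keeping one representative per cluster) would follow as a subsequent MapReduce job.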


