Abstract

A Resemblance between Credentials in Nptel Application Using Weka Tool

Ms. N. Kalpana, Dr. S. Appavu Alias Balamurugan

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to use and update discovered patterns effectively is still an open research issue, especially in text mining. Because most existing text mining methods adopt term-based approaches, they suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based (or phrase-based) approaches should perform better than term-based ones, but many experiments do not support this hypothesis. This paper presents a pattern discovery technique that includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Word-similarity and information-extraction systems are traditionally implemented as a pipeline of special-purpose processing modules, each targeting the extraction of a particular kind of information. A fundamental data-mining problem is to examine data for “similar” items, such as near-duplicate Web pages. These pages could be plagiarized, for example, or they could be mirrors that have almost the same content but differ in information about the host and about other mirrors. We introduce a technique called “min hashing,” which compresses large sets in such a way that we can still deduce the similarity of the underlying sets from their compressed versions. Finally, we explore notions of “similarity” that are not expressible as the intersection of sets. This study leads us to consider the theory of distance measures in arbitrary spaces.
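To make the min hashing idea concrete, the following is a minimal Python sketch, not the authors' implementation (the abstract does not give one, and the paper's title suggests the experiments were run in Weka). It assumes documents are represented as sets of word shingles and that set similarity means Jaccard similarity; the function names, the shingle length `k`, the number of hash functions, and the use of seeded BLAKE2b digests as the hash family are all illustrative assumptions.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (assumed document model)."""
    words = text.lower().split()
    # For texts shorter than k words, fall back to a single shingle.
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    """One member of a hypothetical hash family: seeded 64-bit BLAKE2b."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """Compress a non-empty set into a list of num_hashes minimum hash values."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = shingles("the quick brown fox jumps over the lazy dog", k=2)
    doc_b = shingles("the quick brown fox leaps over the lazy dog", k=2)
    sig_a = minhash_signature(doc_a)
    sig_b = minhash_signature(doc_b)
    print("exact Jaccard:   ", jaccard(doc_a, doc_b))
    print("minhash estimate:", estimate_similarity(sig_a, sig_b))
```

Each signature has a fixed length regardless of how large the underlying set is, which is the sense in which the sets are compressed. The estimate converges to the exact Jaccard similarity as the number of hash functions grows, so `num_hashes` trades signature size against accuracy.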
