Abstract

CORPUS ALIGNMENT FOR WORD SENSE DISAMBIGUATION

Shweta Vikram

Machine translation converts text from one language to another. Anusaaraka is a machine translation system that serves as English-to-Indian-language accessing software. It is a Natural Language Processing (NLP) research and development project undertaken by the Chinmaya International Foundation (CIF). To translate, a machine needs a large parallel corpus, which helps in deriving rules and disambiguating word senses. Anusaaraka follows a hybrid approach, but our work focuses on the rule-based approach. For this approach we needed a large aligned parallel corpus. In this paper we discuss how we collected a parallel corpus with the help of shell scripts, programs, toolkits, and other resources.

