Automatic Language Identification from Written Texts: An Overview

H L Shashirekha

Abstract

Language identification is the task of automatically identifying the language(s) in which the content of a document (web page, text document) is written. With the widespread use of the internet, language identification has become an important preprocessing step for applications such as machine translation, part-of-speech tagging, linguistic corpus creation, support for low-density languages, accessibility of social media and user-generated content, search engines, and information extraction, as well as for processing multilingual documents. In a multilingual country like India, language identification has wider scope to bridge the digital divide between users of different languages. This paper presents a brief overview of the challenges involved in automatic language identification, existing methodologies, and some of the tools available for language identification.

International Journal of Innovative Research in Computer and Communication Engineering
