EXTRACTING PLAIN TEXT FROM CORRUPTED WORD DOCUMENT

Saptarshi Naskar; Souvik Sarkar; Krishnendu Basuli

Abstract

Text conversion is a procedure, implemented in a programming language such as C or C++, that extracts the plain text from a source file in one format and writes it to a file in a different format. The file extension changes, but the data itself stays unchanged in format and in size. The converter reads the source file line by line (or, in many systems, character by character), and whenever it finds valid text, that is, valid data or characters, it copies that text to the output file while preserving the formatting it had in the source. Here we design a text converter in the C programming language that accepts files with the extensions .doc, .rtf, and .txt, extracts their plain text, and writes it to a text file, leaving the text format of the source file unchanged.
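The character-by-character reading described above can be illustrated with a minimal sketch in C. It is not the authors' converter: the abstract does not say how a "valid" character is recognized, so the sketch assumes that valid text means printable ASCII plus tab and newline. The names is_plain_text_char, extract_plain_text, and convert_file are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

/* Assumption: a "valid" character is printable ASCII, a tab, or a newline.
   The abstract does not define validity; this rule is illustrative only. */
static int is_plain_text_char(int c)
{
    return c == '\n' || c == '\t' || (c >= 0x20 && c <= 0x7E);
}

/* Reads `in` one character at a time and copies every valid character
   to `out`. Returns the number of characters written, or -1 on an I/O error. */
long extract_plain_text(FILE *in, FILE *out)
{
    long written = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (is_plain_text_char(c)) {
            if (fputc(c, out) == EOF) {
                return -1;
            }
            written++;
        }
    }
    return ferror(in) ? -1 : written;
}

/* Hypothetical wrapper: opens the source in binary mode, because a damaged
   .doc file may contain arbitrary bytes, and writes the result as text.
   Returns 0 on success and -1 on failure. */
int convert_file(const char *src_path, const char *dst_path)
{
    FILE *in = fopen(src_path, "rb");
    if (in == NULL) {
        return -1;
    }

    FILE *out = fopen(dst_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    long n = extract_plain_text(in, out);
    fclose(in);
    if (fclose(out) != 0) {
        return -1;
    }
    return n < 0 ? -1 : 0;
}
```

A call such as convert_file("report.doc", "report.txt"), with placeholder file names, would write the surviving characters to the text file. A filter this simple has clear limits: RTF control words are themselves printable ASCII and would pass through, and text that a .doc file stores in a multi-byte encoding would not be recovered intact, so a practical converter needs format-specific handling on top of it.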

Journal of Global Research in Computer Science
