An Efficient Information-Rich Representation Scheme for Information Access and Knowledge Acquisition.

Asmaa El-Said, Computers and Systems Engineering - Faculty of Engineering - Mansoura University
Hesham Arafat, Computers and Systems Engineering - Faculty of Engineering - Mansoura University

Corresponding Author

El-Said, Asmaa

Subject Area

Computer and Control Systems Engineering

Article Type

Original Study

Abstract

The rapid growth in the number of textual documents has created a continual need for effective methods to explore, analyze, and discover knowledge from them. Conventional text mining and management systems rely mainly on the presence or absence of keywords to discover and analyze useful information in textual documents. However, simple word counts and term-frequency distributions do not capture the meaning behind the words, which limits the ability to mine the texts. This paper proposes a novel representation scheme, based on semantic understanding, for mining textual documents. The approach uses semantic notions to represent the text in documents, to infer unknown dependencies and relationships among concepts in a text, to measure the relatedness between text documents, and to apply mining processes using the representation and the relatedness measure. The representation scheme reflects the existing relationships among concepts and facilitates accurate relatedness measurements, which result in better mining performance. An extensive experimental evaluation on real datasets from various domains indicates the importance of the proposed approach.

Keywords

Linguistic processing; Text analysis; Text mining; Knowledge acquisition; Information access; Interactive data exploration and discovery

Recommended Citation

El-Said, Asmaa and Arafat, Hesham (2020) "An Efficient Information-Rich Representation Scheme for Information Access and Knowledge Acquisition," Mansoura Engineering Journal: Vol. 40: Iss. 1, Article 4.
Available at: https://doi.org/10.21608/bfemu.2020.100775
