Subject Area
Computer and Control Systems Engineering
Article Type
Original Study
Abstract
The rapid growth in the number of textual documents has created a daily need for effective ways to explore, analyze, and discover knowledge from them. Conventional text mining and management systems rely mainly on the presence or absence of keywords to discover and analyze useful information in textual documents. However, simple word counts and term-frequency distributions do not capture the meaning behind the words, which limits the ability to mine the texts. This paper proposes a novel representation scheme, based on semantic understanding, for mining textual documents. The approach uses semantic notions to represent the text in documents, to infer unknown dependencies and relationships among concepts in a text, to measure the relatedness between text documents, and to apply mining processes using the representation and the relatedness measure. The representation scheme reflects the existing relationships among concepts and facilitates accurate relatedness measurements, which results in better mining performance. An extensive experimental evaluation is conducted on real datasets from various domains, indicating the importance of the proposed approach.
Keywords
Linguistic processing; Text analysis; Text mining; Knowledge acquisition; Information access; Interactive data exploration and discovery
Recommended Citation
El-Said, Asmaa and Arafat, Hesham (2020) "An Efficient Information-Rich Representation Scheme for Information Access and Knowledge Acquisition," Mansoura Engineering Journal: Vol. 40: Iss. 1, Article 4.
Available at:
https://doi.org/10.21608/bfemu.2020.100775