School of Engineering


Fast, Simple, and Effective Frequency-based Text Classification

Text classification is an important task that is covered extensively in the natural language processing literature. Most traditional approaches tackle it with dimensionality reduction, selection, or transformation techniques that precede the classification phase. This undergraduate research project takes a different approach: it proposes additional features that improve classification accuracy. The proposed features capture term-frequency relationships between the documents and the categories in the dataset, providing a fast and effective alternative to existing, computationally expensive text classification solutions. The approach is evaluated on three of the most commonly used public datasets: 20 Newsgroups, R8, and R52. Experimental results show that it is on a par with state-of-the-art methods and improves the classification results.
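To make the idea of document-to-category term-frequency features concrete, here is a minimal Python sketch. The abstract does not give the exact feature definitions, so everything in it is an assumption made for illustration: the function names `category_term_counts` and `category_affinity_features` are hypothetical, and the feature itself (one score per category, measuring how much of that category's term mass the document's terms cover) is only one plausible reading of a "term-frequency relationship", not the method from the report.

```python
from collections import Counter, defaultdict


def category_term_counts(docs, labels):
    """Count how often each term occurs in the training documents of each category."""
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    return counts


def category_affinity_features(tokens, counts, categories):
    """Return one feature per category (hypothetical definition, not from the report).

    Each raw score sums, over the document's terms, the document term frequency
    times the term's relative frequency in the category. Scores are then
    normalized so they sum to 1 across categories.
    """
    doc_tf = Counter(tokens)
    raw = []
    for cat in categories:
        cat_counts = counts[cat]
        total = sum(cat_counts.values()) or 1  # guard against an empty category
        raw.append(sum(tf * cat_counts[term] / total for term, tf in doc_tf.items()))
    norm = sum(raw) or 1.0  # guard against a document with no known terms
    return [score / norm for score in raw]


# Toy training set (invented data, for illustration only).
train_docs = [
    "the team won the hockey game".split(),
    "hockey players score goals".split(),
    "the rocket reached orbit".split(),
    "nasa launched a new rocket".split(),
]
train_labels = ["sport", "sport", "space", "space"]

categories = sorted(set(train_labels))  # ['space', 'sport']
counts = category_term_counts(train_docs, train_labels)
features = category_affinity_features("the hockey game".split(), counts, categories)
```

Features of this kind need only a single counting pass over the training set and one lookup per document term, which is consistent with the abstract's emphasis on speed. In practice they would be appended to an existing document representation and passed to a standard classifier.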



Copyright 1997–2021 Lebanese American University, Lebanon.