Please use this identifier to cite or link to this item:
http://dspace.uniten.edu.my/jspui/handle/123456789/11762
Title: Document classification based on kNN algorithm by term vector space reduction
Authors: Moldagulova, A.; Sulaiman, R.B.
Issue Date: 2018
Journal: International Conference on Control, Automation and Systems, Volume 2018-October, 10 December 2018, Article number 8571540, Pages 387-391
Conference: 18th International Conference on Control, Automation and Systems, ICCAS 2018; YongPyong Resort, PyeongChang; South Korea; 17 October 2018 through 20 October 2018; Category number CFP1810D-USB; Code 143670
Abstract: Nowadays there is increasing interest in the analysis of unstructured data. The vast majority of unstructured data is unstructured text. Retrieving useful information from a huge volume of unstructured text data is a very challenging task. Text mining is a thought-provoking research area, as it tries to discover knowledge from unstructured text. This paper deals with methods for handling unstructured text data, in particular document classification problems. Most document classification methods are based on the term vector space model for representing unstructured textual data. The term vector space model is easy to implement and provides a uniform representation for documents. However, the feature space for a large collection of documents can reach millions of features and be sparse. One of the issues is therefore to reduce the dimension of the term-document matrix. In this research we propose an approach for reducing the term vector space in the kNN algorithm. © ICROS.
Appears in Collections: UNITEN Scholarly Publication
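The abstract names the ingredients (a term vector space, a reduced vocabulary, and kNN) but this record does not describe the authors' actual reduction method. The following is only a minimal sketch of the general idea, under stated assumptions: the document-frequency cutoff `min_df`, the toy corpus, and every function name are hypothetical illustrations, not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would also strip punctuation.
    return text.lower().split()


def build_vocabulary(docs, min_df=1):
    # Assumed reduction rule: keep only terms found in at least min_df documents.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


def vectorize(text, vocab_index):
    # Sparse term-count vector stored as {term_index: count}.
    vec = {}
    for tok in tokenize(text):
        if tok in vocab_index:
            idx = vocab_index[tok]
            vec[idx] = vec.get(idx, 0) + 1
    return vec


def cosine(a, b):
    dot = sum(v * b.get(i, 0) for i, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query_vec, train_vecs, labels, k=3):
    # Rank training documents by cosine similarity and take a majority vote.
    ranked = sorted(range(len(train_vecs)),
                    key=lambda j: cosine(query_vec, train_vecs[j]),
                    reverse=True)
    votes = Counter(labels[j] for j in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy corpus (invented for illustration only).
train_docs = [
    "stock market prices fall",
    "market shares and stock trading",
    "bank raises interest prices",
    "football team wins match",
    "team scores in football final",
    "tennis match ends in final",
]
labels = ["finance", "finance", "finance", "sport", "sport", "sport"]

full_vocab = build_vocabulary(train_docs, min_df=1)
reduced_vocab = build_vocabulary(train_docs, min_df=2)
vocab_index = {term: i for i, term in enumerate(reduced_vocab)}
train_vecs = [vectorize(doc, vocab_index) for doc in train_docs]

query = vectorize("stock prices rise on the market", vocab_index)
prediction = knn_predict(query, train_vecs, labels, k=3)
print(len(full_vocab), len(reduced_vocab), prediction)
```

On this toy corpus the cutoff shrinks the vocabulary while the nearest neighbours of the query stay in the same class. Whether a reduction of this kind preserves accuracy on a real collection depends on the data and on the reduction rule actually used.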
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.