Document Classification using Matrix Decomposition with Varied Viewpoints

  • Kaname Maruta, Kyushu Institute of Technology
  • Hidetoshi Nagai, Kyushu Institute of Technology
  • Teigo Nakamura, Kyushu Institute of Technology
Keywords: Document Classification, NMF, Matrix Decomposition

Abstract

Classification results are not unique; they vary with the user’s viewpoint. If a document classification system ignores that viewpoint, its output will differ from the result the user wants, and this mismatch can hinder information retrieval and cause relevant documents to be overlooked. Extracting the user’s viewpoint from classification examples that the user provides in advance allows us to build classifications that reflect the user’s intent. In this study, we propose four methods for extracting viewpoints and three methods for classifying documents, all based on Nonnegative Matrix Factorization (NMF). We report the results of comparative experiments among the original NMF, Semi-Supervised NMF (SSNMF), and our proposed methods.
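
For readers unfamiliar with the baseline, the following is a minimal sketch of plain (unsupervised) NMF applied to a toy term-document matrix. It is not any of the methods proposed in the paper, and it does not extract viewpoints. It assumes the standard multiplicative update rules for the Frobenius-norm objective, and it assumes that each document is assigned to the factor with the largest coefficient. The function names `nmf` and `assign_clusters`, the toy matrix, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (terms x documents) as W @ H.

    Assumption: multiplicative updates for the squared Frobenius error,
    started from a random positive initialization.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = X.shape
    W = rng.random((n_terms, k)) + eps   # term-by-factor weights
    H = rng.random((k, n_docs)) + eps    # factor-by-document weights
    for _ in range(n_iter):
        # Each update rescales entries by a nonnegative ratio,
        # so W and H stay nonnegative throughout.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_clusters(H):
    """Assumed labeling rule: pick the factor with the largest weight."""
    return H.argmax(axis=0)


# Hypothetical toy data: rows are terms, columns are documents.
# Documents 0-1 use only terms 0-1; documents 2-3 use only terms 2-3.
X = np.array([
    [2.0, 4.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 3.0, 6.0],
])

W, H = nmf(X, k=2)
labels = assign_clusters(H)
print(labels)
```

Which of the two groups receives label 0 depends on the random initialization, so only the grouping itself is meaningful. A viewpoint-aware method would additionally have to use the user's example classifications, which this sketch leaves out.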

Author Biography

Kaname Maruta, Kyushu Institute of Technology
Graduate School of Computer Science and Systems Engineering

Published
2015-12-31
Section
Technical Papers (Advanced Applied Informatics)