11 July 2016
Detecting nonsense for Chinese comments based on logistic regression
Proceedings Volume 10011, First International Workshop on Pattern Recognition; 100111J (2016) https://doi.org/10.1117/12.2242283
Event: First International Workshop on Pattern Recognition, 2016, Tokyo, Japan
Abstract
To understand netizens’ opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. Besides traditional lexical features, we propose three kinds of features based on emotion, structure, and relevance. Using these features, we train an LR model and demonstrate its effectiveness in understanding Chinese news comments. We find that each of the proposed features significantly improves the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the 77.3% baseline by 7 percentage points.
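As a rough illustration of treating nonsense detection as binary classification with logistic regression, the sketch below fits an LR model by batch gradient descent on a toy dataset. It is not the authors' implementation: the four per-comment scores (lexical, emotion, structure, relevance), their values, the labels, and the function names `train_lr` and `predict` are all hypothetical and chosen only to mirror the feature groups named in the abstract.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_lr(X, y, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this comment: p(nonsense) minus the label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (nonsense) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical toy data: each row is [lexical, emotion, structure, relevance],
# scaled to [0, 1]. Label 1 = nonsense, 0 = meaningful comment.
X = [
    [0.1, 0.2, 0.1, 0.9],
    [0.2, 0.1, 0.2, 0.8],
    [0.3, 0.3, 0.2, 0.7],
    [0.8, 0.9, 0.7, 0.1],
    [0.9, 0.7, 0.8, 0.2],
    [0.7, 0.8, 0.9, 0.3],
]
y = [0, 0, 0, 1, 1, 1]

w, b = train_lr(X, y)
preds = [predict(w, b, x) for x in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
```

Here `accuracy` is the fraction of training comments classified correctly, the same kind of metric the abstract reports. A real evaluation would use held-out data and features extracted from actual comments.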
© (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ren Zhuolin, Chen Guang, Chen Shu, "Detecting nonsense for Chinese comments based on logistic regression", Proc. SPIE 10011, First International Workshop on Pattern Recognition, 100111J (11 July 2016); doi: 10.1117/12.2242283; https://doi.org/10.1117/12.2242283
PROCEEDINGS
6 PAGES

