[1] ZHANG Min, CHEN Duansheng. Text Sentiment Classification Based on Semantic Lexicon and Active Bayesian[J]. Journal of Huaqiao University (Natural Science), 2018, 39(4): 623-626. [doi:10.11830/ISSN.1000-5013.201608010]

Text Sentiment Classification Based on Semantic Lexicon and Active Bayesian

Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume:
39
Issue:
No. 4, 2018
Pages:
623-626
Publication date:
2018-07-18

Article Info

Title:
Text Sentiment Classification Based on Semantic Lexicon and Active Bayesian
Article ID:
1000-5013(2018)04-0623-04
Author(s):
ZHANG Min, CHEN Duansheng
College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China
Keywords:
active learning; text sentiment classification; semantic lexicon; naive Bayesian; uncertainty sampling strategy
CLC number:
TP391.1
DOI:
10.11830/ISSN.1000-5013.201608010
Document code:
A
Abstract:
An improved sentiment classification method that combines a sentiment lexicon with active Bayesian learning (SLAB) is proposed. To demonstrate its effectiveness, the Cornell movie review dataset and the Internet Movie Database (IMDB) dataset are used as experimental data, and the method is compared with an active learning method based on the uncertainty sampling strategy. The results show that the proposed method achieves higher classification accuracy with a smaller labeled training set, and to some extent alleviates the error accumulation problem of active learning methods based on uncertainty sampling.
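
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of uncertainty-sampling active learning with a naive Bayes classifier. It is not the SLAB method: the abstract does not describe how the lexicon is integrated, so that part is omitted. All function names, the toy data, and the `oracle` labelling function are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def train_nb(labeled):
    """Fit a multinomial naive Bayes model from (tokens, label) pairs.

    Labels are assumed to be "pos" or "neg" (an assumption of this sketch).
    """
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for tokens, label in labeled:
        word_counts[label].update(tokens)
        doc_counts[label] += 1
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab


def prob_pos(model, tokens):
    """Return P(pos | tokens) using Laplace-smoothed estimates."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("pos", "neg"):
        # Smoothed class prior, defined even if a class has no documents yet.
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_score[label] = score
    # Normalise in log space to avoid numerical underflow.
    m = max(log_score.values())
    exp_pos = math.exp(log_score["pos"] - m)
    exp_neg = math.exp(log_score["neg"] - m)
    return exp_pos / (exp_pos + exp_neg)


def most_uncertain(model, pool):
    """Index of the pool document whose P(pos) is closest to 0.5."""
    return min(range(len(pool)), key=lambda i: abs(prob_pos(model, pool[i]) - 0.5))


def active_learning(seed, pool, oracle, rounds):
    """Repeatedly query the label of the most uncertain pool document."""
    labeled = list(seed)
    pool = list(pool)
    for _ in range(min(rounds, len(pool))):
        model = train_nb(labeled)
        doc = pool.pop(most_uncertain(model, pool))
        labeled.append((doc, oracle(doc)))  # oracle stands in for a human annotator
    return train_nb(labeled), labeled


# Toy data, invented for illustration only.
seed = [(["great", "fun"], "pos"), (["boring", "bad"], "neg")]
pool = [["great", "great"], ["bad", "boring", "bad"], ["great", "bad"]]


def oracle(doc):
    return "pos" if doc.count("great") > doc.count("bad") else "neg"


final_model, labeled = active_learning(seed, pool, oracle, rounds=2)
```

Because each round retrains on labels chosen by the current model's own uncertainty, an early poor model can steer later queries badly; this is one plausible reading of the error accumulation problem the abstract refers to.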

Memo

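Received: 2016-08-08
Corresponding author: CHEN Duansheng (born 1959), male, professor, Ph.D.; research interests: machine learning and data mining. E-mail: dschen@hqu.edu.cn.
Funding: National Natural Science Foundation of China (61370006); Fujian Province Science and Technology Plan (Industry Guidance) Key Project (2015H0025); Huaqiao University Graduate Research and Innovation Ability Cultivation Program (1400214023)
Last Update: 2018-07-20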