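[1] Institute of Information Management, Fuzhou University. Research Report on Extraction of Opinion Targets and Opinion Words from MOOC Course Reviews [J]. 信息化理论与实践, 2019, (01): 115-140.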

Research Report on Extraction of Opinion Targets and Opinion Words from MOOC Course Reviews

《信息化理论与实践》[ISSN:2520-5862/CN:]

Volume:
Issue:
2019, No. 01
Pages:
115-140
Section:
Publication date:
2020-06-30

Article Info

Author(s):
Institute of Information Management, Fuzhou University
Title:
Extraction of Opinion Targets and Words from MOOC Reviews
Keywords:
BiLSTM-CRF; course reviews; opinion targets; opinion words
Abstract:
As MOOC platforms have grown rapidly, they have accumulated a large volume of course reviews. These reviews contain information that is valuable to MOOC learners, instructors, and platform managers. Sentiment analysis of such interactive texts can help learners choose courses, give instructors a reference for improving their teaching and optimizing course resources, and help platform managers improve the platform experience. The two core elements of a MOOC course review are its opinion targets and opinion words, and extracting them is a foundational task for sentiment analysis of MOOC course reviews. This study takes MOOC course reviews as its research object and investigates how to extract opinion targets and opinion words from them effectively.

Building on related work, this study treats the extraction of opinion targets and opinion words from MOOC course reviews as a sequence labeling problem and proposes a BiLSTM-CRF-based extraction model. The model consists of four parts. The first is the input layer, which encodes the course reviews. The second is the embedding layer, which maps each word of an input review to a low-dimensional vector. The third is the BiLSTM layer, which takes the output of the embedding layer as its input and learns bidirectional, long-distance context features of opinion targets and opinion words. The fourth is the CRF layer, which takes the output of the BiLSTM layer as its input and learns higher-level features. The main contribution of this study is to use the strength of BiLSTM in high-dimensional representation learning to extract bidirectional, long-distance context features of opinion targets and opinion words, and to use the CRF to learn the dependencies between adjacent labels of opinion targets and opinion words.

This study collected a large number of course reviews from the course evaluation area of the 中国大学MOOC (Chinese University MOOC) platform as its experimental corpus, in order to verify the effectiveness of the proposed model on a real MOOC course review data set. The experiments evaluated the model in terms of precision, recall, and F1 score. The results show that the proposed model can effectively extract opinion targets and opinion words from MOOC course reviews.
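To make the sequence-labeling formulation and the evaluation in the abstract concrete, here is a minimal sketch in plain Python. It turns a tag sequence into opinion-target and opinion-word spans, then scores predictions with precision, recall, and F1. The BIO tag names (`B-TARGET`, `I-TARGET`, `B-OPINION`, `I-OPINION`, `O`), the exact-match span criterion, and the sample sentence are assumptions made for illustration. The abstract does not state the report's tagging scheme or scoring procedure.

```python
from typing import List, Set, Tuple

# A span is (kind, start, end_exclusive), e.g. ("TARGET", 1, 3).
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, kind = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        # Close the open span unless this tag continues it.
        if kind is not None and tag != f"I-{kind}":
            spans.add((kind, start, i))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Span-level precision, recall and F1 under exact matching (an assumption)."""
    tp = n_pred = n_gold = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = extract_spans(g_tags), extract_spans(p_tags)
        tp += len(g & p)      # spans with identical kind and boundaries
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical review: "the lecture videos are very clear"
gold_tags = ["O", "B-TARGET", "I-TARGET", "O", "O", "B-OPINION"]
pred_tags = ["O", "O", "B-TARGET", "O", "O", "B-OPINION"]

print(sorted(extract_spans(gold_tags)))
print(prf1([gold_tags], [pred_tags]))
```

In this toy case the predicted opinion word matches the gold span, but the predicted target covers only part of the gold target and so counts as an error under exact matching. A more lenient overlap-based criterion would be an equally plausible design choice.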

Memo

Memo:
3
Last Update: 2021-01-19