
Vestnik natsional'nogo issledovatel'skogo yadernogo universiteta "MIFI"


Method for Automated Intelligent Emotive and Sentiment Analysis of Texts with a Thematic Focus

https://doi.org/10.56304/S2304487X20030086

Abstract

   A tool for comprehensive thematic, sentiment, aspect-based sentiment, and emotive analysis of natural-language texts is reported. Thematic analysis uses a method based on probabilistic-entropy characteristics, which automatically determines the number of topics in a text collection. Aspect-based sentiment analysis is performed by a neural network model with the topology of an interactive attention network, which achieves a macro-averaged F1 score of 0.58 on the corpus of the SentiRuEval-2015 competition, outperforming the best results of that competition. For emotive analysis, a method is developed that builds on context-dependent vector representations of words, which are then processed by an ensemble classifier trained on a specially prepared corpus of texts covering five basic emotions: joy, sadness, anger, fear, and surprise; this classifier achieves a macro-averaged F1 score of 0.76. An example application of the method is also demonstrated, using texts from the LiveJournal social network and news from the SCTM-ru project. The analysis results are visualized as graphs, which illustrate the effectiveness of the developed method.
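Both quality figures in the abstract are reported as macro-averaged F1 (F1-macro), that is, the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `macro_f1` and the toy labels are assumptions made for illustration; only the five emotion names come from the abstract.

```python
from collections import Counter

# The five basic emotions named in the abstract.
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise"]


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted class gains a false positive
            fn[true_label] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels invented for illustration; they are not taken from the paper's corpus.
y_true = ["joy", "joy", "sadness", "anger", "fear", "surprise"]
y_pred = ["joy", "sadness", "sadness", "anger", "fear", "joy"]
score = macro_f1(y_true, y_pred, EMOTIONS)
print(f"macro F1 = {score:.3f}")
```

In this toy case the class "surprise" is never predicted correctly and contributes an F1 of zero, which pulls the macro average down even though four of the six examples are labelled correctly.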

About the Authors

A. V. Naumov
National Research Center Kurchatov Institute
Moscow, 123182, Russian Federation

A. A. Selivanov
National Research Center Kurchatov Institute
Moscow, 123182, Russian Federation

I. A. Moloshnikov
National Research Center Kurchatov Institute
Moscow, 123182, Russian Federation

A. G. Sboev
National Research Center Kurchatov Institute
Moscow, 123182, Russian Federation



For citations:


Naumov A.V., Selivanov A.A., Moloshnikov I.A., Sboev A.G. Method for Automated Intelligent Emotive and Sentiment Analysis of Texts with a Thematic Focus. Vestnik natsional'nogo issledovatel'skogo yadernogo universiteta "MIFI". 2020;9(3):279-288. (In Russ.) https://doi.org/10.56304/S2304487X20030086



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2304-487X (Print)