Solution of the Problem of Author’s Profiling of Russian Texts Using Spike Neural Networks and a New Method of Coding Text Data
https://doi.org/10.56304/S2304487X21060092
Abstract
The applicability of spiking neural networks to author profiling of Russian-language texts is examined on three tasks: determining the author's gender, determining the author's age, and distinguishing algorithmically generated texts from texts written by a person. A method is developed for converting texts, encoded as sequences of vectors obtained with the FastText language model, into spike sequences. Two document corpora are used: the first contains a large number of short texts, and the second contains four times fewer texts of significantly greater length. This choice of corpora makes it possible to draw conclusions about the limitations and capabilities of the proposed encoding method. The experiments show that the proposed text encoding method, combined with the spiking network topology used, solves the assigned tasks: on both corpora, the accuracy obtained is comparable to that of the baseline model (LinearSVC) according to the F1-score metric.
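The abstract does not specify how the vector sequences are turned into spikes, so the following is only an illustrative sketch of one common approach (rate coding), not the authors' method. It assumes each token is represented by one embedding vector (random vectors stand in for FastText output here), that components are min-max scaled per text, and that each scaled value sets a per-step firing probability. The function name `vectors_to_spikes` and the parameters `n_steps` and `max_rate` are hypothetical.

```python
import numpy as np


def vectors_to_spikes(vectors, n_steps=50, max_rate=0.5, seed=0):
    """Rate-code a sequence of embedding vectors as Bernoulli spike trains.

    vectors: array of shape (n_tokens, dim), one embedding per token.
    Returns a boolean array of shape (n_tokens * n_steps, dim), where each
    token occupies n_steps consecutive time steps.
    """
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)

    # Assumed normalization: min-max scale each component to [0, 1]
    # across the tokens of this text.
    lo = vectors.min(axis=0, keepdims=True)
    hi = vectors.max(axis=0, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    scaled = (vectors - lo) / span

    # Hold each token's firing probabilities for n_steps time steps.
    prob = np.repeat(scaled * max_rate, n_steps, axis=0)

    # A neuron spikes at a time step with probability prob.
    return rng.random(prob.shape) < prob


if __name__ == "__main__":
    # Stand-in for FastText output: 4 tokens, 300-dimensional vectors.
    demo_vectors = np.random.default_rng(1).normal(size=(4, 300))
    spikes = vectors_to_spikes(demo_vectors, n_steps=50)
    print(spikes.shape, spikes.mean())
```

In this sketch the length of the spike sequence grows linearly with the number of tokens, which is one reason text length could matter when comparing a corpus of short texts with a corpus of long ones.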
About the Authors
A. G. Sboev
Russian Federation
123182
115409
Moscow
R. B. Rybka
Russian Federation
123182
Moscow
Y. A. Davydov
Russian Federation
123182
Moscow
D. S. Vlasov
Russian Federation
123182
Moscow
References
1. Bojanowski P., Grave E., Joulin A., Mikolov T. Enriching word vectors with subword information // Transactions of the Association for Computational Linguistics, 2017. V. 5. P. 135–146.
2. Sboev A. G., Davydov Yu. A., Rybka R. B. [A neural network model for translating text commands to a mobile robot in natural Russian language into semiotic RDF format]. Sb. nauchnyh trudov VII mezhdunarodnoj konferencii “Lazernye, plazmennye issledovaniya i tekhnologii-LAPLAZ-2021” [Collection of scientific papers of the VII International Conference “Laser, Plasma Research and Technologies-LAPLAZ-2021”]. Moscow, 2021. pp. 138–139 (In Russian).
3. Abbott L. F. Lapicque’s introduction of the integrate-and-fire model neuron (1907) // Brain research bulletin, 1999. V. 50 (5–6). P. 303–304.
4. Sjöström J., Gerstner W. Spike-timing dependent plasticity // Scholarpedia, 2010. V. 5 (2). P. 1362.
5. Diehl P. U., Cook M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity // Frontiers in computational neuroscience, 2015. V. 9. P. 99.
6. Hazan H., Saunders D., Khan H., Patel D. et al. Bindsnet: A machine learning-oriented spiking neural networks library in Python // Frontiers in Neuroinformatics, 2018. V. 12. P. 89.
7. Pedregosa F., Varoquaux G., Gramfort A. et al. Scikit-learn: Machine learning in Python // Journal of Machine Learning Research, 2011. V. 12. P. 2825–2830.
8. Vaswani A., Shazeer N., Parmar N. et al. Attention is all you need // 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 2017.
For citations:
Sboev A.G., Rybka R.B., Davydov Y.A., Vlasov D.S. Solution of the Problem of Author’s Profiling of Russian Texts Using Spike Neural Networks and a New Method of Coding Text Data. Vestnik natsional'nogo issledovatel'skogo yadernogo universiteta "MIFI". 2021;10(6):523-528. (In Russ.) https://doi.org/10.56304/S2304487X21060092