Publications

Bolshakov V., Mikhaylovskiy N., 2023. Pseudo-Labelling for Autoregressive Structured Prediction in Coreference Resolution. In Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2023”, Moscow.

Coreference resolution is an important task in natural language processing, since it can be applied to such vital tasks as information retrieval, text summarization, question answering, sentiment analysis and machine translation. In this paper, we present a study on the effectiveness of several approaches to coreference resolution, focusing on the RuCoCo dataset, as well as the results of our participation in the Dialogue Evaluation 2023. We explore ways to increase the dataset size by using pseudo-labelling and data translated from another language. Using these techniques, we managed to triple the size of the dataset, make it more diverse and improve the performance of autoregressive structured prediction (ASP) on the coreference resolution task. This approach allowed us to achieve the best results on the RuCoCo private test, with an increase in F1-score of 1.8, Precision of 0.5 and Recall of 3.0 points compared to the second-best leaderboard score. Our results demonstrate the potential of the ASP model and the importance of utilizing diverse training data for coreference resolution. ...

Zhovnerchuk E.V., Chichkalyuk V.A., Zhovnerchuk I.Yu., Mikhaylovskiy N.E., Moshkin V.V., Yurashku I.V. A method for assessing the mental reliability of transport security personnel using machine learning. Психическое здоровье (Mental Health) 2022; 17(12): 3-10. (In Russian.)

Objective: to develop a method for building a video dataset of reflexive facial mimic activity in a group of transport security (TS) personnel, together with an assessment of their psychophysiological state that confirms the presence of a functional state of fatigue. ...

Mikhaylovskiy N., 2022. On Unsupervised Training of Link Grammar Based Language Models. arXiv preprint arXiv:2208.13021.

In this short note we explore what is needed for the unsupervised training of graph language models based on link grammars. First, we introduce the termination tags formalism required to build a language model based on the link grammar formalism of Sleator and Temperley [21] and discuss the influence of context on the unsupervised learning of link grammars. Second, we propose a statistical link grammar formalism, allowing for statistical language generation. Third, based on the above formalism, we show that the classical dissertation of Yuret [25] on discovery of linguistic relations using lexical attraction ignores contextual properties of the language, and thus the approach to unsupervised language learning relying just on bigrams is flawed. This correlates well with the unimpressive results in unsupervised training of graph language models based on the bigram approach of Yuret. ...

Zubchuk, E., Menshikov, D., & Mikhaylovskiy, N. (2022, February). Using a Language Model in a Kiosk Recommender System at Fast-Food Restaurants.

Kiosks are a popular self-service option in many fast-food restaurants: they save time for the visitors and save labor for the fast-food chains. In this paper, we propose an effective design of a kiosk shopping cart recommender system that combines a language model as a vectorizer and a neural network-based classifier. The model performs better than other models in offline tests and exhibits performance comparable to the best models in A/B/C tests. ...

Danilovich, I., Moshkin, V., Reimche, A., Tevelevich, M., & Mikhaylovskiy, N. (2021, November). Video monitoring over anti-decubitus protocol execution with a deep neural network to prevent pressure ulcer. In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (pp. 1384-1387). IEEE.

Video monitoring of the patient position in intensive care units is complicated by the obstacles covering the patient's body. Conventional posture detection algorithms do not work in this case. A reformulation of the posture detection problem for this case as an object detection/image classification problem and the use of recent deep learning techniques allowed us to achieve 94.5% accuracy on a pre-clinical test classifying 4 postures using imagery from an off-the-shelf camera and edge processing, which is a 60% improvement over the result previously known in the literature. This in turn allowed us to build a system ready for clinical trials, based on inexpensive off-the-shelf cameras. Clinical Relevance: A cheap and practical system of automatic video monitoring of bedridden patients makes it possible to minimize the risk of pressure ulcers in the ICU. ...

Zubchuk, E., Menshikov, D., & Mikhaylovsky, N. (2021, September). Efficiency of short text classifiers for payment classification. In 2021 International Conference on Information Technology and Nanotechnology (ITNT) (pp. 1-4). IEEE.

Traditionally, the Central Bank of Russia used regular expressions for payment classification as part of its supervisory activities. Regular expressions often spanned multiple pages to cover varied relevant keywords and their forms. We compare this approach to two modern short text classification approaches, fastText and a BERT-based transformer, in terms of speed, accuracy and flexibility, including few-shot learning. ...