We propose a shared task on human-like long story generation, the LSG Challenge, which asks models to output a consistent, human-like long story (a general-audience Harry Potter fanfic in English) given a prompt of about 1K tokens. We suggest a novel statistical metric of text structuredness, the GloVe Autocorrelations Power/Exponential Law Mean Absolute Percentage Error Ratio (GAPELMAPER), alongside the previously known UNION metric and a human evaluation protocol. We hope that LSG opens new avenues for researchers to investigate sampling approaches, prompting strategies, and autoregressive and non-autoregressive text generation architectures, and breaks the barrier to generating consistent long (40K+ word) texts. ...
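The abstract names GAPELMAPER but does not spell out its formula. Below is a minimal sketch of one plausible reading, assuming the metric fits both a power law and an exponential law to the decay of mean cosine autocorrelation over per-token GloVe vectors and reports the ratio of the two fitting errors; every function and fitting choice here is an illustrative assumption, not the authors' code.

```python
# Hypothetical GAPELMAPER sketch: ratio of power-law to exponential-law
# fit errors (MAPE) on the autocorrelation curve of per-token GloVe vectors.
import numpy as np

def autocorrelation(vectors: np.ndarray, max_lag: int) -> np.ndarray:
    """Mean cosine similarity between word vectors at each lag."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return np.array([np.mean(np.sum(v[:-lag] * v[lag:], axis=1))
                     for lag in range(1, max_lag + 1)])

def mape(y: np.ndarray, y_hat: np.ndarray) -> float:
    return float(np.mean(np.abs((y - y_hat) / y)))

def gapelmaper(vectors: np.ndarray, max_lag: int = 300) -> float:
    """Ratio < 1 means the power law fits better than the exponential law."""
    lags = np.arange(1, max_lag + 1)
    ac = np.clip(autocorrelation(vectors, max_lag), 1e-8, None)  # keep logs valid
    # Power law C(t) = a * t^b is a straight line in log-log coordinates.
    b_pow, log_a_pow = np.polyfit(np.log(lags), np.log(ac), 1)
    power_fit = np.exp(log_a_pow) * lags ** b_pow
    # Exponential law C(t) = a * exp(b t) is a straight line in semi-log coordinates.
    b_exp, log_a_exp = np.polyfit(lags, np.log(ac), 1)
    exp_fit = np.exp(log_a_exp + b_exp * lags)
    return mape(ac, power_fit) / mape(ac, exp_fit)
```

Under this reading, a structured literary-like text would be expected to score below 1, and a text with short-range (Markov-like) correlations above 1.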
This paper documents the approach of Team NTR for the Second Shared Task on Automatic Minuting (AutoMin) at INLG 2023. The goal of this work is to develop a module for the automatic generation of meeting minutes from a meeting transcript produced by an Automatic Speech Recognition (ASR) system (Task A). We treat minuting as a supervised machine learning task on pairs of texts: the transcript of the meeting and its minutes. We use a two-stage minuting pipeline consisting of segmentation and summarization. We experiment with semantic segmentation, multi-language approaches, and the large language model Dolly, achieving a ROUGE-1 F1 of 0.2455 and a BERTScore of 0.8063 on the English part of the ELITR test set, and a ROUGE-1 F1 of 0.2430 and a BERTScore of 0.8332 on the EuroParl dev set with the submitted Naive Segmentation + Dolly 7B pipeline. ...
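For illustration, here is a minimal sketch of such a two-stage pipeline, assuming fixed-size naive segmentation and Dolly 7B loaded through the Hugging Face pipeline interface; the chunk size, prompt wording, and generation settings are assumptions, not the submitted configuration.

```python
# Sketch of the two-stage pipeline: naive segmentation, then LLM summarization.
from transformers import pipeline

# Dolly v2 ships a custom instruction pipeline, hence trust_remote_code=True.
generate = pipeline(model="databricks/dolly-v2-7b",
                    trust_remote_code=True, device_map="auto")

def naive_segments(transcript: str, max_words: int = 700):
    """Stage 1: split the transcript into fixed-size word windows."""
    words = transcript.split()
    for i in range(0, len(words), max_words):
        yield " ".join(words[i:i + max_words])

def minutes(transcript: str) -> str:
    """Stage 2: summarize each segment and join the results into minutes."""
    bullets = []
    for seg in naive_segments(transcript):
        out = generate(f"Summarize this meeting fragment as minutes:\n{seg}",
                       max_new_tokens=128)
        bullets.append("- " + out[0]["generated_text"].strip())
    return "\n".join(bullets)
```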
Meeting minutes are short texts summarizing the most important outcomes of a meeting. The goal of this work is to develop a module for the automatic generation of meeting minutes from a meeting transcript produced by an Automatic Speech Recognition (ASR) system. We treat minuting as a supervised machine learning task on pairs of texts: the transcript of the meeting and its minutes. No Russian minuting dataset was previously available; to fill this gap, we present DumSum, a dataset of meeting transcripts of the Russian State Duma and City Dumas, complete with minutes. We use a two-stage minuting pipeline and introduce semantic segmentation, which improves the ROUGE and BERTScore metrics of minutes on City Duma meetings by 1-10% compared to naive segmentation. ...
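The abstract does not detail the semantic segmentation algorithm; one plausible, TextTiling-style reading is to cut the transcript wherever the cosine similarity of adjacent utterance embeddings drops below a threshold. The encoder name and threshold below are illustrative assumptions.

```python
# Hypothetical semantic segmentation: start a new segment at topic shifts,
# detected as dips in adjacent-utterance embedding similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def semantic_segments(utterances: list[str], threshold: float = 0.3):
    emb = encoder.encode(utterances, normalize_embeddings=True)
    sims = np.sum(emb[:-1] * emb[1:], axis=1)  # cosine of adjacent utterances
    segment = [utterances[0]]
    for utt, sim in zip(utterances[1:], sims):
        if sim < threshold:                    # topic shift detected
            yield " ".join(segment)
            segment = []
        segment.append(utt)
    yield " ".join(segment)
```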
We show that the laws of autocorrelation decay in texts are closely related to the applicability limits of language models. Using distributional semantics, we empirically demonstrate that autocorrelations of words in texts decay according to a power law. We show that distributional semantics yields consistent autocorrelation decay exponents for texts translated into multiple languages. Autocorrelation decay in generated texts differs quantitatively, and often qualitatively, from that in literary texts. We conclude that language models exhibiting Markov behavior, including large autoregressive language models, may have limitations when applied to long texts, whether for analysis or generation. ...
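The contrast behind this conclusion can be stated in one line: a finite-memory (Markov) process produces exponentially decaying autocorrelations, whereas the paper reports power-law decay in natural texts. A minimal sketch of the two laws:

```latex
% Power-law decay (observed in texts) vs. exponential decay (any Markov,
% i.e. finite-memory, process): on a log-log plot the first is a straight
% line of slope -\alpha, while the second bends down ever more steeply.
\[
  C_{\text{text}}(t) \propto t^{-\alpha}
  \qquad\text{vs.}\qquad
  C_{\text{Markov}}(t) \propto e^{-t/\tau}.
\]
```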
We study the performance of BERT-like distributional semantic language models on anaphora resolution and related tasks with the purpose of selecting a model for on-device inference. We have found that lean (narrow and deep) language models provide the best balance of speed and quality for word-level tasks, and we open-source the RuLUKE-tiny and RuLUKE-slim models we have trained. Both are significantly (over 27%) faster than models of comparable accuracy. We hypothesise that model depth may play a critical role in performance since, according to recent findings, each layer behaves as a gradient descent step in an autoregressive setting. ...
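A speed/quality comparison of this kind rests on a latency benchmark; below is a minimal sketch of how such a measurement is commonly done with Hugging Face encoder models. The hub id is a placeholder, not the actual RuLUKE release location.

```python
# Minimal per-batch latency measurement for an encoder model (placeholder id).
import time
import torch
from transformers import AutoModel, AutoTokenizer

name = "your-org/ruluke-tiny"  # hypothetical hub id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

batch = tokenizer(["Мама мыла раму."] * 8, return_tensors="pt", padding=True)
with torch.no_grad():
    model(**batch)                        # warm-up run
    start = time.perf_counter()
    for _ in range(20):
        model(**batch)
print(f"{(time.perf_counter() - start) / 20 * 1000:.1f} ms per batch")
```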
Coreference resolution is an important task in natural language processing, since it feeds into such vital tasks as information retrieval, text summarization, question answering, sentiment analysis, and machine translation. In this paper, we present a study of the effectiveness of several approaches to coreference resolution, focusing on the RuCoCo dataset, along with the results of our participation in Dialogue Evaluation 2023. We explore ways to increase the dataset size by using pseudo-labelling and data translated from another language. Using these techniques, we managed to triple the size of the dataset, make it more diverse, and improve the performance of autoregressive structured prediction (ASP) on the coreference resolution task. This approach allowed us to achieve the best results on the RuCoCo private test set, with an increase in F1 score of 1.8 points, precision of 0.5 points, and recall of 3.0 points over the second-best leaderboard score. Our results demonstrate the potential of the ASP model and the importance of diverse training data for coreference resolution. ...
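A schematic of the pseudo-labelling loop described here, with all names illustrative (the abstract does not show the actual ASP training code):

```python
# Schematic pseudo-labelling: train on gold data, label raw texts, keep the
# confident predictions, and retrain on the union of gold and pseudo-labels.
def pseudo_label(gold_set, unlabeled_texts, train_fn, predict_fn,
                 min_confidence=0.9):
    model = train_fn(gold_set)                    # 1. train on gold pairs
    augmented = list(gold_set)
    for text in unlabeled_texts:                  # 2. label raw texts
        clusters, confidence = predict_fn(model, text)
        if confidence >= min_confidence:          # 3. filter by confidence
            augmented.append((text, clusters))
    return train_fn(augmented)                    # 4. retrain on the union
```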