Alibekov A., Matenkov A., Bolshakov V., Mukhtarova G., Migal A., Muryshev A., Kozachenko A., Mikhaylovskiy N., 2025. RuTaR—A Dataset in Russian for Reasoning about Taxes. Proceedings of the International Conference “Dialogue 2025”, Moscow.

In 2024, reasoning emerged as a new frontier for artificial intelligence and computational linguistics. Reasoning models are typically evaluated either on STEM-related datasets or on synthetic datasets. This overlooks a huge area of human thought, namely the humanities. To partially bridge this gap, we present a new open dataset, RuTaR (Russian Tax Reasoning). The dataset consists of the lightly modified content of 199 selected letters from the Ministry of Finance of Russia and the Russian Federal Tax Service, each of which typically reasons its way to an answer to a taxpayer's question. Despite the apparent simplicity of the yes/no questions, both off-the-shelf Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems struggle to achieve high results on the dataset, with the best RAG system studied achieving 77% accuracy.