Chaunin V., Mikhaylovskiy N., 2025. Matrix Mixture of Experts is the best fast feed-forward. MathAI, Tomsk

We dissect the recently introduced Fast Feed-Forward (FFF) neural network architecture and propose a matrix formulation of FFF that provides a unified perspective on FFF and Mixture of Experts (MoE) architectures. On GPUs, this formulation makes FFF inference nearly 4x faster on average than the original formulation for depths of up to 8.
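
To make the FFF/MoE connection concrete, the following is a minimal NumPy sketch, not the formulation proposed in the paper. It assumes an FFF is a complete binary tree of depth `depth` whose internal nodes each hold one weight vector, that hard routing moves to the right child when the node score is positive, and that each leaf is a small ReLU feed-forward block acting as an expert. All function names, array shapes, and the sign convention are illustrative assumptions.

```python
import numpy as np

def fff_route_loop(x, node_w, depth):
    """Per-sample tree descent: one dot product per level (assumed convention)."""
    n = 0  # root of a heap-ordered complete binary tree
    for _ in range(depth):
        # go to the left child (2n+1) or the right child (2n+2)
        n = 2 * n + 1 + int(node_w[n] @ x > 0)
    return n - (2 ** depth - 1)  # leaf index in [0, 2**depth)

def fff_route_batched(X, node_w, depth):
    """Batched descent: score every node with one matmul, then index."""
    scores = X @ node_w.T                 # (batch, 2**depth - 1)
    rows = np.arange(len(X))
    n = np.zeros(len(X), dtype=np.int64)
    for _ in range(depth):
        n = 2 * n + 1 + (scores[rows, n] > 0)
    return n - (2 ** depth - 1)

def moe_forward(X, leaf_idx, W1, W2):
    """Top-1 MoE view: each sample passes through its selected leaf (expert).

    W1: (leaves, hidden, d_in), W2: (leaves, d_out, hidden).
    """
    h = np.maximum(np.einsum('bi,bhi->bh', X, W1[leaf_idx]), 0.0)
    return np.einsum('bh,boh->bo', h, W2[leaf_idx])
```

In this sketch the batched router evaluates all `2**depth - 1` node scores instead of only the `depth` scores on the chosen path, trading extra arithmetic for a single dense matrix product. Whether the paper's formulation makes the same trade is not stated in the abstract.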