New preprint: Global Convergence and Better Spectral Bias in Low-Rank Neural Networks

We study low-rank neural networks as more than compressed versions of dense models. The main message is that rank can act as a structural control on what the network learns first, and that an intermediate rank can improve high-frequency recovery at finite training time.

Neural networks often learn low-frequency components before high-frequency ones. Even when the model can represent a highly oscillatory target, finite-time training can leave the oscillatory modes poorly recovered. This paper asks whether a low-rank structure can improve that behavior, rather than only reducing the parameter count.

Main results.

Global convergence: Low-rank random-feature networks in the mean-field limit converge to a global minimizer of the population risk whenever their limiting dynamics converge.
Rank as learning structure: Rank is not only a compression parameter. A well-chosen rank can reduce the number of trainable degrees of freedom while improving the fit of highly oscillatory targets.
Intermediate-rank principle: If the rank is too small, the model lacks expressivity; if it is too large, it recovers the finite-time spectral bias of the dense model. The best rank is typically intermediate.
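To make the role of rank as a count of trainable degrees of freedom concrete, here is a minimal NumPy sketch. It assumes the common factorization W = U V^T of a layer's weight matrix; the announcement does not specify the paper's exact parameterization, so the factor names U and V, the helper functions, and the layer sizes are all illustrative assumptions.

```python
import numpy as np

def dense_param_count(d_in, d_out):
    """Number of weights in a dense d_out x d_in matrix."""
    return d_in * d_out

def low_rank_param_count(d_in, d_out, rank):
    """Number of weights in the factors U (d_out x rank) and V (d_in x rank)."""
    return rank * (d_in + d_out)

def low_rank_layer(x, U, V):
    """Apply W = U V^T to a batch x of shape (batch, d_in) without forming W."""
    return (x @ V) @ U.T

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 256, 8

U = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
V = rng.standard_normal((d_in, rank)) / np.sqrt(d_in)
x = rng.standard_normal((5, d_in))

y = low_rank_layer(x, U, V)   # shape (5, d_out)
W = U @ V.T                   # explicit (d_out, d_in) matrix, formed only for checking

print("dense parameters:   ", dense_param_count(d_in, d_out))
print("low-rank parameters:", low_rank_param_count(d_in, d_out, rank))
print("rank of W:          ", np.linalg.matrix_rank(W))
```

In this sketch, sweeping the rank from 1 up to min(d_in, d_out) moves the layer from a very constrained map to one that can represent any dense weight matrix, which is the range over which the intermediate-rank principle is stated.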

Diagnostics and experiments. Controlled geometric and Fourier diagnostics show that the optimal rank shifts with the target spectrum and the training objective. In high-frequency regression experiments, an appropriate low rank lowers test loss, improves high-frequency recovery, and avoids the dense model’s finite-time bias.
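One simple way to quantify high-frequency recovery is to split the residual energy of a fitted function into low- and high-frequency bands. The sketch below is a hypothetical diagnostic for a 1D signal sampled on a uniform periodic grid, not the paper's own measurement; the function name band_errors, the cutoff, and the toy target are assumptions made for illustration.

```python
import numpy as np

def band_errors(target, prediction, cutoff):
    """Split residual energy into low and high frequency bands.

    Uses the one-sided, unnormalized FFT of the residual on a uniform grid.
    Modes with index <= cutoff count as low frequency, the rest as high.
    """
    residual = np.asarray(target, dtype=float) - np.asarray(prediction, dtype=float)
    energy = np.abs(np.fft.rfft(residual)) ** 2
    modes = np.arange(energy.size)
    low = energy[modes <= cutoff].sum()
    high = energy[modes > cutoff].sum()
    return low, high

# Toy target with one low-frequency and one high-frequency component.
n = 256
grid = np.arange(n) / n
target = np.sin(2 * np.pi * grid) + 0.5 * np.sin(2 * np.pi * 20 * grid)

# A hypothetical fit that has recovered only the low-frequency component.
prediction = np.sin(2 * np.pi * grid)

low, high = band_errors(target, prediction, cutoff=5)
print("low-band residual energy: ", low)
print("high-band residual energy:", high)
```

Tracking the two band energies over training time, for models of different rank, is one way to see whether the oscillatory part of the target is being recovered or left behind.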

Paper: Global Convergence and Better Spectral Bias in Low-Rank Neural Networks. Joint work with Haizhao Yang and Shijun Zhang.