Layer after layer, like a computational onion. I’m a big fan of infinite depth these days, but the papers in this section cover anything that theoretically investigates the structure of deep neural networks.
You’re gonna see topics like:
- infinite width/depth (perhaps jointly)
- feature learning
- mean field analyses
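Before the list, a minimal sketch (my own toy setup, not taken from any single paper below) of the scaling heuristic that the infinite-width literature builds on: with He-style weight variance 2/n, the expected squared activation norm of a random ReLU MLP is preserved layer to layer, so the network has a sensible large-width limit at initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_sq_norm(width, depth=5, trials=100):
    """Average squared activation norm after `depth` random ReLU layers.

    With W_ij ~ N(0, 2/width), E||h_{l+1}||^2 = ||h_l||^2 for ReLU,
    so this should hover around 1 for a unit-norm input at any width.
    """
    total = 0.0
    for _ in range(trials):
        h = np.ones(width) / np.sqrt(width)  # unit-norm input
        for _ in range(depth):
            W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))
            h = np.maximum(W @ h, 0.0)       # ReLU keeps half the mass; the factor 2 compensates
        total += h @ h
    return total / trials

for n in (32, 128, 512):
    print(n, round(mean_sq_norm(n), 2))
```

The fluctuations around 1 shrink as the width grows, which is exactly the regime where the Gaussian-process and NTK limits (and the depth-and-width corrections studied in several papers below) kick in.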
List of papers:
- [ ] "Most Neural Networks Are Almost Learnable" - Daniely, Srebro, Vardi
- [ ] "Exact Solutions of a Deep Linear Network" - Ziyin, Li, and Meng
- [ ] "A Spectral Condition for Feature Learning" - Greg Yang, …, Jeremy Bernstein
- [ ] "A Deep Conditioning Treatment of Neural Networks" - Naman et al.
- [ ] "Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks" - Greg Yang, …, Soufiane Hayou
- [ ] "Quantitative CLTs in Deep Neural Networks" - Boris, …
- [ ] "Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time" - Tengyu et al
- [ ] "The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization" - Mufan (Bill) Li, Mihai Nica, Daniel M. Roy
- [ ] "Regret Guarantees for Online Deep Control" - Xinyi, Edgar, Jason, Hazan
- [ ] "Depth Dependence of μP Learning Rates in ReLU MLPs" - Hanin et al
- [ ] "The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks" - Frei, Vardi, Bartlett, Srebro
- [ ] "Optimisation & Generalisation in Networks of Neurons" - Jeremy Bernstein
- [ ] "Width and Depth Limits Commute in Residual Networks" - Soufiane Hayou & Greg Yang
- [ ] "Random Fully Connected Neural Networks as Perturbatively Solvable Hierarchies" - Boris
- [ ] "Bayesian Interpolation with Deep Linear Networks" - Boris & Alexander Zlokapa