optimization

Make it better. Make it better again. Make it better with derivatives. Make it better with acceleration. Make it better even though things are nonconvex because I’m using a deep network and things are stochastic because I’m using minibatching. Stuff like that.
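
To make "make it better" a bit more concrete, here is a minimal sketch on a toy 1-D least-squares problem. Everything in it is an assumption for illustration, not taken from any of the papers below: the data, the step sizes, and the function names `gradient_descent`, `heavy_ball`, and `minibatch_sgd` are all made up. Heavy-ball momentum is used as one simple stand-in for "acceleration".

```python
import random

# Hypothetical toy data for a 1-D least-squares problem: y = 3 * x,
# so the minimizer of the average loss is w = 3.
XS = [0.5, 1.0, 1.5, 2.0]
YS = [3.0 * x for x in XS]


def grad(w, indices):
    """Gradient of the mean of 0.5 * (w * x - y)^2 over the chosen data points."""
    return sum((w * XS[i] - YS[i]) * XS[i] for i in indices) / len(indices)


def gradient_descent(w, lr=0.1, steps=200):
    """Make it better with derivatives: step against the full gradient."""
    full = range(len(XS))
    for _ in range(steps):
        w -= lr * grad(w, full)
    return w


def heavy_ball(w, lr=0.1, beta=0.5, steps=200):
    """Make it better with momentum: heavy-ball (Polyak) update."""
    full = range(len(XS))
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * grad(w, full)
        w += velocity
    return w


def minibatch_sgd(w, lr=0.1, batch_size=2, steps=200, seed=0):
    """Make it stochastic: each step uses the gradient on a random minibatch."""
    rng = random.Random(seed)
    for _ in range(steps):
        batch = rng.sample(range(len(XS)), batch_size)
        w -= lr * grad(w, batch)
    return w
```

Starting each of these from `w = 0.0` should land close to 3. The toy data is noiseless, so every minibatch gradient vanishes at the minimizer and SGD does not bounce around it here; with noisy labels a constant step size would generally leave it hovering near the minimizer instead.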

You're gonna see topics like:

optimization algorithms and their guarantees (SGD, Adam, AdaGrad, stuff like that)
research on different optimization settings (convexity, stochasticity)
statistical learning/ML theory (anything from linear regression to PAC to Rademacher)
online learning and regret

list of papers

[ ] "The Power of Convex Relaxation: Near-Optimal Matrix Completion" - Candès & Terry Tao
[ ] "Online Infinite-Dimensional Regression: Learning Linear Operators" - Raman, Subedi, Tewari
[ ] "On the Trajectories of SGD Without Replacement" - Beneventano
[ ] "Settling the Sample Complexity of Online Reinforcement Learning" - Zhang, Yuxin Chen, Jason Lee, Simon Du
[ ] "A Theory of Multimodal Learning" - Zhou
[ ] "Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient" - Dylan J. Foster et al
[ ] "The Statistical Complexity of Interactive Decision Making" - Foster et al
[ ] "Exponentiated Gradient Meets Gradient Descent" - Udaya, Hazan, and Yoram Singer
[ ] "On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration" - IBM peeps
[ ] "Lower Bounds for Non-Convex Stochastic Optimization" - Duchi, Foster, Srebro, …
[ ] "On the hardness of learning under symmetries" - Melanie Weber et al
[ ] "Automatic Gradient Descent: Deep Learning without Hyperparameters" - Bernstein, …, Yisong Yue
[ ] "SAMUEL: Adaptive Gradient Methods with Local Guarantees" - Zhou, Luna, Sanjeev, Hazan
[ ] "Global Optimality Guarantees For Policy Gradient Methods" - Bhandari & Russo
[ ] "On Lower and Upper Bounds in Smooth and Strongly Convex Optimization" - Shai Shalev-Shwartz and some others