Spencer Frei

I am a Senior Research Scientist at Google DeepMind, where I work on foundational research in artificial intelligence. I aim to improve the efficiency and performance of AI systems and to further our understanding of their behavior.

Prior to joining Google DeepMind, I was an assistant professor in the Department of Statistics at UC Davis. Before that, I was a postdoctoral fellow at the Simons Institute for the Theory of Computing at UC Berkeley and received my PhD from UCLA. More can be found on my CV.

NeurIPS 2023 Tutorial:
Reconsidering Overfitting in the Age of Overparameterized Models
Spencer Frei, Vidya Muthukumar (Georgia Tech), Fanny Yang (ETH Zurich).
Video, Slides, Webpage, Photo

Selected recent works

Trained transformer classifiers generalize and exhibit benign overfitting in-context.
Spencer Frei and Gal Vardi.
ICLR 2025.

Trained transformers learn linear models in-context.
Ruiqi Zhang, Spencer Frei, Peter L. Bartlett.
Journal of Machine Learning Research, 2024.

Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization.
Spencer Frei*, Gal Vardi*, Peter L. Bartlett, Nathan Srebro.
COLT 2023.

Implicit bias in leaky ReLU networks trained on high-dimensional data.
Spencer Frei*, Gal Vardi*, Peter L. Bartlett, Nathan Srebro, Wei Hu.
ICLR 2023. (Spotlight)

Benign overfitting without linearity: Neural network classifiers trained by gradient descent for noisy linear data.
Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett.
COLT 2022.

Proxy convexity: A unified framework for the analysis of neural networks trained by gradient descent.
Spencer Frei and Quanquan Gu.
NeurIPS 2021.


* Paper on benign overfitting in attention-based neural networks was accepted at NeurIPS 2025.
* I will be an Area Chair for ICLR 2026.
* I will be a participant at the Modern and Emerging Phenomena in Machine Learning workshop at Oberwolfach in March 2026.
* Paper on adversarial robustness of in-context learning was accepted at TMLR.
* I will give a talk at the Columbia University Statistics Seminar on April 21.
* New preprint with Alexander Tsigler, Luiz Chamon, and Peter Bartlett on benign overfitting in classification.
* Paper on in-context learning of linear classifiers accepted for publication at ICLR 2025.
* I have left UC Davis and joined Google DeepMind as a Research Scientist.
* I will be an Area Chair for ICML 2025.
* New preprint with Usman Anwar, Johannes von Oswald, Louis Kirsch, and David Krueger on adversarial robustness of in-context learning.
* New preprint with Roey Magen, Shuning Shang, Zhiwei Xu, Wei Hu, and Gal Vardi on benign overfitting in single-head attention.
* New preprint with Gal Vardi on in-context learning and benign overfitting in transformers.
* Paper on learning a single-neuron autoencoder with SGD has been accepted at JMLR pending minor revision.
* I will be an Area Chair for ALT 2025.
* I will be an Area Chair for NeurIPS 2024.
* I will be an Area Chair for the ICML 2024 workshop Theoretical Foundations of Foundation Models.
* I will be a long-term participant at the Simons Institute's program on Modern Paradigms in Generalization in Fall 2024.
* Paper on interpolation under distribution shift was accepted at ICML 2024.
* New preprint on minimum-norm interpolation under distribution shift with Neil Mallinar, Austin Zane, and Bin Yu.
* Paper on transformers and in-context learning accepted for publication at JMLR.
* I will give a talk at the UC Davis Mathematics of Data and Decisions Seminar on May 21.
* I will give a talk at the UCLA Department of Statistics & Data Science Seminar on May 16.
* I will give a talk at the Sorbonne Université-Paris Diderot University Statistics seminar on April 30.
* I will give a talk at INRIA SIERRA/École normale supérieure on April 29.
* Paper on benign overfitting and grokking in ReLU networks was accepted at ICLR 2024.
* I will present the paper Random Feature Amplification at ICLR 2024 as part of the journal-to-conference track.

Older news

2023
* Two papers accepted at NeurIPS 2023 workshops: one at Robustness of Foundation Models (R0-FoMo), another at Mathematics of Modern Machine Learning (M3L).
* I will give a talk at Apple Machine Learning Research, Cupertino on November 29.
* New preprint on benign overfitting and grokking with Zhiwei Xu, Yutong Wang, Gal Vardi, and Wei Hu.
* The Double-Edged Sword of Implicit Bias was accepted at NeurIPS 2023.
* Random Feature Amplification was accepted for publication at JMLR.
* I will present the tutorial on "Reconsidering Overfitting in the Age of Overparameterized Models" with Vidya Muthukumar and Fanny Yang at NeurIPS 2023.
* I will give a talk at the University of Basel Department of Mathematics and Computer Science seminar on November 9.
* I will give a talk at the University of Cambridge Machine Learning Group on October 25.
* I will give a talk at Google DeepMind London on October 18.
* I will give a talk at the Imperial College London AI+X Seminar on October 17.
* I will give a talk at the University of Oxford Computational Statistics and Machine Learning Seminar on October 13.
* I will give a talk at Stanford University (Tengyu Ma's group) on September 11.
* I will give a talk at the Google Research in-context learning reading group on August 11th.
* New preprint with Nikhil Ghosh, Wooseok Ha, and Bin Yu on learning a single-neuron autoencoder with SGD.
* I will be a member of the senior program committee for ALT 2024.
* New preprint with Ruiqi Zhang and Peter Bartlett on in-context learning of linear models with transformers.
* Paper on benign overfitting in neural networks was accepted at COLT 2023.
* I will be an Area Chair for NeurIPS 2023 in New Orleans.
* I will be speaking at the Youth in High Dimensions workshop in Trieste, Italy, from May 29 to June 2.
* I will be joining UC Davis as an Assistant Professor of Statistics in the fall.
* Two new preprints with Gal Vardi, Peter Bartlett, and Nati Srebro: one on benign overfitting, another on adversarial robustness.
* I am on the program committee for COLT 2023.
* Paper on implicit bias in neural nets trained on high-dimensional data was accepted for publication at ICLR 2023 as a spotlight presentation.

2022
* I will be speaking at the Symposium on Frontiers of Machine Learning and Artificial Intelligence at the University of Southern California on November 10th.
* New preprint with Gal Vardi, Peter Bartlett, Nati Srebro, and Wei Hu on implicit bias in neural networks trained on high-dimensional data.
* I have been selected as a 2022 Rising Star in Machine Learning by the University of Maryland.
* I am giving a talk at the University of Alberta Statistics Seminar on October 26th.
* I am giving a talk at the EPFL Fundamentals of Learning and Artificial Intelligence Seminar on September 30th.
* I am a visiting scientist at EPFL in September and October, hosted by Emmanuel Abbe.
* I am giving a talk at the Joint Statistical Meetings about benign overfitting without linearity.
* Benign overfitting without linearity was accepted at COLT 2022.
* I am an organizer for the Deep Learning Theory Summer School and Workshop, to be held this summer at the Simons Institute.
* I will be speaking at the ETH Zurich Data, Algorithms, Combinatorics, and Optimization Seminar on June 7th.
* I will be a keynote speaker at the University of Toronto Statistics Research Day on May 25th.
* I am giving a talk at Harvard University's Probabilitas Seminar on May 6th.
* Two recent works accepted at the Theory of Overparameterized Machine Learning 2022 workshop, including one as a contributed talk.
* I am giving a talk at the Microsoft Research ML Foundations Seminar on April 28th.
* I am giving a talk at the University of British Columbia (Christos Thrampoulidis's group) on April 8th.
* I am giving a talk at Columbia University (Daniel Hsu's group) on April 4th.
* I am giving a talk at Oxford University (Yee Whye Teh's group) on March 23rd.
* I am giving a talk at the NSF/Simons Mathematics of Deep Learning seminar on March 10th.
* I am giving a talk at the Google Algorithms Seminar on March 8th.
* I'm reviewing for the Theory of Overparameterized Machine Learning 2022 workshop.
* Two new preprints with Niladri Chatterji and Peter Bartlett: Benign Overfitting without Linearity and Random Feature Amplification.
* Recent work on sample complexity of a self-training algorithm accepted at AISTATS 2022.

2021
* I am speaking at the Deep Learning Theory Symposium at the Simons Institute on December 6th.
* My paper on proxy convexity as a framework for neural network optimization was accepted at NeurIPS 2021.
* Two new preprints on arXiv: (1) Proxy convexity: a unified framework for the analysis of neural networks trained by gradient descent, and (2) Self-training converts weak learners to strong learners in mixture models.
* I am reviewing for the ICML 2021 workshop Overparameterization: Pitfalls and Opportunities (ICMLOPPO2021).
* Three recent papers accepted at ICML, including one as a long talk.
* New preprint on provable robustness of adversarial training for learning halfspaces with noise.
* I will be presenting recent work at TOPML2021 as a lightning talk, and at the SoCal ML Symposium as a spotlight talk.
* I'm giving a talk at the ETH Zurich Young Data Science Researcher Seminar on April 16th.
* I'm giving a talk at the Johns Hopkins University Machine Learning Seminar on April 2nd.
* I'm reviewing for the Theory of Overparameterized Machine Learning Workshop.
* I'm giving a talk at the Max-Planck-Institute (MPI) MiS Machine Learning Seminar on March 11th.
* New preprint showing SGD-trained neural networks of any width generalize in the presence of adversarial label noise.

2020
* New preprint on agnostic learning of halfspaces using gradient descent is now on arXiv.
* My single neuron paper was accepted at NeurIPS 2020.
* I will be attending the IDEAL Special Quarter on the Theory of Deep Learning hosted by TTIC/Northwestern for the fall quarter.
* I've been awarded a Dissertation Year Fellowship by UCLA's Graduate Division.
* New preprint on agnostic PAC learning of a single neuron using gradient descent is now on arXiv.
* New paper accepted at Brain Structure and Function from work with researchers at UCLA School of Medicine.
* I'll be (remotely) working at Amazon's Alexa AI group for the summer as a research intern, working on natural language understanding.

2019
* My paper with Yuan Cao and Quanquan Gu, "Algorithm-dependent Generalization Bounds for Overparameterized Deep Residual Networks", was accepted at NeurIPS 2019 (arXiv version, NeurIPS version).

Benign overfitting and the geometry of the ridge regression solution in binary classification.
Alexander Tsigler, Luiz F. O. Chamon, Spencer Frei, Peter L. Bartlett.
Preprint, 2025.

Understanding in-context learning of linear models in transformers through an adversarial lens.
Usman Anwar, Johannes von Oswald, Louis Kirsch, David Krueger, Spencer Frei.
TMLR 2025.

Benign overfitting in single-head attention.
Roey Magen*, Shuning Shang*, Zhiwei Xu, Spencer Frei, Wei Hu, Gal Vardi.
NeurIPS 2025.

Trained transformer classifiers generalize and exhibit benign overfitting in-context.
Spencer Frei and Gal Vardi.
ICLR 2025.

Minimum-norm interpolation under covariate shift.
Neil Mallinar*, Austin Zane*, Spencer Frei, Bin Yu.
ICML 2024.

Benign overfitting and grokking in ReLU networks for XOR cluster data.
Zhiwei Xu, Yutong Wang, Spencer Frei, Gal Vardi, Wei Hu.
ICLR 2024.

The effect of SGD batch size on autoencoder learning: Sparsity, sharpness, and feature learning.
Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu.
Journal of Machine Learning Research, 2025.

Trained transformers learn linear models in-context.
Ruiqi Zhang, Spencer Frei, Peter L. Bartlett.
Journal of Machine Learning Research, 2024.

The double-edged sword of implicit bias: Generalization vs. robustness in ReLU networks.
Spencer Frei*, Gal Vardi*, Peter L. Bartlett, Nathan Srebro.
NeurIPS 2023.

Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization.
Spencer Frei*, Gal Vardi*, Peter L. Bartlett, Nathan Srebro.
COLT 2023.

Implicit bias in leaky ReLU networks trained on high-dimensional data.
Spencer Frei*, Gal Vardi*, Peter L. Bartlett, Nathan Srebro, Wei Hu.
ICLR 2023. (Spotlight)

Random feature amplification: Feature learning and generalization in neural networks.
Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett.
Journal of Machine Learning Research, 2023.

Benign overfitting without linearity: Neural network classifiers trained by gradient descent for noisy linear data.
Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett.
COLT 2022.

Self-training converts weak learners to strong learners in mixture models.
Spencer Frei*, Difan Zou*, Zixiang Chen*, Quanquan Gu.
AISTATS 2022.

Proxy convexity: A unified framework for the analysis of neural networks trained by gradient descent.
Spencer Frei and Quanquan Gu.
NeurIPS 2021.

Provable robustness of adversarial training for learning halfspaces with noise.
Difan Zou*, Spencer Frei*, Quanquan Gu.
ICML 2021.

Provable generalization of SGD-trained neural networks of any width in the presence of adversarial label noise.
Spencer Frei, Yuan Cao, Quanquan Gu.
ICML 2021.

Agnostic learning of halfspaces with gradient descent via soft margins.
Spencer Frei, Yuan Cao, Quanquan Gu.
ICML 2021, Oral (long talk).

Agnostic learning of a single neuron with gradient descent.
Spencer Frei, Yuan Cao, Quanquan Gu.
NeurIPS 2020.

Hemodynamic latency is associated with reduced intelligence across the lifespan: an fMRI DCM study of aging, cerebrovascular integrity, and cognitive ability.
Ariana E. Anderson, Mirella Diaz-Santos, Spencer Frei et al.
Brain Structure and Function, 2020.

Algorithm-dependent generalization bounds for overparameterized deep residual networks.
Spencer Frei, Yuan Cao, Quanquan Gu.
NeurIPS 2019.

A lower bound for $p_c$ in range-$R$ bond percolation in two and three dimensions.
Spencer Frei and Edwin Perkins.
Electronic Journal of Probability, 2016.

On thermal resistance in concentric residential geothermal heat exchangers.
Spencer Frei, Kathryn Lockwood, Greg Stewart, Justin Boyer, Burt S. Tilley.
Journal of Engineering Mathematics, 2014.

* denotes equal contribution.

At UC Davis:
* STA 035B: Statistical Data Science II (Winter 2024)
* STA 250: Theoretical Foundations of Modern AI (Winter 2024)
* STA 290: Statistics Seminar (Winter 2024)