Machine Learning Seminar Series Fall 2025

Do Neural Networks Generalize Well? Low Norms vs. Flat Minima

 

Abstract: This talk investigates the fundamental differences between low-norm and flat solutions of shallow ReLU network training problems, particularly in high-dimensional settings. We show that global minima with small weight norms exhibit strong generalization guarantees that are dimension-independent. In contrast, local minima that are “flat” can generalize poorly as the input dimension increases. We attribute this gap to a phenomenon we call neural shattering, where neurons specialize to extremely sparse input regions, resulting in activations that are nearly disjoint across data points. This forces the network to rely on large weight magnitudes, leading to poor generalization. Our theoretical analysis establishes an exponential separation between flat and low-norm minima. In particular, while flatness does imply some degree of generalization, we show that the corresponding convergence rates necessarily deteriorate exponentially with input dimension. These findings suggest that flatness alone does not fully explain the generalization performance of neural networks.
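For readers who want a concrete handle on the two quantities the abstract contrasts, the following minimal Python sketch (not taken from the talk; the data, network sizes, and the finite-difference flatness proxy are all illustrative assumptions) computes a weight-norm measure and a rough Hessian-trace sharpness estimate for a randomly initialized shallow ReLU network.

# Illustrative sketch (not from the talk): for a shallow ReLU network
# f(x) = sum_k v_k * relu(<w_k, x>), compute
#   (1) a weight-norm measure, sum_k |v_k| * ||w_k||_2 (a "path norm"-style quantity),
#   (2) a crude flatness proxy: the trace of the loss Hessian, estimated by
#       finite-difference perturbations of the parameters.
# All names, data, and the specific flatness proxy are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def forward(X, W, v):
    # X: (n, d) inputs, W: (k, d) hidden weights, v: (k,) output weights
    return relu(X @ W.T) @ v

def mse(X, y, W, v):
    return np.mean((forward(X, W, v) - y) ** 2)

def weight_norm(W, v):
    # sum_k |v_k| * ||w_k||_2 -- the low-norm quantity discussed in the abstract
    return np.sum(np.abs(v) * np.linalg.norm(W, axis=1))

def hessian_trace_proxy(X, y, W, v, eps=1e-3, probes=20):
    # Hutchinson-style estimate of trace(H) via central finite differences:
    # for a random unit vector u, u^T H u ~ (L(t+eps*u) - 2 L(t) + L(t-eps*u)) / eps^2,
    # and E[u^T H u] over the unit sphere equals trace(H) / dim.
    theta = np.concatenate([W.ravel(), v])
    dim = theta.size
    L0 = mse(X, y, W, v)

    def loss_at(t):
        Wp = t[: W.size].reshape(W.shape)
        vp = t[W.size:]
        return mse(X, y, Wp, vp)

    est = 0.0
    for _ in range(probes):
        u = rng.standard_normal(dim)
        u /= np.linalg.norm(u)
        est += (loss_at(theta + eps * u) - 2 * L0 + loss_at(theta - eps * u)) / eps ** 2
    return est / probes * dim

# Toy data in dimension d; increasing d is one way to poke at the
# high-dimensional regime the abstract refers to.
d, n, k = 20, 200, 50
X = rng.standard_normal((n, d))
y = np.tanh(X[:, 0])  # simple one-directional target

W = rng.standard_normal((k, d)) / np.sqrt(d)
v = rng.standard_normal(k) / np.sqrt(k)

print("train MSE         :", mse(X, y, W, v))
print("weight norm       :", weight_norm(W, v))
print("Hessian-trace est.:", hessian_trace_proxy(X, y, W, v))

The finite-difference trace is only a coarse stand-in for the flatness notions analyzed in the talk, and the random initialization is not a trained minimum; the sketch is meant solely to make the two measurements concrete.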

Bio: Rahul Parhi is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of California, San Diego. Prior to joining UCSD, he was a Postdoctoral Researcher at the École Polytechnique Fédérale de Lausanne (EPFL), where he worked from 2022 to 2024. He completed his PhD in Electrical Engineering at the University of Wisconsin-Madison in 2022. His research interests lie at the interface between functional and harmonic analysis and data science.

 

Zoom link: https://gatech.zoom.us/j/93991261775?pwd=PCAJs5vTANpLGiCcKyyBR50hMuV432.1
Meeting ID: 939 9126 1775 
Passcode: 914902

Event Details

Date/Time:
Wednesday, September 10, 2025 - 12:00pm to 1:00pm

Location:
CODA Building, 9th Floor Atrium & Virtual
