
Simons Collaboration on the Physics of Learning and Neural Computation

Harnessing the fundamental sciences to break AI out of its black box

Our mission

Recent advances in artificial intelligence, including deep learning, large language models, and generative AI, stand poised to transform our economy, our society, and the very nature of scientific research itself. However, quite alarmingly, this rapid engineering progress far outstrips the rate at which we can scientifically understand it.

 

Our collaboration thus seeks to elucidate fundamental scientific principles underlying AI. To do so, we employ and develop powerful tools for complex systems analysis from physics, mathematics, computer science, neuroscience, and statistics to understand how large neural networks learn, compute, scale, reason, and imagine. By studying AI as a complex physical system, we aim to break AI out of its black box.


Indeed, ideas from the physics of complex systems have long played a profound role in the development and analysis of machine learning and neural computation, ranging from the Hopfield model and the Boltzmann machine (the 2024 Nobel Prize in Physics) and the understanding of optimization dynamics and geometry in high-dimensional disordered systems (the 2021 Nobel Prize in Physics) to more recent advances, such as the discovery and analysis of scaling laws and the inspiration that nonequilibrium statistical mechanics has provided for diffusion models in generative AI.


However, the highly performant AI systems of today open up entirely new opportunities for the concerted interaction of theory and experiment, both to advance the science of AI and to improve AI in a principled manner. In particular, we seek to understand how the structure of data, the architecture of neural networks, and the dynamics of learning all interact to give rise to the striking scaling properties and emergent capabilities of modern AI, as well as to its mysterious failures. We work across multiple domains, spanning visual perception, language understanding, reasoning, and creativity.


Florent Krzakala

How Do Neural Networks Learn Simple Functions with Gradient Descent?

Feb 13, 2025

In this talk, I will review the mechanisms by which two-layer neural networks can learn simple high-dimensional functions from data over time. We will focus on the intricate interplay between algorithms, the number of iterations, and the complexity of the task at hand, and on how gradient descent and stochastic gradient descent can learn features of the target function and improve generalization over random initialization and kernels. I will also illustrate how ideas and methods at the intersection of high-dimensional probability and statistical physics provide fresh perspectives on these questions.
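The setting the abstract describes can be illustrated with a minimal, self-contained sketch (all parameter choices here are hypothetical, not taken from the talk): plain SGD trains a two-layer ReLU network on a single-index target y = relu(⟨w*, x⟩), a typical example of a "simple high-dimensional function", and the test error drops well below its value at random initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 20-dimensional inputs, 100 hidden units.
d, m, n_train, n_test = 20, 100, 2000, 500

relu = lambda z: np.maximum(z, 0.0)

# Single-index target: a simple high-dimensional function y = relu(<w_star, x>).
w_star = rng.standard_normal(d) / np.sqrt(d)
target = lambda X: relu(X @ w_star)

X_train = rng.standard_normal((n_train, d))
y_train = target(X_train)
X_test = rng.standard_normal((n_test, d))
y_test = target(X_test)

# Two-layer network f(x) = a . relu(W x), randomly initialized.
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m) / np.sqrt(m)

def mse(X, y):
    return float(np.mean((relu(X @ W.T) @ a - y) ** 2))

loss_init = mse(X_test, y_test)

# Plain SGD on the squared loss, one sample at a time.
lr, epochs = 0.01, 20
for _ in range(epochs):
    for i in rng.permutation(n_train):
        x, y = X_train[i], y_train[i]
        h = relu(W @ x)                               # hidden activations
        err = h @ a - y                               # prediction error
        grad_a = err * h                              # gradient w.r.t. readout
        grad_W = err * np.outer(a * (W @ x > 0), x)   # backprop through relu
        a -= lr * grad_a
        W -= lr * grad_W

loss_final = mse(X_test, y_test)
print(f"test MSE: init {loss_init:.3f} -> trained {loss_final:.3f}")
```

Comparing `loss_final` against `loss_init` is a crude stand-in for the comparisons in the talk between trained networks, random initialization, and kernel methods; the latter would additionally require fitting only the readout layer `a` with `W` frozen.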
