Publications

Publications in reverse chronological order. Please check my Google Scholar for an up-to-date list.

2025

Artificial models of biological learning: exploring recurrence and data diversity

Nicolas Zucchet

PhD Thesis, 2025

PDF

Teaching signal synchronization in deep neural networks with prospective neurons

Nicolas Zucchet, Qianqian Feng, Axel Laborieux, Friedemann Zenke, Walter Senn, and João Sacramento

arXiv preprint arXiv:2511.14917, 2025

PDF Thread

The emergence of sparse attention: impact of data distribution and benefits of repetition

Nicolas Zucchet, Francesco d’Angelo, Andrew Lampinen, and Stephanie Chan

Advances in Neural Information Processing Systems, 2025

Awarded PDF Thread Video Code Poster

Oral (77 out of 21575 submitted papers)

How do language models learn facts? Dynamics, curricula and hallucinations

Nicolas Zucchet, Jörg Bornschein, Stephanie Chan, Andrew Lampinen, Razvan Pascanu, and Soham De

Conference on Language Modeling, 2025

Awarded PDF Thread Poster

Oral (24 out of 1305 submitted papers)

2024

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

Nicolas Zucchet and Antonio Orvieto

Advances in Neural Information Processing Systems, 2024

PDF Thread Code Poster

2023

Gated recurrent neural networks discover attention

Nicolas Zucchet^*, Seijin Kobayashi^*, Yassir Akram^*, Johannes Oswald, Maxime Larcher, Angelika Steger, and João Sacramento

arXiv preprint arXiv:2309.01775, 2023

PDF Thread

Uncovering mesa-optimization algorithms in transformers

Johannes Oswald^*, Eyvind Niklasson^*, Maximilian Schlegel^*, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Miller, Mark Sandler, Max Vladymyrov, Razvan Pascanu, and João Sacramento

arXiv preprint arXiv:2309.05858, 2023

PDF Thread

Online learning of long-range dependencies

Nicolas Zucchet^*, Robert Meier^*, Simon Schug^*, Asier Mujika, and João Sacramento

Advances in Neural Information Processing Systems, 2023

PDF Thread Code Poster

2022

The least-control principle for local learning at equilibrium

Alexander Meulemans^*, Nicolas Zucchet^*, Seijin Kobayashi^*, Johannes Oswald, and João Sacramento

Advances in Neural Information Processing Systems, 2022

Awarded PDF Thread Code Poster

Oral and pre-selected for best paper award.

Random initialisations performing above chance and how to find them

Frederik Benzing, Simon Schug, Robert Meier, Johannes Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, and Angelika Steger

Annual Workshop on Optimization for Machine Learning, 2022

PDF Thread

A contrastive rule for meta-learning

Nicolas Zucchet^*, Simon Schug^*, Johannes Oswald^*, Dominic Zhao, and João Sacramento

Advances in Neural Information Processing Systems, 2022

PDF Thread Poster

Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation

Nicolas Zucchet and João Sacramento

Neural Computation, 2022

PDF

2021

Learning where to learn: Gradient sparsity in meta and continual learning

Johannes Oswald^*, Dominic Zhao^*, Seijin Kobayashi, Simon Schug, Massimo Caccia, Nicolas Zucchet, and João Sacramento

Advances in Neural Information Processing Systems, 2021

PDF