Length generalization in arithmetic transformers

End-to-end symbolic regression with transformers

Space is curved: what does that mean?

La Conversation Scientifique with Etienne Klein, France Culture

The Big Bang and black holes

Minute papillon with Sidonie Bonnec, France Bleu

On the interplay between data structure and loss function in classification problems

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

Bias and Variances in the double descent curve

Reconciling the Double Descent curve with older ideas

Scaling description of generalization with number of parameters in deep learning

A jamming transition from under- to over-parametrization affects generalization in deep learning