Giulio Biroli

Optimal learning rate schedules in high-dimensional non-convex optimization problems
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
On the interplay between data structure and loss function in classification problems
Transformed CNNs: recasting pre-trained convolutional layers with self-attention
Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Scaling description of generalization with number of parameters in deep learning
Triple descent and the two kinds of overfitting: where and why do they appear?
Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias
Jamming transition as a paradigm to understand the loss landscape of deep neural networks