Double descent
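In statistics and machine learning, double descent is the phenomenon in which a model with a small number of parameters and a model with an extremely large number of parameters can both have a small test error, while a model whose number of parameters is about the same as the number of data points used to train the model will have a large error.[2]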
History
Early observations of double descent in specific models date back to 1989.
Theoretical models
Double descent has been shown to occur in linear regression with isotropic Gaussian covariates and isotropic Gaussian noise.[9]
A model of double descent at the thermodynamic limit has been analyzed by the replica method, and the result has been confirmed numerically.[10]
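The linear regression setting described above can be simulated in a few lines. The following is a minimal sketch, not the cited authors' code: it assumes a fixed feature dimension `d`, a unit-norm true coefficient vector, noise level `sigma`, and the minimum-norm least-squares estimator computed with the pseudoinverse. The function names `excess_risk` and `risk_curve` and all parameter values are illustrative choices. Because the covariates are isotropic, the excess test risk equals the squared distance between the estimated and true coefficients.

```python
import numpy as np


def excess_risk(n, d, sigma, rng):
    """Excess test risk of minimum-norm least squares for one random draw.

    Assumed setup (illustrative): covariates x ~ N(0, I_d), noise ~ N(0, sigma^2),
    and a true coefficient vector of unit norm.
    """
    beta = np.ones(d) / np.sqrt(d)            # true coefficients, ||beta|| = 1
    X = rng.standard_normal((n, d))           # isotropic Gaussian covariates
    y = X @ beta + sigma * rng.standard_normal(n)
    beta_hat = np.linalg.pinv(X) @ y          # minimum-norm least-squares solution
    # For isotropic covariates, E[(x^T (beta_hat - beta))^2] = ||beta_hat - beta||^2.
    return float(np.sum((beta_hat - beta) ** 2))


def risk_curve(sample_sizes, d=50, sigma=0.5, trials=50, seed=0):
    """Average excess risk over several trials for each number of samples n."""
    rng = np.random.default_rng(seed)
    return {
        n: float(np.mean([excess_risk(n, d, sigma, rng) for _ in range(trials)]))
        for n in sample_sizes
    }


if __name__ == "__main__":
    # The risk is expected to peak where the number of samples n is close to d.
    for n, risk in risk_curve([10, 25, 40, 50, 60, 100, 200]).items():
        print(f"n = {n:4d}   average excess risk = {risk:.3f}")
```

With these settings the average risk should rise as the number of samples approaches the number of parameters (n = d = 50) and fall again on either side, which is the sample-wise form of the double descent curve.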
Empirical examples
The scaling behavior of double descent has been found to follow the functional form of a broken neural scaling law.[11]
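For orientation, the functional form proposed in the broken neural scaling laws paper can be written as below. This is a sketch of the notation as it is commonly stated, and it should be checked against the cited source: x is the quantity being scaled (for example the number of parameters or the dataset size), y is the evaluation metric, n is the number of breaks, and a, b, c_i, d_i, f_i are fitted constants.

```latex
y = a + \left( b\, x^{-c_0} \right) \prod_{i=1}^{n} \left( 1 + \left( \frac{x}{d_i} \right)^{1/f_i} \right)^{-c_i f_i}
```

Each factor in the product introduces one smooth change of slope on a log-log plot, located near x = d_i. This is what allows a single curve to describe a non-monotonic shape such as the one seen in double descent.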
References
- arXiv:2303.14151v1 [cs.LG].
- "Deep Double Descent". OpenAI. 2019-12-05. Retrieved 2022-08-12.
- ISSN 0295-5075.
- PMID 32371495.
- PMID 31341078.
- ISSN 0162-8828.
- Eric (2023-01-10). "The bias-variance tradeoff is not a statistical concept". Eric J. Wang. Retrieved 2024-01-05.
- S2CID 207808916.
- Nakkiran, Preetum (2019-12-16). "More Data Can Hurt for Linear Regression: Sample-wise Double Descent". arXiv.org. Retrieved 2024-04-18.
- PMC 7685244.
- Caballero, Ethan; Gupta, Kshitij; Rish, Irina; Krueger, David (2022). "Broken Neural Scaling Laws". International Conference on Learning Representations (ICLR), 2023.
Further reading
- Mikhail Belkin; Daniel Hsu; Ji Xu (2020). "Two Models of Double Descent for Weak Features".
- Mount, John (3 April 2024). "The m = n Machine Learning Anomaly".
- Preetum Nakkiran; Gal Kaplun; Yamini Bansal; Tristan Yang; Boaz Barak; Ilya Sutskever (29 December 2021). "Deep double descent: where bigger models and more data hurt". S2CID 207808916.
- Song Mei; Andrea Montanari (April 2022). "The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve". S2CID 199668852.
- Xiangyu Chang; Yingcong Li; Samet Oymak; Christos Thrampoulidis (2021). "Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks". Proceedings of the AAAI Conference on Artificial Intelligence. 35 (8). arXiv:2012.08749.
External links
- Brent Werness; Jared Wilber. "Double Descent: Part 1: A Visual Introduction".
- Brent Werness; Jared Wilber. "Double Descent: Part 2: A Mathematical Explanation".
- Understanding "Deep Double Descent" at evhub.