Ethan Caballero–Scale is All You Need

The Inside View

Content provided by Michaël Trazzi.

2+ y ago 51:54

Ethan is known on Twitter as the edgiest person at MILA. We discuss all the gossip around scaling large language models in what will later be known as the Edward Snowden moment of Deep Learning. In his free time, Ethan is a Master’s degree student at MILA in Montreal, and has published papers on out-of-distribution generalization and robustness generalization, accepted both as oral presentations and spotlight presentations at ICML and NeurIPS. Ethan has recently been thinking about scaling laws, both as an organizer and speaker for the 1st Neural Scaling Laws Workshop.

Transcript: https://theinsideview.github.io/ethan

YouTube: https://youtu.be/UPlv-lFWITI

Michaël: https://twitter.com/MichaelTrazzi

Ethan: https://twitter.com/ethancaballero

Outline

(00:00) highlights

(00:50) who is Ethan, scaling laws T-shirts

(02:30) scaling, upstream, downstream, alignment and AGI

(05:58) AI timelines, AlphaCode, Math scaling, PaLM

(07:56) Chinchilla scaling laws

(11:22) limits of scaling, Copilot, generative coding, code data

(15:50) YouTube scaling laws, contrastive type thing

(20:55) AGI race, funding, supercomputers

(24:00) Scaling at Google

(25:10) gossip, private research, GPT-4

(27:40) why Ethan did not update on PaLM, hardware bottleneck

(29:56) the fastest path, the best funding model for supercomputers

(31:14) EA, OpenAI, Anthropic, publishing research, GPT-4

(33:45) a zillion language model startups from ex-Googlers

(38:07) Ethan's journey in scaling, early days

(40:08) making progress on an academic budget, scaling laws research

(41:22) all alignment is inverse scaling problems

(45:16) predicting scaling laws, useful AI alignment research

(47:16) nitpicks about Ajeya Cotra's report, compute trends

(50:45) optimism, conclusion on alignment
