Content provided by LessWrong. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by LessWrong or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://da.player.fm/legal.
“o3” by Zach Stein-Perlman
I'm editing this post.
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
December 20th, 2024
Source:
https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3
---
Narrated by TYPE III AUDIO.