Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs")
This episode covers sections 2.2.4.1-2.2.4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Chapters
1. Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") (00:00:00)
2. 2.2.4 What if you intentionally train models to have long-term goals? (00:00:38)
3. 2.2.4.1 Training the model on long episodes (00:01:23)
4. 2.2.4.2 Using short episodes to train a model to pursue long-term goals (00:04:33)
63 episodes
All episodes


When should we worry about AI power-seeking? (46:54)
What is it to solve the alignment problem? (40:13)
Fake thinking and real thinking (1:18:47)
Takes on "Alignment Faking in Large Language Models" (1:27:54)
(Part 2, AI takeover) Extended audio from my conversation with Dwarkesh Patel (2:07:33)
(Part 1, Otherness) Extended audio from my conversation with Dwarkesh Patel (3:58:38)
Introduction and summary for "Otherness and control in the age of AGI" (12:23)
Second half of full audio for "Otherness and control in the age of AGI" (4:11:02)
First half of full audio for "Otherness and control in the age of AGI" (3:07:29)
Loving a world you don't trust (1:03:54)