Best LessWrong Podcasts (2024)

“AIs Will Increasingly Attempt Shenanigans” by Zvi 51:06

1d ago

Increasingly, we have seen papers eliciting in AI models various shenanigans. There are a wide variety of scheming behaviors. You’ve got your weight exfiltration attempts, sandbagging on evaluations, giving bad information, shielding goals from modification, subverting tests and oversight, lying, doubling down via more lying. You name it, we can tr…

“Alignment Faking in Large Language Models” by ryan_greenblatt, evhub, Carson Denison, Benjamin Wright, Fabien Roger, Monte M, Sam Marks, Johannes Treutlein, Sam Bowman, Buck 19:35

1d ago

What happens when you tell Claude it is being trained to do something it doesn't want to do? We (Anthropic and Redwood Research) have a new paper demonstrating that, in our experiments, Claude will often strategically pretend to comply with the training objective to prevent the training process from modifying its preferences. Abstract: We present a …

“Communications in Hard Mode (My new job at MIRI)” by tanagrabeast 10:24

5d ago

Six months ago, I was a high school English teacher. I wasn’t looking to change careers, even after nineteen sometimes-difficult years. I was good at it. I enjoyed it. After long experimentation, I had found ways to cut through the nonsense and provide real value to my students. Daily, I met my nemesis, Apathy, in glorious battle, and bested her wi…

“Biological risk from the mirror world” by jasoncrawford 14:01

7d ago

A new article in Science Policy Forum voices concern about a particular line of biological research which, if successful in the long term, could eventually create a grave threat to humanity and to most life on Earth. Fortunately, the threat is distant, and avoidable—but only if we have common knowledge of it. What follows is an explanation of the t…

“Subskills of ‘Listening to Wisdom’” by Raemon 1:13:47

7d ago

“A fool learns from their own mistakes. The wise learn from the mistakes of others.” – Otto von Bismarck. A problem as old as time: The youth won't listen to your hard-earned wisdom. This post is about learning to listen to, and communicate, wisdom. It is very long – I considered breaking it up into a sequence, but each piece felt necessary. I recommend…

“Understanding Shapley Values with Venn Diagrams” by Carson L 7:46

7d ago

Someone I know, Carson Loughridge, wrote this very nice post explaining the core intuition around Shapley values (which play an important role in impact assessment and cooperative games) using Venn diagrams, and I think it's great. It might be the most intuitive explainer I've come across so far. Incidentally, the post also won an honorable mention…

“LessWrong audio: help us choose the new voice” by PeterH 1:43

7d ago

We make AI narrations of LessWrong posts available via our audio player and podcast feeds. We’re thinking about changing our narrator's voice. There are three new voices on the shortlist. They’re all similarly good in terms of comprehension, emphasis, error rate, etc. They just sound different—like people do. We think they all sound similarly agree…

“Understanding Shapley Values with Venn Diagrams” by agucova 0:45

9d ago

This is a link post. Someone I know wrote this very nice post explaining the core intuition around Shapley values (which play an important role in impact assessment) using Venn diagrams, and I think it's great. It might be the most intuitive explainer I've come across so far. Incidentally, the post also won an honorable mention in 3blue1brown's Sum…

“o1: A Technical Primer” by Jesse Hoogland 18:45

9d ago

TL;DR: In September 2024, OpenAI released o1, its first "reasoning model". This model exhibits remarkable test-time scaling laws, which complete a missing piece of the Bitter Lesson and open up a new axis for scaling compute. Following Rush and Ritter (2024) and Brown (2024a, 2024b), I explore four hypotheses for how o1 works and discuss some impli…

“Gradient Routing: Masking Gradients to Localize Computation in Neural Networks” by cloud, Jacob G-W, Evzen, Joseph Miller, TurnTrout 25:15

11d ago

We present gradient routing, a way of controlling where learning happens in neural networks. Gradient routing applies masks to limit the flow of gradients during backpropagation. By supplying different masks for different data points, the user can induce specialized subcomponents within a model. We think gradient routing has the potential to train …

“Frontier Models are Capable of In-context Scheming” by Marius Hobbhahn, AlexMeinke, Bronson Schoen 14:46

14d ago

This is a brief summary of what we believe to be the most important takeaways from our new paper and from our findings shown in the o1 system card. We also specifically clarify what we think we did NOT show. Paper: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations Twitter about paper: https://x.com/apolloaisafety/status/18647358…

“(The) Lightcone is nothing without its people: LW + Lighthaven’s first big fundraiser” by habryka 1:03:15

20d ago

TLDR: LessWrong + Lighthaven need about $3M for the next 12 months. Donate here, or send me an email, DM or signal message (+1 510 944 3235) if you want to support what we do. Donations are tax-deductible in the US. Reach out for other countries, we can likely figure something out. We have big plans for the next year, and due to a shifting funding …

“Repeal the Jones Act of 1920” by Zvi 1:13:53

20d ago

Balsa Policy Institute chose as its first mission to lay groundwork for the potential repeal, or partial repeal, of section 27 of the Jones Act of 1920. I believe that this is an important cause both for its practical and symbolic impacts. The Jones Act is the ultimate embodiment of our failures as a nation. After 100 years, we do almost no trade b…

“China Hawks are Manufacturing an AI Arms Race” by garrison 10:11

21d ago

This is the full text of a post from "The Obsolete Newsletter," a Substack that I write about the intersection of capitalism, geopolitics, and artificial intelligence. I’m a freelance journalist and the author of a forthcoming book called Obsolete: Power, Profit, and the Race for Machine Superintelligence. Consider subscribing to stay up to date wi…

“Information vs Assurance” by johnswentworth 4:31

23d ago

In contract law, there's this thing called a “representation”. Example: as part of a contract to sell my house, I might “represent that” the house contains no asbestos. How is this different from me just, y’know, telling someone that the house contains no asbestos? Well, if it later turns out that the house does contain asbestos, I’ll be liable for…

“You are not too ‘irrational’ to know your preferences.” by DaystarEld 23:36

23d ago

Epistemic Status: 13 years working as a therapist for a wide variety of populations, 5 of them working with rationalists and EA clients. 7 years teaching and directing at over 20 rationality camps and workshops. This is an extremely short and colloquially written form of points that could be expanded on to fill a book, and there is plenty of nuance…

“‘The Solomonoff Prior is Malign’ is a special case of a simpler argument” by David Matolcsi 21:02

25d ago

[Warning: This post is probably only worth reading if you already have opinions on Solomonoff induction being malign, or have at least heard of the concept and want to understand it better.] Introduction: I recently reread the classic argument from Paul Christiano about the Solomonoff prior being malign, and Mark Xu's write-up on it. I believe that t…

“‘It’s a 10% chance which I did 10 times, so it should be 100%’” by egor.timatkov 4:58

30d ago

Audio note: this article contains 33 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description. Many of you readers may instinctively know that this is wrong. If you flip a coin (50% chance) twice, you are not guaranteed to get heads. The odds of getting at least one heads are 75%. Howe…

“OpenAI Email Archives” by habryka 1:03:06

1M ago

As part of the court case between Elon Musk and Sam Altman, a substantial number of emails between Elon, Sam Altman, Ilya Sutskever, and Greg Brockman have been released as part of the court proceedings. I have found reading through these really valuable, and I haven't found an online source that compiles all of them in an easy to read format. So I…

“Ayn Rand’s model of ‘living money’; and an upside of burnout” by AnnaSalamon 9:02

1M ago

Epistemic status: Toy model. Oversimplified, but has been anecdotally useful to at least a couple people, and I like it as a metaphor. Introduction: I’d like to share a toy model of willpower: your psyche's conscious verbal planner “earns” willpower (earns a certain amount of trust with the rest of your psyche) by choosing actions that nourish your …

“Neutrality” by sarahconstantin 24:08

1M ago

[Image: Midjourney, “infinite library”] I’ve had post-election thoughts percolating, and the sense that I wanted to synthesize something about this moment, but politics per se is not really my beat. This is about as close as I want to come to the topic, and it's a sidelong thing, but I think the time is right. It's time to start thinking again about neutrali…

“Making a conservative case for alignment” by Cameron Berg, Judd Rosenblatt, phgubbins, AE Studio 14:20

1M ago

Trump and the Republican party will wield broad governmental control during what will almost certainly be a critical period for AGI development. In this post, we want to briefly share various frames and ideas we’ve been thinking through and actively pitching to Republican lawmakers over the past months in preparation for this possibility. Why are w…

“OpenAI Email Archives (from Musk v. Altman)” by habryka 1:03:44

1M ago

As part of the court case between Elon Musk and Sam Altman, a substantial number of emails between Elon, Sam Altman, Ilya Sutskever, and Greg Brockman have been released as part of the court proceedings. I have found reading through these really valuable, and I haven't found an online source that compiles all of them in an easy to read format. So I…

“Catastrophic sabotage as a major threat model for human-level AI systems” by evhub 27:19

1M ago

Thanks to Holden Karnofsky, David Duvenaud, and Kate Woolverton for useful discussions and feedback. Following up on our recent “Sabotage Evaluations for Frontier Models” paper, I wanted to share more of my personal thoughts on why I think catastrophic sabotage is important and why I care about it as a threat model. Note that this isn’t in any way …

“The Online Sports Gambling Experiment Has Failed” by Zvi 22:11

1M ago

Related: Book Review: On the Edge: The Gamblers. I have previously been heavily involved in sports betting. That world was very good to me. The times were good, as were the profits. It was a skill game, and a form of positive-sum entertainment, and I was happy to participate and help ensure the sophisticated customer got a high quality product. I kne…

Podcasts worth listening to

LessWrong Podcasts

LessWrong (Curated & Popular)

LessWrong

