What’s the Magic Word? A Control Theory of LLM Prompting.

Machine Learning Street Talk (MLST)


About a year ago · 1:17:07


These two scientists have mapped out the insides, or “reachable space”, of a language model using control theory, and what they discovered was extremely surprising.

Please support us on Patreon to get access to the private Discord server, bi-weekly calls, early access and ad-free listening.

https://patreon.com/mlst

YT version: https://youtu.be/Bpgloy1dDn0

Aman Bhargava from Caltech and Cameron Witkowski from the University of Toronto join us to discuss their groundbreaking paper, “What’s the Magic Word? A Control Theory of LLM Prompting.” (The main theorem on self-attention controllability was developed in collaboration with Dr. Shi-Zhuo Looi from Caltech.)

They frame LLM systems as discrete stochastic dynamical systems, which lets them analyze an LLM in a structured way, similar to how control systems are analyzed in engineering. They explore the “reachable set” of outputs for an LLM: the set of outputs the model can be steered to generate from a given starting point as the prompt is varied. The research highlights that prompt engineering, that is, optimizing the input tokens, can significantly influence LLM outputs, and that even short prompts can drastically alter the likelihood of specific outputs. Aman and Cameron suggest that a deeper exploration of control theory concepts could lead to more reliable and capable language models.
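As a rough illustration of the reachable-set idea, here is a minimal Python sketch. It is not the method from the paper: `toy_next_token` is a hypothetical, deterministic stand-in for an LLM's most likely next token, and `VOCAB` is a made-up three-token vocabulary. The sketch simply brute-forces every control prompt up to a given length, prepends it to a fixed starting state, and collects the outputs that result.

```python
from itertools import product

# Hypothetical three-token vocabulary (an assumption for illustration only).
VOCAB = ["a", "b", "c"]


def toy_next_token(tokens):
    """Toy stand-in for an LLM's greedy next-token choice.

    The output is determined by the sum of the token indices modulo the
    vocabulary size. A real LLM would be queried here instead.
    """
    idx = sum(VOCAB.index(t) for t in tokens) % len(VOCAB)
    return VOCAB[idx]


def reachable_set(state, max_prompt_len):
    """Collect every output reachable from `state` by prepending a control
    prompt of at most `max_prompt_len` tokens (exhaustive search)."""
    reachable = set()
    for k in range(max_prompt_len + 1):
        for prompt in product(VOCAB, repeat=k):
            reachable.add(toy_next_token(list(prompt) + list(state)))
    return reachable


if __name__ == "__main__":
    state = ["b"]
    print(sorted(reachable_set(state, 0)))  # no control prompt
    print(sorted(reachable_set(state, 1)))  # one control token allowed
```

In this toy, only "b" is reachable from the state ["b"] without a control prompt, while a single control token already reaches every token in the vocabulary. Exhaustive search like this grows exponentially with prompt length, so it is only workable for tiny vocabularies and very short prompts.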

We dropped an additional, more technical video on the research on our Twitter account here: https://x.com/MLStreetTalk/status/1795093759471890606

Additional 20 minutes of unreleased footage on our Patreon here: https://www.patreon.com/posts/whats-magic-word-104922629

What's the Magic Word? A Control Theory of LLM Prompting (Aman Bhargava, Cameron Witkowski, Manav Shah, Matt Thomson)

https://arxiv.org/abs/2310.04444

LLM Control Theory Seminar (April 2024)

https://www.youtube.com/watch?v=9QtS9sVBFM0

Society for the pursuit of AGI (Cameron founded it)

https://agisociety.mydurable.com/

Roger Federer demo

http://conway.languagegame.io/inference

Neural Cellular Automata, Active Inference, and the Mystery of Biological Computation (Aman)

https://aman-bhargava.com/ai/neuro/neuromorphic/2024/03/25/nca-do-active-inference.html

Aman and Cameron also want to thank Dr. Shi-Zhuo Looi and Prof. Matt Thomson from Caltech for help and advice on their research (https://thomsonlab.caltech.edu/ and https://pma.caltech.edu/people/looi-shi-zhuo).

https://x.com/ABhargava2000

https://x.com/witkowski_cam



