Content provided by Zeta Alpha. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Zeta Alpha or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://da.player.fm/legal.

Transformer Memory as a Differentiable Search Index: memorizing thousands of random doc ids works!?

1:01:40
 

Andrew Yates and Sergi Castella discuss the paper "Transformer Memory as a Differentiable Search Index" by Yi Tay et al. at Google. This work proposes a new approach to document retrieval in which a transformer memorizes document ids during training (or "indexing"). For retrieval, a query is fed to the model, which then autoregressively generates the doc ids relevant to that query.
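To make the indexing/retrieval framing concrete, here is a minimal sketch of how both tasks can be cast as text-to-docid training pairs for one sequence-to-sequence model. It only builds the examples and trains nothing. The names `Seq2SeqExample`, `indexing_examples`, and `retrieval_examples`, the toy corpus, and the docid strings are all hypothetical illustrations, not taken from the paper or the episode.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Seq2SeqExample:
    source: str  # text fed to the model
    target: str  # docid string the model should generate


def indexing_examples(corpus: dict[str, str]) -> list[Seq2SeqExample]:
    """Indexing task: document text -> docid (the model memorizes the mapping)."""
    return [Seq2SeqExample(source=text, target=docid) for docid, text in corpus.items()]


def retrieval_examples(labeled_queries: list[tuple[str, str]]) -> list[Seq2SeqExample]:
    """Retrieval task: query -> docid of a relevant document."""
    return [Seq2SeqExample(source=query, target=docid) for query, docid in labeled_queries]


# Toy data (assumed for illustration): docids are arbitrary strings.
corpus = {
    "1042": "Transformers use self-attention to model token interactions.",
    "877": "An inverted index maps terms to the documents containing them.",
}
labeled_queries = [
    ("how does self-attention work", "1042"),
    ("what is an inverted index", "877"),
]

# Both tasks share one target space (docids), so they can be mixed
# into a single training set for the same model.
examples = indexing_examples(corpus) + retrieval_examples(labeled_queries)
targets = sorted({ex.target for ex in examples})
```

At inference time, a trained model would receive only the query text and decode a docid token by token. This sketch stops at data preparation and makes no claim about the authors' actual pipeline.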

Paper: https://arxiv.org/abs/2202.06991

Timestamps:

00:00 Intro: Transformer memory as a Differentiable Search Index (DSI)

01:15 The gist of the paper, motivation

04:20 Related work: Autoregressive Entity Linking

07:38 What is an index? Conventional vs. "differentiable"

10:20 Indexing and Retrieval definitions in the context of the DSI

12:40 Learning representations for documents

17:20 How to represent document ids: atomic, string, semantically relevant

22:00 Zero-shot vs. finetuned settings

24:10 Datasets and baselines

27:08 Finetuned results

36:40 Zero-shot results

43:50 Ablation results

47:15 Where could this model be used?

52:00 Is memory efficiency a fundamental problem of this approach?

55:14 What about semantically relevant doc ids?

60:30 Closing remarks

Contact: castella@zeta-alpha.com


21 episodes
