[QA] FERRET: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
FERRET enhances adversarial prompt generation for large language models, improving attack success rates and efficiency over RAINBOW TEAMING while ensuring effective prompts across various model sizes.
https://arxiv.org/abs/2408.10701
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1633 episodes