Prompt Refusal

Data Skeptic

Jul 24 2023 • 44 mins

The creators of large language models impose restrictions on some of the types of requests one might make of them.  LLMs commonly refuse to give advice on committing crimes, to produce adult content, or to respond with details about a variety of sensitive subjects.  As with any content filtering system, there are false positives and false negatives.

Today's interview with Max Reuter and William Schulze discusses their paper "I'm Afraid I Can't Do That: Predicting Prompt Refusal in Black-Box Generative Language Models".  In this work, they explore what types of prompts get refused and build a machine learning classifier adept at predicting whether a particular prompt will be refused.
