AI ALIGNMENT FORUM
Prompt Engineering
• Applied to Using ideologically-charged language to get gpt-3.5-turbo to disobey it's system prompt: a demo by Milan Weibel 4mo ago
• Applied to Please Understand by Sam Healy 8mo ago
• Applied to LLM keys - A Proposal of a Solution to Prompt Injection Attacks by Peter Hroššo 1y ago
• Applied to Extrapolating from Five Words by Gordon Seidoh Worley 1y ago
• Applied to The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs by Quentin Feuillade--Montixi 1y ago
• Applied to Chess as a case study in hidden capabilities in ChatGPT by Tobias D. 1y ago
• Applied to MetaAI: less is less for alignment. by Cleo Nardo 1y ago
• Applied to Tutor-GPT & Pedagogical Reasoning by Courtland Leer 2y ago
• Applied to $300 for the best sci-fi prompt by RomanS 2y ago
• Applied to DELBERTing as an Adversarial Strategy by Matthew_Opitz 2y ago
• Applied to Readability is mostly a waste of characters by Vlad Gheorghe 2y ago
• Applied to LW is probably not the place for "I asked this LLM (x) and here's what it said!", but where is? by lillybaeum 2y ago
• Applied to You can use GPT-4 to create prompt injections against GPT-4 by WitchBOT 2y ago
• Applied to Hutter-Prize for Prompts by rokosbasilisk 2y ago
• Applied to Remarks 1–18 on GPT (compressed) by Cleo Nardo 2y ago
• Applied to Are nested jailbreaks inevitable? by Andrew Judson 2y ago
• Applied to Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers. by Cleo Nardo 2y ago
• Applied to The Waluigi Effect (mega-post) by Cleo Nardo 2y ago