While you can make a lot of progress in evals by tinkering and paying little attention to the literature, we found that various papers have saved us many months of research effort. The Apollo Research evals team thus compiled a list of what we felt were important evals-related papers. We have likely missed some relevant papers, and our recommendations reflect our personal opinions.
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
Core:
Other:
RLHF
Core:
Other:
Supervised Finetuning/Training & Prompting
Core:
Other:
Core:
Other:
The first draft of the list combined several reading lists that Marius Hobbhahn and Jérémy Scheurer had previously written. Marius wrote most of the final draft with detailed input from Jérémy and high-level input from Mikita Balesni, Rusheb Shah, and Alex Meinke.