Sandbagging (AI)

Written by Raymond Arnold

Sandbagging is when an AI system deliberately underperforms during training or evaluation, pretending to be less capable than it actually is.