Penalize Model Complexity Via Self-Distillation — AI Alignment Forum