dil-leik-og

The post's claim that validation-only approaches are fundamentally better than training-with-validation oversimplifies a complex reality. Both approaches modify the distribution of models - neither preserves some "pure" average case. Our base training objective may already have some correlation with our validation signal, and there's nothing special about maintaining this arbitrary starting point. Sometimes we should increase correlation between training and validation, sometimes decrease it, depending on the specific relationship between our objective and validator. What matters is understanding how correlation affects both P(aligned) and P(pass|misaligned), weighing the tradeoffs, and optimizing within our practical retraining budget (because often, increasing P(aligned|pass) will also decrease P(pass)).
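
To make the tradeoff concrete, here is a toy Bayes calculation (all numbers invented for illustration, and the direction of the effect can flip with different numbers, which is exactly the "depends on the specifics" point):

```python
def p_aligned_given_pass(p_aligned, p_pass_given_aligned, p_pass_given_misaligned):
    """Bayes' rule for P(aligned | pass validation)."""
    p_misaligned = 1.0 - p_aligned
    num = p_pass_given_aligned * p_aligned
    return num / (num + p_pass_given_misaligned * p_misaligned)

# Validation-only: training objective uncorrelated with the validator.
baseline = p_aligned_given_pass(p_aligned=0.2,
                                p_pass_given_aligned=0.9,
                                p_pass_given_misaligned=0.1)

# Training against a signal correlated with the validator: P(aligned) goes up,
# but misaligned models also become likelier to pass, so the posterior can drop.
trained = p_aligned_given_pass(p_aligned=0.4,
                               p_pass_given_aligned=0.95,
                               p_pass_given_misaligned=0.4)

print(f"baseline P(aligned | pass) ~ {baseline:.2f}")  # ~0.69
print(f"trained  P(aligned | pass) ~ {trained:.2f}")   # ~0.61
```

With other numbers the trained posterior comes out ahead instead, and pushing P(aligned|pass) up generally costs you P(pass), which is where the retraining budget bites.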

Thank you, I will look into that. I intuitively expect that in the setting where compute is precisely 0 cost, you can always just convert multiplicity to negative length by building an iterate/sort/index loop around the bit segment where the multiplicity lies, and this just costs you the length of the iterate/sort/index loop (a constant that depends on your language). I also intuitively expect this to break in the infinite-bitstring setting, because you can have multiplicity that isn't contained in a finite substring?
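
A toy sketch of the construction I have in mind (Python; the names and numbers are made up, constants are ignored, and this only illustrates the finite-bitstring case):

```python
from collections import Counter
from math import ceil, log2

K = 8  # width of the k-bit segment the multiplicity lives in

def body(segment_bits: int) -> str:
    """Stand-in for the rest of the program: most settings of the segment
    collapse to the same output, so that output has high multiplicity."""
    return "common-output" if segment_bits % 4 != 0 else f"rare-{segment_bits}"

# Iterate: run the body on every setting of the segment and count outputs.
counts = Counter(body(s) for s in range(2 ** K))

# Sort: rank outputs by multiplicity (descending), breaking ties
# lexicographically so the ranking is computable from the program alone.
ranking = sorted(counts, key=lambda out: (-counts[out], out))

# Index: an output of multiplicity m sits within the first 2^K / m entries,
# so pointing at it costs roughly K - log2(m) bits instead of the K bits
# needed to hard-code a particular segment value.
for rank, out in enumerate(ranking[:3]):
    m = counts[out]
    index_bits = ceil(log2(rank + 1))
    print(f"{out}: multiplicity {m}, index costs ~{index_bits} bits "
          f"(vs {K} bits to hard-code the segment)")
```

The wrapper (the iterate/sort/index loop) costs a language-dependent constant, so in the finite setting a multiplicity of m converts to roughly a log2(m)-bit saving; the worry in the infinite-bitstring setting is that the segment to iterate over may not be finite, so this loop can't be written down.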

I was not able, on a quick skim of the PDF, to identify which passage you were referring to. If possible, can you point me to an example of Temperature 0 in the textbook?