All of Rudi's Comments + Replies

Thanks a lot for the context!

Out of curiosity, why does the model training restriction make it much less useful for safety research?

4Zac Hatfield-Dodds
Example projects you're not allowed to do, if they involve other model families: * using Llama 2 as part of an RLAIF setup, which you might want to do when investigating Constitutional AI or decomposition or faithfulness of chain-of-thought or many many other projects; * using Llama 2 in auto-interpretability schemes to e.g. label detected features in smaller models, if this will lead to improvements in non-Llama-2 models; * fine-tuning other or smaller models on synthetic data produced by Llama 2, which has some downsides but is a great way to check for signs of life of a proposed technique In many cases I expect that individuals will go ahead and do this anyway, much like the license of Llama 1 was flagrantly violated all over the place, but remember that it's differentially risky for any organisation which Meta might like to legally harass.