Not sure if I'm fully responding to your question, but...
there might be no canonical topology for the original computation
This sounds right to me, and overall I mostly think of treeification as just a kind of extensional rewrite (plus adding more inputs).
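To make the "extensional rewrite plus more inputs" picture concrete, here is a minimal Python sketch. The functions `a`, `b`, `original`, and `treeified` are made-up toy examples, not anything from the causal scrubbing writeups: the original computation feeds one shared input to two branches, and the treeified version gives each branch its own copy of the input.

```python
# Toy sketch (hypothetical functions, chosen only for illustration).

def a(x):
    return x + 1

def b(x):
    return 2 * x

def original(x):
    # One shared input x flows into both branches.
    return a(x) * b(x)

def treeified(x_a, x_b):
    # Each path from an input to the output gets its own input copy.
    return a(x_a) * b(x_b)

# With all copies tied to the same value, the two agree (same extension).
agree = all(original(x) == treeified(x, x) for x in range(-5, 6))

# The extra inputs let us vary one path independently of the other.
mixed = treeified(1, 3)  # a sees 1, b sees 3
```

The treeified function is only "the same computation" on the diagonal where the copies are equal; the off-diagonal inputs are what interventions get to play with.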
these hypotheses can't be understood as making precise claims about the original computation anymore
I think of the underlying graph as providing some combination of 1) causal relationships, and 2) smaller pieces to help with search/reasoning, rather than being an object we inherently care about. (It's possibly use...
This is a nice comparison. I particularly like the images :) and that the comparisons are drawn setting aside historical accidents.
A few comments that came to mind as I was reading:
Perform an interchange intervention on the treeification of L such that the corresponding intervention in the treeification of H would not change any values.
As far as I saw, you don’t mention how causal scrubbing specifies selecting the interchange intervention (the answer is: preserving the distribution of inputs to nodes in H, see e.g. the Appendix post). I think ...
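As a rough illustration of that selection rule, here is a minimal Python sketch under simplifying assumptions: a single hypothesis node, a finite dataset, and uniform sampling among the dataset points that the node treats as equivalent. The names `scrubbed_resample` and `parity` are hypothetical, and this is not the reference causal scrubbing implementation.

```python
import random

def scrubbed_resample(x, dataset, h_node, rng):
    """Sample a replacement for x from the dataset, restricted to
    inputs on which the hypothesis node h_node takes the same value."""
    target = h_node(x)
    candidates = [x2 for x2 in dataset if h_node(x2) == target]
    return rng.choice(candidates)

# Toy hypothesis node: claims only the parity of the input matters.
def parity(x):
    return x % 2

dataset = list(range(10))
rng = random.Random(0)

# Each input is swapped for another one with the same parity, so the
# corresponding intervention in H would not change the node's value.
swapped = [scrubbed_resample(x, dataset, parity, rng) for x in dataset]
```

Because replacements are drawn from the dataset itself, conditioned only on agreeing under `h_node`, the inputs each node sees stay on-distribution as far as the hypothesis is concerned.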
I had cached impressions that AI safety people were interested in auditing, ELK, and scalable oversight.
A few AIS people who volunteered to give feedback before the workshop (so biased towards people who were interested in the title) each named a unique top choice: scientific understanding (specifically threat models), model editing, and auditing (so 2/3 were unexpected for me).
During the workshop, attendees (again, biased, as they self-selected into the session) expressed excitement most about auditing, unlearning, MAD, ELK, and general scientific underst...