This model makes two really strong assumptions: that optimization is like conditioning, and that X and V are independent.
[...]
There's also a sort of implicit assumption in even using a framing that thinks about things as V + X; the world might be better thought of as naturally containing (U, V) tuples (with U our proxy measurement), and X = U − V could be a sort of unnatural construction that doesn't make sense to single out in the real world. (We do think this framing is relatively natural, but won't get into justifications here.)
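For concreteness, here is a minimal numerical sketch of the quoted setup as I read it — my own assumptions, not anything from the post: V and X independent standard normals, proxy U = V + X, and "optimization is like conditioning" modeled as keeping only samples whose proxy clears a threshold.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
V = rng.normal(size=n)   # true value
X = rng.normal(size=n)   # proxy error, independent of V
U = V + X                # proxy we select on

# "Optimization = conditioning": keep only samples whose proxy clears a threshold t.
for t in [0, 2, 4]:
    sel = U > t
    print(f"t = {t}: E[V | U > t] ≈ {V[sel].mean():.2f}   ({sel.sum()} samples kept)")
# In this light-tailed (Gaussian) case E[V | U > t] keeps growing with t:
# pushing harder on the proxy keeps buying more true value.
```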
Will you get into justifications in the next post? Because otherwise the following advice, which I consider literally correct:
- In an alignment plan involving generation and evaluation, you should either have reason to believe that your classifier's errors are light-tailed, or have a story for why inductive bias and/or non-independence work in your favor.
in practice reduces to just the part "have a story for why inductive bias and/or non-independence work in your favor", because I currently think Normality + additivity + independence are bad assumptions, and I see that as almost null advice.
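As a side note on the "light-tailed errors" branch of that advice, one hedged empirical handle (my own sketch, not something from the post) is the mean-excess function of held-out proxy errors: roughly flat across thresholds suggests exponential-ish tails, shrinking suggests lighter (e.g. Gaussian) tails, and growing suggests heavier-than-exponential tails.

```python
import numpy as np

def mean_excess(errors, qs=(0.90, 0.99, 0.999)):
    """Mean excess of |error| above a few high quantiles.

    Roughly flat across thresholds -> exponential-ish tail;
    shrinking -> lighter than exponential (e.g. Gaussian);
    growing -> heavier than exponential.
    """
    x = np.abs(np.asarray(errors, dtype=float))
    out = []
    for q in qs:
        u = np.quantile(x, q)
        out.append((q, u, x[x > u].mean() - u))
    return out

# hypothetical usage on held-out data:
# for q, u, me in mean_excess(proxy_scores - true_scores):
#     print(f"quantile {q}: threshold {u:.3f}, mean excess {me:.3f}")
```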
I think that Normality + additivity + independence come out together if you have a complex system subject to small perturbations, because you can write any dynamics as linear relationships over many variables. This gets you the three perks at once: additivity and independence by construction of the linear decomposition, and Normality from the central limit theorem over the many small terms.
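To illustrate that regime with a toy construction of my own: if the proxy is the true value plus many small independent non-Gaussian contributions, the error X = U − V comes out approximately Normal, additive, and independent of V.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50_000, 200
V = rng.normal(size=n)                               # true value
perturbations = rng.uniform(-0.1, 0.1, size=(n, m))  # many small, independent, non-Gaussian terms
U = V + perturbations.sum(axis=1)                    # proxy = value + linear combination of perturbations
X = U - V                                            # error, additive and independent of V by construction

skew = float(((X - X.mean()) ** 3).mean() / X.std() ** 3)
kurt = float(((X - X.mean()) ** 4).mean() / X.std() ** 4)
print(f"skewness ≈ {skew:.3f} (Gaussian: 0), kurtosis ≈ {kurt:.3f} (Gaussian: 3)")
print(f"corr(X, V) ≈ {np.corrcoef(X, V)[0, 1]:.3f}")
# The error is approximately Gaussian via the CLT, even though each term is uniform.
```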
Since we want to study the situation in which we apply a lot of optimization pressure, I think this scenario gets thrown out the window.
So:
Example: here
But now let's look at a case where X and V are heavier-tailed. Say that the probability density functions (PDFs) of X and V are proportional to e^(−√|x|), instead of e^(−x²/2) like before.
my gut instinct tells me to look at elliptical distributions like p(x, v) ∝ e^(−√(x²+v²)), which will not show this specific split-tail behavior. My gut instinct is not particularly justified, but seems to be making weaker assumptions.
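A quick numerical look at that contrast, under my reading of the densities above (independent marginals ∝ e^(−√|x|) versus an elliptical joint ∝ e^(−√(x²+v²))) — a sketch, not anything from the thread:

```python
import numpy as np

t = 20.0                               # value of the proxy slice U = X + V = t
v = np.linspace(-5.0, 25.0, 3001)      # grid of candidate V along the slice
x = t - v                              # the corresponding error on the slice

# (a) independent heavy-tailed marginals: joint density ∝ exp(-sqrt|x| - sqrt|v|)
log_indep = -np.sqrt(np.abs(x)) - np.sqrt(np.abs(v))
# (b) elliptical alternative: joint density ∝ exp(-sqrt(x^2 + v^2))
log_ellip = -np.sqrt(x ** 2 + v ** 2)

for name, logp in [("independent", log_indep), ("elliptical", log_ellip)]:
    rel = lambda v0: float(np.exp(logp[np.abs(v - v0).argmin()] - logp.max()))
    print(f"{name:12s}: density at v≈0: {rel(0.0):.3f},  v≈t/2: {rel(t / 2):.3f},  "
          f"v≈t: {rel(t):.3f}   (relative to peak)")
# independent: mass piles up near v ≈ 0 and v ≈ t (either the error or the value
# takes nearly all of U) with a dip in between — the split-tail shape;
# elliptical: mass concentrates at the even split v ≈ t/2, no split.
```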
I have a vague impression that I am not crazy to hope for whole primate-brain connectomes in the 2020s and whole human-brain connectomes in the 2030s, if all goes well.
After reading the post "Whole Brain Emulation: No Progress on C. elegans After 10 Years" I was left with the general impression that this stuff is very difficult; but I don't know the details, and that post talks about simulating a given connectome, not about obtaining a connectome in the first place, which maybe is easier even for a huge primate brain, I guess? And I don't know what probability you mean by "not crazy".
A market here is thus apposite:
I wasn't saying you made all those assumptions; I was trying to imagine an empirical scenario that would yield your assumptions, and the first thing that came to my mind produced even stricter ones.
I do realize now that I messed up my comment when I wrote
Here there should not be Normality, just additivity and independence, in the sense of U−V⊥V. Sorry.
I do agree you could probably obtain similar-looking results with relaxed versions of the assumptions.
However, just as U−V⊥V seems quite specific to me, and you would need to make a convincing case that it is what you actually get in some realistic setting for your theorem to look useful, I expect the same to apply to whatever relaxed condition you can find that still allows you to prove a theorem.
Example: if you said "I made a version of the theorem assuming there exists f such that f(U,V)⊥V for f in some class of functions", I'd still ask "and in what realistic situations does such a setup arise, and why?"