Here's an argument that I think works in the limit where $m \to \infty$. For very large $m$, each subset $S \subseteq [n]$ of size $s$ will occur many times as an $S_i$. Moreover, for each setting of the coordinate values $x_S := (x_j)_{j \in S}$, there will be many $x^i$ such that $S_i = S$ and $x^i_S = x_S$. Therefore, using the law of large numbers, for any subset $S \subseteq [n]$ of size $s$ and any coordinate values $x_S$, Bob can infer $N(x_S, f) := \#\{x' \in \{0,1\}^n : x'_S = x_S \text{ and } f(x') = 1\}$, i.e. the number of inputs $x'$ that agree with $x_S$ on $S$ and on which $f$ outputs 1.
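To make the quantity concrete, here is a minimal sketch (the function names are mine): the exact count $N(x_S, f)$ by brute-force enumeration, and a Monte Carlo estimate in the spirit of the law-of-large-numbers step, under the assumption that the coordinates outside $S$ are drawn uniformly at random.

```python
import itertools
import random

def N_exact(f, n, S, x_S):
    """Exact N(x_S, f): the number of x' in {0,1}^n with x'_S = x_S and f(x') = 1."""
    count = 0
    for bits in itertools.product([0, 1], repeat=n):
        if all(bits[j] == x_S[j] for j in S) and f(bits) == 1:
            count += 1
    return count

def N_estimate(f, n, S, x_S, trials=10_000):
    """Monte Carlo estimate of N(x_S, f), assuming the coordinates outside S
    are uniform -- the law-of-large-numbers step in the argument above."""
    hits = 0
    for _ in range(trials):
        bits = [random.randint(0, 1) for _ in range(n)]
        for j in S:
            bits[j] = x_S[j]  # clamp the coordinates in S to the values x_S
        hits += f(tuple(bits))
    # 2^(n - |S|) inputs agree with x_S on S, so scale the empirical frequency.
    return hits / trials * 2 ** (n - len(S))

# Toy check: f = 1 iff the parity of x is even, with n = 6, S = {0, 2}.
f = lambda x: int(sum(x) % 2 == 0)
print(N_exact(f, 6, S={0, 2}, x_S={0: 1, 2: 0}))     # exactly 8
print(N_estimate(f, 6, S={0, 2}, x_S={0: 1, 2: 0}))  # close to 8
```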
It seems to me like there are a couple of different notions of “being able to distinguish between mechanisms” we might want to use (both sketched as type signatures below):

1. There exists an efficient algorithm which, when run on a model input x, will output which mechanism the model M uses when run on x.
2. There exists an efficient algorithm which, when run on a model M, will output another program P such that, when we run P on a model input x, it will output (in reasonable time) which mechanism M uses when run on x.
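To make the two notions concrete, here is one way the interfaces might look, written as Python type aliases (the names are mine, not anything standard):

```python
from typing import Any, Callable

ModelInput = Any  # a single input x to the model
Mechanism = str   # some label for which mechanism was used (placeholder type)
Model = Any       # the model M itself

# Notion (1): a single efficient algorithm, for a fixed model M, that maps
# each input x directly to the mechanism M uses on x.
Distinguisher = Callable[[ModelInput], Mechanism]

# Notion (2): an efficient meta-algorithm that, given the model M, outputs a
# program P; P must then itself run in reasonable time on every input x.
Program = Callable[[ModelInput], Mechanism]
Compiler = Callable[[Model], Program]
```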
Your overall picture sounds pretty similar to mine. A few differences:
* I don't think the literal version of (2) is plausible. For example, consider an obfuscated circuit: producing such a P from the obfuscated circuit alone seems to require de-obfuscating it.
* The reason that's OK is that finding the de-obfuscation is just as easy as finding the obfuscated circuit, so if gradient descent can do one it can do the other. So I'm really interested in some modified version of (2), call it (2'). This is roughly like adding an advice string as input to P, with the requirement that the advice string is no harder to learn than M itself (though this isn't exactly right); see the interface sketch after this list.
* I mostly care about (1) because the difficulty of (1) seems like the main reason to think that (2') is impossible. Conversely, if we understood why (1) was always possible, it would likely give some hints about how to do (2). And generally, working on an easier warm-up is often good.
* I think that if (1) is false in general, we should be able to find a concrete counterexample. So that's a fairly high priority for me, given that this is a crucial question for the feasibility or character of the worst-case approach.
* That said, I'm also still worried about the leap from (1) to (2'), and as mentioned in my other comment I'm very interested in finding a way to solve the harder problem in the case of primality testing.
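For what it's worth, here is a rough sketch of how (2') changes the interface of (2), under one reading of the advice-string idea (the names and the exact calling convention are my assumptions, and as noted above the learnability requirement on the advice isn't captured here):

```python
from typing import Any, Callable

ModelInput = Any
Mechanism = str
Model = Any
Advice = Any  # e.g. the de-obfuscated circuit; required (by assumption) to be
              # no harder to learn than the model M itself

# Notion (2'): P receives the advice string alongside each input, and the
# meta-algorithm may also consult the advice when producing P.
ProgramWithAdvice = Callable[[Advice, ModelInput], Mechanism]
CompilerPrime = Callable[[Model, Advice], ProgramWithAdvice]
```

In the obfuscated-circuit example, the advice would be the de-obfuscation, which gradient descent could plausibly have found alongside the obfuscated circuit itself.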