We aren’t offering these criteria as necessary for “knowledge”—we could imagine a breaker proposing a counterexample where all of these properties are satisfied but where intuitively M didn’t really know that A′ was a better answer. In that case the builder will try to make a convincing argument to that effect.
The bolded criteria should be sufficient.
In fact, I'm pretty sure that's how humans work most of the time. We use the general-intelligence machinery to "steer" ourselves at a high level, and most of the time, we operate on autopilot.
Yeah, I agree with this. But I don't think the human system aggregates into any kind of coherent total optimiser. Humans don't have an objective function (not even approximately?).
A human is not well modelled as a wrapper mind; do you disagree?
Thus, any greedy optimization algorithm would convergently shape its agent to not only pursue G, but to maximize for G's pursuit — at the expense of everything else.
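As a toy illustration of that claimed dynamic (my own sketch, not from the quoted post; the agent parameters and fitness function are invented for the example), greedy selection on G-performance alone drives every other use of capacity to zero:

```python
# Toy sketch: a greedy optimizer over agent designs accepts a change only if
# it improves pursuit of G, so capacity spent on anything else is selected away.
import random

random.seed(0)

def g_performance(agent):
    # Fitness rewards only the capacity devoted to pursuing G.
    return agent["pursue_G"]

# An agent with a fixed capacity budget split between G and everything else.
agent = {"pursue_G": 0.5, "everything_else": 0.5}

for _ in range(1000):
    # Propose a small random reallocation of the capacity budget.
    new_g = min(1.0, max(0.0, agent["pursue_G"] + random.uniform(-0.05, 0.05)))
    candidate = {"pursue_G": new_g, "everything_else": 1.0 - new_g}
    # Greedy selection: keep the candidate only if G-pursuit strictly improves.
    if g_performance(candidate) > g_performance(agent):
        agent = candidate

print(agent)  # converges to pursue_G ≈ 1.0, everything_else ≈ 0.0
```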
Conditional on:
I'm pretty sceptical of #2. I'm sceptical that systems that perform inference via direct optimisation over their outputs are competitive in rich/complex environments.
Such o...
I think mesa-optimisers should not be thought of as learned optimisers, but as systems that employ optimisation/search as part of their inference process.
The simplest case is that pure optimisation during inference is computationally intractable in rich environments (e.g. the real world), so systems (e.g. humans) operating in the real world do not perform inference solely by directly optimising over outputs.
Rather, optimisation is sometimes employed as one part of their inference strategy. That is, systems o...
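A minimal sketch of that distinction (the functions, action space, and budgets here are all invented for illustration): a cheap proposal step does most of the work, and bounded search refines its output, instead of optimising directly over the full output space.

```python
# Sketch: optimisation/search as one component of inference, not the whole of it.
import itertools
import random

random.seed(0)
ACTIONS = ["a", "b", "c", "d"]

def utility(plan):
    # Stand-in for an expensive evaluation of a candidate output (a plan).
    return sum(hash((step, i)) % 100 for i, step in enumerate(plan))

def pure_optimisation(horizon):
    # Direct optimisation over outputs: cost grows as len(ACTIONS) ** horizon,
    # which is intractable for realistic action spaces and horizons.
    return max(itertools.product(ACTIONS, repeat=horizon), key=utility)

def amortised_plus_search(horizon, n_candidates=8, n_refinements=20):
    # Step 1: a cheap proposal step (random here; a learned policy in
    # practice) generates a handful of candidate plans.
    candidates = [tuple(random.choice(ACTIONS) for _ in range(horizon))
                  for _ in range(n_candidates)]
    best = max(candidates, key=utility)
    # Step 2: bounded local search refines the best candidate; this is
    # optimisation employed as one part of the inference strategy.
    for _ in range(n_refinements):
        i = random.randrange(horizon)
        neighbour = best[:i] + (random.choice(ACTIONS),) + best[i + 1:]
        if utility(neighbour) > utility(best):
            best = neighbour
    return best

print(amortised_plus_search(horizon=12))  # tractable even when 4**12 candidates is not
```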
GPTs are not Imitators, nor Simulators, but Predictors.
I think an issue is that GPT is used to mean two things:
[See the Appendix]
The latter kind of GPT is what I think is rightly called a "Simulator".
From @janus' Simulators (italicised by me):
...I use the generic term “simulator” to refer to models trained with predictive loss on a self-supervised dataset, invariant to architecture or data type (natural language, code, pixels, game states, etc.).
Predictors are (with a sampling loop) simulators! That's the secret of mind...
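For concreteness, here is a sketch of the sampling loop in question (`next_token_probs` is a hypothetical stand-in for any trained next-token predictor, not a real API): wrapping a one-step predictor in autoregressive sampling is what rolls it forward into a simulation of a trajectory.

```python
# Sketch: a next-token predictor plus a sampling loop yields a simulator.
import random

def next_token_probs(context):
    # Hypothetical predictor: returns {token: probability} for the next token
    # given the context. In practice this would be a call into a trained model.
    raise NotImplementedError("plug in a real model here")

def simulate(prompt, n_steps, temperature=1.0):
    tokens = list(prompt)
    for _ in range(n_steps):
        probs = next_token_probs(tokens)
        # Temperature-adjusted sampling: the stochastic step that turns a
        # one-step predictive model into a rollout of a trajectory.
        items = list(probs.items())
        weights = [p ** (1.0 / temperature) for _, p in items]
        tokens.append(random.choices([t for t, _ in items], weights=weights)[0])
    return tokens
```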
What do you think MIRI is currently doing wrong/what should they change about their approach/general strategy?
I thought I was pretty clear in the post that I don't have anything against MIRI. I guess if I were to provide feedback, the one thing I most wish MIRI would do is hire more researchers—I think MIRI currently sets too high a hiring bar.
To be clear, I enjoyed the post and am looking forward to this sequence. A point of disagreement though:
One feasible-seeming approach is "accelerating alignment," which involves leveraging AI as it is developed to help solve the challenging problems of alignment. This is not a novel idea, as it's related to previously suggested concepts such as seed AI, nanny AI, and iterated distillation and amplification (IDA).
I disagree that using AI to accelerate alignment research is particularly load-bearing for the development of a practical alignment craft...
I won't write a detailed object-level response to this for now, since we're probably going to publish a lot about it soon. I'll just say that my/our experience with the usefulness of GPT has been very different from yours -
I have used ChatGPT to aid some of my writing and plan to use it more — but it's to the same extent that we use Google/Wikipedia/Word processors to do research in general.
I've used GPT-3 extensively, and for me it has been transformative. To the extent that my work has been helpful to you, you're indebted to GPT-3 as well, because "janus...
I disagree that intelligence and rationality are more fundamental than physics; the territory itself is physics, and that is all that is really there. Everything else (including the body of our physics knowledge) consists of models for navigating that territory.
Turing formalised computation and established the limits of computation given certain assumptions. However, those limits only apply as long as the assumptions are true. Turing did not prove that no mechanical system is superior to a Universal Turing Machine, and weird physics may enable super-Turing computation.
...Physics is not the territory; physics is (quite explicitly) the models we have of the territory. Rationality consists of the rules for formulating these models, and in this sense it is prior to physics and more fundamental. (This might be a disagreement over the use of words. If by "physics" you, by definition, refer to the territory, then that seems to miss my point about Occam's razor. Occam's razor says that the map should be parsimonious, not the territory: the latter would be a type error.) In fact, we can adopt the view that Solomonoff ...
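For concreteness, the Solomonoff prior gestured at above is standardly written as follows (my transcription of the textbook definition; U is a universal monotone machine, and the sum ranges over programs p whose output extends x):

```latex
% Occam's razor as a property of the map: each hypothesis (a program p for the
% universal machine U whose output begins with x) receives prior mass 2^{-|p|},
% so more parsimonious hypotheses dominate the prior.
M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```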
I think there might be a typo here. Did you instead mean to write "w∈B(W)" for the second-order beliefs about the forecasters?