All of Mart_Korz's Comments + Replies

Finally, if we want to make the model capture certain non-Bayesian human behaviors while still keeping most of the picture, we can assume that instrumental values and/or epistemic updates are cached. This creates the possibility of cache inconsistency/incoherence.

In my mind, there is an amount of internal confusion which feels much stronger than what I would expect for an agent as in the OP. Or is the idea possibly that everything in the architecture uses caching and instrumental values? From reading, I imagined a memory+cache structure instead of being... (read more)

gwern*161

"Cached" might be an unhelpful term here, compared to "amortized". 'Cache' makes one think of databases or memories, as something you 'know' (in a database or long-term memory somewhere), whereas in practice it tends to be more something you do - fusing inference with action. (They are 'cached' in the same way that you might loosely talk about a neural net 'caching' a complicated-to-compute function, like a value function in RL/decision theory.)

So 'amortized' tends to be more used in the Bayesian RL literature, and give you an idea of what Bayesian RL agen... (read more)