
Pragma (Greek): thing, object.

A “pragmascope”, then, would be some kind of measurement or visualization device which shows the “things” or “objects” present.

I currently see the pragmascope as the major practical objective of work on natural abstractions. As I see it, the core theory of natural abstractions is now 80% nailed down; I'm working to get it across the theory-practice gap, and the pragmascope is the big milestone on the other side of that gap.

This post introduces the idea of the pragmascope and what it would look like.

Background: A Measurement Device Requires An Empirical Invariant

First, an aside on developing new measurement devices.

Why The Thermometer?

What makes a thermometer a good measurement device? Why is “temperature”, as measured by a thermometer, such a useful quantity?

Well, at the most fundamental level… we stick a thermometer in two different things. Then, we put those two things in contact. Whichever one showed a higher “temperature” reading on the thermometer gets colder, whichever one showed a lower “temperature” reading on the thermometer gets hotter, all else equal (i.e. controlling for heat exchanged with other things in the environment). And this is robustly true across a huge range of different things we can stick a thermometer into.

It didn’t have to be that way! We could imagine a world (with very different physics) where, for instance, heat always flows from red objects to blue objects, from blue objects to green objects, and from green objects to red objects. But we don’t see that in practice. Instead, we see that each system can be assigned a single number (“temperature”), and then when we put two things in contact, the higher-number thing gets cooler and the lower-number thing gets hotter, regardless of which two things we picked.

Underlying the usefulness of the thermometer is an empirical fact, an invariant: the fact that which-thing-gets-hotter and which-thing-gets-colder when putting two things into contact can be predicted from a single one-dimensional real number associated with each system (i.e. “temperature”), for an extremely wide range of real-world things.

Generalizing: a useful measurement device starts with identifying some empirical invariant. There needs to be a wide variety of systems which interact in a predictable way across many contexts, if we know some particular information about each system. In the case of the thermometer, a wide variety of systems get hotter/colder when in contact, in a predictable way across many contexts, if we know the temperature of each system.

So what would be an analogous empirical invariant for a pragmascope?

The Role Of The Natural Abstraction Hypothesis

The natural abstraction hypothesis has three components:

  1. Chunks of the world generally interact with far-away chunks of the world via relatively-low-dimensional summaries
  2. A broad class of cognitive architectures converge to use subsets of these summaries (i.e. they’re instrumentally convergent)
  3. These summaries match human-recognizable “things” or “concepts”

For purposes of the pragmascope, we’re particularly interested in claim 2: a broad class of cognitive architectures converge to use subsets of the summaries. If true, that sure sounds like an empirical invariant!

So what would a corresponding measurement device look like?

What Would A Pragmascope Look Like, Concretely?

The “measurement device” (probably a python function, in practice) should take in some cognitive system (e.g. a trained neural network) and maybe its environment (e.g. simulator/data), and spit out some data structure representing the natural “summaries” in the system/environment. Then, we should easily be able to take some other cognitive system trained on the same environment, extract the natural “summaries” from that, and compare. Based on the natural abstraction hypothesis, we expect to observe things like:

  • A broad class of cognitive architectures trained on the same data/environment end up with subsets of the same summaries.
  • Two systems with the same summaries are able to accurately predict the same things on new data from the same environment/distribution.
  • On inspection, the summaries correspond to human-recognizable “things” or “concepts”.
  • A system is able to accurately predict things involving the same human-recognizable concepts the pragmascope says it has learned, and cannot accurately predict things involving human-recognizable concepts the pragmascope says it has not learned.

It’s these empirical observations which, if true, will underpin the usefulness of the pragmascope. The more precisely and robustly these sorts of properties hold, the more useful the pragmascope. Ideally we’d even be able to prove some of them.

What’s The Output Data Structure?

One obvious currently-underspecified piece of the picture: what data structures will the pragmascope output, to represent the “summaries”? I have some current-best-guesses based on the math, but the main answer at this point is “I don’t know yet”. I expect looking at the internals of trained neural networks will give lots of feedback about what the natural data structures are.

Probably the earliest empirical work will just punt on standard data structures, and instead focus on translating internal-concept-representations in one net into corresponding internal-concept-representations in another. For instance, here’s one experiment I recently proposed:

  • Train two nets, with different architectures (both capable of achieving zero training loss and good performance on the test set), on the same data.
  • Compute the small change in data $dx$ which would induce a small change in trained parameter values $d\theta$ along each of the narrowest directions of the ridge in the loss landscape (i.e. the eigenvectors of the Hessian with the largest eigenvalues); this is made precise just below the list.
  • Then, compute the small change in parameter values $d\theta$ in the second net which would result from the same small change in data $dx$.
  • Prediction: the $d\theta$ directions computed will approximately match the narrowest directions of the ridge in the loss landscape of the second net.
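
One way to cash out the second and third bullets, via the implicit function theorem (writing $L(\theta, x)$ for the training loss and $H$ for its Hessian in $\theta$): at a trained optimum $\nabla_\theta L(\theta, x) = 0$, so a small change in the data shifts the optimum, to first order, by

$$d\theta \approx -H^{-1}\,\frac{\partial^2 L}{\partial \theta\,\partial x}\,dx .$$

So to find a $dx$ which pushes $\theta$ along a top Hessian eigenvector $v$ (with $Hv = \lambda v$), solve $\frac{\partial^2 L}{\partial \theta\,\partial x}\,dx \approx -Hv$ (e.g. in least squares); plugging that same $dx$ into the analogous formula for the second net gives the induced $d\theta$ there.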

Conceptually, this sort of experiment is intended to take all the stuff one network learned, and compare it to all the stuff the other network learned. It wouldn’t yield a full pragmascope, because it wouldn’t say anything about how to factor all the stuff a network learns into individual concepts, but it would give a very well-grounded starting point for translating stuff-in-one-net into stuff-in-another-net (to first/second-order approximation).
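
For concreteness, here's a rough sketch of how that experiment might be wired up at toy scale. It is illustrative only: it assumes PyTorch ≥ 2.0 (for `torch.func`), uses tiny MLPs on a 32-point regression problem so exact Hessians are cheap, and all helper names are made up for the example.

```python
# Illustrative sketch only (assumes PyTorch >= 2.0 for torch.func). Tiny nets and a
# toy dataset stand in for real models; every helper name here is made up.
import torch
from torch import nn
from torch.func import functional_call, hessian, jacrev

torch.manual_seed(0)
X = torch.randn(32, 2)                          # toy data, small enough for exact Hessians
y = 2.0 * X[:, :1] - X[:, 1:] ** 2

def make_net(width):
    return nn.Sequential(nn.Linear(2, width), nn.Tanh(), nn.Linear(width, 1))

def train(net, steps=3000, lr=1e-2):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ((net(X) - y) ** 2).mean().backward()
        opt.step()
    return net

net1, net2 = train(make_net(8)), train(make_net(16))   # different architectures, same data

def flat_loss(net):
    """Training loss as a function of (flattened parameters, flattened inputs)."""
    names = [n for n, _ in net.named_parameters()]
    shapes = [p.shape for _, p in net.named_parameters()]
    sizes = [p.numel() for _, p in net.named_parameters()]
    def f(theta, x_flat):
        chunks = torch.split(theta, sizes)
        params = {n: c.reshape(s) for n, c, s in zip(names, chunks, shapes)}
        preds = functional_call(net, params, (x_flat.reshape(X.shape),))
        return ((preds - y) ** 2).mean()
    theta0 = torch.cat([p.detach().reshape(-1) for p in net.parameters()])
    return f, theta0

def curvature(net):
    """Hessian d^2L/dtheta^2 and mixed second derivative d^2L/(dtheta dx) at the trained point."""
    f, theta = flat_loss(net)
    x_flat = X.reshape(-1)
    H = hessian(f, argnums=0)(theta, x_flat)
    H = 0.5 * (H + H.T)                                    # symmetrize away numerical noise
    M = jacrev(jacrev(f, argnums=0), argnums=1)(theta, x_flat)
    return H, M

H1, M1 = curvature(net1)
H2, M2 = curvature(net2)

k = 5
_, evecs1 = torch.linalg.eigh(H1)
top1 = evecs1[:, -k:]                                      # narrowest ridge directions of net 1

# d(theta) ~= -H^{-1} M dx at the optimum, so a dx moving theta1 along v needs M1 dx ~= -H1 v.
dx = torch.linalg.pinv(M1) @ (-H1 @ top1)                  # (data_dim, k), least-squares solve

# First-order parameter response of net 2 to the same data perturbations.
dtheta2 = -torch.linalg.pinv(H2) @ (M2 @ dx)               # (n_params2, k)

# Prediction to test: dtheta2 mostly lies in the span of net 2's own top-k Hessian eigenvectors.
_, evecs2 = torch.linalg.eigh(H2)
top2 = evecs2[:, -k:]
overlap = (top2 @ (top2.T @ dtheta2)).norm(dim=0) / dtheta2.norm(dim=0)
print("fraction of each induced direction inside net 2's top-k ridge subspace:", overlap)
```

The pseudoinverse stands in for $H^{-1}$ because the "ridge" directions make the Hessian nearly singular; at real scale one would presumably replace the explicit Hessians with Hessian-vector products and an iterative eigensolver.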

Comments (3)

> As I see it, the core theory of natural abstractions is now 80% nailed down

Question 1: What's the minimal set of articles one should read to understand this 80%?

Question/Remark 2: AFAICT, your theory has a major missing piece, which is proving that "abstraction" (formalized according to your way of formalizing it) is actually a crucial ingredient of learning/cognition. The way I see it, such a proof should be by demonstrating that hypothesis classes defined in terms of probabilistic graph models / abstraction hierarchies can be learned with good sample complexity (and better yet if you can tell something about the computational complexity), in a manner that cannot be achieved if you discard any of the important-according-to-you pieces. You might have some different approach to this, but I'm not sure what it is.

> Question 1: What's the minimal set of articles one should read to understand this 80%?

Telephone Theorem, Redundancy/Resampling, and Maxent for the math, Chaos for the concepts.

> Question/Remark 2: AFAICT, your theory has a major missing piece, which is proving that "abstraction" (formalized according to your way of formalizing it) is actually a crucial ingredient of learning/cognition. The way I see it, such a proof should be by demonstrating that hypothesis classes defined in terms of probabilistic graph models / abstraction hierarchies can be learned with good sample complexity (and better yet if you can tell something about the computational complexity), in a manner that cannot be achieved if you discard any of the important-according-to-you pieces. You might have some different approach to this, but I'm not sure what it is.

If we want to show that abstraction is a crucial ingredient of learning/cognition, then "Can we efficiently learn hypothesis classes defined in terms of abstraction hierarchies, as captured by John's formalism?" is entirely the wrong question. Just because something can be learned efficiently doesn't mean it's convergent for a wide variety of cognitive systems. And even if such hypothesis classes couldn't be learned efficiently in full generality, it would still be possible for a subset of that hypothesis class to be convergent for a wide variety of cognitive systems, in which case general properties of the hypothesis class would still apply to those systems' cognition.

The question we actually want here is "Is abstraction, as captured by John's formalism, instrumentally convergent for a wide variety of cognitive systems?". And that question is indeed not yet definitively answered. The pragmascope itself would largely allow us to answer that question empirically, and I expect the ability to answer it empirically will quickly lead to proofs as well.

> Telephone Theorem, Redundancy/Resampling, and Maxent for the math, Chaos for the concepts.

Thank you!

> Just because something can be learned efficiently doesn't mean it's convergent for a wide variety of cognitive systems.

I believe that the relevant cognitive systems all look like learning algorithms for a prior of a certain fairly specific type. I don't know what this prior looks like, but it's something very rich on the one hand and efficiently learnable on the other. So, if you showed that your formalism naturally produces priors that seem closer to that "holy grail prior" in terms of richness/efficiency, compared to priors that we already know (e.g. MDPs with a small number of states, which are not rich enough, or the Solomonoff prior, which is both statistically and computationally intractable), that would at least be evidence that you're going in the right direction.

> And even if such hypothesis classes couldn't be learned efficiently in full generality, it would still be possible for a subset of that hypothesis class to be convergent for a wide variety of cognitive systems, in which case general properties of the hypothesis class would still apply to those systems' cognition.

Hmm, I'm not sure what it would mean for a subset of a hypothesis class to be "convergent".

> The question we actually want here is "Is abstraction, as captured by John's formalism, instrumentally convergent for a wide variety of cognitive systems?".

That's interesting, but I'm still not sure what it means exactly. Let's say we take a reinforcement learner with a specific hypothesis class, such as all MDPs of a certain size, some family of MDPs with low eluder dimension, or the actual AIXI. How would you determine whether your formalism is "instrumentally convergent" for each of those? Is there a rigorous way to state the question?