S-Expressions as a Design Language: A Tool for Deconfusion in Alignment

by Johannes C. Mayer
19th Jun 2025

TL;DR: S-expressions are a minimal structure language for early-stage conceptual engineering (aka deconfusion). They are especially useful in alignment, where the hardest part is often figuring out what the problem even is. S-expressions do not enforce any semantics. This lets you write down structure before you know what the structure means. You can define partial concepts, factor arguments, link components, and reuse references, all without committing to types, schemas, or logic. The result is a format that makes conceptual confusion structurally visible, enables simple programmatic tooling, and uses natural language to carry provisional meaning. They are not a programming tool per se. They are a design scaffold for when your theory is still under construction.

A central problem in alignment research is that we often can't even represent the problem we're trying to solve. It's not just about formalizing ideas. It's about forming them in the first place: building coherent internal structure around concepts that are still vague.

In practice, this process unfolds in stages:

  1. You notice that some structural insight exists—you glimpse that there's a “there” there.
  2. You work to make the idea coherent: to lay out its parts, dependencies, and implications—even if imprecisely.
  3. Only then do you translate it into a precise formal system.

Most research tools (e.g. programming languages, theorem provers) are built for stage 3. But in alignment, the hard step is usually stage 2. It’s not that we can’t formalize things. It’s that the ontology is underdefined, the relationships are unclear, and the whole structure is unstable.

To move forward, we need a design language. Not a language for computation. A language for exploring and organizing structure before it’s nailed down. One that forces coherence without demanding premature precision.

This essay presents the hypothesis that S-expressions are the best format we have for this. They provide minimal syntax with maximal compositionality. They let you define and reuse concepts, see your structure grow, track unresolved parts, and manipulate the system programmatically—all without committing to any semantics you’re not ready to specify.

This is not about Lisp or programming. It’s about writing down vague ideas in a way that forces epistemic structure—and then lets you grow that structure into a real theory.

The point is to understand and manipulate complex conceptual structures with the least amount of constraint or arbitrary assumption.

What Is a Design Language?

A design language is a notation system you use to think. Not to implement. Not to describe finished systems. But to work through systems that aren’t yet fully defined. When designing cognitive systems, reasoning frameworks, or abstraction hierarchies, you are trying to expose and organize unknown structure.

Natural language has minimal constraints. That makes it expressive, but hard to analyze structurally or refactor. Formal systems (like typed programming languages or JSON) make assumptions early—about types, schemas, fields, relationships—which you often don’t want to make yet.

S-expressions provide structure with minimal assumptions. They impose enough syntactic form to organize ideas without forcing semantic commitments you’re not ready to make.

The Core Properties of S-Expressions

An S-expression is just:

(exp1 exp2 exp3 ...)

Each of exp1, exp2, and exp3 is itself either a primitive value or another S-expression.

That is, an S-expression is just a nested list (or, equivalently, a tree). That’s all. Here is a concrete S-expression:

(goal (intelligence)
  (definition "The ability to steer the world toward a target state.")
  (factors
    (world-model)
    (efficient-planning)
    (updateable-world-model)))

This gives you three key properties:

  1. Uniformity – every node is either an atom or a list; the structure is fully recursive.
  2. Explicit structure – the full hierarchy is spelled out.
  3. Deferred semantics – the meaning is determined by your interpretation, not baked into the syntax.

This third point is the key. When you write a map in Clojure, or define an object with fields, you’re already making semantic decisions. When you write an S-expression, you are just laying out structure. What it means is up to the context, or later processing. This is what makes it appropriate for designing systems you don’t yet fully understand.
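
To make the "nested list" claim concrete, here is a minimal reader sketch in Python. It is an illustration under stated assumptions, not part of the method itself: the names `TOKEN`, `Symbol`, and `parse` are hypothetical, quoted strings are assumed to contain no escaped quotes, and an unterminated quote is not reported as an error.

```python
import re

# Tokens: a double-quoted string, a parenthesis, or a bare symbol.
# Assumption: quoted strings contain no escaped double quotes.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')


class Symbol(str):
    """A bare identifier, kept distinct from quoted text."""


def parse(text):
    """Parse text holding one or more S-expressions into nested Python lists."""
    stack = [[]]  # stack[0] collects the top-level forms
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        elif tok.startswith('"'):
            stack[-1].append(tok[1:-1])  # quoted text becomes a plain str
        else:
            stack[-1].append(Symbol(tok))
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]


source = '''
(goal (intelligence)
  (definition "The ability to steer the world toward a target state.")
  (factors
    (world-model)
    (efficient-planning)
    (updateable-world-model)))
'''

forms = parse(source)
goal = forms[0]
print(goal[0])      # the head symbol of the form
print(goal[3][1:])  # the three factor references, each a one-element list
```

The reader attaches no meaning to `goal`, `definition`, or `factors`. It only distinguishes lists, symbols, and quoted text, which is the sense in which the semantics stay deferred.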

Example: Structuring an Intelligence Architecture

Here is an intentionally broken example of a structure that tries to capture some parts of the universal structure of intelligence:

(goal (intelligence)
  (definition "The ability to steer the world toward a target state.")
  (factors
    (world-model)
    (efficient-planning)
    (updateable-world-model)))

(structure (world-model)
  (description "A representation of the environment that enables the agent to reason about possible states and actions.")
  (requirements
    (ontology)
    (causal-structure)))

(structure (ontology)
  (description "A formal system of categories that describes the kinds of entities in the environment."))

(structure (causal-structure)
  (description "Encodes how actions lead to changes in the environment."))

(structure (efficient-planning)
  (description "The ability to generate action sequences that achieve a goal with tractable computation.")
  (substructures
    (hierarchical-abstraction)
    (search-guidance)))

(structure (hierarchical-abstraction)
  (description "Breaks down large tasks into manageable subgoals."))

(structure (search-guidance)
  (description "Uses heuristics or learned priors to prune the search space."))

(structure (updateable-world-model)
  (description "A world model that can be updated based on observed data.")
  (implements
    (locally-updatable)
    (causally-traceable)))

(structure (locally-updatable)
  (description "Changes in the environment can be integrated into localized parts of the model."))

(structure (causally-traceable)
  (description "Effects in the world model can be traced back to their generative causes."))

The core advantage of S-expressions here is that they let you externalize structure while keeping semantics implicit. That combination suits exploratory conceptual design, where you’re trying to uncover what the right ontology even is.

When writing the structure above, several insights emerged that weren’t just helped by the format, but caused by it.

For example, the factorization in (goal (intelligence) ...) lists both (world-model) and (updateable-world-model). That immediately stands out as confused. Updateability isn’t a separate factor of intelligence; it’s a property of the world model. The inconsistency is visible on the page: the structure is broken before any downstream consequence emerges. You don’t need to finish the system to see it crash.

Another example: suppose (instrumental-rationality) is defined as “having an accurate world model” (that definition is not part of the listing above). In the factorization of intelligence, we don’t mention instrumental rationality at all; we reference the world model directly. This reveals an ontological mismatch. Is instrumental rationality just a name for that structure? Or is it something more abstract? Again, the structure forces a choice, and highlights incoherence when you fail to make one.

You also get the kind of feedback you get when you try to implement an algorithm you don’t understand and fail: the pieces fail to fit together. The inability to write a consistent structure reveals that your concept is underdefined. If you reference (causally-traceable) but haven’t written a matching (structure (causally-traceable) ...) form with a description, that shows up as an unresolved subnode in the structure. You don’t need to track this mentally, because it’s held in the shape of the expression. It’s an explicit placeholder for confusion.

This is what makes it useful. The format allows deferred semantics—but the structure is strict enough to enforce legibility. You know what’s missing. You see what needs clarification. You can traverse the structure or display it visually and instantly identify where the system is still vague.
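
The "unresolved subnode" check can also be run mechanically. The sketch below is one possible version, not a fixed standard. It assumes the forms are already available as nested Python lists with symbols as plain strings. It also assumes two conventions read off the listing above: a top-level `(goal (name) ...)` or `(structure (name) ...)` defines `name`, and any other one-element list `(name)` is a reference. The function names are hypothetical.

```python
# Assumed conventions, read off the listing above (not a fixed standard):
#   (goal (name) ...) and (structure (name) ...) at top level define `name`;
#   any other one-element list (name) is a reference to a concept.
DEFINING_HEADS = {"goal", "structure"}


def defined_names(forms):
    """Names introduced by top-level defining forms."""
    return {form[1][0] for form in forms if form[0] in DEFINING_HEADS}


def references(node):
    """Yield every name that appears as a one-element list (name) under node."""
    if isinstance(node, list):
        if len(node) == 1 and isinstance(node[0], str):
            yield node[0]
        else:
            for child in node:
                yield from references(child)


def unresolved(forms):
    """Referenced names that no top-level form defines, sorted."""
    used = set()
    for form in forms:
        for part in form[2:]:  # skip the head and the (name) being defined
            used.update(references(part))
    return sorted(used - defined_names(forms))


# Only the first two forms of the listing, so several references dangle.
forms = [
    ["goal", ["intelligence"],
     ["definition", "The ability to steer the world toward a target state."],
     ["factors", ["world-model"], ["efficient-planning"], ["updateable-world-model"]]],
    ["structure", ["world-model"],
     ["description", "A representation of the environment that enables the agent "
                     "to reason about possible states and actions."],
     ["requirements", ["ontology"], ["causal-structure"]]],
]

print(unresolved(forms))
# ['causal-structure', 'efficient-planning', 'ontology', 'updateable-world-model']
```

Run on the full listing, the same check would report nothing, because every referenced concept there has its own `structure` form.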

S-expressions enable this because they are:

  1. Visual. Aligning S-expressions in the usual Lisp style makes the structure visually obvious. You can navigate the hierarchy and spot inconsistencies or missing pieces by inspection, without tooling.

  2. Composable. You define (structure (updateable-world-model) ...) once and reference it in multiple places. This makes duplication, misnaming, and incoherence much easier to detect. It also makes it easier to reason because the concepts are broken up into digestible chunks.

  3. Semantically open. Identifiers are just symbols. Strings are just text. The semantics are assigned by context—often via embedded natural language—not enforced by the syntax.

This last point is essential: we’re not even using formal semantics. The strings contain informal explanations. The symbols are ordinary language. The structure exists independently of interpretation. That makes it ideal for early-stage theorizing, where the goal is to figure out what the semantics should be.

Structure That Enables Computation

What makes this useful is that the structure is explicit and simple. This makes it easy to work with programmatically. You can write tools to check that every goal has a definition. You can enforce that each factor has an argument. You can write transformation rules to reformat or extract parts of the structure.

Because the whole representation is just trees of lists and primitives, the logic to process it is trivial. Parsing is trivial. Validation is trivial. Generating new structures is trivial. Rewriting patterns is trivial.
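
As a rough illustration of how little code such tools need, the sketch below checks that every goal has a definition and applies one rewrite rule. It assumes the forms are nested Python lists with symbols as plain strings. The helper names and the `placeholder-goal` stub are hypothetical and appear only for this example.

```python
def children_named(form, head):
    """Sub-lists of `form` whose first element is `head`."""
    return [c for c in form[1:] if isinstance(c, list) and c and c[0] == head]


def goals_missing_definition(forms):
    """Names of (goal (name) ...) forms that have no (definition ...) child."""
    return [form[1][0] for form in forms
            if form[0] == "goal" and not children_named(form, "definition")]


def rename(node, old, new):
    """Return a copy of the tree with every atom equal to `old` replaced by `new`.

    Symbols and quoted text are both plain strings in this sketch, so a text
    atom that exactly equals `old` would be renamed too.
    """
    if isinstance(node, list):
        return [rename(child, old, new) for child in node]
    return new if node == old else node


forms = [
    ["goal", ["intelligence"],
     ["definition", "The ability to steer the world toward a target state."],
     ["factors", ["world-model"], ["efficient-planning"], ["updateable-world-model"]]],
    ["goal", ["placeholder-goal"]],  # hypothetical stub with no definition yet
]

print(goals_missing_definition(forms))  # ['placeholder-goal']

# A rewrite rule: unify the spelling of one symbol across the whole structure.
renamed = rename(forms, "updateable-world-model", "updatable-world-model")
print(renamed[0][3])
# ['factors', ['world-model'], ['efficient-planning'], ['updatable-world-model']]
```

Both functions are plain recursion or a single pass over lists. Neither depends on a schema, so they keep working when a form gains extra children.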

Contrast this with Clojure maps:

{:goal :intelligence
 :definition "The ability to steer the world..."
 :factors [{:name :world-model
            :argument "..."}]}

You’ve now committed to field names. You’ve assumed key-value pairs. You’ve lost the ability to freely vary the structure. Maybe one factor needs two sub-steps and therefore doesn't fit the key-value model. You now need custom logic for every edge case. The semantic overhead is larger.

S-expressions are agnostic to all of this. The only structure is what you put into them. No assumptions about what a symbol means. No assumptions about cardinality. Just syntax for trees.
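
A small sketch of what "no assumptions about cardinality" buys in practice: one generic traversal handles a factor with no children and a factor with two sub-steps, with no special cases. The function `outline` and the two `step` entries are hypothetical, and the forms are assumed to be nested Python lists with symbols as plain strings.

```python
def outline(node, depth=0):
    """One line per atom; a list's later elements are indented under its first."""
    if not isinstance(node, list):
        return ["  " * depth + node]
    lines = []
    for i, child in enumerate(node):
        # The first element stays at this depth; the rest nest one level deeper.
        lines.extend(outline(child, depth if i == 0 else depth + 1))
    return lines


# One factor is a bare reference; the other carries two hypothetical sub-steps.
factors = [
    "factors",
    ["world-model"],
    ["efficient-planning",
     ["step", "propose-subgoals"],
     ["step", "rank-candidates"]],
]

print("\n".join(outline(factors)))
```

The traversal never asks how many children a node has or what its head means, so changing the shape of one factor requires no change to the tool.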

Why This Reflects Thought

There’s a deeper point. The reason this works well is that thought is compositional. When you decompose a goal, explain a reason, or define a concept, you’re often implicitly building a tree of symbolic relationships. A structure of meaning. S-expressions directly reflect this shape. That makes them not just convenient, but aligned with how thinking works.

And because S-expressions carry no built-in semantics, you can impose your own interpretation—both informally through natural language, and formally through programmatic checks—without conflict. That makes them useful both for exploratory reasoning and for later automation.

Summary

S-expressions provide a uniquely effective design language for exploratory reasoning. They offer:

Structural and Computational Properties

  • A syntax with no enforced semantics.
  • A uniform tree representation—easy to parse, transform, and validate.
  • Trivial programmatic access: you can walk, pattern match, and rewrite the structure effortlessly.
  • Unconstrained composition: you can define, reuse, and refer to structures without schema overhead.

Cognitive and Epistemic Benefits

  • Explicit structure that makes conceptual confusion visible.
  • Semantic openness—symbols carry no meaning unless you assign it.
  • Natural integration with human reasoning: you describe structure while still using natural language for meaning.
  • Epistemic legibility—unresolved concepts remain visible in the structure, not lost in prose.

S-expressions don’t impose precision before you’re ready. They support the actual work of forming coherent ideas. When you don’t yet have a theory, but you know the shape of one is forming, they give you the right minimal language to write it down.

They’re not a programming language. They’re the language to use before you have one.

Caution: This is a tool for epistemic structure, not performance optimization. Like any scaffolding for theoretical understanding, it can support both alignment and capability work. Use judgment when applying or publishing conceptual decompositions derived with it, especially in domains where clearer structure could increase downstream system competence.