Overview

A top-to-bottom tour of what Neander is, the guarantees it makes, and why it is built the way it is. This page is the conceptual companion to the Specification; the spec is the normative source for every rule summarized here.

Contents

  1. Purpose
  2. What Neander looks like
  3. Safe by construction — the language is the sandbox
    1. What the language leaves to the host
  4. Guaranteed termination
    1. Why not Turing-complete
    2. How termination is provable
  5. No subprograms — only inline composition
    1. Many small programs, not one lasting one
  6. The budget system
    1. Per-execution budget 1 — Computation (Thalers)
    2. Per-execution budget 2 — Memory
    3. Per-execution budget 3 — Duration
    4. Fixed limit 1 — Per-call timeout
    5. Fixed limit 2 — Source size
    6. Fixed limit 3 — Nesting depth
    7. Fixed limit 4 — Repeat limit cap
    8. Enforcement, at a glance
  7. API orchestration, not a general-purpose language

Purpose

Neander exists to solve one problem: letting an enterprise system safely execute small programs written by an external AI agent, where those programs orchestrate the system’s own APIs.

The conventional way to connect an agent to an enterprise’s functions is to load every available function into the agent’s context as a tool definition. In an environment with hundreds or thousands of API functions, that floods the context with descriptions the agent will never use — degrading answer quality and hitting context limits before any work is done.

Neander inverts the model. Instead of a sprawling tool catalog, the agent holds a single compact language reference. It then discovers the few functions it needs and calls them by writing a program — a small, self-contained artifact the runtime executes server-side, close to the APIs.

[Figure: an AI agent exchanges "program" and "result" messages with the Neander runtime, which sits inside the embedding application alongside its APIs.]
The agent submits programs across the channel and results come back along it; the runtime executes them inside the embedding application and calls the registered APIs.

That immediately raises the security question: how do you run code an untrusted agent wrote, on a server that sits next to your production APIs, without it becoming an attack surface or a runaway cost? Everything below is the answer. But first — since the rest of this page is mostly about what a program cannot do — here is what one actually looks like.

What Neander looks like

A Neander program is a single, self-contained text artifact: a version header, a mandatory types block, and a main block. It is submitted as text and returns a JSON envelope; the agent reads the result back into its own context. An agent rarely writes just one program, though — because it starts out knowing nothing about the host’s APIs, it works in a short sequence, first discovering what exists and then calling it. A discovery program asks the runtime what it offers:

neander 1 {
  types {}
  main -> [Namespace] {
    return discover namespaces []   // [] means "no filter" — list them all
  }
}

The runtime answers with a list of namespaces and their descriptions; the agent selects one and drills into its functions and types the same way, then writes the program that does the real work — here, look up one record and return a field:

neander 1 {
  types {}
  main -> decimal(2, half_away) {
    // orders.get and the orders.Order type were learned from the discovery above
    let order: orders.Order =? call orders.get(id: 7341)
    return order.total
  }
}

That is the whole loop: discover, call, combine, return. The In Action page walks through complete agent–runtime sessions using real transcripts, covering discovery, calls, and learning from errors. Beyond that, a handful of deliberate design choices give Neander its distinct feel, most of them recognizable as inversions of what a general-purpose language offers:

  • Runtime discovery (discover). A program learns the available APIs — namespaces, functions, their types, even prose format documents — at execution time, straight from the runtime. There is no out-of-band integration step and no API catalog to keep in sync: the runtime is the single, always-current source of truth.
  • Failure lives in the type system (T!). Every call returns a failable — a value or an error, never a bare value. The program cannot read the result without first resolving that possibility: throw on it (=?), substitute a default (??), or branch on it (is error). An API error cannot be silently ignored.
  • Absence lives in the type system (T?). A value that might be missing has a distinct nullable type and cannot be used as if present without an explicit check — no null slips through unhandled.
  • Exact, arbitrary-precision numbers. int and decimal are unbounded and never implicitly mix. Each decimal binding carries an explicit scale and rounding mode, so monetary arithmetic is exact and reproducible — no floating-point drift, and identical results across conforming runtimes.
  • Structural typing. A program declares only the fields it actually uses from an API response; extra fields are ignored — so the agent reproduces only the slice of a response its task needs, never the API’s full schema.
  • Immutable and explicit, with no hidden state. Values are immutable and no binding is ever reassigned or shadowed; every argument is named, and there is no global or shared state. The result is a flat artifact that reads top to bottom — easy to audit.
  • A self-describing runtime. Submitting the empty program returns the full language Reference plus the runtime’s active budgets and limits in one exchange, so a fresh agent can bootstrap into an unfamiliar runtime with no prior knowledge.
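The exact-decimal point in the list above can be illustrated outside Neander. The following is a minimal sketch using Python's standard decimal module as a stand-in, not the runtime's implementation. It assumes that half_away in decimal(2, half_away) means "round ties away from zero", which corresponds to Python's ROUND_HALF_UP; the helper name to_scale is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_scale(value: str, scale: int) -> Decimal:
    """Round an exact decimal string to `scale` fractional digits, ties away from zero."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives a quantum of 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


# Assumption: half_away rounds a tie away from zero in both directions.
line_total = to_scale("19.995", 2)
refund = to_scale("-19.995", 2)

# Exact decimal arithmetic has no binary floating-point drift.
exact_sum = Decimal("0.1") + Decimal("0.2")

print(line_total, refund, exact_sum)
```

Because the scale and rounding mode are stated explicitly, the same inputs give the same digits wherever the calculation runs, which is the property the bullet on exact numbers describes.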

Safe by construction — the language is the sandbox

Neander treats every program as untrusted and validates it completely before execution. But it deliberately does not rely on a traditional sandbox (containers, VMs, restricted OS runtimes). Those add operational complexity and overhead. Instead, the dangerous capabilities simply do not exist in the language:

  • No I/O of any kind except calling APIs the host explicitly registered — no file access, no network access, no environment variables.
  • No dynamic code — no eval, no string-to-code conversion, no code generation.
  • No sleep or timing primitives — a program has no way to deliberately stall or delay.
  • No general recursion and no unbounded loops (see termination, below).

Because the hazards are absent rather than merely walled off, a program that passes validation is safe to execute by construction. Validation guarantees that a passing program will:

  • terminate (bounded steps, bounded iterations, no recursion);
  • call only functions that exist in the host’s API registry;
  • pass only correctly-typed arguments to those functions;
  • respect type boundaries (a value that might be null is never treated as non-null without an explicit check);
  • have no undefined behavior — every construct has exactly one interpretation.
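Two of the guarantees above (calls target only registered functions, and arguments are correctly typed) can be pictured with a small check. This is a simplified sketch under stated assumptions, not Neander's validator: the registry layout, the validate_call helper, and the use of Python types in place of Neander types are all hypothetical, and a real validator would work on the program's typed syntax tree rather than on literal values.

```python
# Hypothetical registry: function name -> {parameter name: expected Python type}.
REGISTRY = {
    "orders.get": {"id": int},
}


def validate_call(name: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call passes."""
    if name not in REGISTRY:
        return [f"unknown function: {name}"]
    params = REGISTRY[name]
    problems = []
    # Every declared parameter must be supplied (all arguments are named).
    for pname in sorted(params.keys() - args.keys()):
        problems.append(f"missing argument: {pname}")
    # Every supplied argument must be declared and have the expected type.
    for pname, value in args.items():
        if pname not in params:
            problems.append(f"unexpected argument: {pname}")
        elif not isinstance(value, params[pname]):
            problems.append(f"wrong type for {pname}")
    return problems


print(validate_call("orders.get", {"id": 7341}))
print(validate_call("orders.delete", {"id": 7341}))
print(validate_call("orders.get", {"id": "7341"}))
```

The point of the sketch is the ordering: every problem is found before anything runs, so a rejected program never reaches an API.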

The trust model is explicit: programs are untrusted; the API registry, the API providers, and the interpreter are trusted. The interpreter is the trusted computing base and is kept as small as possible.

What the language leaves to the host

Neander secures a single program execution, structurally, but it says nothing about who is submitting the program. Authentication, authorization, rate limiting, and per-client quotas are the embedding application’s responsibility, not the language’s. Neander guarantees that a program can reach only the registered APIs; it does not decide whether a given caller is allowed to use them.

Guaranteed termination

The cornerstone guarantee: every Neander program terminates. A host can accept any valid program and run it with no risk of an infinite loop or a hang.

Why not Turing-complete

This is a deliberate choice: Neander is intentionally sub-Turing. API orchestration does not need Turing completeness, because the patterns it requires (sequential calls, conditional branching, bounded iteration) are all expressible without general recursion or unbounded loops. This follows the precedent of languages like Dhall and Google’s CEL, and rests on the theoretical foundation of primitive recursive functions. Giving up unbounded computation costs Neander nothing it actually needs, and buys it static, provable termination.

How termination is provable

Termination is established statically, before a program runs:

  • No general recursion — there are no user-defined functions to recurse, so no call can loop back on itself.
  • each is bounded by list length — iteration count is fixed by the data being traversed.
  • repeat is capped by a hard maximum — the runtime enforces a fixed ceiling on how many times any repeat loop may run, checked before the program runs, so it can never iterate without bound.

The budget system (next) then adds a second, independent layer of resource control on top of this structural guarantee.

No subprograms — only inline composition

A program cannot define its own functions. The only callable units in Neander are the API functions registered by the host and the language’s built-in functions. There are no user-defined procedures or subroutines — no subprograms of any kind.

This is not an omission; it is a deliberate decision, and it is what makes the termination guarantee above hold so cleanly. User-defined functions would open the door to recursion — directly, or through mutual references between two functions — and recursion is exactly what Neander rules out to guarantee termination. Removing the construct removes the entire problem at its root, rather than trying to detect and forbid recursive call graphs after the fact.

The decision costs almost nothing in expressiveness. Even non-recursive local functions would add naming, scoping, and call-graph complexity without enabling a single pattern that cannot already be written inline as straight-line code with let, if, and the bounded loops (each, repeat). And because Neander programs are short, one-shot, and disposable, the usual payoff of factoring code into reusable functions — deduplication across a large, long-lived codebase — is marginal here. An agent writes a small program, it runs once, and it is gone.

So composition within a program happens by sequencing and nesting inline, not by abstraction: bind intermediate results with let, branch with if, iterate with the bounded loops, and call out to the registered APIs. The result is a flat, fully-inlined artifact with no internal call graph to analyze — simpler for the runtime to validate and execute, and simpler for a human to audit.

Many small programs, not one lasting one

There is a bigger shift hiding here, and it is worth stating plainly. The unit of reuse and iteration in Neander is not a function inside a program — it is another program.

An agent does not write one durable program and maintain it. It interacts with the runtime the way a developer interacts with a terminal: it writes and runs a small program, reads the result, and then writes the next program informed by what it just learned. A typical session is a sequence — submit the empty program to receive the Reference; write a discover program to find the right API; read the result; write a call program to do the work. Knowledge accumulates in the agent’s own context, not in the language runtime, which is precisely why Neander needs no module system, no imports, and no shared state between programs.

This inverts a deep habit. Conventional code is written to last — an asset to be named, structured, reused, versioned, and maintained over time. A Neander program is the opposite: a throwaway product, written for a single purpose, run once, and discarded. It is closer to a shell one-liner or a REPL expression than to a module in a codebase. There is nothing to keep, because the durable, intelligent, stateful party in the system is the agent — the program is just the disposable instrument it reaches for in the moment.

That is also what makes each program a complete audit artifact: because it carries no hidden dependencies and no shared state, everything it does is visible in its own source, and once it has run, it leaves nothing behind.

The budget system

A Neander program runs inside an enterprise application that ultimately pays for the compute, memory, time, and downstream API calls the program consumes. To keep that cost bounded — and to stop any one agent-authored program from monopolizing or destabilizing the host — the runtime enforces a set of ceilings on every program execution, assigned according to the embedding application’s policy.

Every budget and limit here bounds a single program execution. Bounding aggregate or concurrent usage across program executions is the embedding application’s job: Neander deliberately has no notion of a session or agent identity, so only the host can enforce limits that span more than one run.

Two important framing points:

  • Programs never declare or negotiate their own budgets. There is no language syntax to inspect, request, or change a budget — it is an operational concern, not a language one.
  • The agent learns the current values up front. When an agent submits the empty program, the runtime’s Reference Response reports the active budgets and limits, so the agent knows the ceilings before it writes its first real program.

The dimensions fall into two groups. Three per-execution budgets are consumed as a program runs and, if exhausted, terminate it with a budget_exceeded Abort. Four fixed structural limits are properties of the runtime — configuration constants, not budgets the program consumes. Three of them (source size, nesting depth, and the repeat cap) are checked before execution, so exceeding one is a Flaw that rejects the program outright; the per-call timeout is the exception, checked at runtime and failing a single call recoverably.

Per-execution budget 1 — Computation (Thalers)

The unit of computational work is the Thaler (after the old coin — and a nod to the Neanderthal). The runtime assigns a Thaler budget to each execution, and every operation that performs real computation costs Thalers.

The cost is reckoned at the atomic operation level, not per statement: a single arithmetic, comparison, logical, type-check, field-access, or index operation costs 1 Thaler; a call, a discover, and each loop iteration cost 1 Thaler each (the loop body’s operations are charged on top). Operations that perform no computation cost nothing — binding a name with let, returning, yielding, skipping, and evaluating a literal are all free, because any work needed to produce their value was already charged to the expression that built it.
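As a worked example of the cost rules above, consider a loop over three records whose body reads one field and performs one comparison. Each iteration costs 1 (the iteration) + 1 (field access) + 1 (comparison) = 3 Thalers, so the loop costs 9 in total. The sketch below models that accounting in Python; the ThalerMeter class and run_loop function are hypothetical illustrations, not part of any Neander runtime.

```python
class BudgetExceeded(Exception):
    """Stands in for the budget_exceeded Abort."""


class ThalerMeter:
    def __init__(self, budget: int):
        self.budget = budget
        self.spent = 0

    def charge(self, thalers: int = 1) -> None:
        # Check before performing the operation, so the budget is never overdrawn.
        if self.spent + thalers > self.budget:
            raise BudgetExceeded("computation")
        self.spent += thalers


def run_loop(meter: ThalerMeter, items: list) -> list:
    """Model a bounded loop that yields `item.total > 100` for each item."""
    results = []
    for item in items:
        meter.charge()                # the loop iteration itself
        meter.charge()                # field access: item.total
        total = item["total"]         # binding a name is free
        meter.charge()                # comparison (the literal 100 is free)
        results.append(total > 100)   # yielding is free
    return results


meter = ThalerMeter(budget=9)
flags = run_loop(meter, [{"total": 50}, {"total": 150}, {"total": 200}])
print(flags, meter.spent)
```

With a budget of 8 instead of 9, the same loop would stop at the final comparison, before the ninth Thaler is spent.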

Structural comparisons recurse: comparing two records with three comparable fields costs 3 Thalers; comparing two 100-element lists costs up to 100 (it short-circuits on the first unequal element). Because the cost table is fixed by the spec, Thaler cost is identical across conforming runtimes.
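The two figures above (3 Thalers for a three-field record, up to 100 for a 100-element list) can be reproduced with a short recursive cost function. This is a sketch under assumptions the text does not settle: scalar comparisons cost 1 Thaler, records are modeled as Python dicts with identical keys and every field is compared, and only lists short-circuit. The function name compare is hypothetical.

```python
def compare(a, b):
    """Return (equal, thalers) for a structural equality check."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Assumption: records have the same fields and every field is compared.
        equal, cost = True, 0
        for key in a:
            field_equal, field_cost = compare(a[key], b[key])
            cost += field_cost
            equal = equal and field_equal
        return equal, cost
    if isinstance(a, list) and isinstance(b, list):
        cost = 0
        for x, y in zip(a, b):
            element_equal, element_cost = compare(x, y)
            cost += element_cost
            if not element_equal:
                return False, cost  # short-circuit on the first unequal element
        # Assumption: the final length check is not charged separately.
        return len(a) == len(b), cost
    return a == b, 1  # one atomic comparison


record = {"id": 1, "status": "open", "total": 10}
print(compare(record, dict(record)))
print(compare(list(range(100)), list(range(100))))
print(compare([0] + [7] * 99, [1] + [7] * 99))
```

Equal 100-element lists cost the full 100 Thalers, while lists that differ in their first element cost only 1, which is why the text says "up to 100".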

Per-execution budget 2 — Memory

The runtime assigns a separate memory budget, in whole kilobytes, capping peak allocation. Every value allocation — a binding, a list element, a record field, a map entry — consumes memory proportional to its size, and the runtime checks the budget before each allocation. If an allocation would push peak usage over the ceiling, it raises a budget_exceeded Abort before allocating.
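The check-before-allocate rule can be sketched as a small accounting object. This is an illustrative model only: the MemoryBudget class and its method names are hypothetical, byte sizes are supplied by the caller rather than measured, and rounding the reported peak up to the next whole kilobyte is an assumption.

```python
class BudgetExceeded(Exception):
    """Stands in for the budget_exceeded Abort."""


class MemoryBudget:
    def __init__(self, limit_kb: int):
        self.limit_bytes = limit_kb * 1024
        self.in_use = 0
        self.peak = 0

    def allocate(self, nbytes: int) -> None:
        # Refuse first, allocate second: usage never exceeds the ceiling.
        if self.in_use + nbytes > self.limit_bytes:
            raise BudgetExceeded("memory")
        self.in_use += nbytes
        self.peak = max(self.peak, self.in_use)

    def release(self, nbytes: int) -> None:
        self.in_use -= nbytes

    def peak_kb(self) -> int:
        # Assumption: telemetry rounds up to whole kilobytes.
        return -(-self.peak // 1024)


budget = MemoryBudget(limit_kb=1)
budget.allocate(600)          # fits within 1024 bytes
try:
    budget.allocate(600)      # would reach 1200 bytes, so it is refused
except BudgetExceeded:
    print("aborted before allocating; in use:", budget.in_use)
print("peak KB:", budget.peak_kb())
```

The refused allocation leaves the counters untouched, matching the requirement that the Abort is raised before the allocation happens.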

Memory and computation are kept separate because they are genuinely different resources: a program can use lots of memory with little computation (receiving a large API response) or lots of computation with little memory (heavy arithmetic on a few values). Splitting them lets the host tune each independently.

Unlike Thalers, the exact byte cost of a value is implementation-defined — a runtime written in a different host language represents values differently. What the spec mandates is the observable contract: a real ceiling on peak allocation, an Abort before any allocation that would exceed it, and peak-usage telemetry reported in whole kilobytes. A conforming runtime must still account for the big consumers (API payloads; the contents of lists, maps, and records; let-bound values; the value returned from main; and returned document content) so the budget stays a meaningful bound.

Per-execution budget 3 — Duration

The runtime assigns a duration budget in wall-clock milliseconds, limiting total elapsed time, which includes both computation and time spent waiting for API responses. Reaching it terminates execution immediately with a budget_exceeded Abort.

Duration is independent of computation: a program might burn its Thalers fast through intensive arithmetic, or exhaust its duration budget waiting on slow APIs while spending almost no Thalers.

Fixed limit 1 — Per-call timeout

Beyond the total duration budget, the runtime enforces a per-call timeout on every individual call. It is runtime policy — the agent does not declare it and API functions cannot override it.

Crucially, a timed-out call is not an Abort. The runtime terminates that one call and injects a Runtime Error, so the program’s ordinary failure handling (=?, ??, or is error) applies and execution can continue. The wait still counts against the duration budget; if total elapsed time then reaches that budget, the result is still a terminal Abort. The two mechanisms compose: one slow API can fail recoverably without taking down the whole program.
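How the two time limits compose can be shown with a simulated clock. The sketch below does no real waiting: each call is represented only by how long it would take, and the run_calls function, its parameter names, and the outcome strings are hypothetical. For simplicity it checks the duration budget after each call, whereas a real runtime enforces it continuously.

```python
class Abort(Exception):
    """Stands in for the terminal budget_exceeded Abort."""


def run_calls(call_durations_ms, per_call_timeout_ms, duration_budget_ms):
    """Simulate sequential calls on a fake clock and return (outcomes, elapsed)."""
    elapsed = 0
    outcomes = []
    for needed in call_durations_ms:
        # A call never waits longer than the per-call timeout.
        waited = min(needed, per_call_timeout_ms)
        elapsed += waited  # the wait counts against the duration budget
        if elapsed >= duration_budget_ms:
            raise Abort("budget_exceeded: duration")
        # A timed-out call fails recoverably; the program keeps running.
        outcomes.append("ok" if needed <= per_call_timeout_ms else "error: timeout")
    return outcomes, elapsed


# One slow API among fast ones: the slow call fails, the program finishes.
print(run_calls([100, 5000, 100], per_call_timeout_ms=1000, duration_budget_ms=10000))

# Several slow calls in a row: each fails recoverably until the total runs out.
try:
    run_calls([5000, 5000, 5000], per_call_timeout_ms=1000, duration_budget_ms=2500)
except Abort as abort:
    print(abort)
```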

Fixed limit 2 — Source size

The runtime caps the maximum size of submitted source, in UTF-8 bytes. It is checked before lexing, so an oversized submission is rejected as a Flaw before execution starts.
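Because the limit is stated in UTF-8 bytes rather than characters, non-ASCII source text counts for more than its character length. A minimal sketch of such a check follows; the function name check_source_size and the limits used are hypothetical.

```python
def check_source_size(source: str, max_bytes: int) -> bool:
    """Return True if the source fits within max_bytes when encoded as UTF-8."""
    return len(source.encode("utf-8")) <= max_bytes


ascii_text = "abcdefghij"        # 10 characters, 10 bytes
accented_text = "\u00e9" * 10    # 10 characters, 2 bytes each in UTF-8

print(check_source_size(ascii_text, 10))
print(check_source_size(accented_text, 10))
print(check_source_size(accented_text, 20))
```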

Fixed limit 3 — Nesting depth

The runtime caps the maximum AST nesting depth (the longest root-to-leaf path in the program’s syntax tree), counting nested blocks, expressions, and literal structures. It is checked incrementally during parsing (a cheap integer compare on each descent); the first node that would exceed it stops parsing, and the program is rejected with a Flaw. The limit prevents a program that is short and within every other budget from overflowing the interpreter’s host-language stack during parsing, type checking, or execution.
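The incremental check can be illustrated with a toy recursive-descent parser. The sketch below parses only nested square brackets, not Neander, and the names Flaw, parse_nested, and max_depth are hypothetical; it shows the integer compare on each descent and why the check protects the host-language stack.

```python
class Flaw(Exception):
    """Stands in for a pre-execution rejection."""


def parse_nested(text: str, max_depth: int):
    """Parse nested square brackets, e.g. "[[][[]]]", into nested lists."""
    pos = 0

    def parse_list(depth: int):
        nonlocal pos
        if depth > max_depth:  # cheap integer compare on each descent
            raise Flaw(f"nesting depth exceeds {max_depth}")
        pos += 1  # consume "["
        items = []
        while pos < len(text) and text[pos] == "[":
            items.append(parse_list(depth + 1))
        if pos >= len(text) or text[pos] != "]":
            raise Flaw("expected ']'")
        pos += 1  # consume "]"
        return items

    if not text.startswith("["):
        raise Flaw("expected '['")
    tree = parse_list(1)
    if pos != len(text):
        raise Flaw("trailing input")
    return tree


print(parse_nested("[[][[]]]", max_depth=3))

# A short but pathologically deep input is rejected at depth 65,
# long before Python's own recursion limit would be reached.
try:
    parse_nested("[" * 5000 + "]" * 5000, max_depth=64)
except Flaw as flaw:
    print(flaw)
```

Without the depth check, the second input would exhaust the parser's call stack even though it is only 10,000 bytes long, which is the case the limit exists to prevent.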

Fixed limit 4 — Repeat limit cap

The runtime sets a maximum permitted number of iterations for every repeat loop. The cap is checked at validation, so a loop that exceeds it is rejected as a Flaw before the program runs.
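A validation-time check of this kind can be sketched as a walk over the program's syntax tree. The tree shape below (dicts with "kind", "limit", and "body" keys), the cap value, and the function name find_repeat_flaws are all hypothetical, and the sketch assumes each repeat loop carries a limit that is known as a number at validation time.

```python
REPEAT_CAP = 1000  # hypothetical runtime configuration value


def find_repeat_flaws(node: dict, cap: int = REPEAT_CAP) -> list:
    """Walk a toy syntax tree and report every repeat loop whose limit exceeds the cap."""
    flaws = []
    if node.get("kind") == "repeat" and node["limit"] > cap:
        flaws.append(f"repeat limit {node['limit']} exceeds cap {cap}")
    for child in node.get("body", []):
        flaws.extend(find_repeat_flaws(child, cap))
    return flaws


program = {
    "kind": "main",
    "body": [
        {"kind": "repeat", "limit": 10, "body": [
            {"kind": "repeat", "limit": 5000, "body": []},
        ]},
    ],
}

print(find_repeat_flaws(program))
```

Any non-empty result would reject the program before it runs, so no loop iteration is ever executed for a program that violates the cap.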

Enforcement, at a glance

Dimension         Kind                  When checked             On violation
Computation       Per-execution budget  Before every operation   Abort
Memory            Per-execution budget  Before every allocation  Abort
Duration          Per-execution budget  Continuously             Abort
Per-call timeout  Fixed limit           Per call                 Runtime Error (recoverable)
Source size       Fixed limit           Before lexing            Flaw
Nesting depth     Fixed limit           During parsing           Flaw
Repeat cap        Fixed limit           During validation        Flaw

The three per-execution budgets abort a running program; three of the four fixed limits reject it before it runs, while the per-call timeout fails a single call during it.

API orchestration, not a general-purpose language

Neander’s job is to orchestrate APIs, not to be a general-purpose language — and it defines no APIs of its own. Throughout Neander, an API is a function exposed to programs by the runtime, registered by the embedding application. A program’s entire job is to discover what is available (discover), invoke registered APIs (call), and combine the results with conditional logic and bounded iteration — never to implement an API itself.


© 2026 New Adventures in IT. Licensed under CC BY 4.0.