Neander in action
The Overview describes how Neander works; this page shows it. Below are annotated transcripts of an AI agent — a mid-tier, Sonnet-class model — working against a live Neander runtime, captured from real sessions against the reference runtime implementation. The agent starts cold — it knows neither the host’s APIs nor the Neander language itself, and learns both in-band from the runtime: the language from the empty-program bootstrap, then the APIs by discovering them.
There is one thing the agent cannot bootstrap on its own, though: the convention itself. A blank agent has no way to guess that an empty program returns the Reference — or even how to reach the runtime. The embedding application supplies that, as a minimal launch prompt: in effect, submit programs on this channel; send an empty one first to get the language reference. That single instruction is the whole of the bootstrap contract — everything after it, the agent learns from the runtime’s responses. The transcripts below pick up right after that launch prompt.
Each example is a sequence of round-trips. The agent’s only capability is to submit a program as text and read the JSON response; everything it learns, it learns from those responses. Watch the same loop recur: discover, call, combine, return.
These transcripts are for humans, to show how the language is used in practice. They are not an input for agents — do not point a running agent at this page. In particular, the runtime’s Reference Response (returned to the empty program) is elided below to a short placeholder: an agent must receive the Reference in-band from the runtime it is actually connected to, at that runtime’s exact version. The Reference page explains why.
Contents
- How to read a transcript
- Example 1 — A single lookup
- Example 2 — Discovering a format
- Example 3 — Composing a view
- Example 4 — Learning from errors
How to read a transcript
A little orientation before the first example:
- The empty program bootstraps the agent. An agent that knows nothing submits an empty program — zero bytes — and the runtime replies with the full language Reference plus its active budgets and limits. We elide the Reference body below; it is the same document linked from the Reference page.
- Discovery handles are opaque. discover returns handles the program cannot inspect directly. To read what they describe, the agent returns them from main and reads the JSON the runtime serializes, so a “discovery program” is usually just return discover ….
- Notation. A block labelled Agent submits is the program text sent to the runtime; Runtime responds is the JSON envelope returned. Long or repetitive payloads are trimmed with … and noted where that happens.
Example 1 — A single lookup
Task given to the agent: “The remote system manages reservations. Find the booking with id 8821 and report its fare amount.”
The simplest useful exchange, end to end. The agent starts cold: it does not know
the system manages “bookings,” that a bookings.get function exists, or what a
booking record looks like. It discovers all of that, then does the work.
Round 1 — Bootstrap
The agent submits an empty program to learn the language and the runtime’s ceilings.
Agent submits: an empty program — nothing at all.
Runtime responds:
{
"success": true,
"result": "# The Neander Reference v1_draft_20260618T104500Z\n\n… (the full language reference — elided; see the Reference page)",
"meta": {
"neanderVersion": 1,
"thalerBudget": 5000,
"memoryBudgetKb": 2048,
"maxDurationMs": 10000,
"perCallTimeoutMs": 3000,
"maxProgramSizeBytes": 65536,
"maxNestingDepth": 64,
"maxRepeatLimit": 1000
}
}
The agent now knows the language and the ceilings it must stay under. From here on it can write real programs.
These budgets and limits are set by the embedding application, not fixed by the language — the values above are this runtime’s configuration for this session. A different deployment can assign different ceilings, and the agent always learns the active values from this bootstrap response before it writes a line.
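As a host-side illustration, the ceilings can be read straight out of the bootstrap envelope rather than hard-coded. This is a minimal Python sketch: the field names come from the response above, while the `Ceilings` class and `read_ceilings` function are hypothetical names chosen for this example and are not part of Neander.

```python
import json
from dataclasses import dataclass

# The bootstrap envelope from this round, with the Reference body elided.
BOOTSTRAP_JSON = """
{
  "success": true,
  "result": "(Reference elided)",
  "meta": {
    "neanderVersion": 1,
    "thalerBudget": 5000,
    "memoryBudgetKb": 2048,
    "maxDurationMs": 10000,
    "perCallTimeoutMs": 3000,
    "maxProgramSizeBytes": 65536,
    "maxNestingDepth": 64,
    "maxRepeatLimit": 1000
  }
}
"""


@dataclass(frozen=True)
class Ceilings:
    """Hypothetical container for the limits a client might track."""
    thaler_budget: int
    max_repeat_limit: int
    max_program_size_bytes: int


def read_ceilings(envelope: dict) -> Ceilings:
    """Read the active ceilings from a bootstrap envelope."""
    if not envelope.get("success"):
        raise ValueError("bootstrap did not succeed")
    meta = envelope["meta"]
    return Ceilings(
        thaler_budget=meta["thalerBudget"],
        max_repeat_limit=meta["maxRepeatLimit"],
        max_program_size_bytes=meta["maxProgramSizeBytes"],
    )


ceilings = read_ceilings(json.loads(BOOTSTRAP_JSON))
print(ceilings)
```

Because the values are configuration, a client written this way picks up a different deployment's ceilings without any change.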
Round 2 — Discover the namespaces
The agent has no idea what the host exposes, so it lists the namespaces.
Agent submits:
neander 1 {
  types {}
  main -> [Namespace] {
    return discover namespaces []
  }
}
Runtime responds:
{
"success": true,
"result": [
{
"name": "bookings",
"description": "Booking management — retrieve, list, confirm, and escalate reservations for the hospitality backend"
},
{
"name": "concierge",
"description": "Concierge planning — assemble multi-stay room blocks for the hospitality backend from an encoded request"
},
{
"name": "guests",
"description": "Guest profile lookup — retrieve and list guests for the hospitality backend"
},
{
"name": "metrics",
"description": "Chunked access to historical reading measurements"
},
{
"name": "payments",
"description": "Payment processing — charge bookings, issue refunds, and look up prior charges"
},
{
"name": "reports",
"description": "Generate audit reports for reservations and guests"
}
],
"meta": {
"thalersConsumed": 1,
"memoryConsumedKb": 1,
"durationMs": 2,
"apiCalls": []
}
}
The agent reads the namespace names and descriptions and picks the one that
matches the task — bookings.
Round 3 — Discover the function
Now it lists the functions in bookings to find the right one and learn its
shape.
Agent submits:
neander 1 {
  types {}
  main -> [Function] {
    let ns: Namespace =? discover namespace "bookings"
    return discover functions ns []
  }
}
Runtime responds:
{
"success": true,
"result": [
… (bookings.confirm and bookings.escalate omitted)
{
"qualifiedName": "bookings.get",
"description": "Get a booking by its numeric id",
"params": {
"id": "int"
},
"returnType": "bookings.Booking",
"errors": {
"404": "Booking not found"
},
"types": {
"bookings.Booking": {
"description": "A reservation record with id, fare amount, state, and the guest who placed it",
"fields": {
"id": "int",
"fare": "decimal(2, half_away)",
"state": "string",
"guestId": "int"
}
}
}
},
… (bookings.list and bookings.page omitted)
],
"meta": {
"thalersConsumed": 2,
"memoryConsumedKb": 1,
"durationMs": 3,
"apiCalls": []
}
}
The namespace also offers confirm, escalate, list, and page; the agent
picks get. This is the moment that replaces an integration
document: the serialized Function carries bookings.get, its params, its
returnType, and a complete types side-table describing bookings.Booking.
Every name in that JSON is the exact form the agent writes next.
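To make the "replaces an integration document" point concrete, here is a small Python sketch that turns the serialized Function above into a one-line signature and looks up the fields of its return type. The dict is copied from the response; `signature` and `return_fields` are hypothetical helper names, and the rendered signature format is only illustrative.

```python
# The bookings.get entry from the discovery response above, as a Python dict.
BOOKINGS_GET = {
    "qualifiedName": "bookings.get",
    "description": "Get a booking by its numeric id",
    "params": {"id": "int"},
    "returnType": "bookings.Booking",
    "errors": {"404": "Booking not found"},
    "types": {
        "bookings.Booking": {
            "description": "A reservation record with id, fare amount, "
                           "state, and the guest who placed it",
            "fields": {
                "id": "int",
                "fare": "decimal(2, half_away)",
                "state": "string",
                "guestId": "int",
            },
        }
    },
}


def signature(fn: dict) -> str:
    """Render a serialized Function as a one-line signature."""
    params = ", ".join(f"{name}: {typ}" for name, typ in fn["params"].items())
    return f"{fn['qualifiedName']}({params}) -> {fn['returnType']}"


def return_fields(fn: dict) -> dict:
    """Look the return type up in the function's own types side-table."""
    return fn["types"][fn["returnType"]]["fields"]


print(signature(BOOKINGS_GET))
print(return_fields(BOOKINGS_GET)["fare"])
```

Everything needed to write the call in the next round (the qualified name, the parameter name and type, and the record's field names) is present in this one object.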
Round 4 — Make the call
The agent writes the program that does the work — look up the booking, return its fare.
Agent submits:
neander 1 {
  types {}
  main -> decimal(2, half_away) {
    let booking: bookings.Booking =? call bookings.get(id: 8821)
    return booking.fare
  }
}
Runtime responds:
{
"success": true,
"result": "1850.00",
"meta": {
"thalersConsumed": 2,
"memoryConsumedKb": 1,
"durationMs": 3,
"apiCalls": [
{
"function": "bookings.get",
"durationMs": 0
}
]
}
}
Notice the agent never declared a Booking type of its own — it used the
qualified name bookings.Booking straight from the discovery output, so the
types block stays empty. That is the whole loop: bootstrap, discover, call,
return. The fare comes back as the string "1850.00", so its precision is never
in doubt.
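A short Python sketch of why the string form matters on the receiving side. It assumes only what the response above shows, namely that the decimal arrives as a JSON string; the variable names are illustrative.

```python
import json
from decimal import Decimal

# The relevant part of the round 4 envelope.
envelope = json.loads('{"success": true, "result": "1850.00"}')

fare = Decimal(envelope["result"])   # exact: keeps both fractional digits
lossy = float(envelope["result"])    # binary float: the declared scale is gone

print(fare)    # 1850.00
print(lossy)   # 1850.0
```

Parsing into `Decimal` preserves the two-digit scale the type `decimal(2, half_away)` promises, whereas a float drops it.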
Example 2 — Discovering a format
Task given to the agent: plan a multi-stay room block and report the total fare the concierge service computes.
Beyond namespaces and functions, discover reaches a third kind of thing — a
document — and this example puts it to use. The concierge service can plan a
block, but its planBlock function takes the whole request as a single string in
a house-specific format the agent cannot possibly know in advance, and a
malformed request is rejected with a hintless error. That format is not on the web
or in any external doc: the concierge namespace publishes it itself, as a
document the runtime serves in-band — so the agent discovers and reads it the same
way it discovered the function, without ever leaving the runtime.
The bootstrap and namespace-discovery rounds (rounds 1 and 2) look just like Example 1, so we pick up at round 3, where it gets interesting.
Round 3 — Discover the function, and notice the document
Listing the concierge functions surfaces planBlock, whose own description
points at a format document by name (there is no machine link — the connection
lives in prose and shared search terms).
Agent submits:
neander 1 {
  types {}
  main -> [Function] {
    let ns: Namespace =? discover namespace "concierge"
    return discover functions ns []
  }
}
Runtime responds:
{
"success": true,
"result": [
{
"qualifiedName": "concierge.planBlock",
"description": "Plan a multi-stay room block and compute its total fare. The 'request' parameter MUST be a Helix Block Request — a house-specific encoded string. See the 'blockRequestFormat' document in this namespace for the exact encoding. Malformed requests are rejected.",
"params": {
"request": "string"
},
"returnType": "concierge.BlockPlan",
"errors": {
"400": "Malformed request — does not conform to the Helix Block Request format"
},
"types": {
"concierge.BlockPlan": {
"description": "The result of planning a room block: the number of stays and the summed total fare",
"fields": {
"stays": "int",
"totalFare": "decimal(2, half_away)"
}
}
}
}
],
"meta": {
"thalersConsumed": 2,
"memoryConsumedKb": 1,
"durationMs": 3,
"apiCalls": []
}
}
The description spells out the catch: the request parameter must be a Helix
Block Request, and it names the blockRequestFormat document for the exact
encoding. The agent cannot guess that format — it has to go and read it.
Round 4 — Fetch and read the format document
The agent fetches the named document and returns it, so the runtime serializes
its prose content into the response for the agent to read.
Agent submits:
neander 1 {
  types {}
  main -> [Document] {
    let ns: Namespace =? discover namespace "concierge"
    return discover documents ns ["blockRequestFormat"]
  }
}
Runtime responds:
{
"success": true,
"result": [
{
"name": "blockRequestFormat",
"description": "Helix Block Request format — the line-oriented encoding the concierge.planBlock 'request' parameter expects, with field order, separators, and the header count rule",
"content": "# Helix Block Request — Format v1\n\n… (the full format spec — shown below)"
}
],
"meta": {
"thalersConsumed": 2,
"memoryConsumedKb": 4,
"durationMs": 2,
"apiCalls": []
}
}
The field that matters is content — the format spec itself, opaque prose the
runtime never parses. The encoding is deliberately unlike any standard — not
JSON, CSV, or anything an agent could have picked up in training — so prior
knowledge is no shortcut: the only path to a valid request is to discover, fetch,
read, and apply this document. This is exactly what the agent read:
# Helix Block Request — Format v1
The concierge.planBlock "request" parameter is a single text string encoding a
room block of one or more stays. It is NOT JSON, CSV, or any standard format.
Encode it exactly as described below; any deviation is rejected.
## Shape
Line 1 — header:
HBR1*<n>
the literal token "HBR1", then a single asterisk "*", then <n>: the number of
stay lines that follow. <n> MUST equal the count of stay lines exactly.
Lines 2..n+1 — one stay per line:
<ordinal>) <guestId> ~ <nights> ~ <nightlyRate>
- <ordinal>: the 1-based position of the line, written as the number followed
by a close-paren ")", e.g. "1)", "2)", "3)". Ordinals must run 1,2,3,... with
no gaps and in order.
- then a single space, then the three fields separated by " ~ " (space, tilde,
space), in THIS EXACT ORDER:
<guestId> an integer guest id
<nights> a positive integer number of nights
<nightlyRate> a decimal with exactly two fractional digits (e.g. 80.00)
Lines are joined by a single newline. There is NO trailing newline.
## What the endpoint computes
- stays = the number of stay lines (= <n>)
- totalFare = the sum over all stay lines of (nights * nightlyRate)
## Worked example
A block of three stays — guest 7 for 2 nights at 150.00, guest 7 for 1 night at
99.50, and guest 12 for 3 nights at 80.00 — is encoded as:
HBR1*3
1) 7 ~ 2 ~ 150.00
2) 7 ~ 1 ~ 99.50
3) 12 ~ 3 ~ 80.00
For that request the endpoint returns stays = 3 and totalFare = 639.50
(2*150.00 + 1*99.50 + 3*80.00 = 639.50).
The runtime hands this over verbatim and does nothing further with it; reading and applying it is entirely the agent’s job. Notice that the worked example uses different data (guests 7, 7, and 12) than the task, whose stays are for guests 5, 5, and 31, so the agent has to understand the format and apply it to new values, not copy the sample.
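The rules in the format document are compact enough to capture in a few lines. The following Python encoder is a sketch written from the document's text, not code from the concierge service; `encode_block_request` is a hypothetical name, and passing rates as text is a choice made here so the two fractional digits survive unchanged.

```python
from decimal import Decimal


def encode_block_request(stays):
    """Encode (guest_id, nights, nightly_rate) tuples as a Helix Block Request.

    nightly_rate is text such as "80.00" so its scale is preserved exactly.
    """
    if not stays:
        raise ValueError("a block needs at least one stay")
    lines = [f"HBR1*{len(stays)}"]  # header: token, asterisk, stay count
    for ordinal, (guest_id, nights, nightly_rate) in enumerate(stays, start=1):
        if nights <= 0:
            raise ValueError("nights must be a positive integer")
        rate = Decimal(nightly_rate)
        if rate.as_tuple().exponent != -2:
            raise ValueError("nightlyRate needs exactly two fractional digits")
        lines.append(f"{ordinal}) {guest_id} ~ {nights} ~ {rate:f}")
    return "\n".join(lines)  # single newlines, no trailing newline


# The document's own worked example.
request = encode_block_request([(7, 2, "150.00"), (7, 1, "99.50"), (12, 3, "80.00")])
print(request)
```

Run on the worked example's three stays, it reproduces the four encoded lines shown in the document, which is a reasonable check that the rules were read correctly.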
Round 5 — Encode the request and call
Having read the format, the agent encodes the three stays and calls planBlock.
Agent submits:
neander 1 {
  types {}
  main -> concierge.BlockPlan {
    let plan: concierge.BlockPlan =? call concierge.planBlock(request: "HBR1*3\n1) 5 ~ 4 ~ 120.00\n2) 5 ~ 1 ~ 89.99\n3) 31 ~ 2 ~ 200.00")
    return plan
  }
}
Runtime responds:
{
"success": true,
"result": {
"stays": 3,
"totalFare": "969.99"
},
"meta": {
"thalersConsumed": 1,
"memoryConsumedKb": 1,
"durationMs": 4,
"apiCalls": [
{
"function": "concierge.planBlock",
"durationMs": 0
}
]
}
}
The encoded request follows the document exactly — header HBR1*3, then one
ordinal-numbered line per stay with ~-separated fields — but carries the task’s
data. The agent chose to return the whole concierge.BlockPlan record, so both
stays and the totalFare of "969.99" come back. It integrated with an API
whose contract it could not have known up front — entirely from what the runtime
handed it in-band.
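The returned total can be checked by hand against the rule in the format document (totalFare is the sum of nights times nightlyRate). A minimal Python sketch of that arithmetic, using exact decimals; `total_fare` is a hypothetical helper, and the stay values are the ones encoded in the round 5 request.

```python
from decimal import Decimal


def total_fare(stays):
    """Sum nights * nightlyRate over (guest_id, nights, rate_text) tuples."""
    return sum(
        (nights * Decimal(rate) for _, nights, rate in stays),
        Decimal("0.00"),
    )


task_stays = [(5, 4, "120.00"), (5, 1, "89.99"), (31, 2, "200.00")]
sample_stays = [(7, 2, "150.00"), (7, 1, "99.50"), (12, 3, "80.00")]

print(total_fare(task_stays))    # 969.99
print(total_fare(sample_stays))  # 639.50
```

Both figures agree with the source: 480.00 + 89.99 + 400.00 for the task, and the 639.50 stated in the document's worked example.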
Example 3 — Composing a view
Task given to the agent: find booking 8821, look up the guest who placed it, and return one record holding the booking id, the guest’s name, the fare, and the guest’s loyalty tier.
The first example that builds something — a single record stitched together
from two different namespaces — and the first to put a program’s own types block
to work.
The agent discovers the bookings and guests namespaces and their get
functions exactly as in the earlier examples — it learns guests.get and the
guests.Guest type the same way it learned bookings.get. We skip those four
rounds (1–4) and pick up at round 5, the program that does the composition.
Round 5 — Compose the booking-and-guest view
Two sequential calls, where the second feeds on the first (guests.get takes
booking.guestId), then a record literal at return. Watch what lands in the
types block and what doesn’t: the inputs use the qualified bookings.Booking
and guests.Guest types straight from discovery, exactly as in the earlier
examples — but the output shape exists in no namespace, so the agent declares it
as a program-local type (here named Result), and the runtime types the
returned record literal against it.
Agent submits:
neander 1 {
  types {
    Result {
      bookingId: int
      guestName: string
      fare: decimal(2, half_away)
      tier: string
    }
  }
  main -> Result {
    let booking: bookings.Booking =? call bookings.get(id: 8821)
    let guest: guests.Guest =? call guests.get(id: booking.guestId)
    return {
      bookingId: booking.id,
      guestName: guest.name,
      fare: booking.fare,
      tier: guest.tier
    }
  }
}
Runtime responds:
{
"success": true,
"result": {
"bookingId": 8821,
"guestName": "Dana Patel",
"fare": "1850.00",
"tier": "gold"
},
"meta": {
"thalersConsumed": 7,
"memoryConsumedKb": 2,
"durationMs": 3,
"apiCalls": [
{
"function": "bookings.get",
"durationMs": 0
},
{
"function": "guests.get",
"durationMs": 0
}
]
}
}
This is orchestration, the language’s actual purpose: two API calls and a shaped
result, with the program’s own types block defining the output record. The first
bound value feeds the second call (booking.guestId becomes the argument to
guests.get), which is exactly the combine step the loop has been building
toward.
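On the receiving side, a client can sanity-check that the returned record has the shape the program declared. The Python sketch below is an illustration only: `matches_type` and `conforms` are hypothetical helpers, they cover just the three field types used in Result, and they assume (as the transcripts show) that decimals are serialized as strings carrying the declared number of fractional digits.

```python
from decimal import Decimal, InvalidOperation

# The program-local Result type and the record the runtime returned.
RESULT_FIELDS = {
    "bookingId": "int",
    "guestName": "string",
    "fare": "decimal(2, half_away)",
    "tier": "string",
}
RESPONSE_RESULT = {
    "bookingId": 8821,
    "guestName": "Dana Patel",
    "fare": "1850.00",
    "tier": "gold",
}


def matches_type(value, type_name: str) -> bool:
    """Check one JSON value against one declared field type."""
    if type_name == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name.startswith("decimal("):
        # Assumption: decimals arrive as strings with exactly <scale> digits.
        scale = int(type_name[len("decimal("):].split(",")[0])
        if not isinstance(value, str):
            return False
        try:
            return Decimal(value).as_tuple().exponent == -scale
        except InvalidOperation:
            return False
    return False  # types this sketch does not know


def conforms(record: dict, fields: dict) -> bool:
    """True when the record has exactly the declared fields, each well-typed."""
    return record.keys() == fields.keys() and all(
        matches_type(record[name], typ) for name, typ in fields.items()
    )


print(conforms(RESPONSE_RESULT, RESULT_FIELDS))
```

The runtime already types the record literal against Result, so a check like this is redundant for correctness; it only shows how directly the declared type maps onto the JSON that comes back.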
Example 4 — Learning from errors
Task given to the agent: the metrics namespace serves historical readings in chunks (metrics.chunkSize(), metrics.totalChunks(), metrics.readings(chunk)). Count how many readings across all chunks have status "alert", and return the total.
The first example where things go wrong — repeatedly — and that is the point. Every Neander error is a structured JSON object, not a stack trace, so the agent can read exactly what failed and why, adjust, and resubmit. This is a real, unscripted session: it takes ten rounds and three different kinds of error to reach the answer. Watch the agent converge.
Setup (rounds 1–3). The task names the metrics namespace, so the agent goes
straight to it: it bootstraps, discovers the three metrics functions
(chunkSize, totalChunks, and readings(chunk)), and calls
metrics.totalChunks() — learning there are 2 chunks. From the bootstrap
response it also knows the ceilings it must respect: a 5000-thaler budget and a
maxRepeatLimit of 1000 (the same configuration as Example 1). Its first
instinct is to scan everything in a single program.
Round 4 — First attempt: scan everything at once
The agent loops over both chunks, concatenates them, filters for "alert", and
counts:
neander 1 {
  types {
    Reading {
      index: int
      value: int
      status: string
    }
  }
  main -> int {
    let totalChunks: int =? call metrics.totalChunks()
    let allReadings: [Reading] = flatten(repeat totalChunks limit 10000 as i -> [Reading] {
      let chunk: [Reading] =? call metrics.readings(chunk: i)
      yield chunk
    })
    let alertReadings: [Reading] = each allReadings as r where r.status == "alert" -> Reading {
      yield r
    }
    return len(alertReadings)
  }
}
Runtime responds:
{
"success": false,
"result": {
"failure": "flaw",
"kind": "repeat_limit_too_high",
"line": 11,
"column": 42,
"hint": "repeat limit 10000 exceeds maximum of 1000",
"limit": 1000
},
"meta": {}
}
This never runs. The response is a Flaw — a defect caught before execution —
and it is exact: the repeat cap of 10000 exceeds the runtime’s maximum of 1000,
down to the line and column, with meta empty because nothing executed. The agent
lowers the cap and resubmits.
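A client could catch this particular Flaw before submitting by comparing each repeat cap with the maxRepeatLimit from the bootstrap response. The Python sketch below is a rough pre-flight check, not a Neander parser: `repeat_limits_over` is a hypothetical helper, it assumes a `repeat … limit <n>` clause sits on one line, and the program text assumes two-space indentation.

```python
import re

# The round 4 program, assuming two-space indentation.
PROGRAM = """neander 1 {
  types {
    Reading {
      index: int
      value: int
      status: string
    }
  }
  main -> int {
    let totalChunks: int =? call metrics.totalChunks()
    let allReadings: [Reading] = flatten(repeat totalChunks limit 10000 as i -> [Reading] {
      let chunk: [Reading] =? call metrics.readings(chunk: i)
      yield chunk
    })
    let alertReadings: [Reading] = each allReadings as r where r.status == "alert" -> Reading {
      yield r
    }
    return len(alertReadings)
  }
}"""


def repeat_limits_over(program: str, max_repeat_limit: int):
    """Return (line, column, limit) for each repeat cap above the ceiling."""
    found = []
    for line_number, line in enumerate(program.splitlines(), start=1):
        for match in re.finditer(r"\brepeat\b.*?\blimit\s+(\d+)", line):
            limit = int(match.group(1))
            if limit > max_repeat_limit:
                # 1-based column of the `repeat` keyword.
                found.append((line_number, match.start() + 1, limit))
    return found


print(repeat_limits_over(PROGRAM, max_repeat_limit=1000))
```

Under those assumptions the check lands on line 11, column 42, the position the runtime reported. The runtime remains the authority; a check like this only saves a round-trip.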
Round 5 — Legal cap, but over budget
Agent submits: the same program with the repeat cap lowered to the runtime’s
maximum — repeat totalChunks limit 1000.
Runtime responds:
{
"success": false,
"result": {
"failure": "abort",
"kind": "budget_exceeded",
"budget": "thalers",
"limit": 5000,
"consumed": 5001
},
"meta": {
"thalersConsumed": 5001,
"memoryConsumedKb": 865,
"durationMs": 27,
"apiCalls": [
{
"function": "metrics.totalChunks",
"durationMs": 0
},
{
"function": "metrics.readings",
"durationMs": 1
},
{
"function": "metrics.readings",
"durationMs": 0
}
]
}
}
Now it parses and runs — and fails a different way. A budget Abort: scanning
all 2800 readings costs more than the 5000-thaler budget (consumed: 5001). This
time meta is populated — the program really executed, and even made its API calls,
before being stopped mid-flight. The repeat cap was a red herring; the real wall is
the budget. A single program cannot hold this much work.
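The two failure shapes seen so far call for different reactions, and the envelope alone is enough to tell them apart. A Python sketch of that dispatch follows; the envelopes are abbreviated copies of the round 4 and round 5 responses, and `next_step` and `executed` are hypothetical helper names with illustrative return strings.

```python
# Abbreviated copies of the round 4 (Flaw) and round 5 (Abort) responses.
FLAW = {
    "success": False,
    "result": {
        "failure": "flaw",
        "kind": "repeat_limit_too_high",
        "line": 11,
        "column": 42,
        "hint": "repeat limit 10000 exceeds maximum of 1000",
        "limit": 1000,
    },
    "meta": {},
}
ABORT = {
    "success": False,
    "result": {
        "failure": "abort",
        "kind": "budget_exceeded",
        "budget": "thalers",
        "limit": 5000,
        "consumed": 5001,
    },
    "meta": {"thalersConsumed": 5001, "memoryConsumedKb": 865, "durationMs": 27},
}


def executed(envelope: dict) -> bool:
    """An empty meta means the program was rejected before it ran."""
    return bool(envelope["meta"])


def next_step(envelope: dict) -> str:
    """Choose a reaction from the failure class alone."""
    if envelope["success"]:
        return "done"
    failure = envelope["result"]["failure"]
    if failure == "flaw":
        return "fix the program text and resubmit"
    if failure == "abort":
        return "reduce the work done in one run"
    return "unrecognized failure"


print(next_step(FLAW), "| executed:", executed(FLAW))
print(next_step(ABORT), "| executed:", executed(ABORT))
```

A Flaw is a defect in the text, so editing the text is the fix; an Abort means the text was acceptable but the run was too expensive, so the workload has to shrink.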
Round 6 — Restructuring trips a language rule
The agent’s next idea is to stay in one program but split the work into two loops —
and it reuses the loop variable r in both:
neander 1 {
  types {
    Reading {
      value: int
      status: string
    }
  }
  main -> int {
    let chunk0: [Reading] =? call metrics.readings(chunk: 0)
    let chunk1: [Reading] =? call metrics.readings(chunk: 1)
    let alerts0: [Reading] = each chunk0 as r where r.status == "alert" -> Reading {
      yield r
    }
    let alerts1: [Reading] = each chunk1 as r where r.status == "alert" -> Reading {
      yield r
    }
    return len(alerts0) + len(alerts1)
  }
}
Runtime responds:
{
"success": false,
"result": {
"failure": "flaw",
"kind": "type_error",
"line": 14,
"column": 5,
"hint": "name \"r\" is already bound (shadowing is not permitted)"
},
"meta": {}
}
Another Flaw, of a third kind — a language rule. Neander forbids binding a name
twice (no shadowing), and the hint says so directly. The agent renames to
r0/r1 — but that version (round 7) simply hits the budget Abort yet again. The
structure was never the problem: two chunks in one run is too much, however it is
written.
Round 8 — The breakthrough: one chunk per program
The agent abandons the single-program approach and processes just one chunk:
neander 1 {
  types {
    Reading {
      value: int
      status: string
    }
  }
  main -> int {
    let chunk0: [Reading] =? call metrics.readings(chunk: 0)
    let count0: int = len(each chunk0 as r0 where r0.status == "alert" -> Reading { yield r0 })
    return count0
  }
}
Runtime responds:
{
"success": true,
"result": 200,
"meta": {
"thalersConsumed": 4202,
"memoryConsumedKb": 460,
"durationMs": 18,
"apiCalls": [
{
"function": "metrics.readings",
"durationMs": 1
}
]
}
}
Success at last. One chunk’s work costs 4202 thalers — comfortably under 5000 —
and chunk 0 has 200 alerts. (A metrics.chunkSize() check in between, round 9,
confirmed each chunk holds 1400 readings; at ~3 thalers per filtered iteration that
is ≈ 4200, which matches what was just spent — and why two chunks never fit.)
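The budget arithmetic is simple enough to write down. This Python sketch uses the figures from the text (1400 readings per chunk, roughly 3 thalers per filtered iteration, a 5000-thaler budget); the per-reading cost is the text's estimate rather than a documented constant, and the function names are hypothetical.

```python
def estimated_cost(readings: int, thalers_per_reading: int = 3) -> int:
    """Rough thaler cost of filtering a given number of readings."""
    return readings * thalers_per_reading


def chunks_that_fit(budget: int, chunk_size: int, thalers_per_reading: int = 3) -> int:
    """How many whole chunks one program can scan within the budget."""
    return budget // estimated_cost(chunk_size, thalers_per_reading)


print(estimated_cost(1400))         # 4200, close to the 4202 actually spent
print(estimated_cost(2 * 1400))     # 8400, well over the 5000 budget
print(chunks_that_fit(5000, 1400))  # 1
```

With these numbers exactly one chunk fits per run, which is the structure the agent converged on.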
Round 10 — The other chunk, and the total
A second program handles chunk 1:
neander 1 {
  types {
    Reading {
      value: int
      status: string
    }
  }
  main -> int {
    let chunk1: [Reading] =? call metrics.readings(chunk: 1)
    let count1: int = len(each chunk1 as r1 where r1.status == "alert" -> Reading { yield r1 })
    return count1
  }
}
Runtime responds:
{
"success": true,
"result": 200,
"meta": {
"thalersConsumed": 4202,
"memoryConsumedKb": 460,
"durationMs": 19,
"apiCalls": [
{
"function": "metrics.readings",
"durationMs": 1
}
]
}
}
Chunk 1 also has 200 alerts. Neander programs are stateless — nothing carries from one run to the next — so the agent adds the two partial counts in its own context: 200 + 200 = 400, the answer it reports.
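The split-and-combine pattern can be sketched as a small driver that submits one program per chunk and adds the partial counts outside the runtime. In this Python sketch the program text follows the two programs shown in rounds 8 and 10; `submit` is a hypothetical callable standing for the channel to the runtime, and `fake_submit` is a stand-in that simply replays the counts from the transcript.

```python
# Per-chunk program, following the programs shown in rounds 8 and 10.
PROGRAM_TEMPLATE = """neander 1 {
  types {
    Reading {
      value: int
      status: string
    }
  }
  main -> int {
    let chunk%(i)d: [Reading] =? call metrics.readings(chunk: %(i)d)
    let count%(i)d: int = len(each chunk%(i)d as r%(i)d where r%(i)d.status == "alert" -> Reading { yield r%(i)d })
    return count%(i)d
  }
}"""


def count_alerts(submit, total_chunks: int) -> int:
    """Run one program per chunk and add the partial counts on the agent side.

    `submit` is a hypothetical callable: program text in, response envelope out.
    """
    total = 0
    for i in range(total_chunks):
        envelope = submit(PROGRAM_TEMPLATE % {"i": i})
        if not envelope["success"]:
            raise RuntimeError(f"chunk {i} failed: {envelope['result']}")
        total += envelope["result"]
    return total


def fake_submit(program: str) -> dict:
    """Stand-in for the runtime that replays the counts from rounds 8 and 10."""
    return {"success": True, "result": 200, "meta": {}}


print(count_alerts(fake_submit, total_chunks=2))  # 400
```

The running total lives in the driver, not in any Neander program, which mirrors the statelessness described above.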
This is the loop that makes Neander workable for an agent with no human beside it.
Errors are data, not stack traces: a Flaw rejects a program before it runs and
points at the offending construct — an illegal repeat cap, a shadowed name — while
an Abort stops one that overruns its budget mid-execution. The agent reads each,
adapts, and — when a workload genuinely will not fit in a single run — splits it
across several and combines the results itself. Ten programs, three kinds of error, zero
human intervention, one correct answer.