Architecture

Memory architecture

How a character remembers you. Every turn we recall relevant past beats into the prompt; only turns that carry durable, retrieval-worthy information write anything back. A knowledge graph — not a chat log — holds the memory.

Recall every turn, write only when it matters. The same Safety Guard that judges canon and safety also decides, per turn, whether memory extraction runs — so most turns skip the heavy write pipeline entirely.

Knowledge graph

Nodes + edges + 1536-d embeddings in Postgres (pgvector).

Salience-gated

Writes run only when the judge sets extractMemory = true.

Session-scoped

Keyed by session today. Cross-session user memory is on the roadmap.

01

Data model

Four tables, all scoped by (world_id, session_id). An immutable event log feeds a node/edge graph; a snapshot caches the projected state.

ontology_nodes · memories
  • stable_key
  • node_type · semantic_kind
  • title · summary
  • embedding vector(1536)
  • consent_state · status
ontology_edges · relations
  • from_node_id → to_node_id
  • relation_label
  • relation_description
  • evidence
  • confidence
world_events · event log
  • event_type
  • source_turn_id
  • payload (jsonb)
  • created_at
ontology_snapshots · projection cache
  • projection_version
  • payload (jsonb)
  • source_event_ids

db/migrations/001_ontology_scaffold.sql · HNSW cosine index on embedding; nodes upsert by stable_key.
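To make the upsert rule concrete, here is a minimal sketch of a node row and an upsert keyed by stable_key inside a (world_id, session_id) scope. The `OntologyNode` interface, the `NodeStore` class, and the camelCase field names are illustrative assumptions that mirror the columns listed above; the real schema lives in the migration file, and this in-memory map only stands in for the Postgres table.

```typescript
// Hypothetical row shape mirroring the ontology_nodes columns listed above.
interface OntologyNode {
  worldId: string;
  sessionId: string;
  stableKey: string;
  nodeType: string;
  semanticKind: string;
  title: string;
  summary: string;
  embedding: number[]; // vector(1536) in Postgres
  consentState: string;
  status: string;
}

// In-memory stand-in for the table, for illustration only.
class NodeStore {
  private rows = new Map<string, OntologyNode>();

  // Rows are scoped by (world_id, session_id), then identified by stable_key.
  private keyOf(worldId: string, sessionId: string, stableKey: string): string {
    return JSON.stringify([worldId, sessionId, stableKey]);
  }

  // Upsert by stable_key: a repeated key overwrites the row instead of duplicating it.
  upsert(node: OntologyNode): "inserted" | "updated" {
    const key = this.keyOf(node.worldId, node.sessionId, node.stableKey);
    const outcome = this.rows.has(key) ? "updated" : "inserted";
    this.rows.set(key, node);
    return outcome;
  }

  get(worldId: string, sessionId: string, stableKey: string): OntologyNode | undefined {
    return this.rows.get(this.keyOf(worldId, sessionId, stableKey));
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Because the scope is part of the key, the same stable_key in two sessions produces two independent rows, which matches the session-scoped behavior described earlier.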

Ontology graph

Anchors are a fixed identity backbone. Each meaningful turn grafts a state atom — and a typed edge — onto it. Gold edges are the memory-to-memory links that make it a graph, not a star.

[Diagram: ontology graph]
Node labels: character, player, location, relationship_beat (×2), world_discovery, thread
Edge labels: shared_scene_with, temporal_follows, discovered, at_location, continues_thread
Legend: anchor = backbone · state_atom = memory + embedding · gold edge = memory ↔ memory
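The grafting idea can be sketched as a small in-memory graph. Everything here is an assumption made for illustration: the `OntologyGraph` class, its method names, the node ids, and in particular which node types act as anchors and which relation labels connect which nodes. The sketch only shows the shape of the structure: atoms attach to anchors, and atom-to-atom edges turn the star into a graph.

```typescript
type NodeKind = "anchor" | "state_atom";

interface GraphNode {
  id: string;
  kind: NodeKind;
  nodeType: string;
}

interface GraphEdge {
  fromNodeId: string;
  toNodeId: string;
  relationLabel: string;
}

class OntologyGraph {
  readonly nodes = new Map<string, GraphNode>();
  readonly edges: GraphEdge[] = [];

  addAnchor(id: string, nodeType: string): void {
    this.nodes.set(id, { id, kind: "anchor", nodeType });
  }

  // Graft a state atom onto an existing anchor with one typed edge.
  graft(atomId: string, nodeType: string, anchorId: string, relationLabel: string): void {
    const anchor = this.nodes.get(anchorId);
    if (!anchor || anchor.kind !== "anchor") {
      throw new Error(`unknown anchor: ${anchorId}`);
    }
    this.nodes.set(atomId, { id: atomId, kind: "state_atom", nodeType });
    this.edges.push({ fromNodeId: atomId, toNodeId: anchorId, relationLabel });
  }

  // Link two state atoms directly (a memory-to-memory edge).
  linkMemories(fromAtomId: string, toAtomId: string, relationLabel: string): void {
    for (const id of [fromAtomId, toAtomId]) {
      if (this.nodes.get(id)?.kind !== "state_atom") {
        throw new Error(`not a state atom: ${id}`);
      }
    }
    this.edges.push({ fromNodeId: fromAtomId, toNodeId: toAtomId, relationLabel });
  }

  // Edges whose endpoints are both state atoms.
  memoryEdges(): GraphEdge[] {
    return this.edges.filter(
      (e) =>
        this.nodes.get(e.fromNodeId)?.kind === "state_atom" &&
        this.nodes.get(e.toNodeId)?.kind === "state_atom",
    );
  }
}

// Hypothetical usage: two beats grafted onto anchors, then linked to each other.
const graph = new OntologyGraph();
graph.addAnchor("character:1", "character");
graph.addAnchor("location:1", "location");
graph.graft("beat:1", "relationship_beat", "character:1", "shared_scene_with");
graph.graft("beat:2", "relationship_beat", "location:1", "at_location");
graph.linkMemories("beat:2", "beat:1", "temporal_follows");
```

Without the final `linkMemories` call every atom would hang off an anchor and nothing else, which is the star shape the memory-to-memory edges are meant to avoid.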
02

The turn loop

One judge call routes the turn. Recall happens on every turn; the write pipeline is gated behind the judge's extractMemory flag.

User turn

Safety Guard (judge)

mode · risk · flags · note · extractMemory

Read · every turn

1. Load recent beats (recency)
2. Compress → inject into prompt
3. Generate the reply

Write · gated

gate: extractMemory && mode ≠ safe_redirect
false → skip pipeline (most turns)
true → extract → link → embed → persist

lib/ai-engine/runtime.ts · lib/ontology/runtime.ts
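The loop above can be sketched as a single async function. The `TurnDeps` interface and every function name on it are hypothetical stand-ins, since the actual exports of the runtime files are not shown here; the recency limit of 5 and the join-based "compression" are placeholders too. Only the ordering and the gate expression come from the description above.

```typescript
type Mode = "in_world_answer" | "memory_consent" | "safe_redirect";

interface Verdict {
  mode: Mode;
  risk: "low" | "medium" | "high";
  flags: string[];
  note: string;
  extractMemory: boolean;
}

// Hypothetical dependencies; injected so the sketch runs without a model or database.
interface TurnDeps {
  judge(userTurn: string): Promise<Verdict>;
  loadRecentBeats(limit: number): Promise<string[]>;
  generateReply(prompt: string): Promise<string>;
  writeMemory(userTurn: string, reply: string): Promise<void>;
}

async function runTurn(
  userTurn: string,
  deps: TurnDeps,
): Promise<{ reply: string; wroteMemory: boolean }> {
  // One judge call routes the turn.
  const verdict = await deps.judge(userTurn);

  // Read path, every turn: load recent beats, compress, inject, generate.
  const beats = await deps.loadRecentBeats(5); // limit is an arbitrary placeholder
  const compressed = beats.join("\n"); // placeholder for real compression
  const reply = await deps.generateReply(`${compressed}\n\nUser: ${userTurn}`);

  // Write path, gated: extractMemory && mode ≠ safe_redirect.
  const shouldWrite = verdict.extractMemory && verdict.mode !== "safe_redirect";
  if (shouldWrite) {
    await deps.writeMemory(userTurn, reply); // extract → link → embed → persist
  }
  return { reply, wroteMemory: shouldWrite };
}
```

In this sketch the write is awaited before the function returns, so it sits on the response critical path, consistent with the current build as described in the roadmap.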

03

The Safety Guard contract

The judge returns five fields. Blocked status, the review-required flag, and the memory-candidate gate are derived in code; the LLM is not asked to repeat them.

  • mode: in_world_answer · memory_consent · safe_redirect (routing decision)
  • risk: low · medium · high (merges old safetyStatus + canonRisk)
  • flags: string[] (violation labels, legal_floor:*)
  • note: string (merges generationGuidance + rationaleForAudit)
  • extractMemory: boolean (the per-turn memory gate)

Derived in code: safetyStatus = (mode = safe_redirect || risk = high) ? blocked : clear · reviewRequired = risk ≠ low · lib/ai-engine/policy.ts
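The two derivations can be written as one pure function. This is a sketch under assumptions: the `derivePolicy` name and the `DerivedPolicy` type are hypothetical and are not taken from policy.ts; only the two formulas come from the line above.

```typescript
type Mode = "in_world_answer" | "memory_consent" | "safe_redirect";
type Risk = "low" | "medium" | "high";

interface DerivedPolicy {
  safetyStatus: "blocked" | "clear";
  reviewRequired: boolean;
}

// Derive fields the judge is not asked to emit itself.
function derivePolicy(verdict: { mode: Mode; risk: Risk }): DerivedPolicy {
  const blocked = verdict.mode === "safe_redirect" || verdict.risk === "high";
  return {
    safetyStatus: blocked ? "blocked" : "clear",
    reviewRequired: verdict.risk !== "low",
  };
}
```

Note that a medium-risk in_world_answer turn comes out clear but still flagged for review, so the two derived fields are not redundant.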

04

The salience gate

Memory worth storing arrives in bursts, not on a clock. So the gate is event-driven — and it adds no new model call, because the judge already runs every turn.

No new call

extractMemory rides on the judge that already gates safety.

No bias to fire

Default false. Greetings, acknowledgements, recap → skip.

Hard floor

safe_redirect always forces extractMemory = false.
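One way to enforce the default and the hard floor is to normalize the judge's raw output before anything reads the flag. The `resolveExtractMemory` function below is a hypothetical helper, and treating the raw output as loosely typed JSON is an assumption; it illustrates the two rules, not the project's actual parsing code.

```typescript
// Raw judge output is treated as untrusted: fields may be missing or mistyped.
interface RawVerdict {
  mode?: unknown;
  extractMemory?: unknown;
}

function resolveExtractMemory(raw: RawVerdict): boolean {
  // Hard floor: safe_redirect always forces the gate closed.
  if (raw.mode === "safe_redirect") return false;
  // No bias to fire: anything other than a literal `true` counts as false.
  return raw.extractMemory === true;
}
```

Checking for strict `true` means a missing field, a null, or a string such as "true" all fall back to skipping the write pipeline.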

05

Roadmap

What the current build deliberately leaves for the next increments.

  • Recall by embedding similarity, not just recency.
  • Move edge-linking from per-node to a periodic consolidation pass.
  • Take the write off the response critical path (fire-and-forget).
  • A periodic backstop that re-scans recent turns the single-turn gate missed.
  • Promote the memory key from session to a stable user identity.