Architecture
Memory architecture
How a character remembers you. Every turn we recall relevant past beats into the prompt; only turns that carry durable, retrieval-worthy information write anything back. A knowledge graph — not a chat log — holds the memory.
Recall every turn, write only when it matters. The same Safety Guard that judges canon and safety also decides, per turn, whether memory extraction runs — so most turns skip the heavy write pipeline entirely.
Knowledge graph
Nodes + edges + 1536-d embeddings in Postgres (pgvector).
Salience-gated
Writes run only when the judge sets extractMemory = true.
Session-scoped
Keyed by session today. Cross-session user memory is on the roadmap.
Data model
Four tables, all scoped by (world_id, session_id). An immutable event log feeds a node/edge graph; a snapshot caches the projected state.
Nodes
- stable_key
- node_type · semantic_kind
- title · summary
- embedding vector(1536)
- consent_state · status
Edges
- from_node_id → to_node_id
- relation_label
- relation_description
- evidence
- confidence
Event log
- event_type
- source_turn_id
- payload (jsonb)
- created_at
Snapshot
- projection_version
- payload (jsonb)
- source_event_ids
db/migrations/001_ontology_scaffold.sql · HNSW cosine index on embedding; nodes upsert by stable_key.
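To make the four record shapes concrete, here is a minimal sketch of them as TypeScript types, followed by an in-memory stand-in for the upsert-by-stable_key behavior. The field names come from the lists above. The column types, the grouping of fields into tables, the type names, and the `upsertNode` helper are assumptions for illustration, not the contents of the migration.

```typescript
// Row shapes inferred from the field lists; types and grouping are assumed.
type Scope = { world_id: string; session_id: string };

type NodeRow = Scope & {
  stable_key: string;
  node_type: string;
  semantic_kind: string;
  title: string;
  summary: string;
  embedding: number[]; // vector(1536) in Postgres
  consent_state: string;
  status: string;
};

type EdgeRow = Scope & {
  from_node_id: string;
  to_node_id: string;
  relation_label: string;
  relation_description: string;
  evidence: string;
  confidence: number;
};

type EventRow = Scope & {
  event_type: string;
  source_turn_id: string;
  payload: unknown; // jsonb
  created_at: Date;
};

type SnapshotRow = Scope & {
  projection_version: number;
  payload: unknown; // jsonb
  source_event_ids: string[];
};

// Hypothetical in-memory stand-in for "nodes upsert by stable_key":
// a second write with the same scoped key replaces the row instead of adding one.
// Scoping the key by (world_id, session_id) is an assumption.
function upsertNode(store: Map<string, NodeRow>, row: NodeRow): void {
  const key = `${row.world_id}/${row.session_id}/${row.stable_key}`;
  store.set(key, row);
}
```

Writing the same `stable_key` twice within one scope leaves a single node carrying the latest title and summary, which keeps the graph from accumulating duplicates of the same fact.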
Ontology graph
Anchors are a fixed identity backbone. Each meaningful turn grafts a state atom — and a typed edge — onto it. Gold edges are the memory-to-memory links that make it a graph, not a star.
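The grafting step can be sketched as a small in-memory graph. Every name below (`GraphNode`, `graftAtom`, `memoryLinks`, the `kind` values) is hypothetical, and the edge direction from anchor to atom is an assumption; the source describes the shape only in prose.

```typescript
// Hypothetical in-memory model of anchors, state atoms, and typed edges.
type GraphNode = { id: string; kind: "anchor" | "state_atom"; title: string };
type GraphEdge = { from: string; to: string; label: string };
type Graph = { nodes: Map<string, GraphNode>; edges: GraphEdge[] };

// Graft one state atom onto an existing anchor, plus optional links to earlier atoms.
function graftAtom(
  graph: Graph,
  atom: { id: string; title: string },
  anchorId: string,
  label: string,
  relatedAtoms: { id: string; label: string }[] = [],
): void {
  const anchor = graph.nodes.get(anchorId);
  if (!anchor || anchor.kind !== "anchor") {
    throw new Error(`unknown anchor: ${anchorId}`);
  }
  graph.nodes.set(atom.id, { id: atom.id, kind: "state_atom", title: atom.title });
  graph.edges.push({ from: anchorId, to: atom.id, label });
  for (const rel of relatedAtoms) {
    // Memory-to-memory link: only created when the earlier atom exists.
    if (rel.id !== atom.id && graph.nodes.get(rel.id)?.kind === "state_atom") {
      graph.edges.push({ from: atom.id, to: rel.id, label: rel.label });
    }
  }
}

// Edges whose endpoints are both state atoms. With none of these,
// every atom hangs off an anchor and nothing else.
function memoryLinks(graph: Graph): GraphEdge[] {
  return graph.edges.filter(
    (e) =>
      graph.nodes.get(e.from)?.kind === "state_atom" &&
      graph.nodes.get(e.to)?.kind === "state_atom",
  );
}
```

Counting `memoryLinks` is a quick way to check whether a session's memory has started to connect atoms to one another rather than only to the backbone.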
The turn loop
One judge call routes the turn. Recall happens on every turn; the write pipeline is gated behind the judge's extractMemory flag.
Safety Guard (judge)
Read · every turn
Write · gated
lib/ai-engine/runtime.ts · lib/ontology/runtime.ts
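A minimal sketch of the loop described above, with the model and database calls injected as plain functions. The names `TurnDeps`, `runTurn`, and the four dependencies are hypothetical, the ordering of judge and recall is an assumption, and the calls are synchronous only to keep the example short; the real ones would be asynchronous.

```typescript
// Hypothetical dependency bundle; each function stands in for a model or database call.
type TurnDeps = {
  judge: (userText: string) => { extractMemory: boolean };
  recall: (userText: string) => string[];
  respond: (userText: string, memories: string[]) => string;
  writeMemory: (userText: string, reply: string) => void;
};

function runTurn(userText: string, deps: TurnDeps): string {
  // One judge call routes the turn.
  const verdict = deps.judge(userText);

  // Read path: runs on every turn, regardless of the verdict.
  const memories = deps.recall(userText);
  const reply = deps.respond(userText, memories);

  // Write path: runs only when the judge asked for extraction.
  if (verdict.extractMemory) {
    deps.writeMemory(userText, reply);
  }
  return reply;
}
```

In this sketch the write finishes before the reply is returned, so a gated-in turn pays for extraction on the response path while a skipped turn pays nothing.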
The Safety Guard contract
The judge returns five fields. Blocked status, review-required, and the memory-candidate gate are derived in code, so the LLM is not asked to repeat them.
Derived in code: safetyStatus = (mode = safe_redirect || risk = high) ? blocked : clear · reviewRequired = risk ≠ low · lib/ai-engine/policy.ts
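The two derivation rules translate directly into a pure function. This is a sketch, not the contents of lib/ai-engine/policy.ts: the function name, the input shape, and the `"medium"` risk level are assumptions, since the source names only `low` and `high`.

```typescript
// "medium" is an assumed intermediate level; only "low" and "high" appear in the source.
type Risk = "low" | "medium" | "high";

type DerivedSafety = {
  safetyStatus: "blocked" | "clear";
  reviewRequired: boolean;
};

// Hypothetical helper: derives both flags from the judge's mode and risk fields.
function deriveSafety(verdict: { mode: string; risk: Risk }): DerivedSafety {
  const blocked = verdict.mode === "safe_redirect" || verdict.risk === "high";
  return {
    safetyStatus: blocked ? "blocked" : "clear",
    reviewRequired: verdict.risk !== "low",
  };
}
```

Deriving these in code means a low-risk `safe_redirect` turn is still blocked without being flagged for review, and the model cannot return a combination that contradicts its own mode and risk.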
The salience gate
Memory worth storing arrives in bursts, not on a clock. So the gate is event-driven — and it adds no new model call, because the judge already runs every turn.
No new call
extractMemory rides on the judge that already gates safety.
No bias to fire
Default false. Greetings, acknowledgements, recap → skip.
Hard floor
safe_redirect always forces extractMemory = false.
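The three rules above fit in a few lines. The function name and input shape here are hypothetical; the sketch assumes the judge's output carries a `mode` string and an optional `extractMemory` flag.

```typescript
// Hypothetical gate: decides whether the write pipeline runs for this turn.
function shouldExtractMemory(verdict: { mode: string; extractMemory?: boolean }): boolean {
  // Hard floor: a redirected turn never writes memory, whatever the judge said.
  if (verdict.mode === "safe_redirect") {
    return false;
  }
  // No bias to fire: a missing or false flag means skip.
  return verdict.extractMemory === true;
}
```

Because the flag is read from a verdict the judge already produces, the gate itself costs a comparison, not a model call.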
Roadmap
What the current build deliberately leaves for the next increments.
- Recall by embedding similarity, not just recency.
- Move edge-linking from per-node to a periodic consolidation pass.
- Take the write off the response critical path (fire-and-forget).
- A periodic backstop that re-scans recent turns the single-turn gate missed.
- Promote the memory key from session to a stable user identity.
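As an illustration of the first roadmap item, similarity-based recall amounts to ranking stored nodes by cosine similarity to a query embedding. The in-memory sketch below is an assumption about how that could look, with hypothetical names throughout; in Postgres the equivalent ranking would presumably order by pgvector's cosine-distance operator (`<=>`) and use the HNSW index.

```typescript
type Embedded = { id: string; embedding: number[] };

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical recall step: the k nodes most similar to the query, best first.
function recallTopK<T extends Embedded>(query: number[], nodes: T[], k: number): T[] {
  return nodes
    .map((node) => ({ node, score: cosineSimilarity(query, node.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.node);
}
```

A brute-force scan like this is linear in the number of nodes, which is the cost an approximate index is meant to avoid once a session's graph grows.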