Architect / Memory
What a workspace remembers, and how.
Architect's memory layer is the substrate that separates this workspace from a chat-on-top-of-your-notes tool. It makes prior context available to recall, grounds every model call in your own corpus, and refuses to let the model drift from what's actually written.
Five promises, made by the substrate itself.
Promise 01
Recall before generation.
The substrate runs vector recall over your memory units before the model speaks. The model sees the citations as input, not as a hope after the fact.
When you ask Architect a question, the workspace doesn't hand the model a blank prompt and pray. It first fetches the most relevant memory units from your own corpus (locked sessions, prior asks, promoted canvas tiles, citations imported from Granth) and passes them as part of the input. The model is grounded by construction. The retrieval-augmented-generation pattern is well known; what matters is whether the substrate enforces it on every call. Architect does.
Substrate
- Vector store: pgvector inside your workspace's Postgres
- Embedding model: nomic-embed-text-v1 via the Dhara Ollama endpoint
- Scope: workspace + matter + tag, never cross-tenant
- Top-k: tunable per ask, default narrow
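The recall step above can be sketched in a few lines. This is a minimal in-memory illustration, not Architect's implementation: in the real substrate the ranking would presumably happen inside pgvector, and the `MemoryUnit` fields, the `recall` and `build_prompt` helpers, the toy 2-d embeddings, and the `top_k=3` default are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryUnit:
    id: str
    workspace_id: str
    matter_id: Optional[str]
    tags: frozenset
    body: str
    embedding: list  # toy 2-d vectors here; the real column is vector(768)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(units, query_embedding, workspace_id, matter_id=None, tag=None, top_k=3):
    """Filter to the caller's scope first, then rank by similarity and keep top_k."""
    in_scope = [
        u for u in units
        if u.workspace_id == workspace_id  # never cross-tenant
        and (matter_id is None or u.matter_id == matter_id)
        and (tag is None or tag in u.tags)
    ]
    ranked = sorted(
        in_scope,
        key=lambda u: cosine(u.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, recalled):
    """Put the recalled units ahead of the question so the model sees them as input."""
    sources = "\n".join(f"[{u.id}] {u.body}" for u in recalled)
    return f"Sources:\n{sources}\n\nQuestion: {question}"


units = [
    MemoryUnit("u1", "ws-a", "m1", frozenset({"contract"}), "Clause 4 caps liability.", [1.0, 0.0]),
    MemoryUnit("u2", "ws-a", "m1", frozenset(), "Clause 9 covers notice.", [0.6, 0.8]),
    MemoryUnit("u3", "ws-b", "m9", frozenset(), "Other tenant's note.", [1.0, 0.0]),
]
hits = recall(units, [1.0, 0.0], workspace_id="ws-a", top_k=2)
prompt = build_prompt("What caps liability?", hits)
```

Scope filtering runs before ranking, so a unit from another workspace can never enter the prompt, however similar its embedding is to the query.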
Promise 02
Every claim carries a citation.
Every model call returns citations as structured output. If the model can't ground a claim in your corpus, it says so instead of hallucinating one.
An ungrounded answer is a quiet failure — fluent, plausible, and wrong. Architect treats it as a caught error. Each ask is a structured call: question in, answer + citations out. The UI renders the citations inline. If the model returns a claim it cannot anchor to a memory unit, the substrate flags it and the ask gets routed to a second pass. You see the citations. You see the gaps. You don't see fabrications dressed as confidence.
Substrate
- Storage: ask_events with citations[] and grade columns
- Verifier: 8 rules covering shape, anchor, freshness, license
- Failure mode: auto-retry-once, then surface to the user
- Counter: ungrounded_responses_caught is monotonic, observable
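A rough sketch of the structured call and its failure mode follows. It covers only two of the rule families named above (shape and anchor), and everything in it apart from the `ungrounded_responses_caught` counter name is hypothetical: the `Citation`, `AskResult`, and `AskRunner` types, the `is_grounded` check, and the `(question, attempt)` model signature are stand-ins for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    unit_id: str  # memory unit the claim is anchored to
    quote: str    # text the model says it took from that unit


@dataclass
class AskResult:
    answer: str
    citations: list = field(default_factory=list)


def is_grounded(result, corpus):
    """Shape check: at least one citation. Anchor check: each quote exists in its unit."""
    if not result.citations:
        return False
    return all(
        c.unit_id in corpus and c.quote in corpus[c.unit_id]
        for c in result.citations
    )


class AskRunner:
    def __init__(self, model, corpus):
        self.model = model    # callable: (question, attempt) -> AskResult
        self.corpus = corpus  # unit_id -> body text
        self.ungrounded_responses_caught = 0  # only ever incremented

    def ask(self, question):
        """Try once, retry once, then hand the ungrounded result back flagged."""
        result = None
        for attempt in range(2):
            result = self.model(question, attempt)
            if is_grounded(result, self.corpus):
                return result, True
            self.ungrounded_responses_caught += 1
        return result, False  # caller surfaces the gap to the user


corpus = {"u1": "Clause 4 caps liability at fees paid."}


def flaky_model(question, attempt):
    # Hypothetical model: ungrounded on the first attempt, grounded on the retry.
    if attempt == 0:
        return AskResult("Liability is unlimited.")
    return AskResult(
        "Liability is capped at fees paid.",
        [Citation("u1", "caps liability at fees paid")],
    )


runner = AskRunner(flaky_model, corpus)
result, grounded = runner.ask("What is the liability cap?")
```

Returning the flag alongside the result, rather than raising, keeps the "surface to the user" path explicit: the caller always gets something to render, including the gap.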
Promise 03
The distortion guard catches drift.
A second pass checks the model didn't fabricate, misquote, or drift from its citations. Wrong-but-fluent answers are caught before they reach you.
Models drift. They paraphrase a source slightly wrong, attribute claims to the wrong document, or smuggle in priors that aren't in the corpus. The distortion guard is a Pass-B reviewer that compares the model's answer to its claimed citations and the underlying chunks — if the answer departs from what's actually written, the guard fails the ask and triggers a retry. It's not a moderation layer. It's a fidelity layer.
Substrate
- Pass: B (after the model returns)
- Compares: answer span ↔ cited chunk ↔ source unit
- Triggers: retry-once with a tighter prompt and the failure annotated
- Visible: ask_events.grade and ask_events.distortion_flags
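The three-way comparison (answer span, cited chunk, source unit) can be illustrated with a deliberately crude check. The sketch below assumes a hypothetical `Claim` record and uses plain token overlap as the fidelity test, with a made-up `min_support` threshold of 0.6; a production reviewer would need something far stronger, since token overlap cannot catch a subtle paraphrase error.

```python
import re
from dataclasses import dataclass


@dataclass
class Claim:
    answer_span: str     # what the model wrote
    cited_chunk: str     # the text it claims to rely on
    source_unit_id: str  # the memory unit the chunk should come from


def _tokens(text):
    """Lowercased alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def distortion_flags(claims, source_units, min_support=0.6):
    """Return one flag per claim that fails the chunk-to-source or span-to-chunk check."""
    flags = []
    for i, claim in enumerate(claims):
        source_body = source_units.get(claim.source_unit_id)
        if source_body is None:
            flags.append(f"claim {i}: unknown source unit")
            continue
        # cited chunk <-> source unit: the quoted text must really be there
        if claim.cited_chunk not in source_body:
            flags.append(f"claim {i}: misquoted chunk")
            continue
        # answer span <-> cited chunk: most of the span's words must come from the chunk
        span = _tokens(claim.answer_span)
        support = len(span & _tokens(claim.cited_chunk)) / len(span) if span else 0.0
        if support < min_support:
            flags.append(f"claim {i}: answer drifts from chunk")
    return flags


source_units = {"u1": "The notice period is thirty days from receipt."}
claims = [
    Claim("The notice period is thirty days.", "notice period is thirty days", "u1"),
    Claim("The supplier may terminate without notice.", "notice period is thirty days", "u1"),
    Claim("The notice period is sixty days.", "notice period is sixty days", "u1"),
]
flags = distortion_flags(claims, source_units)
grade = "pass" if not flags else "fail"
```

In this toy run the first claim passes, the second is flagged for drift, and the third is flagged as a misquote because its chunk does not appear in the source unit.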
Promise 04
Forgetting is a feature.
Near-duplicate memory units merge. Stale units age out. The spine stays sharp instead of growing into a swamp.
A memory layer that only grows is a junk drawer in waiting. Architect runs a consolidation pass that merges near-duplicates by embedding similarity, and a reaper that ages out units with low recall pressure and old timestamps. What survives is what you actually use. The audit trail is preserved separately; consolidation never deletes evidence, only compresses redundant copies of it.
Substrate
- Consolidation: nightly, cosine ≥ 0.92 + same matter
- Reaper: decay-aware, low-recall + age threshold
- Evidence: audit log persists independent of memory unit lifecycle
- Reversible: merge events are undoable for 30 days
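The two passes can be sketched as follows. The 0.92 cosine threshold and the same-matter condition come from the list above; the `Unit` fields, the `merged_from` list (a stand-in for whatever record makes merges undoable), the `min_recalls` cutoff, and the greedy first-match merge order are assumptions for illustration only.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Unit:
    id: str
    matter_id: Optional[str]
    embedding: list
    recall_count: int
    decay_at: datetime
    merged_from: list = field(default_factory=list)  # hypothetical undo record


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def consolidate(units, threshold=0.92):
    """Greedy merge: fold each unit into the first survivor in the same matter
    whose embedding is at least `threshold` similar. Updates survivors in place."""
    survivors = []
    for unit in units:
        for kept in survivors:
            same_matter = kept.matter_id == unit.matter_id
            if same_matter and cosine(kept.embedding, unit.embedding) >= threshold:
                kept.merged_from.append(unit.id)       # remember what was folded in
                kept.recall_count += unit.recall_count
                break
        else:
            survivors.append(unit)
    return survivors


def reap(units, now, min_recalls=1):
    """Drop units that are past their decay checkpoint and were rarely recalled."""
    return [
        u for u in units
        if not (u.decay_at <= now and u.recall_count < min_recalls)
    ]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
a = Unit("a", "m1", [1.0, 0.0], 3, now + timedelta(days=30))
b = Unit("b", "m1", [0.99, 0.05], 1, now + timedelta(days=30))  # near-duplicate of a
c = Unit("c", "m2", [1.0, 0.0], 0, now - timedelta(days=1))     # other matter, stale

kept = consolidate([a, b, c])
alive = reap(kept, now)
```

Unit `c` has the same embedding as `a` but belongs to a different matter, so it is never merged; it is removed later by the reaper because it is both stale and unrecalled.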
Promise 05
The graph is yours.
Concept-overlap edges between memory units give you a knowledge graph you can actually browse. Not a hidden index but a navigable surface at /mind.
Memory without structure is a pile. Architect's memory units are linked by concept-overlap edges that surface at /mind — a browseable workspace knowledge graph where you can drill into a node, follow its neighbors, and pull related work back into a session. The graph is the difference between recalling and rediscovering. The edges are computed from your corpus alone. They are not generated by a third-party model.
Substrate
- Surface: /mind (workspace KG viewer + suggestions)
- Edge taxonomy: concept-overlap, citation, lineage, manual
- Suggestions: link_suggestions table, accepted/rejected per workspace
- Provenance: every edge carries the block_id that produced it
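One way to picture concept-overlap edges with provenance is sketched below. How Architect extracts concepts and scores overlap is not stated here, so the Jaccard measure, the `min_jaccard` threshold, the `Node` and `Edge` shapes, and the choice to attribute each edge to the first node's block are all assumptions for the example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Node:
    id: str
    block_id: str        # block the concepts were extracted from
    concepts: frozenset


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str            # one of: concept-overlap, citation, lineage, manual
    block_id: str        # provenance: the block that produced this edge
    weight: float


def concept_overlap_edges(nodes, min_jaccard=0.25):
    """Link every pair of nodes whose concept sets overlap enough (Jaccard index)."""
    edges = []
    for a, b in combinations(nodes, 2):
        union = a.concepts | b.concepts
        weight = len(a.concepts & b.concepts) / len(union) if union else 0.0
        if weight >= min_jaccard:
            edges.append(Edge(a.id, b.id, "concept-overlap", a.block_id, weight))
    return edges


def neighbors(node_id, edges):
    """Ids of nodes one hop away from node_id, in sorted order."""
    return sorted(
        {e.target if e.source == node_id else e.source
         for e in edges if node_id in (e.source, e.target)}
    )


nodes = [
    Node("n1", "blk-1", frozenset({"liability", "indemnity", "cap"})),
    Node("n2", "blk-2", frozenset({"liability", "cap", "notice"})),
    Node("n3", "blk-3", frozenset({"payroll"})),
]
edges = concept_overlap_edges(nodes)
```

Because each edge is derived only from concept sets already attached to the workspace's own nodes, the graph can be recomputed from the corpus without calling out to an external model.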
The mechanism
Every workspace move produces a memory unit.
The memory spine isn't a side effect; it's the audit trail of how your team thinks. Six entry points, one schema, one substrate.
Lock a session
memory_unit { kind: 'session-lock', body, citations[] }
Promote a canvas tile
memory_unit { kind: 'canvas-promoted', tile_id, … }
Anchor an ask to a matter
memory_unit { kind: 'ask-anchored', ask_id, citations[] }
Import from Granth
memory_unit { kind: 'granth-citation', source_doc_id }
Resolve a :demand
memory_unit { kind: 'demand-resolved', resolution_grade }
Episode promotion
memory_unit { kind: 'episode-anchored', episode_id }
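The six entry points above could funnel through a single constructor, roughly as sketched here. The kinds and field names are taken from the list; the `make_memory_unit` function, the `REQUIRED_FIELDS` table, and the dict-shaped result are hypothetical, and the fields elided after `tile_id` in the canvas-promoted entry are left unspecified.

```python
import uuid
from datetime import datetime, timezone

# Fields each kind must carry, per the list above. 'canvas-promoted' has further
# fields that the list elides; only tile_id is required in this sketch.
REQUIRED_FIELDS = {
    "session-lock": {"body", "citations"},
    "canvas-promoted": {"tile_id"},
    "ask-anchored": {"ask_id", "citations"},
    "granth-citation": {"source_doc_id"},
    "demand-resolved": {"resolution_grade"},
    "episode-anchored": {"episode_id"},
}


def make_memory_unit(kind, workspace_id, **fields):
    """Build one memory unit, rejecting unknown kinds and missing fields."""
    if kind not in REQUIRED_FIELDS:
        raise ValueError(f"unknown memory unit kind: {kind!r}")
    missing = REQUIRED_FIELDS[kind] - set(fields)
    if missing:
        raise ValueError(f"{kind} is missing fields: {sorted(missing)}")
    return {
        "id": str(uuid.uuid4()),
        "workspace_id": workspace_id,
        "kind": kind,
        "created_at": datetime.now(timezone.utc),
        **fields,
    }


unit = make_memory_unit(
    "ask-anchored", "ws-a", ask_id="ask-1", citations=["u1"]
)
```

Routing every workspace move through one function is what lets six entry points share one schema: a new entry point adds a row to the table rather than a new code path.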
The shape
What a memory unit actually looks like.
The schema is small on purpose. Everything that compounds in Architect — sessions, asks, citations, decisions — lands as a row in this shape.
memory_units row
pgvector
{
id: uuid,
workspace_id: uuid, -- single-tenant, RLS-enforced
matter_id: uuid | null,
kind: 'session-lock' | 'ask-anchored' | 'canvas-promoted' | …,
body: text, -- the citable content
citations: citation[], -- back-references to source units
embedding: vector(768), -- nomic-embed-text-v1
graph_edges: edge_kind[], -- concept-overlap, citation, lineage, manual
decay_at: timestamptz, -- reaper checkpoint
created_by: principal,
created_at: timestamptz
}
Hand-authored migrations. No ORM-inferred drift. The shape your data takes is the shape your migration declared, immune to the next library upgrade.
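For application code that reads these rows, the shape might be mirrored as a typed record. The sketch below is an assumption-heavy illustration: the `MemoryUnitRow` class is hypothetical, `created_by` is simplified to a string, citations and edges are simplified to lists of strings, and the only invariant checked is the 768-dimension embedding declared in the schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4

EMBEDDING_DIM = 768  # matches vector(768) in the schema


@dataclass
class MemoryUnitRow:
    workspace_id: UUID
    kind: str                      # e.g. 'session-lock', 'ask-anchored'
    body: str                      # the citable content
    embedding: list
    decay_at: datetime             # reaper checkpoint
    created_by: str                # simplified stand-in for `principal`
    matter_id: Optional[UUID] = None
    citations: list = field(default_factory=list)
    graph_edges: list = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # Reject rows whose embedding would not fit the declared column.
        if len(self.embedding) != EMBEDDING_DIM:
            raise ValueError(
                f"embedding has {len(self.embedding)} dims, expected {EMBEDDING_DIM}"
            )


row = MemoryUnitRow(
    workspace_id=uuid4(),
    kind="session-lock",
    body="Agreed to cap liability at fees paid.",
    embedding=[0.0] * EMBEDDING_DIM,
    decay_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
    created_by="user:alice",
)
```

Checking the dimension at construction time catches a swapped embedding model before the write reaches Postgres, where the column type would reject it anyway.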
A note on what this isn't
Memory is a posture, not a feature flag.
It is not a vector DB you rent from a third party.
It is not RAG-as-a-service running on someone else's GPUs.
It is not a chat history search box pretending to be memory.
It is not a wiki you remember to update.
It is not 'we'll figure out memory later.'
The promises above are not aspirations. They are properties enforced by the substrate — by hand-authored migrations, by an RLS-scoped Postgres role, by a sovereign embedding endpoint, by a verifier pass that fails an ask when its citations don't hold up. Architect carries them by construction, not by reminder.