The semantic layer

The semantic layer is the curated map between business language and your Postgres schema. Agents do not guess table names from raw information_schema dumps alone; they consult entity files that you version in git. RFC-002 §3.5 (REQ-18–22) and the pivot’s context-mode table define how those files are shaped, validated, and loaded.

REQ-18 places semantic-layer files as YAML under <consumer-repo>/semantic/, one file per entity, with the filename equal to the entity name plus .yml. REQ-19 requires a Zod schema in @arivie/semantic that validates every file at both lint time and runtime; the schema is the source of truth for which fields are allowed and which are required.
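
The REQ-18 naming rule is simple enough to check mechanically. The following is a minimal sketch, not part of the arivie API: entityFilePath and filenameMatchesEntity are hypothetical helper names introduced only for illustration.

```typescript
// Hypothetical helpers illustrating REQ-18; not exported by @arivie/semantic.

/** Path where REQ-18 expects the file for `entityName`, relative to the consumer repo root. */
function entityFilePath(entityName: string): string {
  return `semantic/${entityName}.yml`;
}

/** True when the file's basename equals the declared entity name plus ".yml". */
function filenameMatchesEntity(filePath: string, declaredName: string): boolean {
  const basename = filePath.split("/").pop() ?? "";
  return basename === `${declaredName}.yml`;
}
```

A lint step could call filenameMatchesEntity with each file path and the name field parsed from that file, and report a mismatch as an error.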

REQ-20 mandates semantic/catalog.yml, auto-generated from entity files by arivie lint. The catalog is the index the agent consults first in preload and rag modes.
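
To make the idea of a generated index concrete, here is a rough sketch of deriving catalog entries from parsed entity files. The CatalogEntry shape (name, description, file) and the buildCatalog function are assumptions; the real catalog.yml format is whatever arivie lint writes.

```typescript
// Assumed shapes: the real catalog.yml format is defined by arivie lint, not by this sketch.
interface EntitySummary {
  name: string;
  description: string;
}

interface CatalogEntry extends EntitySummary {
  file: string; // path relative to semantic/
}

/** Build a name-sorted index from already-parsed entity files. */
function buildCatalog(entities: EntitySummary[]): CatalogEntry[] {
  return entities
    .map((e) => ({ name: e.name, description: e.description, file: `${e.name}.yml` }))
    .sort((a, b) => a.name.localeCompare(b.name));
}
```

Sorting by name keeps the generated file stable between runs, which keeps git diffs small when an entity is added or edited.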

REQ-21 lists the minimum fields: name, description, grain, primary_key, measures[], dimensions[], segments[], joins[], example_questions[], example_queries[], columns[], and hints[]. Each entry in columns[] carries a pii: boolean flag that defaults to false. REQ-22 requires TypeScript declarations to be emitted to semantic/.generated/index.ts so that React consumers can autocomplete entity names.
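
The REQ-21 field list can be illustrated with a dependency-free check. This is a stand-in for the Zod schema in @arivie/semantic, not a copy of it: the top-level field names come from REQ-21, while checkEntity, normalizeColumn, the column shape beyond pii, and the choice to treat every listed field as required are assumptions.

```typescript
// Illustrative stand-in for the REQ-19 Zod schema; names and shapes here are assumed.
interface Column {
  name: string;
  pii: boolean;
}

const REQUIRED_FIELDS = [
  "name", "description", "grain", "primary_key",
  "measures", "dimensions", "segments", "joins",
  "example_questions", "example_queries", "columns", "hints",
] as const;

// Fields written with a [] suffix in REQ-21.
const ARRAY_FIELDS = new Set<string>([
  "measures", "dimensions", "segments", "joins",
  "example_questions", "example_queries", "columns", "hints",
]);

/** Return a list of problems; an empty list means the minimum fields are present. */
function checkEntity(raw: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    if (!(field in raw)) {
      problems.push(`missing field: ${field}`);
    } else if (ARRAY_FIELDS.has(field) && !Array.isArray(raw[field])) {
      problems.push(`${field} must be a list`);
    }
  }
  return problems;
}

/** Apply the REQ-21 default: a column without an explicit pii flag is treated as non-PII. */
function normalizeColumn(col: { name: string; pii?: boolean }): Column {
  return { name: col.name, pii: col.pii ?? false };
}
```

Returning a list of problems rather than throwing on the first one lets a lint run report every missing field in a file at once.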

The CLI command arivie add entity <name> scaffolds a starter entity; arivie lint validates and regenerates the catalog and generated types.
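
As an illustration of what the generated types could contain, the sketch below renders a union of entity names as TypeScript source text. emitEntityTypes and the exported names EntityName and entityNames are hypothetical; the actual contents of semantic/.generated/index.ts may differ.

```typescript
// Hypothetical generator; the real output of arivie lint may look different.

/** Render TypeScript source declaring a string-literal union of entity names. */
function emitEntityTypes(entityNames: string[]): string {
  const sorted = [...entityNames].sort();
  // An empty layer yields `never`, so the generated file still compiles.
  const union = sorted.map((n) => JSON.stringify(n)).join(" | ") || "never";
  return [
    "// Generated file. Do not edit by hand.",
    `export type EntityName = ${union};`,
    `export const entityNames = ${JSON.stringify(sorted)} as const;`,
    "",
  ].join("\n");
}
```

With a string-literal union like this, an editor can offer entity names as completions wherever a consumer's code accepts an EntityName.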

REQ-13 requires all three modes in v0.1: preload, browse, and rag. Mode is auto-detected by arivie lint from semantic-layer token count; consumers override via semantic.mode in config.

| Mode | When | Mechanism (REQ / pivot) |
| --- | --- | --- |
| preload | Few entities (< ~30; ~10k tokens) | Entire layer flattened into agent instructions (REQ-14: hard 10,000-token cap; exceed → lint error, force browse) |
| browse | Medium catalogs | Mastra Workspace + WorkspaceFilesystem reads ./semantic on demand (REQ-15) |
| rag | Large catalogs | Mastra Vector over chunked paragraphs; reindex via arivie lint --reindex (REQ-16) |
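
The selection logic described above can be sketched as a small function. Only the 10,000-token preload cap and the "lint error, force browse" behavior come from REQ-14. The resolveMode name, the ModeDecision shape, and the ragThreshold parameter are assumptions, since the text does not state where browse ends and rag begins.

```typescript
type Mode = "preload" | "browse" | "rag";

const PRELOAD_TOKEN_CAP = 10_000; // REQ-14 hard cap

interface ModeDecision {
  mode: Mode;
  errors: string[];
}

/**
 * Pick a context mode from the layer's token count.
 * `ragThreshold` is a hypothetical parameter: the browse/rag boundary is not specified in the text.
 * `override` stands in for semantic.mode from config.
 */
function resolveMode(tokenCount: number, ragThreshold: number, override?: Mode): ModeDecision {
  const auto: Mode =
    tokenCount <= PRELOAD_TOKEN_CAP ? "preload" : tokenCount < ragThreshold ? "browse" : "rag";
  const requested = override ?? auto;
  if (requested === "preload" && tokenCount > PRELOAD_TOKEN_CAP) {
    // REQ-14: exceeding the cap is a lint error and forces browse.
    return {
      mode: "browse",
      errors: [`preload exceeds the ${PRELOAD_TOKEN_CAP}-token cap (${tokenCount} tokens)`],
    };
  }
  return { mode: requested, errors: [] };
}
```

In this sketch an explicit override wins over auto-detection in every case except one: a preload request over the cap falls back to browse and reports an error.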

REQ-17 requires the same explore tool interface across modes (query string + optional entity hint → text chunks with provenance). Only the internal mechanism differs; the agent contract does not.
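
One way to picture the shared contract is a single interface with interchangeable backends. REQ-17 fixes only the contract (query plus optional entity hint in, text chunks with provenance out), so every name below is an assumption, and the in-memory backend is a toy that stands in for any of the three mechanisms. A real tool would likely be asynchronous; the sketch is synchronous to stay short.

```typescript
// Assumed names throughout; only the input/output contract comes from REQ-17.
interface ExploreInput {
  query: string;
  entity?: string; // optional entity hint
}

interface Chunk {
  text: string;
  source: string; // provenance: the file the text came from
}

interface Explorer {
  explore(input: ExploreInput): Chunk[];
}

/** Toy backend: case-insensitive substring match over in-memory files. */
class InMemoryExplorer implements Explorer {
  private readonly files: Record<string, string>;

  constructor(files: Record<string, string>) {
    this.files = files;
  }

  explore(input: ExploreInput): Chunk[] {
    const needle = input.query.toLowerCase();
    const chunks: Chunk[] = [];
    for (const source of Object.keys(this.files)) {
      // Honor the entity hint by restricting the search to that entity's file.
      if (input.entity !== undefined && source !== `${input.entity}.yml`) continue;
      for (const line of this.files[source].split("\n")) {
        if (line.toLowerCase().indexOf(needle) !== -1) {
          chunks.push({ text: line.trim(), source });
        }
      }
    }
    return chunks;
  }
}
```

Because callers depend only on Explorer, a preload, browse, or rag backend could be swapped in without changing the agent's prompt or tool schema.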

@arivie/workspace implements SemanticLayerFilesystem as a read-only view of the semantic directory. Write attempts throw immediately, because the semantic layer is text in the consumer’s repo, not a studio UI (RFC-002 problem background §2).
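
The read-only behavior can be sketched with an in-memory stand-in. ReadOnlySemanticFs and its method names are hypothetical; the real SemanticLayerFilesystem and the Mastra interface it implements are not reproduced here.

```typescript
// Hypothetical sketch of a read-only filesystem wrapper; not the @arivie/workspace implementation.
class ReadOnlySemanticFs {
  private readonly files: Record<string, string>;

  constructor(files: Record<string, string>) {
    // Copy so later changes to the caller's object do not leak in.
    this.files = { ...files };
  }

  readFile(path: string): string {
    const body = this.files[path];
    if (body === undefined) throw new Error(`not found: ${path}`);
    return body;
  }

  listFiles(): string[] {
    return Object.keys(this.files).sort();
  }

  /** Every write is rejected up front; nothing is buffered or partially applied. */
  writeFile(path: string, _content: string): never {
    throw new Error(`semantic layer is read-only: refusing to write ${path}`);
  }
}
```

Failing at the call site rather than silently ignoring the write makes it obvious to an agent, or to a developer reading logs, that edits belong in the repo and go through git.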

The examples/with-nextjs/semantic/ directory is the canonical reference: five entities (customers, orders, products, line_items, invoices) plus catalog.yml, seeded by seed.sql for local and CI runs.