Architecture

How JFYI is structured, why it is structured that way, and the rules that govern how it should grow.

Mission

JFYI exists to give an AI agent useful information about the human user — their behaviour, expectations, and preferences — at the start of every interaction, so the agent can act usefully on the first try and reduce the volume of corrections, errors, and rework that would otherwise consume the user's time.

The system is one-purpose: profile the human; serve the profile back to the agent at session start.

This is distinct from general-purpose agent memory (storing facts about the world) and from per-project instructions (CLAUDE.md / GEMINI.md files that describe the codebase). JFYI captures the developer as a person — their preferences, patterns, communication style, and technical opinions — not the specifics of any one project.

Test for any new feature: Does this serve the agent reading better-curated info about the user? Yes → core. Maybe/opportunistic → supplementary. No → out of scope.

Core asymmetry: write raw, read curated

The MCP surface follows the same shape at every tier. The agent writes raw observations during a session. A curation step (a human in the dashboard, or an automated aggregation) distils those raw observations into low-volume, high-signal artifacts. The agent reads only the curated artifacts at the start of the next session.

Tier | Agent writes (raw) | Curator | Agent reads (curated)
Profile | add_profile_note: opportunistic observations about the developer's preferences and patterns | Human in the /notes dashboard UX: composes notes into rules; deletes noise | get_developer_profile: the curated rules constitution
Analytics | record_interaction: friction signals after each generation (was_corrected, latency, edit count) | AnalyticsEngine: computes correction rate, friction score, latency aggregates, vibe matches | get_agent_analytics: comparative friction rates by agent model
Episodic | (background summarizer writes; agents do not write to this tier directly) | Background summarizer: distils session interactions into durable summaries | recall_episodic: session summaries on demand

The asymmetry is the value: agents see only what the curation step has approved. This keeps the "developer's constitution" small and high-signal, rather than accumulating every observation verbatim.

The curation step typically crosses tiers — it transforms evidence (notes) into conclusions (rules), leaving the evidence in place. Intra-tier curation (notes → notes) loses this value because nothing gets elevated to a trusted, curated form.
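
The asymmetry can be illustrated with a minimal in-memory sketch. The tool names add_profile_note and get_developer_profile come from the table above; the ProfileStore class, its promote method, and the list-based storage are hypothetical stand-ins, not JFYI's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileStore:
    """Hypothetical in-memory stand-in for the profile tier."""

    notes: list[str] = field(default_factory=list)  # raw, agent-written
    rules: list[str] = field(default_factory=list)  # curated, human-approved

    def add_profile_note(self, text: str) -> None:
        """Agent-facing write path: accepts raw observations only."""
        self.notes.append(text)

    def get_developer_profile(self) -> list[str]:
        """Agent-facing read path: returns curated rules, never raw notes."""
        return list(self.rules)

    def promote(self, note_indexes: list[int], rule_text: str) -> None:
        """Curator-facing path: elevates evidence into a rule.

        The cited notes must exist, and they are left in place afterwards.
        """
        if not all(0 <= i < len(self.notes) for i in note_indexes):
            raise IndexError("rule cites a note that does not exist")
        self.rules.append(rule_text)
```

In this sketch a note filed by the agent is invisible to get_developer_profile until a curator calls promote, which is the cross-tier step described above.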

Memory tiers

Short-term (session scratchpad)

TTL-evicted key/value store scoped to a session. Agents can pass context between tool calls within a session without writing to permanent storage. Does not appear in the developer profile. Accessed via remember_short_term (discoverable tool).
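
A TTL-evicted, session-scoped store of this kind might look like the sketch below. The ShortTermMemory class, its method names, the 900-second default, and the lazy eviction on read are all assumptions made for illustration; the source only names the remember_short_term tool.

```python
import time
from typing import Callable, Optional


class ShortTermMemory:
    """Hypothetical TTL-evicted key/value store, scoped per session."""

    def __init__(self, ttl_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # (session_id, key) -> (expiry timestamp, value)
        self._data: dict = {}

    def remember(self, session_id: str, key: str, value: str) -> None:
        """Store a value that expires ttl_seconds from now."""
        self._data[(session_id, key)] = (self._clock() + self._ttl, value)

    def recall(self, session_id: str, key: str) -> Optional[str]:
        """Return the value if present and unexpired, else None."""
        entry = self._data.get((session_id, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[(session_id, key)]  # lazy eviction on read
            return None
        return value
```

Keying on the session identifier keeps one session's scratchpad out of another's, and nothing here touches permanent storage.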

Long-term (profile notes)

Raw observations filed by agents via add_profile_note. Each note has a text, category, confidence score, and source identifier. Notes are never read back to agents directly — they exist for the human to review and promote into curated rules. The profile_notes table holds notes from multiple sources: manual agent filings, automated inferences (source='inferred', confidence=0.3 from Semantic Rule Inference), and any other raw observation channel.
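
As a rough sketch of the note record described above: the four fields and the inferred-note defaults (source='inferred', confidence=0.3) are taken from the text, while the ProfileNote class, the inferred_note helper, and the assumption that confidence lies between 0.0 and 1.0 are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileNote:
    """Hypothetical shape of one row in the profile_notes table."""

    text: str
    category: str
    confidence: float
    source: str

    def __post_init__(self) -> None:
        # Assumed range; the source does not state the bounds explicitly.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def inferred_note(text: str, category: str) -> ProfileNote:
    """Build a note the way Semantic Rule Inference is described as filing it."""
    return ProfileNote(text=text, category=category,
                       confidence=0.3, source="inferred")
```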

Curated (rules — the constitution)

The developer's explicit preferences and principles. Every rule has been either directly authored by the developer or promoted from a note after human review. Rules have a scope (global or project), an optional project_id, and a confidence score that rises when vibe matches occur in sessions where that rule was active. Agents read curated rules via get_developer_profile, optionally scoped by project context.
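
One way to picture scoped reads is the sketch below. The scope, project_id, and confidence fields come from the description above; the Rule class, the select_rules helper, the default confidence, and the confidence-descending ordering are assumptions rather than documented behaviour of get_developer_profile.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    text: str
    scope: str                         # "global" or "project"
    project_id: Optional[str] = None   # set only for project-scoped rules
    confidence: float = 0.5            # hypothetical starting value


def select_rules(rules: list[Rule],
                 project_context: Optional[str] = None) -> list[Rule]:
    """Return global rules plus rules scoped to the given project, if any."""
    selected = [
        r for r in rules
        if r.scope == "global"
        or (r.scope == "project"
            and project_context is not None
            and r.project_id == project_context)
    ]
    # Highest-confidence rules first (an assumed ordering, not from the source).
    return sorted(selected, key=lambda r: r.confidence, reverse=True)
```

Without a project context, only global rules are returned; rules belonging to other projects are never included.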

Episodic (session summaries)

Distilled session records written by the background summarizer. Each record stores an event type, a summary string, and optional context JSON. Supports semantic search (vector embeddings when the vector extra is enabled). Used by recall_episodic and by the Agent Warming feature when selecting best sessions.

Data flow

A typical session proceeds as follows:

  1. Agent calls get_developer_profile(project_context="myproject") at session start to load the developer's constitution (curated rules) and any project-scoped overrides.
  2. Agent optionally calls warm_agent() for a Vibe Brief — a synthesised style sample from the developer's best past sessions — to accelerate cold-start alignment.
  3. During the session, after each generation, the agent calls record_interaction with the prompt, response, and correction signal (was_corrected, latency, edit count). The analytics engine computes a friction score and updates running aggregates. If the response is long and uncorrected, a vibe match is recorded and relevant rule confidence scores are boosted.
  4. When the agent observes a pattern worth noting, it files it via add_profile_note. These land in the raw notes tier for human review.
  5. The background summarizer runs on an interval (default 300 s), distilling recent interactions into episodic summaries that persist across sessions.
  6. If Semantic Rule Inference is enabled, a separate background engine (default 600 s interval) processes corrected interactions, calls an LLM to identify the violated principle, and files low-confidence inferred notes (confidence=0.3, source='inferred') for the developer to review.
  7. In the dashboard, the developer reviews raw notes, composes rules from one or more notes via the synthesis flow, and deletes noise. Composed rules enter the curated tier and appear in the next get_developer_profile call.
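
The vibe-match part of step 3 can be sketched as follows. The source says only that a long, uncorrected response records a vibe match and boosts relevant rule confidence; the length threshold, the boost size, the cap at 1.0, and every name in this block are hypothetical.

```python
from dataclasses import dataclass

VIBE_MIN_CHARS = 400     # hypothetical threshold for a "long" response
CONFIDENCE_BOOST = 0.05  # hypothetical increment per vibe match


@dataclass
class Interaction:
    prompt: str
    response: str
    was_corrected: bool


def is_vibe_match(interaction: Interaction) -> bool:
    """A response counts as a vibe match when it is long and uncorrected."""
    return (not interaction.was_corrected
            and len(interaction.response) >= VIBE_MIN_CHARS)


def boost_active_rules(confidences: dict[str, float],
                       active_rule_ids: list[str],
                       interaction: Interaction) -> dict[str, float]:
    """Return a copy of confidences with active rules boosted on a vibe match."""
    boosted = dict(confidences)
    if not is_vibe_match(interaction):
        return boosted
    for rule_id in active_rule_ids:
        if rule_id in boosted:
            # Cap at 1.0 so repeated matches cannot overflow the assumed range.
            boosted[rule_id] = min(1.0, boosted[rule_id] + CONFIDENCE_BOOST)
    return boosted
```

Only rules that were active in the session are touched, and a corrected interaction leaves every score unchanged.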

Core vs supplementary

Core

Features that serve the agent reading better-curated info about the user. The read-curated path runs on every session start, so improvements here compound directly with every agent call.

Supplementary

Emergent analysis surfaces. Useful when patterns emerge, but not load-bearing. The user can ignore them and the core mission is still served.

Anti-patterns

Three patterns that have come up in implementation history and should be actively avoided.

1. Inverted asymmetry

Letting the agent author the curated form directly. Example: an MCP tool that lets agents propose rules and have them appear in the constitution without explicit human approval. This bypasses the curation step — the agent grades its own homework.

The dashboard's Synthesize feature is LLM-driven, but it is initiated by the human and produces a preview requiring explicit approval. That is curation. An MCP tool that authored rules directly would not be. Any propose-style MCP tool must land in a draft queue requiring human ratification, never in the rules table directly.
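
A draft queue of the kind this rule calls for could be sketched like this. The RuleDrafts class and the propose_rule, ratify, and reject names are hypothetical; the point is only that the agent-callable path never writes to the curated rules.

```python
from dataclasses import dataclass, field


@dataclass
class RuleDrafts:
    """Hypothetical draft queue between propose-style tools and the rules table."""

    drafts: dict[int, str] = field(default_factory=dict)
    rules: list[str] = field(default_factory=list)
    _next_id: int = 1

    def propose_rule(self, text: str) -> int:
        """Agent-callable: queues a draft and returns its id. Never touches rules."""
        draft_id = self._next_id
        self._next_id += 1
        self.drafts[draft_id] = text
        return draft_id

    def ratify(self, draft_id: int) -> None:
        """Human-only: moves an approved draft into the curated rules."""
        self.rules.append(self.drafts.pop(draft_id))

    def reject(self, draft_id: int) -> None:
        """Human-only: discards a draft without affecting the rules."""
        self.drafts.pop(draft_id)
```

Exposing only propose_rule over MCP, and keeping ratify and reject behind the dashboard, would preserve the human ratification step.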

2. Dormant artifact drift

Shipping schemas, columns, and endpoints as "forward-compat" for features that never materialise. Example: profile_notes.archived was carried from the pre-split schema for a bulk-archive flow that was never built. It was pruned in v2.10.1 once the dormancy was clear. When a write path stops, prune the supporting structure on a deliberate cadence. Don't accumulate columns "just in case."

3. Curation-bypass through volume

Adding many low-signal artifacts to the curated tier. Example: a synthesis exercise that proposed 10 rules, 5 of which were near-duplicates of existing rules, added noise rather than information. The rules tier must stay small to do its job: it is maintained, not appended to.

Decision rules for new features

Before adding any new feature, route, MCP tool, or schema column, apply these three checks:

  1. Does it serve the agent reading better-curated info about the user?
    • Yes → core. Prioritise.
    • Maybe → supplementary. Earn its place by either (a) producing signal that flows back into the curation step, or (b) producing diagnostic value the user actively consumes.
    • No → out of scope. Keep it out.
  2. For MCP tool surface specifically: agent-writable tools should produce raw observations; agent-readable tools should return curated artifacts. Tools that let agents author curated artifacts directly invert the asymmetry.
  3. Avoid forward-compat speculation: don't ship columns, endpoints, or routes for features that aren't being built right now. Add them when they earn their place; prune them when they go dormant.
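
Checks 1 and 2 are mechanical enough to write down as a small sketch. The Verdict enum and both function names are hypothetical, and treating a "maybe" that has not earned its place as out of scope is one reading of the rule, not something the text states outright.

```python
from enum import Enum


class Verdict(Enum):
    CORE = "core"
    SUPPLEMENTARY = "supplementary"
    OUT_OF_SCOPE = "out of scope"


def triage(serves_curated_read: str,
           feeds_curation: bool = False,
           user_consumes_diagnostics: bool = False) -> Verdict:
    """Check 1: classify a proposal by how it serves the curated read path.

    serves_curated_read is "yes", "maybe", or "no".
    """
    if serves_curated_read == "yes":
        return Verdict.CORE
    if serves_curated_read == "maybe" and (feeds_curation
                                           or user_consumes_diagnostics):
        return Verdict.SUPPLEMENTARY
    # "no", or a "maybe" that has not earned its place (an assumed reading).
    return Verdict.OUT_OF_SCOPE


def respects_asymmetry(agent_access: str, artifact: str) -> bool:
    """Check 2: agents write raw artifacts and read curated ones."""
    allowed = {("write", "raw"), ("read", "curated")}
    return (agent_access, artifact) in allowed
```

A tool that lets an agent write a curated artifact fails respects_asymmetry, which matches the inverted-asymmetry anti-pattern.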

Deployment architecture

JFYI is a single Python package (src/jfyi/) serving three roles from one process on port 8080:

Persistence: SQLite at JFYI_DB_PATH (default /data/jfyi.db). Schema migrations are forward-only, tracked by PRAGMA user_version, and applied inline by _run_migrations() at startup. An optional ChromaDB vector store provides semantic search, gated behind the vector extra and JFYI_ENABLE_VECTOR_DB=true.
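
Forward-only migrations keyed on PRAGMA user_version generally follow the pattern sketched below. This is not JFYI's _run_migrations(); the run_migrations function, the MIGRATIONS list, and both SQL statements are hypothetical and exist only to show the mechanism.

```python
import sqlite3

# Hypothetical migrations: index i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE profile_notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)",
    "ALTER TABLE profile_notes ADD COLUMN confidence REAL NOT NULL DEFAULT 0.5",
]


def run_migrations(conn: sqlite3.Connection) -> int:
    """Apply every migration newer than the stored version; return the new version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; version is a trusted int.
        conn.execute(f"PRAGMA user_version = {version + 1}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Because the loop starts at the stored version, running it again on an up-to-date database is a no-op, and there is no downgrade path by construction.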

Kubernetes deployment: Helm chart in helm/jfyi-mcp-server/, published as OCI artifact to GHCR. Data persists on a PVC mounted at /data. ML embedding models download at runtime to a persisted path (SENTENCE_TRANSFORMERS_HOME=/data/models) — never baked into the container image.

Multi-arch: container image builds for linux/amd64 and linux/arm64 (Raspberry Pi 5). The arm64 build runs on a self-hosted Pi runner in GitHub Actions.