Architecture
How JFYI is structured, why it is structured that way, and the rules that govern how it should grow.
Mission
JFYI exists to give an AI agent useful information about the human user — their behaviour, expectations, and preferences — at the start of every interaction, so the agent can act usefully on the first try and reduce the volume of corrections, errors, and rework that would otherwise consume the user's time.
The system is one-purpose: profile the human; serve the profile back to the agent at session start.
This is distinct from general-purpose agent memory (storing facts about the world) and from per-project instructions (CLAUDE.md / GEMINI.md files that describe the codebase). JFYI captures the developer as a person — their preferences, patterns, communication style, and technical opinions — not the specifics of any one project.
Test for any new feature: Does this serve the agent reading better-curated info about the user? Yes → core. Maybe/opportunistic → supplementary. No → out of scope.
Core asymmetry: write raw, read curated
The MCP surface follows the same shape at every tier. The agent writes raw observations during a session. A curation step (a human in the dashboard, or an automated aggregation) distils those raw observations into low-volume, high-signal artifacts. The agent reads only the curated artifacts at the start of the next session.
| Tier | Agent writes (raw) | Curator | Agent reads (curated) |
|---|---|---|---|
| Profile | add_profile_note: opportunistic observations about the developer's preferences and patterns | Human in the /notes dashboard UX: composes notes into rules; deletes noise | get_developer_profile: the curated rules constitution |
| Analytics | record_interaction: friction signals after each generation (was_corrected, latency, edit count) | AnalyticsEngine: computes correction rate, friction score, latency aggregates, vibe matches | get_agent_analytics: comparative friction rates by agent model |
| Episodic | (background summarizer writes; agents do not write to this tier directly) | Background summarizer: distils session interactions into durable summaries | recall_episodic: session summaries on demand |
The asymmetry is the value: agents see only what the curation step has approved. This keeps the "developer's constitution" small and high-signal, rather than accumulating every observation verbatim.
The curation step typically crosses tiers — it transforms evidence (notes) into conclusions (rules), leaving the evidence in place. Intra-tier curation (notes → notes) loses this value because nothing gets elevated to a trusted, curated form.
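The asymmetry can be sketched as a small in-memory model. This is an illustration only, not JFYI's implementation: `ProfileStore`, `Note`, `compose_rule`, and the default argument values are hypothetical names and values chosen for the example; only `add_profile_note` and `get_developer_profile` come from the tool surface described above.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    text: str
    category: str
    confidence: float
    source: str


@dataclass
class ProfileStore:
    """Hypothetical in-memory stand-in for the notes and rules tiers."""

    notes: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def add_profile_note(self, text, category="general", confidence=0.5, source="agent"):
        # Agent-facing write path: raw observations only.
        self.notes.append(Note(text, category, confidence, source))

    def compose_rule(self, note_indexes, rule_text):
        # Human-facing curation path: evidence stays in place,
        # only the conclusion is added to the rules tier.
        if not all(0 <= i < len(self.notes) for i in note_indexes):
            raise IndexError("unknown note index")
        self.rules.append(rule_text)

    def get_developer_profile(self):
        # Agent-facing read path: curated rules only, never raw notes.
        return list(self.rules)
```

The point of the sketch is that no agent-facing method touches `rules` for writing, and no agent-facing method returns `notes`.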
Memory tiers
Short-term (session scratchpad)
TTL-evicted key/value store scoped to a session. Agents can pass context between tool calls within a session without writing to permanent storage. Does not appear in the developer profile. Accessed via remember_short_term (discoverable tool).
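A minimal sketch of a TTL-evicted, session-scoped store follows. The class name, method names, the 900-second default, and the lazy eviction-on-read strategy are all assumptions made for illustration; the actual behaviour behind remember_short_term may differ.

```python
import time


class ShortTermMemory:
    """Hypothetical TTL-evicted key/value store, scoped per session."""

    def __init__(self, ttl_seconds=900.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}     # (session_id, key) -> (expires_at, value)

    def remember(self, session_id, key, value):
        self._items[(session_id, key)] = (self._clock() + self._ttl, value)

    def recall(self, session_id, key, default=None):
        entry = self._items.get((session_id, key))
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Lazy eviction: an expired entry is dropped when it is next read.
            del self._items[(session_id, key)]
            return default
        return value
```

Keying on `(session_id, key)` keeps one session's scratchpad invisible to another, which matches the session scoping described above.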
Long-term (profile notes)
Raw observations filed by agents via add_profile_note. Each note has a text, category, confidence score, and source identifier. Notes are never read back to agents directly — they exist for the human to review and promote into curated rules. The profile_notes table holds notes from multiple sources: manual agent filings, automated inferences (source='inferred', confidence=0.3 from Semantic Rule Inference), and any other raw observation channel.
Curated (rules — the constitution)
The developer's explicit preferences and principles. Every rule has been either directly authored by the developer or promoted from a note after human review. Rules have a scope (global or project), an optional project_id, and a confidence score that rises when vibe matches occur in sessions where that rule was active. Agents read curated rules via get_developer_profile, optionally scoped by project context.
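As an illustration of scope and confidence, the sketch below filters rules by project context and raises confidence after a vibe match. The `Rule` dataclass, both function names, the 0.5 starting confidence, and the 0.05 step are hypothetical; the source does not specify how large a boost is or how it is capped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    text: str
    scope: str                        # "global" or "project"
    project_id: Optional[str] = None  # set only for project-scoped rules
    confidence: float = 0.5           # assumed starting value


def select_rules(rules, project_context=None):
    """Return global rules plus rules scoped to the given project."""
    return [
        r for r in rules
        if r.scope == "global"
        or (
            r.scope == "project"
            and project_context is not None
            and r.project_id == project_context
        )
    ]


def boost_confidence(rule, step=0.05):
    """Raise confidence after a vibe match, capped at 1.0 (step is assumed)."""
    rule.confidence = min(1.0, rule.confidence + step)
```

Calling `select_rules(rules)` with no project context returns only the global rules, so a project-scoped rule never leaks into an unrelated session.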
Episodic (session summaries)
Distilled session records written by the background summarizer. Each record stores an event type, a summary string, and optional context JSON. Supports semantic search (vector embeddings when the vector extra is enabled). Used by recall_episodic and by the Agent Warming feature when selecting best sessions.
Data flow
A typical session proceeds as follows:
- Agent calls get_developer_profile(project_context="myproject") at session start to load the developer's constitution (curated rules) and any project-scoped overrides.
- Agent optionally calls warm_agent() for a Vibe Brief (a synthesised style sample from the developer's best past sessions) to accelerate cold-start alignment.
- During the session, after each generation, the agent calls record_interaction with the prompt, response, and correction signal (was_corrected, latency, edit count). The analytics engine computes a friction score and updates running aggregates. If the response is long and uncorrected, a vibe match is recorded and relevant rule confidence scores are boosted.
- When the agent observes a pattern worth noting, it files it via add_profile_note. These land in the raw notes tier for human review.
- The background summarizer runs on an interval (default 300 s), distilling recent interactions into episodic summaries that persist across sessions.
- If Semantic Rule Inference is enabled, a separate background engine (default 600 s interval) processes corrected interactions, calls an LLM to identify the violated principle, and files low-confidence inferred notes (confidence=0.3, source='inferred') for the developer to review.
- In the dashboard, the developer reviews raw notes, composes rules from one or more notes via the synthesis flow, and deletes noise. Composed rules enter the curated tier and appear in the next get_developer_profile call.
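The agent-side part of this flow can be sketched with a stub client that only logs tool calls. `RecordingClient`, `call_tool`, `run_session`, and the argument names passed to record_interaction and add_profile_note are assumptions for illustration; the real tool schemas are not given here.

```python
class RecordingClient:
    """Hypothetical stand-in for an MCP client; it only logs the tool calls."""

    def __init__(self):
        self.calls = []

    def call_tool(self, name, **arguments):
        self.calls.append((name, arguments))
        return {}


def run_session(client, project, turns):
    """Walk one session: load the profile, record each turn, file notes as they arise."""
    client.call_tool("get_developer_profile", project_context=project)
    client.call_tool("warm_agent")  # optional Vibe Brief
    for turn in turns:
        # Argument names below are assumed, not taken from the real schema.
        client.call_tool(
            "record_interaction",
            prompt=turn["prompt"],
            response=turn["response"],
            was_corrected=turn["was_corrected"],
        )
        if turn.get("observation"):
            client.call_tool("add_profile_note", text=turn["observation"])
```

Summarisation, inference, and rule composition do not appear in the sketch because they happen in the background engines and the dashboard, outside the agent's control.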
Core vs supplementary
Core
Features that serve the agent reading better-curated info about the user. The read-curated path runs on every session start, so improvements here compound directly with every agent call.
- The /notes and /profile dashboard surfaces
- add_profile_note and get_developer_profile MCP tools
- Note → rule synthesis flow (preview, compose-into-rule modal)
- Rule confidence, scope, and provenance
- Vibe Telemetry (real-time alignment signal for agent self-correction)
- Agent Warming (cold-start acceleration)
- Tiered Profiling (global vs project-scoped rules)
Supplementary
Emergent analysis surfaces. Useful when patterns emerge, but not load-bearing. The user can ignore them and the core mission is still served.
- Friction Clustering — groups similar friction events into semantic clusters, surfaces a "Vibe Map" of recurring gaps. Supplementary because it does not change what the agent reads; it surfaces signals for the human to act on.
- Developer Analytics — correction rates, friction by agent, latency trends, rule-confidence accumulation. Supplementary diagnostic value.
- Agent Analytics — comparative agent performance across the fleet. Supplementary diagnostic value.
- Positive Reinforcement — vibe matches, confidence boosting. Bridges supplementary → core by feeding confidence signals back into the curated rules tier.
- Semantic Rule Inference — bridges supplementary → core by filing inferred notes that the developer can promote to rules.
Anti-patterns
Three patterns have come up in implementation history and should be actively avoided.
1. Inverted asymmetry
Letting the agent author the curated form directly. Example: an MCP tool that lets agents propose rules and have them appear in the constitution without explicit human approval. This bypasses the curation step — the agent grades its own homework.
The dashboard's Synthesize feature is LLM-driven, but it is initiated by the human and produces a preview requiring explicit approval. That is curation. An MCP tool that authored rules directly would not be. Any propose-style MCP tool must land in a draft queue requiring human ratification, never in the rules table directly.
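One way to picture the draft-queue requirement is the sketch below. `RuleDraftQueue` and its methods are hypothetical; JFYI does not necessarily expose a propose-style tool, and the boolean approval flag stands in for whatever dashboard action would ratify a draft.

```python
import itertools


class RuleDraftQueue:
    """Hypothetical queue sitting between agent proposals and the rules tier."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}  # draft_id -> {"text": ..., "proposed_by": ...}
        self.rules = []

    def propose(self, text, proposed_by):
        # Agent-callable: the proposal lands in the queue, never in rules.
        draft_id = next(self._ids)
        self.pending[draft_id] = {"text": text, "proposed_by": proposed_by}
        return draft_id

    def ratify(self, draft_id, approved_by_human):
        # Human-only step: promotion requires explicit approval.
        if not approved_by_human:
            raise PermissionError("rules require explicit human approval")
        draft = self.pending.pop(draft_id)
        self.rules.append(draft["text"])
        return draft["text"]

    def reject(self, draft_id):
        # Discard a draft without touching the rules tier.
        self.pending.pop(draft_id)
```

The only path into `rules` runs through `ratify`, so an agent calling `propose` cannot change what the next session reads.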
2. Dormant artifact drift
Shipping schemas, columns, and endpoints as "forward-compat" for features that never materialise. Example: profile_notes.archived was carried from the pre-split schema for a bulk-archive flow that was never built. It was pruned in v2.10.1 once the dormancy was clear. When a write path stops, prune the supporting structure on a deliberate cadence. Don't accumulate columns "just in case."
3. Curation-bypass through volume
Adding many low-signal artifacts to the curated tier. Example: a synthesis exercise proposed 10 rules, 5 of which were near-duplicates of existing rules; it added noise, not information. The rules tier must stay small to do its job: it is maintained, not appended to.
Decision rules for new features
Before adding any new feature, route, MCP tool, or schema column, apply these three checks:
- Does it serve the agent reading better-curated info about the user?
  - Yes → core. Prioritise.
  - Maybe → supplementary. Earn its place by either (a) producing signal that flows back into the curation step, or (b) producing diagnostic value the user actively consumes.
  - No → out of scope. Keep it out.
- For MCP tool surface specifically: agent-writable tools should produce raw observations; agent-readable tools should return curated artifacts. Tools that let agents author curated artifacts directly invert the asymmetry.
- Avoid forward-compat speculation: don't ship columns, endpoints, or routes for features that aren't being built right now. Add them when they earn their place; prune them when they go dormant.
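The MCP tool surface check could be expressed as a small lint over tool declarations. This is a sketch under assumptions: `ToolSpec`, its `direction` and `artifact` fields, and `asymmetry_violations` are hypothetical, and the strict reading below covers only tools that touch the raw and curated tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    direction: str  # "write" (agent to JFYI) or "read" (JFYI to agent)
    artifact: str   # "raw" or "curated"


def asymmetry_violations(tools):
    """Return the names of tools that break write-raw / read-curated."""
    allowed = {("write", "raw"), ("read", "curated")}
    return [t.name for t in tools if (t.direction, t.artifact) not in allowed]
```

For example, a hypothetical `create_rule` tool declared as `ToolSpec("create_rule", "write", "curated")` would be reported, while add_profile_note (write, raw) and get_developer_profile (read, curated) would pass.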
Deployment architecture
JFYI is a single Python package (src/jfyi/) serving three roles from one process on port 8080:
- MCP Server (server.py): exposes the tool catalogue over stdio or SSE. Always-on tools (record_interaction, get_developer_profile) are registered directly. Discoverable tools are exposed via discover_tools to reduce initial schema injection cost.
- Web Dashboard (web/app.py): FastAPI REST API mirroring MCP tools, plus a vanilla HTML/JS/CSS SPA (no Node.js build step). Serves the notes review, rule composition, analytics, and memory explorer UIs.
- Background Engines: async loops running inside the same process: BackgroundSummarizer (session distillation) and optionally InferenceEngine (semantic rule inference). Both respect daily token caps and degrade gracefully when no LLM API key is configured.
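The shape of such a background engine can be sketched with asyncio. `TokenBudget`, `run_engine`, and all parameter names are hypothetical, and the sketch leaves out the daily reset of the cap; it shows only the interval loop, the budget check, and the skip-instead-of-fail behaviour when no API key is present.

```python
import asyncio


class TokenBudget:
    """Hypothetical daily token cap; the daily reset is left out of this sketch."""

    def __init__(self, daily_cap):
        self.daily_cap = daily_cap
        self.used = 0

    def try_spend(self, tokens):
        if self.used + tokens > self.daily_cap:
            return False
        self.used += tokens
        return True


async def run_engine(work, interval_s, budget, api_key, est_tokens, max_cycles=None):
    """Call `work` every `interval_s` seconds while a key and budget are available."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        # Degrade gracefully: skip the LLM-backed work instead of raising.
        if api_key and budget.try_spend(est_tokens):
            await work()
        await asyncio.sleep(interval_s)
```

`max_cycles` exists only so the loop can be exercised in a test; a long-running engine would leave it as `None` and be cancelled at shutdown.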
Persistence: SQLite at JFYI_DB_PATH (default /data/jfyi.db). Schema migrations tracked by PRAGMA user_version; forward-only; inline _run_migrations() at startup. Optional ChromaDB vector store for semantic search, gated behind the vector extra and JFYI_ENABLE_VECTOR_DB=true.
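A forward-only migration runner keyed on PRAGMA user_version might look like the sketch below. It is not the project's _run_migrations(): the `MIGRATIONS` list, its two statements, and the function name `run_migrations` are made up for illustration. The connection is assumed to be opened with `isolation_level=None` so the explicit BEGIN/COMMIT statements control the transaction.

```python
import sqlite3

# Hypothetical migration list; entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE profile_notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)",
    "ALTER TABLE profile_notes ADD COLUMN source TEXT",
]


def run_migrations(conn, migrations=MIGRATIONS):
    """Apply pending migrations in order and return the resulting version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    if current > len(migrations):
        # Forward-only: a newer database is an error, not a downgrade request.
        raise RuntimeError("database schema is newer than this code")
    for version, statement in enumerate(migrations[current:], start=current + 1):
        conn.execute("BEGIN")
        try:
            conn.execute(statement)
            # PRAGMA does not accept bound parameters; version is a trusted int.
            conn.execute(f"PRAGMA user_version = {version}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Usage would be along the lines of `run_migrations(sqlite3.connect(path, isolation_level=None))` at startup. Running it a second time is a no-op because the stored version already equals the number of migrations.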
Kubernetes deployment: Helm chart in helm/jfyi-mcp-server/, published as OCI artifact to GHCR. Data persists on a PVC mounted at /data. ML embedding models download at runtime to a persisted path (SENTENCE_TRANSFORMERS_HOME=/data/models) — never baked into the container image.
Multi-arch: container image builds for linux/amd64 and linux/arm64 (Raspberry Pi 5). The arm64 build runs on a self-hosted Pi runner in GitHub Actions.