Features

All shipped features across six phases, with rationale and implementation notes.

v2.12.0 — Phase 6: Vibe Coder Optimization Shipped

Phase 6 was pulled forward from a planned v3.1.0 slot because all six features are pure enhancements to the existing MCP profile loop — none depend on the Phase 5 protocol work (ACP/A2A). The theme is tighter developer-agent alignment: positive signals, scoped rules, real-time telemetry, smarter inference, and fast onboarding for new agents.

Tiered Profiling

Tag: Core  |  PR: #40

Adds a scope column ('global' or 'project'), an optional project_id, and a confidence score to the profile_rules table. The get_developer_profile tool accepts a project_context argument — when provided, it returns global rules plus any project-scoped rules whose project_id matches, with project rules sorted first and taking precedence in any category overlap.
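The selection logic described above can be sketched as follows. This is a minimal illustration, not the shipped code: the Rule dataclass and select_rules function are hypothetical names, and "precedence" is modelled here by dropping a global rule when a matching project rule covers the same category, which is one plausible reading of the behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    # Hypothetical in-memory mirror of a profile_rules row.
    category: str
    text: str
    scope: str = "global"            # 'global' or 'project'
    project_id: Optional[str] = None
    confidence: float = 0.5


def select_rules(rules: List[Rule], project_context: Optional[str] = None) -> List[Rule]:
    """Return matching project rules first, then non-overlapping global rules."""
    project_rules = [
        r for r in rules
        if r.scope == "project"
        and project_context is not None
        and r.project_id == project_context
    ]
    # Assumption: a project rule overrides global rules in the same category.
    overridden = {r.category for r in project_rules}
    global_rules = [
        r for r in rules
        if r.scope == "global" and r.category not in overridden
    ]
    return project_rules + global_rules


rules = [
    Rule("typing", "use strict type annotations"),
    Rule("style", "keep functions short"),
    Rule("typing", "skip type hints", scope="project", project_id="side-project"),
]
scoped = select_rules(rules, "side-project")
unscoped = select_rules(rules)
```

Without a project_context, only global rules come back, so an agent working outside any known project still receives the baseline profile.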

Why it matters: A developer may prefer quick-and-dirty scripting on a side project but strict type safety on a production codebase. Without scope, the agent applies enterprise norms to prototypes and vice versa. Tiered profiling prevents that context pollution.

Schema migration: v11 — adds scope TEXT DEFAULT 'global', project_id TEXT, confidence REAL DEFAULT 0.5 to profile_rules; adds index on (user_id, scope, project_id).
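As a rough sketch of what the v11 migration amounts to, assuming a SQLite-style database (the document does not name the engine) and a hypothetical index name and starting table layout:

```python
import sqlite3

# Statements mirror the columns and index listed above.
# The index name idx_profile_rules_scope is an assumption.
MIGRATION_V11 = [
    "ALTER TABLE profile_rules ADD COLUMN scope TEXT DEFAULT 'global'",
    "ALTER TABLE profile_rules ADD COLUMN project_id TEXT",
    "ALTER TABLE profile_rules ADD COLUMN confidence REAL DEFAULT 0.5",
    "CREATE INDEX idx_profile_rules_scope "
    "ON profile_rules (user_id, scope, project_id)",
]


def migrate_v11(conn: sqlite3.Connection) -> None:
    """Apply each statement in order, then commit."""
    for statement in MIGRATION_V11:
        conn.execute(statement)
    conn.commit()


# Demo against a simplified, hypothetical pre-migration table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile_rules (id INTEGER PRIMARY KEY, user_id TEXT, rule TEXT)"
)
conn.execute(
    "INSERT INTO profile_rules (user_id, rule) VALUES ('u1', 'prefer type hints')"
)
migrate_v11(conn)
migrated_row = conn.execute(
    "SELECT scope, project_id, confidence FROM profile_rules"
).fetchone()
```

Existing rows pick up the column defaults, so rules written before the migration behave as global rules with a confidence of 0.5.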

Positive Reinforcement

Tag: Core  |  PR: #41

Balances JFYI's negative-first feedback loop by tracking "Vibe Matches" — interactions that were accepted with zero corrections and a response length ≥ 500 characters. When a vibe match is detected via record_interaction, the analytics engine boosts the confidence of curated rules that were active in that session by 0.05 (capped at 1.0).
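The detection rule and the capped boost are simple enough to sketch directly. The function and constant names below are hypothetical; only the 500-character threshold, the 0.05 increment, and the 1.0 cap come from the description above.

```python
VIBE_MIN_RESPONSE_LENGTH = 500   # characters
CONFIDENCE_BOOST = 0.05
CONFIDENCE_CAP = 1.0


def is_vibe_match(was_corrected: bool, response: str) -> bool:
    """A vibe match is an uncorrected interaction with a long enough response."""
    return (not was_corrected) and len(response) >= VIBE_MIN_RESPONSE_LENGTH


def boosted_confidence(confidence: float) -> float:
    """Raise a rule's confidence by the boost, never exceeding the cap."""
    return min(CONFIDENCE_CAP, confidence + CONFIDENCE_BOOST)


long_response = "x" * 500
short_response = "x" * 499
```

The cap means a rule that is already near 1.0 cannot drift above it, however many vibe matches accumulate.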

A "Best Matches" section in the dashboard analytics view shows recent vibe matches, giving the developer visibility into what the agent is getting right.

Why it matters: A system that only learns from failures does not tell the agent what to keep doing. Vibe matches let high-confidence patterns strengthen and surface as few-shot examples for synthesis.

Schema migration: v12 — new vibe_matches table with (user_id, agent_id, interaction_id, response_length, created_at).

Semantic Rule Inference

Tag: Core  |  PR: #42

A background InferenceEngine processes corrected interactions on a configurable interval (default 600 s). For each correction event, it passes the prompt and response to an LLM and asks it to identify the underlying principle that was violated. The inferred observation is written to profile_notes with source='inferred' and confidence=0.3, where the developer can review it and optionally promote it to a curated rule via the existing synthesis flow.

Implementation note: The spec proposed a separate pending rule state, but the shipped implementation writes to profile_notes instead. This preserves the write-raw/curate/read-curated asymmetry: the LLM files low-confidence raw observations; the developer promotes them to rules via the human curation step. Inferred notes show a badge and a "Promote" shortcut in the dashboard.

Configuration: JFYI_INFERENCE_ENABLED=true (default: true) and JFYI_ANTHROPIC_API_KEY. When no API key is set, the engine is a no-op. Respects a daily token cap to avoid runaway LLM costs.

Schema migration: v13 — new inference_log table for idempotency tracking (prevents reprocessing the same friction event).
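One way an idempotency log like this can work is to claim each friction event with a unique key before processing it. The sketch below assumes SQLite and a single hypothetical friction_event_id column; the real inference_log schema is not given in this document.

```python
import sqlite3


def claim_event(conn: sqlite3.Connection, friction_event_id: int) -> bool:
    """Return True only the first time a given event id is claimed."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO inference_log (friction_event_id) VALUES (?)",
        (friction_event_id,),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the insert was ignored.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inference_log (friction_event_id INTEGER PRIMARY KEY)")

first_attempt = claim_event(conn, 42)
second_attempt = claim_event(conn, 42)
other_event = claim_event(conn, 43)
```

A background loop would call the LLM only when claim_event returns True, so a restart or an overlapping run does not spend tokens on the same event twice.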

Vibe Telemetry

Tag: Core  |  PR: #43

Exposes the first MCP Resource in JFYI: jfyi://sessions/{session_id}/telemetry. The resource provides a rolling alignment score and corrective hints for the current session window (last 10 interactions). An agent that reads this resource mid-session can detect its own drift and self-correct before the developer needs to intervene.

The alignment score is computed as 100 × (1 − correction_rate) over the session window. The friction trend ("improving", "stable", "worsening") compares the first and second halves of the window. Corrective hints are the description fields of the three most recent friction events in the session.
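The two computations can be sketched as below. The function names, the handling of an empty window, and the way an odd-length window is split are assumptions; the formula and the three trend labels come from the text above.

```python
from typing import List

WINDOW_SIZE = 10  # last 10 interactions in the session


def alignment_score(was_corrected: List[bool]) -> float:
    """100 * (1 - correction_rate) over the most recent window."""
    window = was_corrected[-WINDOW_SIZE:]
    if not window:
        return 100.0  # assumption: no interactions means no detected drift
    correction_rate = sum(window) / len(window)
    return 100.0 * (1 - correction_rate)


def friction_trend(was_corrected: List[bool]) -> str:
    """Compare correction rates in the first and second halves of the window."""
    window = was_corrected[-WINDOW_SIZE:]
    half = len(window) // 2
    if half == 0:
        return "stable"
    first, second = window[:half], window[half:]
    first_rate = sum(first) / len(first)
    second_rate = sum(second) / len(second)
    if second_rate < first_rate:
        return "improving"
    if second_rate > first_rate:
        return "worsening"
    return "stable"


# Two early corrections followed by eight clean interactions.
session = [True, True] + [False] * 8
score = alignment_score(session)
trend = friction_trend(session)
```

An agent polling the resource could treat a "worsening" trend as a cue to stop and confirm direction with the developer.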

Why it matters: Without telemetry, the agent only receives feedback when the developer makes an explicit correction. Telemetry enables proactive alignment — the agent can pause and confirm direction when its score is falling rather than continuing down a wrong path.

Friction Clustering

Tag: Supplementary  |  PR: #44

Groups similar friction events into semantic clusters using scikit-learn TF-IDF vectorisation and K-Means clustering. For each cluster, an optional LLM generates a "Gap Summary" — a concise description of the recurring pattern (e.g., "Consistent corrections around async context manager usage"). The dashboard surfaces clusters as a "Vibe Map."

Implementation note: The spec assumed ChromaDB embeddings for clustering, but the shipped implementation uses TF-IDF over event descriptions. This keeps clustering self-contained and independent of the vector optional extra, so neither optional feature pulls in the other's dependencies.

Cluster results are cached in a friction_clusters database table and recomputed on demand once the friction event count has grown by more than 20% since the last computation.
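The cache invalidation rule reduces to a growth check. A minimal sketch, with a hypothetical function name and an assumed behaviour for the case where no events existed at the last computation:

```python
def needs_recompute(
    current_count: int,
    count_at_last_run: int,
    threshold: float = 0.20,
) -> bool:
    """True when the event count grew by more than `threshold` since the last run."""
    if count_at_last_run == 0:
        # Assumption: with nothing clustered yet, any event triggers a first run.
        return current_count > 0
    growth = (current_count - count_at_last_run) / count_at_last_run
    return growth > threshold
```

With 100 events at the last run, the clusters are reused up to and including 120 events and recomputed from 121 onwards.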

Configuration: JFYI_ENABLE_CLUSTERING=true (default: false). Requires the cluster optional extra (pip install -e ".[cluster]") for scikit-learn. Degrades gracefully without an LLM API key (clusters are labelled with representative event text instead of a synthesised summary).

Agent Warming

Tag: Core  |  PR: #45

Adds the warm_agent MCP tool (discoverable, token cost 300). It queries the database for the developer's five best past sessions — sessions where every single recorded interaction had was_corrected=false — and synthesises a "Vibe Brief" from their episodic summaries.

The brief is a concise style sample: 3–5 examples that illustrate how the developer thinks, communicates, and expects code to be structured. A new agent that reads this brief before its first response can skip the cold-start period where it has to learn the developer's preferences through trial and correction.

SQL correctness note: Session selection uses HAVING SUM(was_corrected) = 0 (filtering after grouping) rather than WHERE was_corrected = 0 (filtering before grouping). The WHERE approach would include sessions with mixed corrections by discarding corrected rows before the GROUP BY, incorrectly classifying them as zero-friction.
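The difference is easy to reproduce on a toy table. The sketch below uses SQLite and a simplified, hypothetical interactions table with only the two columns needed for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interactions (session_id TEXT, was_corrected INTEGER)")
conn.executemany(
    "INSERT INTO interactions VALUES (?, ?)",
    [
        ("clean", 0), ("clean", 0),   # every interaction accepted
        ("mixed", 0), ("mixed", 1),   # one interaction was corrected
    ],
)

# Correct: group first, then keep sessions with no corrections at all.
having_sql = """
    SELECT session_id FROM interactions
    GROUP BY session_id
    HAVING SUM(was_corrected) = 0
    ORDER BY session_id
"""

# Incorrect: corrected rows are discarded before grouping, so the
# 'mixed' session survives on the strength of its one clean row.
where_sql = """
    SELECT session_id FROM interactions
    WHERE was_corrected = 0
    GROUP BY session_id
    ORDER BY session_id
"""

zero_friction = [row[0] for row in conn.execute(having_sql)]
where_result = [row[0] for row in conn.execute(where_sql)]
```

Here zero_friction contains only the clean session, while where_result also contains the mixed one, which is the misclassification the note describes.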

Configuration: Requires at least one zero-friction session to return a result. With JFYI_ANTHROPIC_API_KEY set, the brief is LLM-synthesised from episodic summaries; without it, the tool returns session statistics and representative prompts as a fallback.


v2.11.0 — Evidence & Traceability Shipped

Per-rule provenance in synthesis

When the LLM synthesis flow proposes a new rule, the response now includes the IDs of the source notes that contributed to each rule. The dashboard's Synthesize preview panel shows these source notes alongside the proposed rule text, so the developer can verify the evidence before promoting the rule to the constitution.

This answers the "where did this rule come from?" question that emerged after the Notes vs Rules split shipped in v2.9.0. Rules are conclusions; notes are evidence; the link between them is now visible.

Durable architecture & mission doc

Formally documents the write-raw/curate/read-curated pattern, the core/supplementary distinction, the three anti-patterns, and the decision rules for new features in docs/architecture.md. This document is the canonical reference for evaluating new feature proposals.


v2.9.0 — Profile Architecture Shipped

Notes vs Rules — two-tier developer profile

Splits the original single profile_rules table into two tiers: raw notes and curated rules.

The key insight: get_developer_profile returns only curated rules, never raw notes. This keeps the agent's "constitution" small and high-signal, regardless of how many notes have been filed.

Retrospective (v2.10.0 correction): The initial synthesis implementation wrote new notes from existing notes, staying within a single tier. This was corrected within 90 minutes of v2.9.0 shipping — synthesis now draws rules from notes, crossing the tier boundary as intended. Notes are evidence; rules are conclusions; one note can support many rules and is never consumed by the rules derived from it.


v2.6.0 — Phase 4: Security & Hardening Shipped

Inline DLP / PII Redaction

A self-contained dlp.py module (no new dependencies beyond the already-required httpx) scrubs secrets and personal data at every storage boundary. Patterns cover email addresses, API keys, tokens, credit card numbers, and other common PII. Redaction is applied on write to both notes and interaction records.
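A regex-based scrubber of this kind might look like the following. The patterns, placeholder format, and function name are illustrative assumptions and are deliberately simplified; they are not the patterns shipped in dlp.py.

```python
import re

# Hypothetical, simplified patterns for three of the categories named above.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str) -> str:
    """Replace every match of every pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


note = "Contact bob@example.com, key sk-abcdefghijklmnop1234"
scrubbed_note = redact(note)
scrubbed_card = redact("card 4111 1111 1111 1111 on file")
```

Calling redact at the point where a note or interaction record is written keeps the scrubbing in one place, so raw secrets never reach storage.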

Developer Behavior Analytics

Adds the /developer dashboard section with correction rates over time, friction score trends, rule confidence accumulation, and per-category breakdown. These analytics are supplementary — they surface patterns for the developer to reflect on, not artifacts that agents read directly.

Rule Synthesis

LLM-backed synthesis that proposes new curated rules from the raw notes corpus. The developer selects notes, triggers synthesis, reviews the LLM-proposed rules in a preview, and approves or discards each. Approved rules enter the curated tier; source notes are retained as evidence. This feature was not in the original Phase 4 spec; it emerged from the operational need to manage a growing notes corpus.

Agent Provenance Tracking

Adds an agent_name TEXT column to the interactions and friction events tables. Records which AI agent model produced each interaction, enabling per-agent friction breakdown in analytics. Uses a text column rather than an integer FK to avoid confusion with the existing agent_id FK used in other tables.


v2.5.0 — Phase 3: Advanced Retrieval Shipped

Vector Embeddings Core

Integrates ChromaDB and sentence-transformers (all-MiniLM-L6-v2) as an optional extra (vector) gated behind JFYI_ENABLE_VECTOR_DB=true. Provides semantic similarity search over notes, rules, and episodic memories. Models download at runtime to a persisted volume path and are never baked into the container image.

Instruction-Tool Retrieval (ITR)

Dynamically selects the minimal relevant subset of tools for each task using dense retrieval (embeddings) and greedy knapsack budget selection. Agents pass a query to discover_tools to receive only the tools most relevant to their current task, reducing schema injection cost. Off by default; requires 10+ rules in the corpus to outperform showing everything.
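The budget selection step can be illustrated with a small greedy knapsack. In this sketch the relevance scores are supplied directly (in practice they would come from embedding similarity), ranking by relevance per token is an assumed heuristic, and all token costs other than warm_agent's 300 are made up for the example.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, int]  # (tool name, relevance score, token cost)


def select_tools(candidates: List[Candidate], budget: int) -> Tuple[List[str], int]:
    """Greedily pick tools by relevance per token until the budget is spent."""
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen: List[str] = []
    spent = 0
    for name, _relevance, cost in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent


candidates = [
    ("warm_agent", 0.9, 300),
    ("store_artifact", 0.4, 200),       # hypothetical cost
    ("run_local_script", 0.3, 400),     # hypothetical cost
]
selected, tokens_used = select_tools(candidates, budget=500)
```

Greedy selection is not guaranteed to be optimal for knapsack problems, but it is cheap and predictable, which suits a step that runs on every discovery call.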


v2.4.0 — Phase 2: Memory Architecture Shipped

Three-Tiered Memory

Formalises the short-term (session scratchpad), long-term (profile notes), and episodic (session summaries) memory tiers under a unified MemoryFacade interface. Each tier has distinct retrieval semantics and a different lifespan. Adds the curated tier as a fourth logical tier (accessed only by the human curation path, not by the facade).

Background Summarization

An async background loop (BackgroundSummarizer) runs on a configurable interval (default 300 s) and distils recent interactions into episodic summaries using a lightweight LLM. Respects a daily token cap. Identified as the primary value driver in Phase 3's post-evaluation: it is the mechanism that turns raw interactions into durable profile signal.

Compiled View Memory

Artifacts (logs, diffs, terminal output) can be stored via store_artifact and referenced by handle rather than injected as raw text. The run_local_script tool executes focused extraction scripts against the stored artifact, returning only the relevant excerpt. Prevents large outputs from consuming context window budget.
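The handle-based pattern can be sketched with an in-memory store. The class, handle format, and extract method are hypothetical; in particular, a line predicate stands in for the extraction script that run_local_script would execute.

```python
import hashlib
from typing import Callable, Dict, List


class ArtifactStore:
    """Keeps large text blobs out of the prompt; callers hold only a handle."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def store_artifact(self, text: str) -> str:
        # Assumption: handles are derived from a content hash.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        handle = f"artifact:{digest}"
        self._blobs[handle] = text
        return handle

    def extract(self, handle: str, keep: Callable[[str], bool]) -> List[str]:
        """Return only the lines of the artifact that satisfy `keep`."""
        return [line for line in self._blobs[handle].splitlines() if keep(line)]


store = ArtifactStore()
log_handle = store.store_artifact("ok step 1\nERROR disk full\nok step 2")
errors = store.extract(log_handle, lambda line: "ERROR" in line)
```

The agent's context then carries the short handle and the one-line excerpt rather than the full log.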

Context Compaction

Rolling summaries and prefix-cache-aware prompt structure reduce the effective context footprint of repeated operations. The background summarizer writes session summaries that can replace full interaction transcripts when an agent needs historical context.


v2.13.0 — Dashboard Hardening Shipped

Agent Analytics Page

Tag: Supplementary  |  Commit: b3380a2

Replaces the "Analytics module coming soon" stub that had existed since the dashboard was first built. The backend (GET /api/analytics/agents) was always fully implemented — the page just needed a frontend component.

The page shows a 4-stat summary row (agents tracked, overall alignment, correction rate, total interactions), a per-agent comparison table with alignment, correction rate, avg friction, and avg latency (all colour-coded), and an alignment bar chart ordered by score. No backend changes; one file changed.


v2.3.0 — Phase 1: Foundation Shipped

Progressive Disclosure

Exposes a single discover_tools router tool upfront. Agents call it to enumerate capabilities and fetch full schemas on demand. This keeps the initial context injection to just two tools (the two always-on ones), rather than injecting every tool schema at session start.

Payload Minification

Compact JSON formatting and TOON (Token-Optimised Object Notation) serialisation at the prompt boundary. Profile responses are formatted for minimum token consumption while remaining human-readable for debugging.
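The compact JSON half of this is standard library behaviour and can be shown directly; TOON serialisation is not sketched here. The profile payload is a made-up example.

```python
import json

# Hypothetical profile payload.
profile = {
    "rules": [
        {"category": "typing", "text": "prefer strict types", "confidence": 0.8},
    ]
}

# Default pretty-printing spends tokens on indentation and separators.
pretty = json.dumps(profile, indent=2)

# Compact separators drop the space after ',' and ':'.
compact = json.dumps(profile, separators=(",", ":"))
```

Both strings decode to the same object; the compact form is shorter while remaining readable enough for debugging.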

Read-only Injection Zone

Injected profile data (developer constitution) is fenced to prevent prompt injection attacks from corrupting the developer's profile. External content (file diffs, terminal output) is stored as artifacts and never injected inline with the profile context.

OAuth 2.1 + RBAC

Multi-user authentication using JWT bearer tokens. The dashboard and REST API require a valid token. Scope-based access control distinguishes read-only consumers (agents via MCP) from read-write users (the developer via the dashboard). The JWT secret persists across Helm upgrades and can be explicitly rotated via scripts/rotate-jwt.sh.
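The scope check itself can be sketched independently of token verification. This example assumes the token has already been validated and decoded into a claims dictionary with a space-delimited scope claim, as is conventional for OAuth access tokens; the scope names and function name are hypothetical.

```python
from typing import Any, Dict

# Hypothetical scope names for the two roles described above.
READ_SCOPE = "profile:read"
WRITE_SCOPE = "profile:write"


def has_scope(claims: Dict[str, Any], required: str) -> bool:
    """Check a decoded token's space-delimited scope claim."""
    granted = set(str(claims.get("scope", "")).split())
    return required in granted


agent_claims = {"sub": "agent-1", "scope": "profile:read"}
developer_claims = {"sub": "dev-1", "scope": "profile:read profile:write"}
```

A write endpoint would reject agent_claims because the write scope is missing, while both token types pass the read check.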