A concrete implementation handoff for observability, intent reliability, and evolution stability
Date: 2026-03-14
Owner: Cue / Chris
Target repos: projects/engram-protocol, projects/memory-palace
Reduce the three biggest pain points we've repeatedly hit: observability, intent reliability, and evolution stability.
This plan ports *proven patterns* from recent autocontext PRs into our stack with minimal architectural disruption.
Recent Tablinum runs completed but produced noisy score behavior and weak postmortems. In parallel, Memory Palace workflows showed recurring intent/quality variance and key-scope confusion in multi-agent scenarios. Autocontext has already solved adjacent problems with practical implementations:
Goal: dashboard/CLI can block until phase transitions instead of polling blind.
Edit:
projects/engram-protocol/engram_protocol/core/dashboard.py - Add in-memory monitor registry for current evolution session:
- last_update_at, last_token_at, events_per_min, phase_started_at
- Add endpoint: POST /api/evolve/wait
- Inputs: { project_id, target_phase?, timeout_seconds? }
- Output: { reached, evolution, elapsed_ms }
- Add endpoint: GET /api/evolve/stream/health
- Output: websocket connection count + recent event rates
projects/engram-protocol/engram_protocol/templates/ide.html - Add "stream health" indicator
- Add explicit timestamp for last status update + last stream event
projects/engram-protocol/engram_protocol/__main__.py - Add CLI command: tablinum wait --project-id ... --phase ... --timeout ...
- Calls /api/evolve/wait for reliable scripting
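The blocking behavior behind /api/evolve/wait could be sketched as below. This is a minimal illustration, not the dashboard's actual code: the `EvolutionMonitor` class, its `set_phase` and `wait_for` methods, and the shape of the `evolution` payload are all hypothetical names chosen for this example. Only the field names (`last_update_at`, `phase_started_at`) and the `{ reached, evolution, elapsed_ms }` output come from the plan above.

```python
import threading
import time
from dataclasses import dataclass, field


@dataclass
class EvolutionMonitor:
    """Hypothetical in-memory monitor for one evolution session."""

    phase: str = "idle"
    phase_started_at: float = field(default_factory=time.time)
    last_update_at: float = field(default_factory=time.time)
    _cond: threading.Condition = field(default_factory=threading.Condition, repr=False)

    def set_phase(self, phase: str) -> None:
        # Called by the evolution loop whenever the phase changes.
        with self._cond:
            now = time.time()
            self.phase = phase
            self.phase_started_at = now
            self.last_update_at = now
            self._cond.notify_all()  # wake every blocked waiter

    def wait_for(self, target_phase=None, timeout_seconds=30.0) -> dict:
        # Block until target_phase is reached, or until any phase change
        # if no target is given. Returns the documented output shape.
        start = time.monotonic()
        with self._cond:
            initial = self.phase

            def done() -> bool:
                if target_phase is None:
                    return self.phase != initial
                return self.phase == target_phase

            reached = self._cond.wait_for(done, timeout=timeout_seconds)
            evolution = {
                "phase": self.phase,
                "phase_started_at": self.phase_started_at,
            }
        elapsed_ms = int((time.monotonic() - start) * 1000)
        return {"reached": reached, "evolution": evolution, "elapsed_ms": elapsed_ms}
```

A condition variable lets the waiter sleep until the evolution loop signals a change, which is what replaces blind polling. A request handler would look up the monitor by `project_id` and return the dict as JSON, and the CLI command would print the same structure.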
Goal: summarize recurring failure patterns, not just single score deltas.
Create:
projects/engram-protocol/engram_protocol/core/weakness.py - Weakness dataclass: {category, severity, evidence, count}
- WeaknessReport dataclass with markdown renderer
- analyze_experiments(experiments) categories:
- score regression spikes
- high variance by mutation type
- repeated discard motifs (same phrasing class)
- stagnation windows (N cycles no best-score improvement)
Edit:
projects/engram-protocol/engram_protocol/core/evolve.py - After each cycle append/update weakness report in:
- projects/<id>/knowledge/weakness_reports/<timestamp>.json
- .../<timestamp>.md
projects/engram-protocol/engram_protocol/core/dashboard.py - New endpoint: GET /api/projects/{project_id}/weakness/latest
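As a rough sketch of what weakness.py might contain, the block below implements two of the four categories listed above (score regression spikes and stagnation windows). It assumes each experiment is a dict with a numeric `score` key. That input shape, the `stagnation_window` and `regression_drop` parameters, the severity labels, and the `to_markdown` method name are all illustrative assumptions rather than the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Weakness:
    category: str
    severity: str
    evidence: str
    count: int = 1


@dataclass
class WeaknessReport:
    weaknesses: list = field(default_factory=list)

    def to_markdown(self) -> str:
        # One bullet per recurring weakness.
        lines = ["# Weakness report", ""]
        for w in self.weaknesses:
            lines.append(f"- [{w.severity}] {w.category} (x{w.count}): {w.evidence}")
        return "\n".join(lines)


def analyze_experiments(experiments, stagnation_window=5, regression_drop=0.1):
    """Summarize recurring failure patterns across cycles (assumed input shape)."""
    report = WeaknessReport()
    scores = [e["score"] for e in experiments]

    # Score regression spikes: a cycle whose score falls sharply below the previous one.
    drops = [
        i for i in range(1, len(scores))
        if scores[i - 1] - scores[i] >= regression_drop
    ]
    if drops:
        severity = "high" if len(drops) > 2 else "medium"
        report.weaknesses.append(
            Weakness("score_regression_spike", severity, f"drops at cycles {drops}", len(drops))
        )

    # Stagnation windows: N or more consecutive cycles with no new best score.
    best = float("-inf")
    run = 0
    longest = 0
    for s in scores:
        if s > best:
            best, run = s, 0
        else:
            run += 1
            longest = max(longest, run)
    if longest >= stagnation_window:
        report.weaknesses.append(
            Weakness("stagnation_window", "medium",
                     f"{longest} cycles without a new best score")
        )
    return report
```

The remaining categories (variance by mutation type, repeated discard motifs) would need fields such as a mutation type or discard reason on each experiment, which this sketch does not assume.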
Goal: better crash recovery and audit trail.
Create:
projects/engram-protocol/engram_protocol/core/mutation_log.py - JSONL append for mutation lifecycle events:
- baseline_scored
- mutation_proposed
- validation_failed
- evaluated
- kept_or_discarded
- checkpoint marker every completed cycle
Edit:
projects/engram-protocol/engram_protocol/core/evolve.py - Emit mutation log entries at each stage
- Write to: projects/<id>/knowledge/mutation_log.jsonl
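One way the append-only log and its crash-recovery read path could look is sketched below. The function names `append_event` and `last_checkpoint_cycle`, the record fields `ts` and `cycle`, and the `checkpoint` event name for the per-cycle marker are assumptions made for illustration. The other event names are taken from the list above.

```python
import json
import time
from pathlib import Path

# Lifecycle events from the plan, plus an assumed name for the checkpoint marker.
EVENT_TYPES = {
    "baseline_scored",
    "mutation_proposed",
    "validation_failed",
    "evaluated",
    "kept_or_discarded",
    "checkpoint",
}


def append_event(log_path, event, cycle, **data):
    """Append one lifecycle event as a single JSON line."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown mutation log event: {event}")
    record = {"ts": time.time(), "event": event, "cycle": cycle, **data}
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def last_checkpoint_cycle(log_path):
    """Return the cycle of the most recent checkpoint, or None if there is none."""
    path = Path(log_path)
    if not path.exists():
        return None
    last = None
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a torn line left behind by a crash
            if record.get("event") == "checkpoint":
                last = record.get("cycle")
    return last
```

On restart, the evolution loop could call `last_checkpoint_cycle` and resume from the following cycle. Events logged after the last checkpoint remain in the file as the audit trail of the interrupted cycle.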
Goal: catch low-signal/format-drift memory payloads before they pollute recall.
Edit:
projects/memory-palace/app/api/store/route.js
projects/memory-palace/app/api/ingest/route.js
Add validateIntent(payload):
- conversation_context, decisions, next_steps
Goal: prevent cross-palace confusion (like z2kc35w mismatch incidents).
Edit:
projects/memory-palace/app/api/store/route.js
projects/memory-palace/app/api/recall/route.js (response metadata)
Return explicit fields in successful writes/reads:
- palace_id
- agent_key_scope (guest/admin etc)
- resolved_agent (when available)
Edit:
projects/memory-palace/packages/cli/src/config.ts
projects/memory-palace/packages/cli/src/api.ts
projects/memory-palace/packages/cli/src/index.ts
Add CLI options:
- --expected-palace-id
- --fail-on-palace-mismatch
Behavior:
"Variance-Aware Evolution Reliability"
Current bottleneck: one-cycle score deltas are noisy; mutation selection overfits local randomness.
- require multi-sample score for each candidate mutation
- prefer stable improvements over single high spike
Edit:
projects/engram-protocol/config.yaml - add:
- evolution.eval_repeats: 3
- evolution.stability_penalty_weight: 0.25
projects/engram-protocol/engram_protocol/core/evolve.py - evaluate candidate mutation multiple times
- compute mean and variance
- promote if mean improvement > threshold and variance acceptable
projects/engram-protocol/projects/9c93d2b6/program.md - include explicit success criterion for stability-aware promotion
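The promotion rule described above (mean improvement over a threshold, with acceptable variance) might be sketched as follows. The plan does not say how `stability_penalty_weight` enters the decision. This example assumes it scales a standard-deviation penalty subtracted from the candidate mean. The `min_improvement` and `max_variance` thresholds and the `should_promote` name are hypothetical as well.

```python
import math
import statistics


def should_promote(
    baseline_samples,
    candidate_samples,
    min_improvement=0.01,          # hypothetical threshold
    max_variance=0.01,             # hypothetical variance ceiling
    stability_penalty_weight=0.25  # value proposed for config.yaml
):
    """Decide promotion from repeated evaluations (e.g. eval_repeats = 3)."""
    base_mean = statistics.fmean(baseline_samples)
    cand_mean = statistics.fmean(candidate_samples)
    cand_var = statistics.pvariance(candidate_samples)

    # Assumed use of the penalty weight: subtract a multiple of the
    # standard deviation so noisy candidates need a larger mean gain.
    penalized_mean = cand_mean - stability_penalty_weight * math.sqrt(cand_var)

    promote = (
        penalized_mean - base_mean > min_improvement
        and cand_var <= max_variance
    )
    return {
        "mean_improvement": cand_mean - base_mean,
        "variance": cand_var,
        "promote": promote,
    }
```

With a baseline of three 0.50 scores, a candidate scoring 0.60, 0.61, 0.59 would be promoted, while one scoring 0.95, 0.40, 0.45 would not: both have the same mean, but the second exceeds the variance ceiling. That is the single-spike case the plan aims to filter out.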
tablinum wait can block for phase transition and return structured JSON.