Rick Hallett
“How do you trust, evaluate, and operate AI systems at every layer?”
Seven projects. Built solo across ~2 months. Some are running. Some are resting. Some are ideas with teeth that haven't been built yet. This is the stack.
[ Input / Output Trust ]
WASP
Prompt injection prevention. Archived — input defense layer.
Status: Archived · Focus: Prevention, not detection
Python
Stain
Multi-agent AI slop detection via linguistic fingerprinting.
░░░░░░░░░░░░░░░░░░░░░░░░░█
today · 65 commits
Model: Cerebras Qwen3 235B · Detectors: 6 · Benchmark samples: 138 · Accuracy (θ=0.50): 74%
Python · YAML
[ Knowledge Trust ]
Arcana
Document RAG + LLM-as-judge claim verification.
░░░░░░░░░░░░░░░░░░░░░░░░░█
today · 42 commits
Workers: 5 (extract, embed, analyst, checker, gateway) · Vector store: ChromaDB · Orchestration: LangGraph + NATS
Bash · Python · YAML
[ Agent Trust ]
The Pit
Agent research platform + structured roast battles with traces.
░░░░░░░░░░░░░░░░▒█▒▓▒▓▒▒▒▒
today · 1236 commits
Tests: ~1,289 · Coverage: ~96%
Bash · Go · JavaScript · Python · Rust · SQL · TypeScript · YAML
Sortie
Convergence-based code review gating with immutable ledger.
░░░░░░░░░░░░░░░░░░░░░░░░█▒
today · 24 commits
Runs: 6 · Pass rate: 100% · Branches reviewed: 4
Python
[ Orchestration ]
Halo
Event-sourced advisory fleet — personal AI operating system.
░░░░░░░░░░░░░░░▒▒▒▒▒▒▓█▒▒▒
today · 647 commits
CLI modules: 28 · Event bus: NATS JetStream · Runtime: K8s (k3s)
Bash · Go · JavaScript · Python · SQL · TypeScript · YAML
[ Intelligence ]
Jeany
Cost-attributed AI content economy intelligence.
░░░░░░░░░░░░░░░░░░░░░░░░░█
4d ago · 128 commits
Services: 6 · Observability: Prometheus + OTel · Unique: Cost-attributed pipeline
Bash · Go · Python · TypeScript · YAML
[ The Gaps ]
Unified Observability
Each project tracks its own costs and traces. Nothing connects them yet. This dashboard is the first step — it observes the stack but doesn't unify the telemetry. Per-project API keys give partial visibility. The elephant in the room is Claude Code token spend via subscription, which is trackable but not yet tracked.
Model Drift Detection
When a model version changes, does output quality shift? Isolating real drift from normal churn is genuinely hard. The Pit's traces and Stain's benchmarks produce the raw data this would need. The detection layer doesn't exist yet.
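One way to separate drift from churn, sketched here under stated assumptions: treat a benchmark run before and after a model change as two accuracy rates and ask whether the gap is larger than sampling noise would explain. The function names, the 0.05 cutoff, and the example counts are hypothetical; only the 138-sample benchmark size and the roughly 74% baseline come from the Stain card above. This is a plain two-proportion z-test, not an existing detection layer.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two accuracy rates.

    Returns (z, p_value). A small p_value means the gap between run A and
    run B is unlikely to be sampling noise alone.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both runs are all-correct or all-wrong: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def drifted(correct_a, n_a, correct_b, n_b, alpha=0.05):
    """Flag drift when the accuracy gap is significant at level alpha (assumed cutoff)."""
    _, p_value = two_proportion_z(correct_a, n_a, correct_b, n_b)
    return p_value < alpha


# Hypothetical runs on a 138-sample benchmark: 102 correct before the change.
small_dip = drifted(102, 138, 96, 138)   # a six-sample dip
large_dip = drifted(102, 138, 70, 138)   # a thirty-two-sample dip
print(small_dip, large_dip)
```

With a benchmark this small, a dip of a few samples stays inside the noise, which is part of why the problem is hard: the test only fires on large shifts unless the benchmark grows or runs are repeated.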
Human Accountability
This dashboard is a partial answer. The act of maintaining it creates a feedback loop — make the numbers honest, keep the lights accurate, notice when something goes dark. Before this, the only external record was a blog. One-way. Monoculture. This is two-way by construction, but it's still v0.
[ Build Cost ]
Total tokens 1.6B
Total cost (est.) $1,012.52
2026-03 $540.61
2026-04 $471.91
By model (2026-04):
Opus 4.6 $443.75 (94%)
Sonnet 4.6 $22.26 (5%)
Haiku 4.5 $5.90 (1%)
Cache hit rate: 97.3%
Per-project token attribution not yet available. OTel adapters planned.
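The arithmetic behind the figures above can be checked in a few lines. The three by-model amounts sum to $471.91, which matches the 2026-04 line rather than the two-month total, so the percentages read as shares of that month; whether the breakdown is meant to cover April only is an inference from the numbers, not something the dashboard states. The variable and function names are illustrative.

```python
# Figures copied from the Build Cost section.
monthly = {"2026-03": 540.61, "2026-04": 471.91}
by_model = {"Opus 4.6": 443.75, "Sonnet 4.6": 22.26, "Haiku 4.5": 5.90}


def shares(costs):
    """Return each entry's share of the summed cost as a whole-number percent."""
    total = sum(costs.values())
    return {name: round(100 * cost / total) for name, cost in costs.items()}


grand_total = round(sum(monthly.values()), 2)
model_total = round(sum(by_model.values()), 2)

print(grand_total)       # two-month total
print(model_total)       # by-model total, equal to the 2026-04 line
print(shares(by_model))  # percent share per model
```

Dividing the Opus figure by the two-month total instead would give roughly 44%, not 94%, which is why the base of the percentages matters when reading the breakdown.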
[ Aggregate ]
Commits (6mo) 2,142
Tests 1,289
Languages Python, YAML, Bash, Go, JavaScript, Rust, SQL, TypeScript
Hot: 6 · Warm: 0 · Cool: 0 · Dormant: 0 · Conceptual: 1