Drop one folder into your project. Claude Code now must define what success means before writing code, write down the scope before giving a time estimate, and plan what happens when the AI gives a wrong answer — or it cannot start the work.
Not edge cases. The default outcome when no product quality enforcement exists.
The feature shipped. Nobody agreed on what success looked like before it was built. Three months later, nobody can tell if it's working. The rollback argument has no data to stand on.
The estimate was given before the scope was written. The scope grew during the sprint. The two weeks became six. The estimate was the ceiling on expectations, not a ceiling on work.
The research was done. Nobody synthesised it into decisions. The most vocal user's request made it into the roadmap. The pattern across 40 interviews didn't — because nobody extracted it.
The AI feature shipped without a plan for hallucination UX, trust calibration, or what happens when it's confidently wrong. Users discovered the edge cases on day two and reported them as bugs.
Two rules block work from starting until they're satisfied. Four rules enforce how every planning task is done.
Each rule requires a specific output with real values. Claude cannot say a plan is ready without showing a real baseline number at a real file path.
An agent that skips the metric definition cannot fill in Baseline: 63% (Mixpanel cohort 2026-Q1) with a real number.
That is the enforcement.
Each specialist has a narrow focus. All produce output at specific file paths — no chat-only results.
Reviews plans and feature specs against all 6 rules. Verdict: pass, fix these things, or blocked. Writes the report to a real file. Never gives 'looks good.'
Sets up how a feature will be measured before it's built — the success number, the warning signs, how to track each step, what to log. Writes it to a real file.
Turns raw user research into a list of findings and decisions — one decision per finding, every quote with context. Writes the output to a real file.
The difference is work that was done before engineering started, not problems discovered after it finished.
product/scopes/ before estimate. Estimate references scope file path.Installs only skills/ and .claude/agents/. Non-destructive — won't overwrite existing files.
prd-quality-gate before any feature enters planning. Run ai-feature-validation for every feature that calls an LLM. The product-critic agent reviews against all gates and writes a PASS / CONDITIONAL / BLOCK verdict to file.
All packs work standalone. All share the same enforcement model. Use one, some, or all.
25 pre-configured engineering specialists, 18 workflow skills, a lead orchestrator, and a Pipeline Auditor. The base layer.
Product quality gates. PRD, scoping, metrics, research synthesis, A/B test design, AI feature validation.
Eval, prompt versioning, fallback design, RAG pipelines, safety review. Three hard gates that block ship.
AI UI design enforcement. States, streaming UI, prompt UX, accessibility, and design tokens.
Growth quality gates. Positioning, copy, funnel analysis, experiments, retention design, AI messaging review.