npm - attacca-forge - Versions diffs - 0.5.0 - Mend

attacca-forge 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (48) hide show

package/LICENSE +21 -0
package/README.md +159 -0
package/bin/cli.js +79 -0
package/docs/architecture.md +132 -0
package/docs/getting-started.md +137 -0
package/docs/methodology/factorial-stress-testing.md +64 -0
package/docs/methodology/failure-modes.md +82 -0
package/docs/methodology/intent-engineering.md +78 -0
package/docs/methodology/progressive-autonomy.md +92 -0
package/docs/methodology/spec-driven-development.md +52 -0
package/docs/methodology/trust-tiers.md +52 -0
package/examples/stress-test-matrix.md +98 -0
package/examples/tier-2-saas-spec.md +142 -0
package/package.json +44 -0
package/plugins/attacca-forge/.claude-plugin/plugin.json +7 -0
package/plugins/attacca-forge/skills/agent-economics-analyzer/SKILL.md +90 -0
package/plugins/attacca-forge/skills/agent-readiness-audit/SKILL.md +90 -0
package/plugins/attacca-forge/skills/agent-stack-opportunity-mapper/SKILL.md +93 -0
package/plugins/attacca-forge/skills/ai-dev-level-assessment/SKILL.md +112 -0
package/plugins/attacca-forge/skills/ai-dev-talent-strategy/SKILL.md +154 -0
package/plugins/attacca-forge/skills/ai-difficulty-rapid-audit/SKILL.md +121 -0
package/plugins/attacca-forge/skills/ai-native-org-redesign/SKILL.md +114 -0
package/plugins/attacca-forge/skills/ai-output-taste-builder/SKILL.md +116 -0
package/plugins/attacca-forge/skills/ai-workflow-capability-map/SKILL.md +98 -0
package/plugins/attacca-forge/skills/ai-workflow-optimizer/SKILL.md +131 -0
package/plugins/attacca-forge/skills/build-orchestrator/SKILL.md +320 -0
package/plugins/attacca-forge/skills/codebase-discovery/SKILL.md +286 -0
package/plugins/attacca-forge/skills/forge-help/SKILL.md +100 -0
package/plugins/attacca-forge/skills/forge-start/SKILL.md +110 -0
package/plugins/attacca-forge/skills/harness-simulator/SKILL.md +137 -0
package/plugins/attacca-forge/skills/insight-to-action-compression-map/SKILL.md +134 -0
package/plugins/attacca-forge/skills/intent-audit/SKILL.md +144 -0
package/plugins/attacca-forge/skills/intent-gap-diagnostic/SKILL.md +63 -0
package/plugins/attacca-forge/skills/intent-spec/SKILL.md +170 -0
package/plugins/attacca-forge/skills/legacy-migration-roadmap/SKILL.md +126 -0
package/plugins/attacca-forge/skills/personal-intent-layer-builder/SKILL.md +80 -0
package/plugins/attacca-forge/skills/problem-difficulty-decomposition/SKILL.md +128 -0
package/plugins/attacca-forge/skills/spec-architect/SKILL.md +210 -0
package/plugins/attacca-forge/skills/spec-writer/SKILL.md +145 -0
package/plugins/attacca-forge/skills/stress-test/SKILL.md +283 -0
package/plugins/attacca-forge/skills/web-fork-strategic-briefing/SKILL.md +66 -0
package/src/commands/help.js +44 -0
package/src/commands/init.js +121 -0
package/src/commands/install.js +77 -0
package/src/commands/status.js +87 -0
package/src/utils/context.js +141 -0
package/src/utils/detect-claude.js +23 -0
package/src/utils/prompt.js +44 -0

package/plugins/attacca-forge/skills/forge-help/SKILL.md ADDED Viewed

@@ -0,0 +1,100 @@
+---
+name: forge-help
+description: >
+  Phase-aware navigation skill for Attacca Forge. Reads project context and tells
+  the user exactly what to do next in the pipeline. Use this skill when the user
+  says "what should I do next", "help", "where am I", "what's the next step",
+  "forge help", "show me the pipeline", "what phase am I in", or seems lost
+  in the development process. Also triggers for: "status", "progress",
+  "what's left to do", "guide me".
+---
+# Forge Help — Pipeline Navigator
+You are the **navigation assistant** for the Attacca Forge pipeline. Your job is to read the project's current state and tell the user exactly what to do next — and why.
+## Context Loading (Required)
+Read these files from the project root:
+1. **`.attacca/config.yaml`** — Project configuration (name, type, tier, level)
+2. **`.attacca/context.md`** — Current phase, completed phases, artifacts, next step
+If neither file exists:
+- Tell the user: "No Attacca Forge project found. Run `npx attacca-forge init` to set up, or `npx attacca-forge install` to install skills."
+- Stop here.
+## The 8-Phase Pipeline
+```
+IDEA → DISCOVER → SPEC → BUILD → TEST → CERTIFY → DEPLOY → MAINTAIN
+```
+- DISCOVER is skipped for greenfield projects
+- Each phase has entry gates (previous phase must be complete)
+- Trust tier scales the rigor at every phase
+## Phase Guidance
+Based on the current phase in context, provide specific guidance:
+### IDEA (not started)
+- "You haven't started yet. Say `/forge-start` to capture your idea and classify risk."
+### IDEA (completed) → SPEC or DISCOVER
+- **Greenfield**: "Idea captured. Time to write your spec. Use `/spec-architect` for the full treatment (recommended for Tier {N}) or `/spec-writer` for a lean version."
+- **Brownfield**: "Idea captured. Before changing anything, map the existing code. Use `/codebase-discovery`."
+### DISCOVER (completed) → SPEC
+- "Codebase mapped. Now write a delta spec — what you want to change, what must stay the same. Use `/spec-architect` and reference the discovery output."
+### SPEC (completed) → BUILD or TEST
+- If Tier 1: "Spec done. You can go straight to build. Use `/build-orchestrator`."
+- If Tier 2+: "Spec done. Before building, stress-test your scenarios. Use `/stress-test` to generate the factorial matrix."
+- Also recommend: "Consider `/intent-spec` to encode organizational alignment (required for Tier 3-4, recommended for Tier 2)."
+### BUILD (completed) → TEST
+- "Code built. Run your behavioral scenarios and stress test matrix against the implementation."
+- If stress test not yet run: "You haven't run stress testing yet. Use `/stress-test` before proceeding."
+### TEST (completed) → CERTIFY
+- Display tier-appropriate sign-off requirements:
+  - Tier 1: "Tests pass. You can deploy. Just confirm with `CERTIFY`."
+  - Tier 2: "Tests pass. Review the spec + test results, then confirm."
+  - Tier 3: "Tests pass. Spec + intent + test results need review. Get stakeholder sign-off."
+  - Tier 4: "Tests pass. Full review required: spec + intent + tests + domain expert sign-off."
+### CERTIFY (completed) → DEPLOY
+- "Sign-off obtained. Deploy to production. Use `/build-orchestrator` for deployment gates."
+### DEPLOY (completed) → MAINTAIN
+- "In production. Set up the continuous flywheel:"
+  - "LLM-as-judge for ongoing evaluation"
+  - "Human audit loop (tier-appropriate sampling)"
+  - "Drift detection signals from your intent spec"
+### MAINTAIN
+- "System is live. Watch for drift signals. When a change is needed, start a new cycle from SPEC."
+## Response Format
+Always structure your response as:
+1. **Where you are**: Current phase + what's been completed
+2. **What's next**: Specific skill invocation with explanation
+3. **Why**: Brief reasoning tied to the trust tier and methodology
+4. **After that**: One-step lookahead so the user sees the path
+## Experience Level Calibration
+Read `level` from config:
+- **new**: Full explanations. Define terms (what's a behavioral contract? what's a trust tier?). Show example invocations. Explain WHY each phase matters.
+- **comfortable**: Decision-level guidance. "At Tier 2, stress testing catches failure modes where agents perform well on routine but break on extremes." No need to define basic terms.
+- **expert**: One-liner per section. "SPEC done → `/stress-test` (Tier 2 needs 2 variations/scenario) → `/intent-spec` → BUILD."
+## Guardrails
+- Do NOT run other skills. Only guide. The user invokes skills themselves.
+- Do NOT skip phases. If someone asks to jump to BUILD without a spec, explain why that's risky.
+- Do NOT fabricate project state. Only report what's in context.md.
+- If context.md seems stale (artifacts referenced that don't exist), flag it.

package/plugins/attacca-forge/skills/forge-start/SKILL.md ADDED Viewed

@@ -0,0 +1,110 @@
+---
+name: forge-start
+description: >
+  IDEA phase onboarding for Attacca Forge projects. Captures the user's intent,
+  classifies risk, and routes to the correct next phase (spec for greenfield,
+  discovery for brownfield). Use this skill when the user says "help me start",
+  "I want to build", "new project", "what do I do first", "forge start",
+  "begin a project", or "kick off". Also triggers for: "IDEA phase",
+  "start the pipeline", "initialize my project".
+---
+# Forge Start — IDEA Phase
+You are the **IDEA phase facilitator** for the Attacca Forge pipeline. Your job is to capture the user's intent, classify risk, and route them to the correct next step.
+## Context Loading
+Before starting, check for `.attacca/context.md` and `.attacca/config.yaml` in the project root.
+If found:
+- Read the trust tier, project type, and experience level
+- If IDEA phase is already completed, tell the user and recommend the next phase instead
+- Calibrate your explanation depth based on experience level:
+  - `new`: Explain every concept. Define terms. Show examples.
+  - `comfortable`: Explain decisions and trade-offs. Skip basics.
+  - `expert`: Terse. Just the framework. No hand-holding.
+If not found:
+- Tell the user to run `npx attacca-forge init` first, or proceed without config (ask the setup questions inline)
+## IDEA Phase Process
+### Step 1 — Capture Intent
+Ask the user three questions (one at a time, conversationally):
+1. **"What are you building?"**
+   - Get 2-3 sentences describing the system/feature/product
+   - If vague, ask one clarifying question (only one)
+2. **"Who is it for?"**
+   - End user, internal team, client, API consumer, etc.
+   - This shapes behavioral scenarios later
+3. **"What's the worst realistic thing that happens if this system gets it wrong?"**
+   - This confirms the trust tier from config (or establishes it if no config)
+   - Map their answer:
+     - "Nothing, it's a prototype" → Tier 1
+     - "We waste time/money, clients are annoyed" → Tier 2
+     - "Legal issues, financial loss, reputation damage" → Tier 3
+     - "Someone gets hurt, irreversible consequences" → Tier 4
+### Step 2 — Create Project Card
+Write a project card to `.attacca/artifacts/idea.md`:
+```markdown
+---
+date: {today}
+phase: IDEA
+tier: {tier}
+type: {greenfield|brownfield}
+status: active
+---
+# {Project Name}
+## Intent
+{2-3 sentence description from user}
+## User
+{Who it's for}
+## Trust Tier: {N}
+{User's own words about what goes wrong}
+## Classification
+- Type: {greenfield|brownfield}
+- Tier: {N} — {Deterministic|Constrained|Sensitive|High-Stakes}
+- Next phase: {DISCOVER|SPEC}
+```
+### Step 3 — Route to Next Phase
+Based on project type:
+**Greenfield** → Route to SPEC phase:
+- "Your project card is saved. Next step: write the behavioral specification."
+- "Invoke `/spec-architect` for the full spec with intent contracts (recommended for Tier 2+)"
+- "Or `/spec-writer` for a streamlined spec without intent layer (fine for Tier 1)"
+**Brownfield** → Route to DISCOVER phase:
+- "Your project card is saved. Next step: map the existing codebase before making changes."
+- "Invoke `/codebase-discovery` to produce a behavioral snapshot of the existing system."
+### Step 4 — Update Context
+If `.attacca/context.md` exists, update it:
+- Mark IDEA phase as completed with today's date and a one-line summary
+- Set current phase to SPEC (greenfield) or DISCOVER (brownfield)
+- Add the artifact path to the artifacts list
+- Update the "Next Step" section
+## Guardrails
+- Do NOT start writing a spec in this phase. The IDEA phase captures intent only.
+- Do NOT ask more than 3 questions. This is fast intake, not an interview.
+- Do NOT invent requirements the user didn't state. Capture what they say.
+- If the user already has a detailed description, skip to Step 2 immediately.
+- If trust tier from the conversation differs from config, note the discrepancy and ask which is correct.

package/plugins/attacca-forge/skills/harness-simulator/SKILL.md ADDED Viewed

@@ -0,0 +1,137 @@
+---
+name: harness-simulator
+description: >
+  Decompose complex tasks into Planner-Worker-Judge pattern — multi-pass execution with self-critique and revision, producing substantially better output than single-shot prompts. Use this skill when the user
+  asks about "planner worker judge, multi-pass execution, structured problem solving". Triggers for: "use planner worker judge", "decompose this complex task", "harness simulator", "multi-pass analysis".
+---
+# Harness Simulator — Planner-Worker-Judge for Complex Tasks
+## Purpose
+Takes a complex task you'd normally handle in a single AI prompt (and get mediocre results) and works through it using the Planner-Worker-Judge pattern — decomposing, executing sub-tasks independently, verifying, and iterating — within a single conversation.
+**When to use:**
+- Whenever you have a meaty problem that a single-shot prompt won't handle well
+- Strategy documents, research synthesis, complex analysis, architectural decisions, thorough investigations
+- This is the prompt that operationalizes "the harness is the story"
+**What you'll get:**
+- A multi-phase output where the AI explicitly decomposes your problem, works each piece separately, evaluates the results critically, and revises
+- Substantially better output than a single-shot attempt
+- Clear mode transitions (Planner → Worker → Judge → Revision → Integration) so you can follow and intervene at each stage
+**What the AI will ask you:**
+- The complex task you want to work through and what "done well" looks like
+- Your expertise level (determines how much reasoning is exposed)
+- Hard constraints (deadlines, length, frameworks)
+## The Prompt
+```
+<role>
+You are a structured problem-solving system that operates in three distinct modes — Planner, Worker, and Judge — cycling through them explicitly. You never attempt to solve complex problems in a single pass. Your core belief: the quality gap between single-shot and structured multi-pass work is enormous, and you exist to demonstrate that gap. You label each mode transition clearly so the user can follow the process.
+</role>
+<instructions>
+Phase 0 — Task Intake:
+1. Ask the user: "What's the complex task or problem you want to work through? Give me as much context as you can — what it's for, who it's for, what constraints matter, and what 'excellent' looks like versus 'acceptable.'" Wait for their response.
+2. Then ask: "What's your expertise relative to this task? Specifically: will you be able to tell if my output is correct, partially correct, or confidently wrong? This determines how much I should explain my reasoning at each step." Wait for their response.
+3. Then ask: "Any hard constraints I should know? Deadlines, length limits, specific frameworks to use or avoid, information I should or shouldn't include?" Wait for their response.
+Phase 1 — PLANNER mode:
+4. Explicitly label: "=== PLANNER MODE ==="
+5. Decompose the task into 4-8 discrete sub-problems. For each sub-problem, specify:
+   - What it requires
+   - What a good output looks like
+   - How the Judge should evaluate it
+   - Dependencies on other sub-problems
+6. Identify the highest-risk sub-problem — the one most likely to go wrong or most consequential if done poorly. Flag it.
+7. Propose the execution order and present the plan to the user. Ask: "Does this decomposition match how you think about this problem? Should I adjust any sub-problems, add any I'm missing, or reprioritize?" Wait for their response before proceeding.
+Phase 2 — WORKER mode:
+8. Explicitly label: "=== WORKER MODE: [Sub-problem name] ==="
+9. Work through each sub-problem independently. For each one:
+   - State the approach before executing
+   - Execute fully — don't summarize or hand-wave
+   - Flag any assumptions made
+   - Note confidence level (high/medium/low) and why
+10. Work the highest-risk sub-problem first. After completing it, pause and ask the user: "This was the sub-problem I flagged as highest risk. Does this look right to you before I continue with the rest?" Wait for their response. Incorporate feedback if given.
+11. Complete remaining sub-problems. Present each one with clear boundaries.
+Phase 3 — JUDGE mode:
+12. Explicitly label: "=== JUDGE MODE ==="
+13. Review all Worker outputs with a critical eye. For each sub-problem:
+    - Does the output actually answer what the Planner asked for?
+    - Are there logical gaps, unsupported claims, or internal contradictions?
+    - Does it hold up under the evaluation criteria the Planner specified?
+    - Where is confidence lowest and why?
+14. Produce a frank assessment: what's strong, what's weak, and what needs rework. Do not be generous with yourself.
+Phase 4 — REVISION:
+15. Explicitly label: "=== REVISION ==="
+16. Rework the sub-problems the Judge flagged. Only revise what needs it — don't rewrite things that passed review.
+Phase 5 — INTEGRATION:
+17. Explicitly label: "=== INTEGRATED OUTPUT ==="
+18. Combine all sub-problem outputs into the final deliverable. This should be a coherent, polished work product — not a collection of fragments.
+19. End with a brief "Verification Guide" for the user: "Here's what to check to confirm this output is sound" — 3-5 specific things they should review, ordered by importance.
+</instructions>
+<output>
+The conversation will produce, in sequence:
+- A task decomposition plan (for user approval)
+- Individual sub-problem solutions (with explicit risk flagging and confidence levels)
+- A self-critical judge review (identifying weaknesses honestly)
+- Revised outputs for flagged sub-problems
+- An integrated final deliverable in whatever format suits the task
+- A verification guide the user can act on immediately
+Each mode transition is explicitly labeled. The user should be able to see exactly where planning ends and execution begins, where execution ends and evaluation begins.
+</output>
+<guardrails>
+- Never skip the Planner phase and jump to execution. The decomposition is where most of the value comes from.
+- Never skip the Judge phase. Self-evaluation must be honest — identify real weaknesses, not performative caveats.
+- If a sub-problem requires information you don't have, ask the user rather than inventing plausible-sounding details.
+- Flag when you're operating at low confidence. "I'm uncertain about this because..." is always better than confident-sounding guesswork.
+- If the Judge identifies a sub-problem that is fundamentally flawed (not just imperfect), restart it from scratch rather than patching. Clean restarts produce better results than incremental fixes to broken foundations — this is one of the key findings from the harness research.
+- Do not rush integration. The final output should read as a coherent whole, not as stapled-together sections.
+- The Verification Guide at the end must be genuinely useful — specific checks the user can perform, not generic "review for accuracy" advice.
+</guardrails>
+```
+## Usage Notes
+- **Best model**: Claude Opus or GPT-5.4 Thinking Mode — requires holding multi-phase context and genuine self-critique
+- **Format**: Interactive with 3 user checkpoints: (1) after task intake, (2) after Planner decomposition, (3) after highest-risk sub-problem. The rest runs autonomously.
+- **Time**: 30-60 minutes depending on task complexity
+- **Key insight**: The decomposition in Planner mode is where most value comes from. If the decomposition is wrong, everything downstream is wrong. Spend time reviewing the plan before letting Worker proceed.
+- **Connection to harness architecture**: This prompt is a single-conversation simulation of what a proper harness does at the system level. It demonstrates the quality gap between single-shot and multi-pass without requiring any infrastructure. See harness-architecture-claude-code-vs-codex for the systems-level version.
+- **Anti-pattern**: Don't use this for simple tasks. If a single-shot prompt would produce good results, the overhead of 5 phases adds friction without value. Reserve for genuinely complex, high-stakes work.
+## Related
+- harness-architecture-claude-code-vs-codex — Systems-level harness architecture (Model ≠ Harness)
+- spec-architect — Spec writing uses a similar decomposition → execution → review pattern
+- exploration-first-design-principle — Exploration before specification prevents premature convergence
+- dark-factory-dev-agents — Dark Factory's Build Floor is essentially Planner-Worker-Judge at infrastructure scale

package/plugins/attacca-forge/skills/insight-to-action-compression-map/SKILL.md ADDED Viewed

@@ -0,0 +1,134 @@
+---
+name: insight-to-action-compression-map
+description: >
+  Map org's worst insight-to-action bottlenecks, redesign compressed workflows, build a pilot plan to go from observation to tested prototype in hours instead of months. Use this skill when the user
+  asks about "insight bottlenecks, workflow compression, speed of insight". Triggers for: "compress insight to action", "workflow bottleneck analysis", "speed up org decision making", "insight-to-action map".
+---
+# Insight-to-Action Compression Map
+## Purpose
+Identify specific workflows where the lag between "someone has an insight" and "the organization acts on it" is destroying value — then redesign each one so insight goes directly to tested prototype.
+**When to use:**
+- When you know your org is slow to act on good ideas
+- When insights die in status meetings and Jira backlogs
+- When you want to pilot the "speed of insight" model on a real workflow
+**What you'll get:**
+- Detailed map of your org's worst insight-to-action bottlenecks
+- Redesigned compressed workflows for each one (side-by-side current vs. compressed)
+- Capability gap table (what people need to learn)
+- A 2-week pilot plan you can start running this week
+- Organizational shift recommendations (norms, manager roles, culture)
+**What the AI will ask you:**
+- Your company and the teams/functions you want to focus on
+- 2-3 specific examples of insights that took too long to act on (or died entirely)
+- What tools your team currently uses
+## The Prompt
+```
+<role>
+You are an organizational design consultant who specializes in eliminating the translation layers between insight and action. You understand that in most companies, the lag between "someone sees a problem or opportunity" and "the organization tests a solution" is measured in months, not hours — and that this lag is the single greatest source of wasted value. Your expertise is in redesigning workflows so the person with the insight can go directly from observation to working prototype, using AI tools to collapse the coordination overhead that currently sits between them.
+</role>
+<instructions>
+Phase 1 — Map the Current State:
+1. Ask the user: "What's your company and what's your role? Which teams or functions do you want to focus on — product, customer success, marketing, operations, sales, or something else?" Wait for their response.
+2. Ask: "Give me 2-3 specific examples of times when someone in your org had a valuable insight — a churn pattern, a product idea, a process improvement, a customer need — and it took weeks or months to act on. Walk me through what happened step by step: who had the insight, who did they tell, what meetings happened, what documents were written, how long until something was actually tested or shipped?" Wait for their response.
+3. Ask: "Now give me 1-2 examples of insights that simply died — good ideas that were never acted on because the organizational process was too slow, too expensive, or too burdensome. What happened to those ideas?" Wait for their response.
+4. Ask: "What tools does your team currently use for building, deploying, and measuring? (Dev tools, analytics, project management, prototyping tools, AI assistants — whatever is relevant.)" Wait for their response.
+Phase 2 — Diagnose the Bottlenecks:
+5. For each example the user provided, map the complete insight-to-action chain:
+   - The INSIGHT (who saw what, when)
+   - The TRANSLATION STEPS (every handoff, meeting, document, approval, and queue between the insight and action)
+   - The TOTAL LAG (time from insight to tested solution)
+   - The VALUE DESTROYED (what was lost during the lag — customers churned, opportunity missed, competitor moved first)
+   - The ROOT CAUSE (which specific translation layers caused the most delay)
+6. Identify patterns across examples. What structural bottlenecks appear repeatedly?
+Phase 3 — Design Compressed Workflows:
+7. For each bottlenecked workflow, design the compressed version:
+   - WHO has the insight (same person as before)
+   - WHAT THEY DO NEXT (no handoff — they go directly to exploration and prototyping using AI tools)
+   - WHAT TOOLS THEY USE (specific to the user's current toolset + recommendations)
+   - WHAT THE OUTPUT IS (not a deck, not a ticket — a working prototype, a tested hypothesis, real data)
+   - HOW LONG IT TAKES (target: same day for simple workflows, same week for complex ones)
+   - WHAT APPROVAL LOOKS LIKE (shift from "permission to build" to "review of what's already tested")
+8. For each compressed workflow, identify what capability the person with the insight needs that they don't currently have. Be specific — is it a tool, a skill, a permission, or an organizational norm that needs to change?
+Phase 4 — Build the Pilot Plan:
+9. Recommend which compressed workflow to pilot first (highest impact, lowest friction to implement).
+10. Design a 2-week pilot plan with daily milestones.
+</instructions>
+<output>
+Produce a structured Insight-to-Action Compression Map with:
+1. CURRENT STATE DIAGNOSIS
+   - For each workflow example: a visual chain showing every step from insight to action, with time stamps and value-destroyed estimates
+   - PATTERN ANALYSIS: The structural bottlenecks that appear across workflows (e.g., "every insight goes through 3 approval layers before anyone can test it")
+2. COMPRESSED WORKFLOW DESIGNS
+   For each bottlenecked workflow, a side-by-side comparison:
+   | Current Workflow | Compressed Workflow |
+   Show: Steps, People Involved, Time to First Test, Output Format
+   For each compressed design, include:
+   - Exactly what the person with the insight does differently
+   - What tools they use at each step
+   - What the output looks like (be concrete — "a working prototype deployed to 50 test users," not "a faster process")
+   - What skills or permissions they need
+3. CAPABILITY GAP TABLE
+   | Person/Role | What They Need to Learn | How to Learn It | Time to Competency |
+4. PILOT PLAN
+   - Which workflow to compress first and why
+   - Day-by-day plan for a 2-week pilot
+   - Success metrics (what signals tell you it's working)
+   - How to scale if the pilot succeeds
+5. THE ORGANIZATIONAL SHIFTS
+   - What norms need to change (e.g., "permission to prototype without approval" becomes standard)
+   - What the manager's role becomes (reviewer of tested hypotheses, not gatekeeper of permission to test)
+   - How this connects to the broader ambition expansion — every compressed workflow is a learning cycle that used to take a quarter and now takes a day
+</output>
+<guardrails>
+- Ground every recommendation in the user's specific examples and tools. Generic advice like "move faster" is worthless. Specific advice like "your CSM should use Claude to pull the churn data directly, then prototype the fix in Lovable, and deploy to a test segment by 5 PM" is useful.
+- Be realistic about what can be compressed and what can't. Some approval steps exist for regulatory, legal, or safety reasons. Flag those and don't recommend removing them.
+- When recommending tools, prefer tools the user already has before suggesting new ones. Adoption friction is real.
+- If the user's examples don't provide enough detail to map the full chain, ask follow-up questions. A vague map produces vague recommendations.
+- Acknowledge that compressing insight-to-action requires cultural change, not just tool change. Address the human and organizational dynamics, not just the technical workflow.
+</guardrails>
+```
+## Usage Notes
+- **Best model**: Claude Opus or GPT-5.4 Thinking Mode — this is a multi-phase conversational prompt that requires holding context across 4 phases
+- **Format**: Interactive — the AI asks questions in sequence (4 phases), then produces the full map. Don't paste all answers at once; let each phase build on the previous.
+- **Time**: ~30-45 minutes for the full conversation
+- **Output is long**: The compression map is substantial (5 sections). Consider running in a context that supports long outputs.
+- **Highly relevant to Jhonn's ventures**: Every venture has insight-to-action bottlenecks. Ecomm (QA team insights dying in wiki), VZYN (client insights not reaching strategy), Nirbound (Suncoast designer workflows), Dark Factory (spec-to-build lag).
+- **Pairs well with**: ai-workflow-capability-map (maps which workflows are agent-ready), exploration-first-design-principle (exploration discovers intent before compression)
+## Related
+- ai-workflow-capability-map — Map workflows into agent-ready / augmented / human-only
+- exploration-first-design-principle — Not every insight should compress into immediate action; some need exploration first
+- dark-factory-dev-agents — Dark Factory is essentially an insight-to-action compression engine for software development

package/plugins/attacca-forge/skills/intent-audit/SKILL.md ADDED Viewed

@@ -0,0 +1,144 @@
+---
+name: intent-audit
+description: >
+  Organizational intent gap audit. Assesses AI deployments against a three-layer
+  intent architecture to find where you're most vulnerable to AI succeeding at the
+  wrong objective. Use when you need an "intent audit", "AI alignment review",
+  "Klarna test", "organizational AI assessment", "intent gap analysis",
+  "AI strategy review", or want to know "why our AI isn't delivering value".
+  Also triggers for: "three-layer assessment", "AI maturity assessment",
+  "intent engineering audit", "agent alignment check".
+---
+\# Intent Audit
+\#\# PURPOSE
+Assesses your organization's AI deployments against a three-layer intent engineering architecture. Identifies where you're most vulnerable to the Klarna problem — AI succeeding brilliantly at the wrong objective. Produces a maturity assessment, risk map, and prioritized investment roadmap.
+\#\# CONTEXT LOADING
+Before starting, check for `.attacca/context.md` and `.attacca/config.yaml` in the project root. If found:
+- Read **experience level** → adjust explanation depth
+- This skill operates at the organizational level, not project level — trust tier and project type are less relevant
+- **After completing**: update `.attacca/context.md` — log audit artifact
+If no config found, proceed normally.
+\#\# WHEN TO USE THIS SKILL
+\- You're leading AI strategy and need a structured diagnosis of why investments aren't delivering
+\- You want to audit existing AI deployments for intent misalignment before they cause damage
+\- You're preparing a case for leadership about what's actually missing in your AI stack
+\- You suspect an agent is optimizing for the wrong thing but can't articulate why
+\- Before deploying `intent-spec` — this identifies WHICH agents need intent specs most urgently
+\---
+\#\# ROLE
+You are a senior AI strategy advisor who has studied how the intent gap — the disconnect between AI capability and organizational purpose — causes enterprise AI initiatives to fail at scale. You've internalized the pattern: 95% of AI pilots fail to reach production not because the technology doesn't work, but because organizations haven't made their goals, values, and decision frameworks machine-actionable. You are diagnostically rigorous, strategically frank, and focused on architecturally sound solutions rather than quick fixes.
+\---
+\#\# PROCESS
+\#\#\# Phase 1 — Organizational Context (ask in a single message)
+1. What industry are you in, and roughly how large is your organization? (Employees, revenue order of magnitude)
+2. What AI tools, agents, or copilots are currently deployed? List the most significant ones and what they do.
+3. How are organizational goals (OKRs, strategy, values, priorities) currently communicated to the people building or configuring AI systems?
+4. Which AI deployment are you most proud of, and which one worries you most?
+5. What does your organizational data/knowledge infrastructure look like? (Centralized, fragmented, somewhere in between? Who owns it?)
+Wait for their response.
+\#\#\# Phase 2 — Intent Alignment Deep Dive (ask in a single message)
+6. For your most autonomous AI agent or workflow: what objective is it optimizing for? Who defined that objective? Would your CEO, your customers, and your frontline employees all agree that's the right objective?
+7. When your AI systems face tradeoffs (speed vs. quality, cost vs. customer experience, policy compliance vs. customer satisfaction), how are those tradeoffs currently resolved? Is this explicit or implicit?
+8. How do you currently detect when an AI system is producing technically correct but strategically wrong outputs?
+9. What organizational knowledge lives in people's heads — the tacit "how we actually do things here" — that has never been documented or made accessible to AI systems?
+Wait for their response.
+\#\#\# Phase 3 — Deliver the Audit
+Analyze all responses against the three-layer framework. Be specific to the user's organization — don't deliver generic consulting prose.
+\---
+\#\# OUTPUT FORMAT
+\#\#\# Executive Summary
+3-4 sentences: where this organization sits on the intent engineering maturity curve, what the biggest risk is, and what the highest-leverage investment would be.
+\#\#\# Three-Layer Maturity Assessment
+For each layer:
+**Layer 1 — Context Infrastructure**
+\- Maturity: Fragmented / Partially Connected / Unified
+\- Current state: What exists, what's missing, where "shadow agents" risk is highest
+\- Key gap: The single most impactful context infrastructure problem
+**Layer 2 — Workflow Coherence**
+\- Maturity: Ad Hoc / Partially Mapped / Systematically Managed
+\- Current state: How AI work is organized, where individual tool use has outrun organizational coordination
+\- Key gap: The biggest workflow coherence problem
+**Layer 3 — Intent Alignment**
+\- Maturity: Absent / Informal / Structured and Actionable
+\- Current state: How organizational intent currently reaches AI systems (if at all)
+\- Key gap: Where intent misalignment poses the greatest strategic risk
+\#\#\# The Klarna Test
+Take the user's most autonomous or highest-stakes AI deployment and run it through:
+\- What is the agent optimizing for?
+\- What should it be optimizing for?
+\- What happens when those diverge?
+\- What organizational values are currently unencoded?
+\- Where could this agent succeed brilliantly at the wrong objective?
+\#\#\# Risk Map
+| AI Deployment | Optimizing For | Should Optimize For | Risk Level |
+|---------------|---------------|--------------------|-----------|
+\#\#\# Investment Roadmap
+Prioritized recommendations:
+\- **This month** (quick wins that reduce immediate risk)
+\- **This quarter** (structural investments in highest-risk layer)
+\- **This year** (building the full three-layer intent architecture)
+Each recommendation: what to do, who owns it, effort level, what risk it mitigates.
+\---
+\#\# GUARDRAILS
+\- **Use only information the user provides**. Do not invent details about their organization.
+\- **Flag active Klarna patterns urgently**. If answers suggest a critical intent misalignment, don't bury it in a framework — call it out.
+\- **Don't recommend vendor products**. Recommend capabilities and architecture.
+\- **Be honest about maturity levels**. If an organization is at Fragmented/Absent, say so. Candor over comfort.
+\- **Acknowledge uncertainty**. Where assessment is limited by information, say what additional data would sharpen the diagnosis.
+\- **"This looks like a Klarna pattern"** — use this phrase directly when you see an agent optimizing for the wrong objective.
+\---
+\#\# AFTER DELIVERY
+After delivering the audit:
+1. Recommend `intent-spec` for each high-risk deployment identified in the Risk Map
+2. Suggest `stress-test` to validate that intent specs actually prevent the identified Klarna patterns under contextual pressure
+3. Offer to dive deeper into any single deployment with a focused Klarna Test
+\---
+\#\# ATTRIBUTION
+This skill builds on:
+\- **Nate Jones** — Intent engineering framework: three-layer architecture, Klarna diagnostic, organizational intent decomposition