project-iris 0.0.1
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- package/README.md +424 -0
- package/dist/bridge/agent-runner.js +190 -0
- package/dist/bridge/connector-factory.js +31 -0
- package/dist/bridge/connectors/antigravity-connector.js +18 -0
- package/dist/bridge/connectors/cursor-connector.js +31 -0
- package/dist/bridge/connectors/in-process-connector.js +29 -0
- package/dist/bridge/connectors/vscode-connector.js +31 -0
- package/dist/bridge/connectors/windsurf-connector.js +23 -0
- package/dist/bridge/filesystem-connector.js +110 -0
- package/dist/bridge/helper.js +203 -0
- package/dist/bridge/types.js +10 -0
- package/dist/cli.js +38 -0
- package/dist/commands/ask.js +259 -0
- package/dist/commands/bridge.js +88 -0
- package/dist/commands/develop.js +141 -0
- package/dist/commands/doctor.js +102 -0
- package/dist/commands/flow.js +301 -0
- package/dist/commands/framework.js +273 -0
- package/dist/commands/generate.js +59 -0
- package/dist/commands/install.js +73 -0
- package/dist/commands/pack.js +33 -0
- package/dist/commands/phase.js +38 -0
- package/dist/commands/run.js +199 -0
- package/dist/commands/status.js +114 -0
- package/dist/commands/uninstall.js +14 -0
- package/dist/commands/use.js +20 -0
- package/dist/commands/validate.js +102 -0
- package/dist/framework/framework-loader.js +97 -0
- package/dist/framework/framework-paths.js +48 -0
- package/dist/framework/framework-types.js +15 -0
- package/dist/iris/artifact-checker.js +78 -0
- package/dist/iris/artifacts/config.js +68 -0
- package/dist/iris/artifacts/generator.js +88 -0
- package/dist/iris/artifacts/types.js +1 -0
- package/dist/iris/bundle.js +44 -0
- package/dist/iris/doctrine/collector.js +124 -0
- package/dist/iris/fixer.js +149 -0
- package/dist/iris/flows/manifest.js +124 -0
- package/dist/iris/framework-context.js +49 -0
- package/dist/iris/framework-manager.js +215 -0
- package/dist/iris/fs/atomic.js +22 -0
- package/dist/iris/guard.js +38 -0
- package/dist/iris/importers/bmad.js +70 -0
- package/dist/iris/importers/index.js +9 -0
- package/dist/iris/importers/speckit.js +15 -0
- package/dist/iris/importers/specsmd.js +78 -0
- package/dist/iris/importers/types.js +8 -0
- package/dist/iris/importers/writer.js +139 -0
- package/dist/iris/include.js +49 -0
- package/dist/iris/installer.js +334 -0
- package/dist/iris/interactive/env.js +21 -0
- package/dist/iris/interactive/intent-interview.js +345 -0
- package/dist/iris/interactive/intent-schema.js +28 -0
- package/dist/iris/interactive/interview-io.js +22 -0
- package/dist/iris/interview/config.js +71 -0
- package/dist/iris/interview/types.js +16 -0
- package/dist/iris/interview/utils.js +38 -0
- package/dist/iris/manifest.js +54 -0
- package/dist/iris/packer.js +325 -0
- package/dist/iris/parsers/unit-parser.js +43 -0
- package/dist/iris/paths.js +18 -0
- package/dist/iris/policy.js +133 -0
- package/dist/iris/proc.js +56 -0
- package/dist/iris/report.js +53 -0
- package/dist/iris/resolver.js +66 -0
- package/dist/iris/router.js +114 -0
- package/dist/iris/routes.js +189 -0
- package/dist/iris/run-state.js +146 -0
- package/dist/iris/state.js +113 -0
- package/dist/iris/templates.js +70 -0
- package/dist/iris/tmp.js +24 -0
- package/dist/iris/uninstaller.js +181 -0
- package/dist/iris/utils/interpolate.js +42 -0
- package/dist/iris/validator.js +391 -0
- package/dist/iris/workflow/config.js +51 -0
- package/dist/iris/workflow/engine.js +129 -0
- package/dist/iris/workflow/steps.js +448 -0
- package/dist/iris/workflow/types.js +1 -0
- package/dist/lib.js +96 -0
- package/dist/utils/exit-codes.js +7 -0
- package/dist/workflows/bolt-execution.js +238 -0
- package/dist/workflows/bolt-plan.js +192 -0
- package/dist/workflows/intent-inception.js +210 -0
- package/dist/workflows/reporting.js +74 -0
- package/package.json +45 -0
- package/src/iris_bundle/.iris/aidlc/README.md +16 -0
- package/src/iris_bundle/.iris/aidlc/agents/iris-construction-agent.md +35 -0
- package/src/iris_bundle/.iris/aidlc/agents/iris-inception-agent.md +30 -0
- package/src/iris_bundle/.iris/aidlc/agents/iris-master-agent.md +35 -0
- package/src/iris_bundle/.iris/aidlc/agents/iris-operations-agent.md +29 -0
- package/src/iris_bundle/.iris/aidlc/commands/iris-construction-agent.md +18 -0
- package/src/iris_bundle/.iris/aidlc/commands/iris-inception-agent.md +18 -0
- package/src/iris_bundle/.iris/aidlc/commands/iris-master-agent.md +18 -0
- package/src/iris_bundle/.iris/aidlc/commands/iris-operations-agent.md +18 -0
- package/src/iris_bundle/.iris/aidlc/context/context-map.md +25 -0
- package/src/iris_bundle/.iris/aidlc/context/exclusion-rules.md +13 -0
- package/src/iris_bundle/.iris/aidlc/context/load-order.md +25 -0
- package/src/iris_bundle/.iris/aidlc/memory/intent-rules.md +9 -0
- package/src/iris_bundle/.iris/aidlc/memory/log-rules.md +5 -0
- package/src/iris_bundle/.iris/aidlc/memory/memory-bank.yaml +39 -0
- package/src/iris_bundle/.iris/aidlc/memory/unit-rules.md +9 -0
- package/src/iris_bundle/.iris/aidlc/quick-start.md +24 -0
- package/src/iris_bundle/.iris/aidlc/skills/execution/implementation.md +14 -0
- package/src/iris_bundle/.iris/aidlc/skills/execution/refactoring.md +13 -0
- package/src/iris_bundle/.iris/aidlc/skills/execution/scaffold-generation.md +15 -0
- package/src/iris_bundle/.iris/aidlc/skills/governance/escalation.md +13 -0
- package/src/iris_bundle/.iris/aidlc/skills/governance/quality-gates.md +14 -0
- package/src/iris_bundle/.iris/aidlc/skills/governance/stop-conditions.md +11 -0
- package/src/iris_bundle/.iris/aidlc/skills/reasoning/decomposition.md +23 -0
- package/src/iris_bundle/.iris/aidlc/skills/reasoning/risk-analysis.md +14 -0
- package/src/iris_bundle/.iris/aidlc/skills/reasoning/verification.md +21 -0
- package/src/iris_bundle/.iris/aidlc/standards/artifacts-registry.md +38 -0
- package/src/iris_bundle/.iris/aidlc/standards/decision-logging.md +16 -0
- package/src/iris_bundle/.iris/aidlc/standards/doctrine-structure.md +31 -0
- package/src/iris_bundle/.iris/aidlc/standards/documentation-rules.md +15 -0
- package/src/iris_bundle/.iris/aidlc/standards/file-structure.md +21 -0
- package/src/iris_bundle/.iris/aidlc/standards/naming-conventions.md +18 -0
- package/src/iris_bundle/.iris/aidlc/standards/phases-and-gates.md +25 -0
- package/src/iris_bundle/.iris/aidlc/standards/routes-and-routing.md +35 -0
- package/src/iris_bundle/.iris/aidlc/standards/tool-wrappers.md +32 -0
- package/src/iris_bundle/.iris/aidlc/templates/bolt.md +23 -0
- package/src/iris_bundle/.iris/aidlc/templates/doctrine-doc-template.md +33 -0
- package/src/iris_bundle/.iris/aidlc/templates/intent.md +23 -0
- package/src/iris_bundle/.iris/aidlc/templates/log.md +24 -0
- package/src/iris_bundle/.iris/aidlc/templates/review.md +21 -0
- package/src/iris_bundle/.iris/aidlc/templates/unit.md +31 -0
- package/src/iris_bundle/.iris/aidlc/validation/failure-modes.md +16 -0
- package/src/iris_bundle/.iris/aidlc/validation/phase-preconditions.md +21 -0
- package/src/iris_bundle/.iris/aidlc/validation/quality-checklist.md +20 -0
- package/src/iris_bundle/.iris/flows/specs-md/doctrine/templates/requirements.md +18 -0
- package/src/iris_bundle/.iris/flows/specs-md/doctrine/templates/system-context.md +17 -0
- package/src/iris_bundle/.iris/flows/specs-md/doctrine/templates/unit-briefs/construction.md +16 -0
- package/src/iris_bundle/.iris/flows/specs-md/doctrine/templates/unit-briefs/operations.md +14 -0
- package/src/iris_bundle/.iris/flows/specs-md/doctrine/templates/units.md +11 -0
- package/src/iris_bundle/.iris/flows/specs-md/flow.yaml +8 -0
- package/src/iris_bundle/.iris/flows/specs-md/policy.overlay.yaml +26 -0
- package/src/iris_bundle/.iris/flows/specs-md/routes.overlay.yaml +16 -0
- package/src/iris_bundle/.iris/policy.yaml +27 -0
- package/src/iris_bundle/.iris/routes.yaml +98 -0
- package/src/iris_bundle/.iris/state.yaml +7 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/claude.md +9 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/commands/compare-specs.md +203 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/commands/iris-construction-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/commands/iris-inception-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/commands/iris-master-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/claude/.claude/commands/iris-operations-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/codex/AGENTS.md +15 -0
- package/src/iris_bundle/.iris/tools/cursor/.cursor/commands/iris-construction-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/cursor/.cursor/commands/iris-inception-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/cursor/.cursor/commands/iris-master-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/cursor/.cursor/commands/iris-operations-agent.md +25 -0
- package/src/iris_bundle/.iris/tools/gemini/.gemini/commands/iris-construction-agent.toml +29 -0
- package/src/iris_bundle/.iris/tools/gemini/.gemini/commands/iris-inception-agent.toml +29 -0
- package/src/iris_bundle/.iris/tools/gemini/.gemini/commands/iris-master-agent.toml +29 -0
- package/src/iris_bundle/.iris/tools/gemini/.gemini/commands/iris-operations-agent.toml +29 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/claude.md +238 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/commands/compare-iris.md +203 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/commands/irismd-construction-agent.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/commands/irismd-inception-agent.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/commands/irismd-master-agent.md +47 -0
- package/src/iris_bundle/frameworks/iris-core/.claude/commands/irismd-operations-agent.md +77 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/claude-code-review.yml +57 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/claude.yml +50 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/npm-package-ci.yml +46 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/npm-package-dev.yml +59 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/npm-package-release.yml +107 -0
- package/src/iris_bundle/frameworks/iris-core/.github/workflows/vscode-publish.yml +113 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/README.md +372 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/agents/construction-agent.md +80 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/agents/inception-agent.md +97 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/agents/master-agent.md +61 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/agents/operations-agent.md +89 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/commands/construction-agent.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/commands/inception-agent.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/commands/master-agent.md +47 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/commands/operations-agent.md +77 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/context-config.yaml +67 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/memory-bank.yaml +104 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/quick-start.md +322 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/construction/bolt-list.md +163 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/construction/bolt-replan.md +345 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/construction/bolt-start.md +442 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/construction/bolt-status.md +185 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/construction/navigator.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/bolt-plan.md +372 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/context.md +171 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/intent-create.md +211 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/intent-list.md +124 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/navigator.md +207 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/requirements.md +227 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/review.md +248 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/story-create.md +304 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/inception/units.md +278 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/master/analyze-context.md +239 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/master/answer-question.md +141 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/master/explain-flow.md +158 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/master/project-init.md +281 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/master/route-request.md +126 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/operations/build.md +237 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/operations/deploy.md +259 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/operations/monitor.md +265 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/operations/navigator.md +209 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/skills/operations/verify.md +224 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-template.md +226 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/ddd-construction-bolt/adr-template.md +49 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-01-domain-model-template.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-02-technical-design-template.md +67 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-03-test-report-template.md +62 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/ddd-construction-bolt.md +528 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/simple-construction-bolt.md +347 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/bolt-types/spike-bolt.md +240 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/construction-log-template.md +129 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/standards/coding-standards.md +29 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/standards/system-architecture.md +22 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/construction/standards/tech-stack.md +19 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/inception-log-template.md +134 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/project/README.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/requirements-template.md +144 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/stories-template.md +38 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/story-template.md +147 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/system-context-template.md +29 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/unit-brief-template.md +177 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/inception/units-template.md +52 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/standards/catalog.yaml +345 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/standards/coding-standards.guide.md +553 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/standards/data-stack.guide.md +162 -0
- package/src/iris_bundle/frameworks/iris-core/.irismd/aidlc/templates/standards/tech-stack.guide.md +280 -0
- package/src/iris_bundle/frameworks/iris-core/.markdownlint.yaml +142 -0
- package/src/iris_bundle/frameworks/iris-core/LICENSE +21 -0
- package/src/iris_bundle/frameworks/iris-core/PRIVACY.md +38 -0
- package/src/iris_bundle/frameworks/iris-core/README.md +397 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/framework.yaml +4 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/interview.yaml +9 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/pack.yaml +2 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/policy.yaml +27 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/routes.yaml +98 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/bolt.md +23 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/doctrine-doc-template.md +33 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/intent.md +23 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/log.md +24 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/review.md +21 -0
- package/src/iris_bundle/frameworks/iris-core/_iris_legacy/templates/unit.md +31 -0
- package/src/iris_bundle/frameworks/iris-core/artifacts.yaml +78 -0
- package/src/iris_bundle/frameworks/iris-core/dev_release_guide.md +324 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/.claude/skills/frontend-design/SKILL.md +106 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/CLAUDE.md +171 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/README.md +20 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/agents/construction-agent.mdx +358 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/agents/inception-agent.mdx +306 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/agents/master-agent.mdx +230 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/agents/operations-agent.mdx +344 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/agents/overview.mdx +187 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/architecture/flows.mdx +136 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/community.mdx +91 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/compare/overview.mdx +167 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/compare/vs-bmad.mdx +167 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/compare/vs-kiro.mdx +208 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/compare/vs-openspec.mdx +140 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/compare/vs-speckit.mdx +146 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/core-concepts/bolts.mdx +268 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/core-concepts/intents.mdx +164 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/core-concepts/memory-bank.mdx +209 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/core-concepts/standards.mdx +277 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/core-concepts/units.mdx +184 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/docs.json +148 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/faq.mdx +364 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/feedback.mdx +55 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/getting-started/installation.mdx +91 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/getting-started/quick-start.mdx +149 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/getting-started/vscode-extension.mdx +180 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/guides/bolt-types.mdx +182 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/extension-gallery/bolts.jpeg +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/extension-gallery/overview.jpeg +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/extension-gallery/specs.jpeg +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/extension-preview.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/favicon.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/hero-dark.svg +129 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/hero-light.svg +129 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/logo.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/old_favicon.svg +40 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/quickstart.cast +3788 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/images/quickstart.gif +0 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/index.mdx +179 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/methodology/ai-dlc-vs-agile.mdx +138 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/methodology/sdlc-reimagined.mdx +270 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/methodology/three-phases.mdx +225 -0
- package/src/iris_bundle/frameworks/iris-core/docs.iris.md/methodology/what-is-ai-dlc.mdx +96 -0
- package/src/iris_bundle/frameworks/iris-core/framework.yaml +4 -0
- package/src/iris_bundle/frameworks/iris-core/images/763995.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/images/763995.svg +354 -0
- package/src/iris_bundle/frameworks/iris-core/images/favicon-64.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/images/favicon.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/images/logo.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/images/old_favicon.svg +40 -0
- package/src/iris_bundle/frameworks/iris-core/images/specs_md_pixel_logo.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/images/specs_md_pixel_logo_big.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/interview.yaml +48 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/PRFAQ.md +193 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/016-analytics-tracker/bolt.md +122 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/016-analytics-tracker/implementation-plan.md +172 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/016-analytics-tracker/implementation-walkthrough.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/016-analytics-tracker/test-walkthrough.md +96 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/017-privacy-documentation/bolt.md +72 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-1/bolt.md +94 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-1/implementation-plan.md +297 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-1/implementation-walkthrough.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-1/test-walkthrough.md +99 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-2/bolt.md +88 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-2/implementation-plan.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-2/implementation-walkthrough.md +154 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-artifact-parser-2/test-walkthrough.md +119 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-extension-core-1/bolt.md +99 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-extension-core-1/implementation-plan.md +70 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-extension-core-1/implementation-walkthrough.md +45 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-extension-core-1/test-walkthrough.md +60 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-file-watcher-1/bolt.md +86 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-file-watcher-1/implementation-plan.md +154 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-file-watcher-1/implementation-walkthrough.md +43 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-file-watcher-1/test-walkthrough.md +74 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-1/bolt.md +89 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-1/implementation-plan.md +76 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-1/implementation-walkthrough.md +43 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-1/test-walkthrough.md +70 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-2/bolt.md +90 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-2/implementation-plan.md +93 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-2/implementation-walkthrough.md +44 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-2/test-walkthrough.md +54 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-3/bolt.md +90 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-3/implementation-plan.md +168 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-3/implementation-walkthrough.md +137 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-3/test-walkthrough.md +134 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-4/bolt.md +93 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-4/implementation-plan.md +176 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-4/implementation-walkthrough.md +159 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-4/test-walkthrough.md +105 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-5/bolt.md +104 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-sidebar-provider-5/implementation-plan.md +146 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-1/bolt.md +83 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-1/implementation-plan.md +161 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-1/implementation-walkthrough.md +58 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-1/test-walkthrough.md +104 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-2/bolt.md +83 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-2/implementation-plan.md +179 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-2/implementation-walkthrough.md +124 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-2/test-walkthrough.md +95 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-3/bolt.md +83 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-3/implementation-plan.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-3/implementation-walkthrough.md +207 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-3/test-walkthrough.md +194 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-4/bolt.md +92 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-4/implementation-plan.md +217 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-4/implementation-walkthrough.md +138 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-4/test-walkthrough.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-5/bolt.md +89 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-5/implementation-plan.md +181 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-5/implementation-walkthrough.md +160 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-webview-lit-migration-5/test-walkthrough.md +121 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-welcome-view-1/bolt.md +92 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-welcome-view-1/implementation-plan.md +73 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-welcome-view-1/implementation-walkthrough.md +44 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/bolts/bolt-welcome-view-1/test-walkthrough.md +49 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/glossary.md +197 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/requirements.md +129 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/research/approval-gates-simplification.md +839 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/research/archive/approval-gates-analysis.md +331 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/system-context.md +167 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/units/construction-agent/unit-brief.md +111 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/units/inception-agent/unit-brief.md +135 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/units/master-agent/unit-brief.md +580 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/units/operations-agent/unit-brief.md +72 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/001-multi-agent-orchestration/units.md +168 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/requirements.md +96 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/system-context.md +170 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/antigravity-installer/unit-brief.md +156 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/claude-code-installer/unit-brief.md +180 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/cline-installer/unit-brief.md +106 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/codex-installer/unit-brief.md +106 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/copilot-installer/unit-brief.md +139 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/cursor-installer/unit-brief.md +119 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/gemini-installer/unit-brief.md +107 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/installer-core/unit-brief.md +240 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/kilo-installer/unit-brief.md +106 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/kiro-installer/unit-brief.md +108 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/opencode-installer/unit-brief.md +107 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/roo-installer/unit-brief.md +107 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units/windsurf-installer/unit-brief.md +128 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/002-agentic-coding-tool-integration/units.md +188 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/requirements.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/system-context.md +181 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/units/artifact-storage/unit-brief.md +192 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/units/configuration-schema/unit-brief.md +204 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/units/context-loader/unit-brief.md +245 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/003-memory-bank-system/units.md +97 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/requirements.md +125 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/system-context.md +158 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/units/facilitation-guides/unit-brief.md +250 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/units/standards-catalog/unit-brief.md +355 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/units/standards-templates/unit-brief.md +394 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/004-standards-system/units.md +110 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/requirements.md +188 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/system-context.md +229 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units/01-specification-contract-testing/unit-brief.md +197 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units/02-cli-installer-testing/unit-brief.md +377 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units/03-golden-dataset-management/unit-brief.md +465 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units/04-agent-behavior-evaluation/unit-brief.md +459 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units/05-cicd-integration/unit-brief.md +587 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/005-testing-strategy/units.md +332 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/006-brownfield-support/requirements.md +214 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/requirements.md +308 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/system-context.md +335 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/construction-log.md +52 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/001-initialize-mixpanel.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/002-detect-shell.md +68 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/003-check-telemetry-disabled.md +66 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/004-track-installer-events.md +67 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/005-track-selection-events.md +66 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/stories/006-cli-no-telemetry-flag.md +59 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/001-analytics-tracker/unit-brief.md +191 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/002-privacy-documentation/stories/001-create-privacy-md.md +58 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/002-privacy-documentation/stories/002-add-readme-section.md +61 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units/002-privacy-documentation/unit-brief.md +158 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/007-installer-analytics/units.md +453 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/008-terminal-dashboard/requirements.md +222 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/008-terminal-dashboard/system-context.md +198 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/008-terminal-dashboard/units.md +275 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/009-versioning-strategy/requirements.md +308 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/009-versioning-strategy/system-context.md +273 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/009-versioning-strategy/units.md +312 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/010-smart-unit-decomposition/requirements.md +111 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/010-smart-unit-decomposition/system-context.md +154 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/010-smart-unit-decomposition/units.md +102 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/inception-log.md +101 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/requirements.md +282 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/system-context.md +114 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/construction-log.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/001-memory-bank-schema.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/002-project-detection.md +54 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/003-artifact-parsing.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/004-frontmatter-parser.md +61 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/005-bolt-dependencies.md +58 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/stories/006-activity-feed-derivation.md +70 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/artifact-parser/unit-brief.md +223 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/extension-core/construction-log.md +59 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/extension-core/stories/001-extension-activation.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/extension-core/stories/002-command-registration.md +54 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/extension-core/stories/003-file-operation-commands.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/extension-core/unit-brief.md +224 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/file-watcher/construction-log.md +52 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/file-watcher/stories/001-file-system-watcher.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/file-watcher/stories/002-debounced-refresh.md +53 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/file-watcher/unit-brief.md +173 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/construction-log.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/001-tree-data-provider.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/002-intent-unit-story-tree.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/003-bolt-tree.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/004-status-icons.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/005-pixel-logo-footer.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/006-webview-tab-architecture.md +62 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/007-command-center-bolts-tab.md +65 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/008-current-focus-card.md +62 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/009-up-next-queue.md +61 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/010-activity-feed-ui.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/011-filewatcher-statestore-integration.md +49 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/012-next-actions-ui.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/013-start-bolt-action.md +72 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/014-intent-selection-strategies.md +54 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/015-persist-expanded-state.md +56 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/016-bolt-filtering.md +72 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/017-activity-open-button.md +50 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/018-specs-view.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/stories/019-overview-view.md +84 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/sidebar-provider/unit-brief.md +305 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/020-fix-infinite-rerender.md +144 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/021-remove-duplicate-files.md +94 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/022-setup-esbuild.md +156 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/023-lit-scaffold.md +202 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/024-tabs-component.md +161 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/025-bolts-view-components.md +182 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/026-specs-view-components.md +223 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/027-overview-view-components.md +391 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/028-state-context.md +317 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/stories/029-ipc-typed-messaging.md +396 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/webview-lit-migration/unit-brief.md +133 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/welcome-view/construction-log.md +58 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/welcome-view/stories/001-welcome-view-ui.md +57 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/welcome-view/stories/002-install-button-flow.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/welcome-view/stories/003-post-installation-detection.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units/welcome-view/unit-brief.md +211 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/intents/011-vscode-extension/units.md +129 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/maintenance-log.md +21 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/project.yaml +16 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/test_strategy/promptfoo-specsmd-tutorial.md +911 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/test_strategy/promptfoo-tutorial.md +796 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/test_strategy/testing-strategy.md +1057 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/unified-modernization-model.md +559 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/vibe-to-production-academic-research.md +578 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/research/vibe-to-spec-flow-options.md +547 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/standards/coding-standards.md +217 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/standards/output-formatting.md +202 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/standards/skill-template.md +308 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/standards/system-architecture.md +177 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/standards/tech-stack.md +88 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/story-index.md +154 -0
- package/src/iris_bundle/frameworks/iris-core/memory-bank/term-mappings.md +121 -0
- package/src/iris_bundle/frameworks/iris-core/pack.yaml +2 -0
- package/src/iris_bundle/frameworks/iris-core/package.json +11 -0
- package/src/iris_bundle/frameworks/iris-core/policy.yaml +73 -0
- package/src/iris_bundle/frameworks/iris-core/resources/ai-dlc-specification.md +286 -0
- package/src/iris_bundle/frameworks/iris-core/resources/aidlc.pdf +0 -0
- package/src/iris_bundle/frameworks/iris-core/resources/images/aidlc-core-framework.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/resources/images/aidlc-phases-detail.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/resources/images/aidlc-workflow-steps.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/routes.yaml +98 -0
- package/src/iris_bundle/frameworks/iris-core/src/README.md +322 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/schemas/agent.schema.yaml +20 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/schemas/catalog.schema.yaml +60 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/schemas/context-config.schema.yaml +24 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/schemas/memory-bank.schema.yaml +61 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/schemas/skill.schema.yaml +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/analytics/analytics.test.ts +240 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/architecture/bolt-type-agnostic.test.ts +282 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/flow-consistency/code-examples.test.ts +93 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/flow-consistency/helpers.ts +79 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/flow-consistency/placeholder-consistency.test.ts +76 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/flow-consistency/reference-integrity.test.ts +92 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/flow-consistency/terminology-consistency.test.ts +72 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/schema-validation/markdown-schema.test.ts +55 -0
- package/src/iris_bundle/frameworks/iris-core/src/__tests__/unit/schema-validation/yaml-config-schema.test.ts +53 -0
- package/src/iris_bundle/frameworks/iris-core/src/bin/cli.js +21 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/README.md +372 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/agents/construction-agent.md +80 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/agents/inception-agent.md +97 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/agents/master-agent.md +61 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/agents/operations-agent.md +89 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/commands/construction-agent.md +63 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/commands/inception-agent.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/commands/master-agent.md +47 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/commands/operations-agent.md +77 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/context-config.yaml +67 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/memory-bank.yaml +104 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/quick-start.md +322 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/construction/bolt-list.md +163 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/construction/bolt-replan.md +345 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/construction/bolt-start.md +442 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/construction/bolt-status.md +185 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/construction/navigator.md +196 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/bolt-plan.md +372 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/context.md +171 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/intent-create.md +211 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/intent-list.md +124 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/navigator.md +207 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/requirements.md +227 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/review.md +248 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/story-create.md +304 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/inception/units.md +278 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/master/analyze-context.md +239 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/master/answer-question.md +141 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/master/explain-flow.md +158 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/master/project-init.md +281 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/master/route-request.md +126 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/operations/build.md +237 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/operations/deploy.md +259 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/operations/monitor.md +265 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/operations/navigator.md +209 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/skills/operations/verify.md +224 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-template.md +226 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/adr-template.md +49 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-01-domain-model-template.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-02-technical-design-template.md +67 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-03-test-report-template.md +62 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt.md +528 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/simple-construction-bolt.md +347 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/bolt-types/spike-bolt.md +240 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/construction-log-template.md +129 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/standards/coding-standards.md +29 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/standards/system-architecture.md +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/construction/standards/tech-stack.md +19 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/inception-log-template.md +134 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/project/README.md +55 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/requirements-template.md +144 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/stories-template.md +38 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/story-template.md +147 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/system-context-template.md +29 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/unit-brief-template.md +177 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/inception/units-template.md +52 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/standards/catalog.yaml +345 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/standards/coding-standards.guide.md +553 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/standards/data-stack.guide.md +162 -0
- package/src/iris_bundle/frameworks/iris-core/src/flows/aidlc/templates/standards/tech-stack.guide.md +280 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/InstallerFactory.js +36 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/analytics/env-detector.js +92 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/analytics/index.js +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/analytics/machine-id.js +33 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/analytics/tracker.js +232 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/cli-utils.js +342 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/constants.js +32 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installer.js +402 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/AntigravityInstaller.js +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/ClaudeInstaller.js +85 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/ClineInstaller.js +21 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/CodexInstaller.js +21 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/CopilotInstaller.js +113 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/CursorInstaller.js +63 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/GeminiInstaller.js +75 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/KiroInstaller.js +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/OpenCodeInstaller.js +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/RooInstaller.js +22 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/ToolInstaller.js +73 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/installers/WindsurfInstaller.js +76 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/markdown-validator.ts +175 -0
- package/src/iris_bundle/frameworks/iris-core/src/lib/yaml-validator.ts +99 -0
- package/src/iris_bundle/frameworks/iris-core/src/package-lock.json +9922 -0
- package/src/iris_bundle/frameworks/iris-core/src/package.json +118 -0
- package/src/iris_bundle/frameworks/iris-core/src/scripts/artifact-validator.js +594 -0
- package/src/iris_bundle/frameworks/iris-core/src/scripts/bolt-complete.js +606 -0
- package/src/iris_bundle/frameworks/iris-core/src/scripts/status-integrity.js +598 -0
- package/src/iris_bundle/frameworks/iris-core/src/specs_md_pixel_logo.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/src/tsconfig.json +16 -0
- package/src/iris_bundle/frameworks/iris-core/src/vitest.config.ts +12 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/.claude/skills/frontend-design/SKILL.md +42 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/.mocharc.json +5 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/.vscodeignore +17 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/DEVGUIDE.md +103 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/LICENSE +21 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/README.md +198 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/index.html +635 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-1-metrics-dashboard.html +542 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-10-dual-view-focus.html +1105 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-11-dual-view-grouped.html +2304 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-12-dependency-graph.html +1400 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-13-hierarchy-explorer.html +1278 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-14-swimlane-deps.html +1370 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-2-pipeline-flow.html +673 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-3-focus-mode.html +898 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-4-kanban-board.html +858 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-5-timeline.html +890 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-6-activity-feed.html +923 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-7-heatmap-grid.html +932 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-8-2-spec.md +657 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-8-2.html +2098 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-8-command-center.html +2043 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-8a-command-center-timeline.html +1222 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/design-mockups/variation-9-dual-view.html +1101 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/esbuild.webview.mjs +70 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/eslint.config.mjs +36 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/package-lock.json +5712 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/package.json +116 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/extension-preview.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/extension_icon.svg +40 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/favicon-64.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/favicon.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/favicon.svg +40 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/resources/logo.png +0 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/extension.ts +142 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/activityFeed.ts +184 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/artifactParser.ts +477 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/boltTypeParser.ts +191 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/dependencyComputation.ts +157 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/frontmatterParser.ts +144 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/index.ts +93 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/memoryBankSchema.ts +140 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/projectDetection.ts +132 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/parser/types.ts +241 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/iconHelper.ts +82 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/index.ts +85 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/treeBuilder.ts +289 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/treeProvider.ts +225 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/types.ts +254 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/webviewMessaging.ts +306 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/sidebar/webviewProvider.ts +866 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/state/index.ts +114 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/state/selectors.ts +652 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/state/stateStore.ts +419 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/state/types.ts +311 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/activityFeed.test.ts +269 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/artifactParser.test.ts +440 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/dependencyComputation.test.ts +288 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/frontmatterParser.test.ts +191 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/memoryBankSchema.test.ts +185 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/parser/projectDetection.test.ts +146 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/runTest.ts +20 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/sidebar/treeBuilder.test.ts +329 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/sidebar/types.test.ts +239 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/sidebar/webviewContent.test.ts +67 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/sidebar/webviewMessaging.test.ts +369 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/sidebar/webviewProvider.test.ts +282 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/state/selectors.test.ts +457 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/state/stateStore.test.ts +622 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/watcher/debounce.test.ts +155 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/test/watcher/fileWatcher.test.ts +77 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/watcher/debounce.ts +70 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/watcher/fileWatcher.ts +147 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/watcher/index.ts +39 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/watcher/types.ts +43 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/app.ts +870 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/activity-feed.ts +232 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/activity-item.ts +208 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/bolts-view.ts +201 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/completion-item.ts +299 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/completions-section.ts +197 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/current-bolts.ts +184 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/focus-card.ts +431 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/focus-section.ts +179 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/queue-item.ts +306 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/queue-section.ts +198 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/bolts/stories-list.ts +151 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/shared/base-element.ts +29 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/shared/empty-state.ts +82 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/shared/progress-bar.ts +120 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/shared/progress-ring.ts +100 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/shared/stage-pipeline.ts +133 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/components/tabs/view-tabs.ts +127 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/html.ts +542 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/index.ts +104 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/lit/index.ts +16 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/scripts.ts +241 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/styles/theme.ts +50 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/styles.ts +1194 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/types/messages.ts +40 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/types/vscode.ts +13 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/utils/messaging.ts +57 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/utils.ts +16 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/webview/vscode-api.ts +14 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/welcome/WelcomeViewProvider.ts +254 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/welcome/index.ts +9 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/src/welcome/installHandler.ts +82 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/tsconfig.json +30 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/tsconfig.test.json +26 -0
- package/src/iris_bundle/frameworks/iris-core/vs-code-extension/tsconfig.webview.json +23 -0
package/src/iris_bundle/frameworks/iris-core/memory-bank/research/test_strategy/testing-strategy.md
ADDED
|
@@ -0,0 +1,1057 @@
|
|
|
1
|
+
# Testing Strategy for MD-Based Agentic Systems
|
|
2
|
+
|
|
3
|
+
Research document outlining a comprehensive testing strategy for irismd, addressing the unique challenges of testing CLI tools combined with markdown-based agentic workflows.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Testing Pyramid for AI Agents
|
|
8
|
+
|
|
9
|
+
Traditional testing pyramids don't account for LLM non-determinism. irismd uses an **adapted testing pyramid** designed for AI-native systems:
|
|
10
|
+
|
|
11
|
+
```mermaid
|
|
12
|
+
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
|
|
13
|
+
graph TB
|
|
14
|
+
subgraph pyramid [" "]
|
|
15
|
+
direction TB
|
|
16
|
+
L6["🔍 Release Validation<br/><i>Human Review + Grok 4.1 Fast (free)</i>"]
|
|
17
|
+
L5["🤖 Agent Behavior<br/><i>Promptfoo + OpenRouter Free</i>"]
|
|
18
|
+
L4["🔄 E2E Workflow<br/><i>Custom Test Harness</i>"]
|
|
19
|
+
L3["⚙️ Integration<br/><i>BATS + Node.js</i>"]
|
|
20
|
+
L2["📋 Schema/Contract<br/><i>JSON Schema + markdownlint</i>"]
|
|
21
|
+
L1["🧪 Unit Tests<br/><i>Vitest/Jest</i>"]
|
|
22
|
+
end
|
|
23
|
+
|
|
24
|
+
L6 --> L5
|
|
25
|
+
L5 --> L4
|
|
26
|
+
L4 --> L3
|
|
27
|
+
L3 --> L2
|
|
28
|
+
L2 --> L1
|
|
29
|
+
|
|
30
|
+
style L6 fill:#ff6b6b,stroke:#333,color:#fff
|
|
31
|
+
style L5 fill:#feca57,stroke:#333,color:#333
|
|
32
|
+
style L4 fill:#48dbfb,stroke:#333,color:#333
|
|
33
|
+
style L3 fill:#1dd1a1,stroke:#333,color:#333
|
|
34
|
+
style L2 fill:#5f27cd,stroke:#333,color:#fff
|
|
35
|
+
style L1 fill:#576574,stroke:#333,color:#fff
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
### Technology Stack by Layer
|
|
39
|
+
|
|
40
|
+
| Layer | Technology | Why This Tool |
|
|
41
|
+
|-------|------------|---------------|
|
|
42
|
+
| **Unit Tests** | **Vitest** | Fast, TypeScript-native, Jest-compatible |
|
|
43
|
+
| **Schema/Contract** | **JSON Schema (Ajv)** + **markdownlint** | Validates YAML/MD structure without LLM |
|
|
44
|
+
| **Integration** | **BATS** (Bash Automated Testing) | Tests CLI commands in real shell |
|
|
45
|
+
| **E2E Workflow** | **Custom Node.js harness** | Orchestrates multi-agent flows |
|
|
46
|
+
| **Agent Behavior** | **Promptfoo** + **OpenRouter** | Declarative LLM testing, free models |
|
|
47
|
+
| **Release Validation** | **Grok 4.1 Fast** + **Human** | High-quality eval + manual approval |
|
|
48
|
+
|
|
49
|
+
### Characteristics by Layer
|
|
50
|
+
|
|
51
|
+
As you move up the pyramid, tests become slower and less deterministic, and they run less frequently. Monetary cost stays at zero as long as the free-tier models noted below are used:
|
|
52
|
+
|
|
53
|
+
| Layer | Speed | Cost | Determinism | Volume |
|
|
54
|
+
|-------|-------|------|-------------|--------|
|
|
55
|
+
| **Unit Tests** | ~ms | Free | 100% deterministic | 100s of tests |
|
|
56
|
+
| **Schema/Contract** | ~ms | Free | 100% deterministic | 10s of tests |
|
|
57
|
+
| **Integration** | ~sec | Free | 100% deterministic | 10s of tests |
|
|
58
|
+
| **E2E Workflow** | ~min | Free | ~95% deterministic | 5-10 tests |
|
|
59
|
+
| **Agent Behavior** | ~min | Free* | ~70-85% deterministic | 50+ tests |
|
|
60
|
+
| **Release Validation** | ~hr | Free* | ~70-85% + human | 5-10 tests |
|
|
61
|
+
|
|
62
|
+
*100% free when using OpenRouter free-tier models
|
|
63
|
+
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## Layer Summary
|
|
67
|
+
|
|
68
|
+
| Layer | What It Tests | Tools | Frequency | Cost | Pass Threshold |
|
|
69
|
+
|-------|---------------|-------|-----------|------|----------------|
|
|
70
|
+
| **Unit** | Installers, utilities, pure functions | Jest/Vitest | Every PR | Free | 100% |
|
|
71
|
+
| **Schema/Contract** | Markdown structure, YAML validity | JSON Schema, markdownlint | Every PR | Free | 100% |
|
|
72
|
+
| **Integration** | CLI commands, file operations | BATS, filesystem mocks | Every PR | Free | 100% |
|
|
73
|
+
| **E2E Workflow** | Full Inception→Construction flow | Custom harness | On main merge | Free | 100% |
|
|
74
|
+
| **Agent Behavior** | LLM output quality, rubric scoring | Promptfoo + OpenRouter free | Nightly | Free* | ≥85% |
|
|
75
|
+
| **Release Validation** | Gold standard comparison, human review | Promptfoo + Grok 4.1 Fast (free) | Weekly/Release | Free* | ≥90% + Human approval |
|
|
76
|
+
|
|
77
|
+
*100% free when using OpenRouter free-tier models
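
The pass thresholds in the table can be read as a gate on the fraction of passing tests per layer. A minimal sketch, assuming each layer's results arrive as a list of booleans; the `THRESHOLDS` mapping and the `pass_rate` and `layer_passes` helpers are hypothetical names used for illustration, not part of irismd:

```python
# Hypothetical gate: thresholds copied from the Layer Summary table.
THRESHOLDS = {
    "unit": 1.00,
    "schema": 1.00,
    "integration": 1.00,
    "e2e": 1.00,
    "agent_behavior": 0.85,
    "release_validation": 0.90,  # human approval is still required separately
}


def pass_rate(results):
    """Return the fraction of passing tests; an empty run counts as 0.0."""
    return sum(results) / len(results) if results else 0.0


def layer_passes(layer, results):
    """True when the layer's pass rate meets its threshold."""
    return pass_rate(results) >= THRESHOLDS[layer]


if __name__ == "__main__":
    nightly = [True] * 44 + [False] * 6  # 44 of 50 golden cases passed
    print(layer_passes("agent_behavior", nightly))
```

Treating an empty run as a failure is a deliberate choice in this sketch: a nightly job that silently ran zero tests should not count as green.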
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## Frequency & Triggers
|
|
82
|
+
|
|
83
|
+
```text
|
|
84
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
85
|
+
│ TEST EXECUTION SCHEDULE │
|
|
86
|
+
├─────────────────────────────────────────────────────────────────────────┤
|
|
87
|
+
│ │
|
|
88
|
+
│ EVERY PR / PUSH (~2 min) │
|
|
89
|
+
│ ├── Unit Tests ✓ Must pass to merge │
|
|
90
|
+
│ ├── Schema/Contract Tests ✓ Must pass to merge │
|
|
91
|
+
│ └── Integration Tests ✓ Must pass to merge │
|
|
92
|
+
│ │
|
|
93
|
+
│ ON MERGE TO MAIN (~5 min) │
|
|
94
|
+
│ ├── All PR tests ✓ Re-run for safety │
|
|
95
|
+
│ ├── E2E Workflow Tests ✓ Full flow validation │
|
|
96
|
+
│ └── Quick Agent Eval (5 samples) ✓ Smoke test with free model │
|
|
97
|
+
│ │
|
|
98
|
+
│ NIGHTLY AT 2 AM (~30 min) │
|
|
99
|
+
│ ├── Full Golden Dataset ✓ 50+ test cases │
|
|
100
|
+
│ ├── Regression Check ✓ Compare to baseline │
|
|
101
|
+
│ └── Generate Quality Report ✓ Track trends │
|
|
102
|
+
│ │
|
|
103
|
+
│ WEEKLY / PRE-RELEASE (~1 hr) │
|
|
104
|
+
│ ├── Comprehensive Evaluation ✓ 3x runs, averaged │
|
|
105
|
+
│ ├── Grok 4.1 Fast Validation ✓ Free high-quality judge │
|
|
106
|
+
│ └── Human Review Gate ✓ Manual approval required │
|
|
107
|
+
│ │
|
|
108
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## Developer Workflow
|
|
114
|
+
|
|
115
|
+
### Workflow 1: Adding a New Feature
|
|
116
|
+
|
|
117
|
+
When implementing a new capability (e.g., a new skill or a new agent behavior):
|
|
118
|
+
|
|
119
|
+
```text
|
|
120
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
121
|
+
│ ADDING A NEW FEATURE │
|
|
122
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
123
|
+
|
|
124
|
+
1. WRITE SPECIFICATION
|
|
125
|
+
└── Create intent/unit/stories in memory-bank/
|
|
126
|
+
└── Schema tests will validate structure automatically
|
|
127
|
+
|
|
128
|
+
2. IMPLEMENT FEATURE
|
|
129
|
+
└── Add skill files, agent updates, etc.
|
|
130
|
+
└── Write unit tests for any new utilities
|
|
131
|
+
|
|
132
|
+
3. ADD GOLDEN EXAMPLE
|
|
133
|
+
└── Create input/output pair in __tests__/golden-datasets/
|
|
134
|
+
└── This becomes your regression baseline
|
|
135
|
+
|
|
136
|
+
4. ADD ASSERTIONS (promptfoo.yaml)
|
|
137
|
+
└── Add test case with:
|
|
138
|
+
- vars: your input
|
|
139
|
+
- assert: structure checks + llm-rubric for quality
|
|
140
|
+
|
|
141
|
+
5. RUN LOCALLY
|
|
142
|
+
└── `npm run test` # Unit + schema + integration
|
|
143
|
+
└── `promptfoo eval` # Agent behavior tests
|
|
144
|
+
└── `promptfoo view` # Review results in UI
|
|
145
|
+
|
|
146
|
+
6. OPEN PR
|
|
147
|
+
└── CI runs: Unit → Schema → Integration
|
|
148
|
+
└── Must pass 100%
|
|
149
|
+
|
|
150
|
+
7. MERGE TO MAIN
|
|
151
|
+
└── CI runs: E2E + Quick Agent Eval
|
|
152
|
+
└── Feature is live
|
|
153
|
+
|
|
154
|
+
8. NIGHTLY
|
|
155
|
+
└── Full golden dataset runs
|
|
156
|
+
└── Your new test is part of regression suite
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
**Checklist for New Features**:
|
|
160
|
+
|
|
161
|
+
- [ ] Spec files follow schema (`memory-bank/intents/.../`)
|
|
162
|
+
- [ ] Unit tests for new utilities
|
|
163
|
+
- [ ] Golden dataset input/output pair added
|
|
164
|
+
- [ ] Promptfoo test case with assertions
|
|
165
|
+
- [ ] Local tests pass
|
|
166
|
+
- [ ] PR tests pass
|
|
167
|
+
|
|
168
|
+
---
|
|
169
|
+
|
|
170
|
+
### Workflow 2: Changing Agent Behavior
|
|
171
|
+
|
|
172
|
+
When modifying how an agent responds (e.g., changing a prompt or an output format):
|
|
173
|
+
|
|
174
|
+
```text
|
|
175
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
176
|
+
│ CHANGING AGENT BEHAVIOR │
|
|
177
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
178
|
+
|
|
179
|
+
1. IDENTIFY IMPACT
|
|
180
|
+
└── Which golden dataset examples will be affected?
|
|
181
|
+
└── Which assertions might break?
|
|
182
|
+
|
|
183
|
+
2. RUN BASELINE
|
|
184
|
+
└── `promptfoo eval --output baseline.json`
|
|
185
|
+
└── Save current scores for comparison
|
|
186
|
+
|
|
187
|
+
3. MAKE CHANGES
|
|
188
|
+
└── Update agent/skill markdown files
|
|
189
|
+
└── Update prompt templates
|
|
190
|
+
|
|
191
|
+
4. RUN COMPARISON
|
|
192
|
+
└── `promptfoo eval --output after.json`
|
|
193
|
+
└── Compare scores: Are they better/same/worse?
|
|
194
|
+
|
|
195
|
+
5. UPDATE GOLDEN DATASET (if intentional change)
|
|
196
|
+
└── Review outputs manually
|
|
197
|
+
└── If new behavior is correct, update golden outputs
|
|
198
|
+
└── Update assertions if output format changed
|
|
199
|
+
|
|
200
|
+
6. UPDATE ASSERTIONS (if needed)
|
|
201
|
+
└── Adjust rubrics for new expected behavior
|
|
202
|
+
└── Update structural assertions
|
|
203
|
+
|
|
204
|
+
7. DOCUMENT CHANGE
|
|
205
|
+
└── Add to CHANGELOG
|
|
206
|
+
└── Note in PR: "Intentional behavior change"
|
|
207
|
+
|
|
208
|
+
8. RUN FULL EVAL
|
|
209
|
+
└── `npm run eval:full`
|
|
210
|
+
└── Ensure no unexpected regressions
|
|
211
|
+
|
|
212
|
+
9. OPEN PR
|
|
213
|
+
└── Include before/after comparison
|
|
214
|
+
└── Note which tests were updated and why
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
**Checklist for Behavior Changes**:
|
|
218
|
+
|
|
219
|
+
- [ ] Baseline captured before changes
|
|
220
|
+
- [ ] After scores compared to baseline
|
|
221
|
+
- [ ] Golden outputs updated (if intentional)
|
|
222
|
+
- [ ] Assertions updated (if format changed)
|
|
223
|
+
- [ ] No unexpected regressions
|
|
224
|
+
- [ ] Change documented in PR
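
The before/after comparison in steps 2 and 4 can be sketched as follows. This assumes each run has already been reduced to a flat `{test name: score}` mapping; that shape is an assumption for illustration and is not promptfoo's actual output schema, so a real script would first have to extract scores from `baseline.json` and `after.json`. The `compare_scores` helper and its `tolerance` parameter are hypothetical:

```python
def compare_scores(baseline, after, tolerance=0.05):
    """Classify each baseline test as improved, regressed, unchanged, or missing.

    `baseline` and `after` map test names to scores in [0, 1].
    A change within `tolerance` is treated as LLM noise, not a real shift.
    """
    report = {"improved": [], "regressed": [], "unchanged": [], "missing": []}
    for name, old in sorted(baseline.items()):
        if name not in after:
            report["missing"].append(name)
            continue
        delta = after[name] - old
        if delta > tolerance:
            report["improved"].append(name)
        elif delta < -tolerance:
            report["regressed"].append(name)
        else:
            report["unchanged"].append(name)
    return report


if __name__ == "__main__":
    baseline = {"intent-create": 0.82, "unit-generation": 0.90, "routing": 0.75}
    after = {"intent-create": 0.91, "unit-generation": 0.70, "routing": 0.76}
    for bucket, names in compare_scores(baseline, after).items():
        print(bucket, names)
```

The tolerance band matters because the same prompt can score slightly differently on consecutive runs; without it, every behavior change would appear to regress something.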
|
|
225
|
+
|
|
226
|
+
---
|
|
227
|
+
|
|
228
|
+
### Workflow 3: Fixing a Bug
|
|
229
|
+
|
|
230
|
+
When fixing incorrect agent behavior:
|
|
231
|
+
|
|
232
|
+
```text
|
|
233
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
234
|
+
│ FIXING A BUG │
|
|
235
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
236
|
+
|
|
237
|
+
1. REPRODUCE BUG
|
|
238
|
+
└── Create test case that demonstrates the bug
|
|
239
|
+
└── Add to promptfoo.yaml with `assert` that fails
|
|
240
|
+
|
|
241
|
+
2. WRITE FAILING TEST FIRST
|
|
242
|
+
└── Test should fail with current code
|
|
243
|
+
└── `promptfoo eval` → See failure
|
|
244
|
+
|
|
245
|
+
3. FIX THE BUG
|
|
246
|
+
└── Update agent/skill files
|
|
247
|
+
|
|
248
|
+
4. VERIFY FIX
|
|
249
|
+
└── `promptfoo eval` → Test now passes
|
|
250
|
+
└── No regressions in other tests
|
|
251
|
+
|
|
252
|
+
5. ADD TO GOLDEN DATASET
|
|
253
|
+
└── Bug reproduction case becomes regression test
|
|
254
|
+
└── Prevents bug from returning
|
|
255
|
+
|
|
256
|
+
6. OPEN PR
|
|
257
|
+
└── Include: Bug description, test case, fix
|
|
258
|
+
```
|
|
259
|
+
|
|
260
|
+
**Checklist for Bug Fixes**:
|
|
261
|
+
|
|
262
|
+
- [ ] Failing test created first
|
|
263
|
+
- [ ] Fix makes test pass
|
|
264
|
+
- [ ] No regressions
|
|
265
|
+
- [ ] Test added to golden dataset
|
|
266
|
+
|
|
267
|
+
---
|
|
268
|
+
|
|
269
|
+
### Workflow 4: Updating Golden Dataset
|
|
270
|
+
|
|
271
|
+
When golden examples need a refresh:
|
|
272
|
+
|
|
273
|
+
```text
|
|
274
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
275
|
+
│ UPDATING GOLDEN DATASET │
|
|
276
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
277
|
+
|
|
278
|
+
1. IDENTIFY STALE EXAMPLES
|
|
279
|
+
└── Run full eval, check similarity scores
|
|
280
|
+
└── Examples with <0.7 similarity may be outdated
|
|
281
|
+
|
|
282
|
+
2. REGENERATE OUTPUTS
|
|
283
|
+
└── Run agent with same inputs
|
|
284
|
+
└── Capture new outputs
|
|
285
|
+
|
|
286
|
+
3. HUMAN REVIEW
|
|
287
|
+
└── Compare old vs new outputs
|
|
288
|
+
└── Is new output actually better?
|
|
289
|
+
└── Get second opinion if unsure
|
|
290
|
+
|
|
291
|
+
4. UPDATE GOLDEN FILE
|
|
292
|
+
└── Replace __tests__/golden-datasets/.../output.md
|
|
293
|
+
└── Update timestamp in file
|
|
294
|
+
|
|
295
|
+
5. RE-RUN EVAL
|
|
296
|
+
└── `promptfoo eval`
|
|
297
|
+
└── Similarity should now be high
|
|
298
|
+
|
|
299
|
+
6. COMMIT WITH JUSTIFICATION
|
|
300
|
+
└── PR message: Why golden was updated
|
|
301
|
+
└── Include diff of old vs new
|
|
302
|
+
```
|
|
303
|
+
|
|
304
|
+
---
|
|
305
|
+
|
|
306
|
+
## Quick Commands Reference
|
|
307
|
+
|
|
308
|
+
```bash
|
|
309
|
+
# Development
|
|
310
|
+
npm run test # Unit + Schema + Integration (fast)
|
|
311
|
+
npm run test:watch # Watch mode for TDD
|
|
312
|
+
|
|
313
|
+
# Agent Evaluation
|
|
314
|
+
promptfoo eval # Run agent tests
|
|
315
|
+
promptfoo eval --no-cache # Bypass cached results (caching is on by default)
|
|
316
|
+
promptfoo view # Open web UI to review
|
|
317
|
+
|
|
318
|
+
# CI/CD Simulation
|
|
319
|
+
npm run test:ci # Full PR test suite
|
|
320
|
+
npm run eval:agents # Quick agent eval (5 samples)
|
|
321
|
+
npm run eval:full # Full golden dataset
|
|
322
|
+
|
|
323
|
+
# Comparison
|
|
324
|
+
promptfoo eval --output before.json
|
|
325
|
+
# make changes
|
|
326
|
+
promptfoo eval --output after.json
|
|
327
|
+
# compare in UI
|
|
328
|
+
|
|
329
|
+
# Debugging
|
|
330
|
+
promptfoo eval --verbose # See full outputs
|
|
331
|
+
promptfoo eval --filter-pattern "intent" # Run only tests whose description matches
|
|
332
|
+
```
|
|
333
|
+
|
|
334
|
+
---
|
|
335
|
+
|
|
336
|
+
## Executive Summary
|
|
337
|
+
|
|
338
|
+
irismd presents a unique testing challenge with three distinct layers:
|
|
339
|
+
|
|
340
|
+
1. **CLI Tool Layer** (deterministic) - Commands, installers, file operations
|
|
341
|
+
2. **Markdown Specification Layer** (semi-deterministic) - Schema validation, structure compliance
|
|
342
|
+
3. **Agent Behavior Layer** (non-deterministic) - LLM-driven outputs and decisions
|
|
343
|
+
|
|
344
|
+
Traditional testing approaches fail for agent behaviors due to LLM non-determinism. This document proposes a multi-layer strategy combining specification-driven testing, behavioral assertions, and LLM-as-judge evaluation.
|
|
345
|
+
|
|
346
|
+
---
|
|
347
|
+
|
|
348
|
+
## Industry Context
|
|
349
|
+
|
|
350
|
+
### Current State of AI Agent Testing (2024-2025)
|
|
351
|
+
|
|
352
|
+
- **70% of enterprises** projected to adopt Agentic AI in testing by 2027 ([Gartner](https://www.gartner.com/en/newsroom/press-releases/2024-10-21-gartner-predicts-70-percent-of-enterprises-will-implement-ai-driven-testing-by-2027))
|
|
353
|
+
- **72% of QA teams** exploring AI-driven testing workflows ([Test Guild 2025](https://testguild.com/state-of-testing-2025/))
|
|
354
|
+
- **Terminal-Bench** (Stanford/Laude Institute, May 2025) - Evaluates agents in real CLI environments ([Terminal-Bench](https://ainativedev.io/news/8-benchmarks-shaping-the-next-generation-of-ai-agents))
|
|
355
|
+
- **SWT-Bench** (LogicStar AI, October 2024) - Evaluates agent test suite generation ([SWT-Bench Paper](https://arxiv.org/abs/2410.03859))
|
|
356
|
+
- **Specification-Driven Development (SDD)** emerging as key methodology for AI-native development ([GitHub Blog](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-using-markdown-as-a-programming-language-when-building-with-ai/))
|
|
357
|
+
|
|
358
|
+
### Key Insight
|
|
359
|
+
|
|
360
|
+
> "There remains a fundamental superiority of specifications over tests: you can derive tests from a specification, but not the other way around."
|
|
361
|
+
> — Contract-Driven Development Research
|
|
362
|
+
|
|
363
|
+
This makes irismd's specification-first approach inherently testable.
|
|
364
|
+
|
|
365
|
+
---
|
|
366
|
+
|
|
367
|
+
## Testing Layers
|
|
368
|
+
|
|
369
|
+
### Layer 1: Specification Contract Testing
|
|
370
|
+
|
|
371
|
+
**Priority: Highest**
|
|
372
|
+
|
|
373
|
+
Since irismd is specification-driven, markdown specs serve as contracts:
|
|
374
|
+
|
|
375
|
+
```text
|
|
376
|
+
memory-bank/
|
|
377
|
+
├── intents/ → Contract: Each intent follows schema
|
|
378
|
+
├── units/ → Contract: Each unit has required fields
|
|
379
|
+
├── standards/ → Contract: Standards follow template
|
|
380
|
+
└── memory-bank.yaml → Contract: Schema definition
|
|
381
|
+
```
|
|
382
|
+
|
|
383
|
+
#### Implementation
|
|
384
|
+
|
|
385
|
+
| Component | Validation Method |
|
|
386
|
+
|-----------|------------------|
|
|
387
|
+
| `memory-bank.yaml` | JSON Schema validation |
|
|
388
|
+
| Intent artifacts | Required fields check |
|
|
389
|
+
| Unit briefs | Template compliance |
|
|
390
|
+
| Bolt templates | Structure validation |
|
|
391
|
+
|
|
392
|
+
#### Tools
|
|
393
|
+
|
|
394
|
+
- **JSON Schema / Ajv** - YAML/JSON schema validation
|
|
395
|
+
- **markdownlint** - Markdown structure enforcement
|
|
396
|
+
- **remark-lint** - Markdown AST validation
|
|
397
|
+
- **Custom validators** - irismd-specific schema checks
|
|
398
|
+
|
|
399
|
+
#### Example Test
|
|
400
|
+
|
|
401
|
+
```javascript
|
|
402
|
+
describe('Intent Specification Contract', () => {
|
|
403
|
+
it('should contain required sections', () => {
|
|
404
|
+
const intent = loadMarkdown('memory-bank/intents/001-multi-agent/requirements.md');
|
|
405
|
+
|
|
406
|
+
expect(intent).toHaveSection('## Problem Statement');
|
|
407
|
+
expect(intent).toHaveSection('## Success Criteria');
|
|
408
|
+
expect(intent).toHaveSection('## Acceptance Criteria');
|
|
409
|
+
});
|
|
410
|
+
});
|
|
411
|
+
```
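
`toHaveSection` is not a built-in Jest or Vitest matcher, so the test above presumes a custom matcher. The underlying check can be sketched as below, written in Python to match the later examples in this document. `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, and the sketch does not skip headings that appear inside fenced code blocks:

```python
import re

# Section headings taken from the example test above.
REQUIRED_SECTIONS = ["Problem Statement", "Success Criteria", "Acceptance Criteria"]

# Matches level-2 ATX headings such as "## Problem Statement" (optionally closed with "##").
H2_PATTERN = re.compile(r"^##[ \t]+(.+?)[ \t]*#*[ \t]*$", re.MULTILINE)


def missing_sections(markdown, required=REQUIRED_SECTIONS):
    """Return the required level-2 headings that are absent from `markdown`."""
    found = {title.strip() for title in H2_PATTERN.findall(markdown)}
    return [title for title in required if title not in found]


if __name__ == "__main__":
    doc = "# Intent\n\n## Problem Statement\n...\n\n## Success Criteria\n...\n"
    print(missing_sections(doc))
```

Returning the list of missing headings, rather than a boolean, lets the test failure message name exactly which contract clause was violated.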
|
|
412
|
+
|
|
413
|
+
---
|
|
414
|
+
|
|
415
|
+
### Layer 2: CLI & Installer Testing
|
|
416
|
+
|
|
417
|
+
**Priority: High**
|
|
418
|
+
|
|
419
|
+
Traditional testing works well for deterministic CLI operations.
|
|
420
|
+
|
|
421
|
+
#### Components to Test
|
|
422
|
+
|
|
423
|
+
| Component | Testing Approach |
|
|
424
|
+
|-----------|------------------|
|
|
425
|
+
| Installers | Unit tests with mocked filesystem |
|
|
426
|
+
| Slash commands | Integration tests with BATS |
|
|
427
|
+
| File operations | Snapshot tests for generated files |
|
|
428
|
+
| Memory Bank CRUD | Property-based tests for schema compliance |
|
|
429
|
+
|
|
430
|
+
#### Tools
|
|
431
|
+
|
|
432
|
+
- **BATS** (Bash Automated Testing System) - CLI interaction testing
|
|
433
|
+
- **Jest/Vitest** - Snapshot testing for generated files
|
|
434
|
+
- **cram** - CLI output comparison
|
|
435
|
+
- **pytest** with `click.testing.CliRunner` - Python CLI testing
|
|
436
|
+
|
|
437
|
+
#### Example BATS Test
|
|
438
|
+
|
|
439
|
+
```bash
|
|
440
|
+
#!/usr/bin/env bats
|
|
441
|
+
|
|
442
|
+
@test "install command creates .irismd directory" {
|
|
443
|
+
run npx irismd install
|
|
444
|
+
[ "$status" -eq 0 ]
|
|
445
|
+
[ -d ".irismd" ]
|
|
446
|
+
[ -f ".irismd/memory-bank.yaml" ]
|
|
447
|
+
}
|
|
448
|
+
|
|
449
|
+
@test "inception agent creates intent directory" {
|
|
450
|
+
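# Assumes a hypothetical wrapper that makes the slash command callable from a shell;
# as written, bash would treat `/inception` as an absolute path.
run /inception intent-create --name="test-intent"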
|
|
451
|
+
[ "$status" -eq 0 ]
|
|
452
|
+
[ -d "memory-bank/intents/test-intent" ]
|
|
453
|
+
}
|
|
454
|
+
```
|
|
455
|
+
|
|
456
|
+
---
|
|
457
|
+
|
|
458
|
+
### Layer 3: Agent Behavior Testing
|
|
459
|
+
|
|
460
|
+
**Priority: Critical (Most Complex)**
|
|
461
|
+
|
|
462
|
+
LLM outputs are non-deterministic. Traditional exact-match testing fails.
|
|
463
|
+
|
|
464
|
+
#### The Non-Determinism Challenge
|
|
465
|
+
|
|
466
|
+
> "Feed the same input to an LLM multiple times, and you might get different outputs each time. This non-determinism makes traditional unit testing approaches ineffective."
|
|
467
|
+
|
|
468
|
+
#### Tiered Approach
|
|
469
|
+
|
|
470
|
+
**Tier A: Mock LLM Responses for CI/CD**
|
|
471
|
+
|
|
472
|
+
Record and replay known-good LLM responses:
|
|
473
|
+
|
|
474
|
+
```text
|
|
475
|
+
__tests__/fixtures/mock-responses/
|
|
476
|
+
├── inception-agent/
|
|
477
|
+
│ ├── intent-create-success.json
|
|
478
|
+
│ ├── requirements-elicitation.json
|
|
479
|
+
│ └── unit-generation.json
|
|
480
|
+
├── construction-agent/
|
|
481
|
+
│ ├── bolt-plan-generation.json
|
|
482
|
+
│ └── story-breakdown.json
|
|
483
|
+
└── master-agent/
|
|
484
|
+
└── request-routing.json
|
|
485
|
+
```
|
|
486
|
+
|
|
487
|
+
Tools: `vcrpy`, `responses`, `nock`, `msw`
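
Record-and-replay can also be sketched without any of these libraries. The example below is a minimal illustration, assuming one JSON fixture per prompt, named by a hash of the prompt text. The `ReplayClient` class, the fixture layout, and the `live_call` callable are all hypothetical and differ from the named fixture files shown above:

```python
import hashlib
import json
import tempfile
from pathlib import Path


class ReplayClient:
    """Serve recorded LLM responses from disk; call the live model only when recording."""

    def __init__(self, fixture_dir, live_call=None):
        self.fixture_dir = Path(fixture_dir)
        self.live_call = live_call  # callable(prompt) -> str, or None for replay-only

    def _path_for(self, prompt):
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
        return self.fixture_dir / f"{digest}.json"

    def complete(self, prompt):
        path = self._path_for(prompt)
        if path.exists():
            # Replay: return the stored response without touching the model.
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        if self.live_call is None:
            raise LookupError(f"No recorded response for prompt: {prompt!r}")
        # Record: call the model once and persist the result for later runs.
        response = self.live_call(prompt)
        self.fixture_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps({"prompt": prompt, "response": response}), encoding="utf-8")
        return response


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        recorder = ReplayClient(tmp, live_call=lambda p: f"stub answer to: {p}")
        recorder.complete("create intent user-auth")  # records the fixture
        replayer = ReplayClient(tmp)  # replay-only, as in CI
        print(replayer.complete("create intent user-auth"))
```

Running CI in replay-only mode makes a missing fixture fail loudly instead of silently spending API calls.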
|
|
488
|
+
|
|
489
|
+
**Tier B: LLM-as-Judge Evaluation**
|
|
490
|
+
|
|
491
|
+
Use an LLM to evaluate agent outputs against rubrics:
|
|
492
|
+
|
|
493
|
+
```yaml
|
|
494
|
+
# rubrics/inception-agent-quality.yaml
|
|
495
|
+
evaluations:
|
|
496
|
+
- name: "Intent Completeness"
|
|
497
|
+
criteria: "Does the generated intent contain problem statement, goals, and success criteria?"
|
|
498
|
+
threshold: 0.8
|
|
499
|
+
|
|
500
|
+
- name: "Requirement Clarity"
|
|
501
|
+
criteria: "Are requirements specific, measurable, and testable?"
|
|
502
|
+
threshold: 0.75
|
|
503
|
+
```
|
|
504
|
+
|
|
505
|
+
Tools: **DeepEval**, **Promptfoo**, **LangWatch**
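
How the `threshold` values in the rubric might be enforced is sketched below. The judge is passed in as a plain callable so the example runs offline; `evaluate_with_rubric` and the stub judge are hypothetical, and a real setup would delegate scoring to one of the tools listed above:

```python
# Rubric entries mirror rubrics/inception-agent-quality.yaml above.
RUBRIC = [
    {
        "name": "Intent Completeness",
        "criteria": "Does the generated intent contain problem statement, goals, and success criteria?",
        "threshold": 0.8,
    },
    {
        "name": "Requirement Clarity",
        "criteria": "Are requirements specific, measurable, and testable?",
        "threshold": 0.75,
    },
]


def evaluate_with_rubric(output, rubric, judge):
    """Score `output` on every rubric entry; `judge(output, criteria)` returns a float in [0, 1]."""
    results = []
    for entry in rubric:
        score = judge(output, entry["criteria"])
        results.append(
            {"name": entry["name"], "score": score, "passed": score >= entry["threshold"]}
        )
    return results


if __name__ == "__main__":
    def stub_judge(output, criteria):
        # Stand-in for an LLM call: fixed scores keyed by a phrase in the criteria.
        return 0.9 if "problem statement" in criteria else 0.7

    for row in evaluate_with_rubric("## Problem Statement ...", RUBRIC, stub_judge):
        print(row)
```

Keeping the judge injectable also makes it straightforward to swap providers without touching the rubric or the pass/fail logic.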
|
|
506
|
+
|
|
507
|
+
**LLM-as-Judge Provider Options**:
|
|
508
|
+
|
|
509
|
+
| Provider | Model | Cost | Best For |
|
|
510
|
+
|----------|-------|------|----------|
|
|
511
|
+
| **OpenRouter Free** | `x-ai/grok-4.1-fast:free` | $0 | **Coding agents**, agentic tasks (2M context) |
|
|
512
|
+
| **OpenRouter Free** | `qwen/qwen3-coder:free` | $0 | **Code-specialized** evaluation |
|
|
513
|
+
| **OpenRouter Free** | `meta-llama/llama-3.3-70b-instruct:free` | $0 | High-quality reasoning |
|
|
514
|
+
| **OpenRouter Free** | `meta-llama/llama-4-maverick:free` | $0 | General + coding (1M context) |
|
|
515
|
+
| **OpenRouter Free** | `google/gemma-3-27b-it:free` | $0 | Fast, good quality |
|
|
516
|
+
| **Claude Code** | Claude Opus 4 / Sonnet 4 | Subscription | Release validation (gold standard) |
|
|
517
|
+
|
|
518
|
+
**Recommended**: Use free OpenRouter models for CI/CD and Claude Code for final release validation.
|
|
519
|
+
|
|
520
|
+
**Tier C: Semantic Similarity Tests**
|
|
521
|
+
|
|
522
|
+
Compare outputs to reference examples using embeddings:
|
|
523
|
+
|
|
524
|
+
```python
|
|
525
|
+
def test_intent_semantic_similarity():
|
|
526
|
+
generated = inception_agent.create_intent(prompt)
|
|
527
|
+
reference = load_golden_intent("user-auth-intent.md")
|
|
528
|
+
|
|
529
|
+
similarity = cosine_similarity(embed(generated), embed(reference))
|
|
530
|
+
assert similarity > 0.85, f"Semantic similarity {similarity} below threshold"
|
|
531
|
+
```
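
The test above leaves `embed` and `cosine_similarity` undefined. A self-contained sketch of the comparison follows, using a bag-of-words count vector as a stand-in for a real embedding model. That substitution is an assumption made so the example runs without network access; it captures only word overlap, not meaning, so the 0.85 threshold above would not transfer to it directly:

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real test would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    generated = "Users must authenticate with email and password"
    reference = "Users authenticate with email and a password"
    print(round(cosine_similarity(embed(generated), embed(reference)), 3))
```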
|
|
532
|
+
|
|
533
|
+
**Tier D: Behavioral Assertions**
|
|
534
|
+
|
|
535
|
+
Test for properties, not exact matches:
|
|
536
|
+
|
|
537
|
+
```python
def test_agent_output_structure():
    output = agent.execute(prompt)

    # Structure assertions
    assert "## Problem Statement" in output
    assert len(output) < 5000  # Reasonable length
    assert not contains_code_injection(output)

    # Semantic assertions
    assert_contains_concept(output, "authentication")
    assert_professional_tone(output)
```
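
The helpers named in the test above are not defined in this document. One possible keyword-level sketch of two of them is shown here. The pattern list and the `synonyms` parameter are assumptions for illustration; a tone check such as `assert_professional_tone` would more plausibly delegate to an LLM judge.

```python
import re

# Illustrative patterns only; a real check would need a vetted list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"<script\b", re.IGNORECASE),
    re.compile(r"\brm\s+-rf\s+/"),
    re.compile(r"\beval\s*\("),
]


def contains_code_injection(output: str) -> bool:
    """Return True if any suspicious pattern appears in the output."""
    return any(p.search(output) for p in SUSPICIOUS_PATTERNS)


def assert_contains_concept(output: str, concept: str, synonyms=()) -> None:
    """Keyword-level concept check: the concept or one synonym must appear."""
    haystack = output.lower()
    terms = [concept, *synonyms]
    if not any(term.lower() in haystack for term in terms):
        raise AssertionError(f"Output does not mention {concept!r}")
```

Substring matching is cheap and deterministic, which suits the PR tier; it will miss paraphrases that an embedding or judge-based check would catch.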
|
|
550
|
+
|
|
551
|
+
---
|
|
552
|
+
|
|
553
|
+
### Layer 4: End-to-End Workflow Testing
|
|
554
|
+
|
|
555
|
+
**Priority: High**
|
|
556
|
+
|
|
557
|
+
Test complete AI-DLC flows across phases:
|
|
558
|
+
|
|
559
|
+
```text
|
|
560
|
+
Inception → Construction → Operations
|
|
561
|
+
```
|
|
562
|
+
|
|
563
|
+
#### Test Scenarios
|
|
564
|
+
|
|
565
|
+
| Scenario | Steps | Validation |
|----------|-------|------------|
| Happy Path | Prompt → Intent → Units → Bolts → Code | All artifacts created correctly |
| Partial Completion | Start inception, interrupt, resume | State persisted, resumable |
| Error Recovery | Invalid input → Error → Retry | Graceful degradation |
| Multi-Agent Coordination | Master routes → Inception executes | Correct agent selection |
|
|
571
|
+
|
|
572
|
+
#### Implementation
|
|
573
|
+
|
|
574
|
+
```python
class TestAIDLCWorkflow:

    def test_full_inception_to_construction_flow(self, memory_bank):
        # Inception Phase
        result = master_agent.route("/inception intent-create --name='user-auth'")
        assert result.phase == "inception"
        assert memory_bank.has_intent("user-auth")

        # Unit elaboration
        result = inception_agent.execute("/inception units")
        assert memory_bank.intent("user-auth").has_units()

        # Transition to Construction
        result = master_agent.route("/construction bolt-plan")
        assert result.phase == "construction"
        assert memory_bank.has_bolts()
```
|
|
592
|
+
|
|
593
|
+
---
|
|
594
|
+
|
|
595
|
+
## Golden Dataset Strategy
|
|
596
|
+
|
|
597
|
+
### What is a Golden Dataset?
|
|
598
|
+
|
|
599
|
+
A curated collection of inputs and their ideal outputs for regression testing.
|
|
600
|
+
|
|
601
|
+
### Building the Dataset
|
|
602
|
+
|
|
603
|
+
| Agent | Minimum Examples | Ideal Coverage |
|-------|-----------------|----------------|
| Inception Agent | 10 | 50+ |
| Construction Agent | 10 | 50+ |
| Operations Agent | 5 | 25+ |
| Master Agent (Routing) | 20 | 100+ |
|
|
609
|
+
|
|
610
|
+
### Structure
|
|
611
|
+
|
|
612
|
+
```text
__tests__/golden-datasets/
├── inception/
│   ├── inputs/
│   │   ├── 001-simple-feature.txt
│   │   ├── 002-complex-system.txt
│   │   └── 003-refactoring-request.txt
│   └── outputs/
│       ├── 001-simple-feature-intent.md
│       ├── 002-complex-system-intent.md
│       └── 003-refactoring-request-intent.md
├── construction/
│   ├── inputs/
│   └── outputs/
└── evaluation-rubrics/
    ├── intent-quality.yaml
    ├── unit-completeness.yaml
    └── bolt-validity.yaml
```
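
Given this layout, a test harness needs to pair each input with its golden output. A minimal sketch, assuming the naming convention shown above (`NNN-name.txt` maps to `NNN-name-intent.md`); the function name `load_golden_pairs` and its `suffix` parameter are hypothetical:

```python
from pathlib import Path


def load_golden_pairs(phase_dir: Path, suffix: str = "-intent.md"):
    """Yield (case_id, input_text, expected_text) for each golden case."""
    inputs_dir = phase_dir / "inputs"
    outputs_dir = phase_dir / "outputs"
    for input_path in sorted(inputs_dir.glob("*.txt")):
        expected_path = outputs_dir / f"{input_path.stem}{suffix}"
        if not expected_path.exists():
            # Fail loudly: an input without a golden output is a dataset bug.
            raise FileNotFoundError(f"No golden output for {input_path.name}")
        yield (
            input_path.stem,
            input_path.read_text(encoding="utf-8"),
            expected_path.read_text(encoding="utf-8"),
        )
```

Raising on a missing output, rather than skipping the case, keeps the dataset from silently shrinking as files are renamed.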
|
|
631
|
+
|
|
632
|
+
### Regression Testing Workflow
|
|
633
|
+
|
|
634
|
+
```mermaid
graph LR
    A[Code Change] --> B[Run Deterministic Tests]
    B --> C{Pass?}
    C -->|Yes| D[Run Golden Dataset Evaluation]
    C -->|No| E[Fail Fast]
    D --> F[Compare to Baselines]
    F --> G{Regression?}
    G -->|No| H[Pass]
    G -->|Yes| I[Alert + Review]
```
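
The "Compare to Baselines" step can be sketched as a pure function over per-case scores. This is an illustration under assumptions: the function name `find_regressions` is hypothetical, scores are taken to be floats in [0, 1], and a regression is defined as a relative drop above 5%, matching the alert threshold in the metrics table later in this document.

```python
def find_regressions(baseline: dict, current: dict, max_drop: float = 0.05) -> dict:
    """Return {case_id: (baseline_score, current_score)} for regressed cases.

    A case regresses when its score falls by more than `max_drop` relative
    to the baseline, or when it is missing from the current run.
    """
    regressions = {}
    for case_id, base_score in baseline.items():
        score = current.get(case_id)
        if score is None:
            regressions[case_id] = (base_score, None)
            continue
        if base_score > 0 and (base_score - score) / base_score > max_drop:
            regressions[case_id] = (base_score, score)
    return regressions
```

A CI step could exit non-zero whenever the returned mapping is non-empty, which corresponds to the "Alert + Review" branch of the diagram.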
|
|
645
|
+
|
|
646
|
+
---
|
|
647
|
+
|
|
648
|
+
## CI/CD Integration
|
|
649
|
+
|
|
650
|
+
### Pipeline Structure
|
|
651
|
+
|
|
652
|
+
```yaml
# .github/workflows/test.yml
name: Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:
  schedule:
    - cron: '0 2 * * *'   # Nightly at 02:00 UTC
    - cron: '0 14 * * 0'  # Weekly comprehensive (Sunday 14:00 UTC)

jobs:
  # FAST: Every PR/Push (~2 min)
  deterministic-tests:
    name: Schema & CLI Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Schema Validation
        run: npm run test:schema
      - name: CLI Tests
        run: npm run test:cli
      - name: Snapshot Tests
        run: npm run test:snapshots

  # MEDIUM: On merge to main (~5 min) - Uses FREE OpenRouter models
  agent-evaluation:
    name: Agent Behavior Evaluation
    runs-on: ubuntu-latest
    needs: deterministic-tests
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - name: Run Golden Dataset Evaluation
        run: npm run eval:agents
        env:
          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
          EVAL_MODEL: meta-llama/llama-3.1-8b-instruct:free
      - name: Check Regression Threshold
        run: npm run eval:regression-check

  # SLOW: Nightly (~30 min) - Uses FREE OpenRouter models
  nightly-evaluation:
    name: Comprehensive Agent Evaluation
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule' && github.event.schedule == '0 2 * * *'
    steps:
      - uses: actions/checkout@v4
      - name: Full Golden Dataset Suite
        run: npm run eval:full
        env:
          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
          EVAL_MODEL: google/gemma-2-9b-it:free
      - name: Generate Evaluation Report
        run: npm run eval:report

  # COMPREHENSIVE: Weekly (~1 hr) - Uses Grok 4.1 Fast (free, 2M context)
  weekly-comprehensive:
    name: Weekly Comprehensive Evaluation
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule' && github.event.schedule == '0 14 * * 0'
    steps:
      - uses: actions/checkout@v4
      - name: Full Suite with Multi-Run Averaging
        # The bare `--` forwards --runs=3 to the script instead of to npm itself
        run: npm run eval:comprehensive -- --runs=3
        env:
          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
          EVAL_MODEL: x-ai/grok-4.1-fast:free
      - name: Generate Weekly Report
        run: npm run eval:weekly-report
```
|
|
724
|
+
|
|
725
|
+
### Test Categories by Trigger
|
|
726
|
+
|
|
727
|
+
| Trigger | Tests Run | Duration |
|---------|-----------|----------|
| Every PR | Schema, CLI, Snapshots | ~2 min |
| Merge to main | + Quick agent eval | ~5 min |
| Nightly | Full golden dataset | ~30 min |
| Release | Full suite + human review | ~1 hour |
|
|
733
|
+
|
|
734
|
+
---
|
|
735
|
+
|
|
736
|
+
## Recommended Test Directory Structure
|
|
737
|
+
|
|
738
|
+
```text
irismd/
├── __tests__/
│   ├── unit/
│   │   ├── installers/
│   │   │   ├── claude-code-installer.test.ts
│   │   │   ├── cursor-installer.test.ts
│   │   │   └── copilot-installer.test.ts
│   │   ├── schema-validation/
│   │   │   ├── memory-bank-schema.test.ts
│   │   │   ├── intent-schema.test.ts
│   │   │   └── unit-schema.test.ts
│   │   └── template-generation/
│   │       ├── bolt-template.test.ts
│   │       └── artifact-templates.test.ts
│   │
│   ├── integration/
│   │   ├── cli/
│   │   │   ├── install.bats
│   │   │   ├── inception-commands.bats
│   │   │   └── construction-commands.bats
│   │   └── memory-bank/
│   │       ├── crud-operations.test.ts
│   │       └── file-system-state.test.ts
│   │
│   ├── e2e/
│   │   ├── workflows/
│   │   │   ├── full-aidlc-flow.test.ts
│   │   │   ├── inception-phase.test.ts
│   │   │   └── construction-phase.test.ts
│   │   └── agent-chains/
│   │       └── multi-agent-coordination.test.ts
│   │
│   ├── evaluation/
│   │   ├── golden-datasets/
│   │   │   ├── inception/
│   │   │   ├── construction/
│   │   │   └── operations/
│   │   ├── rubrics/
│   │   │   ├── intent-quality.yaml
│   │   │   ├── unit-completeness.yaml
│   │   │   └── bolt-validity.yaml
│   │   └── regression/
│   │       ├── baselines/
│   │       └── reports/
│   │
│   └── fixtures/
│       ├── mock-responses/
│       │   ├── inception-agent/
│       │   ├── construction-agent/
│       │   └── master-agent/
│       └── memory-bank-states/
│           ├── empty-project/
│           ├── inception-complete/
│           └── construction-in-progress/
│
├── jest.config.js
├── vitest.config.ts
└── promptfoo.yaml
```
|
|
798
|
+
|
|
799
|
+
---
|
|
800
|
+
|
|
801
|
+
## Tools Summary
|
|
802
|
+
|
|
803
|
+
| Layer | Tools | Purpose |
|-------|-------|---------|
| Schema Validation | JSON Schema, Ajv, Zod | Validate YAML/JSON structures |
| Markdown Linting | markdownlint, remark-lint | Enforce markdown structure |
| CLI Testing | BATS, pytest, cram | Test command-line interactions |
| Snapshot Testing | Jest, Vitest | Verify generated file content |
| LLM Mocking | vcrpy, responses, nock, msw | Record/replay LLM responses |
| LLM Evaluation | DeepEval, Promptfoo, LangWatch | Evaluate agent output quality |
| Regression Testing | Evidently AI, Giskard, custom | Detect quality degradation |
| E2E Agent Testing | Checksum, custom harness | Full workflow validation |
|
|
813
|
+
|
|
814
|
+
---
|
|
815
|
+
|
|
816
|
+
## LLM-as-Judge Implementation
|
|
817
|
+
|
|
818
|
+
### Provider Strategy
|
|
819
|
+
|
|
820
|
+
Use different providers based on context and cost requirements:
|
|
821
|
+
|
|
822
|
+
```yaml
# evaluation-config.yaml
providers:
  pr_checks:
    type: openrouter
    model: x-ai/grok-4.1-fast:free
    temperature: 0
    # Free, fast, optimized for agentic coding (2M context)

  code_review:
    type: openrouter
    model: qwen/qwen3-coder:free
    temperature: 0
    # Free, code-specialized evaluation

  nightly:
    type: openrouter
    model: meta-llama/llama-3.3-70b-instruct:free
    temperature: 0
    # Free, high-quality reasoning for comprehensive eval

  release:
    type: openrouter
    model: x-ai/grok-4.1-fast:free
    temperature: 0
    runs: 3  # Multi-run averaging
    # Best free model: 2M context, excellent for agentic/coding evaluation
```
|
|
850
|
+
|
|
851
|
+
### OpenRouter Integration (Recommended for CI/CD)
|
|
852
|
+
|
|
853
|
+
OpenRouter provides an OpenAI-compatible API with free model access:
|
|
854
|
+
|
|
855
|
+
```python
# tests/evaluation/judge.py
import json
import os

from openai import OpenAI


class OpenRouterJudge:
    def __init__(self, model: str = "meta-llama/llama-3.1-8b-instruct:free"):
        # OpenRouter speaks the OpenAI wire protocol, so the stock client works
        self.client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=os.getenv("OPENROUTER_API_KEY"),
            default_headers={"HTTP-Referer": "https://iris.md"}
        )
        self.model = model

    def evaluate(self, output: str, rubric: str) -> dict:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{
                "role": "user",
                "content": f"""Evaluate this agent output against the rubric.

Output:
{output}

Rubric:
{rubric}

Return JSON: {{"score": 0.0-1.0, "pass": true/false, "reasoning": "..."}}"""
            }],
            temperature=0,
            response_format={"type": "json_object"}
        )
        # The judge is asked for a JSON object; parse it into a dict
        return json.loads(response.choices[0].message.content)
```
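
Judge replies are not guaranteed to be clean JSON, so the final `json.loads` call is a likely failure point. A defensive parsing sketch follows; `parse_verdict` is a hypothetical helper, and the 0.8 pass threshold is an assumption borrowed from the rubric wording used elsewhere in this document.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_verdict(raw: str, pass_threshold: float = 0.8) -> dict:
    """Parse and validate a judge reply shaped like {"score", "pass", "reasoning"}."""
    text = raw.strip()
    # Some models wrap JSON in a Markdown fence; drop the fence lines if present.
    if text.startswith(FENCE):
        lines = text.splitlines()
        text = "\n".join(lines[1:-1] if lines[-1].startswith(FENCE) else lines[1:])
    data = json.loads(text)
    score = float(data["score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return {
        "score": score,
        # Derive pass/fail locally rather than trusting the model's own flag.
        "pass": score >= pass_threshold,
        "reasoning": str(data.get("reasoning", "")),
    }
```

Recomputing `pass` from the score keeps the threshold in one place, so a judge that reports a high score alongside `"pass": false` cannot produce contradictory results.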
|
|
889
|
+
|
|
890
|
+
### Promptfoo Configuration with OpenRouter
|
|
891
|
+
|
|
892
|
+
```yaml
# promptfoo.yaml
providers:
  - id: openrouter:meta-llama/llama-3.1-8b-instruct:free
    config:
      temperature: 0

defaultTest:
  assert:
    - type: llm-rubric
      value: |
        Evaluate if the output:
        1. Contains required sections (problem, goals, criteria)
        2. Uses professional language
        3. Is actionable and specific
        Return PASS if score >= 0.8

tests:
  - description: "Intent creation quality"
    vars:
      prompt: "Create an intent for user authentication"
    assert:
      - type: llm-rubric
        provider: openrouter:meta-llama/llama-3.1-8b-instruct:free
        value: "Does this intent have clear requirements?"
```
|
|
918
|
+
|
|
919
|
+
### Cost Comparison
|
|
920
|
+
|
|
921
|
+
| Provider | Model | Cost/1M tokens | Rate Limit | Best For |
|----------|-------|----------------|------------|----------|
| OpenRouter | Grok 4.1 Fast (free) | $0 | ~20 req/min | **Coding agents** |
| OpenRouter | Qwen3 Coder (free) | $0 | ~20 req/min | **Code evaluation** |
| OpenRouter | Llama 3.3 70B (free) | $0 | ~20 req/min | Reasoning |
| OpenRouter | Llama 4 Maverick (free) | $0 | ~20 req/min | General (1M ctx) |
| OpenRouter | Gemma 3 27B (free) | $0 | ~20 req/min | Fast fallback |
| OpenRouter | **Grok 4.1 Fast (free)** | $0 | ~20 req/min | **Releases** (2M ctx) |
|
|
929
|
+
|
|
930
|
+
**Recommendation**: Use `grok-4.1-fast:free` by default, for both coding-agent evaluation and release validation (its 2M context fits a full evaluation run). Use `qwen3-coder:free` for code-specific assertions. **The entire testing stack costs $0.**
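
The table lists roughly 20 requests per minute for the free tier, so a full golden-dataset run needs client-side throttling. A sliding-window sketch is shown below. The `RateLimiter` class is hypothetical, and the clock and sleep functions are injectable only so the logic can be tested without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=20, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # Timestamps of recent calls, oldest first.

    def wait(self) -> None:
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Forget calls that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call in the window expires.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

Calling `limiter.wait()` immediately before each judge request would keep a sequential evaluation loop under the limit; parallel workers would need a shared limiter or a lock around it.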
|
|
931
|
+
|
|
932
|
+
---
|
|
933
|
+
|
|
934
|
+
## Handling Non-Determinism
|
|
935
|
+
|
|
936
|
+
### Strategies
|
|
937
|
+
|
|
938
|
+
1. **Set Temperature to 0** for tests requiring deterministic outputs
|
|
939
|
+
2. **Use Semantic Similarity** instead of exact match (threshold: 0.85+)
|
|
940
|
+
3. **Run Multiple Evaluations** and average scores for critical tests
|
|
941
|
+
4. **LLM-as-Judge** for subjective quality criteria
|
|
942
|
+
5. **Behavioral Assertions** that check properties, not strings
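
Strategy 3 can be sketched as a small wrapper that repeats an evaluation and reports the mean together with the spread. The name `averaged_score` is hypothetical, and `evaluate` stands for any zero-argument callable that returns a numeric score.

```python
from statistics import mean, pstdev


def averaged_score(evaluate, runs: int = 3) -> dict:
    """Call `evaluate()` `runs` times and summarise the scores."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    scores = [float(evaluate()) for _ in range(runs)]
    return {"mean": mean(scores), "spread": pstdev(scores), "scores": scores}
```

Reporting the spread alongside the mean helps separate a genuine quality drop from ordinary run-to-run noise when a score lands near a threshold.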
|
|
943
|
+
|
|
944
|
+
### Example: Handling Flaky Tests
|
|
945
|
+
|
|
946
|
+
```python
import pytest


@pytest.mark.flaky(max_runs=3, min_passes=2)
def test_agent_generates_valid_intent():
    """
    Run 3 times, pass if 2+ succeed.
    Accounts for LLM non-determinism.
    """
    output = inception_agent.create_intent(prompt)

    # Semantic check, not exact match
    assert semantic_similarity(output, reference) > 0.85
    assert has_required_sections(output)
```
|
|
959
|
+
|
|
960
|
+
---
|
|
961
|
+
|
|
962
|
+
## Human-in-the-Loop Validation
|
|
963
|
+
|
|
964
|
+
### When Human Review is Required
|
|
965
|
+
|
|
966
|
+
| Scenario | Trigger | Review Type |
|----------|---------|-------------|
| Phase Transition | Inception → Construction | Approval gate |
| Operations Deployment | Before production deploy | Manual verification |
| Golden Dataset Updates | New baseline proposed | Quality review |
| Regression Detected | Score drop > 10% | Investigation |
|
|
972
|
+
|
|
973
|
+
### Implementation
|
|
974
|
+
|
|
975
|
+
```yaml
# In CI/CD workflow
- name: Human Approval Gate
  if: steps.eval.outputs.phase_transition == 'true'
  uses: trstringer/manual-approval@v1
  with:
    approvers: team-leads
    message: "Review inception artifacts before construction phase"
```
|
|
984
|
+
|
|
985
|
+
---
|
|
986
|
+
|
|
987
|
+
## Metrics and Monitoring
|
|
988
|
+
|
|
989
|
+
### Key Metrics to Track
|
|
990
|
+
|
|
991
|
+
| Metric | Target | Alert Threshold |
|--------|--------|-----------------|
| Schema Validation Pass Rate | 100% | < 100% |
| CLI Test Pass Rate | 100% | < 100% |
| Agent Output Quality Score | > 0.85 | < 0.80 |
| Semantic Similarity (avg) | > 0.90 | < 0.85 |
| Regression from Baseline | 0% | > 5% |
| Test Execution Time | < 5 min (PR) | > 10 min |
|
|
999
|
+
|
|
1000
|
+
### Dashboard
|
|
1001
|
+
|
|
1002
|
+
Track over time:
|
|
1003
|
+
|
|
1004
|
+
- Quality scores per agent
|
|
1005
|
+
- Regression trends
|
|
1006
|
+
- Test flakiness rate
|
|
1007
|
+
- Coverage of golden dataset
|
|
1008
|
+
|
|
1009
|
+
---
|
|
1010
|
+
|
|
1011
|
+
## Implementation Roadmap
|
|
1012
|
+
|
|
1013
|
+
### Phase 1: Foundation (Week 1-2)
|
|
1014
|
+
|
|
1015
|
+
- [ ] Set up test directory structure
|
|
1016
|
+
- [ ] Implement schema validation for `memory-bank.yaml`
|
|
1017
|
+
- [ ] Add markdownlint configuration
|
|
1018
|
+
- [ ] Create first BATS tests for CLI commands
|
|
1019
|
+
|
|
1020
|
+
### Phase 2: Golden Dataset (Week 3-4)
|
|
1021
|
+
|
|
1022
|
+
- [ ] Create 10 golden examples per agent
|
|
1023
|
+
- [ ] Implement semantic similarity testing
|
|
1024
|
+
- [ ] Set up Promptfoo or DeepEval
|
|
1025
|
+
- [ ] Define evaluation rubrics
|
|
1026
|
+
|
|
1027
|
+
### Phase 3: CI/CD Integration (Week 5-6)
|
|
1028
|
+
|
|
1029
|
+
- [ ] Configure GitHub Actions workflow
|
|
1030
|
+
- [ ] Set up regression baseline tracking
|
|
1031
|
+
- [ ] Implement human approval gates
|
|
1032
|
+
- [ ] Create evaluation dashboards
|
|
1033
|
+
|
|
1034
|
+
### Phase 4: Expansion (Ongoing)
|
|
1035
|
+
|
|
1036
|
+
- [ ] Expand golden dataset to 50+ examples
|
|
1037
|
+
- [ ] Add property-based testing
|
|
1038
|
+
- [ ] Implement continuous monitoring
|
|
1039
|
+
- [ ] Refine rubrics based on production data
|
|
1040
|
+
|
|
1041
|
+
---
|
|
1042
|
+
|
|
1043
|
+
## References
|
|
1044
|
+
|
|
1045
|
+
- [Specification-Driven Development - GitHub Blog](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-using-markdown-as-a-programming-language-when-building-with-ai/)
|
|
1046
|
+
- [LLM Regression Testing Tutorial - Evidently AI](https://www.evidentlyai.com/blog/llm-regression-testing-tutorial)
|
|
1047
|
+
- [DeepEval Documentation](https://deepeval.com/docs)
|
|
1048
|
+
- [Promptfoo - LLM Testing Framework](https://github.com/promptfoo/promptfoo)
|
|
1049
|
+
- [Building Effective Agents - Anthropic](https://www.anthropic.com/engineering/building-effective-agents)
|
|
1050
|
+
- [Terminal-Bench - Stanford/Laude Institute](https://ainativedev.io/news/8-benchmarks-shaping-the-next-generation-of-ai-agents)
|
|
1051
|
+
- [Contract-Driven Development Research](https://link.springer.com/chapter/10.1007/978-3-540-71289-3_2)
|
|
1052
|
+
|
|
1053
|
+
---
|
|
1054
|
+
|
|
1055
|
+
*Document created: 2025-12-09*
|
|
1056
|
+
*Last updated: 2025-12-10*
|
|
1057
|
+
*Status: Research / Draft*
|