@chongyan/autospec 1.0.2 → 1.0.3
- package/LICENSE +242 -21
- package/README.md +54 -608
- package/dist/README.md +54 -0
- package/dist/adapters/claude-code/README.md.enc +6 -0
- package/dist/adapters/claude-code/agents.js +1 -0
- package/dist/adapters/claude-code/commands.config.js +1 -0
- package/dist/adapters/claude-code/commands.js +1 -0
- package/dist/adapters/claude-code/hooks.config.js +2 -0
- package/dist/adapters/claude-code/hooks.js +1 -0
- package/dist/adapters/claude-code/install.js +1 -0
- package/dist/adapters/claude-code/skills.js +1 -0
- package/dist/adapters/codex/README.md.enc +6 -0
- package/dist/adapters/codex/hooks/pre-commit.sh +10 -0
- package/dist/adapters/codex/install.js +1 -0
- package/dist/adapters/codex/prompts/env-learn.md.enc +6 -0
- package/dist/adapters/codex/prompts/review.md.enc +6 -0
- package/dist/adapters/codex/wrappers/autospec-cli.sh +118 -0
- package/dist/adapters/codex/wrappers/parallel.sh +20 -0
- package/dist/adapters/codex/wrappers/post-task.sh +138 -0
- package/dist/bin/autospec.js +2 -0
- package/dist/knowledge/README.en.md.enc +6 -0
- package/dist/knowledge/README.md.enc +6 -0
- package/dist/knowledge/change-management.md.enc +6 -0
- package/dist/knowledge/cognition-engine.md.enc +6 -0
- package/dist/knowledge/config/baseline-permissions.md.enc +6 -0
- package/dist/knowledge/config/external-mounts.yaml.enc +6 -0
- package/dist/knowledge/config/model-profiles.yaml.enc +6 -0
- package/dist/knowledge/config/role-composition.yaml.enc +6 -0
- package/dist/knowledge/config/token-optimization.yaml.enc +6 -0
- package/dist/knowledge/config/validation-patterns.yaml.enc +6 -0
- package/dist/knowledge/constitution.md.enc +6 -0
- package/dist/knowledge/core-rules.md.enc +6 -0
- package/dist/knowledge/environment/adapters/README.md.enc +6 -0
- package/dist/knowledge/environment/detection-patterns.yaml.enc +6 -0
- package/dist/knowledge/environment/failure-patterns.json +223 -0
- package/dist/knowledge/environment/repair-strategies.json +206 -0
- package/dist/knowledge/memory/README.md.enc +6 -0
- package/dist/knowledge/memory/field/README.md.enc +6 -0
- package/dist/knowledge/memory/project/decisions/README.md.enc +6 -0
- package/dist/knowledge/memory/project/evolution-log.md.enc +6 -0
- package/dist/knowledge/memory/project/health-metrics.md.enc +6 -0
- package/dist/knowledge/memory/team/best-practices.md.enc +6 -0
- package/dist/knowledge/pipeline/code.md.enc +6 -0
- package/dist/knowledge/pipeline/explore.md.enc +6 -0
- package/dist/knowledge/pipeline/plan.md.enc +6 -0
- package/dist/knowledge/pipeline/protocol.md.enc +6 -0
- package/dist/knowledge/protocol/capabilities.yaml.enc +6 -0
- package/dist/knowledge/protocol/evolve-integration.md.enc +6 -0
- package/dist/knowledge/skills/README.md.enc +6 -0
- package/dist/knowledge/skills/adversarial-review.md.enc +6 -0
- package/dist/knowledge/skills/analyze-requirement.md.enc +6 -0
- package/dist/knowledge/skills/channel-operations.md.enc +6 -0
- package/dist/knowledge/skills/content-operations.md.enc +6 -0
- package/dist/knowledge/skills/content-prompts.md.enc +6 -0
- package/dist/knowledge/skills/conversion-optimization.md.enc +6 -0
- package/dist/knowledge/skills/data-operations.md.enc +6 -0
- package/dist/knowledge/skills/design-solution.md.enc +6 -0
- package/dist/knowledge/skills/growth-strategies.md.enc +6 -0
- package/dist/knowledge/skills/implement-code.md.enc +6 -0
- package/dist/knowledge/skills/knowledge-distill.md.enc +6 -0
- package/dist/knowledge/skills/parallel-dev.md.enc +6 -0
- package/dist/knowledge/skills/private-domain-traffic.md.enc +6 -0
- package/dist/knowledge/skills/skill-format.md.enc +6 -0
- package/dist/knowledge/skills/social-commerce.md.enc +6 -0
- package/dist/knowledge/skills/team-orchestration.md.enc +6 -0
- package/dist/knowledge/skills/unified-review.md.enc +6 -0
- package/dist/knowledge/skills/user-operations.md.enc +6 -0
- package/dist/knowledge/templates/autospec-config.yaml.enc +6 -0
- package/dist/knowledge/templates/smoke-test.md.enc +6 -0
- package/dist/knowledge/templates/spec/SPEC.md.enc +6 -0
- package/dist/knowledge/templates/spec/layers/delta.md.enc +6 -0
- package/dist/knowledge/templates/spec/layers/how.md.enc +6 -0
- package/dist/knowledge/templates/spec/layers/plan.md.enc +6 -0
- package/dist/knowledge/templates/spec/layers/what.md.enc +6 -0
- package/dist/knowledge/templates/spec/layers/why.md.enc +6 -0
- package/dist/knowledge/templates/wiki/catalog.yaml.enc +6 -0
- package/dist/knowledge/templates/wiki/content.md.enc +6 -0
- package/dist/knowledge/templates/wiki/meta.yaml.enc +6 -0
- package/dist/package.json +62 -0
- package/{plugins → dist/plugins}/.claude-plugin/plugin.json +259 -101
- package/dist/plugins/agents/roles/ai-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/backend-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/ceo.md.enc +6 -0
- package/dist/plugins/agents/roles/channel-ops.md.enc +6 -0
- package/dist/plugins/agents/roles/content-ops.md.enc +6 -0
- package/dist/plugins/agents/roles/conversion-ops.md.enc +6 -0
- package/dist/plugins/agents/roles/data-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/data-ops.md.enc +6 -0
- package/dist/plugins/agents/roles/devops-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/frontend-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/marketing-director.md.enc +6 -0
- package/dist/plugins/agents/roles/operations-director.md.enc +6 -0
- package/dist/plugins/agents/roles/private-traffic.md.enc +6 -0
- package/dist/plugins/agents/roles/product-owner.md.enc +6 -0
- package/dist/plugins/agents/roles/quality-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/security-engineer.md.enc +6 -0
- package/dist/plugins/agents/roles/tech-lead.md.enc +6 -0
- package/dist/plugins/agents/roles/user-ops.md.enc +6 -0
- package/dist/plugins/agents/support/blind-comparator.md.enc +6 -0
- package/dist/plugins/agents/support/consistency-checker.md.enc +6 -0
- package/dist/plugins/agents/support/experiment-evaluator.md.enc +6 -0
- package/dist/plugins/agents/support/failure-diagnostician.md.enc +6 -0
- package/dist/plugins/agents/support/independent-reviewer.md.enc +6 -0
- package/dist/plugins/agents/support/memory-curator.md.enc +6 -0
- package/dist/plugins/agents/support/monitoring-agent.md.enc +6 -0
- package/dist/plugins/agents/support/safety-auditor.md.enc +6 -0
- package/dist/plugins/agents/support/skill-benchmarker.md.enc +6 -0
- package/dist/plugins/agents/support/skill-forger.md.enc +6 -0
- package/dist/plugins/agents/support/stage-gate-evaluator.md.enc +6 -0
- package/dist/plugins/agents/support/team-orchestrator.md.enc +6 -0
- package/dist/plugins/agents/support/test-coverage-reviewer.md.enc +6 -0
- package/dist/plugins/benchmarks/templates/README.en.md.enc +6 -0
- package/dist/plugins/benchmarks/templates/README.md.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/code-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/explore-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/field-evolve-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/plan-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/project-evolve-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/review-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/commands/run-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/benchmark-executor-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/benchmark-generator-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/delivery-stage-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/design-stage-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/exploration-phase-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-analyzer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-distiller-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-executor-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-fixer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-learner-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-scanner-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/field-evolve-verifier-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/implementation-stage-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/layer1-validation-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-analyzer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-fixer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-generator-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-learner-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-reviewer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-scanner-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/project-evolve-verifier-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/requirement-analyzer-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/skill-forge-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/startup-guard-template.yaml.enc +6 -0
- package/dist/plugins/benchmarks/templates/skills/testing-stage-template.yaml.enc +6 -0
- package/dist/plugins/commands/README.en.md.enc +6 -0
- package/dist/plugins/commands/README.md.enc +6 -0
- package/dist/plugins/commands/automation.md.enc +6 -0
- package/dist/plugins/commands/code.md.enc +6 -0
- package/dist/plugins/commands/contribute.md.enc +6 -0
- package/dist/plugins/commands/dashboard.md.enc +6 -0
- package/dist/plugins/commands/env.md.enc +6 -0
- package/dist/plugins/commands/explore.md.enc +6 -0
- package/dist/plugins/commands/field-evolve.md.enc +6 -0
- package/dist/plugins/commands/global.md.enc +6 -0
- package/dist/plugins/commands/init.md.enc +6 -0
- package/dist/plugins/commands/list.md.enc +6 -0
- package/dist/plugins/commands/memory.md.enc +6 -0
- package/dist/plugins/commands/monitor.md.enc +6 -0
- package/dist/plugins/commands/org.md.enc +6 -0
- package/dist/plugins/commands/persist.md.enc +6 -0
- package/dist/plugins/commands/plan.md.enc +6 -0
- package/dist/plugins/commands/plugin.md.enc +6 -0
- package/dist/plugins/commands/project-evolve.md.enc +6 -0
- package/dist/plugins/commands/review.md.enc +6 -0
- package/dist/plugins/commands/run.md.enc +6 -0
- package/dist/plugins/commands/status.md.enc +6 -0
- package/dist/plugins/commands/sync.md.enc +6 -0
- package/dist/plugins/commands/update.md.enc +6 -0
- package/dist/plugins/env-capabilities/env-core/plugin.json +33 -0
- package/dist/plugins/hooks/README.en.md.enc +6 -0
- package/dist/plugins/hooks/README.md.enc +6 -0
- package/dist/plugins/hooks/artifact-evaluation-hook.js +2 -0
- package/dist/plugins/hooks/cognitive-dreamer.js +2 -0
- package/dist/plugins/hooks/cognitive-sync.js +2 -0
- package/dist/plugins/hooks/cognitive-tracker.js +2 -0
- package/dist/plugins/hooks/config/detection-patterns.yaml.enc +6 -0
- package/dist/plugins/hooks/constitution-guard.js +2 -0
- package/dist/plugins/hooks/do-review-separation-guard.js +2 -0
- package/dist/plugins/hooks/environment-autocommit.js +2 -0
- package/dist/plugins/hooks/environment-doctor.js +1 -0
- package/dist/plugins/hooks/environment-startup-scan.js +2 -0
- package/dist/plugins/hooks/execution-tracker.js +2 -0
- package/dist/plugins/hooks/frozen-zone-guard.js +2 -0
- package/dist/plugins/hooks/layer1-validator.js +2 -0
- package/dist/plugins/hooks/lib/artifact-evaluator.js +1 -0
- package/dist/plugins/hooks/lib/auto-fix-loop.js +1 -0
- package/dist/plugins/hooks/lib/benchmarks/change-detector.js +1 -0
- package/dist/plugins/hooks/lib/benchmarks/evaluator.js +1 -0
- package/dist/plugins/hooks/lib/benchmarks/integration-example.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/adaptive-matcher.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/base-store.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/causal-engine.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/cognitive-config.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/cognitive-fingerprint.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/collective-sync.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/confidence-engine.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/dream-engine.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/episodic-store.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/index.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/kernel.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/knowledge-distiller.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/memory-transport.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/persistence-manager.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/priority-store.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/procedural-store.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/semantic-store.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/skill-tuner.js +1 -0
- package/dist/plugins/hooks/lib/cognitive/wiki-materializer.js +1 -0
- package/dist/plugins/hooks/lib/detection-pattern-loader.js +1 -0
- package/dist/plugins/hooks/lib/directory-discovery.js +1 -0
- package/dist/plugins/hooks/lib/environment-capability-package.js +1 -0
- package/dist/plugins/hooks/lib/environment-capability-probe.js +1 -0
- package/dist/plugins/hooks/lib/environment-config-loader.js +1 -0
- package/dist/plugins/hooks/lib/environment-executor.js +1 -0
- package/dist/plugins/hooks/lib/environment-feedback.js +1 -0
- package/dist/plugins/hooks/lib/environment-health-monitor.js +1 -0
- package/dist/plugins/hooks/lib/environment-knowledge-synthesizer.js +1 -0
- package/dist/plugins/hooks/lib/environment-knowledge-validator.js +1 -0
- package/dist/plugins/hooks/lib/environment-learning-discoverer.js +1 -0
- package/dist/plugins/hooks/lib/environment-learning-engine.js +1 -0
- package/dist/plugins/hooks/lib/environment-module-repository.js +1 -0
- package/dist/plugins/hooks/lib/environment-planner.js +1 -0
- package/dist/plugins/hooks/lib/environment-plugin-registry.js +1 -0
- package/dist/plugins/hooks/lib/environment-readiness.js +1 -0
- package/dist/plugins/hooks/lib/environment-route-ranking.js +1 -0
- package/dist/plugins/hooks/lib/environment-strategy-overlay.js +1 -0
- package/dist/plugins/hooks/lib/execution-path.js +1 -0
- package/dist/plugins/hooks/lib/external-mount-adapter.js +1 -0
- package/dist/plugins/hooks/lib/external-scanner.js +1 -0
- package/dist/plugins/hooks/lib/hook-error-recorder.js +1 -0
- package/dist/plugins/hooks/lib/hook-logger.js +1 -0
- package/dist/plugins/hooks/lib/hook-runner.js +2 -0
- package/dist/plugins/hooks/lib/hook-state-manager.js +1 -0
- package/dist/plugins/hooks/lib/memory-extractor.js +1 -0
- package/dist/plugins/hooks/lib/memory-manager.js +1 -0
- package/dist/plugins/hooks/lib/metrics-analyzer.js +1 -0
- package/dist/plugins/hooks/lib/mount-manager.js +1 -0
- package/dist/plugins/hooks/lib/plugin-activation-registry.js +1 -0
- package/dist/plugins/hooks/lib/plugin-selector.js +1 -0
- package/dist/plugins/hooks/lib/plugin-source-registry.js +1 -0
- package/dist/plugins/hooks/lib/plugin-workspace-registry.js +1 -0
- package/dist/plugins/hooks/lib/project-evolution/auto-fixer.js +1 -0
- package/dist/plugins/hooks/lib/project-evolution/memory-manager.js +1 -0
- package/dist/plugins/hooks/lib/project-evolution/pattern-detector.js +1 -0
- package/dist/plugins/hooks/lib/project-evolution/semantic-indexer.js +1 -0
- package/dist/plugins/hooks/lib/rollback-tracker.js +1 -0
- package/dist/plugins/hooks/lib/source-code-scanner.js +1 -0
- package/dist/plugins/hooks/lib/technology-stack-detector.js +1 -0
- package/dist/plugins/hooks/lib/test-failure-analyzer.js +1 -0
- package/dist/plugins/hooks/lib/test-failure-fixer.js +1 -0
- package/dist/plugins/hooks/lib/trace-context.js +1 -0
- package/dist/plugins/hooks/lib/validation-patterns.js +1 -0
- package/dist/plugins/hooks/memory-sync.js +1 -0
- package/dist/plugins/hooks/pipeline-observer.js +2 -0
- package/dist/plugins/hooks/retry-limit-guard.js +2 -0
- package/dist/plugins/hooks/scope-sentinel.js +2 -0
- package/dist/plugins/hooks/secret-scanner.js +2 -0
- package/dist/plugins/hooks/stop-evolve-prompt.js +1 -0
- package/dist/plugins/hooks/trace-initialization.js +2 -0
- package/dist/plugins/hooks/version-checker.js +2 -0
- package/dist/plugins/memory/templates/code-quality.yaml.enc +6 -0
- package/dist/plugins/memory/templates/multi-system.yaml.enc +6 -0
- package/dist/plugins/memory/templates/team-habits.yaml.enc +6 -0
- package/dist/plugins/memory/templates/testing.yaml.enc +6 -0
- package/dist/plugins/skills/README.en.md.enc +6 -0
- package/dist/plugins/skills/README.md.enc +6 -0
- package/dist/plugins/skills/ab-test-executor/SKILL.md.enc +6 -0
- package/dist/plugins/skills/benchmark-executor/SKILL.md.enc +6 -0
- package/dist/plugins/skills/benchmark-generator/SKILL.md.enc +6 -0
- package/dist/plugins/skills/content-prompts/SKILL.md.enc +6 -0
- package/dist/plugins/skills/delivery-stage/SKILL.md.enc +6 -0
- package/dist/plugins/skills/design-stage/SKILL.md.enc +6 -0
- package/dist/plugins/skills/environment-learning/SKILL.md.enc +6 -0
- package/dist/plugins/skills/environment-resilience/build-failure-doctor.md.enc +6 -0
- package/dist/plugins/skills/environment-resilience/environment-repair.md.enc +6 -0
- package/dist/plugins/skills/environment-resilience/pre-flight-check.md.enc +6 -0
- package/dist/plugins/skills/experiment-evaluator/SKILL.md.enc +6 -0
- package/dist/plugins/skills/exploration-phase/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-analyzer/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-distiller/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-executor/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-executor/executor.js +2 -0
- package/dist/plugins/skills/field-evolve-fixer/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-learner/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-scanner/SKILL.md.enc +6 -0
- package/dist/plugins/skills/field-evolve-scanner/scripts/fallback-scanner.js +2 -0
- package/dist/plugins/skills/field-evolve-verifier/SKILL.md.enc +6 -0
- package/dist/plugins/skills/heartbeat-monitor/SKILL.md.enc +6 -0
- package/dist/plugins/skills/implementation-stage/SKILL.md.enc +6 -0
- package/dist/plugins/skills/layer1-validation/SKILL.md.enc +6 -0
- package/dist/plugins/skills/multi-role-orchestration/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-content-marketing/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-conversion/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-data-driven/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-growth-strategies/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-private-domain/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-social-commerce/SKILL.md.enc +6 -0
- package/dist/plugins/skills/ops-user-growth/SKILL.md.enc +6 -0
- package/dist/plugins/skills/pending-dashboard/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-analyzer/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-fixer/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-generator/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-learner/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-reviewer/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-scanner/SKILL.md.enc +6 -0
- package/dist/plugins/skills/project-evolve-scanner/scripts/dependency-reuse-checker.js +2 -0
- package/dist/plugins/skills/project-evolve-scanner/scripts/subsystem-coverage.js +2 -0
- package/dist/plugins/skills/project-evolve-verifier/SKILL.md.enc +6 -0
- package/dist/plugins/skills/requirement-stage/SKILL.md.enc +6 -0
- package/dist/plugins/skills/secret-scanner/SKILL.md.enc +6 -0
- package/dist/plugins/skills/skill-forge/SKILL.md.enc +6 -0
- package/dist/plugins/skills/skill-forge/references/description-guide.md.enc +6 -0
- package/dist/plugins/skills/skill-forge/references/quality-rubric.md.enc +6 -0
- package/dist/plugins/skills/skill-forge/references/skill-template.md.enc +6 -0
- package/dist/plugins/skills/startup-guard/SKILL.md.enc +6 -0
- package/dist/plugins/skills/tdd-workflow/SKILL.md.enc +6 -0
- package/dist/plugins/skills/testing-stage/SKILL.md.enc +6 -0
- package/dist/plugins/skills/tracking-validator/SKILL.md.enc +6 -0
- package/dist/scripts/build-crypto.js +2 -0
- package/dist/scripts/cli/contribute.js +1 -0
- package/dist/scripts/cli/dashboard.js +1 -0
- package/dist/scripts/cli/env.js +1 -0
- package/dist/scripts/cli/global-init.js +1 -0
- package/dist/scripts/cli/global.js +1 -0
- package/dist/scripts/cli/index.js +1 -0
- package/dist/scripts/cli/init.js +1 -0
- package/dist/scripts/cli/list.js +1 -0
- package/dist/scripts/cli/memory.js +1 -0
- package/dist/scripts/cli/org.js +1 -0
- package/dist/scripts/cli/plugin.js +1 -0
- package/dist/scripts/cli/practice-report.js +1 -0
- package/dist/scripts/cli/runtime-governance.js +1 -0
- package/dist/scripts/cli/system.js +1 -0
- package/dist/scripts/cli/update.js +1 -0
- package/dist/scripts/commands/catalog.js +1 -0
- package/dist/scripts/commands/slash-command-docs.js +1 -0
- package/dist/scripts/config/external-mounts.config.js +2 -0
- package/dist/scripts/heartbeat/check.js +2 -0
- package/dist/scripts/heartbeat/setup-cron.js +2 -0
- package/dist/scripts/install/adapters.js +1 -0
- package/dist/scripts/install/constants.js +1 -0
- package/dist/scripts/install/file-reader.js +1 -0
- package/dist/scripts/install/index.js +1 -0
- package/dist/scripts/install/shards/constants-shard.js +1 -0
- package/dist/scripts/install/shards/crypto-config-shard.js +1 -0
- package/dist/scripts/install/shards/error-messages-shard.js +1 -0
- package/dist/scripts/install/shards/reassemble.js +1 -0
- package/dist/scripts/install/shards/utils-shard.js +1 -0
- package/dist/scripts/install/shards/version-info-shard.js +1 -0
- package/dist/scripts/postinstall.js +1 -0
- package/dist/scripts/state.js +1 -0
- package/package.json +21 -12
- package/README.en.md +0 -598
- package/bin/autospec.js +0 -3
- package/knowledge/01-principles/00-principles-hierarchy.md +0 -247
- package/knowledge/01-principles/01-first-principles.md +0 -241
- package/knowledge/01-principles/02-strategic-principles.md +0 -286
- package/knowledge/01-principles/03-tactical-principles.md +0 -385
- package/knowledge/01-principles/04-operational-principles.md +0 -275
- package/knowledge/01-principles/05-domain-principles.md +0 -539
- package/knowledge/01-principles/06-methodology-principles.md +0 -281
- package/knowledge/01-principles/07-cognitive-principles.md +0 -277
- package/knowledge/01-principles/08-auto-fix-principles.md +0 -320
- package/knowledge/01-principles/09-constitution.md +0 -220
- package/knowledge/01-principles/10-evolution-mechanism.md +0 -699
- package/knowledge/01-principles/README.en.md +0 -385
- package/knowledge/01-principles/README.md +0 -385
- package/knowledge/02-process/00-overview.md +0 -404
- package/knowledge/02-process/01-requirement.md +0 -113
- package/knowledge/02-process/02-design.md +0 -123
- package/knowledge/02-process/03-implementation.md +0 -90
- package/knowledge/02-process/04-review.md +0 -80
- package/knowledge/02-process/05-testing.md +0 -90
- package/knowledge/02-process/06-delivery.md +0 -88
- package/knowledge/02-process/README.en.md +0 -143
- package/knowledge/02-process/README.md +0 -186
- package/knowledge/03-guides/00-pipeline-protocol.md +0 -438
- package/knowledge/03-guides/01-team-orchestrator.md +0 -368
- package/knowledge/03-guides/02-analyze-requirement.md +0 -195
- package/knowledge/03-guides/03-design-solution.md +0 -401
- package/knowledge/03-guides/04-implement-code.md +0 -205
- package/knowledge/03-guides/05-plan-testing.md +0 -183
- package/knowledge/03-guides/06-generate-tests.md +0 -241
- package/knowledge/03-guides/07-check-release.md +0 -205
- package/knowledge/03-guides/08-evaluate-ai-effect.md +0 -100
- package/knowledge/03-guides/09-review-requirement.md +0 -83
- package/knowledge/03-guides/10-review-design.md +0 -83
- package/knowledge/03-guides/11-review-code.md +0 -111
- package/knowledge/03-guides/12-review-testing.md +0 -76
- package/knowledge/03-guides/13-audit-security.md +0 -89
- package/knowledge/03-guides/14-check-consistency.md +0 -177
- package/knowledge/03-guides/15-run-unit-tests.md +0 -83
- package/knowledge/03-guides/16-run-integration-tests.md +0 -105
- package/knowledge/03-guides/17-analyze-test-context.md +0 -250
- package/knowledge/03-guides/18-log-practice.md +0 -359
- package/knowledge/03-guides/19-distill-skill.md +0 -91
- package/knowledge/03-guides/20-update-skill.md +0 -45
- package/knowledge/03-guides/21-validate-skill.md +0 -72
- package/knowledge/03-guides/22-extract-methodology.md +0 -55
- package/knowledge/03-guides/23-infer-scope.md +0 -174
- package/knowledge/03-guides/24-assess-complexity.md +0 -270
- package/knowledge/03-guides/25-discover-component.md +0 -183
- package/knowledge/03-guides/26-analyze-tech-stack.md +0 -139
- package/knowledge/03-guides/27-scan-environment.md +0 -207
- package/knowledge/03-guides/28-validate-environment.md +0 -207
- package/knowledge/03-guides/29-generate-knowledge.md +0 -234
- package/knowledge/03-guides/30-analyze-ai-capability.md +0 -193
- package/knowledge/03-guides/31-analyze-ai-component.md +0 -169
- package/knowledge/03-guides/32-analyze-ai-agent.md +0 -362
- package/knowledge/03-guides/33-analyze-ai-rag.md +0 -339
- package/knowledge/03-guides/34-assess-ai-task.md +0 -418
- package/knowledge/03-guides/35-evaluate-ai-pipeline.md +0 -219
- package/knowledge/03-guides/36-evaluate-ai-artifact.md +0 -192
- package/knowledge/03-guides/37-plan-ai-evaluation.md +0 -374
- package/knowledge/03-guides/38-evaluate-ai-path.md +0 -274
- package/knowledge/03-guides/39-validate-ai-data.md +0 -276
- package/knowledge/03-guides/40-detect-ai-anomaly.md +0 -213
- package/knowledge/03-guides/41-diagnose-ai-test.md +0 -133
- package/knowledge/03-guides/42-apply-ddd.md +0 -345
- package/knowledge/03-guides/43-run-ai-sdlc.md +0 -475
- package/knowledge/03-guides/44-manage-knowledge.md +0 -369
- package/knowledge/03-guides/45-test-runner.md +0 -254
- package/knowledge/03-guides/README.en.md +0 -212
- package/knowledge/03-guides/README.md +0 -212
- package/knowledge/04-checklists/00-requirement.md +0 -169
- package/knowledge/04-checklists/01-design.md +0 -196
- package/knowledge/04-checklists/02-code.md +0 -197
- package/knowledge/04-checklists/03-test.md +0 -46
- package/knowledge/04-checklists/04-release.md +0 -70
- package/knowledge/04-checklists/README.en.md +0 -119
- package/knowledge/04-checklists/README.md +0 -123
- package/knowledge/05-config/00-validation-patterns.yaml +0 -137
- package/knowledge/05-config/01-team-stage.yaml +0 -95
- package/knowledge/05-config/02-team-tasks.yaml +0 -139
- package/knowledge/05-config/03-role-composition.yaml +0 -346
- package/knowledge/05-config/04-role-extensions.yaml +0 -140
- package/knowledge/05-config/05-skill-compositions.yaml +0 -142
- package/knowledge/05-config/README.en.md +0 -54
- package/knowledge/05-config/README.md +0 -132
- package/knowledge/06-environment/00-template-registry.md +0 -310
- package/knowledge/06-environment/01-detection-patterns.yaml +0 -1692
- package/knowledge/06-environment/README.en.md +0 -40
- package/knowledge/06-environment/README.md +0 -128
- package/knowledge/07-standards/00-coding-style.md +0 -1059
- package/knowledge/07-standards/01-code-review.md +0 -876
- package/knowledge/07-standards/02-data-consistency.md +0 -1085
- package/knowledge/07-standards/03-document-versioning.md +0 -210
- package/knowledge/07-standards/04-risk-detection.md +0 -186
- package/knowledge/07-standards/README.en.md +0 -119
- package/knowledge/07-standards/README.md +0 -123
- package/knowledge/08-organization/00-vision-mission.md +0 -113
- package/knowledge/08-organization/01-ai-native-culture.md +0 -318
- package/knowledge/08-organization/02-team-metrics.md +0 -228
- package/knowledge/08-organization/03-committee-structure.md +0 -54
- package/knowledge/08-organization/04-governance-metrics.md +0 -55
- package/knowledge/08-organization/05-improvement-process.md +0 -71
- package/knowledge/08-organization/README.en.md +0 -165
- package/knowledge/08-organization/README.md +0 -165
- package/knowledge/09-templates/00-requirement-proposal.md +0 -344
- package/knowledge/09-templates/01-architecture-design.md +0 -494
- package/knowledge/09-templates/02-api-design.md +0 -408
- package/knowledge/09-templates/03-database-design.md +0 -313
- package/knowledge/09-templates/04-product-design.md +0 -237
- package/knowledge/09-templates/05-domain-business.md +0 -388
- package/knowledge/09-templates/06-test-design.md +0 -268
- package/knowledge/09-templates/07-evaluation-design.md +0 -372
- package/knowledge/09-templates/08-component-knowledge.md +0 -272
- package/knowledge/09-templates/09-best-practices.md +0 -218
- package/knowledge/09-templates/10-middleware-knowledge.md +0 -342
- package/knowledge/09-templates/README.en.md +0 -222
- package/knowledge/09-templates/README.md +0 -216
- package/knowledge/README.en.md +0 -372
- package/knowledge/README.md +0 -399
- package/plugins/agents/roles/ai-engineer.md +0 -129
- package/plugins/agents/roles/backend-engineer.md +0 -165
- package/plugins/agents/roles/ceo.md +0 -94
- package/plugins/agents/roles/data-engineer.md +0 -135
- package/plugins/agents/roles/devops-engineer.md +0 -181
- package/plugins/agents/roles/frontend-engineer.md +0 -129
- package/plugins/agents/roles/product-owner.md +0 -98
- package/plugins/agents/roles/quality-engineer.md +0 -129
- package/plugins/agents/roles/security-engineer.md +0 -180
- package/plugins/agents/roles/tech-lead.md +0 -97
- package/plugins/agents/support/blind-comparator.md +0 -88
- package/plugins/agents/support/consistency-checker.md +0 -136
- package/plugins/agents/support/failure-diagnostician.md +0 -141
- package/plugins/agents/support/independent-reviewer.md +0 -80
- package/plugins/agents/support/monitoring-agent.md +0 -215
- package/plugins/agents/support/safety-auditor.md +0 -121
- package/plugins/agents/support/skill-benchmarker.md +0 -86
- package/plugins/agents/support/skill-forger.md +0 -105
- package/plugins/agents/support/stage-gate-evaluator.md +0 -205
- package/plugins/agents/support/test-coverage-reviewer.md +0 -73
- package/plugins/benchmarks/templates/README.md +0 -196
- package/plugins/benchmarks/templates/commands/apply-template.yaml +0 -108
- package/plugins/benchmarks/templates/commands/archive-template.yaml +0 -65
- package/plugins/benchmarks/templates/commands/env-export-template.yaml +0 -64
- package/plugins/benchmarks/templates/commands/env-sync-template.yaml +0 -104
- package/plugins/benchmarks/templates/commands/env-template-template.yaml +0 -96
- package/plugins/benchmarks/templates/commands/env-template.yaml +0 -58
- package/plugins/benchmarks/templates/commands/env-update-template.yaml +0 -110
- package/plugins/benchmarks/templates/commands/env-validate-template.yaml +0 -95
- package/plugins/benchmarks/templates/commands/explore-template.yaml +0 -48
- package/plugins/benchmarks/templates/commands/field-evolve-template.yaml +0 -104
- package/plugins/benchmarks/templates/commands/project-evolve-template.yaml +0 -104
- package/plugins/benchmarks/templates/commands/propose-template.yaml +0 -88
- package/plugins/benchmarks/templates/commands/review-template.yaml +0 -124
- package/plugins/benchmarks/templates/commands/run-template.yaml +0 -127
- package/plugins/benchmarks/templates/commands/test-template.yaml +0 -149
- package/plugins/benchmarks/templates/pipeline/agile-template.yaml +0 -84
- package/plugins/benchmarks/templates/pipeline/experiment-template.yaml +0 -92
- package/plugins/benchmarks/templates/pipeline/hotfix-template.yaml +0 -81
- package/plugins/benchmarks/templates/pipeline/waterfall-template.yaml +0 -106
- package/plugins/benchmarks/templates/skills/agile-iteration-template.yaml +0 -78
- package/plugins/benchmarks/templates/skills/benchmark-executor-template.yaml +0 -114
- package/plugins/benchmarks/templates/skills/benchmark-generator-template.yaml +0 -52
- package/plugins/benchmarks/templates/skills/delivery-stage-template.yaml +0 -130
- package/plugins/benchmarks/templates/skills/design-stage-template.yaml +0 -131
- package/plugins/benchmarks/templates/skills/experiment-iteration-template.yaml +0 -60
- package/plugins/benchmarks/templates/skills/exploration-phase-template.yaml +0 -114
- package/plugins/benchmarks/templates/skills/field-evolve-analyzer-template.yaml +0 -51
- package/plugins/benchmarks/templates/skills/field-evolve-distiller-template.yaml +0 -34
- package/plugins/benchmarks/templates/skills/field-evolve-executor-template.yaml +0 -50
- package/plugins/benchmarks/templates/skills/field-evolve-fixer-template.yaml +0 -52
- package/plugins/benchmarks/templates/skills/field-evolve-learner-template.yaml +0 -33
- package/plugins/benchmarks/templates/skills/field-evolve-scanner-template.yaml +0 -74
- package/plugins/benchmarks/templates/skills/field-evolve-template.yaml +0 -71
- package/plugins/benchmarks/templates/skills/field-evolve-verifier-template.yaml +0 -51
- package/plugins/benchmarks/templates/skills/hotfix-iteration-template.yaml +0 -54
- package/plugins/benchmarks/templates/skills/implementation-stage-template.yaml +0 -127
- package/plugins/benchmarks/templates/skills/layer1-validation-template.yaml +0 -121
- package/plugins/benchmarks/templates/skills/project-evolve-analyzer-template.yaml +0 -51
- package/plugins/benchmarks/templates/skills/project-evolve-fixer-template.yaml +0 -52
- package/plugins/benchmarks/templates/skills/project-evolve-generator-template.yaml +0 -34
- package/plugins/benchmarks/templates/skills/project-evolve-learner-template.yaml +0 -50
- package/plugins/benchmarks/templates/skills/project-evolve-reviewer-template.yaml +0 -50
- package/plugins/benchmarks/templates/skills/project-evolve-scanner-template.yaml +0 -75
- package/plugins/benchmarks/templates/skills/project-evolve-template.yaml +0 -72
- package/plugins/benchmarks/templates/skills/project-evolve-verifier-template.yaml +0 -51
- package/plugins/benchmarks/templates/skills/requirement-analyzer-template.yaml +0 -48
- package/plugins/benchmarks/templates/skills/skill-forge-template.yaml +0 -117
- package/plugins/benchmarks/templates/skills/startup-guard-template.yaml +0 -103
- package/plugins/benchmarks/templates/skills/testing-stage-template.yaml +0 -146
- package/plugins/benchmarks/templates/skills/waterfall-iteration-template.yaml +0 -55
- package/plugins/commands/README.en.md +0 -96
- package/plugins/commands/README.md +0 -96
- package/plugins/commands/apply.md +0 -277
- package/plugins/commands/archive.md +0 -132
- package/plugins/commands/env-export.md +0 -79
- package/plugins/commands/env-sync.md +0 -1281
- package/plugins/commands/env-template.md +0 -99
- package/plugins/commands/env-update.md +0 -264
- package/plugins/commands/env-validate.md +0 -176
- package/plugins/commands/env.md +0 -79
- package/plugins/commands/explore.md +0 -193
- package/plugins/commands/field-evolve.md +0 -412
- package/plugins/commands/memory.md +0 -249
- package/plugins/commands/project-evolve.md +0 -920
- package/plugins/commands/propose.md +0 -184
- package/plugins/commands/review.md +0 -140
- package/plugins/commands/run.md +0 -1052
- package/plugins/commands/status.md +0 -183
- package/plugins/commands/test.md +0 -389
- package/plugins/hooks/README.en.md +0 -56
- package/plugins/hooks/README.md +0 -56
- package/plugins/hooks/ai-project-guard.js +0 -329
- package/plugins/hooks/artifact-evaluation-hook.js +0 -237
- package/plugins/hooks/constitution-guard.js +0 -211
- package/plugins/hooks/environment-autocommit.js +0 -606
- package/plugins/hooks/environment-manager.js +0 -779
- package/plugins/hooks/execution-tracker.js +0 -459
- package/plugins/hooks/frozen-zone-guard.js +0 -140
- package/plugins/hooks/layer1-validator.js +0 -539
- package/plugins/hooks/lib/artifact-evaluator.js +0 -414
- package/plugins/hooks/lib/auto-fix-loop.js +0 -605
- package/plugins/hooks/lib/benchmarks/change-detector.js +0 -390
- package/plugins/hooks/lib/benchmarks/evaluator.js +0 -605
- package/plugins/hooks/lib/benchmarks/integration-example.js +0 -169
- package/plugins/hooks/lib/data-and-ai-detector.js +0 -275
- package/plugins/hooks/lib/detection-pattern-loader.js +0 -865
- package/plugins/hooks/lib/directory-discovery.js +0 -395
- package/plugins/hooks/lib/environment-config-loader.js +0 -345
- package/plugins/hooks/lib/environment-detector.js +0 -553
- package/plugins/hooks/lib/environment-evolver.js +0 -564
- package/plugins/hooks/lib/environment-registry.js +0 -813
- package/plugins/hooks/lib/execution-path.js +0 -427
- package/plugins/hooks/lib/hook-error-recorder.js +0 -245
- package/plugins/hooks/lib/hook-logger.js +0 -538
- package/plugins/hooks/lib/hook-runner.js +0 -97
- package/plugins/hooks/lib/hook-state-manager.js +0 -578
- package/plugins/hooks/lib/memory-extractor.js +0 -399
- package/plugins/hooks/lib/memory-manager.js +0 -673
- package/plugins/hooks/lib/metrics-analyzer.js +0 -489
- package/plugins/hooks/lib/project-evolution/auto-fixer.js +0 -511
- package/plugins/hooks/lib/project-evolution/memory-manager.js +0 -346
- package/plugins/hooks/lib/project-evolution/pattern-detector.js +0 -476
- package/plugins/hooks/lib/project-evolution/semantic-indexer.js +0 -480
- package/plugins/hooks/lib/project-structure-detector.js +0 -326
- package/plugins/hooks/lib/rollback-tracker.js +0 -346
- package/plugins/hooks/lib/source-code-scanner.js +0 -596
- package/plugins/hooks/lib/technology-stack-detector.js +0 -374
- package/plugins/hooks/lib/test-auto-fix.test.js +0 -194
- package/plugins/hooks/lib/test-failure-analyzer.js +0 -375
- package/plugins/hooks/lib/test-failure-fixer.js +0 -268
- package/plugins/hooks/lib/trace-context.js +0 -277
- package/plugins/hooks/lib/validation-patterns.js +0 -415
- package/plugins/hooks/memory-sync.js +0 -171
- package/plugins/hooks/monitoring-trigger.js +0 -467
- package/plugins/hooks/pipeline-observer.js +0 -413
- package/plugins/hooks/scope-sentinel.js +0 -204
- package/plugins/hooks/trace-initialization.js +0 -169
- package/plugins/memory/templates/code-quality.yaml +0 -149
- package/plugins/memory/templates/multi-system.yaml +0 -155
- package/plugins/memory/templates/team-habits.yaml +0 -119
- package/plugins/memory/templates/testing.yaml +0 -121
- package/plugins/skills/README.en.md +0 -59
- package/plugins/skills/README.md +0 -114
- package/plugins/skills/agile-iteration/SKILL.md +0 -187
- package/plugins/skills/benchmark-executor/SKILL.md +0 -647
- package/plugins/skills/benchmark-generator/SKILL.md +0 -349
- package/plugins/skills/delivery-stage/SKILL.md +0 -324
- package/plugins/skills/design-stage/SKILL.md +0 -307
- package/plugins/skills/experiment-evaluator/SKILL.md +0 -271
- package/plugins/skills/experiment-iteration/SKILL.md +0 -154
- package/plugins/skills/exploration-phase/SKILL.md +0 -216
- package/plugins/skills/field-evolve-analyzer/SKILL.md +0 -65
- package/plugins/skills/field-evolve-distiller/SKILL.md +0 -66
- package/plugins/skills/field-evolve-executor/SKILL.md +0 -94
- package/plugins/skills/field-evolve-executor/executor.js +0 -342
- package/plugins/skills/field-evolve-fixer/SKILL.md +0 -69
- package/plugins/skills/field-evolve-learner/SKILL.md +0 -65
- package/plugins/skills/field-evolve-scanner/SKILL.md +0 -87
- package/plugins/skills/field-evolve-scanner/scripts/fallback-scanner.js +0 -288
- package/plugins/skills/field-evolve-verifier/SKILL.md +0 -64
- package/plugins/skills/hotfix-iteration/SKILL.md +0 -279
- package/plugins/skills/implementation-stage/SKILL.md +0 -320
- package/plugins/skills/layer1-validation/SKILL.md +0 -79
- package/plugins/skills/pending-dashboard/SKILL.md +0 -110
- package/plugins/skills/project-evolve-analyzer/SKILL.md +0 -95
- package/plugins/skills/project-evolve-fixer/SKILL.md +0 -99
- package/plugins/skills/project-evolve-generator/SKILL.md +0 -149
- package/plugins/skills/project-evolve-learner/SKILL.md +0 -103
- package/plugins/skills/project-evolve-reviewer/SKILL.md +0 -104
- package/plugins/skills/project-evolve-scanner/SKILL.md +0 -95
- package/plugins/skills/project-evolve-scanner/scripts/dependency-reuse-checker.js +0 -395
- package/plugins/skills/project-evolve-scanner/scripts/subsystem-coverage.js +0 -315
- package/plugins/skills/project-evolve-verifier/SKILL.md +0 -105
- package/plugins/skills/requirement-stage/SKILL.md +0 -217
- package/plugins/skills/skill-forge/SKILL.md +0 -223
- package/plugins/skills/skill-forge/references/description-guide.md +0 -92
- package/plugins/skills/skill-forge/references/quality-rubric.md +0 -104
- package/plugins/skills/skill-forge/references/skill-template.md +0 -106
- package/plugins/skills/startup-guard/SKILL.md +0 -38
- package/plugins/skills/testing-stage/SKILL.md +0 -770
- package/plugins/skills/waterfall-iteration/SKILL.md +0 -115
- package/scripts/cli/global-init.js +0 -288
- package/scripts/cli/global.js +0 -324
- package/scripts/cli/index.js +0 -55
- package/scripts/cli/init.js +0 -408
- package/scripts/cli/list.js +0 -70
- package/scripts/cli/org.js +0 -340
- package/scripts/cli/update.js +0 -44
- package/scripts/config/commands.config.js +0 -145
- package/scripts/config/hooks.config.js +0 -197
- package/scripts/install/agents.js +0 -106
- package/scripts/install/commands.js +0 -133
- package/scripts/install/constants.js +0 -463
- package/scripts/install/hook-logger.js +0 -536
- package/scripts/install/hooks.js +0 -110
- package/scripts/install/index.js +0 -39
- package/scripts/install/skills.js +0 -95
- package/scripts/postinstall.js +0 -25
- package/scripts/state.js +0 -585
- /package/{plugins → dist/plugins}/hooks/lib/hook-runner.sh +0 -0
@@ -1,268 +0,0 @@
-# Test design: {project/feature name}
-
-> **Version**: v1.0
-> **Template sources**: ISTQB testing standards, test pyramid theory, Google testing methodology
-> **Scope**: design of unit, integration, end-to-end, and performance tests
-> **Generation flow**: test strategy → test cases → test data → test execution
-
----
-
-## 1. Test overview
-
-### 1.1 Basic information
-
-| Field | Value |
-|------|-----|
-| Test project name | |
-| System/feature under test | |
-| Test type | Unit/Integration/E2E/Performance/Security |
-| Test owner | |
-| Test environment | |
-
-### 1.2 Test objectives
-
-- [ ] Verify functional correctness
-- [ ] Cover boundary conditions
-- [ ] Handle exception scenarios
-- [ ] Meet performance targets
-- [ ] Detect security vulnerabilities
-
-### 1.3 Test scope
-
-**In scope**:
-- Functional module A
-- Functional module B
-
-**Out of scope**:
-- Third-party system integration (covered by integration tests)
-
----
-
-## 2. Test strategy
-
-### 2.1 Test pyramid
-
-```
-                  /\
-                /    \
-              /  E2E   \                  End-to-end tests (10%)
-            /____________\
-          /                \
-        /    Integration     \            Integration tests (20%)
-      /________________________\
-    /                            \
-  /              Unit              \      Unit tests (70%)
-/____________________________________\
-```
-
-### 2.2 Test type distribution
-
-| Test type | Share | Tools | Run frequency |
-|---------|------|------|---------|
-| Unit tests | 70% | Jest/JUnit/pytest | Every commit |
-| Integration tests | 20% | Testcontainers | Daily |
-| E2E tests | 10% | Selenium/Cypress | Weekly |
-
----
-
-## 3. Test cases
-
-### 3.1 Test case template
-
-#### TC-{number}: {test case name}
-
-| Attribute | Value |
-|------|-----|
-| Test case ID | TC-{number} |
-| Test name | |
-| Test type | Functional/Performance/Security |
-| Priority | P0/P1/P2 |
-| Preconditions | |
-
-**Test steps**:
-| Step | Action | Expected result |
-|------|------|---------|
-| 1 | | |
-| 2 | | |
-
-**Test data**:
-```json
-{
-  "input": {},
-  "expected": {}
-}
-```
-
----
-
-### 3.2 Test case list
-
-| Case ID | Case name | Type | Priority | Status |
-|--------|---------|------|--------|------|
-| TC-001 | | Functional | P0 | In design |
-| TC-002 | | Boundary | P1 | In design |
-
----
-
-## 4. Test scenarios
-
-### 4.1 Normal scenarios
-
-| Scenario ID | Description | Input | Expected output |
-|--------|---------|------|---------|
-| SC-001 | Normal flow | | |
-
-### 4.2 Boundary scenarios
-
-| Scenario ID | Boundary type | Input value | Expected behavior |
-|--------|---------|-------|---------|
-| SC-002 | Minimum | 0 | |
-| SC-003 | Maximum | MAX_INT | |
-| SC-004 | Empty value | null/empty | |
-
-### 4.3 Exception scenarios
-
-| Scenario ID | Exception type | Trigger | Expected handling |
-|--------|---------|---------|---------|
-| SC-005 | Network error | Timeout | Fail after 3 retries |
-| SC-006 | Data error | Invalid input | Return a validation error |
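The timeout scenario in the removed template (SC-005, fail after 3 retries) could be exercised with a helper along these lines. This is only a sketch: `call_with_retries` and `always_times_out` are hypothetical names, and it assumes "3 retries" means one initial call plus three further attempts.

```python
def call_with_retries(operation, max_retries=3):
    """Call operation(); retry on TimeoutError up to max_retries times, then re-raise."""
    retries_used = 0
    while True:
        try:
            return operation()
        except TimeoutError:
            if retries_used >= max_retries:
                raise  # all retries exhausted: surface the failure to the caller
            retries_used += 1


# Hypothetical stand-in for a network call that always times out.
attempts = []


def always_times_out():
    attempts.append(1)
    raise TimeoutError("simulated timeout")


try:
    call_with_retries(always_times_out)
except TimeoutError:
    print(f"failed after {len(attempts)} attempts")  # 1 initial call + 3 retries
```

A real test would likely also add a delay or backoff between attempts; that is left out here to keep the sketch deterministic.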
-
----
-
-## 5. Test data
-
-### 5.1 Data sources
-
-| Data type | Source | Notes |
-|---------|------|------|
-| Base data | Test database | Pre-seeded data |
-| Dynamic data | Generated via API | Created at runtime |
-
-### 5.2 Data preparation
-
-```sql
--- Test data setup script
-INSERT INTO users (id, name, email) VALUES (1, 'Test User', 'test@example.com');
-```
-
-### 5.3 Data cleanup
-
-```sql
--- Post-test cleanup script
-DELETE FROM users WHERE id IN (1, 2, 3);
-```
-
----
-
-## 6. Test execution
-
-### 6.1 Execution environments
-
-| Environment | Configuration | Purpose |
-|------|------|------|
-| Local development | MacBook Pro, 16 GB RAM | Unit tests |
-| CI environment | GitHub Actions | Integration tests |
-| Test environment | AWS EC2 | E2E tests |
-
-### 6.2 Execution commands
-
-```bash
-# Unit tests
-npm test
-
-# Integration tests
-npm run test:integration
-
-# E2E tests
-npm run test:e2e
-```
-
-### 6.3 Execution plan
-
-| Phase | Time | Scope | Owner |
-|------|------|---------|--------|
-| Phase 1 | | Unit tests | |
-| Phase 2 | | Integration tests | |
-
----
-
-## 7. Test coverage
-
-### 7.1 Coverage targets
-
-| Metric | Target | Current |
-|------|-------|-------|
-| Code coverage | > 80% | |
-| Branch coverage | > 70% | |
-| Requirement coverage | 100% | |
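As a rough illustration of how the targets in the removed table might be enforced, the sketch below turns covered/total counts into percentages and compares them with the thresholds. The function names are hypothetical, and treating the first two targets as strict inequalities is an assumption read off the "> 80%" and "> 70%" notation.

```python
def coverage_percent(covered, total):
    """Percentage of covered items; 0.0 when there is nothing to cover."""
    return 100.0 * covered / total if total else 0.0


def meets_targets(code_pct, branch_pct, requirement_pct):
    """Apply the template's targets: code > 80%, branch > 70%, requirements = 100%."""
    return code_pct > 80.0 and branch_pct > 70.0 and requirement_pct >= 100.0


code = coverage_percent(95, 113)    # line counts as in the sample report
branch = coverage_percent(50, 67)   # branch counts as in the sample report
print(round(code, 2), round(branch, 2), meets_targets(code, branch, 100.0))
```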
-
-### 7.2 Coverage report
-
-```
-=============================== coverage summary ===============================
-Stmts    : 85% ( 102/120 )
-Branches : 75% ( 50/67 )
-Funcs    : 90% ( 45/50 )
-Lines    : 84% ( 95/113 )
-================================================================================
-```
-
----
-
-## 8. Defect management
-
-### 8.1 Defect log
-
-| Defect ID | Description | Severity | Status | Related case |
-|--------|---------|---------|------|---------|
-| BUG-001 | | High/Medium/Low | New/In progress/Fixed | TC-001 |
-
-### 8.2 Defect workflow
-
-```mermaid
-flowchart LR
-    A[New] --> B[Confirmed]
-    B --> C[In progress]
-    C --> D[Fixed]
-    D --> E[Verified]
-    E --> F[Closed]
-```
-
----
-
-## 9. Test report
-
-### 9.1 Test results
-
-| Test type | Total | Passed | Failed | Skipped | Pass rate |
-|---------|------|------|------|------|-------|
-| Unit tests | | | | | |
-| Integration tests | | | | | |
-| E2E tests | | | | | |
-
-### 9.2 Test conclusion
-
-- [ ] Tests passed; ready to release
-- [ ] Tests passed, with known issues
-- [ ] Tests failed; fixes required
-
----
-
-## 10. Appendix
-
-### 10.1 Test tools
-
-| Tool | Purpose | Version |
-|---------|------|------|
-| | | |
-
-### 10.2 References
-
-- [ISTQB testing standards](url)
-- [Google testing methodology](url)
-
----
-
-**Maintainer**: QA team
-**Evolution zone**: Free zone
-**Related documents**: `knowledge/09-templates/09-evaluation-design.md`, `knowledge/09-templates/01-architecture-design.md`
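@@ -1,372 +0,0 @@
-# Evaluation design: {AI model/feature name}
-
-> **Version**: v1.0
-> **Template sources**: AI evaluation best practices, ML evaluation methodology, industry evaluation benchmarks
-> **Scope**: AI model quality evaluation, LLM application evaluation, RAG system evaluation
-> **Generation flow**: evaluation goals → evaluation datasets → evaluation metrics → evaluation execution
-
----
-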
-## 1. Evaluation overview
-
-### 1.1 Basic information
-
-| Field | Value |
-|------|-----|
-| Evaluation project name | |
-| AI system/model under evaluation | |
-| Evaluation type | Quality/Performance/Safety |
-| Evaluation owner | |
-| Evaluation environment | |
-
-### 1.2 Evaluation objectives
-
-- [ ] Verify that model quality meets its targets
-- [ ] Compare different models/versions
-- [ ] Find bad cases and optimize for them
-- [ ] Assess release risk
-
-### 1.3 Evaluation scope
-
-**In scope**:
-- Core scenario evaluation
-- Boundary scenario evaluation
-
-**Out of scope**:
-- Extreme scenarios (covered by dedicated evaluations)
-
----
-
-## 2. Evaluation datasets
-
-### 2.1 Dataset overview
-
-| Dataset | Source | Size | Purpose |
-|-----------|---------|-------|------|
-| Test set A | Sampled from production | 1000 | Core scenario evaluation |
-| Test set B | Manually constructed | 200 | Boundary scenario evaluation |
-
-### 2.2 Data distribution
-
-| Category | Training set | Validation set | Test set |
-|------|-------|-------|-------|
-| Category A | 70% | 15% | 15% |
-| Category B | 70% | 15% | 15% |
-
-### 2.3 Data sample
-
-```json
-{
-  "id": "eval-001",
-  "input": "user input",
-  "expected_output": "expected output",
-  "metadata": {
-    "scene": "scenario type",
-    "difficulty": "easy/medium/hard"
-  }
-}
-```
-
----
-
-## 3. Evaluation metrics
-
-### 3.1 Quality metrics
-
-| Metric | Definition | Formula | Target |
-|---------|------|---------|-------|
-| Accuracy | Share of predictions that are correct | (TP+TN)/(TP+TN+FP+FN) | > 90% |
-| Precision | Share of predicted positives that are correct | TP/(TP+FP) | > 85% |
-| Recall | Share of actual positives that are found | TP/(TP+FN) | > 85% |
-| F1 score | Harmonic mean of precision (P) and recall (R) | 2PR/(P+R) | > 85% |
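The four formulas in the removed table can be computed directly from confusion-matrix counts. The sketch below is illustrative only: `classification_metrics` is a hypothetical helper, and returning 0.0 for an empty denominator is a convention chosen here, not something the template specifies.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts, purely to show the call.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 175/200 and precision is 90/100, so the run would clear the template's precision target but fall short of the 90% accuracy target.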
-
-### 3.2 LLM-specific metrics
-
-| Metric | Description | Evaluation method |
-|---------|------|---------|
-| Answer accuracy | Whether the answer is correct | Human rating / LLM-as-judge |
-| Answer completeness | Whether all key points are covered | Human rating |
-| Answer relevance | Whether the answer is on topic | Human rating |
-| Safety | Whether harmful content is present | Rule-based detection + human review |
-| Perplexity | Uncertainty of the language model's predictions | Compute the PPL of the generated text |
-| BERTScore | Semantic similarity | F1 score over BERT embeddings |
-| Toxicity score | Detection of harmful/biased content | Perspective API / toxicity model |
-| Helpfulness score | Degree of RLHF alignment | Human rating (1-5) |
-
-### 3.3 RAG-specific metrics
-
-| Metric | Description | Formula |
-|---------|------|---------|
-| Retrieval precision | Share of retrieved documents that are relevant | relevant retrieved / total retrieved |
-| Retrieval recall | Share of relevant documents that are retrieved | relevant retrieved / total relevant |
-| Context relevance | Relevance of retrieved content to the query | Human rating / LLM-as-judge |
-| Faithfulness | Whether the answer is grounded in the retrieved content | Factual consistency check |
-| Citation accuracy | Accuracy of cited sources | correct citations / total citations |
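Retrieval precision and recall from the removed table reduce to set arithmetic over document IDs. A minimal sketch follows, assuming each query has a known set of relevant IDs; the function name and the sample IDs are hypothetical.

```python
def retrieval_precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for one query, treating both inputs as sets of IDs."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up IDs: 2 of the 4 retrieved documents are relevant, out of 3 relevant overall.
precision, recall = retrieval_precision_recall(
    ["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"]
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging these per-query values over the dataset is one common way to report a single number, though the template does not say which aggregation it intends.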
-
-### 3.4 Performance metrics
-
-| Metric | Target | Description |
-|---------|-------|------|
-| Response time (P50) | < 500 ms | Latency within which 50% of requests complete |
-| Response time (P99) | < 2 s | Latency within which 99% of requests complete |
-| QPS | > 100 | Queries per second |
-| Concurrency | > 50 | Maximum number of concurrent connections |
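P50 and P99 can be read off a sorted list of measured latencies. The sketch below uses the nearest-rank definition of a percentile, which is one of several conventions and an assumption here; the latency values are invented.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = [120, 340, 95, 410, 2300, 180, 220, 150, 310, 275]  # made-up measurements
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"P50={p50} ms (target < 500), P99={p99} ms (target < 2000)")
```

With only ten samples the P99 is simply the slowest request, which is why tail-latency targets usually need far larger sample sizes to be meaningful.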
-
----
-
-## 4. Evaluation execution
-
-### 4.1 Evaluation preparation
-
-**Step 1: Prepare the evaluation environment**
-
-1. Check that an evaluation plan exists (`evaluation-plan.md`)
-2. Check that an evaluation dataset exists (`evaluation/dataset/`)
-3. Check that evaluation scripts exist (`tests/evaluation/`, `evaluation/`)
-
-**Step 2: Load the evaluation dataset**
-
-1. Read the evaluation dataset
-2. Validate the dataset format
-3. Count the dataset size
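Step 2 (read, validate, count) might look like the following for a JSON Lines dataset. This is a sketch under assumptions: one JSON object per line, and `id`, `input` and `expected_output` treated as required because they appear in the template's sample record. `load_dataset` is a hypothetical name.

```python
import json

REQUIRED_KEYS = {"id", "input", "expected_output"}  # assumed from the sample record


def load_dataset(lines):
    """Parse JSONL lines, validate required keys, and return the list of records."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        records.append(record)
    return records


sample_lines = [
    '{"id": "eval-001", "input": "user input", "expected_output": "expected output"}',
    "",
    '{"id": "eval-002", "input": "another input", "expected_output": "another output"}',
]
dataset = load_dataset(sample_lines)
print(f"loaded {len(dataset)} records")
```

Passing an open file object instead of a list works the same way, since both iterate line by line.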
-
-### 4.2 Evaluation flow
-
-```mermaid
-flowchart TD
-    A[Prepare evaluation data] --> B[Run evaluation]
-    B --> C[Collect results]
-    C --> D[Analyze metrics]
-    D --> E{Targets met?}
-    E -->|Yes| F[Evaluation passed]
-    E -->|No| G[Analyze bad cases]
-    G --> H[Optimize model]
-    H --> B
-```
-
-### 4.3 Evaluation execution steps
-
-**Step 3: Run the evaluation**
-
-1. Initialize the AI/model component under evaluation
-2. Run inference for each test case
-3. Collect the predictions
-
-**Step 4: Compute the evaluation metrics**
-
-Compute the metrics as defined in the evaluation plan:
-
-1. **Accuracy metrics**:
-   - Activation accuracy, matching accuracy, etc.
-
-2. **Quality metrics**:
-   - Response quality, task completion rate, etc.
-
-3. **Performance metrics**:
-   - Response time, throughput, etc.
-
-**Step 5: Generate the evaluation report**
-
-1. Summarize all metrics
-2. Compare against the target values
-3. Identify bad cases
-
-### 4.4 Evaluation commands
-
-```bash
-# Run the evaluation
-python evaluate.py --model {model_name} --dataset {dataset_name}
-
-# Generate the report
-python generate_report.py --output report.md
-```
-
-### 4.5 Evaluation configuration
-
-```yaml
-evaluation:
-  model:
-    name: "{model_name}"
-    version: v1.0
-  dataset:
-    name: "{dataset_name}"
-    path: data/test.jsonl
-  metrics:
-    - accuracy
-    - precision
-    - recall
-    - f1
-```
-
----
-
-## 5. Bad case analysis
-
-### 5.1 Bad case categories
-
-| Category | Count | Share | Description |
-|------|------|------|------|
-| Data quality issues | | | Labeling errors / noisy data |
-| Insufficient model capability | | | The model cannot understand a class of inputs |
-| Boundary scenarios | | | Extreme inputs |
-| Other | | | |
-
-### 5.2 Bad case examples
-
-| ID | Input | Expected output | Actual output | Error type |
-|----|------|---------|---------|---------|
-| 001 | | | | |
-
-### 5.3 Improvement suggestions
-
-| Issue | Proposed improvement | Priority |
-|------|---------|--------|
-| | | P0/P1/P2 |
-
----
-
-## 6. Evaluation results
-
-### 6.1 Results summary
-
-| Evaluation set | Samples | Accuracy | Precision | Recall | F1 score |
-|-------|-------|--------|-------|-------|--------|
-| Test set A | 1000 | | | | |
-| Test set B | 200 | | | | |
-| Total | 1200 | | | | |
-
-### 6.2 Evaluation report format
-
-```markdown
-## Quality evaluation results
-
-### Evaluation target
-- Component name: {component_name}
-- Evaluation dataset: {dataset_path}
-- Number of test cases: {total_cases}
-
-### Evaluation metrics
-| Metric | Target | Actual | Status |
-|----------|--------|--------|------|
-| ... | ... | ... | ✅/❌ |
-
-### Bad case analysis
-| Case ID | Input | Expected output | Actual output | Issue description |
-|--------|------|----------|----------|----------|
-| ... | ... | ... | ... | ... |
-
-### Conclusion
-- Evaluation passed: ✅ Yes/❌ No
-- Metrics meeting target: {passed_count}/{total_count}
-- Needs optimization: {items to optimize}
-```
-
-### 6.3 Results analysis
-
-**Strengths**:
-- Performs well in scenario XX
-
-**Weaknesses**:
-- Needs improvement in scenario XX
-
-### 6.4 Comparison with baseline
-
-| Model | Accuracy | Precision | Recall | F1 score |
-|------|-------|-------|-------|--------|
-| Baseline model | | | | |
-| Current model | | | | |
-| Improvement | +X% | +X% | +X% | +X% |
-
----
-
-## 7. Pass/fail criteria
-
-| Verdict | Description |
-|---------|------|
-| **Pass** | All metrics reach their targets; ready to launch |
-| **Partial pass** | Some metrics reach their targets; decide after a risk assessment |
-| **Fail** | Key metrics miss their targets; optimize and re-evaluate |
-
-### Decision flow
-
-```mermaid
-flowchart TD
-    A[Evaluation complete] --> B{All metrics meet targets?}
-    B -->|Yes| C[Evaluation passed]
-    B -->|No| D{Key metrics meet targets?}
-    D -->|Yes| E[Partial pass, risk assessment]
-    D -->|No| F[Fail, optimization required]
-    E --> G{Risk acceptable?}
-    G -->|Yes| H[Conditional pass]
-    G -->|No| F
-```
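The decision flow in the removed flowchart maps onto a small function. This is a sketch only: the three boolean inputs and the verdict strings are illustrative stand-ins, since the template does not define how "key metrics" or "acceptable risk" are determined.

```python
def evaluation_verdict(all_metrics_pass, key_metrics_pass, risk_acceptable):
    """Follow the flowchart: all pass -> pass; key pass + acceptable risk -> conditional pass."""
    if all_metrics_pass:
        return "pass"
    if not key_metrics_pass:
        return "fail"
    # Partial pass: the outcome depends on the risk assessment.
    return "conditional pass" if risk_acceptable else "fail"


print(evaluation_verdict(all_metrics_pass=False, key_metrics_pass=True, risk_acceptable=True))
```

Note that `risk_acceptable` is only consulted on the partial-pass branch, mirroring the flowchart, where the risk question is asked only after key metrics have met their targets.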
-
----
-
-## 8. Evaluation conclusion
-
-### 8.1 Conclusion
-
-- [ ] Evaluation passed; ready to launch
-- [ ] Evaluation passed, with known issues
-- [ ] Evaluation failed; optimization required
-
-### 8.2 Risks
-
-| Risk | Impact | Mitigation |
-|------|------|---------|
-| | | |
-
-### 8.3 Follow-up plan
-
-| Task | Owner | Time |
-|------|-------|------|
-| | | |
-
----
-
-## 9. Appendix
-
-### 9.1 Evaluation tools
-
-| Tool | Purpose | Version |
-|---------|------|------|
-| | | |
-
-### 9.2 Evaluation execution steps (from 08-evaluate-ai-effect.md)
-
-**Step 1: Prepare the evaluation environment**
-- Check the evaluation plan (`evaluation-plan.md`)
-- Check the evaluation dataset (`evaluation/dataset/`)
-- Check the evaluation scripts (`tests/evaluation/`, `evaluation/`)
-
-**Step 2: Load the evaluation dataset**
-- Read the evaluation dataset
-- Validate the dataset format
-- Count the dataset size
-
-**Step 3: Run the evaluation**
-- Initialize the AI/model component under evaluation
-- Run inference for each test case
-- Collect the predictions
-
-**Step 4: Compute the evaluation metrics**
-- Accuracy metrics
-- Quality metrics
-- Performance metrics
-
-**Step 5: Generate the evaluation report**
-- Summarize all metrics
-- Compare against the target values
-- Identify bad cases
-
-### 9.3 References
-
-- [AI evaluation best practices](url)
-- [LLM evaluation methodology](url)
-
----
-
-**Maintainers**: AI team + QA team
-**Evolution zone**: Free zone
-**Related documents**: `knowledge/09-templates/08-test-design.md`, `knowledge/09-templates/02-api-design.md`