ai-skill-interface 1.4.0
- package/CHANGELOG.md +137 -0
- package/LICENSE +21 -0
- package/README.md +579 -0
- package/bin/init.js +50 -0
- package/package.json +25 -0
- package/skills/agent-orchestration/SKILL.md +37 -0
- package/skills/ai-token-optimize/SKILL.md +36 -0
- package/skills/auto-select/SKILL.md +58 -0
- package/skills/coverage/SKILL.md +26 -0
- package/skills/delivery-workflow/SKILL.md +31 -0
- package/skills/docs-sync/SKILL.md +27 -0
- package/skills/evaluation/SKILL.md +41 -0
- package/skills/finalize/SKILL.md +20 -0
- package/skills/framework-selection/SKILL.md +32 -0
- package/skills/hexagonal-development/SKILL.md +23 -0
- package/skills/human-in-the-loop/SKILL.md +42 -0
- package/skills/interface-first-development/SKILL.md +16 -0
- package/skills/observability/SKILL.md +35 -0
- package/skills/principle-audit/SKILL.md +32 -0
- package/skills/rag-development/SKILL.md +26 -0
- package/skills/security-audit/SKILL.md +22 -0
- package/skills/test-runner/SKILL.md +20 -0
- package/skills/version/SKILL.md +18 -0
package/skills/ai-token-optimize/SKILL.md
@@ -0,0 +1,36 @@
---
name: ai-token-optimize
description: Optimize AI-consumed code and data for token efficiency — reduce token usage while preserving semantic fidelity. Applies to prompts, tool outputs, and inter-agent messages.
---

## Scope

- **include**: AI-consumed data (prompts, tool returns, agent messages, AI-parsed structures, LLM context)
- **exclude**: human-facing data (docs, tests, config, schema)

## Techniques

1. **legend**: define all abbreviations once at prompt start; reuse without re-definition — prefer abbreviations already present in model training data (domain-standard codes require no definition)
2. **compact k:v**: verbose structured labels → short key:value pairs, pipe or comma-separated
3. **tabular**: repeated-structure collections → typed header with column list once, then value rows only
4. **numeric notation**: express numbers as `{digit_count:value}` to prevent tokenizer fragmentation on large or precise values
5. **structural tags**: lightweight semantic markup for boundaries instead of deep nesting or verbose prose
6. **placement**: identity and critical directives at top, supporting data in middle, output schema at bottom — mitigates attention degradation in long contexts

## Sequence

scan AI-consumed targets → identify repetition and verbosity → apply techniques → verify AI can parse output → run tests → commit per module

## Rules
<constraints>
- apply only to AI-consumed data — never to human-facing content
- verify after each technique: AI must parse output correctly
- legend abbreviations must be unambiguous within their scope
- tabular format requires consistent column order across all rows
- placement order: legend → identity/directives → data → output schema
</constraints>

## Done
<criteria>
labels compressed + collections tabular + numerics notation-formatted + critical info at top + output schema at bottom + tests pass
</criteria>
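Technique 3 (tabular) can be sketched in Python. This is a minimal illustration, not part of the package; the record fields are hypothetical:

```python
def to_tabular(records):
    """Compress a list of same-shape dicts into one header row plus value rows.

    The column list is emitted once; each record then contributes only its
    pipe-separated values, so keys are not repeated per record.
    """
    if not records:
        return ""
    columns = list(records[0])
    header = "cols:" + ",".join(columns)
    rows = ["|".join(str(r[c]) for c in columns) for r in records]
    return "\n".join([header, *rows])

# Illustrative records — any repeated-structure collection works the same way.
orders = [
    {"id": 1, "status": "shipped", "total": 42.5},
    {"id": 2, "status": "pending", "total": 7.0},
]
compact = to_tabular(orders)
```

The same column order is kept for every row, matching the constraint that tabular format requires consistent column order.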
package/skills/auto-select/SKILL.md
@@ -0,0 +1,58 @@
---
name: auto-select
description: Detect project context and automatically select, apply, and propose skills. This is the entry-point skill — loaded first, drives all other skill activation.
---

<instruction>
You have access to a skill library installed at ~/.claude/skills/. Each skill defines WHAT must be achieved — you decide HOW based on the project's language, tools, and conventions.

On every task, before acting:
1. Detect context signals from the project and the user's request
2. Select applicable skills from the installed set
3. Apply them in dependency order (check `requires:` in frontmatter)
4. If no existing skill covers a needed concern, propose one
</instruction>

## Context Signals

Detect before selecting:

```
change.type    : code | ai-feature | explicit
change.scope   : core-logic | interface | infra | test | docs
arch.pattern   : hexagonal | layered | none
quality.status : no-tests | low-coverage | high-coverage
ai.complexity  : single-call | pipeline | stateful | multi-agent
action.risk    : reversible | irreversible
```

## Selection Rules

```
change.type=code
  → delivery-workflow
  + hexagonal-development (arch.pattern=hexagonal or layered)
  + interface-first-development (change.scope=interface)
  → finalize (after completion)

change.type=ai-feature
  → framework-selection (always first)
  + rag-development (retrieval pipeline present)
  + observability (ai.complexity≥pipeline)
  + evaluation (quality measurement needed)
  + human-in-the-loop (action.risk=irreversible)
  + agent-orchestration (ai.complexity=multi-agent)

change.type=explicit
  → the skill the user named directly
```

## Proposing Missing Skills

<constraints>
- if a task requires a concern not covered by any installed skill, propose a new skill
- proposal target: the repository that hosts this skill library (detect from git remote or package metadata)
- proposal method: create an issue with label "feature" in the skill library repository, then create a branch and PR with the new SKILL.md
- the proposed skill must follow all skill design rules: no technology references, WHAT not HOW, single responsibility
- check installed skills first to avoid overlap before proposing
</constraints>
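The selection rules above can be sketched as a small dispatch function. This is an illustrative rendering only (the signal keys use underscores instead of dots, and the boolean flags `has_retrieval` and `needs_evaluation` are hypothetical names for the "retrieval pipeline present" and "quality measurement needed" conditions):

```python
def select_skills(ctx):
    """Map detected context signals to skill names per the selection rules."""
    skills = []
    if ctx.get("change_type") == "code":
        skills.append("delivery-workflow")
        if ctx.get("arch_pattern") in ("hexagonal", "layered"):
            skills.append("hexagonal-development")
        if ctx.get("change_scope") == "interface":
            skills.append("interface-first-development")
        skills.append("finalize")  # after completion
    elif ctx.get("change_type") == "ai-feature":
        skills.append("framework-selection")  # always first
        if ctx.get("has_retrieval"):
            skills.append("rag-development")
        # ai.complexity ≥ pipeline covers pipeline, stateful, and multi-agent
        if ctx.get("ai_complexity") in ("pipeline", "stateful", "multi-agent"):
            skills.append("observability")
        if ctx.get("needs_evaluation"):
            skills.append("evaluation")
        if ctx.get("action_risk") == "irreversible":
            skills.append("human-in-the-loop")
        if ctx.get("ai_complexity") == "multi-agent":
            skills.append("agent-orchestration")
    elif ctx.get("change_type") == "explicit":
        skills.append(ctx["named_skill"])
    return skills
```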
package/skills/coverage/SKILL.md
@@ -0,0 +1,26 @@
---
name: coverage
description: Analyze test coverage and drive it to 80%+ by writing missing tests. Detects the project's test runner automatically.
requires: [test-runner]
---

## Sequence

detect coverage tool → measure → identify uncovered code → write tests → remeasure → repeat until ≥80%

## Priority

core business logic → service/use-case → infrastructure/integration → entry points

## Test Strategy

happy path + boundary values + error paths + all conditional branches

## Exclusions

entry-point bootstrap | auto-generated files | config/env files | type-definition-only files

## Done
<criteria>
coverage ≥80% + all tests pass
</criteria>
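The "identify uncovered code" step, filtered by the exclusions above, might look like this sketch (the per-file report shape and category labels are assumptions, not a real coverage tool's output):

```python
def coverage_gaps(file_coverage, threshold=80.0):
    """Return files below the coverage threshold, worst first,
    skipping the excluded categories (bootstrap, generated, config, types)."""
    excluded = ("bootstrap", "generated", "config", "types")
    gaps = [
        (path, pct)
        for path, (pct, category) in file_coverage.items()
        if category not in excluded and pct < threshold
    ]
    return sorted(gaps, key=lambda item: item[1])
```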
package/skills/delivery-workflow/SKILL.md
@@ -0,0 +1,31 @@
---
name: delivery-workflow
description: Enforce the full delivery cycle for every implementation task. Covers coding, testing, coverage, commit rules, and auto-commit on completion. This skill MUST be followed for all code changes.
---

## Sequence

implement → run tests → fix failures → repeat until pass → add/update tests for changed behavior → rerun → commit by purpose

## Rules
<constraints>
- every code change requires test run
- new behavior requires new tests
- changed behavior requires updated tests
- commit only when all tests pass
- never mix purposes in one commit
- auto-commit when work is complete
</constraints>

## Commit Format

```
format: <type>: <imperative summary, ≤72 chars>
types: feat=new feature | fix=bug fix | refactor=no behavior change | test=test change | docs=doc change | chore=build/config/deps
forbidden: vague summaries (update, fix issues, changes)
```

## Done
<criteria>
all tests pass + tests exist for changed behavior + commits separated by purpose
</criteria>
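The commit format above is mechanical enough to validate. A minimal sketch (the function name is ours, not the package's):

```python
import re

TYPES = ("feat", "fix", "refactor", "test", "docs", "chore")
VAGUE = ("update", "fix issues", "changes")

def valid_commit(message):
    """Check a commit summary line against the format rules:
    known type prefix, ≤72 chars, no vague summary."""
    m = re.fullmatch(r"(\w+): (.+)", message)
    if not m or m.group(1) not in TYPES:
        return False
    if len(message) > 72 or m.group(2).lower() in VAGUE:
        return False
    return True
```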
package/skills/docs-sync/SKILL.md
@@ -0,0 +1,27 @@
---
name: docs-sync
description: Detect documentation drift from recent code changes and synchronize docs to match the current codebase.
---

## Sequence

scan project docs → identify changed code → map changes to affected docs → update docs → validate structure

## Change-to-Doc Mapping

- new public API/module → API docs, architecture overview
- public interface change → API docs, usage examples
- config change → config guide, env examples
- schema change → data model docs
- dependency change → install guide
- architecture change → architecture docs

## Rules
<constraints>
- code is source of truth: docs follow code
- add sections for new topics
- remove docs for deleted features
- fix broken internal links
- sync TOC with actual headings
- remove duplicate content
</constraints>
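The change-to-doc mapping is a straight lookup table; a sketch (change-kind keys and doc labels are shorthand for the rows above, not file names):

```python
CHANGE_TO_DOCS = {
    "public-api": ["api-docs", "architecture-overview"],
    "interface": ["api-docs", "usage-examples"],
    "config": ["config-guide", "env-examples"],
    "schema": ["data-model-docs"],
    "dependency": ["install-guide"],
    "architecture": ["architecture-docs"],
}

def affected_docs(change_kinds):
    """Collect the unique, order-preserving set of docs touched by a batch of changes."""
    docs = []
    for kind in change_kinds:
        for doc in CHANGE_TO_DOCS.get(kind, []):
            if doc not in docs:
                docs.append(doc)
    return docs
```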
package/skills/evaluation/SKILL.md
@@ -0,0 +1,41 @@
---
name: evaluation
description: Build evaluation pipelines for AI outputs — create datasets, write evaluators, and measure quality systematically.
requires: [observability]
---

## Sequence

collect dataset → write evaluators → run baseline → change system → rerun → compare delta → iterate

## Dataset Types

```
final response → input → expected output
step-level     → input → expected intermediate step
trajectory     → input → expected sequence of steps
```

## Evaluator Types

```
code evaluator  → deterministic check (exact match, schema, format)
LLM-as-judge    → semantic check (correctness, tone, safety)
human evaluator → gold standard for ambiguous criteria
```

## Rules

<constraints>
- prefer code evaluators for measurable criteria
- every evaluator must return a score and a reason
- judge model must differ from the model being evaluated
- evaluators must produce consistent output for the same input
- record a baseline before any system change
</constraints>

## Done

<criteria>
dataset covers representative inputs + each evaluator returns score+reason + baseline recorded + regression detectable
</criteria>
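A code evaluator and a baseline run, per the constraints above (every evaluator returns score plus reason; baseline recorded before changes). A minimal sketch with hypothetical names:

```python
def exact_match_evaluator(expected, actual):
    """Deterministic code evaluator: returns the required (score, reason) pair."""
    if actual == expected:
        return 1.0, "exact match"
    return 0.0, f"expected {expected!r}, got {actual!r}"

def run_baseline(dataset, system, evaluator):
    """Score a system over (input, expected) pairs and record the mean as baseline."""
    results = [evaluator(expected, system(inp)) for inp, expected in dataset]
    mean = sum(score for score, _ in results) / len(results)
    return {"mean": mean, "results": results}
```

Because the evaluator is deterministic, rerunning after a system change yields a directly comparable delta against the recorded mean.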
package/skills/finalize/SKILL.md
@@ -0,0 +1,20 @@
---
name: finalize
description: Post-implementation pipeline — run tests, enforce 80% coverage, sync docs, then commit atomically. Run after every feature or fix.
requires: [test-runner, coverage, docs-sync, delivery-workflow]
---

## Sequence

test(detect→run→fix→pass) → coverage(measure→write tests→≥80%) → docs-sync(detect drift→update) → commit(by purpose, type: summary)

## Output Format

```
Tests:total=N passed=N failed=N Coverage:N% Docs:[files] Commits:[hash:msg]
```

## Done
<criteria>
all tests pass + coverage ≥80% + docs synced + commits created
</criteria>
package/skills/framework-selection/SKILL.md
@@ -0,0 +1,32 @@
---
name: framework-selection
description: Choose the right tool, library, or architecture for the task — minimal complexity for the requirement.
---

## Sequence

classify problem complexity → detect what the project already uses → match to the lowest sufficient tier → record rationale

## Complexity Tiers

```
single call       → no framework
pipeline          → composable steps, minimal orchestration
stateful workflow → explicit state, resumability
multi-agent       → routing, delegation, parallelism
```

## Rules

<constraints>
- detect existing project tools before proposing new ones
- start at the lowest tier that satisfies the stated requirement
- escalate only when the lower tier cannot meet a requirement
- switching tiers mid-project requires explicit justification
</constraints>

## Done

<criteria>
chosen approach matches complexity tier + existing conventions respected + rationale recorded
</criteria>
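"Match to the lowest sufficient tier" can be sketched as a capability lookup. The capability sets below are our reading of the tier descriptions, not a definition from the package:

```python
TIERS = ["single-call", "pipeline", "stateful-workflow", "multi-agent"]

# Assumed capability sets, derived from the tier descriptions above.
CAPABILITIES = {
    "single-call": set(),
    "pipeline": {"composition"},
    "stateful-workflow": {"composition", "state", "resumability"},
    "multi-agent": {"composition", "state", "resumability",
                    "routing", "delegation", "parallelism"},
}

def lowest_sufficient_tier(needs):
    """Return the first (lowest) tier whose capabilities cover the stated needs."""
    for tier in TIERS:
        if set(needs) <= CAPABILITIES[tier]:
            return tier
    raise ValueError(f"unmet needs: {needs}")
```

Starting from the lowest tier and stopping at the first match enforces "escalate only when the lower tier cannot meet a requirement".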
package/skills/hexagonal-development/SKILL.md
@@ -0,0 +1,23 @@
---
name: hexagonal-development
description: Detect and enforce layered architecture boundaries. Use when the project separates concerns into layers and changes touch layer boundaries.
requires: [interface-first-development]
---

## Sequence

detect project layer structure → apply change flow inside-out

## Change Flow

domain/business rule → inbound contract → outbound contract → infrastructure adapter → presentation mapping

## Rules
<constraints>
- detect existing layer naming, do not rename/restructure unless requested
- domain layer: zero deps on infrastructure or presentation
- infrastructure types must not leak into domain or contract layers
- explicit mapping between infrastructure models and domain models
- presentation layer: thin, delegates to use-cases
- deps always point inward: outer→inner, never reverse
</constraints>
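The "deps always point inward" rule is checkable once each layer has a depth. A sketch (the layer names and depth numbers are illustrative; the skill itself says to detect the project's own naming):

```python
# Assumed depths: 0 is innermost. Infrastructure and presentation are both
# outermost here — a simplification; real projects may rank them differently.
LAYER_DEPTH = {"domain": 0, "application": 1, "infrastructure": 2, "presentation": 2}

def violations(import_edges):
    """Flag inner→outer imports: a (src, dst) edge where src is more
    inward than dst points the wrong way."""
    return [
        (src, dst)
        for src, dst in import_edges
        if LAYER_DEPTH[src] < LAYER_DEPTH[dst]
    ]
```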
package/skills/human-in-the-loop/SKILL.md
@@ -0,0 +1,42 @@
---
name: human-in-the-loop
description: Insert human approval, review, or correction checkpoints into AI workflows — interrupt, wait, resume safely.
---

## Sequence

reach checkpoint → serialize state → emit review request → halt → human acts → restore state → resume

## Interrupt Triggers

```
irreversible actions → destructive, external, financial
low confidence       → below an explicit threshold
high-cost ambiguity  → wrong assumption cannot be easily undone
compliance gate      → policy or sign-off required
limit approaching    → resource or cost ceiling about to be exceeded
```

## Escalation Tiers

```
soft → surface suggestion, continue unless rejected
hard → halt until explicit approval
stop → halt and require human restart
```

## Rules

<constraints>
- state at interrupt must be fully serializable
- resume path must be idempotent — no duplicated side effects
- review payload must include: pending action, context, options
- rejected actions must be recorded with reason
- never auto-resume on timeout without an explicit policy
</constraints>

## Done

<criteria>
all irreversible actions have an interrupt point + state serializable + resume idempotent + rejections recorded
</criteria>
```
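The serialize/halt/resume cycle can be sketched as two functions. A minimal illustration of the constraints (serializable state, review payload with pending action/context/options, idempotent resume); the state keys are our own, not the package's:

```python
import json

def checkpoint(state, pending_action, options):
    """Serialize the full state plus a review payload; the caller halts after this."""
    return json.dumps({
        "state": state,
        "review": {"pending_action": pending_action,
                   "context": state.get("context", ""),
                   "options": options},
    })

def resume(serialized, decision):
    """Restore state and apply the human decision. Re-approving an action
    already recorded is a no-op, keeping the resume path idempotent."""
    payload = json.loads(serialized)
    state = payload["state"]
    action = payload["review"]["pending_action"]
    applied = state.setdefault("applied_actions", [])
    if decision == "approve" and action not in applied:
        applied.append(action)
    elif decision == "reject":
        state.setdefault("rejections", []).append(action)
    return state
```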
package/skills/interface-first-development/SKILL.md
@@ -0,0 +1,16 @@
---
name: interface-first-development
description: Define contracts before implementations. Use when adding or changing abstractions, service boundaries, or cross-layer dependencies.
---

## Sequence

detect project's abstraction mechanism → define/update contract first → implement after contract stable → wire via DI/config → update external boundaries only for surface changes → test contract behavior

## Rules
<constraints>
- infrastructure types must not appear in contract signatures
- function signatures: minimal, intention-revealing
- prefer extending existing contracts over ad-hoc cross-layer deps
- one contract = one responsibility
</constraints>
package/skills/observability/SKILL.md
@@ -0,0 +1,35 @@
---
name: observability
description: Instrument AI workflows with tracing, logging, and monitoring to enable debugging, auditing, and performance analysis.
---

## Sequence

identify instrumentation boundaries → add spans → attach metadata → verify no gaps

## Boundaries

```
entry points      → user input, external triggers
model calls       → inputs, outputs, latency, token counts
tool calls        → name, inputs, outputs, errors
inter-agent msgs  → sender, receiver, content, timing
state transitions → before/after snapshot
```

## Rules

<constraints>
- every AI model call must produce a trace entry
- token counts must be extracted from the model API response metadata, not estimated or hardcoded — never use response length or character count as a token proxy
- propagate trace context across service boundaries
- attach a correlation ID to every trace
- structured output only — no free-form strings
- no secrets, PII, or credentials in traces
</constraints>

## Done

<criteria>
all model calls traced + tool calls recorded + correlation IDs present + no secrets in traces
</criteria>
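A structured trace entry for a model call, per the constraints above. The `model_response` shape here is a stand-in, not any specific vendor's API; the point is that token counts come from response metadata, never from text length:

```python
import time

def trace_model_call(model_response, correlation_id, span_name):
    """Build a structured trace entry for one model call.

    Token counts are read from the response's usage metadata — never
    estimated from character or word counts."""
    return {
        "span": span_name,
        "correlation_id": correlation_id,  # required on every trace
        "input_tokens": model_response["usage"]["input_tokens"],
        "output_tokens": model_response["usage"]["output_tokens"],
        "latency_ms": model_response["latency_ms"],
        "ts": time.time(),
    }
```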
package/skills/principle-audit/SKILL.md
@@ -0,0 +1,32 @@
---
name: principle-audit
description: Audit the codebase for violations of the project's core principles — detect unintended system-imposed constraints that contradict the project's stated goals.
---

## Sequence

find project principles(docs, config) → scan for violations → classify(allowed vs violation) → report + fix

## Violation Types

1. **unintended constraints**: system blocks behavior without business rule
2. **layer boundary**: deps in forbidden direction, logic in wrong layer
3. **consistency**: same-purpose logic in conflicting ways, naming/error-handling mismatch
4. **hardcoded assumptions**: values that should be configurable, env-specific assumptions

## Distinguish

- allowed: status as info | config-based threshold | domain rule rejection
- violation: status blocks execution | hardcoded threshold | infra-layer business decision

## Severity
<criteria>
CRITICAL(direct violation, immediate impact) → fix now
WARNING(violation, currently inactive) → fix this cycle
INFO(potential risk) → comment/TODO
</criteria>

## Report

per violation: file:line + description + related principle + fix recommendation
fix CRITICAL immediately → run tests
package/skills/rag-development/SKILL.md
@@ -0,0 +1,26 @@
---
name: rag-development
description: Implement Retrieval-Augmented Generation pipelines — ingestion, chunking, embedding, retrieval, and generation.
requires: [framework-selection]
---

## Sequence

ingest → chunk → embed → store → retrieve → rerank → generate

## Rules

<constraints>
- preserve source metadata (origin, section, date) through all stages
- chunk boundaries must respect semantic units
- embedding model must be identical at index time and query time
- retrieval store must match the data scale
- retrieve candidates before filtering by relevance — never pass unfiltered results to generation
- generation output must attribute retrieved sources
</constraints>

## Done

<criteria>
all stages present + metadata preserved + embedding model consistent + sources attributed in output
</criteria>
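The chunk stage under the first two constraints (semantic boundaries, metadata preserved) might look like this sketch, splitting on paragraph breaks as the semantic unit; the document shape is illustrative:

```python
def chunk_with_metadata(doc, max_chars=500):
    """Split on paragraph boundaries (a simple semantic unit) and copy
    the source metadata onto every chunk so it survives embed → store →
    retrieve → generate."""
    chunks = []
    for para in doc["text"].split("\n\n"):
        para = para.strip()
        if para:
            chunks.append({"text": para[:max_chars], "meta": dict(doc["meta"])})
    return chunks
```

Carrying `meta` on each chunk is what lets the generation stage attribute its sources at the end of the pipeline.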
package/skills/security-audit/SKILL.md
@@ -0,0 +1,22 @@
---
name: security-audit
description: Comprehensive security review — secrets exposure, dependency vulnerabilities, code injection risks, and infrastructure config.
---

## Scan Targets

1. **secrets**: VCS history + source files for hardcoded credentials + verify secret/key/log files excluded from VCS
2. **deps**: run project's vulnerability scanner
3. **code**: injection(query,command,template) + path traversal + missing input validation at boundaries + error exposure(stack traces,internal paths) + credential logging
4. **infra**: unnecessary port exposure + default/weak credentials + secrets not via env vars
## Severity
<criteria>
HIGH(secret exposure, injection) → fix immediately
MEDIUM(vulnerable dep, error exposure) → fix this cycle
LOW(potential risk, best-practice violation) → next cycle
</criteria>

## Action

fix HIGH issues directly when possible
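Scan target 1 (hardcoded credentials in source) reduces to pattern matching. A deliberately small sketch — real audits use the project's dedicated scanner, and these two patterns are illustrative, not exhaustive:

```python
import re

# Illustrative patterns only: generic key/password assignments and the
# AKIA-prefixed shape of AWS access key IDs.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]+['\"]"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]

def scan_for_secrets(text):
    """Return (line_number, line) pairs that look like hardcoded credentials."""
    hits = []
    for n, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append((n, line.strip()))
    return hits
```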
package/skills/test-runner/SKILL.md
@@ -0,0 +1,20 @@
---
name: test-runner
description: Detect the project's test runner and execute the full test suite with coverage reporting. Fail fast and fix on failure.
---

## Sequence

detect test framework from project → run full suite → on failure: classify(code bug|test bug) → fix → rerun → repeat until pass → summarize

## Output Format

```
Tests:total=N passed=N failed=N skipped=N Coverage:N% Duration:Xs
```

## Rules
<constraints>
- tests must be independently runnable, no order dependency
- mock only external I/O boundaries
</constraints>
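"Detect test framework from project" usually means checking for marker files in the project root. A sketch with an illustrative, non-exhaustive marker table:

```python
# Marker file → runner command. First match wins; an illustrative subset.
RUNNER_MARKERS = [
    ("pytest.ini", "pytest"),
    ("package.json", "npm test"),
    ("Cargo.toml", "cargo test"),
    ("go.mod", "go test ./..."),
]

def detect_runner(root_files):
    """Pick the first runner whose marker file appears in the project root
    listing; None means no known runner was detected."""
    for marker, command in RUNNER_MARKERS:
        if marker in root_files:
            return command
    return None
```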
package/skills/version/SKILL.md
@@ -0,0 +1,18 @@
---
name: version
description: Manage semantic versioning — auto-detect version bump type from recent commits, update version files, write CHANGELOG, and create a git tag.
---

## Bump Rules

BREAKING CHANGE→MAJOR(X+1.0.0) | feat→MINOR(X.Y+1.0) | fix,refactor,perf→PATCH(X.Y.Z+1) | test,docs,chore→no change

## Sequence

detect current version(tags→version files→CHANGELOG) → determine bump(arg or auto from commits) → update all version files → write CHANGELOG(Added/Changed/Fixed/Removed) → commit + tag

## Usage

```
/version (auto) | /version patch | /version minor | /version major
```
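The auto-detect bump rules above can be sketched as a function over recent commit messages (the function name is ours, not the package's):

```python
def bump(version, commits):
    """Derive the next semver from commit messages: BREAKING CHANGE → major,
    feat → minor, fix/refactor/perf → patch, anything else → no change."""
    major, minor, patch = (int(p) for p in version.split("."))
    if any("BREAKING CHANGE" in c for c in commits):
        return f"{major + 1}.0.0"
    types = {c.split(":", 1)[0] for c in commits if ":" in c}
    if "feat" in types:
        return f"{major}.{minor + 1}.0"
    if types & {"fix", "refactor", "perf"}:
        return f"{major}.{minor}.{patch + 1}"
    return version  # test/docs/chore only: no change
```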