@fyow/copilot-everything 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,80 @@
+ ---
+ name: continuous-learning
+ description: Automatically extract reusable patterns from Claude Code sessions and save them as learned skills for future use.
+ ---
+
+ # Continuous Learning Skill
+
+ Automatically evaluates each Claude Code session when it ends to extract reusable patterns that can be saved as learned skills.
+
+ ## How It Works
+
+ This skill runs as a **Stop hook** at the end of each session:
+
+ 1. **Session Evaluation**: Checks whether the session has enough messages (default: 10+)
+ 2. **Pattern Detection**: Identifies extractable patterns from the session
+ 3. **Skill Extraction**: Saves useful patterns to `~/.claude/skills/learned/`
+
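For orientation, here is a minimal sketch of what a Stop-hook evaluator along these lines might look like (hypothetical, not the packaged `evaluate-session.sh`; it assumes the hook receives JSON on stdin with a `transcript_path` field and that `jq` is installed):

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a Stop-hook session evaluator, not the packaged script.
# Assumes Claude Code passes JSON on stdin containing a transcript_path field.
set -euo pipefail

INPUT=$(cat)
TRANSCRIPT=$(echo "$INPUT" | jq -r '.transcript_path // empty')
MIN_LENGTH="${MIN_SESSION_LENGTH:-10}"   # mirrors min_session_length in config.json (env var is an assumption)

# Nothing to do if there is no transcript to inspect.
[ -n "$TRANSCRIPT" ] && [ -f "$TRANSCRIPT" ] || exit 0

# Transcripts are line-delimited JSON, so the line count approximates the message count.
MESSAGE_COUNT=$(wc -l < "$TRANSCRIPT")
[ "$MESSAGE_COUNT" -ge "$MIN_LENGTH" ] || exit 0

# Hand off to whatever performs the actual pattern extraction.
mkdir -p ~/.claude/skills/learned
echo "Session has $MESSAGE_COUNT messages; evaluating for extractable patterns." >&2
```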
+ ## Configuration
+
+ Edit `config.json` to customize:
+
+ ```json
+ {
+   "min_session_length": 10,
+   "extraction_threshold": "medium",
+   "auto_approve": false,
+   "learned_skills_path": "~/.claude/skills/learned/",
+   "patterns_to_detect": [
+     "error_resolution",
+     "user_corrections",
+     "workarounds",
+     "debugging_techniques",
+     "project_specific"
+   ],
+   "ignore_patterns": [
+     "simple_typos",
+     "one_time_fixes",
+     "external_api_issues"
+   ]
+ }
+ ```
+
+ ## Pattern Types
+
+ | Pattern | Description |
+ |---------|-------------|
+ | `error_resolution` | How specific errors were resolved |
+ | `user_corrections` | Patterns from user corrections |
+ | `workarounds` | Solutions to framework/library quirks |
+ | `debugging_techniques` | Effective debugging approaches |
+ | `project_specific` | Project-specific conventions |
+
+ ## Hook Setup
+
+ Add to your `~/.claude/settings.json`:
+
+ ```json
+ {
+   "hooks": {
+     "Stop": [{
+       "matcher": "*",
+       "hooks": [{
+         "type": "command",
+         "command": "~/.claude/skills/continuous-learning/evaluate-session.sh"
+       }]
+     }]
+   }
+ }
+ ```
+
+ ## Why Stop Hook?
+
+ - **Lightweight**: Runs once at session end
+ - **Non-blocking**: Doesn't add latency to every message
+ - **Complete context**: Has access to full session transcript
+
+ ## Related
+
+ - [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Section on continuous learning
+ - `/learn` command - Manual pattern extraction mid-session
@@ -0,0 +1,221 @@
+ # Eval Harness Skill
+
+ A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.
+
+ ## Philosophy
+
+ Eval-Driven Development treats evals as the "unit tests of AI development":
+ - Define expected behavior BEFORE implementation
+ - Run evals continuously during development
+ - Track regressions with each change
+ - Use pass@k metrics for reliability measurement
+
+ ## Eval Types
+
+ ### Capability Evals
+ Test if Claude can do something it couldn't before:
+ ```markdown
+ [CAPABILITY EVAL: feature-name]
+ Task: Description of what Claude should accomplish
+ Success Criteria:
+ - [ ] Criterion 1
+ - [ ] Criterion 2
+ - [ ] Criterion 3
+ Expected Output: Description of expected result
+ ```
+
+ ### Regression Evals
+ Ensure changes don't break existing functionality:
+ ```markdown
+ [REGRESSION EVAL: feature-name]
+ Baseline: SHA or checkpoint name
+ Tests:
+ - existing-test-1: PASS/FAIL
+ - existing-test-2: PASS/FAIL
+ - existing-test-3: PASS/FAIL
+ Result: X/Y passed (previously Y/Y)
+ ```
+
+ ## Grader Types
+
+ ### 1. Code-Based Grader
+ Deterministic checks using code:
+ ```bash
+ # Check if file contains expected pattern
+ grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL"
+
+ # Check if tests pass
+ npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL"
+
+ # Check if build succeeds
+ npm run build && echo "PASS" || echo "FAIL"
+ ```
+
+ ### 2. Model-Based Grader
+ Use Claude to evaluate open-ended outputs:
+ ```markdown
+ [MODEL GRADER PROMPT]
+ Evaluate the following code change:
+ 1. Does it solve the stated problem?
+ 2. Is it well-structured?
+ 3. Are edge cases handled?
+ 4. Is error handling appropriate?
+
+ Score: 1-5 (1=poor, 5=excellent)
+ Reasoning: [explanation]
+ ```
+
+ ### 3. Human Grader
69
+ Flag for manual review:
70
+ ```markdown
71
+ [HUMAN REVIEW REQUIRED]
72
+ Change: Description of what changed
73
+ Reason: Why human review is needed
74
+ Risk Level: LOW/MEDIUM/HIGH
75
+ ```
76
+
77
+ ## Metrics
78
+
79
+ ### pass@k
80
+ "At least one success in k attempts"
81
+ - pass@1: First attempt success rate
82
+ - pass@3: Success within 3 attempts
83
+ - Typical target: pass@3 > 90%
84
+
85
+ ### pass^k
86
+ "All k trials succeed"
87
+ - Higher bar for reliability
88
+ - pass^3: 3 consecutive successes
89
+ - Use for critical paths
90
+
91
+ ## Eval Workflow
92
+
93
+ ### 1. Define (Before Coding)
94
+ ```markdown
95
+ ## EVAL DEFINITION: feature-xyz
96
+
97
+ ### Capability Evals
98
+ 1. Can create new user account
99
+ 2. Can validate email format
100
+ 3. Can hash password securely
101
+
102
+ ### Regression Evals
103
+ 1. Existing login still works
104
+ 2. Session management unchanged
105
+ 3. Logout flow intact
106
+
107
+ ### Success Metrics
108
+ - pass@3 > 90% for capability evals
109
+ - pass^3 = 100% for regression evals
110
+ ```
111
+
112
+ ### 2. Implement
113
+ Write code to pass the defined evals.
114
+
115
+ ### 3. Evaluate
116
+ ```bash
117
+ # Run capability evals
118
+ [Run each capability eval, record PASS/FAIL]
119
+
120
+ # Run regression evals
121
+ npm test -- --testPathPattern="existing"
122
+
123
+ # Generate report
124
+ ```
125
+
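The capability-eval step above is intentionally a placeholder; one hedged way to script it, assuming each capability eval lives as an executable check under `.claude/evals/feature-xyz/` (a layout this skill does not prescribe):

```bash
# Hypothetical eval runner: executes each grader script and tallies PASS/FAIL.
# The .claude/evals/feature-xyz/*.sh layout is an assumption, not part of this skill.
PASS=0; FAIL=0
for grader in .claude/evals/feature-xyz/*.sh; do
  if bash "$grader" > /dev/null 2>&1; then
    echo "PASS  $(basename "$grader")"; PASS=$((PASS + 1))
  else
    echo "FAIL  $(basename "$grader")"; FAIL=$((FAIL + 1))
  fi
done
echo "Overall: $PASS/$((PASS + FAIL)) passed"
```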
+ ### 4. Report
+ ```markdown
+ EVAL REPORT: feature-xyz
+ ========================
+
+ Capability Evals:
+   create-user: PASS (pass@1)
+   validate-email: PASS (pass@2)
+   hash-password: PASS (pass@1)
+   Overall: 3/3 passed
+
+ Regression Evals:
+   login-flow: PASS
+   session-mgmt: PASS
+   logout-flow: PASS
+   Overall: 3/3 passed
+
+ Metrics:
+   pass@1: 67% (2/3)
+   pass@3: 100% (3/3)
+
+ Status: READY FOR REVIEW
+ ```
+
+ ## Integration Patterns
+
+ ### Pre-Implementation
+ ```
+ /eval define feature-name
+ ```
+ Creates eval definition file at `.claude/evals/feature-name.md`
+
+ ### During Implementation
+ ```
+ /eval check feature-name
+ ```
+ Runs current evals and reports status
+
+ ### Post-Implementation
+ ```
+ /eval report feature-name
+ ```
+ Generates full eval report
+
+ ## Eval Storage
+
+ Store evals in project:
+ ```
+ .claude/
+   evals/
+     feature-xyz.md    # Eval definition
+     feature-xyz.log   # Eval run history
+     baseline.json     # Regression baselines
+ ```
+
+ ## Best Practices
+
+ 1. **Define evals BEFORE coding** - Forces clear thinking about success criteria
+ 2. **Run evals frequently** - Catch regressions early
+ 3. **Track pass@k over time** - Monitor reliability trends
+ 4. **Use code graders when possible** - Deterministic > probabilistic
+ 5. **Human review for security** - Never fully automate security checks
+ 6. **Keep evals fast** - Slow evals don't get run
+ 7. **Version evals with code** - Evals are first-class artifacts
+
+ ## Example: Adding Authentication
+
+ ```markdown
+ ## EVAL: add-authentication
+
+ ### Phase 1: Define (10 min)
+ Capability Evals:
+ - [ ] User can register with email/password
+ - [ ] User can login with valid credentials
+ - [ ] Invalid credentials rejected with proper error
+ - [ ] Sessions persist across page reloads
+ - [ ] Logout clears session
+
+ Regression Evals:
+ - [ ] Public routes still accessible
+ - [ ] API responses unchanged
+ - [ ] Database schema compatible
+
+ ### Phase 2: Implement (varies)
+ [Write code]
+
+ ### Phase 3: Evaluate
+ Run: /eval check add-authentication
+
+ ### Phase 4: Report
+ EVAL REPORT: add-authentication
+ ==============================
+ Capability: 5/5 passed (pass@3: 100%)
+ Regression: 3/3 passed (pass^3: 100%)
+ Status: SHIP IT
+ ```
@@ -1,8 +1,3 @@
- ---
- name: project-guidelines-example
- description: Example project-specific skill template. Use this as a starting point for creating your own project skills with architecture, patterns, and guidelines.
- ---
-
  # Project Guidelines Skill (Example)

  This is an example of a project-specific skill. Use this as a template for your own projects.
@@ -28,7 +23,7 @@ Reference this skill when working on the specific project it's designed for. Pro
  - **Frontend**: Next.js 15 (App Router), TypeScript, React
  - **Backend**: FastAPI (Python), Pydantic models
  - **Database**: Supabase (PostgreSQL)
- - **AI**: LLM API with tool calling and structured output
+ - **AI**: Claude API with tool calling and structured output
  - **Deployment**: Google Cloud Run
  - **Testing**: Playwright (E2E), pytest (backend), React Testing Library

@@ -50,7 +45,7 @@ Reference this skill when working on the specific project it's designed for. Pro
        ┌───────────────┼───────────────┐
        ▼               ▼               ▼
  ┌──────────┐    ┌──────────┐    ┌──────────┐
- │ Supabase │    │   LLM    │    │  Redis   │
+ │ Supabase │    │  Claude  │    │  Redis   │
  │ Database │    │   API    │    │  Cache   │
  └──────────┘    └──────────┘    └──────────┘
  ```
@@ -149,7 +144,7 @@ async function fetchApi<T>(
  }
  ```

- ### LLM AI Integration (Structured Output)
+ ### Claude AI Integration (Structured Output)

  ```python
  from anthropic import Anthropic
@@ -160,11 +155,11 @@ class AnalysisResult(BaseModel):
      key_points: list[str]
      confidence: float

- async def analyze_with_llm(content: str) -> AnalysisResult:
+ async def analyze_with_claude(content: str) -> AnalysisResult:
      client = Anthropic()

      response = client.messages.create(
-         model="claude-sonnet-4-5-20250514", # Or use OpenAI, etc.
+         model="claude-sonnet-4-5-20250514",
          max_tokens=1024,
          messages=[{"role": "user", "content": content}],
          tools=[{
@@ -0,0 +1,63 @@
+ ---
+ name: strategic-compact
+ description: Suggests manual context compaction at logical intervals to preserve context through task phases rather than arbitrary auto-compaction.
+ ---
+
+ # Strategic Compact Skill
+
+ Suggests manual `/compact` at strategic points in your workflow rather than relying on arbitrary auto-compaction.
+
+ ## Why Strategic Compaction?
+
+ Auto-compaction triggers at arbitrary points:
+ - Often mid-task, losing important context
+ - No awareness of logical task boundaries
+ - Can interrupt complex multi-step operations
+
+ Strategic compaction at logical boundaries:
+ - **After exploration, before execution** - Compact research context, keep implementation plan
+ - **After completing a milestone** - Fresh start for next phase
+ - **Before major context shifts** - Clear exploration context before different task
+
+ ## How It Works
+
+ The `suggest-compact.sh` script runs on PreToolUse (Edit/Write) and:
+
+ 1. **Tracks tool calls** - Counts tool invocations in session
+ 2. **Threshold detection** - Suggests at configurable threshold (default: 50 calls)
+ 3. **Periodic reminders** - Reminds every 25 calls after threshold
+
+ ## Hook Setup
31
+
32
+ Add to your `~/.claude/settings.json`:
33
+
34
+ ```json
35
+ {
36
+ "hooks": {
37
+ "PreToolUse": [{
38
+ "matcher": "tool == \"Edit\" || tool == \"Write\"",
39
+ "hooks": [{
40
+ "type": "command",
41
+ "command": "~/.claude/skills/strategic-compact/suggest-compact.sh"
42
+ }]
43
+ }]
44
+ }
45
+ }
46
+ ```
47
+
48
+ ## Configuration
49
+
50
+ Environment variables:
51
+ - `COMPACT_THRESHOLD` - Tool calls before first suggestion (default: 50)
52
+
53
+ ## Best Practices
54
+
55
+ 1. **Compact after planning** - Once plan is finalized, compact to start fresh
56
+ 2. **Compact after debugging** - Clear error-resolution context before continuing
57
+ 3. **Don't compact mid-implementation** - Preserve context for related changes
58
+ 4. **Read the suggestion** - The hook tells you *when*, you decide *if*
59
+
60
+ ## Related
61
+
62
+ - [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Token optimization section
63
+ - Memory persistence hooks - For state that survives compaction
@@ -1,11 +1,6 @@
- ---
- name: verification-loop
- description: A comprehensive verification system for coding sessions. Use this after completing features, before creating PRs, or when ensuring quality gates pass.
- ---
-
  # Verification Loop Skill

- A comprehensive verification system for coding sessions.
+ A comprehensive verification system for Claude Code sessions.

  ## When to Use

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@fyow/copilot-everything",
-   "version": "1.0.1",
+   "version": "1.0.3",
    "description": "Everything you need for GitHub Copilot CLI - agents, skills, instructions, and hooks configurations",
    "main": "src/index.js",
    "bin": {
@@ -169,6 +169,17 @@ function init(flags = {}) {
      totalSkipped += copilotResult.skipped;
      totalErrors.push(...copilotResult.errors);

+     // Scripts (hook implementations)
+     if (!skipHooks) {
+       const scriptsSrc = path.join(PACKAGE_ROOT, 'scripts');
+       const scriptsDest = path.join(targetDir, 'scripts');
+       const scriptsResult = copyDir(scriptsSrc, scriptsDest, options);
+       console.log(` ✅ Scripts: ${scriptsResult.copied} copied, ${scriptsResult.skipped} skipped`);
+       totalCopied += scriptsResult.copied;
+       totalSkipped += scriptsResult.skipped;
+       totalErrors.push(...scriptsResult.errors);
+     }
+
      // AGENTS.md at root
      const agentsMdSrc = path.join(PACKAGE_ROOT, 'AGENTS.md');
      const agentsMdDest = path.join(targetDir, 'AGENTS.md');
@@ -177,6 +188,27 @@ function init(flags = {}) {
      totalCopied += agentsMdResult.copied;
      totalSkipped += agentsMdResult.skipped;
      totalErrors.push(...agentsMdResult.errors);
+
+     // MCP config to ~/.copilot/
+     const homeDir = process.env.HOME || process.env.USERPROFILE;
+     const mcpConfigSrc = path.join(PACKAGE_ROOT, 'copilot', 'mcp-config.json');
+     const mcpConfigDest = path.join(homeDir, '.copilot', 'mcp-config.json');
+
+     if (fs.existsSync(mcpConfigSrc)) {
+       if (fs.existsSync(mcpConfigDest) && !force) {
+         console.log(` ⚠️ MCP config: skipped (${mcpConfigDest} already exists)`);
+         console.log(` 💡 Use --force to overwrite, or manually merge configurations`);
+         totalSkipped += 1;
+       } else {
+         const mcpResult = copyFile(mcpConfigSrc, mcpConfigDest, options);
+         if (mcpResult.copied > 0) {
+           console.log(` ✅ MCP config: installed to ${mcpConfigDest}`);
+         }
+         totalCopied += mcpResult.copied;
+         totalSkipped += mcpResult.skipped;
+         totalErrors.push(...mcpResult.errors);
+       }
+     }
    }

    // Install Claude Code configurations (for backwards compatibility)