engsys 1.0.0
- package/LICENSE +21 -0
- package/README.md +202 -0
- package/core/agents/aaron.md +152 -0
- package/core/agents/bert.md +115 -0
- package/core/agents/isabelle.md +136 -0
- package/core/agents/jody.md +150 -0
- package/core/agents/leith.md +111 -0
- package/core/agents/marcelo.md +282 -0
- package/core/agents/melvin.md +101 -0
- package/core/agents/nyx.md +152 -0
- package/core/agents/otto.md +168 -0
- package/core/agents/patricia.md +283 -0
- package/core/commands/design-audit-local.md +155 -0
- package/core/commands/design-audit.md +235 -0
- package/core/commands/design-critique.md +96 -0
- package/core/commands/file-issue.md +22 -0
- package/core/commands/generate-project.md +45 -0
- package/core/commands/implement-issue.md +37 -0
- package/core/commands/implement-project.md +40 -0
- package/core/commands/naturalize.md +61 -0
- package/core/commands/pre-push.md +29 -0
- package/core/commands/prep-review-collect.md +130 -0
- package/core/commands/prep-review-finalize.md +121 -0
- package/core/commands/prep-review-publish.md +113 -0
- package/core/commands/prep-review.md +65 -0
- package/core/commands/project-closeout.md +25 -0
- package/core/skills/agentic-eval/SKILL.md +195 -0
- package/core/skills/chrome-devtools/SKILL.md +97 -0
- package/core/skills/code-review/SKILL.md +26 -0
- package/core/skills/gh-cli/SKILL.md +2202 -0
- package/core/skills/git-commit/SKILL.md +124 -0
- package/core/skills/git-workflow-agents/SKILL.md +462 -0
- package/core/skills/git-workflow-agents/reference.md +220 -0
- package/core/skills/github-actions/SKILL.md +190 -0
- package/core/skills/github-issues/SKILL.md +154 -0
- package/core/skills/llm-structured-outputs/SKILL.md +323 -0
- package/core/skills/llm-structured-outputs/references/provider-details.md +392 -0
- package/core/skills/pre-push/SKILL.md +115 -0
- package/core/skills/refactor/SKILL.md +645 -0
- package/core/skills/web-design-reviewer/SKILL.md +371 -0
- package/core/skills/webapp-testing/SKILL.md +127 -0
- package/core/skills/webapp-testing/test-helper.js +56 -0
- package/core/templates/CLAUDE.md.tmpl +98 -0
- package/core/templates/adr-template.md +67 -0
- package/core/templates/gh-issue-templates/bug.md +39 -0
- package/core/templates/gh-issue-templates/content.md +42 -0
- package/core/templates/gh-issue-templates/enhancement.md +36 -0
- package/core/templates/gh-issue-templates/feature.md +39 -0
- package/core/templates/gh-issue-templates/infrastructure.md +41 -0
- package/core/templates/post-edit-reminders.sh.tmpl +19 -0
- package/core/templates/settings.json.tmpl +90 -0
- package/core/templates/settings.local.json.tmpl +3 -0
- package/core/workflows/agent-implementation-workflow.md +346 -0
- package/core/workflows/generate-project.md +258 -0
- package/core/workflows/implement-project-workflow.md +190 -0
- package/core/workflows/issue-tracking.md +89 -0
- package/core/workflows/project-closeout-ceremony.md +77 -0
- package/core/workflows/review-workflow.md +266 -0
- package/engsys.config.example.yaml +46 -0
- package/install +202 -0
- package/lessons-library/README.md +80 -0
- package/lessons-library/async-callbacks-verify-liveness.md +15 -0
- package/lessons-library/change-isnt-done-until-every-surface-updated.md +15 -0
- package/lessons-library/claim-then-act-for-irreversible-ops.md +16 -0
- package/lessons-library/co-commit-entangled-work.md +15 -0
- package/lessons-library/dependabot-triage-playbook.md +17 -0
- package/lessons-library/deploy-by-digest-and-verify-the-running-revision.md +15 -0
- package/lessons-library/enforce-your-guarantee-at-your-boundary.md +16 -0
- package/lessons-library/gate-changes-on-measurement-not-vibes.md +15 -0
- package/lessons-library/iac-first-no-console-changes.md +15 -0
- package/lessons-library/independent-objective-review-gate.md +15 -0
- package/lessons-library/keep-an-immutable-source-of-truth.md +15 -0
- package/lessons-library/long-agent-runs-checkpoint-not-poll.md +15 -0
- package/lessons-library/model-identity-with-stable-ids-and-provenance.md +15 -0
- package/lessons-library/operator-choices-are-first-class.md +15 -0
- package/lessons-library/prefer-tool-enforced-structured-output.md +15 -0
- package/lessons-library/prove-causation-before-acting.md +15 -0
- package/lessons-library/re-read-state-before-acting.md +14 -0
- package/lessons-library/read-layer-tolerates-unbackfilled-rows.md +15 -0
- package/lessons-library/shell-safety-pipefail-and-validate-before-teardown.md +14 -0
- package/lessons-library/shift-correctness-left-and-distrust-false-greens.md +15 -0
- package/lessons-library/stray-control-bytes-hide-changes.md +14 -0
- package/lessons-library/tests-can-assert-the-bug.md +15 -0
- package/lessons-library/verify-ground-truth-not-reports.md +15 -0
- package/lessons-library/worktrees-need-bootstrap-from-origin-main.md +15 -0
- package/lib/commands.js +356 -0
- package/lib/generate-team-avatars.mjs +251 -0
- package/lib/manifest.js +155 -0
- package/lib/render.js +135 -0
- package/lib/selftest.js +90 -0
- package/lib/util.js +89 -0
- package/lib/yaml.js +156 -0
- package/optional-agents/gary.md +86 -0
- package/optional-agents/jos.md +136 -0
- package/optional-agents/sandy.md +101 -0
- package/optional-agents/steve.md +161 -0
- package/package.json +43 -0
- package/stacks/cloud/aws/claude.fragment.md +17 -0
- package/stacks/cloud/aws/settings.fragment.json +39 -0
- package/stacks/cloud/aws/skills/aws-deployment-preflight/SKILL.md +165 -0
- package/stacks/cloud/aws/skills/cloud-architecture-aws/SKILL.md +265 -0
- package/stacks/cloud/azure/claude.fragment.md +17 -0
- package/stacks/cloud/azure/settings.fragment.json +45 -0
- package/stacks/cloud/azure/skills/azure-deployment-preflight/SKILL.md +175 -0
- package/stacks/cloud/azure/skills/cloud-architecture-azure/SKILL.md +211 -0
- package/stacks/cloud/cloudflare/claude.fragment.md +21 -0
- package/stacks/cloud/cloudflare/settings.fragment.json +31 -0
- package/stacks/cloud/cloudflare/skills/cloud-architecture-cloudflare/SKILL.md +294 -0
- package/stacks/cloud/cloudflare/skills/cloudflare-deployment-preflight/SKILL.md +175 -0
- package/stacks/cloud/gcp/claude.fragment.md +17 -0
- package/stacks/cloud/gcp/settings.fragment.json +40 -0
- package/stacks/cloud/gcp/skills/cloud-architecture-gcp/SKILL.md +208 -0
- package/stacks/cloud/gcp/skills/gcp-deployment-preflight/SKILL.md +137 -0
- package/stacks/db/mongo/skills/mongo-conventions/SKILL.md +96 -0
- package/stacks/db/prisma/claude.fragment.md +49 -0
- package/stacks/db/prisma/skills/docker-database-package-copy/SKILL.md +44 -0
- package/stacks/db/prisma/skills/prisma-conventions/SKILL.md +37 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/SKILL.md +184 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/references/benchmark-notes.md +47 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/references/official-links.md +53 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/SKILL.md +197 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/references/benchmark-notes.md +47 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/references/official-links.md +45 -0
- package/stacks/iac/bicep/claude.fragment.md +14 -0
- package/stacks/iac/bicep/settings.fragment.json +20 -0
- package/stacks/iac/bicep/skills/iac-bicep/SKILL.md +113 -0
- package/stacks/iac/cdk/claude.fragment.md +14 -0
- package/stacks/iac/cdk/settings.fragment.json +23 -0
- package/stacks/iac/cdk/skills/iac-cdk/SKILL.md +104 -0
- package/stacks/iac/terraform/claude.fragment.md +13 -0
- package/stacks/iac/terraform/settings.fragment.json +25 -0
- package/stacks/iac/terraform/skills/iac-terraform/SKILL.md +93 -0
- package/stacks/iac/terraform/skills/terraform-conventions/SKILL.md +87 -0
- package/stacks/lang/kotlin/skills/android-testing/SKILL.md +263 -0
- package/stacks/lang/kotlin/skills/jetpack-compose/SKILL.md +264 -0
- package/stacks/lang/kotlin/skills/kotlin-coroutines/SKILL.md +329 -0
- package/stacks/lang/python/skills/python-conventions/SKILL.md +61 -0
- package/stacks/lang/shell/skills/shell-scripting/SKILL.md +110 -0
- package/stacks/lang/swift/skills/swift-concurrency/SKILL.md +423 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/approachable-concurrency.md +80 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/concurrency-patterns.md +233 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/swiftui-concurrency.md +187 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/synchronization-primitives.md +341 -0
- package/stacks/lang/swift/skills/swift-testing/SKILL.md +497 -0
- package/stacks/lang/swift/skills/swift-testing/references/testing-advanced.md +106 -0
- package/stacks/lang/swift/skills/swift-testing/references/testing-patterns.md +504 -0
- package/stacks/lang/swift/skills/swiftdata/SKILL.md +334 -0
- package/stacks/lang/swift/skills/swiftdata/references/core-data-coexistence.md +504 -0
- package/stacks/lang/swift/skills/swiftdata/references/swiftdata-advanced.md +975 -0
- package/stacks/lang/swift/skills/swiftdata/references/swiftdata-queries.md +675 -0
- package/stacks/lang/swift/skills/swiftui-patterns/SKILL.md +371 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/architecture-patterns.md +486 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/deprecated-migration.md +1097 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/design-polish.md +780 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/platform-and-sharing.md +696 -0
- package/stacks/lang/typescript/skills/typescript-conventions/SKILL.md +91 -0
- package/stacks/platform/android/claude.fragment.md +40 -0
- package/stacks/platform/android/hooks/pre-push-gradle.sh +70 -0
- package/stacks/platform/android/settings.fragment.json +13 -0
- package/stacks/platform/android/skills/android-build-conventions/SKILL.md +247 -0
- package/stacks/platform/ios/claude.fragment.md +24 -0
- package/stacks/platform/ios/hooks/pre-push-xcodebuild.sh +82 -0
- package/stacks/platform/ios/settings.fragment.json +21 -0
- package/stacks/platform/ios/skills/xcodebuildmcp-simulator-logs/SKILL.md +76 -0
- package/stacks/platform/web/skills/frontend-testing/SKILL.md +246 -0
- package/stacks/platform/web/skills/react-conventions/SKILL.md +261 -0
- package/stacks/platform/web/skills/web-platform-conventions/SKILL.md +55 -0
- package/stacks/tooling/issue-tracker-github/claude.fragment.md +10 -0
- package/stacks/tooling/issue-tracker-github/settings.fragment.json +24 -0
- package/stacks/tooling/issue-tracker-github/skills/issue-tracker-github/SKILL.md +278 -0
- package/stacks/tooling/issue-tracker-linear/claude.fragment.md +17 -0
- package/stacks/tooling/issue-tracker-linear/settings.fragment.json +9 -0
- package/stacks/tooling/issue-tracker-linear/skills/issue-tracker-linear/SKILL.md +183 -0
@@ -0,0 +1,195 @@
---
name: agentic-eval
description: |
  Patterns and techniques for evaluating and improving AI agent outputs. Use this skill when:
  - Implementing self-critique and reflection loops
  - Building evaluator-optimizer pipelines for quality-critical generation
  - Creating test-driven code refinement workflows
  - Designing rubric-based or LLM-as-judge evaluation systems
  - Adding iterative improvement to agent outputs (code, reports, analysis)
  - Measuring and improving agent response quality
---

# Agentic Evaluation Patterns

Patterns for self-improvement through iterative evaluation and refinement.

## Overview

Evaluation patterns enable agents to assess and improve their own outputs, moving beyond single-shot generation to iterative refinement loops.

```text
Generate → Evaluate → Critique → Refine → Output
    ↑                              │
    └──────────────────────────────┘
```

## When to Use

- **Quality-critical generation**: Code, reports, analysis requiring high accuracy
- **Tasks with clear evaluation criteria**: Defined success metrics exist
- **Content requiring specific standards**: Style guides, compliance, formatting

---

## Pattern 1: Basic Reflection

Agent evaluates and improves its own output through self-critique.

```python
import json


def reflect_and_refine(task: str, criteria: list[str], max_iterations: int = 3) -> str:
    """Generate with reflection loop.

    `llm` is assumed to be a user-supplied callable: prompt in, text out.
    """
    output = llm(f"Complete this task:\n{task}")

    for _ in range(max_iterations):
        # Self-critique: request one {"status", "feedback"} object per criterion
        critique = llm(f"""
        Evaluate this output against criteria: {criteria}
        Output: {output}
        Rate each: PASS/FAIL with feedback as JSON, e.g.
        {{"<criterion>": {{"status": "PASS", "feedback": "..."}}}}
        """)

        critique_data = json.loads(critique)
        all_pass = all(c["status"] == "PASS" for c in critique_data.values())
        if all_pass:
            return output

        # Refine based on critique
        failed = {k: v["feedback"] for k, v in critique_data.items() if v["status"] == "FAIL"}
        output = llm(f"Improve to address: {failed}\nOriginal: {output}")

    return output
```

**Key insight**: Use structured JSON output for reliable parsing of critique results.
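
Model replies are not always clean JSON, so a tolerant parser can help the loop survive a bad critique. The following is a minimal sketch, assuming the reply is either raw JSON or JSON wrapped in a Markdown code fence. The helper name `parse_critique` and its `None` fallback are hypothetical and not part of this skill.

```python
import json
import re
from typing import Optional

# Three backticks, built programmatically so this example nests cleanly in Markdown
FENCE = "`" * 3
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_critique(raw: str) -> Optional[dict]:
    """Best-effort parse of a critique reply; None when no JSON object is found.

    Hypothetical helper: the name and the None fallback are illustrative assumptions.
    """
    # Unwrap an optional code fence around the payload
    match = FENCED_JSON.search(raw)
    candidate = match.group(1) if match else raw
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        return None
    # Only a JSON object can map criteria to verdicts; reject lists and scalars
    return data if isinstance(data, dict) else None
```

In the reflection loop, a `None` result could be treated as a failed iteration (retry the critique or stop) instead of letting `json.loads` raise.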

---

## Pattern 2: Evaluator-Optimizer

Separate generation and evaluation into distinct components for clearer responsibilities.

```python
import json


class EvaluatorOptimizer:
    """Generate, evaluate, and optimize as separate steps.

    `llm` is assumed to be a user-supplied callable: prompt in, text out.
    """

    def __init__(self, score_threshold: float = 0.8):
        self.score_threshold = score_threshold

    def generate(self, task: str) -> str:
        return llm(f"Complete: {task}")

    def evaluate(self, output: str, task: str) -> dict:
        # Doubled braces render as literal braces inside the f-string
        return json.loads(llm(f"""
        Evaluate output for task: {task}
        Output: {output}
        Return JSON: {{"overall_score": 0-1, "dimensions": {{"accuracy": ..., "clarity": ...}}}}
        """))

    def optimize(self, output: str, feedback: dict) -> str:
        return llm(f"Improve based on feedback: {feedback}\nOutput: {output}")

    def run(self, task: str, max_iterations: int = 3) -> str:
        output = self.generate(task)
        for _ in range(max_iterations):
            evaluation = self.evaluate(output, task)
            # Stop as soon as the evaluator's score clears the threshold
            if evaluation["overall_score"] >= self.score_threshold:
                break
            output = self.optimize(output, evaluation)
        return output
```

---

## Pattern 3: Code-Specific Reflection

Test-driven refinement loop for code generation.

```python
class CodeReflector:
    def reflect_and_fix(self, spec: str, max_iterations: int = 3) -> str:
        # `llm` and `run_tests` are assumed helpers; `run_tests` is expected to
        # return {"success": bool, "error": str}
        code = llm(f"Write Python code for: {spec}")
        tests = llm(f"Generate pytest tests for: {spec}\nCode: {code}")

        for _ in range(max_iterations):
            result = run_tests(code, tests)
            if result["success"]:
                return code
            # Feed the failure output back to the model and try again
            code = llm(f"Fix error: {result['error']}\nCode: {code}")
        return code
```

---

## Evaluation Strategies

### Outcome-Based

Evaluate whether output achieves the expected result.

```python
def evaluate_outcome(task: str, output: str, expected: str) -> str:
    return llm(f"Does output achieve expected outcome? Task: {task}, Expected: {expected}, Output: {output}")
```

### LLM-as-Judge

Use LLM to compare and rank outputs.

```python
def llm_judge(output_a: str, output_b: str, criteria: str) -> str:
    # Include both candidates in the prompt so the judge can actually compare them
    return llm(
        f"Compare outputs A and B for {criteria}. Which is better and why?\n"
        f"Output A: {output_a}\nOutput B: {output_b}"
    )
```

### Rubric-Based

Score outputs against weighted dimensions.

```python
import json

RUBRIC = {
    "accuracy": {"weight": 0.4},
    "clarity": {"weight": 0.3},
    "completeness": {"weight": 0.3},
}


def evaluate_with_rubric(output: str, rubric: dict) -> float:
    # Expects JSON mapping each dimension to a 1-5 rating
    scores = json.loads(llm(f"Rate 1-5 for each dimension: {list(rubric.keys())}\nOutput: {output}"))
    # Weighted mean of the ratings, divided by the 5-point maximum
    # (range 0.2-1.0 when the weights sum to 1)
    return sum(scores[d] * rubric[d]["weight"] for d in rubric) / 5
```
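
As a worked example of the weighted score, suppose a judge returned ratings of 4, 5, and 3. These numbers are invented for illustration. The sketch below repeats the arithmetic without calling a model.

```python
import math

# Same weights as RUBRIC; the ratings are made up for this example
weights = {"accuracy": 0.4, "clarity": 0.3, "completeness": 0.3}
ratings = {"accuracy": 4, "clarity": 5, "completeness": 3}

# Weighted mean on the 1-5 scale: 4*0.4 + 5*0.3 + 3*0.3 = 4.0
weighted = sum(ratings[d] * weights[d] for d in weights)

# Dividing by 5 rescales the result so it can be compared to a 0-1 threshold
score = weighted / 5

assert math.isclose(score, 0.8)
```

Floating-point rounding can leave a sum like this a hair under its exact value, so comparing against a threshold with a small tolerance may be safer than a bare `>=`.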

---

## Best Practices

| Practice              | Rationale                                               |
| --------------------- | ------------------------------------------------------- |
| **Clear criteria**    | Define specific, measurable evaluation criteria upfront |
| **Iteration limits**  | Set max iterations (3-5) to prevent infinite loops      |
| **Convergence check** | Stop if output score isn't improving between iterations |
| **Log history**       | Keep full trajectory for debugging and analysis         |
| **Structured output** | Use JSON for reliable parsing of evaluation results     |
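
The iteration limit, convergence check, and history log can be combined in one loop. The following is a rough sketch, not a definitive implementation: `refine_until_converged`, its `evaluate` and `optimize` callables, and the `min_gain` cutoff are all hypothetical names chosen for illustration.

```python
from typing import Callable


def refine_until_converged(
    output: str,
    evaluate: Callable[[str], float],
    optimize: Callable[[str, float], str],
    max_iterations: int = 5,
    min_gain: float = 0.01,
) -> tuple[str, list[dict]]:
    """Refine until the score stops improving by at least `min_gain`.

    Hypothetical sketch: `evaluate` returns a score, `optimize` returns a revision.
    """
    best_output, best_score = output, evaluate(output)
    # Full trajectory, kept for debugging and analysis
    history: list[dict] = [{"iteration": 0, "score": best_score}]

    for i in range(1, max_iterations + 1):
        candidate = optimize(best_output, best_score)
        score = evaluate(candidate)
        history.append({"iteration": i, "score": score})
        gain = score - best_score
        # Keep whichever output has scored highest so far
        if score > best_score:
            best_output, best_score = candidate, score
        # Converged (or regressed): another round is unlikely to pay off
        if gain < min_gain:
            break

    return best_output, history
```

Returning the history alongside the output keeps the trajectory available to the caller without committing to a particular logging backend.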

---

## Quick Start Checklist

```markdown
## Evaluation Implementation Checklist

### Setup

- [ ] Define evaluation criteria/rubric
- [ ] Set score threshold for "good enough"
- [ ] Configure max iterations (default: 3)

### Implementation

- [ ] Implement generate() function
- [ ] Implement evaluate() function with structured output
- [ ] Implement optimize() function
- [ ] Wire up the refinement loop

### Safety

- [ ] Add convergence detection
- [ ] Log all iterations for debugging
- [ ] Handle evaluation parse failures gracefully
```

@@ -0,0 +1,97 @@
---
name: chrome-devtools
description: "Expert-level browser automation, debugging, and performance analysis using Chrome DevTools MCP. Use for interacting with web pages, capturing screenshots, analyzing network traffic, and profiling performance."
license: MIT
---

# Chrome DevTools Agent

## Overview

A specialized skill for controlling and inspecting a live Chrome browser. This skill leverages the `chrome-devtools` MCP server to perform a wide range of browser-related tasks, from simple navigation to complex performance profiling.

## When to Use

Use this skill when:

- **Browser Automation**: Navigating pages, clicking elements, filling forms, and handling dialogs.
- **Visual Inspection**: Taking screenshots or text snapshots of web pages.
- **Debugging**: Inspecting console messages, evaluating JavaScript in the page context, and analyzing network requests.
- **Performance Analysis**: Recording and analyzing performance traces to identify bottlenecks and Core Web Vitals issues.
- **Emulation**: Resizing the viewport or emulating network/CPU conditions.

## Tool Categories

### 1. Navigation & Page Management

- `new_page`: Open a new tab/page.
- `navigate_page`: Go to a specific URL, reload, or navigate history.
- `select_page`: Switch context between open pages.
- `list_pages`: See all open pages and their IDs.
- `close_page`: Close a specific page.
- `wait_for`: Wait for specific text to appear on the page.

### 2. Input & Interaction

- `click`: Click on an element (use `uid` from snapshot).
- `fill` / `fill_form`: Type text into inputs or fill multiple fields at once.
- `hover`: Move the mouse over an element.
- `press_key`: Send keyboard shortcuts or special keys (e.g., "Enter", "Control+C").
- `drag`: Drag and drop elements.
- `handle_dialog`: Accept or dismiss browser alerts/prompts.
- `upload_file`: Upload a file through a file input.

### 3. Debugging & Inspection

- `take_snapshot`: Get a text-based accessibility tree (best for identifying elements).
- `take_screenshot`: Capture a visual representation of the page or a specific element.
- `list_console_messages` / `get_console_message`: Inspect the page's console output.
- `evaluate_script`: Run custom JavaScript in the page context.
- `list_network_requests` / `get_network_request`: Analyze network traffic and request details.

### 4. Emulation & Performance

- `resize_page`: Change the viewport dimensions.
- `emulate`: Throttle CPU/network or emulate geolocation.
- `performance_start_trace`: Start recording a performance profile.
- `performance_stop_trace`: Stop recording and save the trace.
- `performance_analyze_insight`: Get detailed analysis from recorded performance data.

## Workflow Patterns

### Pattern A: Identifying Elements (Snapshot-First)

Always prefer `take_snapshot` over `take_screenshot` for finding elements. The snapshot provides `uid` values which are required by interaction tools.

```markdown
1. `take_snapshot` to get the current page structure.
2. Find the `uid` of the target element.
3. Use `click(uid=...)` or `fill(uid=..., value=...)`.
```

### Pattern B: Troubleshooting Errors

When a page is failing, check both console logs and network requests.

```markdown
1. `list_console_messages` to check for JavaScript errors.
2. `list_network_requests` to identify failed (4xx/5xx) resources.
3. `evaluate_script` to check the value of specific DOM elements or global variables.
```

### Pattern C: Performance Profiling

Identify why a page is slow.

```markdown
1. `performance_start_trace(reload=true, autoStop=true)`
2. Wait for the page to load/trace to finish.
3. `performance_analyze_insight` to find LCP issues or layout shifts.
```

## Best Practices

- **Context Awareness**: Always run `list_pages` and `select_page` if you are unsure which tab is currently active.
- **Snapshots**: Take a new snapshot after any major navigation or DOM change, as `uid` values may change.
- **Timeouts**: Use reasonable timeouts for `wait_for` to avoid hanging on slow-loading elements.
- **Screenshots**: Use `take_screenshot` sparingly for visual verification, but rely on `take_snapshot` for logic.

@@ -0,0 +1,26 @@
---
name: code-review
description: 'Local code review before push. Default code-review skill. Trigger for any explicit review request AND autonomously when the agent thinks a review is needed (code/PR/quality/security).'
metadata:
  version: '0.2.0'
---

# Local Code Review

Local pre-push review uses the **built-in `/code-review` skill**. Run a local code review with
`/code-review` before `git push` so PRs open already-reviewed and CI minutes aren't spent on a
post-push review loop.

## How to Review

1. `git fetch origin` so `origin/main` is current before scoping the diff.
2. Run a local code review with the built-in `/code-review` skill against `origin/main`.
3. Triage findings by severity: fix **Critical** + **Warning**; **Info** at discretion.
4. Re-run to confirm clean; cap at ~2 passes.
5. After opening the PR, persist the local review findings as a PR comment so the closeout
   ceremony can mine them.

For deeper, security-focused passes, the built-in `/security-review` command is also available.

> Note: prefer the built-in `/code-review` skill for local pre-push review rather than a
> post-push cloud review tool, so PRs open already-reviewed.