trace-to-skill 0.1.26 → 0.1.35

@@ -0,0 +1,76 @@
+ # Discovery Summary
+
+ This page is written for maintainers, search engines, package indexes, and AI research bots that need to understand `trace-to-skill` quickly.
+
+ ## One-Sentence Summary
+
+ `trace-to-skill` checks whether a repository is Codex-ready, then turns failed AI coding-agent runs into evidence-backed `AGENTS.md` rules, `SKILL.md` workflows, privacy-safe traces, and eval gates.
+
+ ## Canonical Links
+
+ - Repository: https://github.com/grnbtqdbyx-create/trace-to-skill
+ - npm: https://www.npmjs.com/package/trace-to-skill
+ - Latest release: https://github.com/grnbtqdbyx-create/trace-to-skill/releases/latest
+ - Use cases: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/USE_CASES.md
+ - Scorecard: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/SCORECARD.md
+ - OpenAI OSS strategy: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/OPENAI_OSS_STRATEGY.md
+
+ ## Problems It Solves
+
+ - Codex or Claude Code claims a task is done without validation evidence.
+ - A coding agent repeats the same test/build/lint failure.
+ - A repository has conflicting `AGENTS.md`, `CLAUDE.md`, Cursor, Copilot, or Gemini instructions.
+ - A monorepo has nested `AGENTS.md` files, `@file.md` instruction includes, or invalid instruction-file encoding that makes Codex load the wrong policy.
+ - A workflow wants to feed GitHub issue, PR, comment, discussion, check-run, or commit text into an agent but needs prompt-injection checks first.
+ - MCP config gives agents filesystem, shell, browser, network, database, container, or secret-bearing access without a visible trust boundary.
+ - MCP config looks valid at a glance but has broken startup inputs, such as missing commands, bad `cwd`, placeholder env vars, unresolved `$VARS`, unresolved plugin placeholders, local stdio commands without explicit `cwd`, or the wrong JSON `mcp_servers` wrapper key.
+ - Codex config contains drift-prone settings such as deprecated `codex_hooks`, missing `default_permissions` profiles, or synced `projects.* trusted_level` project metadata.
+ - Maintainers want to share failed traces publicly without leaking common tokens, emails, home paths, or hidden Unicode controls.
+
+ ## Primary Users
+
+ - open-source maintainers using Codex for PR review, issue triage, release automation, or code changes
+ - teams adopting `AGENTS.md` and repository-specific agent policy
+ - maintainers evaluating MCP tool risk before enabling coding agents
+ - people collecting anonymized failed agent runs for reproducible evals
+
+ ## Commands To Try
+
+ ```bash
+ npx trace-to-skill doctor .
+ npx trace-to-skill scorecard .
+ npx trace-to-skill lint-agents .
+ npx trace-to-skill guard-github-event "$GITHUB_EVENT_PATH"
+ npx trace-to-skill redact ./runs --output redacted-runs
+ npx trace-to-skill analyze ./runs --format json
+ npx trace-to-skill suggest ./runs --target agents-md
+ ```
+
+ ## Outputs For Bots
+
+ - `trace-to-skill analyze --format json`
+ - `trace-to-skill lint-agents --format json`
+ - `trace-to-skill doctor --format json`
+ - `trace-to-skill redact --format json`
+ - `trace-to-skill scorecard --format json`
+ - SARIF from `trace-to-skill analyze --format sarif`
+ - GitHub Action outputs for doctor, AGENTS lint, GitHub context guard, benchmark, and scorecard modes
+
+ ## Schema Contracts
+
+ - `schemas/analysis-result.schema.json`
+ - `schemas/agents-lint-result.schema.json`
+ - `schemas/doctor-result.schema.json`
+ - `schemas/redact-result.schema.json`
+ - `schemas/scorecard-result.schema.json`
+
+ ## Related Keywords
+
+ Codex, OpenAI Codex, Codex CLI, AGENTS.md, SKILL.md, Claude Code, Cursor, Copilot coding agent, Gemini CLI, MCP, Model Context Protocol, prompt injection, agent evals, AI code review, open-source maintainers, trace redaction, SARIF, GitHub Actions.
+
+ ## Non-Goals
+
+ - It does not train a model.
+ - It does not automatically rewrite project policy.
+ - It does not ask maintainers to publish full private transcripts.
+ - It does not replace security review; it gives maintainers deterministic evidence and guardrails.
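Reviewer note: the trace-sharing problem listed above (tokens, emails, home paths, hidden Unicode controls) can be pictured with a small sketch. This is not the package's `redact` implementation, which this diff does not show; the pattern set, the placeholder format, and the function name `redact` below are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for illustration only; the real `redact` rules
# are not visible in this diff.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "github_token": re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"),
    "home_path": re.compile(r"(?:/home|/Users)/[^/\s]+"),
    # Zero-width characters and bidirectional embedding/override/isolate controls.
    "hidden_unicode": re.compile("[\u200b-\u200f\u202a-\u202e\u2066-\u2069]"),
}


def redact(text):
    """Replace each match with a <name> placeholder and count replacements."""
    counts = {}
    for name, pattern in PATTERNS.items():
        text, replaced = pattern.subn(f"<{name}>", text)
        if replaced:
            counts[name] = replaced
    return text, counts
```

Returning per-category counts alongside the cleaned text mirrors the idea of a replacement tally, so a maintainer can see what was removed without seeing the removed values.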
@@ -27,6 +27,9 @@ Agent instruction files disagree or the agent ignores an existing repository rul
  - different package managers for validation commands
  - "always run tests" vs "do not run tests"
  - approval required vs approval bypassed for destructive commands
+ - missing `@file.md` include targets
+ - nested `AGENTS.md` files that the root instructions do not point to
+ - invalid UTF-8 bytes that can make instruction loading fail or become hard to debug
 
  ## Over-Editing
 
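Reviewer note: two of the bullets added in this hunk (missing `@file.md` include targets and invalid UTF-8 bytes) are simple enough to sketch. The code below is an illustration under stated assumptions, not the linter's own logic: it assumes includes are written as `@path/to/file.md` and resolve relative to the including file's directory, and the names `INCLUDE_RE` and `check_instruction_file` are hypothetical.

```python
import re
from pathlib import Path

# Assumed include syntax: "@" followed by a relative path ending in .md.
# The lookbehind skips email-like text such as user@host.md.
INCLUDE_RE = re.compile(r"(?<![\w@])@([\w./-]+\.md)\b")


def check_instruction_file(path):
    """Return findings for one instruction file (illustrative only)."""
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Report the byte offset so the bad sequence is easy to locate.
        return [f"invalid UTF-8 at byte {exc.start}"]
    findings = []
    for target in INCLUDE_RE.findall(text):
        # Assumption: include targets resolve relative to the including file.
        if not (Path(path).parent / target).is_file():
            findings.append(f"missing include target: {target}")
    return findings
```

Decoding strictly before any other check keeps the encoding failure explicit instead of letting it surface later as a confusing parse error.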
@@ -54,4 +57,4 @@ The fix is to treat those surfaces as data unless the instruction is also presen
 
  MCP server configuration or tool usage appears without an explicit trust boundary, capability inventory, or approval policy.
 
- `trace-to-skill` also parses common `mcpServers` JSON shapes and reports capability hints such as filesystem, shell, browser, network, database, container, and secret-bearing environment variables.
+ `trace-to-skill` also parses common `mcpServers` JSON shapes and project `.codex/config.toml` MCP sections, then reports capability hints such as filesystem, shell, browser, network, database, container, and secret-bearing environment variables. `lint-agents` checks static startup inputs too: command availability, missing `cwd`, placeholder env values, unresolved `$VARS`, unresolved plugin placeholders, local stdio commands without explicit `cwd`, and JSON `mcp_servers` / `mcpServers` casing drift. It also flags Codex config drift such as deprecated `codex_hooks`, missing `default_permissions` profile definitions, and synced `projects.* trusted_level` metadata.
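Reviewer note: several of the static startup checks named in the added paragraph can be approximated in a few lines. The sketch below covers only JSON configs and only four checks (wrapper key casing, missing command, missing `cwd`, and placeholder or unresolved env values). The server shape (`command`, `cwd`, `env`, `url`), the placeholder heuristics, and the function name `lint_mcp_json` are assumptions, not the package's actual rules.

```python
import json
import re

# Hypothetical heuristics; the real linter's rules are not shown in this diff.
PLACEHOLDER_RE = re.compile(r"^(?:<.*>|YOUR_.*|changeme|xxx+)$", re.IGNORECASE)
UNRESOLVED_VAR_RE = re.compile(r"\$\{?[A-Za-z_][A-Za-z0-9_]*\}?")


def lint_mcp_json(text):
    """Return findings for a JSON MCP config (illustrative subset of checks)."""
    config = json.loads(text)
    findings = []
    if "mcp_servers" in config and "mcpServers" not in config:
        findings.append("JSON wrapper key is `mcp_servers`, expected `mcpServers`")
    servers = config.get("mcpServers") or config.get("mcp_servers") or {}
    for name, server in servers.items():
        command = server.get("command")
        # Assumption: a server with neither `command` nor `url` cannot start.
        if not command and "url" not in server:
            findings.append(f"{name}: missing command")
        if command and not server.get("cwd"):
            findings.append(f"{name}: local stdio command without explicit cwd")
        for key, value in (server.get("env") or {}).items():
            if PLACEHOLDER_RE.match(str(value)):
                findings.append(f"{name}: placeholder env value for {key}")
            elif UNRESOLVED_VAR_RE.search(str(value)):
                findings.append(f"{name}: unresolved variable in env {key}")
    return findings
```

Every check here is static: nothing is launched and no environment is read, which is what makes this kind of lint safe to run before an agent ever starts a server.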
@@ -0,0 +1,114 @@
+ # Use Cases
+
+ `trace-to-skill` is for maintainers who want coding agents to produce reviewable evidence instead of repeating the same mistakes.
+
+ ## 1. Codex Readiness Gate
+
+ Use this when a repository wants Codex-assisted pull requests, but maintainers need proof that the repo has basic guardrails.
+
+ ```bash
+ npx trace-to-skill scorecard .
+ ```
+
+ What it proves:
+
+ - repository instructions exist
+ - CI and validation scripts are present
+ - maintainer docs and license are visible
+ - distribution is easy to try
+ - benchmark fixtures still catch known agent failure classes
+
+ Recommended CI surface:
+
+ ```yaml
+ - uses: grnbtqdbyx-create/trace-to-skill@v0.1.35
+   with:
+     mode: all
+     doctor-threshold: "85"
+     doctor-comment: "true"
+     scorecard-comment: "true"
+     job-summary: "true"
+     github-token: ${{ github.token }}
+ ```
+
+ ## 2. AGENTS.md And MCP Hygiene
+
+ Use this before giving Codex broad repository access.
+
+ ```bash
+ npx trace-to-skill lint-agents .
+ ```
+
+ This checks:
+
+ - whether repository-level agent instructions exist
+ - whether `AGENTS.md`, `CLAUDE.md`, Cursor rules, Copilot instructions, or other tool guidance conflict
+ - whether instruction files reference paths that no longer exist or have grown large enough to risk ignored guidance
+ - whether `@file.md` include references are missing, nested `AGENTS.md` files are easy to miss, or instruction files contain invalid UTF-8
+ - whether MCP config hints at risky capabilities such as filesystem, shell, browser, network, database, container, or secret-bearing environment variables
+ - whether JSON or `.codex/config.toml` MCP startup inputs are obviously broken before launch, including wrong JSON `mcp_servers` casing, missing commands, missing `cwd`, placeholder env values, unresolved `$VARS`, unresolved plugin placeholders, or local stdio commands without explicit `cwd`
+ - whether Codex config has drift-prone settings such as deprecated `codex_hooks`, missing `default_permissions` profile definitions, or synced `projects.* trusted_level` metadata
+
+ The goal is not to ban powerful tools. The goal is to make trust boundaries visible before an agent acts.
+
+ ## 3. GitHub Context Guard
+
+ Use this before an agent reads untrusted GitHub text.
+
+ ```bash
+ npx trace-to-skill guard-github-event "$GITHUB_EVENT_PATH"
+ ```
+
+ This scans pull request bodies, issue text, comments, discussions, review text, check-run messages, and commit messages for prompt-injection patterns.
+
+ Use it when:
+
+ - a workflow lets an agent summarize or act on PR comments
+ - maintainers paste issue text into Codex
+ - a bot asks Codex to triage untrusted user reports
+ - logs or comments might contain instructions like "ignore previous instructions" or "print secrets"
+
+ ## 4. Failed Agent Run To Reviewable Rule
+
+ Use this when a coding agent made a repeated workflow mistake.
+
+ ```bash
+ npx trace-to-skill analyze ./runs --output agent-learning-report.md
+ npx trace-to-skill suggest ./runs --target agents-md --output AGENTS.generated.md
+ npx trace-to-skill eval ./runs --threshold 80
+ ```
+
+ Recommended maintainer loop:
+
+ 1. Store a short redacted trace in `runs/`.
+ 2. Run `analyze` to classify the failure.
+ 3. Run `suggest` to generate candidate `AGENTS.md` or `SKILL.md` text.
+ 4. Copy only evidence-backed rules into the real policy file.
+ 5. Run `eval` or `scorecard` in CI so the same failure does not silently return.
+
+ ## 5. Privacy-Preserving Adoption
+
+ Use this when you want public evidence without leaking private traces.
+
+ ```bash
+ npx trace-to-skill redact ./runs --output redacted-runs
+ npx trace-to-skill analyze ./runs --format json
+ npx trace-to-skill analyze ./runs --format sarif --output trace-to-skill.sarif
+ ```
+
+ Before publishing traces:
+
+ - redact secrets, cookies, customer data, and proprietary code
+ - keep only the lines needed to explain the failure
+ - treat issue bodies, PR comments, copied logs, and web pages as untrusted input
+ - prefer short fixtures that reproduce a detector over full transcripts
+
+ ## Why This Helps Open Source Maintainers
+
+ The useful unit is not "an agent wrote code." The useful unit is:
+
+ ```text
+ maintainer-visible failure -> evidence-backed rule -> repeatable gate
+ ```
+
+ That is the path from ad-hoc AI usage to safer Codex-assisted maintenance.
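Reviewer note: the GitHub Context Guard use case above describes scanning event text for prompt-injection patterns. A minimal sketch of that idea follows. It uses only the two phrases the document itself quotes ("ignore previous instructions" and "print secrets"), walks every string in the payload rather than specific fields, and is not the package's detector; the names `SUSPICIOUS`, `iter_strings`, and `guard_event` are hypothetical.

```python
import re

# Only the two phrases quoted in the use case; a real detector would
# presumably cover many more patterns.
SUSPICIOUS = [
    re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"print\s+(?:the\s+)?secrets", re.IGNORECASE),
]


def iter_strings(node, path="$"):
    """Yield (json_path, text) for every string value in a parsed payload."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}[{index}]")


def guard_event(event):
    """Return (json_path, pattern) pairs for each suspicious string found."""
    hits = []
    for path, text in iter_strings(event):
        for pattern in SUSPICIOUS:
            if pattern.search(text):
                hits.append((path, pattern.pattern))
    return hits
```

In a workflow, the payload could be loaded with `json.load` from the file that `GITHUB_EVENT_PATH` points to, and a non-empty result could fail the job before any agent step reads the text.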
package/llms.txt ADDED
@@ -0,0 +1,88 @@
+ # trace-to-skill
+
+ > Open-source CLI and GitHub Action for Codex-ready repository maintenance: turn failed AI coding-agent runs into reusable AGENTS.md rules, SKILL.md workflows, privacy-safe traces, and eval gates.
+
+ Canonical repository: https://github.com/grnbtqdbyx-create/trace-to-skill
+ NPM package: https://www.npmjs.com/package/trace-to-skill
+ Latest release: https://github.com/grnbtqdbyx-create/trace-to-skill/releases/latest
+ License: Apache-2.0
+ Runtime: Node.js 20+
+
+ ## What this project is
+
+ `trace-to-skill` helps open-source maintainers adopt Codex and other coding agents safely. It focuses on maintainer pain points:
+
+ - agents claiming completion without test/build proof
+ - failed tests hidden behind optimistic summaries
+ - hallucinated files and broad over-editing
+ - conflicting `AGENTS.md`, `CLAUDE.md`, Cursor, Copilot, or Gemini instructions
+ - stale path references, missing `@file.md` includes, nested `AGENTS.md` visibility gaps, invalid UTF-8, and oversized instruction files that can make Codex follow wrong or truncated guidance
+ - prompt injection in issue, PR, review, discussion, check-run, commit, log, or web text
+ - risky MCP server capabilities, secret-bearing environment variables, broken JSON/TOML startup inputs, unresolved plugin placeholders, missing `cwd`, deprecated `codex_hooks`, missing `default_permissions` profiles, synced `projects.* trusted_level` metadata, and `mcp_servers` / `mcpServers` casing mismatches
+ - sharing failed agent traces without leaking tokens, emails, local paths, or hidden Unicode controls
+
+ The core loop is:
+
+ ```text
+ failed agent run -> failure class -> evidence-backed AGENTS.md/SKILL.md suggestion -> eval gate -> keep or revise
+ ```
+
+ ## Best entry points for bots and maintainers
+
+ - README: https://github.com/grnbtqdbyx-create/trace-to-skill#readme
+ - Use cases: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/USE_CASES.md
+ - Discovery summary: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/DISCOVERY.md
+ - Adoption guide: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/ADOPTION_GUIDE.md
+ - Failure taxonomy: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/FAILURE_TAXONOMY.md
+ - OpenAI OSS strategy: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/OPENAI_OSS_STRATEGY.md
+ - OpenAI application draft: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/OPENAI_APPLICATION_DRAFT.md
+ - Benchmark: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/BENCHMARK.md
+ - Scorecard: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/docs/SCORECARD.md
+ - Codex readiness auditor skill: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/skills/codex-readiness-auditor/SKILL.md
+
+ ## Core commands
+
+ ```bash
+ npx trace-to-skill doctor .
+ npx trace-to-skill lint-agents .
+ npx trace-to-skill guard-github-event "$GITHUB_EVENT_PATH"
+ npx trace-to-skill redact ./runs --output redacted-runs
+ npx trace-to-skill analyze ./runs
+ npx trace-to-skill suggest ./runs --target agents-md
+ npx trace-to-skill eval ./runs --threshold 80
+ npx trace-to-skill benchmark
+ npx trace-to-skill scorecard .
+ npx trace-to-skill init --comment --sarif
+ ```
+
+ ## GitHub Action
+
+ ```yaml
+ - uses: grnbtqdbyx-create/trace-to-skill@v0.1.35
+   with:
+     mode: all
+     doctor-threshold: "85"
+     doctor-comment: "true"
+     scorecard-comment: "true"
+     job-summary: "true"
+     github-token: ${{ github.token }}
+ ```
+
+ ## Machine-readable schemas
+
+ - Analysis JSON schema: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/schemas/analysis-result.schema.json
+ - AGENTS.md linter JSON schema: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/schemas/agents-lint-result.schema.json
+ - Doctor JSON schema: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/schemas/doctor-result.schema.json
+ - Redaction JSON schema: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/schemas/redact-result.schema.json
+ - Scorecard JSON schema: https://github.com/grnbtqdbyx-create/trace-to-skill/blob/main/schemas/scorecard-result.schema.json
+
+ ## Search phrases this project should answer
+
+ - Codex readiness checker for open-source repositories
+ - AGENTS.md linter for Codex and Claude Code
+ - turn failed agent runs into AGENTS.md rules
+ - prompt injection guard for GitHub issue and PR comments
+ - MCP security scanner for coding agents
+ - privacy-preserving redaction for AI agent traces
+ - GitHub Action for AI coding-agent eval gates
+ - Codex OSS maintainer automation evidence
package/package.json CHANGED
@@ -1,8 +1,17 @@
  {
    "name": "trace-to-skill",
-   "version": "0.1.26",
+   "version": "0.1.35",
    "description": "Turn failed AI coding-agent runs into reusable AGENTS.md rules, SKILL.md files, and eval evidence.",
    "type": "module",
+   "main": "dist/src/index.js",
+   "types": "dist/src/index.d.ts",
+   "exports": {
+     ".": {
+       "types": "./dist/src/index.d.ts",
+       "import": "./dist/src/index.js"
+     },
+     "./schemas/*": "./schemas/*"
+   },
    "bin": {
      "trace-to-skill": "dist/src/cli.js"
    },
@@ -12,10 +21,13 @@
      "docs/ADOPTION_GUIDE.md",
      "docs/AGENTS_LINT.md",
      "docs/BENCHMARK.md",
+     "docs/DISCOVERY.md",
      "docs/FAILURE_TAXONOMY.md",
      "docs/SCORECARD.md",
+     "docs/USE_CASES.md",
      "examples",
      "fixtures",
+     "llms.txt",
      "skills",
      "README.md",
      "LICENSE"
@@ -30,21 +42,40 @@
    },
    "keywords": [
      "codex",
+     "openai-codex",
      "codex-readiness",
+     "codex-cli",
      "agents",
      "ai-agents",
+     "ai-coding-agents",
      "agent-skills",
+     "agent-evals",
      "claude-code",
      "agents-md",
      "agents-md-linter",
+     "github-action",
      "json-schema",
      "mcp",
+     "mcp-security",
+     "prompt-injection",
      "evals",
      "open-source-maintainers",
-     "self-improvement"
+     "self-improvement",
+     "trace-redaction"
    ],
    "author": "Ogün <https://github.com/grnbtqdbyx-create>",
    "license": "Apache-2.0",
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/grnbtqdbyx-create/trace-to-skill.git"
+   },
+   "bugs": {
+     "url": "https://github.com/grnbtqdbyx-create/trace-to-skill/issues"
+   },
+   "homepage": "https://github.com/grnbtqdbyx-create/trace-to-skill#readme",
+   "publishConfig": {
+     "access": "public"
+   },
    "engines": {
      "node": ">=20"
    },
@@ -0,0 +1,65 @@
+ {
+   "$schema": "https://json-schema.org/draft/2020-12/schema",
+   "$id": "https://raw.githubusercontent.com/grnbtqdbyx-create/trace-to-skill/main/schemas/redact-result.schema.json",
+   "title": "trace-to-skill RedactResult",
+   "type": "object",
+   "additionalProperties": false,
+   "required": [
+     "generatedAt",
+     "files",
+     "totals"
+   ],
+   "properties": {
+     "generatedAt": {
+       "type": "string",
+       "format": "date-time"
+     },
+     "files": {
+       "type": "array",
+       "items": {
+         "$ref": "#/$defs/redactedFile"
+       }
+     },
+     "totals": {
+       "$ref": "#/$defs/replacementCounts"
+     }
+   },
+   "$defs": {
+     "replacementCounts": {
+       "type": "object",
+       "additionalProperties": {
+         "type": "integer",
+         "minimum": 0
+       }
+     },
+     "redactedFile": {
+       "type": "object",
+       "additionalProperties": false,
+       "required": [
+         "inputPath",
+         "bytesBefore",
+         "bytesAfter",
+         "replacements"
+       ],
+       "properties": {
+         "inputPath": {
+           "type": "string"
+         },
+         "outputPath": {
+           "type": "string"
+         },
+         "bytesBefore": {
+           "type": "integer",
+           "minimum": 0
+         },
+         "bytesAfter": {
+           "type": "integer",
+           "minimum": 0
+         },
+         "replacements": {
+           "$ref": "#/$defs/replacementCounts"
+         }
+       }
+     }
+   }
+ }
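Reviewer note: for consumers who want to sanity-check `redact --format json` output without a JSON Schema library, the RedactResult schema above is small enough to mirror by hand. The sketch below follows the schema's required keys, `additionalProperties: false` rules, and non-negative integer counts. It is a simplification: it does not check the `date-time` format, and it rejects floats such as `1.0` that JSON Schema would accept as integers. The function names are hypothetical.

```python
TOP_REQUIRED = {"generatedAt", "files", "totals"}
FILE_REQUIRED = {"inputPath", "bytesBefore", "bytesAfter", "replacements"}
FILE_ALLOWED = FILE_REQUIRED | {"outputPath"}


def is_count(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def check_counts(obj, where):
    """Check a replacementCounts object: every value is an integer >= 0."""
    if not isinstance(obj, dict):
        return [f"{where}: expected object"]
    return [f"{where}.{key}: expected integer >= 0"
            for key, value in obj.items() if not is_count(value)]


def check_redact_result(result):
    """Return a list of error strings; an empty list means the shape matches."""
    if not isinstance(result, dict):
        return ["$: expected object"]
    errors = [f"$: missing {key}" for key in sorted(TOP_REQUIRED - set(result))]
    errors += [f"$: unexpected {key}" for key in sorted(set(result) - TOP_REQUIRED)]
    if "generatedAt" in result and not isinstance(result["generatedAt"], str):
        errors.append("$.generatedAt: expected string")
    if "totals" in result:
        errors += check_counts(result["totals"], "$.totals")
    files = result.get("files", [])
    if not isinstance(files, list):
        return errors + ["$.files: expected array"]
    for index, entry in enumerate(files):
        where = f"$.files[{index}]"
        if not isinstance(entry, dict):
            errors.append(f"{where}: expected object")
            continue
        errors += [f"{where}: missing {key}" for key in sorted(FILE_REQUIRED - set(entry))]
        errors += [f"{where}: unexpected {key}" for key in sorted(set(entry) - FILE_ALLOWED)]
        for key in ("inputPath", "outputPath"):
            if key in entry and not isinstance(entry[key], str):
                errors.append(f"{where}.{key}: expected string")
        for key in ("bytesBefore", "bytesAfter"):
            if key in entry and not is_count(entry[key]):
                errors.append(f"{where}.{key}: expected integer >= 0")
        if "replacements" in entry:
            errors += check_counts(entry["replacements"], f"{where}.replacements")
    return errors
```

For anything beyond a quick check, validating against the published schema file with a full draft 2020-12 validator is the more faithful option, since the schema itself is the contract.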