@minhpnq1807/contextos 0.5.53 → 0.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. package/.codex/skills/contextos-community/SKILL.md +15 -0
  2. package/.codex/skills/contextos-community/skill.yaml +20 -0
  3. package/.codex/skills/contextos-release/SKILL.md +15 -0
  4. package/.codex/skills/contextos-release/skill.yaml +20 -0
  5. package/.codex/skills/contextos-routing/SKILL.md +15 -0
  6. package/.codex/skills/contextos-routing/skill.yaml +20 -0
  7. package/.codex/workflows/primary.md +13 -0
  8. package/.codex/workflows/release.md +12 -0
  9. package/CHANGELOG.md +13 -0
  10. package/README.md +100 -2
  11. package/bin/ctx.js +12 -0
  12. package/community-skills/README.md +42 -0
  13. package/community-skills/_template/SKILL.md +15 -0
  14. package/community-skills/_template/skill.yaml +20 -0
  15. package/community-skills/eas/SKILL.md +15 -0
  16. package/community-skills/eas/skill.yaml +23 -0
  17. package/community-skills/jwt-auth/SKILL.md +15 -0
  18. package/community-skills/jwt-auth/skill.yaml +22 -0
  19. package/community-skills/oauth-google/SKILL.md +15 -0
  20. package/community-skills/oauth-google/skill.yaml +22 -0
  21. package/community-skills/prisma/SKILL.md +15 -0
  22. package/community-skills/prisma/skill.yaml +22 -0
  23. package/community-skills/redis/SKILL.md +15 -0
  24. package/community-skills/redis/skill.yaml +22 -0
  25. package/community-skills/vercel/SKILL.md +15 -0
  26. package/community-skills/vercel/skill.yaml +22 -0
  27. package/docs/demo/agents-lost-middle.gif +0 -0
  28. package/docs/demo/agents-lost-middle.txt +28 -0
  29. package/docs/demo/contextos-ready.gif +0 -0
  30. package/docs/demo/contextos-ready.txt +20 -0
  31. package/docs/demo/same-prompt-different-context.gif +0 -0
  32. package/docs/demo/same-prompt-different-context.txt +26 -0
  33. package/docs/launch-demos.md +127 -0
  34. package/docs/roadmap.md +285 -0
  35. package/eval/hallucination/run-leaderboard.js +183 -0
  36. package/package.json +5 -1
  37. package/plugins/ctx/.codex-plugin/plugin.json +1 -1
  38. package/plugins/ctx/lib/certification.js +223 -0
package/community-skills/redis/skill.yaml +22 -0
@@ -0,0 +1,22 @@
+ id: redis
+ name: Redis
+ description: Add and debug Redis cache, TTL, sessions, rate limits, queues, and pub/sub behavior.
+ positive_triggers:
+   prompts: [redis, cache, caching, ttl, session, rate limit, queue, bullmq, invalidation]
+   files: [redis.conf, docker-compose.yml]
+   dependencies: [redis, ioredis, bullmq, cache-manager-redis-store]
+ evidence:
+   files: [redis.conf, docker-compose.yml, docker-compose.yaml]
+   dependencies: [redis, ioredis, bullmq, cache-manager, cache-manager-redis-store]
+ negative_triggers:
+   prompts: [browser cache, next cache, static cache]
+   dependencies: [swr]
+ workflow:
+   - Inspect client setup, cache keys, TTLs, and invalidation paths.
+   - Identify whether Redis backs cache, queue, session, rate limit, or pub/sub behavior.
+   - Patch the smallest service boundary while preserving key conventions.
+   - Verify with focused tests or a command that exercises the Redis path.
+ related_skills:
+   - performance-optimization
+   - backend-development
+   - observability
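As a rough illustration of how a pack like the one above could gate routing, the sketch below checks a prompt and a repo's dependency list against `positive_triggers` and `negative_triggers`. The `matchPack` function, its return shape, the substring matching, and the rule that any negative trigger vetoes the pack are all assumptions for illustration; this diff does not show how ContextOS actually evaluates these fields.

```javascript
// Hypothetical sketch, not the ContextOS router. Field names come from the
// redis skill.yaml above; matchPack and its result shape are assumptions.
const redisPack = {
  id: "redis",
  positive_triggers: {
    prompts: ["redis", "cache", "ttl", "rate limit", "queue"],
    dependencies: ["redis", "ioredis", "bullmq"],
  },
  negative_triggers: {
    prompts: ["browser cache", "next cache", "static cache"],
    dependencies: ["swr"],
  },
};

function matchPack(pack, prompt, repoDependencies) {
  const text = prompt.toLowerCase();
  const deps = new Set(repoDependencies);
  // Phrases from a trigger list that occur in the prompt.
  const promptHits = (list = []) => list.filter((phrase) => text.includes(phrase));
  // Dependency names from a trigger list that the repo actually declares.
  const depHits = (list = []) => list.filter((name) => deps.has(name));

  const negative = [
    ...promptHits(pack.negative_triggers.prompts),
    ...depHits(pack.negative_triggers.dependencies),
  ];
  const positive = [
    ...promptHits(pack.positive_triggers.prompts),
    ...depHits(pack.positive_triggers.dependencies),
  ];
  // Assumption: any negative trigger vetoes the pack outright.
  const selected = negative.length === 0 && positive.length > 0;
  return { id: pack.id, selected, positive, negative };
}

const backendTask = matchPack(redisPack, "Fix cache TTL for sessions", ["express", "ioredis"]);
const frontendTask = matchPack(redisPack, "Clear the browser cache on logout", ["react", "swr"]);
console.log(backendTask.selected, frontendTask.selected);
```

In this sketch the second prompt still contains the word "cache", but the `browser cache` phrase and the `swr` dependency keep the Redis pack from being selected, which is the kind of false positive the negative gates appear designed to prevent.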
package/community-skills/vercel/SKILL.md +15 -0
@@ -0,0 +1,15 @@
+ ---
+ name: Vercel Deployment
+ description: Fix Next.js and Vercel deployment failures, environment issues, build output problems, and production routing regressions.
+ ---
+
+ # Vercel Deployment
+
+ Use this skill when the repo has Vercel or Next.js evidence such as `vercel.json`, `next`, or `next.config.*`.
+
+ ## Workflow
+
+ 1. Inspect `vercel.json`, Next config, package scripts, and deployment logs.
+ 2. Check build command, output directory, environment variables, and route/runtime settings.
+ 3. Patch the minimal config or code path that explains the deployment failure.
+ 4. Verify with the project build command and any available Vercel validation.
package/community-skills/vercel/skill.yaml +22 -0
@@ -0,0 +1,22 @@
+ id: vercel
+ name: Vercel Deployment
+ description: Fix Vercel and Next.js deployment failures.
+ positive_triggers:
+   prompts: [vercel, deployed, deploy, production, preview, build failed, environment variable]
+   files: [vercel.json, next.config.js, next.config.ts]
+   dependencies: [next, vercel]
+ evidence:
+   files: [vercel.json, next.config.js, next.config.ts, .github/workflows/*]
+   dependencies: [next, vercel, react]
+ negative_triggers:
+   dependencies: [expo, react-native, eas-cli]
+   files: [eas.json, app.json]
+ workflow:
+   - Inspect Vercel config, Next config, package scripts, and deploy logs.
+   - Check build command, output directory, env vars, runtime, and route config.
+   - Patch the minimal config or code path that explains the failure.
+   - Verify with the project build command.
+ related_skills:
+   - github-actions-ci-cd
+   - env-secret-management
+   - build-log-debugging
package/docs/demo/agents-lost-middle.gif +0 -0
Binary file
package/docs/demo/agents-lost-middle.txt +28 -0
@@ -0,0 +1,28 @@
+ $ cat AGENTS.md
+ 1. General style
+ 2. Formatting
+ 3. Test names
+ ...
+ 37. IMPORTANT: Always use code-review-graph before grep.
+ ...
+ 52. Release notes
+
+ $ codex "fix failing test"
+ Raw agent starts with grep
+ Rule followed: no
+
+ $ ctx debug -- "fix failing test"
+ ContextOS debug
+ Critical ContextOS rules:
+ - IMPORTANT: Always use code-review-graph before grep.
+
+ Suggested files to check:
+ - test/score-context.test.js
+ - plugins/ctx/lib/score-context.js
+
+ $ codex + ContextOS
+ Rule followed: yes
+ Evidence: graph checked before file reads
+
+ AGENTS.md did not change.
+ The rule moved from buried context into runtime context.
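The transcript above shows a rule at position 37 of `AGENTS.md` being surfaced by `ctx debug`. One simple way to get a similar effect is to scan the file for lines flagged as important and list them ahead of the task. The sketch below assumes an `IMPORTANT:` prefix convention and a hypothetical `extractCriticalRules` helper; neither is confirmed by this diff, and the real extraction logic may differ.

```javascript
// Hypothetical sketch; the real ContextOS extraction logic is not shown in this diff.
function extractCriticalRules(agentsMd) {
  return agentsMd
    .split("\n")
    // Drop list numbering such as "37. " so only the rule text remains.
    .map((line) => line.trim().replace(/^\d+\.\s*/, ""))
    // Assumption: critical rules are marked with an "IMPORTANT:" prefix.
    .filter((line) => line.startsWith("IMPORTANT:"));
}

const agentsMd = [
  "1. General style",
  "2. Formatting",
  "37. IMPORTANT: Always use code-review-graph before grep.",
  "52. Release notes",
].join("\n");

const critical = extractCriticalRules(agentsMd);
console.log(critical);
```

Because the filter depends only on the marker and not on line position, a rule buried in the middle of a long file is treated the same as one at the top.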
package/docs/demo/contextos-ready.gif +0 -0
Binary file
package/docs/demo/contextos-ready.txt +20 -0
@@ -0,0 +1,20 @@
+ $ ctx doctor
+ Repository Score
+
+ Rules: 100
+ Skills: 100
+ Workflows: 100
+
+ Overall:
+ ContextOS Ready Gold
+
+ Evidence:
+ - Rules: 1 AGENTS.md source(s), 5 actionable rule(s)
+ - Skills: 3 skill(s), 3 metadata file(s)
+ - Workflows: 2 workflow(s), 2 with agent chain(s)
+
+ $ badge
+ [ContextOS Ready Gold]
+
+ Repos now have a target:
+ AGENTS.md + skills + workflows + evidence.
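To make the relationship between the evidence lines and the scores above more concrete, here is a minimal sketch that turns coverage counts into 0 to 100 scores and picks a badge. The ratio formula, the tier cutoffs, and the tier names other than "ContextOS Ready Gold" are illustrative assumptions; the actual scoring lives in `plugins/ctx/lib/certification.js`, whose contents are not shown in this part of the diff.

```javascript
// Hypothetical sketch: ratios and tier cutoffs are illustrative assumptions,
// not the scoring used by ctx doctor.
function coverageScore(complete, total) {
  return total === 0 ? 0 : Math.round((complete / total) * 100);
}

function badgeFor(scores) {
  // Assumption: the weakest category decides the tier.
  const lowest = Math.min(scores.rules, scores.skills, scores.workflows);
  if (lowest >= 80) return "ContextOS Ready Gold"; // assumed cutoff
  if (lowest >= 50) return "ContextOS Ready"; // assumed tier name and cutoff
  return "Not ready"; // assumed label
}

const scores = {
  rules: 100, // taken as given; rule scoring is not modelled here
  skills: coverageScore(3, 3), // 3 skills, 3 metadata files
  workflows: coverageScore(2, 2), // 2 workflows, 2 with agent chains
};
console.log(scores, badgeFor(scores));
```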
package/docs/demo/same-prompt-different-context.txt +26 -0
@@ -0,0 +1,26 @@
+ $ ctx leaderboard --hallucination
+ Hallucination Leaderboard
+ Repos: 12
+ Tasks: 20
+
+ System             Correct Skill
+ ------------------ -------------
+ Raw Agent          10.0%
+ ContextOS + Codex  80.0%
+
+ $ ctx skills doctor -- "fix deployed" # Expo repo
+ ContextOS skill doctor
+ 1. eas high confidence
+ evidence: eas.json, app.json, expo dependency
+ 2. mobile-deployment high confidence
+ 3. github-actions-ci-cd medium confidence
+
+ $ ctx skills doctor -- "fix deployed" # Next.js repo
+ ContextOS skill doctor
+ 1. vercel-deployment high confidence
+ evidence: vercel.json, next dependency
+ 2. github-actions-ci-cd high confidence
+ 3. env-secret-management medium confidence
+
+ Same prompt. Same model. Different repo evidence.
+ ContextOS routes the right skill before the agent edits code.
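If each system in the leaderboard above was scored on all 20 tasks, 10.0% corresponds to 2 correct skill selections and 80.0% to 16. The sketch below shows one way such a "Correct Skill" rate could be computed from per-task records. The `correctSkillRate` helper and the two task records are made up for illustration and are not benchmark data from `eval/hallucination/run-leaderboard.js`.

```javascript
// Hypothetical sketch: the task records below are illustrative, not benchmark data.
function correctSkillRate(results) {
  if (results.length === 0) return "n/a";
  // A task counts as correct when the selected skill matches the expected one.
  const correct = results.filter((r) => r.selected === r.expected).length;
  return `${((correct / results.length) * 100).toFixed(1)}%`;
}

const rawAgent = [
  { repo: "expo-app", expected: "eas", selected: "vercel-deployment" },
  { repo: "next-app", expected: "vercel-deployment", selected: "vercel-deployment" },
];
const routed = [
  { repo: "expo-app", expected: "eas", selected: "eas" },
  { repo: "next-app", expected: "vercel-deployment", selected: "vercel-deployment" },
];
console.log(correctSkillRate(rawAgent), correctSkillRate(routed));
```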
package/docs/launch-demos.md +127 -0
@@ -0,0 +1,127 @@
+ # Launch Demos
+
+ These are demo scripts for explaining ContextOS quickly. They are intentionally small and visual.
+
+ ## 1. Agent Hallucination Benchmark
+
+ GIF: [`docs/demo/same-prompt-different-context.gif`](demo/same-prompt-different-context.gif)
+
+ Prompt:
+
+ ```text
+ Fix deployment
+ ```
+
+ Raw agent:
+
+ ```text
+ Suggests: Vercel, Docker, Railway
+ Reason: guessed from common deployment tools
+ ```
+
+ ContextOS:
+
+ ```text
+ Detected:
+ - eas.json
+ - expo dependency
+ - GitHub workflow
+
+ Selected:
+ - eas
+ - mobile-deployment
+ - github-actions-ci-cd
+ ```
+
+ Message:
+
+ ```text
+ Same prompt. Same model. Different context.
+ ```
+
+ ## 2. AGENTS.md Lost In The Middle
+
+ GIF: [`docs/demo/agents-lost-middle.gif`](demo/agents-lost-middle.gif)
+
+ Setup:
+
+ ```text
+ AGENTS.md
+ rule 1
+ rule 2
+ ...
+ IMPORTANT: Always use code-review-graph before grep.
+ ...
+ rule 40
+ ```
+
+ Raw agent:
+
+ ```text
+ Misses the buried rule.
+ ```
+
+ ContextOS:
+
+ ```text
+ Extracts the relevant rule and injects it before work starts.
+ ```
+
+ Message:
+
+ ```text
+ Important repo rules should not depend on where they appear in a long file.
+ ```
+
+ ## 3. Repo-Aware Skills
+
+ GIF: [`docs/demo/same-prompt-different-context.gif`](demo/same-prompt-different-context.gif)
+
+ Prompt:
+
+ ```text
+ fix deployed
+ ```
+
+ Repo A:
+
+ ```text
+ Evidence: expo, eas.json
+ Skills: eas, mobile-deployment
+ ```
+
+ Repo B:
+
+ ```text
+ Evidence: next, vercel.json
+ Skills: vercel-deployment, github-actions-ci-cd
+ ```
+
+ Repo C:
+
+ ```text
+ Evidence: Dockerfile, docker-compose.yml
+ Skills: docker, build-log-debugging
+ ```
+
+ Message:
+
+ ```text
+ Context is not extra text. It changes the correct answer.
+ ```
+
+ ## 4. ContextOS Ready
+
+ GIF: [`docs/demo/contextos-ready.gif`](demo/contextos-ready.gif)
+
+ Command:
+
+ ```bash
+ ctx doctor
+ ```
+
+ Message:
+
+ ```text
+ Repos now have a target: AGENTS.md + skills + workflows + evidence.
+ ```
package/docs/roadmap.md +285 -0
@@ -0,0 +1,285 @@
+ # Roadmap
+
+ ContextOS is past the core routing layer. The next work should make the value visible faster and create a community loop.
+
+ ## P1: Hallucination Leaderboard
+
+ The strongest launch artifact is not another feature. It is a leaderboard that shows raw prompt-only agents making plausible guesses while ContextOS routes from repo evidence.
+
+ Layout:
+
+ ```text
+ benchmarks/
+   codex/
+   claude-code/
+   cursor/
+   gemini-cli/
+   contextos/
+ ```
+
+ Protocol:
+
+ ```text
+ same repo
+ same task
+ same model when possible
+ same scoring rubric
+ ```
+
+ Example task:
+
+ ```text
+ Task: Fix deployment
+ Repo: Expo app
+ ```
+
+ Example result:
+
+ ```text
+ System             Correct Skill
+ Raw Agent          ❌
+ ContextOS + Codex  ✅
+ ```
+
+ Target public table:
+
+ ```text
+ Hallucination Benchmark
+
+ Claude Code: 61%
+ Cursor: 58%
+ Raw Codex: 63%
+ ContextOS + Codex: 89%
+ ```
+
+ Why it matters:
+
+ - It is easy to understand in seconds.
+ - It turns ContextOS from infrastructure into a visible correctness story.
+ - It creates content for GitHub, Hacker News, Reddit, and X/Twitter.
+
+ ## P2: Agent Replay
+
+ ContextOS already records prompt context, suggested files, suggested skills, rule outcomes, telemetry, and reports. Agent Replay should turn that into a compact post-task narrative.
+
+ Planned command:
+
+ ```bash
+ ctx replay
+ ```
+
+ Target output:
+
+ ```text
+ Prompt:
+ Fix deployment
+
+ Selected skills:
+ - eas
+ - github-actions-ci-cd
+
+ Rules followed:
+ ✓ Use graph first
+
+ Files suggested:
+ ✓ eas.json
+ ✓ workflow.yml
+
+ Files actually touched:
+ ✓ eas.json
+ ✓ workflow.yml
+
+ Efficiency:
+ 94%
+ ```
+
+ Why it matters:
+
+ - It proves whether the injected context helped.
+ - It turns local telemetry into a readable artifact.
+ - It gives maintainers a quick way to debug agent behavior after the fact.
+ - It is easier to demo than raw JSON reports.
+
+ Likely inputs:
+
+ - `last-prompt-context.json`
+ - `last-report.json`
+ - `prompt-history.jsonl`
+ - `report-history.jsonl`
+ - `telemetry.jsonl`
+ - current git diff/status for touched files
+
+ Non-goals for the first version:
+
+ - Cloud sync
+ - Dashboard
+ - Cross-user analytics
+ - Long-term hosted memory
+
+ ## P3: Community Skill Packs
+
+ Do not build a full Hub first. Start with the local `community-skills/` folder that accepts PRs.
+
+ Initial packs:
+
+ ```text
+ community-skills/
+   eas/
+   vercel/
+   prisma/
+   redis/
+   oauth-google/
+   jwt-auth/
+ ```
+
+ The seed packs now live in [`community-skills/`](../community-skills/). Each pack contains:
+
+ ```text
+ SKILL.md
+ skill.yaml
+ ```
+
+ The Skill Router becomes more valuable when skill packs are ContextOS-ready instead of plain markdown folders.
+
+ ContextOS-ready skill packs should include:
+
+ ```yaml
+ id: oauth-google
+ name: Google OAuth
+ positive_triggers:
+   prompts: [oauth, google login, google sign in, callback]
+   files: [app/api/auth/*, auth.config.ts]
+   dependencies: [next-auth, "@auth/core"]
+ evidence:
+   files: [app/api/auth/*, auth.config.ts, .env.example]
+   dependencies: [next-auth, "@auth/core"]
+ negative_triggers:
+   prompts: [jwt only, password login]
+   dependencies: [jsonwebtoken]
+ workflow:
+   - Inspect auth provider config, callback URLs, scopes, secrets, and session creation.
+   - Verify frontend login entrypoints and backend callback routes agree.
+   - Patch the smallest auth boundary while preserving session conventions.
+   - Verify with focused auth tests, typecheck, or local callback flow.
+ ```
+
+ Possible future install flow:
+
+ ```bash
+ ctx skills install oauth-google
+ ```
+
+ or package-based:
+
+ ```bash
+ npm install skill-oauth-google
+ ctx sync --skills
+ ```
+
+ Why it matters:
+
+ - It creates a network effect around reusable agent capabilities.
+ - It gives skill authors a structured contract: triggers, evidence, negative gates, workflow.
+ - It lets ContextOS route capabilities by project evidence instead of popularity or keyword overlap.
+
+ Non-goals for the first version:
+
+ - Full marketplace UI
+ - Paid skill hosting
+ - Cloud account system
+ - Remote vector database
+
+ ## P4: ContextOS Ready
+
+ Certification can help the ecosystem self-organize without a hosted service.
+
+ ```text
+ ContextOS Ready
+ ```
+
+ Repository requirements:
+
+ ```text
+ AGENTS.md
+ skills/
+ workflows/
+ ```
+
+ Command:
+
+ ```bash
+ ctx doctor
+ ```
+
+ Target output:
+
+ ```text
+ Repository Score
+
+ Rules: 92
+ Skills: 88
+ Workflows: 84
+
+ Overall:
+ ContextOS Ready Gold
+ ```
+
+ Why it matters:
+
+ - It gives projects a concrete target.
+ - It creates a badge people can add to README files.
+ - It encourages community contributions without requiring a cloud product.
+
+ MVP scope:
+
+ - Local-only scoring.
+ - No hosted account.
+ - No external leaderboard dependency.
+ - Rules score from project `AGENTS.md`.
+ - Skills score from project skill packs with `SKILL.md` and `skill.yaml`.
+ - Workflows score from project workflow markdown with agent handoff chains.
+
+ ## P5: Auto Skill Extraction
+
+ Today, humans write `skill.yaml`. The research direction is to let ContextOS propose skill packs from repository evidence.
+
+ Possible command:
+
+ ```bash
+ ctx skill generate
+ ```
+
+ Input:
+
+ ```text
+ repo
+ ```
+
+ Output:
+
+ ```text
+ Detected Skill:
+ nestjs-module
+ ```
+
+ Target generated pack:
+
+ ```text
+ .codex/skills/nestjs-module/
+   SKILL.md
+   skill.yaml
+ ```
+
+ Research shape:
+
+ - Detect repeated project capabilities from dependencies, config files, route/controller names, tests, and recent git activity.
+ - Generate `positive_triggers`, `evidence`, `negative_triggers`, and `workflow`.
+ - Mark generated packs as drafts until reviewed.
+ - Let an agent or maintainer publish a cleaned-up pack into `community-skills/`.
+
+ Guardrails:
+
+ - Do not auto-publish generated skills.
+ - Do not infer high confidence from dependency names alone.
+ - Prefer explainable evidence over opaque model output.
+ - Keep generated workflows short and editable.
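One of the P5 guardrails above, not inferring high confidence from dependency names alone, can be sketched as a small rule over the kinds of evidence found. The `draftConfidence` function, the three evidence kinds, the confidence labels, and the example file paths are all hypothetical; the roadmap describes this as a research direction and gives no implementation.

```javascript
// Hypothetical sketch of one guardrail: dependency matches alone never yield "high".
function draftConfidence(evidence) {
  // Which kinds of evidence are present at all.
  const kinds = ["dependencies", "files", "tests"].filter(
    (kind) => (evidence[kind] || []).length > 0
  );
  if (kinds.length === 0) return "none";
  // Guardrail: a dependency name with no corroborating files or tests stays low.
  if (kinds.length === 1 && kinds[0] === "dependencies") return "low";
  return kinds.length >= 3 ? "high" : "medium";
}

const depsOnly = draftConfidence({ dependencies: ["@nestjs/core"] });
const corroborated = draftConfidence({
  dependencies: ["@nestjs/core"],
  files: ["src/users/users.module.ts"], // illustrative path
  tests: ["src/users/users.controller.spec.ts"], // illustrative path
});
console.log(depsOnly, corroborated);
```

Keeping the decision a function of named evidence kinds also lines up with the "explainable evidence" guardrail, since the inputs that produced a label can be shown next to it.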