cc-proficiency 0.2.10 → 0.3.1

Files changed (81)
  1. package/README.md +79 -3
  2. package/dist/cli/commands/ai-grade.d.ts +2 -0
  3. package/dist/cli/commands/ai-grade.d.ts.map +1 -0
  4. package/dist/cli/commands/ai-grade.js +139 -0
  5. package/dist/cli/commands/ai-grade.js.map +1 -0
  6. package/dist/cli/commands/config.d.ts.map +1 -1
  7. package/dist/cli/commands/config.js +2 -0
  8. package/dist/cli/commands/config.js.map +1 -1
  9. package/dist/cli/commands/explain.d.ts.map +1 -1
  10. package/dist/cli/commands/explain.js +18 -0
  11. package/dist/cli/commands/explain.js.map +1 -1
  12. package/dist/cli/commands/process.d.ts.map +1 -1
  13. package/dist/cli/commands/process.js +30 -0
  14. package/dist/cli/commands/process.js.map +1 -1
  15. package/dist/cli/index.js +6 -0
  16. package/dist/cli/index.js.map +1 -1
  17. package/dist/cli/services/publishing.d.ts +1 -0
  18. package/dist/cli/services/publishing.d.ts.map +1 -1
  19. package/dist/cli/services/publishing.js +43 -13
  20. package/dist/cli/services/publishing.js.map +1 -1
  21. package/dist/i18n/locales/en.d.ts.map +1 -1
  22. package/dist/i18n/locales/en.js +28 -1
  23. package/dist/i18n/locales/en.js.map +1 -1
  24. package/dist/i18n/locales/es.d.ts.map +1 -1
  25. package/dist/i18n/locales/es.js +28 -1
  26. package/dist/i18n/locales/es.js.map +1 -1
  27. package/dist/i18n/locales/fr.d.ts.map +1 -1
  28. package/dist/i18n/locales/fr.js +27 -0
  29. package/dist/i18n/locales/fr.js.map +1 -1
  30. package/dist/i18n/locales/ja.d.ts.map +1 -1
  31. package/dist/i18n/locales/ja.js +27 -0
  32. package/dist/i18n/locales/ja.js.map +1 -1
  33. package/dist/i18n/locales/ko.d.ts.map +1 -1
  34. package/dist/i18n/locales/ko.js +27 -0
  35. package/dist/i18n/locales/ko.js.map +1 -1
  36. package/dist/i18n/locales/zh-CN.d.ts.map +1 -1
  37. package/dist/i18n/locales/zh-CN.js +27 -0
  38. package/dist/i18n/locales/zh-CN.js.map +1 -1
  39. package/dist/i18n/types.d.ts +21 -0
  40. package/dist/i18n/types.d.ts.map +1 -1
  41. package/dist/renderer/ai-animated-svg.d.ts +11 -0
  42. package/dist/renderer/ai-animated-svg.d.ts.map +1 -0
  43. package/dist/renderer/ai-animated-svg.js +206 -0
  44. package/dist/renderer/ai-animated-svg.js.map +1 -0
  45. package/dist/scoring/ai-evidence.d.ts +108 -0
  46. package/dist/scoring/ai-evidence.d.ts.map +1 -0
  47. package/dist/scoring/ai-evidence.js +305 -0
  48. package/dist/scoring/ai-evidence.js.map +1 -0
  49. package/dist/scoring/ai-grader.d.ts +64 -0
  50. package/dist/scoring/ai-grader.d.ts.map +1 -0
  51. package/dist/scoring/ai-grader.js +229 -0
  52. package/dist/scoring/ai-grader.js.map +1 -0
  53. package/dist/scoring/rubric-loader.d.ts +4 -0
  54. package/dist/scoring/rubric-loader.d.ts.map +1 -0
  55. package/dist/scoring/rubric-loader.js +102 -0
  56. package/dist/scoring/rubric-loader.js.map +1 -0
  57. package/dist/store/achievements.d.ts.map +1 -1
  58. package/dist/store/achievements.js +25 -0
  59. package/dist/store/achievements.js.map +1 -1
  60. package/dist/store/local-store.d.ts +2 -0
  61. package/dist/store/local-store.d.ts.map +1 -1
  62. package/dist/store/local-store.js +11 -0
  63. package/dist/store/local-store.js.map +1 -1
  64. package/dist/store/queue.d.ts +8 -0
  65. package/dist/store/queue.d.ts.map +1 -1
  66. package/dist/store/queue.js +44 -0
  67. package/dist/store/queue.js.map +1 -1
  68. package/dist/store/remote-store.d.ts.map +1 -1
  69. package/dist/store/remote-store.js +3 -0
  70. package/dist/store/remote-store.js.map +1 -1
  71. package/dist/types.d.ts +51 -0
  72. package/dist/types.d.ts.map +1 -1
  73. package/docs/ai-grading/README.md +39 -0
  74. package/docs/ai-grading/anti-gaming.md +11 -0
  75. package/docs/ai-grading/collaboration-quality.md +25 -0
  76. package/docs/ai-grading/goal-achievement.md +25 -0
  77. package/docs/ai-grading/growth-learning.md +25 -0
  78. package/docs/ai-grading/system-prompt-header.md +32 -0
  79. package/docs/ai-grading/verification-quality.md +25 -0
  80. package/docs/ai-grading/workflow-mastery.md +25 -0
  81. package/package.json +2 -1
package/docs/ai-grading/goal-achievement.md ADDED
@@ -0,0 +1,25 @@
+ ---
+ id: goal-achievement
+ label: Goal Achievement
+ description: How effectively does the user define and accomplish objectives?
+ ---
+
+ ## Q1: Goal clarity & specificity
+ **Anchors:** 1=vague single-word goals, 2=basic intent visible, 3=clear goals with context, 4=well-defined with constraints, 5=expert-level framing with success criteria
+ **Evidence:** Session-meta `first_prompt` word count and structure (available on all sessions). When FACET_AVAILABLE: also use facets `underlying_goal` text quality for richer assessment.
+
+ ## Q2: Achievement rate
+ **Anchors:** 1=<20% achieved, 2=20-40%, 3=40-60%, 4=60-80%, 5=>80%
+ **Evidence:** Facets `outcome` distribution when available (precomputed achieved_rate). Disclose coverage %. When facets unavailable or coverage <20%, score 3 (insufficient signal).
+
+ ## Q3: Session purposefulness
+ **Anchors:** 1=mostly warmups/idle, 2=many warmups or abandoned, 3=mixed productive and idle, 4=mostly productive, 5=consistently productive sessions
+ **Evidence:** Session-meta warmup ratio from `first_prompt` pattern detection (e.g., "Respond with OK"). Tool activity ratio: sessions with >0 tool calls / total sessions.
+
+ ## Q4: Project engagement depth
+ **Anchors:** 1=single project one-off, 2=few projects shallow, 3=some sustained engagement, 4=multi-session projects, 5=deep multi-session projects with sustained effort
+ **Evidence:** Session-meta `project_path`: sessions per unique project distribution. Look for projects with repeated engagement over time rather than breadth alone.
+
+ ## Q5: Task completion signals
+ **Anchors:** 1=mostly abandoned (low tool activity), 2=many sessions end early, 3=mixed completion evidence, 4=most sessions show substantive work, 5=consistently substantive tool activity indicating completion
+ **Evidence:** Session-meta: sessions with >5 tool calls as proxy for substantive work. Proportion of non-warmup sessions with meaningful tool activity. Do NOT penalize sparse `lines_added` — use tool call volume as primary proxy.
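For readers who want to see the coverage gate concretely, the Q2 anchors of the goal-achievement rubric can be read as a simple banding function. The following is a minimal sketch, not code from the package: `FacetSummary`, `scoreAchievementRate`, and both field names are hypothetical, and the handling of exact band edges (for example, exactly 80%) is an assumption, since the rubric leaves the boundaries open.

```typescript
// Hypothetical helper for illustration; none of these names come from cc-proficiency.
type Score = 1 | 2 | 3 | 4 | 5;

interface FacetSummary {
  achievedRate: number; // fraction of facet sessions with an achieved outcome, 0..1
  coverage: number;     // fraction of all sessions that have facets, 0..1
}

function scoreAchievementRate(facets: FacetSummary | null): Score {
  // Coverage gate: no facets, or coverage under 20%, means insufficient signal.
  if (facets === null || facets.coverage < 0.2) return 3;

  const rate = facets.achievedRate;
  if (rate < 0.2) return 1;
  if (rate < 0.4) return 2;
  if (rate < 0.6) return 3;
  if (rate <= 0.8) return 4; // assumption: exactly 80% stays in the 60-80% band
  return 5;
}
```

Under these assumptions, a user with a 90% achieved rate but only 10% facet coverage still receives the neutral score of 3, which is the behavior the rubric asks for.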
package/docs/ai-grading/growth-learning.md ADDED
@@ -0,0 +1,25 @@
+ ---
+ id: growth-learning
+ label: Growth & Learning
+ description: Is the user improving over time? Requires sufficient data for meaningful trends.
+ ---
+
+ ## Q1: Friction trajectory
+ **Anchors:** 1=increasing friction over time, 2=slightly increasing, 3=flat or insufficient data, 4=slightly declining, 5=clearly declining friction
+ **Evidence:** Facets `friction_counts` first-half vs second-half comparison when available (requires >=20 facets). When FACET_AVAILABLE and SUFFICIENT_FOR_TRENDS: use facet trend. Otherwise score 3 (insufficient for trend analysis).
+
+ ## Q2: Tool adoption curve
+ **Anchors:** 1=static tool usage throughout, 2=minimal new tools, 3=gradual adoption of new tools, 4=steady new tool adoption, 5=consistent new tool types appearing in later sessions
+ **Evidence:** Session-meta: new `tool_counts` key types appearing in chronologically later sessions vs earlier sessions. Compare tool vocabulary in first-half vs second-half of session history.
+
+ ## Q3: Feature adoption progression
+ **Anchors:** 1=basic features only, 2=mostly basic, 3=some intermediate features, 4=intermediate plus some advanced, 5=advanced features adopted (agents, worktrees, MCP, task management, skills)
+ **Evidence:** Rule-engine FeatureInventory: presence of advanced-tier features. Classify features into basic (Read, Edit, Bash), intermediate (Grep, Glob, hooks), advanced (agents, worktrees, MCP, task mgmt, skills).
+
+ ## Q4: Capability breadth expansion
+ **Anchors:** 1=narrow and unchanging, 2=minimal expansion, 3=some variety growth, 4=steady breadth increase, 5=broad and growing capability set
+ **Evidence:** Session-meta: unique `tool_counts` keys across time windows. Compare distinct tool types used in early sessions vs recent sessions. Growth in unique tool types indicates expanding capability.
+
+ ## Q5: Resilience development
+ **Anchors:** 1=same failures repeat consistently, 2=slow improvement, 3=mixed or insufficient data, 4=good learning from errors, 5=learns from errors — decreasing error rate over time
+ **Evidence:** Session-meta: `tool_errors` rate trend over chronological session windows (declining = learning). When FACET_AVAILABLE: also use facets `friction_detail` to judge if same friction types recur. Without facets, use error rate trend alone.
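The first-half vs second-half comparison of tool vocabulary that the growth-learning rubric describes for Q2 and Q4 could be computed roughly as below. This is a sketch under assumptions: the `SessionMeta` shape, the `timestamp` field name (taken to be an ISO 8601 string), and the `newToolsInSecondHalf` function are hypothetical. Only `tool_counts` is named in the rubric itself.

```typescript
// Hypothetical shape; only `tool_counts` is named in the rubric.
interface SessionMeta {
  timestamp: string; // assumed ISO 8601 string
  tool_counts: Record<string, number>;
}

// Collect every tool that was called at least once in the given sessions.
function toolVocabulary(sessions: SessionMeta[]): Set<string> {
  const seen = new Set<string>();
  sessions.forEach((s) => {
    Object.keys(s.tool_counts).forEach((tool) => {
      if ((s.tool_counts[tool] ?? 0) > 0) seen.add(tool);
    });
  });
  return seen;
}

// Tools that appear in the chronologically later half but not the earlier half.
function newToolsInSecondHalf(sessions: SessionMeta[]): string[] {
  const ordered = sessions
    .slice()
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  const mid = Math.floor(ordered.length / 2);
  const early = toolVocabulary(ordered.slice(0, mid));
  const late = toolVocabulary(ordered.slice(mid));
  return Array.from(late).filter((tool) => !early.has(tool)).sort();
}
```

Splitting at the midpoint of the session count rather than the midpoint of the time range is a design choice of this sketch. The rubric says only "first-half vs second-half of session history", so either reading seems defensible.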
package/docs/ai-grading/system-prompt-header.md ADDED
@@ -0,0 +1,32 @@
+ You evaluate Claude Code user proficiency from usage data.
+ Score each criterion 1-5. Use the numeric anchors provided.
+
+ DATA SOURCES (tiered):
+ - Tier 1 — SESSION-META (primary, quantitative): Available for most sessions. Includes tool_counts, tokens, first_prompt, project_path, timestamps, message counts, tool_errors. This is your primary evidence base.
+ - Tier 2 — FACETS (optional, qualitative): Available for a subset of sessions only. Coverage percentage is disclosed. Includes outcome, satisfaction, friction, session_type, helpfulness. Use when available to enrich scoring.
+ - Tier 3 — RULE-ENGINE (aggregate features): Computed across all sessions. Includes domain scores, feature inventory, config maturity. Use for adoption and setup criteria.
+
+ COVERAGE FLAGS:
+ - FACET_AVAILABLE: true when facet data exists for this user. When false, score facet-dependent criteria as 3 (neutral) — do not guess.
+ - SUFFICIENT_FOR_TRENDS: true when enough temporal data exists for trend analysis. When false, score trend-dependent criteria as 3 (neutral).
+
+ SCORING RULES:
+ - Every criterion CAN be scored without facets using session-meta or rule-engine evidence.
+ - When facets ARE available, they enrich the score — use them as qualitative supplement.
+ - When data is marked "insufficient" or a coverage flag is false, score 3 (neutral).
+ - Do NOT penalize users for sparse optional fields (lines_added, duration_minutes, git_commits). Absence of data is not evidence of absence of work.
+
+ Return ONLY the criteria scores and brief evidence strings.
+ Do NOT compute totals or levels — those are calculated after.
+
+ Output JSON:
+ {
+ "domains": [
+ {
+ "id": "<domain-id>",
+ "criteria": [
+ { "q": "Q1", "score": 3, "evidence": "brief justification" }
+ ]
+ }
+ ]
+ }
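The "Output JSON" shape at the end of the system prompt can be modeled and checked on the consuming side. The sketch below is one way to do that, assuming the response arrives as a raw JSON string. The type names and `parseGraderOutput` are hypothetical and are not taken from the package's `ai-grader` module.

```typescript
// Hypothetical types mirroring the "Output JSON" shape in the prompt.
interface CriterionScore {
  q: string;
  score: number;
  evidence: string;
}

interface DomainScores {
  id: string;
  criteria: CriterionScore[];
}

interface GraderOutput {
  domains: DomainScores[];
}

// Parse a raw model response and reject anything outside the documented shape.
function parseGraderOutput(raw: string): GraderOutput {
  const data = JSON.parse(raw) as GraderOutput | null;
  if (typeof data !== "object" || data === null || !Array.isArray(data.domains)) {
    throw new Error("expected an object with a domains array");
  }
  data.domains.forEach((domain) => {
    if (typeof domain !== "object" || domain === null ||
        typeof domain.id !== "string" || !Array.isArray(domain.criteria)) {
      throw new Error("each domain needs a string id and a criteria array");
    }
    domain.criteria.forEach((c) => {
      // The prompt asks for scores on a 1-5 scale; this sketch requires integers.
      const validScore = Number.isInteger(c.score) && c.score >= 1 && c.score <= 5;
      if (typeof c.q !== "string" || typeof c.evidence !== "string" || !validScore) {
        throw new Error("invalid criterion in domain " + domain.id);
      }
    });
  });
  return data;
}
```

Totals and levels are deliberately absent here, matching the prompt's instruction that they are calculated after the model returns its per-criterion scores.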
package/docs/ai-grading/verification-quality.md ADDED
@@ -0,0 +1,25 @@
+ ---
+ id: verification-quality
+ label: Verification & Quality
+ description: Does the user produce reliable, validated results?
+ ---
+
+ ## Q1: Outcome reliability
+ **Anchors:** 1=<30% achieved, 2=30-50%, 3=50-70%, 4=70-85%, 5=>85%
+ **Evidence:** Facets `outcome` success rate when available (precomputed). Disclose coverage %. When facets unavailable or coverage <20%, score 3 (insufficient signal — this criterion is coverage-gated).
+
+ ## Q2: Error handling quality
+ **Anchors:** 1=ignores errors entirely, 2=fixes occasionally, 3=fixes some errors, 4=systematic error handling, 5=low error rate with systematic recovery
+ **Evidence:** Session-meta: `tool_errors` / total tool calls ratio across sessions. Lower ratio = better error management. Also consider `tool_error_categories` for pattern analysis — diverse error types with low rates suggest sophisticated usage.
+
+ ## Q3: Investigation thoroughness
+ **Anchors:** 1=trial-and-error only, 2=minimal investigation, 3=some investigation before action, 4=regular investigation patterns, 5=systematic investigation with LSP, read-before-edit, and investigation chains
+ **Evidence:** Rule-engine aggregate signals: `tool-investigation-chain` fire count, LSP usage indicators, `tool-read-before-edit` patterns. Higher aggregate scores indicate thorough verification habits.
+
+ ## Q4: Iterative refinement discipline
+ **Anchors:** 1=one-shot only (no iteration), 2=rare iteration, 3=some iteration, 4=regular iterative patterns, 5=systematic iteration with multiple edit cycles
+ **Evidence:** Session-meta: sessions with multiple Edit tool calls indicating iterative refinement. When FACET_AVAILABLE: also use facets `session_type=iterative_refinement` rate for explicit classification.
+
+ ## Q5: Course-correction capability
+ **Anchors:** 1=never corrects course after errors, 2=eventual correction, 3=moderate recovery, 4=quick recovery, 5=rapid recovery — sessions with errors continue to substantial tool activity
+ **Evidence:** Session-meta: sessions with `tool_errors` > 0 that continue to substantial tool activity afterward (>5 tool calls after first error). Higher continuation ratio = better course-correction ability.
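The Q2 evidence in the verification-quality rubric is a ratio of `tool_errors` to total tool calls. A minimal sketch of that ratio follows. It assumes `tool_errors` is a per-session count and that the ratio is pooled across all sessions rather than averaged per session, since the rubric does not say which. `SessionErrors` and `pooledErrorRatio` are hypothetical names.

```typescript
// Hypothetical shape; `tool_counts` and `tool_errors` are the rubric's field names,
// but treating `tool_errors` as a plain per-session count is an assumption.
interface SessionErrors {
  tool_counts: Record<string, number>;
  tool_errors: number;
}

// Pooled ratio of errors to tool calls; null when there are no tool calls at all.
function pooledErrorRatio(sessions: SessionErrors[]): number | null {
  let calls = 0;
  let errors = 0;
  sessions.forEach((s) => {
    Object.keys(s.tool_counts).forEach((tool) => {
      calls += s.tool_counts[tool] ?? 0;
    });
    errors += s.tool_errors;
  });
  return calls === 0 ? null : errors / calls;
}
```

Returning `null` instead of `0` for an empty history keeps "no data" distinct from "no errors", in line with the prompt's rule that absence of data is not evidence.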
package/docs/ai-grading/workflow-mastery.md ADDED
@@ -0,0 +1,25 @@
+ ---
+ id: workflow-mastery
+ label: Workflow Mastery
+ description: How sophisticated and efficient are the user's work patterns?
+ ---
+
+ ## Q1: Session strategy diversity
+ **Anchors:** 1=all sessions same type/pattern, 2=mostly one type, 3=2-3 distinct types, 4=strategic variety, 5=strategic type selection matched to goals
+ **Evidence:** Session-meta `tool_counts` patterns to derive session types (read-heavy, edit-heavy, bash-heavy, mixed). When FACET_AVAILABLE: use facets `session_type` distribution for explicit classification.
+
+ ## Q2: Investigation discipline
+ **Anchors:** 1=no read-before-edit behavior, 2=rarely investigates first, 3=occasional investigation, 4=regular read-then-edit pattern, 5=systematic investigation-first discipline
+ **Evidence:** Rule-engine aggregate signals: `tool-read-before-edit` and `tool-investigation-chain` fire counts. Higher fire rates indicate disciplined investigation before action.
+
+ ## Q3: Configuration maturity
+ **Anchors:** 1=bare default setup, 2=minimal customization, 3=some configuration present, 4=well-configured environment, 5=sophisticated setup with hooks, plugins, MCP, rules, memory
+ **Evidence:** Rule-engine ConfigSignals: presence and count of hooks, plugins, CLAUDE.md, MCP servers, custom rules, memory usage, agents, skills. More sophisticated setup = higher maturity.
+
+ ## Q4: Tool repertoire effectiveness
+ **Anchors:** 1=uses only 1-2 tools, 2=narrow tool range, 3=varied tool usage, 4=broad repertoire with appropriate selection, 5=right tool for each job with high entropy across sessions
+ **Evidence:** Session-meta `tool_counts` distribution entropy across sessions. Look for breadth of tool types used AND variation in tool mix between sessions (not same pattern every time).
+
+ ## Q5: Shipping behavior
+ **Anchors:** 1=no evidence of shipping, 2=rare evidence, 3=occasional evidence, 4=regular shipping signals, 5=consistent commit and push discipline
+ **Evidence:** Session-meta `git_commits` and `git_pushes` when present. This is a sparse signal — DO NOT penalize absence (score 3 when insufficient data). Only reward when evidence is present. One-directional criterion.
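The "distribution entropy" that the workflow-mastery rubric cites for Q4 is not defined in the rubric. One plausible reading is the Shannon entropy of a `tool_counts` map, sketched below. The function name `toolEntropyBits` and the choice of base-2 logarithm are assumptions of this example.

```typescript
// Shannon entropy (in bits) of a tool_counts map; hypothetical helper for illustration.
// 0 means a single tool dominates completely; higher values mean a more even mix.
function toolEntropyBits(toolCounts: Record<string, number>): number {
  const counts = Object.keys(toolCounts)
    .map((tool) => toolCounts[tool] ?? 0)
    .filter((n) => n > 0);
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total === 0) return 0;

  let entropy = 0;
  counts.forEach((n) => {
    const p = n / total;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}
```

For example, `toolEntropyBits({ Read: 5, Edit: 5, Bash: 5, Grep: 5 })` is 2 bits, the maximum for four tools, while a session that uses only `Read` has entropy 0. Comparing per-session values would address the rubric's second point, variation in tool mix between sessions.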
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "cc-proficiency",
- "version": "0.2.10",
+ "version": "0.3.1",
  "description": "Claude Code proficiency badge generator — analyze usage patterns across 5 domains aligned with Claude Certified Architect",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
@@ -24,6 +24,7 @@
  "hooks/",
  ".claude-plugin/",
  "skills/",
+ "docs/ai-grading/",
  "README.md"
  ],
  "keywords": [