@aiready/context-analyzer 0.5.1 → 0.6.0

@@ -1,6 +1,6 @@
1
1
 
2
2
  
3
- > @aiready/context-analyzer@0.5.1 build /Users/pengcao/projects/aiready/packages/context-analyzer
3
+ > @aiready/context-analyzer@0.6.0 build /Users/pengcao/projects/aiready/packages/context-analyzer
4
4
  > tsup src/index.ts src/cli.ts --format cjs,esm --dts
5
5
 
6
6
  CLI Building entry: src/cli.ts, src/index.ts
@@ -9,15 +9,15 @@
9
9
  CLI Target: es2020
10
10
  CJS Build start
11
11
  ESM Build start
12
- CJS dist/index.js 20.62 KB
13
- CJS dist/cli.js 39.27 KB
14
- CJS ⚡️ Build success in 48ms
12
+ CJS dist/cli.js 41.76 KB
13
+ CJS dist/index.js 23.11 KB
14
+ CJS ⚡️ Build success in 55ms
15
15
  ESM dist/index.mjs 164.00 B
16
+ ESM dist/chunk-DD7UVNE3.mjs 21.95 KB
16
17
  ESM dist/cli.mjs 18.45 KB
17
- ESM dist/chunk-NJUW6VED.mjs 19.48 KB
18
- ESM ⚡️ Build success in 48ms
18
+ ESM ⚡️ Build success in 55ms
19
19
  DTS Build start
20
- DTS ⚡️ Build success in 547ms
20
+ DTS ⚡️ Build success in 531ms
21
21
  DTS dist/cli.d.ts 20.00 B
22
22
  DTS dist/index.d.ts 2.44 KB
23
23
  DTS dist/cli.d.mts 20.00 B
@@ -1,36 +1,19 @@
1
1
 
2
2
  
3
- > @aiready/context-analyzer@0.5.1 test /Users/pengcao/projects/aiready/packages/context-analyzer
3
+ > @aiready/context-analyzer@0.6.0 test /Users/pengcao/projects/aiready/packages/context-analyzer
4
4
  > vitest run
5
5
 
6
6
 
7
7
   RUN  v2.1.9 /Users/pengcao/projects/aiready/packages/context-analyzer
8
8
 
9
- ✓ src/__tests__/analyzer.test.ts (13)
10
- ✓ buildDependencyGraph (1)
11
- ✓ should build a basic dependency graph
12
- ✓ calculateImportDepth (2)
13
- ✓ should calculate import depth correctly
14
- ✓ should handle circular dependencies gracefully
15
- ✓ getTransitiveDependencies (1)
16
- ✓ should get all transitive dependencies
17
- ✓ calculateContextBudget (1)
18
- ✓ should calculate total token cost including dependencies
19
- ✓ detectCircularDependencies (2)
20
- ✓ should detect circular dependencies
21
- ✓ should return empty for no circular dependencies
22
- ✓ calculateCohesion (3)
23
- ✓ should return 1 for single export
24
- ✓ should return high cohesion for related exports
25
- ✓ should return low cohesion for mixed exports
26
- ✓ calculateFragmentation (3)
27
- ✓ should return 0 for single file
28
- ✓ should return 0 for files in same directory
29
- ✓ should return high fragmentation for scattered files
9
+ [?25l · src/__tests__/analyzer.test.ts (14)
10
+ ✓ src/__tests__/enhanced-cohesion.test.ts (6)
11
+  ✓ src/__tests__/analyzer.test.ts (14)
12
+ ✓ src/__tests__/enhanced-cohesion.test.ts (6)
30
13
 
31
-  Test Files  1 passed (1)
32
-  Tests  13 passed (13)
33
-  Start at  08:00:54
34
-  Duration  315ms (transform 75ms, setup 0ms, collect 83ms, tests 3ms, environment 0ms, prepare 44ms)
14
+  Test Files  2 passed (2)
15
+  Tests  20 passed (20)
16
+  Start at  12:49:14
17
+  Duration  504ms (transform 124ms, setup 0ms, collect 482ms, tests 16ms, environment 0ms, prepare 73ms)
35
18
 
36
- [?25h
19
+ [?25h[?25h
@@ -0,0 +1,202 @@
1
+ # Cohesion Measurement Improvements
2
+
3
+ ## Overview
4
+
5
+ This document describes the improvements made to cohesion calculation in @aiready/context-analyzer, implementing the medium-term enhancements from the cohesion roadmap.
6
+
7
+ ## Problem Statement
8
+
9
+ The original cohesion calculation relied solely on domain inference from export names:
10
+ - **Domain inference issues**: Used simple keyword matching (e.g., "getUserOrder" → "user" instead of "order")
11
+ - **False positives**: Files with mixed domain names but shared functionality scored poorly
12
+ - **Test file penalties**: Test utilities with multiple domain mocks incorrectly flagged as low cohesion
13
+ - **Entropy sensitivity**: Single misclassified export could drastically lower the score
14
+
15
+ ### Real-World Example
16
+
17
+ `useReceiptFilters.ts`:
18
+ ```typescript
19
+ export function useReceiptFilters() { /* uses React hooks */ }
20
+ export function useOrderFilters() { /* uses same React hooks */ }
21
+ export function useCustomerFilters() { /* uses same React hooks */ }
22
+ ```
23
+
24
+ **Old calculation**:
25
+ - Domains: receipt, order, customer (3 different)
26
+ - Entropy: High (low cohesion score ~0.0)
27
+ - **False positive**: Flagged as low cohesion, yet the file is actually cohesive (all exports use the same React patterns)
28
+
29
+ ## Solution: Import-Based Cohesion
30
+
31
+ ### Key Improvements
32
+
33
+ 1. **AST-Based Export Extraction** (`@aiready/core`)
34
+ - Uses `@typescript-eslint/typescript-estree` to parse TypeScript/JavaScript files
35
+ - Extracts exports with their actual import dependencies
36
+ - Tracks which imports each export uses via AST traversal
37
+
38
+ 2. **Jaccard Similarity for Imports**
39
+ - Calculates similarity between exports based on shared imports
40
+ - Exports using the same libraries/modules are considered related
41
+ - More objective than name-based domain inference
42
+
43
+ 3. **Weighted Cohesion Score**
44
+ ```
45
+ Enhanced Cohesion = (0.6 × Import Similarity) + (0.4 × Domain-Based)
46
+ ```
47
+ - **60% weight on import analysis**: More reliable indicator of code relationships
48
+ - **40% weight on domain inference**: Still considers naming conventions
49
+ - Combines objective (imports) with heuristic (names) signals
50
+
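The Jaccard and weighting steps listed above can be sketched as follows. This is a minimal illustration, not the package's actual code: the `ExportInfo` shape, the helper names `jaccard`, `importSimilarity`, and `enhancedCohesion`, the choice to average over all export pairs, and the treatment of two empty import sets as identical are all assumptions made for the example.

```typescript
// Hypothetical shape: each export lists the imported names it references.
interface ExportInfo {
  name: string;
  imports?: string[];
}

// Jaccard index |A ∩ B| / |A ∪ B|.
// Assumption: two empty sets count as identical (similarity 1).
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((item) => {
    if (b.has(item)) intersection++;
  });
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Mean pairwise Jaccard similarity over all export pairs (assumed aggregation).
function importSimilarity(exports: ExportInfo[]): number {
  if (exports.length < 2) return 1;
  const sets = exports.map((e) => new Set(e.imports ?? []));
  let total = 0;
  let pairs = 0;
  for (let i = 0; i < sets.length; i++) {
    for (let j = i + 1; j < sets.length; j++) {
      total += jaccard(sets[i], sets[j]);
      pairs++;
    }
  }
  return total / pairs;
}

// Weighted combination from the formula above: 60% imports, 40% domain.
function enhancedCohesion(importSim: number, domainCohesion: number): number {
  return 0.6 * importSim + 0.4 * domainCohesion;
}

// The three hooks from the real-world example all use the same React imports.
const hooks: ExportInfo[] = [
  { name: "useReceiptFilters", imports: ["useState", "useEffect"] },
  { name: "useOrderFilters", imports: ["useState", "useEffect"] },
  { name: "useCustomerFilters", imports: ["useState", "useEffect"] },
];

// Import similarity is 1; with a domain score of 0 the result is 0.6.
console.log(enhancedCohesion(importSimilarity(hooks), 0)); // 0.6
```

Averaging over pairs keeps the score in the range 0 to 1 regardless of how many exports a file has, which makes it directly comparable with the domain-based score it is blended with.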
51
+ ### Implementation
52
+
53
+ ```typescript
54
+ // Enhanced cohesion calculation
55
+ export function calculateEnhancedCohesion(
56
+   exports: ExportInfo[],
57
+   filePath?: string
58
+ ): number {
59
+   // Special cases
60
+   if (exports.length === 0) return 1;
61
+   if (exports.length === 1) return 1;
62
+   if (filePath && isTestFile(filePath)) return 1;
63
+
64
+   // Calculate domain-based cohesion (entropy method)
65
+   const domainCohesion = calculateDomainCohesion(exports);
66
+
67
+   // Calculate import-based cohesion if import data available
68
+   const hasImportData = exports.some(e => e.imports && e.imports.length > 0);
69
+
70
+   if (!hasImportData) {
71
+     return domainCohesion; // Fallback to domain-based only
72
+   }
73
+
74
+   const importCohesion = calculateImportBasedCohesion(exports);
75
+
76
+   // Weighted combination: 60% import, 40% domain
77
+   return importCohesion * 0.6 + domainCohesion * 0.4;
78
+ }
79
+ ```
80
+
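`calculateDomainCohesion` is called in the snippet above but its body is not part of this diff. One plausible reading of the "entropy method" is sketched here, purely as an assumption: cohesion is taken as 1 minus the Shannon entropy of the domain labels, normalized by the maximum entropy for the number of distinct domains. The function name `domainCohesion` and its input (one already-inferred domain label per export) are hypothetical.

```typescript
// Sketch of an entropy-based cohesion score (assumed formula, see lead-in).
// Input: one inferred domain label per export, e.g. ["receipt", "order"].
function domainCohesion(domains: string[]): number {
  if (domains.length <= 1) return 1;

  // Count how many exports fall into each domain.
  const counts = new Map<string, number>();
  for (const d of domains) {
    counts.set(d, (counts.get(d) ?? 0) + 1);
  }
  if (counts.size === 1) return 1; // a single domain is fully cohesive

  // Shannon entropy of the domain distribution, in bits.
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / domains.length;
    entropy -= p * Math.log2(p);
  });

  // Normalize by the maximum entropy for this many distinct domains.
  const maxEntropy = Math.log2(counts.size);
  return 1 - entropy / maxEntropy;
}

// Three exports in three different domains: maximal entropy, score near 0.
console.log(domainCohesion(["receipt", "order", "customer"]));

// One stray label among four exports already drops the score well below 1.
console.log(domainCohesion(["user", "user", "user", "order"]));
```

Under this formula a single misclassified export out of four lowers the score to roughly 0.19, which illustrates the entropy sensitivity described in the problem statement.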
81
+ ### AST Parsing Implementation
82
+
83
+ ```typescript
84
+ // In @aiready/core/src/utils/ast-parser.ts
85
+ export function parseFileExports(code: string, filePath: string): {
86
+   exports: ExportWithImports[];
87
+   imports: FileImport[];
88
+ } {
89
+   const ast = parse(code, {
90
+     loc: true,
91
+     range: true,
92
+     ecmaVersion: 'latest',
93
+     sourceType: 'module',
94
+   });
95
+
96
+   const imports = extractFileImports(ast);
97
+   const exports = extractExportsWithDependencies(ast, imports);
98
+
99
+   return { exports, imports };
100
+ }
101
+
102
+ // Track which imports each export uses
103
+ function findUsedImports(
104
+   node: TSESTree.Node,
105
+   availableImports: Map<string, FileImport>
106
+ ): string[] {
107
+   const usedImports = new Set<string>();
108
+
109
+   // Recursively visit AST to find identifier references
110
+   visit(node, {
111
+     Identifier(n) {
112
+       if (availableImports.has(n.name)) {
113
+         usedImports.add(n.name);
114
+       }
115
+     },
116
+   });
117
+
118
+   return Array.from(usedImports);
119
+ }
120
+ ```
121
+
122
+ ## Results
123
+
124
+ ### Before (v0.5.3)
125
+ ```
126
+ useReceiptFilters.ts
127
+ - Domains: receipt, order, customer
128
+ - Cohesion: 0.0 (entropy-based)
129
+ - Classification: ❌ Low cohesion (false positive: the file is actually cohesive)
130
+ ```
131
+
132
+ ### After (Current)
133
+ ```
134
+ useReceiptFilters.ts
135
+ - Domains: receipt, order, customer (entropy = high, domain cohesion ≈ 0.0)
136
+ - Imports: react (useState, useEffect), shared across all exports
137
+ - Import Similarity: 1.0 (Jaccard index)
138
+ - Enhanced Cohesion: 0.6 × 1.0 + 0.4 × 0.0 = 0.6
139
+ - Classification: ✅ Moderate cohesion (correct)
140
+ ```
141
+
142
+ ## Graceful Fallback
143
+
144
+ The system gracefully handles files without import data:
145
+ - **AST parsing failures**: Falls back to domain-based calculation only
146
+ - **No imports detected**: Uses domain inference (legacy behavior)
147
+ - **Mixed data**: Uses the import-based score when at least one export has import info (the `exports.some(...)` check in the implementation above)
148
+
149
+ ## Testing
150
+
151
+ Comprehensive test suite with 6 new tests:
152
+ - ✅ Domain-based fallback when no imports available
153
+ - ✅ Import-based scoring with shared dependencies
154
+ - ✅ Weighted combination (import > domain priority)
155
+ - ✅ Handles mixed case (some exports with/without imports)
156
+ - ✅ Single export edge case
157
+ - ✅ Test file special casing
158
+
159
+ **Test Results**: 20/20 tests passing (14 existing + 6 new)
160
+
161
+ ## Dependencies Added
162
+
163
+ In `@aiready/core`:
164
+ ```json
165
+ {
166
+   "dependencies": {
167
+     "typescript": "^5.9.3",
168
+     "@typescript-eslint/parser": "^8.53.0",
169
+     "@typescript-eslint/typescript-estree": "^8.53.0"
170
+   }
171
+ }
172
+ ```
173
+
174
+ ## Next Steps (Future Enhancements)
175
+
176
+ 1. **Co-usage tracking**: Analyze which exports are imported together across the codebase
177
+ 2. **Function call analysis**: Track which functions call which (deeper dependency analysis)
178
+ 3. **Type usage patterns**: Detect exports sharing the same type definitions
179
+ 4. **Shared constants**: Identify exports using the same configuration/constants
180
+
181
+ ## Breaking Changes
182
+
183
+ None. The API remains backward compatible:
184
+ - `calculateCohesion(exports, filePath?)` signature unchanged
185
+ - Falls back to domain-based calculation when import data unavailable
186
+ - All existing tests continue to pass
187
+
188
+ ## Performance
189
+
190
+ - **AST parsing**: ~50-100ms per file (cached by file content hash)
191
+ - **Fallback available**: Regex-based extraction if AST parsing fails
192
+ - **Minimal overhead**: Only parses files being analyzed (not entire codebase)
193
+
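The "cached by file content hash" note above could be realized along these lines. This is a hedged sketch only: the `ParseResult` type, the `parseWithCache` wrapper, the in-memory `Map`, and the choice of SHA-256 are assumptions for illustration, and the stand-in `fakeParse` replaces the real AST parser. The only library call used is Node's built-in `createHash`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result type; the real parser returns richer export/import data.
interface ParseResult {
  exportNames: string[];
}

// In-memory cache keyed by the SHA-256 digest of the file content (assumed scheme).
const parseCache = new Map<string, ParseResult>();

function contentHash(code: string): string {
  return createHash("sha256").update(code).digest("hex");
}

// Wraps any parse function so that identical content is parsed only once.
function parseWithCache(
  code: string,
  parse: (code: string) => ParseResult
): ParseResult {
  const key = contentHash(code);
  const cached = parseCache.get(key);
  if (cached !== undefined) return cached;

  const result = parse(code);
  parseCache.set(key, result);
  return result;
}

// Stand-in parser that counts how often it is actually invoked.
let parseCalls = 0;
const fakeParse = (code: string): ParseResult => {
  parseCalls++;
  return { exportNames: code.includes("export") ? ["example"] : [] };
};

parseWithCache("export const a = 1;", fakeParse);
parseWithCache("export const a = 1;", fakeParse); // served from the cache
console.log(parseCalls); // 1
```

Keying on content rather than on path means a renamed or copied file still hits the cache, while any edit to the file produces a new key and forces a fresh parse.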
194
+ ## Conclusion
195
+
196
+ The enhanced cohesion calculation provides:
197
+ - **More accurate** classification of file cohesion
198
+ - **Fewer false positives** from domain name mismatches
199
+ - **Objective metrics** based on actual code dependencies
200
+ - **Backward compatibility** with graceful fallback
201
+
202
+ This represents a significant step toward AI-ready codebases by providing more reliable metrics for code organization and refactoring opportunities.