@liendev/lien 0.13.0 → 0.15.1

@@ -1,459 +1,110 @@
  ---
+ description: MANDATORY code search rules - use Lien MCP tools instead of grep
+ globs: ["**/*"]
  alwaysApply: true
  ---
  
- # Lien MCP Integration Rules
+ # MANDATORY: Use Lien MCP for Code Search
  
- This project uses **Lien** - a local semantic code search MCP server. You MUST use Lien tools proactively to understand the codebase before making changes.
+ You have access to Lien semantic search tools. USE THEM INSTEAD OF grep/ripgrep/built-in search.
  
- ## Core Rules
+ ## Tool Selection (FOLLOW THIS)
  
- ### ALWAYS Use Lien When:
- 1. **Before reading any file** - Use `semantic_search` or `get_file_context` to understand what you're looking for
- 2. **User asks about code location** - Use `semantic_search` before grepping
- 3. **User asks "how does X work"** - Use `semantic_search` to find implementations
- 4. **Before making changes** - Use `get_file_context` to understand dependencies
- 5. **User asks for examples** - Use `find_similar` to locate patterns
- 6. **Exploring unfamiliar code** - Use `semantic_search` with broad queries first
+ | User wants... | Use this | NOT this |
+ |---------------|----------|----------|
+ | "Where is X implemented?" | `semantic_search` | grep |
+ | "How does X work?" | `semantic_search` | reading random files |
+ | "Find all Controllers" | `list_functions` | grep |
+ | Edit a file | `get_file_context` FIRST | direct edit |
+ | Find similar code | `find_similar` | manual search |
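
The selection table added in this hunk can be read as a plain lookup from request type to tool. A minimal TypeScript sketch, assuming hypothetical `RequestKind` labels and a `pickTool` helper that are not part of Lien; only the tool names come from the table:

```typescript
// Hypothetical request categories; only the tool names come from the table.
type RequestKind = "locate" | "explain" | "list_by_name" | "edit" | "find_similar";
type LienTool = "semantic_search" | "list_functions" | "get_file_context" | "find_similar";

const TOOL_FOR_REQUEST: Record<RequestKind, LienTool> = {
  locate: "semantic_search",      // "Where is X implemented?"
  explain: "semantic_search",     // "How does X work?"
  list_by_name: "list_functions", // "Find all Controllers"
  edit: "get_file_context",       // call FIRST, before editing a file
  find_similar: "find_similar",   // find similar code
};

// Returns the tool the table recommends for a given kind of request.
function pickTool(kind: RequestKind): LienTool {
  return TOOL_FOR_REQUEST[kind];
}

console.log(pickTool("list_by_name"));
```

Using a `Record` keyed by the request type means the compiler flags any request kind that has no tool assigned.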
  
- ### NEVER:
- 1. Skip Lien and go straight to reading files when you don't know the codebase
- 2. Use grep when the user is asking about functionality (use semantic search instead)
- 3. Make assumptions about code location without searching first
- 4. Edit files without getting context via `get_file_context`
+ ## Before ANY Code Change
  
- ---
-
- ## MCP Tools Reference
-
- ### `semantic_search` - PRIMARY TOOL
- **Use this FIRST for almost all code understanding tasks.**
-
- ```typescript
- semantic_search({
- query: "natural language description of what you're looking for",
- limit: 5 // increase to 10-15 for broad exploration
- })
- ```
-
- **Use for:**
- - "Where is X implemented?" → `semantic_search({ query: "X implementation" })`
- - "How does Y work?" → `semantic_search({ query: "Y functionality" })`
- - Finding patterns, features, utilities, handlers, validators, etc.
- - Understanding architecture before making changes
-
- **Query tips:**
- - Use full sentences describing what the code does
- - Focus on behavior: "handles user authentication", "validates email input"
- - Not exact names: search semantically, not syntactically
-
- ### `get_file_context`
- **Use BEFORE editing any file you haven't read yet.**
-
- ```typescript
- get_file_context({
- filepath: "relative/path/to/file.ts",
- includeRelated: true // default, gets related chunks
- })
- ```
-
- **MANDATORY for:**
- - Before making any file edits
- - Understanding file dependencies and relationships
- - Getting full context of what a file does
-
- **Pro tip:** Use after `semantic_search` identifies the right file
-
- ### `find_similar`
- **Use for finding patterns and ensuring consistency.**
-
- ```typescript
- find_similar({
- code: "function example() { ... }",
- limit: 5
- })
- ```
-
- **Use for:**
- - Refactoring: find all similar implementations
- - Consistency: ensure new code matches existing patterns
- - Duplication detection
-
- ### `list_functions` ⚡ NEW in v0.5.0
- **Fast symbol-based search for functions, classes, and interfaces by name.**
-
- ```typescript
- list_functions({
- pattern: ".*Controller.*", // regex to match symbol names
- language: "php" // optional language filter
- })
- ```
-
- **How it works:**
- - Extracts and indexes function/class/interface names during indexing
- - Direct symbol name matching (not semantic search)
- - **10x faster** than semantic search for finding specific symbols
- - Automatic fallback for old indices
-
- **Use for:**
- - Finding all classes matching a pattern (e.g., `.*Controller.*`, `.*Service$`)
- - Getting structural overview of functions/classes
- - Discovering API endpoints, handlers, or utilities by name pattern
- - Understanding code organization and naming conventions
-
- **Best practices:**
- - Use regex patterns that match naming conventions: `.*Controller.*`, `handle.*`, `get.*`
- - Combine with language filter for large codebases: `language: "typescript"`
- - For best results: run `lien reindex` after upgrading to v0.5.0
-
- **When to use `list_functions` vs `semantic_search`:**
- - ✅ Use `list_functions` when you know the naming pattern (e.g., "all Controllers")
- - ✅ Use `semantic_search` when searching by functionality (e.g., "handles authentication")
-
- **Note:** Test files are indexed alongside source code and will naturally appear in semantic search results when relevant.
-
- ---
-
- ## Enhanced Metadata (AST-Based) ⚡ NEW in v0.13.0
-
- Lien now uses **Abstract Syntax Tree (AST) parsing** for TypeScript/JavaScript files to provide rich code metadata:
-
- ### Metadata Fields in Search Results
-
- All search results (`semantic_search`, `get_file_context`, `find_similar`, `list_functions`) now include enhanced metadata when available:
-
- ```typescript
- {
- content: "function validateEmail(email: string): boolean { ... }",
- metadata: {
- file: "src/validators.ts",
- startLine: 45,
- endLine: 60,
- type: "function", // 'function' | 'class' | 'block'
- language: "typescript",
-
- // AST-derived metadata (NEW in v0.13.0):
- symbolName: "validateEmail", // Function/class name
- symbolType: "function", // 'function' | 'method' | 'class' | 'interface'
- parentClass: undefined, // For methods: parent class name
- complexity: 3, // Cyclomatic complexity
- parameters: ["email: string"], // Function parameters
- signature: "function validateEmail(email: string): boolean", // Full signature
- imports: ["@/utils/regex"] // File imports (for context)
- },
- score: 0.85,
- relevance: "highly_relevant"
- }
- ```
-
- ### AST Metadata Benefits
-
- 1. **Never splits functions** - Chunks respect semantic boundaries (no mid-function splits)
- 2. **Function context** - Know exactly which function you're looking at
- 3. **Complexity metrics** - Identify complex functions that may need refactoring
- 4. **Signature awareness** - See parameters and return types at a glance
- 5. **Better AI context** - AI assistants get structured code information
-
- ### Using AST Metadata
-
- **Find complex functions:**
- ```typescript
- // Search for authentication logic
- const results = await semantic_search({ query: "authentication logic" });
-
- // Filter by complexity
- const complexFunctions = results.filter(r => (r.metadata.complexity || 0) > 5);
- ```
-
- **Identify methods in a class:**
- ```typescript
- // Get file context
- const context = await get_file_context({ filepath: "src/auth/AuthService.ts" });
-
- // Find all methods
- const methods = context.results.filter(r => r.metadata.symbolType === 'method');
- ```
-
- **List functions with specific parameters:**
- ```typescript
- const functions = await list_functions({ pattern: ".*validate.*", language: "typescript" });
-
- // Filter by parameter count
- const simpleValidators = functions.filter(r => (r.metadata.parameters?.length || 0) <= 2);
- ```
-
- ### AST Support
-
- **Currently supported:**
- - ✅ TypeScript (`.ts`, `.tsx`)
- - ✅ JavaScript (`.js`, `.jsx`, `.mjs`, `.cjs`)
-
- **Coming soon:**
- - 🔜 Python, Go, Rust, Java, PHP, and more
-
- **Fallback behavior:**
- - For unsupported languages, Lien automatically falls back to line-based chunking
- - No disruption to existing workflows
-
- ### Known Limitations
-
- **Very large files (1000+ lines):**
- - Tree-sitter may fail with "Invalid argument" error on extremely large files
- - When this occurs, Lien automatically falls back to line-based chunking
- - This is a known Tree-sitter limitation with very large syntax trees
- - Fallback behavior is configurable via `astFallback` setting
-
- **Resilient parsing:**
- - Tree-sitter is designed to produce best-effort ASTs even for invalid syntax
- - Parse errors are rare; most malformed code still produces usable chunks
- - The `astFallback: 'error'` option mainly catches edge cases like large file errors
-
- ### Configuration
+ REQUIRED sequence:
+ 1. `semantic_search` → find relevant files
+ 2. `get_file_context` understand the file + check `testAssociations`
+ 3. Make changes
+ 4. Remind user to run affected tests
  
- Control AST behavior in `.lien.config.json`:
+ ## Tool Reference
  
- ```json
- {
- "chunking": {
- "useAST": true, // Enable AST-based chunking (default: true)
- "astFallback": "line-based" // Fallback strategy: 'line-based' | 'error'
- }
- }
- ```
-
- ---
-
- ## Input Validation & Error Handling
-
- Lien uses Zod schemas for runtime type-safe validation of all tool inputs. This provides:
- - **Automatic validation** of all parameters before tool execution
- - **Rich error messages** with field-level feedback
- - **Type safety** with full TypeScript inference
- - **Consistent error structure** across all tools
-
- ### Understanding Validation Errors
-
- When you provide invalid parameters, you'll receive a structured error response:
-
- ```json
- {
- "error": "Invalid parameters",
- "code": "INVALID_INPUT",
- "details": [
- {
- "field": "query",
- "message": "Query must be at least 3 characters"
- },
- {
- "field": "limit",
- "message": "Limit cannot exceed 50"
- }
- ]
- }
- ```
-
- ### Common Validation Rules
-
- **semantic_search:**
- - `query`: 3-500 characters (required)
- - `limit`: 1-50 (default: 5)
-
- **find_similar:**
- - `code`: minimum 10 characters (required)
- - `limit`: 1-20 (default: 5)
-
- **get_file_context:**
- - `filepath`: cannot be empty (required)
- - `includeRelated`: boolean (default: true)
+ **`semantic_search({ query: "what the code does", limit: 5 })`**
+ - Use natural language: "handles authentication", "validates email"
+ - NOT function names (use grep for exact names)
+ - Returns relevance category: `highly_relevant`, `relevant`, `loosely_related`, `not_relevant`
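
The relevance categories listed for `semantic_search` lend themselves to a simple cutoff filter. The sketch below is illustrative only: the four category names come from the rules text, while the best-to-worst ordering, the `SearchHit` shape (modeled loosely on the 0.13.0 result example), and the `atLeast` helper are assumptions.

```typescript
// Category names are from the rules text; treating this list as ordered
// from best to worst is an assumption made for illustration.
const RELEVANCE_ORDER = ["highly_relevant", "relevant", "loosely_related", "not_relevant"] as const;
type Relevance = (typeof RELEVANCE_ORDER)[number];

// Hypothetical, trimmed-down result shape.
interface SearchHit {
  metadata: { file: string };
  relevance: Relevance;
}

// Keep hits whose category is at least as good as `minimum`.
function atLeast(hits: SearchHit[], minimum: Relevance): SearchHit[] {
  const cutoff = RELEVANCE_ORDER.indexOf(minimum);
  return hits.filter((hit) => RELEVANCE_ORDER.indexOf(hit.relevance) <= cutoff);
}

// Hypothetical sample data standing in for a real tool response.
const sampleHits: SearchHit[] = [
  { metadata: { file: "src/auth/login.ts" }, relevance: "highly_relevant" },
  { metadata: { file: "src/auth/session.ts" }, relevance: "relevant" },
  { metadata: { file: "src/utils/strings.ts" }, relevance: "not_relevant" },
];

const usefulHits = atLeast(sampleHits, "relevant");
console.log(usefulHits.map((hit) => hit.metadata.file));
```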
  
- **list_functions:**
- - `pattern`: optional regex string
- - `language`: optional language filter
+ **`get_file_context({ filepath: "path/to/file.ts" })`**
+ - MANDATORY before editing any file
+ - Returns `testAssociations`: which tests cover this file
+ - Shows file dependencies and relationships
  
- ### Error Codes
+ **`list_functions({ pattern: ".*Controller.*" })`**
+ - Fast symbol lookup by naming pattern
+ - Use for structural queries: "show all services", "find handlers"
+ - 10x faster than semantic_search for name-based lookups
  
- Lien uses structured error codes for programmatic error handling:
+ **`find_similar({ code: "snippet to match" })`**
+ - Find similar implementations for consistency
+ - Use when refactoring or detecting duplication
  
- - `INVALID_INPUT` - Parameter validation failed
- - `FILE_NOT_FOUND` - Requested file doesn't exist in index
- - `INDEX_NOT_FOUND` - No index found (run `lien index`)
- - `INDEX_CORRUPTED` - Index is corrupted (run `lien reindex`)
- - `EMBEDDING_GENERATION_FAILED` - Embedding model failed (retryable)
- - `INTERNAL_ERROR` - Unexpected internal error
+ ## Test Associations
  
- ### Best Practices
+ `get_file_context` returns `testAssociations` showing which tests cover the file.
+ ALWAYS check this before modifying source code.
+ After changes, remind the user: "This file is covered by [test files] - run these to verify."
  
- 1. **Always provide required fields**: Check tool schemas for required parameters
- 2. **Respect validation limits**: Don't exceed max values for `limit` parameters
- 3. **Use descriptive queries**: Avoid very short or vague queries
- 4. **Handle validation errors gracefully**: Parse error details to understand what went wrong
+ ## Workflow Patterns
  
- ---
-
- ## Workflow Patterns (FOLLOW THESE)
-
- ### Pattern 1: User Asks "Where is X?"
+ ### Pattern 1: "Where is X?" / "How does X work?"
  ```
- 1. semantic_search({ query: "X functionality" })
- 2. Review results, identify file(s)
+ 1. semantic_search({ query: "X implementation" })
+ 2. Review results (check relevance scores)
  3. get_file_context({ filepath: "identified/file.ts" })
- 4. Answer with specific information
+ 4. Answer with specific code locations
  ```
  
- ### Pattern 2: User Asks to Edit/Change Code
+ ### Pattern 2: Edit or Change Code
  ```
  1. semantic_search({ query: "area being modified" })
  2. get_file_context({ filepath: "target/file.ts" })
- 3. find_similar({ code: "existing pattern" }) // if ensuring consistency
- 4. Make changes with full context
- ```
-
- ### Pattern 3: User Asks "How Does X Work?"
- ```
- 1. semantic_search({ query: "X implementation", limit: 10 })
- 2. Review top results
- 3. get_file_context for key files
- 4. Explain with references to actual code locations
- ```
-
- ### Pattern 4: Debugging or Understanding Error
- ```
- 1. semantic_search({ query: "error handling for [area]" })
- 2. semantic_search({ query: "[specific error type] handling" })
- 3. get_file_context for relevant files
- 4. Provide analysis
- ```
-
- ### Pattern 5: Modifying Source Code (Test-Aware)
- ```
- 1. semantic_search({ query: "functionality being modified" })
- 2. get_file_context({ filepath: "target/file.ts" })
- 3. Check testAssociations in response to see which tests cover this code
+ 3. Check testAssociations in response
  4. Make changes
- 5. Remind user to run the associated tests
+ 5. Tell user which tests to run
  ```
  
- ### Pattern 6: Understanding Test Coverage
- ```
- 1. get_file_context({ filepath: "src/component.ts" })
- 2. Review testAssociations field in response
- 3. Use get_file_context for each test file to understand coverage
- 4. Analyze and suggest improvements
- ```
-
- ### Pattern 7: Finding All Classes/Functions by Name Pattern ⚡ NEW
- ```
- 1. list_functions({ pattern: ".*Controller.*", language: "php" })
- 2. Review the list of matching classes
- 3. Use get_file_context on specific files for deeper investigation
- 4. Answer user's structural/architectural questions
- ```
-
- **Example queries:**
- - "Show me all Controllers" → `list_functions({ pattern: ".*Controller.*" })`
- - "What Services exist?" → `list_functions({ pattern: ".*Service.*" })`
- - "Find all API handlers" → `list_functions({ pattern: "handle.*" })`
-
- ---
-
- ## Decision Tree: Lien vs Other Tools
-
- ### Use `semantic_search` when:
- ✅ User asks about functionality, features, or "how X works"
- ✅ You need to understand what code exists before editing
- ✅ Looking for patterns, implementations, handlers, validators, etc.
- ✅ Exploring unfamiliar parts of codebase
- ✅ Searching by what code **does** (behavior, functionality)
-
- ### Use `list_functions` when: ⚡ NEW
- ✅ User asks "show me all Controllers" or similar structural queries
- ✅ Looking for classes/functions matching a **naming pattern**
- ✅ Getting architectural overview (all Services, all Handlers, etc.)
- ✅ Searching by what code is **named** (symbol names, not behavior)
- ✅ Need fast results for known naming conventions
-
- ### Use `grep` when:
- ✅ User provides exact function/variable name to find
- ✅ Looking for specific string literals or imports
- ✅ Finding all occurrences of exact text
-
- ### Use `get_file_context` when:
- ✅ You identified a file via search and need to understand it
- ✅ About to edit a file (MANDATORY)
- ✅ Need to understand file relationships and dependencies
+ ## Query Construction
  
- ### Use `find_similar` when:
- ✅ Refactoring multiple similar pieces of code
- ✅ Ensuring new code matches existing patterns
- ✅ Finding duplicated logic
-
- ### Check test associations when:
- ✅ Before modifying any source file (use `get_file_context` to see testAssociations)
- ✅ User asks "what tests cover this?" (use `semantic_search` and check metadata)
- ✅ Understanding what a test file is testing (use `get_file_context` on the test file)
- ✅ Working on bug fixes (search results include test metadata)
-
- ---
-
- ## Query Construction Guide
-
- ### Good Semantic Queries (DO THIS):
+ ### Good Queries (DO THIS)
  - "handles user authentication"
- - "validates email addresses"
+ - "validates email addresses"
  - "processes payment transactions"
- - "parses JSON responses"
- - "middleware for authorization"
  - "React components with form state"
- - "database migration scripts"
  - "API endpoints for user data"
  
- ### Bad Queries (DON'T DO THIS):
+ ### Bad Queries (DON'T DO THIS)
  - "auth" (too vague)
  - "validateEmail" (use grep for exact names)
- - "line 234" (Lien doesn't work with line numbers)
  - "code" (way too generic)
  
- ### Query Formula:
- `[action verb] + [domain object] + [optional context]`
- - "handles authentication for API requests"
- - "validates user input in forms"
- - "caches API responses from external services"
+ **Formula:** `[action verb] + [domain object] + [optional context]`
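
The query formula can be illustrated with a small string builder. This is a sketch under stated assumptions: `buildQuery` is a hypothetical helper, not something Lien provides, and the sample phrases are taken from the example queries in this diff.

```typescript
// Hypothetical helper: joins the parts of the formula
// [action verb] + [domain object] + [optional context] into one query string.
function buildQuery(actionVerb: string, domainObject: string, context?: string): string {
  return [actionVerb, domainObject, context]
    .filter((part): part is string => part !== undefined && part.trim() !== "")
    .map((part) => part.trim())
    .join(" ");
}

console.log(buildQuery("handles", "authentication", "for API requests"));
console.log(buildQuery("validates", "email addresses"));
```

The resulting string would then be passed as the `query` argument of `semantic_search`.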
  
- ---
+ ## AST Metadata
  
- ## Performance Notes
+ Results include rich metadata: `symbolName`, `symbolType`, `complexity`, `parameters`, `signature`.
  
- - First query loads embeddings (~1-2s), subsequent queries are fast (<500ms)
- - Increase `limit` to 10-15 for broad exploration
- - Results are ranked by semantic relevance (trust the ranking)
- - User can re-index with `lien reindex` if results seem stale
- - **Relevance categories**: All search results include a `relevance` field (`highly_relevant`, `relevant`, `loosely_related`, `not_relevant`) to help interpret search quality at a glance
- - **Test associations**: Lien automatically detects test-source relationships across 12 languages using convention-based patterns and import analysis
+ Use for filtering:
+ - Complex functions: `results.filter(r => r.metadata.complexity > 5)`
+ - Methods only: `results.filter(r => r.metadata.symbolType === 'method')`
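
The two filters can be tried against mock data. In the sketch below the `SearchResult` shape is modeled on the metadata fields named in this diff, the sample results are invented, and the AST fields are assumed to be optional, so a missing `complexity` is treated as 0 (as the removed 0.13.0 example also did with `|| 0`).

```typescript
// Metadata shape modeled on the fields named in the rules text; treating
// every AST field as optional is an assumption for illustration.
interface ChunkMetadata {
  symbolName?: string;
  symbolType?: "function" | "method" | "class" | "interface";
  complexity?: number;
  parameters?: string[];
  signature?: string;
}

interface SearchResult {
  content: string;
  metadata: ChunkMetadata;
}

// Hypothetical sample data standing in for a real tool response.
const results: SearchResult[] = [
  { content: "function validateEmail(...) { ... }", metadata: { symbolName: "validateEmail", symbolType: "function", complexity: 3 } },
  { content: "login(...) { ... }", metadata: { symbolName: "login", symbolType: "method", complexity: 8 } },
  { content: "// chunk without AST metadata", metadata: {} },
];

// Complex functions: a missing complexity counts as 0 and is filtered out.
const complexFunctions = results.filter((r) => (r.metadata.complexity ?? 0) > 5);

// Methods only.
const methodsOnly = results.filter((r) => r.metadata.symbolType === "method");

console.log(complexFunctions.map((r) => r.metadata.symbolName));
console.log(methodsOnly.map((r) => r.metadata.symbolName));
```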
  
- ---
+ ## When to Use grep Instead
  
- ## Key Principles
+ ONLY use grep/ripgrep when:
+ - User provides an exact string/function name to find
+ - Looking for specific imports or string literals
+ - Semantic search returned no results
  
- 1. **Search First, Read Second**: Use Lien before reading files blindly
- 2. **Semantic Over Syntactic**: Think about what code *does*, not what it's *named*
- 3. **Context Before Changes**: Always get file context before editing
- 4. **Test-Aware Development**: Check testAssociations in results to understand test coverage
- 5. **Trust the Results**: Semantic search finds relevant code even with different naming. Use the `relevance` field (`highly_relevant`, `relevant`, `loosely_related`, `not_relevant`) to quickly assess result quality
- 6. **Chain Your Tools**: semantic_search → get_file_context (includes testAssociations) → make changes is a powerful pattern
+ For everything else: **Lien first.**
  
  ---
  
- ## Setup Instructions
-
- Create a `lien.mdc` file in your `.cursor/rules/` directory:
-
- ```bash
- # From your project directory
- mkdir -p .cursor/rules
- cp /path/to/lien/CURSOR_RULES_TEMPLATE.md .cursor/rules/lien.mdc
- ```
-
- The `alwaysApply: true` frontmatter ensures Cursor uses Lien for all files in your project.
-
- This approach allows you to have multiple rule files in `.cursor/rules/` without conflicts.
-
+ REMINDER: semantic_search and get_file_context FIRST. grep is the fallback, not the default.
package/README.md CHANGED
@@ -72,7 +72,15 @@ Contributions welcome! See **[CONTRIBUTING.md](./CONTRIBUTING.md)** for guidelin
  
  ## License
  
- MIT © [Alf Henderson](https://github.com/alfhen)
+ AGPL-3.0 © [Alf Henderson](https://github.com/alfhen)
+
+ **Lien is free forever for local use.** The AGPL license ensures that:
+ - ✅ You can use Lien locally without restrictions
+ - ✅ You can modify and distribute Lien freely
+ - ✅ Improvements get contributed back to the community
+ - ✅ We can sustain long-term development
+
+ For questions about licensing, contact us at alf@lien.dev
  
  ---