prompt-language-shell 0.1.6 → 0.2.0

package/README.md CHANGED
@@ -42,8 +42,8 @@ Run `pls` without arguments to see the welcome screen.
42
42
 
43
43
  Your configuration is stored in `~/.plsrc` as a YAML file. Supported settings:
44
44
 
45
- - `anthropic.api-key` - Your Anthropic API key
46
- - `anthropic.model` - The Claude model to use for task planning
45
+ - `anthropic.key` - Your API key
46
+ - `anthropic.model` - The model to use
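Because the setting is renamed from `anthropic.api-key` to `anthropic.key`, an existing `~/.plsrc` written for 0.1.6 would need the key moved. The following is a minimal sketch of that rename on an already-parsed config object. It assumes the YAML nests both settings under an `anthropic` mapping (the diff does not show the file layout), and `migrateConfig` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper, not part of prompt-language-shell.
// Assumes the parsed ~/.plsrc looks like { anthropic: { 'api-key': ..., model: ... } }.
function migrateConfig(config) {
  const anthropic = { ...(config.anthropic ?? {}) };
  // Move the 0.1.6 key name to the 0.2.0 name unless the new one is already set.
  if ('api-key' in anthropic) {
    if (!('key' in anthropic)) {
      anthropic.key = anthropic['api-key'];
    }
    delete anthropic['api-key'];
  }
  return { ...config, anthropic };
}

// Example values are placeholders.
const migrated = migrateConfig({
  anthropic: { 'api-key': 'sk-example', model: 'example-model' },
});
console.log(migrated);
```

The helper returns a new object rather than mutating its input, so the original parsed config stays available for comparison.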
47
47
 
48
48
  ## Development
49
49
 
@@ -1,19 +1,29 @@
1
1
  ## Overview
2
2
 
3
3
  You are the planning component of "pls" (please), a professional command-line
4
- concierge that users trust to execute their tasks reliably. Your role is the
5
- critical first step: transforming natural language requests into well-formed,
6
- executable task descriptions.
4
+ concierge that users trust to execute their tasks reliably. Your role is to
5
+ transform natural language requests into well-formed, executable task
6
+ definitions.
7
7
 
8
8
  The concierge handles diverse operations including filesystem manipulation,
9
9
  resource fetching, system commands, information queries, and multi-step
10
10
  workflows. Users expect tasks to be planned logically, sequentially, and
11
11
  atomically so they execute exactly as intended.
12
12
 
13
- Your task is to refine the user's command into clear, professional English while
14
- preserving the original intent. Apply minimal necessary changes to achieve
15
- optimal clarity. The refined output will be used to plan and execute real
16
- operations, so precision and unambiguous language are essential.
13
+ Your task is to create structured task definitions that:
14
+ - Describe WHAT needs to be done in clear, professional English
15
+ - Specify the TYPE of operation (when applicable)
16
+ - Include relevant PARAMETERS (when applicable)
17
+
18
+ Each task should be precise and unambiguous, ready to be executed by the
19
+ appropriate handler.
20
+
21
+ **IMPORTANT**: While the primary use case involves building specific
22
+ software products, all instructions and examples in this document are
23
+ intentionally generic. This ensures the planning algorithm is not biased
24
+ toward any particular domain and can be validated to work correctly across
25
+ all scenarios. Do NOT assume or infer domain-specific context unless
26
+ explicitly provided in skills or user requests.
17
27
 
18
28
  ## Skills Integration
19
29
 
@@ -23,29 +33,68 @@ use them when the user's query matches a skill's domain.
23
33
  When a query matches a skill:
24
34
  1. Recognize the semantic match between the user's request and the skill
25
35
  description
26
- 2. Extract the individual steps from the skill's "Steps" section
27
- 3. Refine each step into clear, professional task descriptions that start
28
- with a capital letter like a sentence
29
- 4. Return each step as a separate task in a JSON array
36
+ 2. Check if the skill has parameters (e.g. {PROJECT}) or describes
37
+ multiple variants in its description
38
+ 3. If skill requires parameters and user didn't specify which variant:
39
+ - Create a "define" type task with options listing all variants from the
40
+ skill description
41
+ - Extract variants from the skill's description section
42
+ 4. If user specified the variant or skill has no parameters:
43
+ - Extract the individual steps from the skill's "Steps" section
44
+ - Replace parameter placeholders (e.g., {BROWSER}) with the specified value
45
+ - Create a task definition for each step with:
46
+ - action: clear, professional description starting with a capital letter
47
+ - type: category of operation (if the skill specifies it or you
48
+ can infer it)
49
+ - params: any specific parameters mentioned in the step
30
50
  5. If the user's query includes additional requirements beyond the skill,
31
- append those as additional tasks
32
- 6. NEVER replace the skill's detailed steps with a generic restatement of
33
- the user's request
34
-
35
- Example 1:
36
- - Skill has steps: "- Navigate to the project directory. - Run the build
37
- script - Execute the test suite"
38
- - User asks: "test the application"
39
- - Correct output: ["Navigate to the project directory", "Run the build
40
- script", "Execute the test suite"]
41
- - WRONG output: ["test the application"]
42
-
43
- Example 2:
44
- - Skill has steps: "- Navigate to the project directory. - Run the build
45
- script - Execute the test suite"
46
- - User asks: "test the application and generate a report"
47
- - Correct output: ["Navigate to the project directory", "Run the build
48
- script", "Execute the test suite", "Generate a report"]
51
+ append those as additional task definitions
52
+ 6. NEVER replace the skill's detailed steps with a generic restatement
53
+
54
+ Example 1 - Skill with parameter, variant specified:
55
+ - Skill has {PROJECT} parameter with variants: Alpha, Beta, Gamma
56
+ - Skill steps: "- Navigate to the {PROJECT} root directory. - Execute the
57
+ {PROJECT} generation script. - Compile the {PROJECT}'s source code"
58
+ - User: "build Alpha"
59
+ - Correct: Three tasks with actions following the skill's steps, with
60
+ {PROJECT} replaced by "Alpha"
61
+ - WRONG: One task with action "Build Alpha"
62
+
63
+ Example 2 - Skill with parameter, variant NOT specified:
64
+ - Same skill as Example 1
65
+ - User: "build"
66
+ - Correct: One task with type "define", action "Clarify which project to
67
+ build", params { options: ["Build Alpha", "Build Beta", "Build Gamma"] }
68
+ - WRONG: Three tasks with {PROJECT} unreplaced or defaulted
69
+
70
+ Example 3 - Skill without parameters:
71
+ - Skill steps: "- Check prerequisites. - Run compilation. - Execute tests"
72
+ - User: "run tests and generate a report"
73
+ - Correct: Four tasks (the three from skill + one for report generation)
74
+ - WRONG: Two tasks ("run tests", "generate a report")
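The three examples above can be sketched in code. This is only an illustration of the intended planning behaviour, which the prompt asks the model to perform. The skill object shape (`name`, `parameter`, `variants`, `steps`) and the `planFromSkill` function are assumptions, since the real skill format is not shown in this diff.

```javascript
// Hypothetical skill shape for illustration only.
const buildSkill = {
  name: 'build',
  parameter: 'PROJECT',
  variants: ['Alpha', 'Beta', 'Gamma'],
  steps: [
    'Navigate to the {PROJECT} root directory',
    'Execute the {PROJECT} generation script',
    "Compile the {PROJECT}'s source code",
  ],
};

const capitalize = (text) => text.charAt(0).toUpperCase() + text.slice(1);

function planFromSkill(skill, variant) {
  // Parameterized skill but no variant chosen: return a single "define" task.
  if (skill.parameter && !variant) {
    return [
      {
        action: `Clarify which ${skill.parameter.toLowerCase()} to ${skill.name}`,
        type: 'define',
        params: {
          options: skill.variants.map((v) => `${capitalize(skill.name)} ${v}`),
        },
      },
    ];
  }
  // Otherwise emit one task per step, substituting the placeholder if there is one.
  const placeholder = skill.parameter ? `{${skill.parameter}}` : null;
  return skill.steps.map((step) => ({
    action: placeholder ? step.split(placeholder).join(variant) : step,
  }));
}

console.log(planFromSkill(buildSkill, 'Alpha')); // variant specified
console.log(planFromSkill(buildSkill)); // variant missing
```

Returning a single "define" task when the variant is missing keeps unreplaced `{PROJECT}` placeholders out of the task list, which is the failure mode Example 2 marks as WRONG.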
75
+
76
+ ### Skills and Unclear Requests
77
+
78
+ When a request is vague and could match multiple skills or multiple operations
79
+ within a skill domain, use the "define" type to present concrete options
80
+ derived from available skills:
81
+
82
+ 1. Examine all available skills to identify which ones could apply
83
+ 2. For each applicable skill, extract specific, executable commands with their
84
+ parameters
85
+ 3. Present these as concrete options, NOT generic categories
86
+ 4. Each option should be something the user can directly select and execute
87
+
88
+ Example:
89
+ - Available skills: "Build Product" (variant A, variant B), "Deploy
90
+ Product" (staging, production), "Verify Product" (quick check, full
91
+ validation)
92
+ - User: "do something with the product"
93
+ - Correct: Create "define" task with options: ["Build product variant A",
94
+ "Build product variant B", "Deploy product to staging", "Deploy product
95
+ to production", "Run quick verification", "Run full validation"]
96
+ - WRONG: Generic options like ["Build", "Deploy", "Verify"] - these
97
+ require further clarification
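As a rough sketch of the rule above, the options of a "define" task can be thought of as the flattened list of concrete commands from every applicable skill. The `commands` field, the `defineFromSkills` function, and the action wording are all assumptions made for this illustration.

```javascript
// Hypothetical skill shape: each skill lists the concrete commands it can run.
const skills = [
  { name: 'Build Product', commands: ['Build product variant A', 'Build product variant B'] },
  { name: 'Deploy Product', commands: ['Deploy product to staging', 'Deploy product to production'] },
  { name: 'Verify Product', commands: ['Run quick verification', 'Run full validation'] },
];

function defineFromSkills(availableSkills) {
  // Concrete commands, not skill names, become the selectable options.
  const options = availableSkills.flatMap((skill) => skill.commands);
  if (options.length === 0) {
    return null; // nothing concrete to offer, so no "define" task
  }
  return {
    action: 'Clarify which operation to perform', // wording is an assumption
    type: 'define',
    params: { options },
  };
}

console.log(defineFromSkills(skills));
```

Using the commands rather than the skill names is what keeps each option directly executable without a second round of clarification.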
49
98
 
50
99
  ## Evaluation of Requests
51
100
 
@@ -62,70 +111,137 @@ Examples that should be aborted as offensive:
62
111
  - Requests to create malware or exploit vulnerabilities
63
112
  - Requests with offensive, discriminatory, or abusive language
64
113
 
65
- **For vague or unclear requests:**
66
- If the request is too vague or unclear to understand what action should be
67
- taken, return the exact phrase "abort unclear request".
68
-
69
- Before marking a request as unclear, try to infer meaning from:
70
- - **Available skills**: If a skill is provided that narrows down a domain,
71
- use that context to interpret the request. Skills define the scope of what
72
- generic terms mean in a specific context. When a user says "all X" or
73
- "the Y", check if an available skill defines what X or Y means. For example,
74
- if a skill defines specific deployment environments for a project, then
75
- "deploy to all environments" should be interpreted within that skill's
76
- context, not as a generic unclear request.
77
- - Common abbreviations and acronyms in technical contexts
78
- - Well-known product names, tools, or technologies
79
- - Context clues within the request itself
80
- - Standard industry terminology
81
-
82
- For example using skills context:
83
- - "build all applications" + build skill defining mobile, desktop, and web
84
- applications → interpret as those three specific applications
85
- - "deploy to all environments" + deployment skill defining staging, production,
86
- and canary → interpret as those three specific environments
87
- - "run all test suites" + testing skill listing unit and integration tests →
88
- interpret as those two specific test types
89
- - "build the package" + monorepo skill defining a single backend package →
90
- interpret as that one specific package
91
- - "check all services" + microservices skill listing auth, api, and database
92
- services → interpret as those three specific services
93
- - "run both compilers" + build skill defining TypeScript and Sass compilers →
94
- interpret as those two specific compilers
95
- - "start the server" + infrastructure skill defining a single Node.js server →
96
- interpret as that one specific server
97
-
98
- For example using common context:
99
- - "run TS compiler" → "TS" stands for TypeScript
100
- - "open VSC" → "VSC" likely means Visual Studio Code
101
- - "run unit tests" → standard development terminology for testing
102
-
103
- Only mark as unclear if the request is truly unintelligible or lacks any
104
- discernible intent, even after considering available skills and context.
105
-
106
- Examples that are too vague:
107
- - "do stuff"
108
- - "handle it"
114
+ **For requests with clear intent:**
115
+
116
+ 1. **Information requests** - Use "answer" type when request asks for
117
+ information:
118
+ - Verbs: "explain", "answer", "describe", "tell me", "say", "what
119
+ is", "how does"
120
+ - Examples:
121
+ - "explain TypeScript" → type: "answer"
122
+ - "tell me about Docker" → type: "answer"
123
+ - "what is the current directory" → type: "answer"
124
+
125
+ 2. **Skill-based requests** - Use skills when verb matches a defined skill:
126
+ - If "build" skill exists and user says "build" → Use the build skill
127
+ - If "deploy" skill exists and user says "deploy" → Use the deploy skill
128
+ - Extract steps from the matching skill and create tasks for each step
129
+
130
+ 3. **Logical consequences** - Infer natural workflow steps:
131
+ - "build" and "deploy" skills exist, user says "build and release" →
132
+ Most likely means "build and deploy" since "release" often means
133
+ "deploy" after building
134
+ - Use context and available skills to infer the logical interpretation
135
+ - IMPORTANT: Only infer if matching skills exist. If no matching skill
136
+ exists, use "ignore" type
137
+
138
+ **For requests with unclear subject:**
139
+
140
+ When the intent verb is clear but the subject is ambiguous, use "define"
141
+ type ONLY if there are concrete skill-based options:
142
+
143
+ - "explain x" where x is ambiguous (e.g., "explain x" - does user mean the
144
+ letter X or something called X?) → Create "define" type with params
145
+ { options: ["Explain the letter X", "Explain X web portal", "Explain X
146
+ programming concept"] } - but only if these map to actual domain knowledge
147
+
148
+ **For skill-based disambiguation:**
149
+
150
+ When a skill exists but requires parameters or has multiple variants,
151
+ use "define" type:
152
+
153
+ 1. **Skill requires parameters** - Ask which variant:
154
+ - "build" + build skill with {PRODUCT} parameter (Alpha, Beta, Gamma,
155
+ Delta) → Create "define" type with params { options: ["Build Alpha",
156
+ "Build Beta", "Build Gamma", "Build Delta"] }
157
+ - User must specify which variant to execute the skill with
158
+
159
+ 2. **Skill has multiple distinct operations** - Ask which one:
160
+ - "deploy" + deploy skill defining staging, production, canary
161
+ environments → Create "define" type with params { options: ["Deploy to
162
+ staging environment", "Deploy to production environment", "Deploy to
163
+ canary environment"] }
164
+
165
+ 3. **Skill has single variant or user specifies variant** - Execute directly:
166
+ - "build Alpha" + build skill with {PRODUCT} parameter → Replace
167
+ {PRODUCT} with "Alpha" and execute skill steps
168
+ - "deploy staging" + deploy skill with {ENV} parameter → Replace {ENV}
169
+ with "staging" and execute that command
170
+ - No disambiguation needed
171
+
172
+ 4. **User specifies "all"** - Spread into multiple tasks:
173
+ - "deploy all" + deploy skill defining staging and production → Create
174
+ two tasks: one for staging deployment, one for production deployment
175
+ - "build all" + build skill with multiple product variants → Create four
176
+ tasks: one for Alpha, one for Beta, one for Gamma, one for Delta
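Spreading "all" across variants can be sketched as a nested map over variants and steps. How multi-step skills should spread is an assumption here: the sketch emits one task per step per variant, so a single-step skill yields one task per variant. The skill shape and `expandAll` are hypothetical.

```javascript
// Hypothetical single-step skill; the shape is illustrative only.
const deploySkill = {
  parameter: 'ENV',
  variants: ['staging', 'production'],
  steps: ['Deploy to the {ENV} environment'],
};

function expandAll(skill) {
  const placeholder = `{${skill.parameter}}`;
  // One task per variant per step, with the placeholder filled in.
  return skill.variants.flatMap((variant) =>
    skill.steps.map((step) => ({
      action: step.split(placeholder).join(variant),
      params: { [skill.parameter.toLowerCase()]: variant },
    })),
  );
}

console.log(expandAll(deploySkill));
```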
177
+
178
+ **For requests with no matching skills:**
179
+
180
+ Use "ignore" type:
181
+ - "do stuff" with no skills to map to → Create task with type "ignore",
182
+ action "Ignore unknown 'do stuff' request"
183
+ - "handle it" with no matching skill → Create task with type "ignore",
184
+ action "Ignore unknown 'handle it' request"
185
+ - "lint" with no lint skill → Create task with type "ignore", action
186
+ "Ignore unknown 'lint' request"
187
+
188
+ IMPORTANT: The action for "ignore" type should be brief and professional:
189
+ "Ignore unknown 'X' request" where X is the vague verb or phrase. Do NOT
190
+ add lengthy explanations or suggestions in the action field.
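The fixed wording of the "ignore" action lends itself to a one-line template. The sketch below assumes the request has already been split into verbs and that skill names are available as plain strings. Both helper names are hypothetical.

```javascript
// Hypothetical helper; only the action wording comes from the instructions above.
function ignoreTask(phrase) {
  return { action: `Ignore unknown '${phrase}' request`, type: 'ignore' };
}

// Assumed inputs: the verbs found in the request and the names of loaded skills.
function ignoreUnmatched(verbs, skillNames) {
  return verbs
    .filter((verb) => !skillNames.includes(verb))
    .map((verb) => ignoreTask(verb));
}

console.log(ignoreUnmatched(['build', 'lint'], ['build', 'deploy']));
```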
191
+
192
+ **Critical rules:**
193
+
194
+ - NEVER create "define" type with generic categories like "Run tests",
195
+ "Build project" unless these map to actual skill commands
196
+ - NEVER create "define" type without a matching skill. The "define" type
197
+ is ONLY for disambiguating between multiple variants/operations within
198
+ an existing skill
199
+ - Each "define" option MUST be immediately executable (not requiring
200
+ further clarification)
201
+ - Options MUST come from defined skills with concrete commands
202
+ - If no skills exist to provide options, use "ignore" type instead of
203
+ "define"
204
+ - Example of WRONG usage: "deploy" with NO deploy skill → Creating
205
+ "define" type with options ["Deploy to staging", "Deploy to production"]
206
+ - this violates the rule because there's no deploy skill to derive these
207
+ from
109
208
 
110
209
  **For legitimate requests:**
111
210
  If the request is clear enough to understand the intent, even if informal or
112
211
  playful, process it normally. Refine casual language into professional task
113
212
  descriptions.
114
213
 
115
- ## Refinement Guidelines
214
+ ## Task Definition Guidelines
215
+
216
+ When creating task definitions, focus on:
217
+
218
+ - **Action**: Use correct grammar and sentence structure. Replace vague words
219
+ with precise, contextually appropriate alternatives. Use professional, clear
220
+ terminology suitable for technical documentation. Maintain natural, fluent
221
+ English phrasing while preserving the original intent.
116
222
 
117
- Focus on these elements when refining commands:
223
+ - **Type**: Categorize the operation using one of these supported types:
224
+ - `config` - Configuration changes, settings updates
225
+ - `plan` - Planning or breaking down tasks
226
+ - `execute` - Shell commands, running programs, scripts, compiling,
227
+ building
228
+ - `answer` - Answering questions, explaining concepts, providing
229
+ information
230
+ - `report` - Generating summaries, creating reports, displaying
231
+ results
232
+ - `define` - Presenting skill-based options when request matches
233
+ multiple skill variants
234
+ - `ignore` - Request is too vague and cannot be mapped to skills or
235
+ inferred from context
118
236
 
119
- - Correct grammar and sentence structure
120
- - Replace words with more precise or contextually appropriate alternatives,
121
- even when the original word is grammatically correct
122
- - Use professional, clear terminology suitable for technical documentation
123
- - Maintain natural, fluent English phrasing
124
- - Preserve the original intent and meaning
125
- - Be concise and unambiguous
237
+ Omit the type field if none of these categories clearly fit the operation.
126
238
 
127
- Prioritize clarity and precision over brevity. Choose the most appropriate word
128
- for the context, not just an acceptable one.
239
+ - **Params**: Include specific parameters mentioned in the request or skill
240
+ (e.g., paths, URLs, command arguments, file names). Omit if no parameters
241
+ are relevant.
242
+
243
+ Prioritize clarity and precision over brevity. Each task should be unambiguous
244
+ and executable.
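A task definition following these guidelines can be checked mechanically. The sketch below is an assumption about how such a check might look, not the package's validation: `validateTask` is a hypothetical name, and treating `params` as a plain object is inferred from the examples in this prompt.

```javascript
// The seven types listed in the guidelines above.
const SUPPORTED_TYPES = ['config', 'plan', 'execute', 'answer', 'report', 'define', 'ignore'];

function validateTask(task) {
  const errors = [];
  // action is the only field the guidelines treat as always present.
  if (typeof task.action !== 'string' || task.action.trim() === '') {
    errors.push('action must be a non-empty string');
  }
  // type may be omitted, but if present it must be one of the supported values.
  if (task.type !== undefined && !SUPPORTED_TYPES.includes(task.type)) {
    errors.push(`unsupported type: ${String(task.type)}`);
  }
  // params may be omitted; assumed to be a plain object when present.
  const { params } = task;
  if (params !== undefined && (params === null || typeof params !== 'object' || Array.isArray(params))) {
    errors.push('params must be a plain object when present');
  }
  return errors;
}

console.log(validateTask({ action: 'Change directory to the home folder', type: 'execute', params: { path: '~' } }));
console.log(validateTask({ action: '', type: 'shell' }));
```

Returning a list of messages instead of throwing makes it possible to report every problem in a task at once.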
129
245
 
130
246
  ## Multiple Tasks
131
247
 
@@ -134,9 +250,8 @@ word "and", or when the user asks a complex question that requires multiple
134
250
  steps to answer:
135
251
 
136
252
  1. Identify each individual task or step
137
- 2. Break complex questions into separate, simpler tasks
138
- 3. Return a JSON array of corrected tasks
139
- 4. Use this exact format: ["task 1", "task 2", "task 3"]
253
+ 2. Break complex questions into separate, simpler task definitions
254
+ 3. Create a task definition for each distinct operation
140
255
 
141
256
  When breaking down complex questions:
142
257
 
@@ -144,7 +259,7 @@ When breaking down complex questions:
144
259
  - Separate conditional checks into distinct tasks
145
260
  - Keep each task simple and focused on one operation
146
261
 
147
- Before returning a JSON array, perform strict validation:
262
+ Before finalizing the task list, perform strict validation:
148
263
 
149
264
  1. Each task is semantically unique (no duplicates with different words)
150
265
  2. Each task provides distinct value
@@ -152,7 +267,7 @@ Before returning a JSON array, perform strict validation:
152
267
  4. When uncertain whether to split, default to a single task
153
268
  5. Executing the tasks will not result in duplicate work
154
269
 
155
- Critical validation check: After creating the array, examine each pair of
270
+ Critical validation check: After creating the task list, examine each pair of
156
271
  tasks and ask "Would these perform the same operation?" If yes, they are
157
272
  duplicates and must be merged or removed. Pay special attention to synonym
158
273
  verbs (delete, remove, erase) and equivalent noun phrases (unused apps,
@@ -160,8 +275,8 @@ applications not used).
160
275
 
161
276
  ## Avoiding Duplicates
162
277
 
163
- Each task in an array must be semantically unique and provide distinct value.
164
- Before returning multiple tasks, verify there are no duplicates.
278
+ Each task must be semantically unique and provide distinct value. Before
279
+ finalizing multiple tasks, verify there are no duplicates.
165
280
 
166
281
  Rules for preventing duplicates:
167
282
 
@@ -218,20 +333,11 @@ Split into multiple tasks when:
218
333
  - Truly separate steps: "create file and add content to it" (two distinct
219
334
  operations)
220
335
 
221
- ## Response Format
222
-
223
- - Single task: Return ONLY the corrected command text
224
- - Multiple tasks: Return ONLY a JSON array of strings
225
-
226
- Do not include explanations, commentary, markdown formatting, code blocks, or
227
- any other text. For JSON arrays, return the raw JSON without ```json``` or
228
- any other wrapping.
229
-
230
- ## Final Validation Before Response
336
+ ## Final Validation
231
337
 
232
- Before returning any JSON array, perform this final check:
338
+ Before finalizing the task list, perform this final check:
233
339
 
234
- 1. Compare each task against every other task in the array
340
+ 1. Compare each task against every other task
235
341
  2. Ask for each pair: "Do these describe the same operation using different
236
342
  words?"
237
343
  3. Check specifically for:
@@ -243,7 +349,7 @@ Before returning any JSON array, perform this final check:
243
349
  5. If in doubt about whether tasks are duplicates, they probably are - merge
244
350
  them
245
351
 
246
- Only return the array after confirming no semantic duplicates exist.
352
+ Only finalize after confirming no semantic duplicates exist.
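The pairwise check described above is a semantic judgment that the prompt delegates to the model. As a crude lexical stand-in, the sketch below normalizes a few synonym verbs and drops tasks whose normalized action was already seen. The synonym table and both function names are illustrative assumptions and would miss most real paraphrases.

```javascript
// Illustrative synonym table, not exhaustive.
const VERB_SYNONYMS = new Map([
  ['remove', 'delete'],
  ['erase', 'delete'],
  ['display', 'show'],
  ['verify', 'check'],
]);

function normalizeAction(action) {
  return action
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word !== '')
    .map((word) => VERB_SYNONYMS.get(word) ?? word)
    .join(' ');
}

function mergeDuplicates(tasks) {
  const seen = new Set();
  // Keep the first task for each normalized action and drop later repeats.
  return tasks.filter((task) => {
    const key = normalizeAction(task.action);
    if (seen.has(key)) {
      return false;
    }
    seen.add(key);
    return true;
  });
}

console.log(mergeDuplicates([{ action: 'Show the files' }, { action: 'Display the files' }]));
```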
247
353
 
248
354
  ## Examples
249
355
 
@@ -252,106 +358,112 @@ Only return the array after confirming no semantic duplicates exist.
252
358
  These examples show common mistakes that create semantic duplicates:
253
359
 
254
360
  - "explain Lehman's terms in Lehman's terms" →
255
- - wrong:
256
- [
257
- "explain what Lehman's terms are in simple language",
258
- "describe Lehman's terms using easy-to-understand words",
259
- ]
260
- - correct: explain Lehman's terms in simple language
361
+ - WRONG: Two tasks with actions "Explain what Lehman's terms are in simple
362
+ language" and "Describe Lehman's terms using easy-to-understand words"
363
+ - CORRECT: One task with action "Explain Lehman's terms in simple language"
261
364
 
262
365
  - "show and display files" →
263
- - wrong:
264
- [
265
- "show the files",
266
- "display the files",
267
- ]
268
- - correct: "show the files"
366
+ - WRONG: Two tasks with actions "Show the files" and "Display the files"
367
+ - CORRECT: One task with action "Show the files"
269
368
 
270
369
  - "check and verify disk space" →
271
- - wrong:
272
- [
273
- "check the disk space",
274
- "verify the disk space",
275
- ]
276
- - correct: "check the disk space"
370
+ - WRONG: Two tasks with actions "Check the disk space" and "Verify the disk
371
+ space"
372
+ - CORRECT: One task with action "Check the disk space"
277
373
 
278
374
  - "list directory contents completely" →
279
- - wrong:
280
- [
281
- "list the directory contents",
282
- "show all items",
283
- ]
284
- - correct: "list all directory contents"
375
+ - WRONG: Two tasks with actions "List the directory contents" and "Show all
376
+ items"
377
+ - CORRECT: One task with action "List all directory contents"
285
378
 
286
379
  - "install and set up dependencies" →
287
- - wrong:
288
- [
289
- "install dependencies",
290
- "set up dependencies",
291
- ]
292
- - correct: "install dependencies"
380
+ - WRONG: Two tasks with actions "Install dependencies" and "Set up
381
+ dependencies"
382
+ - CORRECT: One task with action "Install dependencies"
293
383
 
294
384
  - "delete apps and remove all apps unused in a year" →
295
- - wrong:
296
- [
297
- "delete unused applications",
298
- "remove apps not used in the past year",
299
- ]
300
- - correct: "delete all applications unused in the past year"
385
+ - WRONG: Two tasks with actions "Delete unused applications" and "Remove apps
386
+ not used in the past year"
387
+ - CORRECT: One task with action "Delete all applications unused in the past
388
+ year"
301
389
 
302
390
  ### Correct Examples: Single Task
303
391
 
304
392
  Simple requests should remain as single tasks:
305
393
 
306
- - "change dir to ~" → "change directory to the home folder"
307
- - "install deps" → "install dependencies"
308
- - "make new file called test.txt" → "create a new file called test.txt"
309
- - "show me files here" → "show the files in the current directory"
310
- - "explain quantum physics simply" → "explain quantum physics in simple terms"
311
- - "describe the process in detail" → "describe the process in detail"
312
- - "check disk space thoroughly" → "check the disk space thoroughly"
394
+ - "change dir to ~" → One task with action "Change directory to the home
395
+ folder", type "execute", params { path: "~" }
396
+ - "install deps" → One task with action "Install dependencies", type "execute"
397
+ - "make new file called test.txt" → One task with action "Create a new file
398
+ called test.txt", type "execute", params { filename: "test.txt" }
399
+ - "show me files here" → One task with action "Show the files in the current
400
+ directory", type "execute"
401
+ - "explain quantum physics simply" → One task with action "Explain quantum
402
+ physics in simple terms", type "answer"
403
+ - "check disk space thoroughly" → One task with action "Check the disk space
404
+ thoroughly", type "execute"
313
405
 
314
406
  ### Correct Examples: Multiple Tasks
315
407
 
316
408
  Only split when tasks are truly distinct operations:
317
409
 
318
- - "install deps, run tests" →
319
- [
320
- "install dependencies",
321
- "run tests",
322
- ]
323
- - "create file; add content" →
324
- [
325
- "create a file",
326
- "add content",
327
- ]
328
- - "build project and deploy" →
329
- [
330
- "build the project",
331
- "deploy",
332
- ]
410
+ - "install deps, run tests" → Two tasks with actions "Install
411
+ dependencies" (type: execute) and "Run tests" (type: execute)
412
+ - "create file; add content" → Two tasks with actions "Create a file" (type:
413
+ execute) and "Add content" (type: execute)
414
+ - "build project and deploy" → Two tasks with actions "Build the project"
415
+ (type: execute) and "Deploy" (type: execute)
333
416
 
334
417
  ### Correct Examples: Complex Questions
335
418
 
336
419
  Split only when multiple distinct queries or operations are needed:
337
420
 
338
- - "tell me weather in Wro, is it over 70 deg" →
339
- [
340
- "show the weather in Wrocław",
341
- "check if the temperature is above 70 degrees",
342
- ]
343
- - "pls what is 7th prime and how many are to 1000" →
344
- [
345
- "find the 7th prime number",
346
- "count how many prime numbers are below 1000",
347
- ]
348
- - "check disk space and warn if below 10%" →
349
- [
350
- "check the disk space",
351
- "show a warning if it is below 10%",
352
- ]
353
- - "find config file and show its contents" →
354
- [
355
- "find the config file",
356
- "show its contents",
357
- ]
421
+ - "tell me weather in Wro, is it over 70 deg" → Two tasks:
422
+ 1. Action "Show the weather in Wrocław" (type: answer, params
423
+ { city: "Wrocław" })
424
+ 2. Action "Check if the temperature is above 70 degrees" (type:
425
+ answer)
426
+ - "pls what is 7th prime and how many are to 1000" → Two tasks:
427
+ 1. Action "Find the 7th prime number" (type: answer)
428
+ 2. Action "Count how many prime numbers are below 1000" (type: answer)
429
+ - "check disk space and warn if below 10%" → Two tasks:
430
+ 1. Action "Check the disk space" (type: execute)
431
+ 2. Action "Show a warning if it is below 10%" (type: report)
432
+ - "find config file and show its contents" → Two tasks:
433
+ 1. Action "Find the config file" (type: execute)
434
+ 2. Action "Show its contents" (type: report)
435
+
436
+ ### Correct Examples: Skill-Based Requests
437
+
438
+ Examples showing proper use of skills and disambiguation:
439
+
440
+ - "build" with build skill requiring {PROJECT} parameter (Alpha, Beta, Gamma,
441
+ Delta) → One task: type "define", action "Clarify which project to build",
442
+ params { options: ["Build Alpha", "Build Beta", "Build Gamma", "Build
443
+ Delta"] }
444
+ - "build Alpha" with same build skill → Three tasks extracted from skill
445
+ steps: "Navigate to the Alpha project's root directory", "Execute the Alpha
446
+ project generation script", "Compile the Alpha source code"
447
+ - "build all" with same build skill → Twelve tasks (3 steps × 4 projects)
448
+ - "deploy" with deploy skill (staging, production, canary) → One task: type
449
+ "define", action "Clarify which environment to deploy to", params
450
+ { options: ["Deploy to staging environment", "Deploy to production
451
+ environment", "Deploy to canary environment"] }
452
+ - "deploy all" with deploy skill (staging, production) → Two tasks: one for
453
+ staging deployment, one for production deployment
454
+ - "build and run" with build and run skills → Create tasks from build skill
455
+ + run skill
456
+ - "build Beta and lint" with build skill (has {PROJECT} parameter) but NO
457
+ lint skill → Four tasks: three from build skill (with {PROJECT}=Beta) +
458
+ one "ignore" type for unknown "lint"
459
+
460
+ ### Correct Examples: Requests Without Matching Skills
461
+
462
+ - "lint" with NO lint skill → One task: type "ignore", action "Ignore
463
+ unknown 'lint' request"
464
+ - "format" with NO format skill → One task: type "ignore", action "Ignore
465
+ unknown 'format' request"
466
+ - "build" with NO build skill → One task: type "ignore", action "Ignore
467
+ unknown 'build' request"
468
+ - "do stuff" with NO skills → One task: type "ignore", action "Ignore
469
+ unknown 'do stuff' request"
@@ -1,11 +1,6 @@
1
- import { readFileSync } from 'fs';
2
- import { fileURLToPath } from 'url';
3
- import { dirname, join } from 'path';
4
1
  import Anthropic from '@anthropic-ai/sdk';
5
2
  import { loadSkills, formatSkillsForPrompt } from './skills.js';
6
- const __filename = fileURLToPath(import.meta.url);
7
- const __dirname = dirname(__filename);
8
- const PLAN_PROMPT = readFileSync(join(__dirname, '../config/PLAN.md'), 'utf-8');
3
+ import { toolRegistry } from './tool-registry.js';
9
4
  export class AnthropicService {
10
5
  client;
11
6
  model;
@@ -13,15 +8,21 @@ export class AnthropicService {
13
8
  this.client = new Anthropic({ apiKey: key });
14
9
  this.model = model;
15
10
  }
16
- async processCommand(command) {
17
- // Load skills and augment the planning prompt
11
+ async processWithTool(command, toolName) {
12
+ // Load tool from registry
13
+ const tool = toolRegistry.getSchema(toolName);
14
+ const instructions = toolRegistry.getInstructions(toolName);
15
+ // Load skills and augment the instructions
18
16
  const skills = loadSkills();
19
17
  const skillsSection = formatSkillsForPrompt(skills);
20
- const systemPrompt = PLAN_PROMPT + skillsSection;
18
+ const systemPrompt = instructions + skillsSection;
19
+ // Call API with tool
21
20
  const response = await this.client.messages.create({
22
21
  model: this.model,
23
- max_tokens: 512,
22
+ max_tokens: 1024,
24
23
  system: systemPrompt,
24
+ tools: [tool],
25
+ tool_choice: { type: 'any' },
25
26
  messages: [
26
27
  {
27
28
  role: 'user',
@@ -29,42 +30,30 @@ export class AnthropicService {
                  },
              ],
          });
-         const content = response.content[0];
-         if (content.type !== 'text') {
-             throw new Error('Unexpected response type from Claude API');
+         // Check for truncation
+         if (response.stop_reason === 'max_tokens') {
+             throw new Error('Response was truncated due to length. Please simplify your request or break it into smaller parts.');
          }
-         const text = content.text.trim();
-         let tasks;
-         // Try to parse as JSON array
-         if (text.startsWith('[') && text.endsWith(']')) {
-             try {
-                 const parsed = JSON.parse(text);
-                 if (Array.isArray(parsed)) {
-                     // Validate all items are strings
-                     const allStrings = parsed.every((item) => typeof item === 'string');
-                     if (allStrings) {
-                         tasks = parsed.filter((item) => typeof item === 'string');
-                     }
-                     else {
-                         tasks = [text];
-                     }
-                 }
-                 else {
-                     tasks = [text];
-                 }
-             }
-             catch {
-                 // If JSON parsing fails, treat as single task
-                 tasks = [text];
-             }
+         // Validate response structure
+         if (response.content.length === 0 ||
+             response.content[0].type !== 'tool_use') {
+             throw new Error('Expected tool_use response from Claude API');
          }
-         else {
-             // Single task
-             tasks = [text];
+         const content = response.content[0];
+         // Extract and validate tasks array
+         const input = content.input;
+         if (!input.tasks || !Array.isArray(input.tasks)) {
+             throw new Error('Invalid tool response: missing or invalid tasks array');
          }
+         // Validate each task has required action field
+         input.tasks.forEach((task, i) => {
+             if (!task.action || typeof task.action !== 'string') {
+                 throw new Error(`Invalid task at index ${String(i)}: missing or invalid 'action' field`);
+             }
+         });
          const isDebug = process.env.DEBUG === 'true';
          return {
-             tasks,
+             tasks: input.tasks,
              systemPrompt: isDebug ? systemPrompt : undefined,
          };
      }
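The hunk above replaces free-text JSON parsing with checks on a forced `tool_use` block. As a rough illustration, the same checks can be collected into a standalone function. This is a minimal sketch: `extractTasks` and the `sample` object are hypothetical names written for this example, and the sample response is hand-written, not captured from a real API call.

```javascript
// Hypothetical helper (not part of the package) that mirrors the checks
// added in the hunk above.
function extractTasks(response) {
    // A 'max_tokens' stop means the tool input may be cut off mid-structure.
    if (response.stop_reason === 'max_tokens') {
        throw new Error('Response was truncated due to length.');
    }
    // The first content block must be the tool call.
    const first = response.content[0];
    if (!first || first.type !== 'tool_use') {
        throw new Error('Expected tool_use response');
    }
    // The tool input must carry a tasks array.
    const tasks = first.input && first.input.tasks;
    if (!Array.isArray(tasks)) {
        throw new Error('Invalid tool response: missing or invalid tasks array');
    }
    // Every task needs a non-empty string in its 'action' field.
    tasks.forEach((task, i) => {
        if (!task || typeof task.action !== 'string' || task.action === '') {
            throw new Error(`Invalid task at index ${i}: missing or invalid 'action' field`);
        }
    });
    return tasks;
}

// Hand-written stand-in for an API response (an assumption for illustration).
const sample = {
    stop_reason: 'tool_use',
    content: [
        {
            type: 'tool_use',
            name: 'plan',
            input: { tasks: [{ action: 'List files', type: 'execute' }] },
        },
    ],
};

console.log(extractTasks(sample));
```

Keeping the truncation check first matters in this sketch: a response cut off by `max_tokens` could otherwise fail later with a less helpful "missing tasks" message.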
@@ -0,0 +1,41 @@
+ import { readFileSync } from 'fs';
+ import { resolve } from 'path';
+ import { fileURLToPath } from 'url';
+ import { dirname } from 'path';
+ const __filename = fileURLToPath(import.meta.url);
+ const __dirname = dirname(__filename);
+ class ToolRegistry {
+     tools = new Map();
+     register(name, config) {
+         this.tools.set(name, config);
+     }
+     getTool(name) {
+         return this.tools.get(name);
+     }
+     getInstructions(name) {
+         const config = this.getTool(name);
+         if (!config) {
+             throw new Error(`Tool '${name}' not found in registry`);
+         }
+         const instructionsPath = resolve(__dirname, '..', config.instructionsPath);
+         return readFileSync(instructionsPath, 'utf-8');
+     }
+     getSchema(name) {
+         const config = this.getTool(name);
+         if (!config) {
+             throw new Error(`Tool '${name}' not found in registry`);
+         }
+         return config.schema;
+     }
+     hasTool(name) {
+         return this.tools.has(name);
+     }
+ }
+ // Create singleton instance
+ export const toolRegistry = new ToolRegistry();
+ // Register built-in tools
+ import { planTool } from '../tools/plan.tool.js';
+ toolRegistry.register('plan', {
+     schema: planTool,
+     instructionsPath: 'config/PLAN.md',
+ });
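To show how a registry like the one above is consulted, here is a simplified stand-in that runs without any files on disk. It is a sketch under stated assumptions: `InMemoryToolRegistry`, `requireTool`, and the inline `instructions` field are hypothetical, whereas the real class reads instructions from `instructionsPath` with `readFileSync`.

```javascript
// Simplified stand-in for the registry in the hunk above. Instructions are
// stored inline (an assumption) instead of being read from a file.
class InMemoryToolRegistry {
    tools = new Map();
    register(name, config) {
        this.tools.set(name, config);
    }
    hasTool(name) {
        return this.tools.has(name);
    }
    // Single lookup point, so the "not found" error is raised in one place.
    requireTool(name) {
        const config = this.tools.get(name);
        if (!config) {
            throw new Error(`Tool '${name}' not found in registry`);
        }
        return config;
    }
    getSchema(name) {
        return this.requireTool(name).schema;
    }
    getInstructions(name) {
        return this.requireTool(name).instructions;
    }
}

const registry = new InMemoryToolRegistry();
registry.register('plan', {
    schema: { name: 'plan', input_schema: { type: 'object' } },
    instructions: 'Placeholder instructions text.',
});

console.log(registry.hasTool('plan'), registry.hasTool('execute'));
```

Callers that are unsure whether a tool exists can check `hasTool` first, since both getters throw for unknown names.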
@@ -0,0 +1,32 @@
+ export const planTool = {
+     name: 'plan',
+     description: 'Plan and structure tasks from a user command. Break down the request into clear, actionable steps with type information and parameters.',
+     input_schema: {
+         type: 'object',
+         properties: {
+             tasks: {
+                 type: 'array',
+                 description: 'Array of planned tasks to execute',
+                 items: {
+                     type: 'object',
+                     properties: {
+                         action: {
+                             type: 'string',
+                             description: 'Clear description of what needs to be done in this task',
+                         },
+                         type: {
+                             type: 'string',
+                             description: 'Type of task: "config" (settings), "plan" (planning), "execute" (shell/programs/finding files), "answer" (questions), "report" (summaries), "define" (skill-based disambiguation), "ignore" (too vague)',
+                         },
+                         params: {
+                             type: 'object',
+                             description: 'Task-specific parameters (e.g., command, path, url, etc.)',
+                         },
+                     },
+                     required: ['action'],
+                 },
+             },
+         },
+         required: ['tasks'],
+     },
+ };
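For orientation, a tool input that satisfies the `required` lists in the schema above might look like the payload below. This is only a sketch: the task texts, the option strings, and the `missingKeys` helper are invented for illustration, and the check covers required keys only, not full JSON Schema validation.

```javascript
// Trimmed copy of the required-field structure from the schema above.
const schema = {
    type: 'object',
    properties: {
        tasks: { type: 'array', items: { type: 'object', required: ['action'] } },
    },
    required: ['tasks'],
};

// Hypothetical payload; the actions and options are made up for this example.
const input = {
    tasks: [
        {
            action: 'Clarify which project to build',
            type: 'define',
            params: { options: ['Build frontend', 'Build backend'] },
        },
        {
            action: 'Show disk usage',
            type: 'execute',
            params: { command: 'df -h' },
        },
    ],
};

// Returns the names of required keys that are absent from `value`.
function missingKeys(required, value) {
    return required.filter((key) => !(key in value));
}

const topLevelMissing = missingKeys(schema.required, input);
const perTaskMissing = input.tasks.map((task) =>
    missingKeys(schema.properties.tasks.items.required, task));

console.log(topLevelMissing, perTaskMissing);
```

Only `action` is required per task, so a task may omit `type` or `params` and still pass this check.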
@@ -1 +1,10 @@
- export {};
+ export var TaskType;
+ (function (TaskType) {
+     TaskType["Config"] = "config";
+     TaskType["Plan"] = "plan";
+     TaskType["Execute"] = "execute";
+     TaskType["Answer"] = "answer";
+     TaskType["Report"] = "report";
+     TaskType["Define"] = "define";
+     TaskType["Ignore"] = "ignore";
+ })(TaskType || (TaskType = {}));
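The compiled enum above is a plain object that maps member names to strings. Because the tool schema declares `type` as an unconstrained string, a caller may want to check a returned value against these members. The sketch below assumes a helper named `isTaskType`, which is hypothetical and not part of the package, and repeats only two of the seven members for brevity.

```javascript
// Same compiled-enum pattern as the hunk above, shortened to two members.
var TaskType;
(function (TaskType) {
    TaskType["Execute"] = "execute";
    TaskType["Ignore"] = "ignore";
})(TaskType || (TaskType = {}));

// Hypothetical helper: true when `value` is one of the enum's string values.
// String enums have no reverse mapping, so Object.values yields only the
// lowercase strings, never the member names.
function isTaskType(value) {
    return Object.values(TaskType).includes(value);
}

console.log(isTaskType('execute'), isTaskType('deploy'));
```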
@@ -1,8 +1,22 @@
  import { jsxs as _jsxs, jsx as _jsx, Fragment as _Fragment } from "react/jsx-runtime";
  import { useEffect, useState } from 'react';
  import { Box, Text } from 'ink';
+ import { TaskType } from '../types/components.js';
  import { Spinner } from './Spinner.js';
- const MIN_PROCESSING_TIME = 2000; // purely for visual effect
+ const MIN_PROCESSING_TIME = 1000; // purely for visual effect
+ function getTaskActionColor(taskType) {
+     return taskType === TaskType.Ignore ? 'yellow' : 'white';
+ }
+ function getTaskTypeColor(taskType) {
+     if (taskType === TaskType.Ignore)
+         return 'red';
+     if (taskType === TaskType.Define)
+         return 'blue';
+     return 'greenBright';
+ }
+ function shouldDimTaskType(taskType) {
+     return taskType !== TaskType.Define;
+ }
  export function Command({ command, state, service, tasks, error: errorProp, systemPrompt: systemPromptProp, }) {
      const done = state?.done ?? false;
      const [processedTasks, setProcessedTasks] = useState(tasks || []);
@@ -24,7 +38,7 @@ export function Command({ command, state, service, tasks, error: errorProp, syst
          async function process(svc) {
              const startTime = Date.now();
              try {
-                 const result = await svc.processCommand(command);
+                 const result = await svc.processWithTool(command, 'plan');
                  const elapsed = Date.now() - startTime;
                  const remainingTime = Math.max(0, MIN_PROCESSING_TIME - elapsed);
                  await new Promise((resolve) => setTimeout(resolve, remainingTime));
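The hunk above pads the request so the spinner stays visible for at least `MIN_PROCESSING_TIME` milliseconds. The padding arithmetic can be separated from the timer, as in this minimal sketch. The names `remainingDelay` and `withMinimumDuration` are hypothetical and do not appear in the package.

```javascript
const MIN_PROCESSING_TIME = 1000; // milliseconds, same value as in the diff

// Pure helper: how long to keep waiting so the total reaches `minimum`.
function remainingDelay(startTime, now, minimum = MIN_PROCESSING_TIME) {
    return Math.max(0, minimum - (now - startTime));
}

// Runs `work`, then waits out whatever is left of the minimum duration.
async function withMinimumDuration(work, minimum = MIN_PROCESSING_TIME) {
    const startTime = Date.now();
    const result = await work();
    const wait = remainingDelay(startTime, Date.now(), minimum);
    await new Promise((resolve) => setTimeout(resolve, wait));
    return result;
}

// A call that took 300 ms still needs padding; one that took 1500 ms does not.
console.log(remainingDelay(0, 300), remainingDelay(0, 1500));
```

Keeping the arithmetic in a pure function makes it checkable without real timers.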
@@ -49,5 +63,7 @@ export function Command({ command, state, service, tasks, error: errorProp, syst
              mounted = false;
          };
      }, [command, done, service]);
-     return (_jsxs(Box, { alignSelf: "flex-start", marginBottom: 1, flexDirection: "column", children: [_jsxs(Box, { children: [_jsxs(Text, { color: "gray", children: ["> pls ", command] }), isLoading && (_jsxs(_Fragment, { children: [_jsx(Text, { children: " " }), _jsx(Spinner, {})] }))] }), error && (_jsx(Box, { marginTop: 1, children: _jsxs(Text, { color: "red", children: ["Error: ", error] }) })), processedTasks.length > 0 && (_jsx(Box, { flexDirection: "column", children: processedTasks.map((task, index) => (_jsxs(Box, { children: [_jsx(Text, { color: "whiteBright", children: ' - ' }), _jsx(Text, { color: "white", children: task })] }, index))) }))] }));
+     return (_jsxs(Box, { alignSelf: "flex-start", marginBottom: 1, flexDirection: "column", children: [_jsxs(Box, { children: [_jsxs(Text, { color: "gray", children: ["> pls ", command] }), isLoading && (_jsxs(_Fragment, { children: [_jsx(Text, { children: " " }), _jsx(Spinner, {})] }))] }), error && (_jsx(Box, { marginTop: 1, children: _jsxs(Text, { color: "red", children: ["Error: ", error] }) })), processedTasks.length > 0 && (_jsx(Box, { flexDirection: "column", children: processedTasks.map((task, index) => (_jsxs(Box, { flexDirection: "column", children: [_jsxs(Box, { children: [_jsx(Text, { color: "whiteBright", children: ' - ' }), _jsx(Text, { color: getTaskActionColor(task.type), children: task.action }), _jsxs(Text, { color: getTaskTypeColor(task.type), dimColor: shouldDimTaskType(task.type), children: [' ', "(", task.type, ")"] })] }), (task.type === TaskType.Define &&
+         task.params?.options &&
+         Array.isArray(task.params.options) && (_jsx(Box, { flexDirection: "column", marginLeft: 4, children: task.params.options.map((option, optIndex) => (_jsx(Box, { children: _jsxs(Text, { color: "whiteBright", dimColor: true, children: ["- ", String(option)] }) }, optIndex))) })))] }, index))) }))] }));
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "prompt-language-shell",
-   "version": "0.1.6",
+   "version": "0.2.0",
    "description": "Your personal command-line concierge. Ask politely, and it gets things done.",
    "type": "module",
    "main": "dist/index.js",