@tecet/ollm 0.1.4-b → 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (66)
  1. package/docs/README.md +3 -410
  2. package/package.json +2 -2
  3. package/docs/Context/CheckpointFlowDiagram.md +0 -673
  4. package/docs/Context/ContextArchitecture.md +0 -898
  5. package/docs/Context/ContextCompression.md +0 -1102
  6. package/docs/Context/ContextManagment.md +0 -750
  7. package/docs/Context/Index.md +0 -209
  8. package/docs/Context/README.md +0 -390
  9. package/docs/DevelopmentRoadmap/Index.md +0 -238
  10. package/docs/DevelopmentRoadmap/OLLM-CLI_Releases.md +0 -419
  11. package/docs/DevelopmentRoadmap/PlanedFeatures.md +0 -448
  12. package/docs/DevelopmentRoadmap/README.md +0 -174
  13. package/docs/DevelopmentRoadmap/Roadmap.md +0 -572
  14. package/docs/DevelopmentRoadmap/RoadmapVisual.md +0 -372
  15. package/docs/Hooks/Architecture.md +0 -885
  16. package/docs/Hooks/Index.md +0 -244
  17. package/docs/Hooks/KeyboardShortcuts.md +0 -248
  18. package/docs/Hooks/Protocol.md +0 -817
  19. package/docs/Hooks/README.md +0 -403
  20. package/docs/Hooks/UserGuide.md +0 -1483
  21. package/docs/Hooks/VisualGuide.md +0 -598
  22. package/docs/Index.md +0 -506
  23. package/docs/Installation.md +0 -586
  24. package/docs/Introduction.md +0 -367
  25. package/docs/LLM Models/Index.md +0 -239
  26. package/docs/LLM Models/LLM_GettingStarted.md +0 -748
  27. package/docs/LLM Models/LLM_Index.md +0 -701
  28. package/docs/LLM Models/LLM_MemorySystem.md +0 -337
  29. package/docs/LLM Models/LLM_ModelCompatibility.md +0 -499
  30. package/docs/LLM Models/LLM_ModelsArchitecture.md +0 -933
  31. package/docs/LLM Models/LLM_ModelsCommands.md +0 -839
  32. package/docs/LLM Models/LLM_ModelsConfiguration.md +0 -1094
  33. package/docs/LLM Models/LLM_ModelsList.md +0 -1071
  34. package/docs/LLM Models/LLM_ModelsList.md.backup +0 -400
  35. package/docs/LLM Models/README.md +0 -355
  36. package/docs/MCP/MCP_Architecture.md +0 -1086
  37. package/docs/MCP/MCP_Commands.md +0 -1111
  38. package/docs/MCP/MCP_GettingStarted.md +0 -590
  39. package/docs/MCP/MCP_Index.md +0 -524
  40. package/docs/MCP/MCP_Integration.md +0 -866
  41. package/docs/MCP/MCP_Marketplace.md +0 -160
  42. package/docs/MCP/README.md +0 -415
  43. package/docs/Prompts System/Architecture.md +0 -760
  44. package/docs/Prompts System/Index.md +0 -223
  45. package/docs/Prompts System/PromptsRouting.md +0 -1047
  46. package/docs/Prompts System/PromptsTemplates.md +0 -1102
  47. package/docs/Prompts System/README.md +0 -389
  48. package/docs/Prompts System/SystemPrompts.md +0 -856
  49. package/docs/Quickstart.md +0 -535
  50. package/docs/Tools/Architecture.md +0 -884
  51. package/docs/Tools/GettingStarted.md +0 -624
  52. package/docs/Tools/Index.md +0 -216
  53. package/docs/Tools/ManifestReference.md +0 -141
  54. package/docs/Tools/README.md +0 -440
  55. package/docs/Tools/UserGuide.md +0 -773
  56. package/docs/Troubleshooting.md +0 -1265
  57. package/docs/UI&Settings/Architecture.md +0 -729
  58. package/docs/UI&Settings/ColorASCII.md +0 -34
  59. package/docs/UI&Settings/Commands.md +0 -755
  60. package/docs/UI&Settings/Configuration.md +0 -872
  61. package/docs/UI&Settings/Index.md +0 -293
  62. package/docs/UI&Settings/Keybinds.md +0 -372
  63. package/docs/UI&Settings/README.md +0 -278
  64. package/docs/UI&Settings/Terminal.md +0 -637
  65. package/docs/UI&Settings/Themes.md +0 -604
  66. package/docs/UI&Settings/UIGuide.md +0 -550
@@ -1,1094 +0,0 @@
- # Model Management Configuration
-
- **Complete Configuration Guide**
-
- This document covers all configuration options for the Model Management system, including model settings, routing, memory, templates, and project profiles.
-
- ---
-
- ## Table of Contents
-
- 1. [Configuration Files](#configuration-files)
- 2. [Model Management](#model-management)
-    - [Unknown Model Handling](#unknown-model-handling)
-    - [Cache Settings](#cache-settings)
-    - [Keep-Alive Settings](#keep-alive-settings)
-    - [Auto-Pull Settings](#auto-pull-settings)
-    - [Tool Configuration](#tool-configuration)
- 3. [Model Routing](#model-routing)
- 4. [Memory System](#memory-system)
- 5. [Template System](#template-system)
- 6. [Project Profiles](#project-profiles)
- 7. [Environment Variables](#environment-variables)
- 8. [Configuration Precedence](#configuration-precedence)
-
- ---
-
- ## Configuration Files
-
- ### User Configuration
-
- **Location:** `~/.ollm/config.yaml`
-
- **Purpose:** Global settings for all projects
-
- **Example:**
-
- ```yaml
- # Model management
- models:
-   cache_ttl: 60
-   keep_alive: 300
-   auto_pull: false
-
- # Routing
- routing:
-   profile: general
-   preferred_families:
-     - llama
-     - mistral
-   fallback_profile: fast
-
- # Memory
- memory:
-   enabled: true
-   system_prompt_budget: 500
-   file: ~/.ollm/memory.json
-
- # Templates
- templates:
-   user_dir: ~/.ollm/templates
-   workspace_dir: .ollm/templates
-
- # Project profiles
- project:
-   auto_detect: true
-   profile: null
- ```
-
- ### Project Configuration
-
- **Location:** `.ollm/config.yaml` (in project root)
-
- **Purpose:** Project-specific settings (override user settings)
-
- **Example:**
-
- ```yaml
- # Project-specific model
- model: codellama:13b
-
- # Project-specific routing
- routing:
-   profile: code
-   preferred_families:
-     - codellama
-     - deepseek-coder
-
- # Project-specific memory
- memory:
-   enabled: true
-   system_prompt_budget: 1000
-
- # Project profile
- project:
-   profile: typescript
- ```
-
- ### Project Profile
-
- **Location:** `.ollm/project.yaml`
-
- **Purpose:** Project type and settings
-
- **Example:**
-
- ```yaml
- # Project metadata
- name: my-project
- type: typescript
- version: 1.0.0
-
- # Model settings
- model: llama3.1:8b
- system_prompt: |
-   You are a TypeScript expert helping with a web application.
-   Follow TypeScript best practices and modern patterns.
-
- # Tool restrictions
- tools:
-   allowed:
-     - read_file
-     - write_file
-     - shell
-   denied:
-     - web_fetch
-
- # Routing
- routing:
-   profile: code
-   preferred_families:
-     - llama
-     - qwen
- ```
-
- ---
-
- ## Model Management
-
- ### Unknown Model Handling
-
- **Overview:** When you install a model that isn't in OLLM's database, the system automatically creates a profile using the "user-unknown-model" template. This allows you to use any model with Ollama, even if it's not officially supported.
-
- **How It Works:**
-
- 1. **Automatic Detection:** On startup, OLLM queries Ollama for installed models
- 2. **Database Matching:** Each model is matched against the master database
- 3. **Template Application:** Unknown models receive default settings based on Llama 3.2 3B
- 4. **User Customization:** You can manually edit the generated profile
-
- **User Profile Location:**
-
- - **Windows:** `C:\Users\{username}\.ollm\LLM_profiles.json`
- - **Linux/Mac:** `~/.ollm/LLM_profiles.json`
-
- **Example Unknown Model Entry:**
-
- ```json
- {
-   "id": "custom-model:latest",
-   "name": "Unknown Model (custom-model:latest)",
-   "creator": "User",
-   "parameters": "Based on Llama 3.2 3B",
-   "quantization": "Based on Llama 3.2 3B (4-bit estimated)",
-   "description": "Unknown model \"custom-model:latest\". Please edit your settings at ~/.ollm/LLM_profiles.json",
-   "abilities": ["Unknown"],
-   "tool_support": false,
-   "ollama_url": "Unknown",
-   "max_context_window": 131072,
-   "context_profiles": [
-     {
-       "size": 4096,
-       "size_label": "4k",
-       "vram_estimate": "2.5 GB",
-       "ollama_context_size": 2867,
-       "vram_estimate_gb": 2.5
-     }
-     // ... more context profiles
-   ],
-   "default_context": 4096
- }
- ```
-
- **Customizing Unknown Models:**
-
- 1. **Locate the file:**
-
-    ```bash
-    # Windows
-    notepad %USERPROFILE%\.ollm\LLM_profiles.json
-
-    # Linux/Mac
-    nano ~/.ollm/LLM_profiles.json
-    ```
-
- 2. **Find your model:** Search for the model ID (e.g., "custom-model:latest")
-
- 3. **Update fields:**
-
-    ```json
-    {
-      "id": "custom-model:latest",
-      "name": "My Custom Model 7B", // ← Update display name
-      "creator": "Custom Creator", // ← Update creator
-      "parameters": "7B", // ← Update parameter count
-      "quantization": "4-bit", // ← Update quantization
-      "description": "My custom fine-tuned model",
-      "abilities": ["Coding", "Math"], // ← Update capabilities
-      "tool_support": true, // ← Enable if model supports tools
-      "ollama_url": "https://...", // ← Add documentation link
-      "context_profiles": [
-        {
-          "size": 4096,
-          "vram_estimate": "5.5 GB", // ← Adjust VRAM estimates
-          "ollama_context_size": 3482,
-          "vram_estimate_gb": 5.5
-        }
-        // ... adjust other profiles
-      ]
-    }
-    ```
-
- 4. **Save and restart:** Your changes persist across app restarts
-
- **Important Notes:**
-
- - **Preservation:** Your edits are preserved when the profile is recompiled
- - **Context Sizes:** The `ollama_context_size` values are pre-calculated at 85% of the `size` value
- - **VRAM Estimates:** Adjust based on your actual GPU memory usage
- - **Tool Support:** Only enable if your model supports function calling
- - **Default Template:** Based on Llama 3.2 3B (3.2B parameters, 4-bit quantization)
-
- **Common Customizations:**
-
- **For Larger Models (13B+):**
-
- ```json
- {
-   "parameters": "13B",
-   "context_profiles": [
-     {
-       "size": 4096,
-       "vram_estimate": "8.5 GB",
-       "vram_estimate_gb": 8.5
-     }
-   ]
- }
- ```
-
- **For Code-Specialized Models:**
-
- ```json
- {
-   "abilities": ["Coding", "Debugging", "Code Review"],
-   "tool_support": true
- }
- ```
-
- **For Reasoning Models:**
-
- ```json
- {
-   "abilities": ["Reasoning", "Math", "Logic"],
-   "thinking_enabled": true,
-   "reasoning_buffer": "Variable",
-   "warmup_timeout": 120000
- }
- ```
-
- **Troubleshooting:**
-
- **Model Not Detected:**
-
- - Ensure Ollama is running: `curl http://localhost:11434/api/tags`
- - Check the model is installed: `ollama list`
- - Restart OLLM to trigger recompilation
-
- **Wrong VRAM Estimates:**
-
- - Monitor actual usage with `nvidia-smi` (NVIDIA) or `rocm-smi` (AMD)
- - Update `vram_estimate_gb` values in your profile
- - Consider reducing context size if running out of memory
-
- **Tool Support Not Working:**
-
- - Verify the model actually supports function calling
- - Check the Ollama model documentation
- - Test with a simple tool call before enabling
-
- **See Also:**
-
- - [Model Compiler System](../../.dev/docs/knowledgeDB/dev_ModelCompiler.md) - Technical details
- - [Model Database](../../.dev/docs/knowledgeDB/dev_ModelDB.md) - Database schema
- - [Model Management](../../.dev/docs/knowledgeDB/dev_ModelManagement.md) - Model selection
-
- ---
-
- ### Cache Settings
-
- **Option:** `models.cache_ttl`
- **Type:** Number (seconds)
- **Default:** 60
- **Description:** How long to cache the model list
-
- **Example:**
-
- ```yaml
- models:
-   cache_ttl: 120 # Cache for 2 minutes
- ```
-
- **Impact:**
-
- - Higher values: Fewer provider calls, faster responses, stale data
- - Lower values: More provider calls, slower responses, fresh data
-
- ### Keep-Alive Settings
-
- **Option:** `models.keep_alive`
- **Type:** Number (seconds)
- **Default:** 300
- **Description:** How long to keep models loaded
-
- **Example:**
-
- ```yaml
- models:
-   keep_alive: 600 # Keep loaded for 10 minutes
- ```
-
- **Impact:**
-
- - Higher values: Models stay in memory longer, faster responses, more VRAM used
- - Lower values: Models unload sooner, slower responses, less VRAM used
-
- **Special Values:**
-
- - `0`: Unload immediately after use
- - `-1`: Keep loaded indefinitely
-
- ### Auto-Pull Settings
-
- **Option:** `models.auto_pull`
- **Type:** Boolean
- **Default:** false
- **Description:** Automatically pull missing models
-
- **Example:**
-
- ```yaml
- models:
-   auto_pull: true # Pull models automatically
- ```
-
- **Impact:**
-
- - `true`: Convenient, but may download large files unexpectedly
- - `false`: Manual control, but requires explicit pull commands
-
- ### Tool Configuration
-
- **Option:** `tools`
- **Type:** Object (map of tool ID to boolean)
- **Default:** All tools enabled
- **Description:** Enable or disable individual tools
-
- **Example:**
-
- ```yaml
- tools:
-   executePwsh: false # Disable shell execution
-   controlPwshProcess: false # Disable process management
-   remote_web_search: true # Enable web search
-   webFetch: true # Enable web fetch
- ```
-
- **Available Tools:**
-
- **File Operations:**
-
- - `fsWrite`: Create or overwrite files
- - `fsAppend`: Append content to files
- - `strReplace`: Replace text in files
- - `deleteFile`: Delete files
-
- **File Discovery:**
-
- - `readFile`: Read file contents
- - `readMultipleFiles`: Read multiple files
- - `listDirectory`: List directory contents
- - `fileSearch`: Search for files by name
- - `grepSearch`: Search file contents with regex
-
- **Shell:**
-
- - `executePwsh`: Execute shell commands
- - `controlPwshProcess`: Manage background processes
- - `listProcesses`: List running processes
- - `getProcessOutput`: Read process output
-
- **Web:**
-
- - `remote_web_search`: Search the web
- - `webFetch`: Fetch content from URLs
-
- **Memory:**
-
- - `userInput`: Get input from the user
-
- **Context:**
-
- - `prework`: Acceptance criteria testing prework
- - `taskStatus`: Update task status
- - `updatePBTStatus`: Update property-based test status
- - `invokeSubAgent`: Delegate to specialized agents
-
- **Persistence:**
-
- - Tool settings are saved to `~/.ollm/settings.json`
- - Settings persist across sessions
- - Workspace settings (`.ollm/settings.json`) override user settings
-
- **Tool Filtering:**
- Tools are filtered in two stages:
-
- 1. **Model Capability**: If the model doesn't support function calling, all tools are disabled
- 2. **User Preference**: Disabled tools are never sent to the LLM
-
- **Use Cases:**
-
- - Disable shell tools for safety in untrusted environments
- - Disable web tools for offline work
- - Reduce tool count to improve LLM focus
- - Project-specific tool restrictions
-
- ---
-
- ## Model Routing
-
- ### Routing Profile
-
- **Option:** `routing.profile`
- **Type:** String
- **Default:** general
- **Values:** fast, general, code, creative
- **Description:** Task profile for model selection
-
- **Example:**
-
- ```yaml
- routing:
-   profile: code # Optimize for coding tasks
- ```
-
- **Profiles:**
-
- - `fast`: Quick responses, smaller models
- - `general`: Balanced performance
- - `code`: Programming tasks, code-specialized models
- - `creative`: Creative writing, larger models
-
- ### Preferred Families
-
- **Option:** `routing.preferred_families`
- **Type:** Array of strings
- **Default:** []
- **Description:** Model families to prefer
-
- **Example:**
-
- ```yaml
- routing:
-   preferred_families:
-     - llama # Prefer Llama models
-     - mistral # Then Mistral models
-     - qwen # Then Qwen models
- ```
-
- **Impact:**
-
- - Models from preferred families score higher
- - Order matters (first is most preferred)
- - Empty array means no preference
-
- ### Fallback Profile
-
- **Option:** `routing.fallback_profile`
- **Type:** String
- **Default:** general
- **Values:** fast, general, code, creative
- **Description:** Profile to use if the primary fails
-
- **Example:**
-
- ```yaml
- routing:
-   fallback_profile: fast # Fall back to fast profile
- ```
-
- **Impact:**
-
- - Used when no models match the primary profile
- - Prevents selection failures
- - Can chain multiple fallbacks
-
- ### Manual Override
-
- **Option:** `model`
- **Type:** String
- **Default:** null
- **Description:** Manually specify model (bypasses routing)
-
- **Example:**
-
- ```yaml
- model: llama3.1:8b # Always use this model
- ```
-
- **Impact:**
-
- - Overrides routing completely
- - Useful for testing specific models
- - Disables automatic selection
-
- ---
-
- ## Memory System
-
- ### Enable/Disable
-
- **Option:** `memory.enabled`
- **Type:** Boolean
- **Default:** true
- **Description:** Enable memory system
-
- **Example:**
-
- ```yaml
- memory:
-   enabled: false # Disable memory
- ```
-
- **Impact:**
-
- - `true`: Memories injected into system prompt
- - `false`: No memory injection, faster responses
-
- ### System Prompt Budget
-
- **Option:** `memory.system_prompt_budget`
- **Type:** Number (tokens)
- **Default:** 500
- **Description:** Maximum tokens for memory injection
-
- **Example:**
-
- ```yaml
- memory:
-   system_prompt_budget: 1000 # Allow more memories
- ```
-
- **Impact:**
-
- - Higher values: More memories included, less room for conversation
- - Lower values: Fewer memories included, more room for conversation
-
- **Recommendations:**
-
- - Small context (2K): 200-300 tokens
- - Medium context (8K): 500-800 tokens
- - Large context (32K+): 1000-2000 tokens
-
- ### Memory File
-
- **Option:** `memory.file`
- **Type:** String (path)
- **Default:** ~/.ollm/memory.json
- **Description:** Location of memory storage
-
- **Example:**
-
- ```yaml
- memory:
-   file: /custom/path/memory.json
- ```
-
- **Impact:**
-
- - Different projects can have different memory files
- - Useful for isolation or sharing
-
- ---
-
- ## Template System
-
- ### User Template Directory
-
- **Option:** `templates.user_dir`
- **Type:** String (path)
- **Default:** ~/.ollm/templates
- **Description:** User-level template directory
-
- **Example:**
-
- ```yaml
- templates:
-   user_dir: ~/my-templates
- ```
-
- **Impact:**
-
- - Templates available across all projects
- - Good for personal templates
-
- ### Workspace Template Directory
-
- **Option:** `templates.workspace_dir`
- **Type:** String (path)
- **Default:** .ollm/templates
- **Description:** Project-level template directory
-
- **Example:**
-
- ```yaml
- templates:
-   workspace_dir: .templates
- ```
-
- **Impact:**
-
- - Templates specific to the project
- - Workspace templates override user templates
- - Good for team-shared templates
-
- ---
-
- ## Project Profiles
-
- ### Auto-Detection
-
- **Option:** `project.auto_detect`
- **Type:** Boolean
- **Default:** true
- **Description:** Automatically detect project type
-
- **Example:**
-
- ```yaml
- project:
-   auto_detect: false # Disable auto-detection
- ```
-
- **Impact:**
-
- - `true`: Automatic profile selection based on files
- - `false`: Manual profile selection required
-
- **Detection Rules:**
-
- - TypeScript: package.json with typescript dependency
- - Python: requirements.txt or setup.py
- - Rust: Cargo.toml
- - Go: go.mod
- - Documentation: docs/ directory
-
- ### Manual Profile
-
- **Option:** `project.profile`
- **Type:** String
- **Default:** null
- **Values:** typescript, python, rust, go, documentation
- **Description:** Manually specify project profile
-
- **Example:**
-
- ```yaml
- project:
-   profile: typescript # Force TypeScript profile
- ```
-
- **Impact:**
-
- - Overrides auto-detection
- - Useful when detection fails
- - Applies profile-specific settings
-
- ---
-
- ## Environment Variables
-
- ### Model Override
-
- **Variable:** `OLLM_MODEL`
- **Type:** String
- **Description:** Override default model
-
- **Example:**
-
- ```bash
- export OLLM_MODEL=llama3.1:8b
- ```
-
- **Precedence:** Highest (overrides all config)
-
- ### Temperature
-
- **Variable:** `OLLM_TEMPERATURE`
- **Type:** Number (0.0-2.0)
- **Description:** Override temperature
-
- **Example:**
-
- ```bash
- export OLLM_TEMPERATURE=0.7
- ```
-
- ### Max Tokens
-
- **Variable:** `OLLM_MAX_TOKENS`
- **Type:** Number
- **Description:** Override max tokens
-
- **Example:**
-
- ```bash
- export OLLM_MAX_TOKENS=2048
- ```
-
- ### Context Size
-
- **Variable:** `OLLM_CONTEXT_SIZE`
- **Type:** Number
- **Description:** Override context window
-
- **Example:**
-
- ```bash
- export OLLM_CONTEXT_SIZE=8192
- ```
-
- ### Ollama Host
-
- **Variable:** `OLLAMA_HOST`
- **Type:** String (URL)
- **Default:** http://localhost:11434
- **Description:** Ollama server URL
-
- **Example:**
-
- ```bash
- export OLLAMA_HOST=http://remote-server:11434
- ```
-
- ### Log Level
-
- **Variable:** `OLLM_LOG_LEVEL`
- **Type:** String
- **Values:** debug, info, warn, error
- **Default:** info
- **Description:** Logging verbosity
-
- **Example:**
-
- ```bash
- export OLLM_LOG_LEVEL=debug
- ```
-
- ---
-
- ## Configuration Precedence
-
- ### Order (Highest to Lowest)
-
- 1. **Environment Variables** (highest)
-    - `OLLM_MODEL`, `OLLM_TEMPERATURE`, etc.
-    - Override everything
-
- 2. **Command-Line Arguments**
-    - `--model`, `--temperature`, etc.
-    - Override config files
-
- 3. **Project Configuration**
-    - `.ollm/config.yaml`
-    - Project-specific settings
-
- 4. **User Configuration**
-    - `~/.ollm/config.yaml`
-    - Global settings
-
- 5. **Defaults** (lowest)
-    - Built-in defaults
-    - Used when nothing else specified
-
- ### Example
-
- ```yaml
- # User config (~/.ollm/config.yaml)
- model: llama3.1:8b
- temperature: 0.7
-
- # Project config (.ollm/config.yaml)
- model: codellama:13b # Overrides user config
-
- # Environment variable
- export OLLM_MODEL=mistral:7b # Overrides project config
-
- # Final result: mistral:7b is used
- ```
-
- ---
-
- ## Complete Configuration Example
-
- ### User Configuration
-
- ```yaml
- # ~/.ollm/config.yaml
-
- # Model management
- models:
-   cache_ttl: 60
-   keep_alive: 300
-   auto_pull: false
-
- # Default model
- model: llama3.1:8b
-
- # Generation options
- options:
-   temperature: 0.7
-   top_p: 0.9
-   top_k: 40
-   repeat_penalty: 1.1
-
- # Routing
- routing:
-   profile: general
-   preferred_families:
-     - llama
-     - mistral
-   fallback_profile: fast
-
- # Memory
- memory:
-   enabled: true
-   system_prompt_budget: 500
-   file: ~/.ollm/memory.json
-
- # Templates
- templates:
-   user_dir: ~/.ollm/templates
-   workspace_dir: .ollm/templates
-
- # Project profiles
- project:
-   auto_detect: true
-   profile: null
-
- # Logging
- log_level: info
- ```
-
- ### Project Configuration
-
- ```yaml
- # .ollm/config.yaml
-
- # Project-specific model
- model: codellama:13b
-
- # Project-specific routing
- routing:
-   profile: code
-   preferred_families:
-     - codellama
-     - deepseek-coder
-     - qwen
-
- # Higher memory budget for this project
- memory:
-   system_prompt_budget: 1000
-
- # Project profile
- project:
-   profile: typescript
- ```
-
- ### Project Profile
-
- ```yaml
- # .ollm/project.yaml
-
- # Project metadata
- name: my-web-app
- type: typescript
- version: 1.0.0
- description: A TypeScript web application
-
- # Model settings
- model: llama3.1:8b
- system_prompt: |
-   You are a TypeScript expert helping with a React web application.
-   Follow TypeScript best practices, use modern React patterns,
-   and prioritize type safety.
-
- # Tool restrictions
- tools:
-   allowed:
-     - read_file
-     - write_file
-     - shell
-     - grep
-     - glob
-   denied:
-     - web_fetch
-     - web_search
-
- # Routing
- routing:
-   profile: code
-   preferred_families:
-     - llama
-     - qwen
-     - codellama
-
- # Memory
- memory:
-   enabled: true
-   system_prompt_budget: 800
- ```
-
- ---
-
- ## Configuration Tips
-
- ### Performance Optimization
-
- ```yaml
- # Fast responses
- models:
-   cache_ttl: 120 # Cache longer
-   keep_alive: 600 # Keep models loaded
-
- routing:
-   profile: fast # Use smaller models
-   preferred_families:
-     - phi # Fast models
-     - gemma
- ```
-
- ### Quality Optimization
-
- ```yaml
- # Best quality
- models:
-   keep_alive: -1 # Keep loaded indefinitely
-
- routing:
-   profile: general # Use larger models
-   preferred_families:
-     - llama # High-quality models
-     - qwen
-
- options:
-   temperature: 0.3 # More focused
-   top_p: 0.95
- ```
-
- ### Memory Optimization
-
- ```yaml
- # Minimize memory usage
- models:
-   keep_alive: 0 # Unload immediately
-
- memory:
-   enabled: false # Disable memory injection
-
- routing:
-   profile: fast # Use smaller models
- ```
-
- ### Development Setup
-
- ```yaml
- # Development configuration
- models:
-   auto_pull: true # Auto-download models
-
- memory:
-   enabled: true
-   system_prompt_budget: 1000
-
- templates:
-   workspace_dir: .templates
-
- project:
-   auto_detect: true
-
- log_level: debug # Verbose logging
- ```
-
- ---
-
- ## Troubleshooting
-
- ### Model Not Found
-
- **Problem:** Model not available
-
- **Solution:**
-
- ```yaml
- models:
-   auto_pull: true # Enable auto-pull
- ```
-
- Or manually pull:
-
- ```bash
- /model pull llama3.1:8b
- ```
-
- ### Slow Responses
-
- **Problem:** Model takes too long to respond
-
- **Solution:**
-
- ```yaml
- routing:
-   profile: fast # Use faster models
-   preferred_families:
-     - phi
-     - gemma
-
- models:
-   keep_alive: 600 # Keep models loaded
- ```
-
- ### Memory Not Injected
-
- **Problem:** Memories not appearing in responses
-
- **Solution:**
-
- ```yaml
- memory:
-   enabled: true # Enable memory
-   system_prompt_budget: 1000 # Increase budget
- ```
-
- ### Wrong Model Selected
-
- **Problem:** Routing selects unexpected model
-
- **Solution:**
-
- ```yaml
- model: llama3.1:8b # Manual override
- ```
-
- Or adjust routing:
-
- ```yaml
- routing:
-   preferred_families:
-     - llama # Prefer specific family
- ```
-
- ---
-
- ## See Also
-
- ### User Documentation
-
- - [Getting Started](3%20projects/OLLM%20CLI/LLM%20Models/getting-started.md) - Quick start guide
- - [Commands Reference](Models_commands.md) - CLI commands
- - [Architecture](Models_architecture.md) - System design
- - [Routing Guide](3%20projects/OLLM%20CLI/LLM%20Models/routing/user-guide.md) - Routing details
- - [Memory Guide](3%20projects/OLLM%20CLI/LLM%20Models/memory/user-guide.md) - Memory system
- - [Template Guide](3%20projects/OLLM%20CLI/LLM%20Models/templates/user-guide.md) - Templates
- - [Profile Guide](3%20projects/OLLM%20CLI/LLM%20Models/profiles/user-guide.md) - Project profiles
-
- ### Developer Documentation
-
- - [Model Compiler System](../../.dev/docs/knowledgeDB/dev_ModelCompiler.md) - Profile compilation system
- - [Model Database](../../.dev/docs/knowledgeDB/dev_ModelDB.md) - Database schema and access patterns
- - [Model Management](../../.dev/docs/knowledgeDB/dev_ModelManagement.md) - Model selection and profiles
-
- ---
-
- **Document Version:** 1.1
- **Last Updated:** 2026-01-27
- **Status:** Complete