agent-cli 0.70.4__py3-none-any.whl → 0.71.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: agent-cli
3
- Version: 0.70.4
3
+ Version: 0.71.0
4
4
  Summary: A suite of AI-powered command-line tools for text correction, audio transcription, and voice assistance.
5
5
  Project-URL: Homepage, https://github.com/basnijholt/agent-cli
6
6
  Author-email: Bas Nijholt <bas@nijho.lt>
@@ -80,6 +80,8 @@ Requires-Dist: pytest-timeout; extra == 'test'
80
80
  Requires-Dist: pytest>=7.0.0; extra == 'test'
81
81
  Provides-Extra: vad
82
82
  Requires-Dist: silero-vad>=5.1; extra == 'vad'
83
+ Provides-Extra: wyoming
84
+ Requires-Dist: wyoming>=1.5.2; extra == 'wyoming'
83
85
  Description-Content-Type: text/markdown
84
86
 
85
87
  # Agent CLI
@@ -132,7 +134,7 @@ Since then I have expanded the tool with many more features, all focused on loca
132
134
  - **[`memory`](docs/commands/memory.md)**: Long-term memory system with `memory proxy` and `memory add`.
133
135
  - **[`rag-proxy`](docs/commands/rag-proxy.md)**: RAG proxy server for chatting with your documents.
134
136
  - **[`dev`](docs/commands/dev.md)**: Parallel development with git worktrees and AI coding agents.
135
- - **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
137
+ - **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI-compatible APIs), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
136
138
  - **[`transcribe-daemon`](docs/commands/transcribe-daemon.md)**: Continuous background transcription with VAD. Install with `uv tool install "agent-cli[vad]" -p 3.13`.
137
139
 
138
140
  ## Quick Start
@@ -496,21 +498,43 @@ agent-cli install-extras rag memory vad
496
498
 
497
499
  Usage: agent-cli install-extras [OPTIONS] [EXTRAS]...
498
500
 
499
- Install optional extras (rag, memory, vad, etc.) with pinned versions.
501
+ Install optional dependencies with pinned, compatible versions.
502
+
503
+ Many agent-cli features require optional dependencies. This command installs them with
504
+ version pinning to ensure compatibility. Dependencies persist across uv tool upgrade
505
+ when installed via uv tool.
506
+
507
+ Available extras:
508
+
509
+ • rag - RAG proxy server (ChromaDB, embeddings)
510
+ • memory - Long-term memory proxy (ChromaDB)
511
+ • vad - Voice Activity Detection (silero-vad)
512
+ • audio - Local audio recording/playback
513
+ • piper - Local Piper TTS engine
514
+ • kokoro - Kokoro neural TTS engine
515
+ • faster-whisper - Whisper ASR for CUDA/CPU
516
+ • mlx-whisper - Whisper ASR for Apple Silicon
517
+ • wyoming - Wyoming protocol for ASR/TTS servers
518
+ • server - FastAPI server components
519
+ • speed - Audio speed adjustment
520
+ • llm - LLM framework (pydantic-ai)
500
521
 
501
522
  Examples:
502
523
 
503
- • agent-cli install-extras rag # Install RAG dependencies
504
- agent-cli install-extras memory vad # Install multiple extras
505
- agent-cli install-extras --list # Show available extras
506
- agent-cli install-extras --all # Install all extras
524
+
525
+ agent-cli install-extras rag # Install RAG dependencies
526
+ agent-cli install-extras memory vad # Install multiple extras
527
+ agent-cli install-extras --list # Show available extras
528
+ agent-cli install-extras --all # Install all extras
529
+
507
530
 
508
531
  ╭─ Arguments ────────────────────────────────────────────────────────────────────────────╮
509
- │ extras [EXTRAS]... Extras to install
532
+ │ extras [EXTRAS]... Extras to install: rag, memory, vad, audio, piper, kokoro,
533
+ │ faster-whisper, mlx-whisper, wyoming, server, speed, llm │
510
534
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
511
535
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
512
- │ --list -l List available extras
513
- │ --all -a Install all available extras
536
+ │ --list -l Show available extras with descriptions (what each one enables)
537
+ │ --all -a Install all available extras at once
514
538
  │ --help -h Show this message and exit. │
515
539
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
516
540
 
@@ -569,13 +593,21 @@ agent-cli config edit
569
593
 
570
594
  Manage agent-cli configuration files.
571
595
 
596
+ Config files are TOML format and searched in order:
597
+
598
+ 1 ./agent-cli-config.toml (project-local)
599
+ 2 ~/.config/agent-cli/config.toml (user default)
600
+
601
+ Settings in [defaults] apply to all commands. Override per-command with sections like
602
+ [chat] or [transcribe]. CLI arguments override config file settings.
603
+
572
604
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
573
605
  │ --help -h Show this message and exit. │
574
606
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
575
607
  ╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
576
- │ init Create a new config file with all options commented out.
608
+ │ init Create a new config file with all options as commented-out examples.
577
609
  │ edit Open the config file in your default editor. │
578
- │ show Display the config file location and contents.
610
+ │ show Display the active config file path and contents.
579
611
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
580
612
 
581
613
  ```
@@ -633,10 +665,37 @@ the `[defaults]` section of your configuration file.
633
665
 
634
666
  Usage: agent-cli autocorrect [OPTIONS] [TEXT]
635
667
 
636
- Correct text from clipboard using a local or remote LLM.
668
+ Fix grammar, spelling, and punctuation using an LLM.
669
+
670
+ Reads text from clipboard (or argument), sends to LLM for correction, and copies the
671
+ result back to clipboard. Only makes technical corrections without changing meaning or
672
+ tone.
673
+
674
+ Workflow:
675
+
676
+ 1 Read text from clipboard (or TEXT argument)
677
+ 2 Send to LLM for grammar/spelling/punctuation fixes
678
+ 3 Copy corrected text to clipboard (unless --json)
679
+ 4 Display result
680
+
681
+ Examples:
682
+
683
+
684
+ # Correct text from clipboard (default)
685
+ agent-cli autocorrect
686
+
687
+ # Correct specific text
688
+ agent-cli autocorrect "this is incorect"
689
+
690
+ # Use OpenAI instead of local Ollama
691
+ agent-cli autocorrect --llm-provider openai
692
+
693
+ # Get JSON output for scripting (disables clipboard)
694
+ agent-cli autocorrect --json
695
+
637
696
 
638
697
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
639
- │ text [TEXT] The text to correct. If not provided, reads from clipboard.
698
+ │ text [TEXT] Text to correct. If omitted, reads from system clipboard.
640
699
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
641
700
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
642
701
  │ --help -h Show this message and exit. │
@@ -680,9 +739,8 @@ the `[defaults]` section of your configuration file.
680
739
  │ [default: info] │
681
740
  │ --log-file TEXT Path to a file to write logs to. │
682
741
  │ --quiet -q Suppress console output from rich. │
683
- │ --json Output result as JSON for
684
- automation. Implies --quiet and
685
- │ --no-clipboard. │
742
+ │ --json Output result as JSON (implies
743
+ │ --quiet and --no-clipboard).
686
744
  │ --config TEXT Path to a TOML configuration file. │
687
745
  │ --print-args Print the command line arguments, │
688
746
  │ including variables taken from the │
@@ -730,30 +788,50 @@ the `[defaults]` section of your configuration file.
730
788
 
731
789
  Usage: agent-cli transcribe [OPTIONS]
732
790
 
733
- Wyoming ASR Client for streaming microphone audio to a transcription server.
791
+ Record audio from microphone and transcribe to text.
792
+
793
+ Records until you press Ctrl+C (or send SIGINT), then transcribes using your configured
794
+ ASR provider. The transcript is copied to the clipboard by default.
795
+
796
+ With --llm: Passes the raw transcript through an LLM to clean up speech recognition
797
+ errors, add punctuation, remove filler words, and improve readability.
798
+
799
+ With --toggle: Bind to a hotkey for push-to-talk. First call starts recording, second
800
+ call stops and transcribes.
801
+
802
+ Examples:
803
+
804
+ • Record and transcribe: agent-cli transcribe
805
+ • With LLM cleanup: agent-cli transcribe --llm
806
+ • Re-transcribe last recording: agent-cli transcribe --last-recording 1
734
807
 
735
808
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
736
809
  │ --help -h Show this message and exit. │
737
810
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
738
811
  ╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
739
- │ --extra-instructions TEXT Additional instructions for the LLM to
740
- process the transcription.
741
- │ --llm --no-llm Use an LLM to process the transcript.
812
+ │ --extra-instructions TEXT Extra instructions appended to the LLM │
813
+ cleanup prompt (requires --llm).
814
+ │ --llm --no-llm Clean up transcript with LLM: fix errors,
815
+ │ add punctuation, remove filler words. Uses │
816
+ │ --extra-instructions if set (via CLI or │
817
+ │ config file). │
742
818
  │ [default: no-llm] │
743
819
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
744
820
  ╭─ Audio Recovery ───────────────────────────────────────────────────────────────────────╮
745
- │ --from-file PATH Transcribe audio from a file
746
- (supports wav, mp3, m4a, ogg,
747
- │ flac, aac, webm). Requires ffmpeg
748
- │ for non-WAV formats with Wyoming
749
- provider.
750
- │ --last-recording INTEGER Transcribe a saved recording. Use
751
- │ 1 for most recent, 2 for
752
- second-to-last, etc. Use 0 to
753
- disable (default).
821
+ │ --from-file PATH Transcribe from audio file instead
822
+ of microphone. Supports wav, mp3,
823
+ m4a, ogg, flac, aac, webm.
824
+ Requires ffmpeg for non-WAV
825
+ formats with Wyoming.
826
+ │ --last-recording INTEGER Re-transcribe a saved recording
827
+ (1=most recent, 2=second-to-last,
828
+ │ etc). Useful after connection
829
+ failures or to retry with
830
+ │ different options. │
754
831
  │ [default: 0] │
755
- │ --save-recording --no-save-recording Save the audio recording to disk
756
- │ for recovery.
832
+ │ --save-recording --no-save-recording Save recordings to
833
+ ~/.cache/agent-cli/ for
834
+ │ --last-recording recovery. │
757
835
  │ [default: save-recording] │
758
836
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
759
837
  ╭─ Provider Selection ───────────────────────────────────────────────────────────────────╮
@@ -765,10 +843,12 @@ the `[defaults]` section of your configuration file.
765
843
  │ [default: ollama] │
766
844
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
767
845
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
768
- │ --input-device-index INTEGER Index of the audio input device to use.
769
- --input-device-name TEXT Device name keywords for partial matching.
770
- │ --list-devices List available audio input and output devices and
771
- exit.
846
+ │ --input-device-index INTEGER Audio input device index (see --list-devices).
847
+ Uses system default if omitted.
848
+ │ --input-device-name TEXT Select input device by name substring (e.g.,
849
+ MacBook or USB).
850
+ │ --list-devices List available audio devices with their indices │
851
+ │ and exit. │
772
852
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
773
853
  ╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
774
854
  │ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
@@ -823,10 +903,9 @@ the `[defaults]` section of your configuration file.
823
903
  │ [env var: GEMINI_API_KEY] │
824
904
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
825
905
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
826
- │ --stop Stop any running background process.
827
- │ --status Check if a background process is running.
828
- │ --toggle Toggle the background process on/off. If the process is running, it
829
- │ will be stopped. If the process is not running, it will be started. │
906
+ │ --stop Stop any running instance of this command.
907
+ │ --status Check if an instance is currently running.
908
+ │ --toggle Start if not running, stop if running. Ideal for hotkey binding.
830
909
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
831
910
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
832
911
  │ --clipboard --no-clipboard Copy result to │
@@ -840,9 +919,8 @@ the `[defaults]` section of your configuration file.
840
919
  │ --quiet -q Suppress console │
841
920
  │ output from rich. │
842
921
  │ --json Output result as JSON │
843
- for automation.
844
- Implies --quiet and
845
- │ --no-clipboard. │
922
+ (implies --quiet and
923
+ │ --no-clipboard).
846
924
  │ --config TEXT Path to a TOML │
847
925
  │ configuration file. │
848
926
  │ --print-args Print the command │
@@ -850,11 +928,13 @@ the `[defaults]` section of your configuration file.
850
928
  │ including variables │
851
929
  │ taken from the │
852
930
  │ configuration file. │
853
- │ --transcription-log PATH Path to log
854
- transcription results
855
- with timestamps,
856
- hostname, model, and
857
- raw output.
931
+ │ --transcription-log PATH Append transcripts to │
932
+ JSONL file
933
+ (timestamp, hostname,
934
+ │ model, raw/processed
935
+ text). Recent entries
936
+ │ provide context for │
937
+ │ LLM cleanup. │
858
938
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
859
939
 
860
940
  ```
@@ -910,46 +990,76 @@ uv tool install "agent-cli[vad]" -p 3.13
910
990
 
911
991
  Usage: agent-cli transcribe-daemon [OPTIONS]
912
992
 
913
- Run a continuous transcription daemon with voice activity detection.
993
+ Continuous transcription daemon using Silero VAD for speech detection.
994
+
995
+ Unlike transcribe (single recording session), this daemon runs indefinitely and
996
+ automatically detects speech segments using Voice Activity Detection (VAD). Each
997
+ detected segment is transcribed and logged with timestamps.
914
998
 
915
- This command runs indefinitely, capturing audio from your microphone, detecting speech
916
- segments using Silero VAD, transcribing them, and logging results with timestamps.
999
+ How it works:
917
1000
 
918
- Examples: # Basic daemon agent-cli transcribe-daemon
1001
+ 1 Listens continuously to microphone input
1002
+ 2 Silero VAD detects when you start/stop speaking
1003
+ 3 After --silence-threshold seconds of silence, the segment is finalized
1004
+ 4 Segment is transcribed (and optionally cleaned by LLM with --llm)
1005
+ 5 Results are appended to the JSONL log file
1006
+ 6 Audio is saved as MP3 if --save-audio is enabled (requires ffmpeg)
919
1007
 
1008
+ Use cases: Meeting transcription, note-taking, voice journaling, accessibility.
920
1009
 
921
- # With role and custom silence threshold
1010
+ Examples:
1011
+
1012
+
1013
+ agent-cli transcribe-daemon
922
1014
  agent-cli transcribe-daemon --role meeting --silence-threshold 1.5
1015
+ agent-cli transcribe-daemon --llm --clipboard --role notes
1016
+ agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --no-save-audio
1017
+ agent-cli transcribe-daemon --asr-provider openai --llm-provider gemini --llm
923
1018
 
924
- # With LLM cleanup
925
- agent-cli transcribe-daemon --llm --role notes
926
1019
 
927
- # Custom log file and audio directory
928
- agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --audio-dir ~/audio
1020
+ Tips:
929
1021
 
1022
+ • Use --role to tag entries (e.g., speaker1, meeting, personal)
1023
+ • Adjust --vad-threshold if detection is too sensitive (increase) or missing speech
1024
+ (decrease)
1025
+ • Use --stop to cleanly terminate a running daemon
1026
+ • With --llm, transcripts are cleaned up (punctuation, filler words removed)
930
1027
 
931
1028
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
932
- │ --role -r TEXT Role name for logging (e.g.,
933
- 'meeting', 'notes', 'user').
1029
+ │ --role -r TEXT Label for log entries. Use to
1030
+ distinguish speakers or contexts in
1031
+ │ logs. │
934
1032
  │ [default: user] │
935
- │ --silence-threshold -s FLOAT Seconds of silence to end a speech
936
- │ segment.
1033
+ │ --silence-threshold -s FLOAT Seconds of silence after speech to
1034
+ finalize a segment. Increase for
1035
+ │ slower speakers. │
937
1036
  │ [default: 1.0] │
938
- │ --min-segment -m FLOAT Minimum speech duration in seconds
939
- to trigger a segment.
1037
+ │ --min-segment -m FLOAT Minimum seconds of speech required
1038
+ before a segment is processed.
1039
+ │ Filters brief sounds. │
940
1040
  │ [default: 0.25] │
941
- │ --vad-threshold FLOAT VAD speech detection threshold
942
- │ (0.0-1.0). Higher = more aggressive
943
- filtering.
1041
+ │ --vad-threshold FLOAT Silero VAD confidence threshold
1042
+ │ (0.0-1.0). Higher values require
1043
+ clearer speech; lower values are
1044
+ │ more sensitive to quiet/distant │
1045
+ │ voices. │
944
1046
  │ [default: 0.3] │
945
- │ --save-audio --no-save-audio Save audio segments as MP3 files.
1047
+ │ --save-audio --no-save-audio Save each speech segment as MP3.
1048
+ │ Requires ffmpeg to be installed. │
946
1049
  │ [default: save-audio] │
947
- │ --audio-dir PATH Directory for MP3 files. Default:
948
- ~/.config/agent-cli/audio
949
- --transcription-log -t PATH JSON Lines log file path. Default:
1050
+ │ --audio-dir PATH Base directory for MP3 files. Files
1051
+ are organized by date:
1052
+ YYYY/MM/DD/HHMMSS_mmm.mp3. Default:
1053
+ │ ~/.config/agent-cli/audio. │
1054
+ │ --transcription-log -t PATH JSONL file for transcript logging │
1055
+ │ (one JSON object per line with │
1056
+ │ timestamp, role, raw/processed │
1057
+ │ text, audio path). Default: │
950
1058
  │ ~/.config/agent-cli/transcriptions… │
951
- │ --clipboard --no-clipboard Copy each transcription to
952
- │ clipboard.
1059
+ │ --clipboard --no-clipboard Copy each completed transcription
1060
+ to clipboard (overwrites previous).
1061
+ │ Useful with --llm to get cleaned │
1062
+ │ text. │
953
1063
  │ [default: no-clipboard] │
954
1064
  │ --help -h Show this message and exit. │
955
1065
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
@@ -962,10 +1072,12 @@ uv tool install "agent-cli[vad]" -p 3.13
962
1072
  │ [default: ollama] │
963
1073
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
964
1074
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
965
- │ --input-device-index INTEGER Index of the audio input device to use.
966
- --input-device-name TEXT Device name keywords for partial matching.
967
- │ --list-devices List available audio input and output devices and
968
- exit.
1075
+ │ --input-device-index INTEGER Audio input device index (see --list-devices).
1076
+ Uses system default if omitted.
1077
+ │ --input-device-name TEXT Select input device by name substring (e.g.,
1078
+ MacBook or USB).
1079
+ │ --list-devices List available audio devices with their indices │
1080
+ │ and exit. │
969
1081
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
970
1082
  ╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
971
1083
  │ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
@@ -1020,12 +1132,14 @@ uv tool install "agent-cli[vad]" -p 3.13
1020
1132
  │ [env var: GEMINI_API_KEY] │
1021
1133
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1022
1134
  ╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
1023
- │ --llm --no-llm Use an LLM to process the transcript.
1135
+ │ --llm --no-llm Clean up transcript with LLM: fix errors, add punctuation,
1136
+ │ remove filler words. Uses --extra-instructions if set (via CLI │
1137
+ │ or config file). │
1024
1138
  │ [default: no-llm] │
1025
1139
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1026
1140
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
1027
- │ --stop Stop any running background process.
1028
- │ --status Check if a background process is running.
1141
+ │ --stop Stop any running instance of this command.
1142
+ │ --status Check if an instance is currently running.
1029
1143
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1030
1144
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1031
1145
  │ --log-level [debug|info|warning|error] Set logging level. │
@@ -1079,10 +1193,25 @@ uv tool install "agent-cli[vad]" -p 3.13
1079
1193
 
1080
1194
  Usage: agent-cli speak [OPTIONS] [TEXT]
1081
1195
 
1082
- Convert text to speech using Wyoming or OpenAI-compatible TTS server.
1196
+ Convert text to speech and play audio through speakers.
1197
+
1198
+ By default, synthesized audio plays immediately. Use --save-file to save to a WAV file
1199
+ instead (skips playback).
1200
+
1201
+ Text can be provided as an argument or read from clipboard automatically.
1202
+
1203
+ Examples:
1204
+
1205
+ Speak text directly: agent-cli speak "Hello, world!"
1206
+
1207
+ Speak clipboard contents: agent-cli speak
1208
+
1209
+ Save to file instead of playing: agent-cli speak "Hello" --save-file greeting.wav
1210
+
1211
+ Use OpenAI-compatible TTS: agent-cli speak "Hello" --tts-provider openai
1083
1212
 
1084
1213
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1085
- │ text [TEXT] Text to speak. Reads from clipboard if not provided.
1214
+ │ text [TEXT] Text to synthesize. If not provided, reads from clipboard.
1086
1215
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1087
1216
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1088
1217
  │ --help -h Show this message and exit. │
@@ -1094,9 +1223,10 @@ uv tool install "agent-cli[vad]" -p 3.13
1094
1223
  │ [default: wyoming] │
1095
1224
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1096
1225
  ╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
1097
- │ --output-device-index INTEGER Index of the audio output device to use for TTS.
1098
- --output-device-name TEXT Output device name keywords for partial
1099
- matching.
1226
+ │ --output-device-index INTEGER Audio output device index (see --list-devices
1227
+ for available devices).
1228
+ --output-device-name TEXT Partial match on device name (e.g., 'speakers',
1229
+ │ 'headphones'). │
1100
1230
  │ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, 2.0 = │
1101
1231
  │ twice as fast, 0.5 = half speed). │
1102
1232
  │ [default: 1.0] │
@@ -1114,7 +1244,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1114
1244
  ╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
1115
1245
  │ --tts-openai-model TEXT The OpenAI model to use for TTS. │
1116
1246
  │ [default: tts-1] │
1117
- │ --tts-openai-voice TEXT The voice to use for OpenAI-compatible TTS.
1247
+ │ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx,
1248
+ │ nova, shimmer). │
1118
1249
  │ [default: alloy] │
1119
1250
  │ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
1120
1251
  │ (e.g., http://localhost:8000/v1 for a proxy). │
@@ -1140,28 +1271,27 @@ uv tool install "agent-cli[vad]" -p 3.13
1140
1271
  │ [env var: GEMINI_API_KEY] │
1141
1272
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1142
1273
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
1143
- │ --list-devices List available audio input and output devices and exit.
1274
+ │ --list-devices List available audio devices with their indices and exit.
1144
1275
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1145
1276
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1146
- │ --save-file PATH Save TTS response audio to WAV file.
1277
+ │ --save-file PATH Save audio to WAV file instead of
1278
+ │ playing through speakers. │
1147
1279
  │ --log-level [debug|info|warning|error] Set logging level. │
1148
1280
  │ [env var: LOG_LEVEL] │
1149
1281
  │ [default: info] │
1150
1282
  │ --log-file TEXT Path to a file to write logs to. │
1151
1283
  │ --quiet -q Suppress console output from rich. │
1152
- │ --json Output result as JSON for
1153
- automation. Implies --quiet and
1154
- │ --no-clipboard. │
1284
+ │ --json Output result as JSON (implies
1285
+ │ --quiet and --no-clipboard).
1155
1286
  │ --config TEXT Path to a TOML configuration file. │
1156
1287
  │ --print-args Print the command line arguments, │
1157
1288
  │ including variables taken from the │
1158
1289
  │ configuration file. │
1159
1290
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1160
1291
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
1161
- │ --stop Stop any running background process.
1162
- │ --status Check if a background process is running.
1163
- │ --toggle Toggle the background process on/off. If the process is running, it
1164
- │ will be stopped. If the process is not running, it will be started. │
1292
+ │ --stop Stop any running instance of this command.
1293
+ │ --status Check if an instance is currently running.
1294
+ │ --toggle Start if not running, stop if running. Ideal for hotkey binding.
1165
1295
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1166
1296
 
1167
1297
  ```
@@ -1203,16 +1333,23 @@ uv tool install "agent-cli[vad]" -p 3.13
1203
1333
 
1204
1334
  Usage: agent-cli voice-edit [OPTIONS]
1205
1335
 
1206
- Interact with clipboard text via a voice command using local or remote services.
1336
+ Edit or query clipboard text using voice commands.
1337
+
1338
+ Workflow: Captures clipboard text → records your voice command → transcribes it → sends
1339
+ both to an LLM → copies result back to clipboard.
1340
+
1341
+ Use this for hands-free text editing (e.g., "make this more formal") or asking questions
1342
+ about clipboard content (e.g., "summarize this").
1207
1343
 
1208
- Usage:
1344
+ Typical hotkey integration: Run voice-edit & on keypress to start recording, then send
1345
+ SIGINT (via --stop) on second keypress to process.
1346
+
1347
+ Examples:
1209
1348
 
1210
- Run in foreground: agent-cli voice-edit --input-device-index 1
1211
- Run in background: agent-cli voice-edit --input-device-index 1 &
1212
- Check status: agent-cli voice-edit --status
1213
- Stop background process: agent-cli voice-edit --stop
1214
- • List output devices: agent-cli voice-edit --list-output-devices
1215
- • Save TTS to file: agent-cli voice-edit --tts --save-file response.wav
1349
+ Basic usage: agent-cli voice-edit
1350
+ With TTS response: agent-cli voice-edit --tts
1351
+ Toggle on/off: agent-cli voice-edit --toggle
1352
+ List audio devices: agent-cli voice-edit --list-devices
1216
1353
 
1217
1354
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1218
1355
  │ --help -h Show this message and exit. │
@@ -1230,10 +1367,12 @@ uv tool install "agent-cli[vad]" -p 3.13
1230
1367
  │ [default: wyoming] │
1231
1368
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1232
1369
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
1233
- │ --input-device-index INTEGER Index of the audio input device to use.
1234
- --input-device-name TEXT Device name keywords for partial matching.
1235
- │ --list-devices List available audio input and output devices and
1236
- exit.
1370
+ │ --input-device-index INTEGER Audio input device index (see --list-devices).
1371
+ Uses system default if omitted.
1372
+ │ --input-device-name TEXT Select input device by name substring (e.g.,
1373
+ MacBook or USB).
1374
+ │ --list-devices List available audio devices with their indices │
1375
+ │ and exit. │
1237
1376
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1238
1377
  ╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
1239
1378
  │ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
@@ -1284,10 +1423,10 @@ uv tool install "agent-cli[vad]" -p 3.13
1284
1423
  ╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
1285
1424
  │ --tts --no-tts Enable text-to-speech for responses. │
1286
1425
  │ [default: no-tts] │
1287
- │ --output-device-index INTEGER Index of the audio output device to use
1288
- │ for TTS.
1289
- │ --output-device-name TEXT Output device name keywords for partial
1290
- matching.
1426
+ │ --output-device-index INTEGER Audio output device index (see
1427
+ --list-devices for available devices).
1428
+ │ --output-device-name TEXT Partial match on device name (e.g.,
1429
+ 'speakers', 'headphones').
1291
1430
  │ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
1292
1431
  │ 2.0 = twice as fast, 0.5 = half speed). │
1293
1432
  │ [default: 1.0] │
@@ -1305,7 +1444,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1305
1444
  ╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
1306
1445
  │ --tts-openai-model TEXT The OpenAI model to use for TTS. │
1307
1446
  │ [default: tts-1] │
1308
- │ --tts-openai-voice TEXT The voice to use for OpenAI-compatible TTS.
1447
+ │ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx,
1448
+ │ nova, shimmer). │
1309
1449
  │ [default: alloy] │
1310
1450
  │ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
1311
1451
  │ (e.g., http://localhost:8000/v1 for a proxy). │
@@ -1326,14 +1466,14 @@ uv tool install "agent-cli[vad]" -p 3.13
1326
1466
  │ [default: Kore] │
1327
1467
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1328
1468
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
1329
- │ --stop Stop any running background process.
1330
- │ --status Check if a background process is running.
1331
- │ --toggle Toggle the background process on/off. If the process is running, it
1332
- │ will be stopped. If the process is not running, it will be started. │
1469
+ │ --stop Stop any running instance of this command.
1470
+ │ --status Check if an instance is currently running.
1471
+ │ --toggle Start if not running, stop if running. Ideal for hotkey binding.
1333
1472
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1334
1473
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1335
- │ --save-file PATH Save TTS response audio
1336
- to WAV file.
1474
+ │ --save-file PATH Save audio to WAV file
1475
+ instead of playing
1476
+ │ through speakers. │
1337
1477
  │ --clipboard --no-clipboard Copy result to │
1338
1478
  │ clipboard. │
1339
1479
  │ [default: clipboard] │
@@ -1345,9 +1485,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1345
1485
  │ --quiet -q Suppress console output │
1346
1486
  │ from rich. │
1347
1487
  │ --json Output result as JSON │
1348
- for automation. Implies
1349
- │ --quiet and
1350
- │ --no-clipboard. │
1488
+ (implies --quiet and
1489
+ │ --no-clipboard).
1351
1490
  │ --config TEXT Path to a TOML │
1352
1491
  │ configuration file. │
1353
1492
  │ --print-args Print the command line │
@@ -1398,7 +1537,28 @@ uv tool install "agent-cli[vad]" -p 3.13
1398
1537
 
1399
1538
  Usage: agent-cli assistant [OPTIONS]
1400
1539
 
1401
- Wake word-based voice assistant using local or remote services.
1540
+ Hands-free voice assistant using wake word detection.
1541
+
1542
+ Continuously listens for a wake word, then records your speech until you say the wake
1543
+ word again. The recording is transcribed and sent to an LLM for a conversational
1544
+ response, optionally spoken back via TTS.
1545
+
1546
+ Conversation flow:
1547
+
1548
+ 1 Say wake word → starts recording
1549
+ 2 Speak your question/command
1550
+ 3 Say wake word again → stops recording and processes
1551
+
1552
+ The assistant runs in a loop, ready for the next command after each response. Stop with
1553
+ Ctrl+C or --stop.
1554
+
1555
+ Requirements:
1556
+
1557
+ • Wyoming wake word server (e.g., wyoming-openwakeword on port 10400)
1558
+ • Wyoming ASR server (e.g., wyoming-whisper on port 10300)
1559
+ • Optional: TTS server for spoken responses (enable with --tts)
1560
+
1561
+ Example: assistant --wake-word ok_nabu --tts --input-device-name USB
1402
1562
 
1403
1563
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1404
1564
  │ --help -h Show this message and exit. │
@@ -1416,19 +1576,23 @@ uv tool install "agent-cli[vad]" -p 3.13
1416
1576
  │ [default: wyoming] │
1417
1577
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1418
1578
  ╭─ Wake Word ────────────────────────────────────────────────────────────────────────────╮
1419
- │ --wake-server-ip TEXT Wyoming wake word server IP address.
1579
+ │ --wake-server-ip TEXT Wyoming wake word server IP (requires
1580
+ │ wyoming-openwakeword or similar). │
1420
1581
  │ [default: localhost] │
1421
1582
  │ --wake-server-port INTEGER Wyoming wake word server port. │
1422
1583
  │ [default: 10400] │
1423
- │ --wake-word TEXT Name of wake word to detect (e.g., 'ok_nabu', │
1424
- 'hey_jarvis').
1584
+ │ --wake-word TEXT Wake word to detect. Common options: ok_nabu, │
1585
+ │ hey_jarvis, alexa. Must match a model loaded in
1586
+ │ your wake word server. │
1425
1587
  │ [default: ok_nabu] │
1426
1588
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1427
1589
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
1428
- │ --input-device-index INTEGER Index of the audio input device to use.
1429
- --input-device-name TEXT Device name keywords for partial matching.
1430
- │ --list-devices List available audio input and output devices and
1431
- exit.
1590
+ │ --input-device-index INTEGER Audio input device index (see --list-devices).
1591
+ Uses system default if omitted.
1592
+ │ --input-device-name TEXT Select input device by name substring (e.g.,
1593
+ MacBook or USB).
1594
+ │ --list-devices List available audio devices with their indices │
1595
+ │ and exit. │
1432
1596
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1433
1597
  ╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
1434
1598
  │ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
@@ -1479,10 +1643,10 @@ uv tool install "agent-cli[vad]" -p 3.13
1479
1643
  ╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
1480
1644
  │ --tts --no-tts Enable text-to-speech for responses. │
1481
1645
  │ [default: no-tts] │
1482
- │ --output-device-index INTEGER Index of the audio output device to use
1483
- │ for TTS.
1484
- │ --output-device-name TEXT Output device name keywords for partial
1485
- matching.
1646
+ │ --output-device-index INTEGER Audio output device index (see
1647
+ --list-devices for available devices).
1648
+ │ --output-device-name TEXT Partial match on device name (e.g.,
1649
+ 'speakers', 'headphones').
1486
1650
  │ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
1487
1651
  │ 2.0 = twice as fast, 0.5 = half speed). │
1488
1652
  │ [default: 1.0] │
@@ -1500,7 +1664,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1500
1664
  ╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
1501
1665
  │ --tts-openai-model TEXT The OpenAI model to use for TTS. │
1502
1666
  │ [default: tts-1] │
1503
- │ --tts-openai-voice TEXT The voice to use for OpenAI-compatible TTS.
1667
+ │ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx,
1668
+ │ nova, shimmer). │
1504
1669
  │ [default: alloy] │
1505
1670
  │ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
1506
1671
  │ (e.g., http://localhost:8000/v1 for a proxy). │
@@ -1521,14 +1686,14 @@ uv tool install "agent-cli[vad]" -p 3.13
1521
1686
  │ [default: Kore] │
1522
1687
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1523
1688
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
1524
- │ --stop Stop any running background process.
1525
- │ --status Check if a background process is running.
1526
- │ --toggle Toggle the background process on/off. If the process is running, it
1527
- │ will be stopped. If the process is not running, it will be started. │
1689
+ │ --stop Stop any running instance of this command.
1690
+ │ --status Check if an instance is currently running.
1691
+ │ --toggle Start if not running, stop if running. Ideal for hotkey binding.
1528
1692
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1529
1693
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1530
- │ --save-file PATH Save TTS response audio
1531
- to WAV file.
1694
+ │ --save-file PATH Save audio to WAV file
1695
+ instead of playing
1696
+ │ through speakers. │
1532
1697
  │ --clipboard --no-clipboard Copy result to │
1533
1698
  │ clipboard. │
1534
1699
  │ [default: clipboard] │
@@ -1596,7 +1761,39 @@ uv tool install "agent-cli[vad]" -p 3.13
1596
1761
 
1597
1762
  Usage: agent-cli chat [OPTIONS]
1598
1763
 
1599
- An chat agent that you can talk to.
1764
+ Voice-based conversational chat agent with memory and tools.
1765
+
1766
+ Runs an interactive loop: listen → transcribe → LLM → speak response. Conversation
1767
+ history is persisted and included as context for continuity.
1768
+
1769
+ Built-in tools (LLM uses automatically when relevant):
1770
+
1771
+ • add_memory/search_memory/update_memory - persistent long-term memory
1772
+ • duckduckgo_search - web search for current information
1773
+ • read_file/execute_code - file access and shell commands
1774
+
1775
+ Process management: Use --toggle to start/stop via hotkey (bind to a keyboard shortcut),
1776
+ --stop to terminate, or --status to check state.
1777
+
1778
+ Examples:
1779
+
1780
+ Use OpenAI-compatible providers for speech and LLM, with TTS enabled:
1781
+
1782
+
1783
+ agent-cli chat --asr-provider openai --llm-provider openai --tts
1784
+
1785
+
1786
+ Start in background mode (toggle on/off with hotkey):
1787
+
1788
+
1789
+ agent-cli chat --toggle
1790
+
1791
+
1792
+ Use local Ollama LLM with Wyoming ASR:
1793
+
1794
+
1795
+ agent-cli chat --llm-provider ollama
1796
+
1600
1797
 
1601
1798
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1602
1799
  │ --help -h Show this message and exit. │
@@ -1614,10 +1811,12 @@ uv tool install "agent-cli[vad]" -p 3.13
1614
1811
  │ [default: wyoming] │
1615
1812
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1616
1813
  ╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
1617
- │ --input-device-index INTEGER Index of the audio input device to use.
1618
- --input-device-name TEXT Device name keywords for partial matching.
1619
- │ --list-devices List available audio input and output devices and
1620
- exit.
1814
+ │ --input-device-index INTEGER Audio input device index (see --list-devices).
1815
+ Uses system default if omitted.
1816
+ │ --input-device-name TEXT Select input device by name substring (e.g.,
1817
+ MacBook or USB).
1818
+ │ --list-devices List available audio devices with their indices │
1819
+ │ and exit. │
1621
1820
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1622
1821
  ╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
1623
1822
  │ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
@@ -1674,10 +1873,10 @@ uv tool install "agent-cli[vad]" -p 3.13
1674
1873
  ╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
1675
1874
  │ --tts --no-tts Enable text-to-speech for responses. │
1676
1875
  │ [default: no-tts] │
1677
- │ --output-device-index INTEGER Index of the audio output device to use
1678
- │ for TTS.
1679
- │ --output-device-name TEXT Output device name keywords for partial
1680
- matching.
1876
+ │ --output-device-index INTEGER Audio output device index (see
1877
+ --list-devices for available devices).
1878
+ │ --output-device-name TEXT Partial match on device name (e.g.,
1879
+ 'speakers', 'headphones').
1681
1880
  │ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
1682
1881
  │ 2.0 = twice as fast, 0.5 = half speed). │
1683
1882
  │ [default: 1.0] │
@@ -1695,7 +1894,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1695
1894
  ╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
1696
1895
  │ --tts-openai-model TEXT The OpenAI model to use for TTS. │
1697
1896
  │ [default: tts-1] │
1698
- │ --tts-openai-voice TEXT The voice to use for OpenAI-compatible TTS.
1897
+ │ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx,
1898
+ │ nova, shimmer). │
1699
1899
  │ [default: alloy] │
1700
1900
  │ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
1701
1901
  │ (e.g., http://localhost:8000/v1 for a proxy). │
@@ -1716,20 +1916,23 @@ uv tool install "agent-cli[vad]" -p 3.13
1716
1916
  │ [default: Kore] │
1717
1917
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1718
1918
  ╭─ Process Management ───────────────────────────────────────────────────────────────────╮
1719
- │ --stop Stop any running background process.
1720
- │ --status Check if a background process is running.
1721
- │ --toggle Toggle the background process on/off. If the process is running, it
1722
- │ will be stopped. If the process is not running, it will be started. │
1919
+ │ --stop Stop any running instance of this command.
1920
+ │ --status Check if an instance is currently running.
1921
+ │ --toggle Start if not running, stop if running. Ideal for hotkey binding.
1723
1922
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1724
1923
  ╭─ History Options ──────────────────────────────────────────────────────────────────────╮
1725
- │ --history-dir PATH Directory to store conversation history.
1924
+ │ --history-dir PATH Directory for conversation history and long-term
1925
+ │ memory. Both conversation.json and │
1926
+ │ long_term_memory.json are stored here. │
1726
1927
  │ [default: ~/.config/agent-cli/history] │
1727
- │ --last-n-messages INTEGER Number of messages to include in the conversation
1728
- history. Set to 0 to disable history.
1928
+ │ --last-n-messages INTEGER Number of past messages to include as context for
1929
+ the LLM. Set to 0 to start fresh each session
1930
+ │ (memory tools still persist). │
1729
1931
  │ [default: 50] │
1730
1932
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1731
1933
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
1732
- │ --save-file PATH Save TTS response audio to WAV file.
1934
+ │ --save-file PATH Save audio to WAV file instead of
1935
+ │ playing through speakers. │
1733
1936
  │ --log-level [debug|info|warning|error] Set logging level. │
1734
1937
  │ [env var: LOG_LEVEL] │
1735
1938
  │ [default: info] │
@@ -1784,25 +1987,68 @@ uv tool install "agent-cli[vad]" -p 3.13
1784
1987
 
1785
1988
  Usage: agent-cli rag-proxy [OPTIONS]
1786
1989
 
1787
- Start the RAG (Retrieval-Augmented Generation) Proxy Server.
1990
+ Start a RAG proxy server that enables "chat with your documents".
1991
+
1992
+ Watches a folder for documents, indexes them into a vector store, and provides an
1993
+ OpenAI-compatible API at /v1/chat/completions. When you send a chat request, the server
1994
+ retrieves relevant document chunks and injects them as context before forwarding to your
1995
+ LLM backend.
1996
+
1997
+ Quick start:
1998
+
1999
+ • agent-cli rag-proxy — Start with defaults (./rag_docs, OpenAI-compatible API)
2000
+ • agent-cli rag-proxy --docs-folder ~/notes — Index your notes folder
2001
+
2002
+ How it works:
2003
+
2004
+ 1 Documents in --docs-folder are chunked, embedded, and stored in ChromaDB
2005
+ 2 A file watcher auto-reindexes when files change
2006
+ 3 Chat requests trigger a semantic search for relevant chunks
2007
+ 4 Retrieved context is injected into the prompt before forwarding to the LLM
2008
+ 5 Responses include a rag_sources field listing which documents were used
2009
+
2010
+ Supported file formats:
1788
2011
 
1789
- This server watches a folder for documents, indexes them, and provides an
1790
- OpenAI-compatible API that proxies requests to a backend LLM (like llama.cpp), injecting
1791
- relevant context from the documents.
2012
+ Text: .txt, .md, .json, .py, .js, .ts, .yaml, .toml, .rst, etc. Rich documents (via
2013
+ MarkItDown): .pdf, .docx, .pptx, .xlsx, .html, .csv
2014
+
2015
+ API endpoints:
2016
+
2017
+ • POST /v1/chat/completions — Main chat endpoint (OpenAI-compatible)
2018
+ • GET /health — Health check with configuration info
2019
+ • GET /files — List indexed files with chunk counts
2020
+ • POST /reindex — Trigger manual reindex
2021
+ • All other paths are proxied to the LLM backend
2022
+
2023
+ Per-request overrides (in JSON body):
2024
+
2025
+ • rag_top_k: Override --limit for this request
2026
+ • rag_enable_tools: Override --rag-tools for this request
1792
2027
 
1793
2028
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1794
2029
  │ --help -h Show this message and exit. │
1795
2030
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1796
2031
  ╭─ RAG Configuration ────────────────────────────────────────────────────────────────────╮
1797
- │ --docs-folder PATH Folder to watch for documents
2032
+ │ --docs-folder PATH Folder to watch for documents. Files are
2033
+ │ auto-indexed on startup and when changed. │
2034
+ │ Must not overlap with --chroma-path. │
1798
2035
  │ [default: ./rag_docs] │
1799
- │ --chroma-path PATH Path to ChromaDB persistence directory
2036
+ │ --chroma-path PATH ChromaDB storage directory for vector
2037
+ │ embeddings. Must be separate from │
2038
+ │ --docs-folder to avoid indexing database │
2039
+ │ files. │
1800
2040
  │ [default: ./rag_db] │
1801
2041
  │ --limit INTEGER Number of document chunks to retrieve per │
1802
- │ query.
2042
+ │ query. Higher values provide more context
2043
+ │ but use more tokens. Can be overridden │
2044
+ │ per-request via rag_top_k in the JSON │
2045
+ │ body. │
1803
2046
  │ [default: 3] │
1804
- │ --rag-tools --no-rag-tools Allow agent to fetch full documents when
1805
- snippets are insufficient.
2047
+ │ --rag-tools --no-rag-tools Enable read_full_document() tool so the
2048
+ LLM can request full document content when
2049
+ │ retrieved snippets are insufficient. Can │
2050
+ │ be overridden per-request via │
2051
+ │ rag_enable_tools in the JSON body. │
1806
2052
  │ [default: rag-tools] │
1807
2053
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1808
2054
  ╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
@@ -1820,7 +2066,8 @@ uv tool install "agent-cli[vad]" -p 3.13
1820
2066
  ╭─ Server Configuration ─────────────────────────────────────────────────────────────────╮
1821
2067
  │ --host TEXT Host/IP to bind API servers to. │
1822
2068
  │ [default: 0.0.0.0] │
1823
- │ --port INTEGER Port to bind to
2069
+ │ --port INTEGER Port for the RAG proxy API (e.g.,
2070
+ │ http://localhost:8000/v1/chat/completions). │
1824
2071
  │ [default: 8000] │
1825
2072
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1826
2073
  ╭─ General Options ──────────────────────────────────────────────────────────────────────╮
@@ -1910,41 +2157,61 @@ The `memory proxy` command is the core feature—a middleware server that gives
1910
2157
  5 Extracts new facts from the conversation in the background and updates the long-term
1911
2158
  memory store (including handling contradictions).
1912
2159
 
1913
- Use this to give "long-term memory" to any OpenAI-compatible application. Point your
1914
- client's base URL to http://localhost:8100/v1.
2160
+ Example:
2161
+
2162
+
2163
+ # Start proxy pointing to local Ollama
2164
+ agent-cli memory proxy --openai-base-url http://localhost:11434/v1
2165
+
2166
+ # Then configure your chat client to use http://localhost:8100/v1
2167
+ # as its OpenAI base URL. All requests flow through the memory proxy.
2168
+
2169
+
2170
+ Per-request overrides: Clients can include these fields in the request body: memory_id
2171
+ (conversation ID), memory_top_k, memory_recency_weight, memory_score_threshold.
1915
2172
 
1916
2173
  ╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
1917
2174
  │ --help -h Show this message and exit. │
1918
2175
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1919
2176
  ╭─ Memory Configuration ─────────────────────────────────────────────────────────────────╮
1920
- │ --memory-path PATH Path to the memory store (files +
1921
- derived vector index).
2177
+ │ --memory-path PATH Directory for memory storage.
2178
+ Contains entries/ (Markdown
2179
+ │ files) and chroma/ (vector │
2180
+ │ index). Created automatically if │
2181
+ │ it doesn't exist. │
1922
2182
  │ [default: ./memory_db] │
1923
- │ --default-top-k INTEGER Number of memory entries to
1924
- retrieve per query.
2183
+ │ --default-top-k INTEGER Number of relevant memories to
2184
+ inject into each request. Higher
2185
+ │ values provide more context but │
2186
+ │ increase token usage. │
1925
2187
  │ [default: 5] │
1926
- │ --max-entries INTEGER Maximum stored memory entries per │
1927
- conversation (excluding summary).
2188
+ │ --max-entries INTEGER Maximum entries per conversation
2189
+ before oldest are evicted.
2190
+ │ Summaries are preserved │
2191
+ │ separately. │
1928
2192
  │ [default: 500] │
1929
2193
  │ --mmr-lambda FLOAT MMR lambda (0-1): higher favors │
1930
2194
  │ relevance, lower favors │
1931
2195
  │ diversity. │
1932
2196
  │ [default: 0.7] │
1933
- │ --recency-weight FLOAT Recency score weight (0.0-1.0).
1934
- Controls freshness vs. relevance. │
1935
- Default 0.2 (20% recency, 80%
1936
- │ semantic relevance). │
2197
+ │ --recency-weight FLOAT Weight for recency vs semantic
2198
+ relevance (0.0-1.0). At 0.2: 20%
2199
+ │ recency, 80% semantic similarity.
1937
2200
  │ [default: 0.2] │
1938
2201
  │ --score-threshold FLOAT Minimum semantic relevance │
1939
2202
  │ threshold (0.0-1.0). Memories │
1940
2203
  │ below this score are discarded to │
1941
2204
  │ reduce noise. │
1942
2205
  │ [default: 0.35] │
1943
- │ --summarization --no-summarization Enable automatic fact extraction
1944
- and summaries.
2206
+ │ --summarization --no-summarization Extract facts and generate
2207
+ summaries after each turn using
2208
+ │ the LLM. Disable to only store │
2209
+ │ raw conversation turns. │
1945
2210
  │ [default: summarization] │
1946
- │ --git-versioning --no-git-versioning Enable automatic git commit of
1947
- memory changes.
2211
+ │ --git-versioning --no-git-versioning Auto-commit memory changes to
2212
+ git. Initializes a repo in
2213
+ │ --memory-path if needed. Provides │
2214
+ │ full history of memory evolution. │
1948
2215
  │ [default: git-versioning] │
1949
2216
  ╰────────────────────────────────────────────────────────────────────────────────────────╯
1950
2217
  ╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
@@ -2057,12 +2324,16 @@ agent-cli memory add -c work "Project deadline is Friday"
2057
2324
  │ for stdin. Supports JSON array, │
2058
2325
  │ JSON object with 'memories' key, │
2059
2326
  │ or plain text (one per line). │
2060
- │ --conversation-id -c TEXT Conversation ID to add memories
2061
- to.
2327
+ │ --conversation-id -c TEXT Conversation namespace for these
2328
+ memories. Memories are retrieved
2329
+ │ per-conversation unless shared │
2330
+ │ globally. │
2062
2331
  │ [default: default] │
2063
- │ --memory-path PATH Path to the memory store.
2332
+ │ --memory-path PATH Directory for memory storage (same
2333
+ │ as memory proxy --memory-path). │
2064
2334
  │ [default: ./memory_db] │
2065
- │ --git-versioning --no-git-versioning Commit changes to git.
2335
+ │ --git-versioning --no-git-versioning Auto-commit changes to git for
2336
+ │ version history. │
2066
2337
  │ [default: git-versioning] │
2067
2338
  │ --help -h Show this message and exit. │
2068
2339
  ╰────────────────────────────────────────────────────────────────────────────────────────╯