agent-cli 0.70.4__py3-none-any.whl → 0.71.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- agent_cli/_extras.json +2 -1
- agent_cli/_requirements/wyoming.txt +71 -0
- agent_cli/agents/assistant.py +23 -27
- agent_cli/agents/autocorrect.py +29 -3
- agent_cli/agents/chat.py +44 -14
- agent_cli/agents/memory/__init__.py +19 -1
- agent_cli/agents/memory/add.py +3 -3
- agent_cli/agents/memory/proxy.py +19 -10
- agent_cli/agents/rag_proxy.py +41 -9
- agent_cli/agents/speak.py +22 -2
- agent_cli/agents/transcribe.py +20 -2
- agent_cli/agents/transcribe_daemon.py +33 -21
- agent_cli/agents/voice_edit.py +17 -9
- agent_cli/cli.py +25 -2
- agent_cli/config_cmd.py +30 -11
- agent_cli/dev/cli.py +295 -65
- agent_cli/docs_gen.py +18 -8
- agent_cli/install/extras.py +39 -10
- agent_cli/install/hotkeys.py +22 -11
- agent_cli/install/services.py +54 -14
- agent_cli/opts.py +23 -20
- agent_cli/server/cli.py +119 -45
- agent_cli/server/proxy/api.py +12 -1
- agent_cli/services/__init__.py +46 -5
- {agent_cli-0.70.4.dist-info → agent_cli-0.71.0.dist-info}/METADATA +458 -187
- {agent_cli-0.70.4.dist-info → agent_cli-0.71.0.dist-info}/RECORD +29 -28
- {agent_cli-0.70.4.dist-info → agent_cli-0.71.0.dist-info}/WHEEL +0 -0
- {agent_cli-0.70.4.dist-info → agent_cli-0.71.0.dist-info}/entry_points.txt +0 -0
- {agent_cli-0.70.4.dist-info → agent_cli-0.71.0.dist-info}/licenses/LICENSE +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: agent-cli
|
|
3
|
-
Version: 0.
|
|
3
|
+
Version: 0.71.0
|
|
4
4
|
Summary: A suite of AI-powered command-line tools for text correction, audio transcription, and voice assistance.
|
|
5
5
|
Project-URL: Homepage, https://github.com/basnijholt/agent-cli
|
|
6
6
|
Author-email: Bas Nijholt <bas@nijho.lt>
|
|
@@ -80,6 +80,8 @@ Requires-Dist: pytest-timeout; extra == 'test'
|
|
|
80
80
|
Requires-Dist: pytest>=7.0.0; extra == 'test'
|
|
81
81
|
Provides-Extra: vad
|
|
82
82
|
Requires-Dist: silero-vad>=5.1; extra == 'vad'
|
|
83
|
+
Provides-Extra: wyoming
|
|
84
|
+
Requires-Dist: wyoming>=1.5.2; extra == 'wyoming'
|
|
83
85
|
Description-Content-Type: text/markdown
|
|
84
86
|
|
|
85
87
|
# Agent CLI
|
|
@@ -132,7 +134,7 @@ Since then I have expanded the tool with many more features, all focused on loca
|
|
|
132
134
|
- **[`memory`](docs/commands/memory.md)**: Long-term memory system with `memory proxy` and `memory add`.
|
|
133
135
|
- **[`rag-proxy`](docs/commands/rag-proxy.md)**: RAG proxy server for chatting with your documents.
|
|
134
136
|
- **[`dev`](docs/commands/dev.md)**: Parallel development with git worktrees and AI coding agents.
|
|
135
|
-
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
137
|
+
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI-compatible APIs), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
136
138
|
- **[`transcribe-daemon`](docs/commands/transcribe-daemon.md)**: Continuous background transcription with VAD. Install with `uv tool install "agent-cli[vad]" -p 3.13`.
|
|
137
139
|
|
|
138
140
|
## Quick Start
|
|
@@ -496,21 +498,43 @@ agent-cli install-extras rag memory vad
|
|
|
496
498
|
|
|
497
499
|
Usage: agent-cli install-extras [OPTIONS] [EXTRAS]...
|
|
498
500
|
|
|
499
|
-
Install optional
|
|
501
|
+
Install optional dependencies with pinned, compatible versions.
|
|
502
|
+
|
|
503
|
+
Many agent-cli features require optional dependencies. This command installs them with
|
|
504
|
+
version pinning to ensure compatibility. Dependencies persist across uv tool upgrade
|
|
505
|
+
when installed via uv tool.
|
|
506
|
+
|
|
507
|
+
Available extras:
|
|
508
|
+
|
|
509
|
+
• rag - RAG proxy server (ChromaDB, embeddings)
|
|
510
|
+
• memory - Long-term memory proxy (ChromaDB)
|
|
511
|
+
• vad - Voice Activity Detection (silero-vad)
|
|
512
|
+
• audio - Local audio recording/playback
|
|
513
|
+
• piper - Local Piper TTS engine
|
|
514
|
+
• kokoro - Kokoro neural TTS engine
|
|
515
|
+
• faster-whisper - Whisper ASR for CUDA/CPU
|
|
516
|
+
• mlx-whisper - Whisper ASR for Apple Silicon
|
|
517
|
+
• wyoming - Wyoming protocol for ASR/TTS servers
|
|
518
|
+
• server - FastAPI server components
|
|
519
|
+
• speed - Audio speed adjustment
|
|
520
|
+
• llm - LLM framework (pydantic-ai)
|
|
500
521
|
|
|
501
522
|
Examples:
|
|
502
523
|
|
|
503
|
-
|
|
504
|
-
|
|
505
|
-
|
|
506
|
-
|
|
524
|
+
|
|
525
|
+
agent-cli install-extras rag # Install RAG dependencies
|
|
526
|
+
agent-cli install-extras memory vad # Install multiple extras
|
|
527
|
+
agent-cli install-extras --list # Show available extras
|
|
528
|
+
agent-cli install-extras --all # Install all extras
|
|
529
|
+
|
|
507
530
|
|
|
508
531
|
╭─ Arguments ────────────────────────────────────────────────────────────────────────────╮
|
|
509
|
-
│ extras [EXTRAS]... Extras to install
|
|
532
|
+
│ extras [EXTRAS]... Extras to install: rag, memory, vad, audio, piper, kokoro, │
|
|
533
|
+
│ faster-whisper, mlx-whisper, wyoming, server, speed, llm │
|
|
510
534
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
511
535
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
512
|
-
│ --list -l
|
|
513
|
-
│ --all -a Install all available extras
|
|
536
|
+
│ --list -l Show available extras with descriptions (what each one enables) │
|
|
537
|
+
│ --all -a Install all available extras at once │
|
|
514
538
|
│ --help -h Show this message and exit. │
|
|
515
539
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
516
540
|
|
|
@@ -569,13 +593,21 @@ agent-cli config edit
|
|
|
569
593
|
|
|
570
594
|
Manage agent-cli configuration files.
|
|
571
595
|
|
|
596
|
+
Config files are TOML format and searched in order:
|
|
597
|
+
|
|
598
|
+
1 ./agent-cli-config.toml (project-local)
|
|
599
|
+
2 ~/.config/agent-cli/config.toml (user default)
|
|
600
|
+
|
|
601
|
+
Settings in [defaults] apply to all commands. Override per-command with sections like
|
|
602
|
+
[chat] or [transcribe]. CLI arguments override config file settings.
|
|
603
|
+
|
|
572
604
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
573
605
|
│ --help -h Show this message and exit. │
|
|
574
606
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
575
607
|
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
|
|
576
|
-
│ init Create a new config file with all options commented
|
|
608
|
+
│ init Create a new config file with all options as commented-out examples. │
|
|
577
609
|
│ edit Open the config file in your default editor. │
|
|
578
|
-
│ show Display the config file
|
|
610
|
+
│ show Display the active config file path and contents. │
|
|
579
611
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
580
612
|
|
|
581
613
|
```
|
|
@@ -633,10 +665,37 @@ the `[defaults]` section of your configuration file.
|
|
|
633
665
|
|
|
634
666
|
Usage: agent-cli autocorrect [OPTIONS] [TEXT]
|
|
635
667
|
|
|
636
|
-
|
|
668
|
+
Fix grammar, spelling, and punctuation using an LLM.
|
|
669
|
+
|
|
670
|
+
Reads text from clipboard (or argument), sends to LLM for correction, and copies the
|
|
671
|
+
result back to clipboard. Only makes technical corrections without changing meaning or
|
|
672
|
+
tone.
|
|
673
|
+
|
|
674
|
+
Workflow:
|
|
675
|
+
|
|
676
|
+
1 Read text from clipboard (or TEXT argument)
|
|
677
|
+
2 Send to LLM for grammar/spelling/punctuation fixes
|
|
678
|
+
3 Copy corrected text to clipboard (unless --json)
|
|
679
|
+
4 Display result
|
|
680
|
+
|
|
681
|
+
Examples:
|
|
682
|
+
|
|
683
|
+
|
|
684
|
+
# Correct text from clipboard (default)
|
|
685
|
+
agent-cli autocorrect
|
|
686
|
+
|
|
687
|
+
# Correct specific text
|
|
688
|
+
agent-cli autocorrect "this is incorect"
|
|
689
|
+
|
|
690
|
+
# Use OpenAI instead of local Ollama
|
|
691
|
+
agent-cli autocorrect --llm-provider openai
|
|
692
|
+
|
|
693
|
+
# Get JSON output for scripting (disables clipboard)
|
|
694
|
+
agent-cli autocorrect --json
|
|
695
|
+
|
|
637
696
|
|
|
638
697
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
639
|
-
│ text [TEXT]
|
|
698
|
+
│ text [TEXT] Text to correct. If omitted, reads from system clipboard. │
|
|
640
699
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
641
700
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
642
701
|
│ --help -h Show this message and exit. │
|
|
@@ -680,9 +739,8 @@ the `[defaults]` section of your configuration file.
|
|
|
680
739
|
│ [default: info] │
|
|
681
740
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
682
741
|
│ --quiet -q Suppress console output from rich. │
|
|
683
|
-
│ --json Output result as JSON
|
|
684
|
-
│
|
|
685
|
-
│ --no-clipboard. │
|
|
742
|
+
│ --json Output result as JSON (implies │
|
|
743
|
+
│ --quiet and --no-clipboard). │
|
|
686
744
|
│ --config TEXT Path to a TOML configuration file. │
|
|
687
745
|
│ --print-args Print the command line arguments, │
|
|
688
746
|
│ including variables taken from the │
|
|
@@ -730,30 +788,50 @@ the `[defaults]` section of your configuration file.
|
|
|
730
788
|
|
|
731
789
|
Usage: agent-cli transcribe [OPTIONS]
|
|
732
790
|
|
|
733
|
-
|
|
791
|
+
Record audio from microphone and transcribe to text.
|
|
792
|
+
|
|
793
|
+
Records until you press Ctrl+C (or send SIGINT), then transcribes using your configured
|
|
794
|
+
ASR provider. The transcript is copied to the clipboard by default.
|
|
795
|
+
|
|
796
|
+
With --llm: Passes the raw transcript through an LLM to clean up speech recognition
|
|
797
|
+
errors, add punctuation, remove filler words, and improve readability.
|
|
798
|
+
|
|
799
|
+
With --toggle: Bind to a hotkey for push-to-talk. First call starts recording, second
|
|
800
|
+
call stops and transcribes.
|
|
801
|
+
|
|
802
|
+
Examples:
|
|
803
|
+
|
|
804
|
+
• Record and transcribe: agent-cli transcribe
|
|
805
|
+
• With LLM cleanup: agent-cli transcribe --llm
|
|
806
|
+
• Re-transcribe last recording: agent-cli transcribe --last-recording 1
|
|
734
807
|
|
|
735
808
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
736
809
|
│ --help -h Show this message and exit. │
|
|
737
810
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
738
811
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
739
|
-
│ --extra-instructions TEXT
|
|
740
|
-
│
|
|
741
|
-
│ --llm --no-llm
|
|
812
|
+
│ --extra-instructions TEXT Extra instructions appended to the LLM │
|
|
813
|
+
│ cleanup prompt (requires --llm). │
|
|
814
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, │
|
|
815
|
+
│ add punctuation, remove filler words. Uses │
|
|
816
|
+
│ --extra-instructions if set (via CLI or │
|
|
817
|
+
│ config file). │
|
|
742
818
|
│ [default: no-llm] │
|
|
743
819
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
744
820
|
╭─ Audio Recovery ───────────────────────────────────────────────────────────────────────╮
|
|
745
|
-
│ --from-file PATH Transcribe audio
|
|
746
|
-
│
|
|
747
|
-
│ flac, aac, webm
|
|
748
|
-
│ for non-WAV
|
|
749
|
-
│
|
|
750
|
-
│ --last-recording INTEGER
|
|
751
|
-
│ 1
|
|
752
|
-
│
|
|
753
|
-
│
|
|
821
|
+
│ --from-file PATH Transcribe from audio file instead │
|
|
822
|
+
│ of microphone. Supports wav, mp3, │
|
|
823
|
+
│ m4a, ogg, flac, aac, webm. │
|
|
824
|
+
│ Requires ffmpeg for non-WAV │
|
|
825
|
+
│ formats with Wyoming. │
|
|
826
|
+
│ --last-recording INTEGER Re-transcribe a saved recording │
|
|
827
|
+
│ (1=most recent, 2=second-to-last, │
|
|
828
|
+
│ etc). Useful after connection │
|
|
829
|
+
│ failures or to retry with │
|
|
830
|
+
│ different options. │
|
|
754
831
|
│ [default: 0] │
|
|
755
|
-
│ --save-recording --no-save-recording Save
|
|
756
|
-
│ for
|
|
832
|
+
│ --save-recording --no-save-recording Save recordings to │
|
|
833
|
+
│ ~/.cache/agent-cli/ for │
|
|
834
|
+
│ --last-recording recovery. │
|
|
757
835
|
│ [default: save-recording] │
|
|
758
836
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
759
837
|
╭─ Provider Selection ───────────────────────────────────────────────────────────────────╮
|
|
@@ -765,10 +843,12 @@ the `[defaults]` section of your configuration file.
|
|
|
765
843
|
│ [default: ollama] │
|
|
766
844
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
767
845
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
768
|
-
│ --input-device-index INTEGER
|
|
769
|
-
│
|
|
770
|
-
│ --
|
|
771
|
-
│
|
|
846
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
847
|
+
│ Uses system default if omitted. │
|
|
848
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
849
|
+
│ MacBook or USB). │
|
|
850
|
+
│ --list-devices List available audio devices with their indices │
|
|
851
|
+
│ and exit. │
|
|
772
852
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
773
853
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
774
854
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -823,10 +903,9 @@ the `[defaults]` section of your configuration file.
|
|
|
823
903
|
│ [env var: GEMINI_API_KEY] │
|
|
824
904
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
825
905
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
826
|
-
│ --stop Stop any running
|
|
827
|
-
│ --status Check if
|
|
828
|
-
│ --toggle
|
|
829
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
906
|
+
│ --stop Stop any running instance of this command. │
|
|
907
|
+
│ --status Check if an instance is currently running. │
|
|
908
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
830
909
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
831
910
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
832
911
|
│ --clipboard --no-clipboard Copy result to │
|
|
@@ -840,9 +919,8 @@ the `[defaults]` section of your configuration file.
|
|
|
840
919
|
│ --quiet -q Suppress console │
|
|
841
920
|
│ output from rich. │
|
|
842
921
|
│ --json Output result as JSON │
|
|
843
|
-
│
|
|
844
|
-
│
|
|
845
|
-
│ --no-clipboard. │
|
|
922
|
+
│ (implies --quiet and │
|
|
923
|
+
│ --no-clipboard). │
|
|
846
924
|
│ --config TEXT Path to a TOML │
|
|
847
925
|
│ configuration file. │
|
|
848
926
|
│ --print-args Print the command │
|
|
@@ -850,11 +928,13 @@ the `[defaults]` section of your configuration file.
|
|
|
850
928
|
│ including variables │
|
|
851
929
|
│ taken from the │
|
|
852
930
|
│ configuration file. │
|
|
853
|
-
│ --transcription-log PATH
|
|
854
|
-
│
|
|
855
|
-
│
|
|
856
|
-
│
|
|
857
|
-
│
|
|
931
|
+
│ --transcription-log PATH Append transcripts to │
|
|
932
|
+
│ JSONL file │
|
|
933
|
+
│ (timestamp, hostname, │
|
|
934
|
+
│ model, raw/processed │
|
|
935
|
+
│ text). Recent entries │
|
|
936
|
+
│ provide context for │
|
|
937
|
+
│ LLM cleanup. │
|
|
858
938
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
859
939
|
|
|
860
940
|
```
|
|
@@ -910,46 +990,76 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
910
990
|
|
|
911
991
|
Usage: agent-cli transcribe-daemon [OPTIONS]
|
|
912
992
|
|
|
913
|
-
|
|
993
|
+
Continuous transcription daemon using Silero VAD for speech detection.
|
|
994
|
+
|
|
995
|
+
Unlike transcribe (single recording session), this daemon runs indefinitely and
|
|
996
|
+
automatically detects speech segments using Voice Activity Detection (VAD). Each
|
|
997
|
+
detected segment is transcribed and logged with timestamps.
|
|
914
998
|
|
|
915
|
-
|
|
916
|
-
segments using Silero VAD, transcribing them, and logging results with timestamps.
|
|
999
|
+
How it works:
|
|
917
1000
|
|
|
918
|
-
|
|
1001
|
+
1 Listens continuously to microphone input
|
|
1002
|
+
2 Silero VAD detects when you start/stop speaking
|
|
1003
|
+
3 After --silence-threshold seconds of silence, the segment is finalized
|
|
1004
|
+
4 Segment is transcribed (and optionally cleaned by LLM with --llm)
|
|
1005
|
+
5 Results are appended to the JSONL log file
|
|
1006
|
+
6 Audio is saved as MP3 if --save-audio is enabled (requires ffmpeg)
|
|
919
1007
|
|
|
1008
|
+
Use cases: Meeting transcription, note-taking, voice journaling, accessibility.
|
|
920
1009
|
|
|
921
|
-
|
|
1010
|
+
Examples:
|
|
1011
|
+
|
|
1012
|
+
|
|
1013
|
+
agent-cli transcribe-daemon
|
|
922
1014
|
agent-cli transcribe-daemon --role meeting --silence-threshold 1.5
|
|
1015
|
+
agent-cli transcribe-daemon --llm --clipboard --role notes
|
|
1016
|
+
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --no-save-audio
|
|
1017
|
+
agent-cli transcribe-daemon --asr-provider openai --llm-provider gemini --llm
|
|
923
1018
|
|
|
924
|
-
# With LLM cleanup
|
|
925
|
-
agent-cli transcribe-daemon --llm --role notes
|
|
926
1019
|
|
|
927
|
-
|
|
928
|
-
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --audio-dir ~/audio
|
|
1020
|
+
Tips:
|
|
929
1021
|
|
|
1022
|
+
• Use --role to tag entries (e.g., speaker1, meeting, personal)
|
|
1023
|
+
• Adjust --vad-threshold if detection is too sensitive (increase) or missing speech
|
|
1024
|
+
(decrease)
|
|
1025
|
+
• Use --stop to cleanly terminate a running daemon
|
|
1026
|
+
• With --llm, transcripts are cleaned up (punctuation, filler words removed)
|
|
930
1027
|
|
|
931
1028
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
932
|
-
│ --role -r TEXT
|
|
933
|
-
│
|
|
1029
|
+
│ --role -r TEXT Label for log entries. Use to │
|
|
1030
|
+
│ distinguish speakers or contexts in │
|
|
1031
|
+
│ logs. │
|
|
934
1032
|
│ [default: user] │
|
|
935
|
-
│ --silence-threshold -s FLOAT Seconds of silence
|
|
936
|
-
│ segment.
|
|
1033
|
+
│ --silence-threshold -s FLOAT Seconds of silence after speech to │
|
|
1034
|
+
│ finalize a segment. Increase for │
|
|
1035
|
+
│ slower speakers. │
|
|
937
1036
|
│ [default: 1.0] │
|
|
938
|
-
│ --min-segment -m FLOAT Minimum
|
|
939
|
-
│
|
|
1037
|
+
│ --min-segment -m FLOAT Minimum seconds of speech required │
|
|
1038
|
+
│ before a segment is processed. │
|
|
1039
|
+
│ Filters brief sounds. │
|
|
940
1040
|
│ [default: 0.25] │
|
|
941
|
-
│ --vad-threshold FLOAT VAD
|
|
942
|
-
│ (0.0-1.0). Higher
|
|
943
|
-
│
|
|
1041
|
+
│ --vad-threshold FLOAT Silero VAD confidence threshold │
|
|
1042
|
+
│ (0.0-1.0). Higher values require │
|
|
1043
|
+
│ clearer speech; lower values are │
|
|
1044
|
+
│ more sensitive to quiet/distant │
|
|
1045
|
+
│ voices. │
|
|
944
1046
|
│ [default: 0.3] │
|
|
945
|
-
│ --save-audio --no-save-audio Save
|
|
1047
|
+
│ --save-audio --no-save-audio Save each speech segment as MP3. │
|
|
1048
|
+
│ Requires ffmpeg to be installed. │
|
|
946
1049
|
│ [default: save-audio] │
|
|
947
|
-
│ --audio-dir PATH
|
|
948
|
-
│
|
|
949
|
-
│
|
|
1050
|
+
│ --audio-dir PATH Base directory for MP3 files. Files │
|
|
1051
|
+
│ are organized by date: │
|
|
1052
|
+
│ YYYY/MM/DD/HHMMSS_mmm.mp3. Default: │
|
|
1053
|
+
│ ~/.config/agent-cli/audio. │
|
|
1054
|
+
│ --transcription-log -t PATH JSONL file for transcript logging │
|
|
1055
|
+
│ (one JSON object per line with │
|
|
1056
|
+
│ timestamp, role, raw/processed │
|
|
1057
|
+
│ text, audio path). Default: │
|
|
950
1058
|
│ ~/.config/agent-cli/transcriptions… │
|
|
951
|
-
│ --clipboard --no-clipboard Copy each transcription
|
|
952
|
-
│ clipboard.
|
|
1059
|
+
│ --clipboard --no-clipboard Copy each completed transcription │
|
|
1060
|
+
│ to clipboard (overwrites previous). │
|
|
1061
|
+
│ Useful with --llm to get cleaned │
|
|
1062
|
+
│ text. │
|
|
953
1063
|
│ [default: no-clipboard] │
|
|
954
1064
|
│ --help -h Show this message and exit. │
|
|
955
1065
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
@@ -962,10 +1072,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
962
1072
|
│ [default: ollama] │
|
|
963
1073
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
964
1074
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
965
|
-
│ --input-device-index INTEGER
|
|
966
|
-
│
|
|
967
|
-
│ --
|
|
968
|
-
│
|
|
1075
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1076
|
+
│ Uses system default if omitted. │
|
|
1077
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1078
|
+
│ MacBook or USB). │
|
|
1079
|
+
│ --list-devices List available audio devices with their indices │
|
|
1080
|
+
│ and exit. │
|
|
969
1081
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
970
1082
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
971
1083
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1020,12 +1132,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1020
1132
|
│ [env var: GEMINI_API_KEY] │
|
|
1021
1133
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1022
1134
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1023
|
-
│ --llm --no-llm
|
|
1135
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, add punctuation, │
|
|
1136
|
+
│ remove filler words. Uses --extra-instructions if set (via CLI │
|
|
1137
|
+
│ or config file). │
|
|
1024
1138
|
│ [default: no-llm] │
|
|
1025
1139
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1026
1140
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1027
|
-
│ --stop Stop any running
|
|
1028
|
-
│ --status Check if
|
|
1141
|
+
│ --stop Stop any running instance of this command. │
|
|
1142
|
+
│ --status Check if an instance is currently running. │
|
|
1029
1143
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1030
1144
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1031
1145
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
@@ -1079,10 +1193,25 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1079
1193
|
|
|
1080
1194
|
Usage: agent-cli speak [OPTIONS] [TEXT]
|
|
1081
1195
|
|
|
1082
|
-
Convert text to speech
|
|
1196
|
+
Convert text to speech and play audio through speakers.
|
|
1197
|
+
|
|
1198
|
+
By default, synthesized audio plays immediately. Use --save-file to save to a WAV file
|
|
1199
|
+
instead (skips playback).
|
|
1200
|
+
|
|
1201
|
+
Text can be provided as an argument or read from clipboard automatically.
|
|
1202
|
+
|
|
1203
|
+
Examples:
|
|
1204
|
+
|
|
1205
|
+
Speak text directly: agent-cli speak "Hello, world!"
|
|
1206
|
+
|
|
1207
|
+
Speak clipboard contents: agent-cli speak
|
|
1208
|
+
|
|
1209
|
+
Save to file instead of playing: agent-cli speak "Hello" --save-file greeting.wav
|
|
1210
|
+
|
|
1211
|
+
Use OpenAI-compatible TTS: agent-cli speak "Hello" --tts-provider openai
|
|
1083
1212
|
|
|
1084
1213
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1085
|
-
│ text [TEXT] Text to
|
|
1214
|
+
│ text [TEXT] Text to synthesize. If not provided, reads from clipboard. │
|
|
1086
1215
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1087
1216
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1088
1217
|
│ --help -h Show this message and exit. │
|
|
@@ -1094,9 +1223,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1094
1223
|
│ [default: wyoming] │
|
|
1095
1224
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1096
1225
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1097
|
-
│ --output-device-index INTEGER
|
|
1098
|
-
│
|
|
1099
|
-
│
|
|
1226
|
+
│ --output-device-index INTEGER Audio output device index (see --list-devices │
|
|
1227
|
+
│ for available devices). │
|
|
1228
|
+
│ --output-device-name TEXT Partial match on device name (e.g., 'speakers', │
|
|
1229
|
+
│ 'headphones'). │
|
|
1100
1230
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, 2.0 = │
|
|
1101
1231
|
│ twice as fast, 0.5 = half speed). │
|
|
1102
1232
|
│ [default: 1.0] │
|
|
@@ -1114,7 +1244,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1114
1244
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1115
1245
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1116
1246
|
│ [default: tts-1] │
|
|
1117
|
-
│ --tts-openai-voice TEXT
|
|
1247
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1248
|
+
│ nova, shimmer). │
|
|
1118
1249
|
│ [default: alloy] │
|
|
1119
1250
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1120
1251
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1140,28 +1271,27 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1140
1271
|
│ [env var: GEMINI_API_KEY] │
|
|
1141
1272
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1142
1273
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1143
|
-
│ --list-devices List available audio
|
|
1274
|
+
│ --list-devices List available audio devices with their indices and exit. │
|
|
1144
1275
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1145
1276
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1146
|
-
│ --save-file PATH Save
|
|
1277
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1278
|
+
│ playing through speakers. │
|
|
1147
1279
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1148
1280
|
│ [env var: LOG_LEVEL] │
|
|
1149
1281
|
│ [default: info] │
|
|
1150
1282
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
1151
1283
|
│ --quiet -q Suppress console output from rich. │
|
|
1152
|
-
│ --json Output result as JSON
|
|
1153
|
-
│
|
|
1154
|
-
│ --no-clipboard. │
|
|
1284
|
+
│ --json Output result as JSON (implies │
|
|
1285
|
+
│ --quiet and --no-clipboard). │
|
|
1155
1286
|
│ --config TEXT Path to a TOML configuration file. │
|
|
1156
1287
|
│ --print-args Print the command line arguments, │
|
|
1157
1288
|
│ including variables taken from the │
|
|
1158
1289
|
│ configuration file. │
|
|
1159
1290
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1160
1291
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1161
|
-
│ --stop Stop any running
|
|
1162
|
-
│ --status Check if
|
|
1163
|
-
│ --toggle
|
|
1164
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1292
|
+
│ --stop Stop any running instance of this command. │
|
|
1293
|
+
│ --status Check if an instance is currently running. │
|
|
1294
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1165
1295
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1166
1296
|
|
|
1167
1297
|
```
|
|
@@ -1203,16 +1333,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1203
1333
|
|
|
1204
1334
|
Usage: agent-cli voice-edit [OPTIONS]
|
|
1205
1335
|
|
|
1206
|
-
|
|
1336
|
+
Edit or query clipboard text using voice commands.
|
|
1337
|
+
|
|
1338
|
+
Workflow: Captures clipboard text → records your voice command → transcribes it → sends
|
|
1339
|
+
both to an LLM → copies result back to clipboard.
|
|
1340
|
+
|
|
1341
|
+
Use this for hands-free text editing (e.g., "make this more formal") or asking questions
|
|
1342
|
+
about clipboard content (e.g., "summarize this").
|
|
1207
1343
|
|
|
1208
|
-
|
|
1344
|
+
Typical hotkey integration: Run voice-edit & on keypress to start recording, then send
|
|
1345
|
+
SIGINT (via --stop) on second keypress to process.
|
|
1346
|
+
|
|
1347
|
+
Examples:
|
|
1209
1348
|
|
|
1210
|
-
•
|
|
1211
|
-
•
|
|
1212
|
-
•
|
|
1213
|
-
•
|
|
1214
|
-
• List output devices: agent-cli voice-edit --list-output-devices
|
|
1215
|
-
• Save TTS to file: agent-cli voice-edit --tts --save-file response.wav
|
|
1349
|
+
• Basic usage: agent-cli voice-edit
|
|
1350
|
+
• With TTS response: agent-cli voice-edit --tts
|
|
1351
|
+
• Toggle on/off: agent-cli voice-edit --toggle
|
|
1352
|
+
• List audio devices: agent-cli voice-edit --list-devices
|
|
1216
1353
|
|
|
1217
1354
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1218
1355
|
│ --help -h Show this message and exit. │
|
|
@@ -1230,10 +1367,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1230
1367
|
│ [default: wyoming] │
|
|
1231
1368
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1232
1369
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1233
|
-
│ --input-device-index INTEGER
|
|
1234
|
-
│
|
|
1235
|
-
│ --
|
|
1236
|
-
│
|
|
1370
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1371
|
+
│ Uses system default if omitted. │
|
|
1372
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1373
|
+
│ MacBook or USB). │
|
|
1374
|
+
│ --list-devices List available audio devices with their indices │
|
|
1375
|
+
│ and exit. │
|
|
1237
1376
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1238
1377
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1239
1378
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1284,10 +1423,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1284
1423
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1285
1424
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1286
1425
|
│ [default: no-tts] │
|
|
1287
|
-
│ --output-device-index INTEGER
|
|
1288
|
-
│ for
|
|
1289
|
-
│ --output-device-name TEXT
|
|
1290
|
-
│
|
|
1426
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1427
|
+
│ --list-devices for available devices). │
|
|
1428
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1429
|
+
│ 'speakers', 'headphones'). │
|
|
1291
1430
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1292
1431
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1293
1432
|
│ [default: 1.0] │
|
|
@@ -1305,7 +1444,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1305
1444
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1306
1445
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1307
1446
|
│ [default: tts-1] │
|
|
1308
|
-
│ --tts-openai-voice TEXT
|
|
1447
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1448
|
+
│ nova, shimmer). │
|
|
1309
1449
|
│ [default: alloy] │
|
|
1310
1450
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1311
1451
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1326,14 +1466,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1326
1466
|
│ [default: Kore] │
|
|
1327
1467
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1328
1468
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1329
|
-
│ --stop Stop any running
|
|
1330
|
-
│ --status Check if
|
|
1331
|
-
│ --toggle
|
|
1332
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1469
|
+
│ --stop Stop any running instance of this command. │
|
|
1470
|
+
│ --status Check if an instance is currently running. │
|
|
1471
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1333
1472
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1334
1473
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1335
|
-
│ --save-file PATH Save
|
|
1336
|
-
│
|
|
1474
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1475
|
+
│ instead of playing │
|
|
1476
|
+
│ through speakers. │
|
|
1337
1477
|
│ --clipboard --no-clipboard Copy result to │
|
|
1338
1478
|
│ clipboard. │
|
|
1339
1479
|
│ [default: clipboard] │
|
|
@@ -1345,9 +1485,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1345
1485
|
│ --quiet -q Suppress console output │
|
|
1346
1486
|
│ from rich. │
|
|
1347
1487
|
│ --json Output result as JSON │
|
|
1348
|
-
│
|
|
1349
|
-
│ --
|
|
1350
|
-
│ --no-clipboard. │
|
|
1488
|
+
│ (implies --quiet and │
|
|
1489
|
+
│ --no-clipboard). │
|
|
1351
1490
|
│ --config TEXT Path to a TOML │
|
|
1352
1491
|
│ configuration file. │
|
|
1353
1492
|
│ --print-args Print the command line │
|
|
@@ -1398,7 +1537,28 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1398
1537
|
|
|
1399
1538
|
Usage: agent-cli assistant [OPTIONS]
|
|
1400
1539
|
|
|
1401
|
-
|
|
1540
|
+
Hands-free voice assistant using wake word detection.
|
|
1541
|
+
|
|
1542
|
+
Continuously listens for a wake word, then records your speech until you say the wake
|
|
1543
|
+
word again. The recording is transcribed and sent to an LLM for a conversational
|
|
1544
|
+
response, optionally spoken back via TTS.
|
|
1545
|
+
|
|
1546
|
+
Conversation flow:
|
|
1547
|
+
|
|
1548
|
+
1 Say wake word → starts recording
|
|
1549
|
+
2 Speak your question/command
|
|
1550
|
+
3 Say wake word again → stops recording and processes
|
|
1551
|
+
|
|
1552
|
+
The assistant runs in a loop, ready for the next command after each response. Stop with
|
|
1553
|
+
Ctrl+C or --stop.
|
|
1554
|
+
|
|
1555
|
+
Requirements:
|
|
1556
|
+
|
|
1557
|
+
• Wyoming wake word server (e.g., wyoming-openwakeword on port 10400)
|
|
1558
|
+
• Wyoming ASR server (e.g., wyoming-whisper on port 10300)
|
|
1559
|
+
• Optional: TTS server for spoken responses (enable with --tts)
|
|
1560
|
+
|
|
1561
|
+
Example: assistant --wake-word ok_nabu --tts --input-device-name USB
|
|
1402
1562
|
|
|
1403
1563
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1404
1564
|
│ --help -h Show this message and exit. │
|
|
@@ -1416,19 +1576,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1416
1576
|
│ [default: wyoming] │
|
|
1417
1577
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1418
1578
|
╭─ Wake Word ────────────────────────────────────────────────────────────────────────────╮
|
|
1419
|
-
│ --wake-server-ip TEXT Wyoming wake word server IP
|
|
1579
|
+
│ --wake-server-ip TEXT Wyoming wake word server IP (requires │
|
|
1580
|
+
│ wyoming-openwakeword or similar). │
|
|
1420
1581
|
│ [default: localhost] │
|
|
1421
1582
|
│ --wake-server-port INTEGER Wyoming wake word server port. │
|
|
1422
1583
|
│ [default: 10400] │
|
|
1423
|
-
│ --wake-word TEXT
|
|
1424
|
-
│
|
|
1584
|
+
│ --wake-word TEXT Wake word to detect. Common options: ok_nabu, │
|
|
1585
|
+
│ hey_jarvis, alexa. Must match a model loaded in │
|
|
1586
|
+
│ your wake word server. │
|
|
1425
1587
|
│ [default: ok_nabu] │
|
|
1426
1588
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1427
1589
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1428
|
-
│ --input-device-index INTEGER
|
|
1429
|
-
│
|
|
1430
|
-
│ --
|
|
1431
|
-
│
|
|
1590
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1591
|
+
│ Uses system default if omitted. │
|
|
1592
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1593
|
+
│ MacBook or USB). │
|
|
1594
|
+
│ --list-devices List available audio devices with their indices │
|
|
1595
|
+
│ and exit. │
|
|
1432
1596
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1433
1597
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1434
1598
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1479,10 +1643,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1479
1643
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1480
1644
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1481
1645
|
│ [default: no-tts] │
|
|
1482
|
-
│ --output-device-index INTEGER
|
|
1483
|
-
│ for
|
|
1484
|
-
│ --output-device-name TEXT
|
|
1485
|
-
│
|
|
1646
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1647
|
+
│ --list-devices for available devices). │
|
|
1648
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1649
|
+
│ 'speakers', 'headphones'). │
|
|
1486
1650
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1487
1651
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1488
1652
|
│ [default: 1.0] │
|
|
@@ -1500,7 +1664,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1500
1664
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1501
1665
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1502
1666
|
│ [default: tts-1] │
|
|
1503
|
-
│ --tts-openai-voice TEXT
|
|
1667
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1668
|
+
│ nova, shimmer). │
|
|
1504
1669
|
│ [default: alloy] │
|
|
1505
1670
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1506
1671
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1521,14 +1686,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1521
1686
|
│ [default: Kore] │
|
|
1522
1687
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1523
1688
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1524
|
-
│ --stop Stop any running
|
|
1525
|
-
│ --status Check if
|
|
1526
|
-
│ --toggle
|
|
1527
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1689
|
+
│ --stop Stop any running instance of this command. │
|
|
1690
|
+
│ --status Check if an instance is currently running. │
|
|
1691
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1528
1692
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1529
1693
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1530
|
-
│ --save-file PATH Save
|
|
1531
|
-
│
|
|
1694
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1695
|
+
│ instead of playing │
|
|
1696
|
+
│ through speakers. │
|
|
1532
1697
|
│ --clipboard --no-clipboard Copy result to │
|
|
1533
1698
|
│ clipboard. │
|
|
1534
1699
|
│ [default: clipboard] │
|
|
@@ -1596,7 +1761,39 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1596
1761
|
|
|
1597
1762
|
Usage: agent-cli chat [OPTIONS]
|
|
1598
1763
|
|
|
1599
|
-
|
|
1764
|
+
Voice-based conversational chat agent with memory and tools.
|
|
1765
|
+
|
|
1766
|
+
Runs an interactive loop: listen → transcribe → LLM → speak response. Conversation
|
|
1767
|
+
history is persisted and included as context for continuity.
|
|
1768
|
+
|
|
1769
|
+
Built-in tools (LLM uses automatically when relevant):
|
|
1770
|
+
|
|
1771
|
+
• add_memory/search_memory/update_memory - persistent long-term memory
|
|
1772
|
+
• duckduckgo_search - web search for current information
|
|
1773
|
+
• read_file/execute_code - file access and shell commands
|
|
1774
|
+
|
|
1775
|
+
Process management: Use --toggle to start/stop via hotkey (bind to a keyboard shortcut),
|
|
1776
|
+
--stop to terminate, or --status to check state.
|
|
1777
|
+
|
|
1778
|
+
Examples:
|
|
1779
|
+
|
|
1780
|
+
Use OpenAI-compatible providers for speech and LLM, with TTS enabled:
|
|
1781
|
+
|
|
1782
|
+
|
|
1783
|
+
agent-cli chat --asr-provider openai --llm-provider openai --tts
|
|
1784
|
+
|
|
1785
|
+
|
|
1786
|
+
Start in background mode (toggle on/off with hotkey):
|
|
1787
|
+
|
|
1788
|
+
|
|
1789
|
+
agent-cli chat --toggle
|
|
1790
|
+
|
|
1791
|
+
|
|
1792
|
+
Use local Ollama LLM with Wyoming ASR:
|
|
1793
|
+
|
|
1794
|
+
|
|
1795
|
+
agent-cli chat --llm-provider ollama
|
|
1796
|
+
|
|
1600
1797
|
|
|
1601
1798
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1602
1799
|
│ --help -h Show this message and exit. │
|
|
@@ -1614,10 +1811,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1614
1811
|
│ [default: wyoming] │
|
|
1615
1812
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1616
1813
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1617
|
-
│ --input-device-index INTEGER
|
|
1618
|
-
│
|
|
1619
|
-
│ --
|
|
1620
|
-
│
|
|
1814
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1815
|
+
│ Uses system default if omitted. │
|
|
1816
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1817
|
+
│ MacBook or USB). │
|
|
1818
|
+
│ --list-devices List available audio devices with their indices │
|
|
1819
|
+
│ and exit. │
|
|
1621
1820
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1622
1821
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1623
1822
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1674,10 +1873,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1674
1873
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1675
1874
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1676
1875
|
│ [default: no-tts] │
|
|
1677
|
-
│ --output-device-index INTEGER
|
|
1678
|
-
│ for
|
|
1679
|
-
│ --output-device-name TEXT
|
|
1680
|
-
│
|
|
1876
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1877
|
+
│ --list-devices for available devices). │
|
|
1878
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1879
|
+
│ 'speakers', 'headphones'). │
|
|
1681
1880
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1682
1881
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1683
1882
|
│ [default: 1.0] │
|
|
@@ -1695,7 +1894,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1695
1894
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1696
1895
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1697
1896
|
│ [default: tts-1] │
|
|
1698
|
-
│ --tts-openai-voice TEXT
|
|
1897
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1898
|
+
│ nova, shimmer). │
|
|
1699
1899
|
│ [default: alloy] │
|
|
1700
1900
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1701
1901
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1716,20 +1916,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1716
1916
|
│ [default: Kore] │
|
|
1717
1917
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1718
1918
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1719
|
-
│ --stop Stop any running
|
|
1720
|
-
│ --status Check if
|
|
1721
|
-
│ --toggle
|
|
1722
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1919
|
+
│ --stop Stop any running instance of this command. │
|
|
1920
|
+
│ --status Check if an instance is currently running. │
|
|
1921
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1723
1922
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1724
1923
|
╭─ History Options ──────────────────────────────────────────────────────────────────────╮
|
|
1725
|
-
│ --history-dir PATH Directory
|
|
1924
|
+
│ --history-dir PATH Directory for conversation history and long-term │
|
|
1925
|
+
│ memory. Both conversation.json and │
|
|
1926
|
+
│ long_term_memory.json are stored here. │
|
|
1726
1927
|
│ [default: ~/.config/agent-cli/history] │
|
|
1727
|
-
│ --last-n-messages INTEGER Number of messages to include
|
|
1728
|
-
│
|
|
1928
|
+
│ --last-n-messages INTEGER Number of past messages to include as context for │
|
|
1929
|
+
│ the LLM. Set to 0 to start fresh each session │
|
|
1930
|
+
│ (memory tools still persist). │
|
|
1729
1931
|
│ [default: 50] │
|
|
1730
1932
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1731
1933
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1732
|
-
│ --save-file PATH Save
|
|
1934
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1935
|
+
│ playing through speakers. │
|
|
1733
1936
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1734
1937
|
│ [env var: LOG_LEVEL] │
|
|
1735
1938
|
│ [default: info] │
|
|
@@ -1784,25 +1987,68 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1784
1987
|
|
|
1785
1988
|
Usage: agent-cli rag-proxy [OPTIONS]
|
|
1786
1989
|
|
|
1787
|
-
Start
|
|
1990
|
+
Start a RAG proxy server that enables "chat with your documents".
|
|
1991
|
+
|
|
1992
|
+
Watches a folder for documents, indexes them into a vector store, and provides an
|
|
1993
|
+
OpenAI-compatible API at /v1/chat/completions. When you send a chat request, the server
|
|
1994
|
+
retrieves relevant document chunks and injects them as context before forwarding to your
|
|
1995
|
+
LLM backend.
|
|
1996
|
+
|
|
1997
|
+
Quick start:
|
|
1998
|
+
|
|
1999
|
+
• agent-cli rag-proxy — Start with defaults (./rag_docs, OpenAI-compatible API)
|
|
2000
|
+
• agent-cli rag-proxy --docs-folder ~/notes — Index your notes folder
|
|
2001
|
+
|
|
2002
|
+
How it works:
|
|
2003
|
+
|
|
2004
|
+
1 Documents in --docs-folder are chunked, embedded, and stored in ChromaDB
|
|
2005
|
+
2 A file watcher auto-reindexes when files change
|
|
2006
|
+
3 Chat requests trigger a semantic search for relevant chunks
|
|
2007
|
+
4 Retrieved context is injected into the prompt before forwarding to the LLM
|
|
2008
|
+
5 Responses include a rag_sources field listing which documents were used
|
|
2009
|
+
|
|
2010
|
+
Supported file formats:
|
|
1788
2011
|
|
|
1789
|
-
|
|
1790
|
-
|
|
1791
|
-
|
|
2012
|
+
Text: .txt, .md, .json, .py, .js, .ts, .yaml, .toml, .rst, etc. Rich documents (via
|
|
2013
|
+
MarkItDown): .pdf, .docx, .pptx, .xlsx, .html, .csv
|
|
2014
|
+
|
|
2015
|
+
API endpoints:
|
|
2016
|
+
|
|
2017
|
+
• POST /v1/chat/completions — Main chat endpoint (OpenAI-compatible)
|
|
2018
|
+
• GET /health — Health check with configuration info
|
|
2019
|
+
• GET /files — List indexed files with chunk counts
|
|
2020
|
+
• POST /reindex — Trigger manual reindex
|
|
2021
|
+
• All other paths are proxied to the LLM backend
|
|
2022
|
+
|
|
2023
|
+
Per-request overrides (in JSON body):
|
|
2024
|
+
|
|
2025
|
+
• rag_top_k: Override --limit for this request
|
|
2026
|
+
• rag_enable_tools: Override --rag-tools for this request
|
|
1792
2027
|
|
|
1793
2028
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1794
2029
|
│ --help -h Show this message and exit. │
|
|
1795
2030
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1796
2031
|
╭─ RAG Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1797
|
-
│ --docs-folder PATH Folder to watch for documents
|
|
2032
|
+
│ --docs-folder PATH Folder to watch for documents. Files are │
|
|
2033
|
+
│ auto-indexed on startup and when changed. │
|
|
2034
|
+
│ Must not overlap with --chroma-path. │
|
|
1798
2035
|
│ [default: ./rag_docs] │
|
|
1799
|
-
│ --chroma-path PATH
|
|
2036
|
+
│ --chroma-path PATH ChromaDB storage directory for vector │
|
|
2037
|
+
│ embeddings. Must be separate from │
|
|
2038
|
+
│ --docs-folder to avoid indexing database │
|
|
2039
|
+
│ files. │
|
|
1800
2040
|
│ [default: ./rag_db] │
|
|
1801
2041
|
│ --limit INTEGER Number of document chunks to retrieve per │
|
|
1802
|
-
│ query.
|
|
2042
|
+
│ query. Higher values provide more context │
|
|
2043
|
+
│ but use more tokens. Can be overridden │
|
|
2044
|
+
│ per-request via rag_top_k in the JSON │
|
|
2045
|
+
│ body. │
|
|
1803
2046
|
│ [default: 3] │
|
|
1804
|
-
│ --rag-tools --no-rag-tools
|
|
1805
|
-
│
|
|
2047
|
+
│ --rag-tools --no-rag-tools Enable read_full_document() tool so the │
|
|
2048
|
+
│ LLM can request full document content when │
|
|
2049
|
+
│ retrieved snippets are insufficient. Can │
|
|
2050
|
+
│ be overridden per-request via │
|
|
2051
|
+
│ rag_enable_tools in the JSON body. │
|
|
1806
2052
|
│ [default: rag-tools] │
|
|
1807
2053
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1808
2054
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -1820,7 +2066,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1820
2066
|
╭─ Server Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1821
2067
|
│ --host TEXT Host/IP to bind API servers to. │
|
|
1822
2068
|
│ [default: 0.0.0.0] │
|
|
1823
|
-
│ --port INTEGER Port
|
|
2069
|
+
│ --port INTEGER Port for the RAG proxy API (e.g., │
|
|
2070
|
+
│ http://localhost:8000/v1/chat/completions). │
|
|
1824
2071
|
│ [default: 8000] │
|
|
1825
2072
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1826
2073
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
@@ -1910,41 +2157,61 @@ The `memory proxy` command is the core feature—a middleware server that gives
|
|
|
1910
2157
|
5 Extracts new facts from the conversation in the background and updates the long-term
|
|
1911
2158
|
memory store (including handling contradictions).
|
|
1912
2159
|
|
|
1913
|
-
|
|
1914
|
-
|
|
2160
|
+
Example:
|
|
2161
|
+
|
|
2162
|
+
|
|
2163
|
+
# Start proxy pointing to local Ollama
|
|
2164
|
+
agent-cli memory proxy --openai-base-url http://localhost:11434/v1
|
|
2165
|
+
|
|
2166
|
+
# Then configure your chat client to use http://localhost:8100/v1
|
|
2167
|
+
# as its OpenAI base URL. All requests flow through the memory proxy.
|
|
2168
|
+
|
|
2169
|
+
|
|
2170
|
+
Per-request overrides: Clients can include these fields in the request body: memory_id
|
|
2171
|
+
(conversation ID), memory_top_k, memory_recency_weight, memory_score_threshold.
|
|
1915
2172
|
|
|
1916
2173
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1917
2174
|
│ --help -h Show this message and exit. │
|
|
1918
2175
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1919
2176
|
╭─ Memory Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1920
|
-
│ --memory-path PATH
|
|
1921
|
-
│
|
|
2177
|
+
│ --memory-path PATH Directory for memory storage. │
|
|
2178
|
+
│ Contains entries/ (Markdown │
|
|
2179
|
+
│ files) and chroma/ (vector │
|
|
2180
|
+
│ index). Created automatically if │
|
|
2181
|
+
│ it doesn't exist. │
|
|
1922
2182
|
│ [default: ./memory_db] │
|
|
1923
|
-
│ --default-top-k INTEGER Number of
|
|
1924
|
-
│
|
|
2183
|
+
│ --default-top-k INTEGER Number of relevant memories to │
|
|
2184
|
+
│ inject into each request. Higher │
|
|
2185
|
+
│ values provide more context but │
|
|
2186
|
+
│ increase token usage. │
|
|
1925
2187
|
│ [default: 5] │
|
|
1926
|
-
│ --max-entries INTEGER Maximum
|
|
1927
|
-
│
|
|
2188
|
+
│ --max-entries INTEGER Maximum entries per conversation │
|
|
2189
|
+
│ before oldest are evicted. │
|
|
2190
|
+
│ Summaries are preserved │
|
|
2191
|
+
│ separately. │
|
|
1928
2192
|
│ [default: 500] │
|
|
1929
2193
|
│ --mmr-lambda FLOAT MMR lambda (0-1): higher favors │
|
|
1930
2194
|
│ relevance, lower favors │
|
|
1931
2195
|
│ diversity. │
|
|
1932
2196
|
│ [default: 0.7] │
|
|
1933
|
-
│ --recency-weight FLOAT
|
|
1934
|
-
│
|
|
1935
|
-
│
|
|
1936
|
-
│ semantic relevance). │
|
|
2197
|
+
│ --recency-weight FLOAT Weight for recency vs semantic │
|
|
2198
|
+
│ relevance (0.0-1.0). At 0.2: 20% │
|
|
2199
|
+
│ recency, 80% semantic similarity. │
|
|
1937
2200
|
│ [default: 0.2] │
|
|
1938
2201
|
│ --score-threshold FLOAT Minimum semantic relevance │
|
|
1939
2202
|
│ threshold (0.0-1.0). Memories │
|
|
1940
2203
|
│ below this score are discarded to │
|
|
1941
2204
|
│ reduce noise. │
|
|
1942
2205
|
│ [default: 0.35] │
|
|
1943
|
-
│ --summarization --no-summarization
|
|
1944
|
-
│
|
|
2206
|
+
│ --summarization --no-summarization Extract facts and generate │
|
|
2207
|
+
│ summaries after each turn using │
|
|
2208
|
+
│ the LLM. Disable to only store │
|
|
2209
|
+
│ raw conversation turns. │
|
|
1945
2210
|
│ [default: summarization] │
|
|
1946
|
-
│ --git-versioning --no-git-versioning
|
|
1947
|
-
│
|
|
2211
|
+
│ --git-versioning --no-git-versioning Auto-commit memory changes to │
|
|
2212
|
+
│ git. Initializes a repo in │
|
|
2213
|
+
│ --memory-path if needed. Provides │
|
|
2214
|
+
│ full history of memory evolution. │
|
|
1948
2215
|
│ [default: git-versioning] │
|
|
1949
2216
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1950
2217
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -2057,12 +2324,16 @@ agent-cli memory add -c work "Project deadline is Friday"
|
|
|
2057
2324
|
│ for stdin. Supports JSON array, │
|
|
2058
2325
|
│ JSON object with 'memories' key, │
|
|
2059
2326
|
│ or plain text (one per line). │
|
|
2060
|
-
│ --conversation-id -c TEXT Conversation
|
|
2061
|
-
│
|
|
2327
|
+
│ --conversation-id -c TEXT Conversation namespace for these │
|
|
2328
|
+
│ memories. Memories are retrieved │
|
|
2329
|
+
│ per-conversation unless shared │
|
|
2330
|
+
│ globally. │
|
|
2062
2331
|
│ [default: default] │
|
|
2063
|
-
│ --memory-path PATH
|
|
2332
|
+
│ --memory-path PATH Directory for memory storage (same │
|
|
2333
|
+
│ as memory proxy --memory-path). │
|
|
2064
2334
|
│ [default: ./memory_db] │
|
|
2065
|
-
│ --git-versioning --no-git-versioning
|
|
2335
|
+
│ --git-versioning --no-git-versioning Auto-commit changes to git for │
|
|
2336
|
+
│ version history. │
|
|
2066
2337
|
│ [default: git-versioning] │
|
|
2067
2338
|
│ --help -h Show this message and exit. │
|
|
2068
2339
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|