agent-cli 0.70.5__py3-none-any.whl → 0.72.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- agent_cli/_extras.json +2 -2
- agent_cli/_requirements/memory.txt +14 -1
- agent_cli/_requirements/rag.txt +14 -1
- agent_cli/_requirements/vad.txt +1 -85
- agent_cli/agents/assistant.py +23 -27
- agent_cli/agents/autocorrect.py +29 -3
- agent_cli/agents/chat.py +44 -14
- agent_cli/agents/memory/__init__.py +19 -1
- agent_cli/agents/memory/add.py +3 -3
- agent_cli/agents/memory/proxy.py +20 -11
- agent_cli/agents/rag_proxy.py +42 -10
- agent_cli/agents/speak.py +22 -2
- agent_cli/agents/transcribe.py +20 -2
- agent_cli/agents/transcribe_daemon.py +33 -21
- agent_cli/agents/voice_edit.py +17 -9
- agent_cli/cli.py +25 -2
- agent_cli/config_cmd.py +30 -11
- agent_cli/core/deps.py +6 -3
- agent_cli/core/vad.py +6 -24
- agent_cli/dev/cli.py +295 -65
- agent_cli/docs_gen.py +18 -8
- agent_cli/install/extras.py +44 -13
- agent_cli/install/hotkeys.py +22 -11
- agent_cli/install/services.py +54 -14
- agent_cli/opts.py +25 -21
- agent_cli/server/cli.py +121 -47
- {agent_cli-0.70.5.dist-info → agent_cli-0.72.1.dist-info}/METADATA +466 -195
- {agent_cli-0.70.5.dist-info → agent_cli-0.72.1.dist-info}/RECORD +31 -31
- {agent_cli-0.70.5.dist-info → agent_cli-0.72.1.dist-info}/WHEEL +0 -0
- {agent_cli-0.70.5.dist-info → agent_cli-0.72.1.dist-info}/entry_points.txt +0 -0
- {agent_cli-0.70.5.dist-info → agent_cli-0.72.1.dist-info}/licenses/LICENSE +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: agent-cli
|
|
3
|
-
Version: 0.
|
|
3
|
+
Version: 0.72.1
|
|
4
4
|
Summary: A suite of AI-powered command-line tools for text correction, audio transcription, and voice assistance.
|
|
5
5
|
Project-URL: Homepage, https://github.com/basnijholt/agent-cli
|
|
6
6
|
Author-email: Bas Nijholt <bas@nijho.lt>
|
|
@@ -51,6 +51,7 @@ Requires-Dist: chromadb>=0.4.22; extra == 'memory'
|
|
|
51
51
|
Requires-Dist: fastapi[standard]; extra == 'memory'
|
|
52
52
|
Requires-Dist: huggingface-hub>=0.20.0; extra == 'memory'
|
|
53
53
|
Requires-Dist: onnxruntime>=1.17.0; extra == 'memory'
|
|
54
|
+
Requires-Dist: openai>=1.0.0; extra == 'memory'
|
|
54
55
|
Requires-Dist: pyyaml>=6.0.0; extra == 'memory'
|
|
55
56
|
Requires-Dist: transformers>=4.30.0; extra == 'memory'
|
|
56
57
|
Requires-Dist: watchfiles>=0.21.0; extra == 'memory'
|
|
@@ -66,6 +67,7 @@ Requires-Dist: fastapi[standard]; extra == 'rag'
|
|
|
66
67
|
Requires-Dist: huggingface-hub>=0.20.0; extra == 'rag'
|
|
67
68
|
Requires-Dist: markitdown[docx,pdf,pptx]>=0.1.3; extra == 'rag'
|
|
68
69
|
Requires-Dist: onnxruntime>=1.17.0; extra == 'rag'
|
|
70
|
+
Requires-Dist: openai>=1.0.0; extra == 'rag'
|
|
69
71
|
Requires-Dist: transformers>=4.30.0; extra == 'rag'
|
|
70
72
|
Requires-Dist: watchfiles>=0.21.0; extra == 'rag'
|
|
71
73
|
Provides-Extra: server
|
|
@@ -79,7 +81,7 @@ Requires-Dist: pytest-mock; extra == 'test'
|
|
|
79
81
|
Requires-Dist: pytest-timeout; extra == 'test'
|
|
80
82
|
Requires-Dist: pytest>=7.0.0; extra == 'test'
|
|
81
83
|
Provides-Extra: vad
|
|
82
|
-
Requires-Dist: silero-vad>=
|
|
84
|
+
Requires-Dist: silero-vad-lite>=0.2.1; extra == 'vad'
|
|
83
85
|
Provides-Extra: wyoming
|
|
84
86
|
Requires-Dist: wyoming>=1.5.2; extra == 'wyoming'
|
|
85
87
|
Description-Content-Type: text/markdown
|
|
@@ -134,7 +136,7 @@ Since then I have expanded the tool with many more features, all focused on loca
|
|
|
134
136
|
- **[`memory`](docs/commands/memory.md)**: Long-term memory system with `memory proxy` and `memory add`.
|
|
135
137
|
- **[`rag-proxy`](docs/commands/rag-proxy.md)**: RAG proxy server for chatting with your documents.
|
|
136
138
|
- **[`dev`](docs/commands/dev.md)**: Parallel development with git worktrees and AI coding agents.
|
|
137
|
-
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
139
|
+
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI-compatible APIs), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
138
140
|
- **[`transcribe-daemon`](docs/commands/transcribe-daemon.md)**: Continuous background transcription with VAD. Install with `uv tool install "agent-cli[vad]" -p 3.13`.
|
|
139
141
|
|
|
140
142
|
## Quick Start
|
|
@@ -498,21 +500,43 @@ agent-cli install-extras rag memory vad
|
|
|
498
500
|
|
|
499
501
|
Usage: agent-cli install-extras [OPTIONS] [EXTRAS]...
|
|
500
502
|
|
|
501
|
-
Install optional
|
|
503
|
+
Install optional dependencies with pinned, compatible versions.
|
|
504
|
+
|
|
505
|
+
Many agent-cli features require optional dependencies. This command installs them with
|
|
506
|
+
version pinning to ensure compatibility. Dependencies persist across uv tool upgrade
|
|
507
|
+
when installed via uv tool.
|
|
508
|
+
|
|
509
|
+
Available extras:
|
|
510
|
+
|
|
511
|
+
• rag - RAG proxy server (ChromaDB, embeddings)
|
|
512
|
+
• memory - Long-term memory proxy (ChromaDB)
|
|
513
|
+
• vad - Voice Activity Detection (silero-vad)
|
|
514
|
+
• audio - Local audio recording/playback
|
|
515
|
+
• piper - Local Piper TTS engine
|
|
516
|
+
• kokoro - Kokoro neural TTS engine
|
|
517
|
+
• faster-whisper - Whisper ASR for CUDA/CPU
|
|
518
|
+
• mlx-whisper - Whisper ASR for Apple Silicon
|
|
519
|
+
• wyoming - Wyoming protocol for ASR/TTS servers
|
|
520
|
+
• server - FastAPI server components
|
|
521
|
+
• speed - Audio speed adjustment
|
|
522
|
+
• llm - LLM framework (pydantic-ai)
|
|
502
523
|
|
|
503
524
|
Examples:
|
|
504
525
|
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
|
|
508
|
-
|
|
526
|
+
|
|
527
|
+
agent-cli install-extras rag # Install RAG dependencies
|
|
528
|
+
agent-cli install-extras memory vad # Install multiple extras
|
|
529
|
+
agent-cli install-extras --list # Show available extras
|
|
530
|
+
agent-cli install-extras --all # Install all extras
|
|
531
|
+
|
|
509
532
|
|
|
510
533
|
╭─ Arguments ────────────────────────────────────────────────────────────────────────────╮
|
|
511
|
-
│ extras [EXTRAS]... Extras to install
|
|
534
|
+
│ extras [EXTRAS]... Extras to install: rag, memory, vad, audio, piper, kokoro, │
|
|
535
|
+
│ faster-whisper, mlx-whisper, wyoming, server, speed, llm │
|
|
512
536
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
513
537
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
514
|
-
│ --list -l
|
|
515
|
-
│ --all -a Install all available extras
|
|
538
|
+
│ --list -l Show available extras with descriptions (what each one enables) │
|
|
539
|
+
│ --all -a Install all available extras at once │
|
|
516
540
|
│ --help -h Show this message and exit. │
|
|
517
541
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
518
542
|
|
|
@@ -571,13 +595,21 @@ agent-cli config edit
|
|
|
571
595
|
|
|
572
596
|
Manage agent-cli configuration files.
|
|
573
597
|
|
|
598
|
+
Config files are TOML format and searched in order:
|
|
599
|
+
|
|
600
|
+
1 ./agent-cli-config.toml (project-local)
|
|
601
|
+
2 ~/.config/agent-cli/config.toml (user default)
|
|
602
|
+
|
|
603
|
+
Settings in [defaults] apply to all commands. Override per-command with sections like
|
|
604
|
+
[chat] or [transcribe]. CLI arguments override config file settings.
|
|
605
|
+
|
|
574
606
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
575
607
|
│ --help -h Show this message and exit. │
|
|
576
608
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
577
609
|
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
|
|
578
|
-
│ init Create a new config file with all options commented
|
|
610
|
+
│ init Create a new config file with all options as commented-out examples. │
|
|
579
611
|
│ edit Open the config file in your default editor. │
|
|
580
|
-
│ show Display the config file
|
|
612
|
+
│ show Display the active config file path and contents. │
|
|
581
613
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
582
614
|
|
|
583
615
|
```
|
|
@@ -635,10 +667,37 @@ the `[defaults]` section of your configuration file.
|
|
|
635
667
|
|
|
636
668
|
Usage: agent-cli autocorrect [OPTIONS] [TEXT]
|
|
637
669
|
|
|
638
|
-
|
|
670
|
+
Fix grammar, spelling, and punctuation using an LLM.
|
|
671
|
+
|
|
672
|
+
Reads text from clipboard (or argument), sends to LLM for correction, and copies the
|
|
673
|
+
result back to clipboard. Only makes technical corrections without changing meaning or
|
|
674
|
+
tone.
|
|
675
|
+
|
|
676
|
+
Workflow:
|
|
677
|
+
|
|
678
|
+
1 Read text from clipboard (or TEXT argument)
|
|
679
|
+
2 Send to LLM for grammar/spelling/punctuation fixes
|
|
680
|
+
3 Copy corrected text to clipboard (unless --json)
|
|
681
|
+
4 Display result
|
|
682
|
+
|
|
683
|
+
Examples:
|
|
684
|
+
|
|
685
|
+
|
|
686
|
+
# Correct text from clipboard (default)
|
|
687
|
+
agent-cli autocorrect
|
|
688
|
+
|
|
689
|
+
# Correct specific text
|
|
690
|
+
agent-cli autocorrect "this is incorect"
|
|
691
|
+
|
|
692
|
+
# Use OpenAI instead of local Ollama
|
|
693
|
+
agent-cli autocorrect --llm-provider openai
|
|
694
|
+
|
|
695
|
+
# Get JSON output for scripting (disables clipboard)
|
|
696
|
+
agent-cli autocorrect --json
|
|
697
|
+
|
|
639
698
|
|
|
640
699
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
641
|
-
│ text [TEXT]
|
|
700
|
+
│ text [TEXT] Text to correct. If omitted, reads from system clipboard. │
|
|
642
701
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
643
702
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
644
703
|
│ --help -h Show this message and exit. │
|
|
@@ -679,12 +738,11 @@ the `[defaults]` section of your configuration file.
|
|
|
679
738
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
680
739
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
681
740
|
│ [env var: LOG_LEVEL] │
|
|
682
|
-
│ [default:
|
|
741
|
+
│ [default: warning] │
|
|
683
742
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
684
743
|
│ --quiet -q Suppress console output from rich. │
|
|
685
|
-
│ --json Output result as JSON
|
|
686
|
-
│
|
|
687
|
-
│ --no-clipboard. │
|
|
744
|
+
│ --json Output result as JSON (implies │
|
|
745
|
+
│ --quiet and --no-clipboard). │
|
|
688
746
|
│ --config TEXT Path to a TOML configuration file. │
|
|
689
747
|
│ --print-args Print the command line arguments, │
|
|
690
748
|
│ including variables taken from the │
|
|
@@ -732,30 +790,50 @@ the `[defaults]` section of your configuration file.
|
|
|
732
790
|
|
|
733
791
|
Usage: agent-cli transcribe [OPTIONS]
|
|
734
792
|
|
|
735
|
-
|
|
793
|
+
Record audio from microphone and transcribe to text.
|
|
794
|
+
|
|
795
|
+
Records until you press Ctrl+C (or send SIGINT), then transcribes using your configured
|
|
796
|
+
ASR provider. The transcript is copied to the clipboard by default.
|
|
797
|
+
|
|
798
|
+
With --llm: Passes the raw transcript through an LLM to clean up speech recognition
|
|
799
|
+
errors, add punctuation, remove filler words, and improve readability.
|
|
800
|
+
|
|
801
|
+
With --toggle: Bind to a hotkey for push-to-talk. First call starts recording, second
|
|
802
|
+
call stops and transcribes.
|
|
803
|
+
|
|
804
|
+
Examples:
|
|
805
|
+
|
|
806
|
+
• Record and transcribe: agent-cli transcribe
|
|
807
|
+
• With LLM cleanup: agent-cli transcribe --llm
|
|
808
|
+
• Re-transcribe last recording: agent-cli transcribe --last-recording 1
|
|
736
809
|
|
|
737
810
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
738
811
|
│ --help -h Show this message and exit. │
|
|
739
812
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
740
813
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
741
|
-
│ --extra-instructions TEXT
|
|
742
|
-
│
|
|
743
|
-
│ --llm --no-llm
|
|
814
|
+
│ --extra-instructions TEXT Extra instructions appended to the LLM │
|
|
815
|
+
│ cleanup prompt (requires --llm). │
|
|
816
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, │
|
|
817
|
+
│ add punctuation, remove filler words. Uses │
|
|
818
|
+
│ --extra-instructions if set (via CLI or │
|
|
819
|
+
│ config file). │
|
|
744
820
|
│ [default: no-llm] │
|
|
745
821
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
746
822
|
╭─ Audio Recovery ───────────────────────────────────────────────────────────────────────╮
|
|
747
|
-
│ --from-file PATH Transcribe audio
|
|
748
|
-
│
|
|
749
|
-
│ flac, aac, webm
|
|
750
|
-
│ for non-WAV
|
|
751
|
-
│
|
|
752
|
-
│ --last-recording INTEGER
|
|
753
|
-
│ 1
|
|
754
|
-
│
|
|
755
|
-
│
|
|
823
|
+
│ --from-file PATH Transcribe from audio file instead │
|
|
824
|
+
│ of microphone. Supports wav, mp3, │
|
|
825
|
+
│ m4a, ogg, flac, aac, webm. │
|
|
826
|
+
│ Requires ffmpeg for non-WAV │
|
|
827
|
+
│ formats with Wyoming. │
|
|
828
|
+
│ --last-recording INTEGER Re-transcribe a saved recording │
|
|
829
|
+
│ (1=most recent, 2=second-to-last, │
|
|
830
|
+
│ etc). Useful after connection │
|
|
831
|
+
│ failures or to retry with │
|
|
832
|
+
│ different options. │
|
|
756
833
|
│ [default: 0] │
|
|
757
|
-
│ --save-recording --no-save-recording Save
|
|
758
|
-
│ for
|
|
834
|
+
│ --save-recording --no-save-recording Save recordings to │
|
|
835
|
+
│ ~/.cache/agent-cli/ for │
|
|
836
|
+
│ --last-recording recovery. │
|
|
759
837
|
│ [default: save-recording] │
|
|
760
838
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
761
839
|
╭─ Provider Selection ───────────────────────────────────────────────────────────────────╮
|
|
@@ -767,10 +845,12 @@ the `[defaults]` section of your configuration file.
|
|
|
767
845
|
│ [default: ollama] │
|
|
768
846
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
769
847
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
770
|
-
│ --input-device-index INTEGER
|
|
771
|
-
│
|
|
772
|
-
│ --
|
|
773
|
-
│
|
|
848
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
849
|
+
│ Uses system default if omitted. │
|
|
850
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
851
|
+
│ MacBook or USB). │
|
|
852
|
+
│ --list-devices List available audio devices with their indices │
|
|
853
|
+
│ and exit. │
|
|
774
854
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
775
855
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
776
856
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -825,10 +905,9 @@ the `[defaults]` section of your configuration file.
|
|
|
825
905
|
│ [env var: GEMINI_API_KEY] │
|
|
826
906
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
827
907
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
828
|
-
│ --stop Stop any running
|
|
829
|
-
│ --status Check if
|
|
830
|
-
│ --toggle
|
|
831
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
908
|
+
│ --stop Stop any running instance of this command. │
|
|
909
|
+
│ --status Check if an instance is currently running. │
|
|
910
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
832
911
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
833
912
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
834
913
|
│ --clipboard --no-clipboard Copy result to │
|
|
@@ -836,15 +915,14 @@ the `[defaults]` section of your configuration file.
|
|
|
836
915
|
│ [default: clipboard] │
|
|
837
916
|
│ --log-level [debug|info|warning| Set logging level. │
|
|
838
917
|
│ error] [env var: LOG_LEVEL] │
|
|
839
|
-
│ [default:
|
|
918
|
+
│ [default: warning] │
|
|
840
919
|
│ --log-file TEXT Path to a file to │
|
|
841
920
|
│ write logs to. │
|
|
842
921
|
│ --quiet -q Suppress console │
|
|
843
922
|
│ output from rich. │
|
|
844
923
|
│ --json Output result as JSON │
|
|
845
|
-
│
|
|
846
|
-
│
|
|
847
|
-
│ --no-clipboard. │
|
|
924
|
+
│ (implies --quiet and │
|
|
925
|
+
│ --no-clipboard). │
|
|
848
926
|
│ --config TEXT Path to a TOML │
|
|
849
927
|
│ configuration file. │
|
|
850
928
|
│ --print-args Print the command │
|
|
@@ -852,11 +930,13 @@ the `[defaults]` section of your configuration file.
|
|
|
852
930
|
│ including variables │
|
|
853
931
|
│ taken from the │
|
|
854
932
|
│ configuration file. │
|
|
855
|
-
│ --transcription-log PATH
|
|
856
|
-
│
|
|
857
|
-
│
|
|
858
|
-
│
|
|
859
|
-
│
|
|
933
|
+
│ --transcription-log PATH Append transcripts to │
|
|
934
|
+
│ JSONL file │
|
|
935
|
+
│ (timestamp, hostname, │
|
|
936
|
+
│ model, raw/processed │
|
|
937
|
+
│ text). Recent entries │
|
|
938
|
+
│ provide context for │
|
|
939
|
+
│ LLM cleanup. │
|
|
860
940
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
861
941
|
|
|
862
942
|
```
|
|
@@ -912,46 +992,76 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
912
992
|
|
|
913
993
|
Usage: agent-cli transcribe-daemon [OPTIONS]
|
|
914
994
|
|
|
915
|
-
|
|
995
|
+
Continuous transcription daemon using Silero VAD for speech detection.
|
|
996
|
+
|
|
997
|
+
Unlike transcribe (single recording session), this daemon runs indefinitely and
|
|
998
|
+
automatically detects speech segments using Voice Activity Detection (VAD). Each
|
|
999
|
+
detected segment is transcribed and logged with timestamps.
|
|
916
1000
|
|
|
917
|
-
|
|
918
|
-
segments using Silero VAD, transcribing them, and logging results with timestamps.
|
|
1001
|
+
How it works:
|
|
919
1002
|
|
|
920
|
-
|
|
1003
|
+
1 Listens continuously to microphone input
|
|
1004
|
+
2 Silero VAD detects when you start/stop speaking
|
|
1005
|
+
3 After --silence-threshold seconds of silence, the segment is finalized
|
|
1006
|
+
4 Segment is transcribed (and optionally cleaned by LLM with --llm)
|
|
1007
|
+
5 Results are appended to the JSONL log file
|
|
1008
|
+
6 Audio is saved as MP3 if --save-audio is enabled (requires ffmpeg)
|
|
921
1009
|
|
|
1010
|
+
Use cases: Meeting transcription, note-taking, voice journaling, accessibility.
|
|
922
1011
|
|
|
923
|
-
|
|
1012
|
+
Examples:
|
|
1013
|
+
|
|
1014
|
+
|
|
1015
|
+
agent-cli transcribe-daemon
|
|
924
1016
|
agent-cli transcribe-daemon --role meeting --silence-threshold 1.5
|
|
1017
|
+
agent-cli transcribe-daemon --llm --clipboard --role notes
|
|
1018
|
+
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --no-save-audio
|
|
1019
|
+
agent-cli transcribe-daemon --asr-provider openai --llm-provider gemini --llm
|
|
925
1020
|
|
|
926
|
-
# With LLM cleanup
|
|
927
|
-
agent-cli transcribe-daemon --llm --role notes
|
|
928
1021
|
|
|
929
|
-
|
|
930
|
-
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --audio-dir ~/audio
|
|
1022
|
+
Tips:
|
|
931
1023
|
|
|
1024
|
+
• Use --role to tag entries (e.g., speaker1, meeting, personal)
|
|
1025
|
+
• Adjust --vad-threshold if detection is too sensitive (increase) or missing speech
|
|
1026
|
+
(decrease)
|
|
1027
|
+
• Use --stop to cleanly terminate a running daemon
|
|
1028
|
+
• With --llm, transcripts are cleaned up (punctuation, filler words removed)
|
|
932
1029
|
|
|
933
1030
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
934
|
-
│ --role -r TEXT
|
|
935
|
-
│
|
|
1031
|
+
│ --role -r TEXT Label for log entries. Use to │
|
|
1032
|
+
│ distinguish speakers or contexts in │
|
|
1033
|
+
│ logs. │
|
|
936
1034
|
│ [default: user] │
|
|
937
|
-
│ --silence-threshold -s FLOAT Seconds of silence
|
|
938
|
-
│ segment.
|
|
1035
|
+
│ --silence-threshold -s FLOAT Seconds of silence after speech to │
|
|
1036
|
+
│ finalize a segment. Increase for │
|
|
1037
|
+
│ slower speakers. │
|
|
939
1038
|
│ [default: 1.0] │
|
|
940
|
-
│ --min-segment -m FLOAT Minimum
|
|
941
|
-
│
|
|
1039
|
+
│ --min-segment -m FLOAT Minimum seconds of speech required │
|
|
1040
|
+
│ before a segment is processed. │
|
|
1041
|
+
│ Filters brief sounds. │
|
|
942
1042
|
│ [default: 0.25] │
|
|
943
|
-
│ --vad-threshold FLOAT VAD
|
|
944
|
-
│ (0.0-1.0). Higher
|
|
945
|
-
│
|
|
1043
|
+
│ --vad-threshold FLOAT Silero VAD confidence threshold │
|
|
1044
|
+
│ (0.0-1.0). Higher values require │
|
|
1045
|
+
│ clearer speech; lower values are │
|
|
1046
|
+
│ more sensitive to quiet/distant │
|
|
1047
|
+
│ voices. │
|
|
946
1048
|
│ [default: 0.3] │
|
|
947
|
-
│ --save-audio --no-save-audio Save
|
|
1049
|
+
│ --save-audio --no-save-audio Save each speech segment as MP3. │
|
|
1050
|
+
│ Requires ffmpeg to be installed. │
|
|
948
1051
|
│ [default: save-audio] │
|
|
949
|
-
│ --audio-dir PATH
|
|
950
|
-
│
|
|
951
|
-
│
|
|
1052
|
+
│ --audio-dir PATH Base directory for MP3 files. Files │
|
|
1053
|
+
│ are organized by date: │
|
|
1054
|
+
│ YYYY/MM/DD/HHMMSS_mmm.mp3. Default: │
|
|
1055
|
+
│ ~/.config/agent-cli/audio. │
|
|
1056
|
+
│ --transcription-log -t PATH JSONL file for transcript logging │
|
|
1057
|
+
│ (one JSON object per line with │
|
|
1058
|
+
│ timestamp, role, raw/processed │
|
|
1059
|
+
│ text, audio path). Default: │
|
|
952
1060
|
│ ~/.config/agent-cli/transcriptions… │
|
|
953
|
-
│ --clipboard --no-clipboard Copy each transcription
|
|
954
|
-
│ clipboard.
|
|
1061
|
+
│ --clipboard --no-clipboard Copy each completed transcription │
|
|
1062
|
+
│ to clipboard (overwrites previous). │
|
|
1063
|
+
│ Useful with --llm to get cleaned │
|
|
1064
|
+
│ text. │
|
|
955
1065
|
│ [default: no-clipboard] │
|
|
956
1066
|
│ --help -h Show this message and exit. │
|
|
957
1067
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
@@ -964,10 +1074,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
964
1074
|
│ [default: ollama] │
|
|
965
1075
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
966
1076
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
967
|
-
│ --input-device-index INTEGER
|
|
968
|
-
│
|
|
969
|
-
│ --
|
|
970
|
-
│
|
|
1077
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1078
|
+
│ Uses system default if omitted. │
|
|
1079
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1080
|
+
│ MacBook or USB). │
|
|
1081
|
+
│ --list-devices List available audio devices with their indices │
|
|
1082
|
+
│ and exit. │
|
|
971
1083
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
972
1084
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
973
1085
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1022,17 +1134,19 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1022
1134
|
│ [env var: GEMINI_API_KEY] │
|
|
1023
1135
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1024
1136
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1025
|
-
│ --llm --no-llm
|
|
1137
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, add punctuation, │
|
|
1138
|
+
│ remove filler words. Uses --extra-instructions if set (via CLI │
|
|
1139
|
+
│ or config file). │
|
|
1026
1140
|
│ [default: no-llm] │
|
|
1027
1141
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1028
1142
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1029
|
-
│ --stop Stop any running
|
|
1030
|
-
│ --status Check if
|
|
1143
|
+
│ --stop Stop any running instance of this command. │
|
|
1144
|
+
│ --status Check if an instance is currently running. │
|
|
1031
1145
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1032
1146
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1033
1147
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1034
1148
|
│ [env var: LOG_LEVEL] │
|
|
1035
|
-
│ [default:
|
|
1149
|
+
│ [default: warning] │
|
|
1036
1150
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
1037
1151
|
│ --quiet -q Suppress console output from rich. │
|
|
1038
1152
|
│ --config TEXT Path to a TOML configuration file. │
|
|
@@ -1081,10 +1195,25 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1081
1195
|
|
|
1082
1196
|
Usage: agent-cli speak [OPTIONS] [TEXT]
|
|
1083
1197
|
|
|
1084
|
-
Convert text to speech
|
|
1198
|
+
Convert text to speech and play audio through speakers.
|
|
1199
|
+
|
|
1200
|
+
By default, synthesized audio plays immediately. Use --save-file to save to a WAV file
|
|
1201
|
+
instead (skips playback).
|
|
1202
|
+
|
|
1203
|
+
Text can be provided as an argument or read from clipboard automatically.
|
|
1204
|
+
|
|
1205
|
+
Examples:
|
|
1206
|
+
|
|
1207
|
+
Speak text directly: agent-cli speak "Hello, world!"
|
|
1208
|
+
|
|
1209
|
+
Speak clipboard contents: agent-cli speak
|
|
1210
|
+
|
|
1211
|
+
Save to file instead of playing: agent-cli speak "Hello" --save-file greeting.wav
|
|
1212
|
+
|
|
1213
|
+
Use OpenAI-compatible TTS: agent-cli speak "Hello" --tts-provider openai
|
|
1085
1214
|
|
|
1086
1215
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1087
|
-
│ text [TEXT] Text to
|
|
1216
|
+
│ text [TEXT] Text to synthesize. If not provided, reads from clipboard. │
|
|
1088
1217
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1089
1218
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1090
1219
|
│ --help -h Show this message and exit. │
|
|
@@ -1096,9 +1225,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1096
1225
|
│ [default: wyoming] │
|
|
1097
1226
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1098
1227
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1099
|
-
│ --output-device-index INTEGER
|
|
1100
|
-
│
|
|
1101
|
-
│
|
|
1228
|
+
│ --output-device-index INTEGER Audio output device index (see --list-devices │
|
|
1229
|
+
│ for available devices). │
|
|
1230
|
+
│ --output-device-name TEXT Partial match on device name (e.g., 'speakers', │
|
|
1231
|
+
│ 'headphones'). │
|
|
1102
1232
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, 2.0 = │
|
|
1103
1233
|
│ twice as fast, 0.5 = half speed). │
|
|
1104
1234
|
│ [default: 1.0] │
|
|
@@ -1116,7 +1246,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1116
1246
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1117
1247
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1118
1248
|
│ [default: tts-1] │
|
|
1119
|
-
│ --tts-openai-voice TEXT
|
|
1249
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1250
|
+
│ nova, shimmer). │
|
|
1120
1251
|
│ [default: alloy] │
|
|
1121
1252
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1122
1253
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1142,28 +1273,27 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1142
1273
|
│ [env var: GEMINI_API_KEY] │
|
|
1143
1274
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1144
1275
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1145
|
-
│ --list-devices List available audio
|
|
1276
|
+
│ --list-devices List available audio devices with their indices and exit. │
|
|
1146
1277
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1147
1278
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1148
|
-
│ --save-file PATH Save
|
|
1279
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1280
|
+
│ playing through speakers. │
|
|
1149
1281
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1150
1282
|
│ [env var: LOG_LEVEL] │
|
|
1151
|
-
│ [default:
|
|
1283
|
+
│ [default: warning] │
|
|
1152
1284
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
1153
1285
|
│ --quiet -q Suppress console output from rich. │
|
|
1154
|
-
│ --json Output result as JSON
|
|
1155
|
-
│
|
|
1156
|
-
│ --no-clipboard. │
|
|
1286
|
+
│ --json Output result as JSON (implies │
|
|
1287
|
+
│ --quiet and --no-clipboard). │
|
|
1157
1288
|
│ --config TEXT Path to a TOML configuration file. │
|
|
1158
1289
|
│ --print-args Print the command line arguments, │
|
|
1159
1290
|
│ including variables taken from the │
|
|
1160
1291
|
│ configuration file. │
|
|
1161
1292
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1162
1293
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1163
|
-
│ --stop Stop any running
|
|
1164
|
-
│ --status Check if
|
|
1165
|
-
│ --toggle
|
|
1166
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1294
|
+
│ --stop Stop any running instance of this command. │
|
|
1295
|
+
│ --status Check if an instance is currently running. │
|
|
1296
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1167
1297
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1168
1298
|
|
|
1169
1299
|
```
|
|
@@ -1205,16 +1335,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1205
1335
|
|
|
1206
1336
|
Usage: agent-cli voice-edit [OPTIONS]
|
|
1207
1337
|
|
|
1208
|
-
|
|
1338
|
+
Edit or query clipboard text using voice commands.
|
|
1339
|
+
|
|
1340
|
+
Workflow: Captures clipboard text → records your voice command → transcribes it → sends
|
|
1341
|
+
both to an LLM → copies result back to clipboard.
|
|
1342
|
+
|
|
1343
|
+
Use this for hands-free text editing (e.g., "make this more formal") or asking questions
|
|
1344
|
+
about clipboard content (e.g., "summarize this").
|
|
1209
1345
|
|
|
1210
|
-
|
|
1346
|
+
Typical hotkey integration: Run voice-edit & on keypress to start recording, then send
|
|
1347
|
+
SIGINT (via --stop) on second keypress to process.
|
|
1348
|
+
|
|
1349
|
+
Examples:
|
|
1211
1350
|
|
|
1212
|
-
•
|
|
1213
|
-
•
|
|
1214
|
-
•
|
|
1215
|
-
•
|
|
1216
|
-
• List output devices: agent-cli voice-edit --list-output-devices
|
|
1217
|
-
• Save TTS to file: agent-cli voice-edit --tts --save-file response.wav
|
|
1351
|
+
• Basic usage: agent-cli voice-edit
|
|
1352
|
+
• With TTS response: agent-cli voice-edit --tts
|
|
1353
|
+
• Toggle on/off: agent-cli voice-edit --toggle
|
|
1354
|
+
• List audio devices: agent-cli voice-edit --list-devices
|
|
1218
1355
|
|
|
1219
1356
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1220
1357
|
│ --help -h Show this message and exit. │
|
|
@@ -1232,10 +1369,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1232
1369
|
│ [default: wyoming] │
|
|
1233
1370
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1234
1371
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1235
|
-
│ --input-device-index INTEGER
|
|
1236
|
-
│
|
|
1237
|
-
│ --
|
|
1238
|
-
│
|
|
1372
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1373
|
+
│ Uses system default if omitted. │
|
|
1374
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1375
|
+
│ MacBook or USB). │
|
|
1376
|
+
│ --list-devices List available audio devices with their indices │
|
|
1377
|
+
│ and exit. │
|
|
1239
1378
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1240
1379
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1241
1380
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1286,10 +1425,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1286
1425
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1287
1426
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1288
1427
|
│ [default: no-tts] │
|
|
1289
|
-
│ --output-device-index INTEGER
|
|
1290
|
-
│ for
|
|
1291
|
-
│ --output-device-name TEXT
|
|
1292
|
-
│
|
|
1428
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1429
|
+
│ --list-devices for available devices). │
|
|
1430
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1431
|
+
│ 'speakers', 'headphones'). │
|
|
1293
1432
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1294
1433
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1295
1434
|
│ [default: 1.0] │
|
|
@@ -1307,7 +1446,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1307
1446
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1308
1447
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1309
1448
|
│ [default: tts-1] │
|
|
1310
|
-
│ --tts-openai-voice TEXT
|
|
1449
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1450
|
+
│ nova, shimmer). │
|
|
1311
1451
|
│ [default: alloy] │
|
|
1312
1452
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1313
1453
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1328,28 +1468,27 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1328
1468
|
│ [default: Kore] │
|
|
1329
1469
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1330
1470
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1331
|
-
│ --stop Stop any running
|
|
1332
|
-
│ --status Check if
|
|
1333
|
-
│ --toggle
|
|
1334
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1471
|
+
│ --stop Stop any running instance of this command. │
|
|
1472
|
+
│ --status Check if an instance is currently running. │
|
|
1473
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1335
1474
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1336
1475
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1337
|
-
│ --save-file PATH Save
|
|
1338
|
-
│
|
|
1476
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1477
|
+
│ instead of playing │
|
|
1478
|
+
│ through speakers. │
|
|
1339
1479
|
│ --clipboard --no-clipboard Copy result to │
|
|
1340
1480
|
│ clipboard. │
|
|
1341
1481
|
│ [default: clipboard] │
|
|
1342
1482
|
│ --log-level [debug|info|warning|erro Set logging level. │
|
|
1343
1483
|
│ r] [env var: LOG_LEVEL] │
|
|
1344
|
-
│ [default:
|
|
1484
|
+
│ [default: warning] │
|
|
1345
1485
|
│ --log-file TEXT Path to a file to write │
|
|
1346
1486
|
│ logs to. │
|
|
1347
1487
|
│ --quiet -q Suppress console output │
|
|
1348
1488
|
│ from rich. │
|
|
1349
1489
|
│ --json Output result as JSON │
|
|
1350
|
-
│
|
|
1351
|
-
│ --
|
|
1352
|
-
│ --no-clipboard. │
|
|
1490
|
+
│ (implies --quiet and │
|
|
1491
|
+
│ --no-clipboard). │
|
|
1353
1492
|
│ --config TEXT Path to a TOML │
|
|
1354
1493
|
│ configuration file. │
|
|
1355
1494
|
│ --print-args Print the command line │
|
|
@@ -1400,7 +1539,28 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1400
1539
|
|
|
1401
1540
|
Usage: agent-cli assistant [OPTIONS]
|
|
1402
1541
|
|
|
1403
|
-
|
|
1542
|
+
Hands-free voice assistant using wake word detection.
|
|
1543
|
+
|
|
1544
|
+
Continuously listens for a wake word, then records your speech until you say the wake
|
|
1545
|
+
word again. The recording is transcribed and sent to an LLM for a conversational
|
|
1546
|
+
response, optionally spoken back via TTS.
|
|
1547
|
+
|
|
1548
|
+
Conversation flow:
|
|
1549
|
+
|
|
1550
|
+
1 Say wake word → starts recording
|
|
1551
|
+
2 Speak your question/command
|
|
1552
|
+
3 Say wake word again → stops recording and processes
|
|
1553
|
+
|
|
1554
|
+
The assistant runs in a loop, ready for the next command after each response. Stop with
|
|
1555
|
+
Ctrl+C or --stop.
|
|
1556
|
+
|
|
1557
|
+
Requirements:
|
|
1558
|
+
|
|
1559
|
+
• Wyoming wake word server (e.g., wyoming-openwakeword on port 10400)
|
|
1560
|
+
• Wyoming ASR server (e.g., wyoming-whisper on port 10300)
|
|
1561
|
+
• Optional: TTS server for spoken responses (enable with --tts)
|
|
1562
|
+
|
|
1563
|
+
Example: assistant --wake-word ok_nabu --tts --input-device-name USB
|
|
1404
1564
|
|
|
1405
1565
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1406
1566
|
│ --help -h Show this message and exit. │
|
|
@@ -1418,19 +1578,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1418
1578
|
│ [default: wyoming] │
|
|
1419
1579
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1420
1580
|
╭─ Wake Word ────────────────────────────────────────────────────────────────────────────╮
|
|
1421
|
-
│ --wake-server-ip TEXT Wyoming wake word server IP
|
|
1581
|
+
│ --wake-server-ip TEXT Wyoming wake word server IP (requires │
|
|
1582
|
+
│ wyoming-openwakeword or similar). │
|
|
1422
1583
|
│ [default: localhost] │
|
|
1423
1584
|
│ --wake-server-port INTEGER Wyoming wake word server port. │
|
|
1424
1585
|
│ [default: 10400] │
|
|
1425
|
-
│ --wake-word TEXT
|
|
1426
|
-
│
|
|
1586
|
+
│ --wake-word TEXT Wake word to detect. Common options: ok_nabu, │
|
|
1587
|
+
│ hey_jarvis, alexa. Must match a model loaded in │
|
|
1588
|
+
│ your wake word server. │
|
|
1427
1589
|
│ [default: ok_nabu] │
|
|
1428
1590
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1429
1591
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1430
|
-
│ --input-device-index INTEGER
|
|
1431
|
-
│
|
|
1432
|
-
│ --
|
|
1433
|
-
│
|
|
1592
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1593
|
+
│ Uses system default if omitted. │
|
|
1594
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1595
|
+
│ MacBook or USB). │
|
|
1596
|
+
│ --list-devices List available audio devices with their indices │
|
|
1597
|
+
│ and exit. │
|
|
1434
1598
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1435
1599
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1436
1600
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1481,10 +1645,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1481
1645
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1482
1646
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1483
1647
|
│ [default: no-tts] │
|
|
1484
|
-
│ --output-device-index INTEGER
|
|
1485
|
-
│ for
|
|
1486
|
-
│ --output-device-name TEXT
|
|
1487
|
-
│
|
|
1648
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1649
|
+
│ --list-devices for available devices). │
|
|
1650
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1651
|
+
│ 'speakers', 'headphones'). │
|
|
1488
1652
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1489
1653
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1490
1654
|
│ [default: 1.0] │
|
|
@@ -1502,7 +1666,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1502
1666
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1503
1667
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1504
1668
|
│ [default: tts-1] │
|
|
1505
|
-
│ --tts-openai-voice TEXT
|
|
1669
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1670
|
+
│ nova, shimmer). │
|
|
1506
1671
|
│ [default: alloy] │
|
|
1507
1672
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1508
1673
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1523,20 +1688,20 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1523
1688
|
│ [default: Kore] │
|
|
1524
1689
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1525
1690
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1526
|
-
│ --stop Stop any running
|
|
1527
|
-
│ --status Check if
|
|
1528
|
-
│ --toggle
|
|
1529
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1691
|
+
│ --stop Stop any running instance of this command. │
|
|
1692
|
+
│ --status Check if an instance is currently running. │
|
|
1693
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1530
1694
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1531
1695
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1532
|
-
│ --save-file PATH Save
|
|
1533
|
-
│
|
|
1696
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1697
|
+
│ instead of playing │
|
|
1698
|
+
│ through speakers. │
|
|
1534
1699
|
│ --clipboard --no-clipboard Copy result to │
|
|
1535
1700
|
│ clipboard. │
|
|
1536
1701
|
│ [default: clipboard] │
|
|
1537
1702
|
│ --log-level [debug|info|warning|erro Set logging level. │
|
|
1538
1703
|
│ r] [env var: LOG_LEVEL] │
|
|
1539
|
-
│ [default:
|
|
1704
|
+
│ [default: warning] │
|
|
1540
1705
|
│ --log-file TEXT Path to a file to write │
|
|
1541
1706
|
│ logs to. │
|
|
1542
1707
|
│ --quiet -q Suppress console output │
|
|
@@ -1598,7 +1763,39 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1598
1763
|
|
|
1599
1764
|
Usage: agent-cli chat [OPTIONS]
|
|
1600
1765
|
|
|
1601
|
-
|
|
1766
|
+
Voice-based conversational chat agent with memory and tools.
|
|
1767
|
+
|
|
1768
|
+
Runs an interactive loop: listen → transcribe → LLM → speak response. Conversation
|
|
1769
|
+
history is persisted and included as context for continuity.
|
|
1770
|
+
|
|
1771
|
+
Built-in tools (LLM uses automatically when relevant):
|
|
1772
|
+
|
|
1773
|
+
• add_memory/search_memory/update_memory - persistent long-term memory
|
|
1774
|
+
• duckduckgo_search - web search for current information
|
|
1775
|
+
• read_file/execute_code - file access and shell commands
|
|
1776
|
+
|
|
1777
|
+
Process management: Use --toggle to start/stop via hotkey (bind to a keyboard shortcut),
|
|
1778
|
+
--stop to terminate, or --status to check state.
|
|
1779
|
+
|
|
1780
|
+
Examples:
|
|
1781
|
+
|
|
1782
|
+
Use OpenAI-compatible providers for speech and LLM, with TTS enabled:
|
|
1783
|
+
|
|
1784
|
+
|
|
1785
|
+
agent-cli chat --asr-provider openai --llm-provider openai --tts
|
|
1786
|
+
|
|
1787
|
+
|
|
1788
|
+
Start in background mode (toggle on/off with hotkey):
|
|
1789
|
+
|
|
1790
|
+
|
|
1791
|
+
agent-cli chat --toggle
|
|
1792
|
+
|
|
1793
|
+
|
|
1794
|
+
Use local Ollama LLM with Wyoming ASR:
|
|
1795
|
+
|
|
1796
|
+
|
|
1797
|
+
agent-cli chat --llm-provider ollama
|
|
1798
|
+
|
|
1602
1799
|
|
|
1603
1800
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1604
1801
|
│ --help -h Show this message and exit. │
|
|
@@ -1616,10 +1813,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1616
1813
|
│ [default: wyoming] │
|
|
1617
1814
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1618
1815
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1619
|
-
│ --input-device-index INTEGER
|
|
1620
|
-
│
|
|
1621
|
-
│ --
|
|
1622
|
-
│
|
|
1816
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1817
|
+
│ Uses system default if omitted. │
|
|
1818
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1819
|
+
│ MacBook or USB). │
|
|
1820
|
+
│ --list-devices List available audio devices with their indices │
|
|
1821
|
+
│ and exit. │
|
|
1623
1822
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1624
1823
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1625
1824
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1676,10 +1875,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1676
1875
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1677
1876
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1678
1877
|
│ [default: no-tts] │
|
|
1679
|
-
│ --output-device-index INTEGER
|
|
1680
|
-
│ for
|
|
1681
|
-
│ --output-device-name TEXT
|
|
1682
|
-
│
|
|
1878
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1879
|
+
│ --list-devices for available devices). │
|
|
1880
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1881
|
+
│ 'speakers', 'headphones'). │
|
|
1683
1882
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1684
1883
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1685
1884
|
│ [default: 1.0] │
|
|
@@ -1697,7 +1896,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1697
1896
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1698
1897
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1699
1898
|
│ [default: tts-1] │
|
|
1700
|
-
│ --tts-openai-voice TEXT
|
|
1899
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1900
|
+
│ nova, shimmer). │
|
|
1701
1901
|
│ [default: alloy] │
|
|
1702
1902
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1703
1903
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1718,23 +1918,26 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1718
1918
|
│ [default: Kore] │
|
|
1719
1919
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1720
1920
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1721
|
-
│ --stop Stop any running
|
|
1722
|
-
│ --status Check if
|
|
1723
|
-
│ --toggle
|
|
1724
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1921
|
+
│ --stop Stop any running instance of this command. │
|
|
1922
|
+
│ --status Check if an instance is currently running. │
|
|
1923
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1725
1924
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1726
1925
|
╭─ History Options ──────────────────────────────────────────────────────────────────────╮
|
|
1727
|
-
│ --history-dir PATH Directory
|
|
1926
|
+
│ --history-dir PATH Directory for conversation history and long-term │
|
|
1927
|
+
│ memory. Both conversation.json and │
|
|
1928
|
+
│ long_term_memory.json are stored here. │
|
|
1728
1929
|
│ [default: ~/.config/agent-cli/history] │
|
|
1729
|
-
│ --last-n-messages INTEGER Number of messages to include
|
|
1730
|
-
│
|
|
1930
|
+
│ --last-n-messages INTEGER Number of past messages to include as context for │
|
|
1931
|
+
│ the LLM. Set to 0 to start fresh each session │
|
|
1932
|
+
│ (memory tools still persist). │
|
|
1731
1933
|
│ [default: 50] │
|
|
1732
1934
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1733
1935
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1734
|
-
│ --save-file PATH Save
|
|
1936
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1937
|
+
│ playing through speakers. │
|
|
1735
1938
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1736
1939
|
│ [env var: LOG_LEVEL] │
|
|
1737
|
-
│ [default:
|
|
1940
|
+
│ [default: warning] │
|
|
1738
1941
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
1739
1942
|
│ --quiet -q Suppress console output from rich. │
|
|
1740
1943
|
│ --config TEXT Path to a TOML configuration file. │
|
|
@@ -1786,25 +1989,68 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1786
1989
|
|
|
1787
1990
|
Usage: agent-cli rag-proxy [OPTIONS]
|
|
1788
1991
|
|
|
1789
|
-
Start
|
|
1992
|
+
Start a RAG proxy server that enables "chat with your documents".
|
|
1993
|
+
|
|
1994
|
+
Watches a folder for documents, indexes them into a vector store, and provides an
|
|
1995
|
+
OpenAI-compatible API at /v1/chat/completions. When you send a chat request, the server
|
|
1996
|
+
retrieves relevant document chunks and injects them as context before forwarding to your
|
|
1997
|
+
LLM backend.
|
|
1998
|
+
|
|
1999
|
+
Quick start:
|
|
2000
|
+
|
|
2001
|
+
• agent-cli rag-proxy — Start with defaults (./rag_docs, OpenAI-compatible API)
|
|
2002
|
+
• agent-cli rag-proxy --docs-folder ~/notes — Index your notes folder
|
|
2003
|
+
|
|
2004
|
+
How it works:
|
|
2005
|
+
|
|
2006
|
+
1 Documents in --docs-folder are chunked, embedded, and stored in ChromaDB
|
|
2007
|
+
2 A file watcher auto-reindexes when files change
|
|
2008
|
+
3 Chat requests trigger a semantic search for relevant chunks
|
|
2009
|
+
4 Retrieved context is injected into the prompt before forwarding to the LLM
|
|
2010
|
+
5 Responses include a rag_sources field listing which documents were used
|
|
2011
|
+
|
|
2012
|
+
Supported file formats:
|
|
1790
2013
|
|
|
1791
|
-
|
|
1792
|
-
|
|
1793
|
-
|
|
2014
|
+
Text: .txt, .md, .json, .py, .js, .ts, .yaml, .toml, .rst, etc. Rich documents (via
|
|
2015
|
+
MarkItDown): .pdf, .docx, .pptx, .xlsx, .html, .csv
|
|
2016
|
+
|
|
2017
|
+
API endpoints:
|
|
2018
|
+
|
|
2019
|
+
• POST /v1/chat/completions — Main chat endpoint (OpenAI-compatible)
|
|
2020
|
+
• GET /health — Health check with configuration info
|
|
2021
|
+
• GET /files — List indexed files with chunk counts
|
|
2022
|
+
• POST /reindex — Trigger manual reindex
|
|
2023
|
+
• All other paths are proxied to the LLM backend
|
|
2024
|
+
|
|
2025
|
+
Per-request overrides (in JSON body):
|
|
2026
|
+
|
|
2027
|
+
• rag_top_k: Override --limit for this request
|
|
2028
|
+
• rag_enable_tools: Override --rag-tools for this request
|
|
1794
2029
|
|
|
1795
2030
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1796
2031
|
│ --help -h Show this message and exit. │
|
|
1797
2032
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1798
2033
|
╭─ RAG Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1799
|
-
│ --docs-folder PATH Folder to watch for documents
|
|
2034
|
+
│ --docs-folder PATH Folder to watch for documents. Files are │
|
|
2035
|
+
│ auto-indexed on startup and when changed. │
|
|
2036
|
+
│ Must not overlap with --chroma-path. │
|
|
1800
2037
|
│ [default: ./rag_docs] │
|
|
1801
|
-
│ --chroma-path PATH
|
|
2038
|
+
│ --chroma-path PATH ChromaDB storage directory for vector │
|
|
2039
|
+
│ embeddings. Must be separate from │
|
|
2040
|
+
│ --docs-folder to avoid indexing database │
|
|
2041
|
+
│ files. │
|
|
1802
2042
|
│ [default: ./rag_db] │
|
|
1803
2043
|
│ --limit INTEGER Number of document chunks to retrieve per │
|
|
1804
|
-
│ query.
|
|
2044
|
+
│ query. Higher values provide more context │
|
|
2045
|
+
│ but use more tokens. Can be overridden │
|
|
2046
|
+
│ per-request via rag_top_k in the JSON │
|
|
2047
|
+
│ body. │
|
|
1805
2048
|
│ [default: 3] │
|
|
1806
|
-
│ --rag-tools --no-rag-tools
|
|
1807
|
-
│
|
|
2049
|
+
│ --rag-tools --no-rag-tools Enable read_full_document() tool so the │
|
|
2050
|
+
│ LLM can request full document content when │
|
|
2051
|
+
│ retrieved snippets are insufficient. Can │
|
|
2052
|
+
│ be overridden per-request via │
|
|
2053
|
+
│ rag_enable_tools in the JSON body. │
|
|
1808
2054
|
│ [default: rag-tools] │
|
|
1809
2055
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1810
2056
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -1822,7 +2068,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1822
2068
|
╭─ Server Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1823
2069
|
│ --host TEXT Host/IP to bind API servers to. │
|
|
1824
2070
|
│ [default: 0.0.0.0] │
|
|
1825
|
-
│ --port INTEGER Port
|
|
2071
|
+
│ --port INTEGER Port for the RAG proxy API (e.g., │
|
|
2072
|
+
│ http://localhost:8000/v1/chat/completions). │
|
|
1826
2073
|
│ [default: 8000] │
|
|
1827
2074
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1828
2075
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
@@ -1912,41 +2159,61 @@ The `memory proxy` command is the core feature—a middleware server that gives
|
|
|
1912
2159
|
5 Extracts new facts from the conversation in the background and updates the long-term
|
|
1913
2160
|
memory store (including handling contradictions).
|
|
1914
2161
|
|
|
1915
|
-
|
|
1916
|
-
|
|
2162
|
+
Example:
|
|
2163
|
+
|
|
2164
|
+
|
|
2165
|
+
# Start proxy pointing to local Ollama
|
|
2166
|
+
agent-cli memory proxy --openai-base-url http://localhost:11434/v1
|
|
2167
|
+
|
|
2168
|
+
# Then configure your chat client to use http://localhost:8100/v1
|
|
2169
|
+
# as its OpenAI base URL. All requests flow through the memory proxy.
|
|
2170
|
+
|
|
2171
|
+
|
|
2172
|
+
Per-request overrides: Clients can include these fields in the request body: memory_id
|
|
2173
|
+
(conversation ID), memory_top_k, memory_recency_weight, memory_score_threshold.
|
|
1917
2174
|
|
|
1918
2175
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1919
2176
|
│ --help -h Show this message and exit. │
|
|
1920
2177
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1921
2178
|
╭─ Memory Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1922
|
-
│ --memory-path PATH
|
|
1923
|
-
│
|
|
2179
|
+
│ --memory-path PATH Directory for memory storage. │
|
|
2180
|
+
│ Contains entries/ (Markdown │
|
|
2181
|
+
│ files) and chroma/ (vector │
|
|
2182
|
+
│ index). Created automatically if │
|
|
2183
|
+
│ it doesn't exist. │
|
|
1924
2184
|
│ [default: ./memory_db] │
|
|
1925
|
-
│ --default-top-k INTEGER Number of
|
|
1926
|
-
│
|
|
2185
|
+
│ --default-top-k INTEGER Number of relevant memories to │
|
|
2186
|
+
│ inject into each request. Higher │
|
|
2187
|
+
│ values provide more context but │
|
|
2188
|
+
│ increase token usage. │
|
|
1927
2189
|
│ [default: 5] │
|
|
1928
|
-
│ --max-entries INTEGER Maximum
|
|
1929
|
-
│
|
|
2190
|
+
│ --max-entries INTEGER Maximum entries per conversation │
|
|
2191
|
+
│ before oldest are evicted. │
|
|
2192
|
+
│ Summaries are preserved │
|
|
2193
|
+
│ separately. │
|
|
1930
2194
|
│ [default: 500] │
|
|
1931
2195
|
│ --mmr-lambda FLOAT MMR lambda (0-1): higher favors │
|
|
1932
2196
|
│ relevance, lower favors │
|
|
1933
2197
|
│ diversity. │
|
|
1934
2198
|
│ [default: 0.7] │
|
|
1935
|
-
│ --recency-weight FLOAT
|
|
1936
|
-
│
|
|
1937
|
-
│
|
|
1938
|
-
│ semantic relevance). │
|
|
2199
|
+
│ --recency-weight FLOAT Weight for recency vs semantic │
|
|
2200
|
+
│ relevance (0.0-1.0). At 0.2: 20% │
|
|
2201
|
+
│ recency, 80% semantic similarity. │
|
|
1939
2202
|
│ [default: 0.2] │
|
|
1940
2203
|
│ --score-threshold FLOAT Minimum semantic relevance │
|
|
1941
2204
|
│ threshold (0.0-1.0). Memories │
|
|
1942
2205
|
│ below this score are discarded to │
|
|
1943
2206
|
│ reduce noise. │
|
|
1944
2207
|
│ [default: 0.35] │
|
|
1945
|
-
│ --summarization --no-summarization
|
|
1946
|
-
│
|
|
2208
|
+
│ --summarization --no-summarization Extract facts and generate │
|
|
2209
|
+
│ summaries after each turn using │
|
|
2210
|
+
│ the LLM. Disable to only store │
|
|
2211
|
+
│ raw conversation turns. │
|
|
1947
2212
|
│ [default: summarization] │
|
|
1948
|
-
│ --git-versioning --no-git-versioning
|
|
1949
|
-
│
|
|
2213
|
+
│ --git-versioning --no-git-versioning Auto-commit memory changes to │
|
|
2214
|
+
│ git. Initializes a repo in │
|
|
2215
|
+
│ --memory-path if needed. Provides │
|
|
2216
|
+
│ full history of memory evolution. │
|
|
1950
2217
|
│ [default: git-versioning] │
|
|
1951
2218
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1952
2219
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -2059,12 +2326,16 @@ agent-cli memory add -c work "Project deadline is Friday"
|
|
|
2059
2326
|
│ for stdin. Supports JSON array, │
|
|
2060
2327
|
│ JSON object with 'memories' key, │
|
|
2061
2328
|
│ or plain text (one per line). │
|
|
2062
|
-
│ --conversation-id -c TEXT Conversation
|
|
2063
|
-
│
|
|
2329
|
+
│ --conversation-id -c TEXT Conversation namespace for these │
|
|
2330
|
+
│ memories. Memories are retrieved │
|
|
2331
|
+
│ per-conversation unless shared │
|
|
2332
|
+
│ globally. │
|
|
2064
2333
|
│ [default: default] │
|
|
2065
|
-
│ --memory-path PATH
|
|
2334
|
+
│ --memory-path PATH Directory for memory storage (same │
|
|
2335
|
+
│ as memory proxy --memory-path). │
|
|
2066
2336
|
│ [default: ./memory_db] │
|
|
2067
|
-
│ --git-versioning --no-git-versioning
|
|
2337
|
+
│ --git-versioning --no-git-versioning Auto-commit changes to git for │
|
|
2338
|
+
│ version history. │
|
|
2068
2339
|
│ [default: git-versioning] │
|
|
2069
2340
|
│ --help -h Show this message and exit. │
|
|
2070
2341
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|