agent-cli 0.70.5__py3-none-any.whl → 0.71.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- agent_cli/agents/assistant.py +23 -27
- agent_cli/agents/autocorrect.py +29 -3
- agent_cli/agents/chat.py +44 -14
- agent_cli/agents/memory/__init__.py +19 -1
- agent_cli/agents/memory/add.py +3 -3
- agent_cli/agents/memory/proxy.py +19 -10
- agent_cli/agents/rag_proxy.py +41 -9
- agent_cli/agents/speak.py +22 -2
- agent_cli/agents/transcribe.py +20 -2
- agent_cli/agents/transcribe_daemon.py +33 -21
- agent_cli/agents/voice_edit.py +17 -9
- agent_cli/cli.py +25 -2
- agent_cli/config_cmd.py +30 -11
- agent_cli/dev/cli.py +295 -65
- agent_cli/docs_gen.py +18 -8
- agent_cli/install/extras.py +39 -10
- agent_cli/install/hotkeys.py +22 -11
- agent_cli/install/services.py +54 -14
- agent_cli/opts.py +23 -20
- agent_cli/server/cli.py +118 -44
- {agent_cli-0.70.5.dist-info → agent_cli-0.71.0.dist-info}/METADATA +456 -187
- {agent_cli-0.70.5.dist-info → agent_cli-0.71.0.dist-info}/RECORD +25 -25
- {agent_cli-0.70.5.dist-info → agent_cli-0.71.0.dist-info}/WHEEL +0 -0
- {agent_cli-0.70.5.dist-info → agent_cli-0.71.0.dist-info}/entry_points.txt +0 -0
- {agent_cli-0.70.5.dist-info → agent_cli-0.71.0.dist-info}/licenses/LICENSE +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: agent-cli
|
|
3
|
-
Version: 0.
|
|
3
|
+
Version: 0.71.0
|
|
4
4
|
Summary: A suite of AI-powered command-line tools for text correction, audio transcription, and voice assistance.
|
|
5
5
|
Project-URL: Homepage, https://github.com/basnijholt/agent-cli
|
|
6
6
|
Author-email: Bas Nijholt <bas@nijho.lt>
|
|
@@ -134,7 +134,7 @@ Since then I have expanded the tool with many more features, all focused on loca
|
|
|
134
134
|
- **[`memory`](docs/commands/memory.md)**: Long-term memory system with `memory proxy` and `memory add`.
|
|
135
135
|
- **[`rag-proxy`](docs/commands/rag-proxy.md)**: RAG proxy server for chatting with your documents.
|
|
136
136
|
- **[`dev`](docs/commands/dev.md)**: Parallel development with git worktrees and AI coding agents.
|
|
137
|
-
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
137
|
+
- **[`server`](docs/commands/server/index.md)**: Local ASR and TTS servers with dual-protocol (Wyoming & OpenAI-compatible APIs), TTL-based memory management, and multi-platform acceleration. Whisper uses MLX on Apple Silicon or Faster Whisper on Linux/CUDA. TTS supports Kokoro (GPU) or Piper (CPU).
|
|
138
138
|
- **[`transcribe-daemon`](docs/commands/transcribe-daemon.md)**: Continuous background transcription with VAD. Install with `uv tool install "agent-cli[vad]" -p 3.13`.
|
|
139
139
|
|
|
140
140
|
## Quick Start
|
|
@@ -498,21 +498,43 @@ agent-cli install-extras rag memory vad
|
|
|
498
498
|
|
|
499
499
|
Usage: agent-cli install-extras [OPTIONS] [EXTRAS]...
|
|
500
500
|
|
|
501
|
-
Install optional
|
|
501
|
+
Install optional dependencies with pinned, compatible versions.
|
|
502
|
+
|
|
503
|
+
Many agent-cli features require optional dependencies. This command installs them with
|
|
504
|
+
version pinning to ensure compatibility. Dependencies persist across uv tool upgrade
|
|
505
|
+
when installed via uv tool.
|
|
506
|
+
|
|
507
|
+
Available extras:
|
|
508
|
+
|
|
509
|
+
• rag - RAG proxy server (ChromaDB, embeddings)
|
|
510
|
+
• memory - Long-term memory proxy (ChromaDB)
|
|
511
|
+
• vad - Voice Activity Detection (silero-vad)
|
|
512
|
+
• audio - Local audio recording/playback
|
|
513
|
+
• piper - Local Piper TTS engine
|
|
514
|
+
• kokoro - Kokoro neural TTS engine
|
|
515
|
+
• faster-whisper - Whisper ASR for CUDA/CPU
|
|
516
|
+
• mlx-whisper - Whisper ASR for Apple Silicon
|
|
517
|
+
• wyoming - Wyoming protocol for ASR/TTS servers
|
|
518
|
+
• server - FastAPI server components
|
|
519
|
+
• speed - Audio speed adjustment
|
|
520
|
+
• llm - LLM framework (pydantic-ai)
|
|
502
521
|
|
|
503
522
|
Examples:
|
|
504
523
|
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
|
|
508
|
-
|
|
524
|
+
|
|
525
|
+
agent-cli install-extras rag # Install RAG dependencies
|
|
526
|
+
agent-cli install-extras memory vad # Install multiple extras
|
|
527
|
+
agent-cli install-extras --list # Show available extras
|
|
528
|
+
agent-cli install-extras --all # Install all extras
|
|
529
|
+
|
|
509
530
|
|
|
510
531
|
╭─ Arguments ────────────────────────────────────────────────────────────────────────────╮
|
|
511
|
-
│ extras [EXTRAS]... Extras to install
|
|
532
|
+
│ extras [EXTRAS]... Extras to install: rag, memory, vad, audio, piper, kokoro, │
|
|
533
|
+
│ faster-whisper, mlx-whisper, wyoming, server, speed, llm │
|
|
512
534
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
513
535
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
514
|
-
│ --list -l
|
|
515
|
-
│ --all -a Install all available extras
|
|
536
|
+
│ --list -l Show available extras with descriptions (what each one enables) │
|
|
537
|
+
│ --all -a Install all available extras at once │
|
|
516
538
|
│ --help -h Show this message and exit. │
|
|
517
539
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
518
540
|
|
|
@@ -571,13 +593,21 @@ agent-cli config edit
|
|
|
571
593
|
|
|
572
594
|
Manage agent-cli configuration files.
|
|
573
595
|
|
|
596
|
+
Config files are TOML format and searched in order:
|
|
597
|
+
|
|
598
|
+
1 ./agent-cli-config.toml (project-local)
|
|
599
|
+
2 ~/.config/agent-cli/config.toml (user default)
|
|
600
|
+
|
|
601
|
+
Settings in [defaults] apply to all commands. Override per-command with sections like
|
|
602
|
+
[chat] or [transcribe]. CLI arguments override config file settings.
|
|
603
|
+
|
|
574
604
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
575
605
|
│ --help -h Show this message and exit. │
|
|
576
606
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
577
607
|
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
|
|
578
|
-
│ init Create a new config file with all options commented
|
|
608
|
+
│ init Create a new config file with all options as commented-out examples. │
|
|
579
609
|
│ edit Open the config file in your default editor. │
|
|
580
|
-
│ show Display the config file
|
|
610
|
+
│ show Display the active config file path and contents. │
|
|
581
611
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
582
612
|
|
|
583
613
|
```
|
|
@@ -635,10 +665,37 @@ the `[defaults]` section of your configuration file.
|
|
|
635
665
|
|
|
636
666
|
Usage: agent-cli autocorrect [OPTIONS] [TEXT]
|
|
637
667
|
|
|
638
|
-
|
|
668
|
+
Fix grammar, spelling, and punctuation using an LLM.
|
|
669
|
+
|
|
670
|
+
Reads text from clipboard (or argument), sends to LLM for correction, and copies the
|
|
671
|
+
result back to clipboard. Only makes technical corrections without changing meaning or
|
|
672
|
+
tone.
|
|
673
|
+
|
|
674
|
+
Workflow:
|
|
675
|
+
|
|
676
|
+
1 Read text from clipboard (or TEXT argument)
|
|
677
|
+
2 Send to LLM for grammar/spelling/punctuation fixes
|
|
678
|
+
3 Copy corrected text to clipboard (unless --json)
|
|
679
|
+
4 Display result
|
|
680
|
+
|
|
681
|
+
Examples:
|
|
682
|
+
|
|
683
|
+
|
|
684
|
+
# Correct text from clipboard (default)
|
|
685
|
+
agent-cli autocorrect
|
|
686
|
+
|
|
687
|
+
# Correct specific text
|
|
688
|
+
agent-cli autocorrect "this is incorect"
|
|
689
|
+
|
|
690
|
+
# Use OpenAI instead of local Ollama
|
|
691
|
+
agent-cli autocorrect --llm-provider openai
|
|
692
|
+
|
|
693
|
+
# Get JSON output for scripting (disables clipboard)
|
|
694
|
+
agent-cli autocorrect --json
|
|
695
|
+
|
|
639
696
|
|
|
640
697
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
641
|
-
│ text [TEXT]
|
|
698
|
+
│ text [TEXT] Text to correct. If omitted, reads from system clipboard. │
|
|
642
699
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
643
700
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
644
701
|
│ --help -h Show this message and exit. │
|
|
@@ -682,9 +739,8 @@ the `[defaults]` section of your configuration file.
|
|
|
682
739
|
│ [default: info] │
|
|
683
740
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
684
741
|
│ --quiet -q Suppress console output from rich. │
|
|
685
|
-
│ --json Output result as JSON
|
|
686
|
-
│
|
|
687
|
-
│ --no-clipboard. │
|
|
742
|
+
│ --json Output result as JSON (implies │
|
|
743
|
+
│ --quiet and --no-clipboard). │
|
|
688
744
|
│ --config TEXT Path to a TOML configuration file. │
|
|
689
745
|
│ --print-args Print the command line arguments, │
|
|
690
746
|
│ including variables taken from the │
|
|
@@ -732,30 +788,50 @@ the `[defaults]` section of your configuration file.
|
|
|
732
788
|
|
|
733
789
|
Usage: agent-cli transcribe [OPTIONS]
|
|
734
790
|
|
|
735
|
-
|
|
791
|
+
Record audio from microphone and transcribe to text.
|
|
792
|
+
|
|
793
|
+
Records until you press Ctrl+C (or send SIGINT), then transcribes using your configured
|
|
794
|
+
ASR provider. The transcript is copied to the clipboard by default.
|
|
795
|
+
|
|
796
|
+
With --llm: Passes the raw transcript through an LLM to clean up speech recognition
|
|
797
|
+
errors, add punctuation, remove filler words, and improve readability.
|
|
798
|
+
|
|
799
|
+
With --toggle: Bind to a hotkey for push-to-talk. First call starts recording, second
|
|
800
|
+
call stops and transcribes.
|
|
801
|
+
|
|
802
|
+
Examples:
|
|
803
|
+
|
|
804
|
+
• Record and transcribe: agent-cli transcribe
|
|
805
|
+
• With LLM cleanup: agent-cli transcribe --llm
|
|
806
|
+
• Re-transcribe last recording: agent-cli transcribe --last-recording 1
|
|
736
807
|
|
|
737
808
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
738
809
|
│ --help -h Show this message and exit. │
|
|
739
810
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
740
811
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
741
|
-
│ --extra-instructions TEXT
|
|
742
|
-
│
|
|
743
|
-
│ --llm --no-llm
|
|
812
|
+
│ --extra-instructions TEXT Extra instructions appended to the LLM │
|
|
813
|
+
│ cleanup prompt (requires --llm). │
|
|
814
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, │
|
|
815
|
+
│ add punctuation, remove filler words. Uses │
|
|
816
|
+
│ --extra-instructions if set (via CLI or │
|
|
817
|
+
│ config file). │
|
|
744
818
|
│ [default: no-llm] │
|
|
745
819
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
746
820
|
╭─ Audio Recovery ───────────────────────────────────────────────────────────────────────╮
|
|
747
|
-
│ --from-file PATH Transcribe audio
|
|
748
|
-
│
|
|
749
|
-
│ flac, aac, webm
|
|
750
|
-
│ for non-WAV
|
|
751
|
-
│
|
|
752
|
-
│ --last-recording INTEGER
|
|
753
|
-
│ 1
|
|
754
|
-
│
|
|
755
|
-
│
|
|
821
|
+
│ --from-file PATH Transcribe from audio file instead │
|
|
822
|
+
│ of microphone. Supports wav, mp3, │
|
|
823
|
+
│ m4a, ogg, flac, aac, webm. │
|
|
824
|
+
│ Requires ffmpeg for non-WAV │
|
|
825
|
+
│ formats with Wyoming. │
|
|
826
|
+
│ --last-recording INTEGER Re-transcribe a saved recording │
|
|
827
|
+
│ (1=most recent, 2=second-to-last, │
|
|
828
|
+
│ etc). Useful after connection │
|
|
829
|
+
│ failures or to retry with │
|
|
830
|
+
│ different options. │
|
|
756
831
|
│ [default: 0] │
|
|
757
|
-
│ --save-recording --no-save-recording Save
|
|
758
|
-
│ for
|
|
832
|
+
│ --save-recording --no-save-recording Save recordings to │
|
|
833
|
+
│ ~/.cache/agent-cli/ for │
|
|
834
|
+
│ --last-recording recovery. │
|
|
759
835
|
│ [default: save-recording] │
|
|
760
836
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
761
837
|
╭─ Provider Selection ───────────────────────────────────────────────────────────────────╮
|
|
@@ -767,10 +843,12 @@ the `[defaults]` section of your configuration file.
|
|
|
767
843
|
│ [default: ollama] │
|
|
768
844
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
769
845
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
770
|
-
│ --input-device-index INTEGER
|
|
771
|
-
│
|
|
772
|
-
│ --
|
|
773
|
-
│
|
|
846
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
847
|
+
│ Uses system default if omitted. │
|
|
848
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
849
|
+
│ MacBook or USB). │
|
|
850
|
+
│ --list-devices List available audio devices with their indices │
|
|
851
|
+
│ and exit. │
|
|
774
852
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
775
853
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
776
854
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -825,10 +903,9 @@ the `[defaults]` section of your configuration file.
|
|
|
825
903
|
│ [env var: GEMINI_API_KEY] │
|
|
826
904
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
827
905
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
828
|
-
│ --stop Stop any running
|
|
829
|
-
│ --status Check if
|
|
830
|
-
│ --toggle
|
|
831
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
906
|
+
│ --stop Stop any running instance of this command. │
|
|
907
|
+
│ --status Check if an instance is currently running. │
|
|
908
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
832
909
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
833
910
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
834
911
|
│ --clipboard --no-clipboard Copy result to │
|
|
@@ -842,9 +919,8 @@ the `[defaults]` section of your configuration file.
|
|
|
842
919
|
│ --quiet -q Suppress console │
|
|
843
920
|
│ output from rich. │
|
|
844
921
|
│ --json Output result as JSON │
|
|
845
|
-
│
|
|
846
|
-
│
|
|
847
|
-
│ --no-clipboard. │
|
|
922
|
+
│ (implies --quiet and │
|
|
923
|
+
│ --no-clipboard). │
|
|
848
924
|
│ --config TEXT Path to a TOML │
|
|
849
925
|
│ configuration file. │
|
|
850
926
|
│ --print-args Print the command │
|
|
@@ -852,11 +928,13 @@ the `[defaults]` section of your configuration file.
|
|
|
852
928
|
│ including variables │
|
|
853
929
|
│ taken from the │
|
|
854
930
|
│ configuration file. │
|
|
855
|
-
│ --transcription-log PATH
|
|
856
|
-
│
|
|
857
|
-
│
|
|
858
|
-
│
|
|
859
|
-
│
|
|
931
|
+
│ --transcription-log PATH Append transcripts to │
|
|
932
|
+
│ JSONL file │
|
|
933
|
+
│ (timestamp, hostname, │
|
|
934
|
+
│ model, raw/processed │
|
|
935
|
+
│ text). Recent entries │
|
|
936
|
+
│ provide context for │
|
|
937
|
+
│ LLM cleanup. │
|
|
860
938
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
861
939
|
|
|
862
940
|
```
|
|
@@ -912,46 +990,76 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
912
990
|
|
|
913
991
|
Usage: agent-cli transcribe-daemon [OPTIONS]
|
|
914
992
|
|
|
915
|
-
|
|
993
|
+
Continuous transcription daemon using Silero VAD for speech detection.
|
|
994
|
+
|
|
995
|
+
Unlike transcribe (single recording session), this daemon runs indefinitely and
|
|
996
|
+
automatically detects speech segments using Voice Activity Detection (VAD). Each
|
|
997
|
+
detected segment is transcribed and logged with timestamps.
|
|
916
998
|
|
|
917
|
-
|
|
918
|
-
segments using Silero VAD, transcribing them, and logging results with timestamps.
|
|
999
|
+
How it works:
|
|
919
1000
|
|
|
920
|
-
|
|
1001
|
+
1 Listens continuously to microphone input
|
|
1002
|
+
2 Silero VAD detects when you start/stop speaking
|
|
1003
|
+
3 After --silence-threshold seconds of silence, the segment is finalized
|
|
1004
|
+
4 Segment is transcribed (and optionally cleaned by LLM with --llm)
|
|
1005
|
+
5 Results are appended to the JSONL log file
|
|
1006
|
+
6 Audio is saved as MP3 if --save-audio is enabled (requires ffmpeg)
|
|
921
1007
|
|
|
1008
|
+
Use cases: Meeting transcription, note-taking, voice journaling, accessibility.
|
|
922
1009
|
|
|
923
|
-
|
|
1010
|
+
Examples:
|
|
1011
|
+
|
|
1012
|
+
|
|
1013
|
+
agent-cli transcribe-daemon
|
|
924
1014
|
agent-cli transcribe-daemon --role meeting --silence-threshold 1.5
|
|
1015
|
+
agent-cli transcribe-daemon --llm --clipboard --role notes
|
|
1016
|
+
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --no-save-audio
|
|
1017
|
+
agent-cli transcribe-daemon --asr-provider openai --llm-provider gemini --llm
|
|
925
1018
|
|
|
926
|
-
# With LLM cleanup
|
|
927
|
-
agent-cli transcribe-daemon --llm --role notes
|
|
928
1019
|
|
|
929
|
-
|
|
930
|
-
agent-cli transcribe-daemon --transcription-log ~/meeting.jsonl --audio-dir ~/audio
|
|
1020
|
+
Tips:
|
|
931
1021
|
|
|
1022
|
+
• Use --role to tag entries (e.g., speaker1, meeting, personal)
|
|
1023
|
+
• Adjust --vad-threshold if detection is too sensitive (increase) or missing speech
|
|
1024
|
+
(decrease)
|
|
1025
|
+
• Use --stop to cleanly terminate a running daemon
|
|
1026
|
+
• With --llm, transcripts are cleaned up (punctuation, filler words removed)
|
|
932
1027
|
|
|
933
1028
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
934
|
-
│ --role -r TEXT
|
|
935
|
-
│
|
|
1029
|
+
│ --role -r TEXT Label for log entries. Use to │
|
|
1030
|
+
│ distinguish speakers or contexts in │
|
|
1031
|
+
│ logs. │
|
|
936
1032
|
│ [default: user] │
|
|
937
|
-
│ --silence-threshold -s FLOAT Seconds of silence
|
|
938
|
-
│ segment.
|
|
1033
|
+
│ --silence-threshold -s FLOAT Seconds of silence after speech to │
|
|
1034
|
+
│ finalize a segment. Increase for │
|
|
1035
|
+
│ slower speakers. │
|
|
939
1036
|
│ [default: 1.0] │
|
|
940
|
-
│ --min-segment -m FLOAT Minimum
|
|
941
|
-
│
|
|
1037
|
+
│ --min-segment -m FLOAT Minimum seconds of speech required │
|
|
1038
|
+
│ before a segment is processed. │
|
|
1039
|
+
│ Filters brief sounds. │
|
|
942
1040
|
│ [default: 0.25] │
|
|
943
|
-
│ --vad-threshold FLOAT VAD
|
|
944
|
-
│ (0.0-1.0). Higher
|
|
945
|
-
│
|
|
1041
|
+
│ --vad-threshold FLOAT Silero VAD confidence threshold │
|
|
1042
|
+
│ (0.0-1.0). Higher values require │
|
|
1043
|
+
│ clearer speech; lower values are │
|
|
1044
|
+
│ more sensitive to quiet/distant │
|
|
1045
|
+
│ voices. │
|
|
946
1046
|
│ [default: 0.3] │
|
|
947
|
-
│ --save-audio --no-save-audio Save
|
|
1047
|
+
│ --save-audio --no-save-audio Save each speech segment as MP3. │
|
|
1048
|
+
│ Requires ffmpeg to be installed. │
|
|
948
1049
|
│ [default: save-audio] │
|
|
949
|
-
│ --audio-dir PATH
|
|
950
|
-
│
|
|
951
|
-
│
|
|
1050
|
+
│ --audio-dir PATH Base directory for MP3 files. Files │
|
|
1051
|
+
│ are organized by date: │
|
|
1052
|
+
│ YYYY/MM/DD/HHMMSS_mmm.mp3. Default: │
|
|
1053
|
+
│ ~/.config/agent-cli/audio. │
|
|
1054
|
+
│ --transcription-log -t PATH JSONL file for transcript logging │
|
|
1055
|
+
│ (one JSON object per line with │
|
|
1056
|
+
│ timestamp, role, raw/processed │
|
|
1057
|
+
│ text, audio path). Default: │
|
|
952
1058
|
│ ~/.config/agent-cli/transcriptions… │
|
|
953
|
-
│ --clipboard --no-clipboard Copy each transcription
|
|
954
|
-
│ clipboard.
|
|
1059
|
+
│ --clipboard --no-clipboard Copy each completed transcription │
|
|
1060
|
+
│ to clipboard (overwrites previous). │
|
|
1061
|
+
│ Useful with --llm to get cleaned │
|
|
1062
|
+
│ text. │
|
|
955
1063
|
│ [default: no-clipboard] │
|
|
956
1064
|
│ --help -h Show this message and exit. │
|
|
957
1065
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
@@ -964,10 +1072,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
964
1072
|
│ [default: ollama] │
|
|
965
1073
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
966
1074
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
967
|
-
│ --input-device-index INTEGER
|
|
968
|
-
│
|
|
969
|
-
│ --
|
|
970
|
-
│
|
|
1075
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1076
|
+
│ Uses system default if omitted. │
|
|
1077
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1078
|
+
│ MacBook or USB). │
|
|
1079
|
+
│ --list-devices List available audio devices with their indices │
|
|
1080
|
+
│ and exit. │
|
|
971
1081
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
972
1082
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
973
1083
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1022,12 +1132,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1022
1132
|
│ [env var: GEMINI_API_KEY] │
|
|
1023
1133
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1024
1134
|
╭─ LLM Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1025
|
-
│ --llm --no-llm
|
|
1135
|
+
│ --llm --no-llm Clean up transcript with LLM: fix errors, add punctuation, │
|
|
1136
|
+
│ remove filler words. Uses --extra-instructions if set (via CLI │
|
|
1137
|
+
│ or config file). │
|
|
1026
1138
|
│ [default: no-llm] │
|
|
1027
1139
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1028
1140
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1029
|
-
│ --stop Stop any running
|
|
1030
|
-
│ --status Check if
|
|
1141
|
+
│ --stop Stop any running instance of this command. │
|
|
1142
|
+
│ --status Check if an instance is currently running. │
|
|
1031
1143
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1032
1144
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1033
1145
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
@@ -1081,10 +1193,25 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1081
1193
|
|
|
1082
1194
|
Usage: agent-cli speak [OPTIONS] [TEXT]
|
|
1083
1195
|
|
|
1084
|
-
Convert text to speech
|
|
1196
|
+
Convert text to speech and play audio through speakers.
|
|
1197
|
+
|
|
1198
|
+
By default, synthesized audio plays immediately. Use --save-file to save to a WAV file
|
|
1199
|
+
instead (skips playback).
|
|
1200
|
+
|
|
1201
|
+
Text can be provided as an argument or read from clipboard automatically.
|
|
1202
|
+
|
|
1203
|
+
Examples:
|
|
1204
|
+
|
|
1205
|
+
Speak text directly: agent-cli speak "Hello, world!"
|
|
1206
|
+
|
|
1207
|
+
Speak clipboard contents: agent-cli speak
|
|
1208
|
+
|
|
1209
|
+
Save to file instead of playing: agent-cli speak "Hello" --save-file greeting.wav
|
|
1210
|
+
|
|
1211
|
+
Use OpenAI-compatible TTS: agent-cli speak "Hello" --tts-provider openai
|
|
1085
1212
|
|
|
1086
1213
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1087
|
-
│ text [TEXT] Text to
|
|
1214
|
+
│ text [TEXT] Text to synthesize. If not provided, reads from clipboard. │
|
|
1088
1215
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1089
1216
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1090
1217
|
│ --help -h Show this message and exit. │
|
|
@@ -1096,9 +1223,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1096
1223
|
│ [default: wyoming] │
|
|
1097
1224
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1098
1225
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1099
|
-
│ --output-device-index INTEGER
|
|
1100
|
-
│
|
|
1101
|
-
│
|
|
1226
|
+
│ --output-device-index INTEGER Audio output device index (see --list-devices │
|
|
1227
|
+
│ for available devices). │
|
|
1228
|
+
│ --output-device-name TEXT Partial match on device name (e.g., 'speakers', │
|
|
1229
|
+
│ 'headphones'). │
|
|
1102
1230
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, 2.0 = │
|
|
1103
1231
|
│ twice as fast, 0.5 = half speed). │
|
|
1104
1232
|
│ [default: 1.0] │
|
|
@@ -1116,7 +1244,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1116
1244
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1117
1245
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1118
1246
|
│ [default: tts-1] │
|
|
1119
|
-
│ --tts-openai-voice TEXT
|
|
1247
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1248
|
+
│ nova, shimmer). │
|
|
1120
1249
|
│ [default: alloy] │
|
|
1121
1250
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1122
1251
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1142,28 +1271,27 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1142
1271
|
│ [env var: GEMINI_API_KEY] │
|
|
1143
1272
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1144
1273
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1145
|
-
│ --list-devices List available audio
|
|
1274
|
+
│ --list-devices List available audio devices with their indices and exit. │
|
|
1146
1275
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1147
1276
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1148
|
-
│ --save-file PATH Save
|
|
1277
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1278
|
+
│ playing through speakers. │
|
|
1149
1279
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1150
1280
|
│ [env var: LOG_LEVEL] │
|
|
1151
1281
|
│ [default: info] │
|
|
1152
1282
|
│ --log-file TEXT Path to a file to write logs to. │
|
|
1153
1283
|
│ --quiet -q Suppress console output from rich. │
|
|
1154
|
-
│ --json Output result as JSON
|
|
1155
|
-
│
|
|
1156
|
-
│ --no-clipboard. │
|
|
1284
|
+
│ --json Output result as JSON (implies │
|
|
1285
|
+
│ --quiet and --no-clipboard). │
|
|
1157
1286
|
│ --config TEXT Path to a TOML configuration file. │
|
|
1158
1287
|
│ --print-args Print the command line arguments, │
|
|
1159
1288
|
│ including variables taken from the │
|
|
1160
1289
|
│ configuration file. │
|
|
1161
1290
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1162
1291
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1163
|
-
│ --stop Stop any running
|
|
1164
|
-
│ --status Check if
|
|
1165
|
-
│ --toggle
|
|
1166
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1292
|
+
│ --stop Stop any running instance of this command. │
|
|
1293
|
+
│ --status Check if an instance is currently running. │
|
|
1294
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1167
1295
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1168
1296
|
|
|
1169
1297
|
```
|
|
@@ -1205,16 +1333,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1205
1333
|
|
|
1206
1334
|
Usage: agent-cli voice-edit [OPTIONS]
|
|
1207
1335
|
|
|
1208
|
-
|
|
1336
|
+
Edit or query clipboard text using voice commands.
|
|
1337
|
+
|
|
1338
|
+
Workflow: Captures clipboard text → records your voice command → transcribes it → sends
|
|
1339
|
+
both to an LLM → copies result back to clipboard.
|
|
1340
|
+
|
|
1341
|
+
Use this for hands-free text editing (e.g., "make this more formal") or asking questions
|
|
1342
|
+
about clipboard content (e.g., "summarize this").
|
|
1209
1343
|
|
|
1210
|
-
|
|
1344
|
+
Typical hotkey integration: Run voice-edit & on keypress to start recording, then send
|
|
1345
|
+
SIGINT (via --stop) on second keypress to process.
|
|
1346
|
+
|
|
1347
|
+
Examples:
|
|
1211
1348
|
|
|
1212
|
-
•
|
|
1213
|
-
•
|
|
1214
|
-
•
|
|
1215
|
-
•
|
|
1216
|
-
• List output devices: agent-cli voice-edit --list-output-devices
|
|
1217
|
-
• Save TTS to file: agent-cli voice-edit --tts --save-file response.wav
|
|
1349
|
+
• Basic usage: agent-cli voice-edit
|
|
1350
|
+
• With TTS response: agent-cli voice-edit --tts
|
|
1351
|
+
• Toggle on/off: agent-cli voice-edit --toggle
|
|
1352
|
+
• List audio devices: agent-cli voice-edit --list-devices
|
|
1218
1353
|
|
|
1219
1354
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1220
1355
|
│ --help -h Show this message and exit. │
|
|
@@ -1232,10 +1367,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1232
1367
|
│ [default: wyoming] │
|
|
1233
1368
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1234
1369
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1235
|
-
│ --input-device-index INTEGER
|
|
1236
|
-
│
|
|
1237
|
-
│ --
|
|
1238
|
-
│
|
|
1370
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1371
|
+
│ Uses system default if omitted. │
|
|
1372
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1373
|
+
│ MacBook or USB). │
|
|
1374
|
+
│ --list-devices List available audio devices with their indices │
|
|
1375
|
+
│ and exit. │
|
|
1239
1376
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1240
1377
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1241
1378
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1286,10 +1423,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1286
1423
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1287
1424
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1288
1425
|
│ [default: no-tts] │
|
|
1289
|
-
│ --output-device-index INTEGER
|
|
1290
|
-
│ for
|
|
1291
|
-
│ --output-device-name TEXT
|
|
1292
|
-
│
|
|
1426
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1427
|
+
│ --list-devices for available devices). │
|
|
1428
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1429
|
+
│ 'speakers', 'headphones'). │
|
|
1293
1430
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1294
1431
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1295
1432
|
│ [default: 1.0] │
|
|
@@ -1307,7 +1444,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1307
1444
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1308
1445
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1309
1446
|
│ [default: tts-1] │
|
|
1310
|
-
│ --tts-openai-voice TEXT
|
|
1447
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1448
|
+
│ nova, shimmer). │
|
|
1311
1449
|
│ [default: alloy] │
|
|
1312
1450
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1313
1451
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1328,14 +1466,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1328
1466
|
│ [default: Kore] │
|
|
1329
1467
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1330
1468
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1331
|
-
│ --stop Stop any running
|
|
1332
|
-
│ --status Check if
|
|
1333
|
-
│ --toggle
|
|
1334
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1469
|
+
│ --stop Stop any running instance of this command. │
|
|
1470
|
+
│ --status Check if an instance is currently running. │
|
|
1471
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1335
1472
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1336
1473
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1337
|
-
│ --save-file PATH Save
|
|
1338
|
-
│
|
|
1474
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1475
|
+
│ instead of playing │
|
|
1476
|
+
│ through speakers. │
|
|
1339
1477
|
│ --clipboard --no-clipboard Copy result to │
|
|
1340
1478
|
│ clipboard. │
|
|
1341
1479
|
│ [default: clipboard] │
|
|
@@ -1347,9 +1485,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1347
1485
|
│ --quiet -q Suppress console output │
|
|
1348
1486
|
│ from rich. │
|
|
1349
1487
|
│ --json Output result as JSON │
|
|
1350
|
-
│
|
|
1351
|
-
│ --
|
|
1352
|
-
│ --no-clipboard. │
|
|
1488
|
+
│ (implies --quiet and │
|
|
1489
|
+
│ --no-clipboard). │
|
|
1353
1490
|
│ --config TEXT Path to a TOML │
|
|
1354
1491
|
│ configuration file. │
|
|
1355
1492
|
│ --print-args Print the command line │
|
|
@@ -1400,7 +1537,28 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1400
1537
|
|
|
1401
1538
|
Usage: agent-cli assistant [OPTIONS]
|
|
1402
1539
|
|
|
1403
|
-
|
|
1540
|
+
Hands-free voice assistant using wake word detection.
|
|
1541
|
+
|
|
1542
|
+
Continuously listens for a wake word, then records your speech until you say the wake
|
|
1543
|
+
word again. The recording is transcribed and sent to an LLM for a conversational
|
|
1544
|
+
response, optionally spoken back via TTS.
|
|
1545
|
+
|
|
1546
|
+
Conversation flow:
|
|
1547
|
+
|
|
1548
|
+
1 Say wake word → starts recording
|
|
1549
|
+
2 Speak your question/command
|
|
1550
|
+
3 Say wake word again → stops recording and processes
|
|
1551
|
+
|
|
1552
|
+
The assistant runs in a loop, ready for the next command after each response. Stop with
|
|
1553
|
+
Ctrl+C or --stop.
|
|
1554
|
+
|
|
1555
|
+
Requirements:
|
|
1556
|
+
|
|
1557
|
+
• Wyoming wake word server (e.g., wyoming-openwakeword on port 10400)
|
|
1558
|
+
• Wyoming ASR server (e.g., wyoming-whisper on port 10300)
|
|
1559
|
+
• Optional: TTS server for spoken responses (enable with --tts)
|
|
1560
|
+
|
|
1561
|
+
Example: assistant --wake-word ok_nabu --tts --input-device-name USB
|
|
1404
1562
|
|
|
1405
1563
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1406
1564
|
│ --help -h Show this message and exit. │
|
|
@@ -1418,19 +1576,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1418
1576
|
│ [default: wyoming] │
|
|
1419
1577
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1420
1578
|
╭─ Wake Word ────────────────────────────────────────────────────────────────────────────╮
|
|
1421
|
-
│ --wake-server-ip TEXT Wyoming wake word server IP
|
|
1579
|
+
│ --wake-server-ip TEXT Wyoming wake word server IP (requires │
|
|
1580
|
+
│ wyoming-openwakeword or similar). │
|
|
1422
1581
|
│ [default: localhost] │
|
|
1423
1582
|
│ --wake-server-port INTEGER Wyoming wake word server port. │
|
|
1424
1583
|
│ [default: 10400] │
|
|
1425
|
-
│ --wake-word TEXT
|
|
1426
|
-
│
|
|
1584
|
+
│ --wake-word TEXT Wake word to detect. Common options: ok_nabu, │
|
|
1585
|
+
│ hey_jarvis, alexa. Must match a model loaded in │
|
|
1586
|
+
│ your wake word server. │
|
|
1427
1587
|
│ [default: ok_nabu] │
|
|
1428
1588
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1429
1589
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1430
|
-
│ --input-device-index INTEGER
|
|
1431
|
-
│
|
|
1432
|
-
│ --
|
|
1433
|
-
│
|
|
1590
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1591
|
+
│ Uses system default if omitted. │
|
|
1592
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1593
|
+
│ MacBook or USB). │
|
|
1594
|
+
│ --list-devices List available audio devices with their indices │
|
|
1595
|
+
│ and exit. │
|
|
1434
1596
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1435
1597
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1436
1598
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1481,10 +1643,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1481
1643
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1482
1644
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1483
1645
|
│ [default: no-tts] │
|
|
1484
|
-
│ --output-device-index INTEGER
|
|
1485
|
-
│ for
|
|
1486
|
-
│ --output-device-name TEXT
|
|
1487
|
-
│
|
|
1646
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1647
|
+
│ --list-devices for available devices). │
|
|
1648
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1649
|
+
│ 'speakers', 'headphones'). │
|
|
1488
1650
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1489
1651
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1490
1652
|
│ [default: 1.0] │
|
|
@@ -1502,7 +1664,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1502
1664
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1503
1665
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1504
1666
|
│ [default: tts-1] │
|
|
1505
|
-
│ --tts-openai-voice TEXT
|
|
1667
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1668
|
+
│ nova, shimmer). │
|
|
1506
1669
|
│ [default: alloy] │
|
|
1507
1670
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1508
1671
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1523,14 +1686,14 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1523
1686
|
│ [default: Kore] │
|
|
1524
1687
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1525
1688
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1526
|
-
│ --stop Stop any running
|
|
1527
|
-
│ --status Check if
|
|
1528
|
-
│ --toggle
|
|
1529
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1689
|
+
│ --stop Stop any running instance of this command. │
|
|
1690
|
+
│ --status Check if an instance is currently running. │
|
|
1691
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1530
1692
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1531
1693
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1532
|
-
│ --save-file PATH Save
|
|
1533
|
-
│
|
|
1694
|
+
│ --save-file PATH Save audio to WAV file │
|
|
1695
|
+
│ instead of playing │
|
|
1696
|
+
│ through speakers. │
|
|
1534
1697
|
│ --clipboard --no-clipboard Copy result to │
|
|
1535
1698
|
│ clipboard. │
|
|
1536
1699
|
│ [default: clipboard] │
|
|
@@ -1598,7 +1761,39 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1598
1761
|
|
|
1599
1762
|
Usage: agent-cli chat [OPTIONS]
|
|
1600
1763
|
|
|
1601
|
-
|
|
1764
|
+
Voice-based conversational chat agent with memory and tools.
|
|
1765
|
+
|
|
1766
|
+
Runs an interactive loop: listen → transcribe → LLM → speak response. Conversation
|
|
1767
|
+
history is persisted and included as context for continuity.
|
|
1768
|
+
|
|
1769
|
+
Built-in tools (LLM uses automatically when relevant):
|
|
1770
|
+
|
|
1771
|
+
• add_memory/search_memory/update_memory - persistent long-term memory
|
|
1772
|
+
• duckduckgo_search - web search for current information
|
|
1773
|
+
• read_file/execute_code - file access and shell commands
|
|
1774
|
+
|
|
1775
|
+
Process management: Use --toggle to start/stop via hotkey (bind to a keyboard shortcut),
|
|
1776
|
+
--stop to terminate, or --status to check state.
|
|
1777
|
+
|
|
1778
|
+
Examples:
|
|
1779
|
+
|
|
1780
|
+
Use OpenAI-compatible providers for speech and LLM, with TTS enabled:
|
|
1781
|
+
|
|
1782
|
+
|
|
1783
|
+
agent-cli chat --asr-provider openai --llm-provider openai --tts
|
|
1784
|
+
|
|
1785
|
+
|
|
1786
|
+
Start in background mode (toggle on/off with hotkey):
|
|
1787
|
+
|
|
1788
|
+
|
|
1789
|
+
agent-cli chat --toggle
|
|
1790
|
+
|
|
1791
|
+
|
|
1792
|
+
Use local Ollama LLM with Wyoming ASR:
|
|
1793
|
+
|
|
1794
|
+
|
|
1795
|
+
agent-cli chat --llm-provider ollama
|
|
1796
|
+
|
|
1602
1797
|
|
|
1603
1798
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1604
1799
|
│ --help -h Show this message and exit. │
|
|
@@ -1616,10 +1811,12 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1616
1811
|
│ [default: wyoming] │
|
|
1617
1812
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1618
1813
|
╭─ Audio Input ──────────────────────────────────────────────────────────────────────────╮
|
|
1619
|
-
│ --input-device-index INTEGER
|
|
1620
|
-
│
|
|
1621
|
-
│ --
|
|
1622
|
-
│
|
|
1814
|
+
│ --input-device-index INTEGER Audio input device index (see --list-devices). │
|
|
1815
|
+
│ Uses system default if omitted. │
|
|
1816
|
+
│ --input-device-name TEXT Select input device by name substring (e.g., │
|
|
1817
|
+
│ MacBook or USB). │
|
|
1818
|
+
│ --list-devices List available audio devices with their indices │
|
|
1819
|
+
│ and exit. │
|
|
1623
1820
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1624
1821
|
╭─ Audio Input: Wyoming ─────────────────────────────────────────────────────────────────╮
|
|
1625
1822
|
│ --asr-wyoming-ip TEXT Wyoming ASR server IP address. │
|
|
@@ -1676,10 +1873,10 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1676
1873
|
╭─ Audio Output ─────────────────────────────────────────────────────────────────────────╮
|
|
1677
1874
|
│ --tts --no-tts Enable text-to-speech for responses. │
|
|
1678
1875
|
│ [default: no-tts] │
|
|
1679
|
-
│ --output-device-index INTEGER
|
|
1680
|
-
│ for
|
|
1681
|
-
│ --output-device-name TEXT
|
|
1682
|
-
│
|
|
1876
|
+
│ --output-device-index INTEGER Audio output device index (see │
|
|
1877
|
+
│ --list-devices for available devices). │
|
|
1878
|
+
│ --output-device-name TEXT Partial match on device name (e.g., │
|
|
1879
|
+
│ 'speakers', 'headphones'). │
|
|
1683
1880
|
│ --tts-speed FLOAT Speech speed multiplier (1.0 = normal, │
|
|
1684
1881
|
│ 2.0 = twice as fast, 0.5 = half speed). │
|
|
1685
1882
|
│ [default: 1.0] │
|
|
@@ -1697,7 +1894,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1697
1894
|
╭─ Audio Output: OpenAI-compatible ──────────────────────────────────────────────────────╮
|
|
1698
1895
|
│ --tts-openai-model TEXT The OpenAI model to use for TTS. │
|
|
1699
1896
|
│ [default: tts-1] │
|
|
1700
|
-
│ --tts-openai-voice TEXT
|
|
1897
|
+
│ --tts-openai-voice TEXT Voice for OpenAI TTS (alloy, echo, fable, onyx, │
|
|
1898
|
+
│ nova, shimmer). │
|
|
1701
1899
|
│ [default: alloy] │
|
|
1702
1900
|
│ --tts-openai-base-url TEXT Custom base URL for OpenAI-compatible TTS API │
|
|
1703
1901
|
│ (e.g., http://localhost:8000/v1 for a proxy). │
|
|
@@ -1718,20 +1916,23 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1718
1916
|
│ [default: Kore] │
|
|
1719
1917
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1720
1918
|
╭─ Process Management ───────────────────────────────────────────────────────────────────╮
|
|
1721
|
-
│ --stop Stop any running
|
|
1722
|
-
│ --status Check if
|
|
1723
|
-
│ --toggle
|
|
1724
|
-
│ will be stopped. If the process is not running, it will be started. │
|
|
1919
|
+
│ --stop Stop any running instance of this command. │
|
|
1920
|
+
│ --status Check if an instance is currently running. │
|
|
1921
|
+
│ --toggle Start if not running, stop if running. Ideal for hotkey binding. │
|
|
1725
1922
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1726
1923
|
╭─ History Options ──────────────────────────────────────────────────────────────────────╮
|
|
1727
|
-
│ --history-dir PATH Directory
|
|
1924
|
+
│ --history-dir PATH Directory for conversation history and long-term │
|
|
1925
|
+
│ memory. Both conversation.json and │
|
|
1926
|
+
│ long_term_memory.json are stored here. │
|
|
1728
1927
|
│ [default: ~/.config/agent-cli/history] │
|
|
1729
|
-
│ --last-n-messages INTEGER Number of messages to include
|
|
1730
|
-
│
|
|
1928
|
+
│ --last-n-messages INTEGER Number of past messages to include as context for │
|
|
1929
|
+
│ the LLM. Set to 0 to start fresh each session │
|
|
1930
|
+
│ (memory tools still persist). │
|
|
1731
1931
|
│ [default: 50] │
|
|
1732
1932
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1733
1933
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
1734
|
-
│ --save-file PATH Save
|
|
1934
|
+
│ --save-file PATH Save audio to WAV file instead of │
|
|
1935
|
+
│ playing through speakers. │
|
|
1735
1936
|
│ --log-level [debug|info|warning|error] Set logging level. │
|
|
1736
1937
|
│ [env var: LOG_LEVEL] │
|
|
1737
1938
|
│ [default: info] │
|
|
@@ -1786,25 +1987,68 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1786
1987
|
|
|
1787
1988
|
Usage: agent-cli rag-proxy [OPTIONS]
|
|
1788
1989
|
|
|
1789
|
-
Start
|
|
1990
|
+
Start a RAG proxy server that enables "chat with your documents".
|
|
1991
|
+
|
|
1992
|
+
Watches a folder for documents, indexes them into a vector store, and provides an
|
|
1993
|
+
OpenAI-compatible API at /v1/chat/completions. When you send a chat request, the server
|
|
1994
|
+
retrieves relevant document chunks and injects them as context before forwarding to your
|
|
1995
|
+
LLM backend.
|
|
1996
|
+
|
|
1997
|
+
Quick start:
|
|
1998
|
+
|
|
1999
|
+
• agent-cli rag-proxy — Start with defaults (./rag_docs, OpenAI-compatible API)
|
|
2000
|
+
• agent-cli rag-proxy --docs-folder ~/notes — Index your notes folder
|
|
2001
|
+
|
|
2002
|
+
How it works:
|
|
2003
|
+
|
|
2004
|
+
1 Documents in --docs-folder are chunked, embedded, and stored in ChromaDB
|
|
2005
|
+
2 A file watcher auto-reindexes when files change
|
|
2006
|
+
3 Chat requests trigger a semantic search for relevant chunks
|
|
2007
|
+
4 Retrieved context is injected into the prompt before forwarding to the LLM
|
|
2008
|
+
5 Responses include a rag_sources field listing which documents were used
|
|
2009
|
+
|
|
2010
|
+
Supported file formats:
|
|
1790
2011
|
|
|
1791
|
-
|
|
1792
|
-
|
|
1793
|
-
|
|
2012
|
+
Text: .txt, .md, .json, .py, .js, .ts, .yaml, .toml, .rst, etc. Rich documents (via
|
|
2013
|
+
MarkItDown): .pdf, .docx, .pptx, .xlsx, .html, .csv
|
|
2014
|
+
|
|
2015
|
+
API endpoints:
|
|
2016
|
+
|
|
2017
|
+
• POST /v1/chat/completions — Main chat endpoint (OpenAI-compatible)
|
|
2018
|
+
• GET /health — Health check with configuration info
|
|
2019
|
+
• GET /files — List indexed files with chunk counts
|
|
2020
|
+
• POST /reindex — Trigger manual reindex
|
|
2021
|
+
• All other paths are proxied to the LLM backend
|
|
2022
|
+
|
|
2023
|
+
Per-request overrides (in JSON body):
|
|
2024
|
+
|
|
2025
|
+
• rag_top_k: Override --limit for this request
|
|
2026
|
+
• rag_enable_tools: Override --rag-tools for this request
|
|
1794
2027
|
|
|
1795
2028
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1796
2029
|
│ --help -h Show this message and exit. │
|
|
1797
2030
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1798
2031
|
╭─ RAG Configuration ────────────────────────────────────────────────────────────────────╮
|
|
1799
|
-
│ --docs-folder PATH Folder to watch for documents
|
|
2032
|
+
│ --docs-folder PATH Folder to watch for documents. Files are │
|
|
2033
|
+
│ auto-indexed on startup and when changed. │
|
|
2034
|
+
│ Must not overlap with --chroma-path. │
|
|
1800
2035
|
│ [default: ./rag_docs] │
|
|
1801
|
-
│ --chroma-path PATH
|
|
2036
|
+
│ --chroma-path PATH ChromaDB storage directory for vector │
|
|
2037
|
+
│ embeddings. Must be separate from │
|
|
2038
|
+
│ --docs-folder to avoid indexing database │
|
|
2039
|
+
│ files. │
|
|
1802
2040
|
│ [default: ./rag_db] │
|
|
1803
2041
|
│ --limit INTEGER Number of document chunks to retrieve per │
|
|
1804
|
-
│ query.
|
|
2042
|
+
│ query. Higher values provide more context │
|
|
2043
|
+
│ but use more tokens. Can be overridden │
|
|
2044
|
+
│ per-request via rag_top_k in the JSON │
|
|
2045
|
+
│ body. │
|
|
1805
2046
|
│ [default: 3] │
|
|
1806
|
-
│ --rag-tools --no-rag-tools
|
|
1807
|
-
│
|
|
2047
|
+
│ --rag-tools --no-rag-tools Enable read_full_document() tool so the │
|
|
2048
|
+
│ LLM can request full document content when │
|
|
2049
|
+
│ retrieved snippets are insufficient. Can │
|
|
2050
|
+
│ be overridden per-request via │
|
|
2051
|
+
│ rag_enable_tools in the JSON body. │
|
|
1808
2052
|
│ [default: rag-tools] │
|
|
1809
2053
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1810
2054
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -1822,7 +2066,8 @@ uv tool install "agent-cli[vad]" -p 3.13
|
|
|
1822
2066
|
╭─ Server Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1823
2067
|
│ --host TEXT Host/IP to bind API servers to. │
|
|
1824
2068
|
│ [default: 0.0.0.0] │
|
|
1825
|
-
│ --port INTEGER Port
|
|
2069
|
+
│ --port INTEGER Port for the RAG proxy API (e.g., │
|
|
2070
|
+
│ http://localhost:8000/v1/chat/completions). │
|
|
1826
2071
|
│ [default: 8000] │
|
|
1827
2072
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1828
2073
|
╭─ General Options ──────────────────────────────────────────────────────────────────────╮
|
|
@@ -1912,41 +2157,61 @@ The `memory proxy` command is the core feature—a middleware server that gives
|
|
|
1912
2157
|
5 Extracts new facts from the conversation in the background and updates the long-term
|
|
1913
2158
|
memory store (including handling contradictions).
|
|
1914
2159
|
|
|
1915
|
-
|
|
1916
|
-
|
|
2160
|
+
Example:
|
|
2161
|
+
|
|
2162
|
+
|
|
2163
|
+
# Start proxy pointing to local Ollama
|
|
2164
|
+
agent-cli memory proxy --openai-base-url http://localhost:11434/v1
|
|
2165
|
+
|
|
2166
|
+
# Then configure your chat client to use http://localhost:8100/v1
|
|
2167
|
+
# as its OpenAI base URL. All requests flow through the memory proxy.
|
|
2168
|
+
|
|
2169
|
+
|
|
2170
|
+
Per-request overrides: Clients can include these fields in the request body: memory_id
|
|
2171
|
+
(conversation ID), memory_top_k, memory_recency_weight, memory_score_threshold.
|
|
1917
2172
|
|
|
1918
2173
|
╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
|
|
1919
2174
|
│ --help -h Show this message and exit. │
|
|
1920
2175
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1921
2176
|
╭─ Memory Configuration ─────────────────────────────────────────────────────────────────╮
|
|
1922
|
-
│ --memory-path PATH
|
|
1923
|
-
│
|
|
2177
|
+
│ --memory-path PATH Directory for memory storage. │
|
|
2178
|
+
│ Contains entries/ (Markdown │
|
|
2179
|
+
│ files) and chroma/ (vector │
|
|
2180
|
+
│ index). Created automatically if │
|
|
2181
|
+
│ it doesn't exist. │
|
|
1924
2182
|
│ [default: ./memory_db] │
|
|
1925
|
-
│ --default-top-k INTEGER Number of
|
|
1926
|
-
│
|
|
2183
|
+
│ --default-top-k INTEGER Number of relevant memories to │
|
|
2184
|
+
│ inject into each request. Higher │
|
|
2185
|
+
│ values provide more context but │
|
|
2186
|
+
│ increase token usage. │
|
|
1927
2187
|
│ [default: 5] │
|
|
1928
|
-
│ --max-entries INTEGER Maximum
|
|
1929
|
-
│
|
|
2188
|
+
│ --max-entries INTEGER Maximum entries per conversation │
|
|
2189
|
+
│ before oldest are evicted. │
|
|
2190
|
+
│ Summaries are preserved │
|
|
2191
|
+
│ separately. │
|
|
1930
2192
|
│ [default: 500] │
|
|
1931
2193
|
│ --mmr-lambda FLOAT MMR lambda (0-1): higher favors │
|
|
1932
2194
|
│ relevance, lower favors │
|
|
1933
2195
|
│ diversity. │
|
|
1934
2196
|
│ [default: 0.7] │
|
|
1935
|
-
│ --recency-weight FLOAT
|
|
1936
|
-
│
|
|
1937
|
-
│
|
|
1938
|
-
│ semantic relevance). │
|
|
2197
|
+
│ --recency-weight FLOAT Weight for recency vs semantic │
|
|
2198
|
+
│ relevance (0.0-1.0). At 0.2: 20% │
|
|
2199
|
+
│ recency, 80% semantic similarity. │
|
|
1939
2200
|
│ [default: 0.2] │
|
|
1940
2201
|
│ --score-threshold FLOAT Minimum semantic relevance │
|
|
1941
2202
|
│ threshold (0.0-1.0). Memories │
|
|
1942
2203
|
│ below this score are discarded to │
|
|
1943
2204
|
│ reduce noise. │
|
|
1944
2205
|
│ [default: 0.35] │
|
|
1945
|
-
│ --summarization --no-summarization
|
|
1946
|
-
│
|
|
2206
|
+
│ --summarization --no-summarization Extract facts and generate │
|
|
2207
|
+
│ summaries after each turn using │
|
|
2208
|
+
│ the LLM. Disable to only store │
|
|
2209
|
+
│ raw conversation turns. │
|
|
1947
2210
|
│ [default: summarization] │
|
|
1948
|
-
│ --git-versioning --no-git-versioning
|
|
1949
|
-
│
|
|
2211
|
+
│ --git-versioning --no-git-versioning Auto-commit memory changes to │
|
|
2212
|
+
│ git. Initializes a repo in │
|
|
2213
|
+
│ --memory-path if needed. Provides │
|
|
2214
|
+
│ full history of memory evolution. │
|
|
1950
2215
|
│ [default: git-versioning] │
|
|
1951
2216
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1952
2217
|
╭─ LLM: OpenAI-compatible ───────────────────────────────────────────────────────────────╮
|
|
@@ -2059,12 +2324,16 @@ agent-cli memory add -c work "Project deadline is Friday"
|
|
|
2059
2324
|
│ for stdin. Supports JSON array, │
|
|
2060
2325
|
│ JSON object with 'memories' key, │
|
|
2061
2326
|
│ or plain text (one per line). │
|
|
2062
|
-
│ --conversation-id -c TEXT Conversation
|
|
2063
|
-
│
|
|
2327
|
+
│ --conversation-id -c TEXT Conversation namespace for these │
|
|
2328
|
+
│ memories. Memories are retrieved │
|
|
2329
|
+
│ per-conversation unless shared │
|
|
2330
|
+
│ globally. │
|
|
2064
2331
|
│ [default: default] │
|
|
2065
|
-
│ --memory-path PATH
|
|
2332
|
+
│ --memory-path PATH Directory for memory storage (same │
|
|
2333
|
+
│ as memory proxy --memory-path). │
|
|
2066
2334
|
│ [default: ./memory_db] │
|
|
2067
|
-
│ --git-versioning --no-git-versioning
|
|
2335
|
+
│ --git-versioning --no-git-versioning Auto-commit changes to git for │
|
|
2336
|
+
│ version history. │
|
|
2068
2337
|
│ [default: git-versioning] │
|
|
2069
2338
|
│ --help -h Show this message and exit. │
|
|
2070
2339
|
╰────────────────────────────────────────────────────────────────────────────────────────╯
|