@xdfnet/ispeak 1.6.9 → 1.6.11
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +103 -0
- package/CLAUDE.md +9 -0
- package/Makefile +120 -0
- package/avaudioengine_player_darwin.go +315 -0
- package/main.go +0 -6
- package/main_test.go +746 -0
- package/package.json +7 -1
- package/stream_player_unsupported.go +9 -0
package/AGENTS.md
ADDED
@@ -0,0 +1,103 @@
+# AGENTS.md
+
+This file provides guidance to Codex (Codex.ai/code) when working with code in this repository.
+
+## Project Overview
+
+iSpeak is a local announcement service built on ByteDance TTS. The daemon `ispeakd` listens on a Unix socket, receives text, calls the Volcano Engine streaming TTS API, and plays audio while it is still being synthesized.
+
+## Common Commands
+
+```bash
+make build # build ispeakd
+make install # install and start the launchd service
+make deploy # same as install
+make uninstall # uninstall (stop the service and delete files)
+make clean # remove build artifacts
+make help # show help
+```
+
+## Command-Line Testing Conventions
+
+- Test Claude: `claude -p "你好"`
+- Test Codex: `codex exec "你好"`
+
+## Architecture
+
+```
+ispeak (CLI, bash)
+  └─ nc -U ~/.config/iSpeak/ispeak.sock
+       └─ ispeakd (Go daemon)
+            ├─ Task Engine (task store)
+            │    └─ pending FIFO
+            └─ transactionWorker (single)
+                 └─ pending -> running -> delete
+                      └─ SSE PCM chunk -> AVAudioEngine
+```
+
+- **Socket**: `~/.config/iSpeak/ispeak.sock`
+- **Log**: `~/.config/iSpeak/ispeak.log` (rotated by lumberjack, 10 MB per file, 3 files kept)
+- **Temp**: process-level tempDir, cleaned up on exit
+- **Launchd PLIST**: `~/Library/LaunchAgents/com.ispeak.plist`
+
+## Core Files
+
+- `main.go`: daemon, task engine, streaming TTS requests, SSE parsing, streaming playback
+- `avaudioengine_player_darwin.go`: native macOS `AVAudioEngine` PCM player
+- `clean_text.go`: text cleaning for TTS announcements
+- `main_test.go`: tests for key task engine behavior
+- `scripts/ispeak`: CLI entry point, sends text to the socket via nc
+- `configs/hook-speak.sh`: Claude/Codex hook, parses input with bash + Node
+
+## Message Format
+
+The CLI and the daemon exchange raw text over the socket, with an optional voice prefix:
+
+```
+{source:claude}text → uses the claude source voice
+{source:codex}text → uses the codex source voice
+text → uses the default voice
+```
+
+## Task Policy (saving TTS costs)
+
+When a new message arrives:
+1. Delete all `pending` tasks (not yet started)
+2. Do not interrupt the current `running` transaction
+3. Create a new task and put it in `pending`
+
+**Task state transitions:**
+```
+pending → running → delete
+```
+
+## Failure Policy
+
+- Streaming synthesis/playback failure: delete the task immediately without retrying, to avoid duplicate announcements
+
+## Configuration
+
+`~/.config/iSpeak/config.json`:
+
+```json
+{
+  "apiKey": "...",
+  "endpoint": "https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse",
+  "defaultVoice": { "voice_type": "zh_female_mizai_uranus_bigtts", "resourceId": "seed-tts-2.0" },
+  "sourceVoices": {
+    "claude": { "voice_type": "zh_female_tianmeitaozi_uranus_bigtts", "resourceId": "seed-tts-2.0" },
+    "codex": { "voice_type": "zh_male_shaonianzixin_uranus_bigtts", "resourceId": "seed-tts-2.0" }
+  }
+}
+```
+
+## Stability Design
+
+- Single transaction worker; synthesis and playback share one pipeline, reducing time to first audio
+- Critical goroutines have `panic recover`
+- Config hot reload (mtime cache + automatic reload)
+- The TTS HTTP client is reused to reduce connection overhead
+- The main path plays PCM with the native macOS `AVAudioEngine`
+- On playback failure the task is deleted without retrying
+- Log rotation keeps files from growing too large
+- Process-level temp directory, cleaned up automatically on exit
package/CLAUDE.md
ADDED
@@ -0,0 +1,9 @@
+# CLAUDE.md
+
+For this repository's shared engineering conventions, commands, architecture, and configuration, see [AGENTS.md](/Users/admin/iCode/iSpeak/AGENTS.md).
+
+Only the minimal conventions specific to Claude Code are added here:
+
+- `configs/hook-speak.sh` is the hook shared by Claude and Codex
+- `{source:claude}` uses the Claude voice
+- All other behavior is consistent with [AGENTS.md](/Users/admin/iCode/iSpeak/AGENTS.md)
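The `{source:NAME}` voice prefix mentioned in both AGENTS.md and CLAUDE.md could be split off the message along these lines. This is only a sketch: `splitSourcePrefix` is a hypothetical helper, and the daemon's real parsing in `main.go` is not part of this diff, so its handling of malformed prefixes may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSourcePrefix separates an optional "{source:NAME}" prefix from the
// text to speak. It returns an empty source when no prefix is present.
// Hypothetical helper; the behavior for malformed prefixes is an assumption.
func splitSourcePrefix(msg string) (source, text string) {
	const open = "{source:"
	if !strings.HasPrefix(msg, open) {
		return "", msg
	}
	end := strings.IndexByte(msg, '}')
	if end < 0 {
		return "", msg // unterminated prefix: treat everything as text
	}
	return msg[len(open):end], msg[end+1:]
}

func main() {
	for _, m := range []string{"{source:claude}hello", "{source:codex}hi", "plain text"} {
		src, text := splitSourcePrefix(m)
		fmt.Printf("source=%q text=%q\n", src, text)
	}
}
```

An empty `source` would then map to `defaultVoice`, and a non-empty one to the matching entry in `sourceVoices`, following the configuration shown in AGENTS.md.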
package/Makefile
ADDED
@@ -0,0 +1,120 @@
+.PHONY: build test pack release push install deploy uninstall clean help
+
+VERSION := 1.6.9
+TAG := v$(VERSION)
+NPM_PKG := @xdfnet/ispeak
+BIN := build/ispeakd
+BIN_DIR := $(HOME)/.local/bin
+DST := $(BIN_DIR)/ispeakd
+PLIST := $(HOME)/Library/LaunchAgents/com.ispeak.plist
+LEGACY_PLIST := $(HOME)/Library/LaunchAgents/com.iSpeak.plist
+CONFIG := $(HOME)/.config/iSpeak
+LOG := $(HOME)/.config/iSpeak/ispeak.log
+CLI_SRC := scripts/ispeak
+HOOK_SRC := configs/hook-speak.sh
+PLIST_SRC := configs/com.ispeak.plist
+CONFIG_SRC := configs/config.example.json
+
+help:
+	@echo "iSpeak $(VERSION)"
+	@echo ""
+	@echo " make build # build ispeakd"
+	@echo " make test # run Go tests, race tests, build, and npm pack preflight"
+	@echo " make release # push the GitHub tag and publish npm latest"
+	@echo " make push # same as release"
+	@echo " make install # install and start the service (first run deploys config and hook)"
+	@echo " make deploy # same as install"
+	@echo " make uninstall # uninstall (stop the service and delete files)"
+	@echo " make clean # remove build artifacts"
+
+build:
+	@mkdir -p build
+	@go build -ldflags="-s -w" -o $(BIN) .
+	@echo "Build complete: $(BIN)"
+
+test:
+	@go test -count=1 ./...
+	@go test -race -count=1 ./...
+	@go build ./...
+	@bash scripts/test-hook-speak.sh
+	@npm pack --dry-run
+
+pack:
+	@npm pack --dry-run
+
+release: test
+	@if [ -n "$$(git status --porcelain)" ]; then \
+		echo "Working tree is not clean; commit or stash your changes first"; \
+		git status --short; \
+		exit 1; \
+	fi
+	@if npm view $(NPM_PKG)@$(VERSION) version >/dev/null 2>&1; then \
+		echo "npm version already exists: $(NPM_PKG)@$(VERSION)"; \
+		exit 1; \
+	fi
+	@if git rev-parse "$(TAG)" >/dev/null 2>&1; then \
+		echo "tag already exists: $(TAG)"; \
+	else \
+		git tag "$(TAG)"; \
+	fi
+	@git push origin HEAD
+	@git push origin "$(TAG)"
+	@npm publish --access public
+	@echo "Published: $(NPM_PKG)@$(VERSION) / $(TAG)"
+
+push: release
+
+install: build
+	@# Stop the old service
+	@launchctl unload $(LEGACY_PLIST) 2>/dev/null || true
+	@launchctl unload $(PLIST) 2>/dev/null || true
+	@rm -f $(LEGACY_PLIST)
+	@# Install the binary and the CLI
+	@mkdir -p $(BIN_DIR)
+	@install -m 0755 $(BIN) $(DST)
+	@install -m 0755 $(CURDIR)/$(CLI_SRC) $(BIN_DIR)/ispeak
+	@# Deploy the config file (never overwrite an existing one)
+	@mkdir -p $(CONFIG)
+	@if [ ! -f $(CONFIG)/config.json ]; then \
+		cp $(CONFIG_SRC) $(CONFIG)/config.json; \
+		echo "Config file created: $(CONFIG)/config.json"; \
+	else \
+		echo "Config file already exists: $(CONFIG)/config.json"; \
+	fi
+	@if grep -q '"endpoint"[[:space:]]*:[[:space:]]*"https://openspeech.bytedance.com/api/v3/tts/unidirectional"' $(CONFIG)/config.json; then \
+		cp $(CONFIG)/config.json $(CONFIG)/config.json.bak; \
+		perl -pi -e 's|"https://openspeech.bytedance.com/api/v3/tts/unidirectional"|"https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse"|g' $(CONFIG)/config.json; \
+		echo "Config endpoint migrated to SSE; old config backed up to: $(CONFIG)/config.json.bak"; \
+	fi
+	@# Deploy the hook script (overwrite; back up local changes first)
+	@if [ -f $(CONFIG)/hook-speak.sh ] && ! cmp -s $(HOOK_SRC) $(CONFIG)/hook-speak.sh; then \
+		cp $(CONFIG)/hook-speak.sh $(CONFIG)/hook-speak.sh.bak; \
+		echo "Old hook backed up: $(CONFIG)/hook-speak.sh.bak"; \
+	fi
+	@cp $(HOOK_SRC) $(CONFIG)/hook-speak.sh
+	@chmod +x $(CONFIG)/hook-speak.sh
+	@echo "Hook script installed: $(CONFIG)/hook-speak.sh"
+	@# Install the launchd plist
+	@sed 's|BINARY_PATH_PLACEHOLDER|$(DST)|' $(PLIST_SRC) > $(PLIST)
+	@# Start
+	@launchctl load $(PLIST)
+	@sleep 0.5
+	@# Self-check
+	@$(BIN_DIR)/ispeak status && echo "" && echo "Installation succeeded!" || { echo "Installation failed; check the log: $(LOG)"; exit 1; }
+
+deploy: install
+
+uninstall:
+	@echo "Stopping the service..."
+	@launchctl unload $(LEGACY_PLIST) 2>/dev/null || true
+	@launchctl unload $(PLIST) 2>/dev/null || true
+	@rm -f $(LEGACY_PLIST)
+	@rm -f $(PLIST)
+	@echo "Deleting files..."
+	@rm -f $(BIN_DIR)/ispeakd $(BIN_DIR)/ispeak
+	@echo "Keeping config directory: $(CONFIG)"
+	@echo "Uninstall complete"
+
+clean:
+	@rm -rf build
+	@echo "Clean complete"
package/avaudioengine_player_darwin.go
ADDED
@@ -0,0 +1,315 @@
+//go:build darwin && cgo
+
+package main
+
+/*
+#cgo CFLAGS: -x objective-c -fblocks
+#cgo LDFLAGS: -framework AVFoundation -framework Foundation
+
+#include <AVFoundation/AVFoundation.h>
+#include <pthread.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+typedef struct {
+    AVAudioEngine *engine;
+    AVAudioPlayerNode *node;
+    AVAudioFormat *format;
+    pthread_mutex_t mu;
+    pthread_cond_t cond;
+    int pending;
+    int closing;
+    int started;
+} AVNativePlayer;
+
+static char *av_make_error(const char *prefix, NSError *error) {
+    const char *detail = "";
+    if (error != nil && [error localizedDescription] != nil) {
+        detail = [[error localizedDescription] UTF8String];
+    }
+    char buffer[512];
+    snprintf(buffer, sizeof(buffer), "%s: %s", prefix, detail);
+    return strdup(buffer);
+}
+
+static int av_player_create(double sampleRate, unsigned int channels, AVNativePlayer **out, char **err) {
+    if (out == NULL) {
+        if (err) *err = strdup("av_player_create: out is nil");
+        return -1;
+    }
+    *out = NULL;
+
+    @autoreleasepool {
+        AVNativePlayer *player = (AVNativePlayer *)calloc(1, sizeof(AVNativePlayer));
+        if (player == NULL) {
+            if (err) *err = strdup("av_player_create: calloc failed");
+            return -1;
+        }
+
+        if (pthread_mutex_init(&player->mu, NULL) != 0) {
+            if (err) *err = strdup("av_player_create: mutex init failed");
+            free(player);
+            return -1;
+        }
+        if (pthread_cond_init(&player->cond, NULL) != 0) {
+            if (err) *err = strdup("av_player_create: cond init failed");
+            pthread_mutex_destroy(&player->mu);
+            free(player);
+            return -1;
+        }
+
+        player->engine = [[AVAudioEngine alloc] init];
+        player->node = [[AVAudioPlayerNode alloc] init];
+        player->format = [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatFloat32 sampleRate:sampleRate channels:channels interleaved:NO];
+        if (player->engine == nil || player->node == nil || player->format == nil) {
+            if (err) *err = strdup("av_player_create: failed to allocate AVAudio objects");
+            [player->engine release];
+            [player->node release];
+            [player->format release];
+            pthread_cond_destroy(&player->cond);
+            pthread_mutex_destroy(&player->mu);
+            free(player);
+            return -1;
+        }
+
+        [player->engine attachNode:player->node];
+        [player->engine connect:player->node to:player->engine.mainMixerNode format:player->format];
+        [player->engine prepare];
+
+        NSError *error = nil;
+        if (![player->engine startAndReturnError:&error]) {
+            if (err) *err = av_make_error("AVAudioEngine start failed", error);
+            [player->node release];
+            [player->engine release];
+            [player->format release];
+            pthread_cond_destroy(&player->cond);
+            pthread_mutex_destroy(&player->mu);
+            free(player);
+            return -1;
+        }
+
+        [player->node play];
+        player->started = 1;
+        *out = player;
+        return 0;
+    }
+}
+
+static int av_player_write(AVNativePlayer *player, const void *data, size_t len, char **err) {
+    if (player == NULL || player->engine == nil || player->node == nil || player->format == nil) {
+        if (err) *err = strdup("av_player_write: player is closed");
+        return -1;
+    }
+    if (data == NULL || len == 0) {
+        return 0;
+    }
+
+    AVAudioFrameCount frames = (AVAudioFrameCount)(len / 2);
+    if (frames == 0) {
+        return 0;
+    }
+
+    AVAudioPCMBuffer *buffer = [[AVAudioPCMBuffer alloc] initWithPCMFormat:player->format frameCapacity:frames];
+    if (buffer == nil) {
+        if (err) *err = strdup("av_player_write: failed to allocate buffer");
+        return -1;
+    }
+    buffer.frameLength = frames;
+    float *dest = buffer.floatChannelData[0];
+    const int16_t *src = (const int16_t *)data;
+    for (AVAudioFrameCount i = 0; i < frames; i++) {
+        dest[i] = (float)src[i] / 32768.0f;
+    }
+
+    pthread_mutex_lock(&player->mu);
+    player->pending++;
+    pthread_mutex_unlock(&player->mu);
+
+    AVNativePlayer *captured = player;
+    [player->node scheduleBuffer:buffer completionHandler:^{
+        pthread_mutex_lock(&captured->mu);
+        if (captured->pending > 0) {
+            captured->pending--;
+        }
+        if (captured->closing && captured->pending == 0) {
+            pthread_cond_signal(&captured->cond);
+        }
+        pthread_mutex_unlock(&captured->mu);
+    }];
+
+    return 0;
+}
+
+static int av_player_close_and_dispose(AVNativePlayer *player, char **err) {
+    if (player == NULL) {
+        return 0;
+    }
+
+    pthread_mutex_lock(&player->mu);
+    player->closing = 1;
+    while (player->pending > 0) {
+        pthread_cond_wait(&player->cond, &player->mu);
+    }
+    pthread_mutex_unlock(&player->mu);
+
+    @autoreleasepool {
+        [player->node stop];
+        [player->engine stop];
+        [player->node release];
+        [player->engine release];
+        [player->format release];
+    }
+
+    pthread_cond_destroy(&player->cond);
+    pthread_mutex_destroy(&player->mu);
+    free(player);
+    return 0;
+}
+
+static int av_player_abort(AVNativePlayer *player, char **err) {
+    return av_player_close_and_dispose(player, err);
+}
+
+*/
+import "C"
+
+import (
+	"errors"
+	"fmt"
+	"log"
+	"sync"
+	"unsafe"
+)
+
+const (
+	audioEngineSampleRate = 48000
+	audioEngineChannels = 1
+	audioEngineChunkSize = 32 * 1024
+)
+
+type avAudioEngineStreamPlayer struct {
+	mu sync.Mutex
+	ptr *C.AVNativePlayer
+	tail []byte
+}
+
+func newDefaultStreamPlayer() (StreamPlayer, error) {
+	var ptr *C.AVNativePlayer
+	var cerr *C.char
+	if C.av_player_create(C.double(audioEngineSampleRate), C.uint(audioEngineChannels), &ptr, &cerr) != 0 {
+		return nil, cError(cerr)
+	}
+	log.Printf("Player mode: AVAudioEngine PCM streaming (%d Hz, %d channel)", audioEngineSampleRate, audioEngineChannels)
+	return &avAudioEngineStreamPlayer{ptr: ptr}, nil
+}
+
+func (p *avAudioEngineStreamPlayer) Write(audio []byte) error {
+	if len(audio) == 0 {
+		return nil
+	}
+
+	p.mu.Lock()
+	defer p.mu.Unlock()
+
+	if p.ptr == nil {
+		return errors.New("AVAudioEngine player is closed")
+	}
+
+	data := audio
+	if len(p.tail) > 0 {
+		data = append(append([]byte(nil), p.tail...), audio...)
+		p.tail = nil
+	}
+
+	if len(data)%2 == 1 {
+		p.tail = append(p.tail[:0], data[len(data)-1])
+		data = data[:len(data)-1]
+	}
+
+	for len(data) > 0 {
+		n := audioEngineChunkSize
+		if n > len(data) {
+			n = len(data)
+		}
+		if n%2 == 1 {
+			n--
+		}
+		if n <= 0 {
+			break
+		}
+		if err := p.writeChunk(data[:n]); err != nil {
+			return err
+		}
+		data = data[n:]
+	}
+
+	return nil
+}
+
+func (p *avAudioEngineStreamPlayer) CloseAndWait() error {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+	return p.closeLocked()
+}
+
+func (p *avAudioEngineStreamPlayer) Abort() error {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+	return p.closeLocked()
+}
+
+func (p *avAudioEngineStreamPlayer) writeChunk(data []byte) error {
+	if len(data) == 0 {
+		return nil
+	}
+	if p.ptr == nil {
+		return errors.New("AVAudioEngine player is closed")
+	}
+
+	var cerr *C.char
+	if C.av_player_write(p.ptr, unsafe.Pointer(&data[0]), C.size_t(len(data)), &cerr) != 0 {
+		return cError(cerr)
+	}
+	return nil
+}
+
+func (p *avAudioEngineStreamPlayer) closeLocked() error {
+	if p.ptr == nil {
+		return nil
+	}
+
+	if len(p.tail) > 0 {
+		pad := append(append([]byte(nil), p.tail...), 0)
+		p.tail = nil
+		if err := p.writeChunk(pad); err != nil {
+			_ = p.disposeLocked()
+			return err
+		}
+	}
+
+	return p.disposeLocked()
+}
+
+func (p *avAudioEngineStreamPlayer) disposeLocked() error {
+	if p.ptr == nil {
+		return nil
+	}
+
+	var cerr *C.char
+	if C.av_player_close_and_dispose(p.ptr, &cerr) != 0 {
+		p.ptr = nil
+		return cError(cerr)
+	}
+	p.ptr = nil
+	return nil
+}
+
+func cError(cerr *C.char) error {
+	if cerr == nil {
+		return nil
+	}
+	defer C.free(unsafe.Pointer(cerr))
+	return fmt.Errorf("%s", C.GoString(cerr))
+}
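The player above converts 16-bit PCM samples to float32 in `av_player_write` and carries an odd trailing byte between `Write` calls in `tail`. The same two steps can be sketched in pure Go, without cgo, to make the arithmetic easier to follow. `pcmDecoder` and `Decode` are hypothetical names. The sketch also assumes little-endian samples, whereas the C code reads them in the host's native byte order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcmDecoder turns a stream of 16-bit PCM bytes into float32 samples.
// It keeps at most one leftover byte when a chunk splits a sample in half.
// Hypothetical type for illustration; not part of the package.
type pcmDecoder struct {
	tail []byte
}

// Decode prepends any leftover byte, converts every complete 2-byte sample,
// and stores a new leftover byte if the combined length is odd.
func (d *pcmDecoder) Decode(chunk []byte) []float32 {
	data := make([]byte, 0, len(d.tail)+len(chunk))
	data = append(data, d.tail...)
	data = append(data, chunk...)
	d.tail = nil

	if len(data)%2 == 1 {
		d.tail = []byte{data[len(data)-1]}
		data = data[:len(data)-1]
	}

	out := make([]float32, len(data)/2)
	for i := range out {
		// Assumption: samples are little-endian signed 16-bit integers.
		s := int16(binary.LittleEndian.Uint16(data[2*i:]))
		out[i] = float32(s) / 32768.0 // maps [-32768, 32767] into [-1, 1)
	}
	return out
}

func main() {
	d := &pcmDecoder{}
	// Three bytes: one full sample (0x4000) plus half of the next one.
	fmt.Println(d.Decode([]byte{0x00, 0x40, 0x00}))
	// The missing high byte arrives; together with the tail it forms 0x8000.
	fmt.Println(d.Decode([]byte{0x80}))
}
```

Dividing by 32768 rather than 32767 keeps the most negative sample at exactly -1.0, which matches the constant used in the C loop.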
package/main.go
CHANGED
@@ -96,12 +96,6 @@ func (e *TaskEngine) Start() {
 func (e *TaskEngine) Submit(text string, voice VoiceInfo, cfg Config) uint64 {
 	e.mu.Lock()
 
-	// When a new task arrives, first delete every task that has not started.
-	for _, id := range e.pending {
-		delete(e.tasks, id)
-		log.Printf("Deleting pending task: id=%d", id)
-	}
-	e.pending = e.pending[:0]
 
 	e.nextID++
 	task := &Task{
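The hunk above removes the loop that cleared `e.pending` on every `Submit`, so queued tasks now stay in FIFO order. A stripped-down model of that behavior is sketched below. `miniEngine`, `Submit`, and `claim` are hypothetical stand-ins rather than the package's `TaskEngine`. The sketch deliberately leaves out the worker goroutine and the stale-task check that `main_test.go` exercises.

```go
package main

import (
	"fmt"
	"sync"
)

// miniEngine is a hypothetical, simplified stand-in for the task store:
// a map of task text plus a FIFO slice of pending task IDs.
type miniEngine struct {
	mu      sync.Mutex
	nextID  uint64
	tasks   map[uint64]string
	pending []uint64
}

func newMiniEngine() *miniEngine {
	return &miniEngine{tasks: make(map[uint64]string)}
}

// Submit appends a new task without touching tasks that are already pending.
func (e *miniEngine) Submit(text string) uint64 {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.nextID++
	e.tasks[e.nextID] = text
	e.pending = append(e.pending, e.nextID)
	return e.nextID
}

// claim removes and returns the oldest pending task ID, if any.
func (e *miniEngine) claim() (uint64, bool) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if len(e.pending) == 0 {
		return 0, false
	}
	id := e.pending[0]
	e.pending = e.pending[1:]
	return id, true
}

func main() {
	e := newMiniEngine()
	e.Submit("a")
	e.Submit("b")
	for {
		id, ok := e.claim()
		if !ok {
			break
		}
		fmt.Printf("claimed task %d: %s\n", id, e.tasks[id])
	}
}
```

With the removed loop still in place, the second `Submit` would have dropped task 1 before queuing task 2. In this model both remain claimable in submission order.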
package/main_test.go
ADDED
@@ -0,0 +1,746 @@
+package main
+
+import (
+	"context"
+	"encoding/base64"
+	"errors"
+	"net"
+	"os"
+	"path/filepath"
+	"strings"
+	"sync"
+	"testing"
+	"time"
+)
+
+func TestSubmitKeepsAllPendingTasks(t *testing.T) {
+	e := NewTaskEngine()
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		return onAudio([]byte("ok"))
+	}
+	e.newStreamPlayerFn = newFakeStreamPlayerFactory()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+
+	e.Submit("a", voice, cfg)
+	e.Submit("b", voice, cfg)
+
+	e.mu.Lock()
+	defer e.mu.Unlock()
+	if len(e.pending) != 2 {
+		t.Fatalf("expected 2 pending, got %d", len(e.pending))
+	}
+	if e.latestID != 2 {
+		t.Fatalf("expected latestID 2, got %d", e.latestID)
+	}
+}
+
+func TestTransactionDeletesOnSynthesisFailureWithoutRetry(t *testing.T) {
+	e := NewTaskEngine()
+	var mu sync.Mutex
+	calls := 0
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		mu.Lock()
+		defer mu.Unlock()
+		calls++
+		return errors.New("fail")
+	}
+	e.newStreamPlayerFn = newFakeStreamPlayerFactory()
+	e.Start()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	id := e.Submit("x", voice, cfg)
+
+	waitFor(t, 2*time.Second, func() bool {
+		e.mu.Lock()
+		defer e.mu.Unlock()
+		_, ok := e.tasks[id]
+		return !ok
+	})
+
+	mu.Lock()
+	defer mu.Unlock()
+	if calls != 1 {
+		t.Fatalf("expected synth attempts=1, got %d", calls)
+	}
+}
+
+func TestTransactionDeletesOnPlayerWriteFailureWithoutRetry(t *testing.T) {
+	e := NewTaskEngine()
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		return onAudio([]byte("audio"))
+	}
+	var mu sync.Mutex
+	calls := 0
+	e.newStreamPlayerFn = func() (StreamPlayer, error) {
+		mu.Lock()
+		defer mu.Unlock()
+		calls++
+		return &fakeStreamPlayer{writeErr: calls == 1}, nil
+	}
+	e.Start()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	id := e.Submit("x", voice, cfg)
+
+	waitFor(t, 2*time.Second, func() bool {
+		e.mu.Lock()
+		defer e.mu.Unlock()
+		_, ok := e.tasks[id]
+		return !ok
+	})
+
+	mu.Lock()
+	defer mu.Unlock()
+	if calls != 1 {
+		t.Fatalf("expected player attempts=1, got %d", calls)
+	}
+}
+
+func TestSubmitDoesNotInterruptRunningTask(t *testing.T) {
+	e := NewTaskEngine()
+	start := make(chan struct{}, 1)
+	release := make(chan struct{})
+	var mu sync.Mutex
+	processed := make([]string, 0, 2)
+
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		if text == "a" {
+			start <- struct{}{}
+		}
+		if text == "a" {
+			<-release
+		}
+		mu.Lock()
+		processed = append(processed, text)
+		mu.Unlock()
+		return onAudio([]byte(text))
+	}
+	e.newStreamPlayerFn = newFakeStreamPlayerFactory()
+	e.Start()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	e.Submit("a", voice, cfg)
+	<-start // a is now running
+	secondID := e.Submit("b", voice, cfg)
+
+	e.mu.Lock()
+	if len(e.pending) != 1 || e.pending[0] != secondID {
+		t.Fatalf("expected second task pending while first keeps running, got %#v", e.pending)
+	}
+	e.mu.Unlock()
+
+	close(release)
+
+	waitFor(t, 3*time.Second, func() bool {
+		e.mu.Lock()
+		defer e.mu.Unlock()
+		return len(e.tasks) == 0
+	})
+
+	mu.Lock()
+	defer mu.Unlock()
+	if strings.Join(processed, "") != "ab" {
+		t.Fatalf("expected processed [a b], got %#v", processed)
+	}
+}
+
+func TestClaimedStaleTaskIsSkippedBeforeTransaction(t *testing.T) {
+	e := NewTaskEngine()
+	calls := 0
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		calls++
+		return onAudio([]byte(text))
+	}
+	e.newStreamPlayerFn = newFakeStreamPlayerFactory()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	firstID := e.Submit("a", voice, cfg)
+	claimedID := e.claimPending()
+	if claimedID != firstID {
+		t.Fatalf("expected claimed first task %d, got %d", firstID, claimedID)
+	}
+	e.Submit("b", voice, cfg)
+
+	e.processTransaction(firstID)
+
+	if calls != 0 {
+		t.Fatalf("expected stale task skipped before transaction, got calls=%d", calls)
+	}
+	e.mu.Lock()
+	_, firstExists := e.tasks[firstID]
+	e.mu.Unlock()
+	if firstExists {
+		t.Fatalf("expected stale task deleted")
+	}
+}
+
+func TestSynthesisPanicDeletesTaskAndWorkerContinues(t *testing.T) {
+	e := NewTaskEngine()
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		if text == "panic" {
+			panic("boom")
+		}
+		return onAudio([]byte(text))
+	}
+	e.newStreamPlayerFn = newFakeStreamPlayerFactory()
+	e.Start()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	panicID := e.Submit("panic", voice, cfg)
+	waitForTaskDeleted(t, e, panicID)
+
+	okID := e.Submit("ok", voice, cfg)
+	waitForTaskDeleted(t, e, okID)
+}
+
+func TestPlaybackPanicDeletesTaskAndWorkerContinues(t *testing.T) {
+	e := NewTaskEngine()
+	e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
+		return onAudio([]byte(text))
+	}
+	var mu sync.Mutex
+	calls := 0
+	e.newStreamPlayerFn = func() (StreamPlayer, error) {
+		mu.Lock()
+		defer mu.Unlock()
+		calls++
+		return &fakeStreamPlayer{panicOnWrite: calls == 1}, nil
+	}
+	e.Start()
+
+	cfg := Config{}
+	voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
+	panicID := e.Submit("panic", voice, cfg)
+	waitForTaskDeleted(t, e, panicID)
+
+	okID := e.Submit("ok", voice, cfg)
+	waitForTaskDeleted(t, e, okID)
+}
+
+func TestParseSSEStreamWritesChunksInOrder(t *testing.T) {
+	stream := strings.NewReader(
+		"data: {\"audio\":\"YQ==\"}\n\n" +
+			"data: {\"data\":{\"audio\":\"Yg==\"}}\n\n" +
+			"data: [DONE]\n\n",
+	)
+
+	var got []string
+	err := parseSSEStream(stream, func(audio []byte) error {
+		got = append(got, string(audio))
+		return nil
+	})
+	if err != nil {
+		t.Fatalf("parse stream: %v", err)
+	}
+	if strings.Join(got, "") != "ab" {
+		t.Fatalf("expected chunks ab, got %#v", got)
+	}
+}
+
+func TestParseSSEStreamHandlesLargeAudioLine(t *testing.T) {
+	payload := strings.Repeat("a", 300*1024)
+	encoded := base64.StdEncoding.EncodeToString([]byte(payload))
+	stream := strings.NewReader("data: {\"audio\":\"" + encoded + "\"}\n\n")
+
+	var got []byte
+	err := parseSSEStream(stream, func(audio []byte) error {
+		got = append(got, audio...)
+		return nil
+	})
+	if err != nil {
+		t.Fatalf("parse stream: %v", err)
+	}
+	if string(got) != payload {
+		t.Fatalf("expected large payload preserved, got len=%d", len(got))
+	}
+}
+
+func TestParseSSEStreamReturnsTTSFailureMessage(t *testing.T) {
+	stream := strings.NewReader("event: 153\ndata: {\"code\":55001307,\"message\":\"voice clone failed\",\"data\":null}\n\n")
+
+	err := parseSSEStream(stream, func(audio []byte) error {
+		t.Fatalf("unexpected audio callback")
+		return nil
+	})
+	if err == nil {
+		t.Fatalf("expected tts failure")
+	}
+	if !strings.Contains(err.Error(), "55001307") || !strings.Contains(err.Error(), "voice clone failed") {
+		t.Fatalf("expected code and message in error, got %v", err)
+	}
+}
+
+func TestCleanTextRemovesSpeechNoise(t *testing.T) {
+	input := strings.Join([]string{
+		"## 结果",
+		"- **验证通过**:[main.go](/Users/admin/iCode/iSpeak/main.go:123)",
+		"- commit: a97e57d Improve latest-only task handling",
+		"- 路径:/Users/admin/iCode/iSpeak/main.go",
+		"| 文件 | 状态 |",
+		"|------|------|",
+		"| model-00001.safetensors | ✅ 完整 |",
+		"```go",
+		"fmt.Println(\"不要播代码\")",
+		"```",
+		"https://example.com/path",
+		"飞哥,需要你重启服务。",
+	}, "\n")
+
+	got := cleanText(input)
+	for _, bad := range []string{
+		"**",
+		"`",
+		"/Users/admin",
+		"https://",
+		"fmt.Println",
+		"safetensors",
+		"文件",
+		"状态",
+		"完整",
+		"|------|",
+		"a97e57d",
+	} {
+		if strings.Contains(got, bad) {
+			t.Fatalf("expected cleaned text not to contain %q, got %q", bad, got)
+		}
+	}
+	for _, want := range []string{
+		"结果",
+		"验证通过",
+		"main.go",
+		"路径",
+		"飞哥,需要你重启服务。",
+	} {
+		if !strings.Contains(got, want) {
+			t.Fatalf("expected cleaned text to contain %q, got %q", want, got)
+		}
+	}
+}
+
+func TestCleanTextOrderingPreservesLinkTitleBeforeRemovingURL(t *testing.T) {
+	got := cleanText("参考:[架构文档](https://example.com/docs)。")
+	if !strings.Contains(got, "架构文档") {
+		t.Fatalf("expected link title preserved, got %q", got)
+	}
+	if strings.Contains(got, "https://") || strings.Contains(got, "example.com") {
+		t.Fatalf("expected URL removed, got %q", got)
+	}
+}
+
+func TestCleanTextOrderingSkipsCodeBeforePathAndTableRules(t *testing.T) {
+	input := strings.Join([]string{
+		"结论保留。",
+		"```text",
+		"| 不该 | 播 |",
+		"/Users/admin/iCode/iSpeak/main.go",
+		"```",
+		"后续保留。",
+	}, "\n")
+
+	got := cleanText(input)
+	for _, bad := range []string{"不该", "/Users/admin", "main.go"} {
+		if strings.Contains(got, bad) {
+			t.Fatalf("expected code block removed before inline cleaning, got %q", got)
+		}
+	}
+	if !strings.Contains(got, "结论保留。") || !strings.Contains(got, "后续保留。") {
+		t.Fatalf("expected surrounding text preserved, got %q", got)
+	}
+}
+
+func TestCleanTextOrderingRemovesTableHeaderWhenSeparatorAppears(t *testing.T) {
+	input := strings.Join([]string{
+		"前言。",
+		"| 文件 | 状态 |",
+		"|---|---|",
+		"| main.go | 通过 |",
+		"结论。",
+	}, "\n")
+
+	got := cleanText(input)
+	for _, bad := range []string{"文件", "状态", "main.go"} {
+		if strings.Contains(got, bad) {
+			t.Fatalf("expected table header/content removed, got %q", got)
+		}
+	}
+	if !strings.Contains(got, "前言。") || !strings.Contains(got, "结论。") {
+		t.Fatalf("expected surrounding text preserved, got %q", got)
+	}
+}
+
+func TestCleanTextOrderingRemovesUUIDBeforeCommitHash(t *testing.T) {
+	got := cleanText("请求 ID:123e4567-e89b-12d3-a456-426614174000,状态成功。")
+	if strings.Contains(got, "123e4567") || strings.Contains(got, "426614174000") {
+		t.Fatalf("expected UUID removed as a whole, got %q", got)
|
|
382
|
+
}
|
|
383
|
+
if !strings.Contains(got, "状态成功。") {
|
|
384
|
+
t.Fatalf("expected conclusion preserved, got %q", got)
|
|
385
|
+
}
|
|
386
|
+
}
|
|
387
|
+
|
|
388
|
+
func TestCleanTextSkipsWholeMarkdownTable(t *testing.T) {
|
|
389
|
+
input := strings.Join([]string{
|
|
390
|
+
"表格如下:",
|
|
391
|
+
"| 文件 | 状态 |",
|
|
392
|
+
"|------|------|",
|
|
393
|
+
"| main.go | 通过 |",
|
|
394
|
+
"| main_test.go | 通过 |",
|
|
395
|
+
"结论:验证通过。",
|
|
396
|
+
}, "\n")
|
|
397
|
+
|
|
398
|
+
got := cleanText(input)
|
|
399
|
+
for _, bad := range []string{"文件", "状态", "main.go", "main_test.go"} {
|
|
400
|
+
if strings.Contains(got, bad) {
|
|
401
|
+
t.Fatalf("expected table content removed, got %q", got)
|
|
402
|
+
}
|
|
403
|
+
}
|
|
404
|
+
if !strings.Contains(got, "表格如下:") || !strings.Contains(got, "结论:验证通过。") {
|
|
405
|
+
t.Fatalf("expected surrounding text preserved, got %q", got)
|
|
406
|
+
}
|
|
407
|
+
}
|
|
408
|
+
|
|
409
|
+
func TestCleanTextSkipsArtifactAndHTML(t *testing.T) {
|
|
410
|
+
input := strings.Join([]string{
|
|
411
|
+
"这是前置结论。",
|
|
412
|
+
`<artifact identifier="demo" type="text/html">`,
|
|
413
|
+
"<!doctype html>",
|
|
414
|
+
"<html><body>不要播 HTML</body></html>",
|
|
415
|
+
"</artifact>",
|
|
416
|
+
"这是后置结论。",
|
|
417
|
+
}, "\n")
|
|
418
|
+
|
|
419
|
+
got := cleanText(input)
|
|
420
|
+
if strings.Contains(got, "HTML") || strings.Contains(got, "artifact") {
|
|
421
|
+
t.Fatalf("expected artifact/html removed, got %q", got)
|
|
422
|
+
}
|
|
423
|
+
if !strings.Contains(got, "这是前置结论。") || !strings.Contains(got, "这是后置结论。") {
|
|
424
|
+
t.Fatalf("expected surrounding conclusions preserved, got %q", got)
|
|
425
|
+
}
|
|
426
|
+
}
|
|
427
|
+
|
|
428
|
+
func TestCleanTextKeepsChinesePercentConclusion(t *testing.T) {
|
|
429
|
+
input := strings.Join([]string{
|
|
430
|
+
"下载 42% 12MB/s eta 1m",
|
|
431
|
+
"测试通过率 95%,可以发布。",
|
|
432
|
+
}, "\n")
|
|
433
|
+
|
|
434
|
+
got := cleanText(input)
|
|
435
|
+
if strings.Contains(got, "12MB/s") || strings.Contains(got, "eta") {
|
|
436
|
+
t.Fatalf("expected progress noise removed, got %q", got)
|
|
437
|
+
}
|
|
438
|
+
if !strings.Contains(got, "测试通过率 95%,可以发布。") {
|
|
439
|
+
t.Fatalf("expected Chinese percent conclusion preserved, got %q", got)
|
|
440
|
+
}
|
|
441
|
+
}
|
|
442
|
+
|
|
443
|
+
func TestCleanTextKeepsPlainPercentLine(t *testing.T) {
|
|
444
|
+
got := cleanText("覆盖率 95%")
|
|
445
|
+
if !strings.Contains(got, "覆盖率 95%") {
|
|
446
|
+
t.Fatalf("expected plain percent line preserved, got %q", got)
|
|
447
|
+
}
|
|
448
|
+
}
|
|
449
|
+
|
|
450
|
+
func TestCleanTextKeepsOrdinaryFileReferenceLine(t *testing.T) {
|
|
451
|
+
got := cleanText("已更新 main.go 和 README.md。")
|
|
452
|
+
if !strings.Contains(got, "main.go") || !strings.Contains(got, "README.md") {
|
|
453
|
+
t.Fatalf("expected ordinary file references preserved, got %q", got)
|
|
454
|
+
}
|
|
455
|
+
}
|
|
456
|
+
|
|
457
|
+
func TestCleanTextSingleLineArtifactDoesNotSwallowFollowingText(t *testing.T) {
|
|
458
|
+
input := strings.Join([]string{
|
|
459
|
+
`<artifact identifier="demo">不要播</artifact>`,
|
|
460
|
+
"后面的结论要保留。",
|
|
461
|
+
}, "\n")
|
|
462
|
+
|
|
463
|
+
got := cleanText(input)
|
|
464
|
+
if strings.Contains(got, "不要播") || strings.Contains(got, "artifact") {
|
|
465
|
+
t.Fatalf("expected single-line artifact removed, got %q", got)
|
|
466
|
+
}
|
|
467
|
+
if !strings.Contains(got, "后面的结论要保留。") {
|
|
468
|
+
t.Fatalf("expected following text preserved, got %q", got)
|
|
469
|
+
}
|
|
470
|
+
}
|
|
471
|
+
|
|
472
|
+
func TestHandleConnectionPreservesMultilineBeforeCleaning(t *testing.T) {
|
|
473
|
+
oldConfigDir := configDir
|
|
474
|
+
oldCacheValid := configCacheValid
|
|
475
|
+
oldCachePath := configCachePath
|
|
476
|
+
oldCacheModTime := configCacheModTime
|
|
477
|
+
oldCache := configCache
|
|
478
|
+
t.Cleanup(func() {
|
|
479
|
+
configDir = oldConfigDir
|
|
480
|
+
configCacheValid = oldCacheValid
|
|
481
|
+
configCachePath = oldCachePath
|
|
482
|
+
configCacheModTime = oldCacheModTime
|
|
483
|
+
configCache = oldCache
|
|
484
|
+
})
|
|
485
|
+
|
|
486
|
+
dir := t.TempDir()
|
|
487
|
+
configDir = dir
|
|
488
|
+
configCacheValid = false
|
|
489
|
+
cfg := `{
|
|
490
|
+
"apiKey": "key",
|
|
491
|
+
"endpoint": "https://example.com/tts",
|
|
492
|
+
"defaultVoice": {"voice_type": "voice", "resourceId": "resource"}
|
|
493
|
+
}`
|
|
494
|
+
if err := os.WriteFile(filepath.Join(dir, "config.json"), []byte(cfg), 0644); err != nil {
|
|
495
|
+
t.Fatalf("write config: %v", err)
|
|
496
|
+
}
|
|
497
|
+
|
|
498
|
+
server, client := net.Pipe()
|
|
499
|
+
e := NewTaskEngine()
|
|
500
|
+
done := make(chan struct{})
|
|
501
|
+
go func() {
|
|
502
|
+
handleConnection(server, e)
|
|
503
|
+
close(done)
|
|
504
|
+
}()
|
|
505
|
+
|
|
506
|
+
msg := strings.Join([]string{
|
|
507
|
+
"不是,飞哥。",
|
|
508
|
+
"",
|
|
509
|
+
"| 部分 | 是否常驻 |",
|
|
510
|
+
"|---|---|",
|
|
511
|
+
"| ispeakd | 是 |",
|
|
512
|
+
"",
|
|
513
|
+
"也就是说:daemon 常驻,播放器不是常驻。",
|
|
514
|
+
}, "\n")
|
|
515
|
+
if _, err := client.Write([]byte(msg)); err != nil {
|
|
516
|
+
t.Fatalf("write client: %v", err)
|
|
517
|
+
}
|
|
518
|
+
if err := client.Close(); err != nil {
|
|
519
|
+
t.Fatalf("close client: %v", err)
|
|
520
|
+
}
|
|
521
|
+
select {
|
|
522
|
+
case <-done:
|
|
523
|
+
case <-time.After(time.Second):
|
|
524
|
+
t.Fatalf("handleConnection did not return")
|
|
525
|
+
}
|
|
526
|
+
|
|
527
|
+
e.mu.Lock()
|
|
528
|
+
defer e.mu.Unlock()
|
|
529
|
+
if len(e.pending) != 1 {
|
|
530
|
+
t.Fatalf("expected one pending task, got %d", len(e.pending))
|
|
531
|
+
}
|
|
532
|
+
task := e.tasks[e.pending[0]]
|
|
533
|
+
if !strings.Contains(task.Text, "不是,飞哥。") ||
|
|
534
|
+
!strings.Contains(task.Text, "也就是说:daemon 常驻,播放器不是常驻。") {
|
|
535
|
+
t.Fatalf("expected surrounding text preserved, got %q", task.Text)
|
|
536
|
+
}
|
|
537
|
+
if strings.Contains(task.Text, "是否常驻") || strings.Contains(task.Text, "ispeakd | 是") {
|
|
538
|
+
t.Fatalf("expected table removed, got %q", task.Text)
|
|
539
|
+
}
|
|
540
|
+
}
|
|
541
|
+
|
|
542
|
+
func TestInvalidSSEAudioDeletesTaskAndWorkerContinues(t *testing.T) {
|
|
543
|
+
e := NewTaskEngine()
|
|
544
|
+
e.synthesizeStreamFn = func(ctx context.Context, cfg Config, text string, voice *VoiceInfo, onAudio func([]byte) error) error {
|
|
545
|
+
if text == "bad" {
|
|
546
|
+
return parseSSEStream(strings.NewReader("data: {\"audio\":\"***\"}\n\n"), onAudio)
|
|
547
|
+
}
|
|
548
|
+
return onAudio([]byte(text))
|
|
549
|
+
}
|
|
550
|
+
e.newStreamPlayerFn = newFakeStreamPlayerFactory()
|
|
551
|
+
e.Start()
|
|
552
|
+
|
|
553
|
+
cfg := Config{}
|
|
554
|
+
voice := VoiceInfo{VoiceType: "v", ResourceID: "r"}
|
|
555
|
+
badID := e.Submit("bad", voice, cfg)
|
|
556
|
+
waitForTaskDeleted(t, e, badID)
|
|
557
|
+
|
|
558
|
+
okID := e.Submit("ok", voice, cfg)
|
|
559
|
+
waitForTaskDeleted(t, e, okID)
|
|
560
|
+
}
|
|
561
|
+
|
|
562
|
+
func TestValidateConfigRequiresDefaultVoiceResourceID(t *testing.T) {
|
|
563
|
+
cfg := Config{
|
|
564
|
+
APIKey: "key",
|
|
565
|
+
Endpoint: "https://example.com/tts",
|
|
566
|
+
DefaultVoice: &VoiceInfo{
|
|
567
|
+
VoiceType: "voice",
|
|
568
|
+
},
|
|
569
|
+
}
|
|
570
|
+
|
|
571
|
+
err := validateConfig(cfg)
|
|
572
|
+
if err == nil || !strings.Contains(err.Error(), "defaultVoice.resourceId") {
|
|
573
|
+
t.Fatalf("expected defaultVoice.resourceId error, got %v", err)
|
|
574
|
+
}
|
|
575
|
+
}
|
|
576
|
+
|
|
577
|
+
func TestValidateConfigRequiresSourceVoiceResourceID(t *testing.T) {
|
|
578
|
+
cfg := Config{
|
|
579
|
+
APIKey: "key",
|
|
580
|
+
Endpoint: "https://example.com/tts",
|
|
581
|
+
DefaultVoice: &VoiceInfo{
|
|
582
|
+
VoiceType: "voice",
|
|
583
|
+
ResourceID: "resource",
|
|
584
|
+
},
|
|
585
|
+
SourceVoices: map[string]*VoiceInfo{
|
|
586
|
+
"codex": {
|
|
587
|
+
VoiceType: "codex-voice",
|
|
588
|
+
},
|
|
589
|
+
},
|
|
590
|
+
}
|
|
591
|
+
|
|
592
|
+
err := validateConfig(cfg)
|
|
593
|
+
if err == nil || !strings.Contains(err.Error(), "sourceVoices.codex.resourceId") {
|
|
594
|
+
t.Fatalf("expected sourceVoices.codex.resourceId error, got %v", err)
|
|
595
|
+
}
|
|
596
|
+
}
|
|
597
|
+
|
|
598
|
+
type fakeStreamPlayer struct {
|
|
599
|
+
writeErr bool
|
|
600
|
+
closeErr bool
|
|
601
|
+
panicOnWrite bool
|
|
602
|
+
chunks [][]byte
|
|
603
|
+
aborted bool
|
|
604
|
+
closed bool
|
|
605
|
+
closeBlock chan struct{}
|
|
606
|
+
closeStarted chan struct{}
|
|
607
|
+
closeOnce sync.Once
|
|
608
|
+
}
|
|
609
|
+
|
|
610
|
+
func newFakeStreamPlayerFactory() func() (StreamPlayer, error) {
|
|
611
|
+
return func() (StreamPlayer, error) {
|
|
612
|
+
return &fakeStreamPlayer{}, nil
|
|
613
|
+
}
|
|
614
|
+
}
|
|
615
|
+
|
|
616
|
+
func (p *fakeStreamPlayer) Write(audio []byte) error {
|
|
617
|
+
if p.panicOnWrite {
|
|
618
|
+
panic("boom")
|
|
619
|
+
}
|
|
620
|
+
if p.writeErr {
|
|
621
|
+
return errors.New("write failed")
|
|
622
|
+
}
|
|
623
|
+
p.chunks = append(p.chunks, append([]byte(nil), audio...))
|
|
624
|
+
return nil
|
|
625
|
+
}
|
|
626
|
+
|
|
627
|
+
func (p *fakeStreamPlayer) CloseAndWait() error {
|
|
628
|
+
p.closed = true
|
|
629
|
+
if p.closeStarted != nil {
|
|
630
|
+
p.closeStarted <- struct{}{}
|
|
631
|
+
}
|
|
632
|
+
if p.closeBlock != nil {
|
|
633
|
+
<-p.closeBlock
|
|
634
|
+
}
|
|
635
|
+
if p.closeErr {
|
|
636
|
+
return errors.New("close failed")
|
|
637
|
+
}
|
|
638
|
+
return nil
|
|
639
|
+
}
|
|
640
|
+
|
|
641
|
+
func (p *fakeStreamPlayer) Abort() error {
|
|
642
|
+
p.aborted = true
|
|
643
|
+
if p.closeBlock != nil {
|
|
644
|
+
p.closeOnce.Do(func() {
|
|
645
|
+
close(p.closeBlock)
|
|
646
|
+
})
|
|
647
|
+
}
|
|
648
|
+
return nil
|
|
649
|
+
}
|
|
650
|
+
|
|
651
|
+
func TestListenUnixSocketRemovesStalePath(t *testing.T) {
|
|
652
|
+
socketPath := shortSocketPath(t)
|
|
653
|
+
if err := os.WriteFile(socketPath, []byte("stale"), 0644); err != nil {
|
|
654
|
+
t.Fatalf("write stale socket path: %v", err)
|
|
655
|
+
}
|
|
656
|
+
|
|
657
|
+
listener, err := listenUnixSocket(socketPath)
|
|
658
|
+
if err != nil {
|
|
659
|
+
t.Fatalf("listen with stale socket path: %v", err)
|
|
660
|
+
}
|
|
661
|
+
defer listener.Close()
|
|
662
|
+
}
|
|
663
|
+
|
|
664
|
+
func TestListenUnixSocketDetectsRunningInstance(t *testing.T) {
|
|
665
|
+
socketPath := shortSocketPath(t)
|
|
666
|
+
listener, err := listenUnixSocket(socketPath)
|
|
667
|
+
if err != nil {
|
|
668
|
+
t.Fatalf("first listen: %v", err)
|
|
669
|
+
}
|
|
670
|
+
defer listener.Close()
|
|
671
|
+
|
|
672
|
+
done := make(chan struct{})
|
|
673
|
+
go func() {
|
|
674
|
+
conn, err := listener.Accept()
|
|
675
|
+
if err == nil {
|
|
676
|
+
_ = conn.Close()
|
|
677
|
+
}
|
|
678
|
+
close(done)
|
|
679
|
+
}()
|
|
680
|
+
|
|
681
|
+
second, err := listenUnixSocket(socketPath)
|
|
682
|
+
if second != nil {
|
|
683
|
+
_ = second.Close()
|
|
684
|
+
}
|
|
685
|
+
if !errors.Is(err, errAlreadyRunning) {
|
|
686
|
+
t.Fatalf("expected errAlreadyRunning, got %v", err)
|
|
687
|
+
}
|
|
688
|
+
|
|
689
|
+
select {
|
|
690
|
+
case <-done:
|
|
691
|
+
case <-time.After(time.Second):
|
|
692
|
+
t.Fatalf("test listener did not accept probe connection")
|
|
693
|
+
}
|
|
694
|
+
}
|
|
695
|
+
|
|
696
|
+
func TestListenUnixSocketRemovesClosedListenerSocket(t *testing.T) {
|
|
697
|
+
socketPath := shortSocketPath(t)
|
|
698
|
+
stale, err := net.Listen("unix", socketPath)
|
|
699
|
+
if err != nil {
|
|
700
|
+
t.Fatalf("create stale listener: %v", err)
|
|
701
|
+
}
|
|
702
|
+
if err := stale.Close(); err != nil {
|
|
703
|
+
t.Fatalf("close stale listener: %v", err)
|
|
704
|
+
}
|
|
705
|
+
|
|
706
|
+
listener, err := listenUnixSocket(socketPath)
|
|
707
|
+
if err != nil {
|
|
708
|
+
t.Fatalf("listen after stale listener close: %v", err)
|
|
709
|
+
}
|
|
710
|
+
defer listener.Close()
|
|
711
|
+
}
|
|
712
|
+
|
|
713
|
+
func shortSocketPath(t *testing.T) string {
|
|
714
|
+
t.Helper()
|
|
715
|
+
|
|
716
|
+
dir, err := os.MkdirTemp("/tmp", "ispeak-*")
|
|
717
|
+
if err != nil {
|
|
718
|
+
t.Fatalf("create temp dir: %v", err)
|
|
719
|
+
}
|
|
720
|
+
t.Cleanup(func() {
|
|
721
|
+
_ = os.RemoveAll(dir)
|
|
722
|
+
})
|
|
723
|
+
return filepath.Join(dir, "sock")
|
|
724
|
+
}
|
|
725
|
+
|
|
726
|
+
func waitForTaskDeleted(t *testing.T, e *TaskEngine, id uint64) {
|
|
727
|
+
t.Helper()
|
|
728
|
+
waitFor(t, 2*time.Second, func() bool {
|
|
729
|
+
e.mu.Lock()
|
|
730
|
+
defer e.mu.Unlock()
|
|
731
|
+
_, ok := e.tasks[id]
|
|
732
|
+
return !ok
|
|
733
|
+
})
|
|
734
|
+
}
|
|
735
|
+
|
|
736
|
+
func waitFor(t *testing.T, timeout time.Duration, fn func() bool) {
|
|
737
|
+
t.Helper()
|
|
738
|
+
deadline := time.Now().Add(timeout)
|
|
739
|
+
for time.Now().Before(deadline) {
|
|
740
|
+
if fn() {
|
|
741
|
+
return
|
|
742
|
+
}
|
|
743
|
+
time.Sleep(10 * time.Millisecond)
|
|
744
|
+
}
|
|
745
|
+
t.Fatalf("condition not met within %s", timeout)
|
|
746
|
+
}
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@xdfnet/ispeak",
|
|
3
|
-
"version": "1.6.
|
|
3
|
+
"version": "1.6.11",
|
|
4
4
|
"description": "Local macOS TTS daemon for AI coding assistants, powered by Volcengine streaming TTS.",
|
|
5
5
|
"license": "MIT",
|
|
6
6
|
"homepage": "https://github.com/xdfnet/iSpeak#readme",
|
|
@@ -31,6 +31,12 @@
|
|
|
31
31
|
"files": [
|
|
32
32
|
"main.go",
|
|
33
33
|
"clean_text.go",
|
|
34
|
+
"avaudioengine_player_darwin.go",
|
|
35
|
+
"stream_player_unsupported.go",
|
|
36
|
+
"main_test.go",
|
|
37
|
+
"Makefile",
|
|
38
|
+
"CLAUDE.md",
|
|
39
|
+
"AGENTS.md",
|
|
34
40
|
"go.mod",
|
|
35
41
|
"go.sum",
|
|
36
42
|
"scripts/ispeak",
|