@aiyiran/myclaw 1.1.108 → 1.1.110
package/package.json
CHANGED
@@ -33,6 +33,11 @@
         "templates"
       ],
       "description": "课程模板制作流水线(demo -> student/teacher JSON -> HTML 打包)"
+    },
+    {
+      "name": "chat-history-extractor",
+      "strategy": "on",
+      "description": "从 OpenClaw session URL 提取并渲染聊天记录"
     }
   ],
   "_doc_agents": "Step 3: 将 agent-list/ 下的智能体分发到 ~/.openclaw/ 并注册到 openclaw.json",
@@ -0,0 +1,87 @@
+---
+name: chat-history-extractor
+description: Extract and render chat history from OpenClaw session URLs. Use when the user provides a URL like https://claw1.kekouen.cn/chat?session=agent%3Ac108-v1-1811%3Amain containing a session key. The skill parses the URL, finds the corresponding chat JSONL file, and generates a JS data file for rendering. It also provides a template HTML file for visualizing the conversation with timing metrics.
+---
+
+# Chat History Extractor
+
+## Workflow
+
+### Step 1: Parse URL and Extract Session Key
+
+From a URL like `https://claw1.kekouen.cn/chat?session=agent%3Ac108-v1-1811%3Amain`:
+
+1. URL-decode the `session` query parameter
+2. The decoded session key is `agent:c108-v1-1811:main`
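
As a rough illustration of Step 1, the decoding can be done with Python's standard `urllib.parse`. The helper name `session_key_from_url` is hypothetical and not part of the skill. This is a minimal sketch that assumes the key always arrives in the `session` query parameter.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def session_key_from_url(url: str) -> Optional[str]:
    """Return the decoded `session` query parameter, or None if it is absent.

    Hypothetical helper, for illustration only.
    """
    # parse_qs percent-decodes values, so "%3A" becomes ":".
    values = parse_qs(urlparse(url).query).get("session")
    return values[0] if values else None


key = session_key_from_url(
    "https://claw1.kekouen.cn/chat?session=agent%3Ac108-v1-1811%3Amain"
)
print(key)  # agent:c108-v1-1811:main
```

Returning `None` for a URL without a `session` parameter lets the caller decide whether to treat the input as a raw key or report an error.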
+
+### Step 2: Find Session JSONL File
+
+The session index for each agent is stored at:
+```
+/root/.openclaw/agents/{agentId}/sessions/sessions.json
+```
+
+For session key `agent:c108-v1-1811:main`:
+- Agent ID: `c108-v1-1811`
+- Session index: `/root/.openclaw/agents/c108-v1-1811/sessions/sessions.json`
+
+Look up the session key in `sessions.json` to find the `sessionFile` path (e.g., `5bee9664-7b72-4efd-8dc1-e8bf125c6b9c.jsonl`).
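
The lookup in Step 2 might be sketched as follows. The function name `resolve_session_file` and the `agents_root` parameter are hypothetical. The sketch assumes `sessions.json` is a JSON object keyed by session key whose entries carry a `sessionFile` field, and it further assumes that a bare file name is relative to the agent's `sessions` directory, which the text above does not confirm.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_session_file(session_key: str, agents_root: Path) -> Optional[Path]:
    """Map a session key such as 'agent:<agentId>:<name>' to its JSONL path."""
    parts = session_key.split(":")
    if len(parts) < 2:
        return None  # not in the expected agent:<agentId>:... shape

    sessions_dir = agents_root / parts[1] / "sessions"
    index_path = sessions_dir / "sessions.json"
    if not index_path.is_file():
        return None

    index = json.loads(index_path.read_text(encoding="utf-8"))
    entry = index.get(session_key)
    if not isinstance(entry, dict) or not entry.get("sessionFile"):
        return None

    # Assumption: relative names live next to sessions.json. Joining with an
    # absolute path simply yields that absolute path, so both cases work.
    return sessions_dir / entry["sessionFile"]
```

A call such as `resolve_session_file("agent:c108-v1-1811:main", Path("/root/.openclaw/agents"))` would then return the JSONL path, or `None` if the index or the key is missing.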
+
+### Step 3: Parse JSONL and Build Conversation Pairs
+
+The JSONL contains message events. Parse them with this logic:
+
+1. Filter for `type == "message"` events
+2. For each user message, collect ALL consecutive assistant messages until the next user message
+3. Combine multiple AI responses into one reply per user message
+4. Skip `toolResult` events (they're tool outputs, not user/AI messages)
+
+**Important pairing rules:**
+- A user message pairs with the assistant messages that follow it, up to the next user message
+- If the assistant sends multiple messages before the next user message, merge ALL of them into one AI reply
+- AI messages with empty text content (just tool calls) should be skipped
+- Ignore `toolResult` entries when pairing; they are intermediate tool outputs, not part of the reply
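
The pairing rules above can be sketched in a few lines. This assumes the JSONL events have already been reduced to dictionaries with `role`, `text`, and `timestamp` keys. That shape and the name `build_pairs` are illustrative. One choice here is an assumption rather than something the rules state: the reply time is taken from the last assistant message that actually contains text.

```python
from typing import Dict, List


def build_pairs(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Group each user message with the merged assistant text that follows it."""
    pairs: List[Dict[str, str]] = []
    parts: List[str] = []  # assistant texts collected for the current pair

    for msg in messages:
        role = msg.get("role")
        if role == "user":
            if pairs:
                pairs[-1]["ai"] = "\n\n".join(parts)  # close the previous pair
            parts = []
            pairs.append({
                "user": msg.get("text", ""),
                "user_time": msg.get("timestamp", ""),
                "ai": "",
                "ai_time": "",
            })
        elif role == "assistant" and pairs:
            text = msg.get("text", "").strip()
            if text:  # skip tool-call-only messages with no text
                parts.append(text)
                pairs[-1]["ai_time"] = msg.get("timestamp", "")
        # any other role (for example toolResult) is ignored

    if pairs:
        pairs[-1]["ai"] = "\n\n".join(parts)
    return pairs
```

A user message with no textual reply still produces a pair, with empty `ai` and `ai_time` fields, so the renderer can show it as unanswered.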
+
+### Step 4: Generate JS Data File
+
+Output a JS file with this structure:
+
+```javascript
+const chatData = {
+  "session": "session-name",
+  "session_id": "uuid",
+  "total_pairs": N,
+  "initiator": "who started this conversation",
+  "conversations": [
+    {
+      "user": "full user message text",
+      "user_time": "2026-05-14 18:11:56",
+      "ai": "full AI reply (merged if multiple messages)",
+      "ai_time": "2026-05-14 18:12:18"
+    },
+    ...
+  ]
+};
+```
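
One way to produce a file of this shape is to serialize the data with `json.dumps` and wrap it in a `const` declaration, since JSON text is also valid JavaScript object-literal syntax. The function name `render_chat_data_js` and its parameters are hypothetical. This is a sketch of the idea, not necessarily how the skill writes the file.

```python
import json
from typing import Dict, List


def render_chat_data_js(
    session: str,
    session_id: str,
    initiator: str,
    conversations: List[Dict[str, str]],
) -> str:
    """Return JS source that defines a global `chatData` object."""
    data = {
        "session": session,
        "session_id": session_id,
        "total_pairs": len(conversations),
        "initiator": initiator,
        "conversations": conversations,
    }
    # ensure_ascii=False keeps non-ASCII chat text readable in the output.
    body = json.dumps(data, ensure_ascii=False, indent=2)
    return "const chatData = " + body + ";\n"
```

The returned string should be written with an explicit UTF-8 encoding, for example `open(path, "w", encoding="utf-8")`, so that non-ASCII text survives on systems with a different default encoding.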
+
+### Step 5: Render with Template HTML
+
+Use the template at `assets/chat-history-template.html` to render the conversation.
+
+Copy the template to the output directory alongside the JS file. The template supports:
+- Displaying conversation pairs
+- Computing and showing three timing metrics per pair:
+  - **课程流逝** (elapsed time since first message)
+  - **回复耗时** (AI response time = AI time - user time)
+  - **间隔耗时** (gap to next user message)
+- Rich visual styling with color-coded timing badges
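
For reference, the three metrics can be computed from the `user_time` and `ai_time` strings as sketched below. The names `timing_metrics`, `elapsed`, `response`, and `gap` are illustrative. The gap is measured from one user message to the next user message, which matches what the bundled template's script does. The first pair's elapsed time is reported as zero here.

```python
from datetime import datetime
from typing import Dict, List, Optional

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # e.g. "2026-05-14 18:11:56"


def _parse(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp string, returning None for missing or bad input."""
    try:
        return datetime.strptime(value, TIME_FORMAT)
    except (TypeError, ValueError):
        return None


def _seconds(start: Optional[datetime], end: Optional[datetime]) -> Optional[float]:
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


def timing_metrics(conversations: List[Dict[str, str]]) -> List[Dict[str, Optional[float]]]:
    """Return elapsed, response, and gap durations in seconds for each pair."""
    user_times = [_parse(c.get("user_time")) for c in conversations]
    rows = []
    for i, conv in enumerate(conversations):
        next_user = user_times[i + 1] if i + 1 < len(user_times) else None
        rows.append({
            "elapsed": _seconds(user_times[0], user_times[i]),
            "response": _seconds(user_times[i], _parse(conv.get("ai_time"))),
            "gap": _seconds(user_times[i], next_user),
        })
    return rows
```

Values are `None` wherever a timestamp is missing, which lets the caller show a placeholder instead of a misleading duration.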
+
+## Usage Example
+
+User provides: `https://claw1.kekouen.cn/chat?session=agent%3Ac108-v1-1811%3Amain`
+
+1. Extract session key: `agent:c108-v1-1811:main`
+2. Find JSONL at: `/root/.openclaw/agents/c108-v1-1811/sessions/5bee9664-7b72-4efd-8dc1-e8bf125c6b9c.jsonl`
+3. Parse and generate: `chat_history.js` (the file name the template loads)
+4. Copy template: `聊天记录.html`
+5. User opens `聊天记录.html` to view the rendered conversation
@@ -0,0 +1,271 @@
+<!DOCTYPE html>
+<html lang="zh-CN">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>聊天记录</title>
+  <style>
+    * { box-sizing: border-box; }
+    body {
+      font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
+      background: #f0f2f5;
+      margin: 0;
+      padding: 20px;
+      min-height: 100vh;
+    }
+    .container {
+      max-width: 1000px;
+      margin: 0 auto;
+    }
+    h1 {
+      text-align: center;
+      color: #1a1a2e;
+      margin-bottom: 10px;
+    }
+    .info-box {
+      background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+      color: white;
+      padding: 20px;
+      border-radius: 12px;
+      margin-bottom: 30px;
+      box-shadow: 0 4px 15px rgba(102, 126, 234, 0.4);
+    }
+    .info-box p { margin: 6px 0; font-size: 14px; }
+    .info-box strong { color: #ffd700; }
+    .stats {
+      display: flex;
+      gap: 20px;
+      margin-top: 15px;
+      flex-wrap: wrap;
+    }
+    .stat-item {
+      background: rgba(255,255,255,0.2);
+      padding: 10px 20px;
+      border-radius: 8px;
+      text-align: center;
+    }
+    .stat-num { font-size: 28px; font-weight: bold; }
+    .stat-label { font-size: 12px; opacity: 0.9; }
+    .pair {
+      background: white;
+      border-radius: 12px;
+      margin-bottom: 20px;
+      box-shadow: 0 2px 8px rgba(0,0,0,0.08);
+      overflow: hidden;
+    }
+    .pair-header {
+      background: #f8f9fa;
+      padding: 12px 20px;
+      border-bottom: 1px solid #eee;
+      display: flex;
+      justify-content: space-between;
+      align-items: center;
+      flex-wrap: wrap;
+      gap: 10px;
+    }
+    .pair-left { display: flex; align-items: center; gap: 12px; }
+    .pair-num {
+      background: #4CAF50;
+      color: white;
+      padding: 4px 12px;
+      border-radius: 20px;
+      font-weight: bold;
+      font-size: 13px;
+    }
+    .pair-duration {
+      padding: 6px 14px;
+      border-radius: 20px;
+      font-weight: bold;
+      font-size: 14px;
+      background: #fff3e0;
+      color: #e65100;
+      border: 2px solid #ff9800;
+    }
+    .pair-duration.elapsed { background: #f3e5f5; color: #7b1fa2; border-color: #9c27b0; }
+    .pair-duration.interval { background: #e3f2fd; color: #1565c0; border-color: #2196f3; }
+    .pair-duration.no-reply { background: #ffebee; color: #c62828; border-color: #f44336; }
+    .message-section { padding: 0; }
+    .user-section { background: #fff8e1; border-left: 4px solid #ff9800; }
+    .ai-section { background: #e3f2fd; border-left: 4px solid #2196f3; }
+    .msg-label {
+      display: flex;
+      align-items: center;
+      padding: 12px 20px 8px 20px;
+      font-weight: bold;
+      font-size: 14px;
+    }
+    .user-label { color: #e65100; }
+    .ai-label { color: #1565c0; }
+    .msg-time { font-size: 12px; color: #999; font-weight: normal; margin-left: 10px; }
+    .msg-content { padding: 0 20px 15px 20px; }
+    .msg-content pre {
+      white-space: pre-wrap;
+      word-wrap: break-word;
+      font-family: inherit;
+      font-size: 14px;
+      line-height: 1.6;
+      color: #333;
+      margin: 0;
+      background: rgba(255,255,255,0.6);
+      padding: 12px;
+      border-radius: 8px;
+    }
+  </style>
+</head>
+<body>
+  <div class="container">
+    <h1>📋 聊天记录</h1>
+
+    <div class="info-box">
+      <p><strong>会话ID:</strong> <span id="sessionId">--</span></p>
+      <p><strong>会话名称:</strong> <span id="sessionName">--</span></p>
+      <p><strong>发起人:</strong> <span id="initiator">--</span></p>
+      <div class="stats">
+        <div class="stat-item">
+          <div class="stat-num" id="totalPairs">--</div>
+          <div class="stat-label">总对话轮次</div>
+        </div>
+        <div class="stat-item">
+          <div class="stat-num" id="userCount">--</div>
+          <div class="stat-label">学生发言</div>
+        </div>
+        <div class="stat-item">
+          <div class="stat-num" id="aiCount">--</div>
+          <div class="stat-label">AI回复</div>
+        </div>
+        <div class="stat-item">
+          <div class="stat-num" id="totalTime">--</div>
+          <div class="stat-label">总回复耗时</div>
+        </div>
+      </div>
+    </div>
+
+    <div id="conversationList"></div>
+  </div>
+
+  <!-- Load chat data JS file -->
+  <script src="chat_history.js"></script>
+  <script>
+    // Fallback if chatData not loaded
+    if (typeof chatData === 'undefined') {
+      document.getElementById('sessionId').textContent = '数据未加载';
+      document.getElementById('conversationList').innerHTML = '<p style="text-align:center;color:#999;">请确保 chat_history.js 文件存在且可访问</p>';
+    } else {
+      init();
+    }
+
+    function parseTime(timeStr) {
+      if (!timeStr) return null;
+      const parts = timeStr.split(/[\s:\-]/);
+      if (parts.length >= 6) {
+        return new Date(parts[0], parts[1]-1, parts[2], parts[3], parts[4], parts[5]);
+      }
+      return null;
+    }
+
+    function formatDuration(ms) {
+      if (ms === null || ms === undefined || ms < 0) return '--';
+      const seconds = Math.floor(ms / 1000);
+      if (seconds < 60) return seconds + '秒';
+      const minutes = Math.floor(seconds / 60);
+      const remainingSeconds = seconds % 60;
+      if (minutes < 60) {
+        return remainingSeconds > 0 ? minutes + '分' + remainingSeconds + '秒' : minutes + '分';
+      }
+      const hours = Math.floor(minutes / 60);
+      const remainingMinutes = minutes % 60;
+      return hours + '时' + remainingMinutes + '分';
+    }
+
+    function escapeHtml(text) {
+      const div = document.createElement('div');
+      div.textContent = text;
+      return div.innerHTML;
+    }
+
+    function init() {
+      document.getElementById('sessionId').textContent = chatData.session_id || '--';
+      document.getElementById('sessionName').textContent = chatData.session || '--';
+      document.getElementById('initiator').textContent = chatData.initiator || '--';
+      document.getElementById('totalPairs').textContent = chatData.total_pairs || 0;
+
+      let userCount = 0;
+      let aiCount = 0;
+      let totalDuration = 0;
+      const list = document.getElementById('conversationList');
+
+      chatData.conversations.forEach((c, i) => {
+        const hasReply = c.ai && c.ai.trim();
+        if (hasReply) aiCount++;
+
+        // Compute metrics
+        let responseDuration = null;
+        if (hasReply) {
+          const userTime = parseTime(c.user_time);
+          const aiTime = parseTime(c.ai_time);
+          if (userTime && aiTime) {
+            responseDuration = aiTime - userTime;
+            if (responseDuration > 0) totalDuration += responseDuration;
+          }
+        }
+
+        let intervalDuration = null;
+        if (i < chatData.conversations.length - 1) {
+          const nextUserTime = parseTime(chatData.conversations[i + 1].user_time);
+          const currentUserTime = parseTime(c.user_time);
+          if (currentUserTime && nextUserTime) {
+            intervalDuration = nextUserTime - currentUserTime;
+          }
+        }
+
+        let elapsedDuration = null;
+        if (i > 0) {
+          const firstUserTime = parseTime(chatData.conversations[0].user_time);
+          const currentUserTime = parseTime(c.user_time);
+          if (firstUserTime && currentUserTime) {
+            elapsedDuration = currentUserTime - firstUserTime;
+          }
+        }
+
+        const pair = document.createElement('div');
+        pair.className = 'pair';
+        pair.innerHTML = `
+          <div class="pair-header">
+            <div class="pair-left">
+              <span class="pair-num">第 ${i + 1} 轮</span>
+              ${elapsedDuration !== null ? `<span class="pair-duration elapsed">⏰ 课程流逝: ${formatDuration(elapsedDuration)}</span>` : '<span class="pair-duration elapsed">⏰ 课程流逝: 0分</span>'}
+              ${intervalDuration !== null ? `<span class="pair-duration interval">⏸️ 间隔耗时: ${formatDuration(intervalDuration)}</span>` : ''}
+            </div>
+          </div>
+          <div class="message-section user-section">
+            <div class="msg-label user-label">
+              <span>👤 学生</span>
+              <span class="msg-time">${c.user_time || '未知时间'}</span>
+            </div>
+            <div class="msg-content">
+              <pre>${escapeHtml(c.user)}</pre>
+            </div>
+          </div>
+          ${hasReply ? `
+          <div class="message-section ai-section">
+            <div class="msg-label ai-label">
+              <span>🤖 AI</span>
+              <span class="msg-time">${c.ai_time || '未知时间'}</span>
+              <span class="pair-duration">⏱️ 回复耗时: ${formatDuration(responseDuration)}</span>
+            </div>
+            <div class="msg-content">
+              <pre>${escapeHtml(c.ai)}</pre>
+            </div>
+          </div>
+          ` : ''}
+        `;
+        list.appendChild(pair);
+      });
+
+      document.getElementById('userCount').textContent = chatData.conversations.length;
+      document.getElementById('aiCount').textContent = aiCount;
+      document.getElementById('totalTime').textContent = formatDuration(totalDuration);
+    }
+  </script>
+</body>
+</html>
@@ -0,0 +1,169 @@
+#!/usr/bin/env python3
+"""
+Extract chat history from OpenClaw session URL and generate JS data file.
+Usage: python3 extract_chat.py <session-url-or-key> [output-dir]
+"""
+
+import json
+import sys
+import os
+import re
+from urllib.parse import urlparse, parse_qs
+from datetime import datetime, timezone, timedelta
+
+tz_beijing = timezone(timedelta(hours=8))
+
+def parse_time(ts_str):
+    """Parse ISO timestamp string to Beijing time string."""
+    try:
+        ts_str = ts_str.replace('Z', '')
+        dt = datetime.fromisoformat(ts_str)
+        dt_utc = dt.replace(tzinfo=timezone.utc)
+        dt_beijing = dt_utc.astimezone(tz_beijing)
+        return dt_beijing.strftime('%Y-%m-%d %H:%M:%S')
+    except (AttributeError, TypeError, ValueError):
+        return None
+
+def extract_session_key(url_or_key):
+    """Extract session key from URL or return as-is if already a key."""
+    if 'session=' in url_or_key:
+        parsed = urlparse(url_or_key)
+        params = parse_qs(parsed.query)
+        if 'session' in params:
+            return params['session'][0]
+    return url_or_key
+
+def find_session_file(session_key):
+    """Find the JSONL file path for a given session key."""
+    # session_key format: agent:c108-v1-1811:session-name
+    parts = session_key.split(':')
+    if len(parts) < 2:
+        return None
+    agent_id = parts[1]
+    sessions_json_path = f'/root/.openclaw/agents/{agent_id}/sessions/sessions.json'
+
+    if not os.path.exists(sessions_json_path):
+        return None
+
+    with open(sessions_json_path, 'r', encoding='utf-8') as f:
+        sessions = json.load(f)
+
+    if session_key in sessions:
+        return sessions[session_key].get('sessionFile')
+    return None
+
+def parse_jsonl_to_conversations(jsonl_path):
+    """Parse JSONL file and build conversation pairs."""
+    messages = []
+    with open(jsonl_path, 'r', encoding='utf-8') as f:
+        for line in f:
+            line = line.strip()
+            if not line:
+                continue
+            obj = json.loads(line)
+            if obj.get('type') == 'message':
+                role = obj['message']['role']
+                content = obj['message'].get('content', [])
+                timestamp = obj.get('timestamp', '')
+
+                if isinstance(content, list):
+                    text = ''.join([c.get('text', '') for c in content if c.get('type') == 'text'])
+                else:
+                    text = str(content)
+
+                messages.append({
+                    'role': role,
+                    'timestamp': timestamp,
+                    'text': text.strip(),
+                    'has_text': bool(text.strip())
+                })
+
+    # Build conversation pairs
+    conversations = []
+    i = 0
+    while i < len(messages):
+        msg = messages[i]
+        if msg['role'] == 'user':
+            user_text = msg['text']
+            user_time = msg['timestamp']
+
+            # Collect ALL consecutive AI messages until next user
+            ai_messages = []
+            j = i + 1
+            while j < len(messages) and messages[j]['role'] != 'user':
+                if messages[j]['role'] == 'assistant':
+                    ai_messages.append(messages[j])
+                j += 1
+
+            # Merge all AI messages with text
+            ai_text = '\n\n'.join([m['text'] for m in ai_messages if m['has_text']])
+            ai_time = ai_messages[-1]['timestamp'] if ai_messages else ''
+
+            conversations.append({
+                'user': user_text,
+                'user_time': parse_time(user_time) if user_time else '',
+                'ai': ai_text,
+                'ai_time': parse_time(ai_time) if ai_time else ''
+            })
+            i += 1
+        else:
+            i += 1
+
+    return conversations
+
+def generate_js(conversations, session_key, output_path):
+    """Generate JS file from conversations."""
+    session_name = session_key.split(':')[-1] if ':' in session_key else session_key
+
+    output_data = {
+        'session': session_name,
+        'session_id': session_key,
+        'total_pairs': len(conversations),
+        'initiator': 'session initiator',
+        'conversations': conversations
+    }
+
+    js_content = f'''// Chat History - {session_name}
+// Session Key: {session_key}
+// Generated by chat-history-extractor skill
+
+const chatData = {json.dumps(output_data, ensure_ascii=False, indent=2)};
+'''
+
+    with open(output_path, 'w', encoding='utf-8') as f:
+        f.write(js_content)
+
+    return len(conversations)
+
+def main():
+    if len(sys.argv) < 2:
+        print("Usage: python3 extract_chat.py <session-url-or-key> [output-dir]")
+        sys.exit(1)
+
+    url_or_key = sys.argv[1]
+    output_dir = sys.argv[2] if len(sys.argv) > 2 else os.getcwd()
+
+    # Extract session key
+    session_key = extract_session_key(url_or_key)
+    print(f"Session key: {session_key}")
+
+    # Find session file
+    jsonl_path = find_session_file(session_key)
+    if not jsonl_path:
+        print(f"ERROR: Could not find session file for {session_key}")
+        sys.exit(1)
+    print(f"JSONL path: {jsonl_path}")
+
+    # Parse conversations
+    conversations = parse_jsonl_to_conversations(jsonl_path)
+    print(f"Found {len(conversations)} conversation pairs")
+
+    # Generate JS file
+    os.makedirs(output_dir, exist_ok=True)
+    js_path = os.path.join(output_dir, 'chat_history.js')
+    count = generate_js(conversations, session_key, js_path)
+    print(f"Generated: {js_path}")
+    print(f"Total pairs: {count}")
+
+if __name__ == '__main__':
+    main()