@coratch/ai-news-agent 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +177 -0
- package/bin/ai-news.js +208 -0
- package/config.example.yaml +56 -0
- package/package.json +53 -0
- package/src/analyzer.js +131 -0
- package/src/config.js +96 -0
- package/src/extractor.js +67 -0
- package/src/feeds.js +50 -0
- package/src/index.js +147 -0
- package/src/output.js +160 -0
- package/src/storage.js +88 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Coratch
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,177 @@
+# AI News Agent
+
+An intelligent subscription agent for frontier AI news: it automatically fetches RSS feeds, uses Claude AI to match articles against the topics you follow, and generates Chinese summary reports.
+
+## Project Overview
+
+### Pain Points
+- AI news is exploding in volume, and tracking it manually is inefficient
+- Coverage of a technical area you follow (such as new Claude Code release features) is scattered across many blogs and forums and is easy to miss
+- English-language news takes extra time to read and digest
+
+### Target Value
+- Automatically aggregates AI news from multiple RSS feeds
+- Uses AI to filter and match articles against user-defined topics
+- Generates Chinese summaries plus action recommendations, saving an estimated 80% of the time spent gathering information
+
+## Architecture
+
+```
+RSS feeds → fetch & parse → dedupe (SQLite) → AI quick filter → body extraction → AI deep analysis → terminal output + Markdown report
+                                              (stage 1: Haiku)                    (stage 2: Haiku)
+```
+
+**Two-stage AI analysis strategy** (to control cost):
+1. **Quick filter**: sends only the title and summary (a few hundred tokens) and judges in batches whether each article matches the user's topics
+2. **Deep analysis**: extracts the full text only for matched articles, then generates a Chinese summary and action recommendations
+
+## Quick Start
+
+### Option 1: Global npm install (recommended)
+
+```bash
+npm install -g @coratch/ai-news-agent
+```
+
+After installation, use the `ai-news` command directly.
+
+### Option 2: Install from source
+
+```bash
+git clone https://github.com/Coratch/ai-news-agent.git
+cd ai-news-agent
+npm install
+npm link  # register the ai-news command globally
+```
+
+### Set the API Key
+
+```bash
+export ANTHROPIC_API_KEY=your-api-key
+```
+
+### Initialize the Configuration
+
+```bash
+ai-news init
+```
+
+Creates the configuration interactively; you choose the RSS feeds and the topics to follow.
+
+### Run
+
+```bash
+# Normal mode (requires an API key)
+ai-news run
+
+# Dry-run mode (local keyword matching, no API key required)
+ai-news run --dry-run
+```
+
+## CLI Commands
+
+| Command | Description |
+|------|------|
+| `ai-news init` | Initialize the configuration interactively |
+| `ai-news run` | Run one fetch + analysis pass |
+| `ai-news run --dry-run` | Skip the AI and use local keyword matching |
+| `ai-news add-feed` | Add an RSS feed |
+| `ai-news add-topic` | Add a topic to follow |
+| `ai-news history` | Show past matches |
+| `ai-news config` | Show the current configuration |
+
+## Configuration File
+
+The configuration file lives at `~/.ai-news-agent/config.yaml`:
+
+```yaml
+# RSS feeds
+feeds:
+  - name: "Hacker News - Claude"
+    url: "https://hnrss.org/newest?q=claude+anthropic"
+
+# Topics to follow (the AI matches articles against these)
+topics:
+  - name: "Claude Code release features"
+    description: "New releases, new features, and productivity improvements for the Claude Code CLI tool"
+    keywords: ["claude code", "claude cli"]
+    priority: high  # high/medium/low
+
+# Claude API settings
+claude:
+  model: "claude-haiku-4-5-20251001"  # use Haiku to keep costs down
+  max_articles_per_run: 50
+```
+
+## Sample Output
+
+### Terminal Output
+
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+  AI News Daily - 2026/2/9
+  Scanned 4 feeds | 32 articles | 32 new | 4 matched
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+[HIGH] Claude Code release features
+  Claude Code 1.0.20: background Agent mode supported
+  Source: Anthropic Blog | 2026/2/8 10:00
+
+  The new release introduces a background Agent mode that lets multiple tasks run in parallel...
+
+  • Background Agent supports parallel tasks
+  • New /compact command improves context management
+
+  → Upgrade now; the background Agent directly helps multi-task development workflows
+```
+
+### Markdown Report
+
+Saved automatically to `~/.ai-news-agent/reports/YYYY-MM-DD.md`
+
+## Tech Stack
+
+| Component | Technology | Purpose |
+|------|------|------|
+| RSS parsing | rss-parser | Fetch and parse RSS feeds |
+| Body extraction | linkedom | Extract the article body from page HTML |
+| AI analysis | @anthropic-ai/sdk | Claude API matching and summarization |
+| Data storage | better-sqlite3 | Article deduplication and history |
+| CLI framework | commander + inquirer | Command-line interaction |
+| Terminal styling | chalk + ora | Colored output and progress spinners |
+
+## Project Structure
+
+```
+ai-news-agent/
+├── bin/
+│   └── ai-news.js          # CLI entry point
+├── src/
+│   ├── index.js            # Main flow orchestration
+│   ├── config.js           # Configuration management
+│   ├── feeds.js            # RSS fetching
+│   ├── extractor.js        # Web page body extraction
+│   ├── analyzer.js         # Claude API analysis
+│   ├── storage.js          # SQLite dedupe storage
+│   └── output.js           # Terminal + Markdown output
+├── config.example.yaml     # Example configuration
+├── package.json
+└── README.md
+```
+
+## Estimated Impact
+
+| Metric | Manual | With the Agent |
+|------|---------|-----------|
+| Daily time spent gathering information | 30-60 minutes | 2 minutes (reading the report) |
+| Chance of missing important news | High | Low (automated coverage) |
+| Cost of understanding English-language news | High | Low (Chinese summaries) |
+| API cost | - | ~$0.01 per run (Haiku) |
+
+## Roadmap
+
+- [ ] Email notifications (Resend/SMTP)
+- [ ] Scheduled runs (node-cron / system crontab)
+- [ ] More preset RSS feeds (arxiv, GitHub trending, etc.)
+- [ ] Web UI for management
+- [ ] Multi-user support
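The two-stage strategy described in the README can be sketched as a small pipeline. This is a minimal illustration rather than the package's code: `twoStage`, `cheapFilter`, and `expensiveAnalyze` are hypothetical names, and both stages are injected as functions so the sketch runs without an API key.

```javascript
// Hypothetical sketch of a two-stage pipeline: a cheap filter over
// title + summary, then an expensive analysis only for the survivors.
async function twoStage(articles, cheapFilter, expensiveAnalyze) {
  // Stage 1: the filter sees only the lightweight fields.
  const light = articles.map(({ title, summary }) => ({ title, summary }));
  const keep = await cheapFilter(light); // assumed to return one boolean per article

  // Stage 2: run the costly step only where stage 1 reported a match.
  const results = [];
  for (let i = 0; i < articles.length; i++) {
    if (!keep[i]) continue;
    results.push({ ...articles[i], analysis: await expensiveAnalyze(articles[i]) });
  }
  return results;
}
```

The cost saving comes from the second loop: the expensive call is made once per matched article instead of once per fetched article.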
package/bin/ai-news.js
ADDED
@@ -0,0 +1,208 @@
+#!/usr/bin/env node
+
+import { Command } from 'commander';
+import inquirer from 'inquirer';
+import chalk from 'chalk';
+import { initConfig, configExists, loadConfig, addFeed, addTopic, getConfigDir } from '../src/config.js';
+import { run } from '../src/index.js';
+import { getHistory, closeDb } from '../src/storage.js';
+
+const program = new Command();
+
+program
+  .name('ai-news')
+  .description('Intelligent subscription agent for frontier AI news')
+  .version('1.0.0');
+
+// init - create the configuration interactively
+program
+  .command('init')
+  .description('Initialize the configuration file')
+  .action(async () => {
+    if (configExists()) {
+      const { overwrite } = await inquirer.prompt([{
+        type: 'confirm',
+        name: 'overwrite',
+        message: 'A configuration file already exists. Overwrite it?',
+        default: false,
+      }]);
+      if (!overwrite) {
+        console.log('Cancelled');
+        return;
+      }
+    }
+
+    console.log(chalk.cyan('\n🚀 AI News Agent setup wizard\n'));
+
+    const answers = await inquirer.prompt([
+      {
+        type: 'input',
+        name: 'apiKey',
+        message: 'Anthropic API Key (leave blank to use the ANTHROPIC_API_KEY environment variable):',
+        default: '',
+      },
+      {
+        type: 'checkbox',
+        name: 'defaultFeeds',
+        message: 'Select the default RSS feeds to subscribe to:',
+        choices: [
+          { name: 'Anthropic Engineering (GitHub)', value: { name: 'Anthropic Engineering', url: 'https://raw.githubusercontent.com/conoro/anthropic-engineering-rss-feed/main/anthropic_engineering_rss.xml' }, checked: true },
+          { name: 'Hacker News - AI/LLM', value: { name: 'Hacker News - AI/LLM', url: 'https://hnrss.org/newest?q=AI+LLM+agent' }, checked: true },
+          { name: 'Hacker News - Claude', value: { name: 'Hacker News - Claude', url: 'https://hnrss.org/newest?q=claude+anthropic' }, checked: true },
+          { name: 'The Verge - AI', value: { name: 'The Verge - AI', url: 'https://www.theverge.com/rss/ai-artificial-intelligence/index.xml' }, checked: false },
+        ],
+      },
+      {
+        type: 'input',
+        name: 'topicName',
+        message: 'Name of the AI topic you care about most:',
+        default: 'Claude Code release features',
+      },
+      {
+        type: 'input',
+        name: 'topicDesc',
+        message: 'Describe this topic (the AI matches articles against it):',
+        default: 'New releases, new features, and productivity improvements for the Claude Code CLI tool',
+      },
+      {
+        type: 'input',
+        name: 'topicKeywords',
+        message: 'Keywords (comma-separated):',
+        default: 'claude code, claude cli, anthropic cli',
+      },
+    ]);
+
+    const config = {
+      feeds: answers.defaultFeeds,
+      topics: [{
+        name: answers.topicName,
+        description: answers.topicDesc,
+        keywords: answers.topicKeywords.split(',').map(k => k.trim()),
+        priority: 'high',
+      }],
+      output: {
+        terminal: true,
+        markdown: { enabled: true, dir: '~/.ai-news-agent/reports' },
+      },
+      claude: {
+        model: 'claude-haiku-4-5-20251001',
+        max_articles_per_run: 50,
+      },
+    };
+
+    const configPath = initConfig(config);
+    console.log(chalk.green(`\n✅ Configuration saved to: ${configPath}`));
+    console.log(chalk.gray('Run ai-news run to start fetching news\n'));
+  });
+
+// run - execute one fetch + analysis pass
+program
+  .command('run')
+  .description('Run one fetch + analysis pass now')
+  .option('--dry-run', 'Skip the Claude API and use local keyword matching')
+  .action(async (opts) => {
+    if (!configExists()) {
+      console.log(chalk.red('Configuration file not found. Run this first: ai-news init'));
+      return;
+    }
+    try {
+      await run({ dryRun: opts.dryRun });
+    } catch (err) {
+      console.error(chalk.red(`Run failed: ${err.message}`));
+      process.exit(1);
+    }
+  });
+
+// add-feed - add an RSS feed
+program
+  .command('add-feed')
+  .description('Add an RSS feed')
+  .action(async () => {
+    const { name, url } = await inquirer.prompt([
+      { type: 'input', name: 'name', message: 'RSS feed name:' },
+      { type: 'input', name: 'url', message: 'RSS URL:' },
+    ]);
+    try {
+      addFeed(name, url);
+      console.log(chalk.green(`✅ Added: ${name} (${url})`));
+    } catch (err) {
+      console.error(chalk.red(err.message));
+    }
+  });
+
+// add-topic - add a topic to follow
+program
+  .command('add-topic')
+  .description('Add a topic to follow')
+  .action(async () => {
+    const answers = await inquirer.prompt([
+      { type: 'input', name: 'name', message: 'Topic name:' },
+      { type: 'input', name: 'description', message: 'Topic description (the AI matches articles against it):' },
+      { type: 'input', name: 'keywords', message: 'Keywords (comma-separated):' },
+      { type: 'list', name: 'priority', message: 'Priority:', choices: ['high', 'medium', 'low'] },
+    ]);
+    try {
+      addTopic(answers.name, answers.description, answers.keywords.split(',').map(k => k.trim()), answers.priority);
+      console.log(chalk.green(`✅ Topic added: ${answers.name}`));
+    } catch (err) {
+      console.error(chalk.red(err.message));
+    }
+  });
+
+// history - show past matches
+program
+  .command('history')
+  .description('Show past matches')
+  .option('-d, --days <n>', 'Number of recent days', '7')
+  .action((opts) => {
+    if (!configExists()) {
+      console.log(chalk.red('Configuration file not found. Run this first: ai-news init'));
+      return;
+    }
+    try {
+      const records = getHistory(parseInt(opts.days));
+      if (records.length === 0) {
+        console.log(chalk.gray('No history yet'));
+        return;
+      }
+      console.log(chalk.bold(`\n📋 Matches from the last ${opts.days} days (${records.length} records)\n`));
+      for (const r of records) {
+        const analysis = r.analysis_json ? JSON.parse(r.analysis_json) : null;
+        const topic = r.matched_topic || 'Uncategorized';
+        console.log(chalk.yellow(` [${topic}] `) + chalk.bold(r.title));
+        if (analysis?.summary) {
+          console.log(chalk.gray(` ${analysis.summary.slice(0, 80)}...`));
+        }
+        console.log(chalk.gray(` ${r.url} | ${r.created_at}`));
+        console.log('');
+      }
+      closeDb();
+    } catch (err) {
+      console.error(chalk.red(err.message));
+    }
+  });
+
+// config - show the configuration
+program
+  .command('config')
+  .description('Show the current configuration')
+  .action(() => {
+    if (!configExists()) {
+      console.log(chalk.red('Configuration file not found. Run this first: ai-news init'));
+      return;
+    }
+    const config = loadConfig();
+    console.log(chalk.bold('\n📂 Configuration directory: ') + getConfigDir());
+    console.log(chalk.bold('\n📡 RSS feeds:'));
+    for (const f of config.feeds) {
+      console.log(` • ${f.name}: ${f.url}`);
+    }
+    console.log(chalk.bold('\n🎯 Topics:'));
+    for (const t of config.topics) {
+      console.log(` • [${t.priority}] ${t.name}: ${t.description}`);
+      console.log(chalk.gray(` Keywords: ${t.keywords.join(', ')}`));
+    }
+    console.log('');
+  });
+
+program.parse();
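Both `init` and `add-topic` turn the comma-separated keyword answer into an array with `split(',').map(k => k.trim())`. Blank input or a trailing comma therefore leaves an empty string in the list, and because the dry-run matcher in `src/index.js` tests keywords with `text.includes(...)`, an empty keyword would match every article. One possible fix is a small parsing helper; `parseKeywords` is a hypothetical name, and dropping case-insensitive duplicates is an extra assumption rather than something the package does.

```javascript
// Hypothetical helper: parse a comma-separated keyword answer into a clean list.
function parseKeywords(input) {
  const seen = new Set();
  const keywords = [];
  for (const raw of String(input ?? '').split(',')) {
    const kw = raw.trim();
    const key = kw.toLowerCase();
    if (kw === '' || seen.has(key)) continue; // skip blanks and case-insensitive duplicates
    seen.add(key);
    keywords.push(kw);
  }
  return keywords;
}

console.log(parseKeywords('claude code, claude cli, ,Claude Code,'));
// → [ 'claude code', 'claude cli' ]
```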
package/config.example.yaml
ADDED
@@ -0,0 +1,56 @@
+# AI News Agent configuration file
+# Copy this file to ~/.ai-news-agent/config.yaml and edit it
+
+# RSS feeds
+feeds:
+  - name: "Anthropic Blog"
+    url: "https://www.anthropic.com/rss.xml"
+  - name: "OpenAI Blog"
+    url: "https://openai.com/blog/rss.xml"
+  - name: "Hacker News - AI"
+    url: "https://hnrss.org/newest?q=AI+LLM+agent"
+  - name: "The Verge - AI"
+    url: "https://www.theverge.com/rss/ai-artificial-intelligence/index.xml"
+
+# Topics the user follows
+topics:
+  - name: "Claude Code release features"
+    description: "New releases, new features, productivity improvements, and MCP toolchain updates for the Claude Code CLI tool"
+    keywords:
+      - "claude code"
+      - "claude cli"
+      - "anthropic cli"
+      - "claude agent"
+    priority: high
+
+  - name: "AI coding tool news"
+    description: "Major updates and new features in AI coding assistants such as Cursor, GitHub Copilot, and Windsurf"
+    keywords:
+      - "cursor"
+      - "copilot"
+      - "windsurf"
+      - "ai coding"
+      - "ai ide"
+    priority: medium
+
+  - name: "LLM frontier progress"
+    description: "Important technical breakthroughs, new model releases, and benchmark results for large language models"
+    keywords:
+      - "gpt"
+      - "claude"
+      - "gemini"
+      - "llama"
+      - "benchmark"
+    priority: low
+
+# Output settings
+output:
+  terminal: true
+  markdown:
+    enabled: true
+    dir: "~/.ai-news-agent/reports"
+
+# Claude API settings
+claude:
+  model: "claude-haiku-4-5-20251001"
+  max_articles_per_run: 50
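The fields in this example file suggest a simple shape check that could run before a fetch. The sketch below is an assumption, not part of the package: `validateConfig` is a hypothetical name, and the rules (non-empty `feeds` and `topics`, a parseable feed URL, `priority` limited to high/medium/low as noted in the README) are inferred from the example rather than enforced anywhere in the code shown.

```javascript
const PRIORITIES = ['high', 'medium', 'low'];

// Hypothetical validator: returns a list of human-readable problems (empty if none).
function validateConfig(config) {
  const errors = [];
  const feeds = Array.isArray(config?.feeds) ? config.feeds : [];
  const topics = Array.isArray(config?.topics) ? config.topics : [];

  if (feeds.length === 0) errors.push('feeds: at least one feed is required');
  feeds.forEach((f, i) => {
    if (!f?.name) errors.push(`feeds[${i}].name is missing`);
    try {
      new URL(f?.url); // throws on a missing or malformed URL
    } catch {
      errors.push(`feeds[${i}].url is not a valid URL`);
    }
  });

  if (topics.length === 0) errors.push('topics: at least one topic is required');
  topics.forEach((t, i) => {
    if (!t?.name) errors.push(`topics[${i}].name is missing`);
    if (!Array.isArray(t?.keywords) || t.keywords.length === 0) {
      errors.push(`topics[${i}].keywords must be a non-empty list`);
    }
    if (!PRIORITIES.includes(t?.priority)) {
      errors.push(`topics[${i}].priority must be one of ${PRIORITIES.join('/')}`);
    }
  });
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a CLI print every issue in one pass.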
package/package.json
ADDED
@@ -0,0 +1,53 @@
+{
+  "name": "@coratch/ai-news-agent",
+  "version": "1.0.0",
+  "description": "Intelligent subscription agent for frontier AI news - fetches RSS feeds automatically, matches your topics with Claude AI, and generates Chinese summary reports",
+  "type": "module",
+  "bin": {
+    "ai-news": "./bin/ai-news.js"
+  },
+  "files": [
+    "bin/",
+    "src/",
+    "config.example.yaml",
+    "README.md"
+  ],
+  "scripts": {
+    "start": "node bin/ai-news.js run",
+    "dev": "node bin/ai-news.js"
+  },
+  "keywords": [
+    "ai",
+    "news",
+    "rss",
+    "claude",
+    "agent",
+    "newsletter",
+    "anthropic",
+    "llm"
+  ],
+  "author": "Coratch",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/Coratch/ai-news-agent.git"
+  },
+  "homepage": "https://github.com/Coratch/ai-news-agent#readme",
+  "bugs": {
+    "url": "https://github.com/Coratch/ai-news-agent/issues"
+  },
+  "dependencies": {
+    "@anthropic-ai/sdk": "^0.39.0",
+    "better-sqlite3": "^11.7.0",
+    "chalk": "^5.4.1",
+    "commander": "^13.1.0",
+    "inquirer": "^12.3.2",
+    "linkedom": "^0.18.6",
+    "ora": "^8.1.1",
+    "rss-parser": "^3.13.0",
+    "yaml": "^2.7.0"
+  },
+  "engines": {
+    "node": ">=18.0.0"
+  }
+}
package/src/analyzer.js
ADDED
@@ -0,0 +1,131 @@
+import Anthropic from '@anthropic-ai/sdk';
+
+let client = null;
+
+function getClient() {
+  if (!client) {
+    client = new Anthropic();
+  }
+  return client;
+}
+
+/**
+ * Stage 1: quick filter. Judges from the title and summary whether an article matches the user's topics.
+ * Processes articles in batches to reduce the number of API calls.
+ *
+ * @param {Array} articles - list of articles [{title, summary, link, feedName}]
+ * @param {Array} topics - user topics [{name, description, keywords, priority}]
+ * @param {string} model
+ * @returns {Array} matched articles, each with matchedTopic and relevance attached
+ */
+export async function quickFilter(articles, topics, model) {
+  if (articles.length === 0 || topics.length === 0) return [];
+
+  const topicsDesc = topics.map((t, i) =>
+    `[${i}] ${t.name} (${t.priority}): ${t.description}\n Keywords: ${t.keywords.join(', ')}`
+  ).join('\n');
+
+  // Process in batches of 10 articles
+  const batchSize = 10;
+  const matched = [];
+
+  for (let i = 0; i < articles.length; i += batchSize) {
+    const batch = articles.slice(i, i + batchSize);
+    const articlesDesc = batch.map((a, j) =>
+      `[${j}] "${a.title}" (${a.feedName})\n ${a.summary.slice(0, 200)}`
+    ).join('\n\n');
+
+    const prompt = `You are an AI news filtering assistant. Decide whether the following articles are relevant to the topics the user follows.
+
+## Topics the user follows
+${topicsDesc}
+
+## Articles to filter
+${articlesDesc}
+
+## Output requirements
+Return a JSON array containing only the matching articles. Each element:
+{"index": article index, "topicIndex": index of the matched topic, "relevance": relevance score from 0.0 to 1.0}
+
+If no article matches, return an empty array [].
+Return only results with relevance >= 0.6. Output JSON only, with no other content.`;
+
+    try {
+      const response = await getClient().messages.create({
+        model,
+        max_tokens: 1024,
+        messages: [{ role: 'user', content: prompt }],
+      });
+
+      const text = response.content[0]?.text || '[]';
+      const jsonMatch = text.match(/\[[\s\S]*\]/);
+      if (!jsonMatch) continue;
+
+      const results = JSON.parse(jsonMatch[0]);
+      for (const r of results) {
+        if (r.index >= 0 && r.index < batch.length && r.relevance >= 0.6) {
+          matched.push({
+            ...batch[r.index],
+            matchedTopic: topics[r.topicIndex] || topics[0],
+            relevance: r.relevance,
+          });
+        }
+      }
+    } catch (err) {
+      console.error(` Filter batch failed: ${err.message}`);
+    }
+  }
+
+  return matched;
+}
+
+/**
+ * Stage 2: deep analysis. Generates a Chinese summary for a matched article.
+ *
+ * @param {object} article - article {title, link, feedName, matchedTopic, fullContent}
+ * @param {string} model
+ * @returns {object} {summary, keyPoints, actionable, recommendation}
+ */
+export async function deepAnalyze(article, model) {
+  const content = article.fullContent || article.summary;
+  const topic = article.matchedTopic;
+
+  const prompt = `You are a professional AI technology news analyst. Analyze the following article in depth, focusing on how it relates to the user's topic.
+
+## User topic
+Name: ${topic.name}
+Description: ${topic.description}
+
+## Article
+Title: ${article.title}
+Source: ${article.feedName}
+Content:
+${content}
+
+## Output requirements
+Return JSON with all text fields written in Chinese (output JSON only, with no other content):
+{
+  "summary": "A summary in Chinese of at most 150 characters, highlighting what is relevant to the user's topic",
+  "keyPoints": ["key point 1", "key point 2", "key point 3"],
+  "actionable": true/false (whether this information requires the user to act right away, such as upgrading a version or trying a feature),
+  "recommendation": "A one-sentence action recommendation; an empty string if actionable is false"
+}`;
+
+  try {
+    const response = await getClient().messages.create({
+      model,
+      max_tokens: 1024,
+      messages: [{ role: 'user', content: prompt }],
+    });
+
+    const text = response.content[0]?.text || '{}';
+    const jsonMatch = text.match(/\{[\s\S]*\}/);
+    if (!jsonMatch) {
+      return { summary: 'Analysis failed', keyPoints: [], actionable: false, recommendation: '' };
+    }
+    return JSON.parse(jsonMatch[0]);
+  } catch (err) {
+    console.error(` Analysis failed [${article.title}]: ${err.message}`);
+    return { summary: 'Analysis failed: ' + err.message, keyPoints: [], actionable: false, recommendation: '' };
+  }
+}
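`quickFilter` trusts the model's reply fairly directly: it takes everything from the first `[` to the last `]`, parses it, and falls back to `topics[0]` when `topicIndex` is out of range. A stricter parsing step might look like the sketch below. `parseFilterReply` is a hypothetical name, and dropping entries with an invalid `topicIndex` (instead of defaulting to the first topic) is a design assumption, not the package's behavior.

```javascript
// Hypothetical parser for the stage-1 reply: returns only well-formed matches.
function parseFilterReply(text, batchLength, topicCount, minRelevance = 0.6) {
  const jsonMatch = String(text).match(/\[[\s\S]*\]/);
  if (!jsonMatch) return [];

  let parsed;
  try {
    parsed = JSON.parse(jsonMatch[0]);
  } catch {
    return []; // malformed JSON: treat as "no matches" instead of throwing
  }
  if (!Array.isArray(parsed)) return [];

  const inRange = (n, max) => Number.isInteger(n) && n >= 0 && n < max;
  return parsed.filter((r) =>
    r !== null && typeof r === 'object' &&
    inRange(r.index, batchLength) &&
    inRange(r.topicIndex, topicCount) &&
    typeof r.relevance === 'number' &&
    r.relevance >= minRelevance && r.relevance <= 1
  );
}
```

Keeping this step pure (text in, validated entries out) makes it testable without calling the API.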
package/src/config.js
ADDED
@@ -0,0 +1,96 @@
+import fs from 'fs';
+import path from 'path';
+import os from 'os';
+import YAML from 'yaml';
+
+const CONFIG_DIR = path.join(os.homedir(), '.ai-news-agent');
+const CONFIG_FILE = path.join(CONFIG_DIR, 'config.yaml');
+
+const DEFAULT_CONFIG = {
+  feeds: [
+    { name: 'Anthropic Engineering (GitHub)', url: 'https://raw.githubusercontent.com/conoro/anthropic-engineering-rss-feed/main/anthropic_engineering_rss.xml' },
+    { name: 'Hacker News - AI/LLM', url: 'https://hnrss.org/newest?q=AI+LLM+agent' },
+    { name: 'Hacker News - Claude', url: 'https://hnrss.org/newest?q=claude+anthropic' },
+    { name: 'The Verge - AI', url: 'https://www.theverge.com/rss/ai-artificial-intelligence/index.xml' },
+  ],
+  topics: [
+    {
+      name: 'Claude Code release features',
+      description: 'New releases, new features, and productivity improvements for the Claude Code CLI tool',
+      keywords: ['claude code', 'claude cli', 'anthropic cli'],
+      priority: 'high',
+    },
+  ],
+  output: {
+    terminal: true,
+    markdown: { enabled: true, dir: path.join(CONFIG_DIR, 'reports') },
+  },
+  claude: {
+    model: 'claude-haiku-4-5-20251001',
+    max_articles_per_run: 50,
+  },
+};
+
+export function getConfigDir() {
+  return CONFIG_DIR;
+}
+
+export function ensureConfigDir() {
+  if (!fs.existsSync(CONFIG_DIR)) {
+    fs.mkdirSync(CONFIG_DIR, { recursive: true });
+  }
+}
+
+export function configExists() {
+  return fs.existsSync(CONFIG_FILE);
+}
+
+export function loadConfig() {
+  if (!configExists()) {
+    throw new Error(`Configuration file not found: ${CONFIG_FILE}\nRun ai-news init first to create it`);
+  }
+  const raw = fs.readFileSync(CONFIG_FILE, 'utf-8');
+  const config = YAML.parse(raw);
+  // Expand ~ in the path
+  if (config.output?.markdown?.dir) {
+    config.output.markdown.dir = config.output.markdown.dir.replace(/^~/, os.homedir());
+  }
+  return config;
+}
+
+export function saveConfig(config) {
+  ensureConfigDir();
+  const content = YAML.stringify(config, { lineWidth: 120 });
+  fs.writeFileSync(CONFIG_FILE, content, 'utf-8');
+}
+
+export function initConfig(customConfig = null) {
+  ensureConfigDir();
+  const config = customConfig || DEFAULT_CONFIG;
+  saveConfig(config);
+  // Make sure the report directory exists
+  const reportDir = config.output?.markdown?.dir?.replace(/^~/, os.homedir())
+    || path.join(CONFIG_DIR, 'reports');
+  if (!fs.existsSync(reportDir)) {
+    fs.mkdirSync(reportDir, { recursive: true });
+  }
+  return CONFIG_FILE;
+}
+
+export function addFeed(name, url) {
+  const config = loadConfig();
+  if (config.feeds.some(f => f.url === url)) {
+    throw new Error(`RSS feed already exists: ${url}`);
+  }
+  config.feeds.push({ name, url });
+  saveConfig(config);
+}
+
+export function addTopic(name, description, keywords, priority = 'medium') {
+  const config = loadConfig();
+  if (config.topics.some(t => t.name === name)) {
+    throw new Error(`Topic already exists: ${name}`);
+  }
+  config.topics.push({ name, description, keywords, priority });
+  saveConfig(config);
+}
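`loadConfig` and `initConfig` expand the home directory with `replace(/^~/, os.homedir())`, which rewrites any leading `~`, including paths such as `~bob/reports` that a shell would resolve to another user's home. A narrower expansion could look like the sketch below. `expandTilde` is a hypothetical helper, and the home directory is passed in as a parameter (the package would presumably pass `os.homedir()`) so that the example stays self-contained.

```javascript
// Hypothetical helper: expand a leading "~" only when it stands alone
// or is immediately followed by a path separator.
function expandTilde(p, home) {
  if (p === '~') return home;
  if (p.startsWith('~/') || p.startsWith('~\\')) return home + p.slice(1);
  return p; // "~user/..." and ordinary paths are left untouched
}

console.log(expandTilde('~/.ai-news-agent/reports', '/home/alice'));
// → /home/alice/.ai-news-agent/reports
```

Plain concatenation also avoids the special handling of `$` sequences that `String.prototype.replace` applies to a replacement string.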
package/src/extractor.js
ADDED
@@ -0,0 +1,67 @@
+import { parseHTML } from 'linkedom';
+
+/**
+ * Fetch a web page from a URL and extract its main text
+ * @param {string} url
+ * @returns {string} the extracted body text (truncated to 3000 characters to limit tokens)
+ */
+export async function extractContent(url) {
+  try {
+    const controller = new AbortController();
+    const timeout = setTimeout(() => controller.abort(), 15000);
+
+    const response = await fetch(url, {
+      signal: controller.signal,
+      headers: {
+        'User-Agent': 'Mozilla/5.0 (compatible; AI-News-Agent/1.0)',
+        'Accept': 'text/html,application/xhtml+xml',
+      },
+    });
+    clearTimeout(timeout);
+
+    if (!response.ok) {
+      return '';
+    }
+
+    const html = await response.text();
+    return extractFromHtml(html);
+  } catch {
+    return '';
+  }
+}
+
+/**
+ * Extract the main text from an HTML string
+ */
+function extractFromHtml(html) {
+  const { document } = parseHTML(html);
+
+  // Remove irrelevant elements
+  const removeTags = ['script', 'style', 'nav', 'header', 'footer', 'aside', 'iframe'];
+  for (const tag of removeTags) {
+    document.querySelectorAll(tag).forEach(el => el.remove());
+  }
+
+  // Try the article tag first
+  let content = '';
+  const article = document.querySelector('article');
+  if (article) {
+    content = article.textContent || '';
+  }
+
+  // Fall back to main or body
+  if (!content || content.trim().length < 200) {
+    const main = document.querySelector('main') || document.querySelector('[role="main"]');
+    if (main) {
+      content = main.textContent || '';
+    }
+  }
+
+  if (!content || content.trim().length < 200) {
+    content = document.body?.textContent || '';
+  }
+
+  // Collapse whitespace and truncate
+  content = content.replace(/\s+/g, ' ').trim();
+  return content.slice(0, 3000);
+}
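In `extractContent`, `clearTimeout` runs only after `fetch` resolves, so a request that fails for another reason leaves the 15-second timer pending, and the timer no longer covers `response.text()`. A variant with `try`/`finally` might look like the following sketch. `fetchTextWithTimeout` and its injectable `fetchImpl` parameter are hypothetical, added here so the example can run without network access; the global `fetch` assumes Node 18 or later, which matches the package's `engines` field.

```javascript
// Hypothetical variant: the timer is always cleared, and it also covers reading the body.
async function fetchTextWithTimeout(url, ms = 15000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) return '';
    return await response.text();
  } catch {
    return ''; // network error or abort: same "empty string" contract as extractContent
  } finally {
    clearTimeout(timer); // runs on success, failure, and abort alike
  }
}
```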
package/src/feeds.js
ADDED
@@ -0,0 +1,50 @@
+import RSSParser from 'rss-parser';
+
+const parser = new RSSParser({
+  timeout: 15000,
+  headers: {
+    'User-Agent': 'AI-News-Agent/1.0',
+  },
+});
+
+/**
+ * Fetch a single RSS feed
+ * @returns {Array<{title, link, summary, pubDate, feedName}>}
+ */
+async function fetchFeed(feed) {
+  try {
+    const result = await parser.parseURL(feed.url);
+    return (result.items || []).map(item => ({
+      title: item.title || '',
+      link: item.link || '',
+      summary: item.contentSnippet || item.content || '',
+      pubDate: item.pubDate || item.isoDate || '',
+      feedName: feed.name,
+    }));
+  } catch (err) {
+    console.error(` Fetch failed [${feed.name}]: ${err.message}`);
+    return [];
+  }
+}
+
+/**
+ * Fetch all RSS feeds in parallel
+ * @param {Array<{name, url}>} feeds
+ * @param {number} maxArticles - maximum number of articles to return
+ * @returns {Array} list of articles, newest first
+ */
+export async function fetchAllFeeds(feeds, maxArticles = 50) {
+  const results = await Promise.allSettled(feeds.map(f => fetchFeed(f)));
+  const articles = results
+    .filter(r => r.status === 'fulfilled')
+    .flatMap(r => r.value);
+
+  // Sort by publish time, newest first
+  articles.sort((a, b) => {
+    const da = a.pubDate ? new Date(a.pubDate).getTime() : 0;
+    const db = b.pubDate ? new Date(b.pubDate).getTime() : 0;
+    return db - da;
+  });
+
+  return articles.slice(0, maxArticles);
+}
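The comparator in `fetchAllFeeds` treats a missing `pubDate` as 0, but a non-empty string that `Date` cannot parse yields `NaN`, and a comparator that returns `NaN` gives no reliable ordering. A defensive version could normalize the timestamp first, as in this sketch; `toTimestamp` and `sortNewestFirst` are hypothetical names, and sinking unparsable dates to the end is an assumption about the desired behavior.

```javascript
// Hypothetical helper: timestamp in ms, or 0 when the date is missing or unparsable.
function toTimestamp(pubDate) {
  const t = pubDate ? new Date(pubDate).getTime() : 0;
  return Number.isNaN(t) ? 0 : t;
}

// Newest first; undated or unparsable items sink to the end.
// Returns a new array instead of sorting in place.
function sortNewestFirst(articles) {
  return [...articles].sort((a, b) => toTimestamp(b.pubDate) - toTimestamp(a.pubDate));
}
```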
package/src/index.js
ADDED
|
@@ -0,0 +1,147 @@
|
|
|
1
|
+
import ora from 'ora';
|
|
2
|
+
import { loadConfig } from './config.js';
|
|
3
|
+
import { fetchAllFeeds } from './feeds.js';
|
|
4
|
+
import { extractContent } from './extractor.js';
|
|
5
|
+
import { quickFilter, deepAnalyze } from './analyzer.js';
|
|
6
|
+
import { filterNew, markProcessed, closeDb } from './storage.js';
|
|
7
|
+
import { printResults, generateMarkdownReport } from './output.js';
|
|
8
|
+
|
|
9
|
+
/**
|
|
10
|
+
* dry-run 模式:基于关键词做本地匹配 + mock 分析
|
|
11
|
+
*/
|
|
12
|
+
function localKeywordMatch(articles, topics) {
|
|
13
|
+
const matched = [];
|
|
14
|
+
for (const article of articles) {
|
|
15
|
+
const text = `${article.title} ${article.summary}`.toLowerCase();
|
|
16
|
+
for (const topic of topics) {
|
|
17
|
+
const hit = topic.keywords.some(kw => text.includes(kw.toLowerCase()));
|
|
18
|
+
if (hit) {
|
|
19
|
+
matched.push({
|
|
20
|
+
...article,
|
|
21
|
+
matchedTopic: topic,
|
|
22
|
+
relevance: 0.8,
|
|
23
|
+
});
|
|
24
|
+
break;
|
|
25
|
+
}
|
|
26
|
+
}
|
|
27
|
+
}
|
|
28
|
+
return matched;
|
|
29
|
+
}
|
|
30
|
+
+/**
+ * Main pipeline: fetch → deduplicate → filter → analyze → output
+ * @param {object} options - { dryRun: boolean }
+ */
+export async function run(options = {}) {
+  const { dryRun = false } = options;
+
+  // Check for the API key (skipped in dry-run mode)
+  if (!dryRun && !process.env.ANTHROPIC_API_KEY) {
+    console.error('❌ The ANTHROPIC_API_KEY environment variable is not set');
+    console.error(' Run: export ANTHROPIC_API_KEY=your-api-key');
+    console.error(' Or use --dry-run mode to skip AI analysis');
+    process.exit(1);
+  }
+
+  const config = loadConfig();
+  const stats = { feedCount: config.feeds.length, totalArticles: 0, newArticles: 0 };
+
+  if (dryRun) {
+    console.log('🧪 Dry-run mode: using local keyword matching, skipping the Claude API\n');
+  }
+
+  // Step 1: Fetch RSS feeds
+  let spinner = ora('Fetching RSS feeds...').start();
+  const articles = await fetchAllFeeds(config.feeds, config.claude.max_articles_per_run);
+  stats.totalArticles = articles.length;
+  spinner.succeed(`Fetch complete: ${articles.length} articles from ${config.feeds.length} feeds`);
+
+  if (articles.length === 0) {
+    console.log('No articles were fetched; please check the RSS feed configuration');
+    closeDb();
+    return;
+  }
+
+  // Step 2: Deduplicate
+  spinner = ora('Filtering out already-processed articles...').start();
+  const newArticles = filterNew(articles);
+  stats.newArticles = newArticles.length;
+  spinner.succeed(`New articles: ${newArticles.length} (skipped ${articles.length - newArticles.length})`);
+
+  if (newArticles.length === 0) {
+    console.log('No new articles to process');
+    closeDb();
+    return;
+  }
+
+  // Step 3: Filter
+  let matched;
+  if (dryRun) {
+    spinner = ora(`Local keyword matching (${newArticles.length} articles)...`).start();
+    matched = localKeywordMatch(newArticles, config.topics);
+    spinner.succeed(`Keyword hits: ${matched.length}`);
+  } else {
+    spinner = ora(`AI is filtering matching articles (${newArticles.length} articles)...`).start();
+    matched = await quickFilter(newArticles, config.topics, config.claude.model);
+    spinner.succeed(`Matches: ${matched.length}`);
+  }
+
+  if (matched.length === 0) {
+    for (const a of newArticles) {
+      markProcessed(a, null);
+    }
+    printResults([], stats);
+    closeDb();
+    return;
+  }
+
+  // Step 4: Deep analysis
+  spinner = ora(`Analyzing ${matched.length} matched articles...`).start();
+  const results = [];
+
+  for (let i = 0; i < matched.length; i++) {
+    const article = matched[i];
+    spinner.text = `Analyzing (${i + 1}/${matched.length}): ${(article.title || '').slice(0, 40)}...`;
+
+    if (dryRun) {
+      // dry-run: build a simple summary from the RSS summary
+      article.analysis = {
+        summary: (article.summary || '').slice(0, 150) || '(RSS summary is empty; AI analysis is needed for details)',
+        keyPoints: [`Source: ${article.feedName}`, `Keyword match: ${article.matchedTopic.name}`],
+        actionable: article.matchedTopic.priority === 'high',
+        recommendation: article.matchedTopic.priority === 'high' ? 'Consider reading this article in detail' : '',
+      };
+    } else {
+      const fullContent = await extractContent(article.link);
+      article.fullContent = fullContent || article.summary;
+      article.analysis = await deepAnalyze(article, config.claude.model);
+    }
+
+    results.push(article);
+    markProcessed(article, article.analysis);
+  }
+
+  // Mark unmatched articles as processed
+  for (const a of newArticles) {
+    if (!matched.find(m => m.link === a.link)) {
+      markProcessed(a, null);
+    }
+  }
+
+  spinner.succeed(`Analysis complete: ${results.length} articles`);
+
+  // Step 5: Output
+  if (config.output.terminal) {
+    printResults(results, stats);
+  }
+
+  if (config.output.markdown?.enabled) {
+    const reportPath = generateMarkdownReport(results, stats, config.output.markdown.dir);
+    if (reportPath) {
+      console.log(`📄 Report saved: ${reportPath}`);
+    }
+  }
+
+  closeDb();
+  return results;
+}
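The dry-run path in `index.js` replaces the Claude-based filter with plain substring matching, where the first topic with a matching keyword claims the article. A minimal sketch of that behavior follows. `matchByKeyword` and the sample topics and articles are hypothetical and only mirror the shape suggested by the code above (`name`, `priority`, `keywords` on a topic; `title`, `summary` on an article).

```javascript
// Hypothetical stand-in for the dry-run matcher: the first topic whose
// keyword appears in the title or summary wins, case-insensitively.
function matchByKeyword(articles, topics) {
  const matched = [];
  for (const article of articles) {
    const text = `${article.title ?? ''} ${article.summary ?? ''}`.toLowerCase();
    const topic = topics.find((t) =>
      t.keywords.some((kw) => text.includes(kw.toLowerCase())),
    );
    if (topic) matched.push({ ...article, matchedTopic: topic });
  }
  return matched;
}

// Made-up configuration and feed items for illustration only.
const topics = [
  { name: 'Agents', priority: 'high', keywords: ['agent'] },
  { name: 'Chips', priority: 'low', keywords: ['GPU'] },
];
const articles = [
  { title: 'New AI Agent framework', summary: 'Runs on any GPU' },
  { title: 'Gardening tips', summary: '' },
  { title: 'Budget gpu roundup' },
];

const hits = matchByKeyword(articles, topics);
console.log(hits.map((h) => `${h.title} -> ${h.matchedTopic.name}`));
```

Because this is substring matching, a keyword such as `agent` would also hit words like "reagent"; topic order decides which topic claims an article that matches several.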
package/src/output.js
ADDED
@@ -0,0 +1,160 @@
+import chalk from 'chalk';
+import fs from 'fs';
+import path from 'path';
+
+const PRIORITY_COLORS = {
+  high: chalk.red,
+  medium: chalk.yellow,
+  low: chalk.blue,
+};
+
+const PRIORITY_LABELS = {
+  high: 'HIGH',
+  medium: 'MED',
+  low: 'LOW',
+};
+
+/**
+ * Print matched results to the terminal
+ */
+export function printResults(results, stats) {
+  const now = new Date().toLocaleDateString('zh-CN');
+
+  console.log('');
+  console.log(chalk.bold('━'.repeat(50)));
+  console.log(chalk.bold.cyan(` AI News Daily - ${now}`));
+  console.log(chalk.gray(` Scanned ${stats.feedCount} feeds | ${stats.totalArticles} articles | ${stats.newArticles} new | ${results.length} matched`));
+  console.log(chalk.bold('━'.repeat(50)));
+
+  if (results.length === 0) {
+    console.log(chalk.gray('\n No new matching articles\n'));
+    return;
+  }
+
+  // Group by priority
+  const grouped = {};
+  for (const r of results) {
+    const priority = r.matchedTopic?.priority || 'low';
+    if (!grouped[priority]) grouped[priority] = [];
+    grouped[priority].push(r);
+  }
+
+  for (const priority of ['high', 'medium', 'low']) {
+    const items = grouped[priority];
+    if (!items?.length) continue;
+
+    const colorFn = PRIORITY_COLORS[priority] || chalk.white;
+    const label = PRIORITY_LABELS[priority] || priority;
+
+    for (const item of items) {
+      console.log('');
+      console.log(colorFn(` [${label}] ${item.matchedTopic?.name || 'Uncategorized'}`));
+      console.log(chalk.bold(` ${item.title}`));
+      console.log(chalk.gray(` Source: ${item.feedName} | ${formatDate(item.pubDate)}`));
+
+      if (item.analysis) {
+        console.log('');
+        console.log(` ${chalk.white(item.analysis.summary)}`);
+
+        if (item.analysis.keyPoints?.length) {
+          console.log('');
+          for (const point of item.analysis.keyPoints) {
+            console.log(chalk.cyan(` • ${point}`));
+          }
+        }
+
+        if (item.analysis.actionable && item.analysis.recommendation) {
+          console.log('');
+          console.log(chalk.green(` → ${item.analysis.recommendation}`));
+        }
+      }
+
+      console.log(chalk.gray(` ${item.link}`));
+      console.log(chalk.gray(' ' + '─'.repeat(46)));
+    }
+  }
+
+  console.log('');
+}
+
+/**
+ * Generate a Markdown report
+ */
+export function generateMarkdownReport(results, stats, outputDir) {
+  if (!results.length) return null;
+
+  const now = new Date();
+  const dateStr = now.toISOString().split('T')[0];
+  const timeStr = now.toLocaleTimeString('zh-CN');
+
+  let md = `# AI News Daily - ${dateStr}\n\n`;
+  md += `> Scanned ${stats.feedCount} feeds | ${stats.totalArticles} articles | ${stats.newArticles} new | ${results.length} matched | generated at ${timeStr}\n\n`;
+  md += `---\n\n`;
+
+  // Group by priority
+  const grouped = {};
+  for (const r of results) {
+    const priority = r.matchedTopic?.priority || 'low';
+    if (!grouped[priority]) grouped[priority] = [];
+    grouped[priority].push(r);
+  }
+
+  for (const priority of ['high', 'medium', 'low']) {
+    const items = grouped[priority];
+    if (!items?.length) continue;
+
+    const emoji = { high: '🔴', medium: '🟡', low: '🔵' }[priority];
+    const label = PRIORITY_LABELS[priority];
+
+    for (const item of items) {
+      md += `## ${emoji} [${label}] ${item.matchedTopic?.name || 'Uncategorized'}\n\n`;
+      md += `### ${item.title}\n\n`;
+      md += `**Source**: ${item.feedName} | **Time**: ${formatDate(item.pubDate)}\n\n`;
+
+      if (item.analysis) {
+        md += `**Summary**: ${item.analysis.summary}\n\n`;
+
+        if (item.analysis.keyPoints?.length) {
+          md += `**Key points**:\n`;
+          for (const point of item.analysis.keyPoints) {
+            md += `- ${point}\n`;
+          }
+          md += '\n';
+        }
+
+        if (item.analysis.actionable && item.analysis.recommendation) {
+          md += `> 💡 **Recommendation**: ${item.analysis.recommendation}\n\n`;
+        }
+      }
+
+      md += `🔗 [Read the original](${item.link})\n\n`;
+      md += `---\n\n`;
+    }
+  }
+
+  // Write to file
+  if (!fs.existsSync(outputDir)) {
+    fs.mkdirSync(outputDir, { recursive: true });
+  }
+  const filePath = path.join(outputDir, `${dateStr}.md`);
+
+  // If run more than once on the same day, append to the existing report
+  if (fs.existsSync(filePath)) {
+    md = `\n\n---\n\n# Update (${timeStr})\n\n` + md.split('---\n\n').slice(1).join('---\n\n');
+    fs.appendFileSync(filePath, md, 'utf-8');
+  } else {
+    fs.writeFileSync(filePath, md, 'utf-8');
+  }
+
+  return filePath;
+}
+
+function formatDate(dateStr) {
+  if (!dateStr) return 'Unknown time';
+  const d = new Date(dateStr);
+  // new Date() returns an Invalid Date instead of throwing, so check explicitly
+  if (Number.isNaN(d.getTime())) return String(dateStr);
+  const date = d.toLocaleDateString('zh-CN');
+  const time = d.toLocaleTimeString('zh-CN', { hour: '2-digit', minute: '2-digit' });
+  return `${date} ${time}`;
+}
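When a report for the current day already exists, `generateMarkdownReport` appends only the article sections by cutting the freshly built Markdown at its first `---` separator. The sketch below works through that string manipulation on a tiny input. `stripReportHeader` and the sample report are hypothetical names and data; the split/slice/join expression itself is the one used in `output.js`.

```javascript
// Drops everything up to and including the first '---\n\n' separator
// (the title and stats header) and keeps the article sections intact.
function stripReportHeader(md) {
  const separator = '---\n\n';
  return md.split(separator).slice(1).join(separator);
}

// Made-up report with a header and two article sections.
const report =
  '# Daily report\n\n> stats line\n\n---\n\n' +
  '## Article A\n\n---\n\n' +
  '## Article B\n\n---\n\n';

const body = stripReportHeader(report);
console.log(JSON.stringify(body));
```

Re-joining with the same separator means only the first segment is lost, so a separator that happens to appear inside an article summary does not truncate the remaining content.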
package/src/storage.js
ADDED
@@ -0,0 +1,88 @@
+import Database from 'better-sqlite3';
+import path from 'path';
+import crypto from 'crypto';
+import { getConfigDir, ensureConfigDir } from './config.js';
+
+let db = null;
+
+function getDb() {
+  if (!db) {
+    ensureConfigDir();
+    const dbPath = path.join(getConfigDir(), 'articles.db');
+    db = new Database(dbPath);
+    db.pragma('journal_mode = WAL');
+    db.exec(`
+      CREATE TABLE IF NOT EXISTS articles (
+        url_hash TEXT PRIMARY KEY,
+        url TEXT NOT NULL,
+        title TEXT,
+        feed_name TEXT,
+        matched_topic TEXT,
+        analysis_json TEXT,
+        created_at TEXT DEFAULT (datetime('now', 'localtime'))
+      )
+    `);
+  }
+  return db;
+}
+
+function hashUrl(url) {
+  return crypto.createHash('md5').update(url).digest('hex');
+}
+
+/**
+ * Check whether an article has already been processed
+ */
+export function isProcessed(url) {
+  const row = getDb().prepare('SELECT 1 FROM articles WHERE url_hash = ?').get(hashUrl(url));
+  return !!row;
+}
+
+/**
+ * Return only the articles that have not been processed yet
+ */
+export function filterNew(articles) {
+  return articles.filter(a => !isProcessed(a.link));
+}
+
+/**
+ * Record an article as processed
+ */
+export function markProcessed(article, analysis) {
+  getDb().prepare(`
+    INSERT OR IGNORE INTO articles (url_hash, url, title, feed_name, matched_topic, analysis_json)
+    VALUES (?, ?, ?, ?, ?, ?)
+  `).run(
+    hashUrl(article.link),
+    article.link,
+    article.title,
+    article.feedName,
+    article.matchedTopic?.name || '',
+    JSON.stringify(analysis),
+  );
+}
+
+/**
+ * Fetch the processing history
+ * @param {number} days - Number of most recent days to include
+ * @param {number} limit - Maximum number of rows to return
+ */
+export function getHistory(days = 7, limit = 50) {
+  return getDb().prepare(`
+    SELECT url, title, feed_name, matched_topic, analysis_json, created_at
+    FROM articles
+    WHERE created_at >= datetime('now', 'localtime', ?)
+    ORDER BY created_at DESC
+    LIMIT ?
+  `).all(`-${days} days`, limit);
+}
+
+/**
+ * Close the database connection
+ */
+export function closeDb() {
+  if (db) {
+    db.close();
+    db = null;
+  }
+}
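Rows returned by `getHistory` carry the analysis as a JSON string, and unmatched articles are stored with `JSON.stringify(null)`, which is the four-character string `null` rather than an SQL NULL. A consumer therefore has to parse the column before telling matched and unmatched rows apart. The sketch below shows one way to do that. `decodeHistoryRow` and the sample rows are hypothetical; only the column names come from the `articles` table above.

```javascript
// Hypothetical helper: turns a row shaped like the articles table into a
// plain object with the analysis parsed back from JSON.
function decodeHistoryRow(row) {
  let analysis = null;
  try {
    analysis = row.analysis_json ? JSON.parse(row.analysis_json) : null;
  } catch {
    analysis = null; // tolerate a corrupted or hand-edited column
  }
  return {
    url: row.url,
    title: row.title,
    topic: row.matched_topic || null,
    matched: analysis !== null,
    analysis,
  };
}

// Made-up rows mimicking one matched and one unmatched article.
const rows = [
  {
    url: 'https://example.com/a',
    title: 'A',
    matched_topic: 'Agents',
    analysis_json: JSON.stringify({ summary: 's', keyPoints: [] }),
  },
  {
    url: 'https://example.com/b',
    title: 'B',
    matched_topic: '',
    analysis_json: JSON.stringify(null),
  },
];

const decoded = rows.map(decodeHistoryRow);
console.log(decoded.map((d) => `${d.title}: matched=${d.matched}`));
```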