npm - @wentorai/research-plugins - Versions diffs - 1.2.3 → 1.3.0 - Mend

@wentorai/research-plugins 1.2.3 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (142)

package/skills/tools/document/openpaper-guide/SKILL.md DELETED

@@ -1,232 +0,0 @@
----
-name: openpaper-guide
-description: "Open-source tool for organizing and annotating research papers"
-metadata:
-  openclaw:
-    emoji: "📄"
-    category: "tools"
-    subcategory: "document"
-    keywords: ["paper management", "PDF annotation", "research organizer", "paper reader", "document viewer", "open source"]
-    source: "https://github.com/nicehash/openpaper"
----
-# OpenPaper Guide
-## Overview
-OpenPaper is an open-source research paper management and annotation tool. It provides PDF viewing with inline annotations, paper organization with tags and collections, metadata extraction, full-text search across your library, and export capabilities. It is designed as a lightweight, privacy-focused alternative to commercial reference managers and runs entirely locally.
-## Installation
-```bash
-# Install via pip
-pip install openpaper
-# Or from source
-git clone https://github.com/nicehash/openpaper.git
-cd openpaper && pip install -e .
-# Launch
-openpaper
-```
-## Library Management
-```python
-from openpaper import Library
-library = Library("./my_research_library")
-# Add papers
-paper = library.add("path/to/paper.pdf")
-print(f"Added: {paper.title}")
-print(f"Authors: {paper.authors}")
-print(f"Year: {paper.year}")
-# Bulk import
-added = library.import_directory(
-    "downloads/papers/",
-    recursive=True,
-    extract_metadata=True,   # Auto-extract from PDF
-    deduplicate=True,         # Skip duplicates by DOI/title
-)
-print(f"Imported {len(added)} papers, {added.duplicates} skipped")
-```
-## Organization
-```python
-# Tags
-paper.add_tag("transformer")
-paper.add_tag("attention")
-paper.add_tag("priority:high")
-# Collections
-library.create_collection("thesis-chapter-2")
-library.add_to_collection("thesis-chapter-2", paper)
-# Smart collections (auto-updating filters)
-library.create_smart_collection(
-    name="Recent NLP",
-    filters={
-        "tags": ["nlp"],
-        "year": {"gte": 2023},
-        "read_status": "unread",
-    },
-)
-# List and browse
-for p in library.search(tags=["transformer"], year=2024):
-    print(f"{p.title} ({p.year}) - {p.read_status}")
-```
-## Annotations
-```python
-# Add annotations to papers
-paper.annotate(
-    page=3,
-    type="highlight",
-    text="The attention mechanism allows the model to focus...",
-    color="yellow",
-    note="Key definition of attention",
-)
-paper.annotate(
-    page=5,
-    type="comment",
-    position=(100, 250),  # x, y coordinates
-    note="This contradicts the claim in Smith et al. 2023",
-)
-# Export annotations
-annotations = paper.get_annotations()
-for ann in annotations:
-    print(f"[p.{ann.page}] {ann.type}: {ann.text[:60]}...")
-    if ann.note:
-        print(f"  Note: {ann.note}")
-# Export to markdown
-paper.export_annotations("annotations.md")
-```
-## Search
-```python
-# Full-text search across library
-results = library.search_fulltext("attention mechanism")
-for r in results:
-    print(f"{r.title} (relevance: {r.score:.2f})")
-    for match in r.matches[:3]:
-        print(f"  p.{match.page}: ...{match.context}...")
-# Metadata search
-results = library.search(
-    query="transformer",        # Title/abstract search
-    authors="Vaswani",
-    year_range=(2020, 2025),
-    tags=["nlp"],
-)
-# Semantic search (if embeddings enabled)
-results = library.semantic_search(
-    "methods for reducing quadratic complexity of attention",
-    top_k=10,
-)
-```
-## Metadata Extraction
-```python
-# Auto-extract metadata from PDFs
-metadata = library.extract_metadata("paper.pdf")
-print(f"Title: {metadata.title}")
-print(f"Authors: {metadata.authors}")
-print(f"Abstract: {metadata.abstract[:200]}...")
-print(f"DOI: {metadata.doi}")
-print(f"Year: {metadata.year}")
-print(f"References: {len(metadata.references)}")
-# Enrich with external databases
-enriched = library.enrich_metadata(
-    paper,
-    sources=["crossref", "semantic_scholar"],
-)
-print(f"Citations: {enriched.citation_count}")
-print(f"Venue: {enriched.venue}")
-```
-## Export
-```python
-# Export bibliography
-library.export_bibtex("references.bib", collection="thesis-chapter-2")
-# Export reading list
-library.export_reading_list("reading_list.md", format="markdown")
-# Export annotations from all papers
-library.export_all_annotations("all_annotations.md")
-# Sync with reference manager
-library.export_ris("export.ris")          # RIS format
-library.export_csv("export.csv")          # CSV with metadata
-```
-## Configuration
-```json
-{
-  "library_path": "./research_library",
-  "pdf_viewer": "builtin",
-  "metadata": {
-    "auto_extract": true,
-    "enrich_sources": ["crossref"],
-    "language": "en"
-  },
-  "search": {
-    "fulltext_index": true,
-    "semantic_search": false,
-    "embedding_model": "all-MiniLM-L6-v2"
-  },
-  "storage": {
-    "copy_pdfs": true,
-    "organize_by": "year",
-    "max_library_size_gb": 10
-  }
-}
-```
-## CLI Usage
-```bash
-# Add paper
-openpaper add paper.pdf --tags "nlp,transformer"
-# Search
-openpaper search "attention mechanism" --limit 10
-# List library
-openpaper list --sort year --tags "priority:high"
-# Export
-openpaper export bibtex --collection thesis --output refs.bib
-# Stats
-openpaper stats
-# Papers: 342, Tagged: 289, Annotated: 156, Collections: 12
-```
-## Use Cases
-1. **Paper library**: Organize and search your PDF collection
-2. **Reading workflow**: Track read status, annotate, take notes
-3. **Reference management**: Export BibTeX for LaTeX papers
-4. **Literature review**: Tag and categorize papers by topic
-5. **Team sharing**: Export reading lists and annotations
-## References
-- [OpenPaper GitHub](https://github.com/nicehash/openpaper)
-- [Zotero](https://www.zotero.org/) — Popular open-source alternative
-- [Semantic Scholar API](https://api.semanticscholar.org/) — Metadata enrichment
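
The removed guide's bulk import describes skipping duplicates "by DOI/title" (`deduplicate=True`). The diff does not show how OpenPaper implements this, so the following is only a minimal sketch of one way such a check could work. The `Paper` record, `dedupe_key`, and `dedupe` are hypothetical names introduced for illustration, not OpenPaper APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Paper:
    """Hypothetical minimal paper record (not an OpenPaper class)."""
    title: str
    doi: Optional[str] = None


def dedupe_key(paper: Paper) -> str:
    """Prefer the DOI as identity; fall back to a normalized title."""
    if paper.doi:
        return "doi:" + paper.doi.strip().lower()
    # Collapse whitespace and case so trivially different titles match.
    return "title:" + " ".join(paper.title.lower().split())


def dedupe(papers: List[Paper]) -> Tuple[List[Paper], List[Paper]]:
    """Return (kept, skipped), keeping the first paper seen for each key."""
    seen = set()
    kept, skipped = [], []
    for paper in papers:
        key = dedupe_key(paper)
        if key in seen:
            skipped.append(paper)
        else:
            seen.add(key)
            kept.append(paper)
    return kept, skipped
```

One limitation of this sketch: a paper imported once with a DOI and once without gets two different keys, so it would not be detected as a duplicate.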

package/skills/tools/document/qq-connect/SKILL.md DELETED

@@ -1,227 +0,0 @@
----
-name: qq-connect
-description: "Connect QQ messaging to Research-Claw via QQ Bot API"
-metadata:
-  openclaw:
-    emoji: "💬"
-    category: "tools"
-    subcategory: "document"
-    keywords: ["QQ", "QQ Bot", "messaging", "Tencent", "channel"]
-    source: "https://q.qq.com/"
----
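-# QQ Connection Guide: Research-Claw × QQ Bot
-Connect the QQ messaging channel to Research-Claw through the official QQ Bot API, so you can talk to the research assistant directly in QQ.
-## Architecture Overview
-```
-QQ user ⟶ QQ Open Platform (WebSocket) ⟶ openclaw-qqbot plugin ⟶ OpenClaw Gateway ⟶ Research-Claw Agent
-                                                                        ↕
-                                               research-claw-core (literature/tasks/workspace/radar)
-```
-- **openclaw-qqbot** is an OpenClaw channel plugin that uses Tencent's official QQ Bot API v2
-- It runs in the same gateway process as the research-claw-core plugin and shares the agent context
-- Once installed, messages from QQ users are equivalent to conversations in the dashboard; all 27 of the agent's tools are available
-- It ships with 2 skills: `qqbot-cron` (scheduled reminders) and `qqbot-media` (sending and receiving images, voice, video, and files)
-## Prerequisites
-1. **QQ Open Platform account**: register a developer account at https://q.qq.com/
-2. **Create a bot application**: create a bot on the Open Platform and obtain:
-   - **AppID** (application ID)
-   - **AppSecret** (application secret)
-3. **Configure bot permissions** (in the QQ Open Platform console):
-   - Message receiving: enable the "group chat" and/or "private chat" permissions
-   - Recommended: `PUBLIC_GUILD_MESSAGES` + `GROUP_AND_C2C` (guild channels + groups + private chats)
-4. **Research-Claw is running normally**: the gateway is reachable (`ws://127.0.0.1:18789`)
-## Installation Steps (the agent can run these directly)
-### Step 1: Install the QQ Bot plugin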
-```bash
-openclaw plugins install @tencent-connect/openclaw-qqbot@latest
-```
-If the network is unreachable, use a proxy:
-```bash
-HTTPS_PROXY=http://127.0.0.1:7890 openclaw plugins install @tencent-connect/openclaw-qqbot@latest
-```
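-### Step 2: Configure credentials
-**Method A: use the CLI (recommended):**
-> Note: `openclaw channels add` supports only built-in channels (Telegram, Discord, etc.).
-> qqbot is a custom plugin, so it must be configured through `config set`.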
-```bash
-openclaw config set channels.qqbot.appId "<APP_ID>"
-openclaw config set channels.qqbot.clientSecret "<APP_SECRET>"
-openclaw config set channels.qqbot.enabled true
-```
-**Method B: edit the configuration file directly:**
-Add the following to `~/.openclaw/openclaw.json`:
-```jsonc
-{
-  "channels": {
-    "qqbot": {
-      "appId": "<APP_ID>",
-      "clientSecret": "<APP_SECRET>",
-      "enabled": true
-    }
-  }
-}
-```
-### Step 3: Restart the Gateway
-```bash
-openclaw gateway restart
-```
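-### Step 4: Verify the connection
-After the restart, send your bot a message in QQ (for example, "hello"). If you receive a reply, the connection works.
-Check the gateway logs to confirm: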
-```bash
-openclaw logs | grep -i qqbot
-```
-You should see output similar to:
-```
-[qqbot] Connected to QQ gateway
-[qqbot] Session ready: sessionId=xxx
-```
-## Optional Configuration
-```jsonc
-{
-  "channels": {
-    "qqbot": {
-      "appId": "<APP_ID>",
-      "clientSecret": "<APP_SECRET>",
-      "enabled": true,
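-      // Bot display name (optional)
-      "name": "科研龙虾",
-      // System prompt (optional, appended after the agent's default prompt)
-      "systemPrompt": "You are talking to the user through QQ; keep replies concise.",
-      // Private-chat policy: open = open to all | pairing = the user must message first | allowlist = allowlist only
-      "dmPolicy": "open",
-      // Speech-to-text (requires STT model configuration)
-      "stt": {
-        "provider": "openai",
-        "model": "whisper-1",
-        "enabled": true
-      },
-      // Text-to-speech
-      "tts": {
-        "provider": "openai",
-        "model": "tts-1",
-        "voice": "nova",
-        "enabled": false
-      },
-      // Multiple accounts (optional)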
-      "accounts": {
-        "bot2": {
-          "appId": "<BOT2_APP_ID>",
-          "clientSecret": "<BOT2_APP_SECRET>",
-          "enabled": true
-        }
-      }
-    }
-  }
-}
-```
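-## Agent Automated Setup Flow
-When the user asks to connect QQ, follow these steps:
-1. **Ask for credentials**:
-   > To connect QQ, you need the AppID and AppSecret issued after you create a bot on the QQ Open Platform (https://q.qq.com/).
-   > Do you already have them? If not, I can walk you through creating a bot.
-2. **Once you have the AppID and AppSecret, run the installation**: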
-   ```bash
-   openclaw plugins install @tencent-connect/openclaw-qqbot@latest
-   openclaw config set channels.qqbot.appId "<AppID provided by the user>"
-   openclaw config set channels.qqbot.clientSecret "<Secret provided by the user>"
-   openclaw config set channels.qqbot.enabled true
-   openclaw gateway restart
-   ```
-3. **Guide the test**:
-   > Installation complete! Find your bot in QQ and send it a message to try it out.
-4. **Record it in MEMORY.md**:
-   ```markdown
-   ### Environment
-   - QQ Bot: connected (AppID: <first 4 characters>****)
-   ```
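-## QQ Bot Registration Guide (for users without an account)
-1. Visit https://q.qq.com/ and log in with your QQ account
-2. Click "Create Bot" (or "Application Management" → "Create")
-3. Fill in the bot's basic information (name, avatar, description)
-4. Get the **AppID** and **AppSecret** under "Development Settings"
-5. Under "Messages" → "Message Subscription", enable the message-receiving permissions:
-   - Group messages (recommended)
-   - Private messages (recommended)
-   - Guild channel messages (optional)
-6. The bot is usable once it passes review
-> Note: QQ bots must pass Tencent's review. During testing you can use the sandbox environment.
-## Messaging Capabilities
-Once QQ is connected, the agent can:
-| Capability | Description |
-|------|------|
-| Text chat | Two-way exchange between QQ messages and the agent |
-| Images | Images sent by users are downloaded automatically; the agent sends images with the `<qqimg>` tag |
-| Voice | STT transcription (requires configuration); TTS for sending voice messages |
-| Video/files | Sent with the `<qqvideo>` / `<qqfile>` tags |
-| Scheduled reminders | The cron tool creates one-off or recurring reminders |
-| Group chat | Supports @-mentioning the bot in groups |
-| Private chat | Supports one-to-one private chats |
-## Common Issues
-### Connection timeout
-- Check the network: the QQ Bot API needs access to `api.sgroup.qq.com` and `bots.qq.com`
-- Inside mainland China, no proxy needs to be set in the configuration (the QQ API is a domestic service)
-- Outside mainland China, you may need to configure a proxy
-### Insufficient permissions (intent downgrade)
-- The plugin automatically tries 3 permission levels: full features → groups + guild channels → guild channels only
-- If only basic features work, check the bot's permission settings on the QQ Open Platform
-### Messages fail to send
-- Confirm that the AppID and AppSecret are correct
-- Confirm that the bot has passed review (or test in the sandbox)
-- Check the gateway logs: `openclaw logs | grep qqbot`
-### Voice is not transcribed
-- An STT provider must be configured (such as OpenAI Whisper)
-- Set the provider and model under `channels.qqbot.stt`
-### Disconnected after restart
-- A restart within 5 minutes resumes the session automatically (session persistence)
-- After more than 5 minutes a new connection is established; functionality is not affected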
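
Step 4 of the removed setup flow records only the first four characters of the AppID in MEMORY.md, followed by `****`. A minimal sketch of that masking, assuming the AppID is a plain string; `mask_app_id` and `memory_entry` are hypothetical helper names and are not part of OpenClaw or the openclaw-qqbot plugin.

```python
def mask_app_id(app_id: str, visible: int = 4) -> str:
    """Keep the first `visible` characters and hide the rest behind '****'."""
    return app_id[:visible] + "****"


def memory_entry(app_id: str) -> str:
    """Build the MEMORY.md snippet in the shape shown in the removed guide."""
    return (
        "### Environment\n"
        f"- QQ Bot: connected (AppID: {mask_app_id(app_id)})"
    )


if __name__ == "__main__":
    # Illustrative AppID only; real values come from the QQ Open Platform.
    print(memory_entry("1029384756"))
```

The fixed `****` suffix avoids revealing the AppID's length; the AppSecret is never written to the memory file at all.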

package/skills/tools/document/weknora-guide/SKILL.md DELETED

@@ -1,216 +0,0 @@
----
-name: weknora-guide
-description: "Tencent document understanding engine with RAG capabilities"
-metadata:
-  openclaw:
-    emoji: "📑"
-    category: "tools"
-    subcategory: "document"
-    keywords: ["WeKnora", "document understanding", "RAG", "text mining", "Tencent", "knowledge extraction"]
-    source: "https://github.com/Tencent/WeKnora"
----
-# WeKnora Guide
-## Overview
-WeKnora is Tencent's open-source document understanding and retrieval-augmented generation engine. It processes complex documents (PDF, DOCX, HTML) into structured knowledge, supporting layout analysis, table extraction, formula recognition, and multi-modal content parsing. It integrates with RAG pipelines for question answering over document collections and is suited for academic paper processing, report analysis, and enterprise document intelligence.
-## Installation
-```bash
-# Install WeKnora
-pip install weknora
-# With GPU support
-pip install weknora[gpu]
-# With all optional dependencies
-pip install weknora[all]
-```
-## Document Parsing
-```python
-from weknora import DocumentParser
-parser = DocumentParser()
-# Parse a PDF document
-doc = parser.parse("research_paper.pdf")
-print(f"Pages: {doc.num_pages}")
-print(f"Sections: {len(doc.sections)}")
-print(f"Tables: {len(doc.tables)}")
-print(f"Figures: {len(doc.figures)}")
-print(f"Equations: {len(doc.equations)}")
-# Access structured content
-for section in doc.sections:
-    print(f"\n## {section.title}")
-    print(f"   {section.text[:200]}...")
-    if section.tables:
-        print(f"   Tables: {len(section.tables)}")
-```
-## Layout Analysis
-```python
-from weknora import LayoutAnalyzer
-analyzer = LayoutAnalyzer(model="layoutlmv3")
-# Detect document layout elements
-layout = analyzer.analyze("paper.pdf")
-for page in layout.pages:
-    print(f"\nPage {page.number}:")
-    for element in page.elements:
-        print(f"  [{element.type}] ({element.bbox}) "
-              f"{element.text[:50]}...")
-    # Element types: title, text, table, figure,
-    #   equation, header, footer, caption, list
-```
-## Table Extraction
-```python
-from weknora import TableExtractor
-extractor = TableExtractor()
-# Extract tables from document
-tables = extractor.extract("paper.pdf")
-for i, table in enumerate(tables):
-    print(f"\nTable {i+1}: {table.caption}")
-    df = table.to_dataframe()
-    print(df.head())
-    # Export
-    df.to_csv(f"table_{i+1}.csv")
-# Extract specific table by page
-table = extractor.extract_from_page("paper.pdf", page=5, index=0)
-```
-## Formula Recognition
-```python
-from weknora import FormulaRecognizer
-recognizer = FormulaRecognizer()
-# Extract formulas from document
-formulas = recognizer.extract("paper.pdf")
-for formula in formulas:
-    print(f"Page {formula.page}: {formula.latex}")
-    # Output: "\\mathcal{L} = -\\sum_{i} y_i \\log(\\hat{y}_i)"
-    print(f"  Type: {formula.type}")  # inline or display
-```
-## RAG Pipeline
-```python
-from weknora import RAGPipeline
-# Build RAG over document collection
-rag = RAGPipeline(
-    embedding_model="bge-large-zh-v1.5",
-    chunk_size=512,
-    chunk_overlap=64,
-)
-# Index documents
-rag.add_documents([
-    "papers/transformer.pdf",
-    "papers/bert.pdf",
-    "papers/gpt3.pdf",
-])
-# Query
-result = rag.query(
-    "What is the computational complexity of self-attention?"
-)
-print(result.answer)
-for source in result.sources:
-    print(f"  [{source.document}] p.{source.page}: "
-          f"{source.text[:80]}...")
-```
-## Multi-Modal Processing
-```python
-from weknora import MultiModalParser
-parser = MultiModalParser()
-# Process document with figures and tables
-doc = parser.parse("paper.pdf", extract_all=True)
-# Access figure descriptions
-for fig in doc.figures:
-    print(f"Figure {fig.number}: {fig.caption}")
-    fig.save_image(f"figures/fig_{fig.number}.png")
-# Cross-reference tables and text
-for ref in doc.cross_references:
-    print(f"'{ref.text}' → {ref.target_type} {ref.target_id}")
-```
-## Batch Processing
-```python
-from weknora import BatchProcessor
-processor = BatchProcessor(
-    workers=4,
-    output_dir="./parsed_docs",
-)
-# Process directory of documents
-results = processor.process_directory(
-    "papers/",
-    formats=["pdf", "docx"],
-    output_format="json",  # or "markdown"
-)
-print(f"Processed: {results.success}/{results.total}")
-print(f"Failed: {results.failures}")
-```
-## Configuration
-```python
-from weknora import Config
-config = Config(
-    parser={
-        "layout_model": "layoutlmv3",
-        "ocr_engine": "paddleocr",
-        "formula_engine": "latex_ocr",
-        "language": "en",  # or "zh", "multi"
-    },
-    rag={
-        "embedding_model": "bge-large-zh-v1.5",
-        "reranker": "bge-reranker-large",
-        "chunk_strategy": "semantic",
-        "vector_store": "faiss",
-    },
-)
-```
-## Use Cases
-1. **Paper parsing**: Extract structured content from academic PDFs
-2. **Table digitization**: Convert paper tables to spreadsheets
-3. **Document QA**: RAG-based question answering over papers
-4. **Knowledge extraction**: Build knowledge bases from documents
-5. **Report analysis**: Process and compare technical reports
-## References
-- [WeKnora GitHub](https://github.com/Tencent/WeKnora)
-- [LayoutLMv3](https://arxiv.org/abs/2204.08387)
-- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
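
The removed guide's RAG example sets `chunk_size=512` and `chunk_overlap=64` without saying how chunks are formed. The following is a minimal sketch of a generic sliding-window chunker with overlap, assuming chunks are measured in tokens and cut at fixed offsets. `chunk_tokens` is a hypothetical helper; the diff does not show whether WeKnora chunks this way.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence,
    chunk_size: int = 512,
    chunk_overlap: int = 64,
) -> List[Sequence]:
    """Split `tokens` into windows of `chunk_size` sharing `chunk_overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With these defaults each window advances by 448 tokens, so consecutive chunks share their 64 boundary tokens. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.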