@wentorai/research-plugins 1.1.0 → 1.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +18 -18
- package/curated/analysis/README.md +12 -1
- package/curated/domains/README.md +48 -1
- package/curated/literature/README.md +46 -1
- package/curated/research/README.md +16 -1
- package/curated/tools/README.md +20 -1
- package/curated/writing/README.md +13 -1
- package/mcp-configs/academic-db/alphafold-mcp.json +20 -0
- package/mcp-configs/academic-db/brightspace-mcp.json +21 -0
- package/mcp-configs/academic-db/climatiq-mcp.json +20 -0
- package/mcp-configs/academic-db/gibs-mcp.json +20 -0
- package/mcp-configs/academic-db/gis-mcp-server.json +22 -0
- package/mcp-configs/academic-db/google-earth-engine-mcp.json +21 -0
- package/mcp-configs/academic-db/m4-clinical-mcp.json +21 -0
- package/mcp-configs/academic-db/medical-mcp.json +21 -0
- package/mcp-configs/academic-db/nexonco-mcp.json +20 -0
- package/mcp-configs/academic-db/omop-mcp.json +20 -0
- package/mcp-configs/academic-db/onekgpd-mcp.json +20 -0
- package/mcp-configs/academic-db/openedu-mcp.json +20 -0
- package/mcp-configs/academic-db/opengenes-mcp.json +20 -0
- package/mcp-configs/academic-db/openstax-mcp.json +21 -0
- package/mcp-configs/academic-db/openstreetmap-mcp.json +21 -0
- package/mcp-configs/academic-db/opentargets-mcp.json +21 -0
- package/mcp-configs/academic-db/pdb-mcp.json +21 -0
- package/mcp-configs/academic-db/smithsonian-mcp.json +20 -0
- package/mcp-configs/ai-platform/magi-researchers.json +21 -0
- package/mcp-configs/ai-platform/mcp-academic-researcher.json +22 -0
- package/mcp-configs/ai-platform/open-paper-machine.json +21 -0
- package/mcp-configs/ai-platform/paper-intelligence.json +21 -0
- package/mcp-configs/ai-platform/paper-reader.json +21 -0
- package/mcp-configs/ai-platform/paperdebugger.json +21 -0
- package/mcp-configs/browser/exa-mcp.json +20 -0
- package/mcp-configs/browser/mcp-searxng.json +21 -0
- package/mcp-configs/browser/mcp-webresearch.json +20 -0
- package/mcp-configs/communication/discourse-mcp.json +21 -0
- package/mcp-configs/data-platform/automl-stat-mcp.json +21 -0
- package/mcp-configs/data-platform/jefferson-stats-mcp.json +22 -0
- package/mcp-configs/data-platform/mcp-excel-server.json +21 -0
- package/mcp-configs/data-platform/mcp-stata.json +21 -0
- package/mcp-configs/data-platform/mcpstack-jupyter.json +21 -0
- package/mcp-configs/data-platform/ml-mcp.json +21 -0
- package/mcp-configs/data-platform/nasdaq-data-link-mcp.json +20 -0
- package/mcp-configs/data-platform/numpy-mcp.json +21 -0
- package/mcp-configs/dev-platform/geogebra-mcp.json +21 -0
- package/mcp-configs/dev-platform/latex-mcp-server.json +21 -0
- package/mcp-configs/dev-platform/manim-mcp.json +20 -0
- package/mcp-configs/dev-platform/mcp-echarts.json +20 -0
- package/mcp-configs/dev-platform/panel-viz-mcp.json +20 -0
- package/mcp-configs/dev-platform/paperbanana.json +20 -0
- package/mcp-configs/dev-platform/texflow-mcp.json +20 -0
- package/mcp-configs/dev-platform/texmcp.json +20 -0
- package/mcp-configs/dev-platform/typst-mcp.json +21 -0
- package/mcp-configs/dev-platform/vizro-mcp.json +20 -0
- package/mcp-configs/note-knowledge/local-faiss-mcp.json +21 -0
- package/mcp-configs/note-knowledge/mcp-memory-service.json +21 -0
- package/mcp-configs/note-knowledge/mcp-obsidian.json +23 -0
- package/mcp-configs/note-knowledge/mcp-ragdocs.json +20 -0
- package/mcp-configs/note-knowledge/mcp-summarizer.json +21 -0
- package/mcp-configs/note-knowledge/mediawiki-mcp.json +21 -0
- package/mcp-configs/note-knowledge/openzim-mcp.json +20 -0
- package/mcp-configs/note-knowledge/zettelkasten-mcp.json +21 -0
- package/mcp-configs/reference-mgr/academic-paper-mcp-http.json +20 -0
- package/mcp-configs/reference-mgr/academix.json +20 -0
- package/mcp-configs/reference-mgr/arxiv-research-mcp.json +21 -0
- package/mcp-configs/reference-mgr/google-scholar-abstract-mcp.json +19 -0
- package/mcp-configs/reference-mgr/google-scholar-mcp.json +20 -0
- package/mcp-configs/reference-mgr/mcp-paperswithcode.json +21 -0
- package/mcp-configs/reference-mgr/mcp-scholarly.json +20 -0
- package/mcp-configs/reference-mgr/mcp-simple-arxiv.json +20 -0
- package/mcp-configs/reference-mgr/mcp-simple-pubmed.json +20 -0
- package/mcp-configs/reference-mgr/mcp-zotero.json +21 -0
- package/mcp-configs/reference-mgr/mendeley-mcp.json +20 -0
- package/mcp-configs/reference-mgr/ncbi-mcp-server.json +22 -0
- package/mcp-configs/reference-mgr/onecite.json +21 -0
- package/mcp-configs/reference-mgr/paper-search-mcp.json +21 -0
- package/mcp-configs/reference-mgr/pubmed-search-mcp.json +21 -0
- package/mcp-configs/reference-mgr/scholar-mcp.json +21 -0
- package/mcp-configs/reference-mgr/scholar-multi-mcp.json +21 -0
- package/mcp-configs/reference-mgr/seerai.json +21 -0
- package/mcp-configs/reference-mgr/semantic-scholar-fastmcp.json +21 -0
- package/mcp-configs/reference-mgr/sourcelibrary.json +20 -0
- package/openclaw.plugin.json +2 -2
- package/package.json +2 -2
- package/skills/analysis/dataviz/citation-map-guide/SKILL.md +184 -0
- package/skills/analysis/dataviz/data-visualization-principles/SKILL.md +171 -0
- package/skills/analysis/econometrics/econml-causal-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/empirical-paper-analysis/SKILL.md +192 -0
- package/skills/analysis/econometrics/mostly-harmless-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/panel-data-regression-workflow/SKILL.md +267 -0
- package/skills/analysis/econometrics/python-causality-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/stata-reference-guide/SKILL.md +293 -0
- package/skills/analysis/statistics/general-statistics-guide/SKILL.md +226 -0
- package/skills/analysis/statistics/infiagent-benchmark-guide/SKILL.md +106 -0
- package/skills/analysis/wrangling/claude-data-analysis-guide/SKILL.md +100 -0
- package/skills/analysis/wrangling/open-data-scientist-guide/SKILL.md +197 -0
- package/skills/analysis/wrangling/streamline-analyst-guide/SKILL.md +119 -0
- package/skills/domains/ai-ml/ai-agent-papers-guide/SKILL.md +146 -0
- package/skills/domains/ai-ml/anomaly-detection-papers-guide/SKILL.md +167 -0
- package/skills/domains/ai-ml/autonomous-agents-papers-guide/SKILL.md +178 -0
- package/skills/domains/ai-ml/domain-adaptation-papers-guide/SKILL.md +173 -0
- package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +2 -2
- package/skills/domains/ai-ml/graph-learning-papers-guide/SKILL.md +125 -0
- package/skills/domains/ai-ml/kolmogorov-arnold-networks-guide/SKILL.md +185 -0
- package/skills/domains/ai-ml/npcpy-research-guide/SKILL.md +137 -0
- package/skills/domains/ai-ml/responsible-ai-guide/SKILL.md +126 -0
- package/skills/domains/ai-ml/vmas-simulator-guide/SKILL.md +129 -0
- package/skills/domains/biomedical/clawbio-guide/SKILL.md +167 -0
- package/skills/domains/biomedical/clinical-dialogue-agents-guide/SKILL.md +145 -0
- package/skills/domains/biomedical/ena-sequence-api/SKILL.md +175 -0
- package/skills/domains/biomedical/genomas-guide/SKILL.md +126 -0
- package/skills/domains/biomedical/genotex-benchmark-guide/SKILL.md +125 -0
- package/skills/domains/biomedical/med-researcher-guide/SKILL.md +161 -0
- package/skills/domains/biomedical/med-researcher-r1-guide/SKILL.md +146 -0
- package/skills/domains/biomedical/ncbi-blast-api/SKILL.md +195 -0
- package/skills/domains/biomedical/ncbi-datasets-api/SKILL.md +220 -0
- package/skills/domains/biomedical/quickgo-api/SKILL.md +181 -0
- package/skills/domains/business/xpert-bi-guide/SKILL.md +84 -0
- package/skills/domains/chemistry/cactus-cheminformatics-guide/SKILL.md +89 -0
- package/skills/domains/chemistry/chemeagle-guide/SKILL.md +147 -0
- package/skills/domains/chemistry/chemgraph-agent-guide/SKILL.md +120 -0
- package/skills/domains/cs/ai-security-papers-guide/SKILL.md +103 -0
- package/skills/domains/cs/code-llm-papers-guide/SKILL.md +131 -0
- package/skills/domains/cs/gaussian-splatting-papers-guide/SKILL.md +158 -0
- package/skills/domains/cs/llm-aiops-guide/SKILL.md +70 -0
- package/skills/domains/cs/software-heritage-api/SKILL.md +200 -0
- package/skills/domains/economics/nber-working-papers-api/SKILL.md +177 -0
- package/skills/domains/economics/repec-economics-api/SKILL.md +188 -0
- package/skills/domains/education/academic-study-methods/SKILL.md +228 -0
- package/skills/domains/education/edumcp-guide/SKILL.md +74 -0
- package/skills/domains/education/open-syllabus-api/SKILL.md +171 -0
- package/skills/domains/finance/akshare-finance-data/SKILL.md +207 -0
- package/skills/domains/finance/finsight-research-guide/SKILL.md +113 -0
- package/skills/domains/finance/options-analytics-agent-guide/SKILL.md +117 -0
- package/skills/domains/geoscience/pangaea-data-api/SKILL.md +197 -0
- package/skills/domains/humanities/digital-humanities-methods/SKILL.md +232 -0
- package/skills/domains/law/caselaw-access-api/SKILL.md +149 -0
- package/skills/domains/law/legal-agent-skills-guide/SKILL.md +132 -0
- package/skills/domains/law/legal-research-methods/SKILL.md +190 -0
- package/skills/domains/law/opencontracts-guide/SKILL.md +168 -0
- package/skills/domains/math/lean-theorem-proving-guide/SKILL.md +140 -0
- package/skills/domains/pharma/madd-drug-discovery-guide/SKILL.md +153 -0
- package/skills/domains/social-science/ipums-microdata-api/SKILL.md +211 -0
- package/skills/domains/social-science/sociology-research-methods/SKILL.md +181 -0
- package/skills/literature/discovery/arxiv-paper-monitoring/SKILL.md +233 -0
- package/skills/literature/discovery/papers-we-love-guide/SKILL.md +169 -0
- package/skills/literature/discovery/zotero-arxiv-daily-guide/SKILL.md +2 -2
- package/skills/literature/fulltext/bioc-pmc-api/SKILL.md +146 -0
- package/skills/literature/fulltext/dataverse-api/SKILL.md +215 -0
- package/skills/literature/fulltext/hal-archive-api/SKILL.md +218 -0
- package/skills/literature/fulltext/osf-api/SKILL.md +212 -0
- package/skills/literature/fulltext/pmc-ftp-bulk-download/SKILL.md +182 -0
- package/skills/literature/fulltext/zotero-ai-butler-guide/SKILL.md +166 -0
- package/skills/literature/fulltext/zotero-scihub-guide/SKILL.md +168 -0
- package/skills/literature/metadata/bibliometrix-guide/SKILL.md +164 -0
- package/skills/literature/metadata/crossref-event-data-api/SKILL.md +183 -0
- package/skills/literature/metadata/doi-content-negotiation/SKILL.md +202 -0
- package/skills/literature/metadata/orkg-api/SKILL.md +153 -0
- package/skills/literature/metadata/plumx-metrics-api/SKILL.md +188 -0
- package/skills/literature/metadata/ror-organization-api/SKILL.md +208 -0
- package/skills/literature/metadata/sophosia-reference-guide/SKILL.md +110 -0
- package/skills/literature/metadata/viaf-authority-api/SKILL.md +209 -0
- package/skills/literature/metadata/zoplicate-dedup-guide/SKILL.md +147 -0
- package/skills/literature/metadata/zotero-actions-tags-guide/SKILL.md +212 -0
- package/skills/literature/metadata/zotmoov-guide/SKILL.md +120 -0
- package/skills/literature/metadata/zutilo-guide/SKILL.md +140 -0
- package/skills/literature/search/arxiv-cli-tools/SKILL.md +172 -0
- package/skills/literature/search/arxiv-osiris/SKILL.md +199 -0
- package/skills/literature/search/base-academic-search/SKILL.md +196 -0
- package/skills/literature/search/chatpaper-guide/SKILL.md +2 -2
- package/skills/literature/search/citeseerx-api/SKILL.md +183 -0
- package/skills/literature/search/deepgit-search-guide/SKILL.md +2 -2
- package/skills/literature/search/eric-education-api/SKILL.md +199 -0
- package/skills/literature/search/findpapers-guide/SKILL.md +177 -0
- package/skills/literature/search/ieee-xplore-api/SKILL.md +177 -0
- package/skills/literature/search/lens-scholarly-api/SKILL.md +211 -0
- package/skills/literature/search/multi-database-literature-search/SKILL.md +198 -0
- package/skills/literature/search/open-library-api/SKILL.md +196 -0
- package/skills/literature/search/open-semantic-search-guide/SKILL.md +190 -0
- package/skills/literature/search/openaire-api/SKILL.md +141 -0
- package/skills/literature/search/paper-search-mcp-guide/SKILL.md +107 -0
- package/skills/literature/search/papers-chat-guide/SKILL.md +194 -0
- package/skills/literature/search/pasa-paper-search-guide/SKILL.md +2 -2
- package/skills/literature/search/plos-open-access-api/SKILL.md +203 -0
- package/skills/literature/search/scielo-api/SKILL.md +182 -0
- package/skills/literature/search/share-research-api/SKILL.md +129 -0
- package/skills/literature/search/worldcat-search-api/SKILL.md +224 -0
- package/skills/research/automation/aim-experiment-guide/SKILL.md +2 -2
- package/skills/research/automation/claude-academic-workflow-guide/SKILL.md +202 -0
- package/skills/research/automation/coexist-ai-guide/SKILL.md +149 -0
- package/skills/research/automation/datagen-research-guide/SKILL.md +2 -2
- package/skills/research/automation/foam-agent-guide/SKILL.md +203 -0
- package/skills/research/automation/kedro-pipeline-guide/SKILL.md +2 -2
- package/skills/research/automation/mle-agent-guide/SKILL.md +2 -2
- package/skills/research/automation/paper-to-agent-guide/SKILL.md +2 -2
- package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +2 -2
- package/skills/research/deep-research/cognitive-kernel-guide/SKILL.md +200 -0
- package/skills/research/deep-research/corvus-research-guide/SKILL.md +132 -0
- package/skills/research/deep-research/in-depth-research-guide/SKILL.md +205 -0
- package/skills/research/deep-research/kosmos-scientist-guide/SKILL.md +185 -0
- package/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md +178 -0
- package/skills/research/deep-research/open-researcher-guide/SKILL.md +138 -0
- package/skills/research/methodology/claude-scientific-guide/SKILL.md +2 -2
- package/skills/research/methodology/parsifal-slr-guide/SKILL.md +154 -0
- package/skills/research/methodology/research-pipeline-units-guide/SKILL.md +169 -0
- package/skills/research/methodology/slr-automation-guide/SKILL.md +235 -0
- package/skills/research/paper-review/latte-review-guide/SKILL.md +175 -0
- package/skills/research/paper-review/paper-critique-framework/SKILL.md +181 -0
- package/skills/tools/code-exec/contextplus-mcp-guide/SKILL.md +110 -0
- package/skills/tools/diagram/clawphd-guide/SKILL.md +149 -0
- package/skills/tools/diagram/kroki-diagram-api/SKILL.md +198 -0
- package/skills/tools/diagram/scientific-graphical-abstract/SKILL.md +201 -0
- package/skills/tools/document/docsgpt-guide/SKILL.md +2 -2
- package/skills/tools/document/md2pdf-xelatex/SKILL.md +212 -0
- package/skills/tools/document/openpaper-guide/SKILL.md +232 -0
- package/skills/tools/document/weknora-guide/SKILL.md +216 -0
- package/skills/tools/document/zotero-addon-market-guide/SKILL.md +108 -0
- package/skills/tools/document/zotero-night-theme-guide/SKILL.md +142 -0
- package/skills/tools/document/zotero-style-guide/SKILL.md +217 -0
- package/skills/tools/knowledge-graph/graphiti-guide/SKILL.md +2 -2
- package/skills/tools/knowledge-graph/mimir-memory-guide/SKILL.md +135 -0
- package/skills/tools/knowledge-graph/notero-zotero-notion-guide/SKILL.md +187 -0
- package/skills/tools/knowledge-graph/open-webui-tools-guide/SKILL.md +156 -0
- package/skills/tools/knowledge-graph/openspg-guide/SKILL.md +210 -0
- package/skills/tools/knowledge-graph/paperpile-notion-guide/SKILL.md +84 -0
- package/skills/tools/knowledge-graph/zotero-markdb-connect-guide/SKILL.md +162 -0
- package/skills/tools/ocr-translate/latex-translation-guide/SKILL.md +176 -0
- package/skills/tools/ocr-translate/math-equation-renderer/SKILL.md +198 -0
- package/skills/tools/ocr-translate/pdf-math-translate-guide/SKILL.md +2 -2
- package/skills/tools/ocr-translate/zotero-pdf-translate-guide/SKILL.md +2 -2
- package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +2 -2
- package/skills/writing/citation/academic-citation-manager-guide/SKILL.md +182 -0
- package/skills/writing/citation/citation-assistant-skill/SKILL.md +192 -0
- package/skills/writing/citation/jabref-reference-guide/SKILL.md +2 -2
- package/skills/writing/citation/jasminum-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/mendeley-api/SKILL.md +231 -0
- package/skills/writing/citation/obsidian-citation-guide/SKILL.md +2 -2
- package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/onecite-reference-guide/SKILL.md +168 -0
- package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/papis-cli-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-better-bibtex-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-better-notes-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-gpt-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-mcp-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-reference-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +2 -2
- package/skills/writing/composition/opendraft-thesis-guide/SKILL.md +200 -0
- package/skills/writing/composition/paper-debugger-guide/SKILL.md +2 -2
- package/skills/writing/composition/paperforge-guide/SKILL.md +205 -0
- package/skills/writing/composition/research-paper-writer/SKILL.md +226 -0
- package/skills/writing/composition/scientific-writing-resources/SKILL.md +2 -2
- package/skills/writing/latex/academic-writing-latex/SKILL.md +285 -0
- package/skills/writing/latex/latex-drawing-collection/SKILL.md +2 -2
- package/skills/writing/latex/latex-templates-collection/SKILL.md +2 -2
- package/skills/writing/polish/chinese-text-humanizer/SKILL.md +140 -0
- package/skills/writing/templates/arxiv-preprint-template/SKILL.md +184 -0
- package/skills/writing/templates/elegant-paper-template/SKILL.md +141 -0
- package/skills/writing/templates/novathesis-guide/SKILL.md +2 -2
- package/skills/writing/templates/sjtuthesis-guide/SKILL.md +2 -2
- package/skills/writing/templates/thuthesis-guide/SKILL.md +2 -2
@@ -0,0 +1,113 @@ package/skills/domains/finance/finsight-research-guide/SKILL.md

---
name: finsight-research-guide
description: "Deep financial research with the FinSight multi-agent system"
metadata:
  openclaw:
    emoji: "💰"
    category: "domains"
    subcategory: "finance"
    keywords: ["FinSight", "financial analysis", "deep research", "market analysis", "financial reports", "multi-agent"]
    source: "https://github.com/RUC-NLPIR/FinSight"
---

# FinSight Research Guide

## Overview

FinSight is a deep research agent designed specifically for financial analysis. Developed by RUC-NLPIR, it combines multi-source data retrieval, financial reasoning, and report generation to produce publication-ready financial research. It handles market analysis, company fundamentals, sector comparisons, and macroeconomic assessment through specialized agents.

## Installation

```bash
git clone https://github.com/RUC-NLPIR/FinSight.git
cd FinSight && pip install -e .
```

## Core Capabilities

### Research Query to Report

```python
from finsight import FinSightAgent

agent = FinSightAgent(llm_provider="anthropic")

# Generate comprehensive financial analysis
report = agent.research(
    "Analyze the competitive landscape of the global EV battery "
    "market. Compare CATL, LG Energy, and Panasonic on market "
    "share, technology, margins, and growth outlook."
)

print(report.summary)
report.save("ev_battery_analysis.pdf")
```

### Agent Architecture

| Agent | Role |
|-------|------|
| **Retrieval Agent** | Fetches data from SEC filings, financial APIs, news |
| **Data Agent** | Processes financial statements, ratios, time series |
| **Analysis Agent** | Performs fundamental, technical, and comparative analysis |
| **Reasoning Agent** | Synthesizes findings, identifies trends and risks |
| **Report Agent** | Generates structured research reports with citations |

### Financial Data Sources

```python
# FinSight integrates with multiple data sources
config = {
    "sec_edgar": True,       # SEC filings (free)
    "fred": True,            # Federal Reserve economic data
    "yahoo_finance": True,   # Market data (free)
    "news_api": True,        # Financial news
    "world_bank": True,      # Macro indicators
}
```
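The `sec_edgar` source corresponds to the SEC's public EDGAR APIs, which need no API key (only a descriptive `User-Agent` header on real requests). As a reference point, EDGAR's documented XBRL company-facts endpoint can be queried directly; the `company_facts_url` helper below is our own illustration, not part of FinSight:

```python
def company_facts_url(cik: int) -> str:
    """URL for SEC EDGAR's XBRL company-facts endpoint.

    CIK numbers are zero-padded to 10 digits in the path;
    e.g. Apple Inc. has CIK 320193.
    """
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik:010d}.json"


print(company_facts_url(320193))
# https://data.sec.gov/api/xbrl/companyfacts/CIK0000320193.json
```

Fetching that URL (with a `User-Agent` header identifying you) returns every XBRL fact the company has filed, which is the kind of raw material a retrieval agent works from.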

### Analysis Types

```python
# Company fundamental analysis
report = agent.research(
    "Provide a fundamental analysis of NVIDIA including "
    "revenue trends, margin analysis, valuation multiples, "
    "and competitive moat assessment."
)

# Sector analysis
report = agent.research(
    "Compare the top 5 cloud computing companies by revenue "
    "growth, operating margins, and R&D investment intensity."
)

# Macro analysis
report = agent.research(
    "Analyze the impact of rising interest rates on US "
    "commercial real estate valuations since 2022."
)
```

## Report Structure

Generated reports typically include:

1. **Executive Summary** — Key findings in 3-5 bullets
2. **Market Overview** — Industry size, growth, trends
3. **Company Analysis** — Financials, competitive position
4. **Risk Assessment** — Key risks and mitigation
5. **Outlook** — Forward-looking analysis with scenarios
6. **Sources** — Cited data sources and references

## Use Cases

1. **Investment research**: Company and sector deep dives
2. **Due diligence**: Comprehensive target company analysis
3. **Academic research**: Financial economics research support
4. **Market intelligence**: Competitive landscape mapping

## References

- [FinSight GitHub](https://github.com/RUC-NLPIR/FinSight)
- [RUC-NLPIR Lab](http://playbigdata.ruc.edu.cn/)
@@ -0,0 +1,117 @@ package/skills/domains/finance/options-analytics-agent-guide/SKILL.md

---
name: options-analytics-agent-guide
description: "AI agent for options pricing, Greeks, and strategy analysis"
metadata:
  openclaw:
    emoji: "📉"
    category: "domains"
    subcategory: "finance"
    keywords: ["options analytics", "derivatives", "Greeks", "Black-Scholes", "strategy analysis", "financial agent"]
    source: "https://github.com/options-analytics/options-agent"
---

# Options Analytics Agent Guide

## Overview

An AI agent for options pricing, risk analysis, and strategy evaluation. It combines Black-Scholes and binomial models, Greeks calculations, implied volatility surfaces, and portfolio risk analytics into a conversational interface. Researchers and quantitative analysts can query options data, price exotic derivatives, and evaluate trading strategies through natural language.

## Core Capabilities

```python
from options_agent import OptionsAgent

agent = OptionsAgent(llm_provider="anthropic")

# Price an option
result = agent.price(
    option_type="call",
    strike=100,
    spot=105,
    expiry_days=30,
    risk_free_rate=0.05,
    volatility=0.20,
    model="black_scholes",
)

print(f"Price: ${result.price:.2f}")
print(f"Delta: {result.delta:.4f}")
print(f"Gamma: {result.gamma:.4f}")
print(f"Theta: {result.theta:.4f}")
print(f"Vega: {result.vega:.4f}")
print(f"Rho: {result.rho:.4f}")
```
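For reference, the closed-form model behind `model="black_scholes"` is small enough to write out. A standard-library sketch of the Black-Scholes-Merton price for a non-dividend-paying European option, using the same inputs as the example above:

```python
from math import log, sqrt, exp, erf


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_price(option_type: str, spot: float, strike: float,
             t_years: float, rate: float, vol: float) -> float:
    """Black-Scholes price of a European option (no dividends)."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * t_years) \
        / (vol * sqrt(t_years))
    d2 = d1 - vol * sqrt(t_years)
    if option_type == "call":
        return spot * norm_cdf(d1) - strike * exp(-rate * t_years) * norm_cdf(d2)
    return strike * exp(-rate * t_years) * norm_cdf(-d2) - spot * norm_cdf(-d1)


# Same inputs as agent.price() above, with 30 days = 30/365 years
call = bs_price("call", spot=105, strike=100, t_years=30 / 365,
                rate=0.05, vol=0.20)
print(f"call = {call:.2f}")  # ≈ 5.97
```

Delta falls out of the same quantities: for a call it is simply `norm_cdf(d1)` (about 0.83 here), which is why the Greeks come almost free once the pricer exists.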

## Greeks Analysis

```python
# Full Greeks surface
surface = agent.greeks_surface(
    strike=100,
    spot_range=(80, 120),
    expiry_range=(7, 90),  # days
    volatility=0.25,
)

surface.plot_delta_surface("delta_surface.png")
surface.plot_gamma_surface("gamma_surface.png")
surface.plot_theta_decay("theta_decay.png")
```

## Strategy Evaluation

```python
# Evaluate an options strategy
strategy = agent.evaluate_strategy(
    legs=[
        {"type": "call", "strike": 100, "action": "buy", "qty": 1},
        {"type": "call", "strike": 110, "action": "sell", "qty": 1},
    ],
    spot=105,
    expiry_days=30,
    volatility=0.20,
)

print(f"Strategy: {strategy.name}")  # Bull Call Spread
print(f"Max profit: ${strategy.max_profit:.2f}")
print(f"Max loss: ${strategy.max_loss:.2f}")
print(f"Breakeven: ${strategy.breakeven:.2f}")

strategy.plot_payoff("payoff.png")
strategy.plot_pnl_scenarios("scenarios.png")
```
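The numbers a strategy evaluator reports can be sanity-checked from first principles: at expiry a call is worth `max(S - K, 0)`, so the two-leg payoff (premiums aside, which shift the whole curve down by the net debit) is easy to tabulate. A small sketch using the same `legs` structure as above:

```python
def payoff_at_expiry(spot: float, legs: list) -> float:
    """Expiry payoff of a multi-leg call position, premiums excluded."""
    total = 0.0
    for leg in legs:
        intrinsic = max(spot - leg["strike"], 0.0)  # call intrinsic value
        sign = 1 if leg["action"] == "buy" else -1
        total += sign * leg["qty"] * intrinsic
    return total


legs = [
    {"type": "call", "strike": 100, "action": "buy", "qty": 1},
    {"type": "call", "strike": 110, "action": "sell", "qty": 1},
]

for spot in (95, 105, 120):
    print(spot, payoff_at_expiry(spot, legs))
# 95 -> 0.0, 105 -> 5.0, 120 -> 10.0 (capped at the 10-point strike spread)
```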

## Implied Volatility

```python
# Calculate implied volatility
iv = agent.implied_volatility(
    market_price=5.50,
    option_type="call",
    strike=100,
    spot=105,
    expiry_days=30,
    risk_free_rate=0.05,
)
print(f"Implied volatility: {iv:.2%}")

# Volatility smile/surface
vol_surface = agent.volatility_surface(
    ticker="SPY",
    date="2025-03-10",
)
vol_surface.plot("vol_surface.png")
```
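There is no closed form for implied volatility; it is found by numerically inverting the pricing model. A minimal bisection sketch (self-contained; `bs_call` is a local Black-Scholes helper, not the agent's API), exploiting the fact that the call price is strictly increasing in volatility:

```python
from math import log, sqrt, exp, erf


def bs_call(spot, strike, t, r, vol):
    """Black-Scholes call price (no dividends)."""
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(spot / strike) + (r + 0.5 * vol ** 2) * t) / (vol * sqrt(t))
    return spot * cdf(d1) - strike * exp(-r * t) * cdf(d1 - vol * sqrt(t))


def implied_vol(market_price, spot, strike, t, r,
                lo=1e-4, hi=5.0, tol=1e-10):
    """Bisection search: the call price is monotone increasing in vol."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, t, r, mid) < market_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Recover the vol at which a 30-day 100-strike call on a 105 spot
# trades at 5.50 (the inputs from the example above)
iv = implied_vol(5.50, spot=105, strike=100, t=30 / 365, r=0.05)
print(f"implied vol = {iv:.2%}")
```

Bisection converges unconditionally here; production solvers usually start with Newton's method (using vega as the derivative) and fall back to bisection near the intrinsic-value floor.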

## Use Cases

1. **Options pricing**: Black-Scholes and numerical methods
2. **Risk management**: Greeks and portfolio risk metrics
3. **Strategy analysis**: P&L profiles and breakeven analysis
4. **Volatility analysis**: IV surfaces and skew analysis
5. **Education**: Interactive derivatives teaching tool

## References

- [Options Analytics Agent](https://github.com/options-analytics/options-agent)
- [QuantLib](https://www.quantlib.org/) — Quantitative finance library
@@ -0,0 +1,197 @@ package/skills/domains/geoscience/pangaea-data-api/SKILL.md

---
name: pangaea-data-api
description: "Access earth and environmental science datasets via PANGAEA API"
metadata:
  openclaw:
    emoji: "🌍"
    category: "domains"
    subcategory: "geoscience"
    keywords: ["PANGAEA", "earth science data", "oceanography", "paleoclimate", "environmental data", "geoscience repository"]
    source: "https://www.pangaea.de/"
---

# PANGAEA Data Repository API

## Overview

PANGAEA is the world's leading data repository for earth and environmental sciences, hosting 400K+ datasets with 20B+ data points. It archives research data from oceanography, paleoclimatology, geology, ecology, and atmospheric science. Each dataset has a DOI and is linked to the originating publication. The API provides search, metadata retrieval, and data download. Free, no authentication required.

## API Endpoints

### Search API

```bash
# Search datasets by keyword
curl "https://www.pangaea.de/advanced/search.php?q=ocean+temperature&count=20&type=json"

# Search with geographic bounding box
curl "https://www.pangaea.de/advanced/search.php?\
q=sediment+core&minlat=-60&maxlat=-30&minlon=-180&maxlon=180&type=json"

# Filter by parameter (measurement type)
curl "https://www.pangaea.de/advanced/search.php?\
q=carbon+dioxide&param=Atmospheric+CO2&type=json"

# Filter by date range
curl "https://www.pangaea.de/advanced/search.php?\
q=Arctic+ice&mindate=2020-01-01&maxdate=2026-12-31&type=json"
```

### Elasticsearch API

```bash
# Full-text search via Elasticsearch
curl -X POST "https://ws.pangaea.de/es/pangaea/panmd/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          {"match": {"citation.title": "ocean temperature"}}
        ],
        "filter": [
          {"range": {"citation.year": {"gte": 2020}}}
        ]
      }
    },
    "size": 20
  }'
```

### Dataset Access

```bash
# Get dataset metadata
curl "https://doi.pangaea.de/10.1594/PANGAEA.123456?format=metainfo_json"

# Download dataset as tab-delimited text
curl "https://doi.pangaea.de/10.1594/PANGAEA.123456?format=textfile"

# Download as CSV
curl "https://doi.pangaea.de/10.1594/PANGAEA.123456?format=csv"
```

### OAI-PMH Harvesting

```bash
# List records
curl "https://ws.pangaea.de/oai/provider?verb=ListRecords&metadataPrefix=oai_dc"

# Get a specific record
curl "https://ws.pangaea.de/oai/provider?verb=GetRecord&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.123456&metadataPrefix=oai_dc"
```

### Query Parameters (Search API)

| Parameter | Description | Example |
|-----------|-------------|---------|
| `q` | Search query | `q=coral+reef+bleaching` |
| `count` | Results per page | `count=50` |
| `offset` | Pagination offset | `offset=20` |
| `minlat`/`maxlat` | Latitude bounds | `-90` to `90` |
| `minlon`/`maxlon` | Longitude bounds | `-180` to `180` |
| `mindate`/`maxdate` | Temporal filter | `2020-01-01` |
| `param` | Parameter/measurement | `Temperature` |
| `topic` | Topic filter | `Atmosphere`, `Biosphere` |
| `type` | Response format | `json`, `xml` |
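`count` and `offset` together page through large result sets. A tiny helper (the `page_params` name is ours) that produces the parameter dict for each successive page of the Search API:

```python
def page_params(query: str, page_size: int = 50, pages: int = 3):
    """Yield search.php parameter dicts for successive result pages."""
    for page in range(pages):
        yield {
            "q": query,
            "count": page_size,
            "offset": page * page_size,
            "type": "json",
        }


for params in page_params("coral reef bleaching"):
    print(params["offset"])  # 0, 50, 100
```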
|
|
97
|
+
|
|
## Python Usage

```python
import requests
import pandas as pd
from io import StringIO

SEARCH_URL = "https://www.pangaea.de/advanced/search.php"
ES_URL = "https://ws.pangaea.de/es/pangaea/panmd/_search"


def search_pangaea(query: str, count: int = 20,
                   bbox: dict = None) -> list:
    """Search PANGAEA for earth science datasets."""
    params = {"q": query, "count": count, "type": "json"}
    if bbox:
        params.update({
            "minlat": bbox.get("south", -90),
            "maxlat": bbox.get("north", 90),
            "minlon": bbox.get("west", -180),
            "maxlon": bbox.get("east", 180),
        })

    resp = requests.get(SEARCH_URL, params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()

    results = []
    for item in data.get("results", []):
        results.append({
            "doi": item.get("URI", ""),
            "title": item.get("citation", ""),
            "year": item.get("year"),
            "size": item.get("size"),
            "parameters": item.get("params", []),
            "score": item.get("score"),
        })
    return results


def download_dataset(doi: str) -> pd.DataFrame:
    """Download a PANGAEA dataset as a pandas DataFrame."""
    url = f"https://doi.pangaea.de/{doi}?format=textfile"
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()

    # The tab-delimited payload follows a metadata header ending with "*/"
    lines = resp.text.split("\n")
    header_end = next(
        (i for i, line in enumerate(lines) if line.startswith("*/")),
        -1,
    )
    data_text = "\n".join(lines[header_end + 1:])
    return pd.read_csv(StringIO(data_text), sep="\t")


def search_by_location(query: str, lat: float, lon: float,
                       radius_deg: float = 5.0) -> list:
    """Search datasets near a geographic location."""
    bbox = {
        "south": lat - radius_deg,
        "north": lat + radius_deg,
        "west": lon - radius_deg,
        "east": lon + radius_deg,
    }
    return search_pangaea(query, bbox=bbox)


# Example: find ocean temperature datasets
datasets = search_pangaea("sea surface temperature", count=5)
for ds in datasets:
    print(f"[{ds['year']}] {ds['title'][:80]}...")
    print(f"  DOI: {ds['doi']} | Size: {ds['size']}")

# Example: download a specific dataset
# df = download_dataset("10.1594/PANGAEA.123456")
# print(df.head())

# Example: find Arctic research data
arctic = search_by_location("permafrost", lat=70, lon=25)
for ds in arctic[:3]:
    print(f"{ds['title'][:80]}...")
```

## Data Topics

| Topic | Coverage |
|-------|----------|
| Oceans | Temperature, salinity, currents, chemistry |
| Paleoclimate | Ice cores, sediment cores, tree rings |
| Atmosphere | CO2, aerosols, weather observations |
| Lithosphere | Geology, tectonics, geochemistry |
| Biosphere | Biodiversity, ecology, marine biology |
| Cryosphere | Sea ice, glaciers, permafrost |

## References

- [PANGAEA](https://www.pangaea.de/)
- [PANGAEA API](https://wiki.pangaea.de/wiki/PANGAEA_API)
- [PANGAEA Elasticsearch](https://wiki.pangaea.de/wiki/Elasticsearch_API)
- Diepenbroek, M. et al. (2002). "PANGAEA — an information system for environmental sciences." *Computers & Geosciences* 28(10).
@@ -0,0 +1,232 @@
---
name: digital-humanities-methods
description: "Computational methods for humanities research with text and network analysis"
metadata:
  openclaw:
    emoji: "📜"
    category: "domains"
    subcategory: "humanities"
    keywords: ["digital humanities", "text analysis", "corpus linguistics", "network analysis", "cultural analytics", "computational methods"]
    source: "https://clawhub.ai/digital-humanities"
---

# Digital Humanities Methods

## Overview

Digital Humanities (DH) applies computational methods to humanistic inquiry — analyzing literary texts, historical records, cultural artifacts, and social networks at scale. This guide covers the core computational methods used in DH research: text analysis, topic modeling, network analysis, spatial analysis, and corpus linguistics. These methods complement rather than replace traditional close reading and archival research.

## Text Analysis

### Preprocessing Pipeline

```python
import re
from collections import Counter

def preprocess_text(text: str, language: str = "en") -> list:
    """Standard preprocessing for humanities text analysis."""
    # Lowercase
    text = text.lower()

    # Remove metadata markers (page numbers, headers)
    text = re.sub(r'\[page \d+\]', '', text)
    text = re.sub(r'\n{3,}', '\n\n', text)

    # Tokenize (simple whitespace + punctuation split)
    tokens = re.findall(r'\b[a-z]+\b', text)

    # Remove stopwords (customize per corpus!)
    # Standard lists often remove words meaningful in literary analysis
    # e.g., "not", "but", "never" carry sentiment — keep them if relevant
    stopwords = {"the", "a", "an", "is", "are", "was", "were", "in",
                 "on", "at", "to", "for", "of", "with", "and", "or"}
    tokens = [t for t in tokens if t not in stopwords and len(t) > 2]

    return tokens

# Word frequency analysis
def word_frequencies(tokens: list, top_n: int = 50) -> list:
    return Counter(tokens).most_common(top_n)
```
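
A quick sanity check of the pipeline (the sample sentence is invented, and the steps are inlined so the snippet runs standalone):

```python
import re
from collections import Counter

text = "The sea was calm, and the sea was bright; never had the harbour seemed so calm."

# Same steps as preprocess_text above, inlined for a standalone check
tokens = re.findall(r'\b[a-z]+\b', text.lower())
stopwords = {"the", "a", "an", "is", "are", "was", "were", "in",
             "on", "at", "to", "for", "of", "with", "and", "or"}
tokens = [t for t in tokens if t not in stopwords and len(t) > 2]

top = Counter(tokens).most_common(3)  # [('sea', 2), ('calm', 2), ('bright', 1)]
```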

### Stylometry (Authorship Analysis)

```python
"""Stylometric features for authorship attribution."""
from collections import Counter

def extract_style_features(text: str) -> dict:
    """Extract stylistic features from a text."""
    sentences = text.split('.')
    words = text.split()

    return {
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "vocabulary_richness": len(set(words)) / max(len(words), 1),
        "hapax_ratio": sum(1 for w, c in Counter(words).items() if c == 1) / max(len(set(words)), 1),
        "comma_rate": text.count(',') / max(len(words), 1),
        "semicolon_rate": text.count(';') / max(len(words), 1),
        "question_rate": text.count('?') / max(len(sentences), 1),
        "exclamation_rate": text.count('!') / max(len(sentences), 1),
    }
```
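
The feature dicts can be fed into any distance measure to compare texts. A minimal sketch using Euclidean distance over shared features (the values are invented for illustration; serious attribution work would normalize each feature, e.g. as z-scores, and use measures such as Burrows' Delta over larger samples):

```python
import math

def feature_distance(a: dict, b: dict) -> float:
    """Euclidean distance over the features two texts share."""
    shared = a.keys() & b.keys()
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))

# Feature dicts as produced by extract_style_features (values invented)
author_sample = {"avg_sentence_length": 21.4, "avg_word_length": 4.9,
                 "vocabulary_richness": 0.48, "comma_rate": 0.09}
disputed_text = {"avg_sentence_length": 20.1, "avg_word_length": 4.8,
                 "vocabulary_richness": 0.51, "comma_rate": 0.08}

d = feature_distance(author_sample, disputed_text)
```

Note that unnormalized features on different scales (sentence length vs. rates) let the largest-magnitude feature dominate the distance, which is why standardization matters in practice.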

### Sentiment Analysis for Historical Texts

```python
# Note: Modern NLP sentiment tools are trained on contemporary text.
# For historical texts, consider:
# 1. Historical sentiment lexicons (e.g., NRC Emotion Lexicon adapted)
# 2. Period-specific word lists
# 3. Manual validation on a sample before scaling

from transformers import pipeline

# Modern text sentiment (use with caution on historical texts)
sentiment = pipeline("sentiment-analysis")
result = sentiment("It was the best of times, it was the worst of times.")

# Better: keyword-based approach with custom lexicons
POSITIVE = {"virtue", "honor", "glory", "triumph", "beauty", "noble"}
NEGATIVE = {"vice", "shame", "defeat", "ruin", "wretched", "base"}

def lexicon_sentiment(tokens: list, pos: set, neg: set) -> float:
    """Simple lexicon-based sentiment score."""
    pos_count = sum(1 for t in tokens if t in pos)
    neg_count = sum(1 for t in tokens if t in neg)
    total = pos_count + neg_count
    if total == 0:
        return 0.0
    return (pos_count - neg_count) / total
```
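
A common DH use of such a lexicon score is tracing sentiment across a narrative with overlapping windows. A stdlib sketch (the token list, lexicons, and tiny window size are illustrative; real corpora typically use windows of hundreds of words):

```python
def window_scores(tokens: list, pos: set, neg: set,
                  window: int = 4, step: int = 2) -> list:
    """Lexicon sentiment over overlapping token windows."""
    scores = []
    for start in range(0, max(len(tokens) - window, 0) + 1, step):
        chunk = tokens[start:start + window]
        p = sum(1 for t in chunk if t in pos)
        n = sum(1 for t in chunk if t in neg)
        scores.append(0.0 if p + n == 0 else (p - n) / (p + n))
    return scores

pos = {"noble", "triumph", "glory"}
neg = {"ruin", "shame", "wretched"}
tokens = ["noble", "glory", "then", "came", "ruin", "and", "shame", "at", "last"]
arc = window_scores(tokens, pos, neg)  # positive opening, negative close
```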

## Topic Modeling

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def run_topic_model(documents: list, n_topics: int = 10,
                    n_top_words: int = 15):
    """LDA topic modeling on a corpus of documents."""
    # Vectorize
    vectorizer = CountVectorizer(max_df=0.95, min_df=2,
                                 max_features=5000,
                                 stop_words="english")
    dtm = vectorizer.fit_transform(documents)
    feature_names = vectorizer.get_feature_names_out()

    # Fit LDA
    lda = LatentDirichletAllocation(n_components=n_topics,
                                    random_state=42,
                                    max_iter=50)
    lda.fit(dtm)

    # Display topics (highest-weight words first)
    topics = {}
    for topic_idx, topic in enumerate(lda.components_):
        top_words = [feature_names[i]
                     for i in topic.argsort()[-n_top_words:][::-1]]
        topics[f"Topic {topic_idx}"] = top_words
        print(f"Topic {topic_idx}: {', '.join(top_words)}")

    return lda, topics, dtm
```
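
`lda.transform(dtm)` returns one topic distribution per document, so labeling each document with its dominant topic is a row-wise argmax. A stdlib sketch over an invented distribution matrix:

```python
def dominant_topics(doc_topic: list) -> list:
    """Index of the highest-probability topic for each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]

# Illustrative output of lda.transform(dtm): 3 documents, 3 topics
doc_topic = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.25, 0.60, 0.15],
]
labels = dominant_topics(doc_topic)  # [0, 2, 1]
```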

## Network Analysis

### Social Network from Letters/Correspondence

```python
import networkx as nx

def build_correspondence_network(letters: list) -> nx.Graph:
    """Build a social network from letter metadata.

    Args:
        letters: list of dicts with 'sender', 'recipient', 'date', 'location'
    """
    G = nx.Graph()

    for letter in letters:
        sender = letter["sender"]
        recipient = letter["recipient"]

        if G.has_edge(sender, recipient):
            G[sender][recipient]["weight"] += 1
        else:
            G.add_edge(sender, recipient, weight=1)

    # Compute centrality measures
    centrality = nx.degree_centrality(G)
    betweenness = nx.betweenness_centrality(G)

    print(f"Network: {G.number_of_nodes()} individuals, "
          f"{G.number_of_edges()} connections")
    print(f"Most connected: {max(centrality, key=centrality.get)}")
    print(f"Most bridging: {max(betweenness, key=betweenness.get)}")

    return G
```
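
The edge-weighting step can be sanity-checked without networkx; this stdlib sketch mirrors what the function above does before centralities are computed (the letter metadata is invented):

```python
from collections import Counter

letters = [
    {"sender": "Voltaire", "recipient": "d'Alembert"},
    {"sender": "Voltaire", "recipient": "d'Alembert"},
    {"sender": "Diderot", "recipient": "Voltaire"},
]

# Undirected edge weights: sorting each pair merges A->B with B->A
edges = Counter(tuple(sorted((l["sender"], l["recipient"])))
                for l in letters)
```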

## Spatial Analysis (GIS for Humanities)

Common applications:
- Mapping historical events or migration patterns
- Georeferencing historical maps
- Spatial analysis of literary settings

```python
import folium

def create_historical_map(events: list, title: str = "Historical Events"):
    """Create an interactive map of historical events.

    Args:
        events: list of dicts with 'name', 'lat', 'lon', 'date', 'description'
    """
    center_lat = sum(e["lat"] for e in events) / len(events)
    center_lon = sum(e["lon"] for e in events) / len(events)

    m = folium.Map(location=[center_lat, center_lon], zoom_start=5)

    for event in events:
        popup = f"<b>{event['name']}</b><br>{event['date']}<br>{event['description']}"
        folium.Marker(
            location=[event["lat"], event["lon"]],
            popup=popup,
            tooltip=event["name"]
        ).add_to(m)

    m.save(f"{title.replace(' ', '_').lower()}.html")
    return m
```

## Key Data Sources for DH

| Source | Content | Access |
|--------|---------|--------|
| **Project Gutenberg** | 70,000+ free ebooks | gutenberg.org |
| **HathiTrust** | 17M+ digitized volumes | hathitrust.org |
| **Internet Archive** | Books, media, web archives | archive.org |
| **EEBO / ECCO** | Early English books (1475-1800) | Institutional |
| **Perseus Digital Library** | Greek and Latin classics | perseus.tufts.edu |
| **Europeana** | European cultural heritage | europeana.eu |
| **DPLA** | US digital public library | dp.la |
| **Old Bailey Online** | London criminal trials (1674-1913) | oldbaileyonline.org |

## Methodological Considerations

1. **Close reading still matters**: Computational methods reveal patterns; interpretation requires humanistic expertise
2. **Corpus bias**: Digitized collections over-represent certain periods, languages, and genres
3. **OCR quality**: Historical texts often have high OCR error rates — validate before analysis
4. **Anachronism**: Modern NLP tools may misinterpret historical language use
5. **Interdisciplinary collaboration**: The best DH work pairs domain expertise with technical skills

## References

- Moretti, F. (2013). *Distant Reading*. Verso.
- Jockers, M. L. (2013). *Macroanalysis: Digital Methods and Literary History*. University of Illinois Press.
- [Programming Historian](https://programminghistorian.org/) — Free tutorials for DH methods
- [DH Tools Directory](https://dirtdirectory.org/)