@wentorai/research-plugins 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +204 -0
- package/curated/analysis/README.md +64 -0
- package/curated/domains/README.md +104 -0
- package/curated/literature/README.md +53 -0
- package/curated/research/README.md +62 -0
- package/curated/tools/README.md +87 -0
- package/curated/writing/README.md +61 -0
- package/index.ts +39 -0
- package/mcp-configs/academic-db/ChatSpatial.json +17 -0
- package/mcp-configs/academic-db/academia-mcp.json +17 -0
- package/mcp-configs/academic-db/academic-paper-explorer.json +17 -0
- package/mcp-configs/academic-db/academic-search-mcp-server.json +17 -0
- package/mcp-configs/academic-db/agentinterviews-mcp.json +17 -0
- package/mcp-configs/academic-db/all-in-mcp.json +17 -0
- package/mcp-configs/academic-db/apple-health-mcp.json +17 -0
- package/mcp-configs/academic-db/arxiv-latex-mcp.json +17 -0
- package/mcp-configs/academic-db/arxiv-mcp-server.json +17 -0
- package/mcp-configs/academic-db/bgpt-mcp.json +17 -0
- package/mcp-configs/academic-db/biomcp.json +17 -0
- package/mcp-configs/academic-db/biothings-mcp.json +17 -0
- package/mcp-configs/academic-db/catalysishub-mcp-server.json +17 -0
- package/mcp-configs/academic-db/clinicaltrialsgov-mcp-server.json +17 -0
- package/mcp-configs/academic-db/deep-research-mcp.json +17 -0
- package/mcp-configs/academic-db/dicom-mcp.json +17 -0
- package/mcp-configs/academic-db/enrichr-mcp-server.json +17 -0
- package/mcp-configs/academic-db/fec-mcp-server.json +17 -0
- package/mcp-configs/academic-db/fhir-mcp-server-themomentum.json +17 -0
- package/mcp-configs/academic-db/fhir-mcp.json +19 -0
- package/mcp-configs/academic-db/gget-mcp.json +17 -0
- package/mcp-configs/academic-db/google-researcher-mcp.json +17 -0
- package/mcp-configs/academic-db/idea-reality-mcp.json +17 -0
- package/mcp-configs/academic-db/legiscan-mcp.json +19 -0
- package/mcp-configs/academic-db/lex.json +17 -0
- package/mcp-configs/ai-platform/Adaptive-Graph-of-Thoughts-MCP-server.json +17 -0
- package/mcp-configs/ai-platform/ai-counsel.json +17 -0
- package/mcp-configs/ai-platform/atlas-mcp-server.json +17 -0
- package/mcp-configs/ai-platform/counsel-mcp.json +17 -0
- package/mcp-configs/ai-platform/cross-llm-mcp.json +17 -0
- package/mcp-configs/ai-platform/gptr-mcp.json +17 -0
- package/mcp-configs/browser/decipher-research-agent.json +17 -0
- package/mcp-configs/browser/deep-research.json +17 -0
- package/mcp-configs/browser/everything-claude-code.json +17 -0
- package/mcp-configs/browser/gpt-researcher.json +17 -0
- package/mcp-configs/browser/heurist-agent-framework.json +17 -0
- package/mcp-configs/data-platform/4everland-hosting-mcp.json +17 -0
- package/mcp-configs/data-platform/context-keeper.json +17 -0
- package/mcp-configs/data-platform/context7.json +19 -0
- package/mcp-configs/data-platform/contextstream-mcp.json +17 -0
- package/mcp-configs/data-platform/email-mcp.json +17 -0
- package/mcp-configs/note-knowledge/ApeRAG.json +17 -0
- package/mcp-configs/note-knowledge/In-Memoria.json +17 -0
- package/mcp-configs/note-knowledge/agent-memory.json +17 -0
- package/mcp-configs/note-knowledge/aimemo.json +17 -0
- package/mcp-configs/note-knowledge/biel-mcp.json +19 -0
- package/mcp-configs/note-knowledge/cognee.json +17 -0
- package/mcp-configs/note-knowledge/context-awesome.json +17 -0
- package/mcp-configs/note-knowledge/context-mcp.json +17 -0
- package/mcp-configs/note-knowledge/conversation-handoff-mcp.json +17 -0
- package/mcp-configs/note-knowledge/cortex.json +17 -0
- package/mcp-configs/note-knowledge/devrag.json +17 -0
- package/mcp-configs/note-knowledge/easy-obsidian-mcp.json +17 -0
- package/mcp-configs/note-knowledge/engram.json +17 -0
- package/mcp-configs/note-knowledge/gnosis-mcp.json +17 -0
- package/mcp-configs/note-knowledge/graphlit-mcp-server.json +19 -0
- package/mcp-configs/reference-mgr/arxiv-cli.json +17 -0
- package/mcp-configs/reference-mgr/arxiv-search-mcp.json +17 -0
- package/mcp-configs/reference-mgr/chiken.json +17 -0
- package/mcp-configs/reference-mgr/claude-scholar.json +17 -0
- package/mcp-configs/reference-mgr/devonthink-mcp.json +17 -0
- package/mcp-configs/registry.json +447 -0
- package/openclaw.plugin.json +21 -0
- package/package.json +61 -0
- package/skills/analysis/dataviz/color-accessibility-guide/SKILL.md +230 -0
- package/skills/analysis/dataviz/geospatial-viz-guide/SKILL.md +218 -0
- package/skills/analysis/dataviz/interactive-viz-guide/SKILL.md +287 -0
- package/skills/analysis/dataviz/network-visualization-guide/SKILL.md +195 -0
- package/skills/analysis/dataviz/publication-figures-guide/SKILL.md +238 -0
- package/skills/analysis/dataviz/python-dataviz-guide/SKILL.md +195 -0
- package/skills/analysis/econometrics/causal-inference-guide/SKILL.md +197 -0
- package/skills/analysis/econometrics/iv-regression-guide/SKILL.md +198 -0
- package/skills/analysis/econometrics/panel-data-guide/SKILL.md +274 -0
- package/skills/analysis/econometrics/robustness-checks/SKILL.md +250 -0
- package/skills/analysis/econometrics/stata-regression/SKILL.md +117 -0
- package/skills/analysis/econometrics/time-series-guide/SKILL.md +235 -0
- package/skills/analysis/statistics/bayesian-statistics-guide/SKILL.md +221 -0
- package/skills/analysis/statistics/hypothesis-testing-guide/SKILL.md +210 -0
- package/skills/analysis/statistics/meta-analysis-guide/SKILL.md +206 -0
- package/skills/analysis/statistics/nonparametric-tests-guide/SKILL.md +221 -0
- package/skills/analysis/statistics/power-analysis-guide/SKILL.md +240 -0
- package/skills/analysis/statistics/sem-guide/SKILL.md +231 -0
- package/skills/analysis/statistics/survival-analysis-guide/SKILL.md +195 -0
- package/skills/analysis/wrangling/missing-data-handling/SKILL.md +224 -0
- package/skills/analysis/wrangling/pandas-data-wrangling/SKILL.md +242 -0
- package/skills/analysis/wrangling/questionnaire-design-guide/SKILL.md +234 -0
- package/skills/analysis/wrangling/text-mining-guide/SKILL.md +225 -0
- package/skills/domains/ai-ml/computer-vision-guide/SKILL.md +213 -0
- package/skills/domains/ai-ml/deep-learning-papers-guide/SKILL.md +200 -0
- package/skills/domains/ai-ml/llm-evaluation-guide/SKILL.md +194 -0
- package/skills/domains/ai-ml/prompt-engineering-research/SKILL.md +233 -0
- package/skills/domains/ai-ml/reinforcement-learning-guide/SKILL.md +254 -0
- package/skills/domains/ai-ml/transformer-architecture-guide/SKILL.md +233 -0
- package/skills/domains/biomedical/clinical-research-guide/SKILL.md +232 -0
- package/skills/domains/biomedical/clinicaltrials-api/SKILL.md +177 -0
- package/skills/domains/biomedical/epidemiology-guide/SKILL.md +200 -0
- package/skills/domains/biomedical/genomics-analysis-guide/SKILL.md +270 -0
- package/skills/domains/business/market-analysis-guide/SKILL.md +112 -0
- package/skills/domains/business/strategic-management-guide/SKILL.md +154 -0
- package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +266 -0
- package/skills/domains/chemistry/retrosynthesis-guide/SKILL.md +215 -0
- package/skills/domains/cs/algorithms-complexity-guide/SKILL.md +194 -0
- package/skills/domains/cs/dblp-api/SKILL.md +129 -0
- package/skills/domains/cs/software-engineering-research/SKILL.md +218 -0
- package/skills/domains/ecology/biodiversity-data-guide/SKILL.md +296 -0
- package/skills/domains/ecology/conservation-biology-guide/SKILL.md +198 -0
- package/skills/domains/ecology/gbif-api/SKILL.md +158 -0
- package/skills/domains/ecology/inaturalist-api/SKILL.md +173 -0
- package/skills/domains/economics/behavioral-economics-guide/SKILL.md +239 -0
- package/skills/domains/economics/development-economics-guide/SKILL.md +181 -0
- package/skills/domains/economics/fred-api/SKILL.md +189 -0
- package/skills/domains/education/curriculum-design-guide/SKILL.md +144 -0
- package/skills/domains/education/learning-science-guide/SKILL.md +150 -0
- package/skills/domains/finance/financial-data-analysis/SKILL.md +152 -0
- package/skills/domains/finance/quantitative-finance-guide/SKILL.md +151 -0
- package/skills/domains/geoscience/climate-science-guide/SKILL.md +158 -0
- package/skills/domains/geoscience/gis-remote-sensing-guide/SKILL.md +129 -0
- package/skills/domains/humanities/digital-humanities-guide/SKILL.md +181 -0
- package/skills/domains/humanities/philosophy-research-guide/SKILL.md +148 -0
- package/skills/domains/law/courtlistener-api/SKILL.md +213 -0
- package/skills/domains/law/legal-research-guide/SKILL.md +250 -0
- package/skills/domains/math/linear-algebra-applications/SKILL.md +227 -0
- package/skills/domains/math/numerical-methods-guide/SKILL.md +236 -0
- package/skills/domains/math/oeis-api/SKILL.md +158 -0
- package/skills/domains/pharma/clinical-pharmacology-guide/SKILL.md +165 -0
- package/skills/domains/pharma/drug-development-guide/SKILL.md +177 -0
- package/skills/domains/physics/computational-physics-guide/SKILL.md +300 -0
- package/skills/domains/physics/nasa-ads-api/SKILL.md +150 -0
- package/skills/domains/physics/quantum-computing-guide/SKILL.md +234 -0
- package/skills/domains/social-science/social-research-methods/SKILL.md +194 -0
- package/skills/domains/social-science/survey-research-guide/SKILL.md +182 -0
- package/skills/literature/discovery/citation-alert-guide/SKILL.md +154 -0
- package/skills/literature/discovery/conference-proceedings-guide/SKILL.md +142 -0
- package/skills/literature/discovery/literature-mapping-guide/SKILL.md +175 -0
- package/skills/literature/discovery/paper-tracking-guide/SKILL.md +211 -0
- package/skills/literature/discovery/rss-paper-feeds/SKILL.md +214 -0
- package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md +164 -0
- package/skills/literature/fulltext/doaj-api/SKILL.md +120 -0
- package/skills/literature/fulltext/interlibrary-loan-guide/SKILL.md +163 -0
- package/skills/literature/fulltext/open-access-guide/SKILL.md +183 -0
- package/skills/literature/fulltext/pmc-oai-api/SKILL.md +184 -0
- package/skills/literature/fulltext/preprint-servers-guide/SKILL.md +128 -0
- package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +207 -0
- package/skills/literature/fulltext/unpaywall-api/SKILL.md +113 -0
- package/skills/literature/metadata/altmetrics-guide/SKILL.md +132 -0
- package/skills/literature/metadata/citation-network-guide/SKILL.md +236 -0
- package/skills/literature/metadata/crossref-api/SKILL.md +133 -0
- package/skills/literature/metadata/datacite-api/SKILL.md +126 -0
- package/skills/literature/metadata/doi-resolution-guide/SKILL.md +168 -0
- package/skills/literature/metadata/h-index-guide/SKILL.md +183 -0
- package/skills/literature/metadata/journal-metrics-guide/SKILL.md +188 -0
- package/skills/literature/metadata/opencitations-api/SKILL.md +128 -0
- package/skills/literature/metadata/orcid-api/SKILL.md +136 -0
- package/skills/literature/metadata/orcid-integration-guide/SKILL.md +178 -0
- package/skills/literature/search/arxiv-api/SKILL.md +95 -0
- package/skills/literature/search/biorxiv-api/SKILL.md +123 -0
- package/skills/literature/search/boolean-search-guide/SKILL.md +199 -0
- package/skills/literature/search/citation-chaining-guide/SKILL.md +148 -0
- package/skills/literature/search/database-comparison-guide/SKILL.md +100 -0
- package/skills/literature/search/europe-pmc-api/SKILL.md +120 -0
- package/skills/literature/search/google-scholar-guide/SKILL.md +182 -0
- package/skills/literature/search/mesh-terms-guide/SKILL.md +164 -0
- package/skills/literature/search/openalex-api/SKILL.md +134 -0
- package/skills/literature/search/pubmed-api/SKILL.md +130 -0
- package/skills/literature/search/scientify-literature-survey/SKILL.md +203 -0
- package/skills/literature/search/semantic-scholar-api/SKILL.md +134 -0
- package/skills/literature/search/systematic-search-strategy/SKILL.md +214 -0
- package/skills/research/automation/ai-scientist-guide/SKILL.md +228 -0
- package/skills/research/automation/data-collection-automation/SKILL.md +248 -0
- package/skills/research/automation/research-workflow-automation/SKILL.md +266 -0
- package/skills/research/deep-research/meta-synthesis-guide/SKILL.md +174 -0
- package/skills/research/deep-research/research-cog/SKILL.md +153 -0
- package/skills/research/deep-research/scoping-review-guide/SKILL.md +217 -0
- package/skills/research/deep-research/systematic-review-guide/SKILL.md +250 -0
- package/skills/research/funding/figshare-api/SKILL.md +163 -0
- package/skills/research/funding/grant-writing-guide/SKILL.md +233 -0
- package/skills/research/funding/nsf-grant-guide/SKILL.md +206 -0
- package/skills/research/funding/open-science-guide/SKILL.md +255 -0
- package/skills/research/funding/zenodo-api/SKILL.md +174 -0
- package/skills/research/methodology/action-research-guide/SKILL.md +201 -0
- package/skills/research/methodology/experimental-design-guide/SKILL.md +236 -0
- package/skills/research/methodology/grad-school-guide/SKILL.md +182 -0
- package/skills/research/methodology/grounded-theory-guide/SKILL.md +171 -0
- package/skills/research/methodology/mixed-methods-guide/SKILL.md +208 -0
- package/skills/research/methodology/qualitative-research-guide/SKILL.md +234 -0
- package/skills/research/methodology/scientify-idea-generation/SKILL.md +222 -0
- package/skills/research/paper-review/paper-reading-assistant/SKILL.md +266 -0
- package/skills/research/paper-review/peer-review-guide/SKILL.md +227 -0
- package/skills/research/paper-review/rebuttal-writing-guide/SKILL.md +185 -0
- package/skills/research/paper-review/scientify-write-review-paper/SKILL.md +209 -0
- package/skills/tools/code-exec/jupyter-notebook-guide/SKILL.md +178 -0
- package/skills/tools/code-exec/python-reproducibility-guide/SKILL.md +341 -0
- package/skills/tools/code-exec/r-reproducibility-guide/SKILL.md +236 -0
- package/skills/tools/code-exec/sandbox-execution-guide/SKILL.md +221 -0
- package/skills/tools/diagram/mermaid-diagram-guide/SKILL.md +269 -0
- package/skills/tools/diagram/plantuml-guide/SKILL.md +397 -0
- package/skills/tools/diagram/scientific-illustration-guide/SKILL.md +225 -0
- package/skills/tools/document/anystyle-api/SKILL.md +199 -0
- package/skills/tools/document/grobid-pdf-parsing/SKILL.md +294 -0
- package/skills/tools/document/markdown-academic-guide/SKILL.md +217 -0
- package/skills/tools/document/pdf-extraction-guide/SKILL.md +321 -0
- package/skills/tools/knowledge-graph/knowledge-graph-construction/SKILL.md +306 -0
- package/skills/tools/knowledge-graph/ontology-design-guide/SKILL.md +214 -0
- package/skills/tools/knowledge-graph/rag-methodology-guide/SKILL.md +325 -0
- package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +367 -0
- package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +211 -0
- package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +204 -0
- package/skills/tools/ocr-translate/multilingual-research-guide/SKILL.md +234 -0
- package/skills/tools/scraping/academic-web-scraping/SKILL.md +326 -0
- package/skills/tools/scraping/api-data-collection-guide/SKILL.md +301 -0
- package/skills/tools/scraping/web-scraping-ethics-guide/SKILL.md +250 -0
- package/skills/writing/citation/bibtex-management-guide/SKILL.md +246 -0
- package/skills/writing/citation/citation-style-guide/SKILL.md +248 -0
- package/skills/writing/citation/reference-manager-comparison/SKILL.md +208 -0
- package/skills/writing/citation/zotero-api/SKILL.md +188 -0
- package/skills/writing/composition/abstract-writing-guide/SKILL.md +188 -0
- package/skills/writing/composition/discussion-writing-guide/SKILL.md +194 -0
- package/skills/writing/composition/introduction-writing-guide/SKILL.md +194 -0
- package/skills/writing/composition/literature-review-writing/SKILL.md +196 -0
- package/skills/writing/composition/methods-section-guide/SKILL.md +185 -0
- package/skills/writing/composition/response-to-reviewers/SKILL.md +215 -0
- package/skills/writing/composition/scientific-writing-guide/SKILL.md +152 -0
- package/skills/writing/latex/bibliography-management-guide/SKILL.md +206 -0
- package/skills/writing/latex/latex-drawing-guide/SKILL.md +234 -0
- package/skills/writing/latex/latex-ecosystem-guide/SKILL.md +240 -0
- package/skills/writing/latex/math-typesetting-guide/SKILL.md +231 -0
- package/skills/writing/latex/overleaf-collaboration-guide/SKILL.md +211 -0
- package/skills/writing/latex/tikz-diagrams-guide/SKILL.md +211 -0
- package/skills/writing/polish/academic-translation-guide/SKILL.md +175 -0
- package/skills/writing/polish/academic-writing-refiner/SKILL.md +143 -0
- package/skills/writing/polish/ai-writing-humanizer/SKILL.md +178 -0
- package/skills/writing/polish/grammar-checker-guide/SKILL.md +184 -0
- package/skills/writing/polish/plagiarism-detection-guide/SKILL.md +167 -0
- package/skills/writing/templates/beamer-presentation-guide/SKILL.md +263 -0
- package/skills/writing/templates/conference-paper-template/SKILL.md +219 -0
- package/skills/writing/templates/thesis-template-guide/SKILL.md +200 -0
- package/skills/writing/templates/thesis-writing-guide/SKILL.md +220 -0
- package/src/tools/arxiv.ts +131 -0
- package/src/tools/crossref.ts +112 -0
- package/src/tools/openalex.ts +174 -0
- package/src/tools/pubmed.ts +166 -0
- package/src/tools/semantic-scholar.ts +108 -0
- package/src/tools/unpaywall.ts +58 -0
|
@@ -0,0 +1,151 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: quantitative-finance-guide
|
|
3
|
+
description: "Quantitative methods for financial modeling, derivatives pricing, and risk an..."
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "bar_chart"
|
|
7
|
+
category: "domains"
|
|
8
|
+
subcategory: "finance"
|
|
9
|
+
keywords: ["quantitative finance", "financial data", "stock analysis", "pricing psychology", "derivatives pricing"]
|
|
10
|
+
source: "wentor"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Quantitative Finance Guide
|
|
14
|
+
|
|
15
|
+
A rigorous skill for applying quantitative methods to financial research, covering derivatives pricing, portfolio optimization, risk modeling, and time series econometrics. Designed for academic researchers and quantitative analysts.
|
|
16
|
+
|
|
17
|
+
## Derivatives Pricing
|
|
18
|
+
|
|
19
|
+
### Black-Scholes-Merton Model
|
|
20
|
+
|
|
21
|
+
The foundational model for European option pricing:
|
|
22
|
+
|
|
23
|
+
```python
|
|
24
|
+
import numpy as np
|
|
25
|
+
from scipy.stats import norm
|
|
26
|
+
|
|
27
|
+
def black_scholes(S: float, K: float, T: float, r: float,
|
|
28
|
+
sigma: float, option_type: str = 'call') -> dict:
|
|
29
|
+
"""
|
|
30
|
+
Black-Scholes European option pricing.
|
|
31
|
+
|
|
32
|
+
Args:
|
|
33
|
+
S: Current stock price
|
|
34
|
+
K: Strike price
|
|
35
|
+
T: Time to maturity (years)
|
|
36
|
+
r: Risk-free rate (annualized)
|
|
37
|
+
sigma: Volatility (annualized)
|
|
38
|
+
option_type: 'call' or 'put'
|
|
39
|
+
"""
|
|
40
|
+
d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
|
|
41
|
+
d2 = d1 - sigma * np.sqrt(T)
|
|
42
|
+
|
|
43
|
+
if option_type == 'call':
|
|
44
|
+
price = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
|
|
45
|
+
else:
|
|
46
|
+
price = K * np.exp(-r * T) * norm.cdf(-d2) - S * norm.cdf(-d1)
|
|
47
|
+
|
|
48
|
+
greeks = {
|
|
49
|
+
'delta': norm.cdf(d1) if option_type == 'call' else norm.cdf(d1) - 1,
|
|
50
|
+
'gamma': norm.pdf(d1) / (S * sigma * np.sqrt(T)),
|
|
51
|
+
'theta': -(S * norm.pdf(d1) * sigma) / (2 * np.sqrt(T)),
|
|
52
|
+
'vega': S * norm.pdf(d1) * np.sqrt(T),
|
|
53
|
+
'rho': K * T * np.exp(-r * T) * norm.cdf(d2) if option_type == 'call'
|
|
54
|
+
else -K * T * np.exp(-r * T) * norm.cdf(-d2)
|
|
55
|
+
}
|
|
56
|
+
return {'price': price, 'greeks': greeks}
|
|
57
|
+
|
|
58
|
+
# Example: price a call option
|
|
59
|
+
result = black_scholes(S=100, K=105, T=0.5, r=0.05, sigma=0.20, option_type='call')
|
|
60
|
+
print(f"Call Price: ${result['price']:.2f}")
|
|
61
|
+
print(f"Delta: {result['greeks']['delta']:.4f}")
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
### Monte Carlo Simulation
|
|
65
|
+
|
|
66
|
+
For path-dependent options and complex payoffs:
|
|
67
|
+
|
|
68
|
+
```python
|
|
69
|
+
def monte_carlo_option(S0, K, T, r, sigma, n_paths=100000, n_steps=252):
|
|
70
|
+
"""Geometric Brownian Motion Monte Carlo pricer."""
|
|
71
|
+
dt = T / n_steps
|
|
72
|
+
Z = np.random.standard_normal((n_paths, n_steps))
|
|
73
|
+
paths = np.zeros((n_paths, n_steps + 1))
|
|
74
|
+
paths[:, 0] = S0
|
|
75
|
+
|
|
76
|
+
for t in range(n_steps):
|
|
77
|
+
paths[:, t + 1] = paths[:, t] * np.exp(
|
|
78
|
+
(r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z[:, t]
|
|
79
|
+
)
|
|
80
|
+
|
|
81
|
+
payoffs = np.maximum(paths[:, -1] - K, 0)
|
|
82
|
+
price = np.exp(-r * T) * np.mean(payoffs)
|
|
83
|
+
std_err = np.exp(-r * T) * np.std(payoffs) / np.sqrt(n_paths)
|
|
84
|
+
return {'price': price, 'std_error': std_err, '95_ci': (price - 1.96*std_err, price + 1.96*std_err)}
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
## Portfolio Optimization
|
|
88
|
+
|
|
89
|
+
### Mean-Variance Optimization (Markowitz)
|
|
90
|
+
|
|
91
|
+
Construct efficient frontiers using quadratic programming:
|
|
92
|
+
|
|
93
|
+
```python
|
|
94
|
+
from scipy.optimize import minimize
|
|
95
|
+
|
|
96
|
+
def efficient_frontier(returns: np.ndarray, n_portfolios: int = 50) -> list:
|
|
97
|
+
"""
|
|
98
|
+
Compute efficient frontier points.
|
|
99
|
+
returns: T x N array of asset returns
|
|
100
|
+
"""
|
|
101
|
+
n_assets = returns.shape[1]
|
|
102
|
+
mean_returns = returns.mean(axis=0)
|
|
103
|
+
cov_matrix = np.cov(returns.T)
|
|
104
|
+
|
|
105
|
+
results = []
|
|
106
|
+
target_returns = np.linspace(mean_returns.min(), mean_returns.max(), n_portfolios)
|
|
107
|
+
|
|
108
|
+
for target in target_returns:
|
|
109
|
+
constraints = [
|
|
110
|
+
{'type': 'eq', 'fun': lambda w: np.sum(w) - 1},
|
|
111
|
+
{'type': 'eq', 'fun': lambda w, t=target: w @ mean_returns - t}
|
|
112
|
+
]
|
|
113
|
+
bounds = [(0, 1)] * n_assets
|
|
114
|
+
w0 = np.ones(n_assets) / n_assets
|
|
115
|
+
|
|
116
|
+
result = minimize(lambda w: w @ cov_matrix @ w, w0,
|
|
117
|
+
bounds=bounds, constraints=constraints, method='SLSQP')
|
|
118
|
+
if result.success:
|
|
119
|
+
vol = np.sqrt(result.fun)
|
|
120
|
+
results.append({'return': target, 'volatility': vol, 'weights': result.x})
|
|
121
|
+
return results
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
## Risk Management
|
|
125
|
+
|
|
126
|
+
### Value at Risk (VaR) and Expected Shortfall
|
|
127
|
+
|
|
128
|
+
Three approaches to VaR estimation:
|
|
129
|
+
|
|
130
|
+
1. **Historical Simulation**: Non-parametric, uses actual return distribution
|
|
131
|
+
2. **Variance-Covariance (Parametric)**: Assumes normal distribution, fast computation
|
|
132
|
+
3. **Monte Carlo VaR**: Most flexible, handles non-linear instruments
|
|
133
|
+
|
|
134
|
+
```python
|
|
135
|
+
def compute_var_es(returns: np.ndarray, confidence: float = 0.95) -> dict:
|
|
136
|
+
"""Compute VaR and Expected Shortfall (CVaR)."""
|
|
137
|
+
sorted_returns = np.sort(returns)
|
|
138
|
+
var_index = int((1 - confidence) * len(sorted_returns))
|
|
139
|
+
var = -sorted_returns[var_index]
|
|
140
|
+
es = -sorted_returns[:var_index].mean()
|
|
141
|
+
return {'VaR': var, 'ES': es, 'confidence': confidence}
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
## Time Series Econometrics
|
|
145
|
+
|
|
146
|
+
For financial time series, test for stationarity (ADF test), model volatility clustering with GARCH models, and check for cointegration in pairs trading strategies. Always report Newey-West standard errors when autocorrelation is present, and use information criteria (AIC, BIC) for model selection.
|
|
147
|
+
|
|
148
|
+
## References
|
|
149
|
+
|
|
150
|
+
- Hull, J. C. (2022). *Options, Futures, and Other Derivatives* (11th ed.). Pearson.
|
|
151
|
+
- Markowitz, H. (1952). Portfolio Selection. *Journal of Finance*, 7(1), 77-91.
|
|
@@ -0,0 +1,158 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: climate-science-guide
|
|
3
|
+
description: "Climate data analysis, modeling workflows, and carbon neutrality research met..."
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "cloud"
|
|
7
|
+
category: "domains"
|
|
8
|
+
subcategory: "geoscience"
|
|
9
|
+
keywords: ["climate change", "carbon neutrality", "atmospheric science", "climatology", "climate modeling"]
|
|
10
|
+
source: "wentor"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Climate Science Guide
|
|
14
|
+
|
|
15
|
+
A research skill for analyzing climate data, working with climate model outputs, and conducting carbon-related studies. Covers data sources, standard analytical workflows, and visualization techniques used in climate science publications.
|
|
16
|
+
|
|
17
|
+
## Climate Data Sources
|
|
18
|
+
|
|
19
|
+
### Observational Datasets
|
|
20
|
+
|
|
21
|
+
| Dataset | Variables | Resolution | Period | Source |
|
|
22
|
+
|---------|-----------|-----------|--------|--------|
|
|
23
|
+
| ERA5 | Temperature, precipitation, wind, etc. | 0.25 deg, hourly | 1940-present | ECMWF/Copernicus |
|
|
24
|
+
| GPCP | Precipitation | 2.5 deg, monthly | 1979-present | NASA |
|
|
25
|
+
| HadCRUT5 | Surface temperature anomaly | 5 deg, monthly | 1850-present | Met Office |
|
|
26
|
+
| NOAA GHCN | Station temperature, precipitation | Point data | 1850-present | NOAA |
|
|
27
|
+
| CRU TS | Temperature, precipitation, vapor pressure | 0.5 deg, monthly | 1901-present | UEA CRU |
|
|
28
|
+
|
|
29
|
+
### CMIP6 Model Outputs
|
|
30
|
+
|
|
31
|
+
```python
|
|
32
|
+
import xarray as xr
|
|
33
|
+
|
|
34
|
+
def load_cmip6_data(model: str, experiment: str, variable: str,
|
|
35
|
+
member: str = 'r1i1p1f1') -> xr.Dataset:
|
|
36
|
+
"""
|
|
37
|
+
Load CMIP6 model output from a local or cloud archive.
|
|
38
|
+
|
|
39
|
+
Args:
|
|
40
|
+
model: Model name (e.g., 'CESM2', 'UKESM1-0-LL')
|
|
41
|
+
experiment: SSP scenario (e.g., 'ssp245', 'ssp585', 'historical')
|
|
42
|
+
variable: Variable name (e.g., 'tas', 'pr', 'tos')
|
|
43
|
+
member: Ensemble member ID
|
|
44
|
+
"""
|
|
45
|
+
# Using Pangeo cloud catalog
|
|
46
|
+
import intake
|
|
47
|
+
catalog = intake.open_esm_datastore(
|
|
48
|
+
"https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
|
|
49
|
+
)
|
|
50
|
+
query = catalog.search(
|
|
51
|
+
source_id=model,
|
|
52
|
+
experiment_id=experiment,
|
|
53
|
+
variable_id=variable,
|
|
54
|
+
member_id=member,
|
|
55
|
+
table_id='Amon' # Monthly atmospheric data
|
|
56
|
+
)
|
|
57
|
+
ds = query.to_dataset_dict(zarr_kwargs={'consolidated': True})
|
|
58
|
+
key = list(ds.keys())[0]
|
|
59
|
+
return ds[key]
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
## Temperature Trend Analysis
|
|
63
|
+
|
|
64
|
+
### Computing Global Mean Temperature Anomaly
|
|
65
|
+
|
|
66
|
+
```python
|
|
67
|
+
import numpy as np
|
|
68
|
+
|
|
69
|
+
def compute_global_mean_anomaly(ds: xr.Dataset, var: str = 'tas',
|
|
70
|
+
baseline: tuple = (1850, 1900)) -> xr.DataArray:
|
|
71
|
+
"""
|
|
72
|
+
Compute area-weighted global mean temperature anomaly
|
|
73
|
+
relative to a baseline period.
|
|
74
|
+
"""
|
|
75
|
+
# Area weighting by latitude
|
|
76
|
+
weights = np.cos(np.deg2rad(ds.lat))
|
|
77
|
+
weights = weights / weights.sum()
|
|
78
|
+
|
|
79
|
+
# Global mean
|
|
80
|
+
global_mean = ds[var].weighted(weights).mean(dim=['lat', 'lon'])
|
|
81
|
+
|
|
82
|
+
# Baseline climatology
|
|
83
|
+
baseline_mean = global_mean.sel(
|
|
84
|
+
time=slice(str(baseline[0]), str(baseline[1]))
|
|
85
|
+
).mean('time')
|
|
86
|
+
|
|
87
|
+
anomaly = global_mean - baseline_mean
|
|
88
|
+
return anomaly
|
|
89
|
+
|
|
90
|
+
# Usage
|
|
91
|
+
# anomaly = compute_global_mean_anomaly(historical_ds)
|
|
92
|
+
# anomaly.plot() # produces a time series of temperature anomaly
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
## Carbon Budget Analysis
|
|
96
|
+
|
|
97
|
+
### Emissions and Remaining Budget
|
|
98
|
+
|
|
99
|
+
Track cumulative CO2 emissions against the remaining carbon budget for temperature targets:
|
|
100
|
+
|
|
101
|
+
```python
|
|
102
|
+
def carbon_budget_tracker(cumulative_emissions_gtco2: float,
|
|
103
|
+
target_warming: float = 1.5) -> dict:
|
|
104
|
+
"""
|
|
105
|
+
Estimate remaining carbon budget.
|
|
106
|
+
Based on IPCC AR6 estimates.
|
|
107
|
+
"""
|
|
108
|
+
# IPCC AR6 remaining budget from 2020 (GtCO2)
|
|
109
|
+
budgets = {
|
|
110
|
+
1.5: {'50pct': 500, '67pct': 400, '83pct': 300},
|
|
111
|
+
2.0: {'50pct': 1350, '67pct': 1150, '83pct': 900}
|
|
112
|
+
}
|
|
113
|
+
budget = budgets[target_warming]
|
|
114
|
+
remaining = {prob: val - cumulative_emissions_gtco2
|
|
115
|
+
for prob, val in budget.items()}
|
|
116
|
+
# At ~40 GtCO2/year current rate
|
|
117
|
+
years_left = {prob: max(0, val / 40) for prob, val in remaining.items()}
|
|
118
|
+
return {'remaining_budget_GtCO2': remaining, 'years_at_current_rate': years_left}
|
|
119
|
+
|
|
120
|
+
result = carbon_budget_tracker(cumulative_emissions_gtco2=200, target_warming=1.5)
|
|
121
|
+
print(result)
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
## Climate Visualization
|
|
125
|
+
|
|
126
|
+
### Spatial Maps with Cartopy
|
|
127
|
+
|
|
128
|
+
```python
|
|
129
|
+
import matplotlib.pyplot as plt
|
|
130
|
+
import cartopy.crs as ccrs
|
|
131
|
+
|
|
132
|
+
def plot_climate_map(data: xr.DataArray, title: str,
|
|
133
|
+
cmap: str = 'RdBu_r', vmin: float = None,
|
|
134
|
+
vmax: float = None):
|
|
135
|
+
"""Publication-quality climate map."""
|
|
136
|
+
fig = plt.figure(figsize=(12, 6))
|
|
137
|
+
ax = fig.add_subplot(1, 1, 1, projection=ccrs.Robinson())
|
|
138
|
+
ax.coastlines(linewidth=0.5)
|
|
139
|
+
ax.gridlines(draw_labels=True, linewidth=0.3, alpha=0.5)
|
|
140
|
+
|
|
141
|
+
im = data.plot(ax=ax, transform=ccrs.PlateCarree(),
|
|
142
|
+
cmap=cmap, vmin=vmin, vmax=vmax,
|
|
143
|
+
add_colorbar=False)
|
|
144
|
+
cbar = plt.colorbar(im, ax=ax, orientation='horizontal',
|
|
145
|
+
pad=0.05, shrink=0.7)
|
|
146
|
+
cbar.set_label(data.attrs.get('units', ''))
|
|
147
|
+
ax.set_title(title, fontsize=14)
|
|
148
|
+
plt.tight_layout()
|
|
149
|
+
return fig
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
## Best Practices
|
|
153
|
+
|
|
154
|
+
- Always report uncertainties: use multi-model ensembles and provide confidence intervals
|
|
155
|
+
- Document data preprocessing steps for reproducibility
|
|
156
|
+
- Use standardized calendar handling (`cftime`) for model outputs with non-standard calendars
|
|
157
|
+
- Apply bias correction (e.g., quantile mapping) when comparing model outputs to observations
|
|
158
|
+
- Follow FAIR data principles and cite datasets using their DOIs
|
|
@@ -0,0 +1,129 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: gis-remote-sensing-guide
|
|
3
|
+
description: "GIS analysis and remote sensing workflows for geospatial research applications"
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "earth_americas"
|
|
7
|
+
category: "domains"
|
|
8
|
+
subcategory: "geoscience"
|
|
9
|
+
keywords: ["GIS", "remote sensing", "geology", "atmospheric science", "climatology", "geospatial analysis"]
|
|
10
|
+
source: "wentor"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# GIS and Remote Sensing Guide
|
|
14
|
+
|
|
15
|
+
A comprehensive skill for conducting geospatial analysis and remote sensing research. Covers data acquisition from satellite platforms, spatial analysis with open-source tools, and publication-quality map production.
|
|
16
|
+
|
|
17
|
+
## Satellite Data Sources
|
|
18
|
+
|
|
19
|
+
### Key Earth Observation Platforms
|
|
20
|
+
|
|
21
|
+
| Platform | Provider | Spatial Res. | Revisit | Free? | Use Case |
|
|
22
|
+
|----------|----------|-------------|---------|-------|----------|
|
|
23
|
+
| Landsat 8/9 | USGS/NASA | 30m (MS), 15m (pan) | 16 days | Yes | Land cover, NDVI time series |
|
|
24
|
+
| Sentinel-2 | ESA/Copernicus | 10m | 5 days | Yes | Agriculture, urban mapping |
|
|
25
|
+
| MODIS | NASA | 250m-1km | 1-2 days | Yes | Large-scale vegetation, fire |
|
|
26
|
+
| Sentinel-1 | ESA | 5-20m | 6 days | Yes | SAR, flood mapping, deformation |
|
|
27
|
+
| SRTM/ASTER | NASA | 30m | N/A | Yes | Digital elevation models |
|
|
28
|
+
|
|
29
|
+
### Data Download with Python
|
|
30
|
+
|
|
31
|
+
```python
|
|
32
|
+
import ee
|
|
33
|
+
|
|
34
|
+
# Initialize Google Earth Engine
|
|
35
|
+
ee.Initialize()
|
|
36
|
+
|
|
37
|
+
def get_sentinel2_composite(aoi: ee.Geometry, start: str, end: str,
|
|
38
|
+
cloud_max: int = 20) -> ee.Image:
|
|
39
|
+
"""
|
|
40
|
+
Create a cloud-free Sentinel-2 composite.
|
|
41
|
+
|
|
42
|
+
Args:
|
|
43
|
+
aoi: Area of interest as ee.Geometry
|
|
44
|
+
start: Start date (YYYY-MM-DD)
|
|
45
|
+
end: End date (YYYY-MM-DD)
|
|
46
|
+
cloud_max: Maximum cloud cover percentage
|
|
47
|
+
"""
|
|
48
|
+
collection = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
|
|
49
|
+
.filterBounds(aoi)
|
|
50
|
+
.filterDate(start, end)
|
|
51
|
+
.filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', cloud_max)))
|
|
52
|
+
|
|
53
|
+
# Cloud masking using SCL band
|
|
54
|
+
def mask_clouds(image):
|
|
55
|
+
scl = image.select('SCL')
|
|
56
|
+
mask = scl.neq(3).And(scl.neq(8)).And(scl.neq(9)).And(scl.neq(10))
|
|
57
|
+
return image.updateMask(mask)
|
|
58
|
+
|
|
59
|
+
return collection.map(mask_clouds).median().clip(aoi)
|
|
60
|
+
|
|
61
|
+
# Define study area
|
|
62
|
+
study_area = ee.Geometry.Rectangle([116.0, 39.5, 117.0, 40.5]) # Beijing region
|
|
63
|
+
composite = get_sentinel2_composite(study_area, '2024-06-01', '2024-09-30')
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
## Spatial Analysis with GeoPandas
|
|
67
|
+
|
|
68
|
+
### Vector Data Processing
|
|
69
|
+
|
|
70
|
+
```python
|
|
71
|
+
import geopandas as gpd
|
|
72
|
+
from shapely.geometry import Point
|
|
73
|
+
|
|
74
|
+
def spatial_join_analysis(points_gdf: gpd.GeoDataFrame,
|
|
75
|
+
polygons_gdf: gpd.GeoDataFrame,
|
|
76
|
+
agg_col: str) -> gpd.GeoDataFrame:
|
|
77
|
+
"""
|
|
78
|
+
Perform spatial join and aggregate point data within polygons.
|
|
79
|
+
"""
|
|
80
|
+
joined = gpd.sjoin(points_gdf, polygons_gdf, how='inner', predicate='within')
|
|
81
|
+
summary = joined.groupby('index_right').agg(
|
|
82
|
+
count=(agg_col, 'count'),
|
|
83
|
+
mean_value=(agg_col, 'mean'),
|
|
84
|
+
std_value=(agg_col, 'std')
|
|
85
|
+
).reset_index()
|
|
86
|
+
result = polygons_gdf.merge(summary, left_index=True, right_on='index_right')
|
|
87
|
+
return result
|
|
88
|
+
|
|
89
|
+
# Example: aggregate soil samples within administrative boundaries
|
|
90
|
+
soil_samples = gpd.read_file('soil_data.geojson')
|
|
91
|
+
admin_bounds = gpd.read_file('admin_boundaries.shp')
|
|
92
|
+
result = spatial_join_analysis(soil_samples, admin_bounds, 'pH_value')
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
## Remote Sensing Indices
|
|
96
|
+
|
|
97
|
+
### Vegetation and Water Indices
|
|
98
|
+
|
|
99
|
+
```python
|
|
100
|
+
import rasterio
|
|
101
|
+
import numpy as np
|
|
102
|
+
|
|
103
|
+
def compute_indices(image_path: str) -> dict:
|
|
104
|
+
"""Compute common remote sensing spectral indices."""
|
|
105
|
+
with rasterio.open(image_path) as src:
|
|
106
|
+
red = src.read(3).astype(float) # Band 4 in Sentinel-2
|
|
107
|
+
nir = src.read(4).astype(float) # Band 8
|
|
108
|
+
green = src.read(2).astype(float) # Band 3
|
|
109
|
+
swir = src.read(5).astype(float) # Band 11
|
|
110
|
+
|
|
111
|
+
# Normalized Difference Vegetation Index
|
|
112
|
+
ndvi = (nir - red) / (nir + red + 1e-10)
|
|
113
|
+
|
|
114
|
+
# Normalized Difference Water Index
|
|
115
|
+
ndwi = (green - nir) / (green + nir + 1e-10)
|
|
116
|
+
|
|
117
|
+
# Normalized Burn Ratio
|
|
118
|
+
nbr = (nir - swir) / (nir + swir + 1e-10)
|
|
119
|
+
|
|
120
|
+
return {'NDVI': ndvi, 'NDWI': ndwi, 'NBR': nbr}
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
## Map Production
|
|
124
|
+
|
|
125
|
+
For publication-quality maps, always include: scale bar, north arrow, coordinate reference system label, legend, and data source attribution. Use `matplotlib` with `cartopy` for projected maps, or `folium` for interactive web maps. Export at 300 DPI minimum for journal submissions.
|
|
126
|
+
|
|
127
|
+
## Coordinate Reference Systems
|
|
128
|
+
|
|
129
|
+
Always verify and document the CRS. Use EPSG codes (e.g., EPSG:4326 for WGS84, EPSG:32650 for UTM Zone 50N). Reproject all layers to a common CRS before spatial operations to avoid misalignment errors.
|
|
@@ -0,0 +1,181 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: digital-humanities-guide
|
|
3
|
+
description: "Computational methods for humanities research including text mining and network analysis"
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "scroll"
|
|
7
|
+
category: "domains"
|
|
8
|
+
subcategory: "humanities"
|
|
9
|
+
keywords: ["digital humanities", "philosophy", "literary studies", "art history", "linguistics", "text mining"]
|
|
10
|
+
source: "wentor"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Digital Humanities Guide
|
|
14
|
+
|
|
15
|
+
A skill for applying computational and quantitative methods to humanities research. Covers text mining, network analysis, spatial humanities, and digital archival methods. Designed for researchers bridging traditional humanities with data-driven approaches.
|
|
16
|
+
|
|
17
|
+
## Text Mining and Distant Reading
|
|
18
|
+
|
|
19
|
+
### Corpus Preparation
|
|
20
|
+
|
|
21
|
+
```python
|
|
22
|
+
import re
|
|
23
|
+
from collections import Counter
|
|
24
|
+
|
|
25
|
+
def prepare_corpus(texts: list[str], stopwords: set | None = None) -> list[list[str]]:
|
|
26
|
+
"""
|
|
27
|
+
Tokenize and clean a corpus of texts for analysis.
|
|
28
|
+
|
|
29
|
+
Args:
|
|
30
|
+
texts: List of raw text strings
|
|
31
|
+
stopwords: Set of words to remove
|
|
32
|
+
Returns:
|
|
33
|
+
List of tokenized, cleaned documents
|
|
34
|
+
"""
|
|
35
|
+
if stopwords is None:
|
|
36
|
+
stopwords = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on',
|
|
37
|
+
'at', 'to', 'for', 'of', 'with', 'is', 'was', 'are'}
|
|
38
|
+
|
|
39
|
+
processed = []
|
|
40
|
+
for text in texts:
|
|
41
|
+
# Lowercase and remove punctuation
|
|
42
|
+
tokens = re.findall(r'\b[a-z]+\b', text.lower())
|
|
43
|
+
# Remove stopwords and short tokens
|
|
44
|
+
tokens = [t for t in tokens if t not in stopwords and len(t) > 2]
|
|
45
|
+
processed.append(tokens)
|
|
46
|
+
return processed
|
|
47
|
+
|
|
48
|
+
def compute_tfidf(corpus: list[list[str]]) -> list[dict]:
|
|
49
|
+
"""Compute per-document TF-IDF scores; returns one dict of term scores per document."""
|
|
50
|
+
import math
|
|
51
|
+
n_docs = len(corpus)
|
|
52
|
+
# Document frequency
|
|
53
|
+
df = Counter()
|
|
54
|
+
for doc in corpus:
|
|
55
|
+
df.update(set(doc))
|
|
56
|
+
# TF-IDF per document
|
|
57
|
+
tfidf_scores = []
|
|
58
|
+
for doc in corpus:
|
|
59
|
+
tf = Counter(doc)
|
|
60
|
+
total = len(doc)
|
|
61
|
+
scores = {}
|
|
62
|
+
for term, count in tf.items():
|
|
63
|
+
tf_val = count / total
|
|
64
|
+
idf_val = math.log(n_docs / (1 + df[term]))
|
|
65
|
+
scores[term] = tf_val * idf_val
|
|
66
|
+
tfidf_scores.append(scores)
|
|
67
|
+
return tfidf_scores
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
### Topic Modeling
|
|
71
|
+
|
|
72
|
+
Apply Latent Dirichlet Allocation (LDA) to discover thematic structures in large text corpora:
|
|
73
|
+
|
|
74
|
+
```python
|
|
75
|
+
from gensim import corpora, models
|
|
76
|
+
|
|
77
|
+
def run_topic_model(corpus: list[list[str]], n_topics: int = 10,
|
|
78
|
+
passes: int = 15) -> models.LdaModel:
|
|
79
|
+
"""
|
|
80
|
+
Train an LDA topic model on a preprocessed corpus.
|
|
81
|
+
"""
|
|
82
|
+
dictionary = corpora.Dictionary(corpus)
|
|
83
|
+
dictionary.filter_extremes(no_below=5, no_above=0.5)
|
|
84
|
+
bow_corpus = [dictionary.doc2bow(doc) for doc in corpus]
|
|
85
|
+
|
|
86
|
+
lda_model = models.LdaModel(
|
|
87
|
+
bow_corpus,
|
|
88
|
+
num_topics=n_topics,
|
|
89
|
+
id2word=dictionary,
|
|
90
|
+
passes=passes,
|
|
91
|
+
random_state=42,
|
|
92
|
+
alpha='auto',
|
|
93
|
+
eta='auto'
|
|
94
|
+
)
|
|
95
|
+
return lda_model
|
|
96
|
+
|
|
97
|
+
# Print top words per topic
|
|
98
|
+
# for idx, topic in lda_model.print_topics(-1):
|
|
99
|
+
# print(f"Topic {idx}: {topic}")
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
## Network Analysis for Historical Research
|
|
103
|
+
|
|
104
|
+
### Correspondence and Social Networks
|
|
105
|
+
|
|
106
|
+
```python
|
|
107
|
+
import networkx as nx
|
|
108
|
+
|
|
109
|
+
def build_correspondence_network(letters: list[dict]) -> nx.Graph:
|
|
110
|
+
"""
|
|
111
|
+
Build a social network from historical correspondence data.
|
|
112
|
+
|
|
113
|
+
Args:
|
|
114
|
+
letters: List of dicts with 'sender', 'recipient', 'date', 'location'
|
|
115
|
+
"""
|
|
116
|
+
G = nx.Graph()
|
|
117
|
+
for letter in letters:
|
|
118
|
+
sender = letter['sender']
|
|
119
|
+
recipient = letter['recipient']
|
|
120
|
+
if G.has_edge(sender, recipient):
|
|
121
|
+
G[sender][recipient]['weight'] += 1
|
|
122
|
+
else:
|
|
123
|
+
G.add_edge(sender, recipient, weight=1)
|
|
124
|
+
|
|
125
|
+
# Compute centrality measures
|
|
126
|
+
degree_cent = nx.degree_centrality(G)
|
|
127
|
+
betweenness = nx.betweenness_centrality(G)
|
|
128
|
+
|
|
129
|
+
for node in G.nodes():
|
|
130
|
+
G.nodes[node]['degree_centrality'] = degree_cent[node]
|
|
131
|
+
G.nodes[node]['betweenness'] = betweenness[node]
|
|
132
|
+
|
|
133
|
+
return G
|
|
134
|
+
|
|
135
|
+
# Identify the most connected and most bridging figures
|
|
136
|
+
# sorted(degree_cent.items(), key=lambda x: x[1], reverse=True)[:10]
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
## Spatial Humanities
|
|
140
|
+
|
|
141
|
+
Map historical events, literary settings, or cultural artifacts using GIS tools:
|
|
142
|
+
|
|
143
|
+
- **QGIS** for desktop spatial analysis with historical maps
|
|
144
|
+
- **Recogito** for annotating place names in texts
|
|
145
|
+
- **Peripleo** for linked open geodata visualization
|
|
146
|
+
- **Palladio** for exploratory visualization of humanities data (a Stanford-developed platform)
|
|
147
|
+
|
|
148
|
+
Georeferencing historical maps requires at least 4 ground control points with known coordinates, using polynomial or thin-plate spline transformation.
|
|
149
|
+
|
|
150
|
+
## Digital Archival Methods
|
|
151
|
+
|
|
152
|
+
### TEI Encoding
|
|
153
|
+
|
|
154
|
+
The Text Encoding Initiative (TEI) is the standard for scholarly digital editions:
|
|
155
|
+
|
|
156
|
+
```xml
|
|
157
|
+
<TEI xmlns="http://www.tei-c.org/ns/1.0">
|
|
158
|
+
<teiHeader>
|
|
159
|
+
<fileDesc>
|
|
160
|
+
<titleStmt>
|
|
161
|
+
<title>Letters of [Historical Figure]</title>
|
|
162
|
+
</titleStmt>
|
|
163
|
+
</fileDesc>
|
|
164
|
+
</teiHeader>
|
|
165
|
+
<text>
|
|
166
|
+
<body>
|
|
167
|
+
<div type="letter" n="1">
|
|
168
|
+
<opener>
|
|
169
|
+
<dateline><date when="1789-07-14">14 July 1789</date></dateline>
|
|
170
|
+
<salute>Dear Friend,</salute>
|
|
171
|
+
</opener>
|
|
172
|
+
<p>The events of today have been most extraordinary...</p>
|
|
173
|
+
</div>
|
|
174
|
+
</body>
|
|
175
|
+
</text>
|
|
176
|
+
</TEI>
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
## Ethical Considerations
|
|
180
|
+
|
|
181
|
+
Digital humanities research must address: copyright and fair use for digitized materials, privacy concerns for living subjects in social network analysis, algorithmic bias in NLP tools trained on modern English when applied to historical texts, and the responsibility to make digital scholarship accessible beyond the academy.
|