npm - ecological-agent-skills - Versions diffs - 3.1.0 - Mend

ecological-agent-skills 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (217)

package/skills/acoustic-monitoring/resources/soundscape-ecology-guide.md ADDED

@@ -0,0 +1,90 @@
+# Soundscape Ecology Guide
+## The NDSI Framework: Three Components of Soundscape
+| Component | Definition | Frequency range (typical) | NDSI role |
+|---|---|---|---|
+| Biophony | Sounds produced by biological organisms | 2–8 kHz | Biophony power, B |
+| Geophony | Non-biological natural sounds (wind, rain, rivers) | Broadband | Not used |
+| Anthrophony | Human-generated sounds (traffic, machinery) | 0.2–2 kHz | Anthrophony power, A |
+**NDSI** = (B - A) / (B + A) measures the dominance of biological over anthropogenic signals. Values near +1 indicate biophony-dominated soundscapes, as expected at undisturbed sites; values near -1 indicate anthrophony-dominated soundscapes, as expected at urbanised or disturbed sites.
+---
+## Controlling for Diel and Seasonal Variation
+### Why it matters
+Acoustic diversity peaks at dawn and dusk ("dawn chorus" and "dusk chorus"). Comparing morning recordings from one site with afternoon recordings from another produces spurious differences unrelated to biodiversity.
+### Recommended control strategies
+1. **Stratify by time window:** Compare only recordings taken in the same 1-hour window (e.g., 05:00–06:00 for dawn chorus).
+2. **Mixed-effects model with time-of-day as covariate:**
+```r
+suppressPackageStartupMessages(library(lme4))
+suppressPackageStartupMessages(library(lubridate))
+indices$hour <- hour(indices$datetime)
+indices$cos_hour <- cos(2 * pi * indices$hour / 24)
+indices$sin_hour <- sin(2 * pi * indices$hour / 24)
+# Model ACI with cyclic hour and habitat type as fixed effects;
+# recorders nested within sites enter as random intercepts
+model <- lmer(ACI ~ cos_hour + sin_hour + habitat_type + (1 | site / recorder_id),
+              data = indices)
+```
+3. **Seasonal stratification:** Group months into seasons (wet/dry or astronomical seasons) and analyse separately or include as a fixed effect.
+---
+## Rarefaction for Acoustic Richness Estimation
+```r
+# Species accumulation curve from BirdNET detections
+# Requires detection matrix (sites × species)
+suppressPackageStartupMessages(library(vegan))
+suppressPackageStartupMessages(library(dplyr))  # provides %>%, group_by(), summarise()
+det_matrix <- read.csv("detections_filtered.csv") %>%
+  group_by(site, species) %>%
+  summarise(n = n(), .groups = "drop") %>%
+  tidyr::pivot_wider(names_from = species, values_from = n, values_fill = 0)
+sp_matrix <- as.matrix(det_matrix[, -1])
+# Rows are sites, so the curve accumulates species over sites, not recording hours
+sp_accum <- specaccum(sp_matrix, method = "rarefaction")
+plot(sp_accum, xlab = "Sites sampled", ylab = "Cumulative species detected")
+```
+**Rule:** Flag sites with < 48 hours of recordings as under-sampled relative to asymptotic richness.
+---
+## Integrating Acoustic and Visual Survey Data
+| Data type | Acoustic | Camera trap | Transect |
+|---|---|---|---|
+| What it detects | Vocalising species | Mobile species at detection range | Visible species in survey belt |
+| Temporal resolution | Continuous | Event-based | Snapshot |
+| Species ID confidence | Moderate (requires validation) | High (image) | High (observer) |
+| Integration approach | Multi-state model / species accumulation | Occupancy model | Distance sampling |
+---
+## Common Soundscape Gradients
+| Gradient | Expected NDSI trend | Expected ACI trend |
+|---|---|---|
+| Urban → rural | Increasing | Increasing |
+| Deforested → intact forest | Increasing | Increasing |
+| Day → night (tropical) | Variable | Increasing (insect chorus) |
+| Dry → wet season | Increasing | Increasing (amphibians, insects) |
+| Degraded habitat recovery | Increasing over years | Increasing |
+---
+## References
+- Pijanowski, B.C. et al. (2011). Soundscape ecology: the science of sound in the landscape. *BioScience*, 61(3), 203–216. DOI: 10.1525/bio.2011.61.3.6
+- Sueur, J. & Farina, A. (2015). Ecoacoustics: the ecological investigation and interpretation of environmental sound. *Biosemiotics*, 8(3), 493–502. DOI: 10.1007/s12304-015-9248-x
+- Tucker, D. et al. (2014). Linking ecological condition and the soundscape in fragmented Australian forests. *Landscape Ecology*, 29(4), 745–758. DOI: 10.1007/s10980-014-9978-6
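
For reference, the NDSI described in the guide above is conventionally computed as (B - A) / (B + A), where B and A are the acoustic power in the biophony and anthrophony bands. The following is a minimal Python sketch, assuming the two band powers have already been estimated (for example from a power spectral density). The function name `ndsi` and the input values are illustrative and are not part of the package.

```python
def ndsi(biophony_power: float, anthrophony_power: float) -> float:
    """Normalized Difference Soundscape Index: (B - A) / (B + A)."""
    if biophony_power < 0 or anthrophony_power < 0:
        raise ValueError("band powers must be non-negative")
    total = biophony_power + anthrophony_power
    if total == 0:
        raise ValueError("NDSI is undefined when both band powers are zero")
    return (biophony_power - anthrophony_power) / total


# Hypothetical band powers in arbitrary units, for illustration only
print(ndsi(3.0, 1.0))  # biophony dominates, so the index is positive
print(ndsi(1.0, 3.0))  # anthrophony dominates, so the index is negative
```

Because both bands appear in the numerator and the denominator, the index is bounded between -1 and +1 regardless of recording gain, which is what makes it comparable across recorders.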

package/skills/acoustic-monitoring/resources/species-id-tools-comparison.md ADDED

@@ -0,0 +1,89 @@
+# Species Identification Tools Comparison
+## Tool Comparison Table
+| Tool | Taxa covered | Cost | API / CLI | Recommended confidence | Geographic limitation |
+|---|---|---|---|---|---|
+| BirdNET | Birds (6,000+ species in model V2.4) | Free (open source) | CLI + Python API | ≥ 0.7 | Best in North America and Europe |
+| RavenPro | Any (manual + CNN) | Paid ($400–$800) | None (GUI) | User-defined | None |
+| Kaleidoscope Pro | Bats, birds | Paid ($350+) | None (GUI) | ≥ 90% | None |
+| ARBIMON | Tropical birds, frogs | Free (cloud) | Web API | ≥ 0.8 | Tropics focus |
+| Rainforest Connection | Chainsaws, birds | Proprietary | API (partners) | N/A | Tropical forests |
+---
+## BirdNET in Detail
+### Installation
+```bash
+# Install the Python package from PyPI
+pip install birdnet
+# Or use the BirdNET-Analyzer standalone
+git clone https://github.com/kahst/BirdNET-Analyzer
+cd BirdNET-Analyzer
+pip install -r requirements.txt
+```
+### Command-line usage
+```bash
+python analyze.py \
+  --i /path/to/audio/ \
+  --o /path/to/output/ \
+  --min_conf 0.7 \
+  --lat -3.5 \
+  --lon -60.2 \
+  --week 24 \
+  --slist /path/to/species_list.txt
+```
+### Parameters
+| Parameter | Description | Default |
+|---|---|---|
+| `--min_conf` | Confidence threshold | 0.1 (too permissive for analysis; raise to ≥ 0.7) |
+| `--lat` / `--lon` | Location for species filtering | None |
+| `--week` | Week of year (1–48) for seasonal filtering | None |
+| `--slist` | Restrict to species in this list | None |
+| `--rtype` | Output format: table, audacity, r, csv | table |
+### Interpreting Scores
+- < 0.5: Very likely false positive — do not use without validation
+- 0.5–0.7: Possible detection — flag for review
+- 0.7–0.9: Probable detection — use with caution in analyses
+- > 0.9: High confidence — generally reliable for common species
+---
+## Validation Protocol
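+To estimate regional precision for BirdNET (estimating recall additionally requires fully annotated recordings, so that missed vocalisations can be counted):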
+1. **Create validation set:** Randomly select 100–200 detections per confidence band (< 0.5, 0.5–0.7, 0.7–0.9, > 0.9).
+2. **Manual verification:** Review spectrograms in Raven or Audacity; mark each as TP or FP.
+3. **Calculate precision per band:**
+```r
+validation <- data.frame(
+  confidence_band = c("<0.5", "0.5-0.7", "0.7-0.9", ">0.9"),
+  n_reviewed      = c(100, 100, 100, 100),
+  n_correct       = c(12, 45, 78, 94)
+)
+validation$precision <- validation$n_correct / validation$n_reviewed
+```
+4. **Apply precision as a weight** in species accumulation analyses, so that each detection counts as its band's estimated probability of being a true positive.
+---
+## When Manual Validation Is Mandatory
+Always validate manually before using BirdNET results for:
+- Species of conservation concern (IUCN threatened categories)
+- First records in a region
+- Rare or vagrant species
+- Any species with < 20 training recordings in BirdNET training set
+---
+## References
+- Kahl, S. et al. (2021). BirdNET: A deep learning solution for avian diversity monitoring. *Ecological Informatics*, 61, 101236. DOI: 10.1016/j.ecoinf.2021.101236
+- Abrahams, C. (2021). Practical bioacoustics: Passive acoustic monitoring. *British Wildlife*, 32(4), 241–249.
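
The validation protocol above ends with "apply precision as weight". One way to read that step is sketched below in Python: each detection contributes its band's estimated precision instead of a full count of 1. The counts are the example values from the protocol's R snippet. The function names, the inclusive lower bounds on each band, and the example scores are assumptions made for illustration, not the package's method.

```python
# Example review counts per confidence band, copied from the protocol's R snippet
REVIEWED = {"<0.5": 100, "0.5-0.7": 100, "0.7-0.9": 100, ">0.9": 100}
CORRECT = {"<0.5": 12, "0.5-0.7": 45, "0.7-0.9": 78, ">0.9": 94}
PRECISION = {band: CORRECT[band] / REVIEWED[band] for band in REVIEWED}


def band_for(confidence: float) -> str:
    """Map a BirdNET score to a band (assumption: lower bounds are inclusive)."""
    if confidence < 0.5:
        return "<0.5"
    if confidence < 0.7:
        return "0.5-0.7"
    if confidence < 0.9:
        return "0.7-0.9"
    return ">0.9"


def expected_true_detections(confidences: list[float]) -> float:
    """Sum of per-detection precision weights: the expected number of true positives."""
    return sum(PRECISION[band_for(c)] for c in confidences)


# Hypothetical scores for three detections of one species
scores = [0.95, 0.72, 0.55]
print(round(expected_true_detections(scores), 2))
```

A species whose detections all fall in low-precision bands then contributes much less than one "effective" detection per call, which dampens the inflation of accumulation curves by false positives.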

package/skills/acoustic-monitoring/scripts/batch_species_detection.py ADDED

@@ -0,0 +1,360 @@
+# ecological-agent-skills / Copyright (C) 2026 Francisco Diego Barros Barata
+# SPDX-License-Identifier: GPL-3.0-or-later
+"""
+Batch species detection using BirdNET-Analyzer for passive acoustic monitoring.
+Usage:
+    python batch_species_detection.py <audio_dir> <output_dir>
+        [--confidence 0.7] [--lat -3.0] [--lon -60.0]
+        [--date 2024-06-01] [--overlap 0.0] [--rtype csv]
+Requires:
+    BirdNET-Analyzer installed and available on PATH as `birdnet_analyzer`
+    (or configured via BIRDNET_PATH environment variable).
+Outputs:
+    detections_raw.csv           — all detections from all files
+    detections_filtered.csv      — above confidence threshold
+    species_list.csv             — unique species with n detections, max conf
+    detection_summary.csv        — species × hour detection counts
+    species_accumulation.csv     — cumulative species count versus number of detections
+"""
+import logging
+import sys
+from datetime import datetime
+from pathlib import Path
+SKILL_NAME = "acoustic-monitoring"
+_LOG_DIR   = Path("logs")
+_LOG_DIR.mkdir(parents=True, exist_ok=True)
+_log_file  = _LOG_DIR / f"skill_{SKILL_NAME}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
+logging.basicConfig(
+    level=logging.INFO,
+    format="[%(asctime)s] [%(levelname)s] [" + SKILL_NAME + "] %(message)s",
+    datefmt="%Y-%m-%d %H:%M:%S",
+    handlers=[
+        logging.StreamHandler(sys.stdout),
+        logging.FileHandler(_log_file, encoding="utf-8"),
+    ],
+)
+logger = logging.getLogger(SKILL_NAME)
+def log_step(n: int, desc: str) -> None:
+    logger.info("-- STEP %d: %s", n, desc)
+def log_decision(var: str, val, why: str) -> None:
+    logger.info("DECISION | %s = %s | %s", var, val, why)
+import argparse
+import csv
+import os
+import re
+import subprocess
+from collections import defaultdict
+from datetime import timedelta
+BIRDNET_CMD = os.environ.get("BIRDNET_PATH", "birdnet_analyzer")
+def parse_args():
+    parser = argparse.ArgumentParser(
+        description="Batch BirdNET species detection for PAM recordings"
+    )
+    parser.add_argument("audio_dir",  help="Directory containing .wav/.flac files")
+    parser.add_argument("output_dir", help="Directory for output files")
+    parser.add_argument("--confidence", type=float, default=0.7,
+                        help="Minimum confidence threshold (default: 0.7)")
+    parser.add_argument("--lat", type=float, default=None,
+                        help="Recording latitude (improves species filtering)")
+    parser.add_argument("--lon", type=float, default=None,
+                        help="Recording longitude")
+    parser.add_argument("--date", default=None,
+                        help="Recording date YYYY-MM-DD (enables seasonal filter)")
+    parser.add_argument("--overlap", type=float, default=0.0,
+                        help="Segment overlap in seconds (default: 0.0)")
+    parser.add_argument("--rtype", default="csv",
+                        choices=["csv", "table", "audacity"],
+                        help="BirdNET result type (default: csv)")
+    parser.add_argument("--min_conf_flag", type=float, default=0.5,
+                        help="Minimum confidence for raw output (default: 0.5)")
+    # argparse already handles `script.py <audio_dir> <output_dir>` positionally,
+    # so no separate fallback branch is needed
+    return parser.parse_args()
+def discover_audio_files(audio_dir: Path) -> list[Path]:
+    """Return all .wav and .flac files in audio_dir (recursive)."""
+    files = []
+    for ext in ("*.wav", "*.WAV", "*.flac", "*.FLAC"):
+        files.extend(audio_dir.rglob(ext))
+    return sorted(files)
+def parse_timestamp_from_filename(fname: str):
+    """Extract datetime from common recorder filename patterns.
+    Returns datetime or None.
+    """
+    patterns = [
+        r"(\d{8})[_$T](\d{6})",   # AUDIOMOTH_20240601_050000
+        r"(\d{4}-\d{2}-\d{2})T(\d{2}-\d{2}-\d{2})",  # ISO variant
+    ]
+    for pat in patterns:
+        m = re.search(pat, fname)
+        if m:
+            date_s, time_s = m.group(1), m.group(2)
+            date_s = date_s.replace("-", "")
+            time_s = time_s.replace("-", "")
+            try:
+                return datetime.strptime(date_s + time_s, "%Y%m%d%H%M%S")
+            except ValueError:
+                continue
+    return None
+def run_birdnet(audio_file: Path, output_dir: Path, args) -> Path | None:
+    """Run BirdNET-Analyzer on a single file. Returns path to result file."""
+    result_dir = output_dir / "birdnet_raw"
+    result_dir.mkdir(parents=True, exist_ok=True)
+    result_file = result_dir / (audio_file.stem + f".BirdNET.results.{args.rtype}")
+    cmd = [
+        BIRDNET_CMD,
+        "--i", str(audio_file),
+        "--o", str(result_dir),
+        "--min_conf", str(args.min_conf_flag),
+        "--overlap", str(args.overlap),
+        "--rtype", args.rtype,
+    ]
+    if args.lat is not None and args.lon is not None:
+        cmd += ["--lat", str(args.lat), "--lon", str(args.lon)]
+    if args.date is not None:
+        cmd += ["--week", _date_to_week(args.date)]
+    try:
+        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
+        if proc.returncode != 0:
+            logger.warning("BirdNET failed for %s: %s", audio_file.name, proc.stderr[:200])
+            return None
+        return result_file if result_file.exists() else None
+    except FileNotFoundError:
+        logger.error(
+            "BirdNET not found. Set BIRDNET_PATH or install birdnet_analyzer\n  Likely cause: BirdNET-Analyzer is not installed or not on PATH\n  Previous skill: [none; external dependency]"
+        )
+        sys.exit(1)
+    except subprocess.TimeoutExpired:
+        logger.warning("Timeout while processing %s", audio_file.name)
+        return None
+def _date_to_week(date_str: str) -> str:
+    """Convert YYYY-MM-DD to BirdNET week number (1-48)."""
+    try:
+        d = datetime.strptime(date_str, "%Y-%m-%d")
+        # BirdNET uses 48 weeks (each ~7.6 days)
+        week = min(48, max(1, int((d.timetuple().tm_yday - 1) / 7.625) + 1))
+        return str(week)
+    except ValueError:
+        return "1"
+def parse_birdnet_csv(result_file: Path) -> list[dict]:
+    """Parse BirdNET CSV or tab-separated output into a list of detection dicts."""
+    detections = []
+    try:
+        with open(result_file, newline="", encoding="utf-8") as f:
+            # CSV results are comma-separated while table output is tab-separated,
+            # so infer the delimiter from the header line
+            header = f.readline()
+            f.seek(0)
+            reader = csv.DictReader(f, delimiter="\t" if "\t" in header else ",")
+            if reader.fieldnames is None:
+                return detections
+            for row in reader:
+                # Normalize column names (BirdNET versions differ); the
+                # "Scientific name" fallback assumes the CSV header of recent releases
+                det = {
+                    "start_s":    float(row.get("Start (s)", row.get("start", 0))),
+                    "end_s":      float(row.get("End (s)",   row.get("end",   0))),
+                    "species_code": row.get("Label", row.get("label", row.get("Scientific name", ""))).strip(),
+                    "common_name":  row.get("Common name", row.get("common_name", "")).strip(),
+                    "confidence": float(row.get("Confidence", row.get("confidence", 0))),
+                }
+                detections.append(det)
+    except Exception as e:
+        logger.warning("Could not parse %s: %s", result_file.name, e)
+    return detections
+def main():
+    args = parse_args()
+    audio_dir  = Path(args.audio_dir)
+    output_dir = Path(args.output_dir)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    # ── Input precondition checks ────────────────────────────────────────────
+    if not audio_dir.is_dir():
+        logger.error(
+            "Input not found: %s\n  Likely cause: incorrect path or directory not mounted\n  Previous skill: [none; initial step]",
+            audio_dir,
+        )
+        sys.exit(1)
+    log_decision("confidence", args.confidence,
+                 "minimum confidence threshold for filtering detections; >= 0.7 recommended")
+    log_decision("lat/lon", f"{args.lat}/{args.lon}",
+                 "geographic coordinates improve BirdNET's species filter")
+    log_decision("overlap", args.overlap,
+                 "segment overlap in seconds; 0 = no overlap")
+    log_step(1, "Discovering audio files")
+    audio_files = discover_audio_files(audio_dir)
+    if not audio_files:
+        logger.error(
+            "No audio files found in %s\n  Likely cause: empty directory or unsupported extensions\n  Previous skill: [none]",
+            audio_dir,
+        )
+        sys.exit(1)
+    logger.info("Found %d audio files", len(audio_files))
+    logger.info("Confidence threshold: %s", args.confidence)
+    log_step(2, "Running BirdNET on each audio file")
+    all_detections = []
+    for i, fpath in enumerate(audio_files, 1):
+        logger.info("  [%d/%d] %s", i, len(audio_files), fpath.name)
+        file_ts = parse_timestamp_from_filename(fpath.name)
+        if file_ts is None:
+            logger.warning("Could not extract timestamp from filename: %s", fpath.name)
+        try:
+            result_file = run_birdnet(fpath, output_dir, args)
+        except Exception as e:
+            logger.error("Unexpected error in run_birdnet for %s: %s", fpath.name, e)
+            raise
+        if result_file is None:
+            continue
+        raw_dets = parse_birdnet_csv(result_file)
+        for det in raw_dets:
+            det["file"] = fpath.name
+            if file_ts is not None:
+                abs_time = file_ts + timedelta(seconds=det["start_s"])
+                det["datetime"] = abs_time.strftime("%Y-%m-%d %H:%M:%S")
+                det["hour"]     = abs_time.hour
+                det["date"]     = abs_time.strftime("%Y-%m-%d")
+            else:
+                det["datetime"] = ""
+                det["hour"]     = -1
+                det["date"]     = ""
+            all_detections.append(det)
+    if not all_detections:
+        logger.warning(
+            "No detections produced. Check the BirdNET installation and the audio files."
+        )
+        sys.exit(0)
+    logger.info("Total raw detections: %d", len(all_detections))
+    log_step(3, "Writing raw and filtered detections")
+    # ── Write raw detections ────────────────────────────────────────────────
+    fieldnames = ["file", "datetime", "date", "hour",
+                  "start_s", "end_s", "species_code", "common_name", "confidence"]
+    raw_path = output_dir / "detections_raw.csv"
+    try:
+        with open(raw_path, "w", newline="", encoding="utf-8") as f:
+            writer = csv.DictWriter(f, fieldnames=fieldnames)
+            writer.writeheader()
+            writer.writerows(all_detections)
+        logger.info("Raw detections: %d -> %s", len(all_detections), raw_path)
+    except Exception as e:
+        logger.error("Unexpected error writing raw detections: %s", e)
+        raise
+    # ── Filter by confidence ────────────────────────────────────────────────
+    filtered = [d for d in all_detections if d["confidence"] >= args.confidence]
+    filt_path = output_dir / "detections_filtered.csv"
+    with open(filt_path, "w", newline="", encoding="utf-8") as f:
+        writer = csv.DictWriter(f, fieldnames=fieldnames)
+        writer.writeheader()
+        writer.writerows(filtered)
+    logger.info("Filtered detections (>=%s): %d -> %s", args.confidence, len(filtered), filt_path)
+    if len(filtered) == 0:
+        logger.warning(
+            "No detections after confidence filter %.2f; consider lowering the threshold",
+            args.confidence,
+        )
+    log_step(4, "Building list of detected species")
+    # ── Species list ────────────────────────────────────────────────────────
+    species_stats = defaultdict(lambda: {"n": 0, "max_conf": 0.0, "dates": set()})
+    for d in filtered:
+        sp = d["common_name"] or d["species_code"]
+        species_stats[sp]["n"] += 1
+        species_stats[sp]["max_conf"] = max(species_stats[sp]["max_conf"], d["confidence"])
+        if d["date"]:
+            species_stats[sp]["dates"].add(d["date"])
+    sp_path = output_dir / "species_list.csv"
+    with open(sp_path, "w", newline="", encoding="utf-8") as f:
+        writer = csv.writer(f)
+        writer.writerow(["species", "n_detections", "max_confidence",
+                         "n_dates", "confidence_category"])
+        for sp, stats in sorted(species_stats.items(), key=lambda x: -x[1]["n"]):
+            conf_cat = ("high" if stats["max_conf"] >= 0.9 else
+                        "medium" if stats["max_conf"] >= 0.7 else "low")
+            writer.writerow([sp, stats["n"], round(stats["max_conf"], 3),
+                             len(stats["dates"]), conf_cat])
+    logger.info("Species detected: %d -> %s", len(species_stats), sp_path)
+    log_step(5, "Generating hourly detection summary")
+    # ── Detection summary by hour ────────────────────────────────────────────
+    hour_counts = defaultdict(lambda: defaultdict(int))
+    for d in filtered:
+        if d["hour"] >= 0:
+            sp = d["common_name"] or d["species_code"]
+            hour_counts[sp][d["hour"]] += 1
+    if hour_counts:
+        all_hours = list(range(24))
+        all_sps   = sorted(hour_counts.keys())
+        sum_path  = output_dir / "detection_summary.csv"
+        with open(sum_path, "w", newline="", encoding="utf-8") as f:
+            writer = csv.writer(f)
+            writer.writerow(["species"] + [str(h) for h in all_hours])
+            for sp in all_sps:
+                writer.writerow([sp] + [hour_counts[sp].get(h, 0) for h in all_hours])
+        logger.info("Hourly detection summary -> %s", sum_path)
+    else:
+        logger.warning(
+            "No hourly timestamps available; detection_summary.csv not generated"
+        )
+    log_step(6, "Computing species accumulation")
+    # ── Species accumulation (text) ─────────────────────────────────────────
+    seen = set()
+    accumulation = []
+    for i, d in enumerate(filtered, 1):
+        sp = d["common_name"] or d["species_code"]
+        seen.add(sp)
+        if i % max(1, len(filtered) // 50) == 0 or i == len(filtered):
+            accumulation.append({"n_detections": i, "cumulative_species": len(seen)})
+    accum_path = output_dir / "species_accumulation.csv"
+    with open(accum_path, "w", newline="", encoding="utf-8") as f:
+        writer = csv.DictWriter(f, fieldnames=["n_detections", "cumulative_species"])
+        writer.writeheader()
+        writer.writerows(accumulation)
+    logger.info("Species accumulation -> %s", accum_path)
+    logger.info("Batch detection complete")
+if __name__ == "__main__":
+    main()
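
The `_date_to_week` helper in the script above approximates BirdNET's 48-week year by cutting the day of year into slices of about 7.6 days. BirdNET-Analyzer's documentation describes `--week` as four weeks per month, so the two conventions can differ by one week near month boundaries. The sketch below compares them. The within-month split (days 1-7, 8-14, 15-21, and 22 to month end) and both function names are assumptions made for illustration; check them against the installed BirdNET version before relying on either mapping.

```python
from datetime import datetime


def week_from_day_of_year(date_str: str) -> int:
    """Same arithmetic as _date_to_week: 48 equal slices of about 7.6 days."""
    d = datetime.strptime(date_str, "%Y-%m-%d")
    return min(48, max(1, int((d.timetuple().tm_yday - 1) / 7.625) + 1))


def week_four_per_month(date_str: str) -> int:
    """Four weeks per calendar month (assumed split: days 1-7, 8-14, 15-21, 22+)."""
    d = datetime.strptime(date_str, "%Y-%m-%d")
    week_in_month = min(4, (d.day - 1) // 7 + 1)
    return (d.month - 1) * 4 + week_in_month


# Print both week numbers side by side for a few example dates
for date in ("2024-01-01", "2024-06-01", "2024-12-31"):
    print(date, week_from_day_of_year(date), week_four_per_month(date))
```

For 1 June 2024 the day-of-year arithmetic lands in week 20, while the four-per-month convention gives week 21, which may matter when the seasonal species filter changes between adjacent weeks.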