npm - ecological-agent-skills - Versions diffs - 3.1.0 - Mend

ecological-agent-skills 3.1.0

Files changed (217)

package/workflows/produce-technical-report/WORKFLOW.md ADDED

@@ -0,0 +1,113 @@
+# Workflow: produce-technical-report
+**Purpose:** Synthesise analytical outputs into a publication-ready technical report
+**Skills:** reproducible-ecology-pipeline → (read analytical outputs) → report template → technical synthesis
+---
+## Trigger
+Invoke when the user wants to generate a structured technical report from the outputs of a completed analysis.
+**Example prompts:**
+- "Write a technical report from the SDM results"
+- "Generate a methods and results section based on these outputs"
+- "Produce the final impact assessment report"
+---
+## Prerequisites
+At least one upstream workflow must have been completed. The following files should exist:
+- `parameter_manifest.yaml`
+- `decision_log.md`
+- `performance_metrics.csv` or equivalent results tables
+- Key figures (maps, plots)
+---
+## Steps
+### Step 1 — Read Reproducibility Package
+- Load `parameter_manifest.yaml` and `decision_log.md`
+- Extract: data sources, software versions, all parameters, key decisions
+### Step 2 — Inventory Analytical Outputs
+- List all result tables, maps, and figures
+- Confirm each output is traceable to its generating step
+### Step 3 — Synthesise Methods Section
+- Write Methods following the relevant reporting standard:
+  - SDM: ODMAP protocol
+  - Occupancy: standard unmarked/PRESENCE reporting
+  - Impact assessment: BACI reporting guidelines
+  - Community ecology: vegan + PERMANOVA conventions
+- Include: data sources, cleaning steps, modeling approach, validation strategy, uncertainty quantification
+### Step 4 — Synthesise Results Section
+- Present results in order: data summary → model performance → main findings → uncertainty
+- Integrate figures and tables with cross-references
+- State statistical findings with: estimate, CI, test statistic, p-value, effect size
+### Step 5 — Write Discussion Outline
+- Interpret main findings in ecological context
+- Discuss model limitations and uncertainties
+- Suggest management implications (if applicable)
+- Propose follow-up studies
+### Step 6 — Final Checklist
+- [ ] All figures labelled and captioned
+- [ ] All tables numbered and titled
+- [ ] All statistical values reported completely (estimate, CI, test stat, p, effect size)
+- [ ] Data and code availability statement included
+- [ ] Software citations included
+- [ ] Report cross-checks against `parameter_manifest.yaml` complete
+---
+## Report Template Structure
+```
+1. Introduction
+2. Methods
+   2.1 Study area
+   2.2 Data sources
+   2.3 Data preparation
+   2.4 Analysis approach
+   2.5 Validation and uncertainty
+3. Results
+   3.1 Data summary
+   3.2 Model performance
+   3.3 Main findings
+   3.4 Uncertainty
+4. Discussion
+5. Conclusions
+6. References
+Appendix: Reproducibility package
+```
+---
+## Reporting Standards Reference
+| Analysis type | Standard |
+|---------------|---------|
+| SDM | ODMAP (Zurell et al. 2020, Ecography) |
+| Occupancy | MacKenzie et al. (2018) textbook conventions |
+| Community ecology | vegan documentation; Oksanen et al. |
+| Impact assessment | CEQ NEPA guidelines or local equivalent |
+| Ecosystem services | IPBES assessment framework |
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| Analysis not reproducible from params.yaml alone | Missing parameter documentation | Complete params.yaml before finalising report; add all missing parameters and re-run the analysis |
+| Figures generated with different software versions | Version lock not enforced | Record all package versions in session_info.txt; use renv/conda lockfile for environment reproducibility |
+| Statistical results change with different random seed | Stochastic component not fixed | Set and document seed in params.yaml; report sensitivity of results to seed variation |
+| External reviewer cannot reproduce key figure | Code sharing incomplete | Ensure all scripts are included; verify all input data paths are documented and data is accessible |
+| Report references unpublished dataset | Citation incomplete for peer review | Obtain dataset DOI (Zenodo, Dryad, GBIF) or write data availability statement before submission |
+| Methods section length exceeds journal limit | Too much methodological detail in main text | Move detailed parameters to Supplementary Methods; use ODMAP table format for SDMs |
+| Discussion cites model predictions without uncertainty | Overconfident interpretation | Always pair predictions with uncertainty estimates; use conditional language ("model suggests", "under assumptions") |
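
Reviewer note on the workflow above: the Step 6 checklist item on complete statistical reporting lends itself to a mechanical check. The following is a minimal Python sketch, not part of the package. The field names (`estimate`, `ci_lower`, `ci_upper`, `test_statistic`, `p_value`, `effect_size`) and the example values are hypothetical, chosen only to mirror the five items listed in Step 4.

```python
# Hypothetical field names mirroring Step 4: estimate, CI, test statistic,
# p-value, effect size. They are not defined anywhere in the package.
REQUIRED_FIELDS = (
    "estimate", "ci_lower", "ci_upper",
    "test_statistic", "p_value", "effect_size",
)


def missing_report_fields(result):
    """Return the required reporting fields that are absent or None."""
    return [field for field in REQUIRED_FIELDS if result.get(field) is None]


def incomplete_results(results):
    """Map each result name to its missing fields; complete results are omitted."""
    report = {}
    for name, result in results.items():
        missing = missing_report_fields(result)
        if missing:
            report[name] = missing
    return report


# Made-up values for illustration only.
example_results = {
    "elevation_effect": {
        "estimate": 0.42, "ci_lower": 0.10, "ci_upper": 0.74,
        "test_statistic": 2.6, "p_value": 0.009, "effect_size": 0.31,
    },
    "season_effect": {
        "estimate": -0.15, "ci_lower": -0.40, "ci_upper": 0.10,
        "p_value": 0.24,
    },
}

print(incomplete_results(example_results))
```

An empty dictionary from `incomplete_results` would correspond to ticking the "statistical values reported completely" box; anything else names the result and the fields still to be filled in.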

package/workflows/run-camera-trap-occupancy/WORKFLOW.md ADDED

@@ -0,0 +1,87 @@
+# Workflow: run-camera-trap-occupancy
+**Purpose:** Process camera trap data into detection histories and estimate occupancy with imperfect detection
+**Skills:** ecological-data-foundation → camera-trap-processing → occupancy-and-detection → model-validation-and-uncertainty → reproducible-ecology-pipeline
+---
+## Trigger
+Invoke when the user has camera trap image records and wants to estimate species occupancy accounting for imperfect detection.
+**Example prompts:**
+- "Process my camera trap data and estimate jaguar occupancy in the study area"
+- "Build detection histories from camera trap records and run an occupancy model"
+- "Estimate puma occupancy from camera trap surveys across 40 stations"
+---
+## Steps
+### Step 1 — ecological-data-foundation
+- Validate camera station metadata (GPS coordinates, deployment/retrieval dates)
+- Clean species identification records; resolve taxonomy
+- Flag stations with incomplete deployment periods
+- Output: `stations_clean.csv`, `records_clean.csv`, `qa_report.md`
+### Step 2 — camera-trap-processing
+- Define independence threshold (default 30 min; adjust per taxon)
+- Generate record table with independent detection events
+- Build camera operation matrix (active/inactive per station per day)
+- Construct detection history matrix (sites x occasions)
+- Compute trap effort summary and RAI (relative abundance index)
+- Output: `record_table.csv`, `detection_history.csv`, `camera_operation.csv`, `trap_effort.csv`
+### Step 3 — occupancy-and-detection
+- Fit null model psi(.), p(.)
+- Fit candidate models with site covariates (habitat, elevation, distance to water) and detection covariates (effort, season, camera model)
+- Goodness-of-fit: MacKenzie-Bailey chi-squared (parametric bootstrap)
+- Model selection by AICc (or QAICc if c-hat > 1.5)
+- Report psi and p with 95% CIs
+- Output: `model_selection_table.csv`, `occupancy_estimates.csv`, `gof_report.md`
+### Step 4 — model-validation-and-uncertainty
+- Report goodness-of-fit and c-hat
+- Assess sensitivity to independence threshold choice
+- Compute minimum surveys needed for target detection power
+- Output: `validation_report.md`, `power_analysis.csv`
+### Step 5 — reproducible-ecology-pipeline
+- Document independence threshold justification
+- Log candidate model rationale and closure assumption
+- Capture software environment (R/camtrapR/unmarked versions)
+- Output: `parameter_manifest.yaml`, `decision_log.md`, `reproducibility_checklist.md`
+---
+## Expected Deliverables
+- Detection history matrix (sites x occasions)
+- Camera operation and trap effort summaries
+- Occupancy estimate (psi) with 95% CI
+- Detection estimate (p) with 95% CI
+- Model selection table (AICc, delta-AICc, weights)
+- Covariate effect estimates
+- Reproducibility package
+---
+## Minimum Data Requirements
+- >= 20 camera stations with >= 3 survey occasions each
+- Station metadata with GPS coordinates and deployment dates
+- Species identification for target species
+- >= 5 independent detections of target species across stations
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| RAI = 0 for target species | Species not detected at any station | Cannot fit occupancy model; report non-detection; consider expanding survey effort |
+| < 5 detections across all stations | Extremely low detection rate | Occupancy estimates will be unreliable; report with strong caveats; consider pooling occasions |
+| Independence threshold changes results by > 20% | Sensitivity to threshold choice | Report results at multiple thresholds (15, 30, 60 min); justify final choice in decision log |
+| Camera operation < 70% of planned effort | High station failure rate | Exclude stations with < 14 active days; report effective vs planned effort |
+| GoF test fails (p < 0.05) | Closure violation or unmodelled heterogeneity | Shorten occasion length to reduce closure violations; add detection covariates |
+| Naive occupancy = 1.0 | Species detected at every station | Occupancy model unnecessary; report naive occupancy; focus on abundance or activity patterns |
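
Reviewer note on Step 2 above: the independence threshold can be illustrated with a short filter. This is a sketch under stated assumptions, not the package's or camtrapR's implementation. Records are assumed to be `(station, species, timestamp)` tuples, and each record is compared with the last record that was kept for the same station and species (comparing with the immediately preceding record is an equally common convention and gives different counts). The timestamps are invented.

```python
from datetime import datetime, timedelta


def independent_events(records, threshold_minutes=30):
    """Keep records at least `threshold_minutes` after the last kept record
    of the same station and species.

    records: iterable of (station, species, datetime) tuples (assumed layout).
    """
    threshold = timedelta(minutes=threshold_minutes)
    last_kept = {}  # (station, species) -> timestamp of last independent event
    kept = []
    for station, species, ts in sorted(records):
        key = (station, species)
        if key not in last_kept or ts - last_kept[key] >= threshold:
            kept.append((station, species, ts))
            last_kept[key] = ts
    return kept


# Invented example records.
example_records = [
    ("A", "Panthera onca", datetime(2024, 1, 1, 10, 0)),
    ("A", "Panthera onca", datetime(2024, 1, 1, 10, 10)),
    ("A", "Panthera onca", datetime(2024, 1, 1, 10, 35)),
    ("A", "Panthera onca", datetime(2024, 1, 1, 11, 20)),
    ("B", "Panthera onca", datetime(2024, 1, 1, 10, 5)),
]

for minutes in (15, 30, 60):
    print(minutes, len(independent_events(example_records, minutes)))
```

Running the filter at 15, 30 and 60 minutes, as in the loop at the end, is the comparison the Decision Points table asks for when results are sensitive to the threshold.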

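Reviewer note on the camera-trap workflow above: the step from a camera operation matrix to a detection history (sites x occasions) can be sketched as follows. The data layout is an assumption made for illustration: day indices are 0-based, `active_days` and `detection_days` are dictionaries keyed by station, and an occasion in which the camera was never active is recorded as `None` (the analogue of `NA`). A trailing partial occasion is dropped. None of these names come from the package.

```python
def detection_history(active_days, detection_days, n_days, occasion_length=7):
    """Collapse daily camera data into occasions.

    active_days:    dict station -> set of day indices with the camera running
    detection_days: dict station -> set of day indices with >= 1 independent detection
    Returns dict station -> list of 1 (detected), 0 (not detected), None (not surveyed).
    """
    n_occasions = n_days // occasion_length  # trailing partial occasion is dropped
    history = {}
    for station, active in active_days.items():
        detected = detection_days.get(station, set())
        row = []
        for k in range(n_occasions):
            days = range(k * occasion_length, (k + 1) * occasion_length)
            if not any(d in active for d in days):
                row.append(None)  # camera inactive for the whole occasion
            elif any(d in detected and d in active for d in days):
                row.append(1)
            else:
                row.append(0)
        history[station] = row
    return history


# Invented example: S1 ran for 21 days, S2 failed after the first week.
example_active = {"S1": set(range(21)), "S2": set(range(7))}
example_detections = {"S1": {2, 15}}
example_history = detection_history(example_active, example_detections, n_days=21)

# Naive occupancy: share of stations with at least one detection.
naive_occupancy = sum(
    1 for row in example_history.values() if 1 in row
) / len(example_history)
print(example_history, naive_occupancy)
```

Keeping unsurveyed occasions as missing values rather than zeros matters for the occupancy model, because a zero is read as "surveyed, not detected".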
package/workflows/run-conservation-prioritization/WORKFLOW.md ADDED

@@ -0,0 +1,89 @@
+# Workflow: run-conservation-prioritization
+**Purpose:** Identify priority areas for conservation using systematic planning tools (prioritizr/Marxan)
+**Skills:** ecological-data-foundation → geoprocessing-for-ecology → species-distribution-modeling → spatial-prioritization → reproducible-ecology-pipeline
+---
+## Trigger
+Invoke when the user wants to design a reserve network, identify priority areas for conservation, evaluate representation targets, or assess existing protected area coverage.
+**Example prompts:**
+- "Run a spatial prioritization for 50 species in the Atlantic Forest with 30% targets"
+- "Design a reserve network using prioritizr to meet representation targets"
+- "Evaluate which areas should be added to the protected area network for [region]"
+---
+## Steps
+### Step 1 — ecological-data-foundation
+- Validate species distribution data (rasters or occurrence records)
+- Check planning unit layer (grid or irregular polygons)
+- Validate cost layer and locked-in/locked-out constraint layers
+- Output: `species_data_clean/`, `planning_units.shp`, `cost_layer.tif`, `qa_report.md`
+### Step 2 — geoprocessing-for-ecology
+- Reproject all layers to common equal-area CRS
+- Align raster resolutions across species distributions and cost surface
+- Clip to study area extent
+- Rasterise planning units if needed
+- Output: `planning_units_raster.tif`, `species_stack.tif`, `cost_aligned.tif`
+### Step 3 — species-distribution-modeling (conditional)
+- If species distributions are not yet available as rasters, fit SDMs
+- Generate binary suitability maps for each species
+- Stack all species distributions into a single raster stack
+- Output: `species_binary_stack.tif`, `sdm_report.md`
+### Step 4 — spatial-prioritization
+- Define representation targets (e.g., 30% of each species' range)
+- Set up minimum-set or maximum-coverage problem formulation
+- Apply locked-in constraints (existing protected areas) and locked-out constraints
+- Calibrate boundary length modifier (BLM) for spatial compactness
+- Solve using ILP solver (HiGHS via prioritizr)
+- Compute irreplaceability (selection frequency across near-optimal solutions)
+- Run target sensitivity analysis (20%, 30%, 50%)
+- Output: `solution_map.tif`, `representation_table.csv`, `irreplaceability_map.tif`, `blm_calibration.csv`, `sensitivity_report.md`
+### Step 5 — reproducible-ecology-pipeline
+- Document target justification and data sources
+- Log solver parameters and BLM choice
+- Record constraint rationale (locked-in/out areas)
+- Output: `parameter_manifest.yaml`, `decision_log.md`, `reproducibility_checklist.md`
+---
+## Expected Deliverables
+- Optimal reserve solution map
+- Feature representation summary (% target met per species)
+- Irreplaceability map (selection frequency)
+- BLM calibration curve (cost vs boundary length)
+- Target sensitivity analysis
+- Cost efficiency summary
+- Reproducibility package
+---
+## Minimum Data Requirements
+- Distribution data (rasters or occurrence records) for >= 5 species/features
+- Planning unit layer covering study area
+- Cost layer (opportunity cost or area-based)
+- Representation targets (default: 30% per feature)
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| Problem infeasible (no solution found) | Targets exceed available habitat | Reduce targets or expand study area; report which species are infeasible |
+| > 50% of planning units selected | Targets too high or cost surface too uniform | Review target realism; consider log-linear targets scaled by range size |
+| BLM calibration shows no elbow | Trade-off between cost and compactness is linear | Use BLM = 0 (no compactness penalty) and report fragmented solution; justify ecologically |
+| Some species at 0% representation | Species range falls outside planning units or data gap | Add occurrence data or expand study area; flag unrepresented species |
+| Locked-in areas already meet all targets | Existing protected areas are sufficient | Report as positive finding; analyse gap for underrepresented species |
+| Irreplaceability > 0.9 for specific planning units | Critical areas with no substitutes | Flag as highest priority for immediate protection |
+| Solution changes > 40% with +/- 10% target shift | High sensitivity to target choice | Report portfolio of solutions across target range; do not present single solution as definitive |
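
Reviewer note on the prioritization workflow above: the feature representation summary (% target met per species) is simple arithmetic once a solution exists. The sketch below assumes a hypothetical layout in which each feature maps planning-unit IDs to the amount of that feature in the unit. It only evaluates a given solution; it does not solve the ILP, which the workflow delegates to prioritizr. All names and numbers are illustrative.

```python
def representation_table(feature_amounts, selected_units, target=0.30):
    """Summarise how much of each feature a solution holds.

    feature_amounts: dict feature -> dict planning_unit_id -> amount (assumed layout)
    selected_units:  set of planning unit IDs in the solution
    target:          proportion of each feature's total to be represented
    """
    table = {}
    for feature, amounts in feature_amounts.items():
        total = sum(amounts.values())
        held = sum(a for unit, a in amounts.items() if unit in selected_units)
        proportion = held / total if total > 0 else 0.0
        table[feature] = {
            "proportion_held": proportion,
            "target_met": total > 0 and proportion >= target,
        }
    return table


# Invented amounts (e.g. km2 of suitable habitat per planning unit).
example_features = {
    "sp_a": {1: 10.0, 2: 10.0, 3: 20.0},
    "sp_b": {2: 5.0, 4: 15.0},
}
example_table = representation_table(example_features, selected_units={2, 3})
for feature, row in example_table.items():
    print(feature, row)
```

Features with `proportion_held` equal to zero are the ones the Decision Points table asks to flag as unrepresented.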

package/workflows/run-multispecies-screening/WORKFLOW.md ADDED

@@ -0,0 +1,197 @@
+# Workflow: run-multispecies-screening
+Multi-species SDM screening pipeline for rapid prioritisation of conservation targets.
+---
+## Trigger Phrases
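+- "screening de [N] espécies" (Portuguese: "screening of [N] species")
+- "lista de espécies ameaçadas" (Portuguese: "list of threatened species")
+- "triagem de risco" (Portuguese: "risk screening")
+- "priorização de espécies para modelagem" (Portuguese: "prioritisation of species for modelling")
+- "quais espécies têm área adequada reduzida" (Portuguese: "which species have reduced suitable area")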
+- "screen [N] species for distribution modeling"
+---
+## Skills Used
+`1 → 2 → 4 → 6 → 5`
+| Step | Skill |
+|---|---|
+| Data download + cleaning | `ecological-data-foundation` (skill 1) |
+| Spatial operations | `geoprocessing-for-ecology` (skill 2) |
+| Modeling best practices | `predictive-modeling-best-practices` (skill 4) |
+| SDM | `species-distribution-modeling` (skill 6) |
+| Validation | `model-validation-and-uncertainty` (skill 5) |
+---
+## Inputs
+| Input | Format | Required | Description |
+|---|---|---|---|
+| `species_list.csv` | CSV with column `scientificName` | Yes | List of species to screen |
+| `predictor_stack.tif` | Multi-band GeoTIFF | Yes | Environmental predictor layers |
+| `study_area.shp` | Shapefile / GeoJSON | Yes | Geographic extent for projections |
+| `n_min_occurrences` | Integer (default: 30) | No | Minimum cleaned records required |
+| `auc_threshold` | Numeric 0–1 (default: 0.75) | No | Minimum AUC for "adequate model" |
+| `suitable_area_threshold_km2` | Numeric (default: 50000) | No | Area below which species flagged as high priority |
+---
+## Pipeline — Loop per Species
+For each `scientificName` in `species_list.csv`:
+### Step 1 — Download + Cleaning
+**Skill:** `ecological-data-foundation`
+- Download occurrences from GBIF using `download_from_gbif.R` or `.py`
+- Apply standard cleaning: coordinate flags, duplicates, spatial thinning (one record per grid cell)
+- Record `n_raw` and `n_clean`
+### Step 2 — Data Sufficiency Check
+```
+IF n_clean < n_min_occurrences:
+    → flag species as "dados_insuficientes" = TRUE
+    → write row to screening_summary.csv with flag
+    → SKIP to next species
+```
+### Step 3 — Quick Calibration
+**Skill:** `species-distribution-modeling`
+- Use fixed quick-calibration settings: RM = 1, FC = "LQ"
+- Run `sdm_pipeline.py` or `run_ensemble_sdm.R` with these parameters
+- No full ENMeval grid search (reserved for high-priority species in full `run-sdm-study`)
+### Step 4 — Minimum Validation
+**Skill:** `model-validation-and-uncertainty`
+- Calculate AUC (random 20% hold-out) and TSS
+- Run `validate_model.py` or `validate_sdm.R`
+- Flag model as `model_adequate = AUC >= auc_threshold`
+### Step 5 — Extrapolation Risk Check
+- Run `extrapolation_risk.R` comparing training stack vs. full study area projection
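+- Record `pct_mop_zero` (share of area with MOP = 0) and `pct_mop_low` (share of area with MOP < 0.25), expressed as proportions (0–1) so that the 0.40 cutoff corresponds to 40% of the area
+- Flag `severe_extrapolation = pct_mop_zero > 0.40`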
+---
+## Per-Species Outputs
+All written to `output/{scientificName}/`:
+| File | Description |
+|---|---|
+| `suitability_current.tif` | Continuous suitability map (0–1) |
+| `binary_map.tif` | Binary presence/absence map (threshold = max TSS) |
+| `metrics.csv` | AUC, TSS, n_clean, suitable_area_km2, pct_mop_zero |
+| `mop_layer.tif` | MOP extrapolation layer |
+---
+## Consolidated Output
+**File:** `output/screening_summary.csv`
+| Column | Description |
+|---|---|
+| `scientificName` | Species name |
+| `n_raw` | Records before cleaning |
+| `n_clean` | Records after cleaning + thinning |
+| `dados_insuficientes` | TRUE if n_clean < n_min_occurrences (flag name is Portuguese for "insufficient data") |
+| `AUC` | Model AUC (hold-out) |
+| `TSS` | Model TSS |
+| `suitable_area_km2` | Current suitable area (binary map) |
+| `pct_mop_zero` | % projection area with MOP = 0 |
+| `severe_extrapolation` | TRUE if pct_mop_zero > 0.40 |
+| `model_adequate` | TRUE if AUC >= auc_threshold |
+| `priority` | Alta / Média / Baixa (High / Medium / Low; see criteria below) |
+---
+## Priority Classification Criteria
+```
+Alta (High) priority:
+  suitable_area_km2 < suitable_area_threshold_km2
+  AND AUC >= auc_threshold
+  AND dados_insuficientes == FALSE
+  AND severe_extrapolation == FALSE
+Média (Medium) priority:
+  (suitable_area_km2 >= suitable_area_threshold_km2 AND AUC >= auc_threshold)
+  OR (severe_extrapolation == TRUE AND AUC >= auc_threshold)
+Baixa (Low) priority:
+  AUC < auc_threshold (model unreliable)
+  OR dados_insuficientes == TRUE
+```
+Species flagged `dados_insuficientes` or with `AUC < auc_threshold` should NOT be
+ranked by area — insufficient basis for reliable prioritisation.
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| AUC < 0.70 | Model has poor predictive power | Revise predictor set; do not include in priority ranking |
+| n_clean < 30 after thinning | Insufficient data for reliable SDM | Classify as "dados_insuficientes"; supplement with field surveys |
+| MOP = 0 in > 40% of projected area | Severe extrapolation | Flag as `severe_extrapolation`; restrict interpretation to calibration area |
+| > 50% of range in a single biome | Biome-specific background needed | Re-run with biome-restricted background for that species |
+| All species in list return AUC < 0.70 | Wrong predictor set or data quality issue | Check coordinate cleaning, CRS alignment, and predictor resolution |
+| No species classified as Alta priority | Thresholds too strict or genuinely no high-priority species | Adjust `suitable_area_threshold_km2` and document rationale |
+---
+## Parallelisation Note
+For lists of N > 10 species, use `future` with `furrr` for parallel processing:
+```r
+library(future)
+library(furrr)
+# Use all but one available core (or set workers = N explicitly)
+plan(multisession, workers = parallel::detectCores() - 1)
+results <- future_map(species_list, run_single_species_screen,
+                      .options = furrr_options(seed = 42))
+```
+For Python equivalents, use `concurrent.futures.ProcessPoolExecutor`.
+---
+## Deliverables
+| Deliverable | Format | Description |
+|---|---|---|
+| `screening_summary.csv` | CSV | Master table with all species metrics and priority flags |
+| `suitability_current.tif` | GeoTIFF (per species) | Continuous suitability map |
+| `binary_map.tif` | GeoTIFF (per species) | Binary presence/absence |
+| `mop_layer.tif` | GeoTIFF (per species) | Extrapolation risk layer |
+| `metrics.csv` | CSV (per species) | Individual species metrics |
+| `screening_report.md` | Markdown | Auto-generated summary with priority lists |
+---
+## Notes
+- This workflow is designed for **rapid screening**, not publication-quality SDMs.
+  High-priority species identified here should be modelled in full using `run-sdm-study`.
+- The quick calibration (RM=1, FC=LQ) will generally produce slightly over-fitted
+  models compared to full ENMeval calibration. This is acceptable for screening
+  purposes, but these quick models must not be used directly for conservation planning.
+- Always document `n_min_occurrences` and `auc_threshold` used in the screening in
+  the project's `params.yaml`.
+- Record the GBIF download DOIs for each species in `data_provenance.md`.
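
Reviewer note on the Priority Classification Criteria above: the rules translate into a small function. This is a sketch, not the package's code. The evaluation order is an assumption: the Baixa conditions are checked first so that an unreliable model is never promoted by the extrapolation rule. The function name and argument names follow the column names in `screening_summary.csv`, and the defaults follow the Inputs table.

```python
def classify_priority(suitable_area_km2, auc, dados_insuficientes,
                      severe_extrapolation, auc_threshold=0.75,
                      suitable_area_threshold_km2=50000):
    """Return 'Alta', 'Média' or 'Baixa' for one species.

    Precedence (an assumption of this sketch): Baixa, then the
    extrapolation rule, then the area rule.
    """
    # Baixa: insufficient data, or no model / unreliable model.
    if dados_insuficientes or auc is None or auc < auc_threshold:
        return "Baixa"
    # Model is adequate from here on.
    if severe_extrapolation:
        return "Média"
    if suitable_area_km2 < suitable_area_threshold_km2:
        return "Alta"
    return "Média"


# Invented example rows: (suitable_area_km2, AUC, dados_insuficientes, severe_extrapolation)
example_rows = {
    "small range, good model": (12000, 0.82, False, False),
    "small range, extrapolating": (12000, 0.82, False, True),
    "large range, good model": (80000, 0.90, False, False),
    "weak model": (12000, 0.60, False, False),
    "too few records": (None, None, True, False),
}
for label, row in example_rows.items():
    print(label, "->", classify_priority(*row))
```

Species skipped at the data sufficiency check have no AUC or area, which is why the sketch accepts `None` for those arguments.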

package/workflows/run-occupancy-analysis/WORKFLOW.md ADDED

@@ -0,0 +1,74 @@
+# Workflow: run-occupancy-analysis
+**Purpose:** Estimate species occupancy and detection probability from repeated survey data
+**Skills:** ecological-data-foundation → biostatistics-workbench → occupancy-and-detection → model-validation-and-uncertainty → reproducible-ecology-pipeline
+---
+## Trigger
+Invoke when the user has repeated presence/absence survey data and wants to estimate occupancy while accounting for imperfect detection.
+**Example prompts:**
+- "Analyse camera trap data to estimate jaguar occupancy"
+- "Run a single-season occupancy model for [species] using repeated point counts"
+- "How does occupancy probability vary with habitat quality after accounting for detection?"
+---
+## Steps
+### Step 1 — ecological-data-foundation
+- Validate detection history matrix (sites × occasions)
+- Check site and observation covariate tables
+- Flag sites with all-zero histories (never detected)
+- Output: `detection_history.csv`, `site_covariates.csv`, `obs_covariates.csv`
+### Step 2 — biostatistics-workbench
+- Check covariate distributions and standardise continuous predictors
+- Assess collinearity among site and observation covariates
+- Output: `covariate_summary.csv`, `collinearity_report.csv`
+### Step 3 — occupancy-and-detection
+- Fit null model (ψ(.), p(.))
+- Fit candidate model set based on a priori hypotheses
+- Goodness-of-fit: MacKenzie-Bailey χ² (parametric bootstrap)
+- Model selection by AICc (or QAICc if ĉ > 1.5)
+- Report ψ and p with 95% CIs
+- Output: `model_selection_table.csv`, `occupancy_estimates.csv`, `gof_report.md`
+### Step 4 — model-validation-and-uncertainty
+- Report goodness-of-fit and ĉ
+- Assess sensitivity to number of survey occasions
+- Compute minimum surveys needed for target power
+- Output: `validation_report.md`, `power_analysis.csv`
+### Step 5 — reproducible-ecology-pipeline
+- Document closure assumption justification
+- Log candidate model rationale
+- Capture R session and unmarked version
+- Output: `parameter_manifest.yaml`, `software_environment.txt`
+---
+## Expected Deliverables
+- Occupancy estimate (ψ) with 95% CI
+- Detection estimate (p) with 95% CI
+- Model selection table (AICc, ΔAICc, weights)
+- Covariate effect estimates
+- Goodness-of-fit results
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| GoF test fails (p < 0.05) | Violation of closure assumption or unmodelled heterogeneity | Add detection covariate; consider mixture models (heterogeneity in detection); verify closure window |
+| naive_p > 0.5 but estimated p̂ < 0.1 | Multicollinearity in detection covariates destabilising estimates | Check pairwise correlation among detection covariates; reduce model; use VIF screening |
+| AIC-best model has ψ̂ ≈ 1.0 with high SE | Numerical convergence issue (perfect detection or near-saturation) | Simplify model; check for sites with detection at every occasion; constrain starting values |
+| k (number of surveys) < 3 | Low power to separate ψ from p | Report power analysis alongside estimates; collect more survey occasions in future |
+| Detection probability varies strongly by observer | Unmodelled observer effect | Include observer ID as categorical detection covariate |
+| ĉ > 1.5 (overdispersion) | Extra-binomial variance; GoF indicates poor fit | Use QAICc instead of AICc for model selection; report ĉ in the Methods section |
+| All sites have detection at every occasion | 100% naive occupancy | Consider whether closure is violated or species is truly ubiquitous; occupancy model may be unnecessary |
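
Reviewer note on Step 3 above: the switch from AICc to QAICc when ĉ > 1.5 can be written out explicitly. The sketch uses the standard small-sample formulas. Two choices are assumptions of this sketch rather than statements in the workflow: the number of sites is used as the effective sample size, and one extra parameter is counted for estimating ĉ in QAICc. The log-likelihoods and model names are invented.

```python
import math


def aicc(log_lik, n_params, n_sites):
    """AICc = -2 logL + 2K + 2K(K + 1) / (n - K - 1)."""
    k = n_params
    if n_sites - k - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n_sites - k - 1)


def qaicc(log_lik, n_params, n_sites, c_hat):
    """QAICc = -2 logL / c_hat + 2K + 2K(K + 1) / (n - K - 1).

    K includes one extra parameter for c_hat (assumed convention).
    """
    k = n_params + 1
    if n_sites - k - 1 <= 0:
        raise ValueError("sample size too small for the QAICc correction")
    return -2.0 * log_lik / c_hat + 2.0 * k + (2.0 * k * (k + 1)) / (n_sites - k - 1)


def akaike_weights(scores):
    """Convert a dict of model -> (Q)AICc into Akaike weights."""
    best = min(scores.values())
    relative = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


# Invented log-likelihoods for two candidate models fitted to 40 sites.
example_scores = {
    "psi(.)p(.)": aicc(-100.0, 2, 40),
    "psi(habitat)p(.)": aicc(-97.5, 3, 40),
}
example_weights = akaike_weights(example_scores)
print(example_scores, example_weights)
```

The same `akaike_weights` function applies unchanged to QAICc scores, provided every model in the set is scored with the same ĉ.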

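Reviewer note on Step 4 of the occupancy workflow above: one common reading of "minimum surveys needed for target power" is the number of occasions K at which the cumulative detection probability 1 - (1 - p)^K reaches a target, given the species is present. That reading is an assumption here, since the workflow does not define power, and the function names are hypothetical.

```python
import math


def cumulative_detection(p, k):
    """Probability of at least one detection in k surveys at an occupied site."""
    return 1.0 - (1.0 - p) ** k


def min_surveys(p, target=0.95):
    """Smallest k with cumulative detection probability >= target."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


for p in (0.1, 0.3, 0.5):
    k = min_surveys(p)
    print(p, k, round(cumulative_detection(p, k), 3))
```

With a per-survey detection probability of 0.3, for example, eight surveys fall just short of a 0.95 target and nine reach it, which illustrates why k < 3 is listed as a low-power condition in the Decision Points table.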
package/workflows/run-population-viability/WORKFLOW.md ADDED

@@ -0,0 +1,90 @@
+# Workflow: run-population-viability
+**Purpose:** Project population trajectories and estimate extinction risk using matrix population models
+**Skills:** ecological-data-foundation → biostatistics-workbench → population-viability-analysis → model-validation-and-uncertainty → reproducible-ecology-pipeline
+---
+## Trigger
+Invoke when the user wants to project population growth, estimate extinction probability, compute lambda/elasticity, or evaluate a species against IUCN Criterion E.
+**Example prompts:**
+- "Run a PVA for African elephants using a Lefkovitch matrix"
+- "Estimate extinction probability for [species] over the next 100 years"
+- "Compute lambda and elasticity for a population with 4 life stages"
+---
+## Steps
+### Step 1 — ecological-data-foundation
+- Validate vital rate data (survival and fecundity per stage or age class)
+- Check demographic data sources (mark-recapture, census, literature)
+- Flag missing stages or implausible rates (survival > 1, negative fecundity)
+- Output: `vital_rates_clean.csv`, `demographic_sources.md`, `qa_report.md`
+### Step 2 — biostatistics-workbench
+- Assess variance in vital rate estimates
+- Compute coefficient of variation (CV) for each rate
+- Check for temporal autocorrelation in demographic time series (if available)
+- Output: `vital_rate_summary.csv`, `cv_report.csv`
+### Step 3 — population-viability-analysis
+- Construct Leslie (age-based) or Lefkovitch (stage-based) projection matrix
+- Compute deterministic lambda, stable stage distribution, reproductive value
+- Compute sensitivity and elasticity matrices
+- Run stochastic PVA (Monte Carlo, >= 1000 simulations):
+  - Beta distribution for survival rates
+  - Lognormal distribution for fecundity rates
+  - Optional: catastrophe scenarios, density dependence
+- Estimate extinction probability at quasi-extinction threshold
+- Classify against IUCN Criterion E thresholds
+- Output: `lambda_summary.csv`, `elasticity_matrix.csv`, `pva_trajectories.csv`, `extinction_curve.csv`, `iucn_criterion_e.md`
+### Step 4 — model-validation-and-uncertainty
+- Sensitivity analysis: vary vital rates +/- 10% individually
+- Compare deterministic vs stochastic lambda
+- Assess effect of catastrophe frequency on extinction risk
+- Report confidence intervals on extinction probability
+- Output: `sensitivity_report.md`, `validation_report.md`
+### Step 5 — reproducible-ecology-pipeline
+- Document vital rate sources with citations
+- Log matrix structure and parameterisation choices
+- Record simulation parameters (n_sims, time horizon, quasi-extinction threshold)
+- Output: `parameter_manifest.yaml`, `decision_log.md`, `reproducibility_checklist.md`
+---
+## Expected Deliverables
+- Projection matrix with lambda, sensitivity, and elasticity
+- Stochastic PVA trajectories (median + 95% CI)
+- Extinction probability curve over time horizon
+- IUCN Criterion E classification
+- Elasticity-based management recommendations
+- Reproducibility package
+---
+## Minimum Data Requirements
+- Vital rates (survival + fecundity) for >= 2 life stages
+- Initial population size or stage distribution
+- Quasi-extinction threshold (default: 50 individuals)
+- Time horizon (default: 100 years)
+---
+## Decision Points
+| Condition | Diagnosis | Recommended Action |
+|---|---|---|
+| Deterministic lambda < 1.0 | Population declining without stochasticity | Report decline rate; identify which vital rate has highest elasticity for management |
+| CV of any vital rate > 0.3 | High variability in vital rates | Use stochastic PVA as primary result; do not rely on deterministic lambda alone |
+| Extinction probability >= 0.10 in 100 years | Meets IUCN Criterion E (Vulnerable threshold) | Report IUCN classification; identify management levers via elasticity analysis |
+| Elasticity concentrated in one rate (> 0.6) | Population growth dominated by single vital rate | Focus conservation action on that rate; run targeted management scenarios |
+| Catastrophe probability unknown | Cannot parameterise rare events | Run scenarios at 0%, 5%, 10% catastrophe frequency; report range |
+| n_stages < 3 with available data for more | Matrix oversimplified | Expand matrix to match available data resolution; justify aggregation if used |
+| Stochastic lambda >> deterministic lambda | Possible parameterisation error or Jensen's inequality effect | Verify distribution choices; report both values and explain discrepancy |
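
Reviewer note on Step 3 of the PVA workflow above: deterministic lambda, the stable stage distribution and the elasticity matrix can be obtained from a projection matrix with plain power iteration. The sketch below is dependency-free Python for illustration only; it assumes a primitive (eventually all-positive) matrix, and the two-stage matrix is invented rather than taken from any dataset. It does not cover the stochastic simulation.

```python
def dominant_eigen(matrix, iterations=1000, tol=1e-12):
    """Power iteration: return (lambda, eigenvector scaled to sum to 1)."""
    n = len(matrix)
    vec = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        new_lam = sum(new)  # valid growth estimate because vec sums to 1
        new = [x / new_lam for x in new]
        converged = abs(new_lam - lam) < tol
        vec, lam = new, new_lam
        if converged:
            break
    return lam, vec


def elasticity_matrix(matrix):
    """e_ij = (a_ij / lambda) * v_i * w_j / <v, w>."""
    n = len(matrix)
    lam, w = dominant_eigen(matrix)  # right eigenvector: stable stage distribution
    transposed = [[matrix[j][i] for j in range(n)] for i in range(n)]
    _, v = dominant_eigen(transposed)  # left eigenvector: reproductive values
    vw = sum(v[i] * w[i] for i in range(n))
    return [
        [matrix[i][j] * v[i] * w[j] / (lam * vw) for j in range(n)]
        for i in range(n)
    ]


# Invented two-stage Lefkovitch matrix (juvenile, adult):
# row 0 = juvenile stasis and adult fecundity, row 1 = maturation and adult survival.
example_matrix = [
    [0.50, 2.00],
    [0.25, 0.50],
]
example_lambda, example_stable = dominant_eigen(example_matrix)
example_elasticity = elasticity_matrix(example_matrix)
print(round(example_lambda, 4), example_stable, example_elasticity)
```

Elasticities sum to one across the matrix, which gives a quick sanity check before they are used to rank vital rates for management.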