ecological-agent-skills 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (217)
  1. package/AGENT_CONTEXT.md +191 -0
  2. package/CATALOG.md +329 -0
  3. package/LICENSE +692 -0
  4. package/README.md +347 -0
  5. package/bin/install.mjs +168 -0
  6. package/docs/comparison-with-alternatives.md +38 -0
  7. package/docs/global-examples-index.md +103 -0
  8. package/docs/repository-statistics.md +101 -0
  9. package/docs/theoretical-foundations.md +188 -0
  10. package/environment.yaml +106 -0
  11. package/examples/community/arctic_tundra_vegetation_example.md +247 -0
  12. package/examples/community/bird_landuse_example.md +63 -0
  13. package/examples/community/phytoplankton_reservoir_example.md +60 -0
  14. package/examples/community/reef_fish_indopacific_example.md +221 -0
  15. package/examples/impact/baci_road_example.md +57 -0
  16. package/examples/impact/ecosystem_services_atlantic_forest.md +83 -0
  17. package/examples/impact/forest_loss_borneo_timeseries_example.md +225 -0
  18. package/examples/occupancy/puma_camera_example.md +61 -0
  19. package/examples/occupancy/snow_leopard_himalayas_example.md +204 -0
  20. package/examples/reproducible/whittaker_biome_sdm_example.md +406 -0
  21. package/examples/sdm/anteater_cerrado_example.md +69 -0
  22. package/examples/sdm/jaguar_amazon_example.md +80 -0
  23. package/examples/sdm/koala_climate_change_example.md +170 -0
  24. package/examples/sdm/wolf_recolonization_europe_example.md +193 -0
  25. package/package.json +43 -0
  26. package/renv.lock +194 -0
  27. package/skills/SKILL_INDEX.json +1020 -0
  28. package/skills/acoustic-monitoring/SKILL.md +163 -0
  29. package/skills/acoustic-monitoring/examples/example-prompts.md +100 -0
  30. package/skills/acoustic-monitoring/examples/temperate_forest_birds_example.md +285 -0
  31. package/skills/acoustic-monitoring/resources/acoustic-indices-reference.md +93 -0
  32. package/skills/acoustic-monitoring/resources/soundscape-ecology-guide.md +90 -0
  33. package/skills/acoustic-monitoring/resources/species-id-tools-comparison.md +89 -0
  34. package/skills/acoustic-monitoring/scripts/batch_species_detection.py +360 -0
  35. package/skills/acoustic-monitoring/scripts/compute_acoustic_indices.R +235 -0
  36. package/skills/acoustic-monitoring/scripts/compute_acoustic_indices.py +374 -0
  37. package/skills/biostatistics-workbench/SKILL.md +140 -0
  38. package/skills/biostatistics-workbench/examples/example-prompts.md +39 -0
  39. package/skills/biostatistics-workbench/resources/effect-size-reference.md +81 -0
  40. package/skills/biostatistics-workbench/resources/glm-family-link-reference.md +47 -0
  41. package/skills/biostatistics-workbench/resources/test-selection-guide.md +93 -0
  42. package/skills/biostatistics-workbench/scripts/glm_pipeline.R +78 -0
  43. package/skills/biostatistics-workbench/scripts/glm_pipeline.py +210 -0
  44. package/skills/camera-trap-processing/SKILL.md +159 -0
  45. package/skills/camera-trap-processing/examples/example-prompts.md +103 -0
  46. package/skills/camera-trap-processing/examples/leopard_serengeti_example.md +231 -0
  47. package/skills/camera-trap-processing/resources/activity-patterns-reference.md +113 -0
  48. package/skills/camera-trap-processing/resources/camtrapR-workflow-guide.md +130 -0
  49. package/skills/camera-trap-processing/resources/detection-event-definition-guide.md +89 -0
  50. package/skills/camera-trap-processing/scripts/estimate_activity.R +169 -0
  51. package/skills/camera-trap-processing/scripts/process_camtrap_data.R +179 -0
  52. package/skills/camera-trap-processing/scripts/process_camtrap_data.py +192 -0
  53. package/skills/community-ecology-ordination/SKILL.md +133 -0
  54. package/skills/community-ecology-ordination/examples/example-prompts.md +35 -0
  55. package/skills/community-ecology-ordination/resources/dissimilarity-metric-guide.md +53 -0
  56. package/skills/community-ecology-ordination/resources/nmds-interpretation-guide.md +104 -0
  57. package/skills/community-ecology-ordination/scripts/__pycache__/community_analysis.cpython-311.pyc +0 -0
  58. package/skills/community-ecology-ordination/scripts/community_analysis.R +143 -0
  59. package/skills/community-ecology-ordination/scripts/community_analysis.py +231 -0
  60. package/skills/ecological-data-foundation/SKILL.md +129 -0
  61. package/skills/ecological-data-foundation/examples/example-prompts.md +40 -0
  62. package/skills/ecological-data-foundation/resources/coordinate-cleaning-flags.md +66 -0
  63. package/skills/ecological-data-foundation/resources/darwin-core-glossary.md +91 -0
  64. package/skills/ecological-data-foundation/resources/data-citation-guide.md +265 -0
  65. package/skills/ecological-data-foundation/resources/gbif-data-citation-guide.md +193 -0
  66. package/skills/ecological-data-foundation/resources/qa-checklist.md +83 -0
  67. package/skills/ecological-data-foundation/scripts/__pycache__/clean_occurrences.cpython-311.pyc +0 -0
  68. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_ebird.cpython-311.pyc +0 -0
  69. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_inat.cpython-311.pyc +0 -0
  70. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_iucn.cpython-311.pyc +0 -0
  71. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_obis.cpython-311.pyc +0 -0
  72. package/skills/ecological-data-foundation/scripts/clean_occurrences.R +230 -0
  73. package/skills/ecological-data-foundation/scripts/clean_occurrences.py +268 -0
  74. package/skills/ecological-data-foundation/scripts/download_from_ebird.R +251 -0
  75. package/skills/ecological-data-foundation/scripts/download_from_ebird.py +364 -0
  76. package/skills/ecological-data-foundation/scripts/download_from_gbif.R +315 -0
  77. package/skills/ecological-data-foundation/scripts/download_from_gbif.py +407 -0
  78. package/skills/ecological-data-foundation/scripts/download_from_inat.R +238 -0
  79. package/skills/ecological-data-foundation/scripts/download_from_inat.py +304 -0
  80. package/skills/ecological-data-foundation/scripts/download_from_iucn.R +273 -0
  81. package/skills/ecological-data-foundation/scripts/download_from_iucn.py +344 -0
  82. package/skills/ecological-data-foundation/scripts/download_from_obis.R +248 -0
  83. package/skills/ecological-data-foundation/scripts/download_from_obis.py +318 -0
  84. package/skills/ecological-impact-assessment/SKILL.md +123 -0
  85. package/skills/ecological-impact-assessment/examples/example-prompts.md +32 -0
  86. package/skills/ecological-impact-assessment/resources/baci-design-guide.md +55 -0
  87. package/skills/ecological-impact-assessment/resources/fragmentation-metrics-reference.md +86 -0
  88. package/skills/ecological-impact-assessment/resources/pressure-index-template.md +78 -0
  89. package/skills/ecological-impact-assessment/resources/study-design-guide.md +168 -0
  90. package/skills/ecological-impact-assessment/scripts/baci_analysis.R +161 -0
  91. package/skills/ecological-impact-assessment/scripts/fragmentation_analysis.py +141 -0
  92. package/skills/ecological-impact-assessment/scripts/power_analysis_baci.R +274 -0
  93. package/skills/ecosystem-services-assessment/SKILL.md +125 -0
  94. package/skills/ecosystem-services-assessment/examples/example-prompts.md +24 -0
  95. package/skills/ecosystem-services-assessment/resources/es-indicator-reference.md +45 -0
  96. package/skills/ecosystem-services-assessment/resources/invest-parameter-guide.md +86 -0
  97. package/skills/ecosystem-services-assessment/resources/rusle-coefficients.md +88 -0
  98. package/skills/ecosystem-services-assessment/scripts/__pycache__/compute_es.cpython-311.pyc +0 -0
  99. package/skills/ecosystem-services-assessment/scripts/compute_es.py +189 -0
  100. package/skills/ecosystem-services-assessment/scripts/tradeoff_analysis.R +161 -0
  101. package/skills/environmental-time-series/SKILL.md +125 -0
  102. package/skills/environmental-time-series/examples/example-prompts.md +33 -0
  103. package/skills/environmental-time-series/resources/anomaly-indices-reference.md +88 -0
  104. package/skills/environmental-time-series/resources/bfast-parameter-guide.md +69 -0
  105. package/skills/environmental-time-series/scripts/__pycache__/recovery_trajectory.cpython-311.pyc +0 -0
  106. package/skills/environmental-time-series/scripts/__pycache__/trend_analysis.cpython-311.pyc +0 -0
  107. package/skills/environmental-time-series/scripts/recovery_trajectory.R +305 -0
  108. package/skills/environmental-time-series/scripts/recovery_trajectory.py +178 -0
  109. package/skills/environmental-time-series/scripts/trend_analysis.R +192 -0
  110. package/skills/environmental-time-series/scripts/trend_analysis.py +184 -0
  111. package/skills/geoprocessing-for-ecology/SKILL.md +123 -0
  112. package/skills/geoprocessing-for-ecology/examples/example-prompts.md +32 -0
  113. package/skills/geoprocessing-for-ecology/resources/crs-reference.md +62 -0
  114. package/skills/geoprocessing-for-ecology/resources/global-predictor-sources.md +331 -0
  115. package/skills/geoprocessing-for-ecology/resources/resampling-methods.md +57 -0
  116. package/skills/geoprocessing-for-ecology/scripts/__pycache__/download_predictors.cpython-311.pyc +0 -0
  117. package/skills/geoprocessing-for-ecology/scripts/download_predictors.R +239 -0
  118. package/skills/geoprocessing-for-ecology/scripts/download_predictors.py +379 -0
  119. package/skills/geoprocessing-for-ecology/scripts/stack_and_extract.R +224 -0
  120. package/skills/geoprocessing-for-ecology/scripts/stack_and_extract.py +172 -0
  121. package/skills/landscape-connectivity/SKILL.md +170 -0
  122. package/skills/landscape-connectivity/examples/example-prompts.md +96 -0
  123. package/skills/landscape-connectivity/examples/jaguar_mesoamerica_corridor_example.md +271 -0
  124. package/skills/landscape-connectivity/resources/circuitscape-parameter-guide.md +155 -0
  125. package/skills/landscape-connectivity/resources/graph-theory-for-ecology.md +134 -0
  126. package/skills/landscape-connectivity/resources/resistance-surface-guide.md +141 -0
  127. package/skills/landscape-connectivity/scripts/connectivity_analysis.py +387 -0
  128. package/skills/landscape-connectivity/scripts/connectivity_metrics.R +274 -0
  129. package/skills/landscape-connectivity/scripts/resistance_surface.R +239 -0
  130. package/skills/model-validation-and-uncertainty/SKILL.md +131 -0
  131. package/skills/model-validation-and-uncertainty/examples/example-prompts.md +30 -0
  132. package/skills/model-validation-and-uncertainty/resources/extrapolation-risk-guide.md +236 -0
  133. package/skills/model-validation-and-uncertainty/resources/metric-selection-guide.md +52 -0
  134. package/skills/model-validation-and-uncertainty/resources/threshold-selection-guide.md +64 -0
  135. package/skills/model-validation-and-uncertainty/scripts/__pycache__/validate_model.cpython-311.pyc +0 -0
  136. package/skills/model-validation-and-uncertainty/scripts/extrapolation_risk.R +315 -0
  137. package/skills/model-validation-and-uncertainty/scripts/validate_model.py +226 -0
  138. package/skills/model-validation-and-uncertainty/scripts/validate_sdm.R +162 -0
  139. package/skills/occupancy-and-detection/SKILL.md +126 -0
  140. package/skills/occupancy-and-detection/examples/example-prompts.md +33 -0
  141. package/skills/occupancy-and-detection/resources/detection-history-format.md +100 -0
  142. package/skills/occupancy-and-detection/resources/occupancy-study-design.md +47 -0
  143. package/skills/occupancy-and-detection/scripts/__pycache__/occupancy_analysis.cpython-311.pyc +0 -0
  144. package/skills/occupancy-and-detection/scripts/occupancy_analysis.R +160 -0
  145. package/skills/occupancy-and-detection/scripts/occupancy_analysis.py +159 -0
  146. package/skills/population-viability-analysis/SKILL.md +161 -0
  147. package/skills/population-viability-analysis/examples/african_elephant_pva_example.md +266 -0
  148. package/skills/population-viability-analysis/examples/example-prompts.md +95 -0
  149. package/skills/population-viability-analysis/resources/extinction-risk-thresholds.md +128 -0
  150. package/skills/population-viability-analysis/resources/matrix-model-guide.md +139 -0
  151. package/skills/population-viability-analysis/resources/sensitivity-elasticity-reference.md +182 -0
  152. package/skills/population-viability-analysis/scripts/matrix_pva.R +258 -0
  153. package/skills/population-viability-analysis/scripts/pva_analysis.py +442 -0
  154. package/skills/population-viability-analysis/scripts/stochastic_pva.R +353 -0
  155. package/skills/predictive-modeling-best-practices/SKILL.md +136 -0
  156. package/skills/predictive-modeling-best-practices/examples/example-prompts.md +58 -0
  157. package/skills/predictive-modeling-best-practices/resources/collinearity-decision-tree.md +65 -0
  158. package/skills/predictive-modeling-best-practices/resources/sampling-bias-correction.md +267 -0
  159. package/skills/predictive-modeling-best-practices/resources/spatial-cv-guide.md +73 -0
  160. package/skills/predictive-modeling-best-practices/scripts/__pycache__/spatial_cv.cpython-311.pyc +0 -0
  161. package/skills/predictive-modeling-best-practices/scripts/collinearity_check.R +112 -0
  162. package/skills/predictive-modeling-best-practices/scripts/spatial_cv.py +182 -0
  163. package/skills/reproducible-ecology-pipeline/SKILL.md +139 -0
  164. package/skills/reproducible-ecology-pipeline/examples/example-prompts.md +35 -0
  165. package/skills/reproducible-ecology-pipeline/resources/directory-structure-template.md +94 -0
  166. package/skills/reproducible-ecology-pipeline/resources/params-yaml-template.yaml +84 -0
  167. package/skills/reproducible-ecology-pipeline/resources/reproducibility-checklist-template.md +66 -0
  168. package/skills/reproducible-ecology-pipeline/scripts/generate_file_manifest.py +110 -0
  169. package/skills/reproducible-ecology-pipeline/scripts/init_project.sh +53 -0
  170. package/skills/spatial-prioritization/SKILL.md +162 -0
  171. package/skills/spatial-prioritization/examples/biodiversity_hotspot_prioritization_example.md +289 -0
  172. package/skills/spatial-prioritization/examples/example-prompts.md +93 -0
  173. package/skills/spatial-prioritization/resources/cost-surface-reference.md +130 -0
  174. package/skills/spatial-prioritization/resources/marxan-vs-prioritizr-comparison.md +125 -0
  175. package/skills/spatial-prioritization/resources/prioritizr-formulation-guide.md +188 -0
  176. package/skills/spatial-prioritization/resources/representation-targets-guide.md +186 -0
  177. package/skills/spatial-prioritization/scripts/prioritization_sensitivity.R +320 -0
  178. package/skills/spatial-prioritization/scripts/run_prioritization.R +336 -0
  179. package/skills/species-distribution-modeling/SKILL.md +139 -0
  180. package/skills/species-distribution-modeling/examples/example-prompts.md +36 -0
  181. package/skills/species-distribution-modeling/resources/algorithm-comparison.md +25 -0
  182. package/skills/species-distribution-modeling/resources/calibration-area-guide.md +71 -0
  183. package/skills/species-distribution-modeling/resources/climate-scenario-preparation.md +170 -0
  184. package/skills/species-distribution-modeling/resources/maxent-calibration-guide.md +211 -0
  185. package/skills/species-distribution-modeling/resources/sdm-checklist.md +37 -0
  186. package/skills/species-distribution-modeling/scripts/predict_distribution.R +236 -0
  187. package/skills/species-distribution-modeling/scripts/predict_distribution.py +286 -0
  188. package/skills/species-distribution-modeling/scripts/prepare_future_layers.R +351 -0
  189. package/skills/species-distribution-modeling/scripts/project_scenarios.R +220 -0
  190. package/skills/species-distribution-modeling/scripts/run_ensemble_sdm.R +99 -0
  191. package/skills/species-distribution-modeling/scripts/sdm_pipeline.py +318 -0
  192. package/skills/species-distribution-modeling/scripts/tune_maxnet.R +344 -0
  193. package/templates/SKILL_TEMPLATE.md +225 -0
  194. package/templates/checklists/data-submission-checklist.md +38 -0
  195. package/templates/checklists/post-analysis-checklist.md +55 -0
  196. package/templates/checklists/pre-analysis-checklist.md +31 -0
  197. package/templates/prompts/debug-skill.md +47 -0
  198. package/templates/prompts/invoke-skill.md +34 -0
  199. package/templates/prompts/invoke-workflow.md +45 -0
  200. package/templates/reports/technical-report-template.md +80 -0
  201. package/templates/scripts/logger_setup.R +79 -0
  202. package/templates/scripts/logger_setup.py +119 -0
  203. package/templates/scripts/params_loader.R +28 -0
  204. package/templates/scripts/params_loader.py +38 -0
  205. package/workflows/analyze-community-structure/WORKFLOW.md +72 -0
  206. package/workflows/analyze-environmental-change/WORKFLOW.md +73 -0
  207. package/workflows/assess-ecological-impact/WORKFLOW.md +75 -0
  208. package/workflows/assess-ecosystem-services/WORKFLOW.md +68 -0
  209. package/workflows/assess-landscape-connectivity/WORKFLOW.md +84 -0
  210. package/workflows/build-fire-risk-map/WORKFLOW.md +79 -0
  211. package/workflows/produce-technical-report/WORKFLOW.md +113 -0
  212. package/workflows/run-camera-trap-occupancy/WORKFLOW.md +87 -0
  213. package/workflows/run-conservation-prioritization/WORKFLOW.md +89 -0
  214. package/workflows/run-multispecies-screening/WORKFLOW.md +197 -0
  215. package/workflows/run-occupancy-analysis/WORKFLOW.md +74 -0
  216. package/workflows/run-population-viability/WORKFLOW.md +90 -0
  217. package/workflows/run-sdm-study/WORKFLOW.md +99 -0
package/skills/biostatistics-workbench/scripts/glm_pipeline.py
@@ -0,0 +1,210 @@
+ #!/usr/bin/env python3
+ # ecological-agent-skills / Copyright (C) 2026 Francisco Diego Barros Barata
+ # SPDX-License-Identifier: GPL-3.0-or-later
+
+ """
+ glm_pipeline.py
+ Fit candidate GLMs, check assumptions, model selection.
+ Usage: python glm_pipeline.py <data_csv> <response_var> <output_dir>
+ Requires: pandas, numpy, statsmodels, scipy, matplotlib, seaborn
+ """
+ import logging
+ import sys
+ from datetime import datetime
+ from pathlib import Path
+
+ SKILL_NAME = "biostatistics-workbench"
+ _LOG_DIR = Path("logs")
+ _LOG_DIR.mkdir(parents=True, exist_ok=True)
+ _log_file = _LOG_DIR / f"skill_{SKILL_NAME}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
+ logging.basicConfig(
+     level=logging.INFO,
+     format="[%(asctime)s] [%(levelname)s] [" + SKILL_NAME + "] %(message)s",
+     datefmt="%Y-%m-%d %H:%M:%S",
+     handlers=[
+         logging.StreamHandler(sys.stdout),
+         logging.FileHandler(_log_file, encoding="utf-8"),
+     ],
+ )
+ logger = logging.getLogger(SKILL_NAME)
+
+ def log_step(n: int, desc: str) -> None:
+     logger.info("-- STEP %d: %s", n, desc)
+
+ def log_decision(var: str, val, why: str) -> None:
+     logger.info("DECISION | %s = %s | %s", var, val, why)
+
+ import numpy as np
+ import pandas as pd
+ import statsmodels.formula.api as smf
+ import statsmodels.api as sm
+ import matplotlib.pyplot as plt
+ import scipy.stats as stats
+
+
+ def vif_check(df: pd.DataFrame, predictors: list) -> pd.DataFrame:
+     """Compute VIF for each predictor via auxiliary regressions."""
+     from statsmodels.stats.outliers_influence import variance_inflation_factor
+     X = sm.add_constant(df[predictors].dropna())
+     vif_data = pd.DataFrame({
+         "predictor": predictors,
+         "VIF": [variance_inflation_factor(X.values, i+1) for i in range(len(predictors))]
+     }).sort_values("VIF", ascending=False)
+     return vif_data
+
+ def fit_candidates(data: pd.DataFrame, response: str, candidates: dict, family) -> list:
+     results = []
+     for label, formula_str in candidates.items():
+         try:
+             m = smf.glm(formula_str, data=data, family=family).fit(disp=0)
+             results.append({"label": label, "formula": formula_str, "AIC": m.aic,
+                             "deviance": m.deviance, "df_resid": m.df_resid, "model": m})
+             logger.info(" %s: AIC = %.2f", label, m.aic)
+         except Exception as e:
+             logger.error(
+                 "Unexpected error in fit_candidates [%s]: %s\n"
+                 "Probable cause: invalid formula, missing columns, or incompatible family\n"
+                 "Check: column names in the CSV and the defined formula\n"
+                 "Previous skill: data-cleaning",
+                 label, e
+             )
+     return results
+
+ def model_selection_table(results: list) -> pd.DataFrame:
+     tbl = pd.DataFrame([{k: v for k, v in r.items() if k != "model"} for r in results])
+     tbl = tbl.sort_values("AIC").reset_index(drop=True)
+     tbl["deltaAIC"] = tbl["AIC"] - tbl["AIC"].min()
+     tbl["weight"] = np.exp(-0.5 * tbl["deltaAIC"])
+     tbl["weight"] /= tbl["weight"].sum()
+     return tbl
+
+ def diagnostic_plots(model, label: str, output_dir: Path) -> None:
+     fig, axes = plt.subplots(1, 2, figsize=(10, 4))
+     # Residuals vs fitted
+     fitted = model.fittedvalues
+     resid = model.resid_pearson
+     axes[0].scatter(fitted, resid, alpha=0.5, s=20)
+     axes[0].axhline(0, color="red", linestyle="--")
+     axes[0].set_xlabel("Fitted values"); axes[0].set_ylabel("Pearson residuals")
+     axes[0].set_title("Residuals vs Fitted")
+     # QQ plot
+     stats.probplot(resid, dist="norm", plot=axes[1])
+     axes[1].set_title("QQ Plot of Pearson Residuals")
+     fig.suptitle(f"Diagnostics: {label}")
+     plt.tight_layout()
+     plt.savefig(output_dir / f"diagnostics_{label}.png", dpi=150)
+     plt.close()
+
+ def main():
+     data_file = sys.argv[1] if len(sys.argv) > 1 else "data/processed/data.csv"
+     response_var = sys.argv[2] if len(sys.argv) > 2 else "richness"
+     output_dir = Path(sys.argv[3]) if len(sys.argv) > 3 else Path("outputs/stats")
+
+     log_step(1, "Validate inputs and load data")
+     if not Path(data_file).exists():
+         logger.error(
+             "Input file not found: %s\n"
+             "Probable cause: incorrect path or file not generated yet\n"
+             "Check: the data_csv argument and the working directory\n"
+             "Previous skill: data-cleaning",
+             data_file
+         )
+         sys.exit(1)
+
+     output_dir.mkdir(parents=True, exist_ok=True)
+
+     try:
+         dat = pd.read_csv(data_file)
+     except Exception as e:
+         logger.error(
+             "Unexpected error in load data: %s\n"
+             "Probable cause: malformed CSV file or insufficient permissions\n"
+             "Check: encoding and structure of the CSV file\n"
+             "Previous skill: data-cleaning",
+             e
+         )
+         raise
+
+     logger.info("Loaded %d rows. Response: %s", len(dat), response_var)
+
+     if response_var not in dat.columns:
+         logger.error(
+             "Response variable '%s' not found in columns: %s\n"
+             "Probable cause: incorrect response variable name\n"
+             "Check: the CSV header and the response_var argument\n"
+             "Previous skill: data-cleaning",
+             response_var, list(dat.columns)
+         )
+         sys.exit(1)
+
+     n_missing = dat[response_var].isna().sum()
+     if n_missing > 0:
+         log_warn_msg = (
+             "Response variable '%s' has %d missing values (%.1f%%). "
+             "Rows with NA will be dropped by statsmodels."
+         )
+         logger.warning(log_warn_msg, response_var, n_missing, 100 * n_missing / len(dat))
+
+     log_step(2, "Define candidate models and family")
+     # --- Define your candidate models here ---
+     candidates = {
+         "null": f"{response_var} ~ 1",
+         "model1": f"{response_var} ~ C(group)",
+         "model2": f"{response_var} ~ C(group) + elevation",
+         "model3": f"{response_var} ~ C(group) + elevation + forest_cover",
+     }
+     family = sm.families.NegativeBinomial()
+     log_decision("family", "NegativeBinomial", "count response variable; NB handles overdispersion")
+     log_decision("n_candidates", len(candidates), "null + 3 increasingly complex models for AIC comparison")
+
+     log_step(3, "Fit candidate models")
+     logger.info("Fitting candidate models:")
+     results = fit_candidates(dat, response_var, candidates, family)
+
+     if not results:
+         logger.error(
+             "No models converged successfully.\n"
+             "Probable cause: insufficient data or predictors with NA in every row\n"
+             "Check: data completeness and the candidate formulas\n"
+             "Previous skill: data-cleaning"
+         )
+         sys.exit(1)
+
+     log_step(4, "Build model selection table")
+     try:
+         tbl = model_selection_table(results)
+         logger.info("Model selection table:\n%s", tbl[['label','AIC','deltaAIC','weight']].to_string(index=False))
+         tbl.drop(columns=["model"], errors="ignore").to_csv(output_dir / "model_selection.csv", index=False)
+     except Exception as e:
+         logger.error(
+             "Unexpected error in model selection table: %s\n"
+             "Probable cause: no model was fitted successfully\n"
+             "Check: the fitting step for earlier error messages\n"
+             "Previous skill: biostatistics-workbench (fitting)",
+             e
+         )
+         raise
+
+     log_step(5, "Summarise best model and save diagnostics")
+     try:
+         best_result = min(results, key=lambda x: x["AIC"])
+         best_model = best_result["model"]
+         log_decision("best_model", best_result["label"], "lowest AIC among converged candidates")
+         logger.info("Best model: %s (AIC = %.2f)", best_result["label"], best_result["AIC"])
+         logger.info(str(best_model.summary()))
+         (output_dir / "best_model_summary.txt").write_text(str(best_model.summary()))
+
+         diagnostic_plots(best_model, best_result["label"], output_dir)
+         logger.info("Outputs written to: %s", output_dir)
+     except Exception as e:
+         logger.error(
+             "Unexpected error in best model summary/diagnostics: %s\n"
+             "Probable cause: invalid model object or output directory without write permission\n"
+             "Check: output_dir and the selected model\n"
+             "Previous skill: biostatistics-workbench (fitting)",
+             e
+         )
+         raise
+
+ if __name__ == "__main__":
+     main()
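
Reviewer note on `model_selection_table`: the weighting it applies appears to be the standard Akaike weight, $w_i = \exp(-\Delta_i/2) / \sum_j \exp(-\Delta_j/2)$ with $\Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min}$. The sketch below reproduces that arithmetic in isolation so it can be checked by hand. The function name `akaike_weights` and the AIC values are hypothetical and are not part of the package.

```python
import numpy as np


def akaike_weights(aic_values):
    """Return (delta_aic, weights) for a sequence of AIC values."""
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()          # difference from the best (lowest) AIC
    raw = np.exp(-0.5 * delta)       # relative likelihood of each model
    return delta, raw / raw.sum()    # normalise so the weights sum to 1


# Hypothetical AIC values for four candidates, for illustration only
delta, weights = akaike_weights([210.0, 204.0, 202.0, 203.5])
for d, w in zip(delta, weights):
    print(f"deltaAIC = {d:4.1f}  weight = {w:.3f}")
```

The model with the lowest AIC always gets the largest weight, and candidates within about two AIC units of it keep a non-trivial share, which is why the script logs the full table rather than only the winner.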
package/skills/camera-trap-processing/SKILL.md
@@ -0,0 +1,159 @@
+ ---
+ name: camera-trap-processing
+ description: "Processes camera trap image records into structured detection data, activity patterns, and trap effort summaries. Use this skill when the user mentions camera traps, wildlife cameras, trap nights, detection events, diel activity patterns, camtrapR, temporal overlap indices (Dhat), RAI (relative abundance index), camera station data, detection history generation, or independence thresholds for photo events."
+ skill_version: 1.0.0
+ ---
+
+ # Skill: camera-trap-processing
+
+ **Domain:** Camera traps · Detection events · Activity patterns · Occupancy · Abundance indices
+
+ ---
+
+ ## Purpose
+
+ Guides the agent through processing raw camera trap image data into validated detection records, calculating activity pattern metrics, and preparing outputs for occupancy and abundance estimation. Covers detection event definition, independence filtering, trap effort computation, diel activity analysis, and integration with the occupancy-and-detection skill via detection history matrices.
+
+ ---
+
+ ## When to Invoke
+
+ Invoke this skill when:
+
+ - A user provides a directory of camera trap images or a raw detection CSV for processing
+ - The goal is to calculate wildlife activity patterns, diel overlap, or relative activity indices
+ - Detection history matrices are needed as input for occupancy models
+ - The user asks about independent detection events, trap effort, or camera operation
+ - Photographic mark-recapture or N-mixture models are planned
+
+ **trigger_keywords:** `camera trap`, `wildlife camera`, `detection event`, `trap night`, `activity pattern`, `diel activity`, `photographic index`, `camtrapR`, `detection history`, `occupancy camera`, `camera operation`, `independent detection`
+
+ ---
+
+ ## Inputs
+
+ | Input | Format | Required |
+ |---|---|---|
+ | Image directory (organised by camera station) | Directory tree | Required |
+ | Camera metadata CSV (station, lat, lon, setup date, retrieval date) | CSV | Required |
+ | Species list for filtering | TXT or CSV | Recommended |
+ | Independence threshold (minutes) | Integer (default: 30) | Optional |
+ | Detection record table (if images already processed) | CSV | Conditional |
+
+ ---
+
+ ## Outputs
+
+ | Output | Description |
+ |---|---|
+ | `record_table.csv` | One row per independent detection event per species per camera |
+ | `detection_history.csv` | Binary site × occasion matrix ready for `unmarked` |
+ | `camera_operation.csv` | Daily operation status per camera (1=active, 0=inactive) |
+ | `trap_effort_summary.csv` | Trap-nights per station and per species |
+ | `records_per_species.csv` | Count of independent events per species |
+ | `activity_plot.png` | Kernel density of diel activity with 95% CI |
+ | `activity_overlap.csv` | Diel overlap index Δ between two groups |
+ | `circular_stats.csv` | Mean activity time, concentration (κ), Rayleigh test |
+
+ ---
+
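
Reviewer note: the relationship between `camera_operation.csv` (daily 1/0 status) and `trap_effort_summary.csv` (trap-nights per station) can be sketched as a row sum. This is a minimal illustration, not the package's implementation. The matrix layout (rows = stations, columns = dates), the function name `trap_effort`, and the `flag_below` parameter are assumptions made for the example.

```python
import pandas as pd


def trap_effort(camera_operation: pd.DataFrame, flag_below: int = 100) -> pd.DataFrame:
    """Sum active days per station and flag low-effort stations.

    Assumes rows = stations, columns = dates, and values of
    1 (active), 0 (inactive) or NaN (not deployed).
    """
    trap_nights = camera_operation.fillna(0).sum(axis=1).astype(int)
    return pd.DataFrame({
        "station": trap_nights.index,
        "trap_nights": trap_nights.values,
        "low_effort": trap_nights.values < flag_below,
    })


# Hypothetical four-day operation matrix for two stations
ops = pd.DataFrame(
    [[1, 1, 1, 0], [1, 0, 0, 0]],
    index=["ST01", "ST02"],
    columns=pd.date_range("2024-01-01", periods=4, freq="D"),
)
effort = trap_effort(ops, flag_below=2)  # tiny threshold to suit the toy data
print(effort)
```

Keeping the threshold as a parameter makes it easy to record the chosen value alongside the summary.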
+ ## Steps
+
+ 1. **Validate camera metadata**
+ Confirm all stations have setup and retrieval dates, coordinates, and unique IDs.
+ Run `process_camtrap_data.R` with the image directory and metadata CSV.
+ The script creates the camera operation matrix from station dates.
+
+ 2. **Extract EXIF and build record table**
+ `camtrapR::recordTable()` reads image timestamps and species labels from the directory structure.
+ Check that the directory hierarchy matches `<station>/<species>/<images>`.
+ Output: `record_table.csv` with columns station, species, datetime, filename.
+
+ 3. **Apply independence filter**
+ Events from the same species at the same station within `indep_threshold_min` (default: 30)
+ are collapsed into a single event. Record the threshold in `decision_log.md`.
+ If the threshold is changed from 30 min, justify the choice using home range data.
+
+ 4. **Calculate trap effort**
+ Compute trap-nights per station from the camera operation matrix.
+ Flag stations with < 100 trap-nights in `trap_effort_summary.csv`.
+ Do not use stations with < 20 trap-nights in occupancy estimation.
+
+ 5. **Generate detection history matrix**
+ Use `camtrapR::detectionHistory()` with the chosen occasion length (default: 1 week).
+ Output: binary matrix with rows = stations, columns = occasions.
+ Pass to the `occupancy-and-detection` skill for model fitting.
+
+ 6. **Estimate diel activity patterns** *(optional; invoke `estimate_activity.R`)*
+ Fit a von Mises kernel to circular time-of-day data.
+ Calculate the Dhat4 overlap index between two groups (e.g., dry vs. wet season).
+ If `n_independent_events < 10` per species, report the relative abundance index (RAI) only.
+
+ 7. **Validate outputs**
+ Confirm all output files are non-empty. Check that detection history dimensions match
+ (n_stations × n_occasions). Verify no station has all-NA rows.
+ Record decisions in `decision_log.md`.
+
+ ---
+
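
Reviewer note: the detection history step can be pictured with a small pandas sketch. This is not `camtrapR::detectionHistory()` and it ignores most of that function's options. It assumes occasions are counted from each station's own setup date, that a partial final occasion counts as surveyed, and that the column names (`station`, `setup`, `retrieval`, `datetime`) are as shown. The function name and all of those choices are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def detection_history(records: pd.DataFrame, stations: pd.DataFrame,
                      occasion_days: int = 7) -> pd.DataFrame:
    """Station x occasion matrix: 1 = detected, 0 = not detected, NaN = not surveyed."""
    st = stations.set_index("station")
    n_days = (st["retrieval"] - st["setup"]).dt.days
    n_occ = int(np.ceil(n_days.max() / occasion_days))
    cols = [f"o{i + 1}" for i in range(n_occ)]
    hist = pd.DataFrame(np.nan, index=st.index, columns=cols)
    # Mark surveyed occasions as 0; occasions after retrieval stay NaN
    for name in st.index:
        active = int(np.ceil(n_days[name] / occasion_days))
        hist.loc[name, cols[:active]] = 0.0
    # Set 1 for every occasion that contains at least one record
    for rec in records.itertuples(index=False):
        occ = (rec.datetime - st.loc[rec.station, "setup"]).days // occasion_days
        if 0 <= occ < n_occ and not np.isnan(hist.loc[rec.station, cols[occ]]):
            hist.loc[rec.station, cols[occ]] = 1.0
    return hist


# Hypothetical deployment and records for a single species
stations = pd.DataFrame({
    "station": ["ST01", "ST02"],
    "setup": pd.to_datetime(["2024-01-01", "2024-01-01"]),
    "retrieval": pd.to_datetime(["2024-01-22", "2024-01-15"]),
})
records = pd.DataFrame({
    "station": ["ST01", "ST01", "ST02"],
    "datetime": pd.to_datetime(
        ["2024-01-03 21:00", "2024-01-16 02:30", "2024-01-09 23:10"]
    ),
})
hist = detection_history(records, stations)
print(hist)
```

Distinguishing NaN (camera not operating) from 0 (operating, no detection) matters because occupancy models treat the two differently.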
99
+ ## Decision Points
100
+
101
+ | Condition | Diagnosis | Recommended Action |
102
+ |---|---|---|
103
+ | Time between photos < independence threshold | Events are not independent — same animal | Collapse to one event; use 30 min default unless home range data justifies otherwise |
104
+ | `n_independent_events` < 10 per species | Insufficient data for occupancy estimation | Report relative activity index (RAI) only; do not fit occupancy model |
105
+ | Camera operational time < 100 trap-nights | Insufficient sampling effort | Flag station; exclude from occupancy analysis; include in effort summary |
106
+ | Stations < 15 | Below minimum for occupancy model | Use naive occupancy with explicit caveat; see occupancy-and-detection skill |
107
+ | Missing EXIF timestamps | Images cannot be time-ordered | Request re-processing; use filename timestamps as fallback if consistent |
108
+
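As a rough illustration, the species-level rows of the table above can be expressed as a small triage function. The name `choose_analysis` and the returned labels are hypothetical, and the order in which the two rules are checked (event count first, then station count) is an assumption, since the table does not rank them.

```python
def choose_analysis(n_independent_events, n_stations,
                    min_events=10, min_stations=15):
    """Pick an analysis route from the decision-table thresholds.

    Returns one of three hypothetical labels:
    "rai_only", "naive_occupancy" or "occupancy_model".
    """
    if n_independent_events < min_events:
        return "rai_only"            # too few events: report RAI only
    if n_stations < min_stations:
        return "naive_occupancy"     # too few stations: naive occupancy with caveat
    return "occupancy_model"         # hand over to occupancy-and-detection


route = choose_analysis(n_independent_events=42, n_stations=40)
```

Keeping the thresholds as arguments makes any departure from the defaults visible in the call, which is what `decision_log.md` is meant to capture.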
109
+ ---
110
+
111
+ ## Key Decisions to Document
112
+
113
+ Record the following in `decision_log.md` after running this skill:
114
+
115
+ - Which independence threshold (minutes) was used and why
116
+ - How many events were collapsed due to non-independence
117
+ - Which stations were excluded due to low effort and why
118
+ - What occasion length was used for detection history and why
119
+ - Whether any species had insufficient events for occupancy (RAI reported instead)
120
+
121
+ ---
122
+
123
+ ## Tools and Libraries
124
+
125
+ **R**
126
+ ```r
127
+ suppressPackageStartupMessages(library(camtrapR)) # record table, detection history
128
+ suppressPackageStartupMessages(library(overlap)) # diel overlap index (Dhat4)
129
+ suppressPackageStartupMessages(library(circular)) # circular statistics
130
+ suppressPackageStartupMessages(library(dplyr)) # data manipulation
131
+ suppressPackageStartupMessages(library(ggplot2)) # plotting
132
+ suppressPackageStartupMessages(library(lubridate)) # date handling
133
+ ```
134
+
135
+ **Python**
136
+ ```python
137
+ import pandas as pd # data manipulation
138
+ import numpy as np # numerical operations
139
+ import matplotlib.pyplot as plt # plotting
140
+ from pathlib import Path # file system operations
141
+ ```
142
+
143
+ ---
144
+
145
+ ## Resources
146
+
147
+ - [`skills/camera-trap-processing/resources/detection-event-definition-guide.md`](resources/detection-event-definition-guide.md) — Independence thresholds by taxon and how the choice affects estimates
148
+ - [`skills/camera-trap-processing/resources/camtrapR-workflow-guide.md`](resources/camtrapR-workflow-guide.md) — Directory structure, EXIF extraction, and key camtrapR functions
149
+ - [`skills/camera-trap-processing/resources/activity-patterns-reference.md`](resources/activity-patterns-reference.md) — Diel overlap index Δ, circular statistics, and seasonal stratification
150
+
151
+ ---
152
+
153
+ ## Notes
154
+
155
+ - **Directory structure is mandatory for camtrapR:** when species IDs are read from directory names (`IDfrom = "directory"`), images must be organised as `<station>/<species>/<images>`. Flat directories will cause `recordTable()` to fail. If images are flat, reorganise them before processing.
156
+ - **Too short an independence threshold inflates event counts:** Using 5 min instead of 30 min can increase apparent detection events by 2–5×, which inflates RAI and activity sample sizes. Binary detection histories are less sensitive to the threshold, but document the value used and perform a sensitivity analysis if in doubt.
157
+ - **Trap-nights ≠ camera-nights if cameras malfunction:** Always use the camera operation matrix, not simply `retrieval_date - setup_date`. Cameras with SD card failures, battery death, or tampering must be accounted for.
158
+ - **RAI is not equivalent to occupancy or density:** Relative Activity Index (detections per 100 trap-nights) is a naive index that conflates occupancy, detection probability, and activity. It is useful for relative comparisons only.
159
+ - **Circular time requires conversion to radians:** Activity times must be converted from clock hours to radians (`time_rad = hour * (2*pi/24)`) before fitting kernel density or calculating Δ.
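The clock-to-radians conversion in the last note is a one-liner; the sketch below wraps it in a hypothetical helper, `clock_to_radians`, that also accounts for minutes and seconds, which the bare `hour * (2*pi/24)` formula leaves implicit.

```python
import math
from datetime import time


def clock_to_radians(t):
    """Convert a time of day to radians on a 24-hour circle (midnight = 0)."""
    hours = t.hour + t.minute / 60 + t.second / 3600
    return hours * (2 * math.pi / 24)


sunrise = clock_to_radians(time(6, 0))    # a quarter of the way round the circle
midday = clock_to_radians(time(12, 0))    # half way round
```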
@@ -0,0 +1,103 @@
1
+ ---
2
+ skill_id: camera-trap-processing
3
+ example_count: 5
4
+ ---
5
+
6
+ # Camera Trap Processing — Example Prompts
7
+
8
+ ## Scenario 1: Multi-Species Occupancy Survey
9
+
10
+ **Context:** Savanna park, 40 stations deployed for 60 days. Goal: build detection histories for all medium-to-large mammals for an occupancy modelling exercise.
11
+
12
+ **Prompt:**
13
+ > "I have 60 days of camera trap data from 40 stations in a savanna park. Process all images, define independent detection events (30-min threshold for mammals), build a weekly detection history matrix, and flag stations with less than 100 trap-nights."
14
+
15
+ **Expected workflow:**
16
+ 1. `process_camtrap_data.R <image_dir> <metadata.csv> outputs/ 30`
17
+ 2. Inspect `trap_effort_summary.csv` — exclude stations < 100 trap-nights
18
+ 3. Load `detection_history.csv` into `unmarked` for single-season occupancy modelling
19
+ 4. Check `records_per_species.csv` — exclude species with < 10 events from occupancy
20
+
21
+ **Key decision points:**
22
+ - Stations excluded for effort < 100 trap-nights: re-visit in next field season
23
+ - Species with < 10 events: report RAI only, not occupancy
24
+
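To make the weekly detection history in this scenario concrete, here is a simplified standard-library sketch for a single species. It is not a reimplementation of `camtrapR::detectionHistory()`: the function name, the input layout, and the rule that a partly covered occasion counts as sampled are all assumptions made for illustration.

```python
from datetime import date


def detection_history(deployments, events, occasion_days=7):
    """Build a station x occasion matrix for ONE species.

    deployments: {station: (setup_date, retrieval_date)}  (hypothetical layout)
    events: iterable of (station, date) independent detections.
    Cells are 1 (detected), 0 (sampled, not detected) or None (no effort).
    """
    start = min(setup for setup, _ in deployments.values())
    end = max(retrieval for _, retrieval in deployments.values())
    n_occasions = (end - start).days // occasion_days + 1

    history = {}
    for station, (setup, retrieval) in deployments.items():
        row = [None] * n_occasions
        first = (setup - start).days // occasion_days
        last = (retrieval - start).days // occasion_days
        for k in range(first, last + 1):
            row[k] = 0  # assumption: any coverage counts as a sampled occasion
        history[station] = row

    for station, day in events:
        setup, retrieval = deployments[station]
        if setup <= day <= retrieval:  # ignore records outside the deployment
            history[station][(day - start).days // occasion_days] = 1
    return history


deployments = {
    "A": (date(2024, 1, 1), date(2024, 1, 21)),
    "B": (date(2024, 1, 8), date(2024, 1, 21)),
}
events = [("A", date(2024, 1, 3)), ("A", date(2024, 1, 16)), ("B", date(2024, 1, 20))]
matrix = detection_history(deployments, events)
```

Station B starts a week late, so its first occasion is `None` rather than `0`; treating unsampled occasions as non-detections would bias detection probability downward.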
25
+ ---
26
+
27
+ ## Scenario 2: Predator–Prey Temporal Overlap
28
+
29
+ **Context:** African savanna. Compare diel activity patterns of lions and wildebeest. Determine the degree of temporal overlap (Δ) and assess whether lions are more active when wildebeest are active.
30
+
31
+ **Prompt:**
32
+ > "I have 6 months of camera trap data for lions and wildebeest at a savanna site. Estimate diel activity curves for both species and compute temporal overlap (Dhat4). Test whether overlap is significantly higher than expected by chance using a bootstrap permutation test."
33
+
34
+ **Expected workflow:**
35
+ 1. `estimate_activity.R record_table.csv "Lion" outputs/`
36
+ 2. `estimate_activity.R record_table.csv "Wildebeest" outputs/`
37
+ 3. Load both activity CSV outputs; compare Dhat4 estimates with bootstrap CI
38
+ 4. Report interpretation: Δ > 0.75 = high overlap → consistent with lions tracking prey activity; low overlap would instead suggest temporal avoidance
39
+
40
+ **Key decision points:**
41
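+ - If n Lion < 75 → bootstrap CI will be wide and Dhat4 is less reliable (guidance for the `overlap` package favours Dhat1 when the smaller sample is below 50); interpret cautiously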
42
+ - Δ has no fixed chance value, so 0.5 is not a null expectation; judge significance against a null distribution built by randomisation, and state which null was used
43
+
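For intuition about what the overlap coefficient Δ measures, the sketch below computes a crude binned version: the summed minimum of the two species' hourly detection proportions. This is deliberately not Dhat4, which relies on kernel density estimates from the `overlap` package; the function name `binned_overlap` and the 24-bin default are assumptions, and the result is sensitive to bin width.

```python
def binned_overlap(hours_a, hours_b, n_bins=24):
    """Crude overlap coefficient between two sets of detection times.

    hours_a, hours_b: detection times as decimal clock hours (0 <= h < 24).
    Returns a value between 0 (no shared bins) and 1 (identical proportions).
    """
    if not hours_a or not hours_b:
        raise ValueError("both species need at least one detection")

    def proportions(hours):
        counts = [0] * n_bins
        for h in hours:
            index = min(int((h % 24) / 24 * n_bins), n_bins - 1)
            counts[index] += 1
        return [c / len(hours) for c in counts]

    prop_a, prop_b = proportions(hours_a), proportions(hours_b)
    return sum(min(a, b) for a, b in zip(prop_a, prop_b))


lion_hours = [1.2, 1.8, 13.5, 13.9]        # made-up times, illustration only
wildebeest_hours = [1.1, 1.4, 1.6, 1.9]
delta_crude = binned_overlap(lion_hours, wildebeest_hours)
```

Half of the lion records fall in the single hour used by wildebeest, so the crude estimate is 0.5 here; with real data the kernel-based estimators and their bootstrap intervals should be preferred.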
44
+ ---
45
+
46
+ ## Scenario 3: Relative Activity Index (RAI) Across a Disturbance Gradient
47
+
48
+ **Context:** Forest–agricultural edge transect, 3 habitat types (interior, edge, farmland). Compare RAI of 5 focal species across habitat zones.
49
+
50
+ **Prompt:**
51
+ > "Compare the Relative Activity Index (RAI) for tapir, peccary, deer, ocelot, and armadillo across three habitat types (forest interior, forest edge, farmland) using 90 days of camera trap data. Test for significant habitat effects on RAI."
52
+
53
+ **Expected workflow:**
54
+ 1. `process_camtrap_data.R` for entire dataset; join with habitat zone from metadata
55
+ 2. Compute RAI = detections / trap-nights × 100 per station × species
56
+ 3. Kruskal-Wallis on station-level RAI, or a negative binomial GLMM on the raw counts: `detections ~ habitat_type + offset(log(trap_nights)) + (1 | station)`
57
+ 4. Post-hoc Dunn test for pairwise habitat comparisons
58
+ 5. Plot faceted bar chart (species × habitat) with SE bars
59
+
60
+ **Expected findings:**
61
+ - Tapir likely shows strong avoidance of farmland
62
+ - Armadillo may show edge preference
63
+ - Report effect sizes (η²) alongside p-values
64
+
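Step 2 of this workflow, RAI = detections / trap-nights × 100 per station × species, can be sketched as follows. The function name `rai_table` and the input layout are hypothetical; note that combinations with zero detections do not appear in the output and need to be filled in as 0 before any statistical test.

```python
from collections import Counter


def rai_table(events, trap_nights):
    """Compute RAI per (station, species).

    events: iterable of (station, species) independent detection events.
    trap_nights: {station: number of trap-nights from the operation matrix}.
    """
    table = {}
    for (station, species), n_events in Counter(events).items():
        effort = trap_nights[station]
        if effort <= 0:
            raise ValueError(f"station {station} has no recorded effort")
        table[(station, species)] = 100 * n_events / effort
    return table


events = [("S1", "tapir")] * 3 + [("S1", "ocelot")] + [("S2", "tapir")]
effort = {"S1": 60, "S2": 50}
rai = rai_table(events, effort)
```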
65
+ ---
66
+
67
+ ## Scenario 4: Camera Trap Population Index Trend
68
+
69
+ **Context:** Long-term monitoring (5 years). Track RAI of a threatened ungulate across annual survey periods to detect population decline.
70
+
71
+ **Prompt:**
72
+ > "I have 5 years of annual camera trap surveys (60-day seasons, same 25 stations each year) for a threatened deer species. Calculate RAI per station per year, test for a significant linear trend, and assess whether the population is declining."
73
+
74
+ **Expected workflow:**
75
+ 1. `process_camtrap_data.R` for each year; combine record tables with year column
76
+ 2. Compute station-level RAI per year
77
+ 3. Linear mixed model: `log(RAI + 0.01) ~ year + (1 | station)`
78
+ 4. If slope < 0 and p < 0.05 → declining trend; estimate annual rate of change
79
+ 5. Plot RAI time series per station; highlight trend line with 95% CI
80
+
81
+ **Key decision points:**
82
+ - If station composition differs between years → use only stations sampled all 5 years
83
+ - If RAI changes > 50% between consecutive years → check for recording malfunctions
84
+
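The two key decision points above lend themselves to simple pre-checks before the trend model is fitted. The helpers below are a sketch with hypothetical names; the 50% rule is applied as a relative change against the earlier year, which is one reasonable reading of "changes > 50%".

```python
def stations_in_all_years(stations_by_year):
    """Return the stations that were sampled in every survey year."""
    station_sets = [set(stations) for stations in stations_by_year.values()]
    return set.intersection(*station_sets) if station_sets else set()


def flag_large_changes(rai_by_year, max_rel_change=0.5):
    """Return consecutive year pairs whose RAI changed by more than the limit."""
    flagged = []
    years = sorted(rai_by_year)
    for previous, current in zip(years, years[1:]):
        before, after = rai_by_year[previous], rai_by_year[current]
        if before == 0:
            changed = after != 0  # any detection after a zero year is flagged
        else:
            changed = abs(after - before) / before > max_rel_change
        if changed:
            flagged.append((previous, current))
    return flagged


common = stations_in_all_years({2020: ["A", "B", "C"], 2021: ["B", "C"], 2022: ["C", "B", "D"]})
suspect = flag_large_changes({2020: 10.0, 2021: 4.0, 2022: 5.0})
```

A flagged pair is a prompt to check the camera operation records for that station, not evidence of a real population change.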
85
+ ---
86
+
87
+ ## Scenario 5: Camera Trap Data for Occupancy Modelling Integration
88
+
89
+ **Context:** Joint camera trap + acoustic monitoring survey. Need to prepare occupancy inputs for 12 focal bird species detected in both data streams.
90
+
91
+ **Prompt:**
92
+ > "I have 45 days of camera trap data (for ground-dwelling birds) alongside acoustic monitoring results from the same stations. Prepare weekly detection history matrices for 12 focal species, combine with acoustic detection histories, and run a multi-method occupancy model in unmarked."
93
+
94
+ **Expected workflow:**
95
+ 1. `process_camtrap_data.R` → `detection_history.csv` (camera)
96
+ 2. Acoustic detections filtered to ≥ 0.7 confidence → binary weekly matrix
97
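+ 3. Stack camera and acoustic histories as parallel occasions and model survey method as an observation-level covariate in `unmarked::occu()` (note that `occuMS()` fits multi-state models, not multi-method ones)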
98
+ 4. Compare detection probability estimates between methods
99
+ 5. Report false-absence risk for each method per species
100
+
101
+ **Key decision points:**
102
+ - If camera detection probability < 0.10 for arboreal species → remove camera data for that species
103
+ - If acoustic and camera occupancy estimates differ > 20% → investigate site-level covariates
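One common way to combine two detection methods is to place their histories side by side and carry a per-occasion method label that can later serve as an observation covariate. The sketch below shows that data-preparation step only; the function name `stack_methods` and the dictionary layout are assumptions, and stations missing from either data stream are dropped.

```python
def stack_methods(camera, acoustic):
    """Combine two detection histories for stations present in both.

    camera, acoustic: {station: [0/1/None, ...]} weekly histories.
    Returns (stacked, method): detections side by side, plus a matching
    label per occasion ("camera" or "acoustic").
    """
    stacked, method = {}, {}
    for station in sorted(set(camera) & set(acoustic)):
        cam_row, aco_row = camera[station], acoustic[station]
        stacked[station] = list(cam_row) + list(aco_row)
        method[station] = ["camera"] * len(cam_row) + ["acoustic"] * len(aco_row)
    return stacked, method


camera_hist = {"A": [1, 0], "B": [0, None]}
acoustic_hist = {"A": [0, 1], "C": [1, 1]}
stacked, method = stack_methods(camera_hist, acoustic_hist)
```

Writing `stacked` and `method` out as two wide CSV files gives a detection matrix and a matching observation-covariate matrix with the same dimensions.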