ecological-agent-skills 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (217)
  1. package/AGENT_CONTEXT.md +191 -0
  2. package/CATALOG.md +329 -0
  3. package/LICENSE +692 -0
  4. package/README.md +347 -0
  5. package/bin/install.mjs +168 -0
  6. package/docs/comparison-with-alternatives.md +38 -0
  7. package/docs/global-examples-index.md +103 -0
  8. package/docs/repository-statistics.md +101 -0
  9. package/docs/theoretical-foundations.md +188 -0
  10. package/environment.yaml +106 -0
  11. package/examples/community/arctic_tundra_vegetation_example.md +247 -0
  12. package/examples/community/bird_landuse_example.md +63 -0
  13. package/examples/community/phytoplankton_reservoir_example.md +60 -0
  14. package/examples/community/reef_fish_indopacific_example.md +221 -0
  15. package/examples/impact/baci_road_example.md +57 -0
  16. package/examples/impact/ecosystem_services_atlantic_forest.md +83 -0
  17. package/examples/impact/forest_loss_borneo_timeseries_example.md +225 -0
  18. package/examples/occupancy/puma_camera_example.md +61 -0
  19. package/examples/occupancy/snow_leopard_himalayas_example.md +204 -0
  20. package/examples/reproducible/whittaker_biome_sdm_example.md +406 -0
  21. package/examples/sdm/anteater_cerrado_example.md +69 -0
  22. package/examples/sdm/jaguar_amazon_example.md +80 -0
  23. package/examples/sdm/koala_climate_change_example.md +170 -0
  24. package/examples/sdm/wolf_recolonization_europe_example.md +193 -0
  25. package/package.json +43 -0
  26. package/renv.lock +194 -0
  27. package/skills/SKILL_INDEX.json +1020 -0
  28. package/skills/acoustic-monitoring/SKILL.md +163 -0
  29. package/skills/acoustic-monitoring/examples/example-prompts.md +100 -0
  30. package/skills/acoustic-monitoring/examples/temperate_forest_birds_example.md +285 -0
  31. package/skills/acoustic-monitoring/resources/acoustic-indices-reference.md +93 -0
  32. package/skills/acoustic-monitoring/resources/soundscape-ecology-guide.md +90 -0
  33. package/skills/acoustic-monitoring/resources/species-id-tools-comparison.md +89 -0
  34. package/skills/acoustic-monitoring/scripts/batch_species_detection.py +360 -0
  35. package/skills/acoustic-monitoring/scripts/compute_acoustic_indices.R +235 -0
  36. package/skills/acoustic-monitoring/scripts/compute_acoustic_indices.py +374 -0
  37. package/skills/biostatistics-workbench/SKILL.md +140 -0
  38. package/skills/biostatistics-workbench/examples/example-prompts.md +39 -0
  39. package/skills/biostatistics-workbench/resources/effect-size-reference.md +81 -0
  40. package/skills/biostatistics-workbench/resources/glm-family-link-reference.md +47 -0
  41. package/skills/biostatistics-workbench/resources/test-selection-guide.md +93 -0
  42. package/skills/biostatistics-workbench/scripts/glm_pipeline.R +78 -0
  43. package/skills/biostatistics-workbench/scripts/glm_pipeline.py +210 -0
  44. package/skills/camera-trap-processing/SKILL.md +159 -0
  45. package/skills/camera-trap-processing/examples/example-prompts.md +103 -0
  46. package/skills/camera-trap-processing/examples/leopard_serengeti_example.md +231 -0
  47. package/skills/camera-trap-processing/resources/activity-patterns-reference.md +113 -0
  48. package/skills/camera-trap-processing/resources/camtrapR-workflow-guide.md +130 -0
  49. package/skills/camera-trap-processing/resources/detection-event-definition-guide.md +89 -0
  50. package/skills/camera-trap-processing/scripts/estimate_activity.R +169 -0
  51. package/skills/camera-trap-processing/scripts/process_camtrap_data.R +179 -0
  52. package/skills/camera-trap-processing/scripts/process_camtrap_data.py +192 -0
  53. package/skills/community-ecology-ordination/SKILL.md +133 -0
  54. package/skills/community-ecology-ordination/examples/example-prompts.md +35 -0
  55. package/skills/community-ecology-ordination/resources/dissimilarity-metric-guide.md +53 -0
  56. package/skills/community-ecology-ordination/resources/nmds-interpretation-guide.md +104 -0
  57. package/skills/community-ecology-ordination/scripts/__pycache__/community_analysis.cpython-311.pyc +0 -0
  58. package/skills/community-ecology-ordination/scripts/community_analysis.R +143 -0
  59. package/skills/community-ecology-ordination/scripts/community_analysis.py +231 -0
  60. package/skills/ecological-data-foundation/SKILL.md +129 -0
  61. package/skills/ecological-data-foundation/examples/example-prompts.md +40 -0
  62. package/skills/ecological-data-foundation/resources/coordinate-cleaning-flags.md +66 -0
  63. package/skills/ecological-data-foundation/resources/darwin-core-glossary.md +91 -0
  64. package/skills/ecological-data-foundation/resources/data-citation-guide.md +265 -0
  65. package/skills/ecological-data-foundation/resources/gbif-data-citation-guide.md +193 -0
  66. package/skills/ecological-data-foundation/resources/qa-checklist.md +83 -0
  67. package/skills/ecological-data-foundation/scripts/__pycache__/clean_occurrences.cpython-311.pyc +0 -0
  68. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_ebird.cpython-311.pyc +0 -0
  69. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_inat.cpython-311.pyc +0 -0
  70. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_iucn.cpython-311.pyc +0 -0
  71. package/skills/ecological-data-foundation/scripts/__pycache__/download_from_obis.cpython-311.pyc +0 -0
  72. package/skills/ecological-data-foundation/scripts/clean_occurrences.R +230 -0
  73. package/skills/ecological-data-foundation/scripts/clean_occurrences.py +268 -0
  74. package/skills/ecological-data-foundation/scripts/download_from_ebird.R +251 -0
  75. package/skills/ecological-data-foundation/scripts/download_from_ebird.py +364 -0
  76. package/skills/ecological-data-foundation/scripts/download_from_gbif.R +315 -0
  77. package/skills/ecological-data-foundation/scripts/download_from_gbif.py +407 -0
  78. package/skills/ecological-data-foundation/scripts/download_from_inat.R +238 -0
  79. package/skills/ecological-data-foundation/scripts/download_from_inat.py +304 -0
  80. package/skills/ecological-data-foundation/scripts/download_from_iucn.R +273 -0
  81. package/skills/ecological-data-foundation/scripts/download_from_iucn.py +344 -0
  82. package/skills/ecological-data-foundation/scripts/download_from_obis.R +248 -0
  83. package/skills/ecological-data-foundation/scripts/download_from_obis.py +318 -0
  84. package/skills/ecological-impact-assessment/SKILL.md +123 -0
  85. package/skills/ecological-impact-assessment/examples/example-prompts.md +32 -0
  86. package/skills/ecological-impact-assessment/resources/baci-design-guide.md +55 -0
  87. package/skills/ecological-impact-assessment/resources/fragmentation-metrics-reference.md +86 -0
  88. package/skills/ecological-impact-assessment/resources/pressure-index-template.md +78 -0
  89. package/skills/ecological-impact-assessment/resources/study-design-guide.md +168 -0
  90. package/skills/ecological-impact-assessment/scripts/baci_analysis.R +161 -0
  91. package/skills/ecological-impact-assessment/scripts/fragmentation_analysis.py +141 -0
  92. package/skills/ecological-impact-assessment/scripts/power_analysis_baci.R +274 -0
  93. package/skills/ecosystem-services-assessment/SKILL.md +125 -0
  94. package/skills/ecosystem-services-assessment/examples/example-prompts.md +24 -0
  95. package/skills/ecosystem-services-assessment/resources/es-indicator-reference.md +45 -0
  96. package/skills/ecosystem-services-assessment/resources/invest-parameter-guide.md +86 -0
  97. package/skills/ecosystem-services-assessment/resources/rusle-coefficients.md +88 -0
  98. package/skills/ecosystem-services-assessment/scripts/__pycache__/compute_es.cpython-311.pyc +0 -0
  99. package/skills/ecosystem-services-assessment/scripts/compute_es.py +189 -0
  100. package/skills/ecosystem-services-assessment/scripts/tradeoff_analysis.R +161 -0
  101. package/skills/environmental-time-series/SKILL.md +125 -0
  102. package/skills/environmental-time-series/examples/example-prompts.md +33 -0
  103. package/skills/environmental-time-series/resources/anomaly-indices-reference.md +88 -0
  104. package/skills/environmental-time-series/resources/bfast-parameter-guide.md +69 -0
  105. package/skills/environmental-time-series/scripts/__pycache__/recovery_trajectory.cpython-311.pyc +0 -0
  106. package/skills/environmental-time-series/scripts/__pycache__/trend_analysis.cpython-311.pyc +0 -0
  107. package/skills/environmental-time-series/scripts/recovery_trajectory.R +305 -0
  108. package/skills/environmental-time-series/scripts/recovery_trajectory.py +178 -0
  109. package/skills/environmental-time-series/scripts/trend_analysis.R +192 -0
  110. package/skills/environmental-time-series/scripts/trend_analysis.py +184 -0
  111. package/skills/geoprocessing-for-ecology/SKILL.md +123 -0
  112. package/skills/geoprocessing-for-ecology/examples/example-prompts.md +32 -0
  113. package/skills/geoprocessing-for-ecology/resources/crs-reference.md +62 -0
  114. package/skills/geoprocessing-for-ecology/resources/global-predictor-sources.md +331 -0
  115. package/skills/geoprocessing-for-ecology/resources/resampling-methods.md +57 -0
  116. package/skills/geoprocessing-for-ecology/scripts/__pycache__/download_predictors.cpython-311.pyc +0 -0
  117. package/skills/geoprocessing-for-ecology/scripts/download_predictors.R +239 -0
  118. package/skills/geoprocessing-for-ecology/scripts/download_predictors.py +379 -0
  119. package/skills/geoprocessing-for-ecology/scripts/stack_and_extract.R +224 -0
  120. package/skills/geoprocessing-for-ecology/scripts/stack_and_extract.py +172 -0
  121. package/skills/landscape-connectivity/SKILL.md +170 -0
  122. package/skills/landscape-connectivity/examples/example-prompts.md +96 -0
  123. package/skills/landscape-connectivity/examples/jaguar_mesoamerica_corridor_example.md +271 -0
  124. package/skills/landscape-connectivity/resources/circuitscape-parameter-guide.md +155 -0
  125. package/skills/landscape-connectivity/resources/graph-theory-for-ecology.md +134 -0
  126. package/skills/landscape-connectivity/resources/resistance-surface-guide.md +141 -0
  127. package/skills/landscape-connectivity/scripts/connectivity_analysis.py +387 -0
  128. package/skills/landscape-connectivity/scripts/connectivity_metrics.R +274 -0
  129. package/skills/landscape-connectivity/scripts/resistance_surface.R +239 -0
  130. package/skills/model-validation-and-uncertainty/SKILL.md +131 -0
  131. package/skills/model-validation-and-uncertainty/examples/example-prompts.md +30 -0
  132. package/skills/model-validation-and-uncertainty/resources/extrapolation-risk-guide.md +236 -0
  133. package/skills/model-validation-and-uncertainty/resources/metric-selection-guide.md +52 -0
  134. package/skills/model-validation-and-uncertainty/resources/threshold-selection-guide.md +64 -0
  135. package/skills/model-validation-and-uncertainty/scripts/__pycache__/validate_model.cpython-311.pyc +0 -0
  136. package/skills/model-validation-and-uncertainty/scripts/extrapolation_risk.R +315 -0
  137. package/skills/model-validation-and-uncertainty/scripts/validate_model.py +226 -0
  138. package/skills/model-validation-and-uncertainty/scripts/validate_sdm.R +162 -0
  139. package/skills/occupancy-and-detection/SKILL.md +126 -0
  140. package/skills/occupancy-and-detection/examples/example-prompts.md +33 -0
  141. package/skills/occupancy-and-detection/resources/detection-history-format.md +100 -0
  142. package/skills/occupancy-and-detection/resources/occupancy-study-design.md +47 -0
  143. package/skills/occupancy-and-detection/scripts/__pycache__/occupancy_analysis.cpython-311.pyc +0 -0
  144. package/skills/occupancy-and-detection/scripts/occupancy_analysis.R +160 -0
  145. package/skills/occupancy-and-detection/scripts/occupancy_analysis.py +159 -0
  146. package/skills/population-viability-analysis/SKILL.md +161 -0
  147. package/skills/population-viability-analysis/examples/african_elephant_pva_example.md +266 -0
  148. package/skills/population-viability-analysis/examples/example-prompts.md +95 -0
  149. package/skills/population-viability-analysis/resources/extinction-risk-thresholds.md +128 -0
  150. package/skills/population-viability-analysis/resources/matrix-model-guide.md +139 -0
  151. package/skills/population-viability-analysis/resources/sensitivity-elasticity-reference.md +182 -0
  152. package/skills/population-viability-analysis/scripts/matrix_pva.R +258 -0
  153. package/skills/population-viability-analysis/scripts/pva_analysis.py +442 -0
  154. package/skills/population-viability-analysis/scripts/stochastic_pva.R +353 -0
  155. package/skills/predictive-modeling-best-practices/SKILL.md +136 -0
  156. package/skills/predictive-modeling-best-practices/examples/example-prompts.md +58 -0
  157. package/skills/predictive-modeling-best-practices/resources/collinearity-decision-tree.md +65 -0
  158. package/skills/predictive-modeling-best-practices/resources/sampling-bias-correction.md +267 -0
  159. package/skills/predictive-modeling-best-practices/resources/spatial-cv-guide.md +73 -0
  160. package/skills/predictive-modeling-best-practices/scripts/__pycache__/spatial_cv.cpython-311.pyc +0 -0
  161. package/skills/predictive-modeling-best-practices/scripts/collinearity_check.R +112 -0
  162. package/skills/predictive-modeling-best-practices/scripts/spatial_cv.py +182 -0
  163. package/skills/reproducible-ecology-pipeline/SKILL.md +139 -0
  164. package/skills/reproducible-ecology-pipeline/examples/example-prompts.md +35 -0
  165. package/skills/reproducible-ecology-pipeline/resources/directory-structure-template.md +94 -0
  166. package/skills/reproducible-ecology-pipeline/resources/params-yaml-template.yaml +84 -0
  167. package/skills/reproducible-ecology-pipeline/resources/reproducibility-checklist-template.md +66 -0
  168. package/skills/reproducible-ecology-pipeline/scripts/generate_file_manifest.py +110 -0
  169. package/skills/reproducible-ecology-pipeline/scripts/init_project.sh +53 -0
  170. package/skills/spatial-prioritization/SKILL.md +162 -0
  171. package/skills/spatial-prioritization/examples/biodiversity_hotspot_prioritization_example.md +289 -0
  172. package/skills/spatial-prioritization/examples/example-prompts.md +93 -0
  173. package/skills/spatial-prioritization/resources/cost-surface-reference.md +130 -0
  174. package/skills/spatial-prioritization/resources/marxan-vs-prioritizr-comparison.md +125 -0
  175. package/skills/spatial-prioritization/resources/prioritizr-formulation-guide.md +188 -0
  176. package/skills/spatial-prioritization/resources/representation-targets-guide.md +186 -0
  177. package/skills/spatial-prioritization/scripts/prioritization_sensitivity.R +320 -0
  178. package/skills/spatial-prioritization/scripts/run_prioritization.R +336 -0
  179. package/skills/species-distribution-modeling/SKILL.md +139 -0
  180. package/skills/species-distribution-modeling/examples/example-prompts.md +36 -0
  181. package/skills/species-distribution-modeling/resources/algorithm-comparison.md +25 -0
  182. package/skills/species-distribution-modeling/resources/calibration-area-guide.md +71 -0
  183. package/skills/species-distribution-modeling/resources/climate-scenario-preparation.md +170 -0
  184. package/skills/species-distribution-modeling/resources/maxent-calibration-guide.md +211 -0
  185. package/skills/species-distribution-modeling/resources/sdm-checklist.md +37 -0
  186. package/skills/species-distribution-modeling/scripts/predict_distribution.R +236 -0
  187. package/skills/species-distribution-modeling/scripts/predict_distribution.py +286 -0
  188. package/skills/species-distribution-modeling/scripts/prepare_future_layers.R +351 -0
  189. package/skills/species-distribution-modeling/scripts/project_scenarios.R +220 -0
  190. package/skills/species-distribution-modeling/scripts/run_ensemble_sdm.R +99 -0
  191. package/skills/species-distribution-modeling/scripts/sdm_pipeline.py +318 -0
  192. package/skills/species-distribution-modeling/scripts/tune_maxnet.R +344 -0
  193. package/templates/SKILL_TEMPLATE.md +225 -0
  194. package/templates/checklists/data-submission-checklist.md +38 -0
  195. package/templates/checklists/post-analysis-checklist.md +55 -0
  196. package/templates/checklists/pre-analysis-checklist.md +31 -0
  197. package/templates/prompts/debug-skill.md +47 -0
  198. package/templates/prompts/invoke-skill.md +34 -0
  199. package/templates/prompts/invoke-workflow.md +45 -0
  200. package/templates/reports/technical-report-template.md +80 -0
  201. package/templates/scripts/logger_setup.R +79 -0
  202. package/templates/scripts/logger_setup.py +119 -0
  203. package/templates/scripts/params_loader.R +28 -0
  204. package/templates/scripts/params_loader.py +38 -0
  205. package/workflows/analyze-community-structure/WORKFLOW.md +72 -0
  206. package/workflows/analyze-environmental-change/WORKFLOW.md +73 -0
  207. package/workflows/assess-ecological-impact/WORKFLOW.md +75 -0
  208. package/workflows/assess-ecosystem-services/WORKFLOW.md +68 -0
  209. package/workflows/assess-landscape-connectivity/WORKFLOW.md +84 -0
  210. package/workflows/build-fire-risk-map/WORKFLOW.md +79 -0
  211. package/workflows/produce-technical-report/WORKFLOW.md +113 -0
  212. package/workflows/run-camera-trap-occupancy/WORKFLOW.md +87 -0
  213. package/workflows/run-conservation-prioritization/WORKFLOW.md +89 -0
  214. package/workflows/run-multispecies-screening/WORKFLOW.md +197 -0
  215. package/workflows/run-occupancy-analysis/WORKFLOW.md +74 -0
  216. package/workflows/run-population-viability/WORKFLOW.md +90 -0
  217. package/workflows/run-sdm-study/WORKFLOW.md +99 -0
@@ -0,0 +1,169 @@
1
+ # ecological-agent-skills / Copyright (C) 2026 Francisco Diego Barros Barata
2
+ # SPDX-License-Identifier: GPL-3.0-or-later
3
+
4
+ # Usage: Rscript estimate_activity.R <record_table_csv> <species_name> <output_dir> [group_column]
5
+
6
+ # ── Inline logger ─────────────────────────────────────────────────────────────
7
+ SKILL_NAME <- "camera-trap-processing"
8
+ .log_ts <- function() format(Sys.time(), "[%Y-%m-%d %H:%M:%S]")
9
+ log_info <- function(...) message(.log_ts(), " [INFO] ", sprintf(...))
10
+ log_warn <- function(...) message(.log_ts(), " [WARN] ", sprintf(...))
11
+ log_error<- function(...) message(.log_ts(), " [ERROR] ", sprintf(...))
12
+ log_step <- function(n, d) log_info("-- STEP %d: %s", n, d)
13
+ log_decision <- function(v, val, why) log_info("DECISION | %s = %s | %s", v, val, why)
14
+ dir.create("logs", recursive=TRUE, showWarnings=FALSE)
15
+
16
+ suppressPackageStartupMessages(library(overlap))
17
+ suppressPackageStartupMessages(library(circular))
18
+ suppressPackageStartupMessages(library(dplyr))
19
+ suppressPackageStartupMessages(library(ggplot2))
20
+ suppressPackageStartupMessages(library(lubridate))
21
+
22
+ args <- commandArgs(trailingOnly = TRUE)
23
+ if (length(args) < 3) {
24
+ log_error("Insufficient arguments. Usage: Rscript estimate_activity.R <record_table_csv> <species_name> <output_dir> [group_column]")
25
+ cat("Usage: Rscript estimate_activity.R <record_table_csv> <species_name> <output_dir> [group_column]\n")
26
+ cat(" group_column: optional column for comparing two groups (e.g., 'season')\n")
27
+ quit(status = 1)
28
+ }
29
+
30
+ record_csv <- args[1]
31
+ species_name <- args[2]
32
+ output_dir <- args[3]
33
+ group_col <- if (length(args) >= 4) args[4] else NULL  # ifelse() cannot return NULL
34
+
35
+ # ── Input precondition checks ────────────────────────────────────────────────
36
+ if (!file.exists(record_csv)) {
37
+ log_error("Input not found: %s\nLikely cause: process_camtrap_data.R was not run or failed\nCheck: that record_table.csv exists in the output directory\nPrevious skill: camera-trap-processing (process_camtrap_data.R)", record_csv)
38
+ stop("Missing record_csv: ", record_csv)
39
+ }
40
+
41
+ log_decision("species_name", species_name,
42
+ "target species for diel activity estimation")
43
+ log_decision("group_col", ifelse(is.null(group_col), "NULL", group_col),
44
+ "grouping column for the overlap comparison; NULL = no comparison")
45
+
46
+ dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
47
+
48
+ log_step(1, "Loading and filtering records for the target species")
49
+ records <- tryCatch({
50
+ read.csv(record_csv, stringsAsFactors = FALSE)
51
+ }, error = function(e) {
52
+ log_error("Failed to read record_table CSV: %s\nLikely cause: corrupted file or invalid format\nCheck: the output of process_camtrap_data.R\nPrevious skill: camera-trap-processing", conditionMessage(e))
53
+ stop(e)
54
+ })
55
+
56
+ records$DateTimeOriginal <- as.POSIXct(records$DateTimeOriginal,
57
+ format = "%Y-%m-%d %H:%M:%S")
58
+
59
+ sp_data <- records[records$Species == species_name, ]
60
+ log_info("Records for '%s': %d events", species_name, nrow(sp_data))
61
+
62
+ if (nrow(sp_data) < 10) {
63
+ log_error("n_independent_events < 10 for '%s' (%d found)\nLikely cause: rare species or independence threshold set too high\nCheck: species available in the record table\nPrevious skill: camera-trap-processing (process_camtrap_data.R)", species_name, nrow(sp_data))
64
+ stop("n_independent_events < 10 for '", species_name,
65
+ "'. Report RAI only; do not estimate activity overlap.")
66
+ }
67
+
68
+ log_step(2, "Converting times to radians and estimating activity density")
69
+ # Convert to radians
70
+ to_rad <- function(dt) {
71
+ (hour(dt) + minute(dt) / 60) * (2 * pi / 24)
72
+ }
73
+ sp_data$time_rad <- to_rad(sp_data$DateTimeOriginal)
74
+
75
+ # Overall activity density plot
76
+ tryCatch({
77
+ png(file.path(output_dir, "activity_plot.png"), width = 900, height = 600, res = 120)
78
+ overlapPlot(sp_data$time_rad, rug = TRUE,
79
+ main = paste("Diel Activity —", gsub("_", " ", species_name)),
80
+ xlab = "Time of day", col.main = "black")
81
+ dev.off()
82
+ log_info("Activity plot saved: activity_plot.png")
83
+ }, error = function(e) {
84
+ log_error("Failed to generate activity plot: %s\nLikely cause: problem with the overlap package\nCheck: installation of the overlap package\nPrevious skill: [none]", conditionMessage(e))
85
+ stop(e)
86
+ })
87
+
88
+ log_step(3, "Computing circular statistics")
89
+ # Circular statistics
90
+ time_circ <- circular(sp_data$time_rad, units = "radians")  # no clock template: values are plain radians, so the hour conversion below stays valid
91
+ mean_rad <- mean.circular(time_circ)
92
+ mean_hour <- as.numeric(mean_rad) * 24 / (2 * pi)
93
+ kappa_est <- tryCatch(mle.vonmises(time_circ)$kappa, error = function(e) {
94
+ log_warn("mle.vonmises() failed (kappa = NA): %s", conditionMessage(e))
95
+ NA_real_
96
+ })
97
+ rayleigh_p <- rayleigh.test(time_circ)$p.value
98
+
99
+ log_decision("rayleigh_p", round(rayleigh_p, 4),
100
+ "p < 0.05 indicates a non-uniform distribution (concentrated diel activity)")
101
+ if (rayleigh_p >= 0.05) {
102
+ log_warn("Rayleigh p = %.4f >= 0.05: diel activity does not differ from uniform for '%s'",
103
+ rayleigh_p, species_name)
104
+ }
105
+
106
+ circ_stats <- data.frame(
107
+ species = species_name,
108
+ n_events = nrow(sp_data),
109
+ mean_activity_hour = round(mean_hour %% 24, 2),
110
+ kappa = round(kappa_est, 3),
111
+ rayleigh_p = round(rayleigh_p, 4),
112
+ non_uniform = rayleigh_p < 0.05
113
+ )
114
+ write.csv(circ_stats, file.path(output_dir, "circular_stats.csv"), row.names = FALSE)
115
+ log_info("Circular statistics saved: mean hour = %.1f h, kappa = %.3f",
116
+ mean_hour %% 24, kappa_est)  # sprintf() prints NA as "NA", so a failed fit is not logged as kappa = 0
117
+
118
+ log_step(4, "Computing activity overlap between groups (if applicable)")
119
+ # Diel overlap between two groups (if group_col provided)
120
+ overlap_result <- data.frame()
121
+ if (!is.null(group_col) && group_col %in% names(sp_data)) {
122
+ groups <- unique(sp_data[[group_col]])
123
+ if (length(groups) == 2) {
124
+ groupA <- sp_data$time_rad[sp_data[[group_col]] == groups[1]]
125
+ groupB <- sp_data$time_rad[sp_data[[group_col]] == groups[2]]
126
+
127
+ if (length(groupA) >= 10 && length(groupB) >= 10) {
128
+ log_info("Estimating Dhat4 overlap between '%s' (n=%d) and '%s' (n=%d)",
129
+ groups[1], length(groupA), groups[2], length(groupB))
130
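+ delta4 <- unname(overlapEst(groupA, groupB, type = "Dhat4"))  # point estimate
131
+ boot_out <- bootstrap(groupA, groupB, nb = 1000, type = "Dhat4")  # bootEst() expects resampled matrices and has no nb argument
132
+ ci <- bootCI(delta4, boot_out)["basic0", ]  # 95% interval; the choice of the basic0 row is a judgment call
133
+ ci_lower <- ci[["lower"]]; ci_upper <- ci[["upper"]]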
134
+
135
+ overlap_result <- data.frame(
136
+ groupA = groups[1],
137
+ groupB = groups[2],
138
+ n_A = length(groupA),
139
+ n_B = length(groupB),
140
+ Dhat4 = round(delta4, 3),
141
+ ci_lower = round(ci_lower, 3),
142
+ ci_upper = round(ci_upper, 3)
143
+ )
144
+
145
+ png(file.path(output_dir, "activity_overlap.png"), width = 900, height = 600, res = 120)
146
+ overlapPlot(groupA, groupB, rug = TRUE,
147
+ main = paste("Diel Overlap — Δ4 =", round(delta4, 2)),
148
+ linecol = c("blue", "red"))
149
+ legend("topright", legend = groups, col = c("blue", "red"), lty = 1)
150
+ dev.off()
151
+ log_info("Dhat4 overlap = %.3f [%.3f, %.3f]",
152
+ delta4, ci_lower, ci_upper)
153
+ } else {
154
+ log_warn("One or both groups have < 10 events; overlap not computed (n_A=%d, n_B=%d)",
155
+ length(groupA), length(groupB))
156
+ }
157
+ } else {
158
+ log_warn("group_col '%s' has %d unique values; exactly 2 are required for overlap",
159
+ group_col, length(groups))
160
+ }
161
+ } else if (!is.null(group_col)) {
162
+ log_warn("Group column '%s' not found in the species data", group_col)
163
+ }
164
+ write.csv(overlap_result, file.path(output_dir, "activity_overlap.csv"), row.names = FALSE)
165
+
166
+ log_info("Done. Outputs written to: %s", output_dir)
167
+ log_info(" n events: %d", nrow(sp_data))
168
+ log_info(" Mean activity hour: %.1f h", mean_hour %% 24)
169
+ log_info(" Rayleigh p: %.4f", round(rayleigh_p, 4))
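The script above maps each detection time onto a 24-hour circle before summarising it. A minimal Python sketch of that idea follows, under stated assumptions: the function names and the three timestamps are hypothetical, seconds are included in the conversion (the R helper uses hours and minutes only), and the mean resultant length stands in as a simple concentration measure rather than the von Mises kappa the script reports.

```python
import math
from datetime import datetime


def time_to_radians(dt: datetime) -> float:
    """Map a clock time onto the circle: 24 h correspond to 2*pi radians."""
    hours = dt.hour + dt.minute / 60 + dt.second / 3600
    return hours * 2 * math.pi / 24


def circular_mean_hour(angles: list[float]) -> tuple[float, float]:
    """Return (mean hour in [0, 24), mean resultant length in [0, 1])."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    # atan2 gives the direction of the average unit vector; wrap to [0, 2*pi).
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    return mean_angle * 24 / (2 * math.pi), math.hypot(c, s)


# Hypothetical detections clustered around midnight.
stamps = ["2024-01-01 23:00:00", "2024-01-02 01:00:00", "2024-01-03 00:00:00"]
angles = [time_to_radians(datetime.strptime(s, "%Y-%m-%d %H:%M:%S")) for s in stamps]
mean_hour, r_bar = circular_mean_hour(angles)
```

An arithmetic mean of 23, 1 and 0 hours would give 8 h, far from any detection. Averaging unit vectors keeps the result at midnight (numerically just above 0 h or just below 24 h), which is why the script works in radians.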
@@ -0,0 +1,179 @@
1
+ # ecological-agent-skills / Copyright (C) 2026 Francisco Diego Barros Barata
2
+ # SPDX-License-Identifier: GPL-3.0-or-later
3
+
4
+ # Usage: Rscript process_camtrap_data.R <image_dir> <camera_metadata_csv> <output_dir> [indep_threshold_min]
5
+
6
+ # ── Inline logger ─────────────────────────────────────────────────────────────
7
+ SKILL_NAME <- "camera-trap-processing"
8
+ .log_ts <- function() format(Sys.time(), "[%Y-%m-%d %H:%M:%S]")
9
+ log_info <- function(...) message(.log_ts(), " [INFO] ", sprintf(...))
10
+ log_warn <- function(...) message(.log_ts(), " [WARN] ", sprintf(...))
11
+ log_error<- function(...) message(.log_ts(), " [ERROR] ", sprintf(...))
12
+ log_step <- function(n, d) log_info("-- STEP %d: %s", n, d)
13
+ log_decision <- function(v, val, why) log_info("DECISION | %s = %s | %s", v, val, why)
14
+ dir.create("logs", recursive=TRUE, showWarnings=FALSE)
15
+
16
+ suppressPackageStartupMessages(library(camtrapR))
17
+ suppressPackageStartupMessages(library(dplyr))
18
+ suppressPackageStartupMessages(library(lubridate))
19
+
20
+ args <- commandArgs(trailingOnly = TRUE)
21
+ if (length(args) < 3) {
22
+ log_error("Usage: Rscript process_camtrap_data.R <image_dir> <camera_metadata_csv> <output_dir> [indep_threshold_min]")
23
+ cat("Usage: Rscript process_camtrap_data.R <image_dir> <camera_metadata_csv> <output_dir> [indep_threshold_min]\n")
24
+ cat(" indep_threshold_min: independence threshold in minutes (default: 30)\n")
25
+ quit(status = 1)
26
+ }
27
+
28
+ image_dir <- args[1]
29
+ metadata_csv <- args[2]
30
+ output_dir <- args[3]
31
+ thresh_min <- ifelse(length(args) >= 4, as.integer(args[4]), 30L)
32
+
33
+ # ── Input precondition checks ────────────────────────────────────────────────
34
+ if (!dir.exists(image_dir)) {
35
+ log_error("Input not found: %s\nLikely cause: wrong path or directory not mounted\nCheck: that the image directory exists and is readable\nPrevious skill: [none, initial step]", image_dir)
36
+ stop("Missing image_dir: ", image_dir)
37
+ }
38
+ if (!file.exists(metadata_csv)) {
39
+ log_error("Input not found: %s\nLikely cause: metadata CSV file not generated or misnamed\nCheck: that the file exists and the path is correct\nPrevious skill: [none, initial step]", metadata_csv)
40
+ stop("Missing metadata_csv: ", metadata_csv)
41
+ }
42
+
43
+ log_decision("indep_threshold_min", thresh_min,
44
+ "independence threshold between records, in minutes (default 30); adjust per species if needed")
45
+
46
+ dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
47
+
48
+ log_step(1, "Loading camera metadata")
49
+ cam_meta <- tryCatch({
50
+ read.csv(metadata_csv, stringsAsFactors = FALSE)
51
+ }, error = function(e) {
52
+ log_error("Failed to read metadata CSV: %s\nLikely cause: corrupted file or incorrect format\nCheck: the CSV structure\nPrevious skill: [none]", conditionMessage(e))
53
+ stop(e)
54
+ })
55
+ log_info("Metadata loaded: %d stations", nrow(cam_meta))
56
+
57
+ required_cols <- c("Station", "Setup_date", "Retrieval_date")
58
+ missing <- setdiff(required_cols, names(cam_meta))
59
+ if (length(missing) > 0) {
60
+ log_error("Required columns missing from the metadata CSV: %s\nLikely cause: incorrect spreadsheet format\nCheck: that the CSV contains Station, Setup_date, Retrieval_date", paste(missing, collapse = ", "))
61
+ stop("Camera metadata missing required columns: ", paste(missing, collapse = ", "))
62
+ }
63
+
64
+ log_step(2, "Building the camera operation matrix")
65
+ has_problems <- all(c("Problem1_from", "Problem1_to") %in% names(cam_meta))
66
+ log_decision("has_problems", has_problems,
67
+ "whether the CSV includes camera malfunction columns (Problem1_from, Problem1_to)")
68
+
69
+ cam_op <- tryCatch({
70
+ cameraOperation(
71
+ CTtable = cam_meta,
72
+ stationCol = "Station",
73
+ setupCol = "Setup_date",
74
+ retrievalCol = "Retrieval_date",
75
+ hasProblems = has_problems,
76
+ dateFormat = "%Y-%m-%d"  # strptime-style format; "yyyy-mm-dd" is not a format as.Date or lubridate can parse
77
+ )
78
+ }, error = function(e) {
79
+ log_error("cameraOperation() failed: %s\nLikely cause: dates in the wrong format or duplicated stations\nCheck: yyyy-mm-dd format in the date columns\nPrevious skill: [none]", conditionMessage(e))
80
+ stop(e)
81
+ })
82
+
83
+ # Trap effort per station
84
+ trap_effort <- data.frame(
85
+ Station = rownames(cam_op),
86
+ trap_nights = apply(cam_op, 1, sum, na.rm = TRUE)
87
+ )
88
+ low_effort <- trap_effort$Station[trap_effort$trap_nights < 100]
89
+ if (length(low_effort) > 0) {
90
+ log_warn("Stations with < 100 trap-nights (insufficient data for occupancy): %s",
91
+ paste(low_effort, collapse = ", "))
92
+ }
93
+
94
+ log_step(3, "Building the record table from the images")
95
+ log_info("Independence threshold: %d min", thresh_min)
96
+ record_table <- tryCatch({
97
+ recordTable(
98
+ inDir = image_dir,
99
+ IDfrom = "directory",
100
+ minDeltaTime = thresh_min,
101
+ deltaTimeComparedTo = "lastIndependentRecord",
102
+ timeZone = Sys.timezone(),
103
+ removeDuplicateRecords = TRUE
104
+ )
105
+ }, error = function(e) {
106
+ log_error("recordTable() failed: %s\nLikely cause: incorrect directory structure\nCheck: that images follow <Station>/<Species>/<images>\nPrevious skill: [none]", conditionMessage(e))
107
+ stop("recordTable() failed: ", conditionMessage(e),
108
+ "\nCheck that image directory follows <Station>/<Species>/<images> structure.")
109
+ })
110
+ log_info("Record table built: %d independent events", nrow(record_table))
111
+
112
+ log_step(4, "Summarising records per species")
113
+ # Records per species summary
114
+ records_per_species <- record_table %>%
115
+ group_by(Species) %>%
116
+ summarise(
117
+ n_events = n(),
118
+ n_stations = n_distinct(Station),
119
+ first_detection = min(DateTimeOriginal),
120
+ last_detection = max(DateTimeOriginal),
121
+ .groups = "drop"
122
+ )
123
+
124
+ low_detections <- records_per_species$Species[records_per_species$n_events < 10]
125
+ if (length(low_detections) > 0) {
126
+ log_warn("Species with < 10 independent events (RAI only; no occupancy): %s",
127
+ paste(low_detections, collapse = ", "))
128
+ }
129
+
130
+ log_step(5, "Generating detection histories per species")
131
+ # Generate detection history for all species with >= 10 events
132
+ det_hist_list <- list()
133
+ for (sp in records_per_species$Species[records_per_species$n_events >= 10]) {
134
+ sp_clean <- gsub(" ", "_", sp)
135
+ dh <- tryCatch(
136
+ detectionHistory(
137
+ recordTable = record_table,
138
+ camOp = cam_op,
139
+ stationCol = "Station",
140
+ speciesCol = "Species",
141
+ recordDateTimeCol = "DateTimeOriginal",
142
+ species = sp,
143
+ occasionLength = 7,
144
+ day1 = "station",
145
+ output = "binary"
146
+ ),
147
+ error = function(e) {
148
+ log_warn("detectionHistory() failed for species '%s': %s", sp, conditionMessage(e))
149
+ NULL
150
+ }
151
+ )
152
+ if (!is.null(dh)) det_hist_list[[sp_clean]] <- dh$detection_history
153
+ }
154
+ log_info("Detection histories generated for %d species", length(det_hist_list))
155
+
156
+ log_step(6, "Writing output files")
157
+ # Write outputs
158
+ write.csv(record_table, file.path(output_dir, "record_table.csv"), row.names = FALSE)
159
+ write.csv(cam_op, file.path(output_dir, "camera_operation.csv"), row.names = TRUE)
160
+ write.csv(trap_effort, file.path(output_dir, "trap_effort_summary.csv"), row.names = FALSE)
161
+ write.csv(records_per_species, file.path(output_dir, "records_per_species.csv"), row.names = FALSE)
162
+
163
+ if (length(det_hist_list) > 0) {
164
+ # Write the first species' detection history as default output
165
+ dh_df <- as.data.frame(det_hist_list[[1]])
166
+ write.csv(dh_df, file.path(output_dir, "detection_history.csv"), row.names = TRUE)
167
+ # Write all species if multiple
168
+ for (sp_name in names(det_hist_list)) {
169
+ dh_df_sp <- as.data.frame(det_hist_list[[sp_name]])
170
+ write.csv(dh_df_sp,
171
+ file.path(output_dir, paste0("detection_history_", sp_name, ".csv")),
172
+ row.names = TRUE)
173
+ }
174
+ }
175
+
176
+ log_info("Done. Outputs written to: %s", output_dir)
177
+ log_info(" record_table.csv: %d independent events", nrow(record_table))
178
+ log_info(" records_per_species.csv: %d species", nrow(records_per_species))
179
+ log_info(" trap_effort_summary.csv: %d stations", nrow(trap_effort))
@@ -0,0 +1,192 @@
1
+ # ecological-agent-skills / Copyright (C) 2026 Francisco Diego Barros Barata
2
+ # SPDX-License-Identifier: GPL-3.0-or-later
3
+
4
+ """Process camera trap detection records from CSV output of camtrapR.
5
+ Usage: python process_camtrap_data.py <record_table_csv> <output_dir> [species_name]
6
+
7
+ Reads a record_table.csv produced by camtrapR's recordTable() function.
8
+ Computes descriptive statistics per species and camera station.
9
+ Generates a detection timeline plot.
10
+ Does NOT require camtrapR — works with the CSV output only.
11
+
12
+ Arguments:
13
+ record_table_csv : path to record_table.csv from camtrapR
14
+ output_dir : directory for output files
15
+ species_name : optional; filter to a single species
16
+ """
17
+
18
+ import logging
19
+ import sys
20
+ from datetime import datetime
21
+ from pathlib import Path
22
+
23
+ SKILL_NAME = "camera-trap-processing"
24
+ _LOG_DIR = Path("logs")
25
+ _LOG_DIR.mkdir(parents=True, exist_ok=True)
26
+ _log_file = _LOG_DIR / f"skill_{SKILL_NAME}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
27
+ logging.basicConfig(
28
+ level=logging.INFO,
29
+ format="[%(asctime)s] [%(levelname)s] [" + SKILL_NAME + "] %(message)s",
30
+ datefmt="%Y-%m-%d %H:%M:%S",
31
+ handlers=[
32
+ logging.StreamHandler(sys.stdout),
33
+ logging.FileHandler(_log_file, encoding="utf-8"),
34
+ ],
35
+ )
36
+ logger = logging.getLogger(SKILL_NAME)
37
+
38
+ def log_step(n: int, desc: str) -> None:
39
+ logger.info("-- STEP %d: %s", n, desc)
40
+
41
+ def log_decision(var: str, val, why: str) -> None:
42
+ logger.info("DECISION | %s = %s | %s", var, val, why)
43
+
44
+ import pandas as pd
45
+ import numpy as np
46
+ import matplotlib.pyplot as plt
47
+ import matplotlib.dates as mdates
48
+
49
+
50
+ def load_records(csv_path: str) -> pd.DataFrame:
51
+ df = pd.read_csv(csv_path, parse_dates=["DateTimeOriginal"])
52
+ required = {"Station", "Species", "DateTimeOriginal"}
53
+ missing = required - set(df.columns)
54
+ if missing:
55
+ raise ValueError(f"Record table missing required columns: {missing}")
56
+ return df
57
+
58
+
59
+ def species_summary(df: pd.DataFrame) -> pd.DataFrame:
60
+ summary = (
61
+ df.groupby("Species")
62
+ .agg(
63
+ n_events=("Species", "count"),
64
+ n_stations=("Station", "nunique"),
65
+ first_detection=("DateTimeOriginal", "min"),
66
+ last_detection=("DateTimeOriginal", "max"),
67
+ )
68
+ .reset_index()
69
+ )
70
+ summary["rai_per_100_trapnights"] = np.nan # requires cam_op; computed externally
71
+ return summary
72
+
73
+
74
+ def station_summary(df: pd.DataFrame) -> pd.DataFrame:
75
+ return (
76
+ df.groupby("Station")
77
+ .agg(
78
+ n_events=("Species", "count"),
79
+ n_species=("Species", "nunique"),
80
+ first_detection=("DateTimeOriginal", "min"),
81
+ last_detection=("DateTimeOriginal", "max"),
82
+ )
83
+ .reset_index()
84
+ )
85
+
86
+
87
+ def plot_detection_timeline(df: pd.DataFrame, output_path: Path) -> None:
88
+ """Timeline of detections per species over time."""
89
+ species_list = df["Species"].unique()
90
+ species_codes = {sp: i for i, sp in enumerate(sorted(species_list))}
91
+
92
+ fig, ax = plt.subplots(figsize=(12, max(4, len(species_list) * 0.6)))
93
+ for _, row in df.iterrows():
94
+ y = species_codes[row["Species"]]
95
+ ax.plot(row["DateTimeOriginal"], y, "|", color="steelblue",
96
+ markersize=4, alpha=0.5)
97
+
98
+ ax.set_yticks(list(species_codes.values()))
99
+ ax.set_yticklabels([s.replace("_", " ") for s in species_codes.keys()],
100
+ fontsize=8)
101
+ ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
102
+ fig.autofmt_xdate()
103
+ ax.set_xlabel("Date")
104
+ ax.set_title("Camera Trap Detection Timeline")
105
+ ax.grid(axis="x", linestyle="--", alpha=0.4)
106
+ plt.tight_layout()
107
+ fig.savefig(output_path, dpi=120)
108
+ plt.close(fig)
109
+
110
+
111
+ def main():
112
+ if len(sys.argv) < 3:
113
+ print(__doc__)
114
+ sys.exit(1)
115
+
116
+ record_csv = sys.argv[1]
117
+ output_dir = Path(sys.argv[2])
118
+ species_filter = sys.argv[3] if len(sys.argv) >= 4 else None
119
+
120
+ # ── Input precondition checks ────────────────────────────────────────────
121
+ if not Path(record_csv).exists():
122
+ logger.error(
123
+ "Input not found: %s\n Likely cause: process_camtrap_data.R was not run or failed\n Previous skill: camera-trap-processing (process_camtrap_data.R)",
124
+ record_csv,
125
+ )
126
+ sys.exit(1)
127
+
128
+ log_decision("species_filter", species_filter,
129
+ "None = all species; a specific name = filter to that species")
130
+
131
+ output_dir.mkdir(parents=True, exist_ok=True)
132
+
133
+ log_step(1, "Loading records from the CSV table")
134
+ try:
135
+ df = load_records(record_csv)
136
+ except Exception as e:
137
+ logger.error(
138
+ "Failed to load record_table CSV: %s\n Likely cause: required columns missing or file corrupted\n Previous skill: camera-trap-processing",
139
+ e,
140
+ )
141
+ sys.exit(1)
142
+
143
+ logger.info("Total records: %d, Species: %d", len(df), df["Species"].nunique())
144
+
145
+ if species_filter:
146
+ log_step(2, f"Filtering to species '{species_filter}'")
147
+ df = df[df["Species"] == species_filter]
148
+ logger.info("Filtered to '%s': %d records", species_filter, len(df))
149
+ if len(df) == 0:
150
+ logger.error(
151
+ "No records found for species '%s'\n Likely cause: incorrect species name\n Previous skill: camera-trap-processing (process_camtrap_data.R)",
152
+ species_filter,
153
+ )
154
+ sys.exit(1)
155
+ else:
156
+ log_step(2, "Processing all species")
157
+
158
+ log_step(3, "Computing per-species and per-station summaries")
159
+ try:
160
+ sp_sum = species_summary(df)
161
+ st_sum = station_summary(df)
162
+ except Exception as e:
163
+ logger.error("Unexpected error in species/station summary: %s", e)
164
+ raise
165
+
166
+ sp_sum.to_csv(output_dir / "species_summary.csv", index=False)
167
+ st_sum.to_csv(output_dir / "station_summary.csv", index=False)
168
+ logger.info("species_summary.csv: %d species", len(sp_sum))
169
+ logger.info("station_summary.csv: %d stations", len(st_sum))
170
+
171
+ low_events = sp_sum[sp_sum["n_events"] < 10]["Species"].tolist()
172
+ if low_events:
173
+ logger.warning(
174
+ "Species with < 10 events (RAI only; no occupancy estimate): %s",
175
+ low_events,
176
+ )
177
+
178
+ log_step(4, "Generating the detection timeline plot")
179
+ try:
180
+ plot_detection_timeline(df, output_dir / "detection_timeline.png")
181
+ logger.info("detection_timeline.png saved")
182
+ except Exception as e:
183
+ logger.error("Unexpected error in plot_detection_timeline: %s", e)
184
+ raise
185
+
186
+ logger.info("Done. Outputs written to: %s", output_dir)
187
+ logger.info(" species_summary.csv: %d species", len(sp_sum))
188
+ logger.info(" station_summary.csv: %d stations", len(st_sum))
189
+
190
+
191
+ if __name__ == "__main__":
192
+ main()
@@ -0,0 +1,133 @@
1
+ ---
2
+ name: community-ecology-ordination
3
+ description: "Performs multivariate community ecology analyses including ordination, diversity metrics, and assemblage comparisons. Use this skill when the user mentions species composition, NMDS, PCA ordination, PERMANOVA, beta diversity, alpha diversity, species richness, Bray-Curtis dissimilarity, indicator species analysis, cluster analysis, species-by-site matrices, diversity indices, or assemblage structure comparisons."
4
+ skill_version: 1.0.0
5
+ ---
6
+
7
+ # Skill: community-ecology-ordination
8
+
9
+ **Domain:** NMDS · PCA · PCoA · Diversity · Clustering · Composition
10
+ **Phase:** 3 — Specialist
11
+ **Used by:** analyze-community-structure
12
+
13
+ ---
14
+
15
+ ## Purpose
16
+
17
+ Guides the agent through multivariate analysis of ecological communities: ordination of species assemblages, diversity metric computation, beta diversity partitioning, cluster analysis, and hypothesis testing on community composition.
18
+
19
+ ---
20
+
21
+ ## When to Invoke
22
+
23
+ - Analysing species composition across multiple sites
24
+ - Comparing community structure between treatments, habitats, or time periods
25
+ - Computing alpha and beta diversity metrics
26
+ - Identifying species groups or site clusters
27
+
28
+ ---
29
+
30
+ ## Inputs
31
+
32
+ | Input | Format | Required |
33
+ |-------|--------|----------|
34
+ | Species × site abundance or presence/absence matrix | CSV | Yes |
35
+ | Environmental metadata per site | CSV | Recommended |
36
+ | Treatment or grouping variable | Factor column | Recommended |
37
+
38
+ ---
39
+
40
+ ## Outputs
41
+
42
+ | Output | Description |
43
+ |--------|-------------|
44
+ | `ordination_plot.png` | NMDS/PCA biplot |
45
+ | `diversity_metrics.csv` | Alpha diversity per site |
46
+ | `beta_diversity_matrix.csv` | Pairwise dissimilarity matrix |
47
+ | `permanova_results.txt` | PERMANOVA output |
48
+ | `cluster_dendrogram.png` | Hierarchical clustering dendrogram |
49
+ | `community_report.md` | Full analysis narrative |
50
+
51
+ ---
52
+
53
+ ## Steps
54
+
55
+ ### 1. Data Preparation
56
+ - Check for sites with zero species (remove or flag)
57
+ - Check for species observed at only one site (rare species handling: keep or remove)
58
+ - Standardise if needed (Hellinger, Wisconsin, presence/absence)
59
+ - Choose dissimilarity metric: Bray-Curtis (abundance), Jaccard (presence/absence), UniFrac (phylogenetic)
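The preparation choices above can be sketched in plain Python. This is a minimal illustration, not the skill's implementation: the function names and the toy `sites` matrix are hypothetical, and in practice the listed libraries (`vegan`, `skbio`) would normally be used instead.

```python
import math

def hellinger(row):
    """Hellinger transform: square root of each species' relative abundance."""
    total = sum(row)
    if total == 0:
        raise ValueError("site has zero total abundance; remove or flag it first")
    return [math.sqrt(x / total) for x in row]

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def jaccard(u, v):
    """Jaccard dissimilarity computed on presence/absence."""
    shared = sum(1 for a, b in zip(u, v) if a > 0 and b > 0)
    union = sum(1 for a, b in zip(u, v) if a > 0 or b > 0)
    return 1 - shared / union if union else 0.0

# Hypothetical toy matrix: rows are sites, columns are species.
sites = {"A": [10, 0, 5], "B": [5, 5, 0], "C": [0, 0, 0]}

# Step 1: drop (or flag) sites with zero species before anything else.
kept = {name: row for name, row in sites.items() if sum(row) > 0}

bc_ab = bray_curtis(kept["A"], kept["B"])
jac_ab = jaccard(kept["A"], kept["B"])
hel_a = hellinger(kept["A"])
```

Empty sites are filtered first because both the Hellinger transform and Bray-Curtis are undefined when a site total is zero.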
60
+
61
+ ### 2. Alpha Diversity
62
+ - Species richness (S)
63
+ - Shannon index (H')
64
+ - Gini-Simpson index (1−D, where D = Σ pᵢ²)
65
+ - Rarefaction curves to assess sampling adequacy
66
+ - Report metric ± SE per group
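As a rough sketch of the metrics listed above, the following computes richness, Shannon H' (natural log) and Gini-Simpson 1−D for one site, then a mean ± SE across the sites of a group. The helper names and the `forest_sites` data are hypothetical, chosen only for illustration.

```python
import math
import statistics

def richness(counts):
    """Species richness S: number of species with non-zero abundance."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H' using natural logarithms."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def gini_simpson(counts):
    """Gini-Simpson index 1 - D, with D the sum of squared proportions."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def mean_se(values):
    """Mean and standard error (sample SD / sqrt(n)); needs at least 2 values."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical group of two sites (abundance per species).
forest_sites = [[10, 10, 10, 10], [5, 5, 0, 0]]

h_values = [shannon(site) for site in forest_sites]
h_mean, h_se = mean_se(h_values)
```

Rarefaction is not covered here; it needs resampling to a common sample size and is better left to a dedicated library.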
67
+
68
+ ### 3. Ordination
69
+ **NMDS:**
70
+ - Run with k=2 (default) and k=3; prefer the smallest k that reaches acceptable stress, since stress generally decreases as k increases
71
+ - Stress < 0.05 = excellent, < 0.1 = good, < 0.2 = acceptable, > 0.2 = poor (common rule of thumb)
72
+ - Run with ≥ 20 random starts; confirm convergence
73
+
74
+ **PCA (for environmental gradients or species scores):**
75
+ - Use Hellinger-transformed data or correlation matrix
76
+ - Report eigenvalues and % variance per axis
77
+
78
+ **PCoA / MDS:**
79
+ - For non-Euclidean dissimilarity matrices
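To make the PCoA step concrete, here is a minimal sketch of classical PCoA (double-centring the squared dissimilarities, then an eigendecomposition). It assumes `numpy` is available; the function name `pcoa` and the toy `line_dist` matrix are illustrative only. Negative eigenvalues, which can arise with non-Euclidean dissimilarities, are simply dropped here rather than corrected.

```python
import numpy as np

def pcoa(dist, n_axes=2):
    """Classical PCoA from a square, symmetric dissimilarity matrix.

    Returns site coordinates and the share of positive-eigenvalue
    variance explained by each retained axis.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering   # Gower double-centring
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = eigvals > 1e-10                    # drop zero/negative axes
    explained = eigvals[positive] / eigvals[positive].sum()
    coords = eigvecs[:, positive] * np.sqrt(eigvals[positive])
    return coords[:, :n_axes], explained[:n_axes]

# Toy example: three sites lying on a line at positions 0, 1 and 3.
line_dist = [[0, 1, 3],
             [1, 0, 2],
             [3, 2, 0]]
coords, explained = pcoa(line_dist)
```

Because the toy distances are exactly Euclidean in one dimension, a single axis recovers them; real Bray-Curtis matrices usually need several axes, and the per-axis variance should be reported as for PCA.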
80
+
81
+ ### 4. Beta Diversity Partitioning
82
+ - Partition total beta diversity into nestedness and turnover components (betapart)
83
+ - Report contribution of each component per group comparison
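The partition can be illustrated for a single pair of sites using the Sørensen-family decomposition, where total dissimilarity equals a turnover component (Simpson dissimilarity) plus a nestedness-resultant component. This is a sketch under that assumption; the function name and the example species sets are hypothetical, and `betapart` remains the reference implementation named by the skill.

```python
def partition_pair(site1, site2):
    """Split pairwise Sorensen dissimilarity into turnover and nestedness.

    site1, site2: sets of species names present at each site.
    """
    a = len(site1 & site2)          # shared species
    b = len(site1 - site2)          # unique to site 1
    c = len(site2 - site1)          # unique to site 2
    if a + b + c == 0:
        raise ValueError("both sites are empty")
    beta_sor = (b + c) / (2 * a + b + c)
    low = min(b, c)
    beta_sim = low / (a + low) if (a + low) else 0.0   # turnover
    beta_sne = beta_sor - beta_sim                      # nestedness-resultant
    return {"total": beta_sor, "turnover": beta_sim, "nestedness": beta_sne}

# Site 2 is a strict subset of site 1: all dissimilarity is nestedness.
nested = partition_pair({"sp1", "sp2", "sp3", "sp4"}, {"sp1", "sp2"})

# Equal richness with one species replaced: all dissimilarity is turnover.
replaced = partition_pair({"sp1", "sp2", "sp3"}, {"sp1", "sp2", "sp4"})
```

The two toy pairs have the same total dissimilarity but opposite components, which is the distinction the per-comparison report is meant to surface.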
84
+
85
+ ### 5. Hypothesis Testing
86
+ - PERMANOVA (`adonis2`): test if group centroids differ
87
+ - PERMDISP (`betadisper` with `permutest` in `vegan`): test whether group dispersions (multivariate spread) differ; required before interpreting PERMANOVA
88
+ - ANOSIM: alternative non-parametric test
89
+ - Report the pseudo-F (PERMANOVA) or R (ANOSIM) statistic, R², and the permutation-based p-value
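For intuition about what a one-way PERMANOVA computes, the sketch below derives the pseudo-F and R² directly from a distance matrix and obtains a p-value by permuting group labels. It is a simplified illustration with hypothetical names (`pseudo_f`, `permanova`) and toy data, not a substitute for `adonis2`, which also handles multiple terms and restricted permutation designs.

```python
import random
from itertools import combinations

def pseudo_f(dist, groups):
    """Pseudo-F and R^2 for a one-way design from pairwise distances."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    f_stat = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f_stat, ss_among / ss_total

def permanova(dist, groups, n_perm=999, seed=0):
    """Permutation test: shuffle labels and count pseudo-F >= observed."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)

# Toy data: six sites on a 1-D gradient, two well-separated groups.
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
toy_groups = ["A", "A", "A", "B", "B", "B"]

f_obs, r2, p_value = permanova(toy_dist, toy_groups)
```

The permutation count (`n_perm`) and the seed are the values to record under "Key Decisions to Document"; with only six sites the number of distinct label arrangements is small, so the smallest achievable p-value is limited regardless of `n_perm`.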
90
+
91
+ ### 6. Species Contributions
92
+ - SIMPER: identify species driving dissimilarity between groups
93
+ - IndVal: identify indicator species per group
94
+ - Report top N contributing species per axis or group
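The indicator value idea can be sketched as specificity times fidelity for one species, following the original Dufrêne-Legendre formulation. This is an assumption about which variant is intended: the function name and data are hypothetical, and library implementations may differ (some report the square root of this product or a percentage).

```python
def indval(abundances, groups):
    """Indicator value of one species for each group.

    abundances: abundance of the species at each site.
    groups: group label of each site (same order).
    Returns {group: specificity * fidelity}, each in [0, 1].
    """
    labels = sorted(set(groups))
    mean_abund = {}
    fidelity = {}
    for g in labels:
        values = [x for x, lab in zip(abundances, groups) if lab == g]
        mean_abund[g] = sum(values) / len(values)
        # Fidelity: share of the group's sites where the species occurs.
        fidelity[g] = sum(1 for x in values if x > 0) / len(values)
    total_mean = sum(mean_abund.values())
    if total_mean == 0:
        return {g: 0.0 for g in labels}
    # Specificity: the group's share of the summed group means.
    return {g: (mean_abund[g] / total_mean) * fidelity[g] for g in labels}

# Hypothetical species: common in group A, one stray record in group B.
scores = indval([4, 6, 0, 0, 0, 1], ["A", "A", "A", "B", "B", "B"])
```

Significance of an indicator value is normally assessed by permuting site labels, in the same spirit as the PERMANOVA test in step 5.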
95
+
96
+ ### 7. Cluster Analysis
97
+ - Hierarchical clustering: Ward.D2 linkage preferred
98
+ - Cophenetic correlation to assess cluster quality
99
+ - k-means as alternative for large datasets
100
+ - Determine optimal k using silhouette or elbow method
101
+
102
+ ---
103
+
104
+ ## Key Decisions to Document
105
+
106
+ - Dissimilarity metric and rationale
107
+ - Rare species handling
108
+ - Data transformation applied
109
+ - Number of NMDS dimensions
110
+ - Permutation count for PERMANOVA
111
+
112
+ ---
113
+
114
+ ## Tools and Libraries
115
+
116
+ **R:** `vegan`, `betapart`, `indicspecies`, `ape`, `dendextend`, `ggplot2`
117
+ **Python:** `skbio`, `scipy.cluster`, `sklearn.manifold`
118
+
119
+ ---
120
+
121
+ ## Resources
122
+
123
+ - `resources/dissimilarity-metric-guide.md` — which metric for which data type
124
+ - `resources/nmds-interpretation-guide.md` — how to read and report NMDS plots
125
+ - `examples/` — worked NMDS and PERMANOVA example
126
+
127
+ ---
128
+
129
+ ## Notes
130
+
131
+ - Always run PERMDISP before interpreting PERMANOVA; a significant PERMANOVA result can reflect differences in group dispersion rather than in centroid location
132
+ - Stress value must be reported alongside all NMDS plots
133
+ - Rarefaction is mandatory when sites have very different sampling intensities