claudient 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (283)
  1. package/.claude-plugin/plugin.json +42 -0
  2. package/CONTEXT.md +58 -0
  3. package/README.md +165 -0
  4. package/agents/build-resolvers/de/python-resolver.md +64 -0
  5. package/agents/build-resolvers/de/typescript-resolver.md +65 -0
  6. package/agents/build-resolvers/es/python-resolver.md +64 -0
  7. package/agents/build-resolvers/es/typescript-resolver.md +65 -0
  8. package/agents/build-resolvers/fr/python-resolver.md +64 -0
  9. package/agents/build-resolvers/fr/typescript-resolver.md +65 -0
  10. package/agents/build-resolvers/nl/python-resolver.md +64 -0
  11. package/agents/build-resolvers/nl/typescript-resolver.md +65 -0
  12. package/agents/build-resolvers/python-resolver.md +62 -0
  13. package/agents/build-resolvers/typescript-resolver.md +63 -0
  14. package/agents/core/architect.md +64 -0
  15. package/agents/core/code-reviewer.md +78 -0
  16. package/agents/core/de/architect.md +66 -0
  17. package/agents/core/de/code-reviewer.md +80 -0
  18. package/agents/core/de/planner.md +63 -0
  19. package/agents/core/de/security-reviewer.md +93 -0
  20. package/agents/core/es/architect.md +66 -0
  21. package/agents/core/es/code-reviewer.md +80 -0
  22. package/agents/core/es/planner.md +63 -0
  23. package/agents/core/es/security-reviewer.md +93 -0
  24. package/agents/core/fr/architect.md +66 -0
  25. package/agents/core/fr/code-reviewer.md +80 -0
  26. package/agents/core/fr/planner.md +63 -0
  27. package/agents/core/fr/security-reviewer.md +93 -0
  28. package/agents/core/nl/architect.md +66 -0
  29. package/agents/core/nl/code-reviewer.md +80 -0
  30. package/agents/core/nl/planner.md +63 -0
  31. package/agents/core/nl/security-reviewer.md +93 -0
  32. package/agents/core/planner.md +61 -0
  33. package/agents/core/security-reviewer.md +91 -0
  34. package/guides/agent-orchestration.md +231 -0
  35. package/guides/de/agent-orchestration.md +174 -0
  36. package/guides/de/getting-started.md +164 -0
  37. package/guides/de/hooks-cookbook.md +160 -0
  38. package/guides/de/memory-management.md +153 -0
  39. package/guides/de/security.md +180 -0
  40. package/guides/de/skill-authoring.md +214 -0
  41. package/guides/de/token-optimization.md +156 -0
  42. package/guides/es/agent-orchestration.md +174 -0
  43. package/guides/es/getting-started.md +164 -0
  44. package/guides/es/hooks-cookbook.md +160 -0
  45. package/guides/es/memory-management.md +153 -0
  46. package/guides/es/security.md +180 -0
  47. package/guides/es/skill-authoring.md +214 -0
  48. package/guides/es/token-optimization.md +156 -0
  49. package/guides/fr/agent-orchestration.md +174 -0
  50. package/guides/fr/getting-started.md +164 -0
  51. package/guides/fr/hooks-cookbook.md +227 -0
  52. package/guides/fr/memory-management.md +169 -0
  53. package/guides/fr/security.md +180 -0
  54. package/guides/fr/skill-authoring.md +214 -0
  55. package/guides/fr/token-optimization.md +158 -0
  56. package/guides/getting-started.md +164 -0
  57. package/guides/hooks-cookbook.md +423 -0
  58. package/guides/memory-management.md +192 -0
  59. package/guides/nl/agent-orchestration.md +174 -0
  60. package/guides/nl/getting-started.md +164 -0
  61. package/guides/nl/hooks-cookbook.md +160 -0
  62. package/guides/nl/memory-management.md +153 -0
  63. package/guides/nl/security.md +180 -0
  64. package/guides/nl/skill-authoring.md +214 -0
  65. package/guides/nl/token-optimization.md +156 -0
  66. package/guides/security.md +229 -0
  67. package/guides/skill-authoring.md +226 -0
  68. package/guides/token-optimization.md +169 -0
  69. package/hooks/lifecycle/cost-tracker.md +49 -0
  70. package/hooks/lifecycle/cost-tracker.sh +59 -0
  71. package/hooks/lifecycle/pre-compact-save.md +56 -0
  72. package/hooks/lifecycle/pre-compact-save.sh +37 -0
  73. package/hooks/lifecycle/session-start.md +50 -0
  74. package/hooks/lifecycle/session-start.sh +47 -0
  75. package/hooks/post-tool-use/audit-log.md +53 -0
  76. package/hooks/post-tool-use/audit-log.sh +53 -0
  77. package/hooks/post-tool-use/prettier.md +53 -0
  78. package/hooks/post-tool-use/prettier.sh +49 -0
  79. package/hooks/pre-tool-use/block-dangerous.md +48 -0
  80. package/hooks/pre-tool-use/block-dangerous.sh +76 -0
  81. package/hooks/pre-tool-use/git-push-confirm.md +46 -0
  82. package/hooks/pre-tool-use/git-push-confirm.sh +36 -0
  83. package/mcp/configs/github.json +11 -0
  84. package/mcp/configs/postgres.json +11 -0
  85. package/mcp/de/recommended-servers.md +170 -0
  86. package/mcp/es/recommended-servers.md +170 -0
  87. package/mcp/fr/recommended-servers.md +170 -0
  88. package/mcp/nl/recommended-servers.md +170 -0
  89. package/mcp/recommended-servers.md +168 -0
  90. package/package.json +45 -0
  91. package/prompts/project-starters/de/fastapi-project.md +62 -0
  92. package/prompts/project-starters/de/nextjs-project.md +82 -0
  93. package/prompts/project-starters/es/fastapi-project.md +62 -0
  94. package/prompts/project-starters/es/nextjs-project.md +82 -0
  95. package/prompts/project-starters/fastapi-project.md +60 -0
  96. package/prompts/project-starters/fr/fastapi-project.md +62 -0
  97. package/prompts/project-starters/fr/nextjs-project.md +82 -0
  98. package/prompts/project-starters/nextjs-project.md +80 -0
  99. package/prompts/project-starters/nl/fastapi-project.md +62 -0
  100. package/prompts/project-starters/nl/nextjs-project.md +82 -0
  101. package/prompts/system-prompts/ai-product.md +80 -0
  102. package/prompts/system-prompts/data-pipeline.md +76 -0
  103. package/prompts/system-prompts/de/ai-product.md +82 -0
  104. package/prompts/system-prompts/de/data-pipeline.md +78 -0
  105. package/prompts/system-prompts/de/saas-backend.md +71 -0
  106. package/prompts/system-prompts/es/ai-product.md +82 -0
  107. package/prompts/system-prompts/es/data-pipeline.md +78 -0
  108. package/prompts/system-prompts/es/saas-backend.md +71 -0
  109. package/prompts/system-prompts/fr/ai-product.md +82 -0
  110. package/prompts/system-prompts/fr/data-pipeline.md +78 -0
  111. package/prompts/system-prompts/fr/saas-backend.md +71 -0
  112. package/prompts/system-prompts/nl/ai-product.md +82 -0
  113. package/prompts/system-prompts/nl/data-pipeline.md +78 -0
  114. package/prompts/system-prompts/nl/saas-backend.md +71 -0
  115. package/prompts/system-prompts/saas-backend.md +69 -0
  116. package/prompts/task-specific/changelog.md +81 -0
  117. package/prompts/task-specific/de/changelog.md +83 -0
  118. package/prompts/task-specific/de/debugging.md +78 -0
  119. package/prompts/task-specific/de/pr-description.md +69 -0
  120. package/prompts/task-specific/debugging.md +76 -0
  121. package/prompts/task-specific/es/changelog.md +83 -0
  122. package/prompts/task-specific/es/debugging.md +78 -0
  123. package/prompts/task-specific/es/pr-description.md +69 -0
  124. package/prompts/task-specific/fr/changelog.md +83 -0
  125. package/prompts/task-specific/fr/debugging.md +78 -0
  126. package/prompts/task-specific/fr/pr-description.md +69 -0
  127. package/prompts/task-specific/nl/changelog.md +83 -0
  128. package/prompts/task-specific/nl/debugging.md +78 -0
  129. package/prompts/task-specific/nl/pr-description.md +69 -0
  130. package/prompts/task-specific/pr-description.md +67 -0
  131. package/rules/common/coding-style.md +45 -0
  132. package/rules/common/de/coding-style.md +47 -0
  133. package/rules/common/de/git.md +48 -0
  134. package/rules/common/de/performance.md +40 -0
  135. package/rules/common/de/security.md +45 -0
  136. package/rules/common/de/testing.md +45 -0
  137. package/rules/common/es/coding-style.md +47 -0
  138. package/rules/common/es/git.md +48 -0
  139. package/rules/common/es/performance.md +40 -0
  140. package/rules/common/es/security.md +45 -0
  141. package/rules/common/es/testing.md +45 -0
  142. package/rules/common/fr/coding-style.md +47 -0
  143. package/rules/common/fr/git.md +48 -0
  144. package/rules/common/fr/performance.md +40 -0
  145. package/rules/common/fr/security.md +45 -0
  146. package/rules/common/fr/testing.md +45 -0
  147. package/rules/common/git.md +46 -0
  148. package/rules/common/nl/coding-style.md +47 -0
  149. package/rules/common/nl/git.md +48 -0
  150. package/rules/common/nl/performance.md +40 -0
  151. package/rules/common/nl/security.md +45 -0
  152. package/rules/common/nl/testing.md +45 -0
  153. package/rules/common/performance.md +38 -0
  154. package/rules/common/security.md +43 -0
  155. package/rules/common/testing.md +43 -0
  156. package/rules/language-specific/de/go.md +48 -0
  157. package/rules/language-specific/de/python.md +38 -0
  158. package/rules/language-specific/de/typescript.md +51 -0
  159. package/rules/language-specific/es/go.md +48 -0
  160. package/rules/language-specific/es/python.md +38 -0
  161. package/rules/language-specific/es/typescript.md +51 -0
  162. package/rules/language-specific/fr/go.md +48 -0
  163. package/rules/language-specific/fr/python.md +38 -0
  164. package/rules/language-specific/fr/typescript.md +51 -0
  165. package/rules/language-specific/go.md +46 -0
  166. package/rules/language-specific/nl/go.md +48 -0
  167. package/rules/language-specific/nl/python.md +38 -0
  168. package/rules/language-specific/nl/typescript.md +51 -0
  169. package/rules/language-specific/python.md +36 -0
  170. package/rules/language-specific/typescript.md +49 -0
  171. package/scripts/cli.js +161 -0
  172. package/scripts/link-skills.sh +35 -0
  173. package/scripts/list-skills.sh +34 -0
  174. package/skills/ai-engineering/agent-construction.md +285 -0
  175. package/skills/ai-engineering/claude-api.md +248 -0
  176. package/skills/ai-engineering/de/agent-construction.md +287 -0
  177. package/skills/ai-engineering/de/claude-api.md +250 -0
  178. package/skills/ai-engineering/es/agent-construction.md +287 -0
  179. package/skills/ai-engineering/es/claude-api.md +250 -0
  180. package/skills/ai-engineering/fr/agent-construction.md +287 -0
  181. package/skills/ai-engineering/fr/claude-api.md +250 -0
  182. package/skills/ai-engineering/nl/agent-construction.md +287 -0
  183. package/skills/ai-engineering/nl/claude-api.md +250 -0
  184. package/skills/backend/dotnet/csharp.md +304 -0
  185. package/skills/backend/dotnet/de/csharp.md +306 -0
  186. package/skills/backend/dotnet/es/csharp.md +306 -0
  187. package/skills/backend/dotnet/fr/csharp.md +306 -0
  188. package/skills/backend/dotnet/nl/csharp.md +306 -0
  189. package/skills/backend/go/de/go.md +307 -0
  190. package/skills/backend/go/es/go.md +307 -0
  191. package/skills/backend/go/fr/go.md +307 -0
  192. package/skills/backend/go/go.md +305 -0
  193. package/skills/backend/go/nl/go.md +307 -0
  194. package/skills/backend/nodejs/de/nestjs.md +274 -0
  195. package/skills/backend/nodejs/de/nextjs.md +222 -0
  196. package/skills/backend/nodejs/es/nestjs.md +274 -0
  197. package/skills/backend/nodejs/es/nextjs.md +222 -0
  198. package/skills/backend/nodejs/fr/nestjs.md +274 -0
  199. package/skills/backend/nodejs/fr/nextjs.md +222 -0
  200. package/skills/backend/nodejs/nestjs.md +272 -0
  201. package/skills/backend/nodejs/nextjs.md +220 -0
  202. package/skills/backend/nodejs/nl/nestjs.md +274 -0
  203. package/skills/backend/nodejs/nl/nextjs.md +222 -0
  204. package/skills/backend/python/de/django.md +285 -0
  205. package/skills/backend/python/de/fastapi.md +244 -0
  206. package/skills/backend/python/django.md +283 -0
  207. package/skills/backend/python/es/django.md +285 -0
  208. package/skills/backend/python/es/fastapi.md +244 -0
  209. package/skills/backend/python/fastapi.md +242 -0
  210. package/skills/backend/python/fr/django.md +285 -0
  211. package/skills/backend/python/fr/fastapi.md +244 -0
  212. package/skills/backend/python/nl/django.md +285 -0
  213. package/skills/backend/python/nl/fastapi.md +244 -0
  214. package/skills/data-ml/dbt-data-pipelines.md +155 -0
  215. package/skills/data-ml/de/dbt-data-pipelines.md +157 -0
  216. package/skills/data-ml/de/pandas-polars.md +147 -0
  217. package/skills/data-ml/de/pytorch-tensorflow.md +171 -0
  218. package/skills/data-ml/es/dbt-data-pipelines.md +157 -0
  219. package/skills/data-ml/es/pandas-polars.md +147 -0
  220. package/skills/data-ml/es/pytorch-tensorflow.md +171 -0
  221. package/skills/data-ml/fr/dbt-data-pipelines.md +157 -0
  222. package/skills/data-ml/fr/pandas-polars.md +147 -0
  223. package/skills/data-ml/fr/pytorch-tensorflow.md +171 -0
  224. package/skills/data-ml/nl/dbt-data-pipelines.md +157 -0
  225. package/skills/data-ml/nl/pandas-polars.md +147 -0
  226. package/skills/data-ml/nl/pytorch-tensorflow.md +171 -0
  227. package/skills/data-ml/pandas-polars.md +145 -0
  228. package/skills/data-ml/pytorch-tensorflow.md +169 -0
  229. package/skills/database/de/graphql.md +181 -0
  230. package/skills/database/es/graphql.md +181 -0
  231. package/skills/database/fr/graphql.md +181 -0
  232. package/skills/database/graphql.md +179 -0
  233. package/skills/database/nl/graphql.md +181 -0
  234. package/skills/devops-infra/de/docker.md +133 -0
  235. package/skills/devops-infra/de/github-actions.md +179 -0
  236. package/skills/devops-infra/de/kubernetes.md +129 -0
  237. package/skills/devops-infra/de/terraform.md +130 -0
  238. package/skills/devops-infra/docker.md +131 -0
  239. package/skills/devops-infra/es/docker.md +133 -0
  240. package/skills/devops-infra/es/github-actions.md +179 -0
  241. package/skills/devops-infra/es/kubernetes.md +129 -0
  242. package/skills/devops-infra/es/terraform.md +130 -0
  243. package/skills/devops-infra/fr/docker.md +133 -0
  244. package/skills/devops-infra/fr/github-actions.md +179 -0
  245. package/skills/devops-infra/fr/kubernetes.md +129 -0
  246. package/skills/devops-infra/fr/terraform.md +130 -0
  247. package/skills/devops-infra/github-actions.md +177 -0
  248. package/skills/devops-infra/kubernetes.md +127 -0
  249. package/skills/devops-infra/nl/docker.md +133 -0
  250. package/skills/devops-infra/nl/github-actions.md +179 -0
  251. package/skills/devops-infra/nl/kubernetes.md +129 -0
  252. package/skills/devops-infra/nl/terraform.md +130 -0
  253. package/skills/devops-infra/terraform.md +128 -0
  254. package/skills/finance-payments/de/stripe.md +187 -0
  255. package/skills/finance-payments/es/stripe.md +187 -0
  256. package/skills/finance-payments/fr/stripe.md +187 -0
  257. package/skills/finance-payments/nl/stripe.md +187 -0
  258. package/skills/finance-payments/stripe.md +185 -0
  259. package/workflows/code-review.md +151 -0
  260. package/workflows/de/code-review.md +153 -0
  261. package/workflows/de/debugging-session.md +146 -0
  262. package/workflows/de/feature-development.md +155 -0
  263. package/workflows/de/new-project-bootstrap.md +175 -0
  264. package/workflows/de/refactor-safely.md +150 -0
  265. package/workflows/debugging-session.md +144 -0
  266. package/workflows/es/code-review.md +153 -0
  267. package/workflows/es/debugging-session.md +146 -0
  268. package/workflows/es/feature-development.md +155 -0
  269. package/workflows/es/new-project-bootstrap.md +175 -0
  270. package/workflows/es/refactor-safely.md +150 -0
  271. package/workflows/feature-development.md +153 -0
  272. package/workflows/fr/code-review.md +153 -0
  273. package/workflows/fr/debugging-session.md +146 -0
  274. package/workflows/fr/feature-development.md +155 -0
  275. package/workflows/fr/new-project-bootstrap.md +175 -0
  276. package/workflows/fr/refactor-safely.md +150 -0
  277. package/workflows/new-project-bootstrap.md +173 -0
  278. package/workflows/nl/code-review.md +153 -0
  279. package/workflows/nl/debugging-session.md +146 -0
  280. package/workflows/nl/feature-development.md +155 -0
  281. package/workflows/nl/new-project-bootstrap.md +175 -0
  282. package/workflows/nl/refactor-safely.md +150 -0
  283. package/workflows/refactor-safely.md +148 -0
@@ -0,0 +1,147 @@ package/skills/data-ml/nl/pandas-polars.md
> 🇳🇱 This is the Dutch translation. [English version](../pandas-polars.md).

# Pandas / Polars Skill

## When to activate
- Cleaning, transforming, or aggregating tabular data in Python
- Merging, joining, or reshaping DataFrames
- Writing data validation or quality checks
- Converting between formats (CSV, Parquet, JSON, Excel)
- Profiling or exploring a new dataset
- Optimizing slow Pandas code for large datasets
- Migrating Pandas code to Polars for better performance

## When NOT to use
- SQL in a database (push transformations to the database when the data is already there)
- Spark/distributed computing (use the PySpark skill for datasets larger than available RAM)
- dbt models (SQL-based transformations in a warehouse)
- NumPy array operations on non-tabular data

## Instructions

### Pandas — performance rules
```python
import pandas as pd
import numpy as np

# Never use iterrows() — vectorize instead
# Bad:
for idx, row in df.iterrows():
    df.at[idx, 'tax'] = row['price'] * 0.2

# Good:
df['tax'] = df['price'] * 0.2

# Use .loc for label-based access, .iloc for position-based
# Never use chained indexing for assignment — it causes SettingWithCopyWarning
df.loc[df['status'] == 'active', 'flag'] = True

# Categorical dtype for low-cardinality string columns (massive memory saving)
df['country'] = df['country'].astype('category')

# Downcast numeric types to reduce memory
df['quantity'] = pd.to_numeric(df['quantity'], downcast='integer')
df['price'] = pd.to_numeric(df['price'], downcast='float')
```
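To make the memory claim for categorical dtypes concrete, a minimal sketch with toy data (column name and values are hypothetical):

```python
import pandas as pd

# Toy column: 300k strings drawn from only 3 distinct values.
df = pd.DataFrame({"country": ["NL", "DE", "FR"] * 100_000})

as_object = df["country"].memory_usage(deep=True)
as_category = df["country"].astype("category").memory_usage(deep=True)

# The categorical column stores small integer codes plus a tiny lookup table,
# so it is a fraction of the size of the object column.
print(as_object, as_category)
```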

### Pandas — aggregation and groupby
```python
# Groupby with multiple aggregations
summary = (
    df.groupby(['region', 'category'])
    .agg(
        total_revenue=('revenue', 'sum'),
        order_count=('order_id', 'nunique'),
        avg_order_value=('revenue', 'mean'),
    )
    .reset_index()
    .sort_values('total_revenue', ascending=False)
)
```

### Pandas — merging
```python
# Always specify how= explicitly — never rely on the default (inner)
result = pd.merge(
    orders,
    customers,
    on='customer_id',
    how='left',       # explicit
    validate='m:1',   # validates cardinality — raises if violated
    suffixes=('_order', '_customer')
)
```

### Polars — when to use instead of Pandas
Use Polars when:
- The dataset has more than 1M rows (Polars is 5–100x faster for many operations)
- You need lazy evaluation (query optimization before execution)
- Parallelism matters (Polars uses all CPU cores by default)

```python
import polars as pl

# Lazy API — queries are optimized before execution
result = (
    pl.scan_parquet("orders.parquet")  # Lazy scan — no data loaded yet
    .filter(pl.col("status") == "completed")
    .group_by(["region", "category"])
    .agg([
        pl.col("revenue").sum().alias("total_revenue"),
        pl.col("order_id").n_unique().alias("order_count"),
        pl.col("revenue").mean().alias("avg_order_value"),
    ])
    .sort("total_revenue", descending=True)
    .collect()  # Execute now
)
```

### Polars — expressions (no chained indexing)
```python
# Polars: no SettingWithCopyWarning, no chained indexing
df = df.with_columns([
    (pl.col("price") * 0.2).alias("tax"),
    pl.col("name").str.to_uppercase().alias("name_upper"),
    pl.when(pl.col("quantity") > 10)
      .then(pl.lit("bulk"))
      .otherwise(pl.lit("standard"))
      .alias("order_type"),
])
```

### Data validation pattern
```python
def validate_orders(df: pd.DataFrame) -> None:
    assert df['order_id'].notna().all(), "order_id has nulls"
    assert df['order_id'].is_unique, "order_id has duplicates"
    assert (df['amount'] >= 0).all(), "amount has negative values"
    assert df['status'].isin(['pending', 'completed', 'cancelled']).all(), "invalid status values"
    assert pd.to_datetime(df['created_at'], errors='coerce').notna().all(), "created_at has invalid dates"
```
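One subtlety in the validation pattern: `errors='coerce'` turns unparseable date strings into `NaT` instead of raising, which is exactly what lets the `.notna()` assertion catch them. A minimal sketch:

```python
import pandas as pd

# An invalid date string becomes NaT under errors='coerce',
# so .notna() flags the bad row instead of the parse crashing.
dates = pd.to_datetime(pd.Series(["2024-01-01", "not-a-date"]), errors="coerce")
print(dates.notna().tolist())  # [True, False]
```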

### Format conversion
```python
# Read
df = pd.read_parquet("data.parquet", columns=['id', 'name', 'amount'])  # Column selection at read time
df = pd.read_csv("data.csv", dtype={'id': str}, parse_dates=['created_at'])

# Write — prefer Parquet over CSV for large datasets
df.to_parquet("output.parquet", index=False, compression='snappy')
```

## Example

**User:** Clean a raw orders CSV: fix dtypes, remove duplicates, handle nulls, add derived columns (revenue_after_tax, order_size_bucket), and output a validated Parquet file.

**Expected output:**
- Read with explicit `dtype=` and `parse_dates=`
- Drop duplicate `order_id` rows (keep last)
- Fill nulls: `quantity` → 0, `discount` → 0.0, drop rows where `customer_id` is null
- Derive: `revenue_after_tax = price * quantity * (1 - discount) * 0.8`
- Bucket: `order_size_bucket` = 'small' (<100), 'medium' (100–1000), 'large' (>1000)
- Validate with assertions before writing
- Write to Parquet with snappy compression

---

> **Work with us:** Claudient is backed by [Uitbreiden](https://uitbreiden.com/) — we build AI products and B2B solutions with developer communities. Building data pipelines or AI data products? [uitbreiden.com](https://uitbreiden.com/)
@@ -0,0 +1,171 @@ package/skills/data-ml/nl/pytorch-tensorflow.md
> 🇳🇱 This is the Dutch translation. [English version](../pytorch-tensorflow.md).

# PyTorch / TensorFlow Skill

## When to activate
- Writing neural network training loops in PyTorch
- Building and training Keras/TensorFlow models
- Implementing custom loss functions or model architectures
- Setting up GPU training with device management
- Writing dataloaders and preprocessing pipelines for model training
- Implementing model evaluation, checkpointing, and early stopping
- Debugging NaN losses, exploding gradients, or training instability
- Porting models between PyTorch and TensorFlow

## When NOT to use
- scikit-learn tasks (classification, regression, clustering on tabular data) — no deep learning
- Pandas/Polars data manipulation before the modeling step
- Hugging Face fine-tuning with the Trainer API (different workflow)
- Inference-only deployments with no training code

## Instructions

### PyTorch training loop — standard structure
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, train_loader, val_loader, epochs, lr, device):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-2)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    criterion = nn.CrossEntropyLoss()

    best_val_loss = float('inf')

    for epoch in range(epochs):
        # Training
        model.train()
        train_loss = 0.0
        for batch in train_loader:
            inputs, targets = batch
            inputs, targets = inputs.to(device), targets.to(device)

            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, targets)
            loss.backward()

            # Gradient clipping — always, for stability
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

            optimizer.step()
            train_loss += loss.item()

        # Validation
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for batch in val_loader:
                inputs, targets = batch
                inputs, targets = inputs.to(device), targets.to(device)
                outputs = model(inputs)
                val_loss += criterion(outputs, targets).item()

        scheduler.step()

        # Checkpoint the best model
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            torch.save(model.state_dict(), 'best_model.pt')

        print(f"Epoch {epoch+1}/{epochs} | Train: {train_loss/len(train_loader):.4f} | Val: {val_loss/len(val_loader):.4f}")
```

### Device management
```python
# Always select the device explicitly
device = torch.device('cuda' if torch.cuda.is_available() else
                      'mps' if torch.backends.mps.is_available() else
                      'cpu')
model = model.to(device)
```
Never hardcode `'cuda'` — always check availability.

### Custom model structure
```python
class MyModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, dropout=0.3):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.network(x)
```
Prefer `nn.Sequential` for simple feedforward models; override `forward()` for complex branching.

### Debugging training instability
1. **NaN loss** → check for log(0) in the loss, exploding inputs, or division by zero in preprocessing
2. **Exploding gradients** → add `clip_grad_norm_` (already present in the template above)
3. **Vanishing gradients** → check activation functions (avoid sigmoid/tanh in deep networks), use residual connections
4. **Loss not decreasing** → reduce the LR by 10x, check dataloader shuffling, verify the labels are correct
5. **GPU OOM** → reduce batch size, use gradient checkpointing, use mixed precision
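Point 2 can be checked directly: `clip_grad_norm_` returns the total gradient norm before clipping and rescales the gradients in place so their norm is at most `max_norm`. A minimal sketch with a toy model and deliberately exploded gradients:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)          # toy model
x = torch.randn(8, 4)

# Artificially large loss -> very large gradients.
loss = (model(x) * 1e6).pow(2).mean()
loss.backward()

# Returns the pre-clip norm; gradients are rescaled in place.
pre_clip = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
post_clip = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))
print(float(pre_clip), float(post_clip))  # huge before, <= 1.0 after
```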

### Mixed-precision training (PyTorch)
```python
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for batch in train_loader:
    optimizer.zero_grad()
    with autocast():
        outputs = model(inputs)
        loss = criterion(outputs, targets)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    scaler.step(optimizer)
    scaler.update()
```

### TensorFlow/Keras — standard structure
```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-2),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    tf.keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
    tf.keras.callbacks.ReduceLROnPlateau(patience=3, factor=0.5)
]

history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=100,
    callbacks=callbacks
)
```

## Example

**User:** Build a PyTorch text classifier for binary sentiment analysis with embedding, LSTM, and dropout.

**Expected output:**
- `SentimentLSTM(nn.Module)` — embedding layer, LSTM, dropout, linear head
- `forward()` — handles packed sequences or padded input
- Training loop with gradient clipping, per-epoch validation, best-model checkpoint
- `device` auto-detected (CUDA/MPS/CPU)
- Train/val split via `DataLoader`, with shuffling on train only
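One shape such a module could take is sketched below. It is one valid answer, not the only one; the hyperparameters and names (`vocab_size`, `embed_dim`, and so on) are illustrative:

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.head = nn.Linear(hidden_dim, 1)  # single logit for binary sentiment

    def forward(self, x):                      # x: (batch, seq_len) token ids
        emb = self.embedding(x)                # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(emb)           # h_n: (1, batch, hidden_dim)
        return self.head(self.dropout(h_n[-1])).squeeze(-1)  # (batch,) logits

model = SentimentLSTM(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (4, 32)))  # batch of 4 padded sequences
print(logits.shape)  # torch.Size([4])
```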

---

> **Work with us:** Claudient is backed by [Uitbreiden](https://uitbreiden.com/) — we build AI products and B2B solutions with developer communities. Building ML models or AI-powered products? [uitbreiden.com](https://uitbreiden.com/)
@@ -0,0 +1,145 @@ package/skills/data-ml/pandas-polars.md
# Pandas / Polars Skill

## When to activate
- Cleaning, transforming, or aggregating tabular data in Python
- Merging, joining, or reshaping DataFrames
- Writing data validation or quality checks
- Converting between formats (CSV, Parquet, JSON, Excel)
- Profiling or exploring a new dataset
- Optimizing slow Pandas code for large datasets
- Migrating Pandas code to Polars for performance

## When NOT to use
- SQL in a database (push transformations to the database when the data is already there)
- Spark/distributed computing (use the PySpark skill for datasets larger than available RAM)
- dbt models (SQL-based transformations in a warehouse)
- NumPy array operations on non-tabular data

## Instructions

### Pandas — performance rules
```python
import pandas as pd
import numpy as np

# Never use iterrows() — vectorize instead
# Bad:
for idx, row in df.iterrows():
    df.at[idx, 'tax'] = row['price'] * 0.2

# Good:
df['tax'] = df['price'] * 0.2

# Use .loc for label-based access, .iloc for position-based
# Never use chained indexing for assignment — it causes SettingWithCopyWarning
df.loc[df['status'] == 'active', 'flag'] = True

# Categorical dtype for low-cardinality string columns (massive memory saving)
df['country'] = df['country'].astype('category')

# Downcast numeric types to reduce memory
df['quantity'] = pd.to_numeric(df['quantity'], downcast='integer')
df['price'] = pd.to_numeric(df['price'], downcast='float')
```

### Pandas — aggregation and groupby
```python
# Groupby with multiple aggregations
summary = (
    df.groupby(['region', 'category'])
    .agg(
        total_revenue=('revenue', 'sum'),
        order_count=('order_id', 'nunique'),
        avg_order_value=('revenue', 'mean'),
    )
    .reset_index()
    .sort_values('total_revenue', ascending=False)
)
```

### Pandas — merging
```python
# Always specify how= explicitly — never rely on default (inner)
result = pd.merge(
    orders,
    customers,
    on='customer_id',
    how='left',       # explicit
    validate='m:1',   # validates cardinality — raises if violated
    suffixes=('_order', '_customer')
)
```
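The `validate='m:1'` argument is worth demonstrating: if the right-hand side turns out to have duplicate keys, the merge raises instead of silently duplicating rows. A minimal sketch (both frames are hypothetical):

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [10, 20, 30]})
customers = pd.DataFrame({"customer_id": [1, 1], "name": ["a", "a2"]})  # duplicate key!

try:
    pd.merge(orders, customers, on="customer_id", how="left", validate="m:1")
    raised = False
except pd.errors.MergeError as exc:
    # Cardinality violated: customers is not unique on customer_id.
    raised = True
    print("merge rejected:", exc)
```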

### Polars — when to use instead of Pandas
Use Polars when:
- The dataset has more than 1M rows (Polars is 5–100x faster for many operations)
- You need lazy evaluation (query optimization before execution)
- Parallelism matters (Polars uses all CPU cores by default)

```python
import polars as pl

# Lazy API — queries are optimized before execution
result = (
    pl.scan_parquet("orders.parquet")  # Lazy scan — no data loaded yet
    .filter(pl.col("status") == "completed")
    .group_by(["region", "category"])
    .agg([
        pl.col("revenue").sum().alias("total_revenue"),
        pl.col("order_id").n_unique().alias("order_count"),
        pl.col("revenue").mean().alias("avg_order_value"),
    ])
    .sort("total_revenue", descending=True)
    .collect()  # Execute now
)
```

### Polars — expressions (no chained indexing)
```python
# Polars: no SettingWithCopyWarning, no chained indexing
df = df.with_columns([
    (pl.col("price") * 0.2).alias("tax"),
    pl.col("name").str.to_uppercase().alias("name_upper"),
    pl.when(pl.col("quantity") > 10)
      .then(pl.lit("bulk"))
      .otherwise(pl.lit("standard"))
      .alias("order_type"),
])
```
+
+ ### Data validation pattern
+ ```python
+ def validate_orders(df: pd.DataFrame) -> None:
+     assert df['order_id'].notna().all(), "order_id has nulls"
+     assert df['order_id'].is_unique, "order_id has duplicates"
+     assert (df['amount'] >= 0).all(), "amount has negative values"
+     assert df['status'].isin(['pending', 'completed', 'cancelled']).all(), "invalid status values"
+     assert pd.to_datetime(df['created_at'], errors='coerce').notna().all(), "created_at has invalid dates"
+ ```
+
+ ### Format conversion
+ ```python
+ # Read
+ df = pd.read_parquet("data.parquet", columns=['id', 'name', 'amount'])  # Column selection at read time
+ df = pd.read_csv("data.csv", dtype={'id': str}, parse_dates=['created_at'])
+
+ # Write — always use Parquet over CSV for large datasets
+ df.to_parquet("output.parquet", index=False, compression='snappy')
+ ```
+
+ ## Example
+
+ **User:** Clean a raw orders CSV: fix dtypes, remove duplicates, handle nulls, add derived columns (revenue_after_tax, order_size_bucket), and output a validated Parquet file.
+
+ **Expected output:**
+ - Read with explicit `dtype=` and `parse_dates=`
+ - Drop duplicate `order_id` rows (keep last)
+ - Fill nulls: `quantity` → 0, `discount` → 0.0, drop rows where `customer_id` is null
+ - Derive: `revenue_after_tax = price * quantity * (1 - discount) * 0.8`
+ - Bucket: `order_size_bucket` = 'small' (<100), 'medium' (100–1000), 'large' (>1000)
+ - Validate with assertions before write
+ - Write to Parquet with snappy compression
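The bullets above can be sketched as a single cleaning function. This is a minimal sketch, not the skill's canonical answer: the column names, fill values, and bucket boundaries are taken from the bullet list, the bucket is assumed to apply to `revenue_after_tax`, and boundary handling (100 falling into 'small') follows `pd.cut`'s default right-inclusive bins.

```python
import pandas as pd

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    # Drop duplicate order_id rows, keeping the last occurrence
    df = df.drop_duplicates(subset='order_id', keep='last')
    # Rows without a customer_id are unusable — drop them
    df = df.dropna(subset=['customer_id'])
    # Fill remaining nulls with neutral defaults
    df = df.fillna({'quantity': 0, 'discount': 0.0})
    # Derived columns per the spec above
    df['revenue_after_tax'] = df['price'] * df['quantity'] * (1 - df['discount']) * 0.8
    df['order_size_bucket'] = pd.cut(
        df['revenue_after_tax'],
        bins=[-float('inf'), 100, 1000, float('inf')],
        labels=['small', 'medium', 'large'],
    )
    # Validate before handing off for the Parquet write
    assert df['order_id'].is_unique, "order_id has duplicates"
    assert df['customer_id'].notna().all(), "customer_id has nulls"
    return df
```

Reading with explicit `dtype=`/`parse_dates=` and writing with `to_parquet(..., index=False, compression='snappy')` wrap this function, as shown in the Format conversion section.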
+
+ ---
+
+ > **Work with us:** Claudient is backed by [Uitbreiden](https://uitbreiden.com/) — we build AI products and B2B solutions with developer communities. Building data pipelines or AI data products? [uitbreiden.com](https://uitbreiden.com/)
@@ -0,0 +1,169 @@
+ # PyTorch / TensorFlow Skill
+
+ ## When to activate
+ - Writing neural network training loops in PyTorch
+ - Building and training Keras/TensorFlow models
+ - Implementing custom loss functions or model architectures
+ - Setting up GPU training with device management
+ - Writing data loaders and preprocessing pipelines for model training
+ - Implementing model evaluation, checkpointing, and early stopping
+ - Debugging NaN losses, exploding gradients, or training instability
+ - Porting models between PyTorch and TensorFlow
+
+ ## When NOT to use
+ - scikit-learn tasks (classification, regression, clustering on tabular data) — not deep learning
+ - Pandas/Polars data manipulation before the modeling step
+ - Hugging Face fine-tuning with the Trainer API (a different workflow)
+ - Inference-only deployments without training code
+ ## Instructions
+
+ ### PyTorch training loop — standard structure
+ ```python
+ import torch
+ import torch.nn as nn
+ from torch.utils.data import DataLoader
+
+ def train(model, train_loader, val_loader, epochs, lr, device):
+     optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-2)
+     scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
+     criterion = nn.CrossEntropyLoss()
+
+     best_val_loss = float('inf')
+
+     for epoch in range(epochs):
+         # Training
+         model.train()
+         train_loss = 0.0
+         for batch in train_loader:
+             inputs, targets = batch
+             inputs, targets = inputs.to(device), targets.to(device)
+
+             optimizer.zero_grad()
+             outputs = model(inputs)
+             loss = criterion(outputs, targets)
+             loss.backward()
+
+             # Gradient clipping — always for stability
+             torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
+
+             optimizer.step()
+             train_loss += loss.item()
+
+         # Validation
+         model.eval()
+         val_loss = 0.0
+         with torch.no_grad():
+             for batch in val_loader:
+                 inputs, targets = batch
+                 inputs, targets = inputs.to(device), targets.to(device)
+                 outputs = model(inputs)
+                 val_loss += criterion(outputs, targets).item()
+
+         scheduler.step()
+
+         # Checkpoint best model
+         if val_loss < best_val_loss:
+             best_val_loss = val_loss
+             torch.save(model.state_dict(), 'best_model.pt')
+
+         print(f"Epoch {epoch+1}/{epochs} | Train: {train_loss/len(train_loader):.4f} | Val: {val_loss/len(val_loader):.4f}")
+ ```
+
+ ### Device management
+ ```python
+ # Always explicit device selection
+ device = torch.device('cuda' if torch.cuda.is_available() else
+                       'mps' if torch.backends.mps.is_available() else
+                       'cpu')
+ model = model.to(device)
+ ```
+ Never hardcode `'cuda'` — always check availability.
+
+ ### Custom model structure
+ ```python
+ class MyModel(nn.Module):
+     def __init__(self, input_dim, hidden_dim, output_dim, dropout=0.3):
+         super().__init__()
+         self.network = nn.Sequential(
+             nn.Linear(input_dim, hidden_dim),
+             nn.LayerNorm(hidden_dim),
+             nn.GELU(),
+             nn.Dropout(dropout),
+             nn.Linear(hidden_dim, output_dim)
+         )
+
+     def forward(self, x):
+         return self.network(x)
+ ```
+ Prefer `nn.Sequential` for simple feedforward models; use a `forward()` override for complex branching.
+
+ ### Debugging training instability
+ 1. **NaN loss** → check for log(0) in the loss, exploding inputs, or division by zero in preprocessing
+ 2. **Exploding gradients** → add `clip_grad_norm_` (already in the template above)
+ 3. **Vanishing gradients** → check activation functions (avoid sigmoid/tanh in deep networks), use residual connections
+ 4. **Loss not decreasing** → reduce the LR 10x, check data loader shuffling, verify labels are correct
+ 5. **GPU OOM** → reduce batch size, use gradient checkpointing, use mixed precision
+
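Gradient checkpointing, named in item 5, can look like the following minimal sketch — the block layout and sizes are illustrative assumptions, not part of this skill. `torch.utils.checkpoint.checkpoint` discards a block's intermediate activations during the forward pass and recomputes them in backward, trading extra compute for memory.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, dim=256, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(n_blocks)
        )

    def forward(self, x):
        for block in self.blocks:
            # Activations inside each block are recomputed during backward
            # instead of being stored — trades compute for memory.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedMLP()
x = torch.randn(8, 256, requires_grad=True)
out = model(x)
out.sum().backward()  # gradients flow through the recomputed activations
```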
+ ### Mixed precision training (PyTorch)
+ ```python
+ from torch.cuda.amp import autocast, GradScaler
+
+ scaler = GradScaler()
+
+ for batch in train_loader:
+     inputs, targets = batch
+     inputs, targets = inputs.to(device), targets.to(device)
+
+     optimizer.zero_grad()
+     with autocast():
+         outputs = model(inputs)
+         loss = criterion(outputs, targets)
+     scaler.scale(loss).backward()
+     scaler.unscale_(optimizer)  # Unscale before gradient clipping
+     torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
+     scaler.step(optimizer)
+     scaler.update()
+ ```
+
+ ### TensorFlow/Keras — standard structure
+ ```python
+ import tensorflow as tf
+
+ model = tf.keras.Sequential([
+     tf.keras.layers.Dense(256, activation='relu'),
+     tf.keras.layers.Dropout(0.3),
+     tf.keras.layers.Dense(10, activation='softmax')
+ ])
+
+ model.compile(
+     optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-2),
+     loss='sparse_categorical_crossentropy',
+     metrics=['accuracy']
+ )
+
+ callbacks = [
+     tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
+     tf.keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
+     tf.keras.callbacks.ReduceLROnPlateau(patience=3, factor=0.5)
+ ]
+
+ history = model.fit(
+     train_dataset,
+     validation_data=val_dataset,
+     epochs=100,
+     callbacks=callbacks
+ )
+ ```
+
+ ## Example
+
+ **User:** Build a PyTorch text classifier for sentiment analysis (binary) with embedding, LSTM, and dropout.
+
+ **Expected output:**
+ - `SentimentLSTM(nn.Module)` — embedding layer, LSTM, dropout, linear head
+ - `forward()` — handles packed sequences or padded input
+ - Training loop with gradient clipping, validation per epoch, best-model checkpointing
+ - `device` auto-detected (CUDA/MPS/CPU)
+ - Train/val split via `DataLoader` with shuffling on the train set only
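The expected model can be sketched as follows (padded-input variant). This is illustrative, not the skill's canonical answer: the vocab size, dimensions, and `padding_idx=0` are assumptions, and the training loop from earlier in this skill applies unchanged with `nn.BCEWithLogitsLoss()` as the criterion.

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.head = nn.Linear(hidden_dim, 1)  # single logit for binary sentiment

    def forward(self, x):  # x: (batch, seq_len) token ids, 0 = padding
        embedded = self.embedding(x)
        _, (hidden, _) = self.lstm(embedded)        # hidden: (1, batch, hidden_dim)
        return self.head(self.dropout(hidden[-1]))  # (batch, 1) logits

model = SentimentLSTM(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (4, 32)))  # batch of 4, seq len 32
```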
+
+ ---
+
+ > **Work with us:** Claudient is backed by [Uitbreiden](https://uitbreiden.com/) — we build AI products and B2B solutions with developer communities. Building ML models or AI-powered products? [uitbreiden.com](https://uitbreiden.com/)