k8s-agent-skills 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (51)
  1. package/README.md +102 -0
  2. package/package.json +63 -0
  3. package/skills/atlas/SKILL.md +166 -0
  4. package/skills/cert-manager/SKILL.md +212 -0
  5. package/skills/cilium-gateway/SKILL.md +283 -0
  6. package/skills/cilium-network/SKILL.md +243 -0
  7. package/skills/cnpg/SKILL.md +130 -0
  8. package/skills/dragonfly/SKILL.md +194 -0
  9. package/skills/external-dns/SKILL.md +185 -0
  10. package/skills/flagger/SKILL.md +292 -0
  11. package/skills/flux/SKILL.md +36 -0
  12. package/skills/gitea/SKILL.md +32 -0
  13. package/skills/gitea-api/SKILL.md +104 -0
  14. package/skills/gitea-registry/SKILL.md +71 -0
  15. package/skills/gitea-runner/SKILL.md +126 -0
  16. package/skills/gitea-tea/SKILL.md +206 -0
  17. package/skills/gitea-webhooks/SKILL.md +93 -0
  18. package/skills/harbor/SKILL.md +32 -0
  19. package/skills/harbor-api/SKILL.md +231 -0
  20. package/skills/harbor-helm/SKILL.md +238 -0
  21. package/skills/harbor-terraform/SKILL.md +233 -0
  22. package/skills/higress/SKILL.md +27 -0
  23. package/skills/higress-helm/SKILL.md +328 -0
  24. package/skills/higress-operator/SKILL.md +435 -0
  25. package/skills/kserve/SKILL.md +28 -0
  26. package/skills/kserve-helm/SKILL.md +330 -0
  27. package/skills/kserve-operator/SKILL.md +763 -0
  28. package/skills/kubeflow/SKILL.md +33 -0
  29. package/skills/kubeflow-pipelines/SKILL.md +392 -0
  30. package/skills/kubeflow-trainer/SKILL.md +429 -0
  31. package/skills/kubeflow-training-operator/SKILL.md +176 -0
  32. package/skills/mariadb/SKILL.md +27 -0
  33. package/skills/mariadb-helm/SKILL.md +378 -0
  34. package/skills/mariadb-operator/SKILL.md +1114 -0
  35. package/skills/nvidia-device-plugin/SKILL.md +204 -0
  36. package/skills/rook-ceph/SKILL.md +22 -0
  37. package/skills/rook-ceph-operator/SKILL.md +150 -0
  38. package/skills/rook-ceph-toolbox/SKILL.md +220 -0
  39. package/skills/sealed-secrets/SKILL.md +221 -0
  40. package/skills/stakater-reloader/SKILL.md +259 -0
  41. package/skills/talos/SKILL.md +244 -0
  42. package/skills/tekton/SKILL.md +187 -0
  43. package/skills/vector/SKILL.md +24 -0
  44. package/skills/vector-helm/SKILL.md +186 -0
  45. package/skills/vector-operator/SKILL.md +455 -0
  46. package/skills/victoria-metrics/SKILL.md +35 -0
  47. package/skills/victoriametrics-operator/SKILL.md +248 -0
  48. package/skills/zitadel/SKILL.md +24 -0
  49. package/skills/zitadel-api/SKILL.md +962 -0
  50. package/skills/zitadel-helm/SKILL.md +263 -0
  51. package/skills/zitadel-terraform/SKILL.md +728 -0
package/README.md ADDED
@@ -0,0 +1,102 @@
1
+ # k8s-agent-skills
2
+
3
+ Agent skills for Kubernetes cluster operations tooling. Each skill is a self-contained `SKILL.md` designed for agentic AI tools (Claude Code, OpenCode, Codex) that load skills for task-specific expertise.
4
+
5
+ **Mirror:** [github.com/Aidas-dev/k8s-agent-skills](https://github.com/Aidas-dev/k8s-agent-skills)
6
+
7
+ ## Skills
8
+
9
+ | Skill | What it covers | CRDs |
10
+ |-------|---------------|------|
11
+ | [atlas](skills/atlas/SKILL.md) | Atlas Operator — DB schema migrations, lint, policies | AtlasSchema, AtlasMigration |
12
+ | [cert-manager](skills/cert-manager/SKILL.md) | TLS cert provisioning, ACME, Issuers, Gateway integration | Certificate, Issuer, ClusterIssuer |
13
+ | [cilium-gateway](skills/cilium-gateway/SKILL.md) | Gateway API, TLS, traffic splitting, oauth2-proxy, hostNetwork | GatewayClass, Gateway, HTTPRoute, etc. |
14
+ | [cilium-network](skills/cilium-network/SKILL.md) | Cilium CNI, network policies, LB IPAM, encryption, Hubble | CiliumNetworkPolicy, CiliumCIDRGroup, etc. |
15
+ | [cnpg](skills/cnpg/SKILL.md) | CloudNativePG — PostgreSQL clusters, backups, poolers | Cluster, Backup, ScheduledBackup, Pooler |
16
+ | [dragonfly](skills/dragonfly/SKILL.md) | DragonflyDB — Redis-compatible operator, replication, TLS | Dragonfly |
17
+ | [external-dns](skills/external-dns/SKILL.md) | DNS sync — Cloudflare, Route53, Gateway API, sources, registry | None |
18
+ | [flagger](skills/flagger/SKILL.md) | Progressive delivery, canary, A/B, blue/green | Canary, MetricTemplate, AlertProvider |
19
+ | [flux](skills/flux/SKILL.md) | Flux CD router — debugging, CRDs, repo audit | Router → sub-skills |
20
+ | [gitea](skills/gitea/SKILL.md) | Gitea router — API, runner, registry, webhooks, tea CLI | Router → sub-skills |
21
+ | [gitea-api](skills/gitea-api/SKILL.md) | Gitea REST API — auth, repos, issues, PRs, packages | None |
22
+ | [gitea-registry](skills/gitea-registry/SKILL.md) | Gitea container registry — OCI, multi-arch, push/pull | None |
23
+ | [gitea-runner](skills/gitea-runner/SKILL.md) | Gitea Actions runners — registration, host-mode, ephemeral | None |
24
+ | [gitea-tea](skills/gitea-tea/SKILL.md) | tea CLI — commands, auth, actions, webhooks, admin | None |
25
+ | [gitea-webhooks](skills/gitea-webhooks/SKILL.md) | Gitea webhooks — events, HMAC, org vs repo hooks | None |
26
+ | [harbor](skills/harbor/SKILL.md) | Harbor router — API, Helm, Terraform | Router → sub-skills |
27
+ | [harbor-api](skills/harbor-api/SKILL.md) | Harbor REST API v2 — projects, artifacts, robots, replication, GC, OIDC | None |
28
+ | [harbor-helm](skills/harbor-helm/SKILL.md) | Harbor Helm chart — production deploy, external DB/Redis/S3, Trivy | None |
29
+ | [harbor-terraform](skills/harbor-terraform/SKILL.md) | Harbor Terraform provider — 20 resources, 8 data sources | None |
30
+ | [higress](skills/higress/SKILL.md) | Higress router — CRDs, Wasm plugins, AI Gateway, Helm | Router → sub-skills |
31
+ | [higress-helm](skills/higress-helm/SKILL.md) | Higress Helm chart — core/console/redis/plugin-server/o11y | None |
32
+ | [higress-operator](skills/higress-operator/SKILL.md) | Higress CRDs — WasmPlugin, Http2Rpc, McpBridge, 41 Wasm plugins, 16 AI providers | WasmPlugin, Http2Rpc, McpBridge |
33
+ | [kserve](skills/kserve/SKILL.md) | KServe router — CRDs, Helm, deployment modes | Router → sub-skills |
34
+ | [kserve-helm](skills/kserve-helm/SKILL.md) | KServe Helm — 10 charts, Serverless/Raw/ModelMesh modes | None |
35
+ | [kserve-operator](skills/kserve-operator/SKILL.md) | KServe CRDs — InferenceService, ServingRuntime, InferenceGraph, LLMInferenceService, LocalModel | 22 CRDs under serving.kserve.io |
36
+ | [kubeflow](skills/kubeflow/SKILL.md) | Kubeflow router — Trainer v2, Pipelines v2, Training Operator v1 | Router → sub-skills |
37
+ | [kubeflow-pipelines](skills/kubeflow-pipelines/SKILL.md) | KFP v2 SDK — DSL, IR YAML, control flow, Kubernetes Native API | PipelineRun (v2beta1) |
38
+ | [kubeflow-trainer](skills/kubeflow-trainer/SKILL.md) | Kubeflow Trainer v2.2 — TrainJob, TrainingRuntime, 5 ML policies | TrainJob, TrainingRuntime, ClusterTrainingRuntime |
39
+ | [kubeflow-training-operator](skills/kubeflow-training-operator/SKILL.md) | Legacy v1 — PyTorchJob, TFJob, MPIJob, XGBoostJob | PyTorchJob, TFJob, MPIJob, XGBoostJob |
40
+ | [mariadb](skills/mariadb/SKILL.md) | MariaDB router — operator CRDs, Helm | Router → sub-skills |
41
+ | [mariadb-helm](skills/mariadb-helm/SKILL.md) | MariaDB operator Helm — 3 charts, production HA values | None |
42
+ | [mariadb-operator](skills/mariadb-operator/SKILL.md) | MariaDB operator — 12 CRDs, Galera HA, MaxScale, backups, PITR | 12 CRDs under k8s.mariadb.com |
43
+ | [nvidia-device-plugin](skills/nvidia-device-plugin/SKILL.md) | GPU discovery, GFD, NFD, CDI, MIG, time-slicing | None (ConfigMap) |
44
+ | [rook-ceph-operator](skills/rook-ceph-operator/SKILL.md) | Ceph cluster, block pools, object store, NFS, CSI | CephCluster, CephBlockPool, CephObjectStore, etc. |
45
+ | [rook-ceph-toolbox](skills/rook-ceph-toolbox/SKILL.md) | Ceph CLI — health, OSD mgmt, RBD, RGW, CRUSH | None (toolbox ops) |
46
+ | [sealed-secrets](skills/sealed-secrets/SKILL.md) | Encrypted Secrets for GitOps, kubeseal, key rotation | SealedSecret |
47
+ | [stakater-reloader](skills/stakater-reloader/SKILL.md) | ConfigMap/Secret reload, annotations, Helm values | None (annotation-based) |
48
+ | [talos](skills/talos/SKILL.md) | Talos Linux — cluster deploy, machine config, upgrades, talosctl | None |
49
+ | [tekton](skills/tekton/SKILL.md) | Tekton pipelines — resolver refs, matrix, CEL, TTL | Task, Pipeline, etc. |
50
+ | [vector](skills/vector/SKILL.md) | Vector router — Helm, operator CRDs | Router → sub-skills |
51
+ | [vector-helm](skills/vector-helm/SKILL.md) | Vector Helm chart — 3 roles (Agent/Aggregator/Stateless), customConfig | None |
52
+ | [vector-operator](skills/vector-operator/SKILL.md) | Vector operator — 5 CRDs, auto-routing by source type | Vector, VectorPipeline, ClusterVectorPipeline, VectorAggregator, ClusterVectorAggregator |
53
+ | [victoria-metrics](skills/victoria-metrics/SKILL.md) | VM skill router — operator, queries, cardinality, logs, traces | Router |
54
+ | [victoriametrics-operator](skills/victoriametrics-operator/SKILL.md) | VM Operator CRDs — VMAgent, VMAlert, VMServiceScrape, VLogs | 19 CRDs |
55
+ | [zitadel](skills/zitadel/SKILL.md) | ZITADEL router — API, Helm, Terraform | Router → sub-skills |
56
+ | [zitadel-api](skills/zitadel-api/SKILL.md) | ZITADEL API — OIDC/SAML/API apps, users, orgs, roles | None |
57
+ | [zitadel-helm](skills/zitadel-helm/SKILL.md) | ZITADEL Helm — CNPG, Gateway API, caches, masterkey | None |
58
+ | [zitadel-terraform](skills/zitadel-terraform/SKILL.md) | ZITADEL Terraform provider — 80+ resources, 40+ data sources | None |
59
+
60
+ ## Usage
61
+
62
+ ### Via npm (recommended)
63
+
64
+ ```bash
65
+ npm install --save-dev k8s-agent-skills
66
+ # or
67
+ bun add -d k8s-agent-skills
68
+
69
+ # Symlink skills to agent config dir
70
+ npx skills-npm
71
+ # → symlinks skills/ into ~/.agents/skills/
72
+ ```
73
+
74
+ ### Via git clone
75
+
76
+ ```bash
77
+ git clone https://github.com/Aidas-dev/k8s-agent-skills.git
78
+ ln -sf "$(pwd)"/k8s-agent-skills/skills/* ~/.agents/skills/
79
+ ```
80
+
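The `ln -sf` one-liner above assumes `~/.agents/skills/` already exists and links everything it finds under `skills/`. A slightly more defensive variant is sketched below; the function name `link_skills` is illustrative and not part of the package, and the sketch assumes the source path is absolute so the links resolve from anywhere.

```bash
# Link every skill directory under SRC into DEST, creating DEST if needed.
# Hypothetical helper; SRC should be an absolute path.
link_skills() {
  local src="$1" dest="$2" dir
  mkdir -p "$dest"
  for dir in "$src"/*/; do
    # Skip anything that does not contain a SKILL.md file.
    [ -f "${dir}SKILL.md" ] || continue
    ln -sfn "${dir%/}" "$dest/$(basename "$dir")"
  done
}

# Example invocation (not executed here):
#   link_skills "$(pwd)/k8s-agent-skills/skills" "$HOME/.agents/skills"
```

Quoting the paths keeps the helper working when the clone lives under a directory whose name contains spaces.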
81
+ ### Manual copy
82
+
83
+ ```bash
84
+ # OpenCode / Codex
85
+ cp -r skills/vector ~/.agents/skills/
86
+
87
+ # Claude Code Desktop
88
+ cp -r skills/external-dns ~/.claude/skills/
89
+ ```
90
+
91
+ ## npm Publishing
92
+
93
+ The package auto-publishes to npm on push to `main` via GitHub Actions.
94
+
95
+ **Setup required** (one-time):
96
+ 1. Create an npm automation token at [npmjs.com/settings/tokens](https://www.npmjs.com/settings/tokens)
97
+ 2. Add it as `NPM_TOKEN` secret in the GitHub repo settings
98
+ 3. Workflow at `.github/workflows/publish.yml` handles version bump and publish
99
+
100
+ ## License
101
+
102
+ MIT — free to use, modify, and distribute.
package/package.json ADDED
@@ -0,0 +1,63 @@
1
+ {
2
+ "name": "k8s-agent-skills",
3
+ "version": "1.2.0",
4
+ "private": false,
5
+ "description": "Agent skills for Kubernetes cluster operations — Cilium, Talos, Flux, Rook-Ceph, CNPG, Gitea, Tekton, Cert-Manager, VictoriaMetrics, ZITADEL, Harbor, Higress, KServe, Kubeflow, MariaDB, Vector, ExternalDNS, Dragonfly, Flagger, Sealed Secrets, Stakater Reloader, NVIDIA GPU, Atlas",
6
+ "files": [
7
+ "skills"
8
+ ],
9
+ "keywords": [
10
+ "ai-agent-skill",
11
+ "kubernetes",
12
+ "gitops",
13
+ "cilium",
14
+ "talos",
15
+ "flux",
16
+ "rook-ceph",
17
+ "postgresql",
18
+ "cnpg",
19
+ "gitea",
20
+ "tekton",
21
+ "cert-manager",
22
+ "victoriametrics",
23
+ "zitadel",
24
+ "harbor",
25
+ "higress",
26
+ "kserve",
27
+ "kubeflow",
28
+ "mariadb",
29
+ "vector",
30
+ "dragonfly",
31
+ "flagger",
32
+ "external-dns",
33
+ "sealed-secrets",
34
+ "stakater-reloader",
35
+ "nvidia-gpu",
36
+ "atlas",
37
+ "open-code",
38
+ "opencode",
39
+ "claude-code"
40
+ ],
41
+ "license": "MIT",
42
+ "author": {
43
+ "name": "Aidas",
44
+ "url": "https://github.com/Aidas-dev"
45
+ },
46
+ "homepage": "https://github.com/Aidas-dev/k8s-agent-skills",
47
+ "repository": {
48
+ "type": "git",
49
+ "url": "git+https://github.com/Aidas-dev/k8s-agent-skills.git"
50
+ },
51
+ "bugs": {
52
+ "url": "https://github.com/Aidas-dev/k8s-agent-skills/issues"
53
+ },
54
+ "publishConfig": {
55
+ "access": "public",
56
+ "registry": "https://registry.npmjs.org",
57
+ "provenance": true,
58
+ "tag": "latest"
59
+ },
60
+ "engines": {
61
+ "node": ">=18"
62
+ }
63
+ }
package/skills/atlas/SKILL.md ADDED
@@ -0,0 +1,166 @@
1
+ ---
2
+ name: atlas
3
+ description: Use when working with Atlas Operator on Kubernetes — creating or troubleshooting AtlasSchema or AtlasMigration resources; configuring DevDB, migration policies, credential patterns, or schema diff/lint rules.
4
+ ---
5
+
6
+ # Atlas Operator
7
+
8
+ ## Overview
9
+
10
+ Database schema management on Kubernetes via `db.atlasgo.io/v1alpha1`. Two approaches: **AtlasSchema** (declarative — define desired schema, operator diffs and applies) and **AtlasMigration** (versioned — ordered SQL scripts). Deployed via OCI Helm chart: `ghcr.io/ariga/charts/atlas-operator` (latest: 0.7.32).
11
+
12
+ ## CRD Reference
13
+
14
+ | CRD | Approach | Use When |
15
+ |-----|----------|----------|
16
+ | `AtlasSchema` | Declarative | Schema should match a desired state (operator auto-diffs) |
17
+ | `AtlasMigration` | Versioned | Need sequential migration files, rollback support, audit trail |
18
+
19
+ ## AtlasSchema (Declarative)
20
+
21
+ ```yaml
22
+ apiVersion: db.atlasgo.io/v1alpha1
23
+ kind: AtlasSchema
24
+ metadata:
25
+ name: app-schema
26
+ spec:
27
+ urlFrom:
28
+ secretKeyRef:
29
+ name: db-creds
30
+ key: url # postgres://user:pass@host:5432/db?sslmode=require
31
+
32
+ schema:
33
+ sql: |
34
+ CREATE TABLE users (
35
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
36
+ email TEXT UNIQUE NOT NULL,
37
+ created_at TIMESTAMPTZ DEFAULT now()
38
+ );
39
+ CREATE TABLE posts (
40
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
41
+ user_id UUID REFERENCES users(id),
42
+ title TEXT NOT NULL
43
+ );
44
+
45
+ policy:
46
+ lint:
47
+ destructive:
48
+ error: true # Block DROP COLUMN / DROP TABLE
49
+ diff:
50
+ skip:
51
+ - drop_column # Skip dropping columns (compatible changes only)
52
+ exclude:
53
+ - "audit_*" # Glob patterns to ignore
54
+
55
+ prewarmDevDB: true # Default — ephemeral dev container for diff
56
+ allowCustomConfig: false # Don't allow DB config changes via atlas
57
+ ```
58
+
59
+ ## AtlasMigration (Versioned)
60
+
61
+ ```yaml
62
+ apiVersion: db.atlasgo.io/v1alpha1
63
+ kind: AtlasMigration
64
+ metadata:
65
+ name: app-migrations
66
+ spec:
67
+ urlFrom:
68
+ secretKeyRef:
69
+ name: db-creds
70
+ key: url
71
+
72
+ dir:
73
+ configMapRef:
74
+ name: migration-files # ConfigMap with SQL migration scripts
75
+
76
+ execOrder: linear # linear | linear-skip | non-linear
77
+ baseline: "000000" # Start from this version (skip earlier)
78
+
79
+ policy:
80
+ lint:
81
+ destructive:
82
+ error: true
83
+
84
+ prewarmDevDB: true
85
+ allowCustomConfig: false
86
+ ```
87
+
88
+ Migration files in ConfigMap must be lexicographically ordered:
89
+
90
+ ```yaml
91
+ apiVersion: v1
92
+ kind: ConfigMap
93
+ metadata:
94
+ name: migration-files
95
+ data:
96
+ 000001_init.sql: |
97
+ CREATE TABLE users (...);
98
+ 000002_add_posts.sql: |
99
+ CREATE TABLE posts (...);
100
+ 000003_add_indexes.sql: |
101
+ CREATE INDEX idx_users_email ON users(email);
102
+ ```
103
+
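The ordering requirement above can be checked before the ConfigMap is built. The sketch below relies on `sort -c`, which exits non-zero when its input is not already sorted; the function name `is_lex_ordered` is illustrative, and the file names are the ones from the example ConfigMap.

```bash
# Returns 0 if the given file names are already in lexicographic (byte) order.
is_lex_ordered() {
  printf '%s\n' "$@" | LC_ALL=C sort -c 2>/dev/null
}

if is_lex_ordered 000001_init.sql 000002_add_posts.sql 000003_add_indexes.sql; then
  echo "ordered"        # printed for the names above
else
  echo "out of order"
fi
```

This is also why the version prefixes are zero-padded: `is_lex_ordered 2_x.sql 10_x.sql` fails, because `10_x.sql` sorts before `2_x.sql` byte-wise.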
104
+ ## Credential Patterns
105
+
106
+ | Pattern | Example |
107
+ |---------|---------|
108
+ | Connection URL in Secret | `urlFrom.secretKeyRef.key: url` |
109
+ | Credentials struct | `credentials.host`, `.user`, `.passwordFrom`, `.database`, `.port` |
110
+ | Extra connection parameters | `credentials.parameters: { sslmode: require }` |
111
+
112
+ Use the `credentials` struct when the connection details come from separate fields or Secret keys rather than a single URL:
113
+
114
+ ```yaml
115
+ spec:
116
+ credentials:
117
+ host: myhost.example.com
118
+ user: app
119
+ passwordFrom:
120
+ secretKeyRef:
121
+ name: db-creds
122
+ key: password
123
+ database: mydb
124
+ port: 5432
125
+ ```
126
+
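The fields of the `credentials` struct line up with the parts of the connection URL used in the `urlFrom` pattern. As a rough illustration of that mapping, the sketch below assembles a PostgreSQL URL from separate values; `build_pg_url` is a hypothetical helper, the password `s3cret` is a placeholder, and the sketch does not percent-encode special characters.

```bash
# Compose a PostgreSQL connection URL from separate fields.
# Hypothetical helper; no percent-encoding of special characters is done.
build_pg_url() {
  local user="$1" pass="$2" host="$3" port="$4" db="$5" sslmode="${6:-require}"
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=%s\n' \
    "$user" "$pass" "$host" "$port" "$db" "$sslmode"
}

build_pg_url app s3cret myhost.example.com 5432 mydb
# prints postgres://app:s3cret@myhost.example.com:5432/mydb?sslmode=require
```

The resulting string is the kind of value that would be stored under the `url` key of the `db-creds` Secret when using `urlFrom.secretKeyRef`.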
127
+ ## Policy Configuration
128
+
129
+ | Policy | Field | Effect |
130
+ |--------|-------|--------|
131
+ | `lint.destructive.error` | `true`/`false` | Block (true) or warn (false) on destructive changes |
132
+ | `diff.skip` | `[drop_column, drop_index, ...]` | Operations to skip in diff (safe for roll-forward) |
133
+ | `exclude` | `["pattern_*"]` | Glob patterns to exclude from management |
134
+
135
+ ## Migration Directory Sources
136
+
137
+ | Source | Location | Use Case |
138
+ |--------|----------|----------|
139
+ | `dir.configMapRef` | ConfigMap in same namespace | Simple, no external deps |
140
+ | `dir.local.path` | Mounted volume path | Sidecar / init container pattern |
141
+ | `dir.remote` | Atlas Cloud URL | Team-managed schema registry |
142
+ | Inline (`spec.schema.sql`) | Embedded in CR | Quick prototyping |
143
+
144
+ ## DevDB
145
+
146
+ An ephemeral database used to compute the diff between the desired and current schema. If `devURL` is not set, the operator spins up a temporary container. Set it explicitly to reuse an existing database or to run in air-gapped clusters:
147
+
148
+ ```yaml
149
+ spec:
150
+ devURL: postgres://user:pass@dev-db-host:5432/template?sslmode=disable
151
+ ```
152
+
153
+ Or via Secret: `devURLFrom.secretKeyRef`.
154
+
155
+ ## Common Mistakes
156
+
157
+ - **`allowCustomConfig: true` without need** — allows arbitrary DB config changes; keep `false` unless you need `ALTER SYSTEM SET`
158
+ - **No lint policy** — destructive changes are applied silently; always set `lint.destructive.error: true` for production
159
+ - **Wrong execOrder** — `non-linear` skips version ordering; use `linear` for strict migration order
160
+ - **Missing `baseline`** — operator tries to run all migrations including already-applied ones; set `baseline` to skip
161
+ - **ConfigMap not updated** — AtlasMigration reads the migration dir at reconcile time, so ConfigMap changes take effect only on the next reconcile loop
162
+ - **DevDB connection issues** — operator needs network access to a temporary or explicit devDB; ensure `prewarmDevDB: true` works in your cluster
163
+
164
+ ## Supported Databases
165
+
166
+ MySQL, MariaDB, PostgreSQL, SQLite, SQL Server, ClickHouse, CockroachDB, TiDB, YugabyteDB.
package/skills/cert-manager/SKILL.md ADDED
@@ -0,0 +1,212 @@
1
+ ---
2
+ name: cert-manager
3
+ description: Use when creating or managing cert-manager CRDs — Certificate, Issuer, ClusterIssuer, CertificateRequest. Covers ACME DNS-01/HTTP-01, self-signed, CA issuers, Gateway API integration, and common patterns.
4
+ ---
5
+
6
+ # Cert-Manager CRDs
7
+
8
+ Chart from `https://charts.jetstack.io` or `oci://quay.io/jetstack/charts/cert-manager`
9
+ Latest chart: **v1.20.2** (Apr 2026) | API: `cert-manager.io/v1`, `acme.cert-manager.io/v1`
10
+ CRDs: Certificate, Issuer, ClusterIssuer, CertificateRequest | ACME: Order, Challenge | Alpha: ListenerSet
11
+
12
+ Install CRDs: `--set crds.enabled=true` (the older `installCRDs: true` value is deprecated)
13
+
14
+ ## CRD Reference
15
+
16
+ | CRD | API | Scope | Purpose | Key Fields |
17
+ |-----|-----|-------|---------|------------|
18
+ | **Issuer** | `cert-manager.io/v1` | Namespaced | Certificate authority within a namespace. ACME, self-signed, CA, Vault, Venafi. | `spec.acme`, `spec.ca`, `spec.selfSigned`, `spec.vault`, `spec.venafi` |
19
+ | **ClusterIssuer** | `cert-manager.io/v1` | Cluster | Cluster-wide issuer — can issue certs in any namespace. Same spec as Issuer. | Same as Issuer |
20
+ | **Certificate** | `cert-manager.io/v1` | Namespaced | Request a TLS certificate from an issuer. Creates Secret with cert+key. | `spec.secretName`, `spec.issuerRef`, `spec.dnsNames[]`, `spec.commonName`, `spec.duration`, `spec.renewBefore`, `spec.privateKey` (algorithm, size, rotationPolicy), `spec.usages[]`, `spec.subject` |
21
+ | **CertificateRequest** | `cert-manager.io/v1` | Namespaced | Low-level CSR resource. Usually created automatically by Certificate. | `spec.issuerRef`, `spec.request` (base64 CSR), `spec.duration`, `spec.isCA` |
22
+
23
+ ### ACME Resources
24
+
25
+ | CRD | API | Purpose |
26
+ |-----|-----|---------|
27
+ | **Order** | `acme.cert-manager.io/v1` | ACME order lifecycle. Created automatically by Certificate with ACME issuer. |
28
+ | **Challenge** | `acme.cert-manager.io/v1` | ACME challenge (DNS-01, HTTP-01). Created automatically per domain in Order. |
29
+
30
+ ## Issuer Types
31
+
32
+ | Type | Configuration | Use Case |
33
+ |------|--------------|----------|
34
+ | **ACME** | `spec.acme.server` + `spec.acme.solvers[]` | Let's Encrypt, ZeroSSL, Buypass |
35
+ | **SelfSigned** | `spec.selfSigned: {}` | Internal CA bootstrap, testing |
36
+ | **CA** | `spec.ca.secretName` | Sign with existing CA key+cert in Secret |
37
+ | **Vault** | `spec.vault` | HashiCorp Vault PKI |
38
+ | **Venafi** | `spec.venafi` | Venafi TPP/Cloud |
39
+
40
+ ## Deployed Pattern — Let's Encrypt + Cloudflare DNS-01
41
+
42
+ ```yaml
43
+ apiVersion: cert-manager.io/v1
44
+ kind: ClusterIssuer
45
+ metadata:
46
+ name: letsencrypt-production
47
+ spec:
48
+ acme:
49
+ server: https://acme-v02.api.letsencrypt.org/directory
50
+ email: admin@example.com
51
+ privateKeySecretRef:
52
+ name: letsencrypt-production-account-key
53
+ solvers:
54
+ - dns01:
55
+ cloudflare:
56
+ apiTokenSecretRef:
57
+ name: cloudflare-api-token-secret
58
+ key: api-token
59
+ ```
60
+
61
+ ## Deployed Pattern — Gateway Wildcard Certificate
62
+
63
+ ```yaml
64
+ apiVersion: cert-manager.io/v1
65
+ kind: Certificate
66
+ metadata:
67
+ name: kubexa-tech-gateway
68
+ namespace: cilium-gateway
69
+ spec:
70
+ secretName: kubexa-tech-tls
71
+ issuerRef:
72
+ name: letsencrypt-production
73
+ kind: ClusterIssuer
74
+ dnsNames:
75
+ - "*.kubexa.tech"
76
+ - "kubexa.tech"
77
+ duration: 168h # 7 days
78
+ renewBefore: 144h # Renew 6 days before expiry
79
+ privateKey:
80
+ algorithm: ECDSA
81
+ size: 256
82
+ rotationPolicy: Always
83
+ usages:
84
+ - server auth
85
+ ```
86
+
87
+ ## Deployed Pattern — Self-Signed CA + Issuer
88
+
89
+ ```yaml
90
+ # 1. Self-signed root CA
91
+ apiVersion: cert-manager.io/v1
92
+ kind: ClusterIssuer
93
+ metadata:
94
+ name: selfsigned
95
+ spec:
96
+ selfSigned: {}
97
+ ---
98
+ apiVersion: cert-manager.io/v1
99
+ kind: Certificate
100
+ metadata:
101
+ name: ca-root
102
+ namespace: cert-manager
103
+ spec:
104
+ secretName: ca-root-secret
105
+ isCA: true
106
+ commonName: "cluster-ca"
107
+ privateKey:
108
+ algorithm: ECDSA
109
+ size: 256
110
+ issuerRef:
111
+ name: selfsigned
112
+ kind: ClusterIssuer
113
+ ---
114
+ # 2. CA issuer from root
115
+ apiVersion: cert-manager.io/v1
116
+ kind: ClusterIssuer
117
+ metadata:
118
+ name: ca-issuer
119
+ spec:
120
+ ca:
121
+ secretName: ca-root-secret
122
+ ```
123
+
124
+ ## Key Config Details
125
+
126
+ ```yaml
127
+ # ACME DNS-01 solver options
128
+ spec:
129
+ acme:
130
+ solvers:
131
+ - dns01:
132
+ cloudflare:
133
+ apiTokenSecretRef: # API token (not API key)
134
+ name: cloudflare-token
135
+ key: api-token
136
+ # Or route53:
137
+ route53:
138
+ region: us-east-1
139
+ accessKeyIDSecretRef:
140
+ name: aws-creds
141
+ key: access-key-id
142
+ secretAccessKeySecretRef:
143
+ name: aws-creds
144
+ key: secret-access-key
145
+ hostedZoneID: Z123456
146
+ # Or Azure:
147
+ azureDNS:
148
+ subscriptionID: <id>
149
+ resourceGroupName: <rg>
150
+ hostedZoneName: example.com
151
+ tenantID: <tenant>
152
+ clientID: <client>
153
+ clientSecretSecretRef:
154
+ name: azure-creds
155
+ key: client-secret
156
+ - http01:
157
+ ingress:
158
+ class: nginx # Ingress class for HTTP-01 solver
159
+ ```
160
+
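With any DNS-01 solver, the ACME server looks for a TXT record at `_acme-challenge.<domain>`, and a wildcard name is validated at its base domain. The sketch below derives that record name so it can be queried by hand while debugging; `acme_challenge_name` is an illustrative helper, not a cert-manager command.

```bash
# Derive the DNS-01 challenge record name for a certificate dnsName.
# A leading "*." is stripped, since wildcards validate at the base domain.
acme_challenge_name() {
  local name
  name=${1#\*.}
  echo "_acme-challenge.${name}"
}

acme_challenge_name '*.kubexa.tech'   # prints _acme-challenge.kubexa.tech
acme_challenge_name 'kubexa.tech'     # prints _acme-challenge.kubexa.tech
```

While a Challenge is pending, the record can be checked with something like `dig +short TXT "$(acme_challenge_name '*.kubexa.tech')" @1.1.1.1`, assuming `dig` is installed and the resolver is reachable.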
161
+ ## Gateway API Integration
162
+
163
+ cert-manager v1.20+ supports Gateway API for ACME HTTP-01 challenges. `parentRefs` is no longer required as of v1.20; the example below sets it explicitly to pin the solver route to one Gateway:
164
+
165
+ ```yaml
166
+ spec:
167
+ acme:
168
+ solvers:
169
+ - http01:
170
+ gatewayHTTPRoute:
171
+ parentRefs:
172
+ - name: my-gateway
173
+ namespace: istio-system
174
+ ```
175
+
176
+ ## Important Fields — Certificate
177
+
178
+ ```yaml
179
+ spec:
180
+ duration: 2160h # 90 days default
181
+ renewBefore: 720h # 30 days before expiry
182
+ privateKey:
183
+ algorithm: ECDSA # ECDSA > RSA for performance
184
+ size: 256 # P-256 curve
185
+ rotationPolicy: Never # or "Always" to rekey on renewal
186
+ usages: # Extended Key Usage
187
+ - server auth
188
+ - client auth
189
+ subject:
190
+ organizations:
191
+ - MyOrg
192
+ emailAddresses:
193
+ - admin@example.com
194
+ keystores: # Create JKS/PKCS12 keystores
195
+ pkcs12:
196
+ create: true
197
+ passwordSecretRef:
198
+ name: keystore-pass
199
+ key: password
200
+ ```
201
+
202
+ ## Common Mistakes
203
+
204
+ - **DNS-01 `apiTokenSecretRef` vs `apiKeySecretRef`** — Cloudflare uses API **tokens** (scoped), not API keys. Use `apiTokenSecretRef`.
205
+ - **Certificate renewBefore > duration** — cert-manager will constantly renew. Keep `renewBefore < duration`.
206
+ - **ClusterIssuer in wrong namespace** — ClusterIssuer is not namespaced. Certificate references it via `issuerRef.kind: ClusterIssuer`.
207
+ - **EC key with older clients** — Some clients don't support ECDSA. Use RSA if compatibility is needed: `algorithm: RSA`, `size: 2048`.
208
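+ - **DNS-01 recursive nameservers** — cert-manager checks that the challenge TXT record has propagated before asking the ACME server to validate, and by default it queries the authoritative nameservers. If the cluster cannot reach them or uses split-horizon DNS, set `--dns01-recursive-nameservers=8.8.8.8:53,1.1.1.1:53` and `--dns01-recursive-nameservers-only` so the check goes through the listed resolvers instead.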
209
+ - **ACME rate limits** — Let's Encrypt production allows 50 certificates per registered domain per week and 5 per exact set of names per week. The staging environment has much higher limits; use it for testing.
210
+ - **Secret overwrite** — cert-manager overwrites the Secret on renewal. Don't manually edit the TLS secret.
211
+ - **dnsNames vs commonName** — SANs (`dnsNames`) are required for hostname validation. Relying on `commonName` is deprecated under the CA/Browser Forum Baseline Requirements.
212
+ - **HTTP-01 on port 80** — cert-manager needs port 80 reachable. Behind Cilium Gateway, ensure HTTP listener exists.
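
The `renewBefore` and `duration` relationship from the list above comes down to simple arithmetic: renewal is attempted `renewBefore` ahead of expiry, so the time between issuance and renewal is `duration - renewBefore`. The sketch below makes that explicit; `renew_after_hours` is a hypothetical helper and assumes whole-hour values with an `h` suffix, as used in the examples in this skill.

```bash
# Hours after issuance at which renewal is attempted: duration - renewBefore.
# Hypothetical helper; accepts whole-hour values such as 2160h.
renew_after_hours() {
  local duration="${1%h}" renew_before="${2%h}"
  if [ "$renew_before" -ge "$duration" ]; then
    echo "invalid: renewBefore must be shorter than duration" >&2
    return 1
  fi
  echo $(( duration - renew_before ))
}

renew_after_hours 2160h 720h   # prints 1440 (90-day cert, renewed after 60 days)
renew_after_hours 168h 144h    # prints 24 (7-day cert, renewed after 1 day)
```

The second call uses the values from the Gateway wildcard certificate pattern, which shows that the certificate is renewed roughly daily.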