@a5c-ai/krate 5.0.1-staging.f672fe79b

Files changed (174)
  1. package/Dockerfile +29 -0
  2. package/README.md +183 -0
  3. package/bin/krate-demo.mjs +23 -0
  4. package/bin/krate-server.mjs +14 -0
  5. package/dist/krate-controller-ui.json +2407 -0
  6. package/dist/krate-lifecycle.json +201 -0
  7. package/dist/krate-runtime-snapshot.json +2955 -0
  8. package/dist/krate-summary.json +687 -0
  9. package/docs/README.md +61 -0
  10. package/docs/agents/README.md +83 -0
  11. package/docs/agents/acceptance-test-matrix.md +193 -0
  12. package/docs/agents/agent-mux-adapter-contract.md +167 -0
  13. package/docs/agents/agent-mux-source-map.md +310 -0
  14. package/docs/agents/agent-run-memory-import-spec.md +256 -0
  15. package/docs/agents/agent-stack-management-spec.md +421 -0
  16. package/docs/agents/api-contract-spec.md +309 -0
  17. package/docs/agents/artifacts-writeback-spec.md +145 -0
  18. package/docs/agents/chart-packaging-spec.md +128 -0
  19. package/docs/agents/ci-orchestration-spec.md +140 -0
  20. package/docs/agents/context-assembly-spec.md +219 -0
  21. package/docs/agents/controller-reconciliation-spec.md +255 -0
  22. package/docs/agents/crd-schema-spec.md +315 -0
  23. package/docs/agents/decision-log-open-questions.md +169 -0
  24. package/docs/agents/developer-implementation-checklist.md +329 -0
  25. package/docs/agents/dispatching-design.md +262 -0
  26. package/docs/agents/glossary.md +66 -0
  27. package/docs/agents/implementation-blueprint.md +324 -0
  28. package/docs/agents/implementation-rollout-slices.md +251 -0
  29. package/docs/agents/memory-context-integration-spec.md +194 -0
  30. package/docs/agents/memory-ontology-schema-spec.md +253 -0
  31. package/docs/agents/memory-operations-runbook.md +121 -0
  32. package/docs/agents/mvp-vertical-slice-spec.md +146 -0
  33. package/docs/agents/observability-audit-spec.md +265 -0
  34. package/docs/agents/operator-runbook.md +174 -0
  35. package/docs/agents/org-memory-api-payload-examples.md +333 -0
  36. package/docs/agents/org-memory-controller-sequence-spec.md +181 -0
  37. package/docs/agents/org-memory-e2e-fixture-plan.md +161 -0
  38. package/docs/agents/org-memory-ui-implementation-map.md +114 -0
  39. package/docs/agents/org-memory-vertical-slice-spec.md +168 -0
  40. package/docs/agents/org-resource-model-delta-spec.md +111 -0
  41. package/docs/agents/org-route-resource-model-spec.md +183 -0
  42. package/docs/agents/org-scoping-namespace-spec.md +114 -0
  43. package/docs/agents/rbac-secrets-management-spec.md +406 -0
  44. package/docs/agents/repository-page-integration-spec.md +255 -0
  45. package/docs/agents/resource-contract-examples.md +808 -0
  46. package/docs/agents/resource-relationship-map.md +190 -0
  47. package/docs/agents/security-threat-model.md +188 -0
  48. package/docs/agents/shared-memory-company-brain-spec.md +358 -0
  49. package/docs/agents/storage-migration-spec.md +168 -0
  50. package/docs/agents/subagent-orchestration-spec.md +152 -0
  51. package/docs/agents/system-overview.md +88 -0
  52. package/docs/agents/tools-mcp-skills-spec.md +189 -0
  53. package/docs/agents/traceability-matrix.md +79 -0
  54. package/docs/agents/ui-flow-spec.md +211 -0
  55. package/docs/agents/ui-ux-system-spec.md +426 -0
  56. package/docs/agents/workspace-lifecycle-spec.md +166 -0
  57. package/docs/architecture-spec.md +78 -0
  58. package/docs/components/control-plane.md +78 -0
  59. package/docs/components/data-plane.md +69 -0
  60. package/docs/components/hooks-events.md +67 -0
  61. package/docs/components/identity-rbac-policy.md +73 -0
  62. package/docs/components/kubevela-oam.md +70 -0
  63. package/docs/components/operations-publishing.md +81 -0
  64. package/docs/components/runners-ci.md +66 -0
  65. package/docs/components/web-ui.md +94 -0
  66. package/docs/external/README.md +47 -0
  67. package/docs/external/bidirectional-sync-design.md +134 -0
  68. package/docs/external/cicd-interface.md +64 -0
  69. package/docs/external/external-backend-controllers.md +170 -0
  70. package/docs/external/external-backend-crds.md +234 -0
  71. package/docs/external/external-backend-ui-spec.md +151 -0
  72. package/docs/external/external-backend-ux-flows.md +115 -0
  73. package/docs/external/external-object-mapping.md +125 -0
  74. package/docs/external/git-forge-interface.md +68 -0
  75. package/docs/external/github-integration-design.md +151 -0
  76. package/docs/external/issue-tracking-interface.md +66 -0
  77. package/docs/external/provider-capability-manifests.md +204 -0
  78. package/docs/external/provider-catalog.md +139 -0
  79. package/docs/external/provider-rollout-testing.md +78 -0
  80. package/docs/external/research-results.md +48 -0
  81. package/docs/external/security-auth-permissions.md +81 -0
  82. package/docs/external/sync-state-machines.md +108 -0
  83. package/docs/external/unified-external-backend-model.md +107 -0
  84. package/docs/external/user-facing-changes.md +67 -0
  85. package/docs/gaps.md +161 -0
  86. package/docs/install.md +94 -0
  87. package/docs/krate-design.md +334 -0
  88. package/docs/local-minikube.md +55 -0
  89. package/docs/ontology/README.md +32 -0
  90. package/docs/ontology/bounded-contexts.md +29 -0
  91. package/docs/ontology/events-and-hooks.md +32 -0
  92. package/docs/ontology/oam-kubevela.md +32 -0
  93. package/docs/ontology/operations-and-release.md +25 -0
  94. package/docs/ontology/personas-and-actors.md +32 -0
  95. package/docs/ontology/policies-and-invariants.md +33 -0
  96. package/docs/ontology/problem-space.md +30 -0
  97. package/docs/ontology/resource-contracts.md +40 -0
  98. package/docs/ontology/resource-taxonomy.md +42 -0
  99. package/docs/ontology/runners-and-ci.md +29 -0
  100. package/docs/ontology/solution-space.md +24 -0
  101. package/docs/ontology/storage-and-data-boundaries.md +29 -0
  102. package/docs/ontology/validation-matrix.md +24 -0
  103. package/docs/ontology/web-ui-excellent-flows.md +32 -0
  104. package/docs/ontology/workflows.md +39 -0
  105. package/docs/ontology/world.md +35 -0
  106. package/docs/product-requirements.md +62 -0
  107. package/docs/roadmap-mvp.md +87 -0
  108. package/docs/system-requirements.md +90 -0
  109. package/docs/tests/README.md +53 -0
  110. package/docs/tests/agent-qa-plan.md +63 -0
  111. package/docs/tests/browser-ui-tests.md +62 -0
  112. package/docs/tests/ci-quality-gates.md +48 -0
  113. package/docs/tests/coverage-model.md +64 -0
  114. package/docs/tests/e2e-scenario-tests.md +53 -0
  115. package/docs/tests/fixtures-test-data.md +63 -0
  116. package/docs/tests/observability-reliability-tests.md +54 -0
  117. package/docs/tests/product-test-matrix.md +145 -0
  118. package/docs/tests/qa-adoption-roadmap.md +130 -0
  119. package/docs/tests/qa-automation-plan.md +101 -0
  120. package/docs/tests/security-compliance-tests.md +57 -0
  121. package/docs/tests/test-framework-tools.md +88 -0
  122. package/docs/tests/test-suite-layout.md +121 -0
  123. package/docs/tests/unit-integration-tests.md +48 -0
  124. package/docs/todo-kyverno +714 -0
  125. package/docs/user-stories.md +78 -0
  126. package/examples/minikube-demo.yaml +190 -0
  127. package/examples/oam-application.yaml +23 -0
  128. package/examples/policy-kyverno-pr-title.yaml +18 -0
  129. package/package.json +63 -0
  130. package/scripts/build.mjs +29 -0
  131. package/scripts/setup-minikube.mjs +65 -0
  132. package/scripts/smoke.mjs +37 -0
  133. package/scripts/validate-doc-coverage.mjs +152 -0
  134. package/scripts/validate-package.mjs +93 -0
  135. package/scripts/validate-ui.mjs +207 -0
  136. package/src/agent-approval-controller.js +123 -0
  137. package/src/agent-context-bundles.js +242 -0
  138. package/src/agent-dispatch-controller.js +86 -0
  139. package/src/agent-mux-client.js +280 -0
  140. package/src/agent-permission-review.js +162 -0
  141. package/src/agent-stack-controller.js +296 -0
  142. package/src/agent-trigger-controller.js +108 -0
  143. package/src/api-controller.js +206 -0
  144. package/src/argocd-gitops.js +43 -0
  145. package/src/auth.js +265 -0
  146. package/src/component-catalog.js +41 -0
  147. package/src/control-plane.js +136 -0
  148. package/src/controller-client.js +38 -0
  149. package/src/controller-ui.js +538 -0
  150. package/src/data-plane.js +178 -0
  151. package/src/gitea-backend.js +95 -0
  152. package/src/handoff.js +98 -0
  153. package/src/hooks-events.js +63 -0
  154. package/src/http-server.js +151 -0
  155. package/src/identity-policy.js +86 -0
  156. package/src/index.js +30 -0
  157. package/src/kubernetes-controller.js +812 -0
  158. package/src/kubernetes-resource-gateway.js +48 -0
  159. package/src/operations.js +112 -0
  160. package/src/resource-model.js +203 -0
  161. package/src/runners-ci.js +48 -0
  162. package/src/runtime.js +196 -0
  163. package/src/web-ui.js +40 -0
  164. package/tests/agent-approval-controller.test.js +173 -0
  165. package/tests/agent-context-bundles.test.js +278 -0
  166. package/tests/agent-dispatch-controller.test.js +176 -0
  167. package/tests/agent-mux-client.test.js +204 -0
  168. package/tests/agent-permission-review.test.js +209 -0
  169. package/tests/agent-resources.test.js +212 -0
  170. package/tests/agent-stack-controller.test.js +221 -0
  171. package/tests/agent-trigger-controller.test.js +211 -0
  172. package/tests/deployment.test.js +395 -0
  173. package/tests/e2e/lifecycle.test.js +117 -0
  174. package/tests/krate.test.js +727 -0
@@ -0,0 +1,334 @@
1
+ # Krate
2
+
3
+ **A Kubernetes-native Git forge where repos, PRs, CI, and policy share one identity model, one RBAC system, and one declarative API.**
4
+
5
+ ---
6
+
7
+ ## TL;DR
8
+
9
+ Most "Kubernetes-native" forges are monoliths in a Helm chart. Krate is different: it extends the Kubernetes API itself. Pull requests, issues, pipelines, and runners are real Kubernetes resources, governed by native RBAC, queryable with `kubectl`, and policy-controlled by the existing admission-webhook ecosystem (Kyverno, Gatekeeper).
10
+
11
+ The bet: platform engineering teams already run Argo, Crossplane, ARC, and Kyverno. A forge that *natively composes* with that stack — instead of bolting on integrations — eliminates an entire layer of glue, an entire RBAC system, and an entire class of CVEs.
12
+
13
+ ---
14
+
15
+ ## What was wrong with the naive spec
16
+
17
+ The original "K8s-native forge" idea has three architectural traps that kill it before MVP:
18
+
19
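+ **1. etcd is not your database.** Storing every issue, PR, and comment as a CRD sounds elegant, but etcd rejects requests larger than 1.5 MiB by default, ships with a 2 GB default storage quota, and is not recommended past ~8 GB total. A medium-sized org (10k issues × 100 comments × 2KB ≈ 2 GB of comment bodies alone) exhausts the default quota, at which point etcd raises a space alarm and rejects writes, stalling the cluster's control plane. CRDs are an *API contract*, not a storage engine.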
20
+
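The sizing argument in trap 1 can be checked with quick arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, the workload numbers are the ones quoted above, and the 2 GiB figure is etcd's default backend quota. It counts comment bodies only, before object metadata and revision history.

```javascript
// Back-of-envelope estimate for storing comments as CRDs in etcd.
// All names are hypothetical; workload numbers come from the text above.
const ETCD_DEFAULT_QUOTA_BYTES = 2 * 1024 ** 3; // etcd's default backend quota (2 GiB)

function estimateCommentBytes({ issues, commentsPerIssue, bytesPerComment }) {
  return issues * commentsPerIssue * bytesPerComment;
}

const totalCommentBytes = estimateCommentBytes({
  issues: 10_000,
  commentsPerIssue: 100,
  bytesPerComment: 2 * 1024,
});

const quotaFraction = totalCommentBytes / ETCD_DEFAULT_QUOTA_BYTES;
console.log(`${(totalCommentBytes / 1024 ** 3).toFixed(2)} GiB of comment bodies`);
console.log(`fraction of default quota: ${quotaFraction.toFixed(2)}`);
```

Comment bodies alone land within a few percent of the default quota, which is why the design moves high-cardinality records out of etcd.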
21
+ **2. One PVC per repository doesn't scale.** AWS EBS allows ~25 volumes per node; GCE PD has similar limits. At that ceiling, 1,000 repos on dedicated PVCs need at least 40 nodes for volume attachments alone, so hosting them on one node is impossible. The CSI story works only with ReadWriteMany filesystems or a Gitea-managed RWX layout.
22
+
23
+ **3. Scale-to-zero on `git-receive` breaks real workflows.** Cold-starting on `git push` adds 3–8s of latency, and webhook fan-out doesn't always retry cleanly. *Pull* is the path that benefits from elastic scaling, not push.
24
+
25
+ The refined design fixes all three.
26
+
27
+ ---
28
+
29
+ ## Architecture: control plane / data plane
30
+
31
+ ```mermaid
32
+ graph TB
33
+ subgraph Clients
34
+ CLI["git CLI / kubectl"]
35
+ UI["Next.js UI"]
36
+ CI["CI Runners (ARC/Tekton)"]
37
+ end
38
+
39
+ subgraph CP["Control Plane (kube-apiserver)"]
40
+ AAS["Aggregated API Server<br/>PullRequest, Issue, Review,<br/>Pipeline, Job, WebhookDelivery"]
41
+ CRD["CRDs<br/>Repository, WebhookSubscription,<br/>RefPolicy, BranchProtection,<br/>RunnerPool, View, Selector"]
42
+ AAS -.backed by.-> PG[("Postgres<br/>social data")]
43
+ CRD -.backed by.-> ETCD[("etcd<br/>config only")]
44
+ end
45
+
46
+ subgraph DP["Data Plane"]
47
+ ROUTER["Git Protocol Router<br/>(smart-HTTP + SSH)"]
48
+ S1["Gitea pod 0"]
49
+ S2["Gitea pod 1"]
50
+ S3["Gitea pod N..."]
51
+ ZOEKT["Zoekt Code Search"]
52
+ ROUTER --> S1
53
+ ROUTER --> S2
54
+ ROUTER --> S3
55
+ S1 -.indexed by.-> ZOEKT
56
+ end
57
+
58
+ RWX[("RWX Volumes (EFS/Ceph)")]
59
+ OBJ[("Object Storage (LFS, archives)")]
60
+
61
+ REPO_OP["Repository Operator (Crossplane)"]
62
+ PR_OP["PR Operator → preview env per PR"]
63
+ ARGO["ArgoCD ApplicationSet"]
64
+
65
+ CLI --> ROUTER
66
+ UI --> AAS
67
+ UI --> ROUTER
68
+ CI --> ROUTER
69
+ S1 --> RWX
70
+ S2 --> RWX
71
+ S3 --> RWX
72
+ S1 --> OBJ
73
+ AAS --> REPO_OP
74
+ AAS --> PR_OP
75
+ PR_OP --> ARGO
76
+ ```
77
+
78
+ ### Control plane
79
+
80
+ Exposes a Kubernetes-style API. `Repository`, `WebhookSubscription`, `RefPolicy`, `BranchProtection`, `RunnerPool`, `View`, and `Selector` are CRDs (low-cardinality, declarative). `PullRequest`, `Issue`, `Review`, `Pipeline`, `Job`, and `WebhookDelivery` are served by an **Aggregated API Server** (like `metrics-server`) — same kubectl semantics, but backed by Postgres instead of etcd. This is the single highest-stakes architectural decision.
81
+
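The CRD versus Aggregated API split above amounts to a routing table from kind to backing store. A minimal sketch, assuming a hypothetical `storageFor` helper (the kind lists are taken from the paragraph above; nothing here is confirmed to match the package's own resource model):

```javascript
// Hypothetical routing table: which store backs each resource kind.
const CRD_KINDS = new Set([
  "Repository", "WebhookSubscription", "RefPolicy", "BranchProtection",
  "RunnerPool", "View", "Selector",
]);
const AGGREGATED_KINDS = new Set([
  "PullRequest", "Issue", "Review", "Pipeline", "Job", "WebhookDelivery",
]);

// Returns "etcd" for low-cardinality config kinds and "postgres" for
// high-cardinality records; unknown kinds fail loudly instead of defaulting.
function storageFor(kind) {
  if (CRD_KINDS.has(kind)) return "etcd";
  if (AGGREGATED_KINDS.has(kind)) return "postgres";
  throw new Error(`Unclassified kind: ${kind}`);
}

console.log(storageFor("Repository"), storageFor("PullRequest"));
```

Throwing on unclassified kinds keeps a new resource type from silently landing in etcd.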
82
+ ### Data plane
83
+
84
+ Three separable services. The *Gitea backend* terminates smart-HTTP and SSH and owns repository hosting, deploy keys, branch protection, collaborators/teams, and webhooks. Krate controllers reconcile `Repository` resources into Gitea integration plans, while *Code Search* (Zoekt) runs as a separate indexing service over repository events.
85
+
86
+ ### Scaling profile
87
+
88
+ - `git-upload-pack` (reads): HPA on request rate, 1→N.
89
+ - `git-receive-pack` (writes): warm minimum 1 Gitea backend replica; KEDA bursts on backlog.
90
+ - Search indexers: scale independently.
91
+
92
+ ### Identity
93
+
94
+ OIDC for humans, federated to K8s `User`/`Group`. For CI, **Workload Identity**: runner pods get a projected ServiceAccount token, scoped to `(repo, ref, pipeline_id)`. **No PATs, ever.** Want to push to a registry? Bind the SA to the registry's RoleBinding. The CI system has no special privileges — it has whatever you grant the SA.
95
+
96
+ ---
97
+
98
+ ## Runners: a first-class system
99
+
100
+ ARC is the *executor*, not the runner system. Krate has its own runner abstraction that composes with ARC, Tekton, or Buildkite Agent underneath. Identity, caching, cost attribution, and untrusted-code isolation are forge concerns.
101
+
102
+ ### Resources
103
+
104
+ - **`RunnerPool`** (CRD): image, resources, node selector, scaling policy, allowed repos, **trust tier** (`trusted` | `untrusted`), cache backend.
105
+ - **`Pipeline`** (Aggregated API): a single CI invocation. References a `RunnerPool` and a workflow file.
106
+ - **`Job`** (Aggregated API): a single step within a pipeline; owns one runner pod.
107
+
108
+ ### Four problems the model fixes structurally
109
+
110
+ **Cold start vs cost.** Pools have `warmReplicas` and `maxReplicas`. KEDA scales between them on queue depth. Trusted pools default to warm=2; untrusted (fork PRs) default to warm=0 — pay cold-start latency for safety.
111
+
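The warm/max behaviour described above is a clamp on queue-driven demand. A rough sketch, assuming hypothetical field names `warmReplicas` and `maxReplicas` on a pool object and a hypothetical `jobsPerReplica` parameter; a real KEDA setup would express this through its own scaler configuration rather than application code:

```javascript
// Clamp queue-driven demand between a pool's warm floor and max ceiling.
// Field and parameter names are illustrative assumptions.
function desiredReplicas(pool, queueDepth, jobsPerReplica = 1) {
  const needed = Math.ceil(queueDepth / jobsPerReplica);
  return Math.min(pool.maxReplicas, Math.max(pool.warmReplicas, needed));
}

const trustedPool = { warmReplicas: 2, maxReplicas: 10 };
const untrustedPool = { warmReplicas: 0, maxReplicas: 10 };

console.log(desiredReplicas(trustedPool, 0));    // idle trusted pool stays warm
console.log(desiredReplicas(untrustedPool, 0));  // idle untrusted pool scales to zero
console.log(desiredReplicas(trustedPool, 25));   // burst is capped at maxReplicas
```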
112
+ **Untrusted code execution.** A PR from a fork *must* run in an `untrusted` pool with a ServiceAccount that has zero secrets and no cluster API access. Enforced by an admission policy on `Pipeline` create — the controller refuses to schedule untrusted code on a trusted pool. Most common security incident in CI; making it un-bypassable kills the bug class.
113
+
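The fork-isolation rule above reduces to a single admission check. The sketch below is an assumption-laden illustration: `fromFork` and `trustTier` are hypothetical field names, and a real deployment would express the rule as an admission policy on `Pipeline` create rather than a standalone function.

```javascript
// Reject any pipeline built from a fork unless it targets an untrusted pool.
// `fromFork` and `trustTier` are hypothetical field names.
function admitPipeline(pipeline, pool) {
  if (pipeline.fromFork && pool.trustTier !== "untrusted") {
    return {
      allowed: false,
      reason: `fork pipelines must run in an untrusted pool, got "${pool.trustTier}"`,
    };
  }
  return { allowed: true };
}

console.log(admitPipeline({ fromFork: true }, { trustTier: "trusted" }));
console.log(admitPipeline({ fromFork: true }, { trustTier: "untrusted" }));
```

The check is deliberately one-directional: trusted code may still run in an untrusted pool, it just gets fewer privileges.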
114
+ **Caching.** BuildKit cache mount per pool, backed by S3. Shared across runs of the same repo, never across pools. No "is my cache poisoned" debugging because cache scope is structural.
115
+
116
+ **Identity.** No PATs. Each Job pod gets a projected ServiceAccount token scoped to `(repo, ref, pipeline_id)`. The CI system has whatever privileges its SA has — nothing more.
117
+
118
+ ### Runners UI
119
+
120
+ Three screens carry 90% of the weight:
121
+
122
+ **Pool dashboard** (org-level). Live grid of pools: warm/total replicas, queue depth, p50/p95 wait time, last-hour cost. Click → spec, recent runs, scaling timeline. Killer feature: a *"Why is this slow?"* link that surfaces the bottleneck (queue saturation? image pull? node pressure?) by querying pod events.
123
+
124
+ **Live run view** (per pipeline) — the screen people live in:
125
+
126
+ ```
127
+ ┌─────────────────────────────────────────────────────────────────┐
128
+ │ Pipeline #4127 · feature/auth-refactor · ⏱ 2m 14s │
129
+ │ [Cancel] [Rerun failed] [Rerun from step ▾] [</> YAML] │
130
+ ├──────────────────┬──────────────────────────────────────────────┤
131
+ │ Steps │ Logs: build > test > integration │
132
+ │ │ │
133
+ │ ✓ checkout │ + go test ./... │
134
+ │ ✓ deps │ ok github.com/krate/api 0.142s │
135
+ │ ✓ build │ ok github.com/krate/auth 0.318s │
136
+ │ ⟳ test │ --- FAIL: TestTokenExchange (0.04s) │
137
+ │ ↳ unit │ auth_test.go:142: expected 200, got 401 │
138
+ │ ↳ integration │ FAIL github.com/krate/auth │
139
+ │ ◯ deploy │ │
140
+ │ ◯ notify │ [📋 Copy failure] [🔍 Find similar runs] │
141
+ │ │ │
142
+ │ Pod: krate- │ ────── Live tail ────── │
143
+ │ runner-trusted- │ │
144
+ │ 7c4f9 → │ │
145
+ │ [kubectl logs] │ │
146
+ └──────────────────┴──────────────────────────────────────────────┘
147
+ ```
148
+
149
+ Log streaming follows the K8s pod `log` endpoint and pipes it through SSE. *"Find similar runs"* is a label-selector query: `kubectl get pipelines -l failure.signature=auth_test.142` (label values cannot contain `:`, so the signature is normalized). *"Rerun from step"* creates a new `Pipeline` with a `resumeFrom` field; the controller skips earlier steps and rehydrates the cache.
150
+
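For "Find similar runs" to work as a label selector, the failure signature has to be a valid Kubernetes label value: at most 63 characters, alphanumeric at both ends, with only `-`, `_`, and `.` in between. A minimal normalizer sketch follows; the function name is hypothetical and the replacement character is an arbitrary choice.

```javascript
// Normalize a raw failure signature into a valid Kubernetes label value.
// Invalid character runs become ".", the result is capped at 63 characters,
// and non-alphanumeric characters are trimmed from both ends.
function toLabelValue(raw) {
  const cleaned = raw.replace(/[^A-Za-z0-9._-]+/g, ".").slice(0, 63);
  return cleaned.replace(/^[^A-Za-z0-9]+/, "").replace(/[^A-Za-z0-9]+$/, "");
}

console.log(toLabelValue("auth_test:142"));
```

Truncating before trimming guarantees the value still ends on an alphanumeric character after the 63-character cap.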
151
+ **Pool editor.** Split view: left = form (image, resources, scaling), right = the YAML it produces, live-updating. Save shows "this is equivalent to `kubectl apply -f -`" with the manifest. Same screen offers *"Save to repo"* — opens a PR against your platform-config repo.
152
+
153
+ ---
154
+
155
+ ## Hooks: three layers, not one
156
+
157
+ "Hooks" gets conflated. Krate has three distinct hook surfaces, each with a different mechanism:
158
+
159
+ ### Layer 1 — Server-side Git hooks (`pre-receive`, `update`, `post-receive`)
160
+
161
+ Run *inside* the receive-pack process; can't be CRDs (Git doesn't know what etcd is). Model: `RefPolicy` CRD declares the rule (require commit signing, block force-push to main, require linear history). A controller compiles policies into a config map mounted into Gitea backend pods; receive-pack evaluates them in-process. Custom hooks run in a sandboxed **WASM runtime** — no shelling out, no supply-chain risk.
162
+
163
+ ### Layer 2 — Outbound webhooks (HTTP delivery to Slack, Jenkins, etc.)
164
+
165
+ `WebhookSubscription` CRD per repo or org. Delivery via NATS JetStream (K8s-native, durable queue) with exponential backoff and HMAC signing. Every delivery becomes a `WebhookDelivery` resource — queryable, replayable, observable through the same `kubectl get` machinery as everything else.
166
+
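The exponential backoff mentioned above can be pictured as a capped delay schedule. This sketch assumes a hypothetical base delay, growth factor, and cap; the source does not specify any of them, and jitter is omitted for brevity.

```javascript
// Capped exponential backoff: delay_i = min(maxMs, baseMs * factor^i).
// All defaults are illustrative assumptions, not values from the design.
function backoffSchedule({ attempts, baseMs = 1000, factor = 2, maxMs = 60_000 }) {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(maxMs, baseMs * factor ** i)
  );
}

console.log(backoffSchedule({ attempts: 8 }));
```

With these defaults the delays double from one second and flatten at the one-minute cap, so a dead endpoint such as the `jenkins-legacy` example costs a bounded number of retries per delivery.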
167
+ ### Layer 3 — Admission webhooks on the control plane (the elegant part)
168
+
169
+ Because PRs and Issues are real Kubernetes resources, **any** `ValidatingAdmissionWebhook` or `MutatingAdmissionWebhook` works on them. **Kyverno and OPA Gatekeeper work out of the box for PR policy.** Block PRs with empty descriptions? Kyverno policy. Require two reviewers from different teams? CEL expression. You inherit the entire Kubernetes policy ecosystem for free.
170
+
171
+ ### Hooks UI
172
+
173
+ Per-repo Hooks tab unifies all three layers in one explorer:
174
+
175
+ ```
176
+ Hooks affecting this repository
177
+ ─────────────────────────────────────────────────────────────────
178
+ GIT REFS (pre-receive, update, post-receive)
179
+ ✓ require-signed-commits RefPolicy/org-defaults [edit]
180
+ ✓ block-force-push-main RefPolicy/protected [edit]
181
+ ⚠ lint-on-push RefPolicy/this-repo [3 fails today]
182
+
183
+ OUTBOUND WEBHOOKS
184
+ ✓ slack-engineering push, pull_request [deliveries →]
185
+ ✗ jenkins-legacy push [12 failed in 1h ⚠]
186
+
187
+ ADMISSION POLICIES (on PullRequest, Issue)
188
+ ✓ require-pr-description Kyverno: ClusterPolicy [view]
189
+ ✓ block-wip-merges Kyverno: ClusterPolicy [view]
190
+ + Add policy [Browse Kyverno templates]
191
+ ```
192
+
193
+ **Webhook delivery log.** The page everyone secretly wants and no forge does well. Every delivery: request headers, body, response, latency, retry chain. *"Replay"* re-fires with current secrets. Filter via `kubectl get webhookdeliveries -l status=failed,subscription=jenkins-legacy --sort-by=.metadata.creationTimestamp`; `kubectl get` has no `--since` flag, so a time window such as "last hour" is applied client-side or in the UI.
194
+
195
+ **Policy authoring.** If Kyverno is installed, the UI deep-links to its policies. If not, a built-in editor with form mode (templates) and CEL mode (raw expressions). A *"Violations"* panel shows current PRs that *would* fail the policy if it were enforcing — roll out in audit mode first.
196
+
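The "Violations" panel boils down to running a policy over existing PRs and reporting failures without blocking unless the policy is enforcing. A small sketch under stated assumptions: the policy shape (`name`, `mode`, `check`) and the PR fields are hypothetical, and a Kyverno-backed install would rely on Kyverno's own audit and enforce settings instead.

```javascript
// Hypothetical policy object: in "audit" mode violations are only reported,
// in "enforce" mode the same violations are also blocked.
const requireDescription = {
  name: "require-pr-description",
  mode: "audit",
  check: (pr) => pr.description.trim().length > 0,
};

function evaluatePolicy(policy, prs) {
  const violations = prs.filter((pr) => !policy.check(pr)).map((pr) => pr.id);
  return { violations, blocked: policy.mode === "enforce" ? violations : [] };
}

const openPrs = [
  { id: 41, description: "Refactor token exchange" },
  { id: 42, description: "   " },
];

console.log(evaluatePolicy(requireDescription, openPrs));
console.log(evaluatePolicy({ ...requireDescription, mode: "enforce" }, openPrs));
```

Because both modes share one evaluation path, the audit preview shows exactly what enforcement would later block.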
197
+ ---
198
+
199
+ ## UX scope
200
+
201
+ ### Personas
202
+
203
+ - **Developer** — lives in PR review, run debugging, code browse. Wants speed.
204
+ - **Platform engineer** — manages pools, policies, hooks, identity. Wants control and auditability.
205
+ - *Repo admin* is mostly a developer with extra settings access; not a separate IA tier.
206
+
207
+ ### Information architecture
208
+
209
+ **Top-level (org-scoped):**
210
+
211
+ | Section | Primary use |
212
+ |---|---|
213
+ | Repositories | Browse, search, create |
214
+ | Inbox | Cross-repo PRs/issues/runs needing me |
215
+ | Runs | Cross-repo CI activity |
216
+ | Runners | Pool dashboard + live runs |
217
+ | Hooks & Policies | All three hook layers |
218
+ | Insights | Lead time, runner cost, hook health |
219
+ | Settings | Identity, storage, Gitea backend defaults |
220
+
221
+ **Per-repo:** Code · Pull Requests · Issues · Runs · Hooks · Settings.
222
+
223
+ ### Six flows that must be excellent
224
+
225
+ 1. **Open and review a PR.** Three-pane: file tree / diff / conversation. Inline comments thread. Suggested edits commit-with-one-click. Keyboard-first (j/k/n/p). Right rail shows the PR's `Pipeline` runs, linked to live run view.
226
+
227
+ 2. **Debug a failing run.** Live run view with step navigation, log streaming, "find similar failures."
228
+
229
+ 3. **Configure a runner pool.** Form + YAML split, GitOps export.
230
+
231
+ 4. **Add a webhook and verify it works.** Create form → *"Send test delivery"* → see it in the log within seconds, with response.
232
+
233
+ 5. **Write a PR policy.** Pick a template → preview violations on existing PRs → enable in audit mode → graduate to enforce. The audit→enforce gradient *is* the UX, not just a flag.
234
+
235
+ 6. **Cross-repo triage.** Inbox view with saved filters. Filters persist as `Selector` CRDs — a team's "P0 bug triage view" is a YAML file you can commit and share.
236
+
237
+ ### Front-end stack (Next.js 15, App Router)
238
+
239
+ Three principles:
240
+
241
+ **Direct-to-API, no intermediate backend.** Server Components fetch from the Aggregated API Server using `@kubernetes/client-node`. There is no separate "forge backend" — the control plane *is* the backend.
242
+
243
+ **Real-time via the Watch API.** Every list view opens a Watch stream from the API server, piped to the browser via Server-Sent Events from a Route Handler. PR state, comments, CI status — all stream in without polling. Free, because the K8s API gives it to you.
244
+
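Bridging a Watch stream to the browser mostly means re-framing each watch event as a Server-Sent Events message. The sketch below shows only that framing step; the function name is hypothetical, and the Route Handler, authentication, and reconnect handling are left out.

```javascript
// Convert one Kubernetes watch event ({ type, object }) into an SSE frame.
// JSON.stringify emits a single line, so one `data:` field is sufficient.
function toSseFrame(watchEvent) {
  const { type, object } = watchEvent;
  return `event: ${type}\ndata: ${JSON.stringify(object)}\n\n`;
}

const frame = toSseFrame({
  type: "MODIFIED",
  object: { kind: "PullRequest", metadata: { name: "42" } },
});
process.stdout.write(frame);
```

On the browser side, an `EventSource` listener registered for the `MODIFIED` event name would receive the JSON payload in `event.data`.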
245
+ **GitOps-transparent UI.** Every mutating action shows the equivalent YAML inline and offers *"Copy as kubectl"*.
246
+
247
+ Auth model: **NextAuth** handles OIDC login, then exchanges the OIDC token for a Kubernetes ServiceAccount token via the TokenRequest API. Every API call from a Server Component carries the user's K8s identity, so RBAC enforcement happens at the API server — not in app code. Zero permission logic to write or audit.
248
+
249
+ ```
250
+ ┌─────────────────────────────────────────────────────────────┐
251
+ │ Browser │
252
+ │ ├─ Server Components (RSC stream from K8s API) │
253
+ │ ├─ SSE client (Watch events → optimistic UI) │
254
+ │ └─ Monaco/CodeMirror for diff + code view │
255
+ └──────────────┬──────────────────────────────────────────────┘
256
+ │ HTTPS
257
+ ┌──────────────▼──────────────────────────────────────────────┐
258
+ │ Next.js (Edge runtime where possible) │
259
+ │ ├─ /repos/[…] → Server Component, kubectl-equivalent │
260
+ │ ├─ /api/watch/orgs/[org]/* → Route Handler, K8s Watch → SSE │
261
+ │ ├─ /api/git-proxy → Streams smart-HTTP to Gitea backend │
262
+ │ └─ NextAuth (OIDC) ↔ TokenRequest API for SA exchange │
263
+ └──────────────┬──────────────────────────────────────────────┘
264
+ │ K8s API (Aggregated) │ Git smart-HTTP
265
+ ▼ ▼
266
+ Control Plane Data Plane
267
+ ```
268
+
269
+ ---
270
+
271
+ ## "GitOps-transparent UX" — concretely
272
+
273
+ Four patterns, applied uniformly:
274
+
275
+ **1. Every mutating action has a `</>` button** in the same screen position. Click: side panel shows (a) the resulting YAML, (b) the kubectl command equivalent, (c) *"Copy as `kubectl apply`"*, (d) *"Open PR against config repo."* Click "Apply" in the panel and the action runs through the same code path as the form button — the panel isn't a translation layer, it's the truth and the form is the rendering of it.
276
+
277
+ **2. Every detail page has a YAML tab** next to the rendered view. Lens/Headlamp parity, native.
278
+
279
+ **3. Every saved view is a resource.** Inbox filter, dashboard layout, custom column set — all stored as `View` / `Selector` CRDs. Your team's triage dashboard is `git clone`-able. No competitor has this because no competitor's UI state lives in the same store as the domain data.
280
+
281
+ **4. Activity log entries are kubectl commands.** "Bob merged PR #42" expands to: `kubectl patch pullrequest/42 --type=merge -p '{"spec":{"merged":true}}'` — copyable, replayable, auditable. The audit log isn't generated from events; it's a literal command history.
282
+
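Rendering an activity entry as a copyable command is a string-building exercise. A hedged sketch, assuming a hypothetical activity record with `resource`, `name`, and `patch` fields; the output mirrors the `kubectl patch` example in pattern 4.

```javascript
// Build a copyable `kubectl patch` command from a hypothetical activity record.
// Single quotes inside the JSON are escaped for POSIX shells.
function toKubectlPatch({ resource, name, patch }) {
  const json = JSON.stringify(patch).replace(/'/g, `'\\''`);
  return `kubectl patch ${resource}/${name} --type=merge -p '${json}'`;
}

console.log(
  toKubectlPatch({
    resource: "pullrequest",
    name: "42",
    patch: { spec: { merged: true } },
  })
);
```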
283
+ ---
284
+
285
+ ## Why this wins (positioning)
286
+
287
+ The pitch isn't "Gitea on Kubernetes." It's: **the only forge where your repos, PRs, CI, and infrastructure share one identity model, one RBAC system, and one declarative API.**
288
+
289
+ Three moats:
290
+
291
+ - **Platform engineers want this.** Internal developer platforms are the buyer. They already run Argo, Crossplane, ARC, Kyverno. A forge that natively composes with them eliminates integration glue.
292
+ - **Multi-tenancy is namespace-shaped.** Tenant isolation is a solved K8s problem. Other forges reinvent it badly.
293
+ - **No new permission system.** The single biggest source of breach incidents at competitors is auth bugs. Inheriting K8s RBAC removes an entire class of CVEs from the roadmap.
294
+
295
+ ---
296
+
297
+ ## Roadmap
298
+
299
+ | Capability | v0.1 (6 wks) | v0.2 | v0.3 |
300
+ |---|---|---|---|
301
+ | Repos, PRs, Issues (Aggregated API) | ✓ | | |
302
+ | Single-Gitea backend data plane | ✓ | | |
303
+ | Next.js UI: code, PR review, run view | ✓ | | |
304
+ | Outbound webhooks + delivery log | ✓ | | |
305
+ | Admission webhooks (Kyverno-compatible) | ✓ (free) | | |
306
+ | External CI via ARC | ✓ | | |
307
+ | Native `RunnerPool` + live run view | | ✓ | |
308
+ | `RefPolicy` (server-side git hooks, WASM) | | ✓ | |
309
+ | Gitea backend horizontal scale-out | | ✓ | |
310
+ | `View` / `Selector` CRDs (saved triage) | | ✓ | |
311
+ | Code search (Zoekt) | | | ✓ |
312
+ | Multi-cluster federation | | | ✓ |
313
+
314
+ **The MVP demo:** a single `kubectl apply -f kyverno-pr-policy.yaml` blocks a PR — and the same policy shows up in the UI's Hooks tab without anyone wiring it. That's the moment people get it.
315
+
316
+ ### 6-week MVP plan
317
+
318
+ - **W1–2.** Aggregated API Server with `Repository` + `PullRequest`, Postgres-backed, working `kubectl get/create`.
319
+ - **W3.** Single-Gitea backend data plane: `git-upload-pack` and `git-receive-pack` served by Gitea with Repository Operator creating repositories through the Gitea API.
320
+ - **W4.** Next.js skeleton — login, repo list, file view, PR list (RSC + Watch SSE).
321
+ - **W5.** PR creation flow, inline diff, comment thread.
322
+ - **W6.** Workload Identity for CI, demo with ARC running a real workflow.
323
+
324
+ Ship as a public Helm chart. The story writes itself: `helm install krate; kubectl create -f my-repo.yaml; git push`.
325
+
326
+ ---
327
+
328
+ ## Open decisions
329
+
330
+ 1. **Aggregated API Server vs pure CRDs.** Highest-stakes call; everything downstream depends on it. AAS is correct; commit early.
331
+ 2. **Runner abstraction: executor-pluggable or ARC-only for MVP?** Pluggable is correct long-term but doubles week-2 complexity.
332
+ 3. **Bundle Kyverno in the install, or BYO?** Bundling is friendlier; BYO keeps install lean.
333
+ 4. **Name.** "Krate" is the project name used by this implementation package and documentation set.
334
+
@@ -0,0 +1,55 @@
1
+ # Local Minikube Setup
2
+
3
+
4
+
5
+ Krate now includes a deterministic local setup script for the Kubernetes package lifecycle.
6
+
7
+
8
+
9
+ ## Dry-run validation
10
+
11
+
12
+
13
+ Use dry-run mode in development and CI because it does not require a local cluster:
14
+
15
+
16
+
17
+ ```bash
18
+
19
+ npm run setup:minikube -- --dry-run
20
+
21
+ npm run setup:minikube -- --dry-run --json
22
+
23
+ npm run e2e
24
+
25
+ ```
26
+
27
+
28
+
29
+ Dry-run mode verifies the intended lifecycle command plan: start minikube, enable ingress and metrics, select the context, validate the chart, install the chart, apply demo resources, wait for the API deployment, and run smoke assertions.
30
+
31
+
32
+
33
+ ## Apply mode
34
+
35
+
36
+
37
+ Use apply mode when `minikube`, `kubectl`, `helm`, Node.js, npm, and a working container driver are installed:
38
+
39
+
40
+
41
+ ```bash
42
+
43
+ npm run setup:minikube -- --apply
44
+
45
+ ```
46
+
47
+
48
+
49
+ The script defaults to profile `krate`, namespace `krate-system`, release `krate`, driver `docker`, and chart `charts/krate`. Override with `--profile=...`, `--namespace=...`, `--release=...`, `--driver=...`, and `--chart=...`.
50
+
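The defaults-plus-overrides behaviour can be sketched as a small argument parser. This is not the package's `setup-minikube` script, only an illustration of the pattern; the function name is hypothetical, while the default values and flag names come from the paragraph above.

```javascript
// Illustrative parser for `--key=value` overrides on top of documented defaults.
const SETUP_DEFAULTS = {
  profile: "krate",
  namespace: "krate-system",
  release: "krate",
  driver: "docker",
  chart: "charts/krate",
};

function parseSetupArgs(argv) {
  const options = { ...SETUP_DEFAULTS };
  for (const arg of argv) {
    const match = /^--([a-z]+)=(.+)$/.exec(arg);
    // Only known option names are accepted; anything else is ignored.
    if (match && Object.hasOwn(SETUP_DEFAULTS, match[1])) {
      options[match[1]] = match[2];
    }
  }
  return options;
}

console.log(parseSetupArgs(["--apply", "--profile=dev", "--driver=podman"]));
```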
51
+
52
+
53
+ ## Release boundary
54
+
55
+ The chart is production-shaped and validates Krate install contracts against the executable Kubernetes package model, including Argo CD Application and Gitea backend surfaces. The repository includes a production-shaped controller image build, ingress values for the Next.js app, registry pull-secret support, and GitHub publishing lanes for GHCR images, chart artifacts, generated dist/example bundles, and AKS branch deployments.
@@ -0,0 +1,32 @@
1
+ # Krate Ontology
2
+
3
+ This tree turns the requirements in `docs/` into an implementation ontology for the Kubernetes-native forge MVP.
4
+
5
+ ## Reading order
6
+
7
+ 1. `world.md` - domain, actors, external systems, and assumptions.
8
+ 2. `problem-space.md` - jobs, risks, and failure modes Krate must solve.
9
+ 3. `solution-space.md` - architecture and MVP module mapping.
10
+ 4. `bounded-contexts.md` - ownership boundaries across control, data, identity, CI, hooks, UI, and operations.
11
+ 5. `resource-taxonomy.md` and `resource-contracts.md` - API kinds and lifecycle contracts.
12
+ 6. Workflow, policy, storage, event, CI, UI, operations, and validation files for executable acceptance gates.
13
+
14
+ ## Traceability model
15
+
16
+ Each ontology entry traces to at least one source document under `docs/`, one implementation module under `src/`, and one validation surface in `tests/` or `scripts/`.
17
+
18
+ | Ontology area | Source docs | Implementation | Validation |
19
+ | --- | --- | --- | --- |
20
+ | Control-plane resources | `docs/components/control-plane.md` | `src/resource-model.js`, `src/control-plane.js` | `tests/krate.test.js` |
21
+ | Identity and policy | `docs/components/identity-rbac-policy.md` | `src/identity-policy.js` | RBAC/admission tests |
22
+ | Git data plane | `docs/components/data-plane.md` | `src/data-plane.js` | Gitea backend tests |
23
+ | CI and runners | `docs/components/runners-ci.md` | `src/runners-ci.js` | Runner scheduler tests |
24
+ | Hooks and webhooks | `docs/components/hooks-events.md` | `src/hooks-events.js` | Webhook bus tests |
25
+ | Web UI flows | `docs/components/web-ui.md` | `src/web-ui.js` | smoke assertions |
26
+ | Operations and release | `docs/components/operations-publishing.md` | `src/operations.js` | build, smoke, doc coverage |
27
+
28
+ ## Completion criteria
29
+
30
+ - All resource kinds are classified as CRD-backed configuration or aggregated Postgres-backed records.
31
+ - All high-risk invariants are executable: RBAC, admission, storage boundaries, fork isolation, ref protection, webhook replay, UI YAML transparency, and backup/restore order.
32
+ - `npm run check` verifies build output, doc/ontology coverage, tests, and smoke flow.
@@ -0,0 +1,29 @@
1
+ # Bounded Contexts
2
+
3
+ ## Control plane
4
+
5
+ Owns resource verbs, storage routing, RBAC checks, admission decisions, audit records, status patches, lists, and watches. It depends on `resource-model` for kind classification and `identity-policy` for authorization and admission evaluation.
6
+
7
+ ## Data plane
8
+
9
+ Owns Gitea repository hosting, warm Git receive routing, object storage metadata, search indexing hooks, `RefPolicy`, and `BranchProtection` enforcement. It writes `Repository` config and emits Git events through the control-plane event stream.
10
+
11
+ ## Identity and policy
12
+
13
+ Owns OIDC identity mapping, Kubernetes groups, default RBAC roles, service-account scope profiles, trust tiers, admission policies, and audit/enforce rollout lifecycle.
14
+
15
+ ## Runners and CI
16
+
17
+ Owns `RunnerPool`, `Pipeline`, and `Job` resources, queue-depth scaling, fork isolation, cache configuration, and rerun/resume semantics.
18
+
19
+ ## Hooks and events
20
+
21
+ Owns outbound `WebhookSubscription` and `WebhookDelivery` resources, HMAC signing, retry/failure status, replay records, and the distinction between server-side Git hooks, outbound webhooks, and Kubernetes admission hooks.
22
+
23
+ ## Web UI
24
+
25
+ Owns resource-backed view models for dashboards, PR review, runner pool editing, webhook inspection, YAML previews, and excellent flows. It must not invent hidden state that cannot be mapped back to resources.
26
+
27
+ ## Operations and publishing
28
+
29
+ Owns install manifests, CRD/APIService publication, observability surfaces, backup/restore order, upgrade gates, smoke tests, and release readiness checks.
@@ -0,0 +1,32 @@
1
+ # Events and Hooks
2
+
3
+ Krate separates server-side Git hooks, outbound HTTP webhooks, Kubernetes admission hooks, and resource watch events.
4
+
5
+ ## Watch events
6
+
7
+ - Emitted on resource create/update/status mutations.
8
+ - Include resource kind, storage boundary, and audit context.
9
+ - Used by UI, controllers, and smoke assertions.
10
+
11
+ ## Server-side Git hooks
12
+
13
+ - Modeled through `RefPolicy` and `BranchProtection`.
14
+ - Enforced during receive-pack before protected writes complete.
15
+ - Must not mount broad secrets or shell out with ambient privilege.
16
+
17
+ ## Outbound webhooks
18
+
19
+ - Configured by `WebhookSubscription`.
20
+ - Materialized as durable `WebhookDelivery` records.
21
+ - Signed with HMAC SHA-256.
22
+ - Replay creates a new delivery attempt and keeps replay metadata.
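As a rough illustration of the signing and replay rules above, the sketch below uses Node's built-in `crypto` module. The `sha256=` prefix, the function names, and the delivery fields (`id`, `phase`, `replayOf`) are assumptions for the example, not Krate's documented wire format or resource schema.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Compute an HMAC SHA-256 signature over the raw request body.
// The "sha256=" prefix is an assumed convention, not taken from the source.
function signDelivery(secret, body) {
  return 'sha256=' + createHmac('sha256', secret).update(body).digest('hex');
}

// Receiver-side check, using a constant-time comparison.
function verifyDelivery(secret, body, signature) {
  const expected = Buffer.from(signDelivery(secret, body));
  const received = Buffer.from(signature);
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Replay creates a new delivery attempt and keeps a pointer to the original.
// Field names here are hypothetical.
function replayDelivery(original, newId) {
  return { ...original, id: newId, phase: 'Pending', replayOf: original.id };
}

const body = JSON.stringify({ event: 'push', ref: 'refs/heads/main' });
const signature = signDelivery('demo-secret', body);
console.log(verifyDelivery('demo-secret', body, signature));
console.log(replayDelivery({ id: 'd-1', phase: 'Failed', body }, 'd-2'));
```

Keeping the original record untouched and writing a new one on replay is what lets the delivery history stay inspectable.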
23
+
24
+ ## Admission hooks
25
+
26
+ - Validate Krate resources before storage.
27
+ - Support `audit` mode (warnings without blocking) and `enforce` mode (denial).
28
+ - Must expose actionable status when denied.
29
+
30
+ ## Observability
31
+
32
+ Every delivery and event should expose phase, latency or timestamp, response metadata where applicable, and enough context for replay or remediation.
@@ -0,0 +1,32 @@
1
+ # OAM and KubeVela Ontology Assimilation
2
+
3
+ Krate assimilates the Open Application Model as the application-delivery layer of the forge ontology. The OAM model gives Krate a vocabulary for app-centric delivery without replacing Kubernetes or Gitea.
4
+
5
+ ## Concepts
6
+
7
+ | OAM concept | Krate ontology role | Kubernetes resource surface |
8
+ | --- | --- | --- |
9
+ | Application | Deployable application attached to a repository, branch, PR, or release | `applications.core.oam.dev` |
10
+ | Component | Reusable workload or service module | `componentdefinitions.core.oam.dev` plus Application components |
11
+ | Workload type | Runtime implementation selected by a component definition | `workloaddefinitions.core.oam.dev` when installed by KubeVela |
12
+ | Trait | Operational behavior attached to a component | `traitdefinitions.core.oam.dev` plus Application traits |
13
+ | Policy | Delivery rule such as topology, override, health, or security | `policydefinitions.core.oam.dev` plus Application policies |
14
+ | Workflow step | Ordered delivery action such as deploy, suspend, approve, promote, rollback | `workflowstepdefinitions.core.oam.dev`, `workflows.core.oam.dev`, plus Application workflow |
15
+ | Application revision | Concrete release/revision materialized by KubeVela | `applicationrevisions.core.oam.dev` |
16
+ | Resource tracker | Applied-resource ownership graph maintained by KubeVela | cluster-scoped `resourcetrackers.core.oam.dev` |
17
+ | Scope | Logical grouping boundary | `scopedefinitions.core.oam.dev` when installed plus Application component `scopes` maps |
18
+
19
+ ## Assimilation Rules
20
+
21
+ - OAM `Application` is not a Git repository; it is a delivery object that references build/deploy intent derived from a repository.
22
+ - OAM components, workloads, traits, scopes, policies, and workflow steps are capability catalog entries, not hard-coded Krate forms.
23
+ - Krate UI must present simple forge tasks first: deploy from repo, preview PR, promote release, inspect rollout, rollback.
24
+ - Raw OAM YAML remains visible so operators can copy, review, and apply exact Kubernetes resources.
25
+ - KubeVela status, ApplicationRevisions, Workflows, Policies, and ResourceTrackers are authoritative for OAM delivery health; Krate may summarize but must not synthesize success.
26
+
27
+ ## Validation Expectations
28
+
29
+ - Chart render includes an Argo CD Application for KubeVela when enabled.
30
+ - `/api/controller` exposes discovered KubeVela definition, application, revision, policy, workflow, and resource-tracker counts when KubeVela CRDs are installed.
31
+ - UI has an Applications surface that names OAM Applications, Components, Workloads, Traits, Scopes, Policies, Workflow Steps, and KubeVela installation status.
32
+ - Repository pages show how a repo/PR maps to an OAM Application and preview workflow.
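To make the repo/PR to OAM Application mapping concrete, the sketch below builds a minimal preview `Application` manifest as a plain object. The `core.oam.dev/v1beta1` API version and the `webservice` component type come from KubeVela; the function name, naming scheme, and `krate.example/...` label keys are hypothetical and not defined anywhere in this diff.

```javascript
// Hypothetical mapping from a pull request to a preview OAM Application.
// Label keys and the naming scheme are illustrative assumptions.
function previewApplication({ repo, prNumber, image, namespace }) {
  return {
    apiVersion: 'core.oam.dev/v1beta1',
    kind: 'Application',
    metadata: {
      name: `${repo}-pr-${prNumber}`,
      namespace,
      labels: {
        'krate.example/repository': repo,
        'krate.example/pull-request': String(prNumber),
      },
    },
    spec: {
      // One component using KubeVela's built-in webservice type.
      components: [{ name: 'web', type: 'webservice', properties: { image } }],
    },
  };
}

const app = previewApplication({
  repo: 'demo',
  prNumber: 42,
  image: 'registry.example/demo:pr-42',
  namespace: 'previews',
});
console.log(JSON.stringify(app, null, 2));
```

Because the result is an ordinary resource object, it can be rendered as YAML for the operator, which fits the rule that raw OAM YAML remains visible.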
@@ -0,0 +1,25 @@
1
+ # Operations and Release
2
+
3
+ ## Install
4
+
5
+ Install manifests must include CRDs, APIService, controllers, Gitea backend, runner scheduler, webhook dispatcher, and web UI deployments or equivalent resources.
6
+
7
+ ## Observability
8
+
9
+ Operators need metrics and logs for API latency, storage boundary health, Git receive latency, queue depth, runner saturation, webhook delivery phase, replay counts, and admission denials.
10
+
11
+ ## Backup and restore
12
+
13
+ Backup covers CRDs/config, Postgres records, Gitea repositories, and object storage. Restore order is API/config, Postgres, Gitea repositories, object storage, then controllers. Validation includes listing resources, reading refs, opening a PR, and replaying webhooks.
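One way to make the restore order executable is a small check that a proposed plan lists the steps in exactly that sequence. The step identifiers and the `isValidRestorePlan` name below are hypothetical labels chosen for this sketch.

```javascript
// Restore order from the paragraph above; the string identifiers are illustrative.
const RESTORE_ORDER = [
  'api-config',
  'postgres',
  'gitea-repositories',
  'objects',
  'controllers',
];

// A plan is valid only if it contains every step, in the documented order.
function isValidRestorePlan(steps) {
  return (
    steps.length === RESTORE_ORDER.length &&
    steps.every((step, index) => step === RESTORE_ORDER[index])
  );
}

console.log(isValidRestorePlan(RESTORE_ORDER));
console.log(isValidRestorePlan(['postgres', 'api-config', 'gitea-repositories', 'objects', 'controllers']));
```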
14
+
15
+ ## Upgrade
16
+
17
+ Upgrades must preserve CRD compatibility and aggregated API availability, and must account for migrations, controller reconciliation, and rollback instructions.
18
+
19
+ ## Release gates
20
+
21
+ - Build produces `dist/krate-summary.json`.
22
+ - Documentation and ontology coverage pass.
23
+ - Unit acceptance tests pass.
24
+ - Smoke flow passes.
25
+ - Known limitations are explicit and not hidden in incomplete implementation paths.
@@ -0,0 +1,32 @@
1
+ # Personas and Actors
2
+
3
+ ## Developer
4
+
5
+ - Groups: `system:authenticated`, `krate:developers`.
6
+ - Can create and update `PullRequest`, `Issue`, `Review`, `Pipeline`, and `Job` records.
7
+ - Cannot create repositories or branch protection without repo-admin rights.
8
+ - Needs UI flows for PR review, failed CI inspection, webhook visibility, and YAML/resource transparency.
9
+
10
+ ## Repository admin
11
+
12
+ - Groups: `krate:repo-admins`.
13
+ - Can create repositories, branch protection, ref policies, webhook subscriptions, triage views, and selectors.
14
+ - Can replay webhook deliveries and manage repository-level governance.
15
+
16
+ ## Platform engineer
17
+
18
+ - Groups: `krate:platform-engineers`.
19
+ - Can perform all verbs on all Krate kinds.
20
+ - Owns install, upgrade, observability, runner pools, backup/restore, and release gates.
21
+
22
+ ## Controllers
23
+
24
+ - Use scoped service accounts.
25
+ - Patch status and reconcile desired state.
26
+ - Must not bypass admission-sensitive invariants unless explicitly modeled.
27
+
28
+ ## Runner jobs
29
+
30
+ - Use service-account scopes derived from trust tier.
31
+ - Trusted jobs may access configured caches and publication credentials.
32
+ - Untrusted fork jobs have no secrets and no cluster API mutation.
@@ -0,0 +1,33 @@
1
+ # Policies and Invariants
2
+
3
+ ## RBAC
4
+
5
+ - Users must be mapped from OIDC identity into Kubernetes-style groups.
6
+ - `system:authenticated` can read and watch but cannot mutate privileged resources.
7
+ - Developers can mutate review and CI records.
8
+ - Repo admins can mutate repository governance resources.
9
+ - Platform engineers can mutate all Krate resources.
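The group rules above can be sketched as a small authorization check. This is not the contents of `src/identity-policy.js`; the verb names, the `isAllowed` helper, and the assumption that every mapped user carries `system:authenticated` are illustrative. Only kinds named elsewhere in these docs are listed.

```javascript
// Assumed Kubernetes-style read verbs.
const READ_VERBS = new Set(['get', 'list', 'watch']);

// Kinds each group may mutate, per the personas and RBAC lists in these docs.
const MUTABLE_KINDS = {
  'krate:developers': new Set(['PullRequest', 'Issue', 'Review', 'Pipeline', 'Job']),
  'krate:repo-admins': new Set(['Repository', 'BranchProtection', 'RefPolicy', 'WebhookSubscription']),
};

function isAllowed(groups, verb, kind) {
  // Platform engineers can perform all verbs on all Krate kinds.
  if (groups.includes('krate:platform-engineers')) return true;
  // Assumption: reads only require the system:authenticated group.
  if (READ_VERBS.has(verb)) return groups.includes('system:authenticated');
  // Mutations require a group that owns the kind.
  return groups.some((group) => MUTABLE_KINDS[group]?.has(kind) ?? false);
}

const developer = ['system:authenticated', 'krate:developers'];
console.log(isAllowed(developer, 'update', 'PullRequest'));
console.log(isAllowed(developer, 'create', 'Repository'));
```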
10
+
11
+ ## Admission rollout
12
+
13
+ - Policies support `audit` mode for warnings without blocking.
14
+ - Policies support `enforce` mode for fail-closed denial.
15
+ - Rollout sequence is preview, audit, then enforce.
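The difference between the two modes can be shown with a minimal evaluator. The `admit` function and the policy shape (`mode`, `check`) are hypothetical; the source does not define what a preview step evaluates to, so this sketch rejects any mode other than `audit` or `enforce`.

```javascript
// Hypothetical policy shape: { mode: 'audit' | 'enforce', check(resource) -> string | null }.
function admit(policy, resource) {
  const violation = policy.check(resource);
  if (violation === null) return { allowed: true, warnings: [] };

  switch (policy.mode) {
    case 'audit':
      // Audit mode records a warning but lets the write through.
      return { allowed: true, warnings: [violation] };
    case 'enforce':
      // Enforce mode fails closed and reports why.
      return { allowed: false, warnings: [], reason: violation };
    default:
      throw new Error(`unsupported policy mode: ${policy.mode}`);
  }
}

const requireOwner = (mode) => ({
  mode,
  check: (resource) => (resource.owner ? null : 'missing owner'),
});

console.log(admit(requireOwner('audit'), { kind: 'Repository' }));
console.log(admit(requireOwner('enforce'), { kind: 'Repository' }));
```

Returning the violation as `reason` on denial is one way to provide actionable status to the caller.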
16
+
17
+ ## Isolation
18
+
19
+ - Fork PR jobs are untrusted.
20
+ - Untrusted jobs receive no secrets and no cluster API access.
21
+ - Trusted jobs receive only explicitly scoped capabilities.
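A sketch of how the isolation rules might translate into per-job capabilities follows. The job fields (`fromFork`, `grantedSecrets`, `grantedClusterAccess`) and the `jobCapabilities` helper are hypothetical names used only for illustration.

```javascript
// Derive a job's capabilities from its trust tier. Field names are illustrative.
function jobCapabilities(job) {
  if (job.fromFork) {
    // Fork PR jobs are untrusted: no secrets, no cluster API access.
    return { trustTier: 'untrusted', secrets: [], clusterApiAccess: false };
  }
  // Trusted jobs get only what was explicitly granted, nothing ambient.
  return {
    trustTier: 'trusted',
    secrets: [...(job.grantedSecrets ?? [])],
    clusterApiAccess: job.grantedClusterAccess === true,
  };
}

console.log(jobCapabilities({ fromFork: true, grantedSecrets: ['publish-token'] }));
console.log(jobCapabilities({ fromFork: false, grantedSecrets: ['cache-key'] }));
```

The untrusted branch ignores any grants on the job, so a misconfigured fork job still receives nothing.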
22
+
23
+ ## Git governance
24
+
25
+ - `BranchProtection` can require PR flow for protected refs.
26
+ - `RefPolicy` can deny internal refs and unsafe updates.
27
+ - Receive-pack must emit events that correlate to the repository and the Gitea backend.
28
+
29
+ ## Audit and transparency
30
+
31
+ - Every mutation produces an audit entry with actor, groups, operation, resource, warnings, and allowed status.
32
+ - UI actions must expose equivalent YAML/resource state.
33
+ - Release gates must include docs coverage, tests, build, and smoke.
@@ -0,0 +1,30 @@
1
+ # Problem-Space Ontology
2
+
3
+ Krate exists because teams want forge workflows that inherit Kubernetes governance instead of re-implementing identity, policy, audit, deployment, and operations outside the cluster model.
4
+
5
+ ## Jobs to be done
6
+
7
+ - Host repositories without the scaling failure of one PVC per repository.
8
+ - Review and merge pull requests with branch protection and status checks.
9
+ - Run CI on trusted and untrusted work while preserving secret boundaries.
10
+ - Govern refs, admission, runner access, and webhook behavior through resources.
11
+ - Inspect and replay webhook deliveries without hidden queue state.
12
+ - Operate install, upgrade, backup, restore, and release processes with visible gates.
13
+ - Triage cross-repository work through selectors and views.
14
+
15
+ ## Failure modes to prevent
16
+
17
+ - **etcd overload** from comments, jobs, logs, webhook attempts, or high-cardinality records.
18
+ - **Cold Git writes** from unprepared receive-pack paths.
19
+ - **Token sprawl** from CI jobs using broad personal or cluster credentials.
20
+ - **Fork leakage** where untrusted PRs can read secrets or mutate cluster resources.
21
+ - **Opaque hooks** where failed webhook deliveries cannot be inspected or replayed.
22
+ - **Non-auditable UI** where a click cannot be mapped to a resource mutation.
23
+ - **Operational drift** where manifests, backup order, and release gates are not tested.
24
+
25
+ ## Success criteria
26
+
27
+ - Every mutation has an actor, verb, resource, storage boundary, admission decision, audit entry, and watch event.
28
+ - Every resource family has a clear owner context and lifecycle.
29
+ - Every excellent flow has a YAML/resource equivalent.
30
+ - Every release candidate passes build, doc coverage, unit acceptance tests, and smoke assertions.