@a5c-ai/krate 5.0.1-staging.f672fe79b
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/Dockerfile +29 -0
- package/README.md +183 -0
- package/bin/krate-demo.mjs +23 -0
- package/bin/krate-server.mjs +14 -0
- package/dist/krate-controller-ui.json +2407 -0
- package/dist/krate-lifecycle.json +201 -0
- package/dist/krate-runtime-snapshot.json +2955 -0
- package/dist/krate-summary.json +687 -0
- package/docs/README.md +61 -0
- package/docs/agents/README.md +83 -0
- package/docs/agents/acceptance-test-matrix.md +193 -0
- package/docs/agents/agent-mux-adapter-contract.md +167 -0
- package/docs/agents/agent-mux-source-map.md +310 -0
- package/docs/agents/agent-run-memory-import-spec.md +256 -0
- package/docs/agents/agent-stack-management-spec.md +421 -0
- package/docs/agents/api-contract-spec.md +309 -0
- package/docs/agents/artifacts-writeback-spec.md +145 -0
- package/docs/agents/chart-packaging-spec.md +128 -0
- package/docs/agents/ci-orchestration-spec.md +140 -0
- package/docs/agents/context-assembly-spec.md +219 -0
- package/docs/agents/controller-reconciliation-spec.md +255 -0
- package/docs/agents/crd-schema-spec.md +315 -0
- package/docs/agents/decision-log-open-questions.md +169 -0
- package/docs/agents/developer-implementation-checklist.md +329 -0
- package/docs/agents/dispatching-design.md +262 -0
- package/docs/agents/glossary.md +66 -0
- package/docs/agents/implementation-blueprint.md +324 -0
- package/docs/agents/implementation-rollout-slices.md +251 -0
- package/docs/agents/memory-context-integration-spec.md +194 -0
- package/docs/agents/memory-ontology-schema-spec.md +253 -0
- package/docs/agents/memory-operations-runbook.md +121 -0
- package/docs/agents/mvp-vertical-slice-spec.md +146 -0
- package/docs/agents/observability-audit-spec.md +265 -0
- package/docs/agents/operator-runbook.md +174 -0
- package/docs/agents/org-memory-api-payload-examples.md +333 -0
- package/docs/agents/org-memory-controller-sequence-spec.md +181 -0
- package/docs/agents/org-memory-e2e-fixture-plan.md +161 -0
- package/docs/agents/org-memory-ui-implementation-map.md +114 -0
- package/docs/agents/org-memory-vertical-slice-spec.md +168 -0
- package/docs/agents/org-resource-model-delta-spec.md +111 -0
- package/docs/agents/org-route-resource-model-spec.md +183 -0
- package/docs/agents/org-scoping-namespace-spec.md +114 -0
- package/docs/agents/rbac-secrets-management-spec.md +406 -0
- package/docs/agents/repository-page-integration-spec.md +255 -0
- package/docs/agents/resource-contract-examples.md +808 -0
- package/docs/agents/resource-relationship-map.md +190 -0
- package/docs/agents/security-threat-model.md +188 -0
- package/docs/agents/shared-memory-company-brain-spec.md +358 -0
- package/docs/agents/storage-migration-spec.md +168 -0
- package/docs/agents/subagent-orchestration-spec.md +152 -0
- package/docs/agents/system-overview.md +88 -0
- package/docs/agents/tools-mcp-skills-spec.md +189 -0
- package/docs/agents/traceability-matrix.md +79 -0
- package/docs/agents/ui-flow-spec.md +211 -0
- package/docs/agents/ui-ux-system-spec.md +426 -0
- package/docs/agents/workspace-lifecycle-spec.md +166 -0
- package/docs/architecture-spec.md +78 -0
- package/docs/components/control-plane.md +78 -0
- package/docs/components/data-plane.md +69 -0
- package/docs/components/hooks-events.md +67 -0
- package/docs/components/identity-rbac-policy.md +73 -0
- package/docs/components/kubevela-oam.md +70 -0
- package/docs/components/operations-publishing.md +81 -0
- package/docs/components/runners-ci.md +66 -0
- package/docs/components/web-ui.md +94 -0
- package/docs/external/README.md +47 -0
- package/docs/external/bidirectional-sync-design.md +134 -0
- package/docs/external/cicd-interface.md +64 -0
- package/docs/external/external-backend-controllers.md +170 -0
- package/docs/external/external-backend-crds.md +234 -0
- package/docs/external/external-backend-ui-spec.md +151 -0
- package/docs/external/external-backend-ux-flows.md +115 -0
- package/docs/external/external-object-mapping.md +125 -0
- package/docs/external/git-forge-interface.md +68 -0
- package/docs/external/github-integration-design.md +151 -0
- package/docs/external/issue-tracking-interface.md +66 -0
- package/docs/external/provider-capability-manifests.md +204 -0
- package/docs/external/provider-catalog.md +139 -0
- package/docs/external/provider-rollout-testing.md +78 -0
- package/docs/external/research-results.md +48 -0
- package/docs/external/security-auth-permissions.md +81 -0
- package/docs/external/sync-state-machines.md +108 -0
- package/docs/external/unified-external-backend-model.md +107 -0
- package/docs/external/user-facing-changes.md +67 -0
- package/docs/gaps.md +161 -0
- package/docs/install.md +94 -0
- package/docs/krate-design.md +334 -0
- package/docs/local-minikube.md +55 -0
- package/docs/ontology/README.md +32 -0
- package/docs/ontology/bounded-contexts.md +29 -0
- package/docs/ontology/events-and-hooks.md +32 -0
- package/docs/ontology/oam-kubevela.md +32 -0
- package/docs/ontology/operations-and-release.md +25 -0
- package/docs/ontology/personas-and-actors.md +32 -0
- package/docs/ontology/policies-and-invariants.md +33 -0
- package/docs/ontology/problem-space.md +30 -0
- package/docs/ontology/resource-contracts.md +40 -0
- package/docs/ontology/resource-taxonomy.md +42 -0
- package/docs/ontology/runners-and-ci.md +29 -0
- package/docs/ontology/solution-space.md +24 -0
- package/docs/ontology/storage-and-data-boundaries.md +29 -0
- package/docs/ontology/validation-matrix.md +24 -0
- package/docs/ontology/web-ui-excellent-flows.md +32 -0
- package/docs/ontology/workflows.md +39 -0
- package/docs/ontology/world.md +35 -0
- package/docs/product-requirements.md +62 -0
- package/docs/roadmap-mvp.md +87 -0
- package/docs/system-requirements.md +90 -0
- package/docs/tests/README.md +53 -0
- package/docs/tests/agent-qa-plan.md +63 -0
- package/docs/tests/browser-ui-tests.md +62 -0
- package/docs/tests/ci-quality-gates.md +48 -0
- package/docs/tests/coverage-model.md +64 -0
- package/docs/tests/e2e-scenario-tests.md +53 -0
- package/docs/tests/fixtures-test-data.md +63 -0
- package/docs/tests/observability-reliability-tests.md +54 -0
- package/docs/tests/product-test-matrix.md +145 -0
- package/docs/tests/qa-adoption-roadmap.md +130 -0
- package/docs/tests/qa-automation-plan.md +101 -0
- package/docs/tests/security-compliance-tests.md +57 -0
- package/docs/tests/test-framework-tools.md +88 -0
- package/docs/tests/test-suite-layout.md +121 -0
- package/docs/tests/unit-integration-tests.md +48 -0
- package/docs/todo-kyverno +714 -0
- package/docs/user-stories.md +78 -0
- package/examples/minikube-demo.yaml +190 -0
- package/examples/oam-application.yaml +23 -0
- package/examples/policy-kyverno-pr-title.yaml +18 -0
- package/package.json +63 -0
- package/scripts/build.mjs +29 -0
- package/scripts/setup-minikube.mjs +65 -0
- package/scripts/smoke.mjs +37 -0
- package/scripts/validate-doc-coverage.mjs +152 -0
- package/scripts/validate-package.mjs +93 -0
- package/scripts/validate-ui.mjs +207 -0
- package/src/agent-approval-controller.js +123 -0
- package/src/agent-context-bundles.js +242 -0
- package/src/agent-dispatch-controller.js +86 -0
- package/src/agent-mux-client.js +280 -0
- package/src/agent-permission-review.js +162 -0
- package/src/agent-stack-controller.js +296 -0
- package/src/agent-trigger-controller.js +108 -0
- package/src/api-controller.js +206 -0
- package/src/argocd-gitops.js +43 -0
- package/src/auth.js +265 -0
- package/src/component-catalog.js +41 -0
- package/src/control-plane.js +136 -0
- package/src/controller-client.js +38 -0
- package/src/controller-ui.js +538 -0
- package/src/data-plane.js +178 -0
- package/src/gitea-backend.js +95 -0
- package/src/handoff.js +98 -0
- package/src/hooks-events.js +63 -0
- package/src/http-server.js +151 -0
- package/src/identity-policy.js +86 -0
- package/src/index.js +30 -0
- package/src/kubernetes-controller.js +812 -0
- package/src/kubernetes-resource-gateway.js +48 -0
- package/src/operations.js +112 -0
- package/src/resource-model.js +203 -0
- package/src/runners-ci.js +48 -0
- package/src/runtime.js +196 -0
- package/src/web-ui.js +40 -0
- package/tests/agent-approval-controller.test.js +173 -0
- package/tests/agent-context-bundles.test.js +278 -0
- package/tests/agent-dispatch-controller.test.js +176 -0
- package/tests/agent-mux-client.test.js +204 -0
- package/tests/agent-permission-review.test.js +209 -0
- package/tests/agent-resources.test.js +212 -0
- package/tests/agent-stack-controller.test.js +221 -0
- package/tests/agent-trigger-controller.test.js +211 -0
- package/tests/deployment.test.js +395 -0
- package/tests/e2e/lifecycle.test.js +117 -0
- package/tests/krate.test.js +727 -0
|
@@ -0,0 +1,334 @@
|
|
|
1
|
+
# Krate
|
|
2
|
+
|
|
3
|
+
**A Kubernetes-native Git forge where repos, PRs, CI, and policy share one identity model, one RBAC system, and one declarative API.**
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## TL;DR
|
|
8
|
+
|
|
9
|
+
Most "Kubernetes-native" forges are monoliths in a Helm chart. Krate is different: it extends the Kubernetes API itself. Pull requests, issues, pipelines, and runners are real Kubernetes resources, governed by native RBAC, queryable with `kubectl`, and policy-controlled by the existing admission-webhook ecosystem (Kyverno, Gatekeeper).
|
|
10
|
+
|
|
11
|
+
The bet: platform engineering teams already run Argo, Crossplane, ARC, and Kyverno. A forge that *natively composes* with that stack — instead of bolting on integrations — eliminates an entire layer of glue, an entire RBAC system, and an entire class of CVEs.
|
|
12
|
+
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
## What was wrong with the naive spec
|
|
16
|
+
|
|
17
|
+
The original "K8s-native forge" idea has three architectural traps that kill it before MVP:
|
|
18
|
+
|
|
19
|
+
**1. etcd is not your database.** Storing every issue, PR, and comment as a CRD sounds elegant, but etcd has a 1.5 MB per-object limit and degrades past ~8 GB total. A medium-sized org (10k issues × 100 comments × 2 KB ≈ 2 GB of comments alone, which already reaches etcd's default 2 GB storage quota) can take the cluster's control plane down. CRDs are an *API contract*, not a storage engine.
|
|
20
|
+
|
|
21
|
+
**2. One PVC per repository doesn't scale.** AWS EBS allows ~25 volumes per node; GCE PD also caps attachments per VM. A single node cannot host 1,000 repos on dedicated PVCs: at ~25 volumes per node, that takes about 40 nodes for volume attachments alone. The CSI story works only with ReadWriteMany filesystems or a Gitea-managed RWX layout.
|
|
22
|
+
|
|
23
|
+
**3. Scale-to-zero on `git-receive-pack` breaks real workflows.** Cold-starting on `git push` adds 3–8s of latency, and webhook fan-out doesn't always retry cleanly. *Pull* is the path that benefits from elastic scaling, not push.
|
|
24
|
+
|
|
25
|
+
The refined design fixes all three.
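
The sizing arithmetic behind traps 1 and 2 can be checked in a few lines. This is a back-of-envelope sketch that uses only the figures quoted above (10k issues, 100 comments each, 2 KB per comment, ~25 volumes per node); the function names are illustrative and not part of the Krate package.

```javascript
// Back-of-envelope sizing for traps 1 and 2, using the figures quoted above.
// Function names are illustrative; nothing here is part of the Krate package.
const KB = 1000;
const GB = 1000 ** 3;

// Total bytes if every comment were stored as its own etcd object.
function commentFootprintBytes({ issues, commentsPerIssue, bytesPerComment }) {
  return issues * commentsPerIssue * bytesPerComment;
}

// Minimum nodes needed when each repo needs its own attached volume.
function nodesForDedicatedVolumes(repoCount, volumesPerNode) {
  return Math.ceil(repoCount / volumesPerNode);
}

const footprint = commentFootprintBytes({
  issues: 10_000,
  commentsPerIssue: 100,
  bytesPerComment: 2 * KB,
});
const nodes = nodesForDedicatedVolumes(1_000, 25);

console.log(`comment payload: ${footprint / GB} GB`);
console.log(`nodes needed for 1,000 one-PVC repos: ${nodes}`);
```

The comment payload excludes object metadata, revisions, and issue bodies, so the real footprint would be larger.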
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## Architecture: control plane / data plane
|
|
30
|
+
|
|
31
|
+
```mermaid
|
|
32
|
+
graph TB
|
|
33
|
+
subgraph Clients
|
|
34
|
+
CLI["git CLI / kubectl"]
|
|
35
|
+
UI["Next.js UI"]
|
|
36
|
+
CI["CI Runners (ARC/Tekton)"]
|
|
37
|
+
end
|
|
38
|
+
|
|
39
|
+
subgraph CP["Control Plane (kube-apiserver)"]
|
|
40
|
+
AAS["Aggregated API Server<br/>PullRequest, Issue, Review,<br/>Pipeline, Job, WebhookDelivery"]
|
|
41
|
+
CRD["CRDs<br/>Repository, WebhookSubscription,<br/>RefPolicy, BranchProtection,<br/>RunnerPool, View, Selector"]
|
|
42
|
+
AAS -.backed by.-> PG[("Postgres<br/>social data")]
|
|
43
|
+
CRD -.backed by.-> ETCD[("etcd<br/>config only")]
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
subgraph DP["Data Plane"]
|
|
47
|
+
ROUTER["Git Protocol Router<br/>(smart-HTTP + SSH)"]
|
|
48
|
+
S1["Gitea pod 0"]
|
|
49
|
+
S2["Gitea pod 1"]
|
|
50
|
+
S3["Gitea pod N..."]
|
|
51
|
+
ZOEKT["Zoekt Code Search"]
|
|
52
|
+
ROUTER --> S1
|
|
53
|
+
ROUTER --> S2
|
|
54
|
+
ROUTER --> S3
|
|
55
|
+
S1 -.indexed by.-> ZOEKT
|
|
56
|
+
end
|
|
57
|
+
|
|
58
|
+
RWX[("RWX Volumes (EFS/Ceph)")]
|
|
59
|
+
OBJ[("Object Storage (LFS, archives)")]
|
|
60
|
+
|
|
61
|
+
REPO_OP["Repository Operator (Crossplane)"]
|
|
62
|
+
PR_OP["PR Operator → preview env per PR"]
|
|
63
|
+
ARGO["ArgoCD ApplicationSet"]
|
|
64
|
+
|
|
65
|
+
CLI --> ROUTER
|
|
66
|
+
UI --> AAS
|
|
67
|
+
UI --> ROUTER
|
|
68
|
+
CI --> ROUTER
|
|
69
|
+
S1 --> RWX
|
|
70
|
+
S2 --> RWX
|
|
71
|
+
S3 --> RWX
|
|
72
|
+
S1 --> OBJ
|
|
73
|
+
AAS --> REPO_OP
|
|
74
|
+
AAS --> PR_OP
|
|
75
|
+
PR_OP --> ARGO
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
### Control plane
|
|
79
|
+
|
|
80
|
+
Exposes a Kubernetes-style API. `Repository`, `WebhookSubscription`, `RefPolicy`, `BranchProtection`, `RunnerPool`, `View`, and `Selector` are CRDs (low-cardinality, declarative). `PullRequest`, `Issue`, `Review`, `Pipeline`, `Job`, and `WebhookDelivery` are served by an **Aggregated API Server** (like `metrics-server`) — same kubectl semantics, but backed by Postgres instead of etcd. This is the single highest-stakes architectural decision.
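
The CRD versus Aggregated API split described above amounts to a lookup from kind to backing store. A minimal sketch, assuming only the kinds listed in this section; `backingStoreFor` is a hypothetical helper, not a function exported by the package.

```javascript
// Kinds listed in the control-plane paragraph, grouped by where they are stored.
const CRD_KINDS = new Set([
  "Repository", "WebhookSubscription", "RefPolicy", "BranchProtection",
  "RunnerPool", "View", "Selector",
]);
const AGGREGATED_KINDS = new Set([
  "PullRequest", "Issue", "Review", "Pipeline", "Job", "WebhookDelivery",
]);

// Returns the backing store for a kind; throws for kinds this section does not list.
function backingStoreFor(kind) {
  if (CRD_KINDS.has(kind)) return "etcd";
  if (AGGREGATED_KINDS.has(kind)) return "postgres";
  throw new Error(`unknown kind: ${kind}`);
}

console.log(backingStoreFor("Repository"));
console.log(backingStoreFor("PullRequest"));
```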
|
|
81
|
+
|
|
82
|
+
### Data plane
|
|
83
|
+
|
|
84
|
+
Three separable services. The *Gitea backend* terminates smart-HTTP and SSH and owns repository hosting, deploy keys, branch protection, collaborators/teams, and webhooks. Krate controllers reconcile `Repository` resources into Gitea integration plans, while *Code Search* (Zoekt) runs as a separate indexing service over repository events.
|
|
85
|
+
|
|
86
|
+
### Scaling profile
|
|
87
|
+
|
|
88
|
+
- `git-upload-pack` (reads): HPA on request rate, 1→N.
|
|
89
|
+
- `git-receive-pack` (writes): warm minimum 1 Gitea backend replica; KEDA bursts on backlog.
|
|
90
|
+
- Search indexers: scale independently.
|
|
91
|
+
|
|
92
|
+
### Identity
|
|
93
|
+
|
|
94
|
+
OIDC for humans, federated to K8s `User`/`Group`. For CI, **Workload Identity**: runner pods get a projected ServiceAccount token, scoped to `(repo, ref, pipeline_id)`. **No PATs, ever.** Want to push to a registry? Create a RoleBinding that grants the SA access to the registry. The CI system has no special privileges; it has whatever you grant the SA.
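
The `(repo, ref, pipeline_id)` scoping can be pictured as a strict equality check between the token's claims and the request. This is a sketch under assumptions: the claim names (`repo`, `ref`, `pipelineId`) and the request shape are hypothetical, since the token format is not specified here.

```javascript
// Hypothetical claim names; the real token format is not specified in this document.
// A request is in scope only if all three dimensions match exactly.
function isRequestInScope(claims, request) {
  return (
    claims.repo === request.repo &&
    claims.ref === request.ref &&
    claims.pipelineId === request.pipelineId
  );
}

const claims = { repo: "acme/api", ref: "refs/heads/main", pipelineId: "4127" };

console.log(isRequestInScope(claims, { ...claims }));
console.log(isRequestInScope(claims, { ...claims, ref: "refs/heads/other" }));
```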
|
|
95
|
+
|
|
96
|
+
---
|
|
97
|
+
|
|
98
|
+
## Runners: a first-class system
|
|
99
|
+
|
|
100
|
+
ARC is the *executor*, not the runner system. Krate has its own runner abstraction that composes with ARC, Tekton, or Buildkite Agent underneath. Identity, caching, cost attribution, and untrusted-code isolation are forge concerns.
|
|
101
|
+
|
|
102
|
+
### Resources
|
|
103
|
+
|
|
104
|
+
- **`RunnerPool`** (CRD): image, resources, node selector, scaling policy, allowed repos, **trust tier** (`trusted` | `untrusted`), cache backend.
|
|
105
|
+
- **`Pipeline`** (Aggregated API): a single CI invocation. References a `RunnerPool` and a workflow file.
|
|
106
|
+
- **`Job`** (Aggregated API): a single step within a pipeline; owns one runner pod.
|
|
107
|
+
|
|
108
|
+
### Four problems the model fixes structurally
|
|
109
|
+
|
|
110
|
+
**Cold start vs cost.** Pools have `warmReplicas` and `maxReplicas`. KEDA scales between them on queue depth. Trusted pools default to warm=2; untrusted (fork PRs) default to warm=0 — pay cold-start latency for safety.
|
|
111
|
+
|
|
112
|
+
**Untrusted code execution.** A PR from a fork *must* run in an `untrusted` pool with a ServiceAccount that has zero secrets and no cluster API access. Enforced by an admission policy on `Pipeline` create — the controller refuses to schedule untrusted code on a trusted pool. Most common security incident in CI; making it un-bypassable kills the bug class.
|
|
113
|
+
|
|
114
|
+
**Caching.** BuildKit cache mount per pool, backed by S3. Shared across runs of the same repo, never across pools. No "is my cache poisoned" debugging because cache scope is structural.
|
|
115
|
+
|
|
116
|
+
**Identity.** No PATs. Each Job pod gets a projected ServiceAccount token scoped to `(repo, ref, pipeline_id)`. The CI system has whatever privileges its SA has — nothing more.
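
The first two rules above (scaling between `warmReplicas` and `maxReplicas`, and the fork-PR restriction) can be sketched as pure functions. The field names `fromFork` and `trustTier` and the `jobsPerReplica` knob are assumptions made for illustration; in the design itself KEDA and an admission policy do this work.

```javascript
// Default warm replicas by trust tier, as stated above.
const DEFAULT_WARM = { trusted: 2, untrusted: 0 };

// Clamp a queue-driven target between the pool's warm floor and max ceiling.
// `jobsPerReplica` is an assumed tuning knob, not a documented field.
function desiredReplicas({ queueDepth, warmReplicas, maxReplicas, jobsPerReplica = 1 }) {
  const wanted = Math.ceil(queueDepth / jobsPerReplica);
  return Math.min(maxReplicas, Math.max(warmReplicas, wanted));
}

// Admission rule: a pipeline from a fork may only run on an untrusted pool.
// `fromFork` and `trustTier` are hypothetical field names.
function admitPipeline({ fromFork }, { trustTier }) {
  if (fromFork && trustTier !== "untrusted") {
    return { allowed: false, reason: "fork PRs must run in an untrusted pool" };
  }
  return { allowed: true };
}

console.log(desiredReplicas({ queueDepth: 5, warmReplicas: DEFAULT_WARM.trusted, maxReplicas: 10 }));
console.log(admitPipeline({ fromFork: true }, { trustTier: "trusted" }));
```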
|
|
117
|
+
|
|
118
|
+
### Runners UI
|
|
119
|
+
|
|
120
|
+
Three screens carry 90% of the weight:
|
|
121
|
+
|
|
122
|
+
**Pool dashboard** (org-level). Live grid of pools: warm/total replicas, queue depth, p50/p95 wait time, last-hour cost. Click → spec, recent runs, scaling timeline. Killer feature: a *"Why is this slow?"* link that surfaces the bottleneck (queue saturation? image pull? node pressure?) by querying pod events.
|
|
123
|
+
|
|
124
|
+
**Live run view** (per pipeline) — the screen people live in:
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
128
|
+
│ Pipeline #4127 · feature/auth-refactor · ⏱ 2m 14s │
|
|
129
|
+
│ [Cancel] [Rerun failed] [Rerun from step ▾] [</> YAML] │
|
|
130
|
+
├──────────────────┬──────────────────────────────────────────────┤
|
|
131
|
+
│ Steps │ Logs: build > test > integration │
|
|
132
|
+
│ │ │
|
|
133
|
+
│ ✓ checkout │ + go test ./... │
|
|
134
|
+
│ ✓ deps │ ok github.com/krate/api 0.142s │
|
|
135
|
+
│ ✓ build │ ok github.com/krate/auth 0.318s │
|
|
136
|
+
│ ⟳ test │ --- FAIL: TestTokenExchange (0.04s) │
|
|
137
|
+
│ ↳ unit │ auth_test.go:142: expected 200, got 401 │
|
|
138
|
+
│ ↳ integration │ FAIL github.com/krate/auth │
|
|
139
|
+
│ ◯ deploy │ │
|
|
140
|
+
│ ◯ notify │ [📋 Copy failure] [🔍 Find similar runs] │
|
|
141
|
+
│ │ │
|
|
142
|
+
│ Pod: krate- │ ────── Live tail ────── │
|
|
143
|
+
│ runner-trusted- │ │
|
|
144
|
+
│ 7c4f9 → │ │
|
|
145
|
+
│ [kubectl logs] │ │
|
|
146
|
+
└──────────────────┴──────────────────────────────────────────────┘
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
Log streaming uses the K8s pod `log` subresource with `follow=true`, piped through SSE. *"Find similar runs"* is a label-selector query: `kubectl get pipelines -l failure.signature=auth_test-142` (label values cannot contain `:`). *"Rerun from step"* creates a new `Pipeline` with a `resumeFrom` field; the controller skips earlier steps and rehydrates cache.
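
One way to produce the `failure.signature` label is to derive it from the failing log line. The sketch below assumes Go-style `file.go:line:` output as in the run view above; the helper names are hypothetical. Kubernetes label values allow only alphanumerics, `-`, `_`, and `.`, so the file and line are joined with `-` instead of `:`.

```javascript
// Derive a label-safe failure signature from a Go test log line such as
// "auth_test.go:142: expected 200, got 401". Kubernetes label values may
// only contain alphanumerics, '-', '_' and '.', so '-' joins file and line.
function failureSignature(logLine) {
  const match = /([A-Za-z0-9_]+)\.go:(\d+):/.exec(logLine);
  return match ? `${match[1]}-${match[2]}` : null;
}

// Build the selector string used by "Find similar runs".
function similarRunsSelector(logLine) {
  const signature = failureSignature(logLine);
  return signature ? `failure.signature=${signature}` : null;
}

console.log(similarRunsSelector("    auth_test.go:142: expected 200, got 401"));
```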
|
|
150
|
+
|
|
151
|
+
**Pool editor.** Split view: left = form (image, resources, scaling), right = the YAML it produces, live-updating. Save shows "this is equivalent to `kubectl apply -f -`" with the manifest. Same screen offers *"Save to repo"* — opens a PR against your platform-config repo.
|
|
152
|
+
|
|
153
|
+
---
|
|
154
|
+
|
|
155
|
+
## Hooks: three layers, not one
|
|
156
|
+
|
|
157
|
+
"Hooks" gets conflated. Krate has three distinct hook surfaces, each with a different mechanism:
|
|
158
|
+
|
|
159
|
+
### Layer 1 — Server-side Git hooks (`pre-receive`, `update`, `post-receive`)
|
|
160
|
+
|
|
161
|
+
Run *inside* the receive-pack process; can't be CRDs (Git doesn't know what etcd is). Model: `RefPolicy` CRD declares the rule (require commit signing, block force-push to main, require linear history). A controller compiles policies into a config map mounted into Gitea backend pods; receive-pack evaluates them in-process. Custom hooks run in a sandboxed **WASM runtime** — no shelling out, no supply-chain risk.
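
To make the in-process evaluation concrete, here is a minimal sketch of checking one ref update against a set of enabled rules. The rule names mirror the examples in this document; the update fields (`ref`, `forced`, `allSigned`) and the rule table are assumptions, not the compiled format Krate actually mounts into Gitea pods.

```javascript
// Each rule returns null when satisfied, or a message describing the violation.
// The update fields (ref, forced, allSigned) are hypothetical.
const RULES = {
  "require-signed-commits": (u) => (u.allSigned ? null : "push contains unsigned commits"),
  "block-force-push-main": (u) =>
    u.forced && u.ref === "refs/heads/main" ? "force-push to main is blocked" : null,
};

// Evaluate every enabled rule; unknown rule names fail closed.
function evaluateRefUpdate(enabledRules, update) {
  const violations = [];
  for (const name of enabledRules) {
    const check = RULES[name];
    const message = check ? check(update) : "unknown rule";
    if (message !== null) violations.push(`${name}: ${message}`);
  }
  return { accepted: violations.length === 0, violations };
}

const enabled = ["require-signed-commits", "block-force-push-main"];
console.log(evaluateRefUpdate(enabled, { ref: "refs/heads/main", forced: true, allSigned: true }));
```

Failing closed on an unknown rule name is a design choice in this sketch: a policy that cannot be evaluated rejects the push.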
|
|
162
|
+
|
|
163
|
+
### Layer 2 — Outbound webhooks (HTTP delivery to Slack, Jenkins, etc.)
|
|
164
|
+
|
|
165
|
+
`WebhookSubscription` CRD per repo or org. Delivery via NATS JetStream (K8s-native, durable queue) with exponential backoff and HMAC signing. Every delivery becomes a `WebhookDelivery` resource — queryable, replayable, observable through the same `kubectl get` machinery as everything else.
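
The two delivery mechanics named above, exponential backoff and HMAC signing, can be sketched with Node's built-in `crypto` module. The base delay, cap, and hex encoding are assumptions chosen for illustration; the document does not specify Krate's actual retry schedule or signature header format.

```javascript
const crypto = require("node:crypto");

// Delay before retry `attempt` (0-based): base * 2^attempt, capped.
// The 1s base and 60s cap are assumed values.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hex HMAC-SHA256 of the raw body; the receiver recomputes and compares.
function signPayload(secret, body) {
  return crypto.createHmac("sha256", secret).update(body).digest("hex");
}

// Constant-time comparison so a mismatch does not leak how many bytes matched.
function verifySignature(secret, body, signatureHex) {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const given = Buffer.from(signatureHex, "hex");
  return expected.length === given.length && crypto.timingSafeEqual(expected, given);
}

const body = JSON.stringify({ event: "push", repo: "acme/api" });
const signature = signPayload("s3cret", body);
console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));
console.log(verifySignature("s3cret", body, signature));
```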
|
|
166
|
+
|
|
167
|
+
### Layer 3 — Admission webhooks on the control plane (the elegant part)
|
|
168
|
+
|
|
169
|
+
Because PRs and Issues are real Kubernetes resources, **any** `ValidatingAdmissionWebhook` or `MutatingAdmissionWebhook` works on them. **Kyverno and OPA Gatekeeper work out of the box for PR policy.** Block PRs with empty descriptions? Kyverno policy. Require two reviewers from different teams? CEL expression. You inherit the entire Kubernetes policy ecosystem for free.
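
For readers unfamiliar with admission webhooks, the "block PRs with empty descriptions" rule reduces to a function from an `AdmissionReview` request to an `AdmissionReview` response. A minimal sketch, assuming the `PullRequest` resource carries a `spec.description` field (a hypothetical name); with Kyverno installed, the same rule would be written declaratively as a policy instead.

```javascript
// Build an admission.k8s.io/v1 AdmissionReview response for a PullRequest.
// `spec.description` is a hypothetical field on the PullRequest resource.
function reviewPullRequest(admissionReview) {
  const { uid, object } = admissionReview.request;
  const description = (object?.spec?.description ?? "").trim();
  const allowed = description.length > 0;
  const response = { uid, allowed };
  if (!allowed) {
    response.status = { code: 403, message: "PullRequest description must not be empty" };
  }
  return { apiVersion: "admission.k8s.io/v1", kind: "AdmissionReview", response };
}

const review = {
  apiVersion: "admission.k8s.io/v1",
  kind: "AdmissionReview",
  request: { uid: "abc-123", object: { kind: "PullRequest", spec: { description: "  " } } },
};
console.log(JSON.stringify(reviewPullRequest(review), null, 2));
```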
|
|
170
|
+
|
|
171
|
+
### Hooks UI
|
|
172
|
+
|
|
173
|
+
Per-repo Hooks tab unifies all three layers in one explorer:
|
|
174
|
+
|
|
175
|
+
```
|
|
176
|
+
Hooks affecting this repository
|
|
177
|
+
─────────────────────────────────────────────────────────────────
|
|
178
|
+
GIT REFS (pre-receive, update, post-receive)
|
|
179
|
+
✓ require-signed-commits RefPolicy/org-defaults [edit]
|
|
180
|
+
✓ block-force-push-main RefPolicy/protected [edit]
|
|
181
|
+
⚠ lint-on-push RefPolicy/this-repo [3 fails today]
|
|
182
|
+
|
|
183
|
+
OUTBOUND WEBHOOKS
|
|
184
|
+
✓ slack-engineering push, pull_request [deliveries →]
|
|
185
|
+
✗ jenkins-legacy push [12 failed in 1h ⚠]
|
|
186
|
+
|
|
187
|
+
ADMISSION POLICIES (on PullRequest, Issue)
|
|
188
|
+
✓ require-pr-description Kyverno: ClusterPolicy [view]
|
|
189
|
+
✓ block-wip-merges Kyverno: ClusterPolicy [view]
|
|
190
|
+
+ Add policy [Browse Kyverno templates]
|
|
191
|
+
```
|
|
192
|
+
|
|
193
|
+
**Webhook delivery log**: the page everyone secretly wants and no forge does well. Every delivery: request headers, body, response, latency, retry chain. *"Replay"* re-fires with current secrets. Filter via `kubectl get webhookdeliveries -l status=failed,subscription=jenkins-legacy --sort-by=.metadata.creationTimestamp` (`kubectl get` has no `--since` flag, so sort by creation time and read the most recent entries).
|
|
194
|
+
|
|
195
|
+
**Policy authoring.** If Kyverno is installed, the UI deep-links to its policies. If not, a built-in editor with form mode (templates) and CEL mode (raw expressions). A *"Violations"* panel shows current PRs that *would* fail the policy if it were enforcing — roll out in audit mode first.
|
|
196
|
+
|
|
197
|
+
---
|
|
198
|
+
|
|
199
|
+
## UX scope
|
|
200
|
+
|
|
201
|
+
### Personas
|
|
202
|
+
|
|
203
|
+
- **Developer** — lives in PR review, run debugging, code browse. Wants speed.
|
|
204
|
+
- **Platform engineer** — manages pools, policies, hooks, identity. Wants control and auditability.
|
|
205
|
+
- *Repo admin* is mostly a developer with extra settings access; not a separate IA tier.
|
|
206
|
+
|
|
207
|
+
### Information architecture
|
|
208
|
+
|
|
209
|
+
**Top-level (org-scoped):**
|
|
210
|
+
|
|
211
|
+
| Section | Primary use |
|
|
212
|
+
|---|---|
|
|
213
|
+
| Repositories | Browse, search, create |
|
|
214
|
+
| Inbox | Cross-repo PRs/issues/runs needing me |
|
|
215
|
+
| Runs | Cross-repo CI activity |
|
|
216
|
+
| Runners | Pool dashboard + live runs |
|
|
217
|
+
| Hooks & Policies | All three hook layers |
|
|
218
|
+
| Insights | Lead time, runner cost, hook health |
|
|
219
|
+
| Settings | Identity, storage, Gitea backend defaults |
|
|
220
|
+
|
|
221
|
+
**Per-repo:** Code · Pull Requests · Issues · Runs · Hooks · Settings.
|
|
222
|
+
|
|
223
|
+
### Six flows that must be excellent
|
|
224
|
+
|
|
225
|
+
1. **Open and review a PR.** Three-pane: file tree / diff / conversation. Inline comments thread. Suggested edits commit with one click. Keyboard-first (j/k/n/p). Right rail shows the PR's `Pipeline` runs, linked to the live run view.
|
|
226
|
+
|
|
227
|
+
2. **Debug a failing run.** Live run view with step navigation, log streaming, "find similar failures."
|
|
228
|
+
|
|
229
|
+
3. **Configure a runner pool.** Form + YAML split, GitOps export.
|
|
230
|
+
|
|
231
|
+
4. **Add a webhook and verify it works.** Create form → *"Send test delivery"* → see it in the log within seconds, with response.
|
|
232
|
+
|
|
233
|
+
5. **Write a PR policy.** Pick a template → preview violations on existing PRs → enable in audit mode → graduate to enforce. The audit→enforce gradient *is* the UX, not just a flag.
|
|
234
|
+
|
|
235
|
+
6. **Cross-repo triage.** Inbox view with saved filters. Filters persist as `Selector` CRDs — a team's "P0 bug triage view" is a YAML file you can commit and share.
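
Flow 6 implies that a saved filter can be turned into an ordinary label selector. The sketch below assumes a `Selector` resource with a `spec.matchLabels` map; that shape is hypothetical, since the `Selector` schema is not shown in this document.

```javascript
// Turn a hypothetical Selector resource into a kubectl-style label selector string.
// Keys are sorted so the same saved view always renders the same query.
function toLabelSelector(selector) {
  const labels = selector.spec?.matchLabels ?? {};
  return Object.keys(labels)
    .sort()
    .map((key) => `${key}=${labels[key]}`)
    .join(",");
}

const p0Triage = {
  kind: "Selector",
  metadata: { name: "p0-bug-triage" },
  spec: { matchLabels: { priority: "p0", type: "bug" } },
};

console.log(`kubectl get issues -l ${toLabelSelector(p0Triage)}`);
```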
|
|
236
|
+
|
|
237
|
+
### Front-end stack (Next.js 15, App Router)
|
|
238
|
+
|
|
239
|
+
Three principles:
|
|
240
|
+
|
|
241
|
+
**Direct-to-API, no intermediate backend.** Server Components fetch from the Aggregated API Server using `@kubernetes/client-node`. There is no separate "forge backend" — the control plane *is* the backend.
|
|
242
|
+
|
|
243
|
+
**Real-time via the Watch API.** Every list view opens a Watch stream from the API server, piped to the browser via Server-Sent Events from a Route Handler. PR state, comments, CI status — all stream in without polling. Free, because the K8s API gives it to you.
|
|
244
|
+
|
|
245
|
+
**GitOps-transparent UI.** Every mutating action shows the equivalent YAML inline and offers *"Copy as kubectl"*.
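
The Watch-to-SSE bridge in the second principle comes down to reframing each watch event as a Server-Sent Events message. A minimal sketch of that framing step only (no HTTP or Kubernetes client involved); using `resourceVersion` as the SSE `id` is an assumption made here, not something the document specifies.

```javascript
// Format one Kubernetes watch event ({ type, object }) as an SSE frame.
// SSE frames are "field: value" lines terminated by a blank line.
function toSseFrame(watchEvent) {
  const id = watchEvent.object?.metadata?.resourceVersion;
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${watchEvent.type}`);
  // JSON.stringify without indentation emits a single line, as SSE data requires here.
  lines.push(`data: ${JSON.stringify(watchEvent.object)}`);
  return lines.join("\n") + "\n\n";
}

const frame = toSseFrame({
  type: "MODIFIED",
  object: { kind: "PullRequest", metadata: { name: "42", resourceVersion: "7" } },
});
process.stdout.write(frame);
```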
|
|
246
|
+
|
|
247
|
+
Auth model: **NextAuth** handles OIDC login, then exchanges the OIDC token for a Kubernetes ServiceAccount token via the TokenRequest API. Every API call from a Server Component carries the user's K8s identity, so RBAC enforcement happens at the API server — not in app code. Zero permission logic to write or audit.
|
|
248
|
+
|
|
249
|
+
```
|
|
250
|
+
┌─────────────────────────────────────────────────────────────┐
|
|
251
|
+
│ Browser │
|
|
252
|
+
│ ├─ Server Components (RSC stream from K8s API) │
|
|
253
|
+
│ ├─ SSE client (Watch events → optimistic UI) │
|
|
254
|
+
│ └─ Monaco/CodeMirror for diff + code view │
|
|
255
|
+
└──────────────┬──────────────────────────────────────────────┘
|
|
256
|
+
│ HTTPS
|
|
257
|
+
┌──────────────▼──────────────────────────────────────────────┐
|
|
258
|
+
│ Next.js (Edge runtime where possible) │
|
|
259
|
+
│ ├─ /repos/[…] → Server Component, kubectl-equivalent │
|
|
260
|
+
│ ├─ /api/watch/orgs/[org]/* → Route Handler, K8s Watch → SSE │
|
|
261
|
+
│ ├─ /api/git-proxy → Streams smart-HTTP to Gitea backend │
|
|
262
|
+
│ └─ NextAuth (OIDC) ↔ TokenRequest API for SA exchange │
|
|
263
|
+
└──────────────┬──────────────────────────────────────────────┘
|
|
264
|
+
│ K8s API (Aggregated) │ Git smart-HTTP
|
|
265
|
+
▼ ▼
|
|
266
|
+
Control Plane Data Plane
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
---
|
|
270
|
+
|
|
271
|
+
## "GitOps-transparent UX" — concretely
|
|
272
|
+
|
|
273
|
+
Four patterns, applied uniformly:
|
|
274
|
+
|
|
275
|
+
**1. Every mutating action has a `</>` button** in the same screen position. Click: side panel shows (a) the resulting YAML, (b) the kubectl command equivalent, (c) *"Copy as `kubectl apply`"*, (d) *"Open PR against config repo."* Click "Apply" in the panel and the action runs through the same code path as the form button — the panel isn't a translation layer, it's the truth and the form is the rendering of it.
|
|
276
|
+
|
|
277
|
+
**2. Every detail page has a YAML tab** next to the rendered view. Lens/Headlamp parity, native.
|
|
278
|
+
|
|
279
|
+
**3. Every saved view is a resource.** Inbox filter, dashboard layout, custom column set — all stored as `View` / `Selector` CRDs. Your team's triage dashboard is `git clone`-able. No competitor has this because no competitor's UI state lives in the same store as the domain data.
|
|
280
|
+
|
|
281
|
+
**4. Activity log entries are kubectl commands.** "Bob merged PR #42" expands to: `kubectl patch pullrequest/42 --type=merge -p '{"spec":{"merged":true}}'` — copyable, replayable, auditable. The audit log isn't generated from events; it's a literal command history.
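
Rendering an activity entry as a copyable command, as in the fourth pattern, can be sketched as a small formatter. The entry shape (`resource`, `name`, `patch`) is hypothetical; the output follows the `kubectl patch` example given above.

```javascript
// Quote a string for a POSIX shell by wrapping it in single quotes.
function shellQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Render a merge-patch activity entry as the equivalent kubectl command.
// The entry shape ({ resource, name, patch }) is a hypothetical illustration.
function toKubectlPatch({ resource, name, patch }) {
  return `kubectl patch ${resource}/${name} --type=merge -p ${shellQuote(JSON.stringify(patch))}`;
}

const command = toKubectlPatch({
  resource: "pullrequest",
  name: "42",
  patch: { spec: { merged: true } },
});
console.log(command);
```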
|
|
282
|
+
|
|
283
|
+
---
|
|
284
|
+
|
|
285
|
+
## Why this wins (positioning)
|
|
286
|
+
|
|
287
|
+
The pitch isn't "Gitea on Kubernetes." It's: **the only forge where your repos, PRs, CI, and infrastructure share one identity model, one RBAC system, and one declarative API.**
|
|
288
|
+
|
|
289
|
+
Three moats:
|
|
290
|
+
|
|
291
|
+
- **Platform engineers want this.** Internal developer platforms are the buyer. They already run Argo, Crossplane, ARC, Kyverno. A forge that natively composes with them eliminates integration glue.
|
|
292
|
+
- **Multi-tenancy is namespace-shaped.** Tenant isolation is a solved K8s problem. Other forges reinvent it badly.
|
|
293
|
+
- **No new permission system.** Auth bugs are a recurring source of breach incidents in forges that maintain their own permission systems. Inheriting K8s RBAC removes an entire class of CVEs from the roadmap.
|
|
294
|
+
|
|
295
|
+
---
|
|
296
|
+
|
|
297
|
+
## Roadmap
|
|
298
|
+
|
|
299
|
+
| Capability | v0.1 (6 wks) | v0.2 | v0.3 |
|
|
300
|
+
|---|---|---|---|
|
|
301
|
+
| Repos, PRs, Issues (Aggregated API) | ✓ | | |
|
|
302
|
+
| Single-Gitea backend data plane | ✓ | | |
|
|
303
|
+
| Next.js UI: code, PR review, run view | ✓ | | |
|
|
304
|
+
| Outbound webhooks + delivery log | ✓ | | |
|
|
305
|
+
| Admission webhooks (Kyverno-compatible) | ✓ (free) | | |
|
|
306
|
+
| External CI via ARC | ✓ | | |
|
|
307
|
+
| Native `RunnerPool` + live run view | | ✓ | |
|
|
308
|
+
| `RefPolicy` (server-side git hooks, WASM) | | ✓ | |
|
|
309
|
+
| Gitea backend horizontal scale-out | | ✓ | |
|
|
310
|
+
| `View` / `Selector` CRDs (saved triage) | | ✓ | |
|
|
311
|
+
| Code search (Zoekt) | | | ✓ |
|
|
312
|
+
| Multi-cluster federation | | | ✓ |
|
|
313
|
+
|
|
314
|
+
**The MVP demo:** a single `kubectl apply -f kyverno-pr-policy.yaml` blocks a PR — and the same policy shows up in the UI's Hooks tab without anyone wiring it. That's the moment people get it.
|
|
315
|
+
|
|
316
|
+
### 6-week MVP plan
|
|
317
|
+
|
|
318
|
+
- **W1–2.** Aggregated API Server with `Repository` + `PullRequest`, Postgres-backed, working `kubectl get/create`.
|
|
319
|
+
- **W3.** Single-Gitea backend data plane: `git-upload-pack` and `git-receive-pack` served by Gitea with Repository Operator creating repositories through the Gitea API.
|
|
320
|
+
- **W4.** Next.js skeleton — login, repo list, file view, PR list (RSC + Watch SSE).
|
|
321
|
+
- **W5.** PR creation flow, inline diff, comment thread.
|
|
322
|
+
- **W6.** Workload Identity for CI, demo with ARC running a real workflow.
|
|
323
|
+
|
|
324
|
+
Ship as a public Helm chart. The story writes itself: `helm install krate <chart>; kubectl create -f my-repo.yaml; git push`.
|
|
325
|
+
|
|
326
|
+
---
|
|
327
|
+
|
|
328
|
+
## Open decisions
|
|
329
|
+
|
|
330
|
+
1. **Aggregated API Server vs pure CRDs.** Highest-stakes call; everything downstream depends on it. AAS is correct; commit early.
|
|
331
|
+
2. **Runner abstraction: executor-pluggable or ARC-only for MVP?** Pluggable is correct long-term but doubles week-2 complexity.
|
|
332
|
+
3. **Bundle Kyverno in the install, or BYO?** Bundling is friendlier; BYO keeps install lean.
|
|
333
|
+
4. **Name.** "Krate" is the project name used by this implementation package and documentation set.
|
|
334
|
+
|
|
@@ -0,0 +1,55 @@
|
|
|
1
|
+
# Local Minikube Setup
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
|
|
5
|
+
Krate now includes a deterministic local setup script for the Kubernetes package lifecycle.
|
|
6
|
+
|
|
7
|
+
|
|
8
|
+
|
|
9
|
+
## Dry-run validation
|
|
10
|
+
|
|
11
|
+
|
|
12
|
+
|
|
13
|
+
Use dry-run mode in development and CI because it does not require a local cluster:
|
|
14
|
+
|
|
15
|
+
|
|
16
|
+
|
|
17
|
+
```bash
|
|
18
|
+
|
|
19
|
+
npm run setup:minikube -- --dry-run
|
|
20
|
+
|
|
21
|
+
npm run setup:minikube -- --dry-run --json
|
|
22
|
+
|
|
23
|
+
npm run e2e
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
|
|
28
|
+
|
|
29
|
+
Dry-run mode verifies the intended lifecycle command plan: start minikube, enable ingress and metrics, select the context, validate the chart, install the chart, apply demo resources, wait for the API deployment, and run smoke assertions.
|
|
30
|
+
|
|
31
|
+
|
|
32
|
+
|
|
33
|
+
## Apply mode
|
|
34
|
+
|
|
35
|
+
|
|
36
|
+
|
|
37
|
+
Use apply mode when `minikube`, `kubectl`, `helm`, Node.js, npm, and a working container driver are installed:
|
|
38
|
+
|
|
39
|
+
|
|
40
|
+
|
|
41
|
+
```bash
|
|
42
|
+
|
|
43
|
+
npm run setup:minikube -- --apply
|
|
44
|
+
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
|
|
48
|
+
|
|
49
|
+
The script defaults to profile `krate`, namespace `krate-system`, release `krate`, driver `docker`, and chart `charts/krate`. Override with `--profile=...`, `--namespace=...`, `--release=...`, `--driver=...`, and `--chart=...`.
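
The default-plus-override behaviour described above can be sketched as a small argument parser. This is an illustration of the `--key=value` convention only, using the defaults named in this section; it is not the code in `scripts/setup-minikube.mjs`, and rejecting unknown keys is an assumption of this sketch.

```javascript
// Defaults taken from the paragraph above.
const DEFAULTS = {
  profile: "krate",
  namespace: "krate-system",
  release: "krate",
  driver: "docker",
  chart: "charts/krate",
};

// Parse --key=value overrides on top of the defaults.
// Unknown keys are rejected here (an assumption; the real script may ignore them).
function parseOptions(argv) {
  const options = { ...DEFAULTS };
  for (const arg of argv) {
    const match = /^--([a-z]+)=(.+)$/.exec(arg);
    if (!match) continue; // bare flags such as --dry-run or --json are not overrides
    const [, key, value] = match;
    if (!Object.hasOwn(DEFAULTS, key)) throw new Error(`unknown option: --${key}`);
    options[key] = value;
  }
  return options;
}

console.log(parseOptions(["--dry-run", "--profile=dev", "--driver=podman"]));
```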
|
|
50
|
+
|
|
51
|
+
|
|
52
|
+
|
|
53
|
+
## Release boundary
|
|
54
|
+
|
|
55
|
+
The chart is production-shaped and validates Krate install contracts against the executable Kubernetes package model, including Argo CD Application and Gitea backend surfaces. The repository includes a production-shaped controller image build, ingress values for the Next.js app, registry pull-secret support, and GitHub publishing lanes for GHCR images, chart artifacts, generated dist/example bundles, and AKS branch deployments.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Krate Ontology
|
|
2
|
+
|
|
3
|
+
This tree turns the requirements in `docs/` into an implementation ontology for the Kubernetes-native forge MVP.
|
|
4
|
+
|
|
5
|
+
## Reading order
|
|
6
|
+
|
|
7
|
+
1. `world.md` - domain, actors, external systems, and assumptions.
|
|
8
|
+
2. `problem-space.md` - jobs, risks, and failure modes Krate must solve.
|
|
9
|
+
3. `solution-space.md` - architecture and MVP module mapping.
|
|
10
|
+
4. `bounded-contexts.md` - ownership boundaries across control, data, identity, CI, hooks, UI, and operations.
|
|
11
|
+
5. `resource-taxonomy.md` and `resource-contracts.md` - API kinds and lifecycle contracts.
|
|
12
|
+
6. Workflow, policy, storage, event, CI, UI, operations, and validation files for executable acceptance gates.
|
|
13
|
+
|
|
14
|
+
## Traceability model
|
|
15
|
+
|
|
16
|
+
Each ontology entry traces to at least one source document under `docs/`, one implementation module under `src/`, and one validation surface in `tests/` or `scripts/`.
|
|
17
|
+
|
|
18
|
+
| Ontology area | Source docs | Implementation | Validation |
| --- | --- | --- | --- |
| Control-plane resources | `docs/components/control-plane.md` | `src/resource-model.js`, `src/control-plane.js` | `tests/krate.test.js` |
| Identity and policy | `docs/components/identity-rbac-policy.md` | `src/identity-policy.js` | RBAC/admission tests |
| Git data plane | `docs/components/data-plane.md` | `src/data-plane.js` | Gitea backend tests |
| CI and runners | `docs/components/runners-ci.md` | `src/runners-ci.js` | Runner scheduler tests |
| Hooks and webhooks | `docs/components/hooks-events.md` | `src/hooks-events.js` | Webhook bus tests |
| Web UI flows | `docs/components/web-ui.md` | `src/web-ui.js` | Smoke assertions |
| Operations and release | `docs/components/operations-publishing.md` | `src/operations.js` | Build, smoke, doc coverage |
|
|
27
|
+
|
|
28
|
+
## Completion criteria
|
|
29
|
+
|
|
30
|
+
- All resource kinds are classified as CRD-backed configuration or aggregated Postgres-backed records.
|
|
31
|
+
- All high-risk invariants are executable: RBAC, admission, storage boundaries, fork isolation, ref protection, webhook replay, UI YAML transparency, and backup/restore order.
|
|
32
|
+
- `npm run check` verifies build output, doc/ontology coverage, tests, and smoke flow.
|
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
# Bounded Contexts
|
|
2
|
+
|
|
3
|
+
## Control plane
|
|
4
|
+
|
|
5
|
+
Owns resource verbs, storage routing, RBAC checks, admission decisions, audit records, status patches, lists, and watches. It depends on `resource-model` for kind classification and `identity-policy` for authorization and admission evaluation.
|
|
6
|
+
|
|
7
|
+
## Data plane
|
|
8
|
+
|
|
9
|
+
Owns Gitea repository hosting, warm Git receive routing, object storage metadata, search indexing hooks, `RefPolicy`, and `BranchProtection` enforcement. It writes `Repository` config and emits Git events through the control-plane event stream.
|
|
10
|
+
|
|
11
|
+
## Identity and policy
|
|
12
|
+
|
|
13
|
+
Owns OIDC identity mapping, Kubernetes groups, default RBAC roles, service-account scope profiles, trust tiers, admission policies, and audit/enforce rollout lifecycle.
|
|
14
|
+
|
|
15
|
+
## Runners and CI
|
|
16
|
+
|
|
17
|
+
Owns `RunnerPool`, `Pipeline`, and `Job` resources, queue-depth scaling, fork isolation, cache configuration, and rerun/resume semantics.
|
|
18
|
+
|
|
19
|
+
## Hooks and events
|
|
20
|
+
|
|
21
|
+
Owns outbound `WebhookSubscription` and `WebhookDelivery` resources, HMAC signing, retry/failure status, replay records, and the distinction between server-side Git hooks, outbound webhooks, and Kubernetes admission hooks.
|
|
22
|
+
|
|
23
|
+
## Web UI
|
|
24
|
+
|
|
25
|
+
Owns resource-backed view models for dashboards, PR review, runner pool editing, webhook inspection, YAML previews, and excellent flows. It must not invent hidden state that cannot be mapped back to resources.
|
|
26
|
+
|
|
27
|
+
## Operations and publishing
|
|
28
|
+
|
|
29
|
+
Owns install manifests, CRD/APIService publication, observability surfaces, backup/restore order, upgrade gates, smoke tests, and release readiness checks.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Events and Hooks
|
|
2
|
+
|
|
3
|
+
Krate separates server-side Git hooks, outbound HTTP webhooks, Kubernetes admission hooks, and resource watch events.
|
|
4
|
+
|
|
5
|
+
## Watch events
|
|
6
|
+
|
|
7
|
+
- Emitted on resource create/update/status mutations.
|
|
8
|
+
- Include resource kind, storage boundary, and audit context.
|
|
9
|
+
- Used by UI, controllers, and smoke assertions.
|
|
10
|
+
|
|
11
|
+
## Server-side Git hooks
|
|
12
|
+
|
|
13
|
+
- Modeled through `RefPolicy` and `BranchProtection`.
|
|
14
|
+
- Enforced during receive-pack before protected writes complete.
|
|
15
|
+
- Must not mount broad secrets or shell out with ambient privilege.
|
|
16
|
+
|
|
17
|
+
## Outbound webhooks
|
|
18
|
+
|
|
19
|
+
- Configured by `WebhookSubscription`.
|
|
20
|
+
- Materialized as durable `WebhookDelivery` records.
|
|
21
|
+
- Signed with HMAC SHA-256.
|
|
22
|
+
- Replay creates a new delivery attempt and keeps replay metadata.
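A minimal sketch of HMAC SHA-256 signing and verification for an outbound delivery, using Node's built-in `node:crypto` module. The `signPayload` and `verifySignature` helpers, the hex encoding, and the example payload are assumptions for illustration; this document does not specify the header name or signature encoding Krate uses.

```javascript
const crypto = require("node:crypto");

// Compute a hex-encoded HMAC SHA-256 signature over the raw request body.
function signPayload(secret, body) {
  return crypto.createHmac("sha256", secret).update(body).digest("hex");
}

// Verify a received signature with a constant-time comparison.
function verifySignature(secret, body, signature) {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return (
    expected.length === received.length &&
    crypto.timingSafeEqual(expected, received)
  );
}

// Hypothetical delivery payload and shared secret.
const body = JSON.stringify({ event: "push", repository: "demo" });
const signature = signPayload("example-secret", body);
console.log(verifySignature("example-secret", body, signature));
```

Signing the raw body bytes, rather than a re-serialized object, keeps verification stable when a delivery is replayed.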
|
|
23
|
+
|
|
24
|
+
## Admission hooks
|
|
25
|
+
|
|
26
|
+
- Validate Krate resources before storage.
|
|
27
|
+
- Support audit warnings and enforce denial.
|
|
28
|
+
- Must expose actionable status when denied.
|
|
29
|
+
|
|
30
|
+
## Observability
|
|
31
|
+
|
|
32
|
+
Every delivery and event should expose phase, latency or timestamp, response metadata where applicable, and enough context for replay or remediation.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# OAM and KubeVela Ontology Assimilation
|
|
2
|
+
|
|
3
|
+
Krate assimilates the Open Application Model as the application-delivery layer of the forge ontology. The OAM model gives Krate a vocabulary for app-centric delivery without replacing Kubernetes or Gitea.
|
|
4
|
+
|
|
5
|
+
## Concepts
|
|
6
|
+
|
|
7
|
+
| OAM concept | Krate ontology role | Kubernetes resource surface |
| --- | --- | --- |
| Application | Deployable application attached to a repository, branch, PR, or release | `applications.core.oam.dev` |
| Component | Reusable workload or service module | `componentdefinitions.core.oam.dev` plus Application components |
| Workload type | Runtime implementation selected by a component definition | `workloaddefinitions.core.oam.dev` when installed by KubeVela |
| Trait | Operational behavior attached to a component | `traitdefinitions.core.oam.dev` plus Application traits |
| Policy | Delivery rule such as topology, override, health, or security | `policydefinitions.core.oam.dev` plus Application policies |
| Workflow step | Ordered delivery action such as deploy, suspend, approve, promote, rollback | `workflowstepdefinitions.core.oam.dev`, `workflows.core.oam.dev`, plus Application workflow |
| Application revision | Concrete release/revision materialized by KubeVela | `applicationrevisions.core.oam.dev` |
| Resource tracker | Applied-resource ownership graph maintained by KubeVela | cluster-scoped `resourcetrackers.core.oam.dev` |
| Scope | Logical grouping boundary | `scopedefinitions.core.oam.dev` when installed plus Application component `scopes` maps |
|
|
18
|
+
|
|
19
|
+
## Assimilation Rules
|
|
20
|
+
|
|
21
|
+
- OAM `Application` is not a Git repository; it is a delivery object that references build/deploy intent derived from a repository.
|
|
22
|
+
- OAM components, workloads, traits, scopes, policies, and workflow steps are capability catalog entries, not hard-coded Krate forms.
|
|
23
|
+
- Krate UI must present simple forge tasks first: deploy from repo, preview PR, promote release, inspect rollout, rollback.
|
|
24
|
+
- Raw OAM YAML remains visible so operators can copy, review, and apply exact Kubernetes resources.
|
|
25
|
+
- KubeVela status, ApplicationRevisions, Workflows, Policies, and ResourceTrackers are authoritative for OAM delivery health; Krate may summarize but must not synthesize success.
|
|
26
|
+
|
|
27
|
+
## Validation Expectations
|
|
28
|
+
|
|
29
|
+
- Chart render includes an Argo CD Application for KubeVela when enabled.
|
|
30
|
+
- `/api/controller` exposes discovered KubeVela definition, application, revision, policy, workflow, and resource-tracker counts when KubeVela CRDs are installed.
|
|
31
|
+
- UI has an Applications surface that names OAM Applications, Components, Workloads, Traits, Scopes, Policies, Workflow Steps, and KubeVela installation status.
|
|
32
|
+
- Repository pages show how a repo/PR maps to an OAM Application and preview workflow.
|
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
# Operations and Release
|
|
2
|
+
|
|
3
|
+
## Install
|
|
4
|
+
|
|
5
|
+
Install manifests must include CRDs, APIService, controllers, Gitea backend, runner scheduler, webhook dispatcher, and web UI deployments or equivalent resources.
|
|
6
|
+
|
|
7
|
+
## Observability
|
|
8
|
+
|
|
9
|
+
Operators need metrics and logs for API latency, storage boundary health, Git receive latency, queue depth, runner saturation, webhook delivery phase, replay counts, and admission denials.
|
|
10
|
+
|
|
11
|
+
## Backup and restore
|
|
12
|
+
|
|
13
|
+
Backup covers CRDs/config, Postgres records, Gitea repositories, and object storage. Restore in this order: API/config, Postgres, Gitea repositories, objects, then controllers. Post-restore validation includes listing resources, reading refs, opening a PR, and replaying webhooks.
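The restore order can be made checkable with a small guard. This is a minimal sketch under stated assumptions: the stage identifiers and the `nextRestoreStage` helper are hypothetical, and only the ordering itself comes from the paragraph above.

```javascript
// Restore order as stated above; the identifiers themselves are hypothetical.
const RESTORE_ORDER = [
  "api-config",
  "postgres",
  "gitea-repositories",
  "objects",
  "controllers",
];

// Return the next stage to restore given the stages already completed,
// or null when the restore is finished. Throws if stages ran out of order.
function nextRestoreStage(completed) {
  completed.forEach((stage, index) => {
    if (stage !== RESTORE_ORDER[index]) {
      throw new Error(
        `Out-of-order restore: expected ${RESTORE_ORDER[index]}, got ${stage}`,
      );
    }
  });
  return RESTORE_ORDER[completed.length] ?? null;
}

console.log(nextRestoreStage(["api-config", "postgres"]));
```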
|
|
14
|
+
|
|
15
|
+
## Upgrade
|
|
16
|
+
|
|
17
|
+
Upgrades must preserve CRD compatibility and aggregated API availability, apply migrations, keep controller reconciliation working, and ship with rollback instructions.
|
|
18
|
+
|
|
19
|
+
## Release gates
|
|
20
|
+
|
|
21
|
+
- Build produces `dist/krate-summary.json`.
|
|
22
|
+
- Documentation and ontology coverage pass.
|
|
23
|
+
- Unit acceptance tests pass.
|
|
24
|
+
- Smoke flow passes.
|
|
25
|
+
- Known limitations are explicit and not hidden in incomplete implementation paths.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Personas and Actors
|
|
2
|
+
|
|
3
|
+
## Developer
|
|
4
|
+
|
|
5
|
+
- Groups: `system:authenticated`, `krate:developers`.
|
|
6
|
+
- Can create and update `PullRequest`, `Issue`, `Review`, `Pipeline`, and `Job` records.
|
|
7
|
+
- Cannot create repositories or branch protection without repo-admin rights.
|
|
8
|
+
- Needs UI flows for PR review, failed CI inspection, webhook visibility, and YAML/resource transparency.
|
|
9
|
+
|
|
10
|
+
## Repository admin
|
|
11
|
+
|
|
12
|
+
- Groups: `krate:repo-admins`.
|
|
13
|
+
- Can create repositories, branch protection, ref policies, webhook subscriptions, triage views, and selectors.
|
|
14
|
+
- Can replay webhook deliveries and manage repository-level governance.
|
|
15
|
+
|
|
16
|
+
## Platform engineer
|
|
17
|
+
|
|
18
|
+
- Groups: `krate:platform-engineers`.
|
|
19
|
+
- Can perform all verbs on all Krate kinds.
|
|
20
|
+
- Owns install, upgrade, observability, runner pools, backup/restore, and release gates.
|
|
21
|
+
|
|
22
|
+
## Controllers
|
|
23
|
+
|
|
24
|
+
- Use scoped service accounts.
|
|
25
|
+
- Patch status and reconcile desired state.
|
|
26
|
+
- Must not bypass admission-sensitive invariants unless explicitly modeled.
|
|
27
|
+
|
|
28
|
+
## Runner jobs
|
|
29
|
+
|
|
30
|
+
- Use service-account scopes derived from trust tier.
|
|
31
|
+
- Trusted jobs may access configured caches and publication credentials.
|
|
32
|
+
- Untrusted fork jobs receive no secrets and cannot mutate resources through the cluster API.
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# Policies and Invariants
|
|
2
|
+
|
|
3
|
+
## RBAC
|
|
4
|
+
|
|
5
|
+
- Users must be mapped from OIDC identity into Kubernetes-style groups.
|
|
6
|
+
- `system:authenticated` can read and watch, but cannot mutate privileged resources.
|
|
7
|
+
- Developers can mutate review and CI records.
|
|
8
|
+
- Repo admins can mutate repository governance resources.
|
|
9
|
+
- Platform engineers can mutate all Krate resources.
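The group rules above can be sketched as a lookup table plus a single check. This is an illustrative sketch, not the contents of `src/identity-policy.js`: the `GROUP_RULES` table, the `isAllowed` helper, and the verb lists are assumptions, and the kinds assigned to each group are taken from the personas described in this package's docs.

```javascript
// Assumed verb sets, modeled on Kubernetes-style verbs.
const READ_VERBS = ["get", "list", "watch"];
const WRITE_VERBS = ["create", "update", "patch", "delete"];

// Hypothetical rule table; "*" stands for every Krate kind.
const GROUP_RULES = {
  "system:authenticated": [{ kinds: ["*"], verbs: READ_VERBS }],
  "krate:developers": [
    {
      kinds: ["PullRequest", "Issue", "Review", "Pipeline", "Job"],
      verbs: WRITE_VERBS,
    },
  ],
  "krate:repo-admins": [
    {
      kinds: ["Repository", "BranchProtection", "RefPolicy", "WebhookSubscription"],
      verbs: WRITE_VERBS,
    },
  ],
  "krate:platform-engineers": [
    { kinds: ["*"], verbs: [...READ_VERBS, ...WRITE_VERBS] },
  ],
};

// A request is allowed when any of the caller's groups has a matching rule.
function isAllowed(groups, verb, kind) {
  return groups.some((group) => {
    const rules = Object.hasOwn(GROUP_RULES, group) ? GROUP_RULES[group] : [];
    return rules.some(
      (rule) =>
        rule.verbs.includes(verb) &&
        (rule.kinds.includes("*") || rule.kinds.includes(kind)),
    );
  });
}

const developer = ["system:authenticated", "krate:developers"];
console.log(isAllowed(developer, "create", "PullRequest"));
console.log(isAllowed(developer, "create", "Repository"));
```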
|
|
10
|
+
|
|
11
|
+
## Admission rollout
|
|
12
|
+
|
|
13
|
+
- Policies support `audit` mode for warnings without blocking.
|
|
14
|
+
- Policies support `enforce` mode for fail-closed denial.
|
|
15
|
+
- Rollout sequence is preview, audit, then enforce.
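The difference between `audit` and `enforce` can be sketched as a single evaluation function. This is a sketch under stated assumptions: `evaluateAdmission`, the `policy.check` predicate, and the example `spec.pattern` field are hypothetical. The `preview` stage is not defined in this document, so it is not modeled here and any mode other than `audit` or `enforce` throws.

```javascript
// Evaluate a single admission policy against a resource.
// `policy.check` is a hypothetical predicate: it returns null when the
// resource is acceptable, or a human-readable violation message otherwise.
function evaluateAdmission(policy, resource) {
  const violation = policy.check(resource);
  if (violation === null) {
    return { allowed: true, warnings: [], reason: null };
  }
  if (policy.mode === "audit") {
    // Audit mode: surface a warning but let the write proceed.
    return { allowed: true, warnings: [violation], reason: null };
  }
  if (policy.mode === "enforce") {
    // Enforce mode: fail closed and report an actionable reason.
    return { allowed: false, warnings: [], reason: violation };
  }
  throw new Error(`Unsupported policy mode: ${policy.mode}`);
}

// Hypothetical policy: a BranchProtection must name a non-empty ref pattern.
const requirePattern = (mode) => ({
  mode,
  check: (resource) =>
    resource.spec && resource.spec.pattern
      ? null
      : "spec.pattern must be a non-empty string",
});

const invalid = { kind: "BranchProtection", spec: {} };
const auditResult = evaluateAdmission(requirePattern("audit"), invalid);
const enforceResult = evaluateAdmission(requirePattern("enforce"), invalid);
console.log(auditResult, enforceResult);
```

Returning the violation text as `reason` in enforce mode is one way to give the denied caller an actionable status.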
|
|
16
|
+
|
|
17
|
+
## Isolation
|
|
18
|
+
|
|
19
|
+
- Fork PR jobs are untrusted.
|
|
20
|
+
- Untrusted jobs receive no secrets and no cluster API access.
|
|
21
|
+
- Trusted jobs receive only explicitly scoped capabilities.
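The isolation rules above can be expressed as a pure function from job origin to granted scope. This is a minimal sketch: `deriveJobScope`, the `fromFork` flag, and the capability names are hypothetical and are not taken from the Krate source.

```javascript
// Derive the effective scope for a CI job.
// Fork PR jobs are always untrusted and receive nothing.
// Trusted jobs receive only capabilities that were both requested and
// explicitly granted. Capability names here are hypothetical.
function deriveJobScope({ fromFork, requested = [], granted = [] }) {
  if (fromFork) {
    return {
      trustTier: "untrusted",
      capabilities: [],
      secrets: false,
      clusterApiAccess: false,
    };
  }
  const grantedSet = new Set(granted);
  const capabilities = requested.filter((name) => grantedSet.has(name));
  return {
    trustTier: "trusted",
    capabilities,
    secrets: capabilities.includes("secrets:read"),
    clusterApiAccess: capabilities.includes("cluster-api:access"),
  };
}

const forkScope = deriveJobScope({
  fromFork: true,
  requested: ["secrets:read", "cache:write"],
  granted: ["secrets:read", "cache:write"],
});
const trustedScope = deriveJobScope({
  fromFork: false,
  requested: ["secrets:read", "cache:write"],
  granted: ["cache:write"],
});
console.log(forkScope, trustedScope);
```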
|
|
22
|
+
|
|
23
|
+
## Git governance
|
|
24
|
+
|
|
25
|
+
- `BranchProtection` can require PR flow for protected refs.
|
|
26
|
+
- `RefPolicy` can deny internal refs and unsafe updates.
|
|
27
|
+
- Receive-pack must emit events with correlation to repository and Gitea backend.
|
|
28
|
+
|
|
29
|
+
## Audit and transparency
|
|
30
|
+
|
|
31
|
+
- Every mutation produces an audit entry with actor, groups, operation, resource, warnings, and allowed status.
|
|
32
|
+
- UI actions must expose equivalent YAML/resource state.
|
|
33
|
+
- Release gates must include docs coverage, tests, build, and smoke.
|
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# Problem-Space Ontology
|
|
2
|
+
|
|
3
|
+
Krate exists because teams want forge workflows that inherit Kubernetes governance instead of re-implementing identity, policy, audit, deployment, and operations outside the cluster model.
|
|
4
|
+
|
|
5
|
+
## Jobs to be done
|
|
6
|
+
|
|
7
|
+
- Host repositories without one-PVC-per-repository scaling failure.
|
|
8
|
+
- Review and merge pull requests with branch protection and status checks.
|
|
9
|
+
- Run CI on trusted and untrusted work while preserving secret boundaries.
|
|
10
|
+
- Govern refs, admission, runner access, and webhook behavior through resources.
|
|
11
|
+
- Inspect and replay webhook deliveries without hidden queue state.
|
|
12
|
+
- Operate install, upgrade, backup, restore, and release processes with visible gates.
|
|
13
|
+
- Triage cross-repository work through selectors and views.
|
|
14
|
+
|
|
15
|
+
## Failure modes to prevent
|
|
16
|
+
|
|
17
|
+
- **etcd overload** from comments, jobs, logs, webhook attempts, or high-cardinality records.
|
|
18
|
+
- **Cold Git writes** from unprepared receive-pack paths.
|
|
19
|
+
- **Token sprawl** from CI jobs using broad personal or cluster credentials.
|
|
20
|
+
- **Fork leakage** where untrusted PRs can read secrets or mutate cluster resources.
|
|
21
|
+
- **Opaque hooks** where failed webhook deliveries cannot be inspected or replayed.
|
|
22
|
+
- **Non-auditable UI** where a click cannot be mapped to a resource mutation.
|
|
23
|
+
- **Operational drift** where manifests, backup order, and release gates are not tested.
|
|
24
|
+
|
|
25
|
+
## Success criteria
|
|
26
|
+
|
|
27
|
+
- Every mutation has an actor, verb, resource, storage boundary, admission decision, audit entry, and watch event.
|
|
28
|
+
- Every resource family has a clear owner context and lifecycle.
|
|
29
|
+
- Every excellent flow has a YAML/resource equivalent.
|
|
30
|
+
- Every release candidate passes build, doc coverage, unit acceptance tests, and smoke assertions.
|