okstra 0.49.0 → 0.51.0

Files changed (86)
  1. package/README.kr.md +8 -7
  2. package/README.md +8 -7
  3. package/bin/okstra +2 -0
  4. package/docs/kr/architecture.md +23 -24
  5. package/docs/kr/cli.md +6 -6
  6. package/docs/project-structure-overview.md +13 -9
  7. package/docs/superpowers/plans/2026-06-05-wizard-batch-prompts.md +559 -0
  8. package/docs/superpowers/specs/2026-06-05-wizard-batch-prompts-design.md +121 -0
  9. package/docs/task-process/error-analysis.md +1 -1
  10. package/docs/task-process/final-verification.md +1 -1
  11. package/docs/task-process/release-handoff.md +1 -1
  12. package/docs/task-process/requirements-discovery.md +1 -1
  13. package/package.json +1 -1
  14. package/runtime/BUILD.json +2 -2
  15. package/runtime/agents/SKILL.md +18 -14
  16. package/runtime/agents/workers/claude-worker.md +4 -4
  17. package/runtime/agents/workers/codex-worker.md +3 -3
  18. package/runtime/agents/workers/gemini-worker.md +3 -3
  19. package/runtime/agents/workers/report-writer-worker.md +3 -3
  20. package/runtime/bin/lib/okstra/cli.sh +8 -1
  21. package/runtime/bin/lib/okstra/globals.sh +3 -0
  22. package/runtime/bin/lib/okstra/interactive.sh +14 -12
  23. package/runtime/bin/lib/okstra/usage.sh +6 -0
  24. package/runtime/bin/okstra-render-report-views.py +1 -1
  25. package/runtime/bin/okstra-team-reconcile.sh +28 -0
  26. package/runtime/bin/okstra.sh +2 -0
  27. package/runtime/prompts/launch.template.md +4 -2
  28. package/runtime/prompts/profiles/_common-contract.md +15 -15
  29. package/runtime/prompts/profiles/_implementation-deliverable.md +1 -1
  30. package/runtime/prompts/profiles/_implementation-executor.md +3 -3
  31. package/runtime/prompts/profiles/_implementation-verifier.md +2 -2
  32. package/runtime/prompts/profiles/error-analysis.md +1 -1
  33. package/runtime/prompts/profiles/final-verification.md +2 -2
  34. package/runtime/prompts/profiles/implementation-planning.md +10 -9
  35. package/runtime/prompts/profiles/implementation.md +1 -1
  36. package/runtime/prompts/profiles/improvement-discovery.md +5 -5
  37. package/runtime/prompts/profiles/release-handoff.md +2 -2
  38. package/runtime/prompts/profiles/requirements-discovery.md +2 -2
  39. package/runtime/python/okstra_ctl/analysis_packet.py +259 -0
  40. package/runtime/python/okstra_ctl/clarification_items.py +11 -11
  41. package/runtime/python/okstra_ctl/context_cost.py +308 -0
  42. package/runtime/python/okstra_ctl/migrate.py +2 -12
  43. package/runtime/python/okstra_ctl/paths.py +22 -0
  44. package/runtime/python/okstra_ctl/render.py +285 -126
  45. package/runtime/python/okstra_ctl/render_final_report.py +32 -1
  46. package/runtime/python/okstra_ctl/report_views.py +12 -12
  47. package/runtime/python/okstra_ctl/run.py +510 -248
  48. package/runtime/python/okstra_ctl/sequence.py +2 -5
  49. package/runtime/python/okstra_ctl/team_reconcile.py +131 -0
  50. package/runtime/python/okstra_ctl/wizard.py +219 -136
  51. package/runtime/python/okstra_ctl/workflow.py +1 -1
  52. package/runtime/python/okstra_ctl/worktree.py +13 -5
  53. package/runtime/schemas/final-report-v1.0.schema.json +4 -0
  54. package/runtime/skills/okstra-brief/SKILL.md +1 -1
  55. package/runtime/skills/okstra-coding-preflight/SKILL.md +69 -0
  56. package/runtime/skills/okstra-coding-preflight/architecture/hexagonal.md +116 -0
  57. package/runtime/skills/okstra-coding-preflight/clean-code.md +254 -0
  58. package/runtime/skills/okstra-coding-preflight/languages/java.md +64 -0
  59. package/runtime/skills/okstra-coding-preflight/languages/javascript-typescript.md +87 -0
  60. package/runtime/skills/okstra-coding-preflight/languages/kotlin.md +69 -0
  61. package/runtime/skills/okstra-coding-preflight/languages/nodejs.md +66 -0
  62. package/runtime/skills/okstra-coding-preflight/languages/python.md +179 -0
  63. package/runtime/skills/okstra-coding-preflight/languages/rust.md +105 -0
  64. package/runtime/skills/okstra-coding-preflight/languages/sql.md +68 -0
  65. package/runtime/skills/okstra-context-loader/SKILL.md +12 -6
  66. package/runtime/skills/okstra-convergence/SKILL.md +8 -8
  67. package/runtime/skills/okstra-inspect/SKILL.md +100 -1
  68. package/runtime/skills/okstra-report-writer/SKILL.md +27 -23
  69. package/runtime/skills/okstra-run/SKILL.md +3 -1
  70. package/runtime/skills/okstra-team-contract/SKILL.md +8 -5
  71. package/runtime/templates/reports/final-report.template.md +188 -187
  72. package/runtime/templates/reports/i18n/en.json +4 -4
  73. package/runtime/templates/reports/i18n/ko.json +4 -4
  74. package/runtime/templates/reports/implementation-planning-input.template.md +1 -1
  75. package/runtime/templates/reports/release-handoff-input.template.md +1 -1
  76. package/runtime/templates/reports/user-response.template.md +1 -1
  77. package/runtime/templates/worker-prompt-preamble.md +4 -4
  78. package/runtime/validators/lib/fixtures.sh +2 -2
  79. package/runtime/validators/validate-implementation-plan-stages.py +9 -9
  80. package/runtime/validators/validate-report-views.py +10 -10
  81. package/runtime/validators/validate-run.py +36 -36
  82. package/runtime/validators/validate_improvement_report.py +8 -8
  83. package/src/_python-helper.mjs +3 -3
  84. package/src/context-cost.mjs +27 -0
  85. package/src/install.mjs +1 -0
  86. package/src/uninstall.mjs +1 -0
@@ -0,0 +1,69 @@
1
+ # Kotlin Conventions
2
+
3
+ ## Style guide
4
+
5
+ Kotlin official coding conventions (kotlinlang.org/docs/coding-conventions.html). In IntelliJ IDEA / Android Studio, enable them via *Settings → Editor → Code Style → Kotlin → Set from… → Kotlin style guide*.
6
+
7
+ ## Naming
8
+
9
+ | Element | Convention | Example |
10
+ |---|---|---|
11
+ | Class / object / interface | UpperCamelCase | `DeclarationProcessor` |
12
+ | Function / property / parameter | lowerCamelCase | `processDeclarations`, `isReady` |
13
+ | Constant (`const val` or top-level immutable) | SCREAMING_SNAKE_CASE | `MAX_COUNT`, `USER_NAME_FIELD` |
14
+ | Backing property | leading underscore | `_elementList` (private), `elementList` (public) |
15
+ | Package | lowercase, no underscores | `com.example.feature` |
16
+ | Test function | backticked sentence | `` `parses empty input` `` |
17
+
18
+ ## Formatting
19
+
20
+ - Indent: **4 spaces**.
21
+ - Opening brace at end of line, closing brace on its own line.
22
+ - Lambdas: spaces around braces and arrow — `list.filter { it > 10 }`, `map.forEach { (k, v) -> println("$k=$v") }`.
23
+ - Trailing comma in multi-line lists / parameters / `when` arms (since Kotlin 1.4).
24
+ - Function expression body when the body is a single expression: `fun double(x: Int) = x * 2`.
25
+
26
+ ## Required practices
27
+
28
+ - Prefer `val` over `var`. Mutability is opt-in.
29
+ - Default and named arguments replace constructor / function overloads.
30
+ - Prefer the standard library: `filter`, `map`, `groupBy`, `associateBy`, `chunked`, `windowed` over manual loops.
31
+ - String templates: `"Hello, $name"`, never `"Hello, " + name`.
32
+ - Extension functions are encouraged. Keep them `internal` or file-private unless they belong to a public API.
33
+ - `data class` for value-holders. Do not attach behaviour to them.
34
+ - Sealed classes / sealed interfaces for closed hierarchies — exhaustive `when` catches new cases at compile time.
35
+ - Null safety: never chain `!!`. Use `?.`, `?:`, `let`, `requireNotNull(x) { "why" }`, `checkNotNull(x)`.
36
+ - Use `inline` for higher-order functions that are called frequently and pass lambdas, **not** as a default optimization.
37
+ - Use `object` for true singletons and `companion object` only when JVM-visible static members are needed.
38
+
39
+ ## Coroutines
40
+
41
+ - Pick **one** asynchronous model per module. Do not mix coroutines with `CompletableFuture` chains in the same flow.
42
+ - Never `runBlocking { ... }` inside another coroutine.
43
+ - Pass `CoroutineScope` explicitly; do not use `GlobalScope`.
44
+ - Cancel scopes deterministically on lifecycle teardown.
45
+ - Use `withContext(Dispatchers.IO)` only for blocking I/O — most application code stays on the default dispatcher.
46
+
47
+ ## Tests
48
+
49
+ - Framework: **JUnit 5 + kotlin.test**, or **Kotest**. Match the project.
50
+ - Backticked function names are idiomatic for tests: `` fun `returns empty list when input is empty`() ``.
51
+ - Use `kotlinx-coroutines-test`'s `runTest` for suspend tests.
52
+ - Use **MockK** over Mockito for Kotlin (handles `final` classes, coroutines, and extension functions correctly).
53
+ - Prefer property-based tests (Kotest's `forAll`) for pure transformations.
54
+ - Assertion libraries: `kotest-assertions` (`shouldBe`, `shouldThrow`) or AssertJ.
55
+
56
+ ### Self-mock signals to refuse (rule from `clean-code.md` → Testing discipline)
57
+
58
+ - `spyk(sut)` followed by `every { sut.someMethod() } returns ...` — the SUT's own method is replaced; the test verifies wiring, not behavior.
59
+ - `coEvery { sut.suspendMethod() } returns ...` for the suspend equivalent.
60
+ - `mockkObject(SomeSingleton)` / `mockkStatic(...)` when `SomeSingleton` *is* the unit under test.
61
+ - Reflection or `internal` visibility hacks (`@VisibleForTesting`, `callPrivateFunc` extensions) used to assert on private helpers — that's the implementation-coupling smell, not just a self-mock.
62
+
63
+ What's fine: `mockk<Collaborator>()` for injected dependencies, `coEvery {}` on those collaborators, and `verify {}` on outcome boundaries (mailer, gateway, repository write).
64
+
65
+ ## Formatter / linter to run
66
+
67
+ - **ktlint** and / or **detekt**.
68
+ - `./gradlew ktlintFormat detekt` if the plugins are configured.
69
+ - Run before commit.
@@ -0,0 +1,66 @@
1
+ # Node.js Conventions (server-side)
2
+
3
+ Load this **in addition to** [javascript-typescript.md](javascript-typescript.md). JS / TS rules still apply; this file adds the server-specific layer.
4
+
5
+ ## Architecture
6
+
7
+ - Separate layers: **HTTP / transport** → **business / domain** → **data access**. No layer skips downward; no layer reaches upward.
8
+ - Request handlers are thin: parse input → call a service → format response.
9
+ - Business logic is framework-agnostic. Do **not** import Express / Fastify / Nest types in service or domain modules.
10
+ - One file, one concern. A 1000-line `index.ts` is a structural bug.
11
+ - Dependency injection (constructor or factory) over module-level singletons. Singletons are not testable.
12
+
13
+ ## Error handling
14
+
15
+ - Distinguish **operational errors** (network, validation, expected business state) from **programmer errors** (bugs, broken invariants). Recover from operational errors; crash on programmer errors.
16
+ - Use `async` / `await` with `try` / `catch` at the layer that knows what to do. Catching just to rethrow with no extra value is a smell.
17
+ - Centralise error handling in middleware. Every route returns through it.
18
+ - Always handle event-emitter `'error'` events — an unhandled emitter error crashes the process.
19
+ - Throw subclasses of `Error` (`class ValidationError extends Error {}`) so handlers can `instanceof` them.
20
+ - Process-level: `process.on('unhandledRejection', ...)` and `process.on('uncaughtException', ...)` should log and exit gracefully. Do not swallow.
21
+
22
+ ## Async discipline
23
+
24
+ - Never mix callbacks with promises in the same flow. Use `util.promisify` at the boundary, once.
25
+ - Prefer `for await...of` over manual cursor / stream loops.
26
+ - `Promise.all` for independent parallel work, `Promise.allSettled` when one failure must not cancel the rest.
27
+ - Set explicit timeouts on **every** outgoing HTTP / DB call (no infinite waits, no defaults).
28
+ - Use `AbortController` to cancel in-flight work when the upstream caller goes away.
29
+
30
+ ## Security baseline
31
+
32
+ - All configuration in environment variables (loaded via `dotenv` in dev only). Validate them at startup with `zod`, `envalid`, or similar — fail fast on missing / invalid values.
33
+ - Never log secrets, auth-endpoint request bodies, full JWTs, or PII.
34
+ - HTTPS only at the edge. Behind a TLS-terminating proxy is fine; never speak plain HTTP across the public internet.
35
+ - Use `helmet` for security headers, `cors` with an explicit allowlist (no `*` in production).
36
+ - Validate every untrusted input (`zod`, `joi`, `valibot`). Validation belongs at the boundary, not inside services.
37
+ - Rate-limit public endpoints (`express-rate-limit` or upstream proxy).
38
+ - Run `npm audit` / `pnpm audit` regularly. Patch promptly.
39
+
40
+ ## Performance
41
+
42
+ - **Streams** (`fs.createReadStream`, `pipeline`) for large payloads — never load multi-MB files fully into memory.
43
+ - Multi-core: `cluster`, PM2, or a process manager. Pure async will not save you from a single-thread CPU bottleneck.
44
+ - Keep heavy work off the event loop: request handlers must not block it. Offload CPU-bound work to a worker thread (`node:worker_threads`) or a queue.
45
+ - Connection pool for every DB client. `pg` (`Pool`) and `mysql2` (`createPool`) ship pool APIs, and `mongoose` pools connections by default; do **not** create a client per request.
46
+ - Cache idempotent reads at an explicit layer (Redis, in-memory LRU). Do not bolt caching into a service method.
47
+ - Use `Buffer`/typed arrays for binary data, not strings.
48
+
49
+ ## Logging
50
+
51
+ - Structured logger (`pino`, `winston`). Emit **JSON**, not human strings.
52
+ - Every request: request id, method, path, status, latency. Propagate the request id through downstream calls via `AsyncLocalStorage`.
53
+ - Levels: `error` (pageable), `warn` (suspicious), `info` (business event), `debug` (off in production).
54
+ - Never `console.log` in production code. `console.log` exists for ad-hoc REPL only.
55
+
56
+ ## Tests
57
+
58
+ - Unit-test services and domain functions with zero Node-runtime dependencies — no `fs`, no `net`, no real DB.
59
+ - Integration-test the HTTP layer with `supertest` against the real Express / Fastify app instance.
60
+ - `testcontainers` for DBs when the production DB has features (locks, full-text, JSONB, partitioning) you depend on. In-memory fakes lie about behaviour.
61
+ - Test the error paths. Coverage of the happy path only is not coverage.
62
+
63
+ ## Formatter / linter / tools
64
+
65
+ - Same as JS / TS: `prettier`, `eslint`, `tsc`.
66
+ - `npm run lint && npm test` (or the project's equivalent) must pass before any commit.
@@ -0,0 +1,179 @@
1
+ # Python Conventions
2
+
3
+ If a project-level Python style skill or repo-local `CONTRIBUTING.md` / `pyproject.toml` config exists, **it wins**. This file is the fallback.
4
+
5
+ ## Core principles
6
+
7
+ - Write simple, explicit, readable Python. Prefer boring, maintainable code over clever abstractions.
8
+ - Optimize for correctness first, then clarity, then performance.
9
+ - Keep functions small and single-responsibility. Prefer composition over inheritance.
10
+ - Type-hint all public functions, class methods, and non-trivial internal functions.
11
+ - Use dataclasses or Pydantic models for structured data — never loose `dict`s for domain data.
12
+ - Make side effects explicit. Fail fast with clear errors. No hidden global state.
13
+ - Code must be testable without network, filesystem, database, or time dependencies.
14
+
15
+ ## Python version
16
+
17
+ Target **Python 3.11+** unless told otherwise. Use modern syntax:
18
+
19
+ - `str | None`, not `Optional[str]`
20
+ - `list[str]` / `dict[str, int]`, not `List` / `Dict`
21
+ - `match` only when it improves clarity
22
+ - `pathlib.Path`, not string paths
23
+
24
+ ## Style guide
25
+
26
+ PEP 8, enforced by `ruff format` (or `black`).
27
+
28
+ - Indent: **4 spaces**. Line length: **88** (ruff/black default) unless the project sets otherwise.
29
+ - `snake_case` functions/variables, `PascalCase` classes, `SCREAMING_SNAKE_CASE` constants, leading `_` for private.
30
+
31
+ ## Function design
32
+
33
+ - One thing per function. Avoid bodies over ~40 lines without a strong reason.
34
+ - Explicit parameters over reading globals. Return values instead of mutating arguments.
35
+ - Avoid behavior-changing boolean flags — split into separate functions.
36
+ - Do not hide I/O inside functions that look pure. Do not catch broad exceptions unless re-raising with useful context.
37
+
38
+ Bad:
39
+
40
+ ```python
41
+ def process(data, save=True, notify=False):
42
+     ...
43
+ ```
44
+
45
+ Better:
46
+
47
+ ```python
48
+ def build_invoice(data: InvoiceInput) -> Invoice: ...
49
+ def save_invoice(invoice: Invoice) -> None: ...
50
+ def notify_invoice_created(invoice: Invoice) -> None: ...
51
+ ```
52
+
53
+ ## Typing
54
+
55
+ - Type hints everywhere meaningful. Avoid `Any` unless unavoidable; never silence a type error without a one-line comment explaining why.
56
+ - No untyped dicts for domain data — use `dataclass`, `Enum`, `TypedDict`, or Pydantic.
57
+ - `Protocol` for dependency inversion.
58
+
59
+ Bad:
60
+
61
+ ```python
62
+ def create_user(data: dict): ...
63
+ ```
64
+
65
+ Better:
66
+
67
+ ```python
68
+ @dataclass(frozen=True)
69
+ class CreateUserCommand:
70
+     email: str
71
+     name: str
72
+
73
+ def create_user(command: CreateUserCommand) -> User: ...
74
+ ```
75
+
76
+ ## Error handling
77
+
78
+ - Specific exception types with useful context. Never swallow exceptions silently.
79
+ - Do not use exceptions for normal control flow.
80
+ - Convert infrastructure errors into application/domain errors at boundaries.
81
+ - Never expose internal stack traces or secrets to users.
82
+
83
+ Bad:
84
+
85
+ ```python
86
+ try:
87
+     send_email(user)
88
+ except Exception:
89
+     pass
90
+ ```
91
+
92
+ Better:
93
+
94
+ ```python
95
+ try:
96
+     send_email(user)
97
+ except EmailProviderError as exc:
98
+     raise NotificationFailedError(f"Failed to notify user_id={user.id}") from exc
99
+ ```
100
+
101
+ ## Async
102
+
103
+ - Use async only for real I/O concurrency. Never call sync-blocking I/O inside an async function.
104
+ - Always set timeouts on external calls. No fire-and-forget tasks without explicit lifecycle + error handling.
105
+
106
+ Bad:
107
+
108
+ ```python
109
+ async def fetch():
110
+     requests.get(url)  # blocking call inside async
111
+ ```
112
+
113
+ Better:
114
+
115
+ ```python
116
+ async def fetch(client: httpx.AsyncClient, url: str) -> httpx.Response:
117
+     return await client.get(url, timeout=10)
118
+ ```
119
+
120
+ ## Project / module layout
121
+
122
+ Separate domain logic from infrastructure; keep business rules independent of frameworks, DBs, queues, and external APIs.
123
+
124
+ - `domain/` — entities, value objects, pure business logic
125
+ - `application/` — use cases, orchestration, commands, queries
126
+ - `infrastructure/` — DB, external APIs, filesystem, queues
127
+ - `interfaces/` — HTTP, CLI, workers, event handlers
128
+ - `tests/` — unit, integration, e2e
129
+
130
+ When the project is ports-and-adapters, also read [../architecture/hexagonal.md](../architecture/hexagonal.md).
131
+
132
+ ## Security
133
+
134
+ - Never hardcode secrets — read from env vars or a secret manager.
135
+ - Parameterized SQL only; never build SQL via string interpolation / f-strings.
136
+ - Validate all external input. Treat file paths, URLs, headers, and serialized input as untrusted.
137
+ - No unsafe deserialization (arbitrary `pickle.load`).
138
+
139
+ ## Database
140
+
141
+ - Explicit, reviewable SQL. Avoid N+1 queries. Wrap multi-step writes in transactions.
142
+ - Use repository interfaces (`Protocol`) when DB access should be decoupled from domain logic.
143
+ - Keep migrations backward-compatible when possible.
144
+
145
+ ## Logging
146
+
147
+ - Structured logging — never `print()` for application logs.
148
+ - Include IDs/context (request, user, job, entity). Never log secrets, tokens, passwords, cookies, API keys, or sensitive PII.
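One way to get structured logs with context IDs from the standard library alone is sketched below. `JsonFormatter`, `build_logger`, and the field names `request_id` / `user_id` are assumptions for illustration; a project may prefer a dedicated structured-logging library instead.

```python
import json
import logging
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def __init__(self, context_fields: tuple[str, ...] = ("request_id", "user_id")) -> None:
        super().__init__()
        self._context_fields = context_fields

    def format(self, record: logging.LogRecord) -> str:
        payload: dict[str, object] = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed through `extra=` become attributes on the record.
        for field in self._context_fields:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)


def build_logger(name: str, stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

Only the allow-listed context fields are copied into the output, which keeps accidental `extra=` values such as tokens out of the log line.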
149
+
150
+ ## Tests
151
+
152
+ - Unit-test domain logic; integration-test DB, external-API wrappers, and framework wiring.
153
+ - Deterministic tests — no `sleep`, no execution-order dependence, no real external services unless marked integration/e2e.
154
+ - Mock only at boundaries. Use factories/builders for test data. Test behavior, not implementation.
155
+ - Every bug fix ships a regression test when feasible.
156
+
157
+ ### Self-mock signals to refuse (rule from `clean-code.md` → Testing discipline)
158
+
159
+ - `unittest.mock.patch`-ing a method **on the class under test**, then asserting that method was called. Mock the collaborator it depends on, not the unit it *is*.
160
+ - Splitting out a helper *only so* the test can patch it, then asserting `helper.assert_called_once()` as the real check — that proves the SUT calls the helper, not that the behavior works.
161
+ - Reaching into privates (`obj._internal`) to assert on internal state instead of observable outcomes.
162
+ - A `MagicMock` standing in for the SUT itself with a `return_value` that mirrors the very thing under test.
163
+
164
+ What's fine: mocking injected dependencies (`Mock(spec=UserRepository)`, `httpx.MockTransport`), asserting on return values / raised exceptions / emitted events / boundary calls.
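A test that stays on the acceptable side of this rule could look like the sketch below: the collaborator is mocked, the unit under test is real, and the assertions target the return value and the boundary call. `Mailer` and `WelcomeService` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Protocol
from unittest.mock import Mock


class Mailer(Protocol):
    """Outcome boundary injected into the service."""

    def send(self, to: str, subject: str) -> None: ...


@dataclass
class WelcomeService:
    mailer: Mailer

    def welcome(self, email: str) -> str:
        normalized = email.strip().lower()
        self.mailer.send(to=normalized, subject="Welcome")
        return normalized


def test_welcome_normalizes_email_and_notifies() -> None:
    mailer = Mock(spec=Mailer)  # mock the injected dependency, not the SUT
    service = WelcomeService(mailer=mailer)

    result = service.welcome("  Ada@Example.com ")

    # Observable return value.
    assert result == "ada@example.com"
    # Call at the outcome boundary.
    mailer.send.assert_called_once_with(to="ada@example.com", subject="Welcome")
```

Nothing on `WelcomeService` itself is patched, so the test would fail if the normalization logic regressed.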
165
+
166
+ ## Required tooling
167
+
168
+ Run before declaring a Python change complete:
169
+
170
+ - `ruff check .` — lint
171
+ - `ruff format --check .` — formatting (or `black --check .`)
172
+ - `mypy .` or `pyright` — type check (if configured)
173
+ - `pytest` — tests
174
+
175
+ Dependency/project management: `uv`, Poetry, or `pip-tools`. Pin dependencies in application projects; reach for the standard library before adding third-party deps, and avoid importing heavy deps at module-import time when unnecessary.
176
+
177
+ ## Forbidden anti-patterns
178
+
179
+ God classes/functions, hidden global mutable state, circular imports, catch-all `except Exception` without re-raise, silent failure, untyped domain dicts, business logic in controllers/routes or coupled to ORM models, hardcoded secrets, string-formatted SQL, unbounded retries, missing timeouts on external calls, fire-and-forget async without error handling, order-dependent tests, abstractions before two concrete use cases, magic constants without names, copy-pasted logic, comments that restate obvious code, leftover `print()` debugging.
@@ -0,0 +1,105 @@
1
+ # Rust Conventions
2
+
3
+ If a project-level `rust-guidelines` skill or repo-local `CONTRIBUTING.md` exists, **it wins**. This file is the fallback.
4
+
5
+ ## Style guide
6
+
7
+ Official Rust Style Guide (doc.rust-lang.org/style-guide/), enforced by `rustfmt`. Set the edition in `rustfmt.toml`:
8
+
9
+ ```toml
10
+ style_edition = "2024"
11
+ ```
12
+
13
+ ## Naming (RFC 430)
14
+
15
+ | Element | Convention | Example |
16
+ |---|---|---|
17
+ | Crate / module | `snake_case` | `my_crate`, `parser` |
18
+ | Type / trait / enum variant | `UpperCamelCase` | `HttpClient`, `Error::Timeout` |
19
+ | Function / method / variable | `snake_case` | `send_request`, `retry_count` |
20
+ | Const / static | `SCREAMING_SNAKE_CASE` | `MAX_RETRIES` |
21
+ | Lifetime | short lowercase | `'a`, `'src` |
22
+ | Type parameter | single uppercase | `T`, `E` |
23
+
24
+ ## Formatting
25
+
26
+ - Indent: **4 spaces**.
27
+ - Max line length: **100** (`rustfmt` default).
28
+ - Always run `cargo fmt` before commit.
29
+
30
+ ## Ownership & borrowing
31
+
32
+ - Prefer borrowing (`&T`) over cloning. `.clone()` is a code smell unless you can name the reason in one sentence.
33
+ - Return owned types from constructors and "convert" methods; accept borrowed slices (`&str`, `&[T]`) as parameters.
34
+ - `&mut` only when you must mutate. Two `&` borrows are usually better than one `&mut`.
35
+ - `Cow<'_, str>` when a function sometimes allocates and sometimes does not.
36
+
37
+ ## Error handling
38
+
39
+ - **Library** crates: define a typed error with `thiserror`. One `Error` enum per crate, one variant per failure mode.
40
+ - **Application** crates: `anyhow::Result<T>` at the top level is fine; convert to typed errors at module boundaries.
41
+ - **No `unwrap()` / `expect()` in production paths.** Acceptable in:
42
+   - Tests.
43
+   - `main` for setup that genuinely cannot fail.
44
+   - `expect("invariant: ...")` when the message documents the invariant.
45
+ - Use the `?` operator. Do not write `match`-on-`Result` ladders.
46
+
47
+ ## Idioms
48
+
49
+ - Iterators (`map`, `filter`, `collect`, `fold`, `find`) over indexed loops.
50
+ - `if let` / `let else` over `match` with a single arm.
51
+ - `#[derive(Debug, Clone, ...)]` aggressively. Add `PartialEq` / `Eq` / `Hash` when the type lands in a collection.
52
+ - Newtype small primitives that carry meaning: `struct UserId(u64);`, not raw `u64`.
53
+ - `mut` and `pub` only when needed. Start private, widen on demand.
54
+ - Use `From` / `TryFrom` for conversions, not `parse_x_from_y` helpers.
55
+ - Pattern-match deeply (`if let Some(User { id, .. }) = user`) rather than chained `.unwrap().unwrap()`.
56
+
57
+ ## Async / Tokio
58
+
59
+ - Pick **one** runtime per binary (`tokio` is standard). Never mix `tokio` and `async-std`.
60
+ - Never `block_on` inside an async function.
61
+ - Spawn tasks deliberately; keep the `JoinHandle`s and `.await` them.
62
+ - `tokio::select!` for races; never poll futures by hand.
63
+ - Long CPU work goes in `tokio::task::spawn_blocking`.
64
+ - Cancellation: use `CancellationToken` (tokio-util) for cooperative shutdown.
65
+
66
+ ## Unsafe
67
+
68
+ - Every `unsafe` block needs a comment documenting **every** invariant the caller must uphold.
69
+ - New `unsafe` requires a second pair of eyes on review.
70
+ - Default to safe abstractions (`bytes`, `parking_lot`, `crossbeam`) before reaching for `unsafe`.
71
+
72
+ ## Module layout
73
+
74
+ - Public API at the crate root via `pub use`; implementation in private modules.
75
+ - One concept per file. Split a module before it passes ~500 lines.
76
+ - Unit tests in `mod tests { use super::*; ... }` at the bottom of the file.
77
+ - Integration tests in the `tests/` directory.
78
+
79
+ ## Tests
80
+
81
+ - Unit: `#[test]` inside `mod tests`. Use `assert_eq!`, `assert!`, `assert_matches!`.
82
+ - Async: `#[tokio::test]` or `#[test]` with an explicit `tokio::runtime::Runtime` when you need control.
83
+ - Property: `proptest` or `quickcheck` for pure transformations.
84
+ - **Doc tests**: runnable examples in doc comments are tests — keep them green.
85
+ - Use `pretty_assertions::assert_eq!` for readable diffs on large structs.
86
+
87
+ ### Self-mock signals to refuse (rule from `clean-code.md` → Testing discipline)
88
+
89
+ Rust's trait-based DI makes self-mocking rarer than in JVM/JS, but it still happens:
90
+
91
+ - Using `mockall::mock!` (or `automock`) to generate a mock of the **same struct under test**, then asserting on its own methods. The unit under test must be the real impl; mock the trait it depends on, not the trait it *is*.
92
+ - Splitting a behavior into a helper trait *only so* the test can stub it, then expecting `expect_helper().returning(...)` to be the real assertion. The test now proves the SUT calls the helper, not that the behavior works.
93
+ - Reaching into privates via `pub(crate)` widening, `#[cfg(test)] pub` shortcuts, or test-only modules that expose internal state to assert on.
94
+ - A test that constructs the SUT, replaces one of its trait-object dependencies with a mock whose `returning(...)` mirrors the very thing being tested.
95
+
96
+ What's fine: `mockall` mocks of injected trait dependencies (`MockHttpClient`, `MockUserRepository`), `unwrap()` in tests for setup that genuinely cannot fail, `assert_eq!` on returned values, `assert_matches!` on returned `Result` / `Option`.
97
+
98
+ ## Required tooling
99
+
100
+ Run before any commit:
101
+
102
+ - `cargo fmt --all` — formatting.
103
+ - `cargo clippy --all-targets --all-features -- -D warnings` — lint (warnings are errors).
104
+ - `cargo check --all-targets` — fast type check.
105
+ - `cargo test --all` — tests.
@@ -0,0 +1,68 @@
1
+ # SQL Conventions
2
+
3
+ Defaults below target PostgreSQL. MySQL-specific notes are marked.
4
+
5
+ ## Formatting / style
6
+
7
+ - Keywords **lowercase**: `select`, `from`, `where`, `join`. (Some teams prefer uppercase — match the file you are editing; never mix in one statement.)
8
+ - All identifiers `snake_case`: tables, columns, indexes, constraints.
9
+ - Always use explicit `as` for column aliases: `select count(*) as user_count`.
10
+ - Always state the join type: `inner join`, `left join`, `cross join`. **Never** a bare `join`.
11
+ - One column per line in `select`; one table or join per line in `from`.
12
+ - Prefer **CTEs** (`with foo as (...)`) over deeply nested subqueries.
13
+ - Dates / timestamps in ISO 8601: `'2026-05-17'`, `'2026-05-17T12:00:00Z'`.
14
+
15
+ ## Schema design
16
+
17
+ - Every table has an `id` primary key.
18
+   - PostgreSQL: `id bigint generated always as identity primary key`.
19
+   - MySQL: `id bigint unsigned not null auto_increment primary key`.
20
+ - Foreign keys named `<referenced_table>_id`: `user_id`, `order_id`.
21
+ - Singular table names (`user`, `order`, `invoice_line`) unless the project already uses plural — match, do not mix.
22
+ - `comment on table ... is '...'` for every table; `comment on column ...` for non-obvious columns. (MySQL: `comment '...'` inline.)
23
+ - Timestamps: `created_at`, `updated_at`, both `timestamptz` in Postgres. Keep timezone discipline explicit — store UTC, render in the user's zone at the edge.
24
+ - Soft delete: `deleted_at timestamptz null`. Index it if you query on it. Add a partial index `where deleted_at is null` for hot reads.
25
+ - Use `not null` aggressively. `null` should mean "unknown", not "default".
26
+ - Use enum or `check` constraints over free-text status columns.
27
+ - Composite indexes: column order matches your `where` / `order by` patterns; left-most prefix wins.
28
+
29
+ ## Migrations
30
+
31
+ - One change per migration file. Name: `<timestamp>_<verb>_<noun>.sql` (e.g. `20260517_120000_add_user_deleted_at.sql`).
32
+ - Migrations are **forward-only** in production. Write a "down" migration only if your tool requires it and you genuinely intend to run it.
33
+ - Splitting a destructive change:
34
+   1. Add nullable column.
35
+   2. Backfill (separate deploy).
36
+   3. Add `not null` / index `concurrently`.
37
+   4. Drop the old column.
38
+ - `create index concurrently` in Postgres for hot tables — avoids the write lock.
39
+ - Application-deploy migrations **never** destroy data. Destructive operations are a separate, explicit step gated on backups.
40
+
41
+ ## Query practices
42
+
43
+ - Always filter on indexed columns for OLTP queries. Verify with `explain` / `explain analyze`.
44
+ - Never `select *` in application code (migrations and ad-hoc inspection are fine).
45
+ - Parameterised queries from the application layer. String concatenation is SQL injection.
46
+ - `limit` every query that could return an unbounded set.
47
+ - `returning *` (Postgres) on `insert` / `update` / `delete` instead of a follow-up `select`.
48
+ - For pagination, prefer keyset (`where id > $last_seen`) over offset on large tables.
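Several of the query practices above (explicit columns, bound parameters, a `limit`, keyset pagination) can be combined in one small sketch. It uses Python's built-in `sqlite3` only so that it runs anywhere; the `app_user` table and both helper functions are hypothetical, and a Postgres driver would use its own placeholder style rather than `?`.

```python
import sqlite3
from collections.abc import Iterator

Row = tuple[int, str]


def fetch_page(conn: sqlite3.Connection, last_seen_id: int, page_size: int) -> list[Row]:
    """Return the next page of rows whose id is greater than last_seen_id."""
    cursor = conn.execute(
        # Explicit columns, bound parameters, and a limit on the result set.
        "select id, email from app_user where id > ? order by id limit ?",
        (last_seen_id, page_size),
    )
    return cursor.fetchall()


def iter_pages(conn: sqlite3.Connection, page_size: int) -> Iterator[list[Row]]:
    """Walk the table page by page, assuming ids are positive integers."""
    last_seen_id = 0
    while True:
        page = fetch_page(conn, last_seen_id, page_size)
        if not page:
            return
        yield page
        last_seen_id = page[-1][0]  # the last id on this page is the next cursor
```

Each page is located through the primary-key index, so the cost of a page does not grow with how deep into the table it is, unlike `offset`.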
49
+
50
+ ## MySQL-specific
51
+
52
+ - Engine: **InnoDB** (default). Charset: `utf8mb4`, collation `utf8mb4_0900_ai_ci` (8.0+).
53
+ - Quote identifiers with backticks when they collide with reserved words.
54
+ - `select ... for update` only inside an explicit transaction.
55
+ - `on duplicate key update` for upserts; in Postgres use `on conflict (...) do update`.
56
+
57
+ ## Tests
58
+
59
+ - Run query tests against a **real** database via `testcontainers` or a disposable schema. In-memory SQLite is not equivalent to Postgres / MySQL — different SQL dialect, different isolation, different `null` semantics.
60
+ - Migrations themselves must be tested: apply from an empty schema in CI; assert the resulting structure.
61
+ - Seed test data via factory functions (one per table), not raw `insert` statements copy-pasted into each test.
62
+ - Verify performance assumptions with `explain` in CI for queries flagged as hot.
63
+
64
+ ## Tools
65
+
66
+ - Formatting: `pg_format`, `sqlfluff`, `sql-formatter`.
67
+ - Linting / safety: `squawk` (Postgres migration linter — catches missing `concurrently`, locking foot-guns, etc.).
68
+ - Schema diff: `migra` (Postgres), `mysqldiff` (MySQL).
@@ -56,7 +56,7 @@ user-invocable: false
56
56
  | `workflow.awaitingApproval` | Approval wait marker |
57
57
  | `workflow.routingStatus` | Routing decision status |
58
58
  | `workflow.lastSafeCheckpoint` | Safe resume checkpoint metadata |
59
- | `instructionSetPath` | Path to the `instruction-set/` **directory** containing `analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `task-brief.md`, `final-report-template.md` (see Step 4). Not a single-file path. |
59
+ | `instructionSetPath` | Path to the `instruction-set/` **directory** containing `analysis-packet.md`, `analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `task-brief.md`, `final-report-template.md` (see Step 4). Not a single-file path. |
60
60
  | `referenceExpectationsPath` | config/deployment expectation artifact path |
61
61
  | `latestRunPath` | latest run path |
62
62
  | `latestRunStatus` | latest run status |
@@ -78,6 +78,7 @@ After identifying the task root in `task-manifest.json`, derive all paths accord
78
78
  ├── task-index.md (human-readable summary, non-canonical)
79
79
  ├── instruction-set/
80
80
  │ ├── analysis-profile.md (analysis guide by task type)
81
+ │ ├── analysis-packet.md (primary compact input for analysis workers)
81
82
  │ ├── analysis-material.md (analysis materials)
82
83
  │ ├── reference-expectations.md (config/deployment expected values)
83
84
  │ ├── task-brief.md (task brief)
@@ -116,13 +117,18 @@ After identifying the task root in `task-manifest.json`, derive all paths accord
116
117
 
117
118
  ## Step 4: Instruction Set Reading Order
118
119
 
119
- After verifying `task-manifest.json`, read the instruction set in the following order:
120
+ After verifying `task-manifest.json`, read only the compact intake files needed for the current action. Do not bulk-read the whole instruction-set directory.
120
121
 
121
122
  1. `instruction-set/analysis-profile.md` (analysis guide by task type)
122
- 2. `instruction-set/analysis-material.md` (analysis materials, if available)
123
- 3. `instruction-set/reference-expectations.md` (config/deployment expected values)
124
- 4. `instruction-set/task-brief.md` (task brief)
125
- 5. `instruction-set/final-report-template.md` (report template)
123
+ 2. `instruction-set/analysis-packet.md` (primary compact input for analysis workers)
124
+ 3. `runs/<task-type>/state/active-run-context-<task-type>-<seq>.json` if present (compact current-run path/worker snapshot)
125
+
126
+ Read source files lazily:
127
+
128
+ - `instruction-set/task-brief.md` only for reporter-confirmation checks, source verification, or report-writer synthesis.
129
+ - `instruction-set/analysis-material.md` only when packet content is insufficient or a source citation needs verification.
130
+ - `instruction-set/reference-expectations.md` for report-writer synthesis or when the packet's expectation extract is insufficient.
131
+ - `instruction-set/final-report-template.md` only for report-writer authoring.
126
132
 
127
133
  ### Brief Reporter-Confirmation Precondition (BLOCKING)
128
134
 
@@ -77,7 +77,7 @@ Read the worker result files generated in Phase 4/5 and extract individual findi
77
77
  - For bullet/numbered findings, parse `[TICKETID: <id>]` from the item title.
78
78
  - Items with multiple tickets (e.g. `TICKET-123, TICKET-456`) expand to a set of ticket keys.
79
79
  - Items tagged `unknown` keep the literal `unknown` as their ticket key.
80
- 2. For each finding, record the summary, evidence (file path, line number, basis), the worker who identified it, **the worker-internal item ID assigned by that worker** (e.g. `F-001`, `1.1`, `F-3` — see `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT), and the parsed ticket set. The item ID is persisted on the finding record as `findings[].discoveredBy.<worker>.itemId` and on each cross-worker confirmation as `findings[].sourceItems[]` (one entry per contributing `<worker>:<item-id>` pair). The final-report's `## 1.1 Consensus` / `## 1.2 Differences` / `## 3.1 Primary Evidence` tables read this list verbatim into their `Source items` columns — without this, the synthesised `C-NNN` row has no traceable link back to the original worker wording.
80
+ 2. For each finding, record the summary, evidence (file path, line number, basis), the worker who identified it, **the worker-internal item ID assigned by that worker** (e.g. `F-001`, `1.1`, `F-3` — see `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT), and the parsed ticket set. The item ID is persisted on the finding record as `findings[].discoveredBy.<worker>.itemId` and on each cross-worker confirmation as `findings[].sourceItems[]` (one entry per contributing `<worker>:<item-id>` pair). The final-report's `## 6.1 Consensus` / `## 6.2 Differences` / `## 2.1 Primary Evidence` tables read this list verbatim into their `Source items` columns — without this, the synthesised `C-NNN` row has no traceable link back to the original worker wording.
81
81
  3. Claude Lead groups findings based on semantic similarity AND ticket-set equality:
82
82
  - Same semantics + same ticket set across 2+ workers → immediately reach `full consensus`.
83
83
  - Same semantics but disjoint ticket sets → keep as separate groups (do NOT over-merge across tickets).
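The ticket parsing and the "same semantics AND same ticket set" grouping rule can be sketched as follows. This is an illustration only: the semantic-similarity judgement is made by Claude Lead, so the sketch assumes it has already been reduced to a hypothetical `semantic_key` field, and the dictionary shape of a finding is likewise assumed. Treating a title without a `[TICKETID: ...]` tag as an empty ticket set is also an assumption; the text above does not cover that case.

```python
import re
from collections import defaultdict

# Matches the `[TICKETID: <id>]` tag in an item title.
TICKET_TAG = re.compile(r"\[TICKETID:\s*([^\]]+)\]")


def parse_ticket_set(title):
    """Expand the ticket tag into a set of ticket keys."""
    match = TICKET_TAG.search(title)
    if match is None:
        return frozenset()  # assumption: untagged titles yield an empty set
    parts = (part.strip() for part in match.group(1).split(","))
    # `unknown` is kept as a literal ticket key, like any other value.
    return frozenset(part for part in parts if part)


def group_findings(findings):
    """Group by (semantic key, ticket set); never merge across ticket sets."""
    groups = defaultdict(list)
    for finding in findings:
        key = (finding["semantic_key"], parse_ticket_set(finding["title"]))
        # Source items are recorded as `<worker>:<item-id>` pairs.
        groups[key].append(f"{finding['worker']}:{finding['item_id']}")
    return dict(groups)


# Hypothetical worker output.
findings = [
    {"worker": "claude", "item_id": "F-001", "semantic_key": "null-deref",
     "title": "Null check missing [TICKETID: TICKET-123, TICKET-456]"},
    {"worker": "codex", "item_id": "1.1", "semantic_key": "null-deref",
     "title": "Missing null guard [TICKETID: TICKET-456, TICKET-123]"},
    {"worker": "gemini", "item_id": "F-3", "semantic_key": "null-deref",
     "title": "Null guard absent [TICKETID: TICKET-789]"},
]
groups = group_findings(findings)
```

Using a `frozenset` as part of the key makes the ticket order in the title irrelevant, while findings with the same semantics but different tickets stay in separate groups.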
@@ -561,12 +561,12 @@ existing Acceptance Blocker. If you find none, say so explicitly.
561
561
  ### Verification — confirm-or-downgrade (BLOCKING)
562
562
 
563
563
  Each candidate blocker is verified by the Phase 4 analysers (excluding the critic). Do NOT use the adversarial finding classifier's "uncertain → reject" rule here.
564
- - **Confirmed** (an analyser reproduces it or cites supporting evidence) → promote to a `## 4 Acceptance Blockers` row (keep severity + recommended follow-up phase).
564
+ - **Confirmed** (an analyser reproduces it or cites supporting evidence) → promote to a `## 5.8 Acceptance Blockers` row (keep severity + recommended follow-up phase).
565
565
  - **Not confirmed** (cannot reproduce, or evidence is weak) → **downgrade to a Residual Risk row — never drop it.** Record the escalation trigger so the user can re-judge a high-severity-but-unconfirmed candidate.
566
566
 
567
567
  ### Verdict impact
568
568
 
569
- Promoted blockers enter `## 4 Acceptance Blockers`; since `accepted` requires zero blockers, the verdict moves to `conditional-accept` / `blocked` automatically. The existing verdict↔blocker consistency validator (`validators/validate-run.py` `_validate_final_verification_consistency`) enforces this unchanged — no new enum or validator.
569
+ Promoted blockers enter `## 5.8 Acceptance Blockers`; since `accepted` requires zero blockers, the verdict moves to `conditional-accept` / `blocked` automatically. The existing verdict↔blocker consistency validator (`validators/validate-run.py` `_validate_final_verification_consistency`) enforces this unchanged — no new enum or validator.
570
570
 
571
571
  ### State
572
572
 
@@ -630,7 +630,7 @@ Default values are emitted into the manifest by `scripts/okstra_ctl/render.py` (
630
630
 
631
631
  ### Plan-item extraction (Round 0 equivalent)
632
632
 
633
- From the report-writer's draft of `## 4.5 Implementation Plan Deliverables`, lead extracts plan items with the following prefixes (see also `templates/reports/final-report.template.md` §4.5.9):
633
+ From the report-writer's draft of `## 5.5 Implementation Plan Deliverables`, lead extracts plan items with the following prefixes (see also `templates/reports/final-report.template.md` §5.5.9):
634
634
 
635
635
  | Prefix | Source sub-section | One row per |
636
636
  |--------|--------------------|-------------|
@@ -689,13 +689,13 @@ Plan-body verification stays **lightweight** even under this posture — the `ve
689
689
  - all dispatches non-result → `aborted-non-result`
690
690
  - any `partial-consensus` / `dissent-isolated` present, no `majority-disagree` → `passed-with-dissent`
691
691
  - all items `full-consensus` → `passed`
692
- 6. Lead writes `runs/<task-type>/state/plan-body-verification-<task-type>-<seq>.json` (schema below) and populates `### 4.5.9 Plan Body Verification` in the final report (template at `templates/reports/final-report.template.md`). The §4.5.9 body uses a single `#### Verdict details` table (`Plan item / Worker / Verdict / Breakage kind / Note` — one row per plan-item × worker pair). The older wide `| Plan item | <worker1> | <worker2> | … | Classification |` matrix and the former narrow `#### Verdict summary` card are both removed — the matrix scaled horizontally with the worker count, and the summary only restated per-item classifications already derivable from the details table. The validator's `Plan Body Verification` + `Gate result:` substring checks gate this section.
693
- 7. For every `majority-disagree` item, lead adds a row to `## 5. Clarification Items` with:
692
+ 6. Lead writes `runs/<task-type>/state/plan-body-verification-<task-type>-<seq>.json` (schema below) and populates `### 5.5.9 Plan Body Verification` in the final report (template at `templates/reports/final-report.template.md`). The §5.5.9 body uses a single `#### Verdict details` table (`Plan item / Worker / Verdict / Breakage kind / Note` — one row per plan-item × worker pair). The older wide `| Plan item | <worker1> | <worker2> | … | Classification |` matrix and the former narrow `#### Verdict summary` card are both removed — the matrix scaled horizontally with the worker count, and the summary only restated per-item classifications already derivable from the details table. The validator's `Plan Body Verification` + `Gate result:` substring checks gate this section.
693
+ 7. For every `majority-disagree` item, lead adds a row to `## 1. Clarification Items` with:
694
694
  - new `C-<N>` ID (numbering continues from any existing rows)
695
695
  - `Statement` summarising the disagreement and the worker breakage `<kind>`
696
696
  - `Kind` chosen per the standard policy (usually `decision` for option-level conflicts, `data-point` for path/symbol mismatches)
697
697
  - `Blocks=approval`
698
- - the §4.5.9 verdict table's `Classification` column for that row reads `majority-disagree → C-<N>` (1:1 ID match — orphan on either side is a contract violation per `prompts/profiles/implementation-planning.md` self-review step 6).
698
+ - the §5.5.9 verdict table's `Classification` column for that row reads `majority-disagree → C-<N>` (1:1 ID match — orphan on either side is a contract violation per `prompts/profiles/implementation-planning.md` self-review step 6).
699
699
  8. The top-of-report `- [ ] Approved` marker line is rendered if and only if the Gate result is `passed` or `passed-with-dissent`. `validators/validate-run.py` `validate_phase_boundary` enforces this correspondence; manually adding the marker line when the gate did not pass is a contract violation.
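The gate rules visible in this hunk, together with the marker rule in step 8, can be summarised in a small sketch. It is not the validator's implementation: the function names and argument shapes are hypothetical, and the outcome for a `majority-disagree` item is not shown in this excerpt, so the sketch returns `None` for it rather than guessing.

```python
def gate_result(item_classifications, dispatch_count, non_result_count):
    """Derive the Gate result from the rules shown in this excerpt only."""
    # At least one dispatch was issued and every dispatch returned non-result.
    if dispatch_count > 0 and non_result_count == dispatch_count:
        return "aborted-non-result"
    # The rule for majority-disagree is outside this excerpt (assumption: unknown).
    if "majority-disagree" in item_classifications:
        return None
    if any(c in ("partial-consensus", "dissent-isolated") for c in item_classifications):
        return "passed-with-dissent"
    if item_classifications and all(c == "full-consensus" for c in item_classifications):
        return "passed"
    return None


def renders_approved_marker(gate):
    """The `- [ ] Approved` line appears only for the two passing results."""
    return gate in ("passed", "passed-with-dissent")


print(gate_result(["full-consensus", "dissent-isolated"], 3, 0))
print(renders_approved_marker("aborted-non-result"))
```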
700
700
 
701
701
  ### `plan-body-verification-<task-type>-<seq>.json` schema
@@ -802,4 +802,4 @@ Mirrors finding convergence (§"Worker failure handling in reverify"). Concretel
802
802
 
803
803
  - A dispatch that returns terminal non-result MUST NOT be aggregated as `DISAGREE`.
804
804
  - If at least one dispatch was issued AND **all** plan-body dispatches return non-result, the Gate result is `aborted-non-result`. Record one `contract-violation` event per non-result dispatch.
805
- - When the gate is `aborted-non-result`, report-writer MUST keep the frontmatter `approved: false` (publishing `approved: true` under this gate result is a validator failure). A single row is added to `## 5. Clarification Items` with `Statement="plan-body verification could not run — all workers returned non-result"`, `Kind=decision`, `Blocks=approval`, allowing the user to either retry the phase or override by manually flipping the frontmatter to `approved: true` (or running `--approve` on the resume command).
805
+ - When the gate is `aborted-non-result`, report-writer MUST keep the frontmatter `approved: false` (publishing `approved: true` under this gate result is a validator failure). A single row is added to `## 1. Clarification Items` with `Statement="plan-body verification could not run — all workers returned non-result"`, `Kind=decision`, `Blocks=approval`, allowing the user to either retry the phase or override by manually flipping the frontmatter to `approved: true` (or running `--approve` on the resume command).