npm - @harness-engineering/cli - Versions diffs - 1.23.1 → 1.23.2 - Mend

@harness-engineering/cli 1.23.1 → 1.23.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (423)

package/dist/agents/skills/gemini-cli/harness-compliance/SKILL.md CHANGED

@@ -288,6 +288,16 @@ Phase 4: REPORT
       7. Add automated HIPAA compliance regression tests to CI pipeline
 ```
+## Rationalizations to Reject
+| Rationalization                                                                 | Reality                                                                                                                                                                                                                                       |
+| ------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We're not in the EU so GDPR doesn't apply to us"                               | GDPR applies to any organization that processes data of EU residents, regardless of where the organization is based. If a single EU user can sign up, GDPR scope must be assessed.                                                            |
+| "Our lawyers will handle the compliance questions — just document what we have" | Legal review and technical implementation are distinct. Lawyers cannot attest that Article 17 deletion cascades to S3 and Segment. The technical implementation must be audited separately.                                                   |
+| "We already did a SOC2 audit last year — this codebase is the same"             | SOC2 Type II assesses controls over time. Adding a new data store, third-party processor, or API endpoint can invalidate previous control attestations. Audits are point-in-time snapshots, not permanent certificates.                       |
+| "The audit isn't for three months — we can fix the gaps before then"            | Gaps found now require implementation, testing, and evidence collection time. Auditors expect evidence of sustained control operation, not freshly deployed fixes. A gap fixed the week before an audit is still a finding.                   |
+| "That field is technically a username, not PII"                                 | Data classification cannot be done by naming convention. A username combined with any other identifying field (email, IP, phone) is PII under GDPR. Classification must be based on the realistic re-identification risk, not the field name. |
 ## Gates
 - **No compliance report without data classification.** A compliance audit that does not inventory and classify data fields is incomplete. The classification matrix must be produced before controls can be meaningfully assessed. Without knowing what data exists and where, control checks are theoretical.

package/dist/agents/skills/gemini-cli/harness-containerization/SKILL.md CHANGED

@@ -269,6 +269,16 @@ Phase 4: VALIDATE
   Result: PASS -- well-configured container setup
 ```
+## Rationalizations to Reject
+| Rationalization                                                                       | Reality                                                                                                                                                                                                                                                                                                           |
+| ------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We use `latest` in production because we always want the most recent image"          | `latest` is a mutable pointer. A deployment at 9am and a rollback at 9pm may pull different image contents with the same tag. Immutable tags (semver or digest) are required for reproducible deployments and reliable rollbacks.                                                                                 |
+| "The container runs as root because it needs to bind to port 80"                      | Binding to a privileged port requires elevated capability, not full root access. The correct solution is to run the container as a non-root user and use `--cap-add NET_BIND_SERVICE`, or to expose a high port and use a load balancer or ingress for port translation.                                          |
+| "We don't set resource limits because we want the container to use whatever it needs" | Containers without memory limits can exhaust node memory, triggering OOM kills of other containers on the same node. Kubernetes uses resource requests for scheduling and limits for safety; omitting limits transfers the risk of one container onto the entire node.                                            |
+| "Our image is 2GB but it only takes a few seconds to pull in our CI"                  | Image size multiplies across every developer pull, every CI run, and every Kubernetes pod startup. A 2GB image that takes 3 seconds to pull in CI with a warm cache takes 90 seconds on a cold node during an autoscaling event at peak traffic.                                                                  |
+| "We don't need liveness and readiness probes — the container exits if it crashes"     | Process exit is a coarse health signal. A process that is running but deadlocked, stuck in an infinite retry loop, or unable to connect to its database will never exit. Kubernetes will continue routing traffic to it. Readiness probes prevent traffic routing to unhealthy containers that are still running. |
 ## Gates
 - **No `latest` tag in production manifests.** Production Kubernetes manifests or compose files using `latest` image tags are blocking findings. Immutable tags or digests are required.
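
The `latest` gate above can be approximated with a small check over image references. This is a minimal sketch, not the skill's actual implementation: `checkImageRef` and `TagFinding` are hypothetical names, and the parsing covers only the common `name:tag` and `name@sha256:digest` forms.

```typescript
// Hypothetical helper: classifies a container image reference as pinned or mutable.
type TagFinding = { image: string; pinned: boolean; reason: string };

function checkImageRef(image: string): TagFinding {
  // A digest reference (name@sha256:<64 hex chars>) is immutable by construction.
  if (/@sha256:[0-9a-f]{64}$/.test(image)) {
    return { image, pinned: true, reason: "digest" };
  }
  // The tag, if present, follows a ':' that comes after the last '/'.
  // A ':' before the last '/' belongs to a registry port such as localhost:5000.
  const lastSlash = image.lastIndexOf("/");
  const lastColon = image.lastIndexOf(":");
  const tag = lastColon > lastSlash ? image.slice(lastColon + 1) : "";
  if (tag === "") {
    return { image, pinned: false, reason: "no tag (resolves to latest)" };
  }
  if (tag === "latest") {
    return { image, pinned: false, reason: "latest tag" };
  }
  return { image, pinned: true, reason: "explicit tag" };
}
```

An explicit tag can still be re-pushed to different contents, so a stricter variant of this check would accept digests only.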

package/dist/agents/skills/gemini-cli/harness-data-pipeline/SKILL.md CHANGED

@@ -259,6 +259,16 @@ Phase 4: DOCUMENT
   Quality Report: FAIL (2 errors requiring immediate attention)
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                   | Reality                                                                                                                                                                                                                                                                                            |
+| --------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The pipeline failed halfway through — we'll just re-run it and it'll pick up where it left off."                                 | A non-idempotent pipeline that is re-run from the middle writes duplicate records for the portion that succeeded before failure. The correct fix is to make the pipeline idempotent (MERGE, upsert, or delete-then-insert) so re-runs are always safe, not to assume partial re-runs are harmless. |
+| "The model has no dbt tests yet, but it's only used in one dashboard — low risk."                                                 | Every untested model is a silent data quality failure waiting to reach a stakeholder. Revenue and user-facing models require test coverage regardless of how few consumers they have today. The number of consumers grows; the coverage does not add itself retroactively.                         |
+| "We're still figuring out the schema — we'll add data contracts once the model stabilizes."                                       | Contracts are most valuable during schema evolution, not after it. An unstable schema without a contract lets breaking changes propagate undetected to downstream consumers. Add the contract as the model is defined; update it explicitly as the schema changes. That explicitness is the value. |
+| "Circular dependency detection is handled by the orchestrator — I don't need to check for it during design."                      | Orchestrators detect circular dependencies at runtime, after the DAG has been deployed. Static analysis during design catches them before deployment, before the pipeline fails at 3am, and before engineers have to diagnose a graph cycle under pressure. Detect them early.                     |
+| "The freshness check is too strict — it keeps alerting because the upstream source is occasionally delayed. I'll just remove it." | A freshness check that fires too often has the wrong threshold. Removing it means stale data reaches analysts silently. Adjust the `warn_after` and `error_after` thresholds to match the source's actual SLA, and escalate if the source cannot meet its own SLA.                                 |
 ## Gates
 - **No approving non-idempotent production pipelines.** If a pipeline writes data without MERGE, upsert, or delete-then-insert patterns, it is flagged as an error. Non-idempotent pipelines cause data duplication on re-runs.
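
To illustrate why the gate above treats idempotency as blocking, the sketch below contrasts an append-style load with an upsert keyed on a primary key. It uses an in-memory array in place of a real table; `Row`, `appendLoad`, and `upsertLoad` are hypothetical names chosen for the example.

```typescript
type Row = { id: string; amount: number };

// Append-style load: re-running the same batch duplicates every row.
function appendLoad(table: Row[], batch: Row[]): Row[] {
  return [...table, ...batch];
}

// Upsert-style load keyed on `id`: re-running the same batch leaves the table unchanged.
function upsertLoad(table: Row[], batch: Row[]): Row[] {
  const byId = new Map<string, Row>();
  for (const row of table) byId.set(row.id, row);
  for (const row of batch) byId.set(row.id, row); // later rows overwrite earlier ones
  return Array.from(byId.values());
}

const batch: Row[] = [
  { id: "a", amount: 1 },
  { id: "b", amount: 2 },
];

// Simulate a failed run followed by a full re-run of the same batch.
const appendedTwice = appendLoad(appendLoad([], batch), batch);
const upsertedTwice = upsertLoad(upsertLoad([], batch), batch);
```

After the re-run, `appendedTwice` holds four rows while `upsertedTwice` still holds two, which is the property a SQL `MERGE` or delete-then-insert provides in a real warehouse.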

package/dist/agents/skills/gemini-cli/harness-data-validation/SKILL.md CHANGED

@@ -328,6 +328,16 @@ await producer.send({ topic: 'order-events', messages: [{ value: JSON.stringify(
 const event = orderPlacedSchema.parse(JSON.parse(message.value));
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                 | Reality                                                                                                                                                                                                                                                                                                                                                                           |
+| ------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "TypeScript already types the request body — we don't need runtime Zod validation on top of that."                              | TypeScript types are erased at runtime. `req.body as CreateUserInput` compiles fine and accepts any payload at runtime. A missing required field, a string where a number is expected, or an injected extra field bypasses TypeScript entirely. Runtime validation is not redundant with types — it is the only enforcement that exists when the application is actually running. |
+| "We trust this internal service — we don't need to validate its message payloads."                                              | Trust boundaries are not about intent; they are about reliability. Internal services change their schemas, deploy independently, and have bugs. A consumer that accepts payloads without validation silently processes malformed data and produces corrupted downstream records. Validate every message that crosses a process boundary, regardless of who sent it.               |
+| "The validation error message just says 'invalid input' — the developer can look at the schema to understand what failed."      | Developers are not the only consumers of validation errors. Frontend applications display them, monitoring systems alert on them, and support teams diagnose them. A message that says `{"field":"email","expected":"string email","received":"null"}` is resolved in seconds. "Invalid input" creates a support ticket.                                                          |
+| "The two services define their own schemas independently but they've been in sync so far — shared contracts are overkill."      | "In sync so far" describes luck, not process. Independent schema definitions diverge at the next feature sprint when one team changes a field name. Shared contracts in a common package make schema drift a compile-time error instead of a runtime mystery. The divergence between `userId` and `customerId` in the same event is exactly what independent definitions produce. |
+| "Environment variable validation at startup is unnecessary — if a variable is missing, the app will fail when it's first used." | Failing at the first usage of a missing variable produces a cryptic error deep in the call stack, often after the app has been running for minutes and has processed real requests. Failing at startup produces a clear error with the variable name, before any requests are served. Fast failure is always better than deferred failure.                                        |
 ## Gates
 - **No type assertions on external data.** WHERE `as` is used to cast data from an API response, message payload, request body, or `JSON.parse` result, THEN the skill must flag it as a trust boundary violation. Type assertions bypass runtime validation entirely. The only acceptable pattern is runtime validation followed by type inference.
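
The pattern the gate describes, runtime validation followed by type inference, can be sketched without any library. The skill's own examples use Zod; the hand-written guard below is a dependency-free stand-in, and `CreateUserInput`'s fields, `parseCreateUserInput`, and the error shape are assumptions made for illustration.

```typescript
type CreateUserInput = { email: string; age: number };
type FieldError = { field: string; expected: string; received: string };
type ParseResult =
  | { ok: true; value: CreateUserInput }
  | { ok: false; errors: FieldError[] };

// Names the runtime kind of a value for use in error messages.
function kindOf(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Type predicate instead of an `as` cast: the narrowing is backed by a runtime check.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseCreateUserInput(body: unknown): ParseResult {
  if (!isRecord(body)) {
    return { ok: false, errors: [{ field: "(root)", expected: "object", received: kindOf(body) }] };
  }
  const email = body.email;
  const age = body.age;
  const emailOk = typeof email === "string" && email.includes("@");
  const ageOk = typeof age === "number" && Number.isFinite(age);
  if (typeof email === "string" && typeof age === "number" && emailOk && ageOk) {
    // Both fields are narrowed here, so no assertion is needed.
    return { ok: true, value: { email, age } };
  }
  const errors: FieldError[] = [];
  if (!emailOk) errors.push({ field: "email", expected: "string email", received: kindOf(email) });
  if (!ageOk) errors.push({ field: "age", expected: "finite number", received: kindOf(age) });
  return { ok: false, errors };
}
```

Each failure names the field, the expectation, and what was received, which is the kind of message the table contrasts with a bare "invalid input".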

package/dist/agents/skills/gemini-cli/harness-database/SKILL.md CHANGED

@@ -286,21 +286,11 @@ These apply to ALL skills. If you catch yourself doing any of these, STOP.
 ## Rationalizations to Reject
-### Universal
-These reasoning patterns sound plausible but lead to bad outcomes. Reject them.
-- **"It's probably fine"** — "Probably" is not evidence. Verify before asserting.
-- **"This is best practice"** — Best practice in what context? Cite the source and
-  confirm it applies to this codebase.
-- **"We can fix it later"** — If it is worth flagging, it is worth documenting now
-  with a concrete follow-up plan.
-### Domain-Specific
-- **"The table is small, we don't need an index"** — Tables grow. Plan for the steady state, not the current row count.
-- **"The ORM handles this for us"** — ORMs generate SQL that may not match your performance expectations. Review the generated queries for correctness and efficiency.
-- **"We can always add a migration later"** — Schema changes in production have operational cost. Design the schema thoughtfully now rather than migrating repeatedly.
+| Rationalization                              | Reality                                                                                                                          |
+| -------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
+| "The table is small, we don't need an index" | Tables grow. Plan for the steady state, not the current row count.                                                               |
+| "The ORM handles this for us"                | ORMs generate SQL that may not match your performance expectations. Review the generated queries for correctness and efficiency. |
+| "We can always add a migration later"        | Schema changes in production have operational cost. Design the schema thoughtfully now rather than migrating repeatedly.         |
 ## Escalation

package/dist/agents/skills/gemini-cli/harness-debugging/SKILL.md CHANGED

@@ -298,6 +298,15 @@ Update the session status to `resolved`.
 - Debug session file is complete with investigation log, hypotheses, and resolution
 - Learnings were captured for future reference
+## Rationalizations to Reject
+| Rationalization                                                                   | Reality                                                                                                                                     |
+| --------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
+| "I have a strong hunch about what is wrong, so I will jump straight to fixing it" | Phase 1 INVESTIGATE must be completed before ANY fix code is written. You are guessing, not debugging.                                      |
+| "I changed two things and the bug is gone, so the fix must be correct"            | One variable at a time is a gate. Changing multiple things simultaneously means you do not know which change fixed it.                      |
+| "This is my third attempt but I feel close, so one more try before escalating"    | After 3 failed fix attempts, the gate requires you to question the architecture. The problem is likely not where you think it is.           |
+| "A try-catch that swallows the error prevents the crash, so the bug is fixed"     | Symptom suppression is explicitly listed as a bad fix. Wrapping the failure in a try-catch addresses what the bug did, not why it happened. |
 ## Examples
 ### Example: API Endpoint Returns 500 Instead of 400

package/dist/agents/skills/gemini-cli/harness-dependency-health/SKILL.md CHANGED

@@ -145,6 +145,16 @@ For each problem found, generate a specific, actionable recommendation:
 - Report follows the structured output format
 - All findings are backed by graph query evidence (with graph) or systematic static analysis (without graph)
+## Rationalizations to Reject
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+| Rationalization                                                                                                  | Why It Is Wrong                                                                                                                               |
+| ---------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
+| "There are a few orphan files but they are probably test fixtures or configs, so I will skip investigating them" | Orphan detection explicitly excludes entry points. Files with zero inbound imports that are not entry points must be investigated.            |
+| "The cycle is between two closely related files, so it is not really a problem"                                  | Cycles create fragile coupling where any change in the cycle affects all members. Even "related" files should not have circular dependencies. |
+| "The health score is a B, which is good enough -- no need to act on the recommendations"                         | A hub with 14 importers is a single point of failure. "Good enough" scores mask specific structural risks that compound over time.            |
 ## Examples
 ### Example: Weekly Health Check on Monorepo

package/dist/agents/skills/gemini-cli/harness-deployment/SKILL.md CHANGED

@@ -283,21 +283,11 @@ These apply to ALL skills. If you catch yourself doing any of these, STOP.
 ## Rationalizations to Reject
-### Universal
-These reasoning patterns sound plausible but lead to bad outcomes. Reject them.
-- **"It's probably fine"** — "Probably" is not evidence. Verify before asserting.
-- **"This is best practice"** — Best practice in what context? Cite the source and
-  confirm it applies to this codebase.
-- **"We can fix it later"** — If it is worth flagging, it is worth documenting now
-  with a concrete follow-up plan.
-### Domain-Specific
-- **"It's just a config change, not a code change"** — Config changes cause outages at the same rate as code changes. Deploy them with the same rigor and rollback strategy.
-- **"We tested this in staging"** — Staging is not production. Traffic patterns, data volume, and edge cases differ. Staging success does not guarantee production safety.
-- **"Downtime will be brief"** — Brief is not zero. Quantify the expected impact and communicate it to stakeholders before deploying.
+| Rationalization                                | Reality                                                                                                                                |
+| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| "It's just a config change, not a code change" | Config changes cause outages at the same rate as code changes. Deploy them with the same rigor and rollback strategy.                  |
+| "We tested this in staging"                    | Staging is not production. Traffic patterns, data volume, and edge cases differ. Staging success does not guarantee production safety. |
+| "Downtime will be brief"                       | Brief is not zero. Quantify the expected impact and communicate it to stakeholders before deploying.                                   |
 ## Escalation

package/dist/agents/skills/gemini-cli/harness-design/SKILL.md CHANGED

@@ -246,6 +246,16 @@ DESIGN-003 [info] Three font weights in one component
   Fix:        Consolidate font-weight values to 400 (body) and 600 (heading) only
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                             | Reality                                                                                                                                                                                                                            |
+| --------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The tokens are already defined, so the aesthetic intent is obvious — I can infer it and skip Phase 1."                     | Tokens define values, not intent. The style, tone, and differentiator exist in the designer's head, not in a color ramp. DESIGN.md cannot be generated without explicit confirmation.                                              |
+| "There are only 3 violations and they're minor — I'll skip recording them in the graph to save time."                       | Unrecorded violations are invisible to every downstream skill. harness-impact-analysis and harness-accessibility rely on `VIOLATES_DESIGN` edges existing. Skip graph writes and the enforcement record is permanently incomplete. |
+| "The strictness level isn't set in config, so I'll just use strict to be safe."                                             | Defaulting to strict without reading config imposes blocking CI failures the team never agreed to. Always read `design.strictness` and default to `standard` when absent — not to the most aggressive level.                       |
+| "This anti-pattern is declared, but there are 40+ instances — it would take forever to report them all, so I'll summarize." | The REVIEW phase must report every finding with file path, line number, and severity. Summarizing hides the scope from the team and makes automated tooling miss violations.                                                       |
+| "DESIGN.md already exists from a previous run, so I can skip Phase 2 and go straight to REVIEW."                            | An existing DESIGN.md may be outdated or missing sections. The DIRECTION phase must verify all required sections are present and current before the REVIEW phase can rely on them.                                                 |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.

package/dist/agents/skills/gemini-cli/harness-design-mobile/SKILL.md CHANGED

@@ -317,6 +317,16 @@ struct WorkoutRow: View {
 }
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                                                 | Reality                                                                                                                                                                                                                                                                 |
+| --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The touch target is 40pt on iOS — that's close to 44pt and the designer approved the comp, so I'll leave it."                                                  | The 44pt iOS minimum and 48dp Android minimum are non-negotiable gates, not guidelines. Touch target violations are `error` severity regardless of strictness level. Designer comp approval does not override platform accessibility requirements.                      |
+| "This is a cross-platform React Native component, so I only need to read the generic token mapping — platform-specific rules for iOS and Android are optional." | React Native components require both `ios.yaml` and `android.yaml` rules. Platform-specific rules govern safe areas, elevation, navigation patterns, and touch targets that differ between platforms. Missing either set produces non-compliant native behavior.        |
+| "The component uses a hardcoded shadow for iOS — `shadowColor`, `shadowOffset`, etc. Those aren't design tokens, they're platform APIs."                        | Shadow colors must still reference token values. `shadowColor: tokens.color.neutral[900]` is the correct form. Hardcoded shadow values like `#000` or `rgba(0,0,0,0.2)` are token binding violations the VERIFY phase will flag.                                        |
+| "There's no `design-system/DESIGN.md` yet, but I know the aesthetic intent from our planning discussion — I'll proceed with tokens only."                       | Proceeding without `DESIGN.md` means anti-pattern enforcement is disabled for the entire VERIFY phase. The anti-pattern check is what catches design intent violations beyond token correctness. Warn the user and recommend running harness-design first.              |
+| "The scaffold plan is straightforward — a simple card component. I'll skip presenting it to the user and just generate."                                        | The scaffold plan confirmation is when the user can catch incorrect platform assumptions (wrong StyleSheet structure, wrong platform APIs) before any code is written. Mobile components are harder to refactor than web components due to platform-specific branching. |
 ## Gates
 - **No component generation without reading tokens from harness-design-system.** The SCAFFOLD phase requires `design-system/tokens.json`. Do not generate components with hardcoded values as a fallback.
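
The touch target rule in the table above (44pt on iOS, 48dp on Android, always `error` severity) reduces to a simple comparison. The sketch below is one possible shape for that check; `checkTouchTarget` and its types are hypothetical and not taken from the skill.

```typescript
type Platform = "ios" | "android";
type TouchTargetFinding = { severity: "error"; message: string };

// Minimums as stated in the skill: 44pt on iOS, 48dp on Android.
const MIN_TOUCH_TARGET: Record<Platform, number> = { ios: 44, android: 48 };

function checkTouchTarget(
  platform: Platform,
  width: number,
  height: number
): TouchTargetFinding | null {
  const min = MIN_TOUCH_TARGET[platform];
  // Both dimensions must meet the minimum; there is no tolerance band.
  if (width >= min && height >= min) return null;
  const unit = platform === "ios" ? "pt" : "dp";
  return {
    severity: "error",
    message: `${width}x${height}${unit} is below the ${min}${unit} minimum on ${platform}`,
  };
}
```

The 40pt target from the first rationalization produces a finding here, and because severity is fixed at `error` the result does not depend on any strictness setting.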

package/dist/agents/skills/gemini-cli/harness-design-system/SKILL.md CHANGED

@@ -263,6 +263,16 @@ Spacing:         PASS (monotonically increasing, no gaps)
 Harness validate: PASS
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                            | Reality                                                                                                                                                                                                                                                 |
+| -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The project already has a tailwind.config with colors — I can derive the token set from that and skip the DEFINE phase."  | Existing Tailwind config represents design debt, not design intent. The DEFINE phase exists to make deliberate choices about palette and typography. Deriving tokens from scattered config perpetuates the inconsistency the skill is meant to resolve. |
+| "One of the contrast pairs is 4.3:1 — close enough to 4.5:1 to pass. I'll mark it as passing."                             | 4.3:1 fails WCAG AA for normal text. There is no "close enough." Flag the failure and ask the user to choose an alternative. Silently accepting sub-threshold contrast is a compliance defect.                                                          |
+| "The user confirmed the palette in our conversation, so I can skip the formal confirmation gate and generate immediately." | The confirmation gate exists as a structural checkpoint, not a courtesy. Generate only after presenting the full palette + typography + spacing summary and receiving explicit approval. Conversation context can drift.                                |
+| "There are no existing design files, so I can skip the DISCOVER phase and go straight to defining."                        | The DISCOVER phase also detects the CSS framework and existing color/font usage. Skipping it means the generated tokens may not map to the actual CSS strategy and the design debt assessment is lost.                                                  |
+| "Fonts without fallback stacks are probably fine — modern browsers handle missing fonts gracefully."                       | A missing fallback stack is a token validation failure regardless of browser behavior. Every `fontFamily` token must include at least one generic fallback. This is a VALIDATE phase gate, not a style preference.                                      |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.
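
The 4.3:1 versus 4.5:1 rationalization above rests on the WCAG contrast formula, which compares the relative luminance of two colors. The sketch below implements that formula for six-digit hex colors; the function names are hypothetical, and shorthand hex or alpha channels are not handled.

```typescript
// Converts an 8-bit sRGB channel to its linear value, per the WCAG 2.x definition.
function channelToLinear(channel8Bit: number): number {
  const c = channel8Bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex: string): number {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA for normal text requires at least 4.5:1; the value is compared unrounded.
function passesAANormalText(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

Comparing the unrounded ratio keeps a near miss such as 4.48:1 on the failing side rather than letting rounding promote it to a pass.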

package/dist/agents/skills/gemini-cli/harness-design-web/SKILL.md CHANGED

@@ -341,6 +341,16 @@ defineProps<Props>();
 </style>
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                                | Reality                                                                                                                                                                                                                                                |
+| ---------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| "The tokens file doesn't exist yet, but I know the brand colors — I'll hardcode them as a placeholder and note they should be replaced later." | Hardcoded values in generated output are the exact problem this skill exists to prevent. There is no placeholder exception. If `design-system/tokens.json` does not exist, instruct the user to run harness-design-system first and stop.              |
+| "The framework is obviously React — everything in this project is React. I don't need to run detection."                                       | Detection also identifies the CSS strategy (Tailwind vs CSS Modules vs CSS-in-JS), which determines how tokens map to code. Skipping detection produces components that may reference non-existent Tailwind classes or wrong theme paths.              |
+| "The user hasn't confirmed the scaffold plan, but the component structure is straightforward — I'll just generate it."                         | The scaffold plan confirmation is a gate. The user must see which tokens will be consumed and what the component structure will be before code is written. Generating first and explaining later inverts the review opportunity.                       |
+| "This component only uses one hardcoded hex value for a shadow — that's not really a design value, so I'll leave it."                          | Every color, font, and spacing value must reference a token. Shadows use color tokens. "Not really a design value" is not a category the verification phase recognizes. The VERIFY phase will flag it; the IMPLEMENT phase should not introduce it.    |
+| "The `@design-token` annotations are just comments — skipping them on a few components won't affect anything."                                 | These annotations are how `harness scan` creates `USES_TOKEN` edges in the knowledge graph. Missing annotations mean harness-impact-analysis cannot trace token changes to affected components. They are structural metadata, not decorative comments. |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.
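
The "one hardcoded hex value" row above suggests a mechanical check. The sketch below is an assumption about how such a check could look, not the VERIFY phase itself; `findHardcodedHexColors` is a hypothetical name, and the regex is deliberately simple (it would also match an ID selector such as `#add`).

```typescript
// Hypothetical helper: returns every hex color literal found in generated source text.
function findHardcodedHexColors(source: string): string[] {
  // Matches #rgb, #rgba, #rrggbb, and #rrggbbaa, longest forms first.
  const hexColor = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;
  return source.match(hexColor) || [];
}
```

A shadow declared as `0 1px 2px #00000033` would be reported, while `var(--color-text)` would not.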

package/dist/agents/skills/gemini-cli/harness-diagnostics/SKILL.md CHANGED

@@ -232,6 +232,15 @@ This log accumulates over time and helps improve future classifications.
 - **Flaky test not isolated in 60 minutes:** The non-determinism source may be outside the codebase (infrastructure, external service). Escalate with your findings.
 - **Security vulnerability with large blast radius:** If the minimal fix requires changing more than 3 files, reclassify as Design and escalate.
+## Rationalizations to Reject
+| Rationalization                                                                           | Why It Is Wrong                                                                                                                         |
+| ----------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
+| "I can see the error is a type issue -- let me just fix it without formal classification" | The gate says classification must be explicit and written down before any fix attempt. Implicit classification skips the evidence step. |
+| "This looks like a Design issue, but I can probably fix it locally with a small change"   | Design category MUST escalate. Local fixes for architectural problems create more architectural problems.                               |
+| "I do not need to run tests before fixing -- I know what the baseline is"                 | Deterministic checks before AND after is a gate. Without a recorded baseline, you cannot prove the fix helped.                          |
+| "My first fix did not work, but I will try a different approach within the same category" | Reclassify, do not force. If the resolution strategy is not working, the classification is probably wrong.                              |
 ## Examples
 ### Example 1: Type Error in API Handler

package/dist/agents/skills/gemini-cli/harness-docs-pipeline/SKILL.md CHANGED

@@ -379,6 +379,17 @@ while iteration < maxIterations:
 - PASS/WARN/FAIL report includes per-category breakdown and specific remaining findings
 - Drift fixes in FIX phase are excluded from AUDIT findings (no double-counting)
+## Rationalizations to Reject
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+| Rationalization                                                                                               | Why It Is Wrong                                                                                                                                   |
+| ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The drift finding is marked unsafe but the fix is obvious, so I will apply it silently"                      | Never apply a fix classified as unsafe without explicit user approval. The Iron Law: safe fixes are silent, unsafe fixes surface.                 |
+| "The convergence loop reduced findings from 8 to 6, but the remaining ones are hard -- I will keep iterating" | If a convergence iteration does not reduce the finding count, stop immediately. Continuing without progress wastes iterations.                    |
+| "I can write the drift detection logic directly instead of delegating to detect-doc-drift"                    | The pipeline delegates, never reimplements. Each sub-skill retains full standalone functionality.                                                 |
+| "The graph is not available so the pipeline results will be unreliable"                                       | The entire pipeline runs without a graph using static analysis fallbacks. Reduced accuracy is noted in the report, not used as an excuse to skip. |
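
The stop condition in the convergence row above ("if a convergence iteration does not reduce the finding count, stop immediately") can be sketched as a loop guard. The names `converge`, `Iteration`, and `ConvergenceResult` are hypothetical, and the sketch assumes each iteration simply reports how many findings remain.

```typescript
// Hypothetical shape: one iteration applies fixes and returns the number of findings left.
type Iteration = (round: number) => number;

interface ConvergenceResult {
  rounds: number;
  remaining: number;
  stoppedEarly: boolean;
}

function converge(initialFindings: number, maxIterations: number, runIteration: Iteration): ConvergenceResult {
  let remaining = initialFindings;
  let rounds = 0;
  while (rounds < maxIterations && remaining > 0) {
    const next = runIteration(rounds);
    rounds++;
    if (next >= remaining) {
      // No progress this round: stop instead of spending more iterations.
      return { rounds: rounds, remaining: next, stoppedEarly: true };
    }
    remaining = next;
  }
  return { rounds: rounds, remaining: remaining, stoppedEarly: false };
}
```

With finding counts of 8, then 6, then 6 again, the loop ends after the second round and reports the six remaining findings rather than iterating further.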
 ## Examples
 ### Example: Full pipeline run with fixes

package/dist/agents/skills/gemini-cli/harness-dx/SKILL.md CHANGED

@@ -261,6 +261,16 @@ Phase 4: VALIDATE
   DX Scorecard: B -> A (projected after applying changes)
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                                        | Reality                                                                                                                                                                                                                                                                    |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The README has an installation section but it only covers npm — yarn and pnpm users can figure it out. I'll mark installation as complete."           | Installation instructions must cover all package managers the project supports. If `yarn.lock` or `pnpm-lock.yaml` exists alongside `package-lock.json`, all three installers must be documented. Partial coverage is scored as partial, not complete.                     |
+| "This code example in the README uses the old `sdk.connect()` API — but it still parses syntactically, so it passes the syntax check."                 | Stale API references are broken examples regardless of syntax validity. A syntactically valid example that calls a renamed or removed function fails the freshness check and must be flagged as broken in the scorecard.                                                   |
+| "The API function's behavior is complex, but I can infer what it does from the name `parseAndValidate` — I'll write the docstring stub based on that." | Documentation must be derived from actual source code: type signatures, test files, and existing docs. Inferring behavior from function names produces fabricated documentation. Flag functions that cannot be documented from source as requiring developer-written docs. |
+| "The getting-started guide already exists in the wiki — it's not in the repo, but I'll mark the quickstart as present."                                | Documentation must be locatable from the repository root. A wiki link from the README satisfies the API reference link criterion only if the link is explicit. A guide that requires knowing where the wiki is does not meet the discoverability requirement.              |
+| "There are 18 undocumented exports — I'll generate all 18 JSDoc stubs and commit them without showing the user first."                                 | Scaffolded documentation must be presented for review before being written. Generated stubs may contain inaccurate parameter descriptions or wrong return type assumptions. Use `emit_interaction` to present scaffolded content and wait for approval.                    |
 ## Gates
 - **No scaffolding without human confirmation.** Generated documentation is always presented as a draft for review. Do not commit generated files automatically. Use `emit_interaction` to present scaffolded content and wait for approval.
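
The installation-coverage row above ties required README content to the lockfiles present in the repository. A rough sketch of that mapping follows; `undocumentedManagers` is a hypothetical helper, and the README patterns it searches for are illustrative assumptions rather than the skill's real scoring rules.

```typescript
// Lockfile -> package manager. The README patterns are illustrative assumptions.
const MANAGERS = [
  { lockfile: "package-lock.json", name: "npm", pattern: /\bnpm (?:install|i)\b/ },
  { lockfile: "yarn.lock", name: "yarn", pattern: /\byarn (?:add|install)\b/ },
  { lockfile: "pnpm-lock.yaml", name: "pnpm", pattern: /\bpnpm (?:add|install)\b/ },
];

// Returns the managers whose lockfile exists but whose install command is absent from the README.
function undocumentedManagers(repoFiles: string[], readme: string): string[] {
  const missing: string[] = [];
  for (const manager of MANAGERS) {
    if (repoFiles.indexOf(manager.lockfile) === -1) {
      continue; // this manager is not in use, so no documentation is required
    }
    if (!manager.pattern.test(readme)) {
      missing.push(manager.name);
    }
  }
  return missing;
}
```

The word boundary in each pattern matters: without it, `pnpm install` would also satisfy the `npm install` check and hide a gap.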

package/dist/agents/skills/gemini-cli/harness-e2e/SKILL.md CHANGED

@@ -230,6 +230,15 @@ describe('Checkout flow', () => {
 });
 ```
+## Rationalizations to Reject
+| Rationalization                                                                      | Why It Is Wrong                                                                                                                                                    |
+| ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| "Using CSS class selectors is faster than adding data-testid attributes"             | No CSS class selectors in page objects. .btn-primary breaks when the design system updates class names. Use data-testid, ARIA roles, and accessible labels.        |
+| "Adding a short waitForTimeout is easier than figuring out the right wait condition" | No arbitrary waits is a hard gate. waitForTimeout is a flakiness timebomb. Wait for specific conditions: network responses, DOM mutations, or URL changes.         |
+| "This test creates data through the UI because the API setup is complex"             | Test data must be created via API or fixtures, not through UI interactions. UI-based setup is slow, brittle, and conflates setup failures with assertion failures. |
+| "The test only fails sometimes in CI -- adding a retry will fix it"                  | Flaky tests block merge. Diagnose the root cause. Retries mask problems. After remediation, rerun 5 times to confirm stability.                                    |
 ## Gates
 - **No CSS class selectors in page objects.** If a locator uses `.btn-primary` or `[class*="header"]`, the test is brittle. Use `data-testid`, ARIA roles, or accessible labels. Rewrite before merging.
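
The selector gate above could be enforced with a small lint over locator strings. This is a sketch under assumptions: `usesCssClassSelector` is a hypothetical helper, and it only recognizes the two forms the gate names (`.btn-primary` style class tokens and `[class*="header"]` style attribute matches).

```typescript
// Hypothetical helper: true when a locator string relies on CSS classes.
function usesCssClassSelector(selector: string): boolean {
  // Blank out quoted attribute values so dots inside them are not mistaken for class tokens.
  const unquoted = selector.replace(/"[^"]*"|'[^']*'/g, '""');
  const classToken = /\.[A-Za-z_-][\w-]*/;
  const classAttribute = /\[\s*class\s*[~|^$*]?=/;
  return classToken.test(unquoted) || classAttribute.test(unquoted);
}
```

A locator such as `[data-testid="checkout.submit"]` passes because the dot sits inside a quoted value, while `button.btn-primary` is flagged.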

package/dist/agents/skills/gemini-cli/harness-event-driven/SKILL.md CHANGED

@@ -265,6 +265,16 @@ PASS: Idempotency via sequence numbers on event store
 PASS: Read model rebuild procedure documented in ops runbook
 ```
+## Rationalizations to Reject
+| Rationalization                                                                               | Reality                                                                                                                                                                                                                                                                                                                                                    |
+| --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "Our handlers are idempotent enough — we don't need a deduplication table"                    | "Idempotent enough" is not a guarantee. At-least-once delivery means the same message can arrive seconds, minutes, or hours apart. A handler that relies on approximate idempotency (e.g., checking a cache) will produce duplicate side effects when the deduplication window expires or the cache is flushed.                                            |
+| "We publish the event right after the database write — it's essentially the same transaction" | Two separate operations are not a transaction regardless of how close together they are. If the process crashes between the database write and the event publish, the write is committed but the event is never sent. Consumers will never see the state change. This is the dual-write problem and it requires the transactional outbox pattern to solve. |
+| "The dead-letter queue is configured but nobody monitors it"                                  | An unmonitored DLQ is a silent data loss queue. Failed messages accumulate with no alerting, no replay procedure, and no investigation. A DLQ without monitoring and a replay runbook is a place where business events go to die.                                                                                                                          |
+| "Saga compensation is complex — we'll handle failures with manual intervention"               | Manual intervention does not scale and is not available at 3am. A saga that partially completes without compensation leaves the system in a state that requires a human to reconstruct — which means it will not be reconstructed reliably. Every saga step that can fail must have a defined compensating action.                                         |
+| "We'll add event versioning when we need to change the schema"                                | Adding versioning to an event schema after consumers are deployed is a breaking change. Consumers expecting version 1 receive an unversioned event and have no way to detect that it is incompatible. Versioning must be in the envelope from the first event in production.                                                                               |
 ## Gates
 - **Every consumer must have a dead-letter queue.** No consumer may silently drop failed messages. WHERE a consumer is configured without a DLQ, THEN the skill must halt and require DLQ configuration before proceeding. Lost messages in production are unrecoverable.
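
The dual-write row above names the transactional outbox pattern as the fix. The sketch below shows the idea with an in-memory stand-in for a database; `OutboxStore`, `OutboxEvent`, and `relayOutbox` are hypothetical names, and a real implementation would write the state change and the outbox row inside one database transaction.

```typescript
interface OutboxEvent {
  id: number;
  type: string;
  payload: string;
  sent: boolean;
}

// In-memory stand-in for a database. Both writes in placeOrder represent ONE transaction.
class OutboxStore {
  orders: { [orderId: string]: string } = {};
  outbox: OutboxEvent[] = [];
  private nextId = 1;

  placeOrder(orderId: string): void {
    this.orders[orderId] = "placed";
    this.outbox.push({ id: this.nextId++, type: "OrderPlaced", payload: orderId, sent: false });
  }
}

// Relay: publish pending rows in order, marking each one sent only after publish returns.
function relayOutbox(store: OutboxStore, publish: (event: OutboxEvent) => void): number {
  let published = 0;
  for (const event of store.outbox) {
    if (event.sent) {
      continue;
    }
    try {
      publish(event);
    } catch {
      break; // leave this row and later rows pending; the next run retries them
    }
    event.sent = true;
    published++;
  }
  return published;
}
```

Because a row is marked sent only after the publish call returns, a crash between the two steps leads to a repeat publish on the next run. That is at-least-once delivery, which is why the idempotency row in the same table still applies to consumers.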

package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md CHANGED

@@ -411,6 +411,15 @@ When this skill makes claims about task completion, test results, or code behavi
 - No improvisation: tasks were executed as written, or execution was stopped and the blocker was reported
 - All stopping conditions were respected (no guessing past blockers, no blind retries)
+## Rationalizations to Reject
+| Rationalization                                                                                                | Reality                                                                                                                                                   |
+| -------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The plan says to do X, but doing Y would be cleaner -- I will improvise"                                      | The Iron Law states: execute the plan as written. If the plan is wrong, stop and fix the plan. Improvising mid-execution introduces untested assumptions. |
+| "This task depends on Task 3 which I know is done, so I can skip verifying prerequisites"                      | Prerequisites must be verified mechanically, not from memory. Check that dependency tasks are marked complete in state and that referenced files exist.   |
+| "The checkpoint is just a confirmation step and the output looks correct, so I will auto-continue"             | Checkpoints are non-negotiable pause points. If a task has a checkpoint marker, execution must pause.                                                     |
+| "Harness validate passed on the previous task and nothing changed structurally, so I can skip it for this one" | Validation runs after every task with no exceptions. Each task may introduce subtle architectural drift that only harness validate catches.               |
 ## Examples
 ### Example: Executing a 5-Task Notification Plan

package/dist/agents/skills/gemini-cli/harness-feature-flags/SKILL.md CHANGED

@@ -206,6 +206,16 @@
 - Rollout configuration is validated for active flags
 - Lifecycle policies are recommended with enforcement mechanisms
+## Rationalizations to Reject
+| Rationalization                                                            | Why It Is Wrong                                                                                                                        |
+| -------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| "This release flag has been at 100% for a while, but removing it is risky" | Release flags at 100% for more than 30 days are stale candidates. Every stale flag adds dead code branches and test matrix complexity. |
+| "We only need to test the flag-on path since that is the path we ship"     | No flags without test coverage for both paths. The flag-off path IS the fallback when the flag provider is unreachable.                |
+| "These two flags depend on each other, but they work fine together"        | No coupled flag dependencies is a blocking finding. Flags that require other flags create combinatorial complexity.                    |
+| "Setting the flag default to true makes the rollout easier"                | Every flag must default to safe (feature disabled). A default of true means a provider outage enables the feature for everyone.        |
+| "We do not need a naming convention -- our flag count is small"            | Inconsistent naming becomes unmanageable as flag count grows. The skill flags inconsistency as a warning even at small scale.          |
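
The default-to-safe row above can be shown in a few lines. This is a sketch with a hypothetical `FlagProvider` interface; real flag SDKs expose different APIs, so treat the shape as an assumption.

```typescript
// Hypothetical provider interface; real SDKs differ.
interface FlagProvider {
  isEnabled(key: string): boolean;
}

// Evaluates a flag and falls back to "feature disabled" when the provider is unreachable.
function evaluateFlag(provider: FlagProvider, key: string): boolean {
  const SAFE_DEFAULT = false;
  try {
    return provider.isEnabled(key);
  } catch {
    return SAFE_DEFAULT;
  }
}
```

With `SAFE_DEFAULT` set to `true` instead, the same outage would switch the feature on for every user, which is the failure mode the table describes.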
 ## Examples
 ### Example: React SPA with LaunchDarkly

package/dist/agents/skills/gemini-cli/harness-git-workflow/SKILL.md CHANGED

@@ -190,6 +190,16 @@ git branch -D <branch-name>
 - Worktree was cleaned up after finishing (unless keeping for continued work)
 - No stale worktree references remain after cleanup
+## Rationalizations to Reject
+| Rationalization                                                                                                                                       | Reality                                                                                                                                                                                                                                                        |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The tests are probably fine on the fresh branch — they were passing on main when I last checked. I'll skip baseline verification and start working." | Baseline verification is the condition that makes branch work trustworthy. A test failure discovered at finish time is ambiguous — it could be pre-existing or introduced by the work. Skipping baseline removes the only clean comparison point.              |
+| "The user said 'just merge it' — I'll merge without checking if the base branch has advanced since the worktree was created."                         | The pre-finish check for base branch divergence is mandatory before any finishing strategy. Merging without rebasing first can produce a merge that silently breaks tests that were passing on the branch but conflict with new commits on main.               |
+| "The worktree directory isn't gitignored, but it's inside a nested folder that's unlikely to be committed accidentally."                              | The `.gitignore` check is not about likelihood — it is about preventing accidental commits of worktree state that would corrupt the repository. If the worktree directory is not gitignored, add it before creating the worktree. No exceptions.               |
+| "The user chose to discard — I'll delete the branch and worktree immediately without showing the commits that will be lost."                          | The discard path requires showing the commit list from `git log main..HEAD --oneline` and receiving explicit confirmation before running `git worktree remove` and `git branch -D`. Work is being permanently deleted; the user must see what they are losing. |
+| "There's already a worktree for this branch at a different path — I'll create a second one since the user asked for a fresh setup."                   | Git does not allow two worktrees checked out to the same branch. Attempting to create a duplicate will fail. Instead, ask the user whether to use the existing worktree or create a new branch. Never assume a second worktree is the right answer.            |
 ## Examples
 ### Example: Setting Up a Worktree for a New Feature

package/dist/agents/skills/gemini-cli/harness-hotspot-detector/SKILL.md CHANGED

@@ -128,6 +128,16 @@ Use `get_relationships` to check structural edges between co-change pairs.
 - Report follows the structured output format
 - All findings are backed by graph query evidence (with graph) or git log analysis (without graph)
+## Rationalizations to Reject
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+| Rationalization                                                                                                          | Why It Is Wrong                                                                                                                                         |
+| ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "High churn just means the file is actively developed, not that it is risky"                                             | High churn in shared utilities specifically equals high risk. A file with 45 commits that co-changes with 12 different files indicates hidden coupling. |
+| "The co-change pair is between two files in different modules, but they probably just happen to change at the same time" | Distant co-change pairs are flagged as suspicious precisely because they indicate hidden coupling.                                                      |
+| "No graph exists so the analysis will be too incomplete to be useful"                                                    | Git log provides ~90% of the data needed for hotspot detection. The fallback is the highest-completeness fallback across all graph-enhanced skills.     |
 ## Examples
 ### Example: Detecting Hotspots in a Growing Codebase

package/dist/agents/skills/gemini-cli/harness-i18n/SKILL.md CHANGED

@@ -465,6 +465,16 @@ Remaining violations (require human judgment): 5
 - I18N-401: Missing key in es -- requires Spanish translation
 - I18N-402: Untranslated value in fr -- requires French translation
+## Rationalizations to Reject
+| Rationalization                                                                                                                           | Reality                                                                                                                                                                                                                                               |
+| ----------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "This string is the app's brand name — it's technically hardcoded but obviously shouldn't be translated. I'll skip flagging it."          | Brand names require explicit suppression via `// i18n-ignore` comment, not silent omission from the scan. Skipping without suppression means future scans have inconsistent results and the team has no record of the deliberate decision.            |
+| "The framework isn't in the knowledge base, but I can tell from context it's using i18next patterns — I'll apply i18next rules directly." | Unrecognized frameworks must fall back to generic detection rules, not assumed framework rules. Applying i18next-specific fix patterns to an unknown framework produces incorrect wrapping that breaks at runtime. Log the gap and use generic rules. |
+| "The project has `i18n.enabled: false` — I'll still flag errors for hardcoded strings since the team should know about them."             | Respecting `i18n.enabled: false` is a gate. The team made a configuration decision. In that state, run in discovery mode (info severity only). Escalating to errors overrides the team's explicit choice.                                             |
+| "I18N-402 untranslated values are just warnings — I'll skip reporting them to keep the report shorter."                                   | Untranslated values (target identical to source) are a distinct violation category with their own code. They indicate copy-paste during file creation without actual translation. Omitting them produces a misleadingly optimistic coverage report.   |
+| "The plural rules for this locale look complex — I'll just check for 'one' and 'other' forms like English and move on."                   | Plural rules are locale-specific and must be loaded from the locale profile. Arabic requires six categories; Polish requires four. Checking only English plural categories produces false-passing results for languages that require more forms.      |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.
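
The I18N-401 and I18N-402 codes mentioned in this hunk separate a missing key from an untranslated value (target identical to source). A minimal sketch of that distinction follows; the `Catalog` shape and the `compareCatalogs` helper are assumptions, not the skill's real scanner.

```typescript
type Catalog = { [key: string]: string };

interface Violation {
  code: string;
  key: string;
}

// Hypothetical helper: compares a target locale catalog against the source catalog.
function compareCatalogs(source: Catalog, target: Catalog): Violation[] {
  const violations: Violation[] = [];
  for (const key of Object.keys(source)) {
    if (!Object.prototype.hasOwnProperty.call(target, key)) {
      violations.push({ code: "I18N-401", key: key }); // key missing in target locale
    } else if (target[key] === source[key]) {
      violations.push({ code: "I18N-402", key: key }); // value copied from source, not translated
    }
  }
  return violations;
}
```

Dropping the second branch to shorten the report would make a copied English file look fully translated.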

package/dist/agents/skills/gemini-cli/harness-i18n-process/SKILL.md CHANGED

@@ -369,6 +369,16 @@ Result:         BLOCKED -- i18n review not conducted for user-facing PR
 Action:         Run harness-i18n scan on changed files, address findings, then re-review
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                                     | Reality                                                                                                                                                                                                                                                                   |
+| --------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "I can see hardcoded strings in the component the user is discussing — I'll flag them now as part of the process review."                           | This skill operates on artifacts (specs, plans, review context), never on source code. Scanning source files is harness-i18n's responsibility. Running Grep on component files from within this skill violates the skill boundary, regardless of how convenient it seems. |
+| "The feature clearly has no user-facing strings — it's a background job. I'll skip injection entirely without checking."                            | The skill must assess whether injection is applicable before skipping. Background jobs can produce user-facing output via notifications, emails, and error responses. A deliberate "not applicable" decision requires reading the feature description, not assuming.      |
+| "The team is in prompt mode and has dismissed the suggestion twice — I'll escalate to gate mode enforcement to make sure they take i18n seriously." | Escalating to gate mode is a configuration decision the team must make explicitly. Prompt mode is always dismissible. Unilaterally enforcing gate-mode behavior overrides a team's deliberate choice and violates the skill's core operating contract.                    |
+| "The plan has one task called 'polish and cleanup' — that probably includes i18n work. I'll mark the i18n check as passing."                        | In gate mode, i18n task presence must be verified by keyword match (i18n, translation, locale, localization, l10n), not inferred from vague task names. Ambiguous tasks must be flagged as missing, not assumed to cover the requirement.                                 |
+| "The spec mentions 'multi-language support' in passing — that counts as addressing i18n, so I won't require a dedicated section."                   | A passing mention is not an i18n section. The validation check requires the spec to identify which strings are user-facing, which locales are affected, and any formatting requirements. A vague reference satisfies none of these.                                       |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.
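
The keyword rule in the "polish and cleanup" row above lists the exact terms to match (i18n, translation, locale, localization, l10n). A sketch of that check follows; `planHasI18nTask` is a hypothetical name, and case-insensitive substring matching is an assumption about how the match is performed.

```typescript
// Keywords taken from the table row above.
const I18N_KEYWORDS = ["i18n", "translation", "locale", "localization", "l10n"];

// Hypothetical helper: true when at least one task name contains an i18n keyword.
function planHasI18nTask(taskNames: string[]): boolean {
  return taskNames.some((task) => {
    const lower = task.toLowerCase();
    return I18N_KEYWORDS.some((keyword) => lower.indexOf(keyword) !== -1);
  });
}
```

A plan whose only candidate task is "polish and cleanup" fails this check, so the task is reported as missing rather than assumed to cover i18n.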

package/dist/agents/skills/gemini-cli/harness-i18n-workflow/SKILL.md CHANGED

@@ -493,6 +493,16 @@ emails.welcome.greeting           -> "Hello {name}, welcome aboard!"
 Approve to continue scaffolding, or provide corrections.
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                                                                  | Reality                                                                                                                                                                                                                                                                |
+| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The user already told me they want Spanish and French — I can skip the configuration phase and go straight to scaffolding."                     | The configuration phase writes the `i18n` block to `harness.config.json`. Without it, subsequent runs of harness-i18n have no enabled flag, no strictness level, and no locale list to work against. Verbal confirmation does not substitute for written config.       |
+| "In retrofit mode, the key naming is straightforward — I'll apply the generated key catalog directly without showing it to the user for review." | The retrofit key catalog checkpoint is a hard gate. Key names become permanent identifiers that translation teams, TMS tools, and source code will reference for years. The user must review and approve them before any files are written.                            |
+| "The pseudo-locale transformation for this string with `{name}` is obvious — I'll just wrap the entire string including the placeholder."        | ICU MessageFormat placeholders must be preserved exactly. Transforming `{name}` to `{ñàmë}` breaks the interpolation at runtime. The pseudo-locale algorithm must detect and skip all placeholder syntax before applying accent and expansion transforms.              |
+| "These target locale files already exist from a previous run — I'll overwrite them with the new extraction output to keep things clean."         | Existing target locale translations must never be overwritten. A key with a translated (non-empty, non-source-identical) value in a target locale represents real translation work. Overwriting it destroys that work silently. Always preserve existing translations. |
+| "We found 120 strings in retrofit mode — I'll just run the full extraction without the audit phase since we clearly need everything extracted."  | The retrofit audit results are what tell the user how much effort the extraction requires and let them prioritize high-traffic flows. Skipping the audit and going straight to extraction removes the user's ability to scope the work before it happens.              |
 ## Gates
 These are hard stops. Violating any gate means the process has broken down.
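
The placeholder rule in the pseudo-locale row above can be illustrated with a minimal sketch. This is not the skill's actual algorithm: the accent map, the `PLACEHOLDER` regex, and the `pseudo_localize` function name are all hypothetical, and the regex only recognizes flat `{...}` placeholders, not nested ICU plural or select bodies.

```python
import re

# Hypothetical accent map for illustration; a real pseudo-locale covers more letters.
ACCENTS = str.maketrans({
    "a": "à", "e": "ë", "i": "ï", "o": "ö", "u": "ü", "n": "ñ",
    "A": "À", "E": "Ë", "I": "Ï", "O": "Ö", "U": "Ü",
})

# Assumption: placeholders are flat `{...}` spans with no nested braces.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_localize(message: str) -> str:
    """Accent the literal text of a message but copy placeholders through untouched."""
    # re.split with one capturing group returns literal text at even indexes
    # and the captured placeholders at odd indexes.
    parts = PLACEHOLDER.split(message)
    transformed = [
        part if index % 2 == 1 else part.translate(ACCENTS)
        for index, part in enumerate(parts)
    ]
    # Brackets mark the string boundaries; length expansion is left out of this sketch.
    return "[" + "".join(transformed) + "]"


if __name__ == "__main__":
    print(pseudo_localize("Hello {name}, welcome aboard!"))
```

Splitting first and transforming only the literal segments means the placeholder text never passes through the accent map, so `{name}` still matches the runtime argument.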

package/dist/agents/skills/gemini-cli/harness-impact-analysis/SKILL.md CHANGED

@@ -151,6 +151,16 @@ When no graph is available, use static analysis to approximate impact:
 - Report follows the structured output format
 - All findings are backed by graph query evidence (with graph) or systematic static analysis (without graph)
+## Rationalizations to Reject
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+| Rationalization                                                                                    | Why It Is Wrong                                                                                                                          |
+| -------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| "The change is small so the blast radius must be low -- I can skip the transitive dependent check" | Small changes to shared utilities can have outsized blast radius. A one-line change to auth.ts can affect 23 transitive dependents.      |
+| "The graph is a few commits behind but it is close enough for this analysis"                       | If the graph is more than 2 commits behind, the skill requires a refresh before proceeding. Recent commits may have added new consumers. |
+| "No graph exists so I cannot produce a useful impact analysis"                                     | The fallback strategy using import parsing and naming conventions achieves ~70% completeness. A missing graph is not a reason to stop.   |
 ## Examples
 ### Example: Analyzing a Change to auth.ts
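
The transitive dependent check named in the table above amounts to a reverse-dependency traversal. The sketch below assumes an import map has already been built, for instance by the static import parsing the fallback strategy describes. The `transitive_dependents` function and the file names in the usage example are hypothetical and do not come from the skill.

```python
from collections import deque


def transitive_dependents(imports: dict[str, set[str]], changed: str) -> set[str]:
    """Return every file that directly or indirectly imports `changed`.

    `imports` maps each file to the set of files it imports.
    """
    # Invert "file -> what it imports" into "file -> who imports it".
    dependents: dict[str, set[str]] = {}
    for importer, imported in imports.items():
        for target in imported:
            dependents.setdefault(target, set()).add(importer)

    # Breadth-first walk over the inverted edges; `seen` also guards against cycles.
    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent != changed and dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    example = {
        "session.ts": {"auth.ts"},
        "routes.ts": {"session.ts"},
        "utils.ts": set(),
    }
    print(sorted(transitive_dependents(example, "auth.ts")))
```

In the usage example, `routes.ts` never imports `auth.ts` directly, yet it is still in the result, which is why a direct-importer count alone understates the blast radius.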

package/dist/agents/skills/gemini-cli/harness-incident-response/SKILL.md CHANGED

@@ -208,6 +208,16 @@ Phase 4: IMPROVE
     4. [P2] Create secret rotation runbook for all services (owner: @sre)
 ```
+## Rationalizations to Reject
+| Rationalization                                                                         | Reality                                                                                                                                                                                                                                                                           |
+| --------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The root cause was human error — someone pushed a bad config"                          | Human error is a symptom, not a root cause. The root cause is the system that allowed a bad config to reach production undetected. A postmortem that stops at "human error" prevents no future incidents because it identifies no systemic fix.                                   |
+| "We know what happened — we don't need to write a full postmortem for a minor incident" | The decision about what is "minor" is made under the stress of recovery, not under calm analysis. Contributing factors and near-misses that look minor in the moment are frequently the root cause of the next major incident. Document while the context is fresh.               |
+| "The action items are in Slack — we don't need to track them formally"                  | Action items not tracked in a formal system with owners and due dates are not completed. Slack messages are buried within hours. The improvement phase of an incident exists only if its outputs are tracked to completion.                                                       |
+| "We don't have SLOs yet so we can't calculate error budget impact"                      | The absence of SLOs is itself a finding. Without SLOs, there is no objective basis for deciding whether reliability is acceptable. The incident is the forcing function to establish baseline SLOs. Document this gap as a P0 action item.                                        |
+| "The incident was caused by a third-party outage — nothing we could have done"          | Third-party outages expose missing circuit breakers, absent fallbacks, and insufficient multi-region routing. The postmortem should document why the third-party outage caused a customer-visible incident and what resilience improvements would have isolated the blast radius. |
 ## Gates
 - **No postmortem without a root cause statement.** A postmortem that says "cause unknown" is incomplete. If the root cause cannot be determined, the postmortem must document what was investigated, what was ruled out, and what additional data is needed. Do not close the investigation.

package/dist/agents/skills/gemini-cli/harness-infrastructure-as-code/SKILL.md CHANGED

@@ -264,6 +264,16 @@ Phase 4: VALIDATE
   Result: WARN -- 2 security improvements needed
 ```
+## Rationalizations to Reject
+| Rationalization                                                                                | Reality                                                                                                                                                                                                                                                                                                  |
+| ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We store state locally because it's just a dev environment"                                   | Local state is not shared between team members. Two developers running `terraform apply` against the same environment with diverged local state will produce conflicting resource definitions, duplicate resources, or state corruption that requires manual recovery.                                   |
+| "We haven't pinned the provider version because we want to automatically get security patches" | Unpinned providers can silently change resource behavior on `terraform init`. Even a `~> 5.0` constraint accepts any 5.x release, so a new minor version can arrive unreviewed. Pin the minor version and upgrade explicitly via reviewed PRs so changes are intentional.                                |
+| "That S3 bucket has public access because it hosts our static site"                            | Static site hosting does not require a public bucket ACL. CloudFront with an Origin Access Control (OAC) policy serves files from a private bucket. Public bucket ACLs are a common misconfiguration vector because they apply to all objects, including accidentally uploaded sensitive files.          |
+| "We'll tag resources properly before we go to production"                                      | Untagged resources accumulate. Cost allocation reports become impossible, security audits cannot identify owners, and decommissioning requires manual investigation of every resource. Tagging must be enforced at resource creation — retroactive tagging at scale is a weeks-long engineering project. |
+| "Manual changes are fine for urgent hotfixes — we'll import them to Terraform afterward"       | Manual changes without immediate import create drift that may be overwritten by the next `terraform apply`. The "import it later" step is almost never done. Every manual change that goes unimported erodes the reliability guarantee that IaC provides.                                                |
 ## Gates
 - **No local state for shared infrastructure.** Terraform configurations managing shared resources must use a remote backend with locking. Local state is blocking for any non-experimental configuration.
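
The provider-pinning row above refers to Terraform's `~>` (pessimistic) constraint operator, which lets only the rightmost version component increase. As a rough sketch of the range a given constraint admits, the hypothetical helper below computes the bounds. It is an illustration only, not part of Terraform or the skill, and it rejects single-component versions as out of scope.

```python
def pessimistic_range(version: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for `~> version`.

    Only the rightmost component may increase, so the upper bound is the
    next value of the second-to-last component.
    """
    parts = [int(piece) for piece in version.split(".")]
    if len(parts) < 2:
        raise ValueError("this sketch expects at least MAJOR.MINOR")

    # Drop the rightmost component, bump the new last one, and pad with a zero.
    upper = parts[:-1]
    upper[-1] += 1
    upper.append(0)
    return version, ".".join(str(piece) for piece in upper)


if __name__ == "__main__":
    for constraint in ("5.0", "5.31.0"):
        lower, upper = pessimistic_range(constraint)
        print(f"~> {constraint}  means  >= {lower}, < {upper}")
```

Under this reading, `~> 5.0` spans every 5.x release, while `~> 5.31.0` admits only 5.31.x patch releases, which is the narrower pin the row recommends.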