universal-dev-standards 5.11.0 → 5.13.2

Files changed (28)
  1. package/bundled/ai/standards/acceptance-criteria-traceability.ai.yaml +10 -4
  2. package/bundled/ai/standards/deployment-standards.ai.yaml +50 -2
  3. package/bundled/ai/standards/full-coverage-testing.ai.yaml +8 -1
  4. package/bundled/ai/standards/license-compliance.ai.yaml +379 -10
  5. package/bundled/ai/standards/logging.ai.yaml +40 -3
  6. package/bundled/ai/standards/packaging-standards.ai.yaml +25 -2
  7. package/bundled/ai/standards/self-review-protocol.ai.yaml +144 -0
  8. package/bundled/ai/standards/test-governance.ai.yaml +19 -0
  9. package/bundled/core/deployment-standards.md +100 -2
  10. package/bundled/core/license-compliance.md +118 -0
  11. package/bundled/core/logging-standards.md +122 -2
  12. package/bundled/core/packaging-standards.md +72 -2
  13. package/bundled/core/self-review-protocol.md +160 -0
  14. package/bundled/locales/zh-CN/CHANGELOG.md +68 -3
  15. package/bundled/locales/zh-CN/README.md +2 -2
  16. package/bundled/locales/zh-CN/SECURITY.md +1 -1
  17. package/bundled/locales/zh-TW/CHANGELOG.md +68 -3
  18. package/bundled/locales/zh-TW/README.md +2 -2
  19. package/bundled/locales/zh-TW/SECURITY.md +1 -1
  20. package/bundled/locales/zh-TW/core/self-review-protocol.md +158 -0
  21. package/bundled/skills/README.md +3 -0
  22. package/bundled/skills/contract-test-assistant/SKILL.md +7 -0
  23. package/bundled/skills/deploy-assistant/SKILL.md +2 -0
  24. package/bundled/skills/logging-guide/SKILL.md +25 -2
  25. package/bundled/skills/migration-assistant/SKILL.md +104 -0
  26. package/bundled/skills/runbook-assistant/SKILL.md +8 -0
  27. package/package.json +2 -2
  28. package/standards-registry.json +17 -4
@@ -0,0 +1,144 @@
1
+ # Self-Review Protocol - AI Optimized
2
+ # Source: core/self-review-protocol.md
3
+
4
+ id: self-review-protocol
5
+ meta:
6
+ version: "1.0.0"
7
+ updated: "2026-05-26"
8
+ source: core/self-review-protocol.md
9
+ description: Mandatory self-review pass on large markdown edits before commit; catches 6 categories of internal cross-reference inconsistencies that internal reasoning routinely misses
10
+
11
+ trigger_conditions:
12
+ mandatory:
13
+ condition: commit modifies > 50 lines of markdown
14
+ artefact_types:
15
+ - ADR (architecture decision records, DEC-NNN)
16
+ - XSPEC (cross-project specs)
17
+ - XSPEC SDD Deltas
18
+ - SKILL.md (Claude Code custom skills)
19
+ - ARCHITECTURE.md
20
+ - API.md
21
+ - DEPLOYMENT.md
22
+ - MIGRATION.md
23
+ - runbooks
24
+ - playbooks
25
+ - README.md (when modifying major sections)
26
+ optional:
27
+ condition: commit modifies <= 50 lines of markdown
28
+ rationale: small edits rarely have cross-reference risk
29
+ not_applicable:
30
+ condition: code or config only changes
31
+ rationale: covered by lint, test, and code review
32
+
33
+ inconsistency_categories:
34
+ - id: 1
35
+ name: diagram_step_mismatch
36
+ description: Diagram or flow chart out of sync with step list
37
+ example: workflow diagram has 7 boxes but document defines 8 steps
38
+ check: count diagram nodes vs `## Step N:` / `## N.` headers
39
+ - id: 2
40
+ name: changelog_reference_error
41
+ description: Changelog entry references wrong anchor
42
+ example: changelog says "Step 1 added X" but X is actually at Step 0
43
+ check: for each changelog line, grep the anchor it references
44
+ - id: 3
45
+ name: count_drift
46
+ description: Explicit number in text out of sync with actual count
47
+ example: "self-audit has 4 questions" but list has 7
48
+ check: grep for "N questions", "N rows", "N items" and verify
49
+ - id: 4
50
+ name: stale_template
51
+ description: Hardcoded model names, tool versions, dates not updated
52
+ example: commit template hardcodes Claude Sonnet 4.6 when model varies
53
+ check: find hardcoded specifics; replace with placeholders or update
54
+ - id: 5
55
+ name: wrong_tool_reference
56
+ description: Recommended CLI command does not do what described
57
+ example: recommends `claude --version` for model name (shows CLI version)
58
+ check: for each CLI command mentioned, check it mentally or verify it with `--help`
59
+ - id: 6
60
+ name: placeholder_rule_misalignment
61
+ description: Example contradicts current rule or latest case experience
62
+ example: example shows D1/D2/D3 but rule says D3 not mandatory
63
+ check: every concrete value in examples consistent with current rules
64
+
65
+ procedure:
66
+ step_1:
67
+ name: re_read_full_file
68
+ description: Use file-reading tool to read entire file (not just diff)
69
+ when: after editing, before committing
70
+ step_2:
71
+ name: walk_categories
72
+ description: Apply 6 categories above against the file
73
+ step_3:
74
+ name: fix_in_same_commit
75
+ description: If issues found, edit in place and include fixes in same commit
76
+ rationale: ship-and-patch creates follow-up commits; in-place fix is cleaner
77
+ step_4:
78
+ name: patch_if_already_committed
79
+ description: If issues found after commit, create patch commit (e.g., v1.2.1 fixes v1.2.0)
80
+
81
+ recording_formats:
82
+ skill_md:
83
+ location: changelog line in frontmatter or near top
84
+ format: '> **v{X.Y.(Z+1)} Self-review pass {YYYY-MM-DD}**: {N} issues found, {M} fixed in same commit'
85
+ adr_dec:
86
+ location: '## Follow-up Tracking table'
87
+ format: '| Self-review pass | This DEC | ✅ YYYY-MM-DD (6 categories, no issues) |'
88
+ xspec_sdd_delta:
89
+ location: after non-modification list section (e.g. §N.6)
90
+ format: '> Self-review pass: YYYY-MM-DD (6 categories, no issues)'
91
+ commit_message_body:
92
+ location: last line of commit body
93
+ format: 'Self-review (protocol v1.0.0): N issues found, M applied in same commit / 0 found.'
94
+
95
+ distinction_from_other_practices:
96
+ code_review:
97
+ covers: code correctness, design, security
98
+ trigger: before merging code PR
99
+ source_standard: code-review.md
100
+ content_self_audit:
101
+ covers: content completeness (all required sections present)
102
+ trigger: each artefact creation
103
+ example: eval-source skill 7-question audit
104
+ self_review_protocol:
105
+ covers: internal cross-reference consistency (form, not content)
106
+ trigger: after large markdown edit, before commit
107
+ source_standard: self-review-protocol.md (this standard)
108
+ peer_review:
109
+ covers: independent perspective, blast radius assessment
110
+ trigger: significant changes
111
+
112
+ anti_patterns:
113
+ - skipping_because_diff_small:
114
+ problem: small diffs in large files often introduce cross-ref errors elsewhere
115
+ mitigation: trigger is whole-file size, not diff size
116
+ - reviewing_diff_only:
117
+ problem: cross-ref errors may live in unchanged sections referencing changed content
118
+ mitigation: re-read whole file, not just diff
119
+ - document_without_practice:
120
+ problem: discipline is in the practice, not the documentation
121
+ mitigation: enforce in commit checklist
122
+ - substitute_for_peer_review:
123
+ problem: self-review catches inconsistencies, not design flaws
124
+ mitigation: keep peer review for significant changes
125
+
126
+ verification:
127
+ metrics:
128
+ - patch_commit_ratio:
129
+ description: ratio of v1.X.0 -> v1.X.1 follow-up patches for same artefact
130
+ target: significant drop after adopting; eval-source went from 100% to 0% after v1.3.0
131
+ - issue_surface_time:
132
+ description: issues caught by self-review (pre-commit) vs by next reader (post-commit)
133
+ target: pre-commit grows, post-commit shrinks
134
+
135
+ examples_in_the_wild:
136
+ - artefact: dev-platform/.claude/skills/eval-source/SKILL.md
137
+ version: v1.3.0
138
+ commit: 6b45c5d
139
+ note: first SKILL.md edit to pass self-review pre-commit; preceded by 2 patch cycles (v1.1.0->v1.1.1 with 3 issues, v1.2.0->v1.2.1 with 6 issues) that motivated this standard
140
+
141
+ self_review:
142
+ date: "2026-05-26"
143
+ issues_found: 0
144
+ notes: First draft self-review pass; no internal inconsistencies detected
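
The `check` fields for categories 1 and 3 in the standard above describe counting that can be done mechanically. The following is a minimal sketch, assuming step headers follow the `## Step N:` / `## N.` patterns named in the standard and that count claims are phrased as "N steps". The function names are hypothetical and are not part of universal-dev-standards.

```python
import re

# Headers of the form "## Step 3:" or "## 3." (the two patterns named in category 1).
STEP_HEADER = re.compile(r"^##\s+(?:Step\s+\d+:|\d+\.)", re.MULTILINE)
# Assumed phrasing of an explicit count claim, e.g. "has 8 steps".
STEP_CLAIM = re.compile(r"\b(\d+)\s+steps\b", re.IGNORECASE)


def count_step_headers(markdown: str) -> int:
    """Count step headers in a markdown document."""
    return len(STEP_HEADER.findall(markdown))


def find_step_count_drift(markdown: str) -> list[tuple[int, int]]:
    """Return (claimed, actual) pairs for every 'N steps' claim that is wrong."""
    actual = count_step_headers(markdown)
    drift = []
    for match in STEP_CLAIM.finditer(markdown):
        claimed = int(match.group(1))
        if claimed != actual:
            drift.append((claimed, actual))
    return drift
```

A non-empty result is only a prompt to re-read the file; the regexes will miss claims phrased differently.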
@@ -158,3 +158,22 @@ standard:
158
158
  evidence: >
159
159
  BUG-A08 post-mortem (2026-04-20): 22 tests existed in UDS but were never
160
160
  executed by any CI gate, passing silently and masking real failures.
161
+
162
+ - id: gate-wiring-required
163
+ trigger: adding any quality detection script to the repository
164
+ instruction: |
165
+ Quality detection scripts (anti-fake check, stub check, coverage ratchet,
166
+ tautology scanner) MUST appear in at least one CI workflow job AND at least
167
+ one local hook (pre-commit or pre-push). A script that exists in scripts/
168
+ but is never called by CI is equivalent to not existing and constitutes a
169
+ governance gap. Apply the same execution-continuity principle to detection
170
+ scripts as to test cases: existence ≠ execution.
171
+ Checklist when adding a detection script:
172
+ [ ] Script is called in .github/workflows/*.yml (at least one job)
173
+ [ ] Script is called in .husky/pre-commit or .husky/pre-push
174
+ [ ] CI step name references the XSPEC or standard that mandates it
175
+ priority: required
176
+ evidence: >
177
+ XSPEC-220 post-mortem (2026-05-19): check-anti-fake-tests.sh existed in
178
+ vibeops/scripts/ for months but was not called by pre-commit, allowing
179
+ tautology assertions to be committed undetected.
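
The first two checklist items of `gate-wiring-required` can be approximated with a text search. This is a rough sketch, assuming the repository layout named in the checklist (`.github/workflows/*.yml` and `.husky/pre-commit` / `.husky/pre-push`). `gate_wiring` is a hypothetical helper, and a substring match is only a heuristic, since it also counts commented-out lines.

```python
from pathlib import Path


def gate_wiring(repo: Path, script_name: str) -> dict[str, bool]:
    """Report whether script_name is mentioned in any CI workflow and any local hook."""
    workflows = list((repo / ".github" / "workflows").glob("*.yml"))
    hooks = [repo / ".husky" / "pre-commit", repo / ".husky" / "pre-push"]

    def mentions(path: Path) -> bool:
        # Missing files simply count as "not wired".
        return path.is_file() and script_name in path.read_text(
            encoding="utf-8", errors="replace"
        )

    return {
        "ci": any(mentions(p) for p in workflows),
        "local_hook": any(mentions(p) for p in hooks),
    }
```

A script satisfies the rule only when both values are `True`.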
@@ -2,8 +2,8 @@
2
2
 
3
3
  > **Language**: English | [繁體中文](../locales/zh-TW/core/deployment-standards.md)
4
4
 
5
- **Version**: 1.0.0
6
- **Last Updated**: 2026-02-09
5
+ **Version**: 1.1.0
6
+ **Last Updated**: 2026-05-26
7
7
  **Applicability**: All software projects with deployment pipelines
8
8
  **Scope**: universal
9
9
  **Industry Standards**: Twelve-Factor App, Google SRE — Release Engineering, DORA State of DevOps
@@ -219,6 +219,103 @@ Strategies are not mutually exclusive. Common combinations:
219
219
 
220
220
  ---
221
221
 
222
+ ## Defensive Deployment Ordering
223
+
224
+ When a deploy script replaces a running install (the destructive-update pattern common to Windows IIS, systemd-managed services, or any "stop → swap → start" workflow), the ordering of destructive steps relative to verification is non-negotiable.
225
+
226
+ ### The forbidden ordering
227
+
228
+ ```
229
+ 1. Stop service
230
+ 2. Extract new package ← may silently no-op on format mismatch
231
+ 3. Delete old install ← runs unconditionally — destroys the running install
232
+ 4. Copy new install ← throws (source doesn't exist)
233
+ 5. Start service ← cannot start (binaries gone)
234
+ ```
235
+
236
+ If step 2 silently fails (corrupt archive, wrong format, disk full, permissions), step 3 still runs and **destroys the running install**, leaving nothing to recover from except backup. Backup helps for full rollback but does NOT prevent the outage window — the service is already down.
237
+
238
+ ### The required ordering — extract, verify, then delete
239
+
240
+ The destructive deploy ordering **MUST** be:
241
+
242
+ ```
243
+ 1. Stop service
244
+ 2. Extract new package → staging area (NOT directly over live install)
245
+ 3. ✅ VERIFY staging area contains expected artifacts
246
+ ↑ if verification fails: abort, do NOT touch the live install
247
+ 4. Backup live install (or do it earlier; either is fine)
248
+ 5. Delete old install (preserving logs / runtime data)
249
+ 6. Copy new install from staging
250
+ 7. Restore preserved configs
251
+ 8. Start service
252
+ 9. Sanity check (HTTP probe / health endpoint)
253
+ ```
254
+
255
+ **Step 3 verification is non-negotiable.** Minimum verification is checking that at least one well-known file from the new package exists in the staging area. Hash-checking a manifest of expected files is preferred when available.
256
+
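
The preferred form of step 3, hash-checking a manifest, might look like the sketch below. It assumes the manifest is a mapping from relative path to SHA-256 hex digest shipped alongside the package. That manifest format and the `verify_staging` name are illustrative assumptions, not something the standard defines.

```python
import hashlib
from pathlib import Path


def verify_staging(staging: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means staging matches the manifest."""
    problems = []
    for rel_path, expected_sha256 in manifest.items():
        candidate = staging / rel_path
        if not candidate.is_file():
            problems.append(f"missing: {rel_path}")
            continue
        actual = hashlib.sha256(candidate.read_bytes()).hexdigest()
        if actual != expected_sha256:
            problems.append(f"hash mismatch: {rel_path}")
    return problems
```

A deploy script would abort on any non-empty result, before the live install is touched.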
257
+ ### Verification snippets
258
+
259
+ **PowerShell** (Windows IIS deploy):
260
+
261
+ ```powershell
262
+ $staging = "C:\deploy\staging-$(Get-Date -Format yyyyMMddHHmmss)"
263
+ Expand-Archive -Path $zipPath -DestinationPath $staging -Force
264
+
265
+ # Non-negotiable: verify staging before touching live install
266
+ if (-not (Test-Path "$staging\api\MyApp.dll")) {
267
+ throw "Expected $staging\api\MyApp.dll not found — archive may be corrupt or wrong format. Aborting deploy. Live install untouched."
268
+ }
269
+
270
+ # Only NOW touch live install
271
+ Copy-Item "$apiDir" "$backupDir" -Recurse -Force
272
+ Get-ChildItem $apiDir -Exclude logs | Remove-Item -Recurse -Force
273
+ Copy-Item "$staging\api\*" $apiDir -Recurse
274
+ ```
275
+
276
+ **bash** (Linux systemd-managed service):
277
+
278
+ ```bash
279
+ set -euo pipefail
280
+
281
+ STAGING="/srv/deploy/staging-$(date +%Y%m%d%H%M%S)"
282
+ mkdir -p "$STAGING"
283
+ tar -xzf "$ARCHIVE" -C "$STAGING"
284
+
285
+ # Non-negotiable: verify staging before touching live install
286
+ if [ ! -f "$STAGING/bin/myapp" ]; then
287
+ echo "ERROR: Expected $STAGING/bin/myapp not found. Aborting deploy. Live install untouched." >&2
288
+ exit 1
289
+ fi
290
+
291
+ # Only NOW touch live install
292
+ systemctl stop myapp
293
+ cp -a "$LIVE_DIR" "$BACKUP_DIR"
294
+ find "$LIVE_DIR" -mindepth 1 -not -path "$LIVE_DIR/logs" -not -path "$LIVE_DIR/logs/*" -delete
295
+ cp -a "$STAGING"/* "$LIVE_DIR/"
296
+ systemctl start myapp
297
+ ```
298
+
299
+ ### Failure modes addressed
300
+
301
+ | Failure mode | What protects against it |
302
+ |---|---|
303
+ | Archive is wrong format (e.g., tar renamed to `.zip`) | Step 3 verify fails — live install untouched |
304
+ | Partial extract (disk full mid-extract) | Step 3 verify fails — live install untouched |
305
+ | Archive root structure changed (extra wrapper folder, missing key file) | Step 3 verify fails — live install untouched |
306
+ | Permissions issue (extract step had read but not write) | Step 3 verify fails — live install untouched |
307
+ | Backup script itself fails | Optional secondary check after step 4 |
308
+
309
+ ### Upstream prevention
310
+
311
+ Verifying at the consumer side is the last line of defense. The **upstream** prevention — refusing to produce a misformatted archive in the first place — is covered by [Packaging Standards — Archive Format Integrity](packaging-standards.md#archive-format-integrity). Both layers together form a defense-in-depth pair; neither alone is sufficient.
312
+
313
+ ### Failure mode reference (real incident)
314
+
315
+ A Windows IIS production deploy script (2026-05-24) ran `Expand-Archive` against a tar-renamed-to-`.zip` archive (silent no-op), then `Remove-Item -Recurse` against the live `apiDir`, then `Copy-Item` from a source that did not exist (because nothing had been extracted). The live install was wiped, the AppPool stopped, and production was down for ~3 minutes until backup-based rollback completed. A step 3 verification (`Test-Path "$staging\api\MyApp.dll"`) would have aborted the deploy at the staging stage with the live install untouched.
316
+
317
+ ---
318
+
222
319
  ## Post-Deployment Checklist
223
320
 
224
321
  ### Immediate (< 5 minutes)
@@ -334,6 +431,7 @@ Smoke test failure MUST block the deployment from proceeding and trigger a rollb
334
431
 
335
432
  | Version | Date | Changes |
336
433
  |---------|------|---------|
434
+ | 1.1.0 | 2026-05-26 | Added: Defensive Deployment Ordering section — required extract-verify-then-delete sequence, PowerShell + bash verify snippets, failure mode mapping, cross-link to packaging-standards Archive Format Integrity (XSPEC-231 / closes issue #110) |
337
435
  | 1.0.0 | 2026-02-09 | Initial release |
338
436
 
339
437
  ---
@@ -0,0 +1,118 @@
1
+ # License Compliance Standards
2
+
3
+ > **Version**: 2.1.0 | **Status**: Active | **Updated**: 2026-05-16
4
+ > **AI-optimized version**: `ai/standards/license-compliance.ai.yaml`
5
+ > **Agent Spec**: ASPEC-001 (cross-project/aspec/ASPEC-001-license-compliance-agent.md)
6
+
7
+ ## Overview
8
+
9
+ Comprehensive license compliance for AI-augmented development, covering both general OSS practice (Tier 1) and AI-specific rules for AI-generated code (Tier 2).
10
+
11
+ ## Tier 1 — General OSS Compliance Practices
12
+
13
+ Applies to every project regardless of AI use.
14
+
15
+ | ID | Rule | Level |
16
+ |----|------|-------|
17
+ | REQ-001 | License classification and allowlist | MUST |
18
+ | REQ-002 | Automated license scanning in CI | MUST |
19
+ | REQ-003 | SBOM generation (CycloneDX 1.5 or SPDX 2.3) | MUST |
20
+ | REQ-004 | License attribution and NOTICES file | MUST |
21
+ | REQ-005 | License violation remediation (5 business days) | MUST |
22
+ | REQ-006 | License review for new technology adoption | SHOULD |
23
+
24
+ ### License Tiers
25
+
26
+ | Tier | Licenses | Action |
27
+ |------|----------|--------|
28
+ | APPROVED | MIT, Apache-2.0, BSD-2/3-Clause, ISC, CC0-1.0 | Auto-approve |
29
+ | REVIEW-REQUIRED | LGPL-2.1/3.0, MPL-2.0, CDDL | Legal review before adoption |
30
+ | PROHIBITED | GPL-2.0/3.0, AGPL-3.0, SSPL-1.0, BUSL-1.1 | Block PR immediately |
31
+
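
The tier table can be read as a lookup from license ID to action. The sketch below assumes the table's shorthand expands to the SPDX IDs shown (for example `BSD-2/3-Clause` to `BSD-2-Clause` and `BSD-3-Clause`, `CDDL` to both CDDL versions) and that `-only` / `-or-later` suffixes map onto the base ID. The `UNLISTED` fallback is also an assumption, since the table does not say how unknown licenses are handled.

```python
# Tier membership transcribed from the table; the expansions are assumptions.
LICENSE_TIERS = {
    "APPROVED": {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "CC0-1.0"},
    "REVIEW-REQUIRED": {"LGPL-2.1", "LGPL-3.0", "MPL-2.0", "CDDL-1.0", "CDDL-1.1"},
    "PROHIBITED": {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "SSPL-1.0", "BUSL-1.1"},
}
TIER_ACTIONS = {
    "APPROVED": "Auto-approve",
    "REVIEW-REQUIRED": "Legal review before adoption",
    "PROHIBITED": "Block PR immediately",
}


def _base_id(spdx_id: str) -> str:
    """Strip SPDX '-only' / '-or-later' suffixes so 'GPL-3.0-only' matches 'GPL-3.0'."""
    for suffix in ("-only", "-or-later"):
        if spdx_id.endswith(suffix):
            return spdx_id[: -len(suffix)]
    return spdx_id


def classify(spdx_id: str) -> tuple[str, str]:
    """Return (tier, action) for a license ID."""
    base = _base_id(spdx_id)
    for tier, members in LICENSE_TIERS.items():
        if base in members:
            return tier, TIER_ACTIONS[tier]
    # Hypothetical fallback: anything not in the table goes to a human.
    return "UNLISTED", "Escalate to human review"
```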
32
+ ## Tier 2 — AI-Specific Rules
33
+
34
+ Binding on AI Agents that produce code (VibeOps Generator Agent and equivalents).
35
+
36
+ | ID | Rule | Severity |
37
+ |----|------|----------|
38
+ | LC-001 | SPDX ID lookup required | Blocking |
39
+ | LC-002 | Blocklist auto-block | Blocking |
40
+ | LC-003 | Allowlist auto-approve | Informational |
41
+ | LC-004 | Greylist human review | Review required |
42
+ | LC-005 | SBOM mandatory generation | Blocking |
43
+ | LC-006 | Copyright similarity threshold (≥0.85 block) | Blocking |
44
+ | LC-007 | PII pattern detection | Review required |
45
+ | LC-008 | EU AI Act transparency marker | Blocking |
46
+ | LC-009 | Customer policy ceiling | Informational |
47
+
48
+ ## v2.1.0 Enhancements (XSPEC-193 Phase 2)
49
+
50
+ ### ClearlyDefined API (LC-001)
51
+
52
+ - Primary license lookup source: `https://api.clearlydefined.io/definitions/{type}/{provider}/{namespace}/{name}/{revision}`
53
+ - Confidence ≥ 0.95 for well-known packages (score.total ≥ 80)
54
+ - 24h TTL LRU cache (cap=500) + negative cache for 404
55
+ - Token bucket: 10 req/s, burst 20
56
+ - Retry strategy: 5xx → exponential backoff × 3 (200ms/1s/3s); 429 → batch fallback
57
+ - DEC-064 cache key isolation: `sha256(client_salt + ':' + purl)`
58
+
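
Two items in the list above are concrete enough to sketch: the DEC-064 cache key and the token bucket (10 req/s, burst 20). This is an illustration only. It assumes UTF-8 encoding and a hex digest for the key, and the `TokenBucket` class with its injected clock is hypothetical rather than the agent's actual implementation.

```python
import hashlib


def cache_key(client_salt: str, purl: str) -> str:
    """sha256(client_salt + ':' + purl), hex-encoded (encoding is an assumption)."""
    return hashlib.sha256(f"{client_salt}:{purl}".encode("utf-8")).hexdigest()


class TokenBucket:
    """Token bucket with an injected clock so behaviour is deterministic in tests."""

    def __init__(self, rate: float = 10.0, burst: int = 20, now: float = 0.0):
        self.rate = rate              # tokens added per second
        self.capacity = float(burst)  # maximum tokens held
        self.tokens = float(burst)    # start full
        self.last = now

    def try_acquire(self, now: float) -> bool:
        """Refill for the elapsed time, then take one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Because the salt is part of the hashed input, two customers looking up the same package get different cache keys.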
59
+ ### AST PII Analysis (LC-007)
60
+
61
+ - Tree-sitter support: TypeScript, JavaScript, Python
62
+ - Context classification:
63
+ - `hardcoded_value` → severity upgraded to `critical`
64
+ - `comment` → severity downgraded to `info`
65
+ - `schema_field` → annotated, no severity change
66
+ - `// pii:ignore` pragma: suppresses findings on same line
67
+ - Optional fields: `PIIPattern.confidence`, `PIIPattern.ast_context`
68
+ - Graceful fallback to regex when tree-sitter unavailable
69
+
70
+ ### EmbeddingProvider Strategy (LC-006)
71
+
72
+ - `provider='onnx-minilm'`: ONNX local inference (all-MiniLM-L6-v2)
73
+ - `provider='ollama-bge-m3'`: Ollama local API (localhost:11434)
74
+ - `provider='jaccard'`: Jaccard token similarity (Phase 1 baseline, default)
75
+ - In-memory snippet index (`buildSnippetIndex()`) per-customer (DEC-064 salt)
76
+ - External search: opt-in via `enableExternalSearch=true` (default=false)
77
+
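
For the default `jaccard` provider, the comparison against the LC-006 threshold could look like this sketch. The tokenization (lowercased `\w+` runs) and the choice to score two empty inputs as 0.0 are assumptions; the standard does not specify either.

```python
import re


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard index over the sets of word tokens in a and b."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # assumption: nothing to compare means no similarity
    return len(tokens_a & tokens_b) / len(union)


def should_block(similarity: float, threshold: float = 0.85) -> bool:
    """LC-006: block when similarity is at or above the threshold."""
    return similarity >= threshold
```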
78
+ ## Principles
79
+
80
+ | ID | Principle |
81
+ |----|-----------|
82
+ | P-1 | SPDX First — all license IDs must be SPDX standard |
83
+ | P-2 | Independent Evaluator — different model class from Generator |
84
+ | P-3 | Evidence-Based Decision — every block carries traceable evidence |
85
+ | P-4 | Transparency by Default — EU AI Act Article 50 markers required |
86
+ | P-5 | Customer Sovereignty — policy customizable within EULA §9 limits |
87
+
88
+ ## Tool Sequence (XSPEC-193 §2)
89
+
90
+ ```
91
+ 1. dependency_reader
92
+ 2. license_lookup ← ClearlyDefined API (v2.1.0)
93
+ 3. license_blocklist_check
94
+ 4. sbom_generator
95
+ 5. pii_pattern_detector ← AST-enhanced (v2.1.0)
96
+ 6. copyright_similarity_check ← EmbeddingProvider (v2.1.0)
97
+ 7. eu_ai_act_classifier
98
+ 8. transparency_marker
99
+ 9. block_pr
100
+ 10. suggest_alternative
101
+ 11. escalate_to_human
102
+ ```
103
+
104
+ ## Related Specs
105
+
106
+ - XSPEC-193 — License Compliance Agent complete spec
107
+ - XSPEC-066 — Wave 3 Compliance Pack (v1.0.0 baseline)
108
+ - DEC-063 — VibeOps legal & compliance strategy
109
+ - DEC-064 — Customer IP isolation (cache salt)
110
+ - ASPEC-001 — License Compliance Agent SPEC (XSPEC-205 §REQ-2 format)
111
+
112
+ ## Changelog
113
+
114
+ | Version | Date | Changes |
115
+ |---------|------|---------|
116
+ | v1.0.0 | 2026-04-30 | Initial — REQ-001~006 general OSS practices |
117
+ | v2.0.0 | 2026-05-14 | Added Tier 2 LC-001~009 AI-specific rules |
118
+ | v2.1.0 | 2026-05-16 | ClearlyDefined API + AST PII + EmbeddingProvider + ASPEC-001 ref |
@@ -2,8 +2,8 @@
2
2
 
3
3
  > **Language**: English | [繁體中文](../locales/zh-TW/core/logging-standards.md)
4
4
 
5
- **Version**: 1.2.0
6
- **Last Updated**: 2026-01-24
5
+ **Version**: 1.3.0
6
+ **Last Updated**: 2026-05-26
7
7
  **Applicability**: All software projects
8
8
  **Scope**: universal
9
9
  **Industry Standards**: RFC 5424, OpenTelemetry, W3C Trace Context
@@ -595,6 +595,117 @@ For endpoints called thousands of times per second:
595
595
  | WARN | 90 days |
596
596
  | ERROR/FATAL | 1 year |
597
597
 
598
+ ---
599
+
600
+ ## Log File Rotation Policy
601
+
602
+ ### Rotation policy — MUST set both
603
+
604
+ A file-based log sink configuration **MUST** include **both** triggers:
605
+
606
+ 1. **Time-based rotation** (`rollingInterval: Day` or equivalent) — for chronological partitioning
607
+ 2. **Size-based rotation** with `rollOnFileSizeLimit: true` (or equivalent) — to handle volume spikes
608
+
609
+ > **Why mandatory:** Some logging libraries ship with a silent default size cap (Serilog's File sink is the example in the table that follows). When the file hits the cap, subsequent log writes are **dropped silently**, with no warning and no error. The application keeps running while half a day of logs vanish. Setting both triggers explicitly defeats this trap.
610
+
611
+ ### Default cap is hostile in production
612
+
613
+ | Library | Default size cap | Behavior when cap hit |
614
+ |---|---|---|
615
+ | Serilog File sink (.NET) | 1 GB | **Silently stops writing** (`RollOnFileSizeLimit = false` by default) |
616
+ | log4j RollingFileAppender | none unless set | No cap to hit: without a size trigger the file grows unbounded |
617
+ | Python `RotatingFileHandler` | infinite unless `maxBytes` set | Grows unbounded |
618
+ | Winston `winston-daily-rotate-file` | none unless `maxSize` set | No cap to hit: each daily file grows unbounded |
619
+
620
+ If you do not explicitly configure size-based rotation, you are accepting one of the failure modes above.
621
+
622
+ ### Recommended starting values
623
+
624
+ | Parameter | Value | Rationale |
625
+ |---|---|---|
626
+ | `fileSizeLimitBytes` | 100 MB | Balance: small enough to open in an editor, large enough to avoid excessive rolls |
627
+ | `rollOnFileSizeLimit` | `true` | When cap hit, create `*-001.txt`, `*-002.txt`; do **NOT** drop |
628
+ | `retainedFileCountLimit` | ≥ N×7 where N = max expected rolls/day | Avoid premature deletion of in-window logs |
629
+
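
As a worked example of the `retainedFileCountLimit` ≥ N×7 rule: the sketch below reads N as the number of files written on a peak day and the factor 7 as a seven-day window. Both readings are assumptions, and `min_retained_files` is a hypothetical helper.

```python
import math

ONE_HUNDRED_MB = 100 * 1024 * 1024  # 104857600, the value used for fileSizeLimitBytes


def min_retained_files(
    peak_daily_bytes: int,
    file_size_limit_bytes: int = ONE_HUNDRED_MB,
    window_days: int = 7,
) -> int:
    """Smallest retained-file count that keeps window_days of peak-volume logs."""
    files_per_day = max(1, math.ceil(peak_daily_bytes / file_size_limit_bytes))
    return files_per_day * window_days
```

A service that writes 350 MB on its busiest day needs 4 files per day at a 100 MB cap, so the limit should be at least 28.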
630
+ ### Recipes per language
631
+
632
+ **.NET / Serilog** (`appsettings.json`):
633
+
634
+ ```json
635
+ {
636
+ "Serilog": {
637
+ "WriteTo": [{
638
+ "Name": "File",
639
+ "Args": {
640
+ "path": "logs/app-.txt",
641
+ "rollingInterval": "Day",
642
+ "fileSizeLimitBytes": 104857600,
643
+ "rollOnFileSizeLimit": true,
644
+ "retainedFileCountLimit": 90
645
+ }
646
+ }]
647
+ }
648
+ }
649
+ ```
650
+
651
+ **Python** (`logging.handlers`):
652
+
653
+ ```python
654
+ from logging.handlers import RotatingFileHandler
655
+
656
+ handler = RotatingFileHandler(
657
+ filename="logs/app.log",
658
+ maxBytes=104857600, # 100 MB
659
+ backupCount=90 # keep up to 90 rolled files (size-triggered, so not a fixed time window)
660
+ )
661
+ # For combined time+size rotation, compose TimedRotatingFileHandler with size check
662
+ # or use a third-party library such as concurrent-log-handler.
663
+ ```
664
+
665
+ **Java / log4j2** (`log4j2.xml`):
666
+
667
+ ```xml
668
+ <RollingFile name="App" fileName="logs/app.log"
669
+ filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz">
670
+ <PatternLayout pattern="%d %-5p %c{1.} - %m%n"/>
671
+ <Policies>
672
+ <TimeBasedTriggeringPolicy interval="1"/>
673
+ <SizeBasedTriggeringPolicy size="100 MB"/>
674
+ </Policies>
675
+ <DefaultRolloverStrategy max="90"/>
676
+ </RollingFile>
677
+ ```
678
+
679
+ **Node / Winston** (`winston-daily-rotate-file`):
680
+
681
+ ```javascript
682
+ import DailyRotateFile from "winston-daily-rotate-file";
683
+
684
+ new DailyRotateFile({
685
+ filename: "logs/app-%DATE%.log",
686
+ datePattern: "YYYY-MM-DD",
687
+ maxSize: "100m",
688
+ maxFiles: "90d"
689
+ });
690
+ ```
691
+
692
+ ### Operational SOP — investigate, don't just raise the cap
693
+
694
+ If a log file reaches ≥ 90% of `fileSizeLimitBytes` by the expected end of day, **investigate the cause before raising the cap**. Typical root causes:
695
+
696
+ - Noisy retry loop logging every attempt at INFO instead of WARN summary
697
+ - Unbounded debug logging accidentally enabled in production
698
+ - Stack-trace flood from one upstream failure
699
+ - Health probe / sidecar polluting the business log
700
+
701
+ Raising the cap masks the underlying noise problem and pushes the next outage further out.
702
+
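
The 90% trigger can be monitored with a small scan of the log directory. This is a sketch under stated assumptions: `files_near_cap` is a hypothetical helper, and it compares every file in the directory against a single limit.

```python
from pathlib import Path


def files_near_cap(log_dir: Path, limit_bytes: int, percent: int = 90) -> list[str]:
    """Names of files whose size is at or above percent% of limit_bytes."""
    flagged = []
    for path in sorted(log_dir.iterdir()):
        # Integer arithmetic avoids float rounding at the boundary.
        if path.is_file() and path.stat().st_size * 100 >= limit_bytes * percent:
            flagged.append(path.name)
    return flagged
```

A flagged file is a prompt to look for the noise source, not a reason to raise the cap.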
703
+ ### Failure-mode reference (real incident)
704
+
705
+ A production .NET Worker using only `rollingInterval: Day` (no size limit set, Serilog default 1 GB cap) hit the cap at 07:31 and silently dropped every log entry until 13:00+ when the operator noticed the tail was stale. Five consecutive daily files each showed `~1,073,741,8XX bytes` (1 GiB is 1,073,741,824 bytes, the Serilog default cap). Half a day of production diagnostics was lost. Setting `fileSizeLimitBytes` + `rollOnFileSizeLimit: true` would have rolled to `worker-YYYYMMDD_001.txt` and preserved the events.
706
+
707
+ ---
708
+
598
709
  ## Quick Reference Card
599
710
 
600
711
  ### Log Level Selection
@@ -623,6 +734,14 @@ App cannot continue? → FATAL
623
734
  - [ ] Credit cards never logged
624
735
  - [ ] Retention policies configured
625
736
 
737
+ ### Rotation Checklist
738
+
739
+ - [ ] Time-based rotation set (`rollingInterval: Day` or equivalent)
740
+ - [ ] Size-based rotation set with `rollOnFileSizeLimit: true` (or equivalent)
741
+ - [ ] `fileSizeLimitBytes` explicitly configured (default cap is hostile)
742
+ - [ ] `retainedFileCountLimit` ≥ N×7 to cover within-window rolls
743
+ - [ ] 90% size SOP defined: investigate noise root cause, do not just raise cap
744
+
626
745
  ---
627
746
 
628
747
  **Related Standards:**
@@ -635,6 +754,7 @@ App cannot continue? → FATAL
635
754
 
636
755
  | Version | Date | Changes |
637
756
  |---------|------|---------|
757
+ | 1.3.0 | 2026-05-26 | Added: Log File Rotation Policy — mandatory dual-trigger (time + size) rotation with hostile-default warning, recipes for .NET/Python/Java/Node, ops SOP (XSPEC-232 / closes issue #111) |
638
758
  | 1.2.0 | 2026-01-24 | Added: OpenTelemetry Semantic Conventions, Observability Three Pillars Integration, Log-based Alerting, Advanced Correlation Patterns |
639
759
  | 1.1.0 | 2026-01-05 | Added: References section with OWASP, RFC 5424, OpenTelemetry, and 12 Factor App |
640
760
  | 1.0.0 | 2025-12-30 | Initial logging standards |
@@ -2,8 +2,8 @@
2
2
 
3
3
  > **Language**: English | [繁體中文](../locales/zh-TW/core/packaging-standards.md)
4
4
 
5
- **Version**: 1.0.0
6
- **Last Updated**: 2026-04-15
5
+ **Version**: 1.1.0
6
+ **Last Updated**: 2026-05-26
7
7
  **Applicability**: Projects using a UDS-aware toolchain
8
8
  **Scope**: universal
9
9
 
@@ -194,6 +194,75 @@ A packaging run is considered **successful** when ALL of the following condition
194
194
 
195
195
  ---
196
196
 
197
+ ## Archive Format Integrity
198
+
199
+ When a packaging step produces an archive (`.zip`, `.tar.gz`, `.tar.bz2`, etc.) that will be consumed by a deploy script, the **real binary format MUST match the file extension**. A file named `.zip` MUST be a real ZIP archive (PKZip magic `PK\x03\x04`), not a renamed tar archive.
200
+
201
+ > **Why mandatory:** mismatched archive formats trigger silent failures downstream. PowerShell's `Expand-Archive` and `[System.IO.Compression.ZipFile]::ExtractToDirectory()` accept tar-renamed-to-`.zip` **without raising an error** — the file is read, nothing is extracted, no exception. If the next step of the deploy script is destructive (e.g., "delete current install directory"), the live install is destroyed with nothing to replace it.
202
+
203
+ ### Verification before publish
204
+
205
+ Every packaging step that produces an archive **MUST** include format verification before declaring success. Minimum verification:
206
+
207
+ | Format | Verification one-liner |
208
+ |---|---|
209
+ | `.zip` | `python -c "import zipfile; zipfile.ZipFile('out.zip').namelist()"` must succeed |
210
+ | `.zip` (Unix) | `file out.zip` must report `Zip archive data`, **NOT** `POSIX tar archive` |
211
+ | `.tar.gz` | `tar -tzf out.tar.gz >/dev/null` must succeed |
212
+ | any | optional: hash a manifest of expected files and compare |
213
+
214
+ Verification failure MUST abort the packaging pipeline before publish.
215
+
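
The one-liners above can be folded into a single pre-publish gate. The sketch below uses only the Python standard library. `archive_matches_extension` is a hypothetical name, and it covers only `.zip` and gzip-compressed tar, which is an assumption about which formats a pipeline produces.

```python
import tarfile
import zipfile
from pathlib import Path


def archive_matches_extension(path: Path) -> bool:
    """True when the file's real format agrees with its extension."""
    name = path.name.lower()
    if name.endswith(".zip"):
        # is_zipfile looks for the ZIP end-of-central-directory record,
        # so a tar archive renamed to .zip is rejected.
        return zipfile.is_zipfile(path)
    if name.endswith((".tar.gz", ".tgz")):
        try:
            with tarfile.open(path, "r:gz") as archive:
                archive.getnames()  # forces a full read of the member list
            return True
        except (tarfile.TarError, OSError, EOFError):
            return False
    raise ValueError(f"unsupported archive extension: {path.name}")
```

A packaging step would call this on its output and exit non-zero on `False`, before anything is published.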
216
+ ### Platform-specific recipes
217
+
218
+ **Windows — DO use:**
219
+
220
+ ```powershell
221
+ # Option A: PowerShell built-in (produces real ZIP)
222
+ Compress-Archive -Path "publish\*" -DestinationPath "dist\patch.zip" -Force
223
+
224
+ # Option B: .NET API (produces real ZIP)
225
+ Add-Type -Assembly System.IO.Compression.FileSystem
226
+ [System.IO.Compression.ZipFile]::CreateFromDirectory(
227
+ "publish", "dist\patch.zip", "Optimal", $false
228
+ )
229
+ ```
230
+
231
+ **Windows — DO NOT use:**
232
+
233
+ ```bash
234
+ # ❌ git-bash / busybox tar -a -cf is UNRELIABLE on Windows
235
+ # The -a "auto by extension" flag produces a POSIX tar archive with .zip extension.
236
+ # `file patch.zip` → "POSIX tar archive (GNU)" (not "Zip archive data")
237
+ cd publish && tar -a -cf "../dist/patch.zip" api/
238
+ ```
239
+
240
+ **Unix-like — DO use:**
241
+
242
+ ```bash
243
+ # Use 'zip' for ZIP archives (BSD/Linux)
244
+ zip -r dist/patch.zip publish/
245
+
246
+ # Use 'tar -czf' (without -a) for tar.gz archives — explicit, deterministic
247
+ tar -czf dist/patch.tar.gz publish/
248
+
249
+ # Verify before publishing
250
+ file dist/patch.zip # expect "Zip archive data"
251
+ python -c "import zipfile; zipfile.ZipFile('dist/patch.zip').namelist()"
252
+ ```
253
+
254
+ ### Consumer-side defense
255
+
256
+ Producers cannot guarantee that consumers verify. Consumers (deploy scripts) **MUST** verify archive integrity before any destructive action. See [Deployment Standards — Defensive Deployment Ordering](deployment-standards.md#defensive-deployment-ordering) for the consumer-side requirement.
257
+
258
+ ### Failure mode reference (real incident)
259
+
260
+ A Windows IIS production deploy script (2026-05-24) used `tar -a -cf patch.zip api/` in git-bash to produce its release archive. The consumer-side PowerShell deploy script then ran `Expand-Archive` (silent no-op on the tar-renamed file), proceeded to `Remove-Item -Recurse` the live `apiDir`, then `Copy-Item` from a source that did not exist (because nothing had been extracted). The live install was wiped, AppPool stopped, and production was down for ~3 minutes until backup-based rollback completed.
261
+
262
+ The combination of (a) producer using auto-extension tar and (b) consumer not verifying extract output destroyed the running install with no error raised at any step.
263
+
264
+ ---
265
+
197
266
  ## Related Standards
198
267
 
199
268
  - [Deployment Standards](deployment-standards.md) — Deploy stage that follows packaging
@@ -207,6 +276,7 @@ A packaging run is considered **successful** when ALL of the following condition
207
276
 
208
277
  | Version | Date | Changes |
209
278
  |---------|------|---------|
279
+ | 1.1.0 | 2026-05-26 | Added: Archive Format Integrity section — real-format-must-match-extension rule, verification one-liners, Windows recipe DO/DON'T list, real incident reference (XSPEC-231 / closes issue #113) |
210
280
  | 1.0.0 | 2026-04-15 | Initial release — XSPEC-034 Phase 1 |
211
281
 
212
282
  ---