codingbuddy-rules 5.3.0 → 5.4.1

@@ -1,145 +1,19 @@
- # Severity Classification
+ # Severity Classification (Incident Operations)
 
- Objective severity classification based on SLO burn rates and business impact.
+ > **Canonical source:** P1/P2/P3/P4 severity level **definitions**, impact criteria, and response expectations live in [`packages/rules/.ai-rules/rules/severity-classification.md`](../../rules/severity-classification.md) under *Production Incident Severity*.
+ >
+ > This file narrows that canonical scale to **operational guidance** for incident response: how to classify in practice, the math behind burn rates, the decision tree, and what to include in an incident report.
 
  ## The Classification Rule
 
  **Classify severity BEFORE taking any action.** Severity determines:
+
  - Response time expectations
  - Who gets notified
  - Resource allocation priority
  - Communication cadence
 
- ## Severity Matrix
-
- ### P1 - Critical
-
- **SLO Burn Rate:** >14.4x (consuming >2% error budget per hour)
-
- **Impact Criteria (ANY of these):**
- - Complete service outage
- - >50% of users affected
- - Critical business function unavailable
- - Data loss or corruption risk
- - Active security breach
- - Revenue-generating flow completely blocked
- - Compliance/regulatory violation in progress
-
- **Response Expectations:**
-
- | Metric | Target |
- |--------|--------|
- | Acknowledge | Within 5 minutes |
- | First update | Within 15 minutes |
- | War room formed | Within 15 minutes |
- | Executive notification | Within 30 minutes |
- | Customer communication | Within 1 hour |
- | Update cadence | Every 15 minutes |
-
- **Escalation:** Immediate page to on-call, all hands if needed
-
- **Example Incidents:**
- - Production database unreachable
- - Authentication service down
- - Payment processing 100% failure
- - Major cloud region outage affecting services
- - Data breach detected
-
- ---
-
- ### P2 - High
-
- **SLO Burn Rate:** >6x (consuming >5% error budget per 6 hours)
-
- **Impact Criteria (ANY of these):**
- - Major feature unavailable
- - 10-50% of users affected
- - Significant performance degradation (>5x latency)
- - Secondary business function blocked
- - Partial data integrity issues
- - Key integration failing
-
- **Response Expectations:**
-
- | Metric | Target |
- |--------|--------|
- | Acknowledge | Within 15 minutes |
- | First update | Within 30 minutes |
- | Status page update | Within 30 minutes |
- | Stakeholder notification | Within 1 hour |
- | Update cadence | Every 30 minutes |
-
- **Escalation:** Page on-call during business hours, notify team lead
-
- **Example Incidents:**
- - Search functionality completely broken
- - 30% of API requests failing
- - Email notifications not sending
- - Third-party payment provider degraded
- - Mobile app login issues for subset of users
-
- ---
-
- ### P3 - Medium
-
- **SLO Burn Rate:** >3x (consuming >10% error budget per 24 hours)
-
- **Impact Criteria (ANY of these):**
- - Minor feature impacted
- - <10% of users affected
- - Workaround available
- - Non-critical function degraded
- - Cosmetic issues affecting usability
- - Performance slightly degraded
-
- **Response Expectations:**
-
- | Metric | Target |
- |--------|--------|
- | Acknowledge | Within 1 hour |
- | First update | Within 2 hours |
- | Resolution target | Within 8 business hours |
- | Update cadence | At milestones |
-
- **Escalation:** Create ticket, notify team channel
-
- **Example Incidents:**
- - Report generation slow but working
- - Specific browser experiencing issues
- - Non-critical API endpoint intermittent
- - Email formatting broken
- - Search results slightly inaccurate
-
- ---
-
- ### P4 - Low
-
- **SLO Burn Rate:** >1x (projected budget exhaustion within SLO window)
-
- **Impact Criteria (ALL of these):**
- - Minimal or no user impact
- - Edge case or rare scenario
- - Cosmetic only
- - Performance within acceptable range
- - Workaround trivial
-
- **Response Expectations:**
-
- | Metric | Target |
- |--------|--------|
- | Acknowledge | Within 1 business day |
- | Resolution target | Next sprint/release |
- | Update cadence | On resolution |
-
- **Escalation:** Backlog item, routine prioritization
-
- **Example Incidents:**
- - Minor UI misalignment
- - Rare edge case error
- - Documentation inconsistency
- - Non-user-facing optimization opportunity
-
- ---
+ See the canonical severity matrix for P1-P4 definitions, impact criteria, and response expectations. This file does **not** redefine those levels — refer to the canonical document whenever you need the authoritative criteria.
 
  ## Error Budget Integration
 
@@ -170,10 +44,10 @@ Burn Rate = 1.44 / 0.1 = 14.4x (P1!)
  | Tier 2 (Important) | 99.9% | 0.1% | 43.8 minutes |
  | Tier 3 (Standard) | 99.5% | 0.5% | 3.65 hours |
 
- ---
-
  ## Classification Decision Tree
 
+ Use this when an incident is detected and you need to assign severity quickly.
+
  ```
  Is the service completely unavailable?
  ├── Yes → P1
@@ -208,8 +82,6 @@ Is impact minimal/cosmetic only?
  └── No → Default to P3 (when uncertain)
  ```
 
- ---
-
  ## When Uncertain, Classify Higher
 
  **Rule:** If you're unsure between two severity levels, classify higher.
@@ -218,21 +90,21 @@ Is impact minimal/cosmetic only?
  - Unsure between P2 and P3? → Classify as P2
  - Unsure between P3 and P4? → Classify as P3
 
- **Rationale:** Over-response is better than under-response. You can always downgrade.
-
- ---
+ **Rationale:** Over-response is better than under-response. You can always downgrade. This rule is also stated in the canonical document; it is repeated here because during an incident you must be able to act without following links.
 
  ## Severity Changes During Incident
 
  Severity can change as you learn more:
 
  **Upgrade when:**
+
  - Impact wider than initially assessed
  - More users affected than thought
  - Business impact greater than estimated
  - Mitigation not working
 
  **Downgrade when:**
+
  - Successful mitigation reduced impact
  - Fewer users affected than thought
  - Workaround discovered
@@ -240,8 +112,6 @@ Severity can change as you learn more:
 
  **Always communicate severity changes** to all stakeholders immediately.
 
- ---
-
  ## Include in Incident Reports
 
  When documenting severity, always include:
@@ -254,3 +124,9 @@ Impact: [Brief description]
  SLO Status: [Which SLO breaching]
  Error Budget Remaining: [Percentage]
  ```
+
+ ## Relationship to Code Review Severity
+
+ Code review uses a **different** severity scale (`critical`/`high`/`medium`/`low`) for PR approval gating. The two scales are not interchangeable — see [`rules/severity-classification.md`](../../rules/severity-classification.md#mapping-between-scales) for the narrow cases where they correspond (e.g., when a `critical` code review finding that shipped becomes an incident).
+
+ Do **not** mix the two scales in incident documents. Use P1-P4 here.
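
The burn-rate arithmetic this file refers to (`Burn Rate = 1.44 / 0.1 = 14.4x` in the hunk header above, and the monthly downtime column of the tier table) can be sketched in JavaScript. This is a minimal illustration, not code from the package. Every function and constant name is hypothetical. The thresholds are copied from the 5.3.0 matrix removed in this diff, and the canonical file may define them differently. A month is assumed to be 365.25 / 12 days because that reproduces the 43.8-minute and 3.65-hour figures. The threshold boundary is treated as inclusive because the example labels exactly 14.4x as P1 while the removed matrix says `>14.4x`.

```javascript
// Hypothetical helpers for illustration; none of these names exist in codingbuddy-rules.

// Assumed month length: 365.25 / 12 days, expressed in minutes (43830).
const MINUTES_PER_MONTH = (365.25 / 12) * 24 * 60;

// Error budget is the allowed failure percentage, e.g. a 99.9% SLO leaves 0.1%.
function errorBudgetPercent(sloPercent) {
  return 100 - sloPercent;
}

// Downtime allowed per month for a given SLO, in minutes.
function monthlyDowntimeMinutes(sloPercent) {
  return (errorBudgetPercent(sloPercent) / 100) * MINUTES_PER_MONTH;
}

// Burn rate = observed error rate divided by the error budget.
function burnRate(observedErrorPercent, sloPercent) {
  return observedErrorPercent / errorBudgetPercent(sloPercent);
}

// Thresholds copied from the removed 5.3.0 matrix (assumption: still current).
const BURN_RATE_THRESHOLDS = [
  { level: 'P1', atLeast: 14.4 },
  { level: 'P2', atLeast: 6 },
  { level: 'P3', atLeast: 3 },
  { level: 'P4', atLeast: 1 },
];

// Maps a burn rate to a severity level, or null when the budget is not burning
// faster than planned. The rate is rounded to two decimals so floating-point
// noise around a threshold (e.g. 14.400000000000818) does not change the result.
function severityFromBurnRate(rate) {
  const rounded = Math.round(rate * 100) / 100;
  const match = BURN_RATE_THRESHOLDS.find((t) => rounded >= t.atLeast);
  return match ? match.level : null;
}

// "When uncertain, classify higher": returns the more severe of two candidates.
const LEVELS = ['P1', 'P2', 'P3', 'P4'];
function classifyHigher(a, b) {
  const ia = LEVELS.indexOf(a);
  const ib = LEVELS.indexOf(b);
  if (ia === -1 || ib === -1) {
    throw new Error(`Unknown severity level: ${ia === -1 ? a : b}`);
  }
  return ia <= ib ? a : b;
}
```

For example, `severityFromBurnRate(burnRate(1.44, 99.9))` evaluates to `'P1'`, and `classifyHigher('P2', 'P3')` returns `'P2'`, in line with the "classify higher" rule.
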
@@ -74,6 +74,8 @@ Security, Correctness, and Test Coverage must ALWAYS be reviewed. No exceptions.
 
  ## Quick Reference
 
+ > **Severity source:** The `Critical`/`High`/`Medium`/`Low` levels used below are the **Code Review Severity** scale defined in [`../../rules/severity-classification.md`](../../rules/severity-classification.md#code-review-severity). This skill decides *which dimensions to review at each level*; the canonical file defines *what counts as each level*. Do not conflate these levels with production incident severity (P1-P4).
+
  ### Priority Levels
 
  | Priority | Action | Dimensions |
@@ -48,11 +48,15 @@ function scaffold(cwd, options = {}) {
    return { skipped: false, dirs: copiedDirs, targetPath: targetDir };
  }
 
+ // Directories to skip during recursive copy (runtime state, not project content)
+ const SKIP_DIRS = new Set(['.omc']);
+
  function copyDirRecursive(src, dest) {
    fs.mkdirSync(dest, { recursive: true });
    const entries = fs.readdirSync(src, { withFileTypes: true });
 
    for (const entry of entries) {
+     if (entry.isDirectory() && SKIP_DIRS.has(entry.name)) continue;
      const srcPath = path.join(src, entry.name);
      const destPath = path.join(dest, entry.name);
 
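
The hunk above ends before the loop body is complete. The following self-contained sketch shows how the `SKIP_DIRS` guard behaves in a full recursive copy. The lines up to `destPath` mirror the diff. The recurse-or-copy branch after them is an assumed completion, not the package's actual code, and symlinks are not handled.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Directories to skip during recursive copy (taken from the diff).
const SKIP_DIRS = new Set(['.omc']);

function copyDirRecursive(src, dest) {
  fs.mkdirSync(dest, { recursive: true });
  const entries = fs.readdirSync(src, { withFileTypes: true });

  for (const entry of entries) {
    // Only directories are skipped; a regular file named ".omc" is still copied.
    if (entry.isDirectory() && SKIP_DIRS.has(entry.name)) continue;
    const srcPath = path.join(src, entry.name);
    const destPath = path.join(dest, entry.name);

    // Assumed completion: recurse into directories, copy anything else as a file.
    if (entry.isDirectory()) {
      copyDirRecursive(srcPath, destPath);
    } else {
      fs.copyFileSync(srcPath, destPath);
    }
  }
}
```

Because the guard runs on every level of the recursion, a `.omc` directory is skipped wherever it appears in the tree, not only at the top level.
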
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "codingbuddy-rules",
-   "version": "5.3.0",
+   "version": "5.4.1",
    "description": "AI coding rules for consistent practices across AI assistants",
    "main": "index.js",
    "types": "index.d.ts",
@@ -14,6 +14,7 @@
      "index.js",
      "index.d.ts",
      ".ai-rules",
+     "!**/.omc",
      "bin",
      "lib"
    ],
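
The `"!**/.omc"` entry is intended to keep `.omc` directories out of the published tarball. One way to confirm the effect is to scan the list of packed file paths for a `.omc` path segment. The helper below is a hypothetical sketch, not part of the package. It assumes you already have the paths as an array of strings, for example extracted from `npm pack --dry-run --json`, whose output shape is not shown in this diff.

```javascript
// Hypothetical check; the function name and default argument are illustrative.
// Returns every path that contains `excludedName` as a full path segment.
function findExcludedPaths(packedPaths, excludedName = '.omc') {
  return packedPaths.filter((p) => p.split(/[\\/]/).includes(excludedName));
}
```

An empty result means no packed file lives under a `.omc` directory. Matching on whole segments avoids false positives for names that merely end in `.omc`.
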