cfsa-antigravity 2.0.0 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (116)
  1. package/README.md +14 -0
  2. package/package.json +1 -1
  3. package/template/.agent/instructions/commands.md +8 -32
  4. package/template/.agent/instructions/example.md +21 -0
  5. package/template/.agent/instructions/patterns.md +3 -3
  6. package/template/.agent/instructions/tech-stack.md +71 -23
  7. package/template/.agent/instructions/workflow.md +12 -1
  8. package/template/.agent/rules/completion-checklist.md +6 -0
  9. package/template/.agent/rules/security-first.md +3 -3
  10. package/template/.agent/rules/vertical-slices.md +1 -1
  11. package/template/.agent/skill-library/MANIFEST.md +6 -0
  12. package/template/.agent/skill-library/stack/devops/git-advanced/SKILL.md +972 -0
  13. package/template/.agent/skill-library/stack/devops/git-workflow/SKILL.md +420 -0
  14. package/template/.agent/skills/api-versioning/SKILL.md +44 -298
  15. package/template/.agent/skills/api-versioning/references/typescript.md +157 -0
  16. package/template/.agent/skills/architecture-mapping/SKILL.md +13 -13
  17. package/template/.agent/skills/bootstrap-agents/SKILL.md +151 -152
  18. package/template/.agent/skills/clean-code/SKILL.md +64 -118
  19. package/template/.agent/skills/clean-code/references/typescript.md +126 -0
  20. package/template/.agent/skills/database-schema-design/SKILL.md +93 -317
  21. package/template/.agent/skills/database-schema-design/references/relational.md +228 -0
  22. package/template/.agent/skills/error-handling-patterns/SKILL.md +62 -557
  23. package/template/.agent/skills/error-handling-patterns/references/go.md +162 -0
  24. package/template/.agent/skills/error-handling-patterns/references/python.md +262 -0
  25. package/template/.agent/skills/error-handling-patterns/references/rust.md +112 -0
  26. package/template/.agent/skills/error-handling-patterns/references/typescript.md +178 -0
  27. package/template/.agent/skills/idea-extraction/SKILL.md +322 -224
  28. package/template/.agent/skills/logging-best-practices/SKILL.md +108 -767
  29. package/template/.agent/skills/logging-best-practices/references/go.md +49 -0
  30. package/template/.agent/skills/logging-best-practices/references/python.md +52 -0
  31. package/template/.agent/skills/logging-best-practices/references/typescript.md +215 -0
  32. package/template/.agent/skills/migration-management/SKILL.md +127 -311
  33. package/template/.agent/skills/migration-management/references/relational.md +214 -0
  34. package/template/.agent/skills/parallel-feature-development/SKILL.md +34 -43
  35. package/template/.agent/skills/pipeline-rubrics/references/be-rubric.md +1 -1
  36. package/template/.agent/skills/pipeline-rubrics/references/ia-rubric.md +2 -2
  37. package/template/.agent/skills/pipeline-rubrics/references/scoring.md +1 -1
  38. package/template/.agent/skills/pipeline-rubrics/references/vision-rubric.md +2 -1
  39. package/template/.agent/skills/prd-templates/SKILL.md +23 -6
  40. package/template/.agent/skills/prd-templates/references/be-spec-template.md +2 -2
  41. package/template/.agent/skills/prd-templates/references/decomposition-templates.md +2 -2
  42. package/template/.agent/skills/prd-templates/references/engineering-standards-template.md +2 -0
  43. package/template/.agent/skills/prd-templates/references/fe-spec-template.md +1 -1
  44. package/template/.agent/skills/prd-templates/references/fractal-cx-template.md +58 -0
  45. package/template/.agent/skills/prd-templates/references/fractal-feature-template.md +93 -0
  46. package/template/.agent/skills/prd-templates/references/fractal-node-index-template.md +55 -0
  47. package/template/.agent/skills/prd-templates/references/ideation-crosscut-template.md +26 -47
  48. package/template/.agent/skills/prd-templates/references/ideation-index-template.md +47 -31
  49. package/template/.agent/skills/prd-templates/references/operational-templates.md +1 -1
  50. package/template/.agent/skills/prd-templates/references/placeholder-workflow-mapping.md +50 -21
  51. package/template/.agent/skills/prd-templates/references/skill-loading-protocol.md +32 -0
  52. package/template/.agent/skills/prd-templates/references/slice-completion-gates.md +29 -0
  53. package/template/.agent/skills/prd-templates/references/spec-coverage-sweep.md +3 -3
  54. package/template/.agent/skills/prd-templates/references/tdd-testing-policy.md +39 -0
  55. package/template/.agent/skills/prd-templates/references/vision-template.md +8 -8
  56. package/template/.agent/skills/regex-patterns/SKILL.md +122 -540
  57. package/template/.agent/skills/regex-patterns/references/go.md +44 -0
  58. package/template/.agent/skills/regex-patterns/references/javascript.md +63 -0
  59. package/template/.agent/skills/regex-patterns/references/python.md +77 -0
  60. package/template/.agent/skills/regex-patterns/references/rust.md +43 -0
  61. package/template/.agent/skills/resolve-ambiguity/SKILL.md +1 -1
  62. package/template/.agent/skills/session-continuity/SKILL.md +11 -9
  63. package/template/.agent/skills/session-continuity/protocols/02-progress-generation.md +2 -2
  64. package/template/.agent/skills/session-continuity/protocols/04-pattern-extraction.md +1 -1
  65. package/template/.agent/skills/session-continuity/protocols/05-session-close.md +1 -1
  66. package/template/.agent/skills/session-continuity/protocols/09-parallel-claim.md +1 -1
  67. package/template/.agent/skills/session-continuity/protocols/10-placeholder-verification-gate.md +57 -78
  68. package/template/.agent/skills/session-continuity/protocols/11-parallel-synthesis.md +1 -1
  69. package/template/.agent/skills/spec-writing/SKILL.md +1 -1
  70. package/template/.agent/skills/tdd-workflow/SKILL.md +94 -317
  71. package/template/.agent/skills/tdd-workflow/references/typescript.md +231 -0
  72. package/template/.agent/skills/testing-strategist/SKILL.md +74 -687
  73. package/template/.agent/skills/testing-strategist/references/typescript.md +328 -0
  74. package/template/.agent/skills/workflow-automation/SKILL.md +62 -154
  75. package/template/.agent/skills/workflow-automation/references/inngest.md +88 -0
  76. package/template/.agent/skills/workflow-automation/references/temporal.md +64 -0
  77. package/template/.agent/workflows/bootstrap-agents-fill.md +85 -143
  78. package/template/.agent/workflows/bootstrap-agents-provision.md +90 -107
  79. package/template/.agent/workflows/create-prd-architecture.md +23 -16
  80. package/template/.agent/workflows/create-prd-compile.md +11 -12
  81. package/template/.agent/workflows/create-prd-design-system.md +1 -1
  82. package/template/.agent/workflows/create-prd-security.md +9 -11
  83. package/template/.agent/workflows/create-prd-stack.md +10 -4
  84. package/template/.agent/workflows/create-prd.md +9 -9
  85. package/template/.agent/workflows/decompose-architecture-structure.md +4 -6
  86. package/template/.agent/workflows/decompose-architecture-validate.md +18 -1
  87. package/template/.agent/workflows/decompose-architecture.md +18 -3
  88. package/template/.agent/workflows/evolve-contract.md +11 -11
  89. package/template/.agent/workflows/evolve-feature-classify.md +14 -6
  90. package/template/.agent/workflows/ideate-discover.md +72 -107
  91. package/template/.agent/workflows/ideate-extract.md +84 -63
  92. package/template/.agent/workflows/ideate-validate.md +26 -22
  93. package/template/.agent/workflows/ideate.md +9 -9
  94. package/template/.agent/workflows/implement-slice-setup.md +25 -23
  95. package/template/.agent/workflows/implement-slice-tdd.md +73 -89
  96. package/template/.agent/workflows/implement-slice.md +4 -4
  97. package/template/.agent/workflows/plan-phase-preflight.md +6 -2
  98. package/template/.agent/workflows/plan-phase-write.md +6 -8
  99. package/template/.agent/workflows/remediate-pipeline-assess.md +2 -1
  100. package/template/.agent/workflows/resolve-ambiguity.md +2 -2
  101. package/template/.agent/workflows/update-architecture-map.md +22 -5
  102. package/template/.agent/workflows/validate-phase-quality.md +155 -0
  103. package/template/.agent/workflows/validate-phase-readiness.md +167 -0
  104. package/template/.agent/workflows/validate-phase.md +19 -157
  105. package/template/.agent/workflows/verify-infrastructure.md +10 -10
  106. package/template/.agent/workflows/write-architecture-spec-design.md +23 -14
  107. package/template/.agent/workflows/write-be-spec-classify.md +25 -21
  108. package/template/.agent/workflows/write-be-spec.md +1 -1
  109. package/template/.agent/workflows/write-fe-spec-classify.md +6 -12
  110. package/template/.agent/workflows/write-fe-spec-write.md +1 -1
  111. package/template/AGENTS.md +6 -2
  112. package/template/GEMINI.md +5 -3
  113. package/template/docs/README.md +10 -10
  114. package/template/docs/kit-architecture.md +126 -33
  115. package/template/docs/plans/ideation/README.md +8 -3
  116. package/template/.agent/skills/prd-templates/references/ideation-domain-template.md +0 -55
@@ -1,11 +1,24 @@
1
1
  ---
2
2
  name: regex-patterns
3
- description: "Comprehensive regular expressions guide covering character classes, quantifiers, anchors, groups (capturing, non-capturing, named), lookahead/lookbehind, backreferences, common patterns (email, URL, IP, phone, dates, semver), ReDoS prevention, Unicode support, flags, regex in multiple languages (JS, Python, Go, Rust), debugging techniques, and when NOT to use regex. Use when writing, debugging, or optimizing regular expressions."
4
- version: 1.0.0
3
+ description: "Comprehensive regular expressions guide covering character classes, quantifiers, anchors, groups, lookahead/lookbehind, common patterns (email, URL, IP, phone, dates, semver), ReDoS prevention, Unicode support, flags, debugging, and when NOT to use regex."
4
+ version: 2.0.0
5
5
  ---
6
6
 
7
7
  # Regular Expressions Mastery
8
8
 
9
+ ## Stack-Specific References
10
+
11
+ Regex syntax is mostly universal but API usage differs by language. After reading the patterns below, read the reference for your language:
12
+
13
+ | Language | Reference | Engine Notes |
14
+ |----------|-----------|-------------|
15
+ | JavaScript | `references/javascript.md` | V8/SpiderMonkey — full PCRE-like features |
16
+ | Python | `references/python.md` | `re` module — PCRE-like, supports verbose mode |
17
+ | Go | `references/go.md` | RE2 — no backreferences or lookaround |
18
+ | Rust | `references/rust.md` | regex crate — RE2-like; `fancy-regex` for lookaround |
19
+
20
+ ---
21
+
9
22
  ## 1. Fundamentals
10
23
 
11
24
  ### Character Classes
@@ -25,17 +38,6 @@ version: 1.0.0
25
38
  [0-9a-fA-F] Hexadecimal digit
26
39
  ```
27
40
 
28
- ### Custom Character Classes
29
-
30
- ```
31
- [aeiou] Vowels
32
- [^aeiou] Non-vowels (consonants and non-letters)
33
- [\d\s] Digit or whitespace
34
- [.\-+] Literal dot, hyphen, or plus (hyphen escaped or at start/end)
35
- [\[\]] Literal square brackets
36
- [^\n\r] Any character except newline
37
- ```
38
-
39
41
  ### Quantifiers
40
42
 
41
43
  ```
@@ -46,12 +48,11 @@ version: 1.0.0
46
48
  {2,5} Between 2 and 5
47
49
  {3,} 3 or more
48
50
 
49
- *? Zero or more (lazy -- match as few as possible)
51
+ *? Zero or more (lazy)
50
52
  +? One or more (lazy)
51
- ?? Zero or one (lazy)
52
53
  {2,5}? Between 2 and 5 (lazy)
53
54
 
54
- *+ Zero or more (possessive -- no backtracking, not in all engines)
55
+ *+ Zero or more (possessive; not in all engines)
55
56
  ++ One or more (possessive)
56
57
  ```
57
58
 
@@ -60,7 +61,7 @@ version: 1.0.0
60
61
  ```
61
62
  Input: <b>bold</b> and <b>more bold</b>
62
63
 
63
- <b>.*</b> Greedy: matches "<b>bold</b> and <b>more bold</b>" (one match)
64
+ <b>.*</b> Greedy: matches entire string (one match)
64
65
  <b>.*?</b> Lazy: matches "<b>bold</b>" and "<b>more bold</b>" (two matches)
65
66
  ```
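
The greedy/lazy difference above can be reproduced with Python's standard `re` module. This is a minimal sketch that reuses the input string from the example; the variable names are illustrative only.

```python
import re

text = "<b>bold</b> and <b>more bold</b>"

# Greedy: .* runs to the last </b>, so the whole input is a single match.
greedy = re.findall(r"<b>.*</b>", text)

# Lazy: .*? stops at the first </b> it can reach, giving one match per tag pair.
lazy = re.findall(r"<b>.*?</b>", text)

print(greedy)
print(lazy)
```

A negated class such as `<b>[^<]*</b>` would give the same two matches here without relying on lazy backtracking.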
66
67
 
@@ -69,25 +70,19 @@ Input: <b>bold</b> and <b>more bold</b>
69
70
  ## 2. Anchors and Boundaries
70
71
 
71
72
  ```
72
- ^ Start of string (or start of line with /m flag)
73
- $ End of string (or end of line with /m flag)
73
+ ^ Start of string (or line with multiline flag)
74
+ $ End of string (or line with multiline flag)
74
75
  \b Word boundary (between \w and \W)
75
76
  \B Non-word boundary
76
- \A Start of string (never affected by /m flag -- Python, Ruby)
77
- \z End of string (absolute -- Python, Ruby)
78
- \Z End of string or before final newline (Python, Ruby)
77
+ \A Start of string (never affected by multiline; Python, Ruby)
78
+ \z End of string (absolute; Ruby, PCRE; Python spells this \Z before 3.14)
79
79
  ```
80
80
 
81
81
  ### Word Boundary Examples
82
82
 
83
83
  ```
84
- Pattern: \bcat\b
85
- Matches: "The cat sat" (matches "cat")
86
- No match: "concatenate" (cat is inside a word)
87
-
88
- Pattern: \bpre\w+
89
- Matches: "prefix", "preview", "predict"
90
- No match: "compress" (pre not at word boundary)
84
+ \bcat\b Matches "cat" in "The cat sat" — not in "concatenate"
85
+ \bpre\w+ Matches "prefix", "preview"; not "compress"
91
86
  ```
92
87
 
93
88
  ---
@@ -97,44 +92,27 @@ No match: "compress" (pre not at word boundary)
97
92
  ### Capturing Groups
98
93
 
99
94
  ```
100
- (abc) Capture group -- stores the match for backreference
95
+ (abc) Capture group: stores match for backreference
101
96
  (\d{4}) Capture the year
102
97
  (\w+)@(\w+) Two capture groups: username and domain
103
98
 
104
- # Backreferences (refer to captured group)
105
- (.)\1 Matches repeated character: "aa", "bb", "cc"
106
- (\w+)\s+\1 Matches repeated word: "the the", "is is"
99
+ # Backreferences
100
+ (.)\1 Matches repeated character: "aa", "bb"
101
+ (\w+)\s+\1 Matches repeated word: "the the"
107
102
  ```
108
103
 
109
104
  ### Named Groups
110
105
 
111
- ```javascript
112
- // JavaScript
113
- const re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
114
- const match = re.exec('2025-06-15');
115
- // match.groups.year === '2025'
116
- // match.groups.month === '06'
117
- // match.groups.day === '15'
118
- ```
119
-
120
- ```python
121
- # Python
122
- import re
123
- pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
124
- match = re.match(pattern, '2025-06-15')
125
- # match.group('year') == '2025'
126
- # match.group('month') == '06'
127
- ```
106
+ Named group syntax varies by language:
107
+ - JavaScript: `(?<name>pattern)`
108
+ - Python: `(?P<name>pattern)`
109
+ - Go: `(?P<name>pattern)` (RE2 syntax)
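
As a rough illustration of the Python form of the syntax listed above, the sketch below pulls date parts out by name. The pattern and the `DATE_RE` / `parts` names are chosen for the example and are not part of the skill.

```python
import re

# Python spells named groups as (?P<name>...).
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

match = DATE_RE.fullmatch("2025-06-15")

# groupdict() returns every named group as a plain dict.
parts = match.groupdict() if match else {}
print(parts)
```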
128
110
 
129
111
  ### Non-Capturing Groups
130
112
 
131
113
  ```
132
114
  (?:abc) Groups but does NOT capture (no backreference)
133
-
134
- # Use non-capturing groups for alternation without capturing
135
- (?:com|org|net) Matches "com", "org", or "net" without storing
136
-
137
- # Performance: non-capturing groups are slightly faster
115
+ (?:com|org|net) Matches without storing
138
116
  ```
139
117
 
140
118
  ### Alternation
@@ -149,218 +127,103 @@ cat|dog Matches "cat" or "dog"
149
127
 
150
128
  ## 4. Lookahead and Lookbehind
151
129
 
152
- ### Lookahead (Zero-Width Assertion -- Looks Forward)
130
+ ### Lookahead
153
131
 
154
132
  ```
155
- (?=pattern) Positive lookahead: followed by pattern
156
- (?!pattern) Negative lookahead: NOT followed by pattern
133
+ (?=pattern) Positive: followed by pattern
134
+ (?!pattern) Negative: NOT followed by pattern
157
135
 
158
- # Password validation: at least one digit, one uppercase, one lowercase
136
+ # Password: at least one digit, one uppercase, one lowercase
159
137
  ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$
160
138
 
161
139
  # Match "foo" only if followed by "bar"
162
140
  foo(?=bar) Matches "foo" in "foobar", not in "foobaz"
163
141
 
164
- # Match a number NOT followed by a percent sign
142
+ # Match number NOT followed by percent
165
143
  \d+(?![\d%]) Matches "42" in "42 items", not in "42%"
166
144
  ```
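
One subtlety with a negative lookahead placed after a quantifier: a backtracking engine may give back characters until the lookahead succeeds, so a bare `\d+(?!%)` can still match a shorter digit run in `42%`. The sketch below, using Python's `re`, shows that behaviour next to a stricter variant and also exercises the password pattern. The names `naive`, `strict`, and `PASSWORD_RE` are illustrative.

```python
import re

naive = re.compile(r"\d+(?!%)")
strict = re.compile(r"\d+(?![\d%])")

# The naive form backtracks from "42" to "4", which is followed by "2", not "%".
naive_hit = naive.search("42%")

# The strict form also forbids a following digit, so no shorter run can sneak through.
strict_hit = strict.search("42%")
strict_ok = strict.search("42 items")

# Stacked lookaheads: each one checks a requirement without consuming input.
PASSWORD_RE = re.compile(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$")
strong = bool(PASSWORD_RE.match("Passw0rdX"))
weak = bool(PASSWORD_RE.match("password"))
```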
167
145
 
168
- ### Lookbehind (Zero-Width Assertion -- Looks Backward)
146
+ ### Lookbehind
169
147
 
170
148
  ```
171
- (?<=pattern) Positive lookbehind: preceded by pattern
172
- (?<!pattern) Negative lookbehind: NOT preceded by pattern
149
+ (?<=pattern) Positive: preceded by pattern
150
+ (?<!pattern) Negative: NOT preceded by pattern
173
151
 
174
- # Match digits preceded by a dollar sign
175
152
  (?<=\$)\d+ Matches "50" in "$50", not in "50 items"
176
-
177
- # Match a word NOT preceded by "un"
178
153
  (?<!un)happy Matches "happy" but not "unhappy"
179
154
 
180
155
  # Note: lookbehinds must be fixed-width in most engines
181
- # (?<=ab|abc) -- not allowed in some engines (variable-width)
182
- # JavaScript supports variable-width lookbehind since ES2018
183
- ```
184
-
185
- ### Practical Lookaround Examples
186
-
187
- ```javascript
188
- // Format number with commas: 1234567 -> 1,234,567
189
- '1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
190
- // Result: "1,234,567"
191
-
192
- // Extract values from key=value pairs without capturing the key
193
- const re = /(?<=name=)\w+/g;
194
- 'name=Alice age=30'.match(re);
195
- // Result: ["Alice"]
196
-
197
- // Match word that is not inside quotes
198
- // \b\w+\b(?=(?:[^"]*"[^"]*")*[^"]*$)
156
+ # Go RE2 does NOT support lookaround at all
199
157
  ```
200
158
 
201
159
  ---
202
160
 
203
161
  ## 5. Common Patterns
204
162
 
205
- ### Email (Simplified, RFC-Compliant Enough for Most Uses)
206
-
163
+ ### Email (Simplified)
207
164
  ```
208
165
  ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
209
-
210
- # Breakdown:
211
- # ^ Start of string
212
- # [a-zA-Z0-9._%+\-]+ Local part (letters, digits, dots, etc.)
213
- # @ Literal @
214
- # [a-zA-Z0-9.\-]+ Domain (letters, digits, dots, hyphens)
215
- # \. Literal dot
216
- # [a-zA-Z]{2,} TLD (at least 2 letters)
217
- # $ End of string
218
-
219
- # WARNING: For production, use a library. RFC 5322 email regex is 6,300+ chars.
166
+ # WARNING: For production, use a library. RFC 5322 regex is 6,300+ chars.
220
167
  ```
221
168
 
222
169
  ### URL
223
-
224
170
  ```
225
171
  https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&/=]*)
226
-
227
- # For strict validation, use the URL constructor instead:
228
- # new URL(input) -- throws on invalid URL
172
+ # For strict validation, use the URL parser in your language instead.
229
173
  ```
230
174
 
231
- ### IPv4 Address
232
-
175
+ ### IPv4
233
176
  ```
234
177
  \b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b
235
-
236
- # Breakdown:
237
- # 25[0-5] 250-255
238
- # 2[0-4]\d 200-249
239
- # [01]?\d\d? 0-199
240
- # Repeated 4 times separated by dots
241
178
  ```
242
179
 
243
- ### Phone Number (US)
244
-
180
+ ### Phone (US)
245
181
  ```
246
182
  ^(?:\+1[-.\s]?)?(?:\(?\d{3}\)?[-.\s]?)?\d{3}[-.\s]?\d{4}$
247
-
248
- # Matches:
249
- # (555) 123-4567
250
- # 555-123-4567
251
- # 5551234567
252
- # +1 555 123 4567
253
183
  ```
254
184
 
255
185
  ### Date (YYYY-MM-DD)
256
-
257
186
  ```
258
187
  ^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$
259
-
260
- # Validates format but NOT calendar correctness (e.g., 2025-02-31 matches).
261
- # For real date validation, parse with Date or a date library.
188
+ # Validates format, NOT calendar correctness.
262
189
  ```
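
Because the pattern checks only the shape of the string, a second step is needed for calendar correctness. A minimal sketch in Python, assuming the standard `datetime` module is acceptable for the parse step; `is_real_date` is a hypothetical helper name.

```python
import re
from datetime import date

ISO_DATE_RE = re.compile(r"^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$")

def is_real_date(text: str) -> bool:
    """Return True only if text is YYYY-MM-DD and names a real calendar day."""
    # fullmatch avoids $ matching just before a trailing newline.
    if not ISO_DATE_RE.fullmatch(text):
        return False
    try:
        date.fromisoformat(text)  # raises ValueError for days like February 31
    except ValueError:
        return False
    return True

print(is_real_date("2025-02-28"), is_real_date("2025-02-31"))
```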
263
190
 
264
- ### Semantic Versioning (SemVer)
265
-
191
+ ### Semantic Versioning
266
192
  ```
267
193
  ^(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)(?:-(?:(?:0|[1-9]\d*|\d*[a-zA-Z\-][\da-zA-Z\-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z\-][\da-zA-Z\-]*))*))?(?:\+[\da-zA-Z\-]+(?:\.[\da-zA-Z\-]+)*)?$
268
-
269
- # Matches: 1.0.0, 2.1.3-beta.1, 0.0.1+build.123
270
194
  ```
271
195
 
272
- ### Hex Color Code
273
-
196
+ ### Hex Color
274
197
  ```
275
198
  ^#(?:[0-9a-fA-F]{3}){1,2}$
276
-
277
- # Matches: #fff, #FFF, #a1b2c3, #A1B2C3
278
199
  ```
279
200
 
280
- ### Slug (URL-Safe String)
281
-
201
+ ### Slug
282
202
  ```
283
203
  ^[a-z0-9]+(?:-[a-z0-9]+)*$
284
-
285
- # Matches: "hello-world", "my-blog-post", "v2"
286
- # No match: "-start", "end-", "double--dash"
287
204
  ```
288
205
 
289
206
  ---
290
207
 
291
- ## 6. ReDoS Prevention (Catastrophic Backtracking)
292
-
293
- ### The Problem
208
+ ## 6. ReDoS Prevention
294
209
 
295
- Some regex patterns can take exponential time on certain inputs, causing denial of service.
210
+ ### Dangerous Patterns: O(2^n) on Adversarial Input
296
211
 
297
212
  ```
298
- # DANGEROUS patterns -- O(2^n) on adversarial input
299
213
  (a+)+$ Nested quantifiers
300
214
  (a|a)+$ Overlapping alternation
301
215
  (\w+\s*)+$ Repeated group with optional separator
302
-
303
- # Example attack:
304
- # Pattern: (a+)+$
305
- # Input: "aaaaaaaaaaaaaaaaaaaaaaaa!"
306
- # The engine backtracks exponentially trying to match
307
216
  ```
308
217
 
309
218
  ### Prevention Rules
310
219
 
311
- ```
312
- # 1. NEVER nest quantifiers: (a+)+, (\w+)*, (.*)+
313
- # Fix: Flatten to a single quantifier
314
-
315
- # BAD: (a+)+
316
- # GOOD: a+
317
-
318
- # 2. NEVER use overlapping alternation with quantifiers
319
- # BAD: (a|a)+
320
- # GOOD: a+
321
-
322
- # 3. NEVER use .* inside a repeated group
323
- # BAD: (.*,)+
324
- # GOOD: ([^,]*,)+
325
-
326
- # 4. Use atomic groups or possessive quantifiers when available
327
- # a++b Possessive: never backtracks into a+ match
328
- # (?>a+)b Atomic group: same effect
329
-
330
- # 5. Use negated character classes instead of lazy quantifiers
331
- # BAD: ".*?" (lazy dot-star between quotes)
332
- # GOOD: "[^"]*" (negated class -- no backtracking needed)
333
-
334
- # 6. Set regex timeouts in production
335
- ```
336
-
337
- ### Language-Specific Timeout
338
-
339
- ```javascript
340
- // JavaScript: no built-in timeout, use a wrapper
341
- function safeMatch(str, pattern, timeoutMs = 1000) {
342
- // Run in a worker with a timeout, or use a library like re2
343
- // Node.js: consider the 're2' package (linear-time regex engine)
344
- }
345
- ```
346
-
347
- ```python
348
- # Python: use the 'regex' package with timeout
349
- import regex
350
- try:
351
- regex.match(r'(a+)+$', input_string, timeout=1.0)
352
- except regex.error:
353
- pass # Timed out
354
- ```
355
-
356
- ### Safe Alternatives
357
-
358
- ```
359
- # Use RE2 (linear-time regex engine, no backtracking)
360
- # - Available as npm 're2', Python 'google-re2', Go 'regexp' (uses RE2 by default)
361
- # - RE2 does NOT support backreferences or lookaround
362
- # - Trade-off: fewer features, guaranteed linear performance
363
- ```
220
+ 1. NEVER nest quantifiers: `(a+)+` → flatten to `a+`
221
+ 2. NEVER use overlapping alternation with quantifiers
222
+ 3. NEVER use `.*` inside a repeated group — use `([^,]*,)+` instead
223
+ 4. Use atomic groups or possessive quantifiers when available
224
+ 5. Use negated character classes `[^"]*` over lazy `.*?`
225
+ 6. Set regex timeouts in production
226
+ 7. Consider RE2 (linear-time, no backtracking) for user inputs
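
Rules 1 and 5 can be sketched in Python as follows. The inputs are deliberately short so the nested pattern stays cheap; the sample strings and variable names are made up for the illustration.

```python
import re

nested = re.compile(r"^(a+)+$")  # rule 1 violation: nested quantifiers
flat = re.compile(r"^a+$")       # flattened form that accepts the same strings

samples = ["a", "aaaa", "", "aab", "b"]

# Both patterns agree on every sample, so flattening loses nothing here.
same_language = all(bool(nested.match(s)) == bool(flat.match(s)) for s in samples)

# Rule 5: a negated class finds quoted strings without lazy backtracking.
quoted = re.compile(r'"[^"]*"')
found = quoted.findall('say "hi" and "bye"')

print(same_language, found)
```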
364
227
 
365
228
  ---
366
229
 
@@ -368,384 +231,103 @@ except regex.error:
368
231
 
369
232
  ### Unicode Categories (\p{})
370
233
 
371
- ```javascript
372
- // Requires /u or /v flag in JavaScript
373
-
374
- // Match any letter (any script)
375
- /\p{L}+/u // "Hello", "Bonjour", etc.
376
-
377
- // Match any number
378
- /\p{N}+/u // "123", etc.
379
-
380
- // Match any punctuation
381
- /\p{P}/u
382
-
383
- // Match emoji (requires /v flag in JS)
384
- /\p{Emoji}/v
385
-
386
- // Match specific scripts
387
- /\p{Script=Greek}+/u
388
- /\p{Script=Han}+/u
389
- /\p{Script=Arabic}+/u
390
-
391
- // Common categories:
392
- // \p{L} Letter (any script)
393
- // \p{Lu} Uppercase letter
394
- // \p{Ll} Lowercase letter
395
- // \p{N} Number
396
- // \p{Nd} Decimal digit
397
- // \p{P} Punctuation
398
- // \p{S} Symbol
399
- // \p{Z} Separator (spaces)
400
- // \p{M} Mark (combining characters)
401
234
  ```
235
+ \p{L} Letter (any script)
236
+ \p{Lu} Uppercase letter
237
+ \p{Ll} Lowercase letter
238
+ \p{N} Number
239
+ \p{Nd} Decimal digit
240
+ \p{P} Punctuation
241
+ \p{S} Symbol
242
+ \p{Z} Separator (spaces)
243
+ \p{M} Mark (combining characters)
402
244
 
403
- ### Unicode-Aware Patterns
404
-
405
- ```javascript
406
- // BAD: ASCII-only word matching
407
- /[a-zA-Z]+/
408
-
409
- // GOOD: Unicode-aware word matching
410
- /[\p{L}\p{M}]+/u
411
-
412
- // BAD: ASCII-only digit matching
413
- /[0-9]+/
414
-
415
- // GOOD: Unicode-aware digit matching (includes Arabic-Indic, etc.)
416
- /\p{Nd}+/u
417
-
418
- // Match a "word" in any language
419
- /\b\p{L}+\b/gu
245
+ # Script-specific
246
+ \p{Script=Greek}
247
+ \p{Script=Han}
248
+ \p{Script=Arabic}
420
249
  ```
421
250
 
251
+ > **Note:** Unicode property support varies by engine. JavaScript requires the `u` flag. Go's RE2 supports `\p{L}` and similar classes. Python's `re` does NOT support `\p{}`; use the third-party `regex` package.
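
When only the standard library is available in Python, a common workaround is to carve letters out of `\w`. The sketch below is an approximation of `\p{L}+`, not an exact equivalent (it may admit a few numeric characters that are not decimal digits); `LETTERS_RE` is an illustrative name.

```python
import re

# \w minus digits and underscore: roughly "any letter" for str patterns.
LETTERS_RE = re.compile(r"[^\W\d_]+")

# Escapes are used so the example does not depend on source-file encoding.
sample = "h\u00e9llo w\u00f6rld 123 \u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac"
words = LETTERS_RE.findall(sample)
print(words)
```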
252
+
422
253
  ---
423
254
 
424
255
  ## 8. Flags
425
256
 
426
257
  ```
427
- g Global: find all matches, not just the first
258
+ g Global: find all matches
428
259
  i Case-insensitive
429
- m Multiline: ^ and $ match line boundaries, not just string boundaries
260
+ m Multiline: ^ and $ match line boundaries
430
261
  s DotAll: . matches newline characters
431
- u Unicode: enables \p{}, correct handling of surrogate pairs
432
- v UnicodeSets: extended Unicode support (JS, newer)
433
- d HasIndices: capture group start/end indices (JS)
434
- y Sticky: match at exact position (lastIndex)
435
- x Extended/Verbose: ignore whitespace and allow comments (Python, Ruby, PCRE)
436
- ```
437
-
438
- ### Extended Mode (Verbose Regex)
439
-
440
- ```python
441
- # Python: re.VERBOSE or re.X
442
- import re
443
-
444
- pattern = re.compile(r"""
445
- ^ # Start of string
446
- (?P<protocol>https?) # Protocol (http or https)
447
- :// # Separator
448
- (?P<domain> # Domain group
449
- [a-zA-Z0-9.-]+ # Domain name
450
- \.[a-zA-Z]{2,} # TLD
451
- )
452
- (?P<path>/\S*)? # Optional path
453
- $ # End of string
454
- """, re.VERBOSE)
455
- ```
456
-
457
- ---
458
-
459
- ## 9. Regex in Different Languages
460
-
461
- ### JavaScript
462
-
463
- ```javascript
464
- // Literal syntax
465
- const re = /^hello\s+(\w+)$/i;
466
-
467
- // Constructor (for dynamic patterns)
468
- const pattern = 'hello';
469
- const re2 = new RegExp(`^${escapeRegExp(pattern)}$`, 'i');
470
-
471
- // Methods
472
- const match = str.match(re); // Returns match array or null
473
- const allMatches = [...str.matchAll(/\d+/g)]; // Iterator of all matches
474
- const replaced = str.replace(/foo/g, 'bar');
475
- const parts = str.split(/[,;\s]+/);
476
- const isValid = re.test(str); // Returns boolean
477
-
478
- // Named groups
479
- const { groups } = /(?<year>\d{4})-(?<month>\d{2})/.exec('2025-06');
480
- // groups.year === '2025'
481
-
482
- // Escape special characters for use in RegExp constructor
483
- function escapeRegExp(str) {
484
- return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
485
- }
486
- ```
487
-
488
- ### Python
489
-
490
- ```python
491
- import re
492
-
493
- # Compile for reuse
494
- pattern = re.compile(r'^hello\s+(\w+)$', re.IGNORECASE)
495
-
496
- # Methods
497
- match = pattern.match(string) # Match at start
498
- match = pattern.search(string) # Search anywhere
499
- matches = pattern.findall(string) # All matches (list of strings/tuples)
500
- matches = pattern.finditer(string) # Iterator of Match objects
501
- result = pattern.sub(r'bar', string) # Replace
502
- parts = pattern.split(string) # Split
503
-
504
- # Match object
505
- if match:
506
- match.group(0) # Full match
507
- match.group(1) # First capture group
508
- match.group('name') # Named group
509
- match.start() # Start position
510
- match.end() # End position
511
-
512
- # Raw strings: always use r'...' for regex patterns in Python
513
- # r'\n' is a literal backslash-n, not a newline
514
- ```
515
-
516
- ### Go
517
-
518
- ```go
519
- package main
520
-
521
- import (
522
- "fmt"
523
- "regexp"
524
- )
525
-
526
- func main() {
527
- // Compile (returns error if invalid)
528
- re, err := regexp.Compile(`^hello\s+(\w+)$`)
529
- if err != nil {
530
- panic(err)
531
- }
532
-
533
- // MustCompile panics on invalid pattern (use for constants)
534
- re = regexp.MustCompile(`\d+`)
535
-
536
- // Methods
537
- matched := re.MatchString("hello world") // bool
538
- result := re.FindString("abc 123 def") // "123"
539
- allResults := re.FindAllString("a1 b2 c3", -1) // ["1", "2", "3"]
540
- submatch := re.FindStringSubmatch("hello world") // ["hello world", "world"]
541
- replaced := re.ReplaceAllString("foo 123", "NUM") // "foo NUM"
542
-
543
- // Note: Go uses RE2 engine -- no backreferences or lookaround
544
- fmt.Println(matched, result, allResults, submatch, replaced)
545
- }
546
- ```
547
-
548
- ### Rust
549
-
550
- ```rust
551
- use regex::Regex;
552
-
553
- fn main() {
554
- // Compile
555
- let re = Regex::new(r"^hello\s+(\w+)$").unwrap();
556
-
557
- // Methods
558
- let is_match = re.is_match("hello world"); // bool
559
- let caps = re.captures("hello world").unwrap();
560
- let name = &caps[1]; // "world"
561
-
562
- // Named captures
563
- let re = Regex::new(r"(?P<year>\d{4})-(?P<month>\d{2})").unwrap();
564
- let caps = re.captures("2025-06").unwrap();
565
- let year = &caps["year"]; // "2025"
566
-
567
- // Find all matches
568
- let re = Regex::new(r"\d+").unwrap();
569
- let matches: Vec<&str> = re.find_iter("a1 b2 c3").map(|m| m.as_str()).collect();
570
- // ["1", "2", "3"]
571
-
572
- // Replace
573
- let result = re.replace_all("foo 123 bar 456", "NUM");
574
- // "foo NUM bar NUM"
575
-
576
- // Note: Rust regex crate uses finite automata (like RE2)
577
- // No backreferences or lookaround by default
578
- // Use the 'fancy-regex' crate for those features
579
- }
262
+ u Unicode: enables \p{}, correct surrogate pair handling
263
+ x Extended/Verbose: ignore whitespace, allow comments (Python, Ruby, PCRE)
264
+ y Sticky: match at exact position (JavaScript)
580
265
  ```
581
266
 
582
267
  ---
583
268
 
584
- ## 10. Debugging Regex
585
-
586
- ### Techniques
269
+ ## 9. Debugging
587
270
 
588
- ```
589
- 1. Use regex101.com (supports PCRE, Python, JavaScript, Go, Rust)
590
- - Shows match highlighting in real time
591
- - Explains each token
592
- - Shows capture groups
593
- - Has a debugger that shows backtracking steps
594
-
595
- 2. Break complex patterns into named pieces:
596
-
597
- JavaScript:
598
- const year = '(?<year>\\d{4})';
599
- const month = '(?<month>0[1-9]|1[0-2])';
600
- const day = '(?<day>0[1-9]|[12]\\d|3[01])';
601
- const datePattern = new RegExp(`^${year}-${month}-${day}$`);
271
+ 1. Use [regex101.com](https://regex101.com) — supports multiple engines
272
+ 2. Break complex patterns into named pieces and compose them
273
+ 3. Test incrementally: start simple, add one element at a time
274
+ 4. Test edge cases: empty string, very long string, Unicode, newlines
275
+ 5. Log intermediate results (capture groups)
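
Step 2 might look like this in Python: each piece is a small raw string that can be tested on its own, then composed into the full pattern. The piece names (`YEAR`, `MONTH`, `DAY`) are illustrative.

```python
import re

# Each piece is small enough to test in isolation.
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>0[1-9]|1[0-2])"
DAY = r"(?P<day>0[1-9]|[12]\d|3[01])"

# Compose the pieces; the rf-string interpolates them without touching backslashes.
DATE_RE = re.compile(rf"^{YEAR}-{MONTH}-{DAY}$")

match = DATE_RE.match("2025-06-15")
if match:
    # Step 5: log the captured pieces while debugging.
    print(match.group("year"), match.group("month"), match.group("day"))
```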
602
276
 
603
- 3. Test incrementally:
604
- - Start with the simplest subpattern
605
- - Add one element at a time
606
- - Test with both matching and non-matching inputs
277
+ ### Common Mistakes
607
278
 
608
- 4. Check edge cases:
609
- - Empty string
610
- - Very long string
611
- - Unicode characters
612
- - Newlines
613
- - String with only whitespace
614
-
615
- 5. Log intermediate results:
616
- const re = /(\w+)\s+(\w+)/;
617
- const match = re.exec(input);
618
- console.log('Full match:', match[0]);
619
- console.log('Group 1:', match[1]);
620
- console.log('Group 2:', match[2]);
621
279
  ```
280
+ /file.txt/ # Dot matches anything — matches "file_txt"
+ /file\.txt/ # Correct: escaped dot
 
- ### Common Debugging Mistakes
-
- ```
- # Forgetting to escape special characters
- /file.txt/ # Matches "file_txt" too (dot matches anything)
- /file\.txt/ # Correct: matches only "file.txt"
-
- # Forgetting anchors
  /\d{3}/ # Matches "123" inside "abc12345def"
- /^\d{3}$/ # Matches only strings that are exactly 3 digits
+ /^\d{3}$/ # Correct: anchored to exact 3 digits
 
- # Greedy matching eating too much
- /<.*>/ # On "<a>text</a>", matches the entire string
- /<[^>]*>/ # Correct: matches "<a>" and "</a>" separately
-
- # Multiline misunderstanding
- /^line$/ # Only matches if entire string is "line"
- /^line$/m # Matches "line" on any line in a multiline string
+ /<.*>/ # Greedy: matches entire "<a>text</a>"
+ /<[^>]*>/ # Correct: negated class
  ```
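
The three mistakes above can be reproduced directly in JavaScript. This is a small sketch using the same patterns; the variable names are illustrative only.

```javascript
// Unescaped dot: "." stands for any character except a line terminator.
const dotUnescaped = /file.txt/.test('file_txt');  // true
const dotEscaped = /file\.txt/.test('file_txt');   // false

// Missing anchors: the pattern is satisfied by a substring.
const unanchored = /\d{3}/.test('abc12345def');    // true
const anchored = /^\d{3}$/.test('abc12345def');    // false

// Greedy quantifier versus negated character class.
const html = '<a>text</a>';
const greedy = html.match(/<.*>/g);      // one match spanning the whole string
const negated = html.match(/<[^>]*>/g);  // two matches: the opening and closing tag

console.log({ dotUnescaped, dotEscaped, unanchored, anchored, greedy, negated });
```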
 
  ---
 
- ## 11. When NOT to Use Regex
-
- ### HTML/XML Parsing
-
- ```
- # NEVER parse HTML with regex. Use a DOM parser.
-
- # BAD:
- /<div class="title">(.*?)<\/div>/
-
- # Why it fails:
- # - Nested divs
- # - Attributes in different orders
- # - Self-closing tags
- # - Comments containing tags
- # - Malformed HTML
-
- # GOOD:
- # JavaScript: DOMParser, cheerio
- # Python: BeautifulSoup, lxml
- ```
-
- ### Complex Grammars
-
- ```
- # Regex cannot parse:
- # - Nested parentheses of arbitrary depth: ((()))
- # - Matching braces in programming languages
- # - JSON, YAML, TOML
- # - Programming language syntax
- # - Mathematical expressions
-
- # Use a parser (PEG, ANTLR, tree-sitter) for these.
- ```
-
- ### Simple String Operations
-
- ```
- # Do not use regex when simple string methods work:
-
- # BAD: /^prefix/.test(str)
- # GOOD: str.startsWith('prefix')
-
- # BAD: /suffix$/.test(str)
- # GOOD: str.endsWith('suffix')
-
- # BAD: str.replace(/foo/g, 'bar') (for literal strings)
- # GOOD: str.replaceAll('foo', 'bar')
-
- # BAD: /^$/.test(str)
- # GOOD: str.length === 0
+ ## 10. When NOT to Use Regex
 
- # BAD: str.split(/,/)
- # GOOD: str.split(',') (for literal delimiters)
- ```
-
- ### URL Parsing
+ ### Use a Parser Instead
+ - HTML/XML — use a DOM parser (cheerio, BeautifulSoup, lxml)
+ - JSON, YAML, TOML — use a parser
+ - Nested structures — regex cannot handle arbitrary depth
+ - Programming language syntax — use tree-sitter, ANTLR
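
To make the JSON case concrete, here is a rough sketch comparing a naive regex extraction with `JSON.parse`. The payload and the field names are invented for the example.

```javascript
// Hypothetical payload: the name contains escaped quotes, which is valid JSON.
const payload = '{"user": {"name": "Ada \\"the first\\"", "tags": ["a", "b"]}}';

// Naive regex: stops at the first quote character, even though it is escaped.
const regexMatch = /"name":\s*"([^"]*)"/.exec(payload);
const viaRegex = regexMatch ? regexMatch[1] : null;

// Parser: understands escapes and nesting.
const viaParser = JSON.parse(payload).user.name;

console.log({ viaRegex, viaParser });
```

The regex result is cut off at the escaped quote, while the parser returns the full string.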
 
+ ### Use String Methods Instead
  ```
- # Do not regex URLs. Use the URL API.
-
- # BAD:
- const match = url.match(/^(https?):\/\/([^\/]+)(\/.*)?$/);
-
- # GOOD:
- const parsed = new URL(url);
- // parsed.protocol, parsed.hostname, parsed.pathname, parsed.searchParams
+ # Prefer:
+ startsWith('prefix') over /^prefix/.test(str)
+ endsWith('suffix') over /suffix$/.test(str)
+ includes('text') over /text/.test(str)
+ split(',') over split(/,/)
  ```
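
One practical reason to prefer string methods is that they treat the search text literally, so metacharacters need no escaping. A brief sketch with made-up sample strings:

```javascript
const expr = '1+1=2';

// String method: "1+1" is searched for literally.
const byMethod = expr.includes('1+1');

// Regex: "+" is a quantifier here, so the pattern looks for "11" instead.
const byRegex = /1+1/.test(expr);

// For a plain delimiter, both forms split identically; the string form is simply clearer.
const fields = 'a,b,c'.split(',');

console.log({ byMethod, byRegex, fields });
```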
 
- ### When Regex IS the Right Tool
-
- ```
- # Regex excels at:
- # - Pattern matching in text (log parsing, data extraction)
- # - Input validation (format checks)
- # - Search and replace with pattern awareness
- # - Tokenizing simple grammars (CSV fields, log lines)
- # - Text cleanup (normalize whitespace, strip control characters)
- # - Splitting on complex delimiters
- ```
+ ### When Regex IS Right
+ - Pattern matching in text (log parsing, data extraction)
+ - Input validation (format checks)
+ - Search and replace with pattern awareness
+ - Tokenizing simple grammars (CSV, log lines)
+ - Text cleanup (normalize whitespace, strip control characters)
 
  ---
 
- ## 12. Critical Reminders
+ ## 11. Critical Reminders
 
  ### ALWAYS
-
- - Use raw strings (Python: `r'...'`) to avoid double-escaping backslashes
- - Anchor patterns with `^` and `$` when validating entire strings
- - Use non-capturing groups `(?:...)` when you do not need the captured value
- - Prefer negated character classes `[^"]*` over lazy quantifiers `.*?` for performance
- - Test with adversarial inputs to check for ReDoS
- - Use named groups for readability in complex patterns
- - Escape user input before inserting into regex (prevent injection)
- - Use the `/u` flag in JavaScript for Unicode correctness
- - Comment complex regex patterns (use verbose mode or build from named parts)
+ - Anchor with `^` and `$` when validating entire strings
+ - Use non-capturing groups `(?:...)` when you don't need the value
+ - Prefer negated character classes `[^"]*` over lazy `.*?`
+ - Test with adversarial inputs for ReDoS
+ - Escape user input before inserting into regex
+ - Use Unicode flag for international content
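
The escaping reminder can be sketched as follows. The helper names `escapeRegExp` and `containsLiteral` are assumptions for this example, not part of the package; the character class covers the regex syntax characters that would otherwise be interpreted as operators.

```javascript
// Hypothetical helper: backslash-escape every regex syntax character in the input.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: search for user-supplied text literally, with the Unicode flag set.
function containsLiteral(haystack, needle) {
  return new RegExp(escapeRegExp(needle), 'u').test(haystack);
}

const userInput = '$5.00';
console.log(containsLiteral('price is $5.00', userInput)); // escaped: treated as plain text
console.log(new RegExp(userInput).test('price is $5.00')); // unescaped: "$" and "." act as operators
```

For a purely literal search like this one, `String.prototype.includes` would also do; escaping matters when the user text is embedded in a larger pattern.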
 
  ### NEVER
-
- - Parse HTML, XML, JSON, or any recursive grammar with regex
- - Use nested quantifiers without understanding backtracking: `(a+)+`
- - Trust regex for email validation beyond basic format checks (use a library)
- - Hardcode regex for international content without Unicode support
- - Use `.` when you mean a specific character class (be explicit)
- - Use `.*` in a pattern that processes user input without backtracking protection
- - Forget that `^` and `$` behave differently with and without the `/m` flag
- - Build regex from untrusted user input without escaping special characters
+ - Parse HTML/XML/JSON with regex
+ - Use nested quantifiers without understanding backtracking
+ - Trust regex for email validation beyond basic format
+ - Use `.` when you mean a specific character class
+ - Build regex from untrusted input without escaping