cfsa-antigravity 2.0.0 → 2.2.0
- package/README.md +14 -0
- package/package.json +1 -1
- package/template/.agent/instructions/commands.md +8 -32
- package/template/.agent/instructions/example.md +21 -0
- package/template/.agent/instructions/patterns.md +3 -3
- package/template/.agent/instructions/tech-stack.md +71 -23
- package/template/.agent/instructions/workflow.md +12 -1
- package/template/.agent/rules/completion-checklist.md +6 -0
- package/template/.agent/rules/security-first.md +3 -3
- package/template/.agent/rules/vertical-slices.md +1 -1
- package/template/.agent/skill-library/MANIFEST.md +6 -0
- package/template/.agent/skill-library/stack/devops/git-advanced/SKILL.md +972 -0
- package/template/.agent/skill-library/stack/devops/git-workflow/SKILL.md +420 -0
- package/template/.agent/skills/api-versioning/SKILL.md +44 -298
- package/template/.agent/skills/api-versioning/references/typescript.md +157 -0
- package/template/.agent/skills/architecture-mapping/SKILL.md +13 -13
- package/template/.agent/skills/bootstrap-agents/SKILL.md +151 -152
- package/template/.agent/skills/clean-code/SKILL.md +64 -118
- package/template/.agent/skills/clean-code/references/typescript.md +126 -0
- package/template/.agent/skills/database-schema-design/SKILL.md +93 -317
- package/template/.agent/skills/database-schema-design/references/relational.md +228 -0
- package/template/.agent/skills/error-handling-patterns/SKILL.md +62 -557
- package/template/.agent/skills/error-handling-patterns/references/go.md +162 -0
- package/template/.agent/skills/error-handling-patterns/references/python.md +262 -0
- package/template/.agent/skills/error-handling-patterns/references/rust.md +112 -0
- package/template/.agent/skills/error-handling-patterns/references/typescript.md +178 -0
- package/template/.agent/skills/idea-extraction/SKILL.md +322 -224
- package/template/.agent/skills/logging-best-practices/SKILL.md +108 -767
- package/template/.agent/skills/logging-best-practices/references/go.md +49 -0
- package/template/.agent/skills/logging-best-practices/references/python.md +52 -0
- package/template/.agent/skills/logging-best-practices/references/typescript.md +215 -0
- package/template/.agent/skills/migration-management/SKILL.md +127 -311
- package/template/.agent/skills/migration-management/references/relational.md +214 -0
- package/template/.agent/skills/parallel-feature-development/SKILL.md +34 -43
- package/template/.agent/skills/pipeline-rubrics/references/be-rubric.md +1 -1
- package/template/.agent/skills/pipeline-rubrics/references/ia-rubric.md +2 -2
- package/template/.agent/skills/pipeline-rubrics/references/scoring.md +1 -1
- package/template/.agent/skills/pipeline-rubrics/references/vision-rubric.md +2 -1
- package/template/.agent/skills/prd-templates/SKILL.md +23 -6
- package/template/.agent/skills/prd-templates/references/be-spec-template.md +2 -2
- package/template/.agent/skills/prd-templates/references/decomposition-templates.md +2 -2
- package/template/.agent/skills/prd-templates/references/engineering-standards-template.md +2 -0
- package/template/.agent/skills/prd-templates/references/fe-spec-template.md +1 -1
- package/template/.agent/skills/prd-templates/references/fractal-cx-template.md +58 -0
- package/template/.agent/skills/prd-templates/references/fractal-feature-template.md +93 -0
- package/template/.agent/skills/prd-templates/references/fractal-node-index-template.md +55 -0
- package/template/.agent/skills/prd-templates/references/ideation-crosscut-template.md +26 -47
- package/template/.agent/skills/prd-templates/references/ideation-index-template.md +47 -31
- package/template/.agent/skills/prd-templates/references/operational-templates.md +1 -1
- package/template/.agent/skills/prd-templates/references/placeholder-workflow-mapping.md +50 -21
- package/template/.agent/skills/prd-templates/references/skill-loading-protocol.md +32 -0
- package/template/.agent/skills/prd-templates/references/slice-completion-gates.md +29 -0
- package/template/.agent/skills/prd-templates/references/spec-coverage-sweep.md +3 -3
- package/template/.agent/skills/prd-templates/references/tdd-testing-policy.md +39 -0
- package/template/.agent/skills/prd-templates/references/vision-template.md +8 -8
- package/template/.agent/skills/regex-patterns/SKILL.md +122 -540
- package/template/.agent/skills/regex-patterns/references/go.md +44 -0
- package/template/.agent/skills/regex-patterns/references/javascript.md +63 -0
- package/template/.agent/skills/regex-patterns/references/python.md +77 -0
- package/template/.agent/skills/regex-patterns/references/rust.md +43 -0
- package/template/.agent/skills/resolve-ambiguity/SKILL.md +1 -1
- package/template/.agent/skills/session-continuity/SKILL.md +11 -9
- package/template/.agent/skills/session-continuity/protocols/02-progress-generation.md +2 -2
- package/template/.agent/skills/session-continuity/protocols/04-pattern-extraction.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/05-session-close.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/09-parallel-claim.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/10-placeholder-verification-gate.md +57 -78
- package/template/.agent/skills/session-continuity/protocols/11-parallel-synthesis.md +1 -1
- package/template/.agent/skills/spec-writing/SKILL.md +1 -1
- package/template/.agent/skills/tdd-workflow/SKILL.md +94 -317
- package/template/.agent/skills/tdd-workflow/references/typescript.md +231 -0
- package/template/.agent/skills/testing-strategist/SKILL.md +74 -687
- package/template/.agent/skills/testing-strategist/references/typescript.md +328 -0
- package/template/.agent/skills/workflow-automation/SKILL.md +62 -154
- package/template/.agent/skills/workflow-automation/references/inngest.md +88 -0
- package/template/.agent/skills/workflow-automation/references/temporal.md +64 -0
- package/template/.agent/workflows/bootstrap-agents-fill.md +85 -143
- package/template/.agent/workflows/bootstrap-agents-provision.md +90 -107
- package/template/.agent/workflows/create-prd-architecture.md +23 -16
- package/template/.agent/workflows/create-prd-compile.md +11 -12
- package/template/.agent/workflows/create-prd-design-system.md +1 -1
- package/template/.agent/workflows/create-prd-security.md +9 -11
- package/template/.agent/workflows/create-prd-stack.md +10 -4
- package/template/.agent/workflows/create-prd.md +9 -9
- package/template/.agent/workflows/decompose-architecture-structure.md +4 -6
- package/template/.agent/workflows/decompose-architecture-validate.md +18 -1
- package/template/.agent/workflows/decompose-architecture.md +18 -3
- package/template/.agent/workflows/evolve-contract.md +11 -11
- package/template/.agent/workflows/evolve-feature-classify.md +14 -6
- package/template/.agent/workflows/ideate-discover.md +72 -107
- package/template/.agent/workflows/ideate-extract.md +84 -63
- package/template/.agent/workflows/ideate-validate.md +26 -22
- package/template/.agent/workflows/ideate.md +9 -9
- package/template/.agent/workflows/implement-slice-setup.md +25 -23
- package/template/.agent/workflows/implement-slice-tdd.md +73 -89
- package/template/.agent/workflows/implement-slice.md +4 -4
- package/template/.agent/workflows/plan-phase-preflight.md +6 -2
- package/template/.agent/workflows/plan-phase-write.md +6 -8
- package/template/.agent/workflows/remediate-pipeline-assess.md +2 -1
- package/template/.agent/workflows/resolve-ambiguity.md +2 -2
- package/template/.agent/workflows/update-architecture-map.md +22 -5
- package/template/.agent/workflows/validate-phase-quality.md +155 -0
- package/template/.agent/workflows/validate-phase-readiness.md +167 -0
- package/template/.agent/workflows/validate-phase.md +19 -157
- package/template/.agent/workflows/verify-infrastructure.md +10 -10
- package/template/.agent/workflows/write-architecture-spec-design.md +23 -14
- package/template/.agent/workflows/write-be-spec-classify.md +25 -21
- package/template/.agent/workflows/write-be-spec.md +1 -1
- package/template/.agent/workflows/write-fe-spec-classify.md +6 -12
- package/template/.agent/workflows/write-fe-spec-write.md +1 -1
- package/template/AGENTS.md +6 -2
- package/template/GEMINI.md +5 -3
- package/template/docs/README.md +10 -10
- package/template/docs/kit-architecture.md +126 -33
- package/template/docs/plans/ideation/README.md +8 -3
- package/template/.agent/skills/prd-templates/references/ideation-domain-template.md +0 -55
@@ -1,11 +1,24 @@
 ---
 name: regex-patterns
-description: "Comprehensive regular expressions guide covering character classes, quantifiers, anchors, groups
-version:
+description: "Comprehensive regular expressions guide covering character classes, quantifiers, anchors, groups, lookahead/lookbehind, common patterns (email, URL, IP, phone, dates, semver), ReDoS prevention, Unicode support, flags, debugging, and when NOT to use regex."
+version: 2.0.0
 ---
 
 # Regular Expressions Mastery
 
+## Stack-Specific References
+
+Regex syntax is mostly universal but API usage differs by language. After reading the patterns below, read the reference for your language:
+
+| Language | Reference | Engine Notes |
+|----------|-----------|-------------|
+| JavaScript | `references/javascript.md` | V8/SpiderMonkey — full PCRE-like features |
+| Python | `references/python.md` | `re` module — PCRE-like, supports verbose mode |
+| Go | `references/go.md` | RE2 — no backreferences or lookaround |
+| Rust | `references/rust.md` | regex crate — RE2-like; `fancy-regex` for lookaround |
+
+---
+
 ## 1. Fundamentals
 
 ### Character Classes
@@ -25,17 +38,6 @@ version: 1.0.0
 [0-9a-fA-F] Hexadecimal digit
 ```
 
-### Custom Character Classes
-
-```
-[aeiou] Vowels
-[^aeiou] Non-vowels (consonants and non-letters)
-[\d\s] Digit or whitespace
-[.\-+] Literal dot, hyphen, or plus (hyphen escaped or at start/end)
-[\[\]] Literal square brackets
-[^\n\r] Any character except newline
-```
-
 ### Quantifiers
 
 ```
@@ -46,12 +48,11 @@ version: 1.0.0
 {2,5} Between 2 and 5
 {3,} 3 or more
 
-*? Zero or more (lazy
+*? Zero or more (lazy)
 +? One or more (lazy)
-?? Zero or one (lazy)
 {2,5}? Between 2 and 5 (lazy)
 
-*+ Zero or more (possessive
+*+ Zero or more (possessive — not in all engines)
 ++ One or more (possessive)
 ```
 
@@ -60,7 +61,7 @@ version: 1.0.0
 ```
 Input: <b>bold</b> and <b>more bold</b>
 
-<b>.*</b> Greedy: matches
+<b>.*</b> Greedy: matches entire string (one match)
 <b>.*?</b> Lazy: matches "<b>bold</b>" and "<b>more bold</b>" (two matches)
 ```
 
@@ -69,25 +70,19 @@ Input: <b>bold</b> and <b>more bold</b>
 ## 2. Anchors and Boundaries
 
 ```
-^ Start of string (or
-$ End of string (or
+^ Start of string (or line with multiline flag)
+$ End of string (or line with multiline flag)
 \b Word boundary (between \w and \W)
 \B Non-word boundary
-\A Start of string (never affected by
-\z End of string (absolute
-\Z End of string or before final newline (Python, Ruby)
+\A Start of string (never affected by multiline — Python, Ruby)
+\z End of string (absolute — Python, Ruby)
 ```
 
 ### Word Boundary Examples
 
 ```
-
-Matches
-No match: "concatenate" (cat is inside a word)
-
-Pattern: \bpre\w+
-Matches: "prefix", "preview", "predict"
-No match: "compress" (pre not at word boundary)
+\bcat\b Matches "cat" in "The cat sat" — not in "concatenate"
+\bpre\w+ Matches "prefix", "preview" — not "compress"
 ```
 
 ---
@@ -97,44 +92,27 @@ No match: "compress" (pre not at word boundary)
 ### Capturing Groups
 
 ```
-(abc) Capture group
+(abc) Capture group — stores match for backreference
 (\d{4}) Capture the year
 (\w+)@(\w+) Two capture groups: username and domain
 
-# Backreferences
-(.)\1 Matches repeated character: "aa", "bb"
-(\w+)\s+\1 Matches repeated word: "the the"
+# Backreferences
+(.)\1 Matches repeated character: "aa", "bb"
+(\w+)\s+\1 Matches repeated word: "the the"
 ```
 
 ### Named Groups
 
-
-
-
-
-// match.groups.year === '2025'
-// match.groups.month === '06'
-// match.groups.day === '15'
-```
-
-```python
-# Python
-import re
-pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
-match = re.match(pattern, '2025-06-15')
-# match.group('year') == '2025'
-# match.group('month') == '06'
-```
+Named group syntax varies by language:
+- JavaScript: `(?<name>pattern)`
+- Python: `(?P<name>pattern)`
+- Go: `(?P<name>pattern)` (RE2 syntax)
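As a rough illustration of the Python form in the list above, the sketch below pulls the two parts out of a `user@domain` string with named groups. The sample text and the names `EMAIL_PARTS`, `user`, and `domain` are illustrative assumptions, not taken from the package.

```python
import re

# Python's built-in `re` module spells named groups as (?P<name>...).
EMAIL_PARTS = re.compile(r"(?P<user>\w+)@(?P<domain>\w+)")

match = EMAIL_PARTS.search("contact: alice@example")

# groupdict() returns every named group as a plain dictionary.
parts = match.groupdict() if match else {}
print(parts)
```

Reading groups by name rather than by index keeps the calling code stable if another group is later added to the pattern.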
 
 ### Non-Capturing Groups
 
 ```
 (?:abc) Groups but does NOT capture (no backreference)
-
-# Use non-capturing groups for alternation without capturing
-(?:com|org|net) Matches "com", "org", or "net" without storing
-
-# Performance: non-capturing groups are slightly faster
+(?:com|org|net) Matches without storing
 ```
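The difference between capturing, backreferencing, and non-capturing groups is easiest to see in the return values. A minimal Python sketch follows; the sample strings and variable names are assumptions chosen for illustration.

```python
import re

# Backreference: \1 must repeat exactly what group 1 captured.
repeated_word = re.compile(r"\b(\w+)\s+\1\b")
dup = repeated_word.search("Paris in the the spring")
doubled = dup.group(1) if dup else None

sample = "example.com example.org"

# With a capturing group, findall returns only the group's text.
capturing = re.findall(r"example\.(com|org|net)", sample)

# With a non-capturing group, findall returns the whole match.
non_capturing = re.findall(r"example\.(?:com|org|net)", sample)

print(doubled, capturing, non_capturing)
```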
 
 ### Alternation
@@ -149,218 +127,103 @@ cat|dog Matches "cat" or "dog"
 
 ## 4. Lookahead and Lookbehind
 
-### Lookahead
+### Lookahead
 
 ```
-(?=pattern) Positive
-(?!pattern) Negative
+(?=pattern) Positive: followed by pattern
+(?!pattern) Negative: NOT followed by pattern
 
-# Password
+# Password: at least one digit, one uppercase, one lowercase
 ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$
 
 # Match "foo" only if followed by "bar"
 foo(?=bar) Matches "foo" in "foobar", not in "foobaz"
 
-# Match
+# Match number NOT followed by percent
 \d+(?!%) Matches "42" in "42 items", not in "42%"
 ```
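The lookahead patterns above can be checked with a short Python sketch. The sample inputs are assumptions. One caveat worth verifying: in a backtracking engine, `\d+(?!%)` can give back a digit to satisfy the lookahead, so it still finds `4` inside `42%`. Forbidding a following digit as well closes that gap.

```python
import re

# Each lookahead checks one requirement without consuming characters.
PASSWORD = re.compile(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$")

strong = PASSWORD.match("Secret123") is not None
weak = PASSWORD.match("secret123") is not None  # no uppercase letter

# Backtracking caveat: \d+ gives up the "2" so that (?!%) succeeds after "4".
partial = re.search(r"\d+(?!%)", "42%")

# Also rejecting a following digit prevents the partial match.
strict_percent = re.search(r"\d+(?![\d%])", "42%")
strict_plain = re.search(r"\d+(?![\d%])", "42 items")

print(strong, weak, partial.group(), strict_percent, strict_plain.group())
```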
 
-### Lookbehind
+### Lookbehind
 
 ```
-(?<=pattern) Positive
-(?<!pattern) Negative
+(?<=pattern) Positive: preceded by pattern
+(?<!pattern) Negative: NOT preceded by pattern
 
-# Match digits preceded by a dollar sign
 (?<=\$)\d+ Matches "50" in "$50", not in "50 items"
-
-# Match a word NOT preceded by "un"
 (?<!un)happy Matches "happy" but not "unhappy"
 
 # Note: lookbehinds must be fixed-width in most engines
-#
-# JavaScript supports variable-width lookbehind since ES2018
-```
-
-### Practical Lookaround Examples
-
-```javascript
-// Format number with commas: 1234567 -> 1,234,567
-'1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
-// Result: "1,234,567"
-
-// Extract values from key=value pairs without capturing the key
-const re = /(?<=name=)\w+/g;
-'name=Alice age=30'.match(re);
-// Result: ["Alice"]
-
-// Match word that is not inside quotes
-// \b\w+\b(?=(?:[^"]*"[^"]*")*[^"]*$)
+# Go RE2 does NOT support lookaround at all
 ```
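A small Python sketch of the two lookbehind forms and of the fixed-width restriction noted above. Python's built-in `re` is one of the engines that rejects variable-width lookbehind. The sample strings are assumptions.

```python
import re

# Positive lookbehind: digits only when a dollar sign comes immediately before.
prices = re.findall(r"(?<=\$)\d+", "Pay $50 for 50 items")

# Negative lookbehind: "happy" only when it is not preceded by "un".
happy_positions = [m.start() for m in re.finditer(r"(?<!un)happy", "unhappy but happy")]

# A variable-width lookbehind fails at compile time in Python's `re`.
try:
    re.compile(r"(?<=\$\d+)USD")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(prices, happy_positions, variable_width_ok)
```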
 
 ---
 
 ## 5. Common Patterns
 
-### Email (Simplified
-
+### Email (Simplified)
 ```
 ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
-
-# Breakdown:
-# ^ Start of string
-# [a-zA-Z0-9._%+\-]+ Local part (letters, digits, dots, etc.)
-# @ Literal @
-# [a-zA-Z0-9.\-]+ Domain (letters, digits, dots, hyphens)
-# \. Literal dot
-# [a-zA-Z]{2,} TLD (at least 2 letters)
-# $ End of string
-
-# WARNING: For production, use a library. RFC 5322 email regex is 6,300+ chars.
+# WARNING: For production, use a library. RFC 5322 regex is 6,300+ chars.
 ```
 
 ### URL
-
 ```
 https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&/=]*)
-
-# For strict validation, use the URL parser in your language instead:
-# new URL(input) -- throws on invalid URL
+# For strict validation, use the URL parser in your language instead.
 ```
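In Python, the "URL parser in your language" could be `urllib.parse.urlsplit` from the standard library. The helper below is a hypothetical sketch, not part of the package. Note that `urlsplit` is lenient and does not enforce every rule of the URL standards, so treat this as a first filter only.

```python
from urllib.parse import urlsplit


def looks_like_http_url(text: str) -> bool:
    """Hypothetical helper: accept only http(s) URLs that include a host."""
    try:
        parts = urlsplit(text)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs,
        # for example an unbalanced IPv6 bracket.
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


print(looks_like_http_url("https://example.com/path?q=1"))
print(looks_like_http_url("ftp://example.com"))
```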
 
-### IPv4
-
+### IPv4
 ```
 \b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b
-
-# Breakdown:
-# 25[0-5] 250-255
-# 2[0-4]\d 200-249
-# [01]?\d\d? 0-199
-# Repeated 4 times separated by dots
 ```
 
-### Phone
-
+### Phone (US)
 ```
 ^(?:\+1[-.\s]?)?(?:\(?\d{3}\)?[-.\s]?)?\d{3}[-.\s]?\d{4}$
-
-# Matches:
-# (555) 123-4567
-# 555-123-4567
-# 5551234567
-# +1 555 123 4567
 ```
 
 ### Date (YYYY-MM-DD)
-
 ```
 ^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$
-
-# Validates format but NOT calendar correctness (e.g., 2025-02-31 matches).
-# For real date validation, parse with Date or a date library.
+# Validates format, NOT calendar correctness.
 ```
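To show why the format check alone is not enough, the sketch below pairs the date pattern above with a calendar check from Python's standard `datetime` module. The helper name `is_real_date` is an assumption made for illustration.

```python
import re
from datetime import date

DATE_FORMAT = re.compile(r"^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$")


def is_real_date(text: str) -> bool:
    """Hypothetical helper: check the shape first, then calendar correctness."""
    if not DATE_FORMAT.match(text):
        return False
    try:
        date.fromisoformat(text)  # raises ValueError for dates like Feb 31
    except ValueError:
        return False
    return True


print(bool(DATE_FORMAT.match("2025-02-31")), is_real_date("2025-02-31"))
```

The regex accepts `2025-02-31` because it only constrains each field's digit range; the date parser is what rejects it.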
 
-### Semantic Versioning
-
+### Semantic Versioning
 ```
 ^(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)\.(?:0|[1-9]\d*)(?:-(?:(?:0|[1-9]\d*|\d*[a-zA-Z\-][\da-zA-Z\-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z\-][\da-zA-Z\-]*))*))?(?:\+[\da-zA-Z\-]+(?:\.[\da-zA-Z\-]+)*)?$
-
-# Matches: 1.0.0, 2.1.3-beta.1, 0.0.1+build.123
 ```
 
-### Hex Color
-
+### Hex Color
 ```
 ^#(?:[0-9a-fA-F]{3}){1,2}$
-
-# Matches: #fff, #FFF, #a1b2c3, #A1B2C3
 ```
 
-### Slug
-
+### Slug
 ```
 ^[a-z0-9]+(?:-[a-z0-9]+)*$
-
-# Matches: "hello-world", "my-blog-post", "v2"
-# No match: "-start", "end-", "double--dash"
 ```
 
 ---
 
-## 6. ReDoS Prevention
-
-### The Problem
+## 6. ReDoS Prevention
 
-
+### Dangerous Patterns — O(2^n)
 
 ```
-# DANGEROUS patterns -- O(2^n) on adversarial input
 (a+)+$ Nested quantifiers
 (a|a)+$ Overlapping alternation
 (\w+\s*)+$ Repeated group with optional separator
-
-# Example attack:
-# Pattern: (a+)+$
-# Input: "aaaaaaaaaaaaaaaaaaaaaaaa!"
-# The engine backtracks exponentially trying to match
 ```
 
 ### Prevention Rules
 
-
-
-
-
-
-
-
-# 2. NEVER use overlapping alternation with quantifiers
-# BAD: (a|a)+
-# GOOD: a+
-
-# 3. NEVER use .* inside a repeated group
-# BAD: (.*,)+
-# GOOD: ([^,]*,)+
-
-# 4. Use atomic groups or possessive quantifiers when available
-# a++b Possessive: never backtracks into a+ match
-# (?>a+)b Atomic group: same effect
-
-# 5. Use negated character classes instead of lazy quantifiers
-# BAD: ".*?" (lazy dot-star between quotes)
-# GOOD: "[^"]*" (negated class -- no backtracking needed)
-
-# 6. Set regex timeouts in production
-```
-
-### Language-Specific Timeout
-
-```javascript
-// JavaScript: no built-in timeout, use a wrapper
-function safeMatch(str, pattern, timeoutMs = 1000) {
-// Run in a worker with a timeout, or use a library like re2
-// Node.js: consider the 're2' package (linear-time regex engine)
-}
-```
-
-```python
-# Python: use the 'regex' package with timeout
-import regex
-try:
-regex.match(r'(a+)+$', input_string, timeout=1.0)
-except regex.error:
-pass # Timed out
-```
-
-### Safe Alternatives
-
-```
-# Use RE2 (linear-time regex engine, no backtracking)
-# - Available as npm 're2', Python 'google-re2', Go 'regexp' (uses RE2 by default)
-# - RE2 does NOT support backreferences or lookaround
-# - Trade-off: fewer features, guaranteed linear performance
-```
+1. NEVER nest quantifiers: `(a+)+` → flatten to `a+`
+2. NEVER use overlapping alternation with quantifiers
+3. NEVER use `.*` inside a repeated group — use `([^,]*,)+` instead
+4. Use atomic groups or possessive quantifiers when available
+5. Use negated character classes `[^"]*` over lazy `.*?`
+6. Set regex timeouts in production
+7. Consider RE2 (linear-time, no backtracking) for user inputs
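Rules 1 and 5 in the list above can be sanity-checked with a short Python sketch. The inputs are kept deliberately short so the nested pattern stays cheap to run; the sample strings are assumptions for illustration.

```python
import re

# Rule 1: the nested and flattened forms accept the same strings,
# but only the nested form backtracks exponentially on a near-miss.
nested = re.compile(r"^(a+)+$")
flat = re.compile(r"^a+$")
samples = ["a", "aaaa", "aaab", ""]
same_verdicts = all(
    bool(nested.match(s)) == bool(flat.match(s)) for s in samples
)

# Rule 5: a negated class finds the same quoted strings as lazy dot-star here.
text = 'say "hi" and "bye"'
lazy = re.findall(r'".*?"', text)
negated = re.findall(r'"[^"]*"', text)

print(same_verdicts, lazy, negated)
```

Because the safer forms give the same answers on these inputs, the rewrite is typically a drop-in change rather than a behavioral one.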
 
 ---
 
@@ -368,384 +231,103 @@ except regex.error:
 
 ### Unicode Categories (\p{})
 
-```javascript
-// Requires /u or /v flag in JavaScript
-
-// Match any letter (any script)
-/\p{L}+/u // "Hello", "Bonjour", etc.
-
-// Match any number
-/\p{N}+/u // "123", etc.
-
-// Match any punctuation
-/\p{P}/u
-
-// Match emoji (requires /v flag in JS)
-/\p{Emoji}/v
-
-// Match specific scripts
-/\p{Script=Greek}+/u
-/\p{Script=Han}+/u
-/\p{Script=Arabic}+/u
-
-// Common categories:
-// \p{L} Letter (any script)
-// \p{Lu} Uppercase letter
-// \p{Ll} Lowercase letter
-// \p{N} Number
-// \p{Nd} Decimal digit
-// \p{P} Punctuation
-// \p{S} Symbol
-// \p{Z} Separator (spaces)
-// \p{M} Mark (combining characters)
 ```
+\p{L} Letter (any script)
+\p{Lu} Uppercase letter
+\p{Ll} Lowercase letter
+\p{N} Number
+\p{Nd} Decimal digit
+\p{P} Punctuation
+\p{S} Symbol
+\p{Z} Separator (spaces)
+\p{M} Mark (combining characters)
 
-
-
-
-
-/[a-zA-Z]+/
-
-// GOOD: Unicode-aware word matching
-/[\p{L}\p{M}]+/u
-
-// BAD: ASCII-only digit matching
-/[0-9]+/
-
-// GOOD: Unicode-aware digit matching (includes Arabic-Indic, etc.)
-/\p{Nd}+/u
-
-// Match a "word" in any language
-/\b\p{L}+\b/gu
+# Script-specific
+\p{Script=Greek}
+\p{Script=Han}
+\p{Script=Arabic}
 ```
 
+> **Note:** Unicode property support varies by engine. JavaScript requires `/u` flag. Go RE2 supports `\p{L}` etc. Python `re` does NOT support `\p{}`—use the `regex` package.
+
 ---
 
 ## 8. Flags
 
 ```
-g Global: find all matches
+g Global: find all matches
 i Case-insensitive
-m Multiline: ^ and $ match line boundaries
+m Multiline: ^ and $ match line boundaries
 s DotAll: . matches newline characters
-u Unicode: enables \p{}, correct
-
-
-y Sticky: match at exact position (lastIndex)
-x Extended/Verbose: ignore whitespace and allow comments (Python, Ruby, PCRE)
-```
-
-### Extended Mode (Verbose Regex)
-
-```python
-# Python: re.VERBOSE or re.X
-import re
-
-pattern = re.compile(r"""
-^ # Start of string
-(?P<protocol>https?) # Protocol (http or https)
-:// # Separator
-(?P<domain> # Domain group
-[a-zA-Z0-9.-]+ # Domain name
-\.[a-zA-Z]{2,} # TLD
-)
-(?P<path>/\S*)? # Optional path
-$ # End of string
-""", re.VERBOSE)
-```
-
----
-
-## 9. Regex in Different Languages
-
-### JavaScript
-
-```javascript
-// Literal syntax
-const re = /^hello\s+(\w+)$/i;
-
-// Constructor (for dynamic patterns)
-const pattern = 'hello';
-const re2 = new RegExp(`^${escapeRegExp(pattern)}$`, 'i');
-
-// Methods
-const match = str.match(re); // Returns match array or null
-const allMatches = [...str.matchAll(/\d+/g)]; // Iterator of all matches
-const replaced = str.replace(/foo/g, 'bar');
-const parts = str.split(/[,;\s]+/);
-const isValid = re.test(str); // Returns boolean
-
-// Named groups
-const { groups } = /(?<year>\d{4})-(?<month>\d{2})/.exec('2025-06');
-// groups.year === '2025'
-
-// Escape special characters for use in RegExp constructor
-function escapeRegExp(str) {
-return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
-}
-```
-
-### Python
-
-```python
-import re
-
-# Compile for reuse
-pattern = re.compile(r'^hello\s+(\w+)$', re.IGNORECASE)
-
-# Methods
-match = pattern.match(string) # Match at start
-match = pattern.search(string) # Search anywhere
-matches = pattern.findall(string) # All matches (list of strings/tuples)
-matches = pattern.finditer(string) # Iterator of Match objects
-result = pattern.sub(r'bar', string) # Replace
-parts = pattern.split(string) # Split
-
-# Match object
-if match:
-match.group(0) # Full match
-match.group(1) # First capture group
-match.group('name') # Named group
-match.start() # Start position
-match.end() # End position
-
-# Raw strings: always use r'...' for regex patterns in Python
-# r'\n' is a literal backslash-n, not a newline
-```
-
-### Go
-
-```go
-package main
-
-import (
-"fmt"
-"regexp"
-)
-
-func main() {
-// Compile (returns error if invalid)
-re, err := regexp.Compile(`^hello\s+(\w+)$`)
-if err != nil {
-panic(err)
-}
-
-// MustCompile panics on invalid pattern (use for constants)
-re = regexp.MustCompile(`\d+`)
-
-// Methods
-matched := re.MatchString("hello world") // bool
-result := re.FindString("abc 123 def") // "123"
-allResults := re.FindAllString("a1 b2 c3", -1) // ["1", "2", "3"]
-submatch := re.FindStringSubmatch("hello world") // ["hello world", "world"]
-replaced := re.ReplaceAllString("foo 123", "NUM") // "foo NUM"
-
-// Note: Go uses RE2 engine -- no backreferences or lookaround
-fmt.Println(matched, result, allResults, submatch, replaced)
-}
-```
-
-### Rust
-
-```rust
-use regex::Regex;
-
-fn main() {
-// Compile
-let re = Regex::new(r"^hello\s+(\w+)$").unwrap();
-
-// Methods
-let is_match = re.is_match("hello world"); // bool
-let caps = re.captures("hello world").unwrap();
-let name = &caps[1]; // "world"
-
-// Named captures
-let re = Regex::new(r"(?P<year>\d{4})-(?P<month>\d{2})").unwrap();
-let caps = re.captures("2025-06").unwrap();
-let year = &caps["year"]; // "2025"
-
-// Find all matches
-let re = Regex::new(r"\d+").unwrap();
-let matches: Vec<&str> = re.find_iter("a1 b2 c3").map(|m| m.as_str()).collect();
-// ["1", "2", "3"]
-
-// Replace
-let result = re.replace_all("foo 123 bar 456", "NUM");
-// "foo NUM bar NUM"
-
-// Note: Rust regex crate uses finite automata (like RE2)
-// No backreferences or lookaround by default
-// Use the 'fancy-regex' crate for those features
-}
+u Unicode: enables \p{}, correct surrogate pair handling
+x Extended/Verbose: ignore whitespace, allow comments (Python, Ruby, PCRE)
+y Sticky: match at exact position (JavaScript)
 ```
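How these flags map onto one concrete API can be sketched in Python, where they are passed as constants rather than letters after the pattern. Python has no `g` or `y` flag: `findall` and `finditer` cover the global case. The sample text and variable names below are assumptions.

```python
import re

text = "first line\nsecond line"

# m / re.MULTILINE: ^ and $ also match at line boundaries.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# s / re.DOTALL: . also matches newline characters.
without_dotall = re.search(r"first.*second", text)
with_dotall = re.search(r"first.*second", text, flags=re.DOTALL)

# i / re.IGNORECASE: letters match regardless of case.
case_insensitive = re.fullmatch(r"hello", "HeLLo", flags=re.IGNORECASE)

# x / re.VERBOSE: whitespace and comments inside the pattern are ignored.
version_core = re.compile(
    r"""
    (\d+) \.   # major
    (\d+) \.   # minor
    (\d+)      # patch
    """,
    re.VERBOSE,
)

print(line_starts, without_dotall, bool(with_dotall), bool(case_insensitive))
print(version_core.fullmatch("2.2.0").groups())
```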
 
 ---
 
-##
-
-### Techniques
+## 9. Debugging
 
-
-
-
-
-
-- Has a debugger that shows backtracking steps
-
-2. Break complex patterns into named pieces:
-
-JavaScript:
-const year = '(?<year>\\d{4})';
-const month = '(?<month>0[1-9]|1[0-2])';
-const day = '(?<day>0[1-9]|[12]\\d|3[01])';
-const datePattern = new RegExp(`^${year}-${month}-${day}$`);
+1. Use [regex101.com](https://regex101.com) — supports multiple engines
+2. Break complex patterns into named pieces and compose them
+3. Test incrementally — start simple, add one element at a time
+4. Test edge cases: empty string, very long string, Unicode, newlines
+5. Log intermediate results (capture groups)
|
|
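Items 2 and 5 in the list above (compose named pieces, log capture groups) can be sketched in Python as follows. This is a minimal illustration, not part of the package: the names `YEAR`, `MONTH`, `DAY`, `DATE_RE`, and `parse_date` are hypothetical, and the date format is assumed to be `YYYY-MM-DD`.

```python
import re

# Hypothetical piece names: each fragment is small enough to test on its own.
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>0[1-9]|1[0-2])"
DAY = r"(?P<day>0[1-9]|[12]\d|3[01])"

# Compose the pieces into one pattern; fullmatch() below anchors it to the whole string.
DATE_RE = re.compile(f"{YEAR}-{MONTH}-{DAY}")


def parse_date(text):
    """Return the named capture groups as a dict, or None if text is not a date."""
    match = DATE_RE.fullmatch(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Printing the group dict is the "log intermediate results" step.
    print(parse_date("2025-06-15"))
    print(parse_date("2025-13-01"))
```

Because each piece is a separate constant, a failing pattern can be narrowed down by testing `MONTH` or `DAY` alone before testing the composed pattern.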
602
276
|
|
|
603
|
-
|
|
604
|
-
- Start with the simplest subpattern
|
|
605
|
-
- Add one element at a time
|
|
606
|
-
- Test with both matching and non-matching inputs
|
|
277
|
+
### Common Mistakes
|
|
607
278
|
|
|
608
|
-
4. Check edge cases:
|
|
609
|
-
- Empty string
|
|
610
|
-
- Very long string
|
|
611
|
-
- Unicode characters
|
|
612
|
-
- Newlines
|
|
613
|
-
- String with only whitespace
|
|
614
|
-
|
|
615
|
-
5. Log intermediate results:
|
|
616
|
-
const re = /(\w+)\s+(\w+)/;
|
|
617
|
-
const match = re.exec(input);
|
|
618
|
-
console.log('Full match:', match[0]);
|
|
619
|
-
console.log('Group 1:', match[1]);
|
|
620
|
-
console.log('Group 2:', match[2]);
|
|
621
279
|
```
|
|
280
|
+
/file.txt/ # Unescaped dot matches any character (except newline), so it also matches "file_txt"
|
|
281
|
+
/file\.txt/ # Correct: escaped dot
|
|
622
282
|
|
|
623
|
-
### Common Debugging Mistakes
|
|
624
|
-
|
|
625
|
-
```
|
|
626
|
-
# Forgetting to escape special characters
|
|
627
|
-
/file.txt/ # Matches "file_txt" too (dot matches anything)
|
|
628
|
-
/file\.txt/ # Correct: matches only "file.txt"
|
|
629
|
-
|
|
630
|
-
# Forgetting anchors
|
|
631
283
|
/\d{3}/ # Matches "123" inside "abc12345def"
|
|
632
|
-
/^\d{3}$/ #
|
|
284
|
+
/^\d{3}$/ # Correct: anchored to exact 3 digits
|
|
633
285
|
|
|
634
|
-
# Greedy
|
|
635
|
-
|
|
636
|
-
/<[^>]*>/ # Correct: matches "<a>" and "</a>" separately
|
|
637
|
-
|
|
638
|
-
# Multiline misunderstanding
|
|
639
|
-
/^line$/ # Only matches if entire string is "line"
|
|
640
|
-
/^line$/m # Matches "line" on any line in a multiline string
|
|
286
|
+
/<.*>/ # Greedy: matches entire "<a>text</a>"
|
|
287
|
+
/<[^>]*>/ # Correct: negated class
|
|
641
288
|
```
|
|
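The three mistakes in the block above can be checked directly. The following is a small sketch using Python's `re` module; the variable names are illustrative only, and the inputs are the same strings used in the comments above.

```python
import re

# Mistake 1: an unescaped dot matches any character except a newline.
loose_dot = re.search(r"file.txt", "file_txt") is not None
strict_dot = re.search(r"file\.txt", "file_txt") is not None

# Mistake 2: without anchors, the pattern matches inside a longer digit run.
unanchored = re.search(r"\d{3}", "abc12345def")
anchored = re.search(r"^\d{3}$", "abc12345def")

# Mistake 3: a greedy dot-star runs to the last ">" in the input.
html = "<a>text</a>"
greedy_tags = re.findall(r"<.*>", html)
negated_tags = re.findall(r"<[^>]*>", html)

if __name__ == "__main__":
    print(loose_dot, strict_dot)
    print(unanchored.group(), anchored)
    print(greedy_tags, negated_tags)
```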
642
289
|
|
|
643
290
|
---
|
|
644
291
|
|
|
645
|
-
##
|
|
646
|
-
|
|
647
|
-
### HTML/XML Parsing
|
|
648
|
-
|
|
649
|
-
```
|
|
650
|
-
# NEVER parse HTML with regex. Use a DOM parser.
|
|
651
|
-
|
|
652
|
-
# BAD:
|
|
653
|
-
/<div class="title">(.*?)<\/div>/
|
|
654
|
-
|
|
655
|
-
# Why it fails:
|
|
656
|
-
# - Nested divs
|
|
657
|
-
# - Attributes in different orders
|
|
658
|
-
# - Self-closing tags
|
|
659
|
-
# - Comments containing tags
|
|
660
|
-
# - Malformed HTML
|
|
661
|
-
|
|
662
|
-
# GOOD:
|
|
663
|
-
# JavaScript: DOMParser, cheerio
|
|
664
|
-
# Python: BeautifulSoup, lxml
|
|
665
|
-
```
|
|
666
|
-
|
|
667
|
-
### Complex Grammars
|
|
668
|
-
|
|
669
|
-
```
|
|
670
|
-
# Regex cannot parse:
|
|
671
|
-
# - Nested parentheses of arbitrary depth: ((()))
|
|
672
|
-
# - Matching braces in programming languages
|
|
673
|
-
# - JSON, YAML, TOML
|
|
674
|
-
# - Programming language syntax
|
|
675
|
-
# - Mathematical expressions
|
|
676
|
-
|
|
677
|
-
# Use a parser (PEG, ANTLR, tree-sitter) for these.
|
|
678
|
-
```
|
|
679
|
-
|
|
680
|
-
### Simple String Operations
|
|
681
|
-
|
|
682
|
-
```
|
|
683
|
-
# Do not use regex when simple string methods work:
|
|
684
|
-
|
|
685
|
-
# BAD: /^prefix/.test(str)
|
|
686
|
-
# GOOD: str.startsWith('prefix')
|
|
687
|
-
|
|
688
|
-
# BAD: /suffix$/.test(str)
|
|
689
|
-
# GOOD: str.endsWith('suffix')
|
|
690
|
-
|
|
691
|
-
# BAD: str.replace(/foo/g, 'bar') (for literal strings)
|
|
692
|
-
# GOOD: str.replaceAll('foo', 'bar')
|
|
693
|
-
|
|
694
|
-
# BAD: /^$/.test(str)
|
|
695
|
-
# GOOD: str.length === 0
|
|
292
|
+
## 10. When NOT to Use Regex
|
|
696
293
|
|
|
697
|
-
|
|
698
|
-
|
|
699
|
-
|
|
700
|
-
|
|
701
|
-
|
|
294
|
+
### Use a Parser Instead
|
|
295
|
+
- HTML/XML — use DOM parser (cheerio, BeautifulSoup, lxml)
|
|
296
|
+
- JSON, YAML, TOML — use a parser
|
|
297
|
+
- Nested structures — regex cannot handle arbitrary depth
|
|
298
|
+
- Programming language syntax — use tree-sitter, ANTLR
|
|
702
299
|
|
|
300
|
+
### Use String Methods Instead
|
|
703
301
|
```
|
|
704
|
-
#
|
|
705
|
-
|
|
706
|
-
|
|
707
|
-
|
|
708
|
-
|
|
709
|
-
# GOOD:
|
|
710
|
-
const parsed = new URL(url);
|
|
711
|
-
// parsed.protocol, parsed.hostname, parsed.pathname, parsed.searchParams
|
|
302
|
+
# Prefer:
|
|
303
|
+
startsWith('prefix') over /^prefix/.test(str)
|
|
304
|
+
endsWith('suffix') over /suffix$/.test(str)
|
|
305
|
+
includes('text') over /text/.test(str)
|
|
306
|
+
split(',') over split(/,/)
|
|
712
307
|
```
|
|
713
308
|
|
|
714
|
-
### When Regex IS
|
|
715
|
-
|
|
716
|
-
|
|
717
|
-
|
|
718
|
-
|
|
719
|
-
|
|
720
|
-
# - Search and replace with pattern awareness
|
|
721
|
-
# - Tokenizing simple grammars (CSV fields, log lines)
|
|
722
|
-
# - Text cleanup (normalize whitespace, strip control characters)
|
|
723
|
-
# - Splitting on complex delimiters
|
|
724
|
-
```
|
|
309
|
+
### When Regex IS Right
|
|
310
|
+
- Pattern matching in text (log parsing, data extraction)
|
|
311
|
+
- Input validation (format checks)
|
|
312
|
+
- Search and replace with pattern awareness
|
|
313
|
+
- Tokenizing simple grammars (CSV, log lines)
|
|
314
|
+
- Text cleanup (normalize whitespace, strip control characters)
|
|
725
315
|
|
|
726
316
|
---
|
|
727
317
|
|
|
728
|
-
##
|
|
318
|
+
## 11. Critical Reminders
|
|
729
319
|
|
|
730
320
|
### ALWAYS
|
|
731
|
-
|
|
732
|
-
- Use
|
|
733
|
-
-
|
|
734
|
-
-
|
|
735
|
-
-
|
|
736
|
-
-
|
|
737
|
-
- Use named groups for readability in complex patterns
|
|
738
|
-
- Escape user input before inserting into regex (prevent injection)
|
|
739
|
-
- Use the `/u` flag in JavaScript for Unicode correctness
|
|
740
|
-
- Comment complex regex patterns (use verbose mode or build from named parts)
|
|
321
|
+
- Anchor with `^` and `$` when validating entire strings
|
|
322
|
+
- Use non-capturing groups `(?:...)` when you don't need the value
|
|
323
|
+
- Prefer negated character classes `[^"]*` over lazy `.*?`
|
|
324
|
+
- Test with adversarial inputs for ReDoS
|
|
325
|
+
- Escape user input before inserting into regex
|
|
326
|
+
- Use Unicode flag for international content
|
|
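As an illustration of the "escape user input" rule above, here is a minimal Python sketch. The helper name `find_literal` is hypothetical; `re.escape` is the standard-library call that neutralizes regex metacharacters. Other languages have their own equivalents, which are not shown here.

```python
import re


def find_literal(user_text, haystack):
    """Return the start index of every literal occurrence of user_text."""
    # re.escape() backslash-escapes metacharacters such as "." and "(",
    # so the user's text is matched literally rather than as a pattern.
    pattern = re.compile(re.escape(user_text))
    return [m.start() for m in pattern.finditer(haystack)]


if __name__ == "__main__":
    # Escaped, "a.b" matches only the literal text, not "axb".
    print(find_literal("a.b", "a.b axb a.b"))
    # Escaped, a lone "(" is a valid literal; unescaped it would not compile.
    print(find_literal("(", "f(x)"))
```

Without the escape step, the first search would also match `axb`, and the second would raise `re.error` because `(` opens an unterminated group.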
741
327
|
|
|
742
328
|
### NEVER
|
|
743
|
-
|
|
744
|
-
-
|
|
745
|
-
-
|
|
746
|
-
-
|
|
747
|
-
-
|
|
748
|
-
- Use `.` when you mean a specific character class (be explicit)
|
|
749
|
-
- Use `.*` in a pattern that processes user input without backtracking protection
|
|
750
|
-
- Forget that `^` and `$` behave differently with and without the `/m` flag
|
|
751
|
-
- Build regex from untrusted user input without escaping special characters
|
|
329
|
+
- Parse HTML/XML/JSON with regex
|
|
330
|
+
- Use nested quantifiers without understanding backtracking
|
|
331
|
+
- Trust regex for email validation beyond basic format
|
|
332
|
+
- Use `.` when you mean a specific character class
|
|
333
|
+
- Build regex from untrusted input without escaping
|