Character Classes
.
Any single char except newline (with the s flag, includes newline)
\d
Digit, equivalent to [0-9]
\D
Non-digit, equivalent to [^0-9]
\w
Word char, equivalent to [A-Za-z0-9_] in ASCII mode (Python 3 str patterns also match Unicode letters and digits unless re.ASCII is set)
\s
Whitespace (space, tab, newline, etc.)
[abc]
Any one of a, b, or c
[^abc]
Any char NOT in the set
[a-z]
Range of chars from a to z
[A-Za-z0-9]
Combined ranges
\.
Escaped period (matches literal .)
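The classes above can be tried out in a few lines. This is a minimal sketch assuming Python's built-in re module; the sample text and variable names are made up for illustration.

```python
import re

text = "Order 66: ship 3 crates to dock B-12.\n"

# \d matches one digit; findall returns every non-overlapping match.
digits = re.findall(r"\d", text)

# [A-Z] is a range: each uppercase ASCII letter.
uppers = re.findall(r"[A-Z]", text)

# [^a-z\s] is a negated set: anything that is neither a lowercase
# ASCII letter nor whitespace.
others = re.findall(r"[^a-z\s]", text)

# An unescaped . matches any char except newline,
# while \. matches only a literal period.
any_char_count = len(re.findall(r".", text))
literal_dots = re.findall(r"\.", text)

# In Python 3, \w is Unicode-aware for str patterns unless re.ASCII is set.
unicode_words = re.findall(r"\w+", "caf\u00e9")
ascii_words = re.findall(r"\w+", "caf\u00e9", re.ASCII)

print(digits, uppers, others, any_char_count, literal_dots)
print(unicode_words, ascii_words)
```

The last two lines show why "\w equals [A-Za-z0-9_]" only holds in ASCII mode: the accented letter is a word char by default and is dropped once re.ASCII is passed.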
Anchors
^
Start of string (or line, with m flag)
$
End of string (or line, with m flag)
\b
Word boundary (between \w and \W)
\A
Start of string only (Python, PCRE; not JS)
\z
End of string only (PCRE; Python's re spells this \Z; not JS)
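How the anchors interact with the m flag is easiest to see side by side. The following is a small sketch assuming Python's re module, where the absolute end-of-string anchor is written \Z; the sample strings are invented.

```python
import re

log = "ok\nerror: disk full\nok"

# Without flags, ^ matches only at the very start of the string.
first_only = re.findall(r"^\w+", log)

# With re.MULTILINE, ^ also matches right after each newline.
each_line = re.findall(r"^\w+", log, re.MULTILINE)

# \A is not affected by re.MULTILINE: it always means start of string.
start_only = re.findall(r"\A\w+", log, re.MULTILINE)

# In Python, $ may match just before a trailing newline; \Z may not.
trailing = "done\n"
dollar_matches = re.search(r"done$", trailing) is not None
z_matches = re.search(r"done\Z", trailing) is not None

# \b keeps "ok" from matching inside "okay" or "book".
whole_words = re.findall(r"\bok\b", "ok okay book ok")

print(first_only, each_line, start_only)
print(dollar_matches, z_matches, whole_words)
```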
Groups & Backreferences
(?:abc)
Non-capturing group (no slot in match array)
(?<name>abc)
Named capturing group (most modern engines; Python's re uses (?P<name>abc))
\1, \2
Backreference to first, second capture group
\k<name>
Backreference to named group (Python's re uses (?P=name))
$1, $2
In substitution: insert capture group N (JS, Perl; Python and sed use \1, \2)
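The group syntax above looks slightly different in Python, so here is a short sketch assuming the re module, where named groups are written (?P<name>...) and named backreferences (?P=name). The inputs are illustrative only.

```python
import re

# Named groups: Python spells them (?P<name>...).
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date_re.search("released 2024-03-15")
year = m.group("year")
parts = m.groups()

# (?:...) groups for repetition without taking a capture slot,
# so the only captured group here is (c).
nc_groups = re.search(r"(?:ab)+(c)", "ababc").groups()

# \1 must match the same text group 1 captured: finds a doubled word.
doubled_word = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)

# (?P=q) refers back to the named group q, so the closing quote
# has to be the same character as the opening one.
quoted = re.search(r"(?P<q>['\"]).*?(?P=q)", "say 'hi' now").group(0)

print(year, parts, nc_groups, doubled_word, quoted)
```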
Lookaround
(?=abc)
Positive lookahead: followed by abc
(?!abc)
Negative lookahead: NOT followed by abc
(?<=abc)
Positive lookbehind: preceded by abc
(?<!abc)
Negative lookbehind: NOT preceded by abc
Note: Lookbehind is not universal. JS supports it as of ES2018, PCRE supports it, and Python's re supports it for fixed-width patterns only. Go's RE2-based regexp and POSIX BRE/ERE don't support lookaround at all.
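Lookarounds assert context without consuming it, which a short example makes concrete. This is a sketch assuming Python's re module (lookbehind patterns must be fixed width there); the strings are made up.

```python
import re

files = "a.txt b.py c.txt"

# Positive lookahead: a name followed by ".txt", without ".txt" in the match.
txt_names = re.findall(r"\w+(?=\.txt)", files)

# Negative lookahead: a file whose extension is not exactly "txt".
non_txt = re.findall(r"\b\w+\.(?!txt\b)\w+", files)

prices = "USD 40, EUR 55, USD 7"

# Positive lookbehind: digits preceded by "USD ".
usd = re.findall(r"(?<=USD )\d+", prices)

# Negative lookbehind: the leading \b matters. Without it the engine
# could start inside "40" and match "0", which is not preceded by "USD ".
not_usd = re.findall(r"\b(?<!USD )\d+", prices)
not_usd_unanchored = re.findall(r"(?<!USD )\d+", prices)

print(txt_names, non_txt, usd, not_usd, not_usd_unanchored)
```

The last pair illustrates a common surprise: a negative lookbehind only constrains the position where the match starts, so the pattern needs its own boundary to avoid matching from the middle of a number.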
Flags
g
Global, find all matches (not just first)
m
Multiline, ^ and $ also match at the start and end of each line
s
Dotall, . matches newlines
u
Unicode mode (JS): matches by code point and enables \u{...} and \p{...} escapes
x
Extended/verbose, ignore whitespace and allow comments (Python, PCRE)
y
Sticky, match starting at lastIndex (JS only)
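Flags are spelled differently per engine. As a hedged illustration, the sketch below uses Python's re module, where m, s, and x correspond to re.MULTILINE, re.DOTALL, and re.VERBOSE, and where there is no g flag: re.search returns the first match and re.findall returns all of them. The sample inputs are invented.

```python
import re

text = "<b>one\ntwo</b>"

# Without DOTALL, . stops at the newline, so there is no match.
no_dotall = re.search(r"<b>.*</b>", text)

# With DOTALL (the s flag), . also matches the newline.
with_dotall = re.search(r"<b>.*</b>", text, re.DOTALL)

# The same flag can be set inline at the start of the pattern.
inline = re.search(r"(?s)<b>.*</b>", text)

# VERBOSE (the x flag) ignores pattern whitespace and allows # comments.
time_re = re.compile(r"""
    (\d{2})            # hours
    :
    (\d{2})            # minutes
    (?: : (\d{2}) )?   # optional seconds
""", re.VERBOSE)
full_time = time_re.search("at 09:30:15").groups()
short_time = time_re.search("at 09:30").groups()

# No g flag in Python: pick the function instead.
first_number = re.search(r"\d+", "a1 b22 c333").group(0)
all_numbers = re.findall(r"\d+", "a1 b22 c333")

print(no_dotall, with_dotall.group(0) == text, full_time, short_time)
print(first_number, all_numbers)
```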
Common Patterns
[\w.+-]+@[\w-]+\.[\w.-]+
Email (loose, good enough for most use cases)
https?://[^\s]+
URL (basic, captures most http and https)
\b(\d{1,3}\.){3}\d{1,3}\b
IPv4 address (loose, doesn't validate ranges 0-255)
[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}
UUID in canonical 8-4-4-4-12 hex format (doesn't check version or variant bits)
\d{4}-\d{2}-\d{2}
ISO 8601 date (YYYY-MM-DD)
\d{2}:\d{2}(:\d{2})?
HH:MM or HH:MM:SS time
^#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$
CSS hex color (3, 4, 6, or 8 hex digits; a plain {3,8} would also accept 5 and 7)
\$?\d{1,3}(,\d{3})*(\.\d{2})?
USD currency with optional dollar sign and cents
\+?[1-9]\d{1,14}
E.164 international phone number
^[a-z0-9-]+$
URL-safe slug (lowercase, digits, hyphens)
^(?=.*[A-Z])(?=.*\d).{8,}$
Password: 8+ chars, has uppercase and digit (lookaheads)
<([a-z]+)([^>]*)>.*?</\1>
Matched HTML tag pair (loose, won't handle nesting)
^\s*$
Blank line (empty or whitespace only)
^[A-Z][a-z]+\s[A-Z][a-z]+$
Two capitalized words (e.g. a typical full name)
[a-zA-Z]\w{2,30}
Username: starts with letter, 3-31 chars total
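Several patterns above are labelled loose, and a common way to tighten them is to let the regex check the shape and ordinary code check the values. Here is a minimal sketch of that idea for the IPv4 pattern, assuming Python's re module; is_valid_ipv4 is a hypothetical helper name, not part of any library.

```python
import re

# The loose IPv4 pattern from the list above.
IPV4_LOOSE = re.compile(r"\b(\d{1,3}\.){3}\d{1,3}\b")


def is_valid_ipv4(candidate: str) -> bool:
    """Shape check with the regex, then a numeric range check per octet."""
    # fullmatch requires the whole string to match, not just a part of it.
    if IPV4_LOOSE.fullmatch(candidate) is None:
        return False
    return all(0 <= int(octet) <= 255 for octet in candidate.split("."))


# When extracting from text, use finditer and group(0): because the pattern
# contains a capturing group, findall would return only that group.
found = [m.group(0) for m in IPV4_LOOSE.finditer("from 10.0.0.1 to 10.0.0.300")]
valid = [ip for ip in found if is_valid_ipv4(ip)]

print(found)
print(valid)
```

The same split applies to the date and time patterns: the regex confirms the digits are in the right places, and a date parser is better placed to reject month 13 or hour 25.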
Substitution
"abc".replace(/b/, "X")
JavaScript: replace first match
"abc".replace(/b/g, "X")
JavaScript: replace all matches
re.sub(r'\d+', 'N', s)
Python: replace each run of digits with "N"
sed 's/foo/bar/g'
sed: replace every match on each line (default syntax is POSIX BRE)
sed -E 's/foo/bar/g'
sed extended: ERE allows +, ?, {n,m}, () without escaping
$1, $2 or \1, \2 in replacement
Insert capture group N: $1 in JS and Perl; \1 in sed and Python (Python also accepts \g<1> and \g<name>)
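Replacement syntax is where engines diverge most, so a Python-specific sketch may help. It assumes the re module, where group references in the replacement are written \1 or \g<name> and $1 is just literal text; the inputs are illustrative.

```python
import re

# \1 and \2 insert the numbered groups (use a raw string for the replacement).
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

# \g<name> inserts a named group.
iso = re.sub(
    r"(?P<m>\d{2})/(?P<d>\d{2})/(?P<y>\d{4})",
    r"\g<y>-\g<m>-\g<d>",
    "03/15/2024",
)

# \g<1>0 means "group 1, then a literal 0"; \10 would be read as group 10.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# $1 has no special meaning in Python replacements.
literal_dollar = re.sub(r"(b)", "$1", "abc")

# re.sub replaces every match by default; count=1 limits it to the first.
first_only = re.sub(r"b", "X", "abcb", count=1)

# A function can compute each replacement from the match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group(0)) * 2), "3 apples, 10 pears")

print(swapped, iso, padded, literal_dollar, first_only, doubled, sep="\n")
```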
Engine Differences (Heads-up)
JavaScript
Lookahead has always been supported; lookbehind and named groups via (?<name>...) arrived in ES2018. matchAll() requires the g flag and throws a TypeError without it.
Python (re)
Use r"..." raw strings to avoid double-escaping. Named groups via (?P<name>...). Has verbose x flag for commenting.
grep
Default is BRE (basic). Use grep -E for ERE (where +, ?, {n,m}, () are special). Use grep -P for PCRE on GNU grep.
Go (regexp)
Uses RE2: linear time, no lookaround, no backreferences. Trade-off for guaranteed performance.
Rust (regex crate)
RE2-style: linear time, no lookaround or backreferences. Use the fancy-regex crate for those.
Pro Tips
Always escape special chars in literal text: . * + ? ^ $ { } ( ) | [ ] \. Inside character classes, you only need to escape ] and \, a ^ in first position, and a - that is neither first nor last.
Greedy vs lazy matters more than people think. .* matches the longest possible substring. .*? matches the shortest. Wrong one bites you on HTML/JSON-like inputs.
Anchors are free. Add ^ and $ when you mean "match the whole string" rather than "match anywhere in it." Catches a surprising number of bugs.
Named groups beat positional ones once you have more than two captures. Future-you will not remember what $3 meant.
Test patterns interactively before shipping. Use /tools/regex for live match feedback.
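The tips on escaping, greedy versus lazy matching, and anchoring can be checked in a few lines. This is a small sketch assuming Python's re module; the sample strings are made up.

```python
import re

html = "<i>a</i> and <i>b</i>"

# Greedy .* runs to the last </i>, swallowing both tags in one match.
greedy = re.findall(r"<i>.*</i>", html)

# Lazy .*? stops at the first </i>, giving one match per tag.
lazy = re.findall(r"<i>.*?</i>", html)

# Unanchored, the slug pattern happily matches a fragment of a bad input.
slug_pattern = r"[a-z0-9-]+"
fragment = re.search(slug_pattern, "Not A Slug!").group(0)

# fullmatch behaves like wrapping the pattern in ^...$ for the whole string.
whole_bad = re.fullmatch(slug_pattern, "Not A Slug!")
whole_good = re.fullmatch(slug_pattern, "a-real-slug-42")

# re.escape backslash-escapes special chars so text is matched literally.
needle = "1+1=2 (really?)"
found_literal = re.search(re.escape(needle), "proof: 1+1=2 (really?)") is not None

print(greedy, lazy, fragment, whole_bad, whole_good is not None, found_literal)
```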