What do the flags mean?

Global (g) finds all matches, not just the first. Case-insensitive (i) ignores letter case. Multiline (m) makes ^ and $ match at the start and end of each line instead of only at the start and end of the entire string.
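
As a rough illustration of the three flags, here is a small JavaScript sketch. The sample text and variable names are made up for the example.

```javascript
// Sample text with two lines; "Cat" differs from "cat" only by case.
const text = "cat\nCat sat";

// Without g, match() stops at the first hit.
const first = text.match(/cat/);

// g finds every match; i ignores letter case.
const all = text.match(/cat/gi);

// Without m, ^ only matches at the very start of the string.
const startOnly = text.match(/^\w+/g);

// With m, ^ also matches right after each line break.
const eachLine = text.match(/^\w+/gm);

console.log(first[0], all, startOnly, eachLine);
```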

Are my patterns or test text stored anywhere?

No. The regex cheatsheet runs entirely in your browser. Patterns, test strings and matches stay local to your device, so it's safe for production snippets, internal data formats, or confidential content.

Does this tool work offline?

Yes. Once the page has loaded, the tool works entirely in your browser without needing an internet connection. All processing is done locally with JavaScript.

Free Regex Cheatsheet

Interactive reference guide for regular expressions.

How to Use

Browse the pattern categories or use the search box to find specific patterns.
Enter a regex pattern in the "Test Pattern" field and sample text in "Test Text".
Toggle flags (global, case-insensitive, multiline) and see matches highlighted instantly.

Frequently Asked Questions

What is a regular expression?

A regular expression (regex or regexp) is a pattern used to match, search, and replace text. It uses special characters and syntax to define what strings to find.

Can I use this cheatsheet in my code?

Yes! Once you've tested a pattern here and verified it works, copy it into your JavaScript code. For Python or other languages, re-test any features that vary between engines (lookbehinds, named groups, Unicode escapes) before relying on it.

A Brief History of the Pattern Language

Regular expressions began as a piece of theoretical computer science. Stephen Kleene defined "regular events" in a 1956 paper on nerve nets and finite automata; Ken Thompson published a regex search algorithm in 1968 and built it into the QED and ed editors, and grep grew out of ed on Unix in the early 1970s. Henry Spencer's freely available regex library (mid-1980s) became the basis for many later implementations. Larry Wall extended the syntax dramatically in Perl, and Philip Hazel's PCRE library ("Perl Compatible Regular Expressions", 1997) carried that syntax into other programs; Perl-style syntax became the de facto standard most modern languages followed. Today there are several closely related but subtly different regex flavours, and a pattern that works in one engine doesn't always work identically in another.

The Engine Your Pattern Lives In

The same syntax can mean different things in different engines. The big families:

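POSIX BRE (Basic Regular Expressions), used by grep's default mode, sed. Grouping and interval braces must be backslash-escaped to be special: \( \) and \{ \}; unescaped (, ), {, } are literal. +, ? and | are not operators in strict POSIX BRE at all, though GNU grep and sed accept \+, \? and \| as extensions.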
POSIX ERE (Extended Regular Expressions), used by egrep, awk. The above metacharacters work without escaping.
PCRE (Perl-Compatible Regular Expressions), extends ERE-style syntax with lookarounds, atomic groups, named captures and backreferences. The PCRE library is used by PHP and many other tools; most modern languages ship their own engines with similar Perl-style syntax. The Perl-derived shorthand classes \d / \w / \s are common to PCRE, JavaScript, .NET, Java and Python.
JavaScript RegExp, close to PCRE but with notable differences. ES2018 added lookbehinds, named capture groups, the s dotall flag, and Unicode property escapes via the u flag. The v flag for set notation arrived in ES2024.
Python re and Python regex, re is in the standard library; the third-party regex module adds Unicode-aware features, variable-width lookbehinds, and other PCRE-style enhancements.
RE2 (Google's library; Go's regexp package follows the same design and syntax), guarantees linear time but doesn't support backreferences or lookarounds. The trade-off: predictable performance, fewer features.

This cheatsheet's interactive tester runs in JavaScript, so the pattern is evaluated by the browser's JS engine. Patterns that work here may behave differently in Python or PHP. Most differences are in advanced features (lookbehinds, Unicode property escapes, backreferences) rather than basic syntax.
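
One way to see which features a given JavaScript engine accepts is to try compiling a pattern and catch the error. The sketch below assumes a hypothetical helper named supportsPattern (not part of any library); it only checks whether the syntax compiles, not whether the pattern behaves the same as in another engine.

```javascript
// Hypothetical helper: returns true if this engine can compile the pattern.
function supportsPattern(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    // Unsupported or malformed syntax throws when the RegExp is constructed.
    return false;
  }
}

const checks = {
  lookbehind: supportsPattern("(?<=a)b"),
  namedGroup: supportsPattern("(?<year>\\d{4})"),
  unicodeProperty: supportsPattern("\\p{L}", "u"),
  // Atomic groups exist in PCRE, Java and .NET; JavaScript rejects the syntax.
  atomicGroup: supportsPattern("(?>a+)b"),
};

console.log(checks);
```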

The Core Building Blocks

Almost every regex pattern is built from these elements:

Literals, match themselves. cat matches the substring "cat".
Anchors, ^ (start of string / line), $ (end), \b (word boundary), \B (non-word-boundary).
Character classes, [abc] matches a, b, or c. [^abc] negates. [a-z] is a range. Shorthands: \d (digit), \w (word character: letter, digit, underscore), \s (whitespace), and uppercase versions for negation (\D, \W, \S).
Quantifiers, ? (0 or 1), * (0 or more), + (1 or more), {n}, {n,}, {n,m}. Greedy by default (match as much as possible); add ? for lazy: *?, +?, ??.
Groups, (...) capturing, (?:...) non-capturing, (?<name>...) named (PCRE / JS / .NET / Java). Python's re module spells named groups (?P<name>...).
Alternation, cat|dog matches either.
Lookarounds, (?=...) positive lookahead, (?!...) negative lookahead, (?<=...) positive lookbehind, (?<!...) negative lookbehind. Match without consuming.
Backreferences, \1, \2 (numbered), \k<name> (named; (?P=name) in Python's re). Match the same text the corresponding capture matched.
Flags, g (global), i (case-insensitive), m (multiline: ^ and $ match line boundaries), s (dotall: . matches newlines), u (Unicode), y (sticky in JS).
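
To show how several of these building blocks combine, here is a short JavaScript sketch. The sample strings and variable names are invented for illustration.

```javascript
// Anchors, quantifiers and named groups: pull apart an ISO-style date.
const dateRe = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/;
const { year, month, day } = "2024-03-15".match(dateRe).groups;

// Lookahead: digits that are followed by "px", without consuming the unit.
const pxValues = "10px 2em 300px".match(/\d+(?=px)/g);

// Backreference: \1 must repeat whatever group 1 captured (a doubled word).
const doubled = "it was was fine".match(/\b(\w+) \1\b/);

// Alternation inside a non-capturing group, with a word boundary on each side.
const pets = "cat, dogma, dog".match(/\b(?:cat|dog)\b/g);

console.log(year, month, day, pxValues, doubled[0], pets);
```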

Patterns Worth Memorising

A handful of patterns come up so often it's worth keeping them in your head:

Use	Pattern
Email (basic)	`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`
URL	`https?://[^\s]+`
US phone number	`\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}`
ISO date (YYYY-MM-DD)	`\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])`
IPv4 address (no octet validation)	`\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b`
Hex colour	`^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$`
Whitespace at start/end of line	`^\s+|\s+$`
Multiple consecutive spaces	`\s{2,}`

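A note on email regex: a regex that fully implements the RFC 5322 address grammar runs to thousands of characters and is effectively unmaintainable. The simple form above accepts the vast majority of real email addresses but still rejects some legitimate ones (for example, addresses with an apostrophe in the local part); for production use, send a confirmation email instead of trying to perfectly validate the syntax.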
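
A few of the patterns from the table, tried out in JavaScript. This is only a sketch: the ISO date pattern is wrapped in ^ and $ here (an addition for whole-string checks), the phone pattern assumes parentheses around the area code are written as \( and \), and the constant names are made up.

```javascript
const basicEmail = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const isoDate = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;
const hexColour = /^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/;
const usPhone = /\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/;

const results = {
  emailOk: basicEmail.test("user@example.com"),
  emailNoTld: basicEmail.test("user@example"),
  dateOk: isoDate.test("2024-03-15"),
  // The pattern checks digit ranges only, so an impossible date still passes.
  dateFeb30: isoDate.test("2024-02-30"),
  dateMonth13: isoDate.test("2024-13-01"),
  colourOk: hexColour.test("#1a2B3c"),
  colourFourDigits: hexColour.test("#1234"),
  phoneOk: usPhone.test("(555) 123-4567"),
};

console.log(results);
```

The date case is a reminder that these patterns check shape, not meaning; calendar validity needs a separate check.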

Greedy vs Lazy: A Common Surprise

By default, quantifiers are greedy: they match as much as possible while still allowing the overall pattern to match. So <.+> against <a>text</a> matches the whole thing, not just <a>, because .+ grabs as much as it can. To match the smallest possible string, append ? to the quantifier: <.+?> matches <a> and then </a> separately. The greedy/lazy choice is one of the most common sources of "why isn't my regex matching what I expected" bugs.
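
The difference is easy to check in JavaScript. The sketch below also includes a negated character class, which is a common alternative to a lazy quantifier for this kind of match (an addition for comparison).

```javascript
const html = "<a>text</a>";

// Greedy: .+ runs to the end, then backs off to the last ">".
const greedy = html.match(/<.+>/g);

// Lazy: .+? stops at the first ">" that lets the pattern succeed.
const lazy = html.match(/<.+?>/g);

// Negated class: "anything except >" states the intent without backtracking.
const negated = html.match(/<[^>]+>/g);

console.log(greedy, lazy, negated);
```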

Catastrophic Backtracking and ReDoS

Some regex patterns can take exponential time to fail on certain inputs, a class of denial-of-service vulnerability called ReDoS (Regular Expression Denial of Service). The classic culprits are nested quantifiers like (a+)+ or (a|aa)+ applied to a long run of the letter a followed by a non-matching character. The engine tries every possible way to split the run before giving up, and the number of ways is exponential.

Real-world incidents: Cloudflare's 2019 outage was triggered by a regex deployed in a WAF rule that catastrophically backtracked on certain inputs. Stack Overflow had a similar incident in July 2016: a trim regex (^[\s\u200c]+|[\s\u200c]+$) hit quadratic backtracking on a post containing roughly 20,000 consecutive whitespace characters and took the site down for 34 minutes. Defensive habits: avoid nested quantifiers, prefer atomic groups ((?>...)) where supported (JavaScript does not support them), and consider using RE2 / linear-time engines for untrusted input.
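
A small JavaScript sketch of the problem and one fix: the nested pattern and its flattened rewrite accept exactly the same strings, but only the nested one slows down sharply as the failing input grows. The input lengths are kept small on purpose so the loop finishes quickly; actual timings depend on the engine and machine.

```javascript
// Vulnerable: the run of a's can be split between the inner and outer +
// in exponentially many ways, and a failing match tries them all.
const risky = /^(a+)+$/;

// Same language, no nesting: only one way to match a run of a's.
const safe = /^a+$/;

// Both patterns agree on what matches.
const samples = ["aaaa", "aaaab", "", "b"];
const agree = samples.every((s) => risky.test(s) === safe.test(s));

// Time the failing case for a few short inputs.
for (const n of [12, 16, 20, 22]) {
  const input = "a".repeat(n) + "b";
  const start = Date.now();
  risky.test(input);
  const riskyMs = Date.now() - start;
  console.log(`n=${n}: risky pattern took about ${riskyMs} ms`);
}

console.log("patterns agree:", agree);
```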

Per-Language Quirks Worth Knowing

JavaScript: backslashes need double-escaping in string literals ("\\d") but not in regex literals (/\d/). Use the regex literal form when possible.
Python: use raw strings (r"\d+") to avoid backslash issues. The re module is in the standard library; regex on PyPI adds extra features.
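Java: regex backslashes must be doubled in string literals ("\\d" for \d) because the Java compiler consumes one level of escaping before the regex compiler sees the pattern. Matching a literal backslash takes four ("\\\\").
Bash: regex matching in [[ $string =~ pattern ]] uses POSIX ERE. Quoting any part of the pattern makes that part match literally, so patterns are usually stored in a variable and used unquoted; consult man bash.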
Go: uses RE2, so backreferences and lookarounds aren't available. Trade-off: linear-time guarantee.
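
The JavaScript point can be seen directly by comparing the source of each regex. This is a minimal sketch; String.raw is shown as one way to avoid doubled backslashes when a pattern has to be built from a string.

```javascript
// Regex literal: the backslash goes straight to the regex engine.
const literal = /\d+/;

// String source: "\\d+" is the three characters \ d +, so RegExp sees \d+.
const fromString = new RegExp("\\d+");

// A single backslash is lost: "\d+" is just the string "d+".
const broken = new RegExp("\d+");

// String.raw keeps backslashes as typed, much like Python's r"..." strings.
const fromRaw = new RegExp(String.raw`\d+`);

console.log(literal.source, fromString.source, broken.source, fromRaw.source);
```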

When NOT to Use Regex

Jamie Zawinski's famous 1997 line: "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

Don't parse HTML / XML with regex. Use a real parser (DOMParser in browsers, BeautifulSoup in Python, jsoup in Java, etc.). HTML's nested structure is fundamentally beyond what regex can express cleanly.
Don't parse JSON with regex. Use JSON.parse / standard library JSON parsers.
Don't validate emails strictly with regex. Send a confirmation email; that's the only reliable test.
Don't write a CSV parser as a regex. Quoted fields with embedded commas, escaped quotes, and multi-line values quickly outgrow what regex handles cleanly.
Don't try to match balanced parentheses. Standard regex can't: balanced parentheses form a context-free language, not a regular one. Some engines (PCRE, for example) offer recursion features that work around this, but a real parser is cleaner.
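
For the balanced-parentheses case, a few lines of ordinary code do the job. The sketch below uses a hypothetical isBalanced helper that tracks nesting depth with a counter; it handles only ( and ), and ignores quoting or other bracket types.

```javascript
// Hypothetical helper: true if every ")" closes an earlier "(" and none stay open.
function isBalanced(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === "(") {
      depth += 1;
    } else if (ch === ")") {
      depth -= 1;
      // A closing parenthesis with nothing open can never be repaired later.
      if (depth < 0) return false;
    }
  }
  return depth === 0;
}

console.log(isBalanced("(a(b)c)"), isBalanced(")("), isBalanced("(()"));
```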

Common Mistakes

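Forgetting to escape special characters. ., *, ?, +, (, ), [, ], {, }, \, ^, $ and | all have special meanings; / is also special in regex literals (JavaScript, Perl) because it delimits the pattern. To match any of them literally, prefix it with a backslash.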
Greedy quantifiers consuming too much. Add ? for lazy matching when you want the smallest possible match.
Missing the global flag and wondering why only the first match shows. JavaScript's String.prototype.match() returns only the first match without the g flag.
Catastrophic backtracking on long inputs. Nested quantifiers like (a+)+ can hang on certain inputs. Test with edge cases.
Assuming the same regex behaves the same in every language. Lookbehinds, Unicode escapes, and character class shortcuts all vary.
Trying to validate emails too strictly. The technically-correct RFC 5322 regex is unmaintainable; a simple regex plus confirmation-email-on-signup is the working pattern.
Using regex on HTML, JSON, or CSV. Use a proper parser; the time you save up front you'll lose to bugs.
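
Two of these mistakes, escaping and the missing g flag, can be sketched in JavaScript as follows. escapeRegExp is a hypothetical helper written for this example, not a built-in.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in plain text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped, "$4.99" would treat $ as an anchor and . as "any character".
const price = "$4.99";
const priceRe = new RegExp(escapeRegExp(price), "g");
const found = "was $4.99, now $4x99".match(priceRe);

// Without g, match() returns only the first match (plus its capture groups).
const firstOnly = "a1 b2 c3".match(/[a-z](\d)/);

// matchAll() requires g and yields every match together with its groups.
const digits = [..."a1 b2 c3".matchAll(/[a-z](\d)/g)].map((m) => m[1]);

console.log(found, firstOnly[0], digits);
```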
