Free Regex Tester & Debugger
Test regular expressions with real-time highlighting and capture groups.
Quick Reference
.          Any character except newline
\d         Digit (0-9)
\w         Word character (a-z, A-Z, 0-9, _)
\s         Whitespace (space, tab, newline)
^          Start of string (or line with m flag)
$          End of string (or line with m flag)
*          0 or more of previous
+          1 or more of previous
?          0 or 1 of previous
{n,m}      Between n and m of previous
[abc]      Character class: a, b, or c
[^abc]     Not a, b, or c
(abc)      Capture group
(?:abc)    Non-capturing group
a|b        a or b
\b         Word boundary
(?=abc)    Positive lookahead
(?!abc)    Negative lookahead
About Regular Expressions
Regular expressions (regex) are patterns used to match character combinations in strings. They are an essential tool in programming, text processing, data validation, and search operations. Every major programming language supports regex: JavaScript, Python, Java, PHP, Ruby, Go, and more.
This tester uses JavaScript's built-in RegExp engine, which supports ECMAScript regex syntax including lookaheads, character classes, quantifiers, and the g, i, m, and s flags. Matches are highlighted in real time as you type, and capture groups are displayed in the match details panel.
Common Uses
- Validate email addresses, phone numbers, and form inputs
- Extract data from log files, CSVs, or HTML
- Find and replace text patterns in code editors
- Parse URLs, file paths, and structured text
- Write web scraping selectors and search filters
Frequently Asked Questions
What do the g, i, m, and s flags do?
g (global) finds all matches instead of stopping at the first. i (case insensitive) ignores uppercase vs. lowercase. m (multiline) makes ^ and $ match the start/end of each line. s (dotAll) makes . match newline characters too.
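As a rough illustration of the four flags, the sketch below runs a few patterns against a made-up three-line string. The sample text and variable names are assumptions chosen only to make each flag's effect visible.

```javascript
// Hypothetical sample text: three lines that differ only in case.
const flagSample = "Cat\ncat\nCAT";

// No g flag: match() stops at the first match (the lowercase "cat" on line 2).
const firstOnly = flagSample.match(/cat/);

// g + i: every match, ignoring case.
const allIgnoringCase = flagSample.match(/cat/gi);

// m: ^ and $ anchor at each line instead of the whole string.
const perLine = flagSample.match(/^cat$/gim);
const wholeStringOnly = flagSample.match(/^cat$/gi); // no line-level anchoring

// s: the dot also matches the newline between "Cat" and "cat".
const dotAllOn = /Cat.cat/s.test(flagSample);
const dotAllOff = /Cat.cat/.test(flagSample);

console.log(firstOnly.index, allIgnoringCase, perLine, wholeStringOnly, dotAllOn, dotAllOff);
```

Toggling the same flags in the tester should highlight the same substrings.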
Will this regex work in Python / Java / PHP?
Most regex syntax is shared across languages, but there are differences. For example, lookbehinds are missing from older browsers (all modern ones support them), and Python uses a different named-group syntax. Basic patterns built from literals, character classes, quantifiers, and anchors generally behave the same elsewhere, but test on the target engine before relying on anything more advanced.
Is my test data sent anywhere?
No. All regex matching happens locally in your browser using JavaScript's native RegExp engine. Nothing is sent to any server.
What is a regex tester?
A regex tester is an interactive editor that runs a regular expression against a sample string and shows you exactly what matched, what did not, and what the capture groups contain. The tester lets you iterate quickly: type the pattern, see the highlights, adjust, repeat. It replaces the slow loop of editing source code, running a script, and reading console output.
Regular expressions themselves are a pattern syntax invented by Stephen Cole Kleene in 1956 to describe sets of strings. Modern regex implementations (PCRE, JavaScript's RegExp, Python's re, .NET's System.Text.RegularExpressions, Java's java.util.regex) share most of their syntax but differ in edge cases like lookbehinds, named groups, Unicode handling, and quantifier behavior.
This tester uses your browser's native JavaScript RegExp engine. In current browsers that means ECMAScript 2024 regex, including the standard flags (g, i, m, s, u, v, y, d) and lookbehinds. The output is exactly what your front-end code will see at runtime, which makes the tester especially useful when debugging client-side validation, scraping selectors, or replace-with-callback transforms.
What is inside the tester
The top row holds the pattern input flanked by forward slashes, followed by toggle buttons for the four most-used flags (g, i, m, s). A Patterns button opens a library of common regex snippets (email, URL, phone, date) that you can click to fill the pattern field. Behind the scenes the input is debounced so re-typing does not thrash the matcher.
Below the pattern, the Test String textarea is where you paste the sample text. Matches are highlighted with a yellow background in the Highlighted Matches panel that updates as you type. The Replace with field accepts a replacement string with backreferences ($1, $2, etc.) and shows the resulting text live, perfect for testing string-replacement transforms before pasting them into your code.
The Match Details list shows each match with its zero-based index in the source, the matched substring, and every capture group. A Quick Reference card at the bottom recaps the syntax for character classes, quantifiers, anchors, and lookarounds, so you do not have to context-switch to a documentation tab for the basics.
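The match details and the replace preview described above can be approximated in a few lines of plain JavaScript. This is a sketch of the idea, not the tester's actual source; the sample string and the names `dateSample`, `datePattern`, and `matchDetails` are assumptions.

```javascript
// Hypothetical input: two ISO-style dates in a sentence.
const dateSample = "2024-01-15 and 2025-12-31";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/g;

// One entry per match: zero-based index, matched text, and capture groups.
const matchDetails = [...dateSample.matchAll(datePattern)].map((m) => ({
  index: m.index,
  match: m[0],
  groups: m.slice(1),
}));

// Replacement template with backreferences: reorder to day/month/year.
const replacedText = dateSample.replace(datePattern, "$3/$2/$1");

console.log(matchDetails);
console.log(replacedText); // "15/01/2024 and 31/12/2025"
```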
History and background
Stephen Cole Kleene defines regular events (1956)
Mathematician Stephen Cole Kleene published the paper Representation of Events in Nerve Nets and Finite Automata in 1956, introducing what he called regular events: patterns that describe sets of strings accepted by a finite automaton. The Kleene star (the * operator) carries his name. His algebraic notation is the direct ancestor of every regex syntax in use today.
Ken Thompson builds a regex engine for QED (1968)
Ken Thompson at Bell Labs implemented a regex engine in 1968 inside the QED editor and again in grep (1973), the Unix utility whose name comes from the ed command g/re/p (globally search for a regular expression and print). Thompson's NFA-based engine ran in time linear in the length of the input, a guarantee that backtracking engines later lost when they added features like backreferences.
Perl 5 introduces extended regex (1994)
Larry Wall released Perl 5 in 1994 with a regex flavor that added lookaheads, non-capturing groups, lazy quantifiers and inline modifiers; lookbehinds and named captures followed in later 5.x releases. Perl 5 regex became so dominant that other languages copied its syntax. Philip Hazel created PCRE (Perl Compatible Regular Expressions) in 1997 as a C library, and PCRE today powers regex in PHP, Apache, NGINX, and many other tools.
JavaScript ships RegExp (1997, formalized 1999)
JavaScript gained a RegExp object modeled after Perl in JavaScript 1.2 (Netscape Navigator 4, 1997), two years after Brendan Eich's JavaScript 1.0. ECMAScript edition 3 (1999) formalized the syntax. Subsequent editions added Unicode flag u (ES2015), sticky flag y (ES2015), the dotAll flag s, named groups and lookbehinds (ES2018), indices flag d (ES2022) and the set-notation flag v (ES2024). Browsers caught up over time, and current versions of V8, SpiderMonkey and JavaScriptCore implement these features.
ReDoS, regex denial of service (2003 onward)
Researchers noticed that backtracking regex engines can take exponential time on certain inputs, a class of vulnerability called ReDoS (Regular expression Denial of Service). A 2019 Cloudflare outage was traced to a regex with catastrophic backtracking. Tools like rxxr and node-re2 emerged to detect or sidestep the issue, and engines started enforcing time budgets on long-running matches.
Unicode property escapes land in ECMAScript (2018)
ES2018 added Unicode property escapes such as \p{Script=Latin} or \p{Letter}, which let you match by Unicode category without enumerating code points. Combined with the u flag, regex can now distinguish emoji from letters, scripts from each other, and properly handle surrogate pairs. This makes JavaScript regex finally suitable for international text matching, a problem the older ASCII-only syntax could not solve.
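A small sketch of property escapes in use, assuming an engine with ES2018 support. The sample string and the helper name `isGreekWord` are made up for illustration; the string uses \u escapes so the source stays ASCII-safe.

```javascript
// "caf\u00E9 \u6771\u4EAC 42" renders as: café 東京 42
const mixedText = "caf\u00E9 \u6771\u4EAC 42";

// \p{Letter} needs the u flag; it matches letters from any script.
const letterRuns = mixedText.match(/\p{Letter}+/gu);

// \p{Script=Greek} restricts the match to a single script.
const isGreekWord = (s) => /^\p{Script=Greek}+$/u.test(s);

console.log(letterRuns);                        // two runs: the Latin word and the Han word
console.log(isGreekWord("\u03B1\u03B2\u03B3")); // alpha beta gamma
console.log(isGreekWord("abc"));
```

The digits "42" are skipped because they belong to the Number category, not Letter.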
Practical workflows
Email validation
Drop a sample of valid and invalid emails into the test area, type your candidate regex (a common starting point is ^[^@\s]+@[^@\s]+\.[^@\s]+$), and tweak until valid emails highlight and invalid ones do not. Be aware that the address grammar in the email RFCs (5321 and 5322) is so complex that a regex covering all of it runs to hundreds of characters. A pragmatic regex catches typos; final validation should come from actually delivering a confirmation message to the address.
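The workflow above can be mirrored in code by filtering a list of candidates through the starting pattern. The addresses below are invented test data, and the pattern is the pragmatic one from the text, not a full RFC validator.

```javascript
// Pragmatic starting pattern: something@something.something, no spaces or extra @.
const simpleEmail = /^[^@\s]+@[^@\s]+\.[^@\s]+$/;

// Hypothetical candidates, a mix of plausible and broken addresses.
const emailCandidates = [
  "alice@example.com",
  "bob@localhost",          // no dot after the @
  "no-at-sign.example.com", // no @ at all
  "two@@example.com",       // doubled @
  "space in@example.com",   // whitespace in the local part
];

const acceptedEmails = emailCandidates.filter((c) => simpleEmail.test(c));
console.log(acceptedEmails); // ["alice@example.com"]
```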
URL parsing and extraction
Paste a page of HTML or plain text and write a regex to extract URLs. A starting pattern like https?:\/\/\S+ catches most cases, though it also swallows trailing punctuation such as a closing period or parenthesis. For production code, prefer the URL constructor (new URL(string)), which parses and validates according to the WHATWG URL Standard; regex is best for quick one-off extractions or log analysis.
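A minimal sketch of both approaches side by side, using an invented line of text. The helper name `isParsableUrl` is an assumption; `URL` is the standard constructor available in browsers and Node.js.

```javascript
// Hypothetical text containing two URLs, the second followed by a period.
const urlText = "See https://example.com/a?x=1 then http://example.org/b.";

// Quick extraction: everything from the scheme up to the next whitespace.
const foundUrls = urlText.match(/https?:\/\/\S+/g) ?? [];

// Stricter check: let the URL constructor decide whether a string parses.
function isParsableUrl(candidate) {
  try {
    new URL(candidate);
    return true;
  } catch {
    return false;
  }
}

console.log(foundUrls);                      // note the trailing "." on the second entry
console.log(new URL(foundUrls[0]).hostname); // "example.com"
console.log(isParsableUrl("not a url"));     // false
```

The trailing period captured in the second match is the kind of over-match worth checking for in the highlight panel.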
Log file scraping
Apache and NGINX access logs follow a predictable format by default. Paste a few log lines, write a regex with named captures ((?<ip>\S+) (?<ts>\S+ \S+) "(?<req>[^"]+)" ...), and you have a parser ready to feed into a structured-log analyzer. Test on a sample of your real logs before deploying.
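Named captures can be read from the `groups` property of a match. The sketch below uses a deliberately simplified, hypothetical log line (IP, two-token timestamp, quoted request); it is not the real Apache or NGINX layout, so adapt the pattern to your own format.

```javascript
// Hypothetical, simplified log line. Real access logs carry more fields.
const logLine = '203.0.113.7 2024-01-15 10:30:00 "GET /index.html HTTP/1.1"';

// Named groups: ip, ts (two whitespace-separated tokens), req (quoted string).
const logPattern = /^(?<ip>\S+) (?<ts>\S+ \S+) "(?<req>[^"]+)"/;

function parseLogLine(line) {
  const m = logPattern.exec(line);
  // Return null for lines that do not fit, rather than throwing.
  return m ? { ip: m.groups.ip, ts: m.groups.ts, req: m.groups.req } : null;
}

console.log(parseLogLine(logLine));
console.log(parseLogLine("garbage")); // null
```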
Find and replace in code editors
VS Code, Sublime Text, JetBrains IDEs and vim all accept regex in their find-replace dialogs. Iterate on the pattern here first, with the live highlighter showing exactly what matches, then paste the regex into the editor's dialog. Save yourself the pain of misfires on a 5,000-line codebase.
Web scraping CSS class names
When you need to extract data from HTML without a parser (a one-off script, not production), a regex like class="([^"]+)" pulls out class attributes. For anything beyond a quick exploration, switch to a proper DOM library; HTML is not a regular language and regex misses edge cases.
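For a quick exploration, the class-attribute pattern can be combined with matchAll to collect every value. The HTML snippet is invented, and the sketch only handles double-quoted attributes.

```javascript
// Hypothetical HTML fragment for a one-off exploration.
const htmlSnippet = '<div class="card featured"><span class="title">Hi</span></div>';

// Capture group 1 holds the attribute value between the double quotes.
const classValues = [...htmlSnippet.matchAll(/class="([^"]+)"/g)].map((m) => m[1]);

// Split each value on whitespace to get individual class names.
const classNames = classValues.flatMap((v) => v.split(/\s+/));

console.log(classValues); // ["card featured", "title"]
console.log(classNames);  // ["card", "featured", "title"]
```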
Validating semantic version strings
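A simplified semver pattern is ^\d+\.\d+\.\d+(-[\w.]+)?(\+[\w.]+)?$. Drop a list of versions (1.0.0, 1.2.3-beta.1+build.456) into the test area to check that the regex handles pre-release and build metadata correctly. The pattern is an approximation: \w admits underscores and excludes hyphens, so it is looser than the semver grammar in some places and stricter in others. It is still useful for quick checks when validating dependencies in CI scripts.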
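Running the pattern over a small list shows both what it accepts and where it falls short. The version strings are sample data; the last one is valid under the semver specification but is rejected here because \w does not include the hyphen.

```javascript
const simpleSemver = /^\d+\.\d+\.\d+(-[\w.]+)?(\+[\w.]+)?$/;

// Hypothetical version strings to try.
const versionSamples = [
  "1.0.0",
  "1.2.3-beta.1+build.456",
  "1.2",        // missing patch number
  "v1.2.3",     // leading "v" is not part of the pattern
  "1.2.3-rc-1", // hyphen inside the pre-release identifier
];

const passingVersions = versionSamples.filter((v) => simpleSemver.test(v));
console.log(passingVersions); // ["1.0.0", "1.2.3-beta.1+build.456"]
```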
Common pitfalls
Greedy vs lazy quantifiers
By default *, + and ? are greedy: they match as much as possible, then backtrack if the rest of the regex fails. The lazy versions *?, +?, ?? match as little as possible. The classic example is <.*> on <a>text</a> which matches the whole string, while <.*?> matches just <a> and </a> separately. Pick the right one to avoid over-matching surprises.
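The classic example can be reproduced directly:

```javascript
const tagSample = "<a>text</a>";

// Greedy: .* runs to the end, then backtracks to the last ">".
const greedyMatches = tagSample.match(/<.*>/g);

// Lazy: .*? stops at the first ">" it can reach.
const lazyMatches = tagSample.match(/<.*?>/g);

console.log(greedyMatches); // ["<a>text</a>"]
console.log(lazyMatches);   // ["<a>", "</a>"]
```

A negated class such as <[^>]*> gives the same result as the lazy version here and avoids backtracking altogether.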
Catastrophic backtracking (ReDoS)
Nested quantifiers like (a+)+ or (.*)* on a long non-matching input can take exponential time as the engine tries every combination. The browser tab may freeze or crash. Avoid overlapping quantifier groups, pre-validate input length, or use atomic groups (?>...) in engines that support them (JavaScript does not). The npm library safe-regex flags risky patterns automatically.
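Two of the mitigations above, sketched under assumptions: the length limit of 1000 and the helper name `guardedTest` are arbitrary illustrative choices, and the flattened pattern is equivalent to the nested one only for this specific example.

```javascript
// Risky: nested quantifiers. Shown for reference only; it is never run
// against adversarial input in this sketch.
const nestedPattern = /^(a+)+$/;

// Equivalent for this case, without the nesting: one quantifier, no overlap.
const flatPattern = /^a+$/;

// Hypothetical guard: refuse to run a pattern on oversized input.
const MAX_INPUT_LENGTH = 1000; // arbitrary limit, tune for your data

function guardedTest(pattern, input, maxLength = MAX_INPUT_LENGTH) {
  if (input.length > maxLength) return false; // treat oversized input as a non-match
  return pattern.test(input);
}

console.log(guardedTest(flatPattern, "aaa"));                // true
console.log(guardedTest(flatPattern, "a".repeat(50) + "b")); // false, and fast
console.log(guardedTest(flatPattern, "a".repeat(2000)));     // false: over the limit
```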
Special characters need escaping
Characters with special meaning in regex (. * + ? ^ $ ( ) [ ] { } | \) must be escaped with a backslash to match literally. So \. matches a dot, while . matches any character. Forgetting to escape is the most common cause of false positives when validating IPs, file extensions, or dotted version numbers.
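When a literal string has to go into a pattern built at runtime, a small helper can do the escaping. `escapeRegExp` below is a commonly used utility function written out by hand, not a built-in that can be assumed to exist everywhere.

```javascript
// Escape every character that has special meaning in a regex.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped dots match any character, so this is a false positive.
const unescapedHit = /^1.2.3$/.test("1x2y3");

// Escaped version matches only the literal text "1.2.3".
const literalVersion = new RegExp(`^${escapeRegExp("1.2.3")}$`);

console.log(unescapedHit);                 // true
console.log(literalVersion.test("1.2.3")); // true
console.log(literalVersion.test("1x2y3")); // false
```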
Anchors and the multiline flag
Without the m flag, ^ and $ match only the start and end of the entire string. With m, they match the start and end of each line. If your regex works on single lines but fails on multi-line input, toggle m. Conversely, if it matches too much on multi-line input, remove m.
Cross-engine syntax differences
This tester uses JavaScript regex. Python's re uses (?P<name>...) for named captures instead of (?<name>...) and (?P=name) instead of \k<name> for named backreferences, .NET also accepts \k'name', and PCRE has features like recursive subpatterns (?R) that JavaScript lacks. If your final target is Python or Java, validate on those engines too before shipping.
Unicode without the u flag
Without the u flag, JavaScript regex treats characters outside the Basic Multilingual Plane (emoji, CJK extension ideographs), which are stored as surrogate pairs, as two separate code units. \u{1F600} (grinning face emoji) does not work without u. With the u flag, the regex becomes Unicode-aware, property escapes like \p{Letter} become available, and surrogate-pair handling is correct. Always set u when matching international text.
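The difference is easy to see with a single emoji. The variable names are illustrative; the emoji is written as an escape so the source stays ASCII-safe.

```javascript
// U+1F600 is stored as a surrogate pair in a JavaScript string.
const faceEmoji = "\u{1F600}";

const codeUnitCount = faceEmoji.length;       // counts UTF-16 code units
const codePointCount = [...faceEmoji].length; // iterates by code point

// Without u, "." consumes one code unit, so it cannot span the whole emoji.
const dotWithoutU = /^.$/.test(faceEmoji);

// With u, "." consumes the full code point and \u{...} escapes are recognized.
const dotWithU = /^.$/u.test(faceEmoji);
const escapeWithU = /^\u{1F600}$/u.test(faceEmoji);

console.log(codeUnitCount, codePointCount); // 2 1
console.log(dotWithoutU, dotWithU, escapeWithU); // false true true
```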
Privacy and data handling
Every regex is compiled and executed by your browser's RegExp engine. We do not send your pattern, your test string, or your replacement template to any server. The matcher runs locally, the highlights are rendered locally, and the match details list is computed locally. There are no analytics tied to the content of your inputs.
Once the page is loaded, the tester works offline. You can disconnect from the network, paste sensitive log lines or PII, and run patterns against them without any data leaving your device. This makes the tool safe for testing regex against production data without sending it through a third-party service.
When not to use a regex
Parsing HTML or XML
HTML is not a regular language. You cannot reliably parse nested tags with regex; the famous Stack Overflow answer about Zalgo and Cthulhu makes this point colorfully. Use DOMParser or a library like cheerio (Node.js) or BeautifulSoup (Python) instead. Regex is fine for one-off extractions but breaks on edge cases like self-closing tags, comments, CDATA, and malformed input.
Anything truly recursive (JSON, source code, math expressions)
Balanced braces, balanced parens, nested function calls, arithmetic precedence, all require a context-free grammar, not a regular one. Use a parser combinator (Parsimmon, nom) or a generator (pegjs, antlr). Regex can match opening or closing tokens but cannot track balance.
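To make the limitation concrete, here is a minimal sketch of the kind of state a regex cannot keep: a depth counter for parentheses. The function name `isBalanced` is illustrative, and a real parser would track far more than this.

```javascript
// Track nesting depth explicitly; a regular expression has no such counter.
function isBalanced(source) {
  let depth = 0;
  for (const ch of source) {
    if (ch === "(") {
      depth += 1;
    } else if (ch === ")") {
      depth -= 1;
      if (depth < 0) return false; // closing paren with nothing open
    }
  }
  return depth === 0; // every opener must have been closed
}

console.log(isBalanced("f(g(x), h(y))")); // true
console.log(isBalanced("(()"));           // false
console.log(isBalanced(")("));            // false
```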
When a simple string operation is enough
If you need to check whether a string starts with prefix-, use str.startsWith("prefix-"), not /^prefix-/. String methods are faster, clearer, and impossible to get wrong with quantifiers. Reserve regex for patterns that string methods cannot express.
Complex schema validation
Validating that a JSON document has a specific shape (required fields, nested types, value ranges) is far better done with a JSON Schema validator (ajv, zod, joi) than a regex. Regex can check format but not structure, and a regex that tries to validate a JSON document is a maintenance nightmare.
More questions
When should I use lookahead vs lookbehind?
Lookahead (?=...) asserts that what follows matches without consuming it; lookbehind (?<=...) does the same for what precedes. Use lookahead when the trailing context determines whether to match, lookbehind when the leading context does. JavaScript has supported lookahead since ES3 (1999) and lookbehind since ES2018, and all modern browsers support both. Safari versions before 16.4 lacked lookbehind support.
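A short sketch of both assertions on an invented price string. Neither assertion consumes text, so only the digits end up in the match.

```javascript
// Hypothetical input mixing prefix and suffix currency codes.
const priceText = "USD 100, EUR 200, 300 USD";

// Lookahead: digits that are followed by " USD" (trailing context decides).
const followedByUsd = priceText.match(/\d+(?= USD)/g);

// Lookbehind: digits that are preceded by "EUR " (leading context decides).
const precededByEur = priceText.match(/(?<=EUR )\d+/g);

console.log(followedByUsd); // ["300"]
console.log(precededByEur); // ["200"]
```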
Is lookbehind supported in all browsers?
Lookbehind (positive and negative) is supported in Chrome since version 62 (2017), Firefox since 78 (2020), Edge since 79 (2020), and Safari since 16.4 (2023). If your audience may use older Safari, avoid lookbehind and rewrite the pattern with a capture group instead; new regex syntax cannot be polyfilled. For Node.js, lookbehind has been supported since 10.0.
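One way to cope with older browsers is to detect support at runtime and fall back to a capture group. This is a sketch; the function names are hypothetical, and the pattern is built with the RegExp constructor so that an unsupported engine throws at call time instead of failing to parse the whole script.

```javascript
// Try to compile a lookbehind; unsupported engines throw a SyntaxError.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch {
    return false;
  }
}

// Extract the digits after a dollar sign, with a capture-group fallback.
function amountAfterDollar(text) {
  if (supportsLookbehind()) {
    const m = text.match(new RegExp("(?<=\\$)\\d+"));
    return m ? m[0] : null;
  }
  const m = text.match(/\$(\d+)/); // same result via group 1
  return m ? m[1] : null;
}

console.log(amountAfterDollar("cost: $42")); // "42"
```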
What does the Unicode (u) flag do?
The u flag enables Unicode mode: surrogate pairs are treated as a single character, \u{...} escapes work, and \p{...} property escapes become available. Without u, an emoji like the smiling face counts as two code units and . matches only the first half. Always set u when working with text beyond ASCII.
How fast is the regex engine?
V8's RegExp engine, Irregexp, compiles patterns to native code. For simple patterns it matches millions of characters per second. Pathological patterns (nested quantifiers on adversarial input) can blow up to exponential time, which is why ReDoS is a real attack vector. JavaScript's RegExp has no built-in timeout, so do not count on the engine to abort a runaway match; avoid risky patterns instead.
How do JavaScript and Python regex differ?
Named groups use different syntax ((?<name>...) in JS, (?P<name>...) in Python). Python lacks the y (sticky) flag; JavaScript lacks Python's verbose mode. Python supports recursion via the third-party regex module but not the built-in re. Character class shorthand also differs: in JavaScript \d means [0-9] and \w means [A-Za-z0-9_] even with the u flag, while in Python 3 both match Unicode digits and word characters on str patterns unless the re.ASCII flag is set.
Can I use AI to generate regex instead?
LLMs are good at proposing initial regex patterns but routinely produce subtly wrong output (greedy where lazy was needed, missing escapes, wrong flags). Use AI for first drafts, then validate by running the regex against real samples in this tester. Iterate until the highlights match exactly what you expect. The interactive feedback loop catches LLM mistakes before they ship to production.