How to Compare Text and Find Differences
Finding what changed between two versions of a document, config file, or piece of code is a common task. Reading both versions and spotting differences manually is slow and unreliable, especially with long texts. A diff checker does it instantly and highlights every change. The same family of line-comparison algorithms behind Git's commit history, GitHub's pull-request review interface, and the Unix diff command also underlies most visual diff tools.
How to compare text
- Paste both versions: enter the original text on the left and the modified text on the right.
- Review the highlights: added lines are shown in green, removed lines in red. Modified lines show both the old and new versions.
- Export or copy: copy the diff results or download a report.
The comparison happens as soon as both panes have text. There is no Compare button to click; edits to either side re-run the diff in real time, which is useful when you are iterating on a fix and want to see the effect immediately.
Reading a diff
Diff output uses a simple color system:
- Green (added): lines that exist in the new version but not the old
- Red (removed): lines that existed in the old version but are gone from the new
- Unchanged: lines that are identical in both versions
This is the same convention used by Git, GitHub, GitLab, Bitbucket, and most other version control and code review tools. The colors sit on top of older plain-text markers: the classic Unix diff output marks lines with < and >, and unified diffs mark them with - and +. Modern tools sometimes add yellow or orange for "changed" (a line that exists in both but is different), but green for additions and red for deletions remain the most widely recognized convention.
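The three categories above can be reproduced with Python's standard difflib module. This is only a sketch of the idea, not the code behind any particular diff checker; the function name classify_lines and the sample config text are made up for illustration.

```python
import difflib


def classify_lines(old_text: str, new_text: str) -> list[tuple[str, str]]:
    """Label each line as 'added', 'removed', or 'unchanged'."""
    labels = {"+ ": "added", "- ": "removed", "  ": "unchanged"}
    result = []
    for entry in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        prefix, line = entry[:2], entry[2:]
        # ndiff may also emit "? " hint lines; they are skipped here.
        if prefix in labels:
            result.append((labels[prefix], line))
    return result


old = "host = localhost\nport = 8080\ndebug = true"
new = "host = localhost\nport = 9090\ndebug = true\ntimeout = 30"

for label, line in classify_lines(old, new):
    print(f"{label:>9}: {line}")
```

A visual tool would map "added" to green and "removed" to red; the labels are the same information without the color.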
A short history of diff
The original Unix diff was written by Douglas McIlroy and James Hunt at Bell Labs, and the algorithm behind it was published in 1976. diff has been part of Unix and Unix-like operating systems ever since. Eugene Myers's later refinement (the O(ND) algorithm, 1986) made diff fast enough for interactive use, and it is the basis of many modern implementations, including the default algorithm in Git's diff.
The visual side-by-side layout is older than the software: it resembles manual proofreading conventions in publishing (two columns of text with changes marked in the margin). Early diff tools automated what editors had long been doing on paper. The "unified diff" format (the one with --- and +++ headers that you see in patch files) was introduced in 1990, was adopted by GNU diff soon after, and is now the de facto standard for sharing changes as text.
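To see what the unified format looks like, here is a small sketch using difflib.unified_diff from the Python standard library. The file names old.txt and new.txt are placeholders chosen for the example; no files are read.

```python
import difflib

old_lines = ["alpha", "beta", "gamma"]
new_lines = ["alpha", "BETA", "gamma"]

# lineterm="" because the input lines carry no trailing newlines.
patch = list(
    difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="old.txt",  # placeholder label, not a real file
        tofile="new.txt",    # placeholder label, not a real file
        lineterm="",
    )
)
print("\n".join(patch))
```

The first two lines of the output are the --- and +++ headers, followed by a hunk header starting with @@ and then the lines themselves, prefixed with a space (unchanged), - (removed), or + (added).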
When diff checking is useful
- Code review: compare your changes against the original before committing to see exactly what you modified
- Document revisions: find what changed between two versions of a contract, article, or policy
- Configuration debugging: compare a working config file against a broken one to spot the difference
- Data validation: check if two data exports are identical or find where they diverge
- Merge conflicts: understand both sides of a conflict before resolving it
- Translation review: compare an original document against a translation to make sure no sections were skipped
- Email or message comparison: when someone says "I sent you the corrected version," diff both messages to see what actually changed
- Database export validation: compare two CSV exports from a database to confirm that an ETL run produced identical output
Line-based vs character-based diff
The diff checker uses line-based comparison, which means it treats each line as the smallest unit of difference. If you change a single word on a line, the entire line is shown as changed (the old line in red, the new line in green) and you have to spot the word-level difference yourself.
Line-based diff is the standard for code and configuration files because those are typically line-oriented (one statement per line, one config option per line). It is fast, predictable, and matches how Git and most code review tools work by default.
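To make "each line is the smallest unit" concrete, the sketch below diffs two lists of lines with a textbook longest-common-subsequence table. This is deliberately the simple quadratic approach, not Myers's O(ND) algorithm and not necessarily what any given diff checker runs; lcs_line_diff is a hypothetical name.

```python
def lcs_line_diff(old: list[str], new: list[str]) -> list[tuple[str, str]]:
    """Return (marker, line) pairs: ' ' unchanged, '-' removed, '+' added."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    ops = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:  # whole lines are compared, never parts of lines
            ops.append((" ", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("-", old[i]))
            i += 1
        else:
            ops.append(("+", new[j]))
            j += 1
    ops.extend(("-", line) for line in old[i:])
    ops.extend(("+", line) for line in new[j:])
    return ops


for marker, line in lcs_line_diff(
    ["timeout = 30", "retries = 3"],
    ["timeout = 60", "retries = 3"],
):
    print(marker, line)
```

Only one number changed, yet the whole "timeout" line is reported once as removed and once as added, because the comparison never looks inside a line.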
For prose comparison where line-level changes are too coarse, some tools offer word-level or character-level diff that highlights just the changed words within a line. That is more precise but harder to read for code. If you need word-level diff, look for a tool specifically labeled "word diff" or "intra-line diff."
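A word-level comparison can be sketched by splitting each line into words and diffing the word lists with difflib.SequenceMatcher. The function name word_diff is hypothetical, and the [-...-] and {+...+} markers are modelled on the plain style of git diff --word-diff; real tools differ in how they tokenize and render.

```python
import difflib


def word_diff(old_line: str, new_line: str) -> str:
    """Mark removed words as [-...-] and added words as {+...+}."""
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


print(word_diff("The quick brown fox", "The quick red fox"))
```

Splitting on whitespace discards the original spacing, which is usually acceptable for prose but is one reason this style is less suited to code.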
Privacy and confidential content
The diff checker runs entirely in your browser. Both pieces of text stay on your device; nothing is uploaded. This matters because the text you most want to diff is often confidential: contracts under negotiation, draft press releases, internal policy documents, source code under an NDA. Some online diff and merge tools send both texts to a third-party server for processing, which is precisely what you want to avoid for sensitive content. A diff that runs locally in the browser has none of that exposure.
The session is also stateless: nothing persists after you close the tab. If you need to keep a record of the diff, copy the output or take a screenshot before navigating away.
Common pitfalls
- Whitespace noise: trailing spaces, mixed tabs and spaces, and different line endings (LF on Unix vs CRLF on Windows) often show up as "changes" even when the visible text is identical. Most diff tools have an "ignore whitespace" toggle for this case.
- Line ending mismatches: Windows line endings (CRLF) vs Unix line endings (LF) make every line appear as changed. If you are diffing files from different operating systems, normalize line endings first.
- Encoding differences: text in UTF-8 vs UTF-16 vs Windows-1252 may look identical but compare as completely different. Normalize encoding to UTF-8 before diffing.
- Reordered identical content: if you cut a paragraph from page 3 and pasted it to page 1, the diff shows the paragraph as removed-from-page-3 and added-to-page-1 even though the content is unchanged. Some tools offer "moved block detection" to handle this; basic diff does not.
- Large file performance: comparing files with over 10,000 lines can slow the browser. For very large diffs, use the command-line diff or a desktop tool like Beyond Compare.
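The whitespace and line-ending pitfalls can be handled by normalizing both texts before comparing them. The sketch below is one possible approach, assuming the texts are already decoded to Python strings; the normalize helper and the sample strings are made up for the example, and it does not address tabs versus spaces or encoding differences.

```python
import difflib


def normalize(text: str) -> list[str]:
    """Convert CRLF and bare CR to LF, then strip trailing whitespace per line."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return [line.rstrip() for line in unified.split("\n")]


windows_copy = "name = demo  \r\nport = 8080\r\n"  # CRLF endings, trailing spaces
unix_copy = "name = demo\nport = 8080\n"           # LF endings

# Without normalization, invisible characters register as changes.
raw = list(
    difflib.unified_diff(
        windows_copy.splitlines(keepends=True),
        unix_copy.splitlines(keepends=True),
    )
)

# After normalization the two texts compare as identical.
clean = list(
    difflib.unified_diff(normalize(windows_copy), normalize(unix_copy), lineterm="")
)

print("raw diff lines:", len(raw))
print("clean diff lines:", len(clean))
```

Normalizing discards information, so it is best done on copies used only for the comparison, not on the files themselves.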
Tips
- Paste clean text: remove headers, footers, or metadata that you do not want to compare. Extra noise makes real differences harder to spot.
- Use side-by-side view: seeing both versions next to each other with aligned line numbers makes differences easier to trace than an inline view.
- Check for whitespace: sometimes "identical" text has invisible differences like trailing spaces, different line endings (LF vs CRLF), or tabs vs spaces. The diff checker catches these.
- Normalize first for prose: for natural language comparison, run both texts through a whitespace normalizer or paste into a plain editor before diffing. That avoids spurious differences from formatting carried over from Word or PDF.
- Save the diff if you need a record: copy the highlighted output or take a screenshot. The diff is not persisted automatically.
- Works offline: once the page loads, comparisons run locally in your browser with no internet needed.
Frequently Asked Questions
Does the diff checker compare character by character?
It compares line by line, the same approach used by Git and most professional diff tools. If any character on a line changes, the entire line is highlighted as changed.
Is there a size limit?
There is no hard limit, but very large texts (over 10,000 lines) may take a moment to process since the comparison runs entirely in your browser.
Can I compare code files?
Yes. The diff checker works with any text, including source code. Syntax highlighting helps you read code diffs more easily.
Is my text sent to a server?
No. The comparison happens in your browser. Your text never leaves your device.