Text Diff Checker
Compare two texts and find the differences instantly.
About text comparison
This diff tool uses the Longest Common Subsequence (LCS) algorithm to compare two texts line by line. Additions are highlighted in green and deletions in red.
Common uses
- Compare code changes before committing
- Review revised documents or contracts
- Check differences between translations
- Verify configuration file changes
- Compare API responses
Frequently asked questions
Does it compare character by character?
This tool compares line by line. If a line contains any change (even a single character), the whole line is marked as modified. This is the approach used by Git and most diff tools.
Is there a size limit?
There is no hard limit, but very large texts (more than 10,000 lines) may take a moment to process, because the comparison runs entirely in your browser.
What "diff" actually means
A diff describes how to turn one text into another using insertions and deletions. There are many ways to do that; the trivial one is "delete every line of A, then insert every line of B." A useful diff is the shortest edit script, and finding it is the dual of the Longest Common Subsequence problem: find the longest sequence of lines that appears (in order) in both texts, and everything else is an addition or a deletion. Most diff algorithms are LCS algorithms wearing a different hat.
A subtle point: the longest common subsequence is not the longest common substring. A subsequence preserves order but not adjacency, so "ABCDE" and "AXBYCZD" share the subsequence "ABCD" even though no contiguous run of even two characters appears in both.
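The relationship between LCS and the edit script can be sketched in a few lines of Python. This is a minimal textbook dynamic-programming version, not necessarily the implementation this page uses; the function names lcs_table and line_diff are made up for illustration.

```python
def lcs_table(a, b):
    """Build the dynamic-programming table of LCS lengths.

    table[i][j] holds the LCS length of a[i:] and b[j:].
    """
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def line_diff(a, b):
    """Return (tag, item) pairs: ' ' for kept, '-' for deleted, '+' for added."""
    table = lcs_table(a, b)
    i = j = 0
    script = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            script.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            script.append(("-", a[i]))  # dropping a[i] keeps the LCS intact
            i += 1
        else:
            script.append(("+", b[j]))  # b[j] is not part of the LCS
            j += 1
    # Whatever is left on either side is pure deletion or pure insertion.
    script.extend(("-", item) for item in a[i:])
    script.extend(("+", item) for item in b[j:])
    return script


# Characters stand in for lines here, to match the example in the text.
script = line_diff(list("ABCDE"), list("AXBYCZD"))
common = "".join(item for tag, item in script if tag == " ")
```

Everything tagged " " is the common subsequence; everything else is the edit script. The table costs O(N·M) time and memory, which is why production tools prefer smarter algorithms.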
A short history of diff algorithms
The first widely-used file-comparison program was the Unix diff written by Douglas McIlroy at Bell Labs, with the underlying algorithm published by James W. Hunt and McIlroy as Bell Labs Computing Science Technical Report #41 in June 1976. McIlroy himself described the four-month development as "a desperate effort." The Hunt–McIlroy approach hashed every line of file B, indexed where each unique line occurred, and looked for the longest monotonically increasing chain of matches. It shipped in Version 5 Unix and remained essentially the same algorithm in /usr/bin/diff on most Unix systems for decades.
Eugene W. Myers published what became the modern industry standard in 1986: "An O(ND) Difference Algorithm and Its Variations" in Algorithmica Vol. 1, No. 2. Myers reframed LCS as a shortest-path problem through an edit graph: place A along the top axis and B along the side, draw a diagonal whenever two lines match, and search for the path from top-left to bottom-right with the fewest non-diagonal edges. Each non-diagonal edge is one insertion or deletion; each diagonal is free. Crucially, the algorithm runs in O((N+M)·D) time, where N and M are the lengths of the two inputs and D is the size of the edit script, so it gets faster the more similar the files are. For typical version-control diffs D is tiny compared to N+M, so in practice Myers is dramatically faster than the O(N·M) dynamic-programming approach. His paper also includes a linear-space refinement using a divide-and-conquer "middle snake" trick. Myers' algorithm is the default behind GNU diff, git diff, jsdiff, and many GUI diff viewers. Python's difflib is a notable exception: its SequenceMatcher uses a Ratcliff/Obershelp-style "gestalt pattern matching" approach instead.
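The greedy core of the edit-graph search can be sketched as follows. This version only returns D, the length of the shortest edit script; it omits path recovery and the linear-space "middle snake" refinement, and the function name is an assumption made for illustration.

```python
def myers_edit_distance(a, b):
    """Length D of the shortest insert/delete script turning a into b.

    x indexes a (the top axis), y indexes b (the side axis), and
    diagonal k is the set of points where x - y == k.
    """
    n, m = len(a), len(b)
    # furthest[k] = largest x reached so far on diagonal k.
    furthest = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]      # step down: insert an item of b
            else:
                x = furthest[k - 1] + 1  # step right: delete an item of a
            y = x - k
            # Follow the free diagonal ("snake") while items match.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return n + m


distance = myers_edit_distance("ABCABBA", "CBABAC")
```

The outer loop runs D + 1 times and the inner loop touches at most D + 1 diagonals, which is where the dependence on D rather than on N·M comes from.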
Bram Cohen (later better known as the inventor of BitTorrent) designed patience diff for the Bazaar version-control system around 2002. The motivating problem: Myers diff aligns any matching line, including very common ones, so moving a function in C can produce a noisy diff in which the closing brace of the moved function aligns with the closing brace of an unrelated function. Patience diff anchors the diff on lines that occur exactly once in each file (typically function signatures, comment headers, distinctive log messages), computes the LCS of just those anchors using patience sort, and recurses on the gaps. The result aligns on meaningful lines and reads better for source code. Git supports it via git diff --patience. Histogram diff is a refinement added to git around 2010 that builds a histogram of line occurrences and prefers low-frequency anchors rather than strictly unique ones. Git's default algorithm is still Myers; histogram can be selected per command with git diff --histogram or persistently through the diff.algorithm setting.
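The anchor-selection step of patience diff can be sketched like this. It covers only the first stage (unique lines plus a patience-sort longest increasing subsequence); the recursion into the gaps between anchors is left out, and both function names are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def unique_common_lines(a, b):
    """Pairs (i, j) where a[i] == b[j] and the line is unique in both files."""
    count_a, count_b = Counter(a), Counter(b)
    index_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    return [(i, index_b[line]) for i, line in enumerate(a)
            if count_a[line] == 1 and line in index_b]


def patience_anchors(a, b):
    """Longest chain of unique common lines that keeps order in both files."""
    pairs = unique_common_lines(a, b)   # already sorted by position in a
    pile_tops = []                      # j-value on top of each pile
    pile_top_index = []                 # index into pairs for each pile top
    previous = [None] * len(pairs)      # back-pointers for reconstruction
    for idx, (_, j) in enumerate(pairs):
        pile = bisect_left(pile_tops, j)
        previous[idx] = pile_top_index[pile - 1] if pile > 0 else None
        if pile == len(pile_tops):
            pile_tops.append(j)
            pile_top_index.append(idx)
        else:
            pile_tops[pile] = j
            pile_top_index[pile] = idx
    # Walk the back-pointers from the top of the last pile.
    anchors = []
    idx = pile_top_index[-1] if pile_top_index else None
    while idx is not None:
        anchors.append(pairs[idx])
        idx = previous[idx]
    anchors.reverse()
    return anchors


old = ["import os", "", "def main():", "    run()", "", "main()"]
new = ["import os", "import sys", "", "def main():", "    run(sys.argv)", "", "main()"]
anchors = patience_anchors(old, new)
```

The blank lines appear twice in each file, so they are never chosen as anchors; the signature line and the final call are. A full patience diff would then diff the regions between consecutive anchors recursively.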
Levenshtein vs LCS
Levenshtein distance is a related concept worth distinguishing from LCS. Levenshtein distance (Vladimir Levenshtein, 1965) counts the minimum number of single-character insertions, deletions, and substitutions to turn one string into another. LCS-based diff is closely related but slightly different: classic diff treats a "change" as a delete-then-insert pair rather than a single substitution, because that maps better onto line-oriented files. Levenshtein answers "how different are these strings?" and gives you a number; LCS answers "what specifically changed?" and gives you an actual edit script. Spell-checkers and "did you mean?" features want the first; diff tools want the second. The Damerau-Levenshtein variant (1964) extends Levenshtein with a transposition operator for typo detection.
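The difference between the two cost models shows up as soon as a substitution is involved. The sketch below computes both distances with the usual row-by-row dynamic programme; indel_distance is an illustrative name for the insert/delete-only distance that LCS-based diff minimises.

```python
def levenshtein(s, t):
    """Minimum insertions, deletions and substitutions to turn s into t."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete cs
                           cur[j - 1] + 1,             # insert ct
                           prev[j - 1] + (cs != ct)))  # substitute or keep
        prev = cur
    return prev[-1]


def indel_distance(s, t):
    """Minimum insertions and deletions only (no substitution operator)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            if cs == ct:
                cur.append(prev[j - 1])                  # keep the character
            else:
                cur.append(min(prev[j], cur[j - 1]) + 1)  # delete or insert
        prev = cur
    return prev[-1]


lev = levenshtein("kitten", "sitting")       # 3
indel = indel_distance("kitten", "sitting")  # 5
```

Each substitution that Levenshtein counts once costs two operations in the insert/delete model, which is why a one-character change in a diff appears as a removed line plus an added line.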
Granularity: line, word, and character
Diff can be computed at different granularities, each with trade-offs:
- Line-level: each line is one token; two lines either match exactly or don't. Fast, maps cleanly onto how programmers edit code, produces concise patches that patch(1) can apply. Cost: reformatting one character in a 200-character line marks the whole line as changed. The default for diff, git, SVN, and this tool.
- Word-level: each whitespace-delimited token is a unit, so a line that changed one word shows just that word highlighted. Excellent for prose, contracts, blog drafts. Used by git diff --word-diff, Google Docs "Suggesting" mode, Microsoft Word "Track Changes," Diffchecker's compare-words mode.
- Character-level: each Unicode code point is a unit. Catches single-character typos, ideal for legal redlining where one word can flip meaning. Slow on large inputs and sometimes unreadable for long edits.
A common pattern in modern diff UIs is to compute the diff at line level first, then for any pair of "removed/added neighbours" run a secondary word-level diff on just those two lines and highlight the intra-line changes. GitHub's PR view and the line-mode speedup in the diff-match-patch library both work broadly this way.
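The two-pass pattern can be sketched with Python's standard difflib module. This is an illustration only: difflib's matcher is not Myers, the pairing rule (only replaced blocks with equal line counts) is a simplifying assumption, and refine_changed_lines is a made-up name.

```python
import difflib


def refine_changed_lines(old_lines, new_lines):
    """Line-level pass first, then a word-level pass on replaced line pairs.

    Returns a list of (old_line, new_line, word_changes) tuples, where
    word_changes lists the non-equal word-level opcodes for that pair.
    """
    result = []
    line_matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in line_matcher.get_opcodes():
        # Simplification: only pair lines up when a replaced block has
        # the same number of lines on both sides.
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            continue
        for old, new in zip(old_lines[i1:i2], new_lines[j1:j2]):
            old_words, new_words = old.split(), new.split()
            word_matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
            changes = [(wtag, old_words[a1:a2], new_words[b1:b2])
                       for wtag, a1, a2, b1, b2 in word_matcher.get_opcodes()
                       if wtag != "equal"]
            result.append((old, new, changes))
    return result


old = ["The quick brown fox", "jumps over", "the lazy dog"]
new = ["The quick red fox", "jumps over", "the lazy dog"]
refined = refine_changed_lines(old, new)
```

The second pass only ever sees the handful of lines that actually changed, so its extra cost stays small even when the files are large.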
Three diff display modes you'll see
- Unified diff (the format of every .patch file): hunks are introduced by @@ -oldStart,oldLen +newStart,newLen @@, with three lines of context above and below by default. Originated with Wayne Davison's GNU diff -u flag around 1990. Compact and machine-readable by patch(1), which is why git, kernel-style email patches and SVN all use it.
- Side-by-side: original on the left, modified on the right, aligned line by line. diff -y produces a primitive ASCII version; Beyond Compare, Meld, WinMerge and the "Side by Side" view above all use this format. Best for visual review and proofreading prose.
- Inline (stacked): a single column where deleted and added lines are interleaved. GitHub's PR view defaults to this. The most space-efficient format for narrow screens and small edits.
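For a concrete look at the unified format, Python's standard difflib can generate one. The file names a.txt and b.txt are placeholders, and make_patch is an illustrative wrapper rather than anything this page uses.

```python
import difflib


def make_patch(old_lines, new_lines, context=3):
    """Return the unified diff of two line lists as a list of strings."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="a.txt", tofile="b.txt",
        n=context,     # lines of context around each hunk
        lineterm="",   # inputs have no trailing newlines, so add none
    ))


patch = make_patch(["alpha", "beta", "gamma"], ["alpha", "BETA", "gamma"])
# patch is:
# --- a.txt
# +++ b.txt
# @@ -1,3 +1,3 @@
#  alpha
# -beta
# +BETA
#  gamma
```

The hunk header reads "starting at line 1, 3 lines in the old file; starting at line 1, 3 lines in the new file," and each body line carries a one-character prefix: space for context, minus for removed, plus for added.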
Useful git diff variants
- git diff: index vs working tree.
- git diff --staged: HEAD vs index.
- git diff main..feature: compares the two branch tips.
- git diff main...feature: compares feature against the merge base, which is what the PR will land.
- git diff --word-diff and --color-words: markup-friendly and colour-tinted word-level diffs.
- git diff --no-index a.txt b.txt: works on files outside any git repo. Handy as a poor-man's standalone diff with the patience or histogram algorithms.
- git diff -w: ignore all whitespace (matches the "Ignore whitespace" checkbox above).
- git diff --histogram: switch algorithm per command.
When you'd reach for a browser diff
- Code review. Paste two versions of a file, scan for what changed, decide whether the change makes sense.
- Reviewing AI-edited code. Diff your original against an LLM's refactored version to spot drift or hallucinations.
- Legal contract redlining. Confirm that only the agreed clauses changed between version 3 and version 4.
- Translation QA. Compare two French translations of the same paragraph, or diff previous and updated localisation files when only a handful of strings should have changed.
- Configuration drift. Compare nginx.conf from prod against staging.
- SQL schema comparison. Dump CREATE TABLE statements from two databases and diff them to find missing indexes or column-type drift.
- Spec drift. Diff two versions of an OpenAPI / Swagger YAML to verify breaking changes are documented.
- Markdown blog drafts. Spot what a co-author edited.
- Compliance audits. Diff a privacy policy month-over-month for regulators.
- Lock-file forensics. Diff package-lock.json or yarn.lock to understand a dependency upgrade.
Honest limitations
- Plain text diff has no semantic awareness. Renaming foo to bar across a file looks identical in cost to a totally unrelated rewrite. Tools like difftastic parse the language's AST and produce structural diffs ("function added," "argument reordered"). This page does not; it's a plain LCS line diff.
- Whitespace and line-endings cause spurious diffs. CRLF vs LF, trailing whitespace, tabs vs spaces. The "Ignore whitespace" toggle above maps to git's -w.
- Binary files don't fit. Diff is a text concept. For binary diff there are tools like bsdiff, xdelta, or VBinDiff.
- Order sensitivity. Line diff assumes the documents were edited line by line. For unsorted CSVs or JSON object key reorderings, you'd want to sort or canonicalise first.
- Move detection. Classic LCS diff doesn't recognise that a function was moved within a file; it shows the move as a delete plus an insert. Some tools attempt move detection (git itself can highlight moved blocks with git diff --color-moved); this page does not.
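Two of the limitations above, line-ending noise and key-order noise, can be reduced by normalising the inputs before they are compared. The helpers below are a sketch under stated assumptions: the names are hypothetical, and normalise_text is deliberately milder than git's -w because it only unifies line endings and trims trailing whitespace.

```python
import json


def normalise_text(text):
    """Unify CRLF/CR line endings to LF and strip trailing whitespace."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return [line.rstrip() for line in lines]


def canonical_json(text):
    """Re-serialise JSON with sorted keys so key order cannot cause diffs."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).split("\n")


windows = "server {  \r\n  listen 80;\r\n}\r\n"
unix = "server {\n  listen 80;\n}\n"
same_text = normalise_text(windows) == normalise_text(unix)

same_json = canonical_json('{"b": 1, "a": 2}') == canonical_json('{"a": 2, "b": 1}')
```

Both comparisons come out equal, so a line diff of the normalised forms would show no changes. Pasting the canonicalised output into a diff tool then surfaces only the differences that matter.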
More questions
Why doesn't this page show character-level highlights inside changed lines?
Because the line-level pass is enough for most use cases and runs much faster on long inputs. Two-step refinement (a word-level diff inside each changed line) is standard practice but adds latency. If intra-line differences matter to you (legal redlining, prose proofreading), a dedicated word-level tool, such as git diff --word-diff on the command line, is a better fit.
What's the difference between unified diff and side-by-side?
Unified diff is the compact, machine-readable format every .patch file uses: three lines of context, with hunks marked by @@. Side-by-side displays the two versions in parallel columns, which is easier on the eye for visual review but takes more horizontal space. Unified is what you'd email or commit; side-by-side is what you'd read on a wide monitor.
Are there algorithms better than Myers for source code?
For source code with moved functions or large rearrangements, yes. Patience diff (Bram Cohen, 2002) and histogram diff (git, around 2010) are designed to align on meaningful unique or rare lines rather than common ones, producing more readable output. Both are available via git diff --patience and git diff --histogram. AST-aware tools like difftastic go further and parse the actual language structure.
Is it safe to paste confidential text here?
Yes. The diff runs entirely in your browser using a JavaScript LCS implementation. Nothing is uploaded, no analytics run on the input, and no server-side log of the text is kept. Tools that run entirely on your own machine, such as this page or a command-line diff, are the only category that's safe for NDAs, internal source code, or unreleased contract clauses; cloud-based diff services see whatever you paste.