Text Diff Checker

Compare two texts and find the differences instantly.

Your data never leaves your device

About text comparison

This diff tool uses the Longest Common Subsequence (LCS) algorithm to compare two texts line by line. Additions are highlighted in green and deletions in red.

Common uses

Frequently asked questions

Does it compare character by character?

This tool compares line by line. If a line contains any change (even a single character), the whole line is marked as modified. This is the approach used by Git and most diff tools.

Is there a size limit?

There is no hard limit, but very large texts (more than 10,000 lines) may take a moment to process, because the comparison runs entirely in your browser.

What "diff" actually means

A diff describes how to turn one text into another using insertions and deletions. There are many ways to do that; the trivial one is "delete every character of A and then insert every character of B." A useful diff is the shortest edit script, which is the dual of the Longest Common Subsequence problem: find the longest sequence of lines that appears (in order) in both texts, and everything else is an addition or a deletion. Most diff algorithms are LCS algorithms wearing a different hat.
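To make the LCS/edit-script duality concrete, here is a minimal Python sketch of a line diff built on the classic quadratic LCS table. The function name lcs_diff and the (marker, line) output shape are illustrative assumptions, not this page's actual implementation.

```python
def lcs_diff(a, b):
    """Return a list of (marker, line) pairs: ' ' kept, '-' removed, '+' added."""
    n, m = len(a), len(b)
    # lcs[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the top-left; every line outside the LCS
    # becomes a deletion (from a) or an insertion (from b).
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    out.extend(("-", line) for line in a[i:])
    out.extend(("+", line) for line in b[j:])
    return out


if __name__ == "__main__":
    for marker, line in lcs_diff(["a", "b", "c"], ["a", "x", "c"]):
        print(marker, line)
```

The table costs O(N·M) time and memory, which is fine for a sketch but is exactly the cost the faster algorithms described further down are designed to avoid.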

A subtle point: the longest common subsequence is not the longest common substring. A subsequence preserves order but not adjacency, so "ABCDE" and "AXBYCZD" share the subsequence "ABCD" even though no contiguous run of three characters appears in both.
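The distinction can be checked directly. The following Python sketch computes both quantities for the two strings above; the function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence (order kept, gaps allowed)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def longest_common_substring_length(a, b):
    """Length of the longest contiguous run shared by both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # A match extends the run that ended one step earlier in both strings.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    print(lcs_length("ABCDE", "AXBYCZD"))                       # subsequence "ABCD"
    print(longest_common_substring_length("ABCDE", "AXBYCZD"))  # single characters only
```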

A short history of diff algorithms

The first widely used file-comparison program was the Unix diff written by Douglas McIlroy at Bell Labs, with the underlying algorithm published by James W. Hunt and McIlroy as Bell Labs Computing Science Technical Report #41 in June 1976. McIlroy himself described the four-month development as "a desperate effort." The Hunt–McIlroy approach hashed every line of file B, indexed where each unique line occurred, and looked for the longest monotonically increasing chain of matches. It first shipped in Version 5 Unix (1974) and remained essentially the same algorithm in /usr/bin/diff on most Unix systems for decades.

Eugene W. Myers published the modern industry standard in 1986, "An O(ND) Difference Algorithm and Its Variations" in Algorithmica Vol. 1, No. 2. Myers reframed LCS as a shortest-path problem through an edit graph: place A along the top axis and B along the side, draw a diagonal whenever two lines match, and search for the path from top-left to bottom-right with the fewest non-diagonal edges. Each non-diagonal edge is one insertion or deletion; each diagonal is free. Crucially, the algorithm runs in O((N+M)·D) time, where D is the size of the edit script, so it gets faster the more similar the files are. For typical version-control diffs D is tiny compared to N+M, so in practice Myers is dramatically faster than the O(N·M) dynamic-programming approach. His paper also includes a linear-space refinement using a divide-and-conquer "middle snake" trick. Myers' algorithm, or a close variant of it, backs GNU diff, git diff, jsdiff, and many GUI diff viewers. Python's difflib is a notable exception: its SequenceMatcher uses a different, Ratcliff/Obershelp-style matching heuristic.
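The core of the greedy search fits in a few lines. The sketch below, in Python, computes only D (the length of the shortest edit script) and omits both path recovery and the linear-space refinement; the function name myers_distance is an assumption for illustration.

```python
def myers_distance(a, b):
    """Size D of the shortest insert/delete script turning a into b.

    Greedy O((N+M)*D) search from Myers' 1986 paper, distance only.
    """
    n, m = len(a), len(b)
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # step down: insert a line from b
            else:
                x = v[k - 1] + 1      # step right: delete a line from a
            y = x - k
            # Follow the "snake": matching lines are free diagonal moves.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m  # not reached: d = n + m always suffices


if __name__ == "__main__":
    print(myers_distance("ABCABBA", "CBABAC"))  # the example pair from Myers' paper
```

Because the outer loop stops at the first d that reaches the bottom-right corner, nearly identical inputs finish after only a handful of iterations regardless of their length.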

Bram Cohen (later better known as the inventor of BitTorrent) designed patience diff for the Bazaar version-control system around 2002. The motivating problem: Myers diff aligns any matching line, including very common ones, so moving a function in C can produce a noisy diff in which the closing brace of the moved function aligns with the closing brace of an unrelated function. Patience diff anchors the diff on lines that occur exactly once in each file (typically function signatures, comment headers, distinctive log messages), computes the LCS of just those anchors using patience sort, and recurses on the gaps. The result aligns on meaningful lines and reads better for source code. Git supports it via git diff --patience. Histogram diff is a refinement added to git around 2010 that builds a histogram of line occurrences and prefers low-frequency anchors rather than strictly unique ones. Myers remains git's default, but histogram can be selected per command with git diff --histogram or persistently by setting diff.algorithm to histogram.
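The anchoring step of patience diff can be sketched as follows in Python. This covers only the selection of unique-line anchors and their longest increasing chain (found by patience sorting); the recursion into the gaps between anchors is left out, and the name patience_anchors is illustrative.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(a, b):
    """Return (index_in_a, index_in_b) pairs for the anchor lines.

    Anchors are lines that occur exactly once in each input; the longest
    chain of them that is increasing in both files is kept.
    """
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Patience sorting: tails[k] is the smallest b-index that ends an
    # increasing chain of length k + 1; tail_idx[k] is that pair's index.
    tails, tail_idx = [], []
    prev = [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left(tails, j)
        if k == len(tails):
            tails.append(j)
            tail_idx.append(idx)
        else:
            tails[k] = j
            tail_idx[k] = idx
        prev[idx] = tail_idx[k - 1] if k > 0 else None

    # Follow back-pointers from the end of the longest chain.
    chain = []
    idx = tail_idx[-1] if tail_idx else None
    while idx is not None:
        chain.append(pairs[idx])
        idx = prev[idx]
    return chain[::-1]


if __name__ == "__main__":
    old = ["int a;", "}", "int b;", "}", "int c;"]
    new = ["int a;", "}", "int x;", "}", "int c;"]
    print(patience_anchors(old, new))  # the repeated "}" lines are never anchors
```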

Levenshtein vs LCS

A related concept worth distinguishing. Levenshtein distance (Vladimir Levenshtein, 1965) counts the minimum number of single-character insertions, deletions, and substitutions to turn one string into another. LCS-based diff is closely related but slightly different: classic diff treats a "change" as a delete-then-insert pair rather than a single substitution, because that maps better onto line-oriented files. Levenshtein answers "how different are these strings?" and gives you a number; LCS answers "what specifically changed?" and gives you an actual edit script. Spell-checkers and "did you mean?" features want the first; diff tools want the second. The Damerau-Levenshtein variant (1964) extends Levenshtein with a transposition operator for typo detection.
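The two distances differ only in whether a substitution is allowed as a single step. A small Python sketch (function names are illustrative) makes the gap visible:

```python
def levenshtein(a, b):
    """Minimum single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute, or match for free
        prev = cur
    return prev[-1]


def indel_distance(a, b):
    """Minimum insertions and deletions only, as in classic LCS-based diff."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1])                 # match is free
            else:
                cur.append(min(prev[j], cur[j - 1]) + 1)  # delete or insert
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))     # substitutions count once
    print(indel_distance("kitten", "sitting"))  # each substitution costs two
```

For "kitten" and "sitting" the first function returns 3 and the second returns 5, because the two substitutions each become a delete-then-insert pair.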

Granularity: line, word, and character

Diff can be computed at line, word, or character granularity, each with its own trade-offs.

A common pattern in modern diff UIs is to compute the diff at line level first, then, for each adjacent removed/added pair of lines, run a secondary word-level diff on just those two lines and highlight the intra-line changes. GitHub's PR view highlights intra-line changes in this way, and the diff-match-patch library offers a comparable two-pass mode: a quick line-level diff followed by a finer re-diff of the changed regions.
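The second pass of that pattern might look like the Python sketch below. It uses the standard library's difflib.SequenceMatcher as a stand-in matcher (not Myers, and not what GitHub or this page actually runs), and the function name intraline_changes is an assumption.

```python
import difflib


def intraline_changes(old_line, new_line):
    """Word-level refinement for one removed/added pair of lines.

    Returns (tag, old_words, new_words) triples, where tag is one of
    'equal', 'replace', 'delete', or 'insert'.
    """
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    return [(tag, old_words[i1:i2], new_words[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


if __name__ == "__main__":
    for change in intraline_changes("the quick brown fox", "the quick red fox"):
        print(change)
```

A UI would render the 'equal' spans normally and apply a stronger highlight to the words in the other spans, so the reader sees "brown" versus "red" rather than two entirely coloured lines.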

Three diff display modes you'll see

Useful git diff variants

When you'd reach for a browser diff

Honest limitations

More questions

Why doesn't this page show character-level highlights inside changed lines?

The line-level pass is enough for most use cases and runs much faster on long inputs. Two-step refinement, where changed lines are re-diffed word by word, is standard practice for intra-line highlighting, but it adds latency. If the exact change within a line matters to you (legal redlining, prose proofreading), a dedicated word-level tool (including git diff --word-diff on the command line) is a better fit.

What's the difference between unified diff and side-by-side?

Unified diff is the compact, machine-readable format that most .patch files use: by default three lines of context around each change, with every hunk introduced by an @@ header. Side-by-side displays the two versions in parallel columns, which is easier on the eye for visual review but takes more horizontal space. Unified is what you'd email or commit; side-by-side is what you'd read on a wide monitor.
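For a quick look at the unified format, Python's standard library can generate it. This is a sketch using difflib.unified_diff; the file names old.txt and new.txt are placeholders, and the wrapper name make_unified is illustrative.

```python
import difflib


def make_unified(old_lines, new_lines, context=3):
    """Return the unified diff of two lists of lines as a list of strings."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="old.txt", tofile="new.txt",  # placeholder file names
        n=context,     # lines of unchanged context around each change
        lineterm="",   # inputs here carry no trailing newlines
    ))


if __name__ == "__main__":
    for line in make_unified(["a", "b", "c"], ["a", "x", "c"]):
        print(line)
```

The output starts with the two file-header lines, then an @@ hunk header giving line ranges in the old and new text, followed by context lines prefixed with a space, removals prefixed with "-", and additions prefixed with "+".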

Are there algorithms better than Myers for source code?

For source code with moved functions or large rearrangements, yes. Patience diff (Bram Cohen, 2002) and histogram diff (git, around 2010) are designed to align on meaningful unique or rare lines rather than common ones, producing more readable output. They are available via git diff --patience and git diff --histogram respectively. AST-aware tools like difftastic go further and parse the actual language structure.

Is it safe to paste confidential text here?

Yes. The diff runs entirely in your browser using a JavaScript LCS implementation. Nothing is uploaded, no analytics are collected on the input, and no server-side log of the text is kept. Tools that run locally, like this one or a command-line diff, are the appropriate choice for NDAs, internal source code, or unreleased contract clauses; cloud-based diff services see whatever you paste.
