API  
API for Diff, Match and Patch Library.
Phase-Implementation, Featured
Updated Feb 18, 2011 by neil.fra...@gmail.com

Introduction

This library is available in multiple languages. Regardless of the language used, the interface for using it is the same. This page describes the API for the public functions. For further examples, see the relevant test harness.

Initialization

The first step is to create a new diff_match_patch object. This object contains various properties which set the behaviour of the algorithms, as well as the following methods/functions:

diff_main(text1, text2) => diffs

An array of differences is computed which describe the transformation of text1 into text2. Each difference is an array (JavaScript, Lua) or tuple (Python) or Diff object (C++, C#, Objective C, Java). The first element specifies if it is an insertion (1), a deletion (-1) or an equality (0). The second element specifies the affected text.
diff_main("Good dog", "Bad dog") => [(-1, "Goo"), (1, "Ba"), (0, "d dog")]
Despite the large number of optimisations used in this function, diff can take a while to compute. The diff_match_patch.Diff_Timeout property is available to set how many seconds any diff's exploration phase may take. The default value is 1.0. A value of 0 disables the timeout and lets diff run until completion. Should diff time out, the return value will still be a valid difference, though probably non-optimal.
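As a minimal sketch of how a difference list encodes both texts, the Python snippet below rebuilds text1 and text2 from the "Good dog" example. It works on plain (op, text) tuples and does not call the library; the helper names source_text and destination_text are hypothetical.

```python
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0

def source_text(diffs):
    """Rebuild text1: everything except the insertions."""
    return "".join(text for op, text in diffs if op != DIFF_INSERT)

def destination_text(diffs):
    """Rebuild text2: everything except the deletions."""
    return "".join(text for op, text in diffs if op != DIFF_DELETE)

diffs = [(-1, "Goo"), (1, "Ba"), (0, "d dog")]
print(source_text(diffs))       # Good dog
print(destination_text(diffs))  # Bad dog
```

Any valid diff of the same two texts, optimal or not, should rebuild the same pair of strings.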

diff_cleanupSemantic(diffs) => null

A diff of two unrelated texts can be filled with coincidental matches. For example, the diff of "mouse" and "sofas" is [(-1, "m"), (1, "s"), (0, "o"), (-1, "u"), (1, "fa"), (0, "s"), (-1, "e")]. While this is the optimum diff, it is difficult for humans to understand. Semantic cleanup rewrites the diff, expanding it into a more intelligible format. The above example would become: [(-1, "mouse"), (1, "sofas")]. If a diff is to be human-readable, it should be passed to diff_cleanupSemantic.

diff_cleanupEfficiency(diffs) => null

This function is similar to diff_cleanupSemantic, except that instead of optimising a diff to be human-readable, it optimises the diff to be efficient for machine processing. The results of both cleanup types are often the same.
The efficiency cleanup is based on the observation that a diff made up of large numbers of small edits may take longer to process (in downstream applications) or take more capacity to store or transmit than a smaller number of larger diffs. The diff_match_patch.Diff_EditCost property sets what the cost of handling a new edit is in terms of handling extra characters in an existing edit. The default value is 4, which means if expanding the length of a diff by three characters can eliminate one edit, then that optimisation will reduce the total costs.

diff_levenshtein(diffs) => int

Given a diff, measure its Levenshtein distance in terms of the number of inserted, deleted or substituted characters. The minimum distance is 0, which means equality; the maximum distance is the length of the longer string.
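One way such a distance could be computed from a diff is sketched below in Python. It assumes that a deletion and an insertion sitting between the same two equalities count as substitutions, so each such run costs the larger of the two lengths. The function name levenshtein_from_diff is hypothetical, and this is an illustration rather than the library's own code.

```python
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0

def levenshtein_from_diff(diffs):
    """Count inserted, deleted or substituted characters in a diff."""
    total = 0
    insertions = 0
    deletions = 0
    for op, text in diffs:
        if op == DIFF_INSERT:
            insertions += len(text)
        elif op == DIFF_DELETE:
            deletions += len(text)
        else:
            # An equality closes the current run of edits.
            total += max(insertions, deletions)
            insertions = 0
            deletions = 0
    # Account for a run of edits at the very end of the diff.
    return total + max(insertions, deletions)

print(levenshtein_from_diff([(-1, "Goo"), (1, "Ba"), (0, "d dog")]))  # 3
print(levenshtein_from_diff([(-1, "mouse"), (1, "sofas")]))           # 5
```

Note that the result depends on the diff supplied: under this sketch the raw "mouse"/"sofas" diff shown earlier scores 4, while its semantically cleaned-up form scores 5.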

diff_prettyHtml(diffs) => html

Takes a diff array and returns a pretty HTML sequence. This function is mainly intended as an example from which to write one's own display functions.

match_main(text, pattern, loc) => location

Given a text to search, a pattern to search for and an expected location in the text near which to find the pattern, return the location which matches closest. The function will search for the best match based on both the number of character errors between the pattern and the potential match, as well as the distance between the expected location and the potential match.
The following example is a classic dilemma. There are two potential matches: one is close to the expected location but contains a one-character error, the other is far from the expected location but is exactly the pattern sought after:
match_main("abc12345678901234567890abbc", "abc", 26)
Which result is returned (0 or 24) is determined by the diff_match_patch.Match_Distance property. An exact letter match which is 'distance' characters away from the fuzzy location would score as a complete mismatch. For example, a distance of '0' requires the match be at the exact location specified, whereas a distance of '1000' would require a perfect match to be within 800 characters of the expected location to be found using a 0.8 threshold (see below). The larger Match_Distance is, the longer match_main() may take to compute. This variable defaults to 1000.
Another property is diff_match_patch.Match_Threshold which determines the cut-off value for a valid match. If Match_Threshold is closer to 0, the requirements for accuracy increase. If Match_Threshold is closer to 1 then it is more likely that a match will be found. The larger Match_Threshold is, the longer match_main() may take to compute. This variable defaults to 0.5. If no match is found, the function returns -1.
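The interplay between accuracy and distance can be illustrated with a small scoring sketch in Python. It assumes a candidate's score is its error ratio plus its distance from the expected location divided by Match_Distance, with lower scores being better. That formula is an assumption chosen to agree with the two worked figures above (a perfect match 'distance' characters away scores 1.0, and 800 characters away scores 0.8 at a distance of 1000). The name match_score and its parameters are hypothetical.

```python
def match_score(errors, pattern_len, found_at, expected_loc, match_distance=1000):
    """Score a candidate match: 0.0 is perfect, higher is worse."""
    accuracy = errors / pattern_len
    proximity = abs(expected_loc - found_at)
    if match_distance == 0:
        # With a distance of 0, only the exact expected location can match.
        return accuracy if proximity == 0 else 1.0
    return accuracy + proximity / match_distance

# The two candidates from the dilemma above, pattern "abc", expected location 26.
for distance in (1000, 10):
    exact_far = match_score(0, 3, found_at=0, expected_loc=26, match_distance=distance)
    fuzzy_near = match_score(1, 3, found_at=24, expected_loc=26, match_distance=distance)
    better = "exact match at 0" if exact_far < fuzzy_near else "fuzzy match at 24"
    print(distance, round(exact_far, 3), round(fuzzy_near, 3), better)
```

Under this model a large Match_Distance favours the exact but distant candidate, and a small one favours the nearby candidate with an error. A candidate is only acceptable if its score does not exceed Match_Threshold.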

patch_make(text1, text2) => patches

patch_make(diffs) => patches

patch_make(text1, diffs) => patches

Given two texts, or an already computed list of differences, return an array of patch objects. The third form (text1, diffs) is preferred; use it if you happen to have that data available. Otherwise this function will compute the missing pieces.

patch_toText(patches) => text

Reduces an array of patch objects to a block of text which looks extremely similar to the standard GNU diff/patch format. This text may be stored or transmitted.

patch_fromText(text) => patches

Parses a block of text (which was presumably created by the patch_toText function) and returns an array of patch objects.

patch_apply(patches, text1) => [text2, results]

Applies a list of patches to text1. The first element of the return value is the newly patched text. The second element is an array of true/false values indicating which of the patches were successfully applied. [Note that this second element is not too useful since large patches may get broken up internally, resulting in a longer results list than the input with no way to figure out which patch succeeded or failed. A more informative API is in development.]
The previously mentioned Match_Distance and Match_Threshold properties are used to evaluate patch application on text which does not match exactly. In addition, the diff_match_patch.Patch_DeleteThreshold property determines how closely the text within a major (~64 character) delete needs to match the expected text. If Patch_DeleteThreshold is closer to 0, then the deleted text must match the expected text more closely. If Patch_DeleteThreshold is closer to 1, then the deleted text may contain anything. In most use cases Patch_DeleteThreshold should just be set to the same value as Match_Threshold.
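As a hedged illustration of reading the return value, the Python sketch below summarises a results list of the shape described above. The sample tuple is made up for the example rather than produced by the library, and summarize_patch_results is a hypothetical helper.

```python
def summarize_patch_results(results):
    """Summarise the true/false list returned alongside the patched text."""
    failed = [index for index, ok in enumerate(results) if not ok]
    return {
        "applied": len(results) - len(failed),
        "failed": failed,      # positions in the results list, see the note above
        "clean": not failed,   # True only if every entry succeeded
    }

# Made-up sample in the shape [text2, results].
text2, results = ("patched text", [True, False, True])
summary = summarize_patch_results(results)
print(summary["applied"], summary["failed"], summary["clean"])  # 2 [1] False
```

Because large patches may be split internally, the positions in "failed" refer to entries of the results list and need not line up with the patches that were passed in.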
Comment by lop...@gmail.com, Sep 10, 2008

It's very good API.

Could you tell me about how to integrate to GWT.

Comment by philip.l...@gmail.com, Sep 12, 2008

Although this is neither the place to ask user questions nor the place to answer them (Neil, feel free to delete both): @lopmuj: just call it using JSNI - something like this will do the trick:

    private native String getDiffString(String original, String amendment)/*-{
        var dmp = new $wnd.diff_match_patch(); 
        var d = dmp.diff_main(original, amendment);
        dmp.diff_cleanupSemantic(d);
        return dmp.diff_prettyHtml(d);
    }-*/;

But this is so obvious that I'm not even sure if you tried finding it out by yourself in the first place ..

Comment by manico.james@gmail.com, Nov 19, 2008

This is an incredible piece of work, nice job Neil!

Comment by arnoux.j...@gmail.com, Feb 7, 2009

This is completely what I was looking for after getting fed up with SVN&Co. bogus APIs Thanx Neil!

Comment by piotr.ba...@gmail.com, Mar 5, 2009

Excellent work. Thanks!

Comment by nuno.car...@gmail.com, Mar 9, 2009

Simply awesome!

I just built a wrapper around your calls, customize it according to my needs! I was initially tempted to mimic the GNU "diff" behavior (dumping everything into the stdout), but when I saw the results in HTML....I was "...wwoooowed! this is great, it's really easy to spot the differences (Inserts/Deletes), as they are coded: "Insertions" show up in the HTML with a green background and underlined font whilst "Deletions" show up with a red background and Strike-through font...

Great job!

Thanks!

Comment by shashank...@gmail.com, Mar 27, 2009

This is a great tool. However, I wasn't able to understand the matching algorithm. Is there a way to ignore the location and match based just on the pattern in match_main. Basically, I'm trying to identify if a particular word or phrase (or its variant (like a missing hyphen etc.)) appears in the given text.

Comment by jjkn...@googlemail.com, May 26, 2009

Neil, Excellent!

One question: If I diff a big document against a small one, I get a big "DELETE" block in the resulting diff. Is this really necessary? I think a "delete XX chars" would be enough?

Bob

Comment by project member neil.fra...@gmail.com, May 26, 2009

jjknopf, how you format/display/visualize the diff is entirely up to you. diff_prettyHtml(diffs) is one way to render a diff, it's a trivial function, you can roll your own. Take a look at diff_toDelta(diff), it's not listed in the above API, but it is very compact and may be similar to what you are looking for.

Comment by mtc...@gmail.com, Jun 11, 2009

I'm trying to match a piece of text against several "candidates'. Is there some way to compare the results of two matches to see which is "best"?

Thanks!

Comment by project member neil.fra...@gmail.com, Jun 12, 2009

I've added a new function to all language versions called diff_levenshtein(diff) which takes a diff and reports the number of character insertions, deletions or substitutions it represents. Thus one can diff your target against several options and compute the Levenshtein distance for each to find the option that's closest to the target.

Comment by jimmygil...@gmail.com, Aug 27, 2009

Hello all,

I would like to know. based on this implementation, how can I create a list of diffs to simulate something like this : http://www.caffeinated.me.uk/kompare/

What I need is the following : - be able to know "inserted" block of text - be able to know "deleted" block of text - be able to know "changed" block of text

- if the block changed, I would like to know whether the block is completely changed, or where the change is in the block.

Is it possible with this implementation ? Thank you very much !

Jimmy

Comment by project member neil.fra...@gmail.com, Aug 27, 2009

Jimmy this would better be asked in the newsgroup: http://groups.google.com/group/diff-match-patch

But the simple answer is run diff_main(text1, text2), then step through the resulting array of diffs. They will be insertions, deletions or equalities. Obviously a 'change' is a deletion and an insertion next to each other.
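The stepping-through described above might look like the following Python sketch. It operates on plain (op, text) tuples such as those returned by diff_main and treats an adjacent deletion and insertion, in either order, as one change. The function name classify_blocks and the (kind, old, new) tuple layout are hypothetical.

```python
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0

def classify_blocks(diffs):
    """Turn a diff into (kind, old_text, new_text) blocks."""
    blocks = []
    i = 0
    while i < len(diffs):
        op, text = diffs[i]
        following = diffs[i + 1] if i + 1 < len(diffs) else None
        if op == DIFF_DELETE and following and following[0] == DIFF_INSERT:
            blocks.append(("changed", text, following[1]))
            i += 2
        elif op == DIFF_INSERT and following and following[0] == DIFF_DELETE:
            blocks.append(("changed", following[1], text))
            i += 2
        elif op == DIFF_DELETE:
            blocks.append(("deleted", text, ""))
            i += 1
        elif op == DIFF_INSERT:
            blocks.append(("inserted", "", text))
            i += 1
        else:
            blocks.append(("equal", text, text))
            i += 1
    return blocks

for block in classify_blocks([(-1, "Goo"), (1, "Ba"), (0, "d dog"), (1, "!")]):
    print(block)
```

To see where a change lies inside a "changed" block, the old and new text of that block could be diffed again at a finer granularity.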

Comment by erik.pou...@gmail.com, Sep 3, 2009

Nice work. However, how hard would it be add unpatch(), that is, reversepatch()? The second issue is that I need to port this to php ...

Comment by project member neil.fra...@gmail.com, Sep 3, 2009

Unpatching would just be to loop through the diff, swapping DIFF_INSERT with DIFF_DELETE, then apply the patch. As for PHP, there's a partial translation which someone wrote, email me and I'll send you a copy.
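The swap can be sketched in Python on a plain diff list. This illustration applies the diff exactly, whereas patch_apply works on patch objects and matches fuzzily; invert_diff and apply_exact are hypothetical helpers and not part of the library.

```python
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0

def invert_diff(diffs):
    """Swap insertions and deletions so the diff maps text2 back to text1."""
    swap = {DIFF_INSERT: DIFF_DELETE, DIFF_DELETE: DIFF_INSERT, DIFF_EQUAL: DIFF_EQUAL}
    return [(swap[op], text) for op, text in diffs]

def apply_exact(diffs, text):
    """Apply a diff to text, requiring every delete/equal chunk to match exactly."""
    out = []
    pos = 0
    for op, chunk in diffs:
        if op == DIFF_INSERT:
            out.append(chunk)
            continue
        if text[pos:pos + len(chunk)] != chunk:
            raise ValueError("text does not match the diff at offset %d" % pos)
        pos += len(chunk)
        if op == DIFF_EQUAL:
            out.append(chunk)
    if pos != len(text):
        raise ValueError("diff does not cover the whole text")
    return "".join(out)

forward = [(-1, "Goo"), (1, "Ba"), (0, "d dog")]
print(apply_exact(forward, "Good dog"))              # Bad dog
print(apply_exact(invert_diff(forward), "Bad dog"))  # Good dog
```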

Comment by nckul...@gmail.com, Oct 29, 2009

great job, this is what I'm looking for! thanks a lot.

Comment by pacohernandezg, Oct 30, 2009

A terrific work! Thanks a lot!

Comment by sipa...@gmail.com, Nov 4, 2009

good stuff, just what i was looking for.

Comment by mathews....@gmail.com, Dec 2, 2009

I need the PHP implementation as well. Neil -- I don't have your email but could you email me it at mathews.kyle@gmail.com? Thanks, that'd be really helpful.

Comment by tobias.b...@gmail.com, Dec 16, 2009

Great Stuff! Please send me the php-implementation as well. I will send it back if i extend it.

E-Mail: dmp@shwups.ch

Comment by danutchi...@gmail.com, Mar 18, 2010

Great job!

Comment by tom%rsdn...@gtempaccount.com, Mar 23, 2010

Hi, but what about patch conflicts? In standard GNU patch this is implemented. How can we recognize that a patching conflict occurred?

Thanks

Comment by project member neil.fra...@gmail.com, Mar 23, 2010

Tom, look at the results from patch_apply. There are two returned values, one is the patched text, and the other is a list of booleans indicating which patches were applied and which failed.

Comment by kong8...@gmail.com, Mar 24, 2010

I used the "Demo of Patch" example (http://neil.fraser.name/software/diff_match_patch/svn/trunk/demos/demo_patch.html).

1. I put "Old Version" text like this:

    <appSettings>
    <add key="AccessAllowedIPAddresses" value="127.0.0.1" />
    </appSettings> <applicationSettings>
    <setting name="SqlServerDataPath" serializeAs="String">
    <value>C:\DATA_PATH</value>
    </setting>
    </applicationSettings>

2. I put "New Version" text like this:

    <appSettings>
    <add key="AccessAllowedIPAddresses" value="127.0.0.1" /> <add key="SqlServerDataPath" value="C:\data_path" />
    </appSettings> <applicationSettings> </applicationSettings>

3. I computed a patch.

4. I put "Old Version" text like this:

    <appSettings>
    <add key="AccessAllowedIPAddresses" value="127.0.0.1" />
    </appSettings> <applicationSettings>
    <setting name="SqlServerDataPath" serializeAs="String">
    <value>C:\REAL_DATA_PATH</value>
    </setting>
    </applicationSettings>

5. I applied the patch. No conflicts were found. I got text like the "New Version" text. Why weren't patch conflicts found?

Comment by clon...@gmail.com, Apr 8, 2010

What a remarkable piece of work! I'm impressed. :)

Comment by student...@gmail.com, Apr 23, 2010

Would it be possible to add functionality to work with files, not just strings?

Comment by student...@gmail.com, May 3, 2010

I compare 2 large texts and it gives me one big delete and one big insert even with "No cleanup". If I make texts smaller it shows differences nicely. Something is wrong.

Comment by project member neil.fra...@gmail.com, May 3, 2010

student00x, nothing is wrong, your diff exceeded the time limit and it was forced to return a valid, but non-optimal diff. Try increasing diff_match_patch.Diff_Timeout, or setting it to zero.

Comment by cppgabri...@gmail.com, May 6, 2010

I have two documents (.txt) with accented characters (è, ù). After taking the patches, I try to use the function "patch_toText(patches)" but I obtain a segmentation fault. Is there a solution?. I'm working with C language.

Comment by tac...@gmail.com, Jun 2, 2010

Impressing !

Comment by lars...@gmail.com, Jun 25, 2010

Nice work. I'm evaluating this for use in a larger project requiring client-side text operations. The project is built on the GWT-platform. Are there any recommendations as to: 1. Use the supplied javascript library via JSNI (GWT). 2. Compile the supplied java library to javascript using the GWT-compiler.

I'm considering option 2 to be the most feasible. I might have to verify the correctness of the result but i will get highly optimized, crosscompiled, javascript. And i don't have to include any JSNI-code (which should be avoided when possible). Any thoughts anyone?

Comment by ewod...@gmail.com, Sep 17, 2010

Very Cool, Is it possible to make it an option to have the comparison unit be a word instead of a char? An example where the output is confusing (to me at least) is Text1 = $200.00 abc $100.00 Text2 = $205.00 abc $102.50

Output is ... $20<strike>0</strike>5.00 abc $10<strike>0.0</strike>2.50

This is much less intuitive than <strike>200.00</strike>205.00 abc <strike>100.0</strike>102.50

Comment by kumb...@gmail.com, Oct 1, 2010

Very nice and perfectly simple to use!

Comment by deb.sand...@gmail.com, Oct 2, 2010

Very impressive piece of work. Thanks for sharing it with the world!

Comment by WCharlie...@gmail.com, Oct 5, 2010

I too would like the php version, whatever state it's in.

Comment by mikepodonyi, Nov 29, 2010

Can i use this Library for generating diffs of binary files.

Comment by project member neil.fra...@gmail.com, Nov 29, 2010

Yes Mike, DMP works great on binary files. When I stress its suitability for 'text', that's meant as opposed to structured content such as XML; there's no problem with binary.

Comment by reinstei...@gmail.com, Dec 19, 2010

Hey Neil,

Great job on this package. Truly remarkable work. :) I have a question; I'd like to develop an editing app (non realtime) that behaves like GNU diff3 (compare docs A and B to older, parent copy C). I want to graphically represent the successful merges and conflicting segments. I was looking at using patch_apply for this but I noticed your comment in the function description:

"...A more informative API is in development..."

I'm assuming I probably won't be able to build a conflict resolver built on the output of this api until that API improvement you mentioned is ready?

Thanks again!

-Mike

Comment by di...@kollesche.de, Jan 6, 2011

Nice job. One correction should be made in the Java version though: the Diff subclass implements an equals() method but NO hashCode() method. A hashCode is needed if you e.g. put Diff instances into a HashMap to count occurrences of certain Diffs.

Comment by meena...@gmail.com, Feb 5, 2011

Nice Work! Would it be possible to provide a line-level diff? I'm looking for a tool that would provide a line-level diff between two text docs that can be stored in db and later recreate a particular version by either applying/removing patches.

for eg:

Original:
This simple strategy is relatively fragile.
Minor changes between the two base documents can result in failure.
In the above example if the function name had been changed from "maxbits" to "max_bits" the patch would have failed to match on the first pass, but would have matched on the second pass (once the first line of the prefix context was ignored)
However, if the "b++;" had been changed to "b=b+1;", the patch would fail, despite the fact that all the other context lines match.



Revised:
This simple strategy is relatively faster.
Minor changes between the two base documents can result in failure.
In the above example if the function name had been changed from "maxbits" to "max_bits" the patch would have failed to match on the first pass, but would have matched on the second pass (once the first line of the prefix context was ignored).
--- O
+++ R
@@ -1,1 +1,1 @@
-This simple strategy is relatively fragile.
+This simple strategy is relatively faster.
@@ -3,2 +3,1 @@
-In the above example if the function name had been changed from "maxbits" to "max_bits" the patch would have failed to match on the first pass, but would have matched on the second pass (once the first line of the prefix context was ignored).
-However, if the "b++;" had been changed to "b=b+1;", the patch would fail, despite the fact that all the other context lines match.
+In the above example if the function name had been changed from "maxbits" to "max_bits" the patch would have failed to match on the first pass, but would have matched on the second pass (once the first line of the prefix context was ignored).

Thanks

Comment by project member neil.fra...@gmail.com, Feb 5, 2011

Since the question of line and word level diffs is frequently asked, I've created a page describing how to do this:

http://code.google.com/p/google-diff-match-patch/wiki/LineOrWordDiffs
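A rough Python sketch of the general idea behind a line-level diff: each distinct line is replaced by a single character, so that a character-based diff of the encoded strings compares whole lines. This is an illustration of the technique only. The names lines_to_chars and chars_to_lines are hypothetical stand-ins and not the library's diff_linesToChars and diff_charsToLines.

```python
def lines_to_chars(text1, text2):
    """Encode each distinct line of both texts as one character."""
    line_array = [""]  # index 0 is left unused so no line maps to "\x00"
    line_index = {}

    def encode(text):
        chars = []
        for line in text.splitlines(keepends=True):
            if line not in line_index:
                line_array.append(line)
                line_index[line] = len(line_array) - 1
            chars.append(chr(line_index[line]))
        return "".join(chars)

    return encode(text1), encode(text2), line_array

def chars_to_lines(chars, line_array):
    """Decode an encoded string back into the original lines."""
    return "".join(line_array[ord(c)] for c in chars)

chars1, chars2, lines = lines_to_chars("alpha\nbeta\n", "beta\ngamma\n")
print([ord(c) for c in chars1], [ord(c) for c in chars2])  # [1, 2] [2, 3]
print(chars_to_lines(chars2, lines) == "beta\ngamma\n")    # True
```

The encoded strings would then be diffed as usual, and the text of every resulting diff decoded back into lines. The number of distinct lines that can be encoded is bounded by the character range available in the implementation language.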

Comment by meena...@gmail.com, Feb 7, 2011

Thanks Neil!

Is there an option to unpatch patches from the latest text in order to obtain the original?

I don't see one in the api list.

	@Test
	public void testDiffLineMode(){
		diff_match_patch dmp = new diff_match_patch();
		String text1 = "The quick brown fox jumps over the lazy dog.";
		String text2 = "That quick brown fox jumped over a lazy dog.";
		LinesToCharsResult a = dmp.diff_linesToChars(text1, text2);   
		List<String> lines = a.lineArray;
		
		//create the patches
		LinkedList<diff_match_patch.Diff> diffs = dmp.diff_main(a.chars1,a.chars2);
		dmp.diff_charsToLines(diffs, lines);		
		LinkedList<Patch> patches = dmp.patch_make(diffs);
		
		//apply patches to text1 to get text2
		Object[] results = dmp.patch_apply(patches, text1);
		Assert.assertEquals("That quick brown fox jumped over a lazy dog.", results[0]);
		
		boolean[] boolArray = (boolean[]) results[1];
		Assert.assertTrue(boolArray[0]);
		  
		//remove patches from text2 to get text1
		// ... something like this (patch_remove is hypothetical, it is not in the API):
		// results = dmp.patch_remove(patches, text2);  --> remove the patch(es) from text2 to obtain text1
	}	
Comment by fra...@google.com, Feb 7, 2011

Unpatching can be done by just looping through the diff, swapping DIFF_INSERT with DIFF_DELETE, then applying the patch.

Comment by project member neil.fra...@gmail.com, Feb 8, 2011

This page is for documentation of the API. meena, could you move this discussion to the newsgroup? I'll delete these posts once you have done so. Thanks.

Comment by meena...@gmail.com, Feb 8, 2011

Please go ahead and remove them (including this one). Going forward I'll use the newsgroup. thanks

Comment by cra...@gmail.com, Apr 21, 2011

can i use it to compare two HTML files

Thanks Vishal

Comment by lifeisbu...@gmail.com, Aug 2, 2011

Great job, thank you. In my work, though, I want to ignore HTML tags. For example, text1 = "<span class='some_class'>This is my Text</span>"; text2 = "<div>This is my Text</div>"; these should compare as equal text (without checking the HTML tags <span>, <div>). How can I use diff_match_patch for this? Please suggest.

Comment by enishi.k...@gmail.com, Sep 20, 2011

Hello!

One question. It is possible to know the line number where an INSERT diff appear?

Thanks for all!

Comment by kalmb...@gmail.com, Dec 1, 2011

ANN: A Ruby implementation is available as a gem. It can be installed with RubyGems: "gem install diff_match_patch". Source code here: https://github.com/kalmbach/diff_match_patch

Comment by Enmarant...@gmail.com, Jan 23, 2012

Big thx for this great work! It's just what I was looking for :)

Comment by sameena....@gmail.com, Feb 14, 2012

Hi, I have used the following program to compare the data of two text files. It checks, row by row, whether each row of the first file is present in the second file.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;

    public class ComparingTextFiles {

        private BufferedReader mReader = null;
        private static String firstFileRow;
        private static String secondFileRow;
        private static List<String> firstList = new ArrayList<String>();
        private static List<String> secondList = new ArrayList<String>();

        public ComparingTextFiles(String aFileName) throws java.io.FileNotFoundException {
            mReader = new BufferedReader(new FileReader(aFileName));
        }

        public static void main(String[] args) {
            //compareFiles("D:/first.txt","D:/second.txt");
            compareFiles("D:/TPDealsNashFinchGreatLakes20120212.txt", "D:/VendorNumber_TPDealsNashFinchGreatLakes20120212.txt");
        }

        public static void compareFiles(String firstFile, String secondFile) {
            try {
                ComparingTextFiles comparingTextFiles1 = new ComparingTextFiles(firstFile);
                while ((firstFileRow = comparingTextFiles1.readLine()) != null) {
                    firstList.add(firstFileRow);
                }
                ComparingTextFiles comparingTextFiles2 = new ComparingTextFiles(secondFile);
                while ((secondFileRow = comparingTextFiles2.readLine()) != null) {
                    secondList.add(secondFileRow);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            resultOfComparingFiles();
        }

        public static void resultOfComparingFiles() {
            for (String row : firstList) {
                if (!secondList.contains(row)) {
                    // indexOf is zero-based; report a one-based row number.
                    int atRowNo = firstList.indexOf(row) + 1;
                    System.out.println(" ... This row in first file is not present in second file ... ");
                    System.out.println(row);
                    System.out.println(" at row no " + atRowNo);
                    System.out.println(".....................");
                }
            }
        }

        // Reads the next line and closes the reader at end of file.
        public String readLine() throws java.io.IOException {
            if (null == mReader)
                throw new java.io.IOException();
            String lLine = mReader.readLine();
            if (null == lLine)
                this.close();
            return lLine;
        }

        public void close() {
            if (null == mReader)
                return;
            try {
                mReader.close();
                mReader = null;
            } catch (Exception e) { }
        }
    }

Comment by sameena....@gmail.com, Feb 14, 2012

its working .

Comment by mcdonald...@gmail.com, Apr 1, 2012

Can you unapply a patch?

Comment by eesi...@gmail.com, Jul 11, 2012

are there any function which returns the added string or deleted string in Patch classes or somewhere else...

Comment by vdil...@gmail.com, Aug 8, 2012

Some patch functions are not working when they are implemented in java. for eg. following function is not working :

    diff_match_patch diff = new diff_match_patch();
    String text1 = "Hello This Is Java";
    String text2 = "Hello This Is Net For You";
    LinkedList<Diff> d = diff.diff_main(text1, text2);
    LinkedList<Patch> p = diff.patch_make(d); // Here is showing error

If you have any solutions on this then please help me.

Comment by s...@springaudio.com, Jan 11, 2013

Very good. it help me.

Comment by arnab.d...@gmail.com, Jan 24, 2013

I wrote a JS/CoffeeScript wrapper library to help with the "presentation work" needed to use diff_match_patch: https://github.com/arnab/jQuery.PrettyTextDiff.

Comment by vdil...@gmail.com, Apr 10, 2013

Google diff match patch is very good. it really helps me for showing the diff between two revisions. But the only problem is that it can not display diff in proper format for eg. If I calculate the diff between

String 1 : Difference

&

String 2 : Hence

Then it shows that "Differ" is deleted and "H" is added and "ence" remains unchanged. How can I show that the String "Difference" is deleted and "Hence" is added.

Please help me

Sorry for my poor english.....

Thanks in advance.

Comment by vdil...@gmail.com, Apr 11, 2013

Comment by ickhyu...@gmail.com, May 9, 2013

just what i was looking for.

Comment by nshar2...@gmail.com, May 14, 2013

Can i update the patch and then apply that updated patch?

Comment by wuwenyao...@gmail.com, May 26, 2013

Nice work. Is there a revert patch function() ? I want to save the last version of text and patch in a database,and with them get the last version text ... till the first version.

Comment by madhu.va...@gmail.com, Jun 26, 2013

I was trying out the patch_apply functionality. I could see that it works effectively if the multiple matching text strings differ beyond the diff length. I mean, say a file text1 has (as per the demo)


Hamlet: Do you see yonder cloud that's almost in shape of a camel? Polonius: By the mass, and 'tis like a camel, indeed. Hamlet: Methinks it is like a weasel. Polonius: It is backed like a weasel. Hamlet: Or like a whale? Polonius: Very like a whale. -- Shakespeare

file text2 has


Hamlet: Do you see the cloud over there that's almost the shape of a camel? Polonius: By golly, it is like a camel, indeed. Hamlet: I think it looks like a weasel. Polonius: It is shaped like a weasel. Hamlet: Or like a whale? Polonius: It's totally like a whale. -- Shakespeare

and if we compute the diffs between text1 and text2 (same as in the patch demo) and apply them to a file text3 having the content --- Kirk: Do you see yonder cloud that's Kirk: Do you see yonder cloud that's almost in shape of a Klingon? Spock: By the mass, and 'tis like a Klingon, indeed. Kirk: Methinks it is like a Vulcan. Spock: It is backed like a Vulcan. Kirk: Or like a Romulan? Spock: Very like a Romulan. -- Trekkie


The result would be
Kirk: Do you see the cloud over there that's Kirk: Do you see yonder cloud that's almost the shape of a Klingon? Spock: By golly, it is like a Klingon, indeed. Kirk: I think it looks like a Vulcan. Spock: It is shaped like a Vulcan. Kirk: Or like a Romulan? Spock: It's totally like a Romulan. -- Trekkie

And clearly the first patch got applied improperly.

Is there any workaround to overcome this limitation? Like extending the diff match length, or having patch_apply consider the whole line (delimited by \n) to check the exact patch offset, or so? How to achieve it?

Comment by rudyg...@gmail.com, Aug 14, 2013

DMP is very nice and fast. but...

I was trying to compare 2 big (~3MB), similar to each other text files in C++ using "checklines=true" parameter. the count of lines in each file was exceeding ushort capacity. the result was pretty messy, and the diff_text1/2(QList<Diff>) functions were returning strings which were different from the original text1 and text2 inputs.

is there any easy way to get the right result in this case? e.g.splitting the input texts (how/where?) and join the result? "checklines=false" is supposed to be slower, so using it for big files is not the best option.

Comment by swifty21...@gmail.com, Nov 29, 2013

EXCELLENT!!! Thanks to the developers

Comment by rat...@gmail.com, Nov 29, 2013

wow this is seriously easy to use. Well done.

Comment by dua...@gmail.com, Feb 19, 2014

Where are the reject files when patch_apply() failed

Comment by gowtham....@gmail.com, Mar 5, 2014

This is an excellent piece of work. In my case, I have 2 HTML formatted strings simple text?, I would like to find the difference (NOT the HTML source) between these two HTML's (Diff when they appear in UI).

Say for example, I have a text with hyperlink. Now, I changed the hyperlink NOT the text, when I see in UI it should display me as 'NO DIFFERENCE'

Is it possible to achieve with this service?

Appreciate all your help.

G'

Comment by adam.b...@gmail.com, Jun 10, 2014

To answer the last comment, you could use a html parser like the one here: http://htmlparser.sourceforge.net/ (assuming you are using Java)

Then recursively go through the tags and concatenate the text together, so that you have only the text and not the HTML, then you could use the diff methods in this project

Comment by mprabha....@gmail.com, Jul 22, 2014

We are facing a performance issue (an out-of-memory error) when diffing huge data, for example HTML files. Is there any restriction on the size of the files to compare?

