Software
Existing software solutions in Java.
Finding differences in Text
The diff problem for plain text is considered more or less solved. Several implementations produce high-quality results at minimal cost, and open-source solutions (notably GNU diff) are available. In many cases we are not interested in word-level diffs but only in line changes; typical examples include Source Code Management systems and Wiki applications. This makes the job of a text diff library much easier. Fine-grained differences at the character level are also possible. Google diff, for example, can show differences between individual characters: it can tell that horse and horses differ by one character only.

Finding differences in XML/HTML
Comparing two XHTML files is a completely different story. HTML holds tree-structured data, so the problem is no longer trivial. A diff library must be "smart" enough to understand what is an HTML tag and what is not. Changes can now occur in HTML attributes as well as in plain text. HTML also contains advanced constructs like lists and tables, which complicate the output code. HTML found in the wild can also be very rough for a diff library, so some pre-processing code is needed to clean up the HTML before the actual comparison takes place. There is a lot of research and literature on XML diffing methods, but unlike plain text, a definitive solution has not yet appeared.

Diff software in Java
Below is a table that lists other solutions apart from Daisy Diff.
What Daisy Diff offers
One of the most important features of Daisy Diff is that it "understands" HTML tags and will actually look into the text to decide whether a node is the same or not. For example, assume that a user has changed a single word in a big paragraph. Most XML libraries would just mark the whole paragraph as different. Daisy Diff, however, will look into the inline text (the contents of the p tag node) and understand that only one word is different, so it presents only this word to the user as changed. Daisy Diff is also used in production (Daisy CMS) and comes with a business-friendly licence.
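
To make the "only one word is different" idea concrete, here is a minimal sketch of a word-level diff over the text of a single paragraph. It is not Daisy Diff's actual algorithm or API: the class name WordDiffSketch and its methods are hypothetical, and the comparison is a plain longest-common-subsequence over whitespace-separated words that ignores HTML tags entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative word-level diff (hypothetical helper, not part of Daisy Diff). */
final class WordDiffSketch {

    /** Returns one entry per word: "  w" (unchanged), "- w" (removed), "+ w" (added). */
    static List<String> diffWords(String oldText, String newText) {
        String[] a = tokenize(oldText);
        String[] b = tokenize(newText);

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both word sequences, following the table to emit the edit script.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("  " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + a[i++]);
            } else {
                out.add("+ " + b[j++]);
            }
        }
        while (i < a.length) out.add("- " + a[i++]);
        while (j < b.length) out.add("+ " + b[j++]);
        return out;
    }

    /** Keeps only the entries that represent a removal or an addition. */
    static List<String> changesOnly(List<String> diff) {
        List<String> changes = new ArrayList<>();
        for (String entry : diff) {
            if (!entry.startsWith("  ")) changes.add(entry);
        }
        return changes;
    }

    /** Splits text on runs of whitespace; an empty or blank text yields no words. */
    private static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        List<String> diff = diffWords("The quick brown horse jumps",
                                      "The quick brown horses jumps");
        System.out.println(changesOnly(diff)); // prints [- horse, + horses]
    }
}
```

A paragraph-level comparison would report the whole paragraph as changed here, while the word-level pass isolates the single replaced word. A real HTML diff also has to deal with tags, attributes and malformed markup, which this sketch leaves out.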