Issue 487: infinite loop in MyersDiff
Status:  Accepted
Owner: ----
Project Member Reported by sop@google.com, Mar 3, 2010
Change I97a85ec7 on internal server throws the patch list
compute method into an infinite loop.

   java.lang.Thread.State: RUNNABLE
	at org.eclipse.jgit.diff.MyersDiff$MiddleEdit$EditPaths.calculate(MyersDiff.java:377)
	at org.eclipse.jgit.diff.MyersDiff$MiddleEdit.calculate(MyersDiff.java:230)
	at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:158)
	at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:163)
	at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:145)
	at org.eclipse.jgit.diff.MyersDiff.<init>(MyersDiff.java:120)
	at com.google.gerrit.server.patch.PatchListCacheImpl.newEntry(PatchListCacheImpl.java:248)
	at com.google.gerrit.server.patch.PatchListCacheImpl.readPatchList(PatchListCacheImpl.java:198)
	at com.google.gerrit.server.patch.PatchListCacheImpl.compute(PatchListCacheImpl.java:129)
	at com.google.gerrit.server.patch.PatchListCacheImpl.access$000(PatchListCacheImpl.java:62)

Mar 5, 2010
#1 sop@google.com
This infinite loop is reproducible and is definitely inside of
JGit's MyersDiff.  Its data dependent on a particular file in
that change.  :-\
Labels: Component-JGit
Mar 8, 2010
#2 sop@google.com
Change I257be95ac93c0775d24fcd36a315f293503febb7 adds
cache.diff.intraline = false as a configuration option
to permit an administrator to disable this code path,
but still upgrade to 2.1.2.

Its not a solution.  Its not even a great workaround.
But I'm hoping it will let me release 2.1.2 sooner.
Apr 26, 2010
#3 sop@google.com
(No comment was entered for this change.)
Labels: -Milestone-2.1.2
Nov 11, 2010
#4 sop@google.com
 Issue 774  has been merged into this issue.
Nov 11, 2010
#5 jtur...@gmail.com
Should it also hang the push via command line though, so that the git push review "never" returns either?  Or is the web UI doing some work even when you're pushing for caching/informational purposes?
Nov 11, 2010
#6 sop@google.com
Its the same root cause, yes.

When pushing the server sends out email notifications
to let interested parties know there is a new change
available.  That email is generated by the server push
thread, so the command is blocked until its done being
created and passed off to the SMTP server.

That notification contains the list of files that were
modified by the change.  The list of files is loaded by
a cache, which is the same cache that the web UI uses.

The cache uses a lock to ensure that only 1 thread does
the actual computation, while other threads block and
wait for the cache entry to be created and stored into
the cache.

Unfortunately right now the email code only needs the
list of paths affected, which is very cheap to compute,
but the cache also stores the line-by-line and intra-line
difference data at the same time, for the web UI.  Its
the last step that is choking (the intra-line difference).
Nov 11, 2010
#7 sop@google.com
Change I61bbd3d068bd9b3da9d1018c0f66969a5b6ff58e
(which will be in 2.1.6 final) splits the intraline diff code
out of the main code path for file listings, deferring it
until a user asks to view that specific file.  This allows a
user to temporarily disable the intraline diff for a single
file by adjusting their patch settings in the web UI, which
gives us a partial work-around when the server gets
stuck on a change.
Nov 11, 2010
#8 jtur...@gmail.com
Awesome!  Thank you.
Nov 15, 2010
#9 sop@google.com
Change I6cbfdd0acc6f7e612a29ed789efe9da591a45273
(which will also be in 2.1.6 final) uses a custom thread
pool with a timeout to try and abort runaway MyserDiff
cases and at least recover server CPU.  When we get to
a bad file, a timeout will trigger, the thread will die, and
the user will see the file without the intraline difference.

This isn't a fix.  Its a bandaid workaround.  But it will
cat least buy us some time until the JGit project can
track down and fix this issue.