Issue 232: remote.name.timeout causes NullPointerException, extra TransportException: Read timed out
Reported by Shawn Pearce <sop@google.com> on Thu Jun 25 07:25:49 PDT 2009
Source: JIRA GERRIT-233
Affected Version: 2.0.15
Environment: JSch 0.1.41
JSch is crashing with an NPE during connection setup if there is a timeout
configured. It's random, which means it's a thread race condition, as sometimes
the code succeeds. Unfortunately the root cause below is incomplete, as JSch
catches RuntimeException and rethrows it, discarding the original stack
trace. So it's harder to know where the failure is.
2009-06-24 17:15:16,999::ERROR: com.google.gerrit.git.PushQueue - Cannot replicate to ssh://android-replication@remote:29418/platform/manifest.git
org.spearce.jgit.errors.TransportException: ssh://android-replication@remote:29418/platform/manifest.git: java.lang.NullPointerException
    at org.spearce.jgit.transport.TransportGitSsh.exec(TransportGitSsh.java:150)
    at org.spearce.jgit.transport.TransportGitSsh$SshPushConnection.<init>(TransportGitSsh.java:348)
    at org.spearce.jgit.transport.TransportGitSsh.openPush(TransportGitSsh.java:97)
    at org.spearce.jgit.transport.PushProcess.execute(PushProcess.java:119)
    at org.spearce.jgit.transport.Transport.push(Transport.java:734)
    at com.google.gerrit.git.PushOp.pushVia(PushOp.java:192)
    at com.google.gerrit.git.PushOp.runImpl(PushOp.java:145)
    at com.google.gerrit.git.PushOp.run(PushOp.java:94)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
    at com.google.gerrit.git.WorkQueue$Task.run(WorkQueue.java:231)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: com.jcraft.jsch.JSchException: java.lang.NullPointerException
    at com.jcraft.jsch.Channel.connect(Channel.java:206)
    at org.spearce.jgit.transport.TransportGitSsh.exec(TransportGitSsh.java:147)
    ... 16 more
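To illustrate why the trace stops at Channel.connect, here is a minimal sketch of the general pattern the report describes. It assumes the wrapper keeps only the text of the original exception, which would match the "JSchException: java.lang.NullPointerException" message above but is not confirmed against the JSch source. All class and method names (WrapDemo, ConnectException, failingSetup) are hypothetical.

```java
/** Hypothetical illustration; none of these names come from JSch. */
final class WrapDemo {
    private WrapDemo() {}

    /** Stand-in for a library's checked connection exception. */
    static class ConnectException extends Exception {
        ConnectException(String message) {
            super(message);
        }

        ConnectException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for setup code that fails with an unchecked exception. */
    static void failingSetup() {
        throw new NullPointerException();
    }

    /** Keeps only the text of the failure; the original stack trace is lost. */
    static void connectLossy() throws ConnectException {
        try {
            failingSetup();
        } catch (RuntimeException e) {
            throw new ConnectException(e.toString());
        }
    }

    /** Chains the original exception so its stack trace shows up as "Caused by". */
    static void connectPreserving() throws ConnectException {
        try {
            failingSetup();
        } catch (RuntimeException e) {
            throw new ConnectException(e.toString(), e);
        }
    }
}
```

With the lossy variant, getCause() on the caught exception is null, so a log can only show where the wrapper was thrown. The preserving variant keeps the frame inside failingSetup() available to whoever logs the failure.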
Sep 24, 2009
#1
code-rev...@gtempaccount.com
Sep 24, 2009
Comment by Shawn Pearce <sop@google.com> on Thu Jun 25 11:04:58 PDT 2009
This seems to be a feature of JSch. In http://www.mail-archive.com/jsch-users@lists.sourceforge.net/msg00520.html the author more or less says we shouldn't rely on the message text of the JSchException during a connect failure when a timeout is used, and should instead just retry in an application loop, giving up after some number of attempts.
Based on the SourceForge bug tracker, there are a lot of thread race conditions and NPEs lurking around in JSch, and my own reading of the source code has sent shivers down my spine about how unsafe the shared data really is between threads. As near as I can tell, it violates JSR-133 (http://jcp.org/en/jsr/detail?id=133) and pays no attention to it whatsoever. If there is more than one processor in the system I can easily see how JSch can fall over with random exceptions.
Perhaps JGit should move to MINA SSHD's client library.
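The retry-in-an-application-loop idea from the comment above might look roughly like the sketch below. It is only an illustration under stated assumptions: ConnectRetry and withRetries are hypothetical names, not part of JSch, JGit, or Gerrit, and the attempt count and delay are arbitrary parameters chosen by the caller.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper; not part of JSch, JGit, or Gerrit. */
final class ConnectRetry {
    private ConnectRetry() {}

    /**
     * Runs the attempt up to maxAttempts times, sleeping between failures.
     * The exception message is never inspected: any failure other than an
     * interruption simply counts as "try again".
     */
    static <T> T withRetries(Callable<T> attempt, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (i < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // Give up and report the most recent failure.
        throw last;
    }
}
```

A caller would wrap whatever opens the SSH channel in the Callable and treat the final exception as the real failure. Because the loop ignores the message text, it does not matter that the wrapped exception says only "java.lang.NullPointerException".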
Sep 24, 2009
Comment by Shawn Pearce <sop@google.com> on Wed Jul 01 15:55:56 PDT 2009
In http://thread.gmane.org/gmane.comp.version-control.git/122227 I asked other JGit developers this question, and there doesn't appear to be a consensus. However, Robin's remark about moving from a semi-known that could maybe be fixed to an unknown is pretty wise; it might be easier to fix JSch than to rewrite the interfaces to MINA SSHD and implement the features missing from MINA SSHD.
Sep 24, 2009
Update by Shawn Pearce <sop@google.com> on Thu Jul 02 10:14:47 PDT 2009
Sep 24, 2009
Comment by Shawn Pearce <sop@google.com> on Thu Jul 02 10:16:40 PDT 2009
The timeout also seems to trigger too frequently. E.g. setting
remote.name.timeout to 30 (seconds) can cause replication to completely fail,
because data doesn't make it up into the application layer in time. This is
either a bug in Gerrit's TimeoutInputStream (unlikely, given it waits 30
seconds) or in JSch's own network code (much more likely, given what I have
seen of it).
2009-07-01 17:23:19,572::ERROR: com.google.gerrit.git.PushQueue - Cannot replicate to ssh://android-replication@source.android.com:29418/platform/frameworks/base.git
org.spearce.jgit.errors.TransportException: Read timed out
    at org.spearce.jgit.transport.BasePackConnection.readAdvertisedRefs(BasePackConnection.java:148)
    at org.spearce.jgit.transport.TransportGitSsh$SshPushConnection.<init>(TransportGitSsh.java:365)
    at org.spearce.jgit.transport.TransportGitSsh.openPush(TransportGitSsh.java:97)
    at org.spearce.jgit.transport.PushProcess.execute(PushProcess.java:119)
    at org.spearce.jgit.transport.Transport.push(Transport.java:866)
    at com.google.gerrit.git.PushOp.pushVia(PushOp.java:192)
    at com.google.gerrit.git.PushOp.runImpl(PushOp.java:145)
    at com.google.gerrit.git.PushOp.run(PushOp.java:94)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
    at com.google.gerrit.git.WorkQueue$Task.run(WorkQueue.java:244)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.InterruptedIOException: Read timed out
    at org.spearce.jgit.util.io.TimeoutInputStream.readTimedOut(TimeoutInputStream.java:131)
    at org.spearce.jgit.util.io.TimeoutInputStream.read(TimeoutInputStream.java:104)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.spearce.jgit.util.NB.readFully(NB.java:67)
    at org.spearce.jgit.transport.PacketLineIn.readLength(PacketLineIn.java:120)
    at org.spearce.jgit.transport.PacketLineIn.readString(PacketLineIn.java:92)
    at org.spearce.jgit.transport.BasePackConnection.readAdvertisedRefsImpl(BasePackConnection.java:161)
    at org.spearce.jgit.transport.BasePackConnection.readAdvertisedRefs(BasePackConnection.java:142)
    ... 16 more
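For readers unfamiliar with how a read timeout surfaces as the InterruptedIOException in the trace above, the following is a rough sketch of one way to bound a blocking read. It is not JGit's TimeoutInputStream, whose implementation is not shown in this report; TimedRead and readByte are hypothetical names, and the approach here (a helper thread plus Future.get with a timeout) is just one possible technique.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; this is not JGit's TimeoutInputStream. */
final class TimedRead {
    private TimedRead() {}

    /**
     * Reads one byte on a pool thread and waits at most timeoutMillis for it.
     * If the underlying stream delivers nothing in time, the caller sees
     * InterruptedIOException, regardless of why the data was late.
     */
    static int readByte(InputStream in, ExecutorService pool, long timeoutMillis)
            throws IOException {
        Callable<Integer> task = () -> in.read();
        Future<Integer> pending = pool.submit(task);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the blocked reader to stop, then report the timeout.
            pending.cancel(true);
            throw new InterruptedIOException("Read timed out");
        } catch (ExecutionException e) {
            throw new IOException("Read failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("Interrupted while reading");
        }
    }
}
```

The point of the sketch is that the timeout only measures when bytes reach the caller. If a lower layer holds data without handing it up, the application still reports "Read timed out" even though the remote end answered.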
Sep 24, 2009
(No comment was entered for this change.)
Status:
Accepted
Owner: s...@google.com
Nov 21, 2009
(No comment was entered for this change.)
Owner:
s...@google.com