Issue 2284: "internal server error" when pushing to refs/for/branch
2 people starred this issue and may be notified of changes.
Status:  New
Owner:  ----


Reported by kebne...@gmail.com, Nov 27, 2013

Affected Version: 2.7


$ git push origin HEAD:refs/for/master
Counting objects: 29, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (16/16), 7.76 KiB, done.
Total 16 (delta 12), reused 0 (delta 0)
remote: Resolving deltas: 100% (12/12)
remote: Processing changes: refs: 2, done
To ssh://source.myhost.com/myproject.git
 ! [remote rejected] HEAD -> refs/for/master (internal server error)
error: failed to push some refs to 'ssh://source.myhost.com/myproject.git'

The log on the Gerrit server shows only:

[2013-11-22 11:42:49,879] ERROR com.google.gerrit.server.git.ReceiveCommits : Only 0 of 1 new change refs created in myproject; aborting

The change_id counter in the database gets incremented.
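
For reference, this is roughly how we watched the counter; a sketch assuming the MySQL back end and gwtorm's sequence-table layout (table change_id with a single auto-increment column s). The database name reviewdb is a placeholder for whatever your review DB is called:

$ # "reviewdb" and the column name "s" are assumptions, not verified output
$ mysql reviewdb -e 'SELECT MAX(s) FROM change_id;'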

We started seeing this after upgrading from 2.4-something to 2.7. We use MySQL as the back end, run from Gerrit's embedded Jetty, and use LDAP for auth. It doesn't happen every time, but once it does, it affects all subsequent pushes to refs/for/master (at least). Direct pushes that bypass review are not affected (e.g., pushes to refs/heads/master work fine). On one occasion, clearing the caches fixed the issue; on others, only restarting the Gerrit server has cleared it.
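
For the record, "clearing the caches" means flushing them all over Gerrit's SSH admin interface, along these lines (29418 is Gerrit's default SSH port):

$ # "admin" is a placeholder for any account allowed to flush caches
$ ssh -p 29418 admin@source.myhost.com gerrit flush-caches --all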

I only know for sure that it affects our primary repository, which is fairly large (~3.5 GB) and has a large number of refs (~15,000).

According to https://code.google.com/p/gerrit/issues/detail?id=1593#c44, at least one other installation (using PostgreSQL as the back end) is seeing this in 2.7 and also in 2.8rc0.

I can provide any further details required; if someone can tell me how to increase the log level, I'd be happy to do that as well.
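
In the meantime, here is my best guess at raising the log level, in case it helps anyone; an unverified sketch that assumes Gerrit honors an external log4j configuration when the standard log4j.configuration system property is set, with all paths being placeholders:

$ # point the JVM at a custom log4j config via container.javaOptions
$ git config -f /path/to/site/etc/gerrit.config container.javaOptions \
    '-Dlog4j.configuration=file:///path/to/site/etc/log4j.properties'
$ # raise the level for the receive path; note log4j.properties must
$ # otherwise be a complete log4j config (root logger plus appenders)
$ echo 'log4j.logger.com.google.gerrit.server.git=DEBUG' \
    >> /path/to/site/etc/log4j.properties
$ /path/to/site/bin/gerrit.sh restart

Corrections welcome if there is a simpler way.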
Nov 27, 2013
#1 kebne...@gmail.com
Here's the output of gerrit show-caches:

Gerrit Code Review        2.7                       now    15:40:22   PST
                                                 uptime    14 min 14 sec

  Name                          |Entries              |  AvgGet |Hit Ratio|
                                |   Mem   Disk   Space|         |Mem  Disk|
--------------------------------+---------------------+---------+---------+
  accounts                      |    28               |  19.5ms | 99%     |
  accounts_byemail              |    19               |  15.0ms | 92%     |
  accounts_byname               |    28               |  17.3ms | 96%     |
  adv_bases                     |                     |         |100%     |
  changes                       |                     |         |         |
  groups                        |     7               |  10.5ms | 12%     |
  groups_byinclude              |                     |         |         |
  groups_byname                 |                     |         |         |
  groups_byuuid                 |                     |         |         |
  groups_external               |                     |         |         |
  groups_members                |     5               |  17.2ms | 99%     |
  ldap_group_existence          |                     |         |         |
  ldap_groups                   |    27               |   6.0ms | 98%     |
  ldap_groups_byinclude         |                     |         |         |
  ldap_usernames                |     1               |  12.2ms |  0%     |
  permission_sort               |     9               |         | 99%     |
  plugin_resources              |                     |         |         |
  project_list                  |                     |         |         |
  projects                      |    65               |  10.7ms | 95%     |
  sshkeys                       |     8               |  31.0ms | 98%     |
D diff                          |     4     70  90.92k|  18.8ms | 87% 100%|
D diff_intraline                |     5     77  39.64k|   2.4ms | 16% 100%|
D git_tags                      |                0.00k|         |  0%     |
D web_sessions                  |     6    321 133.75k|         | 98% 100%|

SSH:      1  users, oldest session started 517 ms ago
Tasks:    3  total =    1 running +      0 ready +    2 sleeping
Mem:   1.65g total =   1.05g used + 439.45m free + 174.67m buffers
       8.89g max
        8192 open files,        4 cpus available,      204 threads
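
(Collected over the SSH admin interface; "admin" is a placeholder account and 29418 the default SSH port:

$ ssh -p 29418 admin@source.myhost.com gerrit show-caches
)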

This is basically a show stopper for us.
Mar 14, 2014
#2 nnun...@gmail.com
I was the reporter of the other installation seeing this failure. I can confirm that my repositories are still in a state where we can't push reviews, even on the latest and greatest version (2.8.2). We are running with a Postgres backing DB, LDAP authentication, and the embedded Jetty/Netty web server. Flushing the caches seems to help occasionally.

Please let me know if there is anything I can provide, reporting-wise. The only error I see in the error_log is several repetitions of the following:

java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcher.read0(Native Method)
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
	at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:273)
	at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:44)
	at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:690)
	at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
	at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
	at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
	at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
	at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:679)

Unfortunately, that stack trace provides no context on where to actually look.
Apr 21, 2014
#3 kebne...@gmail.com
It turns out that our problem stemmed from a mismatch between the change numbers in the change_id table and the change numbers in the actual Git repositories: specifically, we had existing refs with change numbers higher than the current Gerrit counter. Fixing that mismatch appears to have fixed this problem. Many thanks to Luca Milanesio and the GerritForge folks for helping with this.
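
For completeness, here is a sketch of the kind of consistency check that revealed the mismatch. It assumes direct filesystem access to the bare repository (the path is a placeholder) and relies on Gerrit's refs/changes/<last two digits>/<change number>/<patch set> ref layout:

$ # highest change number already present under refs/changes/
$ git --git-dir=/path/to/git/myproject.git for-each-ref \
    --format='%(refname)' refs/changes |
    awk -F/ '{ print $4 }' | sort -n | tail -n 1

If that number is at or above the change_id sequence value in the database (see the query in the original report), Gerrit presumably tries to create a change ref that already exists, which would match the "Only 0 of 1 new change refs created" error in the log.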

However, we believe strongly that Gerrit should give better error messages for issues like this. "Internal Server Error" is basically meaningless.
May 1, 2014
Project Member #4 luca.mil...@gmail.com
@Kate: definitely agreed. We will submit the patch on the Gerrit 2.10 master branch (and once merged, to stable-2.9 as well) so that a more explicit error message can be displayed.