Issue 2094: Gerrit 2.6.1 marks a commit as merged but it is not merged
Status:  Released
Owner: ----
Closed:  Mar 2014
Reported by s...@imsand.li, Aug 29, 2013
Affected Version: 2.6.1

Hi,
since we've updated our Gerrit from 2.5.4 to 2.6.1 we are facing following problem (at least once a month):

Gerrit marks a commit as merged, but refs/heads/master doesn't get updated.

This is the error message from the logfile:

[2013-08-13 14:26:09,469] ERROR com.google.gerrit.server.git.ChangeMergeQueue : Merge attempt for <project-name>,refs/heads/master failed
com.google.gerrit.server.git.MergeException: Cannot update refs/heads/master
	at com.google.gerrit.server.git.MergeOp.updateBranch(MergeOp.java:708)
	at com.google.gerrit.server.git.MergeOp.merge(MergeOp.java:288)
	at com.google.gerrit.server.git.ChangeMergeQueue$2.call(ChangeMergeQueue.java:207)
	at com.google.gerrit.server.git.ChangeMergeQueue$2.call(ChangeMergeQueue.java:204)
	at com.google.gerrit.server.util.RequestScopePropagator$5.call(RequestScopePropagator.java:222)
	at com.google.gerrit.server.util.RequestScopePropagator$4.call(RequestScopePropagator.java:201)
	at com.google.gerrit.server.git.PerThreadRequestScope$Propagator$1.call(PerThreadRequestScope.java:75)
	at com.google.gerrit.server.git.ChangeMergeQueue.mergeImpl(ChangeMergeQueue.java:204)
	at com.google.gerrit.server.git.ChangeMergeQueue.merge(ChangeMergeQueue.java:124)
	at com.google.gerrit.server.change.Submit.apply(Submit.java:111)
	at com.google.gerrit.server.change.Submit.apply(Submit.java:53)
	at com.google.gerrit.httpd.restapi.RestApiServlet.service(RestApiServlet.java:288)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
	at com.google.gerrit.pgm.http.jetty.GetUserFilter.doFilter(GetUserFilter.java:76)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.gwtexpui.server.CacheControlFilter.doFilter(CacheControlFilter.java:70)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.gerrit.httpd.RequireSslFilter.doFilter(RequireSslFilter.java:68)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:64)
	at com.google.gerrit.httpd.AllRequestFilter$FilterProxy.doFilter(AllRequestFilter.java:57)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.gerrit.httpd.RequestContextFilter.doFilter(RequestContextFilter.java:75)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:67)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:365)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:937)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:998)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:856)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:196)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:627)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:51)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: REJECTED
	at com.google.gerrit.server.git.MergeOp.updateBranch(MergeOp.java:705)
	... 66 more

Additional Information:
 * Change Submit Action is cherry-pick
 * The commit which fails has the comment: Change has been successfully merged into the git repository. Instead of Change has been successfully cherry-picked as <id>
 * Error did never occurred in 2.5.x
 * Usually when the error happens, there were several (depended commits) and gerrit had already problems with conflicts
 * According to the users which had this problem, gerrit had several seconds (15-30s) for submitting which is quite unusual.
 * We also have sometimes the message "Submitted, Merge Pending" which is strange for me, when submit type is cherry-pick and the commit was not a merge-commit.
 * I've tried to reproduce this error for hours but I had no success...

Oct 4, 2013
#1 Slawek.P...@gmail.com
We have the same problem. We also have Submit Action set to cherry-pick.

What we observed is that in the same time other user successfully submitted his the change.

I guess it might be some race condition. Maybe fast forward of master branch to the newly cherry-picked commit fails, because due to the concurrent Submit by another user, it no longer is fast forward.
Oct 7, 2013
#2 s...@imsand.li
I had recently the same problem again. This time I run our backup and restored gerrit  on a virtual machine.
Finally I was able to reproduce the problem and revert back to it using the Virtual Machine. Yay!

As I'm not familiar with the gerrit-code I've just added some debug statements to find out what really happens. Compiled the ear (2.6.1), started it, submitted the Change which always failed. -> And it worked perfectly fine :(

my conclusion: This is a race condition, probably a timing issue. And I can just confirm the analysis of Brandon Casey (https://groups.google.com/forum/#!msg/repo-discuss/MYo9RSupHCw/_wQproXU1ngJ). The actual problem is that jgit returns Result.REJECTED (for no apparent reason). A retry here would probably already workaround the problem.
Oct 31, 2013
Project Member #3 bassem.rabil
We experienced the same issue with Gerrit 2.7, it is very mis-leading to mark the change as merged while not being merged. This caused confusion for our users.

Nov 1, 2013
Project Member #4 dougk....@gmail.com
When I just experienced this on 2.7, it happened because there was a path conflict, and I ended up in Submitted, merge pending state.  So, still, a weird result (and not what happened on 2.4), but not exactly the same thing (just with a nearly-identical backtrace).
Nov 5, 2013
Project Member #5 bassem.rabil
I think the priority of this bug needs to be increased because users are getting wrong status about the change in the Gerrit UI. My assumption is that this is caused by not using the submit type "Cherry-Pick" and Gerrit tries to perform the merge and thus this results in merge REJECTED which stops the change from being merged later on however the database is updated and thus the Gerrit UI shows it has been merged.
Nov 5, 2013
#6 pacos...@gmail.com
We are getting bitten by this several times a day.  We have the submit type set to "Cherry-Pick" on all projects.

Manual workaround we've been using is to manually reset the status in the SQL db to unmerged, then have the user re-do submit in Gerrit. It seems to work on the second attempt almost every time.
Nov 6, 2013
#7 schniede...@gmail.com
We also see this issue about once a week, using 2.6.1 with "Cherry-Pick".

The problem is that the user doesn't notice anything wrong, so it usually takes a few days until it is found to be hit by this bug.
Nov 6, 2013
Project Member #8 dougk....@gmail.com
I've seen changes come back from JGit as REJECTED, however when I look at the source in MergeOp, it seems the exception gets thrown in updateBranch(), which happens before updateChangeStatus().  Indeed, when I look on the server what branches contain that commit, everything looks correct.  So, I'm at a loss.

As for the other issue I saw that might be related (related to path conflicts showing up as merge pending), it appears to be completely unrelated.  Ignore that one. :)
Nov 12, 2013
#9 casta...@motorola.com
Can anyone independently verify "pacos...@gmail.com"'s workaround of rolling back the state of the affected changes table entries?

Is this a reliable workaround until a fix is made?
Nov 12, 2013
#10 omfgcw...@gmail.com
We also have this issue in 2.7.0 Cherry-pick.

I strongly recommend upgrading the priority of this bug. It is not technically repository corruption, but comes quite close. By any standard this shouldn't be regarded "minor"!
Nov 12, 2013
#11 pacos...@gmail.com
I obviously can't independently verify my own work around, but we're using it multiple times per day with (apparently) no negative consequences.

FWIW, here's the command we use to reset it:
  $ ssh -p 29418 <host> gerrit gsql "UPDATE changes SET open = 'Y', status = 'n' WHERE change_id = <change_id>;"

where "<change_id> is the id from the "refs/changes/<change_id>" string, not the ChangeId you'd see in the commmit message.


We've actually now seen two variations of this bug.  The first is the one we've been discussing where the change gets marked as merged, but nothing else happens (nothing is ever updated in the repository).  The second, less common, variation is that it actually successfully cherry-picks the commit to a new commit in the repository and creates the object, but never advances the branch to point at the new commit, leaving us with a dangling commit object.  In that case, _if_ you catch it before any other changes are merged to the same branch, you can manually advance the branch using "git update-ref" and then clear the gerrit cache so it picks it up, but if you don't get to it before another change gets merged, then you have to reset the db as noted above.

I agree, this is more than a minor issue.  We're losing hours of engineering time a week on this.  
Nov 12, 2013
#12 omfgcw...@gmail.com
@pacos

To add a bit from us, we use psql and

  select change_id from change_messages where message like "%Change has
  been successfully merged into the git repo%";

to find the suspects. In case anyone one this bug didn't already do something like that.

Instead of changing the gerrit DB, we just get our changes a new Change-Id and resubmit, which feels safer.

I generally do not understand the reason for this bug. It is just a mere symptom.

There is, from common sense to hard cryptographic facts, no reason one could ever expect a parallel cherry-pick to the same branch to work out in git. It's just impossible to implement, and rightly so, and deserves guards in place to defend against this type of bug.

Devs, please do not waste time to jot down this error precisely but just fail early, ideally encouraging the user to re-submit when there is less contention. Even distributed teams know when others go to sleep.

Nov 15, 2013
Project Member #13 dougk....@gmail.com
This sounds kinda scary, but I actually found one commit that showed as merged, not contained on any branch, but it showed the regular "Change has been cherry-picked" (and indeed, the ref to the cherry picked commit is there), but again, the branch pointer just didn't move forward to it.  Also, while it showed REJECTED by JGit and then stuck in "SUBMITTED" status in the DB, the "MERGED" status didn't show up for two hours, and that was at the point a second change was cherry-picked into the branch (and both got marked as merged... but only the second appears in the branch).

Also, both changes ended up in SUBMITTED status -- first change had a path conflict, second change did not but got stuck as a result of the "REJECTED" error (and the first was still in the merge queue), the first change was rebased and merged successfully, at which point the second change was shown as merged, but is not on a branch.  Hopefully that gives an idea of the chronology.

This behavior makes me suspect the ALREADY_MERGED behavior, such as:
https://gerrit-review.googlesource.com/49890
https://gerrit-review.googlesource.com/48370 (as these were brought up on the mailing list).  I don't think either of these are explicitly to blame, though, as they exist on the stable-2.7 branch, and this issue dates back to 2.6.1 at least.

The bug does have some similarities to  Issue 2244  (which I am attempting to fix in https://gerrit-review.googlesource.com/51522 ) -- basically, there should have been no way for a patch with a path conflict to get into SUBMITTED status.

We also have monitoring that polls for changes in the SUBMITTED status, and I'm usually pretty quick about abandoning and restoring changes that get stuck SUBMITTED to evict them from the merge queue.  Both times I've observed this corruption, I've observed a change stuck in SUBMITTED status around the same time, on the same project/branch as the one that loses a commit.  May be just a coincidence, but it at least seems like something to watch for and see if a pattern begins to emerge.
Nov 15, 2013
Project Member #14 dougk....@gmail.com
Sorry for the rapid-fire updates... but I figured out how to reproduce the "REJECTED" result, but not the commit missing from the branch yet.

1. Get a change into merge pending status (easy way is to create a review with a path conflict, then hit submit twice to reproduce  Issue 2244 ).
2. Submit a second change, possibly unrelated to the first, but on the same branch.
3. Attempt to merge the second change, note that you'll end up in the "unborn branch" case of CherryPick._run().  This is the part I can't figure out.
4. You'll receive the "REJECTED" stack trace as described here.

This is where I get stuck.  If I do something (i.e. rebase, abandon) the other change that's blocking the merge queue, the change goes in successfully and appears on the branch.  I've yet to figure out how a change gets marked "MERGED" but isn't.
Feb 3, 2014
#15 casta...@motorola.com
Has there, or will there be any movement on this CR?  We have this problem show up from time to time on our system and we have no fix for it.

This is really not a "Minor" problem when the system shows from all available output other than error logs that the user's actions work but in fact the change never leaves "refs/changes/..." and never in fact merges.

The consequences of such a bug at the right time in a development cycle of our products could be devastating.

It frankly makes Gerrit unreliable as a tool.

So when can we get some traction on this?
Feb 3, 2014
#16 s...@imsand.li
FYI.

We haven't faced this issue in 2.8 and 2.8.1 yet.
Mar 14, 2014
#17 sop@google.com
(No comment was entered for this change.)
Status: Released
Labels: FixedIn-2.8.3
Mar 14, 2014
#18 schniede...@gmail.com
I'm still seeing that issue in 2.8.1 occasionally.
May 9, 2014
#19 nthieb...@gmail.com
Still present in 3.8.3

see:https://gerrit.libreoffice.org/#/c/8795/
and notice the 'has been merged' message instead of 'has been cherry-picked'...
and it did not make it to master despite gerrit saying so.
May 12, 2014
Project Member #20 bassem.rabil
Still persists in Gerrit 2.8.4
Jan 19, 2015
#21 glennch...@gmail.com
still get this kind of error in Gerrit 2.8.6.1
Sep 4, 2015
#22 moej...@gmail.com
Still gets this error in 2.9.4, and at times it happens in clusters
Sep 20, 2015
#23 nam6...@gmail.com
had same issue in 2.9