My favorites | Sign in
Project Home Downloads Wiki Issues Source
New issue   Search
for
  Advanced search   Search tips   Subscriptions
Issue 377: Support git submodule dependencies between changes
62 people starred this issue and may be notified of changes. Back to list
Status:  Released
Owner:  ----
Closed:  Jun 2015


Sign in to add a comment
 
Project Member Reported by Shane...@gmail.com, Jan 6, 2010
One of the projects I work on uses git submodules for different parts of
the application, each submodule is added in gerrit as a separate project.

Sometimes code committed in one of the submodules is dependent on code
commited in either the main project or another one of the submodules, it
would be nice if we could manually specify this in the commit. Something
like Depends-On: Iefb00f7a27b39d664c8cee4923c01409431bc3df


Jan 11, 2010
#1 sop@google.com
Do you really want to do this manually?

What we actually want to do is have Gerrit understand the submodule repository
structure and automatically setup these dependencies based on the actual commit
SHA-1s.

In your example, the supermodule change wouldn't be able to be submitted until
the submodule commit(s) it points to is submitted to the submodule project.

By doing this we also hope to support automatically updating the supermodule if
a submodule changes and the supermodule is configured to automatically track the
submodule.  This would allow engineers to submit changes into a subproject and
not worry about every single update in the supermodule.
Summary: Support git submodule dependencies between changes
Status: Accepted
Labels: -Type-Bug Type-Feature
Jan 11, 2010
Project Member #2 Shane...@gmail.com
I'd be much happier with it being automatic :)

I didn't think of that approach because on the project I am on we use the cherry-pick
method of merging (We wanted the Reviewed-On/Reviewied-By/Tested-By bits in the
commits), so the SHA1 that actually gets committed to the central repo doesn't match
the SHA1 I committed locally, (This means that after the change in the submodule has
been committed, someone (in our case, this someone crontab) has to go update the
submodule pointer.) so I didn't think it would be possible.
Apr 9, 2010
#3 OdinG...@gmail.com
Automatic submodule dependency tracking as you describe it may solve the whole problem.

However, being able to manually specify "Depends-On: xxxxx" in a commit message would
be very useful for people using repo manifests. This would allow you to create
inter-project dependencies when they are warranted. Currently there is no way to let
Gerrit know that a change in one repository should be submitted only after a change
in another.

Usually this doesn't affect that many changes so I think manual creation is fine.
Also, to make it easy on the developer being able to put either the Ixxx change id or
the original git SHA-1 would be nice (the SHA-1 can be grabbed before everything has
been uploaded to gerrit).
Apr 24, 2010
#4 sop@google.com
This is being worked on by some folks.

https://review.source.android.com/14033
Status: Started
Aug 16, 2010
Project Member #5 bklarson@gmail.com
Shawn, do you envision this as replacing the current repo manifest?  I'm not sure if this change encapsulates your long-term plans [1].

I'm curious because we are starting lots of scripted testing using repo, and are running into problems such as Issue 69 [2].  I'm not sure if we should spend time fixing issues with the current manifest format, or help convert to the new format.

Are you at a point where you could use help with [1]?

[1] http://groups.google.com/group/repo-discuss/browse_thread/thread/a337001e3495bb23/3123cdd6186ee7ca
[2] https://code.google.com/p/git-repo/issues/detail?id=69#c0
Aug 16, 2010
Project Member #6 bklarson@gmail.com
Ugh - that is issue 69 in git-repo, not the closed issue in gerrit.
Mar 1, 2011
#7 mar...@longhome.co.uk
Why limit this to submodules. I haven't worked on a project that uses submodules yet, but I do use repo. The same could apply to repo uploads. The ability to manually add a dependency (via web or git comment) would be awesome. 

I currently have a change submitted to AOSP, which touches several repositories. Currently they are only linked by the title of the commit - someone could potentially try to review the change to just one of the repositories. 

+1
May 16, 2012
Project Member #8 choro...@wikimedia.org
Isn't this pretty much accomplished with submodule registration in 2.3?
Mar 16, 2014
#10 phil.le...@googlemail.com
I'm currently working on implementing this, as submodule registration isn't the solution for me. However, I could do with some pointers from a friendly dev who understands the gerrit codebase....

Specifically, I'm starting by modifying GetChange.java to walk through Depends-On: footers. I believe I need to use ChangeControl.Factory to get the Change for the given Change-Id, however adding ChangeControl.Factory as a constructor argument to GetChange isn't enough, and I run into the error at the end of this message.

I'm fairly proficient at standalone java, however Guice and servlets are relatively unchartered territory for me, so be gentle :)

[2014-03-16 15:49:47,240] ERROR com.google.gerrit.pgm.Daemon : Unable to start daemon
com.google.inject.CreationException: Guice creation errors:

1) No scope is bound to com.google.inject.servlet.RequestScoped.
  at com.google.gerrit.server.project.PerRequestProjectControlCache.class(PerRequestProjectControlCache.java:34)
  while locating com.google.inject.Provider<com.google.gerrit.server.project.PerRequestProjectControlCache>
    for parameter 0 at com.google.gerrit.server.project.ProjectControl$Factory.<init>(ProjectControl.java:105)
  while locating com.google.gerrit.server.project.ProjectControl$Factory
    for parameter 0 at com.google.gerrit.server.project.ChangeControl$Factory.<init>(ChangeControl.java:125)
  while locating com.google.gerrit.server.project.ChangeControl$Factory
    for parameter 2 at com.google.gerrit.server.change.GetRelated.<init>(GetRelated.java:74)
  at com.google.gerrit.extensions.restapi.RestApiModule.view(RestApiModule.java:82)

1 error
	at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:448)
	at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
	at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
	at com.google.inject.internal.InjectorImpl.createChildInjector(InjectorImpl.java:230)
	at com.google.gerrit.pgm.Daemon.createSysInjector(Daemon.java:356)
	at com.google.gerrit.pgm.Daemon.start(Daemon.java:274)
	at com.google.gerrit.pgm.Daemon.run(Daemon.java:200)
	at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:63)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:167)
	at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:94)
	at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:51)
	at Main.main(Main.java:25)

Mar 16, 2014
#11 phil.le...@googlemail.com
Should have read GetRelated.java, not GetChange.java
Mar 16, 2014
#12 phil.le...@googlemail.com
OK, I've made progress and am now battling my way through the gwtui portion of the code. For anyone ending up here who wants to know the solution, it's to use ReviewDb's methods to create objects. e.g. :

List<Change> changes = dbProvider.get().changes().byKey(changeKey).toList();
PatchSet patchSet = dbProvider.get().patchSets().get(currentPatchSetId);

...and so on.
Feb 18, 2015
Project Member #14 sbel...@google.com
As a follow up on this:
If you want to have something similar, checkout the `change.submitWholeTopic` config in the master branch of gerrit. That works across different projects.

Also another thing to be handy in this situation is the submodule subscription
https://review.openstack.org/Documentation/user-submodules.html which sounds more
like the original issue. 

I'll be working on combining these two things (submit different submodules at the same time in the same topic, such that you only get one superproject commit. That's my current vision at least).
Jun 22, 2015
Project Member #15 jrn@google.com
submitWholeTopic does what ShaneMcC was asking for. Submodule subscription is a different thing.
Status: Released
Jun 23, 2015
#16 phil.hord
submitWholeTopic does not do precisely what ShaneMcC was asking for.  What he wanted (5 years ago) was a gate to hold some commit from being merged if other related topics are unmerged; however, submitWholeTopic is a tool to allow someone to cause all related topics to be submitted with one simple action.

Consider I have two changes, A and B, both in the same topic (if that's to be our control mechanism).

With submitWholeTopic, I get this feature:

   When viewing change A, there is a SubmitWholeTopic button.  
   If I press it, A and B are both submitted.

What Shane wanted (and I want) is this:

   When viewing change A, there is a Submit button.  
   If I press it, then if
       B is already merged  => A is merged ;
       B is "merge pending" => A and B are both merged ;
       B is not submitted   => A is "merge pending"

Quite simply:
  A is treated as if B is its ancestor.
  B is treated as if A is its ancestor.
  One cannot be merged without the other also being merged.


Jun 23, 2015
Project Member #17 sbel...@google.com
Phil,

So here is how I understand the underlying message, correct me if I'm wrong:

What you describe are atomic guarantees between A and B. So you actually care about
"Either both of A and B are not merged, or both of A and B are merged, there should be no state in which A or B is merged, but not the other." Mind that there is no
assertions about SUBMITTED vs NEW.

The submitWholeTopic tries to minimize the time of the forbidden state being observable, but doesn't eliminate it. This is because we started to tackle this "two changes are set to the MERGED state atomically" problem in a way which has most impact first. And having minimized the time of the undesired state is doing what you want mostly.

Currently I am working on getting rid of the MergeQueue (and eventually the SUBMITTED state), because it doesn't make sense to keep that around. You always had the uncertainty of SUBMITTED, Merge Pending and alike, which is not what the user wants, but the user wants a clear feedback of when a change is not in and when it is.

When getting rid of the MergeQueue, we will not send emails any more about failed merges, but rather the UI will tell you if the merge (of both A and B grouped together by the topic) succeeded or not. That way we will minimize the undesired state of one of A,B being merged while the other isn't, a bit further. 

Eventually later we want to have both A and B to be merged within a single data base
transaction.

Jun 23, 2015
#18 phil.le...@googlemail.com
I've moved jobs since this was an issue for me, but from what I recall,
that's essentially correct.

The more common scenario is where the superproject S has dependencies on A
& B, and change B shouldn't be merged until change A has been merged.
However, sometimes cross-dependencies between A&B were unavoidable due to
the architecture of the project being modified (and since it's a big
open-source project, restructuring wasn't a viable option).

This is all to do with plugins for S and introduction library functions in
A to be used by B (and possibly C).
Jun 24, 2015
#19 phil.hord
I appreciate the efforts you are taking to understand my issues here.  Thanks.

I think you are still missing my own motivation.  Since my read of the OP's wording matches my own needs exactly, I tend to think he thinks the same as I do.  But I cannot say for sure.

I am not so concerned about the atomicity of the commit action on A & B. I am more concerned with policy.  At the risk of confusing you even more, let me change the example premise a little bit;  let's say "B depends-on A" with the understanding that this does not mean automatically that "A depends-on B".

For my project, B-depends-on-A means that the changes in B will not work without the changes in A.  For example, A may introduce a new API and B uses this new API.  If this were a single repo I would merely ensure that A is an ancestor of B and Gerrit (and Git) would take care of the rest, so long as I do not use "Submit type: Cherry Pick".  

In fact, this is a good example to understand the desired behavior.  In a single repo, if Y depends-on X, then I may CR=+2 on Y and CR=-1 on X.  And I can safely "submit" Y; it will not get merged right away because it depends-on X.  Some day X may get merged, and then Y can get merged too.  But it is not important to me that Y gets merged right away, atomically or close to it.  It is only important to me that Y is prohibited from merging _by policy_ until X is also merged.

So, let's look at A and B again.

  B depends-on A  ==>  B may not be merged until A is also merged.

For this issue (#377) I do not care if B is merged immediately, or atomically, or "with all due speed".  I only care that it is not merged before A. The API which A introduces may have other customers or none at all.  The merit of commit A does not extend to B, nor vice versa.  So I may safely merge A without needing to merge B, and I can do so far in advance of merging B.

Now, another example:

  B depends-on A  ==>  B may not be merged until A is also merged.
  C depends-on A  ==>  C may not be merged until A is also merged.

This does not mean that C depends-on B.  C is merely another customer of A's new API.  

  C may not be merged before A.
  C may     be merged before B.
  C and B are unrelated, except that both depends-on A.

Here is the example I think you are trying to solve with submitWholeTopic:

For some commits D and E, 

  D depends-on E
  E depends-on D

This is a reasonable situation.  For example, perhaps D changes an existing API which is used only in two repositories.  D and E contain the required changes to each repo.  Neither is acceptable without the other one.

This is a specialization of this issue and it should be considered, but it is not the general issue.  It can be solved by solving the general issue, though.  Suppose D is reviewed and submitted.  Then we should end up here:

  D: Submitted; merge-pending  (Cannot be merged because depends-on E)
  E: New change; Needs review  (Cannot be merged because needs-review)

Later when E is reviewed and also submitted, we briefly are in this state:

  D: Submitted; merge-pending  (depends-on E)
  E: Submitted; merge-pending  (depends-on D)

But you can see that both commits conditions' are met, both of these commits may be safely merged.

  D: merged
  E: merged

1. You *should* merge both of these atomically (as closely as possible) but this still is not the point of this  issue #377 .

2. You *could* merge both of these to some candidate-branch in each repo and only move the branch-heads after you ensure that all dependent changes were merged successfully.  I think this might be nice, but it is outside the scope of this  issue #377 .

3. You *could* safely merge each of these commits and then add a single commit to the superprojects (which are subscribed to them) with updates to both submodule gitlinks atomically, if the projects belong to some registered_submodules.  But this also is not the point of  issue #377 .  (However, it does nicely handle the atomicity for you in the case of submodule subscriptions, except that if there is some merge-failure you do still need to undo _something_ so your errant repo is still a valid merge target for other commits.)

So, to recap, what I want is this:

  B depends-on A  ==>  
     Policy: B may not be merged until A is also merged.


Contrarily, submitWholeTopic gives me this:

  A belongs to foo; B belongs to foo  ==> 
     Convenience: There is now a button to submit *all* 
                  'foo-related' commits approximately at once.

These are two very different things.
Jun 24, 2015
Project Member #20 jrn@google.com
When submitWholeTopic is enabled, there is no way in the UI to submit a single change without submitting the rest of the topic.
Jun 24, 2015
Project Member #21 sbel...@google.com
Ok, now I understand your position very well with the API provider and consumer. And as A and B are in different projects we cannot do a ancestor/parent relationship as Git would provide within a single repository.

Looking at the proposed solution in the first post, I think it doesn't cut it yet.

So say you have repoA and repoB which you develop changes A and B for. A "depends-on" B. You can have a change (with the same change id, but different legacy change numbers), targeted at different branches, so what you would really need to write down policy-wise is:

A "depends-on" (B is integrated in repoB/master) 

[B could also be just cherry-picked to repoB/proposedIntegration for testing or such]

Regarding examples D & E:
1) Assuming we were to have such a policy enforcement in place, a non-atomic merge
would violate this policy, which would be bad design (why would we allow Gerrit to violate a policy which it is told to enforce?)

3) I thought a similar idea is very a neat, so I had code for it[1].
Say change A in repoA and change B in repoB are in the same topic (the special case with circular dependency),
and both would be submitted and we have submodule subscriptions enabled such that
changes in repo{A,B} will feed into a common superproject, then both changes A and B would come into the superproject via a single commit updating both submodules in that commit.
I want to revive that topic, once I am done with the MergeQueue refactoring though.





[1] https://gerrit-review.googlesource.com/#/q/topic:towards-atomic-submodule-updates




Jun 24, 2015
Project Member #22 sbel...@google.com
jrn: Yeah but this single change submission may be desired when building out a dependency graph.

Say we have new API change A, B and C (each in its own repository repo{A,B,C}) and then we also have changes bringing in new features depending on these new APIs. (D depends on {A,B}), (E depends on {B,C}), (F depends on B)

Because the APIs are wonderfully documented and you know how they will work eventually, you can write the code and test with mocks and submit the features.
They would be merge pending until the depending APIs are actually merged.

So currently you could not model this.


Jun 24, 2015
#23 phil.hord
jrn> When submitWholeTopic is enabled, there is no way in the UI to submit
jrn> a single change without submitting the rest of the topic.

Yeah, and that's a problem for my use case.

sbel> 1) Assuming we were to have such a policy enforcement in place, a non-atomic 
sbel> merge would violate this policy, which would be bad design (why would we 
sbel> allow Gerrit to violate a policy which it is told to enforce?)

For the sake of simplicity, I overstated the dependency a bit.  I only want Gerrit to refuse to merge E until it agrees to merge D.  After that, I only hope Gerrit will do a best-effort attempt at carrying out the merge goals.

I take your point about the change-id since they can (now) be re-used on different branches.  I actually have an implementation of this "Depends-on" relationship in Jenkins, but there I am on the other side of the fence.  When Jenkins wants to test a changeset, I first have it search the "new" patches for any "Depends-on:" directives in the commit log.  All the found directives are queried in each submodule of my project using something like this:

   gerrit query --current-patchset is:open "$@"

Any matching commits are fetched and merged, and the newly arriving commits are again searched for new "Depends-on:" directives.  (That is, it is recursive.)  This continues until no _new_ directives are found.

Notice my query searches only for the latest patchset of is:open changes. This means that an abandoned change becomes a non-dependency.  This is fine for Jenkins since the point of the dependency system here is to help Jenkins SUCCEED. If there is a real dependency here and it gets ignored, Jenkins will presumably fail during 'make' and the hapless author will need to investigate why.

If the query does not match anything, I ignore it. I could complain instead, but for Jenkins' needs, I am willing to fail during the build instead (VRFY:-1 gets people's attention), or to pass if the dependency is already met somehow else.

A query may match multiple items in the same and/or different submodules; this is allowed so long as they all merge cleanly.

A query which matches the same change-id on multiple branches is likely to cause pandemonium and will fail to merge.  This would be a false-negative for Jenkins, but it is an acceptable fallout for such irresponsible construction.  In practice we do not have many of these; less than 5% of our changes are ever cherry-picked to another branch, and something less than 1% ever have a "Depends-on" directive; the intersection of these sets is miniscule.

But I can see how this could be a problem for Gerrit.  I would expect that Gerrit's policy might be to refuse to merge E unless ALL(D) are also merge(able).  This would be overly restrictive for some, but it's better than being too permissive.  In particular, I don't think you can say E(master) depends on D(master) or similar branch-matching heuristics, because my superproject may lean on different branch names in different projects.  Gerrit should not presume to know my wishes.  But if it were really a problem, I could do this (though it does not scale):

  Depends-on: I7a6d7ff0ce95f3cfe2787eef5959428664bc0d1d branch:master

I say it does not "scale", but I mean it does not automatically get updated when you cherry-pick this commit to some other branch.  I personally do not think this is a specialization worth solving, but others may be more interested in it.

In practice my directives look like this:

   Depends-on: I7a6d7ff0ce95f3cfe2787eef5959428664bc0d1d
   Depends-on: I68b5e642186869c0c4e484ad12ba57b1f86e75e9

I use change-id's because I know them before I have to push, so I can set up my dependencies ahead of time.  But if I know the change number of some commit already, I can use that instead:

   Depends-on: 18532

And if I want whole-topic dependency (even if _this_ change is not in that topic), I can do this:

   Depends-on: topic:FooBar

Or this:

   Depends-on: topic:FooBar branch:master

Sign in to add a comment

Powered by Google Project Hosting