Issue 560: Rename project in web interface
98 people starred this issue and may be notified of changes.
Sometimes rename is needed due to (for example) typo or path change.
May 10, 2010
#1
sop@google.com
Status:
Accepted
Jun 15, 2010
Within SAP we have started to work on this issue and we are facing a dilemma:
- The project name is the primary key of the Project class, so all tables that reference projects contain project names.
- If we update the primary key we have to update all the references. gwtorm doesn't seem to support updating a primary key. Right? So do we have to delete/insert in the projects table?
- We also have to update all the project references in other tables. This means that, in the future, whenever a new reference to a project is introduced, one has to remember to also update this piece of code. Is there any generic way in gwtorm to achieve that, for example to find all instances that contain project references?
- What about introducing a generated ID as the project key and having the project name only as a project property? Of course, in this case we would need a data migration, but project rename would be simple after that.
Jun 16, 2010
We used to use a synthetic key for the project primary key, because I did want to support renaming without having to edit the entire database and all dependent data records. But I took it out in commit 8df7cf76891b [1] for a couple of reasons:
- We are trying to port Gerrit Code Review onto a storage engine similar to Apache Cassandra or Google Bigtable.
- We are trying to port Gerrit Code Review to keep all of its data inside of Git repositories, rather than SQL.
Either approach prefers that projects have a single way to identify them, rather than multiple. So, I dropped the Id from the projects table and started using the name as the only way to identify it.
Unfortunately gwtorm doesn't support changing the key of an object, because for a NoSQL system like Cassandra or Bigtable you have to move the record by inserting the new copy and deleting the old copy. We also don't have a way to tell you which objects are using the key, though that could be derived by walking the graph of the database model. It's not something we currently export in the public API of gwtorm, but I think all of the data you need to compute it is available in the internal model gwtorm uses to generate the access code.
This change is hard to implement. Even if we revert the Id removal and go back to a synthetic primary key, we have to also rename the Git repository on disk, while it is being used by clients.
Perhaps I was wrong to remove the Project.Id key, and should have done a dual-entity like we have with the AccountGroup type and AccountGroupName. OK. So maybe I screwed up here, and we need to go back to a synthetic project key. You can start by looking at what it will take to revert that commit...
[1] http://android.git.kernel.org/?p=tools/gerrit.git;a=commit;h=8df7cf76891bca1b65aa48df7b5369c2245b8432
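A minimal sketch of the dual-entity idea discussed above, assuming two in-memory maps stand in for the two tables. The class DualEntityProjectStore and all of its methods are hypothetical illustrations, not part of Gerrit or gwtorm. It shows why a rename touches only the name entity (insert the new name row, delete the old one) while everything that references the synthetic id stays untouched.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model: projects keyed by a synthetic id, names kept in a second entity. */
final class DualEntityProjectStore {
  // Stand-in for the "projects" table: synthetic id -> current name.
  // Other tables would reference the id only.
  private final Map<Integer, String> nameById = new HashMap<>();
  // Stand-in for the name entity: name -> synthetic id. The name is the key here,
  // so a rename is an insert of the new row followed by a delete of the old row.
  private final Map<String, Integer> idByName = new HashMap<>();
  private int nextId = 1;

  synchronized int create(String name) {
    if (idByName.containsKey(name)) {
      throw new IllegalArgumentException("project exists: " + name);
    }
    int id = nextId++;
    nameById.put(id, name);
    idByName.put(name, id);
    return id;
  }

  synchronized void rename(String oldName, String newName) {
    Integer id = idByName.get(oldName);
    if (id == null) {
      throw new IllegalArgumentException("no such project: " + oldName);
    }
    if (idByName.containsKey(newName)) {
      throw new IllegalArgumentException("project exists: " + newName);
    }
    idByName.put(newName, id); // insert the new name row
    idByName.remove(oldName);  // delete the old name row
    nameById.put(id, newName); // the name is now just a property of the project
  }

  synchronized Integer idOf(String name) {
    return idByName.get(name);
  }

  synchronized String nameOf(int id) {
    return nameById.get(id);
  }
}
```

The Git repository on disk still has to be moved separately; this sketch covers only the database side of the rename.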
Jun 17, 2010
OK, the motivation is clear and the solution with the dual-entity approach is also clear.
One additional question came from looking at the Schema_21.migrateData method. It moves the value of wildProjectName from the projects table to the system_config table but it doesn't delete the '-- All Projects --' entry from the projects table. Was that intentional or was it forgotten?
The issue with renaming/moving the Git repository on disk was also known to us, and we were hoping that there is a way to acquire an exclusive write lock on the complete Git repository, then move it, and finally release the lock. Is something like that possible with JGit?
Jun 17, 2010
It was intentional to keep '-- All Projects --' in the table. This is a special project that has no repository, but its access rights are inherited by every other project in the server. So we need it for that inheritance. Previous to schema #21 we identified that special project by giving it the Project.Id value of 0. This was hardcoded in the source code. When we removed the Project.Id we had to identify it by another means. Since it was remotely possible to rename a project by editing the database, I chose to track the name of this special project in a column in system_config (where we also track special groups) rather than rely on a magic string. Going forward I would suggest changing that field in system_config back to the Project.Id column and using that to identify this special project, rather than hardcoding it again in the source code, or using its name. Regarding an exclusive lock, there isn't a way to do this in JGit. You can however try to do it in LocalDiskRepositoryManager class in Gerrit Code Review. This class is what handles the requests coming in from the clients. Implement a fair read-write lock here, and during a rename wait for the write lock. To actually perform the unlock, you need to know when the caller has invoked close() on the Repository object. This may require that you subclass the Repository to override its close method. Ugh, not pretty. Maybe a small change to JGit to better support a read-write lock here is worthwhile.
Jul 20, 2010
JGit's Repository object seems to be the natural place for the read-write lock in order to ensure unlocking in Repository.close(). I looked at several implementation options:
1. Subclass the Repository class in Gerrit. This wouldn't work, as Repository instances are created by JGit's RepositoryCache class and we can't tell it to instantiate our subclass.
2. Wrap the Repository instance and store the lock in the wrapper. Unfortunately Repository is not an interface, which makes wrapping not so nice. The wrapper has to subclass the concrete Repository class just to override all its (public) methods to delegate to the wrapped instance. Not nice, but it would work.
3. Extract an interface out of the Repository class in JGit. Name the interface Repository and rename the Repository class to RepositoryImpl. Wrapping such a Repository would be nicer in Gerrit and we could even use a dynamic proxy for that. Actually, a Repository wrapper that supports locking could even be provided by JGit in this case.
4. Enhance the existing Repository class to support read-write (or shared/exclusive if it sounds better) locking.
Solution 2 is the fastest to implement. Solution 3 looks the best to me but would take more time to implement than solution 2. Solution 4 has the disadvantage that it hardcodes the concept of locking into the Repository class. The choice whether to use locking support would be lost.
What do you think?
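To make option 3 concrete, here is a minimal sketch of wrapping an extracted interface with a dynamic proxy, assuming such an interface existed. The interface Repo, the class RepositoryProxySketch, and the onClose callback are hypothetical; JGit's Repository is a concrete class and is not used here. The proxy delegates every call and runs the callback after close(), which is where a read lock could be released.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

final class RepositoryProxySketch {
  /** Hypothetical interface extracted from a concrete repository class. */
  interface Repo {
    String getName();

    void close();
  }

  /** Returns a proxy that delegates every call to target and runs onClose after close(). */
  static Repo wrap(Repo target, Runnable onClose) {
    InvocationHandler handler = (proxy, method, args) -> {
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        throw e.getCause(); // rethrow the delegate's own exception, not the reflection wrapper
      } finally {
        if (method.getName().equals("close")) {
          onClose.run(); // e.g. release a read lock held since the repository was opened
        }
      }
    };
    return (Repo) Proxy.newProxyInstance(
        Repo.class.getClassLoader(), new Class<?>[] {Repo.class}, handler);
  }
}
```

Because the callback sits in a finally block, it runs even when the delegate's close() throws.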
Jul 20, 2010
Repository went through a major refactoring in JGit recently. If you want to poke at wrapping it like in solutions 2, 3, or 4 you really need to start from the tip of my refactoring series [1].
Given the work we have done it may be possible to extend Repository to provide locking and wrap another Repository within. I guess you are talking about having the read lock be taken when the open count is incremented, and released when the open count is decremented? Gerrit should be correctly matching gets from the cache (open count++) with closes (open count--).
I wonder if it's just easier to redo that code in Gerrit. We get a Repository from the GitRepositoryManager class, but we don't return it through there.
  $ git grep openRepository | wc -l
  27
There are currently only 27 open call sites. It should be easy enough to go through them and change repo.close() to instead be repositoryManager.closeRepository(repo) and put all of the locking inside of the GitRepositoryManager.
[1] http://egit.eclipse.org/r/1141
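A rough sketch of what putting the locking inside the manager could look like, under several assumptions: the class LockingRepositoryManagerSketch is hypothetical, a plain String stands in for the Repository handle, and closeRepository is called on the same thread that opened (java.util.concurrent's ReentrantReadWriteLock requires the releasing thread to be the one holding the read lock). The rename here uses a non-blocking tryLock to keep the example short; waiting for the write lock, as suggested earlier in the thread, would call lock() instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical manager that pairs every open with a close and guards renames. */
final class LockingRepositoryManagerSketch {
  // One fair read-write lock per project name.
  private final ConcurrentMap<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String name) {
    return locks.computeIfAbsent(name, n -> new ReentrantReadWriteLock(true));
  }

  /** Takes the shared lock (open count++). Real code would return a Repository. */
  String openRepository(String name) {
    lockFor(name).readLock().lock();
    return name;
  }

  /** Releases the shared lock (open count--). Must run on the thread that opened. */
  void closeRepository(String name) {
    lockFor(name).readLock().unlock();
  }

  /** Runs moveOnDisk under the exclusive lock, or returns false if the repository is in use. */
  boolean tryRename(String oldName, Runnable moveOnDisk) {
    ReentrantReadWriteLock.WriteLock writeLock = lockFor(oldName).writeLock();
    if (!writeLock.tryLock()) {
      return false; // some caller still has the repository open
    }
    try {
      moveOnDisk.run();
      return true;
    } finally {
      writeLock.unlock();
    }
  }
}
```

The scheme works only if every openRepository call is matched by a closeRepository call, which is why the open call sites need to be audited first.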
Aug 17, 2010
A change that introduces a synthetic key as the primary key for projects was pushed for review: https://review.source.android.com/16512
Having a synthetic key (project id) as the primary key for projects, instead of the project name, allows renaming projects without the need to search the entire database for dependent data records and update them. This change should make the implementation of project rename support in the WebUI much easier.
Owner:
edwin.kempin
Aug 25, 2010
I'm now focussing on the implementation of the read-write lock. As proposed by Shawn, I would like to handle the locking within GitRepositoryManager. As already pointed out, proper unlocking then requires that closing the repository goes through the repository manager instead of closing the repository directly.
While replacing the calls 'repo.close()' with 'repositoryManager.closeRepository(repo)' I noticed that there are a few calls of 'openRepository(name)' after which the returned repository is not closed. I wonder if these are bugs or if it is intended. In the following methods a repository is opened and not closed:
1. com.google.gerrit.httpd.gitweb.GitWebServlet#service(HttpServletRequest, HttpServletResponse)
2. com.google.gerrit.server.git.LocalDiskRepositoryManager#getProjectDescription(String)
From the javadoc of the 'createRepository(name)' method of the repository manager I understand that in this case, too, the caller must ensure that the repository is properly closed. However, at the only place where this method is used the returned repository is not closed; see com.google.gerrit.sshd.commands.CreateProject#start(Environment).
I also noticed at least one place where the repository is not properly closed in case of an exception; see com.google.gerrit.server.git.LocalDiskRepositoryManager#setProjectDescription(String, String) in case of an IOException while locking the LockFile or writing to it.
First of all, do we have bugs at the places pointed out above? If yes, this makes me wonder how safe it would be to rely on the caller's discipline to properly close the repository for the unlocking.
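One way to reduce the dependence on caller discipline is to route every use of a repository through a helper that closes it on all paths, including exceptions. The sketch below assumes a hypothetical FakeRepository with an open count and a hypothetical RepoAction callback; neither exists in Gerrit or JGit. It illustrates the try/finally shape that the leaking call sites listed above are missing.

```java
import java.io.IOException;

final class CloseOnAllPathsSketch {
  /** Hypothetical stand-in for a repository handle with an open count. */
  static final class FakeRepository {
    int openCount = 1; // pretend the repository was just opened

    void close() {
      openCount--;
    }
  }

  /** Hypothetical callback type for work done on an open repository. */
  interface RepoAction {
    void run(FakeRepository repo) throws IOException;
  }

  /** Runs the action and closes the repository whether or not the action throws. */
  static void withRepository(FakeRepository repo, RepoAction action) throws IOException {
    try {
      action.run(repo);
    } finally {
      repo.close(); // reached on normal return and on IOException alike
    }
  }
}
```

With a helper like this, a failure while locking or writing a file inside the action can no longer skip the close.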
Aug 26, 2010
I think you are right, those are all bugs where the close is missing, but should be present. :-(
Aug 26, 2010
I've pushed a change for review that takes care to close the opened repositories in the cases explained above: https://review.source.android.com/16821
Sep 14, 2010
I've pushed several additional changes [1], [2], [3] to complete the rename functionality. After they've gone through the review and if they get accepted I would consider this issue as finished. However I see some left-overs:
- There should be an e-mail notification on rename (if a project gets renamed that's surely something about which the stakeholders of this project would like to be informed).
- It would be nice to have an ssh command to rename a project.
Do you see anything else that is missing? If nobody objects I would then create new issues for these left-overs.
[1] https://review.source.android.com/17167
[2] https://review.source.android.com/17168
[3] https://review.source.android.com/17169
Sep 17, 2010
Would it make sense to have some kind of admin log in which admin operations such as project rename, project deletion etc. are recorded? If we had such a log, we could answer questions like who did the rename, when it was done, what the old name was, etc. Any thoughts about this?
Sep 23, 2010
I think an audit log would be very valuable in the situation the group of admins at my $DAYJOB is in. Also, for such a feature, please consider changes to rights, as well as the possibility to manually add an entry to the audit log, as a kind of captain's log if you will. Perhaps such a suggestion should have its own ID in the issue list though.
Sep 24, 2010
I'll +1 the audit/admin log idea. Someone was just asking me about that yesterday as a way to learn/train new admins.
Sep 28, 2010
As part of the review in Gerrit, Fredrik pointed out that the project rename
must also take care of replication (see summary comments in change [1]).
I think this is a very important point that has to be addressed.
Since until now I have no experience with the replication in Gerrit I would
like to clarify some details before starting the implementation.
As I understand it we need to do 2 things for the replication when a project
is renamed:
1. adapt the URL for this project in the 'replication.config' file if it is
contained there
2. trigger the project rename on the slave servers
Is this correct, or do we need to do something more/else?
Regarding 1. I have the following questions:
- From the documentation of the 'remote.<name>.url' parameter in the
'replication.config' file [2] I understand that one would normally make use
of the magic placeholder '${name}' to specify that all projects should be
replicated. I guess in this case we do not need to change anything if a
project gets renamed, do we?
- Is it possible to explicitly list certain projects for replication in
'replication.config'? If yes, we need to adapt the url in this file if such
a project gets renamed.
- From a discussion on the mailing list (see reply from Shawn on 30 Aug.,
16:55 [3]) I understand that the 'replication.config' file is currently only
once loaded on server start and changes in this file would have no effect
during the runtime? Is this still the case or was there already something
done about it?
Regarding 2. I have the following questions:
- To do this I would add a new method 'replicateProjectRename' to
com.google.gerrit.git.ReplicationQueue, similar to the existing
'replicateNewProject' method. Would this be the correct place?
- To replicate a new project, the Gerrit master server sends the project
creation commands over SSH to the slave servers [4], so I guess for
replicating the project rename we would need to have an SSH command for
project rename that can be invoked on the Gerrit slave servers by the Gerrit
master server.
- If we need an SSH command for project rename to do the replication, what would
be the best way to implement this SSH command? Currently the rename logic is
implemented in gerrit-httpd [5], whereas all existing SSH commands are
defined in gerrit-sshd. Since there is no dependency from gerrit-sshd to
gerrit-httpd I wonder how the SSH command for renaming a project can reuse
the existing rename logic [5]? As I see currently there is no SSH command
that uses code that can also be triggered from the WebUI. Where would be the
correct place for the rename logic so that it can be used both from WebUI and
SSH command?
- Can it be detected whether a project exists on a slave server in order to
decide whether a rename must be triggered for this slave server? What would be
the best way to do this?
- What should be done if renaming of a project on a slave server fails? Is it
sufficient to generate an error in error_log?
- What if a slave server is not available when a project is renamed? From
Shawn's reply on 3 Sep. 16:45 from the mailing list discussion [3] I
understand that there was the idea of having an event list so that
replication can still be done when the slave server is back. Is this just an
idea or was something already implemented?
Sorry for the amount of questions. Any help would be appreciated.
[1] https://review.source.android.com/17167
[2] https://review.source.android.com/Documentation/config-replication.html#replication_config
[3] http://groups.google.com/group/repo-discuss/browse_thread/thread/945f313be4f26167
[4] https://review.source.android.com/#patch,sidebyside,11109,2,src/main/java/com/google/gerrit/git/PushReplication.java
[5] https://review.source.android.com/#patch,sidebyside,17168,2,gerrit-httpd/src/main/java/com/google/gerrit/httpd/rpc/project/RenameProject.java
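On the '${name}' placeholder question above: a small sketch of why a URL template with the placeholder needs no edit in 'replication.config' when a project is renamed. The class ReplicationUrlSketch, the method expand, and the example host and path are hypothetical; the sketch assumes the placeholder is expanded by plain text substitution of the project name.

```java
final class ReplicationUrlSketch {
  /** Expands the '${name}' placeholder in a URL template with the given project name. */
  static String expand(String urlTemplate, String projectName) {
    // String.replace substitutes the literal text "${name}"; no regex is involved.
    return urlTemplate.replace("${name}", projectName);
  }

  public static void main(String[] args) {
    // Hypothetical template; the host and path are made up for illustration.
    String template = "gerrit2@mirror.example.com:/pub/git/${name}.git";
    System.out.println(expand(template, "tools/old-name"));
    System.out.println(expand(template, "tools/new-name"));
  }
}
```

The template is the same before and after the rename; only the expanded target changes, so what still needs handling is the repository that already exists at the old expanded location on each target.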
Sep 28, 2010
Hi Edwin!
Good move to bring this into the issue instead of the review comments. My bad. :)
1. I think only the version with a variable in it is available. I've never seen any replication line without it. It's meant to replicate all projects, not single projects. If you would like to restrict the number of projects to replicate, you should use the authgroup (or similar) setting instead, and then restrict access to projects / branches for that group. So you can design with only ${name} in mind.
2. It's a well thought out process, but it has minor flaws. There are several people who run replication to locations where no Gerrit slave is running. Therefore, it's not a good idea to leave the rename operation on a slave to a slave server; this should be done by the master.
Shawn: "[...] Let the master init the slave when a project
is created. But the init process may need to go through a different
URL than replication normally follows."
I interpret that as meaning that the master will have to take full care of the replication part: creation/init of gits, replication of gits, rename of gits, and deletion of gits.
Also, as in the thread I pointed to, you really should honor the adminUrl [1] setting as well. It is used explicitly for shell commands over an ssh link and is coming in a change shortly.
I believe this puts a new angle on many of your assumptions above, so I'll leave some of them uncommented.
It is however, as you suggest, wise to check on all destinations (even the local ones) that you're not overwriting any previously existing git, and to handle that situation somehow. Perhaps make use of the attic also meant for project deletion?
If renaming fails, just as with init and deletion, it must be remembered somehow. I start to feel that maybe it's about time that you and the delete project and create project developers agree on something. Perhaps time for a new thread in the list about this? I THINK there isn't already an event list implemented, but I'm so far not involved in any coding in Gerrit, so better ask on the list than trust me on that one.
[1] http://groups.google.com/group/repo-discuss/browse_thread/thread/945f313be4f26167#aec4ec98657c2879
Hope this helped a little,
Fredrik
Oct 5, 2010
I have now experimented a bit with the replication and things got much clearer. Somehow I had assumed that Gerrit slave servers run on the replication target systems and that all data (including database content) would be replicated to them. Now I understand that there are no slave Gerrit servers and that we replicate only the Git repositories. This of course makes things much easier.
I've now included a first handling for replicating project renames in change 17168 [1]. The only thing that worries me is that if the replication of a project rename fails, it will only be logged, although this has quite severe consequences (e.g. no further changes will be replicated for the renamed project, since it does not exist in the replication target).
[1] https://review.source.android.com/17168
Dec 8, 2010
I'd like to request that renaming projects also be accessible through the CLI (ssh -p 29418 host gerrit).
Dec 9, 2010
Yes, once the rename functionality is through the review, I will implement an SSH command for renaming projects.
Aug 31, 2011
I'm currently not working on this. The project rename must be reimplemented on top of the git_store changes which were done in Gerrit 2.2.1.
Owner:
---
Jan 14, 2013
Any status on this? Not being able to delete or rename projects is still really annoying (as of 2.5.1).
Jan 14, 2013
@24 FYI: as of v2.5, you can delete projects through plugins. See the Gerrit docs and also bug 349 for more info
Jan 14, 2013
That's good, but it won't save historical info, like change approvals, comments, etc. Will it?
Jan 14, 2013
I haven't tried it, but a dirty way might be this: Clone the repo you want to rename, create an empty repo (no initial commit) with the name you want, change the origin url in the cloned repo to the new repo, and push directly to the repo instead of pushing to review. Delete the first repo afterwards. Make sure you have no changes under review.
Jan 14, 2013
@26 - No, you will lose all historical info when you delete a project. Methods like that described in @27 might work once we move all project info from the database to git, but we aren't there yet. For now, to rename a project, the only option is to:
1. Shut down the Gerrit server process
2. Rename the repository on the server's file system
3. Go through the database and update any entries pointing to the old file system location
4. Have all users update their remotes
5. Start the server back up
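Step 2 of the procedure above can be sketched as a single directory move, assuming the server is already stopped so nothing has the repository open. The class OfflineRepositoryMoveSketch, the base path parameter, and the '.git' directory suffix are assumptions made for illustration; the database update in step 3 is deliberately not shown because it depends on the schema.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class OfflineRepositoryMoveSketch {
  /** Moves basePath/oldName.git to basePath/newName.git. Assumes the server is stopped. */
  static Path moveRepository(Path basePath, String oldName, String newName) throws IOException {
    Path source = basePath.resolve(oldName + ".git");
    Path target = basePath.resolve(newName + ".git");
    if (Files.exists(target)) {
      // Refuse to overwrite a repository that already uses the new name.
      throw new IOException("target already exists: " + target);
    }
    // The new name may introduce parent directories that do not exist yet.
    Files.createDirectories(target.getParent());
    // ATOMIC_MOVE fails instead of falling back to a copy if the move cannot be done atomically.
    return Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
  }
}
```

The existence check mirrors the earlier concern in this thread about not overwriting a repository that already exists under the new name.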
Mar 14, 2013
This is in progress in https://gerrit-review.googlesource.com/#/c/42247/ and its various followups.
Apr 3, 2015
https://gerrit-review.googlesource.com/#/c/42247/ was abandoned due to too many merge conflicts...
Apr 29, 2015
Using the importer plugin [1] you can copy a project and then delete the old project with the delete-project plugin [2]. Precondition for this is Gerrit 2.11. [1] https://gerrit-review.googlesource.com/Documentation/config-plugins.html#importer [2] https://gerrit-review.googlesource.com/Documentation/config-plugins.html#delete-project
Apr 29, 2015
@31 That sounds pretty inefficient. Or is the importer plugin doing some smart stuff (i.e. not copying all data) if source and destination are the same server?
Apr 29, 2015
> That sounds pretty inefficient
It is inefficient, but it's something that works today.