Title git statistics
Student Sverre Rabbelier
Mentor David Symonds
== Abstract ==

Git, 'the stupid content tracker', is actually quite intelligent. It holds a lot of information about the content it tracks, and with the right tools this information can be extracted and presented to the user. Statistics on the tracked content can be very useful, yet git currently lacks almost all forms of statistics gathering.


== Project Goals ==

Consider Ohloh, an external tool that provides commit information about contributors to a project. It provides a quick overview of all contributors to a project and what each has contributed so far. At the moment git does not have anything similar, even though all the data needed for such an analysis is present.
Interesting questions would be 'who is maintaining this code?' and 'how well are they maintaining it?'. Such information is especially useful when deciding whom to send a copy of a patch, or whom to ask to fix a particular bug. More broadly, it might be interesting to determine which part of the code is most actively worked on and which part is most stable. Information on how actively a part of the code is edited could be used to find 'edit wars', in which a part of the code is changed over and over again. This could indicate that that particular part of the code is difficult or error-prone.
Even more generally, code could be analyzed to detect 'bugfixes' based on what they edit and how often that code is edited thereafter. After the analysis is complete, every commit would be marked as 'introducing a bug', 'enhancement' or 'bugfix' on a per-file basis. Based on such information, it is possible to gather statistics about users and their performance. Using these statistics, steps can be taken to remedy whatever is causing a contributor to introduce many bugs, or to compliment a maintainer who fixes many.
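
As a rough illustration of the 'most actively worked on' idea above, the following Python sketch counts how many commits touch each file and flags files above a cutoff as candidates for a closer look. It is not part of the proposed implementation: the function names `commits_per_file` and `hot_files` and the `threshold` parameter are hypothetical, and a plain commit count is only a crude stand-in for detecting an 'edit war'. The input is assumed to be the output of `git log --name-only --pretty=format:`, which lists the paths changed by each commit, one per line.

```python
from collections import Counter


def commits_per_file(log_text):
    """Count how many commits touched each path.

    `log_text` is assumed to be the output of
    `git log --name-only --pretty=format:`, i.e. one path per line
    with blank lines separating commits.
    """
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if path:  # skip the blank separator lines between commits
            counts[path] += 1
    return counts


def hot_files(log_text, threshold=10):
    """Return paths touched by at least `threshold` commits, sorted by name.

    The threshold is an arbitrary assumption; a real tool would probably
    also look at the time window and the authors involved.
    """
    counts = commits_per_file(log_text)
    return sorted(path for path, n in counts.items() if n >= threshold)
```

A file that shows up in `hot_files` is not necessarily problematic; it may simply be a central file. The count is only a starting point for the kind of analysis described above.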

My plan for this summer is to create a 'statistics' feature for git.


== Provided functionality ==

Upon completion of the proposed project, git would be extended with the following functionality:
* Display statistics on how many changes a user has made, for example in the form of 'lines changed' or a 'total diff'.
* Show which contributors have contributed to the part of the code that a patch modifies.
* Find out what part of the code a maintainer is working on the most.
* Analyse what type a commit is ('introducing a bug', 'enhancement' or 'bugfix') and derive further statistics from this classification.
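
The first item in the list above could be prototyped along the following lines. This is a minimal sketch, not the proposed design: it assumes the statistics are derived from `git log --numstat`, which prints added and deleted line counts per file for each commit, and it uses a custom marker line (`AUTHOR_MARK`, a hypothetical name) to tell author lines apart from numstat lines. The helper names `lines_changed_per_author` and `read_git_log` are likewise made up for illustration.

```python
import subprocess
from collections import Counter

# Hypothetical marker put in front of each author name so that author
# lines can be distinguished from numstat lines in the log output.
AUTHOR_MARK = "@@author "


def lines_changed_per_author(log_text):
    """Sum added plus deleted lines per author from `git log --numstat` output."""
    totals = Counter()
    author = None
    for line in log_text.splitlines():
        if line.startswith(AUTHOR_MARK):
            author = line[len(AUTHOR_MARK):]
            continue
        parts = line.split("\t")
        if author is None or len(parts) != 3:
            continue  # blank line or something that is not a numstat entry
        added, deleted, _path = parts
        if added == "-" or deleted == "-":
            continue  # numstat reports "-" for binary files
        totals[author] += int(added) + int(deleted)
    return totals


def read_git_log(repo="."):
    """Run git in `repo` and return the log text in the format parsed above."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--numstat",
         "--format=" + AUTHOR_MARK + "%an"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling `lines_changed_per_author(read_git_log())` inside a repository would give a rough 'lines changed' figure per author name. Counting raw lines treats a whitespace cleanup the same as a careful fix, so such numbers should be read as activity, not as quality.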