Issue 1451: GitHub commit URLs got mangled by the hightlighter in the cover message
Status:  AwaitingInformation
Owner: ----
Reported by marcin.c...@gmail.com, Jun 20, 2012
************************************************************
***** NOTE: THIS BUG TRACKER IS FOR GERRIT CODE REVIEW *****
***** DO NOT SUBMIT BUGS FOR CHROME, ANDROID, INTERNAL *****
***** ISSUES WITH YOUR COMPANY'S GERRIT SETUP, ETC.    *****
***** THOSE ISSUE BELONG IN DIFFERENT ISSUE TRACKERS!  *****
************************************************************

Affected Version:

Gerrit Code Review (2.3)

What steps will reproduce the problem?
1. Go to a patchset
2. Add a review
3. Put the following URL: https://github.com/pediapress/mwlib/commit/4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c#docs/collection.rst in the conver message
4. Submit

What is the expected output? What do you see instead?

I would expect 

<a href="https://github.com/pediapress/mwlib/commit/4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c#docs/collection.rst">https://github.com/pediapress/mwlib/commit/4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c#docs/collection.rst</a> 

to be generated.

Instead we get

<p>#docs/collection.rst"&gt;https://github.com/pediapress/mwlib/commit/#docs/collection.rst<a href="#q,4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c,n,z">4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c</a><a href="https://github.com/pediapress/mwlib/commit/&lt;a href=" #q,4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c,n,z"="">4406d53b4c7dcd6cba8b7e5cf7076d08bf754f5c</a></p>

Please provide any additional information below.

See https://gerrit.wikimedia.org/r/#/c/12157/2/ - comment to PatchSet 2 from "saper" starting with "Committed upstream as"
Jun 20, 2012
#1 marcin.c...@gmail.com
Cannot reproduce this on 

2.4.1-408-gec689b4
Jun 20, 2012
#2 sop@google.com
This is probably caused by the gerrit.config file having a commentlink that tries to add a link to SHA-1s. Gerrit doesn't do this normally out of the box, and the specific regex used in that commentlink is allowing it to pull the SHA-1 out of the URL. It might need word break boundaries around the pattern, like \b[0-9a-f]{40}\b to avoid it from doing this replace.
Status: WontFix
Jun 20, 2012
#3 marcin.c...@gmail.com
Yes, I have added this to etc/gerrit.config:

[commentlink "commit"]
         match = "\\b([0-9a-f]{40})\\b"
         link = "#q,$1,n,z"

and now I can replicate the problem as specified. Probably \\b includes also / so that particual URL is mangled.

I agree the RE is at fault, but maybe we should not apply commitlinks to links (or apply them before URL links are generated?)
Jun 20, 2012
Project Member #4 bklarson@gmail.com
imho, the solution is to treat links as commentlinks, and then fix commentlinks so that it only parses sections of the message which haven't already been matched/replaced (perhaps we already do that second part?)
Jun 20, 2012
#5 sop@google.com
interesting point bklarson, if we stopped using linkify and ran the commentlink and only parsed through the message once, it wouldn't do this.
Status: AwaitingInformation
Jun 20, 2012
#6 marcin.c...@gmail.com
Yeah, it seems to me that both linkification and commentlinks should be done in SafeHtml derivative on append() only. Doing replaceAll on safe HTML is ... well ... unsafe.
Jun 21, 2012
#7 jeremyb....@gmail.com
Also there should be a way to set the order in which commentlinks are evaluated.
Sep 4, 2012
Project Member #8 jaysoff...@gmail.com
 Issue 1545  has been merged into this issue.
Sep 4, 2012
Project Member #9 jaysoff...@gmail.com
FWIW, I tried to work-around this issue using a negative look-behind assertion of "(?<!http)" before realizing that sadly Javascript doesn't support look-behind assertions.
Jun 24, 2014
#11 oswald.b...@gmx.de
why is this AwaitingInformation when the issue is rather clearly identified and the solution more or less obvious?