Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Canoniclizing out of EncodeforLdap or EncodeForDN if contains specific characters like "(, ) #" etc. messes up the input. #293

Closed
meg23 opened this issue Nov 13, 2014 · 15 comments

Comments

@meg23
Copy link

meg23 commented Nov 13, 2014

From shilpi.a...@gmail.com on September 28, 2012 07:16:09

What steps will reproduce the problem? I can do EncodeForLdap and EncodeForDN however the output when passed through canonicalize, is garbage.

Try using test string - "Hi (This) ="
Here "(" is converted to \28 on encoding. On decoding it gets converted to Character 2 which is stx i.e. nothing. What is the expected output? What do you see instead? Input string before encoding should be returned What version of the product are you using? On what operating system? 2.0.1 Does this issue affect only a specified browser or set of browsers? all browsers Please provide any additional information below. I can do EncodeForLdap and EncodeForDN however the output when passed through canonicalize, is garbage.

Try using test string - "Hi (This) ="
Here "(" is converted to \28 on encoding. On decoding it gets converted to Character 2 which is stx i.e. nothing.

Please help.

Original issue: http://code.google.com/p/owasp-esapi-java/issues/detail?id=287

@meg23
Copy link
Author

meg23 commented Nov 13, 2014

From shilpi.a...@gmail.com on October 29, 2012 22:55:50

Same is happening for encodeForCSS -

Input String - !@$%()=+{}[]
Encoded String - \21 \40 \24 \25 \28 \29 \3d \2b \7b \7d \5b \5d
When used Canonicalize - 8 9 d b b d b d
This clearly is an error.

Is there any way we can know when this issue will be resolved.

Shilpi

@meg23
Copy link
Author

meg23 commented Nov 13, 2014

From shilpi.a...@gmail.com on September 08, 2013 23:25:42

Hi,

Is there any update on this issue?

Shilpi

@xeno6696
Copy link
Collaborator

I think this is actually a bug. LDAP filtering syntax requires only unmatched parentheses to be escaped. I couldn't find an RFC, but this link from Microsoft has this to say:

The LDAP filter specification assigns special meaning to the following characters:

`* ( ) \ NUL

@xeno6696
Copy link
Collaborator

@kwwall This bug is legitimate, I will take this one.

@xeno6696 xeno6696 self-assigned this Jan 28, 2016
@kwwall
Copy link
Contributor

kwwall commented Jan 29, 2016

@xeno6696 Go for it. I was thinking perhaps that '|', '&', and '!' would also need escaping for LDAP, but since it uses prefix notation for everything [e.g., "(&(objectCategory=person)(objectClass=contact)(|(sn=Wall)(sn=Seil)))"] and every sub-expression has to be fully parenthesized, I guess in hindsight, you wouldn't have to do to that extreme. If those characters were in the middle of an expression, they would just be interpreted literally.

Also, were you planning on fixing the CSS encoding as well? If not, we should probably split this issue into two issues so that we can at least close the LDAP part of it.

@xeno6696
Copy link
Collaborator

No... just the LDAP issue.

Are we open to the idea of using parsers? Some of this stuff makes more sense (to me) to use something like JavaCC to construct an LDAP parser as opposed to the char array searches we seem to be using.

@planetlevel
Copy link

Not sure I’m following here — we’re canonicalizing to make sure that encoded attacks can’t sneak through validation to reach an interpreter. We should decode for all possible downstream interpreters (this is hard). And sometimes that means that we will have encoding scheme collisions that prevent the use of certain constructs in the data.

I don’t know what you have in mind for the parser, but it seems dangerous to me as we wouldn’t be able to match the behavior of all the parsers in all the implementations of these interpreters. We can’t rely on the BNF because many parsers are written by crazy people, particularly the ones in browsers. But these parsers go to great lengths to accommodate malformed data that doesn’t match any spec.

--Jeff

From: Matt Seil <notifications@github.commailto:notifications@github.com>
Reply-To: ESAPI/esapi-java-legacy <reply@reply.github.commailto:reply@reply.github.com>
Date: Friday, January 29, 2016 at 8:09 AM
To: ESAPI/esapi-java-legacy <esapi-java-legacy@noreply.github.commailto:esapi-java-legacy@noreply.github.com>
Subject: Re: [esapi-java-legacy] Canoniclizing out of EncodeforLdap or EncodeForDN if contains specific characters like "(, ) #" etc. messes up the input. (#293)

No... just the LDAP issue.

Are we open to the idea of using parsers? Some of this stuff makes more sense (to me) to use something like JavaCC to construct an LDAP parser as opposed to the char array searches we seem to be using.


Reply to this email directly or view it on GitHubhttps://github.com//issues/293#issuecomment-176746302.

@kwwall
Copy link
Contributor

kwwall commented Feb 3, 2016

We can’t rely on the BNF because many parsers are written by crazy people

@planetlevel I've heard rumors that ESAPI was also written by a bunch of crazy people. :)

@xeno6696
Copy link
Collaborator

So @planetlevel and @kwwall what's the final direction on this one?

@kwwall
Copy link
Contributor

kwwall commented Jun 29, 2016

Sorry, this got lost in all my other emails.
Using LDAP parsers is okay, but one thing I want to refrain from doing is
dragging in any more 3rd party libraries. ESAPI already has way too many
and in the past it has kept some from using it. You can handle LDAP
authentication using JNDI which doesn't require any additional jars so
there must be the ability to do LDAP parsing of DNs. (Alternately, if you
stick only to parsing DNs, that's probably something that is not too
difficult to write; RFC 2253 defines DN's simple enough that you can
probably
hand-roll a parser in less than a weekend.)

In theory you probably could write a regex to recognize valid
LDAPv3 DNs (BTW, LDAPv2 DNs are defined by RFC 1779), but it
likely would be complicated enough that we would have ReDoS
issues with it, so that approach is probably best avoided.

Overall though, I'm not sure this is really an issue. I mean,
',' certainly is special to DNs, so that you would expect if
an unquoted comma to be treated special during canonicalization.
According to the LDAP RFCs, if (say) you want to use a comma
as part of a DN, then it must be surrounded with a double-quote
character. Same with other special characters. See the RFCs for
details.

Again, sorry so late for providing feedback.

​kevin​

@xeno6696
Copy link
Collaborator

@kwwall I thought I saw a new pull request fixing this 1-2months back. Shall we close?

@kwwall
Copy link
Contributor

kwwall commented Jun 18, 2017 via email

@kwwall
Copy link
Contributor

kwwall commented Jun 18, 2017 via email

@xeno6696
Copy link
Collaborator

This one isn't the same as #389

@xeno6696
Copy link
Collaborator

For all I wrote, this isn't actually a bug. Rereading the OP:

What steps will reproduce the problem? I can do EncodeForLdap and EncodeForDN however the output when passed through canonicalize, is garbage.

Had I paid a bit more attention here, his complaint is that he'll encode it, and then it gets transformed later when canonicalize scans it and identifies it as a JavaScript character.

In short, he's not using the encoder correctly here. He should be encoding input just before he hands it off to the LDAP interpreter. He should be storing the encoded value or otherwise using it.

This isn't a bug, I am closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants