Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseEncoder.base64() should accept newlines #1252

Closed
gissuebot opened this issue Oct 31, 2014 · 6 comments
Closed

BaseEncoder.base64() should accept newlines #1252

gissuebot opened this issue Oct 31, 2014 · 6 comments

Comments

@gissuebot
Copy link

Original issue created by progoth on 2013-01-09 at 07:28 PM


It looks like we encoded some old data with an older version of commons-io Base64 (or perhaps some other encoder) that inserted CR/LFs into the encoded string. This is probably common enough behavior that the base64 decoder in guava should accept that.

@gissuebot
Copy link
Author

Original comment posted by ogregoire on 2013-01-09 at 08:12 PM


What if you used something like this?

BaseEncoding.base64().decode(CharMatcher.WHITESPACE.removeFrom(...));

Another possibility would be to ask to have a Readable that can skip characters as defined in a CharMatcher.

But having a specific case for CR/LF? I kind of dislike it.

@gissuebot
Copy link
Author

Original comment posted by cgdecker on 2013-01-09 at 08:34 PM


(No comment entered for this change.)


Labels: Package-IO, Type-Enhancement

@gissuebot
Copy link
Author

Original comment posted by lowasser@google.com on 2013-01-09 at 10:20 PM


All BaseEncodings ignore characters that they use in their own separators. So base64().withSeparator("\r\n", lineLength) would also ignore those characters in decoding.

Is this reasonable enough?

@gissuebot
Copy link
Author

Original comment posted by progoth on 2013-01-09 at 10:32 PM


I worked around it by input.replaceAll("\s+", ""), but the only reason I suggested this is because a lot of base64 encoding is/was done for MIME, so a lot of base64 encoders limit[ed] line length to 76 characters. I don't mind using a workaround (and I saw the withSeparator but didn't read that about the decoder skipping over, so that might be a decent workaround), but it just seems like this will be a very common case (and the exception was something like "Invalid character:" with the CR printed, so not visible while watching logs - ^M when opening a log in Vim - so a little bit cryptic).

Just a suggestion, feel free to close.

@gissuebot
Copy link
Author

Original comment posted by lowasser@google.com on 2013-01-09 at 11:04 PM


CharMatcher.WHITESPACE.removeFrom(string) is probably the preferred solution if you're not using the same encoder for encoding and decoding.

I'm pretty strongly inclined against adding line wrapping unless specifically requested -- but on the other hand, if you're using the same encoder for encoding and decoding, then any line separators you add will get ignored in decoding, and everyone will be happy.


Status: WorkingAsIntended

@gissuebot
Copy link
Author

Original comment posted by progoth on 2013-01-09 at 11:13 PM


Yeah, my point was that line wrapping shouldn't be in there by default, but that a lot of users will run into this same scenario while consuming various data (commons-codec as of 1.7 still defaults to encoding with line separators and line length of 76) and so will end up always having to take the extra step of sanitizing the input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant