WebSafeBase64
Using base64 in http urls
This was originally based on research for Cryptography for Internet and Database Applications.
Base 64 Standard
Base 64 encoding converts 3 binary bytes into 4 ASCII characters. The base 64 "alphabet" or "digits" are A-Z, a-z, 0-9 plus two extra characters: + (plus) and / (slash). It also defines one extra character, = (equals), for padding when the length of the original input is not a multiple of 3.
Problems with the standard in URLs
Unfortunately, the choice of the extra characters clashes with the HTTP/HTML specifications. For URLs and form posts, the MIME type application/x-www-form-urlencoded is used, and it defines certain transformations:
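To make the clash concrete, here is a minimal Python sketch. It is not part of stringencoders and only uses the Python standard library. The helper name b64 and the sample bytes are made up for illustration. The sketch shows the 3-to-4 expansion and the padding, then shows what happens to a + when a receiver applies form decoding to the value.

```python
import base64
from urllib.parse import unquote_plus


def b64(data: bytes) -> str:
    """Standard base 64 encoding, returned as text (illustrative helper)."""
    return base64.b64encode(data).decode("ascii")


# 3 input bytes become 4 output characters; shorter tails are padded with '='.
print(b64(b"abc"))  # YWJj
print(b64(b"ab"))   # YWI=
print(b64(b"a"))    # YQ==

# These three bytes were picked so the output uses both extra characters.
token = b64(bytes([0xFB, 0xFF, 0xBF]))
print(token)        # +/+/

# Form decoding turns every '+' into a space, so the value that arrives
# is no longer the value that was sent.
received = unquote_plus(token)
print(repr(received))  # ' / /'
```

The / survives this particular decoding step, but it is still a path separator in a URL, which is why both extra characters are usually replaced.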
While using . is safe according to the spec, some proxies and servers may misinterpret it as a filename reference (this is not so common anymore in 2007). Likewise, I've seen some implementations use a tilde ~, but again this is a special character used in user directory mappings and is not safe according to the spec. So to convert the base64 alphabet into a format that is "web-safe", the following changes are recommended:
However, this is not standardized.
stringencoders to the rescue
You could use a standard base64 encoder and then post-process the output to convert the bad characters to ones that are web-safe. However, the stringencoders library has separate functions that do this for you with no loss in performance. The modp_b64_xxx functions are standard base 64, while the modp_b64w_xxx functions (note the "w") are the web-safe versions. You can change which characters are used at compile time. The default is the equivalent of
./configure --with-b64w-chars='-_\*'
Other popular variations are
./configure --with-b64w-chars='-_.'
./configure --with-b64w-chars='_-.'
If you are integrating with a third-party source, be sure to check with them on the alphabet. If you are the only one who creates and processes the URLs, the default is fine.
ALWAYS TEST
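The post-processing approach mentioned above (standard encoder first, then character replacement) can be sketched in a few lines of Python. This is only an illustration and not the stringencoders implementation. The function names websafe_encode and websafe_decode are hypothetical, and the sketch assumes the three configured characters replace +, / and = in that order.

```python
import base64

# The three characters of the standard alphabet that cause trouble in URLs.
STANDARD_EXTRAS = "+/="


def websafe_encode(data: bytes, web_chars: str) -> str:
    """Standard base 64, then swap '+', '/', '=' for web_chars (assumed order)."""
    if len(web_chars) != 3:
        raise ValueError("web_chars must contain exactly 3 characters")
    table = str.maketrans(STANDARD_EXTRAS, web_chars)
    return base64.b64encode(data).decode("ascii").translate(table)


def websafe_decode(text: str, web_chars: str) -> bytes:
    """Swap web_chars back to '+', '/', '=' and decode as standard base 64."""
    if len(web_chars) != 3:
        raise ValueError("web_chars must contain exactly 3 characters")
    table = str.maketrans(web_chars, STANDARD_EXTRAS)
    return base64.b64decode(text.translate(table))


raw = bytes([0xFB, 0xFF])            # standard encoding is "+/8="
print(websafe_encode(raw, "-_."))    # -_8.
print(websafe_encode(raw, "_-."))    # _-8.

# Both sides must agree on the alphabet: decoding with the other
# variation succeeds but silently yields different bytes.
print(websafe_decode("-_8.", "-_.") == raw)  # True
print(websafe_decode("-_8.", "_-.") == raw)  # False
```

The last two lines are the reason for the advice to check the alphabet with any third party: a mismatch does not raise an error, it just produces the wrong data.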