Go-idn is a project that aims to bring IDN to Go and to become feature compatible with libidn. Go-idn is a fully documented implementation of the Stringprep, Punycode and IDNA specifications. Its purpose is to encode and decode internationalized domain names using pure Go code.

The library contains a generic Stringprep implementation. Profiles for Nameprep, iSCSI, SASL and XMPP are included. Punycode and ASCII Compatible Encoding (ACE) via IDNA are supported. A mechanism to define Top-Level Domain (TLD) specific validation tables, and to compare strings against those tables, is included. Default tables for some TLDs are also included.

Installing
Installing go-idn is fairly simple. Just run goinstall go-idn.googlecode.com/hg/src/idna and all the packages will be installed.

What's new?
7. September 2010: The Unicode normalization is now complete. It passes all the test cases in NormalizationTest.txt, including the PRI #29 tests. Next up are code cleanup, more unit tests and performance improvements.

20. April 2010: I'll be very busy doing other things until 15. May. I'm also looking for another set of eyes to bounce ideas off (for the API in particular). Feel free to send me an email at hannson@gmail.com if you're interested in this project and have questions or ideas.

11. April 2010: The Unicode normalization is making good progress. As of this writing the NFKC unit test passes on the first 13916 lines in NormalizationTest.txt (warning: 2.2MB). The implementation of tables.go is still limited to 16-bit runes, which is unfortunate, but once we finish maketables.go it should pass more tests.

Components
Status
The project is roughly 97% complete. It is mostly waiting on unit tests that still need to be written. Nothing has been written for TLD yet (a low-priority, long-term goal), but everything else is mostly working.

Punycode
Punycode is an instance of a general encoding syntax (Bootstring) by which a string of Unicode characters can be transformed uniquely and reversibly into a smaller, restricted character set. Punycode is intended for the encoding of labels in the Internationalized Domain Names in Applications (IDNA) framework, so that these domain names can be represented in the ASCII character set allowed in the Domain Name System of the Internet. The encoding syntax is defined in IETF document RFC 3492.

import "go-idn.googlecode.com/hg/src/punycode"

Package punycode implements the Punycode data encoding used for labels in the IDNA framework, as described in RFC 3492. Punycode is used by the IDNA protocol for converting domain labels into ASCII; it is not designed for any other purpose. It is explicitly not designed for processing arbitrary free text.

Current status
The specification has been implemented 100% and unit tests have been written. The code passes all unit tests except one, which fails only when the comparison is case-sensitive, and should be ready for review.

Stringprep
Stringprep is a framework for preparing Unicode text strings in order to increase the likelihood that string input and string comparison work in ways that make sense for typical users throughout the world. The stringprep protocol is useful for protocol identifier values, company and personal names, internationalized domain names, and other text strings.

import "go-idn.googlecode.com/hg/src/stringprep"

This package contains methods for the preparation of internationalized strings ("stringprep") as described in RFC 3454.

Profiles
The following standard profiles are included.

Current status
All the standard profiles have been implemented. More unit-test cases are needed.

IDNA
The IDNA methodology encodes only select label components of domain names, using procedures known as ToASCII and ToUnicode.

import "go-idn.googlecode.com/hg/src/idna"

This package implements a mechanism called IDNA for handling International Domain Names (IDN) in applications in a standard fashion, as described in RFC 3490.

Current status
The IDNA specification has been implemented 100%. The API is stable because the function names are the standard ones described in RFC 3490 and are therefore unlikely to change. The package depends on punycode (which fails a single case-sensitive unit test), but IDNA is specifically case-insensitive, so that failure should not affect it. It still requires some unit-test cases.

TLD

Related resources
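
To make the Punycode and IDNA components described above more concrete, here is a minimal standalone sketch of how a label can be Punycode-encoded and given the ACE prefix "xn--". It is not go-idn's code or API: the names encodeLabel and toASCII are hypothetical, and the sketch assumes the input is already prepared, so it skips the Nameprep step, the overflow checks and the label-length checks that RFC 3490 and RFC 3492 require.

```go
package main

import (
	"fmt"
	"strings"
)

// Bootstring parameters for Punycode (RFC 3492, section 5).
const (
	base, tMin, tMax       = 36, 1, 26
	skew, damp             = 38, 700
	initialBias, initialN  = 72, 128
)

// digit maps a value 0..35 to a basic code point: a-z, then 0-9.
func digit(d int) byte {
	if d < 26 {
		return byte('a' + d)
	}
	return byte('0' + d - 26)
}

// adapt is the bias adaptation function (RFC 3492, section 6.1).
func adapt(delta, numPoints int, first bool) int {
	if first {
		delta /= damp
	} else {
		delta /= 2
	}
	delta += delta / numPoints
	k := 0
	for delta > ((base-tMin)*tMax)/2 {
		delta /= base - tMin
		k += base
	}
	return k + (base-tMin+1)*delta/(delta+skew)
}

// encodeLabel (hypothetical name) follows the encoding procedure of
// RFC 3492, section 6.3. Assumption: overflow handling is omitted.
func encodeLabel(label string) string {
	input := []rune(label)
	var out []byte
	for _, r := range input { // copy basic code points first
		if r < initialN {
			out = append(out, byte(r))
		}
	}
	b := len(out)
	h := b
	if b > 0 {
		out = append(out, '-')
	}
	n, delta, bias := initialN, 0, initialBias
	for h < len(input) {
		m := int(^uint(0) >> 1) // smallest code point >= n still to encode
		for _, r := range input {
			if int(r) >= n && int(r) < m {
				m = int(r)
			}
		}
		delta += (m - n) * (h + 1)
		n = m
		for _, r := range input {
			if int(r) < n {
				delta++
			}
			if int(r) != n {
				continue
			}
			q := delta // emit delta as a variable-length integer
			for k := base; ; k += base {
				t := k - bias
				if t < tMin {
					t = tMin
				} else if t > tMax {
					t = tMax
				}
				if q < t {
					break
				}
				out = append(out, digit(t+(q-t)%(base-t)))
				q = (q - t) / (base - t)
			}
			out = append(out, digit(q))
			bias = adapt(delta, h+1, h == b)
			delta = 0
			h++
		}
		delta++
		n++
	}
	return string(out)
}

// toASCII (hypothetical name) prefixes each non-ASCII label with "xn--".
// Assumption: unlike RFC 3490 ToASCII, no Nameprep or length checks are done.
func toASCII(domain string) string {
	labels := strings.Split(domain, ".")
	for i, l := range labels {
		if strings.IndexFunc(l, func(r rune) bool { return r >= initialN }) >= 0 {
			labels[i] = "xn--" + encodeLabel(l)
		}
	}
	return strings.Join(labels, ".")
}

func main() {
	fmt.Println(toASCII("bücher.example")) // prints "xn--bcher-kva.example"
}
```

Only labels that contain non-ASCII code points are encoded, which mirrors the statement above that IDNA encodes only select label components of a domain name. A real implementation would run Nameprep on each label first.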