READ-ONLY: This project has been archived. For more information see this post.
Issue 159: StringBuf.addChar assumes UTF8
Status:  Fixed
Owner:  ----
Closed:  Apr 2012


 
Reported by ncanna...@gmail.com, Mar 11, 2012
StringBuf.addChar(0x80) will encode the char into UTF8 because it uses String.fromCharCode.

I'm not sure String.fromCharCode should encode to UTF8 when the native String class does not enforce UTF8. At least that's not the case for PHP/Neko.

We could add haxe.Utf8.fromCharCode that does that.
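
To make the difference concrete, here is a minimal sketch in Python rather than Haxe. The function names are hypothetical and only model the two behaviours discussed above; they are not part of any Haxe API.

```python
def utf8_encode_char(code: int) -> bytes:
    """Model of a UTF-8-assuming addChar: treat the value as a code point
    and emit its UTF-8 encoding (hypothetical helper, for illustration)."""
    return chr(code).encode("utf-8")


def raw_encode_char(code: int) -> bytes:
    """Model of a raw-byte addChar: append the value as one byte.
    bytes([code]) raises ValueError for values outside 0..255."""
    return bytes([code])


# 0x80 becomes two bytes under the UTF-8 assumption, one byte when raw.
print(utf8_encode_char(0x80).hex())  # c280
print(raw_encode_char(0x80).hex())   # 80
```

For values below 0x80 the two approaches agree, which is presumably why the difference only shows up with bytes in the 0x80-0xFF range.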
Mar 15, 2012
Project Member #1 gameh...@gmail.com
I guess there is some confusion between "character" and the good old C "char".
I will just conform cpp to neko behaviour.

The other confusion is when you go from utf8 to "String of bytes, each representing a character in the 0-255 range" since it does not allow for characters > 255.  It should really go to some wide-string type.

Hugh
Mar 21, 2012
#2 ncanna...@gmail.com
Well, as for String-to-bytes, it feels normal to use the actual binary representation of the string (raw on Neko/CPP/PHP, UTF8 on JS/Flash). At least that's what haxe.io.Bytes.ofString does.
Mar 22, 2012
Project Member #3 gameh...@gmail.com
It is ok if you think of C "char *" - but what about characters over 255?  Users have to know that String is a string of "signed char", not arbitrary characters, so utf8 -> "unsigned char" is a lossy operation.
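
A rough Python sketch of why that conversion is lossy. The helper name and the "?" replacement are assumptions made for illustration; the thread does not say what any target actually does with characters above 255.

```python
from typing import List, Tuple


def utf8_to_byte_string(data: bytes) -> Tuple[bytes, List[int]]:
    """Decode UTF-8, then store one byte per character.

    Characters above 255 cannot fit in a byte; this sketch replaces them
    with '?' and reports the lost code points (an illustrative choice).
    """
    out = bytearray()
    lost = []
    for ch in data.decode("utf-8"):
        cp = ord(ch)
        if cp <= 0xFF:
            out.append(cp)
        else:
            out.append(ord("?"))
            lost.append(cp)
    return bytes(out), lost


# U+00E9 fits in one byte; U+20AC (the euro sign) does not.
converted, lost = utf8_to_byte_string("\u00e9\u20ac".encode("utf-8"))
print(converted, [hex(cp) for cp in lost])
```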

Apr 5, 2012
Project Member #4 gameh...@gmail.com
I have conformed this to the neko api, so it should be good for building up strings from bytes without assuming utf8.
Status: Fixed
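
As an illustration of what "building up strings from bytes without assuming utf8" means in practice, here is a small Python sketch. The class is a hypothetical stand-in for a raw-byte string buffer, not the Haxe StringBuf, and the second function models the pre-fix behaviour described in the original report.

```python
class RawStringBuf:
    """Hypothetical raw-byte buffer: add_char appends exactly one byte."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def add_char(self, code: int) -> None:
        # bytearray.append raises ValueError outside 0..255.
        self._buf.append(code)

    def to_bytes(self) -> bytes:
        return bytes(self._buf)


def build_assuming_utf8(codes) -> bytes:
    """Model of the reported behaviour: each value is treated as a code
    point and UTF-8 encoded, so already-encoded bytes get encoded again."""
    return "".join(chr(c) for c in codes).encode("utf-8")


# The UTF-8 bytes of U+00E9 are C3 A9. Feeding them byte by byte:
raw = RawStringBuf()
for b in (0xC3, 0xA9):
    raw.add_char(b)

print(raw.to_bytes().hex())                      # c3a9 (bytes preserved)
print(build_assuming_utf8((0xC3, 0xA9)).hex())   # c383c2a9 (double-encoded)
```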
