Inconsistent UTF-8 Behavior in Dartium #14948

Closed
DartBot opened this issue Nov 8, 2013 · 11 comments

Labels
type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@DartBot

DartBot commented Nov 8, 2013

This issue was originally filed by chris.ee...@gmail.com


What steps will reproduce the problem?

  1. Create a main.dart file encoded in UTF-8 that prints a Unicode character
  2. Run from the command line
  3. Run from Dartium

I expect the output to be identical, but Dartium seems to be treating the UTF-8 code units as UTF-16 code points.

Seen with: Dart VM version: 0.8.10.6_r30036 (Thu Nov 7 01:23:45 2013) on "linux_x64"

See the attached test file, which is UTF-8 encoded.


Attachment:
string_test.dart (204 Bytes)

@floitschG
Contributor

Added Area-Dartium, Triaged labels.

@floitschG
Contributor

@DartBot
Author

DartBot commented Nov 8, 2013

This comment was originally written by googlegroups...@kaioa.com


"âŒ˜" is actually the result of reading a UTF-8 "⌘" as ISO-8859-1.
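
The misreading described here can be reproduced outside the browser. The following is a minimal Python sketch, not taken from the original report: it encodes ⌘ (U+2318) as UTF-8 and decodes the same three bytes under different charsets. The `cp1252` line rests on the assumption that the browser treats an ISO-8859-1 label as windows-1252, which is what makes the garbage printable.

```python
# U+2318 is the "place of interest" sign; in UTF-8 it occupies three bytes.
utf8_bytes = "\u2318".encode("utf-8")
print(" ".join(f"{b:02x}" for b in utf8_bytes))

# Correct decoding: the three bytes become a single character.
as_utf8 = utf8_bytes.decode("utf-8")

# Strict ISO-8859-1 maps every byte to the code point of the same value,
# so 0x8C and 0x98 turn into invisible C1 control characters.
as_latin1 = utf8_bytes.decode("iso-8859-1")

# windows-1252 (assumed browser behavior for an ISO-8859-1 label)
# maps 0x8C and 0x98 to printable characters instead.
as_cp1252 = utf8_bytes.decode("cp1252")

# Print code points rather than glyphs so the output does not itself
# depend on the terminal's encoding.
for name, text in [("utf-8", as_utf8),
                   ("iso-8859-1", as_latin1),
                   ("cp1252", as_cp1252)]:
    print(name, len(text), [hex(ord(c)) for c in text])
```

In each wrong decoding the string has three characters instead of one, which matches the symptom of every UTF-8 byte being treated as its own character.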

What did the response headers for that Dart file look like? (You can easily get those with curl -I.)

Personally, I think it would be a good idea for it to default to UTF-8. Python does this too [1], for example.

[1] http://www.python.org/dev/peps/pep-3120/

@DartBot
Author

DartBot commented Nov 8, 2013

This comment was originally written by chri...@gmail.com


Actually, I opened the HTML with a file:// URL.

I see the same behavior if I serve it from pub serve, in which case there are no charset headers:

$ curl -I http://localhost:8080/string_test.html
HTTP/1.1 200 OK
server: Dart/0.8 (dart:io)
transfer-encoding: chunked

Even if I serve it up from a Python simple HTTP server, there are no charset headers and I see the same behavior.
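
To make "no charset headers" concrete, here is a small Python sketch that checks a block of response headers for a declared charset. The helper name `declared_charset` and the second, charset-bearing header block are hypothetical and only there for comparison; the first block repeats the header lines from the curl output above.

```python
from email import message_from_string


def declared_charset(raw_headers: str):
    """Return the charset named in Content-Type, or None if there is none."""
    return message_from_string(raw_headers).get_content_charset()


# Header lines as quoted in the curl output (status line omitted).
pub_serve_headers = (
    "server: Dart/0.8 (dart:io)\n"
    "transfer-encoding: chunked\n"
    "\n"
)

# Hypothetical response that does declare a charset, for comparison.
explicit_headers = (
    "content-type: text/html; charset=UTF-8\n"
    "\n"
)

print(declared_charset(pub_serve_headers))
print(declared_charset(explicit_headers))
```

When the first case applies, the client has to guess the encoding from something other than the transport headers.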

BUT... if I explicitly set the charset to UTF-8 in an HTML <meta> (see attached), then the output is the expected glyph (and if I set it to UTF-16, it also behaves as desired). And yes, if I change the <meta> tag to ISO-8859-1, then I see the same incorrect behavior.
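
The effect of the `<meta>` declaration can be sketched in Python as follows. This is an illustration under stated assumptions, not Dartium's actual logic: the class `MetaCharsetFinder`, the function `decode_page`, the sample markup, and the `cp1252` fallback are all hypothetical, and only the `<meta charset=...>` form is handled.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Record the first <meta charset=...> declaration encountered."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.charset is None:
            self.charset = dict(attrs).get("charset")


def decode_page(raw: bytes, fallback: str = "cp1252") -> str:
    # Pre-scan the bytes as ASCII-compatible text just to find the
    # declaration, then decode the whole page with the chosen charset.
    finder = MetaCharsetFinder()
    finder.feed(raw.decode("ascii", errors="replace"))
    return raw.decode(finder.charset or fallback)


# Hypothetical page body containing U+2318, stored as UTF-8 bytes.
body = "<p>\u2318</p>".encode("utf-8")
with_meta = b'<meta charset="utf-8">' + body
without_meta = body

# ascii() keeps the printed output independent of terminal encoding.
print(ascii(decode_page(with_meta)))
print(ascii(decode_page(without_meta)))
```

With the declaration present the character survives as one code point; without it the fallback decoding produces three unrelated characters from the same bytes.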


Attachment:
string_test.html (204 Bytes)

@vsmenon
Member

vsmenon commented Nov 14, 2013

Set owner to @vsmenon.

@vsmenon
Member

vsmenon commented Nov 14, 2013

Added this to the M9 milestone.

@vsmenon
Member

vsmenon commented Nov 14, 2013

Removed this from the M9 milestone.
Added this to the 1.1 milestone.

@vsmenon
Member

vsmenon commented Feb 27, 2014

Removed this from the 1.1 milestone.
Added this to the 1.3 milestone.

@vsmenon
Member

vsmenon commented May 26, 2014

Removed this from the 1.3 milestone.
Added this to the 1.6 milestone.

@kasperl

kasperl commented Jul 10, 2014

Removed this from the 1.6 milestone.
Added Oldschool-Milestone-1.6 label.

@kasperl

kasperl commented Aug 4, 2014

Removed Oldschool-Milestone-1.6 label.

@kevmoo kevmoo added type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) and removed priority-unassigned labels Feb 29, 2016
@vsmenon vsmenon removed their assignment Jun 30, 2017

6 participants