What steps will reproduce the problem?
1. á é í ó ú
2. à è ì ò ù
3. (And others)
What is the expected output? What do you see instead? The expected output is the character that was typed (á, é, í, ó, ú); instead I see a "?" character.
What version of the product are you using? 0.3 Beta 3
Please provide any additional information below. Some characters are not displayed correctly.
Comment #1
Posted on Sep 15, 2007 by Helpful Elephant
This problem depends on the encoding: it should either be sent by the server or embedded in a meta tag, but if it is neither, there's a problem. I have seen a few pages that don't send an encoding and use iso-8859-1. I'll see what I can do here.
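To illustrate the symptom reported above, here is a minimal Python sketch. It is not Bunjalloo's code. It assumes the page bytes are ISO-8859-1 and that the reader decodes them as UTF-8, which is one plausible way to end up with placeholder characters.

```python
# Illustration only: how ISO-8859-1 bytes turn into placeholder characters
# when they are decoded as UTF-8 (an assumed failure mode, not Bunjalloo code).
page_bytes = "á é í ó ú".encode("iso-8859-1")

# Bytes such as 0xE1 are not valid UTF-8 on their own, so each accented
# letter becomes U+FFFD, the Unicode replacement character.
wrong = page_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the page was actually written in restores the text.
right = page_bytes.decode("iso-8859-1")

print(repr(wrong))
print(right)
```

A renderer without a glyph for U+FFFD could plausibly draw it as "?", which would match the report.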
Comment #2
Posted on Sep 18, 2007 by Helpful Elephant
I've made a change in r326.
The HTML spec says "no default encoding should be assumed" but you have to start off with something... since iso-8859-1 pages are the worst for not setting the charset type, I've defaulted to this.
I've added a BOM (http://en.wikipedia.org/wiki/Byte_Order_Mark) check for UTF-8 too, and the encoding is now set based on the Content-Type header. I also changed the parsing of these headers to use the more flexible ParameterSet class, which parses key=value pairs a bit less strictly (it allows spaces, quotes, etc.). Let's see if that works a bit better!
Of course, Bunjalloo only supports the utf-8 and iso-8859-1 encodings (at the moment...), since these are by far the most common on pages I read. It's quite a hack, but it works quite well. Other encodings are a pain to add; patches welcome (if anyone's reading ;-)
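The detection order described in this comment could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the r326 change: `parse_parameters` and `detect_encoding` are hypothetical names (the real code uses the ParameterSet class, whose interface is not shown here), and letting the BOM take precedence over the header is an assumption.

```python
UTF8_BOM = b"\xef\xbb\xbf"
DEFAULT_ENCODING = "iso-8859-1"          # the default chosen in this comment
SUPPORTED = ("utf-8", "iso-8859-1")      # the two encodings mentioned above


def parse_parameters(header_value):
    """Leniently parse 'type; key=value; key2="value"' into a dict.

    Hypothetical stand-in for a ParameterSet-like parser: it tolerates
    spaces around '=' and strips surrounding quotes.
    """
    params = {}
    for part in header_value.split(";")[1:]:
        key, sep, value = part.partition("=")
        if not sep:
            continue  # skip fragments that are not key=value pairs
        params[key.strip().lower()] = value.strip().strip("\"'")
    return params


def detect_encoding(body, content_type=None):
    """Pick an encoding: BOM first (assumed), then header, then default."""
    if body.startswith(UTF8_BOM):
        return "utf-8"
    if content_type:
        charset = parse_parameters(content_type).get("charset", "").lower()
        if charset in SUPPORTED:
            return charset
    return DEFAULT_ENCODING


print(detect_encoding(b"<html>", 'text/html; charset = "UTF-8"'))
```

In this sketch, unsupported or missing charsets fall back to the default rather than raising an error, which keeps the page readable at the cost of possibly wrong characters.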
Comment #3
Posted on Sep 19, 2007 by Quick Rhino
Nice work! Can't wait to see your progress in the next version :)
Comment #4
Posted on Sep 27, 2007 by Helpful Elephant
Arggh! I just discovered that this bug isn't fixed! :-( It's far more serious (but easily fixed): any page that declares UTF-8 or ISO-8859-1 in upper case is not handled right; only lower-case names, like utf-8 and iso-, are.
What a wally I am. This fix plus a fix for a crasher are going into 0.3.1 soon.
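The case problem described here is typically avoided by normalizing the charset name before comparing it. The following Python sketch shows the idea; `normalize_charset` is a hypothetical helper for illustration and not the actual 0.3.1 fix.

```python
SUPPORTED_ENCODINGS = {"utf-8", "iso-8859-1"}


def normalize_charset(name):
    """Return the lower-cased charset name if it is supported, else None."""
    candidate = name.strip().lower()
    return candidate if candidate in SUPPORTED_ENCODINGS else None


# Both spellings now map to the same supported name.
print(normalize_charset("UTF-8"), normalize_charset("utf-8"))
```

Charset names are case-insensitive by convention, so comparing the lower-cased form covers "UTF-8", "Utf-8" and "utf-8" alike.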
Comment #5
Posted on Sep 29, 2007 by Helpful Elephant
OK, this is really fixed in 0.3.1.
Status: Fixed
Labels:
Type-Defect
Priority-Medium
Component-Bunjalloo
Usability
Milestone-0.3