Issue 12: WikiEntityUtil - Apostrophe rendered as ’ instead of '
WikiEntityUtil translates the "'" (apostrophe) character to the entity "&rsquo;". This entity is not recognized by the SAX parser if you feed it the HTML generated by WikiModel. I believe that WikiEntityUtil should be changed to map this character to "'", which the SAX parser handles correctly.
Jan 10, 2008
This is one possible solution. We can replace all XHTML entities with the corresponding
numeric character references. In that case all of them will be recognized by XML parsers. Right now
the XHTML output cannot be parsed as is: many entities are not defined in the XML
header (but browsers recognize them as such).
Examples of entities to replace:
{{{
&quot;
&laquo;
&raquo;
&copy;
...
}}}
For the full list of entities to replace, see the org.wikimodel.wem.util.WikiEntityUtil class
(http://wikimodel.googlecode.com/svn/trunk/org.wikimodel.wem/src/main/java/org/wikimodel/wem/util/WikiEntityUtil.java).
Status: Accepted
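To make the proposal concrete, here is a minimal sketch of the replacement step, not the actual WikiEntityUtil code. The class name, the method name, and the four-entry table are assumptions for illustration only; the real table would come from WikiEntityUtil. Named entities that are not in the table (for example the XML built-in `&amp;`) are left untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WikiModel.
final class NumericEntitySketch {
    // Illustrative subset only: entity name -> Unicode code point.
    private static final Map<String, Integer> CODE_POINTS = new LinkedHashMap<>();
    static {
        CODE_POINTS.put("laquo", 171);   // U+00AB
        CODE_POINTS.put("raquo", 187);   // U+00BB
        CODE_POINTS.put("copy", 169);    // U+00A9
        CODE_POINTS.put("rsquo", 8217);  // U+2019
    }

    // Matches references of the form &name;
    private static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Rewrites every known named entity as a decimal character reference.
    static String toNumeric(String xhtml) {
        Matcher m = NAMED.matcher(xhtml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = CODE_POINTS.get(m.group(1));
            // Unknown names are copied through unchanged.
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumeric("&copy; 2008 &laquo;x&raquo; &amp; it&rsquo;s"));
    }
}
```

Because numeric character references need no declaration, output produced this way should be parseable by any XML parser without a DTD.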
Jan 10, 2008
If you think this is a valid solution, it sounds good to me.
Jan 11, 2008
Actually, I just realized what you are saying here. Is there some way to pre-define
the common ones used in HTML like:
{{{
&lt;
&gt;
&amp;
&quot;
&apos;
}}}
It would be nice if the parser would handle these common character entities.
Jan 11, 2008
I just tried, and all of the above are already handled by default except for . I have searched for information on SAX parsers, and the only way I have found to add entities is to define them in a DTD. This means that the input XML must have a DTD defined. Even then, I'm not sure whether DTD validation must be turned on for the SAX parser to recognize the character entities.
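On the validation question: as far as I know, a non-validating parser still reads entity declarations from the internal DTD subset, so validation should not need to be turned on. A minimal sketch to check this, assuming the JDK's default SAX parser; the class name, the method name, and the sample document are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of WikiModel.
final class InternalDtdEntityDemo {
    // Parses the document with validation switched off and returns its text content.
    static String parseText(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // already the default; stated for clarity
            SAXParser parser = factory.newSAXParser();
            final StringBuilder text = new StringBuilder();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // May be called several times per text node, so append.
                    text.append(ch, start, length);
                }
            });
            return text.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The entity is declared in the internal subset of the DOCTYPE.
        String xml = "<!DOCTYPE p [<!ENTITY rsquo \"&#8217;\">]><p>it&rsquo;s</p>";
        System.out.println(parseText(xml));
    }
}
```

The drawback is the one noted above: every input document has to carry the declarations itself.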
Oct 26, 2008
Danny, I think I've fixed this some time ago by adding the XHTML DTDs to the XHTML parser. Could you try again and let me know if all is working for you? Thanks
Owner: vmassol
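For readers wondering what "adding the XHTML DTDs to the parser" can look like, one common approach is a SAX EntityResolver that serves a local copy of the DTD instead of fetching it from w3.org. The sketch below is an assumption about the general technique, not the actual WikiModel fix. The class name is hypothetical, and a two-line stand-in string replaces the real XHTML DTD files so that the example stays self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of WikiModel.
final class LocalDtdResolverSketch {
    // Stand-in for a bundled XHTML DTD; a real setup would load the DTD files
    // from the classpath instead.
    private static final String STAND_IN_DTD =
            "<!ENTITY rsquo \"&#8217;\">\n<!ENTITY nbsp \"&#160;\">\n";

    // Parses the document without validation and returns its text content.
    static String parseText(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            final StringBuilder text = new StringBuilder();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public InputSource resolveEntity(String publicId, String systemId) {
                    if (publicId != null && publicId.startsWith("-//W3C//DTD XHTML")) {
                        // Serve the local copy, so no network access is needed.
                        InputSource local = new InputSource(new StringReader(STAND_IN_DTD));
                        local.setSystemId(systemId);
                        return local;
                    }
                    return null; // fall back to the parser's default resolution
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            return text.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html>it&rsquo;s</html>";
        System.out.println(parseText(xml));
    }
}
```

This only helps when the input declares an XHTML DOCTYPE; documents without one still need numeric references or their own declarations.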