pandoc - issue #260

Unicode combining character messes up table parsing


Posted on Oct 5, 2010 by Helpful Rabbit

I'm using version 1.6, installed from cabal. When I give pandoc a table, in Emacs table mode, where any cell contains a combining character, such as the combining acute accent (U+0301), every cell to the right of it gets a | as part of its contents. It behaves normally if I delete that character. I tried LaTeX and HTML output; both get the |. An example is attached.

Thanks!

Comment #1

Posted on Oct 5, 2010 by Grumpy Dog

Thanks. Pandoc counts characters for table alignment, and it doesn't (yet) know the difference between combining characters and other characters. I'll look into fixing this when I get a chance.

Data.Char can distinguish these nonspacing characters:

    Prelude Data.Char> generalCategory '\x0301'
    NonSpacingMark
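To illustrate the idea above, here is a minimal sketch of measuring a cell's width in columns rather than in characters. The names `isZeroWidth` and `displayWidth` are hypothetical helpers made up for this example, not pandoc's actual code, and the sketch assumes that skipping nonspacing and enclosing marks is enough (it ignores double-width characters, for instance).

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | True for characters that take up no column of their own,
-- e.g. U+0301 COMBINING ACUTE ACCENT. (Hypothetical helper.)
isZeroWidth :: Char -> Bool
isZeroWidth c = case generalCategory c of
  NonSpacingMark -> True
  EnclosingMark  -> True
  _              -> False

-- | Number of columns a string occupies when combining marks are
-- not counted. (Hypothetical helper; assumes every other character
-- is exactly one column wide.)
displayWidth :: String -> Int
displayWidth = length . filter (not . isZeroWidth)
```

In GHCi, `length "cafe\x0301"` gives 5, while `displayWidth "cafe\x0301"` gives 4. That one-character difference is the kind of offset that could shift the column boundaries for the cells to the right.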

Comment #2

Posted on Oct 27, 2010 by Grumpy Dog

(No comment was entered for this change.)

Comment #3

Posted on Jan 27, 2012 by Grumpy Dog

Fixed by ff93a8e7891d8537c713d6d1b0fd4409c5e43ebe

Status: Fixed

Labels:
Type-Defect Priority-Medium