| ID | Type | Status | Priority | Milestone | Owner | Summary + Labels | Port |
|----|------|--------|----------|-----------|-------|------------------|------|
| 30 | Defect | Accepted | Medium | ---- | jgraham.cantab | Missing codecs in python 2.3 | ---- |
| 35 | Defect | Accepted | Medium | ---- | jgraham.html | Poor performance parsing numeric entities | Ruby |
| 47 | Defect | New | Medium | ---- | ---- | [PATCH] Sanitizer passes uppercase tags through untouched | ---- |
| 52 | Defect | New | Medium | ---- | ---- | thin and thick not in CSS whitelist | ---- |
| 55 | Defect | Accepted | High | ---- | jgraham.html | trunk python UnicodeError: UTF-16 stream does not start with BOM | ---- |
| 57 | Defect | New | Medium | ---- | ryansking | /trunk/testdata external set as https | ---- |
| 59 | Defect | New | Medium | ---- | ---- | maximum recursion depth exceeded in tree traversal (python) | Python |
| 61 | Defect | New | Medium | ---- | ---- | Integration of CSS Parser | ---- |
| 62 | Defect | New | Medium | ---- | ---- | Sanitizer does not allow stripping of tags | ---- |
| 63 | Defect | New | Medium | ---- | ---- | Problems with UTF8 | Ruby |
| 66 | Defect | New | Medium | ---- | ---- | Check for valid utf-8 in inputstream.rb gives false negatives when $KCODE is set to "UTF8" [w/fix] | Ruby |
| 68 | Defect | New | Medium | ---- | ---- | chardet is no longer maintained but rchardet | Ruby |
| 69 | Enhancement | Accepted | Medium | ---- | ---- | charsUntil is slow (Python) | Python |
| 73 | Defect | New | Medium | ---- | ---- | problem with reading from stdin | Python |
| 74 | Defect | Accepted | Medium | ---- | jgraham.html | AttributeError: 'module' object has no attribute 'isValidEncoding' | ---- |
| 77 | Defect | Accepted | Medium | ---- | ryansking | Only first instance of white space is stripped | Ruby |
| 79 | Defect | New | High | Release1.0 | ---- | getElementById doesn't work with minidom | Python |
| 80 | Defect | Accepted | Medium | ---- | ---- | TypeError when serializing some pages to BeautifulSoup | Python |
| 81 | Defect | Accepted | High | Release1.0 | ---- | Version info | Python |
| 82 | ---- | New | Critical | ---- | ---- | Zip archive is messed up | Python |
| 88 | Defect | Accepted | Critical | ---- | jgraham.html | Reading from stdin broken | Python |
| 89 | Defect | New | Critical | Release1.0 | ---- | Installation using setup.py fails under Windows | Python |
| 92 | Defect | Accepted | ---- | Release1.0 | ---- | Possible to make IE run script after roundtripping in html5lib | Python |
| 93 | Defect | Accepted | Critical | Release1.0 | ---- | Quote attributes containing weird whitespace or '<' | ---- |
| 95 | Task | Accepted | Medium | ---- | jgraham.html | Implement scripting-disabled case | ---- |
| 96 | Enhancement | Accepted | Low | ---- | ---- | a better intToUnicodeStr | ---- |
| 98 | ---- | New | ---- | ---- | ---- | Encoding issue: 'ascii' codec instead of appropriate one. | ---- |
| 103 | ---- | Accepted | ---- | ---- | ja...@hoppipolla.co.uk | Can't easy_install/pip install html5lib==dev | ---- |
| 112 | ---- | Accepted | ---- | ---- | ja...@hoppipolla.co.uk | assertion in processSpaceCharacters in InTableTextPhase | ---- |
| 113 | Defect | Accepted | Critical | Release1.0 | ---- | cannot handle malformed attribute names with html5lib and lxml | Python |
| 117 | ---- | Accepted | ---- | ---- | ---- | Tokenizer tests are not JSON | ---- |
| 119 | Enhancement | Accepted | Critical | Release1.0 | geoffers | Update to LC spec (label: NeedsTests) | Python |
| 120 | Defect | Accepted | Critical | Release1.0 | geoffers | Deprecate BeautifulSoup | Python |
| 121 | Defect | Accepted | ---- | Release1.0 | excors | nonXmlBMPRegexp is totally bogus | Python |
| 122 | ---- | New | ---- | ---- | ---- | Comments beginning a file crashes the xml parser | ---- |
| 123 | ---- | New | ---- | ---- | ---- | simpletree cloneNode only works for Elements (+patch) | ---- |
| 125 | ---- | Accepted | ---- | ---- | excors | InfosetFilter.toXmlName doesn't filter first character properly | ---- |
| 129 | ---- | New | ---- | ---- | ---- | Crash when parsing windows-style quotes | ---- |
| 130 | ---- | New | ---- | ---- | ---- | Genshi Tree Walker Broken | ---- |