Issue 2: error when parsing this feed
What steps will reproduce the problem?
1. Try to parse this feed from the command line with feedparser (see the attached file).

I expect to see the parsed feed, but the script produces an error. I took feedparser.py from the latest SVN version and ran it with Python 2.4.3 on Windows XP.
Dec 04, 2007
feed error with traceback:
>>> d = feedparser.parse("http://ftp.gnome.org/pub/GNOME/LATEST.xml")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "feedparser.py", line 2623, in parse
    feedparser.feed(data)
  File "feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
    self.goahead(0)
  File "/usr/lib/python2.4/sgmllib.py", line 134, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.4/sgmllib.py", line 296, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python2.4/sgmllib.py", line 336, in finish_endtag
    self.unknown_endtag(tag)
  File "feedparser.py", line 476, in unknown_endtag
    method()
  File "feedparser.py", line 1217, in _end_description
    value = self.popContent('description')
  File "feedparser.py", line 700, in popContent
    value = self.pop(tag)
  File "feedparser.py", line 641, in pop
    output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
  File "feedparser.py", line 1594, in _resolveRelativeURIs
    p.feed(htmlSource)
  File "feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
    self.goahead(0)
  File "/usr/lib/python2.4/sgmllib.py", line 129, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.4/sgmllib.py", line 283, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "/usr/lib/python2.4/sgmllib.py", line 314, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "feedparser.py", line 1460, in unknown_starttag
    strattrs = u''.join([u' %s="%s"' % (key, value) for key, value in uattrs]).encode(self.encoding)
LookupError: unknown encoding:
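
The blank name after "unknown encoding:" suggests that .encode() was called with an empty encoding string. The following is a minimal sketch of that failure mode together with one possible defensive fallback. The helper name safe_encode and the UTF-8 fallback are assumptions made for illustration; this is not the patch attached to this issue and not feedparser's own code.

```python
import codecs


def safe_encode(text, encoding, fallback="utf-8"):
    """Encode text, using a fallback codec when the name is empty or unknown.

    Hypothetical helper for illustration only; not part of feedparser.
    """
    try:
        # codecs.lookup raises LookupError for empty or unrecognized names.
        codecs.lookup(encoding or "")
    except LookupError:
        encoding = fallback
    return text.encode(encoding)


# An empty encoding name raises the same exception class as in the traceback.
try:
    u"x".encode("")
except LookupError as exc:
    print("LookupError: %s" % exc)

# The guarded version falls back to UTF-8 instead of raising.
print(safe_encode(u' href="%s"' % u"http://ftp.gnome.org/pub/GNOME/", ""))
```

A guard like this only hides the symptom; it does not explain why the encoding ended up empty in the first place.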
Dec 10, 2007
I have no problem with this URL using Python 2.4.4 and feedparser __version__ = "4.1"# + "$Revision: 1.92 $"[11:15] + "-cvs" on Debian etch.
Dec 11, 2007
(In reply to comment #2) Danielle, the http://ftp.gnome.org/pub/GNOME/LATEST.xml feed is OK at the moment, but not always. To reproduce the problem, use the attached LATEST.xml.
Apr 26, 2008
The problem is a bug in Python's sgmllib. See http://mail.python.org/pipermail/python-bugs-list/2007-February/037082.html for more details. I have attached a patch for feedparser.py that resolves this issue; please apply it.
Yesterday (26 hours ago)
This bug has been fixed as of Python 2.5.2 on Ubuntu.
The following code works:
>>> import feedparser
>>> f = feedparser.parse("http://feedparser.googlecode.com/issues/attachment?aid=-7827582398651082781&name=LATEST.xml")
>>> f.feed
{'lastbuilddate': u'Tue, 04 Dec 2007 08:12:41 +0000', 'publisher': u'webmaster@gnome.org', 'subtitle': u"A list of recent files released on GNOME's FTP site", 'links': [{'href': u'http://ftp.gnome.org/pub/GNOME/', 'type': 'text/html', 'rel': 'alternate'}], 'title': u'GNOME FTP Releases', 'subtitle_detail': {'base': u'http://feedparser.googlecode.com/issues/attachment?aid=-7827582398651082781&name=LATEST.xml', 'type': 'text/html', 'value': u"A list of recent files released on GNOME's FTP site", 'language': None}, 'title_detail': {'base': u'http://feedparser.googlecode.com/issues/attachment?aid=-7827582398651082781&name=LATEST.xml', 'type': 'text/plain', 'value': u'GNOME FTP Releases', 'language': None}, 'link': u'http://ftp.gnome.org/pub/GNOME/', 'publisher_detail': {'email': u'webmaster@gnome.org'}}
I'm marking this closed since the problem is in the Python libraries and they've been
fixed.
Status: Fixed
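
For anyone still on an older interpreter, a rough way to check whether the running Python is at least the version reported as working in the comment above. This is only a sketch: the 2.5.2 threshold comes from that single report on Ubuntu, not from the Python changelog, and the function name is made up for illustration.

```python
import sys

# Assumption: threshold taken from the report in this thread, not verified
# against the Python release notes.
REPORTED_FIXED = (2, 5, 2)


def is_reported_fixed(version_info=None):
    """Return True if the interpreter is at or above the reported version."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro).
    return tuple(version_info[:3]) >= REPORTED_FIXED


if not is_reported_fixed():
    print("This Python predates the version reported as fixed in this issue.")
```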