Issues with the Media-RSS implementation #195
Could you please give us a short description of what Media-RSS is for? A real use case might make it easier to understand.
Of course. Media-RSS is used to describe media, such as audio or video files, and their metadata (thumbnails, description, number of views or listens, rating, links to the media in different formats, etc.). It is used in every YouTube feed (example) and in PeerTube feeds (example, though support should improve in an upcoming version).
I have the same issue. Did you solve it?
Actually, this would take some time to fix. I am willing to write a patch, but I would like to be sure that it will be merged in the end before I start. @kurtmckee What do you think?
This is something we are very interested in as well, especially when it comes to children in [...]. I have started work on a patch, but the changes are breaking at this time (see the example below). Main changes:
Any thoughts on these changes and how they affect the parsed data? @azmeuk Is this in line with what you had in mind, or were you planning on something different? @kurtmckee Is this in line with the project as a whole?
Input file
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"
xmlns:dcterms="http://purl.org/dc/terms/">
<channel>
<title>Music Videos 101</title>
<link>http://www.foo.com</link>
<description>Discussions of great videos</description>
<item>
<title>The latest video from an artist</title>
<link>http://www.foo.com/item1.htm</link>
<media:content url="http://www.foo.com/movie.mov" fileSize="12216320" type="video/quicktime" expression="full">
<media:player url="http://www.foo.com/player?id=1111" height="200" width="400" />
<media:hash algo="md5">dfdec888b72151965a34b4b59031290a</media:hash>
<media:credit role="producer">producer's name</media:credit>
<media:credit role="artist">artist's name</media:credit>
<media:category scheme="http://blah.com/scheme">
music/artistname/album/song
</media:category>
<media:text type="plain">
Oh, say, can you see, by the dawn's early light
</media:text>
<media:rating>nonadult</media:rating>
<dcterms:valid>
start=2002-10-13T09:00+01:00;
end=2002-10-17T17:00+01:00;
scheme=W3C-DTF
</dcterms:valid>
</media:content>
</item>
</channel>
</rss>
Parsed data WITHOUT changes
[
{
"title": "The latest video from an artist",
"title_detail": {
"type": "text/plain",
"language": null,
"base": "",
"value": "The latest video from an artist"
},
"links": [
{
"rel": "alternate",
"type": "text/html",
"href": "http://www.foo.com/item1.htm"
}
],
"link": "http://www.foo.com/item1.htm",
"media_content": [
{
"url": "http://www.foo.com/movie.mov",
"filesize": "12216320",
"type": "video/quicktime",
"expression": "full"
}
],
"media_player": {
"url": "http://www.foo.com/player?id=1111",
"height": "200",
"width": "400",
"content": ""
},
"media_hash": {
"algo": "md5"
},
"media_credit": [
{
"role": "producer",
"content": "producer's name"
},
{
"role": "artist",
"content": "artist's name"
}
],
"credit": "artist's name",
"tags": [
{
"term": "music/artistname/album/song",
"scheme": "http://blah.com/scheme",
"label": null
}
],
"media_text": {
"type": "plain"
},
"media_rating": {
"content": "nonadult"
},
"rating": "nonadult",
"validity": "start=2002-10-13T09:00+01:00;\n end=2002-10-17T17:00+01:00;\n scheme=W3C-DTF",
"validity_start": "2002-10-13T09:00+01:00",
"validity_start_parsed": [
2002,
10,
13,
8,
0,
0,
6,
286,
0
]
}
]
Parsed data WITH changes
[
{
"title": "The latest video from an artist",
"title_detail": {
"type": "text/plain",
"language": null,
"base": "",
"value": "The latest video from an artist"
},
"links": [
{
"rel": "alternate",
"type": "text/html",
"href": "http://www.foo.com/item1.htm"
}
],
"link": "http://www.foo.com/item1.htm",
"media_content": [
{
"url": "http://www.foo.com/movie.mov",
"filesize": "12216320",
"type": "video/quicktime",
"expression": "full",
"media_player": {
"url": "http://www.foo.com/player?id=1111",
"height": "200",
"width": "400",
"content": ""
},
"media_hash": {
"algo": "md5"
},
"media_credit_details": [
{
"role": "producer",
"content": "producer's name"
},
{
"role": "artist",
"content": "artist's name"
}
],
"media_credit": "artist's name",
"tags": [
{
"term": "music/artistname/album/song",
"scheme": "http://blah.com/scheme",
"label": null
}
],
"media_text": {
"type": "plain"
},
"media_rating_details": {
"content": "nonadult"
},
"media_rating": "nonadult",
"validity": "start=2002-10-13T09:00+01:00;\n end=2002-10-17T17:00+01:00;\n scheme=W3C-DTF",
"validity_start": "2002-10-13T09:00+01:00",
"validity_start_parsed": [
2002,
10,
13,
8,
0,
0,
6,
286,
0
]
}
]
}
]
Output diff
...
"media_content": [
{
"url": "http://www.foo.com/movie.mov",
"filesize": "12216320",
"type": "video/quicktime",
- "expression": "full"
- }
- ],
+ "expression": "full",
"media_player": {
"url": "http://www.foo.com/player?id=1111",
"height": "200",
...
- "media_credit": [
+ "media_credit_details": [
{
"role": "producer",
"content": "producer's name"
},
{
"role": "artist",
"content": "artist's name"
}
],
- "credit": "artist's name",
+ "media_credit": "artist's name",
...
"media_text": {
"type": "plain"
},
- "media_rating": {
+ "media_rating_details": {
"content": "nonadult"
},
- "rating": "nonadult",
+ "media_rating": "nonadult",
"validity": "start=2002-10-13T09:00+01:00;\n end=2002-10-17T17:00+01:00;\n scheme=W3C-DTF",
...
+ }
+]
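To make the proposed nesting easier to picture, here is a minimal sketch that uses only the standard library's `xml.etree.ElementTree`, not feedparser's own code. The helper name `nest_media_content`, the trimmed sample feed, and the choice to store every child as a list of dicts are assumptions made for illustration; the actual patch uses the key names shown in the diff above.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"

# Trimmed-down version of the input file above.
SAMPLE = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel><item>
<media:content url="http://www.foo.com/movie.mov" type="video/quicktime">
<media:hash algo="md5">dfdec888b72151965a34b4b59031290a</media:hash>
<media:credit role="producer">producer's name</media:credit>
<media:credit role="artist">artist's name</media:credit>
<media:rating>nonadult</media:rating>
</media:content>
</item></channel></rss>"""


def nest_media_content(item):
    """Return one dict per <media:content>, with its media:* children nested inside.

    Hypothetical helper: the key layout is illustrative, not feedparser's.
    """
    results = []
    for content in item.findall(f"{{{MEDIA_NS}}}content"):
        media = dict(content.attrib)
        for child in content:
            if not child.tag.startswith(f"{{{MEDIA_NS}}}"):
                continue  # skip children from other namespaces
            name = "media_" + child.tag.split("}", 1)[1]
            detail = dict(child.attrib)
            detail["content"] = (child.text or "").strip()
            # Every child is kept as a list so repeated tags are not lost.
            media.setdefault(name, []).append(detail)
        results.append(media)
    return results


item = ET.fromstring(SAMPLE).find("channel/item")
media = nest_media_content(item)
```

With this layout the credits, hash, and rating travel with the media they describe, so an entry with several `<media:content>` elements would not mix their metadata.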
Hello,
I noticed some issues with the media-rss implementation. Before trying to fix them, I would like to discuss it here.
media:group is ignored
According to the Media-RSS specification, the <media:group> tag is used to group several links or representations of the same media. However, my understanding is that feedparser simply ignores this tag and treats every <media:content> as a new media.
feedparser/feedparser/namespaces/mediarss.py
Lines 64 to 66 in d12d3bd
feedparser/feedparser/namespaces/mediarss.py
Lines 119 to 122 in d12d3bd
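As a rough illustration of the grouping behaviour described above, the sketch below treats each `<media:group>` as one media with several representations, and each standalone `<media:content>` as a media of its own. It relies only on the standard library's `xml.etree.ElementTree`; the function name `collect_media` and the sample URLs are made up for this example and are not part of feedparser.

```python
import xml.etree.ElementTree as ET

NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed: one grouped media (two bitrates) and one standalone media.
SAMPLE = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel><item>
<media:group>
<media:content url="http://www.foo.com/song64kbps.mp3" bitrate="64" />
<media:content url="http://www.foo.com/song128kbps.mp3" bitrate="128" />
</media:group>
<media:content url="http://www.foo.com/trailer.mov" />
</item></channel></rss>"""


def collect_media(item):
    """Return a list of media; each media is a list of representation URLs."""
    medias = []
    # Each <media:group> is one media with several representations.
    for group in item.findall("media:group", NS):
        medias.append([c.get("url") for c in group.findall("media:content", NS)])
    # A <media:content> directly under the item is a media of its own.
    for content in item.findall("media:content", NS):
        medias.append([content.get("url")])
    return medias


item = ET.fromstring(SAMPLE).find("channel/item")
medias = collect_media(item)
```

Here `medias` holds two entries rather than three, which is the distinction that is lost when every `<media:content>` is counted as a new media.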
The description is set on the feed entry
The <media:description> tag belongs to the media, but feedparser updates the feed entry description.
feedparser/feedparser/namespaces/mediarss.py
Lines 91 to 95 in d12d3bd
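One possible way to keep the two descriptions apart is sketched below, again with the standard library's `xml.etree.ElementTree` rather than feedparser itself. The helper `media_descriptions`, the sample feed, and the output dict layout are assumptions for illustration; the fallback to `"plain"` follows the default that the Media-RSS specification gives for the `type` attribute.

```python
import xml.etree.ElementTree as ET

NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed with both an entry description and a media description.
SAMPLE = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel><item>
<description>Entry-level description</description>
<media:content url="http://www.foo.com/movie.mov">
<media:description type="html">&lt;b&gt;About&lt;/b&gt; the video</media:description>
</media:content>
</item></channel></rss>"""


def media_descriptions(item):
    """Attach each <media:description> to its media, keeping the type attribute."""
    out = []
    for content in item.findall("media:content", NS):
        entry = {"url": content.get("url")}
        desc = content.find("media:description", NS)
        if desc is not None:
            entry["description"] = {
                "type": desc.get("type", "plain"),  # "plain" is the spec default
                "value": (desc.text or "").strip(),
            }
        out.append(entry)
    return out


item = ET.fromstring(SAMPLE).find("channel/item")
entry_description = item.findtext("description")  # left untouched
media = media_descriptions(item)
```

The entry description stays as it was, while the media description and its `type` are stored on the media itself.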
Some tags are missing
For instance, the <media:subtitle> tag is not handled by feedparser.
Attributes are ignored
When tags are handled, many of the attributes defined in the Media-RSS specification are simply ignored. For instance, <media:description> can be either plain text or HTML, but feedparser does not distinguish between the two.
So...
I would like to tackle these issues, but there could be some backward compatibility problems. How can I manage this? I believe Media-RSS is not widely used, and the simplest option for me is to break compatibility so that feedparser correctly follows the specification.
What do you think?