Added a timeout parameter to the parse function #80
cool 👍 |
👍 |
Agreed, this would be awesome. Should the default timeout actually be -1 so existing usages of … I've had to build a workaround using the requests library in the meantime and would love to switch back to using only feedparser, but there's no other clean way besides fussing with the global/thread timeout. |
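For readers wondering what "fussing with the global timeout" looks like in practice, here is a minimal sketch using only the standard library. The helper name `default_socket_timeout` is hypothetical, and it assumes the HTTP code being wrapped creates its sockets without an explicit timeout, so that the process-wide default applies.

```python
import contextlib
import socket


@contextlib.contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the process-wide default socket timeout.

    Hypothetical helper, not part of feedparser. The setting is global:
    sockets created by other threads while the block is active are
    affected as well.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore whatever default was in place before entering the block.
        socket.setdefaulttimeout(previous)
```

A call such as `feedparser.parse(url)` would then go inside a `with default_socket_timeout(10):` block. Because the setting is process-wide rather than per-request, this is more of a stopgap than a clean solution.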
This feature would be very useful -- any plans to merge? |
I think this repo is dead, isn't it ? This PR is one year old. |
My plan is to remove custom HTTP code from feedparser at some point in the future. Therefore I don't want to encourage people to depend on HTTP features in feedparser and would instead suggest using far more robust libraries like requests. I have consistently rejected adding timeouts to feedparser. Please consider using a library like requests instead. 😄 |
OK, but don't expect people to find your library very useful when it hangs their scripts and programs.
|
@kurtmckee so what's your advice on this one? use requests to fetch the page and then parse it with feedparser? In this case what's the point of feedparser, we could do it with any other xml parser, no? |
@peterashwell I think you misunderstood Kurt's answer. @JPFrancoia You are partly right, but using another XML parser would cause you a lot more work, because feedparser does more than just parse the XML: it interprets the data in the context of feeds. IMO this package is essential for me (I'm currently working on a feed reader). |
@JPFrancoia, I do recommend using a strong HTTP client library like requests. feedparser's HTTP client was written at a time when the only game in town was the standard library, and it was painful to interact with. Adding an HTTP client to feedparser was really helpful to people.
It's been a decade and a half, and libraries have emerged with great features and usability, so I recommend using those libraries. requests is a good option for sure!
feedparser's strength is not its combination of HTTP and XML; feedparser's strength is its ability to handle real-world edge cases, including mangled XML that compliant XML parsers would choke on and reject.
|
@kurtmckee just re-wrote my implementation to use |
feedparser is great at parsing feeds, not at fetching data. Following its author's recommendation[1], we are no longer relying on feedparser to fetch the feeds, and instead we use 'requests'. [1] kurtmckee/feedparser#80 Closes: #12
I had issues with connections hanging forever. My recommendation is then to remove the broken HTTP client, and expect the data to be passed directly to (btw, thanks for this library, it is really helpful :) |
@kurtmckee I know that there is currently no schedule for removing the HTTP request part from feedparser. But maybe it would be a good idea to add a deprecation warning to the next release for anyone who uses it. |
…e/feedparser#80) as per feedparser author and pass data only to feedparser. Add exceptions to catch timeouts and sites having network connections not working
The default is set to 30 seconds.
I only saw #77 after I made my corrections. However, #77 introduces a hardcoded parameter and does not respect feedparser's API.
I recommend using this PR instead of #77.
Usage:
But old syntax will still work, of course: