FAQ
Frequently asked questions
Actually, it doesn't have anything to do with eating ;) The name is inspired by the name of the Navajo language (Diné bizaad), which was used in World War II for secret communication. The language was so cryptic and hard for outsiders to understand that no one could work out what the code talkers were saying. Making a piece of code understand information found on the internet sometimes seems to me an equally hard task, so I chose this name.
Dine executes your steps concurrently, and the best way to parallelize execution is to use a single thread for each website to parse. In fact, a step is an implementation of the command pattern.
Dine compiles your steps only once at startup and uses a singleton instance every time a step is executed. Because that one instance is shared by all executions, steps need to be stateless! This architecture gives dine a very low memory footprint.
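To illustrate why statelessness matters when one instance is reused, here is a minimal sketch in plain JavaScript. It does not use the dine API: the `execute(links)` function, the two step objects, and the input arrays are all hypothetical, chosen only to show how state kept on a shared instance leaks from one execution into the next.

```javascript
// Hypothetical step shapes; dine's real step interface may differ.

// Stateful: keeps results on the instance, so a reused singleton
// accumulates data from every earlier execution.
var statefulStep = {
  found: [],
  execute: function (links) {
    for (var i = 0; i < links.length; i++) {
      this.found.push(links[i]);
    }
    return this.found;
  }
};

// Stateless: everything lives in local variables, so each execution
// sees only its own input.
var statelessStep = {
  execute: function (links) {
    var found = [];
    for (var i = 0; i < links.length; i++) {
      found.push(links[i]);
    }
    return found;
  }
};

// Simulate the same instance being executed for two different websites.
var siteA = ["a1", "a2"];
var siteB = ["b1"];

statefulStep.execute(siteA);
var leaked = statefulStep.execute(siteB);   // also contains siteA's links

statelessStep.execute(siteA);
var clean = statelessStep.execute(siteB);   // contains only siteB's links

console.log(leaked.length, clean.length);   // prints "3 1"
```

With concurrent execution the stateful variant gets worse, since results from different websites can interleave, so keeping all working data in local variables is the safer habit.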
It uses E4X; see Wikipedia and the in-depth tutorial from Mozilla to learn more.
I think that E4X is much more intuitive and easier to use than XPath, so the dine API currently only offers support for E4X. Maybe there will also be an XPath API in the future, because a lot of people are familiar with it and it is much more powerful than E4X, yet harder to handle. If you're interested in implementing this, feel free to join this project!
It supports POST: add a function called getMethod() to your step that returns the string "POST". You can then implement another function called getPostParams() that returns a map of the parameters used for the POST request. HTTP Auth is currently not supported, but it could be added with very little effort, as we use commons-httpclient internally. Drop me a mail at spam at alombra.de if you need this functionality; maybe I will sit down for a few hours and add it.
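As a rough sketch of the POST setup described above, the relevant part of a step might look like the following. Only the function names getMethod() and getPostParams() come from the text; defining them as top-level functions, using a plain object as the map, and the parameter names and values are assumptions made for illustration.

```javascript
// Sketch of the POST-related part of a step.
// Assumption: the functions are defined at the top level of the step script.

// Tells dine to issue a POST request for this step.
function getMethod() {
  return "POST";
}

// Returns the parameters sent with the POST request.
// Assumption: a plain object serves as the map; keys and values are made up.
function getPostParams() {
  return {
    q: "code talkers",
    lang: "en"
  };
}
```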