Ariel helps extract information from semi-structured documents, including
(but not limited to) web pages. You could achieve the same goal with
libraries such as Hpricot or Rubyful Soup, or even with plain regular
expressions, but Ariel approaches the problem very differently. The user
labels examples of the data they want to extract, and Ariel finds patterns
across several such labeled examples to produce a set of general rules for
extracting that information from any similar document.
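
To make the idea of learning rules from labeled examples concrete, here is a
minimal toy sketch in plain Ruby. It does not use Ariel's API and is not
Ariel's actual learning algorithm. It assumes a much simpler scheme: a rule is
just the text that consistently appears right before and right after the
labeled value in every example. The names LabeledExample, learn_rule and
apply_rule are hypothetical and exist only for this illustration.

```ruby
# Toy illustration only: NOT Ariel's API or its real rule-learning algorithm.
# Each labeled example pairs a document with the value the user marked in it.
LabeledExample = Struct.new(:document, :value)

# Longest prefix shared by every string in a non-empty list.
def common_prefix(strings)
  first = strings.first
  length = 0
  length += 1 while length < first.length &&
                    strings.all? { |s| s[length] == first[length] }
  first[0, length]
end

# Longest suffix shared by every string in a non-empty list.
def common_suffix(strings)
  common_prefix(strings.map(&:reverse)).reverse
end

# Learn a rule: the text that always precedes and always follows the value.
# Assumes the first occurrence of the value is the one the user labeled.
def learn_rule(examples)
  befores = []
  afters = []
  examples.each do |ex|
    start = ex.document.index(ex.value)
    raise ArgumentError, "labeled value not found in document" if start.nil?
    befores << ex.document[0, start]
    afters << ex.document[(start + ex.value.length)..-1]
  end
  { start_landmark: common_suffix(befores),
    end_landmark: common_prefix(afters) }
end

# Apply a learned rule to an unseen document; returns nil if it does not match.
def apply_rule(rule, document)
  start = document.index(rule[:start_landmark])
  return nil if start.nil?
  from = start + rule[:start_landmark].length
  to = document.index(rule[:end_landmark], from)
  return nil if to.nil?
  document[from...to]
end

examples = [
  LabeledExample.new("<h1>Shop</h1><p>Name: <b>Lamp</b>, red</p>", "Lamp"),
  LabeledExample.new("<h1>Store</h1><p>Name: <b>Desk</b>, oak</p>", "Desk")
]
rule = learn_rule(examples)
puts apply_rule(rule, "<h1>Market</h1><p>Name: <b>Chair</b>, blue</p>")
```

The more the labeled examples differ in their irrelevant parts, the shorter
and more general the learned landmarks become, which is why several examples
are needed rather than one.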


