| Projects on Google Code | Results 1 - 9 of 9 |
= Overview =
SiteScraper extracts the data you want from webpages. No programming or HTML knowledge is required.
<br>
For an in-depth analysis of how it works, see [http://sitescraper.googlecode.com/files/SiteScraper.pdf this paper].
== Example ...
Based on libcurl, libtidy and simpleXML.
Focus on using XPath for extraction of content.
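To illustrate what XPath-based extraction looks like in general, here is a minimal sketch. It does not use this project's code or libcurl, libtidy, or SimpleXML. As an assumption, Python's standard `xml.etree.ElementTree` (which supports only a limited XPath subset) stands in for them. The `PAGE` markup and the `select` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-tidied XHTML fragment (well-formed XML),
# standing in for a fetched and cleaned-up page.
PAGE = """<html><body>
<div class="item"><h2>First title</h2><a href="/a">more</a></div>
<div class="item"><h2>Second title</h2><a href="/b">more</a></div>
<div class="ad"><h2>Sponsored</h2></div>
</body></html>"""


def select(xml_text, path):
    """Parse the document and return every element matching the XPath-like path."""
    root = ET.fromstring(xml_text)
    return root.findall(path)


# Select only the headings and links inside <div class="item"> blocks.
titles = [(el.text or "").strip() for el in select(PAGE, ".//div[@class='item']/h2")]
links = [el.get("href") for el in select(PAGE, ".//div[@class='item']/a")]

print(titles)
print(links)
```

The path expression names where the data lives in the document tree, so the `ad` block is skipped without any string matching.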
Want to get data from a web page but don't want to spend time analysing its DOM tree? Use this tool to quickly build the brain for your spider.
Unlike tree-based scrapers, this is a *search-based scraper*, meaning that data is gleaned by matching its surrounding data against the data you provid...
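The description above is cut off, so the following is only a rough sketch of one way a search-based approach could work, not this tool's actual method. It assumes you supply one example value from a training page. The text surrounding that value is recorded and then searched for in other pages. The names `learn_context` and `extract`, the `width` parameter, and the sample pages are all hypothetical.

```python
def learn_context(page, example, width=12):
    """Record the text immediately before and after a known example value."""
    i = page.find(example)
    if i < 0:
        raise ValueError("example value not found in page")
    before = page[max(0, i - width):i]
    after = page[i + len(example):i + len(example) + width]
    return before, after


def extract(page, context):
    """Return the text found between the learned 'before' and 'after' strings."""
    before, after = context
    start = page.find(before)
    if start < 0:
        return None
    start += len(before)
    end = page.find(after, start)
    if end < 0:
        return None
    return page[start:end]


# Hypothetical pages that share a layout but carry different data.
TRAIN = "<li>Name: Widget</li><li>Price: <b>$10.00</b></li>"
OTHER = "<li>Name: Gadget</li><li>Price: <b>$24.50</b></li>"

ctx = learn_context(TRAIN, "$10.00")
price = extract(OTHER, ctx)
print(price)
```

No DOM tree is built here: the surrounding characters alone locate the value, which is why this style of scraper needs an example rather than a path.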
http://guessmedia.com/tools/assets/images/YourSpace321x55_1a.gif
*An Open-Source Social Networking Tool from Mija Media*
YourSpace is a CommandLineInterface utility (CLI) that allows you to quickly view the "Last login:" date of any MySpace user. This minimizes your need to suffer through long...
YourSpace, MySpace, WebScraper, WebScraping, MultiLanguage, Polyglot, Social, CLI, Utility, Toy, PHP, BASH, Java, Ruby, Python
===WebExtractor360===
WebExtractor360 is a free and open source web data extractor. It uses Regular Expressions to find, extract and scrape internet data quickly and easily. It is very flexible, allowing you to extract both simple and commonly used data and complex data structures like HTML table...
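As a rough illustration of regular-expression extraction in general (not WebExtractor360's own patterns or code), the sketch below pulls e-mail addresses and two-column table rows out of a page using Python's standard `re` module. The `HTML` sample and both patterns are assumptions made for the example. The e-mail pattern is deliberately simple and will miss unusual addresses.

```python
import re

# Hypothetical page content used only for this example.
HTML = """
<p>Contact: sales@example.com or support@example.org</p>
<table>
<tr><td>Apple</td><td>1.20</td></tr>
<tr><td>Pear</td><td>0.95</td></tr>
</table>
"""

# Simple data: a loose e-mail pattern (illustrative, not RFC-complete).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Structured data: capture both cells of each two-column table row.
ROW_RE = re.compile(r"<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>", re.S)

emails = EMAIL_RE.findall(HTML)
rows = ROW_RE.findall(HTML)

print(emails)
print(rows)
```

Patterns like these are quick to write but brittle: a change in attribute order or nesting in the markup can silently stop them from matching.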
=phpQuery - pq();=
*phpQuery* is a server-side, chainable, CSS3 selector-driven Document Object Model (DOM) API [http://code.google.com/p/phpquery/wiki/jQueryPortingState based on] the [http://jquery.com/ jQuery JavaScript Library].
Library is written in [http://code.google.com/p/phpquery/wiki/Depen...
php, jquery, dom, html, css, events, json, browser, ajax, xpath, server, xml, webscraping, chainable, cli
Ticketyboo was an online, context-aware concert ticket recommender proposed by researchers at UCD last year. Its goal was to examine the artists someone listens to in iTunes (by looking at the user preference information stored in iTunes' own database) and automatically search online ...
This project aims to get book information from the web. The initial platform will be Java; ideally, the programs will be usable via GCJ, the GNU Compiler for Java.
= DiggStripper =
_JavaScript for stripping Digg pages using jQuery_
[http://en.wikipedia.org/wiki/Web_scraping Web scraping] is a common practice employed by search engines and other web utilities to extract content from sites. Traditionally this is a server-side functionality with extensive use...