Updated Jun 11, 2008 by sebast...@alombra.de
Labels: Featured, Phase-Implementation
GettingStarted  
5 minute introduction

Requirements

Make sure you have Java 5 or later installed. If you don't, download it here.
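A quick way to confirm the Java requirement is to ask the java launcher for its version. The snippet below is only a sketch: it assumes a POSIX shell, and the JAVA_STATUS variable is a name introduced here for illustration, not something dine reads.

```shell
# Run "java -version" quietly; it succeeds only if a working java is on the PATH.
if java -version >/dev/null 2>&1; then
    JAVA_STATUS="found"
    # java -version writes to stderr, so redirect it to print the first line.
    java -version 2>&1 | head -n 1
else
    JAVA_STATUS="missing"
    echo "java not found on PATH; install Java 5 or later first"
fi
```

If the first line reports a version older than 1.5, upgrade before continuing.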

Grab the latest release from dine's project page.

Directory Structure

Create two folders: one will contain the JavaScript code dine executes, and all output will be written to the other.

For this guide, let's assume your directory structure looks like this:

/tmp/dine/
/tmp/dine/dine-0.3.1-beta-cli.jar
/tmp/dine/js/
/tmp/dine/output/
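The layout above can be created from a shell roughly as follows. This is a minimal sketch assuming a POSIX shell; DINE_HOME is a hypothetical variable used here for convenience and is not read by dine itself.

```shell
# Base directory used throughout this guide; set DINE_HOME to use another location.
DINE_HOME="${DINE_HOME:-/tmp/dine}"

# -p creates missing parent directories and does not fail if they already exist.
mkdir -p "$DINE_HOME/js" "$DINE_HOME/output"

# Show the result; the jar still has to be copied into $DINE_HOME by hand.
ls "$DINE_HOME"
```

After running it, copy the downloaded jar into the base directory so that it sits next to the js and output folders.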

Executing dine

Now go to /tmp/dine, open a shell and type java -jar dine-0.3.1-beta-cli.jar to see what parameters dine expects:

Usage: java -jar dine-0.3.1-beta-cli.jar <maxConcurrentThreads> <StepsBaseDirectory> <DownloadBaseDirectory> <seedStep>
maxConcurrentThreads is the maximum number of HTTP requests dine will execute concurrently
StepsBaseDirectory is the directory containing the JavaScript code dine will execute (/tmp/dine/js in this example)
DownloadBaseDirectory is the directory where the output is written to (/tmp/dine/output in this example)
seedStep is the name of the first JavaScript Step dine will execute

Write your first Step

Time to write our first Step! A Step is a piece of JavaScript code that tells dine what content to fetch from the internet. dine fetches it for you, and the Step can then extract whatever information it wants from the response.

Instead of the classic "Hello World", we'll write a Step that echoes the page title of google.com.

Create a file called googleTitle.js in /tmp/dine/js/ and paste the following code into it:

createStep({

	getUrl: function( ctx ) {
		
		return "http://www.google.com";		
	},
	
	run: function( ctx ) {		

		var xml = new XML( ctx.getResponse() );
		
		print( "The title of google.com is " + xml..title +", what a surprise!" );
	}
	
});

Run your first step

Time to execute it! Open a shell in /tmp/dine and type:

java -jar dine-0.3.1-beta-cli.jar 1 /tmp/dine/js/ /tmp/dine/output/ /googleTitle

The output should look similar to this:

The title of google.com is Google, what a surprise!

Understanding the code

Learn more

dine can do much more, such as triggering new steps, passing parameters to steps, retrieving JSON...

Check the samples and tutorials to learn more.

Also, feel free to comment on this page if you feel something is missing or hard to understand in this introduction.


