My favorites | Sign in
Project Home Downloads Issues Source
Search
for
introduction  
What is Dexen?
Updated May 13, 2011 by patrick....@gmail.com

Introduction

Key characteristics

Dexen is a Distributed EXecution ENvironment for population based algorithms. Such algorithms include optimisation algorithms such as hill climbing, simulated annealing and evolutionary algorithms. For optimisation problems, Dexen supports multi-criteria optimisation using Pareto ranking methods.

The key characteristics of Dexen are as follows:

  • It can be used to run compute intensive jobs, on multiple networked computers in parallel. This includes computers on the internet, computers on LANs, and computers on cloud platforms.
  • It can run any job that requires a population of individuals to be processed. These types of jobs include optimisation algorithms such as hill climbing, Tabu search, simulated annealing, and evolutionary algorithms.
  • It allows users to define jobs by specifying a set of tasks that need to be executed. An arbitrary number of tasks can be defined, and dependencies between tasks can be created via data filters.
  • It allows users to specify tasks using a flexible programming model that allows a wide variety of jobs to be defined. The programming model requires the user to specify tasks by sub-classing a number of predefined classes.
  • It is implemented in Python, and runs on Windows and Linux. It has been tested on Windows Vista, Windows 7, and Ubuntu 10. This current version has a command line interface (GUI is coming soon).
  • It is easy to install and easy to run.

Jobs and tasks

The process of running a population based optimisation problem within Dexen is described as a job. The blueprint for the job is referred to as a job definition (or JobDef). The job definition defines a set of computational procedures, which are referred to as tasks. When a job is run, the tasks will be executed by Dexen.

There are two ways of creating the job definition - either you can write the Python code from scratch, or you can use an automated generator. See writing `JobDefs` and generating `JobDefs` respectively.

Each task will act on entities in the population called individuals. A Dexen population consists of a set of such individuals. For optimization problems, an individual represents a solution to the problem being optimised.

Server, nodes, and clients

Dexen consists of three main types of components: one server, multiple nodes, and multiple clients. Each of these components can run on separate machines, thereby allowing the computation to be distributed.

  • The server runs on a computer that is the core of the system, and all other components connect to the server.
  • The node runs on a computer that can be used by the system as a processing unit. The system will use the node to start either a master process or one or more slave processes. Masters manage jobs, including the populations of individuals associated with each job. If you start multiple jobs, then Dexen will create multiple masters. Slaves execute the tasks associated with a particular job. Typically, many slaves will be running in parallel. Dexen will automatically assign slaves to masters to execute tasks without requiring any action from the user.
  • A client provides a front-end for you to start, stop, and monitor the progress of jobs. When you start a job, you will need to use the client to upload the job definition. This will include a set of tasks that need to be executed.

Master task and slave tasks

The job definition defines two types of tasks: one master task and one or more slave tasks.

  • The master task is executed on the master and is the main starting point for the job, and this task starts all slave tasks.
  • The slave task is executed on the slave and processes individuals in the population.

For each slave task, the job definition defines the task itself, as well as three other key pieces of information: the name, the input size, and the filter function.

  • The name is a name used in various places to refer to the task.
  • The input size specifies the number of individuals required by the task for processing. Some tasks just require one individual. Other tasks require 100 individuals, all at the same time.
  • The filter function is a function used to decide whether a particular individual is ready for processing by that task. If the function returns true, then the individual is ready for processing, if it returns false, then it is not ready for processing.

Scheduling slave tasks

Each task define in the job definition will process individuals in the population. The master will first select a task, and will then select a set of individuals from the population that are ready for processing by that task. The master will then send these individuals off to a free slave for processing. This is referred to as task scheduling.

For each slave task, the master keeps track of two sets of individuals, the pending set are individuals ready to be processed by the task, and the processing set are the individuals currently being processed by the task.

The master runs in an infinite loop, continuously trying to find tasks that can be executed. The master will look at each task in turn. If the size of the pending set is greater than the input size for the task, then the master will:

  1. move the individuals to the processing set,
  2. make copies of the individuals, and
  3. send the copies off to the slave for processing.

When these individuals return from the slave, then the master will:

  1. merge the individuals with those in the population (which involves copying new data into the individual in the population),
  2. remove the individuals from the processing set for the task that just processed them,
  3. for each task, use the task filter function to check if the individuals can be processed by the task, and if they can, then add them to the pending set.

Note that the individuals can be copied into more than one pending set.


Sign in to add a comment
Powered by Google Project Hosting