snakemake


A Python-based language and execution environment for make-like workflows.

Build systems like make are frequently used to create complicated workflows, e.g. in bioinformatics. This project aims to reduce the complexity of creating workflows by providing a clean and modern domain-specific language (DSL) in Python style, together with a fast and comfortable execution environment.

Snakemake has moved to https://bitbucket.org/johanneskoester/snakemake. Please update your links. Sorry for the inconvenience.

Features

  • Define workflows textually by writing rules that describe how to create output files from input files, using a simple Python-based syntax. In contrast to GNU make (which is primarily a build system), snakemake allows a rule to create multiple output files.
  • Snakemake automatically determines which rules need to be executed to create the desired output.
  • Both shell-based rules and full Python syntax inside a rule are supported. Shell commands have direct access to all local and global Python variables.
  • Like GNU make, snakemake can schedule rules to run in parallel where possible. Furthermore, inter-rule parallelization can be combined with intra-rule parallelization (e.g. threads), and snakemake ensures that the number of cores in use does not exceed a given threshold.
  • Files can be marked as temporary (i.e. they can be deleted once they are no longer needed) or protected (i.e. they are write-protected after creation).
  • Input and output files can contain multiple named wildcards.
  • Input and output files can be given names so that they are easier to refer to inside the rule.
  • Map-reduce-like functionality is available through Python's easy-to-read list comprehension syntax.
  • As an experimental feature, snakemake can run on a cluster when given the submit command (e.g. qsub for Sun Grid Engine).
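
To make the ideas of named wildcards and automatic rule selection more concrete, here is a minimal sketch in plain Python. It is not snakemake's implementation and does not use snakemake's syntax or API. The rule table, the file names, and the helpers match_output and plan are all hypothetical and exist only for this illustration. The sketch also ignores file timestamps and parallel scheduling, and it assumes each wildcard name appears at most once per output pattern.

```python
import re

# Hypothetical toy rules (not snakemake syntax): each rule maps an output
# pattern to a list of input patterns. "{sample}" is a named wildcard.
RULES = [
    {"name": "sort", "output": "{sample}.sorted.txt", "inputs": ["{sample}.txt"]},
    # A list comprehension collects one input per sample (the "reduce" step).
    {"name": "merge", "output": "merged.txt",
     "inputs": ["{}.sorted.txt".format(s) for s in ("a", "b")]},
]


def match_output(pattern, target):
    """Return the wildcard values if target matches pattern, else None."""
    # Splitting on "{name}" yields literal text at even indices and
    # wildcard names at odd indices.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        regex += "(?P<%s>.+)" % part if i % 2 else re.escape(part)
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


def plan(target, existing, order=None):
    """Return the (rule, output) steps needed for target, dependencies first."""
    if order is None:
        order = []
    if target in existing:
        return order  # the file is already there, so no rule has to run
    for rule in RULES:
        wildcards = match_output(rule["output"], target)
        if wildcards is None:
            continue  # this rule cannot produce the target
        # Resolve every input of the matching rule before the rule itself.
        for pattern in rule["inputs"]:
            plan(pattern.format(**wildcards), existing, order)
        step = (rule["name"], target)
        if step not in order:
            order.append(step)
        return order
    raise ValueError("no rule produces %r" % target)


if __name__ == "__main__":
    for step in plan("merged.txt", existing={"a.txt", "b.txt"}):
        print(step)
```

Asking for merged.txt when only a.txt and b.txt exist yields the two sort steps followed by the merge step. This mirrors, in a very simplified form, how a workflow engine can work backwards from the desired output to the rules that must run.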

Paper

Köster, Johannes and Rahmann, Sven. "Snakemake - A scalable bioinformatics workflow engine". Bioinformatics 2012.

For further questions please feel free to contact me (http://www.rahmannlab.de/people/koester).

Project Information

The project was created on Nov 13, 2011.

Labels: Academic, Python, Workflow