Construction  
A summary of the various BWT construction methods included in the API
Updated Nov 19, 2014 by holt...@cs.unc.edu

MultiStringBWTCython Methods

The construction functions in this library use the MSBWT construction algorithm described by Bauer et al. in "Lightweight BWT construction for very large string collections". This is a "column-wise" approach that requires all strings (reads) to have the same length. The BWT can be built in either the Byte BWT or the RLE BWT format. Inputs can be in FASTQ, FASTA, BAM, or Python string formats. For custom builds, refer to the library's construction functions for examples of how to call the construction APIs.
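To make the input and output of construction concrete, here is a minimal sketch of a multi-string BWT built by naively sorting rotations. It is not the column-wise algorithm of Bauer et al. and it is not this library's API: the function name `naive_msbwt`, the `$` terminator, and the cyclic-rotation convention are assumptions chosen for illustration. It does enforce the uniform-length requirement described above.

```python
def naive_msbwt(reads, terminator="$"):
    """Build a multi-string BWT by sorting every rotation of every read.

    Illustrative only: memory use is quadratic in read length, so this is
    suitable for toy inputs. Assumes reads do not contain the terminator
    and that the terminator sorts before every read symbol.
    """
    if not reads:
        return ""

    # The column-wise method requires uniform read length; mirror that here.
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")

    rotations = []
    for read in reads:
        text = read + terminator
        # Collect every cyclic rotation of the terminated read.
        for i in range(len(text)):
            rotations.append(text[i:] + text[:i])

    rotations.sort()
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


print(naive_msbwt(["AC", "CA"]))  # CAC$A$
```

Passing reads of different lengths, for example `naive_msbwt(["AC", "CAT"])`, raises `ValueError`.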

Recommendations:

  • Format: FASTQ, FASTA, or Python strings
  • Input type: Illumina; short, low-error reads
  • Output: RLE, which significantly reduces disk I/O
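The preference for RLE output follows from the fact that a BWT of low-error reads tends to contain long runs of the same symbol. The sketch below shows the general idea of run-length encoding on a BWT string. The `(symbol, count)` pair representation and the function names are assumptions for illustration and do not describe the library's on-disk RLE format.

```python
from itertools import groupby


def run_length_encode(bwt):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, sum(1 for _ in run)) for symbol, run in groupby(bwt)]


def run_length_decode(runs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


bwt = "AAAACC$GGG"
runs = run_length_encode(bwt)
print(runs)  # [('A', 4), ('C', 2), ('$', 1), ('G', 3)]
```

Ten symbols are stored as four runs here; the saving grows with the length of the runs.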

MultimergeCython Methods

The construction functions in this library use the MSBWT construction algorithm described by Holt and McMillan in "Merging of Multi-String BWTs with Applications". This algorithm takes a divide-and-conquer approach: it builds many small MSBWTs from subsets of the reads and merges them together. The method builds only the Byte BWT format. Inputs can be in FASTQ, FASTA, or Python string formats. For custom builds, refer to the library's construction functions for examples of how to call the construction APIs.
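The merge step can be sketched as an iterative refinement of an interleave vector that records, for each position of the merged BWT, which input BWT supplies the symbol. The code below is a simplified illustration of that idea for two inputs, not the library's implementation. The name `merge_bwts` is hypothetical, and both inputs are assumed to be plain strings using the same alphabet and terminator.

```python
from collections import Counter


def merge_bwts(bwt0, bwt1):
    """Merge two multi-string BWTs by refining an interleave of source bits."""
    bwts = (bwt0, bwt1)

    # Start offset of each symbol's bucket in the merged first column.
    counts = Counter(bwt0) + Counter(bwt1)
    bucket_start = {}
    total = 0
    for symbol in sorted(counts):
        bucket_start[symbol] = total
        total += counts[symbol]

    # Initial guess: every row of bwt0 comes before every row of bwt1.
    interleave = [0] * len(bwt0) + [1] * len(bwt1)

    while True:
        next_free = dict(bucket_start)
        read_pos = [0, 0]
        refined = [0] * len(interleave)
        # Stable bucket pass: each row's source bit moves to the next free
        # slot of the bucket named by that row's BWT symbol.
        for source in interleave:
            symbol = bwts[source][read_pos[source]]
            read_pos[source] += 1
            refined[next_free[symbol]] = source
            next_free[symbol] += 1
        if refined == interleave:
            break  # A fixed point means the interleave is fully sorted.
        interleave = refined

    # Read symbols out of the two inputs in interleave order.
    read_pos = [0, 0]
    merged = []
    for source in interleave:
        merged.append(bwts[source][read_pos[source]])
        read_pos[source] += 1
    return "".join(merged)


# "C$A" and "AC$" are the rotation-sorted BWTs of the single reads AC and CA.
print(merge_bwts("C$A", "AC$"))  # CAC$A$
```

Each pass sorts the rows by one more symbol of context, so the number of passes grows with the longest shared prefix between suffixes of the two collections.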

Recommendations:

  • Format: FASTQ, FASTA, or Python strings
  • Input type: PacBio; long, high-error reads
  • Output: Byte BWT only