MultiStringBWTCython Methods
The construction functions in this library use the MSBWT construction algorithm described by Bauer et al. in "Lightweight BWT construction for very large string collections". This is a "column-wise" approach that requires all strings (reads) to be of uniform length. The BWT can be built in either the Byte BWT or the RLE BWT format, and inputs can be FASTQ, FASTA, or BAM files, or Python strings. For custom builds, refer to the library's functions for examples of calling the construction APIs.
Recommendations:
- Format: FASTQ, FASTA, or Python strings
- Input type: Illumina; short, low-error reads
- Output: RLE - significantly reduced disk I/O
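To make the two output formats concrete, here is a minimal pure-Python sketch. It is not the library's implementation and not the column-wise algorithm of Bauer et al.; it builds a multi-string BWT naively by sorting rotations, then run-length encodes it. The function names `naive_msbwt` and `run_length_encode` are hypothetical, and the `$` terminator and the tie-break on read index are assumptions made for illustration.

```python
from itertools import groupby


def naive_msbwt(reads):
    """Naive reference construction (illustrative only, quadratic memory).

    Appends a '$' terminator to each read, sorts every rotation of every
    read, and returns the last character of each sorted rotation.
    """
    rotations = []
    for index, read in enumerate(reads):
        text = read + "$"
        for i in range(len(text)):
            # Assumption: identical rotations are ordered by read index.
            rotations.append((text[i:] + text[:i], index))
    rotations.sort()
    return "".join(rotation[-1] for rotation, _ in rotations)


def run_length_encode(bwt):
    """Collapse each run of identical symbols into a (symbol, length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bwt)]


# Three identical short reads stand in for a redundant read set.
reads = ["ACGT", "ACGT", "ACGT"]
bwt = naive_msbwt(reads)
runs = run_length_encode(bwt)
print(bwt)   # one symbol per position, as in a byte-per-symbol layout
print(runs)  # one entry per run, as in a run-length layout
```

In this toy input the 15-symbol BWT collapses into 5 runs. Redundant, low-error read sets tend to produce long runs in the BWT, which is the intuition behind recommending the RLE output for reduced disk I/O.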
MultimergeCython Methods
The construction functions in this library use the MSBWT construction algorithm described by Holt and McMillan in "Merging of Multi-String BWTs with Applications". This algorithm takes a divide-and-conquer approach: it builds many small MSBWTs from subsets of the reads and merges them together. This method builds only the Byte BWT format. Inputs can be FASTQ or FASTA files, or Python strings. For custom builds, refer to the library's functions for examples of calling the construction APIs.
Recommendations:
- Format: FASTQ, FASTA, or Python strings
- Input type: PacBio; long, high-error reads
- Output: Byte BWT only
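The merge step can be sketched in pure Python as follows. This is a simplified illustration of the iterative interleave idea behind merging two BWTs, not the library's Cython code: the function name `merge_msbwt` is hypothetical, the BWTs are plain strings with `$` as the terminator, and symbols are assumed to sort in ASCII order.

```python
def merge_msbwt(bwt0, bwt1):
    """Merge two multi-string BWTs by refining an interleave of their rows.

    interleave[i] records which input (0 or 1) supplies row i of the merged
    BWT. Each pass re-buckets the source bits by the symbol they emit, and
    the loop stops once the interleave no longer changes.
    """
    sources = (bwt0, bwt1)
    alphabet = sorted(set(bwt0) | set(bwt1))
    # Start with all rows of bwt0 ahead of all rows of bwt1.
    interleave = [0] * len(bwt0) + [1] * len(bwt1)

    while True:
        cursor = [0, 0]
        buckets = {symbol: [] for symbol in alphabet}
        for source in interleave:
            symbol = sources[source][cursor[source]]
            cursor[source] += 1
            # Stable bucketing keeps the current relative order per symbol.
            buckets[symbol].append(source)
        updated = [s for symbol in alphabet for s in buckets[symbol]]
        if updated == interleave:
            break
        interleave = updated

    # Read the symbols out of each input in interleave order.
    cursor = [0, 0]
    merged = []
    for source in interleave:
        merged.append(sources[source][cursor[source]])
        cursor[source] += 1
    return "".join(merged)


# "AC$CA" is the BWT of the single read "ACCA$"; "AAAC$" is that of "CAAA$".
print(merge_msbwt("AC$CA", "AAAC$"))  # AACAAC$C$A
```

Each pass touches every symbol of both inputs once, and the number of passes grows with the length of the longest match shared between the two read sets. A full build in this style would apply the merge repeatedly to the many small MSBWTs described above.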