Version 0.3.9 can map 128x128 paired-end Illumina reads. The updated version with OpenMP was released on Nov 28. The source archive first released on Nov 28 was incorrect; the correct source has since been released, so please download it again.

Version 0.3.6 fixes a defect that caused an error when reading the nucleotide 'n' in the reference.

Version 0.3.5 adds 3 enhancements (Issues 42, 43 and 44): filtering reads with a given number of 'N', outputting mapped reads in fastq format, and a larger maximum number of alignments per read.

PerM

PerM is a software package designed to perform highly efficient genome-scale alignments for hundreds of millions of short reads produced by the ABI SOLiD and Illumina sequencing platforms. Today PerM is capable of providing full sensitivity for alignments within 4 mismatches for 50bp SOLiD reads and 9 mismatches for 100bp Illumina reads.

Usage

The reference sequence(s) can be whole genomes with multiple chromosomes, the transcriptome, or even millions of reads in fasta format, with records separated by '>'. The reads can be in fasta, fastq, or csfasta + QUAL format, or in fastq for SOLiD reads. PerM can output alignments in our mapping format or in SAM format, and that output can be further processed by ComB, SAMtools, the RseqFlow pipeline and Galaxy's *test* server. Check the manual for more detail.

Algorithm and Performance

With its special periodic spaced seeds, PerM can be fully sensitive to four mismatches and highly sensitive to higher numbers of mismatches. This seed matching method has speed advantages for longer reads (although currently limited to 64bp), for non-mappable reads (because the number of shifts and checks is fixed), and in genome-scale mapping, owing to the high seed weight. PerM maps about 37 million reads per CPU hour while remaining fully sensitive to 3 mismatches and highly sensitive to more than 3 mismatches for 50bp SOLiD reads. PerM can build the reference index in parallel; it takes half an hour to build the human genome index with 16 CPUs and 14 GB of memory.

SNP Calling
Splice Junctions Detection
System Requirements

PerM uses 4.5 bytes of memory per base to index the reference genome. Memory usage does not depend on the number of reads. Thus PerM requires 2 GB of memory to map reads to the human transcriptome (400 M bases) and 14 GB to map reads to the human genome (3 G bases). Multiple read sets can be mapped simultaneously on multiple CPUs (cores), using OpenMP to look up the shared in-memory index. Users can run iPerM, our wrapper, on a computer with less memory, or use qPerM to map one read set in parallel.

Citation

Please cite our publication in the journal Bioinformatics: Chen Y, Souaiaia T, Chen T. PerM: Efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics, 2009, 25 (19): 2514-2521.

Development Team

This tool was developed by Ting Chen's group at the Center of Excellence in Genomic Sciences, University of Southern California. Please email Yangho Chen (yanghoch at usc.edu) so I can add you to the PerM mailing list for new updates. All suggestions are welcome.
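
As a rough check on the memory figures given under System Requirements, the sketch below multiplies the stated 4.5 bytes per base by the reference size. The function name `index_memory_gb` and the use of decimal gigabytes (10^9 bytes) are assumptions made for illustration; they are not part of PerM.

```python
# Index cost per reference base, as stated under System Requirements.
BYTES_PER_BASE = 4.5


def index_memory_gb(num_bases: float, bytes_per_base: float = BYTES_PER_BASE) -> float:
    """Estimate index size in decimal gigabytes (assumed: 1 GB = 10**9 bytes)."""
    return num_bases * bytes_per_base / 1e9


if __name__ == "__main__":
    # Reference sizes are the approximate ones quoted in the text.
    references = [("human transcriptome", 400e6), ("human genome", 3e9)]
    for name, bases in references:
        print(f"{name}: about {index_memory_gb(bases):.1f} GB")
```

The estimates come to 1.8 GB and 13.5 GB, which round up to the 2 GB and 14 GB quoted in the text. Because the index size depends only on the reference, the same estimate applies regardless of how many reads are mapped.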