TestData  

Updated Sep 17, 2010 by matthew....@gmail.com

This page lists the data sets used in the paper. PerM has also been tested on many other public and private data sets. For more public data sets, check the NCBI Short Read Archive and the Applied Biosystems or Illumina websites.

Reference Data Sets

  • The anthracis.fasta file can be downloaded from the SOCS download page.
  • Both repeat-masked and non-masked human genomes can be downloaded from UCSC's genome browser (Human Genome from UCSC). The file names are chromFaMasked.zip and chromFa.zip, respectively.

Read files

  • Illumina test data used in the revised paper: SRR001154.
  • SOLiD read test data used in the paper: ERR000455.

The SOLiD data downloaded from the NCBI SRA is in base (nucleotide) format. We translated it into color signals in CSFASTA format. Because the full data set is too large for many people to download, we provide the subset we used in our download section (ERR000455_5M.tgz).
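For readers who want to see what the base-to-color translation involves, here is a minimal sketch of standard SOLiD two-base encoding. It is not the script we used to build ERR000455_5M.tgz. The function names are hypothetical, and the leading primer base "T" is an assumption that should be checked against the actual library. Each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). Any transition that involves an unknown base is written as ".".

```python
# 2-bit codes for the standard SOLiD two-base (color space) encoding.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def bases_to_colors(read, primer="T"):
    """Translate a base-space read into a color-space string.

    The result starts with the primer base (assumed to be 'T' here),
    followed by one color per base in the read.
    """
    prev = primer.upper()
    colors = []
    for base in read.upper():
        if prev in BASE_CODE and base in BASE_CODE:
            # A color encodes the transition between two adjacent bases.
            colors.append(str(BASE_CODE[prev] ^ BASE_CODE[base]))
        else:
            # Unknown base (e.g. N): the color cannot be determined.
            colors.append(".")
        prev = base
    return primer.upper() + "".join(colors)


def to_csfasta_record(name, read, primer="T"):
    """Format one read as a two-line CSFASTA record (header, then colors)."""
    return ">" + name + "\n" + bases_to_colors(read, primer) + "\n"


if __name__ == "__main__":
    # Hypothetical read name and sequence, for illustration only.
    print(to_csfasta_record("example_read_1", "ACGT"), end="")
```

Because each color depends on two neighboring bases, a single unknown base produces two "." colors in the output.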

  • Real SOLiD reads from B. anthracis can be downloaded from SOCS's download page.
  • Simulated SOLiD or Solexa reads can be generated using the small program in the download section.

