SplitBam  
splits bam by chromosomes
bioinformatics, ngs, bam, sam
Updated Apr 24, 2013 by plindenb...@gmail.com

splitbam splits a BAM by chromosomes.

Using the reference sequence dictionary (*.dict), it can also create an empty BAM file for each chromosome that has no SAM record. A pair of 'mock' SAM records can be added to those empty BAMs to prevent some tools (such as samtools) from crashing.
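The following is a minimal sketch, not splitbam's actual code, of how the chromosomes needing an empty BAM could be derived from a sequence dictionary. It assumes the *.dict file uses the usual SAM-header layout, with one tab-separated @SQ line per sequence carrying an SN: field. The function names and the example data are hypothetical.

```python
def sequence_names(dict_lines):
    """Return the SN: values of the @SQ lines, in dictionary order."""
    names = []
    for line in dict_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "@SQ":
            continue  # skip @HD and any other header lines
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def chromosomes_without_reads(dict_lines, seen):
    """List dictionary sequences for which no record was seen (hypothetical helper)."""
    return [name for name in sequence_names(dict_lines) if name not in seen]


# Illustrative dictionary content, not taken from a real reference.
example_dict = [
    "@HD\tVN:1.4\tSO:unsorted\n",
    "@SQ\tSN:1\tLN:249250621\n",
    "@SQ\tSN:2\tLN:243199373\n",
    "@SQ\tSN:MT\tLN:16569\n",
]

# Suppose only chromosome "1" received reads.
print(chromosomes_without_reads(example_dict, seen={"1"}))  # ['2', 'MT']
```

In this sketch, each name returned would correspond to one empty BAM when '-e' is set.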

Usage

java -jar splitbam.jar -p OUT/__CHROM__/__CHROM__.bam -R ref.fasta (bam|sam|stdin)

Options

  • -h help; This screen.
  • -R (indexed reference file) REQUIRED.
  • -u (unmapped chromosome name) default: Unmapped
  • -e | --empty : generate an empty BAM for each chromosome that has no mapped read
  • -m | --mock : with option '-e', add a mock pair of SAM records to each empty BAM
  • -p (output file/bam pattern) REQUIRED. MUST contain __CHROM__ and end with .bam
  • -s assume input is sorted.
  • -x | --index create index.
  • -t | --tmp (dir) tmp file directory
  • -G (file) chrom-group file (see below)
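As an illustration of the '-p' constraints listed above, the sketch below checks a pattern and expands it for one chromosome. It assumes that every occurrence of __CHROM__ is substituted, which is suggested by the usage line but not confirmed here. The function name `output_path` is hypothetical.

```python
CHROM_TOKEN = "__CHROM__"


def output_path(pattern, name):
    """Validate an output pattern and substitute the chromosome (or group) name."""
    if CHROM_TOKEN not in pattern:
        raise ValueError("pattern must contain " + CHROM_TOKEN)
    if not pattern.endswith(".bam"):
        raise ValueError("pattern must end with .bam")
    # Assumption: all occurrences of the token are replaced.
    return pattern.replace(CHROM_TOKEN, name)


print(output_path("OUT/__CHROM__/__CHROM__.bam", "chr1"))      # OUT/chr1/chr1.bam
print(output_path("OUT/__CHROM__/__CHROM__.bam", "Unmapped"))  # OUT/Unmapped/Unmapped.bam
```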

Chromosome group

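By default, splitbam produces one file per chromosome, but a group file can be used to write several chromosomes to the same file. Each line is tab-delimited: a group name followed by the chromosomes in that group. The format is: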

(group-name1)\tchrom1\tchrom2\tchrom3...\n
(group-name2)\tchrom11\tchrom12\tchrom22...\n

Chromosomes that are not listed in the group file are each placed in their own group by default.

Example:

XY	X	Y
GL_CHROM	GL000207.1	GL000226.1	GL000229.1	GL000231.1	GL000210.1	GL000239.1	GL000235.1	GL000201.1	GL000247.1	GL000245.1	GL000197.1	GL000203.1	GL000246.1	GL000249.1	GL000196.1	GL000248.1	GL000244.1	GL000238.1	GL000202.1	GL000234.1	GL000232.1	GL000206.1	GL000240.1	GL000236.1	GL000241.1	GL000243.1	GL000242.1	GL000230.1	GL000237.1	GL000233.1	GL000204.1	GL000198.1	GL000208.1	GL000191.1	GL000227.1	GL000228.1	GL000214.1	GL000221.1	GL000209.1	GL000218.1	GL000220.1	GL000213.1	GL000211.1	GL000199.1	GL000217.1	GL000216.1	GL000215.1	GL000205.1	GL000219.1	GL000224.1	GL000223.1	GL000195.1	GL000212.1	GL000222.1	GL000200.1	GL000193.1	GL000194.1	GL000225.1	GL000192.1
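The group-file format described above can be sketched as a small parser. This is only an illustration under stated assumptions, not the tool's implementation: it assumes blank lines are skipped and that a chromosome absent from the file becomes a group named after itself. The function name `parse_groups` and the sample input are hypothetical.

```python
def parse_groups(lines, all_chromosomes):
    """Map each chromosome to its group name.

    lines: tab-delimited group-file lines (group name, then chromosomes).
    all_chromosomes: every chromosome name from the reference dictionary.
    """
    chrom_to_group = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0]:
            continue  # assumption: blank lines are ignored
        group, members = fields[0], fields[1:]
        for chrom in members:
            chrom_to_group[chrom] = group
    # Chromosomes not listed in the file fall back to their own group.
    for chrom in all_chromosomes:
        chrom_to_group.setdefault(chrom, chrom)
    return chrom_to_group


groups = parse_groups(["XY\tX\tY\n"], ["1", "X", "Y"])
print(groups)  # {'X': 'XY', 'Y': 'XY', '1': '1'}
```

With this mapping, reads on X and Y would go to the file named for group XY, while chromosome 1 keeps its own file.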

Compilation

$ cd jvarkit

Edit the file build.properties if needed:

picard.jar=/path/to/picard-1.xx.jar
sam.jar=/path/to/sam-1.xx.jar

Then invoke ant:

$ ant splitbam

splitbam:
    [mkdir] Created dir: tmp
    [javac] Compiling 1 source file to tmp
      [jar] Building jar: dist/splitbam.jar
   [delete] Deleting directory tmp

BUILD SUCCESSFUL
Total time: 1 second