
glycan-pipeline
About HS-SEQ
HS-SEQ is the first de novo sequencing algorithm designed specifically for identifying heparan sulfate (HS) isomers using high-resolution tandem mass spectrometry. It provides product-grade identification performance within seconds to minutes.
Principle
HS-SEQ deals with ambiguous HS tandem MS data. Because HS consists of repeating disaccharide units and its sulfate groups are labile, peaks in the tandem MS spectra are easily assigned to multiple fragment structures, leading to ambiguous evidence for sulfation localization. HS-SEQ constructs a graph from the most confident peak interpretations and expands it by repeatedly integrating the most confident of the remaining interpretations (Bayesian inference). The graph can be converted into a modification localization profile, the "modification distribution", which allows the prediction of sulfation/acetylation sites. The computation time depends only on the number of peak interpretations and is independent of the size of the total candidate space.
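To make the idea of "most confident first" and of a modification distribution more concrete, here is a toy sketch in Python. It is not the HS-SEQ algorithm and performs no Bayesian inference. It assumes a deliberately simplified, hypothetical model in which each peak interpretation is a triple (prefix_len, n_mods, confidence), meaning "the first prefix_len candidate sites carry n_mods modifications in total". Interpretations are accepted greedily in order of confidence if they do not contradict those already accepted, and the accepted counts are then spread evenly over the sites between them. All names and the data model are illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical interpretation: (prefix_len, n_mods, confidence).
Interpretation = Tuple[int, int, float]
Count = Tuple[int, int]  # (prefix_len, n_mods)


def consistent(a: Count, b: Count) -> bool:
    """Two cumulative counts agree if the sites between them can hold the difference."""
    (p1, n1), (p2, n2) = sorted([a, b])
    if p1 == p2:
        return n1 == n2
    return 0 <= n2 - n1 <= p2 - p1


def modification_distribution(
    n_sites: int, total_mods: int, interpretations: List[Interpretation]
) -> List[float]:
    """Return, for each site, the expected number of modifications (toy model)."""
    # Boundary counts known from the composition alone.
    accepted: List[Count] = [(0, 0), (n_sites, total_mods)]
    # Visit interpretations from most to least confident.
    for p, n, _conf in sorted(interpretations, key=lambda t: -t[2]):
        if all(consistent((p, n), other) for other in accepted):
            accepted.append((p, n))
    accepted = sorted(set(accepted))
    dist = [0.0] * n_sites
    # Spread the modifications between neighbouring counts evenly over the sites.
    for (p1, n1), (p2, n2) in zip(accepted, accepted[1:]):
        for site in range(p1, p2):
            dist[site] = (n2 - n1) / (p2 - p1)
    return dist
```

In this toy model the work done grows with the number of interpretations, not with the number of possible site assignments, which mirrors the scaling claim above in spirit only.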
Performance Demonstration
[Figures: test data | modification distribution (dp15 for demo)]
Downloads
Program (v1.0.2): https://glycan-pipeline.googlecode.com/svn/trunk/GAG/bin/hsseq_v1.0.2.zip
Usage
Step 0: Deisotoping/deconvolution (generating monoisotopic peak list, for Solarix only, to be uploaded)
simplefinder.exe -s spectrum_file -p 245.000 -z 4- -o output_file -e 1
Step 1: Peak interpretation (Optional)
librarymatch.exe -c composition_file -m monoisotopic_peak_list -o output_file -e 2
- type "librarymatch.exe -h" for help.
- composition_file example can be found from directory "structure"
- monoisotopic_peak_list example can be found from directory "data" in the format: m/z intensity z
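For readers who want to inspect a monoisotopic peak list before passing it to the tools, the following Python sketch parses the "m/z intensity z" format mentioned above. It assumes one peak per line with whitespace-separated columns, which is not confirmed here; check the files in the "data" directory. The charge column is kept as text because its sign convention is not documented in this README. The names Peak and parse_peak_list are illustrative only.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    mz: float
    intensity: float
    charge: str  # kept as text; sign convention is an open assumption


def parse_peak_list(text: str) -> List[Peak]:
    """Parse lines of the form 'm/z intensity z' into Peak records."""
    peaks: List[Peak] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            peaks.append(Peak(float(fields[0]), float(fields[1]), fields[2]))
        except ValueError:
            continue  # skip non-numeric rows such as a header
    return peaks
```

A file can be read with `parse_peak_list(open(path).read())`, where `path` points to a peak list file.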
Step 2: Sequencing
hsseq.exe -c composition_file -m monoisotopic_peak_list -o output_file -e 2
- type "hsseq.exe -h" for help.
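When sequencing several spectra, it can help to assemble the command line programmatically. The Python sketch below only builds the argument list from the flags shown above (-c, -m, -o, -e); it does not run anything. The function name and its parameters are hypothetical, and the meaning of -e is not described in this README, so the value is passed through unchanged. The -o and -e arguments are treated as optional on the assumption that the program accepts calls without them.

```python
from typing import List, Optional


def build_hsseq_command(
    composition_file: str,
    peak_list_file: str,
    output_file: Optional[str] = None,
    e: Optional[int] = None,
    exe: str = "hsseq.exe",
) -> List[str]:
    """Assemble an hsseq argument list from the documented flags."""
    cmd = [exe, "-c", composition_file, "-m", peak_list_file]
    if output_file is not None:
        cmd += ["-o", output_file]
    if e is not None:
        cmd += ["-e", str(e)]  # passed through as-is; see "hsseq.exe -h"
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the Python standard library on a machine where the executable is available.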
Example
hsseq.exe -c structure\arixtra.gl -m data\arixtra_4_NETD_mono_list.txt
https://glycan-pipeline.googlecode.com/svn/trunk/GAG/image/hs-seq-output.JPG
Contact
The program was developed by Han Hu (hh1985@bu.edu).
PIs: Prof. Joseph Zaia (jzaia@bu.edu) and Prof. Yu (Brandon) Xia (brandon.xia@mcgill.ca)
Reference
Hu, H. et al. A Computational Framework for Heparan Sulfate Sequencing Using High-resolution Tandem Mass Spectra. Mol Cell Proteomics mcp.M114.039560 (2014). doi:10.1074/mcp.M114.039560 http://www.mcponline.org/content/early/2014/06/12/mcp.M114.039560.abstract
Project Information
The project was created on Jan 17, 2012.
- License: MIT License
- svn-based source control
Labels:
Glycosaminoglycan
MassSpectrometry
Bioinformatics
HeparanSulfate
DeNovoSequencing
Glycomics