What is InterProScan?
InterPro is a database that integrates predictive information about protein function from a number of partner resources, giving an overview of the families that a protein belongs to and the domains and sites it contains.
Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the InterProScan software package to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against the signatures of all the required member databases, and the results are output in a variety of formats.
Versions are available for:
- 32-bit Linux
- 64-bit Linux
There are no versions planned for Windows or Apple (Mac OS X) operating systems, because of constraints in the various third-party binaries that InterProScan runs.
New in Release Candidate 5
- Synchronized with InterPro version 41.0
- Gene3D - updated to version 3.5.0
- PIRSF - updated to version 2.83
- PROSITE profiles and patterns - updated to version 20.89
- Support for job distribution on LSF clusters (v8 or earlier)
- Amino acid sequences that contain translation stops (asterisks) are not considered valid input and will cause InterProScan to exit. If you have asterisks in your sequences as a consequence of using a coding sequence prediction algorithm, we suggest that you either:
- i) use InterProScan's natively provided ORF translation facility (based on getorf), which gives you the added advantage of being able to map your predictions back to the original nucleotide sequence, or
- ii) split your peptides into separate FASTA sequences, with unique headers, so that the asterisks are removed and each subsequence is uniquely identified.
Asterisks in a sequence cause some of InterProScan's member database applications to behave unpredictably, and their results are not guaranteed to be correct, which is why InterProScan 5 fails when it encounters them.
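Option ii) can be done with a short script before submission. The following is a minimal sketch in Python and is not part of InterProScan: the function names read_fasta and split_at_stops and the "_partN" header suffix are illustrative choices, and the sketch assumes that the identifiers in the input file are already unique.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_at_stops(records, stop="*"):
    """Split each sequence at stop characters and give each piece its own header.

    The "_partN" suffix is a hypothetical naming scheme chosen for this sketch.
    """
    for header, seq in records:
        # Drop empty pieces produced by leading, trailing or repeated stops.
        pieces = [piece for piece in seq.split(stop) if piece]
        if len(pieces) == 1:
            # Nothing to split (at most a trailing stop): keep the header as-is.
            yield header, pieces[0]
            continue
        # Keep the original identifier (first word) so pieces can be traced back.
        parts = header.split(None, 1)
        seq_id = parts[0] if parts else "seq"
        description = parts[1] if len(parts) > 1 else ""
        for index, piece in enumerate(pieces, start=1):
            yield f"{seq_id}_part{index} {description}".strip(), piece


if __name__ == "__main__":
    example = [">orf1 predicted CDS", "MKT*AVL", "LG*", ">orf2", "MSS"]
    for header, seq in split_at_stops(read_fasta(example)):
        print(f">{header}\n{seq}")
```

Each output record contains no asterisks, and the suffix records which piece of the original peptide it came from. Note that, unlike option i), this approach does not preserve the coordinates needed to map matches back to the nucleotide sequence.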
How is InterProScan 5 different to InterProScan 4?
InterProScan 5 differs from InterProScan v4.x in the following ways:
- New analysis type: Phobius for transmembrane and signal peptide prediction
- New feature: ability to map InterPro results back to the original nucleotide sequences that were submitted
- New feature: option to look up biological pathways that the protein is potentially involved in
- New output formats: "IMPACT" XML format and GFF3.0
- Improved graphical (HTML and SVG) representations of the protein matches
LSF Cluster users
InterProScan 5 allows components of the analysis to be farmed out to an LSF cluster. Full details can be found in "Running on an LSF Cluster".