Public Software
Neural Network Predictor
(manuscript in preparation)
Neural Network Predictor is a machine learning method for proteotypic peptide prediction based on physico-chemical properties and protein abundance.
O Serang*, JW Froehlich*, J Muntel, G McDowell, H Steen, RS Lee^, JA Steen^ (2013) SweetSEQer: simple de novo filtering and annotation of glycoconjugate mass spectra Molecular & Cellular Proteomics (in press) [publication at PubMed]
SweetSEQer is a method for simple de novo identification and sequencing of glycoconjugate spectra. It generates spectral annotations and graphical annotation figures.
Nonparametric Cutout Index (npCI)
O Serang, J Paulo, JA Steen, and H Steen (2013) A non-parametric cutout index (npCI) for robust evaluation of identified proteins Molecular & Cellular Proteomics (in press) [publication at PubMed]
The nonparametric cutout index (npCI) is a simple yet robust measure for reliably evaluating protein identifications. This metric can be used with multiple existing peptide scores and protein identification methods to determine the appropriate protein-level false discovery rate threshold and to evaluate different strategies for merging evidence across replicate experiments.
BY Renard, B Xu, M Kirchner, A Tzur, S Korten, NW Brattig, H Steen, FA Hamprecht (2011) Overcoming Species Boundaries in Peptide Identification with BICEPS (submitted)
BICEPS is an open source tool for the error-tolerant identification of peptides based on a statistical regularization scheme. It balances the possible improvements in peptide-spectrum matches gained by allowing substitutions against the increased risk of false positives. BICEPS can identify peptides containing two or more substitutions, as occur, for example, in cross-species searches.
MS3toMimicS2 MGF Conversion Script
Timm W, Ozlu N, Steen JJ, Steen H (2010) The effect of high accuracy precursor masses on phosphopeptide identification from MS3 spectra Analytical Chemistry, in press [10.1021/ac100118u]
Perl script to convert an MGF containing MS3 spectra to a MimicS2 MGF by replacing all precursor masses with the more accurate precursor masses from the survey scan.
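The conversion described above can be sketched as follows. This is a minimal Python illustration, not the Perl script itself: the function names and the `survey_mz_by_title` lookup (spectrum TITLE to survey-scan precursor m/z) are hypothetical, and how the survey-scan masses are obtained is outside the sketch.

```python
def _rewrite_block(block, survey_mz_by_title):
    """Replace the PEPMASS value of one BEGIN IONS ... END IONS block."""
    title = None
    for line in block:
        if line.startswith("TITLE="):
            title = line[len("TITLE="):].strip()
    if title not in survey_mz_by_title:
        return block  # no survey-scan mass known: leave the spectrum untouched
    new_mz = survey_mz_by_title[title]
    rewritten = []
    for line in block:
        if line.startswith("PEPMASS="):
            fields = line[len("PEPMASS="):].split()
            # Keep an optional precursor intensity that follows the m/z value.
            line = "PEPMASS=" + " ".join([repr(new_mz)] + fields[1:])
        rewritten.append(line)
    return rewritten


def replace_precursor_masses(mgf_text, survey_mz_by_title):
    """Return MGF text in which each spectrum carries the survey-scan mass.

    survey_mz_by_title is an assumed lookup from TITLE to precursor m/z.
    """
    out, block, in_block = [], [], False
    for line in mgf_text.splitlines():
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            in_block, block = True, [line]
        elif stripped == "END IONS" and in_block:
            block.append(line)
            out.extend(_rewrite_block(block, survey_mz_by_title))
            in_block = False
        elif in_block:
            block.append(line)
        else:
            out.append(line)  # lines outside any spectrum are copied as-is
    if in_block:
        out.extend(block)  # unterminated block: copy it through unchanged
    return "\n".join(out)
```

Spectra whose title is missing from the lookup are passed through unchanged, so the output always contains every input spectrum.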
curveFDP: Estimation of confidence levels for peptide identifications
Renard BY, Timm W, Kirchner M, Hamprecht FA, Steen H Estimating the Confidence of Peptide Identifications without the Need for Decoy Databases. Analytical Chemistry, in press.
The curveFDP R package is available for local installation on Windows (.zip) and Unix-alike systems (.tar.gz). The function curveFDR(scores) fits a Gaussian mixture model to a score distribution of peptide identifications and thereby allows the estimation of confidence levels based on the false discovery proportion. Use "?curveFDP" for a detailed description after installing and loading the package.
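The idea of estimating a false discovery proportion from a fitted score mixture can be illustrated with a short Python sketch. It is not the curveFDP implementation: it assumes exactly two Gaussian components, assumes that the lower-scoring component represents incorrect identifications, and uses hypothetical function names and synthetic scores.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_sf(x, mean, sd):
    """Upper tail probability P(X > x) of a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def fit_two_gaussians(scores, iterations=100):
    """Fit a two-component Gaussian mixture by expectation-maximization.

    Returns [(weight, mean, sd), (weight, mean, sd)], ordered by mean.
    """
    s = sorted(scores)
    n = len(s)
    overall = sum(s) / n
    sd0 = math.sqrt(sum((x - overall) ** 2 for x in s) / n) or 1.0
    means = [s[n // 4], s[(3 * n) // 4]]  # start from the quartiles
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of component 1 for every score.
        resp = []
        for x in s:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp.append(p1 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of both components.
        for k in (0, 1):
            r = [ri if k == 1 else 1.0 - ri for ri in resp]
            rk = sum(r)
            if rk < 1e-9:
                continue
            weights[k] = rk / n
            means[k] = sum(ri * x for ri, x in zip(r, s)) / rk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, s)) / rk
            sds[k] = max(math.sqrt(var), 1e-6)
    order = sorted((0, 1), key=lambda k: means[k])
    return [(weights[k], means[k], sds[k]) for k in order]


def estimated_fdp(threshold, low, high):
    """Estimated share of incorrect identifications scoring above threshold."""
    false_tail = low[0] * normal_sf(threshold, low[1], low[2])
    true_tail = high[0] * normal_sf(threshold, high[1], high[2])
    total = false_tail + true_tail
    return false_tail / total if total > 0 else 0.0


# Usage on synthetic scores (illustrative data, not from the publication).
rng = random.Random(0)
scores = [rng.gauss(20, 5) for _ in range(500)]
scores += [rng.gauss(60, 8) for _ in range(500)]
low, high = fit_two_gaussians(scores)
fdp_at_50 = estimated_fdp(50.0, low, high)
fdp_at_25 = estimated_fdp(25.0, low, high)
```

Raising the score threshold shrinks the tail of the lower component faster than that of the upper one, so the estimated proportion falls as the threshold increases.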
MGFp: An open Mascot Generic Format (MGF) parser library implementation
Kirchner M, Steen JAJ, Hamprecht FA, Steen H MGFp: An open Mascot Generic Format (MGF) parser library implementation Journal of Proteome Research (JPR), in press.
MGFp implements a formal grammar based on the MGF format description using the bison/flex parser generators. It provides an efficient, intuitive C++ library interface, which can easily be integrated into existing C++ projects, adapted to managed code environments (e.g. .NET) or bridged to scripting languages such as Python or R. The library is portable and has been tested on Linux, MacOS and MS Windows platforms.
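For readers unfamiliar with the format, the structure that such a parser has to recognize can be sketched in a few lines of Python. This is a simplified line-based illustration, not the MGFp grammar or its C++ interface; `parse_mgf` and the returned dictionary layout are hypothetical, and global parameters outside spectrum blocks are ignored.

```python
def parse_mgf(text):
    """Parse MGF text into a list of spectra.

    Each spectrum is a dict with 'params' (KEY -> string value) and
    'peaks' (list of (mz, intensity) tuples).
    """
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with # ; ! / are skipped as comments.
        if not line or line[0] in "#;!/":
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is None:
            continue  # global parameters are ignored in this sketch
        elif "=" in line:
            key, value = line.split("=", 1)
            current["params"][key.strip().upper()] = value.strip()
        else:
            fields = line.split()
            mz = float(fields[0])
            # float() accepts integers and scientific notation alike.
            intensity = float(fields[1]) if len(fields) > 1 else 0.0
            current["peaks"].append((mz, intensity))
    return spectra
```

A grammar-based parser, unlike this sketch, can report precisely where malformed input deviates from the format description.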
Fractional Mass Filtering Data & R Scripts
Kirchner M, Timm W, Fong P, Wangemann P, Steen H (2009). Nonlinear Classification for On-The-Fly Fractional Mass Filtering and Targeted Precursor Fragmentation in Mass Spectrometry Experiments Bioinformatics, in press.
This package contains all scripts and datasets that were used in order to generate the results presented in the aforementioned publication. Please see the README file for a detailed description of the package contents.
Profile Similarity Screening: MATLAB PSS toolbox
Kirchner M*, Renard BY*, Köthe U, Pappin DJ, Hamprecht FA, Steen H*, Steen JAJ* (2009). Computational Protein Profile Similarity Screening for Quantitative Mass Spectrometry Experiments. Bioinformatics, Advance Access [10.1093/bioinformatics/btp607]
The protein Profile Similarity Screening (PSS) MATLAB toolbox implements a compositional hierarchical clustering procedure with top-down split significance testing and a Mallows distance-based protein level similarity inference.
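As background, the Mallows distance between two empirical distributions of equal sample size reduces to a comparison of sorted values. The Python sketch below shows that special case only; the one-dimensional setting, the default order p = 2 and the function name are assumptions, and the toolbox's own MATLAB formulation may differ.

```python
def mallows_distance(xs, ys, p=2):
    """Mallows distance of order p between two equal-sized 1-D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # For equal sample sizes the optimal coupling pairs the sorted values.
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)
```

Because only the sorted values matter, two samples that are permutations of each other have distance zero.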
ms2preproc: Computational Preprocessing of MS/MS Spectra
Renard BY, Kirchner M, Monigatti F, Ivanov AR, Rappsilber J, Winter D, Steen JAJ, Hamprecht FA, Steen H When Less Can Yield More - Computational Preprocessing of MS/MS Spectra for Peptide Identification. PROTEOMICS (2009) 9(21): 4978-4984 [10.1002/pmic.200900326]
Windows installation package and Linux executables for the computational preprocessing of MS/MS spectra. This includes the 'Top X', 'Top X in Y regions' and 'Top X in m/z windows of size Z' approaches. The zip file also contains an R function to compute the local false discovery rate based on the PSPEP approach (WH Tang, IV Shilov, SL Seymour. Nonlinear fitting method for determining local false discovery rates from decoy database searches. J Proteome Res. 2008 Sep;7(9):3661-7).
[Update 2010-01-19] The MGF parser now supports scientific notation.
[Update 2010-01-13] Updated the MGF parser to support integer formats in all double fields if appropriate.
[Update 2010-01-12] Updated the MGF parser to support integer values in ion series (such as reported by e.g. MaxQuant).
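The three peak selection approaches named in the ms2preproc description can be sketched in Python as follows. This is an illustration under stated assumptions, not the distributed executables: peaks are (m/z, intensity) tuples, regions split the observed m/z range into Y equal parts, windows are fixed, non-overlapping and aligned at m/z 0, and all function names are hypothetical.

```python
def top_x(peaks, x):
    """Keep the x most intense peaks, returned in ascending m/z order."""
    kept = sorted(peaks, key=lambda p: p[1], reverse=True)[:x]
    return sorted(kept)


def _top_x_per_bin(peaks, x, bin_index):
    """Group peaks by bin_index(mz) and apply top_x within each group."""
    bins = {}
    for mz, intensity in peaks:
        bins.setdefault(bin_index(mz), []).append((mz, intensity))
    kept = []
    for index in sorted(bins):
        kept.extend(top_x(bins[index], x))
    return kept


def top_x_in_regions(peaks, x, y):
    """Split the observed m/z range into y equal regions; keep x per region."""
    if not peaks:
        return []
    lo = min(mz for mz, _ in peaks)
    hi = max(mz for mz, _ in peaks)
    width = (hi - lo) / y or 1.0  # guard against a zero-width range
    # The peak at the upper bound is assigned to the last region.
    return _top_x_per_bin(peaks, x, lambda mz: min(int((mz - lo) / width), y - 1))


def top_x_in_windows(peaks, x, window_size):
    """Keep the x most intense peaks in each m/z window of width window_size."""
    return _top_x_per_bin(peaks, x, lambda mz: int(mz // window_size))
```

The region-based variant adapts to the m/z range of each spectrum, whereas the window-based variant uses the same boundaries for every spectrum.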