Mixture Model Quality Test Failure: Searched Entire Proteome + Decoys against 4 protein sample #228

whitleyo · 2017-11-11T17:28:56Z

whitleyo
Nov 11, 2017

Hello,

I'm currently testing MSFragger with a 4-protein dataset (4 glycoproteins known to be in the sample; nothing else should be in there), searching against a human proteome FASTA file (from UniProt) with reversed decoys appended. I was able to generate search results with MSFragger, but when I run Philosopher's implementation of PeptideProphet, I get warnings that the mixture model fails its quality test for charges 1+ to 7+.

Here's the command:

philosopher peptideprophet --database uniprot-reviewed_yes+taxonomy_9606_W_DECOYS_v3.fasta --nonparam --decoy DECOY_ --masswidth 2000 --decoyprobs 4ProteinMix_01_pickpeak_v3.pepXML

Here's the output:

file 1: C:\Users\newot\ms_fragger\MSFragger_20170103\4ProteinMix_01_pickpeak_v3.pepXML
processed altogether 6119 results
INFO: Results written to file: C:\Users\newot\ms_fragger\MSFragger_20170103\interact-4ProteinMix_01_pickpeak_v3.pep.xml

C:\Users\newot\ms_fragger\MSFragger_20170103\interact-4ProteinMix_01_pickpeak_v3.pep.xml
Building Commentz-Walter keyword tree...
Searching the tree...
Linking duplicate entries...
Printing results...

Using Decoy Label "DECOY_".
Decoy Probabilities will be reported.
Using non-parametric distributions
(X! Tandem)
init with X! Tandem trypsin
MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN

INFO: Processing standard MixtureModel ...
PeptideProphet (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 18 2+, 4707 3+, 1010 4+, 319 5+, 65 6+, and 0 7+ spectra.
Initialising statistical models ...
Found 2932 Decoys, and 3187 Non-Decoys
Iterations: .........10.........20.....
WARNING: Mixture model quality test failed for charge (1+).
WARNING: Mixture model quality test failed for charge (2+).
WARNING: Mixture model quality test failed for charge (3+).
WARNING: Mixture model quality test failed for charge (4+).
WARNING: Mixture model quality test failed for charge (5+).
WARNING: Mixture model quality test failed for charge (6+).
WARNING: Mixture model quality test failed for charge (7+).
model complete after 26 iterations

My guess is that most 'non-decoy' peptide matches here arise by chance, or from overlap in fragment masses with peptides from the 4 'correct' proteins (those actually in the sample). If so, the probability of getting a high MSFragger score for a 'non-decoy' peptide should be roughly the same as for a peptide from a decoy protein.
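
As a rough check on that guess, the counts in the log above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes each decoy hit stands in for one random target hit (the usual target-decoy assumption, not something PeptideProphet reports here), and the function name is hypothetical.

```python
def estimate_random_target_fraction(n_decoys: int, n_targets: int) -> float:
    """Fraction of target PSMs expected to be random matches.

    Assumption: decoys and random target matches occur at the same rate,
    so the decoy count approximates the number of random target hits.
    """
    if n_targets <= 0:
        raise ValueError("n_targets must be positive")
    return min(1.0, n_decoys / n_targets)


# Counts copied from the log line "Found 2932 Decoys, and 3187 Non-Decoys"
n_decoys, n_targets = 2932, 3187

random_fraction = estimate_random_target_fraction(n_decoys, n_targets)
excess_targets = n_targets - n_decoys  # target PSMs not explained by chance

print(f"estimated random fraction of target PSMs: {random_fraction:.3f}")
print(f"target PSMs in excess of decoys: {excess_targets}")
```

Under that assumption, about 92% of the non-decoy matches would be chance hits, leaving only around 255 PSMs as candidates for the 4 real proteins.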

Should Philosopher's PeptideProphet be able to identify the peptides from the tiny subset of non-decoy proteins (4) that should have very high scores?

Thanks,
Owen


prvst · 2017-11-11T20:05:07Z

prvst
Nov 11, 2017
Collaborator

Hey @whitleyo,

Could you give me a little more information about your experiment? It seems to me that you have a database with roughly 20k decoys and only 4 targets; is that correct?


whitleyo · 2017-11-11T20:54:43Z

whitleyo
Nov 11, 2017
Author

Hello,

I downloaded a FASTA file from this search:

http://www.uniprot.org/uniprot/?query=reviewed:yes%20taxonomy:9606

(reviewed and human), which includes about 20k sequences, and appended decoys via a Python script I wrote and tested. So the database has about 20k target sequences and 20k reversed decoys. The MS data come from a mixture of 4 glycoproteins (2 bovine, 2 human; see PDX006031 on PRIDE for the raw data), and the proteins underwent CID fragmentation. I searched with a precursor mass tolerance window of 2000 Da because these proteins are glycosylated and I wanted to see whether MSFragger + Philosopher could pick up peptides with large mass shifts (from glycosylation). The fragment mass tolerance was 20 ppm.
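
For readers who want to reproduce the database step, here is a minimal sketch of appending reversed decoys to a FASTA file. It is not the script used in this post: the function names and the example record are hypothetical, and the `DECOY_` prefix is simply chosen to match the `--decoy DECOY_` value in the command above.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def with_reversed_decoys(records: Iterable[Record], prefix: str = "DECOY_") -> List[Record]:
    """Return the targets followed by one reversed-sequence decoy per target."""
    targets = list(records)
    decoys = [(prefix + header, seq[::-1]) for header, seq in targets]
    return targets + decoys


# Hypothetical two-line record, only to show the transformation
example = [">sp|P00001|TEST_HUMAN example protein", "MKTAYIAK", "QRQ"]
database = with_reversed_decoys(read_fasta(example))
for header, seq in database:
    print(f">{header}\n{seq}")
```

To process a real file, pass an open file handle to `read_fasta` and write the resulting records back out in the same `>header` / sequence layout.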

The idea was to identify precursors whose fragment spectra faithfully match theoretical spectra but whose masses would fall outside a conventional, narrow precursor mass tolerance.


whitleyo · 2017-11-11T21:38:51Z

whitleyo
Nov 11, 2017
Author

After looking again through some reviews (e.g. Allen et al. 2013) and inspecting the spectral data with other software (GlycoPAT), I expect the MS2 spectra to be dominated by precursor peptides with mass shifts of varying magnitude, because CID predominantly breaks bonds between the sugars of the glycan groups.
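
To give a sense of the size of such mass shifts, the sketch below sums standard monoisotopic residue masses for two glycan compositions and compares them with the 2000 Da figure used in the search. The compositions are illustrative only (nothing here says they occur in this sample), the function name is hypothetical, and treating 2000 Da as an upper bound on the shift is an assumption about how the window is applied.

```python
from typing import Dict

# Monoisotopic residue masses in Da (monosaccharide minus water)
RESIDUE_MASS: Dict[str, float] = {
    "Hex": 162.05282,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.07937,  # N-acetylhexosamine
    "dHex": 146.05791,    # deoxyhexose, e.g. fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}

PRECURSOR_WINDOW_DA = 2000.0  # assumed upper bound on the searched mass shift


def glycan_mass_shift(composition: Dict[str, int]) -> float:
    """Mass added to a peptide by a glycan with the given residue counts."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


# Illustrative compositions, not identifications from this dataset
examples = {
    "HexNAc(2)Hex(5)": {"HexNAc": 2, "Hex": 5},
    "HexNAc(4)Hex(5)NeuAc(2)": {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
}

for label, composition in examples.items():
    shift = glycan_mass_shift(composition)
    status = "inside" if shift <= PRECURSOR_WINDOW_DA else "outside"
    print(f"{label}: +{shift:.4f} Da ({status} a {PRECURSOR_WINDOW_DA:.0f} Da window)")
```

The point of the comparison is that intact larger glycans can exceed the window, whereas partially fragmented glycans give smaller shifts that stay within it.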


prvst · 2017-11-15T15:46:24Z

prvst
Nov 15, 2017
Collaborator

The PeptideProphet version inside Philosopher is pretty much the same as the one in TPP v5.0.1; the few adjustments I made have no effect on the results. I don't see why it shouldn't find your proteins, provided that you have a properly set parameter file for the search and the correct set of parameters for PeptideProphet. I suggest you get in touch with the developers who maintain the Prophets code and ask them.
