mspire-mascot-dat 0.0.1

@@ -0,0 +1,667 @@
1
+ # Results File
2
+
3
+ The results file contains the search results together with the search
4
+ input parameters and MS data. This means that a results file contains
5
+ everything necessary to generate a report, repeat the search at a later
6
+ date, or act as the self-contained input file to a project database or LIMS.
7
+ The contents are divided into logical sections:
8
+
9
+ 1. Search parameters
10
+ 2. Mass values
11
+ 3. Quantitation method (if used)
12
+ 4. Unimod extract
13
+ 5. Enzyme definition
14
+ 6. Taxonomy (if a taxonomy filter was used)
15
+ 7. Misc. header information
16
+ 8. Summary results (for Protein Summary)
17
+ 9. Mixtures (if PMF)
18
+ 10. Summary of decoy results (if automatic decoy)
19
+ 11. Summary of error tolerant results (if automatic ET)
20
+ 12. Mixtures in decoy results (if automatic decoy PMF)
21
+ 13. Peptides (if SQ or MIS)
22
+ 14. Decoy peptides (if SQ or MIS and automatic decoy)
23
+ 15. Error tolerant peptides (if SQ or MIS and automatic ET)
24
+ 16. Proteins (if SQ or MIS)
25
+ 17. Query data, one block for each query
26
+ 18. Index
27
+
28
+ ### General Notes
29
+
30
+ 1. Values are shown in italics
31
+ 2. Scripts are written so that label case doesn’t matter.
32
+ 3. Labels are used to assist readability, but kept short to minimise
33
+ file size
34
+ 4. Parameters are grouped logically
35
+ 5. Order of blocks is not important except that the index block
36
+ must be the last block. Presence of blank lines within the index
37
+ block may cause a problem.
38
+ 6. Because the MIME type is defined as an unknown application,
39
+ if this file passes through a mail agent, it will be treated as an
40
+ “octet stream” and encoded “base64” for transmission.
41
+
42
+ ## Search parameters
43
+
44
+ --gc0p4Jq0M2Yt08jU534c0p
45
+ Content-Type: application/x-Mascot; name="parameters"
46
+
47
+ USERNAME=user name in plain text
48
+ USEREMAIL=email address in plain text
49
+ SEARCH=PMF
50
+ COM=search title text
51
+ DB=MSDB
52
+ CLE=Trypsin
53
+ MASS=Monoisotopic
54
+ MODS=Mod 1,Mod 2
55
+ .
56
+ .
57
+ .
58
+ RULES=1,2,5,6,8,9,13,14
59
+ --gc0p4Jq0M2Yt08jU534c0p
60
+
61
+ The Parameters section contains the complete set of parameter values
62
+ from the search form apart from the contents of the uploaded data file or
63
+ the query window. Labels must be unique, independent of case. Where a
64
+ parameter can be multivalued (e.g. mods) the values are listed on one
65
+ line separated by commas.
66
+ RULES contains a list of the rule numbers that define the instrument
67
+ type in the configuration file fragmentation_rules. The rule numbers
68
+ are listed explicitly because the contents of the configuration file may
69
+ have changed since the search was run.
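The conventions described above (labels unique regardless of case, multivalued parameters on one comma-separated line) can be sketched in Ruby as follows. The method name `parse_parameters` and the `MULTIVALUED` list are hypothetical and chosen for illustration; they are not taken from the mspire-mascot-dat API.

```ruby
# Assumption: these labels are treated as comma-separated lists.
MULTIVALUED = %w[mods it_mods rules].freeze

# Hypothetical helper: parse the body of a "parameters" block into a Hash.
# Labels are downcased because they are unique independent of case.
def parse_parameters(text)
  text.each_line.with_object({}) do |line, params|
    line = line.chomp
    next unless line.include?('=')

    label, value = line.split('=', 2)
    key = label.downcase
    params[key] = MULTIVALUED.include?(key) ? value.split(',') : value
  end
end

block = "SEARCH=PMF\nCLE=Trypsin\nMODS=Mod 1,Mod 2\nRULES=1,2,5,6,8,9,13,14\n"
params = parse_parameters(block)
puts params['mods'].inspect
```

Splitting on the first `=` only keeps values that themselves contain `=` intact. A modification name containing a comma would need different handling.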
70
+
71
+ ## Masses
72
+
73
+ --gc0p4Jq0M2Yt08jU534c0p
74
+ Content-Type: application/x-Mascot; name="masses"
75
+
76
+ A=71.037110
77
+ B=114.534940
78
+ C=160.030649
79
+ D=115.026940
80
+ E=129.042590
81
+ F=147.068410
82
+ G=57.021460
83
+ H=137.058910
84
+ I=113.084060
85
+ J=0.000000
86
+ K=128.094960
87
+ L=113.084060
88
+ M=131.040480
89
+ N=114.042930
90
+ O=0.000000
91
+ P=97.052760
92
+ Q=128.058580
93
+ R=156.101110
94
+ S=87.032030
95
+ T=101.047680
96
+ U=150.953630
97
+ V=99.068410
98
+ W=186.079310
99
+ X=111.000000
100
+ Y=163.063330
101
+ Z=128.550590
102
+ Hydrogen=1.007825
103
+ Carbon=12.000000
104
+ Nitrogen=14.003074
105
+ Oxygen=15.994915
106
+ Electron=0.000549
107
+ C_term=17.002740
108
+ N_term=1.007825
109
+ delta1=15.994919,Oxidation (M)
110
+ NeutralLoss1=0.000000
111
+ FixedMod1=57.021469, Carbamidomethyl (C)
112
+ FixedModResidues1=C
113
+ --gc0p4Jq0M2Yt08jU534c0p
114
+
115
+ This block contains “actual” mass values. That is, average or monoisotopic
116
+ residue masses, including any fixed modifications; C and N terminus
117
+ groups also include any fixed modifications.
118
+
119
+ FixedMod1, FixedMod2, etc., records the delta mass and name for each
120
+ fixed modification as comma separated values. FixedModResidues1 gives
121
+ the site specificity. If multiple residues are affected, they are listed as a
122
+ string, e.g. STY. If there was a neutral loss, the delta mass is given by
123
+ the value of FixedModNeutralLoss1.
124
+
125
+ FixedModn=delta, Name
126
+ FixedModResiduesn=[A-Z]|C_term|N_term
127
+ FixedModNeutralLossn=mass
128
+
129
+ Fixed modifications cannot have peptide neutral losses or multiple neutral
130
+ losses, and cannot be protein-terminal or residue-terminal. In all these
131
+ cases, fixed modifications are automatically converted into variable ones.
132
+
133
+ Variable modifications are reported in delta1, delta2, etc. Each entry
134
+ defines the difference in mass introduced by the modification together
135
+ with the name of the modification, separated by a comma. If a variable
136
+ modification suffers a neutral loss on fragmentation, the delta is
137
+ specified by a NeutralLossn entry. By definition, this is always a master
138
+ neutral loss. If there are multiple neutral losses, then two more lines
139
+ appear:
140
+
141
+ NeutralLossn_master=mass[[,mass] ...]
142
+ NeutralLossn_slave=mass[[,mass] ...]
143
+
144
+ The first neutral loss (defined by NeutralLossn) has an implicit index
145
+ number of 1. Any additional neutral losses (defined by
146
+ NeutralLossn_master, followed by NeutralLossn_slave) have implicit
147
+ index numbers of 2 and up.
148
+
149
+ If a modification includes a required or optional neutral loss from the
150
+ precursor, this is recorded as follows:
151
+
152
+ ReqPepNeutralLossn=mass[[,mass] ...]
153
+ PepNeutralLossn=mass[[,mass] ...]
154
+
155
+ Error-tolerant modifications are not listed in the masses section.
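As a minimal sketch, a `deltaN` or `FixedModN` value can be split into its mass and name like this. The helper name `parse_mod_entry` is hypothetical, and the sketch assumes the mass never contains a comma, so only the first comma is a separator.

```ruby
# Hypothetical helper: split "mass,name" into [Float, String].
# The name is stripped because FixedMod values may carry a space after the comma.
def parse_mod_entry(value)
  mass, name = value.split(',', 2)
  [Float(mass), name.to_s.strip]
end

puts parse_mod_entry('15.994919,Oxidation (M)').inspect
puts parse_mod_entry('57.021469, Carbamidomethyl (C)').inspect
```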
156
+
157
+ ## Quantitation
158
+
159
+ --gc0p4Jq0M2Yt08jU534c0p
160
+ Content-Type: application/x-Mascot; name="quantitation"
161
+
162
+ <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
163
+ <quantitation majorVersion="1" minorVersion="0"
164
+ xmlns="http://www.matrixscience.com/xmlns/schema/quantitation_1"
165
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
166
+ xsi:schemaLocation="http://www.matrixscience.com/xmlns/schema/quantitation_1
167
+ quantitation_1.xsd">
168
+ <method constrain_search="false"
169
+ description="15N metabolic labelling" min_num_peptides="2"
170
+ name="15N Metabolic [MD]" prot_score_type="mudpit"
171
+ protein_ratio_type="weighted" report_detail="true"
172
+ require_bold_red="true" show_sub_sets="0.5"
173
+ sig_threshold_value="0.05">
174
+ <component name="light">
175
+ <isotope/>
176
+ </component>
177
+ <component name="heavy">
178
+ <isotope>
179
+ <old>N</old>
180
+ <new>15N</new>
181
+ </isotope>
182
+
183
+ This section is an extract from quantitation.xml containing the
184
+ quantitation method specified for the search. For more details and a link
185
+ to the schema, refer to the Mascot HTML help pages for quantitation.
186
+
187
+ ## Unimod
188
+
189
+ --gc0p4Jq0M2Yt08jU534c0p
190
+ Content-Type: application/x-Mascot; name="unimod"
191
+
192
+ <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
193
+ <umod:unimod xmlns:umod="http://www.unimod.org/xmlns/schema/unimod_2"
194
+ majorVersion="2" minorVersion="0"
195
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
196
+ xsi:schemaLocation="http://www.unimod.org/xmlns/schema/unimod_2 unimod_2.xsd">
197
+ <umod:elements>
198
+ <umod:elem avge_mass="1.00794" full_name="Hydrogen"
199
+ mono_mass="1.007825035" title="H"/>
200
+ <umod:elem avge_mass="2.014101779" full_name="Deuterium"
201
+ mono_mass="2.014101779" title="2H"/>
202
+ <umod:elem avge_mass="6.941" full_name="Lithium"
203
+ mono_mass="7.016003" title="Li"/>
204
+ <umod:elem avge_mass="12.0107" full_name="Carbon" mono_mass="12"
205
+ title="C"/>
206
+
207
+ This section is an extract from unimod.xml containing data for the
208
+ elements, amino_acids, and any modifications specified in the search
209
+ form. For more details and a link to the schema, refer to the help pages
210
+ at www.unimod.org.
211
+
212
+ ## Enzyme
213
+
214
+ --gc0p4Jq0M2Yt08jU534c0p
215
+ Content-Type: application/x-Mascot; name="enzyme"
216
+
217
+ Title:Trypsin
218
+ Cleavage:KR
219
+ Restrict:P
220
+ Cterm
221
+ *
222
+
223
+ This section is simply an extract from the enzyme file. Syntax details can
224
+ be found in Chapter 6.
225
+
226
+ ## Taxonomy
227
+
228
+ --gc0p4Jq0M2Yt08jU534c0p
229
+ Content-Type: application/x-Mascot; name="taxonomy"
230
+
231
+ Title:. . . . . . . . . . . . . . . . Homo sapiens (human)
232
+ Include: 9606
233
+ Exclude:
234
+ *
235
+
236
+ This section is simply an extract from the taxonomy file. Syntax details
237
+ can be found in Chapter 9.
238
+
239
+ ## Header
240
+
241
+ --gc0p4Jq0M2Yt08jU534c0p
242
+ Content-Type: application/x-Mascot; name="header"
243
+
244
+ sequences=number of sequences in DB
245
+ sequences_after_tax=number of sequences after taxonomy filter
246
+ residues=number of residues in DB
247
+ distribution=see below
248
+ exec_time=search time in seconds
249
+ date=timestamp (seconds since Jan 1st 1970)
250
+ time=time in hh:mm:ss
251
+ queries=number of queries, (>= 1)
252
+ max_hits=maximum number of hits to be listed
253
+ version=version information
254
+ fastafile=full path to database fasta file
255
+ release=filename of actual database used - e.g. Owl_31.fasta
256
+ taskid=unique task identifier for searches submitted asynchronously
257
+ pmf_num_queries_used=number of mass values selected for PMF match
258
+ pmf_queries_used=comma separated list of selected query numbers
259
+ Warn0=
260
+ Warn1=
261
+ Warn2=
262
+ --gc0p4Jq0M2Yt08jU534c0p
263
+
264
+ The Header section contains general values, used in the master results
265
+ page header paragraph.
266
+ Distribution is a comma separated list of values that represent a
267
+ histogram of the complete protein score distribution. The first value is
268
+ the number of entries with score 0, the second is the number of entries
269
+ with score 1, and so on, up to the maximum score for the search. Scores
270
+ are converted to integers by truncation. This distribution is only
271
+ meaningful for a peptide mass fingerprint search.
272
+ If intensity values are supplied for a peptide mass fingerprint, Mascot
273
+ iterates the experimental peaks to find the set that gives the best score.
274
+ The number of values selected is reported in pmf_num_queries_used
275
+ and the selected queries listed in pmf_queries_used.
276
+
277
+ ## Summary results
278
+
279
+ --gc0p4Jq0M2Yt08jU534c0p
280
+ Content-Type: application/x-Mascot; name="summary"
281
+
282
+ qmass1=Mr
283
+ qexp1=m/z for query 1,
284
+ charge
285
+ qintensity1=intensity value for query1 (if available)
286
+ qmatch1=Total number of peptide mass matches for query1 in database
287
+ qplughole1=Threshold score for homologous peptide match (MIS only)
288
+ qmass2=...
289
+ qexp2=...
290
+ qintensity2=
291
+ qmatch2=...
292
+ qplughole2=...
293
+ .
294
+ .
295
+ .
296
+ qmassn=...
297
+ qexpn=...
298
+ qintensityn=
299
+ qmatchn=...
300
+ qplugholen=...
301
+ num_hits=number of hits in the summary block (<= max_hits)
302
+ h1=accession string,
303
+ total protein score,
304
+ obsolete,
305
+ intact protein mass
306
+ h1_text=title text
307
+ h1_frame=frame_number (between 1 and 6, for nucleic acid only)
308
+ h1_q1=missed cleavages, (–1 indicates no match)
309
+ peptide Mr,
310
+ delta,
311
+ start,
312
+ end,
313
+ number of ions matched,
314
+ peptide string,
315
+ peaks used from Ions1,
316
+ variable modifications string,
317
+ ions score,
318
+ multiplicity,
319
+ ion series found,
320
+ peaks used from Ions2,
321
+ peaks used from Ions3,
322
+ total area of matched peaks
323
+ h1_q1_et_mods=modification mass,
324
+ neutral loss mass,
325
+ modification description
326
+ h1_q1_et_mods_master=neutral loss mass[[,neutral loss mass] ... ]
327
+ h1_q1_et_mods_slave=neutral loss mass[[,neutral loss mass] ... ]
328
+ h1_q1_primary_nl=neutral loss string
329
+ h1_q1_na_diff=original NA sequence,
330
+ modified NA sequence
331
+ h1_q1_tag=tagNum:startPos:endPos:seriesID,...
332
+ h1_q1_drange=startPos:endPos
333
+ h1_q1_terms=residue,residue
334
+ h1_q1_subst=pos1,ambig1,matched1 ... ,posn,ambign,matchedn
335
+ h1_q2=...
336
+ .
337
+ .
338
+ .
339
+ h1_qm=...
340
+ h2=...
341
+ .
342
+ .
343
+ .
344
+ hn_qm=...
345
+ --gc0p4Jq0M2Yt08jU534c0p
346
+
347
+ Where a parameter has multiple values, these are shown on separate
348
+ lines for clarity. In the actual result file, all values for a parameter are
349
+ on a single line and there are no spaces or tabs between values.
350
+ Variable modifications is a string of digits, one digit for the N terminus,
351
+ one for each residue and one for the C terminus. Each digit specifies the
352
+ modification used to obtain the match: 0 indicates no modification, 1
353
+ indicates delta1, 2 indicates delta2 etc., in the masses section. If the
354
+ number of modifications exceeds 9, the letters A to W are used to
355
+ represent modifications 10 to 32. X is used to indicate a modification found in
356
+ error tolerant mode.
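The encoding of the variable modifications string can be illustrated with a short Ruby sketch. The helper names `mod_index` and `decode_var_mods` are hypothetical, and the sketch assumes the string has exactly one character per residue plus one for each terminus, as described above.

```ruby
# Hypothetical helper: decode one character of a variable modifications string.
# '0' => nil (unmodified), '1'..'9' => 1..9, 'A'..'W' => 10..32,
# 'X' => :error_tolerant (a modification found in error tolerant mode).
def mod_index(char)
  case char
  when '0' then nil
  when '1'..'9' then char.to_i
  when 'A'..'W' then char.ord - 'A'.ord + 10
  when 'X' then :error_tolerant
  else raise ArgumentError, "unexpected character #{char.inspect}"
  end
end

# Returns [n_term, [[residue, mod], ...], c_term] for a peptide string.
def decode_var_mods(peptide, mod_string)
  unless mod_string.length == peptide.length + 2
    raise ArgumentError, 'expected one character per residue plus two termini'
  end

  codes = mod_string.each_char.map { |c| mod_index(c) }
  [codes.first, peptide.each_char.zip(codes[1..-2]), codes.last]
end

# Made-up example: residue M carries delta1, nothing else is modified.
puts decode_var_mods('MK', '0100').inspect
```

A returned index n refers to the `deltan` entry in the masses section.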
357
+ neutral loss string is the same concept as the variable mod string,
358
+ except each character represents the index of the primary neutral loss
359
+ (one of the master NL). Any position that is not modified, or where the
360
+ mod has no neutral loss, is set to 0. hn_qm_primary_nl will only be
361
+ output if the string contains at least one non-zero character.
362
+ If a new modification is found in an error tolerant search, its position is
363
+ marked by X, and details are recorded in an additional entry,
364
+ hn_qm_et_mods. If the error tolerant search is of a nucleic acid
365
+ database, and the modification is a single base change in the primary
366
+ sequence, the two mass fields will be set to zero, and one of the keywords
367
+ NA_INSERTION, NA_DELETION, or NA_SUBSTITUTION will appear in the
368
+ description field. The additional parameter hn_qm_na_diff is then used
369
+ to record the ‘before’ and ‘after’ nucleic acid sequences.
370
+ *Ion series* is a string of 19 digits representing the ion series:
371
+
372
+ a
373
+ place holder
374
+ a++
375
+ b
376
+ place holder
377
+ b++
378
+ y
379
+ place holder
380
+ y++
381
+ c
382
+ c++
383
+ x
384
+ x++
385
+ z
386
+ z++
387
+ z+H
388
+ z+H++
389
+ z+2H
390
+ z+2H++
391
+
392
+ A digit is set to 1 if the corresponding series contains more than just
393
+ random matches and 2 if the series contributes to the score.
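A minimal Ruby sketch of decoding the 19-digit ion series string, using the position order listed above. The constant `ION_SERIES` and the helper `decode_ion_series` are hypothetical names introduced for illustration.

```ruby
# Position order taken from the list above; 'placeholder' marks unused positions.
ION_SERIES = %w[a placeholder a++ b placeholder b++ y placeholder y++
                c c++ x x++ z z++ z+H z+H++ z+2H z+2H++].freeze

# Hypothetical helper: map the digit string to { series => digit },
# keeping only real series with a non-zero digit.
# 1 = more than just random matches, 2 = series contributes to the score.
def decode_ion_series(digits)
  raise ArgumentError, 'expected 19 digits' unless digits.match?(/\A\d{19}\z/)

  ION_SERIES.zip(digits.each_char.map(&:to_i))
            .reject { |name, flag| name == 'placeholder' || flag.zero? }
            .to_h
end

# Made-up example: b has non-random matches, y contributes to the score.
puts decode_ion_series('0001002000000000000').inspect
```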
394
+ Multiplicity means number of peptide mass matches for a query in a
395
+ protein.
396
+ For each sequence tag, four colon separated values are output: 1-based
397
+ tag number, 1-based residue position where tag starts, 1-based residue
398
+ position where tag ends, ion series into which the tag was matched:
399
+
400
+ -1 means no matches for the tag
401
+ 0 “a” series (single charge)
402
+ 1 “a-NH3” series (single charge)
403
+ 2 “a” series (double charge)
404
+ 3 “b” series (single charge)
405
+ 4 “b-NH3” series (single charge)
406
+ 5 “b” series (double charge)
407
+ 6 “y” series (single charge)
408
+ 7 “y-NH3” series (single charge)
409
+ 8 “y” series (double charge)
410
+ 9 “c” series (single charge)
411
+ 10 “c” series (double charge)
412
+ 11 “x” series (single charge)
413
+ 12 “x” series (double charge)
414
+ 13 “z” series (single charge)
415
+ 14 “z” series (double charge)
416
+ 15 “a-H2O” series (single charge)
417
+ 16 “a-H2O” series (double charge)
418
+ 17 “b-H2O” series (single charge)
419
+ 18 “b-H2O” series (double charge)
420
+ 19 “y-H2O” series (single charge)
421
+ 20 “y-H2O” series (double charge)
422
+ 21 “a-NH3” series (double charge)
423
+ 22 “b-NH3” series (double charge)
424
+ 23 “y-NH3” series (double charge)
425
+ 25 “internal yb” series (single charge)
426
+ 26 “internal ya” series (single charge)
427
+ 27 “z+H” series (single charge)
428
+ 28 “z+H” series (double charge)
429
+ 29 high-energy “d” and “d’” series (single charge)
430
+ 31 high-energy “v” series (single charge)
431
+ 32 high-energy “w” and “w’” series (single charge)
432
+ 33 “z+2H” series (single charge)
433
+ 34 “z+2H” series (double charge)
434
+
435
+ If there are multiple tags for a query, comma separated groups of these
436
+ numbers are output for each tag.
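The tag format described above (comma-separated groups of four colon-separated integers) could be parsed along these lines. `Tag` and `parse_tags` are hypothetical names, and the input value is made up for illustration.

```ruby
# One matched sequence tag; series_id follows the numbering in the list above.
Tag = Struct.new(:number, :start_pos, :end_pos, :series_id)

# Hypothetical helper: parse "tagNum:startPos:endPos:seriesID,..." into Tags.
def parse_tags(value)
  value.split(',').map do |group|
    Tag.new(*group.split(':').map { |v| Integer(v) })
  end
end

tags = parse_tags('1:3:5:6,2:0:0:-1')
tags.each { |t| puts t.to_a.inspect }
```

A `series_id` of -1 means the tag had no matches.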
437
+ hn_qm_drange is output for a query that includes an error tolerant
438
+ sequence tag. It defines the range of positions within which an
439
+ unsuspected modification has been located. For a peptide of 10 residues,
440
+ position 0 would indicate the amino terminus and position 11 would
441
+ indicate the carboxy terminus. If there is no location information, the
442
+ range is output as 0,256
443
+
444
+ hn_qm_terms shows the residues that bracket the peptide in the protein.
445
+ If the peptide forms the terminus of the protein, then a hyphen is used
446
+ instead.
447
+
448
+ hn_qm_subst is output when the matched peptide contained an
449
+ ambiguous residue (B, X, or Z). The argument is one or more triplets of comma
450
+ separated values. For each triplet, the first value is the residue position,
451
+ the second is the ambiguous residue, and the third is the residue that
452
+ has been substituted to obtain the reported match.
453
+
454
+ For a large MS/MS search, num_hits is set to zero, and the summary
455
+ block only contains entries for qmassn, qexpn, qmatchn,
456
+ qplugholen. The threshold for switching to this mode is specified using
457
+ two parameters in the Options section of mascot.dat. SplitDataFileSize
458
+ is the size of the search process in bytes, (default 10000000), and
459
+ SplitNumberOfQueries is the size of the search in queries, (default
460
+ 1000).
461
+
462
+ If this is a two-pass search, either an automatic decoy database search or
463
+ an automatic error tolerant search, a second summary block appears,
464
+ containing the second set of results. The section name is either
465
+ et_summary or decoy_summary. The syntax of the contents is identical.
466
+
467
+ ## Mixture
468
+
469
+ --gc0p4Jq0M2Yt08jU534c0p
470
+ Content-Type: application/x-Mascot; name="mixture"
471
+
472
+ num_hits=number of mixtures found
473
+ h1_score=total score for mixture 1
474
+ h1_numprot=number of proteins in mixture 1
475
+ h1_nummatch=number of queries matched
476
+ h1_m1=accession string for protein component 1
477
+ h1_m2=accession string for protein component 2
478
+ .
479
+ .
480
+ .
481
+ h1_mm=accession string for protein component m
482
+ h2_score=
483
+ .
484
+ .
485
+ .
486
+ hn_mm=
487
+ --gc0p4Jq0M2Yt08jU534c0p
488
+
489
+ The Mixture section is only output for a peptide mass fingerprint. If any
490
+ statistically significant protein mixtures are found, the mixture
491
+ components are summarised. For details of individual components, use the
492
+ accession strings to refer back to the Summary section.
493
+
494
+ If this is an automatic decoy database search, a second mixture block
495
+ appears, containing the second set of results. The section name is
496
+ decoy_mixture. The syntax of the contents is identical.
497
+
498
+ ## Peptides
499
+
500
+ --gc0p4Jq0M2Yt08jU534c0p
501
+ Content-Type: application/x-Mascot; name="peptides"
502
+
503
+ q1_p1=missed cleavages, (–1 indicates no match)
504
+ peptide Mr,
505
+ delta,
506
+ number of ions matched,
507
+ peptide string,
508
+ peaks used from Ions1,
509
+ variable modifications string,
510
+ ions score,
511
+ ion series found,
512
+ peaks used from Ions2,
513
+ peaks used from Ions3;
514
+ “accession string”: data for first protein
515
+ frame number:
516
+ start:
517
+ end:
518
+ multiplicity,
519
+ “accession string”: data for second protein
520
+ frame number:
521
+ start:
522
+ end:
523
+ multiplicity,
524
+ etc.
525
+ q1_p1_et_mods=modification mass,
526
+ neutral loss mass,
527
+ modification description
528
+ q1_p1_et_mods_master=neutral loss mass[[,neutral loss mass] ... ]
529
+ q1_p1_et_mods_slave=neutral loss mass[[,neutral loss mass] ... ]
530
+ q1_p1_primary_nl=neutral loss string
531
+ q1_p1_na_diff=original NA sequence,
532
+ modified NA sequence
533
+ q1_p1_tag=tagNum:startPos:endPos:seriesID,...
534
+ q1_p1_drange=startPos:endPos
535
+ q1_p1_terms=residue,residue[[:residue,residue] ... ]
536
+ q1_p1_subst=pos1,ambig1,matched1 ... ,posn,ambign,matchedn
537
+ q1_p1_comp=quantitation component name
538
+ q1_p2=...
539
+ .
540
+ .
541
+ .
542
+ qn_pm=...
543
+ --gc0p4Jq0M2Yt08jU534c0p
544
+
545
+ Each line contains the data for a peptide match followed by data for at
546
+ least one protein in which the peptide was found.
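The layout of a `qn_pm` value (peptide fields, a semicolon, then one entry per protein) can be sketched in Ruby as below. The names `PEPTIDE_FIELDS` and `parse_peptide_line` are hypothetical, the sample value is made up, and the sketch assumes a no-match entry is the bare string `-1` and that accession strings contain no double quotes.

```ruby
# Field order taken from the q1_p1 description above.
PEPTIDE_FIELDS = %i[missed_cleavages mr delta ions_matched sequence peaks_ions1
                    var_mods ions_score ion_series peaks_ions2 peaks_ions3].freeze

# Hypothetical helper: returns [peptide_hash, protein_array], or nil for no match.
# Peptide values are left as strings; convert them as needed.
def parse_peptide_line(value)
  return nil if value == '-1' # assumption: -1 alone indicates no match

  peptide_part, protein_part = value.split(';', 2)
  peptide = PEPTIDE_FIELDS.zip(peptide_part.split(',')).to_h
  pattern = /"([^"]*)":(\d+):(\d+):(\d+):(\d+)/
  proteins = protein_part.to_s.scan(pattern).map do |acc, frame, start, stop, mult|
    { accession: acc, frame: frame.to_i, start: start.to_i,
      end: stop.to_i, multiplicity: mult.to_i }
  end
  [peptide, proteins]
end

line = '0,1234.56,0.01,5,PEPTIDEK,20,0000000000,45.2,0002000200000000000,0,0;' \
       '"P12345":0:10:17:1,"Q67890":0:5:12:1'
peptide, proteins = parse_peptide_line(line)
puts peptide[:sequence]
puts proteins.map { |pr| pr[:accession] }.inspect
```

Scanning for quoted accessions rather than splitting on commas keeps an accession that contains a comma in one piece.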
547
+
548
+ If there are multiple entries in the database containing the matched peptide,
549
+ there will be a corresponding number of pairs of bracketing residues
550
+ listed in qn_pm_terms.
551
+
552
+ Otherwise, individual field descriptions are identical to those for the
553
+ Summary section.
554
+
555
+ If this is a two-pass search, either an automatic decoy database search or
556
+ an automatic error tolerant search, a second peptides block appears,
557
+ containing the second set of results. The section name is either
558
+ et_peptides or decoy_peptides. The syntax of the contents is identical.
559
+
560
+ ## Proteins
561
+
562
+ --gc0p4Jq0M2Yt08jU534c0p
563
+ Content-Type: application/x-Mascot; name="proteins"
564
+
565
+ “accession string”=protein mass,
566
+ “title text”
567
+ .
568
+ .
569
+ .
570
+ “accession string”=protein mass,
571
+ “title text”
572
+ --gc0p4Jq0M2Yt08jU534c0p
573
+
574
+ This block contains reference data for the proteins listed in the peptides
575
+ block.
576
+
577
+ ## Input data for query n
578
+
579
+ --gc0p4Jq0M2Yt08jU534c0p
580
+ Content-Type: application/x-Mascot; name="queryn"
581
+
582
+ title=query title
583
+ index=query index
584
+ seq1=sequence qualifier (e.g. N-ABCDEF)
585
+ seq2=...
586
+ .
587
+ .
588
+ .
589
+ seqn=
590
+ comp1=composition qualifier (e.g. 0[P]2[W])
591
+ comp2=...
592
+ .
593
+ .
594
+ .
595
+ compn=...
596
+ PepTol=peptide tolerance qualifier (e.g. 2.000000,Da)
597
+ IT_MODS=Mod 1[,Mod 2[,...]]
598
+ INSTRUMENT=instrument identifier, (e.g. ESI-TRAP)
599
+ RULES=1,2,5,6,8,9,13,14
600
+ INTERNALS=min mass,max mass
601
+ CHARGE=charge state (e.g. 2+)
602
+ RTINSECONDS=a[[-b][,c[-d]]]
603
+ SCANS=a[[-b][,c[-d]]]
604
+ tag1=sequence tag (e.g. t,889.4,[QK]S,1104.54)
605
+ .
606
+ .
607
+ .
608
+ tagn=...
609
+ mass_min=lowest mass
610
+ mass_max=highest mass
611
+ int_min=lowest intensity
612
+ int_max=highest intensity
613
+ num_vals=number of mass values
614
+ num_used1=-1 (obsolete)
615
+ ions1=1344.65:34.3,1365.41:13.2
616
+ ions2=y-1344.65:34.3,1365.41:13.2
617
+ ions3=b-1344.65:34.3,1365.41:13.2
618
+ --gc0p4Jq0M2Yt08jU534c0p
619
+
620
+ Value “queryn” runs from “query1” (no leading zeros). ionsn values are
621
+ sorted so that the matched values come first.
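As an illustration, an `ionsn` value can be turned into mass and intensity pairs with a few lines of Ruby. `parse_ions` is a hypothetical helper; it assumes the optional series prefix is a single lowercase letter followed by a hyphen, as in the `ions2` and `ions3` examples above.

```ruby
# Hypothetical helper: returns [series, [[mass, intensity], ...]].
# series is nil when the value has no prefix (as in ions1).
def parse_ions(value)
  series = value[/\A([a-z])-/, 1]
  peaks = value.sub(/\A[a-z]-/, '').split(',').map do |pair|
    pair.split(':').map { |v| Float(v) }
  end
  [series, peaks]
end

puts parse_ions('1344.65:34.3,1365.41:13.2').inspect
puts parse_ions('y-1344.65:34.3,1365.41:13.2').inspect
```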
622
+
623
+ Most searches will only require a few of these fields. For example, a
624
+ peptide mass fingerprint would only include the charge field.
625
+
626
+ The index is a 0 based record of the original query order before sorting by
627
+ Mr.
628
+
629
+ ions2 and ions3 are only required when fragment ions are specified in a
630
+ sequence query as being N-terminal or C-terminal series.
631
+ The first field in a tagn value is t for a standard sequence tag and e for
632
+ an error tolerant sequence tag.
633
+
634
+ Some search parameters can be defined in the local scope of a query.
635
+ These are CHARGE, COMP, INSTRUMENT, IT_MODS, TOL, TOLU.
636
+ Any that are used are listed here. If the MGF file contained scan range
637
+ information in terms of seconds or scans, this is written to
638
+ RTINSECONDS and/or SCANS.
639
+
640
+ ## Index
641
+
642
+ --gc0p4Jq0M2Yt08jU534c0p
643
+ Content-Type: application/x-Mascot; name="index"
644
+
645
+ parameters=4
646
+ masses=78
647
+ unimod=116
648
+ enzyme=322
649
+ taxonomy=329
650
+ header=336
651
+ summary=351
652
+ et_summary=6059
653
+ peptides=6473
654
+ et_peptides=7143
655
+ proteins=7292
656
+ query1=7362
657
+ query2=7374
658
+ .
659
+ .
660
+ .
661
+ query81=8322
662
+ query82=8334
663
+ --gc0p4Jq0M2Yt08jU534c0p--
664
+
665
+ Values in index are the line number offsets of the section “Content-Type:”
666
+ lines (starting from 0 for the first line of the file).
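A rough Ruby sketch of using the index to jump to a section. `parse_index` and `content_type_line` are hypothetical helpers, and the `lines` array is a toy stand-in for a results file already split into lines, not real Mascot output.

```ruby
# Hypothetical helper: parse the body of the index block into { name => offset }.
# Lines without a numeric value (such as "." fillers) are ignored.
def parse_index(text)
  text.each_line.with_object({}) do |line, index|
    name, offset = line.chomp.split('=', 2)
    index[name] = Integer(offset) if offset && offset.match?(/\A\d+\z/)
  end
end

# Offsets are 0-based line numbers of each section's "Content-Type:" line.
def content_type_line(lines, index, section)
  lines[index.fetch(section)]
end

lines = [
  '--boundary',
  'Content-Type: application/x-Mascot; name="parameters"',
  '',
  'SEARCH=PMF',
  '--boundary',
  'Content-Type: application/x-Mascot; name="index"',
  '',
  'parameters=1',
  '--boundary--'
]
index = parse_index("parameters=1\n")
puts content_type_line(lines, index, 'parameters')
```

Because the offsets are line numbers rather than byte positions, a reader has to count lines (or build its own byte-offset table) before it can seek directly.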
667
+
@@ -0,0 +1,11 @@
1
+ require 'rspec'
2
+
3
+ # Requires supporting files with custom matchers and macros, etc,
4
+ # in ./support/ and its subdirectories.
5
+ #Dir["#{File.dirname(__FILE__)}/support/**/*.rb"].each {|f| require f}
6
+
7
+ RSpec.configure do |config|
8
+ config.treat_symbols_as_metadata_keys_with_true_values = true
9
+ end
10
+
11
+ TESTFILES = __dir__ + "/testfiles"