pe_criteria.txt
----------------------------------------------------------------------------
UniProt - Swiss-Prot Protein Knowledgebase
Swiss Institute of Bioinformatics (SIB); Geneva, Switzerland
European Bioinformatics Institute (EBI); Hinxton, United Kingdom
Protein Information Resource (PIR); Washington DC, USA
----------------------------------------------------------------------------
Description: Criteria used to assign the PE level of entries
Name: pe_criteria.txt
Release: 56.5 of 25-Nov-2008
----------------------------------------------------------------------------
This document lists the criteria used to assign a PE level to entries.
Criteria used to implement the 'PE 1: Evidence at protein level'
----------------------------------------------------------------
We add the 'Evidence at protein level' qualifier to all entries with at
least one of the annotations listed below:
1. RP lines containing:
CHARACTERIZATION
PROTEIN SEQUENCE
AMINO-ACID COMPOSITION
CATALYTIC ACTIVITY
FUNCTION AS, FUNCTION IN
INTERACTION WITH, SUBUNIT, IDENTIFICATION IN ... COMPLEX
KNOCKOUT
-BINDING
IDENTIFICATION BY MASS SPECTROMETRY, MASS SPECTROMETRY
CRYSTALLIZATION, X-RAY CRYSTALLOGRAPHY, STRUCTURE BY NMR
CLEAVAGE, PROTEOLYTIC PROCESSING
TOPOLOGY
DISULFIDE
LEVEL OF PROTEIN EXPRESSION
PTM information (acetylation, glycosylation, methylation,
phosphorylation, ubiquitination etc.)
2. CC topics:
ALLERGEN - without the 'By similarity' qualifier
BIOPHYSICOCHEMICAL PROPERTIES
BIOTECHNOLOGY
DEVELOPMENTAL STAGE - with the 'at protein level' qualifier
INDUCTION - with the 'at protein level' qualifier
INTERACTION
MASS SPECTROMETRY
PHARMACEUTICAL
TISSUE SPECIFICITY - with the 'at protein level' qualifier
WEB RESOURCE: NAME=GeneReviews;
3. DR lines:
2DBase-Ecoli
COMPLUYEAST-2DPAGE
Cornea-2DPAGE
ECO2DBASE
HSC-2DPAGE
Rat-heart-2DPAGE
Siena-2DPAGE
SWISS-2DPAGE
World-2DPAGE
HPA; CABxxxx
PeptideAtlas
ProMEX
PDB - with the exception of the 'model' category
4. Keywords:
Direct protein sequencing
Disease mutation
5. FT lines:
CARBOHYD - without qualifiers 'By similarity', 'Potential', 'Probable'
CROSSLNK - without qualifiers 'By similarity', 'Potential', 'Probable'
DISULFID - without qualifiers 'By similarity', 'Potential', 'Probable'
LIPID - without qualifiers 'By similarity', 'Potential', 'Probable'
MOD_RES - without qualifiers 'By similarity', 'Potential', 'Probable'
MUTAGEN
6. In vivo subcellular location analysis from PubMed=14562095 in
S.cerevisiae
The 'PE 1' assignment overrides assignment to PE categories 2 and 3.
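As an illustration of the FT line criteria above, the following Python sketch
tests a single feature. It is not the UniProt implementation. The function
name, the (key, description) representation of a feature and the
case-insensitive substring test for qualifiers are all assumptions made for
this example.

```python
# Hypothetical sketch: a feature is passed as its key and its description text.
# Keys that count for PE 1 only when no non-experimental qualifier is present.
QUALIFIED_KEYS = {"CARBOHYD", "CROSSLNK", "DISULFID", "LIPID", "MOD_RES"}

# Qualifiers that exclude a feature from the PE 1 criteria.
NON_EXPERIMENTAL = ("by similarity", "potential", "probable")


def ft_supports_pe1(key, description):
    """Return True if one FT feature meets the PE 1 FT line criteria."""
    if key == "MUTAGEN":
        # MUTAGEN is listed without any qualifier restriction.
        return True
    if key in QUALIFIED_KEYS:
        text = description.lower()
        # Assumption: a qualifier shows up as plain text in the description.
        return not any(qualifier in text for qualifier in NON_EXPERIMENTAL)
    return False
```

A substring test is the simplest possible check and may misfire on
descriptions that use one of these words in another sense; a real parser
would read the qualifier from its own field.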
Criteria used to implement the 'PE 2: Evidence at transcript level'
-------------------------------------------------------------------
We add the 'Evidence at transcript level' qualifier to all entries with at
least one of the annotations listed below:
1. RP lines containing:
The [MRNA] 'molecule type' [*]
DEVELOPMENTAL STAGE
INDUCTION
RNA EDIT - for non-viral entries only
TISSUE SPECIFICITY
2. CC topics:
RNA EDITING - for non-viral entries only and without the qualifiers
'By similarity' and 'Potential'
3. DR lines:
ArrayExpress
CleanEx
EMBL - with molecule type "mRNA" [*]
[*] These two criteria are only applied to proteins at least 120 residues
long, since small CDS may be regulatory RNAs that are not translated.
The 'PE 2' assignment overrides assignment to PE category 3.
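The length restriction on the two [*] criteria can be sketched in Python as
follows. This is only an illustration: the function name and its boolean
parameters are hypothetical, and the caller is assumed to have already
determined which annotations are present in the entry.

```python
# The two [*] criteria apply only to proteins at least this many residues long.
MIN_LENGTH_FOR_MRNA_CRITERIA = 120


def has_transcript_evidence(length, rp_mrna=False, embl_mrna=False,
                            other_pe2=False):
    """Return True if an entry meets at least one PE 2 criterion.

    length     -- sequence length in residues
    rp_mrna    -- the [MRNA] molecule type is present [*]
    embl_mrna  -- an EMBL DR line with molecule type "mRNA" is present [*]
    other_pe2  -- any PE 2 criterion not marked [*] is met
    """
    if other_pe2:
        # Criteria not marked [*] are independent of sequence length.
        return True
    if length >= MIN_LENGTH_FOR_MRNA_CRITERIA:
        return rp_mrna or embl_mrna
    # Short CDS: mRNA evidence alone is not counted.
    return False
```

For example, has_transcript_evidence(119, embl_mrna=True) is False, while
the same call with a length of 120 is True.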
Criteria used to implement the 'PE 3: Inferred from homology'
-------------------------------------------------------------
We add the 'Inferred from homology' qualifier to all entries with at least
one of the annotations listed below:
Entries that have a CC SIMILARITY topic with 'Belongs to... family'
Entries that have any CC topic with the 'By similarity' qualifier
DR lines: HAMAP
FT lines: SIGNAL, INIT_MET, MOD_RES, LIPID, CROSSLNK with the
'By similarity' qualifier
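The two CC criteria above can be sketched in Python as follows. This is an
illustration under stated assumptions, not the UniProt implementation: the
function name, the (topic, text) representation of a CC block and the regular
expression used for 'Belongs to... family' are all hypothetical.

```python
import re

# Assumption: 'Belongs to... family' means the phrase "Belongs to" followed
# later in the same text by the word "family". As written, this does not
# match a text that mentions only a "superfamily".
BELONGS_TO_FAMILY = re.compile(r"Belongs to\b.*\bfamily", re.IGNORECASE)


def cc_supports_pe3(topic, text):
    """Return True if one CC block meets a PE 3 CC criterion."""
    if "by similarity" in text.lower():
        # Any CC topic carrying the 'By similarity' qualifier counts.
        return True
    return topic == "SIMILARITY" and BELONGS_TO_FAMILY.search(text) is not None
```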
Criteria used to implement the 'PE 4: Predicted'
------------------------------------------------
All entries that are not assigned to categories 1, 2, 3 or 5.
Criteria used to implement the 'PE 5: Uncertain'
------------------------------------------------
All entries that contain one of the following texts in a CC CAUTION topic:
'Could be the product of a pseudogene.'
'Product of a dubious CDS prediction.'
'Product of a dubious gene prediction.'
The 'PE 5' assignment overrides assignment to PE categories 1, 2 or 3.
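Taken together, the precedence rules in this document amount to the order
PE 5, then PE 1, PE 2, PE 3, with PE 4 as the default. The Python sketch
below illustrates that order. It is not the UniProt implementation: the
function names and parameters are hypothetical, and the three boolean flags
are assumed to have been computed from the criteria listed in the sections
above.

```python
# Texts that place an entry in PE 5 when found in a CC CAUTION topic.
UNCERTAIN_CAUTIONS = (
    "Could be the product of a pseudogene.",
    "Product of a dubious CDS prediction.",
    "Product of a dubious gene prediction.",
)


def is_uncertain(caution_texts):
    """Return True if any CC CAUTION text contains one of the PE 5 phrases."""
    return any(phrase in text
               for text in caution_texts
               for phrase in UNCERTAIN_CAUTIONS)


def assign_pe_level(protein, transcript, homology, caution_texts=()):
    """Return the PE level (1 to 5) for one entry.

    protein, transcript, homology -- True if at least one PE 1, PE 2 or
    PE 3 criterion is met, respectively.
    """
    if is_uncertain(caution_texts):
        return 5  # PE 5 overrides categories 1, 2 and 3.
    if protein:
        return 1  # PE 1 overrides categories 2 and 3.
    if transcript:
        return 2  # PE 2 overrides category 3.
    if homology:
        return 3
    return 4  # Predicted: not assigned to any other category.
```

For example, an entry with protein-level evidence and a CAUTION topic
reading 'Could be the product of a pseudogene.' is assigned PE 5, not PE 1.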
-----------------------------------------------------------------------
Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
Distributed under the Creative Commons Attribution-NoDerivs License
-----------------------------------------------------------------------



