UniProtKB query fields

Supported query fields for searching specific data in UniProtKB (query syntax help).

Field Example Description
active active:no list all obsolete UniProtKB entries
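annotation annotation:(type:non-positional)
annotation:(type:positional)
annotation:(type:mod_res "Pyrrolidone carboxylic acid" confidence:proven)
list all entries with:
  • any non-positional annotation
  • any positional annotation
  • a modified residue annotated as "Pyrrolidone carboxylic acid" with confidence "proven"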
author author:ashburner list all entries with at least one reference co-authored by Michael Ashburner
citation citation:("intracellular structural proteins")
citation:(author:ashburner journal:nature)
list all entries with a literature citation:
  • containing the phrase "intracellular structural proteins" in either title or abstract
  • co-authored by Michael Ashburner and published in Nature
cluster cluster:UniRef90_A5YMT3 list all entries in the UniRef 90% identity cluster whose representative sequence is UniProtKB entry A5YMT3 (about UniRef)
content content:diabetes list all entries containing the term diabetes
count annotation:(type:transmem count:5)
annotation:(type:transmem count:[5 TO *])
annotation:(type:cofactor count:[3 TO *])
list all entries with:
  • exactly 5 transmembrane regions
  • 5 or more transmembrane regions
  • 3 or more Cofactor comments
created created:[20070107 TO *] list all entries created since January 7th 2007 (dates are written as YYYYMMDD)
database database:pfam list all entries with a cross-reference to the Pfam database (Databases cross-referenced in UniProtKB and ID mapping help)
domain domain:VWFA list all entries with a Von Willebrand factor type A domain described in the general annotation section (Index of protein domains and families)
ec ec:3.2.1.23 list all beta-galactosidases (Enzyme nomenclature database)
existence existence:"inferred from homology" list all entries whose protein existence is inferred from homology (see Protein existence criteria)
family family:serpin list all entries belonging to the Serpin family of proteins (Index of protein domains and families)
fragment fragment:yes list all fragment entries
gene gene:HSPC233 list all entries for proteins encoded by gene HSPC233
go go:cytoskeleton
go:0015629
list all entries associated with:
  • a GO term containing the word "cytoskeleton"
  • the GO term Actin cytoskeleton and any subclasses
host host:mouse
host:10090
host:40674
list all entries for viruses infecting:
  • organisms with a name containing the word "mouse"
  • Mus musculus (Mouse)
  • all mammals (all taxa classified under the taxonomy node for Mammalia)
interactor interactor:P00520 list all entries describing interactions with P00520
keyword keyword:toxin list all entries associated with the keyword Toxin (UniProtKB Keywords)
length length:[500 TO 700] list all entries describing sequences of length between 500 and 700 residues
lineage this field is a synonym for the field taxonomy
mass mass:[500000 TO *] list all entries describing sequences with a mass of at least 500,000 Da
method method:maldi
method:xray
list all entries for proteins identified by:
  • matrix-assisted laser desorption/ionization (MALDI)
  • crystallography (X-ray)
The method field searches the names of physico-chemical identification methods in the general annotation, reference and cross-reference sections
mnemonic mnemonic:ATP6_HUMAN list all entries with entry name (ID) ATP6_HUMAN. Also searches obsolete entry names (What is the difference between an accession number (AC) and the entry name?)
modified modified:[20060101 TO 20060301] list all entries that were modified between January 1st and March 1st 2006
name name:"prion protein" list all entries for prion proteins.
organelle organelle:ColE1 list all entries for proteins encoded on plasmid ColE1 (Controlled vocabulary of plasmids).
organism organism:"Ovis aries"
organism:9940
organism:sheep
list all entries for proteins expressed in sheep (first two examples) or in organisms whose name contains the term "sheep" (third example) (UniProt taxonomy).
replaces replaces:P02023 list all entries that were created from a merge with P02023 (see FAQ)
reviewed reviewed:yes list all UniProtKB/Swiss-Prot entries (about UniProtKB)
scope scope:mutagenesis list all entries containing a reference that was used to gather information about mutagenesis (See reference section of the user manual)
sequence sequence:P05067-9 list all entries containing a link to isoform 9 of the sequence described in entry P05067. Allows searching by specific sequence identifier
source source:intact list all entries containing a GO term whose annotation source is the IntAct database
strain strain:wistar list all entries containing a reference relevant to strain wistar (List of strains in reference comments and Taxonomy help: organism strains)
taxonomy taxonomy:40674 list all entries for proteins expressed in Mammals. This field is used to retrieve entries for all organisms classified below a given taxonomic node (taxonomy classification).
tissue tissue:liver list all entries containing a reference describing the protein sequence obtained from a clone isolated from liver (Controlled vocabulary of tissues)
web web:wikipedia list all entries for proteins that are described in Wikipedia
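
The examples in the table follow three recurring patterns: a plain field:value term, a bracketed range with an optional open end (*), and a parenthesised sub-query that groups several terms under one field. As a rough illustration only, the sketch below assembles such query strings in Python. The helper names (term, range_term, group) are hypothetical and are not part of any UniProt tool; the sketch only builds text and does not contact any service or check that a field name is supported.

```python
from datetime import date
from typing import Optional, Union

# Hypothetical helpers for illustration; they only build query text.
Bound = Optional[Union[int, date]]


def _format_bound(bound: Bound) -> str:
    """Render one range endpoint; None becomes the open-ended '*'."""
    if bound is None:
        return "*"
    if isinstance(bound, date):
        # Dates in the table are written as YYYYMMDD, e.g. 20060101.
        return bound.strftime("%Y%m%d")
    return str(bound)


def term(field: str, value: Union[str, int]) -> str:
    """Build 'field:value', quoting values that contain whitespace."""
    text = str(value)
    if any(ch.isspace() for ch in text):
        text = '"' + text + '"'
    return f"{field}:{text}"


def range_term(field: str, low: Bound = None, high: Bound = None) -> str:
    """Build 'field:[low TO high]'; a missing bound is written as '*'."""
    return f"{field}:[{_format_bound(low)} TO {_format_bound(high)}]"


def group(field: str, *parts: str) -> str:
    """Build a sub-query such as 'annotation:(type:transmem count:5)'."""
    return f"{field}:({' '.join(parts)})"


print(term("organism", "Ovis aries"))
# organism:"Ovis aries"
print(range_term("length", 500, 700))
# length:[500 TO 700]
print(range_term("modified", date(2006, 1, 1), date(2006, 3, 1)))
# modified:[20060101 TO 20060301]
print(group("annotation", term("type", "transmem"), range_term("count", 5)))
# annotation:(type:transmem count:[5 TO *])
```

Quoting only values that contain whitespace mirrors the table, where organism:"Ovis aries" and name:"prion protein" are quoted but organism:sheep is not.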