UniMod
 
 
Unimod Help

Guidelines for Naming Modifications

Standard short names for modifications are agreed by members of the protein modification workgroup of the Proteomics Standards Initiative. Names are chosen according to a number of (mandatory) rules and (non-mandatory) recommendations.
  RULES
1. These are names for modifications. Specificity (residue, position, terminus), linkage, chirality, neutral loss behaviour, etc., can be indicated separately, in an application-specific manner, and should not be part of the name.
2. Legal characters for modification names are the ASCII characters 0-9 A-Z a-z together with the following symbols and punctuation characters: # ( ) * + , - . / : > [ ] { }
3. Names are not case-sensitive.
4. Names have a minimum length of 3 characters and a maximum length of 64 characters.
5. Modifications that correspond to deltas with different elemental formulae cannot have the same name.
6. Use 2H, not D, for deuterium.
7. Nucleon number comes before the element symbol, not after. That is, 18O not O18.
8. There is no requirement to specify nucleon number for the most abundant natural isotope of each element. That is, N not 14N.
9. Reserved keywords are delta, label, lipid, cation, and xlink.
10. The colon character is reserved for delimiting keywords from the remainder of the name.
11. The plus character is reserved for delimiting combined modifications, e.g. Label:13C(9)+Phospho
12. Parentheses are reserved for counts in formulae, e.g. Delta:H(2)C(3)
13. For glycosylation, the name is constructed from the monosaccharide composition, e.g. Hex(1)HexNAc(1)dHex(1)
14. If there is no suitable name for a lipid, a name can be constructed from the keyword Lipid: plus the nominal monoisotopic mass of the delta. There is no instance at present.
15. For isotope labels that do not change the elemental composition of the residue or terminus, the name is constructed from the keyword Label: plus a count of the labelled atoms. For example, Label:13C(6). This is an implied substitution, so that Label:13C(6) is the same as Delta:C(-6)13C(6)
16. For isotope labels that are also chemical modifications, the name is constructed from a unique keyword and a colon-delimited count of the labelled atoms. Where an isotope tag has an accepted acronym or trademark, like ICAT, this should be used as part or all of the keyword. For example, ICPL and ICPL:13C(6). Heavy and light are not sufficient for isotope labels. Like Label:, this is an implied substitution.
17. Where there is no obvious or unambiguous name, use the keyword Delta: followed by the complete empirical formula of the delta. For example, Delta:H(2)C(3)
18. For cations, use the keyword Cation:, the element symbol, and the oxidation state using Roman numerals in square brackets. If there is only a single oxidation state for the element, and this is I, it can be omitted, e.g. Cation:Na and Cation:Fe[II]
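
As an illustration only, several of the rules above can be checked mechanically. The following Python sketch covers rules 2, 3 and 4 and the implied substitution of rule 15. It is not part of Unimod: the function names check_name, same_name and delta_composition are hypothetical, and the formula parser assumes that every element symbol carries an explicit count in parentheses, as in the examples given in the rules.

```python
import re

# Rule 2: ASCII digits and letters plus # ( ) * + , - . / : > [ ] { }
LEGAL_NAME = re.compile(r"[0-9A-Za-z#()*+,\-./:>\[\]{}]+")

# One formula term such as H(2), C(-6) or 13C(6): optional nucleon number
# before the element symbol (rule 7), count in parentheses (rule 12).
# Assumption: every term has an explicit count.
TERM = re.compile(r"(\d*)([A-Z][a-z]?)\((-?\d+)\)")


def check_name(name):
    """Return the violated rules (only rules 2 and 4 are checked)."""
    problems = []
    if not 3 <= len(name) <= 64:
        problems.append("rule 4: length must be 3 to 64 characters")
    if not LEGAL_NAME.fullmatch(name):
        problems.append("rule 2: illegal character")
    return problems


def same_name(a, b):
    """Rule 3: names are not case-sensitive."""
    return a.lower() == b.lower()


def delta_composition(name):
    """Expand a Delta: or Label: name into {symbol: count}."""
    keyword, sep, formula = name.partition(":")
    keyword = keyword.lower()
    if not sep or keyword not in ("delta", "label"):
        raise ValueError("expected a Delta: or Label: name")
    counts = {}
    pos = 0
    for term in TERM.finditer(formula):
        if term.start() != pos:  # unparsed text between two terms
            raise ValueError("unparseable formula: " + formula)
        pos = term.end()
        nucleons, element, count = term.group(1), term.group(2), int(term.group(3))
        symbol = nucleons + element
        counts[symbol] = counts.get(symbol, 0) + count
        # Rule 15: Label: is an implied substitution, so each labelled
        # atom replaces one atom of the unlabelled element.
        if keyword == "label" and nucleons:
            counts[element] = counts.get(element, 0) - count
    if pos != len(formula) or not counts:
        raise ValueError("unparseable formula: " + formula)
    return {s: n for s, n in counts.items() if n != 0}
```

With these definitions, delta_composition("Label:13C(6)") and delta_composition("Delta:C(-6)13C(6)") give the same composition, which is the equivalence stated in rule 15. Combined names such as Label:13C(9)+Phospho and tag keywords such as ICPL: are deliberately out of scope and raise ValueError.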

  RECOMMENDATIONS
1. Names are not intended to be IUPAC-style systematic nomenclature (http://www.chem.qmul.ac.uk/iupac/AminoAcid/ and http://www.acdlabs.com/iupac/nomenclature/93/r93_296.htm). The aim is a controlled list of common or trivial names.
2. Names should be semi-descriptive or, at least, recognizable.
3. If there is a generally accepted name or acronym for a modification, this should be adopted.
4. As far as possible, the length of a name should be 24 characters or less.
5. Names can be unambiguously delimited from application-specific prefixes and suffixes using any of the illegal characters, especially space, underscore, and @
6. As far as possible, there should be a one-to-one relationship between a name and a delta of a given empirical formula. Exceptions are allowed where the structures are different or where there are generally accepted names that convey additional useful information.
7. Use American English spelling (e.g. sulfate).
8. Use case to enhance readability.
9. Minimise use of hyphens. They are not required for prefixes like di and tri. Do not use a hyphen to indicate "loss of" because this is too ambiguous.
10. The preference is to name the modification as a moiety, not the reaction or the reagent or the modified residue. For example, acetyl and not acetylation or acetyl-L-lysine. This is only a preference, not a rule. It runs into difficulty when the modification is the removal of a moiety, because it is difficult to name the absence, and it may be necessary to use the reaction or process as the name.
11. Many modifications are essentially substitutions that are specific to a single residue. These can be most clearly represented using an arrow syntax, e.g. Arg->Orn, Trp->Kynurenin. To avoid confusion with elements, this syntax must never use 1-letter codes for residues.
12. Try to avoid starting a name with something that might be mistaken for a specificity, e.g. N- or C-, which are often used to indicate protein terminus PTMs.
13. Related modifications should be differentiated by suffixes rather than prefixes so that they sort together in an alphabetical list.
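
Recommendations 5 and 11 lend themselves to a short illustration. The Python sketch below splits a decorated string on any character that is illegal in a name, and parses the arrow syntax while rejecting 1-letter residue codes. The function names split_decorated and parse_substitution and the decorated strings "Phospho@S" and "Acetyl_N-term" are hypothetical examples, not taken from Unimod or from any particular application.

```python
import re

# A run of characters that rule 2 allows inside a name. Any other
# character (space, underscore, @, ...) therefore acts as a delimiter.
NAME_RUN = re.compile(r"[0-9A-Za-z#()*+,\-./:>\[\]{}]+")

# Arrow syntax from recommendation 11, e.g. Arg->Orn
ARROW = re.compile(r"([A-Za-z]+)->([A-Za-z]+)")


def split_decorated(text):
    """Split a decorated string into runs of legal name characters."""
    return NAME_RUN.findall(text)


def parse_substitution(name):
    """Return (source, target) for an arrow-style name, else None."""
    match = ARROW.fullmatch(name)
    if match is None:
        return None
    source, target = match.groups()
    if len(source) == 1 or len(target) == 1:
        raise ValueError("1-letter residue codes are not allowed")
    return source, target


print(split_decorated("Phospho@S"))      # ['Phospho', 'S']
print(split_decorated("Acetyl_N-term"))  # ['Acetyl', 'N-term']
print(parse_substitution("Arg->Orn"))    # ('Arg', 'Orn')
```

Which of the returned runs is the modification name and which is the specificity depends on the application's own convention; the guidelines only guarantee that an illegal character cannot occur inside the name itself.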

Non-standard Amino Acid Residues

There is no universal set of three letter abbreviations for non-standard residues. This list is drawn from:

Aad  2-Aminoadipic acid
Abu  2-Aminobutyric acid
Acm  Acetamidomethyl
Aib  2-Aminoisobutyric acid
Bum  t-Butyloxymethyl
Cit  Citrulline
Cmc  Carboxymethylcysteine
Cya  Cysteic acid
Dha  Dehydroalanine
Dhb  Dehydroamino-2-butyric acid
Gla  4-Carboxyglutamic acid
Glp  Pyroglutamic acid (also Pga, pGlu and
Hse  Homoserine
Hsl  Homoserine lactone
Hyl  Hydroxylysine
Hyp  Hydroxyproline
Iva  Isovaline
Nle  Norleucine
Orn  Ornithine
Pip  2-Piperidinecarboxylic acid
Pyr  Pyruvic acid
Sar  Sarcosine
Sec  Selenocysteine