Guidelines for Naming Modifications
Standard short names for modifications are agreed by members of the
protein modification workgroup of the Proteomics
Standards Initiative. Names are chosen according to a number of mandatory rules and non-mandatory recommendations.
|
RULES |
1 | These are names for modifications. Specificity (residue, position, terminus), linkage,
chirality, neutral loss behaviour, etc., can be indicated separately, in an application-specific manner,
and should not be part of the name. |
2 | Legal characters for modification names are the ASCII characters 0-9 A-Z a-z together
with the following symbols and punctuation characters # ( ) * + , - . / : > [ ] { } |
3 | Names are not case-sensitive. |
4 | Names have a minimum length of 3 characters and a maximum length of 64 characters. |
5 | Modifications that correspond to deltas with different elemental formulae cannot have the same name. |
6 | Use 2H, not D, for deuterium. |
7 | Nucleon number comes before the element symbol, not after. That is, 18O not O18 |
8 | There is no requirement to specify nucleon number for the most abundant natural isotope of
each element. That is, N not 14N |
9 | Reserved keywords are delta, label, lipid, cation, and xlink. |
10 | The colon character is reserved for delimiting keywords from the remainder of the name |
11 | The plus character is reserved for delimiting combined modifications, e.g. Label:13C(9)+Phospho |
12 | Parentheses are reserved for counts in formulae, e.g. Delta:H(2)C(3) |
13 | For glycosylation, the name is constructed from the monosaccharide composition, e.g.
Hex(1)HexNAc(1)dHex(1) |
14 | If there is no suitable name for a lipid, a name can be constructed from the keyword
Lipid: plus the nominal monoisotopic mass of the delta. There is no instance of this at present. |
15 | For isotope labels that do not change the elemental composition of the residue or
terminus, the name is constructed from the keyword Label: plus a count of the labelled atoms. For
example, Label:13C(6). This is an implied substitution, so that Label:13C(6) is the same as
Delta:C(-6)13C(6) |
16 | For isotope labels that are also chemical modifications, the name is constructed
from a unique keyword, and a colon delimited count of the labelled atoms. Where an isotope tag has
an accepted acronym or trademark, like ICAT, this should be used as part or all of the keyword.
For example, ICPL and ICPL:13C(6). The terms heavy and light are not sufficient for isotope labels. Like
Label:, this is an implied substitution |
17 | Where there is no obvious or unambiguous name, use the keyword Delta: followed
by the complete empirical formula of the delta. For example, Delta:H(2)C(3) |
18 | For cations, use the keyword Cation:, the element symbol, and the oxidation state
using Roman numerals in square brackets. If there is only a single oxidation state for the element,
and this is I, it can be omitted, e.g. Cation:Na and Cation:Fe[II] |
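
The implied substitution described in rule 15 can be illustrated with a short Python sketch. The function name `expand_label` and the ordering of terms in the generated Delta: formula are assumptions made for illustration; the guidelines only give the single example Label:13C(6) = Delta:C(-6)13C(6). Combined names using the plus character (rule 11) are not handled here.

```python
import re

# One formula term such as "13C(6)": optional nucleon number, element
# symbol, and a signed count in parentheses (rules 7 and 12).
TERM = re.compile(r"(\d*)([A-Z][a-z]?)\((-?\d+)\)")


def expand_label(name):
    """Rewrite a 'Label:' name as the equivalent explicit 'Delta:' formula.

    Hypothetical helper: each labelled atom replaces one atom of the most
    abundant isotope, so 13C(6) becomes C(-6)13C(6).
    """
    keyword, sep, formula = name.partition(":")
    # Names are not case-sensitive (rule 3), so compare the keyword loosely.
    if sep != ":" or keyword.lower() != "label":
        raise ValueError(f"not a Label: name: {name!r}")

    terms = TERM.findall(formula)
    # Reject formulae that the pattern did not consume completely.
    rebuilt = "".join(f"{n}{e}({c})" for n, e, c in terms)
    if not terms or rebuilt != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")

    parts = []
    for nucleon, element, count in terms:
        if not nucleon:
            raise ValueError(f"term without nucleon number: {element}({count})")
        # Remove `count` atoms of the natural isotope, add the labelled ones.
        parts.append(f"{element}({-int(count)}){nucleon}{element}({count})")
    return "Delta:" + "".join(parts)


print(expand_label("Label:13C(6)"))  # Delta:C(-6)13C(6)
```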
|
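
As a minimal sketch, rules 2, 3, 4 and 10 can be checked mechanically. The function names below (`check_name`, `canonical_key`, `split_keyword`) are hypothetical and not part of any published tool; the character set and length limits are taken directly from the rules above.

```python
import re

# Rule 2: ASCII 0-9 A-Z a-z plus # ( ) * + , - . / : > [ ] { }
LEGAL = re.compile(r"[0-9A-Za-z#()*+,\-./:>\[\]{}]+")

# Rule 9: reserved keywords (compared case-insensitively, per rule 3).
RESERVED_KEYWORDS = {"delta", "label", "lipid", "cation", "xlink"}


def check_name(name):
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    if not 3 <= len(name) <= 64:
        problems.append("rule 4: length must be 3 to 64 characters")
    if not LEGAL.fullmatch(name):
        problems.append("rule 2: contains an illegal character")
    return problems


def canonical_key(name):
    """Rule 3: names are not case-sensitive, so compare on a folded key."""
    return name.casefold()


def split_keyword(name):
    """Rule 10: the first colon separates a keyword from the rest.

    Returns (keyword, remainder), or (None, name) if there is no colon.
    """
    head, sep, rest = name.partition(":")
    return (head, rest) if sep else (None, name)


print(check_name("Label:13C(9)+Phospho"))   # []
print(check_name("Phospho@S"))              # rule 2 violation
print(split_keyword("Delta:H(2)C(3)"))      # ('Delta', 'H(2)C(3)')
```

Note that rule 16 allows keywords other than the reserved ones (for example ICPL), so a caller that needs to distinguish the two cases can test `canonical_key(keyword) in RESERVED_KEYWORDS`.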
RECOMMENDATIONS |
1 | Names are not intended to be IUPAC-style systematic nomenclature (
http://www.chem.qmul.ac.uk/iupac/AminoAcid/ and
http://www.acdlabs.com/iupac/nomenclature/93/r93_296.htm).
The aim is a controlled list of common or trivial names. |
2 | Names should be semi-descriptive or, at least, recognizable. |
3 | If there is a generally accepted name or acronym for a modification, this should be adopted. |
4 | As far as possible, the length of a name should be 24 characters or less. |
5 | Names can be unambiguously delimited from application-specific prefixes and suffixes
using any of the illegal characters, especially space, underscore, and @ |
6 | As far as possible, there should be a one to one relationship between a name and a delta
of a given empirical formula. Exceptions are allowed where the structures are different or where there are
generally accepted names that convey additional useful information. |
7 | Use American English spelling (e.g. sulfate). |
8 | Use case to enhance readability. |
9 | Minimise the use of hyphens. They are not required for prefixes like di and tri.
Do not use a hyphen to indicate "loss of" because this is too ambiguous. |
10 | The preference is to name the modification as a moiety, not the reaction
or the reagent or the modified residue. For example, acetyl and not acetylation or acetyl-L-lysine.
This is only a preference, not a rule. It runs into difficulty when the modification is the
removal of a moiety, because it is difficult to name the absence, and it may be necessary to
use the reaction or process as the name. |
11 | Many modifications are essentially substitutions that are specific to a single residue.
These can be most clearly represented using an arrow syntax, e.g. Arg->Orn, Trp->Kynurenin.
This syntax must never use 1-letter codes for residues to avoid confusion with elements. |
12 | Try to avoid starting a name with something that might be mistaken for
a specificity, e.g. N- or C-, which are often used to indicate protein terminus PTMs. |
13 | Related modifications should be differentiated by suffixes rather than prefixes
so that they sort together in an alphabetical list. |
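
Some of the recommendations lend themselves to simple automated hints. The following Python sketch shows how recommendation 5 lets an application pull a name out of a decorated string, and how recommendations 4, 11 and 12 might be turned into soft warnings. The function names and the decorated formats (such as `Phospho@S`) are hypothetical; because these are recommendations rather than rules, the output is advisory only.

```python
import re

# Any run of characters outside the legal set of rule 2 acts as a delimiter
# (recommendation 5), e.g. space, underscore, or @.
ILLEGAL = re.compile(r"[^0-9A-Za-z#()*+,\-./:>\[\]{}]+")

# Arrow syntax for residue substitutions, e.g. Arg->Orn (recommendation 11).
ARROW = re.compile(r"([A-Za-z]+)->([A-Za-z]+)")


def split_decorated(text):
    """Split an application string into legal-character fields."""
    return [part for part in ILLEGAL.split(text) if part]


def soft_warnings(name):
    """Return advisory messages for recommendations 4, 11 and 12."""
    warnings = []
    if len(name) > 24:
        warnings.append("recommendation 4: longer than 24 characters")
    match = ARROW.fullmatch(name)
    if match and 1 in (len(match.group(1)), len(match.group(2))):
        warnings.append("recommendation 11: 1-letter residue code in arrow syntax")
    if re.match(r"[NC]-", name):
        warnings.append("recommendation 12: starts like a terminus specificity")
    return warnings


print(split_decorated("Phospho@S"))   # ['Phospho', 'S']
print(soft_warnings("Arg->Orn"))      # []
print(soft_warnings("R->Orn"))        # recommendation 11 warning
```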
Non-standard Amino Acid Residues
There is no universal set of three letter abbreviations for non-standard residues. This list is drawn from:
Aad | 2-aminoadipic acid |
Abu | 2-Aminobutyric acid |
Acm | Acetamidomethyl |
Aib | 2-Aminoisobutyric acid |
Bum | t-butyloxymethyl |
Cit | Citrulline |
Cmc | Carboxymethylcysteine |
Cya | cysteic acid |
Dha | dehydroalanine |
Dhb | Dehydroamino-2-butyric acid |
Gla | 4-carboxyglutamic acid |
Glp | pyroglutamic acid (also Pga, pGlu and |
Hse | homoserine |
Hsl | homoserine lactone |
Hyl | Hydroxylysine |
Hyp | Hydroxyproline |
Iva | Isovaline |
Nle | Norleucine |
Orn | Ornithine |
Pip | 2-Piperidinecarboxylic acid |
Pyr | pyruvic acid |
Sar | Sarcosine |
Sec | selenocysteine |