Nominative nomenclature…


Those who know one end of a biology textbook from the other will be aware that, for the purposes of convenience/brevity/secret codes (take your pick), we use one-letter and three-letter coding systems for amino acids (the building blocks of proteins).

In a paper I must have read several times over the years, entitled ‘The liveliest effusion of wit and humor’, the author, Jan Witkowski, describes some of the logic behind the one-letter codes:

The single letter amino acid code was devised in 1966 by an informal group led by Richard Eck, and the derivations of the letters are, for the most part, fairly clear [1]. For amino acids with a unique first letter, that letter is used; for example, I for isoleucine, M for methionine and V for valine. For amino acids with common first letters, that letter is used for the most common amino acid – A is used for alanine rather than aspartic acid, and L for leucine rather than lysine. That leaves a set of amino acids with a more cryptic one-letter notation. F for phenylalanine (Fenylalanine) and R for arginine (Rginine) are fairly obvious but why is W the letter for tryptophan? Eck explains this by stating that ‘tryptophan’ should be pronounced ‘twyptophan’ and, hence, ‘W’ is an appropriate symbol for it. The entry has an asterisk against it, leading the reader to a footnote: ‘My collaborators insist that I take full responsibility for this – R.V.E.’ Unfortunately, this explanation was omitted from later editions and ‘W’ is now supposed to represent the double ring system in tryptophan.

1. R.V. Eck, One- and three-letter amino acid abbreviations: mnemonics of the one-letter notation. In: R.V. Eck and M.O. Dayhoff, Editors, Atlas of Protein Sequence and Structure xiii, National Biomedical Research Foundation (1966).

This leaves the infamous five amino acids that most first-year biochemistry students forget: glutamic acid (E), asparagine (N), aspartic acid (D), glutamine (Q) and lysine (K).

Asparagine, at least, contains an ‘N’; glutamic acid takes its letter from the sound of the first syllable, ‘gluE’; aspartic acid is best pronounced with a US accent, ‘asparDic’ (it doesn’t work with an R.P. English accent). I have heard several explanations for why glutamine ended up with Q and lysine with K (the latter supposedly because K sits next to L in the alphabet, and L was already taken by leucine), but none really satisfies. Nonetheless, they’re taught by rote, and each generation of biochemists (et al.) is left to find its own reasoning.
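For anyone who would rather look the codes up than memorise them, here is a minimal Python sketch of the one-letter and three-letter codes for the 20 standard amino acids. The names `AMINO_ACIDS`, `cryptic_codes` and `to_three_letter` are my own, purely for illustration; the table itself is just the standard assignments discussed above.

```python
# One-letter code -> (three-letter code, full name) for the 20 standard amino acids.
# The variable and function names here are illustrative, not from any library.
AMINO_ACIDS = {
    "A": ("Ala", "alanine"),
    "R": ("Arg", "arginine"),
    "N": ("Asn", "asparagine"),
    "D": ("Asp", "aspartic acid"),
    "C": ("Cys", "cysteine"),
    "Q": ("Gln", "glutamine"),
    "E": ("Glu", "glutamic acid"),
    "G": ("Gly", "glycine"),
    "H": ("His", "histidine"),
    "I": ("Ile", "isoleucine"),
    "L": ("Leu", "leucine"),
    "K": ("Lys", "lysine"),
    "M": ("Met", "methionine"),
    "F": ("Phe", "phenylalanine"),
    "P": ("Pro", "proline"),
    "S": ("Ser", "serine"),
    "T": ("Thr", "threonine"),
    "W": ("Trp", "tryptophan"),
    "Y": ("Tyr", "tyrosine"),
    "V": ("Val", "valine"),
}


def cryptic_codes(table=AMINO_ACIDS):
    """Return the one-letter codes that are NOT the first letter of the name."""
    return sorted(
        code for code, (_, name) in table.items() if name[0].upper() != code
    )


def to_three_letter(sequence, table=AMINO_ACIDS):
    """Spell out a one-letter sequence using hyphen-separated three-letter codes."""
    return "-".join(table[residue][0] for residue in sequence.upper())


print(cryptic_codes())         # ['D', 'E', 'F', 'K', 'N', 'Q', 'R', 'W', 'Y']
print(to_three_letter("MKV"))  # Met-Lys-Val
```

The ‘cryptic’ list comes out as the infamous five plus F, R and W from the quoted passage, along with Y for tyrosine.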