Talk:Proteinogenic amino acid

Source: Wikipedia, the free encyclopedia.

Combining

Shouldn't this article and List of standard amino acids be combined & maybe renamed Standard amino acids? Gregogil 20:22, 5 April 2007 (UTC)[reply]

I don't see why not, but Standard amino acid might be preferable given the convention of singular article names. I've added the merge tags to both articles now.--Eloil 03:15, 19 April 2007 (UTC)[reply]
Yes merge. --Sadi Carnot 05:04, 4 June 2007 (UTC)[reply]

There's something to be said for keeping this shorter page for the discussion of what constitutes a standard/nonstandard amino acid, and leaving the list for just talking about the qualities of the 20 standard amino acids. Then, this page could expand out and link to pyrrolysine, selenocysteine, hydroxyproline, and the other weird cases that you don't want to confuse the average reader with, but that need to be talked about as a category. Currently, there isn't a "nonstandard amino acids" page as a clearinghouse for the exceptions. Reesei 19:20, 1 May 2007 (UTC)[reply]

I agree with Reesei for the same reasons. Let's not forget the non-standard proteinogenic amino acids. Antorjal 19:49, 2 August 2007 (UTC)[reply]

I agree in principle with the original proposal to merge the pages, but I would call the resulting page Proteinogenic amino acid, since this term has an implied meaning, whereas it's somewhat arbitrary what is "standard", even though the term is widely used (e.g. IUPAC doesn't actually define them as "standard" AFAIK, but on Google: "standard amino acids" 116,000 vs. "proteinogenic amino acids" 33,000 hits). Subsections could then summarize the distinctions between encoded (the 20), implied (RNA signal → SeCYS) and posttranslational incorporation. Article redirects elsewhere could make this the standard aggregator page for the synonymous concepts; the very useful Lists would get a section under properties; and the specialized information can branch from here. One advantage is that the page would be very generally useful for cross-referencing amino acid information from other pages. (E.g. the

Steipe (talk) 22:35, 23 April 2009 (UTC)[reply]

22 proteinogenic amino acids

I would like to propose that the 22 amino acids be listed as proteinogenic amino acids. This would make the most sense as a page header. I do agree that the 20 listed could be considered "standard" amino acids in the proteinogenic amino acid page. Therefore the standard amino acid page should be removed as it has been. I think "canonical" is a better descriptor in this case as found in a recent review. This review also supports 22 proteinogenic amino acids. Ambrogelly A, Palioura S, Söll D (2007). "Natural expansion of the genetic code". Nat Chem Biol. 3 (1): 29–35. PMID 17173027.

This article now asserts that there are '22 standard amino acids'. Surely there are 20 standard amino acids? The standard amino acids are those that are directly encoded by the genetic code. Nature is full of proteins containing modified amino acids. Pyrrolysine in particular is not found in eukaryotes. Well it's hardly a 'standard' amino acid then is it? Selenocysteine is a modified amino acid. While technically it may be correct to say that there are 22 'proteinogenic' amino acids, they are not at the same time 'standard' and I feel that the article should be redrafted to reflect this. Johnpretty010 (talk) 21:55, 17 June 2011 (UTC)[reply]

Remarks

The "Remarks" section at the end discusses the properties of each amino acid in protein, but also says whether each is "Essential in humans". This information is redundant with the previous table, to the extent it matches (it doesn't: see Histidine). And the two types of information don't really fit together anyway. If it were expanding on the previous table, it might make sense, but it actually has less information (nothing about conditionals). I'll remove it if no one objects. —Preceding unsigned comment added by Bennetto (talkcontribs) 23:44, 21 June 2008 (UTC)[reply]

Yes, the "Remarks" seems to be just a dumping-ground of info. I say feel free to do anything to merge or organize/separate out any of it into its own discussion/table. The (bio)chemical information in this section does seem useful but would be better to push it all somewhere else (maybe each amino-acid's own page?) instead of some cherry-picked highlights here (it's less "hard-facts" than the other data tables). DMacks (talk) 06:38, 23 June 2008 (UTC)[reply]

Molecular weights

The MW for Selenocysteine given here, 169.06, does not match page http://en.wikipedia.org/wiki/Selenocysteine 168.053 g/mol —Preceding unsigned comment added by 84.68.181.173 (talk) 16:46, 20 July 2008 (UTC)[reply]

Corrected to 168.053. I wonder if the former value was for some ionic form (though would probably require pH much lower than physiologic to get protonated)? DMacks (talk) 18:43, 20 July 2008 (UTC)[reply]

The MW for Selenocysteine seems inconsistent:

  • 168.053 in General chemical properties table
  • 168.0541 in Mass Spec table

I guess the mass of H2O should be added to the first value. 85.58.63.146 (talk) 23:34, 21 October 2009 (UTC)
Errh, no: subtracted from the second (in the MS table), as this seems to be the value of the free AA according to http://en.wikipedia.org/wiki/Selenocysteine 85.58.63.146 (talk) 23:38, 21 October 2009 (UTC)[reply]

Fixed --Kkmurray (talk) 15:20, 8 January 2010 (UTC)[reply]
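The free-amino-acid versus residue distinction raised above can be checked with a few lines of arithmetic. This is a minimal sketch, not the method used for the article's tables: the atomic masses are assumed, rounded average values, and the composition C3H7NO2Se for free selenocysteine is taken as given. The result depends on which atomic-weight table is used, so it need not reproduce the quoted 168.053 exactly.

```python
# Average atomic masses in g/mol. These are assumed, rounded values;
# a different atomic-weight table will shift the last digits.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Se": 78.971}


def molar_mass(composition):
    """Sum the atomic masses for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# Free selenocysteine, C3H7NO2Se (assumed composition of the free amino acid).
SELENOCYSTEINE = {"C": 3, "H": 7, "N": 1, "O": 2, "Se": 1}
WATER = {"H": 2, "O": 1}

free_mass = molar_mass(SELENOCYSTEINE)
# A residue inside a peptide chain has lost one water molecule
# relative to the free amino acid (peptide-bond condensation).
residue_mass = free_mass - molar_mass(WATER)

print(f"free amino acid: {free_mass:.3f} g/mol")
print(f"residue:         {residue_mass:.3f} g/mol")
```

A residue-mass table for mass spectrometry would normally use monoisotopic rather than average masses; the subtraction of one water is the same in either case.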

cysteine

Cysteine is not acidic, it's neutral. Who thought it was acidic, why, and can someone remove the incorrect information? 132.198.84.83 (talk) 19:36, 18 September 2008 (UTC)[reply]

Cys has a pKa of ~8, so could be considered acidic. Inside the cell it is not, since it is involved in disulfide bond formation, which is affected by disulfide exchange by glutathione and thioredoxin more than pH. This is why biochemistry books don't list it as acidic. Qchristensen (talk) 01:42, 20 March 2009 (UTC)[reply]
There's a reducing environment inside most cells, so intracellular Cys tends to be in the -SH form. Disulfides are found primarily in extracellular proteins where an oxidizing environment exists. It's a historical accident that the first two proteins to be sequenced were insulin and ribonuclease, both extracellular, and as a result the disulfide bond gets overemphasised in some textbooks. 96.54.32.44 (talk) 06:11, 16 February 2011 (UTC)[reply]

The "side chain properties" table lists cysteine as being hydrophobic (and not polar), but the cysteine page describes it as being polar and hydrophilic. I have also seen it described as polar in other sources (non-definitive). Should the table be changed? And4e (talk) 23:25, 25 October 2011 (UTC)[reply]

Amino acid residue masses

I added a table of residue masses for mass spectrometry. --Kkmurray (talk) 20:01, 6 January 2009 (UTC)
I updated the table to reflect current atomic-mass values. --ChiBeta (talk) 00:42, 13 December 2016 (UTC)[reply]

Hydroxyproline

Hydroxyproline is given as an example of a non-proteinogenic amino acid, but on the hydroxyproline page it is called proteinogenic. Could someone correct one of them? —Preceding unsigned comment added by 212.248.169.208 (talk) 18:40, 16 August 2009 (UTC)[reply]

Hydroxyproline and hydroxylysine are formed by in situ hydroxylation of Pro and Lys respectively after they have been incorporated into the polypeptide chain. 96.54.32.44 (talk) 06:07, 16 February 2011 (UTC)[reply]

What, no zwitterions?

Given the pKas of alpha amino and alpha carboxylate groups, the zwitterion form is by far and away the predominant form in aqueous pH 7 and crystalline states. The ratio of zwitterion to neutral is about 10^7 : 1. 96.54.32.44 (talk) 06:18, 16 February 2011 (UTC)[reply]
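The order of magnitude quoted above can be reproduced with the Henderson-Hasselbalch relation. The sketch below is a rough estimate under stated assumptions: the two groups are treated as ionizing independently, and the pKa values 2.3 (alpha-carboxyl) and 9.6 (alpha-amino) are assumed typical figures, not values taken from this thread. Measured tautomeric constants for specific amino acids may differ from this simple estimate.

```python
def zwitterion_to_neutral_ratio(pka_carboxyl, pka_amino, ph=7.0):
    """Estimate [zwitterion]/[uncharged form], treating the groups as independent.

    Zwitterion: COO- and NH3+.  Uncharged form: COOH and NH2.
    """
    # Henderson-Hasselbalch for the carboxyl group: [COO-]/[COOH]
    carboxylate_over_acid = 10 ** (ph - pka_carboxyl)
    # Henderson-Hasselbalch for the amino group: [NH3+]/[NH2]
    ammonium_over_amine = 10 ** (pka_amino - ph)
    # The product equals 10 ** (pka_amino - pka_carboxyl); the pH cancels out.
    return carboxylate_over_acid * ammonium_over_amine


# Assumed typical pKa values for the alpha-carboxyl and alpha-amino groups.
ratio = zwitterion_to_neutral_ratio(pka_carboxyl=2.3, pka_amino=9.6)
print(f"zwitterion : neutral is roughly {ratio:.1e} : 1")
```

Because the pH cancels, the estimate depends only on the gap between the two pKa values under this independence assumption.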

Occurrence in human proteins - molar or mass?

I couldn't see a reference to the source of information about occurrence in human proteins (Gene expression and biochemistry section). I assume these percentages are molar fractions, rather than by mass. Robin Whittle (talk) 11:39, 14 February 2013 (UTC)[reply]

N-formylmethionine

I wonder why N-formylmethionine is not considered a proteinogenic amino acid (or amino acid derivative). The AUG codon codes it in a normal way, and it has its own specific tRNA. It is used in protein biosynthesis in bacteria, mitochondria and chloroplasts (so also in eukaryotes) but not in archaea. Technically, it is an amino acid derivative rather than a normal amino acid, but it is coded in the genetic code and added into proteins by the normal mechanism. Therefore, there would be 23 proteinogenic amino acids. I think the case of N-formylmethionine should be explained in the article, whether it is considered an amino acid or just a derivative.--Miguelferig (talk) 19:10, 2 June 2013 (UTC)[reply]

Its own article calls it a proteinogenic AA and explains how it's encoded and involved in translation/transcription. But that article also says it's usually removed as part of the protein synthesis process, so it might be outside the definition of "proteinogenic" as used in the article here? This whole business of counting is pretty crazy and subject to interpretation/context (whose genetic code? etc.), I've only ever seen high-school and freshman-college teachers care. See also the next section. DMacks (talk) 21:16, 10 July 2013 (UTC)[reply]
The main argument to say that N-formylmethionine is a proteinogenic amino acid is that the AUG codon codes it. However, the AUG codon is the only codon for Methionine; this means that, if we accept N-formylmethionine as a proteinogenic AA for this reason, then Methionine isn't a proteinogenic AA anymore: only one of them can meet this condition at the same time. In either case, there are 22 AA, not 23. I'm not a specialist, so that's just what I understand by myself, but what do you think about it? --Grelot-de-Bois (talk) 03:29, 7 September 2014 (UTC)[reply]
N-Formylmethionine says that it "plays a crucial part in the protein synthesis of bacteria, mitochondria and chloroplasts. It is not used in cytosolic protein synthesis of eukaryotes". My reading of various snippets is "there are 21 proteinogenic AAs in eukaryotes" and "there are 23 proteinogenic AAs in prokaryotes". If that reading is correct, it seems like we should change the numbers listed in this article to include N-formylmethionine, or change its title and wording to make it clear that it's referring primarily to eukaryotes and delineate the references to prokaryotes. [1]--Wcoole (talk) 00:45, 27 October 2015 (UTC)[reply]

References

How many standard amino acids are there?

This article says, "There are 22 standard amino acids..." but the page on Amino acid says, "...20 of the 23 proteinogenic amino acids are encoded directly by triplet codons in the genetic code and are known as "standard" amino acids." These two statements cannot be simultaneously true. One of them must be wrong. There are either 22 or 23 standard amino acids. Can someone who knows what they are doing please edit this. Cottonshirtτ 11:30, 22 June 2013 (UTC)[reply]

This is still a real problem. Here, we can read: "There are 23 proteinogenic amino acids, but only 21 are encoded by the nuclear genes of eukaryotes. Of the 23, selenocysteine and pyrrolysine are incorporated into proteins by distinct post-translational biosynthetic mechanisms, and N-formylmethionine is often the initial amino acid of proteins in bacteria, mitochondria, and chloroplasts, but is often removed post-translationally."; and on the Amino acid page, we can read: "They (α-amino acids) include the 23 proteinogenic ("protein-building") amino acids, which combine into peptide chains ("polypeptides") to form the building-blocks of a vast array of proteins.", as well as "Aside from the 23 proteinogenic amino acids, there are many other amino acids that are called non-proteinogenic or non-standard". I found absolutely no reference to an existing 23rd amino acid on Google. Can someone confirm this? --Grelot-de-Bois (talk) 01:04, 7 September 2014 (UTC)[reply]


It looks like there are three issues in question. The confusion around them is muddling both this page and the Amino Acid page. The issues are:
1. Is Selenocysteine (Sec) triplet-encoded?
This page includes Sec in the list of eukaryotic, triplet-encoded AAs. With its inclusion, there are 21. This is inconsistent with the Amino Acid page, which states that "Twenty of the proteinogenic amino acids are encoded directly by triplet codons in the genetic code." Additionally, the image depicting how many AAs are triplet-encoded on this page is inconsistent with the inclusion of Sec.
2. Is N-Formylmethionine (fMet) a proteinogenic AA found in prokaryotes?
The Amino Acid page includes fMet in the list of proteinogenic AAs found in prokaryotes, listing 23. This page excludes it and lists 22.
3. What does "standard" mean?
It seems to have been used to refer to the 20 triplet-encoded AAs of eukaryotes. If we are to accept selenocysteine into this group, the "standard" amino acids would number 21. This affects the list of "standard" amino acid properties and the language opening the page.
I lack the knowledge to opine about any of this, just hoping to clarify the inconsistencies. - dcrowe13 20:17, 18 September 2015 (UTC) — Preceding unsigned comment added by Dcrowe2 (talkcontribs)
Well I do have an opinion and I think I can back it up:
1. Is Selenocysteine (Sec) triplet-encoded?
Not in any meaningful sense, it is only the pairing of a SECIS element with a nearby UGA codon that encodes Sec. A SECIS element alone has no coding role and a UGA alone encodes STOP. But note that Sec is directly incorporated into proteins during translation.
2. Is N-Formylmethionine (fMet) a proteinogenic AA found in prokaryotes?
Not really and no. The not really is that I can't find a convincing source for fMet being called proteinogenic (this might serve but a single ref is hardly definitive), plus I can find sources that imply that fMet isn't. Ref 2 from the pyrrolysine article describes Sec and Pyl (pyrrolysine) as the 21st and 22nd proteinogenic AAs but fMet and its role has been known far longer than Pyl. But note that fMet is directly incorporated into proteins during translation so it is unclear why it should be excluded. The no is that fMet is used by bacteria but not archaea, so prokaryotes would be misleading.
3. What does "standard" mean?
If it means anything it refers to the standard genetic code, which codes for 20 amino acids. Selenocysteine is not one of them. I wouldn't say that there are 20 standard amino acids, I would say that there are 20 amino acids in the standard genetic code.
TuxLibNit (talk) 05:32, 9 January 2016 (UTC)[reply]
Regarding the status of fMet, I think that it has traditionally been regarded as a variant of Met rather than as a distinct proteinogenic amino acid. This view is supported by its use of the same codon as Met and by the observation that the formyl group is typically removed post-translationally. This would leave 22 "codon-directed" proteinogenic amino acids, two of which require extra genetic information for their incorporation. It should be noted that other amino acids (e.g. 2-aminopentanoic acid, 2-aminohexanoic acid) may be mis-incorporated translationally into proteins and thus could be considered "proteinogenic". ChiBeta (talk) 02:09, 13 December 2016 (UTC)[reply]

I have to disagree with the above. The fact that it is commonly, but not always, removed post-translationally makes it a proteinogenic amino acid, because it can sometimes end up incorporated into a final protein. 2600:1700:A420:4C40:C4F5:A9FF:6032:706B (talk) 21:17, 4 January 2019 (UTC)[reply]

Beautifully Said

The introduction is one of the best written I have had the pleasure of reading. It is just what all Wikipedia science articles should be, but many fail at. It is a sterling example of how a collaborative writing effort can lead to clear, concise, complete prose. Thanks to all who contributed. Nick Beeson (talk) 14:13, 27 June 2013 (UTC)[reply]

Side-chain pKa vs pH

In the side-chain properties table, there are columns for pKa and pH. The former is a value, and the latter is either blank or a word/phrase such as strongly/weakly acidic/basic. But the pKa value trends don't agree with the pH descriptors. Values 3.90–8.18 are "acidic" and 10.46 is "weakly acidic", but 6.04/10.54/12.48 are "weakly basic"/"basic"/"strongly basic" respectively. That is, the "weakly acidic" value is well beyond the "weakly basic" one and almost the same as the "basic" one, and the "weakly basic" one is in the middle of the range of "acidic" ones. I assume the senses of acidic and basic aren't direct opposite equilibrium-constant ideas (not literally predominance of one or the other side of the same equilibrium equation), but this really needs to be explained clearly. DMacks (talk) 05:01, 12 December 2016 (UTC)[reply]

The description of acidic or basic depends on whether the functional group donates or accepts a proton. The "weak" vs "strong" classification relates to the charge state of the side-chain at neutral pH. Thus, Cys is a weak acid as it is predominantly uncharged at pH 7. The quoted pKa values are those that may be relevant to a protein's behaviour under experimental conditions such as chromatography or isoelectric focusing. Although the listed pKa values are reasonable, there is no reference for them and I don't know of any single publication that contains these exact values. If there isn't one, I'd like to revise them and quote an authoritative source.--ChiBeta (talk) 01:01, 13 December 2016 (UTC)[reply]

I've made changes and quoted values determined for residues within alanine pentapeptides, or at least with blocked N- & C-termini, and in dilute salt solution. These seem the best to represent typical values for the amino-acid residues in an unfolded protein under aqueous conditions. ChiBeta (talk) 01:18, 2 February 2017 (UTC)[reply]
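The link between a side-chain pKa and its charge state at neutral pH, as described in this thread, can be made concrete with the Henderson-Hasselbalch relation. The sketch below is only illustrative: the function name and the "acid"/"base" labels are hypothetical, and the pKa values are the ones quoted in the opening comment of this section, not the later revised ones.

```python
def fraction_charged(pka, ph, kind):
    """Fraction of a side chain carrying a charge at the given pH.

    kind="acid": the group donates a proton and is charged when deprotonated.
    kind="base": the group accepts a proton and is charged when protonated.
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


# (pKa, kind, descriptor) as quoted earlier in this section.
quoted = [
    (3.90, "acid", "acidic"),
    (10.46, "acid", "weakly acidic"),
    (6.04, "base", "weakly basic"),
    (10.54, "base", "basic"),
    (12.48, "base", "strongly basic"),
]

for pka, kind, label in quoted:
    share = fraction_charged(pka, ph=7.0, kind=kind)
    print(f"pKa {pka:5.2f} ({label}): {share:.4f} charged at pH 7")
```

Read this way, an acid with a high pKa and a base with a low pKa are both mostly uncharged at pH 7, which is consistent with both being labelled "weak" even though their pKa values lie on opposite sides of the strongly charged groups.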

Source for van-der-Waals volume of side-chains?

I'm having a hard time finding a source for the side-chain van-der-Waals volumes in the table on this page. To search, I use the AAindex database, which I think one could see as the canonical source for all things related to amino acid values. More specifically, I use the aaindex1 file, which is downloadable from the FTP server.

The format of the aaindex1 file makes it easy to search for values for alanine (or leucine) starting with, e.g., "6" via the Unix command line: rgrep "^[ ]\+6"

Example: the van-der-Waals volume for alanine is said to be "67" in the Wikipedia table. However, the grep given above does not find a single line in the aaindex1 file which has the required " 67" entry. Lots of "6.x" entries, one "60." entry, one "685." entry, but no "67".

To account for multiples of ten and rounding, I also looked for lines whose values would approximately match the alanine and arginine values: "67 148", as they are side by side in the aaindex file. E.g.: 'rgrep "^[ ]\+6[^ ]\+1" ' would find a line starting with " 6.6 14.8". But there, too, no luck.

The edit which brought the table into this Wikipedia entry was made by a merger in 2008 (here), but after that I cannot trace back further.

Ideas? — Preceding unsigned comment added by BaChev (talkcontribs) 19:54, 14 December 2017 (UTC)[reply]
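The search described above (alanine and arginine values side by side, allowing for rounding and powers of ten) may be easier to express numerically than with a regular expression. The sketch below is one possible approach, not a verified reader for the AAindex format: it only assumes that value lines begin with whitespace-separated numbers with alanine first and arginine second, as described in the post. The function names, the 5% tolerance, and the sample lines are all invented for illustration.

```python
import math


def first_two_values(line):
    """Return the first two whitespace-separated numbers on a line, or None."""
    fields = line.split()
    try:
        return float(fields[0]), float(fields[1])
    except (IndexError, ValueError):
        # Header lines, blank lines and non-numeric entries are skipped.
        return None


def matches_scaled(line, ala=67.0, arg=148.0, rel_tol=0.05):
    """True if the first two values match (ala, arg) up to a power of ten."""
    pair = first_two_values(line)
    if pair is None:
        return False
    for exponent in range(-3, 4):
        scale = 10.0 ** exponent
        if (math.isclose(pair[0] * scale, ala, rel_tol=rel_tol)
                and math.isclose(pair[1] * scale, arg, rel_tol=rel_tol)):
            return True
    return False


# Invented sample lines, only to show the behaviour; not real AAindex data.
sample = [
    "H EXAMPLE000001",
    "    6.6    14.8    11.4",
    "    1.8    -4.5    -3.5",
    "   67.    148.     96.",
]
hits = [line for line in sample if matches_scaled(line)]
print(hits)
```

To scan a real file, the same test could be applied to each line read from a local copy, for example `[l for l in open(path) if matches_scaled(l)]`, where `path` is wherever the downloaded file was saved.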

Hydropathy of tyrosine

Tyrosine is listed in the table of properties here, and in the table of the mRNA genetic code in that article, as a polar (i.e. hydrophilic) amino acid; but the illustration in the "Structures" section of this article, and the description in the compound's own article, both classify it as hydrophobic. The real issue is that there's no hard line between hydrophilic and hydrophobic, and tyrosine is less hydrophobic than others while still being basically more hydrophobic than hydrophilic. But if we're going to do a binary classification, we ought to do it consistently - and the tyrosine article, and the illustration in this article, both are cited, while the tables seem not to be cited. 2607:FEA8:12A0:44D:0:0:0:C319 (talk) 02:03, 24 May 2020 (UTC)[reply]