Nuclear magnetic resonance spectroscopy of proteins
Nuclear magnetic resonance spectroscopy of proteins (usually abbreviated protein NMR) is a field of
, among others. Structure determination by NMR spectroscopy usually consists of several phases, each using a separate set of highly specialized techniques. The sample is prepared, measurements are made, interpretive approaches are applied, and a structure is calculated and validated. Currently most samples are examined in a solution in water, but
A typical study might involve how two proteins interact with each other, possibly with a view to developing small molecules that can be used to probe the normal biology of the interaction ("
Sample preparation
Protein nuclear magnetic resonance is performed on aqueous samples of highly purified protein. Usually, the sample consists of between 300 and 600 microlitres with a protein concentration in the range 0.1 – 3 millimolar. The protein can either be isolated from a natural source or be produced in an expression system using recombinant DNA techniques. Recombinantly expressed proteins are usually easier to produce in sufficient quantity, and this method makes isotopic labeling possible.[citation needed]
The purified protein is usually dissolved in a buffer solution and adjusted to the desired solvent conditions. The NMR sample is prepared in a thin-walled glass tube.[citation needed]
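As a rough illustration of what the volumes and concentrations quoted above mean in practice, the following sketch converts them into a mass of protein. The 20 kDa molecular weight and the 500 microlitre volume are hypothetical example values, not figures from any particular study.

```python
def protein_mass_mg(volume_ul: float, conc_mm: float, mw_kda: float) -> float:
    """Mass of protein (mg) needed for one NMR sample.

    volume_ul: sample volume in microlitres
    conc_mm:   protein concentration in millimolar
    mw_kda:    molecular weight in kilodaltons (assumed example input)
    """
    moles = (volume_ul * 1e-6) * (conc_mm * 1e-3)  # litres * mol/litre
    grams = moles * (mw_kda * 1000.0)              # mol * g/mol
    return grams * 1000.0                          # g -> mg


if __name__ == "__main__":
    # Hypothetical 20 kDa protein in a 500 microlitre sample
    for conc in (0.1, 1.0, 3.0):
        mass = protein_mass_mg(500, conc, 20)
        print(f"{conc} mM -> {mass:.1f} mg")
```

For this hypothetical protein, the quoted concentration range corresponds to roughly one to a few tens of milligrams per sample, which is one reason recombinant expression is usually preferred.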
Data collection
Protein NMR utilizes multidimensional nuclear magnetic resonance experiments to obtain information about the protein. Ideally, each distinct nucleus in the molecule experiences a distinct electronic environment and thus has a distinct
Depending on the concentration of the sample, the magnetic field of the spectrometer, and the type of experiment, a single multidimensional nuclear magnetic resonance experiment on a protein sample may take hours or even several days to reach a suitable signal-to-noise ratio through signal averaging, and to allow for sufficient evolution of magnetization transfer through the various dimensions of the experiment. Other things being equal, higher-dimensional experiments take longer than lower-dimensional experiments.[citation needed]
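The cost of signal averaging can be estimated with a short calculation. The sketch below assumes the standard rule that the signal-to-noise ratio grows with the square root of the number of averaged scans; the function name and the example numbers are illustrative only.

```python
import math


def scans_needed(current_scans: int, current_snr: float, target_snr: float) -> int:
    """Scans required to reach target_snr, assuming SNR scales as sqrt(scans)."""
    factor = (target_snr / current_snr) ** 2
    return math.ceil(current_scans * factor)


if __name__ == "__main__":
    # Hypothetical test run: 16 scans gave a signal-to-noise ratio of 10
    n = scans_needed(16, 10.0, 20.0)
    print(f"Doubling the signal-to-noise ratio needs {n} scans instead of 16")
```

Under this assumption, doubling the signal-to-noise ratio quadruples the measurement time, which is why dilute samples can push an experiment from hours to days.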
Typically, the first experiment to be measured with an isotope-labelled protein is a 2D
Resonance assignment
To analyze the nuclear magnetic resonance data, it is important to obtain a resonance assignment for the protein, that is, to find out which chemical shift corresponds to which atom. This is typically achieved by sequential walking using information derived from several different types of NMR experiment. The exact procedure depends on whether the protein is isotopically labelled or not, since many of the assignment experiments depend on carbon-13 and nitrogen-15.[citation needed]
Homonuclear nuclear magnetic resonance
With unlabelled protein the usual procedure is to record a set of two-dimensional homonuclear nuclear magnetic resonance experiments through
An important problem with homonuclear nuclear magnetic resonance is overlap between peaks. This occurs when different protons have the same or very similar chemical shifts. The problem becomes greater as the protein becomes larger, so homonuclear nuclear magnetic resonance is usually restricted to small proteins or peptides.[citation needed]
Nitrogen-15 nuclear magnetic resonance
The most commonly performed 15N experiment is the 1H-15N HSQC. The experiment is highly sensitive and can therefore be performed relatively quickly. It is often used to check the suitability of a protein for structure determination using NMR, as well as for the optimization of the sample conditions. It is one of the standard suite of experiments used for the determination of the solution structure of a protein. The HSQC can be further expanded into three- and four-dimensional NMR experiments, such as 15N-TOCSY-HSQC and 15N-NOESY-HSQC.[5]
Carbon-13 and nitrogen-15 nuclear magnetic resonance
When the protein is labelled with carbon-13 and nitrogen-15 it is possible to record triple resonance experiments that transfer magnetisation over the peptide bond, and thus connect different spin systems through bonds.[6][7] This is usually done using some of the following experiments: HNCO, HN(CA)CO, HNCA,[8] HN(CO)CA, HNCACB and CBCA(CO)NH. All six experiments consist of a 1H-15N plane (similar to an HSQC spectrum) expanded with a carbon dimension. In the HN(CA)CO, each HN plane contains the peaks from the carbonyl carbon of its own residue as well as that of the preceding one in the sequence. The HNCO contains the carbonyl carbon chemical shift from only the preceding residue, but is much more sensitive than the HN(CA)CO. These experiments allow each 1H-15N peak to be linked to the preceding carbonyl carbon, and sequential assignment can then be undertaken by matching the shifts of each spin system's own and previous carbons. The HNCA and HN(CO)CA work similarly, just with the alpha carbons (Cα) rather than the carbonyls, and the HNCACB and the CBCA(CO)NH contain both the alpha carbon and the beta carbon (Cβ). Usually several of these experiments are required to resolve overlap in the carbon dimension. This procedure is usually less ambiguous than the NOESY-based method since it is based on through-bond transfer. In the NOESY-based methods, additional peaks corresponding to atoms that are close in space but do not belong to sequential residues will appear, confusing the assignment process. Following the initial sequential resonance assignment, it is usually possible to extend the assignment from the Cα and Cβ to the rest of the sidechain using experiments such as HCCH-TOCSY, which is essentially a TOCSY experiment resolved in an additional carbon dimension.
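The matching of each spin system's own and previous carbon shifts can be sketched as a small linking procedure. This is a minimal illustration, not the algorithm of any particular assignment program: the SpinSystem class, the 0.2 ppm tolerance, and all chemical shift values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinSystem:
    """One 1H-15N peak with the carbon shifts seen in its planes (ppm)."""
    name: str
    ca_own: float   # own C-alpha, e.g. from HNCA / HNCACB
    cb_own: float   # own C-beta
    ca_prev: float  # preceding C-alpha, e.g. from HN(CO)CA / CBCA(CO)NH
    cb_prev: float  # preceding C-beta


def predecessors(target, systems, tol=0.2):
    """Spin systems whose own shifts match the target's 'previous' shifts."""
    return [s for s in systems
            if s is not target
            and abs(s.ca_own - target.ca_prev) <= tol
            and abs(s.cb_own - target.cb_prev) <= tol]


def link_fragments(systems, tol=0.2):
    """Chain spin systems into fragments using only unambiguous links."""
    prev_of = {}
    for s in systems:
        cands = predecessors(s, systems, tol)
        if len(cands) == 1:
            prev_of[s.name] = cands[0].name
    # Discard predecessors claimed by more than one successor
    claims = {}
    for succ, pred in prev_of.items():
        claims.setdefault(pred, []).append(succ)
    next_of = {p: succs[0] for p, succs in claims.items() if len(succs) == 1}
    linked = set(next_of.values())
    fragments = []
    for s in systems:
        if s.name in linked:
            continue  # has a predecessor, so it is not a fragment start
        frag = [s.name]
        while frag[-1] in next_of:
            frag.append(next_of[frag[-1]])
        fragments.append(frag)
    return fragments


if __name__ == "__main__":
    # Hypothetical, unordered spin systems
    systems = [
        SpinSystem("C", 61.7, 38.4, 54.3, 41.0),
        SpinSystem("A", 58.1, 30.2, 62.0, 69.5),
        SpinSystem("D", 52.5, 19.1, 61.6, 38.4),
        SpinSystem("B", 54.3, 40.9, 58.1, 30.2),
    ]
    print(link_fragments(systems))
```

Requiring both Cα and Cβ to match is what reduces ambiguity; with a single carbon dimension several candidates would often fall within the tolerance, which is why several experiments are normally combined.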
Restraint generation
To perform structure calculations, a number of experimentally determined restraints have to be generated. These fall into different categories; the most widely used are distance restraints and angle restraints.
Distance restraints
A crosspeak in a
It is of great importance to assign the NOESY peaks to the correct nuclei based on the chemical shifts. If this task is performed manually it is very labor-intensive, since proteins typically have thousands of NOESY peaks. Some computer programs such as PASD
To obtain assignments that are as accurate as possible, it is a great advantage to have access to carbon-13 and nitrogen-15 NOESY experiments, since they help to resolve overlap in the proton dimension. This leads to faster and more reliable assignments, and in turn to better structures.
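The benefit of an additional heteronuclear dimension can be illustrated with a toy matching routine. The sketch below is not the method of any named program; the atom labels, shift values, and tolerances are hypothetical. It lists every pair of assigned protons whose shifts agree with a NOESY peak, then filters the candidates by the nitrogen-15 shift of the attached amide nitrogen.

```python
def candidate_pairs(peak, h_shifts, tol=0.02):
    """All proton pairs whose shifts match a 2D NOESY peak (w1, w2) in ppm."""
    w1, w2 = peak
    first = [a for a, ppm in h_shifts.items() if abs(ppm - w1) <= tol]
    second = [a for a, ppm in h_shifts.items() if abs(ppm - w2) <= tol]
    return [(a, b) for a in first for b in second if a != b]


def filter_by_nitrogen(pairs, n_ppm, n_shifts, tol=0.3):
    """Keep pairs whose first proton is bound to a nitrogen at n_ppm."""
    return [(a, b) for a, b in pairs
            if a in n_shifts and abs(n_shifts[a] - n_ppm) <= tol]


if __name__ == "__main__":
    # Hypothetical assigned proton shifts (ppm)
    h_shifts = {"A5.HN": 8.21, "L12.HN": 8.22, "V30.HA": 4.10, "K7.HB2": 1.85}
    # Hypothetical 15N shifts of the nitrogens attached to the amide protons
    n_shifts = {"A5.HN": 121.3, "L12.HN": 117.8}

    pairs = candidate_pairs((8.215, 4.10), h_shifts)
    print("2D candidates:", pairs)  # two amide protons overlap
    print("with 15N at 117.9 ppm:", filter_by_nitrogen(pairs, 117.9, n_shifts))
```

In the proton dimension alone the two amide protons are indistinguishable, whereas their nitrogen shifts differ by several ppm and leave a single candidate.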
Angle restraints
In addition to distance restraints, restraints on the torsion angles of the chemical bonds, typically the psi and phi angles, can be generated. One approach is to use the Karplus equation to generate angle restraints from coupling constants. Another approach uses the chemical shifts to generate angle restraints. Both methods use the fact that the geometry around the alpha carbon affects the coupling constants and chemical shifts, so given the coupling constants or the chemical shifts, an educated guess can be made about the torsion angles.
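The Karplus approach can be sketched numerically. The Karplus equation has the form J = A cos²θ + B cos θ + C. For the coupling between the amide proton and the alpha proton, θ is commonly taken as phi − 60°. The coefficients used below (A = 6.51, B = −1.76, C = 1.60 Hz) are one commonly quoted parametrisation and should be treated as an assumption to be checked against the literature; the tolerance and the one-degree grid are arbitrary illustrative choices.

```python
import math

# Assumed Karplus coefficients in Hz for the HN-HA coupling (verify before use)
A, B, C = 6.51, -1.76, 1.60


def j_hn_ha(phi_deg: float) -> float:
    """Predicted three-bond HN-HA coupling (Hz) for a backbone phi angle."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def compatible_phi(j_obs: float, tol: float = 0.5):
    """Phi angles (whole degrees) whose predicted coupling is within tol of j_obs."""
    return [phi for phi in range(-180, 180)
            if abs(j_hn_ha(phi) - j_obs) <= tol]


if __name__ == "__main__":
    print(f"phi = -60:  J = {j_hn_ha(-60):.2f} Hz")
    print(f"phi = -120: J = {j_hn_ha(-120):.2f} Hz")
    angles = compatible_phi(9.8)
    print(f"phi values compatible with 9.8 Hz: {angles[0]} to {angles[-1]}")
```

Because the equation is not one-to-one, a measured coupling generally corresponds to a range of angles, sometimes several disjoint ranges, which is why the result is used as a restraint rather than as an exact angle.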
Orientation restraints
The analyte molecules in a sample can be partially ordered with respect to the external magnetic field of the spectrometer by manipulating the sample conditions. Common techniques include addition of
Hydrogen–deuterium exchange
NMR spectroscopy is nucleus specific. Thus, it can distinguish between hydrogen and deuterium. The amide protons in the protein exchange readily with the solvent, and, if the solvent contains a different isotope, typically deuterium, the reaction can be monitored by NMR spectroscopy. How rapidly a given amide exchanges reflects its solvent accessibility. Thus amide exchange rates can give information on which parts of the protein are buried, hydrogen-bonded, etc. A common application is to compare the exchange of a free form versus a complex. The amides that become protected in the complex are assumed to be in the interaction interface.
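A comparison of free and bound forms can be sketched as follows, assuming that the amide peak intensity decays as a single exponential after transfer into deuterated solvent. The first-order model, the time points, and the rate constants are illustrative assumptions; real data would contain noise and would normally be fitted with a proper nonlinear routine.

```python
import math


def exchange_rate(times, intensities):
    """Estimate a first-order exchange rate k from I(t) = I0 * exp(-k * t).

    Uses a least-squares straight-line fit to ln(intensity) against time.
    """
    ys = [math.log(i) for i in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    return -slope


if __name__ == "__main__":
    times = [0, 10, 20, 40, 80]  # minutes, hypothetical sampling
    # Synthetic, noise-free intensities for one amide in two states
    free = [100 * math.exp(-0.05 * t) for t in times]
    bound = [100 * math.exp(-0.005 * t) for t in times]

    k_free = exchange_rate(times, free)
    k_bound = exchange_rate(times, bound)
    print(f"free: {k_free:.4f} /min, complex: {k_bound:.4f} /min")
    print(f"slowdown on binding: {k_free / k_bound:.1f}-fold")
```

An amide whose exchange slows markedly in the complex, as in this synthetic example, would be a candidate for the interaction interface.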
Structure calculation
The experimentally determined restraints can be used as input for the structure calculation process. Researchers, using computer programs such as
Structure validation
The ensemble of structures obtained is an "experimental model", i.e., a representation of a certain kind of experimental data. Acknowledging this is important because it means that the model could be a good or bad representation of that experimental data.[22] In general, the quality of a model will depend on both the quantity and quality of the experimental data used to generate it and on the correct interpretation of such data.
It is important to remember that every experiment has associated errors. Random errors will affect the reproducibility and precision of the resulting structures. If the errors are systematic, the accuracy of the model will be affected. The precision indicates the degree of reproducibility of the measurement and is often expressed as the variance of the measured data set under the same conditions. The accuracy, however, indicates the degree to which a measurement approaches its "true" value.
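The distinction between precision and accuracy can be made concrete with a small numerical example. The measurements and the "true" value below are invented for illustration; in practice, as with protein structures, the true value is usually unknown.

```python
import statistics


def precision_and_accuracy(measurements, true_value):
    """Return (variance, absolute bias) for repeated measurements.

    The variance reflects precision (reproducibility); the distance of
    the mean from the true value reflects accuracy.
    """
    variance = statistics.variance(measurements)
    bias = abs(statistics.mean(measurements) - true_value)
    return variance, bias


if __name__ == "__main__":
    true_distance = 5.0  # hypothetical true value, arbitrary units

    systematic = [5.9, 6.0, 6.1]  # tightly clustered but offset
    random_only = [4.0, 5.0, 6.0]  # scattered but centred on the true value

    for label, data in (("systematic error", systematic),
                        ("random error", random_only)):
        var, bias = precision_and_accuracy(data, true_distance)
        print(f"{label}: variance = {var:.2f}, bias = {bias:.2f}")
```

The first data set is precise but inaccurate, the second accurate but imprecise, mirroring the effects of systematic and random errors described above.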
Ideally, a model of a protein is more accurate the better it fits the actual molecule it represents, and more precise the less uncertainty there is about the positions of its atoms. In practice there is no "standard molecule" against which to compare models of proteins, so the accuracy of a model is given by the degree of agreement between the model and a set of experimental data. Historically, the structures determined by NMR have been, in general, of lower quality than those determined by X-ray diffraction. This is due, in part, to the lower amount of information contained in data obtained by NMR. Because of this, it has become common practice to establish the quality of NMR ensembles by comparing them against the unique conformation determined by X-ray diffraction for the same protein. However, the X-ray diffraction structure may not exist, and, since proteins in solution are flexible molecules, representing a protein by a single structure may lead to an underestimate of the intrinsic variation of its atomic positions. A set of conformations, determined by NMR or X-ray crystallography, may be a better representation of the experimental data of a protein than a unique conformation.[23]
The utility of a model is given, at least in part, by its degree of accuracy and precision. An accurate model with relatively poor precision could be useful for studying the evolutionary relationships between the structures of a set of proteins, whereas rational drug design requires models that are both precise and accurate. A model that is not accurate, regardless of the degree of precision with which it was obtained, will not be very useful.[22][24]
Since protein structures are experimental models that can contain errors, it is very important to be able to detect these errors. The process aimed at the detection of errors is known as validation. There are several methods to validate structures: some are statistical, such as PROCHECK and WHAT IF, while others are based on physical principles, such as CheShift, or on a mixture of statistical and physical principles, such as PSVS.
Dynamics
In addition to structures,
NMR spectroscopy on large proteins
Traditionally, nuclear magnetic resonance spectroscopy has been limited to relatively small proteins or protein domains. This is in part caused by problems resolving overlapping peaks in larger proteins, but this has been alleviated by the introduction of isotope labelling and multidimensional experiments. Another more serious problem is the fact that in large proteins the magnetization relaxes faster, which means there is less time to detect the signal. This in turn causes the peaks to become broader and weaker, and eventually disappear. Two techniques have been introduced to attenuate the relaxation:
Automation of the process
Structure determination by NMR has traditionally been a time-consuming process, requiring interactive analysis of the data by a highly trained scientist. There has been considerable interest in automating the process to increase the throughput of structure determination and to make protein NMR accessible to non-experts (see structural genomics). The two most time-consuming steps are the sequence-specific resonance assignment (backbone and side-chain assignment) and the NOE assignment. Several computer programs have been published that target individual parts of the overall NMR structure determination process in an automated fashion. Most progress has been achieved in automated NOE assignment. So far, only the FLYA and UNIO approaches have been proposed to perform the entire protein NMR structure determination process in an automated manner without any human intervention.[13][14] Modules in NMRFAM-SPARKY such as APES (two-letter-code: ae), I-PINE/PINE-SPARKY (two-letter-code: ep; I-PINE web server) and PONDEROSA (two-letter-code: c3, up; PONDEROSA web server) are integrated so that it offers full automation with visual verification capability at each step.[29] Efforts have also been made to standardize the structure calculation protocol to make it quicker and more amenable to automation.[30] More recently, the POKY suite, the successor to the programs mentioned above, has been released to provide modern GUI tools and AI/ML features.[31]
See also
- NMR spectroscopy
- Nuclear magnetic resonance
- Nuclear magnetic resonance spectroscopy of carbohydrates
- Nuclear magnetic resonance spectroscopy of nucleic acids
- Protein crystallization
- Protein dynamics
- Relaxation (NMR)
- X-ray crystallography
References
- S2CID 26153265.
- ISBN 9780470034590.
- PMID 2266107.
- PMID 2676353.
- PMID 2047852.
- .
- .
- S2CID 20037190.
- PMID 15149223.
- PMID 18668206.
- PMID 12565051.
- .
- ISBN 978-0470034590.
- PMID 15318003.
- PMID 17121777.
- S2CID 33910776.
- PMID 27169728.
- PMID 25190042.
- PMID 27023095.
- PMID 15317993.
- PMID 31522945.
- ^ )
- PMID 19564690.
- .
- PMID 9356455.
- PMID 7952934.
- S2CID 2451574.
- PMID 23583779.
- PMID 25505092.
- PMID 16027363.
- PMID 33715003.
Further reading
- Hitchens TK, Rule GS (2005). Fundamentals of Protein NMR Spectroscopy (Focus on Structural Biology). Berlin: Springer. ISBN 978-1-4020-3499-2.
- Teng Q (2005). Structural biology: practical NMR applications. Berlin: Springer. ISBN 978-0-387-24367-2.
- Rance M, Cavanagh J, Fairbrother WJ, Hunt III AW, Skelton NJ (2007). Protein NMR spectroscopy: principles and practice (2nd ed.). Boston: Academic Press. ISBN 978-0-12-164491-8.
- Wüthrich K (1986). NMR of proteins and nucleic acids. New York: Wiley. ISBN 978-0-471-82893-8.
External links
- NOESY-Based Strategy for Assignments of Backbone and Side Chain Resonances of Large Proteins without Deuteration (a protocol)
- relax Software for the analysis of NMR dynamics
- ProSA-web (archived 2011-05-11 at the Wayback Machine): web service for the recognition of errors in experimentally or theoretically determined protein structures
- Protein structure determination from sparse experimental data - an introductory presentation
- Protein NMR Protein NMR experiments