Docking (molecular)
In the field of
![](http://upload.wikimedia.org/wikipedia/commons/thumb/9/97/Docking_representation_2.png/340px-Docking_representation_2.png)
The associations between biologically relevant molecules such as
Molecular docking is one of the most frequently used methods in
Definition of problem
One can think of molecular docking as a problem of “lock-and-key”, in which one wants to find the correct relative orientation of the “key” which will open up the “lock” (where on the surface of the lock is the key hole, which direction to turn the key after it is inserted, etc.). Here, the protein can be thought of as the “lock” and the ligand can be thought of as a “key”. Molecular docking may be defined as an optimization problem, which would describe the “best-fit” orientation of a ligand that binds to a particular protein of interest. However, since both the ligand and the protein are flexible, a “hand-in-glove” analogy is more appropriate than “lock-and-key”.[4] During the course of the docking process, the ligand and the protein adjust their conformation to achieve an overall "best-fit" and this kind of conformational adjustment resulting in the overall binding is referred to as "induced-fit".[5]
Molecular docking research focuses on computationally simulating the molecular recognition process. It aims to achieve an optimized conformation for both the protein and ligand and relative orientation between protein and ligand such that the free energy of the overall system is minimized.
Docking approaches
Two approaches are particularly popular within the molecular docking community.
- One approach uses a matching technique that describes the protein and the ligand as complementary surfaces.[6][7][8]
- The second approach simulates the actual docking process in which the ligand-protein pairwise interaction energies are calculated.[9]
Both approaches have significant advantages as well as some limitations. These are outlined below.
Shape complementarity
Geometric matching/shape complementarity methods describe the protein and ligand as a set of features that make them dockable.
Simulation
Simulating the docking process is much more complicated. In this approach, the protein and the ligand are separated by some physical distance, and the ligand finds its position in the protein's active site after a certain number of “moves” in its conformational space. The moves incorporate rigid body transformations such as translations and rotations, as well as internal changes to the ligand's structure including torsion angle rotations. Each of these moves in the conformational space of the ligand changes the total energy of the system. Hence, the system's total energy is calculated after every move.
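The move-and-evaluate loop described above can be sketched as a small Monte Carlo search. This is only an illustration under stated assumptions: the energy function is a toy harmonic term (not a real force field), only translations and rotations about the z axis are sampled (torsion moves are omitted), and the Metropolis acceptance rule is one common choice rather than the method of any particular docking program. All function names are hypothetical.

```python
import math
import random

def toy_energy(pose, pocket):
    """Toy energy (assumption): squared distance of each ligand atom from its matched pocket point."""
    return sum(math.dist(a, p) ** 2 for a, p in zip(pose, pocket))

def translate(pose, shift):
    """Rigid-body translation of every atom by the same vector."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in pose]

def rotate_z(pose, angle):
    """Rigid-body rotation about the z axis through the ligand centroid."""
    n = len(pose)
    cx = sum(p[0] for p in pose) / n
    cy = sum(p[1] for p in pose) / n
    c, s = math.cos(angle), math.sin(angle)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy), z) for x, y, z in pose]

def monte_carlo_dock(ligand, pocket, steps=3000, max_shift=0.5, max_turn=0.3, kT=0.5, seed=0):
    """Propose random moves and recompute the energy after every move."""
    rng = random.Random(seed)
    pose, energy = list(ligand), toy_energy(ligand, pocket)
    best_pose, best_energy = pose, energy
    for _ in range(steps):
        if rng.random() < 0.5:
            shift = tuple(rng.uniform(-max_shift, max_shift) for _ in range(3))
            trial = translate(pose, shift)
        else:
            trial = rotate_z(pose, rng.uniform(-max_turn, max_turn))
        trial_energy = toy_energy(trial, pocket)
        delta = trial_energy - energy
        # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            pose, energy = trial, trial_energy
            if energy < best_energy:
                best_pose, best_energy = pose, energy
    return best_pose, best_energy
```

Because every move is a rigid-body transformation, interatomic distances within the ligand are preserved; a flexible-ligand search would add torsion rotations to the move set.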
The obvious advantage of docking simulation is that ligand flexibility is easily incorporated, whereas shape complementarity techniques must use ingenious methods to incorporate flexibility in ligands. Simulation also models reality more accurately, whereas shape complementarity techniques are more of an abstraction.
Clearly, simulation is computationally expensive, having to explore a large energy landscape. Grid-based techniques, optimization methods, and increased computer speed have made docking simulation more realistic.
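The idea behind grid-based techniques can be illustrated as follows: the interaction energy of a probe atom with the protein is precomputed once on a lattice, so that scoring a pose becomes a table lookup instead of a sum over all protein atoms. The sketch below assumes a 12-6 Lennard-Jones term with made-up parameters and a nearest-grid-point lookup; real programs typically use several atom-type-specific grids and interpolate between grid points. The function names are hypothetical.

```python
import math

def pair_energy(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy for one probe-atom pair (toy parameters, an assumption)."""
    r = max(r, 0.5)  # clamp very short distances to keep the value finite
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def build_grid(protein_atoms, origin, spacing, shape):
    """Precompute the probe energy at every lattice point (done once per protein)."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = sum(pair_energy(math.dist(point, a))
                                      for a in protein_atoms)
    return grid

def grid_score(ligand_atoms, grid, origin, spacing):
    """Score a pose by nearest-grid-point lookup; atoms outside the grid raise KeyError."""
    total = 0.0
    for atom in ligand_atoms:
        index = tuple(round((atom[d] - origin[d]) / spacing) for d in range(3))
        total += grid[index]
    return total
```

The cost of `build_grid` grows with the number of protein atoms, but it is paid once; each subsequent call to `grid_score` depends only on the number of ligand atoms.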
Mechanics of docking
![](http://upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Overview_docking.png/340px-Overview_docking.png)
To perform a docking screen, the first requirement is a structure of the protein of interest. Usually the structure has been determined using a biophysical technique such as X-ray crystallography, NMR spectroscopy or cryo-electron microscopy (cryo-EM), but can also derive from homology modeling construction. This protein structure and a database of potential ligands serve as inputs to a docking program. The success of a docking program depends on two components: the search algorithm and the scoring function.
Search algorithm
The
A variety of conformational search strategies have been applied to the ligand and to the receptor. These include:
- systematic or stochastic torsional searches about rotatable bonds
- molecular dynamics simulations
- genetic algorithms, which "evolve" new low energy conformations and in which the score of each pose acts as the fitness function used to select individuals for the next iteration.
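The genetic-algorithm strategy in the list above can be sketched as follows. Each individual is a list of torsion angles, and the pose score acts as the fitness used to select individuals for the next generation. The scoring function here is a toy stand-in (an assumption), and truncation selection, one-point crossover and Gaussian mutation are illustrative choices, not the operators of any specific docking program.

```python
import math
import random

def toy_score(torsions, targets):
    """Toy score (assumption): zero when every torsion matches its target angle; lower is better."""
    return sum(1.0 - math.cos(t - g) for t, g in zip(torsions, targets))

def evolve(targets, pop_size=30, generations=60, mutation_sd=0.3, seed=1):
    """Evolve torsion-angle vectors; the score of each individual is its fitness."""
    rng = random.Random(seed)
    n = len(targets)
    population = [[rng.uniform(-math.pi, math.pi) for _ in range(n)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Selection: rank by score and keep the better half unchanged.
        population.sort(key=lambda ind: toy_score(ind, targets))
        history.append(toy_score(population[0], targets))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = mother[:cut] + father[cut:]                        # one-point crossover
            child = [a + rng.gauss(0.0, mutation_sd) for a in child]   # mutation
            children.append(child)
        population = survivors + children
    best = min(population, key=lambda ind: toy_score(ind, targets))
    return best, toy_score(best, targets), history
```

Because the surviving half is carried over unchanged, the best score recorded in `history` can never get worse from one generation to the next.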
Ligand flexibility
Conformations of the ligand may be generated in the absence of the receptor and subsequently docked,[15] or conformations may be generated on-the-fly in the presence of the receptor binding cavity,[16] or with full rotational flexibility of every dihedral angle using fragment based docking.[17] Force field energy evaluations are most often used to select energetically reasonable conformations,[18] but knowledge-based methods have also been used.[19]
Peptides are both highly flexible and relatively large-sized molecules, which makes modeling their flexibility a challenging task. A number of methods were developed to allow for efficient modeling of flexibility of peptides during protein-peptide docking.[20]
Receptor flexibility
Computational capacity has increased dramatically over the last decade, making possible the use of more sophisticated and computationally intensive methods in computer-assisted drug design. However, dealing with receptor flexibility in docking methodologies is still a thorny issue.[21] The main reason behind this difficulty is the large number of degrees of freedom that have to be considered in this kind of calculation. Neglecting receptor flexibility, however, may in some cases lead to poor docking results in terms of binding pose prediction.[22]
Multiple static structures experimentally determined for the same protein in different conformations are often used to emulate receptor flexibility.[23] Alternatively, rotamer libraries of amino acid side chains that surround the binding cavity may be searched to generate alternate but energetically reasonable protein conformations.[24][25]
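The use of multiple static structures can be sketched as a loop over receptor conformations that keeps the best-scoring result. The `dock` argument is a hypothetical callable standing in for any docking routine that returns a score (lower taken as better, an assumption); it is not the interface of a real package.

```python
def ensemble_dock(ligand, receptors, dock):
    """Dock one ligand against several receptor conformations and keep the best score.

    receptors: mapping from a conformation name to a structure object.
    dock: hypothetical callable (ligand, structure) -> score, lower is better.
    """
    scores = {name: dock(ligand, structure) for name, structure in receptors.items()}
    best = min(scores, key=scores.get)
    return best, scores[best], scores
```

Taking the minimum over conformations is only one way to combine the results; averaging or consensus ranking are alternatives.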
Scoring function
Docking programs generate a large number of potential ligand poses, of which some can be immediately rejected due to clashes with the protein. The remainder are evaluated using some scoring function, which takes a pose as input and returns a number indicating the likelihood that the pose represents a favorable binding interaction and ranks one ligand relative to another.
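The two steps described above, rejecting clashing poses and ranking the remainder, can be sketched as follows. The 2.0 Å clash distance is an arbitrary illustrative cutoff, `score` is a hypothetical callable for which lower values are assumed to be better, and poses are plain lists of (x, y, z) coordinates.

```python
import math

def has_clash(pose, protein_atoms, min_distance=2.0):
    """True if any ligand atom lies closer than min_distance to any protein atom."""
    return any(math.dist(atom, p) < min_distance
               for atom in pose for p in protein_atoms)

def rank_poses(poses, protein_atoms, score):
    """Drop clashing poses, then sort the rest by score (lower assumed better)."""
    kept = [pose for pose in poses if not has_clash(pose, protein_atoms)]
    return sorted(kept, key=score)
```

Filtering before scoring avoids spending the more expensive scoring step on poses that are sterically impossible.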
Most scoring functions are physics-based molecular mechanics force fields that estimate the energy of the pose within the binding site. The various contributions to binding can be written as an additive equation:
$$\Delta G_{bind} = \Delta G_{solvent} + \Delta G_{conf} + \Delta G_{int} + \Delta G_{rot} + \Delta G_{t/r} + \Delta G_{vib}$$
The components consist of solvent effects ($\Delta G_{solvent}$), conformational changes in the protein and ligand ($\Delta G_{conf}$), free energy due to protein-ligand interactions ($\Delta G_{int}$), internal rotations ($\Delta G_{rot}$), association energy of ligand and receptor to form a single complex ($\Delta G_{t/r}$) and free energy due to changes in vibrational modes ($\Delta G_{vib}$).[26] A low (negative) energy indicates a stable system and thus a likely binding interaction.
Alternative approaches use modified scoring functions to include constraints based on known key protein-ligand interactions,[27] or knowledge-based potentials derived from interactions observed in large databases of protein-ligand structures (e.g. the Protein Data Bank).[28]
There are a large number of structures from
One way to reduce the number of false positives is to recalculate the energy of the top scoring poses using (potentially) more accurate but computationally more intensive techniques such as Generalized Born or Poisson-Boltzmann methods.[9]
Docking assessment
The interdependence between sampling and scoring function affects the docking capability in predicting plausible poses or binding affinities for novel compounds. Thus, an assessment of a docking protocol is generally required (when experimental data is available) to determine its predictive capability. Docking assessment can be performed using different strategies, such as:
- docking accuracy (DA) calculation;
- the correlation between a docking score and the experimental response or determination of the enrichment factor (EF);[29]
- the distance between an ion-binding moiety and the ion in the active site;
- the presence of induced-fit models.
Docking accuracy
Docking accuracy[30][31] is one measure of the fitness of a docking program: it quantifies the program's ability to predict the right pose of a ligand with respect to that experimentally observed.[32]
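Comparing a predicted pose with the experimentally observed one is commonly done with the root-mean-square deviation (RMSD) of atomic positions. The sketch below assumes that both poses list the same atoms in the same order and are already in the same coordinate frame, and it ignores molecular symmetry. The 2.0 Å success threshold is a widely used convention adopted here as an assumption; it is not necessarily the definition of docking accuracy used in the cited works.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with identical atom ordering."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))

def success_rate(predicted_poses, reference_poses, threshold=2.0):
    """Fraction of ligands whose predicted pose lies within threshold of the reference."""
    hits = sum(1 for p, r in zip(predicted_poses, reference_poses)
               if rmsd(p, r) <= threshold)
    return hits / len(reference_poses)
```

For example, a one-atom pose at (3, 4, 0) compared with a reference at the origin has an RMSD of 5 Å and would count as a failure under the 2.0 Å threshold.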
Enrichment factor
Docking screens can also be evaluated by the enrichment of annotated known binders from among a large database of presumed non-binding, “decoy” molecules.[29] In this way, the success of a docking screen is evaluated by its capacity to enrich the small number of known active compounds in the top ranks of a screen from among a much greater number of decoy molecules in the database. The area under the receiver operating characteristic (ROC) curve is widely used to evaluate the performance of a screen.
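Both measures can be computed from a ranked list of labels, best-scoring compound first, where 1 marks a known active and 0 a decoy. In the sketch below the enrichment factor is taken as the fraction of actives in the top portion of the ranking divided by the fraction of actives in the whole database, and the ROC area is computed as the probability that a randomly chosen active is ranked above a randomly chosen decoy. The function names and the default 1% cutoff are illustrative assumptions, and tied scores are not handled.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Hit rate in the top `fraction` of the ranking divided by the overall hit rate."""
    n = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_top = max(1, int(round(n * fraction)))
    top_actives = sum(ranked_labels[:n_top])
    return (top_actives / n_top) / (n_actives / n)

def roc_auc(ranked_labels):
    """Area under the ROC curve for a ranking without ties (1 = active, 0 = decoy)."""
    n_actives = sum(ranked_labels)
    n_decoys = len(ranked_labels) - n_actives
    decoys_seen = 0
    ordered_pairs = 0
    for label in ranked_labels:
        if label:
            # Count the decoys ranked below this active.
            ordered_pairs += n_decoys - decoys_seen
        else:
            decoys_seen += 1
    return ordered_pairs / (n_actives * n_decoys)
```

An area of 1.0 means that every active outranks every decoy, whereas 0.5 corresponds to a random ordering.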
Prospective
Resulting hits from docking screens are subjected to pharmacological validation (e.g. IC50, affinity or potency measurements). Only prospective studies constitute conclusive proof of the suitability of a technique for a particular target.[33] In the case of G protein-coupled receptors (GPCRs), which are targets of more than 30% of marketed drugs, molecular docking led to the discovery of more than 500 GPCR ligands.[34]
Benchmarking
The potential of docking programs to reproduce binding modes as determined by X-ray crystallography can be assessed by a range of docking benchmark sets.
For small molecules, several benchmark data sets for docking and virtual screening exist, e.g. the Astex Diverse Set consisting of high quality protein−ligand X-ray crystal structures,[35] the Directory of Useful Decoys (DUD) for evaluation of virtual screening performance,[29] and the LEADS-FRAG data set for fragments.[36]
The potential of docking programs to reproduce peptide binding modes can be assessed with the Lessons for Efficiency Assessment of Docking and Scoring (LEADS-PEP) benchmark.[37]
Applications
A binding interaction between a small molecule ligand and an enzyme protein may result in activation or inhibition of the enzyme. If the protein is a receptor, ligand binding may result in agonism or antagonism. Docking is most commonly used in the field of drug design. Most drugs are small organic molecules, and docking may be applied to:
- hit identification – docking combined with a scoring function can be used to quickly screen large databases of potential drugs in silico to identify molecules that are likely to bind to the protein target of interest (see virtual screening). Reverse pharmacology routinely uses docking for target identification.
- lead optimization – docking can be used to predict where and in which relative orientation a ligand binds to a protein (also referred to as the binding mode or pose). This information may in turn be used to design more potent and selective analogs.
- bioremediation – protein ligand docking can also be used to predict pollutants that can be degraded by enzymes.[38][39]
See also
- Drug design
- Katchalski-Katzir algorithm
- List of molecular graphics systems
- Macromolecular docking
- Molecular mechanics
- Protein structure
- Protein design
- Software for molecular mechanics modeling
- List of protein-ligand docking software
- Molecular design software
- Docking@Home
- Ibercivis
- ZINC database
- Lead Finder
- Virtual screening
- Scoring functions for docking
References
- PMID 8804827.
- S2CID 1069493.
- S2CID 164985789.
- PMID 1719636.
- Wei BQ, Weaver LH, Ferrari AM, Matthews BW, Shoichet BK (Apr 2004). "Testing a flexible-receptor docking algorithm in a model binding site". Journal of Molecular Biology. 337 (5): 1161–82. PMID 15046985.
- PMID 10651041.
- S2CID 97778840.
- .
- S2CID 3191066.
- S2CID 42749294.
- PMID 11858640.
- PMID 15728116.
- PMID 17337005.
- PMID 31540192.
- S2CID 8834526.
- PMID 15027865.
- PMID 16860582.
- PMID 17786192.
- S2CID 206768542.
- PMID 29733895.
- S2CID 6589810.
- S2CID 36656063.
- PMID 18302984.
- S2CID 36088213.
- S2CID 15814316.
- PMID 8544170.
- S2CID 232340746.
- PMID 10623530.
- PMID 17154509.
- PMID 26682916.
- S2CID 12569345.
- PMID 30039402.
- S2CID 26260725.
- S2CID 245163594.
- PMID 17300160.
- PMID 33289563.
- PMID 26651532.
- PMID 17606396.
- S2CID 63136337.
External links
- Bikadi Z, Kovacs S, Demko L, Hazai E. "Molecular Docking Server - Ligand Protein Docking & Molecular Modeling". Virtua Drug Ltd. Retrieved 2008-07-15.
Internet service that calculates the site, geometry and energy of small molecules interacting with proteins
- Malinauskas T. "Step by step installation of MGLTools 1.5.2 (AutoDockTools, Python Molecular Viewer and Visual Programming Environment) on Ubuntu Linux 8.04". Archived from the original on 2009-02-26. Retrieved 2008-07-15.
- Docking@GRID Archived 2019-12-31 at the Wayback Machine Project of Conformational Sampling and Docking on Grids : one aim is to deploy some intrinsic distributed docking algorithms on computational Grids, download Docking@GRID open-source Linux version
- Click2Drug.org - Directory of computational drug design tools.
- Ligand:Receptor Docking Archived 2019-02-02 at the Wayback Machine with MOE (Molecular Operating Environment)