DNA-binding domain
A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence (a recognition sequence) or have a general affinity for DNA.[1] Some DNA-binding domains may also include nucleic acids in their folded structure.
Function
One or more DNA-binding domains are often part of a larger
DNA-binding domains with functions involving DNA structure have biological roles in DNA replication, repair, storage, and modification, such as methylation.
Many proteins involved in the
The DBD interacts with the
Many DNA-binding domains must recognize specific DNA sequences, such as DBDs of
The specificity of DNA-binding proteins can be studied using many biochemical and biophysical techniques, such as gel electrophoresis, analytical ultracentrifugation, calorimetry, DNA mutation, protein structure mutation or modification, nuclear magnetic resonance, X-ray crystallography, surface plasmon resonance, electron paramagnetic resonance, cross-linking, and microscale thermophoresis (MST).
DNA-binding protein in genomes
A large fraction of the genes in each genome encode DNA-binding proteins (see Table). However, these proteins belong to a relatively small number of protein families. For instance, more than 2000 of the ~20,000 human proteins are "DNA-binding", including about 750 zinc-finger proteins.[3]
Species | DNA-binding proteins[4] | DNA-binding families[4] |
---|---|---|
Arabidopsis thaliana (thale cress) | 4471 | 300 |
Saccharomyces cerevisiae (yeast) | 720 | 243 |
Caenorhabditis elegans (worm) | 2028 | 271 |
Drosophila melanogaster (fruit fly) | 2620 | 283 |
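The contrast between many DNA-binding proteins and few DNA-binding families can be made concrete by dividing one column of the table by the other. The following is a minimal sketch that uses only the counts listed above; the function name `proteins_per_family` is introduced here for illustration.

```python
# Counts copied from the table above:
# species -> (DNA-binding proteins, DNA-binding families).
table = {
    "Arabidopsis thaliana": (4471, 300),
    "Saccharomyces cerevisiae": (720, 243),
    "Caenorhabditis elegans": (2028, 271),
    "Drosophila melanogaster": (2620, 283),
}


def proteins_per_family(proteins, families):
    """Average number of DNA-binding proteins per DNA-binding family."""
    return proteins / families


ratios = {
    species: proteins_per_family(proteins, families)
    for species, (proteins, families) in table.items()
}

for species, ratio in ratios.items():
    print(f"{species}: {ratio:.1f} proteins per family")
```

The family counts vary far less between species than the protein counts do, so most of the difference shows up in the average family size.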
Types
Helix-turn-helix
Originally discovered in bacteria, the
Zinc finger
The zinc finger domain is mostly found in eukaryotes, but some examples have been found in bacteria.[5] The zinc finger domain is generally between 23 and 28 amino acids long and is stabilized by coordinating zinc ions with regularly spaced zinc-coordinating residues (either histidines or cysteines). The most common class of zinc finger (Cys2His2) coordinates a single zinc ion and consists of a recognition helix and a two-stranded beta-sheet.[6] In transcription factors these domains are often found in arrays (usually separated by short linker sequences), and adjacent fingers are spaced at 3-base-pair intervals when bound to DNA.
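The regular spacing of the zinc-coordinating residues makes Cys2His2 fingers easy to search for with a pattern. The sketch below is illustrative only: the regular expression is modeled on a commonly quoted PROSITE-style consensus, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, which is an assumption not taken from this article, and it covers only the zinc-coordinating core rather than the full domain. The sequence `toy_protein` and the helper names are hypothetical.

```python
import re

# Assumed consensus for the Cys2His2 core (not from the article):
# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
C2H2_CORE = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2_fingers(protein):
    """Return (start, matched_text) for each non-overlapping core match."""
    return [(m.start(), m.group()) for m in C2H2_CORE.finditer(protein.upper())]


def footprint_bp(n_fingers, bp_per_finger=3):
    """DNA length covered by an array, at 3 base pairs per finger."""
    return n_fingers * bp_per_finger


# Hypothetical toy sequence: two finger-like cores joined by a short linker.
toy_protein = (
    "AA"
    "CPVESCDRRFSRSDELTRHIRIH"  # finger-like core 1
    "TGEKP"                    # linker
    "CAACGGGFSSSSSSSSHAAAH"    # finger-like core 2
    "GG"
)

hits = find_c2h2_fingers(toy_protein)
for start, core in hits:
    print(start, core)
print("footprint:", footprint_bp(len(hits)), "bp")
```

A pattern match of this kind only flags candidates; it says nothing about whether the matched region actually folds or binds zinc.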
Leucine zipper
The basic leucine zipper (bZIP) domain is found mainly in eukaryotes and to a limited extent in bacteria. The bZIP domain contains an alpha helix with a leucine at every 7th amino acid. If two such helices find one another, the leucines can interact like the teeth of a zipper, allowing dimerization of two proteins. When binding to DNA, basic amino acid residues bind to the sugar-phosphate backbone while the helices sit in the major groove. Proteins containing bZIP domains regulate gene expression.
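The "leucine at every 7th amino acid" rule can be checked directly on a sequence. The following is a rough sketch under simplifying assumptions: it requires a strict leucine at each heptad position, whereas real zippers tolerate substitutions, and the function name, the `min_repeats` threshold, and the toy sequence are all hypothetical.

```python
def leucine_heptad_runs(protein, min_repeats=4):
    """Return (start, count) for runs of leucines spaced exactly 7 residues apart."""
    protein = protein.upper()
    runs = []
    for start, residue in enumerate(protein):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if start >= 7 and protein[start - 7] == "L":
            continue
        count = 1
        pos = start + 7
        while pos < len(protein) and protein[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs


# Hypothetical toy sequence with leucines at positions 2, 9, 16, 23 and 30.
toy_zipper = "MK" + "LEDKVEE" * 4 + "LG"
print(leucine_heptad_runs(toy_zipper))
```

Each reported tuple gives the zero-based position of the first leucine and the number of consecutive heptad positions occupied by leucine.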
Winged helix
Consisting of about 110 amino acids, the winged helix (WH) domain has four helices and a two-strand beta-sheet.
Winged helix-turn-helix
The winged
Helix-loop-helix
The
HMG-box
HMG-box domains are found in high mobility group proteins, which are involved in a variety of DNA-dependent processes such as replication and transcription. They also alter the flexibility of the DNA by inducing bends.[7][8] The domain consists of three alpha helices separated by loops.
Wor3 domain
Wor3 domains, named after the White–Opaque Regulator 3 (Wor3) in Candida albicans, arose more recently in evolutionary time than most previously described DNA-binding domains and are restricted to a small number of fungi.[9]
OB-fold domain
The OB-fold is a small structural motif originally named for its oligonucleotide/oligosaccharide binding properties. OB-fold domains range between 70 and 150 amino acids in length.[10] OB-folds bind single-stranded DNA, so proteins that contain them are classed as single-stranded DNA-binding proteins.[10]
OB-fold proteins have been identified as critical for
Unusual
Immunoglobulin fold
The immunoglobulin domain (InterPro: IPR013783) consists of a beta-sheet structure with large connecting loops, which serve to recognize either DNA major grooves or antigens. Usually found in immunoglobulin proteins, these domains are also present in Stat proteins of the cytokine pathway. This is likely because the cytokine pathway evolved relatively recently and made use of systems that were already functional, rather than evolving new ones.
B3 domain
The
TAL effector
RNA-guided
The CRISPR/Cas system of Streptococcus pyogenes can be programmed to direct both activation[20] and repression to natural and artificial eukaryotic promoters through the simple engineering of guide RNAs with base-pairing complementarity to target DNA sites.[21] Cas9 can be used as a customizable RNA-guided DNA-binding platform. The Cas9 domain can be functionalized with regulatory domains of interest (e.g., activation, repression, or epigenetic effector) or with an endonuclease domain as a versatile tool for genome engineering,[22][23] and can then be targeted to multiple loci using different guide RNAs.
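Targeting by base-pairing complementarity can be illustrated with a simple string search. The sketch below assumes exact matching between the guide's spacer and the DNA, and it additionally requires the NGG protospacer-adjacent motif (PAM) used by S. pyogenes Cas9, a requirement not discussed in the text above. The function names, the spacer, and the toy DNA are hypothetical, and real guide design also has to consider mismatches and off-target sites.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def find_target_sites(dna, spacer):
    """Find exact spacer matches followed by an NGG PAM on either strand.

    Returns (strand, start, pam) tuples. For the "-" strand, start is an
    index into the reverse complement of the input, not the input itself.
    """
    dna = dna.upper()
    spacer = spacer.upper().replace("U", "T")  # accept RNA-style spacers
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        start = seq.find(spacer)
        while start != -1:
            pam = seq[start + len(spacer):start + len(spacer) + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                hits.append((strand, start, pam))
            start = seq.find(spacer, start + 1)
    return hits


# Hypothetical 20-nucleotide spacer and toy target DNA.
spacer = "GACGTTACCGGATTCAGCTA"
toy_dna = "TTT" + spacer + "TGG" + "AAAA"
print(find_target_sites(toy_dna, spacer))
```

Retargeting the same protein to another locus only requires passing a different `spacer`, which mirrors how different guide RNAs redirect Cas9.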
See also
- For a structural classification of DNA-binding domains present in land plant genomes, see [24]
- Comparison of nucleic acid simulation software
References
- ISBN 0-19-963453-X.
- PMID 19269243.
- "reviewed:yes AND organism:"Homo sapiens (Human) [9606]" AND proteome:up000005640 in UniProtKB". www.uniprot.org. Retrieved 2017-10-25.
- PMID 23775796.
- PMID 26365095.
- PMID 11395410.
- PMID 25063301.
- PMID 28303166.
- PMID 23610392.
- PMID 20515430.
- PMID 12598368.
- PMID 19400638.
- S2CID 6648530.
- S2CID 206522347.
- PMID 22223736.
- PMID 26027871.
- PMID 23692030.
- PMID 24452192.
- PMID 26481363.
- PMID 23892895.
- PMID 23977949.
- S2CID 10165663.
- PMID 24076990.
- .
External links
- DBD database of predicted transcription factors, which uses a curated set of DNA-binding domains to predict transcription factors in all completely sequenced genomes. Kummerfeld SK, Teichmann SA (January 2006). "DBD: a transcription factor prediction database". Nucleic Acids Research. 34 (Database issue): D74-81. PMID 16381970.
- Table of DNA-binding motifs
- DNA Footprinting at the U.S. National Library of Medicine Medical Subject Headings (MeSH)
- DNA-Binding Proteins at the U.S. National Library of Medicine Medical Subject Headings (MeSH)
- DNA-binding domains in PROSITE