Transcription factor
In
TFs work alone or with other proteins in a complex, by promoting (as an
A defining feature of TFs is that they contain at least one
TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can be potentially targeted toward them.
Number
Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene.[14]
The human genome encodes approximately 2800 proteins that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors,[3] though other studies indicate a smaller number.[15] Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development.[13]
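The combinatorial argument can be illustrated with a quick count. The sketch below is illustrative only: the figure of about 20,000 genes is an assumption that does not appear in the text above, and the calculation treats every set of factors as equally usable, which real regulatory biology does not guarantee.

```python
from math import comb

N_FACTORS = 2000   # approximate number of human transcription factors (from the text)
N_GENES = 20_000   # ASSUMED gene count, used only for comparison


def distinct_combinations(n_factors: int, k: int) -> int:
    """Number of unordered sets of k different factors chosen from n_factors."""
    return comb(n_factors, k)


for k in (1, 2, 3):
    n = distinct_combinations(N_FACTORS, k)
    print(f"{k} factor(s) per gene: {n:,} possible sets "
          f"({n / N_GENES:,.1f} per assumed gene)")
```

Even with only two or three cooperating factors per gene, the number of possible sets far exceeds the assumed gene count, which is the sense in which combinatorial use "easily accounts for" gene-specific regulation.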
Mechanism
Transcription factors bind to either
- stabilize or block the binding of RNA polymerase to DNA[citation needed]
- catalyze the acetylation or deacetylation of histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription:[17]
  - histone acetyltransferase (HAT) activity – acetylates histone proteins, which weakens the association of DNA with histones and makes the DNA more accessible to transcription, thereby up-regulating transcription
  - histone deacetylase (HDAC) activity – deacetylates histone proteins, which strengthens the association of DNA with histones and makes the DNA less accessible to transcription, thereby down-regulating transcription
- recruit corepressor proteins to the transcription factor DNA complex[18]
Function
Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:
Basal transcriptional regulation
In
Differential enhancement of transcription
Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.[citation needed]
Development
Many transcription factors in
Response to intercellular signals
Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade.[27] Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: Estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes.[28]
Response to environment
Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include
Cell cycle control
Many transcription factors, especially some that are
Pathogenesis
Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example of this is the family of transcription activator-like effectors (
Regulation
It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:
Synthesis
Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.[39]
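The negative feedback loop described above can be sketched numerically. The model below is an assumption chosen for illustration, not something taken from the cited source: production follows a repressive Hill function, `beta / (1 + (x/K)**n)`, degradation is first order with rate `alpha`, and all parameter values are arbitrary.

```python
def simulate_autorepression(beta=10.0, K=1.0, n=2, alpha=1.0,
                            x0=0.0, dt=0.01, steps=2000):
    """Euler integration of dx/dt = beta / (1 + (x/K)**n) - alpha * x.

    x is the (hypothetical) concentration of a transcription factor
    that represses transcription of its own gene.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        production = beta / (1.0 + (x / K) ** n)  # falls as x rises
        degradation = alpha * x
        x += dt * (production - degradation)
        trajectory.append(x)
    return trajectory


# With K = infinity the repression term vanishes (no feedback).
regulated = simulate_autorepression()
unregulated = simulate_autorepression(K=float("inf"))
print(f"with autorepression:    {regulated[-1]:.3f}")
print(f"without autorepression: {unregulated[-1]:.3f}")
```

With these illustrative parameters the self-repressing factor settles at a level well below the unregulated one, which is the behaviour the text describes as maintaining low levels of a transcription factor.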
Nuclear localization
In
Activation
Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:
- ligand binding – Not only is ligand binding able to influence where a transcription factor is located within a cell but ligand binding can also affect whether the transcription factor is in an active state and capable of binding DNA or other cofactors (see, for example, nuclear receptors).
- phosphorylation[41][42] – Many transcription factors, such as STAT proteins, must be phosphorylated before they can bind DNA.
- interaction with other transcription factors (e.g., homo- or hetero-dimerization) or coregulatory proteins[citation needed]
Accessibility of DNA-binding site
In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, in which about 147 DNA base pairs make ~1.65 turns around a histone protein octamer. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, the so-called pioneer factors, are still able to bind their DNA binding sites on nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers.[43] Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins.[44] Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.[citation needed]
Availability of other cofactors/transcription factors
Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization.[45] For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.[46]
Interaction with methylated cytosine
Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5' to 3' DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription,[47] while methylation of CpGs in the body of a gene increases expression.[48] TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.[49]
The DNA binding sites of 519 transcription factors were evaluated.[50] Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.[citation needed]
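The tallies in the paragraph above can be checked with a few lines of arithmetic. The category labels in this sketch are shortened paraphrases introduced here for readability; only the counts come from the text.

```python
# Counts as reported in the text; labels are abbreviated paraphrases.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "effect depends on CpG position": 25,
}


def percentages(category_counts):
    """Share of each category, rounded to the nearest whole percent."""
    total = sum(category_counts.values())
    return {name: round(100 * n / total) for name, n in category_counts.items()}


total = sum(counts.values())
shares = percentages(counts)
print(f"total transcription factors: {total}")
for name, pct in shares.items():
    print(f"{pct:>3}%  {name}")
```

The five categories add up to the 519 factors evaluated. Because each share is rounded independently, the rounded percentages need not sum to exactly 100.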
TET enzymes do not specifically bind to methylcytosine except when recruited (see
Structure
Transcription factors are modular in structure and contain the following
- response elements.
- An optional signal-sensing domain (SSD) (e.g., a ligand-binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression. Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.
DNA-binding domain
The portion (
Family | InterPro | Pfam | SCOP |
---|---|---|---|
basic helix-loop-helix[54] | IPR001092 | PF00010 | 47460 |
basic-leucine zipper (bZIP)[55] | IPR004827 | PF00170 | 57959 |
C-terminal effector domain of the bipartite response regulators | IPR001789 | PF00072 | 46894 |
AP2/ERF/GCC box | IPR001471 | PF00847 | 54176 |
helix-turn-helix[56] | | | |
homeodomain proteins, which are encoded by homeobox genes, are transcription factors. Homeodomain proteins play critical roles in the regulation of development.[57][58] | IPR009057 | PF00046 | 46689 |
lambda repressor-like | IPR010982 | | 47413 |
srf-like (serum response factor) | IPR002100 | PF00319 | 55455 |
paired box[59] | | | |
winged helix | IPR013196 | PF08279 | 46785 |
zinc fingers[60] | | | |
* multi-domain Cys2His2 zinc fingers[61] | IPR007087 | PF00096 | 57667 |
* Zn2/Cys6 | | | 57701 |
* Zn2/Cys8 nuclear receptor zinc finger | IPR001628 | PF00105 | 57716 |
Response elements
The DNA sequence that a transcription factor binds to is called a
Transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction.[citation needed]
For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.[citation needed]
Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in a living cell.
Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.
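The idea that a factor tolerates closely related sequences, and that short sites arise by chance in long DNA, can be sketched as follows. This is a deliberately crude model: counting mismatches against a consensus is an assumption made here for simplicity (it ignores differences in binding strength between positions), and the chance estimate assumes random DNA with equal base frequencies on a single strand.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(sequence: str, consensus: str, max_mismatches: int = 2):
    """Return (position, site, mismatches) for each window of the sequence
    that differs from the consensus at no more than max_mismatches bases."""
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        d = hamming(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


def expected_exact_matches(seq_length: int, k: int) -> float:
    """Expected exact occurrences of one k-base site on one strand of
    random DNA in which all four bases are equally likely."""
    return (seq_length - k + 1) / 4 ** k


TBP_CONSENSUS = "TATAAAA"
for variant in ("TATATAT", "TATATAA"):
    print(variant, "mismatches:", hamming(variant, TBP_CONSENSUS))
print(find_sites("GGCTATAAAAGGC", TBP_CONSENSUS))
print(expected_exact_matches(3_000_000, len(TBP_CONSENSUS)))
```

Both variant sequences named in the text fall within two mismatches of the consensus. The last line shows why sequence alone is a weak predictor: under these assumptions a 7-base site is expected to occur many times by chance in a few million bases.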
Clinical significance
Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.
Disorders
Due to their important roles in development, intercellular signaling, and the cell cycle, some human diseases have been associated with mutations in transcription factors.[63]
Many transcription factors are either
Below are a few of the better-studied examples:
Condition | Description | Locus |
---|---|---|
Rett syndrome | Mutations in the MECP2 transcription factor are associated with Rett syndrome, a neurodevelopmental disorder.[65][66] | Xq28 |
Diabetes | A rare form of insulin promoter factor-1 (IPF1/Pdx1).[68] | multiple |
Developmental verbal dyspraxia | Mutations in the FOXP2 transcription factor are associated with developmental verbal dyspraxia, a disease in which individuals are unable to produce the finely coordinated movements required for speech.[69] | 7q31 |
Autoimmune diseases | Mutations in the IPEX.[70] | Xp11.23-q13.3 |
Li-Fraumeni syndrome | Caused by mutations in the tumor suppressor p53.[71] | 17p13.1 |
Breast cancer | The STAT family is relevant to breast cancer.[72] | multiple |
Multiple cancers | The HOX family are involved in a variety of cancers.[73] | multiple |
Osteoarthritis | Mutation or reduced activity of SOX9[74] | |
Potential drug targets
Approximately 10% of currently prescribed drugs directly target the
Role in evolution
Gene duplications have played a crucial role in the
Role in biocontrol activity
The transcription factors have a role in
Analysis
There are different technologies available to analyze transcription factors. On the
The most commonly used method for identifying transcription factor binding sites is
Classes
As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains.
Mechanistic
There are two mechanistic classes of transcription factors:
- TFIIH. They are ubiquitous and interact with the core promoter region surrounding the transcription start site(s) of all class II genes.[91]
- Upstream transcription factors are proteins that bind somewhere upstream of the initiation site to stimulate or repress transcription. These are roughly synonymous with specific transcription factors, because they vary considerably depending on what recognition sequences are present in the proximity of the gene.[92]
Examples of specific transcription factors[92]

Factor | Structural type | Recognition sequence | Binds as |
---|---|---|---|
SP1 | Zinc finger | 3' | Monomer |
AP-1 | Basic zipper | 5'-TGA(G/C)TCA-3' | Dimer |
C/EBP | Basic zipper | 5'-ATTGCGCAAT-3' | Dimer |
Heat shock factor | Basic zipper | 5'-XGAAX-3' | Trimer |
ATF/CREB | Basic zipper | 5'-TGACGTCA-3' | Dimer |
c-Myc | Basic helix-loop-helix | 5'-CACGTG-3' | Dimer |
Oct-1 | Helix-turn-helix | 5'-ATGCAAAT-3' | Monomer |
NF-1 | Novel | 5'-TTGGCXXXXXGCCAA-3' | Dimer |

(G/C) = G or C; X = A, T, G or C
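The notation used for recognition sequences in the table, where (G/C) means G or C and X means any base, maps directly onto regular expressions. The helper names below are hypothetical and the test DNA string is made up for illustration; the sketch searches one strand only and says nothing about whether a factor would bind such a site in a cell.

```python
import re


def site_to_regex(site: str) -> str:
    """Translate table notation such as 5'-TGA(G/C)TCA-3' into a regex."""
    core = site.replace("5'-", "").replace("-3'", "")
    # (G/C) style alternatives become a character class, e.g. [GC]
    core = re.sub(r"\(([ACGT])/([ACGT])\)", r"[\1\2]", core)
    # X stands for any of the four bases
    return core.replace("X", "[ACGT]")


def find_matches(site: str, dna: str):
    """Start positions (0-based) of non-overlapping matches on the given strand."""
    return [m.start() for m in re.finditer(site_to_regex(site), dna)]


AP1 = "5'-TGA(G/C)TCA-3'"
NF1 = "5'-TTGGCXXXXXGCCAA-3'"
print(site_to_regex(AP1))
print(site_to_regex(NF1))
print(find_matches(AP1, "AATGACTCATT"))  # made-up test sequence
```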
Functional
Transcription factors have been classified according to their regulatory function:[13]
- I. constitutively active – present in all cells at all times – CCAAT
- II. conditionally active – requires activation
- II.A developmental (cell specific) – expression is tightly controlled, but, once expressed, requires no additional activation – Hox, Winged Helix
- II.B signal-dependent – requires external signal for activation
- II.B.1 extracellular ligand (paracrine)-dependent – nuclear receptors
- II.B.2 intracellular ligand-dependent – SREBP, p53, orphan nuclear receptors
- II.B.3 cell membrane receptor-dependent – second messenger signaling cascades resulting in the phosphorylation of the transcription factor
- II.B.3.a resident nuclear factors – reside in the nucleus regardless of activation state – AP-1, Mef2
- II.B.3.b latent cytoplasmic factors – inactive form resides in the cytoplasm, but, when activated, is translocated into the nucleus –
Structural
Transcription factors are often classified based on the
- 1 Superclass: Basic Domains
- 1.1 Class: bZIP)
- 1.1.1 Family: c-Jun)
- 1.1.2 Family: CREB
- 1.1.3 Family: C/EBP-like factors
- 1.1.4 Family: bZIP / PAR
- 1.1.5 Family: Plant G-box binding factors
- 1.1.6 Family: ZIP only
- 1.2 Class: Helix-loop-helix factors (bHLH)
- 1.2.1 Family: Ubiquitous (class A) factors
- 1.2.2 Family: Myogenic transcription factors (MyoD)
- 1.2.3 Family: Achaete-Scute
- 1.2.4 Family: Tal/Twist/Atonal/Hen
- 1.3 Class: Helix-loop-helix / leucine zipper factors (bHLH-ZIP)
- 1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (SREBP)
- 1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
- 1.4 Class: NF-1
- 1.5 Class: RF-X
- 1.6 Class: bHSH
- 2 Superclass: Zinc-coordinating DNA-binding domains
- 2.1 Class: Cys4 zinc finger of nuclear receptor type
- 2.1.1 Family: Steroid hormone receptors
- 2.1.2 Family: Thyroid hormone receptor-like factors
- 2.2 Class: diverse Cys4 zinc fingers
- 2.2.1 Family: GATA-Factors
- 2.3 Class: Cys2His2 zinc finger domain
- 2.3.1 Family: Ubiquitous factors, includes Sp1
- 2.3.2 Family: Developmental / cell cycle regulators; includes Krüppel
- 2.3.4 Family: Large factors with NF-6B-like binding properties
- 2.4 Class: Cys6 cysteine-zinc cluster
- 2.5 Class: Zinc fingers of alternating composition
- 3 Superclass: Helix-turn-helix
- 3.1 Class: Homeo domain
- 3.1.1 Family: Homeo domain only; includes Ubx
- 3.1.2 Family: POU domain factors; includes Oct
- 3.1.3 Family: Homeo domain with LIM region
- 3.1.4 Family: homeo domain plus zinc finger motifs
- 3.2 Class: Paired box
- 3.2.1 Family: Paired plus homeo domain
- 3.2.2 Family: Paired domain only
- 3.3 Class: Fork head / winged helix
- 3.3.1 Family: Developmental regulators; includes forkhead
- 3.3.2 Family: Tissue-specific regulators
- 3.3.3 Family: Cell-cycle controlling factors
- 3.3.0 Family: Other regulators
- 3.4 Class: Heat Shock Factors
- 3.4.1 Family: HSF
- 3.5 Class: Tryptophan clusters
- 3.5.1 Family: Myb
- 3.5.2 Family: Ets-type
- 3.5.3 Family: Interferon regulatory factors
- 3.6 Class: TEA ( transcriptional enhancer factor) domain
- 4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
- 4.1 Class: RHR (Rel homology region)
- 4.2 Class: STAT
- 4.2.1 Family: STAT
- 4.3 Class: p53
- 4.3.1 Family: p53
- 4.4 Class: MADS box
- 4.4.1 Family: Regulators of differentiation; includes (Mef2)
- 4.4.2 Family: Responders to external signals, SRF (serum response factor) (SRF)
- 4.4.3 Family: Metabolic regulators (ARG80)
- 4.5 Class: beta-Barrel alpha-helix transcription factors
- 4.6 Class: TATA binding proteins
- 4.6.1 Family: TBP
- 4.7 Class: HMG-box
- 4.8 Class: Heteromeric CCAAT factors
- 4.8.1 Family: Heteromeric CCAAT factors
- 4.9 Class: Grainyhead
- 4.9.1 Family: Grainyhead
- 4.10 Class: Cold-shock domain factors
- 4.10.1 Family: csd
- 4.11 Class: Runt
- 4.11.1 Family: Runt
- 0 Superclass: Other Transcription Factors
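The dotted numbers in the list above encode the hierarchy: one number for a superclass, two for a class, three for a family, with each entry's parent obtained by dropping the last number. A minimal sketch of reading such entries into a tree follows; the function names are hypothetical, and the parser assumes every line has the form `- <code> <Level>: <name>`.

```python
from collections import defaultdict


def parse_entry(line: str):
    """Split '- 3.5.2 Family: Ets-type' into ((3, 5, 2), 'family', 'Ets-type')."""
    code, rest = line.lstrip("- ").split(" ", 1)
    level, name = rest.split(":", 1)
    return tuple(int(p) for p in code.split(".")), level.strip().lower(), name.strip()


def build_tree(lines):
    """Map each parent code to the list of (code, name) entries directly below it."""
    children = defaultdict(list)
    for line in lines:
        code, _level, name = parse_entry(line)
        children[code[:-1]].append((code, name))  # parent = code minus last number
    return children


sample = [
    "- 3 Superclass: Helix-turn-helix",
    "- 3.5 Class: Tryptophan clusters",
    "- 3.5.1 Family: Myb",
    "- 3.5.2 Family: Ets-type",
    "- 3.5.3 Family: Interferon regulatory factors",
]
tree = build_tree(sample)
print(tree[(3, 5)])
```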
Transcription factor databases
There are numerous databases cataloging information about transcription factors, but their scope and utility vary dramatically. Some may contain only information about the actual proteins, some about their binding sites, or about their target genes. Examples include the following:
- footprintDB: a metadatabase of multiple databases, including JASPAR and others
- JASPAR: database of transcription factor binding sites for eukaryotes
- PlantTFD: Plant transcription factor database[95]
- TcoF-DB: Database of transcription co-factors and transcription factor interactions[96]
- TFcheckpoint: database of human, mouse and rat TF candidates
- transcriptionfactor.org (now commercial, selling reagents)
- MethMotif.org: an integrative cell-specific database of transcription factor binding motifs coupled with DNA methylation profiles[97]
See also
- Cdx protein family
- DNA-binding protein
- Inhibitor of DNA-binding protein
- Mapper(2)
- Nuclear receptor, a class of ligand activated transcription factors
- Open Regulatory Annotation Database
- Phylogenetic footprinting
- TRANSFAC database
- YeTFaSCo
References
- PMID 9570129.
- PMID 2128034.
- PMID 15193307.
- YouTube
- PMID 29425488. "The final tally encompasses 1,639 known or likely human TFs."
- PMID 8870495.
- PMID 8990153.
- PMID 11092823.
- PMID 2667136.
- S2CID 6203915.
- PMID 24174544.
- PMID 16381825.
- S2CID 14954195.
- S2CID 15887416.
- "List Of All Transcription Factors In Human". biostars.org.
- PMID 11758455.
- S2CID 14586791.
- PMID 10322133.
- ISBN 1-86094-126-5.
- PMID 12672487.
- PMID 12676794.
- S2CID 13073440.
- PMID 1424766.
- S2CID 35650754.
- PMID 16515781.
- S2CID 23824870.
- PMID 8293575.
- PMID 11916222.
- S2CID 9912334.
- S2CID 44049779.
- PMID 15457548.
- PMID 8960358.
- PMID 8864058.
- PMID 7846125.
- PMID 19400638.
- S2CID 6648530.
- S2CID 206522347.
- S2CID 33257689.
- S2CID 44783683.
- PMID 8314906.
- PMID 2149275.
- PMID 17536004.
- PMID 19625488.
- S2CID 103345.
- PMID 18406148.
- S2CID 205469320.
- S2CID 22446734.
- PMID 25263941.
- PMID 24108092.
- S2CID 206653898.
- PMID 30809228.
- Sun Z, Xu X, He J, Murray A, Sun MA, Wei X, Wang X, McCoig E, Xie E, Jiang X, Li L, Zhu J, Chen J, Morozov A, Pickrell AM, Theus MH, Xie H (2019). "EGR1 recruits TET1 to shape the brain methylome during development and upon neuronal activity". Nature Communications. 10 (1): 3892. doi:10.1038/s41467-019-11905-3. PMID 31467272.
- S2CID 31314461.
- PMID 7553065.
- PMID 12192032.
- PMID 8831795.
- PMID 7979246.
- PMID 26464018.
- S2CID 23755557.
- PMID 11179890.
- PMID 10940247.
- PMID 15711128.
- ISBN 978-0-19-511239-9.
- PMID 16475943.
- PMID 16647848.
- from the original on 2 October 2023 – via Zenodo.
- PMID 17923767.
- from the original on 2 October 2023.
- S2CID 22021740.
- PMID 18317533.
- PMID 15917654.
- PMID 15509516.
- "Transcription factors as targets and markers in cancer" Workshop 2007. Archived from the original on 25 May 2012. Retrieved 14 December 2009.
- PMID 30465885.
- S2CID 11979420.
- S2CID 205475111.
- PMID 8049612.
- PMID 7549464.
- PMID 9755455.
- PMID 15790306.
- PMID 28094913.
- PMID 29685496.
- PMID 19907488.
- Lay summary in: Katherine Bagley (11 November 2009). "New drug target for cancer". The Scientist. Archived from the original on 16 November 2009.
- S2CID 207778924.
- PMID 25750178.
- PMID 33452020.
- PMID 16845064.
- PMID 18591661.
- PMID 23090257.
- PMID 26383089.
- PMID 8946909.
- ISBN 1-4160-2328-3.
- PMID 15706513. Archived from the original on 19 June 2013.
- "TRANSFAC database". Retrieved 5 August 2007.
- PMID 27924042.
- PMID 27789689.
- PMID 30380113.
Further reading
- Carretero-Paulet, Lorenzo; Galstyan, Anahit; Roig-Villanova, Irma; Martínez-García, Jaime F.; Bilbao-Castro, Jose R. (July 2010). "Genome-Wide Classification and Evolutionary Analysis of the bHLH Family of Transcription Factors in Arabidopsis, Poplar, Rice, Moss, and Algae". Plant Physiology. 153 (3): 1398–1412. ISSN 0032-0889.
- Jin J, He K, Tang X, Li Z, Lv L, Zhao Y, et al. (2015). "An Arabidopsis Transcriptional Regulatory Map Reveals Distinct Functional and Evolutionary Features of Novel Transcription Factors". Molecular Biology and Evolution. 32 (7): 1767–73. PMID 25750178.
External links
- Transcription Factors at the U.S. National Library of Medicine Medical Subject Headings (MeSH)
- Transcription factor database Archived 4 December 2008 at the Wayback Machine