Figure 1. TATA box structural elements. The TATA box consensus sequence is TATAWAW, where W is either A or T.

In molecular biology, the TATA box (also called the Goldberg-Hogness box)[1] is a sequence of DNA found in the promoter region of genes in archaea and eukaryotes.[2] The prokaryotic homolog of the TATA box is called the Pribnow box, which has a shorter consensus sequence. The TATA box is considered a non-coding DNA sequence (also known as a cis-regulatory element).

The TATA box was first identified in 1978[1] as a component of eukaryotic promoters. It is the binding site of TATA-binding protein, additional transcription factors, and RNA polymerase II in some eukaryotic genes. Transcription is initiated at the TATA box in TATA-containing genes. Based on the sequence and mechanism of TATA box initiation, mutations to this consensus sequence can result in phenotypic changes.

History

Discovery

The TATA box was the first eukaryotic core promoter motif to be identified. It was discovered in 1978 by American biochemist David Hogness[1] while he and his graduate student, Michael Goldberg, were on sabbatical at the University of Basel in Switzerland.[3] They first found the TATA sequence while analyzing 5' DNA promoter sequences in Drosophila, mammalian, and viral genes.[4][2] The TATA box was found in protein-coding genes transcribed by RNA polymerase II.[2]

Evolutionary History

Most research on the TATA box has been conducted on yeast, human, and Drosophila genomes; however, similar elements have been found in archaea and ancient eukaryotes.[2] In archaeal species, the promoter contains an 8 bp AT-rich sequence located ~24 bp upstream of the transcription start site. This sequence was originally called Box A and is now known to be the sequence that interacts with the archaeal homologue of the TATA-binding protein (TBP). Although studies have found several similarities, others have observed notable differences between archaeal and eukaryotic TBP. The archaeal protein exhibits greater symmetry in its primary sequence and in the distribution of electrostatic charge, which is important because the higher symmetry lowers the protein's ability to bind the TATA box in a polar manner.[2]

Even though the TATA box is present in many eukaryotic promoters, it is not contained in the majority of promoters. One study found that less than 30% of 1031 potential promoter regions in humans contain a putative TATA box motif.[5] In Drosophila, less than 40% of 205 core promoters contain a TATA box.[4] When the TATA box is absent and TBP is not present, the downstream promoter element (DPE), in cooperation with the initiator element (Inr), binds to transcription factor II D (TFIID), initiating transcription in TATA-less promoters. The DPE has been identified in three Drosophila TATA-less promoters and in the TATA-less human IRF-1 promoter.[6]

Analogous Sequences

Promoter sequences vary between prokaryotes and eukaryotes. In eukaryotes, the TATA box is located about 25 base pairs upstream of the start site that Rpb4/Rpb7 use to initiate transcription. In metazoans, the TATA box is located about 25-30 bp upstream of the transcription start site, whereas in the yeast S. cerevisiae its position is variable and can range from 40 to 100 bp upstream of the start site. In prokaryotes, promoter regions may contain a Pribnow box, which serves an analogous purpose to the eukaryotic TATA box. The Pribnow box is a conserved 6 bp region centered around the -10 position; prokaryotic promoters also contain a conserved 8-12 bp sequence around the -35 region.[6]

A CAAT box (also CAT box) is a region of nucleotides with the following consensus sequence: 5'-GGCCAATCT-3'. The CAAT box is located about 75-80 bases upstream of the transcription initiation site and about 150 bases upstream of the TATA box. It binds transcription factors (CAAT TF or CTFs) and thereby stabilizes the nearby preinitiation complex for easier binding of RNA polymerases. CAAT boxes are rarely found in genes that express proteins ubiquitous in all cell types.[6]

Structure and Function

Sequence and Prevalence

Figure 2. Mechanism for transcription initiation at the TATA box. Transcription factors, TATA binding protein (TBP), and RNA polymerase II are all recruited to begin transcription.

The TATA box is a component of the eukaryotic core promoter and generally contains the consensus sequence 5'-TATA(A/T)A(A/T)-3'.[7] In yeast, for example, one study found that various Saccharomyces genomes had the consensus sequence 5'-TATA(A/T)A(A/T)(A/G)-3', yet only about 20% of yeast genes contained the TATA sequence.[8] Similarly, in humans only 24% of genes have promoter regions containing the TATA box.[9] Genes containing the TATA box tend to be involved in stress responses and certain types of metabolism and are more highly regulated than TATA-less genes.[8][10] Generally, TATA-containing genes are not involved in essential cellular functions such as cell growth, DNA replication, transcription, and translation because of their highly regulated nature.[10]

The TATA box is usually located 25-35 base pairs upstream of the transcription start site. Genes containing the TATA box usually require additional promoter elements, including an initiator site located just upstream of the transcription start site and a downstream core element (DCE).[7] These additional promoter regions work in conjunction with the TATA box to regulate initiation of transcription in eukaryotes.
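
As a minimal sketch of how the consensus and position described above might be checked computationally, the following Python snippet writes the consensus 5'-TATA(A/T)A(A/T)-3' as a regular expression and searches the window 25-35 bp upstream of a transcription start site. The function name `find_tata_boxes`, its parameters, and the example sequence are hypothetical and serve only as illustration; exact pattern matching is a simplification, and a match does not establish that a site is functional.

```python
import re

# Consensus 5'-TATA(A/T)A(A/T)-3' written as a regular expression.
# The lookahead allows overlapping candidate sites to be reported.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence, tss_index, min_upstream=25, max_upstream=35):
    """Return (distance_upstream, motif) pairs for consensus matches whose
    first base lies min_upstream..max_upstream bp upstream of the start site.

    sequence  -- coding-strand DNA, written 5' to 3'
    tss_index -- 0-based index of the transcription start site
    """
    sequence = sequence.upper()
    hits = []
    for match in TATA_CONSENSUS.finditer(sequence):
        distance = tss_index - match.start()
        if min_upstream <= distance <= max_upstream:
            hits.append((distance, match.group(1)))
    return hits


# Hypothetical promoter fragment: 10 filler bases, a TATA box,
# 23 more filler bases, then the start site.
promoter = "GCGCGCGCGC" + "TATAAAA" + "G" * 23 + "ATGC"
tss = 10 + 7 + 23  # index of the first base after the filler
print(find_tata_boxes(promoter, tss))
```

In this example the motif begins 30 bp upstream of the assumed start site, so it falls inside the 25-35 bp window. Appending `[AG]` to the pattern would give the longer yeast consensus 5'-TATA(A/T)A(A/T)(A/G)-3' mentioned above.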

Role in Transcription Initiation

The TATA box is the site of preinitiation complex formation, which is the first step in transcription initiation in eukaryotes. Formation of the preinitiation complex begins when the multi-subunit transcription factor II D (TFIID) binds to the TATA box via its TATA-binding protein (TBP) subunit.[7] TBP binds to the minor groove[11] of the TATA box via a region of antiparallel β sheets in the protein.[12] Three types of molecular interactions contribute to TBP binding to the TATA box:

  1. Two phenylalanine residues on TBP insert at the ends of the TATA box and form kinks in the DNA.[13][14][15]
  2. Four hydrogen bonds form between polar side chains on TBP amino acid residues and bases in the minor groove.[13]
  3. Numerous hydrophobic interactions form between TBP residues and DNA bases, including van der Waals forces.[13][14][15]

Additionally, binding of TBP is facilitated by stabilizing interactions with DNA flanking the TATA box, which consists of G-C rich sequences.[16] These secondary interactions induce bending of the DNA and helical unwinding.[17] The degree of DNA bending is species and sequence dependent. For example, one study used the adenovirus TATA promoter sequence (5'-CGCTATAAAAGGGC-3') as a model binding sequence and found that human TBP binding to the TATA box induced a 97° bend toward the major groove while the yeast TBP protein only induced an 82° bend.[18] X-ray crystallography studies of TBP/TATA-box complexes generally agree that the DNA goes through an ~80° bend during the process of TBP-binding.[13][14][15]

The conformational changes induced by TBP binding to the TATA box allow additional transcription factors and RNA polymerase II to bind to the promoter region. TFIID first binds to the TATA box, facilitated by TFIIA binding to the upstream part of the TFIID complex.[19][20] TFIIB then binds to the TFIID-TFIIA-DNA complex through interactions both upstream and downstream of the TATA box.[21] RNA polymerase II is then recruited to this multi-protein complex with the help of TFIIF.[21] Additional transcription factors then bind, first TFIIE and then TFIIH.[21] This completes the assembly of the preinitiation complex for eukaryotic transcription.[7] Generally, the TATA box is found at RNA polymerase II promoter regions, although some in vitro studies have demonstrated that RNA polymerase III can recognize TATA sequences.[22]
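
The stepwise assembly described above can be summarized as an ordered list. The sketch below is only a bookkeeping aid: the names `PIC_ASSEMBLY_ORDER` and `is_valid_assembly` are hypothetical, and treating assembly as a strictly linear sequence is a simplifying assumption.

```python
# Order of preinitiation complex assembly as described in the text above.
PIC_ASSEMBLY_ORDER = [
    "TFIID",                       # binds the TATA box via its TBP subunit
    "TFIIA",                       # binds the upstream part of TFIID
    "TFIIB",                       # binds the TFIID-TFIIA-DNA complex
    "RNA polymerase II + TFIIF",   # polymerase recruited with the help of TFIIF
    "TFIIE",
    "TFIIH",                       # completes the preinitiation complex
]


def is_valid_assembly(observed):
    """Return True if `observed` is a prefix of the stepwise order above."""
    observed = list(observed)
    return observed == PIC_ASSEMBLY_ORDER[:len(observed)]


print(is_valid_assembly(["TFIID", "TFIIA", "TFIIB"]))  # follows the order
print(is_valid_assembly(["TFIIB", "TFIID"]))           # out of order
```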

This cluster of RNA polymerase II and various transcription factors is known as the basal transcriptional complex (BTC). In this state, it only gives a low level of transcription. Other factors must stimulate the BTC to increase transcription levels.[2] One such example of a BTC stimulating region of DNA is the CAAT box. Additional factors, including the Mediator complex, transcriptional regulatory proteins, and nucleosome-modifying enzymes also enhance transcription in vivo.[7]

Mutations

Figure 3. Effects of mutations on TBP binding to the TATA box. In the wild type, transcription proceeds normally. An insertion or deletion shifts the TATA box recognition site, which results in a shifted transcription start site.[23] Point mutations risk leaving TBP unable to bind for initiation.[24]

Mutations to the TATA box can range from a deletion or insertion to a point mutation with varying effects based on the gene that has been mutated. The mutations change the binding of the TATA-binding protein (TBP) for transcription initiation. Thus, there is a resulting change in phenotype based on the gene that is not being expressed (Figure 3).

Insertions or Deletions

One of the first studies of TATA box mutations examined a DNA sequence from Agrobacterium tumefaciens containing the octopine-type cytokinin gene.[23] This gene has three TATA boxes. A phenotypic change was observed only when all three TATA boxes were deleted. An insertion of extra base pairs between the last TATA box and the transcription start site shifted the start site, resulting in a phenotypic change. This original mutation study shows that transcription changes when there is no TATA box to promote it, whereas transcription of a gene still occurs when there is an insertion in the sequence, although the insertion may affect the nature of the resulting phenotype.
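
The start-site shift described above can be illustrated with a toy model in which transcription begins a fixed distance downstream of the last TATA box. The fixed-spacing rule, the 30 bp spacing, the function names, and the sequences are all assumptions made for illustration; none of them is taken from the cited study.

```python
def predicted_start(sequence, motif="TATAAAA", spacing=30):
    """Toy model: the start site lies `spacing` bp downstream of the first
    base of the last occurrence of `motif`.

    Returns a 0-based index, or None if the motif is absent.
    """
    tata_index = sequence.rfind(motif)
    if tata_index == -1:
        return None
    return tata_index + spacing


def insert_bases(sequence, position, bases):
    """Return `sequence` with `bases` inserted before index `position`."""
    return sequence[:position] + bases + sequence[position:]


# Hypothetical gene fragment: the transcribed region starts at "ACGTACGTAC".
transcribed = "ACGTACGTAC"
wild_type = "GCGC" + "TATAAAA" + "C" * 23 + transcribed
# Insert 5 bp between the TATA box and the original start site.
mutant = insert_bases(wild_type, 15, "GGGGG")

wt_start = predicted_start(wild_type)
mut_start = predicted_start(mutant)

# Distance from the predicted start to the original first transcribed base.
print(wild_type.find(transcribed) - wt_start)
print(mutant.find(transcribed) - mut_start)
# With every TATA box removed, the toy model predicts no start site.
print(predicted_start("GCGC" + "C" * 30 + transcribed))
```

Under this model the insertion leaves the TATA box in place but pushes the original start base 5 bp further downstream, so the predicted start no longer coincides with it.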

Point Mutations

Point mutations to the TATA box similarly produce varying phenotypic changes depending on the gene affected. Studies also show that the position of the mutation within the TATA box sequence affects the binding of TBP.[24] For example, a mutation from TATAAAA to CATAAAA may or may not hinder binding sufficiently to change transcription, because the neighboring sequences can affect whether there is a change.[25] However, a change can be seen in HeLa cells with a TATAAAA to TATACAA mutation, which leads to a 20-fold decrease in transcription.[26] Some diseases that can result from insufficient transcription of a specific gene are thalassemia,[27] lung cancer,[28] chronic hemolytic anemia,[29] immunosuppression,[30] hemophilia B Leyden,[31] and thrombophlebitis and myocardial infarction.[32]
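
As a rough illustration, the two substitutions mentioned above can be compared against the consensus 5'-TATA(A/T)A(A/T)-3'. The sketch below enumerates every single-base substitution of TATAAAA and reports which ones still match the consensus. The function names are hypothetical, and matching the consensus is only a coarse proxy: it does not measure TBP affinity or account for the neighboring sequences.

```python
import re

CONSENSUS = re.compile(r"TATA[AT]A[AT]")  # 5'-TATA(A/T)A(A/T)-3'


def matches_consensus(motif):
    """Return True if the 7-base motif fits the consensus exactly."""
    return CONSENSUS.fullmatch(motif.upper()) is not None


def point_mutants(motif):
    """Yield (position, original_base, new_base, mutant_sequence) for every
    single-base substitution; positions are 1-based."""
    for i, base in enumerate(motif):
        for new_base in "ACGT":
            if new_base != base:
                yield i + 1, base, new_base, motif[:i] + new_base + motif[i + 1:]


wild_type = "TATAAAA"

# The two substitutions discussed in the text.
for mutant in ("CATAAAA", "TATACAA"):
    print(mutant, matches_consensus(mutant))

# Substitutions of the wild-type motif that remain within the consensus.
tolerated = [m for _, _, _, m in point_mutants(wild_type) if matches_consensus(m)]
print(tolerated)
```

Both example mutants fall outside the consensus, while only the A-to-T changes at positions 5 and 7 stay within it.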

Savinkova et al. have written a simulation to predict the KD value for a selected TATA box sequence and TBP.[33] This can be used to directly predict the phenotypic traits resulting from a selected mutation based on how tightly TBP binds to the TATA box.
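
The simulation itself is not reproduced here. As a generic illustration of how a KD value relates to binding strength, the sketch below uses the standard thermodynamic relation ΔG° = RT ln(KD), with KD expressed relative to a 1 M standard state, to convert a change in KD into a change in binding free energy. The function names and the KD values are hypothetical and are not taken from the cited work.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant
    given in mol/L (1 M standard state). More negative means tighter binding."""
    return R * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_mutant, kd_wild_type, temperature_k=298.15):
    """Change in binding free energy (kJ/mol) caused by a mutation.
    Positive values mean the mutant binds more weakly."""
    return R * temperature_k * math.log(kd_mutant / kd_wild_type)


# Hypothetical KD values, for illustration only.
kd_wt = 1e-9    # wild-type TATA box
kd_mut = 1e-8   # mutant TATA box, ten-fold weaker binding

print(binding_free_energy(kd_wt))
print(delta_delta_g(kd_mut, kd_wt))
```

A ten-fold increase in KD corresponds to a loss of roughly 5.7 kJ/mol of binding free energy at 25 °C.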

References

  1. Lifton, R. P.; Goldberg, M. L.; Karp, R. W.; Hogness, D. S. (1978). "The organization of the histone genes in Drosophila melanogaster: Functional and evolutionary implications". Cold Spring Harbor Symposia on Quantitative Biology. 42: 1047–1051. doi:10.1101/sqb.1978.042.01.105. PMID 98262.
  2. Smale, Stephen T.; Kadonaga, James T. (2003). "The RNA polymerase II core promoter". Annual Review of Biochemistry. 72: 449–479. doi:10.1146/annurev.biochem.72.121801.161520. ISSN 0066-4154. PMID 12651739.
  3. Gehring, Walter J. (1998). Master Control Genes in Development and Evolution: The Homeobox Story. New Haven: Yale University Press. ISBN 978-0300074093.
  4. Kutach, Alan K.; Kadonaga, James T. (2000-07). "The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters". Molecular and Cellular Biology. 20 (13): 4754–4764. ISSN 0270-7306. PMC 85905. PMID 10848601.
  5. Suzuki, Yutaka; Tsunoda, Tatsuhiko; Sese, Jun; Taira, Hirotoshi; Mizushima-Sugano, Junko; Hata, Hiroko; Ota, Toshio; Isogai, Takao; Tanaka, Toshihiro (2001-05). "Identification and Characterization of the Potential Promoter Regions of 1031 Kinds of Human Genes". Genome Research. 11 (5): 677–684. doi:10.1101/gr.164001. ISSN 1088-9051. PMC 311086. PMID 11337467.
  6. Tripathi, G. (2010). Cellular and Biochemical Science. New Delhi: I.K. International Publishing House Pvt. Ltd. pp. 373–374. ISBN 978-81-88237-85-X.
  7. Watson, James D. Molecular Biology of the Gene (7th ed.). Boston. ISBN 9780321762436. OCLC 824087979.
  8. Basehoar, Andrew D.; Zanton, Sara J.; Pugh, B. Franklin (2004-03-05). "Identification and distinct regulation of yeast TATA box-containing genes". Cell. 116 (5): 699–709. ISSN 0092-8674. PMID 15006352.
  9. Yang, C; Bolotin, E; Jiang, T; Sladek, FM; Martinez, E (2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMC 1955227. PMID 17123746.
  10. Bae, Sang-Hun; Han, Hyun Wook; Moon, Jisook (2015). "Functional analysis of the molecular interactions of TATA box-containing genes and essential genes". PloS One. 10 (3): e0120848. doi:10.1371/journal.pone.0120848. ISSN 1932-6203. PMC 4366266. PMID 25789484.
  11. Starr, D. B.; Hawley, D. K. (1991-12-20). "TFIID binds in the minor groove of the TATA box". Cell. 67 (6): 1231–1240. ISSN 0092-8674. PMID 1760847.
  12. Kim, J. L.; Nikolov, D. B.; Burley, S. K. (1993-10-07). "Co-crystal structure of TBP recognizing the minor groove of a TATA element". Nature. 365 (6446): 520–527. doi:10.1038/365520a0. ISSN 0028-0836. PMID 8413605.
  13. Kim, J. L.; Nikolov, D. B.; Burley, S. K. (1993-10-07). "Co-crystal structure of TBP recognizing the minor groove of a TATA element". Nature. 365 (6446): 520–527. doi:10.1038/365520a0. ISSN 0028-0836. PMID 8413605.
  14. Nikolov, D. B.; Chen, H.; Halay, E. D.; Hoffman, A.; Roeder, R. G.; Burley, S. K. (1996-05-14). "Crystal structure of a human TATA box-binding protein/TATA element complex". Proceedings of the National Academy of Sciences of the United States of America. 93 (10): 4862–4867. ISSN 0027-8424. PMID 8643494.
  15. Kim, Y.; Geiger, J. H.; Hahn, S.; Sigler, P. B. (1993-10-07). "Crystal structure of a yeast TBP/TATA-box complex". Nature. 365 (6446): 512–520. doi:10.1038/365512a0. ISSN 0028-0836. PMID 8413604.
  16. Horikoshi, M.; Bertuccioli, C.; Takada, R.; Wang, J.; Yamamoto, T.; Roeder, R. G. (1992-02-01). "Transcription factor TFIID induces DNA bending upon binding to the TATA element". Proceedings of the National Academy of Sciences of the United States of America. 89 (3): 1060–1064. ISSN 0027-8424. PMID 1736286.
  17. Blair, Rebecca H.; Goodrich, James A.; Kugel, Jennifer F. (2012-09-25). "Single-molecule fluorescence resonance energy transfer shows uniformity in TATA binding protein-induced DNA bending and heterogeneity in bending kinetics". Biochemistry. 51 (38): 7444–7455. doi:10.1021/bi300491j. ISSN 1520-4995. PMC 3551999. PMID 22934924.
  18. Whittington, JoDell E.; Delgadillo, Roberto F.; Attebury, Torrissa J.; Parkhurst, Laura K.; Daugherty, Margaret A.; Parkhurst, Lawrence J. (2008-07-08). "TATA-binding protein recognition and bending of a consensus promoter are protein species dependent". Biochemistry. 47 (27): 7264–7273. doi:10.1021/bi800139w. ISSN 1520-4995. PMID 18553934.
  19. Louder, RK; He, Y; López-Blanco, JR; Fang, J; Chacón, P; Nogales, E (2016). "Structure of promoter-bound TFIID and model of human pre-initiation complex assembly". Nature. 531: 604–609. doi:10.1038/nature17394.
  20. Wang, Juan; Zhao, Shasha; He, Wei; Wei, Yun; Zhang, Yang; Pegg, Henry; Shore, Paul; Roberts, Stefan G. E.; Deng, Wensheng (2017-07-14). "A transcription factor IIA-binding site differentially regulates RNA polymerase II-mediated transcription in a promoter context-dependent manner". The Journal of Biological Chemistry. 292 (28): 11873–11885. doi:10.1074/jbc.M116.770412. ISSN 1083-351X. PMC 5512080. PMID 28539359.
  21. Krishnamurthy, Shankarling; Hampsey, Michael. "Eukaryotic transcription initiation". Current Biology. 19 (4): R153–R156. doi:10.1016/j.cub.2008.11.052.
  22. Duttke, Sascha H. C. (2014-07-18). "RNA polymerase III accurately initiates transcription from RNA polymerase II promoters in vitro". The Journal of Biological Chemistry. 289 (29): 20396–20404. doi:10.1074/jbc.M114.563254. ISSN 1083-351X. PMC 4106352. PMID 24917680.
  23. de Pater, B. S.; de Kam, R. J.; Hoge, J. H. C.; Schilperoort, R. A. (1987). "Effects of mutations in the TATA box region of the Agrobacterium T-cyt gene on its transcription in plant tissues". Nucleic Acids Research. 15 (20): 8283–8292. PMC 306359.
  24. Wang, Y.; Jensen, R. C.; Stumph, W. E. (1996). "Role of TATA box sequence and orientation in determining RNA polymerase II/III transcription specificity". Nucleic Acids Res. 24 (15): 3100–3106. PMC 146060.
  25. Fei, Y. J.; Stoming, T. A.; Efremov, G. D.; Efremov, D. G.; Battacharia, R.; Gonzales-Redondo, J. M.; Altay, C.; Gurgey, A.; Huisman, T. H. J. (1988). "β-thalassemia due to a T→A mutation within the ATA box". Biochemical and Biophysical Research Communications. 153 (2): 741–747. doi:10.1016/S0006-291X(88)81157-4.
  26. Wobbe, C. R.; Strahl, K. (1990). "Yeast and human TATA-binding proteins have nearly identical DNA sequence requirements for transcription in vitro". Mol Cell Biol. 10 (8): 3859–3867. PMC 360896.
  27. Antonarakis, S. E.; Orkin, S. H.; Cheng, T. C.; Scott, A. F.; Sexton, J. P.; Trusko, S. P.; Charache, S.; Kazazian, Jr, H. H. (1984). "beta-Thalassemia in American Blacks: novel mutations in the "TATA" box and an acceptor splice site". Proc. Natl. Acad. Sci. USA. 81 (4): 1154–1158. PMC 344784.
  28. Zienolddiny, S.; Ryberg, D.; Maggini, V.; Skaug, V.; Canzian, F.; Haugen, A. (2004). "Polymorphisms of the interleukin-1 β gene are associated with increased risk of non-small cell lung cancer". Int J Cancer. 109 (3): 353–356. doi:10.1002/ijc.11695.
  29. Watanabe, M.; Zingg, B. C.; Mohrenweiser, H. W. (1996). "Molecular analysis of a series of alleles in humans with reduced activity at the triosephosphate isomerase locus". Am. J. Hum. Genet. 58: 308–316. PMC 1914533.
  30. Takahashi, K.; Ezekowitz, R. A. (2005). "The role of the mannose-binding lectin in innate immunity". Clin. Infect. Dis. 7: S440–S444. doi:10.1086/431987.
  31. Reijnen, M. J.; Sladek, F. M.; Bertina, R. M.; Reitsma, P. H. (1992). "Disruption of a binding site for hepatocyte nuclear factor 4 results in hemophilia B Leyden". Proc. Natl. Acad. Sci. USA. 89 (14): 6300–6303. PMC 49488.
  32. Arnaud, E.; Barbalat, V.; Nicaud, V.; Cambien, F.; Evans, A.; et al. (2000). "Polymorphisms in the 5' regulatory region of the tissue factor gene and the risk of myocardial infarction and venous thromboembolism: the ECTIM and PATHROS studies. Etude Cas-Témoins de l'Infarctus du Myocarde. Paris Thrombosis case-control Study". Arterioscler. Thromb. Vasc. Biol. 20: 892–898. doi:10.1161/01.ATV.20.3.892.
  33. Savinkova, L.; Drachkova, I.; Arshinova, T.; Ponomarenko, P.; Ponomarenko, M.; Kolchanov, N. (2013). "An experimental verification of the predicted effects of promoter TATA-box polymorphisms associated with human diseases on interactions between the TATA boxes and TATA-binding protein". PLOS ONE. 8 (2): 1–7. doi:10.1371/journal.pone.0054626. PMC 3570547.