Transmembrane protein 247 (also known as TMEM247 or transmembrane protein ENSP00000343375) is a multi-pass transmembrane protein of unknown function that is encoded in Homo sapiens by the TMEM247 gene. The protein has two transmembrane regions near the C-terminus of the translated polypeptide. It is expressed almost exclusively in the testes.[5]

TMEM247
Identifiers
Aliases: TMEM247, transmembrane protein 247
External IDs: MGI: 1925719; HomoloGene: 54379; GeneCards: TMEM247; OMA: TMEM247 - orthologs
Orthologs (human; mouse)
Ensembl: ENSG00000284701 (human)[1]; ENSMUSG00000037689 (mouse)[2]
RefSeq (mRNA): NM_001145051 (human); NM_001277980, NM_030104 (mouse)
RefSeq (protein): NP_001138523 (human); NP_001264909, NP_084380 (mouse)
Location (UCSC): Chr 2: 46.48 – 46.48 Mb (human); Chr 17: 87.22 – 87.23 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Gene attributes

General information

The TMEM247 gene is located on chromosome 2 at 2p21, spanning nucleotides 46,479,565 to 46,484,425. It has three exons and two introns. The gene is 4,861 nucleotides (nt) long before mRNA processing and 661 nt long after processing, and its protein product is 219 amino acids (aa) long.[6] Unlike most genes, TMEM247 does not encode a complete stop codon; instead, the stop codon is created by polyadenylation during mRNA processing. As a result, the TMEM247 mRNA has no 3' UTR (untranslated region). TMEM247 codes for only one variant.
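The gene length quoted above follows from its coordinates when both endpoints are counted. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the function name is illustrative):

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted for TMEM247 on chromosome 2
gene_length = span_length(46_479_565, 46_484_425)
print(gene_length)  # 4861
```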

Promoter region

The promoter region of TMEM247, which is 1,302 base pairs long, contains a wide variety of predicted transcription factor binding sites. A selection of predicted sites of interest is listed below, though many more exist. Anchor base positions are measured from the start of the promoter region.

The TMEM247 promoter has a number of notable predicted binding sites, as well as a notable omission. It lacks a traditional TATA box, the typical binding site for proteins that recruit RNA polymerase and initiate transcription. Instead, it contains several predicted binding sites that are core promoter elements of TATA-less promoters.

The promoter also contains many predicted development-related binding sites, such as those for pluripotent stem cell factors (Oct4, Sox2, Nanog), sex-determining HMG box factors, and various homeobox/homeodomain factors.[7]

 
Tail end of the promoter region of the TMEM247 gene with notable predicted binding sites highlighted. The start of transcription is marked by an arrow.
Matrix | Detailed matrix information | Anchor base | Strand | Matrix similarity | Sequence
V$TBX5.01 | Brachyury gene, mesoderm developmental factor | 1040 | (+) | 1 | ctacctcaaaGGTGtcacaccctccacca
V$EOMES.03 | Brachyury gene, mesoderm developmental factor | 1042 | (-) | 0.987 | tttggtggagggTGTGacacctttgaggt
V$PDEF.01 | Human and murine ETS1 factors (Prostate-derived Ets Factor) | 998 | (-) | 0.974 | gaactgcaGGATgggcctttg
V$RFX3.01 | X-box binding factors | 1064 | (+) | 0.974 | aaggggccctagCAACttg
V$SPZ1.01 | Testis-specific bHLH-Zip transcription factors (Spermatogenic Zip 1 transcription factor) | 1046 | (-) | 0.966 | tGGAGggtgtg
V$TBX20.02 | Brachyury gene, mesoderm developmental factor | 1149 | (-) | 0.939 | catcatttgaggtgctGACAtttggcctc
V$HSF1.05 | Heat shock factors | 1198 | (-) | 0.938 | ctgctgccatCCAGaaaaccagaac
V$MYOD.01 | Myogenic regulatory factor MyoD (myf3) | 1178 | (-) | 0.919 | cgctGCCAggtggggtc
V$MTBF.01 | Human muscle-specific Mt binding site | 1128 | (+) | 0.906 | tggaATCTg
V$RFX3.02 | Regulatory factor X, 3 (secondary DNA binding preference) | 1278 | (+) | 0.889 | gatggtgcctgGTGActcc
V$OCT3_4.02 | Motif composed of binding sites for pluripotency or stem cell factors | 892 | (+) | 0.882 | acaatctTCATttaaaaaa
V$HSF1.01 | Heat shock factors | 1190 | (-) | 0.845 | atccagaaaaccAGAAcgctgccag
V$EN1.01 | Homeobox transcription factors | 897 | (-) | 0.832 | gttcctttTTTAaatgaag
O$XCPE1.01 | Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | 1243 | (+) | 0.831 | gtGCGGgagaa
V$DICE.01 | Downstream Immunoglobulin Control Element, critical for B cell activity and specificity | 1091 | (-) | 0.827 | tgtcGTCAtcatagc
V$ISL1.01 | Lim homeodomain factors | 1012 | (+) | 0.827 | tgcagttctTAATgttagcatgt
V$RFX4.03 | X-box binding factors | 1064 | (-) | 0.814 | caaGTTGctagggcccctt
V$EN1.01 | Homeobox transcription factors | 922 | (+) | 0.788 | aaatggatTTCAaatggtg
V$SOX9.03 | SOX/SRY-sex/testis determining and related HMG box factors | 1061 | (+) | 0.786 | caCCAAaggggccctagcaactt
V$OSNT.01 | Composed binding site for Oct4, Sox2, Nanog, Tcf3 (Tcf7l1) and Sall4b in pluripotent cells | 1151 | (+) | 0.784 | aatgtcaGCACctcaaatg
V$PROX1.01 | Prospero-related homeobox | 1163 | (+) | 0.783 | aatGATGtcttgt
V$SOX9.03 | SOX/SRY-sex/testis determining and related HMG box factors | 975 | (+) | 0.781 | ttTCAAagccatccttatgggca
V$HSF2.03 | Heat shock factors | 1075 | (+) | 0.777 | ctagcaacttgtAGAAtgtaggcta
V$HSF5.01 | Heat shock factors | 1074 | (-) | 0.764 | agcctacatTCTAcaagttgctagg
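
Matches reported on the (-) strand are written as read from the opposite strand, so a (+) match and a (-) match anchored at the same position should be reverse complements of each other. The sketch below checks this for the two matches anchored at position 1064 in the table (V$RFX3.01 and V$RFX4.03). The helper name is illustrative and is not part of any prediction tool's output.

```python
# Translation table for complementing DNA while preserving letter case
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


rfx3_plus = "aaggggccctagCAACttg"   # V$RFX3.01, anchor 1064, (+) strand
rfx4_minus = "caaGTTGctagggcccctt"  # V$RFX4.03, anchor 1064, (-) strand

print(reverse_complement(rfx3_plus) == rfx4_minus)  # True
```

The uppercase core of each match is preserved by the conversion (CAAC on one strand reads GTTG on the other), which is why the two rows describe the same stretch of the promoter.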

Protein attributes

The TMEM247 gene codes for a single protein, transmembrane protein 247 (also referred to as TMEM247). As a multi-pass transmembrane protein, TMEM247 has two transmembrane domains near the C-terminus. Each is 21 amino acids long, and the two are separated by a span of six amino acids.[6] TMEM247 has a predicted molecular weight of 25 kilodaltons and a predicted isoelectric point of 5.[8]

In composition, TMEM247 has a significantly higher proportion of methionine than the set of all human proteins, and a slightly elevated proportion of glutamic acid in the same analysis. The charge distribution of its amino acids is relatively uniform. Two predicted hydrophobic segments correspond to the two transmembrane regions.

Protein domains

Transmembrane protein 247 has two transmembrane domains, which divide the rest of the protein into three regions. The N-terminal and C-terminal regions are predicted to lie on the outer side of the membrane, while the short segment between the two transmembrane regions is predicted to lie on the inner side.[9][10]

Analysis of TMEM247 predicts that it localizes to the endoplasmic reticulum (ER). In that case, the regions predicted to be inside would lie within the ER, and the regions predicted to be outside would lie in the cytoplasm.

 
A domain-level view of TMEM247 with points of interest at predicted post-translational modification sites

Predicted post-translational modifications

Transmembrane protein 247 has a variety of predicted post-translational modifications that may affect protein function. Predicted modifications include O-beta-GlcNAc attachment, glycation, and O-glycosylation.[11][12][13]

 
A conceptual translation of TMEM247 and predicted modifications to its protein product

Predicted kinase interactions

Protein kinases may modify transmembrane protein 247, and a variety of sites along the translated protein have been predicted to be kinase binding sites. In the conceptual translation, these sites are marked by red squares around the affected amino acids, and they are listed in the table below. For each position, the predicted kinases are listed in descending order of prediction score.[14]

Amino acid position | Kinases
17 | CKI
20 | PKC
29 | unspecified
31 | unspecified
43 | unspecified, DNAPK, ATM
48 | unspecified
49 | CKII, unspecified, DNAPK
50 | unspecified
72 | unspecified, cdk5, p38MAPK
75 | unspecified, PKC
79 | PKC, unspecified
95 | cdk5, p38MAPK, GSK3
98 | unspecified
161 | PKA
219 | PKA

Protein structure

Transmembrane protein 247 has a predicted secondary structure which includes two major features in the form of beta sheets that reside near its determined transmembrane regions. This is slightly unusual for transmembrane proteins, whose transmembrane regions are often alpha helices.[15]

 
A Chou–Fasman method prediction of TMEM247's secondary protein structure
 
A 3D prediction of the TMEM247 secondary protein structure.

Evolutionary history

Orthologs

TMEM247 has several hundred orthologs. The most distant ortholog with a fully sequenced genome available is that of the green anole (Anolis carolinensis).[16][17] The orthologs are exclusive to land-based animals, as clades with an evolutionary origin before reptiles are not represented. Because no orthologs more distant than that of the green anole have been found, the gene likely first appeared in an ancestor of that species and did not exist before the evolution of reptiles. Classes represented among the orthologs include Mammalia, Aves, and Reptilia.

Most orthologs within Mammalia are strongly conserved across the entire gene, including a very highly conserved region near the center of the translated protein. The highest evolutionary conservation is centered on the transmembrane regions of the protein, which are highly conserved in all orthologous species.[18]

Genus and species | Common name | Taxonomic group | Divergence from humans (MYA) | Accession # | Sequence length (aa) | Sequence identity to humans | Sequence similarity to humans
Homo sapiens | Human | Primates | 0 | NP_001138523.1 | 219 | 100% | 100%
Tupaia chinensis | Treeshrew | Scandentia | 82 | XP_006159980.1 | 266 | 74% | 81%
Urocitellus parryii | Arctic ground squirrel | Rodentia | 90 | XP_026241536.1 | 224 | 71% | 77%
Cavia porcellus | Guinea pig | Rodentia | 90 | XP_003472978.1 | 262 | 69% | 77%
Vulpes vulpes | Red fox | Carnivora | 96 | XP_025848559.1 | 231 | 76% | 80%
Sus scrofa | Wild boar | Artiodactyla | 96 | XP_003125218.3 | 257 | 74% | 78%
Pteropus alecto | Black flying fox | Chiroptera | 96 | XP_015442982.1 | 280 | 69% | 78%
Myotis lucifugus | Little brown bat | Chiroptera | 96 | XP_006083536.1 | 212 | 73% | 78%
Lynx canadensis | Canadian lynx | Carnivora | 96 | XP_030167645.1 | 214 | 74% | 78%
Leptonychotes weddellii | Weddell seal | Carnivora | 96 | XP_006740668.1 | 214 | 76% | 81%
Equus caballus | Horse | Perissodactyla | 96 | XP_023474197.1 | 286 | 74% | 78%
Enhydra lutris kenyoni | Sea otter | Carnivora | 96 | XP_022371955.1 | 214 | 76% | 80%
Canis lupus familiaris | Dog | Carnivora | 96 | XP_005626294.1 | 231 | 76% | 80%
Camelus ferus | Wild Bactrian camel | Artiodactyla | 96 | XP_032353339.1 | 276 | 73% | 78%
Bos taurus | Cattle | Artiodactyla | 96 | NP_001070537.2 | 217 | 73% | 78%
Bos indicus × Bos taurus | Hybrid cattle | Artiodactyla | 96 | XP_027410252.1 | 258 | 73% | 78%
Loxodonta africana | African bush elephant | Proboscidea | 105 | XP_023413034.1 | 265 | 73% | 78%
Echinops telfairi | Lesser hedgehog tenrec | Afrosoricida | 105 | XP_004700102.1 | 217 | 70% | 77%
Pelodiscus sinensis | Softshell turtle | Testudines | 312 | XP_006125563.2 | 184 | 46% | 60%
Columba livia | Pigeon | Columbiformes | 312 | XP_021154517.1 | 195 | 44% | 62%
Chelonia mydas | Green sea turtle | Testudines | 312 | XP_027681026.1 | 213 | 38% | 55%
Antrostomus carolinensis | Chuck-will's-widow | Caprimulgiformes | 312 | XP_028940116.1 | 154 | 38% | 52%
Anolis carolinensis | Green anole | Squamata | 312 | XP_008115619.1 | 223 | 33% | 50%

Paralogs

In humans, TMEM247 has a single paralog, hCG17037. Its sequence would theoretically translate into a protein identical to that produced by TMEM247 except at seven positions (a 96.8% similarity), including two deletions that reduce the total amino acid count from 219 to 217.[19] The extreme similarity between TMEM247 and its paralog makes the paralog a likely result of gene duplication.
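
The 96.8% figure is consistent with seven differing positions out of 219. A minimal sketch of that calculation, assuming the percentage counts all seven differences (including the two deletions) against the 219-residue length of TMEM247; the function name is illustrative:

```python
def percent_match(length: int, differing_positions: int) -> float:
    """Percentage of positions that match, given a count of differing positions."""
    if not 0 <= differing_positions <= length:
        raise ValueError("differing_positions must be between 0 and length")
    return 100 * (length - differing_positions) / length


# 219 residues in TMEM247, seven positions that differ in hCG17037
print(round(percent_match(219, 7), 1))  # 96.8
```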

Paralog alignment

CLUSTAL O(1.2.4) multiple sequence alignment of TMEM247 and its paralog, hCG17037

Significance/function

TMEM247 has no major known effects or uses in a clinical setting. Research in mice indicates that TMEM247, despite being found almost exclusively in the testes, is not required for male fertility.[20] A further study found an association between variants in TMEM247 and coronary artery disease, though not one of major significance.[21]

A mutation in TMEM247 has been noted to be unusually common in populations of Tibetan highlanders. The mutation, rs116983452, is a change from cytosine to thymine at nucleotide position 248 in the gene, which causes a missense change from alanine to valine in the protein product.[22]
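
An alanine-to-valine change at nucleotide 248 is what a cytosine-to-thymine (C>T) substitution would produce if position 248 is counted from the first base of the coding sequence. That numbering is an assumption here, and the resulting residue number is derived from it rather than taken from the cited study. The sketch below locates the affected codon and confirms that a C>T change at that codon position converts every alanine codon of the standard genetic code into a valine codon.

```python
# Alanine and valine codons from the standard genetic code
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def codon_location(nt_position: int) -> tuple[int, int]:
    """Map a 1-based coding position to (codon number, position within codon)."""
    codon_number = (nt_position - 1) // 3 + 1
    offset = (nt_position - 1) % 3 + 1
    return codon_number, offset


codon_number, offset = codon_location(248)
print(codon_number, offset)  # 83 2

# Apply a C>T change at that position of each possible alanine codon
for codon in sorted(ALA_CODONS):
    mutated = codon[:offset - 1] + "T" + codon[offset:]
    print(codon, "->", mutated, mutated in VAL_CODONS)
```

Because the third base does not affect the outcome, the check holds whichever alanine codon the gene actually uses.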

While the function of TMEM247 is unknown, the gene is notable for a stop codon that is created by polyadenylation. Research has shown that genes which rely on polyadenylation to create their stop codons are relatively common in a human parasite, Blastocystis.[23]
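
The general mechanism can be illustrated with a toy transcript whose reading frame ends in an incomplete codon: adding the poly(A) tail supplies the missing adenines and completes a UAA stop codon. The sequence below is invented for illustration and is not the TMEM247 transcript; the source does not state which bases precede the poly(A) tail in TMEM247.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons(rna: str) -> list[str]:
    """Split an RNA string into complete codons, reading from the first base."""
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]


def first_stop_index(rna: str):
    """Return the index of the first in-frame stop codon, or None if there is none."""
    for index, codon in enumerate(codons(rna)):
        if codon in STOP_CODONS:
            return index
    return None


# Hypothetical transcript: AUG GCU GAA followed by an incomplete codon "UA"
pre_mrna = "AUGGCUGAAUA"
polyadenylated = pre_mrna + "A" * 8  # poly(A) tail shortened for readability

print(first_stop_index(pre_mrna))        # None
print(first_stop_index(polyadenylated))  # 3
print(codons(polyadenylated)[3])         # UAA
```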

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000284701. Ensembl, May 2017.
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000037689. Ensembl, May 2017.
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ "TMEM247 transmembrane protein 247, Homo sapiens (human)". Gene—NCBI. Retrieved 28 April 2020.
  6. ^ a b "Homo sapiens transmembrane protein 247 (TMEM247), mRNA (345842501)". NCBI Nucleotide Database. 2019.
  7. ^ "Genomatix". Retrieved 29 March 2020.
  8. ^ ExPASy—Compute pI/Mw tool. (n.d.). Retrieved April 20, 2020, from https://web.expasy.org/compute_pi/
  9. ^ TMHMM result. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgibin/webface2.fcgi?jobid=5E9CC91C00001F03029DB033&wait=20
  10. ^ Phobius. (n.d.). Retrieved April 20, 2020, from http://phobius.sbc.su.se/
  11. ^ NetGlycate 1.0 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCC4300001F0306A57D84&wait=20
  12. ^ NetOGlyc 4.0 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCD2200001F033FFFF880&wait=20
  13. ^ YinOYang 1.2 Server. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/services/YinOYang/
  14. ^ NetPhos 3.1 Server—Prediction results. (n.d.). Retrieved April 20, 2020, from http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5E9CCE08000067A5DE7F60BB&wait=20
  15. ^ "CFSSP: Chou & Fasman Secondary Structure Prediction Server". Retrieved 20 April 2020.
  16. ^ "BLAST: Basic Local Alignment Search Tool". Retrieved 1 May 2020.
  17. ^ "UCSC Genome Browser Gateway". Retrieved 1 May 2020.
  18. ^ EMBOSS Needle—Alignment. (n.d.). Retrieved February 9, 2020, from https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_needle-I20200210030452-0663-36912718-p1m
  19. ^ "HCG17037, partial Homo sapiens". Protein—NCBI. Retrieved 1 May 2020.
  20. ^ Miyata H, Castaneda JM, Fujihara Y, Yu Z, Archambeault DR, Isotani A, et al. (July 2016). "Genome engineering uncovers 54 evolutionarily conserved and testis-enriched genes that are not required for male fertility in mice". Proceedings of the National Academy of Sciences of the United States of America. 113 (28): 7704–7710. Bibcode:2016PNAS..113.7704M. doi:10.1073/pnas.1608458113. PMC 4948324. PMID 27357688.
  21. ^ van der Harst P, Verweij N (February 2018). "Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease". Circulation Research. 122 (3): 433–443. doi:10.1161/CIRCRESAHA.117.312086. PMC 5805277. PMID 29212778.
  22. ^ Deng L, Zhang C, Yuan K, Gao Y, Pan Y, Ge X, et al. (November 2019). "Prioritizing natural-selection signals from the deep-sequencing genomic data suggests multi-variant adaptation in Tibetan highlanders". National Science Review. 6 (6): 1201–1222. doi:10.1093/nsr/nwz108. PMC 8291452. PMID 34691999.
  23. ^ Venton D (August 2014). "Highlight: not like a textbook-nuclear genes in blastocystis use mRNA polyadenylation for stop codons". Genome Biology and Evolution. 6 (8): 1962–1963. doi:10.1093/gbe/evu167. PMC 4159010. PMID 25104295.