Small integral membrane protein 14, also known as SMIM14 or C4orf34, is a protein encoded on chromosome 4 of the human genome by the SMIM14 gene.[2] SMIM14 has at least 298 orthologs, found mainly in jawed vertebrates, and no paralogs.[3] SMIM14 is classified as a type I transmembrane protein. Although the function of this protein is not well understood, the transmembrane domain of SMIM14 may be involved in ER retention.[4]
Gene
The SMIM14 gene is located on the minus strand at cytogenetic band 4p14 and is 92,567 base pairs in length.[5] The gene has five exons, four of which constitute the open reading frame for SMIM14.[6]
The Kozak sequence of SMIM14, the motif that functions as the protein translation initiation site in most eukaryotic mRNA transcripts, is considered strong.[7] SMIM14 has no signal peptide; instead, the encoded transmembrane domain acts as the signal sequence. One disulfide bridge is predicted in the encoded protein; disulfide bridges stabilize the tertiary (and sometimes quaternary) structure of proteins. There are at least ten polyadenylation sequences in the 3′ UTR of the SMIM14 gene, which signal transcription termination.
SMIM14 is expressed at four times the level of an average gene.[8]
Gene regulation
Promoter
SMIM14 has seven predicted promoter regions. The promoter with the greatest number of transcripts and CAGE tags is 1,420 base pairs in length. It lies on the minus strand and spans positions 39,638,806 to 39,640,225. This promoter has five coding transcripts and a maximum of 105,458 CAGE tags from a single transcript.[9]
Promoter ID | Start Position | End Position | Length (bp) | Coding Transcripts |
---|---|---|---|---|
GXP_150112 | 39,549,547 | 39,550,812 | 1,266 | 0 |
GXP_3198013 | 39,583,919 | 39,584,958 | 1,040 | 0 |
GXP_9520406 | 39,605,105 | 39,606,144 | 1,040 | N/A |
GXP_9520407 | 39,626,490 | 39,627,529 | 1,040 | N/A |
GXP_6750876 | 39,627,082 | 39,628,121 | 1,040 | 1 |
GXP_3198015 | 39,638,191 | 39,639,230 | 1,040 | 0 |
GXP_6750877 | 39,638,806 | 39,640,225 | 1,420 | 5 |
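The lengths in the table follow from the coordinates when both endpoints are counted. The arithmetic can be sketched in a few lines of Python; the function name `promoter_length` is an illustrative assumption and is not part of any tool cited here.

```python
def promoter_length(start: int, end: int) -> int:
    """Length in base pairs of a region whose start and end are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates copied from three rows of the promoter table.
promoters = {
    "GXP_150112": (39_549_547, 39_550_812),
    "GXP_3198013": (39_583_919, 39_584_958),
    "GXP_6750877": (39_638_806, 39_640_225),
}

lengths = {pid: promoter_length(s, e) for pid, (s, e) in promoters.items()}
for pid, bp in lengths.items():
    print(pid, bp)
```

The `+ 1` reflects inclusive coordinates; a half-open convention would give values one base pair shorter than those tabulated.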
The CpG sites associated with the SMIM14 gene are found in CpG island 76; additional transcription factors can bind to this promoter to drive SMIM14 gene expression.[10]
Literature-curated Transcription Factors (via ORegAnno) |
---|
SMARCA4 |
STAT1 |
RBL2 |
TRIM28 |
EGR1 |
TFAP2C |
RNA and expression
SMIM14 has three mRNA transcript variants. Transcript variant 1 is the longest variant, with 6,397 base pairs.[2]
Transcript | Length (bp) | Accession Number |
---|---|---|
Transcript variant 1 | 6,397 | NM_001317896.2 |
Transcript variant 2 | 6,252 | NM_174921 |
Transcript variant 3 | 6,263 | NM_001317897 |
SMIM14 has high expression in the liver, adrenal gland, colon, and prostate. It is under-expressed in peripheral blood lymphocytes, skeletal muscles, and the heart.[11]
Protein
Transcript variant 1 of SMIM14 encodes a protein of 99 amino acids.[13]
Primary structure
The predicted molecular weight (Mw) of the SMIM14 protein is 10710.34 Da. The protein carries no net electrical charge at pH 5.10, its isoelectric point (pI).[14] The abundance of every amino acid is within the normal range for humans.[14]
Transmembrane domain and motifs
The Kozak sequence is considered a strong motif.[7]
SMIM14 has one transmembrane domain, so it is classified as a single-pass membrane protein.[15] The transmembrane domain spans residues 51–70.[16] A dileucine motif, which plays a role in the sorting of transmembrane proteins to endosomes and lysosomes, is predicted within the domain.[17] The N-terminus is positioned in the extracellular space, while the C-terminus is located inside the cell, further classifying SMIM14 as a type I transmembrane protein.
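The topology described above divides the protein into three regions. As a rough illustration, the sketch below derives their sizes from the 99-residue length and the 51–70 transmembrane span given in this article, assuming 1-based, inclusive residue numbering. The function name and dictionary keys are hypothetical.

```python
def partition_single_pass(protein_length: int, tm_start: int, tm_end: int) -> dict:
    """Region lengths of a single-pass membrane protein (1-based, inclusive TM span)."""
    if not 1 <= tm_start <= tm_end <= protein_length:
        raise ValueError("transmembrane span must lie within the protein")
    return {
        "n_terminal": tm_start - 1,              # residues before the TM domain
        "transmembrane": tm_end - tm_start + 1,  # residues inside the membrane
        "c_terminal": protein_length - tm_end,   # residues after the TM domain
    }


# Values taken from the article: 99 amino acids, TM domain at residues 51-70.
regions = partition_single_pass(99, 51, 70)
print(regions)
```

Under these assumptions the C-terminal tail is 29 residues long, which matches the 29-residue C-terminus referred to in the homology section.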
Secondary structure
An α-helix is predicted within the transmembrane domain.[18] SMIM14 is also predicted to be randomly coiled near the C-terminus.[18][19] A random coil is a region that lacks regular secondary structure and therefore assumes a relaxed conformation that is neither interacting nor stabilizing. Extended strands (E-strands) are predicted throughout the protein.[18][19] E-strands are also a common secondary structure and are often characterized by their involvement in hydrogen bonding with polar side chains.
Within the N-terminal region, SMIM14 is predicted to have three palmitoylation sites,[20] which facilitate the clustering of proteins, and one disulfide bridge, which stabilizes the structure of the protein. There is also a predicted glycosaminoglycan site spanning residues 45–48, proximal to the transmembrane domain.[21] The C-terminus is predicted to have two unidentified phosphorylation sites and one PKA-phosphorylation site.[22]
Subcellular location
SMIM14, a transmembrane protein, is usually localized to the ER membrane.[4] While there is no conventional ER retention signal within the SMIM14 coding sequence, it has been suggested that the transmembrane domain mediates ER retention.
Homology
SMIM14 has no known paralogs and at least 298 orthologs.
Paralogs
BLAST searches have found no paralogs of the SMIM14 gene in Homo sapiens.[23]
Orthologs
SMIM14 is conserved in most vertebrates, excluding hagfish, lampreys, lobe-finned fish, and lungfish.[23] Among invertebrates, it is conserved in flatworms, roundworms, mollusks, and arthropods. It is also relatively well conserved in more distant relatives, such as sea anemones and corals.
Species | Common Name | Taxon | Date of Divergence (mya) | % Identity | % Similarity | Corrected % Divergence (m) | Accession Number |
---|---|---|---|---|---|---|---|
Mastomys coucha | Southern multimammate mouse | rodentia | 90 | 87.9 | 98.0 | 12.9 | XP_031198284.1 |
Phyllostomus discolor | pale spear-nosed bat | mammalia | 96 | 93.4 | 99.0 | 6.70 | XP_028361411.1 |
Manacus vitellinus | golden-collared manakin | aves | 312 | 85.1 | 91.1 | 16.1 | XP_017923893.1 |
Python bivittatus | Burmese python | reptilia | 312 | 80.2 | 89.1 | 22.1 | XP_007426519 |
Nanorana parkeri | high Himalaya frog | amphibia | 352 | 69.2 | 79.8 | 36.8 | XP_018420132.1 |
Danio rerio | zebrafish | actinopterygii | 435 | 68.0 | 82.5 | 38.6 | NP_991165.1 |
Rhincodon typus | whale shark | chondrichthyes | 473 | 71.8 | 84.5 | 33.1 | XP_020383770.1 |
Ciona intestinalis | sea vase | ascidiacea | 676 | 42.7 | 55.3 | 85.1 | XP_026690156.1 |
Strongylocentrotus purpuratus | Pacific purple sea urchin | echinodermata | 684 | 50.5 | 68.0 | 68.3 | XP_787363.2 |
Lingula anatina | lamp shell | brachiopoda | 797 | 59.0 | 74.3 | 52.8 | XP_013382479.1 |
Limulus polyphemus | Atlantic horseshoe crab | arthropoda | 797 | 49.5 | 65.0 | 70.3 | XP_013782563.1 |
Agrilus planipennis | emerald ash borer | insecta | 797 | 39.8 | 57.3 | 92.1 | XP_018319678.1 |
Octopus vulgaris | octopus | mollusca | 797 | 51.0 | 64.4 | 67.3 | XP_029637526.1 |
Strongyloides ratti | threadworm | nematoda | 797 | 33.3 | 48.1 | 110 | XP_024504825.1 |
Exaiptasia pallida | sea anemone | anthozoa | 824 | 58.2 | 65.5 | 54.1 | XP_020902189.1 |
Schistosoma haematobium | urinary blood fluke | platyhelminthes | 824 | 37.4 | 53.3 | 98.3 | XP_012793134.1 |
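The table does not state how the corrected divergence column was computed. Most of the tabulated values are consistent with the commonly used correction m = −ln(q) × 100, where q is the fraction of identical residues. Assuming that formula, which is an inference rather than something the cited sources confirm, the calculation can be sketched as follows.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Assumed correction: m = -ln(q) * 100, with q the identity fraction."""
    q = percent_identity / 100.0
    if not 0.0 < q <= 1.0:
        raise ValueError("percent identity must be in the range (0, 100]")
    return -math.log(q) * 100.0


# Percent identity values copied from two rows of the ortholog table.
m_mouse = corrected_divergence(87.9)      # Mastomys coucha
m_zebrafish = corrected_divergence(68.0)  # Danio rerio
print(round(m_mouse, 1), round(m_zebrafish, 1))
```

The correction grows faster than the raw percent difference because it allows for multiple substitutions at the same site, which is why distant orthologs can show values above 100.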
Among orthologs, the SMIM14 sequence is highly conserved toward the N-terminus, whereas the C-terminus is more variable. Sequence analysis of human SMIM14 suggests that the C-terminus contains a disproportionate number of proline residues (9 out of 29; 31%), including several proline-rich sequences (PXXP).[4] Proline-rich domains are usually associated with protein-protein interactions; thus, the C-terminus is likely to interact with other proteins.
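The proline content and PXXP motifs described above can be checked with a short script. The sketch below reproduces the 9-out-of-29 percentage and shows one way to locate PXXP motifs with a regular expression. The peptide `toy_tail` is a made-up example for illustration only and is not the SMIM14 sequence.

```python
import re


def proline_percent(n_proline: int, n_residues: int) -> float:
    """Percentage of residues that are proline."""
    return 100.0 * n_proline / n_residues


def find_pxxp(sequence: str) -> list:
    """1-based start positions of PXXP motifs, including overlapping ones."""
    # The lookahead matches without consuming, so overlapping motifs are found.
    return [m.start() + 1 for m in re.finditer(r"(?=P..P)", sequence)]


# Counts from the article: 9 prolines among the 29 C-terminal residues.
c_terminal_percent = proline_percent(9, 29)

# Hypothetical peptide, NOT the real SMIM14 C-terminus.
toy_tail = "APKTPAAPPLSP"
motif_starts = find_pxxp(toy_tail)
print(round(c_terminal_percent), motif_starts)
```

A lookahead is used because adjacent PXXP motifs often share a proline, and a plain pattern would skip the second motif of each overlapping pair.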
Protein interactions
SMIM14 has been predicted to interact with the FATE1 protein, which is involved in Ca2+ transfer from the ER to mitochondria, a regulatory mechanism for apoptosis.[24][25] SMIM14 has also been predicted to interact with LSM4, a glycine-rich protein that plays a role in pre-mRNA splicing.[26][27]
References
- ^ Hunt, Sarah E; McLaren, William; Gil, Laurent; Thormann, Anja; Schuilenburg, Helen; Sheppard, Dan; Parton, Andrew; Armean, Irina M; Trevanion, Stephen J; Flicek, Paul; Cunningham, Fiona (1 January 2018). "Ensembl variation resources". Database. 2018. doi:10.1093/database/bay119. PMC 6310513. PMID 30576484.
- ^ a b "Homo sapiens small integral membrane protein 14 (SMIM14), transcript variant 1, mRNA". 2019-07-07.
- ^ "SMIM14 orthologs". NCBI. Retrieved 2020-02-07.
- ^ a b c Jun, Mi-Hee; Jun, Young-Wu; Kim, Kun-Hyung; Lee, Jin-A; Jang, Deok-Jin (31 October 2014). "Characterization of the cellular localization of C4orf34 as a novel endoplasmic reticulum resident protein". BMB Reports. 47 (10): 563–568. doi:10.5483/bmbrep.2014.47.10.252. PMC 4261514. PMID 24499674.
- ^ Chalifa-Caspi, V.; Shmueli, O; Benjamin-Rodrig, H; Rosen, N; Shmoish, M; Yanai, I; Ophir, R; Kats, P; Safran, M; Lancet, D (1 January 2003). "GeneAnnot: Interfacing GeneCards with high-throughput gene expression compendia". Briefings in Bioinformatics. 4 (4): 349–360. doi:10.1093/bib/4.4.349. PMID 14725348.
- ^ "SMIM14 Gene - GeneCards | SIM14 Protein | SIM14 Antibody". www.genecards.org. Retrieved 2020-02-25.
- ^ a b Hernández, Greco; Osnaya, Vincent G.; Pérez-Martínez, Xochitl (1 December 2019). "Conservation and Variability of the AUG Initiation Codon Context in Eukaryotes". Trends in Biochemical Sciences. 44 (12): 1009–1021. doi:10.1016/j.tibs.2019.07.001. PMID 31353284. S2CID 198966937.
- ^ "AceView: Gene:C4orf34, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
- ^ Cartharius, K.; Frech, K.; Grote, K.; Klocke, B.; Haltmeier, M.; Klingenhoff, A.; Frisch, M.; Bayerlein, M.; Werner, T. (1 July 2005). "MatInspector and beyond: promoter analysis based on transcription factor binding sites". Bioinformatics. 21 (13): 2933–2942. doi:10.1093/bioinformatics/bti473. PMID 15860560.
- ^ Kent, W. J.; Sugnet, C. W.; Furey, T. S.; Roskin, K. M.; Pringle, T. H.; Zahler, A. M.; Haussler, D. (16 May 2002). "The Human Genome Browser at UCSC". Genome Research. 12 (6): 996–1006. doi:10.1101/gr.229102. PMC 186604. PMID 12045153.
- ^ "49002542 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
- ^ "Protter - interactive protein feature visualization". wlab.ethz.ch. Retrieved 2020-05-01.
- ^ "small integral membrane protein 14 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-04-30.
- ^ a b Brendel, V.; Bucher, P.; Nourbakhsh, I. R.; Blaisdell, B. E.; Karlin, S. (15 March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
- ^ Kall, L.; Krogh, A.; Sonnhammer, E. L.L. (8 May 2007). "Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server". Nucleic Acids Research. 35 (Web Server): W429–W432. doi:10.1093/nar/gkm256. PMC 1933244. PMID 17483518.
- ^ Gouw, Marc; Michael, Sushama; Sámano-Sánchez, Hugo; Kumar, Manjeet; Zeke, András; Lang, Benjamin; Bely, Benoit; Chemes, Lucía B; Davey, Norman E; Deng, Ziqi; Diella, Francesca; Gürth, Clara-Marie; Huber, Ann-Kathrin; Kleinsorg, Stefan; Schlegel, Lara S; Palopoli, Nicolás; Roey, Kim V; Altenberg, Brigitte; Reményi, Attila; Dinkel, Holger; Gibson, Toby J (4 January 2018). "The eukaryotic linear motif resource – 2018 update". Nucleic Acids Research. 46 (D1): D428–D434. doi:10.1093/nar/gkx1077. PMC 5753338. PMID 29136216.
- ^ Bonifacino, Juan S.; Traub, Linton M. (June 2003). "Signals for Sorting of Transmembrane Proteins to Endosomes and Lysosomes". Annual Review of Biochemistry. 72 (1): 395–447. doi:10.1146/annurev.biochem.72.121801.161800. PMID 12651740.
- ^ a b c Combet, C; Blanchet, C; Geourjon, C; Deléage, G (March 2000). "NPS@: Network Protein Sequence Analysis". Trends in Biochemical Sciences. 25 (3): 147–150. doi:10.1016/s0968-0004(99)01540-6. PMID 10694887.
- ^ a b Ashok Kumar, T (1 April 2013). "CFSSP: Chou and Fasman Secondary Structure Prediction server". Wide Spectrum. 1 (9): 15–19. doi:10.5281/ZENODO.50733.
- ^ Ren, J.; Wen, L.; Gao, X.; Jin, C.; Xue, Y.; Yao, X. (27 August 2008). "CSS-Palm 2.0: an updated software for palmitoylation sites prediction". Protein Engineering Design and Selection. 21 (11): 639–644. doi:10.1093/protein/gzn039. PMC 2569006. PMID 18753194.
- ^ Gouw, Marc; Michael, Sushama; Sámano-Sánchez, Hugo; Kumar, Manjeet; Zeke, András; Lang, Benjamin; Bely, Benoit; Chemes, Lucía B; Davey, Norman E; Deng, Ziqi; Diella, Francesca; Gürth, Clara-Marie; Huber, Ann-Kathrin; Kleinsorg, Stefan; Schlegel, Lara S; Palopoli, Nicolás; Roey, Kim V; Altenberg, Brigitte; Reményi, Attila; Dinkel, Holger; Gibson, Toby J (4 January 2018). "The eukaryotic linear motif resource – 2018 update". Nucleic Acids Research. 46 (D1): D428–D434. doi:10.1093/nar/gkx1077. PMC 5753338. PMID 29136216.
- ^ Blom, Nikolaj; Sicheritz-Pontén, Thomas; Gupta, Ramneek; Gammeltoft, Steen; Brunak, Søren (June 2004). "Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence". Proteomics. 4 (6): 1633–1649. doi:10.1002/pmic.200300771. PMID 15174133. S2CID 18810164.
- ^ a b Altschul, Stephen F.; Gish, Warren; Miller, Webb; Myers, Eugene W.; Lipman, David J. (October 1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712. S2CID 14441902.
- ^ "FATE1 - Fetal and adult testis-expressed transcript protein - Homo sapiens (Human) - FATE1 gene & protein". www.uniprot.org. Retrieved 2020-04-30.
- ^ Doghman-Bouguerra, Mabrouka; Granatiero, Veronica; Sbiera, Silviu; Sbiera, Iuliu; Lacas-Gervais, Sandra; Brau, Frédéric; Fassnacht, Martin; Rizzuto, Rosario; Lalli, Enzo (September 2016). "FATE 1 antagonizes calcium- and drug-induced apoptosis by uncoupling ER and mitochondria". EMBO Reports. 17 (9): 1264–1280. doi:10.15252/embr.201541504. PMC 5007562. PMID 27402544.
- ^ "LSM4 - U6 snRNA-associated Sm-like protein LSm4 - Homo sapiens (Human) - LSM4 gene & protein". www.uniprot.org. Retrieved 2020-04-30.
- ^ Bertram, Karl; Agafonov, Dmitry E.; Dybkov, Olexandr; Haselbach, David; Leelaram, Majety N.; Will, Cindy L.; Urlaub, Henning; Kastner, Berthold; Lührmann, Reinhard; Stark, Holger (August 2017). "Cryo-EM Structure of a Pre-catalytic Human Spliceosome Primed for Activation". Cell. 170 (4): 701–713.e11. doi:10.1016/j.cell.2017.07.011. PMID 28781166. S2CID 12185819.