Chromosome 4 open reading frame 45 (C4orf45) is a protein that in humans is encoded by the C4orf45 gene.[5] It is predicted to localize to the cytoplasm and nucleus.[6]

C4orf45
Identifiers
Aliases: C4orf45, chromosome 4 open reading frame 45
External IDs: MGI: 4936993; HomoloGene: 105634; GeneCards: C4orf45; OMA: C4orf45 - orthologs
Orthologs (human; mouse)
RefSeq (mRNA): NM_152543 (human); NM_001142953, NM_001370773 (mouse)
RefSeq (protein): NP_689756 (human); n/a (mouse)
Location (UCSC): Chr 4: 158.89 – 159.04 Mb (human); Chr 3: 79.24 – 79.37 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Gene

The C4orf45 gene is located on chromosome 4 (4q32.1) from 158,893,134 to 159,082,885, spanning 189,752 bases, and is oriented on the minus strand.[7] Other names for this gene are FLJ25371 and LOC152940.[8]
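
The quoted span can be checked with inclusive coordinate arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive; the function name `span_length` is illustrative and does not come from any genome browser API.

```python
def span_length(start: int, end: int) -> int:
    """Length of an interval given 1-based, inclusive coordinates (assumed)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C4orf45 on chromosome 4.
gene_span = span_length(158_893_134, 159_082_885)
print(f"{gene_span:,} bases")
```

Under that assumption the result agrees with the 189,752 bases stated above.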

mRNA

C4orf45 has four mRNA transcript variants. The most common C4orf45 mRNA transcript is 1263 nucleotides in length and contains 5 exons.[9] The coding sequence spans from nucleotide 134 to 694.[9] In RNA-seq datasets, C4orf45 is expressed at low levels across tissues, with the highest expression in the testes.[10]
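
The coding-sequence coordinates determine the protein length. As a rough sketch, assuming the coordinates are 1-based and inclusive and that the coding sequence ends with an untranslated stop codon (the helper name is hypothetical):

```python
def cds_to_protein_length(cds_start: int, cds_end: int) -> int:
    """Protein length in amino acids for a CDS on an mRNA.

    Assumes 1-based inclusive coordinates and a terminal stop codon
    that is counted in the CDS but not translated.
    """
    cds_nt = cds_end - cds_start + 1
    if cds_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return cds_nt // 3 - 1


MRNA_LENGTH = 1263             # transcript length quoted in the text
CDS_START, CDS_END = 134, 694  # coding sequence coordinates quoted in the text

protein_aa = cds_to_protein_length(CDS_START, CDS_END)
five_prime_utr = CDS_START - 1
three_prime_utr = MRNA_LENGTH - CDS_END
print(protein_aa, five_prime_utr, three_prime_utr)
```

With these assumptions the 561-nucleotide coding sequence corresponds to a 186-residue protein, which matches the length reported for the most common transcript.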

C4orf45 mRNA Transcript Variant Properties
Variant # | mRNA length (nt) | Protein length (aa) | MW (kDa) | # of Exons
1[11] | 1263 | 186 | 21.6 | 5
X1[12] | 1856 | 186 | 21.6 | 7
X3[13] | 1252 | 179 | 20.9 | 5
X4[14] | 1359 | 179 | 20.9 | 5

Protein

The C4orf45 protein produced from the most common mRNA transcript is 186 amino acids in length. Its theoretical isoelectric point is 9.97 and its molecular weight is 21.6 kDa.[5] The protein contains a domain of unknown function, DUF4562.[15] It is also predicted to contain a forkhead-associated (FHA) domain and an SRC homology 3 (SH3) domain. According to the structural prediction from AlphaFold,[16] the C4orf45 protein's tertiary structure consists primarily of alpha helices.
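
As a sanity check on the reported molecular weight, a length-based estimate can be compared with the quoted value. This sketch assumes the common rule-of-thumb average of about 110 Da per residue, which is an approximation and not the method used by the cited source; the function name is illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def estimate_mw_kda(length_aa: int,
                    residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough molecular weight in kDa from protein length alone."""
    return length_aa * residue_mass_da / 1000.0


LENGTH_AA = 186         # protein length quoted in the text
REPORTED_MW_KDA = 21.6  # molecular weight quoted in the text

estimate = estimate_mw_kda(LENGTH_AA)
per_residue_da = REPORTED_MW_KDA * 1000.0 / LENGTH_AA
print(f"estimate: {estimate:.2f} kDa")
print(f"reported mass per residue: {per_residue_da:.1f} Da")
```

The reported value implies a somewhat higher mass per residue than the rule of thumb. A composition-aware calculation from the actual sequence would be needed to account for the difference.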

 
AlphaFold predicted tertiary structure of the human C4orf45 protein. Colors represent charge of amino acids. Annotations include predicted FHA domain, SH3 domain, and nuclear localization signal.

Sub-cellular localization

The human C4orf45 protein is predicted to contain a nuclear localization signal at its C-terminus, suggesting that it localizes to the nucleus.[6] It is also predicted to be found in the cytoplasm.[17]
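
To illustrate what a sequence-based search for a nuclear localization signal can look like, here is a minimal sketch that scans for the classical monopartite consensus K-(K/R)-X-(K/R). This is a simplification chosen for illustration and is not the rule set used by PSORT or DeepLoc. The test sequence is the well-known SV40 large T antigen signal, not the C4orf45 sequence, and the function name is hypothetical.

```python
import re

# Classical monopartite NLS consensus K-(K/R)-X-(K/R).
# A lookahead is used so that overlapping matches are all reported.
MONOPARTITE_NLS = re.compile(r"(?=(K[KR].[KR]))")


def find_nls_candidates(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, motif) pairs for each candidate match."""
    return [
        (match.start() + 1, match.group(1))
        for match in MONOPARTITE_NLS.finditer(sequence.upper())
    ]


# SV40 large T antigen NLS, used here only as a positive control.
hits = find_nls_candidates("PKKKRKV")
for position, motif in hits:
    print(position, motif)
```

A match of this kind only flags a candidate motif. Dedicated predictors weigh additional features before assigning a localization.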

Post-translational modifications

The human C4orf45 protein is predicted to contain four phosphorylation sites.[18] It is also predicted to contain two sumoylation sites, which are common in nuclear proteins and may aid in sub-cellular localization of the protein.[19] C4orf45 is predicted to contain eight O-linked beta-N-acetylglucosamine (O-β-GlcNAc) attachment sites, which may play a role in regulating transcription, signaling, and protein-protein interactions of C4orf45.[20] Five of the O-β-GlcNAc attachment sites are also predicted to be phosphorylation sites.[21] The C-terminus of the human C4orf45 protein is predicted to contain two acetylation sites.[22]

 
Schematic diagram of the human C4orf45 protein displaying predicted post-translational modifications and domains.

Homology

Orthologs

Orthologs of C4orf45 are found in mammals, birds, amphibians, fish, and some invertebrates, but not in plants, fungi, or protists.[7]

Orthologs of the Human C4orf45 Protein
Genus and Species | Common Name | Taxonomic Group | Estimated Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
Homo sapiens | Human | Primates | 0 | NP_689756.2 | 186 | 100 | 100
Mus musculus | House mouse | Rodentia | 90 | NP_001136425.1 | 189 | 67.1 | 76.6
Ailuropoda melanoleuca | Giant panda | Carnivora | 94 | XP_019657778.1 | 197 | 66.0 | 81.7
Crocodylus porosus | Saltwater crocodile | Reptilia | 319 | XP_019390090.1 | 182 | 43.8 | 54.9
Caretta caretta | Loggerhead sea turtle | Reptilia | 319 | XP_048703517.1 | 184 | 40.9 | 54.6
Gallus gallus | Chicken | Aves | 319 | XP_001233138.7 | 194 | 42.0 | 48.0
Cygnus atratus | Black swan | Aves | 319 | XP_050566334.1 | 204 | 37.8 | 46.3
Xenopus tropicalis | Western clawed frog | Amphibia | 353 | XP_002934787.1 | 183 | 41.1 | 47.6
Eleutherodactylus coqui | Common coqui | Amphibia | 353 | KAG9479981.1 | 122 | 44.3 | 35.2
Xenopus laevis | African clawed frog | Amphibia | 353 | XP_018086524.1 | 172 | 43.1 | 46.7
Latimeria chalumnae | West Indian Ocean coelacanth | Sarcopterygii | 414 | XP_014346686.1 | 203 | 40.0 | 40.7
Polyodon spathula | American paddlefish | Actinopterygii | 431 | XP_041119378.1 | 202 | 39.4 | 43.9
Salmo salar | Atlantic salmon | Actinopterygii | 431 | NP_001134722.1 | 160 | 37.9 | 41.7
Scyliorhinus canicula | Small-spotted catshark | Chondrichthyes | 464 | XP_038648240.1 | 201 | 43.2 | 45.8
Branchiostoma lanceolatum | European lancelet | Leptocardii | 556 | CAH1247082.1 | 185 | 35.0 | 41.3
Phallusia mammillata | Tunicate | Ascidiacea | 603 | CAB3227125.1 | 186 | 38.1 | 34.9
Anneissia japonica | Anneissia | Echinodermata | 619 | XP_033121347.1 | 208 | 35.2 | 32.2
Stylophora pistillata | Hood coral | Cnidaria | 685 | XP_022802728.1 | 206 | 40.7 | 41.6
Exaiptasia diaphana | Pale anemone | Cnidaria | 685 | KXJ18647.1 | 219 | 34.5 | 30.5
Pecten maximus | Great scallop | Mollusca | 694 | XP_033741332.1 | 214 | 32.7 | 39.7
Haliotis rubra | Blacklip abalone (sea snail) | Mollusca | 694 | XP_046577682.1 | 204 | 36.6 | 37.4

Evolution

C4orf45 seems to have first appeared in invertebrates. A comparison of date of divergence (MYA) against corrected sequence divergence for C4orf45, cytochrome c, and fibrinogen alpha suggests that C4orf45 is evolving at an intermediate rate, between the fast-evolving fibrinogen alpha and the slow-evolving cytochrome c.
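
The article does not state which correction was applied to the sequence divergence. One common choice is the Poisson correction, d = -ln(1 - p), where p is the fraction of differing residues. The sketch below assumes that correction and uses a few identity values and divergence dates taken from the ortholog table; the function name is illustrative.

```python
import math


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p), with p = 1 - identity.

    The choice of the Poisson correction is an assumption; the source
    does not say which correction was used.
    """
    identity = identity_percent / 100.0
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in the interval (0, 100]")
    return -math.log(identity)


# (species, estimated divergence in MYA, sequence identity in %),
# values taken from the ortholog table.
orthologs = [
    ("Mus musculus", 90, 67.1),
    ("Gallus gallus", 319, 42.0),
    ("Xenopus tropicalis", 353, 41.1),
    ("Pecten maximus", 694, 32.7),
]

for species, mya, identity in orthologs:
    d = corrected_divergence(identity)
    print(f"{species}: {mya} MYA, corrected divergence {d:.3f}")
```

Plotting the corrected divergence against divergence date for several proteins gives lines whose slopes can be compared, which is the kind of comparison described above.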

Function

Interacting proteins

The human C4orf45 protein has been experimentally shown to interact with BANP (BTG3 associated nuclear protein) and PFDN5 (prefoldin subunit 5).[23] It also interacts with NEK4 (NIMA related kinase 4), a serine/threonine protein kinase that is required for normal entry into replicative senescence.[24]

Clinical significance

C4orf45 has been identified as part of four different driver gene sets in the development of ovarian cancer.[25] C4orf45 has also been implicated in the development of multiple myeloma (MM).[26] A study found a reciprocal translocation between a breakpoint in chromosome 8 downstream of the MYC gene and a breakpoint in chromosome 4 in the C4orf45 gene in a patient with MM.[26] An intronic variant reported in C4orf45 suggests that the gene may contain variants associated with the development of cardiovascular disease in Mexican Americans.[27] A cross-trait meta-analysis found a SNP at the C4orf45 gene locus that is shared between late-onset Alzheimer's disease and snoring.[28] Another study found that the C4orf45 gene was altered by a copy number variant (CNV) in each member of a family diagnosed with familial schizophrenia (SCZ), indicating the gene's possible involvement in SCZ.[29]

 
Conceptual translation of human (Homo sapiens) C4orf45 transcript 1 and the protein it encodes using Bioline Six-Frame Translation Tool. The C4orf45 mRNA transcript 1 (NM_152543.3) and uncharacterized protein C4orf45 (NP_689756.2) amino acid sequence were used for the translation. mRNA annotations include exon boundaries, poly A signal, poly A site, and start and stop codons. Protein annotations include a domain of unknown function (DUF4562), FHA domain, SH3 domain, and a nuclear localization signal. Predicted phosphorylation, O-β-GlcNAc attachment, sumoylation, YinYang, and acetylation sites are also labeled. The alpha helices predicted by AlphaFold are underlined.

References

  1. GRCh38: Ensembl release 89: ENSG00000164123. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000091685. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "uncharacterized protein C4orf45 [Homo sapiens]". NCBI Protein. Retrieved 26 September 2022.
  6. Nakai K, Horton P (January 1999). "PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization". Trends in Biochemical Sciences. 24 (1): 34–36. doi:10.1016/S0968-0004(98)01336-X. PMID 10087920.
  7. "C4orf45 Gene - Chromosome 4 Open Reading Frame 45". GeneCards. Retrieved 26 September 2022.
  8. "C4orf45". Alliance of Genome Resources. Retrieved 26 September 2022.
  9. "Homo sapiens chromosome 4 open reading frame 45 (C4orf45), mRNA". NCBI Nucleotide. 27 June 2021. Retrieved 26 September 2022.
  10. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. (February 2014). "Analysis of the Human Tissue-specific Expression by Genome-wide Integration of Transcriptomics and Antibody-based Proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi:10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  11. "Homo sapiens chromosome 4 open reading frame 45 (C4orf45), mRNA". 27 June 2021.
  12. "PREDICTED: Homo sapiens chromosome 4 open reading frame 45 (C4orf45), transcript variant X1, mRNA". 5 April 2022.
  13. "PREDICTED: Homo sapiens chromosome 4 open reading frame 45 (C4orf45), transcript variant X3, mRNA". 5 April 2022.
  14. "PREDICTED: Homo sapiens chromosome 4 open reading frame 45 (C4orf45), transcript variant X4, mRNA". 5 April 2022.
  15. "C4orf45". The Human Protein Atlas. Retrieved 26 September 2022.
  16. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. (August 2021). "Highly accurate protein structure prediction with AlphaFold". Nature. 596 (7873): 583–589. Bibcode:2021Natur.596..583J. doi:10.1038/s41586-021-03819-2. PMC 8371605. PMID 34265844.
  17. Almagro Armenteros JJ, Sønderby CK, Sønderby SK, Nielsen H, Winther O (December 2017). "DeepLoc: prediction of protein subcellular localization using deep learning". Bioinformatics. 33 (24): 4049. doi:10.1093/bioinformatics/btx548. PMID 29028934.
  18. Blom N, Gammeltoft S, Brunak S (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–1362. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  19. Zhao J (December 2007). "Sumoylation regulates diverse biological processes". Cellular and Molecular Life Sciences. 64 (23): 3017–3033. doi:10.1007/s00018-007-7137-4. PMC 7079795. PMID 17763827.
  20. Sakabe K, Wang Z, Hart GW (November 2010). "Beta-N-acetylglucosamine (O-GlcNAc) is part of the histone code". Proceedings of the National Academy of Sciences of the United States of America. 107 (46): 19915–19920. Bibcode:2010PNAS..10719915S. doi:10.1073/pnas.1009023107. PMC 2993388. PMID 21045127.
  21. Gupta R, Brunak S (2002). "Prediction of glycosylation across the human proteome and the correlation to protein function". Pacific Symposium on Biocomputing: 310–322. PMID 11928486.
  22. Deng W, Wang C, Zhang Y, Xu Y, Zhang S, Liu Z, et al. (December 2016). "GPS-PAIL: prediction of lysine acetyltransferase-specific modification sites from protein sequences". Scientific Reports. 6 (1): 39787. Bibcode:2016NatSR...639787D. doi:10.1038/srep39787. PMC 5177928. PMID 28004786.
  23. Luck K, Kim DK, Lambourne L, Spirohn K, Begg BE, Bian W, et al. (April 2020). "A reference map of the human binary protein interactome". Nature. 580 (7803): 402–408. Bibcode:2020Natur.580..402L. doi:10.1038/s41586-020-2188-x. PMC 7169983. PMID 32296183.
  24. Basei FL, Meirelles GV, Righetto GL, Dos Santos Migueleti DL, Smetana JH, Kobarg J (2015). "New interaction partners for Nek4.1 and Nek4.2 isoforms: from the DNA damage response to RNA splicing". Proteome Science. 13: 11. doi:10.1186/s12953-015-0065-6. PMC 4367857. PMID 25798074.
  25. Zhang J, Zhang S (June 2017). "Discovery of cancer common and specific driver gene sets". Nucleic Acids Research. 45 (10): e86. doi:10.1093/nar/gkx089. PMC 5449640. PMID 28168295.
  26. Affer M, Chesi M, Chen WG, Keats JJ, Demchenko YN, Roschke AV, et al. (August 2014). "Promiscuous MYC locus rearrangements hijack enhancers but mostly super-enhancers to dysregulate MYC expression in multiple myeloma". Leukemia. 28 (8): 1725–1735. doi:10.1038/leu.2014.70. PMC 4126852. PMID 24518206.
  27. Gao C, Tabb KL, Dimitrov LM, Taylor KD, Wang N, Guo X, et al. (April 2018). "Exome Sequencing Identifies Genetic Variants Associated with Circulating Lipid Levels in Mexican Americans: The Insulin Resistance Atherosclerosis Family Study (IRASFS)". Scientific Reports. 8 (1): 5603. Bibcode:2018NatSR...8.5603G. doi:10.1038/s41598-018-23727-2. PMC 5884862. PMID 29618726.
  28. Chen D, Wang X, Huang T, Jia J (2022). "Sleep and Late-Onset Alzheimer's Disease: Shared Genetic Risk Factors, Drug Targets, Molecular Mechanisms, and Causal Effects". Frontiers in Genetics. 13: 794202. doi:10.3389/fgene.2022.794202. PMC 9152224. PMID 35656316.
  29. Xu B, Woodroffe A, Rodriguez-Murillo L, Roos JL, van Rensburg EJ, Abecasis GR, et al. (September 2009). "Elucidating the genetic architecture of familial schizophrenia using rare copy number variant and linkage scans". Proceedings of the National Academy of Sciences of the United States of America. 106 (39): 16746–16751. Bibcode:2009PNAS..10616746X. doi:10.1073/pnas.0908584106. PMC 2757863. PMID 19805367.