Uncharacterized protein C1orf131 is a protein that in humans is encoded by the gene C1orf131. The first ortholog of this protein was discovered in humans.[5][6] Homologs of C1orf131 have since been identified in numerous species through bioinformatic searches, and as a result most proteins in this family are named Uncharacterized protein C1orf131 homolog.
C1orf131 | |
---|---|
Identifiers | |
Aliases | C1orf131, chromosome 1 open reading frame 131 |
External IDs | MGI: 1913773; HomoloGene: 11982; GeneCards: C1orf131; OMA: C1orf131 - orthologs |
Gene
In humans, C1orf131 is located on the minus strand of chromosome 1, on cytogenetic band 1q42.2, along with 193 other genes.[7] The gene upstream of C1orf131 is GNPAT, and the gene downstream is TRIM67. When transcribed in humans, C1orf131 most often forms an mRNA 1458 base pairs long, composed of seven exons. At least nine other alternative splice forms in humans produce proteins. They range in size from 129 base pairs (2 exons) to 1458 base pairs (7 exons).[8]
Protein
In the C1orf131 protein family, the proteins are between 93 and 450 amino acids long; however, the majority are between 160 and 295 amino acids long. Their molecular weights range from 10.6 to 49.0 kDa, with the majority between 18.6 and 32.7 kDa, and their isoelectric points range from 9.6 to 11.2.[9] Over 30 orthologs from mammals, birds and lizards have been identified as having a poly(A) RNA binding site.[10] All orthologs in this protein family have a domain of unknown function, DUF4602.[10][11] The human protein has been shown to be both phosphorylated and acetylated.[12][13][14][15][16][17] These proteins are rich in lysine, in charged amino acids (DEHKR), and in basic charged amino acids (HKR).[18] The secondary structure of these proteins consists primarily of alpha helices and coils, with a small percentage of beta strands.[19] C1orf131 has been shown to interact with ubiquitin,[20] through affinity capture followed by mass spectrometry, and with APP (amyloid beta (A4) precursor protein),[21] through a reconstituted complex.
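The compositional description above (lysine-rich, rich in charged DEHKR and basic HKR residues) can be illustrated with a minimal Python sketch. It only counts residue classes in a sequence string; it is not the statistical method of the cited analysis, and the function name `composition_summary` and the toy peptide are hypothetical, chosen for illustration rather than taken from C1orf131.

```python
from collections import Counter

# Residue groupings as named in the text (one-letter amino acid codes).
CHARGED = set("DEHKR")  # charged residues
BASIC = set("HKR")      # basic (positively charged) residues


def composition_summary(sequence: str) -> dict:
    """Return the fraction of lysine, charged, and basic residues in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    return {
        "lysine": counts["K"] / total,
        "charged": sum(counts[r] for r in CHARGED) / total,
        "basic": sum(counts[r] for r in BASIC) / total,
    }


# Hypothetical toy peptide for illustration only; NOT the C1orf131 sequence.
toy_peptide = "MKKRSEKDHKLAKRGKEEKK"
summary = composition_summary(toy_peptide)
for residue_class, fraction in summary.items():
    print(f"{residue_class}: {fraction:.2f}")
```

Applied to a real ortholog sequence, high values in all three classes would be consistent with the basic isoelectric points reported above, although the isoelectric point itself requires a dedicated calculation.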
DUF4602
DUF4602 (PF15375) is generally 120+ amino acids long.[22] Typically only one gene contains this domain; however, in several species it has been identified in two different proteins. In Trichuris suis, DUF4602 is found in both hypothetical protein M5114_09117 and tRNA pseudouridine synthase D, and in Echinococcus granulosus it has been found in hypothetical protein EGR 05135 and expressed conserved protein. DUF4602 has been found primarily in eukaryotes; however, it has also been identified in the virus DRHN1, Bacillus sp. UNC41MFS5, Enterococcus faecalis, and Enterococcus faecalis 13-SD-W-01. In larger C1orf131 orthologs (250+ residues), the domain typically lies in the middle of the protein, toward the C-terminus; in smaller orthologs (160-250 residues), it lies near the N-terminus. Larger orthologs also contain regions of low complexity, which could indicate that these proteins are intrinsically disordered.
Evolutionary history
This gene family exists only in eukaryotes. There are no paralogs of this gene; however, there are a few pseudogenes of C1orf131. Thus far, pseudogenes have been found only in orangutans, mouse lemurs, and sloths.[11] When this gene family is compared with cytochrome c, a slowly evolving gene,[23] and fibrinogen gamma chain, a rapidly evolving gene,[24] it is shown to evolve at a faster rate than fibrinogen.
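The source does not state how the rate comparison was carried out. One common way to compare evolutionary rates across proteins is to compute a corrected distance between aligned ortholog pairs and relate it to divergence time. The sketch below assumes the Poisson correction, d = -ln(1 - p), where p is the fraction of differing aligned sites; the function name `poisson_distance` and the example strings are hypothetical.

```python
import math


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences.

    Both sequences must come from the same alignment (equal length);
    columns containing a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions to compare")
    # p is the observed proportion of sites that differ.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("sequences differ at every site; distance is undefined")
    return -math.log(1.0 - p)


# Hypothetical aligned fragments for illustration only.
distance = poisson_distance("MKKRSEKD", "MKRRSEKE")
print(f"corrected distance: {distance:.3f}")
```

Plotting such distances against divergence times for each protein gives one slope per protein; a steeper slope than that of fibrinogen gamma chain would correspond to the faster rate described above.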
References
- ^ a b c GRCh38: Ensembl release 89: ENSG00000143633 – Ensembl, May 2017
- ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000031984 – Ensembl, May 2017
- ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
- ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
- ^ Gerhard DS, Wagner L, et al. (October 2004). "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)". Genome Research. 14 (10b): 2121–2127. doi:10.1101/gr.2596504. PMC 528928. PMID 15489334.
- ^ Ota T, Suzuki Y, et al. (January 2004). "Complete sequencing and characterization of 21,243 full-length human cDNAs". Nature Genetics. 36 (1): 40–45. doi:10.1038/ng1285. PMID 14702039.
- ^ "Browse Homo sapiens ORF cDNA clones by chromosome 1, map 1q42, page 1". Archived from the original on 2015-05-18. Retrieved 2015-04-27.
- ^ "AceView: Gene:C1orf131, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView.
- ^ Kozlowski LP (2016). "IPC - Isoelectric Point Calculator". Biology Direct. 11 (1): 55. doi:10.1186/s13062-016-0159-9. PMC 5075173. PMID 27769290.
- ^ a b "Uniprot Gene: C1orf131". Retrieved May 7, 2015.
- ^ a b "BLAT". Retrieved May 7, 2015.
- ^ Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, Jensen LJ, Gnad F, Cox J, Jensen TS, Nigg EA, Brunak S, Mann M (January 2010). "Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis". Science Signaling. 3 (104): ra3. doi:10.1126/scisignal.2000475. PMID 20068231. S2CID 24775963.
- ^ Wang B, Malik R, Nigg EA, Körner R (December 2008). "Evaluation of the low-specificity protease elastase for large-scale phosphoproteome analysis". Analytical Chemistry. 80 (24): 9526–9533. Retrieved April 26, 2015.
- ^ Matsuoka S, Ballif BA, Smogorzewska A, McDonald ER 3rd, Hurov KE, Luo J, Bakalarski CE, Zhao Z, Solimini N, Lerenthal Y, Shiloh Y, Gygi SP, Elledge SJ (May 2007). "ATM and ATR substrate analysis reveals extensive protein networks responsive to DNA damage". Science. 316 (5828): 1160–1166. Bibcode:2007Sci...316.1160M. doi:10.1126/science.1140321. PMID 17525332. S2CID 16648052.
- ^ Kim D, Hahn Y (July 9, 2011). "Identification of novel phosphorylation modification sites in human proteins that originated after the human–chimpanzee divergence". Bioinformatics. 27 (18): 2494–501. doi:10.1093/bioinformatics/btr426. PMID 21775310.
- ^ Dephoure N, Zhou C, Villén J, Beausoleil SA, Bakalarski CE, Elledge SJ, Gygi SP (August 2008). "A quantitative atlas of mitotic phosphorylation". Proceedings of the National Academy of Sciences of the United States of America. 105 (31): 10762–10767. Bibcode:2008PNAS..10510762D. doi:10.1073/pnas.0805139105. PMC 2504835. PMID 18669648.
- ^ Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, Walther TC, Olsen JV, Mann M (August 2009). "Lysine acetylation targets protein complexes and co-regulates major cellular functions". Science. 325 (5942): 834–840. Bibcode:2009Sci...325..834C. doi:10.1126/science.1175371. PMID 19608861. S2CID 206520776.
- ^ Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
- ^ Garnier J, Gibrat JF, Robson B (1996). "GOR method for predicting protein secondary structure from amino acid sequence". Computer Methods for Macromolecular Sequence Analysis. Methods in Enzymology. Vol. 266. pp. 540–553. doi:10.1016/S0076-6879(96)66034-0. ISBN 978-0-12-182167-8. PMID 8743705.
- ^ Stes E, Laga M, Walton A, Samyn N, Timmerman E, De Smet I, Goormachtig S, Gevaert K (June 2014). "A COFRADIC Protocol To Study Protein Ubiquitination". J Proteome Res. 13 (6): 3107–3113. doi:10.1021/pr4012443. PMID 24816145.
- ^ Olah J, Vincze O, Virok D, Simon D, Bozso Z, Tokesi N, Horvath I, Hlavanda E, Kovacs J, Magyar A, Szucs M, Orosz F, Penke B, Ovadi J (September 2011). "Interactions of pathological hallmark proteins: tubulin polymerization promoting protein/p25, beta-amyloid, and alpha-synuclein". J. Biol. Chem. 286 (39): 34088–34100. doi:10.1074/jbc.m111.243907. PMC 3190826. PMID 21832049.
- ^ "Family: DUF4602". Retrieved May 8, 2015.
- ^ Dickerson R (1971). "The structure of cytochrome c and the rates of molecular evolution". Journal of Molecular Evolution. 1 (1): 26–45. Bibcode:1971JMolE...1...26D. doi:10.1007/bf01659392. PMID 4377446. S2CID 24992347.
- ^ Prychitko TM, Moore WS (2000). "Comparative evolution of the mitochondrial cytochrome b gene and nuclear beta-fibrinogen intron 7 in woodpeckers". Mol Biol Evol. 17 (7): 1101–11. doi:10.1093/oxfordjournals.molbev.a026391. PMID 10889223.