C1orf167

Chromosome 1 open reading frame 167 (C1orf167) is a protein that in humans is encoded by the C1orf167 gene.^[1] The NCBI accession number is NP_001010881. The protein is 1468 amino acids in length with a molecular weight of 162.42 kDa. The mRNA sequence is 4689 base pairs in length.^[2]^[3]

Gene

Locus

The gene is located on chromosome 1 at position 1p36.22 on the plus strand and spans positions 11,824,457 to 11,849,503.^[2]^[4]

Aliases

C1orf167 has one known alias, Chromosome 1 Open Reading Frame 167.^[5]

Number of Exons

The gene contains 26 exons.^[1]

mRNA

Alternative Splicing

A splice region that is conserved in primate orthologs of the C1orf167 mRNA was located between exon 1 and exon 2.^[6]

Known mRNA Isoforms

The mRNA sequence has 8 known splice isoforms as determined by the conserved domains.^[7] The isoforms span the regions 426-863, 981-1418, 954-1391, 999-1329, 999-1400, 999-1436, 999-1404, and 999-1463 of the mRNA sequence.^[8]

Protein

Conceptual Translation of C1orf167 showcasing the conserved Domain of Unknown Function that begins at the break between exon 13 and exon 14.

Known Protein Isoforms

Alternative splicing produces two known isoforms of the human protein: XP_006711141.1, which is 1489 aa in length, and XP_003307860.2, which is 713 aa in length.^[9]^[10]

Composition

The protein has an isoelectric point (pI) of 11. The predicted molecular weight (MW) is 160 kDa for the human protein, but ranges from 140 to 180 kDa for more distant orthologs.^[11] Compositional analysis revealed the most abundant amino acid to be alanine (A) at 12.4% of the total protein. The analysis also revealed the C1orf167 protein to be rich in tryptophan (W) and deficient in tyrosine (Y) and isoleucine (I).^[12]
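A compositional analysis of this kind amounts to counting residues and dividing by sequence length. The following is a minimal sketch of that calculation, not the method used by the cited tool; the function name `composition` and the toy peptide are illustrative assumptions, and the peptide is not the C1orf167 sequence.

```python
from collections import Counter


def composition(seq):
    """Return the percent composition of each residue in a protein sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Hypothetical toy peptide for illustration only (NOT the C1orf167 sequence).
toy_peptide = "AAWRAKAAEL"
comp = composition(toy_peptide)

# The most abundant residue is the one with the highest percentage.
most_abundant = max(comp, key=comp.get)
print(most_abundant, comp[most_abundant])
```

Applied to the full protein sequence retrieved under the accession given above, the same function would be expected to report the residue percentages that the compositional analysis describes.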

Subcellular Localization

C1orf167 is predicted to be localized to the cell nucleus.^[13]

Post-Translational Modifications

C1orf167 is predicted to undergo phosphorylation, O-glycosylation, SUMOylation, glycation, and cleavage by staphylococcal peptidase I (Q105, Q321) and glutamyl endopeptidase (Q1101).^[14]^[15]^[16]^[17]^[18]


Modification	H. sapiens	T. manatus latirostris	U. parryii	D. novaehollandiae	P. vitticeps	C. milii
SUMOylation	K22	IVTLE 447-451, K604, K605, VRVVP 684-688	VAVVD 502-506	K434	K57, K128, K578, K993, K1388	ISILH 121-125, K264, K477, K497, K522, IVSIC 621-625, LCLVY 703-707, VVVLR 975-979, VLQLR 1027-1031, K1199, K1208
O-GlcNAcylation	Many*	Similar distribution (but more sites)	Similar distribution (but fewer sites)	Similar distribution (but fewer sites)	Similar distribution	Similar distribution (but fewer sites)
Glycation of ε-amino groups of lysines	K: 22, 114, 323, 399, 433, 505, 701, 710, 720, 832, 975, 1138, 1279, 1306, 1394, 1418	K: 335, 516, 534, 605, 747, 757, 1080, 1125, 1189, 1382	K: 114, 123, 333, 462, 651, 660, 661, 938, 1111, 1149	K: 72, 103, 128, 133, 183, 240, 241, 248, 290, 398, 437, 466, 483, 494, 505, 552, 589, 718, 767, 772, 820, 974, 1106	K: 14, 57, 60, 89, 96, 128, 133, 157, 275, 423, 488, 578, 619, 647, 890, 900, 952, 983, 993, 1208, 1279, 1288	K: 4, 56, 106, 131, 163, 169, 177, 235, 291, 480, 566, 660, 666, 717, 780, 814, 827, 853, 857, 936, 954, 964, 974, 986, 1015, 1079, 1208
Nuclear export signal	L84	L808	L84	L589	V869, L874	L186, L188, L1117
Phosphorylation	Many*	Similar distribution	Similar distribution	Similar distribution	Similar distribution	Similar distribution
Proteinase cleavage sites	Q105, Q321, Q1101	Q441, Q1030	Q72	Q60	Q90, Q155, Q498	Q520, Q809, Q908

Table 1. Predicted post-translational modifications for C1orf167 and five of its orthologs.

Schematic illustration of predicted post-translational modifications for C1orf167, made using DOG 2.0.^[19] The DUF at residues 954-1418 is labeled.

*GPS and NetPhos results indicated hyper-phosphorylation of C1orf167 in H. sapiens and five of its orthologs.

Domain and Motifs by Homology

One domain of unknown function (DUF), located at residues 954 to 1418, is 465 amino acids in length.

Secondary Structure

C1orf167 is predicted to be rich in alpha helices. No notable regions of beta pleated sheets or coils were predicted.^[20] In particular, high confidence was indicated for 42 alpha helices, with the longest alpha helix region spanning residues 450 to 1182. This long alpha helix region includes a significant portion of the conserved DUF, which spans residues 954 to 1418.^[21]^[22]^[23]^[24]^[25]
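The overlap between the long helical region and the DUF can be worked out directly from the residue ranges quoted above. The sketch below is only an arithmetic illustration using inclusive, 1-based coordinates; the helper name `overlap` is an assumption introduced here.

```python
def overlap(a, b):
    """Return the inclusive overlap of two (start, end) residue ranges, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


helix_region = (450, 1182)  # longest predicted alpha helix region (from the text)
duf = (954, 1418)           # conserved domain of unknown function (from the text)

shared = overlap(helix_region, duf)
shared_len = shared[1] - shared[0] + 1  # inclusive residue count
duf_len = duf[1] - duf[0] + 1           # 465, matching the stated DUF length
fraction = shared_len / duf_len

print(shared, shared_len, duf_len, round(fraction, 3))
```

Under these ranges the shared stretch is residues 954 to 1182, that is 229 of the 465 DUF residues, or roughly half of the domain.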

Tertiary Structure

The best-aligned structural analog of C1orf167 generated by I-TASSER had a confidence score (C-score) of -0.68 on a scale of [-5, 2], where higher values indicate higher confidence.^[25] Per SWISS-MODEL, two monomers are predicted, each forming an alpha helix.^[26] Both helices are aligned facing outwards, with amino acids such as glutamic acid (E) on the interior and asparagine (N), serine (S), and lysine (K) on the exterior. Asparagine residues may serve as an important oligosaccharide binding site.^[27]

Expression

C1orf167 has high expression in the larynx, blood, placenta, testis, and prostate, with the highest expression found in the testis.^[28] The promoter GXP_5109290 spans 1507 base pairs on chromosome 1.^[29] GXP_5109290 was found to be conserved in the bonobo (Pan paniscus), gorilla (Gorilla gorilla gorilla), mouse (Mus musculus), chimpanzee (Pan troglodytes), and rhesus monkey (Macaca mulatta).^[30]^[31]

Protein Interactions

There were 10 interactions identified by STRING.^[32]

Homology

Paralogs

No known paralogs or paralogous domains were identified for C1orf167.

Orthologs

Orthologs of C1orf167 were identified using NCBI BLAST. No orthologs could be found in single-celled organisms or in fungi whose genomes have been sequenced. Among multicellular organisms, orthologs were found in mammals, birds, reptiles, and cartilaginous fishes. The table below shows a representative sample of 20 of the orthologs for C1orf167. The table is organized by time of divergence from humans, in million years ago (MYA), and then by sequence similarity.

Genus and Species	Common Name	Taxonomic Group	Date of Divergence (MYA)	Accession #	Sequence Length	Sequence Identity	Sequence Similarity
Homo sapiens	Human	Mammalia	0	NP_001010881.1	1449aa	100%	100%
Pan troglodytes	Chimpanzee	Mammalia (primate)	6.6	XP_024212133.1	1442aa	97%	97%
Piliocolobus tephrosceles	Ugandan Red Colobus	Mammalia (primate)	29	XP_026303745.1	1453aa	87%	90%
Macaca fascicularis	Crab-eating Macaque	Mammalia (primate)	29.4	XP_015298104.1	1444aa	87%	90%
Trichechus manatus latirostris	American Manatee	Mammalia (sirenia)	76	XP_023587965.1	1631aa	49%	56%
Marmota flaviventris	Yellow-bellied Marmot	Mammalia (rodentia)	90	XP_027803235.1	1284aa	49.16%	57%
Galeopterus variegatus	Sunda Flying Lemur	Mammalia (dermoptera)	90	XP_008588133.1	1439aa	54%	60%
Camelus ferus	Bactrian Camel	Mammalia (artiodactyla)	90	XP_014421294.1	1442aa	53%	62%
Miniopterus natalensis	Natal Clinging Bat	Mammalia (chiroptera)	96	XP_016061116.1	1644aa	48.64%	56%
Desmodus rotundus	Common Vampire Bat	Mammalia (chiroptera)	96	XP_024410696.1	1548aa	47.97%	56%
Ictidomys tridecemlineatus	Thirteen-lined Ground Squirrel	Mammalia (rodentia)	96	XP_021576066.1	1349aa	47.59%	56%
Urocitellus parryii	Arctic Ground Squirrel	Mammalia (rodentia)	96	XP_026253666.1	1299aa	46.47%	55%
Myotis brandtii	Brandt's Bat	Mammalia (chiroptera)	105	XP_014400940.1	1390aa	50.19%	59%
Dromaius novaehollandiae	Emu	Aves	312	XP_025951247.1	1154aa	31.56%	47%
Pseudopodoces humilis	Ground Tit	Aves	312	XP_014112713.1	1415aa	30.34%	47%
Columba livia	Rock Dove	Aves	312	XP_021137589.1	1430aa	30.45%	46%
Anser cygnoides domesticus	Swan Goose	Aves	312	XP_013043263.1	1126aa	27%	40%
Alligator sinensis	Chinese Alligator	Reptilia	312	XP_025067177.1	1626aa	34%	45%
Pogona vitticeps	Central Bearded Dragon	Reptilia	312	XP_020637641.1	1388aa	27.76%	38%
Callorhinchus milii	Australian Ghostshark	Chondrichthyes	473	XP_007896104.1	1210aa	29%	43%

Table 2. Divergence timeline of the C1orf167 orthologs. The table is sorted by date of divergence, color-coded by taxonomic group or class, and then sorted by sequence similarity.
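The two-level ordering described for Table 2 can be sketched as a sort on a compound key. The example below uses six rows copied from the table; the descending direction for similarity within a divergence date is an assumption made for illustration (the published rows do not all follow it), and the field names are hypothetical.

```python
# (species, divergence in MYA, sequence similarity in %) taken from Table 2.
orthologs = [
    ("Callorhinchus milii", 473, 43),
    ("Marmota flaviventris", 90, 57),
    ("Homo sapiens", 0, 100),
    ("Camelus ferus", 90, 62),
    ("Pan troglodytes", 6.6, 97),
    ("Galeopterus variegatus", 90, 60),
]

# Sort by divergence date (ascending), then by similarity (descending, assumed).
ordered = sorted(orthologs, key=lambda row: (row[1], -row[2]))

for species, mya, similarity in ordered:
    print(f"{species}\t{mya}\t{similarity}%")
```

Negating the similarity in the key keeps the sort stable and single-pass while reversing only the secondary criterion.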

Multiple Sequence Alignment of Strict Orthologs for C1orf167. Beginning of the conserved DUF region at the break between exon 13 and 14 is shown.^[33]

Function

At this time the function of C1orf167 is uncharacterized.

Clinical Significance

Pathology

According to the EST profile broken down by health state, expression levels of C1orf167 were higher in leukemia and in head, neck, and lung cancers than in healthy cells.^[28] Based on results from NCBI GEO Profiles, C1orf167 showed increased expression in dendritic cells of patients with Chlamydia pneumoniae infections. Increased expression of C1orf167 was also indicated in human pulmonary tuberculosis tissue containing caseous tuberculosis granulomas, compared with normal lung tissue.^[34]

References

[1] NCBI. "C1orf167 chromosome 1 open reading frame 167 [Homo sapiens (human)]". NCBI. Retrieved 2019-02-09.
[2] "C1orf167 Gene". www.genecards.org. Retrieved 2019-02-09.
[3] "Homo sapiens chromosome 1 open reading frame 167 (C1orf167), mRNA". NCBI. 2018-06-30. Retrieved 2019-02-08.
[4] "RCSB PDB - Gene View - C1orf167 - chromosome 1 open reading frame 167". www.rcsb.org. Archived from the original on 2019-05-05. Retrieved 2019-03-04.
[5] "C1orf167 chromosome 1 open reading frame 167 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-22.
[6] "Genome Browser FAQ". genome.ucsc.edu. Retrieved 2019-04-22.
[7] "C1orf167 GeneCards".
[8] "C1orf167 chromosome 1 open reading frame 167 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-27.
[9] "C1orf167 (human)". www.phosphosite.org. Retrieved 2019-03-04.
[10] "HomoloGene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-03-04.
[11] "ExPASy: SIB Bioinformatics Resource Portal - Categories". www.expasy.org. Retrieved 2019-04-27.
[12] "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-04-27.
[13] "PSORT II Tool". PSORT II. [permanent dead link]
[14] "SUMOplot analysis program". SUMOplot. Archived from the original on 2005-01-03. Retrieved 2019-05-05.
[15] "GPS 3.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.org. Retrieved 2019-04-22.
[16] "YinOYang O-GlcNAc sites". YinOYang.
[17] "NetOGlyc 4.0 Server". www.cbs.dtu.dk. Retrieved 2019-04-22.
[18] "C1orf167 NetCorona entry".
[19] "DOG 2.0 - Protein Domain Structure Visualization". dog.biocuckoo.org. Retrieved 2019-05-02.
[20] "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2019-04-22.
[21] "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2019-04-22.
[22] "Phyre2 Database". Phyre2.
[23] "SOPMA secondary prediction".
[24] "GOR protein prediction".
[25] "I-TASSER results". zhanglab.ccmb.med.umich.edu. Archived from the original on 2019-05-05. Retrieved 2019-05-05.
[26] "SWISS-MODEL Interactive Workspace". swissmodel.expasy.org. Retrieved 2019-05-05.
[27] Kornfeld, R.; Kornfeld, S. (1985). "Assembly of asparagine-linked oligosaccharides" (PDF). Annual Review of Biochemistry. 54: 631–664. doi:10.1146/annurev.bi.54.070185.003215. PMID 3896128.
[28] "EST Profile - Hs.585415". www.ncbi.nlm.nih.gov. Retrieved 2019-04-22.
[29] "ElDorado Introduction". www.genomatix.de. Archived from the original on 2016-06-02. Retrieved 2019-04-22.
[30] "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2019-04-22.
[31] "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-04-22.
[32] "C1orf167 protein (human) - STRING interaction network". string-db.org. Retrieved 2019-04-19.
[33] "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-01.
[34] "c1orf167 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-01.
