Cxorf26

CXorf26 (Chromosome X Open Reading Frame 26), also known as MGC874, is a well-conserved human gene found on the plus strand of the long arm of the X chromosome. The exact function of the gene is poorly understood, but the polysaccharide biosynthesis domain that spans a major portion of the protein product (known as UPF0368), as well as the yeast homolog, YPL225W, offer insights into its possible function.

Proposed function
Based on the available data, the function of CXorf26 is likely related to RNA polymerase II, ubiquitination, and ribosomes in the cytoplasm. This argument rests on interaction data for human CXorf26 and its yeast homolog, YPL225W. Both homologs interact with multiple ubiquitinated proteins as well as the transcriptional enzyme RNA polymerase II. Ubiquitination and subsequent degradation by the 26S proteasome serves an important function in regulating transcription in eukaryotes. For example, the yeast protein RPN11, which interacts with YPL225W, has a human homolog that is a metalloprotease component of the 26S proteasome, which degrades proteins targeted for destruction by the ubiquitin pathway. These functions do not appear to relate to polysaccharide biosynthesis, as the conserved domain would suggest, but the domain may still play a role in secondary structure or provide sites of phosphorylation.

Further experimentation could give more insight into the exact role of CXorf26 in these key cellular processes. For example, measuring CXorf26 expression after treatment with an RNA polymerase II inhibitor, or a complete knockout of YPL225W in yeast using methods such as RNAi, could clarify its function.

Gene
CXorf26 is found on the plus strand of the long arm of the X chromosome, specifically at locus Xq13.3, spanning bases 75,393,420-75,397,740 of the chromosome. The primary mRNA transcript has 1214 base pairs, and its protein product, UPF0368, is composed of 233 amino acids with a predicted mass of 26,057 Da. The Xq13.3 locus has known associations with X-linked mental retardation. The third gene upstream of CXorf26 is ATRX, which encodes a protein with an ATPase/helicase domain and, when mutated, causes an X-linked mental retardation syndrome along with alpha thalassemia syndrome; both are known to cause changes in DNA methylation patterns. Furthermore, the third gene downstream of CXorf26, ZDHHC15, causes mental retardation X-linked type 91 when mutated. One noteworthy gene located nearby is Xist, which plays a role in the inactivation of the X chromosome. X inactivation relates to CXorf26 and is discussed below in the relevant research section.
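As a rough check on the coordinates quoted above, the size of the genomic region can be computed directly. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (the text does not say which convention is used); the function name span_length is illustrative only.

```python
# Coordinates as quoted in the text; treating them as 1-based and
# inclusive is an assumption, not something the source states.
GENE_START = 75_393_420
GENE_END = 75_397_740
MRNA_LENGTH = 1214  # primary transcript length quoted in the text


def span_length(start: int, end: int) -> int:
    """Number of bases in the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


gene_length = span_length(GENE_START, GENE_END)
print(f"Genomic span: {gene_length:,} bp")
print(f"mRNA as a fraction of the span: {MRNA_LENGTH / gene_length:.1%}")
```

Under a 0-based, half-open convention the span would be one base shorter, which does not change the overall picture of a compact gene of roughly 4.3 kb.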

Expression
Expression data show that CXorf26 is expressed ubiquitously and at high levels throughout human tissues, and it is represented in ESTs from nearly all conditions. GEO profiles show expression levels for CXorf26 in common human tissues consistently around the 75th percentile, suggesting it may possess a housekeeping function given its seemingly ubiquitous expression. If the conserved domain does indeed play a role in polysaccharide biosynthesis of some sort, this high gene expression is consistent with that function.

Gene expression profiles in the Gene Expression Omnibus (GEO) repository at NCBI show that few treatments changed the expression of CXorf26 in the tissues examined. However, one experiment compared CXorf26 expression in lung adenocarcinoma CL1-5 cells either overexpressing or underexpressing Claudin-1 (CLDN1). Results indicated that CXorf26 expression drops greatly when CLDN1 is overexpressed. CLDN1 is a major component of tight junction complexes between cells, which foster cell-cell adhesion of cell membranes. More tight junctions formed by CLDN1 would likely decrease expression of CXorf26, since the cell membrane would be used for tight junctions instead of its normal function related to heparan sulfate.

Alternative splice forms
There is only one alternative splice form of CXorf26. Its mRNA is considerably shorter, at 977 base pairs, but it still encodes a protein product of 232 amino acids. This splice form appears to be missing exon 5 of the transcript, although that sequence may instead be joined to exon 6, creating a larger exon than in the consensus transcript.

There were no other predicted exons within the genomic CXorf26 sequence when 3000 base pairs were added on either side in the search.

Promoter region
The promoter for CXorf26 is predicted to be located from bases 75392235 to 75393075 on the X chromosome positive strand. The promoter region is extensively conserved in all primates and most mammal homologs, but conservation is weaker in more distantly related species. Given the primary transcript begins at base 7539277, the promoter overlaps with it by 304 bases. Twenty predicted transcription factor binding sites, along with their transcription factor families, were collected as well. Many of the transcription factors are zinc finger factors, in which the zinc finger stabilizes the protein fold, while none of the factors seem to relate to a potential polysaccharide biosynthesis function. One transcription factor family predicted to bind to the promoter region, V$CHRF, is involved in regulation of the cell cycle. This regulation could be related to ubiquitin function, since proteins with ubiquitination-type functions were found to interact with CXorf26.

Subcellular distribution
The CXorf26 protein is predicted to localize to the cytoplasm with 56.5% likelihood and to the mitochondria with 17.4% likelihood. CXorf26's yeast homolog, YPL225W, was GFP-tagged and found to be located in the cytoplasm. Cytoplasmic rather than transmembrane localization is further supported by the absence of a hydrophobic signal peptide sequence and by TMAP, which predicted no transmembrane segments in CXorf26 or any of its homologs in other species.

Polysaccharide domain
CXorf26 was found to have a conserved domain known as DUF757 within its sequence. The conserved domain spans a majority of the protein sequence, from amino acids 39-159. Conservation of the domain is strong throughout all homologs compared, including mammals, invertebrates such as insects, and even sponges. The yeast homolog, YPL225W, shows 42.4% identity and 62% similarity in this domain. Conservation of the domain is especially high in regions that include one of the multiple alpha helices or beta sheets. There are also multiple conserved phosphorylation sites in the amino acid sequence, at tyrosine 72 and serine 126.
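The statement that the domain spans a majority of the protein can be checked with simple arithmetic on the figures quoted in this article. The sketch below assumes the residue range 39-159 is inclusive and uses the 233-residue protein length given in the Gene section; the variable names are illustrative.

```python
# Figures quoted in the article; the inclusive range is an assumption.
PROTEIN_LENGTH = 233
DOMAIN_START, DOMAIN_END = 39, 159
PHOSPHO_SITES = {"Tyr72": 72, "Ser126": 126}

domain_length = DOMAIN_END - DOMAIN_START + 1
coverage = domain_length / PROTEIN_LENGTH
print(f"Domain: {domain_length} residues, {coverage:.1%} of the protein")

# Check whether each conserved phosphorylation site lies inside the domain.
for name, position in PHOSPHO_SITES.items():
    inside = DOMAIN_START <= position <= DOMAIN_END
    print(f"{name}: {'inside' if inside else 'outside'} the domain")
```

On these numbers the domain covers just over half of the protein, and both conserved phosphorylation sites fall within it.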

According to NCBI, this domain belongs to a family of proteins expected to play a role in xylan biosynthesis in plant cell walls, but its exact role in the synthesis pathway is unknown. As animal cells do not have cell walls, its function in other organisms such as humans is also unknown.

Xylan is made from units of the pentose sugar xylose, which is known for being the first saccharide in multiple biosynthetic pathways of anionic polysaccharides such as heparan sulfate and chondroitin sulfate. Like xylan, heparan sulfate is found on the cell surface; since it is needed for both the cell surface and the extracellular matrix, this may explain CXorf26's high expression in nearly all human tissues. Heparan biosynthesis occurs in the lumen of the endoplasmic reticulum and is initiated by the transfer of a xylose from UDP-xylose by xylosyltransferase to specific serine residues within the protein core. PSORTII predicts the presence of a KKXX-like motif, GEKA, near the C-terminus of CXorf26. KKXX-like motifs are predicted endoplasmic reticulum membrane retention signals. This motif is conserved only in primates. However, another KKXX-like motif, QDKE, is found at the end of the domain, and the K in this motif is highly conserved back to most invertebrates. In contrast, NetNGlyc predicted no N-glycosylation sites, suggesting CXorf26 does not undergo special folding in the endoplasmic reticulum lumen. Because animal cells have no cell walls, the conserved domain cannot function in xylan synthesis; its function may instead be related to the heparan sulfate pathway.

Secondary structure
Predictions across multiple programs suggest the presence of 7 alpha helices and 2 beta sheets in CXorf26; the majority of the secondary structures are in the conserved domain. Experimental evidence in the yeast homolog shows 4 alpha helices and 2 beta sheets, all in the polysaccharide domain, in agreement with the SWISS-MODEL prediction for the human protein. The locations of the secondary structures are also conserved.

Post-translational modifications
Pepsin (pH 1.3), Asp-N endopeptidase, N-terminal glutamate, and Proteinase K were each predicted to have 50 or more cleavage sites within the protein, but none of the 10 caspases had any cleavage sites. This suggests CXorf26 is not likely to be cleaved or degraded during apoptosis, which is consistent with the observation that CXorf26 is expressed highly in nearly all tissues and experimental conditions.

Lysines 63 and 66 are potential sites of glycation of the epsilon amino group. Lysine 63 was conserved in both Macaca mulatta and Bombus impatiens. There are 10 serine, 3 threonine, and 6 tyrosine phosphorylation sites predicted within the CXorf26 protein. Of the predicted phosphorylation sites, those shown in the table below were conserved in Macaca mulatta as well as Bombus impatiens. S127 was left in the table even though Homo sapiens and Macaca mulatta did not have scores above threshold for that position. Through evolutionary change, the serine in Bombus was replaced by a tyrosine in Homo sapiens and Macaca mulatta. Tyrosine is still capable of phosphorylation, which suggests that the mutation would likely not result in a large change to the protein or its function.

Species distribution
CXorf26 is strongly evolutionarily conserved, with conservation extending to Batrachochytrium dendrobatidis. A multiple sequence alignment of 20 orthologous protein sequences reveals very strong conservation of the polysaccharide biosynthesis domain, but conservation after the domain was essentially non-existent in invertebrates. In those vertebrates that contained a sequence after the conserved domain, it was of low complexity and filled with repeats of the amino acid motif 'GEK', corresponding to glycine, glutamic acid, and lysine. Glutamic acid and lysine are both charged, which contributes to the overall hydrophilicity of the section after the conserved domain.
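The low-complexity character of such a tail can be illustrated by counting motif repeats and charged residues. This is only a sketch: the sequence below is a made-up placeholder, not the actual CXorf26 C-terminal sequence, and the choice of Asp, Glu, Lys, and Arg as the charged set is an assumption for illustration.

```python
# Hypothetical tail sequence, invented for illustration only; it is
# NOT the real CXorf26 sequence.
tail = "GEKGEKGEKAGEK"
MOTIF = "GEK"
CHARGED = set("DEKR")  # Asp, Glu, Lys, Arg (assumed definition)

# 'GEK' cannot overlap itself, so a non-overlapping count is sufficient.
motif_count = tail.count(MOTIF)
charged_count = sum(residue in CHARGED for residue in tail)
charged_fraction = charged_count / len(tail)

print(f"{MOTIF} repeats: {motif_count}")
print(f"Charged residues: {charged_count} of {len(tail)}")
```

A real analysis would run the same counts on the aligned ortholog sequences rather than on a placeholder string.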

Yeast homolog YPL225W
The CXorf26 homolog in yeast, YPL225W, has an overall identity of 27% but shows 42.4% identity and 62% similarity within the polysaccharide biosynthesis domain. YPL225W has been experimentally verified to contain four alpha helices and two beta sheets within the biosynthesis domain, in line with the predicted human secondary structure. Like that of CXorf26, the function of YPL225W in yeast is unknown, but co-purification experiments suggest it may interact with ribosomes, since many of its 18 interacting proteins are related to RNA and ribosomes. Multiple interacting proteins are also associated with RNA polymerase, which carries out transcription, and multiple others are involved in ubiquitination. The interacting yeast proteins with the higher interaction scores include UBI4, RPB8, SRO9, and NAB2.

Interacting proteins
Potential interacting proteins were identified using the tools provided at the I2D Interlogous Interaction Database and the STRING 9.0 program. Although more proteins were predicted, those shown below had the highest scores and showed the greatest possibility of relating to potential CXorf26 function.

SMAD2, PHB, and CTNNB1 were identified in an experiment investigating transcription factor networks. The BABAM1 interaction was found in both databases and is based on an anti-tag coimmunoprecipitation assay, while the POLR2H interaction is based on a tandem affinity purification assay using the yeast homolog, YPL225W.