Rs4488809

rs4488809 is a SNP associated with non-small cell lung cancer (NSCLC), which comprises the vast majority of lung cancer cases. Encompassing adenocarcinoma, squamous cell carcinoma, large cell carcinoma, and other less common subtypes, NSCLC accounts for nearly all cases of lung cancer in non-smokers. NSCLC is usually treated with a combination of surgical tumor resection, chemotherapy, and radiation therapy. However, despite recent advances in treatment, including the development of targeted therapies against specific mutations, the prognosis for NSCLC is still bleak, and lung cancer remains the leading cause of cancer death worldwide.

rs4488809 is located on chromosome 3q28, in the TP63 locus. This SNP was first linked to NSCLC by Miki et al. in 2010. The authors noted that while previous genome-wide association studies (GWAS) had identified the 15q24-25.1 region as the most significantly associated with lung cancer in patients of European ancestry, the SNPs associated with lung cancer risk in European populations were very rare in Asian populations. Miki et al. therefore conducted a GWAS of East Asian individuals to look for SNPs associated with lung cancer risk in these populations. While an initial study of 1004 cases of lung adenocarcinoma and 1900 controls did not identify any SNPs associated at a genome-wide level of significance, a replication study examining 50 SNPs in Japanese (525 cases, 7678 controls) and Korean (569 cases, 1470 controls) subjects identified rs10937405 as significantly associated with adenocarcinoma after Bonferroni correction (p=7.26×10⁻¹², OR=1.31). Fine mapping of an ~200 kb region spanning 190.7-190.9 Mb on chromosome 3 identified rs4488809, in intron 1 of the TP63 gene and 27 kb upstream of rs10937405, as the most strongly associated SNP in the region.
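The Bonferroni correction mentioned above can be illustrated with a short sketch. The family-wise error rate of 0.05 is an assumption made for illustration (the text does not state the level Miki et al. used), and the function name is hypothetical; only the number of SNPs tested (50) and the reported p-value come from the paragraph above.

```python
# Minimal sketch of a Bonferroni-corrected significance check.
# ASSUMPTION: a family-wise error rate of 0.05; the source text does not
# state the level actually used in the replication study.
ALPHA = 0.05
N_TESTS = 50            # SNPs examined in the replication study (from the text)
P_REPORTED = 7.26e-12   # p-value reported for rs10937405 (from the text)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test p-value threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_TESTS)
passes = P_REPORTED < threshold

print(f"per-test threshold: {threshold:g}")
print(f"reported p-value passes: {passes}")
```

Under these assumptions the per-test threshold is 0.05/50 = 0.001, which the reported p-value clears by many orders of magnitude.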

Since then, multiple other studies have examined the relationship between rs4488809 and lung cancer. In 2011, Hu et al. conducted a GWAS in individuals of Han Chinese ancestry, starting with a genome-wide association scan in 5408 subjects (2331 cases, 3077 controls), then performing a 2-stage validation in 12722 subjects (6313 cases, 6409 controls). The combined results confirmed the significance of rs4488809 (p=7.2×10⁻²⁶, OR=1.26), and also showed that the association of rs10937405 disappeared after conditioning on rs4488809, but not vice versa (pest=0.54 for rs10937405, conditioned on rs4488809; pest=2.2×10⁻¹⁷ for rs4488809, conditioned on rs10937405). Another analysis combining 13 case-control studies and 1 cohort study (total 5510 cases, 4544 controls) looked specifically for susceptibility loci in never-smoking Asian women, and confirmed a genome-wide significant association of rs4488809 with lung cancer. Notably, this study found no significant association anywhere in the 15q25 region, suggesting that the genetic components of lung cancer etiology may differ between patients of European and Asian ancestry. More recently, Zhang et al. conducted a meta-analysis of 9 studies (35961 cases, 57790 controls), and also found rs4488809 to be significantly associated with NSCLC, most strongly with adenocarcinoma (p<10⁻⁵, OR=1.19) [PMCID 3900682].

However, in 2012, Hosgood et al. came to a different conclusion about the significance of rs4488809 in a subset of NSCLC patients. The authors observed that in the GWAS conducted by Miki et al., rs4488809 did not reach a genome-wide level of significance when the analysis was restricted to never-smoking males (p=0.012, OR=1.22) or to never-smoking females (p=3.5×10⁻⁶, OR=1.38). Based on this observation, they further investigated the significance of rs10937405 and rs4488809 (the same two loci studied by Miki et al.) in non-smokers, by combining the results of 10 studies in Asian never-smoking women with NSCLC (total 3467 cases, 3787 controls). They found that while rs10937405 has a significant association with adenocarcinoma (p=7.1×10⁻⁸, OR=0.8), rs4488809 did not achieve a genome-wide level of significance (p=7.4×10⁻⁵) in non-smokers.

There is also some controversy about which allele of rs4488809 is the risk allele, and even about which is the minor allele (the allele that occurs at the lower frequency). For example, while Zhang et al. and Lan et al. found the C allele to confer increased risk for lung cancer, Hosgood et al. and Miki et al. found the T allele to be the risk allele. This switching between apparent risk alleles has been observed for other SNPs as well (for example, rs10937823, which may be associated with bipolar disorder). The confusion is likely exacerbated by the fact that both alleles are present at comparable frequencies. For example, Miki et al. reported that the T allele was the minor allele in East Asians, appearing in controls with a frequency of 0.49 and in cases with a frequency of 0.55. In contrast, Lan et al. reported that the C allele appeared in controls with a frequency of 0.42 and in cases with a frequency of 0.46.
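To make the allele-frequency comparison concrete, the following sketch computes a crude (unadjusted) allelic odds ratio from the case and control frequencies quoted above. This is an illustration only: the function name is hypothetical, and the published odds ratios come from adjusted models, so the values here are not expected to match them exactly. The sketch also shows why the choice of reference allele matters: within a single dataset, the odds ratio for one allele is the reciprocal of the odds ratio for the other.

```python
# Minimal sketch: crude allelic odds ratio from allele frequencies.
# ASSUMPTION: unadjusted calculation on the frequencies quoted in the text;
# the published ORs were estimated with adjusted models.


def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds of carrying the allele in cases divided by the odds in controls."""
    if not (0 < freq_cases < 1 and 0 < freq_controls < 1):
        raise ValueError("frequencies must be strictly between 0 and 1")
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls


# T allele frequencies as reported by Miki et al. (cases 0.55, controls 0.49)
or_t_miki = allelic_odds_ratio(0.55, 0.49)

# The complementary allele in the same dataset has the reciprocal odds ratio
or_other_miki = allelic_odds_ratio(1 - 0.55, 1 - 0.49)

# C allele frequencies as reported by Lan et al. (cases 0.46, controls 0.42)
or_c_lan = allelic_odds_ratio(0.46, 0.42)

print(f"Miki et al., T allele:     OR = {or_t_miki:.2f}")
print(f"Miki et al., other allele: OR = {or_other_miki:.2f}")
print(f"Lan et al., C allele:      OR = {or_c_lan:.2f}")
```

With these inputs the crude odds ratio is about 1.27 for the T allele in the Miki et al. data and about 1.18 for the C allele in the Lan et al. data. An odds ratio below 1 for one allele (such as the OR=0.8 reported for rs10937405) corresponds to an odds ratio above 1 for the alternative allele (1/0.8 = 1.25), so reports must always be read together with the allele they refer to.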

rs4488809 is located in intron 1 of the TP63 gene, which codes for the transcription factor p63, a member of the p53 family of transcription factors that is induced after exposure to DNA damage. While p53 is a well-known tumor suppressor that is mutated in a large fraction of human cancers, phylogenetic analysis suggests that p53 in fact evolved from p63. p63 knockout mice exhibit severe developmental defects, and TP63 mutations in humans underlie physical malformations such as cleft palate. These phenotypic changes are due to the role of p63 as an “epithelial organizer,” serving to modulate epithelial-mesenchymal transition, stemness, and cell death, at least partly via interactions of a subset of its isoforms with p53. Because rs4488809 is located in intron 1 of TP63, which encodes these isoforms, the association between rs4488809 and lung cancer may be explained by an effect on the function of a transcription factor that is important in regulating the cell cycle and the response to cellular stress.