Co-occurrence of mutations in remyelination and immunity genes significantly increases the risk of MS occurrence in females.
Twenty SNPs significantly associated with MS were identified on the X chromosome.
The majority of MS-associated SNPs were found in genes with remyelination or immunity functions.
Multiple sclerosis (MS) is a chronic neurodegenerative disease, which has a strong genetic component and is more prevalent in women. MS is caused by an autoimmunity initiated inflammatory response which leads to axon demyelination, followed by axon loss, plaque formation and neurodegeneration. The goal of this article was to explore X-linked genetic factors that are associated with MS susceptibility.
Using UK Biobank microarray, we analyzed the prevalence of alleles on the X chromosome to identify variants potentially involved in MS. Overall, 488,225 patients across 18,857 markers were analyzed using PLINK.
Our results identify 20 SNPs that are significantly more abundant in persons with MS. The genes associated with these SNPs belong to immunity (LAMP2, AVPR2, MTMR8, F8, BCOR, PORCN, and ELF4) and remyelination (NSDHL, HS6ST2, RBM10, TAZ, and AR) pathways that are potentially of great significance for understanding the onset and progression of multiple sclerosis. We further identified a significant 20-fold increase in incidence of MS cases in women with co-occurrences of SNPs associated with myelination and immunity functions.
Our analysis provides novel insights into the roles of X-linked genes in the onset and presentation of multiple sclerosis, identifying 20 SNPs in 14 genes involved primarily in immunity and myelination functions that are significantly more abundant in persons with MS. Our co-occurrence analysis suggests that concurrent disruption of both myelination and immune systems significantly increases the risk of MS onset in women.
). The BBB consists of cerebral endothelial cells, pericytes and their basal lamina. Disruption of the BBB, in pathological conditions such as MS, allows T lymphocytes activated in the periphery to infiltrate the central nervous system to trigger the immune responses responsible for myelin damage (
). The infiltrating lymphocytes release cytotoxic factors including pro-inflammatory cytokines, proteases, and reactive oxygen species, activate microglia and astrocytes, and recruit macrophages and other lymphocytes (
). The myelin repair, remyelination process, does occur and is able to reverse the damage due to inflammation; however, repeated attacks result in less effective remyelination and the formation of plaques around the damaged axon (
). The new myelin sheath acts as a protective physical barrier against damage from inflammatory molecules and restores trophic support to the axon. Despite this, the remyelination process becomes less efficient with progressive damage, leading to increased neurodegeneration (
). Since the lack of myelination is the proximate cause of the axonal death and neurodegeneration associated with MS, remyelination has been an important topic of research in the treatment and recovery of persons with MS (
). Because of the impact of MS on demyelination and the sex-linked differences in persons with MS, we focused our analysis on the X chromosome. The X chromosome has long been under investigation in its role in MS (
). However, despite the presence of multiple immune and remyelination response genes on the X chromosome, the SNPs located within these genes have not been fully analyzed for their implication in the presentation of MS. In addition, variant co-occurrence analysis has not previously been conducted systematically.
In this study, we analyzed the large array data set from the UK Biobank repository to identify causal low-frequency alleles (single nucleotide polymorphisms) on the X chromosome affecting immune and remyelination responses. The UK Biobank is a particularly useful resource, since it is a population-based data repository with a focus on diseases of middle and old age (
). The UK Biobank Axiom Array covers 820,967 SNP and indel markers across all chromosomes. The array contains rare coding variants, composed of 30,581 protein truncating variants and 80,581 missense variants. Additionally, the array contains 348,569 common variants genome-wide and 280,838 low frequency variants genome-wide. In total allele variant information was obtained from 488,225 patients. To identify persons with MS, we utilized the International Classification of Diseases 10th Revision (
) code G35, which is specific for multiple sclerosis within demyelinating diseases of the central nervous system. Specifically, the following Summary Diagnosis data fields from the UK Biobank were searched for G35: 41270 (Diagnoses - ICD10), 41202 (Diagnoses - main ICD10), 41204 (Diagnoses - secondary ICD10), and 41201 (External causes - ICD10).
Genomic data for the X chromosome were obtained from UK Biobank data field 22418. Specifically, the PLINK binary biallelic genotype table file ukb22418_cX_b0_v2.bed and its associated PLINK sample information file ukb22418_cX_b0_v2_s488225.fam were downloaded using the gfetch utility. Additionally, the PLINK extended MAP file for chromosome X was obtained from UK Biobank Resource 1963. The previously converted MS diagnosis information was encoded as the phenotype variable for the PLINK analysis. PLINK was used to perform the association analysis, using the default parameters, for all 18,857 SNPs that were mapped to the X chromosome (
). The Benjamini multiple testing correction was employed on the resultant p-values. The full Homo sapiens gene set was used as the background gene set.
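The multiple testing correction described above can be sketched in a few lines. The sketch assumes the Benjamini-Hochberg step-up procedure, which the text does not name explicitly, and the raw p-values are hypothetical placeholders rather than study results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five markers (not taken from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
# Apply the FDR < 0.01 threshold used in the text.
significant = [p < 0.01 for p in adj]
```

With these placeholder values only the first marker remains below the 0.01 threshold after adjustment.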
Co-occurrence analysis was performed by grouping significant SNPs from myelination and autoimmune implicated genes into two categories. For each individual, co-occurrence was defined as the presence of at least one alternate allele from each group. The significance of the interaction between MS diagnosis and the co-occurrence of myelination and autoimmune implicated variants was then assessed using a Chi-squared test.
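A minimal sketch of this co-occurrence definition and test follows. It assumes a Pearson Chi-squared test on a 2x2 table without continuity correction, since the text does not state which variant was used. The SNP names, genotypes, and diagnoses are hypothetical placeholders.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 d.f.) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def co_occurs(alt_counts, group_a, group_b):
    """True if a person carries at least one alternate allele in each group."""
    in_a = any(alt_counts.get(snp, 0) > 0 for snp in group_a)
    in_b = any(alt_counts.get(snp, 0) > 0 for snp in group_b)
    return in_a and in_b


# Hypothetical SNP groups (placeholder names, not the study's SNPs).
myelination = ["snp_m1", "snp_m2"]
immunity = ["snp_i1", "snp_i2"]

# Hypothetical people: (alternate allele count per SNP, has MS diagnosis).
people = [
    ({"snp_m1": 1, "snp_i2": 1}, True),
    ({"snp_m2": 1, "snp_i1": 2}, True),
    ({"snp_m1": 1}, False),
    ({"snp_i1": 1}, False),
    ({}, False),
    ({"snp_i2": 1}, True),
    ({"snp_m2": 1, "snp_i2": 1}, False),
    ({}, False),
]

# Tally the 2x2 table: co-occurrence (yes/no) by diagnosis (case/control).
a = b = c = d = 0
for alt_counts, has_ms in people:
    carrier = co_occurs(alt_counts, myelination, immunity)
    if carrier and has_ms:
        a += 1
    elif carrier:
        b += 1
    elif has_ms:
        c += 1
    else:
        d += 1

table = (a, b, c, d)
stat, p_value = chi2_2x2(a, b, c, d)
```

Eight people are far too few for a meaningful test; the toy data only show how the table is built. In double precision, a very large statistic makes `math.erfc` return exactly 0.0.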
2.1 Ethical approval
UK Biobank had obtained ethics approval from the North West Multi-center Research Ethics Committee (approval number: 16/NW/0274). Informed consent from all participants was obtained by the UK Biobank. The UK Biobank approved an application for use of the data (ID 69385). All data used in this analysis has been fully de-identified by the UK Biobank, following the de-identification protocol V2. Further, the received Participant Data was released to researchers with distinct encrypted random number identifiers.
Using the publicly available UK Biobank resource, we examined 488,225 patients available in the database. The mean age at recruitment for participants in the UK Biobank set was 56.53 (S.D. = 8.1). The mean age of participants with MS at recruitment was 57.48 (S.D. = 9.2). To understand the available cohort of persons with MS, we first examined the frequency of multiple sclerosis in the database. In contrast with previous estimates of multiple sclerosis prevalence of 203.4 per 100,000 population of the United Kingdom (
), we observed a significantly higher ratio of 404.3 individuals per 100,000 (χ2 p-value = 2.04e-21). In total, 1974 diagnosed cases of MS as defined by ICD10 code G35 were identified from the UK Biobank cohort. Next, we examined the sex based frequency of MS in the UK Biobank cohort. 0.536% (1419/264,772) of women were diagnosed with MS as defined by the G35 ICD10 code. In contrast, only 0.248% (555/223,453) of men have an MS diagnosis. In accordance with the previously reported higher prevalence of MS among women of 2.3–3.5:1 (
), we observed a ratio of 2.16:1, with 71.9% of MS cases in the UK Biobank being diagnosed in women. This constitutes a significantly higher prevalence of MS in women in the UK Biobank cohort (χ2 p-value = 1.76e-55).
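The prevalence figures above follow directly from the reported counts. The short calculation below reproduces them using only numbers stated in the text; the variable names are illustrative.

```python
# Counts reported in the text.
cases_women, women = 1419, 264_772
cases_men, men = 555, 223_453

cases_total = cases_women + cases_men   # 1974 diagnosed cases
cohort = women + men                    # 488,225 participants with known sex

prevalence_per_100k = cases_total / cohort * 100_000
pct_women = cases_women / women * 100
pct_men = cases_men / men * 100
sex_ratio = pct_women / pct_men
share_women = cases_women / cases_total * 100

print(f"{prevalence_per_100k:.1f} per 100,000")   # 404.3 per 100,000
print(f"{pct_women:.3f}% of women")               # 0.536% of women
print(f"{pct_men:.3f}% of men")                   # 0.248% of men
print(f"{sex_ratio:.2f}:1 women to men")          # 2.16:1 women to men
print(f"{share_women:.1f}% of cases in women")    # 71.9% of cases in women
```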
To identify SNPs significantly associated with MS, we performed PLINK association case-control analysis for the X chromosome. In total, 488,377 individuals were analyzed. Of these, 223,453 were encoded as men, 264,772 as women, and 152 as unspecified sex. Further, 1974 were mapped as MS cases, 486,251 as controls, with 152 possessing a missing phenotype. The total genotyping rate in the remaining individuals was 0.98. In total, we analyzed 18,857 SNPs that were present on the X chromosome as part of the UK Biobank Axiom® Array (
). We first analyzed the Q-Q plot, which revealed that the PLINK analysis had sufficient power to identify significant X-linked SNPs associated with MS (Fig. 1). To identify significant SNPs, we performed False Discovery Rate multiple-testing correction on the resultant PLINK p-values. This analysis revealed 44 SNPs that were significant at the FDR < 0.01 level (Fig. 2). This included 20 SNP variants that were significantly more abundant in persons with MS, and 24 SNP variants that were significantly more abundant in control cases. In total, the 20 significant SNP variants positively associated with MS prevalence were mapped to 14 genes (Table 1). All 14 genes are located outside the pseudoautosomal regions, implying that all male cases are hemizygous at these SNP locations. Suggesting the importance of these genes in proper homeostasis and metabolic functioning, the 14 genes were significantly enriched (Benjamini p-value = 3.2E-3) in the UniProt keyword Disease mutation (9/14). To get a better understanding of how these SNPs affect the function of the genes in which they reside, we next examined the individual SNPs.
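Each significant SNP in this section is reported with an odds ratio (OR). A minimal sketch of an allelic odds ratio is given below, assuming a plain 2x2 table of alternate and reference allele counts in cases and controls. The function name and counts are hypothetical, and the sketch does not attempt to reproduce how PLINK counts male X-chromosome genotypes.

```python
def allelic_odds_ratio(alt_cases, ref_cases, alt_controls, ref_controls):
    """Odds of the alternate allele in cases relative to controls.

    Returns None when the ratio is undefined (zero denominator).
    """
    denominator = ref_cases * alt_controls
    if denominator == 0:
        return None
    return (alt_cases * ref_controls) / denominator


# Hypothetical allele counts (not from the study).
print(allelic_odds_ratio(10, 990, 20, 99_000))   # 50.0
# No alternate alleles among controls: the odds ratio is undefined.
print(allelic_odds_ratio(4, 996, 0, 99_020))     # None
```

An absent alternate allele among controls is one situation in which an odds ratio cannot be computed and might be reported as NA.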
Table 1. Genes with significant (FDR < 0.01) MS associated SNPs, classified by their putative roles in remyelination and autoimmunity.
We observed that the majority of genes with significant SNPs could be classified into two categories, widely implicated in MS onset and progression: myelination and immunity pathways. In total, we were able to identify 5 genes and 6 significant SNPs implicated in myelination functions, and 7 genes and 12 significant SNPs in immunity related functions (Table 1). Interestingly, the most significant SNP is implicated in myelination functionality.
3.1 Myelination implicated SNPs
The most significant SNP was rs797045835, which maps to the NAD(P) dependent steroid dehydrogenase-like (NSDHL) gene (FDR p-value = 1.57E-45, OR = NA). The alternative allele for this SNP is an 8 nt deletion, which results in a frameshift mutation affecting the tail 27 aa of the NSDHL protein. The NSDHL protein performs essential roles in the production of cholesterol. Further, the cholesterol pathway and NSDHL specifically have been recently implicated in the demyelination associated with MS (
). Specifically, downregulation of cholesterol biosynthesis was associated with increased demyelination, and it is possible to speculate that the frameshift mutation results in the dysregulation of the NSDHL protein.
The SNP rs950792996 (FDR p-value = 1.78E-7, OR = 55.28) represents an intron variant in heparan sulfate 6-O-sulfotransferase 2 (HS6ST2). Recent research has shown that heparan sulfate accumulation by oligodendrocyte cells slows demyelination and promotes remyelination associated with MS (
). Thus our results suggest that HS6ST2 has a potential role in helping generate specific heparan sulfates responsible for promoting remyelination, which this variant potentially disrupts.
The gene tafazzin (TAZ) contains two significant SNPs: rs387907218 and Affx-89017095. rs387907218 (FDR p-value = 7.08E-6, OR = 44.2) is a missense variant (G > R) that is associated with infantile dilated X-linked cardiomyopathy (
). Affx-89017095 (FDR p-value = 1.55E-3, OR = 27.6) also represents a missense mutation (I > N) in the putative acyl-acceptor binding pocket of the Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: AGPAT-like domain, with the potential to disrupt or alter substrate binding. TAZ plays an important role in remodeling cardiolipin and by extension the proper structure and function of mitochondria (
). It has been experimentally shown that cerebrospinal fluid of patients with progressive MS causes neuronal mitochondrial elongation, which is thought to contribute to the metabolic impairment of neuronal bioenergetics underlying neurodegeneration associated with MS (
RNA binding motif protein 10 (RBM10) contains the significant SNP rs139585263 (FDR p-value = 4.51E-4, OR = 31.57), which encodes a synonymous variant. RBM10 is an RNA-binding protein that regulates alternative splicing of DNA (cytosine-5)-methyltransferase 3b (DNMT3B) (
). DNMT3B regulates the activity of NF-κB-responsive promoters and consequently inflammation development. Further, increased demethylation activity of, in part, DNMT3B has been shown to coincide with hippocampal demyelination in persons with MS (
). If the observed synonymous variant is able to increase translational efficiency of RBM10, it has the downstream potential to promote demethylation and, in turn, the demyelination associated with MS.
A significant SNP rs367604031 (FDR p-value = 2.25E-3, OR = 13.39) is also located in the Androgen receptor gene (AR). It represents a missense (E > Q) variant, which is localized between two phosphorylation sites and adjacent to the transcription activation unit Tau-5 (
The most significant immunity implicated SNP was rs1194422515, which maps to lysosomal associated membrane protein 2 (LAMP2) (FDR p-value = 1.00E-21, OR = 221.7). The alternative allele is a single nucleotide deletion, inducing a frameshift, which results in disruption of terminal 218 amino acid residues that encode the second lumenal domain and a protein binding site. In addition to rs1194422515, three additional SNPs are significantly associated with MS risk. Those SNPs are rs42895 (FDR p-value = 4.53E-4, OR = 1.194); rs42886 (FDR p-value = 1.83E-3, OR = 1.486); and rs41300191 (FDR p-value = 1.01E-3, OR = 1.177). While rs42895 and rs42886 are intron variants, rs41300191 is a 3′ UTR variant. Autophagy, in which LAMP2 participates, is tightly linked to autoimmune regulation, and directly participates in the progress of MS. Further, as inflammation and oxidative stress are increased in MS lesions, LAMP2 expression is reduced. As such the resultant frameshift mutation induced by the alternative rs1194422515 allele likely similarly reduces the abundance of LAMP2 and in turn promotes inflammation and oxidative stress.
The arginine vasopressin receptor 2 (AVPR2) gene has three significant SNPs with alternative alleles more prevalent in MS cases. These three SNPs are Affx-89012620 (FDR p-value = 6.03E-21, OR = 213.5); Affx-89008152 (FDR p-value = 8.71E-10, OR = 73.63); and Affx-89010658 (FDR p-value = 7.11E-5, OR = 36.69). Affx-89012620 is a 1 nucleotide insertion, resulting in a frameshift mutation that affects the terminal 59 amino acid residues. This region also includes transmembrane helix 7 of the AVPR2 protein. Disruption of this transmembrane region by other SNP insertions, such as rs886040961, has been implicated in causing nephrogenic diabetes insipidus. Affx-89008152 and Affx-89010658 result in missense mutations. Vasopressin (AVP), which has been implicated in multiple sclerosis, is released after brain injury and contributes to the inflammatory response. Previous research showed that blocking the AVPR2 receptor can decrease BBB permeability and affect MS progression (
). Thus the association of MS with these AVPR2 mutations seems to indicate increased function of the AVPR2 receptor in promoting BBB permeability.
Another significant SNP, rs766668643 (FDR p-value = 8.85E-14, OR = 110.4), encodes a stop gain variant in the myotubularin related protein 8 (MTMR8) gene. The stop gain prematurely terminates the MTMR8 protein, removing the Myotubularin-like phosphatase domain. MTMR8 functions to dephosphorylate phosphatidylinositol 3-phosphate [PtdIns(3)P], which in turn decreases activity of autophagy processes (
). We speculate that the stop gain mutation to MTMR8, which removes its catalytic domain, promotes increased PtdIns(3)P levels, which in turn promote increased autophagy as seen in relapsing-remitting persons with MS.
We also identify a significant SNP in the coagulation factor VIII gene. The SNP rs369414658 (FDR p-value = 5.87E-5, OR = 18.45) is a serine to threonine missense variant. Factor VIII has also been implicated in BBB permeability (
The BCL6 corepressor gene (BCOR) presents an interesting case as it contains a SNP variant that is significantly more abundant in persons with MS (rs199676230) and also three SNP variants that are significantly more abundant in the control population (rs5963736, rs5963739, rs4076107). The rs199676230 SNP (FDR p-value = 4.24E-4, OR = 31.55) variant encodes a stop codon gain mutation. This mutation disrupts two ankyrin protein-protein binding domains, and the critical Polycomb Group Ring Finger 1 (PCGF1) binding domain. PCGF1 is an important factor in regulation of hematopoietic cell differentiation (
The porcupine O-acyltransferase (PORCN) gene contains a significant SNP, rs1556974235 (FDR p-value = 3.74E-3, OR = 24.54), which is a missense (R > C) variant. PORCN acts as a stimulator of Wnt secretion, which has been implicated in promoting the proper formation of the BBB phenotype (
Finally, ETS-related transcription factor Elf-4 (ELF4) contains the significant SNP rs373568641 (FDR p-value = 3.79E-3, OR = 12.68), which encodes a missense (S > P) variant. ELF4 functions to inhibit differentiation of CD4+ T cells into Th17 cells (
Another significant SNP, rs121912302 (FDR p-value = 1.94E-7, OR = 55.44), is located in the dyskerin pseudouridine synthase 1 (DKC1) gene. This is a missense mutation, which has previously been associated with X-linked dyskeratosis congenita (
The gene RIB43A domain with coiled-coils 1 (RIBC1) contains one significant SNP, rs782346908 (FDR p-value = 7.35E-5, OR = 36.8). It encodes a splice donor variant, which has the potential to affect protein structure through exclusion of exons or inclusion of intron sequences into the mature mRNA.
3.4 Co-occurrence analysis
Our results revealed the importance of mutations in myelination and immunity genes on the X chromosome in the presentation of MS. Given the importance of deficient remyelination and overactive autoimmunity (particularly with regard to BBB disruption driven inflammation) in driving multiple sclerosis progression, we asked whether individuals carrying variants from both functional classes (immunity and remyelination) were more likely to have MS (Table 2). To analyze the importance of the concurrent disruption of myelination and immune functions in driving the MS phenotypes, we performed a co-occurrence analysis. First, we observed that no men possessed alternative alleles for the significant SNPs in remyelination implicated genes. Further, of all women with these remyelination implicated SNPs (58 total), none possessed two copies of the alternate alleles; all were heterozygous at these SNP positions. The co-occurrence analysis revealed that concurrent presence of both remyelination and autoimmunity SNPs was significantly enriched in women with MS (20.7X; χ2 p-value = 0.0). This result implies that women possessing variant alleles in both groups (immunity and remyelination) have a 20-fold higher risk of developing MS. This effect was largely driven by remyelination SNPs, which were also significantly enriched in women diagnosed with MS (22.8X; χ2 p-value = 0.0). Our results indicate that individuals with co-occurring variant alleles in both X-linked remyelination implicated genes and X-linked immune functioning genes are over 20 times more likely to have MS.
Table 2. Co-occurrence of significant SNPs classified as myelination functioning and immunity functioning in the UK Biobank cohort as variant frequency per 100,000 people.
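The fold enrichment quoted above can be read as a ratio of carrier frequencies, each expressed per 100,000 people. The sketch below shows that arithmetic under this assumption; the function name and all counts are hypothetical and are not the values behind the reported 20.7X.

```python
def per_100k(carriers, group_size):
    """Carrier frequency per 100,000 people in a group."""
    return carriers * 100_000 / group_size


# Hypothetical counts of women carrying variants from both gene groups.
carriers_with_ms, women_with_ms = 4, 2_000
carriers_without_ms, women_without_ms = 50, 500_000

freq_ms = per_100k(carriers_with_ms, women_with_ms)            # 200.0
freq_no_ms = per_100k(carriers_without_ms, women_without_ms)   # 10.0
fold_enrichment = freq_ms / freq_no_ms                          # 20.0
```

Strictly, this ratio compares how common the variants are among cases and non-cases. Reading it as a fold increase in risk is an approximation that works best when the disease is rare in the cohort.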
Here we present an analysis using a large, publicly available genomic resource, the UK Biobank, to identify alleles which potentially contribute to the observed sex bias in the presentation of multiple sclerosis. Since MS is a progressive autoimmune disease which presents through axonal demyelination and neuronal death, and presents almost three times more commonly in women, we examined the X chromosome for possible informative variant alleles. Using the genomic and biomedical information from 488,377 individuals available as part of the UK Biobank cohort, we performed chromosome-wide association analysis for MS occurrence with the 18,857 SNPs that were present on the X chromosome. Our analysis identified 20 SNPs that were significantly associated with MS at an FDR level of less than 0.01. These SNPs belong to 14 genes. Although many of these genes have been tangentially implicated in MS, as described in the results, our results present the first evidence of causal alleles within them that are significantly associated with MS occurrence. Among them are NSDHL, LAMP2, AVPR2, MTMR8, HS6ST2, DKC1, TAZ, and F8. These genes fall into two main categories: seven genes that are implicated in inflammatory responses, and five genes that are implicated in myelination functions.
Genes that promote inflammation do so through a variety of pathways. However, we observed that the majority (4) are implicated in BBB phenotypes: AVPR2, F8, PORCN, and ELF4. The breakdown of the BBB, in which the significant SNPs from these genes have been implicated, has also been implicated in allowing infiltration of lymphocytes which release pro-inflammatory cytokines, proteases, and reactive oxygen species, responsible for demyelination (
). Thus our results further support the role of a breakdown in BBB functionality in driving MS onset. The other immunity implicated genes, LAMP2 and MTMR8, regulate the inflammatory response, while BCOR regulates immune cell development. Autophagy, in which LAMP2 and MTMR8 have been functionally implicated, plays a major role in two of the main hallmarks of MS, neurodegeneration and inflammation, making it especially important to understand how this pathway contributes to MS manifestation and progression (
). This may be particularly impactful for LAMP2, where disrupted lysosome function has previously been implicated in neurodegenerative diseases, including Alzheimer's disease, amyotrophic lateral sclerosis and familial Parkinson's disease (
The five genes that affect myelination are involved in diverse pathways. The most significant SNP from our analysis resides in the NSDHL gene, whose protein product is involved in the production of cholesterol, an important ingredient in myelin formation. High cholesterol levels have been shown to be essential for myelin membrane growth (
). Other significant genes affect myelination through less direct pathways, including heparan sulfate production by HS6ST2, promotion of demethylation by RBM10, or proliferation regulation by TAZ. Of particular interest was the significant SNP located within the androgen receptor gene. AR has been shown to promote remyelination, through the action of testosterone and 5αDHT (
). This result provides new support to the immunomodulatory role of testosterone and further suggests that low testosterone levels potentially predispose women to increased rates of MS. Highlighting the particularly important role the myelination genes play, none of the significant SNPs were found to be present in men or in two copies in women.
The immunity and myelination pathways have been previously shown to play important roles in the onset and progression of MS (
). However, the interplay between these pathways has not been fully explored. Given the importance of immunity and myelination pathways in the onset and progression of MS, we asked whether individuals carrying significant SNPs in both the immunity genes and the myelination genes were more likely to be diagnosed with MS. For this purpose, we performed a co-occurrence analysis, which revealed that mutations in both myelination and autoimmune/blood brain barrier functionalities significantly increase the risk of developing MS, particularly in women. Women with mutations in both groups of genes were over 20 times more likely to have an MS diagnosis, compared to those with mutations in only one or neither of these gene groups. These results further strengthen the case for the BBB permeability initiated inflammatory response and the remyelination response as the two critical forces that drive MS onset, axon loss, and the subsequent progressive neurodegeneration that characterizes MS. In particular, our results further support the important role the remyelination process appears to play in protecting axons from the aberrant autoimmune process characteristic of MS.
Multiple previous studies have attempted to implicate SNPs in the onset or progression of MS using large genomic datasets (
). The reported SNP rs2807267 is missing from the UK Biobank Axiom array, so we were unable to further validate its significance. We believe our work is the first to focus specifically on the X chromosome as the potential source of SNPs that can be implicated in the onset and progression of MS. Our results shed further light on how mutations in the inflammatory and remyelination pathways are able to promote MS occurrence and progression. In particular, our identification of mutations in genes involved in BBB permeability (AVPR2, F8, PORCN, and ELF4) sheds additional light on the mechanisms which can promote infiltration of T-helper cells, which release cytokines responsible for the demyelinated lesions associated with MS (
Although our research provides new insights into SNPs causal for MS, some limitations attenuate the power of the identified SNPs in predicting the likelihood of MS onset or the disease severity. One major limitation is the utilization of the UK Biobank Axiom Array, which covers only 18,857 SNPs across the X chromosome out of the potential 156 million total base pairs. To address this limitation, we will carry out this analysis using the UK Biobank exome sequencing dataset, once it becomes fully available in the future. The utilization of full protein coding regions will allow us to verify the SNPs identified here and to create a much fuller representation of the protein coding changes across the entire X chromosome which might be causal for MS onset. An additional limitation is the gross aggregation of MS cases. Given the differential disease presentation and prognosis across the four different types of MS, particularly distinctions between relapsing-remitting and progressive MS types, our study is limited in classifying the identified SNPs by potential disease severity. Although the UK Biobank resource does not have MS type designation, in future follow-up research we will aim to utilize the brain MRI images available through UK Biobank to classify patients by MS type, using unsupervised machine learning approaches such as Eshagi et al. (
). To further the utility of the identified SNPs in predicting MS onset or progression, we also plan to analyze the differences in distributions of the significant SNPs we identified between women and men. This will allow us to better understand which of the significant genes are causal for MS, specifically in women.
Our analysis provides novel insight into the roles of X-linked genes in the onset and presentation of multiple sclerosis. We identify 20 SNPs in 14 genes involved primarily in immunity and myelination functions that are significantly more abundant in persons with MS. The immunity genes primarily function in maintaining the blood-brain barrier, the disruption of which allows for the onset of autoimmune mediated inflammation and demyelination. The implicated myelination genes highlight the importance of a properly functioning myelination system in helping to prevent the onset of the neurodegeneration characteristic of MS. Finally, our co-occurrence analysis revealed that concurrent disruption of both myelination and immune systems significantly increases the risk of MS onset in women, by about 20-fold.
The author reports no competing interests to declare with regards to this work.