In recent years, the number of genes associated with amyotrophic lateral sclerosis (ALS) has grown, and with it the number of novel variants, particularly missense variants, many of which are of unknown clinical significance. Here, we leverage the sequencing efforts of the ALS Knowledge Portal (3864 individuals with ALS and 7839 controls) and the Project MinE ALS Sequencing Consortium (4366 individuals with ALS and 1832 controls) to perform proteomic and transcriptomic characterization of missense variants in 24 ALS-associated genes.
The two sequencing datasets were interrogated for missense variants in the 24 genes, and variants were annotated with gnomAD minor allele frequencies; ClinVar pathogenicity classifications; protein sequence features, including UniProt functional site annotations and PhosphoSitePlus post-translational modification site annotations; structural features from AlphaFold-predicted monomeric 3D structures; and transcriptomic expression levels from the Genotype-Tissue Expression (GTEx) project. We then binned variants by the selected proteomic and transcriptomic features and applied missense variant enrichment and gene-burden testing to identify the features most relevant to pathogenicity in ALS-associated genes.
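As an illustration of the kind of feature-based enrichment test described above, the following minimal sketch compares how often case and control missense variants fall within a given feature bin (for example, a buried region). It assumes a two-sided Fisher's exact test on a 2×2 table; the abstract does not state which statistical test was used, and the function names and counts below are hypothetical and are not study data.

```python
from math import comb


def odds_ratio(case_in, case_out, ctrl_in, ctrl_out):
    """Odds ratio for a 2x2 table, with a 0.5 continuity correction
    added to every cell so that zero counts do not divide by zero."""
    a, b, c, d = (x + 0.5 for x in (case_in, case_out, ctrl_in, ctrl_out))
    return (a * d) / (b * c)


def fisher_exact_two_sided(case_in, case_out, ctrl_in, ctrl_out):
    """Two-sided Fisher's exact p-value for the table
    [[case_in, case_out], [ctrl_in, ctrl_out]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row_case = case_in + case_out
    row_ctrl = ctrl_in + ctrl_out
    col_in = case_in + ctrl_in
    total = row_case + row_ctrl
    denom = comb(total, col_in)

    def prob(k):
        # Probability that k of the col_in in-feature variants are case variants.
        return comb(row_case, k) * comb(row_ctrl, col_in - k) / denom

    p_obs = prob(case_in)
    k_min = max(0, col_in - row_ctrl)
    k_max = min(row_case, col_in)
    probs = (prob(k) for k in range(k_min, k_max + 1))
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


# Hypothetical counts for one feature bin (NOT taken from the study):
# variants inside vs. outside the feature, in cases and in controls.
example_or = odds_ratio(30, 70, 10, 90)
example_p = fisher_exact_two_sided(30, 70, 10, 90)
print(f"odds ratio = {example_or:.2f}, p = {example_p:.2g}")
```

An odds ratio above 1 with a small p-value would indicate that case variants are over-represented in the feature bin. In practice, such a test would be repeated for each feature and corrected for multiple testing.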
Using predicted human protein structures from AlphaFold, we determined that missense variants carried by individuals with ALS were significantly enriched in β-sheets and α-helices, as well as in core, buried or moderately buried regions. We also found that hydrophobic amino acid residues, compositionally biased protein regions and regions of interest were predominantly enriched in missense variants carried by individuals with ALS. Transcriptomic assessment of expression levels further revealed enrichment of variants with high and medium expression, both across all tissues and within the brain. We explored the enriched features of interest using burden analyses and found that individual genes were indeed driving certain enrichment signals. We present a case study of SOD1 as a proof of concept for how enriched features may aid in defining variant pathogenicity.
Our results present proteomic and transcriptomic features that are important indicators of missense variant pathogenicity in ALS and are distinct from features associated with neurodevelopmental disorders.