Working Group 5

Non-coding genomic variation

This working group aims to characterise non-coding genomic variation by focussing on burden analysis of coding genes and their associated non-coding elements, and of specific classes of non-coding RNAs, using a machine learning approach.

Background

Insights into the genetic causes of amyotrophic lateral sclerosis (ALS) have underpinned almost all of our current understanding of its molecular pathogenesis. Estimates of heritability for sporadic ALS are as high as 61% (Al-Chalabi et al 2010), but present knowledge accounts for only a proportion of the genetic basis in <10% of patients. A large proportion of the missing ALS heritability is likely to lie in non-coding DNA. Genetic association with ALS is significantly correlated with chromosome length (van Rheenen et al 2017), unlike the length of coding exons (Sakharkar et al 2004). In other disease areas the role of non-coding genetic association is increasingly recognised (e.g. Michailidou et al 2017). Analysis of non-coding sequence does not benefit from well-described features such as exons and introns, to say nothing of proteomics, all of which enable efficient prioritisation of variants to identify likely pathogenic candidates. As a result, novel approaches are needed. Machine learning, and particularly artificial neural networks, has delivered best-in-class differentiation in fields as diverse as computer vision and speech recognition with relatively little calibration. Increasingly, these methods are being applied to biological problems with significant success (Angermueller et al 2016). We propose to apply both traditional and novel approaches to take advantage of the rapid increase in available sequencing data and deliver a significant step forward for ALS genetics.

Objectives

Mapping different non-coding DNA categories, including (i) promoters & enhancers; (ii) transcribed non-protein coding RNAs (e.g., introns, miRNAs, lncRNAs, siRNAs, piRNAs, snoRNAs, snRNAs, exRNAs and scaRNAs).
Developing methods for calling qualifying variants in different subtypes of non-coding regions.
Aggregating non-coding regions with functionally associated coding genes where possible.
Linear analysis of burden of variants in non-coding elements and associated groups of elements.
Increasing detection power for complex association patterns and measurable modifier effects via machine learning approaches, including random forests and artificial neural networks.
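
As an illustration of what a burden analysis of a non-coding element can look like, the sketch below collapses qualifying variants within one element into a per-sample carrier status and compares carrier frequency between cases and controls with a one-sided Fisher's exact test. It is a minimal example under stated assumptions, not the working group's pipeline: the function names, the input layout (one list of allele counts per sample) and the choice of Fisher's exact test are all hypothetical choices made for illustration.

```python
from math import comb


def carrier_counts(genotypes, is_case):
    """Collapse variants in one element into carrier counts.

    genotypes: one list per sample holding allele counts for each
               qualifying variant in the element (hypothetical layout).
    is_case:   parallel list of booleans, True for ALS cases.
    Returns (case_carriers, n_cases, control_carriers, n_controls).
    """
    case_carriers = sum(1 for g, c in zip(genotypes, is_case) if c and any(g))
    ctrl_carriers = sum(1 for g, c in zip(genotypes, is_case) if not c and any(g))
    n_cases = sum(is_case)
    n_ctrls = len(is_case) - n_cases
    return case_carriers, n_cases, ctrl_carriers, n_ctrls


def fisher_one_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Returns the probability, under the hypergeometric null, of seeing
    at least `case_carriers` carriers among the cases.
    """
    total = n_cases + n_ctrls
    carriers = case_carriers + ctrl_carriers
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_ctrls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, carriers)


# Toy data: 4 cases followed by 4 controls, 2 qualifying variants in the element.
toy_genotypes = [[1, 0], [0, 1], [0, 0], [2, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
toy_is_case = [True] * 4 + [False] * 4

counts = carrier_counts(toy_genotypes, toy_is_case)
p_value = fisher_one_sided(*counts)
print(counts, p_value)
```

In practice a regression-based burden test would usually be preferred so that covariates such as sex and population structure can be included; the collapsing test above only shows the core idea of aggregating variants per element.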

Members

Johnathan Cooper-Knock

Chair / UK

Kevin Kenna

Co-Chair / UMCU, The Netherlands

Denis Bauer
Chen Eitan
Eran Hornstein
Alfredo Iacoangeli
Kevin P. Kenna
Natalie Twine
Nancy Yacovzada
John Quinn
Jack Marshall
Abigail Savage
Arash Bayat
Dennis Wang
Niamh Errington

Related publications

Common and rare variant association analyses in amyotrophic lateral sclerosis identify 15 risk loci with distinct genetic architectures and neuron-specific biology

Nature Genetics (2021) - van Rheenen W, et al.