In recent years, efforts to understand the brain have been enhanced by transcriptomic profiling, which measures the expression levels of genes across the genome in cells or tissue samples throughout the brain. This information helps to link the brain’s molecular activity with observable measures of its structure or function (known as ‘phenotypes’).
To link gene expression to a brain phenotype, researchers use an approach called gene category enrichment analysis (GCEA). GCEA uses statistical methods to score how well the expression of each gene correlates with a particular phenotype. The genes are then grouped by category, and the scores within each category are combined. GCEA then measures the statistical significance of each category’s combined score, which identifies the gene categories most strongly related to that phenotype.
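The steps described above can be illustrated with a minimal Python sketch. It is not the researchers' toolbox. It assumes Pearson correlation as the gene score, the mean as the way scores are combined, and a one-sided null built from randomly chosen gene sets of the same size. The function names and the toy data are hypothetical.

```python
import numpy as np

def gene_scores(expression, phenotype):
    """Pearson correlation of each gene (column) with the phenotype across regions."""
    x = expression - expression.mean(axis=0)
    y = phenotype - phenotype.mean()
    return (x.T @ y) / (np.linalg.norm(x, axis=0) * np.linalg.norm(y))

def category_enrichment(expression, phenotype, categories, n_null=1000, seed=0):
    """Return {category: (mean score, p-value)} using a random-gene null (assumed)."""
    rng = np.random.default_rng(seed)
    scores = gene_scores(expression, phenotype)
    results = {}
    for name, gene_idx in categories.items():
        # Combine the scores of the genes in this category.
        observed = scores[gene_idx].mean()
        # Null distribution: mean score of random gene sets of the same size.
        null = np.array([
            rng.choice(scores, size=len(gene_idx), replace=False).mean()
            for _ in range(n_null)
        ])
        # One-sided permutation p-value (tests for positive association only).
        p_value = (1 + np.sum(null >= observed)) / (1 + n_null)
        results[name] = (observed, p_value)
    return results

# Toy data (hypothetical): 50 brain regions, 200 genes.
rng = np.random.default_rng(1)
n_regions, n_genes = 50, 200
phenotype = rng.normal(size=n_regions)
expression = rng.normal(size=(n_regions, n_genes))
expression[:, :10] += 2.0 * phenotype[:, None]  # genes 0-9 track the phenotype

categories = {"tracking": list(range(10)), "unrelated": list(range(100, 110))}
results = category_enrichment(expression, phenotype, categories)
for name, (score, p_value) in results.items():
    print(f"{name}: mean score = {score:.2f}, p = {p_value:.3f}")
```

In this toy example the "tracking" category is built to follow the phenotype, so it should receive a high combined score and a small p-value, while the "unrelated" category should not.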
Brain Function CoE researchers Ben Fulcher and Alex Fornito, together with Aurina Arnatkeviciute from Monash University, examined the statistical biases involved in using GCEA with transcriptomic data. They found that the rate at which a particular gene category is linked to a random phenotype is much higher than would be expected by chance. This leads to false positives – associations reported where there really are none. For some gene categories, the researchers found that more than 20% of associations were false positives.
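One way to see how such a false-positive rate could be estimated is to run the analysis on many purely random phenotypes and count how often each category comes out as significant. The sketch below is illustrative only and is not the researchers' code. It assumes a random-gene null, a significance threshold of 0.05, and a toy "coexpressed" category whose genes share a common expression pattern, which is one mechanism that can inflate the rate in this simulation. All names and data are hypothetical.

```python
import numpy as np

def category_p_value(scores, gene_idx, rng, n_null=200):
    """One-sided p-value of a category's mean score against random gene sets."""
    observed = scores[gene_idx].mean()
    # Each row of `order` is a random permutation of gene indices; its first
    # k entries form a random gene set with the same size as the category.
    order = rng.random((n_null, scores.size)).argsort(axis=1)[:, :len(gene_idx)]
    null = scores[order].mean(axis=1)
    return (1 + np.sum(null >= observed)) / (1 + n_null)

def false_positive_rates(expression, categories, n_phenotypes=200, alpha=0.05, seed=0):
    """Fraction of random phenotypes for which each category is called significant."""
    rng = np.random.default_rng(seed)
    # Standardise each gene so that a dot product gives a Pearson correlation.
    x = expression - expression.mean(axis=0)
    x /= np.linalg.norm(x, axis=0)
    hits = {name: 0 for name in categories}
    for _ in range(n_phenotypes):
        y = rng.normal(size=expression.shape[0])  # random phenotype, no true signal
        y -= y.mean()
        scores = x.T @ y / np.linalg.norm(y)
        for name, gene_idx in categories.items():
            hits[name] += int(category_p_value(scores, gene_idx, rng) < alpha)
    return {name: count / n_phenotypes for name, count in hits.items()}

# Toy data (hypothetical): 40 regions, 200 genes; genes 0-19 share a latent pattern.
rng = np.random.default_rng(1)
n_regions, n_genes = 40, 200
expression = rng.normal(size=(n_regions, n_genes))
latent = rng.normal(size=n_regions)
expression[:, :20] = latent[:, None] + 0.5 * rng.normal(size=(n_regions, 20))

categories = {"coexpressed": list(range(20)), "independent": list(range(100, 120))}
rates = false_positive_rates(expression, categories)
for name, rate in rates.items():
    print(f"{name}: significant for {rate:.0%} of random phenotypes")
```

Because every phenotype here is random, any significant result is a false positive. In this toy setting the coexpressed category should be flagged far more often than the nominal 5%, while the independent category should stay close to it.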
After identifying the causes of this false-positive bias, the researchers designed a new GCEA approach to overcome it. Rather than comparing each category against randomly chosen genes, the new approach measures statistical significance relative to ensembles of randomized phenotypes. Their software toolbox, which can be used to perform both conventional GCEA and the new approach, is freely available online.
The team has no plans to do more work in this research area.
Fulcher, B. D., Arnatkeviciute, A., & Fornito, A. (2021). Overcoming false-positive gene-category enrichment in the analysis of spatially resolved transcriptomic brain atlas data. Nature Communications, 12, 2669. doi: 10.1038/s41467-021-22862-1