How to Interpret a Volcano Plot: A Beginner’s Guide

If you’ve done any kind of differential analysis (RNA-seq, proteomics, CRISPR screens), you’ve stared at a volcano plot. Done well, a volcano plot makes thousands of features readable at a glance.

What’s on a volcano plot

A volcano plot is a scatter plot with two axes:

  • X-axis: Effect size — typically log2 fold-change. Negative values mean the feature is decreased in the treatment; positive values mean increased.
  • Y-axis: Statistical significance — typically -log10(p-value) or -log10(adjusted p-value). Higher = more significant.

Each dot is one gene, protein, or feature. The “volcano” shape comes from many features clustering at low fold-change with low significance, and a few rising up at the top corners.
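As a minimal sketch of how one dot gets its position, the snippet below converts a pair of group means and a p-value into the two plot coordinates. The function name `volcano_coords` and the input numbers are hypothetical, chosen only for illustration. Real pipelines usually estimate fold-change and p-values with a dedicated statistical model rather than a simple ratio of means.

```python
import math


def volcano_coords(mean_control, mean_treatment, p_value):
    """Return (log2 fold-change, -log10 p-value) for one feature.

    Hypothetical helper: assumes strictly positive means and p_value > 0.
    """
    log2_fc = math.log2(mean_treatment / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


# Illustrative values, not real data: a 4-fold increase with p = 0.001.
x, y = volcano_coords(mean_control=50.0, mean_treatment=200.0, p_value=0.001)
print(f"x = {x:.2f}, y = {y:.2f}")
```

A 4-fold increase lands at x = 2 on the log2 scale, and a 4-fold decrease would land at x = -2, which is why the log scale keeps the plot symmetric around zero.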

The four regions

Region              Interpretation
Upper right         Increased, significant: likely up-regulated
Upper left          Decreased, significant: likely down-regulated
Lower middle        Small effect, not significant: noise or no change
Lower left/right    Large effect but not significant: usually low-confidence
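The regions can be expressed as a small classification rule. This is a sketch, assuming cutoffs of |log2FC| > 1 and adjusted p < 0.05. The function name `classify_region` and its labels are hypothetical, and the extra "significant, small effect" label covers the upper-middle area that the table does not list.

```python
def classify_region(log2_fc, padj, fc_cutoff=1.0, alpha=0.05):
    """Assign one feature to a volcano-plot region.

    fc_cutoff and alpha are assumed defaults, not universal rules.
    """
    significant = padj < alpha
    large_effect = abs(log2_fc) > fc_cutoff

    if significant and large_effect:
        # Upper right or upper left, depending on the sign of the change.
        return "up-regulated" if log2_fc > 0 else "down-regulated"
    if large_effect:
        # Lower left/right: big change, weak evidence.
        return "large effect, not significant"
    if significant:
        # Upper middle: strong evidence for a small change.
        return "significant, small effect"
    # Lower middle.
    return "no change"


# Illustrative values, not real data.
print(classify_region(2.3, 0.001))
print(classify_region(-1.8, 0.0004))
print(classify_region(6.0, 0.5))
print(classify_region(0.07, 1e-20))
```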

Setting thresholds

  • Fold-change cutoff: Typically |log2FC| > 1 (2-fold change), but biology-dependent
  • Significance cutoff: Adjusted p-value < 0.05 (or < 0.01)

Always use adjusted p-values (for example, Benjamini-Hochberg FDR correction), not raw p-values, in genome-scale experiments. With 20,000 genes tested and no true differences at all, you would still expect about 1,000 genes (20,000 × 0.05) to reach p < 0.05 by chance alone.
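To make the arithmetic and the correction concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the five raw p-values are hypothetical. In practice you would normally rely on the adjusted values your analysis package already reports.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Expected number of chance hits if every null hypothesis is true.
n_tests = 20_000
alpha = 0.05
expected_false_positives = n_tests * alpha

# Illustrative raw p-values, not real data.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:<6} adjusted p = {q:.5f}")
```

In this toy input, four raw p-values fall below 0.05, but only two remain below 0.05 after adjustment.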

Common misinterpretations

  • “Highly significant” doesn’t mean “biologically important.” A gene with p = 10⁻²⁰⁰ and fold-change of 1.05 may be very real but irrelevant.
  • “Large fold-change” doesn’t mean “real.” A gene with log2FC = 6 and p = 0.5 is probably noise.
  • The strongest hits combine effect size and significance — they sit in the upper corners.

Improving readability

  • Label only key genes — top hits, candidates of interest, outliers
  • Color by category (up, down, not significant) or by gene set
  • Use transparency to show density without obscuring sparse outliers
  • Adjust axis limits, for example by flooring p-values at 10⁻³⁰⁰ before plotting so that p-values reported as 0 do not become infinite on the -log10 scale
  • Annotate gene counts in each region
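The readability tips above can be combined in a short matplotlib sketch. Everything named here (`P_FLOOR`, `neg_log10_capped`, `category`, `draw_volcano`, the colors, and the gene names) is a hypothetical choice for illustration, and the cutoffs are assumed to be |log2FC| > 1 and adjusted p < 0.05. matplotlib is imported inside `draw_volcano` so the helper functions remain usable without a plotting backend.

```python
import math
from collections import Counter

P_FLOOR = 1e-300  # assumed floor so p = 0 does not become infinite


def neg_log10_capped(p, floor=P_FLOOR):
    """Return -log10(p) after flooring p at `floor`."""
    return -math.log10(max(p, floor))


def category(log2_fc, padj, fc_cutoff=1.0, alpha=0.05):
    """Return 'up', 'down', or 'ns' (not significant) for coloring."""
    if padj < alpha and log2_fc > fc_cutoff:
        return "up"
    if padj < alpha and log2_fc < -fc_cutoff:
        return "down"
    return "ns"


def draw_volcano(rows, labels=(), path="volcano.png"):
    """Draw a volcano plot from rows of (name, log2_fc, padj)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    colors = {"up": "firebrick", "down": "royalblue", "ns": "grey"}
    cats = [category(fc, p) for _, fc, p in rows]
    counts = Counter(cats)

    fig, ax = plt.subplots()
    # Transparency shows density without hiding sparse outliers.
    ax.scatter([fc for _, fc, _ in rows],
               [neg_log10_capped(p) for _, _, p in rows],
               c=[colors[c] for c in cats], alpha=0.5, s=12)
    # Label only the genes the caller asked for.
    for name, fc, p in rows:
        if name in labels:
            ax.annotate(name, (fc, neg_log10_capped(p)))
    ax.set_xlabel("log2 fold-change")
    ax.set_ylabel("-log10(adjusted p-value)")
    # Annotate the gene counts per category in the title.
    ax.set_title(f"up: {counts['up']}  down: {counts['down']}  ns: {counts['ns']}")
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return counts


# Illustrative rows, not real data: (name, log2 fold-change, adjusted p).
rows = [("geneA", 2.5, 0.0), ("geneB", -1.7, 1e-6),
        ("geneC", 0.2, 0.4), ("geneD", 5.9, 0.5)]
summary = Counter(category(fc, p) for _, fc, p in rows)
print(summary)
# draw_volcano(rows, labels={"geneA", "geneB"})  # writes volcano.png
```

Coloring by a three-way category keeps the legend small. Coloring by gene set instead would only require swapping the `category` function for a lookup of set membership.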

Tools for volcano plots

  • R: EnhancedVolcano (Bioconductor) — feature-rich and publication-ready
  • R / ggplot2: for full custom control
  • Python: matplotlib, seaborn, or bioinfokit
  • GraphPad Prism: straightforward for smaller datasets

What to do after a volcano plot

  • Pathway analysis on the significant gene list (GSEA, Enrichr, Reactome)
  • Validation by qPCR or western blot for top hits
  • Functional follow-up on candidates of interest
  • Comparison to public datasets for replication

A volcano plot summarizes thousands of statistical tests in one image. The genes that matter are usually those with both meaningful effect sizes and strong significance — the upper corners.
