NGS Workflow Explained: From Sample to Sequence Data

Next-generation sequencing (NGS) is now standard in most molecular biology labs, but the end-to-end workflow is still a black box for many users. Understanding what happens at each stage helps you design better experiments and read sequencing reports critically.

Stage 1: Sample extraction and QC

Whatever you sequence, you start with clean nucleic acid. Standard extraction kits work for genomic DNA; RNA needs DNase treatment to remove contaminating genomic DNA, and often poly-A selection or rRNA depletion before library prep. Quantify with a fluorometric method (Qubit). UV absorbance (NanoDrop) also picks up RNA, free nucleotides, and other contaminants, so it tends to overestimate concentration and is not reliable for planning library prep input. Assess integrity on a Bioanalyzer or TapeStation; RNA quality is reported as an RNA Integrity Number (RIN), and most protocols require RIN > 7.

Stage 2: Library preparation

This is where input nucleic acid is converted into a sequencing-ready library. Core operations:

  • Fragmentation — mechanical, enzymatic, or transposon-based (Tn5)
  • End repair and A-tailing — preparing fragment ends for adapter ligation
  • Adapter ligation — adapters contain flow cell sequences, indices for multiplexing, and primer sites
  • PCR amplification — usually 8–15 cycles; minimize cycles to reduce duplication
  • Size selection and cleanup — typically with SPRI beads

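Check the final library on a Bioanalyzer to confirm the size distribution, and quantify it by qPCR, which measures only adapter-ligated (sequenceable) fragments, so that the loading concentration and resulting cluster density are accurate.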
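When a library is quantified by mass (for example fluorometrically) rather than by qPCR, the concentration is commonly converted to molarity using the mean fragment size from the Bioanalyzer trace. The sketch below assumes an average mass of 660 g/mol per base pair of double-stranded DNA; the function name and the example values are illustrative, not taken from any kit protocol.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/uL) to nanomolar (nM).

    Assumes ~660 g/mol per base pair of double-stranded DNA.
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    grams_per_mol = 660.0 * mean_fragment_bp
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 converts to nM.
    return conc_ng_per_ul * 1e6 / grams_per_mol


# Hypothetical example: a 2 ng/uL library with a 400 bp mean fragment size.
print(round(library_molarity_nm(2.0, 400), 2), "nM")
```

Because the result scales inversely with fragment size, an inaccurate size estimate shifts the molarity by the same proportion, which is one reason the size check and the quantification are done together.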

Stage 3: Sequencing

Most short-read sequencing runs on Illumina platforms using sequencing by synthesis with reversible-terminator chemistry. Libraries are loaded onto a flow cell, amplified into clusters via bridge amplification, and sequenced one base at a time with fluorescent nucleotides imaged each cycle.

Long-read platforms (PacBio HiFi, Oxford Nanopore) sequence single molecules, with no clonal amplification on the instrument. Reads are much longer (roughly 10 kb to 100+ kb, depending on platform and library), and modern chemistries achieve high accuracy.

Stage 4: Primary data processing

The sequencer outputs raw signals, which are basecalled into FASTQ files containing sequences plus per-base quality scores (Q-scores, on the Phred scale). Q30 corresponds to a 1-in-1,000 chance that a base call is wrong (99.9% accuracy), and the fraction of bases at or above Q30 is the typical benchmark for a successful Illumina run.
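The Phred scale defines Q = -10 · log10(P), where P is the probability that a base call is wrong. The sketch below converts between the two and decodes a FASTQ quality string, assuming the Phred+33 ASCII encoding used by current Illumina output; the helper names are illustrative.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with quality q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def decode_quality(qual, offset=33):
    """Decode a FASTQ quality string into integer Q-scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in qual]


def fraction_q30(qual):
    """Fraction of bases in one read with Q >= 30."""
    scores = decode_quality(qual)
    return sum(s >= 30 for s in scores) / len(scores)


print(phred_to_error_prob(30))   # error probability at Q30
print(decode_quality("II5!"))    # per-base Q-scores for a toy quality string
print(fraction_q30("II5!"))      # share of bases at or above Q30
```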

Stage 5: Bioinformatics analysis

This stage varies most by application:

  • WGS: Align to reference (BWA), call variants (GATK)
  • RNA-seq: Align (STAR, HISAT2) or pseudoalign (Salmon, kallisto), quantify, run differential expression (DESeq2, edgeR)
  • ChIP-seq: Align, call peaks (MACS2), motif analysis
  • scRNA-seq: Cell Ranger or STARsolo, then Seurat or Scanpy

Choosing depth and read length

Depth (coverage) and read length are the main cost drivers. Choose by application: ~30× for human WGS, ~100× for somatic variant calling, 20–30M reads per sample for bulk RNA-seq, and far more total reads for single-cell, where depth is budgeted per cell. Short reads work for most quantification tasks; long reads are essential for de novo assembly, structural variant detection, and complex genome regions.
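Mean coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below uses that relationship to turn a target depth into a read count. The ~3.1 Gb human genome size and the 150 bp read length are assumptions for illustration, and the estimate ignores duplicates, unmapped reads, and uneven coverage.

```python
import math


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Expected mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage, genome_size_bp, read_length_bp):
    """Minimum number of reads needed to reach a target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


HUMAN_GENOME_BP = 3_100_000_000  # assumed approximate size

reads = reads_for_coverage(30, HUMAN_GENOME_BP, 150)
print(reads)        # individual 150 bp reads for ~30x
print(reads // 2)   # read pairs if sequenced as 2 x 150 bp
```

In practice it is worth budgeting above the computed minimum, since some fraction of reads will be duplicates or fail to map.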

Common pitfalls

  • Using NanoDrop concentrations to plan library inputs
  • Over-amplifying libraries, leading to duplicate-heavy data
  • Skipping batch design: randomize samples across extraction, library prep, and sequencing batches, and never process all samples of one condition together
  • Underpowering: too few replicates is a common cause of “negative” RNA-seq results
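One simple way to avoid confounding batch with condition is stratified randomization: shuffle samples within each condition, then deal them across batches in turn so every batch receives a mix. The sketch below is a minimal illustration; the function name, sample IDs, and condition labels are hypothetical.

```python
import random
from collections import defaultdict


def assign_batches(samples, n_batches, seed=0):
    """Assign samples to batches, balancing conditions across batches.

    samples: list of (sample_id, condition) tuples.
    Returns a dict mapping sample_id -> batch index (0-based).
    """
    rng = random.Random(seed)  # fixed seed keeps the design reproducible
    by_condition = defaultdict(list)
    for sample_id, condition in samples:
        by_condition[condition].append(sample_id)

    assignment = {}
    slot = 0
    for condition in sorted(by_condition):
        ids = by_condition[condition]
        rng.shuffle(ids)  # random order within each condition
        for sample_id in ids:
            assignment[sample_id] = slot % n_batches  # deal round-robin
            slot += 1
    return assignment


samples = [(f"ctrl_{i}", "control") for i in range(4)]
samples += [(f"trt_{i}", "treated") for i in range(4)]
for sample_id, batch in sorted(assign_batches(samples, n_batches=2).items()):
    print(sample_id, "-> batch", batch)
```

Recording the seed and the resulting assignment alongside the sample sheet makes it possible to include batch as a covariate later.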

NGS is a chain of steps where each one constrains what’s possible downstream. Pay close attention to extraction QC and library prep — those upstream decisions determine the ceiling on your data quality.
