Bibliographic Reference

Bozic, I., Gerold, J. M., & Nowak, M. A. (2016). Quantifying clonal and subclonal passenger mutations in cancer evolution. PLOS Computational Biology, 12(2), e1004731. https://doi.org/10.1371/journal.pcbi.1004731

Core Argument

The frequencies of passenger mutations in a tumor are determined not only by the mutation rate and the timing of their appearance, but critically by the death-birth ratio δ = d/b of cancer cells. When δ is close to 1 (slow net growth, high cell turnover), the frequency spectrum shifts dramatically: mutations are more abundant at high frequencies, multiple clonal passengers can accumulate during expansion (not all were present in the founding cell), and phylogenetic trees become linear rather than star-shaped. Fitting the model to TCGA colorectal cancer data yields δ ≈ 0.997 for microsatellite-stable cancers. The model generalizes the Luria-Delbrück framework to a fully stochastic multitype branching process with cell death.

Methods

Continuous-time multitype branching process (infinite-alleles model). Single founding type-0 cell; all cells divide at rate b and die at rate d. Each division produces a new passenger mutation in one daughter cell with probability u = 0.015 (point mutation rate of ~5 × 10^−10 per base pair per division × ~3 × 10^7 base pairs in the exome). Mutations are tracked individually: each new mutation founds a new type. The key parameter is the death-birth ratio δ = d/b. Values explored: δ = 0.72 (fast-growing colorectal metastases), δ = 0.99 (early slow-growing tumors), and δ = 0.999 (premalignant). Monte Carlo simulations use the Gillespie algorithm. Model fitted to 42 TCGA colorectal cancer samples (MSS and MSI) with purity ≥70% and ploidy 1.8–2.2, using allele frequencies in [0.12, 0.25].
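
The simulation setup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name simulate_tumor, the stopping rule target_size, and the seed handling are hypothetical. The sketch samples only the order of birth and death events (the embedded jump chain of a Gillespie simulation) and skips waiting times, which is enough when frequencies are read off at a fixed population size.

```python
import random


def simulate_tumor(delta, u=0.015, target_size=1000, seed=0, max_tries=10_000):
    """Grow one surviving clone to target_size cells.

    Returns {mutation_id: fraction of cells carrying it}.
    Hypothetical helper for illustration; delta = d/b is the death-birth ratio.
    """
    rng = random.Random(seed)
    # All cells share the same rates, so the next event is a division
    # with probability b / (b + d) = 1 / (1 + delta).
    p_birth = 1.0 / (1.0 + delta)
    for _ in range(max_tries):
        cells = [()]  # founding type-0 cell carries no passenger mutations
        next_id = 0
        while 0 < len(cells) < target_size:
            i = rng.randrange(len(cells))  # the cell the event happens to
            if rng.random() < p_birth:
                daughter = cells[i]
                # With probability u, one daughter gains a brand-new mutation
                # (infinite-alleles assumption: every mutation is unique).
                if rng.random() < u:
                    daughter = daughter + (next_id,)
                    next_id += 1
                cells.append(daughter)
            else:
                # Death: overwrite the dying cell with the last one and shrink.
                cells[i] = cells[-1]
                cells.pop()
        if cells:  # this attempt survived and reached target_size
            counts = {}
            for genotype in cells:
                for mutation in genotype:
                    counts[mutation] = counts.get(mutation, 0) + 1
            return {m: c / len(cells) for m, c in counts.items()}
    raise RuntimeError("every attempt went extinct")


if __name__ == "__main__":
    freqs = simulate_tumor(delta=0.9, target_size=500, seed=1)
    clonal = [m for m, f in freqs.items() if f == 1.0]
    print(len(freqs), "surviving mutations,", len(clonal), "clonal")
```

Attempts that go extinct are discarded and restarted, so the returned frequencies are conditioned on survival of the clone. The number of events needed grows roughly like target_size × (1 + δ)/(1 − δ), so values of δ close to 1 call for small target sizes in this unoptimized sketch.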

Key Findings

  • Death-birth ratio δ = d/b is the critical parameter. When δ = 0 (no cell death, pure birth), all passenger mutations are strictly subclonal and the frequency spectrum follows 1/f. When δ > 0, several fundamental changes occur: neutral mutations can reach fixation, clonal passengers accumulate, and the frequency spectrum shifts toward higher frequencies. For δ = 0.99, the median frequency of the first three surviving mutations is >40%; for δ = 0.72, it is <5%.

  • Neutral mutations can reach fixation when d > 0. The probability that the k-th surviving passenger mutation reaches fixation (becomes present in 100% of cells) is ρ_k ≈ (u/(u − log δ))^k. For δ = 0.72, ρ_1 ≈ 0.04 — unlikely. For δ = 0.99, ρ_1 ≈ 0.60, ρ_2 ≈ 0.36, and even ρ_5 ≈ 0.08 — substantial. In a pure birth process (δ = 0), no passenger mutation ever reaches fixation.

  • Frequency spectrum formula. The cumulative distribution function for the frequency of the k-th surviving mutation: F_k(α) = 1 − [u/(u − log(1 − α(1 − δ)))]^k. This generalizes the 1/f distribution (which is the δ = 0 special case). As α → 1, F_k approaches one minus the fixation probability: F_k(1) = 1 − ρ_k.

  • Expected number of subclonal mutations above frequency α: m_s = u(1 − α)/((1 − δ)α). For δ = 0.99, u = 0.015, there are ~150 mutations present in >1% of cells, ~15 mutations in >10%, and ~1.5 clonal. For δ = 0.999, these numbers are tenfold higher. For δ = 0 (no death), there is on average only a single passenger above 1% frequency.

  • Expected number of clonal mutations: m_c = δu/(1 − δ). When δ is close to 1, clonal passenger mutations can be numerous — they were collected during clonal expansion, not all present in the first tumor cell. This challenges the assumption that clonal mutations are synonymous with “truncal” (present at initiation).

  • Phylogenetic tree shape is determined by δ. Fast growth (δ = 0.72) → star-like trees (mutations arise on independent branches). Slow growth (δ = 0.99) → linear trees (each new mutation arises in the lineage carrying the previous one). For intermediate δ = 0.97, the most likely tree shape is a mix of the two. This provides a theoretical basis for interpreting tree shapes observed in multi-region sequencing.

  • Inverse problem: when did a mutation arise? For a mutation observed at frequency α, the maximum likelihood estimate for the number of cells present at its appearance is ẑ = −1/log[1 − α(1 − δ)]. A mutation at 10% frequency most likely arose when there were ~10 cells if δ = 0, but ~1000 cells if δ = 0.99. A mutation at 50% frequency most likely arose at ~1 cell if δ = 0, but at ~200 cells if δ = 0.99. The same observed VAF can therefore imply very different times of origin depending on δ.

  • TCGA colorectal cancer fit. 42 samples. For MSS cancers, median a = u/(1 − δ) = 2.86, yielding δ ≈ 0.997 (assuming normal u = 0.015). For MSI cancers, a = 27.61 (higher mutation rate and/or lower δ). The fit quality (R² ≥ 0.9 for 16/42 samples) is consistent with a neutral branching process with cell death describing those well-fitted tumors.

  • Correction to Sottoriva & Graham. The deterministic 1/α estimate for the number of cells at mutation appearance is the δ = 0 special case. Bozic et al. provide the stochastic correction for δ > 0 — which can shift the estimate hundredfold.
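
The closed-form expressions in the findings above are easy to evaluate directly. The following is a small sketch, assuming natural logarithms throughout and using hypothetical function names (fixation_prob, freq_cdf, expected_subclonal, expected_clonal, cells_at_origin) that do not come from the paper; it simply transcribes the formulas as they are stated in this summary.

```python
import math


def fixation_prob(k, delta, u=0.015):
    """rho_k = (u / (u - log(delta)))**k; zero in the pure-birth case."""
    if delta <= 0:
        return 0.0  # log(delta) -> -infinity, so nothing ever fixes
    return (u / (u - math.log(delta))) ** k


def freq_cdf(alpha, k, delta, u=0.015):
    """F_k(alpha) = 1 - [u / (u - log(1 - alpha*(1 - delta)))]**k."""
    inner = 1.0 - alpha * (1.0 - delta)
    if inner <= 0:
        return 1.0  # only reached for alpha = 1 with delta = 0
    return 1.0 - (u / (u - math.log(inner))) ** k


def expected_subclonal(alpha, delta, u=0.015):
    """m_s = u(1 - alpha) / ((1 - delta) * alpha), for 0 < alpha <= 1."""
    return u * (1.0 - alpha) / ((1.0 - delta) * alpha)


def expected_clonal(delta, u=0.015):
    """m_c = delta * u / (1 - delta)."""
    return delta * u / (1.0 - delta)


def cells_at_origin(alpha, delta):
    """z_hat = -1 / log(1 - alpha*(1 - delta)), for 0 < alpha < 1."""
    return -1.0 / math.log(1.0 - alpha * (1.0 - delta))


if __name__ == "__main__":
    print(round(fixation_prob(1, 0.99), 2))         # about 0.60
    print(round(expected_subclonal(0.01, 0.99), 1))  # about 148.5
    print(round(expected_subclonal(0.10, 0.99), 1))  # about 13.5
    print(round(expected_clonal(0.99), 2))           # about 1.49
    print(round(cells_at_origin(0.10, 0.99)))        # about 1000
```

Evaluating freq_cdf at alpha = 1 returns 1 − fixation_prob for the same k and δ, which is a quick consistency check between the two formulas.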

Concepts Introduced or Used

neutral-evolution, passenger-mutation, clonal-mutation, subclonal-mutation, fixation-probability, death-birth-ratio, branching-process, phylogenetic-tree, star-phylogeny, linear-phylogeny, frequency-spectrum, molecular-clock, Luria-Delbruck-model, infinite-alleles-model

Entities Referenced

  • TCGA colorectal cancer dataset (COAD-US)
  • Genes: (none specific — analysis is at the exome level)
  • Cancer types: colorectal cancer (MSS and MSI subtypes)
  • Methods: multitype branching process, Gillespie algorithm, maximum likelihood estimation

Limitations (as stated by authors)

  • The model assumes no subclonal drivers at observable cell frequencies. Tumors with ongoing selection (e.g., CLL with subclonal drivers) are not well-described.
  • Loss of heterozygosity (LOH) is not modeled. LOH rates in non-CIN tumors are low enough (~10^−6 per division) that results should still hold; CIN tumors with high LOH rates require separate treatment.
  • Driver mutations appearing during clonal expansion are not considered. The model applies to neutral evolution during clonal expansion — relevant for metastases and some primary tumors but not for tumors undergoing selection.
  • Spatial constraints are not modeled in this paper (a spatial version exists in Waclaw et al., 2015).
  • The model does not explicitly model the stem cell hierarchy; however, the authors argue that mutations observable above 0.1–1% frequency will be those in the stem cell population, which behaves as described.
  • Plateau phases after clonal expansion may alter frequencies. In the large-time limit, old mutations either fix or go extinct; new mutations during the plateau are generally not at observable frequencies if the population size is ≥10

Relevance to Clonal Evolution

This paper provides the mathematical foundation for interpreting passenger mutation frequency spectra as a window into tumor growth dynamics. It shows that the death-birth ratio δ is as important as the mutation rate in determining the observed VAF distribution — a finding that directly impacts how neutrality tests (such as the 1/f test) should be calibrated. The insight that clonal passenger mutations accumulate during expansion, not all in the first cell, refines the interpretation of truncal mutations in phylogenetic reconstruction. The tree-shape result (star vs linear as a function of δ) provides a theoretical null model against which observed phylogenies can be compared. This paper extends the Bozic 2010 driver-passenger model to the full passenger frequency spectrum and connects it directly to observable sequencing data.