Branching Process Model
Definition
A branching process is a stochastic model of a population in which each individual reproduces and dies independently according to fixed probabilistic rules. In cancer evolution, the individuals are tumor cells, reproduction is cell division, and death encompasses apoptosis, senescence, and terminal differentiation. The process tracks how the total number of cells — and the distribution of genetic variants among them — changes over time.
The Bozic-Nowak framework uses branching processes as the mathematical foundation for all three of its core papers (2010, 2013, 2016). This page defines the shared architecture; individual formulas live in their respective concept pages.
Why Branching Processes?
Three properties make branching processes the natural model class for somatic evolution:
- Stochasticity matters. Tumor initiation and driver acquisition are rare events. A single mutant cell’s lineage has a high probability of extinction by drift (~99% for a new driver-bearing cell at s ≈ 0.4%). Deterministic models (ordinary differential equations) miss this entirely: they would predict that every mutant expands, which is wrong by two orders of magnitude.
- Independence is a reasonable approximation. In a well-mixed expanding tumor, individual cells divide and die largely independently of each other (until spatial constraints become dominant at large sizes). This makes the branching assumption tractable.
- Clonal structure emerges naturally. A branching process automatically produces the genealogical tree of cells: which mutations occurred in which lineages, in what order, and at what frequencies. This maps directly onto observable quantities: VAF distributions, phylogenetic trees, and subclonal architectures.
flowchart TD
    subgraph Cell["Single Cell Decision"]
        C[Cell with k drivers] -->|"b_k = 1 − ½(1−s)^k"| Div[Division]
        C -->|"d_k = ½(1−s)^k"| Stag[Stagnation / Death]
    end
    subgraph Division["At Division"]
        Div -->|"1 − u"| Norm[Two identical daughters<br/>both with k drivers]
        Div -->|"u ≈ 3.4×10⁻⁵"| Mut[One mutant daughter<br/>now carries k+1 drivers]
    end
    subgraph Fate["Lineage Fate"]
        Mut -->|"P ≈ 2ks = 0.8%"| Est[Established, clone persists]
        Mut -->|"P ≈ 99.2%"| Ext[Extinct, lost to drift within<br/>a few generations]
        Est --> Growth[Exponential growth at rate ks]
    end
    Stag -.->|"Each driver reduces<br/>stagnation by factor (1−s)"| C
Discrete vs Continuous Time
The three papers use different time formulations for different questions:
| Aspect | Bozic 2010 | Bozic 2013, 2016 |
|---|---|---|
| Time | Discrete (generations) | Continuous (real time) |
| Division | Once per generation | Rate b per unit time |
| Death | Embedded in stagnation d_k | Rate d per unit time |
| Best for | Driver counting, waiting times | Therapy timing, frequency spectra |
The discrete-time model (a Galton-Watson process) is simpler for counting how many generations pass before the next driver appears. The continuous-time model (a birth-death process) is needed when the exact timing of birth, death, and mutation events matters — such as whether a resistant cell arises before or during treatment.
In the limit of many cells and small per-generation changes, the two formulations converge. Both produce the same logarithmic scaling of waiting times and the same survival probability structure.
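The continuous-time formulation can be illustrated with a small event-driven (Gillespie-style) simulation. This is a minimal sketch, not the papers' own code: the function names, the `max_cells` cutoff used to declare a lineage "established", and the rates b = 1.0, d = 0.5 are all illustrative assumptions.

```python
import random

def simulate_lineage(b, d, max_cells=50, rng=None):
    """Run one continuous-time birth-death lineage started from a single cell.

    Returns (survived, elapsed_time). The run stops at extinction or when the
    population first reaches max_cells (an assumed proxy for establishment).
    """
    rng = rng or random.Random()
    n, t = 1, 0.0
    while 0 < n < max_cells:
        # With n cells, the next event arrives after an exponential waiting
        # time with total rate n * (b + d).
        t += rng.expovariate(n * (b + d))
        # That event is a birth with probability b / (b + d), else a death.
        if rng.random() < b / (b + d):
            n += 1
        else:
            n -= 1
    return n > 0, t

def survival_fraction(b, d, runs=2000, seed=0):
    """Fraction of simulated lineages that reach max_cells before dying out."""
    rng = random.Random(seed)
    survived = sum(simulate_lineage(b, d, rng=rng)[0] for _ in range(runs))
    return survived / runs

if __name__ == "__main__":
    b, d = 1.0, 0.5  # illustrative rates, not taken from the papers
    print("simulated:", survival_fraction(b, d), "1 - d/b:", 1 - d / b)
```

The simulated fraction should land close to 1 − d/b, the same survival probability structure the discrete-time model produces.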
The Mutation Rate: Three Scales, One Biology
A common source of confusion: the three papers use different numerical values for u. These are NOT contradictory — they measure the same underlying biology at different genomic resolutions.
| Paper | u value | What it counts | Derivation |
|---|---|---|---|
| 2010 | 3.4 × 10⁻⁵ | Driver-eligible positions | ~34,000 positions × 5 × 10⁻¹⁰ per bp × 2 (the bare product is 1.7 × 10⁻⁵, so the quoted value implies a factor of 2) |
| 2013 | 10⁻⁹ | Per base pair per division | Base unit |
| 2016 | 0.015 | Exome-wide passenger rate | ~3 × 10⁷ exome bp × 5 × 10⁻¹⁰ per bp |
All three rest on a per-base-pair point mutation rate of order 10⁻⁹ per cell division: the 2010 and 2016 values are built from ~5 × 10⁻¹⁰, and the 2013 value of 10⁻⁹ is within a factor of two of it. They differ mainly in the genomic window: narrow (driver loci), single-nucleotide, or exome-wide. The wiki uses these values in driver-mutation, therapy-resistance, and neutral-evolution respectively; they are one consistent biology viewed at three scales.
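The table's arithmetic can be checked directly. The sketch below only multiplies the numbers quoted in the table; the constant and function names are illustrative, and the factor it reports for the 2010 row is an observation about the arithmetic, not a statement of how the paper derives it.

```python
PER_BP_RATE = 5e-10        # point mutations per base pair per division (table value)
DRIVER_POSITIONS = 34_000  # driver-eligible positions (2010 row)
EXOME_BP = 3e7             # exome size in base pairs (2016 row)
U_2010 = 3.4e-5            # quoted 2010 driver mutation rate
U_2016 = 0.015             # quoted 2016 exome-wide passenger rate

def windowed_rate(n_positions, per_bp_rate=PER_BP_RATE):
    """Mutation rate over a genomic window of n_positions base pairs."""
    return n_positions * per_bp_rate

driver_window = windowed_rate(DRIVER_POSITIONS)  # about 1.7e-5
exome_window = windowed_rate(EXOME_BP)           # about 0.015

if __name__ == "__main__":
    print("driver window:", driver_window, "quoted / computed:", U_2010 / driver_window)
    print("exome window: ", exome_window, "quoted / computed:", U_2016 / exome_window)
```

The exome row reproduces the quoted 0.015, while the driver row comes out at half the quoted 3.4 × 10⁻⁵.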
δ = d/b: The Death-Birth Ratio
The ratio of death rate to birth rate (δ) is the single most important parameter in the framework. It controls:
- Survival probability of a new mutant lineage: s = 1 − δ (the 2013 usage of s; in the 2010 model s instead denotes the selective advantage per driver)
- Extinction probability of a single cell: δ (the probability a newborn cell’s lineage dies out)
- Fixation probability of neutral mutations: ρ_k ≈ (u/(u − log δ))
- Frequency spectrum of passenger mutations: m_s = u(1 − α)/((1 − δ)α)
- Phylogenetic tree shape: star-like for δ ≪ 1, linear for δ ≈ 1
- Number of clonal passengers: m_c = δu/(1 − δ)
δ ranges from ~0.72 (fast-growing colorectal metastases) to ~0.997 (MSS colorectal primaries) to ~0.999 (premalignant tissues). δ = 0 means no cell death: a pure birth process, the special case that produces the familiar 1/f cumulative distribution (equivalently, a 1/f² frequency density).
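To get a feel for how strongly δ drives the passenger counts, the two closed-form expressions from the list above can be evaluated at the quoted δ values. This is a hedged sketch: the function names are hypothetical, α is treated as a frequency cutoff (an assumption, since this page does not define it), and pairing the exome-wide u = 0.015 with each δ is purely illustrative.

```python
def clonal_passengers(u, delta):
    """m_c = delta * u / (1 - delta), as listed above."""
    if not 0 <= delta < 1:
        raise ValueError("delta must lie in [0, 1)")
    return delta * u / (1 - delta)

def subclonal_passengers(u, delta, alpha):
    """m_s = u * (1 - alpha) / ((1 - delta) * alpha), as listed above."""
    if not 0 <= delta < 1:
        raise ValueError("delta must lie in [0, 1)")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    return u * (1 - alpha) / ((1 - delta) * alpha)

if __name__ == "__main__":
    u_exome = 0.015  # exome-wide passenger rate quoted for the 2016 paper
    alpha = 0.1      # assumed frequency cutoff, for illustration only
    for delta in (0.72, 0.997, 0.999):
        print(delta,
              clonal_passengers(u_exome, delta),
              subclonal_passengers(u_exome, delta, alpha))
```

Both quantities scale with 1/(1 − δ), so moving δ from 0.72 to 0.999 changes the counts by more than two orders of magnitude.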
The evolution of δ across the three papers:
| Paper | Treatment of δ |
|---|---|
| 2010 | Implicit in stagnation probability d_k = (1/2)(1−s)^k. The (1/2) baseline encodes homeostasis (δ = 1 at k=0), and each driver reduces effective δ by factor (1−s). |
| 2013 | Explicit but unnamed. b and d are separate rates. The survival probability s = 1 − d/b = 1 − δ carries the same information. |
| 2016 | Explicit and central. δ = d/b is named, varied systematically, and shown to control all aspects of the passenger frequency spectrum. |
flowchart LR
    subgraph Fast["δ = 0.72: Fast growth"]
        F1(Mutation 1) --> F2(Mutation 2)
        F1 --> F3(Mutation 3)
        F1 --> F4(Mutation 4)
    end
    subgraph Slow["δ = 0.99: Slow growth"]
        S1(Mutation 1) --> S2(Mutation 2)
        S2 --> S3(Mutation 3)
        S3 --> S4(Mutation 4)
    end
    Fast ---|"Star phylogeny<br/>mutations on independent branches<br/>few clonal passengers"| Fast
    Slow ---|"Linear phylogeny<br/>mutations accumulate sequentially<br/>many clonal passengers"| Slow
The Stagnation Formulation (2010)
In the 2010 discrete-time model, a cell with k driver mutations faces:
- Stagnation probability: d_k = (1/2)(1 − s)^k
- Division probability: b_k = 1 − d_k
At k = 0 (no drivers): d_0 = 1/2, b_0 = 1/2. Expected offspring = 0 × (1/2) + 2 × (1/2) = 1. This is a critical branching process — each cell exactly replaces itself on average. This encodes normal tissue homeostasis.
Each driver reduces stagnation by factor (1 − s). For small s: d_k ≈ (1/2)(1 − ks) and the expected offspring μ_k ≈ 1 + ks. Each driver adds approximately s to the net growth rate.
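The small-s approximation can be checked numerically. The sketch below assumes the stagnation probability d_k = (1/2)(1 − s)^k used elsewhere on this page and an offspring count of 0 on stagnation or 2 on division; the function names are illustrative.

```python
def stagnation_prob(k, s):
    """d_k = (1/2) * (1 - s)**k for a cell carrying k drivers."""
    return 0.5 * (1 - s) ** k

def division_prob(k, s):
    """b_k = 1 - d_k."""
    return 1 - stagnation_prob(k, s)

def mean_offspring(k, s):
    """Expected offspring: 0 with probability d_k, 2 with probability b_k."""
    return 2 * division_prob(k, s)

if __name__ == "__main__":
    s = 0.004
    for k in range(5):
        exact = mean_offspring(k, s)
        approx = 1 + k * s
        print(k, exact, approx, exact - approx)
```

For s = 0.004 the gap between the exact mean and 1 + ks stays small over the first few drivers, which is why the linear approximation is used.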
The survival probability of a new k-driver mutant lineage is:
P(survival) ≈ 2ks
For k = 1 and s = 0.004: P(survival) ≈ 0.008 (0.8%). Only 1 in 125 new driver-bearing cells establishes a lasting lineage. The other 124 are lost to stochastic extinction within a few generations. This is the quantitative basis for the slowness of tumor progression and the enormous patient-to-patient variation observed clinically — identical parameters produce vastly different trajectories because the waiting time for a rare successful mutant is exponentially distributed with high variance.
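The 2ks approximation can be compared with the exact survival probability of this Galton-Watson process. With 0 or 2 offspring, the extinction probability q is the smallest root of q = d_k + b_k q², which factorizes as (b_k q − d_k)(q − 1) = 0, so q = d_k/b_k whenever b_k > d_k. The sketch below assumes d_k = (1/2)(1 − s)^k; the function name is illustrative.

```python
def survival_probability(k, s):
    """Exact survival probability of a lineage founded by one k-driver cell."""
    d = 0.5 * (1 - s) ** k  # stagnation probability d_k
    b = 1 - d               # division probability b_k
    if b <= d:
        # Critical or subcritical process: extinction is certain.
        return 0.0
    # Extinction probability is d/b, the smaller root of q = d + b*q**2.
    return 1 - d / b

if __name__ == "__main__":
    k, s = 1, 0.004
    exact = survival_probability(k, s)
    print("exact:", exact, "approx 2ks:", 2 * k * s, "1 in", 1 / exact)
```

The exact value is slightly below 2ks for these parameters, and d_k/b_k plays the same role here as δ = d/b does in the continuous-time papers.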
The Compound Mutation Rate μ (2013)
In the 2013 therapy model, the effective mutation rate for pre-existing resistance is not the raw point mutation rate u but the compound parameter:
μ = u × log(Ms) / s
where M is the detection size and s = 1 − d/b is the survival probability. This ~250× amplification arises because:
- log(Ms): The number of cell divisions during growth from 1 to M cells, each an opportunity for mutation
- 1/s: Each surviving mutant lineage amplifies by branching process dynamics; the expected clone size at detection depends on when it arose
μ appears in formulas for the expected number of resistant cells. The raw u appears in formulas for the probability that zero resistant lineages were founded — counting founder events rather than cell counts.
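The size of the amplification factor log(Ms)/s can be explored numerically. This is only a sketch under stated assumptions: the logarithm is taken to be natural, and M = 10⁹ cells and s = 0.07 are hypothetical inputs chosen to land near the ~250× figure quoted above, not values taken from the 2013 paper.

```python
import math

def amplification(M, s):
    """Factor log(M*s)/s multiplying u in the compound rate (natural log assumed)."""
    if M * s <= 1:
        raise ValueError("M*s must exceed 1 for a positive amplification")
    return math.log(M * s) / s

def compound_rate(u, M, s):
    """mu = u * log(M*s) / s."""
    return u * amplification(M, s)

if __name__ == "__main__":
    M = 1e9   # hypothetical detection size in cells
    s = 0.07  # hypothetical survival probability, s = 1 - d/b
    u = 1e-9  # per-base-pair rate quoted for the 2013 paper
    print("amplification:", amplification(M, s))
    print("compound rate:", compound_rate(u, M, s))
```

Because the factor depends on s through 1/s and on M only logarithmically, it is far more sensitive to the death-birth ratio than to the detection size.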
Model Boundaries
The branching process framework assumes:
- Exponential growth during clonal expansion (no carrying capacity)
- Well-mixed population (no spatial structure)
- Mutations are rare relative to cell division
- Cell fates are independent
These assumptions hold for early-stage clonal expansion before spatial constraints dominate (tumors < ~1 cm). They break down when:
- gompertzian-growth deceleration becomes significant
- Spatial constraints create heterogeneous growth rates (partially addressed in Waclaw et al., 2015)
- Subclonal drivers create ongoing selection (the 2016 model explicitly requires no subclonal drivers at observable frequencies)
- Chromosomal instability produces high rates of loss of heterozygosity
Connections
- The 1/f² distribution (neutral-evolution), a 1/f² density of mutation frequencies whose cumulative count scales as 1/f, is the δ = 0 special case of the 2016 frequency spectrum
- The s ≈ 0.4% estimate (driver-mutation) is derived from the 2010 driver-passenger formula fitted to GBM and pancreatic cancer
- The cross-resistance framework (therapy-resistance) is the 2013 multitype branching process applied to drug sensitivity states
- The clonal ≠ truncal insight (passenger-mutation) follows from m_c = δu/(1 − δ) when δ is close to 1
Revision history
- 2026-06-26 — Added Mermaid diagrams: cell-division decision tree and δ-driven phylogeny shapes (star vs linear). Bozic trilogy equations verified against original PDFs — all accurate.
- 2026-06-20 — Created from long-form mathematical analysis of the Bozic-Nowak trilogy (2010, 2013, 2016). Provides the architectural scaffolding: branching process definition, discrete vs continuous time, u-scale reconciliation, δ evolution, stagnation formulation, compound mutation rate μ, and model boundaries.