Bibliographic Reference
Bozic, I., Antal, T., Ohtsuki, H., Carter, H., Kim, D., Chen, S., Karchin, R., Kinzler, K. W., Vogelstein, B., & Nowak, M. A. (2010). Accumulation of driver and passenger mutations during tumor progression. Proceedings of the National Academy of Sciences, 107(43), 18545–18550. https://doi.org/10.1073/pnas.1010978107
Core Argument
Tumor progression can be modeled as a discrete-time branching process in which each successive driver mutation confers a small selective advantage by reducing the probability of cell stagnation (differentiation, death, or senescence). The model depends on only three parameters (driver mutation rate u, selective advantage s, and cell division time T), yet it captures the essential dynamics: the average time between successive driver mutations decreases as the tumor grows, the number of passengers is a predictable function of the number of drivers, and stochastic variation alone produces enormous heterogeneity in progression rates among patients with identical parameters. Fitting the model to glioblastoma multiforme (GBM) and pancreatic cancer sequencing data yields a surprisingly small estimate for the selective advantage of a typical driver mutation: s ≈ 0.4%.
Methods
Discrete-time branching process model. A cell with k driver mutations has stagnation probability d_k = (1/2)(1 − s)^k and division probability b_k = 1 − d_k. At each division, one daughter cell may acquire an additional driver mutation with probability u. Parameter estimates: point mutation rate ~5 × 10^−10 per base pair per cell division; ~34,000 driver-eligible positions in the genome (286 tumor suppressor genes × ~114 positions each + 91 oncogenes × ~14 positions each); u ≈ 3.4 × 10^−5 per cell division. The model was fitted to GBM (Parsons et al., 2008; 14 tumors, 713 mutations) and pancreatic adenocarcinoma (Jones et al., 2008; 9 tumors, 562 mutations) data, using CHASM for driver classification. It was validated against two independent familial adenomatous polyposis (FAP) clinical cohorts (Giardiello et al., 1993, 2002) for polyp number, size, age distribution, and growth rate.
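The per-cell probabilities above can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the authors' code; the function names are hypothetical. The survival probability uses the standard Galton-Watson result for a process in which a cell leaves either two offspring (probability b_k) or none (probability d_k), which gives roughly 2ks for small s. The last lines redo the driver-position arithmetic. Note that positions × point mutation rate gives about 1.7 × 10^−5, half the quoted u; the summary does not say where the factor of 2 comes from, and counting two copies of each position in a diploid genome is assumed here purely for illustration.

```python
def stagnation_probability(k: int, s: float) -> float:
    """d_k = (1/2) * (1 - s)**k for a cell carrying k driver mutations."""
    return 0.5 * (1.0 - s) ** k


def division_probability(k: int, s: float) -> float:
    """b_k = 1 - d_k."""
    return 1.0 - stagnation_probability(k, s)


def mean_offspring(k: int, s: float) -> float:
    """A cell leaves 2 offspring with probability b_k and 0 otherwise."""
    return 2.0 * division_probability(k, s)


def survival_probability(k: int, s: float) -> float:
    """Chance that a lineage started by one cell never dies out.

    The extinction probability q solves q = d_k + b_k * q**2, whose
    smallest root is d_k / b_k when b_k > d_k. Mutations to k + 1 drivers
    are ignored here, which is a simplifying assumption of this sketch.
    """
    d = stagnation_probability(k, s)
    b = 1.0 - d
    return 1.0 - d / b


# Driver-eligible positions, from the gene counts quoted in the text.
driver_positions = 286 * 114 + 91 * 14
point_mutation_rate = 5e-10  # per base pair per cell division
rate_single_copy = driver_positions * point_mutation_rate
rate_two_copies = 2 * rate_single_copy  # ASSUMPTION: diploid, two copies

print("driver positions:", driver_positions)
print("rate, one copy per position:", rate_single_copy)
print("rate, two copies per position (assumed):", rate_two_copies)
print("survival probability, k=1, s=0.004:", survival_probability(1, 0.004))
```

With s = 0.004 a single-driver lineage survives less than 1% of the time, which is why most new driver mutations are lost before they can expand.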
Key Findings
- Average selective advantage of a driver mutation is s ≈ 0.4%. Fitting to GBM data: s = 0.004 ± 0.0004. Fitting to pancreatic cancer data: s = 0.0041 ± 0.0004. For mutation rates u = 10^−6 and u = 10^−4, the estimates shift only modestly to s ≈ 0.65% and s ≈ 0.32%, respectively, so the estimate is robust to the mutation rate. This surprisingly small selective advantage explains why many drivers are needed to form an advanced malignancy within a human lifetime, and why in vitro validation of such weak effects is nearly impossible over short time periods.
- Waiting times between successive driver mutations shrink. The average time between the appearance of the first successful cell with k drivers and the first with k + 1 drivers is τ_k ≈ (T / (ks)) × log(2ks / u). For u = 10^−5, s = 0.01, T = 4 days: ~8.3 years to the second driver, but only ~4.5 more years to the third. The cumulative time to accumulate k drivers grows logarithmically with k; later drivers arise faster because each successive mutant clone expands at a faster rate.
- Passenger mutations accumulate linearly with time. The average number of passenger mutations in a tumor cell after t days is n(t) = v × t/T, where v is the neutral mutation rate per cell division. Combined with the driver model, the expected number of passengers in a tumor with k drivers is n = (v / (2s)) × log(4ks² / u²) × log(k).
- Enormous stochastic variation despite identical parameters. Six simulations with identical u, s, and T produced vastly different progression trajectories: one patient’s tumor had only acquired a second driver after 20 years with <10^5 cells, while another had three drivers and >10^11 cells by 25 years. This stochasticity arises from the randomness of when surviving mutant lineages appear, not from parameter differences.
- Independent validation in FAP. Using s = 0.004 derived from GBM/pancreatic data, the model closely matched the age, polyp number, and polyp size of FAP patients entering a clinical study (model: mean age 25 years, 35 polyps of 3.1 mm; data: mean age 24 years, 41 polyps of 3.2 mm). The model also predicted that 43% of young APC-mutation carriers would develop polyps within 4 years (data: 49%).
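The waiting-time and passenger formulas in the findings above can be checked numerically. The sketch below uses τ_k = (T / (ks)) × log(2ks / u), the form that reproduces the quoted 8.3-year and 4.5-year figures, and n = (v / (2s)) × log(4ks² / u²) × log(k), which is what results from replacing the sum of waiting times by an integral over k. Logarithms are natural. The function names are hypothetical, and the value v = 0.016 in the last example is an illustrative assumption, not a figure taken from this summary.

```python
import math


def waiting_time_days(k: int, u: float, s: float, T: float) -> float:
    """Average time from the first successful k-driver cell to the first
    successful (k + 1)-driver cell: tau_k = T / (k s) * log(2 k s / u)."""
    return T / (k * s) * math.log(2 * k * s / u)


def expected_passengers(k: int, u: float, s: float, v: float) -> float:
    """Expected passengers in a tumor with k drivers:
    n = v / (2 s) * log(4 k s**2 / u**2) * log(k)."""
    return v / (2 * s) * math.log(4 * k * s**2 / u**2) * math.log(k)


# Worked example from the text: u = 1e-5, s = 0.01, T = 4 days.
u, s, T = 1e-5, 0.01, 4.0
years_to_second = waiting_time_days(1, u, s, T) / 365
years_to_third = waiting_time_days(2, u, s, T) / 365
print(f"first -> second driver: {years_to_second:.1f} years")  # 8.3
print(f"second -> third driver: {years_to_third:.1f} years")   # 4.5

# Passengers for a 5-driver tumor with the fitted s and estimated u.
# v = 0.016 is an ASSUMED neutral mutation rate, for illustration only.
print(expected_passengers(5, u=3.4e-5, s=0.004, v=0.016))
```

Because s appears in the denominator, halving s roughly doubles both the waiting times and the passenger count, which is how the observed passenger numbers constrain s.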
Concepts Introduced or Used
driver-mutation, passenger-mutation, clonal-evolution, clonal-expansion, selective-advantage, branching-process, stagnation-probability, sequential-driver-model, neutral-mutation-rate, CHASM
Entities Referenced
- Genes: APC, BCR-ABL, TP53 (implicit via tumor suppressor classification)
- Cancer types: Glioblastoma multiforme (GBM), pancreatic adenocarcinoma, familial adenomatous polyposis (FAP)
- Methods: CHASM (Cancer-specific High-throughput Annotation of Somatic Mutations), discrete-time branching process, COSMIC database
- Tumor suppressor genes: 286 identified; oncogenes: 91 identified
Limitations (as stated by authors)
- The model assumes all driver mutations confer the same selective advantage s. Testing with s drawn from a Gaussian distribution (σ = s/2) showed the formulas still hold, but individual driver effects vary.
- Generation time T is assumed constant; however, the key formula (Eq. 2, drivers vs passengers) is independent of T.
- Carrying capacities for each mutant lineage are not modeled; the assumption that each lineage’s carrying capacity ≫ 1/u is reasonable but unverified.
- The model is designed for tumor progression, not initiation — it starts from the first driver mutation.
- Driver classification depends on CHASM, which may misclassify some drivers as passengers (the tumor with 1 driver and 16 passengers in the GBM data may reflect this).
- Only two tumor types (GBM, pancreatic) were used for parameter estimation; generalizability to other tumor types was not tested in the original study.
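The claim that the driver-passenger relationship does not depend on T can be illustrated with a short check. This is a sketch under stated assumptions: it sums the waiting times τ_j = (T / (js)) × log(2js / u) for j = 1 to k − 1 to get the time to the k-th driver, then converts that time to passengers with n = v × t/T. The function names and the value v = 0.016 are hypothetical. Because the sum is used directly rather than the closed-form approximation, the numbers differ somewhat from Eq. 2, but the cancellation of T is the same.

```python
import math


def time_to_k_drivers(k: int, u: float, s: float, T: float) -> float:
    """Days from the first driver to the k-th, as a sum of waiting times."""
    return sum(T / (j * s) * math.log(2 * j * s / u) for j in range(1, k))


def passengers_from_waiting_times(k: int, u: float, s: float,
                                  v: float, T: float) -> float:
    """n = v * t / T with t the time to k drivers; T cancels out."""
    return v * time_to_k_drivers(k, u, s, T) / T


# Same u, s, v (v is an ASSUMED value) but very different division times.
for T in (1.0, 4.0, 30.0):
    n = passengers_from_waiting_times(5, u=3.4e-5, s=0.004, v=0.016, T=T)
    print(f"T = {T:>4} days -> expected passengers = {n:.3f}")
```

A slower-dividing tissue takes longer in calendar time to reach k drivers but passes through the same number of cell divisions, so the passenger count at a given k is unchanged.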
Relevance to Clonal Evolution
This is the foundational quantitative model that established how weak selection in cancer actually is. The s ≈ 0.4% estimate has become one of the most widely cited numbers in cancer evolution — it quantifies what Nowell (1976) described qualitatively, explains why clonal expansions take decades, why so many driver mutations are needed, and why detecting selection from sequencing data is so difficult. The model’s demonstration that identical evolutionary parameters produce vastly different trajectories due to stochastic timing alone provides the theoretical basis for understanding the heterogeneity of tumor progression rates observed clinically. The driver-passenger relationship formula (Eq. 2) links observable quantities (number of mutations, number of drivers) to evolutionary parameters (s, u) — a bridge between genomic data and evolutionary theory that remains in active use.