Lee, Kim, Nam, et al. (2026) — Clonal Evolution and Mutational Trajectories of Metastatic Colorectal Cancer Shaped by Anticancer Therapies

Bibliographic Reference

Lee, W. H., Kim, B., Nam, C. H., Yeo, H. Y., Lee, D. W., Park, S. C., Oh, J. H., Han, S.-S., Lim, M. C., Kim, H., Ju, Y. S., Chang, H. J., & Cha, Y. (2026). Clonal evolution and mutational trajectories of metastatic colorectal cancer shaped by anticancer therapies. Nature Communications. Advance online publication. https://doi.org/10.1038/s41467-026-74384-3

Core Argument

Late-stage clonal evolution in metastatic colorectal cancer (CRC) under therapeutic pressure remains poorly characterized because bulk-tissue sequencing cannot resolve subclonal mutations restricted to small numbers of cancer cells. The authors argue that single-cell-derived clonal tumoroids coupled with whole-genome sequencing can resolve these late-stage dynamics. They show that chemotherapy-associated mutations accumulate unevenly across co-existing clones, with preferential enrichment in more proliferative lineages; that complex genomic rearrangements (CGRs) and LINE-1 retrotransposition continue to evolve during late-stage progression, challenging the view of CGRs as single catastrophic events; and that convergent evolution, driver loss, and shifting L1 source activity are pervasive features of metastatic CRC under therapy.

Methods

Study design. Observational genomic analysis of six patients (4 male, 2 female) with metastatic CRC. Tumor tissues were collected from surgical specimens — eight primary tumors and ten matched metastases (liver in all patients; additionally peritoneum, ovary, and lymph node where available). Matched peripheral blood served as germline control.

Clonal tumoroid culture. Fresh tumor tissue was dissociated and cells were embedded in Matrigel at single-cell dilution. After 1–2 passages, single cells were FACS-sorted, expanded into clonal tumoroids, and harvested for DNA extraction. A total of 58 single-cell-derived tumoroids were established. Each tumor specimen was split: one portion for bulk-tissue sequencing, one for tumoroid culture.

Whole-genome sequencing. Genomic DNA libraries prepared using Truseq DNA PCR-Free Library Prep Kit; sequenced on Illumina HiSeq X Ten or NovaSeq 6000. Tumoroids and blood at ~30x target depth; bulk tumors at ~100x. Mean depth across all samples: 51x. Reads aligned to GRCh37 with BWA-MEM.

Variant calling and phylogenetic reconstruction. Somatic SNVs and indels identified using Mutect2 and Strelka, filtered against dbSNP, panel-of-normals, and matched normal. Copy number and cancer cell fraction (CCF) estimated with Sequenza. Clonal SNVs (CCF > 0.75, or > 0.6 in low-purity samples) were merged across samples and rescued at the read level. A binary presence-absence matrix was used to reconstruct phylogenetic trees under Wagner parsimony (Phylip Mix, jumble = 100). Tree fit rate exceeded 90% after mutation rescue. SVs identified with DELLY and refined with GREMLIN (machine-learning SV filter); SV-based phylogenies confirmed SNV-based topologies.

Mutational signature analysis. SigProfiler was used for signature decomposition against COSMIC v3.2 plus published cisplatin/oxaliplatin and radiation signatures. Signatures SBS17a/SBS17b, SBSoxaliplatin, SBScisplatin/oxaliplatin, and SBS25 were collectively classified as chemotherapy-associated base substitution signatures (SBSchemo). Indel signatures ID1, ID2, ID5, ID6, ID7, ID8, ID12, ID14, and IDradiation were also assessed.

L1 retrotransposition. Somatic L1 retrotranspositions (soL1Rs) called using MELT, TraFiC-mem, DELLY, and xTea; manually reviewed in IGV for poly-A tails, target site duplications, and read support. Insertions rescued across sister clones. Activity of L1 source elements traced via 3’ transductions.

Clonal asymmetry analysis. For each bifurcation in the phylogenetic trees, the two daughter branches were compared for mutation burden and number of descendant clones. Nodes were classified as concordant (higher mutation burden correlates with more descendants) or discordant. Significance assessed by chi-square test against 50:50 null expectation.

External validation datasets. The SBSchemo–SBSclock-like correlation was validated using WGS of chemotherapy-exposed normal blood cells (Mitchell et al. 2025) and normal colon epithelium (Kuijk et al. 2022). soL1R counts compared against treatment-naive CRC from PCAWG and normal colorectal epithelium (Nam et al. 2023).

Key Findings

  1. Pronounced inter- and intra-patient heterogeneity in chemotherapy-induced mutation burden. An average of ~1,400 SBSchemo substitutions per clone (~8% of total burden, range 0.1–17.3%) was identified, but the burden varied substantially across clones within the same patient. For example, SBSoxaliplatin was detected in only 5 of 27 oxaliplatin-exposed tumoroids. The SBS17b burden varied considerably among post-treatment tumoroids from the same individual, demonstrating uneven mutagenic consequences of chemotherapy across co-existing cancer cell populations.

  2. Chemotherapy-associated mutations preferentially accumulate in more proliferative lineages. SBSchemo burden was positively correlated with SBSclock-like mutations (SBS1, SBS5/40) — a proxy for cumulative cell divisions — both within patients (MC04, MC06) and in external normal-tissue datasets (275 clones from 4 normal blood donors, 12 clones from 2 normal colon donors). Phylogenetic asymmetry analysis confirmed that dominant lineages (those producing more descendant clones) accumulated significantly more mutations than their minor/dormant sister lineages (P = 9.413e-11; concordant vs. discordant ratio significantly deviating from 50:50). Patient MC05, who received combined chemoradiotherapy, showed complete absence of SBSchemo mutations despite prolonged chemotherapy, while exhibiting IDradiation indels — consistent with chemotherapy imposing strong negative selection against proliferative cells and radiation penetrating all cells indiscriminately.

  3. Complex genomic rearrangements (CGRs) continue to evolve during late-stage clonal evolution, contradicting the single-catastrophe model. Nine subclonal chromothripsis events were identified in addition to two early-stage events. In patient MC05, two independent early CGRs subsequently rearranged into a single dicentric chromosome through secondary CGR events in later branches. In patient MC03, extrachromosomal DNA (ecDNA) underlying MDM2 amplification showed dynamic evolution, including replacement of the ecDNA pool with a new mutant ecDNA in liver metastasis clones. Post-CGR regions appeared prone to additional rearrangements during clonal evolution.

  4. LINE-1 retrotransposition activity is markedly elevated in metastatic CRC under therapy and scales with cell proliferation. A total of 1,138 soL1Rs were detected (mean 104 per clone, range 8–291) — substantially higher than in normal colorectal organoids (3 per clone). The soL1R rate (events per SBSclock-like mutation) was comparable between branches and trunk (11.5 vs. 10.9; P = 0.9409), indicating that the apparent increase in soL1Rs under chemotherapy reflects additional cell divisions rather than therapy-induced L1 activation per se. L1 source activity showed substantial heterogeneity between patients and among clones within a patient, with some sources changing activity over the course of clonal evolution. An L1-mediated BRCA1 inactivation was identified, leading to homologous recombination deficiency in a post-treatment lineage.

  5. Convergent evolution, driver loss, and early metastatic divergence are prominent features of mCRC clonal architecture. Recurrent, independent driver events were observed: loss-of-function mutations in TP53 and APC in three synchronous primaries, parallel MYC amplifications, and three independent ERBB2 mutations/amplifications (one BFB-mediated focal amplification, one whole-chromosomal amplification, followed by at least three independent losses of the BFB amplicon). Notably, the NRAS G12D mutation was lost in liver metastases via large-chromosomal deletion, suggesting its function is dispensable for metastasis — with potential therapeutic implications (the primary tumor’s NRAS mutation would have precluded anti-EGFR therapy for a patient whose metastases lacked it). Across all six patients, metastatic clones shared a late-stage common ancestor, supporting a single dominant subclonal lineage as the source of distant metastasis. Clonal phylogenies also suggested that peritoneal, ovarian, and lymph node metastases diverged earlier than their corresponding liver metastases.

Concepts Introduced or Used

  • Clonal tumoroid phylogenetics — using single-cell-derived tumoroids as endogenous barcoding to reconstruct subclonal evolutionary trees at single-cell resolution, extending the approach of Roerink et al. (2018) to metastatic, post-therapy CRC.
  • Therapy-associated mutational burden (SBSchemo) — the first single-cell-level quantification of how many mutations chemotherapy deposits in individual cancer clones, as opposed to bulk-tissue estimates.
  • Preferential mutagenesis of proliferative lineages — the finding that chemotherapy disproportionately mutates actively dividing cells, while dormant/quiescent lineages accumulate fewer therapy-associated mutations, linking to cancer dormancy as a resistance mechanism.
  • Asymmetric clonal evolution (concordant/discordant topology) — a formal phylogenetic measure of whether mutation burden and clonal expansion are coupled, providing a quantitative framework for detecting selection-driven vs. neutral dynamics.
  • Post-CGR evolution — the observation that chromosomes having undergone chromothripsis remain prone to secondary rearrangements during later clonal evolution, challenging the single-catastrophe paradigm.
  • ecDNA pool replacement — dynamic change in extrachromosomal DNA composition across clones during metastatic progression, including replacement of existing ecDNA with new mutant variants.
  • LINE-1 retrotransposition rate scaling — the finding that L1 activity in cancer scales with cell division count rather than being independently upregulated by chemotherapy; rate per cell division is comparable between trunk and branches.
  • Driver mutation loss without targeted therapy — loss of key oncogenic mutations (e.g., NRAS G12D, ERBB2 amplicons) in the absence of corresponding targeted agents, suggesting functional redundancy at late stages.
  • RBFOX1 as a recurrently deleted CRC locus — frequent large deletions (22.4% of samples; 38.9% of lesions) with up to 4 concurrent independent deletions, exceeding the two-hit hypothesis expectation and suggesting genomic fragility.
  • L1 source element activity tracing — using 3’ transduction sequences to fingerprint which L1 source elements are active in which clonal branches, revealing temporal shifts in source activity.

Entities Referenced

EntityTypeRelevance
TP53GeneCanonical truncal driver in CRC; convergent loss in synchronous primaries
APCGeneCanonical truncal driver; stepwise loss-of-function and convergent loss
KRASGeneCanonical truncal driver in CRC
SMAD4GeneLate-stage driver; homozygous deletion restricted to liver metastases (MC05); loss enhances Wnt/beta-catenin signaling
SMAD3GeneStepwise loss-of-function via LOH + frameshift + second allele loss
BRCA1GeneL1-mediated inactivation leading to HRD in a post-treatment lineage (MC03)
ERBB2 (HER2)GeneConvergent evolution: three independent mutations/amplifications; BFB cycles; repeated amplicon loss
MYCGeneParallel allele-specific amplifications in synchronous primaries and metastases
MDM2GeneFocal amplification via ecDNA; dynamic ecDNA evolution and pool replacement
NRASGeneG12D mutation lost in liver metastases via large deletion; implications for anti-EGFR therapy
RBFOX1GeneFrequent recurrent deletions (22.4% of samples); candidate CRC-specific driver; fragile genomic region
MACROD2GeneRepeated large deletions observed in patient MC03
FOLFOXDrug regimen5-FU + folinic acid + oxaliplatin
FOLFIRIDrug regimen5-FU + folinic acid + irinotecan
5-fluorouracil (5-FU)DrugAssociated with SBS17a/SBS17b signatures
OxaliplatinDrugAssociated with SBSoxaliplatin and SBScisplatin/oxaliplatin
IrinotecanDrugNo detectable mutational signature in this study
COSMIC v3.2DatabaseMutational signature reference
PCAWGConsortiumExternal cohort for RBFOX1 deletion frequency and soL1R comparison
SigProfilerToolMutational signature extraction and decomposition
GREMLINToolMachine learning-based SV filtering
DELLYToolStructural variant caller
MELT, TraFiC-mem, xTeaToolsL1 retrotransposition callers
Phylip (Mix)ToolPhylogenetic reconstruction under Wagner parsimony

Limitations (as stated by authors)

  1. Small cohort (n = 6 patients). The small number limits generalizability given known inter-patient heterogeneity in CRC. The authors frame this as acceptable because the primary objective was to resolve late-stage subclonal mutations at single-cell resolution — dynamics largely obscured in bulk sequencing — and the 58 tumoroid WGS dataset provides a valuable resource for hypothesis generation.

  2. Tumoroid culture selection bias. In vitro expansion may skew toward more proliferative lineages that preferentially form organoids. The mutational burden and signatures observed may not fully recapitulate the native tumor architecture. However, tumoroid-specific lineages were detectable in matched bulk tissues (albeit at low combined VAFs), suggesting they are not purely artefactual. The authors call for future integration with single-cell DNA amplification-based WGS to rigorously evaluate selection bias.

  3. Absence of transcriptomic and epigenetic data. The study is DNA-only and lacks functional validation of the observed genomic alterations.

  4. Metastatic site bias. Most metastatic tumoroids were derived from liver metastases due to limited tumoroid formation efficiency from other metastatic sites. This may limit generalizability to non-liver metastasis biology.

  5. Heterogeneous tumoroid numbers across sites. Multi-region sampling would be required to draw definitive conclusions about the common ancestral node of metastatic tumors.

  6. Survivorship bias. Clones completely eradicated by anticancer therapy would not have been captured — the study necessarily samples only the post-therapy surviving population.

  7. No sex-based analyses. The cohort size was insufficient for meaningful sex-stratified analysis; the study was not designed to evaluate sex-specific effects.

Relevance to Clonal Evolution

This study is directly and multiply relevant to core themes in clonal evolution, particularly for CRC as a model system:

Bottleneck dynamics under therapy. The paper provides direct single-cell-resolution evidence that chemotherapy imposes a strong selective bottleneck, preferentially eliminating proliferative cells while sparing quiescent/dormant lineages. The MC05 case — where SBSchemo mutations are entirely absent but IDradiation is present — is the clearest example: proliferative cells were purged, leaving only dormant lineages that never experienced DNA replication during chemotherapy exposure. This connects directly to the population-bottleneck concept, but adds the mechanistic twist that the bottleneck is proliferation-dependent: the mutagen does not reach the genome if the cell does not divide. The result is a qualitative difference between therapy-induced bottlenecks (e.g., cytotoxic chemotherapy) that act on replicating cells and physical mutagens (radiation) that act on all cells.

Compression-entrenchment under serial bottlenecks. In patients showing repeated rounds of chemotherapy, the phylogenetic trees reveal a pattern consistent with compression-entrenchment: one daughter lineage consistently out-competes its sister (asymmetric trees with concordant topology), and this dominant lineage accumulates both more clock-like and more therapy-associated mutations. This suggests a feedback loop: more proliferative lineages are more exposed to chemotherapy-induced mutation, which may in turn generate adaptive variants that further entrench the lineage. The loss of driver mutations (e.g., NRAS G12D, ERBB2 amplicons) in late-stage clones, previously described only under targeted therapy, adds a new dimension: functional redundancy at the late stage may permit the relaxation of selection maintaining certain drivers, effectively “forgetting” earlier adaptations.

Intratumor heterogeneity (ITH) resolution. By using clonal tumoroids rather than bulk sequencing, the authors demonstrate that the majority of post-therapy mutations are subclonal and invisible to standard bulk approaches. On average, 44% of base substitutions were classified as late-stage (downstream branches), and SBSchemo mutations were rarely detected in bulk tissue. This underscores that standard measures of ITH from bulk sequencing systematically underestimate the mutational diversity of post-therapy tumors.

Clonal sweeps and metastatic seeding. Across all six patients, a single subclonal lineage was the dominant source of distant metastasis, consistent with a clonal sweep model of metastatic dissemination. However, the finding that peritoneal/ovarian/lymph node metastases diverged earlier than liver metastases (and were evolutionarily independent) demonstrates that metastatic seeding can be spatiotemporally heterogeneous: a single “successful” lineage may not be the first to disseminate.

Mutational processes under selection. The correlation between SBSchemo and clock-like mutations has a direct implication for clonal evolution: it means that the per-cell-division mutation rate from chemotherapy is roughly constant, but the total mutational load on a lineage depends on how many divisions it undergoes during therapy. This shifts the interpretation of therapy-associated mutation burden from “how much mutagen was the patient exposed to” to “which lineages were dividing during exposure.” It also implies that dormant lineages — while resistant to chemotherapy-induced DNA damage — are not entirely passive: they may reactivate and contribute to relapse with a clean genomic slate, having accumulated fewer therapy-associated mutations.

Punctuated vs. gradual evolution. The paper documents both modes. CGRs (chromothripsis, ecDNA genesis) represent punctuated events. However, the finding that post-CGR regions continue to undergo secondary rearrangements complicates the binary distinction: a catastrophic event creates a permissive configuration that then evolves gradually. Similarly, ecDNA pool replacement demonstrates that extrachromosomal elements, once formed, continue to evolve dynamically.