Research Log

2026-06-16 — Vault initialization

  • Initialized wiki/index.md and wiki/log.md
  • Seeded with 7 papers in raw/papers/

2026-06-16 — Source-summary ingestion

  • Ingested 7 source-summaries in parallel: Nowell (1976), Greaves & Maley (2012), Nik-Zainal et al. (2012), McGranahan & Swanton (2017), Turajlic et al. (2019), Gerstung et al. (2020), Tarabichi et al. (2021)
  • Spot-checked McGranahan & Swanton (2017), Nowell (1976), and Greaves & Maley (2012) summaries against PDFs — all accurate

2026-06-16 — Concept vocabulary extraction

  • Extracted 47 concepts from 7 source-summaries
  • Organized into 7 thematic clusters: Foundational, Selection & Fitness, Genomic Alterations, Mutational Processes, Subclonal Inference Methods, Clinical Implications, Entities

2026-06-16 — Concept pages (batch 1): Selection and Evolution

  • Created: clonal-evolution, clonal-expansion, clonal-sweep, driver-mutation, passenger-mutation
  • 5 pages, all anchored to 3-4 sources each, confidence: high

2026-06-16 — Concept pages (batch 2): Heterogeneity and Progression

  • Created: intratumor-heterogeneity, subclonal-architecture, branching-evolution, punctuated-evolution, neutral-evolution
  • 5 pages, all anchored to 3-4 sources each, confidence: high

2026-06-16 — Concept pages (batch 3+): Methods, Genomics, and Clinical

  • Created: chromosomal-instability, genetic-instability, whole-genome-duplication, mutational-signature, kataegis, subclonal-reconstruction, variant-allele-fraction, cancer-cell-fraction, therapy-resistance, metastasis
  • Created entity pages: TRACERx, PCAWG
  • 12 pages total; all multi-sourced except kataegis (discovery paper, single source)

2026-06-16 — Index, lint, and finalization

  • Updated wiki/index.md with all 31 pages
  • Ran L1 lint: 87 unique wikilinks, 31 resolved, 56 planned-but-uncreated (47 concepts/entities, 5 alias mismatches, 4 entity pages)
  • 2 single-source high-confidence flags (kataegis.md, PCAWG.md) — acceptable exceptions
  • 0 orphan pages, 0 frontmatter failures
  • Lint report saved to outputs/lint-2026-06-16.md
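
Note on the wikilink counts above: the resolved vs planned-but-uncreated split can be approximated with a short script. This is a minimal sketch, not the actual L1 lint tool. The function name `lint_wikilinks`, the Obsidian-style `[[target|alias]]` syntax, and resolution by page name are all assumptions made for illustration.

```python
import re

# Matches [[target]], [[target|alias]] and [[target#heading]].
# Obsidian-style syntax is assumed here; the real lint may accept other forms.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_wikilinks(pages):
    """Classify unique wikilink targets as resolved or planned-but-uncreated.

    `pages` maps a page name (file stem, hypothetical convention) to its text.
    """
    targets = set()
    for text in pages.values():
        for match in WIKILINK.finditer(text):
            targets.add(match.group(1).strip())
    resolved = {t for t in targets if t in pages}
    return {
        "unique": targets,
        "resolved": resolved,
        "planned": targets - resolved,
    }
```

Calling `lint_wikilinks` on a dict of page texts returns three sets whose sizes correspond to the "unique", "resolved", and "planned-but-uncreated" figures in a lint report. Alias mismatches would need an extra lookup table that this sketch does not model.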

Wiki stats

  • Total pages: 31 (7 source-summaries, 20 concepts, 2 entities, index, log)
  • Total commits: 8
  • Papers covered: Nowell 1976 through Tarabichi 2021 (45-year span)

2026-06-17 — Source-summary ingestion batch 2: Tumor growth kinetics

  • User queried “what is Gompertzian growth?” → identified gap: no source in vault covered tumor growth kinetics
  • User added 6 new papers to raw/papers/ (7 files, 1 duplicate: Castorina 2009 PMC vs. published version)
  • Extracted text from all PDFs via pdftotext; 5/6 extracted cleanly; sottoriva2015 had syntax warnings but yielded 1,939 readable lines
  • Ingested 6 source-summaries: Castorina et al. (2009), Traina et al. (2010), Sarapata (2013), Sottoriva et al. (2015), Graham & Sottoriva (2017), Hassan & Al-Saedi (2024)
  • Subagents were blocked by file-write permissions, so all 6 source-summaries were written directly instead

2026-06-17 — Review gate (batch 2)

  • Spot-checked Castorina et al. (2009), Sottoriva et al. (2015), and Graham & Sottoriva (2017) against extracted texts
  • All three passed: bibliographic details, core claims, and key values verified

2026-06-17 — Concept synthesis: Gompertzian growth

  • Created: gompertzian-growth — anchored to 5 sources, confidence: high
    • Cross-referenced with: clonal-expansion, neutral-evolution, cancer-cell-fraction, subclonal-architecture, driver-mutation
    • Covers: equation, key properties, fit to empirical data, clone detectability constraint, therapeutic scheduling (Norton-Simon), open question about 1/f null model correction
  • Updated: neutral-evolution — added Graham & Sottoriva (2017) 1/f test + ~30% finding; added Sottoriva et al. (2015) Big Bang model; added growth model caveat (Gompertzian correction to null distribution)
  • Updated: clonal-expansion — added Growth Model Constraints section explaining how Gompertzian kinetics affect clone detectability
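
Note on the equation referenced above: a minimal sketch of the standard Gompertz form, for orientation only. The symbols `n0`, `k`, and `alpha` are assumed notation and may differ from what the gompertzian-growth page uses.

```python
import math


def gompertz_size(t, n0, k, alpha):
    """Population size at time t under Gompertz growth.

    N(t) = K * exp(ln(N0 / K) * exp(-alpha * t))
    n0: initial size, k: carrying capacity (plateau), alpha: deceleration rate.
    """
    return k * math.exp(math.log(n0 / k) * math.exp(-alpha * t))


def specific_growth_rate(n, k, alpha):
    """Per-capita growth rate (1/N) dN/dt = alpha * ln(K / N).

    The rate shrinks as N approaches K, which is why late-arising clones
    expand more slowly than clones that arise in a small tumor.
    """
    return alpha * math.log(k / n)
```

The decelerating per-capita rate is the property behind the clone detectability constraint: a subclone born when the tumor is already near its plateau has less growth available to reach a detectable fraction.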

2026-06-17 — Index and log

  • Updated wiki/index.md: sources 7→13, new Growth Kinetics category with gompertzian-growth
  • New concept: gompertzian-growth (1)
  • Updated concepts: neutral-evolution, clonal-expansion (2)

2026-06-20 — Source-summary ingestion: PCAWG Consortium (2020)

  • User added PCAWG Consortium flagship paper to raw/papers/PCAWG2020.pdf (23 MB, Nature 2020)
  • Extracted full text via pdftotext (5,248 lines)
  • Created source-summary: pcawg2020-pan-cancer-analysis
  • Review gate: spot-checked 10 key claims against PDF text — all verified (91%/4.6 drivers, 22.3% chromothripsis, 3.6% driver overlap, 4 telomere clusters, Strombolian/Plinian, BRCA1 templated insertions, MBD4 CpG mutagenesis)
  • Created concept: chromothripsis (1)
  • Updated concepts: driver-mutation (pan-cancer driver landscape, biallelic inactivation, non-coding drivers, rank-and-cut), passenger-mutation (rank-and-cut method), clonal-evolution (PCAWG three-precondition Darwinian framing) (3)
  • Updated wiki/index.md: sources 13→14, concepts 21→22
  • L1 lint: passed — 2 new planned-but-uncreated wikilinks (retrotransposition, telomere-maintenance), 0 frontmatter failures, chromothripsis has 3 sources (above single-source threshold)

2026-06-20 — Source-summary ingestion: Al Bakir et al. (2023) TRACERx metastasis

  • User added TRACERx NSCLC metastasis paper to raw/papers/Bakir2023_NsccTRACERx_Lung.pdf (16 MB, Nature 2023)
  • Extracted full text via pdftotext (2,974 lines)
  • Created source-summary: bakir2023-tracerx-metastasis
  • Review gate: spot-checked 12 key claims against PDF text — all verified (75% late, 25% early, 83% misclassification, 32% polyclonal, <20% LN gateway, dN/dS values, <8mm tumour size, 33% metastasis-unique drivers, 68.6% shared drivers, seeding CCF P=6.4×10^−5, 81.8% platinum signature, maintained drivers)
  • Updated concepts: metastasis (major — timing, dissemination, LN role, selection, two-category model, platinum), clonal-sweep (metastatic timing relative to last sweep), driver-mutation (two-category metastasis model)
  • Updated entity: TRACERx (findings from the TRACERx 421 cohort)
  • Updated wiki/index.md: sources 14→15
  • L1 lint: passed — 0 new broken wikilinks (all flagged are pre-existing planned-but-uncreated), 0 frontmatter failures

2026-06-20 — Source-summary ingestion: Bozic et al. (2010) driver-passenger model

  • User added Bozic et al. (2010) to raw/papers/Bozic2010_AccumulationDriverPassenger_Mutation.pdf (PNAS 2010)
  • Extracted full text via pdftotext (1,041 lines)
  • Created source-summary: bozic2010-driver-passenger-model
  • Review gate: spot-checked 12 claims against PDF — all verified (s=0.004±0.0004, GBM/pancreatic consistency, 34,000 driver positions, u=3.4×10^−5, τ_k formula, 8.3y/4.5y waiting times, FAP validation, stochastic variation)
  • Updated concepts: driver-mutation (expanded selective advantage — branching process model, s=0.4% derivation, FAP validation, inter-driver waiting times, stochastic variation), passenger-mutation (linear accumulation model, driver-passenger formula)
  • Updated wiki/index.md: sources 15→16
  • L1 lint: passed — 0 broken wikilinks, 0 frontmatter failures

2026-06-20 — Source-summary ingestion: Bozic et al. (2013) combination therapy model

  • User added Bozic et al. (2013) to raw/papers/Bozic2013_EvolutionaryDynamics_Cancer.pdf (eLife 2013)
  • Extracted full text via pdftotext (1,434 lines)
  • Cross-checked claims against existing wiki — no contradictions found; wiki was conceptually aligned but lacked quantitative framework
  • Created source-summary: bozic2013-combination-therapy
  • Review gate: spot-checked 10 claims against PDF — all verified
  • Updated concept: therapy-resistance (major — cross-resistance framework n1/n2/n12, X ≈ M n12 μ formula, simultaneous vs sequential proof, multi-lesion burden analysis, cancer stem cell fraction, vemurafenib clinical data, fitness cost bounds)
  • Updated wiki/index.md: sources 16→17
  • L1 lint: passed — 0 broken wikilinks, 0 frontmatter failures

2026-06-20 — Source-summary ingestion: Bozic et al. (2016) clonal/subclonal passenger model

  • User added Bozic et al. (2016) to raw/papers/Bozic2016_QuantifyingClonalSubclonalPassenger.pdf (PLOS Computational Biology 2016)
  • Extracted full text via pdftotext (1,500 lines)
  • Contradiction check: cross-checked against neutral-evolution and passenger-mutation — no contradictions. 1/f distribution is the δ = 0 special case; Bozic 2016 provides the generalization for δ > 0.
  • Created source-summary: bozic2016-clonal-subclonal-passenger
  • Review gate: spot-checked 12 claims against PDF — all verified
  • Updated concepts: neutral-evolution (major — death-birth ratio δ framework, fixation probability, frequency spectrum, clonal ≠ truncal, tree shape, TCGA δ ≈ 0.997), passenger-mutation (m_s and m_c formulas, δ-dependence)
  • Updated wiki/index.md: sources 17→18
  • L1 lint: passed — 0 new broken wikilinks (all flagged are pre-existing planned-but-uncreated), 0 frontmatter failures
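
Note on the 1/f special case mentioned above: a minimal sketch of the δ = 0 test only, assuming the commonly used cumulative form M(f) ∝ (1/f − 1/f_max). It does not implement the Bozic 2016 generalization for δ > 0. The function names and the frequency window are hypothetical.

```python
def cumulative_counts(vafs, f_min, f_max):
    """Build the 1/f test coordinates from variant allele fractions.

    Returns (xs, ys) where x = 1/f - 1/f_max and y = number of mutations
    with VAF in [f, f_max]. Assumes distinct VAF values inside the window.
    """
    window = sorted((v for v in vafs if f_min <= v <= f_max), reverse=True)
    xs = [1.0 / f - 1.0 / f_max for f in window]
    ys = [float(i + 1) for i in range(len(window))]
    return xs, ys


def linear_fit_r2(xs, ys):
    """Ordinary least-squares slope and R^2 of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, r2
```

Under the assumed neutral model, a high R² indicates that the subclonal tail is consistent with 1/f, and the slope scales with the mutation rate per effective division. A poor fit does not by itself establish selection, since δ > 0 or non-exponential growth also bends the curve.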

2026-06-20 — Long-form mathematical synthesis: branching process scaffolding

  • Dispatched 4 parallel subagents with long-form math analysis of the Bozic-Nowak trilogy (2010, 2013, 2016)
  • Created concept: branching-process-model — architectural scaffolding unifying all three papers (branching process definition, discrete vs continuous time, u-scale reconciliation table, δ evolution, stagnation formulation, compound μ, model boundaries)
  • Updated concepts: driver-mutation (P(survival) ≈ 2ks ≈ 0.8%, log(k) structure), therapy-resistance (Mathematical Architecture — μ derivation, cell/lineage distinction, extinction filter, p_erad independence), neutral-evolution (1/f as joint test of neutrality + pure birth, δ calibration)
  • Fixed formula error: bozic2016 source-summary CDF corrected from [1 + log(…)/u]^(-k) to [u/(u − log(…))]
  • Updated wiki/index.md: concepts 22→23
  • L1 lint: passed — 0 broken wikilinks in new concept page, 0 frontmatter failures
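
Note on the P(survival) ≈ 2ks figure above: a quick numerical check. This is a minimal sketch that assumes a discrete-time binary branching process in which a cell with k drivers dies with probability d_k = (1 − s)^k / 2 and divides otherwise. That parameterization is an assumption here and should be checked against the bozic2010 source-summary.

```python
def survival_probability(k, s):
    """Exact lineage survival probability for a binary branching process.

    Each cell dies with probability d or divides into two with probability
    b = 1 - d. The extinction probability q solves q = d + b * q**2, whose
    root below 1 is q = d / b when b > d.
    Assumed parameterization: d = (1 - s)**k / 2.
    """
    d = 0.5 * (1.0 - s) ** k
    b = 1.0 - d
    return 1.0 - d / b


def survival_approximation(k, s):
    """First-order approximation 2ks, valid for small k * s."""
    return 2.0 * k * s
```

With k = 1 and s = 0.004, the exact value is 2s / (1 + s), just under the 0.8% approximation. The gap between the two functions widens as k·s grows, so the 2ks shorthand is best read as a small-advantage limit.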

2026-06-20 — Source-summary ingestion: Dananberg et al. (2024) APOBEC review

  • User added Dananberg et al. (2024) to raw/papers/Dananberg2024_ApobecMutagenesis_Cancer.pdf (Cancers 2024)
  • Extracted full text via pdftotext (1,465 lines)
  • Created source-summary: dananberg2024-apobec-cancer-review
  • Updated concept: APOBEC-mutagenesis — added tissue specificity and timing section (normal-tissue prevalence table, early vs late APOBEC activity by tissue), in vivo evidence section (APOBEC3A as most potent carcinogenic APOBEC3, APOBEC3G bladder cancer role, APOBEC3B variable effects), expanded germline modulation (A3AB ethnic variation, rs1014971 SNP)
  • Updated wiki/index.md: sources 20→21

2026-06-20 — Source-summary ingestion: Petljak et al. (2022) APOBEC3 mechanisms

  • User added Petljak et al. (2022) to raw/papers/Petljak2022_ApobecMutagenesisMechanism.pdf (Nature 2022)
  • Extracted full text via pdftotext (11,415 lines)
  • Contradiction detected: the paper directly contradicts the existing APOBEC-mutagenesis page. The page named APOBEC3B as the primary mutation source; Petljak et al. (2022) report that APOBEC3A is the main driver and that APOBEC3B can restrain APOBEC3A. Page was corrected.
  • Created source-summary: petljak2022-apobec3-mechanisms
  • Updated concept: APOBEC-mutagenesis — corrected APOBEC3A/B roles, added causal evidence section, UNG/REV1 downstream processing, episodic mutagenesis, updated germline modulation mechanism
  • Updated wiki/index.md: sources 19→20

2026-06-20 — Source-summary ingestion: Conticello (2008) AID/APOBEC family

  • User added Conticello (2008) to raw/papers/Conticello2008_AidApobecFamily.pdf (Genome Biology 2008)
  • Extracted full text via pdftotext (1,387 lines)
  • Created source-summary: conticello2008-aid-apobec-family
  • Created long-planned concept: APOBEC-mutagenesis — resolves 10+ pre-existing wikilinks across mutational-signature, kataegis, clonal-evolution, passenger-mutation, intratumor-heterogeneity, and source-summaries
  • Review gate: 9 claims spot-checked against PDF — all verified
  • Updated wiki/index.md: sources 18→19, concepts 23→24

2026-06-23 — Intermediate clones + punctuated equilibrium synthesis

  • Research query: punctuated equilibrium vs. gradualism, with focus on intermediate clones and clonal ≠ truncal distinction
  • Created concept: intermediate-clones — synthesizing Bozic et al. (2016) δ framework, Turajlic et al. (2019) punctuated equilibrium, and Tarabichi et al. (2021) detection limits
  • Updated: punctuated-evolution — added “The Illusion of Punctured Gradualism” section and revision history
  • Updated: passenger-mutation — linked clonal ≠ truncal to intermediate clone bottlenecks
  • Updated: neutral-evolution — linked δ-driven fixation to intermediate clone extinction
  • Updated: wiki/index.md — concepts 24→25
  • L2 audit: 3 pages sampled, 14 claims verified, 0 fabricated, 0 drift, 3 pre-existing missing wikilinks flagged
  • Audit output: outputs/audit-2026-06-23.md

2026-06-26 — Imprinting literature ingestion

  • Ingested 6 sources in parallel on genomic imprinting and kinship theory:
    • Paper: Haig (2004) — Genomic imprinting and kinship. Annual Review of Genetics.
    • Paper: Falls et al. (1999) — Genomic imprinting: implications for human disease. The American Journal of Pathology.
    • Paper: Monk et al. (2019) — Genomic imprinting disorders. Nature Reviews Genetics.
    • Paper: Patten et al. (2014) — The evolution of genomic imprinting. Heredity.
    • Paper: Wilkins & Haig (2003) — What good is genomic imprinting? Nature Reviews Genetics.
    • Article: Extended Brain (2026) — The genetic text: how David Haig reimagines evolution as interpretation. Substack.
  • Pipeline upgrade: replaced pdftotext with pymupdf4llm for PDF pre-extraction (preserves structure, reading order, italics, hyperlinks). Added rule: math-heavy papers require subagent to Read PDF directly for equations.
  • Visual information policy: added Mermaid.js diagram support for concept pages (Obsidian-native); three-tier system (Mermaid > extracted figures > structured captions) documented in CLAUDE.md.
  • Review gate: spot-checked Falls (1999) and Haig (2004) summaries against original PDFs — all key claims verified (IGF2 LOI 70% Wilms, three-mechanism model, weak vs strong kinship theory, four-cluster approach).
  • Created concept pages: genomic-imprinting, loss-of-imprinting (both with Mermaid diagrams)
  • Updated wiki/index.md — sources 21→27, concepts 25→27
  • Updated wiki/index.md with new “Imprinting and Epigenetics” concept section
  • Updated CLAUDE.md: pymupdf4llm extraction rule, math-heavy paper protocol, Mermaid diagram policy
  • L2 audit completed: 3 pages, 18 claims, 0 fabricated, 0 drift, 1 gap (genetic-instability missing epigenetic instability), 3 missing wikilinks. Output: outputs/audit-2026-06-26.md
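
Note on the pymupdf4llm pre-extraction step above: a minimal sketch of how one PDF might be converted to Markdown before summarization. The output directory `raw/extracted` and both function names are hypothetical. Only `pymupdf4llm.to_markdown` is the library's own call, and the library must be installed separately.

```python
from pathlib import Path


def extracted_path(pdf_path, out_dir="raw/extracted"):
    """Where the Markdown for a given PDF would be written (hypothetical layout)."""
    return Path(out_dir) / (Path(pdf_path).stem + ".md")


def extract_pdf(pdf_path, out_dir="raw/extracted"):
    """Convert one PDF to Markdown and save it next to the other extractions."""
    import pymupdf4llm  # third-party; imported lazily so the helper above works without it

    markdown = pymupdf4llm.to_markdown(str(pdf_path))
    target = extracted_path(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target
```

For example, `extract_pdf("raw/papers/PCAWG2020.pdf")` would write `raw/extracted/PCAWG2020.md` under these assumptions. Equations in math-heavy papers still need the direct PDF read described in the rule above, since text extraction can scramble them.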

2026-06-26 — Bozic trilogy equation audit + Mermaid diagrams

  • Verified Bozic 2010, 2013, and 2016 key equations against original PDFs — all equations in existing source summaries confirmed accurate
  • Added Mermaid diagrams to 4 concept pages:
    • branching-process-model — decision tree (cell fate at division) + δ-driven phylogeny shapes (star vs linear)
    • clonal-sweep — sweep condition dynamics (τ_k > sweep time, early vs late regimes, therapy shortcut)
    • genetic-instability — four instability types (CIN, MSI, mutator, epigenetic) with optimal-level tension
    • subclonal-architecture — three evolutionary patterns (punctuated, gradual, neutral) with phylogenetic signatures
  • Updated genetic-instability with epigenetic instability (MLIDs from Monk 2019) — addressing L2 audit gap
  • Added revision history entries to all 4 updated concept pages

Wiki stats

  • Total pages: 58 (27 source-summaries, 27 concepts, 2 entities, index, log)
  • Papers covered: Nowell (1976) through Hassan & Al-Saedi (2024) and Dananberg et al. (2024) (48-year span)
  • New theme: genomic imprinting as epigenetic substrate for clonal evolution

2026-06-27 — Molecular clock concept page via synthesis_agent bridge

  • Dispatched academic-research-skills:synthesis_agent in wiki-bridge mode to synthesize wiki/concepts/molecular-clock.md from 6 source summaries (Greaves & Maley 2012, Graham & Sottoriva 2017, Nik-Zainal et al. 2012, Gerstung et al. 2020, McGranahan & Swanton 2017, Tarabichi et al. 2021) and 4 concept pages
  • Created bridge spec: docs/superpowers/specs/wiki-synthesis-agent-bridge.md
  • Updated CLAUDE.md workflows (Ingest Phase 3, Query→Update→Audit) to reference bridge for 3+ source synthesis
  • Updated wiki/index.md: concepts 27→28, new molecular-clock entry under Mutational Processes
  • Cross-paper tension inventory resolved: CP-001 (rate constancy vs. spectrum shifts), CP-002 (subclonal selection prevalence)
  • Resolves a long-planned wikilink target from the initial concept vocabulary (June 2026)

2026-06-27 — Bridge re-synthesis of 5 core concept pages

  • Dispatched academic-research-skills:synthesis_agent in wiki-bridge mode to re-synthesize 5 high-source-count concept pages with formal cross-paper tension inventories:
    • clonal-evolution (5 sources): Added evolutionary cycle Mermaid diagram, quantitative anchors (s≈0.4%, 4.6 drivers, 22.3% chromothripsis), Evolutionary Modes section with process-to-mode Mermaid diagram, expanded Clinical Significance to 6 subsections. Tension inventory: 6 candidate pairs, 3 conditional differences resolved (Nowell vs Greaves on sweep exclusivity, Nowell vs Turajlic on selection continuity, Nowell vs PCAWG on variation sources).
    • neutral-evolution (7 sources): Added two Mermaid diagrams (neutral vs selection process flow, δ-shifted frequency spectrum across 4 δ values). Formally resolved 1/f vs δ tension as nested models (1/f is δ=0 special case of Bozic 2016 spectrum). Strengthened Big Bang empirical evidence (349 glands, 15 tumors, 6 spatial categories). Added neutral evolution–molecular clock connection. Tension inventory: 3 candidate pairs.
    • APOBEC-mutagenesis (7 sources): Added APOBEC pathway Mermaid diagram (deamination → UNG fork → SBS2/SBS13). Formalized Burns→Petljak historical consensus shift as structured tension table. Added “APOBEC as Clock-Disrupting Process” section (episodic rate violation, tissue-specific asynchrony, 3 implications for neutral inference, practical mitigation).
    • subclonal-architecture (4 sources): Added molecular clock connection (architecture as clock substrate). Added crossing rule with a numeric two-region example. Formalized detection limit convergence (Tarabichi CCF floor + Turajlic doublings blind zone).
    • passenger-mutation (6 sources): Added passenger accumulation Mermaid diagram (5 temporal strata). Added explicit clock mechanism section. Formalized the Nik-Zainal neutral-recorder vs Bozic drift-fixation tension.
  • All 5 pages preserved existing content, revision histories, and frontmatter. Cross-paper tension inventories accumulated across all dispatches.
  • Total bridge dispatches in this session: 6 (1 new concept + 5 re-syntheses)

2026-06-28 — LLM context-anchoring architecture: design, research, and P0 implementation

  • Deep-research investigation on LLM context degradation and state externalization for scientific workflows (12 web sources across LLM context mechanics, RAG/knowledge retrieval, state externalization, Claude Code patterns, and scientific rigor)
  • Wrote design spec: docs/superpowers/specs/2026-06-28-llm-context-anchoring-design.md — full architecture adapted from user’s software-engineering context-anchoring template to research knowledge management
  • Created P0 skills:
    • .claude/skills/context-anchor/SKILL.md — discipline skill: checkpoint writes, recovery protocol, mandatory triggers for multi-step wiki tasks, red flags
    • .claude/skills/wiki-index-first/SKILL.md — technique skill: index-first navigation, decision flow, fallback conditions
  • Created .claude/context-anchor.template.md — anchor schema template
  • Created .claude/rules/README.md — rules directory placeholder for P1 path-scoped rules
  • Updated CLAUDE.md: Session Start rule (ignition circuit), skill index entries, design spec reference
  • Architecture draws from: HEMA dual memory (Ahn & Song 2025), PEEK context maps (Gu et al. 2026), ActiveContext reasoning anchors (Li et al. 2026), Martin Fowler’s two-layer context anchoring (Garg 2026), LLM Wiki pattern (Rezvani 2025), and Claude Code community layered memory patterns

2026-06-28 — Query → Update: kataegis mechanistic expansion

  • Research Q&A on how kataegis results in hypermutation. Traced each claim to its source.
  • Updated kataegis: expanded mechanistic inference with Conticello (2008) enzyme chemistry (zinc-dependent deamination, H[AV]E-x[24-36]-PCxxC catalytic core), Petljak et al. (2022) causal evidence for APOBEC3A as primary driver, explicit two-pathway model for mutational outcomes (C>T via replication across uracil, C>G via UNG excision + REV1 TLS), and molecular clock disruption implication (temporally clustered mutations violate gradual-accumulation assumption).
  • Added Conticello (2008) and Petljak (2022) to kataegis sources (1 → 3 sources), which resolves the single-source high-confidence flag.
  • Added molecular-clock to kataegis related links.
  • Lint: 0 new issues.

2026-06-28 — Query → Update: asexual evolution, tetraploidy, and Haig’s framework

  • Research Q&A on McGranahan & Swanton (2017) asexual evolution analogy + tetraploid adaptation, and connection to David Haig’s evolutionary concepts.
  • Updated kataegis: added “Asexual Evolution and Linked Mutations” section. Kataegic focus as permanently linked haplotype — the linked-genome principle at micro-scale. Phylogenetic methods cannot treat kataegic mutations as independent markers. Added McGranahan & Swanton (2017) to sources.
  • Updated genomic-imprinting: added two items to Relevance to Clonal Evolution:
    • (6) Asexual evolution and exposure of recessive mutations (functional haploidy → single-hit kinetics where branching-process-model assumes two-hit)
    • (7) Tetraploid-imprinting intersection (whole-genome-duplication buffers, imprinting exposes: opposing forces on the same parameter, resolved by context-dependent interpretation of the genetic text)
  • Updated genomic-imprinting sources and links: added McGranahan & Swanton (2017) to sources; added whole-genome-duplication and branching-process-model to related links.
  • Lint: 0 new issues.

2026-06-28 — Infrastructure: Quartz static site for mobile reading

  • Installed Quartz 5 in quartz/ with content symlink to wiki/.
  • Configured: baseUrl bjorn99.github.io/clonal-evolution, ignorePatterns (log, private, templates, .obsidian), 44 plugins.
  • Build succeeds: 61 markdown files → 189 emitted files.
  • Deployment: Cloudflare Pages (free tier, supports private repos, unlimited bandwidth); GitHub Pages is blocked on private repos. Setup via Cloudflare dashboard:
    • Connect repo
    • Build command: cd quartz && npm ci && npx quartz plugin install --from-config && npx quartz build
    • Output dir: quartz/public
  • Updated .gitignore and system architecture doc.