Bibliographic Reference
Conticello, S. G. (2008). The AID/APOBEC family of nucleic acid mutators. Genome Biology, 9(6), 229. https://doi.org/10.1186/gb-2008-9-6-229
Core Argument
The AID/APOBEC family of cytidine deaminases represents a double-edged sword in cellular metabolism: their ability to deaminate cytidine to uridine in DNA and RNA underpins essential physiological functions (antibody diversification, retrovirus restriction, lipid metabolism), but the same enzymatic activity — when misdirected or dysregulated — generates the C>T and C>G mutations at TpC dinucleotides that are among the most prevalent mutational signatures in human cancer. The family originated from the zinc-dependent deaminase superfamily at the dawn of vertebrate radiation, expanded through complex gene duplications, and in primates underwent rapid positive selection driven by conflict with retroviruses and retrotransposons.
Methods
This is a comprehensive protein family review synthesizing phylogenetic analysis, structural biology (crystal structures of APOBEC2 and APOBEC3G carboxy-terminal domain), biochemical characterization of deamination activity and sequence-context preferences, cellular localization studies, and genetic evidence from knockout mice and human immunodeficiency syndromes. The review covers all known AID/APOBEC paralogs: AID, APOBEC1, APOBEC2, APOBEC3A-H, and APOBEC4.
Key Findings
- AID/APOBECs are zinc-dependent cytidine deaminases. All members share a characteristic H[AV]E-x[24-36]-PCxxC catalytic motif, in which the histidine and the two cysteines coordinate a zinc atom. In the catalytic mechanism, a zinc-activated water molecule performs a nucleophilic attack on carbon 4 of cytidine, with the glutamate of the motif acting as proton donor, converting cytidine to uridine. The overall three-dimensional fold (five β-strands forming a backbone, with α-helices 2 and 3 holding the catalytic pocket) is conserved from bacterial tRNA-editing enzymes to human APOBEC3s.
- The family originated from the Tad/ADAT2 tRNA-editing enzymes at the beginning of vertebrate radiation. Phylogenetic analysis places the origin of AID/APOBECs concurrent with the appearance of adaptive immunity. AID is the ancestral member; APOBEC2 and APOBEC4 are ancient paralogs found in all jawed vertebrates. APOBEC1 arose from an inverted duplication of the AID locus. The APOBEC3 locus expanded rapidly in placental mammals through repeated duplications (to seven genes in primates), driven by positive selection from ongoing conflict with retroviruses and retrotransposons.
- AID is the physiological DNA mutator in B cells. AID deaminates cytidine to uridine on single-stranded DNA at the immunoglobulin locus, initiating somatic hypermutation and class-switch recombination. It preferentially targets cytidines within a WRC motif (W = A/T, R = A/G). AID deficiency causes Hyper-IgM Syndrome Type 2. Aberrant AID activity causes c-myc/IgH translocations, the hallmark of Burkitt’s lymphoma, and AID is required for germinal-center-derived lymphomagenesis in mice.
- APOBEC3s are innate antiviral DNA mutators. APOBEC3G, the most studied member, is packaged into HIV virions and deaminates cytidines on the nascent minus-strand cDNA during reverse transcription, producing characteristic G→A mutations on the viral plus strand. Mutation loads can reach 3%, which is sufficient to inactivate the viral genome. HIV counteracts this via the Vif protein, which targets APOBEC3G for proteasomal degradation. Different APOBEC3 paralogs have distinct sequence-context preferences: APOBEC3G prefers 5’-CC, APOBEC3F prefers 5’-TC. All primate APOBEC3s also restrict retrotransposons (LINE-1, Alu) and hepatitis B virus.
- Substrate recognition involves conserved structural loops. Comparison with the bacterial TadA-tRNA co-crystal structure reveals a conserved loop that positions the polynucleotide substrate; an SWS (serine-tryptophan-serine) or SSS motif preceding the PCxxC motif forms a trough where single-stranded DNA is bound and deaminated. Dimerization and RNA binding modulate enzymatic activity: most cellular APOBEC3G is kept inactive in high-molecular-weight ribonucleoprotein complexes and must be released before it can act.
- The cancer connection was established early. AID was shown to trigger c-myc/IgH translocations. APOBEC1 overexpression causes hepatocellular carcinoma in transgenic mice. APOBEC3s were first identified in keratinocytes treated with PMA, a phorbol ester tumor promoter. C→T mutations at TpC dinucleotides, the canonical APOBEC signature, are observed in genes commonly mutated in cancer. Viral infection and antiviral inflammatory pathways can induce APOBEC expression, potentially linking inflammation to mutagenesis.
- APOBEC3B germline deletion is common in human populations. The APOBEC3A and APOBEC3B genes lie adjacent to each other on chromosome 22q13.1. A ~30-kb germline deletion that removes the APOBEC3B coding sequence and fuses the APOBEC3A coding sequence to the APOBEC3B 3’ UTR is common (minor allele frequency ~8% in Europeans) and protective against APOBEC-mediated mutagenesis, a finding later confirmed and extended by the PCAWG Consortium (2020).
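Illustrative sketch (not from the review): the catalytic motif and the sequence-context preferences listed above can be written as regular expressions. The Python below is a minimal sketch under stated assumptions. All function and variable names are hypothetical, the peptide in the usage note is synthetic, and the editing model is deliberately crude: it deaminates every preferred-context cytidine on the minus strand, which real enzymes do not do.

```python
import re

# Catalytic motif from the text, H[AV]E-x[24-36]-PCxxC, as a regex over
# one-letter amino-acid codes ("x" = any residue).
CATALYTIC_MOTIF = re.compile(r"H[AV]E.{24,36}PC..C")

# Preferred contexts named in the text. In every pattern the deaminated
# cytidine is the last (3') base of the match.
CONTEXT_PATTERNS = {
    "AID": "[AT][AG]C",   # WRC, with W = A/T and R = A/G
    "APOBEC3G": "CC",     # 5'-CC
    "APOBEC3F": "TC",     # 5'-TC
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def has_catalytic_motif(protein: str) -> bool:
    """True if the protein sequence contains the zinc-coordinating motif."""
    return CATALYTIC_MOTIF.search(protein.upper()) is not None


def target_cytidines(dna: str, enzyme: str) -> list[int]:
    """0-based positions of cytidines in the enzyme's preferred context."""
    pattern = CONTEXT_PATTERNS[enzyme]
    # A lookahead keeps overlapping sites (e.g. both targets in CCC).
    regex = re.compile(f"(?=({pattern}))")
    return [m.start() + len(m.group(1)) - 1 for m in regex.finditer(dna.upper())]


def reverse_complement(dna: str) -> str:
    """Reverse complement of an A/C/G/T string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def edit_plus_strand(plus: str, enzyme: str = "APOBEC3G") -> str:
    """Deaminate every preferred-context C on the minus strand (C -> U, read
    as T) and return the plus strand, where the edits appear as G -> A.
    Simplifying assumption: all candidate sites are edited."""
    minus = list(reverse_complement(plus))
    for pos in target_cytidines("".join(minus), enzyme):
        minus[pos] = "T"
    return reverse_complement("".join(minus))
```

For example, `edit_plus_strand("AGGT")` returns `"AAGT"`: the minus strand reads ACCT, its 3' cytidine in the CC pair is deaminated, and the change surfaces as G→A on the plus strand. A synthetic peptide such as `"HAE" + "G" * 30 + "PCGGC"` satisfies `has_catalytic_motif`, while the same peptide with only 10 spacer residues does not.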
Concepts Introduced or Used
APOBEC-mutagenesis, AID, APOBEC1, APOBEC3, cytidine-deaminase, somatic-hypermutation, class-switch-recombination, zinc-dependent-deaminase, retrovirus-restriction, kataegis, mutational-signature, TpC-motif, single-stranded-DNA
Entities Referenced
- Genes: AID (AICDA), APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3DE, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, UNG, ACF, Vif, RPA, PKA
- Organisms: human, mouse, primates, placental mammals, bony fish, cartilaginous fish, sea lamprey
- Viruses: HIV-1, hepatitis B virus, adeno-associated virus
- Mobile elements: LINE-1, Alu, endogenous retroviruses
- Diseases: Hyper-IgM Syndrome Type 2 (AID deficiency), Hyper-IgM Syndrome Type 5 (UNG deficiency), Burkitt’s lymphoma
Limitations (as stated by author)
- The functions of APOBEC2 and APOBEC4 remain completely unknown despite their ancient evolutionary origins.
- APOBEC3 endogenous targets (beyond HIV in vivo) are inferred from G→A mutational bias in mobile elements, not directly demonstrated. Experimental systems rely on transient overexpression, which may not reflect physiological activity levels.
- Whether the AID/APOBEC-cancer association is due to stochastic side effects of a necessary mutational machinery or to specific conditions inducing aberrant function was unresolved at the time of writing.
- The mechanisms targeting AID specifically to the immunoglobulin locus (rather than causing genome-wide mutation) were not understood; only a few AID-interacting proteins had been identified.
Relevance to Clonal Evolution
This is the foundational molecular review of the enzyme family responsible for one of the most prevalent endogenous mutational processes in cancer. The AID/APOBEC deaminases generate the C>T and C>G substitutions at TpC dinucleotides that constitute mutational signatures 2 and 13, which are found across the majority of cancer types. Their activity provides a continuous source of somatic genetic variation throughout tumor evolution, fueling both intratumor heterogeneity and the emergence of therapy-resistant clones. The PCAWG Consortium (2020) finding that a common germline APOBEC3B deletion is protective against APOBEC mutagenesis directly links host genetic variation to the somatic evolutionary substrate available to tumors. The mechanistic coupling between DNA breakage and APOBEC activity, in which single-stranded DNA at rearrangement breakpoints serves as substrate, connects this mutational process to chromothripsis, kataegis, and other catastrophic genomic events.
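Illustrative sketch (not from the review): the "C>T and C>G at TpC" description can be turned into a simple per-mutation label. The Python below is a rough sketch under stated assumptions. It takes each substitution as a trinucleotide context plus an alternate base, re-expresses it with a pyrimidine reference base, and flags those matching the TpC pattern. The function names are hypothetical, and this labelling is not a substitute for proper signature decomposition, which fits whole mutation spectra rather than single mutations.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def normalize(context: str, alt: str) -> tuple[str, str]:
    """Express a substitution with a pyrimidine (C or T) reference base.

    `context` is the reference trinucleotide with the mutated base in the
    middle; `alt` is the alternate base on the same strand.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":
        # Purine reference: report the event on the opposite strand.
        return reverse_complement(context), alt.translate(COMPLEMENT)
    return context, alt


def is_apobec_like(context: str, alt: str) -> bool:
    """True for C>T or C>G substitutions where the C follows a T (TpC)."""
    ctx, alt = normalize(context, alt)
    return ctx[:2] == "TC" and alt in ("T", "G")


def apobec_like_fraction(mutations: list[tuple[str, str]]) -> float:
    """Fraction of (context, alt) pairs matching the TpC pattern."""
    if not mutations:
        return 0.0
    hits = sum(is_apobec_like(ctx, alt) for ctx, alt in mutations)
    return hits / len(mutations)
```

For instance, a G>A change in a TGA context is the same event as C>T in TCA on the opposite strand, so `is_apobec_like("TGA", "A")` is true, whereas C>T in ACA and C>A in TCA are not flagged.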