Bibliographic Reference

Schmidhuber, J. (2009). Driven by compression progress: A simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. arXiv preprint. IDSIA / TU Munich.

Note: This is a theoretical/philosophical preprint synthesizing the author’s work on artificial curiosity and intrinsic motivation (1990–2008). It is not a standard empirical research article. The paper’s formal appendix provides rigorous RL definitions; the main text makes sweeping interpretive claims about cognition, aesthetics, and creativity.

Core Argument

The paper argues that all cognitive phenomena involving curiosity, creativity, and aesthetic judgment can be explained by a single algorithmic principle: intrinsic reward for compression progress. The framework has four components: (1) store all sensory history; (2) run an adaptive compressor that discovers regularities in it; (3) generate intrinsic reward proportional to compression progress, that is, the number of bits saved when an improved compressor encodes the same data; and (4) use reinforcement learning to maximize this intrinsic reward, driving the agent to seek data that yields further compression progress.
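
Steps (1) to (3) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the "compressor" is assumed to be a Laplace-smoothed symbol-frequency model, code length is the ideal -log2 p cost, and step (4), the RL controller, is omitted. The names `code_length_bits` and `intrinsic_rewards` are hypothetical.

```python
import math
from collections import Counter

def code_length_bits(history, counts, alphabet):
    """Ideal code length of `history` under a Laplace-smoothed frequency model."""
    total = sum(counts[s] for s in alphabet) + len(alphabet)
    return sum(-math.log2((counts[s] + 1) / total) for s in history)

def intrinsic_rewards(stream, alphabet):
    """Return one reward per observation: bits saved on the SAME stored
    history when the model is updated with that observation."""
    history = []        # (1) store the full sensory history
    counts = Counter()  # (2) state of the adaptive compressor (assumed model)
    rewards = []
    for symbol in stream:
        history.append(symbol)
        old_bits = code_length_bits(history, counts, alphabet)
        counts[symbol] += 1                  # the compressor learns
        new_bits = code_length_bits(history, counts, alphabet)
        rewards.append(old_bits - new_bits)  # (3) compression progress in bits
    return rewards

rewards = intrinsic_rewards("a" * 50, alphabet="ab")
print([round(r, 3) for r in rewards[:3]], round(rewards[-1], 3))
```

On this constant stream the reward is positive at first and shrinks toward zero as the model converges, which is the sense in which fully learned data stops being rewarding.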

The key conceptual distinction is between subjective beauty (compressibility — how few bits are needed to encode the data given the observer’s current knowledge) and subjective interestingness (the first derivative of beauty — how fast the observer is learning to compress better). Beauty is a stock; interestingness is a flow. This explains why beautiful things become boring: once compressed, they yield no further learning progress. The drive for compression progress thus pushes the agent toward the “interesting” — regions of experience where the learning curve is steepest, avoiding both the already-compressed (boringly predictable) and the permanently incompressible (stochastic noise).
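
The stock/flow distinction can be written compactly. In the notation below, B(D, O(t)) and I(D, O(t)) follow the paper's usage as far as this summary reflects it; the code-length symbol C(D | O(t)), meaning the number of bits observer O needs at time t to encode data D, and the discrete-time form in the last line are assumptions added here for illustration.

```latex
\begin{align}
B(D, O(t)) &= -\,C\bigl(D \mid O(t)\bigr)
  && \text{(fewer bits needed means more beautiful)} \\
I(D, O(t)) &\sim \frac{\partial B(D, O(t))}{\partial t}
  && \text{(interestingness: rate of compression progress)} \\
I_t &\approx C\bigl(D \mid O(t-1)\bigr) - C\bigl(D \mid O(t)\bigr)
  && \text{(discrete time: bits saved on the same data)}
\end{align}
```

Once the observer has stopped learning about D, the code length no longer changes, so I falls to zero even if B is high. For incompressible noise the code length stays at its maximum, so I is again approximately zero.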

Methods

This is a theoretical position paper with a formal appendix. No new empirical data are presented. The paper:

  • Defines an algorithmic framework (store → compress → reward progress → RL-optimize) in informal terms (Section 1) and formal terms (Appendix A)
  • Derives consequences for cognition: compact internal representations (Section 2.1), consciousness as self-model (Section 2.2), subjective beauty (Section 2.3), interestingness (Section 2.4), novelty detection (Section 2.6), curiosity and attention (Section 2.7), art and music (Sections 2.10–2.12), science (Section 2.15), humor (Section 2.16)
  • Reviews prior concrete implementations from the author’s own work (1990–2008): artificial agents using prediction error, compression progress, or KL-divergence as intrinsic reward (Section 3)
  • Provides visual illustrations of compressible stimuli (Section 4)

The formal appendix covers predictor-compressor equivalence (A.1), compressor performance measures (A.3–A.4), compression progress measures (A.5), the asynchronous reward framework (A.6), optimal but incomputable action selection via AIXI (A.8), and computable approximations (A.9–A.10).
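
The predictor-compressor equivalence rests on a standard information-theoretic fact: a symbol predicted with probability p can be encoded in about -log2 p bits, for example by arithmetic coding. A minimal sketch follows, using a hypothetical fixed predictor; the function names are illustrative and do not come from the paper.

```python
import math

# Hypothetical predictor: a fixed probability for each next symbol.
predictor = {"a": 0.5, "b": 0.25, "c": 0.25}

def shannon_code_lengths(probs):
    """Integer code lengths ceil(-log2 p) that a prefix code can realize."""
    return {s: math.ceil(-math.log2(p)) for s, p in probs.items()}

def encoded_bits(message, probs):
    """Ideal (possibly fractional) code length of `message` under `probs`."""
    return sum(-math.log2(probs[s]) for s in message)

lengths = shannon_code_lengths(predictor)
# Kraft inequality: a sum <= 1 means a prefix code with these lengths exists.
kraft_sum = sum(2.0 ** -n for n in lengths.values())

print(lengths)                           # {'a': 1, 'b': 2, 'c': 2}
print(kraft_sum)                         # 1.0
print(encoded_bits("aabac", predictor))  # 7.0
```

A better predictor assigns higher probability to what actually happens, so the same message costs fewer bits. That difference in bits is what the framework counts as compression progress.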

Key Findings

  • Interestingness is the first derivative of subjective beauty. Beauty corresponds to how compressible the data is given the observer’s current knowledge; interestingness corresponds to how rapidly that compressibility is improving. This distinguishes the static pleasure of recognition from the dynamic pleasure of discovery. A stimulus can be beautiful without being interesting (a familiar face) and interesting without being beautiful (a puzzle that resists easy compression). (Schmidhuber, 2009, Sections 2.3–2.4)
  • Neither pure noise nor perfect predictability is interesting. The traditional Shannon/Boltzmann notion of surprise (unlikely events are surprising) fails as an account of cognition: white noise is maximally surprising in the Shannon sense but boring to human observers. Schmidhuber’s framework predicts that both extremes are avoided, the already fully compressed and the permanently incompressible alike, and that curiosity seeks the learnable middle ground where compression progress is possible. (Schmidhuber, 2009, Section 2.6)
  • Compact internal representations (symbols) emerge as by-products of efficient compression. A compressor that minimizes description length will automatically create reusable codes for recurring patterns — objects, faces, words, concepts. These internal symbols are not pre-programmed; they are discovered by the compression process itself. (Schmidhuber, 2009, Section 2.1)
  • The framework has been implemented in working artificial systems since 1990. The author documents a lineage of curiosity-driven RL systems using prediction error (1990), compression progress through predictor improvements (1991), KL-divergence between prior and posterior (1995), and zero-sum reward games for compression progress (1997). These systems demonstrate that the principle is computationally tractable, though the gap between these simple implementations and the claimed explanatory scope (art, music, jokes, consciousness) is substantial. (Schmidhuber, 2009, Section 3)
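
The claim that both extremes are unrewarding can be checked on a toy learner. The sketch below is an illustration under stated assumptions, not one of the systems reviewed in the paper: the learner is a hypothetical order-1 bit predictor (`Order1Model`), and "compression progress" is taken to be the bits saved on the same data after the learner has trained on it.

```python
import math
import random
from collections import defaultdict

class Order1Model:
    """Adaptive bit predictor: P(next bit | previous bit), Laplace-smoothed."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])  # previous bit -> [zeros, ones]

    def copy(self):
        clone = Order1Model()
        for ctx, pair in self.counts.items():
            clone.counts[ctx] = list(pair)
        return clone

    def code_length_bits(self, bits):
        total = 0.0
        for prev, nxt in zip(bits, bits[1:]):
            zeros, ones = self.counts.get(prev, (0, 0))
            p = ((ones if nxt else zeros) + 1) / (zeros + ones + 2)
            total += -math.log2(p)
        return total

    def train(self, bits):
        for prev, nxt in zip(bits, bits[1:]):
            self.counts[prev][nxt] += 1

def compression_progress(model, bits):
    """Bits saved on the same data after the model has learned from it."""
    learned = model.copy()
    learned.train(bits)
    return model.code_length_bits(bits) - learned.code_length_bits(bits)

rng = random.Random(0)
pattern = [i % 2 for i in range(2000)]            # regular: 0, 1, 0, 1, ...
noise = [rng.randint(0, 1) for _ in range(2000)]  # fair coin flips

expert = Order1Model()
expert.train(pattern)                             # has already seen the pattern

progress = {
    "familiar pattern": compression_progress(expert, pattern),
    "pure noise": compression_progress(Order1Model(), noise),
    "novel pattern": compression_progress(Order1Model(), pattern),
}
for name, bits_saved in progress.items():
    print(f"{name}: {bits_saved:.1f} bits saved")
```

Only the novel but regular pattern yields a large reward, close to one bit per observed transition. The familiar pattern and the noise each yield at most a few bits, the first because nothing is left to learn and the second because nothing can be learned.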

Concepts Introduced or Used

  • Compression progress — the reduction in description length achieved by an improved compressor operating on the same data history. Measured in bits saved. The core intrinsic reward signal in the framework.
  • Subjective beauty — the compressibility of data given the observer’s current knowledge and computational limits. Observer-dependent and time-dependent; changes as the observer learns.
  • Subjective interestingness — the first derivative of subjective beauty with respect to time; the steepness of the learning curve. What drives curiosity, exploration, and attention.
  • True novelty — data that is novel in Schmidhuber’s sense (contains previously unknown algorithmic regularities), as opposed to Shannon-novel (merely low-probability) or Boltzmann-surprising (merely unlikely). A random sequence is Shannon-novel but not Schmidhuber-novel — it contains no learnable structure.
  • Intrinsic motivation / curiosity reward — reward generated internally by the agent for compression progress, independent of external reward signals. Pre-wired rather than learned, because in resource-limited agents facing rare external rewards, curiosity accelerates the discovery of reward-yielding behaviors.
  • Asynchronous compressor evaluation — the technical requirement that the old and the new compressor both be evaluated on the same data history, so that progress is measured fairly (Appendix A.6).
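
Two of these concepts, observer-dependent beauty and same-data evaluation, can be illustrated with an off-the-shelf compressor. The sketch is an analogy rather than the paper's method: the observer's "knowledge" is modeled as a preset dictionary passed to Python's `zlib`, and the helper `encoded_size` and the sample text are hypothetical.

```python
import zlib

def encoded_size(data: bytes, knowledge: bytes = b"") -> int:
    """Compressed size in bytes of `data` for an observer who already knows `knowledge`."""
    if knowledge:
        compressor = zlib.compressobj(level=9, zdict=knowledge)
    else:
        compressor = zlib.compressobj(level=9)
    return len(compressor.compress(data) + compressor.flush())

data = b"Curiosity rewards the learner for finding a shorter description of what it has seen."

naive_size = encoded_size(data)                   # observer with no prior knowledge
expert_size = encoded_size(data, knowledge=data)  # observer who has seen this before

# Both observers are scored on the SAME data, so the difference reflects
# what they know, not what they were shown.
bytes_saved = naive_size - expert_size
print(naive_size, expert_size, bytes_saved)
```

The same bytes are cheaper to encode for the observer with relevant prior knowledge, which is the sense in which beauty is subjective. Scoring both observers on identical data is what makes the saving a fair measure of progress.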

Entities Referenced

  • Jürgen Schmidhuber — Director of IDSIA (Switzerland) and professor at TU Munich. Pioneer of artificial curiosity, long short-term memory (LSTM), and universal AI. This paper synthesizes two decades of his work on intrinsic motivation.
  • AIXI — Hutter’s (2005) universal artificial intelligence; the optimal but incomputable RL agent. Discussed as a theoretical limit (Appendix A.8).
  • Gödel Machine — Schmidhuber’s (2003) self-referential RL architecture, which rewrites its own code once it has found a proof that the rewrite is useful according to its utility function. Referenced as an optimal-but-impractical approach (Appendix A.9).
  • Kolmogorov complexity — the length of the shortest program that computes a given string. The theoretical foundation for the compression framework. Uncomputable in general, motivating practical approximations.
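
Although Kolmogorov complexity is uncomputable, the output length of any real compressor gives a computable upper bound on it, up to an additive constant for the decompressor. The sketch below uses Python's `zlib` as a stand-in for such a practical approximation; the choice of compressor, the helper `compressed_bits`, and the sample inputs are assumptions made for illustration.

```python
import random
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of zlib's encoding of `data`: a crude upper-bound proxy for K(data)."""
    return 8 * len(zlib.compress(data, 9))

n = 4096
regular = b"ab" * (n // 2)  # produced by a very short program
rng = random.Random(0)
# Pseudo-random bytes: no structure that zlib can exploit.
noisy = bytes(rng.getrandbits(8) for _ in range(n))

print(compressed_bits(regular), "bits for the regular string")
print(compressed_bits(noisy), "bits for the pseudo-random string")
```

The pseudo-random string shows why this is only an upper bound: it was generated by a short seeded program, so its true Kolmogorov complexity is small, yet `zlib` cannot find that regularity and reports roughly the raw length.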

Limitations

As stated by the author and identified by reviewers:

  • Uncomputability of the optimal compressor. Kolmogorov complexity and Solomonoff induction are uncomputable. The practical implementations (neural networks, simple RL) are approximations with no formal proof that they preserve the optimal properties of the theoretical framework. The gap between Kolmogorov-level theory and neural-network-level practice is bridged by aspiration, not derivation.
  • Falsifiability concerns. In its stated form, the framework can rationalize any cognitive outcome post-hoc. If a stimulus is liked, it provides compression progress; if it is disliked, it is either too predictable or too random. No boundary conditions or falsifying observations are specified. The theory risks being an interpretive framework rather than a testable scientific hypothesis.
  • Literature isolation. The paper cites almost exclusively the author’s own prior work and the algorithmic information theory tradition. It does not engage with predictive coding (Rao & Ballard, 1999), the free energy principle (Friston, 2005), Berlyne’s psychobiological theory of curiosity (1960), the dopamine reward prediction error literature (Schultz et al., 1997), or empirical aesthetics research beyond one reference on face perception.
  • Consciousness claim is unsupported. The assertion that consciousness is explained as “a symbol representing the agent itself” (Section 2.2) is made in two paragraphs with no engagement with the consciousness literature, no formal derivation, and no empirical evidence. It confuses a representation with the phenomenal experience of having that representation.
  • Self-citation density. Roughly 28 of the paper’s approximately 45 references are to the author’s own work. The “concrete implementations” are exclusively Schmidhuber’s systems. Independent replication is absent.
  • No engagement with biology or evolution. The paper never mentions evolution, natural selection, or biology, despite the compression progress principle having a deep and precise isomorphism with evolutionary processes (see compression-progress-evolution).

Relevance to Clonal Evolution

Schmidhuber never mentions biology, but the compression progress framework provides a formal language for describing the deep structure of clonal evolution:

Natural selection as compression. A well-adapted genome is a compressed representation of the environmental challenges its lineage has faced. Metabolic pathways compress energy availability; cell cycle checkpoints compress the regularity of DNA damage; DNA repair compresses mutational insults. The fitness of a genotype is the quality of its compression — how efficiently it encodes the environmental challenge. Mutation and recombination are the “compressor improvement algorithm”; selection evaluates the results. The population-level improvement in adaptation over generations IS compression progress at the lineage level.

Interestingness as fitness gradient. Schmidhuber’s central insight — that interestingness is the first derivative of beauty — maps directly to fitness landscapes. In genotype space, regions of flat fitness (neutral plateaus) are “boring” because no compression progress is possible. Regions where fitness changes rapidly are the “interesting” regions that selection explores. What matters is not the height of fitness peaks but the steepness of the slopes.

Cancer as decompression. A normal cell’s regulatory genome is a compressed program: checkpoints, feedback loops, and regulatory networks that produce complex behavior from compact code. Cancer is decompression: the program fragments, entropy accumulates through genomic instability, and the cell reverts to the “raw data” of unchecked proliferation. Each of the hallmarks of cancer can be reframed as the failure of a specific compression module.

Clonal sweeps as compression breakthroughs. A driver mutation that confers a significant fitness advantage is a discovery in Schmidhuber’s sense — a new compression of the tumor microenvironment, qualitatively better than previous attempts. The clonal sweep IS the moment of compression progress — the rapid expansion before the new genotype becomes the dominant “compressed representation.”

Adaptive therapy as curiosity-driven exploration. Standard maximum-tolerated-dose therapy is pure exploitation. Adaptive therapy, by maintaining a population of drug-sensitive cells that competes with resistant clones, preserves the system’s ability to explore genotype space and so keeps the learning curve steep. On this reading, it prevents any single clone from completing its compression of the therapeutic environment (i.e., becoming fully resistant).

See compression-progress-evolution for the full synthesis.