Probability Zero is too Generous
The evolution from one genotype to another depends on the entire, path-dependent history of transitional mutations.
In the last few weeks Vox Day has published two new best-selling books and many papers that provide a definitive rebuttal to the notions of genetic drift and of Darwinian speciation through gradual modification. He has since extended this work in the following posts.
https://voxday.net/2026/02/10/recalibrating-man
https://voxday.net/2026/02/09/the-significance-of-d-and-k
https://voxday.net/2026/02/08/the-real-rate-revolution
https://voxday.net/2026/02/06/trying-to-salvage-kimura
https://voxday.net/2026/02/05/mailvox-the-n-ne-divergence
https://voxday.net/2026/02/04/response-to-dennis-mccarthy-round-2
https://voxday.net/2026/02/03/dennis-mccarthys-round-2
https://voxday.net/2026/02/02/pz-print-editions
https://voxday.net/2026/02/01/richard-dawkinss-running-shoes
It’s not simply a matter of genetic differences between species. Some take-off points for expansion:
FB: a particular gene is decoded to produce a particular protein, yes?
7:20 AM
Yes, that’s right! The basic process is:
Gene (DNA) → RNA → Protein
A gene is a specific sequence of DNA that contains the instructions for making a particular protein. The process happens in two main steps:
Transcription: The DNA sequence of the gene is copied into messenger RNA (mRNA)
Translation: The mRNA is read by ribosomes, which use the genetic code to assemble amino acids in the correct order to build the specific protein
So each gene essentially serves as a template for producing a specific protein (or sometimes multiple related proteins through alternative splicing or other mechanisms).
This is sometimes called the “central dogma” of molecular biology, though the reality has some nuances—some genes code for RNA molecules that don’t get translated into proteins, and some genes can produce multiple protein variants. But the core principle you stated is correct: a particular gene contains the information needed to produce a particular protein.
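The two steps described above can be sketched in a few lines of Python. This is only an illustration: the nine-base "gene" and the four-entry codon table are toy assumptions chosen for the example, not a real gene or the full genetic code. The point mutation shown (GAG to GTG, glutamic acid to valine) is the same kind of change that underlies sickle cell disease.

```python
# Toy codon table: only the codons used in this example (the real table has 64).
CODON_TABLE = {
    "AUG": "M",   # methionine (start)
    "GAG": "E",   # glutamic acid
    "GUG": "V",   # valine
    "UAA": None,  # stop codon
}

def transcribe(coding_dna: str) -> str:
    """Transcription: copy the coding strand into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Translation: read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)

wild_type = "ATGGAGTAA"  # hypothetical mini-gene: Met-Glu-stop
mutant = "ATGGTGTAA"     # single point mutation in the second codon (A -> T)

print(translate(transcribe(wild_type)))  # ME
print(translate(transcribe(mutant)))     # MV
```

One changed base yields one changed amino acid, which is the sense in which a point mutation in the gene alters the protein.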
FB: when the gene is subject to single-point, non-neutral mutation, what is the probability that it can still produce a viable/usable protein?
7:22 AM
This is a great question that doesn’t have a single simple answer, because it depends heavily on several factors. But I can give you some general principles:
The probability varies widely, but most studies suggest somewhere around 25-35% of random mutations are tolerated (meaning the protein still functions adequately), though this varies enormously by protein and position.
Here’s why it’s complicated:
Factors that affect tolerance:
Location matters hugely: Mutations in the active site of an enzyme are usually devastating, while mutations on the protein’s surface away from functional regions are often tolerated. Some positions are extremely constrained, others very flexible.
Type of amino acid change: The genetic code has some redundancy (synonymous mutations cause no change), but even among non-synonymous mutations, some are “conservative” (swapping similar amino acids like leucine→isoleucine) and often tolerated, while others are “radical” (like charged→hydrophobic) and usually problematic.
The protein’s function: Some proteins are very robust to change, others extremely sensitive. Structural proteins tend to be more tolerant than enzymes with precise catalytic requirements.
Empirical data suggests:
Roughly 30% of random amino acid substitutions in typical proteins retain function
But in highly conserved/critical regions, this drops to near 0%
In flexible regions, it can be 70%+
What specific context are you thinking about?
FB: How many point mutations does it take to render the code unusable?
7:24 AM
This also varies tremendously, but here are some key points:
The brutal reality: Often just ONE mutation is enough to completely destroy function, especially if it hits a critical spot. But proteins vary hugely in their fragility.
Some examples across the spectrum:
One mutation can be lethal: Sickle cell disease is caused by a single point mutation (glutamic acid → valine at position 6 of beta-globin). Cystic fibrosis often results from a single 3-base deletion. Many genetic diseases arise from single mutations.
Some proteins tolerate dozens: Through directed evolution experiments, scientists have found that some proteins can accumulate 10-20+ mutations and still function, especially if the mutations are in non-critical regions.
What determines fragility?
Protein size and complexity: Larger proteins with more domains might lose one function but retain others
Redundancy: Some organisms have backup copies of genes
Epistasis: Sometimes multiple mutations together are worse (or better!) than you’d predict from each alone
Selection pressure: In nature, “unusable” really means “reduces fitness enough to be selected against”
A rough estimate: For a typical protein, you might expect that accumulating 3-5 random point mutations would very likely break essential function, though one well-placed mutation could do it instantly, while a lucky set of 10 mutations in tolerant positions might still work.
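The rough estimate above can be checked with a back-of-the-envelope calculation. The sketch below assumes that each random mutation is tolerated independently with the same probability, using the "roughly 30%" figure quoted earlier. That independence is a simplifying assumption: the answer itself notes that position and epistasis matter, so real proteins can do better or worse than this.

```python
def p_all_tolerated(p_single: float, n_mutations: int) -> float:
    """Chance that n independent random mutations are ALL tolerated."""
    return p_single ** n_mutations

P_SINGLE = 0.30  # assumed per-mutation tolerance, taken from the figure quoted above

for n in (1, 3, 5, 10):
    print(f"{n:2d} mutations: {p_all_tolerated(P_SINGLE, n):.6%}")
```

Under these assumptions the chance of surviving three random mutations is about 2.7%, five about 0.24%, and ten about 0.0006%, which is consistent with the statement that 3-5 random mutations would very likely break function.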
Are you thinking about this in an evolutionary context, or something else?
FB: In the subspace of all possible, functional proteins of a given type, how many are separated by 1, 2, 3, etc. mutations in the DNA that codes for them? Are there gaps, such that two such proteins can be separated by gaps that can only be bridged by multiple, simultaneous mutations?
7:30 AM
This is a fantastic question that gets at the heart of evolutionary theory and protein sequence space! You’re asking about the connectivity of functional protein space.
The short answer: Yes, there are definitely gaps, and this is a major area of active research.
What we know:
1. Protein sequence space is vast and sparse
For a protein of 100 amino acids, there are 20^100 ≈ 10^130 possible sequences
Only a tiny fraction of these fold properly or function
Functional proteins are like rare islands in a vast ocean of non-functional sequences
2. Local connectivity varies tremendously
Some functional proteins have many functional “neighbors” (sequences 1 mutation away)
Others are isolated peaks with few or no functional single-mutant neighbors
This creates “fitness landscapes” with peaks, valleys, and plateaus
3. Evidence for gaps requiring multiple simultaneous mutations:
Yes, these exist! Some examples:
Evolutionary “fitness valleys”: To get from one functional variant to another, you might need to pass through 2-3 non-functional intermediates. This requires either:
Multiple simultaneous mutations (very rare)
Genetic drift in small populations (mutations fix even when slightly deleterious)
Neutral networks (alternative path through functional variants)
Neutral networks: Many studies show that functional proteins are often connected through “neutral networks”—paths where each step maintains function, even if not improving it. These networks can be extensive, allowing exploration of sequence space without loss of function.
Empirical findings:
Deep mutational scanning experiments show that roughly 30-40% of single mutations are tolerated on average
But the probability of tolerating multiple specific mutations drops multiplicatively for independent sites
Some functional variants are separated by “valleys” requiring 2-3 specific mutations with no functional intermediate path
4. The “accessibility” problem:
This is directly relevant to debates about:
Limits of natural selection: If populations can’t cross multi-mutation gaps (because intermediate steps are non-functional and selected against), some adaptations might be unreachable
Irreducible complexity arguments: Some argue certain features require multiple coordinated changes
Neutral evolution: Random drift might help populations cross small valleys in small populations
Key insight from recent work:
The structure of protein fitness landscapes appears to be rugged but connected—there are many local peaks and valleys, but functional proteins often form interconnected networks. However, the connectivity depends on:
How strictly you define “functional” (20% activity? 80%? 100%?)
Population size (large populations can’t drift across valleys as easily)
Mutation rate
The specific protein family
Some proteins show remarkable “evolvability” with dense networks of functional neighbors, while others are more isolated.
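To put numbers on the question of how many sequences lie 1, 2, or 3 changes away, here is a small counting sketch. It works at the amino-acid level (20 possible residues per position) rather than the DNA level, and it counts all sequences at a given distance, functional or not. How many of those neighbours actually function is an empirical matter that the code does not address.

```python
from math import comb, log10

def sequence_space_log10(length: int, alphabet: int = 20) -> float:
    """log10 of the number of possible sequences of the given length."""
    return length * log10(alphabet)

def neighbours_at_distance(length: int, k: int, alphabet: int = 20) -> int:
    """Number of sequences differing from a given one at exactly k positions."""
    # Choose which k positions change, then one of (alphabet - 1) new residues each.
    return comb(length, k) * (alphabet - 1) ** k

LENGTH = 100  # the 100-residue protein used in the answer above

print(f"total sequences: about 10^{sequence_space_log10(LENGTH):.0f}")
for k in (1, 2, 3):
    print(f"exactly {k} substitution(s) away: {neighbours_at_distance(LENGTH, k):,}")
```

For a 100-residue protein this gives 1,900 single-substitution neighbours, 1,786,950 double neighbours, and 1,109,100,300 triple neighbours, against a total space of about 10^130. The neighbourhood grows quickly with distance but remains a vanishing fraction of the whole space.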
Is this question coming from thinking about evolutionary biology, or protein engineering, or something else?
FB: give examples of 1) proteins that show remarkable “evolvability”, 2) adaptations that can’t be reached, 3) neutral networks
7:36 AM
Great! Here are concrete examples:
1) Proteins with remarkable evolvability
TEM β-lactamase (antibiotic resistance enzyme)
Has been extensively studied because bacteria evolve resistance so readily
Can tolerate many mutations while maintaining function
Shows extraordinary flexibility in evolving resistance to new antibiotics
Deep mutational scanning shows ~40% of all possible single mutations are tolerated
Has evolved to break down antibiotics that didn’t even exist when the enzyme first appeared
Cytochrome P450 enzymes
Metabolize drugs and toxins
Extremely promiscuous—can act on hundreds of different substrates
Laboratory evolution experiments can redirect them to new substrates with just a few mutations
Used extensively in directed evolution because they’re so “plastic”
Green fluorescent protein (GFP)
Started as a jellyfish protein, now evolved into dozens of color variants (blue, cyan, yellow, red)
Researchers have created variants with 10-20+ mutations that still fluoresce
Shows that proteins with stable scaffold structures can tolerate extensive surface mutations
2) Adaptations that may require crossing fitness valleys
The classic example: chloroquine resistance in malaria
Requires specific mutations in the PfCRT transporter protein
Most evidence suggests 2-4 specific mutations needed simultaneously or in quick succession
Single mutations alone appear deleterious or non-functional
Has evolved independently only a few times globally despite enormous parasite populations
This rarity suggests it requires crossing a fitness valley (Nicholas White’s work estimated ~10^20 parasites needed for this to arise once)
Certain protein-protein binding specificity changes
Switching which partner a protein binds often requires coordinated changes at multiple contact residues
Single mutations might break existing binding without creating new binding
Creating a “molecular lock and key” requires both to change together
Some metabolic pathway innovations
Converting one enzyme to work on a substantially different substrate sometimes requires multiple active site changes
Individual changes might eliminate activity on the old substrate without gaining activity on the new one
Note: These examples are somewhat controversial—there’s ongoing debate about whether these truly require simultaneous mutations or whether there might be undiscovered functional intermediates.
3) Neutral networks
The classic experimental demonstration: RNA secondary structures
Pioneering work by Peter Schuster and colleagues
Showed that many different RNA sequences fold into the same secondary structure
These sequences form vast connected networks where you can mutate from one to another while maintaining the same fold
Can “drift” across sequence space while maintaining function
Protein examples:
Lactamase neutral networks
Studies showing that proteins maintaining >20% of wild-type activity form extensive connected networks
Can walk hundreds of mutations away from the starting sequence while maintaining basic function
These networks percolate through sequence space, allowing evolutionary exploration
Thermostability neutral networks
Some mutations decrease stability, others increase it
Stabilizing mutations can “buy” the ability to accept destabilizing mutations elsewhere
Creates a network where you can explore sequence space as long as overall stability stays above threshold
The DnaK chaperone studies (by Jesse Bloom and colleagues)
Systematically mapped which mutations are tolerated
Found extensive neutral networks where many genotypes have equivalent fitness
Showed that historical contingency matters—which mutations you acquire first affects which later mutations are accessible
Key concept: Neutral networks are crucial because they allow:
Exploration of sequence space without immediate fitness cost
Access to new peaks in the fitness landscape (drift to new starting point, then climb)
Evolutionary “capacitance”—hidden variation that might become beneficial under new conditions
The existence and structure of these neutral networks is central to understanding how evolution can innovate despite the vastness of sequence space!
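The thermostability idea described above, where a stabilizing mutation "buys" room for a destabilizing one, can be illustrated with a toy threshold model. Everything here is hypothetical: the starting margin, the mutation names, and the stability values are invented for the example, and the sign convention (positive means stabilizing) is an assumption of this sketch rather than a standard.

```python
def walk(start_margin: float, steps: list[tuple[str, float]]) -> tuple[list[str], float]:
    """Apply mutations in order, accepting each only if the protein stays
    at or above the stability threshold (margin >= 0)."""
    margin = start_margin
    accepted = []
    for name, delta in steps:
        if margin + delta >= 0:  # still folds: mutation is tolerated
            margin += delta
            accepted.append(name)
        # otherwise the mutant is non-functional and is rejected
    return accepted, margin

START = 2.0  # hypothetical stability margin above the folding threshold

# Two destabilizing mutations in a row: the second is rejected.
print(walk(START, [("D1", -1.5), ("D2", -1.5)]))
# A stabilizing mutation in between makes the second one acceptable.
print(walk(START, [("D1", -1.5), ("S", 1.0), ("D2", -1.5)]))
```

In the first walk only D1 is accepted; in the second, all three mutations are accepted because S restores enough margin for D2. Which sequences are reachable therefore depends on the order in which mutations arrive.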
What’s driving your interest in this topic?
FB: expand on: The DnaK chaperone studies (by Jesse Bloom and colleagues) * Systematically mapped which mutations are tolerated * Found extensive neutral networks where many genotypes have equivalent fitness * Showed that historical contingency matters—which mutations you acquire first affects which later mutations are accessible
8:34 AM
Great choice—this is really elegant work that demonstrates key principles about evolutionary landscapes!
The DnaK Studies: Background
What is DnaK?
A heat shock protein (Hsp70 family) in E. coli
Essential chaperone that helps other proteins fold correctly
Highly conserved across bacteria
About 638 amino acids long
Why study it?
Essential gene, so you can see what kills the organism
Well-understood function
Representative of important, constrained proteins
What Bloom and colleagues did
Comprehensive mutational scanning:
Systematically introduced mutations throughout the gene
Used high-throughput sequencing to track which mutations survived under selection
Measured fitness effects of thousands of single mutations
Also examined double mutants to look at epistatic interactions
Key findings:
1) Extensive variation in mutational tolerance
~15% of positions tolerate almost any amino acid substitution (highly tolerant)
~35% of positions tolerate no changes (completely constrained)
The rest fall in between with varying degrees of constraint
This created a detailed “fitness landscape map” showing which mutations are neutral, beneficial, or lethal
2) Neutral networks are extensive but structured
What they found:
Many different genotypes (sequence variants) had essentially identical fitness
These form connected networks—you can mutate from one to another through paths of neutral (or near-neutral) changes
The networks aren’t uniformly distributed; they have structure based on protein architecture
Implications:
A population can “drift” along these neutral networks without selection pressure
This drift explores sequence space and can position the population near different adaptive peaks
Different neutral variants, despite having the same current fitness, have different evolutionary potential
3) Historical contingency—this is the really fascinating part!
Epistasis (mutation interactions):
The key finding: The effect of a mutation depends on what mutations are already present.
Concrete examples from their work:
Sign epistasis: A mutation that’s deleterious on one genetic background might be beneficial (or neutral) on another
Mutation A alone: bad
Mutation B alone: bad
Mutations A+B together: neutral or good
Magnitude epistasis: A mutation’s effect size changes depending on background
Mutation X on background 1: -5% fitness
Mutation X on background 2: -0.5% fitness
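The two kinds of epistasis listed above can be made concrete with a small calculation. The sketch below uses one common way of quantifying epistasis, the deviation from multiplicative fitness on a log scale, and a simple test for sign epistasis. The fitness values are hypothetical numbers chosen to match the patterns described; they are not data from any study.

```python
from math import log

def epistasis(w_wt: float, w_a: float, w_b: float, w_ab: float) -> float:
    """Log-scale epistasis: zero when the two mutations combine multiplicatively."""
    return log(w_ab) + log(w_wt) - log(w_a) - log(w_b)

def has_sign_epistasis(w_wt: float, w_a: float, w_b: float, w_ab: float) -> bool:
    """True if either mutation changes from harmful to helpful (or vice versa)
    depending on whether the other mutation is present."""
    a_alone, a_on_b = w_a - w_wt, w_ab - w_b
    b_alone, b_on_a = w_b - w_wt, w_ab - w_a
    return a_alone * a_on_b < 0 or b_alone * b_on_a < 0

# Sign epistasis: A alone is bad, B alone is bad, A+B together is neutral.
sign_case = dict(w_wt=1.0, w_a=0.8, w_b=0.8, w_ab=1.0)
# Magnitude epistasis: X costs 5% on the wild type but only 0.5% on a
# background B that is itself neutral.
magnitude_case = dict(w_wt=1.0, w_a=0.95, w_b=1.0, w_ab=0.995)

for label, case in (("sign", sign_case), ("magnitude", magnitude_case)):
    print(label, round(epistasis(**case), 4), has_sign_epistasis(**case))
```

Both cases show positive epistasis (the double mutant is fitter than the single-mutant effects predict), but only the first flips the sign of a mutation's effect.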
What this means:
Path dependence: The order in which mutations occur matters enormously.
Example scenario:
Starting sequence: Wild-type
Want to reach: Variant with mutations A, B, and C
Path 1: A→B→C (all steps viable)
Path 2: B→A→C (the first step is lethal because B without A breaks the protein)
Result: Same destination, but only one path is evolutionarily accessible
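The example scenario can be checked by brute force. The sketch below enumerates every order in which A, B, and C could be acquired and keeps only the orders in which every intermediate genotype is viable. The single viability rule (B without A is lethal) is taken from the scenario above; the assumption that nothing else is lethal is mine, added to keep the example minimal.

```python
from itertools import permutations

def is_viable(genotype: frozenset) -> bool:
    """Viability rule from the scenario: B without A breaks the protein."""
    return not ("B" in genotype and "A" not in genotype)

def accessible_orders(mutations: str) -> list[str]:
    """Return the mutation orders in which every intermediate step is viable."""
    result = []
    for order in permutations(mutations):
        genotype = frozenset()
        for mutation in order:
            genotype = genotype | {mutation}
            if not is_viable(genotype):
                break  # a lethal intermediate blocks this path
        else:
            result.append("".join(order))
    return result

print(accessible_orders("ABC"))  # ['ABC', 'ACB', 'CAB']
```

Under this one rule, three of the six possible orders are blocked, including the B→A→C path. Adding further constraints of the same kind would narrow the accessible set still more, down to the single open path the scenario describes.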
Real implication from DnaK:
They found many cases where reaching a particular beneficial multi-mutation variant required a specific order of mutations
Some evolutionary outcomes are accessible from some starting points but not others
Two populations starting with different neutral variants might evolve in completely different directions even under identical selection pressure
4) The “evolutionary potential” differs among neutral variants
This is subtle but profound:
Even among sequences with identical current fitness:
Some have many beneficial mutations available as “next steps”
Others are in cul-de-sacs with few good options
This creates differential “evolvability”
Think of it like being at the same elevation on a mountain, but:
Position A: surrounded by gentle upward slopes (many paths to higher fitness)
Position B: on a narrow ridge (few paths forward, most lead down)
Broader Implications
1) Contingency in evolution
History matters—not just for explaining what happened, but for predicting what can happen next
This connects to Stephen Jay Gould’s “replaying the tape of life” thought experiment
Evolution is not fully deterministic even given the same selection pressures
2) Protein engineering challenges
Can’t just design “the best” sequence and expect to evolve toward it
Need to consider accessible paths, not just optimal endpoints
Directed evolution must navigate these constraints
3) Evolutionary rescue and adaptation
Populations with the same current fitness might have very different capacities to adapt to new challenges
Genetic diversity that seems “neutral” now might be crucial for future adaptation
Small population bottlenecks might not just reduce diversity, but could eliminate crucial paths through sequence space
4) Predictability limits
Even with perfect knowledge of the fitness landscape, you can’t fully predict evolutionary outcomes
The historical contingency means small random differences early on can determine long-term trajectories
Related work
This DnaK work built on and inspired similar studies:
Influenza hemagglutinin (also by Bloom): showing how antibody pressure creates epistatic constraints
HIV drug resistance: mutations must occur in specific orders to reach high-level resistance
Richard Lenski’s long-term E. coli evolution experiment: showed historical contingency in the evolution of citrate utilization (only 1 of 12 parallel populations evolved this despite identical conditions)
The fundamental insight: The fitness landscape isn’t just about peaks and valleys—it’s about which peaks are reachable from where you currently stand, and that depends on your entire genetic history.
Does this connect to what you’re thinking about?

Excellent work