The Most Important Biology Paper of 2026? AI Can Now Design AND Build 10¹⁶ Proteins — At a Trillion-Fold Lower Cost
- Eddie Avil

- 5 hours ago

AI Just Built a Quadrillion Proteins for Pennies. Nature Biotechnology Says It's Real.
There is a gap at the heart of modern biology that has quietly undermined years of progress — and until this week, nobody had a credible plan to close it.
Generative AI can now design proteins of breathtaking novelty and complexity. Structure predictors such as AlphaFold and ESMFold, together with generative design models such as RFdiffusion, have transformed what scientists can imagine. But imagining a protein and building one are two entirely different things. Of all the AI-designed proteins that exist on a computer, only a tiny fraction has ever been physically synthesised and tested. The cost of conventional DNA synthesis makes large-scale manufacturing expensive, and for truly ambitious libraries it is effectively impossible.
A new manufacturing-aware AI model architecture changes that entirely, enabling the construction of quadrillions of novel protein designs not only on the computer but also in the laboratory, with almost no loss of performance. That represents a more than one-trillion-fold reduction in the cost of gene synthesis (Pitt).
This is Variational Synthesis — and this week it was officially published in Nature Biotechnology, one of the most prestigious journals in science.
The Bottleneck Nobody Talks About
To understand why this matters, you need to understand the design-build gap that has constrained biology AI from day one.
Generative protein models create more novel designs than scientists can feasibly build in the laboratory. If researchers want to find the designs that actually work, whether the biomolecule that cures the disease or the one that catalyses the reaction, they need to build and test those designs in the real world (Statnews). They also need real-world data to improve the models themselves, creating a feedback loop that conventional synthesis economics make nearly impossible to operate at scale.
The physical synthesis of sequences at scale using conventional methods is prohibitively expensive (ResearchGate). For a library of 10¹⁶ proteins, the kind of diversity needed to meaningfully explore protein space, the cost using standard approaches would run to roughly a quadrillion dollars. That's not a research budget. That's a number that effectively closes the door entirely.
Variational Synthesis kicks that door off its hinges.
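To see what those figures imply together, here is a back-of-envelope calculation in Python. It uses only the round numbers quoted in this article (a 10¹⁶-design library, roughly a quadrillion dollars by conventional methods, and a one-trillion-fold reduction). The variable names are illustrative, and the paper's own cost accounting may differ.

```python
# Back-of-envelope check of the article's headline numbers.
# All inputs are the article's round figures, not values taken from the paper.
library_size = 10**16            # designs in the library
conventional_total_usd = 10**15  # "roughly a quadrillion dollars"
fold_reduction = 10**12          # "more than one-trillion-fold"

# Cost per design under conventional synthesis, as implied by the two totals.
conventional_per_design = conventional_total_usd / library_size

# Total and per-design cost after applying the quoted reduction factor.
reduced_total_usd = conventional_total_usd / fold_reduction
reduced_per_design = reduced_total_usd / library_size

print(f"conventional cost per design: ${conventional_per_design:.2f}")
print(f"implied total after reduction: ${reduced_total_usd:,.0f}")
print(f"implied cost per design: ${reduced_per_design:.0e}")
```

Taken at face value, the quoted figures put the whole library at about a thousand dollars or less, since the reduction is described as "more than" a trillion-fold.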
How It Works: Teaching AI to Know How to Build What It Designs
Variational Synthesis works by giving a generative model knowledge of DNA synthesis, making it aware of the constraints and possibilities of chemical manufacturing. These new manufacturing-aware generative models know not only how to generate novel proteins on the computer, but also how to manufacture quadrillions of designed proteins in the laboratory, outputting step-by-step instructions for doing so (Statnews).
The underlying mechanism is elegantly physical. DNA synthesis builds strands letter by letter, a chemical process that normally aims for precision, adding one exact nucleotide at each step. Variational Synthesis reimagines this process through the lens of generative modelling. At each step of synthesis, a mixture of different nucleotides is added to the growing DNA molecules, so each strand receives a new letter at random. The randomness is carefully controlled by the parameters of the variational synthesis model, which prescribes exact experimental parameters for the chemical reaction (Statnews).
The result: each molecule of DNA randomly encounters a different series of nucleotides and ends up with a different sequence, with each molecule representing a fresh design from the generative model. The system can manufacture 10¹⁷ DNA molecules or more, which is one hundred quadrillion samples from the generative model (Statnews).
Controlled randomness, implemented in chemistry. The computer's logic, running in a test tube.
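The idea of "a mixture at every step" can be sketched in a few lines of Python. This is a toy simulation, not the published model. It assumes the simplest possible reading of the description above, in which each synthesis step has its own nucleotide mixture and every strand draws its letter independently. The function names, the `mixtures` format, and the example probabilities are all hypothetical, and the real model may well couple positions in ways this article does not describe.

```python
import random
from collections import Counter

NUCLEOTIDES = "ACGT"

def synthesize_library(mixtures, n_molecules, seed=0):
    """Simulate n_molecules strands grown one position at a time.

    mixtures: list of dicts mapping nucleotide -> probability, one per
    synthesis step (a hypothetical stand-in for the reaction parameters).
    """
    rng = random.Random(seed)
    strands = []
    for _ in range(n_molecules):
        # Each strand independently draws one letter from each step's mixture.
        letters = [
            rng.choices(NUCLEOTIDES, weights=[mix.get(n, 0.0) for n in NUCLEOTIDES])[0]
            for mix in mixtures
        ]
        strands.append("".join(letters))
    return strands

def max_distinct_sequences(mixtures):
    """Upper bound on diversity: product of allowed letters at each step."""
    total = 1
    for mix in mixtures:
        total *= sum(1 for p in mix.values() if p > 0)
    return total

# Made-up four-step recipe: one fixed letter, then three mixed steps.
mixtures = [
    {"A": 1.0},
    {"A": 0.5, "G": 0.5},
    {"C": 0.7, "T": 0.3},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
library = synthesize_library(mixtures, n_molecules=10_000)
third_position = Counter(strand[2] for strand in library)
print(max_distinct_sequences(mixtures), len(set(library)), third_position)
```

The same arithmetic hints at how the scale in the text becomes reachable: a stretch of just 27 fully mixed positions already allows 4²⁷ ≈ 1.8 × 10¹⁶ distinct sequences, so the number of possible designs grows far faster than the number of synthesis steps.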
What Was Built — and What Was Proven
The paper doesn't just present a theory. The team built things — and verified them.
The researchers synthesised approximately 10¹⁶ designs from a generative model of human antibodies, with realism and diversity comparable to state-of-the-art protein language models, and verified the designed DNA by sequencing (ResearchGate). The antibody designs were then expressed in human cell lines and put to work in a real therapeutic screening context.
The results of that screening are where things get genuinely exciting for medicine. Translation and expression of the designed antibody single-chain variable fragments (scFvs) in human cell lines, combined with high-throughput screening against multiplexed human leukocyte antigen (HLA)-presented intracellular proteins, yielded potential therapeutic chimeric antigen receptors (ResearchGate). These receptors are the molecular machinery behind some of the most advanced cancer immunotherapies in clinical use today.
To test the approach beyond antibodies, the team also synthesised approximately 10¹⁶ DNA designs from models of Taq polymerase and the HLA-presented peptidome, supporting the method's generalisability across protein families (ResearchGate).
In addition to antibodies, variational synthesis models have been used to design and build novel T cell epitopes, a critical component of many vaccines, and novel DNA polymerases, a critical class of enzymes widely used in diagnostics and sequencing. In all cases, high-quality designs were successfully manufactured at massive scale (Statnews).
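The article says the designed DNA was verified by sequencing but does not describe how. As one illustration of what such a check could look like, the Python sketch below compares the nucleotide frequencies observed in a set of reads against the mixture that was intended at each position, using total variation distance. Everything here is an assumption for illustration: the helper names, the toy reads, and the choice of metric are not taken from the paper.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def position_frequencies(reads):
    """Empirical nucleotide frequencies at each position of equal-length reads."""
    length = len(reads[0])
    freqs = []
    for i in range(length):
        counts = Counter(read[i] for read in reads)
        freqs.append({n: counts[n] / len(reads) for n in NUCLEOTIDES})
    return freqs

def total_variation(p, q):
    """Total variation distance between two distributions over A, C, G, T."""
    return 0.5 * sum(abs(p.get(n, 0.0) - q.get(n, 0.0)) for n in NUCLEOTIDES)

# Hypothetical designed mixtures for a two-position library.
designed = [{"A": 0.5, "G": 0.5}, {"C": 0.75, "T": 0.25}]

# Toy reads, made up for illustration; real data would come from a sequencer.
reads = ["AC", "GC", "AC", "GT"]

observed = position_frequencies(reads)
distances = [total_variation(d, o) for d, o in zip(designed, observed)]
print(distances)
```

A distance near 0 at a position means the sequenced library matches the intended mixture there, while a distance near 1 means the two barely overlap. A per-position check like this would miss correlations between positions, so it is a starting point rather than a full validation.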
The Numbers That Put This in Context
It helps to sit with the scale for a moment.
200 million AI-designed proteins, built and tested in the lab, against 100 targets, simultaneously. A library containing 10¹⁶ designed sequences — more molecular diversity than most research programmes will encounter in a lifetime — synthesised at a cost orders of magnitude below any previous method.
When measured rigorously, the quality of the real-world manufactured designs is comparable to the quality of computational designs produced by conventional generative models. According to some key metrics of overall quality, the designs made by variational synthesis even surpass those imagined by a state-of-the-art protein language model (Statnews).
The library is not just large. It is good. The AI's chemistry lesson has not dumbed down its designs — it has scaled them without compromise.
Why This Changes Drug Discovery, Vaccine Development and Beyond
The ability to generate synthetic proteins purpose-built for drug development promises not only speed but performance advantages that natural proteins cannot offer. AI can optimise new proteins for improved binding to disease targets, resistance to degradation in the body, and better compatibility with delivery systems like nanoparticles or gene therapy vectors (EurekAlert!).
Variational Synthesis accelerates every stage of that pipeline. The design-build-test cycle that forms the backbone of drug development, historically measured in months per iteration, can now be compressed and parallelised at a scale that was simply not possible before. Pairing generative protein design models with the ability to synthesise, express, and test thousands of candidate proteins in parallel dramatically accelerates that cycle (EurekAlert!).
For vaccine development, this means designing immunogens with broader or more durable immune responses at scale. For enzyme engineering, it means exploring functional diversity across an entire protein family simultaneously. For cancer therapy, it means screening CAR candidates against multiple HLA-presented tumour antigens in a single experiment rather than one by one.
The economic implications are just as significant. From a commercial perspective, manufacturing-aware generative DNA synthesis promises to reduce the costs, timeframes, and resource burdens traditionally associated with large-scale DNA library production. Companies engaged in antibody discovery, vaccine development, and enzyme engineering stand to benefit enormously (Chalmers tekniska högskola).
The Feedback Loop That Makes AI Smarter
There is a second-order benefit to Variational Synthesis that may matter even more in the long run than the immediate therapeutic applications: it creates the data that future AI needs to become radically better.
By building and testing the designs of generative models at massive scale, new datasets can be created that are unprecedented in size and scope, transforming our ability to predict and engineer biology from the molecular level up (Statnews). Every protein that is synthesised and screened generates a data point. At 10¹⁶ designs per run, the scale of feedback available to train the next generation of biological AI models is without precedent.
Feeding wet-lab results from de novo designs back into training datasets establishes a "design-test-learn" cycle that iteratively improves subsequent designs and raises overall success rates (Calico). Variational Synthesis doesn't just populate that loop. It industrialises it.
A New Chapter for Biology — Written in Chemistry
Bridging computational design and physical manufacturing, this technique embodies a synthesis of machine learning algorithms with innovative biochemical processes, ushering in a new era where designed DNA sequences are produced en masse with precise control and remarkable efficiency (Chalmers tekniska högskola).
The 2024 Nobel Prize in Chemistry recognised computational protein design (David Baker) and protein structure prediction with AlphaFold (Demis Hassabis and John Jumper). Those advances helped free biological engineering from the constraints of evolutionary history: natural proteins, optimised for biological fitness rather than biotechnological versatility, are often confined to local optima in the protein functional fitness landscape (Calico). Variational Synthesis takes that freedom and makes it physical, not just in the computer but in the flask, at massive scale.
The designs are no longer just digital. They are real. And they are being tested against real diseases, right now.
💬 The Bottom Line
Variational Synthesis is not an incremental improvement on existing gene synthesis methods. It is a category change — a new class of AI architecture that understands manufacturing as a first-class design constraint, and uses that understanding to collapse the cost of building protein diversity by a factor of a trillion.
The gap between what AI can imagine and what biology can test has defined and limited the field for years. That gap just closed.
📄 Read the paper: Weinstein, E.N., Gollub, M.G., Slabodkin, A. et al. "Manufacturing-aware generative models enable petascale synthesis of designed DNA." Nature Biotechnology (2026). DOI: 10.1038/s41587-026-03020-8 🔗 JURA Bio blog: jura.bio




