A compound sat in a research database for years. It had been studied as a potential diabetes drug. Results were poor. Development was abandoned. Nobody looked at it again.
Then, in 2020, an MIT machine learning model evaluated it alongside thousands of other archived compounds — and flagged it as one of the most potent antibiotic candidates in the entire library. The compound killed bacteria that no known drug could. It worked against strains that had developed resistance to everything in the clinical toolkit. In mice infected with one of the most dangerous antibiotic-resistant pathogens on the planet, it cleared the infection within 24 hours.
The compound was renamed Halicin — after HAL 9000, the AI from 2001: A Space Odyssey. The name was deliberate: a signal that something had changed about how science works.
This article explains how that happened. Not in broad strokes — in actual steps. What does the AI model do? What does it learn from? Why did it find something human researchers missed? And what are the limits of this approach that nobody talks about enough?
The Problem That Made This Necessary
Antibiotic resistance is not a future problem. It is a present one. According to the World Health Organization, bacterial antimicrobial resistance was directly responsible for 1.27 million deaths globally in 2019. That number does not include people who died from other causes while also carrying a resistant infection.
The last genuinely new class of antibiotics to reach clinical use was discovered in the 1980s. That's not a minor gap. Bacteria evolve. Resistance spreads. The drugs we developed decades ago are slowly losing effectiveness against the organisms they were designed to kill. Meanwhile, pharmaceutical companies have largely exited the antibiotic development space — the economics don't work. Antibiotics are taken for short courses and then stopped. They don't generate the recurring revenue that chronic disease drugs do. So the pipeline dried up.
The result: a growing list of pathogens classified by the WHO as "critical priority" — organisms for which there is either no available treatment, or only one drug that barely works. Acinetobacter baumannii is one of them. It infects hospital patients, war veterans, people on ventilators. It can survive on surfaces for days. It picks up resistance genes from its environment. By the time you are infected with a carbapenem-resistant strain of A. baumannii, most clinical antibiotics simply stop working.
This is the backdrop against which MIT's machine learning experiment happened.
How Human Researchers Discovered Antibiotics — and Why It Stopped Working
The golden age of antibiotic discovery — roughly 1940 to 1980 — worked through a process called natural product screening. Researchers collected soil samples from around the world. They cultured microorganisms from those samples. Then they looked for organisms that produced compounds capable of killing bacteria. This is how penicillin was found: accidentally, by Alexander Fleming, when a mold contaminating his plate killed the bacteria around it.
The problem is that this method runs out. After decades of global soil sampling, you keep finding the same compounds. The same families of molecules show up again and again. The "easy" discoveries have already been made. What's left in natural product libraries is mostly familiar chemical territory — stuff that's been tried, tested, and found to have resistance problems or toxicity issues.
The pharmaceutical industry shifted to a different approach: synthetic chemistry. Design molecules from scratch using known principles. Test them. Modify them. Test again. This is expensive, slow, and relies heavily on human chemists having accurate intuitions about which structural features are likely to show antibacterial activity.
Human intuition is good within familiar territory. It is not good at finding molecules that look completely unlike anything that has ever been called an antibiotic before. That's the exact category of molecule that's most likely to defeat resistance — because resistance typically evolves in response to specific structural features, and a structurally alien molecule bypasses those evolved defenses.
AI doesn't have intuition. That turned out to be an advantage.
What the AI Model Actually Does
The MIT model is a deep neural network. Before it can evaluate any molecule, it has to be trained — and training requires examples. The researchers fed it a dataset of 2,335 molecules with known properties: each one either inhibited bacterial growth or it didn't. The model processed the molecular structures of all 2,335 compounds and learned to recognize which structural patterns correlate with antibacterial activity against E. coli.
This is not the model learning a set of explicit rules. It doesn't learn "molecules with feature X kill bacteria." It learns a vastly more complex set of relationships between structural features — relationships that cannot be reduced to human-readable principles. The model builds what researchers call a structure-activity relationship map: a high-dimensional representation of which structural properties tend to predict which biological outcomes.
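The actual MIT model is a graph neural network that operates on molecular structures directly. To illustrate the core idea — learning which structural features correlate with an activity label, with no built-in notion of what an antibiotic "should" look like — here is a deliberately tiny sketch: a logistic regression over made-up binary fingerprints, where one bit secretly tracks activity. Everything in it (the fingerprints, the data, the correlated bit) is synthetic.

```python
import math
import random

random.seed(0)

# Toy stand-in for structure-activity learning. Each "molecule" is a
# binary fingerprint (bit i = substructure i present), labeled 1 if it
# inhibits bacterial growth. All data here is synthetic.
N_BITS = 8

def make_molecule(active):
    bits = [random.randint(0, 1) for _ in range(N_BITS)]
    bits[0] = 1 if active else 0  # bit 0 secretly tracks activity
    return bits

train = [(make_molecule(label), label) for label in [1, 0] * 50]

# Minimal logistic regression trained by SGD: it learns only which
# bits correlate with the activity label, nothing else.
w, b, lr = [0.0] * N_BITS, 0.0, 0.5
for _ in range(200):
    for x, y in train:
        p = 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
        grad = p - y
        w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
        b -= lr * grad

def score(x):
    """Predicted probability of antibacterial activity."""
    return 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
```

After training, `w[0]` carries by far the largest weight: the model has isolated the activity-linked substructure without ever being told it exists. The real network does the analogous thing across thousands of interacting graph features, which is why its learned representation can't be reduced to human-readable rules.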
Once trained, the model was applied to the Drug Repurposing Hub — a curated database of 6,680 compounds that had already been through some degree of safety and pharmacological testing for other purposes. The model evaluated each compound and assigned it a predicted antibacterial activity score. This took under two hours.
The researchers then filtered that output through additional criteria: low structural similarity to existing antibiotics (to avoid finding things that were already known), and predicted low toxicity to human cells. From this filtered set, they chose 240 candidates to test in the actual lab.
Nine showed antibacterial activity. One was extraordinary. That was Halicin.
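The filtering stage of that pipeline — rank by predicted activity, drop anything too similar to a known antibiotic, drop anything predicted toxic — can be sketched in a few lines. The fingerprints, scores, and thresholds below are all invented for illustration; real pipelines compute similarity over actual chemical fingerprints.

```python
# Toy sketch of the candidate-filtering step (all data invented).

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprint bit sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Made-up fingerprints standing in for known antibiotic scaffolds.
known_antibiotics = [{1, 2, 3, 4}, {2, 3, 5, 8}]

# (name, fingerprint, predicted activity, predicted human-cell toxicity)
candidates = [
    ("cand_A", {1, 2, 3, 9}, 0.91, 0.05),    # potent but scaffold-familiar
    ("cand_B", {6, 7, 11, 13}, 0.88, 0.04),  # potent, novel, non-toxic
    ("cand_C", {6, 12, 14}, 0.85, 0.60),     # novel but predicted toxic
]

MAX_SIMILARITY, MAX_TOXICITY = 0.5, 0.2

def passes(fp, toxicity):
    """Keep only structurally novel, predicted-safe compounds."""
    novel = all(tanimoto(fp, ab) < MAX_SIMILARITY for ab in known_antibiotics)
    return novel and toxicity < MAX_TOXICITY

shortlist = sorted(
    (c for c in candidates if passes(c[1], c[3])),
    key=lambda c: c[2],
    reverse=True,
)
```

Note what the novelty filter does: the most potent candidate here is discarded precisely because it resembles existing drugs. That is the scaffold-bias inversion built directly into the search criteria.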
After confirming Halicin, the team scaled up. They ran the same model against the ZINC15 database — a public library of around 1.5 billion chemical compounds. They screened 100 million of those in three days. This would have been physically impossible to do experimentally. You cannot test 100 million compounds in a wet lab. It would take thousands of researchers and decades of work. The model did it computationally.
From that screen, they identified 23 candidates that were structurally distinct from known antibiotics and predicted to be non-toxic. Lab testing found that 8 of them had real antibacterial activity. Two were "particularly powerful," according to the published paper in the journal Cell.
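The "physically impossible" claim is straightforward arithmetic. The wet-lab throughput below is an illustrative assumption, not a measured figure:

```python
compounds = 100_000_000
model_days = 3  # reported computational screening time

# Hypothetical comparison: an optimistic high-throughput wet-lab rig
# testing 10,000 compounds per day, every day, without interruption.
wetlab_per_day = 10_000
wetlab_years = compounds / wetlab_per_day / 365

print(f"model: {model_days} days; wet lab: ~{wetlab_years:.0f} years")
```

Even under that generous assumption, the experimental route takes on the order of decades for a screen the model finished in days.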
The Halicin Discovery: How It Actually Happened
Halicin's real name — before the researchers renamed it — was SU-3327. It was originally synthesized to inhibit a kinase enzyme implicated in diabetes. It performed poorly and development stopped. The compound went into the Drug Repurposing Hub as an archived molecule.
No human chemist looked at its structure and thought: antibiotic. There's no obvious reason to. Its chemical scaffold doesn't resemble any known antibiotic family. It doesn't belong to the beta-lactam class (penicillins, cephalosporins), the tetracycline class, the macrolide class, or any other recognized group. A medicinal chemist with decades of experience and strong antibiotic intuition would have deprioritized it immediately. The structural cues that signal "this might kill bacteria" simply aren't there in a way human pattern recognition would catch.
The AI model doesn't have those intuitions. It doesn't know what antibiotics are "supposed to" look like. It learned from 2,335 examples and extracted a statistical representation of antibacterial activity that operates at a level of structural complexity beyond what human chemists can hold in their heads. When it evaluated SU-3327, the model's learned representation said: high probability of antibacterial activity.
In lab tests, Halicin killed a broad spectrum of bacterial species: Clostridium difficile, Mycobacterium tuberculosis, carbapenem-resistant Enterobacteriaceae, and — crucially — Acinetobacter baumannii. The last one is on the WHO critical priority list. Halicin cleared it in 24 hours in mouse infection models.
The mechanism is what makes resistance unlikely. Most antibiotics target a specific protein — a cell wall synthesis enzyme, a DNA replication enzyme, a ribosomal subunit. Bacteria can evolve resistance by mutating that specific protein. Halicin works differently: it disrupts the flow of protons across the bacterial cell membrane. This is a fundamental electrochemical process that the cell needs to produce energy (ATP). Disrupting it is more like pulling the power cable than blocking one specific machine. To evolve resistance to this mechanism, bacteria would need to restructure how their membranes manage charge — a much more complex and costly evolutionary change.
The researchers tested this directly. E. coli did not develop measurable resistance to Halicin after 30 days. In the same experiment, the same bacteria developed resistance to ciprofloxacin — a standard fluoroquinolone antibiotic — within one to three days.
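The logic behind that contrast can be made concrete with a toy probability model. All the rates here are invented, purely to illustrate the exponential gap between single-target and multi-component resistance:

```python
# Toy model (rates are illustrative assumptions, not measured values):
# if each required genetic change independently arises and fixes with
# daily probability p, resistance that needs k coordinated changes
# appears in a lineage with daily probability p**k.

def p_resistance_within(days, p_per_change, k):
    daily = p_per_change ** k
    return 1 - (1 - daily) ** days

# One target mutation suffices (fluoroquinolone-style drug).
single_target = p_resistance_within(30, 0.3, 1)

# Five coordinated changes needed (membrane-level mechanism).
membrane = p_resistance_within(30, 0.3, 5)
```

With these made-up numbers, the single-target drug is nearly certain to meet resistance within 30 days, while the multi-change mechanism stays under 10 percent — and raising k pushes it toward zero exponentially. That exponential gap is the intuition behind the serial-passage result.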
Why Human Researchers Missed It
There's a concept in machine learning research called "scaffold bias." Human medicinal chemists, when designing or evaluating potential drug candidates, tend to gravitate toward chemical scaffolds that have worked before. If a class of molecules has yielded useful antibiotics historically, researchers are more likely to synthesize variants of that class. This isn't irrational — it's efficient resource allocation. You work with what you know.
But scaffold bias creates blind spots. Molecules that are structurally distant from known antibiotics are deprioritized — not because they don't work, but because they don't look like they should work. Halicin sat in a database for years because it fell outside the scaffold bias of antibiotic research. Nobody thought to test it.
The AI model reduced that bias. It learned from structural features, not from precedent about what antibiotics "look like." The researchers explicitly built the search criteria to favor candidates with low structural similarity to known antibiotics — specifically to find things that conventional screening would miss.
This connects to a broader point about what AI is actually good at in scientific research. It's not replacing scientific judgment. It's identifying things that human judgment, operating under reasonable heuristics and resource constraints, was structurally unlikely to find. That's a specific and genuinely useful contribution — not magic, not general intelligence, just a very powerful search tool operating in chemical space that humans can't practically navigate at scale.
What Happened After Halicin — and Where This Is Going
The MIT team didn't stop at Halicin. By 2023, they had developed improved models and run further screens. One surfaced an entirely new structural class of antibiotic candidates; another identified Abaucin, a compound highly specific against A. baumannii, effective against that single species with minimal effects on others. Narrow-spectrum specificity matters because broad-spectrum antibiotics kill beneficial gut bacteria alongside pathogens. Abaucin's specificity is a feature, not a limitation.
In 2025, the same group published work using generative AI — not just screening existing compounds, but designing new ones from scratch. They generated over 36 million possible molecules computationally and screened them for antibacterial properties. The two most promising candidates showed activity against drug-resistant gonorrhea and MRSA. Both are structurally unlike any existing antibiotic and appear to work by disrupting bacterial cell membranes through mechanisms that haven't been documented before.
This is a meaningful shift. The original Halicin work was about screening: finding a useful molecule among ones that already existed. The 2025 work is about generation: asking AI to design new molecules with specified properties that don't yet exist in any database. The search space expands enormously. You are no longer limited to compounds that have already been synthesized.
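The difference between screening and generation can be sketched with a toy steepest-ascent loop. The bit-set "molecules", the predictor, and the hidden target motif below are all invented; real generative models operate on actual molecular graphs, not bit flips.

```python
# Toy illustration of generative design (everything here is synthetic):
# instead of scoring a fixed library, repeatedly edit a candidate
# structure and keep edits the predictor scores higher.

N_BITS = 16
TARGET = frozenset({2, 5, 11})  # hidden "ideal" substructure pattern

def predict(fp):
    """Stand-in for a trained activity predictor (overlap with TARGET)."""
    return len(fp & TARGET) / len(fp | TARGET)

def flip(fp, bit):
    """Add or remove one substructure 'bit'."""
    return fp ^ frozenset({bit})

design = frozenset({0})  # arbitrary starting structure
for _ in range(20):
    # Steepest-ascent step: try every single-bit edit, keep the best.
    neighbor = max((flip(design, b) for b in range(N_BITS)), key=predict)
    if predict(neighbor) <= predict(design):
        break  # local optimum reached
    design = neighbor
```

The loop converges on the target motif even though that structure was never in any library — the search produces it rather than finds it. That is the screening-to-generation shift in miniature.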
The same principle has been applied well beyond antibiotics. Insilico Medicine used generative AI to design a drug candidate for idiopathic pulmonary fibrosis — a scarring lung disease — going from target discovery to a preclinical candidate in under 18 months, then on into human clinical trials. Conventional pharmaceutical development takes 10 to 12 years from discovery to clinical trial; the AI-accelerated timeline compressed that to a fraction. Early trial results showed measurable improvement in lung function. The company now reports 18 preclinical drug candidates, most generated by AI. As Insilico's CEO noted, achieving the same outcome through traditional methods would have taken roughly four to five years and cost ten times as much.
The underlying logic is the same across all of these cases. Biology involves vast combinatorial spaces — protein structures, molecular interactions, reaction pathways — that are too large to explore experimentally. AI models can navigate those spaces computationally, at a speed and scale that physical experimentation cannot match. Hypotheses that used to require years to generate and test can now be generated in days and tested in weeks. The bottleneck is moving from hypothesis generation to physical validation. That's still slow. Wet lab work still takes time. But AI has already removed one of the biggest constraints on how fast science can move.
This connects to a pattern that shows up across AI-driven science. Just as Google's AlphaChip treats chip floorplanning as a strategy problem — generating layouts in under six hours that previously took human engineers weeks — the drug discovery models treat molecular search as an optimization problem where the machine explores more of the space than any human team could. You can read more about how AI is being applied to physical design in the piece on what's actually driving progress in AI-physical systems in 2026. The same search-and-optimize logic underpins both.
My Take
The Halicin story is useful precisely because it's specific and verifiable. A named compound. A documented mechanism. Published in Cell. Tested in mice. Every step is on the record. When people talk about AI transforming science, this is what they mean — not abstract capability claims, but a particular molecule that kills resistant bacteria, that would not have been found without machine learning, that is structurally unlike anything a human chemist would have prioritized.
The thing that doesn't get said enough: what AI did here wasn't intelligence. It was scale. A sufficiently thorough human team, given infinite time and resources, could have tested every compound in that database and eventually found Halicin. The AI got there in hours instead of never. That's the actual value — not that it thought in a fundamentally different way, but that it searched in a way that humans practically cannot. Scaffold bias isn't stupidity. It's rational resource allocation under constraints. AI removed the constraints.
The resistance-defeating mechanism is the part that genuinely surprises me. Most antibiotic resistance evolves because you're blocking one protein target, and bacteria have billions of generations to find a mutation that changes that protein's shape. A drug that disrupts membrane electrochemistry doesn't give bacteria a single target to mutate around. Thirty days, no resistance. That's remarkable. Not because AI figured out the mechanism — the human researchers figured out the mechanism — but because the AI found the molecule that happened to work this way, sitting unnoticed in a database of archived drug failures.
I'm more skeptical about the generative AI step — designing entirely new molecules from scratch rather than screening existing ones. The search space is enormous, and AI models can be confidently wrong. Every AI-generated candidate still needs to be synthesized and tested in a physical lab. The bottleneck has moved but hasn't disappeared. What I'm watching is whether the 2025 generative compounds actually clear preclinical testing, because that's where most things die regardless of how promising they look computationally. The screening results are encouraging. The clinical results will take years.
Key Takeaways
- AI drug discovery works by training a model on compounds with known properties, then applying it to screen millions of candidates computationally.
- The key advantage is scale and freedom from scaffold bias — AI evaluates structurally novel molecules that human chemists would deprioritize.
- Halicin was a failed diabetes drug, rediscovered by MIT's AI as a broad-spectrum antibiotic with a resistance-evading mechanism. Bacteria showed no resistance after 30 days of testing.
- The model screened 100 million compounds in under 3 days. Physical lab testing at that scale would be impossible.
- The field has moved from screening existing compounds to generatively designing entirely new ones — 36 million+ candidate molecules designed computationally in one study.
- Physical validation remains the bottleneck. Computational results still require wet lab testing and clinical trials before any drug reaches patients.
FAQ
Is Halicin approved for human use?
Not yet. As of early 2026, Halicin is still in preclinical research stages. MIT's team has been working with pharmaceutical partners through Phare Bio, a nonprofit formed out of the Antibiotics-AI Project, to advance the most promising candidates. Drug development takes years of preclinical and clinical testing before approval, regardless of how it was initially discovered.
How did the AI model learn what makes a good antibiotic?
It was trained on a dataset of 2,335 molecules with documented antibacterial activity — or lack of it — against E. coli. The model processed the molecular structure of each compound and learned statistical relationships between structural features and biological outcomes. It does not learn human-readable rules; it learns a high-dimensional representation that encodes which structural patterns correlate with which biological results.
Could the same approach be used for other diseases?
Yes, and it already is. Insilico Medicine used generative AI to identify drug candidates for lung fibrosis. Other groups are applying similar approaches to antivirals, anti-cancer compounds, and materials for batteries and electronics. The method — train on known examples, screen or generate new candidates, validate experimentally — applies wherever there is a large enough dataset of known outcomes to train on.
Why does scaffold bias matter in drug discovery?
Scaffold bias describes the tendency of medicinal chemists to work with chemical scaffolds that have a track record in a given therapeutic area. It's rational under resource constraints — you work with what you know works. But it systematically underexplores structural territory that doesn't resemble existing drugs. Compounds in that unexplored territory may have entirely different mechanisms of action — ones that bacteria haven't evolved resistance to, precisely because humans haven't used them before.
What is the Drug Repurposing Hub and why was it useful here?
The Drug Repurposing Hub is a database maintained by the Broad Institute at MIT and Harvard. It contains thousands of compounds that have already passed some degree of safety and pharmacological testing, usually for a different indication than the one being explored. Using this library meant that Halicin's initial safety profile was partially known before antibiotic testing began — it had already been through early-stage evaluation as a diabetes treatment. That's a head start in development.
Does AI replace scientists in drug discovery?
No. The AI model generates candidates. Human researchers design the search criteria, interpret the outputs, select compounds for physical testing, perform the lab work, analyze results, and determine what those results mean. The bottleneck in every published example so far is physical validation — wet lab testing, animal models, clinical trials. AI has accelerated the hypothesis generation stage significantly. It has not replaced any of the validation stages, which remain slow and expensive and deeply dependent on human expertise. For more on how AI augments rather than replaces expert judgment, the article on how AI reasoning is monitored and why human oversight still matters gets into some of the limits of AI autonomy in high-stakes contexts.
The honest version of this story is not that AI solved the antibiotic crisis. It hasn't. Halicin is not in clinical use. The generative compounds from 2025 are not in clinical use. Drug development is slow, and no amount of computational speed at the hypothesis generation stage changes how long it takes to run a clinical trial safely. What AI has done is move one specific bottleneck — the search problem — from "practically impossible at the scale required" to "solved." The other bottlenecks are still there. They matter. But removing one real constraint from a process that's been stuck for forty years is not nothing. That much seems clear from the published results.