The Week Washington Shut Down Claude's Mythos — Jun 15, 2026
Listen & watch
Show notes
Three days after launch, Washington pulled the plug on Claude's two most powerful models.
Run time: 5:38
In today's episode:
- U.S. government forces Anthropic to disable Fable 5, Mythos 5
- General chatbots beat dedicated medical AI tools in Nature Medicine
- SIIM: AI nails the numbers, fumbles the judgment
- First AI-formulated drug enters a Phase 1 trial
- Topol: 44 trials, still not standard practice
- Anthropic credit split and model retirements go live today
- GPT-5.6 and Gemini 3.5 Pro still stuck in preview
TL;DR:
- Washington issued Anthropic an export-control order three days after launch, forcing it to disable Fable 5 and Mythos 5 for all customers — the model biomedical labs were using for drug design is gone for now.
- A Nature Medicine head-to-head found general-purpose LLMs beat OpenEvidence and UpToDate Expert AI across all three test rounds, with clinicians preferring them; the dedicated tools failed on clarity and safety-critical omissions.
- Radiologists at SIIM 2026 confirmed top LLMs exceed 95% on numerical extraction from scans but still make confident medical-knowledge errors — arithmetic isn't the weak link, judgment is.
Sources cited:
- Nature Medicine
- AuntMinnie
- Contract Pharma
- Medscape
- Ground Truths
- Crescendo AI news roundup
- Anthropic
- Releasebot
- Essa Mamdani
Subscribe: YouTube
medAI Times is for educational and informational purposes only. The content does not constitute medical advice, diagnosis, treatment recommendation, or professional clinical guidance. Consult qualified healthcare professionals and refer to official sources before making clinical, research, regulatory, or business decisions.
Transcript
Auto-generated from the episode audio. Click any timestamp to jump the player there.
Three days after launch, Washington pulled the plug on Claude's two most powerful models. Welcome to MedAI Times Podcast, your daily update on medical AI. Don't forget to like and subscribe. Post-A. Here's what crossed the desk since Friday.
The US government forced Anthropic to switch off its two newest models three days after they launched. A peer-reviewed study put general chatbots head-to-head against the medical AI tools doctors pay for, and the chatbots won.
Radiologists at their big informatics meeting decided AI can do the arithmetic, but still flunks the judgment. An AI-formulated drug reached a phase one trial. And Eric Topol asked why 44 trials still aren't enough to change practice.
Two days ago, we covered Fable 5 quietly routing biology questions to an older model. This weekend, Washington made that point moot. The top story is a first of its kind. On Friday evening, the US government issued Anthropic an export control directive citing national security, ordering it to suspend all access to Fable 5 and Mythos 5 for any foreign national, anywhere, including Anthropic's own foreign national employees.
The practical effect was that the company had to disable both models for every customer. They had launched just three days earlier. Anthropic's read is that the government saw a demonstration of a jailbreak technique used to surface a handful of already known minor vulnerabilities.
The company disagrees that a narrow jailbreak justifies recalling a model shipped to hundreds of millions of people, and warns the standard would freeze new releases industry-wide. Why does a medical audience care?
Mythos was the version select biomedical labs were using for molecular biology hypotheses and drug design, where scientists preferred it over older models about 80% of the time. So here's the verdict. Noise at the bedside, signal in the lab.
No clinician was prescribing off Mythos. But the drug design teams lost their best tool overnight. And the precedent is the real headline. To the study doctors we'll actually argue about. Nature Medicine, June 12th, researchers pitted three general-purpose models, GPT 5.2, Gemini 3.1, and Claude Opus 4.6 against two dedicated clinical tools, open evidence and up-to-date
expert AI. Three rounds, 500 medical exam questions, 500 alignment items, and 100 real, de-identified physician queries scored by 12 blinded US clinicians. The general models won all three.
Open evidence scored lowest on clarity, and the failures were ugly ones. Incomplete content, disorganized answers, and safety-critical omissions. The catch-worth saying out loud, those are last-generation model versions, and benchmarks still aren't the clinic.
Radiologists spent last week at their informatics meeting in Pittsburgh, and the mood was that imaging AI has hit, in the outgoing board chair's words, a transition point. A study we flagged Friday held up.
Top language models cleared 95% on pulling numbers off scans. Things like minimum T-scores and bile duct diameters. But the residual mistakes weren't math errors. They were the model confidently misreading what a number means.
The other live debate on the floor was whether the radiologist shortage is even real, and how much of it AI should be allowed to paper over. On the drug side, a quieter milestone. A CRDMO called Quotient Sciences says it has put the first AI-formulated drug into a phase one trial, an oral solid dose now being tested for safety in healthy volunteers in the UK after the regulator signed off.
Worth keeping the language precise. This is AI designing the formulation, the actual pill, using a machine-learning platform from intrepid labs, not AI inventing the molecule. Still no AI-discovered drug has reached approval.
But the design layer is now in humans, too. And a useful gut check from Eric Topol's newsletter. He points out there are now 44 randomized trials showing AI assistance helps gastroenterologists catch more precancerous polyps during colonoscopy.
And it still isn't standard practice. Same story he tells about mammography. The evidence for some of these tools has crossed the line. Adoption hasn't, he calls it the paradox of medical AI implementation.
And it's the most honest framing of where we actually are. Two quick items from the model world. The same export order that pulled methos lands the day anthropic subscription and API credits formally split, and the older SONNET 4 and OPUS 4 models retire.
Meanwhile, GPT 5.6 and Gemini 3.5 Pro are both still sitting in preview, promised for this month, not yet shipped. The summer model flood is running late. Spotlight.
The Nature Medicine study leaned on something called the Real Clinical Query Benchmark. Not exam questions, but 100 actual queries physicians typed to a model during real shifts, then graded blind.
That's the direction medical AI evaluation is heading, because a model can ace the licensing exam and still bury the one detail that matters at 2 in the morning. Links and sources for every item are in the description. We'll be watching whether Fable and Mythos come back, and which model flood actually arrives.
See you in the next one. Thanks for listening. Find us on YouTube and your favorite podcast app. See you tomorrow.