AI Spots Breast Cancer Signs 6 Years Before Diagnosis — Jun 12, 2026

Show notes

An AI just spotted breast cancer in scans taken six years before diagnosis.

Run time: 20:02

In today's episode:

  1. AI sees breast cancer six years before diagnosis
  2. Stanford crowd cooled on AI scribes after debate
  3. Most clinicians admit using unapproved AI tools
  4. Saudi hospital shows off practical AI at HLTH Europe
  5. States move to regulate AI in insurance approvals
  6. Stanford and Mayo read tumors from a blood draw
  7. Anthropic splits subscription and API credits June 15
  8. xAI's Grok V9 lands mid-June, built on Cursor data
  9. Gemini 3.5 Pro still waiting on its June release

TL;DR:

  • A Radiology study found three FDA-cleared mammography AIs flagged signs of breast cancer up to six years before diagnosis in about one in five cancers — retrospective, but a big, clean dataset.
  • The honest counter-current: Stanford clinicians' support for AI scribes dropped from 69% to 54% after a live debate, and a new survey says ~72% of healthcare staff reach for unapproved "shadow AI" when sanctioned tools fall short.
  • Anthropic is splitting subscription and programmatic usage into separate credit pools on June 15 and retiring Opus 4.1 on August 5 — plan your pipelines now.

Subscribe: YouTube

medAI Times is for educational and informational purposes only. The content does not constitute medical advice, diagnosis, treatment recommendation, or professional clinical guidance. Consult qualified healthcare professionals and refer to official sources before making clinical, research, regulatory, or business decisions.

Transcript

Auto-generated from the episode audio.

An AI just spotted breast cancer in scans taken six years before diagnosis. Welcome to MedAI Times Podcast, your daily update on medical AI. Don't forget to like and subscribe. AI sees breast cancer six years before diagnosis.

Stanford crowd cooled on AI scribes after a debate. Most clinicians admit using unapproved AI tools. A Saudi hospital shows off practical AI at HLTH Europe. States move to regulate AI in insurance approvals.

Stanford and Mayo read tumors from a blood draw. Anthropic splits subscription and API credits June 15. xAI's Grok V9 lands mid-June. And Gemini 3.5 Pro is still waiting on its June release.

Man, that is a really dense stack of updates for just one week. Yeah, it really is. And you know, it's funny because just two days ago we were discussing Eric Topol's argument that every single mammogram should run AI in the background.

Right, yeah, I remember that. And now today we have this massive study showing that the software can actually spot cancer a full six years out. The timing is just incredible, honestly. I mean, it puts Topol's argument into a much sharper context. It really shifts the conversation from, you know, just getting a second opinion to fundamentally altering the entire timeline of disease detection.

Exactly. And for you listening right now, that is exactly our mission for this deep dive. We want to separate the actual clinical utility of medical AI from all the hype and noise. Yeah, because there is a lot of noise out there right now.

So much. So we're going to look at everything from deep biological imaging to the kind of messy reality of how doctors are actually using and sometimes hiding these tools in the clinic.

Right, because we need to see what actually holds up when it leaves the pristine environment of the lab. Yeah. So let's start right there with that top story, because I think it perfectly illustrates the promise of AI in a really highly controlled environment. We're looking at a new Radiology study from a Karolinska team with Fredrik Strand as the senior author.

Okay, yeah. They took three commercial FDA-cleared AI CAD systems and fed them like 88,963 mammograms. This was from over 31,000 women across a 10-year period.

Just for anyone who isn't, you know, deep in the radiology stack, we should probably clarify that CAD stands for computer-aided detection. It isn't just a generic image filter on your phone. Right. These are specialized systems trained specifically to flag suspicious tissue densities or microcalcifications, and testing three of them on a 10-year dataset of nearly 90,000 scans.

That is remarkably rigorous. The numbers they pulled out of that dataset are wild, too. They set the specificity at 90%. And at that threshold, the AI caught visible signs of cancer up to six years out in 19.7% of cases.

Wow. Almost 20%. Yeah. And that rises to 25.2% at four years and 39.3% at two years. They are essentially saying that roughly one in five breast cancers is already leaving a mammographic trace that an AI can read six full years early.

Let's pause on that 90% specificity, though, because I think that defines how we should interpret these numbers. Well, it means they tuned the software to only accept a 10% false positive rate. They aren't just cranking up the sensitivity so high that the AI aggressively flags every, you know, benign cyst or anatomical shadow just to inflate their early detection stats.
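To make the fixed-specificity idea concrete, here is a minimal sketch of how a cutoff can be chosen so that about 90% of cancer-free exams stay unflagged, with sensitivity then read off at that cutoff. The scores and function names are invented for illustration; this is not the study's code or data.

```python
def cutoff_for_specificity(negative_scores, target=0.90):
    """Lowest cutoff at which at least `target` of cancer-free exams are not flagged.

    An exam is flagged when its score is strictly above the cutoff.
    """
    n = len(negative_scores)
    for cutoff in sorted(set(negative_scores)):
        true_negatives = sum(1 for s in negative_scores if s <= cutoff)
        if true_negatives / n >= target:
            return cutoff
    raise ValueError("need at least one cancer-free score")


def sensitivity(positive_scores, cutoff):
    """Fraction of exams from women later diagnosed that the cutoff flags."""
    flagged = sum(1 for s in positive_scores if s > cutoff)
    return flagged / len(positive_scores)


# Made-up AI scores, purely illustrative.
cancer_free = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.80]
later_diagnosed = [0.15, 0.45, 0.60, 0.70, 0.90]

cutoff = cutoff_for_specificity(cancer_free, target=0.90)
print(cutoff, sensitivity(later_diagnosed, cutoff))
```

Fixing the cutoff on the cancer-free exams first is what keeps the false positive rate capped; the early-detection percentage is whatever falls out afterwards.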

Oh, OK. So it's not just crying wolf on every scan. Even with that 10% false positive ceiling, catching nearly 40% of cancers two years early is a massive leap. I mean, it really makes me wonder about the signal check here.

The underlying science seems incredibly sound. I'd say it is a strong signal, definitely, but with a major asterisk attached to it. OK. What's the asterisk? Well, the signal is strong because it's a clean data set spanning multiple vetted systems.

It proves those biological markers are present in the tissue much earlier than we thought. But the asterisk is that this is purely retrospective. Oh, meaning they already knew who got cancer in the end. Right. They're looking at old scans where they have the outcome data.

Nobody has actually taken these six-year-ahead flags and run them forward in a live prospective screening program yet. We don't know if acting on that early AI warning actually improves patient survival without leading to, like, massive overtreatment.

OK, let's unpack this because I really want to dig into how the AI is actually finding these tiny traces. This is where the spotlight is falling on something called longitudinal or delta imaging AI. Yeah, delta imaging is a huge shift.

My understanding is that it's kind of like looking at the stock market. Like, if I tell you a stock closed at $50 today, that single absolute data point doesn't tell you much. Right. It's just a snapshot. But if I show you the 10-year chart and you see it steadily climbing from $10 up to $50, suddenly you have a trajectory.

That analogy holds up perfectly against how traditional diagnostics work, actually, because most imaging AI historically just scores one scan in isolation. It looks at the picture from today and asks, you know, is there a tumor here right now?

But this is doing something different. Yeah. These longitudinal models compare a patient's images over time. They learn from the trajectory of the risk score. They're evaluating the slope of the line, basically. So cancer becomes detectable not because any single mammogram screams malignancy, but because the AI's subtle risk reading creeps upward across years of routine screening.
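As a rough illustration of "evaluating the slope of the line," the sketch below fits an ordinary least-squares slope to one patient's risk scores against the actual time elapsed between exams. It is a simplification with invented numbers; the commercial systems in the study are not described at this level of detail, so nothing here should be read as how they work.

```python
from datetime import date


def risk_slope_per_year(exams):
    """Least-squares slope of risk score versus years since the first exam.

    `exams` is a list of (exam_date, risk_score) pairs for one patient.
    """
    if len(exams) < 2:
        raise ValueError("need at least two exams to estimate a trend")
    first = min(d for d, _ in exams)
    years = [(d - first).days / 365.25 for d, _ in exams]
    scores = [s for _, s in exams]
    mean_x = sum(years) / len(years)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all exams share the same date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    return sxy / sxx


# Invented scores: no single exam looks alarming, but the trend creeps upward.
history = [
    (date(2016, 2, 1), 0.08),
    (date(2018, 3, 15), 0.11),
    (date(2020, 2, 20), 0.15),
    (date(2022, 4, 5), 0.21),
]
print(risk_slope_per_year(history))
```

Using real dates on the x-axis, rather than counting visits, matters because screening intervals are rarely exactly regular.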

Okay. But if the math on that slope is really that reliable, and we have three FDA cleared systems that can do this, I have to ask, why aren't we doing it tomorrow? Like, if the software can spot a troubling trajectory six years early, wouldn't a doctor want that context immediately?

In theory, yes, absolutely. But the hurdle preventing deployment tomorrow is data hygiene. Data hygiene. Yeah. The real world is much messier than a curated retrospective data set.

Take scanner drift, for example. What's that? Say a hospital replaces a 10-year-old imaging machine with a brand new one that captures much higher resolution. The AI might look at the sharper pixels and misinterpret that technological upgrade as a biological change in the breast tissue.

Oh, wow. So it thinks the tissue grew denser, but really the camera just got a better lens. Precisely. You also deal with bad registration, where maybe the breast is compressed slightly differently than it was two years ago, or inconsistent time intervals between scans.

Like a patient might come in after 14 months instead of 12. Right. Life gets in the way. Exactly. And all of these technical variations can manufacture a change in that delta analysis that isn't biologically real.
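One way to picture the data hygiene problem is a pre-check that refuses to trust a delta between two exams when the acquisition conditions changed. The sketch below is hypothetical: the `Exam` fields, the 12-month expected interval, and the 3-month tolerance are assumptions chosen for illustration, not values from the study or from any vendor.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Exam:
    exam_date: date
    scanner_model: str  # hypothetical metadata field


def hygiene_warnings(prior, current, expected_months=12, tolerance_months=3):
    """List reasons a change between two exams may not be biologically real."""
    warnings = []
    if prior.scanner_model != current.scanner_model:
        warnings.append("scanner changed")
    # Average month length of 30.44 days is an approximation.
    gap_months = (current.exam_date - prior.exam_date).days / 30.44
    if abs(gap_months - expected_months) > tolerance_months:
        warnings.append("irregular interval")
    return warnings


prior = Exam(date(2022, 1, 10), "scanner-A")
current = Exam(date(2023, 3, 10), "scanner-B")
print(hygiene_warnings(prior, current))
```

A 14-month gap passes the interval check here, while the equipment swap is flagged so a downstream delta model could treat that pair with caution.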

So if we deploy this tomorrow, the false positives caused by just messy hospital data could trigger thousands of unnecessary biopsies. Yeah. That tension, the gap between a pristine retrospective data set and the chaotic reality of a hospital floor, is actually playing out right now with clinical workflow tools, too.

Which brings us to the Stanford Health AI Week. Oh, the scribe debate. Yeah. There was a live debate between two physicians, Tracy Riedel and Leonardo Aliaga, regarding ambient AI scribes. Those are the tools that listen to the doctor-patient conversation and write the clinical notes automatically.

Right. They've been getting a lot of attention lately. Well, before the debate, audience support for these tools was at 69%. But after they debated the merits and risks live on stage, support actually fell to 54%. What's fascinating here is that the people closest to the actual implementation of these tools are becoming more skeptical, not less, when they are forced to unpack the mechanics out loud.

The overarching theme of that Stanford week was implementing AI responsibly, effectively, and reliably. It was a very sober tone. But what specifically drove that 15-point drop in support? Are doctors just, I don't know, afraid of learning new software?

Or is there a structural flaw in how these scribes operate? It's definitely a structural risk. Riedel and Aliaga mapped out the downstream failures. You see, an ambient scribe doesn't just produce a verbatim transcript of the visit.

It creates a summarized clinical narrative. Oh, OK. And when it summarizes, it actively filters information. So a patient comes in for a twisted ankle, but casually mentions a family history of blood clots. The AI might decide the clotting history isn't relevant to the ankle and omit it from the summary entirely.

Oh, I see where this is going. If that patient later has a pulmonary embolism, the hospital looks at the chart, sees no mention of the family history, and asks the doctor why they missed it. Exactly. The liability remains entirely on the human physician.

But the physician is relying on an opaque algorithm to write the medical record. When clinicians at Stanford were forced to articulate that trade-off, the potential for hallucinatory summaries versus the time saved, their enthusiasm cooled off considerably.

OK, wait. So doctors are skeptical of sanctioned AI scribes in a debate, but they are secretly feeding patient data into unsanctioned AI on their phones to save time? Because that brings up a massive contradiction. You're talking about the Wolters Kluwer survey?

Yeah. This newly resurfaced survey shows that 72% of U.S. healthcare professionals admit to turning to personal, unapproved shadow AI tools when their workplace options fail them. It's a staggering number, really.

And 40% say they know a colleague who is doing it, and 50% explicitly state they are doing it just for faster workflows. So they're standing in a conference hall debating the liability of regulated scribes, but then they go back to the hospital and dump patient case histories into a public ChatGPT window.

We call it the shadow AI paradox. It really highlights how desperate clinicians are for administrative relief. I mean, they are drowning in documentation, which forces hospital IT departments into a terrible corner. Damned if you do, damned if you don't.

Right. Do they aggressively lock down the network and slow down care even further? Or do they kind of turn a blind eye to the massive HIPAA risk? Because every single unvetted prompt that contains patient data is a governance nightmare that compliance committees can't see or monitor.

It feels like the U.S. healthcare system is stuck trying to bolt AI onto broken administrative workflows. But it looks like other regions are taking a different approach. Like at the HLTH Europe conference, the King Faisal Specialist Hospital from Saudi Arabia presented their AI strategy.

Oh, their practical AI approach. Yeah. They are completely bypassing the scribe debate to focus strictly on operational forecasting and resource planning. Things that are deployed, not just demoed. Right. They are heavily emphasizing the logistics.

Instead of trying to automate the doctor-patient interaction, they use AI for managing bed capacity, predicting emergency room staffing needs based on historical data, optimizing supply chains. So they're treating the hospital like a massive logistics operation rather than just a collection of doctor-patient interactions.

Which is often the safer place to start, honestly. Yeah. If an operational AI slightly miscalculates the number of nurses needed on a Tuesday, the hospital can adjust. But if a clinical AI hallucinates a medication dosage in a medical record, the consequences are immediate and severe.

That makes total sense. And that invisible logistical layer is where the real friction in healthcare lives. And it's not just happening with hospital bed management. It's happening at the state level too, where algorithms are literally deciding if your care is going to be approved.

Yeah, the prior authorization issue. Right. We are seeing a flurry of new state legislative moves targeting this. Massachusetts has H.46616, which regulates AI and prior authorizations. And the reporting date for that was just June 15th.

They also have S.2632 on AI in healthcare decision making. And New York just passed a whole wave of AI laws on June 1st. This is essentially the regulatory floor beginning to form under the unglamorous side of medical AI.

We spend so much time talking about AI discovering drugs or reading mammograms. But this is the application of AI that touches a patient's wallet directly. Yeah, whether insurance actually pays for it. Exactly. It's the algorithm sitting at the insurance company deciding if you actually need that MRI or if a specific targeted therapy is covered.

States are realizing that if an algorithm is acting as a financial gatekeeper to medical care, the logic behind those denials needs to be transparent and accountable. It can't just be hidden in a black box. So regulators are trying to map and control the invisible administrative systems surrounding patient care.

But there is a completely different type of invisible mapping happening right now, which moves us away from administrative algorithms and back to bleeding edge science. A Stanford and Mayo Clinic study. Yes, they've developed a new blood test, but it's not looking for cancer cells directly.

It's reading the epigenetic or chemical marks on circulating tumor DNA to infer the cells surrounding a tumor. Right. They are mapping the tumor microenvironment. Here's where it gets really interesting.

I was trying to wrap my head around epigenetics. And like the best way I can describe it is this. A traditional tissue biopsy looks at the actual building materials of a cell to see if it's cancerous. But epigenetics is like reading the postmarks on the trash being thrown out of the building.

We don't have to go inside the cell. We just read the chemical tags on the DNA fragments floating in the blood to know if that cell is operating like a normal factory or if it has mutated into a fortress. That is a fantastic way to visualize it.

Right. Those chemical tags are usually methylation marks, and they turn certain genes on or off without changing the underlying DNA code. Ah, OK. By feeding those methylation patterns into an AI, the Stanford and Mayo team identified nine distinct cellular neighborhoods that are actually shared across most cancer types.

And mapping those neighborhoods is crucial for treatments like immunotherapy, right? Because you need to know what the surrounding cells are doing. Precisely. Immunotherapy relies on your own T cells attacking the cancer. But tumors don't just sit there passively.

They actively recruit other cells to build a defensive neighborhood. Yeah. If the epigenetic blood test shows that the tumor is surrounded by a neighborhood of exhausted T cells or regulatory cells that are actively suppressing the immune response, then pumping the patient full of expensive checkpoint inhibitors might not work at all.

The immune cells are already asleep at the wheel. So this blood test, if it's prospectively validated, gives doctors a non-invasive way to see if the environment is hostile to immunotherapy before they even prescribe it. It allows for truly personalized treatment decisions.

But processing millions of methylation marks across a blood draw to infer those cellular neighborhoods, that requires immense computational backbone. Which brings us to the engine room. Because all of these medical tools, from the shadow AI apps on a doctor's phone to the complex algorithms reading epigenetic postmarks, they all rely entirely on the general foundational AI models.

And those models are undergoing some major shifts right now. Oh, absolutely. The pace of the underlying infrastructure is accelerating just as fast as the medical applications. Let's do a quick-fire exchange on these. First, Anthropic. On June 15th, they are making a structural change, splitting programmatic API usage and subscription credits into separate pools.

OK. They are also retiring Claude Opus 4.1 on the API on August 5th. But the detail that caught my eye was that they shipped more than 20 legal MCP connectors. Yeah. Those MCP (Model Context Protocol) connectors are fascinating, especially in a heavily regulated field like medicine.

How so? Well, if you are a medical compliance officer trying to figure out if your hospital's new shadow AI policy aligns with that new New York legislation we just talked about, those connectors allow the AI model to securely query established legal databases directly.

It doesn't have to rely on its general, potentially outdated training data. Ah. So it bridges the gap between a general reasoning engine and a specific, verifiable database. Exactly. Meanwhile, xAI is expected to drop Grok V9 in mid-June.

It's a 1.5-trillion-parameter model, and they explicitly trained it on Cursor data to handle complex programming. That's a big deal for development. Right. When I see a model optimized for complex coding, I immediately think of the software engineers trying to build the next generation of those FDA-cleared CAD systems for mammograms.

I mean, they need massive coding co-pilots to handle that architecture. It creates a massive feedback loop. Better coding models accelerate the iteration cycle of the medical software itself, which allows smaller teams to build much more robust clinical tools.

And finally, we have Google's Gemini 3.5 Pro, which is still waiting on its June general availability release. Google has been heavily promoting its 2-million-token context window and a new Deep Think mode.

2 million tokens is huge. Right. But to anyone outside of software engineering, 2 million tokens just sounds like a marketing metric. If I'm a clinical researcher or a doctor, why should I care about a context window that massive?

Could it theoretically ingest a patient's entire lifetime medical record at once? If we connect this to the bigger picture, it absolutely could. And that changes the paradigm for clinical record-level RAG, retrieval-augmented generation. OK, break that down for me.

A context window is essentially the AI's active short-term memory. Older models might only remember a few pages of text at a time. So if you fed them a massive patient file, they would lose the plot. They'd forget the symptoms mentioned on page one by the time they reach page 50.

Oh, I see. So a doctor would have to fragment the history, querying the lab results separately from the surgical meds. Exactly. It was disjointed. But a 2 million token window means a doctor could drop a patient's entire medical history, every lab result, every clinical note, every scan report for the last 30 years into the model simultaneously.

Wow. The AI can digest the entire narrative of human life in a single gulp. That allows it to spot long-term patterns or subtle drug interactions that a human physician might easily miss when they're rushing through a fragmented electronic health record.
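For a feel of what the context window limit means in practice, the sketch below estimates token counts and groups a patient's documents into batches that each fit a given window. The four-characters-per-token figure is a rough rule of thumb for English text, not any vendor's tokenizer, and the function names are illustrative.

```python
CHARS_PER_TOKEN = 4  # rough assumption; real tokenizers vary by model and text


def estimate_tokens(text):
    """Approximate token count using ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_documents(documents, context_tokens):
    """Greedily group documents, in order, into batches that fit the window."""
    batches, current, used = [], [], 0
    for doc in documents:
        cost = estimate_tokens(doc)
        if cost > context_tokens:
            raise ValueError("a single document exceeds the context window")
        if current and used + cost > context_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches


# Placeholder strings standing in for notes, lab reports, and scan reports.
record = ["n" * 400, "l" * 400, "s" * 400]  # about 100 tokens each
print(len(pack_documents(record, context_tokens=250)))        # small window
print(len(pack_documents(record, context_tokens=2_000_000)))  # large window
```

With the small window the record has to be split and queried in pieces; with the large one it fits in a single batch, which is the shift being described.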

That is wild. We started this journey looking at an AI that can spot the biological trajectory of breast cancer a full six years before a doctor can see it on a scan. We did. And then we explored the paradox of doctors secretly using shadow AI to survive their documentation burden while actively fearing the liability of sanctioned ambient scribes.

We saw states scrambling to regulate the algorithms acting as financial gatekeepers. And we ended with AI reading the microscopic chemical neighborhoods of tumors, all powered by foundational models that are expanding their memory to encompass a lifetime of medical data.

It's a lot to take in. But the through line connecting all of these developments is that the theoretical capabilities of medical AI are no longer the primary bottleneck here. Right. The math works. The science is undeniably there.

So what does this all mean for the listener? It means the true value of AI in healthcare right now hinges entirely on implementation. It depends on data hygiene, on managing invisible risks, and on building interfaces that doctors actually trust in the chaotic reality of a hospital clinic.

Because it's one thing to build a brilliant algorithm; it's another thing entirely to integrate it into a human workflow safely. Which leaves us with a final question for you to consider. Think back to that incredible Karolinska breast cancer study we talked about at the start.

The six-year early detection. Right. It was retrospective, and it was tested on three specific FDA-cleared CAD systems. But the researchers didn't actually name the vendors. So would it change your trust in these systems if you could see a highly transparent head-to-head breakdown of exactly which commercial products were tested and what their specific false positive rates were?

And maybe more importantly, is any vendor out there actually brave enough to take this technology out of the safety of historical data and move it toward a live, prospective trial with real patients? Because the biological signals are definitely there, and the algorithms can clearly read them, the real question now is whether the healthcare system is actually ready to act on the warning.