As we step into a new year, it’s worth reflecting on 2024, a year in which we juggled ongoing boarding challenges, new technology hype, and the ever-evolving conversation around generative AI. This newsletter looks back at the “best of times and worst of times” for AI in health care and highlights how we at CyrenCare believe in a more measured, provider-centered way forward.
Illustration by Paul Noth "It wants to do our job
A Tale of Two Extremes
“It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness…”
The Fall of a $650M “AI-First” Company
A once-hyped primary care startup, at one point valued at $650 million, pursued a futuristic vision of AI-driven “medical pods” to automate clinical care. In practice, the company overestimated current AI capabilities and underestimated the complexity of real-world patient interactions and clinical oversight.
Headlines of AI Chatbots “Defeating” Doctors
Simultaneously, media outlets stirred debate with claims that GPT-like models were “better” than doctors at diagnosing illnesses. The reality is far more nuanced, as multiple studies show below.
Conflicting Results: Man vs. Machine
Studies published in 2024 reported conflicting results, each with its own context and caveats.
Large Language Model Influence on Diagnostic Reasoning (Goh et al., JAMA Netw Open 2024): In this trial, the LLM alone outperformed physicians, including those who had the LLM available to them. The result made headlines in the NYT and indicates that further development in human-computer interaction is needed to realize the potential of AI in clinical decision support systems.
ChatGPT (GPT-4) versus doctors on complex cases of the Swedish family medicine specialist examination: an observational comparative study (Arvidsson et al., BMJ Open 2024): Researchers tested GPT-4 on open-ended primary care questions. GPT-4 fell short compared with physicians, especially in capturing psychosocial details, medication nuances, and individualized care planning.
Evaluation and mitigation of the limitations of large language models in clinical decision-making (Hager et al., Nat Med 2024): In a framework simulating a realistic clinical setting, LLMs performed significantly worse than physicians in diagnosing patients, did not follow diagnostic or treatment guidelines, and could not interpret laboratory results, posing a serious risk to the health of patients.
ChatGPT and Generating a Differential Diagnosis Early in an Emergency Department Presentation (ten Berg et al., Annals of Emergency Medicine 2024): In a retrospective analysis of ED cases, ChatGPT matched experts’ ability to generate differential diagnoses.
✔️ Key Takeaway (Warraich et al., JAMA 2024): LLM performance should be monitored in the environment where it’s actually used, not just on multiple-choice or short-answer tests, in line with FDA guidance on real-world lifecycle monitoring.
CyrenCare's Approach:
We’re firm believers in using AI to “do the dishes, not the art.”
That means letting AI handle repetitive, lower-stakes tasks so clinicians can focus on higher-level decision-making and one-on-one patient interactions.
Direct Patient Voices
Our platform collects structured symptom information directly from patients and compiles it into a patient-generated history of present illness (HPI) report, so providers can have focused, meaningful conversations instead of repeating the same questions to gather basic information.
Transparent, Rule-based Logic
Our algorithms are transparent; ED teams see exactly how data is processed—fostering trust and interpretability. No more black-box mysteries.
Augmentation, Not Replacement
Think of it as “doing the dishes.” We capture core patient details and surface red flags, but the art of medicine remains with you.
Seamless Integration
Designed to fit ED workflows, our interface is continually refined based on clinician feedback to keep it simple and save time.
“We use LLMs as a lubricant for user experience—collecting symptoms more accurately—but our core system is a rule-based engine. Not every clinical challenge needs a black-box AI solution.”
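For readers curious what "transparent, rule-based" can look like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not CyrenCare's actual engine: the rule names, thresholds, and answer fields (`chest_pain`, `shortness_of_breath`, `pain_score`) are all hypothetical and are not clinical guidance. The point is only that every flag carries a plain-language reason a clinician can read.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str                        # label shown to the clinician
    explanation: str                 # plain-language reason the rule fires
    applies: Callable[[dict], bool]  # predicate over structured patient answers


# Hypothetical rules for illustration only; not clinical guidance.
RULES = [
    Rule(
        name="chest_pain_with_dyspnea",
        explanation="Patient reports chest pain together with shortness of breath.",
        applies=lambda a: bool(a.get("chest_pain")) and bool(a.get("shortness_of_breath")),
    ),
    Rule(
        name="severe_pain_score",
        explanation="Patient-reported pain score is 8 or higher.",
        applies=lambda a: a.get("pain_score", 0) >= 8,
    ),
]


def evaluate(answers: dict) -> list[dict]:
    """Return every rule that fired along with its reason, so nothing is hidden."""
    return [
        {"rule": rule.name, "why": rule.explanation}
        for rule in RULES
        if rule.applies(answers)
    ]


if __name__ == "__main__":
    # Structured answers as they might arrive from a patient intake form (assumed shape).
    answers = {"chest_pain": True, "shortness_of_breath": True, "pain_score": 5}
    for flag in evaluate(answers):
        print(f"{flag['rule']}: {flag['why']}")
```

Because each rule is a named, readable predicate, an ED team could audit the full list and see exactly why a given flag appeared, which is the kind of interpretability a black-box model does not offer by default.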