We shared results from a real-world study of an LLM clinical copilot, conducted as a collaboration between OpenAI and Penda Health in Nairobi, Kenya. Across 39,849 live patient visits, clinicians using AI Consult (powered by GPT-4o) had a 16% relative reduction in diagnostic errors and a 13% reduction in treatment errors compared with clinicians who did not use it.
This is among the first pieces of real-world evidence that LLMs can reduce clinical errors safely and at scale.