2026 (English). In: JAMIA Open, E-ISSN 2574-2531, Vol. 9, no. 1, article id ooag004. Article in journal (Refereed), Published
Abstract [en]
Objectives: To assess the performance of a reasoning large language model (LLM) in identifying medication errors in medical incident reports.
Materials and Methods: OpenAI’s O4-mini LLM was adapted using prompt engineering on 75 000 anonymized incident reports from the Västmanland region of Sweden (2019-2024). To guide the prompt design, we used a subset of 2434 reports that pharmacists had manually reclassified as medication-related or not. For validation, 200 reports (January 2024-March 2024) were independently classified by 2 pharmacists to establish a reference classification. The LLM performed the same binary classification, and its concordance with the expert consensus was measured.
Results: The LLM achieved a concordance rate of 96.0% (192/200; 95% CI, 92.3-98.3) with expert classification. Eight cases (4.0%) showed disagreements, primarily due to linguistic ambiguity or context-dependent interpretation. In 5 cases, the pharmacists classified the report as nonmedication-related while the LLM classified it as medication-related; in 3 cases, the reverse occurred. Subcategorization accuracy was 76.5%.
Discussion: The LLM showed expert-level performance, outperforming existing automated methods. Thus, its integration into incident reporting systems might improve the efficiency, accuracy, and consistency of patient safety monitoring.
Conclusion: This validated AI-driven method can be integrated directly into clinical informatics workflows, enabling healthcare organizations to rapidly and consistently identify medication errors, ultimately enhancing patient safety outcomes.
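The headline result can be checked from the counts given in the abstract. The sketch below recomputes the concordance rate from 192 agreements in 200 reports and derives an exact (Clopper-Pearson) binomial confidence interval. The abstract does not state which interval method the authors used, so Clopper-Pearson is an assumption here, and the function names `binom_cdf` and `clopper_pearson` are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial interval for k successes in n trials.

    Assumption: the abstract does not name its CI method; this is one
    common choice, not necessarily the one the authors used.
    """

    def solve(tail, target):
        # tail(p) = P(X > j | p) is increasing in p, so bisect on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2
    upper = 1.0 if k == n else solve(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


# Counts taken from the abstract.
n_reports = 200
n_agree = 192
llm_only_medication = 5         # LLM: medication-related, pharmacists: not
pharmacist_only_medication = 3  # pharmacists: medication-related, LLM: not

# The two kinds of disagreement should account for every non-concordant case.
assert llm_only_medication + pharmacist_only_medication == n_reports - n_agree

concordance = n_agree / n_reports
ci_low, ci_high = clopper_pearson(n_agree, n_reports)

print(f"Concordance: {concordance:.1%}")
print(f"95% CI (Clopper-Pearson): {ci_low:.1%} to {ci_high:.1%}")
```

If the authors used the exact method, the printed interval should agree with the reported 92.3-98.3 to within rounding; an approximate method such as Wilson would give slightly different bounds.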
Place, publisher, year, edition, pages
Oxford University Press, 2026
Keywords
artificial intelligence, incident reporting, medication errors, natural language processing, patient safety, article, binary classification, diagnosis, drug therapy, human, incident report, intelligence, large language model, male, medical informatics, medical information system, medication error, pharmacist, product vigilance, prompt engineering, reasoning, Sweden, workflow
National Category
Artificial Intelligence
Identifiers
urn:nbn:se:mdh:diva-75756 (URN)
10.1093/jamiaopen/ooag004 (DOI)
001668632800001 ()
41601915 (PubMedID)
2-s2.0-105028562491 (Scopus ID)
2026-02-04 2026-02-04 2026-02-04 Bibliographically approved