A team at Yale School of Medicine conducted a group-based simulation trial comparing a standard AI risk dashboard for upper gastrointestinal bleeding with an enhanced version including GutGPT, an LLM-powered conversational interface. While GutGPT significantly improved Effort Expectancy scores, indicating better perceived usability, it did not produce a statistically significant change in Behavioral Intention to adopt the system. The study highlights the importance of integration, trust, and workflow fit beyond ease of use in clinical AI adoption.

Key points

  • Integration of GutGPT—a three-tier LLM architecture (parser, model, guideline retriever)—with an ML-based UGIB risk dashboard.
  • Randomized simulation trial with 106 trainees compared GutGPT+dashboard versus dashboard alone; primary outcome: Behavioral Intention; secondary: Effort Expectancy and decision accuracy.
  • GutGPT improved perceived usability (Effort Expectancy Δ=0.6; 95% CI [0.3,1.0]) but showed no significant effect on adoption intent (BI p=0.657).
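
The two headline statistics can be read with the usual conventions: a 95% confidence interval that excludes zero indicates a significant difference, and a p-value at or above 0.05 does not. The sketch below applies those rules to the figures reported above. The 0.05 threshold and the helper names (`Interval`, `is_significant`) are assumptions made for illustration, not details taken from the study.

```python
from dataclasses import dataclass

# Assumed significance threshold; it matches the 95% CI quoted in the summary.
ALPHA = 0.05


@dataclass(frozen=True)
class Interval:
    """A confidence interval for a between-group difference."""
    low: float
    high: float

    def excludes_zero(self) -> bool:
        # If zero lies outside the interval, the difference is significant
        # at the level the interval was built for.
        return self.low > 0 or self.high < 0


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Conventional rule: significant when p is below alpha."""
    return p_value < alpha


# Figures as reported in the key points above.
effort_expectancy_ci = Interval(low=0.3, high=1.0)
behavioral_intention_p = 0.657

print("Effort Expectancy significant:", effort_expectancy_ci.excludes_zero())
print("Behavioral Intention significant:", is_significant(behavioral_intention_p))
```

Under these assumptions the first check comes out true and the second false, which matches the summary: usability improved, but intent to adopt did not measurably change.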

Why it matters: The trial shows that making a clinical AI interface easier to use did not, on its own, raise clinicians' intent to adopt it, underscoring the need for trust and workflow integration.

Q&A

  • What is Effort Expectancy?
  • How does GutGPT classify clinician queries?
  • Why didn’t increased usability translate into higher adoption?
  • What is the UTAUT framework?


Full article: Usability and adoption in a randomized trial of GutGPT a GenAI tool for gastrointestinal bleeding