A team from Kırıkkale University systematically evaluated ScholarGPT, ChatGPT-4o, and Google Gemini on 30 endodontic apical surgery questions sourced from Cohen’s Pathways of the Pulp. Analyzing 5,400 responses, they found ScholarGPT achieved 97.7% accuracy, markedly higher than ChatGPT-4o’s 90.1% and Gemini’s 59.5%.

Key points

  • 5,400 responses to 30 endodontic apical surgery questions (12 dichotomous, 18 open-ended) drawn from Cohen’s Pathways of the Pulp.
  • ScholarGPT (academic-tuned LLM) attains 97.7% accuracy versus ChatGPT-4o’s 90.1% and Gemini’s 59.5% (χ²=22.61, p<0.05).
  • High inter-rater reliability confirmed by weighted Cohen’s kappa (κ=0.85) for coding correctness.
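
For readers unfamiliar with the reliability statistic in the last point, the following is a minimal sketch of a weighted Cohen's kappa for two raters. The three-level rating scale, the linear weighting scheme, and the sample ratings are all assumptions made for illustration; the summary above does not say which scale or weights the study used.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linear-weighted Cohen's kappa for two raters on an ordered scale.

    Assumes at least two categories and that the raters do not both
    put every item in the same single category (which would make the
    expected disagreement zero).
    """
    k = len(categories)
    index = {label: i for i, label in enumerate(categories)}
    n = len(rater_a)

    # observed[i][j] = share of items rated category i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weight: 0 on the diagonal, growing linearly with distance.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    observed_dis = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(weight(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    return 1.0 - observed_dis / expected_dis


# Hypothetical ratings, not data from the study.
SCALE = ["incorrect", "partial", "correct"]
rater_1 = ["incorrect", "partial", "correct", "correct"]
rater_2 = ["incorrect", "partial", "correct", "partial"]
print(round(weighted_kappa(rater_1, rater_2, SCALE), 3))
```

Values near 1 indicate strong agreement beyond chance, which is how a κ of 0.85 is usually read.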

Why it matters: The academic-tuned GPT’s higher accuracy underscores the value of specialized LLMs for reliable clinical decision support in dentistry.


Introduction to Large Language Models in Clinical Dentistry

Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. In clinical dentistry, they offer the potential to summarize research, provide diagnostic guidance, and assist with treatment planning. This course explores how LLMs work, their applications in endodontics, and considerations for safe use.

How LLMs Work

LLMs like GPT-4 are trained on massive datasets comprising books, articles, and web content. Built on the transformer architecture and trained to predict the next token in a sequence, they learn statistical patterns in language that let them generate coherent text. Key concepts include:

  • Tokens: The smallest units of text (words or subwords) processed by the model.
  • Transformer Architecture: Uses self-attention mechanisms to weigh relationships between tokens in a sequence.
  • Fine-Tuning: Adapting a general model to a specific domain, such as academic literature in dentistry, enhances precision.
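
The self-attention idea in the list above can be sketched in a few lines. This is a simplified illustration, not how any particular model is implemented: real transformers first pass token embeddings through learned query, key, and value projections, which are omitted here, and the two-dimensional "embeddings" below are made-up numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short token sequence.

    Returns (outputs, weights): one output vector per token, plus the
    attention weights each token assigns to every token in the sequence.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings for three tokens (illustrative values only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings, embeddings, embeddings)
for token_weights in weights:
    print([round(w, 3) for w in token_weights])
```

Each printed row shows how strongly one token attends to the others; the rows always sum to 1.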

Applications in Endodontic Dentistry

Endodontic procedures involve diagnosing and treating diseases of the dental pulp and periapical tissues. LLMs can support clinicians by:

  • Information Retrieval: Summarizing guidelines from authoritative texts like Cohen’s Pathways of the Pulp.
  • Decision Support: Comparing treatment options and suggesting materials based on evidence.
  • Patient Communication: Generating clear explanations of procedures and aftercare instructions.

Case Study: ScholarGPT vs. ChatGPT-4o vs. Gemini

A recent study from Kırıkkale University compared three LLMs on endodontic apical surgery questions. ScholarGPT, an academic-tuned model, achieved 97.7% accuracy, outpacing ChatGPT-4o (90.1%) and Google Gemini (59.5%). The result suggests that a model specialized for academic literature can be more reliable on technical clinical questions, although a single study on 30 questions cannot establish this in general.
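
Accuracy differences between models like these are commonly checked with a chi-square test of independence, the statistic reported in the key points. The sketch below shows the calculation on a models-by-outcome table. The counts are invented for illustration and are not the study's data, so the output is not expected to match the reported χ² of 22.61; the closed-form p-value used here is valid only for two degrees of freedom (three models, two outcomes).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if model and outcome were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df2(statistic):
    """Upper-tail p-value of a chi-square distribution with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical [correct, incorrect] counts per model, NOT the study's data.
counts = [
    [90, 10],  # model A
    [80, 20],  # model B
    [60, 40],  # model C
]
stat = chi_square_statistic(counts)
print(round(stat, 3), p_value_df2(stat) < 0.05)
```

A p-value below 0.05 indicates that the accuracy rates are unlikely to be equal across models; it does not by itself say which pairs of models differ.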

Benefits and Limitations

Benefits:

  • Rapid access to summarized evidence.
  • Consistent decision support for common procedures.
  • Scalable training materials for dental education.

Limitations:

  • Reliance on available training data; may omit paywalled studies.
  • Potential for outdated or incomplete information.
  • Need for human oversight to catch errors and address ethical concerns.

Guidelines for Safe Use

  1. Verify Citations: Cross-check AI-generated references with primary literature.
  2. Limit Scope: Use LLMs as adjuncts, not sole decision-makers.
  3. Maintain Privacy: Do not share patient-identifiable data with AI services.
  4. Continuing Education: Stay informed about model updates and validation studies.
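
As one way to act on the privacy guideline, a clinic could strip obvious identifiers from a note before it is pasted into an AI service. The sketch below is an assumption-laden illustration: the three patterns (email, US-style phone number, slash-separated date) are hypothetical examples, they miss names, addresses, and record numbers, and they are no substitute for a validated de-identification process.

```python
import re

# Hypothetical patterns for illustration; real notes need far broader coverage.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
]


def redact(text):
    """Replace each matched identifier with a fixed placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


note = "Call 555-123-4567 or mail jo@example.com, seen 03/14/2024."
print(redact(note))
```

Free-text clinical notes usually contain identifiers that simple patterns cannot catch, so a human review step remains necessary before anything leaves the practice.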

Future Directions

Advancements in domain-specific training may yield even higher accuracy for dental subspecialties. Combining LLMs with imaging analysis tools and electronic health records could create integrated clinical AI systems, further enhancing patient care and research translation.

Assessment of various artificial intelligence applications in responding to technical questions in endodontic surgery