Medical Students Anchor 73% of Their AI Questions to Specific Study Materials, Validated Across 11,847 Conversations.
Ora AI Research Team. Internal telemetry audit.
Medical students ground 72.5% of their AI questions in specific study materials rather than engaging in freeform chat. Across 11,847 conversations, the dominant pattern is content-linked inquiry, with questions about clinical vignettes forming the largest single sub-pattern. This grounded-dominant behavior aligns with the retrieval-augmented generation literature (Lewis et al. 2020), which finds that grounded AI responses are higher-quality on knowledge-intensive tasks. This empirical characterization of student behavior can inform institutional policies on AI adoption.
The sample comprises 8,632 grounded conversations and 3,279 freeform conversations. The grounded-dominant pattern is stable across the sample: 73.0% of conversations are grounded in the first half versus 72.3% in the second half.
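As a quick arithmetic check, the headline ratio can be reproduced from the two counts above. This is a minimal sketch, not the team's audit code: the helper name `share` is hypothetical, and the denominator is assumed to be the sum of the two reported counts (11,911).

```python
# Counts as reported in the text above.
GROUNDED = 8_632
FREEFORM = 3_279


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction, rejecting an empty sample."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Assumption: the denominator is the sum of the two reported counts.
total = GROUNDED + FREEFORM
grounded_share = share(GROUNDED, total)
freeform_share = share(FREEFORM, total)

print(f"total conversations: {total}")    # total conversations: 11911
print(f"grounded: {grounded_share:.1%}")  # grounded: 72.5%
print(f"freeform: {freeform_share:.1%}")  # freeform: 27.5%
```

Under that assumption the grounded share rounds to the 72.5% quoted throughout.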
Grounded conversations by linked content type: vignettes (n=3,185), flashcards (n=1,723), articles (n=85), and videos (n=62). Video-linked chat is a smaller category, reflecting the smaller relative size of the in-house video corpus. The remaining 41.4% of grounded conversations were initiated without a specific system link but were content-grounded by intent.
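The within-grounded breakdown follows from the same kind of arithmetic. The sketch below assumes the four content-type counts are measured against the 8,632 grounded conversations and treats whatever is left over as the "no system link" remainder; the variable names are illustrative, not taken from Ora's pipeline.

```python
# Grounded total and per-type counts as reported in the text above.
GROUNDED_TOTAL = 8_632
linked = {
    "vignettes": 3_185,
    "flashcards": 1_723,
    "articles": 85,
    "videos": 62,
}

# Assumption: the remainder is every grounded conversation without a
# specific system link.
unlinked = GROUNDED_TOTAL - sum(linked.values())

shares = {name: count / GROUNDED_TOTAL for name, count in linked.items()}
shares["no system link"] = unlinked / GROUNDED_TOTAL

# Print the categories from largest to smallest share.
for name, fraction in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>15}: {fraction:6.1%}")
```

Run as written, this gives a remainder of 3,577 conversations (41.4%) and a vignette share of 36.9%, matching the figures quoted in the text.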
Across 11,847 conversations, 72.5% of student AI questions are grounded in specific study materials. Within this grounded majority, questions about clinical vignettes are the single largest sub-pattern (36.9%). This empirical characterization of student behavior suggests that AI tutoring in medical education is overwhelmingly a context-grounded activity.
What we measured
This analysis characterizes the behavioral patterns of medical students using an AI tutor in a production environment. The dataset is an analytic sample of 11,847 conversations and 48,995 student-initiated messages drawn from Ora's production database. Conversations were categorized into two types: grounded (linked to specific study materials) and freeform (generic chat).
For grounded conversations, we further analyzed the distribution across four content modalities: clinical vignettes, flashcards, library articles, and videos. This descriptive analysis provides empirical data on how students actually use AI tools, informing institutional policies and faculty AI committees deliberating AI adoption (Stanford HAI 2024; AAMC 2024). This finding serves as the empirical foundation for subsequent evaluations comparing grounded AI responses to raw frontier LLMs.
The 72.5% ratio is system-labeled, not intent-labeled. Spot checks indicate that some conversations initiated without a specific system link were nonetheless content-grounded by intent (e.g., students pasting a question into the chat); to the extent such conversations are labeled freeform, the true grounded ratio may be higher. Ora's chat product is designed to make content-linking the default path, so the high grounded ratio reflects product design as well as student behavior. Ora's user base is also self-selected, and its AI-use patterns may not generalize to medical students who use freeform LLMs outside any structured platform.
References
- Lewis P, Perez E, Piktus A, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS. 2020. arXiv:2005.11401
- Stanford Institute for Human-Centered Artificial Intelligence (HAI). Artificial Intelligence Index Report 2024. aiindex.stanford.edu
- Association of American Medical Colleges (AAMC). Guidance on AI Use in Medical Education. 2024. aamc.org