3 points by trott 8 months ago

1 comment

trott 8 months ago

TLDR:

Physicians scored 73.7. Physicians armed with GPT-4 scored 76.3. But GPT-4 alone scored 89.2.

The authors think it's unlikely that the materials are in the GPT-4 training data, because the cases have never been publicly released.


LLMs and Diagnostic Reasoning: A Randomized Clinical Vignette Study [pdf]
