It is suggested that, again, this is an effect of training towards "sounding good" as opposed to truthfulness.<p>The results in an image: <a href="https://dl.acm.org/cms/10.1145/3706598.3713470/asset/95dbaf80-41a6-4e24-bd6a-01d76f59834a/assets/images/medium/chi25-387-fig4.jpg" rel="nofollow">https://dl.acm.org/cms/10.1145/3706598.3713470/asset/95dbaf8...</a><p>--<p>From the actual paper ( <a href="https://dl.acm.org/doi/10.1145/3706598.3713470" rel="nofollow">https://dl.acm.org/doi/10.1145/3706598.3713470</a> ):<p>> <i>We used ChatGPT-4o to generate the LLM-generated prompts, while UK-based lawyers generated the lawyer-generated advice</i><p>It would have been nice to also have lawyers assess the LLM output...
It would be good to know whether ChatGPT is better or worse than an average lawyer. I'd bet that when it comes to more obscure legal knowledge, it's unbeatable.