Show HN: I built the first open LLM for Australian law

40 points by ubutler, over 1 year ago

5 comments

ubutler, over 1 year ago
Hey HN,

Last month, I had the honour of seeing my article on how I built the largest open database of Australian law (https://umarbutler.com/how-i-built-the-largest-open-database-of-australian-law/) reach the front page. Due in large part to the outpouring of support and encouragement I received for my work from HN, I became determined to publish the first open LLM for Australian law by training a model on my database. I am excited to share that I finally achieved that goal today with the release of Open Australian Legal GPT2, a finetune of GPT2 trained on 37,560 laws and regulations, comprising 635,482,112 tokens, taken from my database.

Although it may not be as large as I had originally hoped, I'm still quite proud of the model. It was a struggle to wade through mountains of options trying to find something that worked. And now I have code I can reuse for training any other causal language model and dataset. The model is thus a small but important step towards maturing the legal AI field here in Australia.

If you’re interested in playing around with the model, you can find it here on Hugging Face: https://huggingface.co/umarbutler/open-australian-legal-gpt2
gitgud, over 1 year ago
Is there a link to try it?

This is literally the main application of ML that I’ve been awaiting for years: making complex legislation and bureaucracy searchable and useful for people with no context.
vermaat, over 1 year ago
How does your own trained LLM compare with using, for example, GPT4 + RAG (vector DB + your Australian law DB)?
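For readers unfamiliar with the "RAG" half of this comparison, the retrieval step can be sketched roughly as below. This is a toy illustration under stated assumptions, not anything built by a participant in this thread: the three-document corpus, the document IDs, and the function names `embed`, `retrieve`, and `build_prompt` are all hypothetical, and a bag-of-words count vector stands in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; placeholder sentences, not real statutory text.
DOCUMENTS = {
    "doc-a": "A tenant must give written notice before ending a residential lease.",
    "doc-b": "A company director must not trade while the company is insolvent.",
    "doc-c": "A driver must not exceed the posted speed limit on a public road.",
}


def embed(text):
    """Stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, k=1):
    """Return the IDs of the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: cosine(query_vec, embed(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, k=1):
    """Prepend the retrieved passages to the question before calling a model."""
    context = "\n".join(DOCUMENTS[doc_id] for doc_id in retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("When is a director liable for insolvent trading?"))
```

In a setup like this, the resulting prompt would be sent to a general-purpose model, so the legal text arrives at query time through retrieval. A finetune such as the one announced here instead learns from the corpus during training.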
Obscurity4340, over 1 year ago
What do you think about DevonThink?
RecycledEle, over 1 year ago
One of the best current use cases for LLMs is to point out possible errors in human-produced work.

I would love to see every small-town judge forced to submit a complete recording of the trial, a draft opinion, and what the AI thought of it before submitting their final ruling. All of this should be on the public record.

The problem is that, at least in Texas, the justices of the peace (JPs) would refuse to record, or would erase recordings, if the evidence hurt their buddy.