
Machine Assisted Proof [video]

223 points by stefanpie, over 1 year ago

16 comments

bagels, over 1 year ago
One-hour talk, mostly going over the history of machine proofs and machine assisted proofs.

Conclusions (from slides at 49:56):

Computers by themselves still seem unlikely to resolve major mathematical problems on their own.

However, they are increasingly being used to generate (sic) assist human mathematicians in a variety of creative ways, beyond just brute-force case checking or computation.

For instance, we have seen they can be useful at generating conjectures or uncovering intriguing mathematical phenomena.

Automated provers could also be used to explore the space of proofs itself, beyond the small set of "human-generatable" proofs that often require one to stay close to other sources of intuition, such as existing literature or connections to other ways of thinking.

While AI technology shows great potential, in the immediate term, I expect it to have the most impact on tasks peripheral to mathematical research rather than central to it, such as automatically summarizing large amounts of literature or suggesting related work.

Proof formalization continues to make steady improvements in speed and ease of use. The "de Bruijn factor" (the ratio between the difficulty of writing a correct formal proof and a correct informal proof) is still well above one (I estimate ~20), but dropping. Once AI integration takes place, this factor could potentially drop below one, which would be transformative to our field.
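For readers unfamiliar with what "formalization" looks like in practice, here is a minimal Lean 4 sketch (an illustration, not something from the talk) of turning a one-sentence informal claim, "addition of natural numbers is commutative", into a machine-checked statement. It leans on the core library lemma Nat.add_comm; a from-scratch proof by induction would be noticeably longer, which is roughly what the de Bruijn factor measures.

    -- A one-line informal claim, stated and proved formally in Lean 4.
    -- Nat.add_comm is a core library lemma; proving it from scratch by
    -- induction would take several more lines.
    theorem add_comm' (m n : Nat) : m + n = n + m :=
      Nat.add_comm m n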
npunt, over 1 year ago
The coolest thing about Machine Assisted Proofs is the idea of expanding the number of collaborators on a single large problem by orders of magnitude, by breaking up the work into small machine-verifiable pieces.

This to me is one of the exciting core capabilities that AI unlocks: the ability to verify/enforce a set of standards across many inputs. It's one of the information age's core problems. One example used a lot in the social sciences is inter-rater reliability (IRR) [1]. There are probably a lot of domains out there that can benefit from this pattern of machines verifying distributed human inputs, both in crunch-the-numbers settings like machine assisted proofs and in more subjective domains where the subjectivity can be extremely carefully defined.

I'd love to see more AI tooling focused on these kinds of large-scale multi-contributor problem solving methods, including the idea of knowing how to correctly state the overall problem (I believe this was an example in the video).

[1] https://en.wikipedia.org/wiki/Inter-rater_reliability
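As a concrete handle on the IRR example, here is a minimal sketch (not from the comment) computing Cohen's kappa, one standard inter-rater reliability statistic, with scikit-learn; the two raters' labels are invented for illustration.

    # Minimal illustration of inter-rater reliability: Cohen's kappa for two
    # raters labelling the same ten items. The labels below are made up.
    from sklearn.metrics import cohen_kappa_score

    rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
    rater_b = ["yes", "no",  "no", "no", "yes", "no", "yes", "yes", "yes", "no"]

    # kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    print(cohen_kappa_score(rater_a, rater_b))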
jumploops, over 1 year ago
It's refreshing to see someone like Terence Tao embrace GPT-4 as an assistant.

A few friends of mine (who are ML practitioners!) still don't trust LLMs or find them useful in their day-to-day work.

I've noticed a similar trend among folks who work on compilers, preferring to "stay in their lane" instead of embracing GPT-4. The opposite appears true among those who work on user-facing applications, where adoption of LLMs in day-to-day work is much higher.
munchler, over 1 year ago
Theorem provers like Lean are awesome. As a software guy who admires mathematics from a safe distance, I find the Curry-Howard isomorphism one of the most beautiful and surprising things I've ever seen. Programs are proofs!
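A minimal Lean 4 illustration of the "programs are proofs" slogan (an added example, not the commenter's): read the type A → B → A as the proposition "A implies (B implies A)"; the constant function inhabiting that type is its proof.

    -- Curry-Howard in one definition: the proposition A → B → A is proved by
    -- the program that takes a proof of A, ignores the proof of B, and
    -- returns the proof of A, i.e. the constant function.
    theorem imp_example (A B : Prop) : A → B → A :=
      fun a _ => a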
pfdietz, over 1 year ago
I would like to see computational assistance for formalizing existing mathematical results. The goal would be a system that could take a paper and output its proofs in formalized form. Ultimately, this would enable us to vet all published mathematical results and deliver them as formal proofs, which could then be used as training inputs for future proof systems.
wenc, over 1 year ago
I'm not proving anything, but I'm using ChatGPT to come up with mathematical formulations for mixed-integer programming (MIP) optimization problems. It's ingested all the techniques from journal papers, so it actually comes up with some really solid formulations based on what I tell it. Super useful!
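For readers who haven't seen one, a MIP formulation is just an objective plus linear constraints over variables, some of which are restricted to integer values; the 0-1 knapsack below is a minimal example of the genre (chosen here for illustration, not a formulation from the comment).

    % A minimal MIP formulation: 0-1 knapsack with item values v_i,
    % weights w_i and capacity W. The integrality constraint on x_i is
    % what makes it a MIP rather than a plain linear program.
    \begin{align*}
      \max_{x} \quad & \sum_{i=1}^{n} v_i x_i \\
      \text{s.t.} \quad & \sum_{i=1}^{n} w_i x_i \le W, \\
                        & x_i \in \{0, 1\}, \quad i = 1, \dots, n.
    \end{align*}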
cubefox, over 1 year ago
It is interesting to note that combining formal proof checkers with machine learning could apparently be used to create synthetic training data for automatic self-training of proof generators. It could work something like this: https://news.ycombinator.com/item?id=38036986

Since formal proof checkers provide a reward signal (correct/incorrect proof), the above process could scale to superhuman performance in generating proofs for conjectures. This is unlike ordinary language models, which only try to imitate the human-written training distribution.
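A hedged sketch of the loop being described: sample candidate proofs from a model, keep only the ones a formal checker accepts, and fine-tune on the survivors. The helpers sample_proofs, checker_accepts and fine_tune are hypothetical stand-ins for a proof-generating model's sampler, a formal proof checker (e.g. a Lean kernel wrapper) and a fine-tuning step, not a real API.

    # Sketch of verifier-filtered self-training ("expert iteration").
    # sample_proofs, checker_accepts and fine_tune are hypothetical
    # placeholders, not real library calls.
    def self_train(model, conjectures, rounds=3, samples=16):
        for _ in range(rounds):
            verified = []
            for conjecture in conjectures:
                for proof in sample_proofs(model, conjecture, n=samples):
                    # The checker supplies the reward signal: a kept example
                    # is synthetic but guaranteed to be a correct proof.
                    if checker_accepts(conjecture, proof):
                        verified.append((conjecture, proof))
            # Train only on machine-verified proofs, then repeat.
            model = fine_tune(model, verified)
        return model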
the_panopticon, over 1 year ago
In the same spirit as this talk, there's a good HN post (https://news.ycombinator.com/item?id=38035672) on Terry Tao's use of proof assistants and LLMs to find a bug in a paper: https://terrytao.wordpress.com/2023/11/18/formalizing-the-proof-of-pfr-in-lean4-using-blueprint-a-short-tour/
zero-sharp, over 1 year ago
Part of mathematics is formalization. There's no doubt that computers will help us with proofs more and more, but there's also the "discovering" of what statements we should be taking for granted. That involves things like intuition, insight, and intention. People didn't have a "least upper bound property" hundreds of years ago, even though they were studying the real numbers.

Just wanted to remind people that we haven't sucked the human element out of math.
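For reference, the property the comment alludes to, in its standard modern statement (this formulation is the textbook one, not quoted from the talk):

    % Least upper bound (completeness) property of the reals: every nonempty
    % subset that is bounded above has a least upper bound.
    \begin{gather*}
      \forall S \subseteq \mathbb{R}:\;
      \bigl(S \neq \emptyset \,\wedge\, \exists b \in \mathbb{R}\ \forall s \in S,\ s \le b\bigr) \\
      \Longrightarrow\;
      \exists u \in \mathbb{R}\ \Bigl[\bigl(\forall s \in S,\ s \le u\bigr) \,\wedge\,
      \bigl(\forall b \in \mathbb{R},\ (\forall s \in S,\ s \le b) \Rightarrow u \le b\bigr)\Bigr].
    \end{gather*}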
spaduf, over 1 year ago
Terence talks a lot about his experiences with Lean, and the field more generally, on his Mastodon over at @tao@mathstodon.xyz
Xcelerate, over 1 year ago
Terence mentions how AI can potentially assist with finding mathematical proofs in the last few slides, but I wonder what his thoughts are on going the opposite direction, i.e., using mathematics to accelerate the development of AI. If we consider Levin search to be the "brute-force" approach to universal search, then Hutter search improves upon this by interleaving the search of algorithm/program space with a search of the space of formal mathematical proofs to find proofs of algorithmic equivalence, whereby we substitute the slower-running algorithm with its faster equivalent.

But can we do even better? Terence touches upon automated theorem proving, which to some degree formalizes the notion of searching proof space. Once formalized, this becomes its own area of mathematics, where now the goal is to find proofs of the fastest ways to search the subset of proofs relevant to fast proof-finding.

It's always been a bit odd to me that we defaulted to considering probabilistic approaches to universal search rather than mathematical ones; these are all mathematical structures at the core of it after all, so I would think deduction would be much more effective than induction in this domain. But then again, who knows—GPT seems to be quickly getting better at providing novel intuitive ideas to explore. And I guess induction is actually kind of necessary when determining which potential path of deduction to explore.

Another question is whether there exist algorithms that are the most efficient at solving particular classes of problems but unprovably so within any reasonable formal system, in which case automated theorem proving wouldn't help. I kind of doubt it, though. And in that case, brute-force search certainly isn't going to find those algorithms either, so they may as well not exist within our "lightcone" of reachable algorithms that could be used to accelerate universal search.

The only exception would be if these algorithms do exist and are actually dense in program space—in that case I suppose you would then have to optimize the balance of time spent between searching proof space and algorithm space. Again, I would intuitively think this possibility is highly unlikely, and so we should instead spend all of our time on improving machine assisted proof systems if we want to accelerate AI development via improvements to universal search.
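For readers who haven't met Levin search, here is a toy, runnable sketch of the allocation idea over a tiny program space (short Python arithmetic expressions in one variable): in each doubling phase, a candidate of length L gets a 2^(-L) share of the budget, so shorter programs are effectively tried first. It only illustrates the brute-force baseline the comment starts from, not Hutter search or any proof-guided refinement.

    # Toy illustration of Levin-style universal search. "Programs" are short
    # Python expressions in x; correctness is checked against input/output
    # examples. Not Hutter search; no proof search is involved.
    from itertools import product

    TOKENS = ["x", "1", "2", "+", "*"]

    def candidates(max_len):
        # All token strings up to max_len that at least parse as expressions.
        for n in range(1, max_len + 1):
            for toks in product(TOKENS, repeat=n):
                src = "".join(toks)
                try:
                    compile(src, "<candidate>", "eval")
                except SyntaxError:
                    continue
                yield src

    def solves(src, examples):
        try:
            return all(eval(src, {"x": x}) == y for x, y in examples)
        except Exception:
            return False

    def levin_search(examples, max_len=5, phases=12):
        for phase in range(phases):
            budget = 2 ** phase
            for src in candidates(max_len):
                # Levin allocation: only spend effort on a candidate once its
                # 2**(-length) share of the phase budget is at least one step.
                if budget * 2.0 ** (-len(src)) < 1:
                    continue
                if solves(src, examples):
                    return src
        return None

    if __name__ == "__main__":
        # Recover an expression computing f(x) = 2*x + 1 from examples.
        print(levin_search([(0, 1), (1, 3), (5, 11)]))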
Sakos, over 1 year ago
On that note, I've been using ChatGPT to (re-)teach myself mathematical induction and how to work with proofs. It's like having my own tutor. There is huge potential here just for helping people learn new topics and skills.
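As a reminder of what a first induction exercise looks like (a standard textbook example, not one taken from the comment):

    % Claim: 1 + 2 + ... + n = n(n+1)/2 for every positive integer n.
    % Base case (n = 1): both sides equal 1.
    % Inductive step: assuming the claim for n,
    \sum_{k=1}^{n+1} k \;=\; \frac{n(n+1)}{2} + (n+1) \;=\; \frac{(n+1)(n+2)}{2},
    % which is exactly the claim for n + 1.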
practal, over 1 year ago
It's great that top mathematicians are now interested in machine assisted proof.

Love that my name appears in the talk (in small print) :-D

Now I just have to somehow find funding for my project, Practal [1].

[1] https://practal.com
HanClinto, over 1 year ago
I've been tinkering with a hobby project lately, and I wonder how related it is to this.

I've been wanting to build a tool to help find and work with combos for deck builders of card games -- specifically Magic: The Gathering, but it could also apply to other games.

There is a wonderful dataset of combos available for download from Commander Spellbook -- currently boasting over 26k combos in the database, and growing all the time.

One thought I've had is to train my own embedding model so that cards that are likely to combo with each other embed closely with one another. This way, even after new cards are printed, we can rapidly discover cards that are likely to combo with them. In practice, my first attempt at fine-tuning my own embedding model proved lackluster, but I intend to refine my data and try again -- possibly after pre-training.

A second thought is to fine-tune an LLM on the text of existing combos -- give it the text of each card in the combo, and then train it to predict the rest of the interactions. This is cool and all, but I don't entirely know how to train it to (reliably) give "these cards don't combo" answers -- I fear that it would tend to hallucinate for cards that don't combo, and I don't know how to handle that.

Obviously any answers that come out of this system would need to be vetted by humans before being added to the database, but it feels like this could be an interesting way to explore the game space if nothing else.

In a related way, it feels like a mathematical proof begins with a set of starting conditions and a conjecture, and then works forward using established rules. In a similar way, a combo in Magic starts with a set of starting conditions and a conjecture ("this combo will result in infinite life" or "this combo will result in infinite damage"), and then works forward to detail the process of using established rules to accomplish the conjecture.

Anyways, it's an interesting use case to me, and I'm excited to learn more about the parallels. I don't know whether my embedding-model or my LLM approach is worthwhile, and I would like to learn about other tactics I might employ!
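On the embedding idea specifically, here is a minimal retrieval sketch using an off-the-shelf sentence-transformers model: embed card text and rank other cards by cosine similarity as candidate combo partners. The card texts are placeholders, and the generic model stands in for the fine-tuned, combo-aware model the comment is aiming for (which might, for example, use a contrastive objective over known Commander Spellbook pairs).

    # Rank cards by text similarity as rough combo-partner candidates.
    # Card texts below are invented placeholders, not real card data.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    cards = {
        "Card A": "Untap target permanent whenever you gain life.",
        "Card B": "Whenever this creature becomes untapped, you gain 1 life.",
        "Card C": "Draw a card at the beginning of your upkeep.",
    }

    model = SentenceTransformer("all-MiniLM-L6-v2")
    names = list(cards)
    # Unit-normalized embeddings, so dot product equals cosine similarity.
    emb = model.encode([cards[n] for n in names], normalize_embeddings=True)

    def likely_partners(card_name, top_k=2):
        # Rank every other card by cosine similarity to the given card's text.
        i = names.index(card_name)
        sims = emb @ emb[i]
        order = np.argsort(-sims)
        return [(names[j], float(sims[j])) for j in order if j != i][:top_k]

    print(likely_partners("Card A"))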
paulpauper, over 1 year ago
Interesting. I wonder how this could be applied to solving any of the Millennium Prize Problems.
rq1, over 1 year ago
Terence Tao is a machine to me. :)