“Don Knuth Plays with ChatGPT” but with ChatGPT-4

223 points by LifeIsBio almost 2 years ago

17 comments

LifeIsBio almost 2 years ago

This is a reference to: https://news.ycombinator.com/item?id=36012360

blazespin almost 2 years ago

The sequence of these two threads is just too perfect. It's almost as if someone is trying to make a point.

kibwen almost 2 years ago

>> What is the most beautiful algorithm?

> Quicksort Algorithm

Definitive proof that AI must be stopped. Ranking quicksort as more elegant than heapsort?!

jameshart almost 2 years ago

Worth noting also that, while asking Bing chat to "Tell me what Donald Knuth says to Stephen Wolfram about chatGPT" doesn't (yet) produce exactly the right result, it produced the following answer when asked what Donald Knuth says about chatGPT:

> Donald Knuth, a computer scientist and mathematician known for his contributions to the field of computer programming, particularly in the area of algorithms and data structures, has expressed some skepticism about the potential of artificial intelligence to achieve true human-level intelligence and creativity[1]. He once conducted an experiment with chatGPT where he posed 20 questions to it and analyzed its responses[1]. Is there anything specific you would like to know about his views on GPT?

With [1] being a citation link to https://cs.stanford.edu/~knuth/chatGPT20.txt

ryanseys almost 2 years ago

It now knows to communicate that the NASDAQ doesn't operate on Saturdays.

erwincoumans almost 2 years ago

It makes you wonder why Knuth bothered with an outdated ChatGPT version. He couldn't find someone with access to GPT-4?

benatkin almost 2 years ago

Reminds me of that time AlphaGo got its ass handed to it multiple times, and then a short while later...

ec109685 almost 2 years ago

Interesting that both completely whiff on the number of chapters in The Haj.

fnordpiglet almost 2 years ago

What I find amazing about the original exchange was the profound lack of curiosity Knuth demonstrated. Because the model wasn’t flawless in performance he pinned it as a curiosity that was good at grammar and vacuous otherwise and wasn’t interested to hear how it improves. This reminds me of an awful lot of the computing field in this drama as it plays out. People that literally know how implausible any of these feats have been using traditional approaches immediately discount the entire thing the moment it hallucinates - and it feels like the more deterministic the bent of the person the more absolutely dismissive they are of what’s transpiring in front of us.

These models are doing feats that are stupendous and were impossible before their advent. Not just a little bit, but the capability differences are so vast that it’s perhaps not even recognizable by people as being as vast as it is. I am impressed that Wolfram seems to have immediately grasped its significance and is running with it.

This gist demonstrates that essentially every single flaw was addressed. But that Knuth apparently doesn’t know or care, months after GPT-4’s introduction, is demonstrative of a different type of personality.

I know which I aspire to be.

SomewhatLikely almost 2 years ago

Thank you for specifying ChatGPT-4. So many commenters on the web say they used GPT4 without specifying if they're using the ChatGPT version. ChatGPT-4 is specifically aligned for answering questions better than the base GPT4 model.

dotancohen almost 2 years ago

I would not be surprised if these questions become some form of canonical test for future language models.

Obviously, being the work of Knuth, they are extraordinarily insightful in peeling back the first layer of the answer and providing insight into the underlying properties of both the model itself and the dataset on which it was trained. It also tests the ability to compute (not recite) very specific facts (e.g. when the sun will be directly above Japan), so it checks whether subroutines and ephemerides specific to this type of data exist.

But beyond the obvious technical merit - there is an allure to basing our tests on those whom we respect. I used a similar - but far less sophisticated - set of questions when first exploring ChatGPT. But nobody will be drawn to Dotan Cohen's language model benchmarks - rightfully so. The name Knuth has such reverence in the field that I foresee this test, and variations on it to prevent rigging, becoming a canonical test of language models.

billylo almost 2 years ago

You made me curious about how Bard would respond to them. Here they are: https://gist.github.com/billylo1/bb717512d2d5145ce7eec02d055de50e

Notable: Bard struggles in similar ways. It does mention the NASDAQ close at 12,043.59 on Friday, May 20, 2023.

underdeserver almost 2 years ago

Interesting that it didn't get the 5-letter word sentence right.

bpicolo almost 2 years ago

Most importantly, much better wonton recipe.

8thcross almost 2 years ago

That's a shitload of difference from its previous version!

cratermoon almost 2 years ago

Literary Libations: https://cratermoon.substack.com/p/the-literary-libations

axpy906 almost 2 years ago

Nailed every one. Some by saying not possible to answer but still.
