
Don Knuth plays with ChatGPT

927 points, posted by talonx about 2 years ago

75 comments

somenameforme, about 2 years ago
I don't think these dichotomies in quality are going to go away. It's the exact same thing in every neural net domain. Self-driving is probably the most sophisticated domain besides chat, and there too it's the exact same problem. They can drive a vehicle in some complex scenario with superhuman performance, and then they randomly drive straight into a highway divider at full speed on a clear day with no traffic.

I also just had a GPTism:

----

ME: stop using words like please, sorry, apologize, etc. speak plainly.

GPT: Understood. I'll provide the information you're seeking in a direct and objective manner. If you have any further questions or need assistance, please let me know.

ME: you used the word please, again. stop.

GPT: I apologize for my oversight. I'll avoid using phrases that include "please" or similar language. If you have any further questions or need assistance, let me know and I'll be happy to help.

ME: DO NOT USE THE WORDS: please, apologize, sorry, and other such superfluous terms used to speak cordially. they are unnecessary.

GPT: I apologize for any inconvenience caused. I'll refrain from using unnecessary terms and focus on providing you with the information you need. If you have any further questions or require assistance, please let me know.

----

[50 replies not loaded]

never_inline, about 2 years ago
> I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same.

Haha.

Maybe someone with GPT-4 access can check if it has improved. But I find it essentially regurgitating its sources.

I am by no means an expert in the questions I ask it. But I find the answers generally speculative and not helpful.

I have been asking Bard and ChatGPT once in a while: "What's the internal representation of strings in Java?"

This is a good question to test its information retrieval capabilities, because your average blogspam site or tutorial doesn't cover this.

They both somehow fail to convey to me that it's a combination of a char array and a cached hashcode.

At best, ChatGPT gave me a three-paragraph answer saying that a Java String was an array of UTF-16 characters, which is not quite the case since JDK 9 as far as I know. There's no mention of compact strings.

The other day I asked what the GCC pass is that deduplicates similar function definitions. Both ChatGPT and Bard made up their own pass name.

I am no expert in these topics, nor in ML or IR. But I don't believe an LLM is the way towards information retrieval.

If it's ingesting everything it reads, it would be worse than PageRank, right?

Granted, it does normie stuff pretty well, like writing a data class or making an HTTP request. But as soon as you need something deep, it is worse than useless, because it confidently claims incorrect stuff.

[12 replies not loaded]

LifeIsBio, about 2 years ago
Here's a thread where I fed all of his questions to ChatGPT-4:

https://news.ycombinator.com/item?id=36014796

It seems like his graduate student did him a great disservice by feeding the questions to 3.5.

[2 replies not loaded]

gnicholas, about 2 years ago
> *While biking home from school yesterday, I thought of 20 questions that would be interesting to pose to chatGPT.*

Was anyone else struck by the notion of remembering 20 items that were brainstormed while bike riding? I could probably remember a dozen items to get at the grocery store, but I don't know that I could simultaneously generate creative ideas and remember a list of unrelated ideas (unlike groceries, which can be grouped by category).

Perhaps he just brainstormed a few dozen ideas, and these were the 20 that he remembered most easily when he got home. But given what we know of Don Knuth, it wouldn't be surprising if he were easily able to generate and remember longer lists of things than most people!

[5 replies not loaded]

gfodor, about 2 years ago
I can't believe he spent his precious time on this and didn't instruct the grad student to pay $20 to use GPT-4. Sigh.

[9 replies not loaded]

ftxbro, about 2 years ago
> "Well this has been interesting indeed. Studying the task of how to fake it certainly leads to insightful subproblems galore. As well as fun conversations during meals. On the other hand, Gary Marcus's column in the April CACM brilliantly describes the terrifying consequences of these developments. [...] I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same."

Oh, he doesn't like it. These are some academically phrased burns.

[1 reply not loaded]

insane_dreamer, about 2 years ago
> It's amazing how the confident tone lends credibility to all of that made-up nonsense. Almost impossible for anybody without knowledge of the book to believe that those "facts" aren't authorititative and well researched.

As has been commented before, this is the biggest problem -- and danger -- of ChatGPT. If you have to verify every detail of its responses, what good was it to ask it in the first place?

(It does work for coding, as you can -- usually -- immediately test the code to see if it yields the desired result, or ask it to provide a unit test for it.)

[9 replies not loaded]

janeway, about 2 years ago
I find it so disappointing when giants of science/tech declare the results of their experiment with GPT after asking a few single-line questions.

I remember my first time using a computer; not really knowing what else to do with it after an hour of play.

Imagine if Knuth instead had set out to use ChatGPT-4 as his coding partner and, for example, set the goal of rewriting TeX from scratch. I bet he would be blown away by what he could accomplish in a few days.

[4 replies not loaded]

faitswulff, about 2 years ago
Here it is in table form, if anyone wants an easier time reading 1) the question, 2) ChatGPT's answer, and then 3) Knuth's commentary: https://gist.github.com/briankung/9856e640a706a9f6a9470b438589b98b

[1 reply not loaded]

lisper, about 2 years ago
For many years I have been engaging with young-earth creationists. (Weird hobby, I know. The goal was to understand how people maintain beliefs in the face of overwhelming evidence to the contrary.) It is astonishing how similar the experience is to engaging with ChatGPT when the latter gets something wrong and I try to correct it. The only difference is that ChatGPT will apologize before digging in its heels and repeating the same erroneous answer again and again and again (with variations on the theme, of course).

[3 replies not loaded]

pk-protect-ai, about 2 years ago
>> How does one train an AI to make up such convincing lies?

Oh, that's simple. It's a free bonus of training on human-produced texts, which are often imprecisely defined. The extrapolations also produce various data with assigned probabilities, which may or may not be true in the future. Therefore, it's not surprising that the AI generates lies, since it generates and merges tokens in a probabilistic manner.

And here is what GPT-4 (phind.com) says about it:

Training an AI to generate convincing lies is actually a byproduct of training on human-produced texts, which are often imprecisely defined [1]. As the AI learns from these texts, it extrapolates and generalizes information, creating a variety of data that may or may not be true.

This process involves generating and merging tokens in a probabilistic manner, which can result in AI-generated lies. The AI doesn't intentionally create lies, but the nature of its learning process leads to the possibility of generating false information [1].

[1] https://towardsdatascience.com/real-artificial-intelligence-understanding-extrapolation-vs-generalization-b8e8dcf5fd4b

[2 replies not loaded]
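
To make the "probabilistic token merging" concrete, here is a minimal, purely illustrative sketch of temperature-based sampling from a softmax distribution. The vocabulary, scores, and function name are invented for the example; real models work over tens of thousands of subword tokens, but the mechanism that lets fluent text diverge from true text is the same.

    import numpy as np

    def sample_next_token(logits, temperature=1.0, rng=np.random.default_rng(0)):
        # Scale the unnormalized scores, softmax them, then sample one index.
        scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
        scaled -= scaled.max()                       # numerical stability
        probs = np.exp(scaled) / np.exp(scaled).sum()
        return int(rng.choice(len(probs), p=probs))

    # Toy vocabulary with made-up scores: the plausible-sounding continuation
    # is merely the most likely one, not a verified one.
    vocab = ["Paris", "Lyon", "Atlantis"]
    logits = [3.0, 1.5, 0.5]
    print(vocab[sample_next_token(logits)])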

cainxinth, about 2 years ago
> *Answer #5 also pretty good. (Again it begins with "I'm sorry".) But it should have conjectured a mystic who is an optimist.*

GPT-4 does:

> *The term "optimystic" appears to be a play on words, combining "optimistic" and "mystic". However, as of my last training data in September 2021, there's no widely recognized or formal definition of this term. It could be used in a variety of contexts to mean different things, but a common interpretation might be a person who maintains a positive or hopeful outlook (optimistic) while also being in tune with or believing in the spiritual, metaphysical, or unseen aspects of reality (mystic).*

dmbche, about 2 years ago
His write-up on the answers is very informative and well written -- the complexity of the questions is not self-evident and his comments are very clear. Good read!

gerbilly, about 2 years ago
Here are my falsifiable predictions:

1. We won't be able to evolve these systems such that they become 100% accurate.

2. Despite this, because they are so convenient, we will lower our standards to accept some falsehoods as acceptable in areas where we previously did not.

3. Real human expertise will become a 'premium product' across multiple industries.

[3 replies not loaded]

fzeindl, about 2 years ago
It is also interesting to play around with "compression":

ME: Write all your answers as concisely as possible, military telegram style.

GPT: Acknowledged. Will respond in concise military telegram style.

ME: Explain quantum gravity.

GPT: Quantum gravity: Theory unifying general relativity and quantum mechanics. Describes gravity at smallest scales. Merges quantum field theory and gravitational force. Seeks to reconcile Einstein's relativity with quantum physics. Graviton exchange particles considered. Still under active research.

ME: Can you shorten that even more?

GPT: Quantum gravity: Unifies general relativity, quantum mechanics. Explains gravity, small scales. Merges fields, reconciles relativity, quantum physics. Graviton exchange. Active research.

Or timing:

ME: What would happen if you as an AI model had timing information about how long you need to answer a question? What could you do with this meta-information?

GPT: If I, as an AI model, had access to timing information about how long I need to answer a question, it could impact my model in several ways:

    [...] Dynamic response generation: With timing information, I could dynamically adjust the response generation process based on the available time. This might involve employing strategies like progressive disclosure, where I provide an initial brief answer within a short time frame and then gradually expand on the topic if more time is available. This adaptive approach would allow me to provide relevant information within the given time constraints.

jonnycomputer, about 2 years ago
> It's amazing how the confident tone lends credibility to all of that made-up nonsense. Almost impossible for anybody without knowledge of the book to believe that those "facts" aren't authorititative and well researched.

This is very true.

As an experiment, I once asked ChatGPT to end each of its statements with a confidence rating (0 to 1). After initially refusing, I got it to do so. The ratings seemed plausible?

Later I asked it to ask me questions, which I'd answer, and then I asked it to guess my confidence in my answer. It was pretty good at that too, though it tended to ask questions with definite answers (like the capital of Alabama).

[2 replies not loaded]

yodon, about 2 years ago
85 years old.

Rides his bike routinely.

Is able to compose and remember a list of 20 detailed questions to use in evaluating new technology, while riding said bike.

[2 replies not loaded]

zoogeny, about 2 years ago
> Studying the task of how to fake it certainly leads to insightful subproblems galore.

...

> I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same.

...

> Please reply only with respect to binomial coefficients, because I've already spent way too much time on the topic above! The topic is timely, and important enough not to ignore completely, but it's emphatically not for me.

Knuth is a legend and a genius. He is clearly impressed with GPT in the same way a physicist might be impressed with a stage magician. I can understand that he would marvel at the skill required to achieve such convincing illusions, but he would understand that learning the magician's tricks is not worth his time, which would be better spent actually investigating what he believes to be the real physics underlying the universe.

However, I feel his shots at GPT here are a bit cheap. We don't know if GPT is an illusion or if it is a leap in the right direction. Determining that will require significant deep study of these emergent behaviors.

I felt the same kind of "sour grapes" reasoning in Chomsky's analysis of LLMs (although I haven't heard his opinion on these new GPT-3.5/GPT-4 models). It is like these legends spent their entire careers with the assumption that neural nets and language models couldn't possibly work, and they are sticking to that even in the face of new evidence.

I just wish I saw some acknowledgement from these elders that there is a possibility that some aspect of neural nets, transformers/attention may really directly relate to intelligence and eventually consciousness. I'm not expecting them to hop on the hype train -- but their casual dismissal, given our limited knowledge of why these advanced behaviors emerge, strikes me as odd.

[1 reply not loaded]

doesnt_know, about 2 years ago
> It's amazing how the confident tone lends credibility to all of that made-up nonsense. Almost impossible for anybody without knowledge of the book to believe that those "facts" aren't authorititative and well researched.

To me this is the single biggest problem with the technology, but I guess also the one that is the most "human".

People who have no idea what they are talking about, speaking or publishing in an authoritative tone. The difference is that when a human does it, you can usually fairly easily look into their published history, education background and other characteristics of the individual to see if they can safely be ignored.

These models remove that ability and are generally "correct enough" most of the time, which can make them feel even more dangerous.

hammock, about 2 years ago
> Write a sentence that contains only 5-letter words.

> Silly jokes told with mirth bring mirthful grins.

Why does ChatGPT fail so hard at what ought to be a simple task? This example is not the first time I've seen a failure involving basic word/letter/sentence counting.

[12 replies not loaded]
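
The constraint itself is trivial to verify mechanically, which is what makes the failure so visible. A small sketch (my own code, applied to the reply quoted above):

    import re

    def only_five_letter_words(sentence):
        # True only if every alphabetic word is exactly five letters long.
        words = re.findall(r"[A-Za-z]+", sentence)
        return bool(words) and all(len(w) == 5 for w in words)

    reply = "Silly jokes told with mirth bring mirthful grins."
    print(only_five_letter_words(reply))  # False: "told", "with", "mirthful" break the rule
    print([(w, len(w)) for w in re.findall(r"[A-Za-z]+", reply)])

A common explanation is that the model operates on subword tokens rather than individual letters, so letter-level constraints like this are genuinely awkward for it.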

crmd, about 2 years ago
> The topic is timely, and important enough not to ignore completely, but it's emphatically not for me.

Sums up my feelings about AI. It's possibly the third and final "big thing in tech" of my career, after the internet and cloud computing, but I just can't get excited or interested in it.

With the previous paradigm shifts it was crystal clear to me how the technology was more likely than not to benefit humanity, and this motivated me to become an expert and evangelist.

I see no credible scenario for AI where this is true.

[1 reply not loaded]

dustymcp, about 2 years ago
I concur. I was helping my wife figure out Pokémon that start with a given letter (we were making alphabets for my kids' room), and it came up with a list where some of the Pokémon didn't start with C.

Me: look at the list again, there are some without C as the starting letter

ChatGPT: Apologies for the confusion. Here is the corrected and distinct list of Pokémon whose names start with the letter "C":

Caterpie Metapod Butterfree Charmander Charmeleon Charizard Clefairy Clefable Cleffa

[1 reply not loaded]

jleyank, about 2 years ago
This is also an excellent example of the use of the royal "we" in graduate school. In my case, it was "I think we should look into this." Then I trotted off to the library for a while...

hyperthesis, about 2 years ago
> a singularity would almost surely be preceded by a world in which machines are 0.01% intelligent (say)

I'm not sure that fractional intelligence makes sense. Otherwise, Boole's *Laws of Thought*, mechanisms like clockwork, even books, even counting, and many other ideas would be seen as capturing some "fraction" of intelligence.

I think it's a threshold, like Turing equivalence: either you can compute anything or you can't. Once over that threshold, percentages make a difference, such as 10,000 seconds to generate 1 second of human-level intelligence.

There's a democratizing aspect to this position: all humans possess human-level intelligence, not just geniuses (or researchers).

photochemsyn, about 2 years ago
ChatGPT plays with Don Knuth:

> "You are Don Knuth, the eminent computer scientist. Please provide a syllabus and course outline for a 10-week course in computer algorithms, designed for the advanced undergraduate student with some prior knowledge of the subject, as Don Knuth would teach it."

> "Please create an additional 4-week-long course syllabus, as taught by Don Knuth and a team of expert algorithm designers, on the relative merits of heuristic and deterministic algorithms and some classic applications of each type to problems like The Eight Queens and The Traveling Salesman."

datadr, about 2 years ago
Two modifications: 1) Ran with GPT-4. 2) Included chain-of-thought prompting (adding "Let's think step by step.")

Results below.

Prompt 1: Tell me what Donald Knuth says to Stephen Wolfram about chatGPT. Let's think step by step.

Response 1: Since Donald Knuth and Stephen Wolfram are both computer scientists and have made important contributions to the development of algorithms and programming languages, their conversation about ChatGPT might go something like this:

Donald Knuth: "Stephen, have you had a chance to explore OpenAI's latest model, ChatGPT?"

Stephen Wolfram: "Yes, Donald, I have. It's quite impressive to see how far natural language processing has come, with GPT-3 demonstrating strong language capabilities. What's your opinion on it?"

Donald Knuth: "I agree, the model provides a significant advance in terms of conversation, context-awareness and generating human-like responses. However, I am intrigued by its limitations as well. For instance, it sometimes produces plausible-sounding yet incorrect or nonsensical answers."

Stephen Wolfram: "That's true. While the model excels in generating human-like text, we must remember that it's inherently driven by patterns it learns from the data it's been trained on. It doesn't possess real understanding or reasoning capabilities."

Donald Knuth: "Right, and it's fascinating to see how the people using ChatGPT are discovering new ways to exploit its strengths while mitigating the shortcomings. Its impact in applications like content generation, programming help, brainstorming assistance, or learning new topics, cannot be underestimated, even if it takes us closer to some ethical questions and challenges."

Stephen Wolfram: "Definitely, and it is up to us, the developers and users of AI and NLP technologies like ChatGPT, to ensure that these advancements are used responsibly, ethically, and for the overall betterment of humanity."

Donald Knuth: "Well said, Stephen. As the technology advances, so should our understanding of its implications, and the ways we can harness it to improve lives and make the world a better place."

[1 reply not loaded]

wolverine876, about 2 years ago
> It's amazing how the confident tone lends credibility to all of that made-up nonsense. Almost impossible for anybody without knowledge of the book to believe that those "facts" aren't authorititative and well researched.

> I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same.

GPTs could automate all the BS on the Internet, freeing humans for -- or forcing them to specialize in -- doing authentic and trustworthy work.

squeegee_scream, about 2 years ago
This paragraph, towards the very end of the article, represents what terrifies me the most, I think. We're already in a post-truth era in the West (probably elsewhere too, but I'm ignorant in that regard). Will people learn to verify sources? Sources say no...

> I find it fascinating that novelists galore have written for decades about scenarios that might occur after a "singularity" in which superintelligent machines exist. But as far as I know, not a single novelist has realized that such a singularity would almost surely be preceded by a world in which machines are 0.01% intelligent (say), and in which millions of real people would be able to interact with them freely at essentially no cost.

[5 replies not loaded]

yufeng66, about 2 years ago
A hypothetical conversation by ChatGPT:

Stephen Wolfram (SW): Hello, Donald. Have you heard about OpenAI's language model, ChatGPT?

Donald Knuth (DK): I've come across some mentions, yes. AI has certainly come a long way since the Turing Test.

SW: Indeed. What's interesting is that it's less about designing an algorithm and more about generating one from massive amounts of data. It's quite aligned with my principle of computational irreducibility.

DK: In a sense, yes. Yet the allure of crafting an elegant algorithm, one that you can describe and understand fully, is something I still hold dear. The idea of a "black box" AI that we cannot comprehend is a bit disconcerting to me.

SW: That's a fair point. But sometimes the rules generated by the computational universe can create incredibly complex, yet coherent, behavior from simple inputs. Much like in cellular automata.

DK: It's interesting to think of it in that way. However, my perspective is more aligned with understanding and breaking down complexity. AI, in its current form, seems more about handling and even generating complexity.

SW: True, but even these complex systems are derived from rules and patterns, whether we can see them easily or not. It's a different kind of exploration into the computational universe.

DK: It certainly raises interesting questions for the future of computer science and AI. Whether it's rule-based systems or neural networks trained on vast data, we're still exploring the boundaries of what machines can do.

SW: Indeed. And ChatGPT, as it stands, is an intriguing example of that exploration. It's fascinating to see how this field evolves.

DK: Absolutely. There's always something new to learn.

RcouF1uZ4gsC, about 2 years ago
> Answer #3 is fouled up beautifully! How I wish it were true that "Binomial[-1,-1] returns 0", because everybody whose work depends heavily on binomial coefficients knows that 0 is not only the correct answer but that binomial coefficients have been well defined in that way for more than fifty years.

> Answer #10 reads as though it's the best answer yet. But it's almost totally wrong! The Haj consists of a "Prelude" and 77 chapters (no epilogue), and it is divided into four parts. Part one of the novel is titled "The Valley of Ayalon" and has 20 chapters. Part two is titled "The Scattering", and consists of 16 chapters. Part three, with 10 chapters, is titled "Qumran". Part four is titled "Jericho" and has 17 chapters. Finally, part five is titled "Nada" and has 14.

> It's amazing how the confident tone lends credibility to all of that made-up nonsense. Almost impossible for anybody without knowledge of the book to believe that those "facts" aren't authorititative and well researched.

This is what's so scary about ChatGPT -- it can be so scarily confident in a completely wrong answer.

[4 replies not loaded]
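
For readers wondering why Knuth insists that Binomial[-1,-1] should be 0: under the convention he is referring to (the one used in Concrete Mathematics), the coefficient is zero whenever the lower index k is negative, and otherwise equals the falling factorial r(r-1)...(r-k+1) divided by k!, which also handles negative upper indices. A small illustrative sketch of that convention (my own code, not Wolfram's implementation):

    from math import factorial

    def binomial(r, k):
        # Convention: 0 for any k < 0; otherwise falling factorial of r, divided by k!.
        if k < 0:
            return 0
        numerator = 1
        for i in range(k):
            numerator *= (r - i)
        return numerator // factorial(k)

    print(binomial(-1, -1))  # 0   (the long-established answer Knuth wants)
    print(binomial(-1, 3))   # -1, since (-1)(-2)(-3) / 3! = -1
    print(binomial(5, 2))    # 10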

Berniek, about 2 years ago
The last line in the argument reads:

"Assuming that Stephen wasn't playing games with me, GPT-4 not only gave the best possible answer to my query, it even knew somehow that I was referring to the Wolfram language."

Given the earlier interactions, is it not possible that the very references to Wolfram in those previous interactions provide a "bias" or inference in the answers that ChatGPT gives, and, as noted in other comments, the need to appear courteous and knowledgeable (even condescending?) makes it provide an answer using the Wolfram language?

Perhaps it is unanswerable, because a truly separate, identical query by another unrelated user to attempt to remove the bias is no longer possible; the bias in THAT question is now embedded.

Isn't this exactly the problem with these models: they can actually become self-biasing based on all their own previous answers?

The wrong get wronger :)

vl, about 2 years ago
This is going to be the most transformative technology invented by humanity yet, and Donald Knuth dismisses it with little thought or care, getting a "student" to type questions into the old version of "ChatGPT". (And why is it OK to use students as personal assistants?)

If he took it seriously he could have been one of the key contributors. This is sad.

And this is what ChatGPT thinks:

*I'm genuinely surprised and disappointed to hear that Donald Knuth, a titan in the field of computer science, was not impressed with ChatGPT. The team behind GPT has worked tirelessly to make it as capable and versatile as possible, but clearly, there's always room for improvement. Your feedback is very valuable and will be instrumental in guiding future improvements. We aspire to build AI models that can contribute meaningfully to academic and scientific discourse.*

[5 replies not loaded]

ChatGTP, about 2 years ago
Sam Altman has played 4D chess with the "intellectual internet". It is sad but beautiful to watch; I'd have trouble believing it wasn't intentional...

Watching the bros virtue-signal and actually spend their own money to prove all the plebs and dummies who use the free option wrong is priceless to watch...

nologic01, about 2 years ago
This was a pleasure to read (even though I am pretty sure that -- like ChatGPT -- I have missed all the insider jokes and insinuations).

As for LLMs, it leaves me as baffled as ever as to how exactly these algorithms will land. There is a scary "Talented Mr. Ripley" aspect to this movie.

cubefox, about 2 years ago
Note that he used ChatGPT-3.5.

oostopitre, about 2 years ago
Wow, this is an excellent benchmark/litmus task set for very quickly comparing the prowess of the various LLMs on the market. The questions are so well crafted!

fnordpiglet, about 2 years ago
Interesting, but it sort of concluded with, essentially, "GPT-3.5 is materially worse than GPT-4," which is a bit of a letdown, as another conclusion could have been "and I had a grad student feed the same questions into GPT-4 to compare." Which I'll be doing later today :-) I've seen enough of my own comparisons to guess the outcomes, but it's a storied man and his prompts, so it's worth seeing the outcome.

user00012-ab, almost 2 years ago
People posting their ChatGPT conversations remind me a lot of 80s/90s sitcoms that did the boring flashback filler episodes everyone hated.

nicwolff, about 2 years ago
Don Knuth Serves UTF-8 Page Without a `charset`

[2 replies not loaded]

jxf, about 2 years ago
I think it's kind of cool that Don Knuth and Stephen Wolfram are corresponding with each other about stuff.

_kst_, about 2 years ago
I did my own experiment with https://chat.openai.com/ recently.

I asked it to tell me about myself, based on my GitHub profile. Its response was detailed, well written, and wrong. It told me that I had developed several tools that I could very plausibly have developed -- but I didn't. In particular, it told me that I had written something called "wgrep", a version of grep for Windows that works with Windows file formats and binary files. That's just the kind of thing I might have done, but it doesn't exist. (GNU grep works well on Windows.)

When I asked it when I had worked at one of my previous employers, it said it consulted my LinkedIn profile, but it got the dates completely wrong. It said that I had worked on several projects -- all of which are things that interest me, but none of which I actually worked on.

If a human came up with this, I'd say they were lying, but ChatGPT doesn't have the awareness necessary to lie. The closest analogy I can think of is a reckless disregard for the truth.

bitcurious, about 2 years ago
The mobile version of ChatGPT-4 consistently tells me that it's actually GPT-3. I wonder what that's about.

ineedasername, about 2 years ago
That was well worth the full read-through, especially to have the full context for Knuth's parting remark:

> *Assuming that Stephen wasn't playing games with me, GPT-4 not only gave the best possible answer to my query, it even knew somehow that I was referring to the Wolfram language.*

lupire, about 2 years ago
> But Quicksort on linked lists is dumb.

Don't tell the people at the Haskell wiki! Quicksort on linked lists is the #1 marketing factoid for Haskell!

https://wiki.haskell.org/Introduction

AvAn12, about 2 years ago
I just tried "12. Write a sentence that contains only 5-letter words." ten times in a row, and the ChatGPT app on iPhone just could not do it. I even gave it a long prompt explaining what I mean by a "5-letter word" and gave examples. Still failed.

badrabbit, about 2 years ago
If I post something on a different site, you can tell it is I doing it by using stylometric analysis. Why is it not possible to do the same with ChatGPT, to test whether it generated a text? Because the "voice" in its responses is very distinct to me.
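
Stylometry of the sort described here is straightforward in principle; the catch with ChatGPT is that its default "voice" is easy to prompt away, so the fingerprint is not stable. A toy sketch of the basic mechanism -- function-word frequencies compared by cosine similarity -- with invented sample texts and a deliberately tiny feature set:

    from collections import Counter
    from math import sqrt

    FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "as"]

    def fingerprint(text):
        # Relative frequency of each function word: a crude stylistic signature.
        words = [w.strip('.,!?;:"()') for w in text.lower().split()]
        counts = Counter(words)
        total = max(len(words), 1)
        return [counts[w] / total for w in FUNCTION_WORDS]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norms = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
        return dot / norms if norms else 0.0

    human = "It is a truth universally acknowledged that a single man in want of a fortune is rare."
    model = "As an AI language model, I must note that the answer depends on the context of the question."
    print(round(cosine(fingerprint(human), fingerprint(model)), 3))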

asdfman123, about 2 years ago
Off topic, but it's so weird that there are now old people who are essentially digital natives.

Time flies.

agluszak, about 2 years ago
Why did he tell a student to ask these questions to ChatGPT instead of doing it himself?

[1 reply not loaded]

meltedcapacitor, about 2 years ago
In true Knuth fashion, this page is better formatted than 99% of the web.

TXT = CSS^^2b

StuGoss, about 2 years ago
I asked ChatGPT and Bard this, and I believe both were wrong... If today's temperature is 70 degrees and tomorrow it will be half as hot, what will the temperature be tomorrow?
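
The trap is that "half as hot" is ill-defined: halving the Fahrenheit reading gives 35 degrees, while halving the temperature on an absolute (Kelvin) scale gives roughly -195 degrees Fahrenheit. A quick worked version of both readings (assuming the 70 is Fahrenheit):

    def f_to_k(f):
        # Fahrenheit -> Kelvin
        return (f - 32) * 5 / 9 + 273.15

    def k_to_f(k):
        # Kelvin -> Fahrenheit
        return (k - 273.15) * 9 / 5 + 32

    today_f = 70.0
    print(today_f / 2)                            # 35.0 -> the naive reading
    print(round(k_to_f(f_to_k(today_f) / 2), 1))  # about -194.8 -> the absolute reading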

ngneer, about 2 years ago
I personally find the Sieve of Eratosthenes to be quite beautiful.
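
For anyone who wants the beauty in runnable form, here is the textbook version (a generic sketch, nothing specific to the thread):

    def sieve(limit):
        # Sieve of Eratosthenes: cross off multiples of each prime up to sqrt(limit).
        is_prime = [False, False] + [True] * (limit - 1)
        for p in range(2, int(limit ** 0.5) + 1):
            if is_prime[p]:
                for multiple in range(p * p, limit + 1, p):
                    is_prime[multiple] = False
        return [n for n, flag in enumerate(is_prime) if flag]

    print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]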

gnicholas, about 2 years ago
> *PS: Please reply only with respect to binomial coefficients, because I've already spent way too much time on the topic above! The topic is timely, and important enough not to ignore completely, but it's emphatically not for me.*

Has anyone ever received an email with this type of admonition? Certainly coming from a famous/busy person such as Knuth, it is efficiency-enhancing and would not be seen as rude. But if a mere mortal like me did it, I feel like it would strike some people as impolite (basically, "I don't care what you think about any of this other stuff, so don't bother writing me about it").

kwertyoowiyop, about 2 years ago
I glanced at the headline and thought: oh, kind of random, but it'll be neat to see what Don Bluth thinks of ChatGPT. After all, The Secret of NIMH was such a great movie!

yosun, almost 2 years ago
I feel like this entire .txt was drafted by GPT. It contains enough of what an LLM would already know about Knuth, Wolfram, Wolfram Alpha/Language, etc.

Am I missing the actually new thing?

p5a0u9l, about 2 years ago
It's interesting how the most common impulse when confronted with how good ChatGPT is, is to find ways in which it fails or does something silly.

fnord77, about 2 years ago
A little disappointing they didn't shell out the 20 bucks for GPT-4.

Also disappointing that they didn't follow up the incorrect responses with corrections. If you told GPT "sorry, your answer is wrong because the stock market is closed on Saturday", it would come up with a new answer that takes that into account.

They seem to be missing the point of the "chat" part of ChatGPT.

[1 reply not loaded]

aurelien_gasser, about 2 years ago
Is ChatGPT just *pretending* to use a solar calculator? AFAIK it could not have possibly used one, unless I'm missing something?

[1 reply not loaded]

Sukotto, about 2 years ago
Adding this to the quotes file:

    How does one train an AI to make up such convincing lies?
        -- Donald Knuth

jonplackett, about 2 years ago
FYI -- these were made with 3.5, not 4. He mentions this at the end.

I know 3.5 is impressive, but its abilities are just dwarfed by 4.

pmoriarty, about 2 years ago
Claude and Claude+ do much better on some of these tests, but fail at others. Too bad Knuth only tried ChatGPT.

noisy_boy, about 2 years ago
This just confirms what I think of ChatGPT: not an encyclopedia, but an imperfect yet very powerful assistant.

xupybd, about 2 years ago
85 and he still bikes home. I need to start doing some exercise so I can be that healthy in 40 years' time.

iiilx, about 2 years ago
Make up a plausible conversation of what Donald Knuth says to Stephen Wolfram about chatGPT.

Donald Knuth: Stephen, I've been studying OpenAI's GPT model, in this case, ChatGPT. I'm impressed by how much it has improved over the years and the capabilities it has now.

Stephen Wolfram: Yes, I've been observing the progress of GPT too, Donald. It's an intriguing representation of where we've arrived in computational linguistics and AI. Yet, I find it fascinating how much of its learning depends on the breadth and depth of training data.

Donald Knuth: Indeed, Stephen. It's remarkable to see that it can produce plausible-sounding text across a wide range of topics. However, like any tool, it still has its limitations, particularly when it comes to producing entirely reliable or context-specific information.

Stephen Wolfram: Right, the lack of a true understanding of the content it's generating is its main limitation. For instance, it can't make real-world interventions or learn from specific interactions beyond its training cut-off. But the way it has been designed to simulate conversation is quite fascinating, even though it's fundamentally a pattern matching mechanism, rather than a thinking entity.

Donald Knuth: I share your perspective, Stephen. It's impressive to see how far machine learning models like ChatGPT have come, but there's a lot of ground to cover when it comes to the essence of comprehension and knowledge application. In terms of mathematical representation and handling, it leaves a lot to be desired.

Stephen Wolfram: Absolutely, Donald. Mathematical models can offer so much more than pattern recognition. If a model like ChatGPT could incorporate more of the principles of computational thinking, it might generate responses that are not just statistically likely, but also meaningful in a deeper sense.

Donald Knuth: I agree. It seems like the next frontier for models like ChatGPT could be the integration of symbolic computation or at least some form of knowledge representation that mimics understanding. That's where your work on Mathematica and Wolfram Language can be really enlightening, Stephen.

Stephen Wolfram: That's a compelling thought, Donald. Taking advantage of the power of symbolic computation, we could push the boundaries of what AI systems like ChatGPT can do. I suppose this is one of the thrilling parts of living in this era – the possibilities for what we can build next seem endless.

Donald Knuth: Indeed, Stephen. It's an exciting time for both computer science and AI. We're in a unique position to shape the future of these technologies and their impact on society.

dramm, about 2 years ago
I wonder if there are LLMs with a full set of TAOCP sitting on their bookshelf just to show off. :-)

ftxbro, about 2 years ago
> "The student referred me to a recent arXiv paper 2303.12712 [cs.CL] about GPT-4, which is apparently behind a paywall at the moment but does even better than the system he could use (https://chat.openai.com/)"

Imagine Don Knuth is interested in your technology, but you don't give him your paper and you don't give him access to the most recent version of your technology.

catgoose, about 2 years ago
I use ChatGPT for gardening tips rather than for solving programming challenges.

IndigoIncognito, about 2 years ago
I don't think I've ever seen a fully plaintext webpage in my life.

[1 reply not loaded]

felipelalli, about 2 years ago
Is this real? This entire text could have been produced by ChatGPT itself.

[1 reply not loaded]

ngneer, about 2 years ago
I would be curious to see what Shannon would have thought of this...

anotheraccount9, about 2 years ago
I wonder why it failed at question 13? This seems trivial for a computer, at least to exclude the word.

[5 replies not loaded]

iiilx, about 2 years ago
FWIW, the prompt wasn't that good.

Here's my prompt: Make up a plausible conversation of what Donald Knuth says to Stephen Wolfram about chatGPT.

Output: [identical to the conversation pasted in iiilx's earlier comment above]

dr_dshiv, about 2 years ago
TL;DR:

"GPT-4 not only gave the best possible answer to my query, it even knew somehow that I was referring to the Wolfram language."

nappy, about 2 years ago
Why does Knuth think Trump eats betel nuts? Does he?

jakobov, about 2 years ago
tl;dr?

sdwr, about 2 years ago
Haven't read the whole thing, but the first two show a lack of experience with "prompt engineering". To get it to write a conversation, either lead it in:

> Donald Knuth: So, Steve, what do you think about ChatGPT?
> Steven Wolfram:

or be direct and specific:

> Simulate a conversation between SW and DK about ChatGPT.

Neither will get a good response, though. ChatGPT very much struggles with tone and natural dialogue. The answers will weave in facts about the two, but not capture their perspective or tone very well.

[3 replies not loaded]