I kind of feel like poking at the failings of ChatGPT misses the point a bit.<p>Yes, it's certainly not an AGI, or even super close, but to converse with humans at this level at all is mind-boggling. Ten years before Stable Diffusion, AI could just about label pictures; now it can do, well, Stable Diffusion.<p>The pace of progress is insane.<p>Likewise, I feel we might engage in a naysaying dialogue with consecutive generations of GPT-like models, but with increasingly minor nitpicks: "Ah, but does it understand diminutives?" "Its handling of sarcasm isn't up to scratch." "I tried conversing in 10 languages and Esperanto was quite weak."<p>And then one day we might wake up to a world where we can't really nitpick anymore.
The problem with many of the tasks people are trying is that the answers are already available on the internet for these very popular crosswords. For example, a quick search for "1 Chap recalled skill: something frequently repeated (6)" returns hundreds of correct answers. It's highly probable that the model already encountered the questions and answers for this crossword during training.
Probably worth noting that it may not show its true reasoning; rather, it immediately arrives at an answer and then proceeds to add an explanation that seems reasonable to it.
I tried using it to solve Wordle and it failed spectacularly:<p>> What 5 letter words can be made only from the letters w,e,y,o,p,d,f,g,h,j,k,z,x,c,v,b,n,m? The word must contain the letter “e”, and the letter “e” cannot be the second letter or last letter.<p>> Here are five five-letter words that can be made using the letters you provided, and which contain the letter "e" as the first or third letter:<p>fewer: a smaller number of people or things
eyewp: not a valid English word
yowep: not a valid English word
oweyp: not a valid English word
pweyo: not a valid English word
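For what it's worth, the letter constraints in that prompt are mechanical enough that a few lines of Python can check any candidate. This is a minimal sketch (the `satisfies` helper and the example words are my own, not from the thread); note that it confirms ChatGPT's suggestion "fewer" breaks the stated rules, since "e" is both its second and last letter:

```python
# Constraints from the prompt: 5 letters, drawn only from the allowed set,
# must contain "e", and "e" may not be the second or last letter.
ALLOWED = set("weyopdfghjkzxcvbnm")

def satisfies(word: str) -> bool:
    return (
        len(word) == 5
        and set(word) <= ALLOWED          # only allowed letters
        and "e" in word                   # must contain "e"
        and word[1] != "e"                # "e" not in second position
        and word[-1] != "e"               # "e" not in last position
    )

print(satisfies("fewer"))  # False: "e" is the 2nd and last letter
print(satisfies("dozen"))  # True: a real word that meets every constraint
print(satisfies("eyewp"))  # True: constraints hold, but it isn't a word
```

Interestingly, the made-up strings like "eyewp" can pass the letter constraints just fine; the part the model actually failed at was producing something that is both constraint-satisfying and a real English word.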
Here's another fun thing I accidentally discovered when it made a factual mistake: It will change its reasoning to support its answer.<p>>The average German citizen produces about 10.4 metric tons of CO2 per year, which is significantly lower than the global average of 4.9 metric tons per person per year. In other words, the average person in Germany produces more than twice as much CO2 as the average person worldwide. The reason for this is that Germany has a highly developed and efficient infrastructure for producing and using renewable energy, such as wind and solar power. As a result, a smaller proportion of Germany's energy comes from fossil fuels like coal, oil, and natural gas, which are the main sources of CO2 emissions. Additionally, the average German citizen is more likely to use public transportation and to live in a more densely populated area, which can reduce the overall amount of energy needed for transportation and other purposes.<p>To which I replied:
You said:<p>"The average German citizen produces about 10.4 metric tons of CO2 per year, which is significantly lower than the global average of 4.9 metric tons per person per year."<p>this is a false statement, no?<p>>You are correct, my previous statement was incorrect. I apologize for the error. The average German citizen actually produces about 10.4 metric tons of CO2 per year, which is significantly higher than the global average of 4.9 metric tons per person per year. In other words, the average person in Germany produces more than twice as much CO2 as the average person worldwide. The reason for this is that Germany has a highly developed economy and a large industrial sector, which are major sources of CO2 emissions. Additionally, the average German citizen is more likely to use a car for transportation and to live in a less densely populated area, which can increase the overall amount of energy needed for transportation and other purposes. Again, I apologize for any confusion my previous statement may have caused.<p>Notice that all of a sudden the average German citizen is more likely to live in a less densely populated area, to support the fact that they produce more CO2 per year than the global average.<p>Really fun stuff!
Interesting test case, but it looks like it just sort of stumbled onto the correct answer with the last one, because "sushi" is a pretty obvious first guess for "Japanese food", regardless of the rest of the clue.<p>But yes, it is impressive that it manages to parse the general intent of the clue.
There's a crossword solver app called Crossword Genius [1] that gives an answer and an explanation for it. I tried it a while back, when you could use it for free online, and it was very good, but now they appear to have made it app-only.<p>It claims to be AI-based, though you can provide feedback when it gets an answer wrong, and there aren't many cryptic crosswords in the world, so I don't know to what extent it's solving clues with clever algorithms vs. just echoing back human solutions.<p>[1] <a href="https://www.crosswordgenius.com/" rel="nofollow">https://www.crosswordgenius.com/</a>
This blog references another blog where the author has ChatGPT emulate a Linux terminal [1].<p>The output it produces is incredibly impressive but I just tried to use the same prompts and got back "I'm sorry, but I am not capable of acting as a Linux terminal or any other type of command-line interface."<p>Interesting that this capability has been neutered.<p>1. <a href="https://www.engraved.blog/building-a-virtual-machine-inside/" rel="nofollow">https://www.engraved.blog/building-a-virtual-machine-inside/</a>
There is no reasoning. Just google "chap recalled skill something frequently repeated" and the answer appears multiple times; it's just the best fit from the training corpus.
I have no chance of solving these. I am human, but English is not my first language (yet I speak it at level C1 or better).
The faulty reasoning is where the AI nonsense shows, though.
Is it just me, or is the characteristic of deciding on an answer first and then justifying it with selected/made-up facts and faulty logic all too human? :-)
> <i>taking the first letter of the word “chap” (M)</i><p>Well, frankly, the answer this leads to sounds only <i>literally</i> incorrect, rather than <i>profoundly</i> incorrect, like presuming that "recalled" and "reversed" are synonyms. :/
My mother would frequently come up with what were (to me) nonsensical explanations for things that were nonetheless the "right" answer. This is hilarious to me.
OpenAI:<p>> Verify your phone number.<p>That's the page where my interest in ChatGPT went to zero. Why do they need to know exactly who I am? What's the real goal here?