LLMs are not even good wordcels

9 points by epidemian 11 months ago

1 comment

dissahc 11 months ago
i don't think having weaknesses like these says too much about the general capabilities or potential of LLMs. try this yourself: generate a novel pangram without any sort of iterative process or revisions -- the first thing that pops into your head you must commit to paper. it's very hard. it's a lot easier if you go in alphabetical order, as you don't need to keep track of which letters you've already used. interestingly, gpt-4o also performs better at this task when you ask it to go in alphabetical order.

LLMs have known weaknesses, many having to do with an inability to "think" without using tokens. so for tasks like these, you can dramatically increase their performance by getting them to think out loud. this was the crux of the "let's verify step by step" paper.

i tested this with the following prompt:

### prompt start ###

your goal is to create a novel pangram in the spanish language that sounds natural. it should be grammatically correct and coherent. it shouldn't be just technically coherent and grammatically correct, it should sound like a normal sentence.

first, print out the spanish alphabet. these are your $remainingletters. $sentence = "". then enter a loop where you do the following:

loop 1:
- select a random letter from $remainingletters choose based on which one allows you to add the most natural sounding word to the sentence, do NOT go in alphabetical order
- eliminate the letter you chose from the remaining letters
- ensure the word you chose actually begins with that letter! very important
- add the word you chose for that letter to $sentence
- are there letters in $remaining? if so, go back to start of loop 1. otherwise move on to loop 2.

loop 2:
- go through the spanish alphabet in order and ensure your $sentence contains a word starting with that letter
- once every letter is accounted for, translate the sentence to english
- does it sound like a natural sentence?
- if not, go back to the start of loop 2
- if so, print $sentence as well as its translation in english

think out loud, keep track of your work as you go

i'm not asking you to generate code, i'm just explaining how you should accomplish the task

### prompt end ###

i ran this prompt three times, and each time it generated a valid pangram that was coherent and grammatically correct. i can't be bothered to run it a bunch of times to get an accurate success rate, but i'm fairly sure there exists a prompt with a success rate of 100%. there are a lot of output tokens available, and by asking the model to iterate, it will arrive at something correct long before 128,000 tokens are exhausted.

sidenote: when i asked gpt-4o to generate 5 novel pangrams in english, it got them all right. so language definitely matters when it comes to getting things like this right in one shot.
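for anyone who wants to measure a success rate instead of eyeballing three runs, here is a minimal sketch of a checker for the model's output. it is not something from the original experiment. the names `SPANISH_ALPHABET`, `normalize`, `missing_letters` and `missing_initials` are made up for illustration, and it assumes the 27-letter spanish alphabet (a-z plus ñ) with accented vowels and ü counted as their base letters. `missing_letters` is the usual pangram test, while `missing_initials` is the stricter check described in loop 2 of the prompt (a word *starting* with each letter).

```python
import re
import unicodedata

# Assumption: the 27-letter Spanish alphabet, a-z plus ñ.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"


def normalize(text: str) -> str:
    """Lowercase the text and strip accents (á -> a, ü -> u) but keep ñ."""
    # NFC first, so a decomposed "n" + combining tilde becomes a single "ñ".
    text = unicodedata.normalize("NFC", text).lower()
    out = []
    for ch in text:
        if ch == "ñ":
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


def missing_letters(sentence: str) -> list[str]:
    """Letters of the alphabet that appear nowhere in the sentence."""
    return sorted(set(SPANISH_ALPHABET) - set(normalize(sentence)))


def missing_initials(sentence: str) -> list[str]:
    """Letters with no word starting with them (the check in loop 2)."""
    words = re.findall(r"[a-zñ]+", normalize(sentence))
    return sorted(set(SPANISH_ALPHABET) - {w[0] for w in words})


if __name__ == "__main__":
    candidate = (
        "El veloz murciélago hindú comía feliz cardillo y kiwi. "
        "La cigüeña tocaba el saxofón detrás del palenque de paja."
    )
    print("missing letters:", missing_letters(candidate))
    print("missing initials:", missing_initials(candidate))
```

a sentence is a pangram in the usual sense when `missing_letters` returns an empty list. the well-known example above passes that test but not the stricter word-initial one, which shows the prompt asks for more than a plain pangram. neither function says anything about whether the sentence sounds natural, so that part still needs a human (or another model) to judge.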