
AI coding and the peanut butter and jelly problem

122 points · by tylerg · about 1 month ago

21 comments

kenjackson · about 1 month ago
This is actually no different than it is for humans, once you get past the familiar. It's like the famous project management tree-swing story: https://pmac-agpc.ca/project-management-tree-swing-story

If anything, LLMs have surprised me by how much better they are than humans at understanding instructions for text-based activities. But they are MUCH worse than humans when it comes to creating images/videos.
zahlman · about 1 month ago
Okay, but like.

If you *do* have that skill to communicate clearly and describe the requirements of a novel problem, why is the AI still useful? Actually writing the code should be relatively trivial from there. If it isn't, that points to a problem with your tools/architecture/etc. Programmers, in my experience, are on average far too tolerant of boilerplate.
Syzygies · about 1 month ago
"I bought those expensive knives. Why doesn't my cooking taste better?"

"I picked up an extraordinary violin once. It sounded awful!"

There's an art here. Managerial genius is recognizing everyone's strengths and weaknesses, and maximizing impact. Coding with AI is no different.

Of course I have to understand the code well enough to have written it. Usually much of the time is spent proposing improvements.

I'm a few months in, learning to code with AI for my math research. After a career as a professor, I'm not sure I could explain to anyone what I'm starting to get right, but I'm working several times more efficiently than I ever could by hand.

Some people will get the hang of this faster than others, and it's bull to think this can be taught.
phalangion · about 1 month ago
This video shows the peanut butter and jelly problem in action: https://youtu.be/cDA3_5982h8?si=xIQpzNTvhRcGY4Nb
_wire_ · about 1 month ago
Step 1: Computer, make me a peanut butter and jelly sandwich.

If this can't work, the program abstraction is insufficient to the task. This insufficiency is not a surprise.

That an ordinary 5-year-old can make a sandwich after only ever seeing someone make one, and that the sandwich so made is a component within a life-sustaining matrix which inevitably leads to new 5-year-olds making their own sandwiches and serenading the world about the joys of peanut butter and jelly, is the crucial distinction between AI and intelligence.

The rest of the stuff about a Harvard professor ripping a hole in a bag and pouring jelly on a clump of bread on the floor is a kooky semantic game that reveals something about the limits of human intelligence among the academic elite.

We might wonder why some people have to get to university before encountering such a basic epistemological conundrum as what constitutes clarity in exposition... But maybe that's what teaching to the test in U.S. K-12 gets you.

Alan Kay is known for a riff on a simple study where Harvard students were asked what causes the earth's seasons: almost all of them gave the wrong explanation, but many of them were very confident about the correctness of their wrong explanations.

Given that the measure of every AI chat program's performance is how agreeable its response is to a human, is there a clear distinction between the human and the AI?

If this HN discussion were among AI chat programs considering their own situations and formulating understanding of their own problems, maybe waxing about the (for them) ineffable joy of eating a peanut butter and jelly sandwich...

But it isn't.
pkdpic · about 1 month ago
lol, I didn't realize how famous the PB&J exercise was. That's fantastic. I thought it was just from this puppet video I've been showing my 4-year-old and his friends. Anyway, they seem to love it.

https://m.youtube.com/watch?v=RmbFJq2jADY&t=3m25s

Also seems like great advice; it feels like a good description of what I've been gravitating towards / having more luck with lately when prompting.
extr · about 1 month ago
Didn't know they did the PB&J thing at Harvard. I remember doing that in the 3rd grade or thereabouts.
iDon · about 1 month ago
For decades people have been dreaming of higher-level languages, where a user can simply specify what they want and not how to do it (the name of the programming language Forth derives from "4th-generation language", reflecting this idea).

Here we are; we've arrived at the next level.

The emphasis in my prompts is specification: clear and concise, defining terms as they are introduced, and I've had good results with that. I expect that we'll see specification/prompt languages evolve, in the same way that MCP has become a de facto standard API for connecting LLMs to other applications and servers. We could use a lot of the ideas from existing specification languages, and a lot of work has been done on this over 40+ years, but my impression is that they are largely fairly strict, because their motivation was provably correct code. The ideas can be used in a more relaxed way, because prompting fits well with rapid application development (RAD) and prototyping; I think there is a sweet spot of high productivity in a kind of REPL (read/evaluate/print loop) with symbolic references and structure embedded in free-form text.

Other comments have mentioned the importance of specification and requirements analysis, and daxfohl mentions being able to patch new elements into the structure in subsequent prompts (via BASIC line-number insertion).
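
A minimal sketch of what such a specification-style prompt might look like, shown in Python for concreteness. The section layout ("Definitions", "Task", "Constraints") and the Ledger example are invented for illustration, not a standard format:

```python
# Hypothetical specification-style prompt: each term is defined before
# it is used, then referenced by name, as the comment above suggests.
SPEC_PROMPT = """\
Definitions:
- "ledger": an append-only list of (timestamp, amount) records.
- "balance": the sum of all amounts in the ledger.

Task:
Write a Python class Ledger with:
- add(amount): append a record with the current UTC timestamp.
- balance(): return the balance as defined above.

Constraints:
- Standard library only.
- Reject non-numeric amounts with TypeError.
"""

print(SPEC_PROMPT)  # send this string to whatever LLM client you use
```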
grahac · about 1 month ago
Anyone here see the CS50 peanut butter and jelly problem in person?
mkw5053 · about 1 month ago
Similar to having a remote engineering team that fulfills the (insufficient) requirements but in ways you did not predict (or want).
daxfohl · about 1 month ago
Whenever I'm prompting an LLM for this kind of thing, I find myself wishing there were a BASIC-style protocol we could use to instruct LLMs: numbered statements, GOTOs to jump around, standardized like MCP or A2A so that all LLMs are trained to understand it and verified to follow the logic.

Why BASIC? It's a lot harder to mix English and structured programming concepts otherwise. Plus it's nice that if you forget a step between 20 and 30, you can just say `25 print 'halfway'` from the chat.
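
A toy sketch of what that protocol could look like, assuming steps are kept in a line-number-to-text map so a forgotten step 25 can be slotted in later; the step texts are invented for illustration:

```python
# Hypothetical BASIC-style prompt protocol: numbered steps, insertable
# out of order, rendered in line-number order for the LLM.
steps = {
    10: "Read config.yaml and validate the required keys.",
    20: "For each entry, fetch the URL and store the response body.",
    30: "Write a summary table to report.md.",
}

def insert(line_no: int, text: str) -> None:
    """Add or replace a step, BASIC-style."""
    steps[line_no] = text

# Forgot a step between 20 and 30? Just slot it in from the chat.
insert(25, "Print 'halfway' once half the entries are fetched.")

def render() -> str:
    """Serialize the steps, sorted by line number, as a prompt."""
    body = "\n".join(f"{n} {t}" for n, t in sorted(steps.items()))
    return "Follow these numbered steps exactly, in order:\n" + body

print(render())
```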
cadamsdotcom · about 1 month ago
AI is exposing the difference in effectiveness between communicating clearly and precisely (potentially being more verbose than you think you need to be) vs. leaning heavily on context.
01HNNWZ0MV43FF · about 1 month ago
> Over the past year, I’ve been fully immersed in the AI-rena—building products at warp speed with tools like Claude Code and Cursor, and watching the space evolve daily.

Blast fax kudos all around
sevenseacat · about 1 month ago
Heh, I remember doing that same peanut butter exercise in my high school computing class. It also has stuck in my head for all these years!
conductr · about 1 month ago
Funny to see this; I used this exact analogy a few weeks ago regarding AI.
davidcalloway · about 1 month ago
My teacher did the peanut butter and jelly problem with us in the fourth grade, but we were given time to write the instructions as homework and she picked a few to execute the following day.

The disappointment has always stayed with me that my instructions were not chosen, as I really had been far more precise than the fun examples she did choose. I recall even explaining which side of the knife to use when taking peanut butter from the jar.

Of course, she would still have found plenty of bugs in my instructions, which I wish I still had.

Thanks for that, and also the pet rats, Ms. Clouser!
tedunangst · about 1 month ago
Why is the peanut butter so sloppy?
kazinator · about 1 month ago
> *If your "sandwich" is a product that doesn't have an obvious recipe—a novel app, an unfamiliar UX, or a unique set of features—LLMs struggle*

Bzzt, nope!

If the sandwich does have an obvious recipe --- an app *similar to many that have been written before*, or *familiar, conventional UX*, or boring features found in *countless existing apps* --- LLMs struggle.

Fixed it for ya!
focusgroup0 · about 1 month ago
Garbage In, Garbage Out
derefr · about 1 month ago
> Today's AI Still Has a PB&J Problem

If this is how you're modelling the problem, then I don't think you learned the right lesson from the PB&J "parable."

Here's a timeless bit of wisdom, several decades old at this point:

Managers think that if you can just replace code with *something else* that isn't *text with formal syntax*, then all of a sudden "regular people" (like them, maybe?) will be able to "program" a system. But it never works. And the reason it never works is *fundamental to how humans relate to computers*.

Hucksters continually reinvent the concept of "business rules engines" to sell to naive CTOs. As a manager, you might think it's a great idea to encode logic/constraints into some kind of database — maybe one you even "program" visually like UML or something! — and to then have some tool run through and interpret those. You can update business rules "live and on the fly", without calling a programmer!

They think it's a great idea... until the first time they try to actually use such a system in anger to encode a real business process. Then they hit the PB&J problem. And, in the end, they must get *programmers* to interface with the business rules engine for them.

What's going on there? What's missing in the interaction between a manager and a business rules engine that gets fixed by inserting a programmer?

There are actually two things:

1. *Mechanical sympathy*. The programmer knows the *solution domain* — and so the programmer can act as an advocate for the solution domain (in the same way that a compiler does, but much more human-friendly and long-sighted/predictive/10k-ft-view-architectural). The programmer knows enough *about the machine* and *about how programs should be built* to know *what just won't work* — and so will *push back* on a half-assed design, rather than carrying the manager through on a *shared delusion* that what they're trying to do is going to work out.

2. *Iterative formalization*. The programmer knows *what information is needed* by a versatile union/superset of possible solution architectures in the solution space — not only to design a particular solution, but also to "work backward", comparing/contrasting which solution architectures might be a better fit given the design's parameters. And when the manager hasn't *provided* this information — the programmer knows to *ask questions*.

Asking the right questions to get the information needed to determine the right architecture and design a solution — that's called *requirements analysis*.

And no matter what fancy automatic "do what I mean" system you put in place between a manager and a machine — no matter how "smart" it might be — if it isn't playing the role of a programmer, both in *guiding the manager through the requirements analysis process* and in *pushing back through knowledge of mechanical sympathy*... then you get PB&J.

That being said: LLMs aren't fundamentally *incapable* of "doing what programmers do", I don't think. The current generation of LLMs is just seemingly

1. highly sycophantic and constitutionally scared of speaking as an authority / pushing back / telling the user they're wrong; and

2. trained to always try to solve the problem as stated, rather than asking questions "until satisfied."
gblargg · about 1 month ago
At least with AI you can ask it what it understands about the topic, so you know what you can assume.
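
A hedged sketch of that kind of probe, using the openai Python client; the model name is an assumption, and the probe wording is invented:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask the model to surface its assumptions before it writes anything.
probe = (
    "Before writing any code: list what you understand the task to "
    "involve, and which steps you would assume without being told."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat-capable model will do
    messages=[{"role": "user", "content": probe}],
)
print(resp.choices[0].message.content)
```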