I think he used the free version. GPT-4o did better, though it was still missing 2 states. o3-mini-high did it correctly in one shot.<p>So this contradicts what Gary Marcus is saying. I think his argument falls apart if he says that we won't have AGI soon because a free model is making mistakes, but ignores that the newer and more expensive model can do what he says it can't.<p>No one said AGI will be cheap in 2-3 years. They're saying it could be achieved in 2-3 years. It could be achieved but require an entire state's electricity to run it initially.
I don’t know if I agree with the sentiment of this article. Yes, all the LLMs have flaws and limitations and can’t work for all purposes in real life. But we’re still early in all this, and it can get a lot better. Does that mean someone is in shambles? No. The author implies that LLMs are no better than they were two years ago, but we know that’s not true. We have benchmarks to compare LLMs by, and a small collection of anecdotes doesn’t make those benchmarks mean less.