科技回声

7 comments

MisterKent, 4 months ago

This is a really odd way to test the capabilities of an LLM. First, most photos of clocks show 10:10, since the watches in the training data are usually set to 10:10 (in order to better sell watches, etc.). Second, I don't think the photo generation aspect of ChatGPT is being marketed or presented as a problem-solving AI.

chomp, 4 months ago

I like the part where the AI couldn’t be trusted to draw a clock, so we trusted it to psychoanalyze the incorrect clock.

solresol, 4 months ago

I administered the CDT to ChatGPT and got Claude to diagnose what was wrong with the "patient" based on the results. There are signs of pre-frontal cortex damage or early-stage dementia.

pnm45678, 4 months ago

Here's the thing (which you probably knew going in): generative AI is quite well known to be terrible at drawing specific times on clock faces. This is down to the training data. It has been trained on a huge number of images. That includes advertising. For whatever reason, wristwatch manufacturers have a tendency to set watches to 10:10 in ads, almost without exception. Perhaps it's just a nice-looking time, or it's good for comparison purposes. Simply Google "wrist watch" and you'll see. So these generative models have a huge bias towards 10:10 on clock faces, because that's what all the clocks they've been trained on look like.

airstrike, 4 months ago

FWIW, Claude 3.5 Sonnet got the SVG right on the first try: https://claude.site/artifacts/8dedf16e-b861-4497-96e2-872773d71baf

Prompt was just "create an svg of a clockface with the time being 10 past 11"
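
For illustration only, here is a minimal Python sketch of the arithmetic behind that prompt. It is not the linked artifact, whose contents are not reproduced here. The function names (`hand_angles`, `hand_tip`, `clock_svg`), the 200x200 canvas, and the hand lengths are all assumptions made up for this example.

```python
import math


def hand_angles(hour: int, minute: int) -> tuple[float, float]:
    """Return (hour_hand, minute_hand) angles in degrees, clockwise from 12."""
    minute_angle = minute * 6.0                     # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5  # 30 per hour, plus drift
    return hour_angle, minute_angle


def hand_tip(angle_deg: float, length: float,
             cx: float = 100.0, cy: float = 100.0) -> tuple[float, float]:
    """Convert a clockwise-from-12 angle to SVG coordinates (y grows downward)."""
    rad = math.radians(angle_deg)
    return cx + length * math.sin(rad), cy - length * math.cos(rad)


def clock_svg(hour: int, minute: int) -> str:
    """Build a bare-bones SVG clock face (hypothetical layout, no numerals)."""
    hour_angle, minute_angle = hand_angles(hour, minute)
    hx, hy = hand_tip(hour_angle, 50.0)    # shorter hour hand
    mx, my = hand_tip(minute_angle, 75.0)  # longer minute hand
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 200">'
        '<circle cx="100" cy="100" r="95" fill="white" stroke="black"/>'
        f'<line x1="100" y1="100" x2="{hx:.1f}" y2="{hy:.1f}" '
        'stroke="black" stroke-width="4"/>'
        f'<line x1="100" y1="100" x2="{mx:.1f}" y2="{my:.1f}" '
        'stroke="black" stroke-width="2"/>'
        '</svg>'
    )


print(hand_angles(11, 10))  # (335.0, 60.0)
```

For "10 past 11" the hour hand sits at 335 degrees and the minute hand at 60 degrees. The stock 10:10 pose puts the hour hand at 305 degrees with the minute hand in the same place, so the two times differ only by a 30-degree shift of the hour hand.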

pockybum522, 4 months ago

I love the concept of the article where one LLM can't draw a simple clock but the other one can accurately diagnose medical conditions from a hypothetical drawn image.

batch12, 4 months ago

It has sentience problems...


Maybe ChatGPT has some pre-frontal cortex problems
