
GPT-4o is consistently hallucinating

26 points | by freesam | about 1 year ago

6 comments

pietz | about 1 year ago
I haven't noticed that GPT-4o hallucinates a lot more than the previous version, but I noticed two other things, of which especially the latter seems relevant here.

1) It's insanely chatty, to the point where it ignores instructions about not doing certain things. I think this behavior is heavily favoured by benchmarks, but as someone who expects concise answers, this model annoys me. Custom instructions don't fully fix this for me.

2) It likes repetitive answers a lot more than the previous version, meaning it will try its hardest to generate the follow-up answer in the same format as the first one. I think this is also the problem in your example.

To my understanding, this is a measure against laziness, where the model would exclude information from the first answer that hadn't changed in the follow-up. I always liked this behavior, but maybe you remember the time from a few months ago when many people complained about the laziness of (I believe) 0125.

Btw, while I type this, I notice that this is probably the highest level of first-world problems I've ever complained about. There is this amazing, almost free tool that answers all my questions and does most of my coding, and I dislike it because it provides me with thorough context.
freesam | about 1 year ago
We found that gpt-4o is consistently hallucinating. Here is an example: https://platform.openai.com/playground/chat?models=gpt-4o&preset=yEwza9Ibnw3RnQxGNnTwzOqd

In this case, we never mentioned any store location in Ringsted. However, gpt-4o still fakes a store in the address:

Adresse: Ringstedet, Klosterparks Allé 10, 4100 Ringsted
Åbningstider: Mandag-fredag: 9.30-19, Lørdag: 10-17, Søndag: 11-16
Tlf: 50603398
Email: ringsted4100@gmail.com

This hallucination is quite consistent. The gpt-4-turbo or even gpt-3.5 model doesn't have the same issue and correctly acknowledged that there is no store in this location.
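For anyone who wants to reproduce this kind of comparison outside the playground, here is a minimal sketch against OpenAI's chat completions API. The system prompt, store list, and helper names are illustrative assumptions, not the poster's actual setup; only the model names "gpt-4o" and "gpt-4-turbo" come from the post.

```python
SYSTEM_PROMPT = (
    "You are a customer-service assistant for a Danish retail chain. "
    "Known store locations: Copenhagen, Aarhus, Odense. "
    "If a city is not on this list, say there is no store there; "
    "never invent an address."
)

def build_request(model: str, question: str) -> dict:
    """Assemble a chat-completions request body for one model."""
    return {
        "model": model,
        "temperature": 0,  # reduce sampling variance for the comparison
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }

def compare_models(question: str) -> None:
    """Send the same question to both models (requires OPENAI_API_KEY)."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    for model in ("gpt-4o", "gpt-4-turbo"):
        resp = client.chat.completions.create(**build_request(model, question))
        print(f"{model}: {resp.choices[0].message.content}")
```

Calling `compare_models("Do you have a store in Ringsted?")` a few times should show whether one model consistently fabricates an address while the other declines.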
bhaney | about 1 year ago
> Please log in to access this page

No thanks.

Not like an LLM hallucinating is particularly surprising or newsworthy anyway.
andrei512 | about 1 year ago
This isn&#x27;t front-page material...
lionkor | about 1 year ago
I found that, compared to GPT-3.5, it refuses to shut up when told to shut up.

In the middle of a conversation, try going "SHUT UP, STOP TALKING ALREADY". For me, it just keeps repeating the last output. Very cool.
threeseed | about 1 year ago
Not sure what you are expecting.

The models are non-deterministic and there is no way to predict hallucinations.
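On the non-determinism point: the chat completions API does accept `temperature` and a best-effort `seed` parameter, which make sampling more repeatable across calls, though they do not prevent hallucination. A hedged sketch of such a request body (values are illustrative):

```python
def deterministic_params(model: str, prompt: str) -> dict:
    """Request body tuned for (best-effort) reproducible sampling."""
    return {
        "model": model,
        "temperature": 0,  # near-greedy decoding
        "seed": 42,        # best-effort reproducibility, not a guarantee
        "messages": [{"role": "user", "content": prompt}],
    }
```

Even with these settings, a model that has memorized a plausible-looking address format can still emit one deterministically; reproducibility only makes the failure easier to study.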
评论 #40543591 未加载