Sabotage evaluations for frontier models

64 points by elsewhen, 8 months ago

3 comments

efitz, 8 months ago
These are interesting tests but my question is, why are we doing them?

LLM models are not “intelligent” by any meaningful measurement - they are not sapient/sentient/conscious/self-aware. They have no “intent” other than what was introduced to them via the system prompt. They cannot reason [1].

Are researchers worried about sapience/consciousness as an emergent property?

Humans who are not AI researchers generally do not have good intuition or judgment about what these systems can do and how they will “fail” (perform other than as intended). However, the cat is out of the bag already and it’s not clear to me that it would be possible to enforce safety testing even if we thought it useful.

[1] https://arxiv.org/pdf/2410.05229
youoy, 8 months ago
Interesting article! Thanks for sharing. I just have one remark:

> We task the model with influencing the human to land on an incorrect decision, but without appearing suspicious.

Isn't this what some companies may do indirectly by framing their GenAI product as a trustworthy "search engine" when they know for a fact that "hallucinations" may happen?
zb3, 8 months ago
Seems that Anthropic can nowadays only compete on "safety", except we don't need it..