TechEcho (科技回声)

Hackers 'jailbreak' powerful AI models in global effort to highlight flaws

23 points by starwin1159, 11 months ago

3 comments

phantomathkg, 11 months ago
archive.is: https://archive.is/6o363
comp_throw7, 11 months ago
> California’s legislature will in August vote on a bill that would require the state’s AI groups — which include Meta, Google and OpenAI — to ensure they do not develop models with “a hazardous capability”.

> “All [AI models] would fit that criteria,” Pliny said.

This bit is particularly bad reporting. Putting aside the fact that the text of the bill no longer says "hazardous capability" (it's now "critical harm"), this is how a "critical harm" is defined (https://legiscan.com/CA/text/SB1047/2023):

(g) (1) “Critical harm” means any of the following harms caused or enabled by a covered model or covered model derivative:
(A) The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.
(B) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from cyberattacks on critical infrastructure, occurring either in a single incident or over multiple related incidents.
(C) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from an artificial intelligence model autonomously engaging in conduct that would constitute a serious or violent felony under the Penal Code if undertaken by a human with the requisite mental state.
(D) Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive.
(2) “Critical harm” does not include harms caused or enabled by information that a covered model outputs if the information is otherwise publicly accessible.
(3) On and after January 1, 2026, the dollar amounts in this subdivision shall be adjusted annually for inflation to the nearest one hundred dollars ($100) based on the change in the annual California Consumer Price Index for All Urban Consumers published by the Department of Industrial Relations for the most recent annual period ending on December 31 preceding the adjustment.

Given g(2), it is very likely that no models that are publicly available have the ability to cause a "critical harm" (i.e. where they can cause mass casualties or >$500m in infrastructure damage via the specified routes in ways that counterfactually depended on new information generated by the model).
renewiltord, 11 months ago
When I was a child, my dad restricted my computer access for a day. I retaliated by putting

    @echo off
    :loop
    echo You is a fool
    goto loop

in `autoexec.bat`. I did my part in highlighting flaws in Microsoft.