TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

© 2025 TechEcho. All rights reserved.

Hackers 'jailbreak' powerful AI models in global effort to highlight flaws

23 points by starwin1159 11 months ago

3 comments

phantomathkg 11 months ago
archive.is: https://archive.is/6o363
comp_throw7 11 months ago
> California’s legislature will in August vote on a bill that would require the state’s AI groups — which include Meta, Google and OpenAI — to ensure they do not develop models with “a hazardous capability”.

> “All [AI models] would fit that criteria,” Pliny said.

This bit is particularly bad reporting. Putting aside the fact that the text of the bill no longer says "hazardous capability" (it's now "critical harm"), this is how a "critical harm" is defined (https://legiscan.com/CA/text/SB1047/2023):

(g) (1) “Critical harm” means any of the following harms caused or enabled by a covered model or covered model derivative:
(A) The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.
(B) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from cyberattacks on critical infrastructure, occurring either in a single incident or over multiple related incidents.
(C) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from an artificial intelligence model autonomously engaging in conduct that would constitute a serious or violent felony under the Penal Code if undertaken by a human with the requisite mental state.
(D) Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive.
(2) “Critical harm” does not include harms caused or enabled by information that a covered model outputs if the information is otherwise publicly accessible.
(3) On and after January 1, 2026, the dollar amounts in this subdivision shall be adjusted annually for inflation to the nearest one hundred dollars ($100) based on the change in the annual California Consumer Price Index for All Urban Consumers published by the Department of Industrial Relations for the most recent annual period ending on December 31 preceding the adjustment.

Given g(2), it is very likely that no models that are publicly available have the ability to cause a "critical harm" (i.e. where they can cause mass casualties or >$500m in infrastructure damage via the specified routes in ways that counterfactually depended on new information generated by the model).
renewiltord 11 months ago
When I was a child, my dad restricted my computer access for a day. I retaliated by putting

    @echo off
    :loop
    echo You is a fool
    goto loop

in `autoexec.bat`. I did my part in highlighting flaws in Microsoft.