TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

White House is working with hackers to ‘jailbreak’ ChatGPT’s safeguards

1 点作者 eliomattia大约 2 年前

2 条评论

eliomattia大约 2 年前
As a programmer, I find it fascinating to build things from the ground up, with the inner workings either in full display or readily accessible for editing. With AI, the need to beg it to please behave, with a long list of things to do and not to do and a resounding order not to disclose such a list, is becoming commonplace.<p>Obviously, finding jailbreaks in LLMs is extremely important and consequential. However, there are meta questions around modern AI that remain valid, and this article is a reminder: is a continuous and <i>direct</i> feedback loop between code and coder a thing of the past? To what extent should we accept that LLMs are trained one-way, that we can only truly edit them with expensive trial-and-error retraining runs, hence, all we are left with is asking kindly? Are the current implementations all, or are we dealing with just one possible paradigm? Do we want AI, which relies upon computers, algorithms, and numbers written on memory, to be fundamentally programmable?
verdverm大约 2 年前
They could probably just visit Reddit, there are ample prompts there.<p>Prompt injection or prompt attacks are well known and likely impossible to guard against. Can you really get a human to be invulnerable to manipulation? Why would we expect the machines to be any better?
评论 #36016712 未加载