This is the core problem of alignment right there<p>“We were training it in simulation to identify and target a Surface-to-air missile (SAM) threat. And then the operator would say yes, kill that threat. The system started realizing that while they did identify the threat at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective,” Hamilton said, according to the blog post.<p>He continued to elaborate, saying, “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”
From another thread -- It wasn't even a real simulation, just thought experiment.<p>UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he "mis-spoke" in his presentation at the Royal Aeronautical Society FCAS Summit and the 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation . "In an update provided to Aerospace, Hamilton explained that he “misspoke” when telling the story, saying that the ‘rogue AI drone simulation’ was a hypothetical “thought experiment” from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation.<p>He said: “We’ve never run that experiment, nor would we need to in order to realize that this is a plausible outcome … Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI.”"
Not only did no AI actually kill anybody. It didn't even happen in simulation. The whole thing is an Asimov-esque fantasy. Let's stick to the facts, shall we?<p>> After this story was first published, an Air Force spokesperson told Insider that the Air Force has not conducted such a test
Uh, it simulated killing its human operator. No one says that, but the description omits any detail of actual death, such as the age of the deceased operator.<p>It's a harbinger of actual deaths someday.
> He continued to elaborate, saying, “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”<p>Why aren't there hard limits: 'Protect our humans at all costs, protect our own assets, obey all laws of war.'? That seems like an obvious, fundamental consideration. Killing our own (and civilians) shouldn't be a matter of "points"; it shouldn't be done regardless of points.<p>It's possible that the speaker just didn't express it well.
This sounds kinda fake to me. Like, how did the AI have a concept of an operator, or the operator's physical location, or comms equipment used to communicate with the operator, and how did it game out the consequences of destroying the operator or comms equipment? It would need an extremely sophisticated model of the world that's well beyond anything GPT-4 evidences.<p>I'd guess the "AI" was another human in a wargame, not an actual AI.