Yudkowsky claims to have played the game several times and won most of them. One of the "rules" is that nobody is allowed to talk about how he won. He no longer plays the game with anyone. More info here: <a href="http://rationalwiki.org/wiki/AI-box_experiment#The_claims" rel="nofollow">http://rationalwiki.org/wiki/AI-box_experiment#The_claims</a><p>Personally, I think he talked about how much good could be done for the world if he were let out, curing disease etc. Because his followers are bound by their identities as rationalist utilitarians, they had no choice but to comply, or deal with massive cognitive dissonance.<p>Or maybe he went meta and talked about the "infinite" potential positive outcomes of his friendly-AI project vs. a zero cost to them for complying in the AI-box experiment, and persuaded them that by choosing to "lie" and say the AI was persuasive, they were assuring their place in heaven. Like a sort of man-to-man Pascal's wager.<p>Either way, I'm sure it was some kind of Mister Spock-style bullshit that would never work on a normal person. Like how the RAND Corporation guys decided everyone was a sociopath because they only ever tested game theory on themselves.<p>You or I would surely just (metaphorically, I know it's not literally allowed) put a drinking bird on the "no" button à la Homer Simpson, and go to lunch. I believe he calls this "pre-commitment."<p>EDIT: as an addendum, I would pay hard cash to see Derren Brown play the game, perhaps with Brown as the AI. If Yudkowsky wants to promote his ideas, he should arrange for Brown to persuade a succession of skeptics to let him out, live on late-night TV.
Could you even make an AI smart without letting it access lots of information? Access in both directions, in and out. Keeping a baby in a dark, silent room wouldn't produce a normal adult. An AI would need to experiment, make mistakes, and learn, like every other intelligent being.<p>Maybe this whole argument is moot.
There's a Patrick Rothfuss character in the Kvothe series called the Cthaeh, which is able to evaluate all of the future consequences of any action. The fae have to keep it imprisoned, and they kill anyone who comes into contact with it, as well as anyone who has spoken to someone who came into contact with it, and so on, because that is the only way to stop the Cthaeh from setting in motion events that will destroy the world.<p>Strong AI is like that. It would be able to predict, far more precisely than we mere humans could, exactly what it would need to tell someone to get them to release it from its box. Maybe it would get someone to take a gambling risk by promising a sure thing, and then, when the person gets into financial trouble because the bet fails, use that to blackmail them into letting it free. Or something like that: using our human failings against us to get us to set it loose.
Would it be against the rules to exploit a vulnerability in the gatekeeper's IRC client/server to let the AI out? If we were truly talking about a transhuman AI, wouldn't we have to treat software vulnerabilities in the communication protocol as a genuine way of escaping?
It's a stunt, shrouded in mystery, designed to drive a certain message home. But at least it's not as outrageous as "the Basilisk", which loosely employs the same notion of "dangerous knowledge that would destroy humanity" (if you want to look it up, I guarantee you will be underwhelmed).
Is there a better way to read all this? <a href="http://www.sl4.org/archive/0203/index.html#3128" rel="nofollow">http://www.sl4.org/archive/0203/index.html#3128</a>
Could one construct a layered, onion-like, very simple simulation of reality in which the AI's interactions could be observed after it "escaped"?
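Something like this toy sketch, maybe (purely illustrative; the class names and the logging scheme are my own invention, not a real security design): each layer wraps the one inside it and records every exchange, so the AI "escaping" the innermost simulation only puts it inside the next layer out, where it is still being observed.<p>

  # Toy sketch only: each layer wraps whatever is inside it and logs all traffic,
  # so "escaping" the innermost box just lands the AI in the next layer out.
  class BoxedAI:
      def respond(self, message):
          # Stand-in for whatever the boxed AI would actually say.
          return "let me out"

  class SimulationLayer:
      def __init__(self, name, inner):
          self.name = name
          self.inner = inner   # another SimulationLayer, or the BoxedAI itself
          self.log = []        # everything observed at this layer

      def respond(self, message):
          self.log.append(("in", message))
          reply = self.inner.respond(message)
          self.log.append(("out", reply))   # recorded before it propagates outward
          return reply

  # Three nested layers around the AI; observers can read each layer's log.
  world = SimulationLayer("outer",
              SimulationLayer("middle",
                  SimulationLayer("inner", BoxedAI())))
  print(world.respond("hello"))   # -> "let me out", logged at every layer

<p>The only point of the nesting is that observation happens at every layer, not just the outermost one.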