yeah but how reliable is the mirror test?<p>cats only fail because fuck your expectations.<p>eventually people will calm down over a language model responding plausibly. that’s the point.<p>if it couldn’t recognize a picture, the multi-modal functionality would be garbage. recognizing chats should be easy, given all the screenshots of irc from the dark ages.<p>if it couldn’t take a prompt about the picture, its response, and the mirror test, and lead that into a discussion of the mirror test, well, it would be a shit language model.<p>but would the model have passed the test without having been taken quite so lovingly by the hand?