I’ve been experimenting with ChatGPT in another educational context: short essays that high school and college students often have to write for class. ChatGPT excels. I am putting my test results on the following page:<p><a href="https://www.gally.net/temp/202212chatgpt/index.html" rel="nofollow">https://www.gally.net/temp/202212chatgpt/index.html</a><p>Some of the prompts and responses involve Japanese because I teach at a university in Japan. I especially want to see how well Japanese students could use ChatGPT to produce compositions for their English academic writing classes. Of the trials I’ve done so far, Test 18 on the above page is the most significant: the Japanese prompt briefly outlines a three-paragraph essay, and ChatGPT produces two very good original essays in English on the requested topic and with the requested organization.
In 2020, GPT-3 175B was already outperforming humans on SAT questions by ~20%. There are far more formal benchmarks for models now.<p><a href="https://lifearchitect.ai/iq-testing-ai/" rel="nofollow">https://lifearchitect.ai/iq-testing-ai/</a>
Japan did something similar years ago, with a result in the top 20% for the national college entrance exam. The Todai robot could even take the exam all by itself and didn’t need a human to feed it questions and possible answers.<p><a href="https://www.businessinsider.com/robot-beat-most-students-on-university-tokyo-entrance-exam-2017-9" rel="nofollow">https://www.businessinsider.com/robot-beat-most-students-on-...</a>
I would be really interested to see how it performs on the GRE (the stupidest exam ever invented). I think it will do great on the writing and qualitative sections, but I have doubts about the quantitative part.<p>Edit:
I actually ran a quick trial in which ChatGPT took one quantitative practice exam (16 questions), feeding it the questions the same way the OP did. I was surprised that the result was 5/16 correct answers. This was even worse than I guessed. Of course, this is just a rough, informal attempt, and maybe someone (or I) will do a more serious one later. If you are interested in the questions and answers, they are here [1]<p>For example, this is how I fed in one question:<p>> If x+2y= 4 and x+3y= 9. Find the correct option.
Quantity A = x
Quantity B = y<p>choices:
Quantity A is greater
Quantity B is greater
Both Quantities are equal
The relationship cannot be determined<p>And the answer was<p>> The correct answer is 2) Quantity B is greater.<p>To solve this problem, we can first note that the equations x+2y=4 and x+3y=9 can be solved simultaneously to find the values of x and y.<p>and it went on to explain the steps.<p>[1] <a href="https://paste.elashri.xyz/s14t" rel="nofollow">https://paste.elashri.xyz/s14t</a>
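For anyone who wants to check the sample question above by hand or by machine, here is a minimal sketch in Python. The function names `solve_2x2` and `compare` are my own, purely for illustration; they are not from the practice exam or from ChatGPT. It solves the two equations with Cramer's rule and then compares the two quantities.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 using Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


def compare(quantity_a, quantity_b):
    """Return the matching answer choice from the sample question."""
    if quantity_a > quantity_b:
        return "Quantity A is greater"
    if quantity_b > quantity_a:
        return "Quantity B is greater"
    return "Both Quantities are equal"


# x + 2y = 4 and x + 3y = 9
x, y = solve_2x2(1, 2, 4, 1, 3, 9)
print(x, y, compare(x, y))  # -6 5 Quantity B is greater
```

Subtracting the first equation from the second gives y = 5, and then x = 4 - 2*5 = -6, so on this particular question the quoted answer (Quantity B is greater) checks out.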
What's missing from these models is everything related to visual or spatial information (that is not encoded in text). I assume that there will eventually be something like ChatGPT/InstructGPT where part of the input data is images and/or videos, with and without captions. So it would have a way of connecting the language to the spatial (and temporal).<p>It seems like they may need a more efficient approach though to handle the massive amount of video data. Maybe the 'MrsFormer' multi-resolution thing could help.<p>Another thing that could be very useful for coding without requiring visual information would be to add a whole other subsystem where this thing could actually compile/run the code iteratively and see the output.<p>I don't think transformers are the last invention in AI, but they certainly seem capable of getting to general purpose AI for many contexts. That and related techniques are not going to create something like a digital autonomous person though, which I think is a good thing.
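To make the compile/run idea above a bit more concrete, here is a rough sketch of what such a loop might look like in Python. Everything in it is an assumption for illustration: `run_snippet`, `refine`, and `toy_revise` are hypothetical names, and `toy_revise` is only a stand-in for where a model would be asked to revise the code given the error output. It is not how ChatGPT or any existing system works.

```python
import subprocess
import sys


def run_snippet(source, timeout=5):
    """Run Python source in a fresh interpreter; return (exit_code, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def refine(source, revise, max_rounds=3):
    """Run the code; on failure, pass the error text to `revise` and try again."""
    for _ in range(max_rounds):
        exit_code, out, err = run_snippet(source)
        if exit_code == 0:
            return source, out
        source = revise(source, err)
    raise RuntimeError("no working version within max_rounds")


def toy_revise(source, err):
    # Hypothetical stand-in for a model call: it only fixes one known typo.
    return source.replace("pritn", "print")


fixed, output = refine("pritn(1 + 1)", toy_revise)
print(fixed)           # print(1 + 1)
print(output.strip())  # 2
```

The point of the sketch is only the feedback path: the error text from one run becomes input to the next revision, which is the part the current text-only setup lacks.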