When I was young and learning math, my father always forbade me from looking at the answer in the back of the textbook. “You don’t work backwards from the answer!” I think this is right.<p>In life, we rarely have the answer in front of us; we have to work it out from the things we know. It’s this struggling that builds a muscle you can then apply to any problem. ChatGPT, I suspect, is akin to looking up the answer. You’re failing to exercise the muscle needed to solve novel (to you) problems.
These comments are filled with misunderstandings of the result. There were three groups of kids:<p>1. Control, with no LLM assistance at any time.<p>2. "GPT Base", raw ChatGPT as provided by OpenAI.<p>3. "GPT Tutor", improved by the researchers to provide hints rather than complete answers and to make fewer mistakes on their specific problems.<p>On study problem sets ("as a study assistant"), kids with access to either GPT did better than control.<p>When GPT access was subsequently removed from all participants ("on tests"), the kids who studied with "GPT Base" did worse than control. The kids with "GPT Tutor" were statistically indistinguishable from control.
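For context on what "provide hints rather than complete answers" means mechanically: per the paper, GPT Tutor was built by prompting GPT-4 and feeding it the correct solutions, not by retraining the model. A minimal sketch of the hint-only idea using the OpenAI Python client — the prompt wording, model name, and helper function below are my own illustration, not the researchers' actual setup:<p><pre><code>from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical "hints, not answers" instruction; the paper's real prompt
# (and its per-problem answer keys) is not reproduced in the article.
TUTOR_PROMPT = (
    "You are a math tutor. Never reveal the final answer. "
    "Instead, point out the next step, ask a guiding question, "
    "or flag the first error in the student's work."
)

def tutor_hint(student_work: str) -> str:
    # One round-trip: system prompt constrains the model to hinting only.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": TUTOR_PROMPT},
            {"role": "user", "content": student_work},
        ],
    )
    return response.choices[0].message.content

print(tutor_hint("Solve 3x + 5 = 20. I got x = 8. Is that right?"))
</code></pre>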
Used incorrectly? Yes.<p>LLMs, for me, have been tremendously useful in learning new concepts. I frequently feed them my own notes and ask them to correct any misunderstandings, or to expand on things I don’t understand.<p>I use them like I would an on-demand tutor, but I can totally understand how they could be used as a shortcut that wouldn’t be helpful.<p>In the same way, I can hire a tutor that will help me actually learn, or I can hire a “tutor” that just does the homework for me. I’ve worked as a tutor, so I’ve seen people looking for both, and people who don’t want to learn are always going to find a way. People who do want to learn are also going to find a way.
From the abstract:<p>“Consistent with prior work, our results show that access to GPT-4 significantly improves performance (48% improvement for GPT Base and 127% for GPT Tutor). However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction for GPT Base).”<p>Kids who use ChatGPT actually do “significantly” better, according to the authors. Now, I don’t know if “significantly” means statistically significant here, because I haven’t read the methodology, but a 127% increase in performance must be something. That said, that’s a clickbaity title if I’ve ever seen one.<p>Edit: Upon closer reading, the increase in performance is statistically significant. Also, “access to GPT” in this case means having GPT open while solving the problems, not studying with GPT and then solving the problems, which was my first understanding from the clickbaity title. The results are not terribly surprising in that regard.
Story time. I always struggled with math as a kid, from school through high school, and then didn't touch it much until Uni. Teachers typically couldn't explain things in a way where I "got it" in a school setting. I had some success with a private tutor to get me over the line in high school.<p>Then at Uni I was doing Computer Graphics, which included advanced (for me) math. I was panicked, and initially struggled, until one of my good friends, who was studying the same course and is VERY good at math, was able to answer my vague "I don't get it" questions, or at least guide me to more specific ones.<p>I think I'm quite a visual learner; I don't think at that time there was a concept of people learning "differently". Luckily my good friend was also a visual learner, along with being very good at math. It was like someone was able to see how my brain worked and feed me information in a way it could compile. I became quite good at math after that.<p>You really need to learn how to learn. It's fascinating, but also horrifying, when I now consider all the lives that have been negatively impacted because this wasn't understood, and people were led to believe they couldn't do something which maybe they really wanted to be able to do.<p>If GenAI can help with that, I'm all in.
If I lived before the tape measure was invented and relied on carefully placing my metersticks to measure things, I could get really good at measuring without the need for a measuring tape. After all, a measuring tape is just a few flexible metersticks anyway, and if you need to measure something longer than the full length of the tape, you are screwed.<p>If you take the measuring tape away from the person who relied on that tool, instead of being good at using a meterstick (or perhaps no tool besides their own arm length), they are suddenly not going to be able to measure, unless they go through the effort of learning to measure without the tape.<p>You can argue that the measuring tape is a crutch preventing people from learning how to properly measure, and that it has its own limitations, but regardless it's still really helpful, especially for people who only need to measure things occasionally, and not super long things.<p>ChatGPT is a tool. Just like with all other tools, like computers, cars, etc., if you take it away, most people cannot perform the function they relied on the tool to help them do.
Why is this surprising? All such tools hamper learning. If you want to learn, read books, read and write. Don't use a spellchecker for your language exam. No calculator for calculus. Pen and paper. How is this going backwards :(
The title could be worded better. Kids using "base" GPT-4 performed poorly, but the ones with access to a specially prompted "tutor" GPT-4 did okay. The study was purposefully done in a domain the current SoTA LLMs struggle in (math).<p>From the (draft!) paper's abstract:<p><pre><code> A key remaining question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks. This kind of skill learning is critical to long-term productivity gains, especially in domains where generative AI is fallible and human experts must check its outputs.
..
Consistent with prior work, our results show that access to GPT-4 significantly improves performance (48% improvement for GPT Base and 127% for GPT Tutor).
However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction for GPT Base). That is, access to GPT-4 can harm educational outcomes.
These negative learning effects are largely mitigated by the safeguards included in GPT Tutor.
</code></pre>
<a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486" rel="nofollow">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486</a>
Not surprising. Tests are all about memorising things. If you don't need to memorise everything because it's on Google, you won't.<p>Thus, when the test rolls around, nothing is memorised and they do badly.<p>It's like memorising phone numbers vs. keeping them in the contacts app. Before, I memorised tons of numbers, but now they're all in the app and I barely recall my own.
I was willing to entertain the idea that they could do better. I guess the tests would have to be written to leverage the skill.<p>That said, all things being equal, kids who write notes by hand outperform kids who type them, even touch-type them. So maybe the old ways are better in this specific brain-knowledge-competency-understanding-forming space?
It seems kind of obvious, no?<p>The act of repetition and processing the data ourselves is what leads to a deeper understanding, and asking a chatbot for an answer seems like it would skip the thinking required when learning "the old-fashioned way."<p>Maybe we can learn how to incorporate chatbots in education, but I suspect there need to be guardrails on when and how they are used so students can get the benefit of doing the work themselves.
> A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids.<p>Is it just me, or does this directly contradict the title?
What if the test is irrelevant to the current times?<p>“Those with ChatGPT solved 48 percent more of the practice problems correctly, but they ultimately scored 17 percent worse on a test of the topic that the students were learning.”<p>So, in the real world, where people can use ChatGPT in their jobs, the kids who use it will do better than the kids who don’t.<p>Maybe a better test is: can you catch ChatGPT when it is wrong? Not: can you answer without ChatGPT?
I recently used AI assistants for help with programming homework. My usual prompts include "help me think in the right direction", "is my thinking correct", etc. I also find myself copy-pasting a question into the chat to understand it better.<p>I had a suspicion that this was not aiding my learning process, even though I am able to "solve" more problems. Nice to see this confirmed. Time to stop!
Side note and blog promotion: I find it fascinating that ChatGPT can easily simulate the age of a child when giving answers for homework: <a href="https://www.fabianzeindl.com/posts/chatgpt-simulating-agegroups" rel="nofollow">https://www.fabianzeindl.com/posts/chatgpt-simulating-agegro...</a>
What were the primary reasons that made students who used ChatGPT do poorly on math assessments, even though they had worked correctly through a greater number of practice problems?
> A draft paper about the experiment was posted on the website of SSRN, formerly known as the Social Science Research Network, in July 2024. The paper has not yet been published in a peer-reviewed journal and could still be revised.<p>Should have started with that.<p>A study without independent replication hardly counts as «researchers found», much less one that hasn't even been peer-reviewed yet!
I think the problem that people don't see anymore is with tests themselves. A clever idea is worth more than a single tick in the correct checkbox. This applies to maths as well. Tests are faster to check and, supposedly, objective, but a viva voce exam is still superior imho.
The evaluation method is wrong.<p>It's like when cars first came out: you ask people to drive cars for a month and they get used to them. Then you ask them to compete in a horse race and see how fast they can go.<p>We should evaluate how fast they solve a problem, no matter how.
I use ChatGPT 4o to check my child's homework, but I forbid them from using it directly. That way, I can make sure the work is correct (or at least wrong in the same way as ChatGPT) without straining my tired brain.
I think the way to use ChatGPT is to have it explain a concept once and give a few examples.<p>After that, the student should struggle the old-fashioned way with problems.<p>I would like to see a study that looks at this approach.
Did nobody read the article? It says right there that the students who used ChatGPT right, as a tutor, did much better than their peers.<p>If your human tutors just give you the answers when you ask for them, how do you think it'll go?
I have a visceral dislike, even hatred, for what the LLM hype brought the world. The never-ending slop it spouts, filling up the entire internet. More and more I get confronted with images and media that turn out to be AI-generated; when I find out, I am disgusted and just close the tab.<p>Soulless drivel, endlessly streaming.<p>And I'm confident that the education system as we know it will be severely damaged because of it.<p>Even in our own field, I can guarantee you that software developers who "grew up" with these garbage AI assistants will be worse coders than the generation that came before.
You will never develop the understanding, the insight, that's needed by chatgpt'ing your way through college and life.<p>Excellent news for my own market value of course, but I don't hesitate to say that I regret the LLM hype happened, the impact on the world is overwhelmingly negative (not even touching on the catastrophic environmental and financial cost to society).
Are people not reading the article here?<p>Let me tldr:<p><pre><code> - Study had 3 groups: normal GPT; GPT with a system prompt to act as a tutor, focusing on giving hints, not answers; and no GPT
Group 1 (normal GPT)
- 48% better on practice problems
- 17% worse on test
Group 2 (tutor GPT)
- 127% better on practice problems
- equal test score to control group
GPT errors:
- 50% error rate
- 8% error on arithmetic problems
- step-by-step instructions were wrong 42% of the time
- GPT tutor was fed answers
- students with GPT and GPT tutor predicted that they did better (so both groups were overconfident)
</code></pre>
Paper: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486" rel="nofollow">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486</a><p>I'll reply to this comment with my own opinion, but many of the comments here are not responding to the article's content.