When the Robotic Maths Tutor is Wrong - Can Children Identify Mistakes Generated by ChatGPT?
Author
Helal, Manal
Holthaus, Patrick
Wood, Luke
Velmurugan, Vignesh
Lakatos, Gabriella
Moros Espanol, Sílvia
Amirabdollahian, Farshid
2299/28285
Abstract
This study examines the integration of Large Language Models (LLMs), in particular ChatGPT-powered robots, as educational tools in primary school mathematics. As Artificial Intelligence (AI) increasingly permeates educational settings, we investigate how young learners respond to errors made by these LLM-powered robots. We conducted a user study with the Pepper robot in a primary school classroom, in which 77 primary school students across Years 3 to 5 interacted with the robot. Our findings, which are statistically significant, show that most students, regardless of year group, could distinguish between correct and incorrect responses generated by the robots, demonstrating a promising level of understanding of and engagement with the AI-driven educational tool. We also observed that whether students answered the Maths questions correctly significantly influenced their ability to identify errors, underscoring the importance of prior knowledge in verifying LLM responses. In addition, we examined potential confounding factors such as age and gender. These findings point to the importance of integrating AI-powered educational tools gradually, under the guidance of domain experts and following thorough verification processes. Finally, our study calls for further research to establish best practices for implementing AI-driven pedagogical approaches in educational settings.