Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?

Jaromir Savelka, Arav Agarwal, Christopher Bogart, Yifan Song, Majd Sakr
Mar 2023
Abstract
We evaluated the capability of generative pre-trained transformers (GPT) to pass assessments in introductory and intermediate Python programming courses at the postsecondary level. Discussions of potential uses (e.g., exercise generation, code explanation) and misuses (e.g., cheating) of this emerging technology in programming education have intensified, but to date there has not been a rigorous analysis of the models' capabilities in the realistic context of a full-fledged programming course with a diverse set of assessment instruments. We evaluated GPT on three Python courses that employ assessments ranging from simple multiple-choice questions (no code involved) to complex programming projects with code bases distributed into multiple files (599 exercises overall). Further, we studied if and how successfully GPT models leverage feedback provided by an auto-grader. We found that the current models are not capable of passing the full spectrum of assessments typically involved in a Python programming course (<70% on even entry-level modules). Yet, it is clear that a straightforward application of these easily accessible models could enable a learner to obtain a non-trivial portion of the overall available score (>55%) in introductory and intermediate courses alike. While the models exhibit remarkable capabilities, including correcting solutions based on auto-grader feedback, some limitations exist (e.g., poor handling of exercises requiring complex chains of reasoning steps). These findings can be leveraged by instructors wishing to adapt their assessments so that GPT becomes a valuable assistant for a learner as opposed to an end-to-end solution.