Abstract
Prior studies have shown that classroom dialogue, through which knowledge construction and social interaction among students take place, is important for academic performance. However, most of these studies were based on small-scale or qualitative data, and few have explored the availability and potential of big data collected from online classrooms. To address this gap, this paper uses natural language processing techniques to analyze dialogues in the live classrooms of a large online learning platform in China. Features describing interaction types and emotional expression are extracted from the classroom dialogues. We then develop neural network models based on these features to distinguish high-performing from low-performing students, and employ interpretable AI (artificial intelligence) techniques to identify the most important predictors in these models. In both STEM (science, technology, engineering, mathematics) and non-STEM courses, we find that high-performing students consistently exhibit more positive emotion, cognition, and off-topic dialogue in all stages of the lesson than low-performing students. However, while metacognitive dialogue is an important predictor in non-STEM courses, this effect is not found in STEM courses. In addition, high-performing students in non-STEM courses show negative emotion in the last stage of lessons, whereas their counterparts in STEM courses show positive emotion.