LINGUISTIC ANALYSIS AND A HYBRID HUMAN-AUTOMATIC COACH FOR IMPROVING MATH IDENTITY. (NATIONAL SCIENCE FOUNDATION Grant No. #1739012, CYBERLEARNING AND FUTURE LEARNING TECHNOLOGIES)

This was a joint project led by researchers at the University of Pennsylvania (Jaclyn Ocumpaugh and Ryan Baker), Georgia State University (Scott Crossley), and Imagine Learning (Matthew Labrum).

This project studied an existing hybrid human-automatic learning system used at scale: the GenieMail system within Imagine Learning’s Reasoning Mind platform. We determined how students’ behaviors in the Reasoning Mind system, demographics, and mathematics skill relate to their math identity. This work will enable researchers to develop proxy measures for math identity that can be used to drive interventions.

This project has shown that students who have inconsistent records of performance (e.g., high variability in the number of problems they are able to answer correctly) are more likely to have low indicators of math identity (e.g., Karumbaiah et al., 2019; Slater et al., 2018). It has also shown that students with high math identity are more likely to use more sophisticated language (Crossley et al., 2018), and it has discovered demographic differences in help-seeking behaviors that may help researchers and practitioners to better design for equity (Karumbaiah et al., 2019).

In addition to clarifying how students' experiences within Reasoning Mind relate to their math identity, this project has supported the development and improvement of several Natural Language Processing tools, including the Tool for the Automatic Analysis of Cohesion (TAACO) 2.0, the Tool for the Automatic Analysis of Lexical Diversity (TAALED), and the Grammar and Mechanics Error Tool (GAMET).

This work has contributed to the training of several students at the University of Pennsylvania and Georgia State University, including Shamya Karumbaiah, Anya Ma, Stefan Slater, Franklin Bradfield, Analyn Bustamante, Minkyung Kim, Robert Phillips, Jacob Synder, Keegan Wray, Qian Wan, Yi Tian, Rorik Lol Tywoniw, and Genggeng Zheng.

Acknowledgement: This material is based upon work supported by the National Science Foundation under Grant No. #1739012.

Disclaimer: Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Selected Publications

Crossley, S. A., Karumbaiah, S., Ocumpaugh, J., Labrum, M., & Baker, R. (2020). Predicting math identity through language and click-stream patterns in a blended learning mathematics program for elementary students. Journal of Learning Analytics, 7(1), 19-37. [doi] [pdf]

Crossley, S. A., Bradfield, F., & Bustamante, A. (2019). Using human judgments to examine the validity of automated grammar, syntax, and mechanical errors in writing. Journal of Writing Research, 11(2), 251-270. doi: 10.17239/jowr-2019.11.02.01. [pdf]

Crossley, S. A., Karumbaiah, S., Labrum, M., Ocumpaugh, J., & Baker, R. (2019). Predicting Math Success in an Online Tutoring System Using Language Data and Click-Stream Variables: A Longitudinal Analysis. Proceedings of Language Data and Knowledge, 70, (pp. 1-13). Open Access Series in Informatics (OASIcs). doi: 10.4230/OASIcs.LDK.2019.25. [pdf]

Crossley, S. A., Kyle, K., & Dascalu, M. (2019). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating semantic similarity and text overlap. Behavior Research Methods, 51(1), 14. [pdf]

Crossley, S., Bustamante, A., & Bradfield, F. (2018). GAMET: A tool for automatically assessing grammar and mechanic errors in learner corpora. Presented at the 14th American Association for Corpus Linguistics (AACL) Conference. Atlanta, GA.

Crossley, S. A. (2018). How Many Words Needed? Using Natural Language Processing Tools in Educational Data Mining. Proceedings of the 11th International Conference on Educational Data Mining (EDM). [pdf]

Ocumpaugh, J., Baker, R. S., Karumbaiah, S., Crossley, S. A., & Labrum, M. (2020). Affective Sequences and Student Actions Within Reasoning Mind. In I. Bittencourt, M. Cukurova, K. Muldner, R. Luckin, & E. Millán (Eds.), Artificial Intelligence in Education. AIED 2020. Lecture Notes in Computer Science, vol. 12163. Springer, Cham. [doi] [pdf]

Slater, S., Ocumpaugh, J., Baker, R., Labrum, M., & Li, J. (2018). Identifying Changes in Math Identity Through Adaptive Learning Systems Use. Proceedings of the 26th International Conference on Computers in Education. [pdf]

Crossley, S. A., Ocumpaugh, J., Labrum, M., Bradfield, F., Dascalu, M., & Baker, R. (2018). Modeling Math Identity and Math Success through Sentiment Analysis and Linguistic Features. Proceedings of the 11th International Conference on Educational Data Mining (EDM). [pdf]

Karumbaiah, S., Ocumpaugh, J., Labrum, M. J., & Baker, R. S. (2019). Temporally Rich Features Capture Variable Performance Associated with Elementary Students' Lower Math Self-concept. Proceedings of the Workshop on Social-Emotional Learning at the 9th International Learning Analytics and Knowledge Conference. [pdf]

Karumbaiah, S., Ocumpaugh, J., & Baker, R. S. (2019). The Influence of School Demographics on the Relationship Between Students' Help-Seeking Behavior and Performance and Motivational Measures. Proceedings of the 12th International Conference on Educational Data Mining. [pdf]

Tywoniw, R., Crossley, S. A., Ocumpaugh, J., Karumbaiah, S., & Baker, R. (2020). Relationships Between Math Performance and Human Judgments of Motivational Constructs in an Online Math Tutoring System. In I. Bittencourt, M. Cukurova, K. Muldner, R. Luckin, & E. Millán (Eds.), Artificial Intelligence in Education. AIED 2020. Lecture Notes in Computer Science, vol. 12164. Springer, Cham. [doi] [pdf]

Tools this grant helped to develop

TAACO: TOOL FOR THE AUTOMATIC ANALYSIS OF COHESION

GAMET: GRAMMAR AND MECHANICS ERROR TOOL

TAALED: TOOL FOR THE AUTOMATIC ANALYSIS OF LEXICAL DIVERSITY

Data Access

Researchers may request access to select data from this project by contacting Co-PI Ryan Baker at rybaker AT upenn DOT edu. The de-identified data will be made available, for research purposes only, after researchers have agreed to our terms and conditions, which include strict privacy guidelines. Researchers who have agreed to these terms and conditions will receive login information for an online archive.

For other information on this project, please contact Jaclyn Ocumpaugh at ojaclyn AT upenn DOT edu. This website was last updated August 2019.