Big Data and Education


A Massive Online Open Textbook (MOOT)
7th Edition (10th Anniversary Edition)
by Ryan Baker
in cooperation between the University of Pennsylvania, Teachers College, Columbia University, and the Columbia Center for New Media Teaching and Learning

As seen on Coursera (2013) and edX (2015, 2017, 2018, 2019, 2020, 2023)

Chapter 1: Prediction Modeling
Video 1: Introduction [YouTube] [pptx]
Video 2: Regressors [YouTube] [pptx]
Video 3: Classifiers part 1 [YouTube] [pptx]
Video 4: Classifiers part 2 [YouTube] [pptx]
Video 5: Case study in classification [YouTube] [pptx]
Video 6: Advanced Classifiers [YouTube] [pptx]
Video 7: Explainable AI [YouTube] [pptx]

Chapter 2: Model Goodness and Validation
Video 1: Detector confidence [YouTube] [pptx]
Video 2: Diagnostic metrics: part 1 [YouTube] [pptx]
Video 3: Diagnostic metrics: part 2 [YouTube] [pptx]
Video 4: Diagnostic metrics: part 3 [YouTube] [pptx]
Video 5: Cross-validation and over-fitting [YouTube] [pptx]
Video 6: Types of validity [YouTube] [pptx]
Video 7: Algorithmic Bias [YouTube] [pptx]

Chapter 3: Behavior Detection
Video 1: Ground Truth [YouTube] [pptx]
Video 2: Data synchronization [YouTube] [pptx]
Video 3: Feature engineering [YouTube] [pptx]
Video 4: Automated feature generation and selection [YouTube] [pptx]
Video 5: Knowledge engineering and data mining [YouTube] [pptx]
Video 6: Tweaking towards optimality [YouTube] [pptx]
Video 7: Transfer learning and active learning [YouTube] [pptx]

Chapter 4: Knowledge Inference
Video 1: Knowledge Tracing [YouTube] [pptx]
Video 2: Bayesian Knowledge Tracing [YouTube] [pptx]
Video 3: Logistic Knowledge Tracing [YouTube] [pptx]
Video 4: Item Response Theory [YouTube] [pptx]
Video 5: Advanced Bayesian Knowledge Tracing [YouTube] [pptx]
Video 6: Deep Knowledge Tracing [YouTube] [pptx]
Video 7: Memory Algorithms [YouTube] [pptx]

Chapter 5: Relationship Mining
Video 1: Correlation Mining [YouTube] [pptx]
Video 2: Causal Mining [YouTube] [pptx]
Video 3: Association Rule Mining [YouTube] [pptx]
Video 4: Sequential Pattern Mining [YouTube] [pptx]
Video 5: Network Analysis [YouTube] [pptx]
Video 6: Epistemic Network Analysis [YouTube] [pptx]

Chapter 6: Structure Discovery
Video 1: Clustering [YouTube] [pptx]
Video 2: Cluster Validation [YouTube] [pptx]
Video 3: Advanced Clustering Algorithms [YouTube] [pptx]
Video 4: Applications of Clustering in EDM [YouTube] [pptx]
Video 5: Factor Analysis [YouTube] [pptx]
Video 6: Knowledge Structure: Q-Matrixes [YouTube] [pptx]
Video 7: Knowledge Structures: Other Approaches [YouTube] [pptx]
Video 8: Knowledge Structures: Learning Curves [YouTube] [pptx]

Chapter 7: Advanced Data Sources
Video 1: Advanced Data Sources: Text Mining [YouTube] [pptx]
Video 2: Advanced Data Sources: Foundation Models [YouTube] [pptx]
Video 3: Advanced Data Sources: Multimodal Learning Analytics [YouTube] [pptx]

Chapter 8: Advanced Topics
Video 1: Discovery with Models [YouTube] [pptx]
Video 2: Discovery with Models Case Study [YouTube] [pptx]
Video 3: Hidden Markov Models [YouTube] [pptx]
Video 4: Reinforcement Learning [YouTube] [pptx]
Video 5: Conclusions and Future Directions [YouTube] [pptx]

Acknowledgements: Sincerest thanks to Elle Wang, Miggy Andres, Michael Cennamo, Stephanie Ogden, Luc Paquette, Jose Diaz, Michael de Leon, Therese Condit, Megan Carr, Christopher Cook, Tanvi Gupta, students who have recommended additions or corrections, and others.

These materials were created with generous support from the Army Research Laboratory, the National Science Foundation (#DRL-1418378, #DRL-1661987), and the Provost and President of Teachers College, Columbia University. The content represents the views of the author and does not necessarily represent the views of the National Science Foundation.

Bugs? Errors? Email Ryan Baker.

Please cite this MOOT and MOOC as Baker, R.S. (2023) Big Data and Education. 7th Edition. Philadelphia, PA: University of Pennsylvania.

Content from the previous edition can be accessed here!

All materials here copyright Teachers College, Columbia University, the University of Pennsylvania, and Columbia University, 2013-2023.

This work is licensed under a Creative Commons Attribution 4.0 International License.