mengelhard/mmci_applied_ds

Applied Data Science

Master of Management in Clinical Informatics (MMCi) Program

  • Course Director: Matthew Engelhard
  • Matt's office hours: TBD
  • Teaching Assistant: Andrea Burton
  • Andrea's office hours: TBD

Course Materials

  • Please review the syllabus by clicking here
  • Materials for each course weekend are linked in the Schedule below
  • We recommend you use Google Chrome when browsing this site or working in Google Colab
  • A glossary of terms related to the course material is available here and will be updated before each course weekend. Note that it is not necessary to memorize these terms; they are listed only for your reference.

Readings and Quizzes

  • There will be a brief quiz due before each weekend except weekend 6.
  • Each quiz (see Schedule) links to one or two articles that should be read before taking it.
  • All articles are also available as PDFs in the Resources section of Sakai.
  • Your answers must be entered in the Tests and Quizzes section of Sakai before the beginning of class.
  • You may take each quiz as many times as you like prior to the deadline.

Group Assignments

  • At the beginning of the course, you will choose one of two pathways for your group assignments.
  • Both pathways require four assignments in total, one due at the beginning of each of course weekends 2-5. Each assignment applies a specific machine learning method studied in class to a clinical or healthcare problem.
  • Students choosing the model development pathway will learn to work with health-related datasets and train and evaluate predictive models by modifying Python code in a series of Jupyter notebooks.
  • Students choosing the model evaluation pathway will learn to critically evaluate machine learning models presented in the clinical literature by rigorously analyzing a series of clinical papers.

Pathway 1: Model Development

  • Assignments will guide you through the model development process, from loading and preprocessing data to training and evaluating models.
  • Students choosing this pathway should either (a) have prior experience working in Python, or (b) have prior experience working in another scientific computing language (e.g., R, MATLAB) and be willing to learn enough Python syntax to modify and extend code blocks in the model development assignments. If you are not sure, please take a look at the posted model development assignments to get a better feel for what will be required.
  • To complete the assignments, you may either install Anaconda on your computer or work in Google Colaboratory, which allows you to write and execute Python code in your browser. Anaconda has a steeper learning curve than Colaboratory. We recommend Colaboratory for those with less experience, and Anaconda for those who are already familiar with it or feeling adventurous.
  • Recommended Python resources include the Duke Library tutorials, the Python Crash Course book, and Google's Python Class.
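
As a rough feel for the workflow described above (loading data, preprocessing, training, evaluating), here is a minimal sketch in plain Python. It is not taken from the course notebooks: the toy data, the function names, and the choice of a hand-written logistic regression are all assumptions made for illustration, and the actual assignments may use different datasets and libraries.

```python
import math

# 1. Load data. These values are invented for illustration, not course data.
#    Each row is (feature_1, feature_2); label 1 = malignant, 0 = benign.
train_X = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (6.0, 7.0), (7.0, 6.0), (6.5, 6.5)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(1.2, 1.8), (2.2, 1.1), (6.8, 6.1), (5.9, 7.2)]
test_y = [0, 0, 1, 1]


def fit_scaler(X):
    """Compute per-feature mean and standard deviation from the training set."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(d)]
    return means, stds


def scale(X, means, stds):
    """2. Preprocess: standardize features using training-set statistics."""
    return [[(row[j] - means[j]) / stds[j] for j in range(len(row))] for row in X]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """3. Train: fit weights and bias by batch gradient descent on the log loss."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(X, w, b, threshold=0.5):
    """Classify each row as 1 if the predicted probability meets the threshold."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= threshold else 0
        for row in X
    ]


def evaluate(y_true, y_pred):
    """4. Evaluate: accuracy, sensitivity, and specificity on held-out data."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


means, stds = fit_scaler(train_X)
w, b = train_logistic_regression(scale(train_X, means, stds), train_y)
preds = predict(scale(test_X, means, stds), w, b)
metrics = evaluate(test_y, preds)
print(metrics)
```

Note that the scaler is fit on the training set only and then applied to the test set, so no information from the held-out data leaks into preprocessing.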

Pathway 2: Model Evaluation

  • Assignments will train you to critically evaluate machine learning models presented in the clinical literature by briefly answering questions related to (a) the data source, (b) model development, (c) model evaluation, and (d) model deployment.
  • You will answer the same set of questions for a clinical paper in each of the following four areas:
    • methods for tabular clinical data
    • methods for electronic health record data
    • computer vision for medical imaging
    • natural language processing for clinical text
  • Questions are provided in the model evaluation questionnaire. Please note that not all questions apply to each paper, and a brief list of questions to omit for a given paper will be provided prior to the assignment.

Final Project

  • The course will culminate in a design project in which you propose to apply data science methods to a clinical topic of your choosing.
  • Project instructions and grading details are here.
  • Proposals are due before class on weekend 3, and the project is due before class on weekend 6.

Course Schedule

Weekend 1: Evaluating Predictive Models

  • Before Class
    • Review Course Site
    • Read Obermeyer and Emanuel, 2016
    • Read Chen and Asch, 2018
    • Complete Quiz 1 in Sakai
  • During Class (Asynchronous)
    • AL1: Predictive Models
    • AL2: Logistic Regression
  • During Class (Saturday)
    • Lecture: Intro to Health DS
    • Activity: Understanding Logistic Regression (key)
    • Lecture: Performance Measures
  • After Class (Model Development Pathway)
    • CE1: Welcome to the Jupyter Notebook
    • CE2: Visualizing Features of Breast Cancer Samples
  • After Class (Model Evaluation Pathway)
    • Evaluate Khera et al., 2021

Weekend 2: Learning in Neural Networks

  • Before Class
    • Read Engelhard et al., 2021
    • Complete Quiz 2 in Sakai
  • During Class (Friday)
    • Activity: Calculating Performance Measures
    • Lecture: Multilayer Perceptron
  • During Class (Saturday)
    • Activity: Understanding the MLP (key)
    • Lecture: The Model Development Process
  • After Class (Model Development Pathway)
    • CE3: Predicting Malignancy from Features of Breast Cancer Samples
    • CE4: Exploring Overfitting
  • After Class (Model Evaluation Pathway)
    • Evaluate Tomašev et al., 2019

Weekend 3: Medical Image Analysis

  • Before Class
    • Read Hinton, 2018
    • Read Wilson, 2019
    • Complete Quiz 3 in Sakai
  • During Class (Asynchronous)
    • Model Learning
    • AL4: Motivating CNNs
    • AL5: Spatial Convolution
    • AL6: Deep CNNs
  • During Class (Saturday)
    • Model Learning, in Brief
    • Activity: Guess & Check Regression
    • Medical Image Analysis
  • After Class (Model Development Pathway)
    • CE5: Identifying Handwritten Digits
    • CE6: Better MNIST Predictions
    • CE7 (Optional): Transfer Learning
  • After Class (Model Evaluation Pathway)
    • Evaluate Esteva et al., 2017

Weekend 4: Biomedical Text Processing

  • Before Class
    • Complete Final Project Proposal
    • Read Hirschberg and Manning, 2015
    • Complete Quiz 4 in Sakai
  • During Class (Friday)
    • Lecture + Activity: Protecting Against Overfitting
    • Lecture: Intro to NLP and Bag of Words Models
    • Lecture: Biomedical NLP in Practice
  • During Class (Saturday)
    • Discussion: Healthcare Applications of NLP
    • Activity: Building Text Features
    • Eric Poon Guest Lecture
  • After Class (Model Development Pathway)
    • CE8: Text Pre-Processing
    • CE9: Bag of Words Models
    • CE10 (Optional): A Simple Word Embedding Based Model
  • After Class (Model Evaluation Pathway)
    • Evaluate Taggart et al., 2018

Weekend 5: Working with Multi-Modal Health Data

  • Before Class
    • Watch Beede et al.
    • Complete Quiz 5 in Sakai
  • During Class (Asynchronous)
    • Lecture: Multi-Modal Health Data
    • Optional Lecture: Learning Word Embeddings
  • During Class (Saturday)
    • Lecture: Understanding Model Predictions
    • Activity: Revisiting the Model Development Process
    • Lecture: Wrapping Up
  • After Class
    • Complete Final Project

Weekend 6: Course Projects

  • Before Class
    • Final Project Report
  • During Class
    • Beyond Supervised Learning
  • After Class
    • Graduate!
