Materials for a technical, nuts-and-bolts course about increasing transparency, fairness, robustness, and security in machine learning.
- Lecture 1: Explainable Machine Learning Models
- Lecture 2: Post-hoc Explanation
- Lecture 3: Bias Testing and Remediation
- Lecture 4: Machine Learning Security
- Lecture 5: Machine Learning Model Debugging
- Lecture 6: Responsible Machine Learning Best Practices
- Lecture 7: Risk Mitigation Proposals for Language Models
Corrections or suggestions? Please file a GitHub issue.
Source: Simple Explainable Boosting Machine Example
- Lecture Notes
- Software Example
- Assignment 1
- Reading: Machine Learning for High-Risk Applications, Chapter 2 (pp. 33 - 50) and Chapter 6 (pp. 189 - 217)
- Check availability through GWU Libraries' access to O'Reilly Safari
- Lecture 1 Additional Materials
Source: Global and Local Explanations of a Constrained Model
- Lecture Notes
- Software Example
- Assignment 2
- Reading: Machine Learning for High-Risk Applications, Chapter 2 (pp. 50 - 80) and Chapter 6 (pp. 208 - 230)
- Check availability through GWU Libraries' access to O'Reilly Safari
- Lecture 2 Additional Materials
Source: Lecture 3 Notes
- Lecture Notes
- Software Example
- Assignment 3
- Reading: Machine Learning for High-Risk Applications, Chapter 4 and Chapter 10
- Check availability through GWU Libraries' access to O'Reilly Safari
- Lecture 3 Additional Materials
Source: Responsible Machine Learning
- Lecture Notes
- Software Examples
- Assignment 4
- Reading: Machine Learning for High-Risk Applications, Chapter 5 and Chapter 11
- Lecture 4 Additional Materials
Source: Real-World Strategies for Model Debugging
- Lecture Notes
- Software Examples:
  - Sensitivity Analysis
  - Residual Analysis
- Assignment 5
- Reading: Machine Learning for High-Risk Applications, Chapter 3 and Chapter 8
- Check availability through GWU Libraries' access to O'Reilly Safari
- Lecture 5 Additional Materials
A Responsible Machine Learning Workflow Diagram. Source: Information, 11(3) (March 2020).
- Lecture Notes
- Assignment 6 (Final Assessment)
- Reading: Machine Learning for High-Risk Applications, Chapter 1 and Chapter 12
- Lecture 6 Additional Materials
A diagram for retrieval augmented generation. Source: Lecture 7 notes.
- Lecture Notes
- Software Example
- Reading: Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, pp. 1-12, 47-53
- Lecture 7 Additional Materials
- Create a folder called `DNSC_6330_Software` in your GWU Google Drive (`My Drive`).
- To run the lecture examples:
  - Create a folder named `Lecture_01` inside the `DNSC_6330_Software` folder.
  - Save a copy of the class `01_Explainable_AI_Models.ipynb` notebook into the `Lecture_01` folder using the `File` -> `Save a copy in Drive` menu option, or download the class notebook and upload it into your folder.
  - In cell 1 of `01_Explainable_AI_Models.ipynb`, update the path to the `Lecture_01` folder:
    - Likely `%cd drive/My\ Drive/DNSC_6330_Software/Lecture_01/`
    - Use the `%cd` and `%ls` commands to find your folder if needed.
    - Generally the `drive.mount('/content/drive/', force_remount=True)` command can only be used once in a Colab session, so use the `%cd` and `%ls` commands in a different cell, or restart your Colab session if you see strange errors.
  - Download the example data from: https://drive.google.com/drive/folders/1jYZvT1j5khFnJC5NSqNeGiCNoOeib9YK?usp=sharing (click the triangle beside `Data` at the top -> Download, then unzip).
  - Download some necessary Python code from: https://drive.google.com/drive/folders/1BPXxGp0QAKRl1ZP6Vd1xKuCwitiLyuy6?usp=sharing (click the triangle beside `hrml_book` at the top -> Download, then unzip).
  - Upload these folders into the `DNSC_6330_Software` folder.
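If you also run the lecture notebooks outside of Colab, the first cell can detect the environment before trying to mount Google Drive. This is a minimal sketch, not part of the class notebooks; the `lecture_dir` path assumes the folder layout described above and should be edited if yours differs.

```python
import os

# Detect whether this notebook is running inside Google Colab.
try:
    from google.colab import drive  # available only inside Colab
    drive.mount('/content/drive/', force_remount=True)
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

# Assumed path from the setup steps above; edit as needed.
lecture_dir = '/content/drive/My Drive/DNSC_6330_Software/Lecture_01'
if IN_COLAB and os.path.isdir(lecture_dir):
    os.chdir(lecture_dir)
```

Outside Colab the import fails harmlessly and `IN_COLAB` is `False`, so the same cell works in a local Jupyter session.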
- To run the homework templates:
  - Create a folder called `assignments` in the `DNSC_6330_Software` folder.
  - Download, then upload, the notebook `assign_1_template.ipynb` into the `assignments` folder.
  - Create a folder called `data` in the `DNSC_6330_Software` folder.
  - Download the assignment data from the class GitHub: https://github.com/jphall663/GWU_rml/tree/master/assignments/data
  - Unzip the data files into CSV files and upload them into the `data` folder.
  - In `assign_1_template.ipynb`, add a cell before cell 1 that installs `h2o` and `interpret`: `!pip install interpret h2o`
  - In `assign_1_template.ipynb`, add a cell before cell 3 that connects the notebook to the data:

    ```
    from google.colab import drive
    drive.mount('/content/drive/', force_remount=True)
    # may need to be updated to the location on your drive
    %cd drive/My\ Drive/DNSC_6330_Software/assignments/
    %ls
    ```

  - Whenever asked, allow Colab to connect to your Google Drive.
  - Delete any `__pycache__` folders you see.
  - In the end, the `DNSC_6330_Software` folder should look like:
```
DNSC_6330_Software
├── assignments
│   └── assign_1_template.ipynb
├── data
│   ├── hmda_test_preprocessed.csv
│   └── hmda_train_preprocessed.csv
├── Data
│   ├── backdoor_testing
│   │   ├── constrained_backdoor_output.csv
│   │   ├── constrained_output.csv
│   │   ├── overfit_backdoor_output.csv
│   │   ├── overfit_output.csv
│   │   └── test_data.csv
│   ├── credit_line_increase.csv
│   ├── data_dictionary.csv
│   └── synthetic_data.csv
├── hrml_book
│   ├── explain.py
│   └── partial_dep_ice.ipynb
└── Lecture_01
    └── 01_Explainable_AI_Models.ipynb
```
- You can use the following commands in a Colab notebook to check your file structure:

  ```
  from google.colab import drive
  drive.mount('/content/drive/', force_remount=True)
  # may need to be updated to the location on your drive
  %cd drive/My\ Drive/DNSC_6330_Software/
  !apt-get -y install tree
  !tree
  ```
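If `tree` is unavailable, a short pure-Python check can verify the same layout. This is a sketch, not part of the class materials; the expected paths below are taken from the tree above (top-level entries only) and may change as the course files are updated.

```python
from pathlib import Path

# Expected paths from the folder tree above (a representative subset).
EXPECTED = [
    'assignments/assign_1_template.ipynb',
    'data/hmda_train_preprocessed.csv',
    'data/hmda_test_preprocessed.csv',
    'Data/credit_line_increase.csv',
    'hrml_book/explain.py',
    'Lecture_01/01_Explainable_AI_Models.ipynb',
]

def missing_files(root):
    """Return the expected paths that are absent under root."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Run `missing_files('/content/drive/My Drive/DNSC_6330_Software')` in Colab after mounting your drive; an empty list means the layout matches.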
- Add new assignment templates into the `assignments` folder.
- Add new code examples into `Lecture_XX/XX_notebook_name.ipynb` folders:
  - For Lecture 2 that would be `Lecture_02/02_Explainable_AI_Post_Hoc.ipynb`.