Machine learning algorithms can create more accurate models than linear models, but any gain in accuracy over more traditional, better-understood, and more easily explainable techniques is of little practical value to those who must explain their models to regulators or customers. For decades, the models created by machine learning algorithms were regarded as black boxes. However, a recent flurry of research has introduced credible techniques for interpreting complex, machine-learned models. The materials presented here illustrate applications or adaptations of these techniques for practicing data scientists.
- Practical ML interpretability examples
- Comparison of LIME, Shapley, and treeinterpreter explanations
General
- Interpretable Machine Learning
- Towards A Rigorous Science of Interpretable Machine Learning
- Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning
- A Survey Of Methods For Explaining Black Box Models
- Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda
- UC Berkeley CS 294: Fairness in Machine Learning
- Ideas for Machine Learning Interpretability
- An Introduction to Machine Learning Interpretability (or Blackboard electronic reserves)
- On the Art and Science of Machine Learning Explanations
Techniques
- Partial Dependence: Elements of Statistical Learning, Section 10.13 (sketch after this list)
- LIME: “Why Should I Trust You?” Explaining the Predictions of Any Classifier (sketch after this list)
- LOCO: Distribution-Free Predictive Inference for Regression (sketch after this list)
- ICE: Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation (sketch after this list)
- Surrogate Models (sketch after this list)
- TreeInterpreter: Random forest interpretation with scikit-learn (sketch after this list)
- Shapley Explanations: A Unified Approach to Interpreting Model Predictions (sketch after this list)
- Explainable neural networks (xNN): Explainable Neural Networks based on Additive Index Models
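The sketches below illustrate the techniques listed above; in every sketch, the dataset (scikit-learn's bundled diabetes or breast cancer data), the model, and any feature choices are illustrative assumptions rather than prescriptions from the cited works. First, a hand-rolled partial dependence calculation: sweep one feature over a grid, force every row to each grid value, and average the model's predictions (scikit-learn's `sklearn.inspection` module offers a built-in equivalent).

```python
# Partial dependence sketch: average the model's predictions while sweeping
# one feature over a grid and holding the rest of the data fixed.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)   # illustrative data
model = GradientBoostingRegressor(random_state=0).fit(X, y)

feature = "bmi"                                         # illustrative feature
grid = np.linspace(X[feature].min(), X[feature].max(), 20)

partial_dependence = []
for value in grid:
    X_swept = X.copy()
    X_swept[feature] = value               # force every row to the grid value
    partial_dependence.append(model.predict(X_swept).mean())

for value, pd_hat in zip(grid, partial_dependence):
    print(f"{feature} = {value:+.3f} -> partial dependence = {pd_hat:.1f}")
```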
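A minimal LIME sketch, assuming the open-source `lime` package (`pip install lime`): LIME fits a weighted, sparse linear model to perturbed samples around a single row and reports the resulting local feature weights.

```python
# LIME sketch: explain one prediction with a local, weighted linear surrogate.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()                              # illustrative data
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
# Explain the model's predicted probabilities for a single row.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())   # [(feature condition, local weight), ...]
```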
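A rough leave-one-covariate-out (LOCO) sketch: refit the model with each feature removed and compare held-out absolute errors. The conformal inference machinery of the cited paper is omitted, and the data and model are assumptions for illustration.

```python
# LOCO sketch: how much does held-out error change when a feature is left out?
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)     # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = RandomForestRegressor(random_state=0).fit(X_train, y_train)
err_full = np.abs(y_test - full.predict(X_test))

for col in X.columns:
    reduced = RandomForestRegressor(random_state=0).fit(
        X_train.drop(columns=col), y_train
    )
    err_reduced = np.abs(y_test - reduced.predict(X_test.drop(columns=col)))
    # Positive medians suggest the model relies on the dropped feature.
    print(f"{col}: median change in absolute error = "
          f"{np.median(err_reduced - err_full):.2f}")
```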
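Individual conditional expectation (ICE) curves are the per-row analogue of partial dependence: averaging them recovers the partial dependence curve, while their spread can reveal interactions the average hides. A minimal sketch with the same assumed data and model as above:

```python
# ICE sketch: one prediction curve per row as a single feature is swept.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)     # illustrative data
model = GradientBoostingRegressor(random_state=0).fit(X, y)

feature = "bmi"                                           # illustrative feature
grid = np.linspace(X[feature].min(), X[feature].max(), 20)

# ice[i, j] = prediction for row i with the feature forced to grid[j].
ice = np.column_stack(
    [model.predict(X.assign(**{feature: value})) for value in grid]
)
print("one ICE curve per row:", ice.shape)
print("their average is the partial dependence curve:", ice.mean(axis=0).round(1))
```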
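A minimal global surrogate sketch: train an interpretable model (here a shallow decision tree, an assumed choice) on the predictions of a complex model and read the result as an approximate description of the complex model's behavior.

```python
# Surrogate model sketch: a shallow tree is fit to the complex model's outputs.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True, as_frame=True)      # illustrative data
complex_model = GradientBoostingRegressor(random_state=0).fit(X, y)

surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, complex_model.predict(X))   # targets are model outputs, not y

# Fidelity: how well the surrogate mimics the complex model (R^2).
print("fidelity:", surrogate.score(X, complex_model.predict(X)))
print(export_text(surrogate, feature_names=list(X.columns)))
```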
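A minimal treeinterpreter sketch, assuming the `treeinterpreter` package (`pip install treeinterpreter`): each random forest prediction is decomposed into a bias term plus one additive contribution per feature.

```python
# treeinterpreter sketch: prediction = bias + sum of per-feature contributions.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from treeinterpreter import treeinterpreter as ti

X, y = load_diabetes(return_X_y=True)                      # illustrative data
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

prediction, bias, contributions = ti.predict(model, X[:3])
for pred, b, contrib in zip(prediction, bias, contributions):
    # Additivity check for each explained row.
    print("prediction:", float(np.ravel(pred)[0]),
          "bias + contributions:", float(np.ravel(b)[0]) + contrib.sum())
```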
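A minimal Shapley-value sketch, assuming the open-source `shap` package (`pip install shap`) and the same illustrative data and model: TreeExplainer's per-feature attributions plus the expected value sum to each individual prediction.

```python
# SHAP sketch: local Shapley-value attributions for a tree-based model.
import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)       # illustrative data
model = GradientBoostingRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)        # shape: (n_samples, n_features)

row = 0
base = float(np.ravel(explainer.expected_value)[0])
print("expected value + SHAP values:", base + shap_values[row].sum())
print("model prediction:            ", model.predict(X.iloc[[row]])[0])
```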