AI Model Interpretability: SHAP, LIME, and Integrated Gradients for XAI
Explaining black-box ML models for compliance, debugging, and stakeholder communication
Master explainable AI techniques including SHAP values, LIME, integrated gradients, and attention visualization to interpret machine learning models for debugging, compliance, and stakeholder communication.
Explainability is essential for debugging ML models and is increasingly expected, and in some regulated domains required, for compliance.

SHAP (SHapley Additive exPlanations) is a feature attribution method based on cooperative game theory. Each feature receives a SHAP value: its average marginal contribution to the prediction across feature coalitions, so that the values for one prediction sum to the difference between that prediction and the model's expected output. Globally, feature importance is the mean absolute SHAP value per feature. Locally, a waterfall plot explains an individual prediction. For a tree model:

    import shap
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_test)
    shap.summary_plot(shap_values, X_test, feature_names=features)

LIME (Local Interpretable Model-agnostic Explanations) fits a local linear approximation around each prediction using perturbed samples. Because it needs only the model's predictions, it works with any model type, including neural networks.

Integrated Gradients is a gradient-based attribution method for neural networks. It accumulates gradients along a path from a baseline to the input, which makes it more theoretically grounded than raw gradients: the attributions sum to the difference between the model's output at the input and at the baseline.

Attention visualization: for transformer models, visualize attention weights to get an indication of which tokens influence predictions. Use BertViz for interactive visualization.

When to use each: SHAP for tabular data (fast TreeSHAP for tree models, slower KernelSHAP for any model); LIME for model-agnostic explanations across data types (tabular, text, images); Integrated Gradients for neural networks.

Regulatory use: SHAP values for individual credit decision explanations (relevant to the EU AI Act and to what is often called the GDPR right to explanation), and feature importance for compliance audits.

Limitation: every explanation method approximates the model's true behavior, and explanations can be inconsistent for highly complex models.
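The idea of a SHAP value as a feature's marginal contribution can be made concrete with a brute-force calculation. The sketch below is a minimal pure-Python illustration, not the algorithm used by the shap library: it enumerates every feature coalition, which is feasible only for a handful of features, and it represents a "missing" feature by a single baseline value, whereas SHAP explainers typically average over a background dataset. The names shapley_values, toy_model, and baseline are hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input by enumerating all coalitions.

    Assumption: a feature outside the coalition takes its baseline value.
    Cost grows as 2^n in the number of features, so this is for toy cases.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the coalition's features take their real values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the additive term is credited entirely to the first feature, while the interaction term's contribution of 6 is split evenly between the second and third features. The three values sum to the difference between the model's output at x and at the baseline, which is the additivity property that SHAP plots such as the waterfall plot rely on.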