4 Counterfactual Explanations - Explaining and Debugging

XAI Course Notes

How to explain a machine learning model such that the explanation is truthful to the model and yet interpretable to people? This question is key to ML explanations research because explanation techniques face an inherent tradeoff between fidelity and interpretability — a high-fidelity explanation for an ML model tends to be complex and hard to interpret, while an interpretable explanation is often inconsistent with the ML model. In this talk, I will present counterfactual (CF) explanations that bridge this tradeoff. Rather than approximate an ML model or rank features by their predictive importance, a CF explanation “interrogates” a model to find required changes that would flip the model’s decision and presents those examples to a user. Such examples offer a true reflection of how the model would change its prediction, thus helping decision subjects decide what they should do next to obtain a desired outcome and helping model designers debug their model. Using benchmark datasets on loan approval, I will compare counterfactual explanations to popular alternatives like LIME and SHAP. I will also present a case study on generating CF examples for image classifiers that can be used for evaluating fairness and even improving the generalizability of a model.

explainable AI
XAI
machine learning
ML
data science
counterfactuals
causal inference
CI
Author

Oren Bochman

Published

Thursday, March 23, 2023

Session Video

Course Leaders:

  • Bitya Neuhof - DataNights
  • Yasmin Bokobza - Microsoft

Speaker:

  • Amit Sharma - Microsoft

Sharma is a Principal Researcher at Microsoft Research India. His work bridges causal inference (CI) techniques with machine learning to make ML models generalize better, be explainable, and avoid hidden biases. To this end, Sharma has co-led the development of the open-source DoWhy library for causal inference and the DiCE library for counterfactual explanations. The broader theme in his work is how ML can be used for better decision-making, especially in sensitive domains. In this direction, Sharma collaborates with NIMHANS on mental health technology, including a recent app, MindNotes, that encourages people to break the stigma and reach out to professionals.

His work has received many awards including:

  • a Best Paper Award at ACM CHI 2021 conference,
  • Best Paper Honorable Mention at ACM CSCW 2016 conference,
  • the 2012 Yahoo! Key Scientific Challenges Award and
  • the 2009 Honda Young Engineer and Scientist Award.

Amit received his:

  • Ph.D. in computer science from Cornell University and
  • B.Tech. in Computer Science and Engineering from the Indian Institute of Technology (IIT) Kharagpur.

What is this session about?

How to explain a machine learning model such that the explanation is truthful to the model and yet interpretable to people? This question is key to ML explanations research because explanation techniques face an inherent trade-off between fidelity and interpretability: a high-fidelity explanation for an ML model tends to be complex and hard to interpret, while an interpretable explanation is often inconsistent with the ML model.

In this talk, the speaker presented counterfactual explanations (CFX) that bridge this trade-off. Rather than approximate an ML model or rank features by their predictive importance, a CF explanation “interrogates” a model to find the required changes that would flip the model’s decision and presents those examples to a user. Such examples offer a true reflection of how the model would change its prediction, thus helping decision subjects decide what they should do next to obtain a desired outcome and helping model designers debug their model. Using benchmark datasets on loan approval, he compared counterfactual explanations to popular alternatives like LIME and SHAP, and presented a case study on generating CF examples for image classifiers that can be used for evaluating the fairness of models as well as improving their generalizability.

The speaker pointed out that his primary interest lies in causal inference (CI), and that when he later became interested in XAI his focus was on the intersection of CI and XAI.

Sharma shared that his initial work on XAI focused on deterministic, differentiable models. Only later, when people asked about using these methods with traditional ML models such as scikit-learn random forests, did he go back to the drawing board and discover that by sampling counterfactuals locally it is possible to get even better results.

Sharma also pointed out a shortcoming of algorithms like LIME and SHAP. While these present feature importance, their explanations are not actionable, in the sense that they fail to spell out which interventions would allow a person to cross the decision boundary, with the least resistance, into the zone of desired outcomes.

Outline


Background

Assessing human decision-making


A great starting point for an ML task is often to consider the pros and cons of human capabilities on that task. Sharma points out that in (Weichselbaumer 2019) researchers used counterfactual thinking to study whether employers discriminate against women wearing a headscarf. The idea was to send resumes to German companies while varying the names and photos of the applicants; German companies usually require a photo in the C.V. The study found evidence of discrimination.

Counterfactual Definition

What is a counterfactual


Sharma presents the definition of a counterfactual provided by Judea Pearl:

Given a system output y, a counterfactual y_{X_i=x'} is the output of the system had some input X_i been changed while everything else unaffected by X_i remained the same. — (Pearl 2009).

Under the holistic paradigm introduced in Smuts (1926), complex real-world systems are inherently interconnected, with the implication that a change to just one thing will end up changing everything. ML models of reality, in contrast, are reductionist and make simplifying assumptions; linear models and many traditional ML models therefore allow us to test a CF-type intervention.
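
For a linear model the effect of such an intervention can be read off directly. A small worked example (my own illustration, not from the talk): for f(x) = w^\top x + b, changing only feature X_i from x_i to x' gives

f(x_{X_i=x'}) - f(x) = w_i\,(x'-x_i)

so every other term cancels and the counterfactual effect is just the coefficient w_i scaled by the size of the change.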

And this can be very useful.

The many uses of CF models

Estimating f(X_i=x')-f(x) can provide:

  1. The individual effect of a feature X_i (a concrete code sketch follows this list):

\text{Effect}(X_i) = E[Y_{X_i=x'}\mid X=x,Y=y]-E[Y \mid X=x] \qquad \tag{1}

  2. An explanation of how important feature X_i is.

  3. Bias in model M, if X_i is a sensitive feature1

  4. More generally, CFs provide a natural way to debug ML models, e.g. by generating edge cases.
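
As a concrete illustration of estimating f(X_i=x') - f(x), here is a minimal sketch (my own, not from the talk) that fits a scikit-learn classifier on an invented loan-style dataset and compares the predicted approval probability before and after intervening on a single feature while holding everything else fixed:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Invented loan-style data, purely for illustration.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, 1_000),
    "credit_years": rng.integers(0, 30, 1_000),
})
y = ((X["income"] > 45_000) & (X["credit_years"] > 5)).astype(int)

clf = RandomForestClassifier(random_state=0).fit(X, y)

# Query instance x and a counterfactual x' where only `credit_years` is changed.
x = pd.DataFrame({"income": [40_000.0], "credit_years": [4]})
x_cf = x.copy()
x_cf["credit_years"] = 8  # intervene on one feature, hold everything else fixed

effect = clf.predict_proba(x_cf)[0, 1] - clf.predict_proba(x)[0, 1]
print(f"Estimated effect on P(approved): {effect:+.3f}")
```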

Why do we need CF Explanations?

Feature Importance


Feature importance is not enough?

A problem with SHAP & LIME


Suppose an ML model recommends that an individual should be denied a loan:

  • 🧑 Loan Officer: would like to understand why this individual was denied.

  • 👳 Individual: would also like to know what she could do to get the loan approved.

Sharma points out two shortcomings of traditional XAI methods:

  • Feature importance is inadequate to fully inform the stakeholders if it does not suggest a useful action.
  • Feature importance can have low fidelity 😡
    • The top feature may mandate unrealistic changes.

      e.g. “increase your income by 5x” 🙀

    • While a lower-ranked feature, such as credit years, may actually lie on the path of least resistance to getting the loan.

      e.g. “just wait three more years and you will be approved.” 👍

Desiderata for counterfactuals

  1. Actionability - ceteris paribus, a CFX should be actionable for the decision subject.
  2. Diversity - we want to see several different causal choices for changing the outcome.
  3. Proximity - the CFX should be similar to the “query” instance, in the sense of a local explanation.
  4. User constraints - it should only suggest actions that the user can actually perform. A loan applicant cannot easily become younger, change sex, or get a degree.2
  5. Sparsity - a CFX should only require changing a minimal set of features, i.e. a few small steps in two or three dimensions to cross the decision boundary.
  6. Causal constraints - suggested changes should respect the causal relations between features (e.g. gaining a degree also means getting older).
  • Going further, it is suggested that we should view CFX generation as combining feasibility and diversity components.

Before introducing his ideas Sharma references two prior works.

  • In the lengthy (Wachter, Mittelstadt, and Russell 2018), the authors suggest that, to comply with GDPR regulations, a CFX should take the form:

    Score p was returned because variables V had values (v_1,v_2,...) associated with them. If V instead had values (v_1',v_2',...), and all other variables had remained constant, then score p' would have been returned.

    They also propose an approach for coming up with suitable CFXs. Sharma references a formula:

    C= \arg \min_c loss_y(f(c),y)+|x-c| \qquad \tag{2}

    🏴 But this formula is not in the 📃 paper; perhaps it is a simplification of the idea.
    🤔 I believe it captures their recipe for generating a desirable CFX: pick a candidate c that stays close to the query x (the |x-c| term) while its prediction f(c) is pushed toward the desired outcome y by the loss term. A minimal sketch of this recipe appears after this list.

  • This approach is summarized in Molnar (2022), §9.3.1.1

  • In (Russell 2019) the author introduced a CFX algorithm, based on mixed-integer programming, that supports diversity. However, it is limited to linear ML models.
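
To make Equation 2 concrete, here is a minimal sketch (my own illustration, not code from either paper) that searches for a counterfactual of a toy logistic-regression "loan" model by minimizing a squared loss toward a desired score plus an L1 proximity penalty; the data, the desired probability of 0.6, and the weight lam are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LogisticRegression

# A toy "loan" model on two invented, standardized features (e.g. income, credit_years).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = (X @ np.array([1.5, 1.0]) + 0.2 * rng.normal(size=500) > 0).astype(int)
clf = LogisticRegression().fit(X, y)

def wachter_objective(c, x, desired_p=0.6, lam=10.0):
    """loss(f(c), y') + |x - c|: squared loss toward the desired score plus L1 proximity."""
    p = clf.predict_proba(c.reshape(1, -1))[0, 1]
    return lam * (p - desired_p) ** 2 + np.sum(np.abs(x - c))

x_query = np.array([-1.0, -0.5])  # an instance the model currently rejects
res = minimize(wachter_objective, x0=x_query, args=(x_query,), method="Nelder-Mead")

print("query:", x_query, "counterfactual:", np.round(res.x, 3))
print("P(approved) before:", round(clf.predict_proba(x_query.reshape(1, -1))[0, 1], 3),
      "after:", round(clf.predict_proba(res.x.reshape(1, -1))[0, 1], 3))
```

A derivative-free optimizer is used here because the L1 proximity term is non-smooth; with a differentiable loss and model, plain gradient descent works as well, which is the point Sharma makes next.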

General Optimization framework

CF loss function


This is the simple framework used in DiCE to generate diverse counterfactual explanations.

  • What is the easiest way to get a CFX?

  • If the model is differentiable, as with a deep model, we can simply use gradient descent.

What do we have here?

  • We start with the mean of a Wachter-type loss term over the counterfactuals
    • this is being minimized.
  • We add a proximity term weighted by a hyperparameter \lambda_1
    • this is being minimized.
  • We add a diversity term weighted by a hyperparameter \lambda_2
    • this is being maximized.
    • it is based on the determinant of a kernel matrix K built from the pairwise distances between the candidate CFXs.
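
Putting these pieces together, the DiCE objective has roughly the following form (my reconstruction from (Mothilal, Sharma, and Tan 2020), where k counterfactuals c_1,\dots,c_k are generated jointly for a query x):

C(x) = \arg \min_{c_1,\dots,c_k} \frac{1}{k}\sum_{i=1}^{k} \text{yloss}(f(c_i),y) + \frac{\lambda_1}{k}\sum_{i=1}^{k} \text{dist}(c_i,x) - \lambda_2\, \text{dpp\_diversity}(c_1,\dots,c_k)

where \text{dpp\_diversity} is the determinant of a kernel matrix built from the pairwise distances between the candidates.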

Sharma considers this approach somewhat dated in light of more recent publications.

He references other methods.

I think, though, that he is referring to MOC, a method based on multi-objective optimization introduced in (Dandl et al. 2020), which its authors compare to DiCE (Mothilal, Sharma, and Tan 2020), Recourse (Ustun, Spangher, and Liu 2019), and Tweaking (Tolomei et al. 2017).
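
For reference, here is a minimal sketch of how the dice_ml library is typically used, based on its public documentation; the dataframe df, the fitted sklearn classifier clf, and the column names are assumed for illustration, and argument names may differ slightly between versions.

```python
import dice_ml

# `df` is a pandas DataFrame with the training data and `clf` a fitted sklearn
# classifier; both are assumed to exist. "loan_approved" is the outcome column.
d = dice_ml.Data(dataframe=df,
                 continuous_features=["income", "credit_years"],
                 outcome_name="loan_approved")
m = dice_ml.Model(model=clf, backend="sklearn")
explainer = dice_ml.Dice(d, m, method="random")

# Generate 4 diverse counterfactuals for one query, restricted to actionable features.
cf = explainer.generate_counterfactuals(
    df.drop(columns="loan_approved").iloc[[0]],  # query instance(s) as a DataFrame
    total_CFs=4,
    desired_class="opposite",
    features_to_vary=["income", "credit_years"],  # encodes user constraints
)
cf.visualize_as_dataframe(show_only_changes=True)
```

The features_to_vary argument (and the related permitted_range) is how the user-constraint and actionability desiderata from the list above are expressed in practice.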

Diverse CFX

These counterfactuals can be used to inspect the black-box model and understand what is going on inside it.


Generating debugging edge-cases

CFX as a way to generate debugging edge cases


Quantitative Evaluation for CFX


This is how the desiderata are translated into formal evaluation metrics.
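
As a rough illustration of how such metrics can be computed, here is a simplified sketch of my own; the DiCE paper's definitions differ in detail (for example, continuous distances are normalized by the median absolute deviation):

```python
import numpy as np

def cf_metrics(model, x, cfs, desired_class=1):
    """Crude validity / proximity / sparsity / diversity scores for a set of CFs.

    x   : 1-D array holding the query instance
    cfs : 2-D array with one counterfactual per row
    """
    preds = model.predict(cfs)
    validity = np.mean(preds == desired_class)        # fraction that flip the decision
    proximity = np.mean(np.abs(cfs - x).sum(axis=1))  # mean L1 distance to the query
    sparsity = np.mean((cfs != x).sum(axis=1))        # mean number of changed features
    pairwise = [np.abs(a - b).sum()                   # mean pairwise distance = diversity
                for i, a in enumerate(cfs) for b in cfs[i + 1:]]
    diversity = float(np.mean(pairwise)) if pairwise else 0.0
    return dict(validity=validity, proximity=proximity,
                sparsity=sparsity, diversity=diversity)
```

Given a query x and a set of counterfactuals produced by any of the methods above, these four numbers give a quick, if crude, way to compare them against the desiderata.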

Results comparing CF-based methods

How does DiCE compare with LIME and SHAP

Comparing DiCE with LIME and SHAP

The results section from the DiCE paper!

Practical considerations


Returning to the optimization problem


How to Generate a CF for an ML Model

Conclusion

Methods

| Data    | Name      | Citation                                          | Python    | R      |
|---------|-----------|---------------------------------------------------|-----------|--------|
| Tabular | DoWhy     | (Sharma and Kiciman 2020a), (Blöbaum et al. 2022) | pywhy     |        |
| Tabular | DiCE      | (Mothilal, Sharma, and Tan 2020)                  | github    |        |
| Tabular | MOC       | (Dandl et al. 2020)                               |           | github |
| Tabular | Recourse  | (Ustun, Spangher, and Liu 2019)                   |           |        |
| Tabular | Tweaking  | (Tolomei et al. 2017)                             |           |        |
| Text    | Checklist | (Ribeiro et al. 2020)                             | checklist |        |
| Text    | Litmus    |                                                   | litmus    |        |
| Image   | CF-CLIP   | (Yu et al. 2022)                                  | CF-CLIP   |        |

Conclusion


Resources:

Action Items:

  1. Once again I want to embed some JSON-LD data as a Knowledge Graph into this article, but I don’t yet have the tools to do it.
    1. collect the people’s info using a headless CMS like sanity or blazegraph
    2. store the data on the papers using bibtex
    3. use the YAML metadata with categories
    4. some ontology for concepts and conferences
    5. write a sequence of queries
    6. visualize and interact with the output of the queries
  2. Try out the DiCE notebook
  3. Try out the DoWhy notebook
  4. Review the papers
  5. Consider:
    1. How can we use MCMC + CFX to generate useful examples for debugging our model?

References

Blöbaum, Patrick, Peter Götz, Kailash Budhathoki, Atalanti A. Mastakouri, and Dominik Janzing. 2022. “DoWhy-GCM: An Extension of DoWhy for Causal Inference in Graphical Causal Models.” arXiv Preprint arXiv:2206.06821.
Dandl, Susanne, Christoph Molnar, Martin Binder, and Bernd Bischl. 2020. “Multi-Objective Counterfactual Explanations.” In Lecture Notes in Computer Science, 448–69. Springer International Publishing. https://doi.org/10.1007/978-3-030-58112-1_31.
Mothilal, Ramaravind K., Amit Sharma, and Chenhao Tan. 2020. “Explaining Machine Learning Classifiers Through Diverse Counterfactual Explanations.” In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM. https://doi.org/10.1145/3351095.3372850.
Pearl, Judea. 2009. “Causality,” September. https://doi.org/10.1017/cbo9780511803161.
Ribeiro, Marco Tulio, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. 2020. “Beyond Accuracy: Behavioral Testing of NLP Models with CheckList.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, edited by Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault, 4902–12. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.442.
Russell, Chris. 2019. “Efficient Search for Diverse Coherent Explanations.” CoRR abs/1901.04909. http://arxiv.org/abs/1901.04909.
Sharma, Amit, and Emre Kiciman. 2020a. “DoWhy: An End-to-End Library for Causal Inference.” arXiv Preprint arXiv:2011.04216.
———. 2020b. “DoWhy: An End-to-End Library for Causal Inference.” https://arxiv.org/abs/2011.04216.
Smuts, J. C. 1926. Holism and Evolution. Books for College Libraries. Macmillan.
Tolomei, Gabriele, Fabrizio Silvestri, Andrew Haines, and Mounia Lalmas. 2017. “Interpretable Predictions of Tree-Based Ensembles via Actionable Feature Tweaking.” In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 465–74. KDD ’17. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3097983.3098039.
Ustun, Berk, Alexander Spangher, and Yang Liu. 2019. “Actionable Recourse in Linear Classification.” Proceedings of the Conference on Fairness, Accountability, and Transparency, January. https://doi.org/10.1145/3287560.3287566.
Wachter, Sandra, Brent Mittelstadt, and Chris Russell. 2018. “Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR.” https://arxiv.org/abs/1711.00399.
Weichselbaumer, Doris. 2019. “Multiple Discrimination Against Female Immigrants Wearing Headscarves.” ILR Review 73 (3): 600–627. https://doi.org/10.1177/0019793919875707.
Yu, Yingchen, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, and Chunyan Miao. 2022. “Towards Counterfactual Image Manipulation via CLIP.” https://arxiv.org/abs/2207.02812.

Footnotes

  1. What is a sensitive feature? Typically a protected attribute such as gender, race, age, or religion that should not drive the model’s decision.↩︎

  2. Such factors can create a bias leading to discrimination.↩︎

Reuse

CC SA BY-NC-ND

Citation

BibTeX citation:
@online{bochman2023,
  author = {Bochman, Oren},
  title = {4 {Counterfactual} {Explanations} - {Explaining} and
    {Debugging}},
  date = {2023-03-23},
  url = {https://orenbochman.github.io/notes/XAI/l04/Counterfactual-Explanations.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2023. “4 Counterfactual Explanations - Explaining and Debugging.” March 23, 2023. https://orenbochman.github.io/notes/XAI/l04/Counterfactual-Explanations.html.