Explainability

The sixth and final principle of the FUTURE-AI guidelines is Explainability, which states that medical AI algorithms should provide clinicians with meaningful and actionable explanations of their predictions. Explainability offers insight into the mechanisms behind AI decision making, allowing these decisions to be clinically validated and scrutinised. While local explainability highlights the reasons behind a particular prediction of the AI model for an individual image, global explainability identifies the common characteristics that the AI model considers important for a given image analysis task. Attribution maps (or heat-maps) are commonly used visual explainability methods in medical AI, which highlight the regions of the input image that the AI model considers relevant. To assess and achieve explainability in medical AI, we recommend the following quality checks:
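
As a concrete illustration of attribution maps, the sketch below computes a simple gradient-based saliency map with PyTorch; the toy model, input size and normalisation are illustrative assumptions rather than part of the guidelines.

```python
import torch
import torch.nn as nn

model = nn.Sequential(           # placeholder classifier, not a trained medical model
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
model.eval()

image = torch.randn(1, 1, 64, 64, requires_grad=True)  # e.g. a grey-scale image patch
logits = model(image)
score = logits[0, logits.argmax()]   # logit of the predicted class
score.backward()                     # gradients of the score w.r.t. input pixels

saliency = image.grad.abs().squeeze()                   # pixel-wise importance
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
# `saliency` can now be overlaid on the input image as a heat-map for clinician review.
```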

  1. Application-specific definition: At the design phase, the need for explainability should be established first, since it is not required for all medical AI tools. Once deemed important, the candidate explainability approaches should be presented to the clinicians in an intuitive manner, to help them determine the most suitable method for the AI application in question (e.g. feature importance vs. counterfactual explanations).
  2. Explainability validation: Where applicable, AI validation studies should include the evaluation of explainability with end-users. Both qualitative and quantitative evaluations of the explainability methods should be performed to ensure that the explanations are both meaningful and useful. Criteria such as explanation goodness and explanation satisfaction should be estimated, e.g. using the System Causability Scale (a scoring sketch is shown after this list).
  3. Uncertainty awareness: Medical AI tools should output a confidence score, e.g. a number between 0 and 1, that represents the likelihood that the AI output is correct. This allows end-users to decide on the next course of action (e.g. trust the AI result, or acquire new data to reduce the uncertainty). Confidence scores can be calculated using uncertainty estimation methods, as illustrated in the second sketch after this list.
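
To make the second check concrete, the following sketch aggregates clinicians' Likert ratings of an explanation into a single satisfaction score in the spirit of the System Causability Scale; the example ratings, the 1-5 scale and the normalisation shown are assumptions for illustration only.

```python
def satisfaction_score(ratings, max_rating=5):
    """Normalise Likert ratings (e.g. ten 1-5 items) to a 0-1 satisfaction score."""
    if not ratings:
        raise ValueError("At least one rating is required.")
    return sum(ratings) / (max_rating * len(ratings))

# Example: one clinician's ratings of an attribution-map explanation (hypothetical values)
ratings = [4, 5, 3, 4, 4, 5, 3, 4, 4, 5]
print(f"Explanation satisfaction: {satisfaction_score(ratings):.2f}")  # -> 0.82
```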
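
For the third check, one commonly used uncertainty estimation technique is Monte Carlo dropout, sketched below with PyTorch; the toy classifier, dropout rate and number of stochastic forward passes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 128, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 64),
            nn.ReLU(),
            nn.Dropout(p=0.5),           # kept stochastic at inference for MC dropout
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_confidence(model: nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Return mean class probabilities and a confidence score in [0, 1]."""
    model.train()  # keep dropout layers active during inference
    with torch.no_grad():
        probs = torch.stack([
            torch.softmax(model(x), dim=-1) for _ in range(n_samples)
        ])                                # shape: (n_samples, batch, n_classes)
    mean_probs = probs.mean(dim=0)
    confidence = mean_probs.max(dim=-1).values  # probability of the predicted class
    return mean_probs, confidence

model = TinyClassifier()
x = torch.randn(1, 128)                   # placeholder for extracted image features
_, confidence = mc_dropout_confidence(model, x)
print(f"Predicted-class confidence: {confidence.item():.2f}")
```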