Universality

The Universality principle states that a medical AI tool should be generalisable outside the controlled environment in which it was developed. Specifically, the AI tool should generalise to new patients and new users (e.g. new clinicians) and, when applicable, to new clinical sites. Depending on the intended radius of application, medical AI tools should be as interoperable and as transferable as possible, so that they can benefit citizens and clinicians at scale.

To this end, the FUTURE-AI framework defines four recommendations for Universality. First, AI developers should define the requirements for universality, i.e. the radius of application of their medical AI tool (e.g. clinical centres, countries, clinical settings), and accordingly anticipate potential obstacles to universality, such as differences in clinical workflows, medical equipment or digital infrastructures (Universality 1). To enhance interoperability, development teams should favour established, community-defined standards (e.g. clinical definitions, medical ontologies, data annotations, technical standards) throughout the AI tool’s production lifetime (Universality 2). To enhance generalisability, the medical AI tool should be tested on external datasets and, when applicable, across multiple sites (Universality 3). Finally, medical AI tools should be evaluated for their local clinical validity and, if necessary, calibrated so that they perform well with the local populations and local clinical workflows (Universality 4).

The recommendations are summarised below, each with its practical steps, examples of approaches and methods, and the development stage at which it applies.
Universality 1. Define clinical settings (Stage: Design)
  Practical steps:
  • Specify intended use
  • Define target populations
  • Identify potential obstacles
  Examples of approaches and methods:
  • Primary healthcare centres vs. hospitals
  • Home care vs. clinical settings
  • Low- vs. high-resource settings
  • Single vs. multiple countries
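The design-stage requirements above could also be captured in a lightweight, machine-readable form, so that the intended radius of application and any anticipated obstacles are recorded alongside the tool. A minimal sketch, assuming a hypothetical `IntendedUseSpec` structure (all field names are illustrative, not part of any standard):

```python
from dataclasses import dataclass, field

@dataclass
class IntendedUseSpec:
    """Illustrative, machine-readable summary of an AI tool's intended
    radius of application (hypothetical structure, not a standard)."""
    clinical_task: str
    care_settings: list        # e.g. ["primary care", "hospital"]
    target_countries: list     # e.g. ["NL", "DE"]
    resource_level: str        # "low", "high", or "mixed"
    known_obstacles: list = field(default_factory=list)

    def flag_obstacle(self, description: str) -> None:
        """Record an anticipated obstacle to universality."""
        self.known_obstacles.append(description)

spec = IntendedUseSpec(
    clinical_task="lung nodule detection on chest CT",
    care_settings=["hospital"],
    target_countries=["NL", "DE"],
    resource_level="high",
)
spec.flag_obstacle("CT scanner vendors differ across target sites")
print(len(spec.known_obstacles))  # → 1
```

Keeping such a specification under version control makes it easier to check, at each later stage, whether an evaluation actually covers the declared settings.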
Universality 2. Use existing standards (Stage: Development)
  Practical steps:
  • Identify relevant standards
  • Implement standards in AI development
  • Ensure compliance with standards
  • Document standard usage
  Examples of approaches and methods:
  • Clinical definitions (e.g. SNOMED CT)
  • Data models (e.g. OMOP)
  • Interface standards (e.g. DICOM, FHIR)
  • Technical standards (e.g. IEEE, ISO)
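Harmonising site-specific labels to a community terminology is one concrete way such standards enter AI development. A minimal sketch of mapping free-text site labels to SNOMED CT concept IDs; the mapping table here is illustrative, and in practice the codes would come from a maintained SNOMED CT release rather than a hard-coded dictionary:

```python
# Illustrative mapping from local, free-text diagnosis labels to SNOMED CT
# concept IDs, applied before pooling data from multiple sites.
SITE_TO_SNOMED = {
    "pneumonia": "233604007",  # SNOMED CT concept ID (illustrative entry)
    "lung ca":   "363358000",  # SNOMED CT concept ID (illustrative entry)
}

def harmonise_label(raw_label: str) -> str:
    """Map a site-specific label to a SNOMED CT concept ID, or raise."""
    key = raw_label.strip().lower()
    if key not in SITE_TO_SNOMED:
        raise KeyError(f"No mapping for site label: {raw_label!r}")
    return SITE_TO_SNOMED[key]

print(harmonise_label("Pneumonia"))  # → 233604007
```

Unmapped labels fail loudly rather than silently, which makes gaps in standard coverage visible during development instead of at deployment.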
Universality 3. Evaluate using external data (Stage: Evaluation)
  Practical steps:
  • Collect external datasets
  • Perform technical validation
  • Assess generalisability
  • Apply transfer learning if needed
  Examples of approaches and methods:
  • Multi-centre evaluation
  • Domain adaptation techniques
  • Benchmarking datasets
  • Transfer learning methods
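The generalisability assessment can be made concrete by comparing performance on the internal test set with performance on an external dataset. A minimal, stdlib-only sketch with toy predictions standing in for a real model's outputs:

```python
# Compare internal vs. external test performance to quantify the
# generalisation gap (toy binary-classification data).

def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def generalisation_gap(internal, external):
    """Internal minus external accuracy; a large positive value suggests
    the model does not transfer well to the new site."""
    return accuracy(*internal) - accuracy(*external)

# (labels, predictions) pairs; toy values for illustration only.
internal = ([1, 0, 1, 1, 0, 1], [1, 0, 1, 1, 0, 0])  # 5/6 correct
external = ([1, 0, 1, 1, 0, 1], [1, 1, 0, 1, 0, 0])  # 3/6 correct
gap = generalisation_gap(internal, external)
print(round(gap, 3))  # → 0.333
```

In a real multi-centre evaluation the same comparison would be run per site and per subgroup, with confidence intervals, rather than on a single pooled external set.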
Universality 4. Evaluate local clinical validity (Stage: Evaluation)
  Practical steps:
  • Perform local evaluations
  • Assess fit with local workflows
  • Evaluate performance on local populations
  • Apply model recalibration if needed
  Examples of approaches and methods:
  • On-site clinical trials
  • Workflow integration assessment
  • Population-specific performance analysis
  • Model fine-tuning techniques
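One recalibration technique that fits this step is temperature scaling, which rescales a model's logits using a held-out local calibration set so that predicted probabilities better match locally observed outcomes. A minimal, stdlib-only sketch using a simple grid search; the toy logits and labels are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of binary labels under
    temperature-scaled logits."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

def fit_temperature(logits, labels):
    """Grid-search the temperature that minimises NLL on local data."""
    grid = [0.25 * k for k in range(1, 41)]  # T in (0, 10]
    return min(grid, key=lambda t: nll(logits, labels, t))

# Toy local calibration set: confidently wrong cases make the
# original (T = 1) probabilities overconfident.
logits = [4.0, 3.5, -4.2, 3.8, -3.9, -4.1]
labels = [1, 0, 0, 1, 0, 1]
T = fit_temperature(logits, labels)
print(T > 1.0)  # → True: a softening temperature is selected
```

Because temperature scaling changes only the probability calibration, not the ranking of predictions, it is a comparatively safe local adjustment; fuller fine-tuning on local data would require a fresh local validation.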