Design Phase

The design phase of the FUTURE-AI framework focuses on early-stage planning and requirement gathering. It comprises 10 key recommendations, spanning stakeholder engagement to risk management. Table 3 provides a comprehensive breakdown of these design phase recommendations, detailing specific operations and examples for each: interdisciplinary stakeholder engagement (general 1), user requirements definition (usability 1), clinical settings specification (universality 1), data variation source identification (robustness 1), bias source definition (fairness 1), explainability needs assessment (explainability 1), ethical and social issue investigation (general 6 and 7), standards adoption (universality 2), and risk management process establishment (traceability 1).

Table 3: Practical steps and examples for implementing the FUTURE-AI recommendations during the design phase

| Recommendations | Operations | Examples |
| --- | --- | --- |
| Engage interdisciplinary stakeholders (general 1) | Identify all relevant stakeholders | Patients, GPs, nurses, ethicists, data managers |
| | Provide information on the AI tool and on AI in general | Educational seminars, training materials, webinars |
| | Set up communication channels with stakeholders | Regular group meetings, one-to-one interviews, virtual platform |
| | Organise cocreation consensus meetings | One-day cocreation workshop with n=15 multidisciplinary stakeholders |
| | Use qualitative methods to gather feedback | Online surveys, focus groups, narrative interviews |
| Define intended use and user requirements (usability 1) | Define the clinical need and the AI tool’s goal | Risk prediction, disease detection, image quantification |
| | Define the AI tool’s end users | Patients, cardiologists, radiologists, nurses |
| | Define the AI model’s inputs | Symptoms, heart rate, blood pressure, ECG, image scan, genetic test |
| | Define the AI tool’s functionalities and interfaces | Data upload, AI prediction, AI explainability, uncertainty estimation |
| | Define requirements for human oversight | Visual quality control, manual corrections |
| | Adjust user requirements for all end user subgroups | According to role, age group, digital literacy level |
| Define intended clinical settings and cross setting variations (universality 1) | Define the AI tool’s healthcare setting(s) | Primary care, hospital, remote care facility, home care |
| | Define the resources needed at each setting | Personnel (experience, digital literacy), medical equipment (eg, >1.5 T MRI scanner), IT infrastructure |
| | Specify if the AI tool is intended for high end and/or low resource settings | Facilities with >1.5 T MRI scanners v low field MRI (eg, 0.5 T); high end v low cost portable ultrasound |
| | Identify all cross setting variations | Data formats, medical equipment, data protocols, IT infrastructure |
| Define sources of data heterogeneity (robustness 1) | Engage relevant stakeholders to assess data heterogeneity | Clinicians, technicians, data managers, IT managers, radiologists, device vendors |
| | Identify equipment related data variations | Differences in medical devices, manufacturers, calibrations, machine ranges (from low cost to high end) |
| | Identify protocol related data variations | Differences in image sequences, data acquisition protocols, data annotation methods, sampling rates, preprocessing standards |
| | Identify operator related data variations | Differences in experience and proficiency, operator fatigue, subjective judgment, technique variability |
| | Identify sources of artefacts and noise | Image noise, motion artefacts, signal dropout, sensor malfunction (see sketch 1 below) |
| | Identify context specific data variations | Lower quality data acquisition in emergency units or during high patient volume times |
| Define any potential sources of bias (fairness 1) | Engage relevant stakeholders to define the sources of bias | Patients, clinicians, epidemiologists, ethicists, social carers |
| | Define standard attributes that might affect the AI tool’s fairness | Sex, age, socioeconomic status (see sketch 2 below) |
| | Identify application specific sources of bias beyond standard attributes | Skin colour for skin cancer detection, breast density for breast cancer detection |
| | Identify all possible human biases | Data labelling, data curation |
| Define the need and requirements for explainability with end users (explainability 1) | Engage end users to define explainability requirements | Clinicians, technicians, patients |
| | Specify if explainability is necessary | Not necessary for the AI enabled image segmentation component; critical for AI enabled diagnosis |
| | Specify the objectives of AI explainability (if it is needed) | Understanding the AI model, aiding diagnostic reasoning, justifying treatment recommendations |
| | Define suitable explainability approaches | Visual explanations, feature importance, counterfactuals (see sketch 3 below) |
| | Adjust the design of the AI explanations for all end user subgroups | Heatmaps for clinicians, feature importance for patients |
| Investigate ethical issues (general 6) | Consult ethicists on ethical considerations | Ethicists specialised in medical AI and/or in the application domain (eg, paediatrics) |
| | Assess if the AI tool’s design is aligned with relevant ethical values | Right to autonomy, information, consent, confidentiality, equity |
| | Identify application specific ethical issues | Ethical risks for a paediatric AI tool (eg, emotional impact on children) |
| | Comply with local ethical AI frameworks | AI ethical guidelines from Europe, United Kingdom, United States, Canada, China, India, Japan, Australia, etc |
| Investigate social and environmental issues (general 7) | Investigate the AI tool’s social and environmental impact | Workforce displacement, worsened working conditions and relations, deskilling, dehumanisation of care, reduced health literacy, increased carbon footprint, negative public perception |
| | Define mitigations to improve the AI tool’s social and environmental impact | Interfaces for physician-patient communication, workforce training, educational programmes, energy efficient computing practices, public engagement initiatives |
| | Optimise algorithms for energy efficiency | Energy efficient algorithms that minimise computational demands, eg, through model pruning, quantisation, and edge computing (see sketch 4 below) |
| | Promote responsible data usage | Collect and process only the necessary amount of data; implement federated learning techniques to minimise data transfers (see sketch 5 below) |
| | Monitor and report the environmental impact of the AI tool | Regular monitoring and reporting of energy usage, carbon emissions, and waste generation (see sketch 6 below) |
| Use community defined standards (universality 2) | Use a standard definition for the clinical task | Definition of heart failure by the American College of Cardiology |
| | Use a standard method for data labelling | BI-RADS for breast imaging |
| | Use a standard ontology for the AI inputs | DICOM for imaging data, SNOMED for clinical data |
| | Adopt technical standards | IEEE 2801-2022 for medical AI software |
| | Use standard evaluation criteria | See Maier-Hein et al[21] for medical imaging applications and for fairness evaluation |
| Implement a risk management process (traceability 1) | Identify all possible clinical, technical, ethical, and societal risks | Bias against under-represented subgroups, limited generalisability to low resource facilities, data drift, lack of acceptance by end users, sensitivity to noisy inputs |
| | Identify all possible operational risks | Misuse of the AI tool (owing to insufficient training or not following the instructions), application of the AI tool outside the target population (eg, individuals with implants), use of the tool by users other than the target end users (eg, technician instead of physician), hardware failure, incorrect data annotations, adversarial attacks |
| | Assess the likelihood of each risk | Very likely, likely, possible, rare |
| | Assess the consequences of each risk | Patient harm, discrimination, lack of transparency, loss of autonomy, patient reidentification |
| | Prioritise all the risks depending on their likelihood and consequences | Risk of bias (if no personal attributes are included in the model) v risk of patient reidentification (if personal attributes are collected) (see sketch 7 below) |
| | Define mitigation measures to be applied during AI development | Data enhancement, data augmentation, bias correction techniques, domain adaptation, transfer learning, continuous learning |
| | Define mitigation measures to be applied after deployment | Warnings to the users, system shutdown, reprocessing of the input data, acquisition of new input data, use of an alternative procedure or human judgment only |
| | Set up a mechanism to monitor and manage risks over time | Periodic risk assessment every six months (see sketch 8 below) |
| | Create a comprehensive risk management file | Including all risks, their likelihood and consequences, risk mitigation measures, and the risk monitoring strategy |
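
The code sketches below illustrate, in simplified Python, how a few of the operations in table 3 might be approached. They are sketches under stated assumptions rather than reference implementations; all models, datasets, variable names, and thresholds in them are hypothetical.

Sketch 1 (robustness 1). One way to act on identified equipment related variations and sources of noise is to re-evaluate a candidate model on synthetically perturbed inputs. Here additive noise and a global intensity shift stand in for scanner differences, and `model.predict` is an assumed interface returning hard labels.

```python
# Probe a model's sensitivity to simulated acquisition variations.
# The perturbation magnitudes and `model.predict` interface are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def perturb(images, noise_sd, intensity_shift):
    """Simulate acquisition differences: additive noise plus a global intensity shift."""
    return images + rng.normal(0.0, noise_sd, images.shape) + intensity_shift

def robustness_report(model, images, labels, noise_levels=(0.0, 0.05, 0.1, 0.2)):
    """Re-evaluate the model under increasing simulated acquisition noise."""
    for sd in noise_levels:
        preds = model.predict(perturb(images, noise_sd=sd, intensity_shift=0.0))
        accuracy = float(np.mean(preds == labels))
        print(f"noise sd={sd:.2f}  accuracy={accuracy:.3f}")
```

A sharp accuracy drop at realistic noise levels would be recorded as a robustness risk and fed into the risk management process (traceability 1).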
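
Sketch 2 (fairness 1). Once standard attributes such as sex, age, or socioeconomic status are defined, a basic audit compares a performance metric across subgroups. The toy data and the choice of sensitivity as the metric are hypothetical.

```python
# Compare sensitivity (true positive rate) across subgroups of a protected attribute.
import numpy as np

def subgroup_sensitivity(y_true, y_pred, groups):
    """Per-subgroup sensitivity for a binary classifier."""
    report = {}
    for g in np.unique(groups):
        positives = (groups == g) & (y_true == 1)
        report[str(g)] = float(np.mean(y_pred[positives] == 1)) if positives.any() else float("nan")
    return report

y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0])   # hypothetical reference labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 0, 0])   # hypothetical model outputs
sex    = np.array(["f", "f", "f", "m", "m", "m", "f", "m"])

per_group = subgroup_sensitivity(y_true, y_pred, sex)
print(per_group)                               # {'f': 0.33..., 'm': 1.0}
gap = max(per_group.values()) - min(per_group.values())
print(f"sensitivity gap: {gap:.2f}")           # flag for review above a preset threshold
```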
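
Sketch 3 (explainability 1). Feature importance, one of the explainability approaches listed, can be estimated generically by permutation: shuffle one input feature at a time and measure how much the model's accuracy drops. The `predict` callable is assumed to return hard labels.

```python
# Permutation feature importance: accuracy drop when a feature is shuffled.
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-outcome link
            drops.append(baseline - np.mean(predict(Xp) == y))
        importances[j] = np.mean(drops)            # larger drop => more influential feature
    return importances
```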
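
Sketch 4 (general 7). Post-training dynamic quantisation is one of the energy saving techniques named alongside pruning and edge computing. This PyTorch sketch converts the linear layers of a toy network to int8; the network itself is a placeholder.

```python
# Dynamic quantisation of a toy network's Linear layers to int8 (PyTorch).
import torch
import torch.nn as nn

model = nn.Sequential(           # hypothetical risk prediction network
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),
)
model.eval()

# Replace Linear layers with dynamically quantised int8 equivalents.
quantised = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 32)
print(quantised(x))              # smaller weights and cheaper inference than float32
```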
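
Sketch 5 (general 7). Federated learning, suggested for minimising data transfers, reduces to a simple loop: each site takes a local training step on its own data, and only the model weights travel to a central server for averaging. The least squares model and the three synthetic "sites" are hypothetical.

```python
# Minimal federated averaging: weights move, patient data stays on site.
import numpy as np

def local_step(weights, X, y, lr=0.1):
    """One gradient descent step of least squares regression on a site's private data."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

rng = np.random.default_rng(0)
sites = [(rng.normal(size=(50, 4)), rng.normal(size=50)) for _ in range(3)]
weights = np.zeros(4)

for _ in range(20):   # each round: sites train locally, the server averages weights
    local = [local_step(weights, X, y) for X, y in sites]
    weights = np.mean(local, axis=0)
print(weights)
```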
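
Sketch 6 (general 7). Environmental reporting can start from a back-of-envelope estimate of energy use and emissions. The power draw, runtime, and grid carbon intensity below are hypothetical values that would be measured or looked up in practice; open source tools such as CodeCarbon can automate this accounting.

```python
# Rough energy and carbon estimate for a training run (all inputs hypothetical).
gpu_power_kw = 0.3           # average draw of one GPU, in kilowatts
hours = 48.0                 # wall clock training time
grid_kgco2_per_kwh = 0.4     # local grid carbon intensity

energy_kwh = gpu_power_kw * hours
emissions_kg = energy_kwh * grid_kgco2_per_kwh
print(f"energy: {energy_kwh:.1f} kWh, emissions: {emissions_kg:.1f} kg CO2e")
```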
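
Sketch 7 (traceability 1). Prioritising risks by likelihood and consequence maps onto a classic scored risk matrix. The ordinal scales and example entries are hypothetical; in practice they would come from the identification and assessment steps above and be recorded in the risk management file.

```python
# Rank risks by likelihood x consequence, highest priority first.
LIKELIHOOD  = {"rare": 1, "possible": 2, "likely": 3, "very likely": 4}
CONSEQUENCE = {"minor": 1, "moderate": 2, "major": 3, "severe": 4}

risks = [   # hypothetical entries from the risk identification steps
    ("bias against under-represented subgroups", "likely", "major"),
    ("data drift after deployment", "very likely", "moderate"),
    ("patient reidentification", "rare", "severe"),
]

scored = sorted(
    ((name, LIKELIHOOD[l] * CONSEQUENCE[c]) for name, l, c in risks),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in scored:
    print(f"{score:2d}  {name}")
```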
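
Sketch 8 (traceability 1). Monitoring risks over time includes watching for data drift between development and deployment. The population stability index (PSI) is one simple drift statistic; here it compares the distribution of a single input (patient age) between a development cohort and recent clinical data. The 0.2 alert threshold is a common rule of thumb, not a FUTURE-AI requirement.

```python
# Population stability index (PSI) as a simple periodic data drift check.
import numpy as np

def psi(expected, observed, bins=10):
    """PSI between a reference sample and a monitoring sample of one variable."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    e, _ = np.histogram(expected, bins=edges)
    o, _ = np.histogram(observed, bins=edges)
    e = np.clip(e / e.sum(), 1e-6, None)   # avoid log(0) on empty bins
    o = np.clip(o / o.sum(), 1e-6, None)
    return float(np.sum((o - e) * np.log(o / e)))

rng = np.random.default_rng(0)
dev_ages  = rng.normal(60, 10, 5000)             # hypothetical development cohort
live_ages = rng.normal(65, 12, 1000)             # hypothetical recent clinical data
print(f"PSI = {psi(dev_ages, live_ages):.3f}")   # >0.2 would trigger a review
```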