Variability between human experts and artificial intelligence in identification of anatomical structures by ultrasound in regional anaesthesia: a framework for evaluation of assistive artificial intelligence
Bowness JS., Morse R., Lewis O., Lloyd J., Burckett-St Laurent D., Bellew B., Macfarlane AJR., Pawa A., Taylor A., Noble JA., Higham H.
Background: ScanNav™ Anatomy Peripheral Nerve Block (ScanNav™) is an artificial intelligence (AI)-based device that produces a colour overlay on real-time B-mode ultrasound to highlight key anatomical structures for regional anaesthesia. This study compares the consistency of identification of sono-anatomical structures between expert ultrasonographers and ScanNav™. Methods: Nineteen experts in ultrasound-guided regional anaesthesia (UGRA) annotated 100 structures in 30 ultrasound videos across six anatomical regions. These annotations were compared with each other to produce a quantitative assessment of the level of agreement amongst human experts. The AI colour overlay was then compared with all expert annotations. Differences in human–human and human–AI agreement are presented for each structure class (artery, muscle, nerve, fascia/serosal plane) and structure. Clinical context is provided through subjective assessment data from UGRA experts. Results: For human–human and human–AI comparisons respectively, agreement was highest for arteries (mean Dice score 0.88/0.86), then muscles (0.80/0.77), and lowest for nerves (0.48/0.41). Consistency varied widely between individual structures in both human–human and human–AI comparisons: it was highest for the sartorius muscle (0.91/0.92) and lowest for the radial nerve (0.21/0.27). Conclusions: Human experts and the AI system both showed the same pattern of agreement in sono-anatomical structure identification. The clinical significance of the differences presented must be explored; however, the perception that human expert opinion is uniform must be challenged. Elements of this assessment framework could be used for other devices to allow consistent evaluations that inform clinical training and practice. Anaesthetists should be actively engaged in the development and adoption of new AI technology.
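The abstract reports agreement as mean Dice scores but does not describe the computation. As a minimal sketch, assuming each annotation is a binary mask over the same frame, that human–human agreement is averaged over all pairs of experts, and that human–AI agreement is averaged over each expert compared with the AI overlay (assumptions for illustration, not details confirmed by the abstract), the two quantities could be computed along these lines. The function names and toy masks are hypothetical.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same pixels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


def mean_pairwise_dice(masks):
    """Mean Dice over every unordered pair of masks (human-human agreement)."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        raise ValueError("need at least two masks")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


def mean_dice_against(reference, masks):
    """Mean Dice of one reference mask against each mask (human-AI agreement)."""
    if not masks:
        raise ValueError("need at least one mask")
    return sum(dice(reference, m) for m in masks) / len(masks)


# Toy data (hypothetical): three expert masks and one AI mask over four pixels.
expert_masks = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
ai_mask = [1, 1, 1, 1]

human_human = mean_pairwise_dice(expert_masks)
human_ai = mean_dice_against(ai_mask, expert_masks)
print(f"human-human: {human_human:.2f}, human-AI: {human_ai:.2f}")
```

A Dice score of 1 means two masks coincide exactly and 0 means they do not overlap at all, so small structures such as nerves are penalised heavily for even slight boundary disagreement.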