A new study describes a three-stage machine-learning approach to automated surgical skill assessment in laparoscopic cholecystectomy videos.
Currently, a surgeon’s skill is assessed manually by experts, either directly during the operation or from video footage. This procedure is time-consuming, not always reproducible and potentially subjective, which has motivated attempts to automate it.
A group of researchers has applied machine learning (ML) algorithms to automate surgical skill assessment in laparoscopic cholecystectomy videos. They developed a method comprising three stages:
- Training of a Convolutional Neural Network (CNN) to recognise instruments in video frames
- Tracking these instruments over time and calculating relevant motion metrics
- Training a linear regression model on those metrics to assess surgical skill.
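The three stages above can be wired together in a minimal sketch. All function names are illustrative, the detector is a stand-in for the CNN, and the "frames" are already dictionaries of detections for simplicity:

```python
def detect_instruments(frame):
    """Stage 1 stand-in: return detected instrument positions for a frame.
    A real system would run a CNN detector here."""
    return frame

def track_clipper(frames):
    """Stage 2: collect per-frame clipper positions into a trajectory."""
    return [det["clipper"] for det in map(detect_instruments, frames)
            if "clipper" in det]

def assess_skill(trajectory, weight, bias):
    """Stage 3: a linear model over a single motion metric
    (here, trajectory length) as a toy skill score."""
    return weight * len(trajectory) + bias

# Two frames contain the clipper; one does not.
frames = [{"clipper": (0, 0)}, {"clipper": (1, 0)}, {}]
trajectory = track_clipper(frames)
score = assess_skill(trajectory, weight=-0.1, bias=5.0)
```

In a real pipeline the regression would be trained on many metrics at once; this sketch only shows how the stages pass data to each other.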
For the study, a sample of 242 laparoscopic cholecystectomy videos was selected, from which 949 clips of the clip application gesture (the end of the hepatocystic dissection phase) were extracted. A panel of surgeons rated the recordings.
In stage 1, the model showed 78% average precision (AP) and 82% average recall (AR) for the clipper on the test set. The authors note that two types of clippers were used in the study, which might have negatively affected the AP. The model performed well overall in difficult cases (e.g. poor lighting), but more substantial visual challenges, such as heavy occlusion, could undermine its performance.
In stage 2, the motion metrics were compared against the expert skill ratings. A shorter clip application phase, a smaller radius of clipper locations and a more constant movement direction all indicated faster, more succinct and smoother performance, i.e. higher surgical skill.
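Metrics of this kind can be computed from a tracked trajectory. The sketch below uses illustrative definitions (phase duration as frame count, radius as maximum distance from the centroid, direction constancy as mean cosine similarity of successive movement vectors), not the study's exact formulas:

```python
import math

def motion_metrics(track):
    """Compute simple motion metrics from a list of (x, y) clipper
    positions sampled at a fixed frame rate."""
    # Phase duration proxy: number of frames in the clip application phase.
    duration = len(track)

    # Radius of clipper locations: max distance from the trajectory centroid.
    cx = sum(x for x, _ in track) / len(track)
    cy = sum(y for _, y in track) / len(track)
    radius = max(math.hypot(x - cx, y - cy) for x, y in track)

    # Direction constancy: mean cosine similarity between successive
    # movement vectors (1.0 = perfectly straight, smooth motion).
    cosines = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(track, track[1:], track[2:]):
        v1 = (x1 - x0, y1 - y0)
        v2 = (x2 - x1, y2 - y1)
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 and n2:
            cosines.append((v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2))
    constancy = sum(cosines) / len(cosines) if cosines else 1.0

    return {"duration": duration, "radius": radius, "constancy": constancy}
```

A short, tight, straight trajectory then scores low on duration and radius and high on constancy, matching the pattern the study associates with higher skill.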
In stage 3, the ML model demonstrated 87 ± 0.2% accuracy in distinguishing good from poor surgical skill and 70% accuracy in predicting the exact skill level. Several factors limited the model’s performance:
- Low-skill examples were underrepresented in the sample, so the model predicted low skill less reliably than high skill.
- The model relied on instrument movements alone, e.g. identifying clip dropping as poor skill regardless of how the instrument had been handled beforehand.
- Camera movement and zoom altered the apparent scale of instrument movement (the same motion can look smaller or larger depending on zoom), leading to different interpretations of surgical skill.
In addition, the authors note that instrument handling is only one of the factors underpinning surgical skill; others, such as tissue handling, were outside the study’s scope. They conclude that future research should build larger training databases and refine the algorithms before the technology can be used in clinical practice.
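To illustrate stage 3's idea of regressing skill on motion metrics, here is a toy one-variable ordinary least squares fit. The data are invented solely for illustration; the study's actual model uses multiple metrics:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for a single predictor: y ≈ a*x + b.
    A toy stand-in for a regression on one motion metric."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

# Hypothetical data: shorter clip application phases (seconds)
# pair with higher expert ratings.
durations = [30, 45, 60, 90, 120]
ratings = [5.0, 4.5, 4.0, 3.0, 2.0]
a, b = fit_linear(durations, ratings)

def predict(duration):
    """Predicted skill rating for a given phase duration."""
    return a * duration + b
```

The fitted slope is negative, encoding the stage 2 finding that longer clip application phases indicate lower skill.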
Image credit: silkfactory