Artificial intelligence for colonoscopy: the new Silk Road
Alessandro Repici et al.
It all began with a few hybrid information technology–medical studies claiming that a new type of software could recognize endoscopic lesions on offline images [1]. The use of the term “artificial intelligence” (AI) underlined how the learning process was driven by automatic extraction of main features from annotated images rather than by the application of human judgment [2].
“It could be argued that most of the increase in ADR by AI was driven by additional detection of adenomas < 5 mm in size rather than larger and potentially more advanced lesions. However, this is unfair because most of the ADR is represented by diminutive lesions, thus any ADR increase will necessarily be driven by an increased detection of such lesions.”
The initial excitement was mirrored by a contrary skepticism. How often had similar claims for a variety of devices or techniques ended in disappointment? We had sadly become familiar with part of the Gartner Hype Cycle for emerging technologies: an initial boost of good results ultimately reversed by negative studies. The limitations of AI certainly could not be ignored. First, the AI software was validated only against human ground truth as a reference standard in artificial studies, indicating that in the best scenario AI was equivalent to experts but not superior. Second, most of these AI systems were validated offline, generating uncertainty about their feasibility in a real-time setting [3]. Third, the first prototypes for real-time application required two parallel monitors, one for the endoscope and one for the counterpart AI images, because of an unavoidable delay as the signal transited through the AI system [4] [5]. Fourth, there was the fear of a pointless waste of time, as some systems presented extremely high rates of false-positive results, in up to 20 % of the entire videos. Fifth, there was extreme heterogeneity among the AI systems regarding training dataset, validation procedures, and the architecture of the AI algorithm. Finally, there was the new AI jargon, with several terms, such as “training,” “cross-validation,” and “testing,” being used in an information technology sense that was unfamiliar to both academic and community endoscopists [6].
Such skepticism, however, could not countervail our desperate need for AI in screening colonoscopy. Despite the undeniable increases in adenoma detection rates (ADRs) due to quality assurance, we still miss one in every four neoplastic lesions [7] [8]. This may be related to distraction or fatigue, especially after several hours of activity, as well as to suboptimal competence in recognition. The latter is likely to be the case regarding flat advanced lesions, such as nongranular lateral spreading tumors (NG-LSTs). In turn, missed lesions have been estimated to be the predominant factor contributing to post-colonoscopy colorectal cancer rates, with an incidence as high as 1 % over 10 years [7] [9]. In addition, an embarrassing variability in ADRs across any series of endoscopists raises questions about the status of screening colonoscopy as a clinical standard. Of note, in relative terms, low detectors can miss up to 75 % of neoplastic lesions as compared with high detectors [7]. Unsurprisingly, similarly high values of missed diagnosis have been reported for Barrett-related dysplasia or early gastric cancer, indicating a general reluctance of some endoscopists to recognize their own underperformance [10] [11].
Dum Romae consulitur, Saguntum expugnatur! [Whilst they deliberated in Rome, Saguntum was captured!] While the pros and cons of AI were being compared, science was suddenly overtaken by technology. All the major players in the field of endoscopy upgraded their AI systems with graphic interfaces able to display colorectal lesions in a real-time setting, superimposing a clearly visible mark on any suspected lesions with high degrees of accuracy. These systems were released immediately in the European market, as regulatory clearance was primarily based on artificial studies. While there had not been enough time to reply to the initial question, “Should we implement AI?”, the next question suddenly became: “What are the effects of AI on screening colonoscopy?”