Info Pulse Now

Artificial intelligence-assisted endoscopic diagnosis system for diagnosing Helicobacter pylori infection: a multicenter study - BMC Medicine

By Chen

In this study, we present an innovative approach for developing HOPE AI (simplified from Helicobacter pylori identification AI), a visualized, interpretable, and stable AI system designed for H. pylori infection diagnosis. The system is based on a multi-instance learning framework (MIL) and incorporates transformer and long short-term memory (LSTM) architectures. We evaluated HOPE AI's diagnostic efficacy through comprehensive internal and external validation using a substantial multicenter endoscopic dataset collected from seven medical institutions. This represents the first large-scale external validation study of its kind, demonstrating the model's generalizability and establishing a strong foundation for clinical implementation.

This multi-institutional diagnostic investigation was conducted across seven Chinese medical centers in three methodological phases: (phase 1) development and internal validation of the HOPE AI algorithm using clinical endoscopic imagery from the First Affiliated Hospital of Zhejiang Chinese Medical University, Hubin Campus (FAHZCMU, H); (phase 2) external temporal validation using prospectively acquired endoscopic images and video sequences from FAHZCMU, H; and (phase 3) external geographic validation through prospective data collection from multiple institutions, including FAHZCMU Qiantang Campus and six additional regional hospitals (the First Affiliated Hospital of Wenzhou Medical University, the First Hospital of Jiaxing, Wenzhou Central Hospital, the First Affiliated Hospital of Ningbo University, Huzhou Central Hospital, and Yuyao People's Hospital of Zhejiang). The research protocol adhered to the Declaration of Helsinki and received institutional review board approval from FAHZCMU (No. 2024-KLS-489-01), with the agreement of the other centers. The study was registered in the Chinese Clinical Trial Registry (ChiCTR 2400091317, 2400091720).

In this multicenter ambispective cohort study, the retrospective derivation cohort (phase 1) encompassed 1271 patients during the period from March 1 to June 30, 2024. Subjects were stratified through randomization (7:3 ratio) into training and internal validation datasets for HOPE AI development. The external validation cohort (phases 2 and 3) prospectively recruited 4936 patients across seven medical centers between October 25 and November 30, 2024 (Fig. 1) [14].
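The 7:3 randomization of the derivation cohort can be sketched as a seeded shuffle of patient identifiers. This is a minimal illustration, not the study's actual procedure: the identifier format, the seed, and the rounding rule for the cut point are all assumptions, and the source does not report the resulting group sizes.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=42):
    """Randomly split patient IDs into training and internal-validation sets.

    Splitting by patient (not by image) keeps all images of one case
    in the same subset. The seed is a hypothetical choice.
    """
    ids = sorted(patient_ids)            # fixed starting order for reproducibility
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers for the 1271 patients of the derivation cohort.
patient_ids = [f"P{i:04d}" for i in range(1, 1272)]
train_ids, val_ids = split_patients(patient_ids)
print(len(train_ids), len(val_ids))
```

Splitting at the patient level matters here because the model is trained on case-level labels; assigning images from one patient to both subsets would leak information into the internal validation set.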

Study eligibility criteria were defined as follows: adult patients (≥ 18 years), regardless of gender, completion of standardized gastroscopic examination (minimum 8 gastric images, including cardia, upper body, middle body, lesser curvature, angle, lower body, antrum, and pylorus [15]), and confirmed H. pylori status through histopathological assessment (negative, encompassing both absence of prior H. pylori infection and successful previous eradication therapy; positive, indicating current active H. pylori infection, regardless of prior anti-H. pylori history). For video analyses, an additional criterion mandated complete gastroscopic examination documentation. Exclusion criteria comprised presence of retained gastric contents, neoplastic lesions, luminal obstruction, and documented history of esophagogastric malignancy or gastric surgical intervention.

In our comprehensive dataset compilation, we incorporated all endoscopic images acquired during routine clinical procedures, ensuring an unbiased representation of real clinical scenarios. The dataset incorporates diverse imaging modalities, including conventional white-light endoscopy (WLE) and narrow-band imaging (NBI), encompassing varying image qualities such as focused, unfocused, overexposed, motion-blurred, and hemorrhagic specimens. The dataset was dichotomously classified based on Helicobacter pylori infection status: non-infected (label 0) and infected (label 1). We preprocessed the images by excluding the metadata sidebar, preserving only the diagnostically relevant endoscopic field. The endoscopic images, originally ranging from 400 × 480 to 1920 × 1080 pixels, were standardized to 352 × 352 pixels to facilitate deep learning network compatibility.
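The two preprocessing steps described above (removing the metadata sidebar, then standardizing to 352 × 352 pixels) can be illustrated with a dependency-free sketch. The crop box, the synthetic frame, and the use of nearest-neighbour resampling are assumptions made for illustration; the source does not state where the sidebar sits, which interpolation was used, or whether the aspect ratio was preserved. In practice an image library would do the resampling.

```python
def crop(image, left, top, right, bottom):
    """Keep rows top..bottom-1 and columns left..right-1 of a row-major image."""
    return [row[left:right] for row in image[top:bottom]]

def resize_nearest(image, out_w=352, out_h=352):
    """Nearest-neighbour resize of a row-major image (a list of pixel rows)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

def preprocess(image, field_box):
    """Crop to the endoscopic field, then standardize to 352 x 352."""
    return resize_nearest(crop(image, *field_box))

# Synthetic 640 x 480 frame: a hypothetical 160-pixel sidebar (value 0)
# on the left, and the endoscopic field (value 255) on the right.
SIDEBAR_W, FRAME_W, FRAME_H = 160, 640, 480
frame = [[0] * SIDEBAR_W + [255] * (FRAME_W - SIDEBAR_W) for _ in range(FRAME_H)]

field_box = (SIDEBAR_W, 0, FRAME_W, FRAME_H)   # (left, top, right, bottom), assumed
standardized = preprocess(frame, field_box)
print(len(standardized), len(standardized[0]))
```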

In this study, we introduce a novel MIL architecture integrating vision transformer (ViT) [16, 17] and LSTM networks to analyze all images of a case jointly for H. pylori infection detection. The MIL framework, trained exclusively on patient-level annotations, autonomously identifies intrinsic image correlations and patterns. This methodology isolates sentinel images for case classification and remains robust to incomplete or noisy datasets. The inference pipeline generates patient-level infection status predictions and identifies infection-associated high-risk images from extensive case repositories, enhancing clinical decision-making precision. The architecture comprises three main components: a transformer-based feature extraction module that computes H. pylori risk scores for individual images, a selection mechanism that identifies the top-k high-risk images based on MIL principles, and an LSTM network that synthesizes features from these top-k images to generate a case-level infection probability (Fig. 2). All quantitative evaluations in the analysis relied on case-level infection probabilities as the primary metric, and we also analyzed the contribution of image-level predictions to clinical diagnosis. Detailed methodological protocols and implementation details are summarized in Additional file 1: Supplementary Materials [10, 16–27].
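The top-k selection step at the center of this pipeline can be sketched in a few lines. This is an illustration under stated assumptions, not HOPE AI's implementation: the per-image risk scores are made-up numbers standing in for the transformer's output, k = 3 is an arbitrary choice, and the mean of the selected scores is a deliberately simple stand-in for the LSTM that the paper uses to aggregate top-k image features.

```python
def select_top_k(risk_scores, k):
    """Return the indices of the k highest-risk images, highest first."""
    order = sorted(range(len(risk_scores)),
                   key=lambda i: risk_scores[i], reverse=True)
    return order[:k]

def case_probability(risk_scores, k=3):
    """Aggregate a case from its top-k images.

    Stand-in aggregator: the mean of the top-k risk scores. The paper
    instead feeds the top-k image features to an LSTM.
    """
    if not risk_scores:
        raise ValueError("a case needs at least one image")
    top = select_top_k(risk_scores, k)
    return sum(risk_scores[i] for i in top) / len(top), top

# Hypothetical image-level risk scores for one patient (8 gastric images).
scores = [0.12, 0.91, 0.33, 0.78, 0.05, 0.64, 0.21, 0.47]
probability, sentinel_images = case_probability(scores, k=3)
print(sentinel_images, round(probability, 3))
```

Because only the case label supervises training, the selected indices double as the "sentinel images" that can be shown to the endoscopist alongside the case-level prediction.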

We initially evaluated HOPE AI's performance in detecting H. pylori infection through an internal validation cohort, followed by a prospective external temporal validation dataset comprising endoscopic imagery and video sequences from FAHZCMU, H. Subsequently, we conducted a comprehensive assessment of HOPE AI's reliability using geographically diverse external validation datasets from seven institutions, each characterized by substantial endoscopic examination volumes and highly credentialed endoscopists.

To further validate and extend the applicability of our AI model, we selected endoscopic video sequences from the prospective external temporal validation cohort, because video is more objective and mitigates the selection bias inherent in physician-dependent image acquisition. Three junior endoscopists and three senior endoscopists, blinded to patient demographics and histopathological outcomes, independently analyzed identical test videos, and their assessments were compared against HOPE AI's diagnostic output. The junior endoscopists had less than 2 years of endoscopic experience; the senior endoscopists had over 10 years of experience in endoscopic procedures and had conducted more than 5000 examinations. To maintain objectivity, these evaluators were excluded from the image selection and annotation processes, and all visual data were mixed and de-identified prior to their assessment.

The diagnostic performance metrics of HOPE AI in detecting H. pylori infection were comprehensively assessed through statistical analysis of accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), with 95% confidence intervals computed via the Clopper-Pearson methodology. Statistical comparisons of diagnostic parameters (sensitivity, specificity, and accuracy) were performed via permutation testing, generating two-sided P values based on 10,000 permutations. The discriminative capability of the deep learning algorithm was evaluated through receiver operating characteristic (ROC) curve analysis, wherein sensitivity was plotted against 1-specificity across varying probability thresholds. The AUC metric served as a quantitative measure of diagnostic efficacy, with higher values indicating superior performance. All analyses employed two-tailed statistical tests with α = 0.05, implemented using Python 3.9.0.
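Two of the procedures named above can be sketched with the standard library alone: a Clopper-Pearson (exact binomial) interval for a proportion such as sensitivity, and a two-sided permutation test with 10,000 permutations. Several details are assumptions: the interval is found by bisection on the binomial tail rather than through a statistics package, the permutation test is a paired version that randomly swaps the two raters' results within each case (the source does not say whether the test was paired), the +1 correction in the P value is a common convention, and all counts and outcomes are toy data.

```python
import math
import random

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)

def _solve_increasing(f, target, iterations=50):
    """Bisection on (0, 1) for f(p) = target, where f increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes out of n."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2)
    return lower, upper

def paired_permutation_test(correct_a, correct_b, n_perm=10_000, seed=0):
    """Two-sided P value for a difference in accuracy on the same cases.

    Each input holds 1 (correct) or 0 (incorrect) per case. Under the null
    hypothesis the two raters are exchangeable within a case, so the sign
    of each per-case difference is flipped at random.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(correct_a, correct_b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy example: 45 of 50 infected cases detected (sensitivity 0.90).
low, high = clopper_pearson(45, 50)
print(f"sensitivity 0.90, 95% CI {low:.3f} to {high:.3f}")

# Toy per-case correctness for the AI and one reader on the same 40 videos.
ai_correct = [1] * 36 + [0] * 4
reader_correct = [1] * 28 + [0] * 12
p_value = paired_permutation_test(ai_correct, reader_correct)
print(f"two-sided P = {p_value:.4f}")
```

The bisection avoids a dependency on a statistics package; the same bounds are conventionally obtained from quantiles of the beta distribution.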
