Development and validation of a novel diagnostic model for initially clinical diagnosed gastrointestinal stromal tumors using an extreme gradient-boosting machine
BMC Gastroenterology volume 21, Article number: 481 (2021)
Gastrointestinal stromal tumor (GIST) is the most common gastrointestinal soft tissue tumor. Clinical diagnosis mainly relies on enhanced CT, endoscopy and endoscopic ultrasound (EUS), but the misdiagnosis rate is still high without fine needle aspiration biopsy. We aim to develop a novel diagnostic model by analyzing the preoperative data of the patients.
We used the data of patients who were initially diagnosed with gastric GIST and underwent partial gastrectomy. The patients were randomly divided into a training dataset and a test dataset at a ratio of 3 to 1. After pre-experimental screening, max_depth = 2, eta = 0.1, gamma = 0.5, and nrounds = 200 were chosen as the best parameters, and with these we developed the initial extreme gradient-boosting (XGBoost) model. Based on the importance of the features in the initial model, we improved the model by excluding the hematological features. In this way we obtained the final XGBoost model, which was validated using the test dataset.
In the initial XGBoost model, we found that the hematological indicators (including inflammatory and nutritional indicators) examined before surgery had little effect on the outcome, so we excluded them. Similarly, we screened the features from enhanced CT and EUS, and finally determined the 6 most important predictors for GIST diagnosis: the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a liquid area and of a calcific area inside the tumor on EUS. Round or round-like tumors with a CT value of around 30 (25–37) and delayed enhancement, as well as a liquid but not a calcific area inside the tumor, best indicate the diagnosis of GIST.
We developed a model to further differentiate GIST from other tumors in patients with an initial clinical diagnosis of gastric GIST by analyzing the results of clinical examinations that most patients will have completed before surgical resection.
Gastrointestinal stromal tumors (GISTs) are the most common gastrointestinal soft tissue tumors, accounting for about 20% of all soft tissue tumors. GISTs mainly occur in the stomach, followed by the small intestine and colon, and rarely occur in tissues outside the gastrointestinal tract (GI-tract). The clinical diagnosis of GISTs relies on enhanced computed tomography (CT), endoscopy and endoscopic ultrasound (EUS), but relying on these tests alone, or even on their combination, carries a high misdiagnosis rate. Moreover, GISTs are difficult to differentiate from other gastrointestinal submucosal tumors (SMTs) without fine-needle aspiration biopsy [3, 4]. Pathology and immunohistochemistry after fine-needle aspiration biopsy are the most accurate way to diagnose GISTs before surgery. Even though recent studies have confirmed that fine-needle aspiration biopsy does not result in tumor rupture or gastrointestinal dissemination of GISTs, it is still not frequently used in China and some other regions around the world [5, 6]. In addition, fine-needle aspiration has a certain false-negative rate because of the small specimen size, and patients often refuse the examination because of concerns about this invasive procedure. A more convenient and non-invasive method is still needed to differentiate GISTs from other SMTs.
GISTs have a higher malignant potential than other gastrointestinal SMTs. According to current GIST risk assessment standards and guideline recommendations around the world, most GISTs should undergo complete surgical resection. For other SMTs that occur in the gastrointestinal tract, especially the stomach, surgical intervention is generally not required. Therefore, creating a convenient and non-invasive model for the further differential diagnosis of GISTs from common examination results is pivotal to improving patients' quality of life and relieving economic stress.
The aim of this study is to develop a novel diagnostic model by analyzing patients' preoperative enhanced CT, endoscopy, EUS, and hematological test data. For this purpose, we trained an extreme gradient-boosting (XGBoost) model on a single-center dataset to predict each patient's diagnosis, identify the most important predictors, and show how they affect the predicted outcome.
We retrospectively collected, from the electronic medical record system of the Department of Gastrointestinal Surgery, Peking University People's Hospital, the data of all patients who were initially diagnosed with gastric “gastrointestinal stromal tumor” and completed surgical resection between January 2017 and June 2021. Patients with other SMTs that could be easily differentiated from GIST after CT scan or endoscopy were therefore not included. A total of 128 consecutive patients were screened. Among them, 4 patients were excluded: 1 patient had concomitant gastric cancer, and 3 patients were missing important data in our electronic medical record system because they had undergone preoperative examinations in other hospitals before admission. The remaining 124 patients were included in this study. We divided the patients into “GIST” and “other SMT” groups for comparison according to their postoperative pathological results. The t-test was used to compare continuous variables, and the Chi-square or Fisher’s exact test was used to compare categorical variables. Subsequently, we randomly divided all 124 patients into a training dataset and a validation dataset at a ratio of 3 to 1. The patients' enhanced CT, endoscopy, EUS, and hematological test results were included in this study, at least in initial model development. Missing values accounted for 12.78% of the total dataset. All statistical analysis, model training, validation and graphic plotting were performed in R statistical software v4.0.3.
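The 3-to-1 random division described above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name, the placeholder patient IDs and the seed are assumptions made only for illustration.

```python
import random


def split_three_to_one(patient_ids, seed):
    """Shuffle the IDs and split them into training and validation sets at 3:1."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * 3 / 4)
    return shuffled[:n_train], shuffled[n_train:]


# 124 included patients; the integer IDs are placeholders, not study data.
patient_ids = list(range(1, 125))
train_ids, validation_ids = split_three_to_one(patient_ids, seed=2021)
print(len(train_ids), len(validation_ids))  # 93 31
```

With 124 patients, a 3-to-1 split gives 93 training and 31 validation cases; changing the seed changes which patients fall in each set, which is what the repeated random grouping in the later sections relies on.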
All 22 clinical test results potentially related to the patient's diagnosis were included in this study. All predictors came from enhanced CT, endoscopy, EUS, and hematological test results. Predictors from enhanced CT include tumor boundary, ratio of long to short diameter, homogeneity of enhancement, and the CT values of the tumor in different phases. Predictors from endoscopy include the existence of ulcers and bleeding on the mucosal surface of the tumor. Predictors from EUS include echogenicity and the existence of calcification or a liquid area inside the tumor. In addition, previous studies have suggested that some blood inflammatory and nutritional indicators are independent factors influencing the prognosis of gastrointestinal stromal tumors. Therefore, we also tentatively included them as diagnostic predictors in this study: peripheral blood leukocytes, neutrophils, lymphocytes, platelets, hemoglobin, alanine transaminase (ALT), aspartate aminotransferase (AST), albumin level, platelet–lymphocyte ratio (PLR), neutrophil–lymphocyte ratio (NLR), systemic immune-inflammation index (SII), De Ritis ratio (AST/ALT), prognostic nutritional index (PNI), and fibrinogen level. All hematological indicators were taken from the first test performed after admission to our hospital with a suspected gastric GIST. The final diagnosis was based on the results of postoperative pathology and immunohistochemistry.
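As a hedged illustration of the derived hematological indices listed above, the Python sketch below computes them from raw blood values. Only the De Ritis ratio (AST/ALT) is defined in the text; the other formulas are the commonly used definitions and are assumed here, as are the function name, the units and the example values.

```python
def derived_indices(neutrophils, lymphocytes, platelets, ast, alt, albumin_g_per_l):
    """Derived indices from raw values.

    Assumed units: cell counts in 10^9/L, AST and ALT in U/L, albumin in g/L.
    """
    return {
        "PLR": platelets / lymphocytes,                # platelet-lymphocyte ratio
        "NLR": neutrophils / lymphocytes,              # neutrophil-lymphocyte ratio
        "SII": platelets * neutrophils / lymphocytes,  # systemic immune-inflammation index
        "De Ritis": ast / alt,                         # AST/ALT, as stated in the text
        "PNI": albumin_g_per_l + 5 * lymphocytes,      # prognostic nutritional index
    }


# Made-up values for a hypothetical patient, not taken from the study dataset.
example = derived_indices(
    neutrophils=4.0, lymphocytes=2.0, platelets=200.0,
    ast=30.0, alt=20.0, albumin_g_per_l=40.0,
)
print(example)
```

Because PLR, NLR and SII are all built from the same three cell counts, they are strongly related to one another and to the raw counts that enter the model alongside them.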
Initial model development and selection of the predictors
Missing values were imputed using the missForest package. We then incorporated all 22 factors into the initial XGBoost model. After pre-experimental screening, max_depth = 2, eta = 0.01, gamma = 0.25, and nrounds = 200 were chosen as the best parameters, and with these we developed the initial XGBoost model. The xgb.importance() function of the xgboost package was used to calculate the importance of each feature. Repeating the random grouping and model development 200 times gave the importance distribution of all 22 factors. Box plots of these importance values, drawn with the ggplot2 package, showed that the hematological test data generally had a very weak effect on the outcome. Therefore, we excluded the hematological test data from subsequent model development.
Using the same method for the second model development, three factors (the existence of ulcers or of bleeding on the mucosal surface of the tumor under gastroscopy, and homogeneity of enhancement on enhanced CT) were also found to have a weak influence on the outcome, and they were excluded as well. The remaining 6 factors were used to build the final model: the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a liquid area and of a calcific area inside the tumor on EUS.
Final model development
The six most important predictors were selected using the method described above and checked for correlation with the corrgram package. The xgboost package was again used to construct the XGBoost model in the same way.
Explaining the model
The xgb.importance() function of the xgboost package was used to calculate the importance of these 6 predictors, and the ggplot2 package was used to draw box plots of the importance distribution. We then used the explain() and variable_effect() functions of the DALEX package to further elaborate how each predictor in the model affected the diagnostic outcome. The predict_parts_break_down() function was applied to illustrate the impact of these predictors on the diagnostic outcome in individual cases.
Validation of the model
The optimalCutoff() function in the InformationValue package was used to obtain the best cutoff value of the model. The accuracy, precision, recall and f1-score of the model were then acquired from the test dataset, based on the given cutoff value. The receiver operating characteristic (ROC) curve was plotted with plotROC() function, and the area-under-ROC (auROC) value was subsequently calculated. The concordance index (C-index) was calculated with rcorr.cens() in the Hmisc package.
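To make the validation metrics concrete, the following Python sketch computes accuracy, precision, recall, F1-score and auROC from predicted scores at a given cutoff. It re-implements the standard definitions from scratch rather than calling the R functions named above; the function names and the toy labels and scores are assumptions, and treating a score exactly equal to the cutoff as negative is also an assumption.

```python
def classification_metrics(y_true, y_score, cutoff):
    """Accuracy, precision, recall and F1 with 'GIST' (1) as the positive class."""
    y_pred = [1 if s > cutoff else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


def auroc(y_true, y_score):
    """Share of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (1 = GIST, 0 = other SMT); not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_score = [0.90, 0.80, 0.75, 0.60, 0.70, 0.40, 0.72, 0.30]
metrics = classification_metrics(y_true, y_score, cutoff=0.684)
auc = auroc(y_true, y_score)
print(metrics, round(auc, 3))
```

On this toy data one GIST case scores below the cutoff and one other SMT scores above it, so accuracy is 6/8 and precision and recall are both 4/5, while 14 of the 15 positive-negative pairs are ranked correctly.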
The final model development process was also repeated 200 times to obtain the distribution of the predictor importance and of the evaluation results; we report the 2.5th and 97.5th percentiles as the 95% CI.
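The percentile-based interval over repeated runs can be sketched in Python as follows. The 200 values here are random placeholders standing in for a metric collected over 200 repeats, not study results, and the linear-interpolation percentile rule is an assumption (it matches the default of R's quantile(), but the source does not say which rule was used).

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 1])."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)                      # index of the order statistic below
    upper = min(lower + 1, len(ordered) - 1)   # index of the order statistic above
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Placeholder for one metric (e.g. auROC) collected over 200 repeated splits.
rng = random.Random(0)
repeated_metric = [min(1.0, max(0.0, rng.gauss(0.77, 0.08))) for _ in range(200)]

ci_low = percentile(repeated_metric, 0.025)
ci_high = percentile(repeated_metric, 0.975)
median = percentile(repeated_metric, 0.5)
print(f"{median:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

An interval built this way describes the spread across random splits of the same 124 patients, so it reflects sensitivity to the split rather than sampling uncertainty in a new population.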
A total of 124 patients who were initially diagnosed with gastric GIST and underwent surgery from January 2017 to June 2021 were included in this study. Of these, 90 were diagnosed with GIST according to postoperative pathology and immunohistochemistry. The other 34 patients were diagnosed with other SMTs based on postoperative pathological diagnosis, including leiomyoma (n = 9), schwannoma (n = 5), lymphoma (n = 4), ectopic pancreas (n = 4), neurofibroma (n = 4), gastric duplication (n = 3), neuroendocrine tumor (n = 1), cyst (n = 1), congenital accessory spleen (n = 1), myofibroblastoma (n = 1), and lipoleiomyosarcoma (n = 1). A total of 106 patients completed endoscopic examinations, all of which suggested the presence of submucosal lesions, and 62 of them had a mucosal biopsy taken under endoscopy. None of the preoperative mucosal biopsies detected tumor. Preoperative EUS was completed in 59 patients, and all showed submucosal mid-hypoechoic or hypoechoic lesions. Only 2 patients completed fine-needle aspiration biopsy, and the pathology showed gastrointestinal stromal tumors. All patients completed hematological tests before surgery. All characteristics included in the initial analysis are shown in Table 1.
Fig. 1 shows the importance of each factor in the two models constructed at the beginning of this study. We used all 22 factors for the first model development and found that most of the hematological indicators, except ALT, had little effect on the diagnosis of GIST (Fig. 1a). Therefore, the second model was built after all hematological test data were excluded. The importance of each factor in the second model is shown in Fig. 1b. The last three factors had little impact on the diagnosis of GIST, so they were also excluded. The remaining six predictors have an important impact on the diagnosis of GIST: the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a liquid area and of a calcific area inside the tumor on EUS. The correlations between most of the selected predictors are not strong (Additional file 1: Fig. S1).
The optimal cut-off value of this model is 0.684. That is, when the output calculation result is higher than 0.684, the patient would be diagnosed as “GIST”, and when the output calculation result is lower than 0.684, it indicates “Other SMT”. According to this cut-off value, the performance of the model in the validation dataset is calculated (Table 2): the auROC value is 0.77 (0.57–0.90), the C-index is 0.76 (0.56–0.89). The accuracy of model prediction in the validation dataset is 0.73 (0.58–0.88), precision is 0.79 (0.60–0.95), recall is 0.87 (0.67–1.00), and f1-score is 0.82 (0.70–0.92).
The most important predictor in this model is the existence of a liquid area inside the tumor on EUS, followed by the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a calcific area inside the tumor on EUS (Fig. 1c; Additional file 2: Table S1). The influence of each predictor on the predicted outcome is shown in Fig. 2a–f. Round or round-like tumors with a CT value of around 30 (25–37) and delayed enhancement, as well as a liquid but not a calcific area inside the tumor, best indicate the diagnosis of GIST.
Prediction and interpretation at the individual scale
We input the clinical data of 2 patients into this model. The CT value of Patient 1's tumor was 37, and the ratio of long to short diameter of the tumor was 1.071. The CT value increased by 9 in the arterial phase and by 35 in the venous phase. EUS showed that the tumor did not contain a liquid or calcific area. With these data, the prediction value for Patient 1 was 0.732, which is higher than 0.684, so the model's predicted result for Patient 1 is “GIST” (Fig. 3a). The CT value of Patient 2's tumor was 39, and it increased by 12 in the arterial phase and by 41 in the venous phase. The ratio of long to short diameter of the tumor was 1.545. EUS showed no liquid area but some calcification. With these data, the prediction value for Patient 2 was 0.332, which is lower than 0.684, so the predicted result for Patient 2 is “Other SMT” (Fig. 3b).
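The decision rule applied to the two patients can be written out as a short Python sketch. The cutoff and the two prediction values are taken from the text; the function name is hypothetical, and assigning a value exactly equal to the cutoff to “Other SMT” is an assumption, since the text only covers values above and below 0.684.

```python
CUTOFF = 0.684  # optimal cutoff reported for the final model


def label_from_prediction(prediction_value, cutoff=CUTOFF):
    """Map the model output to a diagnosis label using the reported cutoff."""
    # Values exactly at the cutoff fall to "Other SMT" here (assumption).
    return "GIST" if prediction_value > cutoff else "Other SMT"


predictions = {"Patient 1": 0.732, "Patient 2": 0.332}
labels = {name: label_from_prediction(value) for name, value in predictions.items()}
print(labels)  # {'Patient 1': 'GIST', 'Patient 2': 'Other SMT'}
```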
We have developed and validated a novel diagnostic model for gastric GISTs using an XGBoost machine based on a single-center retrospective dataset. The predictors selected through the initial XGBoost models are the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a liquid area and of a calcific area inside the tumor on EUS. Considering that we only included patients who were initially diagnosed with GIST and excluded those with other SMTs that could be easily recognized on CT scan or endoscopy at first presentation, the model yielded satisfactory validation results and provides a novel measure for patients with other SMTs that closely resemble GIST in CT and endoscopic features.
Current guidelines around the world recommend enhanced CT scan, endoscopy and EUS as the primary diagnostic modalities, while determining whether a tumor is a GIST still relies on EUS- or CT-guided fine-needle aspiration biopsy. Although the guidelines recommend fine-needle aspiration biopsy for patients with suspected GIST, preoperative biopsy is not widely used in some countries and regions because of limited resources or other reasons, and the guidelines also mention that preoperative biopsy can be 'omitted' or is 'not necessary' for limited resectable SMTs [11, 12]. In addition, fine-needle aspiration biopsy may give false-negative results because of its small specimen size. Therefore, although this invasive test has been proven safe and does not result in tumor rupture or GI-tract dissemination, its clinical application rate is still very low (2/124, 1.6%, in our database). For patients without a preoperative biopsy result, the misdiagnosis rate is rather high (34/122, 27.9%, in our database), which indicates that enhanced CT scan, endoscopy or EUS alone is not accurate enough to diagnose GIST [3, 4].
Recently, many studies have demonstrated the influence of peripheral blood indicators of systemic inflammation or nutrition on the long-term prognosis of various cancers, including GISTs, after surgical resection [9, 13,14,15,16]. It has been suggested that higher-risk tumors, including GISTs, have a stronger impact on patients' nutritional state and inflammatory levels. Compared with other benign or low-malignant gastrointestinal SMTs, GISTs would be expected to lower nutritional indicators and raise inflammatory indicators further. Therefore, we included peripheral blood systemic inflammation and nutrition indicators in our initial analysis. Not surprisingly, however, we found that these hematological test data, except for ALT, had little effect on the outcome. As there is currently no evidence to support an impact of changes in ALT level alone on the diagnosis of GIST, we excluded all hematological test data from the subsequent model development.
Recent years have witnessed a boom in the application of artificial intelligence in the medical field, assisting in disease detection, diagnosis and treatment decision-making. As the concept of precision medicine has been promoted for years, the use of machine learning algorithms to support clinical diagnosis and treatment has become an inevitable trend. However, data science cannot match the facts perfectly all the time. Selecting an appropriate machine learning algorithm is crucial to yielding meaningful and useful results, yet it is not an easy process. An ensemble-based classifier is better than any single classifier at analyzing the influence of combinations of factors on the outcome. For complex nonlinear multi-feature models such as predictive clinical models, the tree-boosting machine performs better and gives both the importance and the ranking of each factor [19, 20]. The XGBoost algorithm has been applied in various clinical studies to construct disease prediction models and has shown good validation results [21, 22]. Therefore, it was logical to choose XGBoost for our model development.
The first and foremost achievement of this research is the development of a clinical diagnostic model for GIST. All patients included in our dataset were initially diagnosed with gastric GIST after preoperative examinations, and the model showed satisfactory validation results in this setting. The model outputs the importance of each predictor, suggesting that the existence of a liquid area inside the tumor on EUS is the most important predictor, followed by the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial and venous phases, and the existence of a calcific area inside the tumor on EUS. All the data we used to develop the model came from the patients' preoperative clinical examinations and hematological tests, which would not cause any additional pain or economic stress for the patients.
The main limitation of this study is that it is a single-center, small-sample, retrospective study, with 124 patients included. Large-scale, multi-center studies are required to develop more accurate models. In addition, it should be noted that the analytic process using the gradient-boosting machine in this study was entirely based on data science. Clinical results may differ from mathematical calculation. At present, there is no perfect and absolutely accurate statistical algorithm that can predict the exact clinical outcome of every patient. Moreover, this model is only suitable for patients whose initial clinical diagnosis is gastric GIST and who, for some reason, cannot undergo biopsy before surgery. For patients who are not initially considered to have GIST or who have an SMT outside the stomach, this model may yield inaccurate or even contrary predictions. This model is only a tool to assist clinical diagnosis: it gives doctors an interpretation of the clinical test results and aids them in making the final diagnosis and deciding on intervention measures. In the future, nationwide, multi-center, large-scale studies are expected to further improve current models.
Availability of data and materials
We uploaded the R code of our study on the website: https://github.com/hbz0411/GIST_diagnose for the purpose of academic sharing. The datasets analyzed during this study are available from the corresponding authors on reasonable request.
von Mehren M, Joensuu H. Gastrointestinal stromal tumors. J Clin Oncol Off J Am Soc Clin Oncol. 2018;36:136–43. https://doi.org/10.1200/jco.2017.74.9705.
Søreide K, et al. Global epidemiology of gastrointestinal stromal tumours (GIST): a systematic review of population-based cohort studies. Cancer Epidemiol. 2016;40:39–46. https://doi.org/10.1016/j.canep.2015.10.031.
Karaca C, et al. Accuracy of EUS in the evaluation of small gastric subepithelial lesions. Gastrointest Endosc. 2010;71:722–7. https://doi.org/10.1016/j.gie.2009.10.019.
Park CH, et al. Impact of periodic endoscopy on incidentally diagnosed gastric gastrointestinal stromal tumors: findings in surgically resected and confirmed lesions. Ann Surg Oncol. 2015;22:2933–9. https://doi.org/10.1245/s10434-015-4517-0.
von Mehren M, et al. NCCN guidelines insights: soft tissue sarcoma, version 1.2021. J Natl Compr Cancer Netw JNCCN. 2020;18:1604–12. https://doi.org/10.6004/jnccn.2020.0058.
Scarpa M, et al. A systematic review on the clinical diagnosis of gastrointestinal stromal tumors. J Surg Oncol. 2008;98:384–92. https://doi.org/10.1002/jso.21120.
Akahoshi K, et al. Current clinical management of gastrointestinal stromal tumor. World J Gastroenterol. 2018;24:2806–17. https://doi.org/10.3748/wjg.v24.i26.2806.
Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery; 2016, p. 785–94. https://doi.org/10.1145/2939672.2939785.
Lin Y, et al. Development and validation of a prognostic nomogram to predict recurrence in high-risk gastrointestinal stromal tumour: a retrospective analysis of two independent cohorts. EBioMedicine. 2020;60:103016. https://doi.org/10.1016/j.ebiom.2020.103016.
Stekhoven DJ, Buehlmann P. MissForest—nonparametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112–8. https://doi.org/10.1093/bioinformatics/btr597.
Koo DH, et al. Asian Consensus Guidelines for the diagnosis and management of gastrointestinal stromal tumor. Cancer Res Treat. 2016;48:1155–66. https://doi.org/10.4143/crt.2016.187.
Poveda A, et al. GEIS guidelines for gastrointestinal sarcomas (GIST). Cancer Treat Rev. 2017;55:107–19. https://doi.org/10.1016/j.ctrv.2016.11.011.
Sun J, et al. Relationship of prognostic nutritional index with prognosis of gastrointestinal stromal tumors. J Cancer. 2019;10:2679–86. https://doi.org/10.7150/jca.32299.
Cao X, et al. Fibrinogen/albumin ratio index is an independent prognosis predictor of recurrence-free survival in patients after surgical resection of gastrointestinal stromal tumors. Front Oncol. 2020;10:1459. https://doi.org/10.3389/fonc.2020.01459.
Maruyama T, et al. Preoperative prognostic nutritional index predicts risk of recurrence after curative resection for stage IIA colon cancer. Am J Surg. 2020. https://doi.org/10.1016/j.amjsurg.2020.10.032.
Okadome K, et al. Prognostic nutritional index, tumor-infiltrating lymphocytes, and prognosis in patients with esophageal cancer. Ann Surg. 2020;271:693–700. https://doi.org/10.1097/sla.0000000000002985.
Goecks J, et al. How machine learning will transform biomedicine. Cell. 2020;181:92–101. https://doi.org/10.1016/j.cell.2020.03.022.
Deo RC. Machine learning in medicine. Circulation. 2015;132:1920–30. https://doi.org/10.1161/circulationaha.115.001593.
Zihni E, et al. Opening the black box of artificial intelligence for clinical decision support: a study predicting stroke outcome. PLoS ONE. 2020;15:e0231166. https://doi.org/10.1371/journal.pone.0231166.
Sugasawa S, Noma H. Estimating individual treatment effects by gradient boosting trees. Stat Med. 2019;38:5146–59. https://doi.org/10.1002/sim.8357.
Bibault JE, et al. Development and validation of a model to predict survival in colorectal cancer using a gradient-boosted machine. Gut. 2020. https://doi.org/10.1136/gutjnl-2020-321799.
Liu P, et al. Optimizing survival analysis of XGBoost for ties to predict disease progression of breast cancer. IEEE Trans Biomed Eng. 2021;68:148–60. https://doi.org/10.1109/tbme.2020.2993278.
We thank all staff of the Department of Gastrointestinal Surgery, Peking University People’s Hospital, for their assistance and cooperation.
This study was funded by the Peking University People’s Hospital Scientific Research Development Funds (RDL2020-06).
Ethics approval and consent to participate
This study has been approved by the Ethics Committee of Peking University People's Hospital (2021PHB260-001). Written informed consent was waived owing to its retrospective study design.
Consent for publication
All authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Hu, B., Wang, C., Jiang, K. et al. Development and validation of a novel diagnostic model for initially clinical diagnosed gastrointestinal stromal tumors using an extreme gradient-boosting machine. BMC Gastroenterol 21, 481 (2021). https://doi.org/10.1186/s12876-021-02048-1