Development of A Novel Competing Risk Survival Predicting Model for Esophagogastric Junction Adenocarcinoma: a SEER population-based study


Background: Adenocarcinoma of the esophagogastric junction (AEG) is a severe gastrointestinal malignancy with unique clinicopathological features. We aimed to develop a competing risk nomogram for AEG patients and to compare it with the 8th edition of the traditional tumor-node-metastasis (TNM) staging system.
Methods: Using AEG patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2004 and 2010, we applied univariate and multivariate analyses to select clinical factors and then built a competing risk nomogram to predict AEG cause-specific survival. We assessed its clinical accuracy against the 8th TNM staging system using receiver operating characteristic (ROC) curves, the Brier score, and decision curve analysis (DCA).
Results: A total of 1755 patients were included in this study. The nomogram was based on five variables: number of examined lymph nodes (LNs), grade, invasion, metastatic LNs, and age. The nomogram outperformed traditional TNM staging in terms of the ROC curve (1-year AUC: 0.747 vs. 0.641; 3-year AUC: 0.761 vs. 0.679; 5-year AUC: 0.759 vs. 0.682; 7-year AUC: 0.749 vs. 0.673; P<0.001), the Brier score (3-year: 0.198 vs. 0.217, P=0.012; 5-year: 0.198 vs. 0.216, P=0.008; 7-year: 0.199 vs. 0.215, P=0.014), and DCA.
Conclusions: In AEG patients from the SEER database, the competing risk nomogram provided more accurate individualized survival prediction than the traditional TNM classification.


Introduction
Although its incidence has decreased continuously over the past decades, gastric cancer is still the third leading cause of cancer-related mortality and the fifth most common cancer globally. 1,2 A growing number of population-based studies have reported that the incidence of adenocarcinoma of the esophagogastric junction (AEG) shows a significant increasing trend. 3 According to the AJCC Staging Manual, cancers with esophagogastric junction (EGJ) invasion that have their epicenter within the proximal 2 cm of the EGJ (Siewert type I/II) are to be staged as TNM-EC, while cancers whose epicenter is more than 2 cm distal from the EGJ, even if the EGJ is involved, are staged by TNM-GC. 6 However, this staging strategy for AEG focuses only on the location of invasion and neglects other critical clinical features, such as age, sex, and the number of resected lymph nodes (LNs), which may be prognostic factors. 7−10 Thus, the prognostic evaluation system for AEG needs further exploration.
In general, the survival of cancer patients may be affected by more than one type of event, of which only one can occur first. 11 Events other than the one of interest are called competing risk events.
Traditional survival analysis may overestimate the cumulative incidence because it treats competing events as censored observations; competing risk analysis corrects this. The nomogram, a simple graphical linear prediction model, is widely used for cancer prognosis. 12 Hence, in this study, we aimed to explore a new classification system using a competing risk model in the population-based Surveillance, Epidemiology and End Results (SEER) database and to develop a nomogram for predicting survival in AEG patients.
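The overestimation that arises when competing events are treated as censored can be illustrated numerically. The following is a minimal sketch on a small invented dataset, not the analysis code of this study; the function names and the data are hypothetical. It compares the naive estimate (one minus Kaplan-Meier, with deaths from other causes censored) against the nonparametric cumulative incidence function (Aalen-Johansen estimator).

```python
# Toy illustration with invented data; all names here are hypothetical.
# Event codes: 0 = censored, 1 = death from the cancer of interest,
#              2 = death from other causes (the competing event).

def naive_one_minus_km(times, events, cause=1):
    """1 - Kaplan-Meier for `cause`, treating every other outcome as censoring."""
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == cause)
        surv *= 1.0 - deaths / at_risk
    return 1.0 - surv


def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`."""
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        d_cause = sum(1 for x, e in zip(times, events) if x == t and e == cause)
        d_any = sum(1 for x, e in zip(times, events) if x == t and e != 0)
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
    return cif


times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]    # follow-up in months (invented)
events = [1, 2, 1, 0, 2, 1, 2, 1, 0, 0]

naive = naive_one_minus_km(times, events)
cif = cumulative_incidence(times, events)
print(f"naive 1-KM: {naive:.3f}, cumulative incidence: {cif:.3f}")
```

On this toy data the naive estimate is 0.58, whereas the cumulative incidence is about 0.43, because patients who died of other causes can no longer die of the cancer of interest.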

Study design and data acquisition
Patient data were obtained from the SEER website (http://seer.cancer.gov/) using SEER*Stat version 8.3.5. A total of 11639 patients over 18 years old with esophagogastric junction (EGJ) cancer diagnosed from 2004 to 2010 were initially analyzed. We then excluded patients with multiple primary tumors, with primary EGJ cancer of non-adenocarcinoma histology, or without histological confirmation. Patients with metastatic disease (M1) or less than 3 months of follow-up were also excluded. To allow comparison of the nomogram with TNM staging, we excluded patients with unknown N stage. Patients with unknown prognostic characteristics such as race, tumor size, and tumor location were also excluded. Finally, we extracted clinicopathologic variables for 1755 patients, including age, gender, race, tumor location, TNM stage, histologic grade, number of examined LNs, number of positive LNs, tumor size, and survival months.

Exploration of a new evaluation system and nomogram presenting
We regarded AEG cause-specific death and death from other causes as two competing events in our competing-risk analysis. A multivariate proportional sub-distribution hazard model was used to calculate the adjusted sub-distribution hazard ratios (SHRs) for the new evaluation system.
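For reference, the sub-distribution hazard for AEG cause-specific death (cause 1) can be written as below. This is the standard textbook formulation of the proportional sub-distribution hazard model; the notation (event time T, event type ε, covariates x, coefficients β, baseline quantities with subscript 0) is assumed here and does not come from the original text.

```latex
\begin{align}
\lambda_1(t \mid x) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
  P\bigl\{ t \le T < t + \Delta t,\ \varepsilon = 1 \,\bigm|\,
  T \ge t \ \text{or}\ (T < t,\ \varepsilon \ne 1),\ x \bigr\} \\
&= -\frac{d}{dt} \log\bigl\{ 1 - F_1(t \mid x) \bigr\}, \\
\lambda_1(t \mid x) &= \lambda_{1,0}(t)\, \exp(\beta^{\top} x),
  \qquad \mathrm{SHR}_j = \exp(\beta_j), \\
F_1(t \mid x) &= 1 - \bigl\{ 1 - F_{1,0}(t) \bigr\}^{\exp(\beta^{\top} x)} .
\end{align}
```

Here F_1 is the cumulative incidence function of cause 1. The risk set keeps patients who have already died of other causes, which is what links the model directly to the cumulative incidence. An SHR below 1 for a covariate therefore corresponds to a lower cumulative incidence of cause-specific death.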
Variables associated with AEG cause-specific death with a P value <0.1 in the univariate analysis and a P value <0.05 in the initial multivariate analysis were included in the final multivariate analysis. We built both a proportional sub-distribution hazard model to predict cause-specific death and a competing-risk nomogram based on Fine and Gray's model. 13 A calibration plot was used to compare the predicted probability with the observed probability at given time points. If the pairs of predicted and observed probabilities lie on the 45° line, the two match well and the model is well calibrated. The discrimination of the model was assessed by the area under the receiver operating characteristic curve (AUC). 14 An AUC >0.8 indicates good discriminatory accuracy. Discrimination and calibration were also measured jointly by the Brier score. 15 Decision curve analysis (DCA) was used to estimate the clinical usefulness and net benefit of the predictive model and to compare it with the traditional TNM staging system in the whole cohort. 16,17
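To make the two accuracy measures concrete, the sketch below computes the AUC and the Brier score for predicted risks at a single fixed time horizon. It is a simplified illustration under stated assumptions: the data are invented, the function names are hypothetical, and every patient's status at the horizon is assumed to be known, so the censoring and competing-risk weighting required in a real time-dependent analysis is not shown.

```python
# Simplified sketch: binary status at one time horizon, no censoring weights.
# Data and function names are invented for illustration.

def brier_score(pred, outcome):
    """Mean squared difference between predicted risk and observed status (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(pred, outcome)) / len(pred)


def auc(pred, outcome):
    """Probability that a patient with the event has a higher predicted risk
    than a patient without it (ties count as one half)."""
    pos = [p for p, y in zip(pred, outcome) if y == 1]
    neg = [p for p, y in zip(pred, outcome) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]   # predicted risk of death
outcome = [1, 1, 0, 1, 0, 1, 0, 0]                # 1 = died by the horizon

model_auc = auc(pred, outcome)
model_brier = brier_score(pred, outcome)
print(f"AUC: {model_auc:.4f}, Brier score: {model_brier:.3f}")
```

A higher AUC means better discrimination, and a lower Brier score means better overall accuracy. A model that predicts a risk of 0.5 for everyone scores 0.25.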

Statistical analysis
All statistical analyses were performed using R version 3.3.3 software (Institute for Statistics and Mathematics, Vienna, Austria; www.r-project.org). Statistical significance was set at two-sided P < 0.05.

Patients Characteristics
A total of 11639 adults were diagnosed with EGJ cancer from 2004 to 2010, and 2859 patients were excluded because of multiple primary tumors. Patients with histology other than adenocarcinoma (N=1810), without histological confirmation (N=4206), with distant metastasis (N=407), or with an unknown number of examined LNs (N=209) were also excluded. In addition, primary AEG cases with follow-up of less than 3 months (N=103), unknown tumor size (N=265), unknown invasion depth (N=11), or unknown cause of death (N=14) were excluded. Finally, 1755 cases were included in further analysis (Figure 1). A total of 373 females (21.2%) and 1382 males (78.8%) were included in the analysis. Sixty years was used as the cut-off for defining elderly patients. T stage ranged from T1 to T4 (N=355, 227, 768, and 405, respectively) and N stage from N0 to N3 (N=716, 391, 340, and 308, respectively). Details of the baseline characteristics of the participants are shown in Table 1.
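The selection flow can be checked with simple arithmetic. The short sketch below uses only the counts reported in this paragraph; the dictionary labels are paraphrases added for readability.

```python
# Consistency check of the reported cohort selection counts.
initial = 11639
exclusions = {
    "multiple primary tumors": 2859,
    "histology other than adenocarcinoma": 1810,
    "no histological confirmation": 4206,
    "distant metastasis": 407,
    "unknown number of examined LNs": 209,
    "follow-up of less than 3 months": 103,
    "unknown tumor size": 265,
    "unknown invasion depth": 11,
    "unknown cause of death": 14,
}
remaining = initial - sum(exclusions.values())

t_stage = [355, 227, 768, 405]          # T1 to T4
n_stage = [716, 391, 340, 308]          # N0 to N3
sex = {"female": 373, "male": 1382}

print(remaining, sum(t_stage), sum(n_stage), sum(sex.values()))
```

Each of the four totals equals the final cohort size of 1755, so the exclusion counts and the stage and sex distributions are mutually consistent.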

AEG survival prediction model
In Table 2. We then used DCA to compare the clinical usefulness of the nomogram and traditional TNM staging. With a continuum of potential risk thresholds for death on the x-axis and the net benefit of using the model to risk-stratify patients on the y-axis, relative to assuming that all patients will be alive, the DCA showed graphically that the nomogram outperformed traditional TNM staging (Figure 5). Compared with traditional TNM staging, the nomogram showed a larger net benefit across the range of death risk in the analysis.
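The net benefit plotted on the y-axis of a decision curve can be sketched as follows. This is a simplified illustration with invented data and hypothetical function names. It assumes a binary outcome known for every patient at a fixed horizon and does not show the adjustments for censoring or competing risks that a survival DCA requires.

```python
# Simplified net-benefit calculation for decision curve analysis.
# Data and function names are invented for illustration.

def net_benefit(pred, outcome, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    n = len(pred)
    tp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_flag_all(outcome, threshold):
    """Reference strategy: treat every patient as high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]   # predicted risk of death
outcome = [1, 1, 0, 1, 0, 1, 0, 0]                # 1 = died by the horizon

nb_model = net_benefit(pred, outcome, 0.5)
nb_all = net_benefit_flag_all(outcome, 0.5)
print(f"model: {nb_model:.3f}, flag all: {nb_all:.3f}, flag none: 0.000")
```

The strategy of assuming that all patients will be alive has a net benefit of zero by definition. A decision curve repeats this calculation over a range of thresholds, and the model with the higher curve is preferred at that threshold.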

Discussion
As a tumor located at the junction between the esophagus and the stomach, AEG remains elusive in terms of its definition, evaluation, and management. Based on the 8th AJCC TNM staging classification, AEG patients can receive better evaluation and management. 18 However, the new, more complex classification may sometimes confuse clinicians, resulting in suboptimal evaluation and therapy. 6 Therefore, a new evaluation and classification system specific to AEG is urgently needed. This study developed a new classification system, in the form of a competing risk nomogram, for predicting survival in patients with AEG.
This nomogram is based on five variables: number of examined LNs, grade, invasion (T stage), metastatic LNs (N stage), and age. It produced more accurate predictions than the pathologic 8th TNM staging method and showed better clinical usefulness across the range of risk thresholds as assessed by DCA.
Because of its unique anatomic location, the AJCC classification of AEG has remained under debate in recent years. An accurate staging system helps clinicians choose the best subsequent treatment. 19 The 8th edition of the AJCC classification paid more attention to changes and developments that lead to better clinical decision making and predictive accuracy. The separate staging of AEG reflects the AJCC's concept of an individualized approach. The new AJCC staging system is more complex, with three separate stage groupings: clinical (cTNM), pathologic (pTNM), and post-neoadjuvant pathologic (ypTNM). 18 Clinical stage is defined by physical examination, endoscopy, and imaging, which show substantial heterogeneity between surgeons. In clinical practice, the pathologic stage groups show the widest distribution of survival. 20,21 In addition, a direct comparison between editions is possible only for pathologic staging. Therefore, we used the pathologic stage groups of the 8th TNM staging method as the comparator.
The traditional TNM classification stratifies AEG by three factors: the pathologic depth of invasion, the number of metastatic LNs, and distant metastasis. Survival data usually involve multiple outcomes, which may compete with one another, 10,22,23 resulting in overestimation of the cumulative incidence. The nomogram is a well-established statistical tool that provides a comprehensive probability of an outcome. 24 A prior nomogram that included six clinical factors (age, sex, depth of invasion, metastasized LNs, examined LNs, histological grade) showed greater accuracy than the 7th AJCC TNM classification. 8 In clinical practice, we found that the outcome of interest in AEG could be precluded by many other events. 5,25 Therefore, we used a new competing risk nomogram to reduce the influence of these outcomes. 26,27 In our study, we used the number of examined LNs, grade, N stage, T stage, and age as classification factors, some of which are consistent with the TNM staging system. This simple nomogram could exclude possible bias from other causes of death and be useful for clinicians in practice.
In our nomogram, the number of examined LNs was used to evaluate survival in addition to the standard factors of the AJCC staging system. The number of examined LNs was a protective factor (<10, <15, ≥15: SHR 0.751, 0.635, 0.540; P < 0.001) in this nomogram, which suggests that resection of more LNs leads to better survival. Several trials have also recommended the number of examined LNs as a strong predictor in staging systems for AEG. 28−30 Different surgical methods may determine the number of LNs examined. 31 The choice of surgery depends on the type of AEG: type I is treated as esophageal cancer, whereas types II and III are regarded as gastric cancer. 10,31 More examined LNs may reflect more exhaustive surgical dissection and fewer residual positive LNs. Moreover, the number of examined LNs is affected not only by the surgeon's dissection procedure but also by the pathologist's search for LNs. Thus, the number of examined LNs could serve as an indicator of the reliability of pathological staging.
This work also has some limitations. First, as cancer biology has evolved, the discovery and validation of biologic factors have become more important for predicting cancer outcomes. 18,23 The 8th AJCC classification recommends some of these factors on strong evidence, and they show great accuracy in AEG. Because of missing data in the SEER database, we could not compare our model with the complete 8th classification. Further studies incorporating biologic data may lead to a more 'personalized' approach.
Second, the SEER database is based on retrospective data collection, diagnosis and surgery were performed by different doctors at several medical centers, and the present study did not account for different surgical approaches to AEG such as transhiatal oesophagectomy, three-stage oesophagectomy, and total gastrectomy. In addition, missing data led to the exclusion of many patients, which might introduce bias. Finally, the lack of a validation cohort is another limitation of our current study.

Conclusion
We developed a nomogram predicting overall survival of AEG patients based on a large database. The

CONFLICT OF INTEREST:
All authors declare no conflicts of interest related to this article.

FUNDING:
This study received funding from the Special Research Projects for Capital Health Development, No.

ACKNOWLEDGEMENT:
We would like to thank Dr Shan Wu from Shanghai Six People's Hospital for her help with the R methodology.

AUTHORSHIP:
Conception and design: DB Zhao, YT Chen, TB Wang, Y Wu; Collection and assembly of data: TB Wang, H Zhou, CR Wu, XJ Zhang; Data analysis and interpretation: TB Wang, Y Wu; Manuscript writing: All authors; Final approval of the manuscript: All authors.

ETHICAL STATEMENT AND INFORMED CONSENT STATEMENT:
The SEER database is publicly available and provides de-identified case data. Because the analysis used anonymous clinical data, the requirement for written informed consent from subjects was waived.

AVAILABILITY OF DATA AND MATERIALS:
The datasets generated and analysed during the current study are available in the SEER database, https://seer.cancer.gov/.

Calibration curves for predicting patient survival at 1, 3, 5, 7, and 10 years. Nomogram-predicted cancer-specific survival is plotted on the x-axis; actual cancer-specific survival is plotted on the y-axis. A plot along the 45-degree line would indicate a perfectly calibrated model in which the predicted probabilities are identical to the actual outcomes.