Risk prediction of second primary malignancies in patients after rectal cancer: analysis based on SEER Program

Background This study explores the clinical characteristics of rectal cancer (RC) patients with second primary malignancies (SPMs) and constructs a prognostic nomogram to support clinical treatment decisions. Methods We determined the association between risk factors and overall survival (OS) via Cox regression analysis and established a nomogram to forecast the OS of these patients. We then evaluated the performance of the prognostic nomogram in predicting OS. Results Nine parameters were identified to establish the prognostic nomogram, and the C-index of the training set and validation set was 0.691 (95%CI, 0.662–0.720) and 0.731 (95%CI, 0.676–0.786), respectively. The calibration curves showed high agreement between the predicted and actual results, and the receiver operating characteristic (ROC) curves supported the clinical usefulness of our model. In addition, the nomogram classification differentiated risk subgroups more precisely and improved the discrimination of SPM prognosis. Conclusions We systematically explored the clinical characteristics of SPMs after RC and constructed a satisfactory nomogram. Supplementary Information The online version contains supplementary material available at 10.1186/s12876-023-02974-2.


Introduction
Rectal cancer was the eighth most frequently diagnosed malignancy and the tenth most common cause of cancer-related death globally in 2018 [1], with approximately 732,210 new cases and 339,022 deaths in 2020 [2,3]. Owing to progress in early diagnosis, comprehensive treatment, and cancer detection, the OS of RC patients has greatly improved [4]. For rectal cancer treated at an early stage, the 5-year OS rate can reach 90% [5,6]. However, second primary malignancies threaten the lives of long-term RC survivors [7]. Recently, a growing number of studies have investigated the risk factors for the development of SPMs in specific tumors, such as lung cancer [8], prostate cancer [9], breast cancer [10], and stomach cancer [11]. Earlier studies reported that the prevalence of SPMs in RC survivors is 4-8% higher than in the general population [12]. Several studies have explored the factors thought to contribute to this higher rate, which relate to the patient's genetic factors, lifestyle, environmental risk factors, and cancer therapy [13][14][15].
Nomograms have been identified as a simple yet sophisticated clinical prediction tool for estimating individualized OS based on clinical characteristics and risk factors [16][17][18]. Understanding the incidence and prognosis of SPMs is important for both treatment providers and RC patients. Therefore, this study concentrates on the risk factors for SPMs and develops a nomogram to forecast the 1-, 3-, and 5-year OS of patients with SPMs after RC.

Definition of SPMs
SPMs were defined as metachronous invasive solid cancers developing ≥ 6 months after the initial primary cancer (IPC), under the criteria of Warren and Gates as modified by the National Cancer Institute [19]. The SEER database lists the pathologic subtypes of the IPC and SPMs. To better distinguish SPMs from primary and metastatic tumors, we defined an SPM as a second malignancy that was histologically different from the IPC, with a latency period of at least 6 months. The SEER database also provides key clinical information on "malignant tumors for patient" and the "sequence number" of multiple primary malignancies, which can be used to identify patients with SPMs and to index the sequence of multiple malignancies.

Patient selection
The clinicopathological information of a total of 4374 patients with rectal cancer was obtained from the SEER database. The inclusion criteria were as follows: (1) age at diagnosis between 20 and 80 years; (2) rectal cancer diagnosed between January 2004 and December 2013, with a follow-up period of at least 5 years; (3) detailed survival data and follow-up information available. The exclusion criteria were as follows: (1) patients without pathological confirmation of the diagnosis; (2) patients identified only from death certificate or autopsy records; (3) latency periods of fewer than 6 months between the IPC and SPMs. Next, patients whose second tumor had the same histological type as the rectal cancer were excluded (N = 2536), leaving 1838 patients diagnosed with SPMs. Patients with unclear clinical data were then excluded, including those with no TNM stage (N = 403), unknown lymph node removed (LNR) or marital status (N = 639), and unknown clinical stage of RC (N = 55). Finally, the prognostic nomogram was created using the risk factors identified from the detailed clinical data of the remaining 741 SPM patients with rectal cancer. The data of these 741 patients were randomly split into a training set (N = 585) and a validation set (N = 156) at a ratio of 7:2. The training set was used for model construction and internal validation, and the validation set was used for external validation. The details of SPM screening are shown in Fig. 1.
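The random split described above can be sketched as follows. This is a minimal illustration only: the patient IDs are hypothetical placeholders, and the fixed seed and the `split_cohort` helper are assumptions introduced here, not the procedure used in the study.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training and validation lists."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs standing in for the 741 eligible SPM patients.
ids = list(range(741))
train_ids, valid_ids = split_cohort(ids, n_train=585)
print(len(train_ids), len(valid_ids))
```

Splitting on patient identifiers rather than on rows keeps each patient in exactly one set, so the same individual cannot contribute to both model building and validation.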

Statistical analysis
To investigate the relationship between clinicopathological variables and the OS of SPM patients, univariate and multivariate Cox regression analyses were performed to identify risk factors. Next, the significant risk factors were used to build a nomogram to forecast the 1-, 3-, and 5-year survival rates of SPM patients. To verify the performance of the nomogram, the C-index was used to assess the accuracy of the predictions. Calibration curves were created to evaluate the consistency between predicted and actual results, and bootstrapping with 1000 resamples was used to assess discrimination and calibration. ROC curves were then used to evaluate the 1-, 3-, and 5-year survival predictions. In addition, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were used to compare the accuracy of the nomogram with that of the conventional AJCC staging system, and the clinical usefulness and benefits of the nomogram were estimated with decision curve analysis (DCA) plots.
R software (version 4.1.2) and SPSS 25.0 were used for the statistical analyses. All tests were two-sided, and P < 0.05 was considered statistically significant.
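To make the C-index mentioned above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the code used in this study (the analyses were run in R and SPSS); the function name and the toy data are assumptions for illustration, and pairs with tied follow-up times are simply skipped.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died strictly before
            # patient j's follow-up time (tied times are skipped).
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.8
```

In the toy data, five pairs are comparable and four are ordered correctly, giving 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.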

Characteristics of patients
A total of 51,611 patients diagnosed with rectal cancer during 2004-2013 were obtained from the SEER database, of whom 4,374 were diagnosed with another cancer more than 6 months after the initial diagnosis of RC. To rule out recurrence and metastasis of RC, patients whose second tumor had the same histological type as the RC were excluded. Ultimately, a total of 1838 (3.56%) patients diagnosed with SPMs were identified. The median interval between RC and SPM diagnosis was 36 months, and the median age at SPM diagnosis was 67.5 years. After removing patients with unclear clinical information, 741 cases of SPMs remained. SPM sites and histological types accounting for more than 1% of these patients are shown in Fig. 2. The three most common sites for SPMs were the Lung and Bronchus (18.35%), Urinary Bladder (15.11%), and Breast (11.20%) (Table 1, Table S1). The three most prevalent histological types for SPMs were Squamous Cell Neoplasms (21.32%), Adenomas and Adenocarcinomas (18.76%), and Transitional Cell Papillomas and Carcinomas (15.11%) (Table 1, Table S2).

Fig. 1 Study flowchart showing the process of constructing the nomogram to predict the overall survival (OS) of second primary malignancies (SPMs) after rectal cancer (RC). LNR: lymph node removed

The final cohort for further analysis included 741 patients in total, who were randomly divided into the training set (N = 585) and the validation set (N = 156). The χ2 test showed no significant difference between the two sets in clinical characteristics (P > 0.05), including the site of SPMs, histology of SPMs, age, race, TNM stage, treatment information, tumor size, and grade of SPMs (Table 2). The training set was used to build the nomogram and verify the model internally, while the validation set was used for external validation.
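The balance check between the two sets can be illustrated with a Pearson χ2 test. The sketch below is restricted to a 2x2 table (one binary characteristic by set membership) so that the P value can be computed with the standard library alone; the variables in Table 2 with more than two categories would need the general χ2 distribution. The counts are hypothetical and are not taken from Table 2.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns (statistic, p_value). With 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows are training (585) and validation (156) sets,
# columns are presence/absence of some binary characteristic.
stat, p_value = chi_square_2x2([[290, 295], [78, 78]])
print(round(stat, 4), round(p_value, 4))
```

A P value above 0.05 for a characteristic is read, as in the text, as no significant difference between the training and validation sets for that characteristic.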

Prognostic factors selection and nomogram construction
Univariate and multivariate Cox regression analyses were applied to reveal OS-related factors in SPMs. The results (Table 3) show that age, TNM stage, M stage of RC, SPM surgical history, SPM tumor size (P < 0.001), and site (P = 0.009) were significantly associated with a higher risk, while chemotherapy and radiotherapy were significantly associated with a lower risk (P < 0.001). Multivariate Cox regression analysis revealed that age, M stage, M stage of RC, and SPM surgical history (P < 0.001), T stage (P = 0.003), and N stage (P = 0.012) were independent predictive variables for SPM survival. According to the results of the univariate and multivariate Cox regression analyses, 9 parameters including the site, age, TNM stage, M stage of RC, SPM surgical history, SPM radiotherapy records, SPM chemotherapy records, and SPM tumor size were used to establish a nomogram for predicting 1-, 3-, and 5-year OS (Fig. 3). To use the nomogram, each characteristic is assigned a number of points on the scale, and the points from all parameters are summed to give a total score for the individual patient. The probability of OS at 1, 3, and 5 years is then read off by locating the total score on the nomogram's total-points axis. For example, an SPM patient diagnosed at 60 years of age with a 5 cm tumor in the urinary bladder, T2N2M0, M0 stage of RC, and with SPM surgery and radiation records but no chemotherapy record had a total of 135 points, which corresponded to 1-, 3-, and 5-year OS rates of about 88.3%, 62.5%, and 50.1%, respectively.
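The arithmetic behind reading a survival probability off a Cox-based nomogram can be sketched as follows. Under the proportional hazards model, the predicted survival is S(t | x) = S0(t) ^ exp(β·x), and nomogram points are typically a linear rescaling of each variable's contribution to the linear predictor β·x. The coefficients, covariate names, and baseline survival below are hypothetical values chosen for illustration; they are not taken from Table 3 or Fig. 3.

```python
import math


def predicted_survival(baseline_survival, coefficients, covariates):
    """Cox-model survival probability S(t | x) = S0(t) ** exp(beta . x)."""
    linear_predictor = sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical log hazard ratios and patient profile (illustration only).
coefficients = {"age_70_79": 0.5, "stage_m1": 1.0, "spm_radiotherapy": -0.7}
patient = {"age_70_79": 1, "stage_m1": 0, "spm_radiotherapy": 1}
baseline_5yr = 0.60  # assumed baseline 5-year survival S0(5)

print(round(predicted_survival(baseline_5yr, coefficients, patient), 3))
```

A patient whose covariates sum to a linear predictor of zero is predicted the baseline survival; positive contributions (more nomogram points) lower the predicted OS, and negative contributions raise it.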

Discussion
As the incidence of SPMs has increased significantly, there is a heightened need for research on the monitoring, prognosis, and treatment of SPMs in clinical practice and public health [20,21].
To investigate the prognosis of SPMs following RC, 9 parameters including the site, age, TNM stage, M stage of RC, SPM surgical history, radiotherapy records, chemotherapy records, and tumor size were analyzed and used to create a new nomogram that forecasts the survival rate of SPM patients. Taken together, our results showed that the nomogram is superior to the AJCC staging system in predicting the probability of OS at 1, 3, and 5 years in both the training set and the validation set.
In reviewing the literature, Du et al. [22] reported that the three most prevalent sites of SPMs were neoplasms of colorectum (SIR 1.59, 95%CI   showed that patients with RC were more likely to develop malignant tumors in the thyroid, uterine body, colon, rectum, and lung/bronchus. Similarly, our results showed that the three most common sites for SPMs were the Lung and Bronchus (18.35%), Urinary Bladder (15.11%), and Breast (11.20%). Therefore, regular and long-term monitoring of the Lung and Bronchus, Urinary Bladder, and Rectum is necessary for RC patients at high risk. Among the 9 parameters included in our nomogram, age is a recognized important risk factor for SPM patients [24,25]. Liu et al. [26] reported that age (50-59: HR 0.958, 95%CI 0.842 − 0.091; 60-100: HR 1.557, 95%CI 1.370-1.747; 18-49 as a reference) was correlated with OS in multivariate analysis (P < 0.001). Similarly, Li et al. [27] noted that age (≥ 73: HR 1.482, 95%CI 1.048-2.152; < 73 as a reference) was correlated with OS in multivariate analysis (P = 0.045). In our study, age was divided into four groups to better explore the relationship between age and overall survival, and the results indicate that age (60-69: HR 1.422, 95%CI 1.074-1.883; 70-79: HR 1.713, 95%CI 1.297-2.263; ≥ 80: HR 2.801, 95%CI 1.763-4.450; < 60 as a reference) was correlated with OS in multivariate analysis (P < 0.001). The decline in physical condition, poor treatment sensitivity, and more advanced cancer stage in elderly patients may all contribute to these results.
Likewise, multivariate analysis in our study revealed that N stage (N1: HR 0.926, 95%CI 0.660-1.299; N2: HR 1.534, 95%CI 1.071-2.197; N3: HR 2.011, 95%CI 0.923-4.380; N0 as a reference) was significantly associated with OS in SPM patients (P = 0.012). This is consistent with the findings of previous work that the N stage is one of the most significant contributors to OS [28,29]. This view is supported by Park et al. [30], who reported that a higher pathological N stage (N1: HR 1.182, 95%CI 1.191-1.845, P < 0.001; N2: HR 2.344, 95%CI 1.779-3.289, P < 0.001; N0 as a reference) was significantly associated with OS, suggesting that more frequent surveillance is warranted for these patients. As noted by Song et al. [31], the N stage was considered as a Nomogram as a suitable scoring tool for clinical research; it can integrate the effects of various prognostic factors and present the results intuitively. Compared with the current AJCC sixth edition, the nomogram we created demonstrates a noticeably stronger capacity for risk stratification of SPM patients after RC. Meanwhile, it is straightforward to gather the nine prognostic factors for an SPM patient, match them against the nomogram, and calculate the corresponding scores. The 1-, 3-, and 5-year OS can then be obtained conveniently by summing the points and locating the total on the nomogram. The nomogram could provide patients with information on survival and support clinical decision-making and treatment allocation. Patients at high risk need active treatment and close monitoring to improve their overall survival.
Several questions remain unanswered at present. First, because this is a retrospective study, potential selection bias may have occurred even though the inclusion and exclusion criteria were strictly applied. Second, due to the lack of data on chemotherapy protocols and doses, it is not possible to evaluate the effects of different protocols and doses on the onset of secondary cancer. Finally, although our predictive model performed well in internal validation, additional external validation in other populations is still required.

Fig. 2
Fig. 2 Features of second primary malignancies (SPMs) after rectal cancer (RC). (a) Sites of SPMs accounting for more than 1% of patients, (b) Histological types of SPMs accounting for more than 1% of patients

Fig. 4
Fig. 4 Calibration curves evaluating the 1-year (a), 3-year (c), and 5-year (e) survival for second primary malignancy (SPM) patients in the training set; calibration curves evaluating the 1-year (b), 3-year (d), and 5-year (f) survival for SPM patients in the validation set

Fig. 6
Fig. 6 DCA curves of the nomogram and AJCC TNM staging system for predicting 1-, 3-, and 5-year OS in the training set (a, b, c) and the internal validation set (d, e, f)

Table 1
The top 20 sites and histological types of SPMs after RC. Abbreviations: SPMs: second primary malignancies; RC: rectal cancer

Table 2
Clinicopathological characteristics of SPM patients with RC

results demonstrated that the accuracy of the nomogram in predicting OS is far superior to that of the conventional AJCC staging system.

Table 3
Univariate and multivariate Cox analysis of SPM patients after RC in the training and validation sets

Table 4
NRI and IDI of the nomogram and the traditional AJCC staging system in OS prediction for RC