Identification of prognostic TMB-related lncRNAs and model construction
As shown in Fig. 1, Patients in the TCGA-PAAD cohort were divided into high-TMB and low-TMB groups based on the median value of the TMB score. 352 down-regulated lncRNAs and 133 up-regulated lncRNAs were identified as TMB-related lncRNAs (Fig. 2A) and the details were shown in Supplementary Table 2. And then, univariate analysis showed that 43 TMB-related lncRNAs were selected as candidates for model construction (Supplementary Table 3) and the expression pattern of 43 prognostic TMB-related lncRNAs between high-TMB and low-TMB groups was shown in Fig. 2B. Finally, a 14-lncRNA prognostic model was constructed by LASSO Cox method (Fig. 2C-2D) in the training set, including TRPM2-AS, MIR600HG, MIR3142HG, LINC01940, LINC01518, HOXA-AS2, FIRRE, ELFN1-AS1, C8orf31, AL139246.4, AC114947.2, AC103853.1, AC092756.1, and AC010175.1 (Supplementary Table 4). In addition, multivariate analysis was performed and the result indicated that these 14 lncRNAs were independent factor on the prognosis in the TCGA-PAAD cohort (p < 0.05), seven of which (TRPM2-AS, MIR600HG, MIR3142HG, HOXA-AS2, AL139246.4, AC114947.2, and AC010175.1) were protective factors with hazard ratios (HR) < 1 and the rest of which were poor factors with HR > 1 (Fig. 2E).
Evaluation and validation of the 14-lncRNA prognostic model
The risk scores of each patient were calculated and the training set was divided into high-risk and low-risk groups based on the median value of risk scores (Fig. 3A). The status of each patient and the expression pattern of 14 lncRNAs of each patient were shown in Fig. 3B-3C respectively. Comparison of survival curve between high-risk and low-risk groups suggested that patients with high risk score had significantly shorter overall survival (OS) than those with low risk score (Fig. 3G, p < 0.0001). Besides, the AUC of 1-year, 3-year, and 5-year OS was 0.795, 0.878, and 0.876, respectively (Fig. 3I). As validation, the same process was performed in the testing set (Fig. 3D-3F), the entire TCGA-PAAD cohort (Fig. 4A-4C), and the PACA-AU cohort (Fig. 4D-4F). Similarly, the high-risk group was associated with worse prognosis compared to the low-risk group in the testing set (Fig. 3H, p = 3e-04), the entire TCGA-PAAD cohort (Fig. 4G, p < 0.0001), and the PACA-AU cohort (Fig. 4H, p = 0.0017). Additionally, we also evaluated the AUC of 1-year, 3-year, and 5-year OS in the testing set (Fig. 3J), the entire TCGA-PAAD cohort (Fig. 4I), and the PACA-AU cohort (Fig. 4J). It is also well-known that 90% of PAAD were classified as pancreatic ductal adenocarcinoma (PDAC). Thus, we further evaluate the prognostic value of our signature only in the PDAC patients of the PACA-AU cohort. The result showed that patients with high risk score had inferior outcomes compared to those with low risk score, but the difference was not significant (Supplementary Fig. 1). The possible reason for this result is the small sample size of PDAC (n = 40).
Association of the 14-lncRNA prognostic model and clinicopathological characteristics
We compared the clinicopathological characteristics of high-risk and low-risk groups in the TCGA-PAAD cohort. As shown in Fig. 5A, the T stage was significantly associated with the risk score: high risk group had more patients with T3-T4 while patients with T1-T2 were more clustered in low risk group. No significant difference was found in other clinicopathological characteristics between high-risk and low-risk groups. And then, the differences of TMB, microsatellite instability (MSI), fraction genome altered (FGA), homologous recombination deficiency (HRD)-score, and immune score between high-risk and low-risk groups were also compared. Notably, the risk score was positively correlated with the TMB (Fig. 5B, p < 0.001), MSI (Fig. 5C, p < 0.05), FGA (Fig. 5D, p < 0.0001) and HRD-score (Fig. 5E, p < 0.001).
To further investigate whether the prognostic value of the 14-lncRNA prognostic model can be impacted by the clinicopathological characteristics, patients in the TCGA-PAAD cohort were grouped by the clinicopathological characteristics including age (≤ 60 years and > 60 years), gender (male and female), family history of cancer (yes and no), history of diabetes (yes and no), history of alcohol exposure (yes and no), history of chronic pancreatitis (yes and no), M stage (M0 and M1), N stage (N0 and N1), T stage (T1 + T2 and T3 + T4), grade (grade 1–2 and grade 3–4), tumor stage (stage I-II and stage III-IV), history of radiation therapy (yes and no). Each subgroup was further divided into high-risk and low-risk groups based on the median value of risk score. Except for patients with M1 and patients with history of chronic pancreatitis, high-risk group was associated with inferior prognosis compared to low-risk group in every subgroup (Fig. 6). The possible reason for this result was the small number of patients in the subgroup of patients with M1 (n = 4) and patients with history of chronic pancreatitis (n = 13). Besides, all the clinicopathological characteristics were enrolled in the univariate analysis, and the result showed that the risk score was also the poor prognostic factor in the TCGA-PAAD cohort (Fig. 7A, p < 0.0001) besides T stage (Fig. 7A, p = 0.028) and N stage (Fig. 7A, p = 0.005). And then, we enrolled these three prognostic factors in the multivariate analysis. Interestingly, the risk score was also the only independent factor on the prognosis in the TCGA-PAAD cohort (Fig. 7A, p < 0.0001).
Models comparison
Firstly, we compared the value of AUC between our model and some clinicopathological features including age, gender, tumor stage, grade, TMN stage and history of radiation therapy. Unsurprisingly, the AUC of our model was always the largest comparing with other features (Fig. 7B). And then, comparison of our model and three recently published models was performed, including the m6A-related lncRNA prognostic model reported by Yuan et al. [36], the TMB-related genes prognostic model reported by Tang et al. [16], and the autophagy-related lncRNA prognostic model reported by Deng et al. [37]. As shown in Fig. 7C, our 14-lncRNA prognostic model had the best predictive efficiency of 3-year OS in the TCGA-PAAD cohort compared with other three published models. Besides, our model had the biggest value of C-index compared to other three published models (Fig. 7D), which further indicated that our model had excellent performance in the prediction of prognosis in PAAD.
Exploration of potential mechanism of the 14-lncRNA prognostic model in PAAD.
To further explore the potential mechanism of our model, differentially expressed genes (DEGs) analysis was performed between high-risk and low-risk groups in the TCGA-PAAD cohort. 1586 down-regulated genes and 779 up-regulated genes were selected as DEGs (Fig. 8A, Supplementary Table 5). And then, functional enrichment analysis was performed based on the DEGs. KEGG analysis showed that the DEGs were majorly enriched in pancreatic secretion pathway and many immune-related signaling pathways including cytokine-cytokine receptor interaction, cell adhesion molecules, primary immunodeficiency, chemokine signaling pathway, and intestinal immune network for IgA production, T cell receptor signaling pathway, and Th1 and Th2 cell differentiation (Fig. 8B), which suggested that our model might be associated with the immunoregulation of PAAD. Therefore, we compared the immune score and immune infiltration of 28 immune cells between high-risk and low-risk groups in the TCGA-PAAD cohort. Notably, patients with high risk score had significantly lower immune score (Fig. 8C, p < 0.001). Besides, the infiltrations of 19 types of immune cells were significantly higher in the low-risk groups compared to the high-risk group, including some immune cells that played crucial role in tumor immunity such as CD8 T cells, CD4 T cells, natural killer cells and B cells (Fig. 8D).