Skip to main content

A novel ceRNA-immunoregulatory axis based on immune cell infiltration in ulcerative colitis-associated colorectal carcinoma by integrated weighted gene co-expression network analysis



Patients with ulcerative colitis are at an increased risk of developing colorectal cancer with a prolonged disease course. Many studies have shown that alterations in the immune microenvironment play a key role in ulcerative colitis-associated colorectal cancer. Additionally, competing endogenous RNAs have important functions in immunoregulation, affecting inflammation and tumorigenesis. However, the complexity and behavioral characteristics of the competing endogenous RNA immunoregulatory network in ulcerative colitis-associated colorectal cancer remain unclear. We constructed a competing endogenous RNA immunoregulatory network to discover and validate a novel competing endogenous RNA immunoregulatory axis to provide insight into ulcerative colitis-associated colorectal cancer progression.


The competing endogenous RNA immunoregulatory network was constructed using differential expression analysis, weighted gene co-expression network analysis, and immune-related genes. Cmap was used to identify small-molecule drugs with therapeutic potential in ulcerative colitis-associated colorectal cancer. The ulcerative colitis-associated colorectal cancer-related pathways were identified by gene set variation and enrichment analysis. CIBERSORT, single-sample Gene Set Enrichment Analysis, and xCell were used to evaluate the infiltration of immune cells and screen hub immunocytes. The competing endogenous RNA immunoregulatory axis was identified by correlation analysis.


We identified 130 hub immune genes and constructed a competing endogenous RNA immunoregulatory network consisting of 56 long non-coding RNAs, four microRNAs, and six targeted hub immune genes. Four small-molecule drugs exerted potential therapeutic effects by reversing the expression of hub immune genes. Pathway analysis showed that the NF-κB pathway was significantly enriched. Neutrophils were identified as hub immunocytes, and IL6ST was significantly positively correlated with the neutrophil count. In addition, NEAT1 may serve as a competing endogenous RNA to sponge miR-1-3p and promote IL6ST expression.


The competing endogenous RNA immunoregulatory axis may regulate neutrophil infiltration, affecting the occurrence of ulcerative colitis-associated colorectal cancer.

Peer Review reports


Ulcerative colitis (UC) is a chronic, recurrent, and intestinal inflammatory disease; however, its pathogenesis remains unclear. It is generally considered that the intestinal mucosal immune system of patients with UC produces abnormal amplification of the immune response to intestinal microbial antigens in a specific environment, which can cause intestinal mucosal inflammatory damage [1]. The imbalance between inflammation and mucosal immunity is an important feature of colorectal carcinogenesis [2]. Therefore, patients with UC have an increased risk of UC-associated colorectal cancer (CAC).

Although an increasing number of studies have investigated coding gene-related biomarkers in CAC, protein-coding genes only account for 2% of the human genome. The competitive regulatory crosstalk of different molecular species, especially the competition between protein-coding and non-coding RNA transcripts, is a key link in the occurrence of disease [3, 4]. MicroRNAs (miRNAs), as an abundant class of small, non-coding, single-stranded oligoribonucleotides, act as regulators in various cellular processes and function as sequence-specific silencers of target gene transcription after binding, thereby affecting the expression levels of more than half of all protein-coding genes [5, 6]. Therefore, Salmena et al. [7] proposed the competing endogenous RNA (ceRNA) hypothesis, which posits that most RNA molecules act in a “many-to-many” manner. As each miRNA molecule may potentially target miRNA response elements on multiple messenger RNA (mRNA) molecules, each mRNA molecule can be targeted by multiple miRNA molecules [8]. This suggests that miRNA molecules have a pivotal role in competitively regulating crosstalk and lead to gene silencing by binding to mRNAs, and that ceRNA regulates gene expression by competitively binding to miRNAs.

Numerous ceRNA studies in different diseases (ie tumors, inflammatory bowel diseases, liver fibrosis, and cardiovascular diseases) have suggested that ceRNA can influence dysregulation of the immune microenvironment by regulating the interaction of different types of RNAs, thereby promoting the occurrence and development of diseases [9,10,11,12,13,14,15]. However, the specific immunoregulatory mechanisms in the CAC process remain unclear. We hypothesized that the ceRNA regulatory axis can alter the immune microenvironment during the development of CAC.

To verify this hypothesis, we integrated immune-related genes (IRGs), weighted gene co-expression network analysis (WGCNA), and differential expression analysis results, and then constructed a ceRNA immunoregulatory network based on the principle of reverse prediction. Subsequently, small-molecule medicines with possible applications in therapy for CAC were investigated. Pathway enrichment analysis and immune cell infiltration analysis were performed to identify the key pathways and immune cells. Correlation analysis was conducted to identify the key ceRNA regulatory axis from the network. This study provides a foundation for improving the understanding of the pathophysiological processes of CAC and insights into the variations in the immunological microenvironment of the disease.


Data acquisition and preprocessing

The research and design flowchart is shown in Fig. 1. Gene expression profiles of datasets GSE37283 [16] and GSE68306 [17] were downloaded from the National Center for Biotechnology Information (NCBI).

Fig. 1
figure 1

Research and design flow chart. UC ulcerative colitis, NC normal control, CAC ulcerative colitis-associated colorectal cancer, DEGs differentially expressed genes, IRGs immune-related genes, DEMs differentially expressed micro RNAs, HIGs hub immune genes

Gene Expression Omnibus (GEO) database ( The GSE37283 dataset contains mRNA microarray raw data from 20 colonic mucosal samples (five normal control [NC] samples, four UC samples, and 11 UC samples harboring remote neoplasia). The GSE68306 dataset contains miRNA normalized expression matrix data of 11 UC-associated neoplastic samples and 16 NC samples. Raw CEL data (data storage format) were parsed using the ‘affy’ package in R [18]. The robust multiarray average (RMA) algorithm background correction and quantile normalization were performed on gene expression profiles with the ‘oligo’ package in R [19]. The annotation package of the microarray platform GPL13158 was used to convert the probe IDs into gene symbols. Chip quality was assessed using the ‘arrayQualityMetrics’ package in R [20]. Moreover, 1793 IRGs were downloaded from the Immunology Database and Analysis Portal database (, which covers 17 immune categories [21].

Construction of weighted gene co-expression gene network

The ‘WGCNA’ package in R was used to analyze the co-expression network of the top 25% gene variants in the GSE37283 dataset [22]. The adjacency matrix was constructed by calculating Pearson’s correlation coefficient. Subsequently, a scale-free co-expression network based on the soft threshold parameter β was established. The adjacency matrix was converted to a topological overlap matrix (TOM) by comparing the weighted correlation of the two nodes with the others. Hierarchical clustering of dissimilarity TOM (dissTOM = 1 − TOM) rendered similar gene expression levels in the same gene module. Next, the dynamic cut tree algorithm was used to further partition the modules. Additionally, module eigengenes (MEs) and gene significance (GS) were used to identify modules associated with CAC. MEs represent the principal component of all gene expression levels in individual modules. GS was considered as the mediated p-value for each gene, representing the correlation between gene expression and CAC.

Differential expression analysis and identification of hub immune genes

We analyzed the differentially expressed mRNAs and miRNAs between CAC and NC samples using the classical Bayesian methodology with the ‘limma’ package in R [23]. In GSE37283, the cut-off criteria were set at p < 0.05 and |log2fold-change (FC)|> 1. In GSE68306, we adjusted the threshold of differential expression to p < 0.05 and |log2FC|> 0.5, allowing the results to be optimized. The differentially expressed mRNAs, IRGs, and CAC-related module genes in WGCNA were intersected to obtain the differentially expressed hub immune genes (HIGs), which were visualized in a Venn diagram [24].

Small-molecule drug prediction

The Cmap database ( includes and collates 6100 gene expression profiles from 7056 microarray datasets, covering 1309 Food and Drug Administration-approved small-molecule drugs [25]. The database is commonly used to predict potential drugs for treating diseases and repurposing existing drugs. Uploading the HIGs into the Cmap database resulted in connectivity scores for the corresponding small-molecule drugs with values between + 1 and − 1. A positive connectivity score indicated that the small-molecule drug induced gene expression, whereas a negative connectivity score suggested that specific drugs could reverse gene expression patterns. Therefore, the screening criteria for potential therapeutic drugs were enrichment < −0.7, and p < 0.01. The molecular structures of potential therapeutic drugs were displayed using the PubChem database ( [26].

Gene set enrichment analysis

Gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA) are two types of gene enrichment methods [27, 28], the former estimates the enrichment of pathways in phenotypic genes according to the background gene set, whereas the latter transforms the gene expression matrix into a gene set enrichment matrix using an unsupervised method. We downloaded the hallmark gene set from the Molecular Signatures Database (MSigDB; for use as reference [29]. The ‘clusterpProfiler’ package in R was used for GSEA and visualization [30]. The p and q values were obtained by simulating permutations of 1000 random genomes, and p < 0.05 and q < 0.05 were used as screening criteria. For GSVA, we used the ‘gsva’ and ‘limma’ packages in R to screen pathways with significant differences (p < 0.05 and |log2FC|> 0.5), and the results were displayed with the ‘pheatmap’ package [23, 28].

Immune cell infiltration analysis

Many algorithms have been developed to evaluate the infiltration level of immune cell populations, among which CIBERSORT, single-sample GSEA (ssGSEA), and xCell are the most widely used research methods [31, 32]. For CIBERSORT, we estimated the proportion of 22 immune cell populations in the samples using the support vector regression machine learning method with the ‘cibersort’ package in R [33]. For ssGSEA, we compared 28 types of immune cell characteristic gene sets and converted the gene expression values of samples into the enrichment fraction to obtain the relative abundance of immune cells in samples using the ‘gsva’ package in R [34]. For xCell, we performed a cell type enrichment analysis from the gene expression data of 64 immune and stromal cell types using the ‘xcell’ package in R. We subsequently identified immune cell types that were significantly different in CAC and normal tissue using the Wilcoxon test.

ceRNA network construction

ENCORI ( is a comprehensive database that provides experimentally supported miRNA-mRNA and miRNA-long non-coding RNA (lncRNA) interaction networks [35]. In addition, it integrates information from multiple miRNA-mRNA prediction databases, including PITA, miRmap, miRanda, and TargetScan. To ensure the accuracy of predicting target genes, only those that existed in the four above-mentioned databases were included in the regulatory network. Subsequently, differentially expressed mRNAs (DEmRNAs) were intersected with predicted mRNAs to explore differentially expressed miRNA-mRNA regulatory relationship pairs in CAC. Next, the roles of differentially expressed miRNAs (DEmiRNAs) and lncRNAs were predicted using the ENCORI and LncBase databases [36]. The network was constructed by screening out the lncRNA-miRNA-mRNA regulatory axis that was regulated by the same DEmiRNAs and visualized in Cytoscape 3.8.2 [37]. To further identify the key targets in the process of CAC, the Cytoscape plugin ‘Cytohubba’ was used to perform non-weighted parameter analysis of node connectivity in the network [38]. Finally, the subnetwork was screened according to the connectivity score. Based on the ceRNA hypothesis, lncRNA plays a sponge adsorption role in the cytoplasm and is positively correlated with mRNA expression levels. Therefore, we used LNCipedia ( to obtain the lncRNA sequence and determine its cellular location using the lncLocator database ( [39, 40]. Co-expression analysis of lncRNA and mRNA was performed based on The Cancer Genome Atlas colorectal cancer data compiled by the ENCORI database. Additionally, CentroidFold ( was used to predict the secondary structure of the miRNA precursor stem-loop [41]. Statistical significance was set at p < 0.05.

Statistical analysis

Data analysis was performed with R 4.0.3. The Wilcoxon test was used to compare the types of immune cells in the CAC and NC groups. Spearman’s correlation analysis was used to assess the correlation between the two groups. The chi-square test was used for categorical variables. Statistical significance was set at p < 0.05.


Chip quality analysis

To evaluate the quality of the chip and increase the reliability of the results, we compared the differences in raw data obtained from the GSE37283 dataset of the GEO database before and after correction using boxplots, smoothed histograms, principal component analysis (PCA), and array clustering heatmaps. Boxplots detected outliers by calculating the distribution of each array using pooled data, whereas the smoothed histograms estimated the density of data to indicate the chip quality-related phenomenon. We found many raw data outliers and large differences in density curves among samples that required normalization to eliminate the effect of systematic errors (Fig. 2A–D). Therefore, PCA and array clustering heatmaps were used to describe batch effects among samples (Fig. 2E–G). Although the influence of the batch effect was markedly reduced, restoring the actual biological characteristics of the chip, there were still two abnormal samples (GSM915458 and GSM915469) in the corrected data (Table 1). Overall, these results showed that two outliers must be removed to ensure the reliability of subsequent analysis based on the chip quality control method.

Fig. 2
figure 2

Chip data correction and evaluation. Boxplots show gene expression levels before (A) and after (B) correction. Density plots show estimates of data density before (C) and after (D) normalization. PCA is a dimension reduction and visualization technique used to project the multivariate data vector of each array into a two-dimensional plot, and thus, the spatial arrangement at the midpoint of the plot reflects similarity between the overall data. PCA before processing (E) and shows after processing (F). Heatmaps before correction (G) and after correction (H) show the array aggregation caused by expected biological or unexpected experimental factors (batch effects). Outliers are denoted by *. PCA principal component analysis

Table 1 Overview of chip quality control

Weighted co-expression network construction and identification of key modules

After the analysis of variance (ANOVA), the top 25% of genes (4980 genes) from the 18 samples screened for sequencing were subjected to WGCNA to identify genes highly associated with CAC. To define the adjacency matrix, we introduced weight parameters when soft-thresholded β = 16 (scale-free R2 > 0.8) and scale-free networks were constructed (Fig. 3A). The dynamic tree-cutting algorithm classified the 4980 genes into 10 co-expression modules (Fig. 3B; Additional file 1: Table S1). By calculating the correlation of each module with the corresponding clinical features, we found that genes in the MEturquoise module were most strongly associated with CAC (r = 0.84, p < 0.01) and were identified as key modules in the CAC process (Fig. 3C, D). The MEturquoise module contained 2264 genes. Further statistical tests on the gene significance (GS) of MEturquoise module membership (MM) and CAC showed that 2264 genes contributed significantly to the membership of the MEturquoise module and CAC (r = 0.87, p < 0.01) (Fig. 3E). Moreover, hierarchical clustering analysis of gene expression levels in the MEturquoise module revealed that gene expression in the CAC group differed from that in the UC and NC groups (Fig. 3F). Therefore, these genes co-expressed with CAC require further study.

Fig. 3
figure 3

Construction of weighted co-expression network and module analysis. A Scale-free index under various soft threshold power (β) and average connectivity analysis. B Cluster dendrogram of co-expression network module. Each color represents one specific co-expression module. C Heatmap of the correlation between the module eigengenes and clinical traits. Each row represents a color module and every column a clinical trait (UC, NC, and CAC). Each cell contains the corresponding correlation and p-value. Statistical significance was set at p < 0.05. D Dendrogram (top) and heatmap (bottom) display the strength of correlations between CAC and other modules. Red represents a higher positive adjacency and blue a lower adjacency. E Correlation between module membership of turquoise module and gene significance with CAC. F Heatmap with clusters of DEGs in turquoise module among the three different groups. UC ulcerative colitis, NC normal control, CAC ulcerative colitis-associated colorectal cancer, DEGs differentially expressed genes

GSVA and GSEA reveal CAC-related pathways

Using p < 0.05, and |log2FC|> 0.5, 12 pathways with significant changes were screened out (Fig. 4A; Table 2). According to the threshold value (p adjust < 0.05), we obtained the top 10 pathways with normalized enrichment scores from the results of GSEA. By combining the results of GSEA and GSVA, we found that TNFα signaling via the NF-κB pathway showed significant changes in CAC and had the highest degree of enrichment (normalize enrichment score: 2.285) (Fig. 4B).

Fig. 4
figure 4

CAC-related pathways analysis. A Heatmap showing significantly different pathways in GSVA; p < 0.05 and |log2FC|> 0.5. B Ridge plots with normalized enrichment score displaying the 10 most enriched pathways of CAC in GSEA. CAC ulcerative colitis-associated colorectal cancer, GSVA gene set variation analysis, NES normalized enrichment scores, GSEA gene set enrichment analysis

Table 2 Differentially expressed pathways in gene set variation analysis

Identification of CAC-related hub immune genes and small molecule therapeutic drug prediction

We performed differential expression analysis of the GSE37283 microarray expression matrix and filtered out 1271 DEGs, which met the criteria (p < 0.05, |log2FC|> 1) between the CAC and NC groups; of these genes, 1012 were upregulated and 258 were downregulated (Fig. 5A). Subsequent hierarchical clustering analysis of the 1270 DEGs revealed significantly different expression between the two groups (Fig. 5B). We then intersected 1793 immune genes downloaded from the Immport database, 2252 genes obtained from CAC modules in WGCNA, and DEGs to identify 130 HIGs (Fig. 5C; Additional file 1: Table S2). Based on the importance of HIGs in CAC, we explored agents that could reverse gene expression and showed therapeutic potential. Four drugs from the Cmap database met the criteria (enrichment < −0.7 and p < 0.01), and their three-dimensional molecular structures were displayed in the PubChem database (Fig. 5D–G; Table 3).

Fig. 5
figure 5

Identification of HIGs and small-molecule therapeutic drugs. Volcano plot (A) and heatmap (B) of DEGs between CAC and NC, p < 0.05, |log2FC|> 1. C Venn diagram displaying genes overlap in IRGs, GSEA and WGCNA results. DG Prediction results of potential small-molecule drugs for treating CAC based on HIGs. HIGs hub immune genes, DEGs differentially expressed genes, CAC ulcerative colitis-associated colorectal cancer, NC normal control, IRGs immune-related genes, GSEA gene set enrichment analysis, WGCNA weighted gene co-expression network analysis

Table 3 Potential drugs with therapeutic potential for ulcerative colitis-associated colorectal cancer

Identification of differentially expressed miRNA and ceRNA network construction

We obtained 23 DEmiRNAs (17 upregulated and six downregulated) from the GSE68306 dataset using the same differential analysis method (Fig. 6A, B). Guided by the ceRNA hypothesis, we reversely predicted the mRNAs and lncRNAs downstream and upstream of the 23 miRNAs, respectively. Based on the ENCORI database, DEmiRNAs were matched with the potential mRNAs. Next, we matched the predicted target genes with the HIGs and identified eight pairs of miRNA-mRNAs, which contained four DEmiRNAs and six mRNAs. Next, based on the predicted lncRNAs upstream of the four DEmiRNAs, we used the intersection of the ENCORI and LncBase databases to identify 63 pairs of lncRNAs-miRNAs, including 56 lncRNAs. Finally, we integrated 56 lncRNAs, four miRNAs, and six targeted HIGs into a lncRNA-miRNA-mRNA network (Fig. 6C). Each node of the network was subjected to connectivity analysis using cytohubba, resulting in a subnetwork containing 10 nodes (Fig. 6D; Table 4). Overall, the ceRNA regulatory axis in the subnetwork may be closely related to the occurrence and development of CAC (Fig. 6E).

Fig. 6
figure 6

Construction of ceRNA immunoregulation network. Volcano plot (A) and heatmap (B) of DEMs between CAC and NC, p < 0.05, |log2FC|> 0.5. C Potential lncRNA-miRNA-mRNA immunoregulation network in CAC. The diamond represents the lncRNAs; triangle represents the miRNAs; and circle represents the mRNAs. Use Cytoscape's plugin Cytohubba to filter sub-networks (D), and visualize the regulatory relationships of sub-networks (E). ceRNA competing endogenous RNA, DEMs differentially expressed miRNAs, CAC ulcerative colitis-associated colorectal cancer, NC normal control, miRNA micro RNA, mRNA messenger RNA

Table 4 Competing endogenous RNA immune regulatory subnetwork

Immune landscape of CAC

Previous studies demonstrated that inflammation or tumor initiation and progression are associated with the immune microenvironment and that immunocytes are key components in the microenvironment [42]. Therefore, exploration of the immune landscape in CAC is highly warranted. We first performed ssGSEA to quantify the relative abundance of immune cell infiltration in the CAC group compared to in the NC group (Fig. 7A). Wilcoxon analysis showed that 22 types of immune cells were significantly different between the two groups (p < 0.05) (Fig. 7B). Subsequently, CIBERSORT and xCell were used to evaluate immune cell infiltration and reduce bias before visualization in heatmaps (Fig. 7C, D). The most important observation from data comparison was that the neutrophils were the central immune cells, as they were the only cells showing the same trend in differential expression across the three algorithms.

Fig. 7
figure 7

Immune cell infiltration analysis. A Percentage stacked bar chart shows the distribution of 28 immune cells in the CAC and NC group samples from the GSE37283 data set. B Results of ssGSEA showed differences in the abundance of immune cells between NC and CAC groups. Blue represents the NC group and red the CAC group. The heatmap demonstrates the statistics of immune cell infiltration results by the xCell (C) and CIBERSORT (D) algorithms. * indicates statistical significance at p < 0.05. CAC ulcerative colitis-associated colorectal cancer, NC normal control, ssGSEA single-sample gene set enrichment analysis

Identification and verification of a ceRNA-immunoregulation axis

Overall, the significant difference between the NF-κB pathway and neutrophils in CAC indirectly demonstrated that both factors play critical roles in CAC formation. Consequently, we carried out correlation analysis, which showed that the NF-κB pathway was significantly positively correlated with neutrophils (r = 0.79, p < 0.01) (Fig. 8A). We also intersected the core genes enriched in the NF-κB pathway with HIGs to obtain two key immune genes, IL6ST and TNFAIP3. Of these, IL6ST was significantly positively correlated with neutrophils (r = 0.75, p < 0.01) (Fig. 8B). To further analyze the regulatory mechanism of IL6ST at the transcriptional level in the ceRNA immunoregulatory network, we identified two upstream lncRNAs, NEAT1 and KCNQ1OT1 that regulate IL6ST. The mechanism of lncRNA is related to its localization in cells [43]. We predicted the subcellular localization of lncRNAs using the lnclocator algorithm. NEAT1 was mainly distributed in the cytoplasm, whereas KCNQ1OT1 in the nucleus (Fig. 8C). Using the ENCORI database, we verified that NEAT1 promotes the expression of the target mRNA IL6ST (Fig. 8D). We then predicted the precursor stem loop structures for the two target miRNAs of NEAT1 (Additional file 2: Fig. S1). Compared with miR-133a-3p, miR-1-3p had the highest connectivity score in the ceRNA network and a higher predictive score for interacting with NEAT1. These results indicate that NEAT1 absorbs miR-1-3p through sponge action to increase the expression of IL6ST (Fig. 8E).

Fig. 8
figure 8

Identification of ceRNA-immunoregulatory axis. A Correlation analysis of the NF-κB pathway and neutrophils. B Correlation analysis of IL6ST and neutrophils. C Bioinformatics predict cellular localization of the lncRNAs NEAT1 and KCNQ1OT1. D Correlation plot of NEAT1 and IL6ST expression using the ENCORI database. E Schematic model of ceRNA; blue represents downregulation and red upregulation. Base pairing between miR-1-3p and the target sites of NEAT1 and IL6ST 3′ untranslated region as predicted using the database


UC is a chronic inflammatory disease of unknown etiology, and its cumulative incidence of progressing to CAC has markedly increased over the years [44]. Because of chronic inflammation caused by UC, various layers of the intestinal wall are infiltrated by immune cells, forming an immune microenvironment and participating in the induction of CAC through the production of cytokines and chemokines [45]. However, the underlying mechanisms remain unclear. Here, we explored the role of the ceRNA-immunoregulatory axis in the immune regulation of CAC at the transcriptional level.

We integrated three bioinformatics methods, including immune gene list, WGCNA, and differential gene expression analysis, to identify 130 HIGs. We found that metronidazole, 3-acetamidocoumarin, heptaminol, and isometheteptene reversed the expression levels of HIGs and play a potential role in the treatment of CAC. Compared with the other three drugs not reported in CAC, metronidazole has been shown to inhibit the occurrence of tumors and reduce the degree of inflammation in CAC animal models [46]. Based on the ceRNA hypothesis, we predicted miRNAs and lncRNAs using HIGs and constructed a ceRNA-immunoregulatory network to better understand the CAC-related molecular mechanisms and biological phenomena at the transcriptional level.

To determine the underlying mechanisms affecting the occurrence of CAC, we analyzed the differences in pathways between the CAC and NC groups using two major pathway enrichment methods. The results showed that the largest number of genes (93 genes) was enriched in the NF-κB pathway, revealing that this pathway is closely related to CAC. The NF-κB pathway is involved in the immune response in vivo through classical and non-classical pathways, allowing the massive release of pro-inflammatory cytokines that cause tissue damage and participate in tumor invasion and metastasis by regulating the expression of angiogenesis-related genes [47,48,49,50]. A previous study showed that the NF-κB pathway exerts a tumor-promoting effect in a CAC mouse model [51].

We further evaluated alterations in the immune microenvironment of CAC and found that only neutrophils had the same expression trend (high in the CAC group and low in the NC group) across the three methods used to calculate the degree of immune cell infiltration. These results were consistent with those reported in previous studies showing that neutrophil infiltration is significantly higher in the colonic mucosal layer of a CAC mouse model, promoting CAC occurrence by secreting chemokines and, consequently, recruiting chemotactic receptors [52, 53]. Correlation analysis showed that the NF-κB pathway was positively correlated with neutrophils. Further analysis of 93 core genes enriched in the NF-κB pathway revealed that IL6ST was positively associated with neutrophils. We subsequently screened the miRNAs regulating IL6ST (miR-1-3p) and its upstream lncRNA from the ceRNA network (NEAT1). Previous studies suggested that NEAT1 promotes tumor development by downregulating target miRNAs. Zhou et al. [54] reported that NEAT1 targets and downregulates miR-500a-3p, promoting gastric cancer cell proliferation and invasion; Huang et al. [55] suggested that NEAT1 promotes pancreatic cancer progression by negatively regulating miR-506-3p; whereas Zhang et al. [56] showed that upregulation of NEAT1 is involved in the proliferation of glioma cells by negatively regulating miR-324-5p. However, the role of NEAT1 in CAC has rarely been reported. MiR-1-3p, a downstream target of NEAT1, has been shown to play a facilitative role in tumors. For instance, Peng et al. [57] reported that miR-1-3p affects gastric cancer cell proliferation by promoting the oxygen glycolytic pathway, and Liu et al. [58] indicated that downregulation of miR-1-3p expression is involved in the growth and motility of lung cancer cells. Our results suggest that IL6ST is a target gene of miR-1-3p in CAC. IL6ST is a subunit of the IL-6 receptor, and the IL-6/IL-6 receptor complex only functions by binding to IL6ST [59]. Previous studies showed that IL-6 promotes the occurrence of CAC by interacting with IL6ST [60]. Our data further supported that NEAT1 can competitively bind to miR-1-3p and upregulate IL6ST at the transcriptional level, affecting the NF-κB pathway and neutrophil infiltration as well as promoting the occurrence and development of CAC.

We constructed a ceRNA-immunoregulatory network by integrating multiple bioinformatics tools and deeply analyzed the alterations of the immune microenvironment and pathways in the process of CAC. Ultimately, we identified a ceRNA immunoregulatory axis closely related to CAC. However, our study had some limitations; (1) we only used public datasets for the analysis, which may be biased because of the limited sample size. (2) Although we conducted a bioinformatic analysis and database prediction, the direct relationship between the ceRNA immunoregulatory axes requires experimental verification. (3) In vivo and in vitro functional experiments are needed to determine the biological role of the ceRNA regulatory axis in CAC. (4) Currently, bioinformatic analysis for the prediction of drug utility is limited; however, we believe that this study may provide valuable insights in designing drugs to treat CAC.


In conclusion, we identified a ceRNA immunoregulatory network of CAC and suggested that the NEAT1/miR-1-3p/IL6ST regulatory axis participates in the CAC process by altering neutrophil infiltration in the immune microenvironment. These findings provide a new perspective and direction for further exploration of the immunoregulatory mechanisms underlying CAC.

Availability of data and materials

Publicly available datasets were analyzed in this study. These data can be found at



Ulcerative colitis-associated colorectal cancer


Competing endogenous RNA


Hub immune genes


Ulcerative colitis


Messenger RNA


Micro RNAs


Immune-related genes


Weighted gene co-expression network analysis


National Center for Biotechnology Information


Gene Expression Omnibus


Normal control


Robust multiarray average


Topological overlap matrix


Module eigengenes


Gene significance


Gene set enrichment analysis


Gene set variation analysis


Molecular Signatures Database


Long non-coding RNA


Differentially expressed mRNAs


Differentially expressed miRNAs


Principal component analysis


After the analysis of variance


Module membership


  1. Neurath MF. Targeting immune cell circuits and trafficking in inflammatory bowel disease. Nat Immunol. 2019;20:970–9.

    CAS  Article  PubMed  Google Scholar 

  2. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–74.

    CAS  Article  PubMed  Google Scholar 

  3. Ma L, Weinberg RA. MicroRNAs in malignant progression. Cell Cycle. 2008;7:570–2.

    CAS  Article  PubMed  Google Scholar 

  4. Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of ceRNA crosstalk and competition. Nature. 2014;505:344–52.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  5. Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004;116:281–97.

    CAS  Article  PubMed  Google Scholar 

  6. Friedman RC, Farh KK-H, Burge CB, Bartel DP. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell. 2011;146:353–8.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  8. Chan JJ, Tay Y. Noncoding RNA:RNA regulatory networks in cancer. Int J Mol Sci. 2018.

    Article  PubMed  PubMed Central  Google Scholar 

  9. Zhang K, Zhang L, Mi Y, Tang Y, Ren F, Liu B, Zhang Y, Zheng P. A ceRNA network and a potential regulatory axis in gastric cancer with different degrees of immune cell infiltration. Cancer Sci. 2020;111:4041–50.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  10. Zhang L, Zhang K, Liu S, Zhang R, Yang Y, Wang Q, Zhao S, Yang L, Zhang Y, Wang J. Identification of a ceRNA network in lung adenocarcinoma based on integration analysis of tumor-associated macrophage signature genes. Front Cell Dev Biol. 2021;9: 629941.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Luo X, Peng S, Ding S, Zeng Q, Wang R, Ma Y, Chen S, Wang Y, Wang W. Prognostic values, ceRNA network, and immune regulation function of SDPR in KRAS-mutant lung cancer. Cancer Cell Int. 2021;21:49.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  12. Qi X, Zhang D-H, Wu N, Xiao J-H, Wang X, Ma W. ceRNA in cancer: possible functions and clinical implications. J Med Genet. 2015;52:710–8.

    CAS  Article  PubMed  Google Scholar 

  13. Sun F, Liang W, Tang K, Hong M, Qian J. Profiling the lncRNA-miRNA-mRNA ceRNA network to reveal potential crosstalk between inflammatory bowel disease and colorectal cancer. PeerJ. 2019;7: e7451.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Jiang H, Wu F, Jiang N, Gao J, Zhang J. Reconstruction and analysis of competitive endogenous RNA network reveals regulatory role of long non-coding RNAs in hepatic fibrosis. Mol Med Rep. 2019;20:4091–100.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  15. He L, Chen Y, Hao S, Qian J. Uncovering novel landscape of cardiovascular diseases and therapeutic targets for cardioprotection via long noncoding RNA-miRNA-mRNA axes. Epigenomics. 2018;10:661–71.

    CAS  Article  PubMed  Google Scholar 

  16. Pekow J, Dougherty U, Huang Y, Gometz E, Nathanson J, Cohen G, Levy S, Kocherginsky M, Venu N, Westerhoff M, et al. Gene signature distinguishes patients with chronic ulcerative colitis harboring remote neoplastic lesions. Inflamm Bowel Dis. 2013;19:461–70.

    Article  PubMed  Google Scholar 

  17. Pekow J, Meckel K, Dougherty U, Huang Y, Chen X, Almoghrabi A, Mustafi R, Ayaloglu-Butun F, Deng Z, Haider HI, et al. miR-193a-3p is a key tumor suppressor in ulcerative colitis-associated colon cancer and promotes carcinogenesis through upregulation of IL17RD. Clin Cancer Res. 2017;23:5281–91.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  18. Gautier L, Cope L, Bolstad BM, Irizarry RA. affy–analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–15.

    CAS  Article  PubMed  Google Scholar 

  19. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4:249–64.

    Article  PubMed  Google Scholar 

  20. Kauffmann A, Gentleman R, Huber W. arrayQualityMetrics—a bioconductor package for quality assessment of microarray data. Bioinformatics. 2009;25:415–6.

    CAS  Article  PubMed  Google Scholar 

  21. Bhattacharya S, Dunn P, Thomas CG, Smith B, Schaefer H, Chen J, Hu Z, Zalocusky KA, Shankar RD, Shen-Orr SS, et al. ImmPort, toward repurposing of open access immunological assay data for translational and clinical research. Sci Data. 2018;5: 180015.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  22. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559.

    CAS  Article  Google Scholar 

  23. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  24. Bardou P, Mariette J, Escudié F, Djemiel C, Klopp C. jvenn: an interactive Venn diagram viewer. BMC Bioinform. 2014;15:293.

    Article  Google Scholar 

  25. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet J-P, Subramanian A, Ross KN, et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006;313:1929–35.

    CAS  Article  PubMed  Google Scholar 

  26. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 2021;49:D1388–95.

    CAS  Article  PubMed  Google Scholar 

  27. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102:15545–50.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  28. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7.

    Article  Google Scholar 

  29. Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, Tamayo P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–25.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  30. Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–7.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  31. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12:453–7.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  32. Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18:220.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  33. Scholkopf A, Smola F, Williamson S, Bartlett S. New support vector algorithms. Neural Comput. 2000;12:1207–45.

    CAS  Article  PubMed  Google Scholar 

  34. Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, Angell H, Fredriksen T, Lafontaine L, Berger A, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39:782–95.

    CAS  Article  PubMed  Google Scholar 

  35. Li J-H, Liu S, Zhou H, Qu L-H, Yang J-H. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 2014;42:D92–7.

    CAS  Article  PubMed  Google Scholar 

  36. Paraskevopoulou MD, Vlachos IS, Karagkouni D, Georgakilas G, Kanellos I, Vergoulis T, Zagganas K, Tsanakas P, Floros E, Dalamagas T, et al. DIANA-LncBase v2: indexing microRNA targets on non-coding transcripts. Nucleic Acids Res. 2016;44:D231–8.

    CAS  Article  PubMed  Google Scholar 

  37. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  38. Chin C-H, Chen S-H, Wu H-H, Ho C-W, Ko M-T, Lin C-Y. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8(Suppl 4):S11.

    Article  PubMed  PubMed Central  Google Scholar 

  39. Volders P-J, Anckaert J, Verheggen K, Nuytens J, Martens L, Mestdagh P, Vandesompele J. LNCipedia 5: towards a reference set of human long non-coding RNAs. Nucleic Acids Res. 2019;47:D135–9.

    CAS  Article  PubMed  Google Scholar 

  40. Lin Y, Pan X, Shen H-B. lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning. Bioinformatics. 2021.

    Article  PubMed  PubMed Central  Google Scholar 

  41. Sato K, Hamada M, Asai K, Mituyama T. CENTROIDFOLD: a web server for RNA secondary structure prediction. Nucleic Acids Res. 2009;37:W277–80.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  42. Grivennikov SI, Greten FR, Karin M. Immunity, inflammation, and cancer. Cell. 2010;140:883–99.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  43. Geisler S, Coller J. RNA in unexpected places: long non-coding RNA functions in diverse cellular contexts. Nat Rev Mol Cell Biol. 2013;14:699–712.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  44. Choi C-HR, Rutter MD, Askari A, Lee GH, Warusavitarne J, Moorghen M, Thomas-Gibson S, Saunders BP, Graham TA, Hart AL. Forty-year analysis of colonoscopic surveillance program for neoplasia in ulcerative colitis: an updated overview. Am J Gastroenterol. 2015;110:1022–34.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Francescone R, Hou V, Grivennikov SI. Cytokines, IBD, and colitis-associated cancer. Inflamm Bowel Dis. 2015;21:409–18.

    Article  PubMed  Google Scholar 

  46. Lee JG, Lee Y-R, Lee AR, Park CH, Han DS, Eun CS. Role of the global gut microbial community in the development of colitis-associated cancer in a murine model. Biomed Pharmacother. 2021;135: 111206.

    CAS  Article  PubMed  Google Scholar 

  47. Atreya I, Atreya R, Neurath MF. NF-kappaB in inflammatory bowel disease. J Intern Med. 2008;263:591–6.

    CAS  Article  PubMed  Google Scholar 

  48. O’Connor PM, Lapointe TK, Beck PL, Buret AG. Mechanisms by which inflammation may increase intestinal cancer risk in inflammatory bowel disease. Inflamm Bowel Dis. 2010;16:1411–20.

    Article  PubMed  Google Scholar 

  49. Wang S, Liu Z, Wang L, Zhang X. NF-kappaB signaling pathway, inflammation and colorectal cancer. Cell Mol Immunol. 2009;6:327–34.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  50. Karin M, Greten FR. NF-kappaB: linking inflammation and immunity to cancer development and progression. Nat Rev Immunol. 2005;5:749–59.

    CAS  Article  PubMed  Google Scholar 

  51. Greten FR, Eckmann L, Greten TF, Park JM, Li Z-W, Egan LJ, Kagnoff MF, Karin M. IKKbeta links inflammation and tumorigenesis in a mouse model of colitis-associated cancer. Cell. 2004;118:285–96.

    CAS  Article  PubMed  Google Scholar 

  52. Lin Y, Cheng L, Liu Y, Wang Y, Wang Q, Wang HL, Shi G, Li JS, Wang QN, Yang QM, et al. Intestinal epithelium-derived BATF3 promotes colitis-associated colon cancer through facilitating CXCL5-mediated neutrophils recruitment. Mucosal Immunol. 2021;14:187–98.

    CAS  Article  PubMed  Google Scholar 

  53. Shang K, Bai Y-P, Wang C, Wang Z, Gu H-Y, Du X, Zhou X-Y, Zheng C-L, Chi Y-Y, Mukaida N, et al. Crucial involvement of tumor-associated neutrophils in the regulation of chronic colitis-associated carcinogenesis in mice. PLoS ONE. 2012;7: e51848.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  54. Zhou Y, Sha Z, Yang Y, Wu S, Chen H. lncRNA NEAT1 regulates gastric carcinoma cell proliferation, invasion and apoptosis via the miR-500a-3p/XBP-1 axis. Mol Med Rep. 2021.

    Article  PubMed  PubMed Central  Google Scholar 

  55. Huang B, Liu C, Wu Q, Zhang J, Min Q, Sheng T, Wang X, Zou Y. Long non-coding RNA NEAT1 facilitates pancreatic cancer progression through negative modulation of miR-506-3p. Biochem Biophys Res Commun. 2017;482:828–34.

    CAS  Article  PubMed  Google Scholar 

  56. Zhang J, Li Y, Liu Y, Xu G, Hei Y, Lu X, Liu W. Long non-coding RNA NEAT1 regulates glioma cell proliferation and apoptosis by competitively binding to microRNA-324–5p and upregulating KCTD20 expression. Oncol Rep. 2021.

    Article  PubMed  PubMed Central  Google Scholar 

  57. Deng P, Li K, Gu F, Zhang T, Zhao W, Sun M, Hou B. LINC00242/miR-1-3p/G6PD axis regulates Warburg effect and affects gastric cancer proliferation and apoptosis. Mol Med. 2021;27:9.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  58. Liu P-J, Chen Y-H, Tsai K-W, Yeah H-Y, Yeh C-Y, Tu Y-T, Yang C-Y. Involvement of MicroRNA-1-FAM83A axis dysfunction in the growth and motility of lung cancer cells. Int J Mol Sci. 2020.

    Article  PubMed  PubMed Central  Google Scholar 

  59. Kidd VJ, Nesbitt JE, Fuller GM. Chromosomal localization of the IL-6 receptor signal transducing subunit, gp130 (IL6ST). Somat Cell Mol Genet. 1992;18:477–83.

    CAS  Article  PubMed  Google Scholar 

  60. Hirano T, Hirayama D, Wagatsuma K, Yamakawa T, Yokoyama Y, Nakase H. Immunological mechanisms in inflammation-associated colon carcinogenesis. Int J Mol Sci. 2020.

    Article  PubMed  PubMed Central  Google Scholar 

Download references


The authors would like to thank all staff members in the Department of Colorectal Surgery at the Sixth Affiliated Hospital of Sun Yat-sen University.


The research design, data collection, and analysis of this study were supported by the Sixth Affiliated Hospital of Sun Yat-Sen University Clinical Research—‘1010’ Program (Grant Number 010CG (2020)-02), the National Natural Science Foundation of China (Grant Numbers 81770557, 82070684), the Bethune Aixikang Distinguished Surgical Fund project (Grant Number HZB-20190528–5), and the Guangdong Natural Science Fund for Outstanding Youth Scholars (Grant Number 2020B151502067).

Author information

Authors and Affiliations



(I) Conception and design: SY, XL, and ZX; (II) Administrative support: LL, MX and LJ; (III) Provision of study materials or patients: HC, CM, and FZ; (IV) Collection and assembly of data: SY, XL; (V) Data analysis and interpretation: ZX; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Corresponding author

Correspondence to Lei Lian.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the guidelines of the Declaration of Helsinki and approved by the Medical Ethics Committee of the Sixth Affiliated Hospital of Sun Yat-sen University, Guangzhou, China (no. 2021ZSLYEC-006).

Consent for publication

Not applicable.

Competing interests

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Table S1

: Number of genes in the 10 modules. Table S2: List of 130 hub immune genes.

Additional file 2. Fig. S1

: shows the stem-loop structure of miRNA precursors. (A) Secondary structures of human pre‐miR‐1. (B) Secondary structures of human pre‐miR‐133a.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Yin, S., Li, X., Xiong, Z. et al. A novel ceRNA-immunoregulatory axis based on immune cell infiltration in ulcerative colitis-associated colorectal carcinoma by integrated weighted gene co-expression network analysis. BMC Gastroenterol 22, 188 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • NEAT1
  • miR-1-3p
  • IL6ST
  • Immune microenvironment
  • Ulcerative colitis-associated colorectal carcinoma