The molecular characteristics of gastric cancer patients living in Qinghai-Tibetan Plateau
BMC Gastroenterology volume 22, Article number: 244 (2022)
Gastric cancer, or stomach cancer, originates in the inner lining of the stomach. In 2020 it was the fifth most common cancer and the fourth leading cause of cancer death globally, with over one million new cases and an estimated 769,000 deaths. The molecular characterization of gastric cancer is complicated by histological and intratumor heterogeneity, and its incidence shows wide geographical variation. As the largest and highest region in China, the Qinghai-Tibetan Plateau is one of the important global biodiversity hotspots. Here, we collected tumour and paired normal bio-samples from 31 primary gastric cancer patients at Qinghai Provincial People’s Hospital and discuss the molecular characteristics of gastric cancer patients living on the plateau. In these patients, single nucleotide polymorphisms (SNPs) are most frequent on chromosome 7, with C → T and G → A as the most common alteration types; few cancer driver genes are shared with western patients; and no significant differences are seen among Chinese ethnic groups. These characteristics offer an opportunity to further understand the divergent mechanisms of gastric cancer, to improve diagnosis and prognosis, and ultimately to guide optimal targeted therapeutics.
Gastric cancer is one of the most common gastrointestinal malignancies in the world. It is a malignant tumour originating from the mucosal epithelial cells of the stomach. It can occur in any part of the stomach, most commonly in the lesser curvature of the gastric antrum and the anterior area of the pylorus, followed by the cardia and fundus. Most gastric mucosal lesions develop gradually into gastric cancer through atrophy, intestinal metaplasia, low-grade intraepithelial neoplasia, and high-grade intraepithelial neoplasia. Early gastric cancer has a good prognosis, with a 5-year survival rate of more than 90%, while advanced gastric cancer has a poor prognosis, with a 5-year survival rate of less than 30% [2,3,4]. In recent years the overall incidence of gastric cancer has decreased, but it has been ranked as the fifth most common tumour and the second leading cause of cancer death in the world, with earlier estimates of about 950,000 new cases and about 700,000 deaths globally every year. In 2020, gastric cancer was responsible for over one million new cases and an estimated 769,000 deaths (equating to one in every 13 deaths globally), ranking fifth for incidence and fourth for mortality globally. Previous studies have shown that the incidence of gastric cancer varies by region, with more than 50% of new cases occurring in developing countries.
The incidence of gastric cancer in China ranks third in the world, and its mortality rate ranks first among malignant tumors. Morbidity and mortality also differ markedly between regions of China, with northwest and northeast China highly affected. Provinces in northwest China, such as Qinghai, Ningxia, and Gansu, have high mortality, and Qinghai has the highest. Qinghai Province is located on the Qinghai-Tibet Plateau (QTP). The QTP has a complex geological history, and it is commonly understood that the central plateau uplifted first and formed the 'proto-QTP' as early as 40 Mya, followed by outward extensions in the early Miocene [9,10,11,12]. The agricultural and pastoral areas of Qinghai are vast, with difficult natural conditions, poor nutrition, relatively weak sanitary conditions and health awareness, and a high Helicobacter pylori (HP) infection rate, all of which contribute to a high incidence of gastric cancer. The diagnosis rate of early gastric cancer in China is only 5%-20%. Qinghai has a low detection rate of early gastric cancer and a high mortality rate, owing to its low-oxygen environment and local dietary habits. We therefore analyzed the clinical and molecular characteristics of gastric cancer patients living in Qinghai Province, in order to better understand the pathogenesis of the disease and to inform individualized treatment.
Here, to better understand the molecular mechanism of gastric cancer in patients living on the QTP and to suggest targeted therapeutic strategies designed specifically for them, we collected tumour and paired normal bio-samples from 31 gastric cancer patients at Qinghai Provincial People’s Hospital, discussed their unique molecular characteristics, and predicted specific therapeutic drugs with an adapted kernel-based machine learning method.
Tumor specimens and their paired normal bio-samples
After informed consent was received, tumor and paired normal bio-samples were obtained from patients undergoing surgery at Qinghai Provincial People’s Hospital. This work was performed in compliance with all relevant ethical regulations for research using tumor and paired normal specimens. The fresh frozen tissues were delivered to a sequencing company, Frasergen (Additional file 1: Table S1), which captured the exonic DNA fragments and performed whole exome sequencing. All methods were performed in accordance with the relevant guidelines and regulations for research using human specimens.
The pharmacogenomics data used for training the drug response learning model
The pharmacogenomics data used to train the prediction model came from drug response data on cancer cell lines deposited in GDSC. The mutation profiles of genes across cancer cell lines and the chemical structures of anti-cancer drugs were used to represent the cancer cell lines and the drugs, respectively.
Whole exome sequencing and mutation calling
The Illumina HiSeq 2000 instrument was used for whole exome sequencing (WES), generating 2 × 150 base paired-end reads. FASTQ files were aligned to the human genome assembly (hg38) with the Burrows–Wheeler Aligner (BWA). Before further analysis, the initially aligned BAM files were pre-processed with SAMtools and Picard (https://broadinstitute.github.io/picard/): reads were sorted, duplicated reads were removed, reads were locally realigned around potential small indels, and base quality scores were recalibrated. Single nucleotide polymorphisms (SNPs) were detected with the Genome Analysis ToolKit (GATK) and annotated with ANNOVAR. Duplicated and low-quality SNPs were removed before annotation.
Genome coverage, somatic mutation, and gene fusion analysis
Per-base coverage was calculated from the preprocessed BAM files with the genomecov function in the bedtools package. Gene-level coverage was obtained from the human genome (hg38) annotation file. Somatic mutations were obtained by using the SNPs from the paired normal bio-samples as the reference. Gene fusion analysis was performed on the FASTQ files with FusionMap.
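The idea of using the paired normal sample as the reference for somatic calls can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` record, the `somatic_variants` helper, and the toy coordinates are hypothetical, and the sketch assumes that a tumor variant counts as somatic only if the identical variant (same chromosome, position, and alleles) is absent from the matched normal sample.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """A single-nucleotide variant keyed by position and alleles (hypothetical record)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_variants(tumor: Iterable[Variant], normal: Iterable[Variant]) -> List[Variant]:
    """Keep tumor variants that are absent from the paired normal sample."""
    normal_set = set(normal)  # set lookup keeps the filter linear in the number of calls
    return [v for v in tumor if v not in normal_set]


# Toy calls for one hypothetical patient (not data from the study).
tumor_calls = [
    Variant("chr7", 100, "T", "G"),
    Variant("chr17", 200, "C", "T"),
    Variant("chr1", 300, "G", "A"),
]
normal_calls = [Variant("chr1", 300, "G", "A")]

# The chr1 variant is also present in the normal sample, so it is treated as germline.
somatic = somatic_variants(tumor_calls, normal_calls)
```

In practice the comparison would be made on filtered variant calls per patient; exact matching is the simplest possible rule and ignores differences in coverage between the tumor and normal samples.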
The model for prediction of the effective clinical drugs
We applied our previously developed prediction model, an adapted kernel-based learning model, to predict the effective clinical drugs for cancer patients. Specifically, a bipartite graph framework was introduced to represent the relationship between cancer cells and anti-cancer drugs, under the assumption that drugs with similar chemical properties should have similar treatment outcomes. An adapted kernel method was proposed to construct similarity matrices from different types of features. That is, the cancer genomic data (such as mutation and expression data) and the chemical properties were used to construct kernel-based similarity matrices for cancer cells and anti-cancer drugs. Three classification models, random forest (RF), support vector machine (SVM), and deep learning network (DN), were then applied separately to these kernel-based similarity matrices to predict the effective clinical drugs for cancer patients. The RF, SVM, and DN models were implemented with the 'randomForest' R package with default parameters, LibSVM in the 'e1071' R package with an RBF kernel function, and the 'h2o' R package with default parameters, respectively. The penalty parameter and the RBF kernel parameter were optimized by grid search with fivefold cross-validation. The area under the ROC curve (AUC) was used as the evaluation criterion to assess the performance of the classification models.
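As a reminder of what the AUC criterion measures, the following Python sketch computes it from its rank interpretation: the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder, with ties counted as one half. The function name `auc` and the toy labels and scores are illustrative assumptions; the study itself used R packages, and this is not their code.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied, 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = sensitive, 0 = resistant (hypothetical labels and scores).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)  # 3 of the 4 pairs are ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a sort-based rank-sum computation gives the same value more efficiently on large data sets.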
Identification of cancer driver gene
MaxMIF, which was reported to outperform existing state-of-the-art methods (including MUFFINN, MutSig2, MutSigCV, etc.) on TCGA pan-cancer datasets, was used to distinguish cancer driver genes from passenger genes. MaxMIF integrates somatic mutation data and molecular interaction data through a maximal mutational impact function. The protein–protein interaction (PPI) network deposited in HumanNet v2 was used to represent the molecular interactions.
We obtained primary gastric tumor tissue (fresh frozen) from 31 patients who had not been treated with prior chemotherapy or radiotherapy. All tumor tissues were adenocarcinoma. The clinical information, including age at initial diagnosis, gender, location of the primary tumor, tumor TNM stage, and patient ethnicity, is shown in Fig. 1. Most patients were male (22/31). Age at initial diagnosis ranged from 40 to 70 years, and most patients were between 55 and 65 years old. Compared with western patients, the plateau patients were younger at initial diagnosis (western patients are around sixty). Furthermore, most patients (24/31) had stage III tumors. These differences in clinical properties point to the distinct characteristics of patients living on the QTP and suggest that they may need a specific treatment strategy.
As in our previous work, we first checked the sequencing depth at the gene level by computing the genome coverage along the gene loci. Most genes had a depth of around one hundred in both tumor and normal bio-samples (Additional file 1: Figure S1A, S1C), and genes in tumor bio-samples had lower coverage than those in normal bio-samples (Additional file 1: Figure S1B, S1D). We then examined the SNP distribution along the chromosomes, which showed that chromosome 7 carried the most variations (Fig. 2A). According to COSMIC, the world's largest and most comprehensive resource for exploring somatic mutations in human cancer, chromosome 7 contains many well-known cancer-related genes, including EGFR, BRAF, CDK6, MET, T1F1, and so on. Copy number variation in chromosome 7 has also been related to cancer. Together, these results indicate that chromosome 7 is an important region, and biomarkers for further diagnosis and treatment could focus on it. The distribution of SNP alteration types showed that C → T and G → A were the most common (Fig. 2B). Here, the somatic variations were obtained by using the paired normal bio-samples as the reference. Compared with the result obtained by using germline mutations as the reference, the mutation rate was lower (Additional file 1: Figure S1), which may indicate that paired tumor and normal bio-samples are the ideal choice for somatic mutation analysis.
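The two summaries discussed here, SNP counts per chromosome and the distribution of alteration types, amount to simple tallies over the variant calls. The Python sketch below shows one way to compute them; the tuple layout, function names, and toy SNPs are assumptions made for illustration and are not taken from the study's data or code.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical record layout: (chromosome, reference base, alternate base).
Snp = Tuple[str, str, str]


def snps_per_chromosome(snps: Iterable[Snp]) -> Counter:
    """Count how many SNPs fall on each chromosome."""
    return Counter(chrom for chrom, _, _ in snps)


def alteration_spectrum(snps: Iterable[Snp]) -> Counter:
    """Count single-base alteration types such as 'C>T' or 'G>A'."""
    return Counter(
        f"{ref}>{alt}"
        for _, ref, alt in snps
        if len(ref) == 1 and len(alt) == 1  # skip anything that is not a single-base change
    )


# Toy input only; these are not variants from the study.
toy_snps = [
    ("chr7", "C", "T"),
    ("chr7", "G", "A"),
    ("chr7", "C", "T"),
    ("chr1", "A", "G"),
    ("chr2", "C", "T"),
]
chrom_counts = snps_per_chromosome(toy_snps)
spectrum = alteration_spectrum(toy_snps)
```

Raw counts per chromosome depend on chromosome length and exome target size, so a comparison across chromosomes may be easier to interpret after normalizing by the number of captured bases.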
We then linked the molecular variants with clinical features, including patient ethnicity, tumor location, and tumor TNM stage. The Venn diagrams showed that the ethnic groups, tumor locations, and tumor stages each had unique molecular variants while also sharing some common variants. For instance, 12 mutant genes were shared by Han Chinese and the minority ethnic groups (Fig. 3A), and Han, Hui, and Zang had 8 (CYP4F2, DSPP, FOXD4, GOLGA6L6, GP6, OR9G1, PABPC3, TBC1D26), 10 (ARHGEF26, CNTNAP3B, COL4A2-AS2, ESRRA, LIMS1, OR11H12, PRAMEF22, TPTE, ZNF208, ZNF737), and 10 (FAM186A, FOXD4L1, HLA-DPA1, MADCAM1, NBPF11, PCDHA8, POTEG, POTEH, RPGR, ZNF717) unique mutant genes, respectively. Twelve mutant genes were shared by the different tumor locations (Fig. 3B), and antrum, body, and cardia had 10 (AGAP3, ARHGEF26, ARMCX4, CNTNAP3, ESRRA, FLG, GP6, MADCAM1, OTUD7A, PRAMEF22), 17 (FBRSL1, FOXD4, FRG2B, GGT2, KCNJ12, MUC17, MUC6, MYO15B, NBPF10, OR2T34, PLIN4, REG3A, RPL21, RPTN, SETD1B, TAS2R20, TPSAB1), and 9 (AHNAK2, FMN2, HLA-C, NBPF11, POTEG, RPGR, SLAIN1, VKORC1L1, ZNF208) unique mutant genes, respectively. Four mutant genes were shared by the different tumor stages (Fig. 3C), and stage I, stage II, and stage III had 2 (GOLGA6L6, ZNF717), 8 (FAM186A, FMN2, GP6, HLA-C, MAGEC1, PER3, RFPL4A, ZDHHC8), and 5 (ANKRD36, HRNR, KRT18, MUC20, MUC6) unique mutant genes, respectively.
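The shared and unique gene sets read off a Venn diagram correspond to plain set operations. The following Python sketch, written under the assumption that each group is represented by the set of genes mutated in it, computes the genes common to all groups and the genes found in exactly one group. The helper `shared_and_unique`, the group names, and the gene labels are placeholders, not the study's data.

```python
from typing import Dict, Set, Tuple


def shared_and_unique(groups: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return genes present in every group and, per group, genes present in no other group."""
    if not groups:
        raise ValueError("at least one group is required")
    shared = set.intersection(*groups.values())
    unique: Dict[str, Set[str]] = {}
    for name, genes in groups.items():
        # Union of the gene sets of all other groups.
        others = set().union(*(g for other, g in groups.items() if other != name))
        unique[name] = genes - others
    return shared, unique


# Placeholder groups and gene labels for illustration only.
toy_groups = {
    "group_A": {"GENE1", "GENE2", "GENE3"},
    "group_B": {"GENE1", "GENE3", "GENE4"},
    "group_C": {"GENE1", "GENE5"},
}
shared, unique = shared_and_unique(toy_groups)
```

Genes that occur in some but not all groups (GENE3 in the toy input) belong to neither output, which matches the intermediate regions of a three-set Venn diagram.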
To distinguish potential driver genes from passenger genes, MaxMIF was applied. The 30 genes with the highest MaxMIF scores are reported in Fig. 4. Compared with the result obtained using germline mutations as the reference (QH-v1), only RPGR was shared, meaning that the choice of reference plays a very important role in determining somatic mutations (Fig. 4). In addition, only five genes, TP53, RYR2, RYR1, COL12A1, and DST, were also present in western patients (Fig. 4). Furthermore, Fig. 4 shows that the mutation profiles of these driver genes did not differ significantly between Han Chinese and the minority ethnic groups. The follow-up gene fusion analysis showed an in-frame fusion of exon 2 of KRTAP10-7 and exon 1 of KRTAP10-6 (chr21:46020997 → chr21:46011685), and of exon 28 of IPO4 and exon 21 of DNHD1 (chr14:24650800 → chr11:6567900), both of which were also found when germline mutations were used to determine the somatic mutations. The analysis also identified some unique fusion genes, including an in-frame fusion of exon 1 of HOXD11 and exon 11 of AGAP3 (chr2:176972342 → chr7:150783926), and an in-frame fusion of exon 8 of HLA-A and exon 7 of HLA-J (chr6:29913277 → chr6:29977361).
In sum, unique clinical and molecular characteristics were detected in the plateau patients, suggesting that treatment strategies should be designed specifically for them.
Prediction of effective clinical anti-cancer drugs for gastric cancer patients living on the QTP
To identify effective clinical anti-cancer drugs for these patients living on the QTP, we trained our previously adapted kernel-based learning model on pharmacogenomics data from Genomics of Drug Sensitivity in Cancer (GDSC) and used it to predict the effective clinical drugs for these patients. The predictions showed no significant differences between Han Chinese and the minority ethnic groups in terms of effective clinical drugs (Fig. 5), which may be because the two groups of patients did not differ significantly in molecular characteristics. All 31 patients were predicted to respond well to six drugs: Erlotinib (EGFR inhibitor), Crizotinib (Met inhibitor), Bortezomib (proteasome inhibitor), AUY922 (Hsp90 inhibitor), Axitinib (VEGFR inhibitor), and BEZ235 (PI3K inhibitor). There have been a few reports of Crizotinib use in gastric cancer patients with c-MET amplification. It might therefore be a good option for patients living on the QTP.
Qinghai Province lies in the northeast of the QTP, at an average altitude of 4000 m above sea level. Han Chinese and many minority ethnic groups (such as Hui, Zang (Tibetan people), Zhi, Sala, and so on) live here. In addition, alcohol consumption and smoking are common among people living here. Gastric cancer, which is related to diet and lifestyle, is the most common cancer type in Qinghai Province. Here, to better understand the molecular mechanism of gastric cancer in patients living on the QTP and to predict the most effective clinical drugs for these patients, we collected paired tumor and normal bio-samples from gastric cancer patients at Qinghai Provincial People’s Hospital and discussed the clinical and molecular characteristics of those patients. As a result, we found some unique characteristics of our plateau patients, including a lower mutation rate and unique gene fusions. The drug response prediction model based on pharmacogenomics data from GDSC suggests effective targeted therapies specifically for these plateau patients. The prediction results could be evaluated against follow-up tracking reports.
Around 5,000,000 people live in Qinghai Province and, according to recent statistical reports, only around 1,000,000 of them live in Xining. Sample collection and follow-up tracing are therefore quite challenging here. It took us two years to collect these 31 patients. In future, we will collect many more patients with follow-up treatment reports to further characterize plateau patients and to discuss their specific treatment strategy.
Here, to better understand the molecular mechanism of gastric cancer in patients living on the QTP and to predict the most effective clinical drugs for these patients, WES was performed on tumour and paired normal bio-samples from 31 primary gastric cancer patients at Qinghai Provincial People’s Hospital. Several unique molecular characteristics of these patients were found, including SNPs that are most frequent on chromosome 7, with C → T and G → A as the most common alteration types; few cancer driver genes shared with western patients; and no significant differences among Chinese ethnic groups.
Availability of data and materials
The datasets used and/or analysed during the current study are available at https://www.jianguoyun.com/p/DeRPAeYQj7eWChi22qUE.
Kim K, Cho Y, Sohn J, et al. Clinicopathologic characteristics of early gastric cancer according to specific intragastric location. BMC Gastroenterol. 2019;19(1):24.
Katai H, Ishikawa T, Akazawa K, et al. Five-year survival analysis of surgically resected gastric cancer cases in Japan: a retrospective analysis of more than 100,000 patients from the nationwide registry of the Japanese Gastric Cancer Association (2001–2007). Gastric Cancer. 2018;21(1):144–54.
Sano T, Coit D, Kim H, et al. Proposal of a new stage grouping of gastric cancer for TNM classification: International Gastric Cancer Association staging project. Gastric Cancer. 2017;20(2):217–25.
Suzuki H, Oda I, Abe S, et al. High rate of 5-year survival among patients with early gastric cancer undergoing curative endoscopic submucosal dissection. Gastric Cancer. 2016;19(1):198–205.
Zheng R, Zeng H, Zhang S, et al. Estimates of cancer incidence and mortality in China, 2013. Chin J Cancer. 2017;39(4):315–20.
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209–49.
Rawla P, Barsouk A. Epidemiology of gastric cancer: global trends, risk factors and prevention. Przeglad Gastroenterol. 2019;14(1):26–38.
Yin J, Wu X, Li S, et al. Impact of environmental factors on gastric cancer: a review of the scientific evidence, human prevention and adaptation. J Environ Sci. 2020;89:65–79.
Rowley DB, Currie BS. Palaeo-altimetry of the late Eocene to Miocene Lunpola basin, central Tibet. Nature. 2006;439:677–81.
Favre A, Martin P, Steffen UP, Sonja CJ, Dieter U, Ingo M, Alexandra MR. The role of the uplift of the Qinghai-Tibetan Plateau for the evolution of Tibetan biotas. Biol Rev. 2015;90:236–53.
Xing YW, Richard HR. Uplift-driven diversification in the Hengduan Mountains, a temperate biodiversity hotspot. Proc Natl Acad Sci USA. 2017;114:E3444–51.
Spicer RA, Nigel BWH, Mike W, Alexei BH, Shuangxing G, Paul JV, Jack AW, Simon PK. Constant elevation of southern Tibet over the past 15 million years. Nature. 2003;421:622–4.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics. 2009;25:2078–9.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Ge H, Liu K, Juan T, Fang F, Newman M, Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at basepair resolution. Bioinformatics. 2011;27:1922–8.
Wang Y, Fang J, Chen S. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties. Sci Rep. 2016;6:32679.
Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM TIST. 2011;2(27):1–27.
Gribskov M, Robinson NL. Use of receiver operating characteristic (roc) analysis to evaluate sequence matching. Comput Chem. 1996;20:25–33.
Hou YN, Gao B, Li GJ, Su ZC. MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration. Adv Sci. 2018;5:1800640.
Cho A, Shim JE, Kim E, Supek F, Lehner B, Lee I. MUFFINN: cancer gene discovery via network analysis of somatic mutation data. Genome Biol. 2016;17:129.
Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature. 2014;505(7484):495–501.
Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013;499(7457):214–8.
Hwang S, Kim CY, Yang S, Kim E, Hart T, Marcotte EM, Lee I. HumanNet: human gene networks for disease research. Nucleic Acids Res. 2019;47(D1):D573–80.
Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature. 2014;513(7517):202–9.
Rong G, Zhang Y, Ma Y, Chen S, Wang Y. The clinical and molecular characterization of gastric cancer patients in Qinghai-Tibetan Plateau. Front Oncol. 2020;10:1033.
Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, Boutselakis H, Cole CG, Creatore C, Dawson E, Fish P, Harsha B, Hathaway C, Jupe SC, Kok CY, Noble K, Ponting L, Ramshaw CC, Rye CE, Speedy HE, Stefancsik R, Thompson SL, Wang S, Ward S, Campbell PJ, Forbes SA. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2019;47(D1):D941–7.
Kurscheid S, Bady P, Sciuscio D, Samarzija I, Shay T, Vassallo I, Criekinge WV, Daniel RT, van den Bent MJ, Marosi C, Weller M, Mason WP, Domany E, Stupp R, Delorenzi M, Hegi ME. Chromosome 7 gain and DNA hypermethylation at the HOXA10 locus are associated with expression of a stem cell related HOX-signature in glioblastoma. Genome Biol. 2015;16:16.
Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol. 2016;12(2):109–16.
Hou GX, Song BB. Gastric cancer patient with c-MET amplification treated with crizotinib after failed multi-line treatment: a case report and literature review. Math Biosci Eng. 2019;16(5):5923–30.
We would like to thank the editors and anonymous reviewers for their contributions in this work.
This work was supported by the National Natural Science Foundation of China (Nos. 11671396, 11371365, 31270270), and a grant from Qinghai Sciences and Technology Department for Basic Research Program (Nos. 2020-ZJ-719, 2017-ZJ-Y14).
Ethics approval and consent to participate
The studies involving human participants were reviewed and approved by Qinghai Provincial People’s Hospital ethics committee. The patients/participants provided their written informed consent to participate in this study. All methods were performed in accordance with the relevant guidelines and regulations for research using human specimens.
Consent for publication
The authors declare that there are no conflicts of interest.
Cite this article
Yuan, L., Chen, S., Wang, Y. et al. The molecular characteristics of gastric cancer patients living in Qinghai-Tibetan Plateau. BMC Gastroenterol 22, 244 (2022). https://doi.org/10.1186/s12876-022-02324-8
- Gastric cancer
- Qinghai-Tibetan plateau
- Molecular characteristics
- The mechanism of cancer
- Cancer targeted therapeutics