Treatise
Screening of lncRNA related to prognosis of colon cancer based on TCGA database and establishment of prognostic risk model
He Tian, Cao Tiansheng
Published 2022-07-01
Cite as IMHGN, 2022, 28(13): 1864-1871. DOI: 10.3760/cma.j.issn.1007-1245.2022.13.018
Abstract
Objective: To screen long non-coding RNAs (lncRNAs) associated with the prognosis of colon cancer, and to build a prognostic risk model for colon cancer.
Methods: Data were collected from the inception of the database to March 1, 2022. Transcriptome data for colon cancer were downloaded from The Cancer Genome Atlas (TCGA) and sorted, and an lncRNA expression matrix was constructed for paired samples. Differentially expressed lncRNAs (DElncRNAs) were identified with the R package "edgeR". The DElncRNAs were then subjected to univariate Cox regression analysis, LASSO regression analysis, Kaplan-Meier (K-M) survival analysis, and multivariate Cox regression analysis to obtain the prognosis-associated lncRNAs. The prognostic risk model for colon cancer was established from the coefficients of the multivariate Cox regression model. Its accuracy was evaluated by the C-index, the time-dependent receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), and K-M survival analysis. A competing endogenous RNA (ceRNA) network was constructed for the lncRNAs in the model. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed on the related mRNAs to explore the mechanisms by which the lncRNAs affect the progression of colon cancer.
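As an illustration of how a risk model of this kind is typically applied, the following minimal sketch computes a risk score as the sum of each lncRNA's expression value weighted by its multivariate Cox coefficient, and then splits patients into high and low risk groups. The coefficient values and the median cutoff are assumptions made for illustration only; the abstract reports neither the fitted coefficients nor the cutoff used. The function names `risk_score` and `stratify` are likewise hypothetical.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; the abstract does not
# report the fitted multivariate Cox coefficients for these seven lncRNAs.
COEFFICIENTS = {
    "LINC01132": -0.30,
    "ELFN1-AS1": 0.25,
    "RP5-884M6.1": 0.20,
    "LINC00461": 0.15,
    "RP1-79C4.4": 0.10,
    "RP4-816N1.7": 0.35,
    "RP3-380B8.4": 0.05,
}


def risk_score(expression):
    """Return the weighted sum of expression values (coefficient * expression)."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Label each patient 'high' or 'low' risk.

    Assumption: the cutoff is the median risk score of the cohort, a common
    choice that the abstract does not confirm.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Made-up expression values for two patients, not taken from TCGA.
    cohort = {
        "patient_A": {gene: 1.0 for gene in COEFFICIENTS},
        "patient_B": {gene: 2.0 for gene in COEFFICIENTS},
    }
    scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
    print(stratify(scores))
```

A negative coefficient, as assumed here for the down-regulated LINC01132, lowers the score as expression rises, while positive coefficients raise it.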
Results: A total of 5,460 lncRNAs were identified after sorting the transcriptome data. Paired-sample analysis yielded 868 DElncRNAs, including 548 up-regulated genes and 320 down-regulated genes. Univariate Cox regression analysis retained 40 lncRNAs, LASSO regression analysis retained 34, and K-M survival analysis retained 14. Multivariate Cox regression analysis identified 7 prognosis-related lncRNAs (down-regulated: LINC01132; up-regulated: ELFN1-AS1, RP5-884M6.1, LINC00461, RP1-79C4.4, RP4-816N1.7, and RP3-380B8.4). The prognostic assessment model was constructed from the regression coefficients. The C-index of the model was 0.82; the AUC values at 3 and 5 years were 0.79 and 0.84, respectively; and K-M survival analysis showed a statistically significant difference in survival between the high and low risk groups (P<0.0001). A ceRNA network was then constructed, and KEGG enrichment analysis suggested that the down-regulated lncRNA may inhibit the progression of colon cancer through the regulation of actin cytoskeleton, proteoglycans in cancer, and PI3K-Akt signaling pathways, whereas the up-regulated lncRNAs may promote colon cancer through the cell adhesion molecules, focal adhesion, and phagosome pathways.
Conclusions: We constructed a prognostic risk model for colon cancer based on 7 lncRNAs. The model shows good accuracy in predicting patient survival. Each lncRNA is a potential independent prognostic biomarker. The prognostic risk model has some value for the clinical prognostic assessment of patients with colon cancer.
Key words:
Colon cancer; Colorectal cancer; TCGA; lncRNA; Prognostic model
Contributor Information
He Tian
Department of Gastroenterological Surgery, People's Hospital of Huadu District, Guangzhou 510800, China
Cao Tiansheng
Department of Gastroenterological Surgery, People's Hospital of Huadu District, Guangzhou 510800, China