Survival analysis across the entire transcriptome identifies biomarkers with the highest prognostic power in breast cancer
Computational and Structural Biotechnology Journal
Volume 19, 2021, Pages 4101-4109
Extensive research aims to uncover new biomarkers capable of stratifying breast cancer patients into clinically relevant cohorts. However, the performance of such marker candidates is almost never ranked against that of all other genes. Here, we present a ranking of all survival-related genes in chemotherapy-treated basal and estrogen receptor-positive/HER2-negative breast cancer.
We searched the GEO repository for transcriptomic datasets with available follow-up and clinical data. After quality control and normalization, samples were entered into an integrated database. Molecular subtypes were assigned using gene expression data. Relapse-free survival analysis was performed using Cox proportional hazards regression. The false discovery rate was computed to correct for multiple hypothesis testing. Kaplan-Meier plots were drawn to visualize the best-performing genes.
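The multiple-testing step can be illustrated with a short sketch. The text does not say which false discovery rate procedure was used, so the Benjamini-Hochberg procedure is assumed here. The gene names and p values are hypothetical placeholders, not results from the study, and the 0.05 cutoff is likewise an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene Cox regression p-values (illustration only).
gene_p = {"GENE_A": 0.0004, "GENE_B": 0.03, "GENE_C": 0.20, "GENE_D": 0.0007}
genes = list(gene_p)
q_values = benjamini_hochberg([gene_p[g] for g in genes])

# Keep genes whose q-value is at or below the assumed FDR cutoff of 0.05.
significant = [g for g, q in zip(genes, q_values) if q <= 0.05]
print(significant)
```

In a transcriptome-wide setting the same function would be applied to one p value per gene, and the surviving genes would then be ranked by effect size.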
The entire database includes 7,830 unique samples from 55 independent datasets. Of those with available relapse-free survival time, 3,382 samples were estrogen receptor-positive and 696 were basal. In chemotherapy-treated ER-positive/ERBB2-negative patients, the significant prognostic biomarker genes achieved hazard ratios between 1.76 and 3.33 with p values below 5.8E−04. The significant prognostic genes in adjuvant chemotherapy-treated basal breast cancer samples reached hazard ratios between 1.88 and 3.61 with p values below 7.2E−04. Our integrated platform was extended to enable the validation of future biomarker candidates.
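The hazard values above can be read through the Cox proportional hazards model named in the methods. What follows is the generic textbook formulation, not necessarily the authors' exact specification. The binary covariate x (for example, high versus low expression of a gene) is an assumption made for illustration.

```latex
h(t \mid x) = h_0(t)\, e^{\beta x}, \qquad
\mathrm{HR} = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = e^{\beta}
```

Under this reading, a value of 3.33 would mean that at any time point the relapse hazard in one expression group is about 3.3 times that in the other, assuming the proportional hazards assumption holds.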
We present a reference ranking of all genes in two chemotherapy-treated breast cancer cohorts. The results help to deprioritize genes with unlikely clinical significance and to focus future research on the most promising candidates.