TNMplot.com: A Web Tool for the Comparison of Gene Expression in Normal, Tumor and Metastatic Tissues
by Áron Bartha and Balázs Győrffy
Genes showing higher expression in either tumor or metastatic tissues can help in better understanding tumor formation and can serve as biomarkers of progression or as potential therapy targets. Our goal was to establish an integrated database using available transcriptome-level datasets and to create a web platform which enables the mining of this database by comparing normal, tumor and metastatic data across all genes in real time. We utilized data generated by either gene arrays from the Gene Expression Omnibus of the National Center for Biotechnology Information (NCBI-GEO) or RNA-seq from The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and The Genotype-Tissue Expression (GTEx) repositories. The altered expression within different platforms was analyzed separately. Statistical significance was computed using Mann–Whitney or Kruskal–Wallis tests. False Discovery Rate (FDR) was computed using the Benjamini–Hochberg method. The entire database contains 56,938 samples, including 33,520 samples from 3180 gene chip-based studies (453 metastatic, 29,376 tumorous and 3691 normal samples), 11,010 samples from TCGA (394 metastatic, 9886 tumorous and 730 normal), 1193 samples from TARGET (1 metastatic, 1180 tumorous and 12 normal) and 11,215 normal samples from GTEx. The most consistently upregulated genes across multiple tumor types were TOP2A (FC = 7.8), SPP1 (FC = 7.0) and CENPA (FC = 6.03), and the most consistently downregulated gene was ADH1B (FC = 0.15). Validation of differential expression using equally sized training and test sets confirmed the reliability of the database in breast, colon, and lung cancer at an FDR below 10%. The online analysis platform enables unrestricted mining of the database and is accessible at TNMplot.com.