Evaluation of Methods for Gene Selection in Melanoma Cell Lines
DOI:
https://doi.org/10.6000/1929-6029.2017.06.01.1
Keywords:
Differential gene expression, Melanoma cell lines, Prediction, Power, Quantitative trait.
Abstract
A major objective of microarray experiments is to identify a panel of genes associated with a disease outcome or trait. Many statistical methods for gene selection have been proposed over the last fifteen years. Although some of these methods have been compared, most comparisons concentrated on finding gene signatures based on two groups. This study evaluates four gene selection methods when the outcome of interest is continuous: Significance Analysis of Microarrays (SAM), Linear Models for Microarray Analysis (LIMMA), Lassoed Principal Components (LPC), and Quantitative Trait Analysis (QTA). The comparison is based on the power to identify differentially expressed genes, the predictive ability of the gene lists for a continuous outcome (G2 checkpoint function), and the prognostic properties of the gene lists for distant metastasis-free survival. A simulated dataset and a publicly available melanoma cell lines dataset are used for simulations and validation, respectively. A primary melanoma dataset is used to assess prognosis. No genes were common to the gene lists from all four methods. While SAM was generally the best in terms of power, the QTA gene list performed best in predicting G2 checkpoint function. The gene list identified therefore depends on the choice of gene selection method. The QTA method would be preferred over the other approaches for predicting a quantitative outcome in melanoma research. We recommend the development of more robust statistical methods for differential gene expression analysis.
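To make the idea of "power to identify differentially expressed genes" for a continuous outcome concrete, here is a minimal Python sketch. It is not an implementation of SAM, LIMMA, LPC, or QTA: it ranks genes by absolute Pearson correlation with the outcome, which is a deliberately simple stand-in. The simulation settings (sample size, number of genes, effect size) and all function names are illustrative assumptions, not values taken from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_samples=40, n_genes=200, n_true=10, effect=1.0, seed=0):
    """Simulate expression data (assumed settings, not the paper's design).

    The first `n_true` genes depend linearly on the continuous outcome;
    the remaining genes are pure noise.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    genes = []
    for g in range(n_genes):
        slope = effect if g < n_true else 0.0
        genes.append([slope * y + rng.gauss(0, 1) for y in outcome])
    return genes, outcome


def select_top(genes, outcome, k):
    """Return indices of the k genes most correlated with the outcome."""
    scores = [abs(pearson(g, outcome)) for g in genes]
    ranked = sorted(range(len(genes)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def empirical_power(selected, n_true):
    """Fraction of truly associated genes (indices < n_true) recovered."""
    return sum(1 for i in selected if i < n_true) / n_true


if __name__ == "__main__":
    genes, outcome = simulate()
    selected = select_top(genes, outcome, k=10)
    print("empirical power:", empirical_power(selected, n_true=10))
```

Repeating the simulation over many seeds and averaging the recovered fraction would give a power estimate for this ranking rule; a comparison of several methods could apply the same bookkeeping to each method's gene list.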
License
Copyright (c) 2016 Linda Chaba, John Odhiambo, Bernard Omolo
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.