Feature Ranking by Classification Accuracy Estimation of Multiple Data Samples

Natalia Novoselova, Igor Tom, Arkady Borisov, Inese Polaka


This article presents a gene ranking algorithm for microarray data. The rank vector is estimated by classifying random data samples: at each iteration, the ranks of the genes that take part in a successful classification are increased. Unlike other feature selection methods, the proposed algorithm improves the generality of the classification models by constructing balanced training samples, and it accounts for the descriptiveness of gene combinations by evaluating gene subsets.


Biomarker; classification; feature ranking; gene expression
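
The ranking loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier, the subset size, the success criterion, or the rank increment, so everything below is an assumption made for illustration: a nearest-centroid classifier stands in for the real one, `subset_size`, `accuracy_threshold` and `train_per_class` are hypothetical parameters, and a successful subset is rewarded by its test accuracy. It should not be read as the authors' implementation.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Test accuracy of a nearest-centroid classifier (stand-in for any classifier)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == y_test).mean())


def rank_genes(X, y, n_iterations=500, subset_size=5, train_per_class=None,
               accuracy_threshold=0.8, seed=0):
    """Return one rank score per gene (column of X); higher means more useful.

    All parameter names and defaults are illustrative assumptions.
    Each class is assumed to contain at least two samples.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    class_indices = [np.flatnonzero(y == c) for c in np.unique(y)]
    if train_per_class is None:
        # Assumption: train on half of the smallest class, for every class.
        train_per_class = max(1, min(len(idx) for idx in class_indices) // 2)

    ranks = np.zeros(n_genes)
    for _ in range(n_iterations):
        # Random gene subset, so genes are judged in combination.
        genes = rng.choice(n_genes, size=subset_size, replace=False)
        # Balanced training sample: the same number of samples from each class.
        train = np.concatenate([
            rng.choice(idx, size=train_per_class, replace=False)
            for idx in class_indices
        ])
        test = np.setdiff1d(np.arange(n_samples), train)
        accuracy = nearest_centroid_accuracy(
            X[np.ix_(train, genes)], y[train],
            X[np.ix_(test, genes)], y[test],
        )
        # Only genes from a successful classification have their ranks raised.
        if accuracy >= accuracy_threshold:
            ranks[genes] += accuracy
    return ranks
```

Calling `rank_genes(X, y)` on an expression matrix `X` (samples in rows, genes in columns) with class labels `y` returns a score vector, and `np.argsort(ranks)[::-1]` then lists gene indices from highest to lowest rank. Drawing the same number of training samples from each class keeps a majority class from dominating the fitted model, which is one plausible reading of the balanced training samples mentioned in the abstract.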







Copyright (c) 2013 Natalia Novoselova, Igor Tom, Arkady Borisov, Inese Polaka

This work is licensed under a Creative Commons Attribution 4.0 International License.