Fuzzy Classification System for Bioinformatics Data Analysis
Abstract
This article describes a fuzzy classification system developed by the authors that is particularly applicable to bioinformatics data classification. The description focuses on four steps of the system: 1) data preprocessing; 2) classifier training and construction of the rule base; 3) classification of new records; and 4) evaluation of the results. It also explains the processes within each step, including missing data replacement, reduction of the number of alternatives and functions, construction of membership functions, and stretching of the induced rules. The article concludes with a justification of the methods and algorithms chosen for each process of the system.
Keywords:
Classification system; data mining; data preprocessing; fuzzy logic
Copyright (c) 2014 Madara Gasparoviča-Asite, Ludmila Aleksejeva
This work is licensed under a Creative Commons Attribution 4.0 International License.