Research on the Classification Ability of Deep Belief Networks on Small and Medium Datasets

Andrey Bondarenko, Arkady Borisov


Recent theoretical advances in training deep artificial neural networks have made it possible to overcome the vanishing gradient problem. This is achieved with an unsupervised pre-training step, in which a deep belief network formed by stacking Restricted Boltzmann Machines learns a layer-wise representation of the data. After pre-training, the network weights are fine-tuned with ordinary error back-propagation, treating the network as a feed-forward net. In this paper we compare the described approach with commonly used classification methods on several well-known data sets from the UCI repository, as well as on one mid-sized proprietary data set.
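The greedy pre-training step described above can be sketched in a few lines of NumPy. This is an illustrative toy implementation, not the paper's code: each Restricted Boltzmann Machine is trained with one step of contrastive divergence (CD-1) on the hidden activations of the layer below it; the names `RBM` and `pretrain_dbn` are invented for the example, and binary inputs are assumed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """A minimal binary-binary Restricted Boltzmann Machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, lr=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(0.0, 0.01, (n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def fit(self, V, epochs=20):
        # One CD-1 update per epoch over the whole batch V.
        for _ in range(epochs):
            ph = self.hidden_probs(V)                       # positive phase
            h = (self.rng.random(ph.shape) < ph).astype(float)
            pv = sigmoid(h @ self.W.T + self.b)             # reconstruction
            ph2 = self.hidden_probs(pv)                     # negative phase
            self.W += self.lr * (V.T @ ph - pv.T @ ph2) / len(V)
            self.b += self.lr * (V - pv).mean(axis=0)
            self.c += self.lr * (ph - ph2).mean(axis=0)
        return self

def pretrain_dbn(X, layer_sizes):
    """Greedy layer-wise pre-training: each RBM learns on the layer below."""
    rbms, data = [], X
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden).fit(data)
        rbms.append(rbm)
        data = rbm.hidden_probs(data)   # feed activations to the next RBM
    return rbms, data

# Toy usage: two binary "classes" distinguished by which half is active.
rng = np.random.default_rng(1)
X = (rng.random((200, 16)) < 0.1).astype(float)
X[:100, :8] = 1.0
X[100:, 8:] = 1.0
rbms, features = pretrain_dbn(X, [12, 6])
print(features.shape)  # (200, 6)
```

After this stage the learned weight matrices of the stacked RBMs would initialize a feed-forward network of the same shape, which is then fine-tuned end-to-end with back-propagation on the labeled data.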


Artificial neural networks; classification; deep belief networks; restricted Boltzmann machines





Copyright (c) 2013 Andrey Bondarenko, Arkady Borisov

This work is licensed under a Creative Commons Attribution 4.0 International License.