Anchor Box Parameters and Bounding Box Overlap Ratios for the Faster R-CNN Detector in Detecting a Single Object by the Masking Background

Vadim Romanuke

doi:10.7250/itms-2018-0002

Abstract

Anchor box parameters and bounding box overlap ratios are studied in order to set them appropriately for the Faster R-CNN detector. The benchmark detection is based on monochrome images whose background may mask a small dark object. Three object detection tasks are generated, in which every image either contains a small black square or rectangle or contains no object, thus representing the class “background”. If this class is represented, it is recommended to try the overlap ratios at 0.7. For the task in which every image contains an object, the ratio for positive training samples should be tried at a lower value, but one greater than 0.4. The minimum anchor box size is best tried at a smaller value within the range of object sizes. The anchor box pyramid scale factor and the number of pyramid levels are best tried at 2 and 8, respectively. These parameters may be adjusted subsequently, as their influence is less clear-cut than that of the ratios.
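To make the quantities named in the abstract concrete, the following minimal Python sketch shows one common way an anchor box pyramid and an overlap ratio can be computed. It is an illustration under stated assumptions, not the implementation used in the paper: the function names `anchor_sizes`, `overlap_ratio`, and `is_positive` are hypothetical, the overlap ratio is assumed to be intersection over union, the pyramid is assumed to multiply the minimum box size by the scale factor at each level, and the 8 × 8 minimum size in the usage note is an arbitrary example value.

```python
def anchor_sizes(min_size, scale=2.0, levels=8):
    """Return (height, width) pairs for an anchor box pyramid.

    Assumption: level k is the minimum size multiplied by scale**k.
    The defaults (2 and 8) are the starting values the abstract suggests.
    """
    height, width = min_size
    return [(height * scale**k, width * scale**k) for k in range(levels)]


def overlap_ratio(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def is_positive(proposal, ground_truth, min_overlap=0.7):
    """Treat a proposal as a positive training sample if its overlap
    with the ground truth box reaches the chosen ratio (0.7 by default)."""
    return overlap_ratio(proposal, ground_truth) >= min_overlap
```

For example, `anchor_sizes((8, 8))` yields eight square sizes, each twice the previous one, and `is_positive(proposal, truth, min_overlap=0.5)` accepts more proposals than the default threshold of 0.7, which is the kind of relaxation the abstract describes for the task in which every image contains an object.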

Keywords:

Anchor box; bounding box overlap ratio; object detection; R-CNN.

This work is licensed under a Creative Commons Attribution 4.0 International License.
