Object Detection Via Flexible Anchor Generation
Abstract
This paper presents a novel and flexible method for generating anchors of various shapes within an object detection framework. Unlike previous approaches, which generate anchors in a pre-defined manner, our anchors are generated dynamically by an anchor generator. Specifically, the anchor generator is not fixed but learned from hand-designed anchors, which allows it to work well across varied scenes. At inference time, the weights of the anchor generator are estimated by a simple network whose input is a set of hand-designed anchors. In addition, to reduce the imbalance between the numbers of positive and negative samples, we use an adaptive IoU threshold that depends on object size. We validate the method with extensive experiments on the COCO dataset. The results show that replacing the anchor generation scheme in existing object detectors (such as SSD, Mask R-CNN, and RetinaNet) with our method substantially improves their detection performance.
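The abstract does not state the form of the size-dependent IoU threshold, so the following Python sketch is only an illustration of the general idea, not the paper's formulation. It assumes that the threshold rises linearly with the square root of the ground-truth box area, so that small objects accept lower-overlap anchors as positives. The function names, the threshold range (0.3 to 0.5), and the size breakpoints (32 and 96 pixels, chosen to mirror the COCO small/medium/large boundaries) are all hypothetical.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adaptive_iou_threshold(gt_box, lo=0.3, hi=0.5, small=32.0, large=96.0):
    """Hypothetical size-dependent threshold (an assumption, not the paper's rule).

    Objects with side length <= `small` get threshold `lo`, objects with
    side length >= `large` get `hi`, with linear interpolation in between.
    """
    width = gt_box[2] - gt_box[0]
    height = gt_box[3] - gt_box[1]
    size = math.sqrt(max(0.0, width * height))
    t = (size - small) / (large - small)
    t = min(1.0, max(0.0, t))
    return lo + t * (hi - lo)


def label_anchors(anchors, gt_box):
    """Mark each anchor positive (1) or negative (0) against one ground-truth box."""
    threshold = adaptive_iou_threshold(gt_box)
    return [1 if iou(anchor, gt_box) >= threshold else 0 for anchor in anchors]


if __name__ == "__main__":
    small_object = (0, 0, 16, 16)
    anchors = [(0, 0, 24, 24), (100, 100, 120, 120)]
    # The first anchor overlaps the small object with IoU 256/576, which is
    # below a fixed 0.5 threshold but above the lowered threshold used here.
    print(label_anchors(anchors, small_object))
```

Under these assumptions, a small object gains positive anchors that a fixed 0.5 threshold would reject, which is one way a size-dependent threshold could narrow the gap between positive and negative sample counts.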