ACM Multimedia Asia 2020

Incremental Multi-view Object Detection from a Moving Camera

Ayako Amma
Toyota Motor Corporation

Abstract

Object detection in a single image is a challenging problem due to clutter, occlusions, and a large variety of viewing locations. This task can benefit from integrating multi-frame information captured by a moving camera. In this paper, we propose a method that incrementally integrates object detection scores extracted from multiple frames captured from different viewpoints. For each frame, we run an efficient end-to-end object detector that outputs object bounding boxes, each of which is associated with category and pose scores. The scores of detected objects are then stored in grid locations in 3D space. After observing multiple frames, the object scores stored in each grid location are integrated based on the best object pose hypothesis. This strategy requires object categories and poses to be consistent across multiple frames, and thus it significantly suppresses misdetections. The performance of the proposed method is evaluated on our newly created multi-class object dataset captured in robot simulation and real environments, as well as on a public benchmark dataset.
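
The grid-based integration described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the paper. The class name `ScoreGrid`, the summation of score products, the discretization of pose into yaw bins, the sign convention for converting a camera-relative pose bin into a world-frame bin, and the threshold value are all assumptions made for illustration. The 3D position of each detection is assumed to be given (for example, from depth and a known camera pose).

```python
from collections import defaultdict


class ScoreGrid:
    """Hypothetical voxel grid that accumulates detection scores over frames."""

    def __init__(self, cell_size, n_categories, n_poses):
        self.cell_size = cell_size
        self.n_categories = n_categories
        self.n_poses = n_poses
        # cell index -> accumulated scores indexed as [category][world pose bin]
        self.cells = defaultdict(
            lambda: [[0.0] * n_poses for _ in range(n_categories)]
        )

    def cell_of(self, position):
        # Map a 3D point (in world coordinates) to an integer grid index.
        return tuple(int(c // self.cell_size) for c in position)

    def add_detection(self, position, category_scores, pose_scores, camera_yaw_bin):
        # Assumption: the detector reports pose relative to the camera, so the
        # bin is shifted by the camera's yaw bin to get a world-frame pose bin.
        cell = self.cells[self.cell_of(position)]
        for c, cat_score in enumerate(category_scores):
            for p, pose_score in enumerate(pose_scores):
                world_pose = (p + camera_yaw_bin) % self.n_poses
                cell[c][world_pose] += cat_score * pose_score

    def best_hypotheses(self, threshold):
        # For each cell, keep the single best (category, pose) hypothesis
        # if its accumulated score reaches the threshold.
        results = []
        for index, cell in self.cells.items():
            score, c, p = max(
                (cell[c][p], c, p)
                for c in range(self.n_categories)
                for p in range(self.n_poses)
            )
            if score >= threshold:
                results.append((index, c, p, score))
        return results


grid = ScoreGrid(cell_size=0.5, n_categories=2, n_poses=4)

# Frame 1: the object is seen in camera pose bin 1 with the camera at yaw bin 0.
grid.add_detection((1.1, 0.6, 0.2), [0.9, 0.1], [0.1, 0.7, 0.1, 0.1], camera_yaw_bin=0)
# Frame 2: the camera has rotated by one bin, so the same object appears in
# camera pose bin 0, which maps to the same world pose bin 1.
grid.add_detection((1.2, 0.7, 0.1), [0.8, 0.2], [0.7, 0.1, 0.1, 0.1], camera_yaw_bin=1)
# A spurious detection observed only once, with no clear pose.
grid.add_detection((3.0, 3.0, 0.0), [0.2, 0.8], [0.25, 0.25, 0.25, 0.25], camera_yaw_bin=0)

results = grid.best_hypotheses(threshold=1.0)
for index, category, pose, score in results:
    print(index, category, pose, round(score, 2))
```

In this toy example, the two observations that agree on category and world-frame pose reinforce each other in the same cell, while the single inconsistent detection stays below the threshold. This mirrors the consistency requirement stated in the abstract.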

Publication

  • Takashi Konno, Ayako Amma, and Asako Kanezaki.
    Incremental Multi-view Object Detection from a Moving Camera
    ACM Multimedia Asia (ACMMMA), accepted, 2020.
    PDF

Dataset

Video

BibTeX

@inproceedings{konno2020_imod,
	title={Incremental Multi-view Object Detection from a Moving Camera},
	author={Takashi Konno and Ayako Amma and Asako Kanezaki},
	booktitle={Proceedings of ACM Multimedia Asia (ACMMMA)},
	year={2020}
}
This project is supported by Toyota Motor Corporation.