RT Book, Section
T1 Relation networks for few-shot video object detection
A1 Cores Costa, Daniel
A1 Seidenari, Lorenzo
A1 Bimbo, Alberto del
A1 Brea Sánchez, Víctor Manuel
A1 Mucientes Molina, Manuel
K1 Few-shot object detection
K1 Video object detection
AB This paper describes a new few-shot video object detection framework that leverages spatio-temporal information through a relation module with attention mechanisms to mine relationships among proposals in different frames. The output of the relation module feeds a spatio-temporal double head with a category-agnostic confidence predictor, which reduces the overfitting caused by the small training sets inherent to few-shot settings. The predicted score is the input to a long-term object linking approach that builds object tubes across the whole video, ensuring spatio-temporal consistency. Our proposal establishes a new state of the art on the FSVOD500 dataset.
PB Springer
SN 978-3-031-36616-1
YR 2023
FD 2023-06-25
LK https://hdl.handle.net/10347/43854
UL https://hdl.handle.net/10347/43854
LA eng
NO Cores, D., Seidenari, L., Bimbo, A.D., Brea, V.M., Mucientes, M. (2023). Relation Networks for Few-Shot Video Object Detection. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_19
NO This research was partially funded by the Spanish Ministerio de Ciencia e Innovación (grant number PID2020-112623GB-I00), and the Galician Consellería de Cultura, Educación e Universidade (grant numbers ED431C 2018/29, ED431C 2021/048, ED431G 2019/04). These grants are co-funded by the European Regional Development Fund (ERDF).
DS Minerva
RD 8 jun 2026