RT Book, Section
T1 Relation networks for few-shot video object detection
A1 Cores Costa, Daniel
A1 Seidenari, Lorenzo
A1 Bimbo, Alberto del
A1 Brea Sánchez, Víctor Manuel
A1 Mucientes Molina, Manuel
K1 Few-shot object detection
K1 Video object detection
AB This paper describes a new few-shot video object detection framework that leverages spatio-temporal information through a relation module with attention mechanisms to mine relationships among proposals in different frames. The output of the relation module feeds a spatio-temporal double head with a category-agnostic confidence predictor to decrease overfitting in order to address the issue of reduced training sets inherent to few-shot solutions. The predicted score is the input to a long-term object linking approach that provides object tubes across the whole video, which ensures spatio-temporal consistency. Our proposal establishes a new state-of-the-art in the FSVOD500 dataset.
PB Springer
SN 978-3-031-36616-1
YR 2023
FD 2023-06-25
LK https://hdl.handle.net/10347/43854
UL https://hdl.handle.net/10347/43854
LA eng
NO Cores, D., Seidenari, L., Bimbo, A.D., Brea, V.M., Mucientes, M. (2023). Relation Networks for Few-Shot Video Object Detection. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_19
NO This research was partially funded by the Spanish Ministerio de Ciencia e Innovación (grant number PID2020-112623GB-I00), and the Galician Consellería de Cultura, Educación e Universidade (grant numbers ED431C 2018/29, ED431C 2021/048, ED431G 2019/04). These grants are co-funded by the European Regional Development Fund (ERDF).
DS Minerva
RD 23 Apr 2026