Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos

Cores Costa, Daniel; Brea Sánchez, Víctor Manuel; Mucientes Molina, Manuel

doi:10.1007/s10489-022-03529-w

dc.contributor.affiliation	Universidade de Santiago de Compostela. Departamento de Electrónica e Computación	gl
dc.contributor.area	Engineering and Architecture Area
dc.contributor.author	Cores Costa, Daniel
dc.contributor.author	Brea Sánchez, Víctor Manuel
dc.contributor.author	Mucientes Molina, Manuel
dc.date.accessioned	2022-08-26T10:28:43Z
dc.date.available	2022-08-26T10:28:43Z
dc.date.issued	2022
dc.description.abstract	This paper addresses the problem of exploiting spatiotemporal information to improve small object detection precision in video. We propose a two-stage object detector called FANet based on short-term spatiotemporal feature aggregation and long-term object linking to refine object detections. First, we generate a set of short tubelet proposals. Then, we aggregate RoI-pooled deep features throughout the tubelet using a new temporal pooling operator that summarizes the information with a fixed output size independent of the tubelet length. In addition, we define a double-head implementation that we feed with spatiotemporal information for spatiotemporal classification and with spatial information for object localization and spatial classification. Finally, a long-term linking method builds long tubes from the previously calculated short tubelets to overcome detection errors. The association strategy addresses the generally low overlap between instances of small objects in consecutive frames by reducing the influence of the overlap on the final linking score. We evaluated our model on three different datasets with small objects, outperforming previous state-of-the-art spatiotemporal object detectors and our spatial baseline.	gl
dc.description.peerreviewed	Yes	gl
dc.description.sponsorship	Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature	gl
dc.identifier.citation	Appl Intell (2022). https://doi.org/10.1007/s10489-022-03529-w	gl
dc.identifier.doi	10.1007/s10489-022-03529-w
dc.identifier.essn	1573-7497
dc.identifier.issn	0924-669X
dc.identifier.uri	http://hdl.handle.net/10347/29157
dc.language.iso	eng	gl
dc.publisher	Springer	gl
dc.relation.publisherversion	Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos	gl
dc.rights	© The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/	gl
dc.rights	Attribution 4.0 International
dc.rights.accessRights	open access	gl
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	Video object detection	gl
dc.subject	Small object detection	gl
dc.subject	Convolutional neural network	gl
dc.subject	Spatiotemporal CNN	gl
dc.title	Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos	gl
dc.type	journal article	gl
dc.type.hasVersion	VoR	gl
dspace.entity.type	Publication
relation.isAuthorOfPublication	3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2
relation.isAuthorOfPublication	22d4aeb8-73ba-4743-a84e-9118799ab1f2
relation.isAuthorOfPublication	21112b72-72a3-4a96-bda4-065e7e2bb262
relation.isAuthorOfPublication.latestForDiscovery	3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2

Files

Original bundle

Name: 2022_appint_cores_spatiotemporal.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format

Collections

Electrónica e Computación