Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos

dc.contributor.affiliation: Universidade de Santiago de Compostela. Departamento de Electrónica e Computación
dc.contributor.area: Engineering and Architecture Area
dc.contributor.author: Cores Costa, Daniel
dc.contributor.author: Brea Sánchez, Víctor Manuel
dc.contributor.author: Mucientes Molina, Manuel
dc.date.accessioned: 2022-08-26T10:28:43Z
dc.date.available: 2022-08-26T10:28:43Z
dc.date.issued: 2022
dc.description.abstract: This paper addresses the problem of exploiting spatiotemporal information to improve small object detection precision in video. We propose a two-stage object detector called FANet, based on short-term spatiotemporal feature aggregation and long-term object linking to refine object detections. First, we generate a set of short tubelet proposals. Then, we aggregate RoI-pooled deep features throughout the tubelet using a new temporal pooling operator that summarizes the information with a fixed output size independent of the tubelet length. In addition, we define a double-head implementation that we feed with spatiotemporal information for spatiotemporal classification and with spatial information for object localization and spatial classification. Finally, a long-term linking method builds long tubes from the previously calculated short tubelets to overcome detection errors. The association strategy addresses the generally low overlap between instances of small objects in consecutive frames by reducing the influence of the overlap on the final linking score. We evaluated our model on three different datasets with small objects, outperforming previous state-of-the-art spatiotemporal object detectors and our spatial baseline.
dc.description.peerreviewed: Yes
dc.description.sponsorship: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature
dc.identifier.citation: Appl Intell (2022). https://doi.org/10.1007/s10489-022-03529-w
dc.identifier.doi: 10.1007/s10489-022-03529-w
dc.identifier.essn: 1573-7497
dc.identifier.issn: 0924-669X
dc.identifier.uri: http://hdl.handle.net/10347/29157
dc.language.iso: eng
dc.publisher: Springer
dc.relation.publisherversion: Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos
dc.rights: © The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Video object detection
dc.subject: Small object detection
dc.subject: Convolutional neural network
dc.subject: Spatiotemporal CNN
dc.title: Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos
dc.type: journal article
dc.type.hasVersion: VoR
dspace.entity.type: Publication
relation.isAuthorOfPublication: 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2
relation.isAuthorOfPublication: 22d4aeb8-73ba-4743-a84e-9118799ab1f2
relation.isAuthorOfPublication: 21112b72-72a3-4a96-bda4-065e7e2bb262
relation.isAuthorOfPublication.latestForDiscovery: 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2

Files

Original bundle

Name: 2022_appint_cores_spatiotemporal.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format
Description: