Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Departamento de Electrónica e Computación | gl |
| dc.contributor.area | Engineering and Architecture Area | |
| dc.contributor.author | Cores Costa, Daniel | |
| dc.contributor.author | Brea Sánchez, Víctor Manuel | |
| dc.contributor.author | Mucientes Molina, Manuel | |
| dc.date.accessioned | 2022-08-26T10:28:43Z | |
| dc.date.available | 2022-08-26T10:28:43Z | |
| dc.date.issued | 2022 | |
| dc.description.abstract | This paper addresses the problem of exploiting spatiotemporal information to improve small object detection precision in video. We propose a two-stage object detector called FANet based on short-term spatiotemporal feature aggregation and long-term object linking to refine object detections. First, we generate a set of short tubelet proposals. Then, we aggregate RoI pooled deep features throughout the tubelet using a new temporal pooling operator that summarizes the information with a fixed output size independent of the tubelet length. In addition, we define a double head implementation that we feed with spatiotemporal information for spatiotemporal classification and with spatial information for object localization and spatial classification. Finally, a long-term linking method builds long tubes with the previously calculated short tubelets to overcome detection errors. The association strategy addresses the generally low overlap between instances of small objects in consecutive frames by reducing the influence of the overlap in the final linking score. We evaluated our model on three different datasets with small objects, outperforming previous state-of-the-art spatiotemporal object detectors and our spatial baseline. | gl |
| dc.description.peerreviewed | Yes | gl |
| dc.description.sponsorship | Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature | gl |
| dc.identifier.citation | Appl Intell (2022). https://doi.org/10.1007/s10489-022-03529-w | gl |
| dc.identifier.doi | 10.1007/s10489-022-03529-w | |
| dc.identifier.essn | 1573-7497 | |
| dc.identifier.issn | 0924-669X | |
| dc.identifier.uri | http://hdl.handle.net/10347/29157 | |
| dc.language.iso | eng | gl |
| dc.publisher | Springer | gl |
| dc.relation.publisherversion | Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos | gl |
| dc.rights | © The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ | gl |
| dc.rights | Attribution 4.0 International | |
| dc.rights.accessRights | open access | gl |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Video object detection | gl |
| dc.subject | Small object detection | gl |
| dc.subject | Convolutional neural network | gl |
| dc.subject | Spatiotemporal CNN | gl |
| dc.title | Spatiotemporal tubelet feature aggregation and object linking for small object detection in videos | gl |
| dc.type | journal article | gl |
| dc.type.hasVersion | VoR | gl |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2 | |
| relation.isAuthorOfPublication | 22d4aeb8-73ba-4743-a84e-9118799ab1f2 | |
| relation.isAuthorOfPublication | 21112b72-72a3-4a96-bda4-065e7e2bb262 | |
| relation.isAuthorOfPublication.latestForDiscovery | 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2 |
Files
Original bundle
- Name: 2022_appint_cores_spatiotemporal.pdf
- Size: 1.5 MB
- Format: Adobe Portable Document Format