RT Journal Article
T1 STDnet-ST: Spatio-temporal ConvNet for small object detection
A1 Bosquet Mera, Brais
A1 Mucientes Molina, Manuel
A1 Brea Sánchez, Víctor Manuel
K1 Small object detection
K1 Spatio-temporal convolutional network
K1 Object linking
AB Object detection through convolutional neural networks is reaching unprecedented levels of precision. However, a detailed analysis of the results shows that accuracy in the detection of small objects is still far from satisfactory. A recent trend that is likely to improve overall object detection performance is to combine spatial information with the temporal information available in video. This paper introduces STDnet-ST, an end-to-end spatio-temporal convolutional neural network for small object detection in video. We define small objects as those under px, where the features become less distinctive. STDnet-ST is an architecture that detects small objects over time and correlates pairs of the top-ranked regions with the highest likelihood of containing those small objects. This makes it possible to link small objects across time as tubelets. Furthermore, we propose a procedure to dismiss unprofitable object links so that only high-quality tubelets remain, which increases accuracy. STDnet-ST is evaluated on the publicly accessible USC-GRAD-STDdb, UAVDT and VisDrone2019-VID video datasets, where it achieves state-of-the-art results for small objects.
PB Elsevier
SN 0031-3203
YR 2021
FD 2021
LK http://hdl.handle.net/10347/28783
UL http://hdl.handle.net/10347/28783
LA eng
NO Pattern Recognition 116 (2021) 107929
NO This research was partially funded by the Spanish Ministry of Science, Innovation and Universities under grants TIN2017-84796-C2-1-R and RTI2018-097088-B-C32, and the Galician Ministry of Education, Culture and Universities under grants ED431C 2018/29, ED431C 2017/69 and accreditation 2016-2019, ED431G/08. These grants are co-funded by the European Regional Development Fund (ERDF/FEDER program).
DS Minerva
RD 28 Apr 2026