Cores Costa, Daniel; Brea Sánchez, Víctor Manuel; Mucientes Molina, Manuel
2024-02-09
2024-02-09
2021-09-27
http://hdl.handle.net/10347/32692

We propose a new two-stage spatio-temporal object detection framework that improves detection precision by taking temporal information into account. First, a short-term proposal linking and aggregation method improves box features. Then, a long-term attention module further enhances the short-term aggregated features by adding long-term spatio-temporal information. This module takes object trajectories into account to effectively exploit long-term relationships between proposals in arbitrarily distant frames. Many videos recorded from UAV on-board cameras have a high density of small objects, making the detection problem very challenging. Our method takes advantage of spatio-temporal information to address these issues, increasing detection robustness. We have compared our method with state-of-the-art video object detectors on two publicly available datasets focused on UAV-recorded videos. Our approach outperforms previous methods on both datasets.

eng
object detection; spatio-temporal features; CNN
Spatio-temporal object detection from UAV on board cameras
conference output
open access