MixUDA: From Synthetic to Real Object Detection
Publisher
Springer
Abstract
Object detection has made remarkable progress in recent years, driven by advances in deep learning and the availability of large-scale annotated datasets. However, these methods often require extensive labeled data, which may not be available for specific or emerging applications. This limitation has generated interest in Unsupervised Domain Adaptation (UDA), which transfers knowledge from a labeled source domain to an unlabeled target domain with a different distribution. This study addresses UDA between synthetic and real-world data. We propose a methodology for generating synthetic datasets with AirSim and Unreal Engine, enabling the creation of highly customizable and diverse datasets. We also propose a domain adaptation technique, MixUDA, that maximizes the utility of the synthetic dataset to improve the performance of a model in the real domain. MixUDA is a UDA approach that uses a Mean Teacher architecture and combines pseudo-labels with two image-mixing operations, pseudo-mosaic and pseudo-mixup, to achieve a smooth and progressive transition from the synthetic to the real domain.
The results are encouraging: MixUDA surpasses the state-of-the-art models D3T and MixPL by 1.18 and 4 AP points, respectively, approaching the performance of oracle models trained directly on the target domain. These findings suggest that synthetic datasets have significant potential for addressing data scarcity and improving model generalization, and they point to promising directions for further exploration in this area.
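The abstract names three ingredients: a Mean Teacher architecture, pseudo-labels, and image mixing. The following is a minimal sketch of the generic versions of these ideas, not the authors' implementation. The abstract does not define pseudo-mosaic or pseudo-mixup, so the `mixup` function below is plain mixup applied to a labeled image and a pseudo-labeled one. All function names, the toy list-based image representation, the decay value, and the confidence threshold are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy types: a box is (x1, y1, x2, y2, score);
# an image is a grayscale H x W grid of floats.
Box = Tuple[float, float, float, float, float]
Image = List[List[float]]


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               decay: float = 0.99) -> Dict[str, float]:
    """Mean Teacher step: teacher weights follow an exponential
    moving average of the student weights."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k]
            for k in teacher}


def filter_pseudo_labels(boxes: List[Box],
                         threshold: float = 0.5) -> List[Box]:
    """Keep only teacher predictions whose score reaches the threshold
    (the threshold value is an assumption, not taken from the paper)."""
    return [b for b in boxes if b[4] >= threshold]


def mixup(img_a: Image, boxes_a: List[Box],
          img_b: Image, boxes_b: List[Box],
          lam: float = 0.5) -> Tuple[Image, List[Box]]:
    """Generic mixup for detection: blend two same-sized images
    pixel-wise and keep the boxes of both."""
    mixed = [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(ra, rb)]
             for ra, rb in zip(img_a, img_b)]
    return mixed, boxes_a + boxes_b


# Usage: a labeled synthetic image mixed with a real image whose boxes
# are pseudo-labels produced by the teacher.
synthetic_img, synthetic_boxes = [[0.0, 1.0]], [(0.0, 0.0, 1.0, 1.0, 1.0)]
real_img = [[1.0, 1.0]]
teacher_preds = [(0.0, 0.0, 2.0, 1.0, 0.9), (0.0, 0.0, 1.0, 1.0, 0.2)]
pseudo_boxes = filter_pseudo_labels(teacher_preds, threshold=0.5)
mixed_img, mixed_boxes = mixup(synthetic_img, synthetic_boxes,
                               real_img, pseudo_boxes, lam=0.5)
```

In this sketch the student would be trained on `mixed_img` with `mixed_boxes` as targets, and `ema_update` would then refresh the teacher after each student step.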
Bibliographic citation
Gil-Pérez, P., Cores, D., Mucientes, M. (2026). MixUDA: From Synthetic to Real Object Detection. In: Gonçalves, N., Oliveira, H.P., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2025. Lecture Notes in Computer Science, vol 15937. Springer, Cham. https://doi.org/10.1007/978-3-031-99565-1_10
Publisher version
https://doi.org/10.1007/978-3-031-99565-1_10
Sponsors
This research was partially funded by the Spanish Ministerio de Ciencia e Innovación (grant numbers PID2020-112623GB-I00 and PID2023-149549NB-I00) and the Galician Consellería de Cultura, Educación e Universidade (grant numbers ED431C 2018/29 and ED431G2019/04). These grants are co-funded by the European Regional Development Fund (ERDF). Pablo Gil-Pérez is supported by the Spanish Ministerio de Universidades under the FPI national plan (grant number PRE2023-000607).








