MixUDA: From Synthetic to Real Object Detection

Abstract

Object detection has made remarkable progress in recent years, driven by advancements in deep learning and the availability of large-scale annotated datasets. However, these methods often require extensive labeled data, which may not be accessible for specific or emerging applications. This limitation has generated interest in Unsupervised Domain Adaptation (UDA), which facilitates knowledge transfer from a labeled source domain to an unlabeled and differently distributed target domain. This study addresses the challenge of UDA between synthetic and real-world data. We propose a methodology for generating synthetic datasets with AirSim and Unreal Engine, enabling the creation of highly customizable and diverse datasets. We also propose a Domain Adaptation technique, MixUDA, that maximizes the utility of the synthetic dataset to improve the performance of a model in a real domain. MixUDA is a UDA approach that uses a Mean Teacher architecture and combines pseudo-labels with two image-mixing operations, pseudo-mosaic and pseudo-mixup, to achieve a smooth and progressive transition from the synthetic to the real domain. The results are encouraging: MixUDA surpasses the state-of-the-art models D3T and MixPL by 1.18 and 4 AP points, respectively, approaching the performance of oracle models trained directly on the target domain. These findings suggest that synthetic datasets have significant potential for addressing data scarcity and improving model generalization, and they point to promising directions for further exploration in this area.
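The abstract names three ingredients: a Mean Teacher architecture, pseudo-labels, and image mixing. The following is a minimal sketch of the generic versions of these ideas, not the authors' implementation. The function names, the confidence threshold, the decay value, and the representation of images as nested lists are all illustrative assumptions. The exact pseudo-mosaic and pseudo-mixup operations are defined in the paper and are not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical box format: (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]
Image = List[List[float]]  # toy single-channel image as nested lists


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               decay: float = 0.999) -> Dict[str, float]:
    """Generic Mean Teacher step: teacher weights follow an exponential
    moving average of the student weights (decay value is an assumption)."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


def filter_pseudo_labels(boxes: List[Box], threshold: float = 0.5) -> List[Box]:
    """Keep only teacher predictions whose score reaches the threshold,
    so that they can serve as pseudo-labels on unlabeled target images."""
    return [b for b in boxes if b[4] >= threshold]


def mixup(img_a: Image, img_b: Image, lam: float) -> Image:
    """Standard mixup: pixel-wise convex combination of two same-size images."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def mix_source_and_target(src_img: Image, src_boxes: List[Box],
                          tgt_img: Image, tgt_preds: List[Box],
                          lam: float = 0.5) -> Tuple[Image, List[Box]]:
    """Blend a labeled synthetic image with an unlabeled real image and
    take the union of ground-truth boxes and filtered pseudo-labels."""
    mixed = mixup(src_img, tgt_img, lam)
    return mixed, src_boxes + filter_pseudo_labels(tgt_preds)
```

In this sketch, the blended image carries both the synthetic ground-truth boxes and the teacher's confident predictions on the real image, which is one plausible way to move training gradually from the source toward the target domain.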

Bibliographic citation

Gil-Pérez, P., Cores, D., Mucientes, M. (2026). MixUDA: From Synthetic to Real Object Detection. In: Gonçalves, N., Oliveira, H.P., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2025. Lecture Notes in Computer Science, vol 15937. Springer, Cham. https://doi.org/10.1007/978-3-031-99565-1_10

Sponsors

This research was partially funded by the Spanish Ministerio de Ciencia e Innovación (grant numbers PID2020-112623GB-I00 and PID2023-149549NB-I00) and the Galician Consellería de Cultura, Educación e Universidade (grant numbers ED431C 2018/29 and ED431G2019/04). These grants are co-funded by the European Regional Development Fund (ERDF). Pablo Gil-Pérez is supported by the Spanish Ministerio de Universidades under the FPI national plan (grant number PRE2023-000607).
