Superpowering Open-Vocabulary Object Detectors for X-ray Vision

dc.contributor.affiliation: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías Intelixentes da USC (CiTIUS)
dc.contributor.author: García Fernández, Pablo
dc.contributor.author: Vaquero Otal, Lorenzo
dc.contributor.author: Liu, Mingxuan
dc.contributor.author: Xue, Feng
dc.contributor.author: Cores Costa, Daniel
dc.contributor.author: Sebe, Nicu
dc.contributor.author: Mucientes Molina, Manuel
dc.contributor.author: Ricci, Elisa
dc.date.accessioned: 2025-11-17T12:32:01Z
dc.date.available: 2025-11-17T12:32:01Z
dc.date.issued: 2025-10-19
dc.description: International Conference on Computer Vision, ICCV 2025, Honolulu 2025
dc.description.abstract: Open-vocabulary object detection (OvOD) is set to revolutionize security screening by enabling systems to recognize any item in X-ray scans. However, developing effective OvOD models for X-ray imaging presents unique challenges due to data scarcity and the modality gap that prevents direct adoption of RGB-based solutions. To overcome these limitations, we propose RAXO, a training-free framework that repurposes off-the-shelf RGB OvOD detectors for robust X-ray detection. RAXO builds high-quality X-ray class descriptors using a dual-source retrieval strategy. It gathers relevant RGB images from the web and enriches them via a novel X-ray material transfer mechanism, eliminating the need for labeled databases. These visual descriptors replace text-based classification in OvOD, leveraging intra-modal feature distances for robust detection. Extensive experiments demonstrate that RAXO consistently improves OvOD performance, providing an average mAP increase of up to 17.0 points over base detectors. To further support research in this emerging field, we also introduce DET-COMPASS, a new benchmark featuring bounding box annotations for over 300 object categories, enabling large-scale evaluation of OvOD in X-ray.
dc.description.sponsorship: We thank CINECA and the ISCRA initiative for the availability of high-performance computing resources. This work was partially supported by the EU HORIZON IAMI (HORIZON-CL3-2023-FCT-01-04-101168272) project, the EU HORIZON ELIAS (HORIZON-CL4-2022-HUMAN-02-101120237) project, the EU ISFP PRECRISIS (ISFP-2022-TFI-AG-PROTECT-02-101100539) project, the MUR PNRR FAIR (PE00000013) project funded by the NextGenerationEU, the Spanish Ministerio de Ciencia e Innovación (grant numbers PID2020-112623GB-I00, PID2023-149549NB-I00), and the Galician Consellería de Cultura, Educación e Universidade (2024-2027 ED431G-2023/04). Some of these grants are co-funded by the European Regional Development Fund (ERDF). Pablo Garcia-Fernandez is supported by the Spanish Ministerio de Universidades under the FPU national plan (grant number FPU21/05581).
dc.identifier.uri: https://hdl.handle.net/10347/43850
dc.language.iso: eng
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2023-149549NB-I00/ES/APROVECHANDO LA INTELIGENCIA ARTIFICIAL PARA UNA MONITORIZACION PREDICTIVA ROBUSTA EN MINERIA DE PROCESOS
dc.rights.accessRights: open access
dc.subject: X-ray
dc.subject: Open-vocabulary object detection
dc.subject.classification: 120304 Inteligencia artificial
dc.title: Superpowering Open-Vocabulary Object Detectors for X-ray Vision
dc.type: conference output
dspace.entity.type: Publication
relation.isAuthorOfPublication: b84267f3-fe5b-4ab3-aed0-cc8a3f34690b
relation.isAuthorOfPublication: 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2
relation.isAuthorOfPublication: 21112b72-72a3-4a96-bda4-065e7e2bb262
relation.isAuthorOfPublication.latestForDiscovery: 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2

Files

Original bundle

Name: 2025_iccv_cores_superpowering.pdf
Size: 14.76 MB
Format: Adobe Portable Document Format