Exploring Open-Vocabulary Models for Category-Free Detection
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías Intelixentes da USC (CiTIUS) | |
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Departamento de Electrónica e Computación | |
| dc.contributor.author | García Fernández, Pablo | |
| dc.contributor.author | Mucientes Molina, Manuel | |
| dc.contributor.author | Cores Costa, Daniel | |
| dc.date.accessioned | 2025-11-10T13:40:13Z | |
| dc.date.available | 2025-11-10T13:40:13Z | |
| dc.date.issued | 2025-09-22 | |
| dc.description | Paper presented at the 21st International Conference on Computer Analysis of Images and Patterns (CAIP 2025) | |
| dc.description.abstract | Object detection models typically rely on a predefined set of categories, limiting their applicability in real-world scenarios where object classes may be unknown. In this paper, we propose a novel, training-free framework that enables off-the-shelf open-vocabulary object detectors (OvOD) to perform category-free detection, that is, localizing and classifying objects without any prior category knowledge. Our approach leverages image captioning to dynamically generate descriptive terms directly from the image content, followed by a WordNet-based filtering process to extract semantically meaningful category names. These discovered categories are then embedded and matched with visual region features using a frozen OvOD model to perform detection. We evaluate our method on the COCO dataset in a fully zero-shot setting and demonstrate that it significantly outperforms strong multimodal large language model baselines, achieving an improvement of over 30 AP points. This highlights our method as a promising direction for more adaptive solutions to real-world detection challenges. | |
| dc.description.sponsorship | This work was partially supported by the Spanish Ministerio de Ciencia e Innovación (grant numbers PID2020-112623GB-I00, PID2023-149549NB-I00), and the Galician Consellería de Cultura, Educación e Universidade (2024-2027 ED431G-2023/04). These grants are co-funded by the European Regional Development Fund (ERDF). Pablo Garcia-Fernandez is supported by the Spanish Ministerio de Universidades under the FPU national plan (grant number FPU21/05581). | |
| dc.identifier.citation | Garcia-Fernandez, P., Cores, D., Mucientes, M. (2026). Exploring Open-Vocabulary Models for Category-Free Detection. In: Castrillón-Santana, M., et al. Computer Analysis of Images and Patterns. CAIP 2025. Lecture Notes in Computer Science, vol 15621. Springer, Cham. https://doi.org/10.1007/978-3-032-04968-1_24 | |
| dc.identifier.doi | 10.1007/978-3-032-04968-1_24 | |
| dc.identifier.isbn | 978-3-032-04968-1 | |
| dc.identifier.uri | https://hdl.handle.net/10347/43664 | |
| dc.language.iso | eng | |
| dc.publisher | Springer | |
| dc.relation.ispartofseries | Lecture Notes in Computer Science; 15621 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-112623GB-I00/ES/ | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2023-149549NB-I00/ES/ | |
| dc.relation.publisherversion | https://doi.org/10.1007/978-3-032-04968-1_24 | |
| dc.rights.accessRights | open access | |
| dc.subject | Category-free | |
| dc.subject | Open-vocabulary object detection | |
| dc.subject | Captioning | |
| dc.title | Exploring Open-Vocabulary Models for Category-Free Detection | |
| dc.type | book part | |
| dc.type.hasVersion | AM | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | b84267f3-fe5b-4ab3-aed0-cc8a3f34690b | |
| relation.isAuthorOfPublication | 21112b72-72a3-4a96-bda4-065e7e2bb262 | |
| relation.isAuthorOfPublication | 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2 | |
| relation.isAuthorOfPublication.latestForDiscovery | 3daa2166-1c2d-4b3d-bbb0-3d0036bd8cf2 |
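
The category-discovery step described in the abstract (caption terms filtered down to meaningful category names) can be illustrated with a minimal sketch. This is not the authors' implementation: the caption is assumed to come from some image captioner, `TOY_HYPERNYMS` is a hypothetical hand-written stand-in for WordNet, and `singularize` and `discover_categories` are illustrative names. In the method described, the resulting names would then be passed as text prompts to a frozen open-vocabulary detector, a step this sketch omits.

```python
import re

# Hypothetical stand-in for WordNet: maps a noun to a chain of hypernyms.
# A real system would query WordNet itself; these entries are illustrative only.
TOY_HYPERNYMS = {
    "dog": ["canine", "animal", "physical_object"],
    "frisbee": ["disk", "artifact", "physical_object"],
    "park": ["tract", "location"],
    "afternoon": ["time_period", "abstraction"],
}


def singularize(word: str) -> str:
    """Naive plural handling (an assumption made for brevity)."""
    if word.endswith("s") and word[:-1] in TOY_HYPERNYMS:
        return word[:-1]
    return word


def discover_categories(caption: str) -> list[str]:
    """Return caption words that the toy lexicon treats as physical objects."""
    categories = []
    for token in re.findall(r"[a-z]+", caption.lower()):
        noun = singularize(token)
        hypernyms = TOY_HYPERNYMS.get(noun, [])
        # Keep each qualifying noun once, in order of first appearance.
        if "physical_object" in hypernyms and noun not in categories:
            categories.append(noun)
    return categories


# Assumed captioner output for a single image.
caption = "Two dogs chase a frisbee in the park on a sunny afternoon"
print(discover_categories(caption))
```

Words such as "park" and "afternoon" are dropped because their toy hypernym chains never reach `physical_object`, which mirrors the idea of keeping only terms that name detectable objects.
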
Files
Original bundle
- Name: caip25_final_20250702092653301.pdf
- Size: 920.86 KB
- Format: Adobe Portable Document Format