Dataset bias exposed in face verification

Authors: López López, Eric; Pardo López, Xosé Manuel; Vázquez Regueiro, Carlos; Iglesias Rodríguez, Roberto; Estévez Casado, Fernando
Record date: 2021-04-16
Published: 2019
Citation: López‐López, E., Pardo, X.M., Regueiro, C.V., Iglesias, R. and Casado, F.E. (2019), Dataset bias exposed in face verification. IET Biom., 8: 249-258.
DOI: https://doi.org/10.1049/iet-bmt.2018.5224
Handle: http://hdl.handle.net/10347/26000
ISSN: 2047-4946
Type: journal article
Language: eng
Access: open access
Description: This is the peer reviewed version of the following article: López‐López, E., Pardo, X.M., Regueiro, C.V., Iglesias, R. and Casado, F.E. (2019), Dataset bias exposed in face verification. IET Biom., 8: 249-258, which has been published in final form at https://doi.org/10.1049/iet-bmt.2018.5224. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
Rights: © 2019 The Institution of Engineering and Technology. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
Keywords: Face recognition; Learning (artificial intelligence); Mobile devices; Facial images; Public available datasets; Face verification; Facial verification methods; Target domain; Source domain

Abstract: Most facial verification methods assume that training and testing sets contain independent and identically distributed samples, although in many real applications this assumption does not hold. Whenever gathering a representative dataset in the target domain is infeasible, it is necessary to choose one of the already available (source domain) datasets. Here, a study was performed on the differences among six public datasets and on how these differences affect the performance of the learned methods. In the considered scenario of mobile devices, the individual of interest is enrolled using a few facial images taken in the operational domain, while training impostors are drawn from one of the publicly available datasets. This work tries to shed light on the inherent differences among the datasets and on the potential harms that should be considered when they are combined for training and testing. Results indicate that performance drops whenever training and testing are done on different datasets, compared with using the same dataset in both phases. However, the size of the drop depends strongly on the kind of features. In addition, the representation of samples in the feature space gives insight into the extent to which bias is an endogenous or an exogenous factor.
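
The cross-dataset protocol outlined in the abstract (a client enrolled with a few images from the operational domain, impostors drawn from one public dataset for training and from the same or another dataset for testing) can be illustrated with a small sketch. Everything below is an assumption made for illustration only: the features are synthetic Gaussian vectors, the two "datasets" A and B are hypothetical, the nearest-centroid verifier is a stand-in and not the method used in the article, and balanced accuracy is just one possible performance measure.

```python
import numpy as np


def make_toy_dataset(rng, shift, n=200, dim=8):
    """Hypothetical stand-in for impostor features from one public dataset."""
    return rng.normal(loc=shift, scale=1.0, size=(n, dim))


def train_verifier(client_feats, impostor_feats):
    """Nearest-centroid client-vs-impostor model (illustrative choice only)."""
    return client_feats.mean(axis=0), impostor_feats.mean(axis=0)


def balanced_accuracy(model, client_feats, impostor_feats):
    """Mean of the client acceptance rate and the impostor rejection rate."""
    client_centroid, impostor_centroid = model

    def accepted(x):
        d_client = np.linalg.norm(x - client_centroid, axis=1)
        d_impostor = np.linalg.norm(x - impostor_centroid, axis=1)
        return d_client < d_impostor

    true_accept = accepted(client_feats).mean()
    true_reject = (~accepted(impostor_feats)).mean()
    return 0.5 * (true_accept + true_reject)


def cross_dataset_matrix(impostor_sets, client_train, client_test):
    """Entry [i, j]: train with impostors from dataset i, test on dataset j.

    The first half of each impostor set is used for training and the second
    half for testing, so the diagonal is also evaluated on held-out samples.
    """
    names = list(impostor_sets)
    scores = np.zeros((len(names), len(names)))
    for i, train_name in enumerate(names):
        train_imp = impostor_sets[train_name]
        model = train_verifier(client_train, train_imp[: len(train_imp) // 2])
        for j, test_name in enumerate(names):
            test_imp = impostor_sets[test_name]
            held_out = test_imp[len(test_imp) // 2:]
            scores[i, j] = balanced_accuracy(model, client_test, held_out)
    return names, scores


rng = np.random.default_rng(0)
client_train = rng.normal(0.0, 1.0, size=(5, 8))   # a few enrolment samples
client_test = rng.normal(0.0, 1.0, size=(50, 8))
impostor_sets = {
    "A": make_toy_dataset(rng, shift=3.0),    # hypothetical dataset A
    "B": make_toy_dataset(rng, shift=-3.0),   # hypothetical dataset B
}
names, scores = cross_dataset_matrix(impostor_sets, client_train, client_test)
print(names)
print(np.round(scores, 2))
```

In this toy setup the diagonal entries (same dataset for training and testing) score higher than the off-diagonal ones, because the two impostor distributions were deliberately placed far apart. Real datasets differ in subtler ways, so the sketch shows only the shape of the evaluation, not the magnitude of the effect.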