IBM used Flickr face photos for recognition algorithm without notifying photographers
In January, IBM released a collection of one million facial photos taken from the photo site Flickr. The photos have been used to train an algorithm to recognize facial features and ethnicity, but in some cases no permission was requested.
According to NBC, several photographers have complained that they had no idea that photos they took and posted to Flickr were being used by IBM to train facial recognition algorithms. One photographer, who has more than 700 images in IBM’s collection, says none of the subjects he photographed were aware of this.
In a response to The Verge, IBM said that it takes the privacy of individuals very seriously and that it has taken great care to comply with privacy principles. According to IBM, the dataset was accessible only to verified researchers and contained only publicly available images. According to an IBM spokesperson, individuals can also opt out of the dataset.
However, NBC states that it is nearly impossible to have photos removed from the dataset. This is because IBM requires photographers to send an email with links to the photos to be deleted, while the company has not published the list of Flickr photos and users included in the dataset. As a result, it is not easy to find out whose photos are in it. IBM did not answer questions from NBC about this.
The photos in the dataset were not originally collected by IBM; they are part of a collection of nearly a hundred million photos called YFCC100M. This collection was compiled for research purposes by Yahoo, the former owner of Flickr. The photos are available under a Creative Commons license.
IBM says it will use the photos to develop “fairer” facial recognition systems. In the dataset used by IBM, the photos are not linked to the names of the subjects, which means that the systems cannot directly identify the people in them.
 
			