By Cade Metz | The New York Times
Dozens of databases of people’s faces are being compiled without their knowledge by companies and researchers, with many of the images then being shared around the world, in what has become a vast ecosystem fueling the spread of facial recognition technology.
The databases are pulled together with images from social networks, photo websites, dating services like OkCupid and cameras placed in restaurants and on college quads. While there is no precise count of the data sets, privacy activists have pinpointed repositories that were built by Microsoft, Stanford University and others, with one holding over 10 million images while another had more than two million.
The face compilations are being driven by the race to create leading-edge facial recognition systems. This technology learns how to identify people by analyzing as many digital pictures as possible using “neural networks,” which are complex mathematical systems that require vast amounts of data to build pattern recognition.
Tech giants such as Facebook and Google have most likely amassed the largest face data sets, which they do not distribute, according to research papers. But other companies and universities have widely shared their image troves with researchers, governments and private enterprises in Australia, China, India, Singapore and Switzerland for training artificial intelligence, according to academics, activists and public papers.