A study discovered that a contentious facial recognition dataset of trans people remained available online for years after its initial controversy. The researchers even left the full videos in an unprotected Dropbox until 2021.
The dataset audit, published in Big Data & Society by Os Keyes and Jeanine Austin, delves deeply into the origins and problems of the project, which used still images from YouTube videos uploaded by 38 different people. Researchers led by UNCW’s Karl Ricanek extracted more than one million frames from videos of trans people in transition to create the “HRT Transgender Dataset.”
The HRT Transgender Dataset was not only still accessible through a Dropbox URL until April 2021; it also contained the videos themselves, many of which had since been deleted from YouTube by their posters and all of which were subject to copyright.
Keyes and Austin write in the peer-reviewed article that the dataset, which was allegedly shelved five years ago, was still available online in April 2021 through a Dropbox URL with no password protection. Furthermore, contrary to Mr Ricanek’s claim, the dataset did not contain a list of YouTube URLs but rather the videos themselves, including videos that had since been made private or deleted.
The sources for this piece include an article in Vice.