What did The Atlantic actually find?

An investigation by Alex Reisner for The Atlantic pulled the lid off four datasets that have been quietly passed around among AI developers, holding more than 21 million copyrighted tracks between them. The largest alone runs to about 12 million songs, roughly 91 years of music if you played it end to end, and it has been downloaded thousands of times. Much of it traces back to the Free Music Archive, the long-running hub where independent and underground artists posted their work for free, and both Google and Stability AI have confirmed in their own research papers that they trained music models on it. Reisner also built a searchable tool, the AI Watchdog, so any artist can type in their name and see how much of their catalog is sitting in the pile.

Why is the underground's name all over it?

Because the underground gave a lot of it away on purpose. The Free Music Archive was a gift economy: Creative Commons licences, free downloads, the edit-and-share culture that house and techno have run on for twenty years. That generosity is exactly what made the catalog so easy to scrape. Search the datasets and the scene's canon is right there: about 151 Daft Punk tracks, 89 by Charlotte de Witte, 54 by Eric Prydz and 22 by DJ Sabrina the Teenage DJ, the pseudonymous bedroom producer whose dense, sample-stitched house became a cult favourite with no major behind it. She did not take it quietly.

"To everyone who thought my music sounded like AI slop: did you ever think it was because a dataset contained 22 of my songs?"

That is the part that stings. The music that people accused of sounding machine-made may have been feeding the machines all along. None of these artists were asked, and a personal-streaming or Creative Commons licence was never a permission slip for commercial AI training.

Can artists actually do anything?

On their own, not much yet, which is the hard truth under all of this. "Until the major labels go through their lawsuits, there's no way for artists or labels to fight back," said Vince Valholla of Valholla Records after finding more than 100 of his releases in the data. The leverage sits with the majors: Suno and Udio, the two best-known AI music generators, are already being sued by the labels over what they trained on, and Udio settled with Universal in 2025 to build a licensed platform instead. The Atlantic's watchdog at least turns a vague dread into evidence. You can now name the songs, count them and point to the dataset, which is the thing every one of these cases has been missing.