What did The Atlantic actually publish?
For two years the music industry argued about AI training data without being able to point at it. That just changed. The Atlantic's AI Watchdog, a project led by reporter Alex Reisner, turned four opaque training databases into a single public search box. Type an artist or a track name and it tells you how many matches sit in each set. The project started in 2025 tracking books, research papers and video; the June 2026 expansion into music converts a rumour everyone repeated into something you can verify in ten seconds.
The numbers are blunt. Across the four collections there are more than 21 million recordings. Two of them top 100,000 tracks each; the other two are enormous, at roughly 9 million and 12 million. The catalogue spans Taylor Swift, Bad Bunny, Nirvana, Billie Eilish and The Beatles alongside jazz, classical, and tens of thousands of names nobody outside their Bandcamp followers would recognise.
How did the music end up there?
The largest set, LAION-DISCO-12M, was released in November 2024 by LAION, the German nonprofit behind the image datasets that trained a generation of picture generators. It is a list of around 12 million YouTube tracks, scraped and handed around for anyone building a model. A second set was assembled by pulling lyrics and metadata straight off Genius. These sets are the raw feedstock that companies like Suno and Udio have been accused of training on, and the labels' lawsuits hinge on exactly that question: what went in.
For years "your music is probably in there somewhere" was a shrug. Now it is a search result with a number next to it.
Why this hits underground producers hardest
A major label has a legal department to chase a settlement. The producer who uploaded a deep house EP to YouTube in 2019 has nothing, and that producer is now staring at a database entry. Hainbach, the Berlin instrument experimentalist beloved by the modular crowd, searched his own name and found 151 songs in one set. Multiply that across every small electronic artist who ever posted a set or a loop pack and you understand why the mood online turned from curiosity to anger within a day. The tool does not get anyone paid or removed. What it does is end the deniability, and in a fight that is finally heading for a courtroom this summer, evidence is leverage.



