Anyone who has spent any time on Tumblr knows that there is a beautiful cornucopia of weird and niche stuff — especially among fandoms. 

This week, 404 Media reported that Automattic, the company that owns WordPress and Tumblr, is making a deal to provide data from its sites to help train AI models from OpenAI and Midjourney.

Does this mean that ChatGPT will now be able to write even better Fawnlock fanfic?

“We are also working directly with select AI companies as long as their plans align with what our community cares about: attribution, opt-outs, and control,” a public blog post says. “Our partnerships will respect all opt-out settings.”

404 Media’s report included internal Automattic employee messages describing how engineers were tasked with compiling posts from 2014 to 2023 but made some mistakes along the way. The employees included posts from deleted or suspended blogs, private posts on public blogs, and private answers from the “Ask” function, the report said.

Most notably, they also included content marked NSFW or “mature,” even though that content was supposed to be excluded. Tumblr banned pornography and nudity in 2018, but in 2022 it loosened those rules to allow nudity (but still not sexually explicit images). It’s worth reading 404’s story on what Automattic is or isn’t doing about these apparent errors.

Tumblr is not the only social platform making deals like this. Reddit has a $60 million-a-year deal to license its data to Google for AI training. Facebook and Instagram, of course, are already using data for Meta’s own internal AI tools.

This is controversial among some users, who are uncomfortable with their content (on Tumblr, often personal writing, photography, or art) being used to train AI.
