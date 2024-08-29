

Are you a website owner who wants to opt out of Applebot-Extended, Apple’s AI scraper? You’re not the only one.

Meta, the parent company behind social media platforms like Instagram and Facebook, has reportedly opted out of Applebot-Extended, so Apple won’t be able to train its AI models on content from either platform. Wired has confirmed that other high-profile news sites, like The New York Times, Vox Media, the USA Today Network, The Atlantic, and the Financial Times, have also opted out.

Another social media platform, Tumblr, is also mentioned in the report, alongside classified-ads site Craigslist.

Generative AI is still a new field that is drawing a lot of interest, and the use of copyrighted material in AI training remains controversial: The New York Times has previously sued OpenAI over a similar issue. But some of the sites, like the Financial Times, have also chosen to partner with AI developers like OpenAI.

While the original Applebot was designed to gather information for Apple’s search features, Applebot-Extended specifically focuses on AI training. This means that if a website blocks Applebot-Extended, Apple won’t use the data from that site to improve its AI models, although the site’s content will still be accessible for search functions.
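To make that split concrete, here is a minimal sketch in Python that checks a robots.txt locally before publishing it. It assumes a hypothetical robots.txt that blocks only the Applebot-Extended user agent, and it uses the standard library's `urllib.robotparser`, which only approximates how a real crawler reads the rules. The `is_allowed` helper and the example.com URL are illustrative names, not anything from Apple.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block only the Applebot-Extended token,
# leaving every other crawler (including the original Applebot) unrestricted.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL, used only to exercise the rules.
    page = "https://example.com/news/article.html"
    for agent in ("Applebot", "Applebot-Extended"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
```

With these rules, the check should report the page as allowed for Applebot and disallowed for Applebot-Extended. Treat this as a quick sanity check of the file's syntax, not as proof of how Apple will interpret it.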

“With Applebot-Extended, web publishers can choose to opt out of their website content being used to train Apple’s foundation models powering generative AI features across Apple products, including Apple Intelligence, Services, and Developer Tools,” says Apple on its Applebot support page.

The AI scraper automatically collects content from websites for training, and the only way to opt out is through a publicly accessible robots.txt file. Here’s how.