How to Opt Out of Applebot So Apple Won’t Train AI on Your Websites
Privacy is (still) a beautiful thing
Key notes
- Applebot is Apple’s web crawler that powers features like Spotlight, Siri, and Safari search.
- It can be managed via robots.txt to allow or disallow crawling and content usage.
- Applebot-Extended offers additional control over data usage for AI model training.
Apple made a big showing at this week’s WWDC 2024 event. The Cupertino tech giant announced the iOS 18 update with ChatGPT integration, along with Apple Intelligence, its alternative to Copilot, and an on-device SLM (small language model) with about 3B parameters that lets you run AI locally. That is slightly fewer parameters than Microsoft’s Phi-3-mini model, which has 3.8B.
The iPhone maker also said that it collects “publicly available data” through its web crawler, Applebot, for further AI training, although its “About Applebot” page says the crawler only gathers data to enhance features like Spotlight, Siri, and Safari search results.
And while this may seem harmless, you can still opt your websites out of being used for AI training because, well, privacy is a beautiful thing. To opt out of Applebot, add directives to your website’s robots.txt file that disallow it from crawling your site.
How to Opt Out of Applebot
1. Open your robots.txt file (or create one if it doesn’t exist).
2. Add the following lines to disallow Applebot:
User-agent: Applebot
Disallow: /
3. Save the file and upload it to the root directory of your website.
4. If you also want to opt out of Applebot-Extended, which controls whether your content is used to train Apple’s AI models, add the following lines as well:
User-agent: Applebot-Extended
Disallow: /
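The steps above can also be scripted. The following is a minimal Python sketch, not an official Apple tool: the helper names `add_opt_out` and `update_file` are hypothetical, and it assumes your robots.txt lives in the directory you later upload to the site root. It appends a `Disallow: /` group for each agent that is not already named in the file, so running it twice does not duplicate rules.

```python
from pathlib import Path

# The two user agents named in the steps above.
AGENTS = ("Applebot", "Applebot-Extended")


def add_opt_out(robots_txt: str, agents=AGENTS) -> str:
    """Append a 'Disallow: /' group for each agent that has no group yet.

    Assumption: an agent that already has a group is left untouched,
    even if its existing rules are more permissive.
    """
    # Collect the user agents already named in the file (case-insensitive).
    present = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent":
            present.add(value.split("#")[0].strip().lower())

    blocks = [
        f"User-agent: {agent}\nDisallow: /\n"
        for agent in agents
        if agent.lower() not in present
    ]
    if not blocks:
        return robots_txt  # nothing to add

    base = robots_txt.rstrip("\n")
    prefix = base + "\n\n" if base else ""
    return prefix + "\n".join(blocks)


def update_file(path: str = "robots.txt") -> None:
    """Create or update the robots.txt file at `path` (hypothetical location)."""
    target = Path(path)
    current = target.read_text(encoding="utf-8") if target.exists() else ""
    target.write_text(add_opt_out(current), encoding="utf-8")
```

Calling `update_file()` in your site’s root folder would create the file if it does not exist. If you only want to opt out of AI training while staying in Apple’s search features, pass `agents=("Applebot-Extended",)` to `add_opt_out` instead.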
The robots.txt file is a text file used by websites to communicate with web crawlers and robots. It provides instructions about which parts of the site can be crawled and indexed by search engines and other automated agents. You can check what your website’s robots.txt file looks like by going to (yourdomain.com)/robots.txt.
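If you would rather check the live file from a script than from a browser, here is a small sketch using only Python’s standard library. The function names `robots_url` and `fetch_robots` are hypothetical, and the code assumes your site is served over HTTPS when no scheme is given.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Return the robots.txt URL for a bare domain or any page URL on it."""
    if "://" not in site:
        site = "https://" + site  # assumption: the site uses HTTPS
    parts = urlsplit(site)
    # robots.txt always sits at the root of the host, whatever the page path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(site: str, timeout: float = 10.0) -> str:
    """Download and return the robots.txt text (performs a network request)."""
    with urlopen(robots_url(site), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `print(fetch_robots("yourdomain.com"))` would print the file that crawlers see, which lets you confirm that the uploaded rules are actually being served.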
Applebot-Extended, a secondary user agent alongside Applebot, lets publishers control whether their content is used for Apple’s AI models. Applebot also follows robots meta tag rules, which let you manage the indexing and rendering of individual web pages.
“Applebot-Extended does not crawl webpages. Webpages that disallow Applebot-Extended can still be included in search results. Applebot-Extended is only used to determine how to use the data crawled by the Applebot user agent,” says Apple in the announcement.
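To see how this distinction plays out in the rules themselves, the sketch below parses a robots.txt that disallows only Applebot-Extended and asks which agent each rule applies to. It uses Python’s `urllib.robotparser` as a stand-in: this is the standard library’s matching logic, not Apple’s, and the `allowed` helper and sample rules are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str = "/") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Opt out of AI training only: Applebot itself is not mentioned.
AI_ONLY = """\
User-agent: Applebot-Extended
Disallow: /
"""

# Opt out of crawling and AI training, as in the steps above.
BOTH = """\
User-agent: Applebot
Disallow: /

User-agent: Applebot-Extended
Disallow: /
"""

print(allowed(AI_ONLY, "Applebot"))           # True: search crawling still permitted
print(allowed(AI_ONLY, "Applebot-Extended"))  # False: AI training use opted out
print(allowed(BOTH, "Applebot"))              # False: crawling blocked entirely
```

One caveat: Python’s parser matches agent names by substring, so a file that names only `Applebot` would also be reported as blocking `Applebot-Extended` here. Treat the result as a quick sanity check rather than a statement of how Apple interprets the file.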