| date | title | linkurl | tags |
|---|---|---|---|
| 2023-10-06T10:22:16.581Z | Block the Bots that Feed “AI” Models by Scraping Your Website | https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/ | |

Neil Clarke has written an excellent overview of current techniques for blocking "AI" web-scraping bots from your content, e.g. via `robots.txt`. The reasons for doing so are numerous:
> “AI” companies think that we should have to opt-out of data-scraping bots that take our work to train their products. [...] These companies should be prevented from using data that they haven’t been given explicit consent for. Opt-out is problematic as it counts on concerned parties hearing about new or modified bots BEFORE their sites are targeted by them. That is simply not practical. [...] The online community is under no responsibility to help them create their products.
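
To make the `robots.txt` approach concrete, here is a minimal sketch in Python that checks whether a set of rules would turn a given crawler away. The rules are illustrative assumptions and are not copied from the linked article: `GPTBot` and `CCBot` are two crawler user agents commonly named in this context, and `example.com` and the helper `is_allowed` are placeholders. Only well-behaved bots honor `robots.txt`, so this verifies the rules themselves, not actual crawler behavior.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules: block two known AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only to exercise the rules above.
    url = "https://example.com/some-post/"
    for agent in ("GPTBot", "CCBot", "Mozilla/5.0"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked")
```

Because every new crawler needs its own `User-agent` entry, a list like this has to be maintained by hand, which is exactly the opt-out burden the quote above objects to.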