Make Use Of: What Is GPTBot and Why Are Websites Blocking It? https://www.makeuseof.com/what-is-gptbot-and-why-are-websites-blocking-it/ #Tech #MakeUseOf #TechNews #IT via @morganeogerbc #TechnologyExplained #WebCrawlers #ChatGPT #Chatbot
Seems like a good idea, and should've been in place from the get-go.
OpenAI launches webcrawler GPTBot, and instructions on how to block it https://mashable.com/article/open-ai-gptbot-crawler-block
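For anyone who wants to sanity-check a block rule before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the crawler matches on the `GPTBot` user-agent token; the robots.txt body, the `is_allowed` helper, and the example.com URLs are illustrative assumptions, not taken from the linked article.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: block the GPTBot token everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/page"))
    print(is_allowed("SomeOtherBot", "https://example.com/page"))
```

Note that this only checks what the rules say. Whether a given crawler actually honors them is up to the crawler.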
What the actual fuck?
Will someone kindly explain to "global cybersecurity leader" Palo Alto Networks that the User-Agent header is a place to put the name of your user agent? You send the name of your user agent, and you obey `robots.txt` (which they don't, of course). You DO NOT write a short essay ending with a request for people to email you to opt out. It is 2023 and the right way to do this was established DECADES ago.
#paloaltonetworks #clownshoes #robotstxt #webcrawlers #www #web
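A rough sketch of what the conventional approach looks like from the crawler's side: a short product token in the User-Agent header, plus a robots.txt check before any request is built. The bot name `ExampleBot`, its info URL, and the `polite_request` helper are all hypothetical, and the robots.txt text is passed in as a string so the sketch stays offline.

```python
from typing import Optional
from urllib.request import Request
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity: a product token, a version, and a URL
# where site owners can read about the bot. No essay required.
BOT_TOKEN = "ExampleBot"
USER_AGENT = f"{BOT_TOKEN}/1.0 (+https://example.com/bot)"


def polite_request(url: str, robots_txt: str) -> Optional[Request]:
    """Build a request for url, or return None if robots_txt disallows it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(BOT_TOKEN, url):
        return None  # the site said no, so the request is never made
    return Request(url, headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    rules = "User-agent: ExampleBot\nDisallow: /private/\n"
    print(polite_request("https://example.com/private/x", rules))
    req = polite_request("https://example.com/a", rules)
    print(req.get_header("User-agent"))
```

Opting out then happens in robots.txt, keyed on the token, with no email involved.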
One thing some web crawlers seem to be particularly dumb about is handling 301 Moved Permanently. A lot of bots are still requesting content on my sites that was marked as "moved permanently" or even 410 Gone several years ago ... and they'll be back tomorrow to ask for it again.
This seems inefficient to me, but ¯\_(ツ)_/¯ ...
#bots #crawlers #seo #search #web #webcrawlers #httpstatus #http
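For illustration, here is a minimal sketch of how a crawler could act on those two status codes instead of re-requesting the same URLs forever. It assumes the crawler keeps its to-visit URLs in a plain set; the `record_response` helper and the example.com URLs are hypothetical.

```python
from http import HTTPStatus
from typing import Optional, Set
from urllib.parse import urljoin


def record_response(frontier: Set[str], url: str, status: int,
                    location: Optional[str] = None) -> None:
    """Update the set of URLs to revisit based on one response."""
    if status == HTTPStatus.MOVED_PERMANENTLY:
        # 301: forget the old URL and remember the new one instead.
        frontier.discard(url)
        if location:
            # Location may be relative, so resolve it against the old URL.
            frontier.add(urljoin(url, location))
    elif status == HTTPStatus.GONE:
        # 410: the resource was removed on purpose, so stop asking for it.
        frontier.discard(url)
    # Any other status leaves the frontier unchanged in this sketch.


if __name__ == "__main__":
    frontier = {"https://example.com/old", "https://example.com/dead"}
    record_response(frontier, "https://example.com/old", 301, "/new")
    record_response(frontier, "https://example.com/dead", 410)
    print(frontier)
```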