FedSearch - Federated network search engine

beSpacific · @bespacific

1106 followers · 2068 posts · Server newsie.social

The Verge - The New York Times blocks OpenAI’s web crawler

The #NewYorkTimes has blocked #OpenAI’s #webcrawler, meaning that OpenAI can’t use content from the publication to train its AI models. If you check the NYT’s robots.txt page, you can see that the NYT disallows #GPTBot, the crawler that OpenAI introduced earlier this month. Based on the #InternetArchive’s #WaybackMachine, it appears NYT blocked the crawler as early as August 17th. https://www.theverge.com/2023/8/21/23840705/new-york-times-openai-web-crawler-ai-gpt #copyright #legalresearch
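For readers who want to see how a robots.txt rule like the one described in the post is evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text and the example URL below are assumptions made up for illustration; they are not copied from the NYT's actual file, which may contain different or additional rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excerpt (an assumption for illustration only,
# not the real contents of any publisher's file).
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/some-article"  # placeholder URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, is_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

To check a live site instead, the same parser can load the file with `set_url()` followed by `read()`. Note that robots.txt is a request to crawlers, so a result of `False` only shows what the publisher asks for, not what a crawler actually does.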

Last updated 2 years ago

Paul Belcher 🇪🇺🇬🇧🇪🇸 · @pauljbelcher

106 followers · 28 posts · Server eupolicy.social

#Mastodon: Well, I have only been here for three days, but it feels very much like 1994 all over again: That first hesitant email, those first adventurous clicks on pre-#Google internet search engines. Remember grappling with these?

#WebCrawler
#Lycos
#AltaVista
#Excite
#Dogpile
#AskJeeves
#JumpStation

Let's be patient, progress may be happening before our eyes, again!

Last updated 3 years ago
