🤖 Anyone who does not want their own web pages to be read by AI crawlers has to opt out explicitly (§ 44b Abs. 3 UrhG, the German Copyright Act). This can be done via robots.txt (for example, you can exclude all crawlers and allow only Google or others). ChatGPT can now reportedly be addressed directly with "User-agent: GPTBot". Whether that is actually respected is, of course, a different question.
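A minimal sketch of the robots.txt idea described above, checked with Python's standard `urllib.robotparser`. The rule set, the helper name `is_allowed`, and the example.org URL are assumptions for illustration, not taken from the post. It only shows how a well-behaved crawler would read the rules; it says nothing about whether a given crawler obeys them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot, allow Googlebot, block everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.org/page"))
```

The `User-agent: *` group acts as the catch-all, so any crawler that is not named explicitly falls under the blanket `Disallow: /`.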
I've created a #Mastodon #Hashtag #crawler to find related hashtags and to find folks who talk about them.
I created this primarily to find others across Mastodon servers who have similar tastes to mine, but I'm opening it up to everyone in case it is helpful for you as well.
Note that I know the response is not as fast as it could be, but it's experimental, so I hope you will forgive me.
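A rough sketch of how finding related hashtags could work, not the author's actual crawler. It assumes the public Mastodon REST endpoint `/api/v1/timelines/tag/:hashtag` and the `tags` and `account` fields of the returned statuses; the function names are made up for illustration, and some servers may require authentication for this endpoint.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_tag_timeline(server: str, hashtag: str, limit: int = 40) -> list:
    """Fetch recent public statuses for a hashtag from one server (network call)."""
    url = f"https://{server}/api/v1/timelines/tag/{hashtag}?limit={limit}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def related_hashtags(statuses: list, hashtag: str) -> Counter:
    """Count hashtags that appear together with the seed hashtag."""
    seed = hashtag.lower()
    counts = Counter()
    for status in statuses:
        names = {tag["name"].lower() for tag in status.get("tags", [])}
        if seed in names:
            counts.update(names - {seed})
    return counts


def accounts_using(statuses: list, hashtag: str) -> set:
    """Collect the account handles of people who posted with the seed hashtag."""
    seed = hashtag.lower()
    return {
        status["account"]["acct"]
        for status in statuses
        if seed in {tag["name"].lower() for tag in status.get("tags", [])}
    }
```

Calling `related_hashtags(fetch_tag_timeline("mastodon.social", "chrome"), "chrome").most_common(10)` would list the ten tags seen most often alongside #chrome on that one server; a crawler covering several servers would repeat this per server and merge the counters.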
My #Mastodon #Crawler to search for #Chrome and #ChromeOS posts is running smoothly. The entire crawler is relatively tiny and runs on one of my #chromebook devices :)
I think I can pick up Chrome-related posts in English within an hour of posting. (Feel free to test it by using the #chrome hashtag in your next post.)
Just the start though: I have more ideas of how to use this.
Please let me know if you have creative ideas to use this.
#mastodon #Crawler #chrome #chromeos #chromebook
Weekend Fun: In my attempt to find other folks who talk about #Chrome, I wrote a small #crawler that goes through Mastodon servers to find others talking about related topics. I was pleasantly surprised by how easy it was to crawl compared to the birdsite. I also realize it was easier because the firehose is not too big yet.
That being said, two of the top 10 tags on Mastodon had Twitter in them :) I think we should celebrate once it drops below the 20th rank.
#deecoob = a service provider that has been part of #GEMA for 1.5 years, the German monopolist for the fiduciary exploitation of musical copyrights. Using MESLIS, including AI, they scan websites for things like unregistered (concert) recordings in order to hit event organizers with penalty payments. That kills subculture.
Ping deecoob.com 217.160.0.112
Ping meslis.com 52.219.171.20
... No IPs in the server logs, but prepping #htaccess anyway
#followerpower #block #crawler #meslis
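A small sketch of what prepping the #htaccess could look like, assuming Apache 2.4 with `mod_authz_core` and an `AllowOverride` setting that permits authorization directives in `.htaccess`. The helper name `htaccess_block` is made up; the two addresses are simply the ping results quoted above, and the post does not establish that the crawler actually uses them.

```python
import ipaddress


def htaccess_block(ips: list) -> str:
    """Build an Apache 2.4 .htaccess fragment that denies the given IP addresses."""
    lines = ["<RequireAll>", "    Require all granted"]
    for ip in ips:
        # ip_address() raises ValueError for malformed input, so a typo
        # never ends up in the generated configuration.
        address = ipaddress.ip_address(ip)
        lines.append(f"    Require not ip {address}")
    lines.append("</RequireAll>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Addresses taken from the ping output in the post.
    print(htaccess_block(["217.160.0.112", "52.219.171.20"]))
```

Generating the fragment from a list keeps it easy to extend once real crawler addresses show up in the server logs.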
GEMA-Mag:
#meslis #Crawler #block #followerpower #htaccess #gema #deecoob