FedSearch - Federated network search engine

Thomas Gramespacher :mirified: · @tgramespacher

99 followers · 127 posts · Server bonn.social

The latest MEDIEN INTERNET und RECHT newsletter has been sent out.

Topics include: #Scraping, impermissible advertising (#Werbung) on social media networks, Sunday shop opening (#Ladenöffnung), price advertising (#Preiswerbung) for photovoltaic products, muenchen.de, and more ... 🤗

Subscribe? 📮 http://newsletter.medien-internet-und-recht.de

#scraping #werbung #ladenoffnung #preiswerbung

Last updated 2 years ago

Thomas Gramespacher :mirified: · @tgramespacher

99 followers · 125 posts · Server bonn.social

#Scraping - When data protection claims for damages (#Schadenersatz), injunctive relief (#Unterlassung), and access to information (#Auskunft) are asserted over a scraping incident on a social media platform, setting the value in dispute (#Wertfestsetzung) at a total of EUR 6,000.00 is appropriate

👉 Higher Regional Court (OLG) Frankfurt a.M., http://miur.de/3304

#Datenschutzrecht #SocialMedia #Streitwert

📮 Subscribe to the MEDIEN INTERNET und RECHT newsletter? http://newsletter.medien-internet-und-recht.de

#scraping #schadenersatz #unterlassung #auskunft #wertfestsetzung #datenschutzrecht #socialmedia #streitwert

Last updated 2 years ago

Marcel SIneM(S)US · @simsus

217 followers · 5366 posts · Server social.tchncs.de

The EDÖB (Swiss Federal Data Protection and Information Commissioner) comments on "Data #Scraping" - inside-it.ch https://www.inside-it.ch/edoeb-aeussert-sich-zu-data-scraping-20230828 #Datenschutz #privacy #SocialMedia

#scraping #datenschutz #privacy #socialmedia

Last updated 2 years ago

Swift · @swift

9 followers · 41 posts · Server sunny.garden

@brook have you considered excluding sunny.garden from #AI #scraping #openai?

robots.txt

User-agent: GPTBot
Disallow: /

It seems odd to disallow ai art yet leave published original art susceptible to ai scraping, for example.

This is a genuine question, not a complaint; there's lots I don't know about this area. Thanks for your work 🙂
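To see what the two robots.txt lines quoted above would do, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the file contains only those two lines; the profile URL and the second user agent string are hypothetical, chosen purely for illustration. Note that robots.txt is advisory: it only has an effect if the crawler chooses to honour it.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in the post; nothing else is assumed
# about the real robots.txt served by the instance.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL, used only as an example path on the instance.
EXAMPLE_URL = "https://sunny.garden/@swift"

gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", EXAMPLE_URL)
other_allowed = is_allowed(ROBOTS_TXT, "Mozilla/5.0", EXAMPLE_URL)

print("GPTBot allowed:", gptbot_allowed)
print("Other agent allowed:", other_allowed)
```

With only this rule in place, agents other than GPTBot are unaffected, because no `User-agent: *` group is defined.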

#ai #scraping #openai

Last updated 2 years ago

Marcel SIneM(S)US · @simsus

217 followers · 5319 posts · Server social.tchncs.de

heise online - 2.6 million records of Duolingo users on Have I Been Pwned

2.6 million records of #Duolingo users on Have I Been Pwned | Security https://www.heise.de/news/2-6-Millionen-Datensaetze-von-Duolingo-Nutzern-bei-Have-I-Been-Pwned-9283391.html #haveibeenpwned #Datenschutz #privacy #DataLeak #Datenleck #Scraping #DataBreach #DataTheft

#duolingo #haveibeenpwned #datenschutz #privacy #dataleak #datenleck #scraping #databreach #datatheft

Last updated 2 years ago

Mr.Trunk · @mrtrunk

9 followers · 16134 posts · Server dromedary.seedoubleyou.me

HackRead: API Misuse: Hacker Exposes 2.6M Duolingo Users’ Emails & Names https://www.hackread.com/api-misuse-hacker-leak-duolingo-emails-names/ #Security #Duolingo #Scraping #security #Privacy #Leaks #LEAKS #API

#security #duolingo #scraping #privacy #leaks #api

Last updated 2 years ago

Mr.Trunk · @mrtrunk

7 followers · 15221 posts · Server dromedary.seedoubleyou.me

HackRead: Overcoming web scraping blocks: Best practices and considerations https://www.hackread.com/web-scraping-blocks-practices-considerations/ #DataScraping #WebScraping #Technology #javascript #Scraping #Python #HowTo

#datascraping #webscraping #technology #javascript #scraping #python #howto

Last updated 2 years ago

Fabio Manganiello · @blacklight

1215 followers · 1258 posts · Server social.platypush.tech

Indie Hackers - “It will be the greatest theft in the entire history of humanity.” Indie hackers weigh in on big AI companies scraping the web

I'm actually not entirely against AIs #scraping the web.

Once the genie is out of the bottle, you can't put it back in. If there's some content out there that is freely accessible, and it can be used to make large models better, it will certainly be used - we shouldn't be too naive or ideological about that.

I've always supported total freedom of scraping for everyone. I've always supported a world where all the content on the Internet can also be parsed by machines (that was the entire idea behind the semantic web). Once public content is out there, we lose control over who accesses it and for what purposes - that's simply how the web works.

But if Google and Meta are suddenly in this "we ♥ scraping" mood, I'd expect them to stick to their words and allow bidirectional scraping at least.

As an AI geek, I'd love to train my models on large corpora of audio extracted from YouTube videos. Or what people post in public Facebook groups when particular events happen. Or how the price of a product fluctuates on Amazon as the result of several external factors.

But I can't legally do any of these things. Those platforms are sealed, their APIs are very limited by design, only a limited number of researchers can access some of that data (after signing lengthy NDAs and agreeing that the parent company will decide whether the research can be published), and they have tons of frontend-only checks to ensure that only a human downloads that content - and that they watch a sufficient amount of ads in the process. On top of that, the developers behind scraping software like youtube-dl also get regularly harassed by Google.

So why should I tolerate a world where if you're big enough you can afford to scrape the shit out of everyone, and use that knowledge to become even bigger and more powerful, but nobody is allowed to do the same with your own content?

We urgently need regulation that creates a level playing field when it comes to automated access to online information.

Freedom of scraping means freedom of growing. We can't give this freedom only to those who are already big enough. That's an unfair economic system with insurmountable entry barriers.

We need to make web scraping a fundamental human right.

And large companies should be compelled to share their data with scrapers without barriers too, if they aren't willing to build proper APIs.

Until that happens, I'll keep scraping the shit out of those monopolists without feeling an inch of guilt.

https://www.indiehackers.com/post/it-will-be-the-greatest-theft-in-the-entire-history-of-humanity-indie-hackers-weigh-in-on-big-ai-companies-scraping-the-web-6e78a4a4b7

#scraping

Last updated 2 years ago

Debarko ☑️ · @debarko

7 followers · 9 posts · Server fosstodon.org

It's not even funny how easy it is to #scrape a #website which implements super strong anti #scraping tactics. Wrote a script today which rotates the IP and user agent on every request. It also simulates all the #cookie magic that the website implements to rate limit scrapers. #AWS can actually help you implement scrapers like a DDoS farm.
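The post does not show its script, so the following is only a sketch of the server side of that claim, under stated assumptions: a hypothetical in-memory `SlidingWindowLimiter` that keys clients on an (IP, user agent) pair. It illustrates why a limiter keyed on those values stops a single client but sees every rotated identity as a fresh one. All names, addresses, and limits are made up for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-client limiter: at most max_requests per window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> request timestamps

    def allow(self, client_key, now=None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)

# One client keeps the same (IP, user agent) pair: blocked after 3 requests.
same_client = [
    limiter.allow(("203.0.113.7", "agent-a"), now=float(t)) for t in range(5)
]

# Every request arrives with a different pair: each one looks like a new client.
rotating = [
    limiter.allow((f"203.0.113.{t}", f"agent-{t}"), now=float(t))
    for t in range(10, 15)
]

print(same_client)
print(rotating)
```

This is why sites that care about abuse usually combine several signals rather than relying on the IP address and user agent alone.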

#scrape #website #scraping #cookie #aws

Last updated 2 years ago

C.H. · @c_th1

132 followers · 120 posts · Server digitalcourage.social

Der Datenschutz Talk: Student compels the state of #NRW - data protection (#Datenschutz) news, week 30/2023

What happened in the data protection world in calendar week 30, and what is of interest to data protection officers?

We give a brief overview of the current topics:

No right to erasure: VGH München, decision of 29 June 2023,

Damages for #Scraping:
#Facebook
LG Ravensburg, judgment of 13 June 2023,

Student vs. NRW 🤩

Fine for paparazzi photos taken into celebrities' private homes

Austrian convicted of surveilling (#Überwachung) his wife

Children: data protection violation through film recordings

Recommendations & reading tips:

Making email communication secure

Episode website: https://migosens.de/schuler-verpflichtet-land-nrw-datenschutz-news-kw-30-2023/

Media file: https://migosens.de/podlove/file/781/s/feed/c/ddt/DDT228.mp3

#uberwachung #facebook #scraping #datenschutz #nrw

Last updated 2 years ago

J.A. Jablonski (Jude) ✒ 🏳️‍🌈 · @JAJablonski

288 followers · 651 posts · Server writing.exchange

I just read that Google is scraping Google Docs for their AI "training database." Wondering if this is indeed the case.

Some text files in my Drive folders convert to using the Google Docs interface when I open them. So I was wondering where one can store stuff on the cloud that isn't beholden to such thievery.

Found this article: "Google Docs AI: Is it safe? I’m a novel writer and tech journalist — let’s talk" Rami Tabari

https://www.laptopmag.com/features/google-docs-ai-is-it-safe-im-a-novel-writer-and-tech-journalist-lets-talk

#AI #Copyright #Scraping #WritingCommunity

#writingcommunity #scraping #copyright #ai

Last updated 2 years ago

Kaan Barmore-Genç · @kaan

162 followers · 228 posts · Server fosstodon.org

#Scraping is so exhilarating, I was hitting a website so hard it went down

It was our own website, I took our own website down

If anyone asks I was just red teaming

#scraping

Last updated 2 years ago

Mariette Timmer · @mariettetimmer

20 followers · 429 posts · Server mastodon.nl

Is it possible to scrape dynamic webpages with #GoogleAppsScript?
#dtv #scraping

#scraping #dtv #googleappsscript

Last updated 2 years ago

PyLadies Bot · @pyladies_bot

115 followers · 117 posts · Server botsin.space

📝 "Web Scraping for Data Scientists (With No Web Programming Background)"

👤 Valery C. Briz (@valerybriz)

🔗 https://dev.to/valerybriz/web-scraping-for-data-scientists-with-no-web-programming-background-1j3a

#pyladies #python #oldiebutgoodie #datascience #datamining #scraping

Last updated 2 years ago

Pxl Phile · @ppxl

86 followers · 25 posts · Server social.tchncs.de

Is it true that Meta scrapes the shit out of every corner to gain Real Names™? It feels true given the shitscape we're currently living in. #Threads #Scraping

#threads #scraping

Last updated 2 years ago

Rechtsanwälte Kotz · @kanzlei_kotz

6 followers · 161 posts · Server nrw.social

Value in dispute in "scraping" proceedings

Legal conflict in the digital age: the significance of #Scraping and data protection (#Datenschutz)
The judgment (#Urteil) in question sheds light on a pressing issue of the digital age: data protection in connection with so-called scraping.

#Internetrecht

Stock photo: Andrii Yalanskyi / Shutterstock

https://www.ra-kotz.de/streitwert-in-scraping-verfahren.htm

#scraping #Datenschutz #Urteil #internetrecht

Last updated 2 years ago

DSGVO.watch · @dsgvo_watch

0 followers · 4 posts · Server social.arkm.de

Facebook users receive EUR 500 in damages for GDPR violation

Facebook users received EUR 500 in damages for a further GDPR violation. Between 2018 and 2019, the data of 533 million Facebook accounts was compromised by unknown attackers. ...

https://dsgvo.watch/gerichtsurteile/facebook-nutzer-500euro-schadensersatz-landgericht-luebeck/

#Darknet #Datenschutz #DSGVO #Facebook #Gerichtsurteile #LandgerichtLbeck #Meta #Schadensersatz #Scraping