It's not even funny how easy it is to #scrape a #website that implements super strong anti #scraping tactics. Today I wrote a script that rotates the IP address and User-Agent on every request. It also simulates all the #cookie magic the website uses to rate-limit scrapers. #AWS can actually help you run scrapers like a DDoS farm.
#scrape #website #scraping #cookie #aws
#Google Says It'll #Scrape Everything You Post #Online for #AI. An update to Google's #privacypolicy suggests that the entire public #internet is fair game for its AI projects. There is no privacy online. https://gizmodo.com/google-says-itll-scrape-everything-you-post-online-for-1850601486 #legalresearch
#google #scrape #online #ai #privacypolicy #internet #legalresearch
Here's how to #scrape an #RSS feed for http://magazinelib.com and http://freemagazines.top
https://textbin.net/hsnhtiaukv
I wrote up a setup doc on using RSS-Bridge with its XPath Bridge and MergeFeeds Bridge to scrape these sites into RSS feeds. Since all the arguments are stored in the query string, the URLs are huge, so I save them to a file and run wget against that URL indirectly, via the file. It scrapes 270 items from 10 pages, so you only have to run it once a day. Details in the doc.
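The "save the huge URL to a file, then fetch it indirectly" step could look roughly like the sketch below. It is not taken from the linked doc: the function names, the one-URL-per-line file layout, and the `feed-N.xml` output names are all assumptions made for illustration, and plain `urllib` stands in for wget.

```python
from pathlib import Path
from urllib.request import urlopen


def read_feed_urls(path):
    """Return the URLs in a list file, skipping blank lines and # comments."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.lstrip().startswith("#")]


def fetch_feeds(url_file, out_dir, fetch=None):
    """Download every URL listed in url_file into out_dir as feed-N.xml.

    `fetch` is a hypothetical hook (url -> bytes) so the download step can
    be swapped out; by default it performs a plain HTTP GET.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=60).read()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for n, url in enumerate(read_feed_urls(url_file), start=1):
        target = out / f"feed-{n}.xml"
        target.write_bytes(fetch(url))  # one output file per saved URL
        saved.append(target)
    return saved
```

Something like `fetch_feeds("bridge-urls.txt", "feeds")` (both names are placeholders) run from a once-a-day scheduled job would match the cadence described above, and keeping the long query strings in a file avoids shell quoting problems.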
With #fb making moves to join the #fediverse, I see some benefit in wider public acceptance.
On the other hand, I am pretty sure that they will just #hoover and #scrape all this public data.
They may not be able to algorithmically influence #timelines, but I am left wondering if there are any #defences #defence against this data collection.
Any pointers or information welcome! Please boost for reach.
#fb #fediverse #Hoover #scrape #timelines #defences #defence #askfedi
Adding a few visual niceties to ArkOS on the RG353V. #scrape #skraper #retro #retrogaming #gaming #anbernic #rg353v #gamelist.xml #arkos #sbcgaming #sega #sony #microsoft #nintendo #psx #scummvm #snes #nes #gb #gbc #gba #megadrive #genesis
#scrape #skraper #retro #retrogaming #gaming #anbernic #rg353v #gamelist #arkos #sbcgaming #sega #sony #microsoft #nintendo #psx #scummvm #snes #nes #gb #gbc #gba #megadrive #genesis
<?php
// Fetch the profile page from the Nitter instance.
$url = 'https://nitter.absturztau.be/chillartaholic';
$html = file_get_contents($url);
if ($html === false) {
    exit('Could not fetch ' . $url);
}
$dom = new DOMDocument();
// Suppress the warnings that malformed real-world HTML triggers.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every anchor whose class attribute is exactly "tweet-link".
$nodes = $xpath->query('//a[@class="tweet-link"]');
foreach ($nodes as $node) {
    // The original echoed an undefined $link here; $node is the loop variable.
    echo $node->nodeValue;
    echo $node->getAttribute('href'), '<br>';
}
?>
@themarkup And if you're technically inclined and want to get around some of those fake RSS feeds, this #tutorial can help you get #FreshRSS and #XPath to #scrape the content of the article for you https://joelchrono12.xyz/blog/fetch-full-article-content-freshrss/
#tutorial #freshrss #xpath #scrape
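For anyone wondering what "using XPath to scrape the article content" amounts to, here is a rough standalone sketch. It does not show FreshRSS configuration or the linked tutorial's steps; the sample page, the `article-body` class, and the `extract_article` helper are all hypothetical. Python's `xml.etree.ElementTree` only supports a subset of XPath and needs well-formed markup, so real pages would normally need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page standing in for a real article.
SAMPLE_PAGE = """
<html><body>
  <div class="sidebar"><p>Subscribe to our newsletter</p></div>
  <div class="article-body">
    <p>First paragraph of the article.</p>
    <p>Second paragraph.</p>
  </div>
</body></html>
"""


def extract_article(page, xpath=".//div[@class='article-body']//p"):
    """Return the text of every element matched by the XPath expression."""
    root = ET.fromstring(page)
    # itertext() also collects text inside nested inline tags.
    return ["".join(node.itertext()).strip() for node in root.findall(xpath)]


print(extract_article(SAMPLE_PAGE))
# prints ['First paragraph of the article.', 'Second paragraph.']
```

The expression picks only the paragraphs inside the article container, which is why the sidebar text is left out; that is the same idea as pointing a feed reader at the article body instead of the whole page.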
Excellent! A bit of frost in Aberdeenshire. Ahhh the good old days.
#aberdeenshire #weather #climatechange #frost #scrape
#wirzeichnenrosenkohl #inktober2022 #inktober #scrape
Expedition Brassica o., expedition diary, day 24
“Do you see the problem, my dear fellow? - Yes, but isn't there perhaps some other way to travel?”
(Still eons behind the Inktober prompts, but this one simply fit today's drawing day too well)
Materials: Scribtol, fineliner
Found at: Kunstquartier
Time: after another long day of class
Soundtrack: Donovan - Season of the Witch
#scrape #inktober #Inktober2022 #wirzeichnenrosenkohl
took a brief break, time to try to catch up again
i'm kinda questioning my clothing-design-choices for some of these inktobers :blobmelt: 👕
#inktober #day16 #day17 #day18 #day19 #fowl #salty #scrape #ponytail #sfw #booty
I have been doing my Inktobers but I did forget to post them to my Instagram and Mastodon accounts, oops! I'll catch up with you guys now...
Day 18 and today's #inktober prompt word is #scrape
Not a good word; I didn't have much in the way of ideas for this one, so here's an old-fashioned boot scraper shaped like a duck. It's not very exciting, but I got it done and haven't missed a day yet!
#inktober2022 #penandinkart #drawingchallenge #sketchbookdrawing #inkdrawing #MastoArt #CreativeToots
#CreativeToots #MastoArt #inkdrawing #sketchbookdrawing #drawingchallenge #penandinkart #inktober2022 #scrape #inktober
“As surprising as it seems, learning to act as a maid isn’t scraping at the bottom of the barrel for Érié, who takes this role so seriously that she sometimes seems to scrape and bow.”
#inktober #inktober22 #inktober2022 #art #comics #photostudy #digitalinking #bw #blackandwhite #comicsartist #frenchinktober #scrape #maid #comictober #mastart #fediart #creativetoots