That’s groundbreaking throughput of *checks notes* 0.065Gbps per node if 100s of nodes is literally only 100 of them #webscale
@thedarktangent sure. It depends how the software scales out right now, but I imagine it's still fairly naïve. My theory is that the actual behind-the-scenes work pulling data from all other federated instances takes quite a bit of processing power and bandwidth. It would be wise to break this out to a separate vertically scaled service while the day-to-day handling of user requests is scaled out horizontally.
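A rough sketch of the split described above, just to make it concrete. Everything here is hypothetical (`federation_jobs`, `fetch_remote`, `handle_user_request`, the `example.social` name) and is not how any real fediverse server is laid out: one worker stands in for the vertically scaled federation service, and the request handler is the cheap part you would run many copies of.

```python
import queue
import threading

# Hypothetical names throughout; an in-process queue stands in for whatever
# job broker a real deployment would put between the two tiers.
federation_jobs = queue.Queue()
timeline_cache = {}
cache_lock = threading.Lock()


def fetch_remote(instance):
    """Stand-in for the expensive pull from another federated instance."""
    return f"posts from {instance}"


def federation_worker():
    """The one 'big box': drains the job queue and fills the shared cache."""
    while True:
        instance = federation_jobs.get()
        if instance is None:  # sentinel value: shut the worker down
            federation_jobs.task_done()
            break
        data = fetch_remote(instance)
        with cache_lock:
            timeline_cache[instance] = data
        federation_jobs.task_done()


def handle_user_request(instance):
    """Cheap front-end handler: serves from cache, never fetches inline."""
    with cache_lock:
        cached = timeline_cache.get(instance)
    if cached is not None:
        return cached
    federation_jobs.put(instance)  # hand the heavy work to the worker tier
    return "pending"


def demo():
    worker = threading.Thread(target=federation_worker)
    worker.start()
    first = handle_user_request("example.social")   # cache miss, job queued
    federation_jobs.join()                          # wait for the queued pull
    second = handle_user_request("example.social")  # now served from cache
    federation_jobs.put(None)
    worker.join()
    return first, second
```

The point of the split is that `handle_user_request` never blocks on a remote pull, so adding more front-end copies does not multiply the federation traffic.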
What are your favorite / the best #WebCrawlers for broad / #WebScale #crawling?
I've built a list but am looking for anything I missed: https://github.com/davidshq/awesome-search-engines/blob/main/WebCrawlers.md
Main options I've found include #Apache #Nutch, #StormCrawler, #Scrapy, #Norconex, #PulsarR, #Heritrix, and #sparkler
#WebCrawlers #webscale #crawling #apache #nutch #stormcrawler #scrapy #norconex #pulsarr #heritrix #sparkler #question #search #searchengines
“It works for my use-case” is the #webScale version of “it works on my machine”.
I’m looking at you #kubernetes.
We saw this with the whole #webscale debacle when multiple NoSQL databases had websites that looked more like lifestyle products than actually, you know, things designed to store and retrieve data.
"We can't possibly run our infrastructure on a Legacy Database™ like #PSQL, we need NoSQL or it won't be #webscale!" is something that, with very, very little hyperbole, I have actually heard from an actual developer (1. my toaster is NoSQL, 2. not to be broccoliman but you have 10k QPS, sit down Dan).
@Gargron but is it #webscale 😜?
https://www.youtube.com/watch?v=b2F-DItXtZs
(Apologies if jokes are not etiquette, though it must be eternal September now)
@jerry I wonder if the #fediverse #softwarearchitecture might be more scalable? Wonder what it would take to make it #webscale?
#webscale #SoftwareArchitecture #Fediverse
Which is why I'd now like to introduce you to my newest and most unholy of projects...
BlortMail™!!!
It's the Russian roulette of mail servers! Did it send? Was it to your ex-girlfriend or your boss? You'll never know, and that's half the fun! (The other half is configuring this ungodly abomination!) Coming soon to an enterprise #webscale #blockchain #cloudfirst deployment near YOU!
With love... from BlortCo™!
#webscale #blockchain #cloudfirst
The Tail at Scale - Dean and Barroso
PDF: cs.rutgers.edu/~badri/552dir/…
#webscale #latency #replication #scaling #tech