I have plenty more achievable goals for https://schizo.social (like multi-account or #Calckey support), but something I'd love to try is #classifying posts with #machineLearning #tfidf.
I'd like to be able to define "labels" and then train a classifier to identify them on the fly, then either mute or highlight posts that #classify highly.
Not so much an #algorithm, as a #filter.
#classifier #webdev #ml #ai #filter #algorithm #classify #tfidf #machinelearning #classifying #calckey
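For anyone curious what that could look like, here is a minimal sketch, not the actual schizo.social code. It assumes a nearest-centroid approach: each label gets a TF-IDF vector built from its example posts, and a new post is scored by cosine similarity against each label. The names `TfidfLabeler` and `action_for`, the label names, and the 0.2 threshold are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a hashtag like "#ml" becomes the plain word "ml".
    return re.findall(r"[a-z0-9']+", text.lower())


class TfidfLabeler:
    """Hypothetical nearest-centroid labeler over TF-IDF vectors."""

    def __init__(self):
        self.examples = []  # list of (label, tokens) pairs

    def add_example(self, label, text):
        # "Training on the fly" here just means storing another example.
        self.examples.append((label, tokenize(text)))

    def _idf(self):
        n_docs = len(self.examples)
        df = Counter()
        for _, tokens in self.examples:
            df.update(set(tokens))
        # Smoothed IDF, so the weight of a known term is always positive.
        return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    def _vector(self, term_counts, idf):
        # TF-IDF weights, L2-normalised; terms never seen in training are dropped.
        vec = {t: n * idf[t] for t, n in term_counts.items() if t in idf}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        return {t: v / norm for t, v in vec.items()} if norm else {}

    def score(self, text):
        """Return {label: cosine similarity between the post and the label centroid}."""
        idf = self._idf()
        query = self._vector(Counter(tokenize(text)), idf)
        counts_by_label = {}
        for label, tokens in self.examples:
            counts_by_label.setdefault(label, Counter()).update(tokens)
        scores = {}
        for label, counts in counts_by_label.items():
            centroid = self._vector(counts, idf)
            scores[label] = sum(w * centroid.get(t, 0.0) for t, w in query.items())
        return scores


def action_for(scores, muted=("crypto",), highlighted=("webdev",), threshold=0.2):
    # A filter, not a ranking algorithm: only act when one label scores highly.
    if not scores:
        return "show"
    label, best = max(scores.items(), key=lambda kv: kv[1])
    if best < threshold:
        return "show"
    if label in muted:
        return "mute"
    if label in highlighted:
        return "highlight"
    return "show"


labeler = TfidfLabeler()
labeler.add_example("crypto", "buy bitcoin now crypto token airdrop")
labeler.add_example("crypto", "new crypto coin airdrop pump")
labeler.add_example("webdev", "css grid layout tips for webdev")
labeler.add_example("webdev", "javascript bundler and css tooling")

print(action_for(labeler.score("free crypto airdrop today")))
```

A real version would probably want stop-word handling and a proper classifier on top of the vectors, but the shape (define labels, feed examples, score, then mute or highlight) stays the same.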
What with the Yandex leak showing us that BM25 is more important than tf*idf (boo to all those SEO content editing tools that promote all that tf*idf 'magic bullet' bollox), it's worth brushing up on a few search engine basics.
It's an oldish article (not that it matters in this case), but here Lan Chu does a really great job of breaking down the term-based methods used in information retrieval.
If this is your bag, give it a read.
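To make the difference concrete, here is a rough sketch of per-term scoring under plain tf*idf and under BM25. It is not taken from the article or from any search engine's code. It uses the commonly quoted BM25 formula with the usual defaults `k1=1.2` and `b=0.75`, and the corpus numbers are invented.

```python
import math


def tfidf_term_score(tf, df, n_docs):
    # Classic tf*idf: the score grows linearly with raw term frequency.
    return tf * math.log(n_docs / df)


def bm25_term_score(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    # BM25: term frequency saturates (bounded by k1 + 1 times the IDF),
    # and longer-than-average documents are penalised via b.
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


# Invented corpus statistics: 1000 documents, the term appears in 50 of them.
N_DOCS, DF, AVG_LEN = 1000, 50, 300

for tf in (1, 5, 20):
    plain = tfidf_term_score(tf, DF, N_DOCS)
    bm25 = bm25_term_score(tf, DF, N_DOCS, doc_len=300, avg_doc_len=AVG_LEN)
    print(f"tf={tf:2d}  tf*idf={plain:6.2f}  bm25={bm25:5.2f}")
```

Repeating a keyword twenty times multiplies the tf*idf score by twenty, while the BM25 score barely doubles. That saturation is the main reason keyword-stuffing to hit a tf*idf target buys so little.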
Planning for #CfgMgmtCamp? Not sure what track(s) to attend? I have you covered!
Behold, my entirely hacky TF-IDF analysis of the talk submissions, broken down by room & day. In other words, which words in the talk blurbs are distinctive to each room *specifically*.
#cfgmgmtcamp #rstats #textmining #tfidf
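The original analysis was done in R and its code isn't shown here, so what follows is only a Python sketch of the general idea: treat all the blurbs for one room as a single document, then rank each room's words by TF-IDF so that words shared by every room drop to zero. The function name `top_terms_per_group`, the room names, and the blurbs are all invented.

```python
import math
import re
from collections import Counter


def top_terms_per_group(blurbs_by_group, top_n=5):
    """Return {group: [most distinctive words]} using TF-IDF across groups."""
    # One bag of words per group (e.g. per room).
    counts = {
        group: Counter(re.findall(r"[a-z]+", " ".join(blurbs).lower()))
        for group, blurbs in blurbs_by_group.items()
    }
    n_groups = len(counts)

    # Document frequency: in how many groups does each word appear?
    df = Counter()
    for group_counts in counts.values():
        df.update(group_counts.keys())

    result = {}
    for group, group_counts in counts.items():
        total = sum(group_counts.values())
        scored = {
            term: (n / total) * math.log(n_groups / df[term])
            for term, n in group_counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
        # Words present in every group score 0 and are left out.
        result[group] = [term for term, score in ranked[:top_n] if score > 0]
    return result


rooms = {
    "Room A": ["Ansible playbooks for network automation", "Testing Ansible roles"],
    "Room B": ["Terraform modules for cloud", "Testing Terraform state"],
}

for room, terms in top_terms_per_group(rooms, top_n=3).items():
    print(room, terms)
```

Words like "testing" and "for" appear in both rooms, so their IDF is zero and they never show up in the lists, which is exactly the "specific to each room" effect.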