Can you guess which of these Chekhovs, Gogols, and Ostrovskys are FAKE (ChatGPT-generated in the style of each writer) by looking at the #stylometry similarity visualisations? It's an easy task, but I find it somewhat educational 😉 Inspired by recent work by @rebsim at the @aiucd conf
NB: it is not about size; the generated texts are in the same size range here (I used iterative chain prompting to generate long-enough stuff)
Nice thread, with a punchline for #DH2023 attendees interested in #stylometry... read till the end: https://mastodon.social/@mhoye/110707564773784439
#DH2023: yesterday's first day had so many interesting parallel sessions that I was a bit overwhelmed and forgot to share here for people who can't be in #Graz. Highlights of the day, apart from #stylometry: the #DH #Theory session with @rabeakleymann, Jennifer Edmond, @nabsiddiqui and more
Another very intriguing subject in the #stylometry session at #DH2023: are there different sources behind the Torah, and, more importantly, can this be proved statistically? A very interesting approach. I just don't quite agree with the last comment that the mathematical method is completely objective and without biases. Hmm, think about the starting point, which is the text and the assumptions made about it... Yes, statistical methods can create evidence, but they are not objective
#DH2023: the morning starts right away with highlights in #stylometry. Jan Rybicki spoke about literary translations, and it turns out that DeepL copies the author's style better than humans... well, and Marcel #Proust can't be translated at all. Think about what #Picasso said about translating himself into Spanish and French: "si je pense dans une langue..." ("if I think in a language...")
At #CCLS2023, we now have the pleasure of attending an (as always) very inspiring and colorful #keynote by Jan #Rybicki on #stylometry and distant reading applied to 10,005 novels in #Polish (translated or original).
Estrella Samba-Campos is presenting at our 9th IDHN conference her latest research on kutub al-ʿilm and muṣannaf collections using #stylometry. Join us https://tinyurl.com/idhn9conf or check out her profile at the Universidad Complutense de Madrid.
I thoroughly enjoyed presenting my data-driven research on late Ottoman #Arabic #Periodicals at #DigHis23. The focus of my paper was stylometric authorship attribution, which relied on earlier collaborative work with Maxim Romanov on establishing parameters for reliable authorship attribution in Arabic for the `stylo()` package in #R (#Rstats).
Slides are available at https://tinyurl.com/dighis23-grallert
#MultilingualDH #DigitalHumanities #DigitalHistory #PeriodicalStudies #Stylometry
The #DigHis23 continues with an inspirational presentation by @tillgrallert: He identifies anonymous authors of Arabic periodical articles through stylometric authorship attribution. #Stylometry #DH
This week on #osspodcast @kurtseifried and I chat about #stylometry
There's a tool to look at #HackerNews authors and see if their writing is similar to another user's (sock puppets, anyone?)
This naturally leads to larger discussions about #privacy, #cybersecurity, #impersonation, and, of course, #shakespeare
https://opensourcesecurity.io/2022/12/04/episode-352-stylometry-removes-anonymity/
Hi Fediverse, #introduction
Currently I'm spending a lot of my time on the computer researching #music #corpora in order to finish my #phd @ #epfl by the end of 2023. My main subject is #musicTheory, and I'm trying to measure stylistic differences between tonal languages of the last four centuries through #statistics on #harmony (#stylometry).
I'm here to connect with people who are interested in #dh #DataScience #machinelearning #opendata #dataset #foss #privacy #musicianship #funk #techno
Hi! Here's an #introduction. I'm Jonathan Reeve, and I work in computational approaches to literary study, using #NLP, #AI, #ML, #stylometry, and methods of #digitalhumanities, in languages like #Python and #Haskell.
I maintain https://open-editions.org, which collects #TEI XML editions of James #Joyce and other writers; https://corpus-db.org, an API for literary corpora; and text-matcher, a text reuse detection engine. More of my projects are up at https://github.com/JonathanReeve.
I'm a PhD student in English and Comparative Literature, in my final year at Columbia University, writing a dissertation which models visuality in British #modernism.
As a long-term Mastodon user, I'm happy to see a #twittermigration taking place, and even happier that it's happening through my old employer (in a previous iteration), HCommons! I was j0_0n on Twitter: https://twitter.com/j0_0n. Check out my blog at https://jonreeve.com.
This looks brilliant! Preprint on "Boosting word frequencies in authorship attribution" by Maciej Eder. Instead of relative frequencies, frequency normalisation against a background of semantically similar words was performed. Significant performance gains shown via fascinating heatmaps. See: https://arxiv.org/abs/2211.01289 #stylometry #AuthorshipAttribution #stylo #Kraków #CHR2022 #WordEmbeddings #Heatmaps #BurrowsDelta #CosineDelta
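To make the idea in the post concrete, here is a minimal Python sketch of the contrast between ordinary relative frequencies and frequencies normalised against a background of semantically similar words. It is only an illustration of the general idea as described above, not the preprint's implementation: the function names, the toy sentence, and the hand-written `neighbours` lists are all assumptions (in practice the neighbours would presumably come from a word-embedding model).

```python
from collections import Counter


def relative_frequencies(tokens):
    """Classic relative frequency: count of a word / total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def background_frequencies(tokens, neighbours):
    """Frequency of a word relative to itself plus its semantic neighbours.

    `neighbours` maps each target word to a list of related words.
    The lists used below are hypothetical and written by hand.
    """
    counts = Counter(tokens)
    result = {}
    for word, related in neighbours.items():
        # Background = occurrences of the word plus occurrences of its neighbours.
        others = set(related) - {word}
        background = counts[word] + sum(counts[r] for r in others)
        result[word] = counts[word] / background if background else 0.0
    return result


# Toy example (assumed data, for illustration only).
tokens = "the cat sat on the mat and a dog sat by the door".split()
neighbours = {"the": ["a", "an"], "on": ["by", "at"]}

print(relative_frequencies(tokens)["the"])   # share of all tokens
print(background_frequencies(tokens, neighbours))  # share within each semantic group
```

The second measure asks how often an author picks a word when that word or one of its near-synonyms could have been used, rather than how often the word occurs overall.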
Hoping to find various tribes, I'm just posting a collection of interests:
#DigitalHumanities #DigitalHistory #SocialHistory #MediaStudies #Stylometry #PeriodicalStudies #الانسانيات-الرقمية #OttomanEmpire #Arabic #MultilingualDH