Check out Tobias Englmeier et al.’s work “Using an Advanced Text Index Structure for Corpus Exploration in Digital Humanities,” which shows ways to explore #corpuses through symmetric compacted directed acyclic word graphs (SCDAWGs), offering ways to answer many of the questions raised in #DH research:
http://digitalhumanities.org:8081/dhq/vol/15/1/000526/000526.html
For an exploration of #microhistories of the #Holocaust, see “Algorithmic Close Reading: Using Semantic Triplets to Index and Analyze Agency in Holocaust Testimonies” by Lizhou Fan & Todd Presner, which uses #text #analysis #methods to search #testimonies:
http://digitalhumanities.org:8081/dhq/vol/16/3/000623/000623.html
With #ChatGPT still making news, we decided to highlight some of the pieces from our vault on language models.
Check out “Digital Humanities and Natural Language Processing: Je t’aime... Moi non plus” by Barbara McGillivray, Thierry Poibeau & Pablo Ruiz Fabo, which focuses on closer collaboration between #DH datasets and #NLP tools:
http://digitalhumanities.org:8081/dhq/vol/14/2/000454/000454.html
See Diego Jiménez-Badillo et al.’s work “Developing Geographically Oriented NLP Approaches to Sixteenth-Century Historical Documents: Digging into Early Colonial Mexico,” which explores how #NLP and other #computational approaches can be applied to understand large #historical #corpuses:
http://digitalhumanities.org:8081/dhq/vol/14/4/000490/000490.html
Cross-boost (excerpts) from
v buckenham (@v21@🐦.com)
there is a real fear among #AI researchers that the last big #corpuses of human written #text have already been captured. all future scrapes of the internet for text to learn from will be contaminated by machine-speak.
...
funny to think of a time when generated text is recognizable due to its use of typically 2020-ish #speech patterns and references. a cultural fixed point new models start from...