Martin Prell · @mprell
173 followers · 77 posts · Server fedihum.org

Wow, with the Super Model "Titan I", Transkribus seems to have taken a very impressive next step. In particular, our corpora with very heterogeneous hands are recognised significantly better in our tests so far than by models trained specifically on those hands. If this is confirmed, even the time-consuming creation of training data and the model training itself could become obsolete 🤯

readcoop.eu/introducing-transk

#Transkribus #htr #atr #propylaen

Last updated 1 year ago

After successfully training and improving my model for the prints, I used it on the most recent Ethica, from 1728. Typographically, it is still very similar to the editions from the 17th century, so that should not be a big problem. I'll quality-check the first eight pages today and see whether it is worthwhile improving the model for this particular print.

#Transkribus #htr #ethicacomplementoria

Last updated 1 year ago

· @dehypotheses
1355 followers · 437 posts · Server fedihum.org

Not to be missed: the keynote of our ATR workshop by Dominique Stutzmann (IRHT/Humboldt-Univ.):

"Automatic text recognition (ATR). New horizons for historians"
7 September 2023
18:00
On site & online

Info 👉ow.ly/1QVC50PHl7t

#atr #ocr #htr #humanitesnumeriques #histoirenumerique #DigitalHumanities

Last updated 1 year ago

DHI Paris · @dhiparis
429 followers · 41 posts · Server wisskomm.social

Don't miss the keynote of our ATR workshop by Dominique Stutzmann (IRHT/Humboldt-Univ.):

"Automatic text recognition (ATR). New possibilities for historians"
7 September 2023
18:00
Online & on site

Info 👉 ow.ly/9eAt50PHkY3

#atr #ocr #htr #digitalhumanities #dh #digitalhistory

Last updated 1 year ago

And that's it! I've sent off the new training set, which will hopefully fix the issues I have encountered!
I will also use this model for the 1728 print. It's a different printing press, but overall, the two prints are very much alike, and so is the type they use.

#fraktur #Transkribus #ethicacomplementoria #htr #earlymodern #printhistory

Last updated 1 year ago

DHI Paris · @dhiparis
429 followers · 41 posts · Server wisskomm.social

Workshop: "From historical source to full text. Applying automated text recognition (ATR)"

More info ➡️ ow.ly/WqzQ50PGM5H

Next week at the DHIP!
7–8 September 2023

#atr #ocr #htr #digitalhumanities #dh #digitalhistory

Last updated 1 year ago

Moving back to the issues, we have phenomena like these (shown in the image): The layout detection model draws 'short' lines when the text is warped in the book fold. This leads to especially slim letters and punctuation not getting recognized. When I do the corrections, I will also extend the lines to include these characters. Hopefully, it will improve the recognition!

#htr #Transkribus #ethicacomplementoria

Last updated 1 year ago

This phenomenon is highlighted because the model I've been training to recognize the text had too few characters in cursive in the training set. So they get misread, and I have to correct them, thus becoming hyperaware of these differences in the typesetting.
When I'm done, I will reconstruct the sheets and printing order to look at the distribution of spelling and other errors and typographical conventions. Exciting!

#htr #bookhistory #printhistory #analyticbibliography

Last updated 1 year ago

Phillip Ströbel · @phillipstroebel
156 followers · 37 posts · Server techhub.social

My thesis has been published & is now available: doi.org/10.5167/uzh-234886

I recommend Chapter 4 if you are interested in HTR & what transformers can do for historical documents.

Thanks again to my supervisors Martin Volk & @thist, & everyone at the Department of Computational Linguistics from the University of Zurich!

#htr #transformers #historical #documents #digitalhumanities #ocr

Last updated 1 year ago

Exciting! Overnight, the text recognition job was completed and now I get to see how good the transcription is!

#ethicacomplementoria #Transkribus #htr

Last updated 1 year ago

A long time ago, I wanted to do a project on text reuse & text production strategies of Georg Greflinger. I estimated ca. 30,000 pages of printed text. It was 2009/10, and large-scale digitisation of older books had just started. OCR was a mess, & HTR wasn't a thing yet. Eventually, I abandoned the project, since manually transcribing 30,000 pp. & then doing computational analysis for text similarity & re-use was unfeasible.
Imagine I wanted to do that now!

#phd #earlymodern #author #georggreflinger #ocr #htr

Last updated 1 year ago

Adi Keinan-Schoonbaert · @adi
211 followers · 96 posts · Server glammr.us

We are collaborating with the Wikimedia Foundation on the Wikisource Loves Manuscripts project, with a special twist! The transcribed Indonesian manuscripts will be used to train a model! @BL_DigiSchol Read more here: blogs.bl.uk/digital-scholarshi

#wikimedia #Transkribus #htr

Last updated 1 year ago

Jez 📚 · @petrichor
433 followers · 1275 posts · Server digipres.club

Here are a few more details about my progress training a handwriting model with Transkribus:

erambler.co.uk/blog/training-a

#htr #transkribus

Last updated 1 year ago

A blog post continuing my series of posts about the Ethica Complementoria project on the weblog: greflinger.hypotheses.org/716
Today about HTR and re-using models.

#ethicacomplementoria #DigitalScholarlyEdition #georggreflinger #htr #Transkribus

Last updated 1 year ago

Jez 📚 · @petrichor
427 followers · 1183 posts · Server digipres.club

Since PyLaia is open source, it should now be possible to recreate this training on my own desktop with the same parameters, apply the model to recognise new pages, and from there figure out a workflow to simplify getting handwritten notes into plain text for reference or publication.

Has anyone done any of these stages? Any pointers?

#htr #transkribus #pylaia

Last updated 1 year ago

Phillip Ströbel · @phillipstroebel
66 followers · 27 posts · Server techhub.social

Anna Scius-Bertrand is presenting the writer adaptation challenge @ ICDAR2023. The Bullinger dataset is available here: tc11.cvc.uab.es/datasets/Bulli
The paper is here: link.springer.com/chapter/10.1

#bullinger #dataset #htr #ocr #digitalhumanities

Last updated 1 year ago

Phillip Ströbel · @phillipstroebel
63 followers · 25 posts · Server techhub.social

Next week, I will present our (authors being @thist, @boente, Martin Volk & me) paper about the adaptability of transformer-based OCR models for historical documents at the ADAPDA workshop @icdar2023. Preprint available here: researchgate.net/publication/3. Disclaimer: This preprint has not undergone any post-submission improvements or corrections. The Version of Record of this contribution is published in Document Analysis and Recognition – ICDAR 2023 Workshops, and is available online at doi.org/10.1007/978-3-031-4149.

#adaptability #transformer #historical #digitalhumanites #ocr #htr #letters

Last updated 1 year ago

Alix Chagué · @Alix_Tz
40 followers · 10 posts · Server hcommons.social

[📢 ] Unexpectedly, there was more to say on the experiments I started last week with Lucien Peraire's archives! So I wrote a new blog post to tell you more about it and dive into Kraken's more advanced parameters for training models!

➡️ alix-tz.github.io/phd/posts/01

#PhD #blog #htr

Last updated 1 year ago

Alix Chagué · @Alix_Tz
21 followers · 6 posts · Server hcommons.social

[📢 ] I had a little bit of fun this afternoon training models on Lucien Peraire's archives. I even got to try out @wouter_haverals's CERberus to test my models' predictions. Inevitably, I wrote a new blog post about it!

Since Peraire's archives contain two ensembles of very different documents in French (same handwriting but different writing tools), I tested how good a model can be when trained on only one of the ensembles and applied to the other, and vice versa. I got really excited by the results because one of my models just completely lost it on the unseen writing tool 😋

➡️ alix-tz.github.io/phd/posts/01

#PhD #htr

Last updated 1 year ago