FedSearch - Federated network search engine

Christopher Pollin · @chpollin

98 followers · 53 posts · Server fedihum.org

The #DigEdTnT webinar series offers insights into transitions between tools for digital editions.

Details & Zoom link: https://informationsmodellierung.uni-graz.at/de/neuigkeiten/detail/article/digedtnt-webinarreihe/

Start: 19.09, 17:00-18:00: #FromThePage → #ediarum

https://digedtnt.github.io/

#digedtnt #FromThePage #ediarum

Last updated 2 years ago

Original post

Ben W. Brumfield · @benwbrum

1420 followers · 297 posts · Server hcommons.social

We just passed two million pages transcribed on #FromThePage!

#FromThePage

Last updated 2 years ago

Original post

Ben W. Brumfield · @benwbrum

1402 followers · 285 posts · Server hcommons.social

#dayofdh2023 continues with a lunch break followed by releasing the fixes and features we developed over the last couple of days. #FromThePage is open-source, so first we deploy to the SaaS at fromthepage.com as a last verification step, then we merge into the main branch and cut a new release: https://github.com/benwbrum/fromthepage/releases/tag/v23.5.4

Our friends at the University of Texas-Austin Libraries have been testing these releases, including accessibility and security scans, so they're the first people we email about the release.

#dayofdh2023 #FromThePage

Last updated 2 years ago

Original post

Ben W. Brumfield · @benwbrum

1401 followers · 282 posts · Server hcommons.social

On to the next task for #dayofdh2023: One of the #IIIF manifests on the UT-Austin Islandora instance blows up when importing it into #FromThePage, due to using multilingual values from the v2 Presentation API rather than v3.

Example of v2 metadata element: {"label": "Topic", "value": {"@value": "Incas| Conquest, 1522-1548| History", "@language": "en"}}

The same in v3 would be something like
{ "label": {"en": ["Topic"]}, "value": {"en": ["Incas| Conquest, 1522-1548| History"]}}

Apparently we missed supporting multilingual values when we implemented IIIF support for v2 (though we did implement multilingual labels).
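The difference between the two shapes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FromThePage's actual fix (which lives in its Ruby codebase): the function names `v2_value_to_v3` and `v2_metadata_to_v3` and the `default_lang` parameter are hypothetical. It relies on v2 allowing a plain string, an object with `@value`/`@language`, or a list of either, and on v3 using the key `"none"` for strings with no language.

```python
from typing import Any, Dict, List


def v2_value_to_v3(value: Any, default_lang: str = "none") -> Dict[str, List[str]]:
    """Fold an IIIF Presentation v2 label/value into a v3-style language map.

    Hypothetical helper: accepts a plain string, a {"@value", "@language"}
    object, or a list mixing both, and groups the strings by language.
    """
    items = value if isinstance(value, list) else [value]
    result: Dict[str, List[str]] = {}
    for item in items:
        if isinstance(item, dict):
            # v2 multilingual object: language is optional
            lang = item.get("@language", default_lang)
            text = item.get("@value", "")
        else:
            # Plain string with no language information
            lang, text = default_lang, str(item)
        result.setdefault(lang, []).append(text)
    return result


def v2_metadata_to_v3(entry: Dict[str, Any], default_lang: str = "none") -> Dict[str, Any]:
    """Convert one v2 metadata element (label + value) to the v3 shape."""
    return {
        "label": v2_value_to_v3(entry["label"], default_lang),
        "value": v2_value_to_v3(entry["value"], default_lang),
    }


v2_entry = {
    "label": "Topic",
    "value": {"@value": "Incas| Conquest, 1522-1548| History", "@language": "en"},
}
# Passing default_lang="en" tags the untagged label as English,
# matching the v3 example in the post.
v3_entry = v2_metadata_to_v3(v2_entry, default_lang="en")
```

Treating every value as a list of language-tagged strings means an importer only needs one code path, whichever form a manifest happens to use.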

Success!

#dayofdh2023 #iiif #FromThePage

Last updated 2 years ago

Original post

Ben W. Brumfield · @benwbrum

1401 followers · 279 posts · Server hcommons.social

I just learned that today is #dayofdh2023
So far, I've met with one of our developers to plan work while I'm away at the #iiif conference, then reviewed and tested a new feature adding search term highlighting to #FromThePage.

One tricky bit is highlighting index terms as well as full-text search hits. Indexed subjects may have different verbatim text, as in this example, so we have to highlight based on semantics rather than spelling.

#dayofdh2023 #iiif #FromThePage

Last updated 2 years ago

Original post

Ben W. Brumfield · @benwbrum

1387 followers · 239 posts · Server hcommons.social

What's the best #MySQL #database #cloud hosting option for occasional use by developers?

Our amazing undergraduate software engineer (formerly our high school intern) has set up #FromThePage on #Github #CodeSpaces so she doesn't have to fight with her local installation. This opens up a world of possibilities for easy collaboration, but we really need to connect the space to a (bowdlerized) copy of our production DB and portion of the filesystem.

#mysql #database #Cloud #FromThePage #github #codespaces

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

1375 followers · 230 posts · Server hcommons.social

@ekansa In related news, we just deployed a fix to #FromThePage to switch its CDN dependencies to jsdelivr.net from unpkg.

So that was a little pre-dinner excitement.
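For npm-hosted assets, a switch like this mostly comes down to rewriting URLs, since both CDNs address files as `package@version/path`. A minimal sketch, assuming only the public URL layouts of the two services; the helper name `unpkg_to_jsdelivr` and the package in the example are made up and not taken from the FromThePage codebase:

```python
from urllib.parse import urlsplit, urlunsplit


def unpkg_to_jsdelivr(url: str) -> str:
    """Rewrite an unpkg.com asset URL to its cdn.jsdelivr.net/npm equivalent.

    Hypothetical helper: URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "unpkg.com":
        return url
    # jsDelivr serves npm packages under the /npm/ prefix
    return urlunsplit(
        (parts.scheme, "cdn.jsdelivr.net", "/npm" + parts.path, parts.query, parts.fragment)
    )


# "example-pkg" is a placeholder package name
new_url = unpkg_to_jsdelivr("https://unpkg.com/example-pkg@1.2.3/dist/example.min.js")
```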

#FromThePage

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

1318 followers · 198 posts · Server hcommons.social

I just ran some quick stats for #FromThePage project partners, and volunteers have transcribed 15,619,250 lines of prose text, 681,299 records from ledgers, and 1,174,263 records from index cards and forms. Hooray for #crowdsourcing

(NB This data is a snapshot in time on FromThePage.com and does not reflect deleted projects or self-hosted projects.)

#FromThePage #crowdsourcing

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

1219 followers · 159 posts · Server hcommons.social

We're doing our year-in-review analysis for #FromThePage, and I'm wondering what metrics other #crowdsourcing platforms use to measure "success"?

There are product-focused metrics like completed projects or pages transcribed, but what about user-focused ones? What proportion of volunteers were able to find a project that interested them? How many potential volunteers tried to perform a task but were turned away? How many people were discouraged from participating, and how many became passionate participants? How many discovered something that excited them during the process?

#FromThePage #crowdsourcing

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

1139 followers · 154 posts · Server hcommons.social

@digitaldogsbody @jbaiter Once we get this working, I'll publish it as a gem under an Apache license. I feel like it opens up a lot of neat opportunities for #crowdsourcing, since volunteers prefer working in plaintext, but libraries (and their users) really like word highlighting in search results.

I'm not sure how we'll integrate it into #FromThePage yet, but I hope we can make it part of the data flow.

#crowdsourcing #FromThePage

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

1015 followers · 130 posts · Server hcommons.social

Can you use #crowdsourcing to create #alttext descriptions for photographs in #specialcollections and #archives? We want to find out.

Last week we released a new alt-text input type for #FromThePage projects. (video walk-through at https://www.youtube.com/watch?v=_--zAZgpJCs )

It was requested by partners at a university library who hope to use the feature for remote work by staff, but we think that there are some interesting possibilities for collaboration with the public to increase #a11y for cultural heritage objects.

If you have examples of easy-to-follow instructions for creating alt-text, I'd love to add them to our resources for project owners.

#crowdsourcing #alttext #specialcollections #archives #FromThePage #a11y

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

895 followers · 114 posts · Server hcommons.social

A #ChatGPT test:

A month or two ago, a customer asked if #FromThePage could be used to generate #alttext for photographs in library #specialcollections. The software supports field-based transcription, so--in theory--they could configure a form with a single text field and instructions on describing images, then have that presented to users next to the photo. The problem was that our text field forms had a limit of 2048 characters -- too much for good alt-text.

UI is not a strong point for me or Sara, but we have a brilliant college student working for us, so we assigned the feature to her. By the end of the Thanksgiving break, she'd produced a new input type that worked really well. With a bit of polish on our end, the feature looks like this: https://www.youtube.com/watch?v=_--zAZgpJCs

But could we have used an AI to do this instead?

#chatgpt #FromThePage #alttext #specialcollections

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

847 followers · 97 posts · Server hcommons.social

Community Engagement and Data-driven Indexing with The Library of Virginia’s “Free Negro Registers” webinar tomorrow at 12:00 EST

The Library of Virginia holds thirty-nine registers documenting free Black and multi-racial individuals from Virginia localities between 1794 and 1865. LVA has digitized each register as part of Virginia Untold, a digital project that provides access to materials documenting Black history in antebellum Virginia. As with other Virginia Untold documents, LVA wanted to #crowdsource their transcription using #FromThePage, but the registers presented unique challenges.

Sonya Coleman and Lydia Neuroth will discuss how LVA addressed those challenges, including workflows for crowdsourced #indexing rather than verbatim #transcription to better align with the data-driven goals of the Virginia Untold project. They will also share virtual and in-person programming ideas, lessons from working with volunteers, and feedback from their users.

Sign up: https://content.fromthepage.com/dec-2022-webinar-community-engagement/

#crowdsource #FromThePage #indexing #transcription

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

742 followers · 88 posts · Server hcommons.social

@susankitchens Wow, these are some great questions! (And how cool to reconnect here with someone from the old blogosphere/Google Reader days!)

We do a lot of analysis on repeat participants by cohort and referrer on #FromThePage to reach out to project owners whose messaging/configuration may not be leading to volunteer success, and to try to analyze ways to improve our user experience for volunteers.

On the other questions, I'm not sure if any platforms provide a live chat for volunteers, status map, or "online now" number. Maybe @mia, @sam, @AlexaRenggli, @meghaninmotion or @algeebraten would know?

#FromThePage

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

641 followers · 79 posts · Server hcommons.social

I'm analyzing #FromThePage data for our December newsletter, focusing on volunteer behavior over the holidays. I have my own questions, but what would you like to know about #volunteers on #crowdsourcing platforms?

#FromThePage #volunteers #crowdsourcing

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

620 followers · 77 posts · Server hcommons.social

@jorisvanzundert As part of that NEH grant, we also addressed the privacy concerns -- in the US academic context these revolve around FERPA law protecting privacy of students.

This suggestion from your paper was very close to what we implemented in #FromThePage:
"Future complications could be avoided by presenting the citizen scientists with a digital form asking them to check a box if they agree to being named in a publication before they apply to the project. Thus, they can knowingly opt in. It is crucial that such a form clearly state how exactly their name would be used, as part of expectation management, if the participant allows for their name to be used at all."

We developed three levels of identification:
* pure pseudonymity,
* pseudonymity on the online platform + a "Real Name" for citation purposes in appropriate data exports,
* Real Name + ORCID to be included in exports.
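
One way to picture the three levels is as an opt-in setting that controls what a data export reveals. The sketch below is purely illustrative: `IdentificationLevel`, `Contributor`, and `export_credit` are hypothetical names, not FromThePage's actual data model, and the ORCID shown is the sample iD used in ORCID's own documentation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class IdentificationLevel(Enum):
    PSEUDONYM = 1             # pure pseudonymity
    REAL_NAME_IN_EXPORTS = 2  # pseudonym online, real name in exports
    REAL_NAME_AND_ORCID = 3   # real name + ORCID in exports


@dataclass
class Contributor:
    pseudonym: str
    level: IdentificationLevel = IdentificationLevel.PSEUDONYM
    real_name: Optional[str] = None
    orcid: Optional[str] = None


def export_credit(c: Contributor) -> Dict[str, str]:
    """Return the credit fields an export may include for this contributor."""
    credit = {"name": c.pseudonym}
    # The real name appears only if the contributor opted in to level 2 or 3
    if c.level is not IdentificationLevel.PSEUDONYM and c.real_name:
        credit["name"] = c.real_name
    # The ORCID appears only at the highest level
    if c.level is IdentificationLevel.REAL_NAME_AND_ORCID and c.orcid:
        credit["orcid"] = c.orcid
    return credit


anonymous = Contributor("quillfan42", real_name="Jane Doe")
cited = Contributor(
    "quillfan42",
    IdentificationLevel.REAL_NAME_AND_ORCID,
    real_name="Jane Doe",
    orcid="0000-0002-1825-0097",
)
```

Defaulting to the pseudonym keeps the scheme opt-in: a stored real name is never exported unless the contributor chose a level that allows it.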

#FromThePage

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

420 followers · 49 posts · Server hcommons.social

When I was home full-time on paternity leave in 2005, I had a problem: how to read while feeding a baby? I tried a number of books, with little success. At the time, my wife was reading Agile Web Development with #Rails, and raving about it.

The extra-wide quarto would lie open on the couch next to me while I fed my daughter. I read the whole thing in a few weeks, and that transformed #FromThePage. (I'd been trying to build a web-based #transcription tool in PHP after failing to separate article pages from facsimile/transcription pages in MediaWiki in May of that year.)

Rails was a wonder. It even had best practices for things like DB migrations! And #Ruby was really nice after I'd worked in Java for five years. Eventually I'd fall in love with the community of scrappy freelancers and F/OSS developers, especially here in #Austin.

I'm not sure anyone buys books to learn new technology anymore, but AWDWR changed my career.

#rails #FromThePage #transcription #ruby #austin

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

297 followers · 33 posts · Server hcommons.social

While #volunteer activity on #FromThePage was low (but within normal Thursday ranges) on Thanksgiving, the profile of activity was very different. Looking at the number of contributions per volunteer, we see a rise of 36%. Unlike the previous statistics, mean contributions per user (31) were well outside the normal Thursday range (20 to 25).

So the total number of volunteers drops on the #crowdsourcing platform on a holiday, but those who do participate do a lot more work on the site.
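
As a quick consistency check on those figures: if the 36% rise is measured against a typical Thursday mean (an assumption; the post does not state the baseline), the implied baseline can be backed out from the Thanksgiving mean of 31. The function name `implied_baseline` is just for illustration.

```python
def implied_baseline(observed_mean: float, pct_increase: float) -> float:
    """Back out the baseline mean from an observed mean and a percentage rise."""
    return observed_mean / (1 + pct_increase / 100)


# Assumes the 36% rise is relative to a typical Thursday mean.
baseline = implied_baseline(31, 36)
print(round(baseline, 1))  # 22.8
```

About 22.8 contributions per volunteer falls inside the stated normal Thursday range of 20 to 25, so under that assumption the numbers in the post agree with each other.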

#volunteer #FromThePage #crowdsourcing

Last updated 3 years ago

Original post

Ben W. Brumfield · @benwbrum

284 followers · 22 posts · Server hcommons.social

@bencomp That's a good idea. No, currently I'm trying to put together a DH-focused Zurich->Vienna itinerary to justify a trip to an event in Graz. (So my plans are very open: give a talk, meet people, hack together, ???)

Meeting the #Transkribus people would be a lovely thing to do in Innsbruck, if they're interested. (We will have just finished supporting a Feb 22 workshop on #FromThePage<->#Transkribus integration flows, so there might be issues to discuss that come from that?)

#FromThePage #Transkribus

Last updated 3 years ago

Original post