Colin Rosenthal · @gorhendad_oldbuck
90 followers · 924 posts · Server mastodon.scot

Interesting administrative position for anyone worldwide with
* Familiarity with library, archive, or museum collections and practices
* Knowledge of web archiving
* Experience of work within a research institution or library

recruiting.paylocity.com/recru

#webarchiving #archive #library #jobannouncement

Last updated 1 year ago

nestor · @nestorNetzwerk
175 followers · 36 posts · Server openbiblio.social

On 24 August, everything at the @DNB_Aktuelles in Frankfurt am Main revolved around web archiving. For everyone who could not be there, and for those who would like to read up on it again: the slides from the talks are now online!

langzeitarchivierung.de/Webs/n

#digipres #webarchiving

Last updated 1 year ago

IIPC · @netpreserve
48 followers · 28 posts · Server digipres.club

📆Just over one month remains before the CfP closes. Start thinking about your submission today!

📢CfP: #WebArchives in Context📢
netpreserve.org/ga2024/cfp/
🟣PROPOSALS DUE SEPT 24
🟣 24-26 APR 2024
🇫🇷NATIONAL LIBRARY OF FRANCE (BnF), PARIS, FRANCE
🖥 🎉
💾 📖

#digitalhumanities #digitalpreservation #iipc20years #webarchiving #webarchives #iipcwac24 #webarchivewednesday

Last updated 1 year ago

Shawn M. Jones, PhD · @shawnmjones
459 followers · 3839 posts · Server hachyderm.io

What about the metadata that is present? Grusky et al. (doi.org/10.18653/v1/N18-1065) realized that, because page authors create that metadata, it can serve as ground truth to evaluate #automatic #summarization.

We analyzed pages from and saw how this metadata evolved. By 2010 we saw a metadata explosion with the use of Cards, Open Graph Protocol, Tracking, and more. Things like Twitter cards created a metadata renaissance for HTML.

Ref: doi.org/10.1109/JCDL52503.2021

#automatic #summarization #webarchiving #Twitter #facebook

Last updated 1 year ago

Shawn M. Jones, PhD · @shawnmjones
459 followers · 3839 posts · Server hachyderm.io

In 2020, we developed a special tool, MementoEmbed, for generating/extracting metadata from archived web pages. We presented this tool at the Web Archiving and Digital Libraries Workshop (WADL2020).

We found that #Twitter, #facebook, #tumblr, and others could not reliably create cards for archived web pages. We use MementoEmbed’s cards in #storytelling with our tool Raintale to create a #visualization of this #summarization.

Ref: arxiv.org/abs/2008.00137

#Twitter #facebook #tumblr #storytelling #visualization #summarization #webarchiving #digitalpreservation

Last updated 1 year ago

Ed Summers · @edsu
1710 followers · 6939 posts · Server social.coop

I am just noticing that the French proto-socialmedia site Skyblog (now Skyrocket) is shutting down, and that the Bibliothèque nationale de France (BNF) and the Institut national de l’audiovisuel (INA) were asked to archive the 19 million blogs, I guess using tech? (I need to find someone with a Le Monde subscription to read the rest of the article):

lemonde.fr/pixels/article/2023

#webarchiving

Last updated 1 year ago

Shawn M. Jones, PhD · @shawnmjones
459 followers · 3836 posts · Server hachyderm.io

@fourjuaneight I wrote a blog post in 2019 in response to Google+ going offline (ws-dl.blogspot.com/2019/02/201). My post, along with other sources, was used by the glorious all-volunteer ArchiveTeam (wiki.archiveteam.org) coordinated by @textfiles. They preserved as much of Google+ as they could before it was gone. They also have projects to preserve Reddit, Imgur, Telegram, GitHub, YouTube, and many .ua sites.

#google #reddit #imgur #telegram #github #youtube #ukraine #twittermigration #digitalpreservation #webarchiving

Last updated 1 year ago

IIPC · @netpreserve
40 followers · 23 posts · Server digipres.club

🎥 #IIPCWAC23 recordings are now available to view!
🔵Check out the full list here on YouTube: youtube.com/@iipc8855/playlist
🔵Browse the final program: netpreserve.org/ga2023/program

🙌Thank you to all of our wonderful presenters, session chairs, & co-authors for sharing their work with us!

#webarchivering #webarchiving #webarchives #iipc20years #IIPCWAC23

Last updated 1 year ago

IIPC · @netpreserve
38 followers · 21 posts · Server digipres.club

📢CfP: #WebArchives in Context📢
netpreserve.org/ga2024/cfp/
🟣PROPOSALS DUE SEPT 24
🟣 24-26 APR 2024
🇫🇷NATIONAL LIBRARY OF FRANCE (BnF), PARIS, FRANCE
🖥 🎉
💾 📖

Thanks very much to our Program Committee for their work in putting this CfP together! (netpreserve.org/ga2024/organiz)

#digitalhumanities #digitalpreservation #iipc20years #webarchiving #iipcwac24 #webarchives

Last updated 1 year ago

nestor · @nestorNetzwerk
173 followers · 33 posts · Server openbiblio.social

On 24 August 2023, a nestor event will focus on "Webarchivierung - Praxis und Perspektiven" (Web Archiving: Practice and Perspectives). Registration is open until 4 August 2023! More information at: langzeitarchivierung.de/Webs/n

#digipres #webarchiving

Last updated 1 year ago

James Louis Smith · @scrivenersmith
1365 followers · 605 posts · Server hcommons.social

Today in dodgy bodges, I'd like to introduce the 'tile suck'.

So if you use most conventional automated archiving tools on a website with a Leaflet map on it (in this example a Mapbox map for the @PortsPastPres website), you may notice that they only ever capture the map tiles visible on the screen.

Not good enough, I say! So my strategy is this:

1) Get the biggest screen you can find. Biiiiiiig. (zooming out on the browser doesn't seem to quite work)

2) Open up the map with the tileset that you want to hoover up into your WARC record (probably in Chrome because that's what Archiveweb.page works on). Set your Leaflet or whatever map viewer to full screen. Start your crawl.

3) At each zoom level, pan around the areas of your map that you care about. I'm using a point cluster map, so I focus on the areas where my points are. The big screen means that whenever you hit an undownloaded tile, it'll pull it in.
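
For anyone who would rather know up front which tiles the panning in step 3 is supposed to reach, here is a rough Python sketch. It is not part of the workflow above, only a companion to it, and it rests on assumptions: that the map uses the common XYZ ("slippy map") tile scheme in Web Mercator, and that tiles live at a URL like the made-up `tiles.example.org` template below. Real Mapbox tile URLs also carry style IDs and access tokens, which are not modelled here. All function names are hypothetical.

```python
import math
from typing import Iterable, Iterator, Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Convert a WGS84 longitude/latitude to XYZ tile indices at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern/southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(west: float, south: float, east: float, north: float,
                  zoom: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (zoom, x, y) for every tile touching the bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # north-west corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # south-east corner
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            yield zoom, x, y


def tile_urls(template: str, bbox: Tuple[float, float, float, float],
              zooms: Iterable[int]) -> Iterator[str]:
    """Fill a hypothetical {z}/{x}/{y} URL template for each tile and zoom."""
    for zoom in zooms:
        for z, x, y in tiles_in_bbox(*bbox, zoom):
            yield template.format(z=z, x=x, y=y)


if __name__ == "__main__":
    # Hypothetical tile server and an arbitrary bounding box for illustration.
    template = "https://tiles.example.org/{z}/{x}/{y}.png"
    bbox = (-8.0, 51.0, -2.0, 56.0)  # west, south, east, north
    for url in tile_urls(template, bbox, zooms=range(5, 8)):
        print(url)
```

The tile count roughly quadruples with every zoom level, so a list like this is mostly useful for checking coverage over a small area after a manual session, or for estimating how much panning is still ahead.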

#spatialhumanities #curatescape #webarchiving

Last updated 1 year ago

Lozana Rossenova · @lozross
361 followers · 166 posts · Server post.lurk.org

First workshop on #webarchiving with #webrecorder tools at the #DH2023 con & very happy to be here for this convergence of #dh & webarchiving communities (hint: the former really needs more engagement with the latter ;)) with Ilya Kreymer & Jasmine Mulliken.

[PS At the con for the whole week, so if anyone here is in Graz, DM for meetups]

#webarchiving #webrecorder #DH2023 #dh

Last updated 1 year ago

v_i_o_l_a · @v_i_o_l_a
893 followers · 5065 posts · Server openbiblio.social

the NSZL uses a set of scripts and natural language processing tools to extract and clean the text from the archived web pages.

#webarchiving #researchdata #liber2023

Last updated 1 year ago

Andrew N. Jackson · @anj
607 followers · 1331 posts · Server digipres.club

It took a long time to write it, and now you have to read it! A new blog post: Robust file transfers with Rclone anjackson.net/2023/07/04/robus

#digipres #webarchiving

Last updated 1 year ago

Andrew N. Jackson · @anj
592 followers · 1303 posts · Server digipres.club

RIP our Twitter crawls…

#webarchiving

Last updated 1 year ago

Paige Roberts, Ph.D. · @paigeroberts
585 followers · 868 posts · Server historians.social

Archives Research Compute Hub (ARCH):
new research and education service that helps users easily build, access, and analyze digital collections computationally at scale
blog.archive.org/2023/06/26/bu
@thomas @internetarchive
Archives Unleashed (Ian Milligan et al), Mellon Foundation
@histodons

#webarchiving

Last updated 1 year ago

Andrew N. Jackson · @anj
591 followers · 1271 posts · Server digipres.club

Some reflections on IIPC Web Archiving Conference 2023: anjackson.net/2023/06/20/refle (cross-posted from UKWA blog).

#webarchiving

Last updated 1 year ago