FedSearch - Federated network search engine

Tim Sherratt · @wragge

1090 followers · 1001 posts · Server hcommons.social

📣 119,085 digitised newspaper articles added to #Trove last week. Once again they're mostly (112,604) from the Sydney Daily Mirror, 1944-45. But there are also 6,481 articles from 1954 added to the Kyabram Free Press and Rodney and Deakin Shire Advocate.

See the Trove Data Dashboard: https://wragge.github.io/trove-newspaper-totals/ #GLAM #histodons

#trove #glam #histodons

Last updated 2 years ago


Tim Sherratt · @wragge

1087 followers · 996 posts · Server hcommons.social

So #Trove has already used machine learning to improve the OCR of at least 10 million newspaper articles: http://nla-overproof.projectcomputing.com/

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1087 followers · 996 posts · Server hcommons.social

Today in the #Trove Data Guide – I think the section on getting data from newspaper pages is nearly finished: https://wragge.github.io/trove-data-guide/accessing-data/newspapers-and-gazettes-pages.html

#WIP

#trove #wip

Last updated 2 years ago


Tim Sherratt · @wragge

1083 followers · 976 posts · Server hcommons.social

Documentation is hard. Every time I work on a section in the #Trove Data Guide I realise I need to update/create several other sections. It just keeps getting bigger. Anyway, nearly finished this 'HOW TO' on harvesting a complete set of search results using the Trove API: https://wragge.github.io/trove-data-guide/how-to/harvest-complete-results.html #digitalHumanities #GLAM

#trove #digitalhumanities #glam

Last updated 2 years ago


Tim Sherratt · @wragge

1083 followers · 976 posts · Server hcommons.social

GitHub - Known API v3 bugs · Issue #49 · GLAM-Workbench/trove-api-intro

I'm continuing to log bugs in the #Trove v3 API here: https://github.com/GLAM-Workbench/trove-api-intro/issues/49 (Trove itself doesn't have any public list of issues/bugs)

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1084 followers · 969 posts · Server hcommons.social

@warpedtime If you can bear to use FB, there is an unofficial #Trove user group: https://www.facebook.com/groups/troveusergroup

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1084 followers · 969 posts · Server hcommons.social

📣 Just like last week, the only change to #Trove's digitised newspapers in the past week has been the addition of more articles from the Sydney Daily Mirror – 123,565 articles from 1944-45.

See the Trove Newspaper Data Dashboard for more: https://wragge.github.io/trove-newspaper-totals/

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1083 followers · 967 posts · Server hcommons.social

Just resubmitted a #Trove bug report from 2021 as it's still not fixed -- affects advanced search when filtering by holding organisation.

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1083 followers · 966 posts · Server hcommons.social

Looks like there might have been a #Trove update yesterday. The following bugs reported in the last couple of months have been fixed:

https://github.com/GLAM-Workbench/trove-api-intro/issues/49#issuecomment-1652789707

https://github.com/GLAM-Workbench/trove-api-intro/issues/49#issuecomment-1652791988

https://github.com/GLAM-Workbench/trove-api-intro/issues/49#issuecomment-1652797423

#GLAM #digitalHumanities

#trove #glam #digitalhumanities

Last updated 2 years ago


Tim Sherratt · @wragge

1085 followers · 961 posts · Server hcommons.social

Some important updates for the Trove Newspaper & Gazette Harvester

I've been playing around a lot with RO-Crate lately. It's a way of describing & packaging research data. Here's a post about how I've updated the #Trove Newspaper & Gazette Harvester to automatically document every harvest it creates using RO-Crate: https://updates.timsherratt.org/2023/08/31/some-important-updates.html #researchInfrastructure #rocrate #glam #digitalHumanities

#trove #researchinfrastructure #rocrate #glam #digitalhumanities

Last updated 2 years ago


Tim Sherratt · @wragge

1087 followers · 957 posts · Server hcommons.social

Not really much in this webinar to help people undertake new forms of digital research (which I thought was the point of the ARDC investment). But anyway, in a few more months the #Trove Data Guide will cover all of that and more (still much to do... 😬): https://wragge.github.io/trove-data-guide/home.html

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 955 posts · Server hcommons.social

Now talking about citations... Guess what? It would be a hell of a lot easier to capture and manage citations if #Trove hadn't broken the Zotero translator with the 2020 update... 😡 (though Zotero still works with individual newspaper articles)

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 941 posts · Server hcommons.social

Tuning in to the "How to research on #Trove" webinar. Includes an update on some recent ARDC-funded improvements to the API and web interface. Chat is disabled... so maybe I'll drop some comments here.

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 939 posts · Server hcommons.social

There's a new 'How to research' page on #Trove, but I have to say it's a bit disappointing: https://trove.nla.gov.au/blog/2023/08/31/how-research-trove Hopefully, I can fill in some gaps with the Trove Data Guide.

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 939 posts · Server hcommons.social

Trove newspaper & gazette harvester - GLAM Workbench

New version of the #Trove Newspaper Harvester section of the #GLAMWorkbench (v2.0.0). Now using v3 of the Trove API. https://glam-workbench.net/trove-harvester/ #GLAM #digitalHumanities

#trove #glamworkbench #glam #digitalhumanities

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 935 posts · Server hcommons.social

There's a new version of the #Trove Newspaper & Gazette Harvester Python package – now using v3 of the Trove API, and automatically generating an RO-Crate file to capture the details of each harvest. Use it as a library or a command line tool to harvest metadata, text, images & PDFs from thousands (even millions) of digitised newspaper articles.

Release details: https://github.com/wragge/trove-newspaper-harvester/releases/tag/v0.7.1

Full documentation: https://wragge.github.io/trove-newspaper-harvester/ #GLAM #digitalHumanities #histodons

#trove #glam #digitalhumanities #histodons

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 930 posts · Server hcommons.social

Trove API v3 - GLAM Workbench

Aaand I've updated the #GLAMWorkbench's list of breaking changes in the #Trove API v3 with today's discoveries: https://glam-workbench.net/trove-api-v3/ #GLAM #digitalHumanities

#glamworkbench #trove #glam #digitalhumanities

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 929 posts · Server hcommons.social

So today's unexpected updates...

Trove Query Parser now v0.2.1: https://github.com/wragge/trove_query_parser/releases/tag/v0.2.1

Trove API Console updated to use the changed v3 facets `wordCount` and `illustrationType`, eg: https://troveconsole.herokuapp.com/v3/?url=https%3A//api.trove.nla.gov.au/v3/result%3Fq%3Dwragge%26category%3Dnewspaper%26encoding%3Djson%26l-illustrated%3Dtrue%26l-illustrationType%3DPhoto

#Trove #GLAM #digitalHumanities
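
The console link in this post wraps an ordinary v3 API request in its `url` parameter. As a rough illustration only, the sketch below unwraps that link with Python's standard library to show the facet parameters it carries. The helper names `unwrap_console_url` and `api_params` are made up for this example; they are not part of the Trove API Console or any Trove library, and nothing here contacts the API.

```python
from urllib.parse import parse_qs, urlsplit

# The console link exactly as it appears in the post.
CONSOLE_URL = (
    "https://troveconsole.herokuapp.com/v3/?url="
    "https%3A//api.trove.nla.gov.au/v3/result%3Fq%3Dwragge%26category%3Dnewspaper"
    "%26encoding%3Djson%26l-illustrated%3Dtrue%26l-illustrationType%3DPhoto"
)


def unwrap_console_url(console_url):
    """Return the API request URL carried in the console link's `url` parameter.

    Hypothetical helper for illustration.
    """
    query = parse_qs(urlsplit(console_url).query)
    return query["url"][0]


def api_params(api_url):
    """Return the query parameters of an API request URL as a plain dict.

    Hypothetical helper; assumes each parameter appears only once.
    """
    return {name: values[0] for name, values in parse_qs(urlsplit(api_url).query).items()}


api_url = unwrap_console_url(CONSOLE_URL)
params = api_params(api_url)

print(api_url)
for name, value in params.items():
    print(f"{name} = {value}")
```

Facet filters show up with an `l-` prefix, so the changed facet is sent as `l-illustrationType=Photo` alongside `l-illustrated=true`.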

#trove #glam #digitalhumanities

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 928 posts · Server hcommons.social

Accidentally typed `arse_query` instead of `parse_query` and that's about how I'm feeling about the #Trove API update at the moment...

#trove

Last updated 2 years ago


Tim Sherratt · @wragge

1088 followers · 925 posts · Server hcommons.social

😡 Another undocumented, breaking change in v3 of the #Trove API. The `illtype` facet has been renamed `illustrationType`. Excuse me while I go waste an hour or so updating the Trove API Console, the trove-query-parser etc... #digitalHumanities
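
For anyone with saved queries, a rename like this means rewriting stored request URLs. The sketch below is one possible way to do that with Python's standard library, not how the Trove API Console or trove-query-parser handle it. It assumes the old facet was sent as the query parameter `l-illtype` (the `l-` prefix is inferred from the v3 form `l-illustrationType`), and both `RENAMED_FACETS` and `migrate_facets` are hypothetical names invented for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed mapping of old parameter name to new one. Only the illtype rename
# is stated in the post; the `l-` prefix on the old name is an assumption.
RENAMED_FACETS = {"l-illtype": "l-illustrationType"}


def migrate_facets(url):
    """Rewrite renamed facet parameters in a request URL, leaving the rest as-is.

    Hypothetical helper for illustration.
    """
    parts = urlsplit(url)
    pairs = [
        (RENAMED_FACETS.get(name, name), value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# A made-up request that still uses the old facet name.
old_url = (
    "https://api.trove.nla.gov.au/v3/result"
    "?q=wragge&category=newspaper&encoding=json&l-illustrated=true&l-illtype=Photo"
)
new_url = migrate_facets(old_url)
print(new_url)
```

Keeping the renames in one dictionary means any further undocumented changes can be added in a single place, once the old and new names are confirmed.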

#trove #digitalhumanities

Last updated 2 years ago
