FedSearch - Federated network search engine

Markus Lauber · @markks

38 followers · 27 posts · Server det.social

KNIME Community Forum - Automate pdf reader and convert data to excel table with correct column mappings

@juxtacognition I have used the #RStats package "#tabulizer" to extract tables from a #PDF (https://forum.knime.com/t/automate-pdf-reader-and-convert-data-to-excel-table-with-correct-column-mappings/26384/10?u=mlauber71) and "#pdftools" to extract text (https://forum.knime.com/t/unstructured-text-mining-from-pdf/48625/4?u=mlauber71). Maybe you can adapt these. There is also a #KNIME node that uses "PDFBox" or another parser (https://kni.me/w/kjy6Q-3szxcH6716), but I have not used it myself.
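For the text-extraction part, a minimal sketch with pdftools might look like the following. It assumes the pdftools package is installed; the file name "report.pdf" is a hypothetical placeholder, not something taken from the linked forum threads.

```r
# Minimal sketch: pull raw text out of a PDF with pdftools.
# "report.pdf" is a hypothetical file name used only for illustration.
library(pdftools)

# pdf_text() returns a character vector with one element per page
pages <- pdf_text("report.pdf")

# Split the first page into lines and trim surrounding whitespace
first_page_lines <- trimws(strsplit(pages[1], "\n", fixed = TRUE)[[1]])

# Inspect the first few lines before deciding how to parse them further
head(first_page_lines)
```

From there the lines can be filtered or split with regular expressions, depending on how the document is laid out.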

Last updated 3 years ago

Pachá · @pacha

108 followers · 32 posts · Server mastodonsocial.ca

#tabulizer provides #rstats bindings to the Tabula Java library, which can be used to computationally extract tables from PDF documents.

Check https://github.com/ropensci/tabulizer
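As a rough sketch of what table extraction with tabulizer can look like, assuming the package and a working Java/rJava setup are installed. The file names "report.pdf" and "table_page1.csv" are hypothetical placeholders.

```r
# Minimal sketch: extract tables from a PDF with tabulizer.
# Assumes Java and rJava are available; file names are hypothetical.
library(tabulizer)

# extract_tables() returns a list with one entry per detected table;
# output = "data.frame" requests data frames instead of the default matrices
tables <- extract_tables("report.pdf", pages = 1, output = "data.frame")

# Work with the first detected table, if any table was found
if (length(tables) > 0) {
  first_table <- tables[[1]]
  write.csv(first_table, "table_page1.csv", row.names = FALSE)
}
```

Detection quality depends on the PDF layout, so the column mapping usually needs a manual check afterwards.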

Last updated 3 years ago
